WorldWideScience

Sample records for bacterial genome scale

  1. Genome-scale models of bacterial metabolism: reconstruction and applications

    OpenAIRE

    Durot, Maxime; Bourguignon, Pierre-Yves; Schachter, Vincent

    2008-01-01

    Genome-scale metabolic models bridge the gap between genome-derived biochemical information and metabolic phenotypes in a principled manner, providing a solid interpretative framework for experimental data related to metabolic states, and enabling simple in silico experiments with whole-cell metabolism. Models have been reconstructed for almost 20 bacterial species, so far mainly through expert curation efforts integrating information from the literature with genome annotation. A wide variety...

  2. Large-scale genomic analysis suggests a neutral punctuated dynamics of transposable elements in bacterial genomes.

    Science.gov (United States)

    Iranzo, Jaime; Gómez, Manuel J; López de Saro, Francisco J; Manrubia, Susanna

    2014-06-01

    Insertion sequences (IS) are the simplest and most abundant form of transposable DNA found in bacterial genomes. When present in multiple copies, it is thought that they can promote genomic plasticity and genetic exchange, thus being a major force of evolutionary change. The main processes that determine IS content in genomes are, though, a matter of debate. In this work, we take advantage of the large amount of genomic data currently available and study the abundance distributions of 33 IS families in 1811 bacterial chromosomes. This allows us to test simple models of IS dynamics and estimate their key parameters by means of a maximum likelihood approach. We evaluate the roles played by duplication, lateral gene transfer, deletion and purifying selection. We find that the observed IS abundances are compatible with a neutral scenario where IS proliferation is controlled by deletions instead of purifying selection. Even if there may be some cases driven by selection, neutral behavior dominates over large evolutionary scales. According to this view, IS and hosts tend to coexist in a dynamic equilibrium state for most of the time. Our approach also allows for a detection of recent IS expansions, and supports the hypothesis that rapid expansions constitute transient events (punctuations) during which the state of coexistence of IS and host becomes perturbed. PMID:24967627

  3. Large-scale genomic analysis suggests a neutral punctuated dynamics of transposable elements in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Jaime Iranzo

    2014-06-01

    Insertion sequences (IS) are the simplest and most abundant form of transposable DNA found in bacterial genomes. When present in multiple copies, it is thought that they can promote genomic plasticity and genetic exchange, thus being a major force of evolutionary change. The main processes that determine IS content in genomes are, though, a matter of debate. In this work, we take advantage of the large amount of genomic data currently available and study the abundance distributions of 33 IS families in 1811 bacterial chromosomes. This allows us to test simple models of IS dynamics and estimate their key parameters by means of a maximum likelihood approach. We evaluate the roles played by duplication, lateral gene transfer, deletion and purifying selection. We find that the observed IS abundances are compatible with a neutral scenario where IS proliferation is controlled by deletions instead of purifying selection. Even if there may be some cases driven by selection, neutral behavior dominates over large evolutionary scales. According to this view, IS and hosts tend to coexist in a dynamic equilibrium state for most of the time. Our approach also allows for a detection of recent IS expansions, and supports the hypothesis that rapid expansions constitute transient events (punctuations) during which the state of coexistence of IS and host becomes perturbed.

  4. Genome-scale Co-evolutionary Inference Identifies Functions and Clients of Bacterial Hsp90

    OpenAIRE

    Press, Maximilian O.; Li, Hui; Creanza, Nicole; Kramer, Günter; Queitsch, Christine; Sourjik, Victor; Borenstein, Elhanan

    2013-01-01

    The molecular chaperone Hsp90 is essential in eukaryotes, in which it facilitates the folding of developmental regulators and signal transduction proteins known as Hsp90 clients. In contrast, Hsp90 is not essential in bacteria, and a broad characterization of its molecular and organismal function is lacking. To enable such characterization, we used a genome-scale phylogenetic analysis to identify genes that co-evolve with bacterial Hsp90. We find that genes whose gain and loss were coordinate...

  5. Genome-scale co-evolutionary inference identifies functions and clients of bacterial Hsp90.

    Directory of Open Access Journals (Sweden)

    Maximilian O Press

    The molecular chaperone Hsp90 is essential in eukaryotes, in which it facilitates the folding of developmental regulators and signal transduction proteins known as Hsp90 clients. In contrast, Hsp90 is not essential in bacteria, and a broad characterization of its molecular and organismal function is lacking. To enable such characterization, we used a genome-scale phylogenetic analysis to identify genes that co-evolve with bacterial Hsp90. We find that genes whose gain and loss were coordinated with Hsp90 throughout bacterial evolution tended to function in flagellar assembly, chemotaxis, and bacterial secretion, suggesting that Hsp90 may aid assembly of protein complexes. To add to the limited set of known bacterial Hsp90 clients, we further developed a statistical method to predict putative clients. We validated our predictions by demonstrating that the flagellar protein FliN and the chemotaxis kinase CheA behaved as Hsp90 clients in Escherichia coli, confirming the predicted role of Hsp90 in chemotaxis and flagellar assembly. Furthermore, normal Hsp90 function is important for wild-type motility and/or chemotaxis in E. coli. This novel function of bacterial Hsp90 agreed with our subsequent finding that Hsp90 is associated with a preference for multiple habitats and may therefore face a complex selection regime. Taken together, our results reveal previously unknown functions of bacterial Hsp90 and open avenues for future experimental exploration by implicating Hsp90 in the assembly of membrane protein complexes and adaptation to novel environments.

  6. Biofilm Formation Mechanisms of Pseudomonas aeruginosa Predicted via Genome-Scale Kinetic Models of Bacterial Metabolism.

    Science.gov (United States)

    Vital-Lopez, Francisco G; Reifman, Jaques; Wallqvist, Anders

    2015-10-01

    A hallmark of Pseudomonas aeruginosa is its ability to establish biofilm-based infections that are difficult to eradicate. Biofilms are less susceptible to host inflammatory and immune responses and have higher antibiotic tolerance than free-living planktonic cells. Developing treatments against biofilms requires an understanding of bacterial biofilm-specific physiological traits. Research efforts have started to elucidate the intricate mechanisms underlying biofilm development. However, many aspects of these mechanisms are still poorly understood. Here, we addressed questions regarding biofilm metabolism using a genome-scale kinetic model of the P. aeruginosa metabolic network and gene expression profiles. Specifically, we computed metabolite concentration differences between known mutants with altered biofilm formation and the wild-type strain to predict drug targets against P. aeruginosa biofilms. We also simulated the altered metabolism driven by gene expression changes between biofilm and stationary growth-phase planktonic cultures. Our analysis suggests that the synthesis of important biofilm-related molecules, such as the quorum-sensing molecule Pseudomonas quinolone signal and the exopolysaccharide Psl, is regulated not only through the expression of genes in their own synthesis pathway, but also through the biofilm-specific expression of genes in pathways competing for precursors to these molecules. Finally, we investigated why mutants defective in anthranilate degradation have an impaired ability to form biofilms. Alternative to a previous hypothesis that this biofilm reduction is caused by a decrease in energy production, we proposed that the dysregulation of the synthesis of secondary metabolites derived from anthranilate and chorismate is what impaired the biofilms of these mutants. Notably, these insights generated through our kinetic model-based approach are not accessible from previous constraint-based model analyses of P. aeruginosa biofilm

  7. LocateP: Genome-scale subcellular-location predictor for bacterial proteins

    Directory of Open Access Journals (Sweden)

    Zhou Miaomiao

    2008-03-01

    Background. In the past decades, various protein subcellular-location (SCL) predictors have been developed. Most of these predictors, like TMHMM 2.0, SignalP 3.0, PrediSi and Phobius, aim at the identification of one or a few SCLs, whereas others such as CELLO and Psortb.v.2.0 aim at a broader classification. Although these tools and pipelines can achieve a high precision in the accurate prediction of signal peptides and transmembrane helices, they have a much lower accuracy when other sequence characteristics are concerned. For instance, it proved notoriously difficult to identify the fate of proteins carrying a putative type I signal peptidase (SPIase) cleavage site, as many of those proteins are retained in the cell membrane as N-terminally anchored membrane proteins. Moreover, most of the SCL classifiers are based on the classification of the Swiss-Prot database and consequently inherited the inconsistency of that SCL classification. As accurate and detailed SCL prediction on a genome scale is highly desired by experimental researchers, we decided to construct a new SCL prediction pipeline: LocateP. Results. LocateP combines many of the existing high-precision SCL identifiers with our own newly developed identifiers for specific SCLs. The LocateP pipeline was designed such that it mimics protein targeting and secretion processes. It distinguishes 7 different SCLs within Gram-positive bacteria: intracellular, multi-transmembrane, N-terminally membrane anchored, C-terminally membrane anchored, lipid-anchored, LPxTG-type cell-wall anchored, and secreted/released proteins. Moreover, it distinguishes pathways for Sec- or Tat-dependent secretion and alternative secretion of bacteriocin-like proteins. The pipeline was tested on data sets extracted from literature, including experimental proteomics studies. The tests showed that LocateP performs as well as, or even slightly better than other SCL predictors for some locations and outperforms

  8. Consistency Analysis of Genome-Scale Models of Bacterial Metabolism: A Metamodel Approach

    Science.gov (United States)

    Ponce-de-Leon, Miguel; Calle-Espinosa, Jorge; Peretó, Juli; Montero, Francisco

    2015-01-01

    Genome-scale metabolic models usually contain inconsistencies that manifest as blocked reactions and gap metabolites. To detect recurrent inconsistencies in metabolic models, a large-scale analysis was performed using a previously published dataset of 130 genome-scale models. The results showed that a large number of reactions (~22%) are blocked in all the models where they are present. To unravel the nature of such inconsistencies, a metamodel was constructed by joining the 130 models in a single network. This metamodel was manually curated using the unconnected modules approach, and then it was used as a reference network to perform gap-filling on each individual genome-scale model. Finally, a set of 36 models that had not been considered during the construction of the metamodel was used, as a proof of concept, to extend the metamodel with new biochemical information, and to assess its impact on gap-filling results. The analysis performed on the metamodel allowed us to conclude: 1) the recurrent inconsistencies found in the models were already present in the metabolic database used during the reconstruction process; 2) the presence of inconsistencies in a metabolic database can be propagated to the reconstructed models; 3) there are reactions not manifested as blocked which are active as a consequence of some classes of artifacts; and 4) the results of an automatic gap-filling are highly dependent on the consistency and completeness of the metamodel or metabolic database used as the reference network. In conclusion, the consistency analysis should be applied to metabolic databases in order to detect and fill gaps as well as to detect and remove artifacts and redundant information. PMID:26629901
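
As a rough illustration of what a "gap metabolite" is, the sketch below flags metabolites that no reaction can produce, or that no reaction can consume, in a toy network. This is only a simple structural check written for this listing, not the unconnected modules approach or the gap-filling procedure used by the authors; the function name `gap_metabolites`, the reaction names, and the toy network are all hypothetical.

```python
def gap_metabolites(reactions):
    """Return metabolites that cannot be produced or cannot be consumed.

    `reactions` maps a reaction name to (stoichiometry, reversible), where
    stoichiometry maps metabolite -> coefficient (negative = consumed,
    positive = produced). This data layout is an assumption for the sketch.
    """
    can_produce, can_consume, seen = set(), set(), set()
    for stoich, reversible in reactions.values():
        for met, coeff in stoich.items():
            seen.add(met)
            # A reversible reaction can run either way, so it counts for both.
            if coeff > 0 or reversible:
                can_produce.add(met)
            if coeff < 0 or reversible:
                can_consume.add(met)
    return {m for m in seen if m not in can_produce or m not in can_consume}


# Toy network (made up): A is taken up and converted to B, then to C,
# but nothing consumes or exports C, so C is a dead end.
toy = {
    "EX_A": ({"A": 1}, False),
    "R1": ({"A": -1, "B": 1}, False),
    "R2": ({"B": -1, "C": 1}, False),
}
print(gap_metabolites(toy))
```

Reactions that touch only gap metabolites cannot carry steady-state flux, which is one way blocked reactions arise; finding every blocked reaction in a real model generally requires a flux-based analysis rather than this connectivity check.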

  9. The large-scale blast score ratio (LS-BSR) pipeline: a method to rapidly compare genetic content between bacterial genomes

    Directory of Open Access Journals (Sweden)

    Jason W. Sahl

    2014-04-01

    Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the rapid, large-scale, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 min using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP)-based phylogeny. Comparisons were then used to identify clade specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in 27–57 h, depending upon the alignment method, using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates. Taxa-specific genetic markers can then be
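
The CDS-by-genome matrix described in this record can be pictured with a minimal sketch. It assumes the usual definition of a BLAST score ratio, which the abstract itself does not spell out: the bit score of a CDS aligned against a genome, divided by the bit score of that CDS aligned against itself. The function names, input layout, and scores below are hypothetical and are not taken from the LS-BSR code.

```python
def blast_score_ratio(query_bit_score, self_bit_score):
    """Query bit score divided by the reference self-hit bit score."""
    if self_bit_score <= 0:
        return 0.0  # undefined ratio; treat as no relatedness
    return query_bit_score / self_bit_score


def bsr_matrix(self_scores, genome_scores):
    """Build {cds: {genome: BSR}} from precomputed bit scores.

    self_scores:   {cds: bit score of the CDS against itself}
    genome_scores: {genome: {cds: best bit score of the CDS in that genome}}
    A CDS with no hit in a genome gets a ratio of 0.0.
    """
    matrix = {}
    for cds, ref_score in self_scores.items():
        matrix[cds] = {
            genome: round(blast_score_ratio(hits.get(cds, 0.0), ref_score), 3)
            for genome, hits in genome_scores.items()
        }
    return matrix


# Made-up scores for two CDSs across two genomes.
self_scores = {"cds1": 500.0, "cds2": 200.0}
genome_scores = {
    "genomeA": {"cds1": 500.0, "cds2": 100.0},
    "genomeB": {"cds1": 450.0},
}
print(bsr_matrix(self_scores, genome_scores))
```

Values near 1 indicate a CDS that is present and highly conserved in a genome, and values near 0 indicate absence, which is what makes such a matrix easy to parse for group-specific markers.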

  10. Population Genomics and the Bacterial Species Concept

    OpenAIRE

    Riley, Margaret A.; Lizotte-Waniewski, Michelle

    2009-01-01

    In recent years, the importance of horizontal gene transfer (HGT) in bacterial evolution has been elevated to such a degree that many bacteriologists now question the very existence of bacterial species. If gene transfer is as rampant as comparative genomic studies have suggested, how could bacterial species survive such genomic fluidity? And yet, most bacteriologists recognize, and name, as species, clusters of bacterial isolates that share complex phenotypic properties. The Core Genome Hypo...

  11. Genome-Scale Models

    DEFF Research Database (Denmark)

    Bergdahl, Basti; Sonnenschein, Nikolaus; Machado, Daniel;

    2016-01-01

    An introduction to genome-scale models, how to build and use them, will be given in this chapter. Genome-scale models have become an important part of systems biology and metabolic engineering, and are increasingly used in research, both in academia and in industry, both for modeling chemical...

  12. Insights from twenty years of bacterial genome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Jun, Se Ran [ORNL; Nookaew, Intawat [ORNL; Leuze, Michael Rex [ORNL; Ahn, Tae-Hyuk [ORNL; Karpinets, Tatiana V [ORNL; Lund, Ole [Technical University of Denmark; Kora, Guruprasad H [ORNL; Wassenaar, Trudy [Molecular Microbiology & Genomics Consultants, Zotzenheim, Germany; Poudel, Suresh [ORNL; Ussery, David W [ORNL

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Why do we care about bacterial genome
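
The pan- and core-genome calculation mentioned in this record reduces to set operations once every genome is represented as a set of gene-family identifiers. The sketch below assumes that representation; the function name `pan_and_core`, the strain labels, and the family names are hypothetical, and the clustering of genes into families (the hard part in practice) is taken as already done.

```python
def pan_and_core(genomes):
    """Return (pan, core) gene-family sets for a collection of genomes.

    `genomes` maps a genome name to the set of gene families it contains.
    The pan-genome is the union of all sets; the core genome is their
    intersection.
    """
    if not genomes:
        return set(), set()
    family_sets = list(genomes.values())
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core


# Toy example with made-up gene families.
strains = {
    "strain1": {"famA", "famB", "famC"},
    "strain2": {"famA", "famB", "famD"},
    "strain3": {"famA", "famB", "famE", "famF"},
}
pan, core = pan_and_core(strains)
print(len(pan), len(core))
```

In this toy case the pan-genome has six families and the core has two; as more genomes are added the core can only shrink or stay the same while the pan-genome can only grow or stay the same, which is the pattern behind the E. coli figures quoted in the abstract.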

  13. Value of a newly sequenced bacterial genome

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Aburjaile, Flavia F; Ramos, Rommel Tj;

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome...

  14. Bacterial Communities: Interactions to Scale

    Science.gov (United States)

    Stubbendieck, Reed M.; Vargas-Bautista, Carol; Straight, Paul D.

    2016-01-01

    In the environment, bacteria live in complex multispecies communities. These communities span in scale from small, multicellular aggregates to billions or trillions of cells within the gastrointestinal tract of animals. The dynamics of bacterial communities are determined by pairwise interactions that occur between different species in the community. Though interactions occur between a few cells at a time, the outcomes of these interchanges have ramifications that ripple through many orders of magnitude, and ultimately affect the macroscopic world including the health of host organisms. In this review we cover how bacterial competition influences the structures of bacterial communities. We also emphasize methods and insights garnered from culture-dependent pairwise interaction studies, metagenomic analyses, and modeling experiments. Finally, we argue that the integration of multiple approaches will be instrumental to future understanding of the underlying dynamics of bacterial communities. PMID:27551280

  15. Microbial minimalism: genome reduction in bacterial pathogens.

    Science.gov (United States)

    Moran, Nancy A

    2002-03-01

    When bacterial lineages make the transition from free-living or facultatively parasitic life cycles to permanent associations with hosts, they undergo a major loss of genes and DNA. Complete genome sequences are providing an understanding of how extreme genome reduction affects evolutionary directions and metabolic capabilities of obligate pathogens and symbionts. PMID:11893328

  16. Value of a newly sequenced bacterial genome.

    Science.gov (United States)

    Barbosa, Eudes Gv; Aburjaile, Flavia F; Ramos, Rommel Tj; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2014-05-26

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

  17. Value of a newly sequenced bacterial genome

    Institute of Scientific and Technical Information of China (English)

    Barbosa, Eudes GV; Aburjaile, Flavia F; Ramos, Rommel TJ; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information.

  18. Bacterial chromatin: converging views at different scales.

    Science.gov (United States)

    Dame, Remus T; Tark-Dame, Mariliis

    2016-06-01

    Bacterial genomes are functionally organized and compactly folded into a structure referred to as bacterial chromatin or the nucleoid. An important role in genome folding is attributed to Nucleoid-Associated Proteins, also referred to as bacterial chromatin proteins. Although a lot of molecular insight in the mechanisms of operation of these proteins has been generated in the test tube, knowledge on genome organization in the cellular context is still lagging behind severely. Here, we discuss important advances in the understanding of three-dimensional genome organization due to the application of Chromosome Conformation Capture and super-resolution microscopy techniques. We focus on bacterial chromatin proteins whose proposed role in genome organization is supported by these approaches. Moreover, we discuss recent insights into the interrelationship between genome organization and genome activity/stability in bacteria. PMID:26942688

  19. Insights from Genomics into Bacterial Pathogen Populations

    OpenAIRE

    Wilson, DJ

    2012-01-01

    Bacterial pathogens impose a heavy burden of disease on human populations worldwide. The gravest threats are posed by highly virulent respiratory pathogens, enteric pathogens, and HIV-associated infections. Tuberculosis alone is responsible for the deaths of 1.5 million people annually. Treatment options for bacterial pathogens are being steadily eroded by the evolution and spread of drug resistance. However, population-level whole genome sequencing offers new hope in the fight against pathog...

  20. Xylella Genomics and Bacterial Pathogenicity to Plants

    OpenAIRE

    Dow, J. M.; Daniels, M. J.

    2000-01-01

    Xylella fastidiosa, a pathogen of citrus, is the first plant pathogenic bacterium for which the complete genome sequence has been published. Inspection of the sequence reveals high relatedness to many genes of other pathogens, notably Xanthomonas campestris. Based on this, we suggest that Xylella possesses certain easily testable properties that contribute to pathogenicity. We also present some general considerations for deriving information on pathogenicity from bacterial genomics.

  1. One Bacterial Cell, One Complete Genome

    Energy Technology Data Exchange (ETDEWEB)

    Woyke, Tanja; Tighe, Damon; Mavrommatis, Konstantinos; Clum, Alicia; Copeland, Alex; Schackwitz, Wendy; Lapidus, Alla; Wu, Dongying; McCutcheon, John P.; McDonald, Bradon R.; Moran, Nancy A.; Bristow, James; Cheng, Jan-Fang

    2010-04-26

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  2. One bacterial cell, one complete genome.

    Directory of Open Access Journals (Sweden)

    Tanja Woyke

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  3. Insights from 20 years of bacterial genome sequencing

    DEFF Research Database (Denmark)

    Land, Miriam; Hauser, Loren; Jun, Se-Ran;

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in...

  4. Genome degeneration affects both extracellular and intracellular bacterial endosymbionts

    OpenAIRE

    Feldhaar, Heike; Gross, Roy

    2009-01-01

    The obligate intracellular bacterial endosymbionts of insects are a paradigm for reductive genome evolution. A study published recently in BMC Biology demonstrates that similar evolutionary forces shaping genome structure may also apply to extracellular endosymbionts.

  5. Reconstruction of Bacterial and Viral Genomes from Multiple Metagenomes

    Science.gov (United States)

    Gupta, Ankit; Kumar, Sanjiv; Prasoodanan, Vishnu P. K.; Harish, K.; Sharma, Ashok K.; Sharma, Vineet K.

    2016-01-01

    Several metagenomic projects have been accomplished or are in progress. However, in most cases, it is not feasible to generate complete genomic assemblies of species from the metagenomic sequencing of a complex environment. Only a few studies have reported the reconstruction of bacterial genomes from complex metagenomes. In this work, Binning-Assembly approach has been proposed and demonstrated for the reconstruction of bacterial and viral genomes from 72 human gut metagenomic datasets. A total 1156 bacterial genomes belonging to 219 bacterial families and, 279 viral genomes belonging to 84 viral families could be identified. More than 80% complete draft genome sequences could be reconstructed for a total of 126 bacterial and 11 viral genomes. Selected draft assembled genomes could be validated with 99.8% accuracy using their ORFs. The study provides useful information on the assembly expected for a species given its number of reads and abundance. This approach along with spiking was also demonstrated to be useful in improving the draft assembly of a bacterial genome. The Binning-Assembly approach can be successfully used to reconstruct bacterial and viral genomes from multiple metagenomic datasets obtained from similar environments. PMID:27148174

  6. From bacterial genome to functionality; case bifidobacteria.

    Science.gov (United States)

    Ventura, Marco; O'Connell-Motherway, Mary; Leahy, Sinead; Moreno-Munoz, Jose Antonio; Fitzgerald, Gerald F; van Sinderen, Douwe

    2007-11-30

    The availability of complete bacterial genome sequences has significantly furthered our understanding of the genetics, physiology and biochemistry of the microorganisms in question, particularly those that have commercially important applications. Bifidobacteria are among such microorganisms, as they constitute mammalian commensals of biotechnological significance due to their perceived role in maintaining a balanced gastrointestinal (GIT) microflora. Bifidobacteria are therefore frequently used as health-promoting or probiotic components in functional food products. A fundamental understanding of the metabolic activities employed by these commensal bacteria, in particular their capability to utilize a wide range of complex oligosaccharides, can reveal ways to provide in vivo growth advantages relative to other competing gut bacteria or pathogens. Furthermore, an in depth analysis of adaptive responses to nutritional or environmental stresses may provide methodologies to retain viability and improve functionality during commercial preparation, storage and delivery of the probiotic organism. PMID:17629975

  7. Genome Update: alignment of bacterial chromosomes

    DEFF Research Database (Denmark)

    Ussery, David; Jensen, Mette; Poulsen, Tine Rugh;

    2004-01-01

    There are four new microbial genomes listed in this month's Genome Update, three belonging to Gram-positive bacteria and one belonging to an archaeon that lives at pH 0; all of these genomes are listed in Table 1. The method of genome comparison this month is that of genome alignment and..., as an example, an alignment of seven Staphylococcus aureus genomes and one Staphylococcus epidermidis genome is presented...

  8. Genome evolution and systems biology in bacterial endosymbionts of insects

    OpenAIRE

    Belda Cuesta, Eugeni

    2010-01-01

    Gene loss is the most important event in the process of genome reduction associated with bacterial endosymbionts of insects. These small genomes are derived features that evolved from ancestral prokaryotes with larger genomes, as a consequence of a massive process of genome reduction driven by drastic changes in the ecological conditions and evolutionary pressures acting on these prokaryotic lineages during their ecological transition to a host-dependent lifestyle. In the present thesis, ...

  9. Bacterial epidemiology and biology - lessons from genome sequencing.

    OpenAIRE

    Parkhill, J.; Wren, BW

    2011-01-01

    Next-generation sequencing has ushered in a new era of microbial genomics, enabling the detailed historical and geographical tracing of bacteria. This is helping to shape our understanding of bacterial evolution.

  10. Holotransformations of bacterial colonies and genome cybernetics

    Science.gov (United States)

    Ben-Jacob, Eshel; Tenenbaum, Adam; Shochet, Ofer; Avidan, Orna

    1994-01-01

    We present a study of colony transformations during growth of Bacillus subtilis under adverse environmental conditions. It is a continuation of our pilot study of “Adaptive self-organization during growth of bacterial colonies” (Physica A 187 (1992) 378). First we identify and describe the transformations pathway, i.e. the excitation of the branching modes from Bacillus subtilis 168 (grown under diffusion limited conditions) and the phase transformations between the tip-splitting phase (phase T) and the chiral phase (phase C) which belong to the same mode. This pathway shows the evolution of complexity as the bacteria are exposed to adverse growth conditions. We present the morphology diagram of phases T and C as a function of agar concentration and peptone level. As expected, the growth of phase T is ramified (fractal-like or DLA-like) at low peptone level (about 1 g/l) and turns compact at high peptone level (about 10 g/l). The growth of phase C is also ramified at low peptone level and turns denser and finally compact as the peptone level increases. Generally speaking, the colonies develop more complex patterns and higher micro-level organization for more adverse environments. We use the growth velocity as a response function to describe the growth. At low agar concentration (and low peptone level) phase C grows faster than phase T, and for a high agar concentration (about 2%) phase T grows faster. We observe colony transformations between the two phases (phase transformations). They are found to be consistent with the “fastest growing morphology” selection principle adopted from azoic systems. The transformations are always from the slower phase to the faster one. Hence, we observe T→ C transformations at low agar concentrations and C→ T transformations at high agar concentrations. We have observed both localized and extended transformations. Usually, the transformations are localized for more adverse growth conditions, and extended for growth conditions

  11. Minimum taxonomic criteria for bacterial genome sequence depositions and announcements.

    Science.gov (United States)

    Bull, Matthew J; Marchesi, Julian R; Vandamme, Peter; Plummer, Sue; Mahenthiralingam, Eshwar

    2012-04-01

    Multiple bioinformatic methods are available to analyse the information encoded within the complete genome sequence of a bacterium and accurately assign its species status or nearest phylogenetic neighbour. However, it is clear that even now in what is the third decade of bacterial genomics, taxonomically incorrect genome sequence depositions are still being made. We outline a simple scheme of bioinformatic analysis and a set of minimum criteria that should be applied to all bacterial genomic data to ensure that they are accurately assigned to the species or genus level prior to database deposition. To illustrate the utility of the bioinformatic workflow, we analysed the recently deposited genome sequence of Lactobacillus acidophilus 30SC and demonstrated that this DNA was in fact derived from a strain of Lactobacillus amylovorus. Using these methods researchers can ensure that the taxonomic accuracy of genome sequence depositions is maintained within the ever increasing nucleic acid datasets. PMID:22366464

  12. A new experimental approach for studying bacterial genomic island evolution identifies island genes with bacterial host-specific expression patterns

    OpenAIRE

    Nickerson Cheryl A; Wilson James W

    2006-01-01

    Background: Genomic islands are regions of bacterial genomes that have been acquired by horizontal transfer and often contain blocks of genes that function together for specific processes. Recently, it has become clear that the impact of genomic islands on the evolution of different bacterial species is significant and represents a major force in establishing bacterial genomic variation. However, the study of genomic island evolution has been mostly performed at the sequence level usi...

  13. Scaling of immune responses against intracellular bacterial infection

    OpenAIRE

    Abdullah, Zeinab; Knolle, Percy A.

    2014-01-01

    Macrophages detect bacterial infection through pattern recognition receptors (PRRs) localized at the cell surface, in intracellular vesicles or in the cytosol. Discrimination of viable and virulent bacteria from non-virulent bacteria (dead or viable) is necessary to appropriately scale the anti-bacterial immune response. Such scaling of anti-bacterial immunity is necessary to control the infection, but also to avoid immunopathology or bacterial persistence. PRR-mediated detection of bacterial...

  14. Bacterial Cellular Engineering by Genome Editing and Gene Silencing

    Directory of Open Access Journals (Sweden)

    Nobutaka Nakashima

    2014-02-01

    Genome editing is an important technology for bacterial cellular engineering, which is commonly conducted by homologous recombination-based procedures, including gene knockout (disruption), knock-in (insertion), and allelic exchange. In addition, some new recombination-independent approaches have emerged that utilize catalytic RNAs, artificial nucleases, nucleic acid analogs, and peptide nucleic acids. Apart from these methods, which directly modify the genomic structure, an alternative approach is to conditionally modify the gene expression profile at the posttranscriptional level without altering the genomes. This is performed by expressing antisense RNAs to knock down (silence) target mRNAs in vivo. This review describes the features of, and recent advances in, methods used in genomic engineering and silencing technologies that are advantageously used for bacterial cellular engineering.

  15. Differentiation of regions with atypical oligonucleotide composition in bacterial genomes

    Directory of Open Access Journals (Sweden)

    Reva Oleg N

    2005-10-01

    Background: Complete sequencing of bacterial genomes has become a common technique of present day microbiology. Thereafter, data mining in the complete sequence is an essential step. New in silico methods are needed that rapidly identify the major features of genome organization and facilitate the prediction of the functional class of ORFs. We tested the usefulness of local oligonucleotide usage (OU) patterns to recognize and differentiate types of atypical oligonucleotide composition in DNA sequences of bacterial genomes. Results: A total of 163 bacterial genomes of eubacteria and archaea published in the NCBI database were analyzed. Local OU patterns exhibit substantial intrachromosomal variation in bacteria. Loci with alternative OU patterns were parts of horizontally acquired gene islands or ancient regions such as genes for ribosomal proteins and RNAs. OU statistical parameters, such as local pattern deviation (D), pattern skew (PS) and OU variance (OUV), enabled the detection and visualization of gene islands of different functional classes. Conclusion: A set of approaches has been designed for the statistical analysis of nucleotide sequences of bacterial genomes. These methods are useful for the visualization and differentiation of regions with atypical oligonucleotide composition prior to or accompanying gene annotation.
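
    The idea of flagging windows whose local oligonucleotide usage departs from the genome-wide pattern can be sketched as follows. This is a minimal illustration only: the function names, the window/step/threshold parameters, and the distance measure (sum of absolute frequency differences) are assumptions made for this sketch, and they are not the D, PS, or OUV statistics defined in the paper.

```python
from collections import Counter


def kmer_frequencies(seq, k=4):
    """Relative frequencies of overlapping k-mers in seq (empty dict if seq is shorter than k)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def pattern_distance(freq_a, freq_b):
    """Sum of absolute frequency differences (0 = identical usage, 2 = no shared k-mers)."""
    kmers = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(m, 0.0) - freq_b.get(m, 0.0)) for m in kmers)


def atypical_windows(genome, window, step, k=4, threshold=1.0):
    """Return (start, distance) for windows whose k-mer usage deviates from the whole sequence.

    The threshold is an arbitrary illustrative cut-off, not a published value.
    """
    global_freq = kmer_frequencies(genome, k)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        local_freq = kmer_frequencies(genome[start:start + window], k)
        distance = pattern_distance(local_freq, global_freq)
        if distance > threshold:
            hits.append((start, distance))
    return hits


if __name__ == "__main__":
    # Toy sequence: an AT-rich background with a short GC-only insert.
    toy = "AT" * 50 + "GC" * 10 + "AT" * 50
    for start, distance in atypical_windows(toy, window=20, step=20, k=2):
        print(start, round(distance, 3))
```

    On real genomes one would typically use tetranucleotides and windows of several kilobases; the toy call uses dinucleotides only so that the example stays small.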

  16. DIYA: a bacterial annotation pipeline for any genomics lab

    OpenAIRE

    Stewart, Andrew C.; Osborne, Brian; Read, Timothy D

    2009-01-01

    Summary: DIYA (Do-It-Yourself Annotator) is a modular and configurable open source pipeline software, written in Perl, used for the rapid annotation of bacterial genome sequences. The software is currently used to take DNA contigs as input, either in the form of complete genomes or the result of shotgun sequencing, and produce an annotated sequence in Genbank file format as output. Availability: Distribution and source code are available at (https://sourceforge.net/projects/diyg/). Contact: tr...

  17. Ecological and Temporal Constraints in the Evolution of Bacterial Genomes

    OpenAIRE

    Jose Luis Martínez; Luis Boto

    2011-01-01

    Studies on the experimental evolution of microorganisms, on their in vivo evolution (mainly in the case of bacteria producing chronic infections), as well as the availability of multiple full genomic sequences, are placing bacteria in the playground of evolutionary studies. In the present article we review the differential contribution to the evolution of bacterial genomes that processes such as gene modification, gene acquisition and gene loss may have when bacteria colonize different habita...

  18. The Neolithic revolution of bacterial genomes.

    Science.gov (United States)

    Mira, Alex; Pushker, Ravindra; Rodríguez-Valera, Francisco

    2006-05-01

    Current human activities undoubtedly impact natural ecosystems. However, the influence of Homo sapiens on living organisms must have also occurred in the past. Certain genomic characteristics of prokaryotes can be used to study the impact of ancient human activities on microorganisms. By analyzing DNA sequence similarity features of transposable elements, dramatic genomic changes have been identified in bacteria that are associated with large and stable human communities, agriculture and animal domestication: three features unequivocally linked to the Neolithic revolution. It is hypothesized that bacteria specialized in human-associated niches underwent an intense transformation after the social and demographic changes that took place with the first Neolithic settlements. These genomic changes are absent in related species that are not specialized in humans. PMID:16569502

  19. CRISPR-assisted editing of bacterial genomes

    OpenAIRE

    Jiang, Wenyan; Bikard, David; Cox, David; Zhang, Feng; Marraffini, Luciano A.

    2013-01-01

    The targeting of nucleases to specific DNA sequences facilitates genome editing. Recent work demonstrated that the CRISPR-associated (Cas) nuclease Cas9 can be targeted to sequences in vitro simply by modifying a short CRISPR RNA (crRNA) guide. Here we use this CRISPR-Cas system to introduce marker-free mutations in Streptococcus pneumoniae and Escherichia coli. The approach involves re-programming Cas9 by using a crRNA complementary to a target chromosomal locus and introducing a template D...

  20. Genes but not genomes reveal bacterial domestication of Lactococcus lactis.

    Directory of Open Access Journals (Sweden)

    Delphine Passerini

    BACKGROUND: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST) scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE). METHODOLOGY/PRINCIPAL FINDINGS: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content) did not correlate to the MLST-based phylogeny, with strains from the same sequence type (ST) differing by up to 230 kb in genome size. CONCLUSION/SIGNIFICANCE: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between "environmental" strains, the main contributors to the genetic diversity within the subspecies, and "domesticated" strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the "domesticated" strains essentially arose through substantial genomic flux within the dispensable

  1. Bacterial genomic adaptation and response to metals

    International Nuclear Information System (INIS)

    The beta-proteobacterium Cupriavidus metallidurans CH34 (formerly Ralstonia metallidurans) has been intensively studied since 1976 in SCK-CEN and VITO, for its adaptation capacity to survive in harsh (mostly industrial) environments, to overcome acute environmental stresses, for its resistance to a variety of heavy metals and for applications in environmental biotechnology. Recently, CH34 has become a model bacterium to study the effect of spaceflight conditions in several space flight experiments conducted by SCK-CEN (e.g. MESSAGE, BASE). Furthermore, Cupriavidus and Ralstonia species are isolated from the floor, air and surfaces of spacecraft assembly rooms; were found prior-to-flight on surfaces of space robots such as the Mars Odyssey Orbiter and even in-flight in ISS cooling water and Shuttle drinking water, vindicating its role as model bacterium in space research. In addition, Ralstonia species are also the causative agent of nosocomial infections and are among the unusual species recovered from cystic fibrosis (CF) patients. The genomic organization of Cupriavidus metallidurans CH34 was studied in-depth to identify the genetic and regulatory structures involved in the resistance to heavy metals

  2. Within-host bacterial diversity hinders accurate reconstruction of transmission networks from genomic distance data.

    OpenAIRE

    Worby, Colin J.; Marc Lipsitch; Hanage, William P

    2014-01-01

    The prospect of using whole genome sequence data to investigate bacterial disease outbreaks has been keenly anticipated in many quarters, and the large-scale collection and sequencing of isolates from cases is becoming increasingly feasible. While sequence data can provide many important insights into disease spread and pathogen adaptation, it remains unclear how successfully they may be used to estimate individual routes of transmission. Several studies have attempted to reconstruct transmis...

  3. Within-Host Bacterial Diversity Hinders Accurate Reconstruction of Transmission Networks from Genomic Distance Data

    OpenAIRE

    Worby, Colin J.; Marc Lipsitch; William P Hanage

    2014-01-01

    The prospect of using whole genome sequence data to investigate bacterial disease outbreaks has been keenly anticipated in many quarters, and the large-scale collection and sequencing of isolates from cases is becoming increasingly feasible. While sequence data can provide many important insights into disease spread and pathogen adaptation, it remains unclear how successfully they may be used to estimate individual routes of transmission. Several studies have attempted to reconstruct transmis...

  4. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal Matoq Saeed

    2015-08-18

    Background: Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results: The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. Conclusions: We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/

  5. The influence of the accessory genome on bacterial pathogen evolution

    OpenAIRE

    Jackson, Robert W.; Vinatzer, Boris; Arnold, Dawn L.; Dorus, Steve; Murillo, Jesus

    2011-01-01

    Bacterial pathogens exhibit significant variation in their genomic content of virulence factors. This reflects the abundance of strategies pathogens evolved to infect host organisms by suppressing host immunity. Molecular arms-races have been a strong driving force for the evolution of pathogenicity, with pathogens often encoding overlapping or redundant functions, such as type III protein secretion effectors and hosts encoding ever more sophisticated immune systems. The pathogens’ frequent e...

  6. Genome Assembly and Computational Analysis Pipelines for Bacterial Pathogens

    KAUST Repository

    Rangkuti, Farania Gama Ardhina

    2011-06-01

    Pathogens lie behind the deadliest pandemics in history. To date, the AIDS pandemic has resulted in more than 25 million fatal cases, while tuberculosis and malaria annually claim more than 2 million lives. Comparative genomic analyses are needed to gain insights into the molecular mechanisms of pathogens, but the abundance of biological data dictates that such studies cannot be performed without the assistance of computational approaches. This explains the significant need for computational pipelines for genome assembly and analyses. The aim of this research is to develop such pipelines. This work utilizes various bioinformatics approaches to analyze the high-throughput genomic sequence data that has been obtained from several strains of bacterial pathogens. A pipeline has been compiled for quality control for sequencing and assembly, and several protocols have been developed to detect contaminations. Visualization has been generated of genomic data in various formats, in addition to alignment, homology detection and sequence variant detection. We have also implemented a metaheuristic algorithm that significantly improves bacterial genome assemblies compared to other known methods. Experiments on Mycobacterium tuberculosis H37Rv data showed that our method resulted in improvement of N50 value of up to 9697% while consistently maintaining high accuracy, covering around 98% of the published reference genome. Other improvement efforts were also implemented, consisting of iterative local assemblies and iterative correction of contiguated bases. Our result expedites the genomic analysis of virulent genes up to single base pair resolution. It is also applicable to virtually every pathogenic microorganism, propelling further research in the control of and protection from pathogen-associated diseases.
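
    For readers unfamiliar with the N50 statistic quoted above, the following sketch shows the standard definition and how a percentage improvement in N50 would be computed. The function names and the contig lengths in the usage example are hypothetical and are not taken from the thesis.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def percent_improvement(before, after):
    """Relative change from before to after, expressed in percent."""
    return 100.0 * (after - before) / before


if __name__ == "__main__":
    # Hypothetical assemblies of the same reads: fragmented draft vs. improved draft.
    draft = [2, 2, 2, 3, 3, 4, 8, 8]
    improved = [4, 12, 16]
    print(n50(draft), n50(improved))
    print(percent_improvement(n50(draft), n50(improved)))
```

    Note that an improvement of 9697% corresponds to an N50 roughly 98 times larger than the starting value, since the percentage is measured relative to the original N50.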

  7. Completing bacterial genome assemblies: strategy and performance comparisons.

    Science.gov (United States)

    Liao, Yu-Chieh; Lin, Shu-Hung; Lin, Hsin-Hung

    2015-01-01

    Determining the genomic sequences of microorganisms is the basis and prerequisite for understanding their biology and functional characterization. While the advent of low-cost, extremely high-throughput second-generation sequencing technologies and the parallel development of assembly algorithms have generated rapid and cost-effective genome assemblies, such assemblies are often unfinished, fragmented draft genomes as a result of short read lengths and long repeats present in multiple copies. Third-generation, PacBio sequencing technologies circumvented this problem by greatly increasing read length. Hybrid approaches including ALLPATHS-LG, PacBio corrected reads pipeline, SPAdes, and SSPACE-LongRead, and non-hybrid approaches--hierarchical genome-assembly process (HGAP) and PacBio corrected reads pipeline via self-correction--have therefore been proposed to utilize the PacBio long reads that can span many thousands of bases to facilitate the assembly of complete microbial genomes. However, standardized procedures that aim at evaluating and comparing these approaches are currently insufficient. To address the issue, we herein provide a comprehensive comparison by collecting datasets for the comparative assessment on the above-mentioned five assemblers. In addition to offering explicit and beneficial recommendations to practitioners, this study aims to aid in the design of a paradigm positioned to complete bacterial genome assembly. PMID:25735824

  8. mGenomeSubtractor: a web-based tool for parallel in silico subtractive hybridization analysis of multiple bacterial genomes.

    Science.gov (United States)

    Shao, Yucheng; He, Xinyi; Harrison, Ewan M; Tai, Cui; Ou, Hong-Yu; Rajakumar, Kumar; Deng, Zixin

    2010-07-01

    mGenomeSubtractor performs an mpiBLAST-based comparison of reference bacterial genomes against multiple user-selected genomes for investigation of strain variable accessory regions. With parallel computing architecture, mGenomeSubtractor is able to run rapid BLAST searches of the segmented reference genome against multiple subject genomes at the DNA or amino acid level within a minute. In addition to comparison of protein coding sequences, the highly flexible sliding window-based genome fragmentation approach offered can be used to identify short unique sequences within or between genes. mGenomeSubtractor provides powerful schematic outputs for exploration of identified core and accessory regions, including searches against databases of mobile genetic elements, virulence factors or bacterial essential genes, examination of G+C content and binucleotide distribution bias, and integrated primer design tools. mGenomeSubtractor also allows for the ready definition of species-specific gene pools based on available genomes. Pan-genomic arrays can be easily developed using the efficient oligonucleotide design tool. This simple high-throughput in silico 'subtractive hybridization' analytical tool will support the rapidly escalating number of comparative bacterial genomics studies aimed at defining genomic biomarkers of evolutionary lineage, phenotype, pathotype, environmental adaptation and/or disease-association of diverse bacterial species. mGenomeSubtractor is freely available to all users without any login requirement at: http://bioinfo-mml.sjtu.edu.cn/mGS/. PMID:20435682
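
    The sliding-window fragmentation and G+C content examination mentioned above can be illustrated with a short sketch. This is not mGenomeSubtractor's code: the function names, the window and step parameters, and the decision to drop a trailing partial window are assumptions made for the example.

```python
def sliding_windows(sequence, window, step):
    """Yield (start, fragment) pairs for fixed-size windows moved along the sequence.

    A trailing fragment shorter than the window is dropped (an assumption of this sketch).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, len(sequence) - window + 1, step):
        yield start, sequence[start:start + window]


def gc_content(fragment):
    """Fraction of G and C bases in a fragment (0.0 for an empty fragment)."""
    fragment = fragment.upper()
    if not fragment:
        return 0.0
    return (fragment.count("G") + fragment.count("C")) / len(fragment)


if __name__ == "__main__":
    # Each fragment could then be used as a separate BLAST query against subject genomes.
    for start, fragment in sliding_windows("ACGTACGTGGCC", window=4, step=2):
        print(start, fragment, gc_content(fragment))
```

    Choosing a step smaller than the window gives overlapping fragments, which is what allows short unique sequences that straddle a fragment boundary to be picked up.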

  9. Reconstruction of a Bacterial Genome from DNA Cassettes

    Energy Technology Data Exchange (ETDEWEB)

    Christopher Dupont; John Glass; Laura Sheahan; Shibu Yooseph; Lisa Zeigler Allen; Mathangi Thiagarajan; Andrew Allen; Robert Friedman; J. Craig Venter

    2011-12-31

    This basic research program comprised two major areas: (1) acquisition and analysis of marine microbial metagenomic data and development of genomic analysis tools for broad, external community use; (2) development of a minimal bacterial genome. Our Marine Metagenomic Diversity effort generated and analyzed shotgun sequencing data from microbial communities sampled from over 250 sites around the world. About 40% of the 26 Gbp of sequence data has been made publicly available to date with a complete release anticipated in six months. Our results and those mining the deposited data have revealed a vast diversity of genes coding for critical metabolic processes whose phylogenetic and geographic distributions will enable a deeper understanding of carbon and nutrient cycling, microbial ecology, and rapid rate evolutionary processes such as horizontal gene transfer by viruses and plasmids. A global assembly of the generated dataset resulted in a massive set (5Gbp) of genome fragments that provide context to the majority of the generated data that originated from uncultivated organisms. Our Synthetic Biology team has made significant progress towards the goal of synthesizing a minimal mycoplasma genome that will have all of the machinery for independent life. This project, once completed, will provide fundamentally new knowledge about requirements for microbial life and help to lay a basic research foundation for developing microbiological approaches to bioenergy.

  10. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    DEFF Research Database (Denmark)

    Thomsen, Martin Christen Frølund; Ahrenfeldt, Johanne; Bellod Cisneros, Jose Luis;

    2016-01-01

    ... web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the... platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely...

  11. Beginner’s guide to comparative bacterial genome analysis using next-generation sequence data

    OpenAIRE

    Edwards, David J.; Holt, Kathryn E.

    2013-01-01

    High throughput sequencing is now fast and cheap enough to be considered part of the toolbox for investigating bacteria, and there are thousands of bacterial genome sequences available for comparison in the public domain. Bacterial genome analysis is increasingly being performed by diverse groups in research, clinical and public health labs alike, who are interested in a wide array of topics related to bacterial genetics and evolution. Examples include outbreak analysis and the study of patho...

  12. Sequencing of Bacterial Genomes: Principles and Insights into Pathogenesis and Development of Antibiotics

    Directory of Open Access Journals (Sweden)

    Eric S. Donkor

    2013-10-01

    Full Text Available The impact of bacterial diseases on public health has become enormous, partly because of the increasing trend of antibiotic resistance displayed by bacterial pathogens. Sequencing of bacterial genomes has significantly improved our understanding of the biology of many bacterial pathogens as well as the identification of novel antibiotic targets. Since the advent of genome sequencing two decades ago, about 1,800 bacterial genomes have been fully sequenced and these include important aetiological agents such as Streptococcus pneumoniae, Mycobacterium tuberculosis, Escherichia coli O157:H7, Vibrio cholerae, Clostridium difficile and Staphylococcus aureus. Very recently, there has been an explosion of bacterial genome data, which is due to the development of rapidly evolving next generation sequencing technologies. Indeed, the field of microbial genomics is advancing at a very fast rate and it is difficult for researchers to keep abreast of the new developments. This highlights the need for regular updates in microbial genomics through comprehensive reviews. This review paper seeks to provide an update on bacterial genome sequencing generally, and to analyze insights gained from sequencing in two areas: bacterial pathogenesis and the development of antibiotics.

  13. Predicting statistical properties of open reading frames in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Katharina Mir

    Full Text Available An analytical model based on the statistical properties of Open Reading Frames (ORFs) of eubacterial genomes, such as codon composition and sequence length of all reading frames, was developed. This new model predicts the average length, maximum length as well as the length distribution of the ORFs of 70 species with GC contents varying between 21% and 74%. Furthermore, the number of annotated genes is predicted with high accordance. However, the ORF length distribution in the five alternative reading frames shows interesting deviations from the predicted distribution. In particular, long ORFs appear more often than expected statistically. The unexpected depletion of stop codons in these alternative open reading frames cannot completely be explained by a biased codon usage in the +1 frame. While it is unknown if the stop codon depletion has a biological function, it could be due to a protein coding capacity of alternative ORFs exerting a selection pressure which prevents the fixation of stop codon mutations. The comparison of the analytical model with bacterial genomes, therefore, leads to a hypothesis suggesting novel gene candidates which can now be investigated in subsequent wet lab experiments.
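
To make the link between GC content and expected ORF length concrete, here is a minimal sketch of a much simpler null model than the one the abstract describes. It assumes independent nucleotides with equal A and T frequencies and equal G and C frequencies, which is an illustrative assumption and not the authors' analytical model; the function names are hypothetical.

```python
def stop_codon_probability(gc):
    """Chance that a random codon is TAA, TAG or TGA.

    Assumes independent bases with P(A) = P(T) = (1 - gc) / 2 and
    P(G) = P(C) = gc / 2 (a simplification chosen for illustration).
    """
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must be a fraction between 0 and 1")
    at = (1.0 - gc) / 2.0   # probability of A (and of T)
    g = gc / 2.0            # probability of G (and of C)
    # TAA + TAG + TGA = T*A*A + T*A*G + T*G*A
    return at * at * at + 2.0 * at * at * g


def mean_codons_to_stop(gc):
    """Mean of the geometric distribution: codons read until the first stop."""
    return 1.0 / stop_codon_probability(gc)


if __name__ == "__main__":
    # The two GC extremes quoted in the abstract, plus a balanced genome.
    for gc in (0.21, 0.50, 0.74):
        print(f"GC = {gc:.2f}: about {mean_codons_to_stop(gc):.1f} codons per random ORF")
```

Under this toy model, random ORFs get longer as GC content rises because all three stop codons are AT-rich. Deviations from such a baseline are the kind of signal the abstract reports for the alternative reading frames.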

  14. CISA: contig integrator for sequence assembly of bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Shin-Hung Lin

    Full Text Available A plethora of algorithmic assemblers have been proposed for the de novo assembly of genomes; however, no individual assembler guarantees the optimal assembly for diverse species. Various parameters of an assembler are often tuned in order to generate the best possible assembly. However, few efforts have been pursued to take advantage of multiple assemblies to yield an assembly of high accuracy. In this study, we employ various state-of-the-art assemblers to generate different sets of contigs for bacterial genomes. A tool, named CISA, has been developed to integrate the assemblies into a hybrid set of contigs, resulting in assemblies of superior contiguity and accuracy, compared with the assemblies generated by the state-of-the-art assemblers and the hybrid assemblies merged by existing tools. This tool is implemented in Python and requires MUMmer and BLAST+ to be installed on the local machine. The source code of CISA and examples of its use are available at http://sb.nhri.org.tw/CISA/.
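
The abstract compares assemblies by contiguity without naming a metric. One widely used contiguity summary is N50, sketched below as an illustration; whether CISA's authors used N50 specifically is an assumption, and the function name is hypothetical rather than part of CISA.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2.0
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Two hypothetical assemblies of the same 32-base toy genome.
    fragmented = [2, 2, 2, 3, 3, 4, 8, 8]
    merged = [16, 8, 8]
    print(n50(fragmented), n50(merged))
```

A merged assembly with the same total length but fewer, longer contigs yields a higher N50, which is one way "superior contiguity" can be quantified.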

  15. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    OpenAIRE

    Momchilo Vuyisich; Ayesha Arefin; Karen Davenport; Shihai Feng; Cheryl Gleasner; Kim McMurry; Beverly Parson-Quintana; Jennifer Price; Matthew Scholz; Patrick Chain

    2014-01-01

    Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the util...

  16. Ralstonia solanacearum, a widespread bacterial plant pathogen in the post-genomic era

    OpenAIRE

    Peeters, Nemo; Guidot, Alice; Vailleau, Fabienne; Valls i Matheu, Marc

    2013-01-01

    Ralstonia solanacearum is a soil-borne bacterium causing the widespread disease known as bacterial wilt. Ralstonia solanacearum is also the causal agent of Moko disease of banana and brown rot of potato. Since the last R. solanacearum pathogen profile was published 10 years ago, studies concerning this plant pathogen have taken a genomic and post-genomic direction. This was pioneered by the first sequenced and annotated genome for a major plant bacterial pathogen and followed by many more gen...

  17. Unique core genomes of the bacterial family vibrionaceae: insights into niche adaptation and speciation

    OpenAIRE

    Kahlke Tim; Goesmann Alexander; Hjerde Erik; Willassen Nils; Haugen Peik

    2012-01-01

    Abstract Background The criteria for defining bacterial species and even the concept of bacterial species itself are under debate, and the discussion is apparently intensifying as more genome sequence data is becoming available. However, it is still unclear how the new advances in genomics should be used most efficiently to address this question. In this study we identify genes that are common to any group of genomes in our dataset, to determine whether genes specific to a particular taxon ex...

  18. Across bacterial phyla, distantly-related genomes with similar genomic GC content have similar patterns of amino acid usage.

    Directory of Open Access Journals (Sweden)

    John Lightfield

    Full Text Available The GC content of bacterial genomes ranges from 16% to 75% and wide ranges of genomic GC content are observed within many bacterial phyla, including both gram negative and gram positive phyla. Thus, divergent genomic GC content has evolved repeatedly in widely separated bacterial taxa. Since genomic GC content influences codon usage, we examined codon usage patterns and predicted protein amino acid content as a function of genomic GC content within eight different phyla or classes of bacteria. We found that similar patterns of codon usage and protein amino acid content have evolved independently in all eight groups of bacteria. For example, in each group, use of amino acids encoded by GC-rich codons increased by approximately 1% for each 10% increase in genomic GC content, while the use of amino acids encoded by AT-rich codons decreased by a similar amount. This consistency within every phylum and class studied led us to conclude that GC content appears to be the primary determinant of the codon and amino acid usage patterns observed in bacterial genomes. These results also indicate that selection for translational efficiency of highly expressed genes is constrained by the genomic parameters associated with the GC content of the host genome.
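
The quantity at the centre of this abstract, genomic GC content, is simple to compute from a sequence. The sketch below is an illustration only: the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption about how one might handle draft sequences, not something the abstract specifies.

```python
def gc_content(sequence):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return gc / acgt


if __name__ == "__main__":
    # Toy sequences; real genomes would be read from FASTA files.
    print(gc_content("ATGCGC"))   # 4 of 6 bases are G or C
    print(gc_content("atnnGC"))   # N is skipped, so 2 of 4 bases count
```

Applied per genome, a value like this would be the x-axis against which codon and amino acid usage are compared.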

  19. A new experimental approach for studying bacterial genomic island evolution identifies island genes with bacterial host-specific expression patterns

    Directory of Open Access Journals (Sweden)

    Nickerson Cheryl A

    2006-01-01

    Full Text Available Abstract Background Genomic islands are regions of bacterial genomes that have been acquired by horizontal transfer and often contain blocks of genes that function together for specific processes. Recently, it has become clear that the impact of genomic islands on the evolution of different bacterial species is significant and represents a major force in establishing bacterial genomic variation. However, the study of genomic island evolution has been mostly performed at the sequence level using computer software or hybridization analysis to compare different bacterial genomic sequences. We describe here a novel experimental approach to study the evolution of species-specific bacterial genomic islands that identifies island genes that have evolved in such a way that they are differentially-expressed depending on the bacterial host background into which they are transferred. Results We demonstrate this approach by using a "test" genomic island that we have cloned from the Salmonella typhimurium genome (island 4305) and transferred to a range of Gram negative bacterial hosts of differing evolutionary relationships to S. typhimurium. Systematic analysis of the expression of the island genes in the different hosts compared to proper controls allowed identification of genes with genera-specific expression patterns. The data from the analysis can be arranged in a matrix to give an expression "array" of the island genes in the different bacterial backgrounds. A conserved 19-bp DNA site was found upstream of at least two of the differentially-expressed island genes. To our knowledge, this is the first systematic analysis of horizontally-transferred genomic island gene expression in a broad range of Gram negative hosts. We also present evidence in this study that the IS200 element found in island 4305 in S. typhimurium strain LT2 was inserted after the island had already been acquired by the S. typhimurium lineage and that this element is likely not

  20. Genomic Epidemiology: Whole-Genome-Sequencing–Powered Surveillance and Outbreak Investigation of Foodborne Bacterial Pathogens

    DEFF Research Database (Denmark)

    Deng, Xiangyu; den Bakker, Henk C.; Hendriksen, Rene S.

    2016-01-01

    As we are approaching the twentieth anniversary of PulseNet, a network of public health and regulatory laboratories that has changed the landscape of foodborne illness surveillance through molecular subtyping, public health microbiology is undergoing another transformation brought about by so-called next-generation sequencing (NGS) technologies that have made whole-genome sequencing (WGS) of foodborne bacterial pathogens a realistic and superior alternative to traditional subtyping methods. Routine, real-time, and widespread application of WGS in food safety and public health is on the horizon...

  1. What Makes a Bacterial Species Pathogenic?: Comparative Genomic Analysis of the Genus Leptospira.

    Directory of Open Access Journals (Sweden)

    Derrick E Fouts

    2016-02-01

    Full Text Available Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade's refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems. By identifying large scale changes in infectious (pathogenic and intermediately

  2. What Makes a Bacterial Species Pathogenic?: Comparative Genomic Analysis of the Genus Leptospira.

    Science.gov (United States)

    Fouts, Derrick E; Matthias, Michael A; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L; Haake, David A; Haft, Daniel H; Hartskeerl, Rudy; Ko, Albert I; Levett, Paul N; Matsunaga, James; Mechaly, Ariel E; Monk, Jonathan M; Nascimento, Ana L T; Nelson, Karen E; Palsson, Bernhard; Peacock, Sharon J; Picardeau, Mathieu; Ricaldi, Jessica N; Thaipandungpanit, Janjira; Wunder, Elsio A; Yang, X Frank; Zhang, Jun-Jie; Vinetz, Joseph M

    2016-02-01

    Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade's refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems. By identifying large scale changes in infectious (pathogenic and intermediately pathogenic

  3. Bacterial communities in full-scale wastewater treatment systems

    OpenAIRE

    Cydzik-Kwiatkowska, Agnieszka; Zielińska, Magdalena

    2016-01-01

    Bacterial metabolism determines the effectiveness of biological treatment of wastewater. Therefore, it is important to define the relations between the species structure and the performance of full-scale installations. Although there is much laboratory data on microbial consortia, our understanding of dependencies between the microbial structure and operational parameters of full-scale wastewater treatment plants (WWTP) is limited. This mini-review presents the types of microbial consortia in...

  4. Metabolomic Functional Analysis of Bacterial Genomes: Final Report

    Energy Technology Data Exchange (ETDEWEB)

    Arp, Daniel J; Sayavedra-Soto, Luis A

    2008-01-01

    The availability of the complete DNA sequence of the bacterial genome of Nitrosomonas europaea offered the opportunity for unprecedented and detailed investigations of function. We studied the function of genes involved in carbohydrate and Fe metabolism. N. europaea has genes for the synthesis and degradation of glycogen and sucrose but cannot grow on substrates other than ammonia and CO2. Granules of glycogen were detected in whole cells by electron microscopy and quantified in cell-free extracts by enzymatic methods. The cellular glycogen and sucrose content varied depending on the composition of the growth medium and cellular growth stage. N. europaea also depends heavily on iron for metabolism of ammonia. It is particularly interesting because it lacks genes for siderophore production and has genes with only low similarity to known iron reductases, yet grows relatively well in medium containing low Fe. By comparing the transcriptomes of cells grown in iron-replete medium versus iron-limited medium, 247 genes were identified as differentially expressed. Mutant strains deficient in genes for sucrose, glycogen and iron metabolism were created and are being used to further our understanding of ammonia oxidizing bacteria.

  5. Scale-Invariant Correlations in Dynamic Bacterial Clusters

    Science.gov (United States)

    Chen, Xiao; Dong, Xu; Be'er, Avraham; Swinney, Harry L.; Zhang, H. P.

    2012-04-01

    In Bacillus subtilis colonies, motile bacteria move collectively, spontaneously forming dynamic clusters. These bacterial clusters share similarities with other systems exhibiting polarized collective motion, such as bird flocks or fish schools. Here we study experimentally how velocity and orientation fluctuations within clusters are spatially correlated. For a range of cell density and cluster size, the correlation length is shown to be 30% of the spatial size of clusters, and the correlation functions collapse onto a master curve after rescaling the separation with correlation length. Our results demonstrate that correlations of velocity and orientation fluctuations are scale invariant in dynamic bacterial clusters.

  6. Bacterial sigma factors: a historical, structural, and genomic perspective.

    Science.gov (United States)

    Feklístov, Andrey; Sharon, Brian D; Darst, Seth A; Gross, Carol A

    2014-01-01

    Transcription initiation is the crucial focal point of gene expression in prokaryotes. The key players in this process, sigma factors (σs), associate with the catalytic core RNA polymerase to guide it through the essential steps of initiation: promoter recognition and opening, and synthesis of the first few nucleotides of the transcript. Here we recount the key advances in σ biology, from their discovery 45 years ago to the most recent progress in understanding their structure and function at the atomic level. Recent data provide important structural insights into the mechanisms whereby σs initiate promoter opening. We discuss both the housekeeping σs, which govern transcription of the majority of cellular genes, and the alternative σs, which direct RNA polymerase to specialized operons in response to environmental and physiological cues. The review concludes with a genome-scale view of the extracytoplasmic function σs, the most abundant group of alternative σs. PMID:25002089

  7. Draft Genome Sequences of Six Novel Bacterial Isolates from Chicken Ceca

    Science.gov (United States)

    Duggett, Nicholas A.; Kay, Gemma L.; Sergeant, Martin J.; Bedford, Michael; Constantinidou, Chrystala I.; Penn, Charles W.; Millard, Andrew D.

    2016-01-01

    The chicken is the most common domesticated animal and the most abundant bird in the world. However, the chicken gut is home to many previously uncharacterized bacterial taxa. Here, we report draft genome sequences from six bacterial isolates from chicken ceca, all of which fall outside any named species. PMID:27231374

  8. Genomic Epidemiology: Whole-Genome-Sequencing-Powered Surveillance and Outbreak Investigation of Foodborne Bacterial Pathogens.

    Science.gov (United States)

    Deng, Xiangyu; den Bakker, Henk C; Hendriksen, Rene S

    2016-01-01

    As we are approaching the twentieth anniversary of PulseNet, a network of public health and regulatory laboratories that has changed the landscape of foodborne illness surveillance through molecular subtyping, public health microbiology is undergoing another transformation brought about by so-called next-generation sequencing (NGS) technologies that have made whole-genome sequencing (WGS) of foodborne bacterial pathogens a realistic and superior alternative to traditional subtyping methods. Routine, real-time, and widespread application of WGS in food safety and public health is on the horizon. Technological, operational, and policy challenges are still present and being addressed by an international and multidisciplinary community of researchers, public health practitioners, and other stakeholders. PMID:26772415

  9. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    Science.gov (United States)

    Vuyisich, Momchilo; Arefin, Ayesha; Davenport, Karen; Feng, Shihai; Gleasner, Cheryl; McMurry, Kim; Parson-Quintana, Beverly; Price, Jennifer; Scholz, Matthew; Chain, Patrick

    2014-01-01

    Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the utility of NEBNext Ultra for resequencing and de novo assembly of four bacterial genomes and compared its performance with the TruSeq library preparation kit. The NEBNext Ultra reagents enable high quality resequencing and de novo assembly of a variety of bacterial genomes when using 100 ng of input genomic DNA. For the two most challenging genomes (Burkholderia spp.), which have the highest GC content and are the longest, we also show that the quality of both resequencing and de novo assembly is not decreased when only 10 ng of input genomic DNA is used. PMID:25478564

  10. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    Directory of Open Access Journals (Sweden)

    Momchilo Vuyisich

    2014-01-01

    Full Text Available Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the utility of NEBNext Ultra for resequencing and de novo assembly of four bacterial genomes and compared its performance with the TruSeq library preparation kit. The NEBNext Ultra reagents enable high quality resequencing and de novo assembly of a variety of bacterial genomes when using 100 ng of input genomic DNA. For the two most challenging genomes (Burkholderia spp.), which have the highest GC content and are the longest, we also show that the quality of both resequencing and de novo assembly is not decreased when only 10 ng of input genomic DNA is used.

  11. Acidobacteria form a coherent but highly diverse group within the bacterial domain: evidence from environmental genomics

    DEFF Research Database (Denmark)

    Quaiser, Achim; Ochsenreiter, Torsten; Lanz, Christa; Schuster, Stephan C; Treusch, Alexander H; Eck, Jürgen; Schleper, Christa

    2003-01-01

    ecological role and extensive metabolic versatility. However, the genetic and physiological information about Acidobacteria is scarce. In order to gain insight into genome structure, evolution and diversity of these microorganisms we have initiated an environmental genomic approach by constructing large...... well-studied bacterial phyla....

  12. Complete Genomes of Classical Swine Fever Virus Cloned into Bacterial Artificial Chromosomes

    OpenAIRE

    Rasmussen, Thomas Bruun; Reimann, I; Uttenthal, Åse; De Beer, M.

    2011-01-01

    Complete genome amplification of viral RNA provides a new tool for the generation of modified pestiviruses. We have used our full-genome amplification strategy for generation of amplicons representing complete genomes of classical swine fever virus. The amplicons were cloned directly into a stable single-copy bacterial artificial chromosome (BAC) generating full-length pestivirus DNAs from which infectious RNA transcripts could be also derived. Our strategy allows construction of stable infec...

  13. Triad pattern algorithm for predicting strong promoter candidates in bacterial genomes

    Directory of Open Access Journals (Sweden)

    Sakanyan Vehary

    2008-05-01

    Full Text Available Abstract Background Bacterial promoters, which increase the efficiency of gene expression, differ from other promoters by several characteristics. This difference, not yet widely exploited in bioinformatics, looks promising for the development of relevant computational tools to search for strong promoters in bacterial genomes. Results We describe a new triad pattern algorithm that predicts strong promoter candidates in annotated bacterial genomes by matching specific patterns for the group I σ70 factors of Escherichia coli RNA polymerase. It detects promoter-specific motifs by consecutively matching three patterns, consisting of an UP-element, required for interaction with the α subunit, and then optimally-separated patterns of -35 and -10 boxes, required for interaction with the σ70 subunit of RNA polymerase. Analysis of 43 bacterial genomes revealed that the frequency of candidate sequences depends on the A+T content of the DNA under examination. The accuracy of in silico prediction was experimentally validated for the genome of a hyperthermophilic bacterium, Thermotoga maritima, by applying a cell-free expression assay using the predicted strong promoters. In this organism, the strong promoters govern genes for translation, energy metabolism, transport, cell movement, and other as-yet unidentified functions. Conclusion The triad pattern algorithm developed for predicting strong bacterial promoters is well suited for analyzing bacterial genomes with an A+T content of less than 62%. This computational tool opens new prospects for investigating global gene expression, and individual strong promoters in bacteria of medical and/or economic significance.
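
The three-step matching described in this abstract (an upstream UP-element, then a -35 box, then a -10 box at a suitable spacing) can be sketched as follows. This is an illustration under stated assumptions, not the authors' triad pattern algorithm: it uses the textbook E. coli σ70 consensus hexamers TTGACA and TATAAT with a spacer near 17 bp, and it approximates the UP-element as an A+T-rich window. The thresholds, window length and function names are all hypothetical.

```python
MINUS_35 = "TTGACA"   # textbook sigma-70 consensus for the -35 box
MINUS_10 = "TATAAT"   # textbook sigma-70 consensus for the -10 box


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoter_candidates(seq, max_mismatch=1, spacer_range=(16, 18),
                             up_len=20, min_up_at=0.7):
    """Return (pos_35, spacer, pos_10) tuples for candidate promoters.

    Step 1: the up_len bases upstream of the -35 box must be A+T-rich
            (a crude stand-in for an UP-element; assumed threshold).
    Step 2: a -35-like hexamer within max_mismatch of the consensus.
    Step 3: a -10-like hexamer after a spacer inside spacer_range.
    """
    seq = seq.upper()
    hits = []
    for i in range(up_len, len(seq) - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        upstream = seq[i - up_len:i]
        at_fraction = (upstream.count("A") + upstream.count("T")) / up_len
        if at_fraction < min_up_at:
            continue
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + 6 + spacer
            box = seq[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatch:
                hits.append((i, spacer, j))
    return hits


if __name__ == "__main__":
    # Synthetic test sequence: AT-rich upstream, -35 box, 17-bp spacer, -10 box.
    demo = "A" * 20 + "TTGACA" + "C" * 17 + "TATAAT" + "GGGG"
    print(find_promoter_candidates(demo))
```

Because the UP-element test is based on A+T richness, a scheme of this kind would plausibly return more candidates in AT-rich genomes, which is consistent with the dependence on A+T content that the abstract reports.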

  14. Whole genome sequencing of bacteria in cystic fibrosis as a model for bacterial genome adaptation and evolution.

    Science.gov (United States)

    Sharma, Poonam; Gupta, Sushim Kumar; Rolain, Jean-Marc

    2014-03-01

    Cystic fibrosis (CF) airways harbor a wide variety of new and/or emerging multidrug resistant bacteria which impose a heavy burden on patients. These bacteria live in close proximity with one another, which increases the frequency of lateral gene transfer. The exchange and movement of mobile genetic elements and genomic islands facilitate the spread of genes between genetically diverse bacteria, which seem to be advantageous to the bacterium as it allows adaptation to the new niches of the CF lungs. Niche adaptation is one of the major evolutionary forces shaping bacterial genome composition and in CF the chronic strains adapt and become less virulent. The purpose of this review is to shed light on CF bacterial genome alterations. Next-generation sequencing technology is an exciting tool that may help us to decipher the genome architecture and the evolution of bacteria colonizing CF lungs. PMID:24502835

  15. Multidimensional scaling for large genomic data sets

    Directory of Open Access Journals (Sweden)

    Lu Henry

    2008-04-01

    Full Text Available Abstract Background Multi-dimensional scaling (MDS) aims to represent high dimensional data in a low dimensional space with preservation of the similarities between data points. This reduction in dimensionality is crucial for analyzing and revealing the genuine structure hidden in the data. For noisy data, dimension reduction can effectively reduce the effect of noise on the embedded structure. For large data sets, dimension reduction can effectively reduce information retrieval complexity. Thus, MDS techniques are used in many applications of data mining and gene network research. However, although there have been a number of studies that applied MDS techniques to genomics research, the number of analyzed data points was restricted by the high computational complexity of MDS. In general, a non-metric MDS method is faster than a metric MDS, but it does not preserve the true relationships. The computational complexity of most metric MDS methods is over O(N^2), so that it is difficult to process a data set of a large number of genes N, such as in the case of whole genome microarray data. Results We developed a new rapid metric MDS method with a low computational complexity, making metric MDS applicable for large data sets. Computer simulation showed that the new method of split-and-combine MDS (SC-MDS) is fast, accurate and efficient. Our empirical studies using microarray data on the yeast cell cycle showed that the performance of K-means in the reduced dimensional space is similar to or slightly better than that of K-means in the original space, but about three times faster to obtain the clustering results. Our clustering results using SC-MDS are more stable than those in the original space. Hence, the proposed SC-MDS is useful for analyzing whole genome data. Conclusion Our new method reduces the computational complexity from O(N^3) to O(N) when the dimension of the feature space is far less than the number of genes N, and it successfully
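
For orientation, the sketch below shows classical metric MDS in its simplest form, reduced to a one-dimensional embedding. It is not the SC-MDS method from the abstract, whose split-and-combine details are not given here. Using power iteration for the leading eigenvector, the 1-D restriction and the function name are all choices made for this illustration.

```python
import math


def classical_mds_1d(dist, iterations=500):
    """Embed points on a line from a symmetric distance matrix.

    Classical MDS: double-centre the squared distances to get a Gram
    matrix B, then scale the leading eigenvector of B by the square
    root of its eigenvalue.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # B = -1/2 * J * D^2 * J, written out element by element.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the leading eigenpair of B (needs all N x N entries).
    v = [float(i + 1) for i in range(n)]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    return [math.sqrt(eigenvalue) * x for x in v]


if __name__ == "__main__":
    # Three points that really lie on a line at 0, 3 and 7.
    distances = [[0, 3, 7],
                 [3, 0, 4],
                 [7, 4, 0]]
    print(classical_mds_1d(distances))
```

Even this toy version has to build and repeatedly multiply the full N × N matrix, which is the cost that makes metric MDS hard to apply to whole-genome data sets and that a split-and-combine strategy aims to avoid.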

  16. Construction and Preliminary Characterization Analysis of Wuzhishan Miniature Pig Bacterial Artificial Chromosome Library with Approximately 8-Fold Genome Equivalent Coverage

    Directory of Open Access Journals (Sweden)

    Changqing Liu

    2013-01-01

    Full Text Available Bacterial artificial chromosome (BAC) libraries have been invaluable tools for the genome-wide genetic dissection of complex organisms. Here, we report the construction and characterization of a high-redundancy BAC library from a very valuable pig breed in China, the Wuzhishan miniature pig (Sus scrofa), using its blood cells and fibroblasts, respectively. The library contains approximately 153,600 clones ordered in 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 152.3 kb, representing approximately 7.68 genome equivalents of the porcine haploid genome and a 99.93% statistical probability of obtaining at least one clone containing a unique DNA sequence in the library. Nineteen pairs of microsatellite marker primers covering porcine chromosomes were used for screening the BAC library, which showed that each of these markers was positive in the library; the positive clone number was 2 to 9, and the average number was 7.89, which was consistent with 7.68-fold coverage of the porcine genome. There were no significant differences between the genomic BAC libraries from blood cells and fibroblast cells. Therefore, we identified 19 microsatellite markers that could potentially be used as genetic markers. As a result, this BAC library will serve as a valuable resource for gene identification, physical mapping, comparative genomics and large-scale genome sequencing in the pig.
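
The coverage and probability figures in this abstract follow from simple arithmetic, sketched below. The probability uses the standard Clarke-Carbon formula for clone libraries; the abstract does not say which formula or genome size the authors used, so both are assumptions. In particular, the 3.046 Gb genome size is back-calculated here from the reported 7.68 genome equivalents, and the function names are hypothetical.

```python
def genome_equivalents(n_clones, mean_insert_bp, genome_bp):
    """Fold coverage of the genome by the library: N * I / G."""
    return n_clones * mean_insert_bp / genome_bp


def prob_sequence_represented(n_clones, mean_insert_bp, genome_bp):
    """Clarke-Carbon estimate: P = 1 - (1 - I / G) ** N.

    Probability that a given unique sequence appears in at least one
    clone, assuming clones are sampled uniformly at random.
    """
    return 1.0 - (1.0 - mean_insert_bp / genome_bp) ** n_clones


if __name__ == "__main__":
    clones = 153_600          # from the abstract
    insert_bp = 152_300       # 152.3 kb, from the abstract
    genome_bp = 3.046e9       # ASSUMED: back-calculated, not stated in the abstract
    print(f"coverage:    {genome_equivalents(clones, insert_bp, genome_bp):.2f}x")
    print(f"probability: {prob_sequence_represented(clones, insert_bp, genome_bp):.4%}")
```

With these inputs the coverage comes out at 7.68-fold and the probability at roughly 99.95%, close to but not identical to the 99.93% reported. The difference may reflect a different genome size or formula in the original work.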

  17. Physical descriptions of the bacterial nucleoid at large scales, and their biological implications

    CERN Document Server

    Benza, Vincenzo G; Dorfman, Kevin D; Scolari, Vittore F; Bromek, Krystyna; Cicuta, Pietro; Lagomarsino, Marco Cosentino

    2012-01-01

    Recent experimental and theoretical approaches have attempted to quantify the physical organization (compaction and geometry) of the bacterial chromosome with its complement of proteins (the nucleoid). The genomic DNA exists in a complex and dynamic protein-rich state, which is highly organised at various length scales. This has implications on modulating (when not enabling) the core biological processes of replication, transcription, segregation. We overview the progress in this area, driven in the last few years by new scientific ideas and new interdisciplinary experimental techniques, ranging from high space- and time-resolution microscopy to high-throughput genomics employing sequencing to map different aspects of the nucleoid-related interactome. The aim of this review is to present the wide spectrum of experimental and theoretical findings coherently, from a physics viewpoint. We also discuss some attempts of interpretation that unify different results, highlighting the role that statistical and soft co...

  18. Construction and characterization of bacterial artificial chromosomes (BACs) containing herpes simplex virus full-length genomes.

    Science.gov (United States)

    Nagel, Claus-Henning; Pohlmann, Anja; Sodeik, Beate

    2014-01-01

    Bacterial artificial chromosomes (BACs) are suitable vectors not only to maintain the large genomes of herpesviruses in Escherichia coli but also to enable the traceless introduction of any mutation using modern tools of bacterial genetics. To clone a herpes simplex virus genome, a BAC replication origin is first introduced into the viral genome by homologous recombination in eukaryotic host cells. As part of their nuclear replication cycle, genomes of herpesviruses circularize and these replication intermediates are then used to transform bacteria. After cloning, the integrity of the recombinant viral genomes is confirmed by restriction length polymorphism analysis and sequencing. The BACs may then be used to design virus mutants. Upon transfection into eukaryotic cells new herpesvirus strains harboring the desired mutations can be recovered and used for experiments in cultured cells as well as in animal infection models. PMID:24671676

  19. Determining and comparing protein function in Bacterial genome sequences

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla

    predictions were made in about 60% of the cases. This project has highlighted the difficulties and challenges in functional annotation and computational analysis of sequence data. It has provided possible solutions for creating reproducible pipelines for comparative genomics as well as constructed a number of......In November 2013, there were around 21,000 different prokaryotic genomes sequenced and publicly available, and the number is growing daily with another 20,000 or more genomes expected to be sequenced and deposited by the end of 2014. An important part of the analysis of this data is the functional...... known functions. This thesis describes the development of new tools for comparative functional annotation and a system for comparative genomics in general. As novel sequenced genomes are becoming more readily available, there is a need for standard analysis tools. The system CMG-biotools is presented...

  20. Host imprints on bacterial genomes--rapid, divergent evolution in individual patients.

    Directory of Open Access Journals (Sweden)

    Jaroslaw Zdziarski

    Full Text Available Bacteria lose or gain genetic material and through selection, new variants become fixed in the population. Here we provide the first, genome-wide example of a single bacterial strain's evolution in different deliberately colonized patients and the surprising insight that hosts appear to personalize their microflora. By first obtaining the complete genome sequence of the prototype asymptomatic bacteriuria strain E. coli 83972 and then resequencing its descendants after therapeutic bladder colonization of different patients, we identified 34 mutations, which affected metabolic and virulence-related genes. Further transcriptome and proteome analysis proved that these genome changes altered bacterial gene expression resulting in unique adaptation patterns in each patient. Our results provide evidence that, in addition to stochastic events, adaptive bacterial evolution is driven by individual host environments. Ongoing loss of gene function supports the hypothesis that evolution towards commensalism rather than virulence is favored during asymptomatic bladder colonization.

  1. SimBac: simulation of whole bacterial genomes with homologous recombination

    OpenAIRE

    Didelot, X.; De Maio, N.; Brown, T.; Wilson, DJ

    2015-01-01

    Bacteria can exchange genetic material, or acquire genes found in the environment. This process, generally known as bacterial recombination, can have a strong impact on the evolution and phenotype of bacteria, for example causing the spread of antibiotic resistance across clades and species, but can also disrupt phylogenetic and transmission inferences. With the increasing affordability of whole genome sequencing, the need has emerged for an efficient simulator of bacterial evolution to test a...

  2. Long-Range Periodic Patterns in Microbial Genomes Indicate Significant Multi-Scale Chromosomal Organization.

    Directory of Open Access Journals (Sweden)

    2006-01-01

    Full Text Available Genome organization can be studied through analysis of chromosome position-dependent patterns in sequence-derived parameters. A comprehensive analysis of such patterns in prokaryotic sequences and genome-scale functional data has yet to be performed. We detected spatial patterns in sequence-derived parameters for 163 chromosomes occurring in 135 bacterial and 16 archaeal organisms using wavelet analysis. Pattern strength was found to correlate with organism-specific features such as genome size, overall GC content, and the occurrence of known motility and chromosomal binding proteins. Given additional functional data for Escherichia coli, we found significant correlations among chromosome position dependent patterns in numerous properties, some of which are consistent with previously experimentally identified chromosome macrodomains. These results demonstrate that the large-scale organization of most sequenced genomes is significantly nonrandom, and, moreover, that this organization is likely linked to genome size, nucleotide composition, and information transfer processes. Constraints on genome evolution and design are thus not solely dependent upon information content, but also upon an intricate multi-parameter, multi-length-scale organization of the chromosome.

  3. Loss of Conserved Noncoding RNAs in Genomes of Bacterial Endosymbionts

    Science.gov (United States)

    Matelska, Dorota; Kurkowska, Malgorzata; Purta, Elzbieta; Bujnicki, Janusz M.; Dunin-Horkawicz, Stanislaw

    2016-01-01

    The genomes of intracellular symbiotic or pathogenic bacteria, such as of Buchnera, Mycoplasma, and Rickettsia, are typically smaller compared with their free-living counterparts. Here we showed that noncoding RNA (ncRNA) families, which are conserved in free-living bacteria, frequently could not be detected by computational methods in the small genomes. Statistical tests demonstrated that their absence is not an artifact of low GC content or small deletions in these small genomes, and thus it was indicative of an independent loss of ncRNAs in different endosymbiotic lineages. By analyzing the synteny (conservation of gene order) between the reduced and nonreduced genomes, we revealed instances of protein-coding genes that were preserved in the reduced genomes but lost cis-regulatory elements. We found that the loss of cis-regulatory ncRNA sequences, which regulate the expression of cognate protein-coding genes, is characterized by the reduction of secondary structure formation propensity, GC content, and length of the corresponding genomic regions. PMID:26782934

  4. Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms.

    Science.gov (United States)

    Jeong, Haeyoung; Lee, Dae-Hee; Ryu, Choong-Min; Park, Seung-Hwan

    2016-01-01

    PacBio's long-read sequencing technologies can be successfully used for a complete bacterial genome assembly using recently developed non-hybrid assemblers in the absence of second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM-based genome sequencing results derived from previous studies, we generated further sequencing data, including from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach revealed that the hierarchical genome assembly process (HGAP) non-hybrid assembler resulted in nearly complete assemblies at a moderate coverage of ~75x, but that different versions produced non-compatible results requiring post processing. The other two platforms further improved the PacBio assembly through scaffolding and a final error correction. PMID:26464377

  5. Optical mapping as a routine tool for bacterial genome sequence finishing

    Directory of Open Access Journals (Sweden)

    Gaudriault Sophie

    2007-09-01

    Full Text Available Abstract Background In sequencing the genomes of two Xenorhabdus species, we encountered a large number of sequence repeats and assembly anomalies that stalled finishing efforts. This included a stretch of about 12 Kb that is over 99.9% identical between the plasmid and chromosome of X. nematophila. Results Whole genome restriction maps of the sequenced strains were produced through optical mapping technology. These maps allowed rapid resolution of sequence assembly problems, permitted closing of the genome, and allowed correction of a large inversion in a genome assembly that we had considered finished. Conclusion Our experience suggests that routine use of optical mapping in bacterial genome sequence finishing is warranted. When combined with data produced through 454 sequencing, an optical map can rapidly and inexpensively generate an ordered and oriented set of contigs to produce a nearly complete genome sequence assembly.

  6. Bioinformatic detection of horizontally transferred DNA in bacterial genomes

    OpenAIRE

    Langille, Morgan GI; Brinkman, Fiona SL

    2009-01-01

    We highlight a selection of recent research on computational methods and associated challenges surrounding the prediction of bacterial horizontal gene transfer. This research area continues to face controversy, but is becoming more critical as the importance of horizontal gene transfer in medically and ecologically important prokaryotic evolution is further appreciated.

  7. A Markovian analysis of bacterial genome sequence constraints

    Directory of Open Access Journals (Sweden)

    Aaron D. Skewes

    2013-08-01

    Full Text Available The arrangement of nucleotides within a bacterial chromosome is influenced by numerous factors. The degeneracy of the third codon within each reading frame allows some flexibility of nucleotide selection; however, the third nucleotide in the triplet of each codon is at least partly determined by the preceding two. This is most evident in organisms with a strong G + C bias, as the degenerate codon must contribute disproportionately to maintaining that bias. Therefore, a correlation exists between the first two nucleotides and the third in all open reading frames. If the arrangement of nucleotides in a bacterial chromosome is represented as a Markov process, we would expect that the correlation would be completely captured by a second-order Markov model and an increase in the order of the model (e.g., third-, fourth-…order) would not capture any additional uncertainty in the process. In this manuscript, we present the results of a comprehensive study of the Markov property that exists in the DNA sequences of 906 bacterial chromosomes. All of the 906 bacterial chromosomes studied exhibit a statistically significant Markov property that extends beyond second-order, and therefore cannot be fully explained by codon usage. An unrooted tree containing all 906 bacterial chromosomes based on their transition probability matrices of third-order shares ∼25% similarity to a tree based on sequence homologies of 16S rRNA sequences. This congruence to the 16S rRNA tree is greater than for trees based on lower-order models (e.g., second-order), and higher-order models result in diminishing improvements in congruence. A nucleotide correlation most likely exists within every bacterial chromosome that extends past three nucleotides. This correlation places significant limits on the number of nucleotide sequences that can represent probable bacterial chromosomes. Transition matrix usage is largely conserved by taxa, indicating that this property is likely
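
    The Markov-chain representation described in this abstract can be illustrated with a small sketch that estimates a k-th order transition probability table from one nucleotide string. This is an illustrative assumption about how such a table might be built, not the authors' pipeline: the function name is hypothetical, only one strand is read, and contexts containing ambiguous bases are simply skipped.

```python
from collections import Counter, defaultdict

BASES = "ACGT"


def transition_probabilities(seq, order):
    """Estimate P(next base | preceding `order` bases) from a single sequence."""
    seq = seq.upper()
    counts = defaultdict(Counter)
    for i in range(len(seq) - order):
        context = seq[i:i + order]
        nxt = seq[i + order]
        # Skip windows that contain anything other than A, C, G or T.
        if set(context + nxt) <= set(BASES):
            counts[context][nxt] += 1
    probs = {}
    for context, c in counts.items():
        total = sum(c.values())
        # Each row is a distribution over the four possible next bases.
        probs[context] = {b: c[b] / total for b in BASES}
    return probs


# Toy sequence: a perfect ACG repeat, so every second-order context
# is followed by exactly one base.
table = transition_probabilities("ACGACGACGACG", order=2)
for context in sorted(table):
    print(context, table[context])
```

    A second-order table has up to 16 rows and a third-order table up to 64, so the row count grows as 4^k; comparing models of different order on real chromosomes would additionally need a statistical test, which this sketch leaves out.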

  8. Defining pathogenic bacterial species in the genomic era

    OpenAIRE

    Didier Raoult

    2011-01-01

    Actual definitions of bacterial species are limited due to the current criteria of definition and the use of restrictive genetic tools. The 16S rRNA sequence, for example, has been widely used as a marker for phylogenetic analyses; however, its use often leads to misleading species definitions. According to the first genetic studies, removing a certain number of genes from pathogenic bacteria removes their capacity to infect hosts. However, more recent studies have demonstrated that the speci...

  9. Genome-Scale Thermodynamic Analysis of Escherichia coli Metabolism

    OpenAIRE

    Christopher S Henry; Jankowski, Matthew D.; Broadbelt, Linda J.; Hatzimanikatis, Vassily

    2005-01-01

    Genome-scale metabolic models are an invaluable tool for analyzing metabolic systems as they provide a more complete picture of the processes of metabolism. We have constructed a genome-scale metabolic model of Escherichia coli based on the iJR904 model developed by the Palsson Laboratory at the University of California at San Diego. Group contribution methods were utilized to estimate the standard Gibbs free energy change of every reaction in the constructed model. Reactions in the model wer...

  10. Use of genome-scale microbial models for metabolic engineering

    DEFF Research Database (Denmark)

    Patil, Kiran Raosaheb; Åkesson, M.; Nielsen, Jens

    2004-01-01

    Metabolic engineering serves as an integrated approach to design new cell factories by providing rational design procedures and valuable mathematical and experimental tools. Mathematical models have an important role for phenotypic analysis, but can also be used for the design of optimal metabolic...... network structures. The major challenge for metabolic engineering in the post-genomic era is to broaden its design methodologies to incorporate genome-scale biological data. Genome-scale stoichiometric models of microorganisms represent a first step in this direction....

  11. ANItools web: a web tool for fast genome comparison within multiple bacterial strains

    Science.gov (United States)

    Han, Na; Qiang, Yujun; Zhang, Wen

    2016-01-01

    Background: Early classification of prokaryotes was based solely on phenotypic similarities, but modern prokaryote characterization has been strongly influenced by advances in genetic methods. With the fast development of sequencing technology, the ever-increasing number of genomic sequences per species offers the possibility of developing distance determinations based on whole-genome information. The average nucleotide identity (ANI), calculated from pair-wise comparisons of all sequences shared between two given strains, has been proposed as a new metric for bacterial species definition and classification. Results: In this study, we developed the web version of ANItools (http://ani.mypathogen.cn/), which helps users directly get ANI values from online sources. A database covering ANI values of any two strains in a genus was also included (2773 strains, 1487 species and 668 genera). Importantly, ANItools web can automatically run genome comparison between the input genomic sequence and data sequences (Genus and Species levels), and generate a graphical report for ANI calculation results. Conclusion: ANItools web is useful for defining the relationship between bacterial strains, further contributing to the classification and identification of bacterial species using genome data. Database URL: http://ani.mypathogen.cn/ PMID:27270714
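
    The ANI definition in this abstract (an average over pair-wise comparisons of shared sequences) can be sketched as follows. This is a toy illustration under stated assumptions, not the ANItools implementation: the function names are hypothetical, the fragments are assumed to be already aligned with "-" marking gaps, and the two filter thresholds are placeholder values.

```python
def fragment_identity(a, b):
    """Return (identity, aligned_columns) for two aligned fragments."""
    if len(a) != len(b):
        raise ValueError("aligned fragments must have equal length")
    # Only columns where neither sequence has a gap count as aligned.
    cols = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not cols:
        return 0.0, 0
    matches = sum(x == y for x, y in cols)
    return matches / len(cols), len(cols)


def average_nucleotide_identity(aligned_pairs, min_identity=0.3,
                                min_aligned_fraction=0.7):
    """Mean percent identity over fragments that pass both filters."""
    kept = []
    for a, b in aligned_pairs:
        if not a:
            continue  # ignore empty fragments
        identity, n_cols = fragment_identity(a, b)
        if n_cols / len(a) >= min_aligned_fraction and identity >= min_identity:
            kept.append(identity)
    if not kept:
        return None  # no shared fragments survived the filters
    return 100.0 * sum(kept) / len(kept)


pairs = [
    ("ACGTACGTAC", "ACGTACGTAC"),  # identical fragment
    ("ACGTACGTAC", "ACGTTCGTAC"),  # one mismatch in ten columns
    ("ACGTACGTAC", "AC--------"),  # mostly unaligned, filtered out
]
ani = average_nucleotide_identity(pairs)
print(f"ANI: {ani:.1f}%")
```

    In this toy input the third fragment is discarded because only 20% of it aligns, so the value is averaged over the first two fragments only. Real tools obtain the alignments from a search program and apply their own cut-offs.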

  12. Bacterial communities in full-scale wastewater treatment systems.

    Science.gov (United States)

    Cydzik-Kwiatkowska, Agnieszka; Zielińska, Magdalena

    2016-04-01

    Bacterial metabolism determines the effectiveness of biological treatment of wastewater. Therefore, it is important to define the relations between the species structure and the performance of full-scale installations. Although there is much laboratory data on microbial consortia, our understanding of dependencies between the microbial structure and operational parameters of full-scale wastewater treatment plants (WWTP) is limited. This mini-review presents the types of microbial consortia in WWTP. Information is given on extracellular polymeric substances production as a factor that is key for the formation of spatial structures of microorganisms. Additionally, we discuss data on microbial groups including nitrifiers, denitrifiers, Anammox bacteria, and phosphate- and glycogen-accumulating bacteria in full-scale aerobic systems that were obtained with the use of molecular techniques, including high-throughput sequencing, to shed light on dependencies between the microbial ecology of biomass and the overall efficiency and functional stability of wastewater treatment systems. Sludge bulking in WWTPs is addressed, as well as the microbial composition of consortia involved in antibiotic and micropollutant removal. PMID:26931606

  13. A peptide identification-free, genome sequence-independent shotgun proteomics workflow for strain-level bacterial differentiation

    OpenAIRE

    Wenguang Shao; Min Zhang; Henry Lam; Lau, Stanley C K

    2015-01-01

    Shotgun proteomics is an emerging tool for bacterial identification and differentiation. However, the identification of the mass spectra of peptides to genome-derived peptide sequences remains a key issue that limits the use of shotgun proteomics to bacteria with genome sequences available. In this proof-of-concept study, we report a novel bacterial fingerprinting method that enjoys the resolving power and accuracy of mass spectrometry without the burden of peptide identification (i.e. genome...

  14. CISA: Contig Integrator for Sequence Assembly of Bacterial Genomes

    OpenAIRE

    Lin, Shin-Hung; Liao, Yu-Chieh

    2013-01-01

    A plethora of algorithmic assemblers have been proposed for the de novo assembly of genomes; however, no individual assembler guarantees the optimal assembly for diverse species. Optimizing various parameters in an assembler is often performed in order to generate the best possible assembly. However, few efforts have been pursued to take advantage of multiple assemblies to yield an assembly of high accuracy. In this study, we employ various state-of-the-art assemblers to generate different set...

  15. The OME Framework for genome-scale systems biology

    Energy Technology Data Exchange (ETDEWEB)

    Palsson, Bernhard O. [Univ. of California, San Diego, CA (United States); Ebrahim, Ali [Univ. of California, San Diego, CA (United States); Federowicz, Steve [Univ. of California, San Diego, CA (United States)

    2014-12-19

    The life sciences are undergoing continuous and accelerating integration with computational and engineering sciences. The biology that many in the field have been trained on may be hardly recognizable in ten to twenty years. One of the major drivers for this transformation is the blistering pace of advancements in DNA sequencing and synthesis. These advances have resulted in unprecedented amounts of new data, information, and knowledge. Many software tools have been developed to deal with aspects of this transformation and each is sorely needed [1-3]. However, few of these tools have been forced to deal with the full complexity of genome-scale models along with high-throughput genome-scale data. This particular situation represents a unique challenge, as it is simultaneously necessary to deal with the vast breadth of genome-scale models and the dizzying depth of high-throughput datasets. It has been observed time and again that as the pace of data generation continues to accelerate, the pace of analysis significantly lags behind [4]. It is also evident that, given the plethora of databases and software efforts [5-12], it is still a significant challenge to work with genome-scale metabolic models, let alone next-generation whole cell models [13-15]. We work at the forefront of model creation and systems scale data generation [16-18]. The OME Framework was born out of a practical need to enable genome-scale modeling and data analysis under a unified framework to drive the next generation of genome-scale biological models. Here we present the OME Framework. It exists as a set of Python classes. However, we want to emphasize the importance of the underlying design as an addition to the discussions on specifications of a digital cell. A great deal of work and valuable progress has been made by a number of communities [13, 19-24] towards interchange formats and implementations designed to achieve similar goals.
While many software tools exist for handling genome-scale

  16. (Actino)Bacterial "intelligence": using comparative genomics to unravel the information processing capacities of microbes.

    Science.gov (United States)

    Pinto, Daniela; Mascher, Thorsten

    2016-08-01

    Bacterial genomes encode numerous and often sophisticated signaling devices to perceive changes in their environment and mount appropriate adaptive responses. With their help, microbes are able to orchestrate specific decision-making processes that alter the cellular behavior, but also integrate and communicate information. Moreover, some signal transducing systems also enable bacteria to remember and learn from previous stimuli to anticipate environmental changes. As recently suggested, all of these aspects indicate that bacteria do, in fact, exhibit cognition remarkably reminiscent of what we refer to as intelligent behavior, at least when referring to higher eukaryotes. In this essay, comprehensive data derived from comparative genomics analyses of microbial signal transduction systems are used to probe the concept of cognition in bacterial cells. Using a recent comprehensive analysis of over 100 actinobacterial genomes as a test case, we illustrate the different layers of the capacities of bacteria that result in cognitive and behavioral complexity as well as some form of 'bacterial intelligence'. We aim to raise awareness of bacteria as cognitive organisms and believe that this view would enrich and open a new path in the experimental studies of bacterial signal transducing systems. PMID:26852121

  17. BIGSdb: Scalable analysis of bacterial genome variation at the population level

    Directory of Open Access Journals (Sweden)

    Maiden Martin CJ

    2010-12-01

    Full Text Available Abstract Background The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. Results The Bacterial Isolate Genome Sequence Database (BIGSDB) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST) data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. LIMS functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSDB are illustrated with the genera Neisseria and Streptococcus. The BIGSDB source code and documentation are available at http://pubmlst.org/software/database/bigsdb/. Conclusions Genomic data can be used to characterise bacterial isolates in many different ways but it can also be efficiently exploited for evolutionary or functional studies. BIGSDB

  18. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system.

    Science.gov (United States)

    Speth, Daan R; In 't Zandt, Michiel H; Guerrero-Cruz, Simon; Dutilh, Bas E; Jetten, Mike S M

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete draft genomes that together represent the majority of the microbial community. We assign these genomes to distinct anaerobic and aerobic microbial communities. In the aerobic community, nitrifying organisms and heterotrophs predominate. In the anaerobic community, widespread potential for partial denitrification suggests a nitrite loop increases treatment efficiency. Of our genomes, 19 have no previously cultivated or sequenced close relatives and six belong to bacterial phyla without any cultivated members, including the most complete Omnitrophica (formerly OP3) genome to date. PMID:27029554

  1. BG7: a new approach for bacterial genome annotation designed for next generation sequencing data.

    Science.gov (United States)

    Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Pareja, Eduardo; Tobes, Raquel

    2012-01-01

    BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even at the stage of preliminary assembly of the genome. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating ORF prediction and functional annotation phases in just one step. BG7 is especially tolerant to sequencing errors in start and stop codons, to frameshifts, and to assembly or scaffolding errors. The system is also tolerant to the high level of gene fragmentation which is frequently found in not fully assembled genomes. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes that are being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC Germany outbreak, in which BG7 was used to get the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been proved for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. Besides, thanks to its plasticity, our system could be very easily adapted to work with new technologies in the future. PMID:23185310

  2. Outcome in patients with bacterial meningitis presenting with a minimal Glasgow Coma Scale score

    OpenAIRE

    Lucas, M. J.; Brouwer, M.C.; Ende, A; van de Beek, D.

    2014-01-01

    Objective: In bacterial meningitis, a decreased level of consciousness is predictive for unfavorable outcome, but the clinical features and outcome in patients presenting with a minimal score on the Glasgow Coma Scale are unknown. Methods: We assessed the incidence, clinical characteristics, and outcome of patients with bacterial meningitis presenting with a minimal score on the Glasgow Coma Scale from a nationwide cohort study of adults with community-acquired bacterial meningitis in the Net...

  3. The diversity of a distributed genome in bacterial populations

    CERN Document Server

    Baumdicker, F; Pfaffelhuber, P

    2009-01-01

    The distributed genome hypothesis states that the set of genes in a population of bacteria is distributed over all individuals that belong to the specific taxon. It implies that certain genes can be gained and lost from generation to generation. We use the random genealogy given by a Kingman coalescent in order to superimpose events of gene gain and loss along ancestral lines. Gene gains occur at constant rate along ancestral lines. We assume that gained genes have never been present in the population before. Gene losses occur at a rate proportional to the number of genes present along the ancestral line. In this "infinitely many genes model" we derive moments for several statistics within a sample: the average number of genes per individual, the average number of genes differing between individuals, the number of incongruent pairs of genes, the total number of different genes in the sample and the gene frequency spectrum. We demonstrate that the model gives a reasonable fit with gene frequency data from mari...
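
    The gain and loss rules described in this abstract (constant gain, loss proportional to the number of genes present) imply a simple balance for the mean gene count along one ancestral line. The sketch below is only an illustration of that balance, not the authors' derivation: the names gain_rate and loss_rate are hypothetical, and the paper's own parametrization (for example any factor of 1/2 on the rates) may differ.

```python
import math


def expected_gene_count(gain_rate, loss_rate, t, initial_count=0.0):
    """Mean number of genes on one line after time t.

    Solves dG/dt = gain_rate - loss_rate * G, i.e. genes are gained at a
    constant rate and each present gene is lost at rate loss_rate.
    Both rate names are illustrative assumptions.
    """
    if loss_rate <= 0:
        raise ValueError("loss_rate must be positive")
    equilibrium = gain_rate / loss_rate  # gain and loss balance here
    return equilibrium + (initial_count - equilibrium) * math.exp(-loss_rate * t)


if __name__ == "__main__":
    # The mean relaxes from the starting value toward gain_rate / loss_rate.
    for t in (0.0, 1.0, 5.0, 50.0):
        print(t, expected_gene_count(10.0, 0.5, t))
```

    Under these assumptions the long-run mean depends only on the ratio of the two rates, which is why a single compound parameter is enough to describe the average genome size in such a model.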

  4. Compaction of bacterial genomic DNA: clarifying the concepts

    International Nuclear Information System (INIS)

    The unconstrained genomic DNA of bacteria forms a coil, whose volume exceeds 1000 times the volume of the cell. Since prokaryotes lack a membrane-bound nucleus, in sharp contrast with eukaryotes, the DNA may consequently be expected to occupy the whole available volume when constrained to fit in the cell. Still, it has been known for more than half a century that the DNA is localized in a well-defined region of the cell, called the nucleoid, which occupies only 15% to 25% of the total volume. Although this problem has focused the attention of many scientists in recent decades, there is still no certainty concerning the mechanism that enables such a dramatic compaction. The goal of this Topical Review is to take stock of our knowledge on this question by listing all possible compaction mechanisms with the proclaimed desire to clarify the physical principles they are based upon and discuss them in the light of experimental results and the results of simulations based on coarse-grained models. In particular, the fundamental differences between ψ-condensation and segregative phase separation and between the condensation by small and long polycations are highlighted. This review suggests that the importance of certain mechanisms, like supercoiling and the architectural properties of DNA-bridging and DNA-bending nucleoid proteins, may have been overestimated, whereas other mechanisms, like segregative phase separation and the self-association of nucleoid proteins, as well as the possible role of the synergy of two or more mechanisms, may conversely deserve more attention. (topical review)

  5. Unique core genomes of the bacterial family vibrionaceae: insights into niche adaptation and speciation

    Directory of Open Access Journals (Sweden)

    Kahlke Tim

    2012-05-01

    Background: The criteria for defining bacterial species and even the concept of bacterial species itself are under debate, and the discussion is apparently intensifying as more genome sequence data become available. However, it is still unclear how the new advances in genomics should be used most efficiently to address this question. In this study we identify genes that are common to any group of genomes in our dataset, to determine whether genes specific to a particular taxon exist and to investigate their potential role in adaptation of bacteria to their specific niche. These genes were named unique core genes. Additionally, we investigate the existence and importance of unique core genes that are found in isolates of phylogenetically non-coherent groups. These groups of isolates, which share a genetic feature without sharing a closest common ancestor, are termed genophyletic groups. Results: The bacterial family Vibrionaceae was used as the model, and we compiled and compared genome sequences of 64 different isolates. Using the software orthoMCL we determined clusters of homologous genes among the investigated genome sequences. We used multilocus sequence analysis to build a host phylogeny and mapped the numbers of unique core genes of all distinct groups of isolates onto the tree. The results show that unique core genes are more likely to be found in monophyletic groups of isolates. Genophyletic groups of isolates, in contrast, are less common, especially for large groups of isolates. The subsequent annotation of unique core genes that are present in genophyletic groups indicates a high degree of horizontally transferred genes. Finally, the annotation of the unique core genes of Vibrio cholerae revealed genes involved in aerotaxis and biosynthesis of the iron-chelator vibriobactin. Conclusion: The presented work indicates that genes specific for any taxon inside the bacterial family Vibrionaceae exist. These unique core genes encode...

  6. THE MATRIX ITERATION ALGORITHM SOLVING AN ENUMERATION PROBLEM ON BACTERIAL COMPLETE GENOMES

    Institute of Scientific and Technical Information of China (English)

    YANG Huakang; HUANG Chengxiang; WEN Xiaowei

    2004-01-01

    Given an alphabet ∑ and a finite minimal set B of forbidden words, a combinatorial enumeration problem on bacterial complete genomes is transformed into enumerating the strings of a given length that do not contain any string in B as a substring. From the fact that a string in the language is equivalent to a path in the corresponding graph, we have obtained a polynomial time algorithm by modifying the power of the adjacency matrix of the graph.
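
    The string-to-path correspondence mentioned above can be illustrated with a standard transfer-matrix count. The sketch below is a generic version of that idea, not the authors' exact algorithm: states are the allowed strings of length k-1 (k being the longest forbidden word), an edge appends one letter without creating a forbidden word, and repeated multiplication by the adjacency matrix counts paths. The function name count_avoiding is hypothetical.

```python
from itertools import product


def count_avoiding(alphabet, forbidden, n):
    """Count strings of length n over `alphabet` with no word of `forbidden` as a substring."""
    forbidden = list(forbidden)
    if not forbidden:
        return len(alphabet) ** n
    k = max(len(w) for w in forbidden)

    def allowed(s):
        return not any(w in s for w in forbidden)

    # Short strings cannot be split into states and edges; enumerate directly.
    if n < k:
        return sum(allowed("".join(p)) for p in product(alphabet, repeat=n))

    # States: allowed strings of length k-1.
    states = [s for s in ("".join(p) for p in product(alphabet, repeat=k - 1)) if allowed(s)]
    index = {s: i for i, s in enumerate(states)}
    size = len(states)

    # adj[i][j] = number of letters c such that states[i] + c is allowed
    # and ends in states[j].
    adj = [[0] * size for _ in range(size)]
    for s in states:
        for c in alphabet:
            w = s + c
            if allowed(w):
                adj[index[s]][index[w[1:]]] += 1

    # vec[j] = number of allowed strings of the current length ending in states[j].
    vec = [1] * size
    for _ in range(n - (k - 1)):
        vec = [sum(vec[i] * adj[i][j] for i in range(size)) for j in range(size)]
    return sum(vec)


if __name__ == "__main__":
    print(count_avoiding("ACGT", ["ACG", "TT"], 10))
```

    Each iteration costs at most the square of the number of states, so the running time is polynomial in n for a fixed forbidden set; replacing the loop with repeated squaring of the matrix would reduce the dependence on n to logarithmic.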

  7. Sodium Ion Cycle in Bacterial Pathogens: Evidence from Cross-Genome Comparisons

    OpenAIRE

    Häse, Claudia C.; Fedorova, Natalie D.; Galperin, Michael Y.; Dibrov, Pavel A.

    2001-01-01

    Analysis of the bacterial genome sequences shows that many human and animal pathogens encode primary membrane Na+ pumps, Na+-transporting dicarboxylate decarboxylases or Na+-translocating NADH:ubiquinone oxidoreductase, and a number of Na+-dependent permeases. This indicates that these bacteria can utilize Na+ as a coupling ion instead of or in addition to the H+ cycle. This capability to use a Na+ cycle might be an important virulence factor for such pathogens as Vibrio cholerae, Neisseria m...

  8. An Improved Method for oriT-Directed Cloning and Functionalization of Large Bacterial Genomic Regions

    OpenAIRE

    Kvitko, Brian H.; McMillan, Ian A.; Schweizer, Herbert P.

    2013-01-01

    We have made significant improvements to a broad-host-range system for the cloning and manipulation of large bacterial genomic regions based on site-specific recombination between directly repeated oriT sites during conjugation. Using two suicide capture vectors carrying flanking homology regions, oriT sites are recombined on either side of the target region. Using a broad-host-range conjugation helper plasmid, the region between the oriT sites is conjugated into an Escherichia coli recipient...

  9. Genomic and Global Approaches to Unravelling How Hypermutable Sequences Influence Bacterial Pathogenesis

    Directory of Open Access Journals (Sweden)

    Fadil A. Bidmos

    2014-02-01

    Rapid adaptation to fluctuations in the host milieu contributes to the host persistence and virulence of bacterial pathogens. Adaptation is frequently mediated by hypermutable sequences in bacterial pathogens. Early bacterial genomic studies identified the multiplicity and virulence-associated functions of these hypermutable sequences. Thus, simple sequence repeat tracts (SSRs) and site-specific recombination were found to control capsular type, lipopolysaccharide structure, pilin diversity and the expression of outer membrane proteins. We review how the population diversity inherent in the SSR-mediated mechanism of localised hypermutation is being unlocked by the investigation of whole genome sequences of disease isolates, analysis of clinical samples and use of model systems. A contrast is presented between the problematic nature of analysing simple sequence repeats in next generation sequencing data and the simpler, pragmatic PCR-based approaches. Specific examples are presented of the potential relevance of this localised hypermutation to meningococcal pathogenesis. This leads us to speculate on the future prospects for unravelling how hypermutable mechanisms may contribute to the transmission, spread and persistence of bacterial pathogens.

  10. Bacterial Genomic Data Analysis in the Next-Generation Sequencing Era.

    Science.gov (United States)

    Orsini, Massimiliano; Cuccuru, Gianmauro; Uva, Paolo; Fotia, Giorgio

    2016-01-01

    Bacterial genome sequencing is now an affordable choice for many laboratories for applications in research, diagnostic, and clinical microbiology. Nowadays, an overabundance of tools is available for genomic data analysis. However, tools differ in algorithms, languages, hardware requirements, and user interface, and combining them, as is necessary for sequence data interpretation, often requires (bio)informatics skills that can be difficult to find in many laboratories. In addition, multiple data sources, exceedingly large dataset sizes, and increasing computational complexity further challenge the accessibility, reproducibility, and transparency of the entire process. In this chapter we cover the main bioinformatics steps required for a complete bacterial genome analysis using next-generation sequencing data, from the raw sequence data to assembled and annotated genomes. All the tools described are available in the Orione framework ( http://orione.crs4.it ), which uniquely combines in a transparent way the most used open source bioinformatics tools for microbiology, allowing microbiologists without any specific hardware or informatics skills to conduct data-intensive computational analyses from quality control to microbial gene annotation. PMID:27115645

  11. Genome resequencing in Populus: Revealing large-scale genome variation and implications on specialized-trait genomics

    Energy Technology Data Exchange (ETDEWEB)

    Muchero, Wellington [ORNL; Labbe, Jessy L [ORNL; Priya, Ranjan [University of Tennessee, Knoxville (UTK); DiFazio, Steven P [West Virginia University, Morgantown; Tuskan, Gerald A [ORNL

    2014-01-01

    To date, Populus ranks among the few plant species with a complete genome sequence and other highly developed genomic resources. With the first genome sequence among all tree species, Populus has been adopted as a suitable model organism for genomic studies in trees. However, far from being just a model species, Populus is a key renewable economic resource that plays a significant role in providing raw materials for the biofuel and pulp and paper industries. Therefore, aside from leading frontiers of basic tree molecular biology and ecological research, Populus leads frontiers in addressing global economic challenges related to fuel and fiber production. The latter fact suggests that research aimed at improving the quality and quantity of Populus as a raw material will likely drive the pursuit of more targeted and deeper research in order to unlock the economic potential tied to the molecular biology processes that drive this tree species. Advances in genome sequence-driven technologies, such as resequencing of individual genotypes, which in turn facilitate large-scale SNP discovery and identification of large-scale polymorphisms, are key determinants of future success in these initiatives. In this treatise we discuss the implications of genome sequence-enabled technologies for Populus genomic and genetic studies of complex and specialized traits.

  12. Large Scale Production of Magnetic Nanoparticles Using Bacterial Fermentation

    Energy Technology Data Exchange (ETDEWEB)

    Moon, Ji Won [ORNL; Rawn, Claudia J [ORNL; Rondinone, Adam Justin [ORNL; Love, Lonnie J [ORNL; Roh, Yul [Chonnam National University, Gwangju; Lauf, Robert J [ORNL; Everett, Susan M [ORNL; Phelps, Tommy Joe [ORNL

    2010-01-01

    Microbial production of nano-sized particles has a demonstrated capacity to make highly crystalline, pure-phase magnetite, or magnetite with some substitution of Fe by Co, Ni, Cr, Mn, Zn or the rare earths. Microbial production of magnetic nanoparticles can be achieved in large quantities and at low cost. Over 1 kg (wet weight) of Zn-substituted magnetite (nominal composition of Zn0.6Fe2.4O4) has been recovered from 30 L fermentations. Transmission electron microscopy (TEM) was used to confirm that these mass-produced extracellular magnetites exhibited good mono-dispersity. TEM results also showed a highly reproducible particle size and corroborated the average crystallite size (ACS) of 13.1 ± 0.8 nm determined through X-ray diffraction (N=7) at a 99% confidence level. Based on scale-up experiments performed using a 35 L reactor, the increase in ACS reproducibility may be attributed to an increase of electron donor input, availability of divalent substitution metal ions and less ferrous ions in the case of substituted magnetite, increased reactor volume overcoming differences in each batch, or a combination of the above. While commercial nanometer-sized magnetite (25-50 nm) may cost $500/kg, microbial production is likely capable of producing 5-90 nm pure or substituted magnetites at a fraction of the cost of traditional chemical synthesis. While there are numerous approaches for the synthesis of nanoparticles, bacterial fermentation of magnetite or metal-substituted magnetite may represent an advantageous manufacturing technology with respect to yield, reproducibility and scalable synthesis with low costs at low energy input.

  13. Using Genome-scale Models to Predict Biological Capabilities

    DEFF Research Database (Denmark)

    O’Brien, Edward J.; Monk, Jonathan M.; Palsson, Bernhard O.

    2015-01-01

    growth capabilities on various substrates and the effect of gene knockouts at the genome scale. Thus, much interest has developed in understanding and applying these methods to areas such as metabolic engineering, antibiotic design, and organismal and enzyme evolution. This Primer will get you started....

  14. Genome sequence and plasmid transformation of the model high-yield bacterial cellulose producer Gluconacetobacter hansenii ATCC 53582

    OpenAIRE

    Michael Florea; Benjamin Reeve; James Abbott; Freemont, Paul S.; Tom Ellis

    2016-01-01

    Bacterial cellulose is a strong, highly pure form of cellulose that is used in a range of applications in industry, consumer goods and medicine. Gluconacetobacter hansenii ATCC 53582 is one of the highest reported bacterial cellulose producing strains and has been used as a model organism in numerous studies of bacterial cellulose production and studies aiming to increase cellulose productivity. Here we present a high-quality draft genome sequence for G. hansenii ATCC 53582 and find that in ...

  15. Cytotoxic chromosomal targeting by CRISPR/Cas systems can reshape bacterial genomes and expel or remodel pathogenicity islands.

    Directory of Open Access Journals (Sweden)

    Reuben B Vercoe

    2013-04-01

    In prokaryotes, clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated (Cas) proteins constitute a defence system against bacteriophages and plasmids. CRISPR/Cas systems acquire short spacer sequences from foreign genetic elements and incorporate these into their CRISPR arrays, generating a memory of past invaders. Defence is provided by short non-coding RNAs that guide Cas proteins to cleave complementary nucleic acids. While most spacers are acquired from phages and plasmids, there are examples of spacers that match genes elsewhere in the host bacterial chromosome. In Pectobacterium atrosepticum the type I-F CRISPR/Cas system has acquired a self-complementary spacer that perfectly matches a protospacer target in a horizontally acquired island (HAI2) involved in plant pathogenicity. Given the paucity of experimental data about CRISPR/Cas-mediated chromosomal targeting, we examined this process by developing a tightly controlled system. Chromosomal targeting was highly toxic via targeting of DNA and resulted in growth inhibition and cellular filamentation. The toxic phenotype was avoided by mutations in the cas operon, the CRISPR repeats, the protospacer target, and the protospacer-adjacent motif (PAM) beside the target. Indeed, the natural self-targeting spacer was non-toxic due to a single nucleotide mutation adjacent to the target in the PAM sequence. Furthermore, we show that chromosomal targeting can result in large-scale genomic alterations, including the remodelling or deletion of entire pre-existing pathogenicity islands. These features can be engineered for the targeted deletion of large regions of bacterial chromosomes. In conclusion, in DNA-targeting CRISPR/Cas systems, chromosomal interference is deleterious by causing DNA damage and providing a strong selective pressure for genome alterations, which may have consequences for bacterial evolution and pathogenicity.

  16. Draft Genome Sequence of Clostridium straminisolvens Strain JCM 21531T, Isolated from a Cellulose-Degrading Bacterial Community

    OpenAIRE

    Yuki, Masahiro; Oshima, Kenshiro; Suda, Wataru; Sakamoto, Mitsuo; Kitamura, Keiko; Iida, Toshiya; Hattori, Masahira; Ohkuma, Moriya

    2014-01-01

    Here, we report the draft genome sequence of a fibrolytic bacterium, Clostridium straminisolvens JCM 21531T, isolated from a cellulose-degrading bacterial community. The genome information of this strain will be useful for studies on the degradation enzymes and functional interactions with other members in the community.

  17. Spatial Scales of Bacterial Diversity in Cold-Water Coral Reef Ecosystems

    OpenAIRE

    Sandra Schöttner; Christian Wild; Friederike Hoffmann; Antje Boetius; Alban Ramette

    2012-01-01

    Background: Cold-water coral reef ecosystems are recognized as biodiversity hotspots in the deep sea, but insights into their associated bacterial communities are still limited. Deciphering principal patterns of bacterial community variation over multiple spatial scales may however prove critical for a better understanding of factors contributing to cold-water coral reef stability and functioning. Methodology/Principal Findings: Bacterial community structure, as determined by Automated Riboso...

  18. Mapping and validating predictions of soil bacterial biodiversity using European and national scale datasets

    OpenAIRE

    Thomson, Bruce C.; Plassart, Pierre; Gweon, Hyun S.; Stone, Dorothy; Creamer, Rachael E.; Lemanceau, Philippe; Bailey, Mark J

    2016-01-01

    Recent research has highlighted strong correlations between soil edaphic parameters and bacterial biodiversity. Here we seek to explore these relationships across the European Union member states with respect to mapping bacterial biodiversity at the continental scale. As part of the EU FP7 EcoFINDERs project, bacterial communities from 76 soil samples taken across Europe were assessed from eleven countries encompassing Arctic to Southern Mediterranean climes, representing a diverse range of s...

  19. Genome-scale metabolic representation of Amycolatopsis balhimycina

    DEFF Research Database (Denmark)

    Vongsangnak, Wanwipa; Figueiredo, L. F.; Förster, Jochen;

    2012-01-01

    to reconstruct a genome‐scale metabolic model for the organism. Here we generated an almost complete A. balhimycina genome sequence comprising 10,562,587 base pairs assembled into 2,153 contigs. The high GC‐genome (∼69%) includes 8,585 open reading frames (ORFs). We used our integrative toolbox called SEQTOR...... behavior and improved balhimycin production were simulated. The A. balhimycina model shows a good agreement between in silico data and experimental data and also identifies key reactions associated with increased balhimycin production. The reconstruction of the genome‐scale metabolic model of A...... for functional annotation and then integrated annotated data with biochemical and physiological information available for this organism to reconstruct a genome‐scale metabolic model of A. balhimycina. The resulting metabolic model contains 583 ORFs as protein encoding genes (7% of the predicted 8,585 ORFs), 407...

  20. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins.

    Science.gov (United States)

    Croucher, Nicholas J; Page, Andrew J; Connor, Thomas R; Delaney, Aidan J; Keane, Jacqueline A; Bentley, Stephen D; Parkhill, Julian; Harris, Simon R

    2015-02-18

    The emergence of new sequencing technologies has facilitated the use of bacterial whole genome alignments for evolutionary studies and outbreak analyses. These datasets, of increasing size, often include examples of multiple different mechanisms of horizontal sequence transfer resulting in substantial alterations to prokaryotic chromosomes. The impact of these processes demands rapid and flexible approaches able to account for recombination when reconstructing isolates' recent diversification. Gubbins is an iterative algorithm that uses spatial scanning statistics to identify loci containing elevated densities of base substitutions suggestive of horizontal sequence transfer while concurrently constructing a maximum likelihood phylogeny based on the putative point mutations outside these regions of high sequence diversity. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistically parameterized models of bacterial evolution, and achieves convergence in only a few hours on alignments of hundreds of bacterial genome sequences. Gubbins is appropriate for reconstructing the recent evolutionary history of a variety of haploid genotype alignments, as it makes no assumptions about the underlying mechanism of recombination. The software is freely available for download at github.com/sanger-pathogens/Gubbins, implemented in Python and C and supported on Linux and Mac OS X. PMID:25414349
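
    The idea of flagging loci with elevated densities of base substitutions can be illustrated with a much simpler fixed-window count. The sketch below is not the Gubbins algorithm, which according to the abstract uses spatial scanning statistics inside an iterative phylogeny-building loop; it only shows the density comparison itself. The function name, the window size and the fold threshold are all illustrative assumptions.

```python
def dense_snp_windows(snp_positions, genome_length, window=1000, fold=3.0):
    """Return (start, end, count) for fixed windows with unusually many SNPs.

    A window is flagged when its SNP count exceeds `fold` times the count
    expected if SNPs were spread uniformly over the genome. Positions are
    0-based. The threshold rule is a simplification chosen for illustration.
    """
    if genome_length <= 0 or window <= 0:
        raise ValueError("genome_length and window must be positive")
    expected = len(snp_positions) * window / genome_length

    counts = {}
    for pos in snp_positions:
        if not 0 <= pos < genome_length:
            raise ValueError(f"position {pos} is outside the genome")
        counts[pos // window] = counts.get(pos // window, 0) + 1

    flagged = []
    for w in sorted(counts):
        if counts[w] > fold * expected:
            start = w * window
            end = min(start + window, genome_length)
            flagged.append((start, end, counts[w]))
    return flagged


if __name__ == "__main__":
    snps = [100, 2500, 5000, 5010, 5020, 5030, 5040, 5050, 9000]
    print(dense_snp_windows(snps, genome_length=10_000))
```

    In a recombination-aware analysis, substitutions inside flagged regions would be set aside before the phylogeny is estimated from the remaining putative point mutations, which is the separation the abstract describes.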

  1. Development and validation of an rDNA operon based primer walking strategy applicable to de novo bacterial genome finishing

    OpenAIRE

    Eastman, Alexander W; Yuan, Ze-Chun

    2015-01-01

    Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms have necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the prese...

  2. Genome-wide selective sweeps and gene-specific sweeps in natural bacterial populations.

    Science.gov (United States)

    Bendall, Matthew L; Stevens, Sarah Lr; Chan, Leong-Keat; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Froula, Jeff; Kang, Dongwan; Tringe, Susannah G; Bertilsson, Stefan; Moran, Mary A; Shade, Ashley; Newton, Ryan J; McMahon, Katherine D; Malmstrom, Rex R

    2016-07-01

    Multiple models describe the formation and evolution of distinct microbial phylogenetic groups. These evolutionary models make different predictions regarding how adaptive alleles spread through populations and how genetic diversity is maintained. Processes predicted by competing evolutionary models, for example, genome-wide selective sweeps vs gene-specific sweeps, could be captured in natural populations using time-series metagenomics if the approach were applied over a sufficiently long time frame. Direct observations of either process would help resolve how distinct microbial groups evolve. Here, from a 9-year metagenomic study of a freshwater lake (2005-2013), we explore changes in single-nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in 30 bacterial populations. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied by >1000-fold among populations. SNP allele frequencies also changed dramatically over time within some populations. Interestingly, nearly all SNP variants were slowly purged over several years from one population of green sulfur bacteria, while at the same time multiple genes either swept through or were lost from this population. These patterns were consistent with a genome-wide selective sweep in progress, a process predicted by the 'ecotype model' of speciation but not previously observed in nature. In contrast, other populations contained large, SNP-free genomic regions that appear to have swept independently through the populations prior to the study without purging diversity elsewhere in the genome. Evidence for both genome-wide and gene-specific sweeps suggests that different models of bacterial speciation may apply to different populations coexisting in the same environment. PMID:26744812

  3. Construction of an infectious clone of canine herpesvirus genome as a bacterial artificial chromosome.

    Science.gov (United States)

    Arii, Jun; Hushur, Orkash; Kato, Kentaro; Kawaguchi, Yasushi; Tohya, Yukinobu; Akashi, Hiroomi

    2006-04-01

    Canine herpesvirus (CHV) is an attractive candidate not only for use as a recombinant vaccine to protect dogs from a variety of canine pathogens but also as a viral vector for gene therapy in domestic animals. However, developments in this area have been impeded by the complicated techniques used for eukaryotic homologous recombination. To overcome these problems, we used bacterial artificial chromosomes (BACs) to generate infectious BACs. Our findings may be summarized as follows: (i) the CHV genome (pCHV/BAC), in which a BAC flanked by loxP sites was inserted into the thymidine kinase gene, was maintained in Escherichia coli; (ii) transfection of pCHV/BAC into A-72 cells resulted in the production of infectious virus; (iii) the BAC vector sequence was almost perfectly excisable from the genome of the reconstituted virus CHV/BAC by co-infection with CHV/BAC and a recombinant adenovirus that expressed the Cre recombinase; and (iv) a recombinant virus in which the glycoprotein C gene was deleted was generated by lambda recombination followed by Flp recombination, which resulted in a reduction in viral titer compared with that of the wild-type virus. The infectious clone pCHV/BAC is useful for the modification of the CHV genome using bacterial genetics, and CHV/BAC should have multiple applications in the rapid generation of genetically engineered CHV recombinants and the development of CHV vectors for vaccination and gene therapy in domestic animals. PMID:16515874

  4. Development and validation of an rDNA operon based primer walking strategy applicable to de novo bacterial genome finishing.

    Science.gov (United States)

    Eastman, Alexander W; Yuan, Ze-Chun

    2014-01-01

    Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms have necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the presence of repetitive DNA sequences and multi-copy operons. Typically, re-sequencing with multiple platforms and laborious, targeted Sanger sequencing are employed to finish a draft bacterial genome. Here we describe a novel strategy based on the identification and targeted sequencing of repetitive rDNA operons to expedite bacterial genome assembly and finishing. Our strategy was validated by finishing the genome of Paenibacillus polymyxa strain CR1, a bacterium with potential in sustainable agriculture and bio-based processes. An analysis of the 38 contigs contained in the P. polymyxa strain CR1 draft genome revealed 12 repetitive rDNA operons with varied intragenic and flanking regions of variable length, all located at contig boundaries and within contig gaps. These highly similar but not identical rDNA operons were experimentally verified and sequenced simultaneously with multiple, specially designed primer sets. This approach also identified and corrected significant sequence rearrangement generated during the initial in silico assembly of sequencing reads. Our approach reduces the effort associated with blind primer walking for contig assembly, increasing both the speed and feasibility of genome finishing. Our study further reinforces the notion that repetitive DNA elements are major limiting factors for genome finishing. Moreover, we provide a step-by-step workflow for genome finishing, which may guide future bacterial genome finishing projects. PMID

  5. Computational bacterial genome-wide analysis of phylogenetic profiles reveals potential virulence genes of Streptococcus agalactiae.

    Directory of Open Access Journals (Sweden)

    Frank Po-Yen Lin

    The phylogenetic profile of a gene is a reflection of its evolutionary history and can be defined as the differential presence or absence of a gene in a set of reference genomes. It has been employed to facilitate the prediction of gene functions. However, the hypothesis that the application of this concept can also facilitate the discovery of bacterial virulence factors has not been fully examined. In this paper, we test this hypothesis and report a computational pipeline designed to identify previously unknown bacterial virulence genes using group B streptococcus (GBS) as an example. Phylogenetic profiles of all GBS genes across 467 bacterial reference genomes were determined by candidate-against-all BLAST searches, which were then used to identify candidate virulence genes by machine learning models. Evaluation experiments with known GBS virulence genes suggested good functional and model consistency in cross-validation analyses (areas under ROC curve, 0.80 and 0.98 respectively). Inspection of the top-10 genes in each of the 15 virulence functional groups revealed at least 15 (of 119) homologous genes implicated in virulence in other human pathogens but previously unrecognized as potential virulence genes in GBS. Among these highly-ranked genes, many encode hypothetical proteins with possible roles in GBS virulence. Thus, our approach has led to the identification of a set of genes potentially affecting the virulence potential of GBS, which are potential candidates for further in vitro and in vivo investigations. This computational pipeline can also be extended to in silico analysis of virulence determinants of other bacterial pathogens.
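
    The definition of a phylogenetic profile given above (presence or absence of a gene across a set of reference genomes) maps directly onto a binary vector. The sketch below is a minimal illustration under stated assumptions, not the authors' pipeline: the input is a hypothetical dictionary of best BLAST E-values per reference genome, and the cutoff of 1e-5 is an arbitrary example value rather than one taken from the paper.

```python
def phylogenetic_profile(best_evalues, reference_genomes, evalue_cutoff=1e-5):
    """Binary presence/absence vector for one gene.

    best_evalues maps a reference genome name to the best E-value of the
    gene against that genome; genomes with no hit are simply missing.
    The cutoff is an illustrative assumption.
    """
    return [
        1 if best_evalues.get(genome, float("inf")) <= evalue_cutoff else 0
        for genome in reference_genomes
    ]


def hamming_distance(profile_a, profile_b):
    """Number of reference genomes in which two profiles disagree."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same reference genomes")
    return sum(a != b for a, b in zip(profile_a, profile_b))


if __name__ == "__main__":
    genomes = ["genome_1", "genome_2", "genome_3", "genome_4"]  # hypothetical names
    gene_x = phylogenetic_profile({"genome_1": 1e-40, "genome_3": 2e-8}, genomes)
    gene_y = phylogenetic_profile({"genome_1": 1e-12, "genome_4": 0.5}, genomes)
    print(gene_x, gene_y, hamming_distance(gene_x, gene_y))
```

    Vectors of this kind can serve as feature rows for a classifier, and genes with similar profiles (small Hamming distance) are the ones a profile-based method would treat as likely to share function.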

  6. MEMOSys: Bioinformatics platform for genome-scale metabolic models

    Directory of Open Access Journals (Sweden)

    Agren Rasmus

    2011-01-01

    Full Text Available Abstract. Background: Recent advances in genomic sequencing have enabled the use of genome sequencing in standard biological and biotechnological research projects. The challenge is how to integrate the large amount of data in order to gain novel biological insights. One way to leverage sequence data is to use genome-scale metabolic models. We have therefore designed and implemented a bioinformatics platform which supports the development of such metabolic models. Results: MEMOSys (MEtabolic MOdel research and development System) is a versatile platform for the management, storage, and development of genome-scale metabolic models. It supports the development of new models by providing a built-in version control system which offers access to the complete developmental history. Moreover, the integrated web board, the authorization system, and the definition of user roles allow collaborations across departments and institutions. Research on existing models is facilitated by a search system, references to external databases, and a feature-rich comparison mechanism. MEMOSys provides customizable data exchange mechanisms using the SBML format to enable analysis in external tools. The web application is based on the Java EE framework and offers an intuitive user interface. It currently contains six annotated microbial metabolic models. Conclusions: We have developed a web-based system designed to provide researchers a novel application facilitating the management and development of metabolic models. The system is freely available at http://www.icbi.at/MEMOSys.

  7. Cloning the simian varicella virus genome in E. coli as an infectious bacterial artificial chromosome

    OpenAIRE

    Gray, Wayne L.; Zhou, Fuchun; Noffke, Juliane; Tischer, B Karsten

    2011-01-01

    Simian varicella virus (SVV) is closely related to human varicella-zoster virus and causes varicella and zoster-like disease in nonhuman primates. In this study, a mini-F replicon was inserted into a SVV cosmid and infectious SVV was generated by co-transfection of Vero cells with overlapping SVV cosmids. The entire SVV genome, cloned as a bacterial artificial chromosome (BAC), was stably propagated upon serial passage in E. coli. Transfection of pSVV-BAC DNA into Vero cells yielded infectiou...

  8. Genomics reveals historic and contemporary transmission dynamics of a bacterial disease among wildlife and livestock

    Science.gov (United States)

    Kamath, Pauline L.; Foster, Jeffrey T.; Drees, Kevin P.; Luikart, Gordon; Quance, Christine; Anderson, Neil J.; Clarke, P. Ryan; Cole, Eric K.; Drew, Mark L.; Edwards, William H.; Rhyan, Jack C.; Treanor, John J.; Wallen, Rick L.; White, Patrick J.; Robbe-Austerman, Suelee; Cross, Paul C.

    2016-01-01

    Whole-genome sequencing has provided fundamental insights into infectious disease epidemiology, but has rarely been used for examining transmission dynamics of a bacterial pathogen in wildlife. In the Greater Yellowstone Ecosystem (GYE), outbreaks of brucellosis have increased in cattle along with rising seroprevalence in elk. Here we use a genomic approach to examine Brucella abortus evolution, cross-species transmission and spatial spread in the GYE. We find that brucellosis was introduced into wildlife in this region at least five times. The diffusion rate varies among Brucella lineages (~3 to 8 km per year) and over time. We also estimate 12 host transitions from bison to elk, and 5 from elk to bison. Our results support the notion that free-ranging elk are currently a self-sustaining brucellosis reservoir and the source of livestock infections, and that control measures in bison are unlikely to affect the dynamics of unrelated strains circulating in nearby elk populations.

  9. Reconstruction and analysis of the genome-scale metabolic model of Lactobacillus casei LC2W.

    Science.gov (United States)

    Xu, Nan; Liu, Jie; Ai, Lianzhong; Liu, Liming

    2015-01-10

    Lactobacillus casei LC2W is a recently isolated probiotic lactic acid bacterial strain, which is widely used in the dairy and pharmaceutical industries and in clinical medicine. The first genome-scale metabolic model for L. casei, composed of 846 genes, 969 metabolic reactions, and 785 metabolites, was reconstructed using both manual genome annotation and an automatic SEED model. Then, the iJL846 model was validated by simulating cell growth on 15 reported carbon sources. The iJL846 model explored the metabolism of L. casei on a genome scale: (1) explanation of the genetic codes: metabolic functions of 342 genes were reannotated in this model; (2) characterization of the physiology: 10 amino acids and 7 vitamins were identified to be essential nutrients for L. casei LC2W growth; (3) analyses of metabolic pathways, namely the transport and metabolism of the 17 essential nutrients and exopolysaccharide (EPS) biosynthesis, were performed; (4) exploration of metabolic capacity was conducted: for lactate, the importance of genes in its biosynthetic pathways was evaluated, and the requirements of amino acids were predicted for mixed acid fermentation; for flavor compounds, the effects of oxygen were analyzed, and three new knockout targets were selected for acetoin production; for EPS, 11 types of nutrients in the rich medium and important reactions in the biosynthetic pathway were identified that enhanced EPS production. In conclusion, the iJL846 model serves as a useful tool for understanding and engineering the metabolism of this probiotic strain. PMID:25452194

  10. Power Laws, Scale-Free Networks and Genome Biology

    CERN Document Server

    Koonin, Eugene V; Karev, Georgy P

    2006-01-01

    Power Laws, Scale-free Networks and Genome Biology deals with crucial aspects of the theoretical foundations of systems biology, namely power law distributions and scale-free networks which have emerged as the hallmarks of biological organization in the post-genomic era. The chapters in the book not only describe the interesting mathematical properties of biological networks but move beyond phenomenology, toward models of evolution capable of explaining the emergence of these features. The collection of chapters, contributed by both physicists and biologists, strives to address the problems in this field in a rigorous but not excessively mathematical manner and to represent different viewpoints, which is crucial in this emerging discipline. Each chapter includes, in addition to technical descriptions of properties of biological networks and evolutionary models, a more general and accessible introduction to the respective problems. Most chapters emphasize the potential of theoretical systems biology for disco...

  11. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network

    DEFF Research Database (Denmark)

    Förster, Jochen; Famili, I.; Fu, P.;

    2003-01-01

    environment were included. A total of 708 structural open reading frames (ORFs) were accounted for in the reconstructed network, corresponding to 1035 metabolic reactions. Further, 140 reactions were included on the basis of biochemical evidence resulting in a genome-scale reconstructed metabolic network...... Escherichia coli. The reconstructed metabolic network is the first comprehensive network for a eukaryotic organism, and it may be used as the basis for in silico analysis of phenotypic functions....

  12. Combining p-values in large scale genomics experiments

    OpenAIRE

    Dmitri V Zaykin; Zhivotovsky, Lev A.; Czika, Wendy; Shao, Susan; Wolfinger, Russell D.

    2007-01-01

    In large-scale genomics experiments involving thousands of statistical tests, such as association scans and microarray expression experiments, a key question is: Which of the L tests represent true associations (TAs)? The traditional way to control false findings is via individual adjustments. In the presence of multiple TAs, p-value combination methods offer certain advantages. Both Fisher’s and Lancaster’s combination methods use an inverse gamma transformation. We identify the relation of ...
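
As a hedged illustration of the p-value combination idea mentioned above, the following Python sketch implements the textbook form of Fisher's method for independent tests. It is not taken from the paper: the function name is illustrative, and the closed-form chi-square tail is used only because the degrees of freedom (2k for k tests) are always even.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis (k = number of
    tests, assumed independent).
    """
    if not pvalues or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # equals X / 2
    # For even degrees of freedom 2k the upper tail has a closed form:
    # P(chi2_{2k} > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Two individually borderline results reinforce each other when combined.
combined = fisher_combined_pvalue([0.05, 0.05])
```

With a single p-value the function returns that p-value unchanged, which is a convenient sanity check on the tail formula.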

  13. An improved method for oriT-directed cloning and functionalization of large bacterial genomic regions.

    Science.gov (United States)

    Kvitko, Brian H; McMillan, Ian A; Schweizer, Herbert P

    2013-08-01

    We have made significant improvements to a broad-host-range system for the cloning and manipulation of large bacterial genomic regions based on site-specific recombination between directly repeated oriT sites during conjugation. Using two suicide capture vectors carrying flanking homology regions, oriT sites are recombined on either side of the target region. Using a broad-host-range conjugation helper plasmid, the region between the oriT sites is conjugated into an Escherichia coli recipient strain, where it is circularized and maintained as a chimeric mini-F vector. The cloned target region is functionalized in multiple ways to accommodate downstream manipulation. The target region is flanked with Gateway attB sites for recombination into other vectors and by rare 18-bp I-SceI restriction sites for subcloning. The Tn7-functionalized target can also be inserted at a naturally occurring chromosomal attTn7 site(s) or maintained as a broad-host-range plasmid for complementation or heterologous expression studies. We have used the oriTn7 capture technique to clone and complement Burkholderia pseudomallei genomic regions up to 140 kb in size and have created isogenic Burkholderia strains with various combinations of genomic islands. We believe this system will greatly aid the cloning and genetic analysis of genomic islands, biosynthetic gene clusters, and large open reading frames. PMID:23747708

  14. Multiplex sequencing of bacterial artificial chromosomes for assembling complex plant genomes.

    Science.gov (United States)

    Beier, Sebastian; Himmelbach, Axel; Schmutzer, Thomas; Felder, Marius; Taudien, Stefan; Mayer, Klaus F X; Platzer, Matthias; Stein, Nils; Scholz, Uwe; Mascher, Martin

    2016-07-01

    Hierarchical shotgun sequencing remains the method of choice for assembling high-quality reference sequences of complex plant genomes. The efficient exploitation of current high-throughput technologies and powerful computational facilities for large-insert clone sequencing necessitates the sequencing and assembly of a large number of clones in parallel. We developed a multiplexed pipeline for shotgun sequencing and assembling individual bacterial artificial chromosomes (BACs) using the Illumina sequencing platform. We illustrate our approach by sequencing 668 barley BACs (Hordeum vulgare L.) in a single Illumina HiSeq 2000 lane. Using a newly designed parallelized computational pipeline, we obtained sequence assemblies of individual BACs that consist, on average, of eight sequence scaffolds and represent >98% of the genomic inserts. Our BAC assemblies are clearly superior to a whole-genome shotgun assembly regarding contiguity, completeness and the representation of the gene space. Our methods may be employed to rapidly obtain high-quality assemblies of a large number of clones to assemble map-based reference sequences of plant and animal species with complex genomes by sequencing along a minimum tiling path. PMID:26801048

  15. Single-molecule approach to bacterial genomic comparisons via optical mapping.

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, Shiguo [Univ. Wisc.-Madison; Kile, A. [Univ. Wisc.-Madison; Bechner, M. [Univ. Wisc.-Madison; Kvikstad, E. [Univ. Wisc.-Madison; Deng, W. [Univ. Wisc.-Madison; Wei, J. [Univ. Wisc.-Madison; Severin, J. [Univ. Wisc.-Madison; Runnheim, R. [Univ. Wisc.-Madison; Churas, C. [Univ. Wisc.-Madison; Forrest, D. [Univ. Wisc.-Madison; Dimalanta, E. [Univ. Wisc.-Madison; Lamers, C. [Univ. Wisc.-Madison; Burland, V. [Univ. Wisc.-Madison; Blattner, F. R. [Univ. Wisc.-Madison; Schwartz, David C. [Univ. Wisc.-Madison

    2004-01-01

    Modern comparative genomics has been established, in part, by the sequencing and annotation of a broad range of microbial species. To gain further insights, new sequencing efforts are now dealing with the variety of strains or isolates that gives a species definition and range; however, this number vastly outstrips our ability to sequence them. Given the availability of a large number of microbial species, new whole genome approaches must be developed to fully leverage this information at the level of strain diversity that maximizes discovery. Here, we describe how optical mapping, a single-molecule system, was used to identify and annotate chromosomal alterations between bacterial strains represented by several species. Since whole-genome optical maps are ordered restriction maps, sequenced strains of Shigella flexneri serotype 2a (2457T and 301), Yersinia pestis (CO 92 and KIM), and Escherichia coli were aligned as maps to identify regions of homology and to further characterize them as possible insertions, deletions, inversions, or translocations. Importantly, an unsequenced Shigella flexneri strain (serotype Y strain AMC[328Y]) was optically mapped and aligned with two sequenced ones to reveal one novel locus implicated in serotype conversion and several other loci containing insertion sequence elements or phage-related gene insertions. Our results suggest that genomic rearrangements and chromosomal breakpoints are readily identified and annotated against a prototypic sequenced strain by using the tools of optical mapping.

  16. Differential regulation of horizontally acquired and core genome genes by the bacterial modulator H-NS.

    Directory of Open Access Journals (Sweden)

    Rosa C Baños

    2009-06-01

    Full Text Available Horizontal acquisition of DNA by bacteria dramatically increases genetic diversity and hence successful bacterial colonization of several niches, including the human host. A relevant issue is how this newly acquired DNA interacts and integrates in the regulatory networks of the bacterial cell. The global modulator H-NS targets both core genome and HGT genes and silences gene expression in response to external stimuli such as osmolarity and temperature. Here we provide evidence that H-NS discriminates and differentially modulates core and HGT DNA. As an example of this, plasmid R27-encoded H-NS protein has evolved to selectively silence HGT genes and does not interfere with core genome regulation. In turn, differential regulation of both gene lineages by resident chromosomal H-NS requires a helper protein: the Hha protein. Tight silencing of HGT DNA is accomplished by H-NS-Hha complexes. In contrast, core genes are modulated by H-NS homoligomers. Remarkably, the presence of Hha-like proteins is restricted to the Enterobacteriaceae. In addition, conjugative plasmids encoding H-NS variants have hitherto been isolated only from members of the family. Thus, the H-NS system in enteric bacteria presents unique evolutionary features. The capacity to selectively discriminate between core and HGT DNA may help to maintain horizontally transmitted DNA in silent form and may give these bacteria a competitive advantage in adapting to new environments, including host colonization.

  17. 13C metabolic flux analysis at a genome-scale.

    Science.gov (United States)

    Gopalakrishnan, Saratram; Maranas, Costas D

    2015-11-01

    Metabolic models used in 13C metabolic flux analysis generally include a limited number of reactions primarily from central metabolism. They typically omit degradation pathways, complete cofactor balances, and atom transition contributions for reactions outside central metabolism. This study addresses the impact on prediction fidelity of scaling-up mapping models to a genome-scale. The core mapping model employed in this study accounts for 75 reactions and 65 metabolites, primarily from central metabolism. The genome-scale metabolic mapping model (GSMM) (697 reactions and 595 metabolites) is constructed using as a basis the iAF1260 model upon eliminating reactions guaranteed not to carry flux based on growth and fermentation data for a minimal glucose growth medium. Labeling data for 17 amino acid fragments obtained from cells fed with glucose labeled at the second carbon was used to obtain fluxes and ranges. Metabolic fluxes and confidence intervals are estimated, for both core and genome-scale mapping models, by minimizing the sum of squared differences between predicted and experimentally measured labeling patterns using the EMU decomposition algorithm. Overall, we find that both topology and estimated values of the metabolic fluxes remain largely consistent between core and GSM model. Stepping up to a genome-scale mapping model leads to wider flux inference ranges for 20 key reactions present in the core model. The glycolysis flux range doubles due to the possibility of active gluconeogenesis, the TCA flux range expanded by 80% due to the availability of a bypass through arginine consistent with labeling data, and the transhydrogenase reaction flux was essentially unresolved due to the presence of as many as five routes for the inter-conversion of NADPH to NADH afforded by the genome-scale model. By globally accounting for ATP demands in the GSMM model the unused ATP decreased drastically with the lower bound matching the maintenance ATP requirement. A non
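
The fitting criterion described above (minimizing the squared differences between predicted and measured labeling patterns) can be sketched with a deliberately simplified Python toy. It assumes a single unknown flux split between two hypothetical routes whose labeling patterns mix linearly, and it uses a grid search; the EMU simulation and the real network are not reproduced here, and all numbers are invented for illustration.

```python
def ssr(predicted, measured, sd):
    """Variance-weighted sum of squared residuals between labeling patterns."""
    return sum(((p - m) / s) ** 2 for p, m, s in zip(predicted, measured, sd))


def mix(f, pattern_a, pattern_b):
    """Labeling pattern when a fraction f of flux uses route A and 1 - f route B."""
    return [f * a + (1 - f) * b for a, b in zip(pattern_a, pattern_b)]


def fit_split(pattern_a, pattern_b, measured, sd, steps=1000):
    """Grid search over f in [0, 1] for the split that minimizes the SSR."""
    candidates = (i / steps for i in range(steps + 1))
    return min(
        candidates,
        key=lambda f: ssr(mix(f, pattern_a, pattern_b), measured, sd),
    )


# Hypothetical mass isotopomer fractions (M+0, M+1, M+2) produced by each route.
pattern_a = [0.9, 0.1, 0.0]
pattern_b = [0.2, 0.5, 0.3]
sd = [0.01, 0.01, 0.01]  # assumed measurement standard deviations

# Synthetic "measurement" generated with a known split, so the fit can be checked.
measured = mix(0.7, pattern_a, pattern_b)
best_f = fit_split(pattern_a, pattern_b, measured, sd)
```

In a genome-scale setting there are many free fluxes and alternative routes, so several flux combinations can give a similar SSR; that is the kind of effect behind the wider flux ranges reported in the abstract.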

  18. Tracing the Spread of Clostridium difficile Ribotype 027 in Germany Based on Bacterial Genome Sequences.

    Directory of Open Access Journals (Sweden)

    Matthias Steglich

    Full Text Available We applied whole-genome sequencing to reconstruct the spatial and temporal dynamics underpinning the expansion of Clostridium difficile ribotype 027 in Germany. Based on re-sequencing of genomes from 57 clinical C. difficile isolates, which had been collected from hospitalized patients at 36 locations throughout Germany between 1990 and 2012, we demonstrate that C. difficile genomes have accumulated sequence variation sufficiently fast to document the pathogen's spread at a regional scale. We detected both previously described lineages of fluoroquinolone-resistant C. difficile ribotype 027, FQR1 and FQR2. Using Bayesian phylogeographic analyses, we show that fluoroquinolone-resistant C. difficile 027 was imported into Germany at least four times, that it had been widely disseminated across multiple federal states even before the first outbreak was noted in 2007, and that it has continued to spread since.

  19. First genomic insights into members of a candidate bacterial phylum responsible for wastewater bulking

    Directory of Open Access Journals (Sweden)

    Yuji Sekiguchi

    2015-01-01

    Full Text Available Filamentous cells belonging to the candidate bacterial phylum KSB3 were previously identified as the causative agent of fatal filament overgrowth (bulking) in a high-rate industrial anaerobic wastewater treatment bioreactor. Here, we obtained near complete genomes from two KSB3 populations in the bioreactor, including the dominant bulking filament, using differential coverage binning of metagenomic data. Fluorescence in situ hybridization with 16S rRNA-targeted probes specific for the two populations confirmed that both are filamentous organisms. Genome-based metabolic reconstruction and microscopic observation of the KSB3 filaments in the presence of sugar gradients indicate that both filament types are Gram-negative, strictly anaerobic fermenters capable of non-flagellar based gliding motility, and have a strikingly large number of sensory and response regulator genes. We propose that the KSB3 filaments are highly sensitive to their surroundings and that cellular processes, including those causing bulking, are controlled by external stimuli. The obtained genomes lay the foundation for a more detailed understanding of environmental cues used by KSB3 filaments, which may lead to more robust treatment options to prevent bulking.

  20. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    Energy Technology Data Exchange (ETDEWEB)

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMahon, Katherine D.; Malmstrom, Rex R.

    2014-05-12

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  1. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    Energy Technology Data Exchange (ETDEWEB)

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMahon, Katherine D.; Malmstrom, Rex R.

    2014-06-18

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  2. A genome-scale metabolic reconstruction of Mycoplasma genitalium, iPS189.

    Directory of Open Access Journals (Sweden)

    Patrick F Suthers

    2009-02-01

    Full Text Available With a genome size of approximately 580 kb and approximately 480 protein coding regions, Mycoplasma genitalium is one of the smallest known self-replicating organisms and, additionally, has extremely fastidious nutrient requirements. The reduced genomic content of M. genitalium has led researchers to suggest that the molecular assembly contained in this organism may be a close approximation to the minimal set of genes required for bacterial growth. Here, we introduce a systematic approach for the construction and curation of a genome-scale in silico metabolic model for M. genitalium. Key challenges included estimation of biomass composition, handling of enzymes with broad specificities, and the lack of a defined medium. Computational tools were subsequently employed to identify and resolve connectivity gaps in the model as well as growth prediction inconsistencies with gene essentiality experimental data. The curated model, M. genitalium iPS189 (262 reactions, 274 metabolites), is 87% accurate in recapitulating in vivo gene essentiality results for M. genitalium. Approaches and tools described herein provide a roadmap for the automated construction of in silico metabolic models of other organisms.

  3. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    Energy Technology Data Exchange (ETDEWEB)

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  4. Cloning human herpes virus 6A genome into bacterial artificial chromosomes and study of DNA replication intermediates

    OpenAIRE

    Borenstein, Ronen; Frenkel, Niza

    2009-01-01

    Cloning of large viral genomes into bacterial artificial chromosomes (BACs) facilitates analyses of viral functions and molecular mutagenesis. Previous derivations of viral BACs involved laborious recombinations within infected cells. We describe a single-step production of viral BACs by direct cloning of unit length genomes, derived from circular or head-to-tail concatemeric DNA replication intermediates. The BAC cloning is independent of intracellular recombinations and DNA packaging constr...

  5. Genome-scale constraint-based modeling of Geobacter metallireducens

    Directory of Open Access Journals (Sweden)

    Famili Iman

    2009-01-01

    Full Text Available Abstract. Background: Geobacter metallireducens was the first organism that can be grown in pure culture to completely oxidize organic compounds with Fe(III) oxide serving as electron acceptor. Geobacter species, including G. sulfurreducens and G. metallireducens, are used for bioremediation and electricity generation from waste organic matter and renewable biomass. The constraint-based modeling approach enables the development of genome-scale in silico models that can predict the behavior of complex biological systems and their responses to the environments. Such a modeling approach was applied to provide physiological and ecological insights on the metabolism of G. metallireducens. Results: The genome-scale metabolic model of G. metallireducens was constructed to include 747 genes and 697 reactions. Compared to the G. sulfurreducens model, the G. metallireducens metabolic model contains 118 unique reactions that reflect many of G. metallireducens' specific metabolic capabilities. Detailed examination of the G. metallireducens model suggests that its central metabolism contains several energy-inefficient reactions that are not present in the G. sulfurreducens model. Experimental biomass yield of G. metallireducens growing on pyruvate was lower than the predicted optimal biomass yield. Microarray data of G. metallireducens growing with benzoate and acetate indicated that genes encoding these energy-inefficient reactions were up-regulated by benzoate. These results suggested that the energy-inefficient reactions were likely turned off during G. metallireducens growth with acetate for optimal biomass yield, but were up-regulated during growth with complex electron donors such as benzoate for rapid energy generation. Furthermore, several computational modeling approaches were applied to accelerate G. metallireducens research. For example, growth of G. metallireducens with different electron donors and electron acceptors was studied using the genome-scale

  6. Spatial scales of bacterial diversity in cold-water coral reef ecosystems.

    Directory of Open Access Journals (Sweden)

    Sandra Schöttner

    Full Text Available BACKGROUND: Cold-water coral reef ecosystems are recognized as biodiversity hotspots in the deep sea, but insights into their associated bacterial communities are still limited. Deciphering principal patterns of bacterial community variation over multiple spatial scales may however prove critical for a better understanding of factors contributing to cold-water coral reef stability and functioning. METHODOLOGY/PRINCIPAL FINDINGS: Bacterial community structure, as determined by Automated Ribosomal Intergenic Spacer Analysis (ARISA), was investigated with respect to (i) microbial habitat type and (ii) coral species and color, as well as the three spatial components (iii) geomorphologic reef zoning, (iv) reef boundary, and (v) reef location. Communities revealed fundamental differences between coral-generated (branch surface, mucus) and ambient microbial habitats (seawater, sediments). This habitat specificity appeared pivotal for determining bacterial community shifts over all other study levels investigated. Coral-derived surfaces showed species-specific patterns, differing significantly between Lophelia pertusa and Madrepora oculata, but not between L. pertusa color types. Within the reef center, no community distinction corresponded to geomorphologic reef zoning for both coral-generated and ambient microbial habitats. Beyond the reef center, however, bacterial communities varied considerably from local to regional scales, with marked shifts toward the reef periphery as well as between different in- and offshore reef sites, suggesting significant biogeographic imprinting but weak microbe-host specificity. CONCLUSIONS/SIGNIFICANCE: This study presents the first multi-scale survey of bacterial diversity in cold-water coral reefs, spanning a total of five observational levels including three spatial scales. It demonstrates that bacterial communities in cold-water coral reefs are structured by multiple factors acting at different spatial scales, which has

  7. Genome scale metabolic modeling of the riboflavin overproducer Ashbya gossypii.

    Science.gov (United States)

    Ledesma-Amaro, Rodrigo; Kerkhoven, Eduard J; Revuelta, José Luis; Nielsen, Jens

    2014-06-01

    Ashbya gossypii is a filamentous fungus that naturally overproduces riboflavin, or vitamin B2. Advances in genetic and metabolic engineering of A. gossypii have permitted the switch from industrial chemical synthesis to the current biotechnological production of this vitamin. Additionally, A. gossypii is a model organism with one of the smallest eukaryote genomes being phylogenetically close to Saccharomyces cerevisiae. It has therefore been used to study evolutionary aspects of bakers' yeast. We here reconstructed the first genome scale metabolic model of A. gossypii, iRL766. The model was validated by biomass growth, riboflavin production and substrate utilization predictions. Gene essentiality analysis of the A. gossypii model in comparison with the S. cerevisiae model demonstrated how the whole-genome duplication event that separates the two species has led to an even spread of paralogs among all metabolic pathways. Additionally, iRL766 was used to integrate transcriptomics data from two different growth stages of A. gossypii, comparing exponential growth to riboflavin production stages. Both reporter metabolite analysis and in silico identification of transcriptionally regulated enzymes demonstrated the important involvement of beta-oxidation and the glyoxylate cycle in riboflavin production. PMID:24374726

  8. A genomic scale map of genetic diversity in Trypanosoma cruzi

    Directory of Open Access Journals (Sweden)

    Ackermann Alejandro A

    2012-12-01

    Background: Trypanosoma cruzi, the causal agent of Chagas Disease, affects more than 16 million people in Latin America. The clinical outcome of the disease results from a complex interplay between environmental factors and the genetic background of both the human host and the parasite. However, knowledge of the genetic diversity of the parasite is currently limited to a number of highly studied loci. The availability of a number of genomes from different evolutionary lineages of T. cruzi provides an unprecedented opportunity to look at the genetic diversity of the parasite at a genomic scale. Results: Using a bioinformatic strategy, we have clustered T. cruzi sequence data available in the public domain and obtained multiple sequence alignments in which one or two alleles from the reference CL-Brener were included. These data cover 4 major evolutionary lineages (DTUs): TcI, TcII, TcIII, and the hybrid TcVI. Using this set of alignments we have identified 288,957 high quality single nucleotide polymorphisms and 1,480 indels. In a reduced re-sequencing study we were able to validate ~97% of high-quality SNPs identified in 47 loci. Analysis of how these changes affect encoded protein products showed a 0.77 ratio of synonymous to non-synonymous changes in the T. cruzi genome. We observed 113 changes that introduce or remove a stop codon, some causing significant functional changes, and a number of tri-allelic and tetra-allelic SNPs that could be exploited in strain typing assays. Based on an analysis of the observed nucleotide diversity we show that the T. cruzi genome contains a core set of genes that are under apparent purifying selection. Interestingly, orthologs of known druggable targets show statistically significantly lower nucleotide diversity values. Conclusions: This study provides the first look at the genetic diversity of T. cruzi at a genomic scale. The analysis covers an estimated ~60% of the genetic diversity present in the

  9. Solving the problem of comparing whole bacterial genomes across different sequencing platforms.

    Directory of Open Access Journals (Sweden)

    Rolf S Kaas

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due to a validation of all informative sites that are included in the analysis. The procedures are available as web tools.

  10. Solving the problem of comparing whole bacterial genomes across different sequencing platforms.

    Science.gov (United States)

    Kaas, Rolf S; Leekitcharoenphon, Pimlapas; Aarestrup, Frank M; Lund, Ole

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due to a validation of all informative sites that are included in the analysis. The procedures are available as web tools. PMID:25110940

  11. BFAST: an alignment tool for large scale genome resequencing.

    Directory of Open Access Journals (Sweden)

    Nils Homer

    BACKGROUND: The new generation of massively parallel DNA sequencers, combined with the challenge of whole human genome resequencing, results in the need for rapid and accurate alignment of billions of short DNA sequence reads to a large reference genome. Speed is obviously of great importance, but equally important is maintaining alignment accuracy of short reads, in the 25-100 base range, in the presence of errors and true biological variation. METHODOLOGY: We introduce a new algorithm specifically optimized for this task, as well as a freely available implementation, BFAST, which can align data produced by any of the current sequencing platforms, allows for user-customizable levels of speed and accuracy, supports paired end data, and provides for efficient parallel and multi-threaded computation on a computer cluster. The new method is based on creating flexible, efficient whole genome indexes to rapidly map reads to candidate alignment locations, with arbitrary multiple independent indexes allowed to achieve robustness against read errors and sequence variants. The final local alignment uses a Smith-Waterman method, with gaps to support the detection of small indels. CONCLUSIONS: We compare BFAST to a selection of large-scale alignment tools -- BLAT, MAQ, SHRiMP, and SOAP -- in terms of both speed and accuracy, using simulated and real-world datasets. We show BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods. We show BFAST can align the amount of data needed to fully resequence a human genome, one billion reads, with high sensitivity and accuracy, on a modest computer cluster in less than 24 hours. BFAST is available at http://bfast.sourceforge.net.
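
    The gapped local alignment step mentioned in the abstract above can be illustrated with a minimal Smith-Waterman sketch. This is not BFAST's implementation: the function name, the linear gap penalty and the scoring values (match=2, mismatch=-1, gap=-2) are assumptions chosen only for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Scoring values are illustrative assumptions, not BFAST defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        score = H[i][j]
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif score == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")  # gap in b (deletion from b)
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])  # gap in a (insertion in b)
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Toy read and reference; the read lacks one C relative to the reference.
score, ref_aln, read_aln = smith_waterman("AAACCCGGG", "AAACCGGG")
print(score, ref_aln, read_aln)
```

    In this toy case the single missing base shows up as one "-" in the aligned read, which is the kind of small indel the abstract says the gapped alignment is meant to detect.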

  12. Modeling Lactococcus lactis using a genome-scale flux model

    Directory of Open Access Journals (Sweden)

    Nielsen Jens

    2005-06-01

    Background: Genome-scale flux models are useful tools to represent and analyze microbial metabolism. In this work we reconstructed the metabolic network of the lactic acid bacterium Lactococcus lactis and developed a genome-scale flux model able to simulate and analyze network capabilities and whole-cell function under aerobic and anaerobic continuous cultures. Flux balance analysis (FBA) and minimization of metabolic adjustment (MOMA) were used as modeling frameworks. Results: The metabolic network was reconstructed using the annotated genome sequence from L. lactis ssp. lactis IL1403 together with physiological and biochemical information. The established network comprised a total of 621 reactions and 509 metabolites, representing the overall metabolism of L. lactis. Experimental data reported in the literature was used to fit the model to phenotypic observations. Regulatory constraints had to be included to simulate certain metabolic features, such as the shift from homo to heterolactic fermentation. A minimal medium for in silico growth was identified, indicating the requirement of four amino acids in addition to a sugar. Remarkably, de novo biosynthesis of four other amino acids was observed even when all amino acids were supplied, which is in good agreement with experimental observations. Additionally, enhanced metabolic engineering strategies for improved diacetyl-producing strains were designed. Conclusion: The L. lactis metabolic network can now be used for a better understanding of lactococcal metabolic capabilities and potential, for the design of enhanced metabolic engineering strategies and for integration with other types of 'omic' data, to assist in finding new information on cellular organization and function.

  13. Selection for Unequal Densities of Sigma70 Promoter-like Signals in Different Regions of Large Bacterial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Huerta, Araceli M.; Francino, M. Pilar; Morett, Enrique; Collado-Vides, Julio

    2006-03-01

    The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. In Escherichia coli, we have established a sequence pattern that distinguishes regulatory from nonregulatory regions. The density of promoter-like sequences, that are recognizable by RNA polymerase and may function as potential promoters, is high within regulatory regions, in contrast to coding regions and regions located between convergently-transcribed genes. Moreover, functional promoter sites identified experimentally are often found in the subregions of highest density of promoter-like signals, even when individual sites with higher binding affinity for RNA polymerase exist elsewhere within the regulatory region. In order to investigate the generality of this pattern, we have used position weight matrices describing the -35 and -10 promoter boxes of E. coli to search for these motifs in 43 additional genomes belonging to most established bacterial phyla, after specific calibration of the matrices according to the base composition of the noncoding regions of each genome. We have found that all bacterial species analyzed contain similar promoter-like motifs, and that, in most cases, these motifs follow the same genomic distribution observed in E. coli. Differential densities between regulatory and nonregulatory regions are detectable in most bacterial genomes, with the exception of those that have experienced evolutionary extreme genome reduction. Thus, the phylogenetic distribution of this pattern mirrors that of genes and other genomic features that require weak selection to be effective in order to persist. On this basis, we suggest that the loss of differential densities in the reduced genomes of host-restricted pathogens and symbionts is the outcome of a process of genome degradation resulting from the decreased efficiency of purifying selection in highly structured small populations. This implies that the differential
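
    The motif search described in the abstract above (position weight matrices calibrated to the base composition of each genome) can be sketched as a log-odds scan. The example sites, the background frequencies, the pseudocount and the score threshold below are hypothetical values for illustration; they are not the matrices or calibration used in the study.

```python
import math

BASES = "ACGT"

def build_pwm(sites, background, pseudocount=1.0):
    """Build a log-odds position weight matrix from equal-length aligned sites.

    `background` maps each base to its frequency in the regions being scanned,
    which is where a genome-specific base composition would enter.
    """
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background[b]) for b in BASES})
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits

# Hypothetical -10 box examples and a uniform background (assumptions).
sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
pwm = build_pwm(sites, background)
print(scan("GGGTATAATCCC", pwm, threshold=5.0))
```

    Counting hits per kilobase separately in regulatory and nonregulatory regions would then give the kind of density comparison the abstract refers to.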

  14. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences

    NARCIS (Netherlands)

    Medema, M.H.; Blin, K.; Cimermancic, P.; Jager, de V.C.L.; Zakrzewski, P.; Fischbach, M.A.; Weber, T.; Takano, E.; Breitling, R.

    2011-01-01

    Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide var

  15. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences.

    NARCIS (Netherlands)

    Medema, M.H.; Blin, K.; Cimermancic, P.; Jager, V.C.L. de; Zakrzewski, P.; Fischbach, M.A.; Weber, T.; Takano, E.; Breitling, R.

    2011-01-01

    Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide var

  16. antiSMASH : rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences

    NARCIS (Netherlands)

    Medema, Marnix H.; Blin, Kai; Cimermancic, Peter; de Jager, Victor; Zakrzewski, Piotr; Fischbach, Michael A.; Weber, Tilmann; Takano, Eriko; Breitling, Rainer

    2011-01-01

    Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide var

  17. Complete Genome Sequence of Japanese Erwinia Strain Ejp617, a Bacterial Shoot Blight Pathogen of Pear

    OpenAIRE

    Park, Duck Hwan; Thapa, Shree Prasad; Choi, Beom-Soon; Kim, Won-Sik; Hur, Jang Hyun; Cho, Jun Mo; Lim, Jong-Sung; Choi, Ik-Young; Lim, Chun Keun

    2010-01-01

    The Japanese Erwinia strain Ejp617 is a plant pathogen that causes bacterial shoot blight of pear in Japan. Here, we report the complete genome sequence of strain Ejp617 isolated from Nashi pears in Japan to provide further valuable insight among related Erwinia species.

  18. Complete Genome Sequence of Gluconacetobacter hansenii Strain NQ5 (ATCC 53582), an Efficient Producer of Bacterial Cellulose.

    Science.gov (United States)

    Pfeffer, Sarah; Mehta, Kalpa; Brown, R Malcolm

    2016-01-01

    This study reports the release of the complete nucleotide sequence of Gluconacetobacter hansenii strain NQ5 (ATCC 53582). This strain was isolated by R. Malcolm Brown, Jr. in a sugar mill in North Queensland, Australia, and is an efficient producer of bacterial cellulose. The elucidation of the genome will contribute to the study of the molecular mechanisms necessary for cellulose biosynthesis. PMID:27516505

  19. Complete Genome Sequence of Lactobacillus rhamnosus Strain BPL5 (CECT 8800), a Probiotic for Treatment of Bacterial Vaginosis.

    Science.gov (United States)

    Chenoll, Empar; Codoñer, Francisco M; Martinez-Blanch, Juan F; Ramón, Daniel; Genovés, Salvador; Menabrito, Marco

    2016-01-01

    Lactobacillus rhamnosus BPL5 (CECT 8800) is a probiotic strain suitable for the treatment of bacterial vaginosis. Here, we report its complete genome sequence deciphered by PacBio single-molecule real-time (SMRT) technology. Analysis of the sequence may provide insight into its functional activity. PMID:27103719

  20. Complete Genome Sequence of Lactobacillus rhamnosus Strain BPL5 (CECT 8800), a Probiotic for Treatment of Bacterial Vaginosis

    Science.gov (United States)

    Codoñer, Francisco M.; Martinez-Blanch, Juan F.; Ramón, Daniel; Menabrito, Marco

    2016-01-01

    Lactobacillus rhamnosus BPL5 (CECT 8800) is a probiotic strain suitable for the treatment of bacterial vaginosis. Here, we report its complete genome sequence deciphered by PacBio single-molecule real-time (SMRT) technology. Analysis of the sequence may provide insight into its functional activity. PMID:27103719

  1. SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals.

    Science.gov (United States)

    Nikolaichik, Yevgeny; Damienikan, Aliaksandr U

    2016-01-01

    The majority of bacterial genome annotations are currently automated and based on a 'gene by gene' approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn't fit with regulatory information allowed us to correct product and gene names for over 300 loci. PMID:27257541

  2. SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals

    Science.gov (United States)

    Damienikan, Aliaksandr U.

    2016-01-01

    The majority of bacterial genome annotations are currently automated and based on a ‘gene by gene’ approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn’t fit with regulatory information allowed us to correct product and gene names for over 300 loci. PMID:27257541

  3. Dynamic bacterial communities on reverse-osmosis membranes in a full-scale desalination plant.

    Science.gov (United States)

    Manes, C-L de O; West, N; Rapenne, S; Lebaron, P

    2011-01-01

    To better understand biofouling of seawater reverse osmosis (SWRO) membranes, bacterial diversity was characterized in the intake water, in subsequently pretreated water and on SWRO membranes from a full-scale desalination plant (FSDP) during a 9 month period. 16S rRNA gene fingerprinting and sequencing revealed that bacterial communities in the water samples and on the SWRO membranes were very different. For the different sampling dates, the bacterial diversity of the active and the total bacterial fractions of the water samples remained relatively stable over the sampling period whereas the bacterial community structure on the four SWRO membrane samples was significantly different. The richness and evenness of the SWRO membrane bacterial communities increased with usage time with an increase in the Shannon diversity index of 2.2 to 3.7. In the oldest SWRO membrane (330 days), no single operational taxonomic unit (OTU) dominated and the majority of the OTUs fell into the Alphaproteobacteria or the Planctomycetes. In striking contrast, a Betaproteobacteria OTU affiliated to the genus Ideonella was dominant and exclusively found in the membrane used for the shortest time (10 days). This suggests that bacteria belonging to this genus could be one of the primary colonizers of the SWRO membrane. Knowledge of the dominant bacterial species on SWRO membranes and their dynamics should help guide culture studies for physiological characterization of biofilm forming species. PMID:21108068
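
    The richness, evenness and Shannon diversity values reported in the abstract above are standard community summaries that can be computed from OTU counts. The sketch below assumes the natural-log form of the Shannon index and Pielou's evenness; the abstract does not state which log base or evenness measure was used, and the OTU counts are invented for illustration.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Pielou's evenness J = H / ln(S), where S is the number of observed OTUs."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0
    return shannon_index(counts) / math.log(richness)

# Hypothetical OTU count vectors: one dominated community, one even community.
dominated = [90, 4, 3, 2, 1]
even = [20, 20, 20, 20, 20]
print(round(shannon_index(dominated), 3), round(pielou_evenness(dominated), 3))
print(round(shannon_index(even), 3), round(pielou_evenness(even), 3))
```

    A community dominated by one OTU scores lower on both measures than a community in which no single OTU dominates, mirroring the contrast between the youngest and oldest membranes described in the abstract.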

  4. Distinct soil bacterial communities along a small-scale elevational gradient in alpine tundra

    Directory of Open Access Journals (Sweden)

    Congcong Shen

    2015-06-01

    The elevational diversity pattern for microorganisms has received great attention recently but is still understudied, and phylogenetic relatedness is rarely studied for microbial elevational distributions. Using a bar-coded pyrosequencing technique, we examined the biodiversity patterns for soil bacterial communities of the tundra ecosystem along 2000–2500 m elevations on Changbai Mountain in China. Bacterial taxonomic richness displayed a linear decreasing trend with increasing elevation. Phylogenetic diversity and mean nearest taxon distance (MNTD) exhibited a unimodal pattern with elevation. Bacterial communities were more phylogenetically clustered than expected by chance at all elevations based on the standardized effect size of the MNTD metric. The bacterial communities differed dramatically among elevations, and the community composition was significantly correlated with soil total carbon, total nitrogen, C:N ratio, and dissolved organic carbon. Multiple ordinary least squares regression analysis showed that the observed biodiversity patterns strongly correlated with soil total carbon and C:N ratio. Taken together, this is the first time that a significant bacterial diversity pattern has been observed across a small-scale elevational gradient. Our results indicated that soil carbon and nitrogen contents were the critical environmental factors affecting bacterial elevational distribution in Changbai Mountain tundra. This suggested that ecological niche-based environmental filtering processes related to soil carbon and nitrogen contents could play a dominant role in structuring bacterial communities along the elevational gradient.

  5. Optimization of Mutation Pressure in Relation to Properties of Protein-Coding Sequences in Bacterial Genomes.

    Directory of Open Access Journals (Sweden)

    Paweł Błażej

    Most mutations are deleterious and require energetically costly repairs. Therefore, it seems that any minimization of mutation rate is beneficial. On the other hand, mutations generate genetic diversity indispensable for evolution and adaptation of organisms to changing environmental conditions. Thus, it is expected that a spontaneous mutational pressure should be an optimal compromise between these two extremes. In order to study the optimization of the pressure, we compared mutational transition probability matrices from bacterial genomes with artificial matrices fulfilling the same general features as the real ones, e.g., the stationary distribution and the speed of convergence to stationarity. The artificial matrices were optimized on real protein-coding sequences based on an Evolutionary Strategies approach to minimize or maximize the probability of non-synonymous substitutions and costs of amino acid replacements depending on their physicochemical properties. The results show that the empirical matrices have a tendency to minimize the effects of mutations rather than maximize their costs on the amino acid level. They were also similar to the optimized artificial matrices in the nucleotide substitution pattern, especially the high transitions/transversions ratio. We observed no substantial differences between the effects of mutational matrices on protein-coding sequences in genomes under study in respect of differently replicated DNA strands, mutational cost types and properties of the referenced artificial matrices. The findings indicate that the empirical mutational matrices are rather adapted to minimize mutational costs in the studied organisms in comparison to other matrices with similar mathematical constraints.
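
    Two of the matrix properties named in the abstract above, the stationary distribution and the transitions/transversions ratio, can be computed from a 4x4 nucleotide substitution probability matrix as sketched below. The matrix is a hypothetical Kimura-style example (transition probability 0.02, each transversion 0.005), not one of the empirical matrices from the study, and the function names are assumptions.

```python
BASES = "ACGT"
PURINES = {"A", "G"}

def stationary_distribution(P, iterations=10000, tol=1e-12):
    """Approximate pi with pi = pi * P by repeated multiplication (power iteration)."""
    pi = [0.25] * 4
    for _ in range(iterations):
        nxt = [sum(pi[i] * P[i][j] for i in range(4)) for j in range(4)]
        if max(abs(x - y) for x, y in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi

def ti_tv_ratio(P, pi):
    """Ratio of expected transition flux to transversion flux at stationarity."""
    ti = tv = 0.0
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i == j:
                continue
            flux = pi[i] * P[i][j]
            # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
            if (x in PURINES) == (y in PURINES):
                ti += flux
            else:
                tv += flux
    return ti / tv

def kimura_like(transition, transversion):
    """Hypothetical symmetric matrix with one transition and two transversion targets."""
    P = []
    for x in BASES:
        row = []
        for y in BASES:
            if x == y:
                row.append(1.0 - transition - 2.0 * transversion)
            elif (x in PURINES) == (y in PURINES):
                row.append(transition)
            else:
                row.append(transversion)
        P.append(row)
    return P

P = kimura_like(0.02, 0.005)
pi = stationary_distribution(P)
print([round(p, 3) for p in pi], round(ti_tv_ratio(P, pi), 3))
```

    For an empirical matrix the same two functions would apply unchanged; only the construction of P would differ.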

  6. Repetitive genome elements in a European corn borer, Ostrinia nubilalis, bacterial artificial chromosome library were indicated by bacterial artificial chromosome end sequencing and development of sequence tag site markers: implications for lepidopteran genomic research.

    Science.gov (United States)

    Coates, Brad S; Sumerford, Douglas V; Hellmich, Richard L; Lewis, Leslie C

    2009-01-01

    The European corn borer, Ostrinia nubilalis, is a serious pest of food, fiber, and biofuel crops in Europe, North America, and Asia and a model system for insect olfaction and speciation. A bacterial artificial chromosome library constructed for O. nubilalis contains 36 864 clones with an estimated average insert size of ≥120 kb and genome coverage of 8.8-fold. Screening OnB1 clones comprising approximately 2.76 genome equivalents determined the physical position of 24 sequence tag site markers, including markers linked to ecologically important and Bacillus thuringiensis toxin resistance traits. OnB1 bacterial artificial chromosome end sequence reads (GenBank dbGSS accessions ET217010 to ET217273) showed homology to annotated genes or expressed sequence tags and identified repetitive genome elements, O. nubilalis miniature subterminal inverted repeat transposable elements (OnMITE01 and OnMITE02), and ezi-like long interspersed nuclear elements. Mobility of OnMITE01 was demonstrated by the presence or absence in O. nubilalis of introns at two different loci. A (GTCT)n tetranucleotide repeat at the 5' ends of OnMITE01 and OnMITE02 is evidence for transposon-mediated movement of lepidopteran microsatellite loci. The number of repetitive elements in lepidopteran genomes will affect genome assembly and marker development. Single-locus sequence tag site markers described here have downstream application for integration within linkage maps and comparative genomic studies. PMID:19132072
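
    The library figures quoted in the abstract above are linked by the usual coverage relation, coverage = clones x insert size / genome size. The sketch below back-calculates the genome size implied by the quoted numbers and estimates the chance that a given locus is represented when screening at a given coverage. The genome size is not stated in the abstract; it is derived here by treating the 120 kb lower bound as the exact average insert size, which is an assumption, and the Poisson approximation used for the screening probability is likewise an assumption.

```python
import math

def implied_genome_kb(n_clones, insert_kb, coverage):
    """Genome size implied by coverage = n_clones * insert_kb / genome_kb."""
    return n_clones * insert_kb / coverage

def prob_locus_represented(coverage):
    """Poisson approximation: chance that at least one clone carries a given locus."""
    return 1.0 - math.exp(-coverage)

# Figures quoted in the abstract; 120 kb is a lower bound treated as exact here.
genome_kb = implied_genome_kb(36864, 120, 8.8)
print(round(genome_kb / 1000, 1))                   # implied genome size in Mb, roughly 503
print(round(prob_locus_represented(2.76), 3))       # screening 2.76 genome equivalents
print(round(prob_locus_represented(8.8), 5))        # whole library at 8.8-fold
```

    Because the insert size is reported only as a lower bound, the back-calculated genome size should be read as a rough consistency check on the quoted numbers rather than as a measurement.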

  7. Analysis of Aspergillus nidulans metabolism at the genome-scale

    Directory of Open Access Journals (Sweden)

    Nielsen Jens

    2008-04-01

    Background: Aspergillus nidulans is a member of a diverse group of filamentous fungi, sharing many of the properties of its close relatives with significance in the fields of medicine, agriculture and industry. Furthermore, A. nidulans has been a classical model organism for studies of developmental biology and gene regulation, and thus it has become one of the best-characterized filamentous fungi. It was the first Aspergillus species to have its genome sequenced, and automated gene prediction tools predicted 9,451 open reading frames (ORFs) in the genome, of which less than 10% were assigned a function. Results: In this work, we have manually assigned functions to 472 orphan genes in the metabolism of A. nidulans, by using a pathway-driven approach and by employing comparative genomics tools based on sequence similarity. The central metabolism of A. nidulans, as well as biosynthetic pathways of relevant secondary metabolites, was reconstructed based on detailed metabolic reconstructions available for A. niger and Saccharomyces cerevisiae, and information on the genetics, biochemistry and physiology of A. nidulans. Thereby, it was possible to identify metabolic functions without a gene associated, and to look for candidate ORFs in the genome of A. nidulans by comparing its sequence to sequences of well-characterized genes in other species encoding the function of interest. A classification system, based on defined criteria, was developed for evaluating and selecting the ORFs among the candidates, in an objective and systematic manner. The functional assignments served as a basis to develop a mathematical model, linking 666 genes (both previously and newly annotated) to metabolic roles. The model was used to simulate metabolic behavior and additionally to integrate, analyze and interpret large-scale gene expression data concerning a study on glucose repression, thereby providing a means of upgrading the information content of experimental data

  8. Bacterial Artificial Chromosomes: A Functional Genomics Tool for the Study of Positive-strand RNA Viruses.

    Science.gov (United States)

    Yun, Sang-Im; Song, Byung-Hak; Kim, Jin-Kyoung; Lee, Young-Min

    2015-01-01

    Reverse genetics, an approach to rescue infectious virus entirely from a cloned cDNA, has revolutionized the field of positive-strand RNA viruses, whose genomes have the same polarity as cellular mRNA. The cDNA-based reverse genetics system is a seminal method that enables direct manipulation of the viral genomic RNA, thereby generating recombinant viruses for molecular and genetic studies of both viral RNA elements and gene products in viral replication and pathogenesis. It also provides a valuable platform that allows the development of genetically defined vaccines and viral vectors for the delivery of foreign genes. For many positive-strand RNA viruses such as Japanese encephalitis virus (JEV), however, the cloned cDNAs are unstable, posing a major obstacle to the construction and propagation of the functional cDNA. Here, the present report describes the strategic considerations in creating and amplifying a genetically stable full-length infectious JEV cDNA as a bacterial artificial chromosome (BAC) using the following general experimental procedures: viral RNA isolation, cDNA synthesis, cDNA subcloning and modification, assembly of a full-length cDNA, cDNA linearization, in vitro RNA synthesis, and virus recovery. This protocol provides a general methodology applicable to cloning full-length cDNA for a range of positive-strand RNA viruses, particularly those with a genome of >10 kb in length, into a BAC vector, from which infectious RNAs can be transcribed in vitro with a bacteriophage RNA polymerase. PMID:26780115

  9. LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Task 1.4.2 Report

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T; Borucki, M; Lam, M; Lenhoff, R; Vitalis, E

    2010-01-26

    Good progress has been made on both bacterial and viral sequencing by the TMTI centers. While access to appropriate samples is a limiting factor to throughput, excellent progress has been made with respect to getting agreements in place with key sources of relevant materials. Sharing of sequenced genomes funded by TMTI has been extremely limited to date. The April 2010 exercise should force a resolution to this, but additional managerial pressures may be needed to ensure that rapid sharing of TMTI-funded sequencing occurs, regardless of collaborator constraints concerning ultimate publication(s). Policies to permit TMTI-internal rapid sharing of sequenced genomes should be written into all TMTI agreements with collaborators now being negotiated. TMTI needs to establish a Web-based system for tracking samples destined for sequencing. This includes metadata on sample origins and contributor, information on sample shipment/receipt, prioritization by TMTI, assignment to one or more sequencing centers (including possible TMTI-sponsored sequencing at a contributor site), and status history of the sample sequencing effort. While this system could be a component of the AFRL system, it is not part of any current development effort. Policy and standardized procedures are needed to ensure appropriate verification of all TMTI samples prior to the investment in sequencing. PCR, arrays, and classical biochemical tests are examples of potential verification methods. Verification is needed to detect mislabeled, degraded, mixed or contaminated samples. Regular QC exercises are needed to ensure that the TMTI-funded centers are meeting all standards for producing quality genomic sequence data.

  10. Genome sequencing and systems biology analysis of a lipase-producing bacterial strain.

    Science.gov (United States)

    Li, N; Li, D D; Zhang, Y Z; Yuan, Y Z; Geng, H; Xiong, L; Liu, D L

    2016-01-01

    Lipase-producing bacteria are naturally-occurring, industrially-relevant microorganisms that produce lipases, which can be used to synthesize biodiesel from waste oils. The efficiency of lipase expression varies between various microbial strains. Therefore, strains that can produce lipases with high efficiency must be screened, and the conditions of lipase metabolism and optimization of the production process in a given environment must be thoroughly studied. A high efficiency lipase-producing strain was isolated from the sediments of Jinsha River, identified by 16S rRNA sequence analysis as Serratia marcescens, and designated as HS-L5. A schematic diagram of the genome sequence was constructed by high-throughput genome sequencing. A series of genes related to lipid degradation were identified by functional gene annotation through sequence homology analysis. A genome-scale metabolic model of HS-ML5 was constructed using systems biology techniques. The model consisted of 1722 genes and 1567 metabolic reactions. The topological graph of the genome-scale metabolic model was compared to that of conventional metabolic pathways using a visualization software and KEGG database. The basic components and boundaries of the tributyrin degradation subnetwork were determined, and its flux balance analyzed using Matlab and COBRA Toolbox to simulate the effects of different conditions on the catalytic efficiency of lipases produced by HS-ML5. We proved that the catalytic activity of microbial lipases was closely related to the carbon metabolic pathway. As production and catalytic efficiency of lipases varied greatly with the environment, the catalytic efficiency and environmental adaptability of microbial lipases can be improved by proper control of the production conditions. PMID:27050954

  11. Bacterial Societies: Cooperation, Colonization, and Competition in Micro-Scale Ecosystems

    NARCIS (Netherlands)

    Hol, F.J.H.

    2014-01-01

    In this thesis, I describe experiments aimed at understanding bacterial population dynamics in ecosystems that are spatially structured at the micro-scale. We combine microfabrication and microfluidics to create synthetic ecosystems that have a complex yet well-defined geometry and chemical composition...

  12. Bacterial community structure of a full-scale biofilter treating pig house exhaust air

    DEFF Research Database (Denmark)

    Kristiansen, Anja; Pedersen, Kristina Hadulla; Nielsen, Per Halkjær; Nielsen, Lars Peter; Nielsen, Jeppe Lund; Schramm, Andreas

    2011-01-01

    Biological air filters represent a promising tool for treating emissions of ammonia and odor from pig facilities. Quantitative fluorescence in situ hybridization (FISH) and 16S rRNA gene sequencing were used to investigate the bacterial community structure and diversity in a full-scale biofilter...

  13. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    International Nuclear Information System (INIS)

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  14. AlgaGEM – a genome-scale metabolic reconstruction of algae based on the Chlamydomonas reinhardtii genome

    OpenAIRE

    2011-01-01

    Background Microalgae have the potential to deliver biofuels without the associated competition for land resources. In order to realise the rates and titres necessary for commercial production, however, system-level metabolic engineering will be required. Genome scale metabolic reconstructions have revolutionized microbial metabolic engineering and are used routinely for in silico analysis and design. While genome scale metabolic reconstructions have been developed for many prokaryotes and mo...

  15. Strategies used for genetically modifying bacterial genome: site-directed mutagenesis, gene inactivation, and gene over-expression

    Science.gov (United States)

    Xu, Jian-zhong; Zhang, Wei-guo

    2016-01-01

    With the availability of the whole genome sequence of Escherichia coli or Corynebacterium glutamicum, strategies for directed DNA manipulation have developed rapidly. DNA manipulation plays an important role in understanding the function of genes and in constructing novel engineering bacteria according to requirement. DNA manipulation involves modifying the autologous genes and expressing the heterogenous genes. Two alternative approaches, using electroporation linear DNA or recombinant suicide plasmid, allow a wide variety of DNA manipulation. However, the over-expression of the desired gene is generally executed via plasmid-mediation. The current review summarizes the common strategies used for genetically modifying E. coli and C. glutamicum genomes, and discusses the technical problem of multi-layered DNA manipulation. Strategies for gene over-expression via integrating into genome are proposed. This review is intended to be an accessible introduction to DNA manipulation within the bacterial genome for novices and a source of the latest experimental information for experienced investigators. PMID:26834010

  16. Strategies used for genetically modifying bacterial genome: site-directed mutagenesis, gene inactivation, and gene over-expression.

    Science.gov (United States)

    Xu, Jian-zhong; Zhang, Wei-guo

    2016-02-01

    With the availability of the whole genome sequence of Escherichia coli or Corynebacterium glutamicum, strategies for directed DNA manipulation have developed rapidly. DNA manipulation plays an important role in understanding the function of genes and in constructing novel engineering bacteria according to requirement. DNA manipulation involves modifying the autologous genes and expressing the heterogenous genes. Two alternative approaches, using electroporation linear DNA or recombinant suicide plasmid, allow a wide variety of DNA manipulation. However, the over-expression of the desired gene is generally executed via plasmid-mediation. The current review summarizes the common strategies used for genetically modifying E. coli and C. glutamicum genomes, and discusses the technical problem of multi-layered DNA manipulation. Strategies for gene over-expression via integrating into genome are proposed. This review is intended to be an accessible introduction to DNA manipulation within the bacterial genome for novices and a source of the latest experimental information for experienced investigators. PMID:26834010

  17. Rapid pair-wise synteny analysis of large bacterial genomes using web-based GeneOrder4.0

    Directory of Open Access Journals (Sweden)

    Mahadevan Padmanabhan

    2010-02-01

    Background: The growing whole genome sequence databases necessitate the development of user-friendly software tools to mine these data. Web-based tools are particularly useful to wet-bench biologists as they enable platform-independent analysis of sequence data, without having to perform complex programming tasks and software compiling. Findings: GeneOrder4.0 is a web-based "on-the-fly" synteny and gene order analysis tool for comparative bacterial genomics (ca. 8 Mb). It enables the visualization of synteny by plotting protein similarity scores between two genomes and it also provides visual annotation of "hypothetical" proteins from older archived genomes based on more recent annotations. Conclusions: The web-based software tool GeneOrder4.0 is a user-friendly application that has been updated to allow the rapid analysis of synteny and gene order in large bacterial genomes. It is developed with the wet-bench researcher in mind.

  18. Large-Scale Engineering of the Corynebacterium glutamicum Genome

    OpenAIRE

    Suzuki, Nobuaki; Okayama, Satoshi; Nonaka, Hiroshi; Tsuge, Yota; Inui, Masayuki; Yukawa, Hideaki

    2005-01-01

    The engineering of Corynebacterium glutamicum is important for enhanced production of biochemicals. To construct an improved C. glutamicum genome, we developed a precise genome excision method based on the Cre/loxP recombination system and successfully deleted 11 distinct genomic regions identified by comparative analysis of C. glutamicum genomes. Despite the loss of several predicted open reading frames, the mutant cells exhibited normal growth under standard laboratory conditions. With a to...

  19. Analysis of herpesvirus host specificity determinants using herpesvirus genomes as bacterial artificial chromosomes.

    Science.gov (United States)

    Arii, Jun; Kato, Kentaro; Kawaguchi, Yasushi; Tohya, Yukinobu; Akashi, Hiroomi

    2009-08-01

    Almost all mammalian alphaherpesviruses can grow in cells derived from several types of animals in vitro. However, FHV-1 can only infect feline cell lines. For this reason, FHV-1 should be a good model to investigate species barriers to herpesviruses in vivo. To apply bacterial mutagenesis of FHV-1, we cloned the FHV-1 genome as a BAC. Using lambda and flp recombinations, we introduced a monomeric red fluorescence protein into the C-terminus of glycoprotein D. Although GFP in the constructed recombinant FHV-1, a transfectant of the bacmid of FHV-1 that possessed the GFP, acted in non-feline cell lines, the virus could not enter non-feline cell lines, demonstrating that the host specificity of FHV-1 was restricted in an early step of infection. The host range of canine herpesvirus is limited to dogs in vitro and in vivo; it cannot enter non-canine cell lines as a result of infection but the GFP is active by transfection, revealing the same result that the restriction step is at an early stage of infection. These results suggest the possibility of breaking species barriers of FHV-1 and CHV by modifying the gene(s) that act at the early stage of infection. PMID:19659927

  20. Genome-scale reconstruction of the sigma factor network in Escherichia coli: topology and functional states

    DEFF Research Database (Denmark)

    Cho, Byung-Kwan; Kim, Donghyuk; Knight, Eric M.; Zengler, Karsten; Palsson, Bernhard

    2014-01-01

    promoters, we do not yet have a genome-scale assessment of their function. Results: Using multiple genome-scale measurements, we elucidated the network of σ-factor and promoter interactions in Escherichia coli. The reconstructed network includes 4,724 sigma-factor-specific promoters corresponding to...

  1. Construction of a nurse shark (Ginglymostoma cirratum) bacterial artificial chromosome (BAC) library and a preliminary genome survey

    Directory of Open Access Journals (Sweden)

    Inoko Hidetoshi

    2006-05-01

    Background: Sharks are members of the taxonomic class Chondrichthyes, the oldest living jawed vertebrates. Genomic studies of this group, in comparison to representative species in other vertebrate taxa, will allow us to theorize about the fundamental genetic, developmental, and functional characteristics in the common ancestor of all jawed vertebrates. Aims: In order to obtain mapping and sequencing data for comparative genomics, we constructed a bacterial artificial chromosome (BAC) library for the nurse shark, Ginglymostoma cirratum. Results: The BAC library consists of 313,344 clones with an average insert size of 144 kb, covering ~4.5 × 10^10 bp and thus providing an 11-fold coverage of the haploid genome. BAC end sequence analyses revealed, in addition to LINEs and SINEs commonly found in other animal and plant genomes, two new groups of nurse shark-specific repetitive elements, NSRE1 and NSRE2, that seem to be major components of the nurse shark genome. Screening the library with single-copy or multi-copy gene probes showed 6–28 primary positive clones per probe of which 50–90% were true positives, demonstrating that the BAC library is representative of the different regions of the nurse shark genome. Furthermore, some BAC clones contained multiple genes, making physical mapping feasible. Conclusion: We have constructed a deep-coverage, high-quality, large insert, and publicly available BAC library for a cartilaginous fish. It will be very useful to the scientific community interested in shark genomic structure, comparative genomics, and functional studies. We found two new groups of repetitive elements specific to the nurse shark genome, which may contribute to the architecture and evolution of the nurse shark genome.
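    The library figures quoted above can be checked with simple arithmetic. The sketch below multiplies the clone count by the average insert size and then divides by the reported fold coverage. The resulting haploid genome size is only what those three numbers imply; it is not a value stated in the abstract, and the function names are illustrative.

```python
# Figures quoted in the abstract above.
clones = 313_344
mean_insert_bp = 144_000   # 144 kb average insert size
fold_coverage = 11         # reported coverage of the haploid genome


def library_size_bp(n_clones: int, insert_bp: int) -> int:
    """Total cloned DNA: number of clones times average insert size."""
    return n_clones * insert_bp


def implied_genome_bp(total_bp: int, coverage: float) -> float:
    """Haploid genome size implied by total cloned DNA and fold coverage."""
    return total_bp / coverage


total_bp = library_size_bp(clones, mean_insert_bp)
genome_bp = implied_genome_bp(total_bp, fold_coverage)
print(f"library: {total_bp:.2e} bp, implied haploid genome: {genome_bp:.2e} bp")
```

    The product comes out at about 4.5 × 10^10 bp, which agrees with the corrected exponent in the abstract.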

  2. Complete Genome Sequence of Cell Culture-Attenuated Guinea Pig Cytomegalovirus Cloned as an Infectious Bacterial Artificial Chromosome

    OpenAIRE

    Yang, Dongmei; Alam, Zohaib; Cui, Xiaohong; Chen, Michael; Sherrod, Carly J.; McVoy, Michael A.; Schleiss, Mark R.; Dittmer, Dirk P

    2014-01-01

    The complete genome sequence of attenuated guinea pig cytomegalovirus cloned as bacterial artificial chromosome N13R10 was determined. Comparison to pathogenic salivary gland-derived virus revealed 13 differences, 1 of which disrupted overlapping open reading frames encoding GP129 and GP130. Attenuation of N13R10 may arise from an inability to express GP129 and/or GP130.

  3. Metabolite coupling in genome-scale metabolic networks

    Directory of Open Access Journals (Sweden)

    Palsson Bernhard Ø

    2006-03-01

    Background: Biochemically detailed stoichiometric matrices have now been reconstructed for various bacteria, yeast, and for the human cardiac mitochondrion based on genomic and proteomic data. These networks have been manually curated based on legacy data and elementally and charge balanced. Comparative analysis of these well curated networks is now possible. Pairs of metabolites often appear together in several network reactions, linking them topologically. This co-occurrence of pairs of metabolites in metabolic reactions is termed herein "metabolite coupling." These metabolite pairs can be directly computed from the stoichiometric matrix, S. Metabolite coupling is derived from the matrix ŜŜ^T, whose off-diagonal elements indicate the number of reactions in which any two metabolites participate together, where Ŝ is the binary form of S. Results: Metabolite coupling in the studied networks was found to be dominated by a relatively small group of highly interacting pairs of metabolites. As would be expected, metabolites with high individual metabolite connectivity also tended to be those with the highest metabolite coupling, as the most connected metabolites couple more often. For metabolite pairs that are not highly coupled, we show that the number of reactions a pair of metabolites shares across a metabolic network closely approximates a line on a log-log scale. We also show that the preferential coupling of two metabolites with each other is spread across the spectrum of metabolites and is not unique to the most connected metabolites. We provide a measure for determining which metabolite pairs couple more often than would be expected based on their individual connectivity in the network and show that these metabolites often derive their principal biological functions from existing in pairs. Thus, analysis of metabolite coupling provides information beyond that which is found from studying the individual connectivity of individual
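    The coupling matrix described in this abstract can be sketched in a few lines. The example below binarizes a stoichiometric matrix S (rows are metabolites, columns are reactions) and multiplies the binary matrix by its transpose. The four-metabolite, three-reaction network is a made-up toy, not data from the paper, and the function names are illustrative.

```python
def binarize(S):
    """Return the binary form of S: 1 where a metabolite takes part in a reaction."""
    return [[1 if coeff != 0 else 0 for coeff in row] for row in S]


def coupling_matrix(S):
    """Multiply the binary form of S by its transpose (metabolites x metabolites)."""
    B = binarize(S)
    n = len(B)
    return [
        [sum(a * b for a, b in zip(B[i], B[j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy network: rows are metabolites, columns are reactions.
metabolites = ["ATP", "ADP", "glucose", "G6P"]
S = [
    [-1, -1, 0],   # ATP is consumed in reactions 1 and 2
    [1, 1, 0],     # ADP is produced in reactions 1 and 2
    [-1, 0, 0],    # glucose is consumed in reaction 1
    [1, 0, -1],    # G6P is produced in reaction 1, consumed in reaction 3
]

C = coupling_matrix(S)
for i, name_i in enumerate(metabolites):
    for j in range(i + 1, len(metabolites)):
        print(f"{name_i} / {metabolites[j]}: {C[i][j]} shared reaction(s)")
```

    By construction the result is symmetric, and each diagonal entry counts the reactions in which that single metabolite participates, so individual connectivity and pairwise coupling can be read from the same matrix.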

  4. Genome sequence and plasmid transformation of the model high-yield bacterial cellulose producer Gluconacetobacter hansenii ATCC 53582

    Science.gov (United States)

    Florea, Michael; Reeve, Benjamin; Abbott, James; Freemont, Paul S.; Ellis, Tom

    2016-03-01

    Bacterial cellulose is a strong, highly pure form of cellulose that is used in a range of applications in industry, consumer goods and medicine. Gluconacetobacter hansenii ATCC 53582 is one of the highest reported bacterial cellulose producing strains and has been used as a model organism in numerous studies of bacterial cellulose production and studies aiming to increase cellulose productivity. Here we present a high-quality draft genome sequence for G. hansenii ATCC 53582 and find that in addition to the previously described cellulose synthase operon, ATCC 53582 contains two additional cellulose synthase operons and several previously undescribed genes associated with cellulose production. In parallel, we also develop optimized protocols and identify plasmid backbones suitable for transformation of ATCC 53582, albeit with low efficiencies. Together, these results provide important information for further studies into cellulose synthesis and for future studies aiming to genetically engineer G. hansenii ATCC 53582 for increased cellulose productivity.

  5. Temporal scaling of bacterial taxa is influenced by both stochastic and deterministic ecological factors.

    Science.gov (United States)

    van der Gast, Christopher J; Ager, Duane; Lilley, Andrew K

    2008-06-01

    Microorganisms operate at a range of spatial and temporal scales acting as key drivers of ecosystem properties. Therefore, many key questions in microbial ecology require the consideration of both spatial and temporal scales. Spatial scaling, in particular the species-area relationship (SAR), has a long history in ecology and has recently been addressed in microbial ecology. However, the temporal analogue of the SAR, the species-time relationship, has received far less attention even in the science of general ecology. Here we focus upon the role of temporal scaling in microbial ecological patterns by coupling molecular characterization of bacterial communities in discrete island (bioreactor) systems with a macroecological approach. Our findings showed that the temporal scaling exponent (slope), and therefore taxa turnover of the bacterial taxa-time relationship decreased as selective pressure (industrial wastewater concentration) increased. Also, as the concentration of industrial wastewater increased across the bioreactors, we observed a gradual switch from stochastic community assembly to more deterministic (niche)-based considerations. The identification of broad-scale statistical patterns is particularly relevant to microbial ecology, as it is frequently difficult to identify individual species or their functions. In this study, we identify wide-reaching statistical patterns of diversity and show that they are shaped by the prevalent underlying ecological factors. PMID:18205822
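    The "temporal scaling exponent (slope)" mentioned above can be illustrated with a log-log fit. The sketch assumes the commonly used power-law form of the taxa-time relationship, S = c·T^w, which the abstract itself does not spell out, and estimates w by ordinary least squares on log-transformed values. The observations are synthetic, generated from a known exponent so the fit can be checked; they are not data from the study.

```python
import math


def scaling_exponent(times, taxa_counts):
    """Least-squares slope and intercept of log10(taxa) against log10(time)."""
    xs = [math.log10(t) for t in times]
    ys = [math.log10(s) for s in taxa_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic observations generated from S = 5 * T**0.3, purely for illustration.
times = [1, 2, 4, 8, 16, 32]
taxa = [5 * t ** 0.3 for t in times]

w, log_c = scaling_exponent(times, taxa)
print(f"w = {w:.3f}, c = {10 ** log_c:.3f}")
```

    Under this assumed form, a smaller w corresponds to slower accumulation of new taxa over time, which is the sense in which the abstract links a decreasing slope to lower taxa turnover.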

  6. Bacterial Societies: Cooperation, Colonization, and Competition in Micro-Scale Ecosystems

    OpenAIRE

    Hol, F.J.H.

    2014-01-01

    In this thesis, I describe experiments aimed at understanding bacterial population dynamics in ecosystems that are spatially structured at the micro-scale. We combine microfabrication and microfluidics to create synthetic ecosystems that have a complex yet well-defined geometry and chemical composition. Bacteria that inhabit such ecosystems can be observed at high spatiotemporal resolution using fluorescence microscopy. Using this experimental approach we have gained deeper insight into diver...

  7. Spatial scales of bacterial community diversity at cold seeps (Eastern Mediterranean Sea)

    OpenAIRE

    Pop Ristova, Petra; Wenzhöfer, Frank; Ramette, Alban; Felden, Janine; Boetius, Antje

    2014-01-01

    Cold seeps are highly productive, fragmented marine ecosystems that form at the seafloor around hydrocarbon emission pathways. The products of microbial utilization of methane and other hydrocarbons fuel rich chemosynthetic communities at these sites, with much higher respiration rates compared with the surrounding deep-sea floor. Yet little is known as to the richness, composition and spatial scaling of bacterial communities of cold seeps compared with non-seep communities. Here we assessed ...

  8. Explaining microbial phenotypes on a genomic scale: GWAS for microbes

    OpenAIRE

    Dutilh, Bas E; Backus, Lennart; Edwards, Robert A.; Wels, Michiel; Bayjanov, Jumamurat R.; van Hijum, Sacha A. F. T.

    2013-01-01

    There is an increasing availability of complete or draft genome sequences for microbial organisms. These data form a potentially valuable resource for genotype–phenotype association and gene function prediction, provided that phenotypes are consistently annotated for all the sequenced strains. In this review, we address the requirements for successful gene-trait matching. We outline a basic protocol for microbial functional genomics, including genome assembly, annotation of genotypes (includi...

  9. Large-scale prokaryotic gene prediction and comparison to genome annotation

    DEFF Research Database (Denmark)

    Nielsen, Pernille; Krogh, Anders Stærmose

    2005-01-01

    Motivation: Prokaryotic genomes are sequenced and annotated at an increasing rate. The methods of annotation vary between sequencing groups. It makes genome comparison difficult and may lead to propagation of errors when questionable assignments are adapted from one genome to another. Genome comparison either on a large or small scale would be facilitated by using a single standard for annotation, which incorporates a transparency of why an open reading frame (ORF) is considered to be a gene. Results: A total of 143 prokaryotic genomes were scored with an updated version of the prokaryotic...

  10. Comparative genomics of non-pseudomonal bacterial species colonising paediatric cystic fibrosis patients.

    Science.gov (United States)

    Ormerod, Kate L; George, Narelle M; Fraser, James A; Wainwright, Claire; Hugenholtz, Philip

    2015-01-01

    The genetic disorder cystic fibrosis is a life-limiting condition affecting ∼70,000 people worldwide. Targeted, early, treatment of the dominant infecting species, Pseudomonas aeruginosa, has improved patient outcomes; however, there is concern that other species are now stepping in to take its place. In addition, the necessarily long-term antibiotic therapy received by these patients may be providing a suitable environment for the emergence of antibiotic resistance. To investigate these issues, we employed whole-genome sequencing of 28 non-Pseudomonas bacterial strains isolated from three paediatric patients. We did not find any trend of increasing antibiotic resistance (either by mutation or lateral gene transfer) in these isolates in comparison with other examples of the same species. In addition, each isolate contained a virulence gene repertoire that was similar to other examples of the relevant species. These results support the impaired clearance of the CF lung not demanding extensive virulence for survival in this habitat. By analysing serial isolates of the same species we uncovered several examples of strain persistence. The same strain of Staphylococcus aureus persisted for nearly a year, despite administration of antibiotics to which it was shown to be sensitive. This is consistent with previous studies showing antibiotic therapy to be inadequate in cystic fibrosis patients, which may also explain the lack of increasing antibiotic resistance over time. Serial isolates of two naturally multi-drug resistant organisms, Achromobacter xylosoxidans and Stenotrophomonas maltophilia, revealed that while all S. maltophilia strains were unique, A. xylosoxidans persisted for nearly five years, making this a species of particular concern. The data generated by this study will assist in developing an understanding of the non-Pseudomonas species associated with cystic fibrosis. PMID:26401445

  11. Changes in bacterial community structure in a full-scale membrane bioreactor for municipal wastewater treatment.

    Science.gov (United States)

    Hashimoto, Kurumi; Tsutsui, Hirofumi; Takada, Kazuki; Hamada, Hiroshi; Sakai, Kousuke; Inoue, Daisuke; Sei, Kazunari; Soda, Satoshi; Yamashita, Kyoko; Tsuji, Koji; Hashimoto, Toshikazu; Ike, Michihiko

    2016-07-01

    This study investigated changes in the structure and metabolic capabilities of the bacterial community in a full-scale membrane bioreactor (MBR) treating municipal wastewater. Microbial monitoring was also conducted for a parallel-running conventional activated sludge (CAS) process treating the same influent. The mixed-liquor suspended solid concentration in the MBR reached a steady-state on day 73 after the start-up. Then the MBR maintained higher rates of removal of organic compounds and nitrogen than the CAS process did. Terminal restriction fragment length polymorphism analysis revealed that the bacterial community structure in the MBR was similar to that in the CAS process at the start-up, but it became very different from that in the CAS process in the steady state. The bacterial community structure of the MBR continued to change dynamically even after 20 months of the steady-state operation, while that of the CAS process was maintained in a stable condition. By contrast, Biolog assay revealed that the carbon source utilization potential of the MBR resembled that of the CAS process as a whole, although it declined transiently. Overall, the results indicate that the bacterial community of the MBR has flexibility in terms of its phylogenetic structure and metabolic activity to maintain the high wastewater treatment capability. PMID:26811223

  12. Phylogenetic Relationships of 3/3 and 2/2 Hemoglobins in Archaeplastida Genomes to Bacterial and Other Eukaryote Hemoglobins

    Institute of Scientific and Technical Information of China (English)

    Serge N. Vinogradov; Iván Fernández; David Hoogewijs; Raúl Arredondo-Peter

    2011-01-01

    Land plants and algae form a supergroup, the Archaeplastida, believed to be monophyletic. We report the results of an analysis of the phylogeny of putative globins in the currently available genomes to bacterial and other eukaryote hemoglobins (Hbs). Archaeplastida genomes have 3/3 and 2/2 Hbs, with the land plant genomes having group 2 2/2 Hbs, except for the unexpected occurrence of two group 1 2/2 Hbs in Ricinus communis. Bayesian analysis shows that plant 3/3 Hbs are related to vertebrate neuroglobins and bacterial flavohemoglobins (FHbs). We sought to define the bacterial groups, whose ancestors shared the precursors of Archaeplastida Hbs, via Bayesian and neighbor-joining analyses based on COBALT alignment of representative sets of bacterial 3/3 FHb-like globins and group 1 and 2 2/2 Hbs with the corresponding Archaeplastida Hbs. The results suggest that the Archaeplastida 3/3 and group 1 2/2 Hbs could have originated from the horizontal gene transfers (HGTs) that accompanied the two generally accepted endosymbioses of a proteobacterium and a cyanobacterium with a eukaryote ancestor. In contrast, the origin of the group 2 2/2 Hbs unexpectedly appears to involve HGT from a bacterium ancestral to Chloroflexi, Deinococcales, Bacilli, and Actinomycetes. Furthermore, although intron positions and phases are mostly conserved among the land plant 3/3 and 2/2 globin genes, introns are absent in the algal 3/3 genes and intron positions and phases are highly variable in their 2/2 genes. Thus, introns are irrelevant to globin evolution in Archaeplastida.

  13. Comparative Genomics Analysis and Phenotypic Characterization of Shewanella putrefaciens W3-18-1: Anaerobic Respiration, Bacterial Microcompartments, and Lateral Flagella

    International Nuclear Information System (INIS)

    Respiratory versatility and psychrophily are the hallmarks of Shewanella. The ability to utilize a wide range of electron acceptors for respiration is due to the large number of c-type cytochrome genes present in the genome of Shewanella strains. More recently the dissimilatory metal reduction of Shewanella species has been extensively and intensively studied for potential applications in the bioremediation of radioactive wastes of groundwater and subsurface environments. Multiple Shewanella genome sequences are now available in the public databases (Fredrickson et al., 2008). Most of the sequenced Shewanella strains were isolated from marine environments and this genus was believed to be of marine origin (Hau and Gralnick, 2007). However, the well-characterized model strain, S. oneidensis MR-1, was isolated from the freshwater lake sediment of Lake Oneida, New York (Myers and Nealson, 1988) and similar bacteria have also been isolated from other freshwater environments (Venkateswaran et al., 1999). Here we comparatively analyzed the genome sequence and physiological characteristics of S. putrefaciens W3-18-1 and S. oneidensis MR-1, isolated from the marine and freshwater lake sediments, respectively. The anaerobic respirations, carbon source utilization, and cell motility have been experimentally investigated. Large scale horizontal gene transfers have been revealed and the genetic divergence between these two strains was considered to be critical to the bacterial adaptation to specific habitats, freshwater or marine sediments.

  14. Comparative Genomics Analysis and Phenotypic Characterization of Shewanella putrefaciens W3-18-1: Anaerobic Respiration, Bacterial Microcompartments, and Lateral Flagella

    Energy Technology Data Exchange (ETDEWEB)

    Qiu, D.; Tu, Q.; He, Zhili; Zhou, Jizhong

    2010-05-17

    Respiratory versatility and psychrophily are the hallmarks of Shewanella. The ability to utilize a wide range of electron acceptors for respiration is due to the large number of c-type cytochrome genes present in the genome of Shewanella strains. More recently the dissimilatory metal reduction of Shewanella species has been extensively and intensively studied for potential applications in the bioremediation of radioactive wastes of groundwater and subsurface environments. Multiple Shewanella genome sequences are now available in the public databases (Fredrickson et al., 2008). Most of the sequenced Shewanella strains were isolated from marine environments and this genus was believed to be of marine origin (Hau and Gralnick, 2007). However, the well-characterized model strain, S. oneidensis MR-1, was isolated from the freshwater lake sediment of Lake Oneida, New York (Myers and Nealson, 1988) and similar bacteria have also been isolated from other freshwater environments (Venkateswaran et al., 1999). Here we comparatively analyzed the genome sequence and physiological characteristics of S. putrefaciens W3-18-1 and S. oneidensis MR-1, isolated from the marine and freshwater lake sediments, respectively. The anaerobic respirations, carbon source utilization, and cell motility have been experimentally investigated. Large scale horizontal gene transfers have been revealed and the genetic divergence between these two strains was considered to be critical to the bacterial adaptation to specific habitats, freshwater or marine sediments.

  15. Large-scale genomic 2D visualization reveals extensive CG-AT skew correlation in bird genomes

    Directory of Open Access Journals (Sweden)

    Deng Xuemei

    2007-11-01

    Background: Bird genomes have a very different compositional structure from other warm-blooded animals. The variation in the base skew rules in the vertebrate genomes remains puzzling, but it must relate somehow to large-scale genome evolution. Current research is inclined to relate base skew with mutations and their fixation. Here we wish to explore base skew correlations in bird genomes, to develop methods for displaying and quantifying such correlations at different scales, and to discuss possible explanations for the peculiarities of the bird genomes in skew correlation. Results: We have developed a method called Base Skew Double Triangle (BSDT) for exhibiting the genome-scale change of AT/CG skew as a two-dimensional square picture, showing base skews at many scales simultaneously in a single image. By this method we found that most chicken chromosomes have high AT/CG skew correlation (symmetry in the 2D picture), except for some microchromosomes. No other organisms studied (18 species) show such high skew correlations. This visualized high correlation was validated by three kinds of quantitative calculations with overlapping and non-overlapping windows, all indicating that chicken and birds in general have a special genome structure. Similar features were also found in some of the mammal genomes, but clearly much weaker than in chickens. We presume that the skew correlation feature evolved near the time that birds separated from other vertebrate lineages. When we eliminated the repeat sequences from the genomes, the AT and CG skew correlation increased for some mammal genomes, but was still clearly lower than in chickens. Conclusion: Our results suggest that BSDT is an expressive visualization method for AT and CG skew and enabled the discovery of the very high skew correlation in bird genomes; this peculiarity is worth further study. Computational analysis indicated that this correlation might be a compositional characteristic
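    The windowed skew correlation that this abstract validates can be sketched roughly as follows. This is not the BSDT visualization itself, only a minimal illustration under stated assumptions: the skew definitions (A-T)/(A+T) and (C-G)/(C+G), the use of non-overlapping windows, the Pearson coefficient, and all function names are choices made here, not details given in the record.

```python
import math
import random


def skew(window, x, y):
    """Return (x - y) / (x + y) for the counts of bases x and y (0.0 if neither occurs)."""
    nx, ny = window.count(x), window.count(y)
    return (nx - ny) / (nx + ny) if nx + ny else 0.0


def windowed_skews(seq, size):
    """AT and CG skews over consecutive non-overlapping windows (assumed definitions)."""
    windows = [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]
    at = [skew(w, "A", "T") for w in windows]
    cg = [skew(w, "C", "G") for w in windows]
    return at, cg


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    # Synthetic sequence, for illustration only; real input would be a chromosome.
    rng = random.Random(1)
    seq = "".join(rng.choice("ACGT") for _ in range(20000))
    for size in (100, 500, 2000):  # several scales, echoing the multi-scale idea
        at, cg = windowed_skews(seq, size)
        print(size, round(pearson(at, cg), 3))
```

    Repeating the calculation at several window sizes gives one correlation value per scale, which is the quantity a multi-scale picture would summarize.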

  16. Draft Genome Sequence of Criibacterium bergeronii gen. nov., sp. nov., Strain CCRI-22567T, Isolated from a Vaginal Sample from a Woman with Bacterial Vaginosis.

    Science.gov (United States)

    Maheux, Andrée F; Bérubé, Ève; Boudreau, Dominique K; Raymond, Frédéric; Corbeil, Jacques; Roy, Paul H; Boissinot, Maurice; Omar, Rabeea F

    2016-01-01

    Criibacterium bergeronii gen. nov., sp. nov., CCRI-22567 is the type strain of the new genus Criibacterium. The strain was isolated from a woman with bacterial vaginosis. The genome assembly comprised 2,384,460 bp, with 34.4% G+C content. This is the first genome announcement of a strain belonging to the genus Criibacterium. PMID:27587833

  17. Combining p-values in large scale genomics experiments

    Science.gov (United States)

    Zaykin, Dmitri V.; Zhivotovsky, Lev A.; Czika, Wendy; Shao, Susan; Wolfinger, Russell D.

    2008-01-01

    Summary In large-scale genomics experiments involving thousands of statistical tests, such as association scans and microarray expression experiments, a key question is: Which of the L tests represent true associations (TAs)? The traditional way to control false findings is via individual adjustments. In the presence of multiple TAs, p-value combination methods offer certain advantages. Both Fisher’s and Lancaster’s combination methods use an inverse gamma transformation. We identify the relation of the shape parameter of that distribution to the implicit threshold value; p-values below that threshold are favored by the inverse gamma method (GM). We explore this feature to improve power over Fisher’s method when L is large and the number of TAs is moderate. However, the improvement in power provided by combination methods is at the expense of a weaker claim made upon rejection of the null hypothesis – that there are some TAs among the L tests. Thus, GM remains a global test. To allow a stronger claim about a subset of p-values that is smaller than L, we investigate two methods with an explicit truncation: the rank truncated product method (RTP) that combines the first K ordered p-values, and the truncated product method (TPM) that combines p-values that are smaller than a specified threshold. We conclude that TPM allows claims to be made about subsets of p-values, while the claim of the RTP is, like GM, more appropriately about all L tests. GM gives somewhat higher power than TPM, RTP, Fisher, and Simes methods across a range of simulations. PMID:17879330

  18. Combining p-values in large-scale genomics experiments.

    Science.gov (United States)

    Zaykin, Dmitri V; Zhivotovsky, Lev A; Czika, Wendy; Shao, Susan; Wolfinger, Russell D

    2007-01-01

    In large-scale genomics experiments involving thousands of statistical tests, such as association scans and microarray expression experiments, a key question is: Which of the L tests represent true associations (TAs)? The traditional way to control false findings is via individual adjustments. In the presence of multiple TAs, p-value combination methods offer certain advantages. Both Fisher's and Lancaster's combination methods use an inverse gamma transformation. We identify the relation of the shape parameter of that distribution to the implicit threshold value; p-values below that threshold are favored by the inverse gamma method (GM). We explore this feature to improve power over Fisher's method when L is large and the number of TAs is moderate. However, the improvement in power provided by combination methods is at the expense of a weaker claim made upon rejection of the null hypothesis - that there are some TAs among the L tests. Thus, GM remains a global test. To allow a stronger claim about a subset of p-values that is smaller than L, we investigate two methods with an explicit truncation: the rank truncated product method (RTP) that combines the first K-ordered p-values, and the truncated product method (TPM) that combines p-values that are smaller than a specified threshold. We conclude that TPM allows claims to be made about subsets of p-values, while the claim of the RTP is, like GM, more appropriately about all L tests. GM gives somewhat higher power than TPM, RTP, Fisher, and Simes methods across a range of simulations. PMID:17879330
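    The combination statistics named in this abstract can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes independent tests, p-values in (0, 1], and a Monte Carlo null for the truncated statistics; the function names, `n_sim`, and `seed` are hypothetical. The inverse gamma method (GM) is not reproduced.

```python
import math
import random


def fisher_combined_p(pvalues):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2L df under the global null."""
    half_stat = -sum(math.log(p) for p in pvalues)  # the statistic divided by 2
    # For even df = 2L the chi-square survival function has the closed form
    # exp(-x/2) * sum_{k=0}^{L-1} (x/2)^k / k!
    term, total = 1.0, 1.0
    for k in range(1, len(pvalues)):
        term *= half_stat / k
        total += term
    return math.exp(-half_stat) * total


def tpm_log_stat(pvalues, tau):
    """Truncated product method: log of the product of p-values below tau."""
    return sum(math.log(p) for p in pvalues if p < tau)


def rtp_log_stat(pvalues, k):
    """Rank truncated product: log of the product of the k smallest p-values."""
    return sum(math.log(p) for p in sorted(pvalues)[:k])


def monte_carlo_p(stat_fn, pvalues, n_sim=10000, seed=0):
    """Estimate the null probability of a statistic at least as small as observed."""
    rng = random.Random(seed)
    observed = stat_fn(pvalues)
    hits = 0
    for _ in range(n_sim):
        null = [1.0 - rng.random() for _ in pvalues]  # uniform on (0, 1]
        if stat_fn(null) <= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


if __name__ == "__main__":
    pvals = [0.001, 0.02, 0.04, 0.3, 0.6, 0.8, 0.9]  # made-up example values
    print("Fisher:", fisher_combined_p(pvals))
    print("TPM:", monte_carlo_p(lambda ps: tpm_log_stat(ps, 0.05), pvals))
    print("RTP:", monte_carlo_p(lambda ps: rtp_log_stat(ps, 3), pvals))
```

    Working on the log scale avoids underflow when thousands of small p-values are multiplied, which is the regime the abstract has in mind.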

  19. Life history determines biogeographical patterns of soil bacterial communities over multiple spatial scales.

    Science.gov (United States)

    Bissett, A; Richardson, A E; Baker, G; Wakelin, S; Thrall, P H

    2010-10-01

    The extent to which the distribution of soil bacteria is controlled by local environment vs. spatial factors (e.g. dispersal, colonization limitation, evolutionary events) is poorly understood and widely debated. Our understanding of biogeographic controls in microbial communities is likely hampered by the enormous environmental variability encountered across spatial scales and the broad diversity of microbial life histories. Here, we constrained environmental factors (soil chemistry, climate, above-ground plant community) to investigate the specific influence of space, by fitting all other variables first, on bacterial communities in soils over distances from m to 10² km. We found strong evidence for a spatial component to bacterial community structure that varies with scale and organism life history (dispersal and survival ability). Geographic distance had no influence over community structure for organisms known to have survival stages, but the converse was true for organisms thought to be less hardy. Community function (substrate utilization) was also shown to be highly correlated with community structure, but not to abiotic factors, suggesting nonstochastic determinants of community structure are important. Our results support the view that bacterial soil communities are constrained by both edaphic factors and geographic distance and further show that the relative importance of such constraints depends critically on the taxonomic resolution used to evaluate spatio-temporal patterns of microbial diversity, as well as life history of the groups being investigated, much as is the case for macro-organisms. PMID:25241408

  20. Sexagesimal scale for mapping human genome

    OpenAIRE

    RICARDO CRUZ-COKE

    2001-01-01

    In a previous work I designed a diagram of the human genome based on a circular ideogram of the haploid set of chromosomes, using a low resolution scale of Megabase units. The purpose of this work is to draft a new scale to measure the physical map of the human genome at the highest resolution level. The entire length of the haploid genome of males is deployed in a circumference, marked with a sexagesimal scale with 360 degrees and 1,296,000 arc seconds. The radius of this circumference displays...
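    The scale described here (360 degrees, i.e. 360 x 60 x 60 = 1,296,000 arc seconds, laid over the whole haploid genome) can be illustrated with a short conversion. This is a sketch under assumptions: the record does not say how chromosomes are ordered or where the origin lies, so the input is taken to be a cumulative base-pair offset along the concatenated genome, and `to_sexagesimal` and the genome length used in the demo are hypothetical.

```python
ARCSEC_PER_CIRCLE = 360 * 60 * 60  # 1,296,000 arc seconds in a full circle


def to_sexagesimal(position_bp, genome_length_bp):
    """Map a cumulative genome offset to (degrees, arc minutes, arc seconds)."""
    if not 0 <= position_bp < genome_length_bp:
        raise ValueError("position must lie within the genome")
    # Integer arithmetic: whole arc seconds from the origin, rounded down.
    total_arcsec = position_bp * ARCSEC_PER_CIRCLE // genome_length_bp
    degrees, remainder = divmod(total_arcsec, 3600)
    minutes, seconds = divmod(remainder, 60)
    return degrees, minutes, seconds


if __name__ == "__main__":
    genome_length = 3_000_000_000  # illustrative round number, not from the record
    print(to_sexagesimal(1_500_000_000, genome_length))
    print("bp per arc second:", genome_length / ARCSEC_PER_CIRCLE)
```

    The last line of the demo shows the resolution such a scale implies: the number of base pairs spanned by a single arc second for the assumed genome length.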

  1. Genome Sequence of the Banana Pathogen Dickeya zeae Strain MS1, Which Causes Bacterial Soft Rot

    OpenAIRE

    Zhang, Jing-Xin; Lin, Bi-Run; Shen, Hui-Fang; Pu, Xiao-Ming

    2013-01-01

    We report a draft genome sequence of Dickeya zeae strain MS1, which is the causative agent of banana soft rot in China, and we show several of its specific properties compared with those of other D. zeae strains. Genome sequencing provides a tool for understanding the genomic basis of the pathogenicity and the phylogenetic placement of this pathogen.

  2. Complete Genomes of Classical Swine Fever Virus Cloned into Bacterial Artificial Chromosomes

    DEFF Research Database (Denmark)

    Rasmussen, Thomas Bruun; Reimann, I.; Uttenthal, Åse;

    Complete genome amplification of viral RNA provides a new tool for the generation of modified pestiviruses. We have used our full-genome amplification strategy for generation of amplicons representing complete genomes of classical swine fever virus. The amplicons were cloned directly into a stable...

  3. Construction of a bacterial artificial chromosome library from the spikemoss Selaginella moellendorffii: a new resource for plant comparative genomics

    Directory of Open Access Journals (Sweden)

    Chapple Clint

    2005-06-01

    Background: The lycophytes are an ancient lineage of vascular plants that diverged from the seed plant lineage about 400 Myr ago. Although the lycophytes occupy an important phylogenetic position for understanding the evolution of plants and their genomes, no genomic resources exist for this group of plants. Results: Here we describe the construction of a large-insert bacterial artificial chromosome (BAC) library from the lycophyte Selaginella moellendorffii. Based on cell flow cytometry, this species has the smallest genome size among the different lycophytes tested, including Huperzia lucidula, Diphasiastrum digitatum, Isoetes engelmannii and S. kraussiana. The arrayed BAC library consists of 9126 clones; the average insert size is estimated to be 122 kb. Inserts of chloroplast origin account for 2.3% of the clones. The BAC library contains an estimated ten genome-equivalents based on DNA hybridizations using five single-copy and two duplicated S. moellendorffii genes as probes. Conclusion: The S. moellendorffii BAC library, the first to be constructed from a lycophyte, will be useful to the scientific community as a resource for comparative plant genomics and evolution.
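    The library figures in this abstract can be related by simple arithmetic. The sketch below is only a back-of-the-envelope reading, not the authors' method: their ten-fold estimate comes from hybridization, whereas this computes coverage as clones x mean insert size / genome size after discounting chloroplast inserts. The function names are hypothetical, and the Poisson miss probability is a standard approximation added here as an assumption.

```python
import math


def nuclear_insert_kb(n_clones, mean_insert_kb, organellar_fraction=0.0):
    """Total cloned nuclear DNA in kb after discounting organellar inserts."""
    return n_clones * mean_insert_kb * (1.0 - organellar_fraction)


def implied_genome_size_mb(n_clones, mean_insert_kb, coverage, organellar_fraction=0.0):
    """Genome size (Mb) that would make the library `coverage` genome-equivalents."""
    return nuclear_insert_kb(n_clones, mean_insert_kb, organellar_fraction) / coverage / 1000.0


def miss_probability(coverage):
    """Poisson approximation to the chance that a given locus is in no clone."""
    return math.exp(-coverage)


if __name__ == "__main__":
    # Figures from the abstract: 9126 clones, 122 kb inserts, 2.3% chloroplast, ~10x.
    size = implied_genome_size_mb(9126, 122, 10, organellar_fraction=0.023)
    print(f"implied genome size: {size:.1f} Mb")
    print(f"chance a locus is missed at 10x: {miss_probability(10):.1e}")
```

    Under these assumptions the stated clone count, insert size, and coverage are mutually consistent with a genome on the order of 100 Mb; the record itself does not state a genome size.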

  4. Purification and partial genome characterization of the bacterial endosymbiont Blattabacterium cuenoti from the fat bodies of cockroaches

    Directory of Open Access Journals (Sweden)

    Yamada Akinori

    2008-11-01

    Background: Symbiotic relationships between intracellular bacteria and eukaryotes are widespread in nature. Genome sequencing of the bacterial partner has provided a number of key insights into the basis of these symbioses. A challenging aspect of sequencing symbiont genomes is separating the bacteria from the host tissues. In the present study, we describe a simple method of endosymbiont purification from a complex environment, using Blattabacterium cuenoti inhabiting cockroaches as a model system. Findings: B. cuenoti cells were successfully purified from the fat bodies of the cockroach Panesthia angustipennis by a combination of slow- and fast-speed centrifugal fractionations, nylon-membrane filtration, and centrifugation with Percoll solutions. We performed pulsed-field electrophoresis, diagnostic PCR and random sequencing of the shotgun library. These experiments confirmed minimal contamination of host and mitochondrial DNA. The genome size and the G+C content of B. cuenoti were inferred to be 650 kb and 32.1 ± 7.6%, respectively. Conclusion: The present study showed successful purification and characterization of the genome of B. cuenoti. Our methodology should be applicable for future symbiont genome sequencing projects. An advantage of the present purification method is that each step is easily performed with ordinary microtubes and a microcentrifuge, and without DNase treatment.

  5. Effect of different genomic relationship matrices on accuracy and scale

    OpenAIRE

    Chen, Ching-Yi; Misztal, I.; Aguilar, I.; Legarra, Andres; Muir, W.M.

    2011-01-01

    Phenotypic data on BW and breast meat area were available on up to 287,614 broilers. A total of 4,113 birds were genotyped for 57,636 SNP. Data were analyzed by a single-step genomic BLUP (ssGBLUP), which accounts for all phenotypic, pedigree, and genomic information. The genomic relationship matrix (G) in ssGBLUP was constructed using either equal (0.5; GEq) or current (GC) allele frequencies, and with all SNP or with SNP with minor allele frequencies (MAF) below multiple thresholds (0.1, 0....
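    The two allele-frequency choices compared in this abstract (equal frequencies of 0.5 versus current frequencies) can be illustrated with a small genomic relationship matrix. The record does not give the formula, so this sketch assumes the widely used form G = ZZ' / (2 Σ p(1 - p)) with genotypes coded 0/1/2 and Z centered by 2p; the function name and the toy genotypes are hypothetical, and the single-step combination with pedigree is not shown.

```python
def genomic_relationship(genotypes, freqs=None):
    """Build G from 0/1/2 genotype rows (one row per animal).

    freqs: per-SNP allele frequencies used for centering and scaling.
    None means "current" frequencies estimated from the genotypes themselves.
    Assumes at least one SNP is polymorphic under the chosen frequencies.
    """
    n_animals = len(genotypes)
    n_snp = len(genotypes[0])
    if freqs is None:
        freqs = [sum(row[j] for row in genotypes) / (2 * n_animals) for j in range(n_snp)]
    # Center each genotype by twice the allele frequency.
    z = [[g - 2 * p for g, p in zip(row, freqs)] for row in genotypes]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    return [[sum(a * b for a, b in zip(zi, zj)) / scale for zj in z] for zi in z]


if __name__ == "__main__":
    toy = [[2, 2, 1], [2, 0, 1], [0, 0, 2]]  # three animals, three SNP (made up)
    g_eq = genomic_relationship(toy, freqs=[0.5] * 3)  # equal frequencies (GEq)
    g_c = genomic_relationship(toy)                    # current frequencies (GC)
    for name, g in (("GEq", g_eq), ("GC", g_c)):
        print(name, [[round(v, 3) for v in row] for row in g])
```

    Comparing the two printed matrices shows how the frequency choice shifts both the diagonal and the off-diagonal elements, which is the scale effect the title refers to.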

  6. Bacterial Lifestyle in a Deep-sea Hydrothermal Vent Chimney Revealed by the Genome Sequence of the Thermophilic Bacterium Deferribacter desulfuricans SSM1

    OpenAIRE

    Takaki, Yoshihiro; Shimamura, Shigeru; Nakagawa, Satoshi; Fukuhara, Yasuo; Horikawa, Hiroshi; Ankai, Akiho; Harada, Takeshi; Hosoyama, Akira; Oguchi, Akio; Fukui, Shigehiro; Fujita, Nobuyuki; Takami, Hideto; Takai, Ken

    2010-01-01

    The complete genome sequence of the thermophilic sulphur-reducing bacterium, Deferribacter desulfuricans SSM1, isolated from a hydrothermal vent chimney has been determined. The genome comprises a single circular chromosome of 2 234 389 bp and a megaplasmid of 308 544 bp. Many genes encoded in the genome are most similar to the genes of sulphur- or sulphate-reducing bacterial species within Deltaproteobacteria. The reconstructed central metabolisms showed a heterotrophic lifestyle primarily d...

  7. BiGG: a Biochemical Genetic and Genomic knowledgebase of large scale metabolic reconstructions

    OpenAIRE

    Conrad Tom M; Park Junyoung O; Schellenberger Jan; Palsson Bernhard Ø

    2010-01-01

    Abstract Background Genome-scale metabolic reconstructions under the Constraint Based Reconstruction and Analysis (COBRA) framework are valuable tools for analyzing the metabolic capabilities of organisms and interpreting experimental data. As the number of such reconstructions and analysis methods increases, there is a greater need for data uniformity and ease of distribution and use. Description We describe BiGG, a knowledgebase of Biochemically, Genetically and Genomically structured genom...

  8. Deciphering Cyanide-Degrading Potential of Bacterial Community Associated with the Coking Wastewater Treatment Plant with a Novel Draft Genome.

    Science.gov (United States)

    Wang, Zhiping; Liu, Lili; Guo, Feng; Zhang, Tong

    2015-10-01

    Biotreatment processes fed with coking wastewater often encounter insufficient removal of pollutants, such as ammonia, phenols, and polycyclic aromatic hydrocarbons (PAHs), and especially of cyanides. However, only a limited number of bacterial species in pure cultures have been confirmed to metabolize cyanides, which hinders the improvement of these processes. In this study, a microbial community of activated sludge enriched in a coking wastewater treatment plant was analyzed using 454 pyrosequencing and Illumina sequencing to characterize the potential cyanide-degrading bacteria. According to the classification of these pyro-tags, targeting V3/V4 regions of the 16S rRNA gene, half of them were assigned to the family Xanthomonadaceae, implying that Xanthomonadaceae bacteria are well-adapted to coking wastewater. A nearly complete draft genome of the dominant bacterium was reconstructed from the metagenome of this community to explore cyanide metabolism based on analysis of the genome. The assembled 16S rRNA gene from this draft genome showed that this bacterium was a novel species of Thermomonas within Xanthomonadaceae, which was further verified by comparative genomics. The annotation using KEGG and Pfam identified genes related to cyanide metabolism, including genes responsible for the iron-harvesting system, cyanide-insensitive terminal oxidase, cyanide hydrolase/nitrilase, and thiosulfate:cyanide transferase. Phylogenetic analysis showed that these genes had homologs in previously identified genomes of bacteria within Xanthomonadaceae and even presented similar gene cassettes, thus implying an inherent cyanide-decomposing potential. The findings of this study expand our knowledge about the bacterial degradation of cyanide compounds and will be helpful in the remediation of cyanide contamination. PMID:25910603

  9. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    DEFF Research Database (Denmark)

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases...

  10. Finding Non-Coding RNAs Through Genome-Scale Clustering

    OpenAIRE

    Tseng, Huei-Hun; Weinberg, Zasha; Gore, Jeremy; Breaker, Ronald R.; Ruzzo, Walter L.

    2009-01-01

    Non-coding RNAs (ncRNAs) are transcripts that do not code for proteins. Recent findings have shown that RNA-mediated regulatory mechanisms influence a substantial portion of typical microbial genomes. We present an efficient method for finding potential ncRNAs in bacteria by clustering genomic sequences based on homology inferred from both primary sequence and secondary structure. We evaluate our approach using a set of predominantly Firmicutes sequences. Our results showed that, though prima...

  11. Genomic survey of pathogenicity determinants and VNTR markers in the cassava bacterial pathogen Xanthomonas axonopodis pv. manihotis strain CIO151.

    Directory of Open Access Journals (Sweden)

    Mario L Arrieta-Ortiz

    Xanthomonas axonopodis pv. manihotis (Xam) is the causal agent of bacterial blight of cassava, which is among the main components of the human diet in Africa and South America. Current information about the molecular pathogenicity factors involved in the infection process of this organism is limited. Previous studies in other bacteria in this genus suggest that advanced draft genome sequences are valuable resources for molecular studies on their interaction with plants and could provide valuable tools for diagnostics and detection. Here we have generated the first manually annotated high-quality draft genome sequence of Xam strain CIO151. Its genomic structure is similar to that of other xanthomonads, especially Xanthomonas euvesicatoria and Xanthomonas citri pv. citri species. Several putative pathogenicity factors were identified, including type III effectors, cell wall-degrading enzymes and clusters encoding protein secretion systems. Specific characteristics in this genome include changes in the xanthomonadin cluster that could explain the lack of typical yellow color in all strains of this pathovar and the presence of 50 regions in the genome with atypical nucleotide composition. The genome sequence was used to predict and evaluate 22 variable number of tandem repeat (VNTR) loci that were subsequently demonstrated as polymorphic in representative Xam strains. Our results demonstrate that Xanthomonas axonopodis pv. manihotis strain CIO151 possesses ten clusters of pathogenicity factors conserved within the genus Xanthomonas. We report 126 genes that are potentially unique to Xam, as well as potential horizontal transfer events in the history of the genome. The relation of these regions with virulence and pathogenicity could explain several aspects of the biology of this pathogen, including its ability to colonize both vascular and non-vascular tissues of cassava plants. A set of 16 robust, polymorphic VNTR loci will be useful to develop a multi

  12. In Silico Genome-Scale Reconstruction and Validation of the Corynebacterium glutamicum Metabolic Network

    DEFF Research Database (Denmark)

    Kjeldsen, Kjeld Raunkjær; Nielsen, J.

    2009-01-01

    A genome-scale metabolic model of the Gram-positive bacterium Corynebacterium glutamicum ATCC 13032 was constructed comprising 446 reactions and 411 metabolites, based on the annotated genome and available biochemical information. The network was analyzed using constraint-based methods. The model w...

  13. LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Tier 1 Report

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T; Borucki, M; Lenhoff, R; Vitalis, E

    2009-09-29

    The Lawrence Livermore National Lab Bioinformatics group has recently taken on a role in DTRA's Transformation Medical Technologies Initiative (TMTI). The high-level goal of TMTI is to accelerate the development of broad-spectrum countermeasures. To achieve those goals, TMTI has a near-term need to obtain more sequence information across a large range of pathogens, near neighbors, and across a broad geographical and host range. Our role in this project is to research available sequence data for the organisms of interest and identify critical microbial sequence and knowledge gaps that need to be filled to meet TMTI objectives. This effort includes: (1) assessing current genomic sequence for each agent including phylogenetic and geographical diversity, host range, date of isolation range, virulence, sequence availability of key near neighbors, and other characteristics; (2) identifying Subject Matter Experts (SMEs) and potential holders of isolate collections, contacting appropriate SMEs with known expertise and isolate collections to obtain information on isolate availability and specific recommendations; (3) identifying sequence as well as knowledge gaps (e.g., virulence, host range, and antibiotic resistance determinants); (4) providing specific recommendations as to the most valuable strains to be placed on the DTRA sequencing queue. We acknowledge that criteria for prioritization of isolates for sequencing fall into two categories aligning with priority queues 1 and 2 as described in the summary. (Priority queue 0 relates to DTRA operational isolates whose availability is not predictable in advance.) 1. Selection of isolates that appear likely to provide information on virulence and antibiotic resistance. This will include sequence of known virulent strains. Particularly valuable would be virulent strains that have genetically similar yet avirulent, or non-human-transmissible, counterparts that can be used for comparison to help

  15. Independent large scale duplications in multiple M. tuberculosis lineages overlapping the same genomic region.

    Directory of Open Access Journals (Sweden)

    Brian Weiner

    Mycobacterium tuberculosis, the causative agent of most human tuberculosis, infects one third of the world's population and kills an estimated 1.7 million people a year. With the world-wide emergence of drug resistance, and the finding of more functional genetic diversity than previously expected, there is a renewed interest in understanding the forces driving genome evolution of this important pathogen. Genetic diversity in M. tuberculosis is dominated by single nucleotide polymorphisms and small-scale gene deletion, with little or no evidence for the large-scale genome rearrangements seen in other bacteria. Recently, a single report described a large-scale genome duplication that was suggested to be specific to the Beijing lineage. We report here multiple independent large-scale duplications of the same genomic region of M. tuberculosis detected through whole-genome sequencing. The duplications occur in strains belonging to both M. tuberculosis lineages 2 and 4, and are thus not limited to Beijing strains. The duplications occur in both drug-resistant and drug-susceptible strains. The duplicated regions also have substantially different boundaries in different strains, indicating different originating duplication events. We further identify a smaller segmental duplication of a different genomic region of a lab strain of H37Rv. The presence of multiple independent duplications of the same genomic region suggests either instability in this region, a selective advantage conferred by the duplication, or both. The identified duplications suggest that large-scale gene duplication may be more common in M. tuberculosis than previously considered.

  16. Environmental versatility promotes modularity in genome-scale metabolic networks

    Directory of Open Access Journals (Sweden)

    Wagner Andreas

    2011-08-01

    Background: The ubiquity of modules in biological networks may result from an evolutionary benefit of a modular organization. For instance, modularity may increase the rate of adaptive evolution, because modules can be easily combined into new arrangements that may benefit their carrier. Conversely, modularity may emerge as a by-product of some trait. We here ask whether this last scenario may play a role in genome-scale metabolic networks that need to sustain life in one or more chemical environments. For such networks, we define a network module as a maximal set of reactions that are fully coupled, i.e., whose fluxes can only vary in fixed proportions. This definition overcomes limitations of purely graph-based analyses of metabolism by exploiting the functional links between reactions. We call a metabolic network viable in a given chemical environment if it can synthesize all of an organism's biomass compounds from nutrients in this environment. An organism's metabolism is highly versatile if it can sustain life in many different chemical environments. We here ask whether versatility affects the modularity of metabolic networks. Results: Using recently developed techniques to randomly sample large numbers of viable metabolic networks from a vast space of metabolic networks, we use flux balance analysis to study in silico metabolic networks that differ in their versatility. We find that highly versatile networks are also highly modular. They contain more modules and more reactions that are organized into modules. Most or all reactions in a module are associated with the same biochemical pathways. Modules that arise in highly versatile networks generally involve reactions that process nutrients or closely related chemicals. We also observe that the metabolism of E. coli is significantly more modular than even our most versatile networks. Conclusions: Our work shows that modularity in metabolic networks can be a by-product of functional

  17. A short-time scale colloidal system reveals early bacterial adhesion dynamics.

    Directory of Open Access Journals (Sweden)

    Christophe Beloin

    2008-07-01

    The development of bacteria on abiotic surfaces has important public health and sanitary consequences. However, despite several decades of study of bacterial adhesion to inert surfaces, the biophysical mechanisms governing this process remain poorly understood, due, in particular, to the lack of methodologies covering the appropriate time scale. Using micrometric colloidal surface particles and flow cytometry analysis, we developed a rapid multiparametric approach to studying early events in adhesion of the bacterium Escherichia coli. This approach simultaneously describes the kinetics and amplitude of early steps in adhesion, changes in physicochemical surface properties within the first few seconds of adhesion, and the self-association state of attached and free-floating cells. Examination of the role of three well-characterized E. coli surface adhesion factors upon attachment to colloidal surfaces--curli fimbriae, F-conjugative pilus, and Ag43 adhesin--showed clear-cut differences in the very initial phases of surface colonization for cell-bearing surface structures, all known to promote biofilm development. Our multiparametric analysis revealed a correlation in the adhesion phase with cell-to-cell aggregation properties and demonstrated that this phenomenon amplified surface colonization once initial cell-surface attachment was achieved. Monitoring of real-time physico-chemical particle surface properties showed that surface-active molecules of bacterial origin quickly modified surface properties, providing new insight into the intricate relations connecting abiotic surface physicochemical properties and bacterial adhesion. Hence, the biophysical analytical method described here provides a new and relevant approach to quantitatively and kinetically investigating bacterial adhesion and biofilm development.

  18. Comparative genomics of the bacterial genus Listeria: Genome evolution is characterized by limited gene acquisition and limited gene loss

    OpenAIRE

    den Bakker, Henk C.; Cummings, Craig A.; Ferreira, Vania; Vatta, Paolo; Orsi, Renato H.; Degoricija, Lovorka; Barker, Melissa; Petrauskene, Olga; Furtado, Manohar R; Wiedmann, Martin

    2010-01-01

    Background: The bacterial genus Listeria contains pathogenic and non-pathogenic species, including the pathogens L. monocytogenes and L. ivanovii, both of which carry homologous virulence gene clusters such as the prfA cluster and clusters of internalin genes. Initial evidence for multiple deletions of the prfA cluster during the evolution of Listeria indicates that this genus provides an interesting model for studying the evolution of virulence and also presents practical challenges with rega...

  19. The Genomic Sequence of the Oral Pathobiont Strain NI1060 Reveals Unique Strategies for Bacterial Competition and Pathogenicity.

    Directory of Open Access Journals (Sweden)

    Youssef Darzi

    Strain NI1060 is an oral bacterium responsible for periodontitis in a murine ligature-induced disease model. To better understand its pathogenicity, we have determined the complete sequence of its 2,553,982 bp genome. Although closely related to Pasteurella pneumotropica, a pneumonia-associated rodent commensal based on its 16S rRNA, the NI1060 genomic content suggests that they are different species thriving on different energy sources via alternative metabolic pathways. Genomic and phylogenetic analyses showed that strain NI1060 is distinct from the genera currently described in the family Pasteurellaceae, and is likely to represent a novel species. In addition, we found putative virulence genes involved in lipooligosaccharide synthesis, adhesins and bacteriotoxic proteins. These genes are potentially important for host adaption and for the induction of dysbiosis through bacterial competition and pathogenicity. Importantly, strain NI1060 strongly stimulates Nod1, an innate immune receptor, but is defective in two peptidoglycan recycling genes due to a frameshift mutation. The in-depth analysis of its genome thus provides critical insights for the development of NI1060 as a prime model system for infectious disease.

  20. Dynamics of bacterial communities before and after distribution in a full-scale drinking water network

    KAUST Repository

    El-Chakhtoura, Joline

    2015-05-01

    Understanding the biological stability of drinking water distribution systems is imperative in the framework of process control and risk management. The objective of this research was to examine the dynamics of the bacterial community during drinking water distribution at high temporal resolution. Water samples (156 in total) were collected over short time-scales (minutes/hours/days) from the outlet of a treatment plant and a location in its corresponding distribution network. The drinking water is treated by biofiltration and disinfectant residuals are absent during distribution. The community was analyzed by 16S rRNA gene pyrosequencing and flow cytometry as well as conventional, culture-based methods. Despite a random dramatic event (detected with pyrosequencing and flow cytometry but not with plate counts), the bacterial community profile at the two locations did not vary significantly over time. A diverse core microbiome was shared between the two locations (58-65% of the taxa and 86-91% of the sequences) and found to be dependent on the treatment strategy. The bacterial community structure changed during distribution, with greater richness detected in the network and phyla such as Acidobacteria and Gemmatimonadetes becoming abundant. The rare taxa displayed the highest dynamicity, causing the major change during water distribution. This change did not have hygienic implications and is contingent on the sensitivity of the applied methods. The concept of biological stability therefore needs to be revised. Biostability is generally desired in drinking water guidelines but may be difficult to achieve in large-scale complex distribution systems that are inherently dynamic.

  1. Bacterial toxicity comparison between nano- and micro-scaled oxide particles

    Energy Technology Data Exchange (ETDEWEB)

    Jiang Wei; Mashayekhi, Hamid [Department of Plant, Soil and Insect Sciences, University of Massachusetts, Stockbridge Hall, Amherst, MA 01003 (United States); Xing Baoshan, E-mail: bx@pssci.umass.ed [Department of Plant, Soil and Insect Sciences, University of Massachusetts, Stockbridge Hall, Amherst, MA 01003 (United States)

    2009-05-15

    Toxicity of nano-scaled aluminum, silicon, titanium and zinc oxides to bacteria (Bacillus subtilis, Escherichia coli and Pseudomonas fluorescens) was examined and compared to that of their respective bulk (micro-scaled) counterparts. All nanoparticles but titanium oxide showed higher toxicity (at 20 mg/L) than their bulk counterparts. Toxicity of released metal ions was differentiated from that of the oxide particles. ZnO was the most toxic among the three nanoparticles, causing 100% mortality to the three tested bacteria. Al2O3 nanoparticles had a mortality rate of 57% to B. subtilis, 36% to E. coli, and 70% to P. fluorescens. SiO2 nanoparticles killed 40% of B. subtilis, 58% of E. coli, and 70% of P. fluorescens. TEM images showed attachment of nanoparticles to the bacteria, suggesting that the toxicity was affected by bacterial attachment. Bacterial responses to nanoparticles were different from their bulk counterparts; hence nanoparticle toxicity mechanisms need to be studied thoroughly. Highlight: Oxide nanoparticles show higher toxicity than their bulk counterparts.

  2. Bacterial toxicity comparison between nano- and micro-scaled oxide particles

    International Nuclear Information System (INIS)

    Toxicity of nano-scaled aluminum, silicon, titanium and zinc oxides to bacteria (Bacillus subtilis, Escherichia coli and Pseudomonas fluorescens) was examined and compared to that of their respective bulk (micro-scaled) counterparts. All nanoparticles but titanium oxide showed higher toxicity (at 20 mg/L) than their bulk counterparts. Toxicity of released metal ions was differentiated from that of the oxide particles. ZnO was the most toxic among the three nanoparticles, causing 100% mortality to the three tested bacteria. Al2O3 nanoparticles had a mortality rate of 57% to B. subtilis, 36% to E. coli, and 70% to P. fluorescens. SiO2 nanoparticles killed 40% of B. subtilis, 58% of E. coli, and 70% of P. fluorescens. TEM images showed attachment of nanoparticles to the bacteria, suggesting that the toxicity was affected by bacterial attachment. Bacterial responses to nanoparticles were different from their bulk counterparts; hence nanoparticle toxicity mechanisms need to be studied thoroughly. Highlight: Oxide nanoparticles show higher toxicity than their bulk counterparts.

  3. Rapid genome-scale mapping of chromatin accessibility in tissue

    DEFF Research Database (Denmark)

    Grøntved, Lars; Bandle, Russell; John, Sam;

    2012-01-01

    BACKGROUND: The challenge in extracting genome-wide chromatin features from limiting clinical samples poses a significant hurdle in identification of regulatory marks that impact the physiological or pathological state. Current methods that identify nuclease accessible chromatin are reliant on la...

  4. Analysis of Aspergillus nidulans metabolism at the genome-scale

    DEFF Research Database (Denmark)

    David, Helga; Ozcelik, İlknur Ş; Hofmann, Gerald;

    2008-01-01

    function. Results: In this work, we have manually assigned functions to 472 orphan genes in the metabolism of A. nidulans, by using a pathway-driven approach and by employing comparative genomics tools based on sequence similarity. The central metabolism of A. nidulans, as well as biosynthetic pathways of...

  5. Mapping copy number variation by population-scale genome sequencing

    DEFF Research Database (Denmark)

    Mills, Ryan E.; Walter, Klaudia; Stewart, Chip;

    2011-01-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, ...

  6. Draft Genome Sequence of Nocardia jinanensis, an Opportunistic Bacterial Pathogen That Causes Cellulitis

    Science.gov (United States)

    Chakrabortti, Alolika; Li, Jinming

    2016-01-01

    The draft genome sequence of Nocardia jinanensis, an opportunistic pathogen that can cause skin infections, reveals genes that may contribute to the lifestyle and pathogenicity of N. jinanensis. The genome also reveals the biosynthetic capacity of N. jinanensis in producing mycolic acids, siderophores, and other polyketide and nonribosomal peptide-derived secondary metabolites. PMID:27445366

  7. Symmetry and scale orient Min protein patterns in shaped bacterial sculptures

    Science.gov (United States)

    Wu, Fabai; van Schie, Bas G. C.; Keymer, Juan E.; Dekker, Cees

    2015-08-01

    The boundary of a cell defines the shape and scale of its subcellular organization. However, the effects of the cell's spatial boundaries as well as the geometry sensing and scale adaptation of intracellular molecular networks remain largely unexplored. Here, we show that living bacterial cells can be ‘sculpted’ into defined shapes, such as squares and rectangles, which are used to explore the spatial adaptation of Min proteins that oscillate pole-to-pole in rod-shaped Escherichia coli to assist cell division. In a wide geometric parameter space, ranging from 2 × 1 × 1 to 11 × 6 × 1 μm³, Min proteins exhibit versatile oscillation patterns, sustaining rotational, longitudinal, diagonal, stripe and even transversal modes. These patterns are found to directly capture the symmetry and scale of the cell boundary, and the Min concentration gradients scale with the cell size within a characteristic length range of 3-6 μm. Numerical simulations reveal that local microscopic Turing kinetics of Min proteins can yield global symmetry selection, gradient scaling and an adaptive range, when and only when facilitated by the three-dimensional confinement of the cell boundary. These findings cannot be explained by previous geometry-sensing models based on the longest distance, membrane area or curvature, and reveal that spatial boundaries can facilitate simple molecular interactions to result in far more versatile functions than previously understood.

  8. The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts.

    Science.gov (United States)

    Liu, Zhanjiang; Liu, Shikai; Yao, Jun; Bao, Lisui; Zhang, Jiaren; Li, Yun; Jiang, Chen; Sun, Luyang; Wang, Ruijia; Zhang, Yu; Zhou, Tao; Zeng, Qifan; Fu, Qiang; Gao, Sen; Li, Ning; Koren, Sergey; Jiang, Yanliang; Zimin, Aleksey; Xu, Peng; Phillippy, Adam M; Geng, Xin; Song, Lin; Sun, Fanyue; Li, Chao; Wang, Xiaozhu; Chen, Ailu; Jin, Yulin; Yuan, Zihao; Yang, Yujia; Tan, Suxu; Peatman, Eric; Lu, Jianguo; Qin, Zhenkui; Dunham, Rex; Li, Zhaoxia; Sonstegard, Tad; Feng, Jianbin; Danzmann, Roy G; Schroeder, Steven; Scheffler, Brian; Duke, Mary V; Ballard, Linda; Kucuktas, Huseyin; Kaltenboeck, Ludmilla; Liu, Haixia; Armbruster, Jonathan; Xie, Yangjie; Kirby, Mona L; Tian, Yi; Flanagan, Mary Elizabeth; Mu, Weijie; Waldbieser, Geoffrey C

    2016-01-01

    Catfish represent 12% of teleost or 6.3% of all vertebrate species, and are of enormous economic value. Here we report a high-quality reference genome sequence of channel catfish (Ictalurus punctatus), the major aquaculture species in the US. The reference genome sequence was validated by genetic mapping of 54,000 SNPs, and annotated with 26,661 predicted protein-coding genes. Through comparative analysis of genomes and transcriptomes of scaled and scaleless fish and scale regeneration experiments, we address the genomic basis for the most striking physical characteristic of catfish, the evolutionary loss of scales and provide evidence that lack of secretory calcium-binding phosphoproteins accounts for the evolutionary loss of scales in catfish. The channel catfish reference genome sequence, along with two additional genome sequences and transcriptomes of scaled catfishes, provide crucial resources for evolutionary and biological studies. This work also demonstrates the power of comparative subtraction of candidate genes for traits of structural significance. PMID:27249958

  9. Development of an interactive genome browser to visualize and analyse large scale genomic data

    OpenAIRE

    Sinclair, Lucas

    2010-01-01

    Genomic bioinformatics is a growing and developing field. Indeed, data analysis is becoming an integrative and essential part of any quantitative biological experiment as the technologies evolve and the wet lab methods used generate larger and larger quantities of data. Yet few standards have emerged and a plethora of analytical tools exist, none of which are established as a standard. The difficulties arise early on, even before processing any genomic data, as one first needs to visualize it...

  10. Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements

    OpenAIRE

    Darling, Aaron C.E.; Mau, Bob; Blattner, Frederick R.; Perna, Nicole T.

    2004-01-01

    As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments conserved among all the genomes under considera...

  11. On the road to synthetic life: the minimal cell and genome-scale engineering.

    Science.gov (United States)

    Juhas, Mario

    2016-06-01

    Synthetic biology employs rational engineering principles to build biological systems from the libraries of standard, well characterized biological parts. Biological systems designed and built by synthetic biologists fulfill a plethora of useful purposes, ranging from better healthcare and energy production to biomanufacturing. Recent advancements in the synthesis, assembly and "booting-up" of synthetic genomes and in low and high-throughput genome engineering have paved the way for engineering on the genome-wide scale. One of the key goals of genome engineering is the construction of minimal genomes consisting solely of essential genes (genes indispensable for survival of living organisms). Besides serving as a toolbox to understand the universal principles of life, the cell encoded by minimal genome could be used to build a stringently controlled "cell factory" with a desired phenotype. This review provides an update on recent advances in the genome-scale engineering with particular emphasis on the engineering of minimal genomes. Furthermore, it presents an ongoing discussion to the scientific community for better suitability of minimal or robust cells for industrial applications. PMID:25578717

  12. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes

    Directory of Open Access Journals (Sweden)

    Yang Yi-Fan

    2007-03-01

    Background: Despite remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. Results: This paper describes a new prokaryotic gene-finding algorithm based on a comprehensive statistical model of protein-coding Open Reading Frames (ORFs) and Translation Initiation Sites (TISs). The former is based on a linguistic "Entropy Density Profile" (EDP) model of coding DNA sequence and the latter comprises several relevant features related to translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED) algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Conclusion: Results of extensive tests show that MED 2.0 achieves competitively high performance in gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

  13. TIGER: Toolbox for integrating genome-scale metabolic models, expression data, and transcriptional regulatory networks

    Directory of Open Access Journals (Sweden)

    Jensen Paul A

    2011-09-01

    Background: Several methods have been developed for analyzing genome-scale models of metabolism and transcriptional regulation. Many of these methods, such as Flux Balance Analysis, use constrained optimization to predict relationships between metabolic flux and the genes that encode and regulate enzyme activity. Recently, mixed integer programming has been used to encode these gene-protein-reaction (GPR) relationships into a single optimization problem, but these techniques are often of limited generality and lack a tool for automating the conversion of rules to a coupled regulatory/metabolic model. Results: We present TIGER, a Toolbox for Integrating Genome-scale Metabolism, Expression, and Regulation. TIGER converts a series of generalized, Boolean or multilevel rules into a set of mixed integer inequalities. The package also includes implementations of existing algorithms to integrate high-throughput expression data with genome-scale models of metabolism and transcriptional regulation. We demonstrate how TIGER automates the coupling of a genome-scale metabolic model with GPR logic and models of transcriptional regulation, thereby serving as a platform for algorithm development and large-scale metabolic analysis. Additionally, we demonstrate how TIGER's algorithms can be used to identify inconsistencies and improve existing models of transcriptional regulation with examples from the reconstructed transcriptional regulatory network of Saccharomyces cerevisiae. Conclusion: The TIGER package provides a consistent platform for algorithm development and extending existing genome-scale metabolic models with regulatory networks and high-throughput data.
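
    The conversion of Boolean rules into mixed integer inequalities mentioned in this abstract can be illustrated with the textbook linearization of AND and OR over binary variables. The sketch below is an illustration under stated assumptions, not TIGER's own code or API: the function names and the example rule "(geneA AND geneB) OR geneC" are hypothetical, and the inequalities are checked by brute-force enumeration rather than by a solver.

```python
from itertools import product


def and_constraints(x, y, z):
    """Inequalities that hold exactly when z == (x AND y) for binary x, y, z."""
    return z <= x and z <= y and z >= x + y - 1


def or_constraints(x, y, z):
    """Inequalities that hold exactly when z == (x OR y) for binary x, y, z."""
    return z >= x and z >= y and z <= x + y


def feasible_outputs(constraints, x, y):
    """All binary values of z that satisfy the inequalities for fixed x, y."""
    return [z for z in (0, 1) if constraints(x, y, z)]


def check_encoding():
    """True if the inequalities reproduce the AND/OR truth tables exactly."""
    for x, y in product((0, 1), repeat=2):
        if feasible_outputs(and_constraints, x, y) != [x & y]:
            return False
        if feasible_outputs(or_constraints, x, y) != [x | y]:
            return False
    return True


def reaction_states(gene_a, gene_b, gene_c):
    """Feasible reaction states for the hypothetical rule (A AND B) OR C.

    An auxiliary binary variable (complex_ab) stands for the sub-expression
    A AND B, which is how nested rules are usually flattened.
    """
    states = []
    for complex_ab, reaction in product((0, 1), repeat=2):
        if (and_constraints(gene_a, gene_b, complex_ab)
                and or_constraints(complex_ab, gene_c, reaction)):
            states.append(reaction)
    return states
```

    For fixed gene states the inequalities leave exactly one feasible reaction state, which is what lets an optimizer treat the rule as a set of ordinary linear constraints.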

  14. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network

    DEFF Research Database (Denmark)

    Förster, Jochen; Famili, I.; Fu, P.; Palsson, B.O.; Nielsen, Jens

    2003-01-01

    The metabolic network in the yeast Saccharomyces cerevisiae was reconstructed using currently available genomic, biochemical, and physiological information. The metabolic reactions were compartmentalized between the cytosol and the mitochondria, and transport steps between the compartments and the...... containing 1175 metabolic reactions and 584 metabolites. The number of gene functions included in the reconstructed network corresponds to ~16% of all characterized ORFs in S. cerevisiae. Using the reconstructed network, the metabolic capabilities of S. cerevisiae were calculated and compared with...

  15. Predicting Gene Regulatory Elements in Silico on a Genomic Scale

    OpenAIRE

    Brazma, Alvis; Jonassen, Inge; Vilo, Jaak; Ukkonen, Esko

    1998-01-01

    We performed a systematic analysis of gene upstream regions in the yeast genome for occurrences of regular expression-type patterns with the goal of identifying potential regulatory elements. To achieve this goal, we have developed a new sequence pattern discovery algorithm that searches exhaustively for a priori unknown regular expression-type patterns that are over-represented in a given set of sequences. We applied the algorithm in two cases, (1) discovery of patterns in the complete set o...

  16. Direct-to-consumer genomics on the scales of autonomy.

    Science.gov (United States)

    Vayena, Effy

    2015-04-01

    Direct-to-consumer (DTC) genetic services have generated enormous controversy from their first emergence. A dramatic recent manifestation of this is the Food and Drug Administration's (FDA) cease and desist order against 23andMe, the leading provider in the market. Critics have argued for the restrictive regulation of such services, and even their prohibition, on the grounds of the harm they pose to consumers. Their advocates, by contrast, defend them as a means of enhancing the autonomy of those same consumers. Autonomy emerges as a key battle-field in this debate, because many of the 'harm' arguments can be interpreted as identifying threats to autonomy. This paper assesses whether DTC genomic services are a threat to, or instead, an enhancement of, personal autonomy. It deploys Joseph Raz's account of personal autonomy, with its emphasis on choice from a range of valuable options. It then seeks to counter claims that DTC genomics threatens autonomy because it involves manipulation in contravention of consumers' independence or because it does not generate valuable options which can be meaningfully engaged with by consumers. It is stressed that the value of the options generated by DTC genomics should not be judged exclusively from the perspective of medical actionability, but should take into consideration plural utilities. Finally, the paper ends by broaching policy recommendations, suggesting that there is a strong autonomy-based argument for permitting DTC genomic services, and that the key question is the nature of the regulatory conditions under which they should be permitted. The discussion of autonomy in this paper helps illuminate some of these conditions. PMID:24797610

  17. Comparative Genomic Analysis of Xanthomonas axonopodis pv. citrumelo F1, Which Causes Citrus Bacterial Spot Disease, and Related Strains Provides Insights into Virulence and Host Specificity

    OpenAIRE

    Jalan, Neha; Aritua, Valente; Kumar, Dibyendu; Yu, Fahong; Jones, Jeffrey B; Graham, James H; Setubal, João C; Wang, Nian

    2011-01-01

    Xanthomonas axonopodis pv. citrumelo is a citrus pathogen causing citrus bacterial spot disease that is geographically restricted within the state of Florida. Illumina, 454 sequencing, and optical mapping were used to obtain a complete genome sequence of X. axonopodis pv. citrumelo strain F1, 4.9 Mb in size. The strain lacks plasmids, in contrast to other citrus Xanthomonas pathogens. Phylogenetic analysis revealed that this pathogen is very close to the tomato bacterial spot pathogen X. camp...

  18. Tomato pathogen genome may offer clues about bacterial evolution at the dawn of agriculture

    OpenAIRE

    Sutphin, Michael D.

    2008-01-01

    The availability of new genome sequencing technology has prompted Virginia Tech plant scientist Boris Vinatzer to test an intriguing hypothesis about how agriculture's early beginnings may have impacted the evolution of plant pathogens.

  19. ANItools web: a web tool for fast genome comparison within multiple bacterial strains

    OpenAIRE

    Han, Na; Qiang, Yujun; Zhang, Wen

    2016-01-01

    Background: Early classification of prokaryotes was based solely on phenotypic similarities, but modern prokaryote characterization has been strongly influenced by advances in genetic methods. With the fast development of the sequencing technology, the ever increasing number of genomic sequences per species offers the possibility for developing distance determinations based on whole-genome information. The average nucleotide identity (ANI), calculated from pair-wise comparisons of all sequenc...

  20. What Makes a Bacterial Species Pathogenic?: Comparative Genomic Analysis of the Genus Leptospira

    OpenAIRE

    Fouts, Derrick E.; Matthias, Michael A.; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E.; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L.; Haake, David A.; Haft, Daniel H.; Hartskeerl, Rudy; Ko, Albert I.; Levett, Paul N

    2016-01-01

    Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field....

  1. Genome dynamics of short oligonucleotides: the example of bacterial DNA uptake enhancing sequences.

    Directory of Open Access Journals (Sweden)

    Mohammed Bakkali

    Among the many bacteria naturally competent for transformation by DNA uptake (a phenomenon with significant clinical and financial implications), Pasteurellaceae and Neisseriaceae species preferentially take up DNA containing specific short sequences. The genomic overrepresentation of these DNA uptake enhancing sequences (DUES) causes preferential uptake of conspecific DNA, but the function(s) behind this overrepresentation and its evolution are still a matter for discovery. Here I analyze DUES genome dynamics and evolution and test the validity of the results to other selectively constrained oligonucleotides. I use statistical methods and computer simulations to examine DUES accumulation in Haemophilus influenzae and Neisseria gonorrhoeae genomes. I analyze DUES sequence and nucleotide frequencies, as well as those of all their mismatched forms, and prove the dependence of DUES genomic overrepresentation on their preferential uptake by quantifying and correlating both characteristics. I then argue that mutation, uptake bias, and weak selection against DUES in less constrained parts of the genome are together sufficient to cause DUES accumulation in susceptible parts of the genome with no need for other DUES function. The distribution of overrepresentation values across sequences with different mismatch loads compared to the DUES suggests a gradual yet not linear molecular drive of DNA sequences depending on their similarity to the DUES. Other genomically overrepresented sequences, both pro- and eukaryotic, show a similar distribution of frequencies, suggesting that the molecular drive reported above applies to other frequent oligonucleotides. Rare oligonucleotides, however, seem to be gradually drawn to genomic underrepresentation, thus suggesting a molecular drag. To my knowledge this work provides the first clear evidence of the gradual evolution of selectively constrained oligonucleotides, including repeated, palindromic and protein
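
    The notion of genomic overrepresentation of a short sequence and of its mismatched forms can be made concrete with a small observed-versus-expected calculation. The sketch below is only one simple way to define it and is not the statistical method of the paper: it assumes a single strand, overlapping matches, and an expectation based on independent bases drawn at the sequence's own nucleotide frequencies. All function names are hypothetical.

```python
from collections import Counter


def count_occurrences(seq, motif):
    """Count overlapping occurrences of motif in seq (single strand only)."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def expected_count(seq, motif):
    """Expected count if bases were independent with seq's own frequencies."""
    freqs = Counter(seq)
    n = len(seq)
    probability = 1.0
    for base in motif:
        probability *= freqs[base] / n
    return probability * (n - len(motif) + 1)


def overrepresentation(seq, motif):
    """Observed / expected ratio; values above 1 indicate overrepresentation."""
    expected = expected_count(seq, motif)
    if expected == 0:
        return float("nan")
    return count_occurrences(seq, motif) / expected


def one_mismatch_variants(motif, alphabet="ACGT"):
    """All sequences that differ from motif at exactly one position."""
    variants = []
    for i, base in enumerate(motif):
        for alternative in alphabet:
            if alternative != base:
                variants.append(motif[:i] + alternative + motif[i + 1:])
    return variants
```

    Computing `overrepresentation` for a motif and for each of its `one_mismatch_variants` gives a rough picture of how the ratio changes with mismatch load; a real analysis would also scan the reverse complement and use a better background model.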

  2. Whole genome amplification and de novo assembly of single bacterial cells.

    Directory of Open Access Journals (Sweden)

    Sébastien Rodrigue

    BACKGROUND: Single-cell genome sequencing has the potential to allow the in-depth exploration of the vast genetic diversity found in uncultured microbes. We used the marine cyanobacterium Prochlorococcus as a model system for addressing important challenges facing high-throughput whole genome amplification (WGA) and complete genome sequencing of individual cells. METHODOLOGY/PRINCIPAL FINDINGS: We describe a pipeline that enables single-cell WGA on hundreds of cells at a time while virtually eliminating non-target DNA from the reactions. We further developed a post-amplification normalization procedure that mitigates extreme variations in sequencing coverage associated with multiple displacement amplification (MDA), and demonstrated that the procedure increased sequencing efficiency and facilitated genome assembly. We report genome recovery as high as 99.6% with reference-guided assembly, and 95% with de novo assembly starting from a single cell. We also analyzed the impact of chimera formation during MDA on de novo assembly, and discuss strategies to minimize the presence of incorrectly joined regions in contigs. CONCLUSIONS/SIGNIFICANCE: The methods described in this paper will be useful for sequencing genomes of individual cells from a variety of samples.
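
    A genome recovery figure such as those quoted in this abstract is commonly understood as the fraction of reference positions covered by assembled or aligned sequence. The sketch below shows that calculation under assumptions of my own: the input is a list of half-open (start, end) intervals on a linear reference, and the function name is hypothetical. It does not reproduce the authors' pipeline.

```python
def genome_recovery(intervals, genome_length):
    """Fraction of reference positions covered by at least one interval.

    intervals: iterable of half-open (start, end) coordinates on the reference.
    Overlapping intervals are merged so no position is counted twice.
    """
    covered = 0
    current_start = None
    current_end = None
    for start, end in sorted(intervals):
        # Clip to the reference so out-of-range coordinates are ignored.
        start = max(0, start)
        end = min(genome_length, end)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            # Close the previous merged block and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps or touches the current block: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length
```

    For example, three contigs aligned at (0, 50), (40, 80) and (90, 100) on a 100-position reference cover 90 positions once the overlap is merged, giving a recovery of 0.9.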

  3. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium.

    Science.gov (United States)

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria that cooperate with Ketogulonigenium vulgare in two-step vitamin C fermentation. The two Bacillus species differ in morphology, swarming motility and 2-keto-L-gulonic acid productivity when co-cultured with K. vulgare. Here, we report the complete genome sequencing of B. thuringiensis Bc601 and eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomics analysis. B. thuringiensis Bc601, which has a greater ability to respond to the external environment, was found to have more proteins related to two-component systems, sporulation coat and peptidoglycan biosynthesis than B. endophyticus Hbe603. B. endophyticus Hbe603, which has a greater capacity for nutrient biosynthesis, was found to have more proteins related to alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation than B. thuringiensis Bc601. The differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms in the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis enables us to further understand the cooperative mechanism with K. vulgare and facilitates the optimization of the bacterial consortium. PMID:27353048

  4. Compiling Multicopy Single-Stranded DNA Sequences from Bacterial Genome Sequences

    OpenAIRE

    Yoo, Wonseok; Lim, Dongbin; Kim, Sangsoo

    2016-01-01

    A retron is a bacterial retroelement that encodes an RNA gene and a reverse transcriptase (RT). The former, once transcribed, works as a template primer for reverse transcription by the latter. The resulting DNA is covalently linked to the upstream part of the RNA; this chimera is called multicopy single-stranded DNA (msDNA), which is extrachromosomal DNA found in many bacterial species. Based on the conserved features in the eight known msDNA sequences, we developed a detection method and ap...

  5. From genetic circuits to industrial-scale biomanufacturing: bacterial promoters as a cornerstone of biotechnology

    Directory of Open Access Journals (Sweden)

    Pawel Jajesniak

    2015-08-01

    Since the advent of genetic engineering, Escherichia coli, the most widely studied prokaryotic model organism, and other bacterial species have remained at the forefront of biological research. These ubiquitous microorganisms play an essential role in deciphering complex gene regulation mechanisms, large-scale recombinant protein production, and lately the two emerging areas of biotechnology: synthetic biology and metabolic engineering. Among a myriad of factors affecting prokaryotic gene expression, judicious choice of promoter remains one of the most challenging and impactful decisions in many biological experiments. This review provides a comprehensive overview of the current state of bacterial promoter engineering, with an emphasis on its applications in heterologous protein production, synthetic biology and metabolic engineering. In addition to highlighting relevant advances in these fields, the article facilitates the selection of an appropriate promoter by providing pertinent guidelines and explores the development of complementary databases, bioinformatics tools and promoter standardization procedures. The review ends by providing a quick overview of other emerging technologies and future prospects of this vital research area.

  6. Complete genome sequence of the extremely acidophilic methanotroph isolate V4, Methylacidiphilum infernorum, a representative of the bacterial phylum Verrucomicrobia

    Directory of Open Access Journals (Sweden)

    Stott Matthew B

    2008-07-01

    Background: The phylum Verrucomicrobia is a widespread but poorly characterized bacterial clade. Although cultivation-independent approaches detect representatives of this phylum in a wide range of environments, including soils, seawater, hot springs and the human gastrointestinal tract, only a few have been isolated in pure culture. We have recently reported cultivation and initial characterization of an extremely acidophilic methanotrophic member of the Verrucomicrobia, strain V4, isolated from the Hell's Gate geothermal area in New Zealand. Similar organisms were independently isolated from geothermal systems in Italy and Russia. Results: We report the complete genome sequence of strain V4, the first one from a representative of the Verrucomicrobia. Isolate V4, initially named "Methylokorus infernorum" (and recently renamed Methylacidiphilum infernorum), is an autotrophic bacterium with a streamlined genome of ~2.3 Mbp that encodes simple signal transduction pathways and has a limited potential for regulation of gene expression. Central metabolism of M. infernorum was reconstructed almost completely and revealed highly interconnected pathways of autotrophic central metabolism and modifications of C1-utilization pathways compared to other known methylotrophs. The M. infernorum genome does not encode tubulin, which was previously discovered in bacteria of the genus Prosthecobacter, or close homologs of any other signature eukaryotic proteins. Phylogenetic analysis of ribosomal proteins and RNA polymerase subunits unequivocally supports grouping Planctomycetes, Verrucomicrobia and Chlamydiae into a single clade, the PVC superphylum, despite dramatically different gene content in members of these three groups. Comparative-genomic analysis suggests that evolution of the M. infernorum lineage involved extensive horizontal gene exchange with a variety of bacteria. The genome of M. infernorum shows apparent adaptations for existence under extremely

  7. Construction of a bacterial artificial chromosome library from the spikemoss Selaginella moellendorffii: a new resource for plant comparative genomics

    OpenAIRE

    Chapple Clint; Carlson John; Arumuganathan K; Mueller Christopher; Kudrna Dave; Weng Jing-Ke; Kim Hye Ran; Sisneros Nicholas; Luo Meizhong; Tanurdzic Milos; Wang Wenming; de Pamphilis Claude; Mandoli Dina; Tomkins Jeff; Wing Rod A

    2005-01-01

    Background: The lycophytes are an ancient lineage of vascular plants that diverged from the seed plant lineage about 400 Myr ago. Although the lycophytes occupy an important phylogenetic position for understanding the evolution of plants and their genomes, no genomic resources exist for this group of plants. Results: Here we describe the construction of a large-insert bacterial artificial chromosome (BAC) library from the lycophyte Selaginella moellendorffii. Based on cell flow cytomet...

  8. Reconstruction of the genome-scale metabolic network of Kluyveromyces lactis

    OpenAIRE

    Dias, Oscar

    2013-01-01

    Systems Biology proposes to study biological components, as well as the interactions between them, to understand and predict systems’ behaviour through the use of mathematical models. Under this scope, Genome-Scale Metabolic Models (GSMMs) can be regarded as mathematical representations of the intrinsic metabolic capabilities of a given organism, encoded in its genome, and can be used in a variety of applications like predicting the phenotypical behaviour of a given organism in ...

  9. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system

    OpenAIRE

    Speth, D.R.; Zandt, M.H. in 't; Guerrero Cruz, S.; Dutilh, B.E.; Jetten, M. S. M.

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete d...

  10. cisRED: a database system for genome-scale computational discovery of regulatory elements

    OpenAIRE

    Robertson, G; Bilenky, M.; Lin, K.; He, A.; Yuen, W.; Dagpinar, M.; Varhol, R.; Teague, K.; Griffith, O L; Zhang, X; Pan, Y.; Hassel, M.; Sleumer, M. C.; Pan, W; Pleasance, E. D.

    2005-01-01

    We describe cisRED, a database for conserved regulatory elements that are identified and ranked by a genome-scale computational system. The database and high-throughput predictive pipeline are designed to address diverse target genomes in the context of rapidly evolving data resources and tools. Motifs are predicted in promoter regions using multiple discovery methods applied to sequence sets that include corresponding sequence regions from vertebrates. We estimate motif significance by ap...

  11. Reconstruction of a genome-scale metabolic network for Streptococcus pneumoniae R6

    OpenAIRE

    J.P. Saraiva; Pinto, Francisco; Rocha, I

    2013-01-01

    The gram-positive, lancet-shaped bacterium Streptococcus pneumoniae thrives in almost any environment. Under certain conditions this pathogen can cause several infections such as meningitis, otitis media, endocarditis or pneumonia. Genome-scale metabolic networks (GSMs) are commonly used to study phenotype-genotype relationships using biochemical, physiological and genomic information. These relationships might shed some light on identification of targets for metabolic engineering or, in t...

  12. Enabling technologies of genomic-scale sequence enrichment for targeted high-throughput sequencing

    OpenAIRE

    Summerer, Daniel

    2009-01-01

    Next-generation sequencing has still not reached its full potential due to the technical inability to effectively target desired genomic regions of interest. Once available, methods addressing this bottleneck will dramatically reduce cost and enable the efficient analysis of complex samples. Recently, a number of possible approaches for genomic-scale sequence enrichment have been reported using different strategies. All methods basically rely on sequence-specific nucleic acid hybridization,...

  13. Bacterial community structure and variation in a full-scale seawater desalination plant for drinking water production.

    Science.gov (United States)

    Belila, A; El-Chakhtoura, J; Otaibi, N; Muyzer, G; Gonzalez-Gil, G; Saikaly, P E; van Loosdrecht, M C M; Vrouwenvelder, J S

    2016-05-01

    Microbial processes inevitably play a role in membrane-based desalination plants, mainly recognized as membrane biofouling. We assessed the bacterial community structure and diversity during different treatment steps in a full-scale seawater desalination plant producing 40,000 m³/d of drinking water. Water samples were taken over the full treatment train consisting of chlorination, spruce media and cartridge filters, de-chlorination, first and second pass reverse osmosis (RO) membranes and final chlorine dosage for drinking water distribution. The water samples were analyzed for water quality parameters (total bacterial cell number, total organic carbon, conductivity, pH, etc.) and microbial community composition by 16S rRNA gene pyrosequencing. The planktonic microbial community was dominated by Proteobacteria (48.6%) followed by Bacteroidetes (15%), Firmicutes (9.3%) and Cyanobacteria (4.9%). During the pretreatment step, the spruce media filter did not impact the bacterial community composition dominated by Proteobacteria. In contrast, the RO and final chlorination treatment steps reduced the Proteobacterial relative abundance in the produced water where Firmicutes constituted the most dominant bacterial group. Shannon and Chao1 diversity indices showed that bacterial species richness and diversity decreased during the seawater desalination process. The two-stage RO filtration strongly reduced the water conductivity (>99%), TOC concentration (98.5%) and total bacterial cell number (>99%), albeit some bacterial DNA was found in the water after RO filtration. About 0.25% of the total bacterial operational taxonomic units (OTUs) were present in all stages of the desalination plant: the seawater, the RO permeates and the chlorinated drinking water, suggesting that these bacterial strains can survive in different environments such as high/low salt concentration and with/without residual disinfectant. These bacterial strains were not caused by contamination during
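
    The Shannon and Chao1 indices mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the common definitions (Shannon entropy with natural logarithm, and the bias-corrected form of Chao1); the abstract does not say which variants the authors used, and the OTU counts below are hypothetical values chosen for illustration, not data from the study.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs seen exactly once and exactly twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical OTU read counts for two samples (assumed, not from the paper).
seawater = [50, 30, 10, 5, 2, 2, 1, 1, 1]
permeate = [90, 8, 1, 1]

for name, sample in (("seawater", seawater), ("permeate", permeate)):
    print(name, round(shannon_index(sample), 3), chao1(sample))
```

    In this toy example the sample dominated by a single OTU has both the lower Shannon index and the lower Chao1 estimate, which is the kind of decrease in diversity and richness the abstract reports along the treatment train.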

  14. Bacterial community structure and variation in a full-scale seawater desalination plant for drinking water production

    KAUST Repository

    Belila, A.

    2016-02-18

    Microbial processes inevitably play a role in membrane-based desalination plants, mainly recognized as membrane biofouling. We assessed the bacterial community structure and diversity during different treatment steps in a full-scale seawater desalination plant producing 40,000 m³/d of drinking water. Water samples were taken over the full treatment train consisting of chlorination, spruce media and cartridge filters, de-chlorination, first and second pass reverse osmosis (RO) membranes and final chlorine dosage for drinking water distribution. The water samples were analyzed for water quality parameters (total bacterial cell number, total organic carbon, conductivity, pH, etc.) and microbial community composition by 16S rRNA gene pyrosequencing. The planktonic microbial community was dominated by Proteobacteria (48.6%) followed by Bacteroidetes (15%), Firmicutes (9.3%) and Cyanobacteria (4.9%). During the pretreatment step, the spruce media filter did not impact the bacterial community composition dominated by Proteobacteria. In contrast, the RO and final chlorination treatment steps reduced the Proteobacterial relative abundance in the produced water where Firmicutes constituted the most dominant bacterial group. Shannon and Chao1 diversity indices showed that bacterial species richness and diversity decreased during the seawater desalination process. The two-stage RO filtration strongly reduced the water conductivity (>99%), TOC concentration (98.5%) and total bacterial cell number (>99%), albeit some bacterial DNA was found in the water after RO filtration. About 0.25% of the total bacterial operational taxonomic units (OTUs) were present in all stages of the desalination plant: the seawater, the RO permeates and the chlorinated drinking water, suggesting that these bacterial strains can survive in different environments such as high/low salt concentration and with/without residual disinfectant. These bacterial strains were not caused by contamination during

  15. Comparative Genomics between Two Xenorhabdus bovienii Strains Highlights Differential Evolutionary Scenarios within an Entomopathogenic Bacterial Species.

    Science.gov (United States)

    Bisch, Gaëlle; Ogier, Jean-Claude; Médigue, Claudine; Rouy, Zoé; Vincent, Stéphanie; Tailliez, Patrick; Givaudan, Alain; Gaudriault, Sophie

    2016-01-01

    Bacteria of the genus Xenorhabdus are symbionts of soil entomopathogenic nematodes of the genus Steinernema. This symbiotic association constitutes an insecticidal complex active against a wide range of insect pests. Within Xenorhabdus bovienii species, the X. bovienii CS03 strain (Xb CS03) is nonvirulent when directly injected into lepidopteran insects, and displays a low virulence when associated with its Steinernema symbiont. The genome of Xb CS03 was sequenced and compared with the genome of a virulent strain, X. bovienii SS-2004 (Xb SS-2004). The genome size and content widely differed between the two strains. Indeed, Xb CS03 had a large genome containing several specific loci involved in the inhibition of competitors, including a few NRPS-PKS loci (nonribosomal peptide synthetases and polyketide synthases) producing antimicrobial molecules. Consistently, Xb CS03 had a greater antimicrobial activity than Xb SS-2004. The Xb CS03 strain contained more pseudogenes than Xb SS-2004. Decay of genes involved in the host invasion and exploitation (toxins, invasins, or extracellular enzymes) was particularly important in Xb CS03. This may provide an explanation for the nonvirulence of the strain when injected into an insect host. We suggest that Xb CS03 and Xb SS-2004 followed divergent evolutionary scenarios to cope with their peculiar life cycle. The fitness strategy of Xb CS03 would involve competitor inhibition, whereas Xb SS-2004 would quickly and efficiently kill the insect host. Hence, Xenorhabdus strains would have widely divergent host exploitation strategies, which impact their genome structure. PMID:26769959

  16. Adaptation in Toxic Environments: Arsenic Genomic Islands in the Bacterial Genus Thiomonas

    Science.gov (United States)

    Freel, Kelle C.; Krueger, Martin C.; Farasin, Julien; Brochier-Armanet, Céline; Barbe, Valérie; Andrès, Jeremy; Cholley, Pierre-Etienne; Dillies, Marie-Agnès; Jagla, Bernd; Koechler, Sandrine; Leva, Yann; Magdelenat, Ghislaine; Plewniak, Frédéric; Proux, Caroline; Coppée, Jean-Yves; Bertin, Philippe N.; Heipieper, Hermann J.; Arsène-Ploetze, Florence

    2015-01-01

    Acid mine drainage (AMD) is a highly toxic environment for most living organisms due to the presence of many lethal elements including arsenic (As). Thiomonas (Tm.) bacteria are found ubiquitously in AMD and can withstand these extreme conditions, in part because they are able to oxidize arsenite. In order to further improve our knowledge concerning the adaptive capacities of these bacteria, we sequenced and assembled the genome of six isolates derived from the Carnoulès AMD, and compared them to the genomes of Tm. arsenitoxydans 3As (isolated from the same site) and Tm. intermedia K12 (isolated from a sewage pipe). A detailed analysis of the Tm. sp. CB2 genome revealed various rearrangements had occurred in comparison to what was observed in 3As and K12 and over 20 genomic islands (GEIs) were found in each of these three genomes. We performed a detailed comparison of the two arsenic-related islands found in CB2, carrying the genes required for arsenite oxidation and As resistance, with those found in K12, 3As, and five other Thiomonas strains also isolated from Carnoulès (CB1, CB3, CB6, ACO3 and ACO7). Our results suggest that these arsenic-related islands have evolved differentially in these closely related Thiomonas strains, leading to divergent capacities to survive in As rich environments. PMID:26422469

  17. Bayesian prediction of bacterial growth temperature range based on genome sequences

    DEFF Research Database (Denmark)

    Jensen, Dan Børge; Vesth, Tammi Camilla; Hallin, Peter Fischer;

    2012-01-01

    genomic sequence, would thus allow for an efficient and targeted search for production organisms, reducing the need for culturing experiments. Results: This study found a total of 40 protein families useful for distinction between three thermophilicity classes (thermophiles, mesophiles and psychrophiles......). The predictive performance of these protein families was compared to that of 87 basic sequence features (relative use of amino acids and codons, genomic and 16S rDNA AT content and genome size). When using naive Bayesian inference, it was possible to correctly predict the optimal temperature range...... with a Matthews correlation coefficient of up to 0.68. The best predictive performance was always achieved by including protein families as well as structural features, compared to either of these alone. A dedicated computer program was created to perform these predictions. Conclusions: This study...
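
    The Matthews correlation coefficient quoted above can be computed from a confusion matrix. The sketch below shows the two-class form only, which is a simplifying assumption: the study distinguishes three thermophilicity classes, and the abstract does not state how the coefficient was derived for them. The counts in the example are hypothetical.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention for the
    otherwise undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for a "thermophile vs. not" classifier (assumed values).
print(round(matthews_corrcoef(tp=40, tn=45, fp=5, fn=10), 3))
```

    The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), so a value of 0.68 indicates substantial but imperfect agreement.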

  18. Reconstruction of genome-scale human metabolic models using omics data

    DEFF Research Database (Denmark)

    Ryu, Jae Yong; Kim, Hyun Uk; Lee, Sang Yup

    2015-01-01

    The impact of genome-scale human metabolic models on human systems biology and medical sciences is becoming greater, thanks to increasing volumes of model building platforms and publicly available omics data. The genome-scale human metabolic models started with Recon 1 in 2007, and have since been...... used to describe metabolic phenotypes of healthy and diseased human tissues and cells, and to predict therapeutic targets. Here we review recent trends in genome-scale human metabolic modeling, including various generic and tissue/cell type-specific human metabolic models developed to date, and methods......, databases and platforms used to construct them. For generic human metabolic models, we pay attention to Recon 2 and HMR 2.0 with emphasis on data sources used to construct them. Draft and high-quality tissue/cell type-specific human metabolic models have been generated using these generic human metabolic...

  19. Characteristic Length Scale of Electric Transport Properties of Genomes

    CERN Document Server

    Shih, C T

    2005-01-01

    A tight-binding model together with a novel statistical method are used to investigate the relation between the sequence-dependent electric transport properties and the sequences of protein-coding regions of complete genomes. A correlation parameter $\Omega$ is defined to analyze the relation. For some particular propagation length $w_{max}$, the transport behaviors of the coding and non-coding sequences are very different and the correlation reaches its maximal value $\Omega_{max}$. $w_{max}$ and $\Omega_{max}$ are characteristic values for each species. A possible reason for the difference between the transport properties of the coding and non-coding regions is the mechanism of DNA damage repair processes together with natural selection.

  20. A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm

    Directory of Open Access Journals (Sweden)

    Allen Eric E

    2008-10-01

    large-scale HGT patterns among protein families and genome groups. Although the DarkHorse algorithm cannot, by itself, provide definitive proof of horizontal gene transfer, it is a flexible, powerful tool that can be combined with slower, more rigorous methods in situations where these other methods could not otherwise be applied.

  1. First Complete Genome Sequence of Tenacibaculum dicentrarchi, an Emerging Bacterial Pathogen of Salmonids

    OpenAIRE

    Grothusen, Horst; Castillo, Alejandro; Henríquez, Patricio; Navas, Esteban; Bohle, Harry; Araya, Carolina; Bustamante, Fernando; Bustos, Patricio; Mancilla, Marcos

    2016-01-01

    Tenacibaculum-like bacilli have recently been isolated from diseased sea-reared Atlantic salmon in outbreaks that took place in the XI region (Región de Aysén) of Chile. Molecular typing identified the bacterium as Tenacibaculum dicentrarchi. Here, we report the complete genome sequence of the AY7486TD isolate recovered during those outbreaks.

  2. First Complete Genome Sequence of Tenacibaculum dicentrarchi, an Emerging Bacterial Pathogen of Salmonids.

    Science.gov (United States)

    Grothusen, Horst; Castillo, Alejandro; Henríquez, Patricio; Navas, Esteban; Bohle, Harry; Araya, Carolina; Bustamante, Fernando; Bustos, Patricio; Mancilla, Marcos

    2016-01-01

    Tenacibaculum-like bacilli have recently been isolated from diseased sea-reared Atlantic salmon in outbreaks that took place in the XI region (Región de Aysén) of Chile. Molecular typing identified the bacterium as Tenacibaculum dicentrarchi. Here, we report the complete genome sequence of the AY7486TD isolate recovered during those outbreaks. PMID:26893432

  3. Draft genome sequence of Erwinia tracheiphila, an economically important bacterial pathogen of cucurbits

    Science.gov (United States)

    Erwinia tracheiphila is one of the most economically important pathogens of cucumbers, melons, squashes, pumpkins, and gourds in the Northeastern and Midwestern United States, yet the molecular pathology remains uninvestigated. Here we report the first draft genome sequence of an E. tracheiphila str...

  4. On the limits of computational functional genomics for bacterial lifestyle prediction

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Röttger, Richard; Hauschild, Anne-Christin; Azevedo, Vasco; Baumbach, Jan

    observation bias, i.e. many HPs might yet be unclassified BPs. (H4) There is no intrinsic genomic characteristic of OPs compared with pathogens, as small mutations are likely to play a more dominant role to survive the immune system. To study these hypotheses, we implemented a bioinformatics pipeline that...

  5. On the limits of computational functional genomics for bacterial lifestyle prediction

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Röttger, Richard; Hauschild, Anne-Christin; Azevedo, Vasco; Baumbach, Jan

    2014-01-01

    observation bias, i.e. many HPs might yet be unclassified BPs. (H4) There is no intrinsic genomic characteristic of OPs compared with pathogens, as small mutations are likely to play a more dominant role to survive the immune system. To study these hypotheses, we implemented a bioinformatics pipeline that...

  6. PathogenFinder - Distinguishing Friend from Foe Using Bacterial Whole Genome Sequence Data

    DEFF Research Database (Denmark)

    Cosentino, Salvatore; Larsen, Mette Voldby; Aarestrup, Frank Møller;

    2013-01-01

    Although the majority of bacteria are harmless or even beneficial to their host, others are highly virulent and can cause serious diseases, and even death. Due to the constantly decreasing cost of high-throughput sequencing there are now many completely sequenced genomes available from both human...

  7. Using Mahalanobis distance to compare genomic signatures between bacterial plasmids and chromosomes

    OpenAIRE

    Suzuki, Haruo; Sota, Masahiro; Brown, Celeste J.; Top, Eva M.

    2008-01-01

    Plasmids are ubiquitous mobile elements that serve as a pool of many host beneficial traits such as antibiotic resistance in bacterial communities. To understand the importance of plasmids in horizontal gene transfer, we need to gain insight into the ‘evolutionary history’ of these plasmids, i.e. the range of hosts in which they have evolved. Since extensive data support the proposal that foreign DNA acquires the host's nucleotide composition during long-term residence, comparison of nucleoti...
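
    The Mahalanobis distance named in the title measures how far a point lies from the centre of a reference distribution, scaled by that distribution's covariance. The sketch below is a minimal two-feature illustration in plain Python. Treating a plasmid as a point of two composition features and a chromosome as a cloud of such points from genomic windows is an assumption made here for illustration; the truncated abstract does not specify the features or the procedure.

```python
import math


def mean_and_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(x, mean, cov):
    """sqrt((x - mean)^T * inverse(cov) * (x - mean)) for a 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # must be nonzero for the inverse to exist
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # inverse of [[sxx, sxy], [sxy, syy]] is [[syy, -sxy], [-sxy, sxx]] / det
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(quad)


# Hypothetical composition features of chromosome windows (assumed values).
windows = [(0.10, 0.20), (0.12, 0.22), (0.11, 0.19), (0.09, 0.21), (0.13, 0.18)]
mean, cov = mean_and_cov_2d(windows)
# Hypothetical plasmid signature compared against that chromosome.
print(mahalanobis_2d((0.20, 0.10), mean, cov))
```

    Unlike the Euclidean distance, this measure accounts for the variance of each feature and for correlation between features, so a deviation along a direction in which the reference naturally varies counts for less.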

  8. Draft Genome Sequence of Paracoccus sp. MKU1, a New Bacterial Strain Isolated from an Industrial Effluent with Potential for Bioremediation

    Science.gov (United States)

    Nisha, Kamaldeen Nasrin; Sridhar, Jayavel; Varalakshmi, Perumal; Ashokkumar, Balasubramaniem

    2016-01-01

    Paracoccus sp. MKU1, a novel dimethylformamide-degrading bacterial strain, was originally isolated from an industrial effluent in the Tirupur region, Tamil Nadu, India. Here, we report the draft genome sequence of Paracoccus sp. MKU1, which could provide genetic insights into its evolution and into the application of this versatile bacterium for effective degradation of xenobiotics and thus in bioremediation.

  9. Genome-wide identification of Hsp70 genes in channel catfish and their regulated expression after bacterial infection.

    Science.gov (United States)

    Song, Lin; Li, Chao; Xie, Yangjie; Liu, Shikai; Zhang, Jiaren; Yao, Jun; Jiang, Chen; Li, Yun; Liu, Zhanjiang

    2016-02-01

    Heat shock proteins 70/110 (Hsp70/110) are a family of conserved ubiquitously expressed heat shock proteins which are produced by cells in response to exposure to stressful conditions. Besides the chaperone and housekeeping functions, they are also known to be involved in immune response during infection. In this study, we identified 16 Hsp70/110 genes in channel catfish (Ictalurus punctatus) through in silico analysis using RNA-Seq and genome databases. Among them, 12 members of the Hsp70 (Hspa) family and 4 members of the Hsp110 (Hsph) family were identified. Phylogenetic and syntenic analyses provided strong evidence in supporting the orthologies of these HSPs. In addition, we also determined the expression patterns of Hsp70/110 genes after Flavobacterium columnare and Edwardsiella ictaluri infections by meta-analyses, for the first time in channel catfish. Ten out of sixteen genes were significantly up/down-regulated after bacterial challenges. Specifically, nine genes were found significantly expressed in gill after F. columnare infection. Two genes were found significantly expressed in intestine after E. ictaluri infection. Pathogen-specific pattern and tissue-specific pattern were found in the two infections. The significantly regulated expressions of catfish Hsp70 genes after bacterial infections suggested their involvement in immune response in catfish. PMID:26693666

  10. Biomarker-based classification of bacterial and fungal whole-blood infections in a genome-wide expression study

    Directory of Open Access Journals (Sweden)

    Andreas Dix

    2015-03-01

    Sepsis is a clinical syndrome that can be caused by bacteria or fungi. Early knowledge on the nature of the causative agent is a prerequisite for targeted anti-microbial therapy. Besides currently used detection methods like blood culture and PCR-based assays, the analysis of the transcriptional response of the host to infecting organisms holds great promise. In this study, we aim to examine the transcriptional footprint of infections caused by the bacterial pathogens Staphylococcus aureus and Escherichia coli and the fungal pathogens Candida albicans and Aspergillus fumigatus in a human whole-blood model. Moreover, we use the expression information to build a random forest classifier to classify if a sample contains a bacterial, fungal, or mock-infection. After normalizing the transcription intensities using stably expressed reference genes, we filtered the gene set for biomarkers of bacterial or fungal blood infections. This selection is based on differential expression and an additional gene relevance measure. In this way, we identified 38 biomarker genes, including IL6, SOCS3, and IRG1 which were already associated with sepsis by other studies. Using these genes, we trained the classifier and assessed its performance. It yielded a 96% accuracy (sensitivities >93%, specificities >97%) for a 10-fold stratified cross-validation and a 92% accuracy (sensitivities and specificities >83%) for an additional test dataset comprising Cryptococcus neoformans infections. Furthermore, the classifier is robust to Gaussian noise, indicating correct class predictions on datasets of new species. In conclusion, this genome-wide approach demonstrates an effective feature selection process in combination with the construction of a well-performing classification model. Further analyses of genes with pathogen-dependent expression patterns can provide insights into the systemic host responses, which may lead to new anti-microbial therapeutic advances.

  11. Population genomics reveals chromosome-scale heterogeneous evolution in a protoploid yeast.

    Science.gov (United States)

    Friedrich, Anne; Jung, Paul; Reisser, Cyrielle; Fischer, Gilles; Schacherer, Joseph

    2015-01-01

    Yeast species represent an ideal model system for population genomic studies but large-scale polymorphism surveys have only been reported for species of the Saccharomyces genus so far. Hence, little is known about intraspecific diversity and evolution in yeast. To obtain a new insight into the evolutionary forces shaping natural populations, we sequenced the genomes of an expansive worldwide collection of isolates from a species distantly related to Saccharomyces cerevisiae: Lachancea kluyveri (formerly S. kluyveri). We identified 6.5 million single nucleotide polymorphisms and showed that a large introgression event of 1 Mb of GC-rich sequence in the chromosomal arm probably occurred in the last common ancestor of all L. kluyveri strains. Our population genomic data clearly revealed that this 1-Mb region underwent a molecular evolution pattern very different from the rest of the genome. It is characterized by a higher recombination rate, with a dramatically elevated A:T → G:C substitution rate, which is the signature of an increased GC-biased gene conversion. In addition, the predicted base composition at equilibrium demonstrates that the chromosome-scale compositional heterogeneity will persist after the genome has reached mutational equilibrium. Altogether, the data presented herein clearly show that distinct recombination and substitution regimes can coexist and lead to different evolutionary patterns within a single genome. PMID:25349286

  12. Population genomic footprints of fine-scale differentiation between habitats in Mediterranean blue tits.

    Science.gov (United States)

    Szulkin, M; Gagnaire, P-A; Bierne, N; Charmantier, A

    2016-01-01

    Linking population genetic variation to the spatial heterogeneity of the environment is of fundamental interest to evolutionary biology and ecology, in particular when phenotypic differences between populations are observed at biologically small spatial scales. Here, we applied restriction-site associated DNA sequencing (RAD-Seq) to test whether phenotypically differentiated populations of wild blue tits (Cyanistes caeruleus) breeding in a highly heterogeneous environment exhibit genetic structure related to habitat type. Using 12 106 SNPs in 197 individuals from deciduous and evergreen oak woodlands, we applied complementary population genomic analyses, which revealed that genetic variation is influenced by both geographical distance and habitat type. A fine-scale genetic differentiation supported by genome- and transcriptome-wide analyses was found within Corsica, between two adjacent habitats where blue tits exhibit marked differences in breeding time while nesting < 6 km apart. Using redundancy analysis (RDA), we show that genomic variation remains associated with habitat type when controlling for spatial and temporal effects. Finally, our results suggest that the observed patterns of genomic differentiation were not driven by a small proportion of highly differentiated loci, but rather emerged through a process such as habitat choice, which reduces gene flow between habitats across the entire genome. The pattern of genomic isolation-by-environment closely matches differentiation observed at the phenotypic level, thereby offering significant potential for future inference of phenotype-genotype associations in a heterogeneous environment. PMID:26800038

  13. Development and experimental verification of a genome-scale metabolic model for Corynebacterium glutamicum

    Directory of Open Access Journals (Sweden)

    Hirasawa Takashi

    2009-08-01

    Background: In silico genome-scale metabolic models enable the analysis of the characteristics of metabolic systems of organisms. In this study, we reconstructed a genome-scale metabolic model of Corynebacterium glutamicum on the basis of genome sequence annotation and physiological data. The metabolic characteristics were analyzed using flux balance analysis (FBA), and the results of FBA were validated using data from culture experiments performed at different oxygen uptake rates. Results: The reconstructed genome-scale metabolic model of C. glutamicum contains 502 reactions and 423 metabolites. We collected the reactions and biomass components from the database and the literature, and made the model available for flux balance analysis by filling gaps in the reaction networks and removing inadequate loop reactions. Using the framework of FBA and our genome-scale metabolic model, we first simulated the changes in the metabolic flux profiles that occur on changing the oxygen uptake rate. The predicted production yields of carbon dioxide and organic acids agreed well with the experimental data. The metabolic profiles of amino acid production phases were also investigated. A comprehensive gene deletion study was performed in which the effects of gene deletions on metabolic fluxes were simulated; this helped in the identification of several genes whose deletion resulted in an improvement in organic acid production. Conclusion: The genome-scale metabolic model provides useful information for the evaluation of the metabolic capabilities and prediction of the metabolic characteristics of C. glutamicum. This can form a basis for the in silico design of C. glutamicum metabolic networks for improved bioproduction of desirable metabolites.
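
    For orientation, flux balance analysis is conventionally posed as a linear program over steady-state fluxes. The formulation below uses the standard textbook notation, which is an assumption here since the abstract gives no equations: $S$ is the stoichiometric matrix (423 metabolites by 502 reactions for this model), $v$ the flux vector, $c$ the objective weights (typically selecting the biomass reaction), and $v^{\min}$, $v^{\max}$ the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S v = 0, \\
& v^{\min}_j \le v_j \le v^{\max}_j, \qquad j = 1, \dots, 502
\end{aligned}
```

    In this framing, a change in oxygen uptake rate would be expressed by changing the bounds on the oxygen exchange flux, and a gene deletion by setting both bounds of the affected reactions to zero before re-solving.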

  14. Exploring massive, genome scale datasets with the genometricorr package

    KAUST Repository

    Favorov, Alexander

    2012-05-31

    We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of high-dimensional datasets, rather than providing immediate answers. The software enables several biologically motivated approaches to these data and here we describe the rationale and implementation for each approach. Our models and statistics are implemented in an R package that efficiently calculates the spatial correlation between two sets of genomic intervals (data and/or annotated features), for use as a metric of functional interaction. The software handles any type of pointwise or interval data and instead of running analyses with predefined metrics, it computes the significance and direction of several types of spatial association; this is intended to suggest potentially relevant relationships between the datasets. Availability and implementation: The package, GenometriCorr, can be freely downloaded at http://genometricorr.sourceforge.net/. Installation guidelines and examples are available from the sourceforge repository. The package is pending submission to Bioconductor. © 2012 Favorov et al.
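
    As a rough illustration of what "spatial correlation between two sets of genomic intervals" can mean, the sketch below computes a Jaccard index (shared length divided by combined length) for two interval sets on one chromosome. This is a minimal Python sketch, not the package's R implementation; the function names, the half-open (start, end) convention and the toy coordinates are all assumptions made for this example.

```python
def merge(intervals):
    """Merge overlapping half-open intervals given as (start, end) pairs."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def total_length(intervals):
    """Sum of interval lengths (intervals are assumed to be merged already)."""
    return sum(end - start for start, end in intervals)


def intersection_length(a, b):
    """Number of positions covered by both interval sets."""
    a, b = merge(a), merge(b)
    i = j = covered = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            covered += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return covered


def jaccard(a, b):
    """Jaccard index: shared length divided by the length of the union."""
    inter = intersection_length(a, b)
    union = total_length(merge(a)) + total_length(merge(b)) - inter
    return inter / union if union else 0.0


# Hypothetical toy data: two small interval sets on the same chromosome.
query = [(0, 10), (20, 30)]
reference = [(5, 15), (20, 30)]
print(jaccard(query, reference))
```

    A value near 1 means the two sets cover almost the same positions, and a value near 0 means they barely overlap. Judging significance would additionally need a null model such as permuted interval positions, which this sketch omits.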

  15. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models.

    Science.gov (United States)

    King, Zachary A; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A; Ebrahim, Ali; Palsson, Bernhard O; Lewis, Nathan E

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data. PMID:26476456

  16. LAF: Logic Alignment Free and its application to bacterial genomes classification

    OpenAIRE

    Weitschek, Emanuel; Cunial, Fabio; Felici, Giovanni

    2015-01-01

    Alignment-free algorithms can be used to estimate the similarity of biological sequences and hence are often applied to the phylogenetic reconstruction of genomes. Most of these algorithms rely on comparing the frequency of all the distinct substrings of fixed length (k-mers) that occur in the analyzed sequences. In this paper, we present Logic Alignment Free (LAF), a method that combines alignment-free techniques and rule-based classification algorithms in order to assign biological samples ...
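
    The k-mer frequency comparison that most alignment-free methods rely on can be sketched in a few lines. The Python example below is a generic illustration, not the LAF method itself: the function names, the choice of Euclidean distance and the toy sequences are assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k):
    """Relative frequencies of all overlapping k-mers in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:  # sequence shorter than k
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def profile_distance(p, q):
    """Euclidean distance between two k-mer frequency profiles."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in keys))


# Hypothetical toy sequences: b differs from a by one base, c is unrelated.
a = kmer_profile("ACGTACGTACGT", 2)
b = kmer_profile("ACGTACGTACGA", 2)
c = kmer_profile("TTTTTTTTTTTT", 2)
print(profile_distance(a, b) < profile_distance(a, c))
```

    Real genomes would call for a larger k and usually for counting reverse complements together; both are left out here to keep the sketch short.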

  17. Bacterial Catalase in the Microsporidian Nosema locustae: Implications for Microsporidian Metabolism and Genome Evolution

    OpenAIRE

    Fast, Naomi M; Law, Joyce S.; Williams, Bryony A P; Patrick J Keeling

    2003-01-01

    Microsporidia constitute a group of extremely specialized intracellular parasites that infect virtually all animals. They are highly derived, reduced fungi that lack several features typical of other eukaryotes, including canonical mitochondria, flagella, and peroxisomes. Consistent with the absence of peroxisomes in microsporidia, the recently completed genome of the microsporidian Encephalitozoon cuniculi lacks a gene for catalase, the major enzymatic marker for the organelle. We show, howe...

  18. Cloning and Mutagenesis of the Murine Gammaherpesvirus 68 Genome as an Infectious Bacterial Artificial Chromosome

    OpenAIRE

    Adler, Heiko; Messerle, Martin; Wagner, Markus; Koszinowski, Ulrich H.

    2000-01-01

    Gammaherpesviruses cause important infections of humans, in particular in immunocompromised patients. Recently, murine gammaherpesvirus 68 (MHV-68) infection of mice has been developed as a small animal model of gammaherpesvirus pathogenesis. Efficient generation of mutants of MHV-68 would significantly contribute to the understanding of viral gene functions in virus-host interaction, thereby further enhancing the potential of this model. To this end, we cloned the MHV-68 genome as a bacteria...

  19. Solving the Problem of Comparing Whole Bacterial Genomes across Different Sequencing Platforms

    OpenAIRE

    Kaas, Rolf Sommer; Leekitcharoenphon, Pimlapas; Aarestrup, Frank Møller; Lund, Ole

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites ...

  20. Genome-wide identification of Streptococcus pneumoniae genes essential for bacterial replication during experimental meningitis

    DEFF Research Database (Denmark)

    Molzen, T E; Burghout, P; Bootsma, H J;

    2010-01-01

    of invasive pneumococcal disease is required in order to enable the development of new or adjunctive treatments and/or pneumococcal vaccines that are efficient across serotypes. We applied genomic array footprinting (GAF) in the search for S. pneumoniae genes that are essential during experimental...... relevant as targets for future therapy and prevention of pneumococcal meningitis, since their mutants were attenuated in both models of infection as well as in competitive growth in human cerebrospinal fluid in vitro....

  1. Combined Analysis of Variation in Core, Accessory and Regulatory Genome Regions Provides a Super-Resolution View into the Evolution of Bacterial Populations.

    Science.gov (United States)

    McNally, Alan; Oren, Yaara; Kelly, Darren; Pascoe, Ben; Dunn, Steven; Sreecharan, Tristan; Vehkala, Minna; Välimäki, Niko; Prentice, Michael B; Ashour, Amgad; Avram, Oren; Pupko, Tal; Dobrindt, Ulrich; Literak, Ivan; Guenther, Sebastian; Schaufler, Katharina; Wieler, Lothar H; Zhiyong, Zong; Sheppard, Samuel K; McInerney, James O; Corander, Jukka

    2016-09-01

    The use of whole-genome phylogenetic analysis has revolutionized our understanding of the evolution and spread of many important bacterial pathogens due to the high resolution view it provides. However, the majority of such analyses do not consider the potential role of accessory genes when inferring evolutionary trajectories. Moreover, the recently discovered importance of the switching of gene regulatory elements suggests that an exhaustive analysis, combining information from core and accessory genes with regulatory elements could provide unparalleled detail of the evolution of a bacterial population. Here we demonstrate this principle by applying it to a worldwide multi-host sample of the important pathogenic E. coli lineage ST131. Our approach reveals the existence of multiple circulating subtypes of the major drug-resistant clade of ST131 and provides the first ever population level evidence of core genome substitutions in gene regulatory regions associated with the acquisition and maintenance of different accessory genome elements. PMID:27618184

  2. Rapid genome-scale mapping of chromatin accessibility in tissue

    Directory of Open Access Journals (Sweden)

    Grøntved Lars

    2012-06-01

    Full Text Available Abstract Background: The challenge in extracting genome-wide chromatin features from limiting clinical samples poses a significant hurdle in identification of regulatory marks that impact the physiological or pathological state. Current methods that identify nuclease accessible chromatin are reliant on large amounts of purified nuclei as starting material. This complicates analysis of trace clinical tissue samples that are often stored frozen. We have developed an alternative nuclease based procedure to bypass nuclear preparation to interrogate nuclease accessible regions in frozen tissue samples. Results: Here we introduce a novel technique that specifically identifies Tissue Accessible Chromatin (TACh). The TACh method uses pulverized frozen tissue as starting material and employs one of the two robust endonucleases, Benzonase or Cyanase, which are fully active under a range of stringent conditions such as high levels of detergent and DTT. As a proof of principle we applied TACh to frozen mouse liver tissue. Combined with massive parallel sequencing, TACh identifies accessible regions that are associated with euchromatic features, and accessibility at transcriptional start sites correlates positively with levels of gene transcription. Accessible chromatin identified by TACh overlaps to a large extent with accessible chromatin identified by DNase I using nuclei purified from freshly isolated liver tissue as starting material. The similarities are most pronounced at highly accessible regions, whereas identification of less accessible regions tends to be more divergent between nucleases. Interestingly, we show that some of the differences between DNase I and Benzonase relate to their intrinsic sequence biases and accordingly accessibility of CpG islands is probed more efficiently using TACh. Conclusion: The TACh methodology identifies accessible chromatin derived from frozen tissue samples. We propose that this simple, robust approach can be applied

  3. Generation and Evaluation of a Genome-Scale Metabolic Network Model of Synechococcus elongatus PCC7942

    Directory of Open Access Journals (Sweden)

    Julián Triana

    2014-08-01

    Full Text Available The reconstruction of genome-scale metabolic models and their applications represent a great advantage of systems biology. Through their use as metabolic flux simulation models, production of industrially-interesting metabolites can be predicted. Due to the growing number of studies of metabolic models driven by the increasing number of genomic sequencing projects, it is important to conceptualize the steps of reconstruction and analysis. We have focused our work on the cyanobacterium Synechococcus elongatus PCC7942, for which several analyses and insights are unveiled. A comprehensive approach has been used, which can be of interest to guide the process of manual curation and genome-scale metabolic analysis. The final model, iSyf715, includes 851 reactions and 838 metabolites. A biomass equation, which encompasses elementary building blocks to allow cell growth, is also included. The applicability of the model is finally demonstrated by simulating autotrophic growth conditions of Synechococcus elongatus PCC7942.

  4. Genome-scale cold stress response regulatory networks in ten Arabidopsis thaliana ecotypes

    DEFF Research Database (Denmark)

    Barah, Pankaj; Jayavelu, Naresh Doni; Rasmussen, Simon;

    2013-01-01

    available from Arabidopsis thaliana 1001 genome project, we further investigated sequence polymorphisms in the core cold stress regulon genes. Significant numbers of non-synonymous amino acid changes were observed in the coding region of the CBF regulon genes. Considering the limited knowledge about......BACKGROUND: Low temperature leads to major crop losses every year. Although several studies have been conducted focusing on diversity of cold tolerance level in multiple phenotypically divergent Arabidopsis thaliana (A. thaliana) ecotypes, genome-scale molecular understanding is still lacking....... RESULTS: In this study, we report genome-scale transcript response diversity of 10 A. thaliana ecotypes originating from different geographical locations to non-freezing cold stress (10°C). To analyze the transcriptional response diversity, we initially compared transcriptome changes in all 10 ecotypes...

  5. MultiMetEval : Comparative and Multi-Objective Analysis of Genome-Scale Metabolic Models

    NARCIS (Netherlands)

    Zakrzewski, Piotr; Medema, Marnix H.; Gevorgyan, Albert; Kierzek, Andrzej M.; Breitling, Rainer; Takano, Eriko; Fong, Stephen S.

    2012-01-01

    Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the co

  6. Mapping condition-dependent regulation of metabolism in yeast through genome-scale modeling

    DEFF Research Database (Denmark)

    Österlund, Tobias; Nookaew, Intawat; Bordel, Sergio;

    2013-01-01

    : Here we present iTO977, a comprehensive genome-scale metabolic model that contains more reactions, metabolites and genes than previous models. The model was constructed based on two earlier reconstructions, namely iIN800 and the consensus network, and then improved and expanded using gap...

  7. M-GCAT: interactively and efficiently constructing large-scale multiple genome comparison frameworks in closely related species

    Directory of Open Access Journals (Sweden)

    Messeguer Xavier

    2006-10-01

    Full Text Available Abstract Background: Due to recent advances in whole genome shotgun sequencing and assembly technologies, the financial cost of decoding an organism's DNA has been drastically reduced, resulting in a recent explosion of genomic sequencing projects. This increase in related genomic data will allow for in depth studies of evolution in closely related species through multiple whole genome comparisons. Results: To facilitate such comparisons, we present an interactive multiple genome comparison and alignment tool, M-GCAT, that can efficiently construct multiple genome comparison frameworks in closely related species. M-GCAT is able to compare and identify highly conserved regions in up to 20 closely related bacterial species in minutes on a standard computer, and as many as 90 (containing 75 cloned genomes from a set of 15 published enterobacterial genomes) in an hour. M-GCAT also incorporates a novel comparative genomics data visualization interface allowing the user to globally and locally examine and inspect the conserved regions and gene annotations. Conclusion: M-GCAT is an interactive comparative genomics tool well suited for quickly generating multiple genome comparison frameworks and alignments among closely related species. M-GCAT is freely available for download for academic and non-commercial use at: http://alggen.lsi.upc.es/recerca/align/mgcat/intro-mgcat.html.

  8. Solving the Problem of Comparing Whole Bacterial Genomes across Different Sequencing Platforms

    DEFF Research Database (Denmark)

    Kaas, Rolf Sommer; Leekitcharoenphon, Pimlapas; Aarestrup, Frank Møller;

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one...... data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due to a...

  9. A bacterial genome in transition - an exceptional enrichment of IS elements but lack of evidence for recent transposition in the symbiont Amoebophilus asiaticus

    Directory of Open Access Journals (Sweden)

    Penz Thomas

    2011-09-01

    Full Text Available Abstract Background: Insertion sequence (IS) elements are important mediators of genome plasticity and are widespread among bacterial and archaeal genomes. The 1.88 Mbp genome of the obligate intracellular amoeba symbiont Amoebophilus asiaticus contains an unusually large number of transposase genes (n = 354; 23% of all genes). Results: The transposase genes in the A. asiaticus genome can be assigned to 16 different IS elements termed ISCaa1 to ISCaa16, which are represented by 2 to 24 full-length copies, respectively. Despite this high IS element load, the A. asiaticus genome displays a GC skew pattern typical for most bacterial genomes, indicating that no major rearrangements have occurred recently. Additionally, the high sequence divergence of some IS elements, the high number of truncated IS element copies (n = 143), as well as the absence of direct repeats in most IS elements suggest that the IS elements of A. asiaticus are transpositionally inactive. Although we could show transcription of 13 IS elements, we did not find experimental evidence for transpositional activity, corroborating our results from sequence analyses. However, we detected contiguous transcripts between IS elements and their downstream genes at nine loci in the A. asiaticus genome, indicating that some IS elements influence the transcription of downstream genes, some of which might be important for host cell interaction. Conclusions: Taken together, the IS elements in the A. asiaticus genome are currently in the process of degradation and largely represent reflections of the evolutionary past of A. asiaticus in which its genome was shaped by their activity.
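
    The GC skew pattern mentioned in this abstract is conventionally computed as (G - C) / (G + C) over windows along the chromosome; in many bacterial genomes the sign of the skew switches near the replication origin and terminus, so a clean two-phase profile is read as a sign that no large rearrangement has happened recently. The Python sketch below shows the windowed calculation on a made-up sequence. The window size, function names and toy data are assumptions for illustration and are not taken from the paper.

```python
def gc_skew(seq, window):
    """GC skew (G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative(values):
    """Running sum; its extremes mark where the skew changes sign."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


# Hypothetical toy "chromosome": a G-rich half followed by a C-rich half.
toy = "GGGCA" * 4 + "CCCGA" * 4
skews = gc_skew(toy, 5)
print(skews)
print(cumulative(skews))
```

    On real genomes the window is typically thousands of bases wide, and a sequence with many rearrangements would show several sign changes rather than one.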

  10. Niche differentiation of bacterial communities at a millimeter scale in Shark Bay microbial mats

    Science.gov (United States)

    Wong, Hon Lun; Smith, Daniela-Lee; Visscher, Pieter T.; Burns, Brendan P.

    2015-10-01

    Modern microbial mats can provide key insights into early Earth ecosystems, and Shark Bay, Australia, holds one of the best examples of these systems. Identifying the spatial distribution of microorganisms with mat depth facilitates a greater understanding of specific niches and potentially novel microbial interactions. High throughput sequencing coupled with elemental analyses and biogeochemical measurements of two distinct mat types (smooth and pustular) at a millimeter scale were undertaken in the present study. A total of 8,263,982 16S rRNA gene sequences were obtained, which were affiliated with 58 bacterial and candidate phyla. The surfaces of both mats were dominated by Cyanobacteria, accompanied by known or putative members of Alphaproteobacteria and Bacteroidetes. The deeper anoxic layers of smooth mats were dominated by Chloroflexi, while Alphaproteobacteria dominated the lower layers of pustular mats. In situ microelectrode measurements revealed that smooth mats have a steeper profile of O2 and H2S concentrations, as well as higher oxygen production, consumption, and sulfate reduction rates. Specific elements (Mo, Mg, Mn, Fe, V, P) could be correlated with specific mat types and putative phylogenetic groups. Models are proposed for these systems suggesting putative surface anoxic niches, differential nitrogen fixing niches, and those coupled with methane metabolism.

  11. Benchmarking of methods for identification of antimicrobial resistance genes in bacterial whole genome data

    DEFF Research Database (Denmark)

    Clausen, Philip T. L. C.; Zankari, Ea; Aarestrup, Frank Møller;

    2016-01-01

    two different methods in current use for identification of antibiotic resistance genes in bacterial WGS data. A novel method, KmerResistance, which examines the co-occurrence of k-mers between the WGS data and a database of resistance genes, was developed. The performance of this method was compared...... with two previously described methods; ResFinder and SRST2, which use an assembly/BLAST method and BWA, respectively, using two datasets with a total of 339 isolates, covering five species, originating from the Oxford University Hospitals NHS Trust and Danish pig farms. The predicted resistance was...... compared with the observed phenotypes for all isolates. To challenge further the sensitivity of the in silico methods, the datasets were also down-sampled to 1% of the reads and reanalysed. The best results were obtained by identification of resistance genes by mapping directly against the raw reads. This...
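
    The idea of scoring resistance genes by the k-mers they share with raw reads can be illustrated with a toy example. The Python sketch below reports, for each gene in a small database, the fraction of its distinct k-mers that also occur in the reads. It is an assumption-laden illustration of k-mer co-occurrence in general and does not reproduce the scoring used by KmerResistance; the gene names, sequences and the value of k are hypothetical.

```python
def kmers(seq, k):
    """Set of distinct overlapping k-mers in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_coverage(reads, genes, k):
    """Fraction of each gene's distinct k-mers that also occur in the reads."""
    read_kmers = set()
    for read in reads:
        read_kmers |= kmers(read, k)
    coverage = {}
    for name, gene in genes.items():
        gene_kmers = kmers(gene, k)
        shared = len(gene_kmers & read_kmers)
        coverage[name] = shared / len(gene_kmers) if gene_kmers else 0.0
    return coverage


# Hypothetical database of two short "genes" and two reads drawn from geneA.
genes = {"geneA": "ATGGCTAGCTAGGATCC", "geneB": "ATGTTTCCCGGGAAATT"}
reads = ["ATGGCTAGCT", "GCTAGCTAGGATCC"]
cov = kmer_coverage(reads, genes, 5)
print(cov)
```

    Because no assembly step is involved, a comparison of this kind can still find a gene when read depth is low, which fits the abstract's down-sampling test. Reverse-complement reads and sequencing errors are ignored in this sketch.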

  12. Vertebrate Protein CTCF and its Multiple Roles in a Large-Scale Regulation of Genome Activity

    Science.gov (United States)

    Nikolaev, L.G; Akopov, S.B; Didych, D.A; Sverdlov, E.D

    2009-01-01

    The CTCF transcription factor is an 11 zinc fingers multifunctional protein that uses different zinc finger combinations to recognize and bind different sites within DNA. CTCF is thought to participate in various gene regulatory networks including transcription activation and repression, formation of independently functioning chromatin domains and regulation of imprinting. Sequencing of human and other genomes opened up a possibility to ascertain the genomic distribution of CTCF binding sites and to identify CTCF-dependent cis-regulatory elements, including insulators. In the review, we summarized recent data on genomic distribution of CTCF binding sites in the human and other genomes within a framework of the loop domain hypothesis of large-scale regulation of the genome activity. We also tried to formulate possible lines of studies on a variety of CTCF functions which probably depend on its ability to specifically bind DNA, interact with other proteins and form di- and multimers. These three fundamental properties allow CTCF to serve as a transcription factor, an insulator and a constitutive dispersed genome-wide demarcation tool able to recruit various factors that emerge in response to diverse external and internal signals, and thus to exert its signal-specific function(s). PMID:20119526

  13. Vertebrate Protein CTCF and its Multiple Roles in a Large-Scale Regulation of Genome Activity.

    Science.gov (United States)

    Nikolaev, L G; Akopov, S B; Didych, D A; Sverdlov, E D

    2009-08-01

    The CTCF transcription factor is an 11 zinc fingers multifunctional protein that uses different zinc finger combinations to recognize and bind different sites within DNA. CTCF is thought to participate in various gene regulatory networks including transcription activation and repression, formation of independently functioning chromatin domains and regulation of imprinting. Sequencing of human and other genomes opened up a possibility to ascertain the genomic distribution of CTCF binding sites and to identify CTCF-dependent cis-regulatory elements, including insulators. In the review, we summarized recent data on genomic distribution of CTCF binding sites in the human and other genomes within a framework of the loop domain hypothesis of large-scale regulation of the genome activity. We also tried to formulate possible lines of studies on a variety of CTCF functions which probably depend on its ability to specifically bind DNA, interact with other proteins and form di- and multimers. These three fundamental properties allow CTCF to serve as a transcription factor, an insulator and a constitutive dispersed genome-wide demarcation tool able to recruit various factors that emerge in response to diverse external and internal signals, and thus to exert its signal-specific function(s). PMID:20119526

  14. Rare and common regulatory variation in population-scale sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Stephen B Montgomery

    2011-07-01

    Full Text Available Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 genomes project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

  15. The genome sequence of E. coli W (ATCC 9637): comparative genome analysis and an improved genome-scale reconstruction of E. coli

    Directory of Open Access Journals (Sweden)

    Lee Sang

    2011-01-01

    Full Text Available Abstract Background: Escherichia coli is a model prokaryote, an important pathogen, and a key organism for industrial biotechnology. E. coli W (ATCC 9637), one of four strains designated as safe for laboratory purposes, has not been sequenced. E. coli W is a fast-growing strain and is the only safe strain that can utilize sucrose as a carbon source. Lifecycle analysis has demonstrated that sucrose from sugarcane is a preferred carbon source for industrial bioprocesses. Results: We have sequenced and annotated the genome of E. coli W. The chromosome is 4,900,968 bp and encodes 4,764 ORFs. Two plasmids, pRK1 (102,536 bp) and pRK2 (5,360 bp), are also present. W has unique features relative to other sequenced laboratory strains (K-12, B and Crooks): it has a larger genome and belongs to phylogroup B1 rather than A. W also grows on a much broader range of carbon sources than does K-12. A genome-scale reconstruction was developed and validated in order to interrogate metabolic properties. Conclusions: The genome of W is more similar to commensal and pathogenic B1 strains than phylogroup A strains, and therefore has greater utility for comparative analyses with these strains. W should therefore be the strain of choice, or 'type strain' for group B1 comparative analyses. The genome annotation and tools created here are expected to allow further utilization and development of E. coli W as an industrial organism for sucrose-based bioprocesses. Refinements in our E. coli metabolic reconstruction allow it to more accurately define E. coli metabolism relative to previous models.

  16. BiGG: a Biochemical Genetic and Genomic knowledgebase of large scale metabolic reconstructions

    Directory of Open Access Journals (Sweden)

    Conrad Tom M

    2010-04-01

    Full Text Available Abstract Background: Genome-scale metabolic reconstructions under the Constraint Based Reconstruction and Analysis (COBRA) framework are valuable tools for analyzing the metabolic capabilities of organisms and interpreting experimental data. As the number of such reconstructions and analysis methods increases, there is a greater need for data uniformity and ease of distribution and use. Description: We describe BiGG, a knowledgebase of Biochemically, Genetically and Genomically structured genome-scale metabolic network reconstructions. BiGG integrates several published genome-scale metabolic networks into one resource with standard nomenclature which allows components to be compared across different organisms. BiGG can be used to browse model content, visualize metabolic pathway maps, and export SBML files of the models for further analysis by external software packages. Users may follow links from BiGG to several external databases to obtain additional information on genes, proteins, reactions, metabolites and citations of interest. Conclusions: BiGG addresses a need in the systems biology community to have access to high quality curated metabolic models and reconstructions. It is freely available for academic use at http://bigg.ucsd.edu.

  17. Genome-scale identification of Legionella pneumophila effectors using a machine learning approach.

    Directory of Open Access Journals (Sweden)

    David Burstein

    2009-07-01

    Full Text Available A large number of highly pathogenic bacteria utilize secretion systems to translocate effector proteins into host cells. Using these effectors, the bacteria subvert host cell processes during infection. Legionella pneumophila translocates effectors via the Icm/Dot type-IV secretion system and to date, approximately 100 effectors have been identified by various experimental and computational techniques. Effector identification is a critical first step towards the understanding of the pathogenesis system in L. pneumophila as well as in other bacterial pathogens. Here, we formulate the task of effector identification as a classification problem: each L. pneumophila open reading frame (ORF) was classified as either effector or not. We computationally defined a set of features that best distinguish effectors from non-effectors. These features cover a wide range of characteristics including taxonomical dispersion, regulatory data, genomic organization, similarity to eukaryotic proteomes and more. Machine learning algorithms utilizing these features were then applied to classify all the ORFs within the L. pneumophila genome. Using this approach we were able to predict and experimentally validate 40 new effectors, reaching a success rate of above 90%. Increasing the number of validated effectors to around 140, we were able to gain novel insights into their characteristics. Effectors were found to have low G+C content, supporting the hypothesis that a large number of effectors originate via horizontal gene transfer, probably from their protozoan host. In addition, effectors were found to cluster in specific genomic regions. Finally, we were able to provide a novel description of the C-terminal translocation signal required for effector translocation by the Icm/Dot secretion system. To conclude, we have discovered 40 novel L. pneumophila effectors, predicted over a hundred additional highly probable effectors, and shown the applicability of machine

  18. Time-scales of hydrological forcing on the geochemistry and bacterial community structure of temperate peat soils

    Science.gov (United States)

    Nunes, Flavia L. D.; Aquilina, Luc; De Ridder, Jo; Francez, André-Jean; Quaiser, Achim; Caudal, Jean-Pierre; Vandenkoornhuyse, Philippe; Dufresne, Alexis

    2015-10-01

    Peatlands are an important global carbon reservoir. The continued accumulation of carbon in peatlands depends on the persistence of anoxic conditions, in part induced by water saturation, which prevents oxidation of organic matter, and slows down decomposition. Here we investigate how and over what time scales the hydrological regime impacts the geochemistry and the bacterial community structure of temperate peat soils. Peat cores from two sites having contrasting groundwater budgets were subjected to four controlled drought-rewetting cycles. Pore water geochemistry and metagenomic profiling of bacterial communities showed that frequent water table drawdown induced lower concentrations of dissolved carbon, higher concentrations of sulfate and iron and reduced bacterial richness and diversity in the peat soil and water. Short-term drought cycles (3-9 day frequency) resulted in different communities from continuously saturated environments. Furthermore, the site that has more frequently experienced water table drawdown during the last two decades presented the most striking shifts in bacterial community structure, altering biogeochemical functioning of peat soils. Our results suggest that the increase in frequency and duration of drought conditions under changing climatic conditions or water resource use can induce profound changes in bacterial communities, with potentially severe consequences for carbon storage in temperate peatlands.

  19. "Black Holes" and Bacterial Pathogenicity: A Large Genomic Deletion that Enhances the Virulence of Shigella spp. and Enteroinvasive Escherichia coli

    Science.gov (United States)

    Maurelli, Anthony T.; Fernandez, Reinaldo E.; Bloch, Craig A.; Rode, Christopher K.; Fasano, Alessio

    1998-03-01

    Plasmids, bacteriophages, and pathogenicity islands are genomic additions that contribute to the evolution of bacterial pathogens. For example, Shigella spp., the causative agents of bacillary dysentery, differ from the closely related commensal Escherichia coli in the presence of a plasmid in Shigella that encodes virulence functions. However, pathogenic bacteria also may lack properties that are characteristic of nonpathogens. Lysine decarboxylase (LDC) activity is present in ≈ 90% of E. coli strains but is uniformly absent in Shigella strains. When the gene for LDC, cadA, was introduced into Shigella flexneri 2a, virulence became attenuated, and enterotoxin activity was inhibited greatly. The enterotoxin inhibitor was identified as cadaverine, a product of the reaction catalyzed by LDC. Comparison of the S. flexneri 2a and laboratory E. coli K-12 genomes in the region of cadA revealed a large deletion in Shigella. Representative strains of Shigella spp. and enteroinvasive E. coli displayed similar deletions of cadA. Our results suggest that, as Shigella spp. evolved from E. coli to become pathogens, they not only acquired virulence genes on a plasmid but also shed genes via deletions. The formation of these "black holes," deletions of genes that are detrimental to a pathogenic lifestyle, provides an evolutionary pathway that enables a pathogen to enhance virulence. Furthermore, the demonstration that cadaverine can inhibit enterotoxin activity may lead to more general models about toxin activity or entry into cells and suggests an avenue for antitoxin therapy. Thus, understanding the role of black holes in pathogen evolution may yield clues to new treatments of infectious diseases.

  20. Fractality and entropic scaling in the chromosomal distribution of conserved noncoding elements in the human genome.

    Science.gov (United States)

    Polychronopoulos, Dimitris; Athanasopoulou, Labrini; Almirantis, Yannis

    2016-06-15

    Conserved non-coding elements (CNEs) are defined using various degrees of sequence identity and thresholds of minimal length. Their conservation frequently exceeds the one observed for protein-coding sequences. We explored the chromosomal distribution of different classes of CNEs in the human genome. We employed two methodologies: the scaling of block entropy and box-counting, with the aim to assess fractal characteristics of different CNE datasets. Both approaches converged to the conclusion that well-developed fractality is characteristic of elements that are either extremely conserved between species or are of ancient origin, i.e. conserved between distant organisms across evolution. Given that CNEs are often clustered around genes, we verified by appropriate gene masking that fractal-like patterns emerge even when elements found in proximity or inside genes are excluded. An evolutionary scenario is proposed, involving genomic events that might account for fractal distribution of CNEs in the human genome as indicated through numerical simulations. PMID:26899868
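
    The box-counting approach mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the choice of box sizes, and the Cantor-set test data are all assumptions made for illustration. Element positions are treated as integer coordinates along a chromosome, and the dimension is estimated as the least-squares slope of log N(s) against log(1/s), where N(s) is the number of boxes of size s that contain at least one element.

```python
import math


def box_counts(positions, box_sizes):
    """Count how many boxes of each size contain at least one position."""
    return [len({pos // size for pos in positions}) for size in box_sizes]


def box_counting_dimension(positions, box_sizes):
    """Least-squares slope of log N(s) versus log(1/s)."""
    counts = box_counts(positions, box_sizes)
    xs = [math.log(1.0 / size) for size in box_sizes]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def cantor_positions(level):
    """Hypothetical test data: integer Cantor set on [0, 3**level)."""
    points = [0]
    for k in range(level):
        step = 2 * 3 ** k
        points = points + [p + step for p in points]
    return points


# A Cantor set has a known dimension of log 2 / log 3 (about 0.63),
# which makes it a convenient sanity check for the estimator.
sizes = [3 ** j for j in range(5)]
dimension = box_counting_dimension(cantor_positions(6), sizes)
print(f"estimated box-counting dimension: {dimension:.3f}")
```

    For real CNE coordinates, the range of box sizes over which the log-log relation is linear would have to be checked before reading the slope as a fractal dimension.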

  1. iAK692: A genome-scale metabolic model of Spirulina platensis C1

    Directory of Open Access Journals (Sweden)

    Klanchui Amornpan

    2012-06-01

    Full Text Available Abstract Background Spirulina (Arthrospira) platensis is a well-known filamentous cyanobacterium used in the production of many industrial products, including high value compounds, healthy food supplements, animal feeds, pharmaceuticals and cosmetics, for example. It has been increasingly studied around the world for scientific purposes, especially for its genome, biology, physiology, and also for the analysis of its small-scale metabolic network. However, the overall description of the metabolic and biotechnological capabilities of S. platensis requires the development of a whole cellular metabolism model. Recently, the S. platensis C1 (Arthrospira sp. PCC9438) genome sequence has become available, allowing systems-level studies of this commercial cyanobacterium. Results In this work, we present the genome-scale metabolic network analysis of S. platensis C1, iAK692, its topological properties, and its metabolic capabilities and functions. The network was reconstructed from the S. platensis C1 annotated genomic sequence using Pathway Tools software to generate a preliminary network. Then, manual curation was performed based on a collective knowledge base and a combination of genomic, biochemical, and physiological information. The genome-scale metabolic model consists of 692 genes, 837 metabolites, and 875 reactions. We validated iAK692 by conducting fermentation experiments and simulating the model under autotrophic, heterotrophic, and mixotrophic growth conditions using COBRA toolbox. The model predictions under these growth conditions were consistent with the experimental results. The iAK692 model was further used to predict the unique active reactions and essential genes for each growth condition. Additionally, the metabolic states of iAK692 during autotrophic and mixotrophic growths were described by phenotypic phase plane (PhPP) analysis. Conclusions This study proposes the first genome-scale model of S. platensis C1, iAK692, which is a

  2. FVGWAS: Fast voxelwise genome wide association analysis of large-scale imaging genetic data.

    Science.gov (United States)

    Huang, Meiyan; Nichols, Thomas; Huang, Chao; Yu, Yang; Lu, Zhaohua; Knickmeyer, Rebecca C; Feng, Qianjin; Zhu, Hongtu

    2015-09-01

    More and more large-scale imaging genetic studies are being widely conducted to collect a rich set of imaging, genetic, and clinical data to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. Several major big-data challenges arise from testing genome-wide (N_C > 12 million known variants) associations with signals at millions of locations (N_V ~ 10^6) in the brain from thousands of subjects (n ~ 10^3). The aim of this paper is to develop a Fast Voxelwise Genome Wide Association analysiS (FVGWAS) framework to efficiently carry out whole-genome analyses of whole-brain data. FVGWAS consists of three components including a heteroscedastic linear model, a global sure independence screening (GSIS) procedure, and a detection procedure based on wild bootstrap methods. Specifically, for standard linear association, the computational complexity is O(n N_V N_C) for the voxelwise genome wide association analysis (VGWAS) method compared with O((N_C + N_V) n^2) for FVGWAS. Simulation studies show that FVGWAS is an efficient method of searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. Finally, we have successfully applied FVGWAS to a large-scale imaging genetic data analysis of ADNI data with 708 subjects, 193,275 voxels in RAVENS maps, and 501,584 SNPs, and the total processing time was 203,645 s for a single CPU. Our FVGWAS may be a valuable statistical toolbox for large-scale imaging genetic analysis as the field is rapidly advancing with ultra-high-resolution imaging and whole-genome sequencing. PMID:26025292
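
    To give a feel for the two complexity expressions quoted in the abstract above, the sketch below plugs in the ADNI dimensions it reports (708 subjects, 193,275 voxels, 501,584 SNPs). The function names are hypothetical, and asymptotic expressions hide constant factors, so the result is only a rough scaling comparison and not a prediction of run time.

```python
def vgwas_cost(n, n_voxels, n_variants):
    """Operation count implied by O(n * N_V * N_C), constants ignored."""
    return n * n_voxels * n_variants


def fvgwas_cost(n, n_voxels, n_variants):
    """Operation count implied by O((N_C + N_V) * n^2), constants ignored."""
    return (n_variants + n_voxels) * n ** 2


# Dimensions taken from the ADNI analysis described in the abstract.
n, n_voxels, n_variants = 708, 193_275, 501_584

ratio = vgwas_cost(n, n_voxels, n_variants) / fvgwas_cost(n, n_voxels, n_variants)
print(f"VGWAS / FVGWAS operation-count ratio: {ratio:.0f}")
```

    The ratio simplifies to N_V N_C / ((N_C + N_V) n), so the advantage of the screening-based formulation grows as the numbers of voxels and variants increase relative to the number of subjects.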

  3. Analysis of Comparative Sequence and Genomic Data to Verify Phylogenetic Relationship and Explore a New Subfamily of Bacterial Lipases.

    Directory of Open Access Journals (Sweden)

    Malihe Masomian

    Full Text Available Thermostable and organic solvent-tolerant enzymes have significant potential in a wide range of synthetic reactions in industry due to their inherent stability at high temperatures and their ability to endure harsh organic solvents. In this study, a novel gene encoding a true lipase was isolated by construction of a genomic DNA library of thermophilic Aneurinibacillus thermoaerophilus strain HZ in an Escherichia coli plasmid vector. Sequence analysis revealed that HZ lipase had 62% identity to a putative lipase from Bacillus pseudomycoides. The closest characterized lipases to HZ lipase are thermostable Bacillus and Geobacillus lipases belonging to subfamily I.5, with ≤ 57% identity. The amino acid sequence analysis of HZ lipase identified a conserved pentapeptide containing the active serine, GHSMG, and a Ca(2+)-binding motif, GCYGSD, in the enzyme. Protein structure modeling showed that HZ lipase consisted of an α/β hydrolase fold and a lid domain. Protein sequence alignment, conserved regions analysis, clustal distance matrix and amino acid composition illustrated differences between HZ lipase and other thermostable lipases. Phylogenetic analysis revealed that this lipase represented a new subfamily of family I of bacterial true lipases, classified as family I.9. The HZ lipase was expressed under promoter Plac using IPTG and was characterized. The recombinant enzyme showed optimal activity at 65 °C and retained ≥ 97% activity after incubation at 50 °C for 1 h. The HZ lipase was stable in various polar and non-polar organic solvents.

  4. Analysis of Comparative Sequence and Genomic Data to Verify Phylogenetic Relationship and Explore a New Subfamily of Bacterial Lipases.

    Science.gov (United States)

    Masomian, Malihe; Rahman, Raja Noor Zaliha Raja Abd; Salleh, Abu Bakar; Basri, Mahiran

    2016-01-01

    Thermostable and organic solvent-tolerant enzymes have significant potential in a wide range of synthetic reactions in industry due to their inherent stability at high temperatures and their ability to endure harsh organic solvents. In this study, a novel gene encoding a true lipase was isolated by construction of a genomic DNA library of thermophilic Aneurinibacillus thermoaerophilus strain HZ in an Escherichia coli plasmid vector. Sequence analysis revealed that HZ lipase had 62% identity to a putative lipase from Bacillus pseudomycoides. The closest characterized lipases to HZ lipase are thermostable Bacillus and Geobacillus lipases belonging to subfamily I.5, with ≤ 57% identity. The amino acid sequence analysis of HZ lipase identified a conserved pentapeptide containing the active serine, GHSMG, and a Ca(2+)-binding motif, GCYGSD, in the enzyme. Protein structure modeling showed that HZ lipase consisted of an α/β hydrolase fold and a lid domain. Protein sequence alignment, conserved regions analysis, clustal distance matrix and amino acid composition illustrated differences between HZ lipase and other thermostable lipases. Phylogenetic analysis revealed that this lipase represented a new subfamily of family I of bacterial true lipases, classified as family I.9. The HZ lipase was expressed under promoter Plac using IPTG and was characterized. The recombinant enzyme showed optimal activity at 65 °C and retained ≥ 97% activity after incubation at 50 °C for 1 h. The HZ lipase was stable in various polar and non-polar organic solvents. PMID:26934700

  5. Reconstruction and modeling protein translocation and compartmentalization in Escherichia coli at the genome-scale

    DEFF Research Database (Denmark)

    Liu, Joanne K.; O’Brien, Edward J.; Lerman, Joshua A.;

    2014-01-01

    Background: Membranes play a crucial role in cellular functions. Membranes provide a physical barrier, control the trafficking of substances entering and leaving the cell, and are a major determinant of cellular ultra-structure. In addition, components embedded within the membrane participate in...... into genome-scale models. Results: The recent genome-scale model of metabolism and protein expression in Escherichia coli (called a ME-model) computes the complete composition of the proteome required to perform whole cell functions. Here we expand the ME-model to include (1) a reconstruction of...... protein translocation pathways, (2) assignment of all cellular proteins to one of four compartments (cytoplasm, inner membrane, periplasm, and outer membrane) and a translocation pathway, (3) experimentally determined translocase catalytic and porin diffusion rates, and (4) a novel membrane constraint...

  6. A systems approach to predict oncometabolites via context-specific genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Hojung Nam

    2014-09-01

    Full Text Available Altered metabolism in cancer cells has been viewed as a passive response required for a malignant transformation. However, this view has changed through the recently described metabolic oncogenic factors: mutated isocitrate dehydrogenases (IDH), succinate dehydrogenase (SDH), and fumarate hydratase (FH) that produce oncometabolites that competitively inhibit epigenetic regulation. In this study, we demonstrate in silico predictions of oncometabolites that have the potential to dysregulate epigenetic controls in nine types of cancer by incorporating massive scale genetic mutation information (collected from more than 1,700 cancer genomes), expression profiling data, and deploying Recon 2 to reconstruct context-specific genome-scale metabolic models. Our analysis predicted 15 compounds and 24 substructures of potential oncometabolites that could result from the loss-of-function and gain-of-function mutations of metabolic enzymes, respectively. These results suggest a substantial potential for discovering unidentified oncometabolites in various forms of cancers.

  7. Evaluation of Genome-Enabled Selection for Bacterial Cold Water Disease Resistance Using Progeny Performance Data in Rainbow Trout: Insights on Genotyping Methods and Genomic Prediction Models

    Science.gov (United States)

    Vallejo, Roger L.; Leeds, Timothy D.; Fragomeni, Breno O.; Gao, Guangtu; Hernandez, Alvaro G.; Misztal, Ignacy; Welch, Timothy J.; Wiens, Gregory D.; Palti, Yniv

    2016-01-01

    Bacterial cold water disease (BCWD) causes significant economic losses in salmonid aquaculture, and traditional family-based breeding programs aimed at improving BCWD resistance have been limited to exploiting only between-family variation. We used genomic selection (GS) models to predict genomic breeding values (GEBVs) for BCWD resistance in 10 families from the first generation of the NCCCWA BCWD resistance breeding line, compared the predictive ability (PA) of GEBVs to pedigree-based estimated breeding values (EBVs), and compared the impact of two SNP genotyping methods on the accuracy of GEBV predictions. The BCWD phenotypes survival days (DAYS) and survival status (STATUS) had been recorded in training fish (n = 583) subjected to experimental BCWD challenge. Training fish, and their full sibs without phenotypic data that were used as parents of the subsequent generation, were genotyped using two methods: restriction-site associated DNA (RAD) sequencing and the Rainbow Trout Axiom® 57 K SNP array (Chip). Animal-specific GEBVs were estimated using four GS models: BayesB, BayesC, single-step GBLUP (ssGBLUP), and weighted ssGBLUP (wssGBLUP). Family-specific EBVs were estimated using pedigree and phenotype data in the training fish only. The PA of EBVs and GEBVs was assessed by correlating mean progeny phenotype (MPP) with mid-parent EBV (family-specific) or GEBV (animal-specific). The best GEBV predictions were similar to EBV with PA values of 0.49 and 0.46 vs. 0.50 and 0.41 for DAYS and STATUS, respectively. Among the GEBV prediction methods, ssGBLUP consistently had the highest PA. The RAD genotyping platform had GEBVs with similar PA to those of GEBVs from the Chip platform. The PA of ssGBLUP and wssGBLUP methods was higher with the Chip, but for BayesB and BayesC methods it was higher with the RAD platform. The overall GEBV accuracy in this study was low to moderate, likely due to the small training sample used. This study explored the potential of GS for
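
    The predictive-ability measure described in the abstract above, a correlation between mean progeny phenotype and mid-parent breeding value, can be sketched as follows. The family values are invented for illustration, the helper names are hypothetical, and Pearson correlation is assumed because the abstract does not name the coefficient used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mid_parent(sire_value, dam_value):
    """Average of the two parental breeding values."""
    return 0.5 * (sire_value + dam_value)


# Hypothetical families: (sire GEBV, dam GEBV, mean progeny survival days).
families = [
    (0.9, 0.5, 14.2),
    (0.1, 0.3, 11.0),
    (-0.4, 0.2, 9.5),
    (0.6, -0.2, 12.1),
    (-0.7, -0.5, 8.3),
]

mid_parent_gebv = [mid_parent(sire, dam) for sire, dam, _ in families]
mean_progeny_phenotype = [mpp for _, _, mpp in families]

pa = pearson(mid_parent_gebv, mean_progeny_phenotype)
print(f"predictive ability (made-up data): {pa:.2f}")
```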

  8. Cloning of the Full-Length Rhesus Cytomegalovirus Genome as an Infectious and Self-Excisable Bacterial Artificial Chromosome for Analysis of Viral Pathogenesis

    OpenAIRE

    Chang, W. L. William; Peter A Barry

    2003-01-01

    Rigorous investigation of many functions encoded by cytomegaloviruses (CMVs) requires analysis in the context of virus-host interactions. To facilitate the construction of rhesus CMV (RhCMV) mutants for in vivo studies, a bacterial artificial chromosome (BAC) containing an enhanced green fluorescent protein (EGFP) cassette was engineered into the intergenic region between unique short 1 (US1) and US2 of the full-length viral genome by Cre/lox-mediated recombination. Infectious virions were re...

  9. Expanding a dynamic flux balance model of yeast fermentation to genome-scale

    OpenAIRE

    Agosin Eduardo; Pérez-Correa J Ricardo; Pizarro Francisco; Vargas Felipe A

    2011-01-01

    Abstract Background Yeast is considered to be a workhorse of the biotechnology industry for the production of many value-added chemicals, alcoholic beverages and biofuels. Optimization of the fermentation is a challenging task that greatly benefits from dynamic models able to accurately describe and predict the fermentation profile and resulting products under different genetic and environmental conditions. In this article, we developed and validated a genome-scale dynamic flux balance model,...

  10. Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy

    OpenAIRE

    2012-01-01

    Background NCRNAs (noncoding RNAs) play important roles in many biological processes. Existing genome-scale ncRNA search tools identify ncRNAs in local sequence alignments generated by conventional sequence comparison methods. However, some types of ncRNA lack strong sequence conservation and tend to be missed or mis-aligned by conventional sequence comparison. Results In this paper, we propose an ncRNA identification framework that is complementary to existing sequence comparison tools. By i...

  11. A genome-scale metabolic model of the lipid-accumulating yeast Yarrowia lipolytica

    OpenAIRE

    Loira Nicolas; Dulermo Thierry; Nicaud Jean-Marc; Sherman David

    2012-01-01

    Abstract Background Yarrowia lipolytica is an oleaginous yeast which has emerged as an important microorganism for several biotechnological processes, such as the production of organic acids, lipases and proteases. It is also considered a good candidate for single-cell oil production. Although some of its metabolic pathways are well studied, its metabolic engineering is hindered by the lack of a genome-scale model that integrates the current knowledge about its metabolism. Results Combining i...

  12. Analysis of genome-wide association data by large-scale Bayesian logistic regression

    OpenAIRE

    Wang Yuanjia; Sha Nanshi; Fang Yixin

    2009-01-01

    Abstract Single-locus analysis is often used to analyze genome-wide association (GWA) data, but such analysis is subject to severe multiple comparisons adjustment. Multivariate logistic regression is proposed to fit a multi-locus model for case-control data. However, when the sample size is much smaller than the number of single-nucleotide polymorphisms (SNPs) or when correlation among SNPs is high, traditional multivariate logistic regression breaks down. To accommodate the scale of data fro...

  13. Applications of Genome-Scale Metabolic Models in Biotechnology and Systems Medicine

    OpenAIRE

    Zhang, Cheng; Hua, Qiang

    2016-01-01

    Genome-scale metabolic models (GEMs) have become a popular tool for systems biology, and they have been used in many fields such as industrial biotechnology and systems medicine. Since more and more studies are being conducted using GEMs, they have recently received considerable attention. In this review, we introduce the basic concept of GEMs and provide an overview of their applications in biotechnology, systems medicine, and some other fields. In addition, we describe the general principle...

  14. Biological Removal of Phosphate Using Phosphate Solubilizing Bacterial Consortium from Synthetic Wastewater: A Laboratory Scale

    OpenAIRE

    Dipak Paul; Sankar Narayan Sinha

    2015-01-01

    Biological phosphate removal is an important process that has gained worldwide attention and is widely used for removing phosphorus from wastewater. The present investigation aimed to screen efficient phosphate solubilizing bacterial isolates and use them to remove phosphate from synthetic wastewater under shake-flask conditions. Pseudomonas sp. JPSB12, Enterobacter sp. TPSB20, Flavobacterium sp. TPSB23 and mixed bacterial consortium (Pseudomonas sp. JPSB12+Enterobacter sp. TPSB20+Flavobact...

  15. Large scale distribution of bacterial communities in the upper Paraná River floodplain

    Directory of Open Access Journals (Sweden)

    Josiane Barros Chiaramonte

    2014-12-01

    Full Text Available Bacterial communities play a central role in nutrient cycling in aquatic habitats. Therefore, it is important to analyze how these communities are distributed across different locations. Thirty-six sites in the upper Paraná River floodplain were surveyed to determine the influence of environmental variables on bacterial community composition. The sites are classified as rivers, channels, and floodplain lakes connected or unconnected to the main river channel. The bacterial community structure was analyzed by the fluorescent in situ hybridization (FISH) technique, based on the frequency of the main domains Bacteria and Archaea, subdivisions of the phylum Proteobacteria (Alpha-proteobacteria, Beta-proteobacteria, Gamma-proteobacteria) and the Cytophaga-Flavobacterium cluster. The bacterial community differed in density and in the frequency of the studied groups, and these differences responded to distinct characteristics of the three main rivers of the floodplain as well as to the classification of the environments found in this floodplain. We conclude that dissimilarities in the bacterial community structure are related to environmental heterogeneity, and that the limnological variables that best predicted bacterial communities in the upper Paraná River floodplain were total and ammoniacal nitrogen, orthophosphate and chlorophyll-a.

  16. Modeling Method for Increased Precision and Scope of Directly Measurable Fluxes at a Genome-Scale.

    Science.gov (United States)

    McCloskey, Douglas; Young, Jamey D; Xu, Sibei; Palsson, Bernhard O; Feist, Adam M

    2016-04-01

    Metabolic flux analysis (MFA) is considered to be the gold standard for determining the intracellular flux distribution of biological systems. The majority of work using MFA has been limited to core models of metabolism due to challenges in implementing genome-scale MFA and the undesirable trade-off between increased scope and decreased precision in flux estimations. This work presents a tunable workflow for expanding the scope of MFA to the genome-scale without trade-offs in flux precision. The genome-scale MFA model presented here, iDM2014, accounts for 537 net reactions, which includes the core pathways of traditional MFA models and also covers the additional pathways of purine, pyrimidine, isoprenoid, methionine, riboflavin, coenzyme A, and folate, as well as other biosynthetic pathways. When evaluating the iDM2014 using a set of measured intracellular intermediate and cofactor mass isotopomer distributions (MIDs), it was found that a total of 232 net fluxes of central and peripheral metabolism could be resolved in the E. coli network. The increase in scope was shown to cover the full biosynthetic route to an expanded set of bioproduction pathways, which should facilitate applications such as the design of more complex bioprocessing strains and aid in identifying new antimicrobials. Importantly, it was found that there was no loss in precision of core fluxes when compared to a traditional core model, and additionally there was an overall increase in precision when considering all observable reactions. PMID:26981784

  17. AssociationViewer: a scalable and integrated software tool for visualization of large-scale variation data in genomic context.

    OpenAIRE

    Martin O.; Valsesia A.; Telenti A.; Xenarios I.; Stevenson B.J.

    2009-01-01

    SUMMARY: We present a tool designed for visualization of large-scale genetic and genomic data exemplified by results from genome-wide association studies. This software provides an integrated framework to facilitate the interpretation of SNP association studies in genomic context. Gene annotations can be retrieved from Ensembl, linkage disequilibrium data downloaded from HapMap and custom data imported in BED or WIG format. AssociationViewer integrates functionalities that enable the aggregat...

  18. Reconstruction of genome-scale metabolic models for 126 human tissues using mCADRE

    Directory of Open Access Journals (Sweden)

    Wang Yuliang

    2012-12-01

    Full Text Available Abstract Background Human tissues perform diverse metabolic functions. Mapping out these tissue-specific functions in genome-scale models will advance our understanding of the metabolic basis of various physiological and pathological processes. The global knowledgebase of metabolic functions categorized for the human genome (Human Recon 1) coupled with abundant high-throughput data now makes possible the reconstruction of tissue-specific metabolic models. However, the number of available tissue-specific models remains incomplete compared with the large diversity of human tissues. Results We developed a method called metabolic Context-specificity Assessed by Deterministic Reaction Evaluation (mCADRE). mCADRE is able to infer a tissue-specific network based on gene expression data and metabolic network topology, along with evaluation of functional capabilities during model building. mCADRE produces models with similar or better functionality and achieves dramatic computational speed up over existing methods. Using our method, we reconstructed draft genome-scale metabolic models for 126 human tissue and cell types. Among these, there are models for 26 tumor tissues along with their normal counterparts, and 30 different brain tissues. We performed pathway-level analyses of this large collection of tissue-specific models and identified the eicosanoid metabolic pathway, especially reactions catalyzing the production of leukotrienes from arachidonic acid, as potential drug targets that selectively affect tumor tissues. Conclusions This large collection of 126 genome-scale draft metabolic models provides a useful resource for studying the metabolic basis for a variety of human diseases across many tissues. The functionality of the resulting models and the fast computational speed of the mCADRE algorithm make it a useful tool to build and update tissue-specific metabolic models.

  19. T346Hunter: a novel web-based tool for the prediction of type III, type IV and type VI secretion systems in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Pedro Manuel Martínez-García

    Full Text Available T346Hunter (Type Three, Four and Six secretion system Hunter) is a web-based tool for the identification and localisation of type III, type IV and type VI secretion systems (T3SS, T4SS and T6SS, respectively) clusters in bacterial genomes. Non-flagellar T3SS (NF-T3SS) and T6SS are complex molecular machines that deliver effector proteins from bacterial cells into the environment or into other eukaryotic or prokaryotic cells, with significant implications for pathogenesis of the strains encoding them. Meanwhile, T4SS is a more functionally diverse system, which is involved in not only effector translocation but also conjugation and DNA uptake/release. Development of control strategies against bacterial-mediated diseases requires genomic identification of the virulence arsenal of pathogenic bacteria, with T3SS, T4SS and T6SS being major determinants in this regard. Therefore, computational methods for systematic identification of these specialised machines are of particular interest. With the aim of facilitating this task, T346Hunter provides a user-friendly web-based tool for the prediction of T3SS, T4SS and T6SS clusters in newly sequenced bacterial genomes. After inspection of the available scientific literature, we constructed a database of hidden Markov model (HMM) protein profiles and sequences representing the various components of T3SS, T4SS and T6SS. T346Hunter performs searches of such a database against user-supplied bacterial sequences and localises enriched regions in any of these three types of secretion systems. Moreover, through the T346Hunter server, users can visualise the predicted clusters obtained for approximately 1700 bacterial chromosomes and plasmids. T346Hunter offers great help to researchers in advancing their understanding of the biological mechanisms in which these sophisticated molecular machines are involved. T346Hunter is freely available at http://bacterial-virulence-factors.cbgp.upm.es/T346Hunter.

  20. Genome-scale reconstruction of the Streptococcus pyogenes M49 metabolic network reveals growth requirements and indicates potential drug targets

    NARCIS (Netherlands)

    J. Levering; T. Fiedler; A. Sieg; K.W.A. van Grinsven; S. Hering; N. Veith; B.G. Olivier; L. Klett; J. Hugenholtz; B. Teusink; B. Kreikemeyer; U. Kummer

    2016-01-01

    Genome-scale metabolic models comprise stoichiometric relations between metabolites, as well as associations between genes and metabolic reactions and facilitate the analysis of metabolism. We computationally reconstructed the metabolic network of the lactic acid bacterium Streptococcus pyogenes M49

  1. Genome-scale reconstruction of metabolic networks of Lactobacillus casei ATCC 334 and 12A.

    Directory of Open Access Journals (Sweden)

    Elena Vinay-Lara

    Full Text Available Lactobacillus casei strains are widely used in industry and the utility of this organism in these industrial applications is strain dependent. Hence, tools capable of predicting strain specific phenotypes would have utility in the selection of strains for specific industrial processes. Genome-scale metabolic models can be utilized to better understand genotype-phenotype relationships and to compare different organisms. To assist in the selection and development of strains with enhanced industrial utility, genome-scale models for L. casei ATCC 334, a well characterized strain, and strain 12A, a corn silage isolate, were constructed. Draft models were generated from RAST genome annotations using the Model SEED database and refined by evaluating ATP generating cycles, mass-and-charge-balances of reactions, and growth phenotypes. After the validation process was finished, we compared the metabolic networks of these two strains to identify metabolic, genetic and ortholog differences that may lead to different phenotypic behaviors. We conclude that the metabolic capabilities of the two networks are highly similar. The L. casei ATCC 334 model accounts for 1,040 reactions, 959 metabolites and 548 genes, while the L. casei 12A model accounts for 1,076 reactions, 979 metabolites and 640 genes. The developed L. casei ATCC 334 and 12A metabolic models will enable better understanding of the physiology of these organisms and be valuable tools in the development and selection of strains with enhanced utility in a variety of industrial applications.
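
    The mass-and-charge-balance check mentioned in the abstract above can be illustrated with a small sketch. It is not the refinement procedure used by the authors: the function names are hypothetical, and the hexokinase reaction with its formulas and charges is an assumed example using commonly tabulated neutral-pH forms, which should be verified against a curated database before reuse.

```python
import re
from collections import defaultdict


def parse_formula(formula):
    """Turn a simple formula such as 'C6H12O6' into {element: count}."""
    counts = defaultdict(int)
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return dict(counts)


def imbalance(stoichiometry, metabolites):
    """Return the non-zero element and charge totals of a reaction.

    stoichiometry: {metabolite_id: coefficient}, negative for reactants.
    metabolites:   {metabolite_id: (formula, charge)}.
    An empty result means the reaction is mass- and charge-balanced.
    """
    totals = defaultdict(int)
    for met_id, coeff in stoichiometry.items():
        formula, charge = metabolites[met_id]
        for element, count in parse_formula(formula).items():
            totals[element] += coeff * count
        totals["charge"] += coeff * charge
    return {key: value for key, value in totals.items() if value != 0}


# Assumed formulas and charges (illustrative, not taken from the paper).
metabolites = {
    "glc": ("C6H12O6", 0),
    "atp": ("C10H12N5O13P3", -4),
    "g6p": ("C6H11O9P", -2),
    "adp": ("C10H12N5O10P2", -3),
    "h": ("H", 1),
}

# Hexokinase: glucose + ATP -> glucose 6-phosphate + ADP + H+
hexokinase = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}

print("hexokinase imbalance:", imbalance(hexokinase, metabolites))
```

    Leaving the proton out of the reaction makes the check report a missing hydrogen and a charge mismatch, which is the kind of error such a validation step is meant to catch in a draft reconstruction.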

  2. Genome-Scale Model Reveals Metabolic Basis of Biomass Partitioning in a Model Diatom.

    Directory of Open Access Journals (Sweden)

    Jennifer Levering

    Full Text Available Diatoms are eukaryotic microalgae that contain genes from various sources, including bacteria and the secondary endosymbiotic host. Due to this unique combination of genes, diatoms are taxonomically and functionally distinct from other algae and vascular plants and confer novel metabolic capabilities. Based on the genome annotation, we performed a genome-scale metabolic network reconstruction for the marine diatom Phaeodactylum tricornutum. Due to their endosymbiotic origin, diatoms possess a complex chloroplast structure which complicates the prediction of subcellular protein localization. Based on previous work we implemented a pipeline that exploits a series of bioinformatics tools to predict protein localization. The manually curated reconstructed metabolic network iLB1027_lipid accounts for 1,027 genes associated with 4,456 reactions and 2,172 metabolites distributed across six compartments. To constrain the genome-scale model, we determined the organism specific biomass composition in terms of lipids, carbohydrates, and proteins using Fourier transform infrared spectrometry. Our simulations indicate the presence of a yet unknown glutamine-ornithine shunt that could be used to transfer reducing equivalents generated by photosynthesis to the mitochondria. The model reflects the known biochemical composition of P. tricornutum in defined culture conditions and enables metabolic engineering strategies to improve the use of P. tricornutum for biotechnological applications.

  3. Genome-Scale Model Reveals Metabolic Basis of Biomass Partitioning in a Model Diatom

    Science.gov (United States)

    Broddrick, Jared; Dupont, Christopher L.; Peers, Graham; Beeri, Karen; Mayers, Joshua; Gallina, Alessandra A.; Allen, Andrew E.; Palsson, Bernhard O.; Zengler, Karsten

    2016-01-01

    Diatoms are eukaryotic microalgae that contain genes from various sources, including bacteria and the secondary endosymbiotic host. Due to this unique combination of genes, diatoms are taxonomically and functionally distinct from other algae and vascular plants and confer novel metabolic capabilities. Based on the genome annotation, we performed a genome-scale metabolic network reconstruction for the marine diatom Phaeodactylum tricornutum. Due to their endosymbiotic origin, diatoms possess a complex chloroplast structure which complicates the prediction of subcellular protein localization. Based on previous work we implemented a pipeline that exploits a series of bioinformatics tools to predict protein localization. The manually curated reconstructed metabolic network iLB1027_lipid accounts for 1,027 genes associated with 4,456 reactions and 2,172 metabolites distributed across six compartments. To constrain the genome-scale model, we determined the organism specific biomass composition in terms of lipids, carbohydrates, and proteins using Fourier transform infrared spectrometry. Our simulations indicate the presence of a yet unknown glutamine-ornithine shunt that could be used to transfer reducing equivalents generated by photosynthesis to the mitochondria. The model reflects the known biochemical composition of P. tricornutum in defined culture conditions and enables metabolic engineering strategies to improve the use of P. tricornutum for biotechnological applications. PMID:27152931

  4. Genome-Scale Model Reveals Metabolic Basis of Biomass Partitioning in a Model Diatom.

    Science.gov (United States)

    Levering, Jennifer; Broddrick, Jared; Dupont, Christopher L; Peers, Graham; Beeri, Karen; Mayers, Joshua; Gallina, Alessandra A; Allen, Andrew E; Palsson, Bernhard O; Zengler, Karsten

    2016-01-01

    Diatoms are eukaryotic microalgae that contain genes from various sources, including bacteria and the secondary endosymbiotic host. Due to this unique combination of genes, diatoms are taxonomically and functionally distinct from other algae and vascular plants and confer novel metabolic capabilities. Based on the genome annotation, we performed a genome-scale metabolic network reconstruction for the marine diatom Phaeodactylum tricornutum. Due to their endosymbiotic origin, diatoms possess a complex chloroplast structure which complicates the prediction of subcellular protein localization. Based on previous work we implemented a pipeline that exploits a series of bioinformatics tools to predict protein localization. The manually curated reconstructed metabolic network iLB1027_lipid accounts for 1,027 genes associated with 4,456 reactions and 2,172 metabolites distributed across six compartments. To constrain the genome-scale model, we determined the organism specific biomass composition in terms of lipids, carbohydrates, and proteins using Fourier transform infrared spectrometry. Our simulations indicate the presence of a yet unknown glutamine-ornithine shunt that could be used to transfer reducing equivalents generated by photosynthesis to the mitochondria. The model reflects the known biochemical composition of P. tricornutum in defined culture conditions and enables metabolic engineering strategies to improve the use of P. tricornutum for biotechnological applications. PMID:27152931

  5. Reconstruction and analysis of a genome-scale metabolic model for Scheffersomyces stipitis

    Directory of Open Access Journals (Sweden)

    Balagurunathan Balaji

    2012-02-01

    Background: Fermentation of xylose, the major component in hemicellulose, is essential for economic conversion of lignocellulosic biomass to fuels and chemicals. The yeast Scheffersomyces stipitis (formerly known as Pichia stipitis) has the highest known native capacity for xylose fermentation and possesses several genes for lignocellulose bioconversion in its genome. Understanding the metabolism of this yeast at a global scale, by reconstructing the genome-scale metabolic model, is essential for manipulating its metabolic capabilities and for successful transfer of its capabilities to other industrial microbes. Results: We present a genome-scale metabolic model for Scheffersomyces stipitis, a native xylose-utilizing yeast. The model was reconstructed based on genome sequence annotation, detailed experimental investigation and known yeast physiology. Macromolecular composition of Scheffersomyces stipitis biomass was estimated experimentally and its ability to grow on different carbon, nitrogen, sulphur and phosphorus sources was determined by phenotype microarrays. The compartmentalized model, developed based on an iterative procedure, accounted for 814 genes, 1371 reactions, and 971 metabolites. In silico computed growth rates were compared with high-throughput phenotyping data and the model could predict the qualitative outcomes in 74% of substrates investigated. Model simulations were used to identify the biosynthetic requirements for anaerobic growth of Scheffersomyces stipitis on glucose and the results were validated with published literature. The bottlenecks in the Scheffersomyces stipitis metabolic network for xylose uptake and nucleotide cofactor recycling were identified by in silico flux variability analysis. The scope of the model in enhancing the mechanistic understanding of microbial metabolism is demonstrated by identifying a mechanism for mitochondrial respiration and oxidative phosphorylation. Conclusion: The genome-scale

  6. Surface topography of composite restorative materials following ultrasonic scaling and its Impact on bacterial plaque accumulation. An in-vitro SEM study

    OpenAIRE

    Hossam, A. Eid; Rafi, A. Togoo; Ahmed, A Saleh; Sumanth, Phani CR

    2013-01-01

    Background: This is an in vitro study to investigate the effects of ultrasonic scaling on the surface roughness and quantitative bacterial count on four different types of commonly used composite restorative materials for class V cavities.

  7. Archaeal and bacterial community dynamics and bioprocess performance of a bench-scale two-stage anaerobic digester.

    Science.gov (United States)

    Gonzalez-Martinez, Alejandro; Garcia-Ruiz, Maria Jesus; Rodriguez-Sanchez, Alejandro; Osorio, Francisco; Gonzalez-Lopez, Jesus

    2016-07-01

    Two-stage technologies have been developed for anaerobic digestion of waste-activated sludge. In this study, the archaeal and bacterial community structure dynamics and bioprocess performance of a bench-scale two-stage anaerobic digester treating urban sewage sludge have been studied by means of high-throughput sequencing techniques and physicochemical parameters such as pH, dried sludge, volatile dried sludge, acid concentration, alkalinity, and biogas generation. The coupled analyses of archaeal and bacterial communities and physicochemical parameters showed a direct relationship between archaeal and bacterial populations and bioprocess performance during start-up and working operation of a two-stage anaerobic digester. Moreover, results demonstrated that archaeal and bacterial community structure was affected by changes in the acid/alkalinity ratio in the bioprocess. Thus, a predominance of the acetoclastic methanogen Methanosaeta was observed in the methanogenic bioreactor at high-value acid/alkaline ratio, while a predominance of Methanomassiliicoccaceae archaea and the Methanoculleus genus was observed in the methanogenic bioreactor at low-value acid/alkaline ratio. Biodiversity tag-iTag sequencing studies showed that methanogenic archaea can also be detected in the acidogenic bioreactor, although their biological activity decreased after 4 months of operation, as supported by physicochemical analyses. Also, studies of the VFA-producing and VFA-consuming microbial populations showed that these microbiota were directly affected by the physicochemical parameters generated in the bioreactors. We suggest that the results obtained in our study could be useful for future implementations of two-stage anaerobic digestion processes at both bench- and full-scale. PMID:26940050

  8. Identifying anti-growth factors for human cancer cell lines through genome-scale metabolic modeling

    DEFF Research Database (Denmark)

    Ghaffari, Pouyan; Mardinoglu, Adil; Asplund, Anna;

    2015-01-01

    Human cancer cell lines are used as important model systems to study molecular mechanisms associated with tumor growth, hereunder how genomic and biological heterogeneity found in primary tumors affect cellular phenotypes. We reconstructed Genome scale metabolic models (GEMs) for eleven cell lines...... 85 antimetabolites that can inhibit growth of, or even kill, any of the cell lines, while at the same time not being toxic for 83 different healthy human cell types. 60 of these antimetabolites were found to inhibit growth in all cell lines. Finally, we experimentally validated one of the predicted...... antimetabolites using two cell lines with different phenotypic origins, and found that it is effective in inhibiting the growth of these cell lines. Using immunohistochemistry, we also showed high or moderate expression levels of proteins targeted by the validated antimetabolite. Identified anti-growth factors...

  9. Genome-scale analysis of positional clustering of mouse testis-specific genes

    Directory of Open Access Journals (Sweden)

    Lee Bernett TK

    2005-01-01

    Background: Contrary to earlier assumptions, genes are not randomly distributed on a chromosome, even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and has recently been reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results: Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion: Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.
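
    The comparison of observed clusters against chance can be sketched as a label-shuffling permutation test. This is only an illustration of the general idea: the abstract does not state which test the authors used, and the function names, the cluster definition (runs of at least two adjacent flagged genes) and the default parameters are assumptions. The sketch also ignores chromosome boundaries and tandem-repeat filtering.

```python
import random

def count_clusters(flags, min_size=2):
    """Count runs of at least `min_size` consecutive True flags.

    `flags` lists genes in chromosomal order; True marks a
    tissue-specific gene (hypothetical input format).
    """
    clusters = 0
    run = 0
    for flag in flags:
        if flag:
            run += 1
        else:
            if run >= min_size:
                clusters += 1
            run = 0
    if run >= min_size:  # close a run that reaches the end of the list
        clusters += 1
    return clusters

def permutation_p_value(flags, n_perm=1000, seed=0):
    """Estimate how often shuffled gene labels give at least as many clusters."""
    rng = random.Random(seed)
    observed = count_clusters(flags)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_clusters(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)

if __name__ == "__main__":
    toy_order = [True, True, True, False] + [False] * 40 + [True, True, False, True]
    print(permutation_p_value(toy_order))
```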

  10. Why close a bacterial genome? The plasmid of Alteromonas macleodii HOT1A3 is a vector for inter-specific transfer of a flexible genomic island

    Directory of Open Access Journals (Sweden)

    Eduard Fadeev

    2016-03-01

    Genome sequencing is rapidly becoming a staple technique in environmental and clinical microbiology, yet computational challenges still remain, leading to many draft genomes which are typically fragmented into many contigs. We sequenced and completely assembled the genome of a marine heterotrophic bacterium, Alteromonas macleodii HOT1A3, and compared its full genome to several draft genomes obtained using different reference-based and de-novo methods. In general, the de-novo assemblies clearly outperformed the reference-based or hybrid ones, covering >99% of the genes and representing essentially all of the gene functions. However, only the fully closed genome (~4.5 Mbp) allowed us to identify the presence of a large, 148 kbp plasmid, pAM1A3. While HOT1A3 belongs to Alteromonas macleodii, typically found in surface waters (surface ecotype), this plasmid consists of an almost complete flexible genomic island, containing many genes involved in metal resistance previously identified in the genomes of Alteromonas mediterranea (deep ecotype). Indeed, similar to A. mediterranea, A. macleodii HOT1A3 grows at concentrations of zinc, mercury and copper that are inhibitory for other A. macleodii strains. The presence of a plasmid encoding almost an entire flexible genomic island suggests that wholesale genomic exchange between heterotrophic marine bacteria belonging to related but ecologically different populations is not uncommon.

  11. Bacterial lifestyle in a deep-sea hydrothermal vent chimney revealed by the genome sequence of the thermophilic bacterium Deferribacter desulfuricans SSM1.

    Science.gov (United States)

    Takaki, Yoshihiro; Shimamura, Shigeru; Nakagawa, Satoshi; Fukuhara, Yasuo; Horikawa, Hiroshi; Ankai, Akiho; Harada, Takeshi; Hosoyama, Akira; Oguchi, Akio; Fukui, Shigehiro; Fujita, Nobuyuki; Takami, Hideto; Takai, Ken

    2010-06-01

    The complete genome sequence of the thermophilic sulphur-reducing bacterium, Deferribacter desulfuricans SSM1, isolated from a hydrothermal vent chimney has been determined. The genome comprises a single circular chromosome of 2,234,389 bp and a megaplasmid of 308,544 bp. Many genes encoded in the genome are most similar to the genes of sulphur- or sulphate-reducing bacterial species within Deltaproteobacteria. The reconstructed central metabolisms showed a heterotrophic lifestyle primarily driven by C1 to C3 organics, e.g. formate, acetate, and pyruvate, and also suggested that the lack of autotrophy via a reductive tricarboxylic acid cycle may be due to the absence of ATP-dependent citrate lyase. In addition, the genome encodes numerous genes for chemoreceptors, chemotaxis-like systems, and signal transduction machineries. These signalling networks may be linked to this bacterium's versatile energy metabolisms and may provide ecophysiological advantages for D. desulfuricans SSM1 thriving in the physically and chemically fluctuating environments near hydrothermal vents. This is the first genome sequence from the phylum Deferribacteres. PMID:20189949

  12. Comparative analysis of ribosomal proteins in complete genomes: an example of reductive evolution at the domain scale

    OpenAIRE

    Lecompte, Odile; Ripp, Raymond; Thierry, Jean-Claude; Moras, Dino; Poch, Olivier

    2002-01-01

    A comprehensive investigation of ribosomal genes in complete genomes from 66 different species allows us to address the distribution of r-proteins between and within the three primary domains. Thirty-four r-protein families are represented in all domains but 33 families are specific to Archaea and Eucarya, providing evidence for specialisation at an early stage of evolution between the bacterial lineage and the lineage leading to Archaea and Eukaryotes. With only one specific r-protein, the a...

  13. A method for the large scale isolation of high transformation efficiency fungal genomic DNA.

    Science.gov (United States)

    Zhang, D; Yang, Y; Castlebury, L A; Cerniglia, C E

    1996-12-01

    A procedure for isolation of genomic DNA from the zygomycete Cunninghamella elegans and other filamentous fungi and yeasts is reported. This procedure involves disruption of cells by grinding with dry ice, removal of polysaccharides using cetyltrimethylammonium bromide and phenol extractions, and precipitation of DNA with isopropanol at room temperature. The isolation method yielded large quantities (approximately 1 mg DNA per 5 g wet cells) of highly purified, high-molecular-mass DNA. Sau3AI partially digested DNA showed high transformation efficiency (> 10^6/100 ng DNA) when ligated to ZAP-express lambda vector. PMID:8961565

  14. Dynamics of bacterial populations during bench-scale bioremediation of oily seawater and desert soil bioaugmented with coastal microbial mats.

    Science.gov (United States)

    Ali, Nidaa; Dashti, Narjes; Salamah, Samar; Sorkhoh, Naser; Al-Awadhi, Husain; Radwan, Samir

    2016-03-01

    This study describes a bench-scale attempt to bioremediate oily water and soil samples from Kuwait through bioaugmentation with coastal microbial mats rich in hydrocarbonoclastic bacterioflora. Seawater and desert soil samples were artificially polluted with 1% weathered oil, and bioaugmented with microbial mat suspensions. Oil removal and microbial community dynamics were monitored. In batch cultures, oil removal was more effective in soil than in seawater. Hydrocarbonoclastic bacteria associated with mat samples colonized soil more readily than seawater. The predominant oil-degrading bacterium in seawater batches was the autochthonous seawater species Marinobacter hydrocarbonoclasticus. The main oil degraders in the inoculated soil samples, on the other hand, were a mixture of the autochthonous mat and desert soil bacteria: Xanthobacter tagetidis, Pseudomonas geniculata, Olivibacter ginsengisoli and others. Bacterial diversity in seawater was greater during continuous than during batch bioremediation. Out of seven hydrocarbonoclastic bacterial species isolated from those cultures, only one, Mycobacterium chlorophenolicum, was of mat origin. This result also confirms that most of the autochthonous mat bacteria failed to colonize seawater. Culture-independent analysis of seawater from continuous cultures likewise revealed high bacterial diversity. Many of the bacteria belonged to the Alphaproteobacteria, Flavobacteria and Gammaproteobacteria, and were hydrocarbonoclastic. Optimal biostimulation practices for continuous culture bioremediation of seawater via mat bioaugmentation were adding the highest possible oil concentration as one lot at the beginning of bioremediation, adding vitamins, and slowing down the seawater flow rate. PMID:26751253

  15. In situ probing the interior of single bacterial cells at nanometer scale

    International Nuclear Information System (INIS)

    We report a novel approach to probe the interior of single bacterial cells at nanometre resolution by combining focused ion beam (FIB) and atomic force microscopy (AFM). After removing layers of pre-defined thickness on the order of 100 nm on the target bacterial cells with FIB milling, AFM of different modes can be employed to probe the cellular interior under both ambient and aqueous environments. Our initial investigations focused on the surface topology induced by FIB milling and the hydration effects on AFM measurements, followed by assessment of the sample protocols. With fine-tuning of the process parameters, in situ AFM probing beneath the bacterial cell wall was achieved for the first time. We further demonstrate the proposed method by performing a spatial mapping of intracellular elasticity and chemistry of cells of a multi-drug resistant Klebsiella pneumoniae strain before and after exposure to the ‘last-line’ antibiotic polymyxin B. Our results revealed increased stiffness occurring in both surface and interior regions of the treated cells, suggesting loss of integrity of the outer membrane from polymyxin treatments. In addition, the hydrophobicity measurement using a functionalized AFM tip was able to highlight the evident hydrophobic portion of the cell such as the regions containing cell membrane. We expect that the proposed FIB–AFM platform will help in gaining deeper insights into bacteria–drug interactions to develop potential strategies for combating multi-drug resistance.

  16. Biological Removal of Phosphate Using Phosphate Solubilizing Bacterial Consortium from Synthetic Wastewater: A Laboratory Scale

    Directory of Open Access Journals (Sweden)

    Dipak Paul

    2015-01-01

    Biological phosphate removal is an important process that has gained worldwide attention and is widely used for removing phosphorus from wastewater. The present investigation aimed to screen efficient phosphate solubilizing bacterial isolates and use them to remove phosphate from synthetic wastewater under shaking flask conditions. Pseudomonas sp. JPSB12, Enterobacter sp. TPSB20, Flavobacterium sp. TPSB23 and a mixed bacterial consortium (Pseudomonas sp. JPSB12 + Enterobacter sp. TPSB20 + Flavobacterium sp. TPSB23) were used for the removal of phosphate. Among the individual strains, Enterobacter sp. TPSB20 removed the most phosphate (61.75%) from synthetic wastewater in the presence of glucose as a carbon source. The consortium removed phosphate more effectively (74.15-82.50%) from the synthetic wastewater than the individual strains. The pH changes in the culture medium over time and extracellular phosphatase activity (acid and alkaline) were also investigated. The efficient removal of phosphate by the consortium may be due to the synergistic activity among the individual strains and phosphatase enzyme activity. The use of bacterial consortia in the remediation of phosphate-contaminated aquatic environments is discussed.

  17. Large-Scale Comparative Genomics Meta-Analysis of Campylobacter jejuni Isolates Reveals Low Level of Genome Plasticity

    OpenAIRE

    Taboada, Eduardo N.; Acedillo, Rey R; Carrillo, Catherine D.; Findlay, Wendy A.; Medeiros, Diane T.; Mykytczuk, Oksana L; Roberts, Michael J.; Valencia, C. Alexander; Farber, Jeffrey M.; Nash, John H E

    2004-01-01

    We have used comparative genomic hybridization (CGH) on a full-genome Campylobacter jejuni microarray to examine genome-wide gene conservation patterns among 51 strains isolated from food and clinical sources. These data have been integrated with data from three previous C. jejuni CGH studies to perform a meta-analysis that included 97 strains from the four separate data sets. Although many genes were found to be divergent across multiple strains (n = 350), many genes (n = 249) were uniquely ...

  18. Sexagesimal scale for mapping human genome

    Directory of Open Access Journals (Sweden)

    RICARDO CRUZ-COKE

    2001-03-01

    In a previous work I designed a diagram of the human genome based on a circular ideogram of the haploid set of chromosomes, using a low resolution scale of Megabase units. The purpose of this work is to draft a new scale to measure the physical map of the human genome at the highest resolution level. The entire length of the haploid genome of males is deployed in a circumference, marked with a sexagesimal scale with 360 degrees and 1296000 arc seconds. The radius of this circumference displays a semilogarithmic metric scale from 1 m down to the nanometer level. The base pair level of DNA sequences, 10^-9 of this circumference, is measured in milliarcsecond units (mas), each equivalent to a thousandth of an arcsecond. The "mas" unit corresponds to 1.27 nanometers (nm) or 0.427 base pair (bp) and is the framework for measuring DNA sequences. Thus the three billion base pairs of the human genome may be identified by 1296000000 "mas" units in continuous correlation from number 1 to number 1296000000. This sexagesimal scale covers all the levels of the nuclear genetic material, from nucleotides to chromosomes. The locations of every codon and every gene may be numbered in the physical map of chromosome regions according to this new scale, instead of the partial kilobase and Megabase scales used today. The advantage of the new scale is the unification of the set of chromosomes under a continuous scale of measurement at the DNA level, facilitating the correlation with the phenotypes of man and other species.
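
    The arithmetic of the sexagesimal scale can be illustrated with a short sketch. The constants follow directly from 360 degrees of 60 arcminutes of 60 arcseconds; the function names, the 0-based coordinates and the linear base-pair mapping (with the genome length passed in as a parameter) are assumptions made for illustration, not the author's published procedure, which numbers units from 1.

```python
ARCSEC_PER_CIRCLE = 360 * 60 * 60           # 1,296,000 arcseconds
MAS_PER_CIRCLE = ARCSEC_PER_CIRCLE * 1000   # 1,296,000,000 milliarcseconds

def mas_to_dms(mas: int) -> tuple[int, int, int, int]:
    """Split a 0-based mas coordinate into (degrees, arcminutes, arcseconds, mas)."""
    if not 0 <= mas < MAS_PER_CIRCLE:
        raise ValueError("coordinate lies outside the circle")
    arcsec, rem_mas = divmod(mas, 1000)
    arcmin, sec = divmod(arcsec, 60)
    deg, minute = divmod(arcmin, 60)
    return deg, minute, sec, rem_mas

def bp_to_mas(position_bp: int, genome_bp: int) -> int:
    """Map a 0-based base-pair position linearly onto the circle.

    `genome_bp` is the assumed total genome length; it is an input here,
    not a value taken from the abstract.
    """
    if not 0 <= position_bp < genome_bp:
        raise ValueError("position lies outside the genome")
    return position_bp * MAS_PER_CIRCLE // genome_bp

if __name__ == "__main__":
    halfway = bp_to_mas(1_500_000_000, 3_000_000_000)
    print(halfway, mas_to_dms(halfway))
```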

  19. Integrating Kinetic Model of E. coli with Genome Scale Metabolic Fluxes Overcomes Its Open System Problem and Reveals Bistability in Central Metabolism.

    Directory of Open Access Journals (Sweden)

    Ahmad A Mannan

    An understanding of the dynamics of the metabolic profile of a bacterial cell is sought from a dynamical systems analysis of kinetic models. This modelling formalism relies on a deterministic mathematical description of enzyme kinetics and their metabolite regulation. However, it is severely impeded by the lack of available kinetic information, limiting the size of the system that can be modelled. Furthermore, the subsystem of the metabolic network whose dynamics can be modelled is faced with three problems: how to parameterize the model with mostly incomplete steady state data, how to close what is now an inherently open system, and how to account for the impact on growth. In this study we address these challenges of kinetic modelling by capitalizing on multi-'omics' steady state data and a genome-scale metabolic network model. We use these to generate parameters that integrate knowledge embedded in the genome-scale metabolic network model, into the most comprehensive kinetic model of the central carbon metabolism of E. coli realized to date. As an application, we performed a dynamical systems analysis of the resulting enriched model. This revealed bistability of the central carbon metabolism and thus its potential to express two distinct metabolic states. Furthermore, since our model-informing technique ensures both stable states are constrained by the same thermodynamically feasible steady state growth rate, the ensuing bistability represents a temporal coexistence of the two states, and by extension, reveals the emergence of a phenotypically heterogeneous population.

  20. Simplified large-scale Sanger genome sequencing for influenza A/H3N2 virus.

    Directory of Open Access Journals (Sweden)

    Hong Kai Lee

    BACKGROUND: The advent of next-generation sequencing technologies and the resultant lower costs of sequencing have enabled production of massive amounts of data, including the generation of full genome sequences of pathogens. However, the small genome size of the influenza virus arguably justifies the use of the more conventional Sanger sequencing technology, which is still more readily available in most diagnostic laboratories. RESULTS: We present a simplified Sanger-based genome sequencing method for sequencing the influenza A/H3N2 virus in a large-scale format. The entire genome sequencing was completed with 19 reverse transcription-polymerase chain reactions (RT-PCRs) and 39 sequencing reactions. This method was tested on 15 native clinical samples and 15 culture isolates collected between 2009 and 2011. The 15 native clinical samples registered quantification cycle values ranging from 21.0 to 30.56, which were equivalent to 2.4×10^3 to 1.4×10^6 viral copies/µL of RNA extract. All the PCR-amplified products were sequenced directly without PCR product purification. Notably, high quality sequencing data up to 700 bp were generated for all the samples tested. The completed sequence covered 408,810 nucleotides in total, with 13,627 nucleotides per genome, attaining 100% coding completeness. Of all the bases produced, an average of 89.49% were Phred quality value 40 (QV40) bases (representing an accuracy of circa one miscall for every 10,000 bases) or higher, and an average of 93.46% were QV30 bases (one miscall every 1000 bases) or higher. CONCLUSIONS: This sequencing protocol has been shown to be cost-effective and less labor-intensive in obtaining full influenza genomes. The constant high quality of sequences generated imparts confidence in extending the application of this non-purified amplicon sequencing approach to other gene sequencing assays, with appropriate use of suitably designed primers.
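
    The quality values quoted in this abstract follow the standard Phred relation Q = -10 log10(P), where P is the probability that a base call is wrong. The sketch below reproduces the two accuracy figures and checks the nucleotide total; the function names are illustrative, and the count of 30 genomes is inferred from the 15 clinical samples plus 15 culture isolates rather than stated as such in the abstract.

```python
import math

def phred_to_error_prob(q: float) -> float:
    """Probability of a miscalled base for Phred quality value q."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p: float) -> float:
    """Phred quality value for a per-base error probability p."""
    return -10 * math.log10(p)

# Figures quoted in the abstract.
BASES_PER_GENOME = 13_627
GENOMES = 15 + 15  # clinical samples + culture isolates (inferred total)

if __name__ == "__main__":
    for q in (30, 40):
        per_bases = round(1 / phred_to_error_prob(q))
        print(f"QV{q}: about one miscall per {per_bases:,} bases")
    print("total nucleotides:", GENOMES * BASES_PER_GENOME)
```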

  1. Insights from the Genome Sequence of Acidovorax citrulli M6, a Group I Strain of the Causal Agent of Bacterial Fruit Blotch of Cucurbits

    Science.gov (United States)

    Eckshtain-Levi, Noam; Shkedy, Dafna; Gershovits, Michael; Da Silva, Gustavo M.; Tamir-Ariel, Dafna; Walcott, Ron; Pupko, Tal; Burdman, Saul

    2016-01-01

    Acidovorax citrulli is a seedborne bacterium that causes bacterial fruit blotch of cucurbit plants including watermelon and melon. A. citrulli strains can be divided into two major groups based on DNA fingerprint analyses and biochemical properties. Group I strains have been generally isolated from non-watermelon cucurbits, while group II strains are closely associated with watermelon. In the present study, we report the genome sequence of M6, a group I model A. citrulli strain, isolated from melon. We used comparative genome analysis to investigate differences between the genome of strain M6 and the genome of the group II model strain AAC00-1. The draft genome sequence of A. citrulli M6 harbors 139 contigs, with an overall approximate size of 4.85 Mb. The genome of M6 is ∼500 Kb shorter than that of strain AAC00-1. Comparative analysis revealed that this size difference is mainly explained by eight fragments, ranging from ∼35–120 Kb and distributed throughout the AAC00-1 genome, which are absent in the M6 genome. In agreement with this finding, while AAC00-1 was found to possess 532 open reading frames (ORFs) that are absent in strain M6, only 123 ORFs in M6 were absent in AAC00-1. Most of these M6 ORFs are hypothetical proteins and most of them were also detected in two group I strains that were recently sequenced, tw6 and pslb65. Further analyses by PCR assays and coverage analyses with other A. citrulli strains support the notion that some of these fragments or significant portions of them are discriminative between groups I and II strains of A. citrulli. Moreover, GC content, effective number of codon values and cluster of orthologs’ analyses indicate that these fragments were introduced into group II strains by horizontal gene transfer events. Our study reports the genome sequence of a model group I strain of A. citrulli, one of the most important pathogens of cucurbits. It also provides the first comprehensive comparison at the genomic level between

  2. Insights from the Genome Sequence of Acidovorax citrulli M6, a Group I Strain of the Causal Agent of Bacterial Fruit Blotch of Cucurbits.

    Science.gov (United States)

    Eckshtain-Levi, Noam; Shkedy, Dafna; Gershovits, Michael; Da Silva, Gustavo M; Tamir-Ariel, Dafna; Walcott, Ron; Pupko, Tal; Burdman, Saul

    2016-01-01

    Acidovorax citrulli is a seedborne bacterium that causes bacterial fruit blotch of cucurbit plants including watermelon and melon. A. citrulli strains can be divided into two major groups based on DNA fingerprint analyses and biochemical properties. Group I strains have been generally isolated from non-watermelon cucurbits, while group II strains are closely associated with watermelon. In the present study, we report the genome sequence of M6, a group I model A. citrulli strain, isolated from melon. We used comparative genome analysis to investigate differences between the genome of strain M6 and the genome of the group II model strain AAC00-1. The draft genome sequence of A. citrulli M6 harbors 139 contigs, with an overall approximate size of 4.85 Mb. The genome of M6 is ∼500 Kb shorter than that of strain AAC00-1. Comparative analysis revealed that this size difference is mainly explained by eight fragments, ranging from ∼35-120 Kb and distributed throughout the AAC00-1 genome, which are absent in the M6 genome. In agreement with this finding, while AAC00-1 was found to possess 532 open reading frames (ORFs) that are absent in strain M6, only 123 ORFs in M6 were absent in AAC00-1. Most of these M6 ORFs are hypothetical proteins and most of them were also detected in two group I strains that were recently sequenced, tw6 and pslb65. Further analyses by PCR assays and coverage analyses with other A. citrulli strains support the notion that some of these fragments or significant portions of them are discriminative between groups I and II strains of A. citrulli. Moreover, GC content, effective number of codon values and cluster of orthologs' analyses indicate that these fragments were introduced into group II strains by horizontal gene transfer events. Our study reports the genome sequence of a model group I strain of A. citrulli, one of the most important pathogens of cucurbits. It also provides the first comprehensive comparison at the genomic level between the

  3. Small Traditional Human Communities Sustain Genomic Diversity over Microgeographic Scales despite Linguistic Isolation.

    Science.gov (United States)

    Cox, Murray P; Hudjashov, Georgi; Sim, Andre; Savina, Olga; Karafet, Tatiana M; Sudoyo, Herawati; Lansing, J Stephen

    2016-09-01

    At least since the Neolithic, humans have largely lived in networks of small, traditional communities. Often socially isolated, these groups evolved distinct languages and cultures over microgeographic scales of just tens of kilometers. Population genetic theory tells us that genetic drift should act quickly in such isolated groups, thus raising the question: do networks of small human communities maintain levels of genetic diversity over microgeographic scales? This question can no longer be asked in most parts of the world, which have been heavily impacted by historical events that make traditional society structures the exception. However, such studies remain possible in parts of Island Southeast Asia and Oceania, where traditional ways of life are still practiced. We captured genome-wide genetic data, together with linguistic records, for a case-study system: eight villages distributed across Sumba, a small, remote island in eastern Indonesia. More than 4,000 years after these communities were established during the Neolithic period, most speak different languages and can be distinguished genetically. Yet their nuclear diversity is not reduced, instead being comparable to other, even much larger, regional groups. Modeling reveals a separation of time scales: while languages and culture can evolve quickly, creating social barriers, sporadic migration averaged over many generations is sufficient to keep villages linked genetically. This loosely-connected network structure, once the global norm and still extant on Sumba today, provides a living proxy to explore fine-scale genome dynamics in the sort of small traditional communities within which the most recent episodes of human evolution occurred. PMID:27274003
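
    The drift expectation raised in this abstract can be made concrete with a standard textbook result, sketched here in Python: in an isolated, randomly mating population of N diploid individuals, expected heterozygosity shrinks by a factor of (1 - 1/(2N)) each generation. The population sizes, starting heterozygosity and generation count below are hypothetical illustration values, not figures from the study, and the sketch deliberately ignores migration, which the abstract identifies as the force that offsets drift.

```python
def expected_heterozygosity(h0: float, n_diploid: int, generations: int) -> float:
    """Expected heterozygosity after `generations` of drift alone.

    Standard Wright-Fisher expectation: H_t = H_0 * (1 - 1/(2N))**t.
    """
    if n_diploid <= 0 or generations < 0:
        raise ValueError("population size must be positive and generations non-negative")
    return h0 * (1.0 - 1.0 / (2.0 * n_diploid)) ** generations


# Hypothetical values: starting heterozygosity 0.30, 150 generations of isolation.
for size in (50, 500, 5000):
    remaining = expected_heterozygosity(0.30, size, 150)
    print(f"N = {size:>4}: expected heterozygosity {remaining:.3f}")
```

    Under these assumptions the smallest population loses diversity fastest, which is why comparable diversity across small villages points to ongoing gene flow.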

  4. Genome-scale reconstruction of Salinispora tropica CNB-440 metabolism to study strain-specific adaptation.

    Science.gov (United States)

    Contador, C A; Rodríguez, V; Andrews, B A; Asenjo, J A

    2015-11-01

    The first manually curated genome-scale metabolic model for Salinispora tropica strain CNB-440 was constructed. The reconstruction enables characterization of the metabolic capabilities for understanding and modeling the cellular physiology of this actinobacterium. The iCC908 model was based on physiological and biochemical information of primary and specialised metabolism pathways. The reconstructed stoichiometric matrix consists of 1169 biochemical conversions, 204 transport reactions and 1317 metabolites. A total of 908 structural open reading frames (ORFs) were included in the reconstructed network. The number of gene functions included in the reconstructed network corresponds to 20% of all characterized ORFs in the S. tropica genome. The genome-scale metabolic model was used to study strain-specific capabilities in defined minimal media. iCC908 was used to analyze growth capabilities in 41 different minimal growth-supporting environments. These nutrient sources were evaluated experimentally to assess the accuracy of in silico growth simulations. The model predicted no auxotrophies for essential amino acids, which was corroborated experimentally. The strain is able to use 21 different carbon sources, 8 nitrogen sources and 4 sulfur sources from the nutrient sources tested. Experimental observation suggests that the cells may be able to store sulfur. False predictions provided opportunities to gain new insights into the physiology of this species, and to gap fill the missing knowledge. The incorporation of modifications led to increased accuracy in predicting the outcome of growth/no growth experiments from 76 to 93%. iCC908 can thus be used to define the metabolic capabilities of S. tropica and guide and enhance the production of specialised metabolites. PMID:26459337

  5. MultiMetEval: comparative and multi-objective analysis of genome-scale metabolic models.

    Directory of Open Access Journals (Sweden)

    Piotr Zakrzewski

    Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the context of multiple cellular objectives. Here, we present the user-friendly software framework Multi-Metabolic Evaluator (MultiMetEval), built upon SurreyFBA, which allows the user to compose collections of metabolic models that together can be subjected to flux balance analysis. Additionally, MultiMetEval implements functionalities for multi-objective analysis by calculating the Pareto front between two cellular objectives. Using a previously generated dataset of 38 actinobacterial genome-scale metabolic models, we show how these approaches can lead to exciting novel insights. Firstly, after incorporating several pathways for the biosynthesis of natural products into each of these models, comparative flux balance analysis predicted that species like Streptomyces that harbour the highest diversity of secondary metabolite biosynthetic gene clusters in their genomes do not necessarily have the metabolic network topology most suitable for compound overproduction. Secondly, multi-objective analysis of biomass production and natural product biosynthesis in these actinobacteria shows that the well-studied occurrence of discrete metabolic switches during the change of cellular objectives is inherent to their metabolic network architecture. Comparative and multi-objective modelling can lead to insights that could not be obtained by normal flux balance analyses. MultiMetEval provides a powerful platform that makes these analyses straightforward for biologists. Sources and binaries of MultiMetEval are freely available from https://github.com/PiotrZakrzewski/MetEval/downloads.
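
    The idea of a Pareto front between two cellular objectives can be illustrated with a minimal Python sketch. It is not MultiMetEval's implementation, which derives the front from flux balance analysis; here the front is simply filtered from a list of candidate points. The function name and the (biomass, product) flux pairs are hypothetical.

```python
def pareto_front(points):
    """Return the points that no other point dominates, both objectives maximized.

    A point q dominates p if q is at least as good in both objectives
    and strictly better in at least one.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)


# Hypothetical (biomass flux, natural-product flux) pairs, in arbitrary units.
candidates = [(1.0, 0.0), (0.8, 0.5), (0.5, 0.5), (0.0, 1.0), (0.3, 0.2)]
print(pareto_front(candidates))
```

    Points on the front represent trade-offs in which one objective cannot be improved without giving up some of the other.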

  6. A genome-scale metabolic model of the lipid-accumulating yeast Yarrowia lipolytica

    Directory of Open Access Journals (Sweden)

    Loira Nicolas

    2012-05-01

    Background: Yarrowia lipolytica is an oleaginous yeast which has emerged as an important microorganism for several biotechnological processes, such as the production of organic acids, lipases and proteases. It is also considered a good candidate for single-cell oil production. Although some of its metabolic pathways are well studied, its metabolic engineering is hindered by the lack of a genome-scale model that integrates the current knowledge about its metabolism. Results: Combining in silico tools and expert manual curation, we have produced an accurate genome-scale metabolic model for Y. lipolytica. Using a scaffold derived from a functional metabolic model of the well-studied but phylogenetically distant yeast S. cerevisiae, we mapped conserved reactions, rewrote gene associations, added species-specific reactions and inserted specialized copies of scaffold reactions to account for species-specific expansion of protein families. We used physiological measures obtained under lab conditions to validate our predictions. Conclusions: Y. lipolytica iNL895 represents the first well-annotated metabolic model of an oleaginous yeast, providing a base for future metabolic improvement, and a starting point for the metabolic reconstruction of other species in the Yarrowia clade and other oleaginous yeasts.

  7. A Method to Constrain Genome-Scale Models with 13C Labeling Data.

    Directory of Open Access Journals (Sweden)

    Héctor García Martín

    2015-09-01

    Current limitations in quantitatively predicting biological behavior hinder our efforts to engineer biological systems to produce biofuels and other desired chemicals. Here, we present a new method for calculating metabolic fluxes, key targets in metabolic engineering, that incorporates data from 13C labeling experiments and genome-scale models. The data from 13C labeling experiments provide strong flux constraints that eliminate the need to assume an evolutionary optimization principle such as the growth rate optimization assumption used in Flux Balance Analysis (FBA). This effective constraining is achieved by making the simple but biologically relevant assumption that flux flows from core to peripheral metabolism and does not flow back. The new method is significantly more robust than FBA with respect to errors in genome-scale model reconstruction. Furthermore, it can provide a comprehensive picture of metabolite balancing and predictions for unmeasured extracellular fluxes as constrained by 13C labeling data. A comparison shows that the results of this new method are similar to those found through 13C Metabolic Flux Analysis (13C MFA) for central carbon metabolism but, additionally, it provides flux estimates for peripheral metabolism. The extra validation gained by matching 48 relative labeling measurements is used to identify where and why several existing COnstraint Based Reconstruction and Analysis (COBRA) flux prediction algorithms fail. We demonstrate how to use this knowledge to refine these methods and improve their predictive capabilities. This method provides a reliable base upon which to improve the design of biological systems.

  8. Genome-scale metabolic model of Pichia pastoris with native and humanized glycosylation of recombinant proteins.

    Science.gov (United States)

    Irani, Zahra Azimzadeh; Kerkhoven, Eduard J; Shojaosadati, Seyed Abbas; Nielsen, Jens

    2016-05-01

    Pichia pastoris is used for commercial production of human therapeutic proteins, and genome-scale models of P. pastoris metabolism have been generated in the past to study the metabolism and associated protein production by this yeast. A major challenge with clinical usage of recombinant proteins produced by P. pastoris is the difference in N-glycosylation of proteins produced by humans and this yeast. However, through metabolic engineering, a P. pastoris strain capable of producing humanized N-glycosylated proteins was constructed. The current genome-scale models of P. pastoris address neither native nor humanized N-glycosylation, and we therefore developed ihGlycopastoris, an extension to the iLC915 model with both native and humanized N-glycosylation for recombinant protein production, but also an estimation of N-glycosylation of P. pastoris native proteins. This new model gives a better prediction of protein yield, demonstrates the effect of the different types of N-glycosylation on protein yield, and can be used to predict potential targets for strain improvement. The model represents a step towards a more complete description of protein production in P. pastoris, which is required for using these models to understand and optimize protein production processes. Biotechnol. Bioeng. 2016;113: 961-969. © 2015 Wiley Periodicals, Inc. PMID:26480251

  9. Genome-scale metabolic modeling of Mucor circinelloides and comparative analysis with other oleaginous species.

    Science.gov (United States)

    Vongsangnak, Wanwipa; Klanchui, Amornpan; Tawornsamretkit, Iyarest; Tatiyaborwornchai, Witthawin; Laoteng, Kobkul; Meechai, Asawin

    2016-06-01

    We present a novel genome-scale metabolic model iWV1213 of Mucor circinelloides, which is an oleaginous fungus for industrial applications. The model contains 1213 genes, 1413 metabolites and 1326 metabolic reactions across different compartments. We demonstrate that iWV1213 is able to accurately predict the growth rates of M. circinelloides on various nutrient sources and culture conditions using Flux Balance Analysis and Phenotypic Phase Plane analysis. Comparative analysis of three oleaginous genome-scale models, including M. circinelloides (iWV1213), Mortierella alpina (iCY1106) and Yarrowia lipolytica (iYL619_PCP) revealed that iWV1213 possesses a higher number of genes involved in carbohydrate, amino acid, and lipid metabolisms that might contribute to its versatility in nutrient utilization. Moreover, the identification of unique and common active reactions among the Zygomycetes oleaginous models using Flux Variability Analysis unveiled a set of gene/enzyme candidates as metabolic engineering targets for cellular improvement. Thus, iWV1213 offers a powerful metabolic engineering tool for multi-level omics analysis, enabling strain optimization as a cell factory platform of lipid-based production. PMID:26911256

  10. Genomes on ice.

    Science.gov (United States)

    Parkhill, Julian

    2016-03-01

    This month's Genome Watch discusses the analysis of a Helicobacter pylori genome from the preserved Copper-Age mummy known as the Iceman and how ancient genomes shed light on the history of bacterial pathogens. PMID:26853114

  11. Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics

    Directory of Open Access Journals (Sweden)

    Brunham Robert C

    2004-07-01

    Background: We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results: We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions: The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
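
    For readers unfamiliar with the Weibull function, the following Python sketch evaluates the standard two-parameter form, with shape β and scale η. The abstract does not state which parameterization the authors fitted, so the use of this form, the name `eta` and the example values are assumptions made for illustration.

```python
import math


def weibull_cdf(x: float, beta: float, eta: float) -> float:
    """Cumulative distribution F(x) = 1 - exp(-(x / eta) ** beta) for x > 0."""
    if beta <= 0 or eta <= 0:
        raise ValueError("shape and scale must be positive")
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / eta) ** beta))


def weibull_reliability(x: float, beta: float, eta: float) -> float:
    """Reliability (survival) function R(x) = 1 - F(x)."""
    return 1.0 - weibull_cdf(x, beta, eta)


# Hypothetical shape values: the curve's form changes with beta at a fixed scale.
for beta in (0.5, 1.0, 2.0):
    print(beta, round(weibull_reliability(1.0, beta, eta=2.0), 3))
```

    Whatever the value of β, the distribution reaches 1 - 1/e (about 63%) at x = η, so β controls only the shape of the curve around that point.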

  12. Genome-scale detection of hypermethylated CpG islands in circulating cell-free DNA of hepatocellular carcinoma patients

    OpenAIRE

    Wen, Lu; Li, Jingyi; Guo, Huahu; Liu, Xiaomeng; Zheng, Shengmin; Zhang, Dafang; Zhu, Weihua; Qu, Jianhui; Guo, Limin; Du, Dexiao; Jin, Xiao; Zhang, Yuhao; Gao, Yun; Jie SHEN; Ge, Hao

    2015-01-01

    Despite advances in DNA methylome analyses of cells and tissues, current techniques for genome-scale profiling of DNA methylation in circulating cell-free DNA (ccfDNA) remain limited. Here we describe a methylated CpG tandems amplification and sequencing (MCTA-Seq) method that can detect thousands of hypermethylated CpG islands simultaneously in ccfDNA. This highly sensitive technique can work with genomic DNA as little as 7.5 pg, which is equivalent to 2.5 copies of the haploid genome. We ha...

  13. Genomic Insights into Aquimarina sp. Strain EL33, a Bacterial Symbiont of the Gorgonian Coral Eunicella labiata

    Science.gov (United States)

    Keller-Costa, Tina; Silva, Rúben; Lago-Lestón, Asunción

    2016-01-01

    To address the metabolic potential of symbiotic Aquimarina spp., we report here the genome sequence of Aquimarina sp. strain EL33, a bacterium isolated from the gorgonian coral Eunicella labiata. This first-described (to our knowledge) animal-associated Aquimarina genome possesses a sophisticated repertoire of genes involved in drug/antibiotic resistance and biosynthesis. PMID:27540075

  14. The Bacterial Communities of Full-Scale Biologically Active, Granular Activated Carbon Filters Are Stable and Diverse and Potentially Contain Novel Ammonia-Oxidizing Microorganisms

    OpenAIRE

    LaPara, Timothy M.; Hope Wilkinson, Katheryn; Strait, Jacqueline M.; Hozalski, Raymond M.; Sadowksy, Michael J.; Hamilton, Matthew J

    2015-01-01

    The bacterial community composition of the full-scale biologically active, granular activated carbon (BAC) filters operated at the St. Paul Regional Water Services (SPRWS) was investigated using Illumina MiSeq analysis of PCR-amplified 16S rRNA gene fragments. These bacterial communities were consistently diverse (Shannon index, >4.4; richness estimates, >1,500 unique operational taxonomic units [OTUs]) throughout the duration of the 12-month study period. In addition, only modest shifts in t...

  15. Evaluating the efficacy of the new Ion PGM Hi-Q Sequencing Kit applied to bacterial genomes.

    Science.gov (United States)

    Pereira, Felipe L; Soares, Siomar C; Dorella, Fernanda A; Leal, Carlos A G; Figueiredo, Henrique C P

    2016-05-01

    Benchtop NGS platforms are constantly evolving to follow new advances in genomics. Thus, the manufacturers are making improvements, such as the recent Ion PGM Hi-Q chemistry. We evaluate the efficacy of this new Hi-Q approach by comparing it with the former Ion PGM kit and the Illumina MiSEQ Nextera 3rd version. The Hi-Q chemistry showed improvement on mapping reads, with 49 errors per 10 kbp mapped; in contrast, the former kit had 89 errors. Additionally, there was a reduction of 80% in erroneous variant detection with the Torrent Variant Caller. Also, an enhancement was observed in de novo assembly with a more confident result in whole-genome MLST, with up to 96% of the alleles assembled correctly for both tested microbial genomes. All of these advantages result in a final genome sequence closer to the performance with MiSEQ and will help make comparative genomic analysis a reliable task. PMID:27033417
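
    The two mapping-error figures quoted in this abstract can be put on a common footing with a short Python calculation. The helper names are hypothetical, and the sketch assumes both counts refer to the same 10 kbp of mapped sequence, as the abstract implies.

```python
def errors_per_base(errors: int, mapped_bases: int) -> float:
    """Convert an error count over a mapped length into a per-base error rate."""
    return errors / mapped_bases


def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Fraction by which new_rate is lower than old_rate."""
    return (old_rate - new_rate) / old_rate


former_kit = errors_per_base(89, 10_000)  # 89 errors per 10 kbp
hi_q = errors_per_base(49, 10_000)        # 49 errors per 10 kbp

print(f"former kit: {former_kit:.4f} errors/base")
print(f"Hi-Q:       {hi_q:.4f} errors/base")
print(f"relative reduction: {relative_reduction(former_kit, hi_q):.1%}")
```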

  16. Genome-scale metabolic analysis of Clostridium thermocellum for bioethanol production

    Directory of Open Access Journals (Sweden)

    Brooks J Paul

    2010-03-01

    Background: Microorganisms possess diverse metabolic capabilities that can potentially be leveraged for efficient production of biofuels. Clostridium thermocellum (ATCC 27405) is a thermophilic anaerobe that is both cellulolytic and ethanologenic, meaning that it can directly use the plant sugar, cellulose, and biochemically convert it to ethanol. A major challenge in using microorganisms for chemical production is the need to modify the organism to increase production efficiency. The process of properly engineering an organism is typically arduous. Results: Here we present a genome-scale model of C. thermocellum metabolism, iSR432, for the purpose of establishing a computational tool to study the metabolic network of C. thermocellum and facilitate efforts to engineer C. thermocellum for biofuel production. The model consists of 577 reactions involving 525 intracellular metabolites, 432 genes, and a proteomic-based representation of a cellulosome. The process of constructing this metabolic model led to suggested annotation refinements for 27 genes and identification of areas of metabolism requiring further study. The accuracy of the iSR432 model was tested using experimental growth and by-product secretion data for growth on cellobiose and fructose. Analysis using this model captures the relationship between the reduction-oxidation state of the cell and ethanol secretion and allowed for prediction of gene deletions and environmental conditions that would increase ethanol production. Conclusions: By incorporating genomic sequence data, network topology, and experimental measurements of enzyme activities and metabolite fluxes, we have generated a model that is reasonably accurate at predicting the cellular phenotype of C. thermocellum and establishes a strong foundation for rational strain design. In addition, we are able to draw some important conclusions regarding the underlying metabolic mechanisms for observed behaviors of C. thermocellum

  17. Genome-scale reconstruction of the metabolic network in Yersinia pestis, strain 91001

    Energy Technology Data Exchange (ETDEWEB)

    Navid, A; Almaas, E

    2009-01-13

    The gram-negative bacterium Yersinia pestis, the aetiological agent of bubonic plague, is one of the deadliest pathogens known to man. Despite its historical reputation, plague is a modern disease which annually afflicts thousands of people. Public safety considerations greatly limit clinical experimentation on this organism and thus development of theoretical tools to analyze the capabilities of this pathogen is of utmost importance. Here, we report the first genome-scale metabolic model of Yersinia pestis biovar Mediaevalis based both on its recently annotated genome, and physiological and biochemical data from literature. Our model demonstrates excellent agreement with Y. pestis known metabolic needs and capabilities. Since Y. pestis is a meiotrophic organism, we have developed CryptFind, a systematic approach to identify all candidate cryptic genes responsible for known and theoretical meiotrophic phenomena. In addition to uncovering every known cryptic gene for Y. pestis, our analysis of the rhamnose fermentation pathway suggests that betB is the responsible cryptic gene. Despite all of our medical advances, we still do not have a vaccine for bubonic plague. Recent discoveries of antibiotic resistant strains of Yersinia pestis coupled with the threat of plague being used as a bioterrorism weapon compel us to develop new tools for studying the physiology of this deadly pathogen. Using our theoretical model, we can study the cell's phenotypic behavior under different circumstances and identify metabolic weaknesses which may be harnessed for the development of therapeutics. Additionally, the automatic identification of cryptic genes expands the usage of genomic data for pharmaceutical purposes.

  18. Genome-scale reconstruction and analysis of the metabolic network in the hyperthermophilic archaeon Sulfolobus solfataricus.

    Directory of Open Access Journals (Sweden)

    Thomas Ulas

    We describe the reconstruction of a genome-scale metabolic model of the crenarchaeon Sulfolobus solfataricus, a hyperthermoacidophilic microorganism. It grows in terrestrial volcanic hot springs with growth occurring at pH 2-4 (optimum 3.5) and a temperature of 75-80°C (optimum 80°C). The genome of Sulfolobus solfataricus P2 contains 2,992,245 bp on a single circular chromosome and encodes 2,977 proteins and a number of RNAs. The network comprises 718 metabolic and 58 transport/exchange reactions and 705 unique metabolites, based on the annotated genome and available biochemical data. Using the model in conjunction with constraint-based methods, we simulated the metabolic fluxes induced by different environmental and genetic conditions. The predictions were compared to experimental measurements and phenotypes of S. solfataricus. Furthermore, the performance of the network for 35 different carbon sources known for S. solfataricus from the literature was simulated. Comparing the growth on different carbon sources revealed that glycerol is the carbon source with the highest biomass flux per imported carbon atom (75% higher than glucose). Experimental data was also used to fit the model to phenotypic observations. In addition to the commonly known heterotrophic growth of S. solfataricus, the crenarchaeon is also able to grow autotrophically using the hydroxypropionate-hydroxybutyrate cycle for bicarbonate fixation. We integrated this pathway into our model and compared bicarbonate fixation with growth on glucose as sole carbon source. Finally, we tested the robustness of the metabolism with respect to gene deletions using the method of Minimization of Metabolic Adjustment (MOMA), which predicted that 18% of all possible single gene deletions would be lethal for the organism.

  19. Theobroma cacao: A genetically integrated physical map and genome-scale comparative synteny analysis

    Science.gov (United States)

    A comprehensive integrated genomic framework is considered a centerpiece of genomic research. In collaboration with the USDA-ARS (SHRS) and Mars Inc., the Clemson University Genomics Institute (CUGI) has developed a genetically anchored physical map of the T. cacao genome. Three BAC libraries contai...

  20. Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform

    CERN Document Server

    Cox, Anthony J; Jakobi, Tobias; Rosone, Giovanna

    2012-01-01

    Motivation: The Burrows-Wheeler transform (BWT) is the foundation of many algorithms for compression and indexing of text data, but the cost of computing the BWT of very large string collections has prevented these techniques from being widely applied to the large sets of sequences often encountered as the outcome of DNA sequencing experiments. In previous work, we presented a novel algorithm that allows the BWT of human genome scale data to be computed on very moderate hardware, thus enabling us to investigate the BWT as a tool for the compression of such datasets. Results: We first used simulated reads to explore the relationship between the level of compression and the error rate, the length of the reads and the level of sampling of the underlying genome and compare choices of second-stage compression algorithm. We demonstrate that compression may be greatly improved by a particular reordering of the sequences in the collection and give a novel 'implicit sorting' strategy that enables these benefits to be re...
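
    As a reminder of what the transform itself does, here is a minimal Python sketch of the textbook BWT built from sorted rotations, together with its naive inverse. This is not the authors' algorithm, which targets very large sequence collections on moderate hardware; the quadratic-memory version below is only suitable for short strings, and the `$` sentinel is an assumed end-of-string marker that must not occur in the input.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform: last column of the sorted rotation matrix."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Naive inverse: rebuild the sorted rotation matrix one column at a time."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(last_column[i] + table[i] for i in range(len(last_column)))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


transformed = bwt("banana")
print(transformed)               # annb$aa
print(inverse_bwt(transformed))  # banana
```

    The transform tends to place identical characters next to each other, which is what makes a second-stage compressor effective on its output.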

  1. Genome-scale reconstruction of metabolic network for a halophilic extremophile, Chromohalobacter salexigens DSM 3043

    Directory of Open Access Journals (Sweden)

    Oner Ebru

    2011-01-01

    Background: Chromohalobacter salexigens (formerly Halomonas elongata DSM 3043) is a halophilic extremophile with a very broad salinity range and is used as a model organism to elucidate prokaryotic osmoadaptation due to its strong euryhaline phenotype. Results: C. salexigens DSM 3043's metabolism was reconstructed based on genomic, biochemical and physiological information via a non-automated but iterative process. This manually-curated reconstruction accounts for 584 genes, 1386 reactions, and 1411 metabolites. By using flux balance analysis, the model was extensively validated against literature data on the C. salexigens phenotypic features, the transport and use of different substrates for growth as well as against experimental observations on the uptake and accumulation of industrially important organic osmolytes, ectoine, betaine, and its precursor choline, which play important roles in the adaptive response to osmotic stress. Conclusions: This work presents the first comprehensive genome-scale metabolic model of a halophilic bacterium. Being a useful guide for identification and filling of knowledge gaps, the reconstructed metabolic network iOA584 will accelerate the research on halophilic bacteria towards application of systems biology approaches and design of metabolic engineering strategies.

  2. Reconstruction and analysis of a genome-scale metabolic network of Corynebacterium glutamicum S9114.

    Science.gov (United States)

    Mei, Jie; Xu, Nan; Ye, Chao; Liu, Liming; Wu, Jianrong

    2016-01-10

    Corynebacterium glutamicum S9114 is commonly used for industrial glutamate production. Therefore, a comprehensive understanding of the physiological and metabolic characteristics of C. glutamicum is important for developing its potential for industrial production. A genome-scale metabolic model, iJM658, was reconstructed based on genome annotation and literature mining. The model consists of 658 genes, 984 metabolites and 1065 reactions. The model quantitatively predicted C. glutamicum growth on different carbon and nitrogen sources and determined 129 genes to be essential for cell growth. The iJM658 model predicted that C. glutamicum had two glutamate biosynthesis pathways and lacked eight key genes in biotin synthesis. Robustness analysis indicated that a relatively low oxygen level (1.21 mmol/gDW/h) would improve the glutamate production rate. Potential metabolic engineering targets for improving γ-aminobutyrate and isoleucine production rate were predicted by in silico deletion or overexpression of some genes. The iJM658 model is a useful tool for understanding and optimizing the metabolism of C. glutamicum and a valuable resource for future metabolic and physiological research. PMID:26392034

  3. A bacterial artificial chromosome library for the Australian saltwater crocodile (Crocodylus porosus) and its utilization in gene isolation and genome characterization

    Science.gov (United States)

    2009-01-01

    Background: Crocodilians (Order Crocodylia) are an ancient vertebrate group of tremendous ecological, social, and evolutionary importance. They are the only extant reptilian members of Archosauria, a monophyletic group that also includes birds, dinosaurs, and pterosaurs. Consequently, crocodilian genomes represent a gateway through which the molecular evolution of avian lineages can be explored. To facilitate comparative genomics within Crocodylia and between crocodilians and other archosaurs, we have constructed a bacterial artificial chromosome (BAC) library for the Australian saltwater crocodile, Crocodylus porosus. This is the first BAC library for a crocodile and only the second BAC resource for a crocodilian. Results: The C. porosus BAC library consists of 101,760 individually archived clones stored in 384-well microtiter plates. NotI digestion of random clones indicates an average insert size of 102 kb. Based on a genome size estimate of 2778 Mb, the library affords 3.7 fold (3.7×) coverage of the C. porosus genome. To investigate the utility of the library in studying sequence distribution, probes derived from CR1a and CR1b, two crocodilian CR1-like retrotransposon subfamilies, were hybridized to C. porosus macroarrays. The results indicate that there are a minimum of 20,000 CR1a/b elements in C. porosus and that their distribution throughout the genome is decidedly non-random. To demonstrate the utility of the library in gene isolation, we probed the C. porosus macroarrays with an overgo designed from a C-mos (oocyte maturation factor) partial cDNA. A BAC containing C-mos was identified and the C-mos locus was sequenced. Nucleotide and amino acid sequence alignment of the C. porosus C-mos coding sequence with avian and reptilian C-mos orthologs reveals greater sequence similarity between C. porosus and birds (specifically chicken and zebra finch) than between C. porosus and squamates (green anole). Conclusion: We have demonstrated the utility of the
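
    The 3.7× coverage figure reported in this abstract follows directly from the clone count, mean insert size and genome size estimate. The short Python check below reproduces it; the function name is hypothetical, and the calculation assumes 1 Mb = 1,000 kb and that every clone carries an insert of the mean size.

```python
def library_coverage(n_clones: int, mean_insert_kb: float, genome_size_mb: float) -> float:
    """Genome equivalents represented by a clone library.

    coverage = (number of clones * mean insert size) / genome size
    """
    total_insert_kb = n_clones * mean_insert_kb
    genome_size_kb = genome_size_mb * 1000.0
    return total_insert_kb / genome_size_kb


# Values quoted in the abstract: 101,760 clones, 102 kb inserts, 2778 Mb genome.
coverage = library_coverage(101_760, 102, 2778)
print(f"{coverage:.1f}x")  # 3.7x
```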

  4. Analysis of genetic variation and potential applications in genome-scale metabolic modeling

    Directory of Open Access Journals (Sweden)

    João Gonçalo Rocha Cardoso

    2015-02-01

    Genetic variation is the motor of evolution and allows organisms to overcome the environmental challenges they encounter. It can be both beneficial and harmful in the process of engineering cell factories for the production of proteins and chemicals. Throughout the history of biotechnology, there have been efforts to exploit genetic variation in our favor to create strains with favorable phenotypes. Genetic variation can either be present in natural populations or it can be artificially created by mutagenesis and selection or adaptive laboratory evolution. On the other hand, unintended genetic variation during a long term production process may lead to significant economic losses and it is important to understand how to control this type of variation. With the emergence of next-generation sequencing technologies, genetic variation in microbial strains can now be determined on an unprecedented scale and resolution by re-sequencing thousands of strains systematically. In this article, we review challenges in the integration and analysis of large-scale re-sequencing data, present an extensive overview of bioinformatics methods for predicting the effects of genetic variants on protein function, and discuss approaches for interfacing existing bioinformatics approaches with genome-scale models of cellular processes in order to predict effects of sequence variation on cellular phenotypes.

  5. Genome-scale modeling of the protein secretory machinery in yeast

    DEFF Research Database (Denmark)

    Feizi, Amir; Österlund, Tobias; Petranovic, Dina;

    2013-01-01

    The protein secretory machinery in Eukarya is involved in post-translational modification (PTMs) and sorting of the secretory and many transmembrane proteins. While the secretory machinery has been well-studied using classic reductionist approaches, a holistic view of its complex nature is lacking. Here, we present the first genome-scale model for the yeast secretory machinery which captures the knowledge generated through more than 50 years of research. The model is based on the concept of a Protein Specific Information Matrix (PSIM: characterized by seven PTMs features). An algorithm was developed which mimics secretory machinery and assigns each secretory protein to a particular secretory class that determines the set of PTMs and transport steps specific to each protein. Protein abundances were integrated with the model in order to gain system level estimation of the metabolic demands...

  6. Integration of gene expression data into genome-scale metabolic models

    DEFF Research Database (Denmark)

    Åkesson, M.; Förster, Jochen; Nielsen, Jens

    2004-01-01

    A framework for integration of transcriptome data into stoichiometric metabolic models to obtain improved flux predictions is presented. The key idea is to exploit the regulatory information in the expression data to give additional constraints on the metabolic fluxes in the model. Measurements of gene expression from chemostat and batch cultures of Saccharomyces cerevisiae were combined with a recently developed genome-scale model, and the computed metabolic flux distributions were compared to experimental values from carbon labeling experiments and metabolic network analysis. The integration of expression data resulted in improved predictions of metabolic behavior in batch cultures, enabling quantitative predictions of exchange fluxes as well as qualitative estimations of changes in intracellular fluxes. A critical discussion of correlation between gene expression and metabolic fluxes is...

  7. Moving image analysis to the cloud: A case study with a genome-scale tomographic study

    International Nuclear Information System (INIS)

    Over the last decade, the time required to measure a terabyte of microscopic imaging data has gone from years to minutes. This shift has moved many of the challenges away from experimental design and measurement to scalable storage, organization, and analysis. As many scientists and scientific institutions lack training and competencies in these areas, major bottlenecks have arisen and led to substantial delays and gaps between measurement, understanding, and dissemination. We present in this paper a framework for analyzing large 3D datasets using cloud-based computational and storage resources. We demonstrate its applicability by showing the setup and costs associated with the analysis of a genome-scale study of bone microstructure. We then evaluate the relative advantages and disadvantages associated with local versus cloud infrastructures.

  8. Genome scale metabolic reconstruction of Chlorella variabilis for exploring its metabolic potential for biofuels.

    Science.gov (United States)

    Juneja, Ankita; Chaplen, Frank W R; Murthy, Ganti S

    2016-08-01

    A compartmentalized genome scale metabolic network was reconstructed for Chlorella variabilis to offer insight into various metabolic potentials from this alga. The model, iAJ526, was reconstructed with 1455 reactions, 1236 metabolites and 526 genes. 21% of the reactions were transport reactions and about 81% of the total reactions were associated with enzymes. Along with gap filling reactions, 2 major sub-pathways were added to the model, chitosan synthesis and rhamnose metabolism. The reconstructed model had reaction participation of 4.3 metabolites per reaction and average lethality fraction of 0.21. The model was effective in capturing the growth of C. variabilis under three light conditions (white, red and red+blue light) with fair agreement. This reconstructed metabolic network will serve an important role in systems biology for further exploration of metabolism for specific target metabolites and enable improved characteristics in the strain through metabolic engineering. PMID:26995318

  9. Moving image analysis to the cloud: A case study with a genome-scale tomographic study

    Science.gov (United States)

    Mader, Kevin; Stampanoni, Marco

    2016-01-01

    Over the last decade, the time required to measure a terabyte of microscopic imaging data has gone from years to minutes. This shift has moved many of the challenges away from experimental design and measurement to scalable storage, organization, and analysis. As many scientists and scientific institutions lack training and competencies in these areas, major bottlenecks have arisen and led to substantial delays and gaps between measurement, understanding, and dissemination. We present in this paper a framework for analyzing large 3D datasets using cloud-based computational and storage resources. We demonstrate its applicability by showing the setup and costs associated with the analysis of a genome-scale study of bone microstructure. We then evaluate the relative advantages and disadvantages associated with local versus cloud infrastructures.

  10. Genome-scale estimate of the metabolic turnover of E. Coli from the energy balance analysis

    Science.gov (United States)

    De Martino, D.

    2016-02-01

    In this article the notion of metabolic turnover is revisited in the light of recent results of out-of-equilibrium thermodynamics. By means of Monte Carlo methods we perform an exact sampling of the enzymatic fluxes in a genome-scale metabolic network of E. coli in stationary growth conditions, from which we infer the metabolite turnover times. However, the latter are inferred from net fluxes, and we argue that this approximation is not valid for enzymes working near thermodynamic equilibrium. We recalculate turnover times from total fluxes by performing an energy balance analysis of the network and resorting to the fluctuation theorem. We find in many cases values one order of magnitude lower, implying a faster picture of intermediate metabolism.
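    The gap between net-flux and total-flux turnover times can be illustrated with a small sketch. It assumes the commonly used flux-force relation v+/v- = exp(-ΔG/RT) for the forward and backward fluxes of a single reaction; the abstract does not state the exact relation the author uses, so this is an illustrative assumption. The function names and the ΔG values are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def total_over_net(delta_g_kj_per_mol: float, temperature_k: float = 298.15) -> float:
    """Ratio (v+ + v-) / (v+ - v-) for a reaction with delta_g < 0.

    With v+/v- = exp(A) and A = -delta_g / (R*T), the ratio equals coth(A/2).
    """
    if delta_g_kj_per_mol >= 0:
        raise ValueError("delta_g must be negative for a net forward flux")
    affinity = -delta_g_kj_per_mol / (R_KJ * temperature_k)
    return 1.0 / math.tanh(affinity / 2.0)


def turnover_time(concentration: float, flux: float) -> float:
    """Turnover time = metabolite pool size divided by the flux through it."""
    return concentration / flux


# Hypothetical reaction close to equilibrium (delta_g = -0.5 kJ/mol):
ratio_near = total_over_net(-0.5)
# Hypothetical reaction far from equilibrium (delta_g = -50 kJ/mol):
ratio_far = total_over_net(-50.0)
print(f"near equilibrium: total/net = {ratio_near:.2f}")
print(f"far from equilibrium: total/net = {ratio_far:.6f}")
```

    Under this assumption, a reaction operating about 0.5 kJ/mol from equilibrium carries a total flux roughly ten times its net flux, so a turnover time computed from the net flux would be overestimated by about one order of magnitude. Far from equilibrium the two estimates coincide.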

  11. Moving image analysis to the cloud: A case study with a genome-scale tomographic study

    Energy Technology Data Exchange (ETDEWEB)

    Mader, Kevin [4Quant Ltd., Switzerland & Institute for Biomedical Engineering at University and ETH Zurich (Switzerland); Stampanoni, Marco [Institute for Biomedical Engineering at University and ETH Zurich, Switzerland & Swiss Light Source at Paul Scherrer Institut, Villigen (Switzerland)

    2016-01-28

    Over the last decade, the time required to measure a terabyte of microscopic imaging data has gone from years to minutes. This shift has moved many of the challenges away from experimental design and measurement to scalable storage, organization, and analysis. As many scientists and scientific institutions lack training and competencies in these areas, major bottlenecks have arisen and led to substantial delays and gaps between measurement, understanding, and dissemination. We present in this paper a framework for analyzing large 3D datasets using cloud-based computational and storage resources. We demonstrate its applicability by showing the setup and costs associated with the analysis of a genome-scale study of bone microstructure. We then evaluate the relative advantages and disadvantages associated with local versus cloud infrastructures.

  12. Genome-scale consequences of cofactor balancing in engineered pentose utilization pathways in Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Amit Ghosh

    Full Text Available Biofuels derived from lignocellulosic biomass offer promising alternative renewable energy sources for transportation fuels. Significant effort has been made to engineer Saccharomyces cerevisiae to efficiently ferment pentose sugars such as D-xylose and L-arabinose into biofuels such as ethanol through heterologous expression of the fungal D-xylose and L-arabinose pathways. However, one of the major bottlenecks in these fungal pathways is that the cofactors are not balanced, which contributes to inefficient utilization of pentose sugars. We utilized a genome-scale model of S. cerevisiae to predict the maximal achievable growth rate for cofactor balanced and imbalanced D-xylose and L-arabinose utilization pathways. Dynamic flux balance analysis (DFBA) was used to simulate batch fermentation of glucose, D-xylose, and L-arabinose. The dynamic models and experimental results are in good agreement for the wild type and for the engineered D-xylose utilization pathway. Cofactor balancing the engineered D-xylose and L-arabinose utilization pathways simulated an increase in ethanol batch production of 24.7% while simultaneously reducing the predicted substrate utilization time by 70%. Furthermore, the effects of cofactor balancing the engineered pentose utilization pathways were evaluated throughout the genome-scale metabolic network. This work not only provides new insights into the global network effects of cofactor balancing but also provides useful guidelines for engineering a recombinant yeast strain with cofactor balanced engineered pathways that efficiently co-utilizes pentose and hexose sugars for biofuels production. Experimental switching of cofactor usage in enzymes has been demonstrated, but is a time-consuming effort. Therefore, systems biology models that can predict the likely outcome of such strain engineering efforts are highly useful for motivating which efforts are likely to be worth the significant time investment.

  13. Separation of bacterial spores from flowing water in macro-scale cavities by ultrasonic standing waves

    CERN Document Server

    Lipkens, B; Costolo, M; Stevens, A; Rietman, Edward

    2010-01-01

    The separation of micron-sized bacterial spores (Bacillus cereus) from a steady flow of water through the use of ultrasonic standing waves is demonstrated. An ultrasonic resonator with cross-section of 0.0254 m x 0.0254 m has been designed with a flow inlet and outlet for a water stream that ensures laminar flow conditions into and out of the resonator section of the flow tube. A 0.01905-m diameter PZT-4, nominal 2-MHz transducer is used to generate ultrasonic standing waves in the resonator. The acoustic resonator is 0.0356 m from transducer face to the opposite reflector wall with the acoustic field in a direction orthogonal to the water flow direction. At fixed frequency excitation, spores are concentrated at the stable locations of the acoustic radiation force and trapped in the resonator region. The effect of the transducer voltage and frequency on the efficiency of spore capture in the resonator has been investigated. Successful separation of B. cereus spores from water with typical volume flow rates of...

  14. Integrated genome-scale analysis of the transcriptional regulatory landscape in a blood stem/progenitor cell model.

    Science.gov (United States)

    Wilson, Nicola K; Schoenfelder, Stefan; Hannah, Rebecca; Sánchez Castillo, Manuel; Schütte, Judith; Ladopoulos, Vasileios; Mitchelmore, Joanna; Goode, Debbie K; Calero-Nieto, Fernando J; Moignard, Victoria; Wilkinson, Adam C; Jimenez-Madrid, Isabel; Kinston, Sarah; Spivakov, Mikhail; Fraser, Peter; Göttgens, Berthold

    2016-03-31

    Comprehensive study of transcriptional control processes will be required to enhance our understanding of both normal and malignant hematopoiesis. Modern sequencing technologies have revolutionized our ability to generate genome-scale expression and histone modification profiles, transcription factor (TF)-binding maps, and also comprehensive chromatin-looping information. Many of these technologies, however, require large numbers of cells, and therefore cannot be applied to rare hematopoietic stem/progenitor cell (HSPC) populations. The stem cell factor-dependent multipotent progenitor cell line HPC-7 represents a well-recognized cell line model for HSPCs. Here we report genome-wide maps for 17 TFs, 3 histone modifications, DNase I hypersensitive sites, and high-resolution promoter-enhancer interactomes in HPC-7 cells. Integrated analysis of these complementary data sets revealed TF occupancy patterns of genomic regions involved in promoter-anchored loops. Moreover, preferential associations between pairs of TFs bound at either ends of chromatin loops led to the identification of 4 previously unrecognized protein-protein interactions between key blood stem cell regulators. All HPC-7 data sets are freely available both through standard repositories and a user-friendly Web interface. Together with previously generated genome-wide data sets, this study integrates HPC-7 data into a genomic resource on par with ENCODE tier 1 cell lines and, importantly, is the only current model with comprehensive genome-scale data that is relevant to HSPC biology. PMID:26809507

  15. Bacterial niche-specific genome expansion is coupled with highly frequent gene disruptions in deep-sea sediments

    KAUST Repository

    Wang, Yong

    2011-12-21

    The complexity and dynamics of microbial metagenomes may be evaluated by genome size, gene duplication and the disruption rate between lineages. In this study, we pyrosequenced the metagenomes of microbes obtained from the brine and sediment of a deep-sea brine pool in the Red Sea to explore the possible genomic adaptations of the microbes in response to environmental changes. The microbes from the brine and sediments (both surface and deep layers) of the Atlantis II Deep brine pool had similar communities whereas the effective genome size varied from 7.4 Mb in the brine to more than 9 Mb in the sediment. This genome expansion in the sediment samples was due to gene duplication as evidenced by enrichment of the homologs. The duplicated genes were highly disrupted, on average by 47.6% and 70% for the surface and deep layers of the Atlantis II Deep sediment samples, respectively. The disruptive effects appeared to be mainly due to point mutations and frameshifts. In contrast, the homologs from the Atlantis II Deep brine sample were highly conserved and they maintained relatively small copy numbers. Likely, the adaptation of the microbes in the sediments was coupled with pseudogenizations and possibly functional diversifications of the paralogs in the expanded genomes. The maintenance of the pseudogenes in the large genomes is discussed. © 2011 Wang et al.

  16. Bacterial niche-specific genome expansion is coupled with highly frequent gene disruptions in deep-sea sediments.

    Directory of Open Access Journals (Sweden)

    Yong Wang

    Full Text Available The complexity and dynamics of microbial metagenomes may be evaluated by genome size, gene duplication and the disruption rate between lineages. In this study, we pyrosequenced the metagenomes of microbes obtained from the brine and sediment of a deep-sea brine pool in the Red Sea to explore the possible genomic adaptations of the microbes in response to environmental changes. The microbes from the brine and sediments (both surface and deep layers) of the Atlantis II Deep brine pool had similar communities whereas the effective genome size varied from 7.4 Mb in the brine to more than 9 Mb in the sediment. This genome expansion in the sediment samples was due to gene duplication as evidenced by enrichment of the homologs. The duplicated genes were highly disrupted, on average by 47.6% and 70% for the surface and deep layers of the Atlantis II Deep sediment samples, respectively. The disruptive effects appeared to be mainly due to point mutations and frameshifts. In contrast, the homologs from the Atlantis II Deep brine sample were highly conserved and they maintained relatively small copy numbers. Likely, the adaptation of the microbes in the sediments was coupled with pseudogenizations and possibly functional diversifications of the paralogs in the expanded genomes. The maintenance of the pseudogenes in the large genomes is discussed.

  17. Evaluation of effectiveness of bacterial product which can degrade pesticide-dimethoate on the scale of true practice test

    International Nuclear Information System (INIS)

    Dimethoate, an organophosphate pesticide, has been widely used in Dalat, Lamdong. It is highly toxic to birds, humans and other mammals. Its widespread use has caused environmental concern on the basis of the frequent detection of dimethoate in soil and water. Microorganisms are key agents in the degradation of waste, oil and a vast array of organic pesticides in terrestrial and aquatic ecosystems. In a previous study, bacterial products that can degrade dimethoate were produced. The present study was designed to evaluate the effectiveness of a bacterial product that can degrade the pesticide dimethoate at the scale of a real-practice test. The bacterial product was applied to soil grown with cauliflower and Chinese cabbage that had been sprayed with organophosphorus pesticides (dimethoate and chlorpyrifos), and the pesticide residues in soil, water and vegetables were as follows. The residues of dimethoate and chlorpyrifos differed between soil grown with cauliflower and soil grown with Chinese cabbage. They were concentrated mostly in the surface litter and topsoil layers, at depths from 0 to 20 cm. From 20 cm to 100 cm depth, the pesticide residues were negligible. The chlorpyrifos residue in soil was also small. Dimethoate residues in soil grown with cauliflower were higher than in soil grown with Chinese cabbage. On the basis of the environmental criteria of the Ministry for Science, Technology and Environment (6/95), dimethoate residues in soil grown with cauliflower exceeded the maximum limit. When the bacterial product was applied to the soil, pesticide residues in soil decreased. The results also indicated that chlorpyrifos residues in water (sampled by day at depths of 75 cm and 100 cm) were small. Dimethoate residues in water were also small. Dimethoate residues in water obtained from the cauliflower bed were higher than those from the Chinese cabbage bed. When the bacterial product was applied to the soil, pesticide residues in water decreased. On the basis of the environmental criteria of

  18. Comparative fluorescence in situ hybridization mapping of a 431-kb Arabidopsis thaliana bacterial artificial chromosome contig reveals the role of chromosomal duplications in the expansion of the Brassica rapa genome.

    OpenAIRE

    Jackson, S A; Cheng, Z; Wang, M L; Goodman, H M; Jiang, J

    2000-01-01

    Comparative genome studies are important contributors to our understanding of genome evolution. Most comparative genome studies in plants have been based on genetic mapping of homologous DNA loci in different genomes. Large-scale comparative physical mapping has been hindered by the lack of efficient and affordable techniques. We report here the adaptation of fluorescence in situ hybridization (FISH) techniques for comparative physical mapping between Arabidopsis thaliana and Brassica rapa. A...

  19. Scaling laws governing stochastic growth and division of single bacterial cells

    CERN Document Server

    Iyer-Biswas, Srividya; Henry, Jonathan T; Lo, Klevin; Burov, Stanislav; Lin, Yihan; Crooks, Gavin E; Crosson, Sean; Dinner, Aaron R; Scherer, Norbert F

    2014-01-01

    Uncovering the quantitative laws that govern the growth and division of single cells remains a major challenge. Using a unique combination of technologies that yields unprecedented statistical precision, we find that the sizes of individual Caulobacter crescentus cells increase exponentially in time. We also establish that they divide upon reaching a critical multiple (≈ 1.8) of their initial sizes, rather than an absolute size. We show that when the temperature is varied, the growth and division timescales scale proportionally with each other over the physiological temperature range. Strikingly, the cell-size and division-time distributions can both be rescaled by their mean values such that the condition-specific distributions collapse to universal curves. We account for these observations with a minimal stochastic model that is based on an autocatalytic cycle. It predicts the scalings, as well as specific functional forms for the universal curves. Our experimental and theoretical analysis reveals a ...

  20. Niche differentiation of bacterial communities at a millimeter scale in Shark Bay microbial mats

    OpenAIRE

    Hon Lun Wong; Daniela-Lee Smith; Pieter T. Visscher; Burns, Brendan P.

    2015-01-01

    Modern microbial mats can provide key insights into early Earth ecosystems, and Shark Bay, Australia, holds one of the best examples of these systems. Identifying the spatial distribution of microorganisms with mat depth facilitates a greater understanding of specific niches and potentially novel microbial interactions. High throughput sequencing coupled with elemental analyses and biogeochemical measurements of two distinct mat types (smooth and pustular) at a millimeter scale were undertake...

  1. Scaling laws governing stochastic growth and division of single bacterial cells.

    Science.gov (United States)

    Iyer-Biswas, Srividya; Wright, Charles S; Henry, Jonathan T; Lo, Klevin; Burov, Stanislav; Lin, Yihan; Crooks, Gavin E; Crosson, Sean; Dinner, Aaron R; Scherer, Norbert F

    2014-11-11

    Uncovering the quantitative laws that govern the growth and division of single cells remains a major challenge. Using a unique combination of technologies that yields unprecedented statistical precision, we find that the sizes of individual Caulobacter crescentus cells increase exponentially in time. We also establish that they divide upon reaching a critical multiple (≈ 1.8) of their initial sizes, rather than an absolute size. We show that when the temperature is varied, the growth and division timescales scale proportionally with each other over the physiological temperature range. Strikingly, the cell-size and division-time distributions can both be rescaled by their mean values such that the condition-specific distributions collapse to universal curves. We account for these observations with a minimal stochastic model that is based on an autocatalytic cycle. It predicts the scalings, as well as specific functional forms for the universal curves. Our experimental and theoretical analysis reveals a simple physical principle governing these complex biological processes: a single temperature-dependent scale of cellular time governs the stochastic dynamics of growth and division in balanced growth conditions. PMID:25349411
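    The two quantitative statements in this abstract, exponential size increase and division at a critical multiple of about 1.8 of the initial size, together fix the mean division time in units of the growth timescale. The sketch below works through that deterministic relation only; it ignores the stochastic fluctuations that the paper models, and the growth-rate value used is a hypothetical placeholder, not a number from the study.

```python
import math


def size_at(t: float, initial_size: float, growth_rate: float) -> float:
    """Exponential growth: s(t) = s0 * exp(k * t)."""
    return initial_size * math.exp(growth_rate * t)


def division_time(growth_rate: float, critical_multiple: float = 1.8) -> float:
    """Time to reach critical_multiple * s0 under exponential growth: ln(m) / k."""
    return math.log(critical_multiple) / growth_rate


k = 0.01  # hypothetical growth rate per minute, chosen only for illustration
tau = division_time(k)
print(f"division time = {tau:.1f} min")
print(f"size ratio at division = {size_at(tau, 1.0, k):.2f}")
# Because tau = ln(1.8) / k, the product k * tau is fixed, so any change in k
# (for example with temperature) rescales the division time proportionally.
print(f"k * tau = {k * tau:.4f}")
```

    Since the product of growth rate and division time depends only on the critical multiple, this simple picture is consistent with the abstract's statement that growth and division timescales scale proportionally with each other.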

  2. Genomes and virulence factors of novel bacterial pathogens causing bleaching disease in the marine red alga Delisea pulchra.

    Directory of Open Access Journals (Sweden)

    Neil Fernandes

    Full Text Available Nautella sp. R11, a member of the marine Roseobacter clade, causes a bleaching disease in the temperate-marine red macroalga, Delisea pulchra. To begin to elucidate the molecular mechanisms underpinning the ability of Nautella sp. R11 to colonize, invade and induce bleaching of D. pulchra, we sequenced and analyzed its genome. The genome encodes several factors such as adhesion mechanisms, systems for the transport of algal metabolites, enzymes that confer resistance to oxidative stress, cytolysins, and global regulatory mechanisms that may allow for the switch of Nautella sp. R11 to a pathogenic lifestyle. Many virulence effectors common in phytopathogenic bacteria are also found in the R11 genome, such as the plant hormone indole acetic acid, cellulose fibrils, succinoglycan and nodulation protein L. Comparative genomics with non-pathogenic Roseobacter strains and a newly identified pathogen, Phaeobacter sp. LSS9, revealed a patchy distribution of putative virulence factors in all genomes, but also led to the identification of a quorum sensing (QS)-dependent transcriptional regulator that was unique to pathogenic Roseobacter strains. This observation supports the model that a combination of virulence factors and QS-dependent regulatory mechanisms enables indigenous members of the host alga's epiphytic microbial community to switch to a pathogenic lifestyle, especially under environmental conditions when innate host defence mechanisms are compromised.

  3. The architecture of ArgR-DNA complexes at the genome-scale in Escherichia coli

    DEFF Research Database (Denmark)

    Cho, Suhyung; Cho, Yoo-Bok; Kang, Taek Jin;

    2015-01-01

    DNA-binding motifs that are recognized by transcription factors (TFs) have been well studied; however, challenges remain in determining the in vivo architecture of TF-DNA complexes on a genome-scale. Here, we determined the in vivo architecture of Escherichia coli arginine repressor (ArgR)-DNA co...

  4. Genome-scale metabolic model of Pichia pastoris with native and humanized glycosylation of recombinant proteins

    DEFF Research Database (Denmark)

    Irani, Zahra Azimzadeh; Kerkhoven, Eduard J.; Shojaosadati, Seyed Abbas;

    2016-01-01

    Pichia pastoris is used for commercial production of human therapeutic proteins, and genome-scale models of P. pastoris metabolism have been generated in the past to study the metabolism and associated protein production by this yeast. A major challenge with clinical usage of recombinant proteins...

  5. A map of human genome variation from population-scale sequencing

    OpenAIRE

    Abdallah, Assya; Abecasis, Gonçalo R.; Abyzov, Alexej; Affourtit, Jason; Agarwala, Richa; Aksay, Gozde; Albers, Cornelis A.; Albrecht, Marcus W.; Alkan, Can; Altshuler, David L.; Ambrogio, Lauren; Amstislavskiy, Vyacheslav S.; Anderson, Paul; Ashworth, Dana; Attiya, Said

    2010-01-01

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mot...

  6. iRsp1095: A genome-scale reconstruction of the Rhodobacter sphaeroides metabolic network

    Directory of Open Access Journals (Sweden)

    Gorzalski Alexander S

    2011-07-01

    Full Text Available Abstract Background Rhodobacter sphaeroides is one of the best studied purple non-sulfur photosynthetic bacteria and serves as an excellent model for the study of photosynthesis and the metabolic capabilities of this and related facultative organisms. The ability of R. sphaeroides to produce hydrogen (H2), polyhydroxybutyrate (PHB) or other hydrocarbons, as well as its ability to utilize atmospheric carbon dioxide (CO2) as a carbon source under defined conditions, make it an excellent candidate for use in a wide variety of biotechnological applications. A genome-level understanding of its metabolic capabilities should help realize this biotechnological potential. Results Here we present a genome-scale metabolic network model for R. sphaeroides strain 2.4.1, designated iRsp1095, consisting of 1,095 genes, 796 metabolites and 1,158 reactions, including R. sphaeroides-specific biomass reactions developed in this study. Constraint-based analysis showed that iRsp1095 agreed well with experimental observations when modeling growth under respiratory and phototrophic conditions. Genes essential for phototrophic growth were predicted by single gene deletion analysis. During pathway-level analyses of R. sphaeroides metabolism, an alternative route for CO2 assimilation was identified. Evaluation of photoheterotrophic H2 production using iRsp1095 indicated that maximal yield would be obtained from growing cells, with this predicted maximum ~50% higher than that observed experimentally from wild type cells. Competing pathways that might prevent the achievement of this theoretical maximum were identified to guide future genetic studies. Conclusions iRsp1095 provides a robust framework for future metabolic engineering efforts to optimize the solar- and nutrient-powered production of biofuels and other valuable products by R. sphaeroides and closely related organisms.

  7. Diversity and relationships of cocirculating modern human rotaviruses revealed using large-scale comparative genomics.

    Science.gov (United States)

    McDonald, Sarah M; McKell, Allison O; Rippinger, Christine M; McAllen, John K; Akopov, Asmik; Kirkness, Ewen F; Payne, Daniel C; Edwards, Kathryn M; Chappell, James D; Patton, John T

    2012-09-01

    Group A rotaviruses (RVs) are 11-segmented, double-stranded RNA viruses and are primary causes of gastroenteritis in young children. Despite their medical relevance, the genetic diversity of modern human RVs is poorly understood, and the impact of vaccine use on circulating strains remains unknown. In this study, we report the complete genome sequence analysis of 58 RVs isolated from children with severe diarrhea and/or vomiting at Vanderbilt University Medical Center (VUMC) in Nashville, TN, during the years spanning community vaccine implementation (2005 to 2009). The RVs analyzed include 36 G1P[8], 18 G3P[8], and 4 G12P[8] Wa-like genogroup 1 strains with VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6 genotype constellations of I1-R1-C1-M1-A1-N1-T1-E1-H1. By constructing phylogenetic trees, we identified 2 to 5 subgenotype alleles for each gene. The results show evidence of intragenogroup gene reassortment among the cocirculating strains. However, several isolates from different seasons maintained identical allele constellations, consistent with the notion that certain RV clades persisted in the community. By comparing the genes of VUMC RVs to those of other archival and contemporary RV strains for which sequences are available, we defined phylogenetic lineages and verified that the diversity of the strains analyzed in this study reflects that seen in other regions of the world. Importantly, the VP4 and VP7 proteins encoded by VUMC RVs and other contemporary strains show amino acid changes in or near neutralization domains, which might reflect antigenic drift of the virus. Thus, this large-scale, comparative genomic study of modern human RVs provides significant insight into how this pathogen evolves during its spread in the community. PMID:22696651

  8. Diversity and Relationships of Cocirculating Modern Human Rotaviruses Revealed Using Large-Scale Comparative Genomics

    Science.gov (United States)

    McKell, Allison O.; Rippinger, Christine M.; McAllen, John K.; Akopov, Asmik; Kirkness, Ewen F.; Payne, Daniel C.; Edwards, Kathryn M.; Chappell, James D.; Patton, John T.

    2012-01-01

    Group A rotaviruses (RVs) are 11-segmented, double-stranded RNA viruses and are primary causes of gastroenteritis in young children. Despite their medical relevance, the genetic diversity of modern human RVs is poorly understood, and the impact of vaccine use on circulating strains remains unknown. In this study, we report the complete genome sequence analysis of 58 RVs isolated from children with severe diarrhea and/or vomiting at Vanderbilt University Medical Center (VUMC) in Nashville, TN, during the years spanning community vaccine implementation (2005 to 2009). The RVs analyzed include 36 G1P[8], 18 G3P[8], and 4 G12P[8] Wa-like genogroup 1 strains with VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6 genotype constellations of I1-R1-C1-M1-A1-N1-T1-E1-H1. By constructing phylogenetic trees, we identified 2 to 5 subgenotype alleles for each gene. The results show evidence of intragenogroup gene reassortment among the cocirculating strains. However, several isolates from different seasons maintained identical allele constellations, consistent with the notion that certain RV clades persisted in the community. By comparing the genes of VUMC RVs to those of other archival and contemporary RV strains for which sequences are available, we defined phylogenetic lineages and verified that the diversity of the strains analyzed in this study reflects that seen in other regions of the world. Importantly, the VP4 and VP7 proteins encoded by VUMC RVs and other contemporary strains show amino acid changes in or near neutralization domains, which might reflect antigenic drift of the virus. Thus, this large-scale, comparative genomic study of modern human RVs provides significant insight into how this pathogen evolves during its spread in the community. PMID:22696651

  9. The population genomics of begomoviruses: global scale population structure and gene flow

    Directory of Open Access Journals (Sweden)

    Prasanna HC

    2010-09-01

    Full Text Available Abstract Background The rapidly growing availability of diverse full genome sequences from across the world is increasing the feasibility of studying the large-scale population processes that underlie observable patterns of virus diversity. In particular, characterizing the genetic structure of virus populations could potentially reveal much about how factors such as geographical distributions, host ranges and gene flow between populations combine to produce the discontinuous patterns of genetic diversity that we perceive as distinct virus species. Among the richest and most diverse full genome datasets that are available is that for the dicotyledonous-plant-infecting genus Begomovirus, in the family Geminiviridae. The begomoviruses all share the same whitefly vector, are highly recombinogenic and are distributed throughout tropical and subtropical regions where they seriously threaten the food security of the world's poorest people. Results We focus here on using a model-based population genetic approach to identify the genetically distinct sub-populations within the global begomovirus meta-population. We demonstrate the existence of at least seven major sub-populations that can further be sub-divided into as many as thirty-four significantly differentiated and genetically cohesive minor sub-populations. Using the population structure framework revealed in the present study, we further explored the extent of gene flow and recombination between genetic populations. Conclusions Although geographical barriers are apparently the most significant underlying cause of the seven major population sub-divisions, within the framework of these sub-divisions, we explore patterns of gene flow to reveal that both host range differences and genetic barriers to recombination have probably been major contributors to the minor population sub-divisions that we have identified. We believe that the global Begomovirus population structure revealed here could

  10. Determining the control circuitry of redox metabolism at the genome-scale.

    Directory of Open Access Journals (Sweden)

    Stephen Federowicz

    2014-04-01

    Full Text Available Determining how facultative anaerobic organisms sense and direct cellular responses to electron acceptor availability has been a subject of intense study. However, even in the model organism Escherichia coli, established mechanisms only explain a small fraction of the hundreds of genes that are regulated during electron acceptor shifts. Here we propose a qualitative model that accounts for the full breadth of regulated genes by detailing how two global transcription factors (TFs), ArcA and Fnr of E. coli, sense key metabolic redox ratios and act on a genome-wide basis to regulate anabolic, catabolic, and energy generation pathways. We first fill gaps in our knowledge of this transcriptional regulatory network by carrying out ChIP-chip and gene expression experiments to identify 463 regulatory events. We then interfaced this reconstructed regulatory network with a highly curated genome-scale metabolic model to show that ArcA and Fnr regulate >80% of total metabolic flux and 96% of differential gene expression across fermentative and nitrate respiratory conditions. Based on the data, we propose a feedforward with feedback trim regulatory scheme, given the extensive repression of catabolic genes by ArcA and extensive activation of chemiosmotic genes by Fnr. We further corroborated this regulatory scheme by showing a correlation (r² = 0.71, p < 1e-6) between changes in metabolic flux and changes in regulatory activity across fermentative and nitrate respiratory conditions. Finally, we are able to relate the proposed model to a wealth of previously generated data by contextualizing the existing transcriptional regulatory network.

  11. Identification of Bacterial Small RNAs by RNA Sequencing

    DEFF Research Database (Denmark)

    Gómez Lozano, María; Marvig, Rasmus Lykke; Molin, Søren;

    2014-01-01

    Small regulatory RNAs (sRNAs) in bacteria are known to modulate gene expression and control a variety of processes including metabolic reactions, stress responses, and pathogenesis in response to environmental signals. A method to identify bacterial sRNAs on a genome-wide scale based on RNA seque...

  12. Noise analysis of genome-scale protein synthesis using a discrete computational model of translation

    International Nuclear Information System (INIS)

    Noise in genetic networks has been the subject of extensive experimental and computational studies. However, very few of these studies have considered noise properties using mechanistic models that account for the discrete movement of ribosomes and RNA polymerases along their corresponding templates (messenger RNA (mRNA) and DNA). The large size of these systems, which scales with the number of genes, mRNA copies, codons per mRNA, and ribosomes, is responsible for some of the challenges. Additionally, one should be able to describe the dynamics of ribosome exchange between the free ribosome pool and those bound to mRNAs, as well as how mRNA species compete for ribosomes. We developed an efficient algorithm for stochastic simulations that addresses these issues and used it to study the contribution and trade-offs of noise to translation properties (rates, time delays, and rate-limiting steps). The algorithm scales linearly with the number of mRNA copies, which allowed us to study the importance of genome-scale competition between mRNAs for the same ribosomes. We determined that noise is minimized under conditions maximizing the specific synthesis rate. Moreover, sensitivity analysis of the stochastic system revealed the importance of the elongation rate in the resultant noise, whereas the translation initiation rate constant was more closely related to the average protein synthesis rate. We observed significant differences between our results and the noise properties of the most commonly used translation models. Overall, our studies demonstrate that the use of full mechanistic models is essential for the study of noise in translation and transcription

  13. Noise analysis of genome-scale protein synthesis using a discrete computational model of translation

    Energy Technology Data Exchange (ETDEWEB)

    Racle, Julien; Hatzimanikatis, Vassily, E-mail: vassily.hatzimanikatis@epfl.ch [Laboratory of Computational Systems Biotechnology, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne (Switzerland); Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne (Switzerland); Stefaniuk, Adam Jan [Laboratory of Computational Systems Biotechnology, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne (Switzerland)

    2015-07-28

    Noise in genetic networks has been the subject of extensive experimental and computational studies. However, very few of these studies have considered noise properties using mechanistic models that account for the discrete movement of ribosomes and RNA polymerases along their corresponding templates (messenger RNA (mRNA) and DNA). The large size of these systems, which scales with the number of genes, mRNA copies, codons per mRNA, and ribosomes, is responsible for some of the challenges. Additionally, one should be able to describe the dynamics of ribosome exchange between the free ribosome pool and those bound to mRNAs, as well as how mRNA species compete for ribosomes. We developed an efficient algorithm for stochastic simulations that addresses these issues and used it to study the contribution and trade-offs of noise to translation properties (rates, time delays, and rate-limiting steps). The algorithm scales linearly with the number of mRNA copies, which allowed us to study the importance of genome-scale competition between mRNAs for the same ribosomes. We determined that noise is minimized under conditions maximizing the specific synthesis rate. Moreover, sensitivity analysis of the stochastic system revealed the importance of the elongation rate in the resultant noise, whereas the translation initiation rate constant was more closely related to the average protein synthesis rate. We observed significant differences between our results and the noise properties of the most commonly used translation models. Overall, our studies demonstrate that the use of full mechanistic models is essential for the study of noise in translation and transcription.
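
    The kind of model described above can be illustrated with a small Gillespie-style simulation in which several mRNAs compete for a shared pool of ribosomes. This is only a minimal sketch under stated assumptions, not the authors' algorithm: it rebuilds the full event list at every step (so it does not have the linear scaling reported in the abstract), each ribosome occupies a single codon, and the function name, rate constants and pool size are all hypothetical illustration values.

```python
import random


def simulate_translation(n_mrna=3, length=20, n_ribosomes=10,
                         k_init=0.1, k_elong=5.0, t_end=50.0, seed=0):
    """Toy stochastic simulation of ribosomes moving along mRNAs.

    All parameter values are illustrative assumptions, not taken from the paper.
    Returns (proteins made per mRNA, free ribosomes, bound ribosomes).
    """
    rng = random.Random(seed)
    # lattice[m][i] is True when codon i of mRNA m is occupied by a ribosome.
    lattice = [[False] * length for _ in range(n_mrna)]
    free = n_ribosomes
    proteins = [0] * n_mrna
    t = 0.0
    while True:
        # Collect every possible event as (rate, mRNA index, site).
        # site == -1 encodes initiation; other values are the ribosome position.
        events = []
        for m, sites in enumerate(lattice):
            if free > 0 and not sites[0]:
                events.append((k_init * free, m, -1))
            for i, occupied in enumerate(sites):
                if occupied and (i == length - 1 or not sites[i + 1]):
                    events.append((k_elong, m, i))
        total = sum(rate for rate, _, _ in events)
        if total == 0:
            break
        # Exponentially distributed waiting time until the next event.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose one event with probability proportional to its rate.
        pick = rng.random() * total
        acc = 0.0
        for rate, m, i in events:
            acc += rate
            if pick < acc:
                break
        if i == -1:                      # initiation: a free ribosome binds
            lattice[m][0] = True
            free -= 1
        elif i == length - 1:            # termination: release ribosome, count protein
            lattice[m][i] = False
            free += 1
            proteins[m] += 1
        else:                            # elongation: hop one codon forward
            lattice[m][i] = False
            lattice[m][i + 1] = True
    bound = sum(sum(sites) for sites in lattice)
    return proteins, free, bound


proteins, free, bound = simulate_translation()
print(proteins, free, bound)
```

    Because initiation draws on the shared free pool, ribosomes bound to one mRNA reduce the initiation rate on all others, which is the competition effect the abstract refers to.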

  14. Genome-wide linkage using the Social Responsiveness Scale in Utah autism pedigrees

    Directory of Open Access Journals (Sweden)

    Coon Hilary

    2010-04-01

    Full Text Available Abstract Background Autism Spectrum Disorders (ASD) are phenotypically heterogeneous, characterized by impairments in the development of communication and social behaviour and the presence of repetitive behaviour and restricted interests. Dissecting the genetic complexity of ASD may require phenotypic data reflecting more detail than is offered by a categorical clinical diagnosis. Such data are available from the Social Responsiveness Scale (SRS), which is a continuous, quantitative measure of social ability giving scores that range from significant impairment to above average ability. Methods We present genome-wide results for 64 multiplex and extended families ranging from two to nine generations. SRS scores were available from 518 genotyped pedigree subjects, including affected and unaffected relatives. Genotypes from the Illumina 6 k single nucleotide polymorphism panel were provided by the Center for Inherited Disease Research. Quantitative and qualitative analyses were done using MCLINK, a software package that uses Markov chain Monte Carlo (MCMC) methods to perform multilocus linkage analysis on large extended pedigrees. Results When analysed as a qualitative trait, linkage occurred in the same locations as in our previous affected-only genome scan of these families, with findings on chromosomes 7q31.1-q32.3 [heterogeneity logarithm of the odds (HLOD) = 2.91], 15q13.3 (HLOD = 3.64), and 13q12.3 (HLOD = 2.23). Additional positive qualitative results were seen on chromosomes 6 and 10 in regions that may be of interest for other neuropsychiatric disorders. When analysed as a quantitative trait, results replicated a peak found in an independent sample using quantitative SRS scores on chromosome 11p15.1-p15.4 (HLOD = 2.77). Additional positive quantitative results were seen on chromosomes 7, 9, and 19. Conclusions The SRS linkage peaks reported here substantially overlap with peaks found in our previous affected-only genome scan of clinical diagnosis

  15. A high-throughput approach to identify genomic variants of bacterial metabolite producers at the single-cell level.

    Science.gov (United States)

    Binder, Stephan; Schendzielorz, Georg; Stäbler, Norma; Krumbach, Karin; Hoffmann, Kristina; Bott, Michael; Eggeling, Lothar

    2012-01-01

    We present a novel method for visualizing intracellular metabolite concentrations within single cells of Escherichia coli and Corynebacterium glutamicum that expedites the screening process of producers. It is based on transcription factors and we used it to isolate new L-lysine producing mutants of C. glutamicum from a large library of mutagenized cells using fluorescence-activated cell sorting (FACS). This high-throughput method fills the gap between existing high-throughput methods for mutant generation and genome analysis. The technology has diverse applications in the analysis of producer populations and screening of mutant libraries that carry mutations in plasmids or genomes. PMID:22640862

  16. BAC CGH-array identified specific small-scale genomic imbalances in diploid DMBA-induced rat mammary tumors

    Directory of Open Access Journals (Sweden)

    Samuelson Emma

    2012-08-01

    Full Text Available Abstract Background Development of breast cancer is a multistage process influenced by hormonal and environmental factors as well as by genetic background. The search for genes underlying this malignancy has recently been highly productive, but the etiology behind this complex disease is still not understood. In studies using animal cancer models, heterogeneity of the genetic background and environmental factors is reduced and thus analysis and identification of genetic aberrations in tumors may become easier. To identify chromosomal regions potentially involved in the initiation and progression of mammary cancer, in the present work we subjected a subset of experimental mammary tumors to cytogenetic and molecular genetic analysis. Methods Mammary tumors were induced with DMBA (7,12-dimethylbenz[a]anthracene) in female rats from the susceptible SPRD-Cu3 strain and from crosses and backcrosses between this strain and the resistant WKY strain. We first produced a general overview of chromosomal aberrations in the tumors using conventional karyotyping (G-banding) and Comparative Genome Hybridization (CGH) analyses. Particular chromosomal changes were then analyzed in more detail using an in-house developed BAC (bacterial artificial chromosome) CGH-array platform. Results Tumors appeared to be diploid by conventional karyotyping; however, several sub-microscopic chromosome gains or losses in the tumor material were identified by BAC CGH-array analysis. An oncogenetic tree analysis based on the BAC CGH-array data suggested gain of rat chromosome (RNO) band 12q11, loss of RNO5q32 or RNO6q21 as the earliest events in the development of these mammary tumors. Conclusions Some of the identified changes appear to be more specific for DMBA-induced mammary tumors and some are similar to those previously reported in ACI rat model for estradiol-induced mammary tumors. The latter group of changes is more interesting, since they may represent anomalies that involve

  17. BAC CGH-array identified specific small-scale genomic imbalances in diploid DMBA-induced rat mammary tumors

    International Nuclear Information System (INIS)

    Development of breast cancer is a multistage process influenced by hormonal and environmental factors as well as by genetic background. The search for genes underlying this malignancy has recently been highly productive, but the etiology behind this complex disease is still not understood. In studies using animal cancer models, heterogeneity of the genetic background and environmental factors is reduced and thus analysis and identification of genetic aberrations in tumors may become easier. To identify chromosomal regions potentially involved in the initiation and progression of mammary cancer, in the present work we subjected a subset of experimental mammary tumors to cytogenetic and molecular genetic analysis. Mammary tumors were induced with DMBA (7,12-dimethylbenz[a]anthracene) in female rats from the susceptible SPRD-Cu3 strain and from crosses and backcrosses between this strain and the resistant WKY strain. We first produced a general overview of chromosomal aberrations in the tumors using conventional karyotyping (G-banding) and Comparative Genome Hybridization (CGH) analyses. Particular chromosomal changes were then analyzed in more detail using an in-house developed BAC (bacterial artificial chromosome) CGH-array platform. Tumors appeared to be diploid by conventional karyotyping; however, several sub-microscopic chromosome gains or losses in the tumor material were identified by BAC CGH-array analysis. An oncogenetic tree analysis based on the BAC CGH-array data suggested gain of rat chromosome (RNO) band 12q11, loss of RNO5q32 or RNO6q21 as the earliest events in the development of these mammary tumors. Some of the identified changes appear to be more specific for DMBA-induced mammary tumors and some are similar to those previously reported in ACI rat model for estradiol-induced mammary tumors. The latter group of changes is more interesting, since they may represent anomalies that involve genes with a critical role in mammary tumor development. Genetic

  18. Structural Characterization of Genomes by Large Scale Sequence-Structure Threading: Application of Reliability Analysis in Structural Genomics

    OpenAIRE

    Brunham Robert C; Ho Sui Shannan J; Cherkasov Artem; Jones Steven JM

    2004-01-01

    Abstract Background We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results We have found that the Weibull f...
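
    For reference, the standard two-parameter Weibull distribution used in reliability theory is given below. The abstract is truncated, so how the authors map fold occurrence onto the variable x is not shown here; the symbols k (shape) and λ (scale) are the usual textbook notation and are an assumption, not taken from the paper.

```latex
% Cumulative distribution function and density of the two-parameter Weibull
% distribution, for x >= 0, shape k > 0 and scale \lambda > 0.
F(x; k, \lambda) = 1 - \exp\!\left[-\left(\frac{x}{\lambda}\right)^{k}\right],
\qquad
f(x; k, \lambda) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}
                   \exp\!\left[-\left(\frac{x}{\lambda}\right)^{k}\right].
```

    In reliability work, k < 1 corresponds to a failure rate that decreases over time, k = 1 to a constant rate (the exponential distribution), and k > 1 to a rate that increases over time.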

  19. SECOM: A novel hash seed and community detection based-approach for genome-scale protein domain identification

    KAUST Repository

    Fan, Ming

    2012-06-28

    With rapid advances in the development of DNA sequencing technologies, a plethora of high-throughput genome and proteome data from a diverse spectrum of organisms have been generated. The functional annotation and evolutionary history of proteins are usually inferred from domains predicted from the genome sequences. Traditional database-based domain prediction methods cannot identify novel domains, however, and alignment-based methods, which look for recurring segments in the proteome, are computationally demanding. Here, we propose a novel genome-wide domain prediction method, SECOM. Instead of conducting all-against-all sequence alignment, SECOM first indexes all the proteins in the genome by using a hash seed function. Local similarity can thus be detected and encoded into a graph structure, in which each node represents a protein sequence and each edge weight represents the shared hash seeds between the two nodes. SECOM then formulates the domain prediction problem as an overlapping community-finding problem in this graph. A backward graph percolation algorithm that efficiently identifies the domains is proposed. We tested SECOM on five recently sequenced genomes of aquatic animals. Our tests demonstrated that SECOM was able to identify most of the known domains identified by InterProScan. When compared with the alignment-based method, SECOM showed higher sensitivity in detecting putative novel domains, while it was also three orders of magnitude faster. For example, SECOM was able to predict a novel sponge-specific domain in nucleoside-triphosphatase (NTPases). Furthermore, SECOM discovered two novel domains, likely of bacterial origin, that are taxonomically restricted to sea anemone and hydra. SECOM is an open-source program and available at http://sfb.kaust.edu.sa/Pages/Software.aspx. © 2012 Fan et al.
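
    The indexing step described above can be sketched as follows. This is a minimal illustration, not SECOM itself: the abstract does not specify the hash seed function, so contiguous k-mers are used here as an assumed stand-in, the function name and toy sequences are hypothetical, and the community-finding (graph percolation) stage is not shown.

```python
from collections import defaultdict
from itertools import combinations


def seed_graph(proteins, k=3):
    """Build a weighted protein graph from shared seeds.

    proteins: dict mapping protein name -> amino acid sequence.
    Returns a dict mapping (name_a, name_b), with name_a < name_b, to the
    number of distinct seeds the two sequences share.
    Assumption: a "seed" is a contiguous k-mer; SECOM's real seed function
    is not described in the abstract.
    """
    # Index: seed -> set of proteins containing it (no all-against-all alignment).
    index = defaultdict(set)
    for name, seq in proteins.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)

    # Every pair of proteins that shares a seed gets one unit of edge weight.
    edges = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Toy sequences (made up for illustration).
toy = {"p1": "MKTAYIAK", "p2": "TAYIAKQR", "p3": "GGGGSGGG"}
print(seed_graph(toy))  # {('p1', 'p2'): 4}
```

    Each node is a protein and each edge weight counts shared seeds, matching the graph described in the abstract; domains would then be sought as overlapping communities in this graph.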

  20. Genome-scale modeling of the evolutionary path to C4 photosynthesis

    Science.gov (United States)

    Myers, Christopher R.; Bogart, Eli

    In C4 photosynthesis, plants maintain a high carbon dioxide level in specialized bundle sheath cells surrounding leaf veins and restrict CO2 assimilation to those cells, favoring CO2 over O2 in competition for Rubisco active sites. In C3 plants, which do not possess such a carbon concentrating mechanism, CO2 fixation is reduced due to this competition. Despite the complexity of the C4 system, it has evolved convergently from more than 60 independent origins in diverse families of plants around the world over the last 30 million years. We study the evolution of the C4 system in a genome-scale model of plant metabolism that describes interacting mesophyll and bundle sheath cells and enforces key nonlinear kinetic relationships. Adapting the zero-temperature string method for simulating transition paths in physics and chemistry, we find the highest-fitness paths connecting C3 and C4 positions in the model's high-dimensional parameter space, and show that they reproduce known aspects of the C3-C4 transition while making additional predictions about metabolic changes along the path. We explore the relationship between evolutionary history and C4 biochemical subtype, and the effects of atmospheric carbon dioxide levels.

  1. Novel insights into obesity and diabetes through genome-scale metabolic modeling

    Directory of Open Access Journals (Sweden)

    Leif Väremo

    2013-04-01

    Full Text Available The growing prevalence of metabolic diseases, such as obesity and diabetes, is putting a high strain on global healthcare systems as well as increasing the demand for efficient treatment strategies. More than 360 million people worldwide are suffering from type 2 diabetes and, with the current trends, the projection is that 10% of the global adult population will be affected by 2030. In light of the systemic properties of metabolic diseases as well as the interconnected nature of metabolism, it is necessary to begin taking a holistic approach to study these diseases. Human genome-scale metabolic models (GEMs) are topological and mathematical representations of cell metabolism and have proven to be valuable tools in the area of systems biology. Successful applications of GEMs include the process of gaining further biological and mechanistic understanding of diseases, finding potential biomarkers and identifying new drug targets. This review will focus on the modeling of human metabolism in the field of obesity and diabetes, showing its vast range of applications of clinical importance as well as pointing out future challenges.

  2. Cancer Biomarkers from Genome-Scale DNA Methylation: Comparison of Evolutionary and Semantic Analysis Methods

    Directory of Open Access Journals (Sweden)

    Ioannis Valavanis

    2015-11-01

    Full Text Available DNA methylation profiling exploits microarray technologies, thus yielding a wealth of high-volume data. Here, an intelligent framework is applied, encompassing epidemiological genome-scale DNA methylation data produced from the Illumina’s Infinium Human Methylation 450K Bead Chip platform, in an effort to correlate interesting methylation patterns with cancer predisposition and, in particular, breast cancer and B-cell lymphoma. Feature selection and classification are employed in order to select, from an initial set of ~480,000 methylation measurements at CpG sites, predictive cancer epigenetic biomarkers and assess their classification power for discriminating healthy versus cancer related classes. Feature selection exploits evolutionary algorithms or a graph-theoretic methodology which makes use of the semantics information included in the Gene Ontology (GO) tree. The selected features, corresponding to methylation of CpG sites, attained moderate-to-high classification accuracies when imported to a series of classifiers evaluated by resampling or blindfold validation. The semantics-driven selection revealed sets of CpG sites performing similarly with evolutionary selection in the classification tasks. However, gene enrichment and pathway analysis showed that it additionally provides more descriptive sets of GO terms and KEGG pathways regarding the cancer phenotypes studied here. Results support the expediency of this methodology regarding its application in epidemiological studies.

  3. Expanding a dynamic flux balance model of yeast fermentation to genome-scale

    Directory of Open Access Journals (Sweden)

    Agosin Eduardo

    2011-05-01

    Full Text Available Abstract Background Yeast is considered to be a workhorse of the biotechnology industry for the production of many value-added chemicals, alcoholic beverages and biofuels. Optimization of the fermentation is a challenging task that greatly benefits from dynamic models able to accurately describe and predict the fermentation profile and resulting products under different genetic and environmental conditions. In this article, we developed and validated a genome-scale dynamic flux balance model, using experimentally determined kinetic constraints. Results Appropriate equations for maintenance, biomass composition, anaerobic metabolism and nutrient uptake are key to improve model performance, especially for predicting glycerol and ethanol synthesis. Prediction profiles of synthesis and consumption of the main metabolites involved in alcoholic fermentation closely agreed with experimental data obtained from numerous lab and industrial fermentations under different environmental conditions. Finally, fermentation simulations of genetically engineered yeasts closely reproduced previously reported experimental results regarding final concentrations of the main fermentation products such as ethanol and glycerol. Conclusion A useful tool to describe, understand and predict metabolite production in batch yeast cultures was developed. The resulting model, if used wisely, could help to search for new metabolic engineering strategies to manage ethanol content in batch fermentations.
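
    The general idea of a dynamic flux balance model (kinetic uptake constraints, a flux optimization at each time step, then integration of extracellular concentrations) can be sketched as below. This is a deliberately tiny toy, not the genome-scale model from the article: the "network" has a single glucose uptake flux with fixed biomass and ethanol yields, so maximizing growth simply means running uptake at its upper bound and no LP solver is needed. The function name and every parameter value are hypothetical.

```python
def simulate_dfba(glucose=100.0, biomass=0.1, ethanol=0.0,
                  vmax=10.0, km=0.5, yield_x=0.02, yield_e=1.8,
                  dt=0.01, t_end=50.0):
    """Toy dynamic flux balance simulation of a batch fermentation.

    Assumed units: glucose and ethanol in mmol/L, biomass in gDW/L,
    uptake in mmol/gDW/h, time in h. All values are illustrative only.
    Returns a list of (time, glucose, biomass, ethanol) tuples.
    """
    trajectory = [(0.0, glucose, biomass, ethanol)]
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        # Kinetic constraint: Michaelis-Menten upper bound on glucose uptake.
        kinetic_bound = vmax * glucose / (km + glucose)
        # Never take up more glucose than is left in the medium this step.
        supply_bound = glucose / (biomass * dt)
        # "Optimization": with one free flux and growth proportional to it,
        # the growth-maximizing solution is uptake at its upper bound.
        uptake = min(kinetic_bound, supply_bound)
        # Explicit Euler update of the extracellular concentrations.
        consumed = uptake * biomass * dt
        glucose = max(glucose - consumed, 0.0)
        ethanol += yield_e * consumed
        biomass += yield_x * consumed
        trajectory.append((step * dt, glucose, biomass, ethanol))
    return trajectory


final_time, final_glucose, final_biomass, final_ethanol = simulate_dfba()[-1]
print(final_time, final_glucose, final_biomass, final_ethanol)
```

    In a genome-scale setting the single `min(...)` line would be replaced by a linear program over thousands of fluxes, with the kinetic expressions supplying the bounds on exchange reactions.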

  4. A Computational Solution to Automatically Map Metabolite Libraries in the Context of Genome Scale Metabolic Networks

    Science.gov (United States)

    Merlet, Benjamin; Paulhe, Nils; Vinson, Florence; Frainay, Clément; Chazalviel, Maxime; Poupin, Nathalie; Gloaguen, Yoann; Giacomoni, Franck; Jourdan, Fabien

    2016-01-01

    This article describes a generic programmatic method for mapping chemical compound libraries on organism-specific metabolic networks from various databases (KEGG, BioCyc) and flat file formats (SBML and Matlab files). We show how this pipeline was successfully applied to decipher the coverage of chemical libraries set up by two metabolomics facilities MetaboHub (French National infrastructure for metabolomics and fluxomics) and Glasgow Polyomics (GP) on the metabolic networks available in the MetExplore web server. The present generic protocol is designed to formalize and reduce the volume of information transfer between the library and the network database. Matching of metabolites between libraries and metabolic networks is based on InChIs or InChIKeys and therefore requires that these identifiers are specified in both libraries and networks. In addition to providing covering statistics, this pipeline also allows the visualization of mapping results in the context of metabolic networks. In order to achieve this goal, we tackled issues on programmatic interaction between two servers, improvement of metabolite annotation in metabolic networks and automatic loading of a mapping in genome scale metabolic network analysis tool MetExplore. It is important to note that this mapping can also be performed on a single or a selection of organisms of interest and is thus not limited to large facilities. PMID:26909353
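
    The identifier-based matching described above can be sketched as a simple dictionary lookup. This is a minimal sketch, not the MetExplore pipeline: the function name, the data layout and the placeholder keys are all assumptions made for illustration (real InChIKeys are 27-character strings).

```python
def map_library_to_network(library, network):
    """Match library compounds to network metabolites by InChIKey.

    library: dict mapping compound name -> InChIKey (None if unannotated).
    network: dict mapping metabolite id -> InChIKey (None if unannotated).
    Returns (mapping, coverage), where mapping is name -> sorted list of
    matching metabolite ids and coverage is the mapped fraction of the library.
    """
    # Index network metabolites by key; several ids may share one key
    # (for example the same compound in different compartments).
    by_key = {}
    for met_id, key in network.items():
        if key:
            by_key.setdefault(key, []).append(met_id)

    # Compounds without an identifier cannot be matched, which is why the
    # identifiers must be present in both the library and the network.
    mapping = {
        name: sorted(by_key[key])
        for name, key in library.items()
        if key and key in by_key
    }
    coverage = len(mapping) / len(library) if library else 0.0
    return mapping, coverage


# Placeholder identifiers, not real InChIKeys.
library = {"glucose": "KEY-GLC", "pyruvate": "KEY-PYR", "unknown": None}
network = {"M_glc_c": "KEY-GLC", "M_glc_e": "KEY-GLC", "M_atp_c": "KEY-ATP"}
mapping, coverage = map_library_to_network(library, network)
print(mapping)   # {'glucose': ['M_glc_c', 'M_glc_e']}
print(coverage)  # one of three library compounds is mapped
```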

  5. An Integrative Approach for the Large-scale Identification of Human Genome Kinases Regulating Cancer Metastasis

    Science.gov (United States)

    Zhang, Hanshuo; Wu, Pu-Yen; Ma, Ming; Ye, Yanzheng; Hao, Yang; Yang, Junyu; Yin, Shenyi; Sun, Changhong; Phan, John H.; Wang, May D.; Xi, Jianzhong Jeff

    2016-01-01

    Kinases regulate the majority of biological processes and have become an important group of drug targets. To identify more kinases with potential for cancer therapy, we developed an integrative approach for the large-scale screening of functional genes capable of regulating the main traits of cancer metastasis, including cell migration as well as invasion. We first employed a self-assembled cell microarray (SAMcell) to screen functional genes that regulate cancer cell migration using a siRNA library targeting 710 human genome kinase genes. We identified 81 genes capable of significantly regulating cancer cell migration. Following up with invasion assays and bioinformatics analysis, we discovered that 16 genes that are differentially expressed in cancer samples can regulate both cell migration and invasion, among which 10 genes are well known to play critical roles in cancer development. The remaining 6 genes were experimentally validated to regulate metastasis-related traits, including cell proliferation, apoptosis and anoikis activities, besides cell motility. Together, these findings provide a new insight into the therapeutic use of human kinases. PMID:23751374

  6. MetExplore: a web server to link metabolomic experiments and genome-scale metabolic networks.

    Science.gov (United States)

    Cottret, Ludovic; Wildridge, David; Vinson, Florence; Barrett, Michael P; Charles, Hubert; Sagot, Marie-France; Jourdan, Fabien

    2010-07-01

    High-throughput metabolomic experiments aim at identifying and ultimately quantifying all metabolites present in biological systems. The metabolites are interconnected through metabolic reactions, generally grouped into metabolic pathways. Classical metabolic maps provide a relational context to help interpret metabolomics experiments and a wide range of tools have been developed to help place metabolites within metabolic pathways. However, the representation of metabolites within separate disconnected pathways overlooks most of the connectivity of the metabolome. By definition, reference pathways cannot integrate novel pathways nor show relationships between metabolites that may be linked by common neighbours without being considered as joint members of a classical biochemical pathway. MetExplore is a web server that offers the possibility to link metabolites identified in untargeted metabolomics experiments within the context of genome-scale reconstructed metabolic networks. The analysis pipeline comprises mapping metabolomics data onto the specific metabolic network of an organism, then applying graph-based methods and advanced visualization tools to enhance data analysis. The MetExplore web server is freely accessible at http://metexplore.toulouse.inra.fr. PMID:20444866

  7. Revisiting the chlorophyll biosynthesis pathway using genome scale metabolic model of Oryza sativa japonica.

    Science.gov (United States)

    Chatterjee, Ankita; Kundu, Sudip

    2015-01-01

    Chlorophyll is one of the most important pigments present in green plants and rice is one of the major food crops consumed worldwide. We curated the existing genome scale metabolic model (GSM) of rice leaf by incorporating a new compartment, reactions and transporters. We used this modified GSM to elucidate how chlorophyll is synthesized in a leaf through a series of biochemical reactions spanning different organelles, using inorganic macronutrients and light energy. We predicted the essential reactions and the associated genes of chlorophyll synthesis and validated them against the existing experimental evidence. Further, ammonia is known to be the preferred source of nitrogen in rice paddy fields. The ammonia entering the plant is assimilated in the root and leaf. The present work focuses on rice leaf metabolism. We studied the relative importance of ammonia transporters through the chloroplast and the cytosol and their interlink with other intracellular transporters. Ammonia assimilation in the leaves is carried out by the enzyme glutamine synthetase (GS), which is present in the cytosol (GS1) and chloroplast (GS2). Our results provide a possible explanation for why GS2 mutants show normal growth under minimum photorespiration and appear chlorotic when exposed to air. PMID:26443104

  8. Origin of an Alternative Genetic Code in the Extremely Small and GC–Rich Genome of a Bacterial Symbiont

    OpenAIRE

    McCutcheon, John P.; McDonald, Bradon R.; Moran, Nancy A

    2009-01-01

    Author Summary The genetic code, which relates DNA sequence to protein sequence, is nearly universal across all life. Examples of recodings do exist, but new instances are rare. Genomes that exhibit recodings typically have other extreme properties, including reduced size, reduced gene sets, and low guanine plus cytosine (GC) content. The most common recoding event, the reassignment of UGA to Tryptophan instead of Stop (Stop→Trp), was previously known from several mitochondrial and one bacter...

  9. Population genomic analysis of a bacterial plant pathogen: novel insight into the origin of Pierce's disease of grapevine in the U.S.

    Directory of Open Access Journals (Sweden)

    Leonard Nunney

    Full Text Available Invasive diseases present an increasing problem worldwide; however, genomic techniques are now available to investigate the timing and geographical origin of such introductions. We employed genomic techniques to demonstrate that the bacterial pathogen causing Pierce's disease of grapevine (PD) is not native to the US as previously assumed, but descended from a single genotype introduced from Central America. PD has posed a serious threat to the US wine industry ever since its first outbreak in Anaheim, California in the 1880s and continues to inhibit grape cultivation in a large area of the country. It is caused by infection of xylem vessels by the bacterium Xylella fastidiosa subsp. fastidiosa, a genetically distinct subspecies at least 15,000 years old. We present five independent kinds of evidence that strongly support our invasion hypothesis: (1) a genome-wide lack of genetic variability in X. fastidiosa subsp. fastidiosa found in the US, consistent with a recent common ancestor; (2) evidence for historical allopatry of the North American subspecies X. fastidiosa subsp. multiplex and X. fastidiosa subsp. fastidiosa; (3) evidence that X. fastidiosa subsp. fastidiosa evolved in a more tropical climate than X. fastidiosa subsp. multiplex; (4) much greater genetic variability in the proposed source population in Central America, variation within which the US genotypes are phylogenetically nested; and (5) the circumstantial evidence of importation of known hosts (coffee plants) from Central America directly into southern California just prior to the first known outbreak of the disease. The lack of genetic variation in X. fastidiosa subsp. fastidiosa in the US suggests that preventing additional introductions is important since new genetic variation may undermine PD control measures, or may lead to infection of other crop plants through the creation of novel genotypes via inter-subspecific recombination. In general, geographically mixing of previously

  10. Exploring the metabolic network of the epidemic pathogen Burkholderia cenocepacia J2315 via genome-scale reconstruction

    Directory of Open Access Journals (Sweden)

    Panda Gurudutta

    2011-05-01

    Full Text Available Abstract Background Burkholderia cenocepacia is a threatening nosocomial epidemic pathogen in patients with cystic fibrosis (CF) or a compromised immune system. Its high level of antibiotic resistance is an increasing concern in treatments against its infection. Strain B. cenocepacia J2315 is the most infectious isolate from CF patients. There is a strong demand to reconstruct a genome-scale metabolic network of B. cenocepacia J2315 to systematically analyze its metabolic capabilities and its virulence traits, and to search for potential clinical therapy targets. Results We reconstructed the genome-scale metabolic network of B. cenocepacia J2315. An iterative reconstruction process led to the establishment of a robust model, iKF1028, which accounts for 1,028 genes, 859 internal reactions, and 834 metabolites. The model iKF1028 captures important metabolic capabilities of B. cenocepacia J2315 with a particular focus on the biosyntheses of key metabolic virulence factors to assist in understanding the mechanism of disease infection and identifying potential drug targets. The model was tested through BIOLOG assays. Based on the model, the genome annotation of B. cenocepacia J2315 was refined and 24 genes were properly re-annotated. Gene and enzyme essentiality were analyzed to provide further insights into the genome function and architecture. A total of 45 essential enzymes were identified as potential therapeutic targets. Conclusions As the first genome-scale metabolic network of B. cenocepacia J2315, iKF1028 allows a systematic study of the metabolic properties of B. cenocepacia and its key metabolic virulence factors affecting the CF community. The model can be used as a discovery tool to design novel drugs against diseases caused by this notorious pathogen.

  11. CATMA, a comprehensive genome-scale resource for silencing and transcript profiling of Arabidopsis genes

    Directory of Open Access Journals (Sweden)

    Moreau Yves

    2007-10-01

    Gène genome annotations, respectively. To cover the remaining untagged genes, we identified 543 additional GSTs using less stringent design criteria and designed 990 sequence tags matching multiple members of gene families (Gene Family Tags or GFTs) to cover any remaining untagged genes. These latter 1,533 features constitute the CATMAv4 addition. Conclusion To update the CATMA GST repertoire, we designed 7,289 additional sequence tags, bringing the total number of tagged TAIR6-annotated Arabidopsis nuclear protein-coding genes to 26,173. This resource is used both for the production of spotted microarrays and the large-scale cloning of hairpin RNA silencing vectors. All information about the resulting updated CATMA repertoire is available through the CATMA database http://www.catma.org.

  12. A high-resolution cattle CNV map by population-scale genome sequencing

    Science.gov (United States)

    Copy Number Variations (CNVs) are common genomic structural variations that have been linked to human diseases and phenotypic traits. Prior studies in cattle have produced low-resolution CNV maps. We constructed a draft, high-resolution map of cattle CNVs based on whole genome sequencing data from 7...

  13. In Vitro Whole Genome DNA Binding Analysis of the Bacterial Replication Initiator and Transcription Factor DnaA

    OpenAIRE

    Smith, Janet L.; Grossman, Alan D.

    2015-01-01

    DnaA, the replication initiation protein in bacteria, is an AAA+ ATPase that binds and hydrolyzes ATP and exists in a heterogeneous population of ATP-DnaA and ADP-DnaA. DnaA binds cooperatively to the origin of replication and several other chromosomal regions, and functions as a transcription factor at some of these regions. We determined the binding properties of Bacillus subtilis DnaA to genomic DNA in vitro at single nucleotide resolution using in vitro DNA affinity purification and deep ...

  14. Large-scale trends in the evolution of gene structures within 11 animal genomes.

    Directory of Open Access Journals (Sweden)

    Mark Yandell

    2006-03-01

    Full Text Available We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans) together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales, from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for "Comparative Genomics Library"). Our results demonstrate that change in intron-exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.

  15. Development of a database system for mapping insertional mutations onto the mouse genome with large-scale experimental data

    OpenAIRE

    Yang, Wenwei; Jin, Ke; Xie, Xing; Li, Dongsheng; Yang, Jigang; Wang, Li; Gu, Ning; Zhong, Yang; Sun, Ling V.

    2009-01-01

    Background Insertional mutagenesis is an effective method for functional genomic studies in various organisms. It can rapidly generate easily tractable mutations. A large-scale insertional mutagenesis with the piggyBac (PB) transposon is currently performed in mice at the Institute of Developmental Biology and Molecular Medicine (IDM), Fudan University in Shanghai, China. This project is carried out via collaborations among multiple groups overseeing interconnected experimental steps and gene...

  16. Genome-scale modeling using flux ratio constraints to enable metabolic engineering of clostridial metabolism in silico

    Directory of Open Access Journals (Sweden)

    McAnulty Michael J

    2012-05-01

    Full Text Available Abstract Background Genome-scale metabolic networks and flux models are an effective platform for linking an organism genotype to its phenotype. However, few modeling approaches offer predictive capabilities to evaluate potential metabolic engineering strategies in silico. Results A new method called “flux balance analysis with flux ratios (FBrAtio)” was developed in this research and applied to a new genome-scale model of Clostridium acetobutylicum ATCC 824 (iCAC490) that contains 707 metabolites and 794 reactions. FBrAtio was used to model wild-type metabolism and metabolically engineered strains of C. acetobutylicum where only flux ratio constraints and thermodynamic reversibility of reactions were required. The FBrAtio approach allowed solutions to be found through standard linear programming. Five flux ratio constraints were required to achieve a qualitative picture of wild-type metabolism for C. acetobutylicum for the production of: (i) acetate, (ii) lactate, (iii) butyrate, (iv) acetone, (v) butanol, (vi) ethanol, (vii) CO2 and (viii) H2. Results of this simulation study coincide with published experimental results and show the knockdown of the acetoacetyl-CoA transferase increases butanol to acetone selectivity, while the simultaneous over-expression of the aldehyde/alcohol dehydrogenase greatly increases ethanol production. Conclusions FBrAtio is a promising new method for constraining genome-scale models using internal flux ratios. The method was effective for modeling wild-type and engineered strains of C. acetobutylicum.
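
    As a rough illustration of why a flux ratio constraint keeps the problem within standard linear programming: fixing the split v_a / (v_a + v_b) = r at a branch point can be rearranged into the linear equality (1 - r) v_a - r v_b = 0, which is just one more row appended to the stoichiometric matrix. The three-reaction toy network, the reaction names and the ratio value below are hypothetical and are not taken from iCAC490; this is a sketch of the general idea under those assumptions, not the authors' FBrAtio implementation.

```python
from fractions import Fraction

# Hypothetical toy network (not from iCAC490): one metabolite M,
# produced by an uptake reaction and consumed by two competing branches.
REACTIONS = ["uptake", "branch_a", "branch_b"]
S = [[1, -1, -1]]  # steady state for M: uptake - branch_a - branch_b = 0


def ratio_row(reactions, numerator, competitor, ratio):
    """Linear row encoding v_num / (v_num + v_comp) = ratio.

    Rearranged: (1 - ratio) * v_num - ratio * v_comp = 0.
    """
    row = [Fraction(0)] * len(reactions)
    row[reactions.index(numerator)] = 1 - ratio
    row[reactions.index(competitor)] = -ratio
    return row


def residuals(matrix, fluxes):
    """Return matrix @ fluxes; all zeros means every constraint holds."""
    return [sum(c * v for c, v in zip(row, fluxes)) for row in matrix]


# Append the ratio constraint as an extra row; the system stays linear.
ratio = Fraction(7, 10)  # assumed: 70% of the flux through M goes to branch_a
S_aug = S + [ratio_row(REACTIONS, "branch_a", "branch_b", ratio)]

# With uptake fixed at 10, the two equalities determine both branch fluxes.
fluxes = [Fraction(10), Fraction(7), Fraction(3)]
print(residuals(S_aug, fluxes))
```

    Because the added row is an ordinary linear equality, the augmented matrix could be handed to any linear programming solver together with the usual flux bounds and objective.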

  17. Genome-environment association study suggests local adaptation to climate at the regional scale in Fagus sylvatica.

    Science.gov (United States)

    Pluess, Andrea R; Frank, Aline; Heiri, Caroline; Lalagüe, Hadrien; Vendramin, Giovanni G; Oddou-Muratorio, Sylvie

    2016-04-01

    The evolutionary potential of long-lived species, such as forest trees, is fundamental for their local persistence under climate change (CC). Genome-environment association (GEA) analyses reveal if species in heterogeneous environments at the regional scale are under differential selection resulting in populations with potential preadaptation to CC within this area. In 79 natural Fagus sylvatica populations, neutral genetic patterns were characterized using 12 simple sequence repeat (SSR) markers, and genomic variation (144 single nucleotide polymorphisms (SNPs) out of 52 candidate genes) was related to 87 environmental predictors in the latent factor mixed model, logistic regressions and isolation by distance/environmental (IBD/IBE) tests. SSR diversity revealed relatedness at up to 150 m intertree distance but an absence of large-scale spatial genetic structure and IBE. In the GEA analyses, 16 SNPs in 10 genes responded to one or several environmental predictors and IBE, corrected for IBD, was confirmed. The GEA often reflected the proposed gene functions, including indications for adaptation to water availability and temperature. Genomic divergence and the lack of large-scale neutral genetic patterns suggest that gene flow allows the spread of advantageous alleles in adaptive genes. Thereby, adaptation processes are likely to take place in species occurring in heterogeneous environments, which might reduce their regional extinction risk under CC. PMID:26777878

  18. Capturing the response of Clostridium acetobutylicum to chemical stressors using a regulated genome-scale metabolic model

    International Nuclear Information System (INIS)

    Clostridia are anaerobic Gram-positive Firmicutes containing broad and flexible systems for substrate utilization, which have been used successfully to produce a range of industrial compounds. Clostridium acetobutylicum has been used to produce butanol on an industrial scale through acetone-butanol-ethanol (ABE) fermentation. A genome-scale metabolic (GSM) model is a powerful tool for understanding the metabolic capacities of an organism and developing metabolic engineering strategies for strain development. The integration of stress related specific transcriptomics information with the GSM model provides opportunities for elucidating the focal points of regulation

  19. Genomic variation of subseafloor archaeal and bacterial populations from venting fluids at the Mid-Cayman Rise

    Science.gov (United States)

    Anderson, R. E.; Eren, A. M.; Stepanauskas, R.; Huber, J. A.; Reveillaud, J.

    2015-12-01

    Deep-sea hydrothermal vent systems serve as windows to a dynamic, gradient-dominated deep biosphere that is home to a wide diversity of archaea, bacteria, and viruses. Until recently the majority of these microbial lineages were uncultivated, resulting in a poor understanding of how the physical and geochemical context shapes microbial evolution in the deep subsurface. By comparing metagenomes, metatranscriptomes and single-cell genomes between geologically distinct vent fields, we can better understand the relationship between the environment and the evolution of subsurface microbial communities. An ideal setting in which to use this approach is the Mid-Cayman Rise, located on the world's deepest and slowest-spreading mid-ocean ridge, which hosts both the mafic-influenced Piccard and ultramafic-influenced Von Damm vent fields. Previous work has shown that Von Damm has higher taxonomic and metabolic diversity than Piccard, consistent with geochemical model expectations, and the fluids from all vents are enriched in hydrogen (Reveillaud et al., submitted). Mapping of both metagenomes and metatranscriptomes to a combined assembly showed very little overlap among the Von Damm samples, indicating substantial variability that is consistent with the diversity of potential metabolites in this ultramafic vent field. In contrast, the most consistently abundant and active lineage across the Piccard samples was Sulfurovum, a sulfur-oxidizing chemolithotroph that uses nitrate or oxygen as an electron acceptor. Moreover, analysis of point mutations within individual lineages suggested that Sulfurovum at Piccard is under strong selection, whereas microbial genomes at Von Damm were more variable. These results are consistent with the hypothesis that the subsurface environment at Piccard supports the emergence of a dominant lineage that is under strong selection pressure, whereas the more geochemically diverse microbial habitat at Von Damm creates a wider variety of stable

  20. Genomic-scale analysis of DNA words of arbitrary length by parallel computation.

    OpenAIRE

    Yang, X Y; Ripoll, A.; Arnau Llombart, Vicente; Marín Lozano, Ignacio; Luque, E.

    2006-01-01

    In the post-genomic era, one of the main tasks is deciphering the meaning of the DNA sequences of complex organisms. In order to do so, there is a clear need for biocomputer tools able to extract and order the information of long DNA molecules, such as whole chromosomes or even complete genomes. However, most genomic analyses have been concentrated on the detection and counting of short words having sizes of between 1 and 10 nucleotides. In this paper, we describe parallel algorithms with dif...
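
    For readers unfamiliar with word counting, a minimal serial sketch is given here: a window of width k slides along the sequence one base at a time and each overlapping word is tallied. The function name and the example sequence are hypothetical; this is not the parallel algorithm of the paper, only the basic operation that such algorithms distribute.

```python
from collections import Counter


def count_words(sequence, k):
    """Count every overlapping word (k-mer) of length k in a DNA sequence."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    sequence = sequence.upper()
    # Slide a window of width k along the sequence, one base at a time.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


counts = count_words("ACGTACGTAC", 4)
for word, n in counts.most_common(3):
    print(word, n)
```

    One plausible way to parallelize this, stated as an assumption rather than as the authors' design, is to split the sequence into chunks that overlap by k - 1 bases, count each chunk independently, and sum the resulting counters.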

  1. Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale

    DEFF Research Database (Denmark)

    Huang, Shujia; Rao, Junhua; Ye, Weijian;

    2015-01-01

    well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We...... population-scale pan-genomes. Our study also highlights the usefulness of the de novo assembly strategy for definition of genome structure....

  2. Genome-scale comparison and constraint-based metabolic reconstruction of the facultative anaerobic Fe(III)-reducer Rhodoferax ferrireducens

    Directory of Open Access Journals (Sweden)

    Daugherty Sean

    2009-09-01

    Full Text Available Abstract Background Rhodoferax ferrireducens is a metabolically versatile, Fe(III)-reducing, subsurface microorganism that is likely to play an important role in the carbon and metal cycles in the subsurface. It also has the unique ability to convert sugars to electricity, oxidizing the sugars to carbon dioxide with quantitative electron transfer to graphite electrodes in microbial fuel cells. In order to expand our limited knowledge about R. ferrireducens, the complete genome sequence of this organism was further annotated and then the physiology of R. ferrireducens was investigated with a constraint-based, genome-scale in silico metabolic model and laboratory studies. Results The iterative modeling and experimental approach unveiled exciting, previously unknown physiological features, including an expanded range of substrates that support growth, such as cellobiose and citrate, and provided additional insights into important features such as the stoichiometry of the electron transport chain and the ability to grow via fumarate dismutation. Further analysis explained why R. ferrireducens is unable to grow via photosynthesis or fermentation of sugars like other members of this genus and uncovered novel genes for benzoate metabolism. The genome also revealed that R. ferrireducens is well-adapted for growth in the subsurface because it appears to be capable of dealing with a number of environmental insults, including heavy metals, aromatic compounds, nutrient limitation and oxidative stress. Conclusion This study demonstrates that combining genome-scale modeling with the annotation of a new genome sequence can guide experimental studies and accelerate the understanding of the physiology of under-studied yet environmentally relevant microorganisms.

  3. Folding Free Energies of 5'-UTRs Impact Post-Transcriptional Regulation on a Genomic Scale in Yeast.

    Directory of Open Access Journals (Sweden)

    2005-12-01

    Full Text Available Using high-throughput technologies, abundances and other features of genes and proteins have been measured on a genome-wide scale in Saccharomyces cerevisiae. In contrast, secondary structure in 5'-untranslated regions (UTRs) of mRNA has only been investigated for a limited number of genes. Here, the aim is to study genome-wide regulatory effects of mRNA 5'-UTR folding free energies. We performed computations of secondary structures in 5'-UTRs and their folding free energies for all verified genes in S. cerevisiae. We found significant correlations between folding free energies of 5'-UTRs and various transcript features measured in genome-wide studies of yeast. In particular, mRNAs with weakly folded 5'-UTRs have higher translation rates, higher abundances of the corresponding proteins, longer half-lives, and higher numbers of transcripts, and are upregulated after heat shock. Furthermore, 5'-UTRs have significantly higher folding free energies than other genomic regions and randomized sequences. We also found a positive correlation between transcript half-life and ribosome occupancy that is more pronounced for short-lived transcripts, which supports a picture of competition between translation and degradation. Among the genes with strongly folded 5'-UTRs, there is a huge overrepresentation of uncharacterized open reading frames. Based on our analysis, we conclude that (i) there is a widespread bias for 5'-UTRs to be weakly folded, (ii) folding free energies of 5'-UTRs are correlated with mRNA translation and turnover on a genomic scale, and (iii) transcripts with strongly folded 5'-UTRs are often rare and hard to find experimentally.

  4. Proficiency testing for bacterial whole genome sequencing: an end-user survey of current capabilities, requirements and priorities

    DEFF Research Database (Denmark)

    Moran-Gilad, Jacob; Sintchenko, Vitali; Karlsmose Pedersen, Susanne;

    2015-01-01

    costs. The priority pathogens reported by respondents reflected the key drivers for NGS use (high burden disease and 'high profile' pathogens). The performance of and participation in PT was perceived as important by most respondents. The wide range of sequencing and bioinformatics practices reported by...... end-users highlights the importance of standardisation and harmonisation of NGS in public health and underpins the use of PT as a means to assuring quality. The findings of this survey will guide the design of the GMI PT program in relation to the spectrum of pathogens included, testing frequency and...... volume as well as technical requirements. The PT program for external quality assurance will evolve and inform the introduction of NGS into clinical and public health microbiology practice in the post-genomic era....

  5. Proficiency Testing for Bacterial Whole Genome Sequencing: An End-User Survey of Current Capabilities, Requirements and Priorities

    DEFF Research Database (Denmark)

    Moran-Gilad, Jacob; Sintchenko, Vitali; Karlsmose Pedersen, Susanne;

    2015-01-01

    range of costs. The priority pathogens reported by respondents reflected the key drivers for NGS use (high burden disease and ‘high profile’ pathogens). The performance of and participation in PT was perceived as important by most respondents. The wide range of sequencing and bioinformatics practices...... reported by end-users highlights the importance of standardisation and harmonisation of NGS in public health and underpins the use of PT as a means to assuring quality. The findings of this survey will guide the design of the GMI PT program in relation to the spectrum of pathogens included, testing...... frequency and volume as well as technical requirements. The PT program for external quality assurance will evolve and inform the introduction of NGS into clinical and public health microbiology practice in the post-genomic era....

  6. In Vitro Whole Genome DNA Binding Analysis of the Bacterial Replication Initiator and Transcription Factor DnaA.

    Directory of Open Access Journals (Sweden)

    Janet L Smith

    2015-05-01

    Full Text Available DnaA, the replication initiation protein in bacteria, is an AAA+ ATPase that binds and hydrolyzes ATP and exists in a heterogeneous population of ATP-DnaA and ADP-DnaA. DnaA binds cooperatively to the origin of replication and several other chromosomal regions, and functions as a transcription factor at some of these regions. We determined the binding properties of Bacillus subtilis DnaA to genomic DNA in vitro at single nucleotide resolution using in vitro DNA affinity purification and deep sequencing (IDAP-Seq). We used these data to identify 269 binding regions, refine the consensus sequence of the DnaA binding site, and compare the relative affinity of binding regions for ATP-DnaA and ADP-DnaA. Most sites had a slightly higher affinity for ATP-DnaA than ADP-DnaA, but a few had a strong preference for binding ATP-DnaA. Of the 269 sites, only the eight strongest binding ones have been observed to bind DnaA in vivo, suggesting that other cellular factors or the amount of available DnaA in vivo restricts DnaA binding to these additional sites. Conversely, we found several chromosomal regions that were bound by DnaA in vivo but not in vitro, and that the nucleoid-associated protein Rok was required for binding in vivo. Our in vitro characterization of the inherent ability of DnaA to bind the genome at single nucleotide resolution provides a backdrop for interpreting data on in vivo binding and regulation of DnaA, and is an approach that should be adaptable to many other DNA binding proteins.

  7. Leveraging Large-Scale Cancer Genomics Datasets for Germline Discovery - TCGA

    Science.gov (United States)

    The session will review how data types have changed over time, focusing on how next-generation sequencing is being employed to yield more precise information about the underlying genomic variation that influences tumor etiology and biology.

  8. Amplification of pico-scale DNA mediated by bacterial carrier DNA for small-cell-number transcription factor ChIP-seq

    DEFF Research Database (Denmark)

    Jakobsen, Janus S; Bagger, Frederik O; Hasemann, Marie S;

    2015-01-01

    BACKGROUND: Chromatin-Immunoprecipitation coupled with deep sequencing (ChIP-seq) is used to map transcription factor occupancy and generate epigenetic profiles genome-wide. The requirement of nano-scale ChIP DNA for generation of sequencing libraries has impeded ChIP-seq on in vivo tissues of low...... transcription factor (CEBPA) and histone mark (H3K4me3) ChIP. We further demonstrate that genomic profiles are highly resilient to changes in carrier DNA to ChIP DNA ratios. CONCLUSIONS: This represents a significant advance compared to existing technologies, which involve either complex steps of pre...

  9. Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing

    OpenAIRE

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan

    2013-01-01

    Background Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from la...

  10. Inter-genomic displacement via lateral gene transfer of bacterial trp operons in an overall context of vertical genealogy

    Directory of Open Access Journals (Sweden)

    Keyhani Nemat O

    2004-06-01

    Full Text Available Abstract Background The growing conviction that lateral gene transfer plays a significant role in prokaryote genealogy opens up a need for comprehensive evaluations of gene-enzyme systems on a case-by-case basis. Genes of tryptophan biosynthesis are frequently organized as whole-pathway operons, an attribute that is expected to facilitate multi-gene transfer in a single step. We have asked whether events of lateral gene transfer are sufficient to have obscured our ability to track the vertical genealogy that underpins tryptophan biosynthesis. Results In 47 complete-genome Bacteria, the genes encoding the seven catalytic domains that participate in primary tryptophan biosynthesis were distinguished from any paralogs or xenologs engaged in other specialized functions. A reliable list of orthologs with carefully ascertained functional roles has thus been assembled and should be valuable as an annotation resource. The protein domains associated with primary tryptophan biosynthesis were then concatenated, yielding single amino-acid sequence strings that represent the entire tryptophan pathway. Lateral gene transfer of several whole-pathway trp operons was demonstrated by use of phylogenetic analysis. Lateral gene transfer of partial-pathway trp operons was also shown, with newly recruited genes functioning either in primary biosynthesis (rarely) or specialized metabolism (more frequently). Conclusions (i) Concatenated tryptophan protein trees are congruent with 16S rRNA subtrees provided that the genomes represented are of sufficiently close phylogenetic spacing. There are currently seven tryptophan congruency groups in the Bacteria. Recognition of a succession of others can be expected in the near future, but ultimately these should coalesce to a single grouping that parallels the 16S rRNA tree (except for cases of lateral gene transfer). (ii) The vertical trace of evolution for tryptophan biosynthesis can be deduced. The daunting complexities engendered

  11. Microarray analysis of serum mRNA in patients with head and neck squamous cell carcinoma at whole-genome scale

    Czech Academy of Sciences Publication Activity Database

    Čapková, M.; Šáchová, Jana; Strnad, Hynek; Kolář, Michal; Hroudová, Miluše; Chovanec, M.; Čada, Z.; Štefl, M.; Valach, J.; Kastner, J.; Smetana, K. Jr.; Plzák, J.

    -, April 23 (2014). ISSN 2314-6141 R&D Projects: GA MZd(CZ) NT13488 Institutional support: RVO:68378050 Keywords : Microarray Analysis * Head and Neck Squamous Cell Carcinoma * whole-genome scale Subject RIV: EB - Genetics ; Molecular Biology

  12. Genome-scale reconstruction of the Streptococcus pyogenes M49 metabolic network reveals growth requirements and indicates potential drug targets.

    Science.gov (United States)

    Levering, Jennifer; Fiedler, Tomas; Sieg, Antje; van Grinsven, Koen W A; Hering, Silvio; Veith, Nadine; Olivier, Brett G; Klett, Lara; Hugenholtz, Jeroen; Teusink, Bas; Kreikemeyer, Bernd; Kummer, Ursula

    2016-08-20

    Genome-scale metabolic models comprise stoichiometric relations between metabolites, as well as associations between genes and metabolic reactions and facilitate the analysis of metabolism. We computationally reconstructed the metabolic network of the lactic acid bacterium Streptococcus pyogenes M49. Initially, we based the reconstruction on genome annotations and already existing and curated metabolic networks of Bacillus subtilis, Escherichia coli, Lactobacillus plantarum and Lactococcus lactis. This initial draft was manually curated with the final reconstruction accounting for 480 genes associated with 576 reactions and 558 metabolites. In order to constrain the model further, we performed growth experiments of wild type and arcA deletion strains of S. pyogenes M49 in a chemically defined medium and calculated nutrient uptake and production fluxes. We additionally performed amino acid auxotrophy experiments to test the consistency of the model. The established genome-scale model can be used to understand the growth requirements of the human pathogen S. pyogenes and define optimal and suboptimal conditions, but also to describe differences and similarities between S. pyogenes and related lactic acid bacteria such as L. lactis in order to find strategies to reduce the growth of the pathogen and propose drug targets. PMID:26970054
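
    To make the notion of gene-reaction associations concrete, here is a small sketch of how such associations can be used to decide which reactions remain available after a gene deletion. The gene and reaction identifiers are hypothetical and are not taken from the S. pyogenes M49 reconstruction; the data layout (a list of alternative enzyme complexes per reaction) is one common convention, assumed here for illustration.

```python
# Hypothetical gene-reaction associations; identifiers are illustrative only.
# Each reaction maps to alternative enzyme complexes (isozymes);
# each complex is the set of genes it requires.
GPR = {
    "R1": [{"g1"}],          # single gene
    "R2": [{"g1"}, {"g2"}],  # isozymes: g1 OR g2
    "R3": [{"g2", "g3"}],    # complex: g2 AND g3
}


def active_reactions(gpr, deleted_genes):
    """Reactions that keep at least one complex with no deleted gene."""
    deleted = set(deleted_genes)
    return {
        rxn
        for rxn, complexes in gpr.items()
        if any(not (genes & deleted) for genes in complexes)
    }


print(sorted(active_reactions(GPR, {"g1"})))
```

    In a full model, the fluxes of the reactions that drop out would be fixed to zero before growth is simulated for the deletion strain.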

  13. Integration of sequence-similarity and functional association information can overcome intrinsic problems in orthology mapping across bacterial genomes

    OpenAIRE

    Li, Guojun; Ma, Qin; Mao, Xizeng; Yin, Yanbin; Zhu, Xiaoran; Xu, Ying

    2011-01-01

    Existing methods for orthologous gene mapping suffer from two general problems: (i) they are computationally too slow and their results are difficult to interpret for automated large-scale applications when based on phylogenetic analyses; or (ii) they are too prone to making mistakes in dealing with complex situations involving horizontal gene transfers and gene fusion due to the lack of a sound basis when based on sequence similarity information. We present a novel algorithm, Global Optimiza...

  14. Revealing less derived nature of cartilaginous fish genomes with their evolutionary time scale inferred with nuclear genes.

    Directory of Open Access Journals (Sweden)

    Adina J Renz

    Full Text Available Cartilaginous fishes, divided into Holocephali (chimaeras) and Elasmobranchii (sharks, rays and skates), occupy a key phylogenetic position among extant vertebrates in reconstructing their evolutionary processes. Their accurate evolutionary time scale is indispensable for better understanding of the relationship between phenotypic and molecular evolution of cartilaginous fishes. However, our current knowledge on the time scale of cartilaginous fish evolution largely relies on estimates using mitochondrial DNA sequences. In this study, making the best use of the still partial, but large-scale sequencing data of cartilaginous fish species, we estimate the divergence times between the major cartilaginous fish lineages employing nuclear genes. By rigorous orthology assessment based on available genomic and transcriptomic sequence resources for cartilaginous fishes, we selected 20 protein-coding genes in the nuclear genome, spanning 2973 amino acid residues. Our analysis based on the Bayesian inference resulted in the mean divergence time of 421 Ma, the late Silurian, for the Holocephali-Elasmobranchii split, and 306 Ma, the late Carboniferous, for the split between sharks and rays/skates. By applying these results and other documented divergence times, we measured the relative evolutionary rate of the Hox A cluster sequences in the cartilaginous fish lineages, which resulted in a lower substitution rate with a factor of at least 2.4 in comparison to tetrapod lineages. The obtained time scale enables mapping phenotypic and molecular changes in a quantitative framework. It is of great interest to corroborate the less derived nature of cartilaginous fish at the molecular level as a genome-wide phenomenon.

  15. Bacterial gastroenteritis

    Science.gov (United States)

    Infectious diarrhea - bacterial gastroenteritis; Acute gastroenteritis; Gastroenteritis - bacterial ... Bacterial gastroenteritis can affect 1 person or a group of people who all ate the same food. It is ...

  16. The Bacterial Communities of Full-Scale Biologically Active, Granular Activated Carbon Filters Are Stable and Diverse and Potentially Contain Novel Ammonia-Oxidizing Microorganisms.

    Science.gov (United States)

    LaPara, Timothy M; Hope Wilkinson, Katheryn; Strait, Jacqueline M; Hozalski, Raymond M; Sadowksy, Michael J; Hamilton, Matthew J

    2015-10-01

    The bacterial community composition of the full-scale biologically active, granular activated carbon (BAC) filters operated at the St. Paul Regional Water Services (SPRWS) was investigated using Illumina MiSeq analysis of PCR-amplified 16S rRNA gene fragments. These bacterial communities were consistently diverse (Shannon index, >4.4; richness estimates, >1,500 unique operational taxonomic units [OTUs]) throughout the duration of the 12-month study period. In addition, only modest shifts in the quantities of individual bacterial populations were observed; of the 15 most prominent OTUs, the most highly variable population (a Variovorax sp.) modulated less than 13-fold over time and less than 8-fold from filter to filter. The most prominent population in the profiles was a Nitrospira sp., representing 13 to 21% of the community. Interestingly, very few of the known ammonia-oxidizing bacteria (AOB) [...] amoA genes, however, suggested that AOB were prominent in the bacterial communities (amoA/16S rRNA gene ratio, 1 to 10%). We conclude, therefore, that the BAC filters at the SPRWS potentially contained significant numbers of unidentified and novel ammonia-oxidizing microorganisms that possess amoA genes similar to those of previously described AOB. PMID:26209671
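
    For readers unfamiliar with the diversity figures quoted in this record, the following is a minimal sketch of how a Shannon index and an observed richness can be computed from a vector of OTU read counts. It is not the authors' pipeline: the function names are hypothetical, the natural logarithm is an assumption (the record does not state the log base), and the richness here is a simple count of non-zero OTUs rather than the statistical richness estimate reported in the abstract.

```python
import math


def shannon_index(otu_counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with non-zero counts.

    Assumption: natural logarithm; the abstract does not state the base.
    """
    total = sum(otu_counts)
    if total <= 0:
        raise ValueError("OTU counts must sum to a positive number")
    h = 0.0
    for count in otu_counts:
        if count > 0:
            p = count / total  # relative abundance of this OTU
            h -= p * math.log(p)
    return h


def observed_richness(otu_counts):
    """Number of OTUs with at least one read (not a richness estimator)."""
    return sum(1 for count in otu_counts if count > 0)
```

    Because H is bounded above by the logarithm of the number of OTUs, a natural-log Shannon index above 4.4 would require more than e^4.4 (roughly 81) OTUs even in a perfectly even community.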

  17. Large-scale genomics unveil polygenic architecture of human cortical surface area.

    Science.gov (United States)

    Chen, Chi-Hua; Peng, Qian; Schork, Andrew J; Lo, Min-Tzu; Fan, Chun-Chieh; Wang, Yunpeng; Desikan, Rahul S; Bettella, Francesco; Hagler, Donald J; Westlye, Lars T; Kremen, William S; Jernigan, Terry L; Le Hellard, Stephanie; Steen, Vidar M; Espeseth, Thomas; Huentelman, Matt; Håberg, Asta K; Agartz, Ingrid; Djurovic, Srdjan; Andreassen, Ole A; Schork, Nicholas; Dale, Anders M

    2015-01-01

    Little is known about how genetic variation contributes to neuroanatomical variability, and whether particular genomic regions comprising genes or evolutionarily conserved elements are enriched for effects that influence brain morphology. Here, we examine brain imaging and single-nucleotide polymorphisms (SNPs) data from ∼2,700 individuals. We show that a substantial proportion of variation in cortical surface area is explained by additive effects of SNPs dispersed throughout the genome, with a larger heritable effect for visual and auditory sensory and insular cortices (h(2)∼0.45). Genome-wide SNPs collectively account for, on average, about half of twin heritability across cortical regions (N=466 twins). We find enriched genetic effects in or near genes. We also observe that SNPs in evolutionarily more conserved regions contributed significantly to the heritability of cortical surface area, particularly, for medial and temporal cortical regions. SNPs in less conserved regions contributed more to occipital and dorsolateral prefrontal cortices. PMID:26189703

  18. Genome-scale phylogenetic analysis finds extensive gene transfer among fungi

    Science.gov (United States)

    Szöllősi, Gergely J.; Davín, Adrián Arellano; Tannier, Eric; Daubin, Vincent; Boussau, Bastien

    2015-01-01

    Although the role of lateral gene transfer is well recognized in the evolution of bacteria, it is generally assumed that it has had less influence among eukaryotes. To explore this hypothesis, we compare the dynamics of genome evolution in two groups of organisms: cyanobacteria and fungi. Ancestral genomes are inferred in both clades using two types of methods: first, Count, a gene tree unaware method that models gene duplications, gains and losses to explain the observed numbers of genes present in a genome; second, ALE, a more recent gene tree-aware method that reconciles gene trees with a species tree using a model of gene duplication, loss and transfer. We compare their merits and their ability to quantify the role of transfers, and assess the impact of taxonomic sampling on their inferences. We present what we believe is compelling evidence that gene transfer plays a significant role in the evolution of fungi. PMID:26323765

  19. Large Scale Sequencing of Dothideomycetes Provides Insights into Genome Evolution and Adaptation

    Energy Technology Data Exchange (ETDEWEB)

    Haridas, Sajeet; Crous, Pedro; Binder, Manfred; Spatafora, Joseph; Grigoriev, Igor

    2015-03-16

    Dothideomycetes is the largest and most diverse class of ascomycete fungi, with 23 orders, 110 families, 1300 genera and over 19,000 known species. We present comparative analysis of 70 Dothideomycete genomes including over 50 that we sequenced and are as yet unpublished. This extensive sampling has almost quadrupled the previous study of 18 species and uncovered a 10-fold range of genome sizes. We were able to clarify the phylogenetic positions of several species whose origins were unclear in previous morphological and sequence comparison studies. We analyzed selected gene families including proteases, transporters and small secreted proteins and show that major differences in gene content are influenced by speciation.

  20. Determining the Control Circuitry of Redox Metabolism at the Genome-Scale

    DEFF Research Database (Denmark)

    Federowicz, Stephen; Kim, Donghyuk; Ebrahim, Ali;

    2014-01-01

    regulated during electron acceptor shifts. Here we propose a qualitative model that accounts for the full breadth of regulated genes by detailing how two global transcription factors (TFs), ArcA and Fnr of E. coli, sense key metabolic redox ratios and act on a genome-wide basis to regulate anabolic......, catabolic, and energy generation pathways. We first fill gaps in our knowledge of this transcriptional regulatory network by carrying out ChIP-chip and gene expression experiments to identify 463 regulatory events. We then interfaced this reconstructed regulatory network with a highly curated genome...

  1. Large-scale protein structure modeling of the Saccharomyces cerevisiae genome

    OpenAIRE

    Sánchez, Roberto; Sali, Andrej

    1998-01-01

    The function of a protein generally is determined by its three-dimensional (3D) structure. Thus, it would be useful to know the 3D structure of the thousands of protein sequences that are emerging from the many genome projects. To this end, fold assignment, comparative protein structure modeling, and model evaluation were automated completely. As an illustration, the method was applied to the proteins in the Saccharomyces cerevisiae (baker’s yeast) genome. It resulted in all-atom 3D models fo...

  2. Predicting survival within the lung cancer histopathological hierarchy using a multi-scale genomic model of development.

    Directory of Open Access Journals (Sweden)

    Hongye Liu

    2006-07-01

    Full Text Available BACKGROUND: The histopathologic heterogeneity of lung cancer remains a significant confounding factor in its diagnosis and prognosis, spurring numerous recent efforts to find a molecular classification of the disease that has clinical relevance. METHODS AND FINDINGS: Molecular profiles of tumors from 186 patients representing four different lung cancer subtypes (and 17 normal lung tissue samples) were compared with a mouse lung development model using principal component analysis in both temporal and genomic domains. An algorithm for the classification of lung cancers using a multi-scale developmental framework was developed. Kaplan-Meier survival analysis was conducted for lung adenocarcinoma patient subgroups identified via their developmental association. We found multi-scale genomic similarities between four human lung cancer subtypes and the developing mouse lung that are prognostically meaningful. Significant association was observed between the localization of human lung cancer cases along the principal mouse lung development trajectory and the corresponding patient survival rate at three distinct levels of classical histopathologic resolution: among different lung cancer subtypes, among patients within the adenocarcinoma subtype, and within the stage I adenocarcinoma subclass. The earlier the genomic association between a human tumor profile and the mouse lung development sequence, the poorer the patient's prognosis. Furthermore, decomposing this principal lung development trajectory identified a gene set that was significantly enriched for pyrimidine metabolism and cell-adhesion functions specific to lung development and oncogenesis. CONCLUSIONS: From a multi-scale disease modeling perspective, the molecular dynamics of murine lung development provide an effective framework that is not only data driven but also informed by the biology of development for elucidating the mechanisms of human lung cancer biology and its clinical outcome.

  3. A high-resolution cattle CNV map by population-scale genome sequencing

    Science.gov (United States)

    Copy Number Variations (CNVs) are common genomic structural variations that have been linked to human diseases and phenotypic traits. CNVs represent an important type of genetic variation among cattle breeds and even individual animals; however, only low-resolution maps of cattle CNVs currently exis...

  4. Toward genome-scale models of the Chinese hamster ovary cells: incentives, status and perspectives

    DEFF Research Database (Denmark)

    Kaas, Christian Schrøder; Fan, Yuzhou; Weilguny, Dietmar; Kristensen, Claus; Kildegaard, Helene Faustrup; Andersen, Mikael Rørdam

    2014-01-01

    Bioprocessing of the important Chinese hamster ovary (CHO) cell lines used for the production of biopharmaceuticals stands at the brink of several redefining events. In 2011, the field entered the genomics era, which has accelerated omics-based phenotyping of the cell lines. In this review we...

  5. Integrating large-scale functional genomics data to dissect metabolic networks for hydrogen production

    Energy Technology Data Exchange (ETDEWEB)

    Harwood, Caroline S

    2012-12-17

    The goal of this project is to identify gene networks that are critical for efficient biohydrogen production by leveraging variation in gene content and gene expression in independently isolated Rhodopseudomonas palustris strains. Coexpression methods were applied to large data sets that we have collected to define probabilistic causal gene networks. To our knowledge, this is a first systems-level approach that takes advantage of strain-to-strain variability to computationally define networks critical for a particular bacterial phenotypic trait.

  6. Emergence of Competitive Dominant Ammonia-Oxidizing Bacterial Populations in a Full-Scale Industrial Wastewater Treatment Plant

    OpenAIRE

    Layton, Alice C.; Dionisi, Hebe; Kuo, H.-W.; Robinson, Kevin G.; Garrett, Victoria M.; Meyers, Arthur; Sayler, Gary S.

    2005-01-01

    Ammonia-oxidizing bacterial populations in an industrial wastewater treatment plant were investigated with amoA and 16S rRNA gene real-time PCR assays. Nitrosomonas nitrosa initially dominated, but over time RI-27-type ammonia oxidizers, also within the Nitrosomonas communis lineage, increased from below detection to codominance. This shift occurred even though nitrification remained constant.

  7. A semiquantitative metric for evaluating clinical actionability of incidental or secondary findings from genome-scale sequencing

    Science.gov (United States)

    Berg, Jonathan S.; Foreman, Ann Katherine M.; O'Daniel, Julianne M.; Booker, Jessica K.; Boshe, Lacey; Carey, Timothy; Crooks, Kristy R.; Jensen, Brian C.; Juengst, Eric T.; Lee, Kristy; Nelson, Daniel K.; Powell, Bradford C.; Powell, Cynthia M.; Roche, Myra I.; Skrzynia, Cecile; Strande, Natasha T.; Weck, Karen E.; Wilhelmsen, Kirk C.; Evans, James P.

    2016-01-01

    Purpose: As genome-scale sequencing is increasingly applied in clinical scenarios, a wide variety of genomic findings will be discovered as secondary or incidental findings, and there is debate about how they should be handled. The clinical actionability of such findings varies, necessitating standardized frameworks for a priori decision making about their analysis. Genet Med 18 5, 467–475. Methods: We established a semiquantitative metric to assess five elements of actionability: severity and likelihood of the disease outcome, efficacy and burden of intervention, and knowledge base, with a total score from 0 to 15. Results: The semiquantitative metric was applied to a list of putative actionable conditions, the list of genes recommended by the American College of Medical Genetics and Genomics (ACMG) for return when deleterious variants are discovered as secondary/incidental findings, and a random sample of 1,000 genes. Scores from the list of putative actionable conditions (median = 12) and the ACMG list (median = 11) were both statistically different than the randomly selected genes (median = 7) (P < 0.0001, two-tailed Mann-Whitney test). Conclusion: Gene–disease pairs having a score of 11 or higher represent the top quintile of actionability. The semiquantitative metric effectively assesses clinical actionability, promotes transparency, and may facilitate assessments of clinical actionability by various groups and in diverse contexts. PMID:26270767
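
    The scoring scheme summarized in this record can be illustrated with a short sketch. It rests on an assumption that is not stated in the abstract: that each of the five elements is scored as an integer from 0 to 3, which is one way to arrive at the stated total range of 0 to 15. The function and key names are hypothetical, and the default threshold of 11 is taken from the abstract's statement about the top quintile.

```python
# The five elements named in the abstract (key names are illustrative).
ELEMENTS = ("severity", "likelihood", "efficacy", "burden", "knowledge_base")


def actionability_score(scores):
    """Sum the five element scores into a total between 0 and 15.

    Assumption: each element is an integer from 0 to 3; the abstract
    only gives the five elements and the 0-15 total range.
    """
    total = 0
    for element in ELEMENTS:
        if element not in scores:
            raise KeyError(f"missing element: {element}")
        value = scores[element]
        if not isinstance(value, int) or not 0 <= value <= 3:
            raise ValueError(f"{element} must be an integer from 0 to 3")
        total += value
    return total


def in_top_quintile(total, threshold=11):
    """True if the total meets the abstract's top-quintile cutoff (>= 11)."""
    return total >= threshold
```

    For example, a gene-disease pair scored 3, 2, 3, 2 and 2 on the five elements would total 12 and fall at or above the cutoff reported in the abstract.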

  8. Research guidelines in the era of large-scale collaborations: an analysis of Genome-wide Association Study Consortia.

    Science.gov (United States)

    Austin, Melissa A; Hair, Marilyn S; Fullerton, Stephanie M

    2012-05-01

    Scientific research has shifted from studies conducted by single investigators to the creation of large consortia. Genetic epidemiologists, for example, now collaborate extensively for genome-wide association studies (GWAS). The effect has been a stream of confirmed disease-gene associations. However, effects on human subjects oversight, data-sharing, publication and authorship practices, research organization and productivity, and intellectual property remain to be examined. The aim of this analysis was to identify all research consortia that had published the results of a GWAS analysis since 2005, characterize them, determine which have publicly accessible guidelines for research practices, and summarize the policies in these guidelines. A review of the National Human Genome Research Institute's Catalog of Published Genome-Wide Association Studies identified 55 GWAS consortia as of April 1, 2011. These consortia were comprised of individual investigators, research centers, studies, or other consortia and studied 48 different diseases or traits. Only 14 (25%) were found to have publicly accessible research guidelines on consortia websites. The available guidelines provide information on organization, governance, and research protocols; half address institutional review board approval. Details of publication, authorship, data-sharing, and intellectual property vary considerably. Wider access to consortia guidelines is needed to establish appropriate research standards with broad applicability to emerging forms of large-scale collaboration. PMID:22491085

  9. Large-scale recoding of an arbovirus genome to rebalance its insect versus mammalian preference.

    Science.gov (United States)

    Shen, Sam H; Stauft, Charles B; Gorbatsevych, Oleksandr; Song, Yutong; Ward, Charles B; Yurovsky, Alisa; Mueller, Steffen; Futcher, Bruce; Wimmer, Eckard

    2015-04-14

    The protein synthesis machineries of two distinct phyla of the Animal kingdom, insects of Arthropoda and mammals of Chordata, have different preferences for how to best encode proteins. Nevertheless, arboviruses (arthropod-borne viruses) are capable of infecting both mammals and insects just like arboviruses that use insect vectors to infect plants. These organisms have evolved carefully balanced genomes that can efficiently use the translational machineries of different phyla, even if the phyla belong to different kingdoms. Using dengue virus as an example, we have undone the genome encoding balance and specifically shifted the encoding preference away from mammals. These mammalian-attenuated viruses grow to high titers in insect cells but low titers in mammalian cells, have dramatically increased LD50s in newborn mice, and induce high levels of protective antibodies. Recoded arboviruses with a bias toward phylum-specific expression could form the basis of a new generation of live attenuated vaccine candidates. PMID:25825721

  10. Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae

    DEFF Research Database (Denmark)

    Vongsangnak, Wanwipa; Olsen, Peter; Hansen, Kim;

    2008-01-01

    Background: Since ancient times the filamentous fungus Aspergillus oryzae has been used in the fermentation industry for the production of fermented sauces and the production of industrial enzymes. Recently, the genome sequence of A. oryzae with 12,074 annotated genes was released but the number...... of hypothetical proteins accounted for more than 50% of the annotated genes. Considering the industrial importance of this fungus, it is therefore valuable to improve the annotation and further integrate genomic information with biochemical and physiological information available for this microorganism and other...... related fungi. Here we proposed the gene prediction by construction of an A. oryzae Expressed Sequence Tag (EST) library, sequencing and assembly. We enhanced the function assignment by our developed annotation strategy. The resulting better annotation was used to reconstruct the metabolic network leading...

  11. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient

    OpenAIRE

    Loraine Ann; Hung Yeung; Salmi Mari L; Chang Chunqi; Yao Jianchao; Roux Stanley J

    2008-01-01

    Background: Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An eff...
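
    As background to the similarity metrics mentioned in this record, the sketch below computes the standard Pearson correlation coefficient between two expression profiles. It covers only the Pearson coefficient; the SD-weighted and shrinkage coefficients named in the record are not reproduced here because the truncated abstract does not define them. The function name and the example profiles are hypothetical.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centered cross-products and squares.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical expression values for two genes across four arrays.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.1, 3.9, 6.2, 8.0]
similarity = pearson_correlation(gene_a, gene_b)
```

    In correlation-based clustering, a value such as `1 - similarity` is commonly used as the distance between two genes.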

  12. Genome-scale investigation of phenotypically distinct but nearly clonal Trichoderma strains

    Science.gov (United States)

    Weld, Richard J.; Cox, Murray P.; Bradshaw, Rosie E.; McLean, Kirstin L.; Stewart, Alison; Steyaert, Johanna M.

    2016-01-01

    Biological control agents (BCA) are beneficial organisms that are applied to protect plants from pests. Many fungi of the genus Trichoderma are successful BCAs but the underlying mechanisms are not yet fully understood. Trichoderma cf. atroviride strain LU132 is a remarkably effective BCA compared to T. cf. atroviride strain LU140 but these strains were found to be highly similar at the DNA sequence level. This unusual combination of phenotypic variability and high DNA sequence similarity between separately isolated strains prompted us to undertake a genome comparison study in order to identify DNA polymorphisms. We further investigated if the polymorphisms had functional effects on the phenotypes. The two strains were clearly identified as individuals, exhibiting different growth rates, conidiation and metabolism. Superior pathogen control demonstrated by LU132 depended on its faster growth, which is a prerequisite for successful distribution and competition. Genome sequencing identified only one non-synonymous single nucleotide polymorphism (SNP) between the strains. Based on this SNP, we successfully designed and validated an RFLP protocol that can be used to differentiate LU132 from LU140 and other Trichoderma strains. This SNP changed the amino acid sequence of SERF, encoded by the previously undescribed single copy gene “small EDRK-rich factor” (serf). A deletion of serf in the two strains did not lead to identical phenotypes, suggesting that, in addition to the single functional SNP between the nearly clonal Trichoderma cf. atroviride strains, other non-genomic factors contribute to their phenotypic variation. This finding is significant as it shows that genomics is an extremely useful but not exhaustive tool for the study of biocontrol complexity and for strain typing. PMID:27190719

  13. Genome-scale DNA methylome and transcriptome profiling of human neutrophils

    OpenAIRE

    Chatterjee, Aniruddha; Stockwell, Peter A.; Rodger, Euan J.; Morison, Ian M.

    2016-01-01

    Methylation of DNA molecules is a key mechanism associated with human disease, altered gene expression and phenotype. Using reduced representation bisulphite sequencing (RRBS) technology we have analysed DNA methylation patterns in healthy individuals and identified genes showing significant inter-individual variation. Further, using whole genome transcriptome analysis (RNA-Seq) on the same individuals we showed a local and specific relationship of exon inclusion and variable DNA methylation ...

  14. Genomic evidence of rapid, global-scale gene flow in a Sulfolobus species

    OpenAIRE

    Mao, Dominic; Grogan, Dennis

    2012-01-01

    Local populations of Sulfolobus islandicus diverge genetically with geographical separation, and this has been attributed to restricted transfer of propagules imposed by the unfavorable spatial distribution of acidic geothermal habitat. We tested the generality of genetic divergence with distance in Sulfolobus species by analyzing genomes of Sulfolobus acidocaldarius drawn from three populations separated by more than 8000 km. In sharp contrast to S. islandicus, the geographically diverse S. ...

  15. Cloning of the Koi Herpesvirus Genome as an Infectious Bacterial Artificial Chromosome Demonstrates That Disruption of the Thymidine Kinase Locus Induces Partial Attenuation in Cyprinus carpio koi

    Science.gov (United States)

    Costes, B.; Fournier, G.; Michel, B.; Delforge, C.; Raj, V. Stalin; Dewals, B.; Gillet, L.; Drion, P.; Body, A.; Schynts, F.; Lieffrig, F.; Vanderplasschen, A.

    2008-01-01

    Koi herpesvirus (KHV) is the causative agent of a lethal disease in koi and common carp. In the present study, we describe the cloning of the KHV genome as a stable and infectious bacterial artificial chromosome (BAC) clone that can be used to produce KHV recombinant strains. This goal was achieved by the insertion of a loxP-flanked BAC cassette into the thymidine kinase (TK) locus. This insertion led to a BAC plasmid that was stably maintained in bacteria and was able to regenerate virions when permissive cells were transfected with the plasmid. Reconstituted virions free of the BAC cassette but carrying a disrupted TK locus (the FL BAC-excised strain) were produced by the transfection of Cre recombinase-expressing cells with the BAC. Similarly, virions with a wild-type revertant TK sequence (the FL BAC revertant strain) were produced by the cotransfection of cells with the BAC and a DNA fragment encoding the wild-type TK sequence. Reconstituted recombinant viruses were compared to the wild-type parental virus in vitro and in vivo. The FL BAC revertant strain and the FL BAC-excised strain replicated comparably to the parental FL strain. The FL BAC revertant strain induced KHV infection in koi carp that was indistinguishable from that induced by the parental strain, while the FL BAC-excised strain exhibited a partially attenuated phenotype. Finally, the usefulness of the KHV BAC for recombination studies was demonstrated by the production of an ORF16-deleted strain by using prokaryotic recombination technology. The availability of the KHV BAC is an important advance that will allow the study of viral genes involved in KHV pathogenesis, as well as the production of attenuated recombinant candidate vaccines. PMID:18337580

  16. Construction of a Genome-Scale Metabolic Model of Arthrospira platensis NIES-39 and Metabolic Design for Cyanobacterial Bioproduction.

    Directory of Open Access Journals (Sweden)

    Katsunori Yoshikawa

    Full Text Available Arthrospira (Spirulina) platensis is a promising feedstock and host strain for bioproduction because of its high accumulation of glycogen and superior characteristics for industrial production. Metabolic simulation using a genome-scale metabolic model and flux balance analysis is a powerful method that can be used to design metabolic engineering strategies for the improvement of target molecule production. In this study, we constructed a genome-scale metabolic model of A. platensis NIES-39 including 746 metabolic reactions and 673 metabolites, and developed novel strategies to improve the production of valuable metabolites, such as glycogen and ethanol. The simulation results obtained using the metabolic model showed high consistency with experimental results for growth rates under several trophic conditions and growth capabilities on various organic substrates. The metabolic model was further applied to design a metabolic network to improve the autotrophic production of glycogen and ethanol. Decreased flux of reactions related to the TCA cycle and phosphoenolpyruvate reaction were found to improve glycogen production. Furthermore, in silico knockout simulation indicated that deletion of genes related to the respiratory chain, such as NAD(P)H dehydrogenase and cytochrome-c oxidase, could enhance ethanol production by using ammonium as a nitrogen source.

  17. Construction of a Genome-Scale Metabolic Model of Arthrospira platensis NIES-39 and Metabolic Design for Cyanobacterial Bioproduction.

    Science.gov (United States)

    Yoshikawa, Katsunori; Aikawa, Shimpei; Kojima, Yuta; Toya, Yoshihiro; Furusawa, Chikara; Kondo, Akihiko; Shimizu, Hiroshi

    2015-01-01

    Arthrospira (Spirulina) platensis is a promising feedstock and host strain for bioproduction because of its high accumulation of glycogen and superior characteristics for industrial production. Metabolic simulation using a genome-scale metabolic model and flux balance analysis is a powerful method that can be used to design metabolic engineering strategies for the improvement of target molecule production. In this study, we constructed a genome-scale metabolic model of A. platensis NIES-39 including 746 metabolic reactions and 673 metabolites, and developed novel strategies to improve the production of valuable metabolites, such as glycogen and ethanol. The simulation results obtained using the metabolic model showed high consistency with experimental results for growth rates under several trophic conditions and growth capabilities on various organic substrates. The metabolic model was further applied to design a metabolic network to improve the autotrophic production of glycogen and ethanol. Decreased flux of reactions related to the TCA cycle and phosphoenolpyruvate reaction were found to improve glycogen production. Furthermore, in silico knockout simulation indicated that deletion of genes related to the respiratory chain, such as NAD(P)H dehydrogenase and cytochrome-c oxidase, could enhance ethanol production by using ammonium as a nitrogen source. PMID:26640947

  18. A genome-scale integration and analysis of Lactococcus lactis translation data.

    Directory of Open Access Journals (Sweden)

    Julien Racle

    Full Text Available Protein synthesis is a template polymerization process composed of three main steps: initiation, elongation, and termination. During translation, ribosomes are engaged into polysomes whose size is used for the quantitative characterization of the translatome. However, simultaneous transcription and translation in the bacterial cytosol complicates the analysis of translatome data. We established a procedure for robust estimation of the ribosomal density in hundreds of genes from Lactococcus lactis polysome size measurements. We used a mechanistic model of translation to integrate the information about the ribosomal density, and for the first time we estimated the protein synthesis rate for each gene and identified the rate-limiting steps. Contrary to conventional considerations, we find a significant number of genes to be elongation limited. This number increases during stress conditions compared to optimal growth, and proteins synthesized at maximum rate are predominantly elongation limited. Consistent with bacterial physiology, we found proteins with similar rate and control characteristics belonging to the same functional categories. Under stress conditions, we found that the synthesis rate of regulatory proteins becomes comparable to that of proteins favored under optimal growth. These findings suggest that the coupling of metabolic states and protein synthesis is more important than previously thought.

  19. A three-scale analysis of bacterial communities involved in rocks colonization and soil formation in high mountain environments.

    Science.gov (United States)

    Esposito, Alfonso; Ciccazzo, Sonia; Borruso, Luigimaria; Zerbe, Stefan; Daffonchio, Daniele; Brusetti, Lorenzo

    2013-10-01

    Alpha and beta diversities of the bacterial communities growing on rock surfaces, proto-soils, riparian sediments, lichen thalli, and water springs biofilms in a glacier foreland were studied. We used three molecular based techniques to allow a deeper investigation at different taxonomic resolutions: denaturing gradient gel electrophoresis, length heterogeneity-PCR, and automated ribosomal intergenic spacer analysis. Bacterial communities were mainly composed of Acidobacteria, Proteobacteria, and Cyanobacteria with distinct variations among sites. Proteobacteria were more represented in sediments, biofilms, and lichens; Acidobacteria were mostly found in proto-soils; and Cyanobacteria on rocks. Firmicutes and Bacteroidetes were mainly found in biofilms. UniFrac P values confirmed a significant difference among different matrices. Significant differences (P < 0.001) in beta diversity were observed among the different matrices at the genus-species level, except for lichens and rocks which shared a more similar community structure, while at deep taxonomic resolution two distinct bacterial communities between lichens and rocks were found. PMID:23712376

  20. Draft Genome of Two Sphingopyxis sp. Strains, Dominant Members of the Bacterial Community Associated with a Drinking Water Distribution System Simulator

    Science.gov (United States)

    We report the draft genome of two Sphingopyxis spp. strains isolated from a chloraminated drinking water distribution system simulator. Both strains are ubiquitous residents and early colonizers of water distribution systems. Genomic annotation identified a class 1 integron (in...

  1. Draft Genome Sequence of Two Sphingopyxis sp. Strains, Dominant Members of the Bacterial Community Associated with a Drinking Water Distribution System Simulator

    Science.gov (United States)

    We report the draft genome of two Sphingopyxis spp. strains isolated from a chloraminated drinking water distribution system simulator. Both strains are ubiquitous residents and early colonizers of water distribution systems. Genomic annotation identified a class 1 integron (in...

  2. Inferring noncoding RNA families and classes by means of genome-scale structure-based clustering.

    Directory of Open Access Journals (Sweden)

    Sebastian Will

    2007-04-01

    The RFAM database defines families of ncRNAs by means of sequence similarities that are sufficient to establish homology. In some cases, such as microRNAs and box H/ACA snoRNAs, functional commonalities define classes of RNAs that are characterized by structural similarities, and typically consist of multiple RNA families. Recent advances in high-throughput transcriptomics and comparative genomics have produced very large sets of putative noncoding RNAs and regulatory RNA signals. For many of them, evidence for stabilizing selection acting on their secondary structures has been derived, and at least approximate models of their structures have been computed. The overwhelming majority of these hypothetical RNAs cannot be assigned to established families or classes. We present here a structure-based clustering approach that is capable of extracting putative RNA classes from genome-wide surveys for structured RNAs. The LocARNA (local alignment of RNA) tool implements a novel variant of the Sankoff algorithm that is sufficiently fast to deal with several thousand candidate sequences. The method is also robust against false positive predictions, i.e., a contamination of the input data with unstructured or nonconserved sequences. We have successfully tested the LocARNA-based clustering approach on the sequences of the RFAM-seed alignments. Furthermore, we have applied it to a previously published set of 3,332 predicted structured elements in the Ciona intestinalis genome (Missal K, Rose D, Stadler PF (2005) Noncoding RNAs in Ciona intestinalis. Bioinformatics 21 (Supplement 2): i77-i78). In addition to recovering, e.g., tRNAs as a structure-based class, the method identifies several RNA families, including microRNA and snoRNA candidates, and suggests several novel classes of ncRNAs for which to date no representative has been experimentally characterized.

  3. Estimating demographic parameters from large-scale population genomic data using Approximate Bayesian Computation

    Directory of Open Access Journals (Sweden)

    Li Sen

    2012-03-01

    Background: The Approximate Bayesian Computation (ABC) approach has been used to infer demographic parameters for numerous species, including humans. However, most applications of ABC still use limited amounts of data, from a small number of loci, compared to the large amount of genome-wide population-genetic data which have become available in the last few years. Results: We evaluated the performance of the ABC approach for three 'population divergence' models - similar to the 'isolation with migration' model - when the data consist of several hundred thousand SNPs typed for multiple individuals by simulating data from known demographic models. The ABC approach was used to infer demographic parameters of interest and we compared the inferred values to the true parameter values that were used to generate hypothetical "observed" data. For all three case models, the ABC approach inferred most demographic parameters quite well with narrow credible intervals, for example, population divergence times and past population sizes, but some parameters were more difficult to infer, such as population sizes at present and migration rates. We compared the ability of different summary statistics to infer demographic parameters, including haplotype and LD based statistics, and found that the accuracy of the parameter estimates can be improved by combining summary statistics that capture different parts of information in the data. Furthermore, our results suggest that poor choices of prior distributions can in some circumstances be detected using ABC. Finally, increasing the amount of data beyond some hundred loci will substantially improve the accuracy of many parameter estimates using ABC. Conclusions: We conclude that the ABC approach can accommodate realistic genome-wide population genetic data, which may be difficult to analyze with full likelihood approaches, and that the ABC can provide accurate and precise inference of demographic parameters from
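
    The general ABC idea described in this record (simulate data from known models, summarize, and keep parameter values whose simulated summaries match the observed ones) can be sketched as plain rejection sampling. The example below is a toy under stated assumptions, not the study's pipeline: the single parameter is a hypothetical allele frequency, the prior is uniform, and the only summary statistic is the mean, whereas the study combined several summary statistics over demographic models. All function names are illustrative.

```python
import random
import statistics

def abc_rejection(observed, simulate, prior_sample, summary, tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary lies within `tolerance` of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                 # draw a candidate parameter from the prior
        s_sim = summary(simulate(theta, rng))     # simulate data and summarize it
        if abs(s_sim - s_obs) <= tolerance:       # accept if close to the observed summary
            accepted.append(theta)
    return accepted

N_LOCI = 200  # hypothetical number of independent biallelic loci

def simulate_alleles(freq, rng):
    """Toy simulator: 1 marks a derived allele, drawn independently per locus."""
    return [1 if rng.random() < freq else 0 for _ in range(N_LOCI)]

def uniform_prior(rng):
    return rng.random()

rng = random.Random(42)
observed = simulate_alleles(0.3, rng)  # pseudo-observed data from a known "true" value
accepted = abc_rejection(observed, simulate_alleles, uniform_prior,
                         statistics.mean, tolerance=0.02, n_draws=5000, rng=rng)
posterior_mean = statistics.mean(accepted)
```

    The accepted draws approximate the posterior; tightening `tolerance` trades acceptance rate for accuracy, which is one reason the choice of summary statistics matters as much as the abstract suggests.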

  4. Genome-scale transcriptomic insights into early-stage fruit development in woodland strawberry Fragaria vesca.

    Science.gov (United States)

    Kang, Chunying; Darwish, Omar; Geretz, Aviva; Shahan, Rachel; Alkharouf, Nadim; Liu, Zhongchi

    2013-06-01

    Fragaria vesca, a diploid woodland strawberry with a small and sequenced genome, is an excellent model for studying fruit development. The strawberry fruit is unique in that the edible flesh is actually enlarged receptacle tissue. The true fruit are the numerous dry achenes dotting the receptacle's surface. Auxin produced from the achene is essential for the receptacle fruit set, a paradigm for studying crosstalk between hormone signaling and development. To investigate the molecular mechanism underlying strawberry fruit set, next-generation sequencing was employed to profile early-stage fruit development with five fruit tissue types and five developmental stages from floral anthesis to enlarged fruits. This two-dimensional data set provides a systems-level view of molecular events with precise spatial and temporal resolution. The data suggest that the endosperm and seed coat may play a more prominent role than the embryo in auxin and gibberellin biosynthesis for fruit set. A model is proposed to illustrate how hormonal signals produced in the endosperm and seed coat coordinate seed, ovary wall, and receptacle fruit development. The comprehensive fruit transcriptome data set provides a wealth of genomic resources for the strawberry and Rosaceae communities as well as unprecedented molecular insight into fruit set and early stage fruit development. PMID:23898027

  5. Modeling watershed-scale solute transport using an integrated, process-based hydrologic model with applications to bacterial fate and transport

    Science.gov (United States)

    Niu, Jie; Phanikumar, Mantha S.

    2015-10-01

    Distributed hydrologic models that simulate fate and transport processes at sub-daily timescales are useful tools for estimating pollutant loads exported from watersheds to lakes and oceans downstream. There has been considerable interest in the application of integrated process-based hydrologic models in recent years. While the models have been applied to address questions of water quantity and to better understand linkages between hydrology and land surface processes, routine applications of these models to address water quality issues are currently limited. In this paper, we first describe a general process-based watershed-scale solute transport modeling framework, based on an operator splitting strategy and a Lagrangian particle transport method combined with dispersion and reactions. The transport and the hydrologic modules are tightly coupled and the interactions among different hydrologic components are explicitly modeled. We test transport modules using data from plot-scale experiments and available analytical solutions for different hydrologic domains. The numerical solutions are also compared with an analytical solution for groundwater transit times with interactions between surface and subsurface flows. Finally, we demonstrate the application of the model to simulate bacterial fate and transport in the Red Cedar River watershed in Michigan and test hypotheses about sources and transport pathways. The watershed bacterial fate and transport model is expected to be useful for making near real-time predictions at marine and freshwater beaches.
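
    The combination of operator splitting with a Lagrangian particle method mentioned in this record can be pictured with a one-dimensional random-walk sketch: each time step applies advection, then dispersion, then a first-order reaction, one operator at a time. This is a minimal illustration under assumed constant parameters, not the authors' watershed model; the function names and the choice of first-order decay as the reaction are assumptions made for the example.

```python
import math
import random

def step_particles(positions, masses, velocity, dispersion, decay_rate, dt, rng):
    """Advance all particles by one split time step (advection, dispersion, reaction)."""
    # 1) Advection: move every particle with the mean flow.
    positions = [x + velocity * dt for x in positions]
    # 2) Dispersion: Gaussian random-walk jump with variance 2 * D * dt.
    sigma = math.sqrt(2.0 * dispersion * dt)
    positions = [x + rng.gauss(0.0, sigma) for x in positions]
    # 3) Reaction: first-order decay applied to the mass carried by each particle.
    survival = math.exp(-decay_rate * dt)
    masses = [m * survival for m in masses]
    return positions, masses

def simulate_plume(n_particles, n_steps, velocity, dispersion, decay_rate, dt, rng):
    """Release unit-mass particles at x = 0 and track them for n_steps."""
    positions = [0.0] * n_particles
    masses = [1.0] * n_particles
    for _ in range(n_steps):
        positions, masses = step_particles(
            positions, masses, velocity, dispersion, decay_rate, dt, rng)
    return positions, masses

# Hypothetical parameter values, chosen only to exercise the sketch.
positions, masses = simulate_plume(
    n_particles=1000, n_steps=20, velocity=0.5, dispersion=0.1,
    decay_rate=0.05, dt=1.0, rng=random.Random(0))
```

    With dispersion set to zero the particles simply translate at the flow velocity, which gives an easy check against the analytical solution for pure advection with decay.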

  6. Putative bacterial interactions from metagenomic knowledge with an integrative systems ecology approach.

    Science.gov (United States)

    Bordron, Philippe; Latorre, Mauricio; Cortés, Maria-Paz; González, Mauricio; Thiele, Sven; Siegel, Anne; Maass, Alejandro; Eveillard, Damien

    2016-02-01

    Following the trend of studies that investigate microbial ecosystems using different metagenomic techniques, we propose a new integrative systems ecology approach that aims to decipher functional roles within a consortium through the integration of genomic and metabolic knowledge at genome scale. For the sake of application, using public genomes of five bacterial strains involved in copper bioleaching: Acidiphilium cryptum, Acidithiobacillus ferrooxidans, Acidithiobacillus thiooxidans, Leptospirillum ferriphilum, and Sulfobacillus thermosulfidooxidans, we first reconstructed a global metabolic network. Next, using a parsimony assumption, we deciphered sets of genes, called Sets from Genome Segments (SGS), that (1) are close on their respective genomes, (2) take an active part in metabolic pathways and (3) whose associated metabolic reactions are also closely connected within metabolic networks. Overall, this SGS paradigm depicts genomic functional units that emphasize respective roles of bacterial strains to catalyze metabolic pathways and environmental processes. Our analysis suggested that only a few functional metabolic genes are horizontally transferred within the consortium and that no single bacterial strain can accomplish the whole copper bioleaching process by itself. The use of SGS pinpoints a functional compartmentalization among the investigated species and exhibits putative bacterial interactions necessary for promoting these pathways. PMID:26677108

  7. BLANNOTATOR: enhanced homology-based function prediction of bacterial proteins

    Directory of Open Access Journals (Sweden)

    Kankainen Matti

    2012-02-01

    Background: Automated function prediction has played a central role in determining the biological functions of bacterial proteins. Typically, protein function annotation relies on homology, and function is inferred from other proteins with similar sequences. This approach has become popular in bacterial genomics because it is one of the few methods that are practical for large datasets and because it does not require additional functional genomics experiments. However, the existing solutions produce erroneous predictions in many cases, especially when query sequences have low levels of identity with the annotated source protein. This problem has created a pressing need for improvements in homology-based annotation. Results: We present an automated method for the functional annotation of bacterial protein sequences. Based on sequence similarity searches, BLANNOTATOR accurately annotates query sequences with one-line summary descriptions of protein function. It groups sequences identified by BLAST into subsets according to their annotation and bases its prediction on a set of sequences with consistent functional information. We show the results of BLANNOTATOR's performance in sets of bacterial proteins with known functions. We simulated the annotation process for 3090 SWISS-PROT proteins using a database in its state preceding the functional characterisation of the query protein. For this dataset, our method outperformed the five others that we tested, and the improved performance was maintained even in the absence of highly related sequence hits. We further demonstrate the value of our tool by analysing the putative proteome of Lactobacillus crispatus strain ST1. Conclusions: BLANNOTATOR is an accurate method for bacterial protein function prediction. It is practical for genome-scale data and does not require pre-existing sequence clustering; thus, this method suits the needs of bacterial genome and metagenome researchers. The method and a
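
    The idea of grouping BLAST hits by their annotation and predicting from the most consistent group can be sketched as follows. This is not BLANNOTATOR's actual algorithm or scoring scheme, which the abstract does not specify; the text normalization, the identity cutoff, the exclusion of uninformative "hypothetical" descriptions, and the summed bit-score vote are all assumptions made for illustration, as are the hit values.

```python
from collections import defaultdict

def normalize(description):
    """Collapse case, commas and repeated whitespace so equivalent descriptions group together."""
    return " ".join(description.lower().replace(",", " ").split())

def predict_annotation(hits, min_identity=30.0):
    """Pick the description whose group of hits has the highest summed bit score.

    `hits` is a list of (description, bit_score, percent_identity) tuples.
    Returns None when no informative hit passes the identity cutoff.
    """
    group_scores = defaultdict(float)
    for description, bit_score, identity in hits:
        if identity < min_identity or "hypothetical" in description.lower():
            continue  # skip weak or uninformative hits
        group_scores[normalize(description)] += bit_score
    if not group_scores:
        return None
    return max(group_scores, key=group_scores.get)

# Hypothetical BLAST hits for one query protein.
hits = [
    ("Alcohol dehydrogenase", 210.0, 62.0),
    ("alcohol  dehydrogenase", 190.0, 58.0),
    ("Hypothetical protein", 400.0, 95.0),
    ("Zinc-binding oxidoreductase", 250.0, 65.0),
]
prediction = predict_annotation(hits)
```

    Here two moderately scoring hits that agree outvote a single higher-scoring hit with a different description, which is the kind of consistency-based behaviour the abstract describes.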

  8. A Genome-Scale Resource for the Functional Characterization of Arabidopsis Transcription Factors

    Directory of Open Access Journals (Sweden)

    Jose L. Pruneda-Paz

    2014-07-01

    Extensive transcriptional networks play major roles in cellular and organismal functions. Transcript levels are in part determined by the combinatorial and overlapping functions of multiple transcription factors (TFs) bound to gene promoters. Thus, TF-promoter interactions provide the basic molecular wiring of transcriptional regulatory networks. In plants, discovery of the functional roles of TFs is limited by an increased complexity of network circuitry due to a significant expansion of TF families. Here, we present the construction of a comprehensive collection of Arabidopsis TF clones created to provide a versatile resource for uncovering TF biological functions. We leveraged this collection by implementing a high-throughput DNA binding assay and identified direct regulators of a key clock gene (CCA1) that provide molecular links between different signaling modules and the circadian clock. The resources introduced in this work will significantly contribute to a better understanding of the transcriptional regulatory landscape of plant genomes.

  9. The roles of whole-genome and small-scale duplications in the functional specialization of Saccharomyces cerevisiae genes.

    Directory of Open Access Journals (Sweden)

    Mario A Fares

    Researchers have long been enthralled with the idea that gene duplication can generate novel functions, crediting this process with great evolutionary importance. Empirical data show that whole-genome duplications (WGDs) are more likely to be retained than small-scale duplications (SSDs), though their relative contribution to the functional fate of duplicates remains unexplored. Using the map of genetic interactions and the re-sequencing of 27 Saccharomyces cerevisiae genomes evolving for 2,200 generations we show that SSD-duplicates lead to neo-functionalization while WGD-duplicates partition ancestral functions. This conclusion is supported by: (a) SSD-duplicates establish more genetic interactions than singletons and WGD-duplicates; (b) SSD-duplicates copies share more interaction-partners than WGD-duplicates copies; (c) WGD-duplicates interaction partners are more functionally related than SSD-duplicates partners; (d) SSD-duplicates gene copies are more functionally divergent from one another, while keeping more overlapping functions, and diverge in their sub-cellular locations more than WGD-duplicates copies; and (e) SSD-duplicates complement their functions to a greater extent than WGD-duplicates. We propose a novel model that uncovers the complexity of evolution after gene duplication.

  10. Genome-scale DNA variant analysis and functional validation of a SNP underlying yellow fruit color in wild strawberry

    Science.gov (United States)

    Hawkins, Charles; Caruana, Julie; Schiksnis, Erin; Liu, Zhongchi

    2016-01-01

    Fragaria vesca is a species of diploid strawberry being developed as a model for the octoploid garden strawberry. This work sequenced and compared the genomes of three F. vesca accessions: ‘Hawaii 4’, ‘Rügen’, and ‘Yellow Wonder’. Genome-scale analyses of shared and distinct SNPs among these three accessions have revealed that ‘Rügen’ and ‘Yellow Wonder’ are more similar to each other than they are to ‘Hawaii 4’. Though all three accessions are inbred seven generations, each accession still possesses extensive heterozygosity, highlighting the inherent differences between individual plants even of the same accession. The identification of the impact of each SNP as well as the large number of Indel markers provides a foundation for locating candidate mutations underlying phenotypic variations among these F. vesca accessions and for mapping new mutations generated through forward genetics screens. Through systematic analysis of SNP variants affecting genes in anthocyanin biosynthesis and regulation, a candidate SNP in FveMYB10 was identified and then functionally confirmed to be responsible for the yellow color fruits made by many F. vesca accessions. As a whole, this study provides further resources for F. vesca and establishes a foundation for linking traits of economic importance to specific genes and variants. PMID:27377763

  11. Genome-scale DNA variant analysis and functional validation of a SNP underlying yellow fruit color in wild strawberry.

    Science.gov (United States)

    Hawkins, Charles; Caruana, Julie; Schiksnis, Erin; Liu, Zhongchi

    2016-01-01

    Fragaria vesca is a species of diploid strawberry being developed as a model for the octoploid garden strawberry. This work sequenced and compared the genomes of three F. vesca accessions: 'Hawaii 4', 'Rügen', and 'Yellow Wonder'. Genome-scale analyses of shared and distinct SNPs among these three accessions have revealed that 'Rügen' and 'Yellow Wonder' are more similar to each other than they are to 'Hawaii 4'. Though all three accessions are inbred seven generations, each accession still possesses extensive heterozygosity, highlighting the inherent differences between individual plants even of the same accession. The identification of the impact of each SNP as well as the large number of Indel markers provides a foundation for locating candidate mutations underlying phenotypic variations among these F. vesca accessions and for mapping new mutations generated through forward genetics screens. Through systematic analysis of SNP variants affecting genes in anthocyanin biosynthesis and regulation, a candidate SNP in FveMYB10 was identified and then functionally confirmed to be responsible for the yellow color fruits made by many F. vesca accessions. As a whole, this study provides further resources for F. vesca and establishes a foundation for linking traits of economic importance to specific genes and variants. PMID:27377763

  12. Genética y genómica enfocadas en el estudio de la resistencia bacteriana (Genetics and Genomics for the study of bacterial resistance)

    Directory of Open Access Journals (Sweden)

    Ulises Garza-Ramos

    2009-01-01

    Bacterial resistance is a public health problem causing high rates of morbidity and mortality in hospital settings. To the extent that different antibiotics are used, bacteria resistant to multiple drugs are selected. The development of new molecular genomic and proteomic tools such as real-time PCR, DNA pyrosequencing, mass spectrometry, DNA microarrays, and bioinformatics allows for more in-depth knowledge about the physiology and structure of bacteria and the mechanisms involved in antibiotic resistance. These studies make it possible to identify new drug targets and to design specific antibiotics that provide more accurate treatments to combat infections caused by bacteria. Using these techniques, it will also be possible to rapidly identify genes that confer resistance to antibiotics, and to identify complex genetic structures, such as integrons that are involved in the spread of genes that confer

  13. Analysis of genetic variation and potential applications in genome-scale metabolic modeling

    DEFF Research Database (Denmark)

    Cardoso, Joao; Andersen, Mikael Rørdam; Herrgard, Markus;

    2015-01-01

    variation during a long term production process may lead to significant economic losses and it is important to understand how to control this type of variation. With the emergence of next-generation sequencing technologies, genetic variation in microbial strains can now be determined on an unprecedented...... scale and resolution by re-sequencing thousands of strains systematically. In this article, we review challenges in the integration and analysis of large-scale re-sequencing data, present an extensive overview of bioinformatics methods for predicting the effects of genetic variants on protein function...

  14. A genome-scale metabolic network reconstruction of tomato (Solanum lycopersicum L.) and its application to photorespiratory metabolism.

    Science.gov (United States)

    Yuan, Huili; Cheung, C Y Maurice; Poolman, Mark G; Hilbers, Peter A J; van Riel, Natal A W

    2016-01-01

    Tomato (Solanum lycopersicum L.) has been studied extensively due to its high economic value in the market, and high content in health-promoting antioxidant compounds. Tomato is also considered as an excellent model organism for studying the development and metabolism of fleshy fruits. However, the growth, yield and fruit quality of tomatoes can be affected by drought stress, a common abiotic stress for tomato. To investigate the potential metabolic response of tomato plants to drought, we reconstructed iHY3410, a genome-scale metabolic model of tomato leaf, and used this metabolic network to simulate tomato leaf metabolism. The resulting model includes 3410 genes and 2143 biochemical and transport reactions distributed across five intracellular organelles including cytosol, plastid, mitochondrion, peroxisome and vacuole. The model successfully described the known metabolic behaviour of tomato leaf under heterotrophic and phototrophic conditions. The in silico investigation of the metabolic characteristics for photorespiration and other relevant metabolic processes under drought stress suggested that: (i) the flux distributions through the mevalonate (MVA) pathway under drought were distinct from that under normal conditions; and (ii) the changes in fluxes through core metabolic pathways with varying flux ratio of RubisCO carboxylase to oxygenase may contribute to the adaptive stress response of plants. In addition, we improved on previous studies of reaction essentiality analysis for leaf metabolism by including potential alternative routes for compensating reaction knockouts. Altogether, the genome-scale model provides a sound framework for investigating tomato metabolism and gives valuable insights into the functional consequences of abiotic stresses. PMID:26576489

  15. Direct coupling of a genome-scale microbial in silico model and a groundwater reactive transport model

    International Nuclear Information System (INIS)

    The activity of microorganisms often plays an important role in dynamic natural attenuation or engineered bioremediation of subsurface contaminants, such as chlorinated solvents, metals, and radionuclides. To evaluate and/or design bioremediated systems, quantitative reactive transport models are needed. State-of-the-art reactive transport models often ignore the microbial effects or simulate the microbial effects with static growth yield and constant reaction rate parameters over simulated conditions, while in reality microorganisms can dynamically modify their functionality (such as utilization of alternative respiratory pathways) in response to spatial and temporal variations in environmental conditions. Constraint-based genome-scale microbial in silico models, using genomic data and multiple-pathway reaction networks, have been shown to be able to simulate transient metabolism of some well studied microorganisms and identify growth rate, substrate uptake rates, and byproduct rates under different growth conditions. These rates can be identified and used to replace specific microbially-mediated reaction rates in a reactive transport model using local geochemical conditions as constraints. We previously demonstrated the potential utility of integrating a constraint based microbial metabolism model with a reactive transport simulator as applied to bioremediation of uranium in groundwater. However, that work relied on an indirect coupling approach that was effective for initial demonstration but may not be extensible to more complex problems that are of significant interest (e.g., communities of microbial species, multiple constraining variables). Here, we extend that work by presenting and demonstrating a method of directly integrating a reactive transport model (FORTRAN code) with constraint-based in silico models solved with IBM ILOG CPLEX linear optimizer base system (C library). The models were integrated with BABEL, a language interoperability tool. The

  16. Metabolic network reconstruction and genome-scale model of butanol-producing strain Clostridium beijerinckii NCIMB 8052

    Directory of Open Access Journals (Sweden)

    Kim Pan-Jun

    2011-08-01

    Background: Solventogenic clostridia offer a sustainable alternative to petroleum-based production of butanol--an important chemical feedstock and potential fuel additive or replacement. C. beijerinckii is an attractive microorganism for strain design to improve butanol production because it (i) naturally produces the highest recorded butanol concentrations as a byproduct of fermentation; and (ii) can co-ferment pentose and hexose sugars (the primary products from lignocellulosic hydrolysis). Interrogating C. beijerinckii metabolism from a systems viewpoint using constraint-based modeling allows for simulation of the global effect of genetic modifications. Results: We present the first genome-scale metabolic model (iCM925) for C. beijerinckii, containing 925 genes, 938 reactions, and 881 metabolites. To build the model we employed a semi-automated procedure that integrated genome annotation information from KEGG, BioCyc, and The SEED, and utilized computational algorithms with manual curation to improve model completeness. Interestingly, we found only a 34% overlap in reactions collected from the three databases--highlighting the importance of evaluating the predictive accuracy of the resulting genome-scale model. To validate iCM925, we conducted fermentation experiments using the NCIMB 8052 strain, and evaluated the ability of the model to simulate measured substrate uptake and product production rates. Experimentally observed fermentation profiles were found to lie within the solution space of the model; however, under an optimal growth objective, additional constraints were needed to reproduce the observed profiles--suggesting the existence of selective pressures other than optimal growth. Notably, a significantly enriched fraction of actively utilized reactions in simulations--constrained to reflect experimental rates--originated from the set of reactions that overlapped between all three databases (P = 3.52 × 10^-9, Fisher's exact test
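
    The enrichment claim at the end of this record rests on Fisher's exact test. As a minimal sketch of how such a one-sided test is computed from a 2x2 table, the function below sums hypergeometric probabilities for outcomes at least as extreme as the observed count. The table layout (rows: active versus inactive reactions; columns: in the three-database overlap versus not) and the counts are hypothetical; the abstract does not give the underlying counts, so the reported P value is not reproduced here.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test, P(X >= a), for the 2x2 table [[a, b], [c, d]].

    Under the null hypothesis the top-left cell follows a hypergeometric distribution.
    """
    row1 = a + b            # e.g. number of actively used reactions
    col1 = a + c            # e.g. number of reactions in the database overlap
    n = a + b + c + d       # all reactions in the model
    upper = min(row1, col1) # largest value the top-left cell can take
    favourable = sum(comb(col1, k) * comb(n - col1, row1 - k)
                     for k in range(a, upper + 1))
    return favourable / comb(n, row1)

# Hypothetical counts: 40 of 60 active reactions fall in the overlap set,
# versus 80 of 340 inactive reactions.
p_value = fisher_exact_greater(40, 20, 80, 260)
```

    Exact integer arithmetic with `math.comb` avoids rounding problems for tables of this size; for much larger counts a log-space implementation would be preferable.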

  17. Large-scale analysis of antisense transcription in wheat using the Affymetrix GeneChip Wheat Genome Array

    Directory of Open Access Journals (Sweden)

    Settles Matthew L

    2009-05-01

    Background: Natural antisense transcripts (NATs) are transcripts of the opposite DNA strand to the sense-strand either at the same locus (cis-encoded) or a different locus (trans-encoded). They can affect gene expression at multiple stages including transcription, RNA processing and transport, and translation. NATs give rise to sense-antisense transcript pairs and the number of these identified has escalated greatly with the availability of DNA sequencing resources and public databases. Traditionally, NATs were identified by the alignment of full-length cDNAs or expressed sequence tags to genome sequences, but an alternative method for large-scale detection of sense-antisense transcript pairs involves the use of microarrays. In this study we developed a novel protocol to assay sense- and antisense-strand transcription on the 55 K Affymetrix GeneChip Wheat Genome Array, which is a 3' in vitro transcription (3'IVT) expression array. We selected five different tissue types for assay to enable maximum discovery, and used the 'Chinese Spring' wheat genotype because most of the wheat GeneChip probe sequences were based on its genomic sequence. This study is the first report of using a 3'IVT expression array to discover the expression of natural sense-antisense transcript pairs, and may be considered as proof-of-concept. Results: By using alternative target preparation schemes, both the sense- and antisense-strand derived transcripts were labeled and hybridized to the Wheat GeneChip. Quality assurance verified that successful hybridization did occur in the antisense-strand assay. A stringent threshold for positive hybridization was applied, which resulted in the identification of 110 sense-antisense transcript pairs, as well as 80 potentially antisense-specific transcripts. Strand-specific RT-PCR validated the microarray observations, and showed that antisense transcription is likely to be tissue specific. For the annotated sense

  18. A general framework for association tests with multivariate traits in large-scale genomics studies.

    Science.gov (United States)

    He, Qianchuan; Avery, Christy L; Lin, Dan-Yu

    2013-12-01

    Genetic association studies often collect data on multiple traits that are correlated. Discovery of genetic variants influencing multiple traits can lead to better understanding of the etiology of complex human diseases. Conventional univariate association tests may miss variants that have weak or moderate effects on individual traits. We propose several multivariate test statistics to complement univariate tests. Our framework covers both studies of unrelated individuals and family studies and allows any type/mixture of traits. We relate the marginal distributions of multivariate traits to genetic variants and covariates through generalized linear models without modeling the dependence among the traits or family members. We construct score-type statistics, which are computationally fast and numerically stable even in the presence of covariates and which can be combined efficiently across studies with different designs and arbitrary patterns of missing data. We compare the power of the test statistics both theoretically and empirically. We provide a strategy to determine genome-wide significance that properly accounts for the linkage disequilibrium (LD) of genetic variants. The application of the new methods to the meta-analysis of five major cardiovascular cohort studies identifies a new locus (HSCB) that is pleiotropic for the four traits analyzed. PMID:24227293

  19. A graphical model method for integrating multiple sources of genome-scale data

    Science.gov (United States)

    Dvorkin, Daniel; Biehs, Brian; Kechris, Katerina

    2016-01-01

    Making effective use of multiple data sources is a major challenge in modern bioinformatics. Genome-wide data such as measures of transcription factor binding, gene expression, and sequence conservation, which are used to identify binding regions and genes that are important to major biological processes such as development and disease, can be difficult to use together due to the different biological meanings and statistical distributions of the heterogeneous data types, but each can provide valuable information for understanding the processes under study. Here we present methods for integrating multiple data sources to gain a more complete picture of gene regulation and expression. Our goal is to identify genes and cis-regulatory regions which play specific biological roles. We describe a graphical mixture model approach for data integration, examine the effect of using different model topologies, and discuss methods for evaluating the effectiveness of the models. Model fitting is computationally efficient and produces results which have clear biological and statistical interpretations. The Hedgehog and Dorsal signaling pathways in Drosophila, which are critical in embryonic development, are used as examples. PMID:23934610

  20. Genome-wide linkage using the Social Responsiveness Scale in Utah autism pedigrees

    OpenAIRE

    Coon, Hilary; Villalobos, Michele E; Robison, Reid J.; Camp, Nicola J.; Cannon, Dale S.; Allen-Brady, Kristina; Miller, Judith S; McMahon, William M

    2010-01-01

    Background Autism Spectrum Disorders (ASD) are phenotypically heterogeneous, characterized by impairments in the development of communication and social behaviour and the presence of repetitive behaviour and restricted interests. Dissecting the genetic complexity of ASD may require phenotypic data reflecting more detail than is offered by a categorical clinical diagnosis. Such data are available from the Social Responsiveness Scale (SRS) which is a continuous, quantitative measure of social a...

  1. Genome-wide linkage using the Social Responsiveness Scale in Utah autism pedigrees

    OpenAIRE

    Coon Hilary; Villalobos Michele E; Robison Reid J; Camp Nicola J; Cannon Dale S; Allen-Brady Kristina; Miller Judith S; McMahon William M

    2010-01-01

    Background Autism Spectrum Disorders (ASD) are phenotypically heterogeneous, characterized by impairments in the development of communication and social behaviour and the presence of repetitive behaviour and restricted interests. Dissecting the genetic complexity of ASD may require phenotypic data reflecting more detail than is offered by a categorical clinical diagnosis. Such data are available from the Social Responsiveness Scale (SRS) which is a continuous, quantitative measure of...

  2. Isolation and characterization of bovine herpesvirus 4 (BoHV-4) from a cow affected by post partum metritis and cloning of the genome as a bacterial artificial chromosome

    Directory of Open Access Journals (Sweden)

    Cavirani Sandro

    2009-08-01

    Background Bovine herpesvirus 4 (BoHV-4) is a gammaherpesvirus with a worldwide distribution in cattle and is often isolated from the uterus of animals with postpartum metritis or pelvic inflammatory disease. Virus strain adaptation to an organ, tissue or cell type is an important issue for the pathogenesis of disease. To explore the mechanistic role of viral strain variation for uterine disease, the present study aimed to develop a tool enabling precise genetic discrimination between strains of BoHV-4 and to easily manipulate the viral genome. Methods A strain of BoHV-4 was isolated from the uterus of a persistently infected cow and designated BoHV-4-U. The authenticity of the isolate was confirmed by RFLP-PCR and sequencing using the TK and IE2 loci as genetic marker regions for the BoHV-4 genome. The isolated genome was cloned as a Bacterial Artificial Chromosome (BAC) and manipulated through recombineering technology. Results The BoHV-4-U genome was successfully cloned as a BAC, and the stability of the pBAC-BoHV-4-U clone was confirmed over twenty passages, with viral growth similar to the wild type virus. The feasibility of using BoHV-4-U for mutagenesis was demonstrated using the BAC recombineering system. Conclusion The analysis of genome strain variation is a key method for investigating genes associated with disease. A resource for dissection of the interactions between BoHV-4 and host endometrial cells was generated by cloning the genome of BoHV-4 as a BAC.

  3. Correlations Between Bacterial Ecology and Mobile DNA

    OpenAIRE

    Newton, Irene L. G.; Bordenstein, Seth R.

    2010-01-01

    Several factors can affect the density of mobile DNA in bacterial genomes including rates of exposure to novel gene pools, recombination, and reductive evolution. These traits are difficult to measure across a broad range of bacterial species, but the ecological niches occupied by an organism provide some indication of the relative magnitude of these forces. Here, by analyzing 384 bacterial genomes assigned to three ecological categories (obligate intracellular, facultative intracellular, and...

  4. Genome-Scale Transcriptome Analysis of the Desert Shrub Artemisia sphaerocephala

    Science.gov (United States)

    Zhang, Lijing; Hu, Xiaowei; Miao, Xiumei; Chen, Xiaolong; Nan, Shuzhen; Fu, Hua

    2016-01-01

    Background Artemisia sphaerocephala, a semi-shrub belonging to the Artemisia genus of the Compositae family, is an important pioneer plant that inhabits moving and semi-stable sand dunes in the deserts and steppes of northwest and north-central China. It is very resilient in extreme environments. Additionally, its seeds have excellent nutritional value, and the abundant lipids and polysaccharides in the seeds make this plant a potential valuable source of bio-energy. However, partly due to the scarcity of genetic information, the genetic mechanisms controlling the traits and environmental adaptation capacity of A. sphaerocephala are unknown. Results Here, we present the first in-depth transcriptomic analysis of A. sphaerocephala. To maximize the representation of conditional transcripts, mRNA was obtained from 17 samples, including living tissues of desert-growing A. sphaerocephala, seeds germinated in the laboratory, and calli subjected to no stress (control) and high and low temperature, high and low osmotic, and salt stresses. De novo transcriptome assembly performed using an Illumina HiSeq 2500 platform resulted in the generation of 68,373 unigenes. We analyzed the key genes involved in the unsaturated fatty acid synthesis pathway and identified 26 A. sphaerocephala fad2 genes, which is the largest fad2 gene family reported to date. Furthermore, a set of genes responsible for resistance to extreme temperatures, salt, drought and a combination of stresses was identified. Conclusion The present work provides abundant genomic information for functional dissection of the important traits of A. sphaerocephala and contributes to the current understanding of molecular adaptive mechanisms of A. sphaerocephala in the desert environment. Identification of the key genes in the unsaturated fatty acid synthesis pathway could increase understanding of the biological regulatory mechanisms of fatty acid composition traits in plants and facilitate genetic manipulation of the

  5. Genome-Scale Transcriptome Analysis of the Desert Shrub Artemisia sphaerocephala.

    Directory of Open Access Journals (Sweden)

    Lijing Zhang

    Artemisia sphaerocephala, a semi-shrub belonging to the Artemisia genus of the Compositae family, is an important pioneer plant that inhabits moving and semi-stable sand dunes in the deserts and steppes of northwest and north-central China. It is very resilient in extreme environments. Additionally, its seeds have excellent nutritional value, and the abundant lipids and polysaccharides in the seeds make this plant a potential valuable source of bio-energy. However, partly due to the scarcity of genetic information, the genetic mechanisms controlling the traits and environmental adaptation capacity of A. sphaerocephala are unknown. Here, we present the first in-depth transcriptomic analysis of A. sphaerocephala. To maximize the representation of conditional transcripts, mRNA was obtained from 17 samples, including living tissues of desert-growing A. sphaerocephala, seeds germinated in the laboratory, and calli subjected to no stress (control) and high and low temperature, high and low osmotic, and salt stresses. De novo transcriptome assembly performed using an Illumina HiSeq 2500 platform resulted in the generation of 68,373 unigenes. We analyzed the key genes involved in the unsaturated fatty acid synthesis pathway and identified 26 A. sphaerocephala fad2 genes, which is the largest fad2 gene family reported to date. Furthermore, a set of genes responsible for resistance to extreme temperatures, salt, drought and a combination of stresses was identified. The present work provides abundant genomic information for functional dissection of the important traits of A. sphaerocephala and contributes to the current understanding of molecular adaptive mechanisms of A. sphaerocephala in the desert environment. Identification of the key genes in the unsaturated fatty acid synthesis pathway could increase understanding of the biological regulatory mechanisms of fatty acid composition traits in plants and facilitate genetic manipulation of the fatty acid

  6. ReacKnock: identifying reaction deletion strategies for microbial strain optimization based on genome-scale metabolic network.

    Directory of Open Access Journals (Sweden)

    Zixiang Xu

    Gene knockout is a common strategy for improving microbial strains that produce chemicals. Several algorithms are available to predict the target reactions to be deleted. Most of them apply mixed integer bi-level linear programming (MIBLP) based on metabolic networks, and use duality theory to transform the bi-level optimization problem of a large-scale MIBLP into a single-level program. However, the validity of the transformation was not proved. The solution of an MIBLP depends on the structure of the inner problem. If the inner problem is continuous, the Karush-Kuhn-Tucker (KKT) method can be used to reformulate the MIBLP as a single-level one. We adopt the KKT technique in our algorithm ReacKnock to attack the intractable problem of solving the MIBLP, demonstrated with the genome-scale metabolic network model of E. coli for producing various chemicals such as succinate, ethanol and threonine. Compared to previous methods, our algorithm is fast, stable and reliable in finding the optimal solutions for all the chemical products tested, and is able to provide all the alternative deletion strategies that lead to the same industrial objective.

  7. Draft Genome Sequence of a Copper-Resistant Marine Bacterium, Pantoea agglomerans Strain LMAE-2, a Bacterial Strain with Potential Use in Bioremediation.

    Science.gov (United States)

    Corsini, Gino; Valdés, Natalia; Pradel, Paulina; Tello, Mario; Cottet, Luis; Muiño, Laura; Karahanian, Eduardo; Castillo, Antonio; Gonzalez, Alex R

    2016-01-01

    Pantoea agglomerans LMAE-2 was isolated from seabed sediment moderately contaminated with Cu²⁺. Here, we report its draft genome sequence, which has a size of 4.98 Mb. The presence of cop genes related to copper homeostasis in its genome may explain the resistance and strengthen its potential for use as a bioremediation agent. PMID:27313292

  8. Large-scale genome-wide association studies and meta-analyses of longitudinal change in adult lung function.

    Directory of Open Access Journals (Sweden)

    Wenbo Tang

    Genome-wide association studies (GWAS) have identified numerous loci influencing cross-sectional lung function, but less is known about genes influencing longitudinal change in lung function. We performed GWAS of the rate of change in forced expiratory volume in the first second (FEV1) in 14 longitudinal, population-based cohort studies comprising 27,249 adults of European ancestry using linear mixed effects model and combined cohort-specific results using fixed effect meta-analysis to identify novel genetic loci associated with longitudinal change in lung function. Gene expression analyses were subsequently performed for identified genetic loci. As a secondary aim, we estimated the mean rate of decline in FEV1 by smoking pattern, irrespective of genotypes, across these 14 studies using meta-analysis. The overall meta-analysis produced suggestive evidence for association at the novel IL16/STARD5/TMC3 locus on chromosome 15 (P = 5.71 × 10⁻⁷). In addition, meta-analysis using the five cohorts with ≥3 FEV1 measurements per participant identified the novel ME3 locus on chromosome 11 (P = 2.18 × 10⁻⁸) at genome-wide significance. Neither locus was associated with FEV1 decline in two additional cohort studies. We confirmed gene expression of IL16, STARD5, and ME3 in multiple lung tissues. Publicly available microarray data confirmed differential expression of all three genes in lung samples from COPD patients compared with controls. Irrespective of genotypes, the combined estimate for FEV1 decline was 26.9, 29.2 and 35.7 mL/year in never, former, and persistent smokers, respectively. In this large-scale GWAS, we identified two novel genetic loci in association with the rate of change in FEV1 that harbor candidate genes with biologically plausible functional links to lung function.

  9. Large-Scale Genome-Wide Association Studies and Meta-Analyses of Longitudinal Change in Adult Lung Function

    Science.gov (United States)

    Tang, Wenbo; Kowgier, Matthew; Loth, Daan W.; Soler Artigas, María; Joubert, Bonnie R.; Hodge, Emily; Gharib, Sina A.; Smith, Albert V.; Ruczinski, Ingo; Gudnason, Vilmundur; Mathias, Rasika A.; Harris, Tamara B.; Hansel, Nadia N.; Launer, Lenore J.; Barnes, Kathleen C.; Hansen, Joyanna G.; Albrecht, Eva; Aldrich, Melinda C.; Allerhand, Michael; Barr, R. Graham; Brusselle, Guy G.; Couper, David J.; Curjuric, Ivan; Davies, Gail; Deary, Ian J.; Dupuis, Josée; Fall, Tove; Foy, Millennia; Franceschini, Nora; Gao, Wei; Gläser, Sven; Gu, Xiangjun; Hancock, Dana B.; Heinrich, Joachim; Hofman, Albert; Imboden, Medea; Ingelsson, Erik; James, Alan; Karrasch, Stefan; Koch, Beate; Kritchevsky, Stephen B.; Kumar, Ashish; Lahousse, Lies; Li, Guo; Lind, Lars; Lindgren, Cecilia; Liu, Yongmei; Lohman, Kurt; Lumley, Thomas; McArdle, Wendy L.; Meibohm, Bernd; Morris, Andrew P.; Morrison, Alanna C.; Musk, Bill; North, Kari E.; Palmer, Lyle J.; Probst-Hensch, Nicole M.; Psaty, Bruce M.; Rivadeneira, Fernando; Rotter, Jerome I.; Schulz, Holger; Smith, Lewis J.; Sood, Akshay; Starr, John M.; Strachan, David P.; Teumer, Alexander; Uitterlinden, André G.; Völzke, Henry; Voorman, Arend; Wain, Louise V.; Wells, Martin T.; Wilk, Jemma B.; Williams, O. Dale; Heckbert, Susan R.; Stricker, Bruno H.; London, Stephanie J.; Fornage, Myriam; Tobin, Martin D.; O′Connor, George T.; Hall, Ian P.; Cassano, Patricia A.

    2014-01-01

    Background Genome-wide association studies (GWAS) have identified numerous loci influencing cross-sectional lung function, but less is known about genes influencing longitudinal change in lung function. Methods We performed GWAS of the rate of change in forced expiratory volume in the first second (FEV1) in 14 longitudinal, population-based cohort studies comprising 27,249 adults of European ancestry using linear mixed effects model and combined cohort-specific results using fixed effect meta-analysis to identify novel genetic loci associated with longitudinal change in lung function. Gene expression analyses were subsequently performed for identified genetic loci. As a secondary aim, we estimated the mean rate of decline in FEV1 by smoking pattern, irrespective of genotypes, across these 14 studies using meta-analysis. Results The overall meta-analysis produced suggestive evidence for association at the novel IL16/STARD5/TMC3 locus on chromosome 15 (P = 5.71 × 10⁻⁷). In addition, meta-analysis using the five cohorts with ≥3 FEV1 measurements per participant identified the novel ME3 locus on chromosome 11 (P = 2.18 × 10⁻⁸) at genome-wide significance. Neither locus was associated with FEV1 decline in two additional cohort studies. We confirmed gene expression of IL16, STARD5, and ME3 in multiple lung tissues. Publicly available microarray data confirmed differential expression of all three genes in lung samples from COPD patients compared with controls. Irrespective of genotypes, the combined estimate for FEV1 decline was 26.9, 29.2 and 35.7 mL/year in never, former, and persistent smokers, respectively. Conclusions In this large-scale GWAS, we identified two novel genetic loci in association with the rate of change in FEV1 that harbor candidate genes with biologically plausible functional links to lung function. PMID:24983941
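
The abstract above combines cohort-specific results with a fixed effect meta-analysis. As a rough illustration only, the textbook inverse-variance form of that technique can be sketched as follows. The function name `fixed_effect_meta` and the input numbers are hypothetical and are not taken from the study, whose exact weighting scheme and software are not stated in the abstract.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis (generic textbook form).

    betas: per-cohort effect estimates for one variant
    ses:   per-cohort standard errors (all must be > 0)
    Returns the pooled estimate, its standard error, the z score,
    and a two-sided p-value under a normal approximation.
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equally long")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided normal tail probability: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


if __name__ == "__main__":
    # Hypothetical per-cohort slopes (mL/year per allele) and standard errors.
    pooled = fixed_effect_meta([-1.2, -0.8, -1.5], [0.6, 0.5, 0.9])
    print("beta=%.3f se=%.3f z=%.2f p=%.3g" % pooled)
```

Cohorts with smaller standard errors dominate the pooled estimate, which is why such a design gains power from adding large cohorts even when each one is individually underpowered.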

  10. Systematic construction of kinetic models from genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Natalie J Stanford

    The quantitative effects of environmental and genetic perturbations on metabolism can be studied in silico using kinetic models. We present a strategy for large-scale model construction based on a logical layering of data such as reaction fluxes, metabolite concentrations, and kinetic constants. The resulting models contain realistic standard rate laws and plausible parameters, adhere to the laws of thermodynamics, and reproduce a predefined steady state. These features have not been simultaneously achieved by previous workflows. We demonstrate the advantages and limitations of the workflow by translating the yeast consensus metabolic network into a kinetic model. Despite crudely selected data, the model shows realistic control behaviour, a stable dynamic, and realistic response to perturbations in extracellular glucose concentrations. The paper concludes by outlining how new data can continuously be fed into the workflow and how iterative model building can assist in directing experiments.

  11. Acorn: A grid computing system for constraint based modeling and visualization of the genome scale metabolic reaction networks via a web interface

    Directory of Open Access Journals (Sweden)

    Bushell Michael E

    2011-05-01

    Background Constraint-based approaches facilitate the prediction of cellular metabolic capabilities, based, in turn, on predictions of the repertoire of enzymes encoded in the genome. Recently, genome annotations have been used to reconstruct genome scale metabolic reaction networks for numerous species, including Homo sapiens, which allow simulations that provide valuable insights into topics, including predictions of gene essentiality of pathogens, interpretation of genetic polymorphism in metabolic disease syndromes and suggestions for novel approaches to microbial metabolic engineering. These constraint-based simulations are being integrated with the functional genomics portals, an activity that requires efficient implementation of the constraint-based simulations in the web-based environment. Results Here, we present Acorn, an open source (GNU GPL) grid computing system for constraint-based simulations of genome scale metabolic reaction networks within an interactive web environment. The grid-based architecture allows efficient execution of computationally intensive, iterative protocols such as Flux Variability Analysis, which can be readily scaled up as the numbers of models (and users) increase. The web interface uses AJAX, which facilitates efficient model browsing and other search functions, and intuitive implementation of appropriate simulation conditions. Research groups can install Acorn locally and create user accounts. Users can also import models in the familiar SBML format and link reaction formulas to major functional genomics portals of choice. Selected models and simulation results can be shared between different users and made publicly available. Users can construct pathway map layouts and import them into the server using a desktop editor integrated within the system. Pathway maps are then used to visualise numerical results within the web environment. To illustrate these features we have deployed Acorn and created a

  12. Three minimum tile paths from bacterial artificial chromosome libraries of the soybean (Glycine max cv. 'Forrest'): tools for structural and functional genomics

    OpenAIRE

    Afzal AJ; Yaegashi S; Yesudas C; Shultz JL; Kazi S; Lightfoot DA

    2006-01-01

    Background The creation of minimally redundant tile paths (hereafter MTP) from contiguous sets of overlapping clones (hereafter contigs) in physical maps is a critical step for structural and functional genomics. Build 4 of the physical map of soybean (Glycine max L. Merr. cv. 'Forrest') showed the 1 Gbp haploid genome was composed of 0.7 Gbp diploid, 0.1 Gbp tetraploid and 0.2 Gbp octoploid regions. Therefore, the size of the unique genome was about 0.8 Gbp. The aim here was to crea...

  13. Bacterial hydrodynamics

    CERN Document Server

    Lauga, Eric

    2015-01-01

    Bacteria predate plants and animals by billions of years. Today, they are the world's smallest cells yet they represent the bulk of the world's biomass, and the main reservoir of nutrients for higher organisms. Most bacteria can move on their own, and the majority of motile bacteria are able to swim in viscous fluids using slender helical appendages called flagella. Low-Reynolds-number hydrodynamics is at the heart of the ability of flagella to generate propulsion at the micron scale. In fact, fluid dynamic forces impact many aspects of bacteriology, ranging from the ability of cells to reorient and search their surroundings to their interactions within mechanically and chemically-complex environments. Using hydrodynamics as an organizing framework, we review the biomechanics of bacterial motility and look ahead to future challenges.

  14. Genome-scale reconstruction and system level investigation of the metabolic network of Methylobacterium extorquens AM1

    Directory of Open Access Journals (Sweden)

    Peyraud Rémi

    2011-11-01

    Background Methylotrophic microorganisms play a key role in biogeochemical processes, especially the global carbon cycle, and have gained interest for biotechnological purposes. Significant progress has been made in recent years in the biochemistry, genetics, genomics, and physiology of methylotrophic bacteria, showing that methylotrophy is much more widespread and versatile than initially assumed. Despite such progress, a system-level description of methylotrophic metabolism is currently lacking, and much remains to be understood regarding the network-scale organization and properties of methylotrophy, and how the methylotrophic capacity emerges from this organization, especially in facultative organisms. Results In this work, we report on the integrated, system-level investigation of the metabolic network of the facultative methylotroph Methylobacterium extorquens AM1, a valuable model of methylotrophic bacteria. The genome-scale metabolic network of the bacterium was reconstructed and contains 1139 reactions and 977 metabolites. The sub-network operating upon methylotrophic growth was identified from both in silico and experimental investigations, and 13C-fluxomics was applied to measure the distribution of metabolic fluxes under such conditions. The core metabolism has a highly unusual topology, in which the unique enzymes that catalyse the key steps of C1 assimilation are tightly connected by several large metabolic cycles (serine cycle, ethylmalonyl-CoA pathway, TCA cycle, anaplerotic processes). The entire set of reactions must operate as a unique process to achieve C1 assimilation, but was shown to be structurally fragile based on network analysis. This observation suggests that in nature a strong pressure of selection must exist to maintain the methylotrophic capability. Nevertheless, substantial substrate cycling could be measured within C2/C3/C4 inter-conversions, indicating that the metabolic network is highly

  15. Dynamics of bacterial populations during bench‐scale bioremediation of oily seawater and desert soil bioaugmented with coastal microbial mats

    OpenAIRE

    Ali, Nidaa; Dashti, Narjes; Salamah, Samar; Sorkhoh, Naser; Al‐Awadhi, Husain; Radwan, Samir

    2016-01-01

    Summary This study describes a bench‐scale attempt to bioremediate Kuwaiti, oily water and soil samples through bioaugmentation with coastal microbial mats rich in hydrocarbonoclastic bacterioflora. Seawater and desert soil samples were artificially polluted with 1% weathered oil, and bioaugmented with microbial mat suspensions. Oil removal and microbial community dynamics were monitored. In batch cultures, oil removal was more effective in soil than in seawater. Hydrocarbonoclastic bacteria ...

  16. Bacterial population genetics, evolution and epidemiology.

    OpenAIRE

    Spratt, B. G.; Maiden, M C

    1999-01-01

    Asexual bacterial populations inevitably consist of an assemblage of distinct clonal lineages. However, bacterial populations are not entirely asexual since recombinational exchanges occur, mobilizing small genome segments among lineages and species. The relative contribution of recombination, as opposed to de novo mutation, in the generation of new bacterial genotypes varies among bacterial populations and, as this contribution increases, the clonality of a given population decreases. In con...

  17. Genome-scale identification of membrane-associated human mRNAs.

    Directory of Open Access Journals (Sweden)

    Maximilian Diehn

    2006-01-01

    The subcellular localization of proteins is critical to their biological roles. Moreover, whether a protein is membrane-bound, secreted, or intracellular affects the usefulness of, and the strategies for, using a protein as a diagnostic marker or a target for therapy. We employed a rapid and efficient experimental approach to classify thousands of human gene products as either "membrane-associated/secreted" (MS) or "cytosolic/nuclear" (CN). Using subcellular fractionation methods, we separated mRNAs associated with membranes from those associated with the soluble cytosolic fraction and analyzed these two pools by comparative hybridization to DNA microarrays. Analysis of 11 different human cell lines, representing lymphoid, myeloid, breast, ovarian, hepatic, colon, and prostate tissues, identified more than 5,000 previously uncharacterized MS and more than 6,400 putative CN genes at high confidence levels. The experimentally determined localizations correlated well with in silico predictions of signal peptides and transmembrane domains, but also significantly increased the number of human genes that could be cataloged as encoding either MS or CN proteins. Using gene expression data from a variety of primary human malignancies and normal tissues, we rationally identified hundreds of MS gene products that are significantly overexpressed in tumors compared to normal tissues and thus represent candidates for serum diagnostic tests or monoclonal antibody-based therapies. Finally, we used the catalog of CN gene products to generate sets of candidate markers of organ-specific tissue injury. The large-scale annotation of subcellular localization reported here will serve as a reference database and will aid in the rational design of diagnostic tests and molecular therapies for diverse diseases.

  18. Draft Genome Sequence of Flavobacterium sp. Strain TAB 87, Able To Inhibit the Growth of Cystic Fibrosis Bacterial Pathogens Belonging to the Burkholderia cepacia Complex.

    Science.gov (United States)

    Presta, Luana; Inzucchi, Ilaria; Bosi, Emanuele; Fondi, Marco; Perrin, Elena; Miceli, Elisangela; Tutino, Maria Luisa; Lo Giudice, Angelina; de Pascale, Donatella; Fani, Renato

    2016-01-01

    We report here the draft genome sequence of the Flavobacterium sp. TAB 87 strain, isolated from Antarctic seawater during a summer campaign near the French Antarctic station Dumont d'Urville (60°40'S, 40°01'E). It will allow for comparative genomics and the fulfillment of both fundamental and application-oriented investigations. It allowed the recognition of genes associated with the production of bioactive compounds and antibiotic resistance. PMID:27198032

  19. A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm

    OpenAIRE

    Allen Eric E; Gaasterland Terry; Podell Sheila

    2008-01-01

    Background The process of horizontal gene transfer (HGT) is believed to be widespread in Bacteria and Archaea, but little comparative data is available addressing its occurrence in complete microbial genomes. Collection of high-quality, automated HGT prediction data based on phylogenetic evidence has previously been impractical for large numbers of genomes at once, due to prohibitive computational demands. DarkHorse, a recently described statistical method for discovering phylogeneti...

  20. BPGA- an ultra-fast pan-genome analysis pipeline.

    Science.gov (United States)

    Chaudhari, Narendrakumar M; Gupta, Vinod Kumar; Dutta, Chitra

    2016-01-01

    Recent advances in ultra-high-throughput sequencing technology and metagenomics have led to a paradigm shift in microbial genomics from few genome comparisons to large-scale pan-genome studies at different scales of phylogenetic resolution. Pan-genome studies provide a framework for estimating the genomic diversity of the dataset, determining the core (conserved), accessory (dispensable) and unique (strain-specific) gene pools of a species, tracing horizontal gene flux across strains and providing insight into species evolution. The existing pan-genome software tools suffer from various limitations such as limited datasets, difficult installation/requirements and inadequate functional features. Here we present an ultra-fast computational pipeline, BPGA (Bacterial Pan Genome Analysis tool), with seven functional modules. In addition to the routine pan-genome analyses, BPGA introduces a number of novel features for downstream analyses, such as core/pan/MLST (Multi Locus Sequence Typing) phylogeny, exclusive presence/absence of genes in specific strains, subset analysis, atypical G + C content analysis and KEGG & COG mapping of core, accessory and unique genes. Other notable features include minimal running prerequisites, freedom to select the gene clustering method, ultra-fast execution, a user-friendly command line interface and high-quality graphics outputs. The performance of BPGA has been evaluated using a dataset of complete genome sequences of 28 Streptococcus pyogenes strains. PMID:27071527
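
The core/accessory/unique split mentioned in the abstract above can be illustrated with a small sketch. This is not BPGA's code or interface. It assumes gene clustering has already been done, so that each strain is represented by a set of gene-family identifiers, and the function name `partition_pan_genome` and the strain and family labels are hypothetical.

```python
from collections import Counter


def partition_pan_genome(strain_genes):
    """Split gene families into core, accessory and unique sets.

    strain_genes: dict mapping a strain name to an iterable of
                  gene-family identifiers (assumed already clustered).
    core      = families present in every strain
    unique    = families present in exactly one strain
    accessory = families present in two or more strains, but not all
    """
    n_strains = len(strain_genes)
    # Count in how many strains each family occurs (once per strain).
    counts = Counter(
        family for genes in strain_genes.values() for family in set(genes)
    )
    core = {f for f, c in counts.items() if c == n_strains}
    # With a single strain, every family is core, so none is called unique.
    unique = {f for f, c in counts.items() if c == 1 and n_strains > 1}
    accessory = set(counts) - core - unique
    return core, accessory, unique


if __name__ == "__main__":
    # Hypothetical toy input: three strains and five gene families.
    toy = {
        "strainA": {"fam1", "fam2", "fam3"},
        "strainB": {"fam1", "fam2", "fam4"},
        "strainC": {"fam1", "fam5"},
    }
    core, accessory, unique = partition_pan_genome(toy)
    print(sorted(core), sorted(accessory), sorted(unique))
```

The three sets are disjoint and together make up the pan-genome, so the sizes of the three partitions always sum to the total number of gene families.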