WorldWideScience

Sample records for bacterial genome scale

  1. Genome-scale models of bacterial metabolism: reconstruction and applications

    OpenAIRE

    Durot, Maxime; Bourguignon, Pierre-Yves; Schachter, Vincent

    2008-01-01

    Genome-scale metabolic models bridge the gap between genome-derived biochemical information and metabolic phenotypes in a principled manner, providing a solid interpretative framework for experimental data related to metabolic states, and enabling simple in silico experiments with whole-cell metabolism. Models have been reconstructed for almost 20 bacterial species, so far mainly through expert curation efforts integrating information from the literature with genome annotation. A wide variety...

  2. Large-scale genomic analysis suggests a neutral punctuated dynamics of transposable elements in bacterial genomes.

    Science.gov (United States)

    Iranzo, Jaime; Gómez, Manuel J; López de Saro, Francisco J; Manrubia, Susanna

    2014-06-01

    Insertion sequences (IS) are the simplest and most abundant form of transposable DNA found in bacterial genomes. When present in multiple copies, it is thought that they can promote genomic plasticity and genetic exchange, thus being a major force of evolutionary change. The main processes that determine IS content in genomes are, though, a matter of debate. In this work, we take advantage of the large amount of genomic data currently available and study the abundance distributions of 33 IS families in 1811 bacterial chromosomes. This allows us to test simple models of IS dynamics and estimate their key parameters by means of a maximum likelihood approach. We evaluate the roles played by duplication, lateral gene transfer, deletion and purifying selection. We find that the observed IS abundances are compatible with a neutral scenario where IS proliferation is controlled by deletions instead of purifying selection. Even if there may be some cases driven by selection, neutral behavior dominates over large evolutionary scales. According to this view, IS and hosts tend to coexist in a dynamic equilibrium state for most of the time. Our approach also allows for a detection of recent IS expansions, and supports the hypothesis that rapid expansions constitute transient events, or punctuations, during which the state of coexistence of IS and host becomes perturbed. PMID:24967627
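    To make the idea of a copy-number equilibrium set by duplication, lateral transfer and deletion concrete, here is a minimal sketch of a generic linear birth-death-immigration process. It is a textbook toy model, not the authors' model or their maximum likelihood fit; the function name, the rate symbols `h`, `d`, `delta` and the numerical values are assumptions made for illustration.

```python
def stationary_copy_number(h, d, delta, n_max=500):
    """Stationary distribution of IS copy number n (truncated at n_max).

    Assumed toy process (not the authors' model):
      n -> n + 1 at rate h + d * n   (lateral transfer + duplication)
      n -> n - 1 at rate delta * n   (deletion)
    """
    if h <= 0 or d < 0 or not d < delta:
        raise ValueError("need h > 0 and 0 <= d < delta for a stationary state")
    weights = [1.0]
    for n in range(n_max):
        # Detailed balance: p(n + 1) * delta * (n + 1) = p(n) * (h + d * n)
        weights.append(weights[-1] * (h + d * n) / (delta * (n + 1)))
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical rates, chosen only for illustration.
probs = stationary_copy_number(h=0.1, d=0.8, delta=1.0)
mean_copies = sum(n * p for n, p in enumerate(probs))
print(f"P(no copies) = {probs[0]:.3f}, mean copies = {mean_copies:.3f}")
```

    In this toy process the analytic mean copy number is h / (delta - d), so deletion outpacing duplication is what keeps the abundance bounded without any selection term.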

  3. Large-scale genomic analysis suggests a neutral punctuated dynamics of transposable elements in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Jaime Iranzo

    2014-06-01

    Insertion sequences (IS) are the simplest and most abundant form of transposable DNA found in bacterial genomes. When present in multiple copies, it is thought that they can promote genomic plasticity and genetic exchange, thus being a major force of evolutionary change. The main processes that determine IS content in genomes are, though, a matter of debate. In this work, we take advantage of the large amount of genomic data currently available and study the abundance distributions of 33 IS families in 1811 bacterial chromosomes. This allows us to test simple models of IS dynamics and estimate their key parameters by means of a maximum likelihood approach. We evaluate the roles played by duplication, lateral gene transfer, deletion and purifying selection. We find that the observed IS abundances are compatible with a neutral scenario where IS proliferation is controlled by deletions instead of purifying selection. Even if there may be some cases driven by selection, neutral behavior dominates over large evolutionary scales. According to this view, IS and hosts tend to coexist in a dynamic equilibrium state for most of the time. Our approach also allows for a detection of recent IS expansions, and supports the hypothesis that rapid expansions constitute transient events, or punctuations, during which the state of coexistence of IS and host becomes perturbed.

  4. Genome-scale Co-evolutionary Inference Identifies Functions and Clients of Bacterial Hsp90

    OpenAIRE

    Press, Maximilian O.; Li, Hui; Creanza, Nicole; Kramer, Günter; Queitsch, Christine; Sourjik, Victor; Borenstein, Elhanan

    2013-01-01

    The molecular chaperone Hsp90 is essential in eukaryotes, in which it facilitates the folding of developmental regulators and signal transduction proteins known as Hsp90 clients. In contrast, Hsp90 is not essential in bacteria, and a broad characterization of its molecular and organismal function is lacking. To enable such characterization, we used a genome-scale phylogenetic analysis to identify genes that co-evolve with bacterial Hsp90. We find that genes whose gain and loss were coordinate...

  5. Genome-scale co-evolutionary inference identifies functions and clients of bacterial Hsp90.

    Science.gov (United States)

    Press, Maximilian O; Li, Hui; Creanza, Nicole; Kramer, Günter; Queitsch, Christine; Sourjik, Victor; Borenstein, Elhanan

    2013-01-01

    The molecular chaperone Hsp90 is essential in eukaryotes, in which it facilitates the folding of developmental regulators and signal transduction proteins known as Hsp90 clients. In contrast, Hsp90 is not essential in bacteria, and a broad characterization of its molecular and organismal function is lacking. To enable such characterization, we used a genome-scale phylogenetic analysis to identify genes that co-evolve with bacterial Hsp90. We find that genes whose gain and loss were coordinated with Hsp90 throughout bacterial evolution tended to function in flagellar assembly, chemotaxis, and bacterial secretion, suggesting that Hsp90 may aid assembly of protein complexes. To add to the limited set of known bacterial Hsp90 clients, we further developed a statistical method to predict putative clients. We validated our predictions by demonstrating that the flagellar protein FliN and the chemotaxis kinase CheA behaved as Hsp90 clients in Escherichia coli, confirming the predicted role of Hsp90 in chemotaxis and flagellar assembly. Furthermore, normal Hsp90 function is important for wild-type motility and/or chemotaxis in E. coli. This novel function of bacterial Hsp90 agreed with our subsequent finding that Hsp90 is associated with a preference for multiple habitats and may therefore face a complex selection regime. Taken together, our results reveal previously unknown functions of bacterial Hsp90 and open avenues for future experimental exploration by implicating Hsp90 in the assembly of membrane protein complexes and adaptation to novel environments. PMID:23874229

  6. Genome-scale co-evolutionary inference identifies functions and clients of bacterial Hsp90.

    Directory of Open Access Journals (Sweden)

    Maximilian O Press

    The molecular chaperone Hsp90 is essential in eukaryotes, in which it facilitates the folding of developmental regulators and signal transduction proteins known as Hsp90 clients. In contrast, Hsp90 is not essential in bacteria, and a broad characterization of its molecular and organismal function is lacking. To enable such characterization, we used a genome-scale phylogenetic analysis to identify genes that co-evolve with bacterial Hsp90. We find that genes whose gain and loss were coordinated with Hsp90 throughout bacterial evolution tended to function in flagellar assembly, chemotaxis, and bacterial secretion, suggesting that Hsp90 may aid assembly of protein complexes. To add to the limited set of known bacterial Hsp90 clients, we further developed a statistical method to predict putative clients. We validated our predictions by demonstrating that the flagellar protein FliN and the chemotaxis kinase CheA behaved as Hsp90 clients in Escherichia coli, confirming the predicted role of Hsp90 in chemotaxis and flagellar assembly. Furthermore, normal Hsp90 function is important for wild-type motility and/or chemotaxis in E. coli. This novel function of bacterial Hsp90 agreed with our subsequent finding that Hsp90 is associated with a preference for multiple habitats and may therefore face a complex selection regime. Taken together, our results reveal previously unknown functions of bacterial Hsp90 and open avenues for future experimental exploration by implicating Hsp90 in the assembly of membrane protein complexes and adaptation to novel environments.

  7. Targeted Large-Scale Deletion of Bacterial Genomes Using CRISPR-Nickases.

    Science.gov (United States)

    Standage-Beier, Kylie; Zhang, Qi; Wang, Xiao

    2015-11-20

    Programmable CRISPR-Cas systems have augmented our ability to produce precise genome manipulations. Here we demonstrate and characterize the ability of CRISPR-Cas derived nickases to direct targeted recombination of both small and large genomic regions flanked by repetitive elements in Escherichia coli. While CRISPR directed double-stranded DNA breaks are highly lethal in many bacteria, we show that CRISPR-guided nickase systems can be programmed to make precise, nonlethal, single-stranded incisions in targeted genomic regions. This induces recombination events and leads to targeted deletion. We demonstrate that dual-targeted nicking enables deletion of 36 and 97 Kb of the genome. Furthermore, multiplex targeting enables deletion of 133 Kb, accounting for approximately 3% of the entire E. coli genome. This technology provides a framework for methods to manipulate bacterial genomes using CRISPR-nickase systems. We envision this system working synergistically with preexisting bacterial genome engineering methods.

  8. Biofilm Formation Mechanisms of Pseudomonas aeruginosa Predicted via Genome-Scale Kinetic Models of Bacterial Metabolism.

    Science.gov (United States)

    Vital-Lopez, Francisco G; Reifman, Jaques; Wallqvist, Anders

    2015-10-01

    A hallmark of Pseudomonas aeruginosa is its ability to establish biofilm-based infections that are difficult to eradicate. Biofilms are less susceptible to host inflammatory and immune responses and have higher antibiotic tolerance than free-living planktonic cells. Developing treatments against biofilms requires an understanding of bacterial biofilm-specific physiological traits. Research efforts have started to elucidate the intricate mechanisms underlying biofilm development. However, many aspects of these mechanisms are still poorly understood. Here, we addressed questions regarding biofilm metabolism using a genome-scale kinetic model of the P. aeruginosa metabolic network and gene expression profiles. Specifically, we computed metabolite concentration differences between known mutants with altered biofilm formation and the wild-type strain to predict drug targets against P. aeruginosa biofilms. We also simulated the altered metabolism driven by gene expression changes between biofilm and stationary growth-phase planktonic cultures. Our analysis suggests that the synthesis of important biofilm-related molecules, such as the quorum-sensing molecule Pseudomonas quinolone signal and the exopolysaccharide Psl, is regulated not only through the expression of genes in their own synthesis pathway, but also through the biofilm-specific expression of genes in pathways competing for precursors to these molecules. Finally, we investigated why mutants defective in anthranilate degradation have an impaired ability to form biofilms. Alternative to a previous hypothesis that this biofilm reduction is caused by a decrease in energy production, we proposed that the dysregulation of the synthesis of secondary metabolites derived from anthranilate and chorismate is what impaired the biofilms of these mutants. Notably, these insights generated through our kinetic model-based approach are not accessible from previous constraint-based model analyses of P. aeruginosa biofilm

  9. LocateP: Genome-scale subcellular-location predictor for bacterial proteins

    Directory of Open Access Journals (Sweden)

    Zhou Miaomiao

    2008-03-01

    Background: In the past decades, various protein subcellular-location (SCL) predictors have been developed. Most of these predictors, like TMHMM 2.0, SignalP 3.0, PrediSi and Phobius, aim at the identification of one or a few SCLs, whereas others such as CELLO and PSORTb v.2.0 aim at a broader classification. Although these tools and pipelines can achieve a high precision in the accurate prediction of signal peptides and transmembrane helices, they have a much lower accuracy when other sequence characteristics are concerned. For instance, it proved notoriously difficult to identify the fate of proteins carrying a putative type I signal peptidase (SPIase) cleavage site, as many of those proteins are retained in the cell membrane as N-terminally anchored membrane proteins. Moreover, most of the SCL classifiers are based on the classification of the Swiss-Prot database and consequently inherited the inconsistency of that SCL classification. As accurate and detailed SCL prediction on a genome scale is highly desired by experimental researchers, we decided to construct a new SCL prediction pipeline: LocateP. Results: LocateP combines many of the existing high-precision SCL identifiers with our own newly developed identifiers for specific SCLs. The LocateP pipeline was designed such that it mimics protein targeting and secretion processes. It distinguishes 7 different SCLs within Gram-positive bacteria: intracellular, multi-transmembrane, N-terminally membrane anchored, C-terminally membrane anchored, lipid-anchored, LPxTG-type cell-wall anchored, and secreted/released proteins. Moreover, it distinguishes pathways for Sec- or Tat-dependent secretion and alternative secretion of bacteriocin-like proteins. The pipeline was tested on data sets extracted from literature, including experimental proteomics studies. The tests showed that LocateP performs as well as, or even slightly better than other SCL predictors for some locations and outperforms

  10. Consistency Analysis of Genome-Scale Models of Bacterial Metabolism: A Metamodel Approach.

    Directory of Open Access Journals (Sweden)

    Miguel Ponce-de-Leon

    Genome-scale metabolic models usually contain inconsistencies that manifest as blocked reactions and gap metabolites. With the purpose to detect recurrent inconsistencies in metabolic models, a large-scale analysis was performed using a previously published dataset of 130 genome-scale models. The results showed that a large number of reactions (~22%) are blocked in all the models where they are present. To unravel the nature of such inconsistencies a metamodel was construed by joining the 130 models in a single network. This metamodel was manually curated using the unconnected modules approach, and then, it was used as a reference network to perform a gap-filling on each individual genome-scale model. Finally, a set of 36 models that had not been considered during the construction of the metamodel was used, as a proof of concept, to extend the metamodel with new biochemical information, and to assess its impact on gap-filling results. The analysis performed on the metamodel allowed to conclude: 1) the recurrent inconsistencies found in the models were already present in the metabolic database used during the reconstructions process; 2) the presence of inconsistencies in a metabolic database can be propagated to the reconstructed models; 3) there are reactions not manifested as blocked which are active as a consequence of some classes of artifacts, and; 4) the results of an automatic gap-filling are highly dependent on the consistency and completeness of the metamodel or metabolic database used as the reference network. In conclusion the consistency analysis should be applied to metabolic databases in order to detect and fill gaps as well as to detect and remove artifacts and redundant information.

  11. Consistency Analysis of Genome-Scale Models of Bacterial Metabolism: A Metamodel Approach.

    Science.gov (United States)

    Ponce-de-Leon, Miguel; Calle-Espinosa, Jorge; Peretó, Juli; Montero, Francisco

    2015-01-01

    Genome-scale metabolic models usually contain inconsistencies that manifest as blocked reactions and gap metabolites. With the purpose to detect recurrent inconsistencies in metabolic models, a large-scale analysis was performed using a previously published dataset of 130 genome-scale models. The results showed that a large number of reactions (~22%) are blocked in all the models where they are present. To unravel the nature of such inconsistencies a metamodel was construed by joining the 130 models in a single network. This metamodel was manually curated using the unconnected modules approach, and then, it was used as a reference network to perform a gap-filling on each individual genome-scale model. Finally, a set of 36 models that had not been considered during the construction of the metamodel was used, as a proof of concept, to extend the metamodel with new biochemical information, and to assess its impact on gap-filling results. The analysis performed on the metamodel allowed to conclude: 1) the recurrent inconsistencies found in the models were already present in the metabolic database used during the reconstructions process; 2) the presence of inconsistencies in a metabolic database can be propagated to the reconstructed models; 3) there are reactions not manifested as blocked which are active as a consequence of some classes of artifacts, and; 4) the results of an automatic gap-filling are highly dependent on the consistency and completeness of the metamodel or metabolic database used as the reference network. In conclusion the consistency analysis should be applied to metabolic databases in order to detect and fill gaps as well as to detect and remove artifacts and redundant information. PMID:26629901
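    As a rough illustration of what a gap metabolite is, the sketch below flags metabolites in a toy network that can only be produced or only be consumed. It is a purely topological check under assumed data structures; the function name, the reaction encoding and the toy network are hypothetical, and it does not reproduce the unconnected modules approach or the flux-based detection of blocked reactions used in the study.

```python
def find_gap_metabolites(reactions):
    """Return metabolites that are never produced or never consumed.

    reactions: dict mapping a reaction name to (stoichiometry, reversible),
    where stoichiometry maps metabolite -> coefficient (negative = consumed,
    positive = produced). A reversible reaction can do both.
    """
    produced, consumed = set(), set()
    for stoichiometry, reversible in reactions.values():
        for metabolite, coefficient in stoichiometry.items():
            if coefficient > 0 or (reversible and coefficient != 0):
                produced.add(metabolite)
            if coefficient < 0 or (reversible and coefficient != 0):
                consumed.add(metabolite)
    return {
        "never_produced": sorted(consumed - produced),
        "never_consumed": sorted(produced - consumed),
    }


# Hypothetical toy network: A is taken up, C accumulates, D has no source.
toy_network = {
    "uptake_A": ({"A": 1}, False),
    "R1": ({"A": -1, "B": 1}, False),
    "R2": ({"B": -1, "C": 1}, False),
    "R3": ({"D": -1, "B": 1}, False),
}
gaps = find_gap_metabolites(toy_network)
print(gaps)
```

    In this toy case R2 and R3 could not carry steady-state flux, because C has no sink and D has no source, which is the kind of inconsistency the abstract describes.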

  12. The large-scale blast score ratio (LS-BSR) pipeline: a method to rapidly compare genetic content between bacterial genomes.

    Science.gov (United States)

    Sahl, Jason W; Caporaso, J Gregory; Rasko, David A; Keim, Paul

    2014-01-01

    Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the rapid, large-scale, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 min using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP) based phylogeny. Comparisons were then used to identify clade specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in 27-57 h, depending upon the alignment method, using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates. Taxa-specific genetic markers can then be translated into clinical
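    The matrix described above can be sketched as follows, assuming the usual definition of a BLAST score ratio: the bit score of a CDS aligned against a genome, divided by the bit score of the same CDS aligned against itself. The function name, input layout and scores below are hypothetical and are not taken from the LS-BSR code; the alignment step itself is not shown.

```python
def blast_score_ratio_matrix(self_scores, genome_scores):
    """Build a CDS-by-genome matrix of BLAST score ratios.

    self_scores: dict CDS id -> bit score of the CDS aligned against itself.
    genome_scores: dict genome id -> dict CDS id -> best bit score of that
    CDS in the genome (a missing CDS is treated as score 0).
    """
    matrix = {}
    for cds, self_score in self_scores.items():
        if self_score <= 0:
            raise ValueError(f"self score for {cds} must be positive")
        matrix[cds] = {
            genome: hits.get(cds, 0.0) / self_score
            for genome, hits in genome_scores.items()
        }
    return matrix


# Hypothetical bit scores for two CDSs in two genomes.
self_scores = {"cds1": 200.0, "cds2": 400.0}
genome_scores = {
    "genomeA": {"cds1": 200.0, "cds2": 100.0},
    "genomeB": {"cds1": 150.0},  # cds2 has no hit in genomeB
}
bsr = blast_score_ratio_matrix(self_scores, genome_scores)
for cds, row in bsr.items():
    print(cds, row)
```

    Values near 1 indicate a CDS that is conserved in a genome and values near 0 indicate absence, which is what makes the matrix easy to parse for clade-specific markers.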

  13. The large-scale blast score ratio (LS-BSR) pipeline: a method to rapidly compare genetic content between bacterial genomes

    Directory of Open Access Journals (Sweden)

    Jason W. Sahl

    2014-04-01

    Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the rapid, large-scale, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 min using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP) based phylogeny. Comparisons were then used to identify clade specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in 27–57 h, depending upon the alignment method, using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates. Taxa-specific genetic markers can then be

  14. Bacterial genome reengineering.

    Science.gov (United States)

    Zhou, Jindan; Rudd, Kenneth E

    2011-01-01

    The web application PrimerPair at ecogene.org generates large sets of paired DNA sequences surrounding all protein and RNA genes of Escherichia coli K-12. Many DNA fragments, which these primers amplify, can be used to implement a genome reengineering strategy using complementary in vitro cloning and in vivo recombineering. The integration of a primer design tool with a model organism database increases the level of quality control. Computer-assisted design of gene primer pairs relies upon having highly accurate genomic DNA sequence information that exactly matches the DNA of the cells being used in the laboratory to ensure predictable DNA hybridizations. It is equally crucial to have confidence that the predicted start codons define the locations of genes accurately. Annotations in the EcoGene database are queried by PrimerPair to eliminate pseudogenes, IS elements, and other problematic genes before the design process starts. These projects progressively familiarize users with the EcoGene content, scope, and application interfaces that are useful for genome reengineering projects. The first protocol leads to the design of a pair of primer sequences that were used to clone and express a single gene. The N-terminal protein sequence was experimentally verified and the protein was detected in the periplasm. This is followed by instructions to design PCR primer pairs for cloning gene fragments encoding 50 periplasmic proteins without their signal peptides. The design process begins with the user simply designating one pair of forward and reverse primer endpoint positions relative to all start and stop codon positions. The gene name, genomic coordinates, and primer DNA sequences are reported to the user. When making chromosomal deletions, the integrity of the provisional primer design is checked to see whether it will generate any unwanted double deletions with adjacent genes. The bad designs are recalculated and replacement primers are provided alongside the

  15. Marine Bacterial Genomics

    DEFF Research Database (Denmark)

    Machado, Henrique

    For decades, terrestrial microorganisms have been used as sources of countless enzymes and chemical compounds that have been produced by pharmaceutical and biotech companies and used by mankind. There is a need for new chemical compounds, including antibiotics, new enzymatic activities and new...... microorganisms to be used as cell factories for production. Therefore exploitation of new microbial niches and use of different strategies is an opportunity to boost discoveries. Even though scientists have started to explore several habitats other than the terrestrial ones, the marine environment stands out...... as a hitherto under-explored niche. This thesis work uses high-throughput sequencing technologies on a collection of marine bacteria established during the Galathea 3 expedition, with the purpose of unraveling new biodiversity and new bioactivities. Several tools were used for genomic analysis in order...

  16. Genome-Scale Models

    DEFF Research Database (Denmark)

    Bergdahl, Basti; Sonnenschein, Nikolaus; Machado, Daniel;

    2016-01-01

    An introduction to genome-scale models, how to build and use them, will be given in this chapter. Genome-scale models have become an important part of systems biology and metabolic engineering, and are increasingly used in research, both in academia and in industry, both for modeling chemical...

  17. Distribution of Triplet Separators in Bacterial Genomes

    Institute of Scientific and Technical Information of China (English)

    HU Rui; ZHENG Wei-Mou

    2001-01-01

    Distributions of triplet separator lengths for two bacterial complete genomes are analyzed. The theoretical distributions for the independent random sequence and the first-order Markov chain are derived and compared with the distributions of the bacterial genomes. A prominent double band structure, which does not exist in the theoretical distributions, is observed in the bacterial distributions for most triplets.
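    The empirical side of such an analysis can be sketched in a few lines. The abstract does not define a separator, so this sketch assumes it is the distance between the start positions of consecutive occurrences of one triplet, with overlapping occurrences allowed; the function name and the short test sequence are hypothetical, and the theoretical distributions derived in the paper are not reproduced here.

```python
from collections import Counter


def separator_length_counts(sequence, triplet):
    """Count distances between consecutive occurrences of a triplet.

    Assumption: a separator length is the difference between the start
    positions of two successive occurrences of the triplet.
    """
    if len(triplet) != 3:
        raise ValueError("triplet must have exactly three letters")
    positions = [
        i for i in range(len(sequence) - 2) if sequence[i:i + 3] == triplet
    ]
    # Pair each occurrence with the next one and tally the gaps.
    return Counter(b - a for a, b in zip(positions, positions[1:]))


# Short made-up sequence; a real analysis would read a complete genome.
counts = separator_length_counts("ATGCCATGATGTTTATG", "ATG")
for length in sorted(counts):
    print(length, counts[length])
```

    Plotting these counts against length for a whole genome, next to the same counts from a shuffled sequence, is one way to look for structure that a random model does not predict.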

  18. Insights from twenty years of bacterial genome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Jun, Se Ran [ORNL; Nookaew, Intawat [ORNL; Leuze, Michael Rex [ORNL; Ahn, Tae-Hyuk [ORNL; Karpinets, Tatiana V [ORNL; Lund, Ole [Technical University of Denmark; Kora, Guruprasad H [ORNL; Wassenaar, Trudy [Molecular Microbiology & Genomics Consultants, Zotzenheim, Germany; Poudel, Suresh [ORNL; Ussery, David W [ORNL

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90 % of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Why do we care about bacterial genome
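    The pan- and core-genome calculation mentioned above reduces to set operations once every gene has been assigned to a gene family. The sketch below assumes that assignment has already been done; the function name, genome labels and family identifiers are hypothetical, and the clustering of genes into families (the hard part in practice) is not shown.

```python
def pan_and_core_genome(genomes):
    """Return (pan, core) gene-family sets for a collection of genomes.

    genomes: dict genome id -> iterable of gene-family identifiers.
    The pan genome is the union of all families; the core genome is the
    set of families present in every genome.
    """
    family_sets = [set(families) for families in genomes.values()]
    if not family_sets:
        raise ValueError("at least one genome is required")
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core


# Hypothetical gene-family content of three isolates.
isolates = {
    "isolate1": ["famA", "famB", "famC"],
    "isolate2": ["famA", "famB", "famD"],
    "isolate3": ["famA", "famB", "famE", "famF"],
}
pan, core = pan_and_core_genome(isolates)
print(f"pan genome: {len(pan)} families, core genome: {len(core)} families")
```

    As more genomes are added the core can only shrink and the pan can only grow, which is consistent with the gap the abstract reports between about 3100 core families and about 89,000 families in total.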

  19. Bacterial Communities: Interactions to Scale

    Science.gov (United States)

    Stubbendieck, Reed M.; Vargas-Bautista, Carol; Straight, Paul D.

    2016-01-01

    In the environment, bacteria live in complex multispecies communities. These communities span in scale from small, multicellular aggregates to billions or trillions of cells within the gastrointestinal tract of animals. The dynamics of bacterial communities are determined by pairwise interactions that occur between different species in the community. Though interactions occur between a few cells at a time, the outcomes of these interchanges have ramifications that ripple through many orders of magnitude, and ultimately affect the macroscopic world including the health of host organisms. In this review we cover how bacterial competition influences the structures of bacterial communities. We also emphasize methods and insights garnered from culture-dependent pairwise interaction studies, metagenomic analyses, and modeling experiments. Finally, we argue that the integration of multiple approaches will be instrumental to future understanding of the underlying dynamics of bacterial communities. PMID:27551280

  20. Bacterial Communities: Interactions to Scale

    Directory of Open Access Journals (Sweden)

    Reed M. Stubbendieck

    2016-08-01

    Full Text Available In the environment, bacteria live in complex multispecies communities. These communities span in scale from small, multicellular aggregates to billions or trillions of cells within the gastrointestinal tract of animals. The dynamics of bacterial communities are determined by pairwise interactions that occur between different species in the community. Though interactions occur between a few cells at a time, the outcomes of these interchanges have ramifications that ripple through many orders of magnitude, and ultimately affect the macroscopic world including the health of host organisms. In this review we cover how bacterial competition influences the structures of bacterial communities. We also emphasize methods and insights garnered from culture-dependent pairwise interaction studies, metagenomic analyses, and modeling experiments. Finally, we argue that the integration of multiple approaches will be instrumental to future understanding of the underlying dynamics of bacterial communities.

  1. Value of a newly sequenced bacterial genome.

    Science.gov (United States)

    Barbosa, Eudes Gv; Aburjaile, Flavia F; Ramos, Rommel Tj; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2014-05-26

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

  2. Value of a newly sequenced bacterial genome

    Institute of Scientific and Technical Information of China (English)

    Barbosa, Eudes GV; Aburjaile, Flavia F; Ramos, Rommel TJ; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information.

  3. Xylella Genomics and Bacterial Pathogenicity to Plants

    OpenAIRE

    Dow, J. M.; Daniels, M J

    2000-01-01

    Xylella fastidiosa, a pathogen of citrus, is the first plant pathogenic bacterium for which the complete genome sequence has been published. Inspection of the sequence reveals high relatedness to many genes of other pathogens, notably Xanthomonas campestris. Based on this, we suggest that Xylella possesses certain easily testable properties that contribute to pathogenicity. We also present some general considerations for deriving information on pathogenicity from bacterial genomics.

  4. Dynamics of genome rearrangement in bacterial populations.

    Directory of Open Access Journals (Sweden)

    Aaron E Darling

    represent the first characterization of genome arrangement evolution in a bacterial population evolving outside laboratory conditions. Insight into the process of genomic rearrangement may further the understanding of pathogen population dynamics and selection on the architecture of circular bacterial chromosomes.

  5. One bacterial cell, one complete genome.

    Directory of Open Access Journals (Sweden)

    Tanja Woyke

    Full Text Available While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  6. One Bacterial Cell, One Complete Genome

    Energy Technology Data Exchange (ETDEWEB)

    Woyke, Tanja; Tighe, Damon; Mavrommatis, Konstantinos; Clum, Alicia; Copeland, Alex; Schackwitz, Wendy; Lapidus, Alla; Wu, Dongying; McCutcheon, John P.; McDonald, Bradon R.; Moran, Nancy A.; Bristow, James; Cheng, Jan-Fang

    2010-04-26

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  7. Insights from genomics into bacterial pathogen populations.

    Directory of Open Access Journals (Sweden)

    Daniel J Wilson

    2012-09-01

    Full Text Available Bacterial pathogens impose a heavy burden of disease on human populations worldwide. The gravest threats are posed by highly virulent respiratory pathogens, enteric pathogens, and HIV-associated infections. Tuberculosis alone is responsible for the deaths of 1.5 million people annually. Treatment options for bacterial pathogens are being steadily eroded by the evolution and spread of drug resistance. However, population-level whole genome sequencing offers new hope in the fight against pathogenic bacteria. By providing insights into bacterial evolution and disease etiology, these approaches pave the way for novel interventions and therapeutic targets. Sequencing populations of bacteria across the whole genome provides unprecedented resolution to investigate (i) within-host evolution, (ii) transmission history, and (iii) population structure. Moreover, advances in rapid benchtop sequencing herald a new era of real-time genomics in which sequencing and analysis can be deployed within hours in response to rapidly changing public health emergencies. The purpose of this review is to highlight the transformative effect of population genomics on bacteriology, and to consider the prospects for answering abiding questions such as why bacteria cause disease.

  8. Transforming clinical microbiology with bacterial genome sequencing

    Science.gov (United States)

    2016-01-01

    Whole genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here we review the current status of clinical microbiology and how it has already begun to be transformed by the use of next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. The application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow. PMID:22868263

  9. Transforming clinical microbiology with bacterial genome sequencing.

    Science.gov (United States)

    Didelot, Xavier; Bowden, Rory; Wilson, Daniel J; Peto, Tim E A; Crook, Derrick W

    2012-09-01

    Whole-genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here, we review the current status of clinical microbiology and how it has already begun to be transformed by using next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties, such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. We predict that the application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow.

  10. From bacterial genome to functionality; case bifidobacteria.

    Science.gov (United States)

    Ventura, Marco; O'Connell-Motherway, Mary; Leahy, Sinead; Moreno-Munoz, Jose Antonio; Fitzgerald, Gerald F; van Sinderen, Douwe

    2007-11-30

    The availability of complete bacterial genome sequences has significantly furthered our understanding of the genetics, physiology and biochemistry of the microorganisms in question, particularly those that have commercially important applications. Bifidobacteria are among such microorganisms, as they constitute mammalian commensals of biotechnological significance due to their perceived role in maintaining a balanced gastrointestinal (GIT) microflora. Bifidobacteria are therefore frequently used as health-promoting or probiotic components in functional food products. A fundamental understanding of the metabolic activities employed by these commensal bacteria, in particular their capability to utilize a wide range of complex oligosaccharides, can reveal ways to provide in vivo growth advantages relative to other competing gut bacteria or pathogens. Furthermore, an in depth analysis of adaptive responses to nutritional or environmental stresses may provide methodologies to retain viability and improve functionality during commercial preparation, storage and delivery of the probiotic organism. PMID:17629975

  11. Genome Update: alignment of bacterial chromosomes

    DEFF Research Database (Denmark)

    Ussery, David; Jensen, Mette; Poulsen, Tine Rugh;

    2004-01-01

    There are four new microbial genomes listed in this month's Genome Update, three belonging to Gram-positive bacteria and one belonging to an archaeon that lives at pH 0; all of these genomes are listed in Table 1. The method of genome comparison this month is that of genome alignment and......, as an example, an alignment of seven Staphylococcus aureus genomes and one Staphylococcus epidermidis genome is presented....

  12. Genome evolution and systems biology in bacterial endosymbionts of insects

    OpenAIRE

    Belda Cuesta, Eugeni

    2010-01-01

    Gene loss is the most important event in the process of genome reduction that appears associated with bacterial endosymbionts of insects. These small genomes were derived features evolved from ancestral prokaryotes with larger genome sizes, consequence of a massive process of genome reduction due to drastic changes in the ecological conditions and evolutionary pressures acting on these prokaryotic lineages during their ecological transition to host-dependent lifestyle. In the present thesis, ...

  14. Bacteriophage functional genomics and its role in bacterial pathogen detection.

    Science.gov (United States)

    Klumpp, Jochen; Fouts, Derrick E; Sozhamannan, Shanmuga

    2013-07-01

    Emerging and reemerging bacterial infectious diseases are a major public health concern worldwide. The role of bacteriophages in the emergence of novel bacterial pathogens by horizontal gene transfer was highlighted by the May 2011 Escherichia coli O104:H4 outbreaks that originated in Germany and spread to other European countries. This outbreak also highlighted the pivotal role played by recent advances in functional genomics in rapidly deciphering the virulence mechanism elicited by this novel pathogen and developing rapid diagnostics and therapeutics. However, despite a steady increase in the number of phage sequences in the public databases, boosted by the next-generation sequencing technologies, few functional genomics studies of bacteriophages have been conducted. Our definition of 'functional genomics' encompasses a range of aspects: phage genome sequencing, annotation and ascribing functions to phage genes, prophage identification in bacterial sequences, elucidating the events in various stages of phage life cycle using genomic, transcriptomic and proteomic approaches, defining the mechanisms of host takeover including specific bacterial-phage protein interactions and identifying virulence and other adaptive features encoded by phages and finally, using prophage genomic information for bacterial detection/diagnostics. Given the breadth and depth of this definition and the fact that some of these aspects (especially phage-encoded virulence/adaptive features) have been treated extensively in other reviews, we restrict our focus only on certain aspects. These include phage genome sequencing and annotation, identification of prophages in bacterial sequences and genetic characterization of phages, functional genomics of the infection process and finally, bacterial identification using genomic information. PMID:23520178

  15. Harnessing CRISPR-Cas systems for bacterial genome editing.

    Science.gov (United States)

    Selle, Kurt; Barrangou, Rodolphe

    2015-04-01

    Manipulation of genomic sequences facilitates the identification and characterization of key genetic determinants in the investigation of biological processes. Genome editing via clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated (Cas) constitutes a next-generation method for programmable and high-throughput functional genomics. CRISPR-Cas systems are readily reprogrammed to induce sequence-specific DNA breaks at target loci, resulting in fixed mutations via host-dependent DNA repair mechanisms. Although bacterial genome editing is a relatively unexplored and underrepresented application of CRISPR-Cas systems, recent studies provide valuable insights for the widespread future implementation of this technology. This review summarizes recent progress in bacterial genome editing and identifies fundamental genetic and phenotypic outcomes of CRISPR targeting in bacteria, in the context of tool development, genome homeostasis, and DNA repair.

  16. Value of a newly sequenced bacterial genome

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Aburjaile, Flavia F; Ramos, Rommel Tj;

    2014-01-01

    heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting...

  17. Identifying characteristic scales in the human genome

    Science.gov (United States)

    Carpena, P.; Bernaola-Galván, P.; Coronado, A. V.; Hackenberg, M.; Oliver, J. L.

    2007-03-01

    The scale-free, long-range correlations detected in DNA sequences contrast with characteristic lengths of genomic elements, being particularly incompatible with the isochores (long, homogeneous DNA segments). By computing the local behavior of the scaling exponent α of detrended fluctuation analysis (DFA), we discriminate between sequences with and without true scaling, and we find that no single scaling exists in the human genome. Instead, human chromosomes show a common compositional structure with two characteristic scales, the large one corresponding to the isochores and the other to small and medium scale genomic elements.

  18. Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE).

    Science.gov (United States)

    Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce

    2015-01-01

    Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes relatively easily and quickly, at a substantially reduced cost per nucleotide and hence per genome. More than 3,000 bacterial genomes have been sequenced and are available at the finished status. Publicly available genomes can be readily downloaded; however, it is challenging to verify the specific supporting data contained within the download and to identify errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors by comparing the downloaded supporting data to the genome reports to verify genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and genome reports and if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation for large-scale genome data prior to downstream analyses.
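
    The flagging step described in this abstract can be illustrated with a minimal sketch. It is not AutoCurE itself (which runs in Excel); the function name, the field names (name, refseq_accession, bioproject) and the flag labels are hypothetical, chosen only to show the idea of comparing downloaded metadata against a genome report.

```python
def flag_inconsistencies(downloaded, report,
                         fields=("name", "refseq_accession", "bioproject")):
    """Compare downloaded metadata with a genome report, record by record.

    Both arguments map a genome identifier to a dict of metadata fields.
    Returns a dict mapping each problematic identifier to a list of flags.
    Field names and flag labels are illustrative assumptions.
    """
    flags = {}
    for genome_id, record in downloaded.items():
        reference = report.get(genome_id)
        if reference is None:
            # The genome was downloaded but is absent from the report.
            flags[genome_id] = ["missing_from_report"]
            continue
        problems = []
        for field in fields:
            value = record.get(field)
            if value in (None, ""):
                problems.append(f"{field}:missing")
            elif value != reference.get(field):
                problems.append(f"{field}:mismatch")
        if problems:
            flags[genome_id] = problems
    return flags


# Toy records (invented values, for illustration only).
downloaded = {
    "g1": {"name": "Strain A", "refseq_accession": "ACC_0001", "bioproject": "P1"},
    "g2": {"name": "Strain B", "refseq_accession": "", "bioproject": "P2"},
    "g3": {"name": "Strain C", "refseq_accession": "ACC_0003", "bioproject": "P3"},
}
report = {
    "g1": {"name": "Strain A", "refseq_accession": "ACC_0001", "bioproject": "P1"},
    "g2": {"name": "Strain B2", "refseq_accession": "ACC_0002", "bioproject": "P2"},
}
flags = flag_inconsistencies(downloaded, report)
```

    In this toy run only consistent records pass silently; a record with a missing field, a mismatched field, or no counterpart in the report is flagged for manual review before downstream analysis.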

  19. Two-dimensional DNA displays for comparisons of bacterial genomes

    Directory of Open Access Journals (Sweden)

    Malloff Chad

    2003-01-01

    Full Text Available We have developed two whole genome-scanning techniques to aid in the discovery of polymorphisms as well as horizontally acquired genes in prokaryotic organisms. First, two-dimensional bacterial genomic display (2DBGD) was developed using restriction enzyme fragmentation to separate genomic DNA based on size, and then employing denaturing gradient gel electrophoresis (DGGE) in the second dimension to exploit differences in sequence composition. This technique was used to generate high-resolution displays that enable the direct comparison of > 800 genomic fragments simultaneously and can be adapted for the high-throughput comparison of bacterial genomes. 2DBGDs are capable of detecting acquired and altered DNA, however, only in very closely related strains. If used to compare more distantly related strains (e.g. different species within a genus), numerous small changes (i.e. small deletions and point mutations) unrelated to the interesting phenotype would encumber the comparison of 2DBGDs. For this reason a second method, bacterial comparative genomic hybridization (BCGH), was developed to directly compare bacterial genomes to identify gain or loss of genomic DNA. BCGH relies on performing 2DBGD on a pooled sample of genomic DNA from 2 strains to be compared and subsequently hybridizing the resulting 2DBGD blot separately with DNA from each individual strain. Unique spots (hybridization signals) represent foreign DNA. The identification of novel DNA is easily achieved by excising the DNA from a dried gel followed by subsequent cloning and sequencing. 2DBGD and BCGH thus represent novel high resolution genome scanning techniques for directly identifying altered and/or acquired DNA.

  20. Genomic islands are dynamic, ancient integrative elements in bacterial evolution.

    Science.gov (United States)

    Boyd, E Fidelma; Almagro-Moreno, Salvador; Parent, Michelle A

    2009-02-01

    Acquisition of genomic islands plays a central part in bacterial evolution as a mechanism of diversification and adaptation. Genomic islands are non-self-mobilizing integrative and excisive elements that encode diverse functional characteristics but all contain a recombination module comprised of an integrase, associated attachment sites and, in some cases, a recombination directionality factor. Here, we discuss how a group of related genomic islands are evolutionarily ancient elements unrelated to plasmids, phages, integrons and integrative conjugative elements. In addition, we explore the diversity of genomic islands and their insertion sites among Gram-negative bacteria and discuss why they integrate at a limited number of tRNA genes. PMID:19162481

  1. Bacterial Cellular Engineering by Genome Editing and Gene Silencing

    Directory of Open Access Journals (Sweden)

    Nobutaka Nakashima

    2014-02-01

    Full Text Available Genome editing is an important technology for bacterial cellular engineering, which is commonly conducted by homologous recombination-based procedures, including gene knockout (disruption), knock-in (insertion), and allelic exchange. In addition, some new recombination-independent approaches have emerged that utilize catalytic RNAs, artificial nucleases, nucleic acid analogs, and peptide nucleic acids. Apart from these methods, which directly modify the genomic structure, an alternative approach is to conditionally modify the gene expression profile at the posttranscriptional level without altering the genomes. This is performed by expressing antisense RNAs to knock down (silence) target mRNAs in vivo. This review describes the features of and recent advances in methods used in genomic engineering and silencing technologies that are advantageously used for bacterial cellular engineering.

  2. Differentiation of regions with atypical oligonucleotide composition in bacterial genomes

    Directory of Open Access Journals (Sweden)

    Reva Oleg N

    2005-10-01

    Full Text Available Background: Complete sequencing of bacterial genomes has become a common technique of present day microbiology. Thereafter, data mining in the complete sequence is an essential step. New in silico methods are needed that rapidly identify the major features of genome organization and facilitate the prediction of the functional class of ORFs. We tested the usefulness of local oligonucleotide usage (OU) patterns to recognize and differentiate types of atypical oligonucleotide composition in DNA sequences of bacterial genomes. Results: A total of 163 bacterial genomes of eubacteria and archaea published in the NCBI database were analyzed. Local OU patterns exhibit substantial intrachromosomal variation in bacteria. Loci with alternative OU patterns were parts of horizontally acquired gene islands or ancient regions such as genes for ribosomal proteins and RNAs. OU statistical parameters, such as local pattern deviation (D), pattern skew (PS) and OU variance (OUV), enabled the detection and visualization of gene islands of different functional classes. Conclusion: A set of approaches has been designed for the statistical analysis of nucleotide sequences of bacterial genomes. These methods are useful for the visualization and differentiation of regions with atypical oligonucleotide composition prior to or accompanying gene annotation.
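
    The idea of scanning a genome for windows whose oligonucleotide usage departs from the genome-wide pattern can be sketched as follows. This is a simplified illustration, not the D, PS or OUV statistics of the paper: the deviation used here is an assumed stand-in (the sum of absolute differences between local and global k-mer frequencies), and the function names, window sizes and toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_freqs(seq, k=4):
    """Relative frequency of every k-mer over ACGT in seq (overlapping)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing other symbols (e.g. N) are ignored in the total.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def local_deviation(genome, window=5000, step=1000, k=4):
    """Return (start, deviation) pairs for sliding windows along genome.

    The deviation is the sum of |local - global| k-mer frequencies, a
    simplified stand-in for a local pattern deviation statistic.
    """
    global_f = kmer_freqs(genome, k)
    result = []
    for start in range(0, len(genome) - window + 1, step):
        local_f = kmer_freqs(genome[start:start + window], k)
        dev = sum(abs(local_f[m] - global_f[m]) for m in global_f)
        result.append((start, dev))
    return result


# Toy sequence: a uniform repeat with an atypical A-rich insert in the middle.
toy = "ACGT" * 500 + "A" * 400 + "ACGT" * 500
profile = local_deviation(toy, window=200, step=200, k=2)
peak_start = max(profile, key=lambda pair: pair[1])[0]
```

    On the toy sequence the highest deviation falls on a window inside the A-rich insert, which is the behaviour one would expect from a compositional scan for atypical regions such as horizontally acquired islands.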

  3. [Bacterial genomics and metagenomics: clinical applications and medical relevance].

    Science.gov (United States)

    Diene, S M; Bertelli, C; Pillonel, T; Schrenzel, J; Greub, G

    2014-11-12

    New sequencing technologies provide in a short time and at low cost high amount of genomic sequences useful for applications such as: a) development of diagnostic PCRs and/or serological tests; b) detection of virulence factors (virulome) or genes/SNPs associated with resistance to antibiotics (resistome) and c) investigation of transmission and dissemination of bacterial pathogens. Thus, bacterial genomics of medical importance is useful to clinical microbiologists, to infectious diseases specialists as well as to epidemiologists. Determining the microbial composition of a sample by metagenomics is another application of new sequencing technologies, useful to understand the impact of bacteria on various non-infectious diseases such as obesity, asthma, or diabetes. Genomics and metagenomics will likely become a specialized diagnostic analysis.

  4. Genes but not genomes reveal bacterial domestication of Lactococcus lactis.

    Directory of Open Access Journals (Sweden)

    Delphine Passerini

    Full Text Available BACKGROUND: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST) scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE). METHODOLOGY/PRINCIPAL FINDINGS: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content) did not correlate with the MLST-based phylogeny, with strains from the same sequence type (ST) differing by up to 230 kb in genome size. CONCLUSION/SIGNIFICANCE: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between "environmental" strains, the main contributors to the genetic diversity within the subspecies, and "domesticated" strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the "domesticated" strains essentially arose through substantial genomic flux within the dispensable

  5. Genome scale engineering techniques for metabolic engineering.

    Science.gov (United States)

    Liu, Rongming; Bassalo, Marcelo C; Zeitoun, Ramsey I; Gill, Ryan T

    2015-11-01

    Metabolic engineering has expanded from a focus on designs requiring a small number of genetic modifications to increasingly complex designs driven by advances in genome-scale engineering technologies. Metabolic engineering has been generally defined by the use of iterative cycles of rational genome modifications, strain analysis and characterization, and a synthesis step that fuels additional hypothesis generation. This cycle mirrors the Design-Build-Test-Learn cycle followed throughout various engineering fields that has recently become a defining aspect of synthetic biology. This review will attempt to summarize recent genome-scale design, build, test, and learn technologies and relate their use to a range of metabolic engineering applications.

  6. Bacterial genomic adaptation and response to metals

    International Nuclear Information System (INIS)

    The beta-proteobacterium Cupriavidus metallidurans CH34 (formerly Ralstonia metallidurans) has been intensively studied since 1976 in SCK-CEN and VITO, for its adaptation capacity to survive in harsh (mostly industrial) environments, to overcome acute environmental stresses, for its resistance to a variety of heavy metals and for applications in environmental biotechnology. Recently, CH34 has become a model bacterium to study the effect of spaceflight conditions in several space flight experiments conducted by SCK-CEN (e.g. MESSAGE, BASE). Furthermore, Cupriavidus and Ralstonia species are isolated from the floor, air and surfaces of spacecraft assembly rooms; were found prior-to-flight on surfaces of space robots such as the Mars Odyssey Orbiter and even in-flight in ISS cooling water and Shuttle drinking water, vindicating its role as model bacterium in space research. In addition, Ralstonia species are also the causative agent of nosocomial infections and are among the unusual species recovered from cystic fibrosis (CF) patients. The genomic organization of Cupriavidus metallidurans CH34 was studied in-depth to identify the genetic and regulatory structures involved in the resistance to heavy metals.

  7. Within-Host Bacterial Diversity Hinders Accurate Reconstruction of Transmission Networks from Genomic Distance Data

    OpenAIRE

    Worby, Colin J.; Marc Lipsitch; William P Hanage

    2014-01-01

    The prospect of using whole genome sequence data to investigate bacterial disease outbreaks has been keenly anticipated in many quarters, and the large-scale collection and sequencing of isolates from cases is becoming increasingly feasible. While sequence data can provide many important insights into disease spread and pathogen adaptation, it remains unclear how successfully they may be used to estimate individual routes of transmission. Several studies have attempted to reconstruct transmis...

  8. Within-host bacterial diversity hinders accurate reconstruction of transmission networks from genomic distance data.

    OpenAIRE

    Worby, Colin J.; Marc Lipsitch; Hanage, William P

    2014-01-01

    The prospect of using whole genome sequence data to investigate bacterial disease outbreaks has been keenly anticipated in many quarters, and the large-scale collection and sequencing of isolates from cases is becoming increasingly feasible. While sequence data can provide many important insights into disease spread and pathogen adaptation, it remains unclear how successfully they may be used to estimate individual routes of transmission. Several studies have attempted to reconstruct transmis...

  9. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal Matoq Saeed

    2015-08-18

    Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/

  10. Genome Assembly and Computational Analysis Pipelines for Bacterial Pathogens

    KAUST Repository

    Rangkuti, Farania Gama Ardhina

    2011-06-01

    Pathogens lie behind the deadliest pandemics in history. To date, the AIDS pandemic has resulted in more than 25 million fatal cases, while tuberculosis and malaria annually claim more than 2 million lives. Comparative genomic analyses are needed to gain insights into the molecular mechanisms of pathogens, but the abundance of biological data dictates that such studies cannot be performed without the assistance of computational approaches. This explains the significant need for computational pipelines for genome assembly and analyses. The aim of this research is to develop such pipelines. This work utilizes various bioinformatics approaches to analyze the high-throughput genomic sequence data that has been obtained from several strains of bacterial pathogens. A pipeline has been compiled for quality control for sequencing and assembly, and several protocols have been developed to detect contamination. Visualizations of genomic data have been generated in various formats, in addition to alignment, homology detection and sequence variant detection. We have also implemented a metaheuristic algorithm that significantly improves bacterial genome assemblies compared to other known methods. Experiments on Mycobacterium tuberculosis H37Rv data showed that our method resulted in improvement of N50 value of up to 9697% while consistently maintaining high accuracy, covering around 98% of the published reference genome. Other improvement efforts were also implemented, consisting of iterative local assemblies and iterative correction of contiguated bases. Our result expedites the genomic analysis of virulent genes up to single base pair resolution. It is also applicable to virtually every pathogenic microorganism, propelling further research in the control of and protection from pathogen-associated diseases.
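
    The N50 value cited in this abstract is a standard contiguity metric: the contig length L such that contigs of length L or longer account for at least half of the total assembled bases. The sketch below only illustrates that general definition; the function name `n50` and the example lengths are hypothetical and are not taken from the record's pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given an iterable of contig lengths.

    N50 is the length L of the contig at which the running total of
    lengths, taken from longest to shortest, first reaches half of the
    total assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    # Only reached when no contigs were supplied.
    raise ValueError("at least one contig length is required")


# Hypothetical contig lengths (in bases), for illustration only.
example = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example))
```

    Because N50 depends only on the length distribution, a large N50 gain says nothing by itself about correctness, which is presumably why the abstract reports reference coverage alongside it.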

  11. Elucidation of operon structures across closely related bacterial genomes.

    Science.gov (United States)

    Zhou, Chuan; Ma, Qin; Li, Guojun

    2014-01-01

    About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topology, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components.
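
    The graph construction described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: operons are assumed to be given as ordered lists of gene identifiers, `gene_to_ogg` is a hypothetical lookup from gene to orthologous gene group (OGG), and the function names are invented for the example.

```python
from collections import defaultdict


def build_ogg_graph(operons, gene_to_ogg):
    """Build an undirected graph whose nodes are OGG identifiers.

    operons: iterable of operons, each an ordered list of gene IDs
             (operons from all considered genomes can be pooled).
    gene_to_ogg: dict mapping each gene ID to its OGG ID (assumed input).
    Two OGGs are linked when genes from them are immediate neighbours
    in the same operon in at least one genome.
    """
    graph = defaultdict(set)
    for operon in operons:
        for gene in operon:
            graph[gene_to_ogg[gene]]  # register the node even if isolated
        for left, right in zip(operon, operon[1:]):
            a, b = gene_to_ogg[left], gene_to_ogg[right]
            if a != b:
                graph[a].add(b)
                graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Return the connected components of the graph as a list of sets."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Hypothetical toy data: two genomes sharing OGG "B".
operons = [["g1a", "g1b", "g1c"], ["g2b", "g2d"], ["g1x"]]
gene_to_ogg = {"g1a": "A", "g1b": "B", "g1c": "C",
               "g2b": "B", "g2d": "D", "g1x": "X"}
graph = build_ogg_graph(operons, gene_to_ogg)
print(connected_components(graph))
```

    In the toy data, OGG "B" ends up adjacent to three other groups, which is the kind of star-shaped component the abstract associates with a central functional gene; a path-shaped component would instead show no node with more than two neighbours.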

  12. Elucidation of operon structures across closely related bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Chuan Zhou

    About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topology, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components.

  13. Reconstruction of a Bacterial Genome from DNA Cassettes

    Energy Technology Data Exchange (ETDEWEB)

    Christopher Dupont; John Glass; Laura Sheahan; Shibu Yooseph; Lisa Zeigler Allen; Mathangi Thiagarajan; Andrew Allen; Robert Friedman; J. Craig Venter

    2011-12-31

    This basic research program comprised two major areas: (1) acquisition and analysis of marine microbial metagenomic data and development of genomic analysis tools for broad, external community use; (2) development of a minimal bacterial genome. Our Marine Metagenomic Diversity effort generated and analyzed shotgun sequencing data from microbial communities sampled from over 250 sites around the world. About 40% of the 26 Gbp of sequence data has been made publicly available to date with a complete release anticipated in six months. Our results and those mining the deposited data have revealed a vast diversity of genes coding for critical metabolic processes whose phylogenetic and geographic distributions will enable a deeper understanding of carbon and nutrient cycling, microbial ecology, and rapid rate evolutionary processes such as horizontal gene transfer by viruses and plasmids. A global assembly of the generated dataset resulted in a massive set (5Gbp) of genome fragments that provide context to the majority of the generated data that originated from uncultivated organisms. Our Synthetic Biology team has made significant progress towards the goal of synthesizing a minimal mycoplasma genome that will have all of the machinery for independent life. This project, once completed, will provide fundamentally new knowledge about requirements for microbial life and help to lay a basic research foundation for developing microbiological approaches to bioenergy.

  14. Impact of genome reduction on bacterial metabolism and its regulation.

    Science.gov (United States)

    Yus, Eva; Maier, Tobias; Michalodimitrakis, Konstantinos; van Noort, Vera; Yamada, Takuji; Chen, Wei-Hua; Wodke, Judith A H; Güell, Marc; Martínez, Sira; Bourgeois, Ronan; Kühner, Sebastian; Raineri, Emanuele; Letunic, Ivica; Kalinina, Olga V; Rode, Michaela; Herrmann, Richard; Gutiérrez-Gallego, Ricardo; Russell, Robert B; Gavin, Anne-Claude; Bork, Peer; Serrano, Luis

    2009-11-27

    To understand basic principles of bacterial metabolism organization and regulation, but also the impact of genome size, we systematically studied one of the smallest bacteria, Mycoplasma pneumoniae. A manually curated metabolic network of 189 reactions catalyzed by 129 enzymes allowed the design of a defined, minimal medium with 19 essential nutrients. More than 1300 growth curves were recorded in the presence of various nutrient concentrations. Measurements of biomass indicators, metabolites, and 13C-glucose experiments provided information on directionality, fluxes, and energetics; integration with transcription profiling enabled the global analysis of metabolic regulation. Compared with more complex bacteria, the M. pneumoniae metabolic network has a more linear topology and contains a higher fraction of multifunctional enzymes; general features such as metabolite concentrations, cellular energetics, adaptability, and global gene expression responses are similar, however.

  15. Genome-scale metabolic network reconstruction.

    Science.gov (United States)

    Fondi, Marco; Liò, Pietro

    2015-01-01

    Bacterial metabolism is an important source of novel products/processes for everyday life and strong efforts are being undertaken to discover and exploit new usable substances of microbial origin. Computational modeling and in silico simulations are powerful tools in this context since they allow the exploration and a deeper understanding of bacterial metabolic circuits. Many approaches exist to quantitatively simulate chemical reaction fluxes within the whole microbial metabolism and, regardless of the technique of choice, metabolic model reconstruction is the first step in every modeling pipeline. Reconstructing a metabolic network consists in drafting the list of the biochemical reactions that an organism can carry out together with information on cellular boundaries, a biomass assembly reaction, and exchange fluxes with the external environment. Building up models able to represent the different functional cellular states is universally recognized as a tricky task that requires intensive manual effort and much additional information besides genome sequence. In this chapter we present a general protocol for metabolic reconstruction in bacteria and the main challenges encountered during this process. PMID:25343869

  16. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    DEFF Research Database (Denmark)

    Thomsen, Martin Christen Frølund; Ahrenfeldt, Johanne; Bellod Cisneros, Jose Luis;

    2016-01-01

    and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the...... web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes...... platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely...

  17. CISA: contig integrator for sequence assembly of bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Shin-Hung Lin

    A plethora of algorithmic assemblers have been proposed for the de novo assembly of genomes; however, no individual assembler guarantees the optimal assembly for diverse species. Optimizing various parameters in an assembler is often performed in order to generate the best possible assembly. However, few efforts have been pursued to take advantage of multiple assemblies to yield an assembly of high accuracy. In this study, we employ various state-of-the-art assemblers to generate different sets of contigs for bacterial genomes. A tool, named CISA, has been developed to integrate the assemblies into a hybrid set of contigs, resulting in assemblies of superior contiguity and accuracy, compared with the assemblies generated by the state-of-the-art assemblers and the hybrid assemblies merged by existing tools. This tool is implemented in Python and requires MUMmer and BLAST+ to be installed on the local machine. The source code of CISA and examples of its use are available at http://sb.nhri.org.tw/CISA/.

  18. Predicting statistical properties of open reading frames in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Katharina Mir

    An analytical model based on the statistical properties of Open Reading Frames (ORFs) of eubacterial genomes, such as codon composition and sequence length of all reading frames, was developed. This new model predicts the average length, maximum length as well as the length distribution of the ORFs of 70 species with GC contents varying between 21% and 74%. Furthermore, the number of annotated genes is predicted with high accordance. However, the ORF length distribution in the five alternative reading frames shows interesting deviations from the predicted distribution. In particular, long ORFs appear more often than expected statistically. The unexpected depletion of stop codons in these alternative open reading frames cannot completely be explained by a biased codon usage in the +1 frame. While it is unknown if the stop codon depletion has a biological function, it could be due to a protein coding capacity of alternative ORFs exerting a selection pressure which prevents the fixation of stop codon mutations. The comparison of the analytical model with bacterial genomes, therefore, leads to a hypothesis suggesting novel gene candidates which can now be investigated in subsequent wet lab experiments.
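
    To see why GC content matters for expected ORF length, consider a much simpler null model than the one in this record: bases drawn independently, with P(G) = P(C) = GC/2 and P(A) = P(T) = (1 - GC)/2. This is an illustrative assumption only and is not the authors' analytical model; the function names are hypothetical. Under it, the number of codons read up to and including the first stop codon (TAA, TAG or TGA) is geometrically distributed, with mean equal to one over the per-codon stop probability.

```python
def stop_codon_probability(gc):
    """Probability that a random codon is TAA, TAG or TGA.

    Assumes independent bases with P(G) = P(C) = gc / 2 and
    P(A) = P(T) = (1 - gc) / 2 (a simplifying null model).
    """
    if not 0.0 <= gc < 1.0:
        raise ValueError("gc must be in the interval [0, 1)")
    p_at = (1.0 - gc) / 2.0   # probability of A, and of T
    p_g = gc / 2.0            # probability of G
    # TAA = T*A*A; TAG = T*A*G; TGA = T*G*A
    return p_at ** 3 + 2.0 * p_at * p_at * p_g


def expected_orf_codons(gc):
    """Mean number of codons read until a stop codon occurs (stop included)."""
    return 1.0 / stop_codon_probability(gc)


# The GC range quoted in the abstract, plus the uniform case.
for gc in (0.21, 0.50, 0.74):
    print(f"GC = {gc:.2f}: about {expected_orf_codons(gc):.1f} codons")
```

    Because all three stop codons are AT-rich, this null model already predicts longer random ORFs in GC-rich genomes, which is the baseline against which an excess of long ORFs in alternative frames would have to be judged.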

  19. Sequencing of Bacterial Genomes: Principles and Insights into Pathogenesis and Development of Antibiotics

    Directory of Open Access Journals (Sweden)

    Eric S. Donkor

    2013-10-01

    The impact of bacterial diseases on public health has become enormous, partly because of the increasing trend of antibiotic resistance displayed by bacterial pathogens. Sequencing of bacterial genomes has significantly improved our understanding of the biology of many bacterial pathogens as well as the identification of novel antibiotic targets. Since the advent of genome sequencing two decades ago, about 1,800 bacterial genomes have been fully sequenced, and these include important aetiological agents such as Streptococcus pneumoniae, Mycobacterium tuberculosis, Escherichia coli O157:H7, Vibrio cholerae, Clostridium difficile and Staphylococcus aureus. Very recently, there has been an explosion of bacterial genome data due to the development of next generation sequencing technologies, which are evolving rapidly. Indeed, the field of microbial genomics is advancing at a very fast rate and it is difficult for researchers to keep abreast of new developments. This highlights the need for regular updates in microbial genomics through comprehensive reviews. This review paper seeks to provide an update on bacterial genome sequencing generally, and to analyze insights gained from sequencing in two areas: bacterial pathogenesis and the development of antibiotics.

  20. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    OpenAIRE

    Momchilo Vuyisich; Ayesha Arefin; Karen Davenport; Shihai Feng; Cheryl Gleasner; Kim McMurry; Beverly Parson-Quintana; Jennifer Price; Matthew Scholz; Patrick Chain

    2014-01-01

    Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the util...

  1. Across bacterial phyla, distantly-related genomes with similar genomic GC content have similar patterns of amino acid usage.

    Directory of Open Access Journals (Sweden)

    John Lightfield

    The GC content of bacterial genomes ranges from 16% to 75% and wide ranges of genomic GC content are observed within many bacterial phyla, including both gram negative and gram positive phyla. Thus, divergent genomic GC content has evolved repeatedly in widely separated bacterial taxa. Since genomic GC content influences codon usage, we examined codon usage patterns and predicted protein amino acid content as a function of genomic GC content within eight different phyla or classes of bacteria. We found that similar patterns of codon usage and protein amino acid content have evolved independently in all eight groups of bacteria. For example, in each group, use of amino acids encoded by GC-rich codons increased by approximately 1% for each 10% increase in genomic GC content, while the use of amino acids encoded by AT-rich codons decreased by a similar amount. This consistency within every phylum and class studied led us to conclude that GC content appears to be the primary determinant of the codon and amino acid usage patterns observed in bacterial genomes. These results also indicate that selection for translational efficiency of highly expressed genes is constrained by the genomic parameters associated with the GC content of the host genome.

  2. The SeqWord Genome Browser: an online tool for the identification and visualization of atypical regions of bacterial genomes through oligonucleotide usage

    Directory of Open Access Journals (Sweden)

    Tümmler Burkhard

    2008-08-01

    Background Data mining in large DNA sequences is a major challenge in microbial genomics and bioinformatics. Oligonucleotide usage (OU) patterns provide a wealth of information for large scale sequence analysis and visualization. The purpose of this research was to make OU statistical analysis available as a novel web-based tool for functional genomics and annotation. The tool is also available as a downloadable package. Results The SeqWord Genome Browser (SWGB) was developed to visualize the natural compositional variation of DNA sequences. The applet is also used for identification of divergent genomic regions both in annotated sequences of bacterial chromosomes, plasmids, phages and viruses, and in raw DNA sequences prior to annotation by comparing local and global OU patterns. The applet allows fast and reliable identification of clusters of horizontally transferred genomic islands, large multi-domain genes and genes for ribosomal RNA. Within the majority of genomic fragments (also termed genomic core sequence), regions enriched with housekeeping genes, ribosomal proteins and the regions rich in pseudogenes or genetic vestiges may be contrasted. Conclusion The SWGB applet presents a range of comprehensive OU statistical parameters calculated for a range of bacterial species, plasmids and phages. It is available on the Internet at http://www.bi.up.ac.za/SeqWord/mhhapplet.php.

  3. A new experimental approach for studying bacterial genomic island evolution identifies island genes with bacterial host-specific expression patterns

    Directory of Open Access Journals (Sweden)

    Nickerson Cheryl A

    2006-01-01

    Background Genomic islands are regions of bacterial genomes that have been acquired by horizontal transfer and often contain blocks of genes that function together for specific processes. Recently, it has become clear that the impact of genomic islands on the evolution of different bacterial species is significant and represents a major force in establishing bacterial genomic variation. However, the study of genomic island evolution has been mostly performed at the sequence level using computer software or hybridization analysis to compare different bacterial genomic sequences. We describe here a novel experimental approach to study the evolution of species-specific bacterial genomic islands that identifies island genes that have evolved in such a way that they are differentially-expressed depending on the bacterial host background into which they are transferred. Results We demonstrate this approach by using a "test" genomic island that we have cloned from the Salmonella typhimurium genome (island 4305) and transferred to a range of Gram negative bacterial hosts of differing evolutionary relationships to S. typhimurium. Systematic analysis of the expression of the island genes in the different hosts compared to proper controls allowed identification of genes with genera-specific expression patterns. The data from the analysis can be arranged in a matrix to give an expression "array" of the island genes in the different bacterial backgrounds. A conserved 19-bp DNA site was found upstream of at least two of the differentially-expressed island genes. To our knowledge, this is the first systematic analysis of horizontally-transferred genomic island gene expression in a broad range of Gram negative hosts. We also present evidence in this study that the IS200 element found in island 4305 in S. typhimurium strain LT2 was inserted after the island had already been acquired by the S. typhimurium lineage and that this element is likely not

  4. Bacterial communities in full-scale wastewater treatment systems

    OpenAIRE

    Cydzik-Kwiatkowska, Agnieszka; Zielińska, Magdalena

    2016-01-01

    Bacterial metabolism determines the effectiveness of biological treatment of wastewater. Therefore, it is important to define the relations between the species structure and the performance of full-scale installations. Although there is much laboratory data on microbial consortia, our understanding of dependencies between the microbial structure and operational parameters of full-scale wastewater treatment plants (WWTP) is limited. This mini-review presents the types of microbial consortia in...

  5. Metabolomic Functional Analysis of Bacterial Genomes: Final Report

    Energy Technology Data Exchange (ETDEWEB)

    Arp, Daniel J; Sayavedra-Soto, Luis A

    2008-01-01

    The availability of the complete DNA sequence of the bacterial genome of Nitrosomonas europaea offered the opportunity for unprecedented and detailed investigations of function. We studied the function of genes involved in carbohydrate and Fe metabolism. N. europaea has genes for the synthesis and degradation of glycogen and sucrose but cannot grow on substrates other than ammonia and CO2. Granules of glycogen were detected in whole cells by electron microscopy and quantified in cell-free extracts by enzymatic methods. The cellular glycogen and sucrose content varied depending on the composition of the growth medium and cellular growth stage. N. europaea also depends heavily on iron for metabolism of ammonia, is particularly interesting since it lacks genes for siderophore production, and has genes with only low similarity to known iron reductases, yet grows relatively well in medium containing low Fe. By comparing the transcriptomes of cells grown in iron-replete medium versus iron-limited medium, 247 genes were identified as differentially expressed. Mutant strains deficient in genes for sucrose, glycogen and iron metabolism were created and are being used to further our understanding of ammonia oxidizing bacteria.

  6. Scale-Invariant Correlations in Dynamic Bacterial Clusters

    Science.gov (United States)

    Chen, Xiao; Dong, Xu; Be'er, Avraham; Swinney, Harry L.; Zhang, H. P.

    2012-04-01

    In Bacillus subtilis colonies, motile bacteria move collectively, spontaneously forming dynamic clusters. These bacterial clusters share similarities with other systems exhibiting polarized collective motion, such as bird flocks or fish schools. Here we study experimentally how velocity and orientation fluctuations within clusters are spatially correlated. For a range of cell density and cluster size, the correlation length is shown to be 30% of the spatial size of clusters, and the correlation functions collapse onto a master curve after rescaling the separation with correlation length. Our results demonstrate that correlations of velocity and orientation fluctuations are scale invariant in dynamic bacterial clusters.

  7. Evolution in an oncogenic bacterial species with extreme genome plasticity: Helicobacter pylori East Asian genomes

    Directory of Open Access Journals (Sweden)

    Handa Naofumi

    2011-05-01

    Background The genome of Helicobacter pylori, an oncogenic bacterium in the human stomach, rapidly evolves and shows wide geographical divergence. The high incidence of stomach cancer in East Asia might be related to bacterial genotype. We used newly developed comparative methods to follow the evolution of East Asian H. pylori genomes using 20 complete genome sequences from Japanese, Korean, Amerind, European, and West African strains. Results A phylogenetic tree of concatenated well-defined core genes supported divergence of the East Asian lineage (hspEAsia; Japanese and Korean) from the European lineage ancestor, and then from the Amerind lineage ancestor. Phylogenetic profiling revealed a large difference in the repertoire of outer membrane proteins (including oipA, hopMN, babABC, sabAB and vacA-2) through gene loss, gain, and mutation. All known functions associated with molybdenum, a rare element essential to nearly all organisms that catalyzes two-electron-transfer oxidation-reduction reactions, appeared to be inactivated. Two pathways linking acetyl~CoA and acetate appeared intact in some Japanese strains. Phylogenetic analysis revealed greater divergence between the East Asian (hspEAsia) and the European (hpEurope) genomes in proteins in host interaction, specifically virulence factors (tipα), outer membrane proteins, and lipopolysaccharide synthesis (human Lewis antigen mimicry) enzymes. Divergence was also seen in proteins in electron transfer and translation fidelity (miaA, tilS), a DNA recombinase/exonuclease that recognizes genome identity (addA), and DNA/RNA hybrid nucleases (rnhAB). Positively selected amino acid changes between hspEAsia and hpEurope were mapped to products of cagA, vacA, homC (outer membrane protein), sotB (sugar transport), and a translation fidelity factor (miaA). Large divergence was seen in genes related to antibiotics: frxA (metronidazole resistance), def (peptide deformylase, drug target), and ftsA (actin

  8. Multidimensional scaling for large genomic data sets

    Directory of Open Access Journals (Sweden)

    Lu Henry

    2008-04-01

    Background Multi-dimensional scaling (MDS) aims to represent high dimensional data in a low dimensional space with preservation of the similarities between data points. This reduction in dimensionality is crucial for analyzing and revealing the genuine structure hidden in the data. For noisy data, dimension reduction can effectively reduce the effect of noise on the embedded structure. For large data sets, dimension reduction can effectively reduce information retrieval complexity. Thus, MDS techniques are used in many applications of data mining and gene network research. However, although there have been a number of studies that applied MDS techniques to genomics research, the number of analyzed data points was restricted by the high computational complexity of MDS. In general, a non-metric MDS method is faster than a metric MDS, but it does not preserve the true relationships. The computational complexity of most metric MDS methods is over O(N^2), so that it is difficult to process a data set of a large number of genes N, such as in the case of whole genome microarray data. Results We developed a new rapid metric MDS method with a low computational complexity, making metric MDS applicable for large data sets. Computer simulation showed that the new method of split-and-combine MDS (SC-MDS) is fast, accurate and efficient. Our empirical studies using microarray data on the yeast cell cycle showed that the performance of K-means in the reduced dimensional space is similar to or slightly better than that of K-means in the original space, but about three times faster to obtain the clustering results. Our clustering results using SC-MDS are more stable than those in the original space. Hence, the proposed SC-MDS is useful for analyzing whole genome data. Conclusion Our new method reduces the computational complexity from O(N^3) to O(N) when the dimension of the feature space is far less than the number of genes N, and it successfully

  9. Acidobacteria form a coherent but highly diverse group within the bacterial domain: evidence from environmental genomics

    DEFF Research Database (Denmark)

    Quaiser, Achim; Ochsenreiter, Torsten; Lanz, Christa;

    2003-01-01

    ecological role and extensive metabolic versatility. However, the genetic and physiological information about Acidobacteria is scarce. In order to gain insight into genome structure, evolution and diversity of these microorganisms we have initiated an environmental genomic approach by constructing large...... well-studied bacterial phyla....

  10. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    Science.gov (United States)

    Vuyisich, Momchilo; Arefin, Ayesha; Davenport, Karen; Feng, Shihai; Gleasner, Cheryl; McMurry, Kim; Parson-Quintana, Beverly; Price, Jennifer; Scholz, Matthew; Chain, Patrick

    2014-01-01

    Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the utility of NEBNext Ultra for resequencing and de novo assembly of four bacterial genomes and compared its performance with the TruSeq library preparation kit. The NEBNext Ultra reagents enable high quality resequencing and de novo assembly of a variety of bacterial genomes when using 100 ng of input genomic DNA. For the two most challenging genomes (Burkholderia spp.), which have the highest GC content and are the longest, we also show that the quality of both resequencing and de novo assembly is not decreased when only 10 ng of input genomic DNA is used. PMID:25478564

  11. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    Directory of Open Access Journals (Sweden)

    Momchilo Vuyisich

    2014-01-01

    Full Text Available Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the utility of NEBNext Ultra for resequencing and de novo assembly of four bacterial genomes and compared its performance with the TruSeq library preparation kit. The NEBNext Ultra reagents enable high quality resequencing and de novo assembly of a variety of bacterial genomes when using 100 ng of input genomic DNA. For the two most challenging genomes (Burkholderia spp.), which have the highest GC content and are the longest, we also show that the quality of both resequencing and de novo assembly is not decreased when only 10 ng of input genomic DNA is used.

  12. Whole genome sequencing of bacteria in cystic fibrosis as a model for bacterial genome adaptation and evolution.

    Science.gov (United States)

    Sharma, Poonam; Gupta, Sushim Kumar; Rolain, Jean-Marc

    2014-03-01

    Cystic fibrosis (CF) airways harbor a wide variety of new and/or emerging multidrug resistant bacteria which impose a heavy burden on patients. These bacteria live in close proximity with one another, which increases the frequency of lateral gene transfer. The exchange and movement of mobile genetic elements and genomic islands facilitate the spread of genes between genetically diverse bacteria, which seem to be advantageous to the bacterium as it allows adaptation to the new niches of the CF lungs. Niche adaptation is one of the major evolutionary forces shaping bacterial genome composition and in CF the chronic strains adapt and become less virulent. The purpose of this review is to shed light on CF bacterial genome alterations. Next-generation sequencing technology is an exciting tool that may help us to decipher the genome architecture and the evolution of bacteria colonizing CF lungs. PMID:24502835

  13. Complete Genomes of Classical Swine Fever Virus Cloned into Bacterial Artificial Chromosomes

    OpenAIRE

    Rasmussen, Thomas Bruun; Reimann, I; Uttenthal, Åse; De Beer, M.

    2011-01-01

    Complete genome amplification of viral RNA provides a new tool for the generation of modified pestiviruses. We have used our full-genome amplification strategy for generation of amplicons representing complete genomes of classical swine fever virus. The amplicons were cloned directly into a stable single-copy bacterial artificial chromosome (BAC) generating full-length pestivirus DNAs from which infectious RNA transcripts could be also derived. Our strategy allows construction of stable infec...

  14. Physical descriptions of the bacterial nucleoid at large scales, and their biological implications

    CERN Document Server

    Benza, Vincenzo G; Dorfman, Kevin D; Scolari, Vittore F; Bromek, Krystyna; Cicuta, Pietro; Lagomarsino, Marco Cosentino

    2012-01-01

    Recent experimental and theoretical approaches have attempted to quantify the physical organization (compaction and geometry) of the bacterial chromosome with its complement of proteins (the nucleoid). The genomic DNA exists in a complex and dynamic protein-rich state, which is highly organised at various length scales. This has implications on modulating (when not enabling) the core biological processes of replication, transcription, segregation. We overview the progress in this area, driven in the last few years by new scientific ideas and new interdisciplinary experimental techniques, ranging from high space- and time-resolution microscopy to high-throughput genomics employing sequencing to map different aspects of the nucleoid-related interactome. The aim of this review is to present the wide spectrum of experimental and theoretical findings coherently, from a physics viewpoint. We also discuss some attempts of interpretation that unify different results, highlighting the role that statistical and soft co...

  15. Identification of prophages in bacterial genomes by dinucleotide relative abundance difference.

    Directory of Open Access Journals (Sweden)

    K V Srividhya

    Full Text Available BACKGROUND: Prophages are integrated viral forms in bacterial genomes that have been found to contribute to interstrain genetic variability. Many virulence-associated genes are reported to be prophage encoded. Present computational methods to detect prophages work either by identifying possible essential proteins such as integrases or by an extension of this technique, which involves identifying a region containing proteins similar to those occurring in prophages. These methods suffer from the problem of low sequence similarity at the protein level, which suggests that a nucleotide-based approach could be useful. METHODOLOGY: Earlier, dinucleotide relative abundance (DRA) has been used to identify regions in genomes that deviate from their neighboring areas. We have used the difference in the dinucleotide relative abundance (DRAD) between the bacterial and prophage DNA to aid location of DNA stretches that could be of prophage origin in bacterial genomes. Prophage sequences which deviate from bacterial regions in their dinucleotide frequencies are detected by scanning bacterial genome sequences. The method was validated using a subset of genomes with prophage data from literature reports. A web interface for prophage scan based on this method is available at http://bicmku.in:8082/prophagedb/dra.html. Two hundred bacterial genomes which do not have annotated prophages have been scanned for prophage regions using this method. CONCLUSIONS: The relative dinucleotide distribution difference helps detect prophage regions in genome sequences. The usefulness of this method is seen in the identification of 461 highly probable loci pertaining to prophages which have not been annotated so earlier. This work emphasizes the need to extend the efforts to detect and annotate prophage elements in genome sequences.
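
    The scanning idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula or the window settings, so everything below is an assumption: the relative abundance is taken as the commonly used ratio rho(xy) = f(xy) / (f(x) f(y)), the difference is a mean absolute difference over the 16 dinucleotides, and the function names, window size and step are hypothetical. It is not the authors' implementation.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]


def dinucleotide_relative_abundance(seq):
    """Return rho[xy] = f(xy) / (f(x) * f(y)) for all 16 dinucleotides.

    Assumed definition; ambiguous bases (e.g. N) are skipped.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    rho = {}
    for d in DINUCS:
        fx = mono[d[0]] / n_mono if n_mono else 0.0
        fy = mono[d[1]] / n_mono if n_mono else 0.0
        fxy = di[d] / n_di if n_di else 0.0
        rho[d] = fxy / (fx * fy) if fx > 0 and fy > 0 else 0.0
    return rho


def profile_difference(rho_a, rho_b):
    """Mean absolute difference between two relative-abundance profiles."""
    return sum(abs(rho_a[d] - rho_b[d]) for d in DINUCS) / len(DINUCS)


def scan_windows(genome, window=1000, step=500):
    """Compare each window's profile with the whole-genome profile.

    Returns a list of (window_start, difference); windows with a large
    difference are candidates for foreign (e.g. prophage) DNA.
    """
    genome_rho = dinucleotide_relative_abundance(genome)
    results = []
    for start in range(0, len(genome) - window + 1, step):
        window_rho = dinucleotide_relative_abundance(genome[start:start + window])
        results.append((start, profile_difference(window_rho, genome_rho)))
    return results
```

    In use, the windows returned by `scan_windows` would be ranked by their difference score and the highest-scoring stretches inspected further; any threshold for calling a region a prophage would have to be calibrated on genomes with known prophages.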

  16. Construction and Preliminary Characterization Analysis of Wuzhishan Miniature Pig Bacterial Artificial Chromosome Library with Approximately 8-Fold Genome Equivalent Coverage

    Directory of Open Access Journals (Sweden)

    Changqing Liu

    2013-01-01

    Full Text Available Bacterial artificial chromosome (BAC) libraries have been invaluable tools for the genome-wide genetic dissection of complex organisms. Here, we report the construction and characterization of a high-redundancy BAC library from a very valuable pig breed in China, Wuzhishan miniature pig (Sus scrofa), using its blood cells and fibroblasts, respectively. The library contains approximately 153,600 clones ordered in 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 152.3 kb, representing approximately 7.68 genome equivalents of the porcine haploid genome and a 99.93% statistical probability of obtaining at least one clone containing a unique DNA sequence in the library. Nineteen pairs of microsatellite marker primers covering porcine chromosomes were used for screening the BAC library, which showed that each of these markers was positive in the library; the positive clone number was 2 to 9, and the average number was 7.89, which was consistent with 7.68-fold coverage of the porcine genome. There were no significant differences between the genomic BAC libraries from blood cells and fibroblast cells. Therefore, we identified 19 microsatellite markers that could potentially be used as genetic markers. As a result, this BAC library will serve as a valuable resource for gene identification, physical mapping, comparative genomics and large-scale genome sequencing in the pig.
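
    The relationship between clone number, insert size, genome equivalents and recovery probability can be sketched as follows. The abstract does not state which formula or genome size the authors used, so this sketch assumes the standard Clarke-Carbon expression P = 1 - (1 - I/G)^N and is not guaranteed to reproduce the reported percentages; the function names are hypothetical.

```python
import math


def genome_equivalents(n_clones, mean_insert_kb, genome_size_kb):
    """Library coverage: total cloned DNA divided by haploid genome size."""
    return n_clones * mean_insert_kb / genome_size_kb


def probability_of_recovery(n_clones, mean_insert_kb, genome_size_kb):
    """Clarke-Carbon estimate of finding a given unique sequence at least once.

    Assumes random, independent clones: P = 1 - (1 - I/G)^N.
    """
    return 1.0 - (1.0 - mean_insert_kb / genome_size_kb) ** n_clones


def clones_needed(probability, mean_insert_kb, genome_size_kb):
    """Invert the Clarke-Carbon formula for the required number of clones."""
    fraction = mean_insert_kb / genome_size_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))
```

    For small I/G the probability is approximately 1 - exp(-c), where c is the number of genome equivalents, which is why a few-fold increase in coverage moves the recovery probability from about 63% at 1-fold to above 99% at 5-fold.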

  17. Construction and Analysis of Siberian Tiger Bacterial Artificial Chromosome Library with Approximately 6.5-Fold Genome Equivalent Coverage

    Directory of Open Access Journals (Sweden)

    Changqing Liu

    2014-03-01

    Full Text Available Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the most well-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and an SRY sequence revealed that each of these markers was present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with 6.46-fold coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger.

  18. Construction and characterization of bacterial artificial chromosomes (BACs) containing herpes simplex virus full-length genomes.

    Science.gov (United States)

    Nagel, Claus-Henning; Pohlmann, Anja; Sodeik, Beate

    2014-01-01

    Bacterial artificial chromosomes (BACs) are suitable vectors not only to maintain the large genomes of herpesviruses in Escherichia coli but also to enable the traceless introduction of any mutation using modern tools of bacterial genetics. To clone a herpes simplex virus genome, a BAC replication origin is first introduced into the viral genome by homologous recombination in eukaryotic host cells. As part of their nuclear replication cycle, genomes of herpesviruses circularize and these replication intermediates are then used to transform bacteria. After cloning, the integrity of the recombinant viral genomes is confirmed by restriction length polymorphism analysis and sequencing. The BACs may then be used to design virus mutants. Upon transfection into eukaryotic cells new herpesvirus strains harboring the desired mutations can be recovered and used for experiments in cultured cells as well as in animal infection models. PMID:24671676

  19. Host imprints on bacterial genomes--rapid, divergent evolution in individual patients.

    Directory of Open Access Journals (Sweden)

    Jaroslaw Zdziarski

    Full Text Available Bacteria lose or gain genetic material and through selection, new variants become fixed in the population. Here we provide the first, genome-wide example of a single bacterial strain's evolution in different deliberately colonized patients and the surprising insight that hosts appear to personalize their microflora. By first obtaining the complete genome sequence of the prototype asymptomatic bacteriuria strain E. coli 83972 and then resequencing its descendants after therapeutic bladder colonization of different patients, we identified 34 mutations, which affected metabolic and virulence-related genes. Further transcriptome and proteome analysis proved that these genome changes altered bacterial gene expression resulting in unique adaptation patterns in each patient. Our results provide evidence that, in addition to stochastic events, adaptive bacterial evolution is driven by individual host environments. Ongoing loss of gene function supports the hypothesis that evolution towards commensalism rather than virulence is favored during asymptomatic bladder colonization.

  20. Genome-Scale Thermodynamic Analysis of Escherichia coli Metabolism

    OpenAIRE

    Christopher S Henry; Jankowski, Matthew D.; Broadbelt, Linda J.; Hatzimanikatis, Vassily

    2005-01-01

    Genome-scale metabolic models are an invaluable tool for analyzing metabolic systems as they provide a more complete picture of the processes of metabolism. We have constructed a genome-scale metabolic model of Escherichia coli based on the iJR904 model developed by the Palsson Laboratory at the University of California at San Diego. Group contribution methods were utilized to estimate the standard Gibbs free energy change of every reaction in the constructed model. Reactions in the model wer...

  1. A Markovian analysis of bacterial genome sequence constraints

    Directory of Open Access Journals (Sweden)

    Aaron D. Skewes

    2013-08-01

    Full Text Available The arrangement of nucleotides within a bacterial chromosome is influenced by numerous factors. The degeneracy of the third codon within each reading frame allows some flexibility of nucleotide selection; however, the third nucleotide in the triplet of each codon is at least partly determined by the preceding two. This is most evident in organisms with a strong G + C bias, as the degenerate codon must contribute disproportionately to maintaining that bias. Therefore, a correlation exists between the first two nucleotides and the third in all open reading frames. If the arrangement of nucleotides in a bacterial chromosome is represented as a Markov process, we would expect that the correlation would be completely captured by a second-order Markov model and an increase in the order of the model (e.g., third-, fourth-…order) would not capture any additional uncertainty in the process. In this manuscript, we present the results of a comprehensive study of the Markov property that exists in the DNA sequences of 906 bacterial chromosomes. All of the 906 bacterial chromosomes studied exhibit a statistically significant Markov property that extends beyond second-order, and therefore cannot be fully explained by codon usage. An unrooted tree containing all 906 bacterial chromosomes based on their transition probability matrices of third-order shares ∼25% similarity to a tree based on sequence homologies of 16S rRNA sequences. This congruence to the 16S rRNA tree is greater than for trees based on lower-order models (e.g., second-order), and higher-order models result in diminishing improvements in congruence. A nucleotide correlation most likely exists within every bacterial chromosome that extends past three nucleotides. This correlation places significant limits on the number of nucleotide sequences that can represent probable bacterial chromosomes. Transition matrix usage is largely conserved by taxa, indicating that this property is likely
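
    As an illustration of what a k-th order Markov model of a DNA sequence involves, the following minimal sketch estimates transition probabilities from context counts and computes the empirical conditional entropy for a given order. The abstract does not describe the authors' estimator or test statistic, so the function names and the entropy measure are assumptions chosen for illustration. Note that the plug-in entropy never increases with order, so a drop on its own is not a significance test.

```python
import math
from collections import defaultdict


def transition_counts(seq, order):
    """Count how often each base follows each context of length `order`."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(order, len(seq)):
        context = seq[i - order:i]  # empty string when order == 0
        counts[context][seq[i]] += 1
    return counts


def transition_probabilities(seq, order):
    """Maximum-likelihood transition probabilities P(next base | context)."""
    probs = {}
    for context, nxt in transition_counts(seq, order).items():
        total = sum(nxt.values())
        probs[context] = {base: c / total for base, c in nxt.items()}
    return probs


def conditional_entropy(seq, order):
    """Empirical entropy (bits) of a base given the previous `order` bases."""
    counts = transition_counts(seq, order)
    n = len(seq) - order
    h = 0.0
    for nxt in counts.values():
        total = sum(nxt.values())
        for c in nxt.values():
            h -= (c / n) * math.log2(c / total)
    return h
```

    Comparing `conditional_entropy(seq, k)` for k = 0, 1, 2, 3 shows how much uncertainty each additional base of context removes; a proper analysis would compare models of different orders with a likelihood-ratio test or held-out data rather than with the raw entropies.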

  2. Determining and comparing protein function in Bacterial genome sequences

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla

    predictions were made in about 60% of the cases. This project has highlighted the difficulties and challenges in functional annotation and computational analysis of sequence data. It has provided possible solutions for creating reproducible pipelines for comparative genomics as well as constructed a number......In November 2013, there were around 21,000 different prokaryotic genomes sequenced and publicly available, and the number is growing daily with another 20,000 or more genomes expected to be sequenced and deposited by the end of 2014. An important part of the analysis of this data is the functional...... annotation of genes – the descriptions assigned to genes that describe the likely function of the encoded proteins. This process is limited by several factors, including the definition of a function which can be more or less specific as well as how many genes can actually be assigned a function based...

  3. Using Genome-scale Models to Predict Biological Capabilities

    DEFF Research Database (Denmark)

    O’Brien, Edward J.; Monk, Jonathan M.; Palsson, Bernhard O.

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods at the genome scale have been under development since the first whole-genome sequences appeared in the mid-1990s. A few years ago, this approach began to demonstrate the ability to predict a range of cellular functions, including cellular...... growth capabilities on various substrates and the effect of gene knockouts at the genome scale. Thus, much interest has developed in understanding and applying these methods to areas such as metabolic engineering, antibiotic design, and organismal and enzyme evolution. This Primer will get you started....

  4. Genome Sequences of Nine Gram-Negative Vaginal Bacterial Isolates

    Science.gov (United States)

    Deitzler, Grace E.; Ruiz, Maria J.; Lu, Wendy; Weimer, Cory; Park, SoEun; Robinson, Lloyd S.; Hallsworth-Pepin, Kymberlie; Wollam, Aye; Mitreva, Makedonka

    2016-01-01

    The vagina is home to a wide variety of bacteria that have great potential to impact human health. Here, we announce reference strains (now available through BEI Resources) and draft genome sequences for 9 Gram-negative vaginal isolates from the taxa Citrobacter, Klebsiella, Fusobacterium, Proteus, and Prevotella. PMID:27688330

  5. Insights from 20 years of bacterial genome sequencing

    DEFF Research Database (Denmark)

    Land, Miriam; Hauser, Loren; Jun, Se-Ran;

    2015-01-01

    in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes...

  6. Genome Sequences of Nine Gram-Negative Vaginal Bacterial Isolates.

    Science.gov (United States)

    Deitzler, Grace E; Ruiz, Maria J; Lu, Wendy; Weimer, Cory; Park, SoEun; Robinson, Lloyd S; Hallsworth-Pepin, Kymberlie; Wollam, Aye; Mitreva, Makedonka; Lewis, Warren G; Lewis, Amanda L

    2016-01-01

    The vagina is home to a wide variety of bacteria that have great potential to impact human health. Here, we announce reference strains (now available through BEI Resources) and draft genome sequences for 9 Gram-negative vaginal isolates from the taxa Citrobacter, Klebsiella, Fusobacterium, Proteus, and Prevotella. PMID:27688330

  7. GFinisher: a new strategy to refine and finish bacterial genome assemblies

    Science.gov (United States)

    Guizelini, Dieval; Raittz, Roberto T.; Cruz, Leonardo M.; Souza, Emanuel M.; Steffens, Maria B. R.; Pedrosa, Fabio O.

    2016-10-01

    Despite developments in DNA sequencing technology that have improved the number and length of reads, the reconstruction of complete genome sequences, the so-called genome assembly, is still complex. Only 13% of the prokaryotic genome sequencing projects have been completed. Draft genome sequences deposited in public databases are fragmented in contigs and may lack the full gene complement. The aim of the present work is to identify assembly errors and improve the assembly process of bacterial genomes. The biological patterns observed in genomic sequences and the application of a priori information can allow the identification of misassembled regions, and the reorganization and improvement of the overall de novo genome assembly. GFinisher starts by generating fuzzy GC skew graphs for each contig in an assembly and then breaks the contigs at critical points in order to reassemble and close them using jFGap. This has been successfully applied to datasets from 96 genome assemblies, decreasing the number of contigs by up to 86%. GFinisher can easily optimize assemblies of prokaryotic draft genomes and can be used to improve the assembly programs based on nucleotide sequence patterns in the genome. The software and source code are available at http://gfinisher.sourceforge.net/.
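
    For readers unfamiliar with GC skew, the quantity underlying the graphs mentioned in this abstract can be sketched as below. This is the plain windowed skew (G - C) / (G + C) and its running sum, not the fuzzy GC skew procedure of GFinisher, whose details the abstract does not give; function names and window sizes are hypothetical.

```python
def gc_skew(seq, window=1000, step=1000):
    """Return (start, skew) per window, with skew = (G - C) / (G + C)."""
    seq = seq.upper()
    skews = []
    # max(..., 1) ensures a sequence shorter than one window is still scored.
    for start in range(0, max(len(seq) - window + 1, 1), step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skews.append((start, (g - c) / (g + c) if g + c else 0.0))
    return skews


def cumulative_skew(skews):
    """Running sum of the windowed skew values along the sequence."""
    total = 0.0
    cumulative = []
    for start, value in skews:
        total += value
        cumulative.append((start, total))
    return cumulative


def skew_extrema(cumulative):
    """Window starts of the minimum and maximum of the cumulative skew.

    In many bacterial chromosomes these lie near the replication origin
    and terminus, respectively.
    """
    low = min(cumulative, key=lambda item: item[1])[0]
    high = max(cumulative, key=lambda item: item[1])[0]
    return low, high
```

    An abrupt sign change of the skew inside a contig, away from the expected origin or terminus, is the kind of pattern that can flag a possible misassembly, which is the general intuition behind skew-based finishing tools.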

  8. Complete Genome Sequence of the Intracellular Bacterial Symbiont TC1 in the Anaerobic Ciliate Trimyema compressum

    Science.gov (United States)

    Aoyama, Hiroaki; Saitoh, Seikoh; Nikoh, Naruo; Shimoji, Makiko; Shinzato, Misuzu; Teruya, Kuniko; Hirano, Takashi; Yamada, Takanori; Nobu, Masaru K.; Tamaki, Hideyuki; Shirai, Yumi; Park, Sanghwa; Narihiro, Takashi; Liu, Wen-Tso; Kamagata, Yoichi

    2016-01-01

    A free-living ciliate, Trimyema compressum, found in anoxic freshwater environments harbors methanogenic archaea and a bacterial symbiont named TC1 in its cytoplasm. Here, we report the complete genome sequence of the TC1 symbiont, consisting of a 1.59-Mb chromosome and a 35.8-kb plasmid, which was determined using the PacBio RSII sequencer. PMID:27660797

  9. Complete Genome Sequence of a Human Cytomegalovirus Strain AD169 Bacterial Artificial Chromosome Clone

    Science.gov (United States)

    Ostermann, Eleonore; Spohn, Michael; Indenbirken, Daniela

    2016-01-01

    The complete sequence of the human cytomegalovirus strain AD169 (variant ATCC) cloned as a bacterial artificial chromosome (AD169-BAC, also known as HB15 or pHB15) was determined. The viral genome has a length of 230,290 bp and shows 52 nucleotide differences compared to a previously sequenced AD169varATCC clone. PMID:27034483

  10. Complete genome sequence of Japanese erwinia strain ejp617, a bacterial shoot blight pathogen of pear.

    Science.gov (United States)

    Park, Duck Hwan; Thapa, Shree Prasad; Choi, Beom-Soon; Kim, Won-Sik; Hur, Jang Hyun; Cho, Jun Mo; Lim, Jong-Sung; Choi, Ik-Young; Lim, Chun Keun

    2011-01-01

    The Japanese Erwinia strain Ejp617 is a plant pathogen that causes bacterial shoot blight of pear in Japan. Here, we report the complete genome sequence of strain Ejp617 isolated from Nashi pears in Japan to provide further valuable insight among related Erwinia species.

  11. Complete Genome Sequence of the Intracellular Bacterial Symbiont TC1 in the Anaerobic Ciliate Trimyema compressum.

    Science.gov (United States)

    Shinzato, Naoya; Aoyama, Hiroaki; Saitoh, Seikoh; Nikoh, Naruo; Nakano, Kazuma; Shimoji, Makiko; Shinzato, Misuzu; Satou, Kazuhito; Teruya, Kuniko; Hirano, Takashi; Yamada, Takanori; Nobu, Masaru K; Tamaki, Hideyuki; Shirai, Yumi; Park, Sanghwa; Narihiro, Takashi; Liu, Wen-Tso; Kamagata, Yoichi

    2016-01-01

    A free-living ciliate, Trimyema compressum, found in anoxic freshwater environments harbors methanogenic archaea and a bacterial symbiont named TC1 in its cytoplasm. Here, we report the complete genome sequence of the TC1 symbiont, consisting of a 1.59-Mb chromosome and a 35.8-kb plasmid, which was determined using the PacBio RSII sequencer. PMID:27660797

  12. The OME Framework for genome-scale systems biology

    Energy Technology Data Exchange (ETDEWEB)

    Palsson, Bernhard O. [Univ. of California, San Diego, CA (United States); Ebrahim, Ali [Univ. of California, San Diego, CA (United States); Federowicz, Steve [Univ. of California, San Diego, CA (United States)

    2014-12-19

    The life sciences are undergoing continuous and accelerating integration with computational and engineering sciences. The biology that many in the field have been trained on may be hardly recognizable in ten to twenty years. One of the major drivers for this transformation is the blistering pace of advancements in DNA sequencing and synthesis. These advances have resulted in unprecedented amounts of new data, information, and knowledge. Many software tools have been developed to deal with aspects of this transformation and each is sorely needed [1-3]. However, few of these tools have been forced to deal with the full complexity of genome-scale models along with high-throughput genome-scale data. This particular situation represents a unique challenge, as it is simultaneously necessary to deal with the vast breadth of genome-scale models and the dizzying depth of high-throughput datasets. It has been observed time and again that as the pace of data generation continues to accelerate, the pace of analysis significantly lags behind [4]. It is also evident that, given the plethora of databases and software efforts [5-12], it is still a significant challenge to work with genome-scale metabolic models, let alone next-generation whole cell models [13-15]. We work at the forefront of model creation and systems scale data generation [16-18]. The OME Framework was born out of a practical need to enable genome-scale modeling and data analysis under a unified framework to drive the next generation of genome-scale biological models. Here we present the OME Framework. It exists as a set of Python classes. However, we want to emphasize the importance of the underlying design as an addition to the discussions on specifications of a digital cell. A great deal of work and valuable progress has been made by a number of communities [13, 19-24] towards interchange formats and implementations designed to achieve similar goals.
While many software tools exist for handling genome-scale

  13. An evaluation of multiple annealing and looping based genome amplification using a synthetic bacterial community

    Institute of Scientific and Technical Information of China (English)

    WANG Yong; GAO Zhaoming; XU Ying; LI Guangyu; HE Lisheng; QIAN Peiyuan

    2016-01-01

    The low biomass in environmental samples is a major challenge for microbial metagenomic studies. Amplification of genomic DNA is frequently applied to meet the minimum DNA requirement of high-throughput next-generation sequencing technology. Using a synthetic bacterial community, the amplification efficiency of the Multiple Annealing and Looping Based Amplification Cycles (MALBAC) kit, which was originally developed to amplify the single-cell genomic DNA of mammalian organisms, is examined. A DNA template of 10 pg in each MALBAC reaction may generate enough DNA for Illumina sequencing. Using 10 pg and 100 pg templates for each reaction set, the MALBAC kit shows stable and homogeneous amplification, as indicated by the highly consistent coverage of the reads from the two amplified samples on the contigs assembled from the original unamplified sample. Although the GenomePlex whole genome amplification kit generates enough DNA from 100 pg of template in each reaction, the minority members of the mixed bacterial species are not linearly amplified. For both kits, the GC-rich regions of the genomic DNA are not efficiently amplified, as suggested by the low coverage of the contigs with high GC content. The results support the high efficiency of the MALBAC kit for amplifying environmental microbial DNA samples, while raising concerns about its application to bacterial species with high GC content.

  14. A peptide identification-free, genome sequence-independent shotgun proteomics workflow for strain-level bacterial differentiation

    OpenAIRE

    Wenguang Shao; Min Zhang; Henry Lam; Lau, Stanley C K

    2015-01-01

    Shotgun proteomics is an emerging tool for bacterial identification and differentiation. However, the identification of the mass spectra of peptides to genome-derived peptide sequences remains a key issue that limits the use of shotgun proteomics to bacteria with genome sequences available. In this proof-of-concept study, we report a novel bacterial fingerprinting method that enjoys the resolving power and accuracy of mass spectrometry without the burden of peptide identification (i.e. genome...

  15. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system.

    Science.gov (United States)

    Speth, Daan R; In 't Zandt, Michiel H; Guerrero-Cruz, Simon; Dutilh, Bas E; Jetten, Mike S M

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete draft genomes that together represent the majority of the microbial community. We assign these genomes to distinct anaerobic and aerobic microbial communities. In the aerobic community, nitrifying organisms and heterotrophs predominate. In the anaerobic community, widespread potential for partial denitrification suggests a nitrite loop increases treatment efficiency. Of our genomes, 19 have no previously cultivated or sequenced close relatives and six belong to bacterial phyla without any cultivated members, including the most complete Omnitrophica (formerly OP3) genome to date. PMID:27029554

  16. (Actino)Bacterial "intelligence": using comparative genomics to unravel the information processing capacities of microbes.

    Science.gov (United States)

    Pinto, Daniela; Mascher, Thorsten

    2016-08-01

    Bacterial genomes encode numerous and often sophisticated signaling devices to perceive changes in their environment and mount appropriate adaptive responses. With their help, microbes are able to orchestrate specific decision-making processes that alter the cellular behavior, but also integrate and communicate information. Moreover and beyond, some signal transducing systems also enable bacteria to remember and learn from previous stimuli to anticipate environmental changes. As recently suggested, all of these aspects indicate that bacteria do, in fact, exhibit cognition remarkably reminiscent of what we refer to as intelligent behavior, at least when referred to higher eukaryotes. In this essay, comprehensive data derived from comparative genomics analyses of microbial signal transduction systems are used to probe the concept of cognition in bacterial cells. Using a recent comprehensive analysis of over 100 actinobacterial genomes as a test case, we illustrate the different layers of the capacities of bacteria that result in cognitive and behavioral complexity as well as some form of 'bacterial intelligence'. We try to raise awareness to approach bacteria as cognitive organisms and believe that this view would enrich and open a new path in the experimental studies of bacterial signal transducing systems. PMID:26852121

  17. Genomic Analyses of Bacterial Porin-Cytochrome Gene Clusters

    Directory of Open Access Journals (Sweden)

    Liang Shi

    2014-11-01

    The porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c-type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of the genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular electron transfer reactions with substrates other than Fe(III) and Mn(IV) oxides.

  18. CRISPR-Cas: From the Bacterial Adaptive Immune System to a Versatile Tool for Genome Engineering.

    Science.gov (United States)

    Kirchner, Marion; Schneider, Sabine

    2015-11-01

    The field of biology has been revolutionized by the recent advancement of an adaptive bacterial immune system as a universal genome engineering tool. Bacteria and archaea use repetitive genomic elements termed clustered regularly interspaced short palindromic repeats (CRISPR) in combination with an RNA-guided nuclease (CRISPR-associated nuclease: Cas) to target and destroy invading DNA. By choosing the appropriate sequence of the guide RNA, this two-component system can be used to efficiently modify, target, and edit genomic loci of interest in plants, insects, fungi, mammalian cells, and whole organisms. This has opened up new frontiers in genome engineering, including the potential to treat or cure human genetic disorders. Now the potential risks as well as the ethical, social, and legal implications of this powerful new technique move into the limelight.

  19. BG7: a new approach for bacterial genome annotation designed for next generation sequencing data.

    Directory of Open Access Journals (Sweden)

    Pablo Pareja-Tobes

    BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even at the stage of preliminary genome assembly. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating the ORF prediction and functional annotation phases in just one step. BG7 is especially tolerant of sequencing errors in start and stop codons, of frameshifts, and of assembly or scaffolding errors. The system is also tolerant of the high level of gene fragmentation which is frequently found in incompletely assembled genomes. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes that are being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC Germany outbreak, in which BG7 was used to get the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been proved for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. Besides, thanks to its plasticity, the system could very easily be adapted to work with new technologies in the future.

  20. The diversity of a distributed genome in bacterial populations

    CERN Document Server

    Baumdicker, F; Pfaffelhuber, P

    2009-01-01

    The distributed genome hypothesis states that the set of genes in a population of bacteria is distributed over all individuals that belong to the specific taxon. It implies that certain genes can be gained and lost from generation to generation. We use the random genealogy given by a Kingman coalescent in order to superimpose events of gene gain and loss along ancestral lines. Gene gains occur at constant rate along ancestral lines. We assume that gained genes have never been present in the population before. Gene losses occur at a rate proportional to the number of genes present along the ancestral line. In this "infinitely many genes model" we derive moments for several statistics within a sample: the average number of genes per individual, the average number of genes differing between individuals, the number of incongruent pairs of genes, the total number of different genes in the sample and the gene frequency spectrum. We demonstrate that the model gives a reasonable fit with gene frequency data from mari...
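
    The gain and loss rules stated in this abstract can be illustrated with a minimal simulation sketch. It follows a single ancestral line only and does not model the Kingman coalescent or any of the sample statistics derived in the paper. The names `gain_rate` and `loss_rate` are hypothetical parameters chosen for illustration; the paper's own parameterization is not given in the abstract. Under these assumptions the number of genes on the line behaves as an immigration-death process, whose long-run mean is `gain_rate / loss_rate`.

```python
import random


def simulate_gene_count(gain_rate, loss_rate, t_max, seed=0):
    """Time-averaged number of genes carried along one ancestral line.

    Assumptions (hypothetical, for illustration only):
      - new genes are gained at constant rate `gain_rate` (must be > 0);
      - each gene currently present is lost independently at rate
        `loss_rate`, so the total loss rate is proportional to the
        number of genes present.
    """
    rng = random.Random(seed)
    t, n_genes, area = 0.0, 0, 0.0
    while True:
        total_rate = gain_rate + loss_rate * n_genes
        dt = rng.expovariate(total_rate)  # waiting time to the next event
        if t + dt >= t_max:
            area += n_genes * (t_max - t)  # account for the final stretch
            break
        area += n_genes * dt
        t += dt
        # Choose the event type in proportion to its rate.
        if rng.random() < gain_rate / total_rate:
            n_genes += 1
        else:
            n_genes -= 1
    return area / t_max


if __name__ == "__main__":
    # The time average should be close to gain_rate / loss_rate = 20.
    print(simulate_gene_count(gain_rate=20.0, loss_rate=1.0, t_max=2000.0, seed=1))
```

    The sketch reports a time average rather than a final value so that a single run gives a stable estimate of the equilibrium gene count.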

  1. PathogenFinder - Distinguishing Friend from Foe Using Bacterial Whole Genome Sequence Data

    DEFF Research Database (Denmark)

    Cosentino, Salvatore; Larsen, Mette Voldby; Aarestrup, Frank Møller;

    2013-01-01

    Although the majority of bacteria are harmless or even beneficial to their host, others are highly virulent and can cause serious diseases, and even death. Due to the constantly decreasing cost of high-throughput sequencing there are now many completely sequenced genomes available from both human...... approaches. We describe PathogenFinder (http://cge.cbs.dtu.dk/services/PathogenFinder/), a web-server for the prediction of bacterial pathogenicity by analysing the input proteome, genome, or raw reads provided by the user. The method relies on groups of proteins, created without regard to their annotated...

  2. Compaction of bacterial genomic DNA: clarifying the concepts

    International Nuclear Information System (INIS)

    The unconstrained genomic DNA of bacteria forms a coil, whose volume exceeds 1000 times the volume of the cell. Since prokaryotes lack a membrane-bound nucleus, in sharp contrast with eukaryotes, the DNA may consequently be expected to occupy the whole available volume when constrained to fit in the cell. Still, it has been known for more than half a century that the DNA is localized in a well-defined region of the cell, called the nucleoid, which occupies only 15% to 25% of the total volume. Although this problem has focused the attention of many scientists in recent decades, there is still no certainty concerning the mechanism that enables such a dramatic compaction. The goal of this Topical Review is to take stock of our knowledge on this question by listing all possible compaction mechanisms with the proclaimed desire to clarify the physical principles they are based upon and discuss them in the light of experimental results and the results of simulations based on coarse-grained models. In particular, the fundamental differences between ψ-condensation and segregative phase separation and between the condensation by small and long polycations are highlighted. This review suggests that the importance of certain mechanisms, like supercoiling and the architectural properties of DNA-bridging and DNA-bending nucleoid proteins, may have been overestimated, whereas other mechanisms, like segregative phase separation and the self-association of nucleoid proteins, as well as the possible role of the synergy of two or more mechanisms, may conversely deserve more attention. (topical review)

  3. The CRISPR-Cas system - from bacterial immunity to genome engineering.

    Science.gov (United States)

    Czarnek, Maria; Bereta, Joanna

    2016-01-01

    Precise and efficient genome modifications present a great value in attempts to comprehend the roles of particular genes and other genetic elements in biological processes as well as in various pathologies. In recent years novel methods of genome modification known as genome editing, which utilize so-called "programmable" nucleases, came into use. A true revolution in genome editing has been brought about by the introduction of the CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) system, in which one such nuclease, i.e. Cas9, plays a major role. This system is based on the elements of the bacterial and archaeal mechanism responsible for acquired immunity against phage infections and transfer of foreign genetic material. Microorganisms incorporate fragments of foreign DNA into CRISPR loci present in their genomes, which enables fast recognition and elimination of future infections. There are several types of CRISPR-Cas systems among prokaryotes but only elements of CRISPR type II are employed in genome engineering. CRISPR-Cas type II utilizes small RNA molecules (crRNA and tracrRNA) to precisely direct the effector nuclease - Cas9 - to a specific site in the genome, i.e. to the sequence complementary to crRNA. Cas9 may be used to: (i) introduce stable changes into genomes e.g. in the process of generation of knock-out and knock-in animals and cell lines, (ii) activate or silence the expression of a gene of interest, and (iii) visualize specific sites in genomes of living cells. The CRISPR-Cas-based tools have been successfully employed for generation of animal and cell models of a number of diseases, e.g. specific types of cancer. In the future, genome editing by programmable nucleases may find wide application in medicine e.g. in the therapies of certain diseases of genetic origin and in the therapy of HIV-infected patients. PMID:27594566

  4. Genome resequencing in Populus: Revealing large-scale genome variation and implications on specialized-trait genomics

    Energy Technology Data Exchange (ETDEWEB)

    Muchero, Wellington [ORNL; Labbe, Jessy L [ORNL; Priya, Ranjan [University of Tennessee, Knoxville (UTK); DiFazio, Steven P [West Virginia University, Morgantown; Tuskan, Gerald A [ORNL

    2014-01-01

    To date, Populus ranks among a few plant species with a complete genome sequence and other highly developed genomic resources. With the first genome sequence among all tree species, Populus has been adopted as a suitable model organism for genomic studies in trees. However, far from being just a model species, Populus is a key renewable economic resource that plays a significant role in providing raw materials for the biofuel and pulp and paper industries. Therefore, aside from leading frontiers of basic tree molecular biology and ecological research, Populus leads frontiers in addressing global economic challenges related to fuel and fiber production. The latter fact suggests that research aimed at improving quality and quantity of Populus as a raw material will likely drive the pursuit of more targeted and deeper research in order to unlock the economic potential tied in molecular biology processes that drive this tree species. Advances in genome sequence-driven technologies, such as resequencing individual genotypes, which in turn facilitates large-scale SNP discovery and identification of large-scale polymorphisms, are key determinants of future success in these initiatives. In this treatise we discuss implications of genome sequence-enabled technologies on Populus genomic and genetic studies of complex and specialized traits.

  5. An Improved Method for oriT-Directed Cloning and Functionalization of Large Bacterial Genomic Regions

    OpenAIRE

    Kvitko, Brian H.; McMillan, Ian A.; Schweizer, Herbert P.

    2013-01-01

    We have made significant improvements to a broad-host-range system for the cloning and manipulation of large bacterial genomic regions based on site-specific recombination between directly repeated oriT sites during conjugation. Using two suicide capture vectors carrying flanking homology regions, oriT sites are recombined on either side of the target region. Using a broad-host-range conjugation helper plasmid, the region between the oriT sites is conjugated into an Escherichia coli recipient...

  6. THE MATRIX ITERATION ALGORITHM SOLVING AN ENUMERATION PROBLEM ON BACTERIAL COMPLETE GENOMES

    Institute of Scientific and Technical Information of China (English)

    YANG Huakang; HUANG Chengxiang; WEN Xiaowei

    2004-01-01

    Given an alphabet Σ and a finite minimal set B of forbidden words, a combinatorial enumeration problem on bacterial complete genomes is transformed to enumerating strings of a given length which do not contain any string in B as their substrings. From the fact that a string in the language is equivalent to a path in the corresponding graph, we have obtained a polynomial time algorithm by modifying the power of the adjacency matrix in the graph.
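
    The string-to-path correspondence described above can be sketched with a standard transfer-matrix construction. This is an illustrative assumption, not the authors' modified algorithm, which the abstract does not spell out. States are the allowed words of length k-1, where k is the length of the longest forbidden word; an edge appends one letter when the resulting k-letter window contains no forbidden word. The number of allowed strings of length n is then the sum of the entries of the (n-k+1)-th power of the adjacency matrix. The function names are hypothetical.

```python
from itertools import product


def mat_mul(x, y):
    """Multiply two square integer matrices given as lists of lists."""
    size = len(x)
    return [[sum(x[i][t] * y[t][j] for t in range(size)) for j in range(size)]
            for i in range(size)]


def mat_pow(m, exponent):
    """Raise a square matrix to a non-negative power by repeated squaring."""
    size = len(m)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    while exponent:
        if exponent & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        exponent >>= 1
    return result


def count_avoiding(alphabet, forbidden, n):
    """Count length-n strings over `alphabet` with no substring in `forbidden`.

    Assumes `forbidden` is non-empty and `alphabet` has distinct letters.
    """
    k = max(len(word) for word in forbidden)

    def allowed(s):
        return not any(word in s for word in forbidden)

    if n < k - 1:
        # Too short to fill a state: enumerate directly.
        return sum(allowed("".join(p)) for p in product(alphabet, repeat=n))

    # States: allowed words of length k-1 (a single empty word when k == 1).
    states = ["".join(p) for p in product(alphabet, repeat=k - 1)
              if allowed("".join(p))]
    index = {s: i for i, s in enumerate(states)}

    # adjacency[i][j] counts letters that lead from state i to state j.
    adjacency = [[0] * len(states) for _ in states]
    for s in states:
        for letter in alphabet:
            window = s + letter
            if allowed(window):
                adjacency[index[s]][index[window[1:]]] += 1

    # Each allowed string is a path with n-(k-1) edges through the graph.
    power = mat_pow(adjacency, n - (k - 1))
    return sum(sum(row) for row in power)


if __name__ == "__main__":
    # Binary strings of length 5 without "11": a Fibonacci number, 13.
    print(count_avoiding("01", ["11"], 5))
```

    Repeated squaring keeps the cost polynomial in the number of states and logarithmic in the string length, whereas direct enumeration grows exponentially with n.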

  7. Genomic and Global Approaches to Unravelling How Hypermutable Sequences Influence Bacterial Pathogenesis

    Directory of Open Access Journals (Sweden)

    Fadil A. Bidmos

    2014-02-01

    Rapid adaptation to fluctuations in the host milieu contributes to the host persistence and virulence of bacterial pathogens. Adaptation is frequently mediated by hypermutable sequences in bacterial pathogens. Early bacterial genomic studies identified the multiplicity and virulence-associated functions of these hypermutable sequences. Thus, simple sequence repeat tracts (SSRs) and site-specific recombination were found to control capsular type, lipopolysaccharide structure, pilin diversity and the expression of outer membrane proteins. We review how the population diversity inherent in the SSR-mediated mechanism of localised hypermutation is being unlocked by the investigation of whole genome sequences of disease isolates, analysis of clinical samples and use of model systems. A contrast is presented between the problematical nature of analysing simple sequence repeats in next generation sequencing data and in simpler, pragmatic PCR-based approaches. Specific examples are presented of the potential relevance of this localised hypermutation to meningococcal pathogenesis. This leads us to speculate on the future prospects for unravelling how hypermutable mechanisms may contribute to the transmission, spread and persistence of bacterial pathogens.

  8. Genome Scale Transcriptomics of Baculovirus-Insect Interactions

    Directory of Open Access Journals (Sweden)

    Steven Reid

    2013-11-01

    Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies.

  9. Genome scale transcriptomics of baculovirus-insect interactions.

    Science.gov (United States)

    Nguyen, Quan; Nielsen, Lars K; Reid, Steven

    2013-11-01

    Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies.

  10. Bacterial Genomic Data Analysis in the Next-Generation Sequencing Era.

    Science.gov (United States)

    Orsini, Massimiliano; Cuccuru, Gianmauro; Uva, Paolo; Fotia, Giorgio

    2016-01-01

    Bacterial genome sequencing is now an affordable choice for many laboratories for applications in research, diagnostic, and clinical microbiology. Nowadays, an overabundance of tools is available for genomic data analysis. However, tools differ in algorithms, languages, hardware requirements, and user interfaces, and combining them, as is necessary for sequence data interpretation, often requires (bio)informatics skills which can be difficult to find in many laboratories. In addition, multiple data sources, exceedingly large dataset sizes, and increasing computational complexity further challenge the accessibility, reproducibility, and transparency of the entire process. In this chapter we will cover the main bioinformatics steps required for a complete bacterial genome analysis using next-generation sequencing data, from the raw sequence data to assembled and annotated genomes. All the tools described are available in the Orione framework (http://orione.crs4.it), which uniquely combines in a transparent way the most used open source bioinformatics tools for microbiology, allowing microbiologists without any specific hardware or informatics skills to conduct data-intensive computational analyses from quality control to microbial gene annotation. PMID:27115645

  11. Phylogeny of bacterial and archaeal genomes using conserved genes: supertrees and supermatrices.

    Directory of Open Access Journals (Sweden)

    Jenna Morgan Lang

    Over 3000 microbial (bacterial and archaeal) genomes have been made publicly available to date, providing an unprecedented opportunity to examine evolutionary genomic trends and offering valuable reference data for a variety of other studies such as metagenomics. The utility of these genome sequences is greatly enhanced when we have an understanding of how they are phylogenetically related to each other. Therefore, we here describe our efforts to reconstruct the phylogeny of all available bacterial and archaeal genomes. We identified 24 single-copy, ubiquitous genes suitable for this phylogenetic analysis. We used two approaches to combine the data for the 24 genes. First, we concatenated alignments of all genes into a single alignment from which a Maximum Likelihood (ML) tree was inferred using RAxML. Second, we used a relatively new approach to combining gene data, Bayesian Concordance Analysis (BCA), as implemented in the BUCKy software, in which the results of 24 single-gene phylogenetic analyses are used to generate a "primary concordance" tree. A comparison of the concatenated ML tree and the primary concordance (BUCKy) tree reveals that the two approaches give similar results, relative to a phylogenetic tree inferred from the 16S rRNA gene. After comparing the results and the methods used, we conclude that the current best approach for generating a single phylogenetic tree, suitable for use as a reference phylogeny for comparative analyses, is to perform a maximum likelihood analysis of a concatenated alignment of conserved, single-copy genes.
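
    The concatenation (supermatrix) step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: each per-gene alignment is assumed to be a mapping from taxon name to an aligned sequence of equal length, and a taxon absent from a gene is padded with gap characters (the abstract describes ubiquitous genes, so padding is only a safeguard). The function name and the partition tuple layout are hypothetical.

```python
def concatenate_alignments(gene_alignments, missing="-"):
    """Join per-gene alignments into one supermatrix.

    gene_alignments: list of dicts mapping taxon name -> aligned sequence.
    Returns (supermatrix, partitions), where supermatrix maps each taxon to
    its concatenated sequence and partitions lists (name, start, end) with
    1-based inclusive column coordinates for each gene.
    """
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for number, aln in enumerate(gene_alignments, start=1):
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"gene {number}: sequences are not aligned to one length")
        length = lengths.pop()
        for taxon in taxa:
            # Pad taxa that lack this gene with gap characters.
            pieces[taxon].append(aln.get(taxon, missing * length))
        partitions.append((f"gene{number}", start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


if __name__ == "__main__":
    genes = [
        {"taxonA": "ACG", "taxonB": "A-G"},
        {"taxonA": "TT", "taxonC": "TA"},
    ]
    matrix, parts = concatenate_alignments(genes)
    for taxon, seq in matrix.items():
        print(taxon, seq)
    print(parts)
```

    Keeping the column range of each gene alongside the supermatrix makes it possible to analyze the genes as separate partitions later, should a downstream tool support that.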

  12. From Environment to Man: Genome Evolution and Adaptation of Human Opportunistic Bacterial Pathogens

    Science.gov (United States)

    Aujoulat, Fabien; Roger, Frédéric; Bourdier, Alice; Lotthé, Anne; Lamy, Brigitte; Marchandin, Hélène; Jumas-Bilak, Estelle

    2012-01-01

    The environment is recognized as a huge reservoir for bacterial species and a source of human pathogens. Some environmental bacteria have an extraordinary range of activities that include promotion of plant growth or disease, breakdown of pollutants, production of original biomolecules, but also multidrug resistance and human pathogenicity. The versatility of bacterial life-style involves adaptation to various niches. Adaptation to both open environment and human specific niches is a major challenge that involves intermediate organisms allowing pre-adaptation to humans. The aim of this review is to analyze genomic features of environmental bacteria in order to explain their adaptation to human beings. The genera Pseudomonas, Aeromonas and Ochrobactrum provide valuable examples of opportunistic behavior associated with particular genomic structure and evolution. Particularly, we performed original genomic comparisons among aeromonads and between the strictly intracellular pathogens Brucella spp. and the mild opportunistic pathogens Ochrobactrum spp. We conclude that the adaptation to humans could coincide with a speciation in action revealed by modifications in both genomic and population structures. This adaptation-driven speciation could be a major mechanism for the emergence of true pathogens besides the acquisition of specialized virulence factors. PMID:24704914

  13. Large-scale data mining pilot project in human genome

    Energy Technology Data Exchange (ETDEWEB)

    Musick, R.; Fidelis, R.; Slezak, T.

    1997-05-01

    This whitepaper briefly describes a new, aggressive effort in large-scale data mining at Livermore National Labs. The implications of 'large-scale' will be clarified Section. In the short term, this effort will focus on several mission-critical questions of the Genome project. We will adapt current data mining techniques to the Genome domain, to quantify the accuracy of inference results, and lay the groundwork for a more extensive effort in large-scale data mining. A major aspect of the approach is that we will be fully-staffed data warehousing effort in the human Genome area. The long term goal is a strong applications-oriented research program in large-scale data mining. The tools, skill set gained will be directly applicable to a wide spectrum of tasks involving a for large spatial and multidimensional data. This includes applications in ensuring non-proliferation, stockpile stewardship, enabling Global Ecology (Materials Database Industrial Ecology), advancing the Biosciences (Human Genome Project), and supporting data for others (Battlefield Management, Health Care).

  14. Applying Shannon's information theory to bacterial and phage genomes and metagenomes.

    Science.gov (United States)

    Akhter, Sajia; Bailey, Barbara A; Salamon, Peter; Aziz, Ramy K; Edwards, Robert A

    2013-01-01

    All sequence data contain inherent information that can be measured by Shannon's uncertainty theory. Such measurement is valuable in evaluating large data sets, such as metagenomic libraries, to prioritize their analysis and annotation, thus saving computational resources. Here, Shannon's index of complete phage and bacterial genomes was examined. The information content of a genome was found to be highly dependent on the genome length, GC content, and sequence word size. In metagenomic sequences, the amount of information correlated with the number of matches found by comparison to sequence databases. A sequence with more information (higher uncertainty) has a higher probability of being significantly similar to other sequences in the database. Measuring uncertainty may be used for rapid screening for sequences with matches in available databases, prioritizing computational resources, and indicating which sequences with no known similarities are likely to be important for more detailed analysis.
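
    As an illustration of the measurement discussed above, the following sketch computes Shannon's uncertainty over the words of a fixed size in a sequence. The abstract does not give the exact definition used in the paper, so the choices here are assumptions: overlapping words, base-2 logarithm (bits), no normalization by sequence length, and the hypothetical function name `shannon_index`.

```python
import math
from collections import Counter


def shannon_index(sequence, word_size=1):
    """Shannon uncertainty (in bits) of the word distribution of a sequence.

    Words are overlapping substrings of length `word_size`. Returns 0.0
    when the sequence is shorter than one word.
    """
    sequence = sequence.upper()
    n_words = len(sequence) - word_size + 1
    if n_words <= 0:
        return 0.0
    counts = Counter(sequence[i:i + word_size] for i in range(n_words))
    entropy = 0.0
    for count in counts.values():
        p = count / n_words
        entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    print(shannon_index("ACGT"))              # four equally frequent letters
    print(shannon_index("AAAA"))              # a single repeated letter
    print(shannon_index("ACACAC", word_size=2))
```

    With single-letter words, a sequence that uses all four nucleotides equally reaches the maximum of 2 bits, while a homopolymer scores 0, which matches the intuition that skewed composition lowers the measured information.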

  15. Mapping and validating predictions of soil bacterial biodiversity using European and national scale datasets

    OpenAIRE

    Thomson, Bruce C.; Plassart, Pierre; Gweon, Hyun S.; Stone, Dorothy; Creamer, Rachael E.; Lemanceau, Philippe; Bailey, Mark J.

    2016-01-01

    Recent research has highlighted strong correlations between soil edaphic parameters and bacterial biodiversity. Here we seek to explore these relationships across the European Union member states with respect to mapping bacterial biodiversity at the continental scale. As part of the EU FP7 EcoFINDERs project, bacterial communities from 76 soil samples taken across Europe were assessed from eleven countries encompassing Arctic to Southern Mediterranean climes, representing a diverse range of s...

  16. Spatial Scales of Bacterial Diversity in Cold-Water Coral Reef Ecosystems

    OpenAIRE

    Sandra Schöttner; Christian Wild; Friederike Hoffmann; Antje Boetius; Alban Ramette

    2012-01-01

    Background: Cold-water coral reef ecosystems are recognized as biodiversity hotspots in the deep sea, but insights into their associated bacterial communities are still limited. Deciphering principal patterns of bacterial community variation over multiple spatial scales may however prove critical for a better understanding of factors contributing to cold-water coral reef stability and functioning. Methodology/Principal Findings: Bacterial community structure, as determined by Automated Riboso...

  17. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    Energy Technology Data Exchange (ETDEWEB)

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. 
    A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  18. Genome Sequences of 15 Gardnerella vaginalis Strains Isolated from the Vaginas of Women with and without Bacterial Vaginosis.

    Science.gov (United States)

    Robinson, Lloyd S; Perry, Justin; Lek, Sai; Wollam, Aye; Sodergren, Erica; Weinstock, George; Lewis, Warren G; Lewis, Amanda L

    2016-01-01

    Gardnerella vaginalis is a predominant species in bacterial vaginosis, a dysbiosis of the vagina that is associated with adverse health outcomes, including preterm birth. Here, we present the draft genome sequences of 15 Gardnerella vaginalis strains (now available through BEI Resources) isolated from women with and without bacterial vaginosis. PMID:27688326

  19. Cytotoxic chromosomal targeting by CRISPR/Cas systems can reshape bacterial genomes and expel or remodel pathogenicity islands.

    Directory of Open Access Journals (Sweden)

    Reuben B Vercoe

    2013-04-01

    In prokaryotes, clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated (Cas) proteins constitute a defence system against bacteriophages and plasmids. CRISPR/Cas systems acquire short spacer sequences from foreign genetic elements and incorporate these into their CRISPR arrays, generating a memory of past invaders. Defence is provided by short non-coding RNAs that guide Cas proteins to cleave complementary nucleic acids. While most spacers are acquired from phages and plasmids, there are examples of spacers that match genes elsewhere in the host bacterial chromosome. In Pectobacterium atrosepticum the type I-F CRISPR/Cas system has acquired a self-complementary spacer that perfectly matches a protospacer target in a horizontally acquired island (HAI2) involved in plant pathogenicity. Given the paucity of experimental data about CRISPR/Cas-mediated chromosomal targeting, we examined this process by developing a tightly controlled system. Chromosomal targeting was highly toxic via targeting of DNA and resulted in growth inhibition and cellular filamentation. The toxic phenotype was avoided by mutations in the cas operon, the CRISPR repeats, the protospacer target, and the protospacer-adjacent motif (PAM) beside the target. Indeed, the natural self-targeting spacer was non-toxic due to a single nucleotide mutation adjacent to the target in the PAM sequence. Furthermore, we show that chromosomal targeting can result in large-scale genomic alterations, including the remodelling or deletion of entire pre-existing pathogenicity islands. These features can be engineered for the targeted deletion of large regions of bacterial chromosomes. In conclusion, in DNA-targeting CRISPR/Cas systems, chromosomal interference is deleterious by causing DNA damage and providing a strong selective pressure for genome alterations, which may have consequences for bacterial evolution and pathogenicity.

  20. Genome sequence and plasmid transformation of the model high-yield bacterial cellulose producer Gluconacetobacter hansenii ATCC 53582

    OpenAIRE

    Michael Florea; Benjamin Reeve; James Abbott; Freemont, Paul S.; Tom Ellis

    2016-01-01

    Bacterial cellulose is a strong, highly pure form of cellulose that is used in a range of applications in industry, consumer goods and medicine. Gluconacetobacter hansenii ATCC 53582 is one of the highest reported bacterial cellulose producing strains and has been used as a model organism in numerous studies of bacterial cellulose production and studies aiming to increase cellulose productivity. Here we present a high-quality draft genome sequence for G. hansenii ATCC 53582 and find that in ...

  1. Metabolic complementarity and genomics of the dual bacterial symbiosis of sharpshooters.

    Directory of Open Access Journals (Sweden)

    Dongying Wu

    2006-06-01

    Mutualistic intracellular symbiosis between bacteria and insects is a widespread phenomenon that has contributed to the global success of insects. The symbionts, by provisioning nutrients lacking from diets, allow various insects to occupy or dominate ecological niches that might otherwise be unavailable. One such insect is the glassy-winged sharpshooter (Homalodisca coagulata), which feeds on xylem fluid, a diet exceptionally poor in organic nutrients. Phylogenetic studies based on rRNA have shown two types of bacterial symbionts to be coevolving with sharpshooters: the gamma-proteobacterium Baumannia cicadellinicola and the Bacteroidetes species Sulcia muelleri. We report here the sequencing and analysis of the 686,192-base pair genome of B. cicadellinicola and approximately 150 kilobase pairs of the small genome of S. muelleri, both isolated from H. coagulata. Our study, which to our knowledge is the first genomic analysis of an obligate symbiosis involving multiple partners, suggests striking complementarity in the biosynthetic capabilities of the two symbionts: B. cicadellinicola devotes a substantial portion of its genome to the biosynthesis of vitamins and cofactors required by animals and lacks most amino acid biosynthetic pathways, whereas S. muelleri apparently produces most or all of the essential amino acids needed by its host. This finding, along with other results of our genome analysis, suggests the existence of metabolic codependency among the two unrelated endosymbionts and their insect host. This dual symbiosis provides a model case for studying correlated genome evolution and genome reduction involving multiple organisms in an intimate, obligate mutualistic relationship. In addition, our analysis provides insight for the first time into the differences in symbionts between insects (e.g., aphids) that feed on phloem versus those like H. coagulata that feed on xylem. Finally, the genomes of these two symbionts provide potential

  2. Extraction of ribosomal RNA and genomic DNA from soil for studying the diversity of the indigenous bacterial community

    NARCIS (Netherlands)

    Duarte, G.F.; Rosado, A.S.; Keijzer-Wolters, A.C.; Elsas, van J.D.

    1998-01-01

    A method for the indirect (cell extraction followed by nucleic acid extraction) isolation of bacterial ribosomal RNA (rRNA) and genomic DNA from soil was developed. The protocol allowed for the rapid parallel extraction of genomic DNA as well as small and large ribosomal subunit RNA from four soils

  3. MEMOSys: Bioinformatics platform for genome-scale metabolic models

    Directory of Open Access Journals (Sweden)

    Agren Rasmus

    2011-01-01

    Background: Recent advances in genomic sequencing have enabled the use of genome sequencing in standard biological and biotechnological research projects. The challenge is how to integrate the large amount of data in order to gain novel biological insights. One way to leverage sequence data is to use genome-scale metabolic models. We have therefore designed and implemented a bioinformatics platform which supports the development of such metabolic models. Results: MEMOSys (MEtabolic MOdel research and development System) is a versatile platform for the management, storage, and development of genome-scale metabolic models. It supports the development of new models by providing a built-in version control system which offers access to the complete developmental history. Moreover, the integrated web board, the authorization system, and the definition of user roles allow collaborations across departments and institutions. Research on existing models is facilitated by a search system, references to external databases, and a feature-rich comparison mechanism. MEMOSys provides customizable data exchange mechanisms using the SBML format to enable analysis in external tools. The web application is based on the Java EE framework and offers an intuitive user interface. It currently contains six annotated microbial metabolic models. Conclusions: We have developed a web-based system designed to provide researchers a novel application facilitating the management and development of metabolic models. The system is freely available at http://www.icbi.at/MEMOSys.

  4. Power Laws, Scale-Free Networks and Genome Biology

    CERN Document Server

    Koonin, Eugene V; Karev, Georgy P

    2006-01-01

    Power Laws, Scale-free Networks and Genome Biology deals with crucial aspects of the theoretical foundations of systems biology, namely power law distributions and scale-free networks which have emerged as the hallmarks of biological organization in the post-genomic era. The chapters in the book not only describe the interesting mathematical properties of biological networks but also move beyond phenomenology, toward models of evolution capable of explaining the emergence of these features. The collection of chapters, contributed by both physicists and biologists, strives to address the problems in this field in a rigorous but not excessively mathematical manner and to represent different viewpoints, which is crucial in this emerging discipline. Each chapter includes, in addition to technical descriptions of properties of biological networks and evolutionary models, a more general and accessible introduction to the respective problems. Most chapters emphasize the potential of theoretical systems biology for disco...

  5. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins.

    Science.gov (United States)

    Croucher, Nicholas J; Page, Andrew J; Connor, Thomas R; Delaney, Aidan J; Keane, Jacqueline A; Bentley, Stephen D; Parkhill, Julian; Harris, Simon R

    2015-02-18

    The emergence of new sequencing technologies has facilitated the use of bacterial whole genome alignments for evolutionary studies and outbreak analyses. These datasets, of increasing size, often include examples of multiple different mechanisms of horizontal sequence transfer resulting in substantial alterations to prokaryotic chromosomes. The impact of these processes demands rapid and flexible approaches able to account for recombination when reconstructing isolates' recent diversification. Gubbins is an iterative algorithm that uses spatial scanning statistics to identify loci containing elevated densities of base substitutions suggestive of horizontal sequence transfer while concurrently constructing a maximum likelihood phylogeny based on the putative point mutations outside these regions of high sequence diversity. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistically parameterized models of bacterial evolution, and achieves convergence in only a few hours on alignments of hundreds of bacterial genome sequences. Gubbins is appropriate for reconstructing the recent evolutionary history of a variety of haploid genotype alignments, as it makes no assumptions about the underlying mechanism of recombination. The software is freely available for download at github.com/sanger-pathogens/Gubbins, implemented in Python and C and supported on Linux and Mac OS X. PMID:25414349
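
The idea of scanning a chromosome for loci with an elevated density of base substitutions can be illustrated with a small sketch. This is not the Gubbins implementation: the fixed, non-overlapping windows, the binomial null model, the `alpha` threshold, and the function names `binomial_tail` and `scan_dense_windows` are all assumptions made for illustration only.

```python
from math import exp, lgamma, log, log1p


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log1p(-p))
        total += exp(log_pmf)
    return min(1.0, total)


def scan_dense_windows(snp_positions, genome_length, window=1000, alpha=1e-3):
    """Return (start, end, count) for windows whose substitution count is
    unexpectedly high under a uniform background rate (hypothetical null)."""
    positions = sorted(snp_positions)
    p = len(positions) / genome_length  # genome-wide per-site substitution rate
    if not 0 < p < 1:
        return []
    hits = []
    for start in range(0, genome_length, window):
        end = min(start + window, genome_length)
        k = sum(1 for pos in positions if start <= pos < end)
        if k and binomial_tail(k, end - start, p) < alpha:
            hits.append((start, end, k))
    return hits


# Five scattered substitutions plus a dense cluster of 20 near position 3000.
background = [500, 1500, 5500, 7500, 9500]
cluster = list(range(3000, 3100, 5))
dense = scan_dense_windows(background + cluster, genome_length=10000)
```

In this toy input only the window containing the cluster is flagged. A real analysis would also have to account for the phylogeny and iterate, as the abstract describes.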

  6. Genome-wide selective sweeps and gene-specific sweeps in natural bacterial populations.

    Science.gov (United States)

    Bendall, Matthew L; Stevens, Sarah Lr; Chan, Leong-Keat; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Froula, Jeff; Kang, Dongwan; Tringe, Susannah G; Bertilsson, Stefan; Moran, Mary A; Shade, Ashley; Newton, Ryan J; McMahon, Katherine D; Malmstrom, Rex R

    2016-07-01

    Multiple models describe the formation and evolution of distinct microbial phylogenetic groups. These evolutionary models make different predictions regarding how adaptive alleles spread through populations and how genetic diversity is maintained. Processes predicted by competing evolutionary models, for example, genome-wide selective sweeps vs gene-specific sweeps, could be captured in natural populations using time-series metagenomics if the approach were applied over a sufficiently long time frame. Direct observations of either process would help resolve how distinct microbial groups evolve. Here, from a 9-year metagenomic study of a freshwater lake (2005-2013), we explore changes in single-nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in 30 bacterial populations. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied by >1000-fold among populations. SNP allele frequencies also changed dramatically over time within some populations. Interestingly, nearly all SNP variants were slowly purged over several years from one population of green sulfur bacteria, while at the same time multiple genes either swept through or were lost from this population. These patterns were consistent with a genome-wide selective sweep in progress, a process predicted by the 'ecotype model' of speciation but not previously observed in nature. In contrast, other populations contained large, SNP-free genomic regions that appear to have swept independently through the populations prior to the study without purging diversity elsewhere in the genome. Evidence for both genome-wide and gene-specific sweeps suggests that different models of bacterial speciation may apply to different populations coexisting in the same environment. PMID:26744812

  7. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network

    DEFF Research Database (Denmark)

    Förster, Jochen; Famili, I.; Fu, P.;

    2003-01-01

    and the environment were included. A total of 708 structural open reading frames (ORFs) were accounted for in the reconstructed network, corresponding to 1035 metabolic reactions. Further, 140 reactions were included on the basis of biochemical evidence resulting in a genome-scale reconstructed metabolic network...... with Escherichia coli. The reconstructed metabolic network is the first comprehensive network for a eukaryotic organism, and it may be used as the basis for in silico analysis of phenotypic functions....

  8. Combining p-values in large scale genomics experiments

    OpenAIRE

    Dmitri V Zaykin; Zhivotovsky, Lev A.; Czika, Wendy; Shao, Susan; Wolfinger, Russell D.

    2007-01-01

    In large-scale genomics experiments involving thousands of statistical tests, such as association scans and microarray expression experiments, a key question is: Which of the L tests represent true associations (TAs)? The traditional way to control false findings is via individual adjustments. In the presence of multiple TAs, p-value combination methods offer certain advantages. Both Fisher’s and Lancaster’s combination methods use an inverse gamma transformation. We identify the relation of ...
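
As a minimal sketch of the kind of p-value combination the abstract refers to, Fisher's method sums -2 ln(p) over k independent tests and compares the sum to a chi-square distribution with 2k degrees of freedom (a special case of the gamma distribution). The function name `fisher_combined_pvalue` is a hypothetical one chosen here, independence of the tests is assumed, and the closed-form tail used below holds only for even degrees of freedom.

```python
from math import exp, log


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) is chi-square with 2k degrees of freedom
    under the null. For even degrees of freedom the upper tail is
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(log(p) for p in pvalues)  # this is X / 2
    term = 1.0
    series = 1.0
    for i in range(1, len(pvalues)):
        term *= half_stat / i
        series += term
    return exp(-half_stat) * series


combined = fisher_combined_pvalue([0.05, 0.05])
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the formula.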

  9. Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences

    Directory of Open Access Journals (Sweden)

    Shade Larry L

    2006-06-01

    Background: Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. Results: Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10^-9 change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10^-9 change/site/year) was approximately half of the overall rate (1.9–2.0 × 10^-9 change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%. Conclusion: This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.

  10. Genome-wide identification of Streptococcus pneumoniae genes essential for bacterial replication during experimental meningitis

    DEFF Research Database (Denmark)

    Molzen, T E; Burghout, P; Bootsma, H J;

    2010-01-01

    Meningitis is the most serious of invasive infections caused by the Gram-positive bacterium Streptococcus pneumoniae. Vaccines protect only against a limited number of serotypes, and evolving bacterial resistance to antimicrobials impedes treatment. Further insight into the molecular pathogenesis...... genes mutants of which had become attenuated or enriched, respectively, during infection. The results point to essential roles for capsular polysaccharides, nutrient uptake, and amino acid biosynthesis in bacterial replication during experimental meningitis. The GAF phenotype of a subset of identified...... of invasive pneumococcal disease is required in order to enable the development of new or adjunctive treatments and/or pneumococcal vaccines that are efficient across serotypes. We applied genomic array footprinting (GAF) in the search for S. pneumoniae genes that are essential during experimental meningitis...

  11. Computational bacterial genome-wide analysis of phylogenetic profiles reveals potential virulence genes of Streptococcus agalactiae.

    Directory of Open Access Journals (Sweden)

    Frank Po-Yen Lin

    The phylogenetic profile of a gene is a reflection of its evolutionary history and can be defined as the differential presence or absence of a gene in a set of reference genomes. It has been employed to facilitate the prediction of gene functions. However, the hypothesis that the application of this concept can also facilitate the discovery of bacterial virulence factors has not been fully examined. In this paper, we test this hypothesis and report a computational pipeline designed to identify previously unknown bacterial virulence genes using group B streptococcus (GBS) as an example. Phylogenetic profiles of all GBS genes across 467 bacterial reference genomes were determined by candidate-against-all BLAST searches, which were then used to identify candidate virulence genes by machine learning models. Evaluation experiments with known GBS virulence genes suggested good functional and model consistency in cross-validation analyses (areas under ROC curve, 0.80 and 0.98 respectively). Inspection of the top-10 genes in each of the 15 virulence functional groups revealed at least 15 (of 119) homologous genes implicated in virulence in other human pathogens but previously unrecognized as potential virulence genes in GBS. Among these highly-ranked genes, many encode hypothetical proteins with possible roles in GBS virulence. Thus, our approach has led to the identification of a set of genes potentially affecting the virulence potential of GBS, which are potential candidates for further in vitro and in vivo investigations. This computational pipeline can also be extended to in silico analysis of virulence determinants of other bacterial pathogens.
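
The definition of a phylogenetic profile given in the abstract (presence or absence of a gene across a set of reference genomes) can be sketched as a binary vector. The sketch below is an illustration only: the gene and genome identifiers are invented, the input is assumed to be a pre-computed table of significant matches, and the BLAST searches and machine learning models used by the authors are not reproduced.

```python
def build_profiles(hits, reference_genomes):
    """Turn per-gene match sets into binary presence/absence profiles.

    hits: dict mapping a gene ID to the set of reference genome IDs in which
          a significant match was found (assumed to be computed elsewhere).
    reference_genomes: ordered list of genome IDs; fixes the vector positions.
    """
    return {
        gene: tuple(1 if genome in found else 0 for genome in reference_genomes)
        for gene, found in hits.items()
    }


def hamming_distance(profile_a, profile_b):
    """Number of reference genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Hypothetical identifiers, for illustration only.
genomes = ["genome_A", "genome_B", "genome_C", "genome_D"]
hits = {
    "gene_1": {"genome_A", "genome_C"},
    "gene_2": {"genome_A", "genome_C", "genome_D"},
    "gene_3": set(),
}
profiles = build_profiles(hits, genomes)
distance = hamming_distance(profiles["gene_1"], profiles["gene_2"])
```

Genes with similar profiles have small distances, and vectors of this kind are the sort of feature a downstream classifier could consume.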

  12. Cloning the simian varicella virus genome in E. coli as an infectious bacterial artificial chromosome

    OpenAIRE

    Gray, Wayne L.; Zhou, Fuchun; Noffke, Juliane; Tischer, B Karsten

    2011-01-01

    Simian varicella virus (SVV) is closely related to human varicella-zoster virus and causes varicella and zoster-like disease in nonhuman primates. In this study, a mini-F replicon was inserted into a SVV cosmid and infectious SVV was generated by co-transfection of Vero cells with overlapping SVV cosmids. The entire SVV genome, cloned as a bacterial artificial chromosome (BAC), was stably propagated upon serial passage in E. coli. Transfection of pSVV-BAC DNA into Vero cells yielded infectiou...

  13. Genomics reveals historic and contemporary transmission dynamics of a bacterial disease among wildlife and livestock

    Science.gov (United States)

    Kamath, Pauline L.; Foster, Jeffrey T.; Drees, Kevin P.; Luikart, Gordon; Quance, Christine; Anderson, Neil J.; Clarke, P. Ryan; Cole, Eric K.; Drew, Mark L.; Edwards, William H.; Rhyan, Jack C.; Treanor, John J.; Wallen, Rick L.; White, Patrick J.; Robbe-Austerman, Suelee; Cross, Paul C.

    2016-01-01

    Whole-genome sequencing has provided fundamental insights into infectious disease epidemiology, but has rarely been used for examining transmission dynamics of a bacterial pathogen in wildlife. In the Greater Yellowstone Ecosystem (GYE), outbreaks of brucellosis have increased in cattle along with rising seroprevalence in elk. Here we use a genomic approach to examine Brucella abortus evolution, cross-species transmission and spatial spread in the GYE. We find that brucellosis was introduced into wildlife in this region at least five times. The diffusion rate varies among Brucella lineages (~3 to 8 km per year) and over time. We also estimate 12 host transitions from bison to elk, and 5 from elk to bison. Our results support the notion that free-ranging elk are currently a self-sustaining brucellosis reservoir and the source of livestock infections, and that control measures in bison are unlikely to affect the dynamics of unrelated strains circulating in nearby elk populations.

  14. An improved method for oriT-directed cloning and functionalization of large bacterial genomic regions.

    Science.gov (United States)

    Kvitko, Brian H; McMillan, Ian A; Schweizer, Herbert P

    2013-08-01

    We have made significant improvements to a broad-host-range system for the cloning and manipulation of large bacterial genomic regions based on site-specific recombination between directly repeated oriT sites during conjugation. Using two suicide capture vectors carrying flanking homology regions, oriT sites are recombined on either side of the target region. Using a broad-host-range conjugation helper plasmid, the region between the oriT sites is conjugated into an Escherichia coli recipient strain, where it is circularized and maintained as a chimeric mini-F vector. The cloned target region is functionalized in multiple ways to accommodate downstream manipulation. The target region is flanked with Gateway attB sites for recombination into other vectors and by rare 18-bp I-SceI restriction sites for subcloning. The Tn7-functionalized target can also be inserted at a naturally occurring chromosomal attTn7 site(s) or maintained as a broad-host-range plasmid for complementation or heterologous expression studies. We have used the oriTn7 capture technique to clone and complement Burkholderia pseudomallei genomic regions up to 140 kb in size and have created isogenic Burkholderia strains with various combinations of genomic islands. We believe this system will greatly aid the cloning and genetic analysis of genomic islands, biosynthetic gene clusters, and large open reading frames. PMID:23747708

  15. Multiplex sequencing of bacterial artificial chromosomes for assembling complex plant genomes.

    Science.gov (United States)

    Beier, Sebastian; Himmelbach, Axel; Schmutzer, Thomas; Felder, Marius; Taudien, Stefan; Mayer, Klaus F X; Platzer, Matthias; Stein, Nils; Scholz, Uwe; Mascher, Martin

    2016-07-01

    Hierarchical shotgun sequencing remains the method of choice for assembling high-quality reference sequences of complex plant genomes. The efficient exploitation of current high-throughput technologies and powerful computational facilities for large-insert clone sequencing necessitates the sequencing and assembly of a large number of clones in parallel. We developed a multiplexed pipeline for shotgun sequencing and assembling individual bacterial artificial chromosomes (BACs) using the Illumina sequencing platform. We illustrate our approach by sequencing 668 barley BACs (Hordeum vulgare L.) in a single Illumina HiSeq 2000 lane. Using a newly designed parallelized computational pipeline, we obtained sequence assemblies of individual BACs that consist, on average, of eight sequence scaffolds and represent >98% of the genomic inserts. Our BAC assemblies are clearly superior to a whole-genome shotgun assembly regarding contiguity, completeness and the representation of the gene space. Our methods may be employed to rapidly obtain high-quality assemblies of a large number of clones to assemble map-based reference sequences of plant and animal species with complex genomes by sequencing along a minimum tiling path. PMID:26801048

  16. Single-molecule approach to bacterial genomic comparisons via optical mapping.

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, Shiguo [Univ. Wisc.-Madison; Kile, A. [Univ. Wisc.-Madison; Bechner, M. [Univ. Wisc.-Madison; Kvikstad, E. [Univ. Wisc.-Madison; Deng, W. [Univ. Wisc.-Madison; Wei, J. [Univ. Wisc.-Madison; Severin, J. [Univ. Wisc.-Madison; Runnheim, R. [Univ. Wisc.-Madison; Churas, C. [Univ. Wisc.-Madison; Forrest, D. [Univ. Wisc.-Madison; Dimalanta, E. [Univ. Wisc.-Madison; Lamers, C. [Univ. Wisc.-Madison; Burland, V. [Univ. Wisc.-Madison; Blattner, F. R. [Univ. Wisc.-Madison; Schwartz, David C. [Univ. Wisc.-Madison

    2004-01-01

    Modern comparative genomics has been established, in part, by the sequencing and annotation of a broad range of microbial species. To gain further insights, new sequencing efforts are now dealing with the variety of strains or isolates that gives a species definition and range; however, this number vastly outstrips our ability to sequence them. Given the availability of a large number of microbial species, new whole genome approaches must be developed to fully leverage this information at the level of strain diversity that maximize discovery. Here, we describe how optical mapping, a single-molecule system, was used to identify and annotate chromosomal alterations between bacterial strains represented by several species. Since whole-genome optical maps are ordered restriction maps, sequenced strains of Shigella flexneri serotype 2a (2457T and 301), Yersinia pestis (CO 92 and KIM), and Escherichia coli were aligned as maps to identify regions of homology and to further characterize them as possible insertions, deletions, inversions, or translocations. Importantly, an unsequenced Shigella flexneri strain (serotype Y strain AMC[328Y]) was optically mapped and aligned with two sequenced ones to reveal one novel locus implicated in serotype conversion and several other loci containing insertion sequence elements or phage-related gene insertions. Our results suggest that genomic rearrangements and chromosomal breakpoints are readily identified and annotated against a prototypic sequenced strain by using the tools of optical mapping.

  17. 13C metabolic flux analysis at a genome-scale.

    Science.gov (United States)

    Gopalakrishnan, Saratram; Maranas, Costas D

    2015-11-01

    Metabolic models used in 13C metabolic flux analysis generally include a limited number of reactions primarily from central metabolism. They typically omit degradation pathways, complete cofactor balances, and atom transition contributions for reactions outside central metabolism. This study addresses the impact on prediction fidelity of scaling up mapping models to genome scale. The core mapping model employed in this study accounts for 75 reactions and 65 metabolites, primarily from central metabolism. The genome-scale metabolic mapping model (GSMM) (697 reactions and 595 metabolites) is constructed using the iAF1260 model as a basis, after eliminating reactions guaranteed not to carry flux based on growth and fermentation data for a minimal glucose growth medium. Labeling data for 17 amino acid fragments obtained from cells fed with glucose labeled at the second carbon were used to obtain fluxes and ranges. Metabolic fluxes and confidence intervals are estimated, for both core and genome-scale mapping models, by minimizing the sum of squared differences between predicted and experimentally measured labeling patterns using the EMU decomposition algorithm. Overall, we find that both topology and estimated values of the metabolic fluxes remain largely consistent between the core and GSM models. Stepping up to a genome-scale mapping model leads to wider flux inference ranges for 20 key reactions present in the core model. The glycolysis flux range doubles due to the possibility of active gluconeogenesis, the TCA flux range expands by 80% due to the availability of a bypass through arginine consistent with labeling data, and the transhydrogenase reaction flux is essentially unresolved due to the presence of as many as five routes for the inter-conversion of NADPH to NADH afforded by the genome-scale model. By globally accounting for ATP demands in the GSMM, the unused ATP decreased drastically, with the lower bound matching the maintenance ATP requirement. A non

  18. BPhyOG: An interactive server for genome-wide inference of bacterial phylogenies based on overlapping genes

    Directory of Open Access Journals (Sweden)

    Lin Kui

    2007-07-01

    Background: Overlapping genes (OGs) in bacterial genomes are pairs of adjacent genes of which the coding sequences overlap partly or entirely. With the rapid accumulation of sequence data, many OGs in bacterial genomes have now been identified. Indeed, these might prove a consistent feature across all microbial genomes. Our previous work suggests that OGs can be considered as robust markers at the whole genome level for the construction of phylogenies. An online, interactive web server for inferring phylogenies is needed for biologists to analyze phylogenetic relationships among a set of bacterial genomes of interest. Description: BPhyOG is an online interactive server for reconstructing the phylogenies of completely sequenced bacterial genomes on the basis of their shared overlapping genes. It provides two tree-reconstruction methods: Neighbor Joining (NJ) and Unweighted Pair-Group Method using Arithmetic averages (UPGMA). Users can apply the desired method to generate phylogenetic trees, which are based on an evolutionary distance matrix for the selected genomes. The distance between two genomes is defined by the normalized number of their shared OG pairs. BPhyOG also allows users to browse the OGs that were used to infer the phylogenetic relationships. It provides detailed annotation for each OG pair and the features of the component genes through hyperlinks. Users can also retrieve each of the homologous OG pairs that have been determined among 177 genomes. It is a useful tool for analyzing the tree of life and overlapping genes from a genomic standpoint. Conclusion: BPhyOG is a useful interactive web server for genome-wide inference of any potential evolutionary relationship among the genomes selected by users. It currently includes 177 completely sequenced bacterial genomes containing 79,855 OG pairs, the annotation and homologous OG pairs of which are integrated comprehensively. The reliability of phylogenies complemented by

  19. Differential regulation of horizontally acquired and core genome genes by the bacterial modulator H-NS.

    Directory of Open Access Journals (Sweden)

    Rosa C Baños

    2009-06-01

    Horizontal acquisition of DNA by bacteria dramatically increases genetic diversity and hence successful bacterial colonization of several niches, including the human host. A relevant issue is how this newly acquired DNA interacts and integrates in the regulatory networks of the bacterial cell. The global modulator H-NS targets both core genome and HGT genes and silences gene expression in response to external stimuli such as osmolarity and temperature. Here we provide evidence that H-NS discriminates and differentially modulates core and HGT DNA. As an example of this, plasmid R27-encoded H-NS protein has evolved to selectively silence HGT genes and does not interfere with core genome regulation. In turn, differential regulation of both gene lineages by resident chromosomal H-NS requires a helper protein: the Hha protein. Tight silencing of HGT DNA is accomplished by H-NS-Hha complexes. In contrast, core genes are modulated by H-NS homoligomers. Remarkably, the presence of Hha-like proteins is restricted to the Enterobacteriaceae. In addition, conjugative plasmids encoding H-NS variants have hitherto been isolated only from members of the family. Thus, the H-NS system in enteric bacteria presents unique evolutionary features. The capacity to selectively discriminate between core and HGT DNA may help to maintain horizontally transmitted DNA in silent form and may give these bacteria a competitive advantage in adapting to new environments, including host colonization.

  20. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    Energy Technology Data Exchange (ETDEWEB)

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. A coordinated sequencing effort of cultured organisms is an appropriate place to begin...

  1. Analysis of Aspergillus nidulans metabolism at the genome-scale

    DEFF Research Database (Denmark)

    David, Helga; Ozcelik, İlknur Ş; Hofmann, Gerald;

    2008-01-01

    a function. Results: In this work, we have manually assigned functions to 472 orphan genes in the metabolism of A. nidulans, by using a pathway-driven approach and by employing comparative genomics tools based on sequence similarity. The central metabolism of A. nidulans, as well as biosynthetic pathways......, in an objective and systematic manner. The functional assignments served as a basis to develop a mathematical model, linking 666 genes (both previously and newly annotated) to metabolic roles. The model was used to simulate metabolic behavior and additionally to integrate, analyze and interpret large-scale gene...

  2. Genome-scale constraint-based modeling of Geobacter metallireducens

    Directory of Open Access Journals (Sweden)

    Famili Iman

    2009-01-01

    Full Text Available Abstract Background: Geobacter metallireducens was the first organism that can be grown in pure culture to completely oxidize organic compounds with Fe(III) oxide serving as electron acceptor. Geobacter species, including G. sulfurreducens and G. metallireducens, are used for bioremediation and electricity generation from waste organic matter and renewable biomass. The constraint-based modeling approach enables the development of genome-scale in silico models that can predict the behavior of complex biological systems and their responses to the environments. Such a modeling approach was applied to provide physiological and ecological insights on the metabolism of G. metallireducens. Results: The genome-scale metabolic model of G. metallireducens was constructed to include 747 genes and 697 reactions. Compared to the G. sulfurreducens model, the G. metallireducens metabolic model contains 118 unique reactions that reflect many of G. metallireducens' specific metabolic capabilities. Detailed examination of the G. metallireducens model suggests that its central metabolism contains several energy-inefficient reactions that are not present in the G. sulfurreducens model. Experimental biomass yield of G. metallireducens growing on pyruvate was lower than the predicted optimal biomass yield. Microarray data of G. metallireducens growing with benzoate and acetate indicated that genes encoding these energy-inefficient reactions were up-regulated by benzoate. These results suggested that the energy-inefficient reactions were likely turned off during G. metallireducens growth with acetate for optimal biomass yield, but were up-regulated during growth with complex electron donors such as benzoate for rapid energy generation. Furthermore, several computational modeling approaches were applied to accelerate G. metallireducens research. For example, growth of G. metallireducens with different electron donors and electron acceptors was studied using the genome-scale...

  3. A genome-scale metabolic reconstruction of Mycoplasma genitalium, iPS189.

    Directory of Open Access Journals (Sweden)

    Patrick F Suthers

    2009-02-01

    Full Text Available With a genome size of approximately 580 kb and approximately 480 protein coding regions, Mycoplasma genitalium is one of the smallest known self-replicating organisms and, additionally, has extremely fastidious nutrient requirements. The reduced genomic content of M. genitalium has led researchers to suggest that the molecular assembly contained in this organism may be a close approximation to the minimal set of genes required for bacterial growth. Here, we introduce a systematic approach for the construction and curation of a genome-scale in silico metabolic model for M. genitalium. Key challenges included estimation of biomass composition, handling of enzymes with broad specificities, and the lack of a defined medium. Computational tools were subsequently employed to identify and resolve connectivity gaps in the model as well as growth prediction inconsistencies with gene essentiality experimental data. The curated model, M. genitalium iPS189 (262 reactions, 274 metabolites), is 87% accurate in recapitulating in vivo gene essentiality results for M. genitalium. Approaches and tools described herein provide a roadmap for the automated construction of in silico metabolic models of other organisms.

  4. Genome scale metabolic modeling of the riboflavin overproducer Ashbya gossypii.

    Science.gov (United States)

    Ledesma-Amaro, Rodrigo; Kerkhoven, Eduard J; Revuelta, José Luis; Nielsen, Jens

    2014-06-01

    Ashbya gossypii is a filamentous fungus that naturally overproduces riboflavin, or vitamin B2. Advances in genetic and metabolic engineering of A. gossypii have permitted the switch from industrial chemical synthesis to the current biotechnological production of this vitamin. Additionally, A. gossypii is a model organism with one of the smallest eukaryote genomes being phylogenetically close to Saccharomyces cerevisiae. It has therefore been used to study evolutionary aspects of bakers' yeast. We here reconstructed the first genome scale metabolic model of A. gossypii, iRL766. The model was validated by biomass growth, riboflavin production and substrate utilization predictions. Gene essentiality analysis of the A. gossypii model in comparison with the S. cerevisiae model demonstrated how the whole-genome duplication event that separates the two species has led to an even spread of paralogs among all metabolic pathways. Additionally, iRL766 was used to integrate transcriptomics data from two different growth stages of A. gossypii, comparing exponential growth to riboflavin production stages. Both reporter metabolite analysis and in silico identification of transcriptionally regulated enzymes demonstrated the important involvement of beta-oxidation and the glyoxylate cycle in riboflavin production. PMID:24374726

  5. Tracing the Spread of Clostridium difficile Ribotype 027 in Germany Based on Bacterial Genome Sequences.

    Directory of Open Access Journals (Sweden)

    Matthias Steglich

    Full Text Available We applied whole-genome sequencing to reconstruct the spatial and temporal dynamics underpinning the expansion of Clostridium difficile ribotype 027 in Germany. Based on re-sequencing of genomes from 57 clinical C. difficile isolates, which had been collected from hospitalized patients at 36 locations throughout Germany between 1990 and 2012, we demonstrate that C. difficile genomes have accumulated sequence variation sufficiently fast to document the pathogen's spread at a regional scale. We detected both previously described lineages of fluoroquinolone-resistant C. difficile ribotype 027, FQR1 and FQR2. Using Bayesian phylogeographic analyses, we show that fluoroquinolone-resistant C. difficile 027 was imported into Germany at least four times, that it had been widely disseminated across multiple federal states even before the first outbreak was noted in 2007, and that it has continued to spread since.

  6. A genomic scale map of genetic diversity in Trypanosoma cruzi

    Directory of Open Access Journals (Sweden)

    Ackermann Alejandro A

    2012-12-01

    Full Text Available Abstract Background: Trypanosoma cruzi, the causal agent of Chagas Disease, affects more than 16 million people in Latin America. The clinical outcome of the disease results from a complex interplay between environmental factors and the genetic background of both the human host and the parasite. However, knowledge of the genetic diversity of the parasite is currently limited to a number of highly studied loci. The availability of a number of genomes from different evolutionary lineages of T. cruzi provides an unprecedented opportunity to look at the genetic diversity of the parasite at a genomic scale. Results: Using a bioinformatic strategy, we have clustered T. cruzi sequence data available in the public domain and obtained multiple sequence alignments in which one or two alleles from the reference CL-Brener were included. These data cover 4 major evolutionary lineages (DTUs): TcI, TcII, TcIII, and the hybrid TcVI. Using this set of alignments we have identified 288,957 high quality single nucleotide polymorphisms and 1,480 indels. In a reduced re-sequencing study we were able to validate ~97% of high-quality SNPs identified in 47 loci. Analysis of how these changes affect encoded protein products showed a 0.77 ratio of synonymous to non-synonymous changes in the T. cruzi genome. We observed 113 changes that introduce or remove a stop codon, some causing significant functional changes, and a number of tri-allelic and tetra-allelic SNPs that could be exploited in strain typing assays. Based on an analysis of the observed nucleotide diversity we show that the T. cruzi genome contains a core set of genes that are under apparent purifying selection. Interestingly, orthologs of known druggable targets show statistically significant lower nucleotide diversity values. Conclusions: This study provides the first look at the genetic diversity of T. cruzi at a genomic scale. The analysis covers an estimated ~60% of the genetic diversity present in the...

  7. First genomic insights into members of a candidate bacterial phylum responsible for wastewater bulking

    Directory of Open Access Journals (Sweden)

    Yuji Sekiguchi

    2015-01-01

    Full Text Available Filamentous cells belonging to the candidate bacterial phylum KSB3 were previously identified as the causative agent of fatal filament overgrowth (bulking) in a high-rate industrial anaerobic wastewater treatment bioreactor. Here, we obtained near complete genomes from two KSB3 populations in the bioreactor, including the dominant bulking filament, using differential coverage binning of metagenomic data. Fluorescence in situ hybridization with 16S rRNA-targeted probes specific for the two populations confirmed that both are filamentous organisms. Genome-based metabolic reconstruction and microscopic observation of the KSB3 filaments in the presence of sugar gradients indicate that both filament types are Gram-negative, strictly anaerobic fermenters capable of non-flagellar based gliding motility, and have a strikingly large number of sensory and response regulator genes. We propose that the KSB3 filaments are highly sensitive to their surroundings and that cellular processes, including those causing bulking, are controlled by external stimuli. The obtained genomes lay the foundation for a more detailed understanding of environmental cues used by KSB3 filaments, which may lead to more robust treatment options to prevent bulking.

  8. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    Energy Technology Data Exchange (ETDEWEB)

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMahon, Katherine D.; Malmstrom, Rex R.

    2014-06-18

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  9. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    Energy Technology Data Exchange (ETDEWEB)

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMahon, Katherine D.; Malmstrom, Rex R.

    2014-05-12

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  10. BFAST: an alignment tool for large scale genome resequencing.

    Directory of Open Access Journals (Sweden)

    Nils Homer

    Full Text Available BACKGROUND: The new generation of massively parallel DNA sequencers, combined with the challenge of whole human genome resequencing, results in the need for rapid and accurate alignment of billions of short DNA sequence reads to a large reference genome. Speed is obviously of great importance, but equally important is maintaining alignment accuracy of short reads, in the 25-100 base range, in the presence of errors and true biological variation. METHODOLOGY: We introduce a new algorithm specifically optimized for this task, as well as a freely available implementation, BFAST, which can align data produced by any of the current sequencing platforms, allows for user-customizable levels of speed and accuracy, supports paired end data, and provides for efficient parallel and multi-threaded computation on a computer cluster. The new method is based on creating flexible, efficient whole genome indexes to rapidly map reads to candidate alignment locations, with arbitrary multiple independent indexes allowed to achieve robustness against read errors and sequence variants. The final local alignment uses a Smith-Waterman method, with gaps to support the detection of small indels. CONCLUSIONS: We compare BFAST to a selection of large-scale alignment tools -- BLAT, MAQ, SHRiMP, and SOAP -- in terms of both speed and accuracy, using simulated and real-world datasets. We show BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods. We show BFAST can align the amount of data needed to fully resequence a human genome, one billion reads, with high sensitivity and accuracy, on a modest computer cluster in less than 24 hours. BFAST is available at http://bfast.sourceforge.net.
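
    The gapped local alignment step mentioned in this abstract can be illustrated with a minimal Smith-Waterman sketch. This is not BFAST's implementation: the function name, the linear gap penalty, and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen only for illustration, and the abstract does not state which scoring scheme BFAST uses.

```python
def smith_waterman(read, ref, match=2, mismatch=-1, gap=-2):
    """Best local alignment of `read` against `ref` (illustrative scoring only).

    Returns (score, aligned_read, aligned_ref); '-' marks a gap.
    """
    rows, cols = len(read) + 1, len(ref) + 1
    # H[i][j] = best score of a local alignment ending at read[i-1], ref[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,   # match or mismatch
                          H[i - 1][j] + gap,       # gap in the reference
                          H[i][j - 1] + gap)       # gap in the read
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until the score drops to 0.
    i, j = best_pos
    out_read, out_ref = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if read[i - 1] == ref[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_read.append(read[i - 1]); out_ref.append(ref[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_read.append(read[i - 1]); out_ref.append('-')
            i -= 1
        else:
            out_read.append('-'); out_ref.append(ref[j - 1])
            j -= 1
    return best, ''.join(reversed(out_read)), ''.join(reversed(out_ref))


# A read missing one base relative to the reference still aligns end to end,
# with a single gap standing in for the small deletion.
score, aligned_read, aligned_ref = smith_waterman("ACGTACGT", "ACGTTACGT")
```

    The quadratic cost of filling the matrix is why an aligner of this kind would apply the step only at candidate locations found through an index, as the abstract describes, rather than against the whole reference.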

  11. Spatial scales of bacterial diversity in cold-water coral reef ecosystems.

    Directory of Open Access Journals (Sweden)

    Sandra Schöttner

    Full Text Available BACKGROUND: Cold-water coral reef ecosystems are recognized as biodiversity hotspots in the deep sea, but insights into their associated bacterial communities are still limited. Deciphering principal patterns of bacterial community variation over multiple spatial scales may however prove critical for a better understanding of factors contributing to cold-water coral reef stability and functioning. METHODOLOGY/PRINCIPAL FINDINGS: Bacterial community structure, as determined by Automated Ribosomal Intergenic Spacer Analysis (ARISA), was investigated with respect to (i) microbial habitat type and (ii) coral species and color, as well as the three spatial components (iii) geomorphologic reef zoning, (iv) reef boundary, and (v) reef location. Communities revealed fundamental differences between coral-generated (branch surface, mucus) and ambient microbial habitats (seawater, sediments). This habitat specificity appeared pivotal for determining bacterial community shifts over all other study levels investigated. Coral-derived surfaces showed species-specific patterns, differing significantly between Lophelia pertusa and Madrepora oculata, but not between L. pertusa color types. Within the reef center, no community distinction corresponded to geomorphologic reef zoning for both coral-generated and ambient microbial habitats. Beyond the reef center, however, bacterial communities varied considerably from local to regional scales, with marked shifts toward the reef periphery as well as between different in- and offshore reef sites, suggesting significant biogeographic imprinting but weak microbe-host specificity. CONCLUSIONS/SIGNIFICANCE: This study presents the first multi-scale survey of bacterial diversity in cold-water coral reefs, spanning a total of five observational levels including three spatial scales. It demonstrates that bacterial communities in cold-water coral reefs are structured by multiple factors acting at different spatial scales, which has...

  12. Modeling Lactococcus lactis using a genome-scale flux model

    Directory of Open Access Journals (Sweden)

    Nielsen Jens

    2005-06-01

    Full Text Available Abstract Background: Genome-scale flux models are useful tools to represent and analyze microbial metabolism. In this work we reconstructed the metabolic network of the lactic acid bacterium Lactococcus lactis and developed a genome-scale flux model able to simulate and analyze network capabilities and whole-cell function under aerobic and anaerobic continuous cultures. Flux balance analysis (FBA) and minimization of metabolic adjustment (MOMA) were used as modeling frameworks. Results: The metabolic network was reconstructed using the annotated genome sequence from L. lactis ssp. lactis IL1403 together with physiological and biochemical information. The established network comprised a total of 621 reactions and 509 metabolites, representing the overall metabolism of L. lactis. Experimental data reported in the literature were used to fit the model to phenotypic observations. Regulatory constraints had to be included to simulate certain metabolic features, such as the shift from homo to heterolactic fermentation. A minimal medium for in silico growth was identified, indicating the requirement of four amino acids in addition to a sugar. Remarkably, de novo biosynthesis of four other amino acids was observed even when all amino acids were supplied, which is in good agreement with experimental observations. Additionally, enhanced metabolic engineering strategies for improved diacetyl producing strains were designed. Conclusion: The L. lactis metabolic network can now be used for a better understanding of lactococcal metabolic capabilities and potential, for the design of enhanced metabolic engineering strategies and for integration with other types of 'omic' data, to assist in finding new information on cellular organization and function.
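
    For readers unfamiliar with the two frameworks named in this abstract, their usual textbook formulations can be sketched as below. The notation is assumed here and does not come from the record: S is the stoichiometric matrix, v the flux vector, c the objective weights (typically selecting the biomass reaction), v^min and v^max the flux bounds, and v^wt a reference wild-type flux distribution. The abstract does not state which objective or bounds the authors used.

```latex
\[
\begin{aligned}
\text{FBA:}\quad  & \max_{v}\; c^{\mathsf{T}} v
  && \text{s.t.}\quad S v = 0,\quad v^{\min} \le v \le v^{\max} \\
\text{MOMA:}\quad & \min_{v}\; \lVert v - v^{\mathrm{wt}} \rVert_2^{2}
  && \text{s.t.}\quad S v = 0,\quad v^{\min} \le v \le v^{\max},\quad
     v_k = 0 \ \text{for each knocked-out reaction } k
\end{aligned}
\]
```

    FBA is a linear program that assumes the cell operates at an optimum, whereas MOMA is a quadratic program that looks for the feasible mutant flux distribution closest to the wild type.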

  13. Cloning human herpes virus 6A genome into bacterial artificial chromosomes and study of DNA replication intermediates

    OpenAIRE

    Borenstein, Ronen; Frenkel, Niza

    2009-01-01

    Cloning of large viral genomes into bacterial artificial chromosomes (BACs) facilitates analyses of viral functions and molecular mutagenesis. Previous derivations of viral BACs involved laborious recombinations within infected cells. We describe a single-step production of viral BACs by direct cloning of unit length genomes, derived from circular or head-to-tail concatemeric DNA replication intermediates. The BAC cloning is independent of intracellular recombinations and DNA packaging constr...

  14. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    DEFF Research Database (Denmark)

    2016-01-01

    redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases...... for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data....

  15. Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains : a web-based resource

    Directory of Open Access Journals (Sweden)

    Vergnaud Gilles

    2004-01-01

    Full Text Available Abstract Background: Polymorphic tandem repeat typing is a new generic technology which has been proved to be very efficient for bacterial pathogens such as B. anthracis, M. tuberculosis, P. aeruginosa, L. pneumophila, Y. pestis. The previously developed tandem repeats database takes advantage of the release of genome sequence data for a growing number of bacteria to facilitate the identification of tandem repeats. The development of an assay then requires the evaluation of tandem repeat polymorphism on well-selected sets of isolates. In the case of major human pathogens, such as S. aureus, more than one strain is being sequenced, so that tandem repeats most likely to be polymorphic can now be selected in silico based on genome sequence comparison. Results: In addition to the previously described general Tandem Repeats Database, we have developed a tool to automatically identify tandem repeats of a different length in the genome sequence of two (or more) closely related bacterial strains. Genome comparisons are pre-computed. The results of the comparisons are parsed in a database, which can be conveniently queried over the internet according to criteria of practical value, including repeat unit length, predicted size difference, etc. Comparisons are available for 16 bacterial species, and the orthopox viruses, including the variola virus and three of its close neighbors. Conclusions: We are presenting an internet-based resource to help develop and perform tandem repeats based bacterial strain typing. The tools accessible at http://minisatellites.u-psud.fr now comprise four parts. The Tandem Repeats Database enables the identification of tandem repeats across entire genomes. The Strain Comparison Page identifies tandem repeats differing between different genome sequences from the same species. The "Blast in the Tandem Repeats Database" facilitates the search for a known tandem repeat and the prediction of amplification product sizes. The "Bacterial...

  16. Predicting effects of structural stress in a genome-reduced model bacterial metabolism

    Science.gov (United States)

    Güell, Oriol; Sagués, Francesc; Serrano, M. Ángeles

    2012-08-01

    Mycoplasma pneumoniae is a human pathogen recently proposed as a genome-reduced model for bacterial systems biology. Here, we study the response of its metabolic network to different forms of structural stress, including removal of individual and pairs of reactions and knockout of genes and clusters of co-expressed genes. Our results reveal a network architecture as robust as that of other model bacteria regarding multiple failures, although less robust against individual reaction inactivation. Interestingly, metabolite motifs associated to reactions can predict the propagation of inactivation cascades and damage amplification effects arising in double knockouts. We also detect a significant correlation between gene essentiality and damages produced by single gene knockouts, and find that genes controlling high-damage reactions tend to be expressed independently of each other, a functional switch mechanism that, simultaneously, acts as a genetic firewall to protect metabolism. Prediction of failure propagation is crucial for metabolic engineering or disease treatment.

  17. Solving the problem of comparing whole bacterial genomes across different sequencing platforms.

    Directory of Open Access Journals (Sweden)

    Rolf S Kaas

    Full Text Available Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due to a validation of all informative sites that are included in the analysis. The procedures are available as web tools.

  18. Solving the problem of comparing whole bacterial genomes across different sequencing platforms.

    Science.gov (United States)

    Kaas, Rolf S; Leekitcharoenphon, Pimlapas; Aarestrup, Frank M; Lund, Ole

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due to a validation of all informative sites that are included in the analysis. The procedures are available as web tools. PMID:25110940

  19. Distinct soil bacterial communities along a small-scale elevational gradient in alpine tundra

    Directory of Open Access Journals (Sweden)

    Congcong eShen

    2015-06-01

    Full Text Available The elevational diversity pattern for microorganisms has received great attention recently but is still understudied, and phylogenetic relatedness is rarely studied for microbial elevational distributions. Using a bar-coded pyrosequencing technique, we examined the biodiversity patterns for soil bacterial communities of tundra ecosystem along 2000–2500 m elevations on Changbai Mountain in China. Bacterial taxonomic richness displayed a linear decreasing trend with increasing elevation. Phylogenetic diversity and mean nearest taxon distance (MNTD) exhibited a unimodal pattern with elevation. Bacterial communities were more phylogenetically clustered than expected by chance at all elevations based on the standardized effect size of MNTD metric. The bacterial communities differed dramatically among elevations, and the community composition was significantly correlated with soil total carbon, total nitrogen, C:N ratio, and dissolved organic carbon. Multiple ordinary least squares regression analysis showed that the observed biodiversity patterns strongly correlated with soil total carbon and C:N ratio. Taken together, this is the first time that a significant bacterial diversity pattern has been observed across a small-scale elevational gradient. Our results indicated that soil carbon and nitrogen contents were the critical environmental factors affecting bacterial elevational distribution in Changbai Mountain tundra. This suggested that ecological niche-based environmental filtering processes related to soil carbon and nitrogen contents could play a dominant role in structuring bacterial communities along the elevational gradient.
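
    The MNTD metric and its standardized effect size, as referred to in this abstract, can be sketched as follows. The function names, the toy distance matrix, and the null model (drawing the same number of taxa at random from the pool) are assumptions made for illustration; the abstract does not say which null model or software the authors used. Under the usual convention, a negative standardized effect size indicates phylogenetic clustering.

```python
import random
import statistics


def mntd(dist, community):
    """Mean nearest taxon distance: average, over the taxa present, of the
    distance to the closest other taxon in the same community."""
    nearest = [min(dist[i][j] for j in community if j != i) for i in community]
    return sum(nearest) / len(nearest)


def ses_mntd(dist, community, n_null=999, seed=0):
    """Standardized effect size of MNTD against random draws from the pool
    (hypothetical null model: same richness, taxa sampled without replacement)."""
    rng = random.Random(seed)
    pool = list(range(len(dist)))
    observed = mntd(dist, community)
    null = [mntd(dist, rng.sample(pool, len(community))) for _ in range(n_null)]
    return (observed - statistics.mean(null)) / statistics.stdev(null)


# Toy pairwise phylogenetic distances: taxa 0 and 1 are close relatives,
# as are taxa 2 and 3; the two pairs are distant from each other.
toy_dist = [
    [0, 1, 10, 10],
    [1, 0, 10, 10],
    [10, 10, 0, 1],
    [10, 10, 1, 0],
]
clustered = [0, 1]                       # two close relatives
effect = ses_mntd(toy_dist, clustered)   # negative: closer than random draws
```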

  20. Dynamic bacterial communities on reverse-osmosis membranes in a full-scale desalination plant.

    Science.gov (United States)

    Manes, C-L de O; West, N; Rapenne, S; Lebaron, P

    2011-01-01

    To better understand biofouling of seawater reverse osmosis (SWRO) membranes, bacterial diversity was characterized in the intake water, in subsequently pretreated water and on SWRO membranes from a full-scale desalination plant (FSDP) during a 9 month period. 16S rRNA gene fingerprinting and sequencing revealed that bacterial communities in the water samples and on the SWRO membranes were very different. For the different sampling dates, the bacterial diversity of the active and the total bacterial fractions of the water samples remained relatively stable over the sampling period whereas the bacterial community structure on the four SWRO membrane samples was significantly different. The richness and evenness of the SWRO membrane bacterial communities increased with usage time with an increase in the Shannon diversity index of 2.2 to 3.7. In the oldest SWRO membrane (330 days), no single operational taxonomic unit (OTU) dominated and the majority of the OTUs fell into the Alphaproteobacteria or the Planctomycetes. In striking contrast, a Betaproteobacteria OTU affiliated to the genus Ideonella was dominant and exclusively found in the membrane used for the shortest time (10 days). This suggests that bacteria belonging to this genus could be one of the primary colonizers of the SWRO membrane. Knowledge of the dominant bacterial species on SWRO membranes and their dynamics should help guide culture studies for physiological characterization of biofilm forming species. PMID:21108068
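
    The Shannon diversity index cited in this abstract (rising from 2.2 to 3.7 with membrane age) can be computed from OTU counts as in the sketch below. The use of the natural logarithm and the example counts are assumptions for illustration; the abstract does not state the log base or give the underlying count data. Pielou's evenness is included because the abstract discusses evenness alongside richness.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_evenness(counts):
    """Pielou's evenness J' = H' / ln(S), where S is the number of observed OTUs."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0
    return shannon_index(counts) / math.log(richness)


# Hypothetical OTU count tables, not data from the study:
# one community dominated by a single OTU, one spread evenly over many OTUs.
dominated = [97, 1, 1, 1]
even = [10] * 40

h_dominated = shannon_index(dominated)
h_even = shannon_index(even)   # equals ln(40) for a perfectly even community
```

    A community dominated by one OTU, like the 10-day membrane described above, gives a low index, while a community with no dominant OTU gives a value near the logarithm of its richness.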

  2. Transgenic Rice Plants Harboring Genomic DNA from Zizania latifolia Confer Bacterial Blight Resistance

    Institute of Scientific and Technical Information of China (English)

    SHEN Wei-wei; SONG Cheng-li; CHEN Jie; FU Ya-ping; WU Jian-li; JIANG Shao-mei

    2011-01-01

    Based on the sequence of a resistance gene analog FZ14 derived from Zizania latifolia (Griseb.), a pair of specific PCR primers FZ14P1/FZ14P2 was designed to isolate a candidate disease resistance gene. The pooled-PCR approach was adopted using the primer pair to screen a genomic transformation-competent artificial chromosome (TAC) library derived from Z. latifolia. A positive TAC clone (ZR1) was obtained and confirmed by sequence analysis. The results indicated that ZR1 consisted of conserved motifs similar to P-loop (kinase 1a), kinase 2, kinase 3a and GLPL (Gly-Leu-Pro-Leu), suggesting that it could be a portion of an NBS-LRR type of resistance gene. Using Agrobacterium-mediated transformation of Nipponbare mature embryos, a total of 48 independent transgenic T0 plants were obtained. Among them, 36 plants were highly resistant to the virulent bacterial blight strain PXO71. The results indicate that ZR1 contains at least one functional bacterial blight resistance gene.

  3. Expression of lysozymes from Erwinia amylovora phages and Erwinia genomes and inhibition by a bacterial protein.

    Science.gov (United States)

    Müller, Ina; Gernold, Marina; Schneider, Bernd; Geider, Klaus

    2012-01-01

    Genes coding for lysozyme-inhibiting proteins (Ivy) were cloned from the chromosomes of the plant pathogens Erwinia amylovora and Erwinia pyrifoliae. The product interfered not only with activity of hen egg white lysozyme, but also with an enzyme from E. amylovora phage ΦEa1h. We have expressed lysozyme genes from the genomes of three Erwinia species in Escherichia coli. The lysozymes expressed from genes of the E. amylovora phages ΦEa104 and ΦEa116, Erwinia chromosomes and Arabidopsis thaliana were not affected by Ivy. The enzyme from bacteriophage ΦEa1h was fused at the N- or C-terminus to other peptides. Compared to the intact lysozyme, a His-tag reduced its lytic activity about 10-fold and larger fusion proteins abolished activity completely. Specific protease cleavage restored lysozyme activity of a GST-fusion. The bacteriophage-encoded lysozymes were more active than the enzymes from bacterial chromosomes. Viral lyz genes were inserted into a broad-host range vector, and transfer to E. amylovora inhibited cell growth. Inserted in the yeast Pichia pastoris, the ΦEa1h-lysozyme was secreted and also inhibited by Ivy. Here we describe expression of unrelated cloned 'silent' lyz genes from Erwinia chromosomes and a novel interference of bacterial Ivy proteins with a viral lysozyme.

  4. Selection for Unequal Densities of Sigma70 Promoter-like Signals in Different Regions of Large Bacterial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Huerta, Araceli M.; Francino, M. Pilar; Morett, Enrique; Collado-Vides, Julio

    2006-03-01

    The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. In Escherichia coli, we have established a sequence pattern that distinguishes regulatory from nonregulatory regions. The density of promoter-like sequences, that are recognizable by RNA polymerase and may function as potential promoters, is high within regulatory regions, in contrast to coding regions and regions located between convergently-transcribed genes. Moreover, functional promoter sites identified experimentally are often found in the subregions of highest density of promoter-like signals, even when individual sites with higher binding affinity for RNA polymerase exist elsewhere within the regulatory region. In order to investigate the generality of this pattern, we have used position weight matrices describing the -35 and -10 promoter boxes of E. coli to search for these motifs in 43 additional genomes belonging to most established bacterial phyla, after specific calibration of the matrices according to the base composition of the noncoding regions of each genome. We have found that all bacterial species analyzed contain similar promoter-like motifs, and that, in most cases, these motifs follow the same genomic distribution observed in E. coli. Differential densities between regulatory and nonregulatory regions are detectable in most bacterial genomes, with the exception of those that have experienced evolutionary extreme genome reduction. Thus, the phylogenetic distribution of this pattern mirrors that of genes and other genomic features that require weak selection to be effective in order to persist. On this basis, we suggest that the loss of differential densities in the reduced genomes of host-restricted pathogens and symbionts is the outcome of a process of genome degradation resulting from the decreased efficiency of purifying selection in highly structured small populations. This implies that the differential
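
    The abstract above describes scanning genomes with position weight matrices that are calibrated to the base composition of each genome's noncoding regions. The Python below is a rough sketch of that general idea under stated assumptions: the count matrix is a hypothetical TATAAT-like -10 box, the test sequence is invented, and the log2-odds scoring with a pseudocount is one common formulation rather than the authors' actual procedure. Only one strand and one box are scanned.

```python
import math

BASES = "ACGT"


def base_composition(sequence):
    """Return the relative frequency of each base in a sequence."""
    seq = sequence.upper()
    n = sum(seq.count(b) for b in BASES)
    return {b: seq.count(b) / n for b in BASES}


def log_odds_pwm(count_columns, background, pseudocount=1.0):
    """Turn per-position base counts into log2-odds scores against a background."""
    pwm = []
    for column in count_columns:
        total = sum(column[b] for b in BASES) + pseudocount * len(BASES)
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background[b])
            for b in BASES
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the matrix width; return (start, score) pairs."""
    seq = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(ch not in BASES for ch in window):
            continue  # skip windows that contain ambiguous characters
        hits.append((start, sum(pwm[i][ch] for i, ch in enumerate(window))))
    return hits


# Hypothetical counts for a TATAAT-like -10 box (assumed, not from the paper).
counts = [
    {"A": 2, "C": 1, "G": 1, "T": 16},
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 3, "C": 2, "G": 2, "T": 13},
    {"A": 13, "C": 2, "G": 2, "T": 3},
    {"A": 12, "C": 3, "G": 2, "T": 3},
    {"A": 1, "C": 1, "G": 1, "T": 17},
]

# Invented GC-rich "noncoding" sequence used for both calibration and scanning.
noncoding = "GCGCGTTGACAGCGCGCCGGCGCGCGTATAATGCGCGC"
background = base_composition(noncoding)
pwm = log_odds_pwm(counts, background)
hits = scan(noncoding, pwm)
best_start, best_score = max(hits, key=lambda h: h[1])
```

    Dividing by the background frequencies is what makes scores comparable between genomes of different base composition: in a GC-rich genome an AT-rich match is rarer by chance and therefore scores higher.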

  5. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences

    NARCIS (Netherlands)

    Medema, Marnix H.; Blin, Kai; Cimermancic, Peter; de Jager, Victor; Zakrzewski, Piotr; Fischbach, Michael A.; Weber, Tilmann; Takano, Eriko; Breitling, Rainer

    2011-01-01

    Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide var

  6. Complete Genome Sequence of Gluconacetobacter hansenii Strain NQ5 (ATCC 53582), an Efficient Producer of Bacterial Cellulose.

    Science.gov (United States)

    Pfeffer, Sarah; Mehta, Kalpa; Brown, R Malcolm

    2016-08-11

    This study reports the release of the complete nucleotide sequence of Gluconacetobacter hansenii strain NQ5 (ATCC 53582). This strain was isolated by R. Malcolm Brown, Jr. in a sugar mill in North Queensland, Australia, and is an efficient producer of bacterial cellulose. The elucidation of the genome will contribute to the study of the molecular mechanisms necessary for cellulose biosynthesis.

  7. Complete Genome Sequence of Japanese Erwinia Strain Ejp617, a Bacterial Shoot Blight Pathogen of Pear

    OpenAIRE

    Park, Duck Hwan; Thapa, Shree Prasad; Choi, Beom-Soon; Kim, Won-Sik; Hur, Jang Hyun; Cho, Jun Mo; Lim, Jong-Sung; Choi, Ik-Young; Lim, Chun Keun

    2010-01-01

    The Japanese Erwinia strain Ejp617 is a plant pathogen that causes bacterial shoot blight of pear in Japan. Here, we report the complete genome sequence of strain Ejp617 isolated from Nashi pears in Japan to provide further valuable insight among related Erwinia species.

  8. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences

    NARCIS (Netherlands)

    Medema, M.H.; Blin, K.; Cimermancic, P.; Jager, de V.C.L.; Zakrzewski, P.; Fischbach, M.A.; Weber, T.; Takano, E.; Breitling, R.

    2011-01-01

    Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide var

  9. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences.

    NARCIS (Netherlands)

    Medema, M.H.; Blin, K.; Cimermancic, P.; Jager, V.C.L. de; Zakrzewski, P.; Fischbach, M.A.; Weber, T.; Takano, E.; Breitling, R.

    2011-01-01

    Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide var

  10. Complete Genome Sequence of Gluconacetobacter hansenii Strain NQ5 (ATCC 53582), an Efficient Producer of Bacterial Cellulose.

    Science.gov (United States)

    Pfeffer, Sarah; Mehta, Kalpa; Brown, R Malcolm

    2016-01-01

    This study reports the release of the complete nucleotide sequence of Gluconacetobacter hansenii strain NQ5 (ATCC 53582). This strain was isolated by R. Malcolm Brown, Jr. in a sugar mill in North Queensland, Australia, and is an efficient producer of bacterial cellulose. The elucidation of the genome will contribute to the study of the molecular mechanisms necessary for cellulose biosynthesis. PMID:27516505

  12. SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals.

    Science.gov (United States)

    Nikolaichik, Yevgeny; Damienikan, Aliaksandr U

    2016-01-01

    The majority of bacterial genome annotations are currently automated and based on a 'gene by gene' approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn't fit with regulatory information allowed us to correct product and gene names for over 300 loci. PMID:27257541

  13. SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals

    Science.gov (United States)

    Damienikan, Aliaksandr U.

    2016-01-01

    The majority of bacterial genome annotations are currently automated and based on a ‘gene by gene’ approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn’t fit with regulatory information allowed us to correct product and gene names for over 300 loci. PMID:27257541

  14. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    International Nuclear Information System (INIS)

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  15. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    Energy Technology Data Exchange (ETDEWEB)

    Arneodo, Alain, E-mail: alain.arneodo@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Vaillant, Cedric, E-mail: cedric.vaillant@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Audit, Benjamin, E-mail: benjamin.audit@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Argoul, Francoise, E-mail: francoise.argoul@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); D'Aubenton-Carafa, Yves, E-mail: daubenton@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France); Thermes, Claude, E-mail: claude.thermes@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France)

    2011-02-15

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  16. Current state of genome-scale modeling in filamentous fungi

    DEFF Research Database (Denmark)

    Brandl, Julian; Andersen, Mikael Rørdam

    2015-01-01

    The group of filamentous fungi contains important species used in industrial biotechnology for acid, antibiotics and enzyme production. Their unique lifestyle turns these organisms into a valuable genetic reservoir of new natural products and biomass degrading enzymes that has not been used to full...... testing them in vivo. The increasing availability of high quality models and molecular biological tools for manipulating filamentous fungi renders the model-guided engineering of these fungal factories possible with comprehensive metabolic networks. A typical fungal model contains on average 1138 unique...... metabolic reactions and 1050 ORFs, making them a vast knowledge-base of fungal metabolism. In the present review we focus on the current state as well as potential future applications of genome-scale models in filamentous fungi....

  17. Next-generation genome-scale models for metabolic engineering

    DEFF Research Database (Denmark)

    King, Zachary A.; Lloyd, Colton J.; Feist, Adam M.;

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods have become widely used tools for metabolic engineering in both academic and industrial laboratories. By employing a genome-scale in silico representation of the metabolic network of a host organism, COBRA methods can be used to predict...... optimal genetic modifications that improve the rate and yield of chemical production. A new generation of COBRA models and methods is now being developed. -. encompassing many biological processes and simulation strategies. -. and next-generation models enable new types of predictions. Here, three key...... examples of applying COBRA methods to strain optimization are presented and discussed. Then, an outlook is provided on the next generation of COBRA models and the new types of predictions they will enable for systems metabolic engineering....

  18. Optimization of Mutation Pressure in Relation to Properties of Protein-Coding Sequences in Bacterial Genomes.

    Directory of Open Access Journals (Sweden)

    Paweł Błażej

    Most mutations are deleterious and require energetically costly repairs. Therefore, it seems that any minimization of mutation rate is beneficial. On the other hand, mutations generate genetic diversity indispensable for evolution and adaptation of organisms to changing environmental conditions. Thus, it is expected that a spontaneous mutational pressure should be an optimal compromise between these two extremes. In order to study the optimization of the pressure, we compared mutational transition probability matrices from bacterial genomes with artificial matrices fulfilling the same general features as the real ones, e.g., the stationary distribution and the speed of convergence to the stationarity. The artificial matrices were optimized on real protein-coding sequences based on an Evolutionary Strategies approach to minimize or maximize the probability of non-synonymous substitutions and costs of amino acid replacements depending on their physicochemical properties. The results show that the empirical matrices have a tendency to minimize the effects of mutations rather than maximize their costs on the amino acid level. They were also similar to the optimized artificial matrices in the nucleotide substitution pattern, especially the high transitions/transversions ratio. We observed no substantial differences between the effects of mutational matrices on protein-coding sequences in genomes under study in respect of differently replicated DNA strands, mutational cost types and properties of the referenced artificial matrices. The findings indicate that the empirical mutational matrices are rather adapted to minimize mutational costs in the studied organisms in comparison to other matrices with similar mathematical constraints.
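
    Two properties named in the abstract above, the stationary distribution of a nucleotide transition probability matrix and its transitions/transversions ratio, can be illustrated with a short Python sketch. The matrix is a hypothetical Kimura-style example rather than one of the empirical matrices from the study, and power iteration is simply one convenient way to approximate the stationary distribution.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def stationary_distribution(P, tol=1e-12, max_iter=100000):
    """Approximate pi with pi = pi P for a row-stochastic matrix by power iteration."""
    n = len(P)
    pi = [1.0] + [0.0] * (n - 1)  # start far from equilibrium on purpose
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


def ts_tv_ratio(P, pi):
    """Ratio of transition to transversion flux at stationarity."""
    ts = tv = 0.0
    for i, a in enumerate(BASES):
        for j, b in enumerate(BASES):
            if i == j:
                continue
            flux = pi[i] * P[i][j]
            # Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
            if (a in PURINES) == (b in PURINES):
                ts += flux
            else:
                tv += flux
    return ts / tv


# Hypothetical per-step substitution probabilities (rows and columns in ACGT order):
# 0.02 for each transition, 0.005 for each transversion.
P = [
    [0.970, 0.005, 0.020, 0.005],  # from A
    [0.005, 0.970, 0.005, 0.020],  # from C
    [0.020, 0.005, 0.970, 0.005],  # from G
    [0.005, 0.020, 0.005, 0.970],  # from T
]

pi = stationary_distribution(P)
ratio = ts_tv_ratio(P, pi)
```

    Because this example matrix is symmetric, the stationary distribution is uniform and the ratio reduces to 0.02 / (2 × 0.005) = 2. Empirical matrices are generally asymmetric, so their stationary base composition is skewed.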

  19. Bacterial Societies: Cooperation, Colonization, and Competition in Micro-Scale Ecosystems

    NARCIS (Netherlands)

    Hol, F.J.H.

    2014-01-01

    In this thesis, I describe experiments aimed at understanding bacterial population dynamics in ecosystems that are spatially structured at the micro-scale. We combine microfabrication and microfluidics to create synthetic ecosystems that have a complex yet well-defined geometry and chemical composit

  20. Bacterial community structure of a full-scale biofilter treating pig house exhaust air

    DEFF Research Database (Denmark)

    Kristiansen, Anja; Pedersen, Kristina Hadulla; Nielsen, Per Halkjær;

    2011-01-01

    Biological air filters represent a promising tool for treating emissions of ammonia and odor from pig facilities. Quantitative fluorescence in situ hybridization (FISH) and 16S rRNA gene sequencing were used to investigate the bacterial community structure and diversity in a full-scale biofilter...

  1. LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Task 1.4.2 Report

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T; Borucki, M; Lam, M; Lenhoff, R; Vitalis, E

    2010-01-26

    Good progress has been made on both bacterial and viral sequencing by the TMTI centers. While access to appropriate samples is a limiting factor to throughput, excellent progress has been made with respect to getting agreements in place with key sources of relevant materials. Sharing of sequenced genomes funded by TMTI has been extremely limited to date. The April 2010 exercise should force a resolution to this, but additional managerial pressures may be needed to ensure that rapid sharing of TMTI-funded sequencing occurs, regardless of collaborator constraints concerning ultimate publication(s). Policies to permit TMTI-internal rapid sharing of sequenced genomes should be written into all TMTI agreements with collaborators now being negotiated. TMTI needs to establish a Web-based system for tracking samples destined for sequencing. This includes metadata on sample origins and contributor, information on sample shipment/receipt, prioritization by TMTI, assignment to one or more sequencing centers (including possible TMTI-sponsored sequencing at a contributor site), and status history of the sample sequencing effort. While this system could be a component of the AFRL system, it is not part of any current development effort. Policy and standardized procedures are needed to ensure appropriate verification of all TMTI samples prior to the investment in sequencing. PCR, arrays, and classical biochemical tests are examples of potential verification methods. Verification is needed to detect mislabeled, degraded, mixed or contaminated samples. Regular QC exercises are needed to ensure that the TMTI-funded centers are meeting all standards for producing quality genomic sequence data.

  2. Bacterial Artificial Chromosomes: A Functional Genomics Tool for the Study of Positive-strand RNA Viruses.

    Science.gov (United States)

    Yun, Sang-Im; Song, Byung-Hak; Kim, Jin-Kyoung; Lee, Young-Min

    2015-01-01

    Reverse genetics, an approach to rescue infectious virus entirely from a cloned cDNA, has revolutionized the field of positive-strand RNA viruses, whose genomes have the same polarity as cellular mRNA. The cDNA-based reverse genetics system is a seminal method that enables direct manipulation of the viral genomic RNA, thereby generating recombinant viruses for molecular and genetic studies of both viral RNA elements and gene products in viral replication and pathogenesis. It also provides a valuable platform that allows the development of genetically defined vaccines and viral vectors for the delivery of foreign genes. For many positive-strand RNA viruses such as Japanese encephalitis virus (JEV), however, the cloned cDNAs are unstable, posing a major obstacle to the construction and propagation of the functional cDNA. Here, the present report describes the strategic considerations in creating and amplifying a genetically stable full-length infectious JEV cDNA as a bacterial artificial chromosome (BAC) using the following general experimental procedures: viral RNA isolation, cDNA synthesis, cDNA subcloning and modification, assembly of a full-length cDNA, cDNA linearization, in vitro RNA synthesis, and virus recovery. This protocol provides a general methodology applicable to cloning full-length cDNA for a range of positive-strand RNA viruses, particularly those with a genome of >10 kb in length, into a BAC vector, from which infectious RNAs can be transcribed in vitro with a bacteriophage RNA polymerase. PMID:26780115

  3. Repetitive genome elements in a European corn borer, Ostrinia nubilalis, bacterial artificial chromosome library were indicated by bacterial artificial chromosome end sequencing and development of sequence tag site markers: implications for lepidopteran genomic research.

    Science.gov (United States)

    Coates, Brad S; Sumerford, Douglas V; Hellmich, Richard L; Lewis, Leslie C

    2009-01-01

    The European corn borer, Ostrinia nubilalis, is a serious pest of food, fiber, and biofuel crops in Europe, North America, and Asia and a model system for insect olfaction and speciation. A bacterial artificial chromosome library constructed for O. nubilalis contains 36 864 clones with an estimated average insert size of ≥120 kb and genome coverage of 8.8-fold. Screening OnB1 clones comprising approximately 2.76 genome equivalents determined the physical position of 24 sequence tag site markers, including markers linked to ecologically important and Bacillus thuringiensis toxin resistance traits. OnB1 bacterial artificial chromosome end sequence reads (GenBank dbGSS accessions ET217010 to ET217273) showed homology to annotated genes or expressed sequence tags and identified repetitive genome elements, O. nubilalis miniature subterminal inverted repeat transposable elements (OnMITE01 and OnMITE02), and ezi-like long interspersed nuclear elements. Mobility of OnMITE01 was demonstrated by the presence or absence in O. nubilalis of introns at two different loci. A (GTCT)n tetranucleotide repeat at the 5' ends of OnMITE01 and OnMITE02 is evidence for transposon-mediated movement of lepidopteran microsatellite loci. The number of repetitive elements in lepidopteran genomes will affect genome assembly and marker development. Single-locus sequence tag site markers described here have downstream application for integration within linkage maps and comparative genomic studies. PMID:19132072
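
    The library statistics quoted in the abstract above are linked by the usual relation coverage = (number of clones × mean insert size) / genome size. The Python below is a small worked example of that arithmetic. It assumes the insert size is exactly 120 kb, whereas the abstract reports it only as a lower bound, so the derived figures are rough illustrations and not values stated by the authors. The function names are hypothetical.

```python
def implied_genome_size_kb(n_clones, mean_insert_kb, coverage_fold):
    """Genome size implied by coverage = n_clones * insert / genome_size."""
    return n_clones * mean_insert_kb / coverage_fold


def clones_for_equivalents(n_clones, coverage_fold, genome_equivalents):
    """Number of library clones that make up a given number of genome equivalents."""
    clones_per_equivalent = n_clones / coverage_fold
    return clones_per_equivalent * genome_equivalents


# Figures taken from the abstract; the insert size is treated as exactly
# 120 kb here, which is an assumption (it is reported as a minimum).
n_clones = 36864
mean_insert_kb = 120.0
coverage_fold = 8.8

genome_kb = implied_genome_size_kb(n_clones, mean_insert_kb, coverage_fold)
screened_clones = clones_for_equivalents(n_clones, coverage_fold, 2.76)
```

    Under that assumption the figures imply a genome of roughly 500 Mb, and the 2.76 genome equivalents that were screened correspond to somewhat under a third of the library.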

  5. Genome sequencing and systems biology analysis of a lipase-producing bacterial strain.

    Science.gov (United States)

    Li, N; Li, D D; Zhang, Y Z; Yuan, Y Z; Geng, H; Xiong, L; Liu, D L

    2016-01-01

    Lipase-producing bacteria are naturally-occurring, industrially-relevant microorganisms that produce lipases, which can be used to synthesize biodiesel from waste oils. The efficiency of lipase expression varies between various microbial strains. Therefore, strains that can produce lipases with high efficiency must be screened, and the conditions of lipase metabolism and optimization of the production process in a given environment must be thoroughly studied. A high efficiency lipase-producing strain was isolated from the sediments of Jinsha River, identified by 16S rRNA sequence analysis as Serratia marcescens, and designated as HS-L5. A schematic diagram of the genome sequence was constructed by high-throughput genome sequencing. A series of genes related to lipid degradation were identified by functional gene annotation through sequence homology analysis. A genome-scale metabolic model of HS-ML5 was constructed using systems biology techniques. The model consisted of 1722 genes and 1567 metabolic reactions. The topological graph of the genome-scale metabolic model was compared to that of conventional metabolic pathways using a visualization software and KEGG database. The basic components and boundaries of the tributyrin degradation subnetwork were determined, and its flux balance analyzed using Matlab and COBRA Toolbox to simulate the effects of different conditions on the catalytic efficiency of lipases produced by HS-ML5. We proved that the catalytic activity of microbial lipases was closely related to the carbon metabolic pathway. As production and catalytic efficiency of lipases varied greatly with the environment, the catalytic efficiency and environmental adaptability of microbial lipases can be improved by proper control of the production conditions. PMID:27050954

  6. Mapping condition-dependent regulation of metabolism in yeast through genome-scale modeling

    DEFF Research Database (Denmark)

    Österlund, Tobias; Nookaew, Intawat; Bordel, Sergio;

    2013-01-01

    ABSTRACT: BACKGROUND: The genome-scale metabolic model of Saccharomyces cerevisiae, first presented in 2003, was the first genome-scale network reconstruction for a eukaryotic organism. Since then continuous efforts have been made in order to improve and expand the yeast metabolic network. RESULT...

  7. Ultrastructural and molecular characterization of a bacterial symbiosis in the ecologically important scale insect family Coelostomidiidae.

    Science.gov (United States)

    Dhami, Manpreet K; Turner, Adrian P; Deines, Peter; Beggs, Jacqueline R; Taylor, Michael W

    2012-09-01

    Scale insects are important ecologically and as agricultural pests. The majority of scale insect taxa feed exclusively on plant phloem sap, which is carbon rich but deficient in essential amino acids. This suggests that, as seen in the related aphids and psyllids, scale insect nutrition might also depend upon bacterial symbionts, yet very little is known about scale insect-bacteria symbioses. We report here the first identification and molecular characterization of symbiotic bacteria associated with the New Zealand giant scale Coelostomidia wairoensis, using fluorescence in situ hybridization (FISH), transmission electron microscopy (TEM) and 16S rRNA gene-based analysis. Dissection and FISH confirmed the location of the bacteria in large, paired, multilobate organs in the abdominal region of the insect. TEM indicated that the dominant pleomorphic bacteria were confined to bacteriocytes in the sheath-enclosed bacteriome. Phylogenetic analysis revealed the presence of three distinct bacterial types, the bacteriome-associated B-symbiont (Bacteroidetes), an Erwinia-related symbiont (Gammaproteobacteria) and Wolbachia sp. (Alphaproteobacteria). This study extends the current knowledge of scale insect symbionts and is the first microbiological investigation of the ecologically important coelostomidiid scales.

  8. Construction of a nurse shark (Ginglymostoma cirratum) bacterial artificial chromosome (BAC) library and a preliminary genome survey

    Directory of Open Access Journals (Sweden)

    Inoko Hidetoshi

    2006-05-01

    Background: Sharks are members of the taxonomic class Chondrichthyes, the oldest living jawed vertebrates. Genomic studies of this group, in comparison to representative species in other vertebrate taxa, will allow us to theorize about the fundamental genetic, developmental, and functional characteristics in the common ancestor of all jawed vertebrates. Aims: In order to obtain mapping and sequencing data for comparative genomics, we constructed a bacterial artificial chromosome (BAC) library for the nurse shark, Ginglymostoma cirratum. Results: The BAC library consists of 313,344 clones with an average insert size of 144 kb, covering ~4.5 × 10^10 bp and thus providing an 11-fold coverage of the haploid genome. BAC end sequence analyses revealed, in addition to LINEs and SINEs commonly found in other animal and plant genomes, two new groups of nurse shark-specific repetitive elements, NSRE1 and NSRE2, that seem to be major components of the nurse shark genome. Screening the library with single-copy or multi-copy gene probes showed 6–28 primary positive clones per probe of which 50–90% were true positives, demonstrating that the BAC library is representative of the different regions of the nurse shark genome. Furthermore, some BAC clones contained multiple genes, making physical mapping feasible. Conclusion: We have constructed a deep-coverage, high-quality, large insert, and publicly available BAC library for a cartilaginous fish. It will be very useful to the scientific community interested in shark genomic structure, comparative genomics, and functional studies. We found two new groups of repetitive elements specific to the nurse shark genome, which may contribute to the architecture and evolution of the nurse shark genome.
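
    The coverage figures in the abstract above follow from simple arithmetic, which can be checked with a short sketch. The function names are illustrative only, and the haploid genome size is not stated in the abstract; it is merely what the quoted numbers imply.

```python
# Figures quoted in the abstract above.
clones = 313_344
mean_insert_bp = 144_000   # 144 kb average insert size
fold_coverage = 11         # reported coverage of the haploid genome


def library_bp(n_clones, insert_bp):
    """Total cloned DNA held in the library, in base pairs."""
    return n_clones * insert_bp


def implied_genome_bp(total_bp, coverage):
    """Haploid genome size implied by a stated fold coverage."""
    return total_bp / coverage


total = library_bp(clones, mean_insert_bp)
genome = implied_genome_bp(total, fold_coverage)
print(f"library: {total:.2e} bp")
print(f"implied haploid genome: {genome / 1e9:.1f} Gb")
```

    The product of clone count and mean insert size reproduces the ~4.5 × 10^10 bp quoted in the abstract; dividing by the coverage gives only a rough genome-size estimate, since the abstract rounds both numbers.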

  9. Bacterial Societies: Cooperation, Colonization, and Competition in Micro-Scale Ecosystems

    OpenAIRE

    Hol, F.J.H.

    2014-01-01

    In this thesis, I describe experiments aimed at understanding bacterial population dynamics in ecosystems that are spatially structured at the micro-scale. We combine microfabrication and microfluidics to create synthetic ecosystems that have a complex yet well-defined geometry and chemical composition. Bacteria that inhabit such ecosystems can be observed at high spatiotemporal resolution using fluorescence microscopy. Using this experimental approach we have gained deeper insight into diver...

  10. Complete Genome Sequence of Cell Culture-Attenuated Guinea Pig Cytomegalovirus Cloned as an Infectious Bacterial Artificial Chromosome

    OpenAIRE

    Yang, Dongmei; Alam, Zohaib; Cui, Xiaohong; Chen, Michael; Sherrod, Carly J.; McVoy, Michael A.; Schleiss, Mark R.; Dittmer, Dirk P

    2014-01-01

    The complete genome sequence of attenuated guinea pig cytomegalovirus cloned as bacterial artificial chromosome N13R10 was determined. Comparison to pathogenic salivary gland-derived virus revealed 13 differences, 1 of which disrupted overlapping open reading frames encoding GP129 and GP130. Attenuation of N13R10 may arise from an inability to express GP129 and/or GP130.

  11. Dissecting the energy metabolism in Mycoplasma pneumoniae through genome-scale metabolic modeling

    NARCIS (Netherlands)

    Wodke, J.A.; Puchalka, J.; Lluch-Senar, M.; Marcos, J.; Yus, E.; Godinho, M.; Gutierrez-Gallego, R.; Martins Dos Santos, V.A.P.; Serrano, L.; Klipp, E.; Maier, T.

    2013-01-01

    Mycoplasma pneumoniae, a threatening pathogen with a minimal genome, is a model organism for bacterial systems biology for which substantial experimental information is available. With the goal of understanding the complex interactions underlying its metabolism, we analyzed and characterized the met

  12. Genome sequence and plasmid transformation of the model high-yield bacterial cellulose producer Gluconacetobacter hansenii ATCC 53582

    Science.gov (United States)

    Florea, Michael; Reeve, Benjamin; Abbott, James; Freemont, Paul S.; Ellis, Tom

    2016-03-01

    Bacterial cellulose is a strong, highly pure form of cellulose that is used in a range of applications in industry, consumer goods and medicine. Gluconacetobacter hansenii ATCC 53582 is one of the highest reported bacterial cellulose producing strains and has been used as a model organism in numerous studies of bacterial cellulose production and studies aiming to increased cellulose productivity. Here we present a high-quality draft genome sequence for G. hansenii ATCC 53582 and find that in addition to the previously described cellulose synthase operon, ATCC 53582 contains two additional cellulose synthase operons and several previously undescribed genes associated with cellulose production. In parallel, we also develop optimized protocols and identify plasmid backbones suitable for transformation of ATCC 53582, albeit with low efficiencies. Together, these results provide important information for further studies into cellulose synthesis and for future studies aiming to genetically engineer G. hansenii ATCC 53582 for increased cellulose productivity.

  13. FN-Identify: Novel Restriction Enzymes-Based Method for Bacterial Identification in Absence of Genome Sequencing

    Directory of Open Access Journals (Sweden)

    Mohamed Awad

    2015-01-01

    Sequencing and restriction analysis of genes like 16S rRNA and HSP60 are intensively used for molecular identification in microbial communities. With the aid of rapid progress in bioinformatics, genome sequencing became the method of choice for bacterial identification. However, genome sequencing technology is still out of reach in developing countries. In this paper, we propose FN-Identify, a sequencing-free method for bacterial identification. FN-Identify exploits the gene sequence data available in GenBank and other databases and the two algorithms that we developed, CreateScheme and GeneIdentify, to create a restriction enzyme-based identification scheme. FN-Identify was tested using three different and diverse bacterial populations (members of Lactobacillus, Pseudomonas, and Mycobacterium groups) in an in silico analysis using restriction enzymes and sequences of the 16S rRNA gene. The analysis of the restriction maps of the members of the three groups, using the fragment number information only or along with fragment sizes, successfully identified all of the members of the three groups using a minimum of four and a maximum of eight restriction enzymes. Our results demonstrate the utility and accuracy of the FN-Identify method and its two algorithms as an alternative method that uses standard microbiology laboratory techniques when genome sequencing is not available.
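
    The idea of identifying taxa by restriction fragment numbers can be illustrated with a minimal sketch. This is not the CreateScheme or GeneIdentify algorithm, whose details the abstract does not give; the three toy sequences and taxon names are hypothetical, and only the enzyme recognition sites (EcoRI, HaeIII, AluI) are real. It assumes linear molecules and complete digestion.

```python
# Recognition sites of three real restriction enzymes.
SITES = {"EcoRI": "GAATTC", "HaeIII": "GGCC", "AluI": "AGCT"}

# Hypothetical toy sequences standing in for 16S rRNA gene sequences.
TAXA = {
    "taxon_A": "ATGAATTCGGCCTTAGCTAA",
    "taxon_B": "ATGGCCTTGGCCAAGCTTAA",
    "taxon_C": "ATGAATTCAAGAATTCTTAA",
}


def fragment_count(seq, site):
    """Fragments from complete digestion of a linear molecule: cuts + 1."""
    cuts = 0
    start = seq.find(site)
    while start != -1:
        cuts += 1
        start = seq.find(site, start + 1)
    return cuts + 1


def fingerprint(seq, enzymes):
    """Tuple of fragment numbers, one entry per enzyme, in the given order."""
    return tuple(fragment_count(seq, SITES[e]) for e in enzymes)


enzymes = ["EcoRI", "HaeIII", "AluI"]
prints = {name: fingerprint(seq, enzymes) for name, seq in TAXA.items()}
# The enzyme panel separates the taxa if every fingerprint is unique.
distinguishable = len(set(prints.values())) == len(prints)
for name, fp in prints.items():
    print(name, fp)
print("all taxa distinguishable:", distinguishable)
```

    When two taxa share a fingerprint, a scheme of this kind would need another enzyme or, as the abstract notes, fragment sizes in addition to fragment numbers.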

  14. Combining p-values in large scale genomics experiments

    Science.gov (United States)

    Zaykin, Dmitri V.; Zhivotovsky, Lev A.; Czika, Wendy; Shao, Susan; Wolfinger, Russell D.

    2008-01-01

    Summary In large-scale genomics experiments involving thousands of statistical tests, such as association scans and microarray expression experiments, a key question is: Which of the L tests represent true associations (TAs)? The traditional way to control false findings is via individual adjustments. In the presence of multiple TAs, p-value combination methods offer certain advantages. Both Fisher’s and Lancaster’s combination methods use an inverse gamma transformation. We identify the relation of the shape parameter of that distribution to the implicit threshold value; p-values below that threshold are favored by the inverse gamma method (GM). We explore this feature to improve power over Fisher’s method when L is large and the number of TAs is moderate. However, the improvement in power provided by combination methods is at the expense of a weaker claim made upon rejection of the null hypothesis – that there are some TAs among the L tests. Thus, GM remains a global test. To allow a stronger claim about a subset of p-values that is smaller than L, we investigate two methods with an explicit truncation: the rank truncated product method (RTP) that combines the first K ordered p-values, and the truncated product method (TPM) that combines p-values that are smaller than a specified threshold. We conclude that TPM allows claims to be made about subsets of p-values, while the claim of the RTP is, like GM, more appropriately about all L tests. GM gives somewhat higher power than TPM, RTP, Fisher, and Simes methods across a range of simulations. PMID:17879330
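
    The combination statistics named in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the p-values are assumed independent, the example p-values are made up, and the null distributions of the truncated statistics are approximated here by Monte Carlo simulation rather than by any analytic form. Fisher's statistic uses the closed-form chi-square tail for even degrees of freedom.

```python
import math
import random


def fisher_combined(pvalues):
    """Fisher's method: -2*sum(ln p) is chi-square with 2L df under the null.

    For even df the upper tail is exp(-x/2) * sum_{k<L} (x/2)^k / k!.
    """
    half_x = -sum(math.log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for k in range(1, len(pvalues)):
        term *= half_x / k
        total += term
    return math.exp(-half_x) * total


def tpm_stat(pvalues, tau):
    """Log of the truncated product: only p-values <= tau contribute."""
    return sum(math.log(p) for p in pvalues if p <= tau)


def rtp_stat(pvalues, k):
    """Log of the rank truncated product: the k smallest p-values."""
    return sum(math.log(p) for p in sorted(pvalues)[:k])


def monte_carlo_p(stat_fn, observed, n_tests, n_sim=20000, seed=0):
    """Share of simulated null statistics at least as extreme (small)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        # 1 - random() lies in (0, 1], so log() never receives zero.
        null = [1.0 - rng.random() for _ in range(n_tests)]
        if stat_fn(null) <= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


pvals = [0.001, 0.02, 0.04, 0.3, 0.6, 0.8]   # made-up example values
L = len(pvals)
print("Fisher:", fisher_combined(pvals))
print("TPM (tau=0.05):", monte_carlo_p(lambda p: tpm_stat(p, 0.05),
                                       tpm_stat(pvals, 0.05), L))
print("RTP (K=3):", monte_carlo_p(lambda p: rtp_stat(p, 3),
                                  rtp_stat(pvals, 3), L))
```

    All three functions return a single global p-value; the differences discussed in the abstract concern which subset of tests a rejection can be attributed to, which this sketch does not address.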

  15. Combining p-values in large-scale genomics experiments.

    Science.gov (United States)

    Zaykin, Dmitri V; Zhivotovsky, Lev A; Czika, Wendy; Shao, Susan; Wolfinger, Russell D

    2007-01-01

    In large-scale genomics experiments involving thousands of statistical tests, such as association scans and microarray expression experiments, a key question is: Which of the L tests represent true associations (TAs)? The traditional way to control false findings is via individual adjustments. In the presence of multiple TAs, p-value combination methods offer certain advantages. Both Fisher's and Lancaster's combination methods use an inverse gamma transformation. We identify the relation of the shape parameter of that distribution to the implicit threshold value; p-values below that threshold are favored by the inverse gamma method (GM). We explore this feature to improve power over Fisher's method when L is large and the number of TAs is moderate. However, the improvement in power provided by combination methods is at the expense of a weaker claim made upon rejection of the null hypothesis - that there are some TAs among the L tests. Thus, GM remains a global test. To allow a stronger claim about a subset of p-values that is smaller than L, we investigate two methods with an explicit truncation: the rank truncated product method (RTP) that combines the first K-ordered p-values, and the truncated product method (TPM) that combines p-values that are smaller than a specified threshold. We conclude that TPM allows claims to be made about subsets of p-values, while the claim of the RTP is, like GM, more appropriately about all L tests. GM gives somewhat higher power than TPM, RTP, Fisher, and Simes methods across a range of simulations. PMID:17879330

  16. Sexagesimal scale for mapping human genome

    OpenAIRE

    RICARDO CRUZ-COKE

    2001-01-01

    In a previous work I designed a diagram of the human genome based on a circular ideogram of the haploid set of chromosomes, using a low resolution scale of Megabase units. The purpose of this work is to draft a new scale to measure the physical map of the human genome at the highest resolution level. The entire length of the haploid genome of males is deployed in a circumference, marked with a sexagesimal scale with 360 degrees and 1296000 arc seconds. The radius of this circumference displays...
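
    The mapping from a genome position to a sexagesimal coordinate can be sketched as follows. The abstract is truncated and does not give the genome length or the exact convention used, so the genome length below is a hypothetical round number chosen so that one arc second spans exactly 1,000 bp; the function name is also illustrative.

```python
ARCSEC_PER_CIRCLE = 360 * 60 * 60   # 1,296,000 arc seconds in a full circle


def to_sexagesimal(position_bp, genome_bp):
    """Map a 0-based position on a linearised genome to (deg, min, sec)."""
    if not 0 <= position_bp < genome_bp:
        raise ValueError("position lies outside the genome")
    # Integer arithmetic keeps the arc-second index exact.
    total_arcsec = position_bp * ARCSEC_PER_CIRCLE // genome_bp
    degrees, remainder = divmod(total_arcsec, 3600)
    minutes, seconds = divmod(remainder, 60)
    return degrees, minutes, seconds


# Hypothetical genome length: 1,000 bp per arc second.
GENOME_BP = 1_296_000_000
print(to_sexagesimal(648_000_000, GENOME_BP))   # halfway around the circle
print(to_sexagesimal(3_661_000, GENOME_BP))
```

    With a real genome length the number of base pairs per arc second is simply the genome length divided by 1,296,000.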

  17. Comparative genome-scale modelling of Staphylococcus aureus strains identifies strain-specific metabolic capabilities linked to pathogenicity

    Science.gov (United States)

    Bosi, Emanuele; Monk, Jonathan M.; Aziz, Ramy K.; Fondi, Marco; Nizet, Victor; Palsson, Bernhard Ø.

    2016-01-01

    Staphylococcus aureus is a preeminent bacterial pathogen capable of colonizing diverse ecological niches within its human host. We describe here the pangenome of S. aureus based on analysis of genome sequences from 64 strains of S. aureus spanning a range of ecological niches, host types, and antibiotic resistance profiles. Based on this set, S. aureus is expected to have an open pangenome composed of 7,411 genes and a core genome composed of 1,441 genes. Metabolism was highly conserved in this core genome; however, differences were identified in amino acid and nucleotide biosynthesis pathways between the strains. Genome-scale models (GEMs) of metabolism were constructed for the 64 strains of S. aureus. These GEMs enabled a systems approach to characterizing the core metabolic and panmetabolic capabilities of the S. aureus species. All models were predicted to be auxotrophic for the vitamins niacin (vitamin B3) and thiamin (vitamin B1), whereas strain-specific auxotrophies were predicted for riboflavin (vitamin B2), guanosine, leucine, methionine, and cysteine, among others. GEMs were used to systematically analyze growth capabilities in more than 300 different growth-supporting environments. The results identified metabolic capabilities linked to pathogenic traits and virulence acquisitions. Such traits can be used to differentiate strains responsible for mild vs. severe infections and preference for hosts (e.g., animals vs. humans). Genome-scale analysis of multiple strains of a species can thus be used to identify metabolic determinants of virulence and increase our understanding of why certain strains of this deadly pathogen have spread rapidly throughout the world. PMID:27286824
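
    The pangenome and core genome described in the abstract above are, at their simplest, the union and intersection of per-strain gene sets. The sketch below shows only that set logic; the strain and gene names are hypothetical, and real analyses first have to cluster genes into orthologous families, a step omitted here.

```python
# Hypothetical gene content for three strains.
genomes = {
    "strain_1": {"geneA", "geneB", "geneC"},
    "strain_2": {"geneA", "geneB", "geneD"},
    "strain_3": {"geneA", "geneC", "geneE"},
}


def pan_and_core(gene_sets_by_strain):
    """Return (pangenome, core genome) as the union and intersection."""
    gene_sets = list(gene_sets_by_strain.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


pan, core = pan_and_core(genomes)
accessory = pan - core   # genes missing from at least one strain
print("pangenome size:", len(pan))
print("core genome size:", len(core))
print("accessory genes:", sorted(accessory))
```

    A pangenome is usually called open when its size keeps growing as further strains are added, which could be examined by recomputing the union over increasing subsets of strains.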

  18. Mapping copy number variation by population-scale genome sequencing

    DEFF Research Database (Denmark)

    Mills, Ryan E.; Walter, Klaudia; Stewart, Chip;

    2011-01-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications...

  19. Effect of different genomic relationship matrices on accuracy and scale

    OpenAIRE

    Chen, Ching-Yi; Misztal, I; I. Aguilar; Legarra, Andres; Muir, W.M.

    2011-01-01

    Phenotypic data on BW and breast meat area were available on up to 287,614 broilers. A total of 4,113 birds were genotyped for 57,636 SNP. Data were analyzed by a single-step genomic BLUP (ssGBLUP), which accounts for all phenotypic, pedigree, and genomic information. The genomic relationship matrix (G) in ssGBLUP was constructed using either equal (0.5; GEq) or current (GC) allele frequencies, and with all SNP or with SNP with minor allele frequencies (MAF) below multiple thresholds (0.1, 0....
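
    The effect of the allele-frequency choice on a genomic relationship matrix can be illustrated with a small sketch. The abstract does not state the formula used, so this assumes a commonly used construction, G = ZZ' / (2 Σ p(1 − p)), with genotypes coded 0/1/2 and Z centred by twice the allele frequency; the genotype matrix is a made-up toy example and the function names are illustrative.

```python
def allele_freqs(genotypes):
    """Observed frequency of the counted allele at each SNP."""
    n = len(genotypes)
    n_snp = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2 * n) for j in range(n_snp)]


def relationship_matrix(genotypes, freqs):
    """G = Z Z' / (2 * sum(p * (1 - p))), with Z = genotypes - 2p."""
    Z = [[g - 2 * p for g, p in zip(row, freqs)] for row in genotypes]
    scale = 2 * sum(p * (1 - p) for p in freqs)   # zero if every SNP is fixed
    return [[sum(a * b for a, b in zip(zi, zj)) / scale for zj in Z]
            for zi in Z]


# Toy data: 3 animals x 4 SNPs, genotypes coded as allele counts.
M = [[0, 1, 2, 1],
     [1, 1, 0, 2],
     [2, 0, 1, 1]]

G_equal = relationship_matrix(M, [0.5] * 4)         # equal frequencies (0.5)
G_current = relationship_matrix(M, allele_freqs(M))  # frequencies from the data
for row_eq, row_cur in zip(G_equal, G_current):
    print([round(x, 3) for x in row_eq], [round(x, 3) for x in row_cur])
```

    The two matrices differ in both centring and scaling, which is one way the frequency choice can shift the scale of the resulting relationships.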

  20. Comparative genomics of non-pseudomonal bacterial species colonising paediatric cystic fibrosis patients.

    Science.gov (United States)

    Ormerod, Kate L; George, Narelle M; Fraser, James A; Wainwright, Claire; Hugenholtz, Philip

    2015-01-01

    The genetic disorder cystic fibrosis is a life-limiting condition affecting ∼70,000 people worldwide. Targeted, early, treatment of the dominant infecting species, Pseudomonas aeruginosa, has improved patient outcomes; however, there is concern that other species are now stepping in to take its place. In addition, the necessarily long-term antibiotic therapy received by these patients may be providing a suitable environment for the emergence of antibiotic resistance. To investigate these issues, we employed whole-genome sequencing of 28 non-Pseudomonas bacterial strains isolated from three paediatric patients. We did not find any trend of increasing antibiotic resistance (either by mutation or lateral gene transfer) in these isolates in comparison with other examples of the same species. In addition, each isolate contained a virulence gene repertoire that was similar to other examples of the relevant species. These results support the impaired clearance of the CF lung not demanding extensive virulence for survival in this habitat. By analysing serial isolates of the same species we uncovered several examples of strain persistence. The same strain of Staphylococcus aureus persisted for nearly a year, despite administration of antibiotics to which it was shown to be sensitive. This is consistent with previous studies showing antibiotic therapy to be inadequate in cystic fibrosis patients, which may also explain the lack of increasing antibiotic resistance over time. Serial isolates of two naturally multi-drug resistant organisms, Achromobacter xylosoxidans and Stenotrophomonas maltophilia, revealed that while all S. maltophilia strains were unique, A. xylosoxidans persisted for nearly five years, making this a species of particular concern. The data generated by this study will assist in developing an understanding of the non-Pseudomonas species associated with cystic fibrosis. PMID:26401445

  1. Genome-wide dynamics of a bacterial response to antibiotics that target the cell envelope

    Directory of Open Access Journals (Sweden)

    Tran Ngat

    2011-05-01

    Background: A decline in the discovery of new antibacterial drugs, coupled with a persistent rise in the occurrence of drug-resistant bacteria, has highlighted antibiotics as a diminishing resource. The future development of new drugs with novel antibacterial activities requires a detailed understanding of adaptive responses to existing compounds. This study uses Streptomyces coelicolor A3(2) as a model system to determine the genome-wide transcriptional response following exposure to three antibiotics (vancomycin, moenomycin A and bacitracin) that target distinct stages of cell wall biosynthesis. Results: A generalised response to all three antibiotics was identified which involves activation of transcription of the cell envelope stress sigma factor σE, together with elements of the stringent response, and of the heat, osmotic and oxidative stress regulons. Attenuation of this system by deletion of genes encoding the osmotic stress sigma factor σB or the ppGpp synthetase RelA reduced resistance to both vancomycin and bacitracin. Many antibiotic-specific transcriptional changes were identified, representing cellular processes potentially important for tolerance to each antibiotic. Sensitivity studies using mutants constructed on the basis of the transcriptome profiling confirmed a role for several such genes in antibiotic resistance, validating the usefulness of the approach. Conclusions: Antibiotic inhibition of bacterial cell wall biosynthesis induces both common and compound-specific transcriptional responses. Both can be exploited to increase antibiotic susceptibility. Regulatory networks known to govern responses to environmental and nutritional stresses are also at the core of the common antibiotic response, and likely help cells survive until any specific resistance mechanisms are fully functional.

  2. Large-scale prokaryotic gene prediction and comparison to genome annotation

    DEFF Research Database (Denmark)

    Nielsen, Pernille; Krogh, Anders Stærmose

    2005-01-01

    Motivation: Prokaryotic genomes are sequenced and annotated at an increasing rate. The methods of annotation vary between sequencing groups. It makes genome comparison difficult and may lead to propagation of errors when questionable assignments are adapted from one genome to another. Genome comparison either on a large or small scale would be facilitated by using a single standard for annotation, which incorporates a transparency of why an open reading frame (ORF) is considered to be a gene. Results: A total of 143 prokaryotic genomes were scored with an updated version of the prokaryotic genefinder EasyGene. Comparison of the GenBank and RefSeq annotations with the EasyGene predictions reveals that in some genomes up to 60% of the genes may have been annotated with a wrong start codon, especially in the GC-rich genomes. The fractional difference between annotated and predicted confirms...

  3. Small-Scale Duplications Play a Significant Role in Rice Genome Evolution

    Institute of Scientific and Technical Information of China (English)

    GUO Xin-yi; XU Guo-hua; ZHANG Yang; HU Wei-min; FAN Long-jiang

    2005-01-01

    Genes are continually being created by the processes of genome duplication (ohnolog) and gene duplication (paralog). Whole-genome duplications have been found to be widespread in plant species and play an important role in plant evolution. Clearly un-overlapping duplicated blocks of whole-genome duplications can be detected in the genome of sequenced rice (Oryza sativa). Syntenic ohnolog pairs (ohnologues) of the whole-genome duplications in rice were identified based on their syntenic duplicate lines. The paralogs of ohnologues were further scanned using multi-round reciprocal BLAST best-hit searching (E<e-14). The results indicated that an average of 0.55 sister paralogs could be found for every ohnologue in rice. These results suggest that small-scale duplications, as well as whole-genome duplications, play a significant role in the two duplicated rice genomes.
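
    Reciprocal best-hit searching, mentioned in the abstract above, can be sketched on a table of pairwise hits. This shows a single round only, since the abstract does not describe how the multiple rounds were organised; the gene identifiers and E-values are made up, and the 1e-14 cut-off is an assumed reading of the threshold written as "E<e-14".

```python
def best_hits(hits, max_evalue):
    """Best (lowest E-value) non-self hit per query below the cut-off."""
    best = {}
    for query, subject, evalue in hits:
        if query == subject or evalue >= max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(hits, max_evalue=1e-14):
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    best = best_hits(hits, max_evalue)
    return {tuple(sorted((q, s))) for q, s in best.items()
            if best.get(s) == q}


# Hypothetical (query, subject, E-value) rows from an all-against-all search.
hits = [
    ("g1", "g2", 1e-50), ("g2", "g1", 1e-48),
    ("g1", "g3", 1e-20), ("g3", "g1", 1e-20),
    ("g4", "g5", 1e-5), ("g5", "g4", 1e-5),
]
print(reciprocal_best_hits(hits))
```

    Here g3 is dropped because its best hit, g1, prefers g2, and the g4/g5 pair fails the E-value cut-off.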

  4. Phylogenetic Relationships of 3/3 and 2/2 Hemoglobins in Archaeplastida Genomes to Bacterial and Other Eukaryote Hemoglobins

    Institute of Scientific and Technical Information of China (English)

    Serge N. Vinogradov; Iván Fernández; David Hoogewijs; Raúl Arredondo-Peter

    2011-01-01

    Land plants and algae form a supergroup, the Archaeplastida, believed to be monophyletic. We report the results of an analysis of the phylogeny of putative globins in the currently available genomes to bacterial and other eukaryote hemoglobins (Hbs). Archaeplastida genomes have 3/3 and 2/2 Hbs, with the land plant genomes having group 2 2/2 Hbs, except for the unexpected occurrence of two group 1 2/2 Hbs in Ricinus communis. Bayesian analysis shows that plant 3/3 Hbs are related to vertebrate neuroglobins and bacterial flavohemoglobins (FHbs). We sought to define the bacterial groups, whose ancestors shared the precursors of Archaeplastida Hbs, via Bayesian and neighbor-joining analyses based on COBALT alignment of representative sets of bacterial 3/3 FHb-like globins and group 1 and 2 2/2 Hbs with the corresponding Archaeplastida Hbs. The results suggest that the Archaeplastida 3/3 and group 1 2/2 Hbs could have originated from the horizontal gene transfers (HGTs) that accompanied the two generally accepted endosymbioses of a proteobacterium and a cyanobacterium with a eukaryote ancestor. In contrast, the origin of the group 2 2/2 Hbs unexpectedly appears to involve HGT from a bacterium ancestral to Chloroflexi, Deinococcales, Bacilli, and Actinomycetes. Furthermore, although intron positions and phases are mostly conserved among the land plant 3/3 and 2/2 globin genes, introns are absent in the algal 3/3 genes and intron positions and phases are highly variable in their 2/2 genes. Thus, introns are irrelevant to globin evolution in Archaeplastida.

  5. Comparative Genomics Analysis and Phenotypic Characterization of Shewanella putrefaciens W3-18-1: Anaerobic Respiration, Bacterial Microcompartments, and Lateral Flagella

    International Nuclear Information System (INIS)

    Respiratory versatility and psychrophily are the hallmarks of Shewanella. The ability to utilize a wide range of electron acceptors for respiration is due to the large number of c-type cytochrome genes present in the genome of Shewanella strains. More recently the dissimilatory metal reduction of Shewanella species has been extensively and intensively studied for potential applications in the bioremediation of radioactive wastes of groundwater and subsurface environments. Multiple Shewanella genome sequences are now available in the public databases (Fredrickson et al., 2008). Most of the sequenced Shewanella strains were isolated from marine environments and this genus was believed to be of marine origin (Hau and Gralnick, 2007). However, the well-characterized model strain, S. oneidensis MR-1, was isolated from the freshwater lake sediment of Lake Oneida, New York (Myers and Nealson, 1988) and similar bacteria have also been isolated from other freshwater environments (Venkateswaran et al., 1999). Here we comparatively analyzed the genome sequence and physiological characteristics of S. putrefaciens W3-18-1 and S. oneidensis MR-1, isolated from the marine and freshwater lake sediments, respectively. The anaerobic respirations, carbon source utilization, and cell motility have been experimentally investigated. Large scale horizontal gene transfers have been revealed and the genetic divergence between these two strains was considered to be critical to the bacterial adaptation to specific habitats, freshwater or marine sediments.

  6. Comparative Genomics Analysis and Phenotypic Characterization of Shewanella putrefaciens W3-18-1: Anaerobic Respiration, Bacterial Microcompartments, and Lateral Flagella

    Energy Technology Data Exchange (ETDEWEB)

    Qiu, D.; Tu, Q.; He, Zhili; Zhou, Jizhong

    2010-05-17

    Respiratory versatility and psychrophily are the hallmarks of Shewanella. The ability to utilize a wide range of electron acceptors for respiration is due to the large number of c-type cytochrome genes present in the genome of Shewanella strains. More recently the dissimilatory metal reduction of Shewanella species has been extensively and intensively studied for potential applications in the bioremediation of radioactive wastes of groundwater and subsurface environments. Multiple Shewanella genome sequences are now available in the public databases (Fredrickson et al., 2008). Most of the sequenced Shewanella strains were isolated from marine environments and this genus was believed to be of marine origin (Hau and Gralnick, 2007). However, the well-characterized model strain, S. oneidensis MR-1, was isolated from the freshwater lake sediment of Lake Oneida, New York (Myers and Nealson, 1988) and similar bacteria have also been isolated from other freshwater environments (Venkateswaran et al., 1999). Here we comparatively analyzed the genome sequence and physiological characteristics of S. putrefaciens W3-18-1 and S. oneidensis MR-1, isolated from the marine and freshwater lake sediments, respectively. The anaerobic respirations, carbon source utilization, and cell motility have been experimentally investigated. Large scale horizontal gene transfers have been revealed and the genetic divergence between these two strains was considered to be critical to the bacterial adaptation to specific habitats, freshwater or marine sediments.

  7. In Silico Genome-Scale Reconstruction and Validation of the Staphylococcus aureus Metabolic Network

    NARCIS (Netherlands)

    Heinemann, Matthias; Kümmel, Anne; Ruinatscha, Reto; Panke, Sven

    2005-01-01

    A genome-scale metabolic model of the Gram-positive, facultative anaerobic opportunistic pathogen Staphylococcus aureus N315 was constructed based on current genomic data, literature, and physiological information. The model comprises 774 metabolic processes representing approximately 23% of all pro

  8. Environmental versatility promotes modularity in genome-scale metabolic networks

    Directory of Open Access Journals (Sweden)

    Wagner Andreas

    2011-08-01

    Background: The ubiquity of modules in biological networks may result from an evolutionary benefit of a modular organization. For instance, modularity may increase the rate of adaptive evolution, because modules can be easily combined into new arrangements that may benefit their carrier. Conversely, modularity may emerge as a by-product of some trait. We here ask whether this last scenario may play a role in genome-scale metabolic networks that need to sustain life in one or more chemical environments. For such networks, we define a network module as a maximal set of reactions that are fully coupled, i.e., whose fluxes can only vary in fixed proportions. This definition overcomes limitations of purely graph based analyses of metabolism by exploiting the functional links between reactions. We call a metabolic network viable in a given chemical environment if it can synthesize all of an organism's biomass compounds from nutrients in this environment. An organism's metabolism is highly versatile if it can sustain life in many different chemical environments. We here ask whether versatility affects the modularity of metabolic networks. Results: Using recently developed techniques to randomly sample large numbers of viable metabolic networks from a vast space of metabolic networks, we use flux balance analysis to study in silico metabolic networks that differ in their versatility. We find that highly versatile networks are also highly modular. They contain more modules and more reactions that are organized into modules. Most or all reactions in a module are associated with the same biochemical pathways. Modules that arise in highly versatile networks generally involve reactions that process nutrients or closely related chemicals. We also observe that the metabolism of E. coli is significantly more modular than even our most versatile networks. Conclusions: Our work shows that modularity in metabolic networks can be a by-product of functional

  9. Draft Genome Sequence of Criibacterium bergeronii gen. nov., sp. nov., Strain CCRI-22567T, Isolated from a Vaginal Sample from a Woman with Bacterial Vaginosis.

    Science.gov (United States)

    Maheux, Andrée F; Bérubé, Ève; Boudreau, Dominique K; Raymond, Frédéric; Corbeil, Jacques; Roy, Paul H; Boissinot, Maurice; Omar, Rabeea F

    2016-01-01

    Criibacterium bergeronii gen. nov., sp. nov., CCRI-22567 is the type strain of the new genus Criibacterium. The strain was isolated from a woman with bacterial vaginosis. The genome assembly comprised 2,384,460 bp, with 34.4% G+C content. This is the first genome announcement of a strain belonging to the genus Criibacterium. PMID:27587833

  10. Genome-wide copy number profiling on high-density bacterial artificial chromosomes, single-nucleotide polymorphisms, and oligonucleotide microarrays: a platform comparison based on statistical power analysis.

    NARCIS (Netherlands)

    Hehir-Kwa, J.Y.; Egmont-Peterson, M.; Janssen, I.M.; Smeets, D.F.C.M.; Geurts van Kessel, A.H.M.; Veltman, J.A.

    2007-01-01

    Recently, comparative genomic hybridization onto bacterial artificial chromosome (BAC) arrays (array-based comparative genomic hybridization) has proved to be successful for the detection of submicroscopic DNA copy-number variations in health and disease. Technological improvements to achieve a high

  11. Complete Genomes of Classical Swine Fever Virus Cloned into Bacterial Artificial Chromosomes

    DEFF Research Database (Denmark)

    Rasmussen, Thomas Bruun; Reimann, I.; Uttenthal, Åse;

    Complete genome amplification of viral RNA provides a new tool for the generation of modified pestiviruses. We have used our full-genome amplification strategy for generation of amplicons representing complete genomes of classical swine fever virus. The amplicons were cloned directly into a stable...

  12. Independent large scale duplications in multiple M. tuberculosis lineages overlapping the same genomic region.

    Directory of Open Access Journals (Sweden)

    Brian Weiner

    Full Text Available Mycobacterium tuberculosis, the causative agent of most human tuberculosis, infects one third of the world's population and kills an estimated 1.7 million people a year. With the world-wide emergence of drug resistance, and the finding of more functional genetic diversity than previously expected, there is a renewed interest in understanding the forces driving genome evolution of this important pathogen. Genetic diversity in M. tuberculosis is dominated by single nucleotide polymorphisms and small-scale gene deletion, with little or no evidence for the large-scale genome rearrangements seen in other bacteria. Recently, a single report described a large-scale genome duplication that was suggested to be specific to the Beijing lineage. We report here multiple independent large-scale duplications of the same genomic region of M. tuberculosis detected through whole-genome sequencing. The duplications occur in strains belonging to both M. tuberculosis lineages 2 and 4, and are thus not limited to Beijing strains. The duplications occur in both drug-resistant and drug-susceptible strains. The duplicated regions also have substantially different boundaries in different strains, indicating different originating duplication events. We further identify a smaller segmental duplication of a different genomic region of a lab strain of H37Rv. The presence of multiple independent duplications of the same genomic region suggests either instability in this region, a selective advantage conferred by the duplication, or both. The identified duplications suggest that large-scale gene duplication may be more common in M. tuberculosis than previously considered.

  13. Genomic Encyclopedia of Bacterial and Archaeal Type Strains, Phase III: the genomes of soil and plant-associated and newly described type strains.

    Science.gov (United States)

    Whitman, William B; Woyke, Tanja; Klenk, Hans-Peter; Zhou, Yuguang; Lilburn, Timothy G; Beck, Brian J; De Vos, Paul; Vandamme, Peter; Eisen, Jonathan A; Garrity, George; Hugenholtz, Philip; Kyrpides, Nikos C

    2015-01-01

    The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project was launched by the JGI in 2007 as a pilot project to sequence about 250 bacterial and archaeal genomes of elevated phylogenetic diversity. Herein, we propose to extend this approach to type strains of prokaryotes associated with soil or plants and their close relatives as well as type strains from newly described species. Understanding the microbiology of soil and plants is critical to many DOE mission areas, such as biofuel production from biomass, biogeochemistry, and carbon cycling. We are also targeting type strains of novel species while they are being described. Since 2006, about 630 new species have been described per year, many of which are closely aligned to DOE areas of interest in soil, agriculture, degradation of pollutants, biofuel production, biogeochemical transformation, and biodiversity.

  14. Structural genomics of eukaryotic targets at a laboratory scale.

    Science.gov (United States)

    Busso, Didier; Poussin-Courmontagne, Pierre; Rosé, David; Ripp, Raymond; Litt, Alain; Thierry, Jean-Claude; Moras, Dino

    2005-01-01

    Structural genomics programs are distributed worldwide and funded by large institutions such as the NIH in the United States, RIKEN in Japan or the European Commission through the SPINE network in Europe. Such initiatives, essentially managed by large consortia, led to technology and method developments at the different steps required to produce biological samples compatible with structural studies. Besides specific applications, method developments resulted mainly in miniaturization and parallelization. The challenge that academic laboratories face in pursuing structural genomics programs is to produce protein samples at a higher rate. The Structural Biology and Genomics Department (IGBMC - Illkirch - France) is involved in a structural genomics program of higher eukaryotes whose goal is solving crystal structures of proteins and their complexes (including large complexes) related to human health and biotechnology. To achieve such a challenging goal, the Department has established a medium-throughput pipeline for producing protein samples suitable for structural biology studies. Here, we describe the setting up of our initiative from cloning to crystallization and we demonstrate that structural genomics may be manageable by academic laboratories through strategic investments in robotics and by adapting classical bench protocols and new developments, in particular in the field of protein expression, to parallelization.

  15. Genomic survey of pathogenicity determinants and VNTR markers in the cassava bacterial pathogen Xanthomonas axonopodis pv. Manihotis strain CIO151.

    Directory of Open Access Journals (Sweden)

    Mario L Arrieta-Ortiz

    Full Text Available Xanthomonas axonopodis pv. manihotis (Xam) is the causal agent of bacterial blight of cassava, which is among the main components of human diet in Africa and South America. Current information about the molecular pathogenicity factors involved in the infection process of this organism is limited. Previous studies in other bacteria in this genus suggest that advanced draft genome sequences are valuable resources for molecular studies on their interaction with plants and could provide valuable tools for diagnostics and detection. Here we have generated the first manually annotated high-quality draft genome sequence of Xam strain CIO151. Its genomic structure is similar to that of other xanthomonads, especially Xanthomonas euvesicatoria and Xanthomonas citri pv. citri species. Several putative pathogenicity factors were identified, including type III effectors, cell wall-degrading enzymes and clusters encoding protein secretion systems. Specific characteristics in this genome include changes in the xanthomonadin cluster that could explain the lack of typical yellow color in all strains of this pathovar and the presence of 50 regions in the genome with atypical nucleotide composition. The genome sequence was used to predict and evaluate 22 variable number of tandem repeat (VNTR) loci that were subsequently demonstrated as polymorphic in representative Xam strains. Our results demonstrate that Xanthomonas axonopodis pv. manihotis strain CIO151 possesses ten clusters of pathogenicity factors conserved within the genus Xanthomonas. We report 126 genes that are potentially unique to Xam, as well as potential horizontal transfer events in the history of the genome. The relation of these regions with virulence and pathogenicity could explain several aspects of the biology of this pathogen, including its ability to colonize both vascular and non-vascular tissues of cassava plants. A set of 16 robust, polymorphic VNTR loci will be useful to develop a multi

  16. LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Tier 1 Report

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T; Borucki, M; Lenhoff, R; Vitalis, E

    2009-09-29

    The Lawrence Livermore National Lab Bioinformatics group has recently taken on a role in DTRA's Transformation Medical Technologies Initiative (TMTI). The high-level goal of TMTI is to accelerate the development of broad-spectrum countermeasures. To achieve those goals, TMTI has a near-term need to obtain more sequence information across a large range of pathogens, near neighbors, and across a broad geographical and host range. Our role in this project is to research available sequence data for the organisms of interest and identify critical microbial sequence and knowledge gaps that need to be filled to meet TMTI objectives. This effort includes: (1) assessing current genomic sequence for each agent including phylogenetic and geographical diversity, host range, date of isolation range, virulence, sequence availability of key near neighbors, and other characteristics; (2) identifying Subject Matter Experts (SMEs) and potential holders of isolate collections, contacting appropriate SMEs with known expertise and isolate collections to obtain information on isolate availability and specific recommendations; (3) identifying sequence as well as knowledge gaps (e.g., virulence, host range, and antibiotic resistance determinants); (4) providing specific recommendations as to the most valuable strains to be placed on the DTRA sequencing queue. We acknowledge that criteria for prioritization of isolates for sequencing fall into two categories aligning with priority queues 1 and 2 as described in the summary. (Priority queue 0 relates to DTRA operational isolates whose availability is not predictable in advance.) 1. Selection of isolates that appear likely to provide information on virulence and antibiotic resistance. This will include sequence of known virulent strains. Particularly valuable would be virulent strains that have genetically similar yet avirulent, or non-human-transmissible, counterparts that can be used for comparison to help

  18. Pore-scale simulation of microbial growth using a genome-scale metabolic model: Implications for Darcy-scale reactive transport

    Science.gov (United States)

    Tartakovsky, G. D.; Tartakovsky, A. M.; Scheibe, T. D.; Fang, Y.; Mahadevan, R.; Lovley, D. R.

    2013-09-01

    Recent advances in microbiology have enabled the quantitative simulation of microbial metabolism and growth based on genome-scale characterization of metabolic pathways and fluxes. We have incorporated a genome-scale metabolic model of the iron-reducing bacterium Geobacter sulfurreducens into a pore-scale simulation of microbial growth based on coupling of iron reduction to oxidation of a soluble electron donor (acetate). In our model, fluid flow and solute transport are governed by a combination of the Navier-Stokes and advection-diffusion-reaction equations. Microbial growth occurs only on the surface of soil grains where solid-phase mineral iron oxides are available. Mass fluxes of chemical species associated with microbial growth are described by the genome-scale microbial model, implemented using a constraint-based metabolic model, and provide the Robin-type boundary condition for the advection-diffusion equation at soil grain surfaces. Conventional models of microbially-mediated subsurface reactions use a lumped reaction model that does not consider individual microbial reaction pathways, and describe reaction rates using empirically-derived rate formulations such as Monod-type kinetics. We have used our pore-scale model to explore the relationship between genome-scale metabolic models and Monod-type formulations, and to assess the manifestation of pore-scale variability (microenvironments) in terms of apparent Darcy-scale microbial reaction rates. The genome-scale model predicted lower biomass yield, and different stoichiometry for iron consumption, in comparison to prior Monod formulations based on energetics considerations. We were able to fit an equivalent Monod model, by modifying the reaction stoichiometry and biomass yield coefficient, that could effectively match results of the genome-scale simulation of microbial behaviors under excess nutrient conditions, but predictions of the fitted Monod model deviated from those of the genome-scale model

  19. Pore-scale simulation of microbial growth using a genome-scale metabolic model: Implications for Darcy-scale reactive transport

    Energy Technology Data Exchange (ETDEWEB)

    Tartakovsky, Guzel D.; Tartakovsky, Alexandre M.; Scheibe, Timothy D.; Fang, Yilin; Mahadevan, Radhakrishnan; Lovley, Derek R.

    2013-09-07

    Recent advances in microbiology have enabled the quantitative simulation of microbial metabolism and growth based on genome-scale characterization of metabolic pathways and fluxes. We have incorporated a genome-scale metabolic model of the iron-reducing bacterium Geobacter sulfurreducens into a pore-scale simulation of microbial growth based on coupling of iron reduction to oxidation of a soluble electron donor (acetate). In our model, fluid flow and solute transport are governed by a combination of the Navier-Stokes and advection-diffusion-reaction equations. Microbial growth occurs only on the surface of soil grains where solid-phase mineral iron oxides are available. Mass fluxes of chemical species associated with microbial growth are described by the genome-scale microbial model, implemented using a constraint-based metabolic model, and provide the Robin-type boundary condition for the advection-diffusion equation at soil grain surfaces. Conventional models of microbially-mediated subsurface reactions use a lumped reaction model that does not consider individual microbial reaction pathways, and describe reaction rates using empirically-derived rate formulations such as Monod-type kinetics. We have used our pore-scale model to explore the relationship between genome-scale metabolic models and Monod-type formulations, and to assess the manifestation of pore-scale variability (microenvironments) in terms of apparent Darcy-scale microbial reaction rates. The genome-scale model predicted lower biomass yield, and different stoichiometry for iron consumption, in comparison to prior Monod formulations based on energetics considerations. We were able to fit an equivalent Monod model, by modifying the reaction stoichiometry and biomass yield coefficient, that could effectively match results of the genome-scale simulation of microbial behaviors under excess nutrient conditions, but predictions of the fitted Monod model deviated from those of the genome-scale model under
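
    For readers unfamiliar with the Monod-type rate formulations mentioned in this record, the following is a minimal sketch of the standard single-substrate Monod expression, rate = mu_max * S / (K_s + S). The function name, parameter names and numerical values are illustrative assumptions; they are not taken from the study, which fits its own stoichiometry and yield coefficient.

```python
def monod_rate(substrate: float, mu_max: float, k_half: float) -> float:
    """Specific rate under Monod-type kinetics: mu_max * S / (K_s + S).

    substrate: limiting substrate concentration S (e.g. acetate)
    mu_max:    maximum specific rate, approached when S >> K_s
    k_half:    half-saturation constant K_s (rate is mu_max / 2 at S == K_s)
    """
    if substrate < 0 or mu_max < 0 or k_half <= 0:
        raise ValueError(
            "substrate and mu_max must be non-negative; k_half must be positive"
        )
    return mu_max * substrate / (k_half + substrate)


if __name__ == "__main__":
    # Illustrative values only (hypothetical, not from the paper).
    mu_max = 1.0   # maximum specific rate (1/time)
    k_half = 0.5   # half-saturation concentration (same units as substrate)
    for s in (0.0, 0.5, 5.0, 50.0):
        print(f"S = {s:5.1f}  rate = {monod_rate(s, mu_max, k_half):.3f}")
```

    The rate saturates as the substrate concentration grows, which is consistent with the record's statement that a fitted Monod model could match the genome-scale results under excess nutrient conditions.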

  20. Generating Genome-Scale Candidate Gene Lists for Pharmacogenomics

    DEFF Research Database (Denmark)

    Hansen, Niclas Tue; Brunak, Søren; Altman, R. B.

    2009-01-01

    , but they are expensive to generate manually and may therefore have incomplete coverage. We have developed a method that ranks 12,460 genes in the human genome on the basis of their potential relevance to a specific query drug and its putative indications. Our method uses known gene-drug interactions, networks of gene...

  1. Rapid genome-scale mapping of chromatin accessibility in tissue

    DEFF Research Database (Denmark)

    Grøntved, Lars; Bandle, Russell; John, Sam;

    2012-01-01

    BACKGROUND: The challenge in extracting genome-wide chromatin features from limiting clinical samples poses a significant hurdle in identification of regulatory marks that impact the physiological or pathological state. Current methods that identify nuclease accessible chromatin are reliant on la...

  2. A short-time scale colloidal system reveals early bacterial adhesion dynamics.

    Directory of Open Access Journals (Sweden)

    Christophe Beloin

    2008-07-01

    Full Text Available The development of bacteria on abiotic surfaces has important public health and sanitary consequences. However, despite several decades of study of bacterial adhesion to inert surfaces, the biophysical mechanisms governing this process remain poorly understood, due, in particular, to the lack of methodologies covering the appropriate time scale. Using micrometric colloidal surface particles and flow cytometry analysis, we developed a rapid multiparametric approach to studying early events in adhesion of the bacterium Escherichia coli. This approach simultaneously describes the kinetics and amplitude of early steps in adhesion, changes in physicochemical surface properties within the first few seconds of adhesion, and the self-association state of attached and free-floating cells. Examination of the role of three well-characterized E. coli surface adhesion factors upon attachment to colloidal surfaces--curli fimbriae, F-conjugative pilus, and Ag43 adhesin--showed clear-cut differences in the very initial phases of surface colonization for cell-bearing surface structures, all known to promote biofilm development. Our multiparametric analysis revealed a correlation in the adhesion phase with cell-to-cell aggregation properties and demonstrated that this phenomenon amplified surface colonization once initial cell-surface attachment was achieved. Monitoring of real-time physico-chemical particle surface properties showed that surface-active molecules of bacterial origin quickly modified surface properties, providing new insight into the intricate relations connecting abiotic surface physicochemical properties and bacterial adhesion. Hence, the biophysical analytical method described here provides a new and relevant approach to quantitatively and kinetically investigating bacterial adhesion and biofilm development.

  3. The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts.

    Science.gov (United States)

    Liu, Zhanjiang; Liu, Shikai; Yao, Jun; Bao, Lisui; Zhang, Jiaren; Li, Yun; Jiang, Chen; Sun, Luyang; Wang, Ruijia; Zhang, Yu; Zhou, Tao; Zeng, Qifan; Fu, Qiang; Gao, Sen; Li, Ning; Koren, Sergey; Jiang, Yanliang; Zimin, Aleksey; Xu, Peng; Phillippy, Adam M; Geng, Xin; Song, Lin; Sun, Fanyue; Li, Chao; Wang, Xiaozhu; Chen, Ailu; Jin, Yulin; Yuan, Zihao; Yang, Yujia; Tan, Suxu; Peatman, Eric; Lu, Jianguo; Qin, Zhenkui; Dunham, Rex; Li, Zhaoxia; Sonstegard, Tad; Feng, Jianbin; Danzmann, Roy G; Schroeder, Steven; Scheffler, Brian; Duke, Mary V; Ballard, Linda; Kucuktas, Huseyin; Kaltenboeck, Ludmilla; Liu, Haixia; Armbruster, Jonathan; Xie, Yangjie; Kirby, Mona L; Tian, Yi; Flanagan, Mary Elizabeth; Mu, Weijie; Waldbieser, Geoffrey C

    2016-01-01

    Catfish represent 12% of teleost or 6.3% of all vertebrate species, and are of enormous economic value. Here we report a high-quality reference genome sequence of channel catfish (Ictalurus punctatus), the major aquaculture species in the US. The reference genome sequence was validated by genetic mapping of 54,000 SNPs, and annotated with 26,661 predicted protein-coding genes. Through comparative analysis of genomes and transcriptomes of scaled and scaleless fish and scale regeneration experiments, we address the genomic basis for the most striking physical characteristic of catfish, the evolutionary loss of scales and provide evidence that lack of secretory calcium-binding phosphoproteins accounts for the evolutionary loss of scales in catfish. The channel catfish reference genome sequence, along with two additional genome sequences and transcriptomes of scaled catfishes, provide crucial resources for evolutionary and biological studies. This work also demonstrates the power of comparative subtraction of candidate genes for traits of structural significance. PMID:27249958

  4. From amplification to gene in thyroid cancer: A high-resolution mapped bacterial-artificial-chromosome resource for cancer chromosome aberrations guides gene discovery after comparative genome hybridization

    Energy Technology Data Exchange (ETDEWEB)

    Chen, X.N.; Gonsky, R.; Korenberg, J.R. [UCLA School of Medicine, Los Angeles, CA (United States). Cedars-Sinai Research Inst.; Knauf, J.A.; Fagin, J.A. [Univ. of Cincinnati, OH (United States). Div. of Endocrinology/Metabolism; Wang, M.; Lai, E.H. [Univ. of North Carolina, Chapel Hill, NC (United States). Dept. of Pharmacology; Chissoe, S. [Washington Univ. School of Medicine, St. Louis, MO (United States). Genome Sequencing

    1998-08-01

    Chromosome rearrangements associated with neoplasms provide a rich resource for definition of the pathways of tumorigenesis. The power of comparative genome hybridization (CGH) to identify novel genes depends on the existence of suitable markers, which are lacking throughout most of the genome. The authors now report a general approach that translates CGH data into higher-resolution genomic-clone data that are then used to define the genes located in aneuploid regions. They used CGH to study 33 thyroid-tumor DNAs and two tumor-cell-line DNAs. The results revealed amplifications of chromosome band 2p21, with less-intense amplification on 2p13, 19q13.1, and 1p36 and with least-intense amplification on 1p34, 1q42, 5q31, 5q33-34, 9q32-34, and 14q32. To define the 2p21 region amplified, a dense array of 373 FISH-mapped chromosome 2 bacterial artificial chromosomes (BACs) was constructed, and 87 of these were hybridized to a tumor-cell line. Four BACs carried genomic DNA that was amplified in these cells. The maximum amplified region was narrowed to 3-6 Mb by multicolor FISH with the flanking BACs, and the minimum amplicon size was defined by a contig of 420 kb. Sequence analysis of the amplified BAC 1D9 revealed a fragment of the gene, encoding protein kinase C epsilon (PKCε), that was then shown to be amplified and rearranged in tumor cells. In summary, CGH combined with a dense mapped resource of BACs and large-scale sequencing has led directly to the definition of PKCε as a previously unmapped candidate gene involved in thyroid tumorigenesis.

  5. Dynamics of bacterial communities before and after distribution in a full-scale drinking water network

    KAUST Repository

    El-Chakhtoura, Joline

    2015-05-01

    Understanding the biological stability of drinking water distribution systems is imperative in the framework of process control and risk management. The objective of this research was to examine the dynamics of the bacterial community during drinking water distribution at high temporal resolution. Water samples (156 in total) were collected over short time-scales (minutes/hours/days) from the outlet of a treatment plant and a location in its corresponding distribution network. The drinking water is treated by biofiltration and disinfectant residuals are absent during distribution. The community was analyzed by 16S rRNA gene pyrosequencing and flow cytometry as well as conventional, culture-based methods. Despite a random dramatic event (detected with pyrosequencing and flow cytometry but not with plate counts), the bacterial community profile at the two locations did not vary significantly over time. A diverse core microbiome was shared between the two locations (58-65% of the taxa and 86-91% of the sequences) and found to be dependent on the treatment strategy. The bacterial community structure changed during distribution, with greater richness detected in the network and phyla such as Acidobacteria and Gemmatimonadetes becoming abundant. The rare taxa displayed the highest dynamicity, causing the major change during water distribution. This change did not have hygienic implications and is contingent on the sensitivity of the applied methods. The concept of biological stability therefore needs to be revised. Biostability is generally desired in drinking water guidelines but may be difficult to achieve in large-scale complex distribution systems that are inherently dynamic.

  6. On the limits of computational functional genomics for bacterial lifestyle prediction

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Röttger, Richard; Hauschild, Anne-Christin;

    2014-01-01

    We review the level of genomic specificity regarding actinobacterial pathogenicity. As they occupy various niches in diverse habitats, one may assume the existence of lifestyle-specific genomic features. We include 240 actinobacteria classified into four pathogenicity classes: human pathogens (HP...... in the post-genome era and despite next-generation sequencing technology, our ability to efficiently deduce real-world conclusions, such as pathogenicity classification, remains quite limited....

  7. Symmetry and scale orient Min protein patterns in shaped bacterial sculptures

    Science.gov (United States)

    Wu, Fabai; van Schie, Bas G. C.; Keymer, Juan E.; Dekker, Cees

    2015-08-01

    The boundary of a cell defines the shape and scale of its subcellular organization. However, the effects of the cell's spatial boundaries as well as the geometry sensing and scale adaptation of intracellular molecular networks remain largely unexplored. Here, we show that living bacterial cells can be ‘sculpted’ into defined shapes, such as squares and rectangles, which are used to explore the spatial adaptation of Min proteins that oscillate pole-to-pole in rod-shaped Escherichia coli to assist cell division. In a wide geometric parameter space, ranging from 2 × 1 × 1 to 11 × 6 × 1 μm³, Min proteins exhibit versatile oscillation patterns, sustaining rotational, longitudinal, diagonal, stripe and even transversal modes. These patterns are found to directly capture the symmetry and scale of the cell boundary, and the Min concentration gradients scale with the cell size within a characteristic length range of 3-6 μm. Numerical simulations reveal that local microscopic Turing kinetics of Min proteins can yield global symmetry selection, gradient scaling and an adaptive range, when and only when facilitated by the three-dimensional confinement of the cell boundary. These findings cannot be explained by previous geometry-sensing models based on the longest distance, membrane area or curvature, and reveal that spatial boundaries can facilitate simple molecular interactions to result in far more versatile functions than previously understood.

  8. Bacterial toxicity comparison between nano- and micro-scaled oxide particles

    Energy Technology Data Exchange (ETDEWEB)

    Jiang Wei; Mashayekhi, Hamid [Department of Plant, Soil and Insect Sciences, University of Massachusetts, Stockbridge Hall, Amherst, MA 01003 (United States); Xing Baoshan, E-mail: bx@pssci.umass.ed [Department of Plant, Soil and Insect Sciences, University of Massachusetts, Stockbridge Hall, Amherst, MA 01003 (United States)

    2009-05-15

    Toxicity of nano-scaled aluminum, silicon, titanium and zinc oxides to bacteria (Bacillus subtilis, Escherichia coli and Pseudomonas fluorescens) was examined and compared to that of their respective bulk (micro-scaled) counterparts. All nanoparticles but titanium oxide showed higher toxicity (at 20 mg/L) than their bulk counterparts. Toxicity of released metal ions was differentiated from that of the oxide particles. ZnO was the most toxic among the three nanoparticles, causing 100% mortality to the three tested bacteria. Al₂O₃ nanoparticles had a mortality rate of 57% to B. subtilis, 36% to E. coli, and 70% to P. fluorescens. SiO₂ nanoparticles killed 40% of B. subtilis, 58% of E. coli, and 70% of P. fluorescens. TEM images showed attachment of nanoparticles to the bacteria, suggesting that the toxicity was affected by bacterial attachment. Bacterial responses to nanoparticles were different from their bulk counterparts; hence nanoparticle toxicity mechanisms need to be studied thoroughly. - Oxide nanoparticles show higher toxicity than their bulk counterparts

  9. Genome-Scale Metabolic Modeling in the Simulation of Field-Scale Uranium Bioremediation

    Science.gov (United States)

    Yabusaki, S.; Wilkins, M.; Fang, Y.; Williams, K. H.; Waichler, S.; Long, P. E.

    2015-12-01

    Coupled variably saturated flow and biogeochemical reactive transport modeling is used to improve understanding of the processes, properties, and conditions controlling uranium bio-immobilization in a field experiment where uranium-contaminated groundwater was amended with acetate and bicarbonate. The acetate stimulates indigenous microorganisms that catalyze metal reduction, including the conversion of aqueous U(VI) to solid-phase U(IV), which effectively removes uranium from solution. The initiation of the bicarbonate amendment prior to biostimulation was designed to promote U(VI) desorption that would increase the aqueous U(VI) available for bioreduction. The three-dimensional simulations were able to largely reproduce the timing and magnitude of the physical, chemical and biological responses to the acetate and bicarbonate amendment in the context of changing water table elevation and gradient. A time series of groundwater proteomic samples exhibited correlations between the most abundant Geobacter metallireducens proteins and the genome-scale metabolic model-predicted fluxes of intra-cellular reactions associated with each of those proteins. The desorption of U(VI) induced by the bicarbonate amendment led to initially higher rates of bioreduction compared to locations with minimal bicarbonate exposure. After bicarbonate amendment ceased, bioreduction continued at these locations whereas U(VI) sorption was the dominant removal mechanism at the bicarbonate-impacted sites.

  10. Direct-to-consumer genomics on the scales of autonomy.

    Science.gov (United States)

    Vayena, Effy

    2015-04-01

    Direct-to-consumer (DTC) genetic services have generated enormous controversy from their first emergence. A dramatic recent manifestation of this is the Food and Drug Administration's (FDA) cease and desist order against 23andMe, the leading provider in the market. Critics have argued for the restrictive regulation of such services, and even their prohibition, on the grounds of the harm they pose to consumers. Their advocates, by contrast, defend them as a means of enhancing the autonomy of those same consumers. Autonomy emerges as a key battle-field in this debate, because many of the 'harm' arguments can be interpreted as identifying threats to autonomy. This paper assesses whether DTC genomic services are a threat to, or instead, an enhancement of, personal autonomy. It deploys Joseph Raz's account of personal autonomy, with its emphasis on choice from a range of valuable options. It then seeks to counter claims that DTC genomics threatens autonomy because it involves manipulation in contravention of consumers' independence or because it does not generate valuable options which can be meaningfully engaged with by consumers. It is stressed that the value of the options generated by DTC genomics should not be judged exclusively from the perspective of medical actionability, but should take into consideration plural utilities. Finally, the paper ends by broaching policy recommendations, suggesting that there is a strong autonomy-based argument for permitting DTC genomic services, and that the key question is the nature of the regulatory conditions under which they should be permitted. The discussion of autonomy in this paper helps illuminate some of these conditions. PMID:24797610

  11. The Genomic Sequence of the Oral Pathobiont Strain NI1060 Reveals Unique Strategies for Bacterial Competition and Pathogenicity.

    Directory of Open Access Journals (Sweden)

    Youssef Darzi

    Strain NI1060 is an oral bacterium responsible for periodontitis in a murine ligature-induced disease model. To better understand its pathogenicity, we have determined the complete sequence of its 2,553,982 bp genome. Although closely related to Pasteurella pneumotropica, a pneumonia-associated rodent commensal based on its 16S rRNA, the NI1060 genomic content suggests that they are different species thriving on different energy sources via alternative metabolic pathways. Genomic and phylogenetic analyses showed that strain NI1060 is distinct from the genera currently described in the family Pasteurellaceae, and is likely to represent a novel species. In addition, we found putative virulence genes involved in lipooligosaccharide synthesis, adhesins and bacteriotoxic proteins. These genes are potentially important for host adaption and for the induction of dysbiosis through bacterial competition and pathogenicity. Importantly, strain NI1060 strongly stimulates Nod1, an innate immune receptor, but is defective in two peptidoglycan recycling genes due to a frameshift mutation. The in-depth analysis of its genome thus provides critical insights for the development of NI1060 as a prime model system for infectious disease.

  12. TIGER: Toolbox for integrating genome-scale metabolic models, expression data, and transcriptional regulatory networks

    Directory of Open Access Journals (Sweden)

    Jensen Paul A

    2011-09-01

    Background: Several methods have been developed for analyzing genome-scale models of metabolism and transcriptional regulation. Many of these methods, such as Flux Balance Analysis, use constrained optimization to predict relationships between metabolic flux and the genes that encode and regulate enzyme activity. Recently, mixed integer programming has been used to encode these gene-protein-reaction (GPR) relationships into a single optimization problem, but these techniques are often of limited generality and lack a tool for automating the conversion of rules to a coupled regulatory/metabolic model. Results: We present TIGER, a Toolbox for Integrating Genome-scale Metabolism, Expression, and Regulation. TIGER converts a series of generalized, Boolean or multilevel rules into a set of mixed integer inequalities. The package also includes implementations of existing algorithms to integrate high-throughput expression data with genome-scale models of metabolism and transcriptional regulation. We demonstrate how TIGER automates the coupling of a genome-scale metabolic model with GPR logic and models of transcriptional regulation, thereby serving as a platform for algorithm development and large-scale metabolic analysis. Additionally, we demonstrate how TIGER's algorithms can be used to identify inconsistencies and improve existing models of transcriptional regulation with examples from the reconstructed transcriptional regulatory network of Saccharomyces cerevisiae. Conclusion: The TIGER package provides a consistent platform for algorithm development and extending existing genome-scale metabolic models with regulatory networks and high-throughput data.
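
    The conversion of Boolean rules into mixed integer inequalities mentioned in this abstract can be illustrated with the standard textbook linearization of AND and OR over binary variables. The sketch below is an assumption-laden illustration only: it is not TIGER's code or API, the function names are hypothetical, and it checks the inequalities by brute-force enumeration rather than calling a solver.

```python
from itertools import product

def and_constraints(x, y, z):
    # Textbook linearization of z = x AND y for binary variables:
    #   z <= x,  z <= y,  z >= x + y - 1
    return z <= x and z <= y and z >= x + y - 1

def or_constraints(x, y, z):
    # Textbook linearization of z = x OR y for binary variables:
    #   z >= x,  z >= y,  z <= x + y
    return z >= x and z >= y and z <= x + y

def feasible_outputs(constraints, x, y):
    # Binary values of z that satisfy every inequality for fixed x, y.
    return [z for z in (0, 1) if constraints(x, y, z)]

def check_encoding():
    # For each binary input pair, the inequalities must admit exactly
    # one z, and it must equal the Boolean result.
    for x, y in product((0, 1), repeat=2):
        assert feasible_outputs(and_constraints, x, y) == [x & y]
        assert feasible_outputs(or_constraints, x, y) == [x | y]
    return True
```

    In a GPR setting, x and y would stand for the on/off states of two genes and z for the availability of the reaction they govern: AND would correspond to subunits of an enzyme complex, OR to isozymes.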

  13. Genome Sequences of 12 Bacterial Isolates Obtained from the Urine of Pregnant Women

    Science.gov (United States)

    Weimer, Cory M.; Deitzler, Grace E.; Robinson, Lloyd S.; Park, SoEun; Hallsworth-Pepin, Kymberlie; Wollam, Aye; Mitreva, Makedonka

    2016-01-01

    The presence of bacteria in urine can pose significant risks during pregnancy. However, there are few reference genome strains for many common urinary bacteria. We isolated 12 urinary strains of Streptococcus, Staphylococcus, Citrobacter, Gardnerella, and Lactobacillus. These strains and their genomes are now available to the research community. PMID:27688327

  14. Draft Genome Sequence of the Shellfish Bacterial Pathogen Vibrio sp. Strain B183.

    Science.gov (United States)

    Schreier, Harold J; Schott, Eric J

    2014-09-18

    We report the draft genome sequence of Vibrio sp. strain B183, a Gram-negative marine bacterium isolated from shellfish that causes mortality in larval mariculture. The availability of this genome sequence will facilitate the study of its virulence mechanisms and add to our knowledge of Vibrio sp. diversity and evolution.

  15. Draft Genome Sequence of Nocardia jinanensis, an Opportunistic Bacterial Pathogen That Causes Cellulitis

    Science.gov (United States)

    Chakrabortti, Alolika; Li, Jinming

    2016-01-01

    The draft genome sequence of Nocardia jinanensis, an opportunistic pathogen that can cause skin infections, reveals genes that may contribute to the lifestyle and pathogenicity of N. jinanensis. The genome also reveals the biosynthetic capacity of N. jinanensis in producing mycolic acids, siderophores, and other polyketide and nonribosomal peptide-derived secondary metabolites. PMID:27445366

  16. Draft Genome Sequence of Nocardia jinanensis, an Opportunistic Bacterial Pathogen That Causes Cellulitis.

    Science.gov (United States)

    Chakrabortti, Alolika; Li, Jinming; Liang, Zhao-Xun

    2016-01-01

    The draft genome sequence of Nocardia jinanensis, an opportunistic pathogen that can cause skin infections, reveals genes that may contribute to the lifestyle and pathogenicity of N. jinanensis. The genome also reveals the biosynthetic capacity of N. jinanensis in producing mycolic acids, siderophores, and other polyketide and nonribosomal peptide-derived secondary metabolites. PMID:27445366

  17. Genome Sequences of 12 Bacterial Isolates Obtained from the Urine of Pregnant Women.

    Science.gov (United States)

    Weimer, Cory M; Deitzler, Grace E; Robinson, Lloyd S; Park, SoEun; Hallsworth-Pepin, Kymberlie; Wollam, Aye; Mitreva, Makedonka; Lewis, Warren G; Lewis, Amanda L

    2016-01-01

    The presence of bacteria in urine can pose significant risks during pregnancy. However, there are few reference genome strains for many common urinary bacteria. We isolated 12 urinary strains of Streptococcus, Staphylococcus, Citrobacter, Gardnerella, and Lactobacillus. These strains and their genomes are now available to the research community. PMID:27688327

  18. Genomic DNA fingerprint analysis of biotype 1 Gardnerella vaginalis from patients with and without bacterial vaginosis.

    Science.gov (United States)

    Wu, S R; Hillier, S L; Nath, K

    1996-01-01

    Of the 20 biotype 1 Gardnerella vaginalis isolates analyzed, 10 from patients with bacterial vaginosis and 10 from patients without bacterial vaginosis, none shared the same DNA fingerprint. However, a 1.18-kb HindIII fragment was common among 18 of the 20 biotype 1 isolates in a restriction fragment length polymorphism analysis with a 7.9-kb G. vaginalis DNA probe. PMID:8748302

  19. In situ spatial patterns of soil bacterial populations, mapped at multiple scales, in an arable soil.

    Science.gov (United States)

    Nunan, N; Wu, K; Young, I M; Crawford, J W; Ritz, K

    2002-11-01

    Very little is known about the spatial organization of soil microbes across scales that are relevant both to microbial function and to field-based processes. The spatial distributions of microbes and microbially mediated activity have a high intrinsic variability. This can present problems when trying to quantify the effects of disturbance, management practices, or climate change on soil microbial systems and attendant function. A spatial sampling regime was implemented in an arable field. Cores of undisturbed soil were sampled from a 3 x 3 x 0.9 m volume of soil (topsoil and subsoil) and a biological thin section, in which the in situ distribution of bacteria could be quantified, prepared from each core. Geostatistical analysis was used to quantify the nature of spatial structure from micrometers to meters and spatial point pattern analysis to test for deviations from complete spatial randomness of mapped bacteria. Spatial structure in the topsoil was only found at the microscale (micrometers), whereas evidence for nested scales of spatial structure was found in the subsoil (at the microscale, and at the centimeter to meter scale). Geostatistical ranges of spatial structure at the micro scale were greater in the topsoil and tended to decrease with depth in the subsoil. Evidence for spatial aggregation in bacteria was stronger in the topsoil and also decreased with depth in the subsoil, though extremely high degrees of aggregation were found at very short distances in the deep subsoil. The data suggest that factors that regulate the distribution of bacteria in the subsoil operate at two scales, in contrast to one scale in the topsoil, and that bacterial patches are larger and more prevalent in the topsoil.

  20. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes

    Directory of Open Access Journals (Sweden)

    Yang Yi-Fan

    2007-03-01

    Background: Despite remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. Results: This paper describes a new prokaryotic gene-finding algorithm based on a comprehensive statistical model of protein-coding Open Reading Frames (ORFs) and Translation Initiation Sites (TISs). The former is based on a linguistic "Entropy Density Profile" (EDP) model of coding DNA sequence and the latter comprises several relevant features related to translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED) algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Conclusion: Results of extensive tests show that MED 2.0 achieves competitively high performance in gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

  1. Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements

    OpenAIRE

    Darling, Aaron C.E.; Mau, Bob; Blattner, Frederick R.; Perna, Nicole T.

    2004-01-01

    As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments conserved among all the genomes under considera...

  2. Savant Genome Browser 2: visualization and analysis for population-scale genomics.

    Science.gov (United States)

    Fiume, Marc; Smith, Eric J M; Brook, Andrew; Strbenac, Dario; Turner, Brian; Mezlini, Aziz M; Robinson, Mark D; Wodak, Shoshana J; Brudno, Michael

    2012-07-01

    High-throughput sequencing (HTS) technologies are providing an unprecedented capacity for data generation, and there is a corresponding need for efficient data exploration and analysis capabilities. Although most existing tools for HTS data analysis are developed for either automated (e.g. genotyping) or visualization (e.g. genome browsing) purposes, such tools are most powerful when combined. For example, integration of visualization and computation allows users to iteratively refine their analyses by updating computational parameters within the visual framework in real-time. Here we introduce the second version of the Savant Genome Browser, a standalone program for visual and computational analysis of HTS data. Savant substantially improves upon its predecessor and existing tools by introducing innovative visualization modes and navigation interfaces for several genomic datatypes, and synergizing visual and automated analyses in a way that is powerful yet easy even for non-expert users. We also present a number of plugins that were developed by the Savant Community, which demonstrate the power of integrating visual and automated analyses using Savant. The Savant Genome Browser is freely available (open source) at www.savantbrowser.com.

  3. A tandem repeats database for bacterial genomes: application to the genotyping of Yersinia pestis and Bacillus anthracis

    Directory of Open Access Journals (Sweden)

    Denoeud France

    2001-03-01

    Background: Some pathogenic bacteria are genetically very homogeneous, making strain discrimination difficult. In the last few years, tandem repeats have been increasingly recognized as markers of choice for genotyping a number of pathogens. The rapid evolution of these structures appears to contribute to the phenotypic flexibility of pathogens. The availability of whole-genome sequences has opened the way to the systematic evaluation of tandem repeats diversity and application to epidemiological studies. Results: This report presents a database (http://minisatellites.u-psud.fr) of tandem repeats from publicly available bacterial genomes which facilitates the identification and selection of tandem repeats. We illustrate the use of this database by the characterization of minisatellites from two important human pathogens, Yersinia pestis and Bacillus anthracis. In order to avoid simple sequence contingency loci which may be of limited value as epidemiological markers, and to provide genotyping tools amenable to ordinary agarose gel electrophoresis, only tandem repeats with repeat units at least 9 bp long were evaluated. Yersinia pestis contains 64 such minisatellites in which the unit is repeated at least 7 times. An additional collection of 12 loci with at least 6 units and high internal conservation was also evaluated. Forty-nine are polymorphic among five Yersinia strains (twenty-five among three Y. pestis strains). Bacillus anthracis contains 30 comparable structures in which the unit is repeated at least 10 times. Half of these tandem repeats show polymorphism among the strains tested. Conclusions: Analysis of the currently available bacterial genome sequences classifies Bacillus anthracis and Yersinia pestis as having an average (approximately 30 per Mb) density of tandem repeat arrays longer than 100 bp when compared to the other bacterial genomes analysed to date. In both cases, testing a fraction of these sequences for
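
    The selection criteria quoted in this record (repeat units of at least 9 bp, repeated at least 7 times) can be illustrated with a naive scanner. This is a minimal sketch under stated assumptions: it is not the software behind the database, the function name and the `max_unit` cap are hypothetical, and it finds only exact repeats, whereas practical tandem-repeat finders tolerate mismatches between copies.

```python
def find_tandem_repeats(seq, min_unit=9, max_unit=30, min_copies=7):
    """Return (start, unit, copies) for exact tandem repeats in seq.

    Naive illustration: for each unit length, slide along the sequence
    and count how many times the unit at that position is immediately
    repeated. Imperfect repeats are not detected, and an array with
    many copies may also be reported under a multiple of its unit length.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        start = 0
        # Stop once the remaining sequence cannot hold min_copies units.
        while start + unit_len * min_copies <= n:
            unit = seq[start:start + unit_len]
            copies = 1
            pos = start + unit_len
            while seq[pos:pos + unit_len] == unit:
                copies += 1
                pos += unit_len
            if copies >= min_copies:
                hits.append((start, unit, copies))
                start = pos  # skip past the reported array
            else:
                start += 1
    return hits
```

    For example, a sequence built as three T's, eight copies of the 9 bp unit ACGTACGTA, and three G's yields a single hit at offset 3 with 8 copies.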

  4. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system

    OpenAIRE

    Speth, D.R.; Zandt, M.H. in 't; Guerrero Cruz, S.; Dutilh, B.E.; Jetten, M. S. M.

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete d...

  5. Reconstruction of a genome-scale metabolic network for Streptococcus pneumoniae R6

    OpenAIRE

    J.P. Saraiva; Pinto, Francisco; Rocha, I

    2013-01-01

    The gram-positive, lancet-shaped bacteria Streptococcus pneumoniae thrives in almost any environment. Under certain conditions this pathogen can cause several infections such as meningitis, otitis media, endocarditis or pneumonia. Genome-scale metabolic networks (GSMs) are commonly used to study phenotype-genotype relationships using biochemical, physiological and genomic information. These relationships might shed some light on identification of targets for metabolic engineering or, in t...

  6. In Silico Genome-Scale Reconstruction and Validation of the Corynebacterium glutamicum Metabolic Network

    DEFF Research Database (Denmark)

    Kjeldsen, Kjeld Raunkjær; Nielsen, J.

    2009-01-01

    A genome-scale metabolic model of the Gram-positive bacterium Corynebacterium glutamicum ATCC 13032 was constructed comprising 446 reactions and 411 metabolites, based on the annotated genome and available biochemical information. The network was analyzed using constraint based methods. The model...... and lactate. Comparable flux values between in silico model and experimental values were seen, although some differences in the phenotypic behavior between the model and the experimental data were observed,...

  7. A versatile genome-scale PCR-based pipeline for high-definition DNA FISH.

    Science.gov (United States)

    Bienko, Magda; Crosetto, Nicola; Teytelman, Leonid; Klemm, Sandy; Itzkovitz, Shalev; van Oudenaarden, Alexander

    2013-02-01

    We developed a cost-effective genome-scale PCR-based method for high-definition DNA FISH (HD-FISH). We visualized gene loci with diffraction-limited resolution, chromosomes as spot clusters and single genes together with transcripts by combining HD-FISH with single-molecule RNA FISH. We provide a database of over 4.3 million primer pairs targeting the human and mouse genomes that is readily usable for rapid and flexible generation of probes.

  8. Strain Dependent Genetic Networks for Antibiotic-Sensitivity in a Bacterial Pathogen with a Large Pan-Genome.

    Science.gov (United States)

    van Opijnen, Tim; Dedrick, Sandra; Bento, José

    2016-09-01

    The interaction between an antibiotic and bacterium is not merely restricted to the drug and its direct target, rather antibiotic induced stress seems to resonate through the bacterium, creating selective pressures that drive the emergence of adaptive mutations not only in the direct target, but in genes involved in many different fundamental processes as well. Surprisingly, it has been shown that adaptive mutations do not necessarily have the same effect in all species, indicating that the genetic background influences how phenotypes are manifested. However, to what extent the genetic background affects the manner in which a bacterium experiences antibiotic stress, and how this stress is processed is unclear. Here we employ the genome-wide tool Tn-Seq to construct daptomycin-sensitivity profiles for two strains of the bacterial pathogen Streptococcus pneumoniae. Remarkably, over half of the genes that are important for dealing with antibiotic-induced stress in one strain are dispensable in another. By confirming over 100 genotype-phenotype relationships, probing potassium-loss, employing genetic interaction mapping as well as temporal gene-expression experiments we reveal genome-wide conditionally important/essential genes, we discover roles for genes with unknown function, and uncover parts of the antibiotic's mode-of-action. Moreover, by mapping the underlying genomic network for two query genes we encounter little conservation in network connectivity between strains as well as profound differences in regulatory relationships. Our approach uniquely enables genome-wide fitness comparisons across strains, facilitating the discovery that antibiotic responses are complex events that can vary widely between strains, which suggests that in some cases the emergence of resistance could be strain specific and at least for species with a large pan-genome less predictable. PMID:27607357

  9. From genetic circuits to industrial-scale biomanufacturing: bacterial promoters as a cornerstone of biotechnology

    Directory of Open Access Journals (Sweden)

    Pawel Jajesniak

    2015-08-01

    Since the advent of genetic engineering, Escherichia coli, the most widely studied prokaryotic model organism, and other bacterial species have remained at the forefront of biological research. These ubiquitous microorganisms play an essential role in deciphering complex gene regulation mechanisms, large-scale recombinant protein production, and lately the two emerging areas of biotechnology: synthetic biology and metabolic engineering. Among a myriad of factors affecting prokaryotic gene expression, judicious choice of promoter remains one of the most challenging and impactful decisions in many biological experiments. This review provides a comprehensive overview of the current state of bacterial promoter engineering, with an emphasis on its applications in heterologous protein production, synthetic biology and metabolic engineering. In addition to highlighting relevant advances in these fields, the article facilitates the selection of an appropriate promoter by providing pertinent guidelines and explores the development of complementary databases, bioinformatics tools and promoter standardization procedures. The review ends by providing a quick overview of other emerging technologies and future prospects of this vital research area.

  10. On the limits of computational functional genomics for bacterial lifestyle prediction

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Röttger, Richard; Hauschild, Anne-Christin;

    2014-01-01

    of an observation bias, i.e. many HPs might yet be unclassified BPs. (H4) There is no intrinsic genomic characteristic of OPs compared with pathogens, as small mutations are likely to play a more dominant role to survive the immune system. To study these hypotheses, we implemented a bioinformatics pipeline......We review the level of genomic specificity regarding actinobacterial pathogenicity. As they occupy various niches in diverse habitats, one may assume the existence of lifestyle-specific genomic features. We include 240 actinobacteria classified into four pathogenicity classes: human pathogens (HPs...... in the post-genome era and despite next-generation sequencing technology, our ability to efficiently deduce real-world conclusions, such as pathogenicity classification, remains quite limited....

  11. ANItools web: a web tool for fast genome comparison within multiple bacterial strains

    OpenAIRE

    Han, Na; Qiang, Yujun; Zhang, Wen

    2016-01-01

    Background: Early classification of prokaryotes was based solely on phenotypic similarities, but modern prokaryote characterization has been strongly influenced by advances in genetic methods. With the fast development of the sequencing technology, the ever increasing number of genomic sequences per species offers the possibility for developing distance determinations based on whole-genome information. The average nucleotide identity (ANI), calculated from pair-wise comparisons of all sequenc...

  12. Genomic epidemiology and global diversity of the emerging bacterial pathogen Elizabethkingia anophelis.

    Science.gov (United States)

    Breurec, Sebastien; Criscuolo, Alexis; Diancourt, Laure; Rendueles, Olaya; Vandenbogaert, Mathias; Passet, Virginie; Caro, Valérie; Rocha, Eduardo P C; Touchon, Marie; Brisse, Sylvain

    2016-01-01

    Elizabethkingia anophelis is an emerging pathogen involved in human infections and outbreaks in distinct world regions. We investigated the phylogenetic relationships and pathogenesis-associated genomic features of two neonatal meningitis isolates isolated 5 years apart from one hospital in Central African Republic and compared them with Elizabethkingia from other regions and sources. Average nucleotide identity firmly confirmed that E. anophelis, E. meningoseptica and E. miricola represent demarcated genomic species. A core genome multilocus sequence typing scheme, broadly applicable to Elizabethkingia species, was developed and made publicly available (http://bigsdb.pasteur.fr/elizabethkingia). Phylogenetic analysis revealed distinct E. anophelis sublineages and demonstrated high genetic relatedness between the African isolates, compatible with persistence of the strain in the hospital environment. CRISPR spacer variation between the African isolates was mirrored by the presence of a large mobile genetic element. The pan-genome of E. anophelis comprised 6,880 gene families, underlining genomic heterogeneity of this species. African isolates carried unique resistance genes acquired by horizontal transfer. We demonstrated the presence of extensive variation of the capsular polysaccharide synthesis gene cluster in E. anophelis. Our results demonstrate the dynamic evolution of this emerging pathogen and the power of genomic approaches for Elizabethkingia identification, population biology and epidemiology. PMID:27461509

  13. Whole genome amplification and de novo assembly of single bacterial cells.

    Directory of Open Access Journals (Sweden)

    Sébastien Rodrigue

    BACKGROUND: Single-cell genome sequencing has the potential to allow the in-depth exploration of the vast genetic diversity found in uncultured microbes. We used the marine cyanobacterium Prochlorococcus as a model system for addressing important challenges facing high-throughput whole genome amplification (WGA) and complete genome sequencing of individual cells. METHODOLOGY/PRINCIPAL FINDINGS: We describe a pipeline that enables single-cell WGA on hundreds of cells at a time while virtually eliminating non-target DNA from the reactions. We further developed a post-amplification normalization procedure that mitigates extreme variations in sequencing coverage associated with multiple displacement amplification (MDA), and demonstrated that the procedure increased sequencing efficiency and facilitated genome assembly. We report genome recovery as high as 99.6% with reference-guided assembly, and 95% with de novo assembly starting from a single cell. We also analyzed the impact of chimera formation during MDA on de novo assembly, and discuss strategies to minimize the presence of incorrectly joined regions in contigs. CONCLUSIONS/SIGNIFICANCE: The methods described in this paper will be useful for sequencing genomes of individual cells from a variety of samples.

  14. Characteristic Length Scale of Electric Transport Properties of Genomes

    CERN Document Server

    Shih, C T

    2005-01-01

    A tight-binding model and a novel statistical method are used to investigate the relation between the sequence-dependent electric transport properties and the sequences of protein-coding regions of complete genomes. A correlation parameter $\Omega$ is defined to analyze the relation. For some particular propagation length $w_{max}$, the transport behaviors of the coding and non-coding sequences are very different and the correlation reaches its maximal value $\Omega_{max}$. $w_{max}$ and $\Omega_{max}$ are characteristic values for each species. A possible reason for the difference between the transport properties of the coding and non-coding regions is the mechanism of DNA damage repair processes together with natural selection.

  15. Comparative Genomic Analysis of Xanthomonas axonopodis pv. citrumelo F1, Which Causes Citrus Bacterial Spot Disease, and Related Strains Provides Insights into Virulence and Host Specificity

    OpenAIRE

    Jalan, Neha; Aritua, Valente; Kumar, Dibyendu; Yu, Fahong; Jones, Jeffrey B; Graham, James H; Setubal, João C; Wang, Nian

    2011-01-01

    Xanthomonas axonopodis pv. citrumelo is a citrus pathogen causing citrus bacterial spot disease that is geographically restricted within the state of Florida. Illumina, 454 sequencing, and optical mapping were used to obtain a complete genome sequence of X. axonopodis pv. citrumelo strain F1, 4.9 Mb in size. The strain lacks plasmids, in contrast to other citrus Xanthomonas pathogens. Phylogenetic analysis revealed that this pathogen is very close to the tomato bacterial spot pathogen X. camp...

  16. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium.

    Science.gov (United States)

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria that cooperate with Ketogulonigenium vulgare in two-step vitamin C fermentation. The two Bacillus species differ in morphology, swarming motility and 2-keto-L-gulonic acid productivity when co-cultured with K. vulgare. Here, we report the complete genome sequencing of B. thuringiensis Bc601 and eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomics analysis. B. thuringiensis Bc601, with a greater ability to respond to the external environment, was found to have more proteins related to two-component systems, sporulation coat and peptidoglycan biosynthesis than B. endophyticus Hbe603. B. endophyticus Hbe603, with a greater capacity for nutrient biosynthesis, was found to have more proteins related to alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation than B. thuringiensis Bc601. The differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms of the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis enables us to further understand their cooperative mechanism with K. vulgare, and facilitates the optimization of the bacterial consortium. PMID:27353048

  18. Bacterial community structure and variation in a full-scale seawater desalination plant for drinking water production

    KAUST Repository

    Belila, A.

    2016-02-18

    Microbial processes inevitably play a role in membrane-based desalination plants, mainly recognized as membrane biofouling. We assessed the bacterial community structure and diversity during different treatment steps in a full-scale seawater desalination plant producing 40,000 m³/d of drinking water. Water samples were taken over the full treatment train consisting of chlorination, spruce media and cartridge filters, de-chlorination, first and second pass reverse osmosis (RO) membranes and final chlorine dosage for drinking water distribution. The water samples were analyzed for water quality parameters (total bacterial cell number, total organic carbon, conductivity, pH, etc.) and microbial community composition by 16S rRNA gene pyrosequencing. The planktonic microbial community was dominated by Proteobacteria (48.6%) followed by Bacteroidetes (15%), Firmicutes (9.3%) and Cyanobacteria (4.9%). During the pretreatment step, the spruce media filter did not impact the bacterial community composition dominated by Proteobacteria. In contrast, the RO and final chlorination treatment steps reduced the Proteobacterial relative abundance in the produced water where Firmicutes constituted the most dominant bacterial group. Shannon and Chao1 diversity indices showed that bacterial species richness and diversity decreased during the seawater desalination process. The two-stage RO filtration strongly reduced the water conductivity (>99%), TOC concentration (98.5%) and total bacterial cell number (>99%), albeit some bacterial DNA was found in the water after RO filtration. About 0.25% of the total bacterial operational taxonomic units (OTUs) were present in all stages of the desalination plant: the seawater, the RO permeates and the chlorinated drinking water, suggesting that these bacterial strains can survive in different environments such as high/low salt concentration and with/without residual disinfectant. These bacterial strains were not caused by contamination during

  19. Bacterial community structure and variation in a full-scale seawater desalination plant for drinking water production.

    Science.gov (United States)

    Belila, A; El-Chakhtoura, J; Otaibi, N; Muyzer, G; Gonzalez-Gil, G; Saikaly, P E; van Loosdrecht, M C M; Vrouwenvelder, J S

    2016-05-01

    Microbial processes inevitably play a role in membrane-based desalination plants, mainly recognized as membrane biofouling. We assessed the bacterial community structure and diversity during different treatment steps in a full-scale seawater desalination plant producing 40,000 m³/d of drinking water. Water samples were taken over the full treatment train consisting of chlorination, spruce media and cartridge filters, de-chlorination, first and second pass reverse osmosis (RO) membranes and final chlorine dosage for drinking water distribution. The water samples were analyzed for water quality parameters (total bacterial cell number, total organic carbon, conductivity, pH, etc.) and microbial community composition by 16S rRNA gene pyrosequencing. The planktonic microbial community was dominated by Proteobacteria (48.6%) followed by Bacteroidetes (15%), Firmicutes (9.3%) and Cyanobacteria (4.9%). During the pretreatment step, the spruce media filter did not impact the bacterial community composition dominated by Proteobacteria. In contrast, the RO and final chlorination treatment steps reduced the Proteobacterial relative abundance in the produced water where Firmicutes constituted the most dominant bacterial group. Shannon and Chao1 diversity indices showed that bacterial species richness and diversity decreased during the seawater desalination process. The two-stage RO filtration strongly reduced the water conductivity (>99%), TOC concentration (98.5%) and total bacterial cell number (>99%), albeit some bacterial DNA was found in the water after RO filtration. About 0.25% of the total bacterial operational taxonomic units (OTUs) were present in all stages of the desalination plant: the seawater, the RO permeates and the chlorinated drinking water, suggesting that these bacterial strains can survive in different environments such as high/low salt concentration and with/without residual disinfectant. These bacterial strains were not caused by contamination during
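
    The abstract above reports Shannon and Chao1 indices without defining them. As a minimal sketch, assuming the standard textbook definitions (Shannon H' = -sum of p_i ln p_i; Chao1 = observed richness plus a correction based on singleton and doubleton OTUs), the two indices could be computed as follows. The OTU counts are hypothetical and are not data from the study.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Chao1 richness estimate from singleton (F1) and doubleton (F2) OTUs."""
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 > 0:
        return observed + f1 * f1 / (2 * f2)
    # Bias-corrected form, used here when no doubletons are present.
    return observed + f1 * (f1 - 1) / 2


# Hypothetical OTU read counts for two sampling points (illustration only).
seawater = [120, 80, 40, 10, 5, 2, 2, 1, 1, 1]
permeate = [200, 30, 5, 1]

for name, counts in (("seawater", seawater), ("permeate", permeate)):
    print(name, round(shannon_index(counts), 3), chao1(counts))
```

    In this toy input the second sample has both a lower Shannon index and a lower Chao1 estimate, which is the kind of pattern the abstract describes across the treatment train.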

  1. Complete genome sequence of the extremely acidophilic methanotroph isolate V4, Methylacidiphilum infernorum, a representative of the bacterial phylum Verrucomicrobia

    Directory of Open Access Journals (Sweden)

    Stott Matthew B

    2008-07-01

    Background: The phylum Verrucomicrobia is a widespread but poorly characterized bacterial clade. Although cultivation-independent approaches detect representatives of this phylum in a wide range of environments, including soils, seawater, hot springs and human gastrointestinal tract, only a few have been isolated in pure culture. We have recently reported cultivation and initial characterization of an extremely acidophilic methanotrophic member of the Verrucomicrobia, strain V4, isolated from the Hell's Gate geothermal area in New Zealand. Similar organisms were independently isolated from geothermal systems in Italy and Russia. Results: We report the complete genome sequence of strain V4, the first one from a representative of the Verrucomicrobia. Isolate V4, initially named "Methylokorus infernorum" (and recently renamed Methylacidiphilum infernorum), is an autotrophic bacterium with a streamlined genome of ~2.3 Mbp that encodes simple signal transduction pathways and has a limited potential for regulation of gene expression. Central metabolism of M. infernorum was reconstructed almost completely and revealed highly interconnected pathways of autotrophic central metabolism and modifications of C1-utilization pathways compared to other known methylotrophs. The M. infernorum genome does not encode tubulin, which was previously discovered in bacteria of the genus Prosthecobacter, or close homologs of any other signature eukaryotic proteins. Phylogenetic analysis of ribosomal proteins and RNA polymerase subunits unequivocally supports grouping Planctomycetes, Verrucomicrobia and Chlamydiae into a single clade, the PVC superphylum, despite dramatically different gene content in members of these three groups. Comparative-genomic analysis suggests that evolution of the M. infernorum lineage involved extensive horizontal gene exchange with a variety of bacteria. The genome of M. infernorum shows apparent adaptations for existence under extremely

  2. Modeling Method for Increased Precision and Scope of Directly Measurable Fluxes at a Genome-Scale

    DEFF Research Database (Denmark)

    McCloskey, Douglas; Young, Jamey D.; Xu, Sibei;

    2016-01-01

    -off between increased scope and decreased precision in flux estimations. This work presents a tunable workflow for expanding the scope of MFA to the genome-scale without trade-offs in flux precision. The genome-scale MFA model presented here, iDM2014, accounts for 537 net reactions, which includes the core...... distributions (MIDs),(1) it was found that a total of 232 net fluxes of central and peripheral metabolism could be resolved in the E. coli network. The increase in scope was shown to cover the full biosynthetic route to an expanded set of bioproduction pathways, which should facilitate applications...

  3. Adaptation in Toxic Environments: Arsenic Genomic Islands in the Bacterial Genus Thiomonas.

    Directory of Open Access Journals (Sweden)

    Kelle C Freel

    Acid mine drainage (AMD) is a highly toxic environment for most living organisms due to the presence of many lethal elements including arsenic (As). Thiomonas (Tm.) bacteria are found ubiquitously in AMD and can withstand these extreme conditions, in part because they are able to oxidize arsenite. In order to further improve our knowledge concerning the adaptive capacities of these bacteria, we sequenced and assembled the genome of six isolates derived from the Carnoulès AMD, and compared them to the genomes of Tm. arsenitoxydans 3As (isolated from the same site) and Tm. intermedia K12 (isolated from a sewage pipe). A detailed analysis of the Tm. sp. CB2 genome revealed various rearrangements had occurred in comparison to what was observed in 3As and K12 and over 20 genomic islands (GEIs) were found in each of these three genomes. We performed a detailed comparison of the two arsenic-related islands found in CB2, carrying the genes required for arsenite oxidation and As resistance, with those found in K12, 3As, and five other Thiomonas strains also isolated from Carnoulès (CB1, CB3, CB6, ACO3 and ACO7). Our results suggest that these arsenic-related islands have evolved differentially in these closely related Thiomonas strains, leading to divergent capacities to survive in As rich environments.

  4. Adaptation in Toxic Environments: Arsenic Genomic Islands in the Bacterial Genus Thiomonas

    Science.gov (United States)

    Freel, Kelle C.; Krueger, Martin C.; Farasin, Julien; Brochier-Armanet, Céline; Barbe, Valérie; Andrès, Jeremy; Cholley, Pierre-Etienne; Dillies, Marie-Agnès; Jagla, Bernd; Koechler, Sandrine; Leva, Yann; Magdelenat, Ghislaine; Plewniak, Frédéric; Proux, Caroline; Coppée, Jean-Yves; Bertin, Philippe N.; Heipieper, Hermann J.; Arsène-Ploetze, Florence

    2015-01-01

    Acid mine drainage (AMD) is a highly toxic environment for most living organisms due to the presence of many lethal elements including arsenic (As). Thiomonas (Tm.) bacteria are found ubiquitously in AMD and can withstand these extreme conditions, in part because they are able to oxidize arsenite. In order to further improve our knowledge concerning the adaptive capacities of these bacteria, we sequenced and assembled the genome of six isolates derived from the Carnoulès AMD, and compared them to the genomes of Tm. arsenitoxydans 3As (isolated from the same site) and Tm. intermedia K12 (isolated from a sewage pipe). A detailed analysis of the Tm. sp. CB2 genome revealed various rearrangements had occurred in comparison to what was observed in 3As and K12 and over 20 genomic islands (GEIs) were found in each of these three genomes. We performed a detailed comparison of the two arsenic-related islands found in CB2, carrying the genes required for arsenite oxidation and As resistance, with those found in K12, 3As, and five other Thiomonas strains also isolated from Carnoulès (CB1, CB3, CB6, ACO3 and ACO7). Our results suggest that these arsenic-related islands have evolved differentially in these closely related Thiomonas strains, leading to divergent capacities to survive in As rich environments. PMID:26422469

  5. Exploring massive, genome scale datasets with the genometricorr package

    KAUST Repository

    Favorov, Alexander

    2012-05-31

    We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of high-dimensional datasets, rather than providing immediate answers. The software enables several biologically motivated approaches to these data and here we describe the rationale and implementation for each approach. Our models and statistics are implemented in an R package that efficiently calculates the spatial correlation between two sets of genomic intervals (data and/or annotated features), for use as a metric of functional interaction. The software handles any type of pointwise or interval data and instead of running analyses with predefined metrics, it computes the significance and direction of several types of spatial association; this is intended to suggest potentially relevant relationships between the datasets. Availability and implementation: The package, GenometriCorr, can be freely downloaded at http://genometricorr.sourceforge.net/. Installation guidelines and examples are available from the sourceforge repository. The package is pending submission to Bioconductor. © 2012 Favorov et al.
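
    The abstract above describes measuring spatial association between two sets of genomic intervals. As an illustration only, one simple measure of this kind is the Jaccard overlap (bases covered by both sets divided by bases covered by either). The sketch below is written in Python rather than R, is not the GenometriCorr API, and uses hypothetical coordinates on a single chromosome with half-open intervals [start, end).

```python
def merge(intervals):
    """Merge overlapping half-open intervals [start, end) into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered(intervals):
    """Number of bases covered by at least one interval."""
    return sum(end - start for start, end in merge(intervals))


def jaccard(a, b):
    """Jaccard overlap: bases in both sets divided by bases in either set."""
    union = covered(list(a) + list(b))
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    intersection = covered(a) + covered(b) - union
    return intersection / union if union else 0.0


# Hypothetical interval sets (illustration only).
peaks = [(100, 200), (300, 400)]
genes = [(150, 350)]
print(jaccard(peaks, genes))
```

    A single overlap score like this says nothing about significance; the abstract states that the package also computes the significance and direction of association, which this sketch does not attempt.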

  6. Probabilistic Clustering of Sequences Inferring new bacterial regulons by comparative genomics

    CERN Document Server

    van Nimwegen, Erik; Zavolan, Mihaela; Rajewsky, Nikolaus; Siggia, Eric D.

    2002-01-01

    Genome wide comparisons between enteric bacteria yield large sets of conserved putative regulatory sites on a gene by gene basis that need to be clustered into regulons. Using the assumption that regulatory sites can be represented as samples from weight matrices we derive a unique probability distribution for assignments of sites into clusters. Our algorithm, 'PROCSE' (probabilistic clustering of sequences), uses Monte-Carlo sampling of this distribution to partition and align thousands of short DNA sequences into clusters. The algorithm internally determines the number of clusters from the data, and assigns significance to the resulting clusters. We place theoretical limits on the ability of any algorithm to correctly cluster sequences drawn from weight matrices (WMs) when these WMs are unknown. Our analysis suggests that the set of all putative sites for a single genome (e.g. E. coli) is largely inadequate for clustering. When sites from different genomes are combined and all the homologous sites from the ...

  7. Improved bacteriophage genome data is necessary for integrating viral and bacterial ecology.

    Science.gov (United States)

    Bibby, Kyle

    2014-02-01

    The recent rise in "omics"-enabled approaches has led to improved understanding in many areas of microbial ecology. However, despite the importance of viruses in a broad microbial ecology context, viral ecology remains largely unintegrated into high-throughput microbial ecology studies. A fundamental hindrance to the integration of viral ecology into omics-enabled microbial ecology studies is the lack of suitable bacteriophage genomes in reference databases; currently, only 0.001% of bacteriophage diversity is represented in genome sequence databases. This commentary serves to highlight this issue and to promote bacteriophage genome sequencing as a valuable scientific undertaking to both better understand bacteriophage diversity and move towards a more holistic view of microbial ecology.

  8. From DNA to FBA: How to Build Your Own Genome-Scale Metabolic Model.

    Science.gov (United States)

    Cuevas, Daniel A; Edirisinghe, Janaka; Henry, Chris S; Overbeek, Ross; O'Connell, Taylor G; Edwards, Robert A

    2016-01-01

    Microbiological studies are increasingly relying on in silico methods to perform exploration and rapid analysis of genomic data, and functional genomics studies are supplemented by the new perspectives that genome-scale metabolic models offer. A mathematical model consisting of a microbe's entire metabolic map can be rapidly determined from whole-genome sequencing and annotating the genomic material encoded in its DNA. Flux-balance analysis (FBA), a linear programming technique that uses metabolic models to predict the phenotypic responses imposed by environmental elements and factors, is the leading method to simulate and manipulate cellular growth in silico. However, the process of creating an accurate model to use in FBA consists of a series of steps involving a multitude of connections between bioinformatics databases, enzyme resources, and metabolic pathways. We present the methodology and procedure to obtain a metabolic model using PyFBA, an extensible Python-based open-source software package aimed to provide a platform where functional annotations are used to build metabolic models (http://linsalrob.github.io/PyFBA). Backed by the Model SEED biochemistry database, PyFBA contains methods to reconstruct a microbe's metabolic map, run FBA upon different media conditions, and gap-fill its metabolism. The extensibility of PyFBA facilitates novel techniques in creating accurate genome-scale metabolic models. PMID:27379044
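
    The abstract above describes flux-balance analysis as a linear programming technique. A minimal sketch of that idea follows, assuming a hypothetical five-reaction toy network and using SciPy's general-purpose `linprog` solver rather than PyFBA (whose API is not shown here). The program maximizes the biomass flux subject to the steady-state constraint S·v = 0 and flux bounds.

```python
from scipy.optimize import linprog

# Hypothetical toy network (not from PyFBA or the Model SEED).
# Reactions (columns):
#   v0: uptake -> A      v1: A -> B      v2: A -> C
#   v3: B -> biomass     v4: C -> export
# Metabolites (rows): A, B, C. Entries are stoichiometric coefficients.
S = [
    [1, -1, -1, 0, 0],   # A
    [0, 1, 0, -1, 0],    # B
    [0, 0, 1, 0, -1],    # C
]

# Flux bounds: uptake is capped at 10 units, all reactions are irreversible.
bounds = [(0, 10), (0, None), (0, None), (0, None), (0, None)]

# linprog minimizes, so the biomass flux v3 is negated to maximize it.
objective = [0, 0, 0, -1, 0]

# Steady state: S @ v = 0 for every internal metabolite.
result = linprog(objective, A_eq=S, b_eq=[0, 0, 0], bounds=bounds)
growth = -result.fun

print("optimal biomass flux:", growth)
print("flux distribution:", result.x)
```

    Changing the uptake bound is the toy equivalent of simulating a different medium; a genome-scale model has the same structure with thousands of reactions.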

  10. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models.

    Science.gov (United States)

    King, Zachary A; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A; Ebrahim, Ali; Palsson, Bernhard O; Lewis, Nathan E

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data. PMID:26476456

  11. Rapid genome-scale mapping of chromatin accessibility in tissue

    Science.gov (United States)

    2012-01-01

    Background: The challenge in extracting genome-wide chromatin features from limiting clinical samples poses a significant hurdle in identification of regulatory marks that impact the physiological or pathological state. Current methods that identify nuclease accessible chromatin are reliant on large amounts of purified nuclei as starting material. This complicates analysis of trace clinical tissue samples that are often stored frozen. We have developed an alternative nuclease based procedure to bypass nuclear preparation to interrogate nuclease accessible regions in frozen tissue samples. Results: Here we introduce a novel technique that specifically identifies Tissue Accessible Chromatin (TACh). The TACh method uses pulverized frozen tissue as starting material and employs one of the two robust endonucleases, Benzonase or Cyanase, which are fully active under a range of stringent conditions such as high levels of detergent and DTT. As a proof of principle we applied TACh to frozen mouse liver tissue. Combined with massively parallel sequencing, TACh identifies accessible regions that are associated with euchromatic features, and accessibility at transcriptional start sites correlates positively with levels of gene transcription. Accessible chromatin identified by TACh overlaps to a large extent with accessible chromatin identified by DNase I using nuclei purified from freshly isolated liver tissue as starting material. The similarities are most pronounced at highly accessible regions, whereas identification of less accessible regions tends to be more divergent between nucleases. Interestingly, we show that some of the differences between DNase I and Benzonase relate to their intrinsic sequence biases and accordingly accessibility of CpG islands is probed more efficiently using TACh. Conclusion: The TACh methodology identifies accessible chromatin derived from frozen tissue samples. We propose that this simple, robust approach can be applied across a broad range of

  12. Rapid genome-scale mapping of chromatin accessibility in tissue

    Directory of Open Access Journals (Sweden)

    Grøntved Lars

    2012-06-01

    Background: The challenge in extracting genome-wide chromatin features from limiting clinical samples poses a significant hurdle in identification of regulatory marks that impact the physiological or pathological state. Current methods that identify nuclease accessible chromatin are reliant on large amounts of purified nuclei as starting material. This complicates analysis of trace clinical tissue samples that are often stored frozen. We have developed an alternative nuclease based procedure to bypass nuclear preparation to interrogate nuclease accessible regions in frozen tissue samples. Results: Here we introduce a novel technique that specifically identifies Tissue Accessible Chromatin (TACh). The TACh method uses pulverized frozen tissue as starting material and employs one of the two robust endonucleases, Benzonase or Cyanase, which are fully active under a range of stringent conditions such as high levels of detergent and DTT. As a proof of principle we applied TACh to frozen mouse liver tissue. Combined with massively parallel sequencing, TACh identifies accessible regions that are associated with euchromatic features, and accessibility at transcriptional start sites correlates positively with levels of gene transcription. Accessible chromatin identified by TACh overlaps to a large extent with accessible chromatin identified by DNase I using nuclei purified from freshly isolated liver tissue as starting material. The similarities are most pronounced at highly accessible regions, whereas identification of less accessible regions tends to be more divergent between nucleases. Interestingly, we show that some of the differences between DNase I and Benzonase relate to their intrinsic sequence biases and accordingly accessibility of CpG islands is probed more efficiently using TACh. Conclusion: The TACh methodology identifies accessible chromatin derived from frozen tissue samples. We propose that this simple, robust approach can be applied

  13. Large-scale profiling of microRNAs for The Cancer Genome Atlas.

    Science.gov (United States)

    Chu, Andy; Robertson, Gordon; Brooks, Denise; Mungall, Andrew J; Birol, Inanc; Coope, Robin; Ma, Yussanne; Jones, Steven; Marra, Marco A

    2016-01-01

    The comprehensive multiplatform genomics data generated by The Cancer Genome Atlas (TCGA) Research Network is an enabling resource for cancer research. It includes an unprecedented amount of microRNA sequence data: ~11 000 libraries across 33 cancer types. Combined with initiatives like the National Cancer Institute Genomics Cloud Pilots, such data resources will make intensive analysis of large-scale cancer genomics data widely accessible. To support such initiatives, and to enable comparison of TCGA microRNA data to data from other projects, we describe the process that we developed and used to generate the microRNA sequence data, from library construction through to submission of data to repositories. In the context of this process, we describe the computational pipeline that we used to characterize microRNA expression across large patient cohorts.

  14. A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm

    Directory of Open Access Journals (Sweden)

    Allen Eric E

    2008-10-01

    large-scale HGT patterns among protein families and genome groups. Although the DarkHorse algorithm cannot, by itself, provide definitive proof of horizontal gene transfer, it is a flexible, powerful tool that can be combined with slower, more rigorous methods in situations where these other methods could not otherwise be applied.

  15. Draft genome sequence of Erwinia tracheiphila, an economically important bacterial pathogen of cucurbits

    Science.gov (United States)

    Erwinia tracheiphila is one of the most economically important pathogens of cucumbers, melons, squashes, pumpkins, and gourds in the Northeastern and Midwestern United States, yet its molecular pathology remains uninvestigated. Here we report the first draft genome sequence of an E. tracheiphila str...

  16. Generation and Evaluation of a Genome-Scale Metabolic Network Model of Synechococcus elongatus PCC7942

    Directory of Open Access Journals (Sweden)

    Julián Triana

    2014-08-01

    The reconstruction of genome-scale metabolic models and their applications are a major strength of systems biology. Used as metabolic flux simulation models, they allow the production of industrially interesting metabolites to be predicted. Because the number of metabolic modeling studies is growing with the increasing number of genome sequencing projects, it is important to conceptualize the steps of reconstruction and analysis. We have focused our work on the cyanobacterium Synechococcus elongatus PCC7942, for which several analyses and insights are presented. A comprehensive approach has been used, which may help guide the process of manual curation and genome-scale metabolic analysis. The final model, iSyf715, includes 851 reactions and 838 metabolites. A biomass equation, which encompasses the elementary building blocks needed for cell growth, is also included. The applicability of the model is finally demonstrated by simulating autotrophic growth conditions of Synechococcus elongatus PCC7942.

  17. MultiMetEval: Comparative and Multi-Objective Analysis of Genome-Scale Metabolic Models

    NARCIS (Netherlands)

    Zakrzewski, Piotr; Medema, Marnix H.; Gevorgyan, Albert; Kierzek, Andrzej M.; Breitling, Rainer; Takano, Eriko; Fong, Stephen S.

    2012-01-01

    Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the co

  18. The architecture of ArgR-DNA complexes at the genome-scale in Escherichia coli

    DEFF Research Database (Denmark)

    Cho, Suhyung; Cho, Yoo-Bok; Kang, Taek Jin;

    2015-01-01

    DNA-binding motifs that are recognized by transcription factors (TFs) have been well studied; however, challenges remain in determining the in vivo architecture of TF-DNA complexes on a genome-scale. Here, we determined the in vivo architecture of Escherichia coli arginine repressor (ArgR)-DNA co...

  19. A versatile genome-scale PCR-based pipeline for high-definition DNA FISH

    NARCIS (Netherlands)

    Bienko, M.; Crosetto, N.; Teytelman, L.; Klemm, S.; Itzkovitz, S.; van Oudenaarden, A.

    2013-01-01

    We developed a cost-effective genome-scale PCR-based method for high-definition DNA FISH (HD-FISH). We visualized gene loci with diffraction-limited resolution, chromosomes as spot clusters and single genes together with transcripts by combining HD-FISH with single-molecule RNA FISH. We provide a da

  20. Comparative genome-scale metabolic modeling of actinomycetes : The topology of essential core metabolism

    NARCIS (Netherlands)

    Alam, Mohammad Tauqeer; Medema, Marnix H.; Takano, Eriko; Breitling, Rainer; Gojobori, Takashi

    2011-01-01

    Actinomycetes are highly important bacteria. On one hand, some of them cause severe human and plant diseases, on the other hand, many species are known for their ability to produce antibiotics. Here we report the results of a comparative analysis of genome-scale metabolic models of 37 species of act

  1. Comparative genome-scale metabolic modeling of actinomycetes: the topology of essential core metabolism.

    NARCIS (Netherlands)

    Alam, M.T.; Medema, M.H.; Takano, E.; Breitling, R.

    2011-01-01

    Actinomycetes are highly important bacteria. On one hand, some of them cause severe human and plant diseases, on the other hand, many species are known for their ability to produce antibiotics. Here we report the results of a comparative analysis of genome-scale metabolic models of 37 species of act

  2. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing

    Science.gov (United States)

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, the Genome-wide Microsatellite Analyzing Tool Package (GMATA), which integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enables simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allow GMATA to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linear in the chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar. PMID:27679641
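The SSR mining step described in this abstract can be illustrated with a minimal sketch. This is a generic regular-expression approach to finding perfect tandem repeats, not GMATA's actual algorithm, which the abstract does not describe; the function name `find_ssrs` and its thresholds (motifs of 1 to 6 bases, at least 5 copies) are assumptions chosen for illustration.

```python
import re


def find_ssrs(sequence, min_motif=1, max_motif=6, min_repeats=5):
    """Return (start, motif, copies) for each perfect tandem repeat found.

    Hypothetical helper for illustration; thresholds are assumed defaults.
    """
    seq = sequence.upper()
    # Lazy motif match (shortest motif first) followed by at least
    # (min_repeats - 1) further copies of that same motif.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    demo = "CCT" + "GA" * 6 + "TTC" + "GCG" * 5 + "T"
    for start, motif, copies in find_ssrs(demo):
        print(start, motif, copies)
```

Because the motif group is matched lazily, a run such as GAGAGAGAGAGA is reported with the motif GA and not GAGA. Imperfect or compound repeats are outside the scope of this sketch.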

  3. Genome-wide identification of Hsp70 genes in channel catfish and their regulated expression after bacterial infection.

    Science.gov (United States)

    Song, Lin; Li, Chao; Xie, Yangjie; Liu, Shikai; Zhang, Jiaren; Yao, Jun; Jiang, Chen; Li, Yun; Liu, Zhanjiang

    2016-02-01

    Heat shock proteins 70/110 (Hsp70/110) are a family of conserved ubiquitously expressed heat shock proteins which are produced by cells in response to exposure to stressful conditions. Besides the chaperone and housekeeping functions, they are also known to be involved in immune response during infection. In this study, we identified 16 Hsp70/110 genes in channel catfish (Ictalurus punctatus) through in silico analysis using RNA-Seq and genome databases. Among them 12 members of Hsp70 (Hspa) family and 4 members of Hsp110 (Hsph) family were identified. Phylogenetic and syntenic analyses provided strong evidence supporting the orthologies of these HSPs. In addition, we also determined the expression patterns of Hsp70/110 genes after Flavobacterium columnare and Edwardsiella ictaluri infections by meta-analyses, for the first time in channel catfish. Ten out of sixteen genes were significantly up/down-regulated after bacterial challenges. Specifically, nine genes were found significantly expressed in gill after F. columnare infection. Two genes were found significantly expressed in intestine after E. ictaluri infection. Pathogen-specific pattern and tissue-specific pattern were found in the two infections. The significantly regulated expressions of catfish Hsp70 genes after bacterial infections suggested their involvement in immune response in catfish. PMID:26693666

  4. Biomarker-based classification of bacterial and fungal whole-blood infections in a genome-wide expression study

    Directory of Open Access Journals (Sweden)

    Andreas Dix

    2015-03-01

    Sepsis is a clinical syndrome that can be caused by bacteria or fungi. Early knowledge on the nature of the causative agent is a prerequisite for targeted anti-microbial therapy. Besides currently used detection methods like blood culture and PCR-based assays, the analysis of the transcriptional response of the host to infecting organisms holds great promise. In this study, we aim to examine the transcriptional footprint of infections caused by the bacterial pathogens Staphylococcus aureus and Escherichia coli and the fungal pathogens Candida albicans and Aspergillus fumigatus in a human whole-blood model. Moreover, we use the expression information to build a random forest classifier to classify if a sample contains a bacterial, fungal, or mock-infection. After normalizing the transcription intensities using stably expressed reference genes, we filtered the gene set for biomarkers of bacterial or fungal blood infections. This selection is based on differential expression and an additional gene relevance measure. In this way, we identified 38 biomarker genes, including IL6, SOCS3, and IRG1 which were already associated with sepsis by other studies. Using these genes, we trained the classifier and assessed its performance. It yielded a 96% accuracy (sensitivities >93%, specificities >97%) for a 10-fold stratified cross-validation and a 92% accuracy (sensitivities and specificities >83%) for an additional test dataset comprising Cryptococcus neoformans infections. Furthermore, the classifier is robust to Gaussian noise, indicating correct class predictions on datasets of new species. In conclusion, this genome-wide approach demonstrates an effective feature selection process in combination with the construction of a well-performing classification model. Further analyses of genes with pathogen-dependent expression patterns can provide insights into the systemic host responses, which may lead to new anti-microbial therapeutic advances.

  5. Estimation of long-terminal repeat element content in the Helicoverpa zea genome from next generation sequencing of reduced representation bacterial artificial chromosome (BAC) pools

    Science.gov (United States)

    The lepidopteran pest insect, Helicoverpa zea, feeds on cultivated corn and cotton crops in North America where control remains challenging due to evolution of resistance to chemical and transgenic insecticidal toxins, yet few genomic resources are available for this species. A bacterial artificial...

  6. Cloning and Mutagenesis of the Murine Gammaherpesvirus 68 Genome as an Infectious Bacterial Artificial Chromosome

    OpenAIRE

    Adler, Heiko; Messerle, Martin; Wagner, Markus; Koszinowski, Ulrich H.

    2000-01-01

    Gammaherpesviruses cause important infections of humans, in particular in immunocompromised patients. Recently, murine gammaherpesvirus 68 (MHV-68) infection of mice has been developed as a small animal model of gammaherpesvirus pathogenesis. Efficient generation of mutants of MHV-68 would significantly contribute to the understanding of viral gene functions in virus-host interaction, thereby further enhancing the potential of this model. To this end, we cloned the MHV-68 genome as a bacteria...

  7. Bacterial Catalase in the Microsporidian Nosema locustae: Implications for Microsporidian Metabolism and Genome Evolution

    OpenAIRE

    Fast, Naomi M; Law, Joyce S.; Williams, Bryony A P; Keeling, Patrick J

    2003-01-01

    Microsporidia constitute a group of extremely specialized intracellular parasites that infect virtually all animals. They are highly derived, reduced fungi that lack several features typical of other eukaryotes, including canonical mitochondria, flagella, and peroxisomes. Consistent with the absence of peroxisomes in microsporidia, the recently completed genome of the microsporidian Encephalitozoon cuniculi lacks a gene for catalase, the major enzymatic marker for the organelle. We show, howe...

  8. Solving the Problem of Comparing Whole Bacterial Genomes across Different Sequencing Platforms

    OpenAIRE

    Kaas, Rolf Sommer; Leekitcharoenphon, Pimlapas; Aarestrup, Frank Møller; Lund, Ole

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites ...

  9. Construction of a full bacterial artificial chromosome (BAC) library of Oryza sativa genome

    Institute of Scientific and Technical Information of China (English)

    TAOQUANZHOU; HAIYINGZHAO; et al.

    1994-01-01

    We have constructed a full BAC library for the superior early indica variety of Oryza sativa, Guang Lu Ai 4. The MAX Efficiency DH10B with increased stability of inserts was used as BAC host cells. The potent pBelo BACII with double selection markers was used as cloning vector. The cloning efficiency we have reached was as high as 98%, and the transformation efficiency was raised up to 10^6 transformants/μg of large fragment DNA. The BAC recombinant transformants were picked at random and analyzed for the size of inserts, which turned out to be of 120 kb in length on average. We have obtained more than 20,000 such BAC clones. According to conventional probability equation, they covered the entire rice genome of 420,000 kb in length. The entire length of inserts of the library obtained has the 5- to 6-fold coverage of the genome. To our knowledge, this is the first reported full BAC library for a complex genome.
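The coverage figures in this abstract can be checked with a short calculation. The sketch below assumes that the "conventional probability equation" is the Clarke-Carbon relation P = 1 - (1 - f)^N, where f is the insert size as a fraction of the genome and N is the number of clones; the abstract does not name the equation, so this is an assumption. The function names are hypothetical, and the inputs (20,000 clones, 120 kb inserts, 420,000 kb genome) are taken from the abstract.

```python
import math


def fold_coverage(n_clones, insert_kb, genome_kb):
    """Total cloned DNA expressed as multiples of the genome size."""
    return n_clones * insert_kb / genome_kb


def prob_locus_represented(n_clones, insert_kb, genome_kb):
    """Clarke-Carbon estimate (assumed here): P = 1 - (1 - f)^N."""
    f = insert_kb / genome_kb
    return 1.0 - (1.0 - f) ** n_clones


def clones_needed(p, insert_kb, genome_kb):
    """Smallest N with P >= p, from N = ln(1 - p) / ln(1 - f)."""
    f = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - f))


if __name__ == "__main__":
    n, insert_kb, genome_kb = 20000, 120, 420000
    print("fold coverage:", round(fold_coverage(n, insert_kb, genome_kb), 2))
    print("P(locus present):", round(prob_locus_represented(n, insert_kb, genome_kb), 4))
    print("clones for P = 0.99:", clones_needed(0.99, insert_kb, genome_kb))
```

With these inputs the fold coverage falls between 5 and 6, in line with the abstract, and the probability that any given locus is represented exceeds 99%.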

  10. Bacterial molecular networks: bridging the gap between functional genomics and dynamical modelling.

    Science.gov (United States)

    van Helden, Jacques; Toussaint, Ariane; Thieffry, Denis

    2012-01-01

    This introductory review synthesizes the contents of the volume Bacterial Molecular Networks of the series Methods in Molecular Biology. This volume gathers 9 reviews and 16 method chapters describing computational protocols for the analysis of metabolic pathways, protein interaction networks, and regulatory networks. Each protocol is documented by concrete case studies dedicated to model bacteria or interacting populations. Altogether, the chapters provide a representative overview of state-of-the-art methods for data integration and retrieval, network visualization, graph analysis, and dynamical modelling.

  11. Vertebrate Protein CTCF and its Multiple Roles in a Large-Scale Regulation of Genome Activity

    Science.gov (United States)

    Nikolaev, L.G; Akopov, S.B; Didych, D.A; Sverdlov, E.D

    2009-01-01

    The CTCF transcription factor is a multifunctional protein with 11 zinc fingers that uses different zinc finger combinations to recognize and bind different sites within DNA. CTCF is thought to participate in various gene regulatory networks including transcription activation and repression, formation of independently functioning chromatin domains and regulation of imprinting. Sequencing of human and other genomes opened up a possibility to ascertain the genomic distribution of CTCF binding sites and to identify CTCF-dependent cis-regulatory elements, including insulators. In the review, we summarized recent data on genomic distribution of CTCF binding sites in the human and other genomes within a framework of the loop domain hypothesis of large-scale regulation of the genome activity. We also tried to formulate possible lines of studies on a variety of CTCF functions which probably depend on its ability to specifically bind DNA, interact with other proteins and form di- and multimers. These three fundamental properties allow CTCF to serve as a transcription factor, an insulator and a constitutive dispersed genome-wide demarcation tool able to recruit various factors that emerge in response to diverse external and internal signals, and thus to exert its signal-specific function(s). PMID:20119526

  12. Vertebrate Protein CTCF and its Multiple Roles in a Large-Scale Regulation of Genome Activity.

    Science.gov (United States)

    Nikolaev, L G; Akopov, S B; Didych, D A; Sverdlov, E D

    2009-08-01

    The CTCF transcription factor is a multifunctional protein with 11 zinc fingers that uses different zinc finger combinations to recognize and bind different sites within DNA. CTCF is thought to participate in various gene regulatory networks including transcription activation and repression, formation of independently functioning chromatin domains and regulation of imprinting. Sequencing of human and other genomes opened up a possibility to ascertain the genomic distribution of CTCF binding sites and to identify CTCF-dependent cis-regulatory elements, including insulators. In the review, we summarized recent data on genomic distribution of CTCF binding sites in the human and other genomes within a framework of the loop domain hypothesis of large-scale regulation of the genome activity. We also tried to formulate possible lines of studies on a variety of CTCF functions which probably depend on its ability to specifically bind DNA, interact with other proteins and form di- and multimers. These three fundamental properties allow CTCF to serve as a transcription factor, an insulator and a constitutive dispersed genome-wide demarcation tool able to recruit various factors that emerge in response to diverse external and internal signals, and thus to exert its signal-specific function(s). PMID:20119526

  13. Rare and common regulatory variation in population-scale sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Stephen B Montgomery

    2011-07-01

    Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 Genomes Project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 Genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

  14. Combined Analysis of Variation in Core, Accessory and Regulatory Genome Regions Provides a Super-Resolution View into the Evolution of Bacterial Populations

    Science.gov (United States)

    McNally, Alan; Oren, Yaara; Kelly, Darren; Sreecharan, Tristan; Vehkala, Minna; Välimäki, Niko; Prentice, Michael B.; Ashour, Amgad; Avram, Oren; Pupko, Tal; Literak, Ivan; Guenther, Sebastian; Schaufler, Katharina; Wieler, Lothar H.; Zhiyong, Zong; Sheppard, Samuel K.; Corander, Jukka

    2016-01-01

    The use of whole-genome phylogenetic analysis has revolutionized our understanding of the evolution and spread of many important bacterial pathogens due to the high resolution view it provides. However, the majority of such analyses do not consider the potential role of accessory genes when inferring evolutionary trajectories. Moreover, the recently discovered importance of the switching of gene regulatory elements suggests that an exhaustive analysis, combining information from core and accessory genes with regulatory elements could provide unparalleled detail of the evolution of a bacterial population. Here we demonstrate this principle by applying it to a worldwide multi-host sample of the important pathogenic E. coli lineage ST131. Our approach reveals the existence of multiple circulating subtypes of the major drug-resistant clade of ST131 and provides the first ever population level evidence of core genome substitutions in gene regulatory regions associated with the acquisition and maintenance of different accessory genome elements. PMID:27618184

  15. Combined Analysis of Variation in Core, Accessory and Regulatory Genome Regions Provides a Super-Resolution View into the Evolution of Bacterial Populations.

    Science.gov (United States)

    McNally, Alan; Oren, Yaara; Kelly, Darren; Pascoe, Ben; Dunn, Steven; Sreecharan, Tristan; Vehkala, Minna; Välimäki, Niko; Prentice, Michael B; Ashour, Amgad; Avram, Oren; Pupko, Tal; Dobrindt, Ulrich; Literak, Ivan; Guenther, Sebastian; Schaufler, Katharina; Wieler, Lothar H; Zhiyong, Zong; Sheppard, Samuel K; McInerney, James O; Corander, Jukka

    2016-09-01

    The use of whole-genome phylogenetic analysis has revolutionized our understanding of the evolution and spread of many important bacterial pathogens due to the high resolution view it provides. However, the majority of such analyses do not consider the potential role of accessory genes when inferring evolutionary trajectories. Moreover, the recently discovered importance of the switching of gene regulatory elements suggests that an exhaustive analysis, combining information from core and accessory genes with regulatory elements could provide unparalleled detail of the evolution of a bacterial population. Here we demonstrate this principle by applying it to a worldwide multi-host sample of the important pathogenic E. coli lineage ST131. Our approach reveals the existence of multiple circulating subtypes of the major drug-resistant clade of ST131 and provides the first ever population level evidence of core genome substitutions in gene regulatory regions associated with the acquisition and maintenance of different accessory genome elements. PMID:27618184

  16. Integration of expression data in genome-scale metabolic network reconstructions

    Directory of Open Access Journals (Sweden)

    Anna S. Blazier

    2012-08-01

    With the advent of high-throughput technologies, the field of systems biology has amassed an abundance of omics data, quantifying thousands of cellular components across a variety of scales, ranging from mRNA transcript levels to metabolite quantities. Methods are needed to not only integrate this omics data but to also use this data to heighten the predictive capabilities of computational models. Several recent studies have successfully demonstrated how flux balance analysis (FBA), a constraint-based modeling approach, can be used to integrate transcriptomic data into genome-scale metabolic network reconstructions to generate predictive computational models. In this review, we summarize such FBA-based methods for integrating expression data into genome-scale metabolic network reconstructions, highlighting their advantages as well as their limitations.
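For orientation, the standard FBA problem that this abstract refers to can be written as a linear program. The symbols below are the conventional ones and are not taken from the abstract: S is the stoichiometric matrix, v the vector of reaction fluxes, c the objective weights (often selecting the biomass reaction), and v^min, v^max the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for each reaction } j .
\end{aligned}
```

As a hedged illustration of how expression data can enter this problem, some methods adjust the bounds v^max_j according to the measured expression of the genes associated with reaction j; the specific methods the review covers are not listed in this snippet.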

  17. Disinfection of bacterial biofilms in pilot-scale cooling tower systems.

    Science.gov (United States)

    Liu, Yang; Zhang, Wei; Sileika, Tadas; Warta, Richard; Cianciotto, Nicholas P; Packman, Aaron I

    2011-04-01

    The impact of continuous chlorination and periodic glutaraldehyde treatment on planktonic and biofilm microbial communities was evaluated in pilot-scale cooling towers operated continuously for 3 months. The system was operated at a flow rate of 10,080 L/day. Experiments were performed with a well-defined microbial consortium containing three heterotrophic bacteria: Pseudomonas aeruginosa, Klebsiella pneumoniae and Flavobacterium sp. The persistence of each species was monitored in the recirculating cooling water loop and in biofilms on steel and PVC coupons in the cooling tower basin. The observed bacterial colonization in cooling towers did not follow trends in growth rates observed under batch conditions and, instead, reflected differences in the ability of each organism to remain attached and form biofilms under the high-through flow conditions in cooling towers. Flavobacterium was the dominant organism in the community, while P. aeruginosa and K. pneumoniae did not attach well to either PVC or steel coupons in cooling towers and were not able to persist in biofilms. As a result, the much greater ability of Flavobacterium to adhere to surfaces protected it from disinfection, whereas P. aeruginosa and K. pneumoniae were subject to rapid disinfection in the planktonic state.

  18. Niche differentiation of bacterial communities at a millimeter scale in Shark Bay microbial mats

    Science.gov (United States)

    Wong, Hon Lun; Smith, Daniela-Lee; Visscher, Pieter T.; Burns, Brendan P.

    2015-10-01

    Modern microbial mats can provide key insights into early Earth ecosystems, and Shark Bay, Australia, holds one of the best examples of these systems. Identifying the spatial distribution of microorganisms with mat depth facilitates a greater understanding of specific niches and potentially novel microbial interactions. High throughput sequencing coupled with elemental analyses and biogeochemical measurements of two distinct mat types (smooth and pustular) at a millimeter scale were undertaken in the present study. A total of 8,263,982 16S rRNA gene sequences were obtained, which were affiliated to 58 bacterial and candidate phyla. The surfaces of both mats were dominated by Cyanobacteria, accompanied by known or putative members of Alphaproteobacteria and Bacteroidetes. The deeper anoxic layers of smooth mats were dominated by Chloroflexi, while Alphaproteobacteria dominated the lower layers of pustular mats. In situ microelectrode measurements revealed smooth mats have a steeper profile of O2 and H2S concentrations, as well as higher oxygen production, consumption, and sulfate reduction rates. Specific elements (Mo, Mg, Mn, Fe, V, P) could be correlated with specific mat types and putative phylogenetic groups. Models are proposed for these systems suggesting putative surface anoxic niches, differential nitrogen fixing niches, and those coupled with methane metabolism.

  19. Toward the automated generation of genome-scale metabolic networks in the SEED

    Directory of Open Access Journals (Sweden)

    Gould John

    2007-04-01

    Background: Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. Results: We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative

  20. Identifying anti-growth factors for human cancer cell lines through genome-scale metabolic modeling

    DEFF Research Database (Denmark)

    Ghaffari, Pouyan; Mardinoglu, Adil; Asplund, Anna;

    2015-01-01

    Human cancer cell lines are used as important model systems to study molecular mechanisms associated with tumor growth, hereunder how genomic and biological heterogeneity found in primary tumors affect cellular phenotypes. We reconstructed Genome scale metabolic models (GEMs) for eleven cell lines... 85 antimetabolites that can inhibit growth of, or even kill, any of the cell lines, while at the same time not being toxic for 83 different healthy human cell types. 60 of these antimetabolites were found to inhibit growth in all cell lines. Finally, we experimentally validated one of the predicted... for inhibition of cell growth may provide leads for the development of efficient cancer treatment strategies.

  1. The genome sequence of E. coli W (ATCC 9637): comparative genome analysis and an improved genome-scale reconstruction of E. coli

    Directory of Open Access Journals (Sweden)

    Lee Sang

    2011-01-01

    Background: Escherichia coli is a model prokaryote, an important pathogen, and a key organism for industrial biotechnology. E. coli W (ATCC 9637), one of four strains designated as safe for laboratory purposes, has not been sequenced. E. coli W is a fast-growing strain and is the only safe strain that can utilize sucrose as a carbon source. Lifecycle analysis has demonstrated that sucrose from sugarcane is a preferred carbon source for industrial bioprocesses. Results: We have sequenced and annotated the genome of E. coli W. The chromosome is 4,900,968 bp and encodes 4,764 ORFs. Two plasmids, pRK1 (102,536 bp) and pRK2 (5,360 bp), are also present. W has unique features relative to other sequenced laboratory strains (K-12, B and Crooks): it has a larger genome and belongs to phylogroup B1 rather than A. W also grows on a much broader range of carbon sources than does K-12. A genome-scale reconstruction was developed and validated in order to interrogate metabolic properties. Conclusions: The genome of W is more similar to commensal and pathogenic B1 strains than phylogroup A strains, and therefore has greater utility for comparative analyses with these strains. W should therefore be the strain of choice, or 'type strain' for group B1 comparative analyses. The genome annotation and tools created here are expected to allow further utilization and development of E. coli W as an industrial organism for sucrose-based bioprocesses. Refinements in our E. coli metabolic reconstruction allow it to more accurately define E. coli metabolism relative to previous models.

  2. Solving the Problem of Comparing Whole Bacterial Genomes across Different Sequencing Platforms

    DEFF Research Database (Denmark)

    Kaas, Rolf Sommer; Leekitcharoenphon, Pimlapas; Aarestrup, Frank Møller;

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one...... data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due...

  3. Benchmarking of methods for identification of antimicrobial resistance genes in bacterial whole genome data

    DEFF Research Database (Denmark)

    Clausen, Philip T. L. C.; Zankari, Ea; Aarestrup, Frank Møller;

    2016-01-01

    Next generation sequencing (NGS) may be an alternative to phenotypic susceptibility testing for surveillance and clinical diagnosis. However, current bioinformatics methods may be associated with false positives and negatives. In this study, a novel mapping method was developed and benchmarked... to two different methods in current use for identification of antibiotic resistance genes in bacterial WGS data. A novel method, KmerResistance, which examines the co-occurrence of k-mers between the WGS data and a database of resistance genes, was developed. The performance of this method was compared... was compared with the observed phenotypes for all isolates. To challenge further the sensitivity of the in silico methods, the datasets were also down-sampled to 1% of the reads and reanalysed. The best results were obtained by identification of resistance genes by mapping directly against the raw reads...
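The idea of k-mer co-occurrence between sequencing reads and a gene database, mentioned in this abstract, can be sketched as follows. This is not KmerResistance's actual scoring scheme, which the snippet does not describe; the function names, the default k, and the simple "fraction of gene k-mers seen in the reads" score are all assumptions for illustration.

```python
def kmers(seq, k):
    """Set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_coverage(reads, genes, k=16):
    """For each gene, the fraction of its k-mers that occur in any read.

    Hypothetical scoring, assumed for illustration only.
    """
    read_kmers = set()
    for read in reads:
        read_kmers |= kmers(read, k)
    scores = {}
    for name, gene_seq in genes.items():
        gene_kmers = kmers(gene_seq, k)
        shared = gene_kmers & read_kmers
        scores[name] = len(shared) / len(gene_kmers) if gene_kmers else 0.0
    return scores


if __name__ == "__main__":
    # Toy sequences, not real resistance genes.
    genes = {
        "geneA": "ATGGCTAGCTAGGATCCGATCGATTAGC",
        "geneB": "TTTTTTTTCCCCCCCCGGGGGGGGAAAA",
    }
    reads = ["GGCTAGCTAGGATCC", "GATCCGATCGATTAGC", "ATGGCTAGC"]
    for name, score in kmer_coverage(reads, genes, k=8).items():
        print(name, round(score, 2))
```

A gene whose k-mers are mostly present in the reads scores near 1, and an absent gene scores 0. A production tool would also need to handle reverse complements, sequencing errors, and k-mers shared between related genes.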

  4. Genome-scale identification of Legionella pneumophila effectors using a machine learning approach.

    Directory of Open Access Journals (Sweden)

    David Burstein

    2009-07-01

    A large number of highly pathogenic bacteria utilize secretion systems to translocate effector proteins into host cells. Using these effectors, the bacteria subvert host cell processes during infection. Legionella pneumophila translocates effectors via the Icm/Dot type-IV secretion system and to date, approximately 100 effectors have been identified by various experimental and computational techniques. Effector identification is a critical first step towards the understanding of the pathogenesis system in L. pneumophila as well as in other bacterial pathogens. Here, we formulate the task of effector identification as a classification problem: each L. pneumophila open reading frame (ORF) was classified as either effector or not. We computationally defined a set of features that best distinguish effectors from non-effectors. These features cover a wide range of characteristics including taxonomical dispersion, regulatory data, genomic organization, similarity to eukaryotic proteomes and more. Machine learning algorithms utilizing these features were then applied to classify all the ORFs within the L. pneumophila genome. Using this approach we were able to predict and experimentally validate 40 new effectors, reaching a success rate of above 90%. Increasing the number of validated effectors to around 140, we were able to gain novel insights into their characteristics. Effectors were found to have low G+C content, supporting the hypothesis that a large number of effectors originate via horizontal gene transfer, probably from their protozoan host. In addition, effectors were found to cluster in specific genomic regions. Finally, we were able to provide a novel description of the C-terminal translocation signal required for effector translocation by the Icm/Dot secretion system. To conclude, we have discovered 40 novel L. pneumophila effectors, predicted over a hundred additional highly probable effectors, and shown the applicability of machine

  5. Modeling of Scale-Dependent Bacterial Growth by Chemical Kinetics Approach

    Directory of Open Access Journals (Sweden)

    Haydee Martínez

    2014-01-01

    Full Text Available We applied the so-called chemical kinetics approach to complex bacterial growth patterns that were dependent on the liquid-surface-area-to-volume ratio (SA/V) of the bacterial cultures. The kinetic modeling was based on current experimental knowledge in terms of autocatalytic bacterial growth, its inhibition by the metabolite CO2, and the relief of inhibition through the physical escape of the inhibitor. The model quantitatively reproduces kinetic data of SA/V-dependent bacterial growth and can discriminate between the growth dynamics of enteropathogenic E. coli, E. coli JM83, and Salmonella typhimurium on the one hand and Vibrio cholerae on the other. Furthermore, the data fitting procedures allowed predictions about the velocities of the key processes involved and the potential behavior in an open-flow bacterial chemostat, revealing an oscillatory approach to the stationary states.
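
    The three ingredients named in this abstract (autocatalytic growth, inhibition by CO2, and escape of the inhibitor at a rate that scales with SA/V) can be illustrated with a toy rate model. The sketch below is not the authors' model: the rate laws, the substrate pool, and every parameter value are hypothetical choices made only to show the qualitative effect that a larger SA/V relieves inhibition and speeds up growth. It does not attempt to reproduce the chemostat oscillations mentioned above.

```python
def simulate(sav, t_end=6.0, dt=0.001):
    """Euler integration of a toy growth model with an escaping inhibitor.

    sav is the surface-area-to-volume ratio (arbitrary units).
    All rate constants below are hypothetical, chosen for illustration.
    """
    k_growth = 1.0   # autocatalytic growth constant
    k_inhib = 0.1    # inhibitor level at which growth is halved
    k_prod = 1.0     # CO2 produced per unit of new biomass
    k_escape = 1.0   # escape constant, multiplied by SA/V
    x, s, c = 0.01, 1.0, 0.0  # biomass, substrate, dissolved CO2
    for _ in range(int(t_end / dt)):
        growth = k_growth * x * s / (1.0 + c / k_inhib)  # inhibited autocatalysis
        escape = k_escape * sav * c                       # physical loss of CO2
        x += growth * dt
        s -= growth * dt
        c += (k_prod * growth - escape) * dt
    return x


# Compare a shallow culture (high SA/V) with a deep one (low SA/V).
biomass_high_sav = simulate(sav=5.0)
biomass_low_sav = simulate(sav=0.1)
```

    In this sketch the culture with the higher SA/V loses CO2 faster and therefore reaches a higher biomass at the same time point; fitting such constants to measured growth curves is the step the abstract describes.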

  6. iAK692: A genome-scale metabolic model of Spirulina platensis C1

    Directory of Open Access Journals (Sweden)

    Klanchui Amornpan

    2012-06-01

    Full Text Available Background: Spirulina (Arthrospira) platensis is a well-known filamentous cyanobacterium used in the production of many industrial products, including high-value compounds, healthy food supplements, animal feeds, pharmaceuticals and cosmetics. It has been increasingly studied around the world for scientific purposes, especially for its genome, biology, physiology, and also for the analysis of its small-scale metabolic network. However, the overall description of the metabolic and biotechnological capabilities of S. platensis requires the development of a whole cellular metabolism model. Recently, the S. platensis C1 (Arthrospira sp. PCC9438) genome sequence has become available, allowing systems-level studies of this commercial cyanobacterium. Results: In this work, we present the genome-scale metabolic network analysis of S. platensis C1, iAK692, its topological properties, and its metabolic capabilities and functions. The network was reconstructed from the S. platensis C1 annotated genomic sequence using Pathway Tools software to generate a preliminary network. Then, manual curation was performed based on a collective knowledge base and a combination of genomic, biochemical, and physiological information. The genome-scale metabolic model consists of 692 genes, 837 metabolites, and 875 reactions. We validated iAK692 by conducting fermentation experiments and simulating the model under autotrophic, heterotrophic, and mixotrophic growth conditions using the COBRA Toolbox. The model predictions under these growth conditions were consistent with the experimental results. The iAK692 model was further used to predict the unique active reactions and essential genes for each growth condition. Additionally, the metabolic states of iAK692 during autotrophic and mixotrophic growth were described by phenotypic phase plane (PhPP) analysis. Conclusions: This study proposes the first genome-scale model of S. platensis C1, iAK692, which is a

  7. FVGWAS: Fast voxelwise genome wide association analysis of large-scale imaging genetic data.

    Science.gov (United States)

    Huang, Meiyan; Nichols, Thomas; Huang, Chao; Yu, Yang; Lu, Zhaohua; Knickmeyer, Rebecca C; Feng, Qianjin; Zhu, Hongtu

    2015-09-01

    More and more large-scale imaging genetic studies are being widely conducted to collect a rich set of imaging, genetic, and clinical data to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. Several major big-data challenges arise from testing genome-wide (N_C > 12 million known variants) associations with signals at millions of locations (N_V ~ 10^6) in the brain from thousands of subjects (n ~ 10^3). The aim of this paper is to develop a Fast Voxelwise Genome Wide Association analysiS (FVGWAS) framework to efficiently carry out whole-genome analyses of whole-brain data. FVGWAS consists of three components including a heteroscedastic linear model, a global sure independence screening (GSIS) procedure, and a detection procedure based on wild bootstrap methods. Specifically, for standard linear association, the computational complexity is O(n N_V N_C) for the voxelwise genome wide association analysis (VGWAS) method compared with O((N_C + N_V) n^2) for FVGWAS. Simulation studies show that FVGWAS is an efficient method of searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. Finally, we have successfully applied FVGWAS to a large-scale imaging genetic data analysis of ADNI data with 708 subjects, 193,275 voxels in RAVENS maps, and 501,584 SNPs, and the total processing time was 203,645 s for a single CPU. Our FVGWAS may be a valuable statistical toolbox for large-scale imaging genetic analysis as the field is rapidly advancing with ultra-high-resolution imaging and whole-genome sequencing. PMID:26025292
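
    To get a feel for the two complexity expressions quoted in this abstract, they can be evaluated at the ADNI problem size it reports (708 subjects, 193,275 voxels, 501,584 SNPs). The short calculation below is only a back-of-the-envelope comparison of leading-order terms; big-O notation hides constant factors, so the ratio should not be read as a measured speed-up. The variable names are illustrative.

```python
# Problem size reported in the abstract for the ADNI analysis.
n = 708          # subjects
n_v = 193_275    # voxels (N_V)
n_c = 501_584    # SNPs (N_C)

# Leading-order operation counts from the quoted complexity expressions.
vgwas_ops = n * n_v * n_c            # O(n N_V N_C)
fvgwas_ops = (n_c + n_v) * n ** 2    # O((N_C + N_V) n^2)

# The ratio simplifies to N_V * N_C / ((N_C + N_V) * n).
ratio = vgwas_ops / fvgwas_ops

# Reported single-CPU processing time, converted from seconds to hours.
hours = 203_645 / 3600

print(f"VGWAS / FVGWAS leading-order ratio: {ratio:.0f}")
print(f"Reported run time: {hours:.1f} h")
```

    At this problem size the leading-order terms differ by a factor on the order of 200, and the reported run time corresponds to a little under 57 hours on one CPU.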

  8. A systems approach to predict oncometabolites via context-specific genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Hojung Nam

    2014-09-01

    Full Text Available Altered metabolism in cancer cells has been viewed as a passive response required for malignant transformation. However, this view has changed with the recently described metabolic oncogenic factors: mutated isocitrate dehydrogenases (IDH), succinate dehydrogenase (SDH), and fumarate hydratase (FH), which produce oncometabolites that competitively inhibit epigenetic regulation. In this study, we demonstrate in silico predictions of oncometabolites that have the potential to dysregulate epigenetic controls in nine types of cancer by incorporating massive-scale genetic mutation information (collected from more than 1,700 cancer genomes) and expression profiling data, and by deploying Recon 2 to reconstruct context-specific genome-scale metabolic models. Our analysis predicted 15 compounds and 24 substructures of potential oncometabolites that could result from the loss-of-function and gain-of-function mutations of metabolic enzymes, respectively. These results suggest a substantial potential for discovering unidentified oncometabolites in various forms of cancers.

  9. Expanding a dynamic flux balance model of yeast fermentation to genome-scale

    OpenAIRE

    Agosin Eduardo; Pérez-Correa J Ricardo; Pizarro Francisco; Vargas Felipe A

    2011-01-01

    Background: Yeast is considered to be a workhorse of the biotechnology industry for the production of many value-added chemicals, alcoholic beverages and biofuels. Optimization of the fermentation is a challenging task that greatly benefits from dynamic models able to accurately describe and predict the fermentation profile and resulting products under different genetic and environmental conditions. In this article, we developed and validated a genome-scale dynamic flux balance model,...

  10. Research progress of bacterial pan-genome

    Institute of Scientific and Technical Information of China (English)

    庄绪冉; 朱泳璋

    2012-01-01

    Bacteria are among the most ancient groups of organisms in nature. Different species, and even different strains of the same species, show rich genetic diversity and clear phenotypic differentiation. This differentiation stems mainly from differences in the genomic genetic information among strains. To reveal the genetic diversity among individuals within a bacterial species more fully at the genome level, and to explore the genetic basis of their phylogenetic relationships and phenotypic differences, the concept of the bacterial pan-genome was put forward. This paper reviews the mechanisms that generate genetic diversity in bacteria, the research strategies of pan-genomics, and its applications and progress in bacterial research.

  11. Pantograph: A template-based method for genome-scale metabolic model reconstruction.

    Science.gov (United States)

    Loira, Nicolas; Zhukova, Anna; Sherman, David James

    2015-04-01

    Genome-scale metabolic models are a powerful tool to study the inner workings of biological systems and to guide applications. The advent of cheap sequencing has brought the opportunity to create metabolic maps of biotechnologically interesting organisms. While this drives the development of new methods and automatic tools, network reconstruction remains a time-consuming process where extensive manual curation is required. This curation introduces specific knowledge about the modeled organism, either explicitly in the form of molecular processes, or indirectly in the form of annotations of the model elements. Paradoxically, this knowledge is usually lost when reconstruction of a different organism is started. We introduce the Pantograph method for metabolic model reconstruction. This method combines a template reaction knowledge base, orthology mappings between two organisms, and experimental phenotypic evidence, to build a genome-scale metabolic model for a target organism. Our method infers implicit knowledge from annotations in the template, and rewrites these inferences to include them in the resulting model of the target organism. The generated model is well suited for manual curation. Scripts for evaluating the model with respect to experimental data are automatically generated, to aid curators in iterative improvement. We present an implementation of the Pantograph method, as a toolbox for genome-scale model reconstruction, curation and validation. This open source package can be obtained from: http://pathtastic.gforge.inria.fr.

  12. Microarray Data Processing Techniques for Genome-Scale Network Inference from Large Public Repositories.

    Science.gov (United States)

    Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas

    2016-01-01

    Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp. PMID:27657141

  13. Basic and applied uses of genome-scale metabolic network reconstructions of Escherichia coli

    DEFF Research Database (Denmark)

    McCloskey, Douglas; Palsson, Bernhard; Feist, Adam

    2013-01-01

    The genome-scale model (GEM) of metabolism in the bacterium Escherichia coli K-12 has been in development for over a decade and is now in wide use. GEM-enabled studies of E. coli have been primarily focused on six applications: (1) metabolic engineering, (2) model-driven discovery, (3) prediction...... include the expansion of GEMs by integrating additional cellular processes beyond metabolism, the identification of key constraints based on emerging data types, and the development of computational methods able to handle such large-scale network models with sufficient accuracy....

  14. Multidrug-resistant Escherichia coli soft tissue infection investigated with bacterial whole genome sequencing

    Science.gov (United States)

    Buchanan, Ruaridh; Stoesser, Nicole; Crook, Derrick; Bowler, Ian C J W

    2014-01-01

    A 45-year-old man with dilated cardiomyopathy presented with acute leg pain and erythema suggestive of necrotising fasciitis. Initial surgical exploration revealed no necrosis and treatment for a soft tissue infection was started. Blood and tissue cultures unexpectedly grew a Gram-negative bacillus, subsequently identified by an automated broth microdilution phenotyping system as an extended-spectrum β-lactamase producing Escherichia coli. The patient was treated with a 3-week course of antibiotics (ertapenem followed by ciprofloxacin) and debridement for small areas of necrosis, followed by skin grafting. The presence of E. coli triggered investigation of both host and pathogen. The patient was found to have previously undiagnosed liver disease, a risk factor for E. coli soft tissue infection. Whole genome sequencing of isolates from all specimens confirmed they were clonal, of sequence type ST131 and associated with a likely plasmid-associated AmpC (CMY-2), several other resistance genes and a number of virulence factors. PMID:25331151

  15. Biological Removal of Phosphate Using Phosphate Solubilizing Bacterial Consortium from Synthetic Wastewater: A Laboratory Scale

    OpenAIRE

    Dipak Paul; Sankar Narayan Sinha

    2015-01-01

    Biological phosphate removal is an important process that has gained worldwide attention and is widely used for removing phosphorus from wastewater. The present investigation aimed to screen efficient phosphate-solubilizing bacterial isolates and to use them to remove phosphate from synthetic wastewater under shake-flask conditions. Pseudomonas sp. JPSB12, Enterobacter sp. TPSB20, Flavobacterium sp. TPSB23 and mixed bacterial consortium (Pseudomonas sp. JPSB12+Enterobacter sp. TPSB20+Flavobact...

  16. Similar processes but different environmental filters for soil bacterial and fungal community composition turnover on a broad spatial scale.

    Directory of Open Access Journals (Sweden)

    Nicolas Chemidlin Prévost-Bouré

    Full Text Available Spatial scaling of microorganisms has been demonstrated over the last decade. However, the processes and environmental filters shaping soil microbial community structure on a broad spatial scale still need to be refined and ranked. Here, we compared bacterial and fungal community composition turnovers through a biogeographical approach on the same soil sampling design at a broad spatial scale (area range: 13,300 to 31,000 km²): (i) to examine their spatial structuring; (ii) to investigate the relative importance of environmental selection and spatial autocorrelation in determining their community composition turnover; and (iii) to identify and rank the relevant environmental filters and scales involved in their spatial variations. Molecular fingerprinting of soil bacterial and fungal communities was performed on 413 soils from four French regions of contrasting environmental heterogeneity (Landes... Bacterial and fungal community composition turnovers were mainly driven by environmental selection explaining from 10% to 20% of community composition variations, but spatial variables also explained 3% to 9% of total variance. These variables highlighted significant spatial autocorrelation of both communities unexplained by the environmental variables measured and could partly be explained by dispersal limitations. Although the identified filters and their hierarchy were dependent on the region and organism, selection was systematically based on a common group of environmental variables: pH, trophic resources, texture and land use. Spatial autocorrelation was also important at

  17. Analysis of Comparative Sequence and Genomic Data to Verify Phylogenetic Relationship and Explore a New Subfamily of Bacterial Lipases.

    Directory of Open Access Journals (Sweden)

    Malihe Masomian

    Full Text Available Thermostable and organic solvent-tolerant enzymes have significant potential in a wide range of synthetic reactions in industry due to their inherent stability at high temperatures and their ability to endure harsh organic solvents. In this study, a novel gene encoding a true lipase was isolated by construction of a genomic DNA library of thermophilic Aneurinibacillus thermoaerophilus strain HZ in an Escherichia coli plasmid vector. Sequence analysis revealed that HZ lipase had 62% identity to a putative lipase from Bacillus pseudomycoides. The characterized lipases closest to HZ lipase are thermostable Bacillus and Geobacillus lipases belonging to subfamily I.5, with ≤ 57% identity. The amino acid sequence analysis of HZ lipase identified a conserved pentapeptide containing the active serine (GHSMG) and a Ca(2+)-binding motif (GCYGSD) in the enzyme. Protein structure modeling showed that HZ lipase consisted of an α/β hydrolase fold and a lid domain. Protein sequence alignment, conserved regions analysis, clustal distance matrix and amino acid composition illustrated differences between HZ lipase and other thermostable lipases. Phylogenetic analysis revealed that this lipase represented a new subfamily of family I of bacterial true lipases, classified as family I.9. The HZ lipase was expressed under promoter Plac using IPTG and was characterized. The recombinant enzyme showed optimal activity at 65 °C and retained ≥ 97% activity after incubation at 50 °C for 1 h. The HZ lipase was stable in various polar and non-polar organic solvents.

  18. Analysis of Comparative Sequence and Genomic Data to Verify Phylogenetic Relationship and Explore a New Subfamily of Bacterial Lipases.

    Science.gov (United States)

    Masomian, Malihe; Rahman, Raja Noor Zaliha Raja Abd; Salleh, Abu Bakar; Basri, Mahiran

    2016-01-01

    Thermostable and organic solvent-tolerant enzymes have significant potential in a wide range of synthetic reactions in industry due to their inherent stability at high temperatures and their ability to endure harsh organic solvents. In this study, a novel gene encoding a true lipase was isolated by construction of a genomic DNA library of thermophilic Aneurinibacillus thermoaerophilus strain HZ in an Escherichia coli plasmid vector. Sequence analysis revealed that HZ lipase had 62% identity to a putative lipase from Bacillus pseudomycoides. The characterized lipases closest to HZ lipase are thermostable Bacillus and Geobacillus lipases belonging to subfamily I.5, with ≤ 57% identity. The amino acid sequence analysis of HZ lipase identified a conserved pentapeptide containing the active serine (GHSMG) and a Ca(2+)-binding motif (GCYGSD) in the enzyme. Protein structure modeling showed that HZ lipase consisted of an α/β hydrolase fold and a lid domain. Protein sequence alignment, conserved regions analysis, clustal distance matrix and amino acid composition illustrated differences between HZ lipase and other thermostable lipases. Phylogenetic analysis revealed that this lipase represented a new subfamily of family I of bacterial true lipases, classified as family I.9. The HZ lipase was expressed under promoter Plac using IPTG and was characterized. The recombinant enzyme showed optimal activity at 65 °C and retained ≥ 97% activity after incubation at 50 °C for 1 h. The HZ lipase was stable in various polar and non-polar organic solvents. PMID:26934700

  19. Mutational analysis of the human mitochondrial genome branches into the realm of bacterial genetics

    Energy Technology Data Exchange (ETDEWEB)

    Howell, N. [Univ. of Texas Medical Branch, Galveston, TX (United States)

    1996-10-01

    This is shaping up as a vintage year for studies of the genetics and evolution of the human mitochondrial genome (mtDNA). In a theoretical and experimental tour de force, Shenkar et al. (1996), on pages 772-780 of this issue, derive the mutation rate of the 4,977-bp (or "common") deletion in the human mtDNA through refinement and extension of fluctuation analysis, a technique that was first used >50 years ago. Shenkar et al., in essence, have solved or bypassed many of the difficulties that are inherent in the application of fluctuation analysis to human mitochondrial gene mutations. Their study is important for two principal reasons. In the first place, high levels of this deletion cause a variety of pathological disorders, including Kearns-Sayre syndrome and chronic progressive external ophthalmoplegia. Their current report, therefore, is a major step in the elucidation of the molecular genetic pathogenesis of this group of mitochondrial disorders. For example, it now may be feasible to analyze the effects of selection on transmission and segregation of this deletion and, perhaps, other mtDNA mutations as well. Second, and at a broader level, the approach of Shenkar et al. should find widespread applicability to the study of other mtDNA mutations. It has been recognized for several years that mammalian mtDNA mutates much more rapidly than nuclear DNA, a phenomenon with potentially profound evolutionary implications. It is exciting and useful, both experimentally and theoretically, that this "old" approach can be used for "new" applications. 56 refs.

  20. Cloning of the Full-Length Rhesus Cytomegalovirus Genome as an Infectious and Self-Excisable Bacterial Artificial Chromosome for Analysis of Viral Pathogenesis

    OpenAIRE

    Chang, W. L. William; Peter A Barry

    2003-01-01

    Rigorous investigation of many functions encoded by cytomegaloviruses (CMVs) requires analysis in the context of virus-host interactions. To facilitate the construction of rhesus CMV (RhCMV) mutants for in vivo studies, a bacterial artificial chromosome (BAC) containing an enhanced green fluorescent protein (EGFP) cassette was engineered into the intergenic region between unique short 1 (US1) and US2 of the full-length viral genome by Cre/lox-mediated recombination. Infectious virions were re...

  1. Genome-Scale Model Reveals Metabolic Basis of Biomass Partitioning in a Model Diatom.

    Science.gov (United States)

    Levering, Jennifer; Broddrick, Jared; Dupont, Christopher L; Peers, Graham; Beeri, Karen; Mayers, Joshua; Gallina, Alessandra A; Allen, Andrew E; Palsson, Bernhard O; Zengler, Karsten

    2016-01-01

    Diatoms are eukaryotic microalgae that contain genes from various sources, including bacteria and the secondary endosymbiotic host. Due to this unique combination of genes, diatoms are taxonomically and functionally distinct from other algae and vascular plants and confer novel metabolic capabilities. Based on the genome annotation, we performed a genome-scale metabolic network reconstruction for the marine diatom Phaeodactylum tricornutum. Due to their endosymbiotic origin, diatoms possess a complex chloroplast structure which complicates the prediction of subcellular protein localization. Based on previous work we implemented a pipeline that exploits a series of bioinformatics tools to predict protein localization. The manually curated reconstructed metabolic network iLB1027_lipid accounts for 1,027 genes associated with 4,456 reactions and 2,172 metabolites distributed across six compartments. To constrain the genome-scale model, we determined the organism specific biomass composition in terms of lipids, carbohydrates, and proteins using Fourier transform infrared spectrometry. Our simulations indicate the presence of a yet unknown glutamine-ornithine shunt that could be used to transfer reducing equivalents generated by photosynthesis to the mitochondria. The model reflects the known biochemical composition of P. tricornutum in defined culture conditions and enables metabolic engineering strategies to improve the use of P. tricornutum for biotechnological applications.

  2. Genome-Scale Model Reveals Metabolic Basis of Biomass Partitioning in a Model Diatom.

    Directory of Open Access Journals (Sweden)

    Jennifer Levering

    Full Text Available Diatoms are eukaryotic microalgae that contain genes from various sources, including bacteria and the secondary endosymbiotic host. Due to this unique combination of genes, diatoms are taxonomically and functionally distinct from other algae and vascular plants and confer novel metabolic capabilities. Based on the genome annotation, we performed a genome-scale metabolic network reconstruction for the marine diatom Phaeodactylum tricornutum. Due to their endosymbiotic origin, diatoms possess a complex chloroplast structure which complicates the prediction of subcellular protein localization. Based on previous work we implemented a pipeline that exploits a series of bioinformatics tools to predict protein localization. The manually curated reconstructed metabolic network iLB1027_lipid accounts for 1,027 genes associated with 4,456 reactions and 2,172 metabolites distributed across six compartments. To constrain the genome-scale model, we determined the organism specific biomass composition in terms of lipids, carbohydrates, and proteins using Fourier transform infrared spectrometry. Our simulations indicate the presence of a yet unknown glutamine-ornithine shunt that could be used to transfer reducing equivalents generated by photosynthesis to the mitochondria. The model reflects the known biochemical composition of P. tricornutum in defined culture conditions and enables metabolic engineering strategies to improve the use of P. tricornutum for biotechnological applications.

  3. Genome-Scale Model Reveals Metabolic Basis of Biomass Partitioning in a Model Diatom

    Science.gov (United States)

    Broddrick, Jared; Dupont, Christopher L.; Peers, Graham; Beeri, Karen; Mayers, Joshua; Gallina, Alessandra A.; Allen, Andrew E.; Palsson, Bernhard O.; Zengler, Karsten

    2016-01-01

    Diatoms are eukaryotic microalgae that contain genes from various sources, including bacteria and the secondary endosymbiotic host. Due to this unique combination of genes, diatoms are taxonomically and functionally distinct from other algae and vascular plants and confer novel metabolic capabilities. Based on the genome annotation, we performed a genome-scale metabolic network reconstruction for the marine diatom Phaeodactylum tricornutum. Due to their endosymbiotic origin, diatoms possess a complex chloroplast structure which complicates the prediction of subcellular protein localization. Based on previous work we implemented a pipeline that exploits a series of bioinformatics tools to predict protein localization. The manually curated reconstructed metabolic network iLB1027_lipid accounts for 1,027 genes associated with 4,456 reactions and 2,172 metabolites distributed across six compartments. To constrain the genome-scale model, we determined the organism specific biomass composition in terms of lipids, carbohydrates, and proteins using Fourier transform infrared spectrometry. Our simulations indicate the presence of a yet unknown glutamine-ornithine shunt that could be used to transfer reducing equivalents generated by photosynthesis to the mitochondria. The model reflects the known biochemical composition of P. tricornutum in defined culture conditions and enables metabolic engineering strategies to improve the use of P. tricornutum for biotechnological applications. PMID:27152931

  4. Genome-scale reconstruction of metabolic networks of Lactobacillus casei ATCC 334 and 12A.

    Directory of Open Access Journals (Sweden)

    Elena Vinay-Lara

    Full Text Available Lactobacillus casei strains are widely used in industry and the utility of this organism in these industrial applications is strain dependent. Hence, tools capable of predicting strain specific phenotypes would have utility in the selection of strains for specific industrial processes. Genome-scale metabolic models can be utilized to better understand genotype-phenotype relationships and to compare different organisms. To assist in the selection and development of strains with enhanced industrial utility, genome-scale models for L. casei ATCC 334, a well characterized strain, and strain 12A, a corn silage isolate, were constructed. Draft models were generated from RAST genome annotations using the Model SEED database and refined by evaluating ATP generating cycles, mass-and-charge-balances of reactions, and growth phenotypes. After the validation process was finished, we compared the metabolic networks of these two strains to identify metabolic, genetic and ortholog differences that may lead to different phenotypic behaviors. We conclude that the metabolic capabilities of the two networks are highly similar. The L. casei ATCC 334 model accounts for 1,040 reactions, 959 metabolites and 548 genes, while the L. casei 12A model accounts for 1,076 reactions, 979 metabolites and 640 genes. The developed L. casei ATCC 334 and 12A metabolic models will enable better understanding of the physiology of these organisms and be valuable tools in the development and selection of strains with enhanced utility in a variety of industrial applications.
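
    One of the refinement steps named in this abstract, checking the mass and charge balance of reactions, is simple enough to sketch. The example below is an assumption-laden illustration, not the authors' pipeline: the helper names are hypothetical, and the formulas and charges for the lactate dehydrogenase reaction are the protonation states commonly used in metabolic reconstructions at neutral pH, entered by hand rather than read from the Model SEED database.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Turn a formula such as 'C3H5O3' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(reaction, metabolites):
    """Return (unbalanced elements, net charge) for one reaction.

    reaction maps metabolite id -> stoichiometric coefficient
    (negative for substrates, positive for products).
    """
    elements = Counter()
    charge = 0
    for met, coeff in reaction.items():
        formula, met_charge = metabolites[met]
        for element, count in parse_formula(formula).items():
            elements[element] += coeff * count
        charge += coeff * met_charge
    return {e: v for e, v in elements.items() if v != 0}, charge


# Hand-entered (formula, charge) pairs; illustrative assumptions.
metabolites = {
    "pyr": ("C3H3O3", -1),
    "lac": ("C3H5O3", -1),
    "nadh": ("C21H27N7O14P2", -2),
    "nad": ("C21H26N7O14P2", -1),
    "h": ("H", 1),
}

# Lactate dehydrogenase: pyruvate + NADH + H+ -> lactate + NAD+
ldh = {"pyr": -1, "nadh": -1, "h": -1, "lac": 1, "nad": 1}
# The same reaction with the proton forgotten, a typical draft-model error.
ldh_missing_proton = {"pyr": -1, "nadh": -1, "lac": 1, "nad": 1}

balanced_result = imbalance(ldh, metabolites)
broken_result = imbalance(ldh_missing_proton, metabolites)
```

    A balanced reaction yields an empty element dictionary and zero net charge; the variant without the proton is flagged with one surplus hydrogen and one surplus positive charge on the product side.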

  5. Genome-scale reconstruction of the Streptococcus pyogenes M49 metabolic network reveals growth requirements and indicates potential drug targets

    NARCIS (Netherlands)

    J. Levering; T. Fiedler; A. Sieg; K.W.A. van Grinsven; S. Hering; N. Veith; B.G. Olivier; L. Klett; J. Hugenholtz; B. Teusink; B. Kreikemeyer; U. Kummer

    2016-01-01

    Genome-scale metabolic models comprise stoichiometric relations between metabolites, as well as associations between genes and metabolic reactions and facilitate the analysis of metabolism. We computationally reconstructed the metabolic network of the lactic acid bacterium Streptococcus pyogenes M49

  6. Reconstruction and analysis of a genome-scale metabolic model for Scheffersomyces stipitis

    Directory of Open Access Journals (Sweden)

    Balagurunathan Balaji

    2012-02-01

    Full Text Available Background: Fermentation of xylose, the major component in hemicellulose, is essential for economic conversion of lignocellulosic biomass to fuels and chemicals. The yeast Scheffersomyces stipitis (formerly known as Pichia stipitis) has the highest known native capacity for xylose fermentation and possesses several genes for lignocellulose bioconversion in its genome. Understanding the metabolism of this yeast at a global scale, by reconstructing the genome-scale metabolic model, is essential for manipulating its metabolic capabilities and for successful transfer of its capabilities to other industrial microbes. Results: We present a genome-scale metabolic model for Scheffersomyces stipitis, a native xylose-utilizing yeast. The model was reconstructed based on genome sequence annotation, detailed experimental investigation and known yeast physiology. Macromolecular composition of Scheffersomyces stipitis biomass was estimated experimentally and its ability to grow on different carbon, nitrogen, sulphur and phosphorus sources was determined by phenotype microarrays. The compartmentalized model, developed based on an iterative procedure, accounted for 814 genes, 1371 reactions, and 971 metabolites. In silico computed growth rates were compared with high-throughput phenotyping data and the model could predict the qualitative outcomes in 74% of substrates investigated. Model simulations were used to identify the biosynthetic requirements for anaerobic growth of Scheffersomyces stipitis on glucose and the results were validated with published literature. The bottlenecks in the Scheffersomyces stipitis metabolic network for xylose uptake and nucleotide cofactor recycling were identified by in silico flux variability analysis. The scope of the model in enhancing the mechanistic understanding of microbial metabolism is demonstrated by identifying a mechanism for mitochondrial respiration and oxidative phosphorylation. Conclusion: The genome-scale

  7. Genome-scale analysis of positional clustering of mouse testis-specific genes

    Directory of Open Access Journals (Sweden)

    Lee Bernett TK

    2005-01-01

    Full Text Available Abstract Background Genes are not randomly distributed on a chromosome as they were thought even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and recently reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.

  8. Determining the Control Circuitry of Redox Metabolism at the Genome-Scale

    DEFF Research Database (Denmark)

    Federowicz, Stephen; Kim, Donghyuk; Ebrahim, Ali;

    2014-01-01

    -scale metabolic model to show that ArcA and Fnr regulate >80% of total metabolic flux and 96% of differential gene expression across fermentative and nitrate respiratory conditions. Based on the data, we propose a feedforward with feedback trim regulatory scheme, given the extensive repression of catabolic genes...... that are regulated during electron acceptor shifts. Here we propose a qualitative model that accounts for the full breadth of regulated genes by detailing how two global transcription factors (TFs), ArcA and Fnr of E. coli, sense key metabolic redox ratios and act on a genome-wide basis to regulate anabolic......, catabolic, and energy generation pathways. We first fill gaps in our knowledge of this transcriptional regulatory network by carrying out ChIP-chip and gene expression experiments to identify 463 regulatory events. We then interfaced this reconstructed regulatory network with a highly curated genome...

  9. Comparative Genomics of Field Isolates of Mycobacterium bovis and M. caprae Provides Evidence for Possible Correlates with Bacterial Viability and Virulence

    Science.gov (United States)

    de la Fuente, José; Díez-Delgado, Iratxe; Contreras, Marinela; Vicente, Joaquín; Cabezas-Cruz, Alejandro; Tobes, Raquel; Manrique, Marina; López, Vladimir; Romero, Beatriz; Bezos, Javier; Dominguez, Lucas; Sevilla, Iker A.; Garrido, Joseba M.; Juste, Ramón; Madico, Guillermo; Jones-López, Edward; Gortazar, Christian

    2015-01-01

    Mycobacteria of the Mycobacterium tuberculosis complex (MTBC) greatly affect humans and animals worldwide. The life cycle of mycobacteria is complex and the mechanisms resulting in pathogen infection and survival in host cells are not fully understood. Recently, comparative genomics analyses have provided new insights into the evolution and adaptation of the MTBC to survive inside the host. However, most of this information has been obtained using M. tuberculosis but not other members of the MTBC such as M. bovis and M. caprae. In this study, the genome of three M. bovis (MB1, MB3, MB4) and one M. caprae (MB2) field isolates with different lesion score, prevalence and host distribution phenotypes were sequenced. Genome sequence information was used for whole-genome and protein-targeted comparative genomics analysis with the aim of finding correlates with phenotypic variation with potential implications for tuberculosis (TB) disease risk assessment and control. At the whole-genome level the results of the first comparative genomics study of field isolates of M. bovis including M. caprae showed that as previously reported for M. tuberculosis, sequential chromosomal nucleotide substitutions were the main driver of the M. bovis genome evolution. The phylogenetic analysis provided a strong support for the M. bovis/M. caprae clade, but supported M. caprae as a separate species. The comparison of the MB1 and MB4 isolates revealed differences in genome sequence, including gene families that are important for bacterial infection and transmission, thus highlighting differences with functional implications between isolates otherwise classified with the same spoligotype. 
Strategic protein-targeted analysis using the ESX or type VII secretion system, proteins linking stress response with lipid metabolism, host T cell epitopes of mycobacteria, antigens and peptidoglycan assembly protein identified new genetic markers and candidate vaccine antigens that warrant further study to

  10. Comparative Genomics of Field Isolates of Mycobacterium bovis and M. caprae Provides Evidence for Possible Correlates with Bacterial Viability and Virulence.

    Directory of Open Access Journals (Sweden)

    José de la Fuente

    2015-11-01

    Full Text Available Mycobacteria of the Mycobacterium tuberculosis complex (MTBC) greatly affect humans and animals worldwide. The life cycle of mycobacteria is complex and the mechanisms resulting in pathogen infection and survival in host cells are not fully understood. Recently, comparative genomics analyses have provided new insights into the evolution and adaptation of the MTBC to survive inside the host. However, most of this information has been obtained using M. tuberculosis but not other members of the MTBC such as M. bovis and M. caprae. In this study, the genome of three M. bovis (MB1, MB3, MB4) and one M. caprae (MB2) field isolates with different lesion score, prevalence and host distribution phenotypes were sequenced. Genome sequence information was used for whole-genome and protein-targeted comparative genomics analysis with the aim of finding correlates with phenotypic variation with potential implications for tuberculosis (TB) disease risk assessment and control. At the whole-genome level the results of the first comparative genomics study of field isolates of M. bovis including M. caprae showed that as previously reported for M. tuberculosis, sequential chromosomal nucleotide substitutions were the main driver of the M. bovis genome evolution. The phylogenetic analysis provided a strong support for the M. bovis/M. caprae clade, but supported M. caprae as a separate species. The comparison of the MB1 and MB4 isolates revealed differences in genome sequence, including gene families that are important for bacterial infection and transmission, thus highlighting differences with functional implications between isolates otherwise classified with the same spoligotype. Strategic protein-targeted analysis using the ESX or type VII secretion system, proteins linking stress response with lipid metabolism, host T cell epitopes of mycobacteria, antigens and peptidoglycan assembly protein identified new genetic markers and candidate vaccine antigens that warrant

  11. Comparative Genomics of Field Isolates of Mycobacterium bovis and M. caprae Provides Evidence for Possible Correlates with Bacterial Viability and Virulence.

    Science.gov (United States)

    de la Fuente, José; Díez-Delgado, Iratxe; Contreras, Marinela; Vicente, Joaquín; Cabezas-Cruz, Alejandro; Tobes, Raquel; Manrique, Marina; López, Vladimir; Romero, Beatriz; Bezos, Javier; Dominguez, Lucas; Sevilla, Iker A; Garrido, Joseba M; Juste, Ramón; Madico, Guillermo; Jones-López, Edward; Gortazar, Christian

    2015-11-01

    Mycobacteria of the Mycobacterium tuberculosis complex (MTBC) greatly affect humans and animals worldwide. The life cycle of mycobacteria is complex and the mechanisms resulting in pathogen infection and survival in host cells are not fully understood. Recently, comparative genomics analyses have provided new insights into the evolution and adaptation of the MTBC to survive inside the host. However, most of this information has been obtained using M. tuberculosis but not other members of the MTBC such as M. bovis and M. caprae. In this study, the genome of three M. bovis (MB1, MB3, MB4) and one M. caprae (MB2) field isolates with different lesion score, prevalence and host distribution phenotypes were sequenced. Genome sequence information was used for whole-genome and protein-targeted comparative genomics analysis with the aim of finding correlates with phenotypic variation with potential implications for tuberculosis (TB) disease risk assessment and control. At the whole-genome level the results of the first comparative genomics study of field isolates of M. bovis including M. caprae showed that as previously reported for M. tuberculosis, sequential chromosomal nucleotide substitutions were the main driver of the M. bovis genome evolution. The phylogenetic analysis provided a strong support for the M. bovis/M. caprae clade, but supported M. caprae as a separate species. The comparison of the MB1 and MB4 isolates revealed differences in genome sequence, including gene families that are important for bacterial infection and transmission, thus highlighting differences with functional implications between isolates otherwise classified with the same spoligotype. 
Strategic protein-targeted analysis using the ESX or type VII secretion system, proteins linking stress response with lipid metabolism, host T cell epitopes of mycobacteria, antigens and peptidoglycan assembly protein identified new genetic markers and candidate vaccine antigens that warrant further study to

  12. Fusion of large-scale genomic knowledge and frequency data computationally prioritizes variants in epilepsy.

    Science.gov (United States)

    Campbell, Ian M; Rao, Mitchell; Arredondo, Sean D; Lalani, Seema R; Xia, Zhilian; Kang, Sung-Hae L; Bi, Weimin; Breman, Amy M; Smith, Janice L; Bacino, Carlos A; Beaudet, Arthur L; Patel, Ankita; Cheung, Sau Wai; Lupski, James R; Stankiewicz, Paweł; Ramocki, Melissa B; Shaw, Chad A

    2013-01-01

    Curation and interpretation of copy number variants identified by genome-wide testing is challenged by the large number of events harbored in each personal genome. Conventional determination of phenotypic relevance relies on patterns of higher frequency in affected individuals versus controls; however, an increasing amount of ascertained variation is rare or private to clans. Consequently, frequency data have less utility to resolve pathogenic from benign. One solution is disease-specific algorithms that leverage gene knowledge together with variant frequency to aid prioritization. We used large-scale resources including Gene Ontology, protein-protein interactions and other annotation systems together with a broad set of 83 genes with known associations to epilepsy to construct a pathogenicity score for the phenotype. We evaluated the score for all annotated human genes and applied Bayesian methods to combine the derived pathogenicity score with frequency information from our diagnostic laboratory. Analysis determined Bayes factors and posterior distributions for each gene. We applied our method to subjects with abnormal chromosomal microarray results and confirmed epilepsy diagnoses gathered by electronic medical record review. Genes deleted in our subjects with epilepsy had significantly higher pathogenicity scores and Bayes factors compared to subjects referred for non-neurologic indications. We also applied our scores to identify a recently validated epilepsy gene in a complex genomic region and to reveal candidate genes for epilepsy. We propose a potential use in clinical decision support for our results in the context of genome-wide screening. Our approach demonstrates the utility of integrative data in medical genomics.

  13. Fusion of large-scale genomic knowledge and frequency data computationally prioritizes variants in epilepsy.

    Directory of Open Access Journals (Sweden)

    Ian M Campbell

    Full Text Available Curation and interpretation of copy number variants identified by genome-wide testing is challenged by the large number of events harbored in each personal genome. Conventional determination of phenotypic relevance relies on patterns of higher frequency in affected individuals versus controls; however, an increasing amount of ascertained variation is rare or private to clans. Consequently, frequency data have less utility to resolve pathogenic from benign. One solution is disease-specific algorithms that leverage gene knowledge together with variant frequency to aid prioritization. We used large-scale resources including Gene Ontology, protein-protein interactions and other annotation systems together with a broad set of 83 genes with known associations to epilepsy to construct a pathogenicity score for the phenotype. We evaluated the score for all annotated human genes and applied Bayesian methods to combine the derived pathogenicity score with frequency information from our diagnostic laboratory. Analysis determined Bayes factors and posterior distributions for each gene. We applied our method to subjects with abnormal chromosomal microarray results and confirmed epilepsy diagnoses gathered by electronic medical record review. Genes deleted in our subjects with epilepsy had significantly higher pathogenicity scores and Bayes factors compared to subjects referred for non-neurologic indications. We also applied our scores to identify a recently validated epilepsy gene in a complex genomic region and to reveal candidate genes for epilepsy. We propose a potential use in clinical decision support for our results in the context of genome-wide screening. Our approach demonstrates the utility of integrative data in medical genomics.

  14. Archaeal and bacterial community dynamics and bioprocess performance of a bench-scale two-stage anaerobic digester.

    Science.gov (United States)

    Gonzalez-Martinez, Alejandro; Garcia-Ruiz, Maria Jesus; Rodriguez-Sanchez, Alejandro; Osorio, Francisco; Gonzalez-Lopez, Jesus

    2016-07-01

    Two-stage technologies have been developed for anaerobic digestion of waste-activated sludge. In this study, the archaeal and bacterial community structure dynamics and bioprocess performance of a bench-scale two-stage anaerobic digester treating urban sewage sludge have been studied by the means of high-throughput sequencing techniques and physicochemical parameters such as pH, dried sludge, volatile dried sludge, acid concentration, alkalinity, and biogas generation. The coupled analyses of archaeal and bacterial communities and physicochemical parameters showed a direct relationship between archaeal and bacterial populations and bioprocess performance during start-up and working operation of a two-stage anaerobic digester. Moreover, results demonstrated that archaeal and bacterial community structure was affected by changes in the acid/alkalinity ratio in the bioprocess. Thus, a predominance of the acetoclastic methanogen Methanosaeta was observed in the methanogenic bioreactor at high-value acid/alkaline ratio, while a predominance of Methanomassilicoccaeceae archaea and Methanoculleus genus was observed in the methanogenic bioreactor at low-value acid/alkaline ratio. Biodiversity tag-iTag sequencing studies showed that methanogenic archaea can be also detected in the acidogenic bioreactor, although its biological activity was decreased after 4 months of operation as supported by physicochemical analyses. Also, studies of the VFA producers and VFA consumers microbial populations showed as these microbiota were directly affected by the physicochemical parameters generated in the bioreactors. We suggest that the results obtained in our study could be useful for future implementations of two-stage anaerobic digestion processes at both bench- and full-scale. PMID:26940050

  15. An integrated approach to reconstructing genome-scale transcriptional regulatory networks.

    Science.gov (United States)

    Imam, Saheed; Noguera, Daniel R; Donohue, Timothy J

    2015-02-01

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating

  16. An integrated approach to reconstructing genome-scale transcriptional regulatory networks.

    Directory of Open Access Journals (Sweden)

    Saheed Imam

    2015-02-01

    Full Text Available Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages

  17. Sexagesimal scale for mapping human genome Escala sexagesimal para mapear el genoma humano

    Directory of Open Access Journals (Sweden)

    RICARDO CRUZ-COKE

    2001-03-01

    Full Text Available In a previous work I designed a diagram of the human genome based on a circular ideogram of the haploid set of chromosomes, using a low-resolution scale in megabase units. The purpose of this work is to draft a new scale to measure the physical map of the human genome at the highest resolution level. The entire length of the haploid genome of males is laid out on a circumference, marked with a sexagesimal scale of 360 degrees and 1296000 arcseconds. The radius of this circumference displays a semilogarithmic metric scale from 1 m to the nanometer level. The base pair level of DNA sequences, 10^-9 of this circumference, is measured in milliarcsecond (mas) units, each equivalent to a thousandth of an arcsecond. The "mas" unit corresponds to 1.27 nanometers (nm) or 0.427 base pair (bp) and is the framework for measuring DNA sequences. Thus the three billion base pairs of the human genome may be identified by 1296000000 "mas" units in continuous correlation from number 1 to number 1296000000. This sexagesimal scale covers all the levels of the nuclear genetic material, from nucleotides to chromosomes. The locations of every codon and every gene may be numbered in the physical map of chromosome regions according to this new scale, instead of the partial kilobase and megabase scales used today. The advantage of the new scale is the unification of the set of chromosomes under a continuous scale of measurement at the DNA level, facilitating the correlation with the phenotypes of man and other species.
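
    The sexagesimal bookkeeping described in this abstract can be sketched in a few lines of Python. This is only an illustration of the unit arithmetic (360 degrees = 1296000 arcseconds = 1296000000 mas), not the author's own procedure. The function names, the zero-based offsets, and the assumed genome length of exactly 3,000,000,000 bp are hypothetical choices made for the example.

```python
# Illustrative sketch only: names and the genome length are assumptions,
# not values or methods taken from the paper.
FULL_CIRCLE_MAS = 360 * 60 * 60 * 1000  # 1,296,000,000 milliarcseconds
ASSUMED_GENOME_BP = 3_000_000_000       # assumed haploid genome length


def mas_to_dms(mas):
    """Split a zero-based mas offset into (degrees, arcmin, arcsec, mas)."""
    if not 0 <= mas < FULL_CIRCLE_MAS:
        raise ValueError("offset must lie within one full circle")
    degrees, rest = divmod(mas, 3_600_000)  # 3,600,000 mas per degree
    arcmin, rest = divmod(rest, 60_000)     # 60,000 mas per arcminute
    arcsec, milli = divmod(rest, 1_000)     # 1,000 mas per arcsecond
    return degrees, arcmin, arcsec, milli


def bp_to_mas(bp, genome_bp=ASSUMED_GENOME_BP):
    """Map a zero-based base-pair offset linearly onto the mas scale."""
    if not 0 <= bp < genome_bp:
        raise ValueError("base-pair offset outside the assumed genome")
    return bp * FULL_CIRCLE_MAS // genome_bp


if __name__ == "__main__":
    # A position halfway along the assumed genome sits at 180 degrees.
    print(mas_to_dms(bp_to_mas(1_500_000_000)))
```

    Integer arithmetic is used throughout so that positions round consistently. A different genome length changes only the `genome_bp` argument.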

  18. Comparative genomic analysis of Xanthomonas axonopodis pv. citrumelo F1, which causes citrus bacterial spot disease, and related strains provides insights into virulence and host specificity.

    Science.gov (United States)

    Jalan, Neha; Aritua, Valente; Kumar, Dibyendu; Yu, Fahong; Jones, Jeffrey B; Graham, James H; Setubal, João C; Wang, Nian

    2011-11-01

    Xanthomonas axonopodis pv. citrumelo is a citrus pathogen causing citrus bacterial spot disease that is geographically restricted within the state of Florida. Illumina, 454 sequencing, and optical mapping were used to obtain a complete genome sequence of X. axonopodis pv. citrumelo strain F1, 4.9 Mb in size. The strain lacks plasmids, in contrast to other citrus Xanthomonas pathogens. Phylogenetic analysis revealed that this pathogen is very close to the tomato bacterial spot pathogen X. campestris pv. vesicatoria 85-10, with a completely different host range. We also compared X. axonopodis pv. citrumelo to the genome of citrus canker pathogen X. axonopodis pv. citri 306. Comparative genomic analysis showed differences in several gene clusters, like those for type III effectors, the type IV secretion system, lipopolysaccharide synthesis, and others. In addition to pthA, effectors such as xopE3, xopAI, and hrpW were absent from X. axonopodis pv. citrumelo while present in X. axonopodis pv. citri. These effectors might be responsible for survival and the low virulence of this pathogen on citrus compared to that of X. axonopodis pv. citri. We also identified unique effectors in X. axonopodis pv. citrumelo that may be related to the different host range as compared to that of X. axonopodis pv. citri. X. axonopodis pv. citrumelo also lacks various genes, such as syrE1, syrE2, and RTX toxin family genes, which were present in X. axonopodis pv. citri. These may be associated with the distinct virulences of X. axonopodis pv. citrumelo and X. axonopodis pv. citri. Comparison of the complete genome sequence of X. axonopodis pv. citrumelo to those of X. axonopodis pv. citri and X. campestris pv. vesicatoria provides valuable insights into the mechanism of bacterial virulence and host specificity.

  19. Why close a bacterial genome? The plasmid of Alteromonas macleodii HOT1A3 is a vector for inter-specific transfer of a flexible genomic island

    Directory of Open Access Journals (Sweden)

    Eduard Fadeev

    2016-03-01

    Full Text Available Genome sequencing is rapidly becoming a staple technique in environmental and clinical microbiology, yet computational challenges still remain, leading to many draft genomes which are typically fragmented into many contigs. We sequenced and completely assembled the genome of a marine heterotrophic bacterium, Alteromonas macleodii HOT1A3, and compared its full genome to several draft genomes obtained using different reference-based and de-novo methods. In general, the de-novo assemblies clearly outperformed the reference-based or hybrid ones, covering >99% of the genes and representing essentially all of the gene functions. However, only the fully closed genome (~4.5 Mbp) allowed us to identify the presence of a large, 148 kbp plasmid, pAM1A3. While HOT1A3 belongs to Alteromonas macleodii, typically found in surface waters (surface ecotype), this plasmid consists of an almost complete flexible genomic island, containing many genes involved in metal resistance previously identified in the genomes of Alteromonas mediterranea (deep ecotype). Indeed, similar to A. mediterranea, A. macleodii HOT1A3 grows at concentrations of zinc, mercury and copper that are inhibitory for other A. macleodii strains. The presence of a plasmid encoding almost an entire flexible genomic island suggests that wholesale genomic exchange between heterotrophic marine bacteria belonging to related but ecologically different populations is not uncommon.

  20. Biological Removal of Phosphate Using Phosphate Solubilizing Bacterial Consortium from Synthetic Wastewater: A Laboratory Scale

    Directory of Open Access Journals (Sweden)

    Dipak Paul

    2015-01-01

    Full Text Available Biological phosphate removal is an important process that has gained worldwide attention and is widely used for removing phosphorus from wastewater. The present investigation aimed to screen efficient phosphate solubilizing bacterial isolates and use them to remove phosphate from synthetic wastewater under shake-flask conditions. Pseudomonas sp. JPSB12, Enterobacter sp. TPSB20, Flavobacterium sp. TPSB23 and a mixed bacterial consortium (Pseudomonas sp. JPSB12 + Enterobacter sp. TPSB20 + Flavobacterium sp. TPSB23) were used for the removal of phosphate. Among the individual strains, Enterobacter sp. TPSB20 removed the most phosphate (61.75%) from synthetic wastewater in the presence of glucose as a carbon source. The consortium removed phosphate more effectively (74.15-82.50%) from the synthetic wastewater than the individual strains did. The pH changes in the culture medium over time and extracellular phosphatase activity (acid and alkaline) were also investigated. The efficient removal of phosphate by the consortium may be due to the synergistic activity among the individual strains and phosphatase enzyme activity. The use of a bacterial consortium in the remediation of phosphate-contaminated aquatic environments is discussed.

  1. Surface physicochemical properties at the micro and nano length scales: role on bacterial adhesion and Xylella fastidiosa biofilm development.

    Directory of Open Access Journals (Sweden)

    Gabriela S Lorite

    Full Text Available The phytopathogen Xylella fastidiosa grows as a biofilm causing vascular occlusion and consequently nutrient and water stress in different plant hosts by adhesion on xylem vessel surfaces composed of cellulose, hemicellulose, pectin and proteins. Understanding the factors which influence bacterial adhesion and biofilm development is a key issue in identifying mechanisms for preventing biofilm formation in infected plants. In this study, we show that X. fastidiosa biofilm development and architecture correlate well with physicochemical surface properties after interaction with the culture medium. Different biotic and abiotic substrates such as silicon (Si) and derivatized cellulose films were studied. Both biofilms and substrates were characterized at the micro- and nanoscale, which corresponds to the actual bacterial cell and membrane/protein length scales, respectively. Our experimental results clearly indicate that the presence of surfaces with different chemical composition affects X. fastidiosa behavior from the point of view of gene expression and adhesion functionality. Bacterial adhesion is facilitated on more hydrophilic surfaces with higher surface potentials; XadA1 adhesin reveals different strengths of interaction on these surfaces. Nonetheless, despite different architectural biofilm geometries and rates of development, the colonization process occurs on all investigated surfaces. Our results univocally support the hypothesis that different adhesion mechanisms are active along the biofilm life cycle representing an adaptation mechanism for variations on the specific xylem vessel composition, which the bacterium encounters within the infected plant.

  2. Small Traditional Human Communities Sustain Genomic Diversity over Microgeographic Scales despite Linguistic Isolation.

    Science.gov (United States)

    Cox, Murray P; Hudjashov, Georgi; Sim, Andre; Savina, Olga; Karafet, Tatiana M; Sudoyo, Herawati; Lansing, J Stephen

    2016-09-01

    At least since the Neolithic, humans have largely lived in networks of small, traditional communities. Often socially isolated, these groups evolved distinct languages and cultures over microgeographic scales of just tens of kilometers. Population genetic theory tells us that genetic drift should act quickly in such isolated groups, thus raising the question: do networks of small human communities maintain levels of genetic diversity over microgeographic scales? This question can no longer be asked in most parts of the world, which have been heavily impacted by historical events that make traditional society structures the exception. However, such studies remain possible in parts of Island Southeast Asia and Oceania, where traditional ways of life are still practiced. We captured genome-wide genetic data, together with linguistic records, for a case-study system: eight villages distributed across Sumba, a small, remote island in eastern Indonesia. More than 4,000 years after these communities were established during the Neolithic period, most speak different languages and can be distinguished genetically. Yet their nuclear diversity is not reduced, instead being comparable to other, even much larger, regional groups. Modeling reveals a separation of time scales: while languages and culture can evolve quickly, creating social barriers, sporadic migration averaged over many generations is sufficient to keep villages linked genetically. This loosely-connected network structure, once the global norm and still extant on Sumba today, provides a living proxy to explore fine-scale genome dynamics in the sort of small traditional communities within which the most recent episodes of human evolution occurred. PMID:27274003

  3. Integrating Kinetic Model of E. coli with Genome Scale Metabolic Fluxes Overcomes Its Open System Problem and Reveals Bistability in Central Metabolism.

    Directory of Open Access Journals (Sweden)

    Ahmad A Mannan

    Full Text Available An understanding of the dynamics of the metabolic profile of a bacterial cell is sought from a dynamical systems analysis of kinetic models. This modelling formalism relies on a deterministic mathematical description of enzyme kinetics and their metabolite regulation. However, it is severely impeded by the lack of available kinetic information, limiting the size of the system that can be modelled. Furthermore, the subsystem of the metabolic network whose dynamics can be modelled is faced with three problems: how to parameterize the model with mostly incomplete steady state data, how to close what is now an inherently open system, and how to account for the impact on growth. In this study we address these challenges of kinetic modelling by capitalizing on multi-'omics' steady state data and a genome-scale metabolic network model. We use these to generate parameters that integrate knowledge embedded in the genome-scale metabolic network model, into the most comprehensive kinetic model of the central carbon metabolism of E. coli realized to date. As an application, we performed a dynamical systems analysis of the resulting enriched model. This revealed bistability of the central carbon metabolism and thus its potential to express two distinct metabolic states. Furthermore, since our model-informing technique ensures both stable states are constrained by the same thermodynamically feasible steady state growth rate, the ensuing bistability represents a temporal coexistence of the two states, and by extension, reveals the emergence of a phenotypically heterogeneous population.

  4. An Experimentally-Supported Genome-Scale Metabolic Network Reconstruction for Yersinia pestis CO92

    Energy Technology Data Exchange (ETDEWEB)

    Charusanti, Pep; Chauhan, Sadhana; Mcateer, Kathleen; Lerman, Joshua A.; Hyduke, Daniel R.; Motin, Vladimir L.; Ansong, Charles; Adkins, Joshua N.; Palsson, Bernhard O.

    2011-10-13

    Yersinia pestis is a gram-negative bacterium that causes plague, a disease linked historically to the Black Death in Europe during the Middle Ages and to several outbreaks during the modern era. Metabolism in Y. pestis displays remarkable flexibility and robustness, allowing the bacterium to proliferate in both warm-blooded mammalian hosts and cold-blooded insect vectors such as fleas. Here we report a genome-scale reconstruction and mathematical model of metabolism for Y. pestis CO92 and supporting experimental growth and metabolite measurements. The model contains 815 genes, 678 proteins, 963 unique metabolites and 1678 reactions, accurately simulates growth on a range of carbon sources both qualitatively and quantitatively, and identifies gaps in several key biosynthetic pathways and suggests how those gaps might be filled. Furthermore, our model presents hypotheses to explain certain known nutritional requirements characteristic of this strain. Y. pestis continues to be a dangerous threat to human health during modern times. The Y. pestis genome-scale metabolic reconstruction presented here, which has been benchmarked against experimental data and correctly reproduces known phenotypes, thus provides an in silico platform with which to investigate the metabolism of this important human pathogen.

  5. An experimentally-supported genome-scale metabolic network reconstruction for Yersinia pestis CO92

    Directory of Open Access Journals (Sweden)

    Motin Vladimir L

    2011-10-01

    Full Text Available Abstract Background: Yersinia pestis is a gram-negative bacterium that causes plague, a disease linked historically to the Black Death in Europe during the Middle Ages and to several outbreaks during the modern era. Metabolism in Y. pestis displays remarkable flexibility and robustness, allowing the bacterium to proliferate in both warm-blooded mammalian hosts and cold-blooded insect vectors such as fleas. Results: Here we report a genome-scale reconstruction and mathematical model of metabolism for Y. pestis CO92 and supporting experimental growth and metabolite measurements. The model contains 815 genes, 678 proteins, 963 unique metabolites and 1678 reactions, accurately simulates growth on a range of carbon sources both qualitatively and quantitatively, and identifies gaps in several key biosynthetic pathways and suggests how those gaps might be filled. Furthermore, our model presents hypotheses to explain certain known nutritional requirements characteristic of this strain. Conclusions: Y. pestis continues to be a dangerous threat to human health during modern times. The Y. pestis genome-scale metabolic reconstruction presented here, which has been benchmarked against experimental data and correctly reproduces known phenotypes, provides an in silico platform with which to investigate the metabolism of this important human pathogen.

  6. A Method to Constrain Genome-Scale Models with 13C Labeling Data.

    Directory of Open Access Journals (Sweden)

    Héctor García Martín

    2015-09-01

    Full Text Available Current limitations in quantitatively predicting biological behavior hinder our efforts to engineer biological systems to produce biofuels and other desired chemicals. Here, we present a new method for calculating metabolic fluxes, key targets in metabolic engineering, that incorporates data from 13C labeling experiments and genome-scale models. The data from 13C labeling experiments provide strong flux constraints that eliminate the need to assume an evolutionary optimization principle such as the growth rate optimization assumption used in Flux Balance Analysis (FBA). This effective constraining is achieved by making the simple but biologically relevant assumption that flux flows from core to peripheral metabolism and does not flow back. The new method is significantly more robust than FBA with respect to errors in genome-scale model reconstruction. Furthermore, it can provide a comprehensive picture of metabolite balancing and predictions for unmeasured extracellular fluxes as constrained by 13C labeling data. A comparison shows that the results of this new method are similar to those found through 13C Metabolic Flux Analysis (13C MFA) for central carbon metabolism but, additionally, it provides flux estimates for peripheral metabolism. The extra validation gained by matching 48 relative labeling measurements is used to identify where and why several existing COnstraint Based Reconstruction and Analysis (COBRA) flux prediction algorithms fail. We demonstrate how to use this knowledge to refine these methods and improve their predictive capabilities. This method provides a reliable base upon which to improve the design of biological systems.
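
For reference, the growth-rate optimization that FBA assumes, and that the 13C-constrained method is said to avoid, is usually written as the linear program below. The symbols are standard textbook notation rather than the paper's own: $S$ is the stoichiometric matrix, $v$ the flux vector, $c$ the objective weights (typically selecting the biomass reaction), and $v^{\min}$, $v^{\max}$ the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
```

Dropping the objective leaves only the steady-state and bound constraints, which is why additional measurements such as labeling data are needed to pin down the fluxes.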

  7. A genome-scale metabolic model of the lipid-accumulating yeast Yarrowia lipolytica

    Directory of Open Access Journals (Sweden)

    Loira Nicolas

    2012-05-01

    Full Text Available Abstract Background: Yarrowia lipolytica is an oleaginous yeast which has emerged as an important microorganism for several biotechnological processes, such as the production of organic acids, lipases and proteases. It is also considered a good candidate for single-cell oil production. Although some of its metabolic pathways are well studied, its metabolic engineering is hindered by the lack of a genome-scale model that integrates the current knowledge about its metabolism. Results: Combining in silico tools and expert manual curation, we have produced an accurate genome-scale metabolic model for Y. lipolytica. Using a scaffold derived from a functional metabolic model of the well-studied but phylogenetically distant yeast S. cerevisiae, we mapped conserved reactions, rewrote gene associations, added species-specific reactions and inserted specialized copies of scaffold reactions to account for species-specific expansion of protein families. We used physiological measures obtained under lab conditions to validate our predictions. Conclusions: Y. lipolytica iNL895 represents the first well-annotated metabolic model of an oleaginous yeast, providing a base for future metabolic improvement, and a starting point for the metabolic reconstruction of other species in the Yarrowia clade and other oleaginous yeasts.

  8. MultiMetEval: comparative and multi-objective analysis of genome-scale metabolic models.

    Directory of Open Access Journals (Sweden)

    Piotr Zakrzewski

    Full Text Available Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the context of multiple cellular objectives. Here, we present the user-friendly software framework Multi-Metabolic Evaluator (MultiMetEval), built upon SurreyFBA, which allows the user to compose collections of metabolic models that together can be subjected to flux balance analysis. Additionally, MultiMetEval implements functionalities for multi-objective analysis by calculating the Pareto front between two cellular objectives. Using a previously generated dataset of 38 actinobacterial genome-scale metabolic models, we show how these approaches can lead to exciting novel insights. Firstly, after incorporating several pathways for the biosynthesis of natural products into each of these models, comparative flux balance analysis predicted that species like Streptomyces that harbour the highest diversity of secondary metabolite biosynthetic gene clusters in their genomes do not necessarily have the metabolic network topology most suitable for compound overproduction. Secondly, multi-objective analysis of biomass production and natural product biosynthesis in these actinobacteria shows that the well-studied occurrence of discrete metabolic switches during the change of cellular objectives is inherent to their metabolic network architecture. Comparative and multi-objective modelling can lead to insights that could not be obtained by normal flux balance analyses. MultiMetEval provides a powerful platform that makes these analyses straightforward for biologists. Sources and binaries of MultiMetEval are freely available from https://github.com/PiotrZakrzewski/MetEval/downloads.
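
The notion of a Pareto front between two cellular objectives can be illustrated with a minimal sketch. It assumes that each candidate flux distribution has already been scored on two objectives to be maximized (labeled here as biomass and product flux); the function name and the sample scores are hypothetical and are not taken from MultiMetEval or SurreyFBA.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (biomass flux, product flux), both maximized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective."""
    front: List[Point] = []
    best_second = float("-inf")
    # Scan from the highest to the lowest first objective. A point is kept
    # only if it improves on the best second objective seen so far;
    # otherwise some earlier point is at least as good on both objectives.
    for p in sorted(set(points), key=lambda q: (-q[0], -q[1])):
        if p[1] > best_second:
            front.append(p)
            best_second = p[1]
    return sorted(front)


# Hypothetical scores, for illustration only.
candidates = [(1.0, 0.0), (0.8, 0.3), (0.8, 0.1), (0.5, 0.6), (0.4, 0.5), (0.0, 0.7)]
front = pareto_front(candidates)
```

In a real analysis each point would come from solving a flux balance problem with one objective fixed at a given level, so the front traces the trade-off between growth and product formation.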

  9. Identification of novel targets for breast cancer by exploring gene switches on a genome scale

    Directory of Open Access Journals (Sweden)

    Wu Ming

    2011-11-01

    Full Text Available Abstract Background: An important feature that emerges from analyzing gene regulatory networks is the "switch-like behavior" or "bistability", a dynamic feature of a particular gene to preferentially toggle between two steady-states. The state of gene switches plays pivotal roles in cell fate decision, but identifying switches has been difficult. Therefore a challenge confronting the field is to be able to systematically identify gene switches. Results: We propose a top-down mining approach to exploring gene switches on a genome-scale level. Theoretical analysis, proof-of-concept examples, and experimental studies demonstrate the ability of our mining approach to identify bistable genes by sampling across a variety of different conditions. Applying the approach to human breast cancer data identified genes that show bimodality within the cancer samples, such as estrogen receptor (ER) and ERBB2, as well as genes that show bimodality between cancer and non-cancer samples, where tumor-associated calcium signal transducer 2 (TACSTD2) is uncovered. We further suggest a likely transcription factor that regulates TACSTD2. Conclusions: Our mining approach demonstrates that one can capitalize on genome-wide expression profiling to capture dynamic properties of a complex network. To the best of our knowledge, this is the first attempt in applying mining approaches to explore gene switches on a genome-scale, and the identification of TACSTD2 demonstrates that single cell-level bistability can be predicted from microarray data. Experimental confirmation of the computational results suggests TACSTD2 could be a potential biomarker and attractive candidate for drug therapy against both ER+ and ER- subtypes of breast cancer, including the triple negative subtype.
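
Screening expression profiles for bimodality can be sketched with a generic statistic. The example below uses the population form of Sarle's bimodality coefficient, which is an assumption made for illustration and not the mining approach of the paper; the expression values are invented. Values above 5/9 (the coefficient of a uniform distribution) are commonly read as a hint of bimodality.

```python
from typing import Sequence


def bimodality_coefficient(values: Sequence[float]) -> float:
    """Population form of Sarle's coefficient: (skewness^2 + 1) / kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; the coefficient is undefined")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # plain (non-excess) kurtosis
    return (skewness ** 2 + 1) / kurtosis


THRESHOLD = 5 / 9  # coefficient of a uniform distribution

# Invented expression levels across samples, for illustration only.
switch_like = [0.0] * 10 + [10.0] * 10        # two well-separated groups
single_mode = [1, 2, 2, 3, 3, 3, 4, 4, 5]     # one central peak

is_switch = bimodality_coefficient(switch_like) > THRESHOLD
is_single = bimodality_coefficient(single_mode) <= THRESHOLD
```

A coefficient of this kind only flags candidates; it says nothing about whether the two modes correspond to distinct steady states of the underlying regulatory network.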

  10. T4SP Database 2.0: An Improved Database for Type IV Secretion Systems in Bacterial Genomes with New Online Analysis Tools

    Science.gov (United States)

    Han, Na; Yu, Weiwen; Qiang, Yujun

    2016-01-01

    Type IV secretion system (T4SS) can mediate the passage of macromolecules across cellular membranes and is essential for virulence and genetic material exchange among bacterial species. The Type IV Secretion Project 2.0 (T4SP 2.0) database is an improved and extended version of the platform released in 2013 aimed at assisting with the detection of Type IV secretion systems (T4SS) in bacterial genomes. This advanced version provides users with web server tools for detecting the existence and variations of T4SS genes online. The new interface for the genome browser provides user-friendly access to the most complete and accurate resource of T4SS gene information (e.g., gene number, name, type, position, sequence, related articles, and quick links to other webs). Currently, this online database includes T4SS information of 5239 bacterial strains. Conclusions. T4SS is one of the most versatile secretion systems necessary for the virulence and survival of bacteria and the secretion of protein and/or DNA substrates from a donor to a recipient cell. This database on virB/D genes of the T4SS system will help scientists worldwide to improve their knowledge on secretion systems and also identify potential pathogenic mechanisms of various microbial species.

  11. Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics

    Directory of Open Access Journals (Sweden)

    Brunham Robert C

    2004-07-01

    Full Text Available Abstract Background: We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results: We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions: The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
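
How a Weibull parameter β can be estimated from data is sketched below. The example assumes the common two-parameter survival form R(x) = exp(-(x/η)^β) and fits it by least squares after the linearization ln(-ln R) = β ln x - β ln η. The abstract does not state which parametrization or fitting procedure the authors used, so the function names and the synthetic data are illustrative only.

```python
import math
from typing import Sequence, Tuple


def weibull_survival(x: float, beta: float, eta: float) -> float:
    """Two-parameter Weibull survival (reliability) function."""
    return math.exp(-((x / eta) ** beta))


def fit_weibull(xs: Sequence[float], survival: Sequence[float]) -> Tuple[float, float]:
    """Estimate (beta, eta) from points with 0 < survival < 1 and x > 0."""
    # Linearize: ln(-ln R) = beta * ln x - beta * ln eta
    u = [math.log(x) for x in xs]
    w = [math.log(-math.log(r)) for r in survival]
    n = len(u)
    mean_u = sum(u) / n
    mean_w = sum(w) / n
    # Ordinary least-squares slope and intercept of w against u.
    slope = sum((a - mean_u) * (b - mean_w) for a, b in zip(u, w)) / sum(
        (a - mean_u) ** 2 for a in u
    )
    intercept = mean_w - slope * mean_u
    beta = slope
    eta = math.exp(-intercept / beta)
    return beta, eta


# Synthetic, noise-free data generated from known parameters.
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
rs = [weibull_survival(x, beta=1.5, eta=5.0) for x in xs]
beta_hat, eta_hat = fit_weibull(xs, rs)
```

With real fold-occurrence counts the points would be noisy, and a maximum likelihood fit would usually be preferred over this linearized regression.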

  12. Genome-scale detection of hypermethylated CpG islands in circulating cell-free DNA of hepatocellular carcinoma patients

    OpenAIRE

    Wen, Lu; Li, Jingyi; Guo, Huahu; Liu, Xiaomeng; Zheng, Shengmin; Zhang, Dafang; Zhu, Weihua; Qu, Jianhui; Guo, Limin; Du, Dexiao; Jin, Xiao; Zhang, Yuhao; Gao, Yun; Jie SHEN; Ge, Hao

    2015-01-01

    Despite advances in DNA methylome analyses of cells and tissues, current techniques for genome-scale profiling of DNA methylation in circulating cell-free DNA (ccfDNA) remain limited. Here we describe a methylated CpG tandems amplification and sequencing (MCTA-Seq) method that can detect thousands of hypermethylated CpG islands simultaneously in ccfDNA. This highly sensitive technique can work with genomic DNA as little as 7.5 pg, which is equivalent to 2.5 copies of the haploid genome. We ha...

  13. Genome-scale reconstruction and analysis of the metabolic network in the hyperthermophilic archaeon Sulfolobus solfataricus.

    Directory of Open Access Journals (Sweden)

    Thomas Ulas

    Full Text Available We describe the reconstruction of a genome-scale metabolic model of the crenarchaeon Sulfolobus solfataricus, a hyperthermoacidophilic microorganism. It grows in terrestrial volcanic hot springs with growth occurring at pH 2-4 (optimum 3.5) and a temperature of 75-80°C (optimum 80°C). The genome of Sulfolobus solfataricus P2 contains 2,992,245 bp on a single circular chromosome and encodes 2,977 proteins and a number of RNAs. The network comprises 718 metabolic and 58 transport/exchange reactions and 705 unique metabolites, based on the annotated genome and available biochemical data. Using the model in conjunction with constraint-based methods, we simulated the metabolic fluxes induced by different environmental and genetic conditions. The predictions were compared to experimental measurements and phenotypes of S. solfataricus. Furthermore, the performance of the network for 35 different carbon sources known for S. solfataricus from the literature was simulated. Comparing the growth on different carbon sources revealed that glycerol is the carbon source with the highest biomass flux per imported carbon atom (75% higher than glucose). Experimental data was also used to fit the model to phenotypic observations. In addition to the commonly known heterotrophic growth of S. solfataricus, the crenarchaeon is also able to grow autotrophically using the hydroxypropionate-hydroxybutyrate cycle for bicarbonate fixation. We integrated this pathway into our model and compared bicarbonate fixation with growth on glucose as sole carbon source. Finally, we tested the robustness of the metabolism with respect to gene deletions using the method of Minimization of Metabolic Adjustment (MOMA), which predicted that 18% of all possible single gene deletions would be lethal for the organism.
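
The MOMA step can be stated compactly. In the standard formulation (the notation is assumed here and is not taken from the abstract), the mutant flux vector $v$ is the feasible point closest to the wild-type flux distribution $v^{\mathrm{wt}}$, with the set $D$ of reactions that depend on the deleted gene forced to zero. A deletion is then called lethal when the resulting biomass flux falls below a chosen cutoff.

```latex
\begin{aligned}
\min_{v}\quad & \lVert v - v^{\mathrm{wt}} \rVert_2^{2} \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}, \\
& v_j = 0 \quad \text{for all } j \in D.
\end{aligned}
```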

  14. Genome-scale metabolic analysis of Clostridium thermocellum for bioethanol production

    Directory of Open Access Journals (Sweden)

    Brooks J Paul

    2010-03-01

    Full Text Available Abstract Background: Microorganisms possess diverse metabolic capabilities that can potentially be leveraged for efficient production of biofuels. Clostridium thermocellum (ATCC 27405) is a thermophilic anaerobe that is both cellulolytic and ethanologenic, meaning that it can directly use the plant sugar, cellulose, and biochemically convert it to ethanol. A major challenge in using microorganisms for chemical production is the need to modify the organism to increase production efficiency. The process of properly engineering an organism is typically arduous. Results: Here we present a genome-scale model of C. thermocellum metabolism, iSR432, for the purpose of establishing a computational tool to study the metabolic network of C. thermocellum and facilitate efforts to engineer C. thermocellum for biofuel production. The model consists of 577 reactions involving 525 intracellular metabolites, 432 genes, and a proteomic-based representation of a cellulosome. The process of constructing this metabolic model led to suggested annotation refinements for 27 genes and identification of areas of metabolism requiring further study. The accuracy of the iSR432 model was tested using experimental growth and by-product secretion data for growth on cellobiose and fructose. Analysis using this model captures the relationship between the reduction-oxidation state of the cell and ethanol secretion and allowed for prediction of gene deletions and environmental conditions that would increase ethanol production. Conclusions: By incorporating genomic sequence data, network topology, and experimental measurements of enzyme activities and metabolite fluxes, we have generated a model that is reasonably accurate at predicting the cellular phenotype of C. thermocellum and establish a strong foundation for rational strain design. In addition, we are able to draw some important conclusions regarding the underlying metabolic mechanisms for observed behaviors of C. thermocellum.

  15. Genome-scale reconstruction of the metabolic network in Yersinia pestis, strain 91001

    Energy Technology Data Exchange (ETDEWEB)

    Navid, A; Almaas, E

    2009-01-13

    The gram-negative bacterium Yersinia pestis, the aetiological agent of bubonic plague, is one of the deadliest pathogens known to man. Despite its historical reputation, plague is a modern disease which annually afflicts thousands of people. Public safety considerations greatly limit clinical experimentation on this organism and thus development of theoretical tools to analyze the capabilities of this pathogen is of utmost importance. Here, we report the first genome-scale metabolic model of Yersinia pestis biovar Mediaevalis based both on its recently annotated genome, and physiological and biochemical data from literature. Our model demonstrates excellent agreement with Y. pestis known metabolic needs and capabilities. Since Y. pestis is a meiotrophic organism, we have developed CryptFind, a systematic approach to identify all candidate cryptic genes responsible for known and theoretical meiotrophic phenomena. In addition to uncovering every known cryptic gene for Y. pestis, our analysis of the rhamnose fermentation pathway suggests that betB is the responsible cryptic gene. Despite all of our medical advances, we still do not have a vaccine for bubonic plague. Recent discoveries of antibiotic resistant strains of Yersinia pestis coupled with the threat of plague being used as a bioterrorism weapon compel us to develop new tools for studying the physiology of this deadly pathogen. Using our theoretical model, we can study the cell's phenotypic behavior under different circumstances and identify metabolic weaknesses which may be harnessed for the development of therapeutics. Additionally, the automatic identification of cryptic genes expands the usage of genomic data for pharmaceutical purposes.

  16. Genomes on ice.

    Science.gov (United States)

    Parkhill, Julian

    2016-03-01

    This month's Genome Watch discusses the analysis of a Helicobacter pylori genome from the preserved Copper-Age mummy known as the Iceman and how ancient genomes shed light on the history of bacterial pathogens. PMID:26853114

  17. Analysis of genetic variation and potential applications in genome-scale metabolic modeling

    Directory of Open Access Journals (Sweden)

    João Gonçalo Rocha Cardoso

    2015-02-01

    Full Text Available Genetic variation is the motor of evolution and allows organisms to overcome the environmental challenges they encounter. It can be both beneficial and harmful in the process of engineering cell factories for the production of proteins and chemicals. Throughout the history of biotechnology, there have been efforts to exploit genetic variation in our favor to create strains with favorable phenotypes. Genetic variation can either be present in natural populations or it can be artificially created by mutagenesis and selection or adaptive laboratory evolution. On the other hand, unintended genetic variation during a long term production process may lead to significant economic losses and it is important to understand how to control this type of variation. With the emergence of next-generation sequencing technologies, genetic variation in microbial strains can now be determined on an unprecedented scale and resolution by re-sequencing thousands of strains systematically. In this article, we review challenges in the integration and analysis of large-scale re-sequencing data, present an extensive overview of bioinformatics methods for predicting the effects of genetic variants on protein function, and discuss approaches for interfacing existing bioinformatics approaches with genome-scale models of cellular processes in order to predict effects of sequence variation on cellular phenotypes.

  18. Contig-Layout-Authenticator (CLA): A Combinatorial Approach to Ordering and Scaffolding of Bacterial Contigs for Comparative Genomics and Molecular Epidemiology.

    Science.gov (United States)

    Shaik, Sabiha; Kumar, Narender; Lankapalli, Aditya K; Tiwari, Sumeet K; Baddam, Ramani; Ahmed, Niyaz

    2016-01-01

    A wide variety of genome sequencing platforms have emerged in the recent past. High-throughput platforms like Illumina and 454 are essentially adaptations of the shotgun approach generating millions of fragmented single or paired sequencing reads. To reconstruct whole genomes, the reads have to be assembled into contigs, which often require further downstream processing. The contigs can be directly ordered according to a reference, scaffolded based on paired read information, or assembled using a combination of the two approaches. While the reference-based approach appears to mask strain-specific information, scaffolding based on paired-end information suffers when repetitive elements longer than the size of the sequencing reads are present in the genome. Sequencing technologies that produce long reads can solve the problems associated with repetitive elements but are not necessarily easily available to researchers. The most common high-throughput technology currently used is the Illumina short read platform. To improve upon the shortcomings associated with the construction of draft genomes with Illumina paired-end sequencing, we developed Contig-Layout-Authenticator (CLA). The CLA pipeline can scaffold reference-sorted contigs based on paired reads, resulting in better assembled genomes. Moreover, CLA also hints at probable misassemblies and contaminations, for the users to cross-check before constructing the consensus draft. The CLA pipeline was designed and trained extensively on various bacterial genome datasets for the ordering and scaffolding of large repetitive contigs. The tool has been validated and compared favorably with other widely-used scaffolding and ordering tools using both simulated and real sequence datasets. CLA is a user friendly tool that requires a single command line input to generate ordered scaffolds.

  19. Conditional random fields for fast, large-scale genome-wide association studies.

    Directory of Open Access Journals (Sweden)

    Jim C Huang

    Full Text Available Understanding the role of genetic variation in human diseases remains an important problem to be solved in genomics. An important component of such variation consists of variations at single sites in DNA, or single nucleotide polymorphisms (SNPs). Typically, the problem of associating particular SNPs to phenotypes has been confounded by hidden factors such as the presence of population structure, family structure or cryptic relatedness in the sample of individuals being analyzed. Such confounding factors lead to a large number of spurious associations and missed associations. Various statistical methods have been proposed to account for such confounding factors such as linear mixed-effect models (LMMs) or methods that adjust data based on a principal components analysis (PCA), but these methods either suffer from low power or cease to be tractable for larger numbers of individuals in the sample. Here we present a statistical model for conducting genome-wide association studies (GWAS) that accounts for such confounding factors. Our method's runtime scales quadratically with the number of individuals being studied, with only a modest loss in statistical power as compared to LMM-based and PCA-based methods when testing on synthetic data that was generated from a generalized LMM. Applying our method to both real and synthetic human genotype/phenotype data, we demonstrate the ability of our model to correct for confounding factors while requiring significantly less runtime relative to LMMs. We have implemented methods for fitting these models, which are available at http://www.microsoft.com/science.

  20. Genome-scale reconstruction of metabolic network for a halophilic extremophile, Chromohalobacter salexigens DSM 3043

    Directory of Open Access Journals (Sweden)

    Oner Ebru

    2011-01-01

    Full Text Available Abstract Background: Chromohalobacter salexigens (formerly Halomonas elongata DSM 3043) is a halophilic extremophile with a very broad salinity range and is used as a model organism to elucidate prokaryotic osmoadaptation due to its strong euryhaline phenotype. Results: C. salexigens DSM 3043's metabolism was reconstructed based on genomic, biochemical and physiological information via a non-automated but iterative process. This manually-curated reconstruction accounts for 584 genes, 1386 reactions, and 1411 metabolites. By using flux balance analysis, the model was extensively validated against literature data on the C. salexigens phenotypic features, the transport and use of different substrates for growth as well as against experimental observations on the uptake and accumulation of industrially important organic osmolytes, ectoine, betaine, and its precursor choline, which play important roles in the adaptive response to osmotic stress. Conclusions: This work presents the first comprehensive genome-scale metabolic model of a halophilic bacterium. Being a useful guide for identification and filling of knowledge gaps, the reconstructed metabolic network iOA584 will accelerate the research on halophilic bacteria towards application of systems biology approaches and design of metabolic engineering strategies.

  1. Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform

    CERN Document Server

    Cox, Anthony J; Jakobi, Tobias; Rosone, Giovanna

    2012-01-01

    Motivation: The Burrows-Wheeler transform (BWT) is the foundation of many algorithms for compression and indexing of text data, but the cost of computing the BWT of very large string collections has prevented these techniques from being widely applied to the large sets of sequences often encountered as the outcome of DNA sequencing experiments. In previous work, we presented a novel algorithm that allows the BWT of human genome scale data to be computed on very moderate hardware, thus enabling us to investigate the BWT as a tool for the compression of such datasets. Results: We first used simulated reads to explore the relationship between the level of compression and the error rate, the length of the reads and the level of sampling of the underlying genome and compare choices of second-stage compression algorithm. We demonstrate that compression may be greatly improved by a particular reordering of the sequences in the collection and give a novel 'implicit sorting' strategy that enables these benefits to be re...
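
The transform itself is easy to demonstrate on a single short string. The sketch below is the textbook construction by sorted rotations, which needs memory quadratic in the input length and is therefore not the authors' large-collection algorithm or their implicit sorting strategy. It assumes a sentinel character `$` that does not occur in the text and that sorts before every other character; all function names are illustrative.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via sorted rotations (textbook version)."""
    if sentinel in text:
        raise ValueError("the sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending and re-sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(c + row for c, row in zip(last_column, table))
    # The original string is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


def run_count(s: str) -> int:
    """Number of maximal runs of equal characters."""
    return sum(1 for i, c in enumerate(s) if i == 0 or c != s[i - 1])


transformed = bwt("banana")
restored = inverse_bwt(transformed)
```

The transform tends to group equal characters into runs (here `banana$` has 7 runs and its transform has 5), which is what makes a second-stage compressor such as run-length or entropy coding effective.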

  2. Large-scale metabolome analysis and quantitative integration with genomics and proteomics data in Mycoplasma pneumoniae.

    Science.gov (United States)

    Maier, Tobias; Marcos, Josep; Wodke, Judith A H; Paetzold, Bernhard; Liebeke, Manuel; Gutiérrez-Gallego, Ricardo; Serrano, Luis

    2013-07-01

    Systems metabolomics, the identification and quantification of cellular metabolites and their integration with genomics and proteomics data, promises valuable functional insights into cellular biology. However, technical constraints, sample complexity issues and the lack of suitable complementary quantitative data sets prevented accomplishing such studies in the past. Here, we present an integrative metabolomics study of the genome-reduced bacterium Mycoplasma pneumoniae. We experimentally analysed its metabolome using a cross-platform approach. We explain intracellular metabolite homeostasis by quantitatively integrating our results with the cellular inventory of proteins, DNA and other macromolecules, as well as with available building blocks from the growth medium. We calculated in vivo catalytic parameters of glycolytic enzymes, making use of measured reaction velocities, as well as enzyme and metabolite pool sizes. A quantitative, inter-species comparison of absolute and relative metabolite abundances indicated that metabolic pathways are regulated as functional units, thereby simplifying adaptive responses. Our analysis demonstrates the potential for new scientific insight by integrating different types of large-scale experimental data from a single biological source.

  3. The Bacterial Communities of Full-Scale Biologically Active, Granular Activated Carbon Filters Are Stable and Diverse and Potentially Contain Novel Ammonia-Oxidizing Microorganisms

    OpenAIRE

    LaPara, Timothy M.; Hope Wilkinson, Katheryn; Strait, Jacqueline M.; Hozalski, Raymond M.; Sadowksy, Michael J.; Hamilton, Matthew J

    2015-01-01

    The bacterial community composition of the full-scale biologically active, granular activated carbon (BAC) filters operated at the St. Paul Regional Water Services (SPRWS) was investigated using Illumina MiSeq analysis of PCR-amplified 16S rRNA gene fragments. These bacterial communities were consistently diverse (Shannon index, >4.4; richness estimates, >1,500 unique operational taxonomic units [OTUs]) throughout the duration of the 12-month study period. In addition, only modest shifts in t...
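
    For readers unfamiliar with the diversity metric quoted above, here is a minimal sketch of the Shannon index computed from OTU counts. It assumes the natural logarithm (the abstract does not state the base), and the function name `shannon_index` and the example counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with non-zero counts.

    Assumption: natural logarithm; other bases only rescale the value.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# A perfectly even community of 1,500 OTUs reaches the maximum, ln(1500).
even_community = [10] * 1500
print(round(shannon_index(even_community), 3))
```

    Uneven communities with the same richness score lower, so an index above 4.4 together with more than 1,500 OTUs indicates many taxa without extreme dominance by a few.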

  4. Genome-scale estimate of the metabolic turnover of E. coli from the energy balance analysis

    Science.gov (United States)

    De Martino, D.

    2016-02-01

    In this article the notion of metabolic turnover is revisited in the light of recent results in out-of-equilibrium thermodynamics. By means of Monte Carlo methods we perform an exact sampling of the enzymatic fluxes in a genome-scale metabolic network of E. coli under stationary growth conditions, from which we infer the metabolite turnover times. However, the latter are inferred from net fluxes, and we argue that this approximation is not valid for enzymes working near thermodynamic equilibrium. We recalculate turnover times from total fluxes by performing an energy balance analysis of the network and resorting to the fluctuation theorem. We find in many cases values one order of magnitude lower, implying a faster picture of intermediate metabolism.
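
    The gap between net and gross fluxes near equilibrium can be made concrete with a small sketch. It assumes the standard flux-force relation, forward/backward = exp(-ΔG/RT), as one reading of the fluctuation-theorem argument, and takes the turnover time to be pool size divided by flux. The function names, the temperature and the example numbers are hypothetical, not values from the article.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def one_way_fluxes(net_flux, delta_g, temperature=310.15):
    """Split a positive net flux into forward and backward one-way fluxes.

    Assumes forward/backward = exp(-delta_g / RT) with delta_g < 0 (kJ/mol).
    """
    if net_flux <= 0 or delta_g >= 0:
        raise ValueError("expects net_flux > 0 and delta_g < 0")
    ratio = math.exp(delta_g / (R_KJ * temperature))  # backward / forward
    forward = net_flux / (1.0 - ratio)
    return forward, forward * ratio


def turnover_time(pool_size, flux):
    """Turnover time of a metabolite pool drained by the given flux."""
    return pool_size / flux


# A reaction close to equilibrium (delta_g = -0.5 kJ/mol) with unit net flux.
forward, backward = one_way_fluxes(net_flux=1.0, delta_g=-0.5)
print(turnover_time(2.0, 1.0), turnover_time(2.0, forward))
```

    As ΔG approaches zero the one-way fluxes grow much larger than the net flux, so turnover times computed from net fluxes alone are overestimates for near-equilibrium enzymes.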

  5. Integration of gene expression data into genome-scale metabolic models

    DEFF Research Database (Denmark)

    Åkesson, M.; Förster, Jochen; Nielsen, Jens

    2004-01-01

    A framework for integration of transcriptome data into stoichiometric metabolic models to obtain improved flux predictions is presented. The key idea is to exploit the regulatory information in the expression data to give additional constraints on the metabolic fluxes in the model. Measurements of gene expression from chemostat and batch cultures of Saccharomyces cerevisiae were combined with a recently developed genome-scale model, and the computed metabolic flux distributions were compared to experimental values from carbon labeling experiments and metabolic network analysis. The integration of expression data resulted in improved predictions of metabolic behavior in batch cultures, enabling quantitative predictions of exchange fluxes as well as qualitative estimations of changes in intracellular fluxes. A critical discussion of correlation between gene expression and metabolic fluxes is given.

  6. Moving image analysis to the cloud: A case study with a genome-scale tomographic study

    Science.gov (United States)

    Mader, Kevin; Stampanoni, Marco

    2016-01-01

    Over the last decade, the time required to measure a terabyte of microscopic imaging data has gone from years to minutes. This shift has moved many of the challenges away from experimental design and measurement to scalable storage, organization, and analysis. As many scientists and scientific institutions lack training and competencies in these areas, major bottlenecks have arisen and led to substantial delays and gaps between measurement, understanding, and dissemination. We present in this paper a framework for analyzing large 3D datasets using cloud-based computational and storage resources. We demonstrate its applicability by showing the setup and costs associated with the analysis of a genome-scale study of bone microstructure. We then evaluate the relative advantages and disadvantages associated with local versus cloud infrastructures.

  7. Moving image analysis to the cloud: A case study with a genome-scale tomographic study

    International Nuclear Information System (INIS)

    Over the last decade, the time required to measure a terabyte of microscopic imaging data has gone from years to minutes. This shift has moved many of the challenges away from experimental design and measurement to scalable storage, organization, and analysis. As many scientists and scientific institutions lack training and competencies in these areas, major bottlenecks have arisen and led to substantial delays and gaps between measurement, understanding, and dissemination. We present in this paper a framework for analyzing large 3D datasets using cloud-based computational and storage resources. We demonstrate its applicability by showing the setup and costs associated with the analysis of a genome-scale study of bone microstructure. We then evaluate the relative advantages and disadvantages associated with local versus cloud infrastructures

  8. Genome-scale modeling of the protein secretory machinery in yeast.

    Science.gov (United States)

    Feizi, Amir; Österlund, Tobias; Petranovic, Dina; Bordel, Sergio; Nielsen, Jens

    2013-01-01

    The protein secretory machinery in Eukarya is involved in post-translational modifications (PTMs) and sorting of the secretory and many transmembrane proteins. While the secretory machinery has been well-studied using classic reductionist approaches, a holistic view of its complex nature is lacking. Here, we present the first genome-scale model for the yeast secretory machinery which captures the knowledge generated through more than 50 years of research. The model is based on the concept of a Protein Specific Information Matrix (PSIM: characterized by seven PTM features). An algorithm was developed which mimics the secretory machinery and assigns each secretory protein to a particular secretory class that determines the set of PTMs and transport steps specific to each protein. Protein abundances were integrated with the model in order to gain a system-level estimation of the metabolic demands associated with the processing of each specific protein as well as a quantitative estimation of the activity of each component of the secretory machinery.

  9. Genome-scale modeling of the protein secretory machinery in yeast

    DEFF Research Database (Denmark)

    Feizi, Amir; Österlund, Tobias; Petranovic, Dina;

    2013-01-01

    The protein secretory machinery in Eukarya is involved in post-translational modifications (PTMs) and sorting of the secretory and many transmembrane proteins. While the secretory machinery has been well-studied using classic reductionist approaches, a holistic view of its complex nature is lacking. Here, we present the first genome-scale model for the yeast secretory machinery which captures the knowledge generated through more than 50 years of research. The model is based on the concept of a Protein Specific Information Matrix (PSIM: characterized by seven PTM features). An algorithm was developed which mimics the secretory machinery and assigns each secretory protein to a particular secretory class that determines the set of PTMs and transport steps specific to each protein. Protein abundances were integrated with the model in order to gain a system-level estimation of the metabolic demands...

  10. Moving image analysis to the cloud: A case study with a genome-scale tomographic study

    Energy Technology Data Exchange (ETDEWEB)

    Mader, Kevin [4Quant Ltd., Switzerland & Institute for Biomedical Engineering at University and ETH Zurich (Switzerland); Stampanoni, Marco [Institute for Biomedical Engineering at University and ETH Zurich, Switzerland & Swiss Light Source at Paul Scherrer Institut, Villigen (Switzerland)

    2016-01-28

    Over the last decade, the time required to measure a terabyte of microscopic imaging data has gone from years to minutes. This shift has moved many of the challenges away from experimental design and measurement to scalable storage, organization, and analysis. As many scientists and scientific institutions lack training and competencies in these areas, major bottlenecks have arisen and led to substantial delays and gaps between measurement, understanding, and dissemination. We present in this paper a framework for analyzing large 3D datasets using cloud-based computational and storage resources. We demonstrate its applicability by showing the setup and costs associated with the analysis of a genome-scale study of bone microstructure. We then evaluate the relative advantages and disadvantages associated with local versus cloud infrastructures.

  11. Genome-scale consequences of cofactor balancing in engineered pentose utilization pathways in Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Amit Ghosh

    Biofuels derived from lignocellulosic biomass offer promising alternative renewable energy sources for transportation fuels. Significant effort has been made to engineer Saccharomyces cerevisiae to efficiently ferment pentose sugars such as D-xylose and L-arabinose into biofuels such as ethanol through heterologous expression of the fungal D-xylose and L-arabinose pathways. However, one of the major bottlenecks in these fungal pathways is that the cofactors are not balanced, which contributes to inefficient utilization of pentose sugars. We utilized a genome-scale model of S. cerevisiae to predict the maximal achievable growth rate for cofactor balanced and imbalanced D-xylose and L-arabinose utilization pathways. Dynamic flux balance analysis (DFBA) was used to simulate batch fermentation of glucose, D-xylose, and L-arabinose. The dynamic models and experimental results are in good agreement for the wild type and for the engineered D-xylose utilization pathway. Cofactor balancing the engineered D-xylose and L-arabinose utilization pathways simulated an increase in ethanol batch production of 24.7% while simultaneously reducing the predicted substrate utilization time by 70%. Furthermore, the effects of cofactor balancing the engineered pentose utilization pathways were evaluated throughout the genome-scale metabolic network. This work not only provides new insights into the global network effects of cofactor balancing but also provides useful guidelines for engineering a recombinant yeast strain with cofactor balanced engineered pathways that efficiently co-utilizes pentose and hexose sugars for biofuels production. Experimental switching of cofactor usage in enzymes has been demonstrated, but is a time-consuming effort. Therefore, systems biology models that can predict the likely outcome of such strain engineering efforts are highly useful for deciding which efforts are likely to be worth the significant time investment.

  12. Genomic Insights into Aquimarina sp. Strain EL33, a Bacterial Symbiont of the Gorgonian Coral Eunicella labiata

    Science.gov (United States)

    Keller-Costa, Tina; Silva, Rúben; Lago-Lestón, Asunción

    2016-01-01

    To address the metabolic potential of symbiotic Aquimarina spp., we report here the genome sequence of Aquimarina sp. strain EL33, a bacterium isolated from the gorgonian coral Eunicella labiata. This first-described (to our knowledge) animal-associated Aquimarina genome possesses a sophisticated repertoire of genes involved in drug/antibiotic resistance and biosynthesis. PMID:27540075

  13. Genomic Insights into Aquimarina sp. Strain EL33, a Bacterial Symbiont of the Gorgonian Coral Eunicella labiata.

    Science.gov (United States)

    Keller-Costa, Tina; Silva, Rúben; Lago-Lestón, Asunción; Costa, Rodrigo

    2016-01-01

    To address the metabolic potential of symbiotic Aquimarina spp., we report here the genome sequence of Aquimarina sp. strain EL33, a bacterium isolated from the gorgonian coral Eunicella labiata. This first-described (to our knowledge) animal-associated Aquimarina genome possesses a sophisticated repertoire of genes involved in drug/antibiotic resistance and biosynthesis. PMID:27540075

  14. Evaluating the efficacy of the new Ion PGM Hi-Q Sequencing Kit applied to bacterial genomes.

    Science.gov (United States)

    Pereira, Felipe L; Soares, Siomar C; Dorella, Fernanda A; Leal, Carlos A G; Figueiredo, Henrique C P

    2016-05-01

    Benchtop NGS platforms are constantly evolving to follow new advances in genomics. Thus, the manufacturers are making improvements, such as the recent Ion PGM Hi-Q chemistry. We evaluate the efficacy of this new Hi-Q approach by comparing it with the former Ion PGM kit and the Illumina MiSeq Nextera 3rd version. The Hi-Q chemistry showed improvement in read mapping, with 49 errors per 10 kbp mapped; in contrast, the former kit had 89 errors. Additionally, there was a reduction of 80% in erroneous variant detection with the Torrent Variant Caller. Also, an enhancement was observed in de novo assembly, with a more confident result in whole-genome MLST, with up to 96% of the alleles assembled correctly for both tested microbial genomes. All of these advantages result in a final genome sequence closer to the performance obtained with MiSeq and will help make comparative genomic analysis a reliable task. PMID:27033417
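
    The mapping-error figures quoted above can be turned into per-base rates and a relative reduction with a few lines of arithmetic. The sketch below uses only the two numbers stated in the abstract (49 and 89 errors per 10 kbp); the helper names are hypothetical.

```python
def errors_per_base(errors, bases):
    """Per-base error rate from an error count over a number of mapped bases."""
    return errors / bases


def relative_reduction(new, old):
    """Fractional reduction of `new` relative to `old` (0.25 means 25% fewer)."""
    if old <= 0:
        raise ValueError("old must be positive")
    return 1.0 - new / old


hiq_rate = errors_per_base(49, 10_000)     # Hi-Q chemistry
former_rate = errors_per_base(89, 10_000)  # former Ion PGM kit
print(f"{hiq_rate:.2%} vs {former_rate:.2%}, "
      f"reduction {relative_reduction(hiq_rate, former_rate):.1%}")
```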

  15. Bacterial Chemotaxis Toward A NAPL Source Within A Pore-Scale Model Subject to A Range of Groundwater Flow Velocities

    Science.gov (United States)

    Wang, X.; Ford, R. M.

    2010-12-01

    Organic solvents such as toluene are the most widely distributed pollutants in groundwater. Biodegradation of these industrial pollutants requires that microorganisms in the aqueous phase are brought in contact with sources of contamination, which may be dispersed as pore-size organic-phase droplets within the saturated soil matrix. Chemotaxis toward chemical pollutants provides a mechanism for bacteria to migrate to locations of high contamination, which may not normally be accessible to bacteria carried along by groundwater flow, and thus it may improve the efficiency of bioremediation. A microfluidic device was designed to mimic the dissolution of an organic-phase contaminant from a single pore into a larger macropore representing a preferred pathway for microorganisms that are carried along by groundwater flow. The glass windows of the µ-chip allowed image analysis of bacterial distributions within the vicinity of the organic contaminant. Concentrations of chemotactic bacteria P. putida F1 near the organic/aqueous interface were 25% greater than those of a nonchemotactic mutant in the vicinity of toluene for a fluid velocity of 0.5 m/d. For E. coli responding to phenol, the bacterial concentrations were 60% greater than the controls, also at a velocity of 0.5 m/d. Velocities in the macropore were varied over a range that is typical of groundwater velocities from 0.5 to 10 m/d. The accumulation of chemotactic bacteria near the NAPL (nonaqueous phase liquid) chemoattractant source decreased as the fluid velocity increased. At the higher velocities, accumulation of chemotactic bacteria was comparable to the non-chemotactic control experiments. Computer-based simulation using finite element analysis software (COMSOL) was also performed to understand the effects of various model parameters on bacterial chemotaxis to NAPL. There was good agreement between the simulations (generated using reasonable values of the model parameters) and the experimental data for P

  16. Evaluation of effectiveness of bacterial product which can degrade pesticide-dimethoate on the scale of true practice test

    International Nuclear Information System (INIS)

    Dimethoate, an organophosphate pesticide, has been widely used in Dalat, Lamdong. It is highly toxic to birds, humans and other mammals. Its widespread use has caused environmental concern on the basis of the frequent detection of dimethoate in soil and water. Microorganisms are key agents in the degradation of waste, oil and a vast array of organic pesticides in terrestrial and aquatic ecosystems. In a previous study, bacterial products that can degrade dimethoate were produced. The present study was designed to evaluate the effectiveness of such a bacterial product in a field-scale practical test. The bacterial product was applied to soil planted with cauliflower and Chinese cabbage that had been sprayed with organophosphorus pesticides (dimethoate and chlorpyrifos), and the pesticide residues in soil, water and vegetables were as follows. The residues of dimethoate and chlorpyrifos in soil planted with cauliflower and with Chinese cabbage differed. They were concentrated mostly in the surface litter and topsoil layers, at depths from 0 to 20 cm. From 20 cm to 100 cm depth, the pesticide residues were negligible. Chlorpyrifos residues in soil were small as well. Dimethoate residues in soil planted with cauliflower were higher than in soil planted with Chinese cabbage. On the basis of the environmental criteria of the Ministry for Science, Technology and Environment (6/95), dimethoate residues in soil planted with cauliflower exceeded the maximum limit. When the bacterial product was applied to the soil, pesticide residues in the soil decreased. The results also indicated that chlorpyrifos residues in water (collected by day at depths of 75 cm and 100 cm) were small. Dimethoate residues in water were also small, and were higher in water obtained from the cauliflower bed than from the Chinese cabbage bed. When the bacterial product was applied to the soil, pesticide residues in water decreased. On the basis of the environmental criteria of

  17. Fast principal component analysis of large-scale genome-wide data.

    Directory of Open Access Journals (Sweden)

    Gad Abraham

    Principal component analysis (PCA) is routinely used to analyze genome-wide single-nucleotide polymorphism (SNP) data, for detecting population structure and potential outliers. However, the size of SNP datasets has increased immensely in recent years and PCA of large datasets has become a time-consuming task. We have developed flashpca, a highly efficient PCA implementation based on randomized algorithms, which delivers identical accuracy in extracting the top principal components compared with existing tools, in substantially less time. We demonstrate the utility of flashpca on both HapMap3 and on a large Immunochip dataset. For the latter, flashpca performed PCA of 15,000 individuals up to 125 times faster than existing tools, with identical results, and PCA of 150,000 individuals using flashpca completed in 4 hours. The increasing size of SNP datasets will make tools such as flashpca essential as traditional approaches will not adequately scale. This approach will also help to scale other applications that leverage PCA or eigen-decomposition to substantially larger datasets.

  18. Scaling laws governing stochastic growth and division of single bacterial cells

    CERN Document Server

    Iyer-Biswas, Srividya; Henry, Jonathan T; Lo, Klevin; Burov, Stanislav; Lin, Yihan; Crooks, Gavin E; Crosson, Sean; Dinner, Aaron R; Scherer, Norbert F

    2014-01-01

    Uncovering the quantitative laws that govern the growth and division of single cells remains a major challenge. Using a unique combination of technologies that yields unprecedented statistical precision, we find that the sizes of individual Caulobacter crescentus cells increase exponentially in time. We also establish that they divide upon reaching a critical multiple (≈1.8) of their initial sizes, rather than an absolute size. We show that when the temperature is varied, the growth and division timescales scale proportionally with each other over the physiological temperature range. Strikingly, the cell-size and division-time distributions can both be rescaled by their mean values such that the condition-specific distributions collapse to universal curves. We account for these observations with a minimal stochastic model that is based on an autocatalytic cycle. It predicts the scalings, as well as specific functional forms for the universal curves. Our experimental and theoretical analysis reveals a ...
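
    The two quantitative statements in this abstract (exponential growth up to a fixed multiple of the initial size, and collapse of distributions after rescaling by the mean) can be illustrated with a short sketch. It assumes deterministic growth s(t) = s0·exp(κt), so the division time is ln(1.8)/κ. The function names and the example division times are hypothetical, not data from the study.

```python
import math
import statistics


def division_time(growth_rate, critical_multiple=1.8):
    """Time for an exponentially growing cell to reach critical_multiple * s0."""
    if growth_rate <= 0:
        raise ValueError("growth_rate must be positive")
    return math.log(critical_multiple) / growth_rate


def rescale_by_mean(samples):
    """Divide every sample by the sample mean, giving a distribution with mean 1."""
    mean = statistics.fmean(samples)
    return [x / mean for x in samples]


# Hypothetical division times (minutes) at two temperatures; the second set is
# the first stretched by a constant factor, as proportional timescales imply.
warm = [50.0, 60.0, 70.0]
cool = [100.0, 120.0, 140.0]
print(rescale_by_mean(warm))
print(rescale_by_mean(cool))
```

    Under this picture, halving the growth rate doubles the division time, and the rescaled lists coincide, which is the sense in which condition-specific distributions collapse onto one curve.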

  19. Comparative Genomic and Phenotypic Characterization of Pathogenic and Non-Pathogenic Strains of Xanthomonas arboricola Reveals Insights into the Infection Process of Bacterial Spot Disease of Stone Fruits

    Science.gov (United States)

    Garita-Cambronero, Jerson; Palacio-Bielsa, Ana; López, María M.

    2016-01-01

    Xanthomonas arboricola pv. pruni is the causal agent of bacterial spot disease of stone fruits, a quarantinable pathogen in several areas worldwide, including the European Union. In order to develop efficient control methods for this disease, it is necessary to improve the understanding of the key determinants associated with host restriction, colonization and the development of pathogenesis. After an initial characterization, by multilocus sequence analysis, of 15 strains of X. arboricola isolated from Prunus, one strain did not group into the pathovar pruni or into other pathovars of this species and therefore it was identified and defined as an X. arboricola pv. pruni look-alike. This non-pathogenic strain and two typical strains of X. arboricola pv. pruni were selected for a comparative whole-genome and phenotype analysis of features associated with the pathogenesis process in Xanthomonas. Comparative analysis among these bacterial strains isolated from Prunus spp. and the inclusion of 15 publicly available genome sequences from other pathogenic and non-pathogenic strains of X. arboricola revealed variations in the phenotype associated with variations in the profiles of TonB-dependent transporters, sensors of the two-component regulatory system, methyl accepting chemotaxis proteins, components of the flagella and the type IV pilus, as well as in the repertoire of cell-wall degrading enzymes and the components of the type III secretion system and related effectors. These variations provide a global overview of those mechanisms that could be associated with the development of bacterial spot disease. Additionally, they point to some features that might influence the host specificity and the variable virulence observed in X. arboricola. PMID:27571391

  20. Niche differentiation of bacterial communities at a millimeter scale in Shark Bay microbial mats

    OpenAIRE

    Hon Lun Wong; Daniela-Lee Smith; Pieter T. Visscher; Burns, Brendan P.

    2015-01-01

    Modern microbial mats can provide key insights into early Earth ecosystems, and Shark Bay, Australia, holds one of the best examples of these systems. Identifying the spatial distribution of microorganisms with mat depth facilitates a greater understanding of specific niches and potentially novel microbial interactions. High throughput sequencing coupled with elemental analyses and biogeochemical measurements of two distinct mat types (smooth and pustular) at a millimeter scale were undertake...

  1. A Comparison of the Molecular Organization of Genomic Regions Associated with Resistance to Common Bacterial Blight in Two Phaseolus vulgaris Genotypes

    Directory of Open Access Journals (Sweden)

    Gregory E. Perry

    2013-08-01

    Resistance to common bacterial blight (CBB), caused by Xanthomonas axonopodis pv. phaseoli, in Phaseolus vulgaris is conditioned by several loci on different chromosomes. Previous studies with OAC-Rex, a CBB-resistant, white bean variety of Mesoamerican origin, identified two resistance loci associated with the molecular markers Pv-CTT001 and SU91, on chromosomes 4 and 8, respectively. Resistance to CBB is assumed to be derived from an interspecific cross with Phaseolus acutifolius in the pedigree of OAC-Rex. Our current whole genome sequencing effort with OAC-Rex provided the opportunity to compare its genome in the regions associated with CBB resistance with the v1.0 release of the P. vulgaris line G19833, which is a large-seeded bean of Andean origin and assumed to be CBB susceptible. In addition, the genomic regions containing SAP6, a marker associated with P. vulgaris-derived CBB-resistance on chromosome 10, were compared. These analyses indicated that gene content was highly conserved between G19833 and OAC-Rex across the regions examined (>80%). However, fifty-nine genes unique to OAC-Rex were identified, with resistance gene homologues making up the largest category (10 genes identified). Two unique genes in OAC-Rex located within the SU91 resistance QTL have homology to P. acutifolius ESTs and may be potential sources of CBB resistance. As the genomic sequence assembly of OAC-Rex is completed, we expect that further comparisons between it and the G19833 genome will lead to a greater understanding of CBB resistance in bean.

  2. A genome-scale metabolic reconstruction of Pseudomonas putida KT2440: iJN746 as a cell factory

    Directory of Open Access Journals (Sweden)

    Thiele Ines

    2008-09-01

    Background: Pseudomonas putida is the best-studied pollutant-degrading bacterium and is harnessed by industrial biotechnology to synthesize fine chemicals. Since the publication of P. putida KT2440's genome, some in silico analyses of its metabolic and biotechnological capacities have been published. However, global understanding of the capabilities of P. putida KT2440 requires the construction of a metabolic model that enables the integration of classical experimental data along with genomic and high-throughput data. The constraint-based reconstruction and analysis (COBRA) approach has been successfully used to build and analyze in silico genome-scale metabolic reconstructions. Results: We present a genome-scale reconstruction of P. putida KT2440's metabolism, iJN746, which was constructed based on genomic, biochemical, and physiological information. This manually curated reconstruction accounts for 746 genes, 950 reactions, and 911 metabolites. iJN746 captures biotechnologically relevant pathways, including polyhydroxyalkanoate synthesis and catabolic pathways of aromatic compounds (e.g., toluene, benzoate, phenylacetate, nicotinate) not described in other metabolic reconstructions or biochemical databases. The predictive potential of iJN746 was validated using experimental data including growth performance and gene deletion studies. Furthermore, in silico growth on toluene was found to be oxygen-limited, suggesting the existence of oxygen-efficient pathways not yet annotated in P. putida's genome. Moreover, we evaluated the production efficiency of polyhydroxyalkanoates from various carbon sources and found fatty acids to be the most prominent candidates, as expected. Conclusion: Here we present the first genome-scale reconstruction of P. putida, a biotechnologically interesting all-surrounder. Taken together, this work illustrates the utility of iJN746 as (i) a knowledge base, (ii) a discovery tool, and (iii) an engineering platform to explore P

  3. Exploring the feasibility of using copy number variants as genetic markers through large-scale whole genome sequencing experiments

    Science.gov (United States)

    Copy number variants (CNV) are large scale duplications or deletions of genomic sequence that are caused by a diverse set of molecular phenomena that are distinct from single nucleotide polymorphism (SNP) formation. Due to their different mechanisms of formation, CNVs are often difficult to track us...

  4. A genome-wide association study of Cloninger's temperament scales: Implications for the evolutionary genetics of personality

    NARCIS (Netherlands)

    Verweij, C.J.H.; Zietsch, B.P.; Medland, S.E.; Gordon, S.D.; Benyamin, B.; Nyholt, D.R.; McEvoy, B.P.; Sullivan, P.F.; Heath, A.C.; Madden, P.A.F.; Henders, A.K.; Montgomery, G.W.; Martin, N.G.; Wray, N.R.

    2010-01-01

    Variation in personality traits is 30-60% attributed to genetic influences. Attempts to unravel these genetic influences at the molecular level have, so far, been inconclusive. We performed the first genome-wide association study of Cloninger's temperament scales in a sample of 5117 individuals, in

  5. Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey

    NARCIS (Netherlands)

    Kerstens, H.H.D.; Crooijmans, R.P.M.A.; Veenendaal, A.; Dibbits, B.W.; Chin-A-Woeng, T.F.C.; Dunnen, den J.T.; Groenen, M.A.M.

    2009-01-01

    Background - The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set up an analysis pipeline based on a

  6. Genome-scale metabolic model of Pichia pastoris with native and humanized glycosylation of recombinant proteins

    DEFF Research Database (Denmark)

    Irani, Zahra Azimzadeh; Kerkhoven, Eduard J.; Shojaosadati, Seyed Abbas;

    2016-01-01

    Pichia pastoris is used for commercial production of human therapeutic proteins, and genome-scale models of P. pastoris metabolism have been generated in the past to study the metabolism and associated protein production by this yeast. A major challenge with clinical usage of recombinant proteins...

  7. Bacterial niche-specific genome expansion is coupled with highly frequent gene disruptions in deep-sea sediments.

    Directory of Open Access Journals (Sweden)

    Yong Wang

    The complexity and dynamics of microbial metagenomes may be evaluated by genome size, gene duplication and the disruption rate between lineages. In this study, we pyrosequenced the metagenomes of microbes obtained from the brine and sediment of a deep-sea brine pool in the Red Sea to explore the possible genomic adaptations of the microbes in response to environmental changes. The microbes from the brine and sediments (both surface and deep layers) of the Atlantis II Deep brine pool had similar communities whereas the effective genome size varied from 7.4 Mb in the brine to more than 9 Mb in the sediment. This genome expansion in the sediment samples was due to gene duplication as evidenced by enrichment of the homologs. The duplicated genes were highly disrupted, on average by 47.6% and 70% for the surface and deep layers of the Atlantis II Deep sediment samples, respectively. The disruptive effects appeared to be mainly due to point mutations and frameshifts. In contrast, the homologs from the Atlantis II Deep brine sample were highly conserved and they maintained relatively small copy numbers. Likely, the adaptation of the microbes in the sediments was coupled with pseudogenizations and possibly functional diversifications of the paralogs in the expanded genomes. The maintenance of the pseudogenes in the large genomes is discussed.

  8. Bacterial niche-specific genome expansion is coupled with highly frequent gene disruptions in deep-sea sediments

    KAUST Repository

    Wang, Yong

    2011-12-21

    The complexity and dynamics of microbial metagenomes may be evaluated by genome size, gene duplication and the disruption rate between lineages. In this study, we pyrosequenced the metagenomes of microbes obtained from the brine and sediment of a deep-sea brine pool in the Red Sea to explore the possible genomic adaptations of the microbes in response to environmental changes. The microbes from the brine and sediments (both surface and deep layers) of the Atlantis II Deep brine pool had similar communities whereas the effective genome size varied from 7.4 Mb in the brine to more than 9 Mb in the sediment. This genome expansion in the sediment samples was due to gene duplication as evidenced by enrichment of the homologs. The duplicated genes were highly disrupted, on average by 47.6% and 70% for the surface and deep layers of the Atlantis II Deep sediment samples, respectively. The disruptive effects appeared to be mainly due to point mutations and frameshifts. In contrast, the homologs from the Atlantis II Deep brine sample were highly conserved and they maintained relatively small copy numbers. Likely, the adaptation of the microbes in the sediments was coupled with pseudogenizations and possibly functional diversifications of the paralogs in the expanded genomes. The maintenance of the pseudogenes in the large genomes is discussed. © 2011 Wang et al.

  9. Noise analysis of genome-scale protein synthesis using a discrete computational model of translation

    International Nuclear Information System (INIS)

    Noise in genetic networks has been the subject of extensive experimental and computational studies. However, very few of these studies have considered noise properties using mechanistic models that account for the discrete movement of ribosomes and RNA polymerases along their corresponding templates (messenger RNA (mRNA) and DNA). The large size of these systems, which scales with the number of genes, mRNA copies, codons per mRNA, and ribosomes, is responsible for some of the challenges. Additionally, one should be able to describe the dynamics of ribosome exchange between the free ribosome pool and those bound to mRNAs, as well as how mRNA species compete for ribosomes. We developed an efficient algorithm for stochastic simulations that addresses these issues and used it to study the contribution and trade-offs of noise to translation properties (rates, time delays, and rate-limiting steps). The algorithm scales linearly with the number of mRNA copies, which allowed us to study the importance of genome-scale competition between mRNAs for the same ribosomes. We determined that noise is minimized under conditions maximizing the specific synthesis rate. Moreover, sensitivity analysis of the stochastic system revealed the importance of the elongation rate in the resultant noise, whereas the translation initiation rate constant was more closely related to the average protein synthesis rate. We observed significant differences between our results and the noise properties of the most commonly used translation models. Overall, our studies demonstrate that the use of full mechanistic models is essential for the study of noise in translation and transcription

  10. Noise analysis of genome-scale protein synthesis using a discrete computational model of translation

    Energy Technology Data Exchange (ETDEWEB)

    Racle, Julien; Hatzimanikatis, Vassily, E-mail: vassily.hatzimanikatis@epfl.ch [Laboratory of Computational Systems Biotechnology, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne (Switzerland); Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne (Switzerland); Stefaniuk, Adam Jan [Laboratory of Computational Systems Biotechnology, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne (Switzerland)

    2015-07-28

    Noise in genetic networks has been the subject of extensive experimental and computational studies. However, very few of these studies have considered noise properties using mechanistic models that account for the discrete movement of ribosomes and RNA polymerases along their corresponding templates (messenger RNA (mRNA) and DNA). The large size of these systems, which scales with the number of genes, mRNA copies, codons per mRNA, and ribosomes, is responsible for some of the challenges. Additionally, one should be able to describe the dynamics of ribosome exchange between the free ribosome pool and those bound to mRNAs, as well as how mRNA species compete for ribosomes. We developed an efficient algorithm for stochastic simulations that addresses these issues and used it to study the contribution and trade-offs of noise to translation properties (rates, time delays, and rate-limiting steps). The algorithm scales linearly with the number of mRNA copies, which allowed us to study the importance of genome-scale competition between mRNAs for the same ribosomes. We determined that noise is minimized under conditions maximizing the specific synthesis rate. Moreover, sensitivity analysis of the stochastic system revealed the importance of the elongation rate in the resultant noise, whereas the translation initiation rate constant was more closely related to the average protein synthesis rate. We observed significant differences between our results and the noise properties of the most commonly used translation models. Overall, our studies demonstrate that the use of full mechanistic models is essential for the study of noise in translation and transcription.
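
    The competition between mRNA species for a shared ribosome pool described above can be illustrated with a much coarser model than the one in the paper. The following is a minimal sketch, not the authors' algorithm: it uses a standard Gillespie simulation in which a bound ribosome finishes translation in a single step instead of moving codon by codon. The function names, rate constants and species counts are all hypothetical.

```python
import random
import statistics


def simulate(mrna_copies, total_ribosomes, k_init, k_term, t_end, seed=0):
    """Gillespie simulation of mRNA species competing for one ribosome pool.

    Assumption: translation is collapsed into initiation (rate k_init per
    mRNA copy per free ribosome) and a single completion step (rate k_term
    per bound ribosome). Codon-level movement is NOT modelled.
    """
    rng = random.Random(seed)
    n = len(mrna_copies)
    bound = [0] * n          # ribosomes currently bound to each species
    proteins = [0] * n       # proteins completed per species
    free = total_ribosomes   # ribosomes in the free pool
    t = 0.0
    while True:
        init = [k_init[i] * mrna_copies[i] * free for i in range(n)]
        term = [k_term[i] * bound[i] for i in range(n)]
        total = sum(init) + sum(term)
        if total <= 0.0:
            break
        t += rng.expovariate(total)   # waiting time to the next event
        if t > t_end:
            break
        # Pick one event with probability proportional to its propensity.
        r = rng.random() * total
        chosen = None
        for idx, a in enumerate(init + term):
            if a <= 0.0:
                continue
            chosen = idx
            r -= a
            if r < 0.0:
                break
        if chosen < n:                # initiation: ribosome leaves the pool
            free -= 1
            bound[chosen] += 1
        else:                         # completion: ribosome returns to pool
            j = chosen - n
            bound[j] -= 1
            free += 1
            proteins[j] += 1
    return proteins, bound, free


def squared_cv(samples):
    """Noise as the squared coefficient of variation (variance / mean^2)."""
    mean = statistics.fmean(samples)
    return statistics.pvariance(samples) / mean ** 2
```

    Running `simulate` with several seeds and passing the protein counts of one species to `squared_cv` gives a rough noise estimate for that species under this simplified model.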

  11. Diversity and relationships of cocirculating modern human rotaviruses revealed using large-scale comparative genomics.

    Science.gov (United States)

    McDonald, Sarah M; McKell, Allison O; Rippinger, Christine M; McAllen, John K; Akopov, Asmik; Kirkness, Ewen F; Payne, Daniel C; Edwards, Kathryn M; Chappell, James D; Patton, John T

    2012-09-01

    Group A rotaviruses (RVs) are 11-segmented, double-stranded RNA viruses and are primary causes of gastroenteritis in young children. Despite their medical relevance, the genetic diversity of modern human RVs is poorly understood, and the impact of vaccine use on circulating strains remains unknown. In this study, we report the complete genome sequence analysis of 58 RVs isolated from children with severe diarrhea and/or vomiting at Vanderbilt University Medical Center (VUMC) in Nashville, TN, during the years spanning community vaccine implementation (2005 to 2009). The RVs analyzed include 36 G1P[8], 18 G3P[8], and 4 G12P[8] Wa-like genogroup 1 strains with VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6 genotype constellations of I1-R1-C1-M1-A1-N1-T1-E1-H1. By constructing phylogenetic trees, we identified 2 to 5 subgenotype alleles for each gene. The results show evidence of intragenogroup gene reassortment among the cocirculating strains. However, several isolates from different seasons maintained identical allele constellations, consistent with the notion that certain RV clades persisted in the community. By comparing the genes of VUMC RVs to those of other archival and contemporary RV strains for which sequences are available, we defined phylogenetic lineages and verified that the diversity of the strains analyzed in this study reflects that seen in other regions of the world. Importantly, the VP4 and VP7 proteins encoded by VUMC RVs and other contemporary strains show amino acid changes in or near neutralization domains, which might reflect antigenic drift of the virus. Thus, this large-scale, comparative genomic study of modern human RVs provides significant insight into how this pathogen evolves during its spread in the community. PMID:22696651

  12. Diversity and Relationships of Cocirculating Modern Human Rotaviruses Revealed Using Large-Scale Comparative Genomics

    Science.gov (United States)

    McKell, Allison O.; Rippinger, Christine M.; McAllen, John K.; Akopov, Asmik; Kirkness, Ewen F.; Payne, Daniel C.; Edwards, Kathryn M.; Chappell, James D.; Patton, John T.

    2012-01-01

    Group A rotaviruses (RVs) are 11-segmented, double-stranded RNA viruses and are primary causes of gastroenteritis in young children. Despite their medical relevance, the genetic diversity of modern human RVs is poorly understood, and the impact of vaccine use on circulating strains remains unknown. In this study, we report the complete genome sequence analysis of 58 RVs isolated from children with severe diarrhea and/or vomiting at Vanderbilt University Medical Center (VUMC) in Nashville, TN, during the years spanning community vaccine implementation (2005 to 2009). The RVs analyzed include 36 G1P[8], 18 G3P[8], and 4 G12P[8] Wa-like genogroup 1 strains with VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6 genotype constellations of I1-R1-C1-M1-A1-N1-T1-E1-H1. By constructing phylogenetic trees, we identified 2 to 5 subgenotype alleles for each gene. The results show evidence of intragenogroup gene reassortment among the cocirculating strains. However, several isolates from different seasons maintained identical allele constellations, consistent with the notion that certain RV clades persisted in the community. By comparing the genes of VUMC RVs to those of other archival and contemporary RV strains for which sequences are available, we defined phylogenetic lineages and verified that the diversity of the strains analyzed in this study reflects that seen in other regions of the world. Importantly, the VP4 and VP7 proteins encoded by VUMC RVs and other contemporary strains show amino acid changes in or near neutralization domains, which might reflect antigenic drift of the virus. Thus, this large-scale, comparative genomic study of modern human RVs provides significant insight into how this pathogen evolves during its spread in the community. PMID:22696651

  13. Determining the control circuitry of redox metabolism at the genome-scale.

    Directory of Open Access Journals (Sweden)

    Stephen Federowicz

    2014-04-01

    Determining how facultative anaerobic organisms sense and direct cellular responses to electron acceptor availability has been a subject of intense study. However, even in the model organism Escherichia coli, established mechanisms only explain a small fraction of the hundreds of genes that are regulated during electron acceptor shifts. Here we propose a qualitative model that accounts for the full breadth of regulated genes by detailing how two global transcription factors (TFs), ArcA and Fnr of E. coli, sense key metabolic redox ratios and act on a genome-wide basis to regulate anabolic, catabolic, and energy generation pathways. We first fill gaps in our knowledge of this transcriptional regulatory network by carrying out ChIP-chip and gene expression experiments to identify 463 regulatory events. We then interfaced this reconstructed regulatory network with a highly curated genome-scale metabolic model to show that ArcA and Fnr regulate >80% of total metabolic flux and 96% of differential gene expression across fermentative and nitrate respiratory conditions. Based on the data, we propose a feedforward with feedback trim regulatory scheme, given the extensive repression of catabolic genes by ArcA and extensive activation of chemiosmotic genes by Fnr. We further corroborated this regulatory scheme by showing a correlation (r² = 0.71, p < 1e-6) between changes in metabolic flux and changes in regulatory activity across fermentative and nitrate respiratory conditions. Finally, we are able to relate the proposed model to a wealth of previously generated data by contextualizing the existing transcriptional regulatory network.

  14. The population genomics of begomoviruses: global scale population structure and gene flow

    Directory of Open Access Journals (Sweden)

    Prasanna HC

    2010-09-01

    Background: The rapidly growing availability of diverse full genome sequences from across the world is increasing the feasibility of studying the large-scale population processes that underlie observable patterns of virus diversity. In particular, characterizing the genetic structure of virus populations could potentially reveal much about how factors such as geographical distributions, host ranges and gene flow between populations combine to produce the discontinuous patterns of genetic diversity that we perceive as distinct virus species. Among the richest and most diverse full genome datasets that are available is that for the dicotyledonous plant infecting genus, Begomovirus, in the Family Geminiviridae. The begomoviruses all share the same whitefly vector, are highly recombinogenic and are distributed throughout tropical and subtropical regions where they seriously threaten the food security of the world's poorest people. Results: We focus here on using a model-based population genetic approach to identify the genetically distinct sub-populations within the global begomovirus meta-population. We demonstrate the existence of at least seven major sub-populations that can further be sub-divided into as many as thirty four significantly differentiated and genetically cohesive minor sub-populations. Using the population structure framework revealed in the present study, we further explored the extent of gene flow and recombination between genetic populations. Conclusions: Although geographical barriers are apparently the most significant underlying cause of the seven major population sub-divisions, within the framework of these sub-divisions, we explore patterns of gene flow to reveal that both host range differences and genetic barriers to recombination have probably been major contributors to the minor population sub-divisions that we have identified. We believe that the global Begomovirus population structure revealed here could...

  15. Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey

    Directory of Open Access Journals (Sweden)

    den Dunnen Johan T

    2009-10-01

    Background: The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set up an analysis pipeline based on a short read de novo sequence assembler and a program designed to identify variation within short reads. To illustrate the potential of this technique, we present the results obtained with a randomly sheared, enzymatically generated, 2-3 kbp genome fraction of six pooled Meleagris gallopavo (turkey) individuals. Results: A total of 100 million 36 bp reads were generated, representing approximately 5-6% (~62 Mbp) of the turkey genome, with an estimated sequence depth of 58. Reads consisting of bases called with less than 1% error probability were selected and assembled into contigs. Subsequently, high throughput discovery of nucleotide variation was performed using sequences with more than 90% reliability by using the assembled contigs that were 50 bp or longer as the reference sequence. We identified more than 7,500 SNPs with a high probability of representing true nucleotide variation in turkeys. Increasing the reference genome by adding publicly available turkey BAC-end sequences increased the number of SNPs to over 11,000. A comparison with the sequenced chicken genome indicated that the assembled turkey contigs were distributed uniformly across the turkey genome. Genotyping of a representative sample of 340 SNPs resulted in a SNP conversion rate of 95%. The correlation of the minor allele count (MAC) and observed minor allele frequency (MAF) for the validated SNPs was 0.69. Conclusion: We provide an efficient and cost-effective approach for the identification of thousands of high quality SNPs in species currently lacking a sequenced genome and applied this to turkey. The methodology addresses a random fraction of the genome, resulting in an even...

  16. Genome-wide linkage using the Social Responsiveness Scale in Utah autism pedigrees

    Directory of Open Access Journals (Sweden)

    Coon Hilary

    2010-04-01

    Background: Autism Spectrum Disorders (ASD) are phenotypically heterogeneous, characterized by impairments in the development of communication and social behaviour and the presence of repetitive behaviour and restricted interests. Dissecting the genetic complexity of ASD may require phenotypic data reflecting more detail than is offered by a categorical clinical diagnosis. Such data are available from the Social Responsiveness Scale (SRS), which is a continuous, quantitative measure of social ability giving scores that range from significant impairment to above average ability. Methods: We present genome-wide results for 64 multiplex and extended families ranging from two to nine generations. SRS scores were available from 518 genotyped pedigree subjects, including affected and unaffected relatives. Genotypes from the Illumina 6 k single nucleotide polymorphism panel were provided by the Center for Inherited Disease Research. Quantitative and qualitative analyses were done using MCLINK, a software package that uses Markov chain Monte Carlo (MCMC) methods to perform multilocus linkage analysis on large extended pedigrees. Results: When analysed as a qualitative trait, linkage occurred in the same locations as in our previous affected-only genome scan of these families, with findings on chromosomes 7q31.1-q32.3 [heterogeneity logarithm of the odds (HLOD) = 2.91], 15q13.3 (HLOD = 3.64), and 13q12.3 (HLOD = 2.23). Additional positive qualitative results were seen on chromosomes 6 and 10 in regions that may be of interest for other neuropsychiatric disorders. When analysed as a quantitative trait, results replicated a peak found in an independent sample using quantitative SRS scores on chromosome 11p15.1-p15.4 (HLOD = 2.77). Additional positive quantitative results were seen on chromosomes 7, 9, and 19. Conclusions: The SRS linkage peaks reported here substantially overlap with peaks found in our previous affected-only genome scan of clinical diagnosis...

  17. Comparative fluorescence in situ hybridization mapping of a 431-kb Arabidopsis thaliana bacterial artificial chromosome contig reveals the role of chromosomal duplications in the expansion of the Brassica rapa genome.

    OpenAIRE

    Jackson, S A; Cheng, Z; Wang, M L; Goodman, H M; Jiang, J

    2000-01-01

    Comparative genome studies are important contributors to our understanding of genome evolution. Most comparative genome studies in plants have been based on genetic mapping of homologous DNA loci in different genomes. Large-scale comparative physical mapping has been hindered by the lack of efficient and affordable techniques. We report here the adaptation of fluorescence in situ hybridization (FISH) techniques for comparative physical mapping between Arabidopsis thaliana and Brassica rapa. A...

  18. Up-scaling aquaculture wastewater treatment by microalgal bacterial flocs: from lab reactors to an outdoor raceway pond.

    Science.gov (United States)

    Van Den Hende, Sofie; Beelen, Veerle; Bore, Gaëlle; Boon, Nico; Vervaeren, Han

    2014-05-01

    Sequencing batch reactors with microalgal bacterial flocs (MaB-floc SBRs) are a novel approach for photosynthetic aerated wastewater treatment based on bioflocculation. To assess their technical potential for aquaculture wastewater treatment in Northwest Europe, MaB-floc SBRs were up-scaled from indoor photobioreactors of 4 L over 40 and 400 L to a 12 m³ outdoor raceway pond. Scale-up decreased the nutrient removal efficiencies by a factor of 1-3 and the volumetric biomass productivities by a factor of 10-13. Effluents met current discharge norms, except for nitrite and nitrate. Flue gas sparging was needed to decrease the effluent pH. Outdoor MaB-flocs showed enhanced settling properties and an increased ash and chlorophyll a content. Bioflocculation enabled successful harvesting by gravity settling and dewatering by filtering at 150-250 μm. Optimisation of nitrogen removal and biomass valorisation are future challenges towards industrial implementation of MaB-floc SBRs for aquaculture wastewater treatment.

  19. Continental-scale variation in seaweed host-associated bacterial communities is a function of host condition, not geography.

    Science.gov (United States)

    Marzinelli, Ezequiel M; Campbell, Alexandra H; Zozaya Valdes, Enrique; Vergés, Adriana; Nielsen, Shaun; Wernberg, Thomas; de Bettignies, Thibaut; Bennett, Scott; Caporaso, J Gregory; Thomas, Torsten; Steinberg, Peter D

    2015-10-01

    Interactions between hosts and associated microbial communities can fundamentally shape the development and ecology of 'holobionts', from humans to marine habitat-forming organisms such as seaweeds. In marine systems, planktonic microbial community structure is mainly driven by geography and related environmental factors, but the large-scale drivers of host-associated microbial communities are largely unknown. Using 16S-rRNA gene sequencing, we characterized 260 seaweed-associated bacterial and archaeal communities on the kelp Ecklonia radiata from three biogeographical provinces spanning 10° of latitude and 35° of longitude across the Australian continent. These phylogenetically and taxonomically diverse communities were more strongly and consistently associated with host condition than geographical location or environmental variables, and a 'core' microbial community characteristic of healthy kelps appears to be lost when hosts become stressed. Microbial communities on stressed individuals were more similar to each other among locations than those on healthy hosts. In contrast to biogeographical patterns of planktonic marine microbial communities, host traits emerge as critical determinants of associated microbial community structure of these holobionts, even at a continental scale.

  20. Evaluation of assembling methods on determination of whole genome sequence of Xylella fastidiosa blueberry bacterial leaf scorch strain

    Science.gov (United States)

    Blueberry bacterial leaf scorch (BBLS) disease, a threat to blueberry production in the Southern USA and potentially elsewhere, is caused by Xylella fastidiosa. Efficient control of BBLS requires knowledge of the pathogen. However, this is challenging because Xylella fastidiosa is difficult to cultu...

  1. Genomes and virulence factors of novel bacterial pathogens causing bleaching disease in the marine red alga Delisea pulchra.

    Directory of Open Access Journals (Sweden)

    Neil Fernandes

    Nautella sp. R11, a member of the marine Roseobacter clade, causes a bleaching disease in the temperate-marine red macroalga, Delisea pulchra. To begin to elucidate the molecular mechanisms underpinning the ability of Nautella sp. R11 to colonize, invade and induce bleaching of D. pulchra, we sequenced and analyzed its genome. The genome encodes several factors such as adhesion mechanisms, systems for the transport of algal metabolites, enzymes that confer resistance to oxidative stress, cytolysins, and global regulatory mechanisms that may allow for the switch of Nautella sp. R11 to a pathogenic lifestyle. Many virulence effectors common in phytopathogenic bacteria are also found in the R11 genome, such as the plant hormone indole acetic acid, cellulose fibrils, succinoglycan and nodulation protein L. Comparative genomics with non-pathogenic Roseobacter strains and a newly identified pathogen, Phaeobacter sp. LSS9, revealed a patchy distribution of putative virulence factors in all genomes, but also led to the identification of a quorum sensing (QS)-dependent transcriptional regulator that was unique to pathogenic Roseobacter strains. This observation supports the model that a combination of virulence factors and QS-dependent regulatory mechanisms enables indigenous members of the host alga's epiphytic microbial community to switch to a pathogenic lifestyle, especially under environmental conditions when innate host defence mechanisms are compromised.

  2. Cancer Biomarkers from Genome-Scale DNA Methylation: Comparison of Evolutionary and Semantic Analysis Methods

    Directory of Open Access Journals (Sweden)

    Ioannis Valavanis

    2015-11-01

    DNA methylation profiling exploits microarray technologies, thus yielding a wealth of high-volume data. Here, an intelligent framework is applied, encompassing epidemiological genome-scale DNA methylation data produced from Illumina's Infinium Human Methylation 450K Bead Chip platform, in an effort to correlate interesting methylation patterns with cancer predisposition and, in particular, breast cancer and B-cell lymphoma. Feature selection and classification are employed in order to select, from an initial set of ~480,000 methylation measurements at CpG sites, predictive cancer epigenetic biomarkers and assess their classification power for discriminating healthy versus cancer related classes. Feature selection exploits evolutionary algorithms or a graph-theoretic methodology which makes use of the semantics information included in the Gene Ontology (GO) tree. The selected features, corresponding to methylation of CpG sites, attained moderate-to-high classification accuracies when imported to a series of classifiers evaluated by resampling or blindfold validation. The semantics-driven selection revealed sets of CpG sites performing similarly to evolutionary selection in the classification tasks. However, gene enrichment and pathway analysis showed that it additionally provides more descriptive sets of GO terms and KEGG pathways regarding the cancer phenotypes studied here. Results support the expediency of this methodology regarding its application in epidemiological studies.

  3. Expanding a dynamic flux balance model of yeast fermentation to genome-scale

    Directory of Open Access Journals (Sweden)

    Agosin Eduardo

    2011-05-01

    Background: Yeast is considered to be a workhorse of the biotechnology industry for the production of many value-added chemicals, alcoholic beverages and biofuels. Optimization of the fermentation is a challenging task that greatly benefits from dynamic models able to accurately describe and predict the fermentation profile and resulting products under different genetic and environmental conditions. In this article, we developed and validated a genome-scale dynamic flux balance model, using experimentally determined kinetic constraints. Results: Appropriate equations for maintenance, biomass composition, anaerobic metabolism and nutrient uptake are key to improve model performance, especially for predicting glycerol and ethanol synthesis. Prediction profiles of synthesis and consumption of the main metabolites involved in alcoholic fermentation closely agreed with experimental data obtained from numerous lab and industrial fermentations under different environmental conditions. Finally, fermentation simulations of genetically engineered yeasts closely reproduced previously reported experimental results regarding final concentrations of the main fermentation products such as ethanol and glycerol. Conclusion: A useful tool to describe, understand and predict metabolite production in batch yeast cultures was developed. The resulting model, if used wisely, could help to search for new metabolic engineering strategies to manage ethanol content in batch fermentations.

  4. An Integrative Approach for the Large-scale Identification of Human Genome Kinases Regulating Cancer Metastasis

    Science.gov (United States)

    Zhang, Hanshuo; Wu, Pu-Yen; Ma, Ming; Ye, Yanzheng; Hao, Yang; Yang, Junyu; Yin, Shenyi; Sun, Changhong; Phan, John H.; Wang, May D.; Xi, Jianzhong Jeff

    2016-01-01

    Kinases regulate the majority of biological processes and have become one of the important groups of drug targets. To identify more kinases with potential for cancer therapy, we developed an integrative approach for the large-scale screening of functional genes capable of regulating the main traits of cancer metastasis, including cell migration and invasion. We first employed a self-assembled cell microarray (SAMcell) to screen functional genes that regulate cancer cell migration using a siRNA library targeting 710 human genome kinase genes. We identified 81 genes capable of significantly regulating cancer cell migration. Following invasion assays and bioinformatics analysis, we discovered that 16 genes that are differentially expressed in cancer samples can regulate both cell migration and invasion, among which 10 genes are well known to play critical roles in cancer development. The remaining 6 genes were experimentally validated as capable of regulating metastasis-related traits, including cell proliferation, apoptosis and anoikis activities besides cell motility. Together, these findings provide new insight into the therapeutic use of human kinases. PMID:23751374

  5. Novel insights into obesity and diabetes through genome-scale metabolic modeling

    Directory of Open Access Journals (Sweden)

    Leif Väremo

    2013-04-01

    The growing prevalence of metabolic diseases, such as obesity and diabetes, is putting a high strain on global healthcare systems as well as increasing the demand for efficient treatment strategies. More than 360 million people worldwide are suffering from type 2 diabetes and, with the current trends, the projection is that 10% of the global adult population will be affected by 2030. In light of the systemic properties of metabolic diseases as well as the interconnected nature of metabolism, it is necessary to begin taking a holistic approach to study these diseases. Human genome-scale metabolic models (GEMs) are topological and mathematical representations of cell metabolism and have proven to be valuable tools in the area of systems biology. Successful applications of GEMs include the process of gaining further biological and mechanistic understanding of diseases, finding potential biomarkers and identifying new drug targets. This review will focus on the modeling of human metabolism in the field of obesity and diabetes, showing its vast range of applications of clinical importance as well as pointing out future challenges.

  6. Further developments towards a genome-scale metabolic model of yeast

    Directory of Open Access Journals (Sweden)

    Dunn Warwick B

    2010-10-01

    Background: To date, several genome-scale network reconstructions have been used to describe the metabolism of the yeast Saccharomyces cerevisiae, each differing in scope and content. The recent community-driven reconstruction, while rigorously evidenced and well annotated, under-represented metabolite transport, lipid metabolism and other pathways, and was not amenable to constraint-based analyses because of lack of pathway connectivity. Results: We have expanded the yeast network reconstruction to incorporate many new reactions from the literature and represented these in a well-annotated and standards-compliant manner. The new reconstruction comprises 1102 unique metabolic reactions involving 924 unique metabolites - significantly larger in scope than any previous reconstruction. The representation of lipid metabolism in particular has improved, with 234 out of 268 enzymes linked to lipid metabolism now present in at least one reaction. Connectivity is emphatically improved, with more than 90% of metabolites now reachable from the growth medium constituents. The present updates allow constraint-based analyses to be performed; viability predictions of single knockouts are comparable to results from in vivo experiments and to those of previous reconstructions. Conclusions: We report the development of the most complete reconstruction of yeast metabolism to date that is based upon reliable literature evidence and richly annotated according to MIRIAM standards. The reconstruction is available in the Systems Biology Markup Language (SBML) and via a publicly accessible database http://www.comp-sys-bio.org/yeastnet/.
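
    The connectivity figure quoted above (metabolites reachable from the growth medium) can be illustrated with a simple network-expansion sketch. The abstract does not say how reachability was computed, so the rule used here is an assumption: a reaction can fire once all of its substrates are reachable. The toy reactions and metabolite names are hypothetical.

```python
def reachable_metabolites(reactions, medium):
    """Return the set of metabolites producible from the medium.

    reactions: iterable of (substrates, products) pairs of sets.
    Assumption: a reaction fires only when ALL substrates are reachable.
    """
    reachable = set(medium)
    changed = True
    while changed:                      # repeat until no reaction adds anything
        changed = False
        for substrates, products in reactions:
            if substrates <= reachable and not products <= reachable:
                reachable |= products
                changed = True
    return reachable


def reachable_fraction(reactions, medium):
    """Fraction of all metabolites in the network reachable from the medium."""
    all_metabolites = set(medium)
    for substrates, products in reactions:
        all_metabolites |= substrates | products
    return len(reachable_metabolites(reactions, medium)) / len(all_metabolites)


# Hypothetical toy network: the third reaction is disconnected from the medium.
toy_reactions = [
    ({"glc"}, {"g6p"}),
    ({"g6p", "atp"}, {"fbp", "adp"}),
    ({"x"}, {"y"}),
]
```

    With the medium `{"glc", "atp"}`, the toy network reaches `glc`, `atp`, `g6p`, `fbp` and `adp` but not `x` or `y`, which is the kind of gap that a connectivity check on a reconstruction is meant to expose.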

  7. SECOM: A novel hash seed and community detection based-approach for genome-scale protein domain identification

    KAUST Repository

    Fan, Ming

    2012-06-28

    With rapid advances in the development of DNA sequencing technologies, a plethora of high-throughput genome and proteome data from a diverse spectrum of organisms have been generated. The functional annotation and evolutionary history of proteins are usually inferred from domains predicted from the genome sequences. Traditional database-based domain prediction methods cannot identify novel domains, however, and alignment-based methods, which look for recurring segments in the proteome, are computationally demanding. Here, we propose a novel genome-wide domain prediction method, SECOM. Instead of conducting all-against-all sequence alignment, SECOM first indexes all the proteins in the genome by using a hash seed function. Local similarity can thus be detected and encoded into a graph structure, in which each node represents a protein sequence and each edge weight represents the shared hash seeds between the two nodes. SECOM then formulates the domain prediction problem as an overlapping community-finding problem in this graph. A backward graph percolation algorithm that efficiently identifies the domains is proposed. We tested SECOM on five recently sequenced genomes of aquatic animals. Our tests demonstrated that SECOM was able to identify most of the known domains identified by InterProScan. When compared with the alignment-based method, SECOM showed higher sensitivity in detecting putative novel domains, while it was also three orders of magnitude faster. For example, SECOM was able to predict a novel sponge-specific domain in nucleoside-triphosphatase (NTPases). Furthermore, SECOM discovered two novel domains, likely of bacterial origin, that are taxonomically restricted to sea anemone and hydra. SECOM is an open-source program and available at http://sfb.kaust.edu.sa/Pages/Software.aspx. © 2012 Fan et al.
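
    The indexing and graph-construction step described above can be sketched as follows. This is an illustration only, not SECOM itself: the abstract does not define the hash seed function, so contiguous k-mers stand in for seeds, and the community-finding (graph percolation) step is not shown. The function name, the value of `k` and the toy sequences are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def seed_graph(proteins, k=3):
    """Build a weighted protein graph from shared seeds.

    proteins: dict mapping protein name -> amino-acid sequence.
    Assumption: a "seed" is a contiguous k-mer; the real SECOM seed
    function may differ. Returns {(name_a, name_b): shared_seed_count}.
    """
    # Inverted index: seed -> set of proteins containing it. Building this
    # is linear in total sequence length and avoids all-against-all alignment.
    index = defaultdict(set)
    for name, seq in proteins.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)

    # Each seed shared by two proteins adds 1 to the weight of their edge.
    weights = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical toy sequences: p1 and p2 share the seeds "KVL" and "VLA".
toy_proteins = {"p1": "MKVLAA", "p2": "GGKVLA", "p3": "TTTTTT"}
```

    On the toy input, `seed_graph(toy_proteins)` yields a single edge between `p1` and `p2`; an overlapping community-detection method would then be run on the resulting graph to propose domains.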

  8. Identification of Bacterial Small RNAs by RNA Sequencing

    DEFF Research Database (Denmark)

    Gómez Lozano, María; Marvig, Rasmus Lykke; Molin, Søren;

    2014-01-01

    Small regulatory RNAs (sRNAs) in bacteria are known to modulate gene expression and control a variety of processes including metabolic reactions, stress responses, and pathogenesis in response to environmental signals. A method to identify bacterial sRNAs on a genome-wide scale based on RNA...

  9. Using large-scale genome variation cohorts to decipher the molecular mechanism of cancer.

    Science.gov (United States)

    Habermann, Nina; Mardin, Balca R; Yakneen, Sergei; Korbel, Jan O

    2016-01-01

    Characterizing genomic structural variations (SVs) in the human genome remains challenging, and there is a growing interest to understand somatic SVs occurring in cancer, a disease of the genome. A havoc-causing SV process known as chromothripsis scars the genome when localized chromosome shattering and repair occur in a one-off catastrophe. Recent efforts led to the development of a set of conceptual criteria for the inference of chromothripsis events in cancer genomes and to the development of experimental model systems for studying this striking DNA alteration process in vitro. We discuss these approaches, and additionally touch upon current "Big Data" efforts that employ hybrid cloud computing to enable studies of numerous cancer genomes in an effort to search for commonalities and differences in molecular DNA alteration processes in cancer. PMID:27342254

  11. CATMA, a comprehensive genome-scale resource for silencing and transcript profiling of Arabidopsis genes

    Directory of Open Access Journals (Sweden)

    Moreau Yves

    2007-10-01

    Gène genome annotations, respectively. To cover the remaining untagged genes, we identified 543 additional GSTs using less stringent design criteria and designed 990 sequence tags matching multiple members of gene families (Gene Family Tags or GFTs). These latter 1,533 features constitute the CATMAv4 addition. Conclusion: To update the CATMA GST repertoire, we designed 7,289 additional sequence tags, bringing the total number of tagged TAIR6-annotated Arabidopsis nuclear protein-coding genes to 26,173. This resource is used both for the production of spotted microarrays and the large-scale cloning of hairpin RNA silencing vectors. All information about the resulting updated CATMA repertoire is available through the CATMA database http://www.catma.org.

  12. Exploring the metabolic network of the epidemic pathogen Burkholderia cenocepacia J2315 via genome-scale reconstruction

    Directory of Open Access Journals (Sweden)

    Panda Gurudutta

    2011-05-01

    Background: Burkholderia cenocepacia is a threatening nosocomial epidemic pathogen in patients with cystic fibrosis (CF) or a compromised immune system. Its high level of antibiotic resistance is an increasing concern in treating its infections. Strain B. cenocepacia J2315 is the most infectious isolate from CF patients. There is a strong demand to reconstruct a genome-scale metabolic network of B. cenocepacia J2315 to systematically analyze its metabolic capabilities and its virulence traits, and to search for potential clinical therapy targets. Results: We reconstructed the genome-scale metabolic network of B. cenocepacia J2315. An iterative reconstruction process led to the establishment of a robust model, iKF1028, which accounts for 1,028 genes, 859 internal reactions, and 834 metabolites. The model iKF1028 captures important metabolic capabilities of B. cenocepacia J2315, with a particular focus on the biosynthesis of key metabolic virulence factors, to assist in understanding the mechanism of infection and identifying potential drug targets. The model was tested through BIOLOG assays. Based on the model, the genome annotation of B. cenocepacia J2315 was refined and 24 genes were re-annotated. Gene and enzyme essentiality were analyzed to provide further insights into the genome function and architecture. A total of 45 essential enzymes were identified as potential therapeutic targets. Conclusions: As the first genome-scale metabolic network of B. cenocepacia J2315, iKF1028 allows a systematic study of the metabolic properties of B. cenocepacia and its key metabolic virulence factors affecting the CF community. The model can be used as a discovery tool to design novel drugs against diseases caused by this notorious pathogen.

  13. A high-resolution cattle CNV map by population-scale genome sequencing

    Science.gov (United States)

    Copy Number Variations (CNVs) are common genomic structural variations that have been linked to human diseases and phenotypic traits. Prior studies in cattle have produced low-resolution CNV maps. We constructed a draft, high-resolution map of cattle CNVs based on whole genome sequencing data from 7...

  14. Large-scale trends in the evolution of gene structures within 11 animal genomes.

    Directory of Open Access Journals (Sweden)

    Mark Yandell

    2006-03-01

    We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans) together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales, from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for "Comparative Genomics Library"). Our results demonstrate that change in intron-exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.

  15. Genome-scale prediction of proteins with long intrinsically disordered regions.

    Science.gov (United States)

    Peng, Zhenling; Mizianty, Marcin J; Kurgan, Lukasz

    2014-01-01

    Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super-fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes as its inputs an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster than the best currently available disorder predictor. Utilizing our time-efficient predictor, we characterized the abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs, with the majority of proteomes having between 25 and 40%, where higher abundance is characteristic of proteomes that have larger proteins. Our first-of-its-kind large-scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/.
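
    The LDR definition quoted above (30 or more consecutive disordered residues) can be made concrete with a short sketch. It assumes per-residue disorder calls are already available as booleans from some upstream predictor; it illustrates only the definition and is not SLIDER's logistic-regression model. The function name and the `min_length` parameter are hypothetical.

```python
def has_long_disordered_region(disorder_flags, min_length=30):
    """Return True if any run of consecutive disordered residues
    reaches min_length (30 by the definition in the abstract).

    disorder_flags: iterable of booleans, one per residue, where True
    marks a residue called disordered by an upstream predictor (assumed).
    """
    run = 0
    for flag in disorder_flags:
        run = run + 1 if flag else 0  # extend or reset the current run
        if run >= min_length:
            return True
    return False


# Usage: a 30-residue disordered stretch qualifies; two 29-residue
# stretches separated by one ordered residue do not.
print(has_long_disordered_region([True] * 30))
print(has_long_disordered_region([True] * 29 + [False] + [True] * 29))
```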

  16. Development of a database system for mapping insertional mutations onto the mouse genome with large-scale experimental data

    OpenAIRE

    Yang, Wenwei; Jin, Ke; Xie, Xing; Li, Dongsheng; Yang, Jigang; Wang, Li; Gu, Ning; Zhong, Yang; Sun, Ling V.

    2009-01-01

    Background Insertional mutagenesis is an effective method for functional genomic studies in various organisms. It can rapidly generate easily tractable mutations. A large-scale insertional mutagenesis with the piggyBac (PB) transposon is currently performed in mice at the Institute of Developmental Biology and Molecular Medicine (IDM), Fudan University in Shanghai, China. This project is carried out via collaborations among multiple groups overseeing interconnected experimental steps and gene...

  17. Construction of Genome Bacterial Artificial Chromosome Library of Hylobates Hoolock

    Institute of Scientific and Technical Information of China (English)

    王起明; 孙烨超; 厉申捷; 叶建平

    2015-01-01

    High-quality genomic DNA of Hylobates hoolock was obtained by gentle physical homogenization. The DNA was partially digested with EcoRI and EcoRI methylase, and cloned into the pCC1BAC vector. The positive clones were stored in 384-well plates. The constructed BAC library consists of 85,800 clones. DNA from 250 randomly selected BAC clones was digested with the NotI restriction enzyme, and the fragments were separated by pulsed-field gel electrophoresis. The results show that the average insert size is approximately 110 kb and that the proportion of non-recombinant clones (without inserts) is 10.0%. Assuming a genome size of 3 × 10^6 kb for Hylobates hoolock, the library provides approximately 3-fold genome coverage.
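
    The coverage figure can be checked with a short calculation using the numbers reported in the abstract. Whether the authors discounted the 10% of clones without inserts is an assumption made here; both variants round to about 3-fold. The function name is hypothetical.

```python
def bac_library_coverage(n_clones, mean_insert_kb, empty_fraction, genome_kb):
    """Fold genome coverage contributed by insert-bearing clones."""
    recombinant_clones = n_clones * (1.0 - empty_fraction)
    return recombinant_clones * mean_insert_kb / genome_kb


# Values from the abstract: 85,800 clones, 110 kb mean insert,
# 10% non-recombinant clones, assumed genome size of 3 x 10^6 kb.
coverage = bac_library_coverage(85_800, 110.0, 0.10, 3e6)
coverage_all_clones = bac_library_coverage(85_800, 110.0, 0.0, 3e6)
print(f"{coverage:.2f}x discounting empty clones")
print(f"{coverage_all_clones:.2f}x counting every clone")
```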

  18. Population genomic analysis of a bacterial plant pathogen: novel insight into the origin of Pierce's disease of grapevine in the U.S.

    Directory of Open Access Journals (Sweden)

    Leonard Nunney

    Invasive diseases present an increasing problem worldwide; however, genomic techniques are now available to investigate the timing and geographical origin of such introductions. We employed genomic techniques to demonstrate that the bacterial pathogen causing Pierce's disease of grapevine (PD) is not native to the US as previously assumed, but descended from a single genotype introduced from Central America. PD has posed a serious threat to the US wine industry ever since its first outbreak in Anaheim, California in the 1880s and continues to inhibit grape cultivation in a large area of the country. It is caused by infection of xylem vessels by the bacterium Xylella fastidiosa subsp. fastidiosa, a genetically distinct subspecies at least 15,000 years old. We present five independent kinds of evidence that strongly support our invasion hypothesis: (1) a genome-wide lack of genetic variability in X. fastidiosa subsp. fastidiosa found in the US, consistent with a recent common ancestor; (2) evidence for historical allopatry of the North American subspecies X. fastidiosa subsp. multiplex and X. fastidiosa subsp. fastidiosa; (3) evidence that X. fastidiosa subsp. fastidiosa evolved in a more tropical climate than X. fastidiosa subsp. multiplex; (4) much greater genetic variability in the proposed source population in Central America, variation within which the US genotypes are phylogenetically nested; and (5) the circumstantial evidence of importation of known hosts (coffee plants) from Central America directly into southern California just prior to the first known outbreak of the disease. The lack of genetic variation in X. fastidiosa subsp. fastidiosa in the US suggests that preventing additional introductions is important since new genetic variation may undermine PD control measures, or may lead to infection of other crop plants through the creation of novel genotypes via inter-subspecific recombination. In general, geographically mixing of previously

  19. Genome-scale modeling using flux ratio constraints to enable metabolic engineering of clostridial metabolism in silico

    Directory of Open Access Journals (Sweden)

    McAnulty Michael J

    2012-05-01

    Background: Genome-scale metabolic networks and flux models are an effective platform for linking an organism's genotype to its phenotype. However, few modeling approaches offer predictive capabilities to evaluate potential metabolic engineering strategies in silico. Results: A new method called "flux balance analysis with flux ratios" (FBrAtio) was developed in this research and applied to a new genome-scale model of Clostridium acetobutylicum ATCC 824 (iCAC490) that contains 707 metabolites and 794 reactions. FBrAtio was used to model wild-type metabolism and metabolically engineered strains of C. acetobutylicum where only flux ratio constraints and thermodynamic reversibility of reactions were required. The FBrAtio approach allowed solutions to be found through standard linear programming. Five flux ratio constraints were required to achieve a qualitative picture of wild-type metabolism for C. acetobutylicum for the production of: (i) acetate, (ii) lactate, (iii) butyrate, (iv) acetone, (v) butanol, (vi) ethanol, (vii) CO2 and (viii) H2. Results of this simulation study coincide with published experimental results and show that the knockdown of the acetoacetyl-CoA transferase increases butanol to acetone selectivity, while the simultaneous over-expression of the aldehyde/alcohol dehydrogenase greatly increases ethanol production. Conclusions: FBrAtio is a promising new method for constraining genome-scale models using internal flux ratios. The method was effective for modeling wild-type and engineered strains of C. acetobutylicum.
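
    A flux ratio can be written as one extra linear row next to the mass-balance rows, which is why the problem stays within standard linear programming. The sketch below shows one way to encode this, under an assumption not stated in the abstract: that the ratio is the fraction of a branch-point flux going to one of two competing reactions. The three-reaction toy network, the function names, and the value 0.7 are hypothetical; this is not the iCAC490 model or the authors' implementation.

```python
def ratio_row(n_reactions, i, j, fraction):
    """Row a with a . v = 0 equivalent to v_i / (v_i + v_j) = fraction.

    Rearranged: (1 - fraction) * v_i - fraction * v_j = 0, which is
    linear in the fluxes and can be appended to S v = 0.
    """
    row = [0.0] * n_reactions
    row[i] = 1.0 - fraction
    row[j] = -fraction
    return row


def dot(row, v):
    """Plain dot product of a constraint row with a flux vector."""
    return sum(a * x for a, x in zip(row, v))


# Toy network (hypothetical): v0 = uptake -> A, v1 = A -> B, v2 = A -> C
S = [[1.0, -1.0, -1.0]]                  # steady-state mass balance for A
S_aug = S + [ratio_row(3, 1, 2, 0.7)]    # 70% of the flux through A goes to B

# With uptake fixed at 10, the augmented system pins v1 = 7 and v2 = 3
v = [10.0, 7.0, 3.0]
residuals = [dot(row, v) for row in S_aug]
print(all(abs(r) < 1e-9 for r in residuals))
```

    In a full model the augmented matrix would be passed to an LP solver together with flux bounds and an objective; that step is omitted here.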

  20. Capturing the response of Clostridium acetobutylicum to chemical stressors using a regulated genome-scale metabolic model

    International Nuclear Information System (INIS)

    Clostridia are anaerobic Gram-positive Firmicutes containing broad and flexible systems for substrate utilization, which have been used successfully to produce a range of industrial compounds. Clostridium acetobutylicum has been used to produce butanol on an industrial scale through acetone-butanol-ethanol (ABE) fermentation. A genome-scale metabolic (GSM) model is a powerful tool for understanding the metabolic capacities of an organism and developing metabolic engineering strategies for strain development. The integration of stress related specific transcriptomics information with the GSM model provides opportunities for elucidating the focal points of regulation

  1. Three novel C1q domain containing proteins from the disk abalone Haliotis discus discus: Genomic organization and analysis of the transcriptional changes in response to bacterial pathogens.

    Science.gov (United States)

    Bathige, S D N K; Umasuthan, Navaneethaiyer; Jayasinghe, J D H E; Godahewa, G I; Park, Hae-Chul; Lee, Jehee

    2016-09-01

    The globular C1q (gC1q) domain containing proteins, commonly referred to as C1q domain containing (C1qDC) proteins, are an essential family of proteins involved in various innate immune responses. In this study, three novel C1qDC proteins were identified from the disk abalone (Haliotis discus discus) transcriptome database and designated as AbC1qDC1, AbC1qDC2, and AbC1qDC3. The cDNA sequences of AbC1qDC1, AbC1qDC2, and AbC1qDC3 consisted of 807, 1305, and 660 bp open reading frames (ORFs) encoding 269, 435, and 220 amino acids (aa), respectively. Putative signal peptides and the N-terminal gC1q domain were identified in all three AbC1qDC proteins. An additional predicted motif region, known as the coiled coil region (CCR), was identified next to the signal sequence of AbC1qDC2. The genomic organization of the AbC1qDCs was determined using a bacterial artificial chromosome (BAC) library. It was found that the CDS of AbC1qDC1 was distributed among three exons, while the CDSs of AbC1qDC2 and AbC1qDC3 were distributed between two exons. Sequence analysis indicated that the AbC1qDC proteins shared [...] muscle, and mantle tissues compared to the other tissues analyzed, using reverse transcription, followed by quantitative real-time PCR (qPCR) using SYBR Green, whereas AbC1qDC3 was predominantly expressed in gill tissues, followed by muscles and the hepatopancreas. The temporal expression of AbC1qDC transcripts in gills after bacterial (Vibrio parahaemolyticus and Listeria monocytogenes) and lipopolysaccharide stimulation indicated that AbC1qDCs can be strongly induced by both Gram-negative and Gram-positive bacterial species with different response profiles. The results of this study suggest that AbC1qDCs are involved in immune responses against invading bacterial pathogens. PMID:27417231

  2. Genomic-scale analysis of DNA words of arbitrary length by parallel computation.

    OpenAIRE

    Yang, X Y; Ripoll, A.; Arnau Llombart, Vicente; Marín Lozano, Ignacio; Luque, E.

    2006-01-01

    In the post-genomic era, one of the main tasks is deciphering the meaning of the DNA sequences of complex organisms. In order to do so, there is a clear need for biocomputer tools able to extract and order the information of long DNA molecules, such as whole chromosomes or even complete genomes. However, most genomic analyses have been concentrated on the detection and counting of short words having sizes of between 1 and 10 nucleotides. In this paper, we describe parallel algorithms with dif...
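
    Counting DNA words of a chosen length can be sketched serially as below. This is only a baseline illustration of the task: the parallel algorithms the paper describes are not given in the truncated abstract, and the function name `count_words` is hypothetical.

```python
from collections import Counter


def count_words(sequence, k):
    """Count every overlapping word of length k in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Usage on a toy sequence: a 10-nt sequence holds 7 overlapping 4-nt words
counts = count_words("ACGTACGTAC", 4)
print(counts.most_common())
```

    A sequence of length n contains n - k + 1 overlapping words, so memory grows with the number of distinct words rather than with k itself.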

  3. Small-Scale Vertical Distribution of Bacterial Biomass and Diversity in Biological Soil Crusts from Arid Lands in the Colorado Plateau

    Science.gov (United States)

    Garcia-Pichel, F.; Johnson, S.L.; Youngkin, D.; Belnap, J.

    2003-01-01

    We characterized, at millimeter resolution, bacterial biomass, diversity, and vertical stratification of biological soil crusts in arid lands from the Colorado Plateau. Microscopic counts, extractable DNA, and plate counts of viable aerobic copiotrophs (VAC) revealed that the top centimeter of crusted soils contained atypically large bacterial populations, tenfold larger than those in uncrusted, deeper soils. The plate counts were not always consistent with more direct estimates of microbial biomass. Bacterial populations peaked at the immediate subsurface (1-2 mm) in light-appearing, young crusts, and at the surface (0-1 mm) in well-developed, dark crusts, which corresponds to the location of cyanobacterial populations. Bacterial abundance decreased with depth below these horizons. Spatially resolved DGGE fingerprints of Bacterial 16S rRNA genes demonstrated the presence of highly diverse natural communities, but we could detect neither trends with depth in bacterial richness or diversity, nor a difference in diversity indices between crust types. Fingerprints, however, revealed the presence of marked stratification in the structure of the microbial communities, probably a result of vertical gradients in physicochemical parameters. Sequencing and phylogenetic analyses indicated that most of the naturally occurring bacteria are novel types, with low sequence similarity (83-93%) to those available in public databases. DGGE analyses of the VAC populations indicated communities of lower diversity, with most types having sequences more than 94% similar to those in public databases. Our study indicates that soil crusts represent small-scale mantles of fertility in arid ecosystems, harboring vertically structured, little-known bacterial populations that are not well represented by standard cultivation methods.

  4. In Vitro Whole Genome DNA Binding Analysis of the Bacterial Replication Initiator and Transcription Factor DnaA

    OpenAIRE

    Smith, Janet L.; Grossman, Alan D.

    2015-01-01

    DnaA, the replication initiation protein in bacteria, is an AAA+ ATPase that binds and hydrolyzes ATP and exists in a heterogeneous population of ATP-DnaA and ADP-DnaA. DnaA binds cooperatively to the origin of replication and several other chromosomal regions, and functions as a transcription factor at some of these regions. We determined the binding properties of Bacillus subtilis DnaA to genomic DNA in vitro at single nucleotide resolution using in vitro DNA affinity purification and deep ...

  5. Genome-scale comparison and constraint-based metabolic reconstruction of the facultative anaerobic Fe(III)-reducer Rhodoferax ferrireducens

    Directory of Open Access Journals (Sweden)

    Daugherty Sean

    2009-09-01

    Background: Rhodoferax ferrireducens is a metabolically versatile, Fe(III)-reducing, subsurface microorganism that is likely to play an important role in the carbon and metal cycles in the subsurface. It also has the unique ability to convert sugars to electricity, oxidizing the sugars to carbon dioxide with quantitative electron transfer to graphite electrodes in microbial fuel cells. In order to expand our limited knowledge about R. ferrireducens, the complete genome sequence of this organism was further annotated and then the physiology of R. ferrireducens was investigated with a constraint-based, genome-scale in silico metabolic model and laboratory studies. Results: The iterative modeling and experimental approach unveiled exciting, previously unknown physiological features, including an expanded range of substrates that support growth, such as cellobiose and citrate, and provided additional insights into important features such as the stoichiometry of the electron transport chain and the ability to grow via fumarate dismutation. Further analysis explained why R. ferrireducens is unable to grow via photosynthesis or fermentation of sugars like other members of this genus and uncovered novel genes for benzoate metabolism. The genome also revealed that R. ferrireducens is well-adapted for growth in the subsurface because it appears to be capable of dealing with a number of environmental insults, including heavy metals, aromatic compounds, nutrient limitation and oxidative stress. Conclusion: This study demonstrates that combining genome-scale modeling with the annotation of a new genome sequence can guide experimental studies and accelerate the understanding of the physiology of under-studied yet environmentally relevant microorganisms.

  6. Folding Free Energies of 5'-UTRs Impact Post-Transcriptional Regulation on a Genomic Scale in Yeast.

    Directory of Open Access Journals (Sweden)

    2005-12-01

    Using high-throughput technologies, abundances and other features of genes and proteins have been measured on a genome-wide scale in Saccharomyces cerevisiae. In contrast, secondary structure in 5'-untranslated regions (UTRs) of mRNA has only been investigated for a limited number of genes. Here, the aim is to study genome-wide regulatory effects of mRNA 5'-UTR folding free energies. We performed computations of secondary structures in 5'-UTRs and their folding free energies for all verified genes in S. cerevisiae. We found significant correlations between folding free energies of 5'-UTRs and various transcript features measured in genome-wide studies of yeast. In particular, mRNAs with weakly folded 5'-UTRs have higher translation rates, higher abundances of the corresponding proteins, longer half-lives, and higher numbers of transcripts, and are upregulated after heat shock. Furthermore, 5'-UTRs have significantly higher folding free energies than other genomic regions and randomized sequences. We also found a positive correlation between transcript half-life and ribosome occupancy that is more pronounced for short-lived transcripts, which supports a picture of competition between translation and degradation. Among the genes with strongly folded 5'-UTRs, there is a huge overrepresentation of uncharacterized open reading frames. Based on our analysis, we conclude that (i) there is a widespread bias for 5'-UTRs to be weakly folded, (ii) folding free energies of 5'-UTRs are correlated with mRNA translation and turnover on a genomic scale, and (iii) transcripts with strongly folded 5'-UTRs are often rare and hard to find experimentally.

  7. Leveraging Large-Scale Cancer Genomics Datasets for Germline Discovery - TCGA

    Science.gov (United States)

    The session will review how data types have changed over time, focusing on how next-generation sequencing is being employed to yield more precise information about the underlying genomic variation that influences tumor etiology and biology.

  8. Chlamydia psittaci: new insights into genomic diversity, clinical pathology, host-pathogen interaction and anti-bacterial immunity.

    Science.gov (United States)

    Knittler, Michael R; Berndt, Angela; Böcker, Selina; Dutow, Pavel; Hänel, Frank; Heuer, Dagmar; Kägebein, Danny; Klos, Andreas; Koch, Sophia; Liebler-Tenorio, Elisabeth; Ostermann, Carola; Reinhold, Petra; Saluz, Hans Peter; Schöfl, Gerhard; Sehnert, Philipp; Sachse, Konrad

    2014-10-01

    The distinctive and unique features of the avian and mammalian zoonotic pathogen Chlamydia (C.) psittaci include the fulminant course of clinical disease, the remarkably wide host range and the high proportion of latent infections that are not leading to overt disease. Current knowledge on associated diseases is rather poor, even in comparison to other chlamydial agents. In the present paper, we explain and summarize the major findings of a national research network that focused on the elucidation of host-pathogen interactions in vitro and in animal models of C. psittaci infection, with the objective of improving our understanding of genomics, pathology, pathophysiology, molecular pathogenesis and immunology, and conceiving new approaches to therapy. We discuss new findings on comparative genome analysis, the complexity of pathophysiological interactions and systemic consequences, local immune response, the role of the complement system and antigen presentation pathways in the general context of state-of-the-art knowledge on chlamydial infections in humans and animals and single out relevant research topics to fill remaining knowledge gaps on this important yet somewhat neglected pathogen.

  9. Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing

    OpenAIRE

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan

    2013-01-01

    Background Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from la...

  10. Genome-scale functional characterization of Drosophila developmental enhancers in vivo.

    Science.gov (United States)

    Kvon, Evgeny Z; Kazmar, Tomas; Stampfel, Gerald; Yáñez-Cuna, J Omar; Pagani, Michaela; Schernhuber, Katharina; Dickson, Barry J; Stark, Alexander

    2014-08-01

    Transcriptional enhancers are crucial regulators of gene expression and animal development and the characterization of their genomic organization, spatiotemporal activities and sequence properties is a key goal in modern biology. Here we characterize the in vivo activity of 7,705 Drosophila melanogaster enhancer candidates covering 13.5% of the non-coding non-repetitive genome throughout embryogenesis. 3,557 (46%) candidates are active, suggesting a high density with 50,000 to 100,000 developmental enhancers genome-wide. The vast majority of enhancers display specific spatial patterns that are highly dynamic during development. Most appear to regulate their neighbouring genes, suggesting that the cis-regulatory genome is organized locally into domains, which are supported by chromosomal domains, insulator binding and genome evolution. However, 12 to 21 per cent of enhancers appear to skip non-expressed neighbours and regulate a more distal gene. Finally, we computationally identify cis-regulatory motifs that are predictive and required for enhancer activity, as we validate experimentally. This work provides global insights into the organization of an animal regulatory genome and the make-up of enhancer sequences and confirms and generalizes principles from previous studies. All enhancer patterns are annotated manually with a controlled vocabulary and all results are available through a web interface (http://enhancers.starklab.org), including the raw images of all microscopy slides for manual inspection at arbitrary zoom levels.

  11. Genomic variation of subseafloor archaeal and bacterial populations from venting fluids at the Mid-Cayman Rise

    Science.gov (United States)

    Anderson, R. E.; Eren, A. M.; Stepanauskas, R.; Huber, J. A.; Reveillaud, J.

    2015-12-01

    Deep-sea hydrothermal vent systems serve as windows to a dynamic, gradient-dominated deep biosphere that is home to a wide diversity of archaea, bacteria, and viruses. Until recently the majority of these microbial lineages were uncultivated, resulting in a poor understanding of how the physical and geochemical context shapes microbial evolution in the deep subsurface. By comparing metagenomes, metatranscriptomes and single-cell genomes between geologically distinct vent fields, we can better understand the relationship between the environment and the evolution of subsurface microbial communities. An ideal setting in which to use this approach is the Mid-Cayman Rise, located on the world's deepest and slowest-spreading mid-ocean ridge, which hosts both the mafic-influenced Piccard and ultramafic-influenced Von Damm vent fields. Previous work has shown that Von Damm has higher taxonomic and metabolic diversity than Piccard, consistent with geochemical model expectations, and the fluids from all vents are enriched in hydrogen (Reveillaud et al., submitted). Mapping of both metagenomes and metatranscriptomes to a combined assembly showed very little overlap among the Von Damm samples, indicating substantial variability that is consistent with the diversity of potential metabolites in this ultramafic vent field. In contrast, the most consistently abundant and active lineage across the Piccard samples was Sulfurovum, a sulfur-oxidizing chemolithotroph that uses nitrate or oxygen as an electron acceptor. Moreover, analysis of point mutations within individual lineages suggested that Sulfurovum at Piccard is under strong selection, whereas microbial genomes at Von Damm were more variable. These results are consistent with the hypothesis that the subsurface environment at Piccard supports the emergence of a dominant lineage that is under strong selection pressure, whereas the more geochemically diverse microbial habitat at Von Damm creates a wider variety of stable

  12. In Vitro Whole Genome DNA Binding Analysis of the Bacterial Replication Initiator and Transcription Factor DnaA.

    Directory of Open Access Journals (Sweden)

    Janet L Smith

    2015-05-01

    Full Text Available DnaA, the replication initiation protein in bacteria, is an AAA+ ATPase that binds and hydrolyzes ATP and exists in a heterogeneous population of ATP-DnaA and ADP-DnaA. DnaA binds cooperatively to the origin of replication and several other chromosomal regions, and functions as a transcription factor at some of these regions. We determined the binding properties of Bacillus subtilis DnaA to genomic DNA in vitro at single nucleotide resolution using in vitro DNA affinity purification and deep sequencing (IDAP-Seq). We used these data to identify 269 binding regions, refine the consensus sequence of the DnaA binding site, and compare the relative affinity of binding regions for ATP-DnaA and ADP-DnaA. Most sites had a slightly higher affinity for ATP-DnaA than ADP-DnaA, but a few had a strong preference for binding ATP-DnaA. Of the 269 sites, only the eight strongest binding ones have been observed to bind DnaA in vivo, suggesting that other cellular factors or the amount of available DnaA in vivo restricts DnaA binding to these additional sites. Conversely, we found several chromosomal regions that were bound by DnaA in vivo but not in vitro, and that the nucleoid-associated protein Rok was required for binding in vivo. Our in vitro characterization of the inherent ability of DnaA to bind the genome at single nucleotide resolution provides a backdrop for interpreting data on in vivo binding and regulation of DnaA, and is an approach that should be adaptable to many other DNA binding proteins.

  13. A genome-scale CRISPR-Cas9 screening method for protein stability reveals novel regulators of Cdc25A

    Science.gov (United States)

    Wu, Yuanzhong; Zhou, Liwen; Wang, Xin; Lu, Jinping; Zhang, Ruhua; Liang, Xiaoting; Wang, Li; Deng, Wuguo; Zeng, Yi-Xin; Huang, Haojie; Kang, Tiebang

    2016-01-01

    The regulation of stability is particularly crucial for unstable proteins in cells. However, a convenient and unbiased method of identifying regulators of protein stability remains to be developed. Recently, a genome-scale CRISPR-Cas9 library has been established as a genetic tool to mediate loss-of-function screening. Here, we developed a protein stability regulators screening assay (Pro-SRSA) by combining the whole-genome CRISPR-Cas9 library with a dual-fluorescence-based protein stability reporter and high-throughput sequencing to screen for regulators of protein stability. Using Cdc25A as an example, Cul4B-DDB1DCAF8 was identified as a new E3 ligase for Cdc25A. Moreover, the acetylation of Cdc25A at lysine 150, which was acetylated by p300/CBP and deacetylated by HDAC3, prevented the ubiquitin-mediated degradation of Cdc25A by the proteasome. This is the first study to report that acetylation, as a novel posttranslational modification, modulates Cdc25A stability, and we suggest that this unbiased CRISPR-Cas9 screening method at the genome scale may be widely used to globally identify regulators of protein stability. PMID:27462461

  14. Genome-scale reconstruction of the Streptococcus pyogenes M49 metabolic network reveals growth requirements and indicates potential drug targets.

    Science.gov (United States)

    Levering, Jennifer; Fiedler, Tomas; Sieg, Antje; van Grinsven, Koen W A; Hering, Silvio; Veith, Nadine; Olivier, Brett G; Klett, Lara; Hugenholtz, Jeroen; Teusink, Bas; Kreikemeyer, Bernd; Kummer, Ursula

    2016-08-20

    Genome-scale metabolic models comprise stoichiometric relations between metabolites, as well as associations between genes and metabolic reactions and facilitate the analysis of metabolism. We computationally reconstructed the metabolic network of the lactic acid bacterium Streptococcus pyogenes M49. Initially, we based the reconstruction on genome annotations and already existing and curated metabolic networks of Bacillus subtilis, Escherichia coli, Lactobacillus plantarum and Lactococcus lactis. This initial draft was manually curated with the final reconstruction accounting for 480 genes associated with 576 reactions and 558 metabolites. In order to constrain the model further, we performed growth experiments of wild type and arcA deletion strains of S. pyogenes M49 in a chemically defined medium and calculated nutrient uptake and production fluxes. We additionally performed amino acid auxotrophy experiments to test the consistency of the model. The established genome-scale model can be used to understand the growth requirements of the human pathogen S. pyogenes and define optimal and suboptimal conditions, but also to describe differences and similarities between S. pyogenes and related lactic acid bacteria such as L. lactis in order to find strategies to reduce the growth of the pathogen and propose drug targets. PMID:26970054
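
    As a rough illustration of the "stoichiometric relations between metabolites" that such models encode, the sketch below checks whether a flux vector leaves every internal metabolite balanced (S·v = 0). The three-reaction network, its names, and the flux values are hypothetical and are not taken from the S. pyogenes M49 reconstruction.

```python
# Toy stoichiometric model (hypothetical, for illustration only):
#   uptake:  -> A
#   convert: A -> B
#   export:  B ->
# Rows are metabolites (A, B); columns are reactions (uptake, convert, export).
S = [
    [1, -1, 0],   # A: produced by uptake, consumed by convert
    [0, 1, -1],   # B: produced by convert, consumed by export
]

def net_production(S, v):
    """Net production rate of each metabolite for flux vector v (computes S*v)."""
    return [sum(coef * flux for coef, flux in zip(row, v)) for row in S]

def is_steady_state(S, v, tol=1e-9):
    """True if no internal metabolite accumulates or is depleted."""
    return all(abs(rate) <= tol for rate in net_production(S, v))

print(is_steady_state(S, [2.0, 2.0, 2.0]))   # balanced fluxes
print(net_production(S, [2.0, 1.0, 1.0]))    # A accumulates in this case
```

    Real genome-scale models add flux bounds and an objective (for example, growth) on top of this balance constraint and solve the result as a linear program; that step is omitted here.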

  15. Exploring Other Genomes: Bacteria.

    Science.gov (United States)

    Flannery, Maura C.

    2001-01-01

    Points out the importance of genomes other than the human genome project and provides information on the identified bacterial genomes Pseudomonas aeruginosa, Leprosy, Cholera, Meningitis, Tuberculosis, Bubonic Plague, and plant pathogens. Considers the computer's use in genome studies. (Contains 14 references.) (YDS)

  16. Participants' recall and understanding of genomic research and large-scale data sharing.

    Science.gov (United States)

    Robinson, Jill Oliver; Slashinski, Melody J; Wang, Tao; Hilsenbeck, Susan G; McGuire, Amy L

    2013-10-01

    As genomic researchers are urged to openly share generated sequence data with other researchers, it is important to examine the utility of informed consent documents and processes, particularly as these relate to participants' engagement with and recall of the information presented to them, their objective or subjective understanding of the key elements of genomic research (e.g., data sharing), as well as how these factors influence or mediate the decisions they make. We conducted a randomized trial of three experimental informed consent documents (ICDs) with participants (n = 229) being recruited to genomic research studies; each document afforded varying control over breadth of release of genetic information. Recall and understanding, their impact on data sharing decisions, and comfort in decision making were assessed in a follow-up structured interview. Over 25% did not remember signing an ICD to participate in a genomic study, and the majority (54%) could not correctly identify with whom they had agreed to share their genomic data. However, participants felt that they understood enough to make an informed decision, and lack of recall did not impact final data sharing decisions or satisfaction with participation. These findings raise questions about the types of information participants need in order to provide valid informed consent, and whether subjective understanding and comfort with decision making are sufficient to satisfy the ethical principle of respect for persons.

  17. Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics.

    Science.gov (United States)

    Kelly, Benjamin J; Fitch, James R; Hu, Yangqiu; Corsmeier, Donald J; Zhong, Huachun; Wetzel, Amy N; Nordquist, Russell D; Newsom, David L; White, Peter

    2015-01-20

    While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/.

  18. Inter-genomic displacement via lateral gene transfer of bacterial trp operons in an overall context of vertical genealogy

    Directory of Open Access Journals (Sweden)

    Keyhani Nemat O

    2004-06-01

    Full Text Available Abstract. Background: The growing conviction that lateral gene transfer plays a significant role in prokaryote genealogy opens up a need for comprehensive evaluations of gene-enzyme systems on a case-by-case basis. Genes of tryptophan biosynthesis are frequently organized as whole-pathway operons, an attribute that is expected to facilitate multi-gene transfer in a single step. We have asked whether events of lateral gene transfer are sufficient to have obscured our ability to track the vertical genealogy that underpins tryptophan biosynthesis. Results: In 47 complete-genome Bacteria, the genes encoding the seven catalytic domains that participate in primary tryptophan biosynthesis were distinguished from any paralogs or xenologs engaged in other specialized functions. A reliable list of orthologs with carefully ascertained functional roles has thus been assembled and should be valuable as an annotation resource. The protein domains associated with primary tryptophan biosynthesis were then concatenated, yielding single amino-acid sequence strings that represent the entire tryptophan pathway. Lateral gene transfer of several whole-pathway trp operons was demonstrated by use of phylogenetic analysis. Lateral gene transfer of partial-pathway trp operons was also shown, with newly recruited genes functioning either in primary biosynthesis (rarely) or specialized metabolism (more frequently). Conclusions: (i) Concatenated tryptophan protein trees are congruent with 16S rRNA subtrees provided that the genomes represented are of sufficiently close phylogenetic spacing. There are currently seven tryptophan congruency groups in the Bacteria. Recognition of a succession of others can be expected in the near future, but ultimately these should coalesce to a single grouping that parallels the 16S rRNA tree (except for cases of lateral gene transfer). (ii) The vertical trace of evolution for tryptophan biosynthesis can be deduced. The daunting complexities engendered

  19. Large Scale Sequencing of Dothideomycetes Provides Insights into Genome Evolution and Adaptation

    Energy Technology Data Exchange (ETDEWEB)

    Haridas, Sajeet; Crous, Pedro; Binder, Manfred; Spatafora, Joseph; Grigoriev, Igor

    2015-03-16

    Dothideomycetes is the largest and most diverse class of ascomycete fungi, with 23 orders, 110 families, 1300 genera and over 19,000 known species. We present a comparative analysis of 70 Dothideomycete genomes, including over 50 that we sequenced and are as yet unpublished. This extensive sampling has almost quadrupled the previous study of 18 species and uncovered a 10-fold range of genome sizes. We were able to clarify the phylogenetic positions of several species whose origins were unclear in previous morphological and sequence comparison studies. We analyzed selected gene families including proteases, transporters and small secreted proteins and show that major differences in gene content are influenced by speciation.

  20. The Bacterial Communities of Full-Scale Biologically Active, Granular Activated Carbon Filters Are Stable and Diverse and Potentially Contain Novel Ammonia-Oxidizing Microorganisms.

    Science.gov (United States)

    LaPara, Timothy M; Hope Wilkinson, Katheryn; Strait, Jacqueline M; Hozalski, Raymond M; Sadowsky, Michael J; Hamilton, Matthew J

    2015-10-01

    The bacterial community composition of the full-scale biologically active, granular activated carbon (BAC) filters operated at the St. Paul Regional Water Services (SPRWS) was investigated using Illumina MiSeq analysis of PCR-amplified 16S rRNA gene fragments. These bacterial communities were consistently diverse (Shannon index, >4.4; richness estimates, >1,500 unique operational taxonomic units [OTUs]) throughout the duration of the 12-month study period. In addition, only modest shifts in the quantities of individual bacterial populations were observed; of the 15 most prominent OTUs, the most highly variable population (a Variovorax sp.) modulated less than 13-fold over time and less than 8-fold from filter to filter. The most prominent population in the profiles was a Nitrospira sp., representing 13 to 21% of the community. Interestingly, very few of the known ammonia-oxidizing bacteria (AOB; amoA genes, however, suggested that AOB were prominent in the bacterial communities (amoA/16S rRNA gene ratio, 1 to 10%). We conclude, therefore, that the BAC filters at the SPRWS potentially contained significant numbers of unidentified and novel ammonia-oxidizing microorganisms that possess amoA genes similar to those of previously described AOB. PMID:26209671
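
    For readers unfamiliar with the Shannon index quoted above, the following is a minimal sketch of how it is commonly computed from OTU read counts, H' = -Σ p_i ln p_i. The use of the natural logarithm and the example count vectors are assumptions for illustration; the abstract does not state the log base or provide the underlying OTU table.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over OTUs with nonzero counts.

    counts: iterable of read counts per OTU (hypothetical input format).
    """
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one read")
    return -sum((c / total) * math.log(c / total) for c in counts)

# Hypothetical communities, each with 1,500 OTUs:
even = [10] * 1500               # perfectly even: H' equals ln(1500)
skewed = [1000] + [1] * 1499     # one dominant OTU lowers H'

print(round(shannon_index(even), 2))
print(shannon_index(skewed) < shannon_index(even))
```

    With natural logarithms, a perfectly even community of 1,500 OTUs gives the maximum possible value, ln(1500), so an observed index above 4.4 is compatible with high richness combined with uneven abundances.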

  1. Toward genome-scale models of the Chinese hamster ovary cells: incentives, status and perspectives

    DEFF Research Database (Denmark)

    Kaas, Christian Schrøder; Fan, Yuzhou; Weilguny, Dietmar;

    2014-01-01

    Bioprocessing of the important Chinese hamster ovary (CHO) cell lines used for the production of biopharmaceuticals stands at the brink of several redefining events. In 2011, the field entered the genomics era, which has accelerated omics-based phenotyping of the cell lines. In this review we des...

  2. Review: High-performance computing to detect epistasis in genome scale data sets.

    Science.gov (United States)

    Upton, Alex; Trelles, Oswaldo; Cornejo-García, José Antonio; Perkins, James Richard

    2016-05-01

    It is becoming clear that most human diseases have a complex etiology that cannot be explained by single nucleotide polymorphisms (SNPs) or simple additive combinations; the general consensus is that they are caused by combinations of multiple genetic variations. The limited success of some genome-wide association studies is partly a result of this focus on single genetic markers. A more promising approach is to take into account epistasis, by considering the association of multiple SNP interactions with disease. However, as genomic data continues to grow in resolution, and genome and exome sequencing become more established, the number of combinations of variants to consider increases rapidly. Two potential solutions should be considered: the use of high-performance computing, which allows us to consider a larger number of variables, and heuristics to make the solution more tractable, essential in the case of genome sequencing. In this review, we look at different computational methods to analyse epistatic interactions within disease-related genetic data sets created by microarray technology. We also review efforts to use epistatic analysis results to produce biomarkers for diagnostic tests and give our views on future directions in this field in light of advances in sequencing technology and variants in non-coding regions.
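
    The claim that the number of variant combinations "increases rapidly" can be made concrete with binomial coefficients. The sketch below counts the distinct k-way SNP combinations an exhaustive epistasis scan would need to test; the SNP counts are arbitrary illustrative values, not figures from the review.

```python
import math

def n_interactions(n_snps, order):
    """Number of distinct SNP combinations of a given interaction order."""
    return math.comb(n_snps, order)

# Illustrative panel sizes (assumed, not from the review)
for n in (1_000, 1_000_000):
    for k in (2, 3):
        print(f"{n:>9} SNPs, order {k}: {n_interactions(n, k):,} tests")
```

    Pairwise testing grows roughly with the square of the number of SNPs and three-way testing with the cube, which is why the review points to high-performance computing and heuristics.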

  3. Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae

    DEFF Research Database (Denmark)

    Vongsangnak, Wanwipa; Olsen, Peter; Hansen, Kim;

    2008-01-01

    Background: Since ancient times the filamentous fungus Aspergillus oryzae has been used in the fermentation industry for the production of fermented sauces and the production of industrial enzymes. Recently, the genome sequence of A. oryzae with 12,074 annotated genes was released but the number...

  4. Predicting survival within the lung cancer histopathological hierarchy using a multi-scale genomic model of development.

    Directory of Open Access Journals (Sweden)

    Hongye Liu

    2006-07-01

    Full Text Available BACKGROUND: The histopathologic heterogeneity of lung cancer remains a significant confounding factor in its diagnosis and prognosis, spurring numerous recent efforts to find a molecular classification of the disease that has clinical relevance. METHODS AND FINDINGS: Molecular profiles of tumors from 186 patients representing four different lung cancer subtypes (and 17 normal lung tissue samples) were compared with a mouse lung development model using principal component analysis in both temporal and genomic domains. An algorithm for the classification of lung cancers using a multi-scale developmental framework was developed. Kaplan-Meier survival analysis was conducted for lung adenocarcinoma patient subgroups identified via their developmental association. We found multi-scale genomic similarities between four human lung cancer subtypes and the developing mouse lung that are prognostically meaningful. Significant association was observed between the localization of human lung cancer cases along the principal mouse lung development trajectory and the corresponding patient survival rate at three distinct levels of classical histopathologic resolution: among different lung cancer subtypes, among patients within the adenocarcinoma subtype, and within the stage I adenocarcinoma subclass. The earlier the genomic association between a human tumor profile and the mouse lung development sequence, the poorer the patient's prognosis. Furthermore, decomposing this principal lung development trajectory identified a gene set that was significantly enriched for pyrimidine metabolism and cell-adhesion functions specific to lung development and oncogenesis. CONCLUSIONS: From a multi-scale disease modeling perspective, the molecular dynamics of murine lung development provide an effective framework that is not only data driven but also informed by the biology of development for elucidating the mechanisms of human lung cancer biology and its clinical outcome.
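
    The Kaplan-Meier analysis mentioned in this abstract can be sketched with the standard product-limit estimator, shown below in plain Python. This is a generic textbook formulation, not the authors' code, and the follow-up times and event flags are made-up illustrative values.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up duration for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each observed death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical cohort: the patient at t=2 is censored
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

    Comparing subgroups, as the study does for adenocarcinoma patients, would mean computing one such curve per subgroup and applying a test such as the log-rank test, which is not shown here.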

  6. A potential food biopreservative, CecXJ-37N, non-covalently intercalates into the nucleotides of bacterial genomic DNA beyond membrane attack.

    Science.gov (United States)

    Liu, Dongliang; Liu, Jun; Li, Jinyao; Xia, Lijie; Yang, Jianhua; Sun, Surong; Ma, Ji; Zhang, Fuchun

    2017-02-15

    The antibacterial activities and mechanism of an amide-modified peptide CecXJ-37N were investigated in this study. CecXJ-37N showed small MICs (0.25-7.8μM) against eight harmful strains common in food industry. The α-helix proportion of CecXJ-37N increased by 11-fold in prokaryotic membrane comparable environments; cytotoxicity studies demonstrated the MHC was significantly higher than that of non-amidated isoform. Moreover, CecXJ-37N possessed stronger capacities to resist trypsin and pepsin hydrolysis within two hours. Flow cytometry and scanning electron microscopy demonstrated that CecXJ-37N induced pore-formation, morphological changes, and lysed E. coli cells. Fluorescence microscopy indicated that CecXJ-37N penetrated E. coli membrane and accumulated in cytoplasm. Further ultraviolet-visible spectroscopy suggested that CecXJ-37N changed the action mode of parental peptide interacting with bacterial genome from outside binding to a tightly non-covalent intercalation into nucleotides. Overall, this study suggested that amide-modification enhanced antimicrobial activity and reduced the cytotoxicity, thus could be potential strategies for developing novel food preservatives. PMID:27664674

  7. Integrating large-scale functional genomics data to dissect metabolic networks for hydrogen production

    Energy Technology Data Exchange (ETDEWEB)

    Harwood, Caroline S

    2012-12-17

    The goal of this project is to identify gene networks that are critical for efficient biohydrogen production by leveraging variation in gene content and gene expression in independently isolated Rhodopseudomonas palustris strains. Coexpression methods were applied to large data sets that we have collected to define probabilistic causal gene networks. To our knowledge, this is a first systems-level approach that takes advantage of strain-to-strain variability to computationally define networks critical for a particular bacterial phenotypic trait.

  8. Large-scale appearance of ultraconserved elements in tetrapod genomes and slowdown of the molecular clock.

    Science.gov (United States)

    Stephen, Stuart; Pheasant, Michael; Makunin, Igor V; Mattick, John S

    2008-02-01

    Mammalian genomes contain millions of highly conserved noncoding sequences, many of which are regulatory. The most extreme examples are the 481 ultraconserved elements (UCEs) that are identical over at least 200 bp in human, mouse, and rat and show 96% identity with chicken, which diverged approximately 310 MYA. If the substitution rate in UCEs remained constant, these elements should also be present with a high level of identity in fish (approximately 450 Myr), but this is not the case, suggesting that many appeared in the amniotes or tetrapods or that the molecular clock has slowed down in these lineages, or both. Taking advantage of the availability of multiple genomes, we identified 13,736 UCEs in the human genome that are identical over at least 100 bp in at least 3 of 5 placental mammals, including 2,189 sequences over at least 200 bp, thereby greatly expanding the repertoire of known UCEs, and investigated the evolution of these sequences in opossum, chicken, frog, and fish. We conclude that there was a massive genome-wide acquisition and expansion of UCEs during tetrapod and then amniote evolution, accompanied by a slowdown of the molecular clock, particularly in the amniotes, a process consistent with their functional exaptation in these lineages. The majority of tetrapod-specific UCEs are noncoding and associated with genes involved in regulation of transcription and development. In contrast, fish genomes contain relatively few UCEs, the majority of which are common to all bony vertebrates. These elements are different from other conserved noncoding elements and appear to be important regulatory innovations that became fixed following the emergence of vertebrates from the sea to the land.

  9. Meta-Analysis of Heterogeneous Data Sources for Genome-Scale Identification of Risk Genes in Complex Phenotypes

    DEFF Research Database (Denmark)

    Pers, Tune Hannes; Hansen, Niclas Tue; Hansen, Kasper Lage;

    2011-01-01

    Meta‐analyses of large‐scale association studies typically proceed solely within one data type and do not exploit the potential complementarities in other sources of molecular evidence. Here, we present an approach to combine heterogeneous data from genome‐wide association (GWA) studies, protein‐protein...... interaction screens, disease similarity, linkage studies, and gene expression experiments into a multi‐layered evidence network which is used to prioritize the entire protein‐coding part of the genome identifying a shortlist of candidate genes. We report specifically results on bipolar disorder, a genetically...

  10. Amplification of pico-scale DNA mediated by bacterial carrier DNA for small-cell-number transcription factor ChIP-seq

    DEFF Research Database (Denmark)

    Jakobsen, Janus S; Bagger, Frederik O; Hasemann, Marie S;

    2015-01-01

    BACKGROUND: Chromatin-Immunoprecipitation coupled with deep sequencing (ChIP-seq) is used to map transcription factor occupancy and generate epigenetic profiles genome-wide. The requirement of nano-scale ChIP DNA for generation of sequencing libraries has impeded ChIP-seq on in vivo tissues of lo......-selection for nucleosome-containing chromatin or pre-amplification of precipitated DNA, making them prone to introduce experimental biases....

  11. Bacterial and fungal genome detection PCR/NAT: discussion of the Mai 2015 distribution for external quality assessment of nucleic acid-based protocols in diagnostic medical microbiology by INSTAND e.V.

    OpenAIRE

    Reischl, U.; W. Schneider; Ehrenschwender, M; Hiergeist, A; Maaß, M.; Baier, M; Straube, E; Frangoulidis, D.; Grass, G.; von Buttlar, H; Fingerle, V.; A Sing; Jacobs, E; Reiter-Owona, I; Anders, A.

    2015-01-01

    This contribution provides an analysis report of the recent proficiency testing scheme "Bacterial and Fungal Genome Detection (PCR/NAT)". It summarizes some benchmarks and the overall assessment of results reported by all of the participating laboratories. A highly desired scheme for external quality assessment (EQAS) of molecular diagnostic methods in the field of medical microbiology was activated in 2002 by the German Society of Hygiene and Microbiology (DGHM) and is now organized by INST...

  12. Comparative analysis of the bacterial diversity in a lab-scale moving bed biofilm reactor (MBBR) applied to treat urban wastewater under different operational conditions.

    Science.gov (United States)

    Calderón, Kadiya; Martín-Pascual, Jaime; Poyatos, José Manuel; Rodelas, Belén; González-Martínez, Alejandro; González-López, Jesús

    2012-10-01

    Different types of carriers were tested as support material in a lab-scale moving bed biofilm reactor (MBBR) used to treat urban wastewater under three different conditions of hydraulic retention time (HRT) and carrier filling ratios (FR). The bacterial diversity developed on the biofilms responsible for the treatment was studied using a cultivation-independent approach based on the polymerase chain reaction-temperature gradient gel electrophoresis technique (PCR-TGGE). Cluster analysis of TGGE fingerprints showed significant differences of community structure dependent upon the different operational conditions applied. Redundancy analysis (RDA) was used to determine the relationship between the operational conditions (type of carrier, HRT, FR) and bacterial biofilm diversity, demonstrating a significant effect of FR=50%. Phylogenetic analysis of PCR-reamplified and sequenced TGGE bands revealed that the prevalent Bacteria populations in the biofilm were related to Betaproteobacteria (46%), Firmicutes (34%), Alphaproteobacteria (14%) and Gammaproteobacteria (9%).

  13. Large-scale recoding of an arbovirus genome to rebalance its insect versus mammalian preference.

    Science.gov (United States)

    Shen, Sam H; Stauft, Charles B; Gorbatsevych, Oleksandr; Song, Yutong; Ward, Charles B; Yurovsky, Alisa; Mueller, Steffen; Futcher, Bruce; Wimmer, Eckard

    2015-04-14

    The protein synthesis machineries of two distinct phyla of the Animal kingdom, insects of Arthropoda and mammals of Chordata, have different preferences for how to best encode proteins. Nevertheless, arboviruses (arthropod-borne viruses) are capable of infecting both mammals and insects just like arboviruses that use insect vectors to infect plants. These organisms have evolved carefully balanced genomes that can efficiently use the translational machineries of different phyla, even if the phyla belong to different kingdoms. Using dengue virus as an example, we have undone the genome encoding balance and specifically shifted the encoding preference away from mammals. These mammalian-attenuated viruses grow to high titers in insect cells but low titers in mammalian cells, have dramatically increased LD50s in newborn mice, and induce high levels of protective antibodies. Recoded arboviruses with a bias toward phylum-specific expression could form the basis of a new generation of live attenuated vaccine candidates.

  14. Genome-scale investigation of phenotypically distinct but nearly clonal Trichoderma strains.

    Science.gov (United States)

    Lange, Claudia; Weld, Richard J; Cox, Murray P; Bradshaw, Rosie E; McLean, Kirstin L; Stewart, Alison; Steyaert, Johanna M

    2016-01-01

    Biological control agents (BCA) are beneficial organisms that are applied to protect plants from pests. Many fungi of the genus Trichoderma are successful BCAs but the underlying mechanisms are not yet fully understood. Trichoderma cf. atroviride strain LU132 is a remarkably effective BCA compared to T. cf. atroviride strain LU140 but these strains were found to be highly similar at the DNA sequence level. This unusual combination of phenotypic variability and high DNA sequence similarity between separately isolated strains prompted us to undertake a genome comparison study in order to identify DNA polymorphisms. We further investigated if the polymorphisms had functional effects on the phenotypes. The two strains were clearly identified as individuals, exhibiting different growth rates, conidiation and metabolism. Superior pathogen control demonstrated by LU132 depended on its faster growth, which is a prerequisite for successful distribution and competition. Genome sequencing identified only one non-synonymous single nucleotide polymorphism (SNP) between the strains. Based on this SNP, we successfully designed and validated an RFLP protocol that can be used to differentiate LU132 from LU140 and other Trichoderma strains. This SNP changed the amino acid sequence of SERF, encoded by the previously undescribed single copy gene "small EDRK-rich factor" (serf). A deletion of serf in the two strains did not lead to identical phenotypes, suggesting that, in addition to the single functional SNP between the nearly clonal Trichoderma cf. atroviride strains, other non-genomic factors contribute to their phenotypic variation. This finding is significant as it shows that genomics is an extremely useful but not exhaustive tool for the study of biocontrol complexity and for strain typing. PMID:27190719

  15. Genomic evidence of rapid, global-scale gene flow in a Sulfolobus species

    OpenAIRE

    Mao, Dominic; Grogan, Dennis

    2012-01-01

    Local populations of Sulfolobus islandicus diverge genetically with geographical separation, and this has been attributed to restricted transfer of propagules imposed by the unfavorable spatial distribution of acidic geothermal habitat. We tested the generality of genetic divergence with distance in Sulfolobus species by analyzing genomes of Sulfolobus acidocaldarius drawn from three populations separated by more than 8000 km. In sharp contrast to S. islandicus, the geographically diverse S. ...

  16. Genome-scale DNA methylome and transcriptome profiling of human neutrophils

    OpenAIRE

    Chatterjee, Aniruddha; Stockwell, Peter A.; Rodger, Euan J.; Ian M Morison

    2016-01-01

    Methylation of DNA molecules is a key mechanism associated with human disease, altered gene expression and phenotype. Using reduced representation bisulphite sequencing (RRBS) technology we have analysed DNA methylation patterns in healthy individuals and identified genes showing significant inter-individual variation. Further, using whole genome transcriptome analysis (RNA-Seq) on the same individuals we showed a local and specific relationship of exon inclusion and variable DNA methylation ...

  17. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient

    OpenAIRE

    Loraine Ann; Hung Yeung; Salmi Mari L; Chang Chunqi; Yao Jianchao; Roux Stanley J

    2008-01-01

    Abstract Background Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An eff...

  18. Emergence of Competitive Dominant Ammonia-Oxidizing Bacterial Populations in a Full-Scale Industrial Wastewater Treatment Plant

    OpenAIRE

    Layton, Alice C.; Dionisi, Hebe; Kuo, H.-W.; Robinson, Kevin G.; Garrett, Victoria M.; Meyers, Arthur; Sayler, Gary S.

    2005-01-01

    Ammonia-oxidizing bacterial populations in an industrial wastewater treatment plant were investigated with amoA and 16S rRNA gene real-time PCR assays. Nitrosomonas nitrosa initially dominated, but over time RI-27-type ammonia oxidizers, also within the Nitrosomonas communis lineage, increased from below detection to codominance. This shift occurred even though nitrification remained constant.

  19. Construction of a Genome-Scale Metabolic Model of Arthrospira platensis NIES-39 and Metabolic Design for Cyanobacterial Bioproduction.

    Directory of Open Access Journals (Sweden)

    Katsunori Yoshikawa

    Full Text Available Arthrospira (Spirulina) platensis is a promising feedstock and host strain for bioproduction because of its high accumulation of glycogen and superior characteristics for industrial production. Metabolic simulation using a genome-scale metabolic model and flux balance analysis is a powerful method that can be used to design metabolic engineering strategies for the improvement of target molecule production. In this study, we constructed a genome-scale metabolic model of A. platensis NIES-39 including 746 metabolic reactions and 673 metabolites, and developed novel strategies to improve the production of valuable metabolites, such as glycogen and ethanol. The simulation results obtained using the metabolic model showed high consistency with experimental results for growth rates under several trophic conditions and growth capabilities on various organic substrates. The metabolic model was further applied to design a metabolic network to improve the autotrophic production of glycogen and ethanol. Decreased flux of reactions related to the TCA cycle and phosphoenolpyruvate reaction were found to improve glycogen production. Furthermore, in silico knockout simulation indicated that deletion of genes related to the respiratory chain, such as NAD(P)H dehydrogenase and cytochrome-c oxidase, could enhance ethanol production by using ammonium as a nitrogen source.

  20. Construction of a Genome-Scale Metabolic Model of Arthrospira platensis NIES-39 and Metabolic Design for Cyanobacterial Bioproduction.

    Science.gov (United States)

    Yoshikawa, Katsunori; Aikawa, Shimpei; Kojima, Yuta; Toya, Yoshihiro; Furusawa, Chikara; Kondo, Akihiko; Shimizu, Hiroshi

    2015-01-01

    Arthrospira (Spirulina) platensis is a promising feedstock and host strain for bioproduction because of its high accumulation of glycogen and superior characteristics for industrial production. Metabolic simulation using a genome-scale metabolic model and flux balance analysis is a powerful method that can be used to design metabolic engineering strategies for the improvement of target molecule production. In this study, we constructed a genome-scale metabolic model of A. platensis NIES-39 including 746 metabolic reactions and 673 metabolites, and developed novel strategies to improve the production of valuable metabolites, such as glycogen and ethanol. The simulation results obtained using the metabolic model showed high consistency with experimental results for growth rates under several trophic conditions and growth capabilities on various organic substrates. The metabolic model was further applied to design a metabolic network to improve the autotrophic production of glycogen and ethanol. Decreased flux of reactions related to the TCA cycle and phosphoenolpyruvate reaction were found to improve glycogen production. Furthermore, in silico knockout simulation indicated that deletion of genes related to the respiratory chain, such as NAD(P)H dehydrogenase and cytochrome-c oxidase, could enhance ethanol production by using ammonium as a nitrogen source. PMID:26640947

  1. The RAVEN toolbox and its use for generating a genome-scale metabolic model for Penicillium chrysogenum.

    Directory of Open Access Journals (Sweden)

    Rasmus Agren

    Full Text Available We present the RAVEN (Reconstruction, Analysis and Visualization of Metabolic Networks) Toolbox: a software suite that allows for semi-automated reconstruction of genome-scale models. It makes use of published models and/or the KEGG database, coupled with extensive gap-filling and quality control features. The software suite also contains methods for visualizing simulation results and omics data, as well as a range of methods for performing simulations and analyzing the results. The software is a useful tool for system-wide data analysis in a metabolic context and for streamlined reconstruction of metabolic networks based on protein homology. The RAVEN Toolbox workflow was applied in order to reconstruct a genome-scale metabolic model for the important microbial cell factory Penicillium chrysogenum Wisconsin54-1255. The model was validated in a bibliomic study of in total 440 references, and it comprises 1471 unique biochemical reactions and 1006 ORFs. It was then used to study the roles of ATP and NADPH in the biosynthesis of penicillin, and to identify potential metabolic engineering targets for maximization of penicillin production.

  2. The RAVEN toolbox and its use for generating a genome-scale metabolic model for Penicillium chrysogenum.

    Science.gov (United States)

    Agren, Rasmus; Liu, Liming; Shoaie, Saeed; Vongsangnak, Wanwipa; Nookaew, Intawat; Nielsen, Jens

    2013-01-01

    We present the RAVEN (Reconstruction, Analysis and Visualization of Metabolic Networks) Toolbox: a software suite that allows for semi-automated reconstruction of genome-scale models. It makes use of published models and/or the KEGG database, coupled with extensive gap-filling and quality control features. The software suite also contains methods for visualizing simulation results and omics data, as well as a range of methods for performing simulations and analyzing the results. The software is a useful tool for system-wide data analysis in a metabolic context and for streamlined reconstruction of metabolic networks based on protein homology. The RAVEN Toolbox workflow was applied in order to reconstruct a genome-scale metabolic model for the important microbial cell factory Penicillium chrysogenum Wisconsin54-1255. The model was validated in a bibliomic study of in total 440 references, and it comprises 1471 unique biochemical reactions and 1006 ORFs. It was then used to study the roles of ATP and NADPH in the biosynthesis of penicillin, and to identify potential metabolic engineering targets for maximization of penicillin production.

  3. A genome-scale integration and analysis of Lactococcus lactis translation data.

    Directory of Open Access Journals (Sweden)

    Julien Racle

    Full Text Available Protein synthesis is a template polymerization process composed of three main steps: initiation, elongation, and termination. During translation, ribosomes are engaged into polysomes whose size is used for the quantitative characterization of the translatome. However, simultaneous transcription and translation in the bacterial cytosol complicates the analysis of translatome data. We established a procedure for robust estimation of the ribosomal density in hundreds of genes from Lactococcus lactis polysome size measurements. We used a mechanistic model of translation to integrate the information about the ribosomal density and, for the first time, we estimated the protein synthesis rate for each gene and identified the rate-limiting steps. Contrary to conventional considerations, we find a significant number of genes to be elongation limited. This number increases during stress conditions compared to optimal growth, and proteins synthesized at maximum rate are predominantly elongation limited. Consistent with bacterial physiology, we found proteins with similar rate and control characteristics belonging to the same functional categories. Under stress conditions, we found that the synthesis rate of regulatory proteins becomes comparable to that of proteins favored under optimal growth. These findings suggest that the coupling of metabolic states and protein synthesis is more important than previously thought.

  4. Genetical genomics reveals large scale genotype-by-environment interactions in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    L. Basten eSnoek

    2013-01-01

    Full Text Available One of the major goals of quantitative genetics is to unravel the complex interactions between molecular genetic factors and the environment. The effects of these genotype-by-environment interactions also affect and cause variation in gene expression. The regulatory loci responsible for this variation can be found by genetical genomics that involves the mapping of quantitative trait loci (QTLs) for gene expression traits, also called expression QTLs (eQTLs). Most genetical genomics experiments published so far are performed in a single environment and hence do not allow investigation of the role of genotype-by-environment interactions. Furthermore, most studies have been done in a steady state environment leading to acclimated expression patterns. However, a response to the environment or change therein can be highly plastic and possibly lead to more and larger differences between genotypes. Here we present a genetical genomics study on 120 Arabidopsis thaliana, Landsberg erecta x Cape Verde Islands, recombinant inbred lines (RILs) in active response to the environment by treating them with 3 hours of shade. The results of this experiment are compared to a previous study on seedlings of the same RILs from a steady state environment. The combination of two highly different conditions but exactly the same RILs with a fixed genetic variation showed the large role of genotype-by-environment interactions on gene expression levels. We found environment-dependent hotspots of transcript regulation. The major hotspot was confirmed by the expression profile of a near isogenic line. Our combined analysis leads us to propose CSN5A, a COP9 signalosome component, as a candidate regulator for the gene expression response to shade.

  5. Inferring noncoding RNA families and classes by means of genome-scale structure-based clustering.

    Directory of Open Access Journals (Sweden)

    Sebastian Will

    2007-04-01

    Full Text Available The RFAM database defines families of ncRNAs by means of sequence similarities that are sufficient to establish homology. In some cases, such as microRNAs and box H/ACA snoRNAs, functional commonalities define classes of RNAs that are characterized by structural similarities, and typically consist of multiple RNA families. Recent advances in high-throughput transcriptomics and comparative genomics have produced very large sets of putative noncoding RNAs and regulatory RNA signals. For many of them, evidence for stabilizing selection acting on their secondary structures has been derived, and at least approximate models of their structures have been computed. The overwhelming majority of these hypothetical RNAs cannot be assigned to established families or classes. We present here a structure-based clustering approach that is capable of extracting putative RNA classes from genome-wide surveys for structured RNAs. The LocARNA (local alignment of RNA) tool implements a novel variant of the Sankoff algorithm that is sufficiently fast to deal with several thousand candidate sequences. The method is also robust against false positive predictions, i.e., a contamination of the input data with unstructured or nonconserved sequences. We have successfully tested the LocARNA-based clustering approach on the sequences of the RFAM-seed alignments. Furthermore, we have applied it to a previously published set of 3,332 predicted structured elements in the Ciona intestinalis genome (Missal K, Rose D, Stadler PF (2005) Noncoding RNAs in Ciona intestinalis. Bioinformatics 21 (Supplement 2): i77-i78). In addition to recovering, e.g., tRNAs as a structure-based class, the method identifies several RNA families, including microRNA and snoRNA candidates, and suggests several novel classes of ncRNAs for which to date no representative has been experimentally characterized.

  6. Estimating demographic parameters from large-scale population genomic data using Approximate Bayesian Computation

    Directory of Open Access Journals (Sweden)

    Li Sen

    2012-03-01

    Full Text Available Abstract Background The Approximate Bayesian Computation (ABC) approach has been used to infer demographic parameters for numerous species, including humans. However, most applications of ABC still use limited amounts of data, from a small number of loci, compared to the large amount of genome-wide population-genetic data which have become available in the last few years. Results We evaluated the performance of the ABC approach for three 'population divergence' models - similar to the 'isolation with migration' model - when the data consists of several hundred thousand SNPs typed for multiple individuals by simulating data from known demographic models. The ABC approach was used to infer demographic parameters of interest and we compared the inferred values to the true parameter values that were used to generate hypothetical "observed" data. For all three case models, the ABC approach inferred most demographic parameters quite well with narrow credible intervals, for example, population divergence times and past population sizes, but some parameters were more difficult to infer, such as population sizes at present and migration rates. We compared the ability of different summary statistics to infer demographic parameters, including haplotype and LD based statistics, and found that the accuracy of the parameter estimates can be improved by combining summary statistics that capture different parts of information in the data. Furthermore, our results suggest that poor choices of prior distributions can in some circumstances be detected using ABC. Finally, increasing the amount of data beyond some hundred loci will substantially improve the accuracy of many parameter estimates using ABC. Conclusions We conclude that the ABC approach can accommodate realistic genome-wide population genetic data, which may be difficult to analyze with full likelihood approaches, and that the ABC can provide accurate and precise inference of demographic parameters from

  7. Cloning of the Koi Herpesvirus Genome as an Infectious Bacterial Artificial Chromosome Demonstrates That Disruption of the Thymidine Kinase Locus Induces Partial Attenuation in Cyprinus carpio koi

    Science.gov (United States)

    Costes, B.; Fournier, G.; Michel, B.; Delforge, C.; Raj, V. Stalin; Dewals, B.; Gillet, L.; Drion, P.; Body, A.; Schynts, F.; Lieffrig, F.; Vanderplasschen, A.

    2008-01-01

    Koi herpesvirus (KHV) is the causative agent of a lethal disease in koi and common carp. In the present study, we describe the cloning of the KHV genome as a stable and infectious bacterial artificial chromosome (BAC) clone that can be used to produce KHV recombinant strains. This goal was achieved by the insertion of a loxP-flanked BAC cassette into the thymidine kinase (TK) locus. This insertion led to a BAC plasmid that was stably maintained in bacteria and was able to regenerate virions when permissive cells were transfected with the plasmid. Reconstituted virions free of the BAC cassette but carrying a disrupted TK locus (the FL BAC-excised strain) were produced by the transfection of Cre recombinase-expressing cells with the BAC. Similarly, virions with a wild-type revertant TK sequence (the FL BAC revertant strain) were produced by the cotransfection of cells with the BAC and a DNA fragment encoding the wild-type TK sequence. Reconstituted recombinant viruses were compared to the wild-type parental virus in vitro and in vivo. The FL BAC revertant strain and the FL BAC-excised strain replicated comparably to the parental FL strain. The FL BAC revertant strain induced KHV infection in koi carp that was indistinguishable from that induced by the parental strain, while the FL BAC-excised strain exhibited a partially attenuated phenotype. Finally, the usefulness of the KHV BAC for recombination studies was demonstrated by the production of an ORF16-deleted strain by using prokaryotic recombination technology. The availability of the KHV BAC is an important advance that will allow the study of viral genes involved in KHV pathogenesis, as well as the production of attenuated recombinant candidate vaccines. PMID:18337580

  8. A three-scale analysis of bacterial communities involved in rocks colonization and soil formation in high mountain environments.

    Science.gov (United States)

    Esposito, Alfonso; Ciccazzo, Sonia; Borruso, Luigimaria; Zerbe, Stefan; Daffonchio, Daniele; Brusetti, Lorenzo

    2013-10-01

    Alpha and beta diversities of the bacterial communities growing on rock surfaces, proto-soils, riparian sediments, lichen thalli, and water springs biofilms in a glacier foreland were studied. We used three molecular based techniques to allow a deeper investigation at different taxonomic resolutions: denaturing gradient gel electrophoresis, length heterogeneity-PCR, and automated ribosomal intergenic spacer analysis. Bacterial communities were mainly composed of Acidobacteria, Proteobacteria, and Cyanobacteria with distinct variations among sites. Proteobacteria were more represented in sediments, biofilms, and lichens; Acidobacteria were mostly found in proto-soils; and Cyanobacteria on rocks. Firmicutes and Bacteroidetes were mainly found in biofilms. UniFrac P values confirmed a significant difference among different matrices. Significant differences (P < 0.001) in beta diversity were observed among the different matrices at the genus-species level, except for lichens and rocks which shared a more similar community structure, while at deep taxonomic resolution two distinct bacterial communities between lichens and rocks were found. PMID:23712376

  9. Chromosome Scale Genome Assembly and Transcriptome Profiling of Nannochloropsis gaditana in Nitrogen Depletion

    Institute of Scientific and Technical Information of China (English)

    2014-01-01

    Nannochloropsis is rapidly emerging as a model organism for the study of biofuel production in microalgae. Here, we report a high-quality genomic assembly of Nannochloropsis gaditana, consisting of large contigs, up to 500 kbp long, and scaffolds that in most cases span the entire length of the chromosomes. We identified 10646 complete genes and characterized possible alternative transcripts. The annotation of the predicted genes and the analysis of cellular processes revealed traits relevant for the genetic improvement of this organism such as genes involved in DNA recombination, RNA silencing, and cell wall synthesis. We also analyzed the modification of the transcriptional profile in nitrogen deficiency, a condition known to stimulate lipid accumulation. While the content of lipids increased, we did not detect major changes in expression of the genes involved in their biosynthesis. At the same time, we observed a very significant down-regulation of mitochondrial gene expression, suggesting that part of the Acetyl-CoA and NAD(P)H, normally oxidized through the mitochondrial respiration, would be made available for fatty acids synthesis, increasing the flux through the lipid biosynthetic pathway. Finally, we released an information resource of the genomic data of N. gaditana, available online at www.nannochloropsis.org.

  10. Perspectives on clinical informatics: integrating large-scale clinical, genomic, and health information for clinical care.

    Science.gov (United States)

    Choi, In Young; Kim, Tae-Min; Kim, Myung Shin; Mun, Seong K; Chung, Yeun-Jun

    2013-12-01

    The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population.

  11. Large-scale analysis of tandem repeat variability in the human genome.

    Science.gov (United States)

    Duitama, Jorge; Zablotskaya, Alena; Gemayel, Rita; Jansen, An; Belet, Stefanie; Vermeesch, Joris R; Verstrepen, Kevin J; Froyen, Guy

    2014-05-01

    Tandem repeats are short DNA sequences that are repeated head-to-tail with a propensity to be variable. They constitute a significant proportion of the human genome, also occurring within coding and regulatory regions. Variation in these repeats can alter the function and/or expression of genes allowing organisms to swiftly adapt to novel environments. Importantly, some repeat expansions have also been linked to certain neurodegenerative diseases. Therefore, accurate sequencing of tandem repeats could contribute to our understanding of common phenotypic variability and might uncover missing genetic factors in idiopathic clinical conditions. However, despite long-standing evidence for the functional role of repeats, they are largely ignored because of technical limitations in sequencing, mapping and typing. Here, we report on a novel capture technique and data filtering protocol that allowed simultaneous sequencing of thousands of tandem repeats in the human genomes of a three-generation family using GS-FLX-plus Titanium technology. Our results demonstrated that up to 7.6% of tandem repeats in this family (4% in coding sequences) differ from the reference sequence, and identified a de novo variation in the family tree. The method opens new routes to look at this underappreciated type of genetic variability, including the identification of novel disease-related repeats.

  12. Draft Genome Sequence of Two Sphingopyxis sp. Strains, Dominant Members of the Bacterial Community Associated with a Drinking Water Distribution System Simulator

    Science.gov (United States)

    We report the draft genome of two Sphingopyxis spp. strains isolated from a chloraminated drinking water distribution system simulator. Both strains are ubiquitous residents and early colonizers of water distribution systems. Genomic annotation identified a class 1 integron (in...

  13. A Genome-Scale Resource for the Functional Characterization of Arabidopsis Transcription Factors

    Directory of Open Access Journals (Sweden)

    Jose L. Pruneda-Paz

    2014-07-01

    Full Text Available Extensive transcriptional networks play major roles in cellular and organismal functions. Transcript levels are in part determined by the combinatorial and overlapping functions of multiple transcription factors (TFs) bound to gene promoters. Thus, TF-promoter interactions provide the basic molecular wiring of transcriptional regulatory networks. In plants, discovery of the functional roles of TFs is limited by an increased complexity of network circuitry due to a significant expansion of TF families. Here, we present the construction of a comprehensive collection of Arabidopsis TF clones created to provide a versatile resource for uncovering TF biological functions. We leveraged this collection by implementing a high-throughput DNA binding assay and identified direct regulators of a key clock gene (CCA1) that provide molecular links between different signaling modules and the circadian clock. The resources introduced in this work will significantly contribute to a better understanding of the transcriptional regulatory landscape of plant genomes.

  14. Genome-scale DNA variant analysis and functional validation of a SNP underlying yellow fruit color in wild strawberry

    Science.gov (United States)

    Hawkins, Charles; Caruana, Julie; Schiksnis, Erin; Liu, Zhongchi

    2016-01-01

    Fragaria vesca is a species of diploid strawberry being developed as a model for the octoploid garden strawberry. This work sequenced and compared the genomes of three F. vesca accessions: ‘Hawaii 4’, ‘Rügen’, and ‘Yellow Wonder’. Genome-scale analyses of shared and distinct SNPs among these three accessions have revealed that ‘Rügen’ and ‘Yellow Wonder’ are more similar to each other than they are to ‘Hawaii 4’. Though all three accessions are inbred seven generations, each accession still possesses extensive heterozygosity, highlighting the inherent differences between individual plants even of the same accession. The identification of the impact of each SNP as well as the large number of Indel markers provides a foundation for locating candidate mutations underlying phenotypic variations among these F. vesca accessions and for mapping new mutations generated through forward genetics screens. Through systematic analysis of SNP variants affecting genes in anthocyanin biosynthesis and regulation, a candidate SNP in FveMYB10 was identified and then functionally confirmed to be responsible for the yellow color fruits made by many F. vesca accessions. As a whole, this study provides further resources for F. vesca and establishes a foundation for linking traits of economic importance to specific genes and variants. PMID:27377763

  15. The roles of whole-genome and small-scale duplications in the functional specialization of Saccharomyces cerevisiae genes.

    Directory of Open Access Journals (Sweden)

    Mario A Fares

    Full Text Available Researchers have long been enthralled with the idea that gene duplication can generate novel functions, crediting this process with great evolutionary importance. Empirical data shows that whole-genome duplications (WGDs) are more likely to be retained than small-scale duplications (SSDs), though their relative contribution to the functional fate of duplicates remains unexplored. Using the map of genetic interactions and the re-sequencing of 27 Saccharomyces cerevisiae genomes evolving for 2,200 generations we show that SSD-duplicates lead to neo-functionalization while WGD-duplicates partition ancestral functions. This conclusion is supported by: (a) SSD-duplicates establish more genetic interactions than singletons and WGD-duplicates; (b) SSD-duplicates copies share more interaction-partners than WGD-duplicates copies; (c) WGD-duplicates interaction partners are more functionally related than SSD-duplicates partners; (d) SSD-duplicates gene copies are more functionally divergent from one another, while keeping more overlapping functions, and diverge in their sub-cellular locations more than WGD-duplicates copies; and (e) SSD-duplicates complement their functions to a greater extent than WGD-duplicates. We propose a novel model that uncovers the complexity of evolution after gene duplication.

  16. Genome-scale DNA variant analysis and functional validation of a SNP underlying yellow fruit color in wild strawberry.

    Science.gov (United States)

    Hawkins, Charles; Caruana, Julie; Schiksnis, Erin; Liu, Zhongchi

    2016-01-01

    Fragaria vesca is a species of diploid strawberry being developed as a model for the octoploid garden strawberry. This work sequenced and compared the genomes of three F. vesca accessions: 'Hawaii 4', 'Rügen', and 'Yellow Wonder'. Genome-scale analyses of shared and distinct SNPs among these three accessions have revealed that 'Rügen' and 'Yellow Wonder' are more similar to each other than they are to 'Hawaii 4'. Though all three accessions are inbred seven generations, each accession still possesses extensive heterozygosity, highlighting the inherent differences between individual plants even of the same accession. The identification of the impact of each SNP as well as the large number of Indel markers provides a foundation for locating candidate mutations underlying phenotypic variations among these F. vesca accessions and for mapping new mutations generated through forward genetics screens. Through systematic analysis of SNP variants affecting genes in anthocyanin biosynthesis and regulation, a candidate SNP in FveMYB10 was identified and then functionally confirmed to be responsible for the yellow color fruits made by many F. vesca accessions. As a whole, this study provides further resources for F. vesca and establishes a foundation for linking traits of economic importance to specific genes and variants. PMID:27377763

  17. A large-scale genomic approach affords unprecedented resolution for the molecular epidemiology and evolutionary history of contagious caprine pleuropneumonia.

    Science.gov (United States)

    Dupuy, Virginie; Verdier, Axel; Thiaucourt, François; Manso-Silván, Lucía

    2015-01-01

    Contagious caprine pleuropneumonia (CCPP), caused by Mycoplasma capricolum subsp. capripneumoniae (Mccp), is a devastating disease of domestic goats and of some wild ungulate species. The disease is currently spreading in Africa and Asia and poses a serious threat to disease-free areas. A comprehensive view of the evolutionary history and dynamics of Mccp is essential to understand the epidemiology of CCPP. Yet, analysing the diversity of genetically monomorphic pathogens, such as Mccp, is complicated due to their low variability. In this study, the molecular epidemiology and evolution of CCPP was investigated using a large-scale genomic approach based on next-generation sequencing technologies, applied to a sample of strains representing the global distribution of this disease. A highly discriminatory multigene typing system was developed, allowing the differentiation of 24 haplotypes among 25 Mccp strains distributed in six genotyping groups, which showed some correlation with geographic origin. A Bayesian approach was used to infer the first robust phylogeny of the species and to date the principal events of its evolutionary history. The emergence of Mccp was estimated only at about 270 years ago, which explains the low genetic diversity of this species despite its high mutation rate, evaluated at 1.3 × 10⁻⁶ substitutions per site per year. Finally, plausible scenarios were proposed to elucidate the evolution and dynamics of CCPP in Asia and Africa, though limited by the paucity of Mccp strains, particularly in Asia. This study shows how combining large-scale genomic data with spatial and temporal data makes it possible to obtain a comprehensive view of the epidemiology of CCPP, a precondition for the development of improved disease surveillance and control measures. PMID:26149260

  18. Direct Mutagenesis of Thousands of Genomic Targets using Microarray-derived Oligonucleotides

    DEFF Research Database (Denmark)

    Bonde, Mads; Kosuri, Sriram; Genee, Hans Jasper;

    2015-01-01

    Multiplex Automated Genome Engineering (MAGE) allows simultaneous mutagenesis of multiple target sites in bacterial genomes using short oligonucleotides. However, large-scale mutagenesis requires hundreds to thousands of unique oligos, which are costly to synthesize and impossible to scale up by traditional phosphoramidite column-based approaches. Here, we describe a novel method to amplify oligos from microarray chips for direct use in MAGE to perturb thousands of genomic sites simultaneously. We demonstrated the feasibility of large-scale mutagenesis by inserting T7 promoters upstream of 2585...

  19. Listeria Genomics

    Science.gov (United States)

    Cabanes, Didier; Sousa, Sandra; Cossart, Pascale

    The opportunistic intracellular foodborne pathogen Listeria monocytogenes has become a paradigm for the study of host-pathogen interactions and bacterial adaptation to mammalian hosts. Analysis of L. monocytogenes infection has provided considerable insight into how bacteria invade cells, move intracellularly, and disseminate in tissues, as well as tools to address fundamental processes in cell biology. Moreover, the vast amount of knowledge that has been gathered through in-depth comparative genomic analyses and in vivo studies makes L. monocytogenes one of the most well-studied bacterial pathogens. This chapter provides an overview of progress in the exploration of genomic, transcriptomic, and proteomic data in Listeria spp. to understand genome evolution and diversity, as well as physiological aspects of metabolism used by bacteria when growing in diverse environments, in particular in infected hosts.

  20. Direct coupling of a genome-scale microbial in silico model and a groundwater reactive transport model

    Science.gov (United States)

    Fang, Yilin; Scheibe, Timothy D.; Mahadevan, Radhakrishnan; Garg, Srinath; Long, Philip E.; Lovley, Derek R.

    2011-03-01

    The activity of microorganisms often plays an important role in dynamic natural attenuation or engineered bioremediation of subsurface contaminants, such as chlorinated solvents, metals, and radionuclides. To evaluate and/or design bioremediated systems, quantitative reactive transport models are needed. State-of-the-art reactive transport models often ignore the microbial effects or simulate the microbial effects with static growth yield and constant reaction rate parameters over simulated conditions, while in reality microorganisms can dynamically modify their functionality (such as utilization of alternative respiratory pathways) in response to spatial and temporal variations in environmental conditions. Constraint-based genome-scale microbial in silico models, using genomic data and multiple-pathway reaction networks, have been shown to be able to simulate transient metabolism of some well studied microorganisms and identify growth rate, substrate uptake rates, and byproduct rates under different growth conditions. These rates can be identified and used to replace specific microbially-mediated reaction rates in a reactive transport model using local geochemical conditions as constraints. We previously demonstrated the potential utility of integrating a constraint-based microbial metabolism model with a reactive transport simulator as applied to bioremediation of uranium in groundwater. However, that work relied on an indirect coupling approach that was effective for initial demonstration but may not be extensible to more complex problems that are of significant interest (e.g., communities of microbial species and multiple constraining variables). Here, we extend that work by presenting and demonstrating a method of directly integrating a reactive transport model (FORTRAN code) with constraint-based in silico models solved with IBM ILOG CPLEX linear optimizer base system (C library). The models were integrated with BABEL, a language interoperability tool. The

  1. Metabolic network reconstruction and genome-scale model of butanol-producing strain Clostridium beijerinckii NCIMB 8052

    Directory of Open Access Journals (Sweden)

    Kim Pan-Jun

    2011-08-01

    Background: Solventogenic clostridia offer a sustainable alternative to petroleum-based production of butanol--an important chemical feedstock and potential fuel additive or replacement. C. beijerinckii is an attractive microorganism for strain design to improve butanol production because it (i) naturally produces the highest recorded butanol concentrations as a byproduct of fermentation; and (ii) can co-ferment pentose and hexose sugars (the primary products from lignocellulosic hydrolysis). Interrogating C. beijerinckii metabolism from a systems viewpoint using constraint-based modeling allows for simulation of the global effect of genetic modifications. Results: We present the first genome-scale metabolic model (iCM925) for C. beijerinckii, containing 925 genes, 938 reactions, and 881 metabolites. To build the model we employed a semi-automated procedure that integrated genome annotation information from KEGG, BioCyc, and The SEED, and utilized computational algorithms with manual curation to improve model completeness. Interestingly, we found only a 34% overlap in reactions collected from the three databases--highlighting the importance of evaluating the predictive accuracy of the resulting genome-scale model. To validate iCM925, we conducted fermentation experiments using the NCIMB 8052 strain, and evaluated the ability of the model to simulate measured substrate uptake and product production rates. Experimentally observed fermentation profiles were found to lie within the solution space of the model; however, under an optimal growth objective, additional constraints were needed to reproduce the observed profiles--suggesting the existence of selective pressures other than optimal growth. Notably, a significantly enriched fraction of actively utilized reactions in simulations--constrained to reflect experimental rates--originated from the set of reactions that overlapped between all three databases (P = 3.52 × 10^-9, Fisher's exact test
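
    The enrichment P value quoted in this record comes from Fisher's exact test. As a rough illustration of how an overlap enrichment of this kind could be computed, the sketch below evaluates a one-sided hypergeometric tail probability in plain Python. The function name and the counts in the usage note are hypothetical and are not taken from the paper; whether the authors used a one-sided or two-sided test is not stated here.

```python
from math import comb


def enrichment_p_value(overlap, marked_total, drawn_total, universe):
    """One-sided Fisher exact (hypergeometric upper-tail) P value.

    overlap      -- marked items observed among the drawn items
    marked_total -- marked items in the whole universe (e.g. reactions
                    shared by all databases; hypothetical use)
    drawn_total  -- number of drawn items (e.g. actively used reactions)
    universe     -- total number of items
    """
    if not (0 <= marked_total <= universe and 0 <= drawn_total <= universe):
        raise ValueError("totals must lie between 0 and the universe size")
    upper = min(marked_total, drawn_total)
    # Sum the probabilities of seeing `overlap` or more marked items.
    numerator = sum(
        comb(marked_total, k) * comb(universe - marked_total, drawn_total - k)
        for k in range(overlap, upper + 1)
    )
    return numerator / comb(universe, drawn_total)
```

    For example, with made-up counts, `enrichment_p_value(5, 5, 5, 10)` is the probability that all five drawn items are marked when half of ten items are marked.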

  2. Large-scale analysis of antisense transcription in wheat using the Affymetrix GeneChip Wheat Genome Array

    Directory of Open Access Journals (Sweden)

    Settles Matthew L

    2009-05-01

    Background: Natural antisense transcripts (NATs) are transcripts of the opposite DNA strand to the sense-strand either at the same locus (cis-encoded) or a different locus (trans-encoded). They can affect gene expression at multiple stages including transcription, RNA processing and transport, and translation. NATs give rise to sense-antisense transcript pairs and the number of these identified has escalated greatly with the availability of DNA sequencing resources and public databases. Traditionally, NATs were identified by the alignment of full-length cDNAs or expressed sequence tags to genome sequences, but an alternative method for large-scale detection of sense-antisense transcript pairs involves the use of microarrays. In this study we developed a novel protocol to assay sense- and antisense-strand transcription on the 55 K Affymetrix GeneChip Wheat Genome Array, which is a 3' in vitro transcription (3'IVT) expression array. We selected five different tissue types for assay to enable maximum discovery, and used the 'Chinese Spring' wheat genotype because most of the wheat GeneChip probe sequences were based on its genomic sequence. This study is the first report of using a 3'IVT expression array to discover the expression of natural sense-antisense transcript pairs, and may be considered as proof-of-concept. Results: By using alternative target preparation schemes, both the sense- and antisense-strand derived transcripts were labeled and hybridized to the Wheat GeneChip. Quality assurance verified that successful hybridization did occur in the antisense-strand assay. A stringent threshold for positive hybridization was applied, which resulted in the identification of 110 sense-antisense transcript pairs, as well as 80 potentially antisense-specific transcripts. Strand-specific RT-PCR validated the microarray observations, and showed that antisense transcription is likely to be tissue specific. For the annotated sense

  3. Genome-wide linkage using the Social Responsiveness Scale in Utah autism pedigrees

    OpenAIRE

    Coon, Hilary; Villalobos, Michele E; Robison, Reid J.; Camp, Nicola J.; Cannon, Dale S.; Allen-Brady, Kristina; Miller, Judith S; McMahon, William M

    2010-01-01

    Background: Autism Spectrum Disorders (ASD) are phenotypically heterogeneous, characterized by impairments in the development of communication and social behaviour and the presence of repetitive behaviour and restricted interests. Dissecting the genetic complexity of ASD may require phenotypic data reflecting more detail than is offered by a categorical clinical diagnosis. Such data are available from the Social Responsiveness Scale (SRS) which is a continuous, quantitative measure of social a...

  4. Genome-wide linkage using the Social Responsiveness Scale in Utah autism pedigrees

    OpenAIRE

    Coon Hilary; Villalobos Michele E; Robison Reid J; Camp Nicola J; Cannon Dale S; Allen-Brady Kristina; Miller Judith S; McMahon William M

    2010-01-01

    Background: Autism Spectrum Disorders (ASD) are phenotypically heterogeneous, characterized by impairments in the development of communication and social behaviour and the presence of repetitive behaviour and restricted interests. Dissecting the genetic complexity of ASD may require phenotypic data reflecting more detail than is offered by a categorical clinical diagnosis. Such data are available from the Social Responsiveness Scale (SRS) which is a continuous, quantitative measure of...

  5. A general framework for association tests with multivariate traits in large-scale genomics studies.

    Science.gov (United States)

    He, Qianchuan; Avery, Christy L; Lin, Dan-Yu

    2013-12-01

    Genetic association studies often collect data on multiple traits that are correlated. Discovery of genetic variants influencing multiple traits can lead to better understanding of the etiology of complex human diseases. Conventional univariate association tests may miss variants that have weak or moderate effects on individual traits. We propose several multivariate test statistics to complement univariate tests. Our framework covers both studies of unrelated individuals and family studies and allows any type/mixture of traits. We relate the marginal distributions of multivariate traits to genetic variants and covariates through generalized linear models without modeling the dependence among the traits or family members. We construct score-type statistics, which are computationally fast and numerically stable even in the presence of covariates and which can be combined efficiently across studies with different designs and arbitrary patterns of missing data. We compare the power of the test statistics both theoretically and empirically. We provide a strategy to determine genome-wide significance that properly accounts for the linkage disequilibrium (LD) of genetic variants. The application of the new methods to the meta-analysis of five major cardiovascular cohort studies identifies a new locus (HSCB) that is pleiotropic for the four traits analyzed. PMID:24227293

  6. Mercator: a fast and simple web server for genome scale functional annotation of plant sequence data.

    Science.gov (United States)

    Lohse, Marc; Nagel, Axel; Herter, Thomas; May, Patrick; Schroda, Michael; Zrenner, Rita; Tohge, Takayuki; Fernie, Alisdair R; Stitt, Mark; Usadel, Björn

    2014-05-01

    Next-generation technologies generate an overwhelming amount of gene sequence data. Efficient annotation tools are required to make these data amenable to functional genomics analyses. The Mercator pipeline automatically assigns functional terms to protein or nucleotide sequences. It uses the MapMan 'BIN' ontology, which is tailored for functional annotation of plant 'omics' data. The classification procedure performs parallel sequence searches against reference databases, compiles the results and computes the most likely MapMan BINs for each query. In the current version, the pipeline relies on manually curated reference classifications originating from the three reference organisms (Arabidopsis, Chlamydomonas, rice), various other plant species that have a reviewed SwissProt annotation, and more than 2000 protein domain and family profiles at InterPro, CDD and KOG. Functional annotations predicted by Mercator achieve accuracies above 90% when benchmarked against manual annotation. In addition to mapping files for direct use in the visualization software MapMan, Mercator provides graphical overview charts, detailed annotation information in a convenient web browser interface and a MapMan-to-GO translation table to export results as GO terms. Mercator is available free of charge via http://mapman.gabipd.org/web/guest/app/Mercator.

  7. Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex.

    Science.gov (United States)

    Konermann, Silvana; Brigham, Mark D; Trevino, Alexandro E; Joung, Julia; Abudayyeh, Omar O; Barcena, Clea; Hsu, Patrick D; Habib, Naomi; Gootenberg, Jonathan S; Nishimasu, Hiroshi; Nureki, Osamu; Zhang, Feng

    2015-01-29

    Systematic interrogation of gene function requires the ability to perturb gene expression in a robust and generalizable manner. Here we describe structure-guided engineering of a CRISPR-Cas9 complex to mediate efficient transcriptional activation at endogenous genomic loci. We used these engineered Cas9 activation complexes to investigate single-guide RNA (sgRNA) targeting rules for effective transcriptional activation, to demonstrate multiplexed activation of ten genes simultaneously, and to upregulate long intergenic non-coding RNA (lincRNA) transcripts. We also synthesized a library consisting of 70,290 guides targeting all human RefSeq coding isoforms to screen for genes that, upon activation, confer resistance to a BRAF inhibitor. The top hits included genes previously shown to be able to confer resistance, and novel candidates were validated using individual sgRNA and complementary DNA overexpression. A gene expression signature based on the top screening hits correlated with markers of BRAF inhibitor resistance in cell lines and patient-derived samples. These results collectively demonstrate the potential of Cas9-based activators as a powerful genetic perturbation technology.

  8. Genome-scale DNA methylome and transcriptome profiling of human neutrophils

    Science.gov (United States)

    Chatterjee, Aniruddha; Stockwell, Peter A.; Rodger, Euan J.; Morison, Ian M.

    2016-01-01

    Methylation of DNA molecules is a key mechanism associated with human disease, altered gene expression and phenotype. Using reduced representation bisulphite sequencing (RRBS) technology we have analysed DNA methylation patterns in healthy individuals and identified genes showing significant inter-individual variation. Further, using whole genome transcriptome analysis (RNA-Seq) on the same individuals we showed a local and specific relationship of exon inclusion and variable DNA methylation pattern. For RRBS, 363 million 100-bp reads were generated from 13 samples using Illumina GAII and HiSeq2000 platforms. Here we also present additional RRBS data for a female pair of monozygotic twins that was not described in our original publication. Further, we performed RNA-Seq on four of these individuals, generating 174 million 51-bp high quality reads on an Illumina HiSeq2000 platform. The current data set could be exploited as a comprehensive resource for understanding the nature and mechanism of variable phenotypic traits and altered disease susceptibility due to variable DNA methylation and gene expression patterns in healthy individuals. PMID:26978482

  9. Exploring massive, genome scale datasets with the GenometriCorr package.

    Directory of Open Access Journals (Sweden)

    Alexander Favorov

    2012-05-01

    We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of high-dimensional datasets, rather than providing immediate answers. The software enables several biologically motivated approaches to these data and here we describe the rationale and implementation for each approach. Our models and statistics are implemented in an R package that efficiently calculates the spatial correlation between two sets of genomic intervals (data and/or annotated features), for use as a metric of functional interaction. The software handles any type of pointwise or interval data and instead of running analyses with predefined metrics, it computes the significance and direction of several types of spatial association; this is intended to suggest potentially relevant relationships between the datasets. Availability and implementation: The package, GenometriCorr, can be freely downloaded at http://genometricorr.sourceforge.net/. Installation guidelines and examples are available from the sourceforge repository. The package is pending submission to Bioconductor.
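
    To make "spatial correlation between two sets of genomic intervals" concrete, the sketch below computes one simple overlap statistic, the Jaccard index (covered length shared by both sets divided by covered length of their union). It is an illustrative Python stand-in, not code from the GenometriCorr R package; the function names are hypothetical, intervals are assumed to be half-open (start, end) pairs on a single chromosome, and no significance test is attempted.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def covered_length(intervals):
    """Total number of positions covered by already-merged intervals."""
    return sum(end - start for start, end in intervals)


def jaccard_index(query, reference):
    """Shared covered length divided by the union's covered length."""
    a, b = merge_intervals(query), merge_intervals(reference)
    i = j = shared = 0
    # Sweep both sorted, merged lists once and add up pairwise overlaps.
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    union = covered_length(a) + covered_length(b) - shared
    return shared / union if union else 0.0
```

    A value near 0 means the two interval sets rarely coincide and a value of 1 means they cover exactly the same positions; judging whether an observed value is surprising would additionally need a null model such as a permutation of interval positions.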

  10. Genetics and Genomics for the study of bacterial resistance

    Directory of Open Access Journals (Sweden)

    Ulises Garza-Ramos

    2009-01-01

    Bacterial resistance is a public health problem causing high rates of morbidity and mortality in hospital settings. As different antibiotics are used, bacteria resistant to multiple drugs are selected. The development of new molecular genomic and proteomic tools such as real-time PCR, DNA pyrosequencing, mass spectrometry, DNA microarrays, and bioinformatics allows for more in-depth knowledge about the physiology and structure of bacteria and the mechanisms involved in antibiotic resistance. These studies make it possible to identify new drug targets and to design specific antibiotics that provide more accurate treatments against bacterial infections. With these techniques, it is also possible to rapidly identify genes that confer resistance to antibiotics, and to recognize complex genetic structures, such as integrons, that are involved in the spread of genes that confer multidrug resistance.

  11. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    DEFF Research Database (Denmark)

    Have, Christian Theil; Mørk, Søren

    We introduce a new type of probabilistic sequence model that models the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome level, effectively producing a sequential genome annotation... using the probabilistic logic programming language and machine learning system PRISM, a fast and efficient model prototyping environment, using bacterial gene finding performance as a benchmark of signal strength. The model is used to prune a set of gene predictions from an underlying gene finder... and are evaluated by the effect on prediction performance. Since bacterial gene finding to a large extent is a solved problem it forms an ideal proving ground for evaluating the explicit modeling of larger scale gene sequence composition of genomes. We conclude that the sequential composition of gene reading frames...

  12. Putative bacterial interactions from metagenomic knowledge with an integrative systems ecology approach.

    Science.gov (United States)

    Bordron, Philippe; Latorre, Mauricio; Cortés, Maria-Paz; González, Mauricio; Thiele, Sven; Siegel, Anne; Maass, Alejandro; Eveillard, Damien

    2016-02-01

    Following the trend of studies that investigate microbial ecosystems using different metagenomic techniques, we propose a new integrative systems ecology approach that aims to decipher functional roles within a consortium through the integration of genomic and metabolic knowledge at genome scale. For the sake of application, using public genomes of five bacterial strains involved in copper bioleaching: Acidiphilium cryptum, Acidithiobacillus ferrooxidans, Acidithiobacillus thiooxidans, Leptospirillum ferriphilum, and Sulfobacillus thermosulfidooxidans, we first reconstructed a global metabolic network. Next, using a parsimony assumption, we deciphered sets of genes, called Sets from Genome Segments (SGS), that (1) are close on their respective genomes, (2) take an active part in metabolic pathways and (3) whose associated metabolic reactions are also closely connected within metabolic networks. Overall, this SGS paradigm depicts genomic functional units that emphasize respective roles of bacterial strains to catalyze metabolic pathways and environmental processes. Our analysis suggested that only few functional metabolic genes are horizontally transferred within the consortium and that no single bacterial strain can accomplish by itself the whole copper bioleaching. The use of SGS pinpoints a functional compartmentalization among the investigated species and exhibits putative bacterial interactions necessary for promoting these pathways. PMID:26677108

  13. ReacKnock: identifying reaction deletion strategies for microbial strain optimization based on genome-scale metabolic network.

    Directory of Open Access Journals (Sweden)

    Zixiang Xu

    Gene knockout has been used as a common strategy to improve microbial strains for producing chemicals. Several algorithms are available to predict the target reactions to be deleted. Most of them apply mixed integer bi-level linear programming (MIBLP) based on metabolic networks, and use duality theory to transform the bi-level optimization problem of large-scale MIBLP into single-level programming. However, the validity of the transformation was not proved. The solution of MIBLP depends on the structure of the inner problem. If the inner problem is continuous, the Karush-Kuhn-Tucker (KKT) method can be used to reformulate the MIBLP as a single-level one. We adopt the KKT technique in our algorithm ReacKnock to attack the intractable problem of solving the MIBLP, demonstrated with the genome-scale metabolic network model of E. coli for producing various chemicals such as succinate, ethanol, and threonine. Compared to the previous methods, our algorithm is fast, stable and reliable in finding the optimal solutions for all the chemical products tested, and able to provide all the alternative deletion strategies which lead to the same industrial objective.
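
    The KKT reformulation mentioned in this record can be sketched generically as follows. This is a textbook-style illustration under stated assumptions, not the exact ReacKnock formulation: S is the stoichiometric matrix, v the flux vector, c the cellular objective, l and u flux bounds, y a binary knockout vector (y_j = 0 deletes reaction j), and λ, μ^L, μ^U the multipliers; all symbols are introduced here for illustration.

```latex
% Inner (cellular) problem for a fixed knockout vector y \in \{0,1\}^n
\[
\max_{v}\; c^{\top} v
\quad \text{s.t.} \quad
S v = 0, \qquad y_j\, l_j \le v_j \le y_j\, u_j \quad \forall j .
\]

% Because the inner problem is a linear program in v, its KKT conditions
% are necessary and sufficient, so the inner "max" can be replaced by:
\[
\begin{aligned}
& S v = 0, \qquad y_j l_j \le v_j \le y_j u_j
  && \text{(primal feasibility)} \\
& c - S^{\top}\lambda + \mu^{L} - \mu^{U} = 0
  && \text{(stationarity)} \\
& \mu^{L} \ge 0, \qquad \mu^{U} \ge 0
  && \text{(dual feasibility)} \\
& \mu^{L}_j\,(v_j - y_j l_j) = 0, \qquad \mu^{U}_j\,(y_j u_j - v_j) = 0
  && \text{(complementary slackness)}
\end{aligned}
\]
```

    The outer problem then maximizes the target product flux over (y, v, λ, μ^L, μ^U) subject to these conditions, which gives a single-level problem. The complementary slackness products are nonlinear and are commonly linearized with additional binary variables; how ReacKnock handles them is not described in this abstract.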

  14. Systematic construction of kinetic models from genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Natalie J Stanford

    The quantitative effects of environmental and genetic perturbations on metabolism can be studied in silico using kinetic models. We present a strategy for large-scale model construction based on a logical layering of data such as reaction fluxes, metabolite concentrations, and kinetic constants. The resulting models contain realistic standard rate laws and plausible parameters, adhere to the laws of thermodynamics, and reproduce a predefined steady state. These features have not been simultaneously achieved by previous workflows. We demonstrate the advantages and limitations of the workflow by translating the yeast consensus metabolic network into a kinetic model. Despite crudely selected data, the model shows realistic control behaviour, a stable dynamic, and realistic response to perturbations in extracellular glucose concentrations. The paper concludes by outlining how new data can continuously be fed into the workflow and how iterative model building can assist in directing experiments.

  15. Genome-scale metabolic reconstruction and in silico analysis of methylotrophic yeast Pichia pastoris for strain improvement

    Directory of Open Access Journals (Sweden)

    Chung Bevan KS

    2010-07-01

    Background: Pichia pastoris has been recognized as an effective host for recombinant protein production. A number of studies have been reported for improving this expression system. However, its physiology and cellular metabolism still remained largely uncharacterized. Thus, it is highly desirable to establish a systems biotechnological framework, in which a comprehensive in silico model of P. pastoris can be employed together with high throughput experimental data analysis, for better understanding of the methylotrophic yeast's metabolism. Results: A fully compartmentalized metabolic model of P. pastoris (iPP668), composed of 1,361 reactions and 1,177 metabolites, was reconstructed based on its genome annotation and biochemical information. The constraints-based flux analysis was then used to predict achievable growth rate which is consistent with the cellular phenotype of P. pastoris observed during chemostat experiments. Subsequent in silico analysis further explored the effect of various carbon sources on cell growth, revealing sorbitol as a promising candidate for culturing recombinant P. pastoris strains producing heterologous proteins. Interestingly, methanol consumption yields a high regeneration rate of reducing equivalents which is substantial for the synthesis of valuable pharmaceutical precursors. Hence, as a case study, we examined the applicability of P. pastoris system to whole-cell biotransformation and also identified relevant metabolic engineering targets that have been experimentally verified. Conclusion: The genome-scale metabolic model characterizes the cellular physiology of P. pastoris, thus allowing us to gain valuable insights into the metabolism of methylotrophic yeast and devise possible strategies for strain improvement through in silico simulations. This computational approach, combined with synthetic biology techniques, potentially forms a basis for rational analysis and design of P. pastoris metabolic network

  16. Genome-scale metabolic reconstructions of Pichia stipitis and Pichia pastoris and in silico evaluation of their potentials

    Directory of Open Access Journals (Sweden)

    Caspeta Luis

    2012-04-01

    Background: Pichia stipitis and Pichia pastoris have long been investigated due to their native abilities to metabolize every sugar from lignocellulose and to modulate methanol consumption, respectively. The latter has been driving the production of several recombinant proteins. As a result, significant advances in their biochemical knowledge, as well as in genetic engineering and fermentation methods have been generated. The release of their genome sequences has allowed systems level research. Results: In this work, genome-scale metabolic models (GEMs) of P. stipitis (iSS884) and P. pastoris (iLC915) were reconstructed. iSS884 includes 1332 reactions, 922 metabolites, and 4 compartments. iLC915 contains 1423 reactions, 899 metabolites, and 7 compartments. Compared with the previous GEMs of P. pastoris, PpaMBEL1254 and iPP668, iLC915 contains more genes and metabolic functions, as well as improved predictive capabilities. Simulations of physiological responses for the growth of both yeasts on selected carbon sources using iSS884 and iLC915 closely reproduced the experimental data. Additionally, the iSS884 model was used to predict ethanol production from xylose at different oxygen uptake rates. Simulations with iLC915 closely reproduced the effect of oxygen uptake rate on physiological states of P. pastoris expressing a recombinant protein. The potential of P. stipitis for the conversion of xylose and glucose into ethanol using reactors in series, and of P. pastoris to produce recombinant proteins using mixtures of methanol and glycerol or sorbitol are also discussed. Conclusions: The first GEM of P. stipitis (iSS884) was reconstructed and validated. The expanded version of the P. pastoris GEM, iLC915, is more complete and has improved capabilities over the existing models. Both GEMs are useful frameworks to explore the versatility of these yeasts and to capitalize on their biotechnological potentials.

  17. Large-Scale Genome-Wide Association Studies and Meta-Analyses of Longitudinal Change in Adult Lung Function

    Science.gov (United States)

    Tang, Wenbo; Kowgier, Matthew; Loth, Daan W.; Soler Artigas, María; Joubert, Bonnie R.; Hodge, Emily; Gharib, Sina A.; Smith, Albert V.; Ruczinski, Ingo; Gudnason, Vilmundur; Mathias, Rasika A.; Harris, Tamara B.; Hansel, Nadia N.; Launer, Lenore J.; Barnes, Kathleen C.; Hansen, Joyanna G.; Albrecht, Eva; Aldrich, Melinda C.; Allerhand, Michael; Barr, R. Graham; Brusselle, Guy G.; Couper, David J.; Curjuric, Ivan; Davies, Gail; Deary, Ian J.; Dupuis, Josée; Fall, Tove; Foy, Millennia; Franceschini, Nora; Gao, Wei; Gläser, Sven; Gu, Xiangjun; Hancock, Dana B.; Heinrich, Joachim; Hofman, Albert; Imboden, Medea; Ingelsson, Erik; James, Alan; Karrasch, Stefan; Koch, Beate; Kritchevsky, Stephen B.; Kumar, Ashish; Lahousse, Lies; Li, Guo; Lind, Lars; Lindgren, Cecilia; Liu, Yongmei; Lohman, Kurt; Lumley, Thomas; McArdle, Wendy L.; Meibohm, Bernd; Morris, Andrew P.; Morrison, Alanna C.; Musk, Bill; North, Kari E.; Palmer, Lyle J.; Probst-Hensch, Nicole M.; Psaty, Bruce M.; Rivadeneira, Fernando; Rotter, Jerome I.; Schulz, Holger; Smith, Lewis J.; Sood, Akshay; Starr, John M.; Strachan, David P.; Teumer, Alexander; Uitterlinden, André G.; Völzke, Henry; Voorman, Arend; Wain, Louise V.; Wells, Martin T.; Wilk, Jemma B.; Williams, O. Dale; Heckbert, Susan R.; Stricker, Bruno H.; London, Stephanie J.; Fornage, Myriam; Tobin, Martin D.; O′Connor, George T.; Hall, Ian P.; Cassano, Patricia A.

    2014-01-01

    Background: Genome-wide association studies (GWAS) have identified numerous loci influencing cross-sectional lung function, but less is known about genes influencing longitudinal change in lung function. Methods: We performed GWAS of the rate of change in forced expiratory volume in the first second (FEV1) in 14 longitudinal, population-based cohort studies comprising 27,249 adults of European ancestry using linear mixed effects model and combined cohort-specific results using fixed effect meta-analysis to identify novel genetic loci associated with longitudinal change in lung function. Gene expression analyses were subsequently performed for identified genetic loci. As a secondary aim, we estimated the mean rate of decline in FEV1 by smoking pattern, irrespective of genotypes, across these 14 studies using meta-analysis. Results: The overall meta-analysis produced suggestive evidence for association at the novel IL16/STARD5/TMC3 locus on chromosome 15 (P = 5.71 × 10^-7). In addition, meta-analysis using the five cohorts with ≥3 FEV1 measurements per participant identified the novel ME3 locus on chromosome 11 (P = 2.18 × 10^-8) at genome-wide significance. Neither locus was associated with FEV1 decline in two additional cohort studies. We confirmed gene expression of IL16, STARD5, and ME3 in multiple lung tissues. Publicly available microarray data confirmed differential expression of all three genes in lung samples from COPD patients compared with controls. Irrespective of genotypes, the combined estimate for FEV1 decline was 26.9, 29.2 and 35.7 mL/year in never, former, and persistent smokers, respectively. Conclusions: In this large-scale GWAS, we identified two novel genetic loci in association with the rate of change in FEV1 that harbor candidate genes with biologically plausible functional links to lung function. PMID:24983941
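
    The step of combining cohort-specific results by fixed effect meta-analysis can be illustrated with a standard inverse-variance weighted estimator. The Python sketch below is a generic textbook version, not the study's pipeline; the function name is hypothetical, and any numbers passed to it in examples are made up rather than taken from the cohorts.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance weighted fixed-effect meta-analysis.

    estimates  -- per-cohort effect estimates (e.g. a SNP effect on the
                  rate of FEV1 change; hypothetical use)
    std_errors -- matching per-cohort standard errors (all > 0)
    Returns (pooled_estimate, pooled_se, z_score, two_sided_p).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need equally long, non-empty inputs")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled / pooled_se
    # Two-sided P value under a standard normal null distribution.
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    return pooled, pooled_se, z_score, p_value
```

    Precise cohorts (small standard errors) dominate the pooled estimate, which is why the fixed-effect model assumes that every cohort estimates the same underlying effect.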

  18. Large-scale genome-wide association studies and meta-analyses of longitudinal change in adult lung function.

    Directory of Open Access Journals (Sweden)

    Wenbo Tang

    Genome-wide association studies (GWAS) have identified numerous loci influencing cross-sectional lung function, but less is known about genes influencing longitudinal change in lung function. We performed GWAS of the rate of change in forced expiratory volume in the first second (FEV1) in 14 longitudinal, population-based cohort studies comprising 27,249 adults of European ancestry using linear mixed-effects models, and combined cohort-specific results using fixed-effect meta-analysis to identify novel genetic loci associated with longitudinal change in lung function. Gene expression analyses were subsequently performed for identified genetic loci. As a secondary aim, we estimated the mean rate of decline in FEV1 by smoking pattern, irrespective of genotypes, across these 14 studies using meta-analysis. The overall meta-analysis produced suggestive evidence for association at the novel IL16/STARD5/TMC3 locus on chromosome 15 (P = 5.71 × 10⁻⁷). In addition, meta-analysis using the five cohorts with ≥3 FEV1 measurements per participant identified the novel ME3 locus on chromosome 11 (P = 2.18 × 10⁻⁸) at genome-wide significance. Neither locus was associated with FEV1 decline in two additional cohort studies. We confirmed gene expression of IL16, STARD5, and ME3 in multiple lung tissues. Publicly available microarray data confirmed differential expression of all three genes in lung samples from COPD patients compared with controls. Irrespective of genotypes, the combined estimate for FEV1 decline was 26.9, 29.2 and 35.7 mL/year in never, former, and persistent smokers, respectively. In this large-scale GWAS, we identified two novel genetic loci in association with the rate of change in FEV1 that harbor candidate genes with biologically plausible functional links to lung function.

  19. Isolation and characterization of bovine herpesvirus 4 (BoHV-4) from a cow affected by post partum metritis and cloning of the genome as a bacterial artificial chromosome

    Directory of Open Access Journals (Sweden)

    Cavirani Sandro

    2009-08-01

    Background: Bovine herpesvirus 4 (BoHV-4) is a gammaherpesvirus with a worldwide distribution in cattle and is often isolated from the uterus of animals with postpartum metritis or pelvic inflammatory disease. Virus strain adaptation to an organ, tissue or cell type is an important issue for the pathogenesis of disease. To explore the mechanistic role of viral strain variation in uterine disease, the present study aimed to develop a tool enabling precise genetic discrimination between strains of BoHV-4 and easy manipulation of the viral genome. Methods: A strain of BoHV-4 was isolated from the uterus of a persistently infected cow and designated BoHV-4-U. The authenticity of the isolate was confirmed by RFLP-PCR and sequencing using the TK and IE2 loci as genetic marker regions for the BoHV-4 genome. The isolated genome was cloned as a bacterial artificial chromosome (BAC) and manipulated through recombineering technology. Results: The BoHV-4-U genome was successfully cloned as a BAC, and the stability of the pBAC-BoHV-4-U clone was confirmed over twenty passages, with viral growth similar to the wild type virus. The feasibility of using BoHV-4-U for mutagenesis was demonstrated using the BAC recombineering system. Conclusion: The analysis of genome strain variation is a key method for investigating genes associated with disease. A resource for dissection of the interactions between BoHV-4 and host endometrial cells was generated by cloning the genome of BoHV-4 as a BAC.

  20. A pilot-scale study of biohydrogen production from distillery effluent using defined bacterial co-culture

    Energy Technology Data Exchange (ETDEWEB)

    Vatsala, T.M.; Raj, S. Mohan; Manimaran, A. (Shri AMM Murugappa Chettiar Research Centre, Photosynthesis and Energy Division, Tharamani, Chennai, India, 600)

    2008-10-15

    We evaluated the feasibility of improving the scale of hydrogen (H₂) production from sugar cane distillery effluent using co-cultures of Citrobacter freundii 01, Enterobacter aerogenes E10 and Rhodopseudomonas palustris P2 at 100 m³ scale. The culture conditions at 100 mL and 2 L scales were optimized in minimal medium, and we observed that the co-culture of the above three strains enhanced H₂ productivity significantly. Results at the 100 m³ scale revealed a maximum of 21.38 kg of H₂, corresponding to 10692.6 mol, which was obtained through batch method at 40 h from reducing sugar (3862.3 mol) as glucose. The average yield of H₂ was 2.76 mol mol⁻¹ glucose, and the rate of H₂ production was estimated as 0.53 kg/100 m³/h. Our results demonstrate the utility of distillery effluent as a source of clean alternative energy and provide insights into treatment for industrial exploitation.
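
    The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration and is not taken from the paper: the function name and the molar mass constant are assumptions. With the standard molar mass of H₂ (2.016 g/mol), 21.38 kg works out to slightly fewer moles than the reported 10692.6 mol, which suggests the authors may have used a rounded value of about 2 g/mol. The yield (about 2.77 mol per mol glucose) and the rate (about 0.53 kg per hour for the 100 m³ batch) agree with the reported values to within rounding.

```python
# Values quoted in the abstract above.
H2_MASS_KG = 21.38          # maximum H2 obtained at the 100 m^3 scale
H2_MOL_REPORTED = 10692.6   # moles of H2 as reported
GLUCOSE_MOL = 3862.3        # reducing sugar consumed, expressed as glucose
BATCH_HOURS = 40.0          # duration of the batch run

# Assumption (not stated in the abstract): standard molar mass of H2 in g/mol.
H2_MOLAR_MASS_G_PER_MOL = 2.016


def check_h2_figures():
    """Recompute moles, yield and production rate from the quoted numbers.

    Hypothetical helper for illustration only.
    """
    # Mass to moles: kg -> g, then divide by the molar mass.
    mol_from_mass = H2_MASS_KG * 1000.0 / H2_MOLAR_MASS_G_PER_MOL
    # Yield: moles of H2 produced per mole of glucose consumed.
    yield_mol_per_mol = H2_MOL_REPORTED / GLUCOSE_MOL
    # Rate: kg of H2 per hour for the whole 100 m^3 batch.
    rate_kg_per_h = H2_MASS_KG / BATCH_HOURS
    return mol_from_mass, yield_mol_per_mol, rate_kg_per_h


if __name__ == "__main__":
    mol, y, rate = check_h2_figures()
    print(f"H2 from mass: {mol:.1f} mol (reported {H2_MOL_REPORTED} mol)")
    print(f"Yield: {y:.3f} mol H2 per mol glucose")
    print(f"Rate: {rate:.4f} kg H2 per hour per 100 m^3")
```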