WorldWideScience

Sample records for bd1 genome sequence

  1. The Bifidobacterium dentium Bd1 genome sequence reflects its genetic adaptation to the human oral cavity.

    Directory of Open Access Journals (Sweden)

    Marco Ventura

    2009-12-01

    Bifidobacteria, one of the relatively dominant components of the human intestinal microbiota, are considered one of the key groups of beneficial intestinal bacteria (probiotic bacteria). However, in addition to health-promoting taxa, the genus Bifidobacterium also includes Bifidobacterium dentium, an opportunistic cariogenic pathogen. The genetic basis for the ability of B. dentium to survive in the oral cavity and contribute to caries development is not understood. The genome of B. dentium Bd1, a strain isolated from dental caries, was sequenced to completion to uncover a single circular 2,636,368 base pair chromosome with 2,143 predicted open reading frames. Annotation of the genome sequence revealed multiple ways in which B. dentium has adapted to the oral environment through specialized nutrient acquisition, defences against antimicrobials, and gene products that increase fitness and competitiveness within the oral niche. B. dentium Bd1 was shown to metabolize a wide variety of carbohydrates, consistent with genome-based predictions, while colonization and persistence factors implicated in tissue adhesion, acid tolerance, and the metabolism of human saliva-derived compounds were also identified. Global transcriptome analysis demonstrated that many of the genes encoding these predicted traits are highly expressed under relevant physiological conditions. This is the first report to identify, through various genomic approaches, specific genetic adaptations of a Bifidobacterium taxon, Bifidobacterium dentium Bd1, to a lifestyle as a cariogenic microorganism in the oral cavity. In silico analysis and comparative genomic hybridization experiments clearly reveal a high level of genome conservation among various B. dentium strains. The data indicate that the genome of this opportunistic cariogen has evolved through a very limited number of horizontal gene acquisition events, highlighting the narrow boundaries that separate commensals from

  2. The Bifidobacterium dentium Bd1 Genome Sequence Reflects Its Genetic Adaptation to the Human Oral Cavity

    OpenAIRE

    Ventura, Marco; Turroni, Francesca; Zomer, Aldert; Foroni, Elena; Giubellini, Vanessa; Bottacini, Francesca; Canchaya, Carlos; Claesson, Marcus J.; He, Fei; Mantzourani, Maria; Mulas, Laura; Ferrarini, Alberto; Gao, Beile; Delledonne, Massimo; Henrissat, Bernard

    2009-01-01

    Bifidobacteria, one of the relatively dominant components of the human intestinal microbiota, are considered one of the key groups of beneficial intestinal bacteria (probiotic bacteria). However, in addition to health-promoting taxa, the genus Bifidobacterium also includes Bifidobacterium dentium, an opportunistic cariogenic pathogen. The genetic basis for the ability of B. dentium to survive in the oral cavity and contribute to caries development is not understood. The genome of B. dentium B...

  3. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  4. Whole Genome Sequencing

    Science.gov (United States)

    ... What is whole genome sequencing? Whole genome sequencing is the mapping out ...

  5. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown ... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  6. Classifying Genomic Sequences by Sequence Feature Analysis

    Institute of Scientific and Technical Information of China (English)

    Zhi-Hua Liu; Dian Jiao; Xiao Sun

    2005-01-01

    Traditional sequence analysis depends on sequence alignment. In this study, we analyzed various functional regions of the human genome based on sequence features, including word frequency, dinucleotide relative abundance, and base-base correlation. We analyzed human chromosome 22 and classified the upstream, exon, intron, downstream, and intergenic regions by principal component analysis and discriminant analysis of these features. The results show that the functional regions of the genome can be classified based on sequence features and discriminant analysis.
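
    Two of the features named in this abstract can be sketched in a few lines. The block below is only an illustration, not the authors' code: it computes overlapping word (k-mer) frequencies and the simple, single-strand form of dinucleotide relative abundance, rho_xy = f_xy / (f_x * f_y). The function names are hypothetical, and the principal component and discriminant analysis steps are not shown.

```python
from collections import Counter
from itertools import product


def word_frequencies(seq: str, k: int) -> dict[str, float]:
    """Relative frequency of every overlapping k-mer over the alphabet A, C, G, T.

    Windows containing any other symbol (for example N) are skipped.
    """
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= alphabet
    )
    total = sum(counts.values())
    words = ("".join(w) for w in product("ACGT", repeat=k))
    return {w: (counts[w] / total if total else 0.0) for w in words}


def dinucleotide_relative_abundance(seq: str) -> dict[str, float]:
    """rho_xy = f_xy / (f_x * f_y); values far from 1 suggest over- or under-representation."""
    f1 = word_frequencies(seq, 1)
    f2 = word_frequencies(seq, 2)
    rho = {}
    for xy, f_xy in f2.items():
        denom = f1[xy[0]] * f1[xy[1]]
        # Report 0.0 when a base is absent so the ratio stays defined.
        rho[xy] = f_xy / denom if denom else 0.0
    return rho
```

    Each region (upstream, exon, intron, and so on) would yield one such feature vector, which could then be passed to whatever classifier is being evaluated.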

  7. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three-dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
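
    To put the two error rates quoted above in perspective, the following minimal sketch converts a "one error per N bp" rate into an expected error count and into a Phred-scaled quality value, Q = -10 * log10(p). The 2,000,000 bp genome length in the usage note is an arbitrary assumption for illustration, not a figure from the record.

```python
import math


def expected_errors(genome_length_bp: int, bp_per_error: float) -> float:
    """Expected number of consensus errors at a rate of one error per `bp_per_error` bases."""
    return genome_length_bp / bp_per_error


def phred_quality(error_probability: float) -> float:
    """Phred-scaled quality: Q = -10 * log10(p)."""
    return -10.0 * math.log10(error_probability)


# Hypothetical 2 Mb genome (assumed size, for illustration only).
draft = expected_errors(2_000_000, 2_000)      # draft rate from the abstract
finished = expected_errors(2_000_000, 10_000)  # finished rate from the abstract
```

    For the assumed 2 Mb genome this gives 1,000 expected errors at the draft rate against 200 at the finished rate, which corresponds to roughly Q33 versus Q40 per base.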

  8. TRAC-BD1/MOD1 user's guideline

    International Nuclear Information System (INIS)

    Code assessment studies and specific code applications have provided insight into the effective use of the TRAC-BWR series of codes. This document reports the experience gained from the studies and serves to assist the user in the effective application of the TRAC-BD1/MOD1 computer code. This document stresses the user's perspective relative to appropriate use of the TRAC-BD1/MOD1 code and is considered an adjunct to other documentation provided with the code.

  9. TRAC-BD1: transient reactor analysis code for boiling-water systems

    Energy Technology Data Exchange (ETDEWEB)

    Spore, J.W.; Weaver, W.L.; Shumway, R.W.; Giles, M.M.; Phillips, R.E.; Mohr, C.M.; Singer, G.L.; Aguilar, F.; Fischer, S.R.

    1981-01-01

    The Boiling Water Reactor (BWR) version of the Transient Reactor Analysis Code (TRAC) is being developed at the Idaho National Engineering Laboratory (INEL) to provide an advanced best-estimate predictive capability for the analysis of postulated accidents in BWRs. The TRAC-BD1 program provides the Loss of Coolant Accident (LOCA) analysis capability for BWRs and for many BWR related thermal hydraulic experimental facilities. This code features a three-dimensional treatment of the BWR pressure vessel; a detailed model of a BWR fuel bundle including multirod, multibundle, radiation heat transfer, leakage path modeling capability, flow-regime-dependent constitutive equation treatment, reflood tracking capability for both falling films and bottom flood quench fronts, and consistent treatment of the entire accident sequence. The BWR component models in TRAC-BD1 are described and comparisons with data presented. Application of the code to a BWR6 LOCA is also presented.

  10. The diploid genome sequence of Candida albicans

    OpenAIRE

    Jones, Ted; Federspiel, Nancy A.; Chibana, Hiroji; Dungan, Jan; Kalman, Sue; Magee, B. B.; Newport, George; Thorstenson, Yvonne R.; Agabian, Nina; Magee, P T; Davis, Ronald W.; Scherer, Stewart

    2004-01-01

    We present the diploid genome sequence of the fungal pathogen Candida albicans. Because C. albicans has no known haploid or homozygous form, sequencing was performed as a whole-genome shotgun of the heterozygous diploid genome in strain SC5314, a clinical isolate that is the parent of strains widely used for molecular analysis. We developed computational methods to assemble a diploid genome sequence in good agreement with available physical mapping data. We provide a whole-genome description ...

  11. Development in Rice Genome Research Based on Accurate Genome Sequence

    OpenAIRE

    2008-01-01

    Rice is one of the most important crops in the world. Although genetic improvement is a key technology for the acceleration of rice breeding, a lack of genome information had restricted efforts in molecular-based breeding until the completion of the high-quality rice genome sequence, which opened new opportunities for research in various areas of genomics. The syntenic relationship of the rice genome to other cereal genomes makes the rice genome invaluable for understanding how cereal genomes...

  12. Value of a newly sequenced bacterial genome

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Aburjaile, Flavia F; Ramos, Rommel Tj;

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome...

  13. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  14. Fungal genome sequencing: basic biology to biotechnology.

    Science.gov (United States)

    Sharma, Krishna Kant

    2016-08-01

    The genome sequences provide a first glimpse into the genomic basis of the biological diversity of filamentous fungi and yeast. The genome sequence of the budding yeast, Saccharomyces cerevisiae, with a small genome size, unicellular growth, and rich history of genetic and molecular analyses was a milestone of early genomics in the 1990s. The subsequent completion of fission yeast, Schizosaccharomyces pombe and genetic model, Neurospora crassa initiated a revolution in the genomics of the fungal kingdom. In due course of time, a substantial number of fungal genomes have been sequenced and publicly released, representing the widest sampling of genomes from any eukaryotic kingdom. An ambitious genome-sequencing program provides a wealth of data on metabolic diversity within the fungal kingdom, thereby enhancing research into medical science, agriculture science, ecology, bioremediation, bioenergy, and the biotechnology industry. Fungal genomics have higher potential to positively affect human health, environmental health, and the planet's stored energy. With a significant increase in sequenced fungal genomes, the known diversity of genes encoding organic acids, antibiotics, enzymes, and their pathways has increased exponentially. Currently, over a hundred fungal genome sequences are publicly available; however, no inclusive review has been published. This review is an initiative to address the significance of the fungal genome-sequencing program and provides the road map for basic and applied research. PMID:25721271

  15. Draft Genome Sequence of Lactobacillus rhamnosus 2166.

    OpenAIRE

    Karlyshev, Andrey V.; Melnikov, Vyacheslav G.; Kosarev, Igor V.; Abramov, Vyacheslav M.

    2014-01-01

    In this report, we present a draft sequence of the genome of Lactobacillus rhamnosus strain 2166, a potential novel probiotic. Genome annotation and read mapping onto a reference genome of L. rhamnosus strain GG allowed for the identification of the differences and similarities in the genomic contents and gene arrangements of these strains.

  16. Value of a newly sequenced bacterial genome.

    Science.gov (United States)

    Barbosa, Eudes Gv; Aburjaile, Flavia F; Ramos, Rommel Tj; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2014-05-26

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

  17. Value of a newly sequenced bacterial genome

    Institute of Scientific and Technical Information of China (English)

    Eudes GV Barbosa; Flavia F Aburjaile; Rommel TJ Ramos; Adriana R Carneiro; Yves Le Loir; Jan Baumbach; Anderson Miyoshi; Artur Silva; Vasco Azevedo

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information.

  18. Accurate and comprehensive sequencing of personal genomes

    OpenAIRE

    Ajay, Subramanian S.; Parker, Stephen C.J.; Ozel Abaan, Hatice; Fuentes Fajardo, Karin V.; Margulies, Elliott H.

    2011-01-01

    As whole-genome sequencing becomes commoditized and we begin to sequence and analyze personal genomes for clinical and diagnostic purposes, it is necessary to understand what constitutes a complete sequencing experiment for determining genotypes and detecting single-nucleotide variants. Here, we show that the current recommendation of ∼30× coverage is not adequate to produce genotype calls across a large fraction of the genome with acceptably low error rates. Our results are based on analyses...

  19. Automated correction of genome sequence errors

    OpenAIRE

    Gajer, Pawel; Schatz, Michael; Salzberg, Steven L

    2004-01-01

    By using information from an assembly of a genome, a new program called AutoEditor significantly improves base calling accuracy over that achieved by previous algorithms. This in turn improves the overall accuracy of genome sequences and facilitates the use of these sequences for polymorphism discovery. We describe the algorithm and its application in a large set of recent genome sequencing projects. The number of erroneous base calls in these projects was reduced by 80%. In an analysis of ov...

  20. Sequence Maneuverer: tool for sequence extraction from genomes

    OpenAIRE

    Yasmin, Tayyaba; Rehman, Inayat Ur; Ansari, Adnan Ahmad; liaqat, Khurrum; Khan, Muhammad Irfan

    2012-01-01

    The availability of genomic sequences of many organisms has opened new challenges in many aspects particularly in terms of genome analysis. Sequence extraction is a vital step and many tools have been developed to solve this issue. These tools are available publicly but have limitations with reference to the sequence extraction, length of the sequence to be extracted, organism specificity and lack of user friendly interface. We have developed a Java-based software package having three modul...

  1. Towards a reference pecan genome sequence

    Science.gov (United States)

    The cost of generating DNA sequence data has declined dramatically over the previous 15 years as a result of the Human Genome Project and the potential applications of genome sequencing for human medicine. This cost reduction has generated renewed interest among crop breeding scientists in applying...

  2. Sequence Maneuverer: tool for sequence extraction from genomes

    Science.gov (United States)

    Yasmin, Tayyaba; Rehman, Inayat Ur; Ansari, Adnan Ahmad; Liaqat, Khurrum; Khan, Muhammad Irfan

    2012-01-01

    The availability of genomic sequences of many organisms has opened new challenges in many aspects particularly in terms of genome analysis. Sequence extraction is a vital step and many tools have been developed to solve this issue. These tools are available publicly but have limitations with reference to the sequence extraction, length of the sequence to be extracted, organism specificity and lack of user friendly interface. We have developed a Java-based software package having three modules which can be used independently or sequentially. The tool efficiently extracts sequences from large datasets with few simple steps. It can efficiently extract multiple sequences of any desired length from a genome of any organism. The results are cross-checked against published data. Availability URL 1: http://ww3.comsats.edu.pk/bio/ResearchProjects.aspx URL 2: http://ww3.comsats.edu.pk/bio/SequenceManeuverer.aspx PMID:23275734
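
    The core operation described here, pulling a sub-sequence of a chosen length out of a genome, can be illustrated with a short sketch. This is not the Sequence Maneuverer implementation (which the record describes as a Java package); the function names, the 1-based inclusive coordinate convention and the optional reverse-complement flag are all assumptions made for the example.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {record name: sequence}; the name is the first word of the header."""
    records: dict[str, list[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def extract(records: dict[str, str], name: str, start: int, end: int,
            reverse: bool = False) -> str:
    """Return bases start..end (1-based, inclusive); reverse-complement if requested."""
    seq = records[name]
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"coordinates {start}-{end} fall outside 1-{len(seq)}")
    fragment = seq[start - 1:end]
    return fragment.translate(_COMPLEMENT)[::-1] if reverse else fragment
```

    For a real genome file the text would be read from disk first; for very large genomes an indexed reader would be a better fit than holding every record in memory.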

  3. Genomic sequencing of Pleistocene cave bears

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  4. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Directory of Open Access Journals (Sweden)

    Roger Barthelson

    BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly

  5. Comparison of 61 Sequenced Escherichia coli Genomes

    DEFF Research Database (Denmark)

    Lukjancenko, Oksana; Wassenaar, T. M.; Ussery, David

    2010-01-01

    Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publicly available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics ... the pan-genome and about 80% of a typical genome; some of these variable genes tend to be co-localized on genomic islands. The diversity within the species E. coli, and the overlap in gene content between this and related species, suggests a continuum rather than sharp species borders in this group of...

  6. Microbial species delineation using whole genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Mukherjee, Supratim; Ivanova, Natalia; Mavrommatics, Kostas; Pati, Amrita; Konstantinidis, Konstantinos

    2014-10-20

    Species assignments in prokaryotes use a manual, poly-phasic approach utilizing both phenotypic traits and sequence information of phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform and scalable approach exploiting the rich genomic information in whole genome sequences is desired, at least for the initial assignment of species to an organism. We have evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, identifying robust and widely applicable hard cut-offs for species assignments based on AF and gANI. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species mis-assignments and facilitated the assignment of species for organisms without species definitions.
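
    A cut-off rule of the kind described above can be sketched as follows. The formulation here is one plausible reading and should be treated as an assumption: gANI is taken as the identity-weighted aligned length divided by the total length of genes that found a match, and AF as the length of matched genes divided by the length of all genes in the genome. The `GeneHit` type and function names are hypothetical, and the cut-off values are deliberately left as caller-supplied parameters because the abstract does not state them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneHit:
    gene_length: int     # length in bp of the query gene
    aligned_length: int  # bp aligned to its best match in the other genome
    identity: float      # fraction of identical bases over the aligned region (0..1)


def gani_and_af(hits: list[GeneHit], total_gene_length: int) -> tuple[float, float]:
    """Return (gANI, AF) for one genome compared against another."""
    matched = sum(h.gene_length for h in hits)
    if matched == 0 or total_gene_length <= 0:
        return 0.0, 0.0
    gani = sum(h.identity * h.aligned_length for h in hits) / matched
    af = matched / total_gene_length
    return gani, af


def same_species(gani: float, af: float, gani_cutoff: float, af_cutoff: float) -> bool:
    """Apply hard cut-offs; both thresholds must be met."""
    return gani >= gani_cutoff and af >= af_cutoff
```

    Because the comparison is directional (genome A against B can differ from B against A), a clustering step would typically evaluate both directions before grouping two genomes together.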

  7. The characterization of twenty sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Kimberly Pelak

    2010-09-01

    We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways.

  8. Genome sequence and analysis of Lactobacillus helveticus

    Directory of Open Access Journals (Sweden)

    Paola Cremonesi

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of L. helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genome of five L. helveticus strains was sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones.

  9. Sequencing and comparing whole mitochondrial genomes of animals

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  10. Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria

    DEFF Research Database (Denmark)

    Larsen, Mette Voldby; Cosentino, Salvatore; Rasmussen, Simon;

    2012-01-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS)...

  11. Genomic prediction using QTL derived from whole genome sequence data

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Janss, Luc;

    This study investigated the gain in accuracy of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k SNP data. Analyses were performed for Nordic Holstein and Danish Jersey animals, using either a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model, results showed increases in accuracy of up to two percentage points for production traits in both Holstein and Jersey animals by including the extra variants in the analysis, and an extra 1.5 percentage points...

  12. Simple sequence repeats in mycobacterial genomes

    Indian Academy of Sciences (India)

    Vattipally B Sreenu; Pankaj Kumar; Javaregowda Nagaraju; Hampapathalu A Nagarajaram

    2007-01-01

    Simple sequence repeats (SSRs) or microsatellites are repetitive nucleotide sequences with motifs of length 1–6 bp. They are scattered throughout the genomes of all known organisms, ranging from viruses to eukaryotes. Microsatellites undergo mutations in the form of insertions and deletions (INDELS) of their repeat units, with some bias towards insertions that lead to microsatellite tract expansion. Although prokaryotic genomes derive some plasticity from microsatellite mutations, they have in-built mechanisms to arrest undue expansions of microsatellites, and one such mechanism is constituted by the post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these enzymes, and as a null hypothesis one could expect these genomes to harbour many long tracts. It is therefore interesting to analyse the mycobacterial genomes for distribution and abundance of microsatellite tracts and to look for potentially polymorphic microsatellites. Available mycobacterial genomes, Mycobacterium avium, M. leprae, M. bovis and the two strains of M. tuberculosis (CDC1551 and H37Rv), were analysed for frequencies and abundance of SSRs. Our analysis revealed that the SSRs are distributed throughout the mycobacterial genomes at an average of 220–230 SSR tracts per kb. All the mycobacterial genomes contain few regions that are conspicuously denser or poorer in microsatellites compared to their expected genome averages. The genomes distinctly show scarcity of long microsatellites despite the absence of a post-replicative DNA repair system. Such severe scarcity of long microsatellites could arise as a result of strong selection pressures operating against long and unstable sequences, although the influence of GC-content and the role of point mutations in arresting microsatellite expansions cannot be ruled out. Nonetheless, the long tracts occasionally found in coding as well as non-coding regions may account for limited genome plasticity in these genomes.
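
    The definition above (perfect tandem repeats of a 1–6 bp motif) can be illustrated with a short Python sketch. It is not the authors' pipeline: the function name find_ssrs and the min_repeats threshold are assumptions made for illustration, and only perfect (uninterrupted) repeats are reported.

```python
import re

def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper: min_repeats and max_motif are illustrative
    thresholds, not values taken from the abstract.
    """
    seq = seq.upper()
    # The lazy group captures the shortest motif (1..max_motif bases);
    # the backreference then requires at least min_repeats - 1 further copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq)
    ]

print(find_ssrs("TTACACACACGG"))
```

    Dividing the number of tracts found by the sequence length in kb gives a tract density comparable in form to the per-kb figure quoted in the abstract.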

  13. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    DEFF Research Database (Denmark)

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D;

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls, for imputing sequence variant genotypes into reference sets for genomic prediction. Run 3.0 included 429 sequences, with 31.8 million variants detected. BayesRC, a new method for genomic prediction, addresses some challenges associated with using the sequence data, and takes advantage of biological information. In a dairy data set, predictions using BayesRC and imputed sequence data from 1000 Bull Genomes were 2% more accurate than with 800k data. We could demonstrate the method identified causal mutations in some cases. Further improvements will come from more accurate imputation of sequence variant...

  14. Genome Sequence of Pseudomonas chlororaphis Strain 189

    Science.gov (United States)

    Town, Jennifer; Audy, Patrice; Boyetchko, Susan M.

    2016-01-01

    Pseudomonas chlororaphis strain 189 is a potent inhibitor of the growth of the potato pathogen Phytophthora infestans. We determined the complete, finished sequence of the 6.8-Mbp genome of this strain, consisting of a single contiguous molecule. Strain 189 is closely related to previously sequenced strains of P. chlororaphis. PMID:27340063

  15. Next-generation sequencing: applications beyond genomes

    OpenAIRE

    Marguerat, Samuel; Wilhelm, Brian T.; Bähler, Jürg

    2008-01-01

    The development of DNA sequencing more than 30 years ago has profoundly impacted biological research. In the last couple of years, remarkable technological innovations have emerged that allow the direct and cost-effective sequencing of complex samples at unprecedented scale and speed. These next-generation technologies make it feasible to sequence not only static genomes, but also entire transcriptomes expressed under different conditions. These and other powerful applications of next-generat...

  16. Genome Sequence of the Palaeopolyploid soybean

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

    2009-08-03

    Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70 percent more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78 percent of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75 percent of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

  17. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    Science.gov (United States)

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  18. Viral genome sequencing by random priming methods

    Directory of Open Access Journals (Sweden)

    Zhang Xinsheng

    2008-01-01

    Full Text Available Background: Most emerging health threats are of zoonotic origin. For the overwhelming majority, their causative agents are RNA viruses which include but are not limited to HIV, Influenza, SARS, Ebola, Dengue, and Hantavirus. Of increasing importance therefore is a better understanding of global viral diversity to enable better surveillance and prediction of pandemic threats; this will require rapid and flexible methods for complete viral genome sequencing. Results: We have adapted the SISPA methodology [1-3] to genome sequencing of RNA and DNA viruses. We have demonstrated the utility of the method on various types and sources of viruses, obtaining near complete genome sequence of viruses ranging in size from 3,000–15,000 kb with a median depth of coverage of 14.33. We used this technique to generate full viral genome sequence in the presence of host contaminants, using viral preparations from cell culture supernatant, allantoic fluid and fecal matter. Conclusion: The method described is of great utility in generating whole genome assemblies for viruses with little or no available sequence information, viruses from greatly divergent families, previously uncharacterized viruses, or to more fully describe mixed viral infections.

  19. Sequencing and Analysis of a Genomic Fragment Provide an Insight into the Dunaliella viridis Genomic Sequence

    Institute of Scientific and Technical Information of China (English)

    Xiao-Ming SUN; Yuan-Ping TANG; Xiang-Zong MENG; Wen-Wen ZHANG; Shan LI; Zhi-Rui DENG; Zheng-Kai XU; Ren-Tao SONG

    2006-01-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green alga. Its exceptional resistances to salt and various other stresses have made it an ideal model for stress tolerance study. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large amounts of simple sequence repeats (microsatellites) were found, with strong bias towards (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (total 25,749 bp) was in agreement with these findings. These sequence features made it difficult to sequence Dunaliella genomic sequences. Further investigation should be made to reveal the biological significance of these unique sequence features.
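
    The GC-content comparison in this abstract rests on a simple ratio. The Python sketch below shows one way to compute it; the function name gc_content is an assumption, and ambiguous bases such as N are excluded from the denominator by choice.

```python
def gc_content(seq):
    """Return GC content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt

print(gc_content("GGCCAATT"))
```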

  20. Sequencing and analysis of a genomic fragment provide an insight into the Dunaliella viridis genomic sequence.

    Science.gov (United States)

    Sun, Xiao-Ming; Tang, Yuan-Ping; Meng, Xiang-Zong; Zhang, Wen-Wen; Li, Shan; Deng, Zhi-Rui; Xu, Zheng-Kai; Song, Ren-Tao

    2006-11-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green alga. Its exceptional resistances to salt and various other stresses have made it an ideal model for stress tolerance study. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large amounts of simple sequence repeats (microsatellites) were found, with strong bias towards (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (total 25,749 bp) was in agreement with these findings. These sequence features made it difficult to sequence Dunaliella genomic sequences. Further investigation should be made to reveal the biological significance of these unique sequence features. PMID:17091199

  1. Sorghum genome sequencing by methylation filtration.

    Directory of Open Access Journals (Sweden)

    Joseph A Bedell

    2005-01-01

    Full Text Available Sorghum bicolor is a close relative of maize and is a staple crop in Africa and much of the developing world because of its superior tolerance of arid growth conditions. We have generated sequence from the hypomethylated portion of the sorghum genome by applying methylation filtration (MF) technology. The evidence suggests that 96% of the genes have been sequence tagged, with an average coverage of 65% across their length. Remarkably, this level of gene discovery was accomplished after generating a raw coverage of less than 300 megabases of the 735-megabase genome. MF preferentially captures exons and introns, promoters, microRNAs, and simple sequence repeats, and minimizes interspersed repeats, thus providing a robust view of the functional parts of the genome. The sorghum MF sequence set is beneficial to research on sorghum and is also a powerful resource for comparative genomics among the grasses and across the entire plant kingdom. Thousands of hypothetical gene predictions in rice and Arabidopsis are supported by the sorghum dataset, and genomic similarities highlight evolutionarily conserved regions that will lead to a better understanding of rice and Arabidopsis.

  2. Sorghum genome sequencing by methylation filtration.

    Science.gov (United States)

    Bedell, Joseph A; Budiman, Muhammad A; Nunberg, Andrew; Citek, Robert W; Robbins, Dan; Jones, Joshua; Flick, Elizabeth; Rholfing, Theresa; Fries, Jason; Bradford, Kourtney; McMenamy, Jennifer; Smith, Michael; Holeman, Heather; Roe, Bruce A; Wiley, Graham; Korf, Ian F; Rabinowicz, Pablo D; Lakey, Nathan; McCombie, W Richard; Jeddeloh, Jeffrey A; Martienssen, Robert A

    2005-01-01

    Sorghum bicolor is a close relative of maize and is a staple crop in Africa and much of the developing world because of its superior tolerance of arid growth conditions. We have generated sequence from the hypomethylated portion of the sorghum genome by applying methylation filtration (MF) technology. The evidence suggests that 96% of the genes have been sequence tagged, with an average coverage of 65% across their length. Remarkably, this level of gene discovery was accomplished after generating a raw coverage of less than 300 megabases of the 735-megabase genome. MF preferentially captures exons and introns, promoters, microRNAs, and simple sequence repeats, and minimizes interspersed repeats, thus providing a robust view of the functional parts of the genome. The sorghum MF sequence set is beneficial to research on sorghum and is also a powerful resource for comparative genomics among the grasses and across the entire plant kingdom. Thousands of hypothetical gene predictions in rice and Arabidopsis are supported by the sorghum dataset, and genomic similarities highlight evolutionarily conserved regions that will lead to a better understanding of rice and Arabidopsis. PMID:15660154
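
    To put the raw-coverage figure in perspective, fold coverage is the number of bases sequenced divided by the genome size. The sketch below applies that ratio to the two numbers quoted in the abstract; the function name fold_coverage is an assumption, and 300 megabases is the abstract's upper bound rather than an exact amount.

```python
def fold_coverage(bases_sequenced, genome_size):
    """Return average fold coverage (bases sequenced per genome base)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return bases_sequenced / genome_size

# Upper bound from the abstract: < 300 Mb of raw sequence, 735-Mb genome.
print(round(fold_coverage(300e6, 735e6), 2))
```

    Under these figures the raw coverage is below roughly 0.41x, which is why tagging 96% of genes at that depth is described as remarkable.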

  3. Population genetic inference from genomic sequence variation

    OpenAIRE

    Pool, John E.; Hellmann, Ines; Jensen, Jeffrey D.; Nielsen, Rasmus

    2010-01-01

    Population genetics has evolved from a theory-driven field with little empirical data into a data-driven discipline in which genome-scale data sets test the limits of available models and computational analysis methods. In humans and a few model organisms, analyses of whole-genome sequence polymorphism data are currently under way. And in light of the falling costs of next-generation sequencing technologies, such studies will soon become common in many other organisms as well. Here, we assess...

  4. An International Plan to Sequence the Onion Genome

    Science.gov (United States)

    The cost of DNA sequencing continues to decline and, in the near future, it will become reasonable to undertake sequencing of the enormous nuclear genome of onion. We undertook sequencing of expressed and genomic regions of the onion genome to learn about the structure of the onion genome, as well a...

  5. The diploid genome sequence of an Asian individual

    DEFF Research Database (Denmark)

    Wang, Jun; Wang, Wei; Li, Ruiqiang;

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we...

  6. Multilocus sequence typing of total-genome-sequenced bacteria.

    Science.gov (United States)

    Larsen, Mette V; Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W; Aarestrup, Frank M; Lund, Ole

    2012-04-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST. PMID:22238442
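
    The final step described above, determining the sequence type from the combination of alleles, amounts to a table lookup. The Python sketch below illustrates only that step under simplifying assumptions: alleles are found by exact string match rather than the BLAST-based ranking the abstract describes, and the loci, allele sequences, and profile table are toy data invented for illustration.

```python
def sequence_type(locus_seqs, allele_db, profiles, loci):
    """Return the sequence type for an isolate, or None if it is unknown.

    locus_seqs: {locus: sequence found in the isolate}
    allele_db:  {locus: {allele sequence: allele number}}  (toy data)
    profiles:   {tuple of allele numbers in `loci` order: sequence type}
    """
    alleles = []
    for locus in loci:
        # Exact match stands in for the best-matching-allele search.
        allele = allele_db[locus].get(locus_seqs[locus].upper())
        if allele is None:
            return None  # novel or unrecognised allele
        alleles.append(allele)
    return profiles.get(tuple(alleles))

# Hypothetical two-locus scheme.
loci = ["locusA", "locusB"]
allele_db = {
    "locusA": {"ACGT": 1, "ACGA": 2},
    "locusB": {"TTGA": 1, "TTGC": 2},
}
profiles = {(1, 1): 10, (2, 1): 11, (2, 2): 12}

print(sequence_type({"locusA": "acga", "locusB": "TTGA"}, allele_db, profiles, loci))
```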

  7. Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements

    OpenAIRE

    Darling, Aaron C.E.; Mau, Bob; Blattner, Frederick R.; Perna, Nicole T.

    2004-01-01

    As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments conserved among all the genomes under considera...

  8. Hidden ribozymes in eukaryotic genome sequence

    OpenAIRE

    Sean P Ryder

    2010-01-01

    The small self-cleaving ribozymes fold into complex tertiary structures to promote autocatalytic cleavage or ligation at a precise position within their sequence. Until recently, relatively few examples had been identified. Two papers now reveal that self-cleaving ribozymes are prevalent in eukaryotic genomes and, in some cases, might play a role in regulating gene expression.

  9. Whole genome sequences of four Brucella strains.

    Science.gov (United States)

    Ding, Jiabo; Pan, Yuanlong; Jiang, Hai; Cheng, Junsheng; Liu, Taotao; Qin, Nan; Yang, Yi; Cui, Buyun; Chen, Chen; Liu, Cuihua; Mao, Kairong; Zhu, Baoli

    2011-07-01

    Brucella melitensis and Brucella suis are intracellular pathogens of livestock and humans. Here we report four genome sequences, those of the virulent strain B. melitensis M28-12 and vaccine strains B. melitensis M5 and M111 and B. suis S2, which show different virulences and pathogenicities, which will help to design a more effective brucellosis vaccine. PMID:21602346

  10. Genome Sequence of Lactobacillus amylovorus GRL1112

    OpenAIRE

    Kant, R.; Paulin, L.; Alatalo, E.; de Vos, W.M.; Palva, A.

    2010-01-01

    Lactobacillus amylovorus is a common member of the normal gastrointestinal tract (GIT) microbiota in pigs. Here, we report the genome sequence of L. amylovorus GRL1112, a porcine feces isolate displaying strong adherence to the pig intestinal epithelial cells. The strain is of interest, as it is a potential probiotic bacterium.

  11. Genome sequence of Lactobacillus amylovorus GRL1112.

    Science.gov (United States)

    Kant, Ravi; Paulin, Lars; Alatalo, Edward; de Vos, Willem M; Palva, Airi

    2011-02-01

    Lactobacillus amylovorus is a common member of the normal gastrointestinal tract (GIT) microbiota in pigs. Here, we report the genome sequence of L. amylovorus GRL1112, a porcine feces isolate displaying strong adherence to the pig intestinal epithelial cells. The strain is of interest, as it is a potential probiotic bacterium. PMID:21131492

  12. Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes

    Science.gov (United States)

    Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these s...

  13. The Theory and Practice of Genome Sequence Assembly.

    Science.gov (United States)

    Simpson, Jared T; Pop, Mihai

    2015-01-01

    The current genomic revolution was made possible by joint advances in genome sequencing technologies and computational approaches for analyzing sequence data. The close interaction between biologists and computational scientists is perhaps most apparent in the development of approaches for sequencing entire genomes, a feat that would not be possible without sophisticated computational tools called genome assemblers (short for genome sequence assemblers). Here, we survey the key developments in algorithms for assembling genome sequences since the development of the first DNA sequencing methods more than 35 years ago. PMID:25939056

  14. Whole Genome Sequencing: Cracking the Genetic Code for Foodborne Illness

    Science.gov (United States)

    Whole Genome Sequencing: Cracking the Genetic Code for Foodborne Illness ... Bacteria that cause disease have millions of different genomes, or sequences of genetic code, each as unique ...

  15. Sequence analysis and editing for bisulphite genomic sequencing projects

    OpenAIRE

    Carr, IM; Valleley, EMA; Cordery, SF; Markham, AF; Bonthron, DT

    2007-01-01

    Bisulphite genomic sequencing is a widely used technique for detailed analysis of the methylation status of a region of DNA. It relies upon the selective deamination of unmethylated cytosine to uracil after treatment with sodium bisulphite, usually followed by PCR amplification of the chosen target region. Since this two-step procedure replaces all unmethylated cytosine bases with thymine, PCR products derived from unmethylated templates contain only three types of nucleotide, in unequal prop...
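
    The conversion logic described above can be mimicked in silico. The Python sketch below is an illustration only, not the software the record describes: the function names and the methylated_positions parameter are assumptions, and it models the top strand after PCR, where every unmethylated C reads as T.

```python
def bisulphite_convert(seq, methylated_positions=()):
    """Return the sequence as read after bisulphite treatment and PCR.

    Unmethylated C becomes T; C at a methylated position is protected.
    Positions are 0-based indices into seq.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )

def call_methylation(reference, read):
    """Return 0-based positions where a reference C is still C in the read."""
    return [
        i for i, (ref, obs) in enumerate(zip(reference.upper(), read.upper()))
        if ref == "C" and obs == "C"
    ]

converted = bisulphite_convert("ACGTCCG", methylated_positions=[1])
print(converted, call_methylation("ACGTCCG", converted))
```

    Comparing a converted read against the untreated reference in this way is what allows methylation status to be inferred position by position.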

  16. Sequence motif discovery with computational genome-wide analysis

    OpenAIRE

    Akashi, Hirofumi; Aoki, Fumio; Toyota, Minoru; Maruyama, Reo; Sasaki, Yasushi; Mita, Hiroaki; Tokura, Hajime; Imai, Kohzoh; Tatsumi, Haruyuki

    2006-01-01

    As a result of the human genome project and advancements in DNA sequencing technology, we can utilize a huge amount of nucleotide sequence data and can search DNA sequence motifs in whole human genome. However, searching motifs with the naked eye is an enormous task and searching throughout the whole genome is absolutely impossible. Therefore, we have developed a computational genome-wide analyzing system for detecting DNA sequence motifs with biological significance. We used a multi-parallel...

  17. What Will We Do with a Cotton Genome Sequence?

    Institute of Scientific and Technical Information of China (English)

    BRUBAKER Curt

    2008-01-01

    With the publication of "Toward Sequencing Cotton (Gossypium) Genomes" [Chen et al. Plant Physiology, 2007, 145:1303-1310], a clear consensus emerged from the cotton genomics community not only that cotton genome sequences were a critical resource for research and commercial innovation in cotton genomics, but that there was a logical means of achieving this goal.

  18. The predictive capacity of personal genome sequencing.

    Science.gov (United States)

    Roberts, Nicholas J; Vogelstein, Joshua T; Parmigiani, Giovanni; Kinzler, Kenneth W; Vogelstein, Bert; Velculescu, Victor E

    2012-05-01

    New DNA sequencing methods will soon make it possible to identify all germline variants in any individual at a reasonable cost. However, the ability of whole-genome sequencing to predict predisposition to common diseases in the general population is unknown. To estimate this predictive capacity, we use the concept of a "genometype." A specific genometype represents the genomes in the population conferring a specific level of genetic risk for a specified disease. Using this concept, we estimated the maximum capacity of whole-genome sequencing to identify individuals at clinically significant risk for 24 different diseases. Our estimates were derived from the analysis of large numbers of monozygotic twin pairs; twins of a pair share the same genometype and therefore identical genetic risk factors. Our analyses indicate that (i) for 23 of the 24 diseases, most of the individuals will receive negative test results; (ii) these negative test results will, in general, not be very informative, because the risk of developing 19 of the 24 diseases in those who test negative will still be, at minimum, 50 to 80% of that in the general population; and (iii) on the positive side, in the best-case scenario, more than 90% of tested individuals might be alerted to a clinically significant predisposition to at least one disease. These results have important implications for the valuation of genetic testing by industry, health insurance companies, public policy-makers, and consumers. PMID:22472521

  19. Enhanced Dynamic Algorithm of Genome Sequence Alignments

    Directory of Open Access Journals (Sweden)

    Arabi E. keshk

    2014-05-01

    Full Text Available The merging of biology and computer science has created a new field, computational biology (bioinformatics), that explores the capacity of computers to gain knowledge from biological data. Computational biology is rooted in the life sciences as well as in computer science, information science, and technology. A central problem in computational biology is sequence alignment, a way of arranging DNA, RNA, or protein sequences to identify regions of similarity and relationships between sequences. This paper introduces an enhanced dynamic algorithm for genome sequence alignment, called EDAGSA. It fills only the three main diagonals rather than filling the entire matrix with unused data. It obtains the optimal solution while decreasing the execution time, and performance therefore improves. To illustrate the effectiveness of the proposed algorithm, it is compared with traditional methods such as the Needleman-Wunsch, Smith-Waterman, and longest common subsequence algorithms. In addition, a database is implemented so that the algorithm can be used in multi-sequence alignments to search for the optimal sequence that matches a given sequence.
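
    For orientation, the sketch below shows the baseline Needleman-Wunsch global alignment score with an optional band that restricts computation to cells near the main diagonal. It is not the EDAGSA algorithm, whose details the abstract does not give; the scoring values and the band parameter are assumptions, and band=1 merely corresponds to the idea of filling three diagonals. A banded run is only guaranteed optimal when the best alignment stays inside the band.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1, band=None):
    """Return the global alignment score of a and b.

    band=None fills the full matrix; band=k fills only cells with
    |i - j| <= k, so band=1 covers the three central diagonals.
    Returns -inf if no alignment fits inside the band.
    """
    n, m = len(a), len(b)
    neg_inf = float("-inf")
    # Row 0: aligning an empty prefix of a against prefixes of b.
    prev = [j * gap if band is None or j <= band else neg_inf
            for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [neg_inf] * (m + 1)
        lo = 0 if band is None else max(0, i - band)
        hi = m if band is None else min(m, i + band)
        for j in range(lo, hi + 1):
            if j == 0:
                cur[j] = i * gap
                continue
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(diag, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[m]

print(nw_score("ACGT", "AGT"), nw_score("ACGT", "AGT", band=1))
```

    Keeping only two rows in memory and skipping cells outside the band reduces the work from O(nm) to roughly O(n × band) cells.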

  20. Swine Genome Sequencing Consortium (SGSC): A Strategic Roadmap for Sequencing The Pig Genome

    Directory of Open Access Journals (Sweden)

    Kellye Eversole

    2006-04-01

    Full Text Available The Swine Genome Sequencing Consortium (SGSC) was formed in September 2003 by academic, government and industry representatives to provide international coordination for sequencing the pig genome. The SGSC's mission is to advance biomedical research for animal production and health by the development of DNA-based tools and products resulting from the sequencing of the swine genome. During the past 2 years, the SGSC has met bi-annually to develop a strategic roadmap for creating the required scientific resources, to integrate existing physical maps, and to create a sequencing strategy that captured international participation and a broad funding base. During the past year, SGSC members have integrated their respective physical mapping data with the goal of creating a minimal tiling path (MTP) that will be used as the sequencing template. During the recent Plant and Animal Genome meeting (January 16, 2005, San Diego, CA), presentations demonstrated that a human–pig comparative map has been completed, BAC fingerprint contigs (FPC) for each of the autosomes and the X chromosome have been constructed, and that BAC end-sequencing has permitted, through BLAST analysis and RH-mapping, anchoring of the contigs. Thus, significant progress has been made towards the creation of an MTP. In addition, whole-genome (WG) shotgun libraries have been constructed and are currently being sequenced in various laboratories around the globe. Thus, a hybrid sequencing approach will be used, in which 3x coverage of BACs comprising the MTP and 3x coverage of the WG shotgun libraries are combined to develop a draft with 6x coverage of the pig genome.

  1. Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™

    OpenAIRE

    Seo, Seung Bum; Zeng, Xiangpei; King, Jonathan L.; Larue, Bobby L; Assidi, Mourad; Al-Qahtani, Mohamed H; Sajantila, Antti; Budowle, Bruce

    2015-01-01

    Background: Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (L...

  2. Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™

    OpenAIRE

    Seo, Seung Bum; Zeng, Xiangpei; King, Jonathan L.; Larue, Bobby L; Assidi, Mourad; Al-Qahtani, Mohamed H; Sajantila, Antti; Budowle, Bruce

    2015-01-01

    Background Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, S...

  3. Identifying driver mutations in sequenced cancer genomes

    DEFF Research Database (Denmark)

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla;

    2014-01-01

    ..., and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein sequence or structure. Finally, we review techniques to identify recurrent combinations of somatic mutations, including approaches that examine mutations in known pathways or protein-interaction networks, as well as de novo approaches that identify combinations of mutations according to...

  4. Why Assembling Plant Genome Sequences Is So Challenging

    Directory of Open Access Journals (Sweden)

    Pedro Seoane

    2012-09-01

    Full Text Available In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed.

  5. Nuclear hBD-1 accumulation in malignant salivary gland tumours

    International Nuclear Information System (INIS)

    Whereas the antimicrobial peptides hBD-2 and -3 are related to inflammation, the constitutively expressed hBD-1 might function as an 8p tumour suppressor gene and thus play a key role in control of transcription and induction of apoptosis in malignant epithelial tumours. Therefore this study was conducted to characterise proteins involved in cell cycle control and host defence in different benign and malignant salivary gland tumours in comparison with healthy salivary gland tissue. 21 paraffin-embedded tissue samples of benign (n = 7) and malignant (n = 7) salivary gland tumours as well as healthy (n = 7) salivary glands were examined immunohistochemically for the expression of p53, bcl-2, and hBD-1, -2, -3. HBD-1 was distributed in the cytoplasm of healthy salivary glands and benign salivary gland tumours but seems to migrate into the nucleus of malignant salivary gland tumours. Pleomorphic adenomas showed cytoplasmic as well as weak nuclear hBD-1 staining. HBD-1, -2 and -3 are traceable in healthy salivary gland tissue as well as in benign and malignant salivary gland tumours. As hBD-1 is shifted from the cytoplasm to the nucleus in malignant salivary gland tumours, we hypothesize that it might play a role in the oncogenesis of these tumours. In pleomorphic adenomas hBD-1 might be connected to their biologic behaviour of recurrence and malignant transformation.

  6. Simple sequence repeats in bryophyte mitochondrial genomes.

    Science.gov (United States)

    Zhao, Chao-Xian; Zhu, Rui-Liang; Liu, Yang

    2016-01-01

    Simple sequence repeats (SSRs) are thought to be common in plant mitochondrial (mt) genomes, but have yet to be fully described for bryophytes. We screened the mt genomes of two liverworts (Marchantia polymorpha and Pleurozia purpurea), two mosses (Physcomitrella patens and Anomodon rugelii) and two hornworts (Phaeoceros laevis and Nothoceros aenigmaticus), and detected 475 SSRs. Some SSRs are conserved during evolution; except for one that exists in both liverworts and mosses, all of these are shared only by the two liverworts, the two mosses or the two hornworts. SSRs are known as DNA tracts with high mutation rates; however, according to our observations, they can still evolve slowly. The conservation of these SSRs suggests that they are under strong selection and could play critical roles in maintaining gene functions. PMID:24491104
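
    Screening a genome for SSRs, as described in this abstract, is commonly done by scanning for short motifs repeated in tandem. The sketch below is a minimal regular-expression version in Python, not the authors' pipeline: the function names and the minimum repeat counts per motif length are hypothetical thresholds chosen for illustration, and because matches are non-overlapping, a repeat may be reported with a shifted motif (for example "TA" rather than "AT") depending on the flanking bases.

```python
import re

# Hypothetical minimum repeat counts per motif length (mono- to hexanucleotide).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a tandem repeat of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        # One motif of length k followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. "AA" repeats, which are already counted as "A" repeats.
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GGC" + "A" * 12 + "CGG" + "AT" * 7 + "GCA" + "ACG" * 5 + "TT"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

    Only perfect repeats are detected here; dedicated SSR tools also handle imperfect and compound repeats.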

  7. Initial sequencing and comparative analysis of the mouse genome

    Energy Technology Data Exchange (ETDEWEB)

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  8. Genome sequence of Haemophilus parasuis strain 29755

    OpenAIRE

    Mullins, Michael A.; Register, Karen B.; Bayles, Darrell O; Dyer, David W.; Kuehn, Joanna S; Phillips, Gregory J.

    2011-01-01

    Haemophilus parasuis is a member of the family Pasteurellaceae and is the etiologic agent of Glässer’s disease in pigs, a systemic syndrome associated with only a subset of isolates. The genetic basis for virulence and systemic spread of particular H. parasuis isolates is currently unknown. Strain 29755 is an invasive isolate that has long been used in the study of Glässer’s disease. Accordingly, the genome sequence of strain 29755 is of considerable importance to investigators endeavoring to...

  9. Building the sequence map of the human pan-genome

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Li, Yingrui; Zheng, Hancheng;

    2009-01-01

    Here we integrate the de novo assembly of an Asian and an African genome with the NCBI reference human genome, as a step toward constructing the human pan-genome. We identified approximately 5 Mb of novel sequences not present in the reference genome in each of these assemblies. Most novel sequences are individual or population specific, as revealed by their comparison to all available human DNA sequence and by PCR validation using the human genome diversity cell line panel. We found novel sequences present in patterns consistent with known human migration paths. Cross-species conservation...

  10. Insights from 20 years of bacterial genome sequencing

    DEFF Research Database (Denmark)

    Land, Miriam; Hauser, Loren; Jun, Se-Ran;

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in...

  11. Draft Genome Sequence of Alternaria alternata ATCC 34957.

    Science.gov (United States)

    Nguyen, Hai D T; Lewis, Christopher T; Lévesque, C André; Gräfenhan, Tom

    2016-01-01

    We report the draft genome sequence of Alternaria alternata ATCC 34957. This strain was previously reported to produce alternariol and alternariol monomethyl ether on weathered grain sorghum. The genome was sequenced with PacBio technology and assembled into 27 scaffolds with a total genome size of 33.5 Mb. PMID:26769939

  12. Draft Genome Sequence of Fungus Clonostachys rosea Strain YKD0085.

    Science.gov (United States)

    Liu, Shuai; Chang, Yaowen; Hu, Xujia; Gong, Xuanyun; Di, Yingtong; Dong, Jinyan; Hao, Xiaojiang

    2016-01-01

    Here, we report the draft genome sequence of Clonostachys rosea (strain YKD0085). The functional annotation of C. rosea provides important information related to its ability to produce secondary metabolites. The genome sequence presented here builds the basis for further genome mining. PMID:27340057

  13. Complete Genome Sequence of Staphylococcus aureus Siphovirus Phage JS01

    OpenAIRE

    Jia, Hongying; Bai, Qinqin; Yang, Yongchun; Yao, Huochun

    2013-01-01

    Staphylococcus aureus is the most prevalent and economically significant pathogen causing bovine mastitis. We isolated and characterized one staphylophage from the milk of mastitis-affected cattle and sequenced its genome. Transmission electron microscopy (TEM) observation shows that it belongs to the family Siphovirus. We announce here its complete genome sequence and report major findings from the genomic analysis.

  14. First Draft Genome Sequence of Staphylococcus condimenti F-2T

    Science.gov (United States)

    Zheng, Beiwen; Hu, Xinjun; Jiang, Xiawei; Li, Ang; Yao, Jian

    2016-01-01

    This report describes the draft genome sequence of S. condimenti strain F-2T (DSM 11674), a potential starter culture. The genome assembly comprised 2,616,174 bp with 34.6% GC content. To the best of our knowledge, this is the first documentation that reports the whole-genome sequence of S. condimenti. PMID:27257207
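
    Assembly size and GC content, the two summary figures quoted in this record, can be computed directly from a FASTA file. The following Python sketch is illustrative only: the function names are assumptions, the toy sequence is made up, and ambiguous bases such as N are excluded from the GC denominator, which is one common convention among several.

```python
def read_fasta_sequence(text):
    """Concatenate all sequence lines of FASTA-formatted text, skipping headers."""
    return "".join(
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.startswith(">")
    ).upper()


def gc_content(seq):
    """GC percentage over unambiguous bases (A, C, G, T) only."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / (gc + at)


if __name__ == "__main__":
    # Toy two-contig assembly, not real S. condimenti data.
    toy_fasta = ">contig1\nATGCATGC\nNNAT\n>contig2\nGGCC\n"
    seq = read_fasta_sequence(toy_fasta)
    print(f"assembly length: {len(seq)} bp")
    print(f"GC content: {gc_content(seq):.1f}%")
```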

  15. Draft Genome Sequence of Streptomyces hygroscopicus subsp. hygroscopicus NBRC 16556.

    Science.gov (United States)

    Komaki, Hisayuki; Ichikawa, Natsuko; Oguchi, Akio; Hamada, Moriyuki; Tamura, Tomohiko; Suzuki, Ken-Ichiro; Fujita, Nobuyuki

    2016-01-01

    Here, we report the draft genome sequence of strain NBRC 16556, deposited as Streptomyces hygroscopicus subsp. hygroscopicus into the NBRC culture collection. An average nucleotide identity analysis confirmed that the taxonomic identification is correct. The genome sequence will serve as a valuable reference for genome mining to search new secondary metabolites. PMID:27198007

  16. Whole-Genome Shotgun Sequencing of a Colonizing Multilocus Sequence Type 17 Streptococcus agalactiae Strain

    Science.gov (United States)

    Singh, Pallavi; Springman, A. Cody; Davies, H. Dele

    2012-01-01

    This report highlights the whole-genome shotgun draft sequence for a Streptococcus agalactiae strain representing multilocus sequence type (ST) 17, isolated from a colonized woman at 8 weeks postpartum. This sequence represents an important addition to the published genomes and will promote comparative genomic studies of S. agalactiae recovered from diverse sources. PMID:23045509

  17. Whole-Genome Shotgun Sequencing of a Colonizing Multilocus Sequence Type 17 Streptococcus agalactiae Strain

    OpenAIRE

    Singh, Pallavi; Springman, A. Cody; Davies, H Dele; Manning, Shannon D.

    2012-01-01

    This report highlights the whole-genome shotgun draft sequence for a Streptococcus agalactiae strain representing multilocus sequence type (ST) 17, isolated from a colonized woman at 8 weeks postpartum. This sequence represents an important addition to the published genomes and will promote comparative genomic studies of S. agalactiae recovered from diverse sources.

  18. Genome Sequence of Stachybotrys chartarum Strain 51-11

    OpenAIRE

    Betancourt, Doris A; Dean, Timothy R.; Kim, Jean; Levy, Josh

    2015-01-01

    The Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina HiSeq 2000 and PacBio technologies. Since S. chartarum has been implicated as having health impacts within water-damaged buildings, any information extracted from the genomic sequence data relating to toxins or the metabolism of the fungus might be useful.

  19. First Complete Genome Sequence of Cherry virus A.

    Science.gov (United States)

    Koinuma, Hiroaki; Nijo, Takamichi; Iwabuchi, Nozomu; Yoshida, Tetsuya; Keima, Takuya; Okano, Yukari; Maejima, Kensaku; Yamaji, Yasuyuki; Namba, Shigetou

    2016-01-01

    The 5'-terminal genomic sequence of Cherry virus A (CVA) has long been unknown. We determined the first complete genome sequence of an apricot isolate of CVA (7,434 nucleotides [nt]). The 5'-untranslated region was 107 nt in length, which was 53 nt longer than those of known CVA sequences. PMID:27284130

  20. Complete Genome Sequence of Rift Valley Fever Virus Strain Lunyo

    OpenAIRE

    Lumley, Sarah; Horton, Daniel L.; Marston, Denise A.; Johnson, Nicholas; Ellis, Richard J.; Fooks, Anthony R.; Hewson, Roger

    2016-01-01

    Using next-generation sequencing technologies, the first complete genome sequence of Rift Valley fever virus strain Lunyo is reported here. Originally reported as an attenuated antigenic variant strain from Uganda, genomic sequence analysis shows that Lunyo clusters together with other Ugandan isolates.

  1. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis

    DEFF Research Database (Denmark)

    Carlton, Jane M.; Hirt, Robert P.; Silva, Joana C.;

    2007-01-01

    We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the approximately 160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion...

  2. Draft Genome Sequence of Brevibacterium massiliense Strain 541308T

    OpenAIRE

    Roux, Véronique; Robert, Catherine; Gimenez, Grégory; Raoult, Didier

    2012-01-01

    A draft genome sequence of Brevibacterium massiliense, an aerobic bacterium isolated from a human ankle discharge, is described here. CRISPR-associated proteins were found to be encoded in the genome, and analysis of transport proteins was performed.

  3. First complete genome sequence of infectious laryngotracheitis virus

    Directory of Open Access Journals (Sweden)

    Ficorilli Nino P

    2011-04-01

    Full Text Available Abstract Background Infectious laryngotracheitis virus (ILTV) is an alphaherpesvirus that causes acute respiratory disease in chickens worldwide. To date, only one complete genomic sequence of ILTV has been reported. This sequence was generated by concatenating partial sequences from six different ILTV strains. Thus, the full genomic sequence of a single (individual) strain of ILTV has not been determined previously. This study aimed to use high throughput sequencing technology to determine the complete genomic sequence of a live attenuated vaccine strain of ILTV. Results The complete genomic sequence of the Serva vaccine strain of ILTV was determined, annotated and compared to the concatenated ILTV reference sequence. The genome size of the Serva strain was 152,628 bp, with a G + C content of 48%. A total of 80 predicted open reading frames were identified. The Serva strain had 96.5% DNA sequence identity with the concatenated ILTV sequence. Notably, the concatenated ILTV sequence was found to lack four large regions of sequence, including 528 bp and 594 bp of sequence in the UL29 and UL36 genes, respectively, and two copies of a 1,563 bp sequence in the repeat regions. Considerable differences in the size of the predicted translation products of 4 other genes (UL54, UL30, UL37 and UL38) were also identified. More than 530 single-nucleotide polymorphisms (SNPs) were identified. Most SNPs were located within three genomic regions, corresponding to sequence from the SA-2 ILTV vaccine strain in the concatenated ILTV sequence. Conclusions This is the first complete genomic sequence of an individual ILTV strain. This sequence will facilitate future comparative genomic studies of ILTV by providing an appropriate reference sequence for the sequence analysis of other ILTV strains.

  4. First complete genome sequence of infectious laryngotracheitis virus

    OpenAIRE

    Ficorilli Nino P; Browning Glenn F; Petermann Ivonne; Noormohammadi Amir H; Markham John F; Markham Philip F; Lee Sang-Won; Hartley Carol A; Devlin Joanne M

    2011-01-01

    Abstract Background Infectious laryngotracheitis virus (ILTV) is an alphaherpesvirus that causes acute respiratory disease in chickens worldwide. To date, only one complete genomic sequence of ILTV has been reported. This sequence was generated by concatenating partial sequences from six different ILTV strains. Thus, the full genomic sequence of a single (individual) strain of ILTV has not been determined previously. This study aimed to use high throughput sequencing technology to determine t...

  5. Coevolution between simple sequence repeats (SSRs) and virus genome size

    Directory of Open Access Journals (Sweden)

    Zhao Xiangyan

    2012-08-01

    Full Text Available Abstract Background The relationship between the level of repetitiveness in genomic sequence and genome size has been investigated using complete prokaryotic and eukaryotic genomes, but has rarely been studied in virus genomes. Results In this study, a total of 257 viruses were examined, covering 90% of genera. The results showed that simple sequence repeats (SSRs) are strongly, positively and significantly correlated with genome size. Each repeat class is distributed within a certain range of genome sequence length. Mono-, di- and tri- repeats are widely distributed in all virus genomes; tetra- SSRs are a common component of genomes more than 100 kb in size; in the range of genome  Conclusions We conducted this research at the level of viruses as a whole. We concluded that genome size is an important factor affecting the occurrence of SSRs; hosts are also responsible for the variance in SSR content to a certain degree.

  6. Whole Genome Sequencing: Innovation Dream or Privacy Nightmare?

    OpenAIRE

    De Cristofaro, Emiliano

    2012-01-01

    Over the past several years, DNA sequencing has emerged as one of the driving forces in life-sciences, paving the way for affordable and accurate whole genome sequencing. As genomes represent the entirety of an organism's hereditary information, the availability of complete human genomes prompts a wide range of revolutionary applications. The hope for improving modern healthcare and better understanding the human genome propels many interesting and challenging research frontiers. Unfortunatel...

  7. Draft Genome Sequences of Klebsiella variicola Plant Isolates

    OpenAIRE

    Martínez-Romero, Esperanza; Silva-Sanchez, Jesús; Barrios, Humberto; Rodríguez-Medina, Nadia; Martínez-Barnetche, Jesús; Téllez-Sosa, Juan; Gómez-Barreto, Rosa Elena; Garza-Ramos, Ulises

    2015-01-01

    Three endophytic Klebsiella variicola isolates—T29A, 3, and 6A2, obtained from sugar cane stem, maize shoots, and banana leaves, respectively—were used for whole-genome sequencing. Here, we report the draft genome sequences of circular chromosomes and plasmids. The genomes contain plant colonization and cellulases genes. This study will help toward understanding the genomic basis of K. variicola interaction with plant hosts.

  8. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database.

    Science.gov (United States)

    Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla; Demeter, Janos; Engel, Stacia; Hellerstedt, Sage T; Karra, Kalpana; Hitz, Benjamin C; Nash, Robert S; Paskov, Kelley; Sheppard, Travis; Skrzypek, Marek; Weng, Shuai; Wong, Edith; Michael Cherry, J

    2016-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains are now available on the Sequence and Protein tab of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences. Database URL: www.yeastgenome.org. PMID:27252399

  9. Next-generation sequencing and large genome assemblies

    OpenAIRE

    Henson, Joseph; Tischler, German; Ning, Zemin

    2012-01-01

    The next-generation sequencing (NGS) revolution has drastically reduced time and cost requirements for sequencing of large genomes, and also qualitatively changed the problem of assembly. This article reviews the state of the art in de novo genome assembly, paying particular attention to mammalian-sized genomes. The strengths and weaknesses of the main sequencing platforms are highlighted, leading to a discussion of assembly and the new challenges associated with NGS data. Current approaches ...

  10. Nuclear hBD-1 accumulation in malignant salivary gland tumours.

    NARCIS (Netherlands)

    Wenghoefer, M.H.; Pantelis, A.; Dommisch, H.; Gotz, W.; Reich, R.; Berge, S.; Martini, M.; Allam, J.P.; Jepsen, S.; Merkelbach-Bruse, S.; Fischer, H.P.; Novak, N.; Winter, J.

    2008-01-01

    BACKGROUND: Whereas the antimicrobial peptides hBD-2 and -3 are related to inflammation, the constitutively expressed hBD-1 might function as 8p tumour suppressor gene and thus play a key role in control of transcription and induction of apoptosis in malignant epithelial tumours. Therefore this stud

  11. Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum

    OpenAIRE

    Grativol, Clícia; Regulski, Michael; Bertalan, Marcelo; McCombie, W Richard; da Silva, Felipe Rodrigues; Neto, Adhemar Zerlotini; Vicentini, Renato; Farinelli, Laurent; Hemerly, Adriana Silva; Martienssen, Robert A; Ferreira, Paulo Cavalcanti Gomes

    2014-01-01

    Many economically important crops have large and complex genomes, which hampers sequencing of their genomes by standard methods such as WGS. Large tracts of methylated repeats occur in plant genomes, interspersed with hypomethylated gene-rich regions. Gene enrichment strategies based on methylation profile offer an alternative to sequencing repetitive genomes. Here, we have applied methyl filtration (MF) with McrBC digestion to enrich for euchromatic regions of the sugarcane genome. To verify the eff...

  12. Insights from twenty years of bacterial genome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Jun, Se Ran [ORNL; Nookaew, Intawat [ORNL; Leuze, Michael Rex [ORNL; Ahn, Tae-Hyuk [ORNL; Karpinets, Tatiana V [ORNL; Lund, Ole [Technical University of Denmark; Kora, Guruprasad H [ORNL; Wassenaar, Trudy [Molecular Microbiology & Genomics Consultants, Zotzenheim, Germany; Poudel, Suresh [ORNL; Ussery, David W [ORNL

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90 % of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Why do we care about bacterial genome
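
    The pan- and core-genome calculation mentioned in this abstract reduces to set operations once each genome has been summarised as a set of gene-family identifiers. The Python sketch below assumes that clustering step has already been done; the function name, strain labels and family identifiers are hypothetical and are not drawn from the review.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan, core): families found in any genome and in every genome."""
    families = list(genomes.values())
    if not families:
        raise ValueError("at least one genome is required")
    pan = set().union(*families)          # union over all genomes
    core = set.intersection(*families)    # families shared by all genomes
    return pan, core


if __name__ == "__main__":
    # Toy input: strain name -> gene-family identifiers (made-up labels).
    toy = {
        "strain_A": {"fam1", "fam2", "fam3"},
        "strain_B": {"fam1", "fam2", "fam4"},
        "strain_C": {"fam1", "fam2", "fam5", "fam6"},
    }
    pan, core = pan_and_core(toy)
    print(f"pan-genome: {len(pan)} families; core genome: {len(core)} families")
```

    As more genomes are added, the core can only shrink and the pan-genome can only grow, which is consistent with the large gap between the two E. coli figures quoted above.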

  13. Genome Project Standards in a New Era of Sequencing

    Energy Technology Data Exchange (ETDEWEB)

    GSC Consortia; HMP Jumpstart Consortia; Chain, P. S. G.; Grafham, D. V.; Fulton, R. S.; FitzGerald, M. G.; Hostetler, J.; Muzny, D.; Detter, J. C.; Ali, J.; Birren, B.; Bruce, D. C.; Buhay, C.; Cole, J. R.; Ding, Y.; Dugan, S.; Field, D.; Garrity, G. M.; Gibbs, R.; Graves, T.; Han, C. S.; Harrison, S. H.; Highlander, S.; Hugenholtz, P.; Khouri, H. M.; Kodira, C. D.; Kolker, E.; Kyrpides, N. C.; Lang, D.; Lapidus, A.; Malfatti, S. A.; Markowitz, V.; Metha, T.; Nelson, K. E.; Parkhill, J.; Pitluck, S.; Qin, X.; Read, T. D.; Schmutz, J.; Sozhamannan, S.; Strausberg, R.; Sutton, G.; Thomson, N. R.; Tiedje, J. M.; Weinstock, G.; Wollam, A.

    2009-06-01

    For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft'; however, these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1), hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences. The following represents a set of six community-defined categories of genome sequence standards that better

  14. Genome Wide Characterization of Simple Sequence Repeats in Cucumber

    Science.gov (United States)

    The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...

  15. Why size really matters when sequencing plant genomes

    Czech Academy of Sciences Publication Activity Database

    Kelly, L.J.; Leitch, A.R.; Fay, M. F.; Renny-Byfield, S.; Pellicer, J.; Macas, Jiří; Leitch, I.J.

    2012-01-01

    Vol. 5, No. 4 (2012), pp. 415-425. ISSN 1755-0874 Institutional research plan: CEZ:AV0Z50510513 Institutional support: RVO:60077344 Keywords : C-value * genome assembly * genome size evolution * genome sequencing Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 0.924, year: 2012

  16. Genome Sequence of Mushroom Soft-Rot Pathogen Janthinobacterium agaricidamnosum.

    Science.gov (United States)

    Graupner, Katharina; Lackner, Gerald; Hertweck, Christian

    2015-01-01

    Janthinobacterium agaricidamnosum causes soft-rot disease of the cultured button mushroom Agaricus bisporus and is thus responsible for agricultural losses. Here, we present the genome sequence of J. agaricidamnosum DSM 9628. The 5.9-Mb genome harbors several secondary metabolite biosynthesis gene clusters, which renders this neglected bacterium a promising source for genome mining approaches. PMID:25883287

  17. Genome Sequence of Mushroom Soft-Rot Pathogen Janthinobacterium agaricidamnosum

    OpenAIRE

    Graupner, Katharina; Lackner, Gerald; Hertweck, Christian

    2015-01-01

    Janthinobacterium agaricidamnosum causes soft-rot disease of the cultured button mushroom Agaricus bisporus and is thus responsible for agricultural losses. Here, we present the genome sequence of J. agaricidamnosum DSM 9628. The 5.9-Mb genome harbors several secondary metabolite biosynthesis gene clusters, which renders this neglected bacterium a promising source for genome mining approaches.

  18. Nucleotide sequence and genome organization of carnation mottle virus RNA.

    OpenAIRE

    Guilley, H; Carrington, J C; Balàzs, E; Jonard, G; Richards, K; Morris, T J

    1985-01-01

    The complete nucleotide sequence of carnation mottle genomic RNA (4003 nucleotides) is presented. The sequence was determined for cloned cDNA copies of viral RNA containing over 99% of the sequence and was completed by direct sequence analysis of RNA and cDNA transcripts. The sequence contains two long open reading frames which together can account for observed translation products. One translation product would arise by suppression of an amber termination codon and the sequence raises the po...

  19. Sequence resources at the Candida Genome Database

    OpenAIRE

    Arnaud, Martha B.; Costanzo, Maria C.; Skrzypek, Marek S.; Shah, Prachi; Binkley, Gail; Lane, Christopher; Miyasato, Stuart R.; Sherlock, Gavin

    2006-01-01

    The Candida Genome Database (CGD) contains a curated collection of genomic information and community resources for researchers who are interested in the molecular biology of the opportunistic pathogen Candida albicans. With the recent release of a new assembly of the C. albicans genome, Assembly 20, C. albicans genomics has entered a new era. Although the C. albicans genome assembly continues to undergo refinement, multiple assemblies and gene nomenclatures will remain in widespread use by the...

  20. Complete Genome Sequence of the Human Gut Symbiont Roseburia hominis

    DEFF Research Database (Denmark)

    Travis, Anthony J.; Kelly, Denise; Flint, Harry J; Aminov, Rustam

    2015-01-01

    We report here the complete genome sequence of the human gut symbiont Roseburia hominis A2-183(T) (= DSM 16839(T) = NCIMB 14029(T)), isolated from human feces. The genome is represented by a 3,592,125-bp chromosome with 3,405 coding sequences. A number of potential functions contributing to host...

  1. Draft Genome Sequence of the Fish Pathogen Piscirickettsia salmonis

    OpenAIRE

    Eppinger, Mark; McNair, Katelyn; Zogaj, Xhavit; Dinsdale, Elizabeth A.; Edwards, Robert A.; Klose, Karl E.

    2013-01-01

    Piscirickettsia salmonis is a Gram-negative intracellular fish pathogen that has a significant impact on the salmon industry. Here, we report the genome sequence of P. salmonis strain LF-89. This is the first draft genome sequence of P. salmonis, and it reveals interesting attributes, including flagellar genes, despite this bacterium being considered nonmotile.

  2. Draft Genome Sequence of the Fish Pathogen Piscirickettsia salmonis.

    Science.gov (United States)

    Eppinger, Mark; McNair, Katelyn; Zogaj, Xhavit; Dinsdale, Elizabeth A; Edwards, Robert A; Klose, Karl E

    2013-01-01

    Piscirickettsia salmonis is a Gram-negative intracellular fish pathogen that has a significant impact on the salmon industry. Here, we report the genome sequence of P. salmonis strain LF-89. This is the first draft genome sequence of P. salmonis, and it reveals interesting attributes, including flagellar genes, despite this bacterium being considered nonmotile. PMID:24201203

  3. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.;

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...

  4. Complete genome sequence of ‘Candidatus Liberibacter africanus’

    Science.gov (United States)

    The complete genome sequence of ‘Candidatus Liberibacter africanus’ (Laf), strain ptsapsy, was obtained using an Illumina HiSeq 2000. The Laf genome comprises 1,192,232 nucleotides, 34.5% GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S and 5S) ...

  5. Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii

    OpenAIRE

    Siozios, Stefanos; Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

    2013-01-01

    Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named “wSuzi” that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii.

  6. Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii.

    Science.gov (United States)

    Siozios, Stefanos; Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

    2013-01-01

    Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named "wSuzi" that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

  7. Draft Genome Sequence of Klebsiella pneumoniae Isolate PR04

    OpenAIRE

    Zulkifli, M. H.; Teh, L. K.; Lee, L. S.; Zakaria, Z. A.; Salleh, M. Z.

    2013-01-01

    Klebsiella pneumoniae PR04 was isolated from a patient hospitalized in Malaysia. The draft genome sequence of K. pneumoniae PR04 shows differences compared to the reference sequences of K. pneumoniae strains MGH 78578 and NTUH-K2044 in terms of their genomic structures.

  8. The carrot genome sequence brings colors out of the dark.

    Science.gov (United States)

    Garcia-Mas, Jordi; Rodriguez-Concepcion, Manuel

    2016-05-27

    The genome sequence of carrot (Daucus carota L.) is the first completed for an Apiaceae species, furthering knowledge of the evolution of the important euasterid II clade. Analyzing the whole-genome sequence allowed for the identification of a gene that may regulate the accumulation of carotenoids in the root. PMID:27230684

  9. Complete Genome Sequences of Five Paenibacillus larvae Bacteriophages.

    Science.gov (United States)

    Sheflo, Michael A; Gardner, Adam V; Merrill, Bryan D; Fisher, Joshua N B; Lunt, Bryce L; Breakwell, Donald P; Grose, Julianne H; Burnett, Sandra H

    2013-01-01

    Paenibacillus larvae is a pathogen of honeybees that causes American foulbrood (AFB). We isolated bacteriophages from soil containing bee debris collected near beehives in Utah. We announce five high-quality complete genome sequences, which represent the first completed genome sequences submitted to GenBank for any P. larvae bacteriophage. PMID:24233582

  10. Genome sequence of Kocuria palustris strain W4

    DEFF Research Database (Denmark)

    Herschend, Jakob; Raghupathi, Prem Krishnan; Røder, Henriette Lyng;

    2016-01-01

    We report the 3.09 Mb draft genome sequence of Kocuria palustris W4, isolated from a slaughterhouse in Denmark.

  11. Nearly Complete Genome Sequence of Lactobacillus plantarum Strain NIZO2877

    NARCIS (Netherlands)

    Martino, M.E.; Bayjanov, J.R.; Joncour, P.; Hughes, S.; Gillet, B.; Kleerebezem, M; Siezen, R.; Hijum, S.A.F.T. van; Leulier, F.

    2015-01-01

    Lactobacillus plantarum is a versatile bacterial species that is isolated mostly from foods. Here, we present the first genome sequence of L. plantarum strain NIZO2877 isolated from a hot dog in Vietnam. Its two contigs represent a nearly complete genome sequence.

  12. On the current status of Phakopsora pachyrhizi genome sequencing

    Directory of Open Access Journals (Sweden)

    Marco Loehrer

    2014-08-01

    Full Text Available Recent advances in the field of sequencing technologies and bioinformatics allow more rapid access to genomes of non-model organisms at falling costs. Accordingly, draft genomes of several economically important cereal rust fungi have been released in the last three years. Aside from the very recent flax rust and poplar rust draft assemblies, there are no genomic data available for other dicot-infecting rust fungi. In this article we outline rust fungus sequencing efforts and comment on the current status of Phakopsora pachyrhizi (Asian soybean rust) genome sequencing.

  13. Unexpected cross-species contamination in genome sequencing projects

    Directory of Open Access Journals (Sweden)

    Samier Merchant

    2014-11-01

    Full Text Available The raw data from a genome sequencing project sometimes contains DNA from contaminating organisms, which may be introduced during sample collection or sequence preparation. In some instances, these contaminants remain in the sequence even after assembly and deposition of the genome into public databases. As a result, searches of these databases may yield erroneous and confusing results. We used efficient microbiome analysis software to scan the draft assembly of domestic cow, Bos taurus, and identify 173 small contigs that appeared to derive from microbial contaminants. In the course of verifying these findings, we discovered that one genome, Neisseria gonorrhoeae TCDC-NG08107, although putatively a complete genome, contained multiple sequences that actually derived from the cow and sheep genomes. Our findings illustrate the need to carefully validate findings of anomalous DNA that rely on comparisons to either draft or finished genomes.

  14. Minimum taxonomic criteria for bacterial genome sequence depositions and announcements.

    Science.gov (United States)

    Bull, Matthew J; Marchesi, Julian R; Vandamme, Peter; Plummer, Sue; Mahenthiralingam, Eshwar

    2012-04-01

    Multiple bioinformatic methods are available to analyse the information encoded within the complete genome sequence of a bacterium and accurately assign its species status or nearest phylogenetic neighbour. However, it is clear that even now in what is the third decade of bacterial genomics, taxonomically incorrect genome sequence depositions are still being made. We outline a simple scheme of bioinformatic analysis and a set of minimum criteria that should be applied to all bacterial genomic data to ensure that they are accurately assigned to the species or genus level prior to database deposition. To illustrate the utility of the bioinformatic workflow, we analysed the recently deposited genome sequence of Lactobacillus acidophilus 30SC and demonstrated that this DNA was in fact derived from a strain of Lactobacillus amylovorus. Using these methods researchers can ensure that the taxonomic accuracy of genome sequence depositions is maintained within the ever increasing nucleic acid datasets. PMID:22366464

  15. Genome sequencing and annotation of Serratia sp. strain TEL.

    Science.gov (United States)

    Lephoto, Tiisetso E; Gray, Vincent M

    2015-12-01

    We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000. PMID:26697332

  16. Comparative Copy Number Variation From Whole Genome Sequencing

    OpenAIRE

    Janevski, A.; Varadan, V.; Kamalakaran, S.; Banerjee, N.; Dimitrova, D

    2011-01-01

    Whole genome sequencing enables a high resolution view of the human genome and enables unique insights into copy number variations on an unprecedented scale. Numerous tools and studies have already been introduced that provide confirmatory and new genomic variability data in individuals and across populations. We investigate two such methods, CNV-seq and FREEC and compare their outputs when applied to five whole genome sequences representing four populations. We focus on the ability of these tool...

  17. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby;

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome...... association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function....

  18. Marsupial Genome Sequences: Providing Insight into Evolution and Disease

    OpenAIRE

    Deakin, Janine E.

    2012-01-01

    Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be seq...

  19. Whole-genome sequencing in bacteriology: state of the art

    OpenAIRE

    Dark, Michael

    2013-01-01

    Michael J Dark, Department of Infectious Diseases and Pathology and Emerging Pathogens Institute, University of Florida, Gainesville, FL, USA. Abstract: Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and b...

  20. Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities

    OpenAIRE

    Chen, Kevin; Pachter, Lior

    2005-01-01

    The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fe...

  1. Perspectives of Integrative Cancer Genomics in Next Generation Sequencing Era

    OpenAIRE

    Kwon, So Mee; Cho, Hyunwoo; Choi, Ji Hye; Jee, Byul A; Jo, Yuna; Woo, Hyun Goo

    2012-01-01

    The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of the genetic aberrations could reveal the candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establis...

  2. Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi

    Energy Technology Data Exchange (ETDEWEB)

    Schutzer S. E.; Dunn J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

    2011-02-01

    Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain B31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

  3. Generation of physical map contig-specific sequences useful for whole genome sequence scaffolding.

    Directory of Open Access Journals (Sweden)

    Yanliang Jiang

    Full Text Available Along with the rapid advances of next-generation sequencing technologies, more and more species are added to the list of organisms whose whole genomes are sequenced. However, the assembled draft genome of many organisms consists of numerous small contigs, due to the short length of the reads generated by next-generation sequencing platforms. In order to improve the assembly and bring the genome contigs together, more genome resources are needed. In this study, we developed a strategy to generate a valuable genome resource, physical map contig-specific sequences, which are randomly distributed genome sequences in each physical contig. A two-dimensional tagging method was used to create specific tags for 1,824 physical contigs, which dramatically reduced the cost. A total of 94,111,841 100-bp reads and 315,277 assembled contigs were identified as containing physical map contig-specific tags. The physical map contig-specific sequences, along with the currently available BAC end sequences, were then used to anchor the catfish draft genome contigs. A total of 156,457 genome contigs (~79% of the whole genome sequencing assembly) were anchored and grouped into 1,824 pools, in which 16,680 unique genes were annotated. The physical map contig-specific sequences are valuable resources to link the physical map, genetic linkage map and draft whole genome sequences, and consequently have the capability to improve whole genome sequence assembly and scaffolding, as well as genome-wide comparative analysis. The strategy developed in this study could also be adopted in other species whose whole genome assembly is still facing a challenge.

  4. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    OpenAIRE

    Tettelin, Hervé; Masignani, Vega; Cieslewicz, Michael J.; Eisen, Jonathan A.; Peterson, Scott; Wessels, Michael R.; Paulsen, Ian T.; Nelson, Karen E.; Margarit, Immaculada; Read, Timothy D.; Madoff, Lawrence C.; Wolf, Alex M.; Beanan, Maureen J; Brinkac, Lauren M.; Daugherty, Sean C

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the other completely sequenced genomes identified genes specific to the streptococci and to S. agalactiae. These in silico analyses, combined with comparative genome hybridization experiments between the ...

  5. Progress in Understanding and Sequencing the Genome of Brassica rapa

    OpenAIRE

    Hong, Chang Pyo; Kwon, Soo-Jin; Kim, Jung Sun; Yang, Tae-Jin; Park, Beom-Seok; Lim, Yong Pyo

    2008-01-01

    Brassica rapa, which is closely related to Arabidopsis thaliana, is an important crop and a model plant for studying genome evolution via polyploidization. We report the current understanding of the genome structure of B. rapa and efforts for the whole-genome sequencing of the species. The tribe Brassiceae, which comprises ca. 240 species, descended from a common hexaploid ancestor with a basic genome similar to that of Arabidopsis. Chromosome rearrangements, including fusions and/or fissio...

  6. Complete Genome Sequence of Probiotic Strain Lactobacillus acidophilus La-14.

    Science.gov (United States)

    Stahl, Buffy; Barrangou, Rodolphe

    2013-01-01

    We present the 1,991,830-bp complete genome sequence of Lactobacillus acidophilus strain La-14 (SD-5212). Comparative genomic analysis revealed 99.98% similarity overall to the L. acidophilus NCFM genome. Globally, 111 single nucleotide polymorphisms (SNPs) (95 SNPs, 16 indels) were observed throughout the genome. Also, a 416-bp deletion in the LA14_1146 sugar ABC transporter was identified. PMID:23788546

  7. Complete Genome Sequence of Probiotic Strain Lactobacillus acidophilus La-14

    OpenAIRE

    Stahl, Buffy; Barrangou, Rodolphe

    2013-01-01

    We present the 1,991,830-bp complete genome sequence of Lactobacillus acidophilus strain La-14 (SD-5212). Comparative genomic analysis revealed 99.98% similarity overall to the L. acidophilus NCFM genome. Globally, 111 single nucleotide polymorphisms (SNPs) (95 SNPs, 16 indels) were observed throughout the genome. Also, a 416-bp deletion in the LA14_1146 sugar ABC transporter was identified.

  8. Analysis of the Thermotoga maritima genome combining a variety of sequence similarity and genome context tools

    OpenAIRE

    Kyrpides, Nikos C; Ouzounis, Christos A; Iliopoulos, Ioannis; Vonstein, Veronika; Overbeek, Ross

    2000-01-01

    The proliferation of genome sequence data has led to the development of a number of tools and strategies that facilitate computational analysis. These methods include the identification of motif patterns, membership of the query sequences in family databases, metabolic pathway involvement and gene proximity. We re-examined the completely sequenced genome of Thermotoga maritima by employing the combined use of the above methods. By analyzing all 1877 proteins encoded in this genome, we identif...

  9. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    Energy Technology Data Exchange (ETDEWEB)

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

  10. Real-time, portable genome sequencing for Ebola surveillance.

    Science.gov (United States)

    Quick, Joshua; Loman, Nicholas J; Duraffour, Sophie; Simpson, Jared T; Severi, Ettore; Cowley, Lauren; Bore, Joseph Akoi; Koundouno, Raymond; Dudas, Gytis; Mikhail, Amy; Ouédraogo, Nobila; Afrough, Babak; Bah, Amadou; Baum, Jonathan H J; Becker-Ziaja, Beate; Boettcher, Jan Peter; Cabeza-Cabrerizo, Mar; Camino-Sánchez, Álvaro; Carter, Lisa L; Doerrbecker, Juliane; Enkirch, Theresa; García-Dorival, Isabel; Hetzelt, Nicole; Hinzmann, Julia; Holm, Tobias; Kafetzopoulou, Liana Eleni; Koropogui, Michel; Kosgey, Abigael; Kuisma, Eeva; Logue, Christopher H; Mazzarelli, Antonio; Meisel, Sarah; Mertens, Marc; Michel, Janine; Ngabo, Didier; Nitzsche, Katja; Pallasch, Elisa; Patrono, Livia Victoria; Portmann, Jasmine; Repits, Johanna Gabriella; Rickett, Natasha Y; Sachse, Andreas; Singethan, Katrin; Vitoriano, Inês; Yemanaberhan, Rahel L; Zekeng, Elsa G; Racine, Trina; Bello, Alexander; Sall, Amadou Alpha; Faye, Ousmane; Faye, Oumar; Magassouba, N'Faly; Williams, Cecelia V; Amburgey, Victoria; Winona, Linda; Davis, Emily; Gerlach, Jon; Washington, Frank; Monteil, Vanessa; Jourdain, Marine; Bererd, Marion; Camara, Alimou; Somlare, Hermann; Camara, Abdoulaye; Gerard, Marianne; Bado, Guillaume; Baillet, Bernard; Delaune, Déborah; Nebie, Koumpingnin Yacouba; Diarra, Abdoulaye; Savane, Yacouba; Pallawo, Raymond Bernard; Gutierrez, Giovanna Jaramillo; Milhano, Natacha; Roger, Isabelle; Williams, Christopher J; Yattara, Facinet; Lewandowski, Kuiama; Taylor, James; Rachwal, Phillip; Turner, Daniel J; Pollakis, Georgios; Hiscox, Julian A; Matthews, David A; O'Shea, Matthew K; Johnston, Andrew McD; Wilson, Duncan; Hutley, Emma; Smit, Erasmus; Di Caro, Antonino; Wölfel, Roman; Stoecker, Kilian; Fleischmann, Erna; Gabriel, Martin; Weller, Simon A; Koivogui, Lamine; Diallo, Boubacar; Keïta, Sakoba; Rambaut, Andrew; Formenty, Pierre; Günther, Stephan; Carroll, Miles W

    2016-02-11

    The Ebola virus disease epidemic in West Africa is the largest on record, responsible for over 28,599 cases and more than 11,299 deaths. Genome sequencing in viral outbreaks is desirable to characterize the infectious agent and determine its evolutionary rate. Genome sequencing also allows the identification of signatures of host adaptation, identification and monitoring of diagnostic targets, and characterization of responses to vaccines and treatments. The Ebola virus (EBOV) genome substitution rate in the Makona strain has been estimated at between 0.87 × 10⁻³ and 1.42 × 10⁻³ mutations per site per year. This is equivalent to 16-27 mutations in each genome, meaning that sequences diverge rapidly enough to identify distinct sub-lineages during a prolonged epidemic. Genome sequencing provides a high-resolution view of pathogen evolution and is increasingly sought after for outbreak surveillance. Sequence data may be used to guide control measures, but only if the results are generated quickly enough to inform interventions. Genomic surveillance during the epidemic has been sporadic owing to a lack of local sequencing capacity coupled with practical difficulties transporting samples to remote sequencing facilities. To address this problem, here we devise a genomic surveillance system that utilizes a novel nanopore DNA sequencing instrument. In April 2015 this system was transported in standard airline luggage to Guinea and used for real-time genomic surveillance of the ongoing epidemic. We present sequence data and analysis of 142 EBOV samples collected during the period March to October 2015. We were able to generate results less than 24 h after receiving an Ebola-positive sample, with the sequencing process taking as little as 15-60 min. We show that real-time genomic surveillance is possible in resource-limited settings and can be established rapidly to monitor outbreaks. PMID:26840485
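
    The step from a per-site substitution rate to "16-27 mutations in each genome" can be checked with a short calculation. The sketch below assumes a genome length of 18,959 nucleotides (an approximate EBOV genome size that is not stated in the abstract) and a one-year interval; the function name `expected_substitutions` is hypothetical.

```python
# Assumption: approximate EBOV genome length in nucleotides. The abstract does
# not state this value; it is supplied here only to reproduce the arithmetic.
GENOME_LENGTH_NT = 18_959


def expected_substitutions(rate_per_site_per_year,
                           genome_length=GENOME_LENGTH_NT,
                           years=1.0):
    """Expected substitutions per genome = rate x genome length x elapsed time."""
    return rate_per_site_per_year * genome_length * years


if __name__ == "__main__":
    low = expected_substitutions(0.87e-3)   # lower rate estimate from the abstract
    high = expected_substitutions(1.42e-3)  # upper rate estimate from the abstract
    print(f"about {low:.1f} to {high:.1f} substitutions per genome per year")
```

    Under these assumptions the two bounds round to 16 and 27, which matches the range quoted in the abstract if that range is read as a per-year figure.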

  11. Determining and comparing protein function in Bacterial genome sequences

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla

    predictions were made in about 60% of the cases. This project has highlighted the difficulties and challenges in functional annotation and computational analysis of sequence data. It has provided possible solutions for creating reproducible pipelines for comparative genomics as well as constructed a number of......In November 2013, there were around 21,000 different prokaryotic genomes sequenced and publicly available, and the number is growing daily with another 20,000 or more genomes expected to be sequenced and deposited by the end of 2014. An important part of the analysis of this data is the functional...... known functions. This thesis describes the development of new tools for comparative functional annotation and a system for comparative genomics in general. As novel sequenced genomes are becoming more readily available, there is a need for standard analysis tools. The system CMG-biotools is presented...

  12. Marsupial genome sequences: providing insight into evolution and disease.

    Science.gov (United States)

    Deakin, Janine E

    2012-01-01

    Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be sequenced in the near future, making this a particularly exciting time in marsupial genomics. The emergence of a transmissible cancer, which is obliterating the Tasmanian devil population, has increased the importance of obtaining and analysing marsupial genome sequence for understanding such diseases as well as for conservation efforts. In addition, these genome sequences have facilitated studies aimed at answering questions regarding gene and genome evolution and provided insight into the evolution of epigenetic mechanisms. Here I highlight the major advances in our understanding of evolution and disease, facilitated by marsupial genome projects, and speculate on the future contributions to be made by such sequences. PMID:24278712

  13. Draft genome sequence of Enterococcus faecium strain LMG 8148.

    Science.gov (United States)

    Michiels, Joran E; Van den Bergh, Bram; Fauvart, Maarten; Michiels, Jan

    2016-01-01

    Enterococcus faecium, traditionally considered a harmless gut commensal, is emerging as an important nosocomial pathogen showing increasing rates of multidrug resistance. We report the draft genome sequence of E. faecium strain LMG 8148, isolated in 1968 from a human in Gothenburg, Sweden. The draft genome has a total length of 2,697,490 bp, a GC-content of 38.3 %, and 2,402 predicted protein-coding sequences. The isolation of this strain predates the emergence of E. faecium as a nosocomial pathogen. Consequently, its genome can be useful in comparative genomic studies investigating the evolution of E. faecium as a pathogen. PMID:27610213
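
    Summary figures such as total length and GC content are simple to recompute from an assembly. The following is a minimal sketch, not the method used by the authors; the function name `gc_content` is hypothetical, and the choice to ignore ambiguous bases such as N is an assumption of this example.

```python
def gc_content(sequence):
    """Return GC content as a percentage of the unambiguous (A/C/G/T) bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    # Ambiguous symbols (for example N) are left out of the denominator.
    return 100.0 * gc / (gc + at)


if __name__ == "__main__":
    contigs = ["ATGCGCNNAT", "GGCCATAT"]  # toy contigs, not real assembly data
    total_length = sum(len(c) for c in contigs)
    print(f"total length: {total_length} bp")
    print(f"GC content: {gc_content(''.join(contigs)):.1f} %")
```

    For a multi-contig draft genome, the contigs would typically be concatenated (or their base counts summed) before taking the ratio, as the usage lines illustrate.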

  14. Complete genome sequence of Acidimicrobium ferrooxidans type strain (ICPT)

    Energy Technology Data Exchange (ETDEWEB)

    Clum, Alicia; Nolan, Matt; Lang, Elke; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Goker, Markus; Spring, Stefan; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

    2009-05-20

    Acidimicrobium ferrooxidans (Clark and Norris 1996) is the sole and type species of the genus, which until recently was the only genus within the actinobacterial family Acidimicrobiaceae and in the order Acidimicrobiales. Rapid oxidation of iron pyrite during autotrophic growth in the absence of an enhanced CO2 concentration is characteristic for A. ferrooxidans. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of the order Acidimicrobiales, and the 2,158,157 bp long single replicon genome with its 2038 protein coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  15. Puzzling sequences: studying microbial genomes from 'Ötzi'

    International Nuclear Information System (INIS)

    Ancient remains, and mummies in particular, are of central value for archaeological research. The Tyrolean iceman “Ötzi” was conserved in a glacier of the Ötztal Alps about 5000 years ago. Aside from morphological and phenotypical classification, the determination of DNA sequences and the subsequent genome analyses were first applied to mitochondrial DNA and then extended to genomic DNA. Typically, ancient microbial DNA is also sequenced. These sequences allow the identification of pathogens as well as the study of the evolution of microorganisms. The talk will explain the metagenomic aspects of the “Ötzi” genome project and discuss the first results. (author)

  16. Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

    Directory of Open Access Journals (Sweden)

    Martijn Staats

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2 and 71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylogenetic, demographic and genetic) hypotheses, that become increasingly more

  17. BAC-pool 454-sequencing: A rapid and efficient approach to sequence complex tetraploid cotton genomes

    Science.gov (United States)

    New and emerging next generation sequencing technologies have shown promise in reducing sequencing costs, but not significantly for complex polyploid plant genomes such as cotton. The large and highly repetitive genome of G. hirsutum (~2.5 Gb) is less amenable and cost-intensive with traditional BAC-by...

  18. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2012-11-01

    Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the chloroplast genome.

  19. Genome sequencing and analysis of the model grass Brachypodium distachyon.

    Science.gov (United States)

    2010-02-11

    Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops. PMID:20148030

  20. The complete chloroplast genome sequence of Zanthoxylum piperitum.

    Science.gov (United States)

    Lee, Jonghoon; Lee, Hyeon Ju; Kim, Kyunghee; Lee, Sang-Choon; Sung, Sang Hyun; Yang, Tae-Jin

    2016-09-01

    The complete chloroplast genome sequence of Zanthoxylum piperitum, a plant species with useful aromatic oils in the family Rutaceae, was generated in this study by de novo assembly with whole-genome sequence data. The chloroplast genome was 158 154 bp in length with a typical quadripartite structure containing a pair of inverted repeats of 27 644 bp, separated by large single copy and small single copy regions of 85 340 bp and 17 526 bp, respectively. The chloroplast genome harbored 112 genes consisting of 78 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Phylogenetic analysis of the complete chloroplast genome sequences with those of known relatives revealed that Z. piperitum is most closely related to the Citrus species. PMID:26260183
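
    The region lengths quoted in the abstract above can be cross-checked with a little arithmetic: a quadripartite plastome consists of one large single copy region, one small single copy region and two copies of the inverted repeat. The Python sketch below is only an illustration; the function and variable names are hypothetical and not taken from the paper.

```python
# Region lengths (bp) as reported in the abstract; names are illustrative only.
LSC = 85_340  # large single copy region
SSC = 17_526  # small single copy region
IR = 27_644   # one inverted repeat; the structure contains two copies


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular plastome with LSC, SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)

# Gene counts as reported: protein-coding + tRNA + rRNA.
gene_total = 78 + 30 + 4

print(total, gene_total)
```

    Both sums agree with the reported totals of 158 154 bp and 112 genes.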

  1. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies etc. etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  2. Perspectives of integrative cancer genomics in next generation sequencing era.

    Science.gov (United States)

    Kwon, So Mee; Cho, Hyunwoo; Choi, Ji Hye; Jee, Byul A; Jo, Yuna; Woo, Hyun Goo

    2012-06-01

    The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of the genetic aberrations could reveal the candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish the huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of the structured integration, sharing, and interpretation of the big omics data still remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on the study design and analysis strategies. This might be helpful to understand the current trends and strategies of the rapidly evolving cancer genomics research. PMID:23105932

  3. Analysis of Simple Sequence Repeats in Genomes of Rhizobia

    Institute of Scientific and Technical Information of China (English)

    GAO Ya-mei; HAN Yi-qiang; TANG Hui; SUN Dong-mei; WANG Yan-jie; WANG Wei-dong

    2008-01-01

    Simple sequence repeats (SSRs) or microsatellites, as genetic markers, are ubiquitous in genomes of various organisms. The analysis of SSRs in rhizobial genomes provides useful information for a variety of applications in population genetics of rhizobia. We analyzed the occurrences, relative abundance, and relative density of SSRs, the most common in Bradyrhizobium japonicum, Mesorhizobium loti, and Sinorhizobium meliloti genomes sequenced in the microorganisms tandem repeats database, and SSRs in the three species genomes were compared with each other. The result showed that there were 1 410, 859, and 638 SSRs in B. japonicum, M. loti, and S. meliloti genomes, respectively. In the genomes of B. japonicum, M. loti, and S. meliloti, tetranucleotide, pentanucleotide, and hexanucleotide repeats were more abundant and indicated higher mutation rates in these species. The least abundant was the mononucleotide repeat. The SSR types and distributions were similar among these species.
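
    To make the notion of counting SSRs concrete, the following Python sketch scans a sequence for perfect tandem repeats with unit lengths of 1 to 6 bp (mono- to hexanucleotide). It is not the method used in the study above, which drew on a tandem repeats database; the function name, the minimum of three repeat copies and the example sequence are all assumptions chosen for illustration.

```python
import re


def find_ssrs(seq, min_repeats=3, max_unit=6):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    min_repeats and max_unit are illustrative thresholds, not values from
    the study; published SSR surveys each define their own cut-offs.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        # One unit of length k followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are repeats of a shorter motif (e.g. "AA", "ATAT"),
            # so each repeat is reported once under its shortest unit.
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)


example = "GGATATATATGCCAGCAGCAGTT"  # made-up sequence for demonstration
print(find_ssrs(example))
```

    On the made-up example, the scan reports a dinucleotide repeat (AT, four copies) and a trinucleotide repeat (CAG, three copies). Imperfect or interrupted repeats are ignored by this simple pattern.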

  4. Draft Genome Sequences of Gammaproteobacterial Methanotrophs Isolated from Marine Ecosystems.

    Science.gov (United States)

    Flynn, James D; Hirayama, Hisako; Sakai, Yasuyoshi; Dunfield, Peter F; Klotz, Martin G; Knief, Claudia; Op den Camp, Huub J M; Jetten, Mike S M; Khmelenina, Valentina N; Trotsenko, Yuri A; Murrell, J Colin; Semrau, Jeremy D; Svenning, Mette M; Stein, Lisa Y; Kyrpides, Nikos; Shapiro, Nicole; Woyke, Tanja; Bringel, Françoise; Vuilleumier, Stéphane; DiSpirito, Alan A; Kalyuzhnaya, Marina G

    2016-01-01

    The genome sequences of Methylobacter marinus A45, Methylobacter sp. strain BBA5.1, and Methylomarinum vadi IT-4 were obtained. These aerobic methanotrophs are typical members of coastal and hydrothermal vent marine ecosystems. PMID:26798114

  5. Draft Genome Sequences of Gammaproteobacterial Methanotrophs Isolated from Marine Ecosystems

    OpenAIRE

    Flynn, James D.; Hirayama, Hisako; Sakai, Yasuyoshi; Dunfield, Peter F.; Klotz, Martin G.; Knief, Claudia; Op Den Camp, Huub J M; Jetten, Mike S. M.; Khmelenina, Valentina N; Trotsenko, Yuri A.; Murrell, J. Colin; Semrau, Jeremy D.; Svenning, Mette M.; Stein, Lisa Y.; Kyrpides, Nikos

    2016-01-01

    The genome sequences of Methylobacter marinus A45, Methylobacter sp. strain BBA5.1, and Methylomarinum vadi IT-4 were obtained. These aerobic methanotrophs are typical members of coastal and hydrothermal vent marine ecosystems.

  6. Draft Genome Sequences of Gammaproteobacterial Methanotrophs Isolated from Marine Ecosystems

    Science.gov (United States)

    Flynn, James D.; Hirayama, Hisako; Sakai, Yasuyoshi; Dunfield, Peter F.; Knief, Claudia; Op den Camp, Huub J. M.; Jetten, Mike S. M.; Khmelenina, Valentina N.; Trotsenko, Yuri A.; Murrell, J. Colin; Semrau, Jeremy D.; Svenning, Mette M.; Stein, Lisa Y.; Kyrpides, Nikos; Shapiro, Nicole; Woyke, Tanja; Bringel, Françoise; Vuilleumier, Stéphane; DiSpirito, Alan A.

    2016-01-01

    The genome sequences of Methylobacter marinus A45, Methylobacter sp. strain BBA5.1, and Methylomarinum vadi IT-4 were obtained. These aerobic methanotrophs are typical members of coastal and hydrothermal vent marine ecosystems. PMID:26798114

  7. Draft Genome Sequence of Paecilomyces hepiali, Isolated from Cordyceps sinensis.

    Science.gov (United States)

    Yu, Yi; Wang, Wenting; Wang, Linping; Pang, Fang; Guo, Lanping; Song, Lai; Liu, Guiming; Feng, Chengqiang

    2016-01-01

    Paecilomyces hepiali is an endoparasitic fungus that commonly exists in the natural Cordyceps sinensis. Here, we report the draft genome sequence of P. hepiali, which will facilitate the exploitation of medicinal compounds produced by the fungus. PMID:27389266

  8. Draft Genome Sequence of Paecilomyces hepiali, Isolated from Cordyceps sinensis

    Science.gov (United States)

    Yu, Yi; Wang, Wenting; Wang, Linping; Pang, Fang; Guo, Lanping; Song, Lai

    2016-01-01

    Paecilomyces hepiali is an endoparasitic fungus that commonly exists in the natural Cordyceps sinensis. Here, we report the draft genome sequence of P. hepiali, which will facilitate the exploitation of medicinal compounds produced by the fungus. PMID:27389266

  9. First Draft Genome Sequence of a Mycobacterium gordonae Clinical Isolate

    Science.gov (United States)

    Smirnova, T.; Blagodatskikh, K.; Varlamov, D.; Sochivko, D.; Larionova, E.; Andreevskaya, S.; Andrievskaya, I.; Chernousova, L.

    2016-01-01

    Here, we report the first draft genome sequence of the clinically relevant species Mycobacterium gordonae. The clinical isolate Mycobacterium gordonae 14-8773 was obtained from the sputum of a patient with mycobacteriosis. PMID:27365356

  10. Genome Sequence of Bacillus thuringiensis subsp. kurstaki Strain HD-1

    OpenAIRE

    Day, Michael; Ibrahim, Mohamed; Dyer, David; Bulla, Lee

    2014-01-01

    We report here the complete genome sequence of Bacillus thuringiensis subsp. kurstaki strain HD-1, which serves as the primary U.S. reference standard for all commercial insecticidal formulations of B. thuringiensis manufactured around the world.

  11. Bacterial epidemiology and biology - lessons from genome sequencing.

    OpenAIRE

    Parkhill, J.; Wren, BW

    2011-01-01

    Next-generation sequencing has ushered in a new era of microbial genomics, enabling the detailed historical and geographical tracing of bacteria. This is helping to shape our understanding of bacterial evolution.

  12. Seeing chordate evolution through the Ciona genome sequence

    OpenAIRE

    Cañestro, Cristian; Bassham, Susan; Postlethwait, John H.

    2003-01-01

    A draft sequence of the compact genome of the sea squirt Ciona intestinalis, a non-vertebrate chordate that diverged very early from other chordates, including vertebrates, illuminates how chordates originated and how vertebrate developmental innovations evolved.

  13. Complete Genome Sequence of Rahnella aquatilis CIP 78.65

    Energy Technology Data Exchange (ETDEWEB)

    Martinez, Robert J [University of Alabama, Tuscaloosa; Bruce, David [Los Alamos National Laboratory (LANL); Detter, J C [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Held, Brittany [Los Alamos National Laboratory (LANL); Land, Miriam L [ORNL; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Pennacchio, Len [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Sobecky, Patricia A. [University of Alabama, Tuscaloosa

    2012-01-01

    Rahnella aquatilis CIP 78.65 is a gammaproteobacterium isolated from a drinking water source in Lille, France. Here we report the complete genome sequence of Rahnella aquatilis CIP 78.65, the type strain of R. aquatilis.

  14. Cancer Genome Sequencing and Its Implications for Personalized Cancer Vaccines

    International Nuclear Information System (INIS)

    New DNA sequencing platforms have revolutionized human genome sequencing. The dramatic advances in genome sequencing technologies predict that the $1,000 genome will become a reality within the next few years. Applied to cancer, the availability of cancer genome sequences permits real-time decision-making with the potential to affect diagnosis, prognosis, and treatment, and has opened the door towards personalized medicine. A promising strategy is the identification of mutated tumor antigens, and the design of personalized cancer vaccines. Supporting this notion are preliminary analyses of the epitope landscape in breast cancer suggesting that individual tumors express significant numbers of novel antigens to the immune system that can be specifically targeted through cancer vaccines.

  15. Brucella abortus S19 genome sequenced, points toward virulence genes

    OpenAIRE

    Whyte, Barry James

    2008-01-01

    Researchers at the Virginia Bioinformatics Institute at Virginia Tech; the National Animal Disease Center in Ames, Iowa; and collaborators at 454 Life Sciences, Branford, Conn., have sequenced the genome of Brucella abortus strain S19.

  16. Complete Genome Sequence of Mycobacterium phlei Type Strain RIVM601174

    KAUST Repository

    Abdallah, A. M.

    2012-05-24

    Mycobacterium phlei is a rapidly growing nontuberculous Mycobacterium species that is typically nonpathogenic, with few reported cases of human disease. Here we report the whole genome sequence of M. phlei type strain RIVM601174.

  17. Complete Genome Sequences of Six Strains of the Genus Methylobacterium

    Energy Technology Data Exchange (ETDEWEB)

    Marx, Christopher J [Harvard University; Bringel, Francoise O. [University of Strasbourg; Chistoserdova, Ludmila [University of Washington, Seattle; Moulin, Lionel [UMR, France; Ul Haque, Muhammad Farhan [University of Strasbourg; Fleischman, Darrell E. [Wright State University, Dayton, OH; Gruffaz, Christelle [CNRS, Strasbourg, France; Jourand, Philippe [UMR, France; Knief, Claudia [ETH Zurich, Switzerland; Lee, Ming-Chun [Harvard University; Muller, Emilie E. L. [CNRS, Strasbourg, France; Nadalig, Thierry [CNRS, Strasbourg, France; Peyraud, Remi [ETH Zurich, Switzerland; Roselli, Sandro [CNRS, Strasbourg, France; Russ, Lina [ETH Zurich, Switzerland; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Ivanov, Pavel S. [University of Wyoming, Laramie; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lajus, Aurelie [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Land, Miriam L [ORNL; Medigue, Claudine [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Stolyar, Sergey [University of Washington; Vorholt, Julia A. [ETH Zurich, Switzerland; Vuilleumier, Stephane [University of Strasbourg

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  18. Complete genome sequences of six strains of the genus Methylobacterium

    Energy Technology Data Exchange (ETDEWEB)

    Marx, Christopher J [Harvard University; Bringel, Francoise O. [University of Strasbourg; Christoserdova, Ludmila [University of Washington, Seattle; Moulin, Lionel [UMR, France; Farhan Ul Haque, Muhammad [CNRS, Strasbourg, France; Fleischman, Darrell E. [Wright State University, Dayton, OH; Gruffaz, Christelle [CNRS, Strasbourg, France; Jourand, Philippe [UMR, France; Knief, Claudia [ETH Zurich, Switzerland; Lee, Ming-Chun [Harvard University; Muller, Emilie E. L. [CNRS, Strasbourg, France; Nadalig, Thierry [CNRS, Strasbourg, France; Peyraud, Remi [ETH Zurich, Switzerland; Roselli, Sandro [CNRS, Strasbourg, France; Russ, Lina [ETH Zurich, Switzerland; Aguero, Fernan [Universidad Nacional de General San Martin; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lajus, Aurelie [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Land, Miriam L [ORNL; Medigue, Claudine [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Stolyar, Sergey [University of Washington; Vorholt, Julia A. [ETH Zurich, Switzerland; Vuilleumier, Stephane [University of Strasbourg

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  19. Sequencing of Wheat Chromosome 6B: Toward Functional Genomics

    Czech Academy of Sciences Publication Activity Database

    Tanaka, T.; Kobayashi, F.; Joshi, G.P.; Onuki, R.; Šimková, Hana; Nasuda, S.; Doležel, Jaroslav; Ogihara, Y.; Itoh, T.; Handa, H.

    Publisher: Springer, 2015 - (Handa, H.), pp. 111-116 ISBN 978-4-431-55674-9 Institutional support: RVO:61389030 Keywords : Chromosome 6B * Genome sequencing * Marker construction Subject RIV: EB - Genetics ; Molecular Biology

  20. Sequence analysis of the complete mitochondrial genome of Youxian sheldrake.

    Science.gov (United States)

    He, Shao-Ping; Liu, Li-Li; Yu, Qi-Fang; Li, Si; He, Jian-Hua

    2016-01-01

    The Youxian sheldrake is an excellent native breed of Hunan province, China. The complete mitochondrial (mt) genome sequence plays an important role in the accurate determination of phylogenetic relationships among metazoans. This is the first study to determine the complete mitochondrial genome sequence of the Youxian sheldrake using PCR-based amplification and Sanger sequencing. The characteristics of the entire mitochondrial genome were analyzed in detail. The total length of the mitogenome is 16,605 bp, with a base composition of 29.21% A, 22.18% T, 32.84% C and 15.77% G. It contains 2 ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and a major non-coding control region (D-loop region). The complete mitochondrial genome sequence of the Youxian sheldrake provides important data for further study of the phylogenetics of poultry, as well as for genetics and breeding. PMID:25090395
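
    The base composition reported in the abstract above can be checked for internal consistency, and the GC and AT contents follow directly from it. The short Python sketch below does only that; the variable names are illustrative, and the derived GC and AT figures are computed here rather than quoted from the paper.

```python
# Base composition (%) as reported in the abstract.
composition = {"A": 29.21, "T": 22.18, "C": 32.84, "G": 15.77}

# The four percentages should account for the whole mitogenome.
total = round(sum(composition.values()), 2)

# Derived quantities (not stated in the abstract): GC and AT content.
gc = round(composition["G"] + composition["C"], 2)
at = round(composition["A"] + composition["T"], 2)

print(total, gc, at)
```

    The percentages sum to 100, which gives a GC content of 48.61% and an AT content of 51.39% for the 16,605 bp mitogenome.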

  1. Complete genome sequence of Treponema pallidum strain DAL-1

    Science.gov (United States)

    Zobaníková, Marie; Mikolka, Pavol; Čejková, Darina; Pospíšilová, Petra; Chen, Lei; Strouhal, Michal; Qin, Xiang; Weinstock, George M.; Šmajs, David

    2012-01-01

    Treponema pallidum strain DAL-1 is a human uncultivable pathogen causing the sexually transmitted disease syphilis. Strain DAL-1 was isolated from the amniotic fluid of a pregnant woman in the secondary stage of syphilis. Here we describe the 1,139,971 bp long genome of T. pallidum strain DAL-1 which was sequenced using two independent sequencing methods (454 pyrosequencing and Illumina). In rabbits, strain DAL-1 replicated better than the T. pallidum strain Nichols. The comparison of the complete DAL-1 genome sequence with the Nichols sequence revealed a list of genetic differences that are potentially responsible for the increased rabbit virulence of the DAL-1 strain. PMID:23449808

  2. Complete genome sequencing and comparative genomic analysis of functionally diverse Lysinibacillus sphaericus III(3)7.

    Science.gov (United States)

    Rey, Andrés; Silva-Quintero, Laura; Dussán, Jenny

    2016-09-01

    Lysinibacillus sphaericus III(3)7 is a native Colombian strain, the first one isolated from soil samples. This strain has shown high levels of pathogenic activity against Culex quinquefasciatus larvae in laboratory assays compared to other members of the same species. Using Pacific Biosciences sequencing technology we sequenced, annotated (de novo) and described the genome of strain III(3)7, achieving a complete genome sequence status. We then performed a comparative analysis between the newly sequenced genome and the ones previously reported for Colombian isolates L. sphaericus OT4b.31, CBAM5 and OT4b.25, with the inclusion of L. sphaericus C3-41, which has been used as a reference genome for most previous genome sequencing projects. We concluded that L. sphaericus III(3)7 is highly similar to strain OT4b.25 and shares high levels of synteny with isolates CBAM5 and C3-41. PMID:27419068

  3. Analysis of the bread wheat genome using whole-genome shotgun sequencing

    OpenAIRE

    Brenchley, Rachel; Spannagl, Manuel; Pfeifer, Matthias; Barker, Gary L. A.; D'Amore, Rosalinda; Allen, Alexandra M.; McKenzie, Neil; Kramer, Melissa

    2012-01-01

    Bread wheat (Triticum aestivum) is a globally important crop, accounting for 20% of the calories consumed by mankind. We sequenced its large and challenging 17 Gb hexaploid genome using 454 pyrosequencing and compared this with the sequences of diploid ancestral and progenitor genomes. Between 94,000 and 96,000 genes were identified, and two-thirds were assigned to the A, B and D genomes. High-resolution synteny maps identified many small disruptions to conserved gene order. We show the h...

  4. Intra-species sequence comparisons for annotating genomes

    Energy Technology Data Exchange (ETDEWEB)

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom, and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study have been submitted to GenBank under accession nos. AY667278-AY667407.

  5. Whole genome sequencing in clinical and public health microbiology

    OpenAIRE

    Kwong, J. C.; McCallum, N; Sintchenko, V.; Howden, B. P.

    2015-01-01

    Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laborat...

  6. Genome sequence and comparative analysis of Avibacterium paragallinarum

    OpenAIRE

    Requena, David; Chumbe, Ana; Torres, Michael; Alzamora, Ofelia; Ramirez, Manuel; Valdivia-Olarte, Hugo; Gutierrez, Andres Hazaet; Izquierdo-Lara, Ray; Saravia, Luis Enrique; Zavaleta, Milagros; Tataje-Lavanda, Luis; Best, Ivan; Fernández-Sánchez, Manolo; Icochea, Eliana; Zimic, Mirko

    2013-01-01

    Background: Avibacterium paragallinarum is the causative agent of infectious coryza, a highly contagious acute respiratory disease of poultry that affects commercial chickens, laying hens and broilers worldwide. Methodology: In this study, we performed whole genome sequencing, assembly and annotation of a Peruvian isolate of A. paragallinarum. The genome was sequenced on a 454 GS FLX Titanium system. De novo assembly was performed and annotation was completed with GS De Novo Assembler 2.6 ...

  7. Mapping Challenging Mutations by Whole-Genome Sequencing

    OpenAIRE

    Smith, Harold E.; Fabritius, Amy S.; Jaramillo-Lambert, Aimee; Golden, Andy

    2016-01-01

    Whole-genome sequencing provides a rapid and powerful method for identifying mutations on a global scale, and has spurred a renewed enthusiasm for classical genetic screens in model organisms. The most commonly characterized category of mutation consists of monogenic, recessive traits, due to their genetic tractability. Therefore, most of the mapping methods for mutation identification by whole-genome sequencing are directed toward alleles that fulfill those criteria (i.e., single-gene, homoz...

  8. Whole genome and transcriptome sequencing of a B3 thymoma.

    Directory of Open Access Journals (Sweden)

    Iacopo Petrini

    Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using the Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using the Complete Genomics Inc. platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinct from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma.

  9. Genome Sequence of the Biocontrol Strain Pseudomonas fluorescens F113

    OpenAIRE

    Redondo-Nieto, M.; Barret, M.; Morrisey, J.; Germaine, K.; Martínez-Granero, F.; Barahona, E.; Navazo, A.; Sánchez-Contreras, M.; Moynihan, J.; Giddens, S.; Coppoolse, E.; Muriel, C.; Stiekema, W.; Rainey, P.; Dowling, D.

    2012-01-01

    Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms.

  10. Draft genome sequence of Thermincola potens strain JR

    Energy Technology Data Exchange (ETDEWEB)

    Byrne-Bailey, K.G.; Wrighton, K.C.; Melnyk, R.A.; Agbo, P.; Hazen, T.C.; Coates, J.D.

    2010-07-01

    'Thermincola potens' strain JR is one of the first Gram-positive dissimilatory metal-reducing bacteria (DMRB) for which there is a complete genome sequence. Consistent with the physiology of this organism, preliminary annotation revealed an abundance of multiheme c-type cytochromes that are putatively associated with the periplasm and cell surface in a Gram-positive bacterium. Here we report the complete genome sequence of strain JR.

  11. Genome Sequence of Pantoea agglomerans Strain IG1

    OpenAIRE

    Matsuzawa, Tomohiko; Mori, Kazuki; Kadowaki, Takeshi; Shimada, Misato; Tashiro, Kosuke; Kuhara, Satoru; Inagawa, Hiroyuki; Soma, Gen-Ichiro; Takegawa, Kaoru

    2012-01-01

    Pantoea agglomerans is a Gram-negative bacterium that grows symbiotically with various plants. Here we report the 4.8-Mb genome sequence of P. agglomerans strain IG1. The lipopolysaccharides derived from P. agglomerans IG1 have been shown to be effective in the prevention of various diseases, such as bacterial or viral infections and lifestyle-related diseases. This genome sequence represents a substantial step toward the elucidation of pathways for production of lipopolysaccharides.

  12. Complete Genome Sequence of Pseudomonas aeruginosa Phage AAT-1.

    Science.gov (United States)

    Andrade-Domínguez, Andrés; Kolter, Roberto

    2016-01-01

    Aspects of the interaction between phages and animals are of interest and importance for medical applications. Here, we report the genome sequence of the lytic Pseudomonas phage AAT-1, isolated from mammalian serum. AAT-1 is a double-stranded DNA phage, with a genome of 57,599 bp, containing 76 predicted open reading frames. PMID:27563032

  13. Draft Genome Sequence of Avibacterium paragallinarum Strain 221

    OpenAIRE

    Xu, Fuzhou; Miao, Deyuan; Du, Yu; Chen, Xiaoling; Zhang, Peijun; Sun, Huiling

    2013-01-01

    Avibacterium paragallinarum is the causative agent of infectious coryza. Here we report the draft genome sequence of reference strain 221 of A. paragallinarum serovar A. The genome is composed of 135 contigs for 2,685,568 bp with a 41% G+C content.

  14. Draft Genome Sequence of Amycolatopsis decaplanina Strain DSM 44594T

    OpenAIRE

    Kaur, Navjot; Kumar, Shailesh; Bala, Monu; Raghava, Gajendra Pal Singh; Mayilraj, Shanmugam

    2013-01-01

    We report the 8.5-Mb genome sequence of Amycolatopsis decaplanina strain DSM 44594T, isolated from a soil sample from India. The draft genome of strain DSM 44594T consists of 8,533,276 bp with a 68.6% G+C content, 7,899 protein-coding genes, and 57 RNAs.

  15. Complete genome sequence of Aeromonas hydrophila AL06-06

    Science.gov (United States)

    Aeromonas hydrophila occurs in freshwater environments and infects fish and mammals. In this work, we report the complete genome sequence of Aeromonas hydrophila AL06-06, which was isolated from diseased goldfish and is being used for comparative genomic studies with A. hydrophila strains causing ba...

  16. A snapshot of the emerging tomato genome sequence

    NARCIS (Netherlands)

    Mueller, L.A.; Klein Lankhorst, R.M.; Tanksley, S.D.; Peters, R.M.; Staveren, van M.J.; Datema, E.; Fiers, M.W.E.J.; Ham, van R.C.H.J.; Szinay, D.; Jong, de J.H.S.G.M.

    2009-01-01

    The genome of tomato (Solanum lycopersicum L.) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy, and the United States) as part of the larger “International Solanaceae Genome Project (SOL): System

  17. Draft Genome Sequence of Rhodococcus sp. Strain 311R

    Science.gov (United States)

    Ehsani, Elham; Jauregui, Ruy; Geffers, Robert; Jareck, Michael; Boon, Nico; Pieper, Dietmar H.

    2015-01-01

    Here, we report the draft genome sequence of Rhodococcus sp. strain 311R, which was isolated from a site contaminated with alkanes and aromatic compounds. Strain 311R shares 90% of the genome of Rhodococcus erythropolis SK121, which is the most closely related bacterium. PMID:25999565

  18. Whole-Genome Sequences of Three Symbiotic Endozoicomonas Bacteria

    KAUST Repository

    Neave, Matthew J.

    2014-08-14

    Members of the genus Endozoicomonas associate with a wide range of marine organisms. Here, we report on the whole-genome sequencing, assembly, and annotation of three Endozoicomonas type strains. These data will assist in exploring interactions between Endozoicomonas organisms and their hosts, and it will aid in the assembly of genomes from uncultivated Endozoicomonas spp.

  19. Complete Genome Sequence of Pediococcus pentosaceus Strain SL4

    DEFF Research Database (Denmark)

    Dantoft, Shruti Harnal; Bielak, Eliza Maria; Seo, Jae-Gu;

    2013-01-01

    Pediococcus pentosaceus SL4 was isolated from a Korean fermented vegetable product, kimchi. We report here the whole-genome sequence (WGS) of P. pentosaceus SL4. The genome consists of a 1.79-Mb circular chromosome (G+C content of 37.3%) and seven distinct plasmids ranging in size from 4 kb to 50...

  20. Genome Sequence of Chinese Porcine Parvovirus Strain PPV2010

    OpenAIRE

    Cui, Jin; Wang, Xin; Ren, Yudong; Cui, Shangjin; Li, Guangxing; Ren, Xiaofeng

    2012-01-01

    Porcine parvovirus (PPV) isolate PPV2010 has recently emerged in China. Herein, we analyze the complete genome sequence of PPV2010. Our results indicate that the genome of PPV2010 bears mixed characteristics of virulent PPV and vaccine strains. Importantly, PPV2010 has the potential to be a naturally attenuated candidate vaccine strain.

  1. The tomato genome sequence provides insight into fleshy fruit evolution

    Science.gov (United States)

    The genome of the inbred tomato cultivar ‘Heinz 1706’ was sequenced and assembled using a combination of Sanger and “next generation” technologies. The predicted genome size is ~900 Mb, consistent with prior estimates, of which 760 Mb were assembled in 91 scaffolds aligned to the 12 tomato chromosom...

  2. Genome sequence of the cultivated cotton Gossypium arboreum

    Science.gov (United States)

    Cotton is one of the most economically important natural fiber crops in the world, and the complex tetraploid nature of its genome (AADD, 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled 98.3% of the 1.7-gigabase G. arboreum (AA, 2n = 26...

  3. Complete Genome Sequence of Bacillus thuringiensis Bacteriophage Smudge

    Science.gov (United States)

    Cornell, Jessica L.; Breslin, Eileen; Schuhmacher, Zachary; Himelright, Madison; Berluti, Cassandra; Boyd, Charles; Carson, Rachel; Del Gallo, Elle; Giessler, Caris; Gilliam, Benjamin; Heatherly, Catherine; Nevin, Julius; Nguyen, Bryan; Nguyen, Justin; Parada, Jocelyn; Sutterfield, Blake; Tukruni, Muruj

    2016-01-01

    Smudge, a bacteriophage enriched from soil using Bacillus thuringiensis DSM-350 as the host, had its complete genome sequenced. Smudge is a myovirus with a genome consisting of 292 genes and was identified as belonging to the C1 cluster of Bacillus phages. PMID:27540049

  4. Complete Genome Sequence of Mycobacterium bovis Strain BCG-1 (Russia).

    Science.gov (United States)

    Sotnikova, Evgeniya A; Shitikov, Egor A; Malakhova, Maja V; Kostryukova, Elena S; Ilina, Elena N; Atrasheuskaya, Alena V; Ignatyev, Georgy M; Vinokurova, Nataliya V; Gorbachyov, Vyacheslav Y

    2016-01-01

    Mycobacterium bovis BCG (Bacille Calmette-Guérin) is a vaccine strain used for protection against tuberculosis. Here, we announce the complete genome sequence of M. bovis strain BCG-1 (Russia). Extensive use of this strain necessitates the study of its genome stability by comparative analysis. PMID:27034492

  5. Complete Chloroplast Genome Sequence of Phagomixotrophic Green Alga Cymbomonas tetramitiformis

    Science.gov (United States)

    Paasch, Amber E.; Graham, Linda E.; Kim, Eunsoo

    2016-01-01

    We report here the complete chloroplast genome sequence of Cymbomonas tetramitiformis strain PLY262, which is a prasinophycean green alga that retains a phagomixotrophic mode of nutrition. The genome is 84,524 bp in length, with a G+C content of 37%, and contains 3 rRNAs, 26 tRNAs, and 76 protein-coding genes. PMID:27313295

  6. Complete genome sequence of Campylobacter gracilis ATCC 33236T

    Science.gov (United States)

    The human oral pathogen Campylobacter gracilis has been isolated from periodontal and endodontal infections, and also from non-oral head, neck or lung infections. This study describes the whole-genome sequence of the human periodontal isolate ATCC 33236T (=FDC 1084), which is the first closed genome...

  7. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis

    NARCIS (Netherlands)

    Carlton, Jane M.; Hirt, Robert P.; Silva, Joana C.; Delcher, Arthur L.; Schatz, Michael; Zhao, Qi; Wortman, Jennifer R.; Bidwell, Shelby L.; Alsmark, U. Cecilia M.; Besteiro, Sebastien; Sicheritz-Ponten, Thomas; Noel, Christophe J.; Dacks, Joel B.; Foster, Peter G.; Simillion, Cedric; Van de Peer, Yves; Miranda-Saavedra, Diego; Barton, Geoffrey J.; Westrop, Gareth D.; Mueller, Sylke; Dessi, Daniele; Fiori, Pier Luigi; Ren, Qinghu; Paulsen, Ian; Zhang, Hanbang; Bastida-Corcuera, Felix D.; Simoes-Barbosa, Augusto; Brown, Mark T.; Hayes, Richard D.; Mukherjee, Mandira; Okumura, Cheryl Y.; Schneider, Rachel; Smith, Alias J.; Vanacova, Stepanka; Villalvazo, Maria; Haas, Brian J.; Pertea, Mihaela; Feldblyum, Tamara V.; Utterback, Terry R.; Shu, Chung-Li; Osoegawa, Kazutoyo; de Jong, Pieter J.; Hrdy, Ivan; Horvathova, Lenka; Zubacova, Zuzana; Dolezal, Pavel; Malik, Shehre-Banoo; Logsdon, John M.; Henze, Katrin; Gupta, Arti; Wang, Ching C.; Dunne, Rebecca L.; Upcroft, Jacqueline A.; Upcroft, Peter; White, Owen; Salzberg, Steven L.; Tang, Petrus; Chiu, Cheng-Hsun; Lee, Ying-Shiung; Embley, T. Martin; Coombs, Graham H.; Mottram, Jeremy C.; Tachezy, Jan; Fraser-Liggett, Claire M.; Johnson, Patricia J.

    2007-01-01

    We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the ~160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunction wi

  8. Complete Genome Sequence of Bacillus thuringiensis Strain 407 Cry-

    OpenAIRE

    Poehlein, Anja; Liesegang, Heiko

    2013-01-01

    Bacillus thuringiensis is an insect pathogen that has been used widely as a biopesticide. Here, we report the genome sequence of strain 407 Cry-, which is used to study the genetic determinants of pathogenicity. The genome consists of a 5.5-Mb chromosome and nine plasmids, including a novel 502-kb megaplasmid.

  9. Complete Genome Sequence of Cyanobacterial Siphovirus KBS2A.

    Science.gov (United States)

    Ponsero, Alise J; Chen, Feng; Lennon, Jay T; Wilhelm, Steven W

    2013-01-01

    We present the genome of a cyanosiphovirus (KBS2A) that infects a marine Synechococcus sp. (strain WH7803). Unique to this genome, relative to other sequenced cyanosiphoviruses, is the absence of elements associated with integration into the host chromosome, suggesting this virus may not be able to establish a lysogenic relationship. PMID:23969045

  10. Complete Genome Sequence of Cyanobacterial Siphovirus KBS2A

    OpenAIRE

    Ponsero, Alise J.; Chen, Feng; Lennon, Jay T.; Wilhelm, Steven W.

    2013-01-01

    We present the genome of a cyanosiphovirus (KBS2A) that infects a marine Synechococcus sp. (strain WH7803). Unique to this genome, relative to other sequenced cyanosiphoviruses, is the absence of elements associated with integration into the host chromosome, suggesting this virus may not be able to establish a lysogenic relationship.

  11. Complete Genome Sequence of Bacillus thuringiensis Bacteriophage Smudge.

    Science.gov (United States)

    Cornell, Jessica L; Breslin, Eileen; Schuhmacher, Zachary; Himelright, Madison; Berluti, Cassandra; Boyd, Charles; Carson, Rachel; Del Gallo, Elle; Giessler, Caris; Gilliam, Benjamin; Heatherly, Catherine; Nevin, Julius; Nguyen, Bryan; Nguyen, Justin; Parada, Jocelyn; Sutterfield, Blake; Tukruni, Muruj; Temple, Louise

    2016-01-01

    Smudge, a bacteriophage enriched from soil using Bacillus thuringiensis DSM-350 as the host, had its complete genome sequenced. Smudge is a myovirus with a genome consisting of 292 genes and was identified as belonging to the C1 cluster of Bacillus phages. PMID:27540049

  12. Complete Genome Sequence of Cyanobacterium Leptolyngbya sp. NIES-3755

    Science.gov (United States)

    Fujisawa, Takatomo; Ohtsubo, Yoshiyuki; Katayama, Mitsunori; Misawa, Naomi; Wakazuki, Sachiko; Shimura, Yohei; Nakamura, Yasukazu; Kawachi, Masanobu; Yoshikawa, Hirofumi; Eki, Toshihiko

    2016-01-01

    Cyanobacterial genus Leptolyngbya comprises genetically diverse species, but the availability of their complete genome information is limited. Here, we isolated Leptolyngbya sp. strain NIES-3755 from soil at the Toyohashi University of Technology, Japan. We determined the complete genome sequence of the NIES-3755 strain, which is composed of one chromosome and three plasmids. PMID:26988037

  13. The complete chloroplast genome sequence of Abies nephrolepis (Pinaceae: Abietoideae)

    Directory of Open Access Journals (Sweden)

    Dong-Keun Yi

    2016-06-01

    The plant chloroplast (cp) genome has maintained a relatively conserved structure and gene content throughout evolution. Cp genome sequences have been used widely for resolving evolutionary and phylogenetic issues at various taxonomic levels of plants. Here, we report the complete cp genome of Abies nephrolepis. The A. nephrolepis cp genome is 121,336 base pairs (bp) in length, including a pair of short inverted repeat regions (IRa and IRb) of 139 bp each, separated by a small single-copy (SSC) region of 54,323 bp and a large single-copy (LSC) region of 66,735 bp. It contains 114 genes: 68 protein-coding genes, 35 tRNA genes, four rRNA genes, six open reading frames, and one pseudogene. Seventeen repeat units and 64 simple sequence repeats (SSRs) have been detected in the A. nephrolepis cp genome. Large IR sequences are located at the 42-kb inversion points (1186 bp). The A. nephrolepis cp genome is identical to that of Abies koreana, a closely related taxon. Pairwise comparison between the two cp genomes revealed 140 polymorphic sites in each. The complete cp genome sequence of A. nephrolepis has significant potential to provide information on the evolutionary pattern of Abietoideae and valuable data for the development of DNA markers for easy identification and classification.

  14. Finished Genome Sequence of Collimonas arenae Cal35

    NARCIS (Netherlands)

    Wu, Je-Jia; de Jager, Victor; Deng, Wen-ling; Leveau, Johan

    2015-01-01

    We announce the finished genome sequence of forest soil isolate Collimonas arenae Cal35, which comprises a 5.6-Mbp chromosome and a 41-kb plasmid. The Cal35 genome is the second one published for the bacterial genus Collimonas and represents the first opportunity for high-resolution comparison of geno

  15. Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.

    Science.gov (United States)

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021
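
    The N50 length quoted in the abstract above is a standard assembly contiguity statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. The following is a minimal Python sketch of that calculation; the function name and the toy contig lengths are hypothetical illustrations and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest until the running
    # sum covers at least half of the total assembly length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (not data from the study): total 25,000 bp.
toy_contigs = [8000, 6000, 5000, 3000, 2000, 1000]
print(n50(toy_contigs))  # 8000 + 6000 = 14000 >= 12500, so N50 is 6000
```

    A low N50 relative to the total length, as reported for FANhybrid_r1.2, indicates an assembly made up of many short sequences.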

  16. Comparative genomics beyond sequence-based alignments

    DEFF Research Database (Denmark)

    Þórarinsson, Elfar; Yao, Zizhen; Wiklund, Eric D.;

    2008-01-01

    Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure--frequent compensating base changes--is increasingly likely to cause sequence-based alignment me...

  17. Structural and functional characterization of hBD-1(Ser35), a peptide deduced from a DEFB1 polymorphism.

    Science.gov (United States)

    Circo, Raffaella; Skerlavaj, Barbara; Gennaro, Renato; Amoroso, Antonio; Zanetti, Margherita

    2002-04-26

    beta-Defensins are mammalian antimicrobial peptides that share a unique disulfide-bonding motif of six conserved cysteines. An intragenic polymorphism of the DEFB1 gene that changes a highly conserved Cys to Ser in the peptide coding region has recently been described. The deduced peptide cannot form three disulfide bonds, as one of the cysteines is unpaired. We have determined the cysteine connectivities of a corresponding synthetic hBD-1(Ser35) peptide, investigated the structure by circular dichroism spectroscopy, and assayed the in vitro antimicrobial activity. Despite a different arrangement of the disulfides, hBD-1(Ser35) proved as active as hBD-1 against the microorganisms tested. This activity likely depends on the ability of hBD-1(Ser35) to adopt an amphipathic conformation in a hydrophobic environment, similar to the wild-type peptide, as suggested by CD spectroscopy. PMID:12054642

  18. First fungal genome sequence from Africa: A preliminary analysis

    Directory of Open Access Journals (Sweden)

    Rene Sutherland

    2012-01-01

    Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the ‘tsunami’ of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is fewer than that of the related species, Fusarium oxysporum, but more than that for Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumonisin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists

  19. Enabling technologies of genomic-scale sequence enrichment for targeted high-throughput sequencing

    OpenAIRE

    Summerer, Daniel

    2009-01-01

    Next-generation sequencing has still not reached its full potential due to the technical inability of effectively targeting desired genomic regions of interest. Once available, methods addressing this bottleneck will dramatically reduce cost and enable the efficient analysis of complex samples. Recently, a number of possible approaches for genomic-scale sequence enrichment have been reported using different strategies. All methods basically rely on sequence-specific nucleic acid hybridization,...

  20. Ancient Human Genome Sequence of an Extinct Palaeo-Eskimo

    DEFF Research Database (Denmark)

    Rasmussen, Morten; Li, Yingrui; Lindgreen, Stinus;

    2010-01-01

    We report here the genome sequence of an ancient human. Obtained from approximately 4,000-year-old permafrost-preserved hair, the genome represents a male individual from the first known culture to settle in Greenland. Sequenced to an average depth of 20x, we recover 79% of the diploid genome, an...... possible phenotypic characteristics of the individual that belonged to a culture whose location has yielded only trace human remains. We compare the high-confidence SNPs to those of contemporary populations to find the populations most closely related to the individual. This provides evidence for a...

  1. Complete mitochondrial genome sequence of Aoluguya reindeer (Rangifer tarandus).

    Science.gov (United States)

    Ju, Yan; Liu, Huamiao; Rong, Min; Yang, Yifeng; Wei, Haijun; Shao, Yuanchen; Chen, Xiumin; Xing, Xiumei

    2016-05-01

    The complete mitochondrial genome of the reindeer, Rangifer tarandus, was determined by accurate polymerase chain reaction. The entire genome is 16,357 bp in length and contains 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a D-loop region, all of which are arranged in a typical vertebrate manner. The overall base composition of the reindeer's mitochondrial genome is 33.7% A, 23.1% C, 30.1% T and 13.2% G. A termination associated sequence and several conserved central sequence block domains were discovered within the control region. PMID:25469816
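
    The base composition figures above are per-base percentages of the full sequence; because each value is rounded to one decimal place, the four reported figures sum to 100.1% rather than exactly 100%. As a minimal sketch of how such percentages are obtained, the Python snippet below counts bases in a sequence string. The function name and the toy sequence are hypothetical and are not taken from the record.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T in a DNA sequence."""
    # Ignore ambiguity codes such as N; count only the four canonical bases.
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: round(100 * counts[base] / total, 1) for base in "ACGT"}


# Hypothetical toy sequence (not the reindeer mitochondrial genome).
toy_sequence = "AATTACGTAT"
print(base_composition(toy_sequence))
```

    Applied to a full mitochondrial genome, the same counting yields figures of the kind reported in the abstract; the G+C content is simply the sum of the C and G percentages.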

  2. Monitoring genomic sequences during SELEX using high-throughput sequencing: neutral SELEX.

    Directory of Open Access Journals (Sweden)

    Bob Zimmermann

    BACKGROUND: SELEX is a well-established in vitro selection tool to analyze the structure of ligand-binding nucleic acid sequences called aptamers. Genomic SELEX transforms SELEX into a tool to discover novel, genomically encoded RNA or DNA sequences binding a ligand of interest, called genomic aptamers. Concerns have been raised regarding requirements imposed on RNA sequences undergoing SELEX selection. METHODOLOGY/PRINCIPAL FINDINGS: To evaluate SELEX and assess the extent of these effects, we designed and performed a Neutral SELEX experiment omitting the selection step, such that the sequences are under the sole selective pressure of SELEX's amplification steps. Using high-throughput sequencing, we obtained thousands of full-length sequences from the initial genomic library and the pools after each of the 10 rounds of Neutral SELEX. We compared these to sequences obtained from a Genomic SELEX experiment deriving from the same initial library, but screening for RNAs binding with high affinity to the E. coli regulator protein Hfq. With each round of Neutral SELEX, sequences became less stable and changed in nucleotide content, but no sequences were enriched. In contrast, we detected substantial enrichment in the Hfq-selected set with enriched sequences having structural stability similar to the neutral sequences but with significantly different nucleotide selection. CONCLUSIONS/SIGNIFICANCE: Our data indicate that positive selection in SELEX acts independently of the neutral selective requirements imposed on the sequences. We conclude that Genomic SELEX, when combined with high-throughput sequencing of positively and neutrally selected pools, as well as the genomic library, is a powerful method to identify genomic aptamers.

  3. Specialized microbial databases for inductive exploration of microbial genome sequences

    Directory of Open Access Journals (Sweden)

    Cabau Cédric

    2005-02-01

    Background: The enormous amount of genome sequence data calls for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods: The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subqueries, have been implemented. Results: Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore (http://bioinfo.hku.hk/genochore.html), a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya), has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition, they provide a weekly update of searches against the world-wide protein sequence data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion: This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tengcongensis; LeptoList, with two different genomes of Leptospira interrogans; and SepiList, Staphylococcus epidermidis) associated with related organisms for comparison.

  4. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    Energy Technology Data Exchange (ETDEWEB)

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops

  5. Sequencing and comparative analyses of the genomes of zoysiagrasses.

    Science.gov (United States)

    Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei

    2016-04-01

    Zoysia is a warm-season turfgrass, which comprises 11 allotetraploid species (2n = 4x = 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession 'Nagirizaki' (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella 'Wakaba' and Z. pacifica 'Zanpa' were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica 'Kyoto', Z. japonica 'Miyagi' and Z. matrella 'Chiba Fair Green', were accumulated, and aligned against the reference genome of 'Nagirizaki' along with those from 'Wakaba' and 'Zanpa'. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the 'Zoysia Genome Database' at http://zoysia.kazusa.or.jp. PMID:26975196

  6. Human genome and genetic sequencing research and informed consent

    International Nuclear Information System (INIS)

    On March 29, 2001, the Ethical Guidelines for Human Genome and Genetic Sequencing Research were established. They are intended to serve as ethical guidelines for all human genome and genetic sequencing research practice, for the purpose of upholding respect for human dignity and rights and enforcing the use of proper methods in the pursuit of human genome and genetic sequencing research, with the understanding and cooperation of the public. The RadGenomics Project has prepared a research protocol and informed consent document that follow these ethical guidelines. We have endeavored to protect the privacy of individual information, and have established a procedure for examination of research practices by an ethics committee. Here we report our procedure in order to offer this concept to the patients. (authors)

  7. Open access to sequence: Browsing the Pichia pastoris genome

    Directory of Open Access Journals (Sweden)

    Graf Alexandra

    2009-10-01

    Full Text Available Abstract The first genome sequences of the important yeast protein production host Pichia pastoris have been released into the public domain this spring. In order to provide the scientific community easy and versatile access to the sequence, two websites have been set up as a resource for genomic sequence, gene and protein information for P. pastoris: a GBrowse-based genome browser at http://www.pichiagenome.org and a genome portal with gene annotation and browsing functionality at http://bioinformatics.psb.ugent.be/webtools/bogas. Both websites offer information on gene annotation and function, regulation and structure. In addition, a wiki-based platform allows all users to create additional information on genes, proteins, physiology and other items of P. pastoris research, so that the Pichia community can benefit from exchange of knowledge, data and materials.

  8. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    DEFF Research Database (Denmark)

    Have, Christian Theil; Mørk, Søren

    We introduce a new type of probabilistic sequence model that models the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome-level -- effectively producing a sequential genome annotation as...... output. The model can be used to obtain the most probable genome annotation based on a combination of i: a gene finder score of each gene candidate and ii: the sequence of the reading frames of gene candidates through a genome. The model --- as well as a higher order variant --- is developed and tested...... using the probabilistic logic programming language and machine learning system PRISM - a fast and efficient model prototyping environment, using bacterial gene finding performance as a benchmark of signal strength. The model is used to prune a set of gene predictions from an underlying gene finder and...

  9. Complete Genome Sequence of the Alfalfa latent virus.

    Science.gov (United States)

    Nemchinov, Lev G; Shao, Jonathan; Postnikova, Olga A

    2015-01-01

    The first complete genome sequence of the Alfalfa latent carlavirus (ALV) was obtained by primer walking and Illumina RNA sequencing. The virus differs substantially from the Czech ALV isolate and the Pea streak virus isolate from Wisconsin. The absence of a clear nucleic acid-binding protein indicates ALV divergence from other carlaviruses. PMID:25883281

  10. Draft Genome Sequence of Biocontrol Agent Bacillus cereus UW85.

    Science.gov (United States)

    Lozano, Gabriel L; Holt, Jonathan; Ravel, Jacques; Rasko, David A; Thomas, Michael G; Handelsman, Jo

    2016-01-01

    Bacillus cereus UW85 was isolated from a root of a field-grown alfalfa plant from Arlington, WI, and identified for its ability to suppress damping off, a disease caused by Phytophthora megasperma f. sp. medicaginis on alfalfa. Here, we report the draft genome sequence of B. cereus UW85, obtained by a combination of Sanger and Illumina sequencing. PMID:27587823

  11. Genome sequence of Stachybotrys chartarum Strain 51-11

    Science.gov (United States)

    Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina Hiseq 2000 and PacBio long read technology. Since Stachybotrys chartarum has been implicated in health impacts within water-damaged buildings, any information extracted from the geno...

  12. Complete Genome Sequence of Vibrio alginolyticus ZJ-T.

    Science.gov (United States)

    Deng, Yiqin; Chen, Chang; Zhao, Zhe; Huang, Xiaochun; Yang, Yiying; Ding, Xiongqi

    2016-01-01

    Vibrio alginolyticus is a ubiquitous Gram-negative bacterium which is normally distributed in the coastal and estuarine environments. It has been suggested to be an opportunistic pathogen to both marine animals and humans. Here, the completed genome sequence of V. alginolyticus ZJ-T was determined by Illumina high-throughput sequencing. PMID:27587824

  13. Sequencing and analysis of an Irish human genome.

    LENUS (Irish Health Repository)

    Tong, Pin

    2010-01-01

    Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.

  14. Draft Genome Sequence of Type Strain Streptococcus gordonii ATCC 10558

    DEFF Research Database (Denmark)

    Rasmussen, Louise Hesselbjerg; Dargis, Rimtas; Christensen, Jens Jørgen Elmer;

    2016-01-01

    Streptococcus gordonii ATCC 10558T was isolated from a patient with infective endocarditis in 1946 and announced as a type strain in 1989. Here, we report the 2,154,510-bp draft genome sequence of S. gordonii ATCC 10558T. This sequence will contribute to knowledge about the pathogenesis of...

  15. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Directory of Open Access Journals (Sweden)

    Wang Jintu

    2011-04-01

    Full Text Available Abstract Background Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of common carp genome. The first survey of common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of closely related zebrafish, BES of common carp were aligned against zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on zebrafish genome which demonstrated the high similarity between zebrafish and common carp genomes, indicating the feasibility of comparative mapping between zebrafish and common carp once we have physical map of common carp. Conclusion BAC end sequences are great resources for the first genome wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3

  16. Genomic Sequencing of Single Microbial Cells from Environmental Samples

    Energy Technology Data Exchange (ETDEWEB)

    Ishoey, Thomas; Woyke, Tanja; Stepanauskas, Ramunas; Novotny, Mark; Lasken, Roger S.

    2008-02-01

    Recently developed techniques allow genomic DNA sequencing from single microbial cells [Lasken RS: Single-cell genomic sequencing using multiple displacement amplification, Curr Opin Microbiol 2007, 10:510-516]. Here, we focus on research strategies for putting these methods into practice in the laboratory setting. An immediate consequence of single-cell sequencing is that it provides an alternative to culturing organisms as a prerequisite for genomic sequencing. The microgram amounts of DNA required as template are amplified from a single bacterium by a method called multiple displacement amplification (MDA) avoiding the need to grow cells. The ability to sequence DNA from individual cells will likely have an immense impact on microbiology considering the vast numbers of novel organisms, which have been inaccessible unless culture-independent methods could be used. However, special approaches have been necessary to work with amplified DNA. MDA may not recover the entire genome from the single copy present in most bacteria. Also, some sequence rearrangements can occur during the DNA amplification reaction. Over the past two years many research groups have begun to use MDA, and some practical approaches to single-cell sequencing have been developed. We review the consensus that is emerging on optimum methods, reliability of amplified template, and the proper interpretation of 'composite' genomes which result from the necessity of combining data from several single-cell MDA reactions in order to complete the assembly. Preferred laboratory methods are considered on the basis of experience at several large sequencing centers where >70% of genomes are now often recovered from single cells. Methods are reviewed for preparation of bacterial fractions from environmental samples, single-cell isolation, DNA amplification by MDA, and DNA sequencing.

  17. Draft Genome Sequence of Neisseria gonorrhoeae Sequence Type 1407, a Multidrug-Resistant Clinical Isolate.

    Science.gov (United States)

    Anselmo, A; Ciammaruconi, A; Carannante, A; Neri, A; Fazio, C; Fortunato, A; Palozzi, A M; Vacca, P; Fillo, S; Lista, F; Stefanelli, P

    2015-01-01

    Gonorrhea may become untreatable due to the spread of resistant or multidrug-resistant strains. Cefixime-resistant gonococci belonging to sequence type 1407 have been described worldwide. We report the genome sequence of Neisseria gonorrhoeae strain G2891, a multidrug-resistant isolate of sequence type 1407, collected in Italy in 2013. PMID:26272575

  18. Comparison of two Next Generation sequencing platforms for full genome sequencing of Classical Swine Fever Virus

    DEFF Research Database (Denmark)

    Fahnøe, Ulrik; Pedersen, Anders Gorm; Höper, Dirk;

    2013-01-01

    Next Generation Sequencing (NGS) is being increasingly adopted in viral research and will be the preferred technology in the years to come. We have recently sequenced several strains of Classical Swine Fever Virus (CSFV) by NGS on both Genome Sequencer FLX (GS FLX) and Ion Torrent PGM platforms. In...

  19. Chemical rationale for selection of isolates for genome sequencing

    DEFF Research Database (Denmark)

    Rank, Christian; Larsen, Thomas Ostenfeld; Frisvad, Jens Christian

    The advances in gene sequencing will in the near future enable researchers to affordably acquire the full genomes of handpicked isolates. We here present a method to evaluate the chemical potential of an entire species and select representatives for genome sequencing. The selection criteria for new...... strains to be sequenced can be manifold, but for studying the functional phenotype, using a metabolome based approach offers a cheap and rapid assessment of critical strains to cover the chemical diversity. We have applied this methodology on the complex A. flavus/A. oryzae group. Though these two species...

  20. Complete genome sequence of Treponema pallidum, the syphilis spirochete.

    Science.gov (United States)

    Fraser, C M; Norris, S J; Weinstock, G M; White, O; Sutton, G G; Dodson, R; Gwinn, M; Hickey, E K; Clayton, R; Ketchum, K A; Sodergren, E; Hardham, J M; McLeod, M P; Salzberg, S; Peterson, J; Khalak, H; Richardson, D; Howell, J K; Chidambaram, M; Utterback, T; McDonald, L; Artiach, P; Bowman, C; Cotton, M D; Fujii, C; Garland, S; Hatch, B; Horst, K; Roberts, K; Sandusky, M; Weidman, J; Smith, H O; Venter, J C

    1998-07-17

    The complete genome sequence of Treponema pallidum was determined and shown to be 1,138,006 base pairs containing 1041 predicted coding sequences (open reading frames). Systems for DNA replication, transcription, translation, and repair are intact, but catabolic and biosynthetic activities are minimized. The number of identifiable transporters is small, and no phosphoenolpyruvate:phosphotransferase carbohydrate transporters were found. Potential virulence factors include a family of 12 potential membrane proteins and several putative hemolysins. Comparison of the T. pallidum genome sequence with that of another pathogenic spirochete, Borrelia burgdorferi, the agent of Lyme disease, identified unique and common genes and substantiates the considerable diversity observed among pathogenic spirochetes. PMID:9665876

  1. Pig genome sequence - analysis and publication strategy

    DEFF Research Database (Denmark)

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol;

    2010-01-01

    preferentially selected for sequencing. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement the data have been released into public sequence repositories (Genbank/EMBL, NCBI/Ensembl trace repositories) in a timely manner and in advance of publication. CONCLUSIONS...

  2. DNA sequencing leads to genomics progress in China

    Institute of Scientific and Technical Information of China (English)

    WU JiaYan; XIAO JingFa; ZHANG RuoSi; YU Jun

    2011-01-01

    1 Science in the large-scale sequencing era Ten years ago, the first draft sequence assembly of the human genome was completed [1], bringing biomedical research one step closer toward the goal of revolutionizing diagnosis, prevention, and treatment of human diseases. Recently, journalists from the journal Nature surveyed more than 1000 life scientists regarding this laudable aim [2], obtaining substantially negative responses [3]. However, almost all of those surveyed had been influenced, in one way or another, by the availability of the human genome sequence, and they also agreed with the notion that the "sequence is the start." The complexity of genome biology and almost every aspect of human biology is far greater than previously thought [4].

  3. Genomic multiple sequence alignments: refinement using a genetic algorithm

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2005-08-01

    Full Text Available Abstract Background Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics – the practice of comparing genomic sequences from different species – plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To improve the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. Results We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. Overall sequence identity increased only

  4. Molecular epidemiology of dengue viruses from complete genome sequences

    OpenAIRE

    Ong, Swee Hoe

    2010-01-01

    The availability of the complete genetic blueprint of the dengue virus is essential in molecular epidemiological studies to uncover the role of the virus in dengue pathogenesis. During the course of this project, over two hundred complete genomes of the dengue virus were generated from clinical samples collected in three dengue-endemic Southeast Asian countries. In addition, a bioinformatics platform integrating a sequence database, sequence retrieval tools, sequence annotation data and a var...

  5. Information-theoretic View of Sequence Organization in a Genome

    OpenAIRE

    Luo, Liaofu; Gao, Yang; Lu, Jun

    2010-01-01

    Sequence organizations are viewed from two points: one is from informational redundancy or informational correlation (IC) and another is from k-mer frequency statistics. Two problems are investigated. The first is how the ICs exceed the fluctuation bound and the order emerges from fluctuation in a genome when the sequence length attains some critical value. We demonstrated that the transition from fluctuation to order takes place at about sequence length 200-300 thousand bases for human and ...

  6. Physical map-assisted whole-genome shotgun sequence assemblies

    OpenAIRE

    Warren, René L.; Varabei, Dmitry; Platt, Darren; Huang, Xiaoqiu; Messina, David; Yang, Shiaw-Pyng; Kronstad, James W.; Krzywinski, Martin; Warren, Wesley C; Wallis, John W.; Hillier, LaDeana W.; Chinwalla, Asif T.; Schein, Jacqueline E.; Siddiqui, Asim S.; Marra, Marco A.

    2006-01-01

    We describe a targeted approach to improve the contiguity of whole-genome shotgun sequence (WGS) assemblies at run-time, using information from Bacterial Artificial Chromosome (BAC)-based physical maps. Clone sizes and overlaps derived from clone fingerprints are used for the calculation of length constraints between any two BAC neighbors sharing 40% of their size. These constraints are used to promote the linkage and guide the arrangement of sequence contigs within a sequence scaffold at the...

  7. Draft Genome Sequences of Two Virulent Serotypes of Avian Pasteurella multocida

    OpenAIRE

    Abrahante, Juan E.; Johnson, Timothy J.; Hunter, Samuel S.; Maheswaran, Samuel K.; Hauglund, Melissa J.; Bayles, Darrell O.; Tatum, Fred M.; Briggs, Robert E.

    2013-01-01

    Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent P. multocida strain Pm70.

  8. Mitochondrial genome sequencing helps show the evolutionary mechanism of mitochondrial genome formation in Brassica

    Directory of Open Access Journals (Sweden)

    Yan Jiyong

    2011-10-01

    Full Text Available Abstract Background Angiosperm mitochondrial genomes are more complex than those of other organisms. Analyses of the mitochondrial genome sequences of at least 11 angiosperm species have shown several common properties; these cannot easily explain, however, how the diverse mitotypes evolved within each genus or species. We analyzed the evolutionary relationships of Brassica mitotypes by sequencing. Results We sequenced the mitotypes of cam (Brassica rapa), ole (B. oleracea), jun (B. juncea), and car (B. carinata) and analyzed them together with two previously sequenced mitotypes of B. napus (pol and nap). The sizes of whole single circular genomes of cam, jun, ole, and car are 219,747 bp, 219,766 bp, 360,271 bp, and 232,241 bp, respectively. The mitochondrial genome of ole is largest as a result of the duplication of a 141.8 kb segment. The jun mitotype is the result of an inherited cam mitotype, and pol is also derived from the cam mitotype with evolutionary modifications. Genes with known functions are conserved in all mitotypes, but clear variation in open reading frames (ORFs) with unknown functions among the six mitotypes was observed. Sequence relationship analysis showed that there has been genome compaction and inheritance in the course of Brassica mitotype evolution. Conclusions We have sequenced four Brassica mitotypes, compared six Brassica mitotypes and suggested a mechanism for mitochondrial genome formation in Brassica, including evolutionary events such as inheritance, duplication, rearrangement, genome compaction, and mutation.

  9. The diploid genome sequence of an individual human.

    Directory of Open Access Journals (Sweden)

    Samuel Levy

    2007-09-01

    Full Text Available Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, they involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
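
    The event counts quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: it uses just the five categories for which the abstract gives explicit counts (segmental duplications and copy number variation regions are described as "numerous" and are left out), and the variable names are assumptions rather than anything from the paper.

```python
# Variant event counts as quoted in the abstract (events, not bases).
# Categories without an explicit count in the abstract are omitted.
variant_counts = {
    "snp": 3_213_401,
    "block_substitution": 53_823,
    "heterozygous_indel": 292_102,
    "homozygous_indel": 559_473,
    "inversion": 90,
}

total_events = sum(variant_counts.values())
non_snp_events = total_events - variant_counts["snp"]
non_snp_fraction = non_snp_events / total_events

print(f"total events:   {total_events:,}")
print(f"non-SNP events: {non_snp_events:,}")
print(f"non-SNP share:  {non_snp_fraction:.1%}")
```

    Under these assumptions the total comes to a little over 4.1 million events and the non-SNP share rounds to 22%, in line with the figures stated in the abstract.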

  10. Genome-wide sequence variations among Mycobacterium avium subspecies paratuberculosis.

    Directory of Open Access Journals (Sweden)

    Adel M Talaat

    2011-12-01

    Full Text Available Mycobacterium avium subspecies paratuberculosis (M. ap), the causative agent of Johne’s disease (JD), infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium) strains isolated from various hosts and environments. Using next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98%) to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77) showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels) were not detected, and only unique single nucleotide polymorphisms (SNPs) were observed among the 6 M. ap strains. While most of the SNPs (~100) in M. ap genomes were non-synonymous, a total of ~6000 SNPs were detected among M. avium genomes, most of which were synonymous, suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNP-based phylogenomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10) strain while the human isolate (M. ap 4B) is closely related to the environmental strains, indicating an environmental source of human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne’s disease in animals.

  11. Comparison of methods for genomic localization of gene trap sequences

    Directory of Open Access Journals (Sweden)

    Ferrin Thomas E

    2006-09-01

    Full Text Available Abstract Background Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. Results In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.

  12. Evolution Analysis of Simple Sequence Repeats in Plant Genome.

    Directory of Open Access Journals (Sweden)

    Zhen Qin

    Full Text Available Simple sequence repeats (SSRs) are widespread units on genome sequences, and play many important roles in plants. In order to reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analysis of twelve sequenced plant genomes. First, in the twelve studied plant genomes, the main SSRs were those containing repeats of 1-3 nucleotide combinations. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With the increase of SSR repeat number, the percentage of A/T in C. reinhardtii showed no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom and as the repeat number increased in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T showed a rising trend along with the evolution of the plant kingdom; meanwhile, with the increase of SSR repeat number in plant species, different species chose different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed their specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only had a general pattern in the evolution of the plant kingdom, but also were associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provided new insights for the analysis of plant genome evolution.
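
    To make the kind of tally described in this abstract concrete, the following is a minimal sketch of how mono-, di- and trinucleotide SSRs might be located in a DNA string and classified by A/T content. It is not the authors' pipeline: the function names, the regular-expression approach, the minimum repeat threshold of 4 and the demo sequence are all assumptions chosen for illustration.

```python
import re

def find_ssrs(sequence, max_unit=3, min_repeats=4):
    """Return (start, unit, repeat_count) tuples for perfect tandem repeats.

    Hypothetical helper: units of 1..max_unit bases repeated at least
    min_repeats times, found with a back-referencing regular expression.
    """
    sequence = sequence.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # ([ACGT]{n}) captures one unit; \1{m,} requires m further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(sequence):
            unit = match.group(1)
            # Skip units such as "AA" or "AAA" that are really mononucleotide runs.
            if unit_len > 1 and len(set(unit)) == 1:
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits

def at_only_fraction(hits, unit_len):
    """Fraction of SSRs with the given unit length whose unit contains only A/T."""
    units = [unit for _, unit, _ in hits if len(unit) == unit_len]
    if not units:
        return None
    return sum(set(unit) <= {"A", "T"} for unit in units) / len(units)

# Made-up demo sequence containing one SSR of each unit length.
demo = "GC" + "A" * 6 + "GC" + "AT" * 5 + "TT" + "CAG" * 4 + "T"
demo_hits = find_ssrs(demo)
print(demo_hits)
print(at_only_fraction(demo_hits, 1), at_only_fraction(demo_hits, 3))
```

    A regular expression reports the leftmost phase of each repeat, so a tract such as GCAGCAG... may be reported with unit GCA rather than CAG depending on the flanking base; a real analysis would normalise units to a canonical rotation before comparing frequencies across genomes.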

  13. Enhanced Dynamic Algorithm of Genome Sequence Alignments

    OpenAIRE

    Arabi E. keshk

    2014-01-01

    The merging of biology and computer science has created a new field called computational biology that explores the capacities of computers to gain knowledge from biological data, bioinformatics. Computational biology is rooted in life sciences as well as computers, information sciences, and technologies. The main problem in computational biology is sequence alignment that is a way of arranging the sequences of DNA, RNA or protein to identify the region of similarity and relationship between se...
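
    For readers unfamiliar with dynamic-programming alignment, the sketch below shows the textbook Needleman-Wunsch global alignment score computed with two rolling rows. It is not the enhanced algorithm of this record, whose details are cut off in the abstract; the function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score using O(len(b)) memory."""
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of `b`.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in `b`
            left = curr[j - 1] + gap  # gap in `a`
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]

print(global_alignment_score("ACGT", "AGT"))
```

    Keeping only two rows gives the optimal score without the traceback; recovering the alignment itself requires the full matrix or a divide-and-conquer scheme such as Hirschberg's.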

  14. Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms.

    Science.gov (United States)

    Jeong, Haeyoung; Lee, Dae-Hee; Ryu, Choong-Min; Park, Seung-Hwan

    2016-01-01

    PacBio's long-read sequencing technologies can be successfully used for a complete bacterial genome assembly using recently developed non-hybrid assemblers in the absence of second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM-based genome sequencing results derived from previous studies, we generated further sequencing data, including from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach revealed that the hierarchical genome assembly process (HGAP) non-hybrid assembler resulted in nearly complete assemblies at a moderate coverage of ~75x, but that different versions produced non-compatible results requiring post-processing. The other two platforms further improved the PacBio assembly through scaffolding and a final error correction. PMID:26464377

  15. Plasmodium knowlesi genome sequences from clinical isolates reveal extensive genomic dimorphism.

    Directory of Open Access Journals (Sweden)

    Miguel M Pinheiro

    Full Text Available Plasmodium knowlesi is a newly described zoonosis that causes malaria in the human population that can be severe and fatal. The study of P. knowlesi parasites from human clinical isolates is relatively new and, in order to obtain maximum information from patient sample collections, we explored the possibility of generating P. knowlesi genome sequences from archived clinical isolates. Our patient sample collection consisted of frozen whole blood samples that contained excessive human DNA contamination and, in that form, were not suitable for parasite genome sequencing. We developed a method to reduce the amount of human DNA in the thawed blood samples in preparation for high throughput parasite genome sequencing using Illumina HiSeq and MiSeq sequencing platforms. Seven of fifteen samples processed had sufficiently pure P. knowlesi DNA for whole genome sequencing. The reads were mapped to the P. knowlesi H strain reference genome and an average mapping of 90% was obtained. Genes with low coverage were removed leaving 4623 genes for subsequent analyses. Previously we identified a DNA sequence dimorphism on a small fragment of the P. knowlesi normocyte binding protein xa gene on chromosome 14. We used the genome data to assemble full-length Pknbpxa sequences and discovered that the dimorphism extended along the gene. An in-house algorithm was developed to detect SNP sites co-associating with the dimorphism. More than half of the P. knowlesi genome was dimorphic, involving genes on all chromosomes and suggesting that two distinct types of P. knowlesi infect the human population in Sarawak, Malaysian Borneo. We use P. knowlesi clinical samples to demonstrate that Plasmodium DNA from archived patient samples can produce high quality genome data. We show that analyses, of even small numbers of difficult clinical malaria isolates, can generate comprehensive genomic information that will improve our understanding of malaria parasite diversity and

  16. Complete genome sequence of Arcobacter nitrofigilis type strain (CIT)

    Energy Technology Data Exchange (ETDEWEB)

    Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Gronow, Sabine [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Copeland, A [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Chertkov, Olga [Los Alamos National Laboratory (LANL)]; Bruce, David [Los Alamos National Laboratory (LANL)]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]

    2010-01-01

    Arcobacter nitrofigilis (McClung et al. 1983) Vandamme et al. 1991 is the type species of the genus Arcobacter in the epsilonproteobacterial family Campylobacteraceae. The species was first described in 1983 as Campylobacter nitrofigilis [1] after its detection as a free-living, nitrogen-fixing Campylobacter species associated with Spartina alterniflora Loisel. roots [2]. It is of phylogenetic interest because of its lifestyle as a symbiotic organism in a marine environment, in contrast to many other Arcobacter species, which are associated with warm-blooded animals and tend to be pathogenic. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of a type strain of the genus Arcobacter. The 3,192,235 bp genome with its 3,154 protein-coding and 70 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  17. The complete plastid genome sequence of Bomarea edulis (Alstroemeriaceae: Liliales).

    Science.gov (United States)

    Kim, Jung Sung; Kim, Hyoung Tae; Yoon, Chang Young; Kim, Joo-Hwan

    2016-05-01

    Bomarea, a member of the family Alstroemeriaceae, is distributed from Chile to Mexico and includes approximately 120 species. Recent molecular phylogenetic studies have clarified the monophyly of the family within the order Liliales and the sister relationship with the family Colchicaceae. At this time, five plastid genomes of Liliales have been analyzed at the familial level. To examine plastid genome variation at the generic level, we sequenced the plastid genome of Bomarea edulis, which is the most widely distributed species in the genus, and compared it with that of Alstroemeria aurea. The plastid genome sequence of B. edulis was 154,925 bp in length, with a structure similar to that of A. aurea except at the IR-LSC junction. Ycf68 and infA were pseudogenes caused by frameshift mutations, and the ycf15 gene was deleted, as in A. aurea. PMID:25319309

  18. The complete mitochondrial genome sequence of Malus hupehensis var. pinyiensis.

    Science.gov (United States)

    Duan, Naibin; Sun, Honghe; Wang, Nan; Fei, Zhangjun; Chen, Xuesen

    2016-07-01

    The complete mitochondrial genome sequence of Malus hupehensis var. pinyiensis, a widely used apple rootstock, was determined using the Illumina high-throughput sequencing approach. The genome is 422,555 bp in length and has a GC content of 45.21%. It is separated by a pair of inverted repeats of 32,504 bp, to form a large single copy region of 213,055 bp and a small single copy region of 144,492 bp. The genome contains 38 protein-coding genes, four pseudogenes, 25 tRNA genes, and three rRNA genes. The genome is 25,608 bp longer than that of M. domestica, and several structural variations between these two mitogenomes were detected. PMID:26539696

  19. Draft genome sequence of the rubber tree Hevea brasiliensis

    Directory of Open Access Journals (Sweden)

    Rahman Ahmad Yamin Abdul

    2013-02-01

    Full Text Available Abstract Background Hevea brasiliensis, a member of the Euphorbiaceae family, is the major commercial source of natural rubber (NR). NR is a latex polymer with high elasticity, flexibility, and resilience that has played a critical role in the world economy since 1876. Results Here, we report the draft genome sequence of H. brasiliensis. The assembly spans ~1.1 Gb of the estimated 2.15 Gb haploid genome. Overall, ~78% of the genome was identified as repetitive DNA. Gene prediction shows 68,955 gene models, of which 12.7% are unique to Hevea. Most of the key genes associated with rubber biosynthesis, rubberwood formation, disease resistance, and allergenicity have been identified. Conclusions The knowledge gained from this genome sequence will aid in the future development of high-yielding clones to keep up with the ever-increasing need for natural rubber.

  20. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Indian Academy of Sciences (India)

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-04-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  1. Accuracy of genomic prediction using imputed whole-genome sequence data in white layers.

    Science.gov (United States)

    Heidaritabar, M; Calus, M P L; Megens, H-J; Vereijken, A; Groenen, M A M; Bastiaansen, J W M

    2016-06-01

    There is an increasing interest in using whole-genome sequence data in genomic selection breeding programmes. Prediction of breeding values is expected to be more accurate when whole-genome sequence is used, because the causal mutations are assumed to be in the data. We performed genomic prediction for the number of eggs in white layers using imputed whole-genome resequence data including ~4.6 million SNPs. The prediction accuracies based on sequence data were compared with the accuracies from the 60 K SNP panel. Predictions were based on genomic best linear unbiased prediction (GBLUP) as well as a Bayesian variable selection model (BayesC). Moreover, the prediction accuracy from using different types of variants (synonymous, non-synonymous and non-coding SNPs) was evaluated. Genomic prediction using the 60 K SNP panel resulted in a prediction accuracy of 0.74 when GBLUP was applied. With sequence data, there was a small increase (~1%) in prediction accuracy over the 60 K genotypes. With both 60 K SNP panel and sequence data, GBLUP slightly outperformed BayesC in predicting the breeding values. Selection of SNPs more likely to affect the phenotype (i.e. non-synonymous SNPs) did not improve the accuracy of genomic prediction. The fact that sequence data were based on imputation from a small number of sequenced animals may have limited the potential to improve the prediction accuracy. A small reference population (n = 1004) and possible exclusion of many causal SNPs during quality control can be other possible reasons for limited benefit of sequence data. We expect, however, that the limited improvement is because the 60 K SNP panel was already sufficiently dense to accurately determine the relationships between animals in our data. PMID:26776363

  2. Genome and exome sequencing in the clinic: unbiased genomic approaches with a high diagnostic yield

    NARCIS (Netherlands)

    Nelen, M.; Veltman, J.A.

    2012-01-01

    For the reasons discussed here, we think whole-genome- or exome-based approaches are currently most suited for diagnostic implementation in genetically heterogeneous diseases, initially to complement and later to replace Sanger sequencing, qPCR and genomic microarrays. Patients do need to be counsel

  3. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    NARCIS (Netherlands)

    Tettelin, H; Masignani; Cieslewicz, MJ; Eisen, JA; Peterson, S; Paulsen, IT; Nelson, KE; Margarit; Read, TD; Madoff, LC; Beanan, MJ; Brinkac, LM; Daugherty, SC; DeBoy, RT; Durkin, AS; Kolonay, JF; Madupu, R; Lewis, MR; Radune, D; Fedorova, NB; Scanlan, D; Khouri, H; Mulligan, S; Carty, HA; Cline, RT; Van Aken, SE; Gill, J; Scarselli, M; Mora, M; Iacobini, ET; Brettoni, C; Galli, G; Mariani, M; Vegni, F; Maione, D; Rinaudo, D; Rappuoli, R; Telford, JL; Kasper, DL; Grandi, G; Fraser, CM

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the oth

  4. Candida albicans genome sequence: a platform for genomics in the absence of genetics

    OpenAIRE

    Odds, Frank C.; Brown, Alistair JP; Gow, Neil AR

    2004-01-01

    Publication of the complete diploid genome sequence of the yeast Candida albicans will accelerate research into the pathogenesis of Candida infections. Comparative genomic analysis highlights genes that may contribute to C. albicans survival and its fitness as a human commensal and pathogen.

  5. Complete Genome Sequence of Streptomyces ambofaciens DSM 40697, a Paradigm for Genome Plasticity Studies

    Science.gov (United States)

    Thibessard, Annabelle

    2016-01-01

    The sequence of Streptomyces ambofaciens DSM 40697 was completely determined. The genome consists of an 8.1-Mbp linear chromosome with terminal inverted repeats of 210 kb. Genomic islands were identified, one of which corresponds to a new putative integrative and conjugative element (ICE) called pSAM3. PMID:27257195

  6. Complete genome sequence of Croceibacter atlanticus HTCC2559T.

    Science.gov (United States)

    Oh, Hyun-Myung; Kang, Ilnam; Ferriera, Steve; Giovannoni, Stephen J; Cho, Jang-Cheon

    2010-09-01

    Here we announce the complete genome sequence of Croceibacter atlanticus HTCC2559(T), which was isolated by high-throughput dilution-to-extinction culturing from the Bermuda Atlantic Time Series station in the Western Sargasso Sea. Strain HTCC2559(T) contained genes for carotenoid biosynthesis, flavonoid biosynthesis, and several macromolecule-degrading enzymes. The genome confirmed physiological observations of cultivated Croceibacter atlanticus strain HTCC2559(T), which identified it as an obligate chemoheterotroph. PMID:20639333

  7. The genome sequence of the filamentous fungus Neurospora crassa

    OpenAIRE

    Read, Nick D; et al.

    2003-01-01

    Neurospora crassa is a central organism in the history of twentieth-century genetics, biochemistry and molecular biology. Here, we report a high-quality draft sequence of the N. crassa genome. The approximately 40-megabase genome encodes about 10,000 protein-coding genes—more than twice as many as in the fission yeast Schizosaccharomyces pombe and only about 25% fewer than in the fruitfly Drosophila melanogaster. Analysis of the gene set yields insights into unexpected aspects of Neu...

  8. Complete Genome Sequence of a Novel Porcine Parvovirus in China

    OpenAIRE

    Dai, Xiao-Fang; Wang, Qiu-Ju; Jiang, Shi-Jin; Xie, Zhi-Jing

    2012-01-01

    The porcine parvovirus JT strain (PPV-JT) was isolated from a piglet showing nonsuppurative myocarditis in Shandong, China, in 2010. The complete genomic sequence of PPV-JT, 4,941 bp long, was determined from clones made from replicative form (RF) DNA. The genomic analysis demonstrated that PPV-JT might have been involved in a recombination event, which will help us understand the molecular characteristics and evolution of PPV in China.

  9. Complete Genome Sequence of the Endophytic Fungus Diaporthe (Phomopsis) ampelina.

    Science.gov (United States)

    Savitha, J; Bhargavi, S D; Praveen, V K

    2016-01-01

    Diaporthe ampelina was isolated as an endophytic fungus from the root of Commiphora wightii, a medicinal plant collected from Dhanvantri Vana, Bangalore University, Bangalore, India. The whole genome is 59 Mb, contains a total of 905 scaffolds, and has a G+C content of 51.74%. The genome sequence of D. ampelina shows a complete absence of the lovastatin (an anticholesterol drug) gene cluster. PMID:27257198

  10. Whole Genome and Transcriptome Sequencing of a B3 Thymoma

    OpenAIRE

    Iacopo Petrini; Arun Rajan; Trung Pham; Donna Voeller; Sean Davis; James Gao; Yisong Wang; Giuseppe Giaccone

    2013-01-01

    Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomi...

  11. Microsatellite evolution inferred from human–chimpanzee genomic sequence alignments

    OpenAIRE

    Webster, Matthew T.; Smith, Nick G.C.; Ellegren, Hans

    2002-01-01

    Most studies of microsatellite evolution utilize long, highly mutable loci, which are unrepresentative of the majority of simple repeats in the human genome. Here we use an unbiased sample of 2,467 microsatellite loci derived from alignments of 5.1 Mb of genomic sequence from human and chimpanzee to investigate the mutation process of tandemly repetitive DNA. The results indicate that the process of microsatellite evolution is highly heterogeneous, exhibiting differences between loci of diffe...

  12. Complete genome sequence of the European sheatfish virus

    OpenAIRE

    Mavian, Carla; López-Bueno, Alberto; Somalo, María Pilar Fernández; Alcamí, Antonio; Alejo, Alí

    2012-01-01

    Viral diseases are an increasing threat to the thriving aquaculture industry worldwide. An emerging group of fish pathogens is formed by several ranaviruses, which have been isolated at different locations from freshwater and seawater fish species since 1985. We report the complete genome sequence of European sheatfish ranavirus (ESV), the first ranavirus isolated in Europe, which causes high mortality rates in infected sheatfish (Silurus glanis) and in other species. Analysis of the genome se...

  13. Complete genome sequence of the European sheatfish virus

    OpenAIRE

    Mavian, Carla; López-Bueno, Alberto; Alcamí, Antonio; Alejo, Alí; Fernández Somalo, María Pilar

    2012-01-01

    Viral diseases are an increasing threat to the thriving aquaculture industry worldwide. An emerging group of fish pathogens is formed by several ranaviruses, which have been isolated at different locations from freshwater and seawater fish species since 1985. We report the complete genome sequence of European sheatfish ranavirus (ESV), the first ranavirus isolated in Europe, which causes high mortality rates in infected sheatfish (Silurus glanis) and in other species. Analysis of the genome seq...

  14. Arrangement of repetitive sequences in the genome of herpesvirus Sylvilagus.

    OpenAIRE

    Medveczky, M M; Geck, P; Clarke, C; Byrnes, J; Sullivan, J L; Medveczky, P G

    1989-01-01

    Herpesvirus sylvilagus is a lymphotropic (type gamma) herpesvirus of cottontail rabbits (Sylvilagus floridanus). Analysis of virion DNA of herpesvirus sylvilagus has revealed that the genome consists of one stretch of about 120 kilobase pairs of internal, unique DNA flanked by a variable number of 553-base-pair tandem repeats. The G + C content of the repetitive DNA is extremely high (83%), as determined by sequencing. The organization of the herpesvirus sylvilagus genome is, therefore, simil...

  15. Complete Genome Sequence of the Endophytic Fungus Diaporthe (Phomopsis) ampelina

    Science.gov (United States)

    Bhargavi, S. D.; Praveen, V. K.

    2016-01-01

    Diaporthe ampelina was isolated as an endophytic fungus from the root of Commiphora wightii, a medicinal plant collected from Dhanvantri Vana, Bangalore University, Bangalore, India. The whole genome is 59 Mb, contains a total of 905 scaffolds, and has a G+C content of 51.74%. The genome sequence of D. ampelina shows a complete absence of the lovastatin (an anticholesterol drug) gene cluster. PMID:27257198

  16. Genome sequencing, annotation of Citrobacter freundii strain GTC 09479

    Directory of Open Access Journals (Sweden)

    Kazuyuki Kimura

    2014-12-01

    Full Text Available We report the 4.9-Mb genome sequence of Citrobacter freundii strain GTC 09479, isolated from a urine sample collected in 1983 at Gifu University Graduate School of Medicine, Japan. This draft genome consists of 4,899,578 bp with 51.62% G + C, 4,574 predicted CDSs, 72 tRNAs and 10 rRNAs.

  17. Standardized metadata for human pathogen/vector genomic sequences.

    Directory of Open Access Journals (Sweden)

    Vivien G Dugan

    Full Text Available High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium's minimal information (MIxS) and NCBI's BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards. The use of this metadata standard by all ongoing and future GSCID sequencing projects will

  18. Sequence analysis reveals mosaic genome of Aichi virus

    Directory of Open Access Journals (Sweden)

    Han Xiaohong

    2011-08-01

    Full Text Available Abstract Aichi virus is a positive-sense, single-stranded RNA virus that has been shown to be associated with diarrhea in children. In the present study, phylogenetic and recombination analyses based on the Aichi virus complete genomes available in GenBank reveal a mosaic genome sequence [GenBank: FJ890523], of which the nt 261-852 region (the nt position was based on the aligned sequence file) shows a close relationship with AB010145/Japan with 97.9% sequence identity, while the other genomic regions show a close relationship with AY747174/Germany with 90.1% sequence identity. Our results will provide valuable hints for future research on Aichi virus diversity. Aichi virus is a member of the Kobuvirus genus of the Picornaviridae family [1,2] and is a positive-sense, single-stranded RNA virus. Its presence in fecal specimens of children suffering from diarrhea has been demonstrated in several Asian countries [3-6], in Brazil and Germany [7], in France [8] and in Tunisia [9]. Some reports showed a high level of seroprevalence in adults [7,10], suggesting widespread exposure to Aichi virus during childhood. The genome of Aichi virus contains 8,280 nucleotides and a poly(A) tail. The single large open reading frame (nt 713-8014 according to the strain AB010145) encodes a polyprotein of 2,432 amino acids that is cleaved into the typical picornavirus structural proteins VP0, VP3, VP1, and nonstructural proteins 2A, 2B, 2C, 3A, 3B, 3C and 3D [2,11]. Based on the phylogenetic analysis of 519-bp sequences at the 3C-3D (3CD) junction, Aichi viruses can be divided into two genotypes, A and B, with approximately 90% sequence homology [12]. Although only six complete genomes of Aichi virus have been deposited in GenBank at present, mosaic genomes can be found in strains from different countries.
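
    The region-by-region identity comparison described in this abstract can be illustrated with a minimal Python sketch. It assumes two sequences that are already aligned to equal length; the function names, window size, and toy sequences are hypothetical and are not taken from the paper, whose actual recombination analysis is not specified here.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned, gap-free columns in which a and b carry the same base."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def windowed_identity(query: str, ref: str, window: int, step: int):
    """Return (start, identity) pairs for consecutive windows along the alignment."""
    results = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        results.append((start, percent_identity(query[start:end], ref[start:end])))
    return results


if __name__ == "__main__":
    # Toy alignment: the first half matches the reference, the second half does not.
    for start, ident in windowed_identity("AAAACCCC", "AAAAGGGG", window=4, step=4):
        print(start, ident)
```

    A sharp change in identity between adjacent windows, measured against two different reference strains, is the kind of signal that would prompt a closer look for a mosaic region.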

  19. Transcription of densovirus endogenous sequences in the Myzus persicae genome.

    Science.gov (United States)

    Clavijo, Gabriel; van Munster, Manuella; Monsion, Baptiste; Bochet, Nicole; Brault, Véronique

    2016-04-01

    Integration of non-retroviral sequences in the genome of different organisms has been observed and, in some cases, a relationship of these integrations with immunity has been established. The genome of the green peach aphid, Myzus persicae (clone G006), was screened for densovirus-like sequence (DLS) integrations. A total of 21 DLSs localized on 10 scaffolds were retrieved that mostly shared sequence identity with two aphid-infecting viruses, Myzus persicae densovirus (MpDNV) and Dysaphis plantaginea densovirus (DplDNV). In some cases, uninterrupted potential ORFs corresponding to non-structural viral proteins or capsid proteins were found within DLSs identified in the aphid genome. In particular, one scaffold harboured a complete virus-like genome, while another scaffold contained two virus-like genomes in reverse orientation. Remarkably, transcription of some of these ORFs was observed in M. persicae, suggesting a biological effect of these viral integrations. In contrast to most of the other densoviruses identified so far that induce acute host infection, it has been reported previously that MpDNV has only a minor effect on M. persicae fitness, while DplDNV can even have a beneficial effect on its aphid host. This suggests that DLS integration in the M. persicae genome may be responsible for the latency of MpDNV infection in the aphid host. PMID:26758080

  20. Low-pass sequencing for microbial comparative genomics

    Directory of Open Access Journals (Sweden)

    Kennedy Sean

    2004-01-01

    Full Text Available Abstract Background We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the
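
    The GC-content figure quoted in this abstract is a simple base-composition statistic. The sketch below shows one way to estimate it from a pool of single-pass reads; the function names and the toy reads are hypothetical, and the authors' own pipeline is not described in this record.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T); other symbols are ignored."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / total


def pooled_gc_content(reads) -> float:
    """GC content over all reads pooled together, so long reads weigh more than short ones."""
    return gc_content("".join(reads))


if __name__ == "__main__":
    toy_reads = ["GGCCGCGA", "CGCGTACG", "GCGGCCAT"]  # made-up reads, not real data
    print(round(pooled_gc_content(toy_reads), 1))
```

    Pooling the reads before counting gives a length-weighted estimate, which is usually what is wanted when read lengths vary.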

  1. The complete mitochondrial genome sequence of the budgerigar, Melopsittacus undulatus.

    Science.gov (United States)

    Guan, Xiaojing; Xu, Jun; Smith, Edward J

    2016-01-01

    Here, we describe the budgie's mitochondrial genome sequence, a resource that can facilitate this parrot's use as a model organism as well as for determining its phylogenetic relatedness to other parrots/Psittaciformes. The estimated total length of the sequence was 18,193 bp. In addition to the 13 protein and tRNA and rRNA coding regions, the sequence also includes a duplicated hypervariable region, a feature unique to only a few birds. The two hypervariable regions shared a sequence identity of about 86%. PMID:24660934

  2. Pittosporum cryptic virus 1: genome sequence completion using next-generation sequencing.

    Science.gov (United States)

    Elbeaino, Toufic; Kubaa, Raied Abou; Tuzlali, Hasan Tuna; Digiaro, Michele

    2016-07-01

    Next-generation sequencing (NGS) was applied to dsRNAs extracted from an Italian pittosporum plant infected with pittosporum cryptic virus 1 (PiCV1). NGS allowed assembly of the full genome sequence of PiCV1, comprising dsRNA1 (1.9 kbp) and dsRNA2 (1.5 kbp), which encode the RNA-dependent RNA polymerase and capsid protein genes, respectively. Phylogenetic and sequence analyses confirmed that PiCV1 is a new member of the genus Deltapartitivirus, family Partitiviridae. From the same plant, NGS also permitted assembly of the complete genome sequence of eggplant mottled dwarf virus (EMDV), which shared 86 % to 98 % nucleotide sequence identity with complete and partial sequences (ca 6750 nt) of other known EMDV isolates with sequences available in the GenBank database. PMID:27087112

  3. Sequence Classification: 892357 [TMBETA-GENOME[Archive

    Lifescience Database Archive (English)

    Full Text Available Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB >gi|6322971|ref|NP_013043.1| Fe(II)-dependent su ... dioxygenase, involved in sulfonate catabolism for use ... as a sulfur source, contains sequence that closely ...

  4. Genome Sequencing and Annotation of Mycobacterium tuberculosis PR08 strain

    Directory of Open Access Journals (Sweden)

    Mohammad Maaruf Jaafar

    2016-03-01

    Full Text Available Mycobacterium tuberculosis is an acid-fast bacterial species in the family Mycobacteriaceae and is the causative agent of most cases of tuberculosis. Here, we report the genomic features of Mycobacterium tuberculosis isolated from the cerebrospinal fluid (CSF) of a patient diagnosed with both pulmonary and extrapulmonary tuberculosis (TB). The isolated strain was identified as Mycobacterium tuberculosis PR08 (MTB PR08). Genomic DNA of the MTB PR08 strain was extracted and subjected to whole genome sequencing using MiSeq (Illumina, CA, USA). The draft genome size of the MTB PR08 strain is 4,292,364 bp with a G + C content of 65.2%. This strain was annotated to have 4723 genes and 48 RNAs. This whole genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number CP010895.

  5. Sequencing and analysis of the giant panda genome

    Institute of Scientific and Technical Information of China (English)

    YANG HuanMing

    2010-01-01

    The giant panda (Ailuropoda melanoleuca) is loved all over the world and is considered a symbol of China, as illustrated by its being one of the mascots for the Beijing 2008 Olympic Games. It is also one of the world's most endangered animals and a flagship species for conservation. Using next-generation sequencing technology (Illumina Genome Analyzer) and our in-house assembly software, we have generated the first map of the giant panda genome sequence. This map will provide an unparalleled amount of information to aid in understanding the genetic and biological nature of this unique species and will contribute significantly to disease control and conservation efforts for this endangered species. In March 2008, the giant panda genome sequencing and analysis project was started at the Beijing Genomics Institute (BGI) in Shenzhen with collaborators from the Kunming Institute of Zoology and the Chengdu Research Base of Giant Panda Breeding. On 21 Jan. 2010, this collaboration resulted in the publication, as a cover story in the journal Nature, of the sequencing and analysis of the giant panda genome.

  6. Genome sequence of the pea aphid Acyrthosiphon pisum

    DEFF Research Database (Denmark)

    Richards, S.; Gibbs, R. A.; Gerardo, N. M.;

    2010-01-01

    Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first...... published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we...

  7. The complete chloroplast genome sequence of Curcuma flaviflora (Curcuma).

    Science.gov (United States)

    Zhang, Yan; Deng, Jiabin; Li, Yangyi; Gao, Gang; Ding, Chunbang; Zhang, Li; Zhou, Yonghong; Yang, Ruiwu

    2016-09-01

    The complete chloroplast (cp) genome of Curcuma flaviflora, a medicinal plant in Southeast Asia, was sequenced. The genome was 160 478 bp in length, with 36.3% GC content. A pair of inverted repeats (IRs) of 26 946 bp each were separated by a large single copy (LSC) region of 88 008 bp and a small single copy (SSC) region of 18 578 bp. The cp genome contained 132 annotated genes, including 79 protein coding genes, 30 tRNA genes, and four rRNA genes. Nineteen of these genes were duplicated in the inverted repeat regions. PMID:26367332
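
    The region lengths reported in this abstract can be checked against the stated genome size, since a quadripartite plastid genome consists of one LSC, one SSC, and two copies of the IR. The short Python sketch below performs that arithmetic; the function name is hypothetical and the numbers are the ones given in the abstract.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular genome made of LSC + SSC + two inverted repeat copies."""
    return lsc + ssc + 2 * ir


if __name__ == "__main__":
    # Values reported for Curcuma flaviflora in the abstract above.
    total = quadripartite_length(lsc=88_008, ssc=18_578, ir=26_946)
    print(total)             # compare with the reported genome size
    print(total == 160_478)  # consistency check against the reported 160 478 bp
```

    The same check is a quick way to catch transcription errors in region lengths when comparing organelle genome reports.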

  8. The complete mitochondrial genome sequence of Emperor Penguins (Aptenodytes forsteri).

    Science.gov (United States)

    Xu, Qiwu; Xia, Yan; Dang, Xiao; Chen, Xiaoli

    2016-09-01

    The emperor penguin (Aptenodytes forsteri) is the largest living species of penguin. Here we report the complete mitochondrial genome of the emperor penguin for the first time. The mitochondrial genome is a circular molecule of 17 301 bp in length, consisting of 13 protein-coding genes, 22 tRNA genes, two rRNA genes, and one control region. To verify the accuracy and utility of the newly determined mitogenome sequence, we constructed a phylogenetic tree of the emperor penguin together with 10 other closely related species. This is the second complete mitochondrial genome of a penguin, and it will be an important resource for studying the mitochondrial evolution of birds. PMID:26403091

  9. Draft genome sequence of the rubber tree Hevea brasiliensis

    OpenAIRE

    Rahman Ahmad Yamin Abdul; Usharraj Abhilash O; Misra Biswapriya B; Thottathil Gincy P; Jayasekaran Kandakumar; Feng Yun; Hou Shaobin; Ong Su Yean; Ng Fui Ling; Lee Ling Sze; Tan Hock Siew; Sakaff Muhd Khairul Luqman Muhd; Teh Beng Soon; Khoo Bee; Badai Siti Suriawati

    2013-01-01

    Background: Hevea brasiliensis, a member of the Euphorbiaceae family, is the major commercial source of natural rubber (NR). NR is a latex polymer with high elasticity, flexibility, and resilience that has played a critical role in the world economy since 1876. Results: Here, we report the draft genome sequence of H. brasiliensis. The assembly spans ~1.1 Gb of the estimated 2.15 Gb haploid genome. Overall, ~78% of the genome was identified as repetitive DNA. Gene prediction shows 68,95...

  10. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignments. The framework integrates the information derived from multiple sequence alignments and phylogenetic trees (hypothetical trees of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations are described to manipulate a multiple sequence alignment and to derive mutation-related information from a phylogenetic tree by superimposing parsimony analysis. The framework has been applied to protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
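
    The idea of a "constrained column" can be illustrated with a small sketch. This is not the paper's Prolog toolkit: it is written in Python, it ignores the phylogenetic tree entirely, and the property classes below are a hypothetical grouping chosen only for illustration. It flags alignment columns whose residues vary but all stay within one property class.

```python
# Hypothetical residue property classes (an assumption, not from the paper).
PROPERTY_CLASSES = {
    "hydrophobic": set("AVLIMFWC"),
    "polar": set("STNQYG"),
    "positive": set("KRH"),
    "negative": set("DE"),
}


def constrained_columns(alignment):
    """Return (column index, class name) for columns that mutate within one class."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    results = []
    for col in range(width):
        residues = {row[col] for row in alignment}
        # Skip gapped columns and invariant columns (no mutation to constrain).
        if "-" in residues or len(residues) < 2:
            continue
        for name, members in PROPERTY_CLASSES.items():
            if residues <= members:
                results.append((col, name))
                break
    return results


example = ["AKDS", "VRDG", "LKDW"]
print(constrained_columns(example))  # [(0, 'hydrophobic'), (1, 'positive')]
```

    In the example, column 2 is skipped because it is invariant, and column 3 is skipped because its residues span more than one class.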

  11. Complete Plastid Genome Sequence of the Brown Alga Undaria pinnatifida.

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    In this study, we fully sequenced the circular plastid genome of a brown alga, Undaria pinnatifida. The genome is 130,383 base pairs (bp) in size; it contains a large single-copy region (LSC, 76,598 bp) and a small single-copy region (SSC, 42,977 bp), separated by two inverted repeats (IRa and IRb: 5,404 bp each). The genome contains 139 protein-coding, 28 tRNA, and 6 rRNA genes; none of these genes contains introns. Organization and gene contents of the U. pinnatifida plastid genome were similar to those of Saccharina japonica. There is a co-linear relationship between the plastid genome of U. pinnatifida and that of three previously sequenced large brown algal species. Phylogenetic analyses of 43 taxa based on 23 plastid protein-coding genes grouped all plastids into a red or green lineage. In the large brown algae branch, U. pinnatifida and S. japonica formed a sister clade with much closer relationship to Ectocarpus siliculosus than to Fucus vesiculosus. For the first time, the start codon ATT was identified in the plastid genome of large brown algae, in the atpA gene of U. pinnatifida. In addition, we found a gene-length change induced by a 3-bp repetitive DNA in ycf35 and ilvB genes of the U. pinnatifida plastid genome.

  12. Sequence modelling and an extensible data model for genomic database

    Energy Technology Data Exchange (ETDEWEB)

    Li, Peter Wei-Der [California Univ., San Francisco, CA (United States); Lawrence Berkeley Lab., CA (United States)]

    1992-01-01

    The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences, and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object-oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need for a modelling framework that incorporates the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented a query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.

  14. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    Directory of Open Access Journals (Sweden)

    Weitemier Kevin

    2011-05-01

    Background: Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results: A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions: The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species.

  15. Mitochondrial DNA sequences in the nuclear genome of a locust.

    Science.gov (United States)

    Gellissen, G; Bradfield, J Y; White, B N; Wyatt, G R

    The endosymbiotic theory of the origin of mitochondria is widely accepted, and implies that loss of genes from the mitochondria to the nucleus of eukaryotic cells has occurred over evolutionary time. However, evidence at the DNA sequence level for gene transfer between these organelles has so far been limited to a single example, the demonstration that a mitochondrial ATPase subunit gene of Neurospora crassa has an homologous partner in the nuclear genome. From a gene library of the insect, Locusta migratoria, we have now isolated two clones, representing separate fragments of nuclear DNA, which contain sequences homologous to the mitochondrial genes for ribosomal RNA, as well as regions of homology with highly repeated nuclear sequences. The results suggest the transfer of sequences between mitochondrial and nuclear genomes, followed by evolutionary divergence. PMID:6298629

  16. Complete genome sequence of Allochromatium vinosum DSM 180T

    Energy Technology Data Exchange (ETDEWEB)

    Weissgerber, Thomas [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany; Zigann, Renate [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany; Bruce, David [Los Alamos National Laboratory (LANL); Chang, Yun-Juan [ORNL; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Hauser, Loren John [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Land, Miriam L [ORNL; Munk, Christine [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Dahl, Christiane [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany

    2011-01-01

    Allochromatium vinosum, formerly Chromatium vinosum, is a mesophilic purple sulfur bacterium belonging to the family Chromatiaceae in the bacterial class Gammaproteobacteria. The genus Allochromatium currently contains five species. All members were isolated from freshwater, brackish water or marine habitats and are predominately obligate phototrophs. Here we describe the features of the organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the Chromatiaceae within the purple sulfur bacteria thriving in globally occurring habitats. The 3,669,074 bp genome with its 3,302 protein-coding and 64 RNA genes was sequenced within the Joint Genome Institute Community Sequencing Program.

  17. Complete genome sequence of Thauera aminoaromatica strain MZ1T

    Energy Technology Data Exchange (ETDEWEB)

    Sanseverino, John [ORNL; Chauhan, Archana [University of Tennessee, Knoxville (UTK); Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Dalin, Eileen [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Sims, David [Los Alamos National Laboratory (LANL); Brettin, Thomas S [ORNL; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Chang, Yun-Juan [ORNL; Larimer, Frank W [ORNL; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Moser, Scott [University of Tennessee, Knoxville (UTK); Jegier, Patricia [University of Tennessee, Knoxville (UTK); Close, Dan [University of Tennessee, Knoxville (UTK); Wang, Ying [University of Tennessee, Knoxville (UTK); Layton, Alice [University of Tennessee, Knoxville (UTK); Allen, Michael S. [University of Tennessee, Knoxville (UTK); Sayler, Gary [University of Tennessee, Knoxville (UTK)

    2012-01-01

    Thauera aminoaromatica strain MZ1T, an isolate belonging to the genus Thauera, of the family Rhodocyclaceae and the class Betaproteobacteria, has been characterized for its ability to produce abundant exopolysaccharide and degrade various aromatic compounds with nitrate as an electron acceptor. These properties, if fully understood at the genome-sequence level, can aid in environmental processing of organic matter in anaerobic cycles by short-circuiting a central anaerobic metabolite, acetate, from microbiological conversion to methane, a critical greenhouse gas. Strain MZ1T is the first strain from the genus Thauera with a completely sequenced genome. The 4,496,212 bp chromosome and 78,374 bp plasmid contain 4,071 protein-coding and 71 RNA genes, and were sequenced as part of the DOE Community Sequencing Program CSP_776774.

  18. Complete chloroplast genome sequence of Fritillaria unibracteata var. wabuensis based on SMRT Sequencing Technology.

    Science.gov (United States)

    Li, Ying; Li, Qiushi; Li, Xiwen; Song, Jingyuan; Sun, Chao

    2016-09-01

    Fritillaria unibracteata var. wabuensis is an important medicinal plant used for the treatment of cough symptoms related to the respiratory system. The chloroplast genome of F. unibracteata var. wabuensis (GenBank accession no. KF769142) was assembled using the PacBio RS platform (Pacific Biosciences, Beverly, MA) as a circular sequence of 151 009 bp. The assembled genome contains 133 genes, including 88 protein-coding, 37 tRNA, and eight rRNA genes. This genome sequence will provide an important resource for further studies on the evolution of the genus Fritillaria and molecular identification of Fritillaria herbs and their adulterants. This work suggests that PacBio RS is a powerful tool to sequence and assemble chloroplast genomes. PMID:26370383

  19. Complete Genome Sequence of Streptococcus agalactiae CNCTC 10/84, a Hypervirulent Sequence Type 26 Strain

    OpenAIRE

    Hooven, Thomas A.; Randis, Tara M.; Sean C Daugherty; Narechania, Apurva; Planet, Paul J.; Tettelin, Hervé; Ratner, Adam J.

    2014-01-01

    Streptococcus agalactiae (group B Streptococcus [GBS]) is a human pathogen with a propensity to cause neonatal infections. We report the complete genome sequence of GBS strain CNCTC 10/84, a hypervirulent clinical isolate frequently used to study GBS pathogenesis. Comparative analysis of this sequence may shed light on novel pathogenic mechanisms.

  20. Genome sequencing highlights the dynamic early history of dogs.

    OpenAIRE

    Freedman, Adam H.; Ilan Gronau; Schweizer, Rena M.; Diego Ortega-Del Vecchyo; Eunjung Han; Silva, Pedro M.; Marco Galaverni; Zhenxin Fan; Peter Marx; Belen Lorente-Galdos; Holly Beale; Oscar Ramirez; Farhad Hormozdiari; Can Alkan; Carles Vilà

    2014-01-01

    To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we generated high-quality genome sequences from three gray wolves, one from each of the three putative centers of dog domestication, two basal dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. Analysis of these sequences supports a demographic model in which dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-di...

  1. Genome Sequencing Highlights the Dynamic Early History of Dogs

    OpenAIRE

    Freedman, A.H.; Gronau, I.; Schweizer, R.M.; Han, E; Silva, P.M.; Galaverni, M.; Fan, Z; Marx, P; Lorente-Galdos, B.; Beale, H.; Ramirez, O.; Hormozdiari, Fereydoun; Alkan, Can; Vilà, Carles; Geffen, E

    2014-01-01

    To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we generated high-quality genome sequences from three gray wolves, one from each of the three putative centers of dog domestication, two basal dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. Analysis of these sequences supports a demographic model in which dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-di...

  2. Sequencing Crop Genomes: A Gateway to Improve Tropical Agriculture

    OpenAIRE

    Thottathil, Gincy Paily; Jayasekaran, Kandakumar; Othman, Ahmad Sofiman

    2016-01-01

    Agricultural development in the tropics lags behind development in the temperate latitudes due to the lack of advanced technology, and various biotic and abiotic factors. To cope with the increasing demand for food and other plant-based products, improved crop varieties have to be developed. To breed improved varieties, a better understanding of crop genetics is necessary. With the advent of next-generation DNA sequencing technologies, many important crop genomes have been sequenced. Primary ...

  3. The impact of next-generation sequencing on genomics

    OpenAIRE

    Zhang, Jun; Chiodini, Rod; Badr, Ahmed; Zhang, Genfa

    2011-01-01

    This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. But, the massive data produced by NGS also presents a significan...

  4. Analysis of Chimpanzee History Based on Genome Sequence Alignments

    OpenAIRE

    Caswell, Jennifer L.; Richter, Daniel J.; Neubauer, Julie; Schirmer, Christine; Gnerre, Sante; Mallick, Swapan; Reich, David Emil

    2008-01-01

    Population geneticists often study small numbers of carefully chosen loci, but it has become possible to obtain orders of magnitude more data from overlaps of genome sequences. Here, we generate tens of millions of base pairs of multiple sequence alignments from combinations of three western chimpanzees, three central chimpanzees, an eastern chimpanzee, a bonobo, a human, an orangutan, and a macaque. Analysis provides a more precise understanding of demographic history than was previously...

  5. Complete mitochondrial genome sequence of Romanogobio tenuicorpus (Amur whitefin gudgeon).

    Science.gov (United States)

    Dong, Fang; Tong, Guang-Xiang; Kuang, You-Yi; Sun, Xiao-Wen

    2015-01-01

    Amur whitefin gudgeon (Romanogobio tenuicorpus) belongs to the family Cyprinidae and is a freshwater aquaculture species in China. In this report, we determined the complete mitochondrial genome sequence of Romanogobio tenuicorpus, a 16,600 bp circular molecule with 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and a control region. The conserved sequence blocks CSB1, CSB2 and CSB3 were also detected. PMID:24409923

  6. Molecular evolution of herpesviruses: genomic and protein sequence comparisons.

    OpenAIRE

    Karlin, S; Mocarski, E S; Schachtel, G A

    1994-01-01

    Phylogenetic reconstruction of herpesvirus evolution is generally founded on amino acid sequence comparisons of specific proteins. These are relevant to the evolution of the specific gene (or set of genes), but the resulting phylogeny may vary depending on the particular sequence chosen for analysis (or comparison). In the first part of this report, we compare 13 herpesvirus genomes by using a new multidimensional methodology based on distance measures and partial orderings of dinucleotide re...

  7. Mitochondrial genome sequences and comparative genomics of Phytophthora ramorum and P. sojae

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Frank N.; Douda, Bensasson; Tyler, Brett M.; Boore, Jeffrey L.

    2007-01-01

    The complete sequences of the mitochondrial genomes of the oomycetes Phytophthora ramorum and P. sojae were determined during the course of their complete nuclear genome sequencing (Tyler, et al. 2006). Both are circular, with sizes of 39,314 bp for P. ramorum and 42,975 bp for P. sojae. Each contains a total of 37 identifiable protein-encoding genes, 25 or 26 tRNAs (P. sojae and P. ramorum, respectively) specifying 19 amino acids, and a variable number of ORFs (7 for P. ramorum and 12 for P. sojae) which are potentially additional functional genes. Non-coding regions comprise approximately 11.5 percent and 18.4 percent of the genomes of P. ramorum and P. sojae, respectively. Relative to P. sojae, there is an inverted repeat of 1,150 bp in P. ramorum that includes an unassigned unique ORF, a tRNA gene, and adjacent non-coding sequences, but otherwise the gene order in both species is identical. Comparisons of these genomes with published sequences of the P. infestans mitochondrial genome reveal a number of similarities, but the gene order in P. infestans differs in two adjacent locations due to inversions. Sequence alignments of the three genomes indicated sequence conservation ranging from 75 to 85 percent and that specific regions were more variable than others.

  8. Targeted or whole genome sequencing of formalin fixed tissue samples: potential applications in cancer genomics.

    Science.gov (United States)

    Munchel, Sarah; Hoang, Yen; Zhao, Yue; Cottrell, Joseph; Klotzle, Brandy; Godwin, Andrew K; Koestler, Devin; Beyerlein, Peter; Fan, Jian-Bing; Bibikova, Marina; Chien, Jeremy

    2015-09-22

    Current genomic studies are limited by the poor availability of fresh-frozen tissue samples. Although formalin-fixed diagnostic samples are in abundance, they are seldom used in current genomic studies because of concern about formalin-fixation artifacts. Better characterization of these artifacts will allow the use of archived clinical specimens in translational and clinical research studies. To provide a systematic analysis of formalin-fixation artifacts on Illumina sequencing, we generated 26 DNA sequencing data sets from 13 pairs of matched formalin-fixed paraffin-embedded (FFPE) and fresh-frozen (FF) tissue samples. The results indicate a high rate of concordant calls between matched FF/FFPE pairs at reference and variant positions in three commonly used sequencing approaches (whole genome, whole exome, and targeted exon sequencing). Global mismatch rates and C · G > T · A substitutions were comparable between matched FF/FFPE samples, and discordant rates were low (<0.26%) in all samples. Finally, low-pass whole genome sequencing produces a similar pattern of copy number alterations between FF/FFPE pairs. The results from our studies suggest the potential use of diagnostic FFPE samples for cancer genomic studies to characterize and catalog variations in cancer genomes. PMID:26305677

  9. Identification of cancer-driver genes in focal genomic alterations from whole genome sequencing data.

    Science.gov (United States)

    Jang, Ho; Hur, Youngmi; Lee, Hyunju

    2016-01-01

    DNA copy number alterations (CNAs) are the main genomic events that occur during the initiation and development of cancer. Distinguishing driver aberrant regions from passenger regions, which might contain candidate target genes for cancer therapies, is an important issue. Several methods for identifying cancer-driver genes from multiple cancer patients have been developed for single nucleotide polymorphism (SNP) arrays. However, for NGS data, methods for the SNP array cannot be directly applied because of different characteristics of NGS such as higher resolutions of data without predefined probes and incorrectly mapped reads to reference genomes. In this study, we developed a wavelet-based method for identification of focal genomic alterations for sequencing data (WIFA-Seq). We applied WIFA-Seq to whole genome sequencing data from glioblastoma multiforme, ovarian serous cystadenocarcinoma and lung adenocarcinoma, and identified focal genomic alterations, which contain candidate cancer-related genes as well as previously known cancer-driver genes. PMID:27156852

  11. Sequence Determination from Overlapping Fragments: A Simple Model of Whole-Genome Shotgun Sequencing

    Science.gov (United States)

    Derrida, Bernard; Fink, Thomas M.

    2002-02-01

    Assembling fragments randomly sampled from along a sequence is the basis of whole-genome shotgun sequencing, a technique used to map the DNA of the human and other genomes. We calculate the probability that a random sequence can be recovered from a collection of overlapping fragments. We provide an exact solution for an infinite alphabet and in the case of constant overlaps. For the general problem we apply two assembly strategies and give the probability that the assembly puzzle can be solved in the limit of infinitely many fragments.
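
    The recovery question posed in this abstract can be explored numerically. The sketch below is a simplified toy model, not the authors' calculation: it assumes fixed-length fragments with uniformly random start positions and treats the sequence as recoverable when the fragments tile it end to end with every consecutive pair overlapping by at least `min_overlap` positions (ambiguity from repeated content is ignored). All function and parameter names are hypothetical.

```python
import random


def tiles_with_overlap(seq_len, frag_len, starts, min_overlap=1):
    """True if fragments cover [0, seq_len) and neighbours overlap enough."""
    if not starts:
        return False
    ordered = sorted(set(starts))
    # The first fragment must reach the left end, the last one the right end.
    if ordered[0] != 0 or ordered[-1] + frag_len < seq_len:
        return False
    for left, right in zip(ordered, ordered[1:]):
        if left + frag_len - right < min_overlap:
            return False
    return True


def estimate_recovery_probability(seq_len, frag_len, n_frags,
                                  trials=2000, min_overlap=1, seed=0):
    """Monte Carlo estimate of the tiling probability under the toy model."""
    rng = random.Random(seed)
    max_start = seq_len - frag_len
    hits = 0
    for _ in range(trials):
        starts = [rng.randint(0, max_start) for _ in range(n_frags)]
        hits += tiles_with_overlap(seq_len, frag_len, starts, min_overlap)
    return hits / trials


print(tiles_with_overlap(10, 4, [0, 3, 6]))  # True: overlaps of 1 and 1
print(tiles_with_overlap(10, 4, [0, 4, 6]))  # False: first pair only abuts
```

    Calling `estimate_recovery_probability` with increasing `n_frags` gives a rough feel for how the chance of a complete tiling grows with the number of fragments.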

  12. A Pan-HIV Strategy for Complete Genome Sequencing.

    Science.gov (United States)

    Berg, Michael G; Yamaguchi, Julie; Alessandri-Gradt, Elodie; Tell, Robert W; Plantier, Jean-Christophe; Brennan, Catherine A

    2016-04-01

    Molecular surveillance is essential to monitor HIV diversity and track emerging strains. We have developed a universal library preparation method (HIV-SMART [i.e., switching mechanism at 5' end of RNA transcript]) for next-generation sequencing that harnesses the specificity of HIV-directed priming to enable full genome characterization of all HIV-1 groups (M, N, O, and P) and HIV-2. Broad application of the HIV-SMART approach was demonstrated using a panel of diverse cell-cultured virus isolates. HIV-1 non-subtype B-infected clinical specimens from Cameroon were then used to optimize the protocol to sequence directly from plasma. When multiplexing 8 or more libraries per MiSeq run, full genome coverage at a median ∼2,000× depth was routinely obtained for either sample type. The method reproducibly generated the same consensus sequence, consistently identified viral sequence heterogeneity present in specimens, and at viral loads of ≤4.5 log copies/ml yielded sufficient coverage to permit strain classification. HIV-SMART provides an unparalleled opportunity to identify diverse HIV strains in patient specimens and to determine phylogenetic classification based on the entire viral genome. Easily adapted to sequence any RNA virus, this technology illustrates the utility of next-generation sequencing (NGS) for viral characterization and surveillance. PMID:26699702

  13. A shot in the genome: how accurately do shotgun 454 sequences represent a genome?

    Directory of Open Access Journals (Sweden)

    Meglécz Emese

    2012-05-01

    Background: Next generation sequencing (NGS) provides a valuable method to quickly obtain sequence information from non-model organisms at a genomic scale. In principle, if sequencing is not targeted for a genomic region or sequence type (e.g. coding region, microsatellites), NGS reads can be used as a genome snapshot and provide information on the different types of sequences in the genome. However, no study has ascertained if a typical 454 dataset of low coverage (1/4-1/8 of a PicoTiter plate, leading to generally less than 0.1x of coverage) represents all parts of genomes equally. Findings: Partial genome shotgun sequencing of total DNA (without enrichment) on a 454 NGS platform was used to obtain reads of Apis mellifera (454 reads hereafter). These 454 reads were compared to the assembled chromosomes of this species in three different aspects: (i) dimer and trimer compositions, (ii) the distribution of mapped 454 sequences along the chromosomes and (iii) the numbers of different classes of microsatellites. Highly significant chi-square tests for all three types of analyses indicated that the 454 data is not a perfect random sample of the genome. Only the number of 454 reads mapped to each of the 16 chromosomes and the number of microsatellites pooled by motif (repeat unit length) were not significantly different from the expected values. However, a very strong correlation (correlation coefficients greater than 0.97) was observed between most of the 454 variables (the number of different dimers and trimers, the number of 454 reads mapped to each chromosome fragment of one Mb, the number of 454 reads mapped to each chromosome, the number of microsatellites of each class) and their corresponding genomic variables. Conclusions: The results of chi-square tests suggest that 454 shotgun reads cannot be regarded as a perfect representation of the genome especially if the comparison is done on a finer scale (e.g. chromosome fragments instead of whole
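
    The dimer and trimer comparison described in this abstract can be sketched as a k-mer count followed by a Pearson chi-square statistic. This is an illustrative sketch only, with hypothetical function names; it is not the authors' pipeline, it computes the statistic by hand rather than with a statistics library, and it does not produce a p-value.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count overlapping k-mers, skipping any window with a non-ACGT base."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def chi_square_statistic(observed, reference):
    """Pearson chi-square of read k-mer counts against genome proportions.

    Only k-mers present in the reference contribute; k-mers seen in the
    reads but absent from the reference are ignored in this sketch.
    """
    total_obs = sum(observed.get(kmer, 0) for kmer in reference)
    total_ref = sum(reference.values())
    stat = 0.0
    for kmer, ref_count in reference.items():
        expected = total_obs * ref_count / total_ref
        stat += (observed.get(kmer, 0) - expected) ** 2 / expected
    return stat


reads = kmer_counts(["ACGTACGT"], 2)
genome = kmer_counts(["ACGTACGT", "ACGTACGT"], 2)
print(chi_square_statistic(reads, genome))  # 0.0: identical dimer proportions
```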

  14. Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum.

    Science.gov (United States)

    Grativol, Clícia; Regulski, Michael; Bertalan, Marcelo; McCombie, W Richard; da Silva, Felipe Rodrigues; Zerlotini Neto, Adhemar; Vicentini, Renato; Farinelli, Laurent; Hemerly, Adriana Silva; Martienssen, Robert A; Ferreira, Paulo Cavalcanti Gomes

    2014-07-01

    Many economically important crops have large and complex genomes that hamper their sequencing by standard methods such as whole genome shotgun (WGS). Large tracts of methylated repeats occur in plant genomes that are interspersed by hypomethylated gene-rich regions. Gene-enrichment strategies based on methylation profiles offer an alternative to sequencing repetitive genomes. Here, we have applied methyl filtration with McrBC endonuclease digestion to enrich for euchromatic regions in the sugarcane genome. To verify the efficiency of methylation filtration and the assembly quality of sequences submitted to gene-enrichment strategy, we have compared assemblies using methyl-filtered (MF) and unfiltered (UF) libraries. The use of methyl filtration allowed a better assembly by filtering out 35% of the sugarcane genome and by producing 1.5× more scaffolds and 1.7× more assembled Mb in length compared with the unfiltered dataset. The coverage of sorghum coding sequences (CDS) by MF scaffolds was at least 36% higher than by the use of UF scaffolds. Using MF technology, we increased by 134× the coverage of gene regions of the monoploid sugarcane genome. The MF reads assembled into scaffolds that covered all genes of the sugarcane bacterial artificial chromosomes (BACs), 97.2% of sugarcane expressed sequence tags (ESTs), 92.7% of sugarcane RNA-seq reads and 98.4% of sorghum protein sequences. Analysis of MF scaffolds from encoded enzymes of the sucrose/starch pathway discovered 291 single-nucleotide polymorphisms (SNPs) in the wild sugarcane species, S. spontaneum and S. officinarum. A large number of microRNA genes was also identified in the MF scaffolds. The information achieved by the MF dataset provides a valuable tool for genomic research in the genus Saccharum and for improvement of sugarcane as a biofuel crop. PMID:24773339

  15. The complete plastid genome sequence of Abies koreana (Pinaceae: Abietoideae).

    Science.gov (United States)

    Yi, Dong-Keun; Yang, Jong Cheol; So, Soonku; Joo, Minjung; Kim, Dong-Kap; Shin, Chang Ho; Lee, You-Mi; Choi, Kyung

    2016-07-01

    The chloroplast genome of Abies koreana is the first complete genome sequence from the genus Abies of the family Pinaceae. The circular double-stranded DNA, which consists of 121,373 base pairs (bp), contains a pair of very short inverted repeat regions (IRa and IRb) of 264 bp each, which are separated by small and large single-copy regions (SSC and LSC) of 54,197 and 66,648 bp, respectively. The genome contents of 114 genes (68 peptide-encoding genes, 35 tRNA genes, four rRNA genes, six open reading frames and one pseudogene) are similar to the chloroplast DNA of other species of Abietoideae. Loss of ndh genes was also identified in the genome of A. koreana, as in other genomes in the family Pinaceae. Thirteen genes contain one (11 genes) or two (rps12 and ycf3 genes) introns. In phylogenetic analysis, the tree confirms that Abies, Keteleeria and Cedrus are strongly supported as monophyletic. Other inverted repeat sequences located in 42-kb inversion points (1186 bp) include trnS-psaM-ycf12-ψtrnG genes. PMID:25812052
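
    The region lengths quoted in this abstract can be cross-checked against the stated total, because a circular plastid genome of this type is made up of one LSC, one SSC and two inverted-repeat copies. The following is a minimal sketch of that arithmetic; the function name is hypothetical and the inputs are the figures given in the abstract.

```python
def plastid_total_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length (bp) of a plastid genome: LSC + SSC + two inverted-repeat copies."""
    return lsc + ssc + 2 * ir

# Region lengths from the Abies koreana abstract, in bp.
total = plastid_total_length(lsc=66_648, ssc=54_197, ir=264)
print(total)  # 121373, matching the reported genome size
```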

  16. [Genome sequencing and personalized medicine: perspectives and limitations].

    Science.gov (United States)

    Le Gall, Jean-Yves; Debré, Patrice

    2014-01-01

    DNA sequencing technologies have advanced at an exponential rate in recent years: the first human genome was sequenced in 2001 after many years of effort by dozens of international laboratories at a cost of tens of millions of dollars, while in 2013 a genome can be sequenced within 24 hours for a few hundred dollars (exome sequencing takes only a few hours). More and more hospital laboratories are acquiring new high-throughput sequencing devices ("next-generation sequencers", NGS), allowing them to analyze tens or hundreds of genes, or even the entire exome. This is having a major impact on medical concepts and practices, especially with respect to genetics and oncology. This ability to search for mutations simultaneously in a large number of genes is finding applications in the diagnosis of Mendelian diseases (including at birth), routine screening for heterozygotes, and pre-conception diagnosis. NGS is now sufficiently sensitive to analyze circulating fetal DNA in maternal blood (cell-free fetal DNA, cffDNA), enabling applications such as non-invasive diagnosis of fetal sex (and X-linked diseases), fetal rhesus among rhesus-negative women, trisomy and, in the near future, Mendelian mutations. Data on multifactorial diseases are still preliminary, but it should soon be possible to identify "strong" factors of genetic predisposition that have so far been beyond the scope of genome-wide association studies (GWAS). In the field of constitutional oncogenetics, NGS can also be used for simultaneous analysis of genes involved in "hereditary" cancers (21 breast cancer genes, 6 colon cancer genes, etc.). More generally, NGS can identify all genomic abnormalities (deletions, translocations, mutations) in a given malignant tissue (hemopathy or solid tumor), and has the potential to distinguish important mutations (those that drive tumor progression) from "bystander" or accessory mutations, and also to identify "druggable" mutations amenable to targeted therapies

  17. Draft Genome Sequence of Rice Isolate Pseudomonas chlororaphis EA105

    OpenAIRE

    McCully, Lucy M.; Bitzer, Adam S.; Spence, Carla A.; Bais, Harsh P.; Silby, Mark W.

    2014-01-01

    Pseudomonas chlororaphis EA105, a strain isolated from rice rhizosphere, has shown antagonistic activities against a rice fungal pathogen, and could be important in defense against rice blast. We report the draft genome sequence of EA105, which has an estimated size of 6.6 Mb.

  18. Complete Genome Sequence of the Haloalkaliphilic, Hydrogen Producing Halanaerobium hydrogenoformans

    Energy Technology Data Exchange (ETDEWEB)

    Brown, Steven D [ORNL; Begemann, Matthew B [University of Wisconsin, Madison; Mormile, Dr. Melanie R. [Missouri University of Science and Technology; Wall, Judy D. [University of Missouri; Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Samual [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Elias, Dwayne A [ORNL

    2011-01-01

    Halanaerobium hydrogenoformans is an alkaliphilic bacterium capable of biohydrogen production at pH 11 and 7% (w/v) salt. We present the 2.6 Mb genome sequence to provide insights into its physiology and potential for bioenergy applications.

  19. Genome Sequence of the Yeast Cyberlindnera fabianii (Hansenula fabianii)

    OpenAIRE

    Freel, Kelle C.; Sarilar, Veronique; Neuvéglise, Cécile; Devillers, Hugo; Friedrich, Anne; Schacherer, Joseph

    2014-01-01

    The yeast Cyberlindnera fabianii is used in wastewater treatment and in the fermentation of alcoholic beverages, and has caused blood infections. To assist in the accurate identification of this species, and to determine the genetic basis for properties involved in fermentation and water treatment, we sequenced and annotated the genome of C. fabianii (YJS4271).

  20. Complete Genome Sequence of Haemophilus parasuis SH0165

    OpenAIRE

    Yue, Min; Yang, Fan; Yang, Jian; Bei, Weicheng; Cai, Xuwang; Chen, Lihong; Dong, Jie; Zhou, Rui; Jin, Meilin; Jin, Qi; Chen, Huanchun

    2008-01-01

    Haemophilus parasuis is the causative agent of Glässer's disease, which causes heavy losses in swine populations worldwide. H. parasuis SH0165, belonging to the dominant serovar 5 in China, is a clinically isolated strain with high-level virulence. Here, we report the first completed genome sequence of this species.

  1. Complete Genome Sequence of a Novel Human Betapapillomavirus, HPV-159

    OpenAIRE

    Kocjan, Boštjan J.; Hošnjak, Lea; Seme, Katja; Poljak, Mario

    2013-01-01

    A novel human papillomavirus (HPV), now officially recognized as HPV-159, isolated from an anal swab, was fully cloned, sequenced, and genetically characterized. HPV-159 has a genomic organization that is typical of cutaneotrophic HPV types, and it belongs to the genus Betapapillomavirus.

  2. Complete Genome Sequence of Beijerinckia indica subsp. indica

    OpenAIRE

    Tamas, Ivica; Dedysh, Svetlana N.; Liesack, Werner; Stott, Matthew B.; Alam, Maqsudul; Murrell, J. Colin; Dunfield, Peter F.

    2010-01-01

    Beijerinckia indica subsp. indica is an aerobic, acidophilic, exopolysaccharide-producing, N2-fixing soil bacterium. It is a generalist chemoorganotroph that is phylogenetically closely related to facultative and obligate methanotrophs of the genera Methylocella and Methylocapsa. Here we report the full genome sequence of this bacterium.

  3. The mitochondrial genome sequence of the Tasmanian tiger (Thylacinus cynocephalus)

    DEFF Research Database (Denmark)

    Miller, Webb; Drautz, Daniela I; Janecka, Jan E; Lesk, Arthur M; Ratan, Aakrosh; Tomsho, Lynn P; Packard, Mike; Zhang, Yeting; McClellan, Lindsay R; Qi, Ji; Zhao, Fangqing; Gilbert, M Thomas P; Dalén, Love; Arsuaga, Juan Luis; Ericson, Per G P; Huson, Daniel H; Helgen, Kristofer M; Murphy, William J; Götherström, Anders; Schuster, Stephan C

    2009-01-01

    We report the first two complete mitochondrial genome sequences of the thylacine (Thylacinus cynocephalus), or so-called Tasmanian tiger, extinct since 1936. The thylacine's phylogenetic position within australidelphian marsupials has long been debated, and here we provide strong support for the...

  4. Complete Genome Sequence of Robiginitalea biformata HTCC2501

    OpenAIRE

    Oh, Hyun-Myung; Giovannoni, Stephen J.; Lee, Kiyoung; Ferriera, Steve; Johnson, Justin; Cho, Jang-Cheon

    2009-01-01

    Robiginitalea biformata HTCC2501, isolated from the Sargasso Sea by dilution-to-extinction culturing, has been known as an aerobic chemoheterotroph with carotenoid pigments and dimorphic growth phases. Here, we announce the complete sequence of the R. biformata HTCC2501 genome, which contains genes for carotenoid biosynthesis and several macromolecule-degrading enzymes.

  5. Genome sequence of the human pathogen Vibrio cholerae Amazonia.

    NARCIS (Netherlands)

    Thompson, C.C.; Marin, M.A.; Dias, G.M.; Dutilh, B.E.; Edwards, R.A.; Iida, T.; Thompson, F.L.; Vicente, A.C.

    2011-01-01

    Vibrio cholerae O1 Amazonia is a pathogen that was isolated from cholera-like diarrhea cases in at least two countries, Brazil and Ghana. Based on multilocus sequence analysis, this lineage belongs to a distinct profile compared to strains from El Tor and classical biotypes. The genomic analysis rev

  6. Genome Sequence of the Tick-Borne Pathogen Rickettsia raoultii.

    Science.gov (United States)

    El Karkouri, Khalid; Mediannikov, Oleg; Robert, Catherine; Raoult, Didier; Fournier, Pierre-Edouards

    2016-01-01

    Rickettsia raoultii is a tick-associated spotted fever group (SFG) organism, causing scalp eschar and neck lymphadenopathy after tick bite (SENLAT) in humans. We report here the genome sequence of R. raoultii strain Khabarovsk(T) (CSUR R3(T), ATCC VR-1596(T)), which was isolated from a Dermacentor silvarum tick collected in Russia. PMID:27103706

  7. Genome Sequence of the Tick-Borne Pathogen Rickettsia raoultii

    OpenAIRE

    El Karkouri, Khalid; Mediannikov, Oleg; Robert, Catherine; Raoult, Didier; Fournier, Pierre-Edouards

    2016-01-01

    Rickettsia raoultii is a tick-associated spotted fever group (SFG) organism, causing scalp eschar and neck lymphadenopathy after tick bite (SENLAT) in humans. We report here the genome sequence of R. raoultii strain KhabarovskT (CSUR R3T, ATCC VR-1596T), which was isolated from a Dermacentor silvarum tick collected in Russia.

  8. Genome Sequence of the Paleopolyploid Soybean (Glycine max (L.) Merr.)

    Science.gov (United States)

    We report the genome sequence for soybean (Glycine max var. Williams 82), one of the most important crop plants worldwide because of its ability to produce both protein and oil. Soybean is a recently domesticated legume that plays a vital role in crop rotation as it fixes atmospheric nitrogen via s...

  9. Draft Genome Sequence of Bacillus subtilis strain KATMIRA1933

    OpenAIRE

    Karlyshev, Andrey V.; Melnikov, Vyacheslav G.; Chikindas, Michael L.

    2014-01-01

    In this report, we present a draft sequence of Bacillus subtilis KATMIRA1933. Previous studies demonstrated probiotic properties of this strain partially attributed to production of an antibacterial compound, subtilosin. Comparative analysis of this strain’s genome with that of a commercial probiotic strain, B. subtilis Natto, is presented.

  10. Complete Genome Sequences of Four Isolates of Plutella xylostella Granulovirus

    Science.gov (United States)

    2016-01-01

    Granuloviruses are widespread pathogens of Plutella xylostella L. (diamondback moth) and potential biopesticides for control of this global insect pest. We report the complete genomes of four Plutella xylostella granulovirus isolates from China, Malaysia, and Taiwan exhibiting pairs of noncoding, homologous repeat regions with significant sequence variation but equivalent length. PMID:27365355

  11. Draft Genome Sequence of Streptococcus agalactiae PR06

    OpenAIRE

    MZ, Irma Syakina; L. K. Teh; Salleh, M. Z.

    2013-01-01

    Streptococcus agalactiae (group B streptococcus [GBS]) is a Gram-positive bacterium that was first recognized as a causative agent of bovine mastitis. S. agalactiae has subsequently emerged as a significant cause of human diseases. Here, we report the draft genome sequence of S. agalactiae PR06, which was isolated from a septicemic patient in a local hospital in Malaysia.

  12. Draft Genome Sequence of Pectobacterium wasabiae Strain CFIA1002.

    Science.gov (United States)

    Yuan, Kat Xiaoli; Adam, Zaky; Tambong, James; Lévesque, C André; Chen, Wen; Lewis, Christopher T; De Boer, Solke H; Li, Xiang Sean

    2014-01-01

    Pectobacterium wasabiae, originally causing soft rot disease in horseradish in Japan, was recently found to cause blackleg-like symptoms on potato in the United States, Canada, and Europe. A draft genome sequence of a Canadian potato isolate of P. wasabiae CFIA1002 will enhance the characterization of its pathogenicity and host specificity features. PMID:24831134

  13. Draft Genome Sequence of Pectobacterium wasabiae Strain CFIA1002

    OpenAIRE

    Yuan, Kat (Xiaoli); Adam, Zaky; Tambong, James; Lévesque, C. André; Chen, Wen; Lewis, Christopher T.; De Boer, Solke H.; LI, XIANG

    2014-01-01

    Pectobacterium wasabiae, originally causing soft rot disease in horseradish in Japan, was recently found to cause blackleg-like symptoms on potato in the United States, Canada, and Europe. A draft genome sequence of a Canadian potato isolate of P. wasabiae CFIA1002 will enhance the characterization of its pathogenicity and host specificity features.

  14. Draft Genome Sequences of the Turfgrass Pathogen Sclerotinia homoeocarpa.

    Science.gov (United States)

    Green, Robert; Sang, Hyunkyu; Chang, Taehyun; Allan-Perkins, Elisha; Petit, Elsa; Jung, Geunhwa

    2016-01-01

    Sclerotinia homoeocarpa (F. T. Bennett) is one of the most economically important pathogens on high-amenity cool-season turfgrasses, where it causes dollar spot. To understand the genetic mechanisms of fungicide resistance, which has become highly prevalent, the whole genomes of two isolates with varied resistance levels to fungicides were sequenced. PMID:26868400

  15. Complete Genome Sequence of Bacillus thuringiensis Bacteriophage BMBtp2

    OpenAIRE

    Dong, Zhaoxia; Peng, Donghai; Wang, Yueying; Zhu, Lei; Ruan, Lifang; Sun, Ming

    2013-01-01

    Bacillus thuringiensis is an insect pathogen which has been widely used for biocontrol. During B. thuringiensis fermentation, lysogenic bacteriophages cause severe losses of yield. Here, we announce the complete genome sequence of a bacteriophage, BMBtp2, induced from B. thuringiensis strain YBT-1765. The sequence may help to clarify the mechanism involved in bacteriophage contamination.

  16. Draft Genome Sequence of Halomonas smyrnensis AAD6T

    OpenAIRE

    Sogutcu, Elif; Emrence, Zeliha; Arikan, Muzzaffer; Cakiris, Aris; Abaci, Neslihan; Öner, Ebru Toksoy; Üstek, Duran; Arga, Kazim Yalcin

    2012-01-01

    Halomonas smyrnensis AAD6T is a Gram-negative, aerobic, exopolysaccharide-producing, and moderately halophilic bacterium that produces levan, a fructose homopolymer with many potential uses in various industries. We report the draft genome sequence of H. smyrnensis AAD6T, which will accelerate research on the rational design and optimization of microbial levan production.

  17. Genome Sequence of Ureaplasma diversum Strain ATCC 49782.

    Science.gov (United States)

    Marques, Lucas M; Guimarães, Ana M S; Martins, Hellen B; Rezende, Izadora S; Barbosa, Maysa S; Campos, Guilherme B; do Nascimento, Naíla C; Dos Santos, Andrea P; Amorim, Aline T; Santos, Verena M; Messick, Joanne B; Timenetsky, Jorge

    2015-01-01

    Here, we report the complete genome sequence of Ureaplasma diversum strain ATCC 49782. This species is of bovine origin, having an association with reproductive disorders in cattle, including placentitis, fetal alveolitis, abortion, and birth of weak calves. It has a small circular chromosome of 975,425 bp. PMID:25883297

  18. Genome Sequence of Klebsiella pneumoniae Urinary Tract Isolate Top52

    OpenAIRE

    Johnson, Jeremiah G.; Spurbeck, Rachel R.; Sandhu, Sukhinder K.; Matson, Jyl S.

    2014-01-01

    Klebsiella pneumoniae is a significant cause of nosocomial infections, including ventilator-associated pneumonias and catheter-associated urinary tract infections. K. pneumoniae strain TOP52 #1721 (Top52) was isolated from a woman presenting with acute cystitis and subsequently characterized using various murine models of infection. Here we present the genome sequence of K. pneumoniae Top52.

  19. Complete Genome Sequence of Mycobacterium vaccae Type Strain ATCC 25954

    KAUST Repository

    Ho, Y. S.

    2012-10-26

    Mycobacterium vaccae is a rapidly growing, nontuberculous Mycobacterium species that is generally not considered a human pathogen and is of major pharmaceutical interest as an immunotherapeutic agent. We report here the annotated genome sequence of the M. vaccae type strain, ATCC 25954.

  20. Complete Genome Sequence of Biocontrol Strain Pseudomonas fluorescens LBUM 223

    OpenAIRE

    Roquigny, Roxane; Arseneault, Tanya; Gadkar, Vijay J.; Novinscak, Amy; Joly, David L.; Filion, Martin

    2015-01-01

    Pseudomonas fluorescens LBUM 223 is a plant growth-promoting rhizobacterium (PGPR) with biocontrol activity against various plant pathogens. It produces the antimicrobial metabolite phenazine-1-carboxylic acid, which is involved in the biocontrol of Streptomyces scabies, the causal agent of common scab of potato. Here, we report the complete genome sequence of P. fluorescens LBUM 223.

  1. Whole Genome Sequences of Four Brucella Strains

    OpenAIRE

    Ding, Jiabo; Pan, Yuanlong; Jiang, Hai; Cheng, Junsheng; Liu, Taotao; Qin, Nan; Yi YANG; Cui, Buyun; Chen, Chen; Liu, Cuihua; Mao, Kairong; Zhu, Baoli

    2011-01-01

    Brucella melitensis and Brucella suis are intracellular pathogens of livestock and humans. Here we report four genome sequences, those of the virulent strain B. melitensis M28-12 and vaccine strains B. melitensis M5 and M111 and B. suis S2, which differ in virulence and pathogenicity. These sequences will help in the design of a more effective brucellosis vaccine.

  2. Genome Sequence of Klebsiella pneumoniae Respiratory Isolate IA565

    OpenAIRE

    Johnson, Jeremiah G.; Spurbeck, Rachel R.; Sandhu, Sukhinder K.; Matson, Jyl S

    2014-01-01

    Klebsiella pneumoniae is a clinically significant opportunistic bacterial pathogen as well as a normal member of the human microbiota. K. pneumoniae strain IA565 was isolated from a tracheal aspirate at the University of Iowa Hospitals and Clinics. Here, we present the genome sequence of K. pneumoniae IA565.

  3. Complete Genome Sequence of Anaplasma marginale subsp. centrale

    Science.gov (United States)

    Anaplasma marginale subsp. centrale is a naturally attenuated subtype that has been used as a vaccine for a century. We sequenced the genome of this organism and compared it to those of virulent sensu stricto A. marginale strains. The comparison markedly narrows the number of outer membrane protein ...

  4. Complete Genome Sequences of Four Isolates of Plutella xylostella Granulovirus.

    Science.gov (United States)

    Spence, Robert J; Noune, Christopher; Hauxwell, Caroline

    2016-01-01

    Granuloviruses are widespread pathogens of Plutella xylostella L. (diamondback moth) and potential biopesticides for control of this global insect pest. We report the complete genomes of four Plutella xylostella granulovirus isolates from China, Malaysia, and Taiwan exhibiting pairs of noncoding, homologous repeat regions with significant sequence variation but equivalent length. PMID:27365355

  5. Genome Sequences of Six Paenibacillus larvae Siphoviridae Phages.

    Science.gov (United States)

    Carson, Susan; Bruff, Emily; DeFoor, William; Dums, Jacob; Groth, Adam; Hatfield, Taylor; Iyer, Aruna; Joshi, Kalyani; McAdams, Sarah; Miles, Devon; Miller, Delanie; Oufkir, Abdoullah; Raynor, Brinkley; Riley, Sara; Roland, Shelby; Rozier, Horace; Talley, Sarah; Miller, Eric S

    2015-01-01

    Six sequenced and annotated genomes of Paenibacillus larvae phages isolated from the combs of American foulbrood-diseased beehives are 37 to 45 kbp and have approximately 42% G+C content and 60 to 74 protein-coding genes. Phage Lily is most divergent from Diva, Rani, Redbud, Shelly, and Sitara. PMID:26089405

  6. Draft Genome Sequences of Four Plant Probiotic Bacillus Strains.

    Science.gov (United States)

    Jeong, Haeyoung; Park, Seung-Hwan; Choi, Soo-Keun

    2016-01-01

    Here, we report the whole-genome sequences of four Bacillus strains that exhibit plant probiotic activities. Three of them are the type strains of Bacillus endophyticus, "Bacillus gaemokensis," and Bacillus trypoxylicola, and the other, Bacillus sp. strain KCTC 13219, should be reclassified into a species belonging to the genus Lysinibacillus. PMID:27174273

  7. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    Energy Technology Data Exchange (ETDEWEB)

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. 
A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  8. Complete Genome Sequence of Actinobaculum schaalii Strain CCUG 27420

    DEFF Research Database (Denmark)

    Kristiansen, Rikke; Dueholm, Morten S; Bank, Steffen;

    2014-01-01

    Complete genome sequencing of the emerging uropathogen Actinobaculum schaalii indicates that an important mechanism of its virulence is attachment pili, which allow the organism to adhere to the surface of animal cells, greatly enhancing the ability of this organism to colonize the urinary tract....

  9. Characterization of reniform nematode genome through shotgun sequencing

    Science.gov (United States)

    The reniform nematode (RN), a major agricultural pest particularly on cotton in the United States (U.S.), is among the major plant parasitic nematodes for which limited genomic information exists. In this study, over 380 Mb of sequence data were generated from four pooled adult female RN and assembl...

  10. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    Full Text Available The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  11. The complete plastid genome sequence of Picea jezoensis (Pinaceae: Piceoideae).

    Science.gov (United States)

    Yang, Jong Cheol; Joo, Minjung; So, Soonku; Yi, Dong-Keun; Shin, Chang Ho; Lee, You-Mi; Choi, Kyung

    2016-09-01

    The complete chloroplast genome of P. jezoensis was sequenced. The total genome size was 124 146 bp, containing a pair of very short inverted repeats (IRa and IRb) of 422 bp, which were separated by large single copy (LSC) and small single copy (SSC) regions of 66 956 bp and 56 346 bp, respectively. The overall GC content of the plastid genome was determined as 38.8%. One hundred fifteen genes including 68 peptide-encoding genes, 35 tRNA genes, four rRNA genes, six open-reading frames, and two pseudogenes were annotated. Of these, 15 genes contained one or two introns. Phylogenetic analyses using maximum likelihood (ML) methods were performed on a dataset of 69 protein-coding genes from fully sequenced gymnosperms and other species. PMID:26332576

  12. Complete genome sequence of the fish pathogen Flavobacterium psychrophilum.

    Science.gov (United States)

    Duchaud, Eric; Boussaha, Mekki; Loux, Valentin; Bernardet, Jean-François; Michel, Christian; Kerouault, Brigitte; Mondot, Stanislas; Nicolas, Pierre; Bossy, Robert; Caron, Christophe; Bessières, Philippe; Gibrat, Jean-François; Claverol, Stéphane; Dumetz, Fabien; Le Hénaff, Michel; Benmansour, Abdenour

    2007-07-01

    We report here the complete genome sequence of the virulent strain JIP02/86 (ATCC 49511) of Flavobacterium psychrophilum, a widely distributed pathogen of wild and cultured salmonid fish. The genome consists of a 2,861,988-base pair (bp) circular chromosome with 2,432 predicted protein-coding genes. Among these predicted proteins, stress response mediators, gliding motility proteins, adhesins and many putative secreted proteases are probably involved in colonization, invasion and destruction of the host tissues. The genome sequence provides the basis for explaining the relationships of the pathogen to the host and opens new perspectives for the development of more efficient disease control strategies. It also allows for a better understanding of the physiology and evolution of a significant representative of the family Flavobacteriaceae, whose members are associated with an interesting diversity of lifestyles and habitats. PMID:17592475

  13. Draft genome sequence of the Algerian bee Apis mellifera intermissa.

    Science.gov (United States)

    Haddad, Nizar Jamal; Loucif-Ayad, Wahida; Adjlane, Noureddine; Saini, Deepti; Manchiganti, Rushiraj; Krishnamurthy, Venkatesh; AlShagoor, Banan; Batainh, Ahmed Mahmud; Mugasimangalam, Raja

    2015-06-01

    Apis mellifera intermissa is the native honeybee subspecies of Algeria. A. m. intermissa occurs in Tunisia, Algeria and Morocco, between the Atlas and the Mediterranean and Atlantic coasts. This bee is very important due to its high ability to adapt to great variations in climatic conditions and due to its preferable cleaning behavior. Here we report the draft genome sequence of this honey bee; its Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JSUV00000000. The 240-Mb genome is being annotated and analyzed. Comparison with the genome of other Apis mellifera sub-species promises to yield insights into the evolution of adaptations to high temperature and resistance to Varroa parasite infestation. PMID:26484171

  14. Draft genome sequence of the Algerian bee Apis mellifera intermissa

    Directory of Open Access Journals (Sweden)

    Nizar Jamal Haddad

    2015-06-01

    Full Text Available Apis mellifera intermissa is the native honeybee subspecies of Algeria. A. m. intermissa occurs in Tunisia, Algeria and Morocco, between the Atlas and the Mediterranean and Atlantic coasts. This bee is very important due to its high ability to adapt to great variations in climatic conditions and due to its preferable cleaning behavior. Here we report the draft genome sequence of this honey bee; its Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JSUV00000000. The 240-Mb genome is being annotated and analyzed. Comparison with the genome of other Apis mellifera sub-species promises to yield insights into the evolution of adaptations to high temperature and resistance to Varroa parasite infestation.

  15. Contrasting DNA sequence organisation patterns in sauropsidian genomes.

    Science.gov (United States)

    Epplen, J T; Diedrich, U; Wagenmann, M; Schmidtke, J; Engel, W

    1979-11-01

    The genomic DNA organisation patterns of four sauropsidian species, namely Python reticularis, Caiman crocodilus, Terrapene carolina triungius and Columba livia domestica were investigated by reassociation of short and long DNA fragments, by hyperchromicity measurements of reannealed fragments and by length estimations of S1-nuclease resistant repetitive duplexes. While the genomic DNA of the three reptilian species shows a short period interspersion pattern, the genome of the avian species is organised in a long period interspersion pattern apparently typical for birds. These findings are discussed in view of the close phylogenetic relationships of birds and reptiles, and also with regard to a possible relationship between the extent of sequence interspersion and genome size. PMID:533670

  16. Quantifying Next Generation Sequencing Sample Pre-Processing Bias in HIV-1 Complete Genome Sequencing

    Directory of Open Access Journals (Sweden)

    Bram Vrancken

    2016-01-01

    Full Text Available Genetic analyses play a central role in infectious disease research. Massively parallelized “mechanical cloning” and sequencing technologies were quickly adopted by HIV researchers in order to broaden the understanding of the clinical importance of minor drug-resistant variants. These efforts have, however, remained largely limited to small genomic regions. The growing need to monitor multiple genome regions for drug resistance testing, as well as the obvious benefit for studying evolutionary and epidemic processes, makes complete genome sequencing an important goal in viral research. In addition, a major drawback for NGS applications to RNA viruses is the need for large quantities of input DNA. Here, we use a generic overlapping amplicon-based near full-genome amplification protocol to compare low-input enzymatic fragmentation (Nextera™) with conventional mechanical shearing for Roche 454 sequencing. We find that the fragmentation method has only a modest impact on the characterization of the population composition and that for reliable results, the variation introduced at all steps of the procedure, from nucleic acid extraction to sequencing, should be taken into account, a finding that is also relevant for NGS technologies that are now more commonly used. Furthermore, by applying our protocol to deep sequence a number of pre-therapy plasma and PBMC samples, we illustrate the potential benefits of a near complete genome sequencing approach in routine genotyping.

  17. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic diversity
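
    The read-frequency cut-off described in this abstract can be illustrated with a minimal sketch. It assumes each variant call carries a count of reference-supporting and variant-supporting reads; the `VariantCall` class, the positions, and the read counts are hypothetical and are not taken from the study or its in-house pipeline. Only the 30 % threshold comes from the abstract.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """Hypothetical record: reads supporting each allele at one genome position."""
    position: int
    ref_reads: int
    alt_reads: int

    @property
    def alt_frequency(self) -> float:
        # Fraction of reads that support the variant allele.
        total = self.ref_reads + self.alt_reads
        return self.alt_reads / total if total else 0.0


def confident_variants(calls, min_frequency=0.30):
    """Keep calls whose variant read frequency reaches the cut-off (30 % in the abstract)."""
    return [call for call in calls if call.alt_frequency >= min_frequency]


# Illustrative data only; positions and counts are invented.
calls = [
    VariantCall(position=100, ref_reads=90, alt_reads=10),   # 10 %: dropped
    VariantCall(position=200, ref_reads=60, alt_reads=40),   # 40 %: kept
    VariantCall(position=300, ref_reads=0, alt_reads=100),   # 100 %: kept
]
kept = confident_variants(calls)
print([call.position for call in kept])
```

    A real pipeline would also consider base quality, strand bias, and sequencing depth, none of which are modelled here.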

  18. MetaSim: a sequencing simulator for genomics and metagenomics.

    Directory of Open Access Journals (Sweden)

    Daniel C Richter

    Full Text Available BACKGROUND: The new research field of metagenomics is providing exciting insights into various, previously unclassified ecological systems. Next-generation sequencing technologies are producing a rapid increase of environmental data in public databases. There is great need for specialized software solutions and statistical methods for dealing with complex metagenome data sets. METHODOLOGY/PRINCIPAL FINDINGS: To facilitate the development and improvement of metagenomic tools and the planning of metagenomic projects, we introduce a sequencing simulator called MetaSim. Our software can be used to generate collections of synthetic reads that reflect the diverse taxonomical composition of typical metagenome data sets. Based on a database of given genomes, the program allows the user to design a metagenome by specifying the number of genomes present at different levels of the NCBI taxonomy, and then to collect reads from the metagenome using a simulation of a number of different sequencing technologies. A population sampler optionally produces evolved sequences based on source genomes and a given evolutionary tree. CONCLUSIONS/SIGNIFICANCE: MetaSim allows the user to simulate individual read datasets that can be used as standardized test scenarios for planning sequencing projects or for benchmarking metagenomic software.
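
    The idea of drawing synthetic reads from a designed metagenome can be sketched as follows. This is not MetaSim's implementation: the function name `simulate_reads`, the toy genomes, and the abundance weights are assumptions made for illustration, read start positions are drawn uniformly, and no sequencing-error model or evolutionary sampler is included.

```python
import random


def simulate_reads(genomes, abundances, n_reads, read_length, seed=0):
    """Sample fixed-length, error-free reads from genomes in proportion to their abundance."""
    rng = random.Random(seed)
    names = list(genomes)
    for name in names:
        if len(genomes[name]) < read_length:
            raise ValueError(f"genome {name!r} is shorter than the read length")
    weights = [abundances[name] for name in names]
    reads = []
    for _ in range(n_reads):
        # Pick a source genome by relative abundance, then a uniform start position.
        name = rng.choices(names, weights=weights, k=1)[0]
        sequence = genomes[name]
        start = rng.randrange(len(sequence) - read_length + 1)
        reads.append((name, sequence[start:start + read_length]))
    return reads


# Toy metagenome: two invented taxa at a 3:1 abundance ratio.
genomes = {"taxon_a": "ACGT" * 10, "taxon_b": "GGCC" * 10}
abundances = {"taxon_a": 0.75, "taxon_b": 0.25}
reads = simulate_reads(genomes, abundances, n_reads=100, read_length=12)
print(len(reads), reads[0])
```

    Because every read records its source taxon, a set generated this way can serve as a labelled test case when checking a classifier or binning tool.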

  19. Sequencing Crop Genomes: A Gateway to Improve Tropical Agriculture.

    Science.gov (United States)

    Thottathil, Gincy Paily; Jayasekaran, Kandakumar; Othman, Ahmad Sofiman

    2016-02-01

    Agricultural development in the tropics lags behind development in the temperate latitudes due to the lack of advanced technology, and various biotic and abiotic factors. To cope with the increasing demand for food and other plant-based products, improved crop varieties have to be developed. To breed improved varieties, a better understanding of crop genetics is necessary. With the advent of next-generation DNA sequencing technologies, many important crop genomes have been sequenced. Primary importance has been given to food crops, including cereals, tuber crops, vegetables, and fruits. The DNA sequence information is extremely valuable for identifying key genes controlling important agronomic traits and for identifying genetic variability among the cultivars. However, massive DNA re-sequencing and gene expression studies have to be performed to substantially improve our understanding of crop genetics. Application of the knowledge obtained from the genomes, transcriptomes, expression studies, and epigenetic studies would enable the development of improved varieties and may lead to a second green revolution. The applications of next generation DNA sequencing technologies in crop improvement, its limitations, future prospects, and the features of important crop genome projects are reviewed herein. PMID:27019684

  20. Secure distributed genome analysis for GWAS and sequence comparison computation

    Science.gov (United States)

    2015-01-01

    Background The rapid increase in the availability and volume of genomic data makes significant advances in biomedical research possible, but sharing of genomic data poses challenges due to the highly sensitive nature of such data. To address the challenges, a competition for secure distributed processing of genomic data was organized by the iDASH research center. Methods In this work we propose techniques for securing computation with real-life genomic data for minor allele frequency and chi-squared statistics computation, as well as distance computation between two genomic sequences, as specified by the iDASH competition tasks. We put forward novel optimizations, including a generalization of a version of mergesort, which might be of independent interest. Results We provide implementation results of our techniques based on secret sharing that demonstrate practicality of the suggested protocols and also report on performance improvements due to our optimization techniques. Conclusions This work describes our techniques, findings, and experimental results developed and obtained as part of iDASH 2015 research competition to secure real-life genomic computations and shows feasibility of securely computing with genomic data in practice. PMID:26733307
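
    For orientation, the two statistics named in this abstract can be written in plain, non-secure form. The sketch below assumes genotypes are encoded as the number of alternate alleles per individual (0, 1, or 2) and uses the standard 2x2 allelic chi-squared test; it does not reproduce the secret-sharing protocols of the paper, and the function names and data are hypothetical.

```python
def allele_counts(genotypes):
    """Return (reference, alternate) allele counts for diploid genotypes coded 0/1/2."""
    alt = sum(genotypes)
    ref = 2 * len(genotypes) - alt
    return ref, alt


def minor_allele_frequency(genotypes):
    """Frequency of the less common allele across all individuals."""
    ref, alt = allele_counts(genotypes)
    return min(ref, alt) / (ref + alt)


def chi_squared_allelic(case_genotypes, control_genotypes):
    """Pearson chi-squared statistic for the 2x2 table of alleles by case/control status."""
    a, b = allele_counts(case_genotypes)      # cases: ref, alt
    c, d = allele_counts(control_genotypes)   # controls: ref, alt
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a row or column is empty, so the test is undefined
    return n * (a * d - b * c) ** 2 / denominator


# Invented genotypes for one variant site.
cases = [2, 1, 1, 0]
controls = [0, 0, 1, 0]
maf = minor_allele_frequency(cases + controls)
chi2 = chi_squared_allelic(cases, controls)
print(maf, chi2)
```

    In a secure setting the same sums and products would be evaluated on secret-shared values rather than on plaintext counts.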

  1. Multiplexed DNA sequence capture of mitochondrial genomes using PCR products.

    Directory of Open Access Journals (Sweden)

    Tomislav Maricic

    Full Text Available BACKGROUND: To utilize the power of high-throughput sequencers, target enrichment methods have been developed. The majority of these require reagents and equipment that are only available from commercial vendors and are not suitable for the targets that are a few kilobases in length. METHODOLOGY/PRINCIPAL FINDINGS: We describe a novel and economical method in which custom made long-range PCR products are used to capture complete human mitochondrial genomes from complex DNA mixtures. We use the method to capture 46 complete mitochondrial genomes in parallel and we sequence them on a single lane of an Illumina GA(II) instrument. CONCLUSIONS/SIGNIFICANCE: This method is economical and simple and particularly suitable for targets that can be amplified by PCR and do not contain highly repetitive sequences such as mtDNA. It has applications in population genetics and forensics, as well as studies of ancient DNA.

  2. SVA: software for annotating and visualizing sequenced human genomes

    Science.gov (United States)

    Ge, Dongliang; Ruzzo, Elizabeth K.; Shianna, Kevin V.; He, Min; Pelak, Kimberly; Heinzen, Erin L.; Need, Anna C.; Cirulli, Elizabeth T.; Maia, Jessica M.; Dickson, Samuel P.; Zhu, Mingfu; Singh, Abanish; Allen, Andrew S.; Goldstein, David B.

    2011-01-01

    Summary: Here we present Sequence Variant Analyzer (SVA), a software tool that assigns a predicted biological function to variants identified in next-generation sequencing studies and provides a browser to visualize the variants in their genomic contexts. SVA also provides for flexible interaction with software implementing variant association tests allowing users to consider both the bioinformatic annotation of identified variants and the strength of their associations with studied traits. We illustrate the annotation features of SVA using two simple examples of sequenced genomes that harbor Mendelian mutations. Availability and implementation: Freely available on the web at http://www.svaproject.org. Contact: d.ge@duke.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21624899

  3. Downsizing genomic medicine: approaching the ethical complexity of whole-genome sequencing by starting small.

    Science.gov (United States)

    Sharp, Richard R

    2011-03-01

    As we look to a time when whole-genome sequencing is integrated into patient care, it is possible to anticipate a number of ethical challenges that will need to be addressed. The most intractable of these concern informed consent and the responsible management of very large amounts of genetic information. Given the range of possible findings, it remains unclear to what extent it will be possible to obtain meaningful patient consent to genomic testing. Equally unclear is how clinicians will disseminate the enormous volume of genetic information produced by whole-genome sequencing. Toward developing practical strategies for managing these ethical challenges, we propose a research agenda that approaches multiplexed forms of clinical genetic testing as natural laboratories in which to develop best practices for managing the ethical complexities of genomic medicine. PMID:21311340

  4. Complete genome sequence of Actinosynnema mirum type strain (101T)

    Energy Technology Data Exchange (ETDEWEB)

    Land, Miriam; Lapidus, Alla; Mayilraj, Shanmugam; Chen, Feng; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Chertkov, Olga; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Rohde, Manfred; Goker, Markus; Pati, Amrita; Ivanova, Natalia; Mavrommatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia; Brettin, Thomas; Detter, John C.; Han, Cliff; Chain, Patrick; Tindall, Brian; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2009-05-20

    Actinosynnema mirum Hasegawa et al. 1978 is the type species of the genus, and is of phylogenetic interest because of its central phylogenetic location in the Actinosynnemataceae, a rapidly growing family within the actinobacterial suborder Pseudonocardineae. A. mirum is characterized by its motile spores borne on synnemata and as a producer of nocardicin antibiotics. It is capable of growing aerobically and under a moderate CO2 atmosphere. The strain is a Gram-positive, aerial and substrate mycelium producing bacterium, originally isolated from a grass blade collected from the Raritan River, New Jersey. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of a member of the family Actinosynnemataceae, and only the second sequence from the actinobacterial suborder Pseudonocardineae. The 8,248,144 bp long single replicon genome with its 7100 protein-coding and 77 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  5. Insights into hominid evolution from the gorilla genome sequence.

    Science.gov (United States)

    Scally, Aylwyn; Dutheil, Julien Y; Hillier, LaDeana W; Jordan, Gregory E; Goodhead, Ian; Herrero, Javier; Hobolth, Asger; Lappalainen, Tuuli; Mailund, Thomas; Marques-Bonet, Tomas; McCarthy, Shane; Montgomery, Stephen H; Schwalie, Petra C; Tang, Y Amy; Ward, Michelle C; Xue, Yali; Yngvadottir, Bryndis; Alkan, Can; Andersen, Lars N; Ayub, Qasim; Ball, Edward V; Beal, Kathryn; Bradley, Brenda J; Chen, Yuan; Clee, Chris M; Fitzgerald, Stephen; Graves, Tina A; Gu, Yong; Heath, Paul; Heger, Andreas; Karakoc, Emre; Kolb-Kokocinski, Anja; Laird, Gavin K; Lunter, Gerton; Meader, Stephen; Mort, Matthew; Mullikin, James C; Munch, Kasper; O'Connor, Timothy D; Phillips, Andrew D; Prado-Martinez, Javier; Rogers, Anthony S; Sajjadian, Saba; Schmidt, Dominic; Shaw, Katy; Simpson, Jared T; Stenson, Peter D; Turner, Daniel J; Vigilant, Linda; Vilella, Albert J; Whitener, Weldon; Zhu, Baoli; Cooper, David N; de Jong, Pieter; Dermitzakis, Emmanouil T; Eichler, Evan E; Flicek, Paul; Goldman, Nick; Mundy, Nicholas I; Ning, Zemin; Odom, Duncan T; Ponting, Chris P; Quail, Michael A; Ryder, Oliver A; Searle, Stephen M; Warren, Wesley C; Wilson, Richard K; Schierup, Mikkel H; Rogers, Jane; Tyler-Smith, Chris; Durbin, Richard

    2012-03-01

    Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution. PMID:22398555

  6. The impact of next-generation sequencing on genomics

    Institute of Scientific and Technical Information of China (English)

    Jun Zhang; Rod Chiodini; Ahmed Badr; Genfa Zhang

    2011-01-01

    This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. But, the massive data produced by NGS also presents a significant challenge for data storage, analyses, and management solutions. Advanced bioinformatic tools are essential for the successful application of NGS technology. As evidenced throughout this review, NGS technologies will have a striking impact on genomic research and the entire biological field. With its ability to tackle the unsolved challenges unconquered by previous genomic technologies, NGS is likely to unravel the complexity of the human genome in terms of genetic variations, some of which may be confined to susceptible loci for some common human conditions. The impact of NGS technologies on genomics will be far reaching and likely change the field for years to come.

  7. First High-Quality Draft Genome Sequence of Pasteurella multocida Sequence Type 128 Isolated from Infected Bone

    OpenAIRE

    Kavousi, Niloofar; Eng, Wilhelm Wei Han; Lee, Yin Peng; Tan, Lian Huat; Thuraisingham, Ravindran; Yule, Catherine M; Gan, Han Ming

    2016-01-01

    We report here the first high-quality draft genome sequence of Pasteurella multocida sequence type 128, which was isolated from the infected finger bone of an adult female who was bitten by a domestic dog. The draft genome will be a valuable addition to the scarce genomic resources available for P. multocida.

  8. First High-Quality Draft Genome Sequence of Pasteurella multocida Sequence Type 128 Isolated from Infected Bone.

    Science.gov (United States)

    Kavousi, Niloofar; Eng, Wilhelm Wei Han; Lee, Yin Peng; Tan, Lian Huat; Thuraisingham, Ravindran; Yule, Catherine M; Gan, Han Ming

    2016-01-01

    We report here the first high-quality draft genome sequence of Pasteurella multocida sequence type 128, which was isolated from the infected finger bone of an adult female who was bitten by a domestic dog. The draft genome will be a valuable addition to the scarce genomic resources available for P. multocida. PMID:26941132

  9. ReRep: Computational detection of repetitive sequences in genome survey sequences (GSS)

    Directory of Open Access Journals (Sweden)

    Alves-Ferreira Marcelo

    2008-09-01

    Full Text Available Abstract Background Genome survey sequences (GSS) offer a preliminary global view of a genome since, unlike ESTs, they cover coding as well as non-coding DNA and include repetitive regions of the genome. A more precise estimation of the nature, quantity and variability of repetitive sequences very early in a genome sequencing project is of considerable importance, as such data strongly influence the estimation of genome coverage, library quality and progress in scaffold construction. Also, the elimination of repetitive sequences from the initial assembly process is important to avoid errors and unnecessary complexity. Repetitive sequences are also of interest in a variety of other studies, for instance as molecular markers. Results We designed and implemented a straightforward pipeline called ReRep, which combines bioinformatics tools for identifying repetitive structures in a GSS dataset. In a case study, we first applied the pipeline to a set of 970 GSSs, sequenced in our laboratory from the human pathogen Leishmania braziliensis, the causative agent of leishmaniosis, an important public health problem in Brazil. We also verified the applicability of ReRep to new sequencing technologies using a set of 454 reads from Escherichia coli. The behaviour of several parameters in the algorithm is evaluated and suggestions are made for tuning of the analysis. Conclusion The ReRep approach for identification of repetitive elements in GSS datasets proved to be straightforward and efficient. Several potential repetitive sequences were found in a L. braziliensis GSS dataset generated in our laboratory, and further validated by the analysis of a more complete genomic dataset from the EMBL and Sanger Centre databases. ReRep also identified most of the E. coli K12 repeats prior to assembly in an example dataset obtained by automated sequencing using 454 technology. The parameters controlling the algorithm behaved consistently and may be tuned to the properties
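
    One simple way to flag candidate repeats in unassembled reads is to count k-mers and report those seen unusually often. The sketch below illustrates only that general idea; it is not the ReRep pipeline, whose component tools are not named in this abstract, and the function names, reads, and thresholds are hypothetical.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every overlapping k-mer across a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def repetitive_kmers(reads, k, min_count):
    """Return the k-mers that occur at least min_count times in the read set."""
    return {kmer for kmer, count in kmer_counts(reads, k).items() if count >= min_count}


# Toy reads; ACGT and CGTA each occur three times, all other 4-mers fewer.
reads = ["ACGTACGTAC", "TTACGTAG", "GGGCCCAT"]
print(sorted(repetitive_kmers(reads, k=4, min_count=3)))
```

    In practice the choice of k and of the count threshold depends on read length and coverage, which is the kind of parameter tuning the abstract refers to.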

  10. Whole-genome sequencing and analysis of the Malaysian cynomolgus macaque (Macaca fascicularis) genome

    OpenAIRE

    Higashino, Atsunori; Sakate, Ryuichi; Kameoka, Yosuke; Takahashi, Ichiro; Hirata, Makoto; Tanuma, Reiko; Masui, Tohru; Yasutomi, Yasuhiro; Osada, Naoki

    2012-01-01

    Background The genetic background of the cynomolgus macaque (Macaca fascicularis) is made complex by the high genetic diversity, population structure, and gene introgression from the closely related rhesus macaque (Macaca mulatta). Herein we report the whole-genome sequence of a Malaysian cynomolgus macaque male with more than 40-fold coverage, which was determined using a resequencing method based on the Indian rhesus macaque genome. Results We identified approximately 9.7 million single nuc...

  11. Genomic organization and sequence analysis of the vomeronasal receptor V2R genes in mouse genome

    Institute of Scientific and Technical Information of China (English)

    YANG Hui; Zhang YaPing

    2007-01-01

    Two multigene superfamilies, named V1R and V2R, encoding seven-transmembrane-domain G-protein coupled receptors (GPCRs) have been identified as pheromone receptors in mammals. Three V2R gene families have been described in mouse and rat. Here we screened the updated mouse genome sequence database and retrieved 63 putative functional V2R genes, including three newly identified genes that form an additional family. We describe the genomic organization of these genes and characterize the conservation of mouse V2R protein sequences. The genomic and sequence information described here provides evidence for inferring the functional domains of V2Rs and should aid future functional studies.

  12. Draft genome sequence of Acidithiobacillus ferrooxidans YQH-1

    Directory of Open Access Journals (Sweden)

    Lei Yan

    2015-12-01

    Full Text Available Acidithiobacillus ferrooxidans YQH-1 is a moderate acidophilic bacterium isolated from a river in a volcano of Northeast China. Here, we describe the draft genome of strain YQH-1, which was assembled into 123 contigs containing 3,111,222 bp with a G + C content of 58.63%. A large number of genes related to carbon dioxide fixation, dinitrogen fixation, pH tolerance, heavy metal detoxification, and oxidative stress defense were detected. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LJBT00000000.

  13. Arrangement of repetitive sequences in the genome of herpesvirus Sylvilagus.

    Science.gov (United States)

    Medveczky, M M; Geck, P; Clarke, C; Byrnes, J; Sullivan, J L; Medveczky, P G

    1989-02-01

    Herpesvirus sylvilagus is a lymphotropic (type gamma) herpesvirus of cottontail rabbits (Sylvilagus floridanus). Analysis of virion DNA of herpesvirus sylvilagus has revealed that the genome consists of one stretch of about 120 kilobase pairs of internal, unique DNA flanked by a variable number of 553-base-pair tandem repeats. The G + C content of the repetitive DNA is extremely high (83%), as determined by sequencing. The organization of the herpesvirus sylvilagus genome is, therefore, similar to that of the primate lymphotropic viruses herpesvirus saimiri and herpesvirus ateles. PMID:2911114

  14. Complete genome sequence of Croceibacter bacteriophage P2559S.

    Science.gov (United States)

    Kang, Ilnam; Kang, Dongmin; Cho, Jang-Cheon

    2012-08-01

    Croceibacter atlanticus HTCC2559(T), a marine bacterium isolated from the Sargasso Sea, is a phylogenetically unique member of the family Flavobacteriaceae. Strain HTCC2559(T) possesses genes related to interaction with primary producers, which makes studies on bacteriophages infecting the strain interesting. Here we report the genome sequence of bacteriophage P2559S, which was isolated off the coast of the Republic of Korea and lytically infects HTCC2559(T). Many genes predicted in the P2559S genome had their homologs in Bacteroides phages. PMID:22843867

  15. The complete chloroplast genome sequence of Clematis terniflora DC. (Ranunculaceae).

    Science.gov (United States)

    Li, Mengzhu; Yang, Bingxian; Chen, Qinyi; Zhu, Wei; Ma, Ji; Tian, Jingkui

    2016-07-01

    Clematis terniflora DC. is an important medicinal plant used in the treatment of inflammatory symptoms related to respiratory and urinary systems. In this study, we found that the complete cp genome of C. terniflora DC. is 159,528 bp. The phylogenetic analysis of 32 taxa showed a strong sister relationship with Ranunculus macranthus, which also strongly supports the position of Ranunculales. The complete cp genome sequence of Clematis terniflora DC. reported here has the potential to advance population and phylogenetic studies of this medicinal plant. PMID:25865739

  16. The complete chloroplast genome sequence of Hibiscus syriacus.

    Science.gov (United States)

    Kwon, Hae-Yun; Kim, Joon-Hyeok; Kim, Sea-Hyun; Park, Ji-Min; Lee, Hyoshin

    2016-09-01

    The complete chloroplast genome sequence of Hibiscus syriacus L. is presented in this study. The genome is 161 019 bp in length, with a typical circular structure containing a pair of inverted repeats of 25 745 bp each, separated by a large single-copy region of 89 698 bp and a small single-copy region of 19 831 bp. The overall GC content is 36.8%. One hundred and fourteen genes were annotated, including 81 protein-coding genes, 4 ribosomal RNA genes and 29 transfer RNA genes. PMID:26357910
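
    The region lengths and gene counts reported in this abstract can be checked for internal consistency: a quadripartite chloroplast genome is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The short sketch below uses only the figures quoted above; the function name is an illustrative choice.

```python
def quadripartite_length(lsc, ssc, inverted_repeat):
    """Total length of a chloroplast genome: LSC + SSC + two inverted-repeat copies."""
    return lsc + ssc + 2 * inverted_repeat


# Figures as reported for Hibiscus syriacus in the abstract above (bp).
total = quadripartite_length(lsc=89_698, ssc=19_831, inverted_repeat=25_745)
gene_total = 81 + 4 + 29  # protein-coding + rRNA + tRNA genes

print(total, gene_total)
```

    Both values agree with the reported totals of 161 019 bp and 114 annotated genes.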

  17. Phytophthora Genome Sequences Uncover Evolutionary Origins and Mechanisms of Pathogenesis

    Energy Technology Data Exchange (ETDEWEB)

    Tyler, Brett M.; Tripathy, Sucheta; Zhang, Xuemin; Dehal, Paramvir; Jiang, Rays H. Y.; Aerts, Andrea; Arredondo, Felipe D.; Baxter, Laura; Bensasson, Douda; Beynon, Jim L.; Chapman, Jarrod; Damasceno, Cynthia M. B.; Dorrance, Anne E.; Dou, Daolong; Dickerman, Allan W.; Dubchak, Inna L.; Garbelotto, Matteo; Gijzen, Mark; Gordon, Stuart G.; Govers, Francine; Grunwald, Niklaus J.; Huang, Wayne; Ivors, Kelly L.; Jones, Richard W.; Kamoun, Sophien; Krampis, Konstantinos; Lamour, Kurt H.; Lee, Mi-Kyung; McDonald, W. Hayes; Medina, Monica; Meijer, Harold J. G.; Nordberg, Erik K.; Maclean, Donald J.; Ospina-Giraldo, Manuel D.; Morris, Paul F.; Phuntumart, Vipaporn; Putnam, Nicholas J.; Rash, Sam; Rose, Jocelyn K. C.; Sakihama, Yasuko; Salamov, Asaf A.; Savidor, Alon; Scheuring, Chantel F.; Smith, Brian M.; Sobral, Bruno W. S.; Terry, Astrid; Torto-Alalibo, Trudy A.; Win, Joe; Xu, Zhanyou; Zhang, Hongbin; Grigoriev, Igor V.; Rokhsar, Daniel S.; Boore, Jeffrey L.

    2006-04-17

    Draft genome sequences have been determined for the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum. Oömycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors, and, in particular, a superfamily of 700 proteins with similarity to known oömycete avirulence genes.

  18. Complete Genome Sequence of a Human Enterovirus 71 Strain Isolated in Brunei in 2006

    OpenAIRE

    Zaini, Zainun; McMinn, Peter

    2013-01-01

    The complete genome sequence of a human enterovirus 71 strain isolated in Brunei in 2006 was determined. Phylogenetic analysis based on the complete genome sequence classified this strain into subgenogroup B5.

  19. A map of human genome variation from population-scale sequencing

    OpenAIRE

    Abdallah, Assya; Abecasis, Gonçalo R.; Abyzov, Alexej; Affourtit, Jason; Agarwala, Richa; Aksay, Gozde; Albers, Cornelis A.; Albrecht, Marcus W.; Alkan, Can; Altshuler, David L.; Ambrogio, Lauren; Amstislavskiy, Vyacheslav S.; Anderson, Paul; Ashworth, Dana; Attiya, Said

    2010-01-01

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mot...

  20. Complete Genome Sequence of Pelosinus sp. Strain UFO1 Assembled Using Single-Molecule Real-Time DNA Sequencing Technology

    OpenAIRE

    Brown, Steven D.; Utturkar, Sagar M.; Magnuson, Timothy S.; Ray, Allison E.; Poole, Farris L.; Lancaster, W Andrew; Thorgersen, Michael P.; Adams, Michael W. W.; Elias, Dwayne A.

    2014-01-01

    Pelosinus species can reduce metals such as Fe(III), U(VI), and Cr(VI) and have been isolated from diverse geographical regions. Five draft genome sequences have been published. We report the complete genome sequence for Pelosinus sp. strain UFO1 using only PacBio DNA sequence data and without manual finishing.

  1. Sequence and comparative genomic analysis of actin-related proteins.

    Science.gov (United States)

    Muller, Jean; Oma, Yukako; Vallar, Laurent; Friederich, Evelyne; Poch, Olivier; Winsor, Barbara

    2005-12-01

    Actin-related proteins (ARPs) are key players in cytoskeleton activities and nuclear functions. Two complexes, ARP2/3 and ARP1/11, also known as dynactin, are implicated in actin dynamics and in microtubule-based trafficking, respectively. ARP4 to ARP9 are components of many chromatin-modulating complexes. Conventional actins and ARPs codefine a large family of homologous proteins, the actin superfamily, with a tertiary structure known as the actin fold. Because ARPs and actin share high sequence conservation, clear family definition requires distinct features to easily and systematically identify each subfamily. In this study we performed an in depth sequence and comparative genomic analysis of ARP subfamilies. A high-quality multiple alignment of approximately 700 complete protein sequences homologous to actin, including 148 ARP sequences, allowed us to extend the ARP classification to new organisms. Sequence alignments revealed conserved residues, motifs, and inserted sequence signatures to define each ARP subfamily. These discriminative characteristics allowed us to develop ARPAnno (http://bips.u-strasbg.fr/ARPAnno), a new web server dedicated to the annotation of ARP sequences. Analyses of sequence conservation among actins and ARPs highlight part of the actin fold and suggest interactions between ARPs and actin-binding proteins. Finally, analysis of ARP distribution across eukaryotic phyla emphasizes the central importance of nuclear ARPs, particularly the multifunctional ARP4. PMID:16195354

  2. The complete chloroplast genome sequence of Anoectochilus emeiensis.

    Science.gov (United States)

    Zhu, Shuying; Niu, Zhitao; Yan, Wenjin; Xue, Qingyun; Ding, Xiaoyu

    2016-09-01

    The complete chloroplast (cp) genome sequence of Anoectochilus emeiensis, an extremely endangered medicinal plant with important economic value, was determined and characterized. The genome size was 152 650 bp, containing a pair of inverted repeats (IRs) (26 319 bp each) separated by a large single copy (LSC) region (82 670 bp) and a small single copy (SSC) region (17 342 bp). The cpDNA of A. emeiensis contained 113 unique genes, including 79 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Among them, 18 genes contained one or two introns. The overall AT content of the genome was 63.1%. PMID:26403535
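
    The region lengths quoted in this abstract can be checked against the reported total. The sketch below assumes the usual quadripartite layout (LSC, IR, SSC, IR) with two IR copies of equal length; the function name is illustrative and not taken from the paper.

```python
# Region lengths (bp) as reported in the abstract above.
LSC = 82_670   # large single copy
SSC = 17_342   # small single copy
IR = 26_319    # one inverted repeat; the layout carries two copies
REPORTED_TOTAL = 152_650


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular genome laid out as LSC + IR + SSC + IR."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
# GC content implied by the reported AT content of 63.1%.
gc_percent = round(100.0 - 63.1, 1)

print(total == REPORTED_TOTAL)  # True
print(gc_percent)               # 36.9
```

    The same check can be applied to any chloroplast record in this listing that reports all four region lengths.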

  3. Genome sequence and description of Anaerosalibacter massiliensis sp. nov.

    Directory of Open Access Journals (Sweden)

    N. Dione

    2016-03-01

    Anaerosalibacter massiliensis sp. nov. strain ND1T (= CSUR P762 = DSM 27308) is the type strain of A. massiliensis sp. nov., a new species within the genus Anaerosalibacter. This strain, the genome of which is described here, was isolated from the faecal flora of a 49-year-old healthy Brazilian man. Anaerosalibacter massiliensis is a Gram-positive, obligate anaerobic rod and member of the family Clostridiaceae. With the complete genome sequence and annotation, we describe here the features of this organism. The 3 197 911 bp long genome (one chromosome but no plasmid) contains 3271 protein-coding and 62 RNA genes, including six rRNA genes.

  4. The complete mitochondrial genome sequence of Mustela eversmannii (Carnivora: Mustelidae).

    Science.gov (United States)

    Liu, Guangshuai; Yang, Xiufeng; Zhang, Honghai; Sun, Guolei; Zhao, Chao; Dou, Huashan

    2016-09-01

    In this study, the complete mitochondrial genome of Steppe polecat, Mustela eversmannii, was sequenced for the first time using muscle tissue. The mitochondrial genome is a circular molecule of 16 463 bp in length and overall base composition is A (32.7%), T (27.3%), C (26.1%), and G (13.9%), which indicates a strong A-T bias. A phylogenetic analysis on the basis of 13 protein-coding genes and two rRNA genes of 10 Mustela species' mitochondrial genomes using maximum likelihood (ML) and Bayesian inference (BI) demonstrated that these Mustela species were clustered into two clades and M. eversmannii was close to M. putorius. PMID:26367202
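
    The A-T bias mentioned above follows directly from the reported base composition. A minimal sketch, assuming a plain DNA string as input; the helper names and the toy sequence are hypothetical and are not data from the study.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Percentage of A, C, G and T in a DNA string (other symbols are ignored)."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(composition: dict[str, float]) -> float:
    """Combined A + T percentage."""
    return composition["A"] + composition["T"]


# Percentages as reported in the abstract above.
reported = {"A": 32.7, "T": 27.3, "C": 26.1, "G": 13.9}
print(round(at_content(reported), 1))  # 60.0

# Toy sequence, for illustration only.
toy = base_composition("ATTAGCAT")
print(at_content(toy))  # 75.0
```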

  5. Genome sequence of Aedes aegypti, a major arbovirus vector.

    Science.gov (United States)

    Nene, Vishvanath; Wortman, Jennifer R; Lawson, Daniel; Haas, Brian; Kodira, Chinnappa; Tu, Zhijian Jake; Loftus, Brendan; Xi, Zhiyong; Megy, Karyn; Grabherr, Manfred; Ren, Quinghu; Zdobnov, Evgeny M; Lobo, Neil F; Campbell, Kathryn S; Brown, Susan E; Bonaldo, Maria F; Zhu, Jingsong; Sinkins, Steven P; Hogenkamp, David G; Amedeo, Paolo; Arensburger, Peter; Atkinson, Peter W; Bidwell, Shelby; Biedler, Jim; Birney, Ewan; Bruggner, Robert V; Costas, Javier; Coy, Monique R; Crabtree, Jonathan; Crawford, Matt; Debruyn, Becky; Decaprio, David; Eiglmeier, Karin; Eisenstadt, Eric; El-Dorry, Hamza; Gelbart, William M; Gomes, Suely L; Hammond, Martin; Hannick, Linda I; Hogan, James R; Holmes, Michael H; Jaffe, David; Johnston, J Spencer; Kennedy, Ryan C; Koo, Hean; Kravitz, Saul; Kriventseva, Evgenia V; Kulp, David; Labutti, Kurt; Lee, Eduardo; Li, Song; Lovin, Diane D; Mao, Chunhong; Mauceli, Evan; Menck, Carlos F M; Miller, Jason R; Montgomery, Philip; Mori, Akio; Nascimento, Ana L; Naveira, Horacio F; Nusbaum, Chad; O'leary, Sinéad; Orvis, Joshua; Pertea, Mihaela; Quesneville, Hadi; Reidenbach, Kyanne R; Rogers, Yu-Hui; Roth, Charles W; Schneider, Jennifer R; Schatz, Michael; Shumway, Martin; Stanke, Mario; Stinson, Eric O; Tubio, Jose M C; Vanzee, Janice P; Verjovski-Almeida, Sergio; Werner, Doreen; White, Owen; Wyder, Stefan; Zeng, Qiandong; Zhao, Qi; Zhao, Yongmei; Hill, Catherine A; Raikhel, Alexander S; Soares, Marcelo B; Knudson, Dennis L; Lee, Norman H; Galagan, James; Salzberg, Steven L; Paulsen, Ian T; Dimopoulos, George; Collins, Frank H; Birren, Bruce; Fraser-Liggett, Claire M; Severson, David W

    2007-06-22

    We present a draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, which at approximately 1376 million base pairs is about 5 times the size of the genome of the malaria vector Anopheles gambiae. Nearly 50% of the Ae. aegypti genome consists of transposable elements. These contribute to a factor of approximately 4 to 6 increase in average gene length and in sizes of intergenic regions relative to An. gambiae and Drosophila melanogaster. Nonetheless, chromosomal synteny is generally maintained among all three insects, although conservation of orthologous gene order is higher (by a factor of approximately 2) between the mosquito species than between either of them and the fruit fly. An increase in genes encoding odorant binding, cytochrome P450, and cuticle domains relative to An. gambiae suggests that members of these protein families underpin some of the biological differences between the two mosquito species. PMID:17510324

  6. Whole Genome Mapping with Feature Sets from High-Throughput Sequencing Data.

    Science.gov (United States)

    Pan, Yonglong; Wang, Xiaoming; Liu, Lin; Wang, Hao; Luo, Meizhong

    2016-01-01

    A good physical map is essential to guide sequence assembly in de novo whole genome sequencing, especially when sequences are produced by high-throughput methods such as next-generation sequencing (NGS). We here present a novel method, Feature sets-based Genome Mapping (FGM). With FGM, a physical map and draft whole genome sequences can be generated, anchored and integrated using the same data set of NGS sequences, independent of restriction digestion. A model of the method was created and its parameters were examined by simulations using the Arabidopsis genome sequence. In the simulations, when a ~4.8X genome BAC library of 4,096 clones was used to sequence the whole genome, ~90% of clones were successfully connected to physical contigs, and 91.58% of genome sequences were mapped and connected to chromosomes. The method was experimentally verified using the existing physical map and genome sequence of rice. Of 4,064 clones covering 115 Mb of sequence selected from ~3 tiles of 3 chromosomes of a rice draft physical map, 3,364 clones were reconstructed into physical contigs and 98 Mb of sequence was integrated into the 3 chromosomes. The physical map-integrated draft genome sequences can provide permanent frameworks for eventually obtaining high-quality reference sequences by targeted sequencing, gap filling and combining other sequences. PMID:27611682

  7. Combined evidence annotation of transposable elements in genome sequences.

    Directory of Open Access Journals (Sweden)

    Hadi Quesneville

    2005-07-01

    Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from combining multiple independent sources of computational evidence. To elevate the quality of TE annotations to a level comparable to that of gene models, we have developed a combined evidence-model TE annotation pipeline, analogous to systems used for gene annotation, by integrating results from multiple homology-based and de novo TE identification methods. As proof of principle, we have annotated "TE models" in Drosophila melanogaster Release 4 genomic sequences using the combined computational evidence derived from RepeatMasker, BLASTER, TBLASTX, all-by-all BLASTN, RECON, TE-HMM and the previous Release 3.1 annotation. Our system is designed for use with the Apollo genome annotation tool, allowing automatic results to be curated manually to produce reliable annotations. The euchromatic TE fraction of D. melanogaster is now estimated at 5.3% (cf. 3.86% in Release 3.1), and we found a substantially higher number of TEs (n = 6,013) than previously identified (n = 1,572). Most of the new TEs derive from small fragments a few hundred nucleotides long and from highly abundant families not previously annotated (e.g., INE-1). We also estimated that 518 TE copies (8.6%) are inserted into at least one other TE, forming a nest of elements. The pipeline allows rapid and thorough annotation of even the most complex TE models, including highly deleted and/or nested elements such as those often found in heterochromatic sequences. Our pipeline can be easily adapted to other genome sequences, such as those of the D. melanogaster heterochromatin or other
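
    The idea of combining evidence from several programs can be illustrated with a toy interval merge. This is not the authors' pipeline: the `Hit` type, the coordinates and the merging rule are assumptions made only for illustration, and the program names appear purely as labels.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    start: int   # 0-based, inclusive
    end: int     # exclusive
    source: str  # label of the program that reported the hit


def combine_evidence(hits: list[Hit]) -> list[tuple[int, int, set[str]]]:
    """Merge overlapping or directly adjacent hits and record their sources."""
    merged: list[tuple[int, int, set[str]]] = []
    for hit in sorted(hits):
        if merged and hit.start <= merged[-1][1]:
            start, end, sources = merged[-1]
            merged[-1] = (start, max(end, hit.end), sources | {hit.source})
        else:
            merged.append((hit.start, hit.end, {hit.source}))
    return merged


# Toy coordinates, not taken from the Release 4 annotation.
toy_hits = [
    Hit(100, 400, "RepeatMasker"),
    Hit(350, 900, "RECON"),
    Hit(1500, 1800, "TBLASTX"),
]
models = combine_evidence(toy_hits)
print(models[0][:2], sorted(models[0][2]))  # (100, 900) ['RECON', 'RepeatMasker']

# Nested fraction quoted in the abstract: 518 of 6,013 TE copies.
print(round(100 * 518 / 6013, 1))  # 8.6
```

    A region supported by more than one source is a natural candidate for a higher-confidence model, which is the intuition behind evidence combination; how the published pipeline weighs its sources is not described in this abstract.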

  9. Next-Generation Sequencing and Genome Editing in Plant Virology.

    Science.gov (United States)

    Hadidi, Ahmed; Flores, Ricardo; Candresse, Thierry; Barba, Marina

    2016-01-01

    Next-generation sequencing (NGS) has been applied to plant virology since 2009. NGS provides highly efficient, rapid, low-cost, high-throughput DNA or RNA sequencing of the genomes of plant viruses and viroids and of the specific small RNAs generated during the infection process. These small RNAs, which frequently cover the whole genome of the infectious agent, are 21-24 nt long and are known as vsRNAs for viruses and vd-sRNAs for viroids. NGS has been used in a number of studies in plant virology including, but not limited to, discovery of novel viruses and viroids, detection and identification of known pathogens, analysis of genome diversity and evolution, and study of pathogen epidemiology. The genome editing method based on the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has recently been used to engineer resistance to DNA geminiviruses (family Geminiviridae) by targeting different viral genome sequences in infected Nicotiana benthamiana or Arabidopsis plants. The DNA viruses targeted include tomato yellow leaf curl virus and merremia mosaic virus (begomovirus); beet curly top virus and beet severe curly top virus (curtovirus); and bean yellow dwarf virus (mastrevirus). The technique has also been used against the RNA viruses zucchini yellow mosaic virus, papaya ringspot virus and turnip mosaic virus (potyvirus) and cucumber vein yellowing virus (ipomovirus, family Potyviridae) by targeting the translation initiation genes eIF4E in cucumber or Arabidopsis plants. Given these major recent advances, NGS and CRISPR-Cas technologies are expected to play a significant role in the very near future in advancing the field of plant virology and connecting it with other related fields of biology. PMID:27617007

  10. The complete chloroplast genome sequence of Euonymus japonicus (Celastraceae).

    Science.gov (United States)

    Choi, Kyoung Su; Park, SeonJoo

    2016-09-01

    The complete chloroplast (cp) genome sequence of Euonymus japonicus, the first to be sequenced in the genus Euonymus, is reported in this study. The total length was 157 637 bp, containing a pair of 26 678 bp inverted repeat (IR) regions separated by a small single copy (SSC) region and a large single copy (LSC) region of 18 340 bp and 85 941 bp, respectively. The genome contains 107 unique genes, including 74 protein-coding genes, four rRNA genes, and 29 tRNA genes. Seventeen genes contain introns, of which three (clpP, ycf3, and rps12) contain two introns. Maximum likelihood (ML) phylogenetic analysis revealed that E. japonicus is closely related to Manihot and Populus. PMID:26407184

  11. Construction of an integrated database to support genomic sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  12. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks.

    Science.gov (United States)

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S K; Mammel, Mark K; Tarr, Phillip I; Eppinger, Mark

    2016-01-01

    Multi isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North America outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulse-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. 
Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and

  14. Genome sequence of carboxylesterase, carboxylase and xylose isomerase producing alkaliphilic haloarchaeon Haloterrigena turkmenica WANU15

    Directory of Open Access Journals (Sweden)

    Samy Selim

    2016-03-01

    We report the draft genome sequence of Haloterrigena turkmenica strain WANU15, isolated from Soda Lake. The draft genome is 2,950,899 bp in size, has a G + C content of 64%, and contains 49 RNA sequences. The genome sequence can be accessed at DDBJ/EMBL/GenBank under accession no. LKCV00000000.

  15. Early insights into the genome sequence of Uromyces fabae

    OpenAIRE

    Link, Tobias; Seibel, Christian; Voegele, Ralf T.

    2014-01-01

    Uromyces fabae is a major pathogen of broad bean, Vicia faba. U. fabae has served as a model among rust fungi to elucidate the development of infection structures, expression and secretion of cell wall degrading enzymes and gene expression. Using U. fabae, enormous progress was made regarding nutrient uptake and metabolism and in the search for secreted proteins and effectors. Here, we present results from a genome survey of U. fabae. Paired end Illumina sequencing provided 53 Gb of data. An ...

  16. Genome Sequence of Propionibacterium acnes Type II Strain ATCC 11828

    OpenAIRE

    Horváth, Balázs; Hunyadkürti, Judit; Vörös, Andrea; Fekete, Csaba; Urbán, Edit; Kemény, Lajos; Nagy, István

    2012-01-01

    Propionibacterium acnes is an anaerobic Gram-positive bacterium that forms part of the normal human cutaneous microbiota and is occasionally associated with inflammatory diseases (I. Kurokawa et al., Exp. Dermatol. 18:821–832, 2009). Here we present the complete genome sequence for the commercially available P. acnes type II reference strain ATCC 11828 (I. Nagy et al., Microbes Infect. 8:2195–2205, 2006) recovered from a subcutaneous abscess.

  17. Complete Genome Sequence of Croceibacter Bacteriophage P2559S

    OpenAIRE

    Kang, Ilnam; Kang, Dongmin; Cho, Jang-Cheon

    2012-01-01

    Croceibacter atlanticus HTCC2559T, a marine bacterium isolated from the Sargasso Sea, is a phylogenetically unique member of the family Flavobacteriaceae. Strain HTCC2559T possesses genes related to interaction with primary producers, which makes studies on bacteriophages infecting the strain interesting. Here we report the genome sequence of bacteriophage P2559S, which was isolated off the coast of the Republic of Korea and lytically infects HTCC2559T. Many genes predicted in the P2559S geno...

  18. Multiplexed DNA Sequence Capture of Mitochondrial Genomes Using PCR Products

    OpenAIRE

    Tomislav Maricic; Mark Whitten; Svante Pääbo

    2010-01-01

    BACKGROUND: To utilize the power of high-throughput sequencers, target enrichment methods have been developed. The majority of these require reagents and equipment that are only available from commercial vendors and are not suitable for the targets that are a few kilobases in length. METHODOLOGY/PRINCIPAL FINDINGS: We describe a novel and economical method in which custom made long-range PCR products are used to capture complete human mitochondrial genomes from complex DNA mixtures. We use th...

  19. Yersinia pestis genome sequencing identifies patterns of global phylogenetic diversity.

    OpenAIRE

    Morelli, Giovanna; Song, Yajun; Mazzoni, Camila J.; Eppinger, Mark; Roumagnac, Philippe; Wagner, David M.; Feldkamp, Mirjam; Kusecek, Barica; Vogler, Amy J.; Li, Yanjun; Cui, Yujun; Thomson, Nicholas R.; Jombart, Thibaut; Leblois, Raphael; Lichtner, Peter

    2010-01-01

    Plague is a pandemic human invasive disease caused by the bacterial agent Yersinia pestis. We here report a comparison of 17 whole genomes of Y. pestis isolates from global sources. We also screened a global collection of 286 Y. pestis isolates for 933 SNPs using Sequenom MassArray SNP typing. We conducted phylogenetic analyses on this sequence variation dataset, assigned isolates to populations based on maximum parsimony and, from these results, made inferences regarding historical transmiss...

  20. Genome Sequence of Proteus mirabilis Clinical Isolate C05028

    OpenAIRE

    Shi, Xiaolu; Zhu, Yuanfang; Li, Yinghui; Jiang, Min; Lin, Yiman; Qiu, Yaqun; Chen, Qiongcheng; Yuan, Yanting; Ni, Peixiang; Hu, Qinghua; Huang, Shenghe

    2014-01-01

    Genomic DNA of Proteus mirabilis C05028 was sequenced on an Illumina HiSeq platform and assembled into 39 scaffolds with a total length of 3.8 Mb. Open reading frames (ORFs) were then identified and annotated against the KEGG, COG, and NR databases. Finally, we identified virulence factors found only in P. mirabilis C05028.

  1. Whole genome sequencing of clinical isolates of Giardia lamblia

    OpenAIRE

    Hanevik, Kurt; Bakken, R.; Brattbakk, Hans-Richard; Saghaug, Christina Skår; Langeland, Nina

    2015-01-01

    Clinical isolates from protozoan parasites such as Giardia lamblia are at present practically impossible to culture. By using simple cyst purification methods, we show that Giardia whole genome sequencing of clinical stool samples is possible. Immunomagnetic separation after sucrose gradient flotation gave superior results compared to sucrose gradient flotation alone. The method enables detailed analysis of a wide range of genes of interest for genotyping, virulence and drug resistance.

  2. Registered Report: Melanoma genome sequencing reveals frequent PREX2 mutations

    OpenAIRE

    sprotocols

    2015-01-01

    Authors: Denise Chroscinski, Darryl Sampey, Alex Hewitt, The Reproducibility Project: Cancer Biology. Abstract: The Reproducibility Project: Cancer Biology (https://osf.io/e81xl/wiki/home/) seeks to address growing concerns about reproducibility in scientific research by conducting replications of 50 papers in the field of cancer biology published between 2010 and 2012. This Registered Report describes the proposed replication plan of key experiments from “Melanoma genome sequenci...

  3. Draft Genome Sequence of Uropathogenic Escherichia coli Strain NB8.

    Science.gov (United States)

    Weng, Xing-Bei; Mi, Zu-Huang; Wang, Chun-Xin; Zhu, Jian-Ming

    2016-01-01

    Escherichia coli NB8 is a clinical pyelonephritis isolate. Here, we report the draft genome sequence of uropathogenic E. coli NB8, which contains drug resistance genes encoding resistance to beta-lactams, aminoglycosides, quinolones, macrolides, colistin, sulfonamide-trimethoprim, and tetracycline. NB8 infects the kidney and bladder, making it an important tool for studying E. coli pathogenesis. PMID:27609920

  4. Complete Genome Sequence of Mycobacterium xenopi Type Strain RIVM700367

    KAUST Repository

    Abdallah, A. M.

    2012-05-24

    Mycobacterium xenopi is a slow-growing, thermophilic, water-related Mycobacterium species. Like other nontuberculous mycobacteria, M. xenopi more commonly infects humans with altered immune function, such as chronic obstructive pulmonary disease patients. It is considered clinically relevant in a significant proportion of the patients from whom it is isolated. We report here the whole genome sequence of M. xenopi type strain RIVM700367.

  5. Evolutionary insights from suffix array-based genome sequence analysis

    Indian Academy of Sciences (India)

    Anindya Poddar; Nagasuma Chandra; Madhavi Ganapathiraju; K Sekar; Judith Klein-Seetharaman; Raj Reddy; N Balakrishnan

    2007-08-01

    Gene and protein sequence analyses, central components of studies in modern biology are easily amenable to string matching and pattern recognition algorithms. The growing need of analysing whole genome sequences more efficiently and thoroughly, has led to the emergence of new computational methods. Suffix trees and suffix arrays are data structures, well known in many other areas and are highly suited for sequence analysis too. Here we report an improvement to the design of construction of suffix arrays. Enhancement in versatility and scalability, enabled by this approach, is demonstrated through the use of real-life examples. The scalability of the algorithm to whole genomes renders it suitable to address many biologically interesting problems. One example is the evolutionary insight gained by analysing unigrams, bi-grams and higher n-grams, indicating that the genetic code has a direct influence on the overall composition of the genome. Further, different proteomes have been analysed for the coverage of the possible peptide space, which indicate that as much as a quarter of the total space at the tetra-peptide level is left un-sampled in prokaryotic organisms, although almost all tri-peptides can be seen in one protein or another in a proteome. Besides, distinct patterns begin to emerge for the counts of particular tetra and higher peptides, indicative of a ‘meaning’ for tetra and higher n-grams. The toolkit has also been used to demonstrate the usefulness of identifying repeats in whole proteomes efficiently. As an example, 16 members of one COG, coded by the genome of Mycobacterium tuberculosis H37Rv have been found to contain a repeating sequence of 300 amino acids.
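
    The data structure and the two analyses named in this abstract (repeat finding and n-gram coverage of peptide space) can be sketched in a few lines. This is a naive illustration, not the improved construction the authors report: the suffix array is built by plain sorting, and all function names are hypothetical. The coverage function assumes the input contains only the 20 standard amino acids.

```python
def suffix_array(text: str) -> list[int]:
    """Start positions of all suffixes of text in lexicographic order (naive build)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_repeat(text: str) -> str:
    """Longest substring that occurs at least twice (occurrences may overlap)."""
    sa = suffix_array(text)
    best = ""
    # Suffixes sharing the longest prefix are neighbours in the sorted order.
    for i, j in zip(sa, sa[1:]):
        k = common_prefix_len(text[i:], text[j:])
        if k > len(best):
            best = text[i:i + k]
    return best


def ngram_coverage(sequences: list[str], n: int, alphabet_size: int = 20) -> float:
    """Fraction of the alphabet_size**n possible n-grams observed in the sequences."""
    seen = {s[i:i + n] for s in sequences for i in range(len(s) - n + 1)}
    return len(seen) / alphabet_size ** n


print(suffix_array("banana"))         # [5, 3, 1, 0, 4, 2]
print(longest_repeat("banana"))       # ana
print(20 ** 4)                        # 160000 possible tetra-peptides
print(ngram_coverage(["MKVLA"], 2))   # 0.01
```

    Sorting whole suffixes is quadratic or worse in memory and time, so it only suits short toy inputs; whole-genome use needs a dedicated construction algorithm such as the one the abstract alludes to.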

  6. Investigation of terpene diversification across multiple sequenced plant genomes

    OpenAIRE

    Boutanaev, Alexander M.; Moses, Tessa; Zi, Jiachen; Nelson, David R.; Mugford, Sam T.; Peters, Reuben J.; Osbourn, Anne

    2014-01-01

    The terpenes are the largest class of plant natural products. This major class of compounds represents tremendous chemical diversity of which only a relatively small fraction has so far been accessed and used by industry. The primary drivers of terpene diversification are terpenoid synthases and cytochromes P450, which synthesize and modify terpene scaffolds. Here, focusing on these two gene families, we investigate terpene synthesis and evolution across 17 sequenced plant genomes. Our analys...

  7. Complete genome sequence of Desulfomicrobium baculatum type strain (XT)

    Energy Technology Data Exchange (ETDEWEB)

    Copeland, Alex; Spring, Stefan; Goker, Markus; Schneider, Susanne; Lapidus, Alla; Glavina Del Rio, Tijana; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C; Meincke, Linda; Sims, David; Brettin, Thomas; Detter, John C; Han, Cliff; Chain, Patrick; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C; Lucas, Susan

    2009-05-20

    Desulfomicrobium baculatum is the type species of the genus Desulfomicrobium, which is the type genus of the family Desulfomicrobiaceae. It is of phylogenetic interest because of the isolated location of the family Desulfomicrobiaceae within the order Desulfovibrionales. D. baculatum strain XT is a Gram-negative, motile, sulfate-reducing bacterium isolated from water-saturated manganese carbonate ore. It is strictly anaerobic and does not require NaCl for growth, although NaCl concentrations up to 6% (w/v) are tolerated. The metabolism is respiratory or fermentative. In the presence of sulfate, pyruvate and lactate are incompletely oxidized to acetate and CO2. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the deltaproteobacterial family Desulfomicrobiaceae, and this 3,942,657 bp long single replicon genome with its 3494 protein-coding and 72 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  8. Artificial duplicate reads in sequencing data of 454 Genome Sequencer FLX System

    Institute of Scientific and Technical Information of China (English)

    Hui Dong; Yangyi Chen; Yan Shen; Shengyue Wang; Guoping Zhao; Weirong Jin

    2011-01-01

    The 454 Genome Sequencer (GS) FLX System is one of the next-generation sequencing systems characterized by long reads, high accuracy, and ultra-high throughput. Based on the mechanism of emulsion PCR, a unique DNA template would only generate a unique sequence read after being amplified and sequenced on GS FLX. However, biased amplification of DNA templates might occur in the process of emulsion PCR, which results in production of artificial duplicate reads. Under the condition that each DNA template is unique, 3.49%-18.14% of total reads in GS FLX sequencing data were found to be artificial duplicate reads. These duplicate reads may lead to misinterpretation of sequencing data, and special attention should be paid to the potential biases they introduce.
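
    A duplicate-read fraction of the kind quoted above can be estimated with a simple count. The abstract does not state how duplicates were identified, so the sketch below makes a loud simplifying assumption: two reads count as duplicates only if their sequences are identical. The function name and the toy reads are hypothetical.

```python
from collections import Counter


def duplicate_fraction(reads: list[str]) -> float:
    """Fraction of reads that are extra copies of an identical read.

    Assumption: duplicates are exact sequence matches; the first copy of
    each distinct sequence is kept and every further copy is counted.
    """
    if not reads:
        return 0.0
    counts = Counter(reads)
    extra = sum(n - 1 for n in counts.values())
    return extra / len(reads)


# Toy reads, for illustration only.
toy_reads = ["ACGTAC", "ACGTAC", "TTGACA", "GGCATC", "ACGTAC"]
print(duplicate_fraction(toy_reads))  # 0.4
```

    Real 454 reads vary in length and carry sequencing errors, so exact matching would undercount; a practical tool would likely compare read prefixes or mapped start positions instead.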

  9. An efficient procedure for plant organellar genome assembly, based on whole genome data from the 454 GS FLX sequencing platform

    Directory of Open Access Journals (Sweden)

    Zhang Tongwu

    2011-11-01

    Full Text Available Abstract Motivation Complete organellar genome sequences (chloroplasts and mitochondria) provide valuable resources and information for studying plant molecular ecology and evolution. As high-throughput sequencing technology advances, using a shotgun approach to obtain complete genome sequences is becoming the norm. Assembling organellar sequences from whole-genome shotgun reads is therefore inevitable. However, associated techniques are often cumbersome, time-consuming, and difficult, because true organellar DNA is difficult to separate efficiently from nuclear copies, which have been transferred to the nucleus through the course of evolution. Results We report a new, rapid procedure for plant chloroplast and mitochondrial genome sequencing and assembly using the Roche/454 GS FLX platform. Plant cells can contain multiple copies of the organellar genomes, and there is a significant correlation between the depth of sequence reads in contigs and the number of copies of the genome. Without isolating organellar DNA from the mixture of nuclear and organellar DNA for sequencing, we retrospectively extracted assembled contigs of either chloroplast or mitochondrial sequences from the whole genome shotgun data. Moreover, the contig connection graph property of Newbler (a platform-specific sequence assembler) ensures an efficient final assembly. Using this procedure, we assembled both chloroplast and mitochondrial genomes of a resurrection plant, Boea hygrometrica, with high fidelity. We also present information and a minimal sequence dataset as a reference for the assembly of other plant organellar genomes.
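    The link between read depth and genome copy number described above suggests a simple filter: contigs whose depth is far above the typical (mostly nuclear) depth are organellar candidates. The following is a minimal sketch of that idea, not the authors' procedure; the function name, the `fold` threshold, and the depth values are hypothetical.

```python
from statistics import median


def high_depth_contigs(depth_by_contig, fold=5.0):
    """Return names of contigs whose read depth is at least `fold` times the median.

    The median is used as a stand-in for the nuclear baseline depth; the
    fold threshold is an assumed value that would need tuning on real data.
    """
    if not depth_by_contig:
        return []
    baseline = median(depth_by_contig.values())
    return sorted(
        name for name, depth in depth_by_contig.items() if depth >= fold * baseline
    )


# Hypothetical average depths per contig, made up for illustration.
depths = {"c1": 12, "c2": 10, "c3": 11, "c4": 480, "c5": 95, "c6": 9}
candidates = high_depth_contigs(depths, fold=5.0)
print(candidates)
```

    Depth alone cannot distinguish chloroplast from mitochondrial contigs or exclude nuclear repeats, so candidates selected this way would still need to be checked against known organellar sequences.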

  10. Whole-Genome de novo Sequencing Of Quail And Grey Partridge

    DEFF Research Database (Denmark)

    Holm, Lars-Erik; Panitz, Frank; Burt, Dave;

    2011-01-01

    The development in sequencing methods has made it possible to perform whole genome de novo sequencing of species without large commercial interests. Within the EU-financed QUANTOMICS project (KBBE-2A-222664), we have performed de novo sequencing of quail (Coturnix coturnix) and grey partridge...... (Perdix perdix) on a Genome Analyzer GAII (Illumina) using paired-end sequencing. The amount of generated sequences amounts to 8 to 9 Gb for each species. The analysis and assembly of the generated sequences is ongoing. Access to the whole genome sequence from these two species will enable enhanced...... comparative studies towards the chicken genome and will aid in identifying evolutionarily conserved sequences within the Galliformes. The obtained sequences from quail and partridge represent a beginning of generating the whole genome sequence for these species. The continuation of establishing the genome...

  11. Ancient human genome sequence of an extinct Palaeo-Eskimo

    Science.gov (United States)

    Rasmussen, Morten; Li, Yingrui; Lindgreen, Stinus; Pedersen, Jakob Skou; Albrechtsen, Anders; Moltke, Ida; Metspalu, Mait; Metspalu, Ene; Kivisild, Toomas; Gupta, Ramneek; Bertalan, Marcelo; Nielsen, Kasper; Gilbert, M. Thomas P.; Wang, Yong; Raghavan, Maanasa; Campos, Paula F.; Kamp, Hanne Munkholm; Wilson, Andrew S.; Gledhill, Andrew; Tridico, Silvana; Bunce, Michael; Lorenzen, Eline D.; Binladen, Jonas; Guo, Xiaosen; Zhao, Jing; Zhang, Xiuqing; Zhang, Hao; Li, Zhuo; Chen, Minfeng; Orlando, Ludovic; Kristiansen, Karsten; Bak, Mads; Tommerup, Niels; Bendixen, Christian; Pierre, Tracey L.; Grønnow, Bjarne; Meldgaard, Morten; Andreasen, Claus; Fedorova, Sardana A.; Osipova, Ludmila P.; Higham, Thomas F. G.; Ramsey, Christopher Bronk; Hansen, Thomas v. O.; Nielsen, Finn C.; Crawford, Michael H.; Brunak, Søren; Sicheritz-Pontén, Thomas; Villems, Richard; Nielsen, Rasmus; Krogh, Anders; Wang, Jun; Willerslev, Eske

    2013-01-01

    We report here the genome sequence of an ancient human. Obtained from ∼4,000-year-old permafrost-preserved hair, the genome represents a male individual from the first known culture to settle in Greenland. Sequenced to an average depth of 20×, we recover 79% of the diploid genome, an amount close to the practical limit of current sequencing technologies. We identify 353,151 high-confidence single-nucleotide polymorphisms (SNPs), of which 6.8% have not been reported previously. We estimate raw read contamination to be no higher than 0.8%. We use functional SNP assessment to assign possible phenotypic characteristics of the individual that belonged to a culture whose location has yielded only trace human remains. We compare the high-confidence SNPs to those of contemporary populations to find the populations most closely related to the individual. This provides evidence for a migration from Siberia into the New World some 5,500 years ago, independent of that giving rise to the modern Native Americans and Inuit. PMID:20148029

  12. The genomic sequence of lymphocryptovirus from cynomolgus macaque.

    Science.gov (United States)

    Kamperschroer, Cris; Gosink, Mark M; Kumpf, Steven W; O'Donnell, Lynn M; Tartaro, Karrie R

    2016-01-15

    Lymphocryptoviruses such as Epstein-Barr virus (EBV) cause persistent infections in human and non-human primates, and suppression of the immune system can increase the risk of lymphocryptovirus (LCV)-associated tumor development in both human and non-human primates. To enable LCV infection as a non-clinical model to study effects of therapeutics on EBV immunity, we determined the genomic DNA sequence of the LCV from cynomolgus macaque, a species commonly used for non-clinical testing. Comparison to rhesus macaque LCV and human EBV sequences indicates that LCV from the cynomolgus macaque has the same genomic arrangement and a high degree of similarity in most genes, especially with rhesus macaque LCV. Genes showing lower similarity were those encoding proteins involved in latency and/or tumor promotion or immune evasion. The genomic sequence of LCV from cynomolgus macaque should aid the development of non-clinical tools for identifying therapeutics that impact LCV immunity and carry potential lymphoma risk. PMID:26590795

  13. Two complete chloroplast genome sequences of Cannabis sativa varieties.

    Science.gov (United States)

    Oh, Hyehyun; Seo, Boyoung; Lee, Seunghwan; Ahn, Dong-Ha; Jo, Euna; Park, Jin-Kyoung; Min, Gi-Sik

    2016-07-01

    In this study, we determined the complete chloroplast (cp) genomes from two varieties of Cannabis sativa. The genome sizes were 153,848 bp (the Korean non-drug variety, Cheungsam) and 153,854 bp (the African variety, Yoruba Nigeria). The genome structures were identical with 131 individual genes [86 protein-coding genes (PCGs), eight rRNA, and 37 tRNA genes]. Further, except for the presence of an intron in the rps3 genes of two C. sativa varieties, the cp genomes of C. sativa had conservative features similar to that of all known species in the order Rosales. To verify the position of C. sativa within the order Rosales, we conducted phylogenetic analysis by using concatenated sequences of all PCGs from 17 complete cp genomes. The resulting tree strongly supported monophyly of Rosales. Further, the family Cannabaceae, represented by C. sativa, showed close relationship with the family Moraceae. The phylogenetic relationship outlined in our study is well congruent with those previously shown for the order Rosales. PMID:26104156

  14. Complete genome sequence of an attenuated Sparfloxacin-resistant Streptococcus agalactiae strain 138spar

    Science.gov (United States)

    The complete genome of a sparfloxacin-resistant Streptococcus agalactiae vaccine strain 138spar is 1,838,126 bp in size. The genome has 1892 coding sequences and 82 RNAs. The annotation of the genome is added by the NCBI Prokaryotic Genome Annotation Pipeline. The publishing of this genome will allo...

  15. GABenchToB: A Genome Assembly Benchmark Tuned on Bacteria and Benchtop Sequencers

    OpenAIRE

    Jünemann, Sebastian; Prior, Karola; Albersmeier, Andreas; Albaum, Stefan; Kalinowski, Jörn; Goesmann, Alexander; Stoye, Jens; Harmsen, Dag

    2014-01-01

    De novo genome assembly is the process of reconstructing a complete genomic sequence from countless small sequencing reads. Due to the complexity of this task, numerous genome assemblers have been developed to cope with different requirements and the different kinds of data provided by sequencers within the fast evolving field of next-generation sequencing technologies. In particular, the recently introduced generation of benchtop sequencers, like Illumina's MiSeq and Ion Torrent's Personal G...

  16. The Genome Sequence of Caenorhabditis briggsae: A Platform for Comparative Genomics

    Directory of Open Access Journals (Sweden)

    Stein Lincoln D

    2003-01-01

    Full Text Available The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C

  17. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Lincoln D Stein

    2003-11-01

    Full Text Available The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C

  18. De novo assembly of the carrot mitochondrial genome using next generation sequencing of whole genomic DNA provides first evidence of DNA transfer into an angiosperm plastid genome

    Directory of Open Access Journals (Sweden)

    Iorizzo Massimo

    2012-05-01

    Full Text Available Abstract Background Sequence analysis of organelle genomes has revealed important aspects of plant cell evolution. The scope of this study was to develop an approach for de novo assembly of the carrot mitochondrial genome using next generation sequence data from total genomic DNA. Results Sequencing data from a carrot 454 whole genome library were used to develop a de novo assembly of the mitochondrial genome. Development of a new bioinformatic tool allowed visualizing contig connections and elucidation of the de novo assembly. Southern hybridization demonstrated recombination across two large repeats. Genome annotation allowed identification of 44 protein coding genes, three rRNA and 17 tRNA. Identification of the plastid genome sequence allowed organelle genome comparison. Mitochondrial intergenic sequence analysis allowed detection of a fragment of DNA specific to the carrot plastid genome. PCR amplification and sequence analysis across different Apiaceae species revealed consistent conservation of this fragment in the mitochondrial genomes and an insertion in Daucus plastid genomes, giving evidence of a mitochondrial to plastid transfer of DNA. Sequence similarity with a retrotransposon element suggests a possibility that a transposon-like event transferred this sequence into the plastid genome. Conclusions This study confirmed that whole genome sequencing is a practical approach for de novo assembly of higher plant mitochondrial genomes. In addition, a new aspect of intercompartmental genome interaction was reported providing the first evidence for DNA transfer into an angiosperm plastid genome. The approach used here could be used more broadly to sequence and assemble mitochondrial genomes of diverse species. This information will allow us to better understand intercompartmental interactions and cell evolution.

  19. BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes.

    Science.gov (United States)

    Staňková, Helena; Hastie, Alex R; Chan, Saki; Vrána, Jan; Tulpová, Zuzana; Kubaláková, Marie; Visendi, Paul; Hayashi, Satomi; Luo, Mingcheng; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana

    2016-07-01

    The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC-by-BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high-resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high-resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome-scale analysis of repetitive sequences and revealed a ~800-kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone-by-clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC-contig physical map and validate sequence assembly on a chromosome-arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome-by-chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules. PMID:26801360
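    The N50 reported above is a standard contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. A minimal sketch of the calculation follows; the function name and the toy contig lengths are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the sum.
    """
    if not lengths:
        raise ValueError("n50() requires at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy contig lengths (arbitrary units), made up for illustration.
toy_lengths = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

    Because N50 depends only on the length distribution, it says nothing about whether the contigs are correctly assembled, which is why the abstract pairs the genome map with validation against sequence assemblies.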

  20. The first genome sequence of a metatherian herpesvirus: Macropodid herpesvirus 1

    OpenAIRE

    Vaz, Paola K.; Timothy J Mahony; Hartley, Carol A.; Fowler, Elizabeth V.; Ficorilli, Nino; Sang W. Lee; Gilkerson, James R.; Browning, Glenn F.; Devlin, Joanne M.

    2016-01-01

    Background While many placental herpesvirus genomes have been fully sequenced, the complete genome of a marsupial herpesvirus has not been described. Here we present the first genome sequence of a metatherian herpesvirus, Macropodid herpesvirus 1 (MaHV-1). Results The MaHV-1 viral genome was sequenced using an Illumina MiSeq sequencer, de novo assembly was performed and the genome was annotated. The MaHV-1 genome was 140 kbp in length and clustered phylogenetically with the primate simplexvir...

  1. Recoding of the stop codon UGA to glycine by a BD1-5/SN-2 bacterium and niche partitioning between Alpha- and Gammaproteobacteria in a tidal sediment microbial community naturally selected in a laboratory chemostat

    Directory of Open Access Journals (Sweden)

    Marc Strous

    2014-05-01

    Full Text Available Sandy coastal sediments are global hotspots for microbial mineralization of organic matter and denitrification. These sediments are characterized by advective porewater flow, tidal cycling and an active and complex microbial community. Metagenomic sequencing of microbial communities sampled from such sediments showed that potential sulfur oxidizing Gammaproteobacteria and members of the enigmatic BD1-5/SN-2 candidate phylum were abundant in situ (>10% and ~2%, respectively). By mimicking the dynamic oxic/anoxic environmental conditions of the sediment in a laboratory chemostat, a simplified microbial community was selected from the more complex inoculum. Metagenomics, proteomics and fluorescence in situ hybridization showed that this simplified community contained both a potential sulfur oxidizing Gammaproteobacterium (at 24±2% abundance) and a member of the BD1-5/SN-2 candidate phylum (at 7±6% abundance). Despite the abundant supply of organic substrates to the chemostat, proteomic analysis suggested that the selected gammaproteobacterium grew partially autotrophically and performed hydrogen/formate oxidation. The enrichment of a member of the BD1-5/SN-2 candidate phylum enabled, for the first time, direct microscopic observation by fluorescent in situ hybridization and the experimental validation of the previously predicted translation of the stop codon UGA into glycine.

  2. Recoding of the stop codon UGA to glycine by a BD1-5/SN-2 bacterium and niche partitioning between Alpha- and Gammaproteobacteria in a tidal sediment microbial community naturally selected in a laboratory chemostat

    Energy Technology Data Exchange (ETDEWEB)

    Hanke, Anna [Max Planck Institute for Marine Microbiology]; Hamann, Emmo [Max Planck Institute for Marine Microbiology]; Sharma, Ritin [ORNL]; Geelhoed, Jeanine [Max Planck Institute for Marine Microbiology]; Hargesheimer, Theresa [Max Planck Institute for Marine Microbiology]; Kraft, Beate [Max Planck Institute for Marine Microbiology]; Meyer, Volker [Max Planck Institute for Marine Microbiology]; Lenk, Sabine [Max Planck Institute for Marine Microbiology]; Osmers, Harald [Max Planck Institute for Marine Microbiology]; Wu, Rong [Delft University of Technology, Delft, Netherlands]; Makinwa, Kofi [Delft University of Technology, Delft, Netherlands]; Hettich, Robert {Bob} L [ORNL]; Banfield, Jillian F. [University of California, Berkeley]; Tegetmeyer, Halina [Max Planck Institute for Marine Microbiology]; Strous, Marc [University of Calgary, Alberta, Canada]

    2014-01-01

    Sandy coastal sediments are global hot spots for microbial mineralization of organic matter and denitrification. These sediments are characterized by advective pore water flow, tidal cycling and an active and complex microbial community. Metagenomic sequencing of microbial communities sampled from such sediments showed that potential sulfur-oxidizing Gammaproteobacteria and members of the enigmatic BD1-5/SN-2 candidate phylum were abundant in situ (>10% and 2%, respectively). By mimicking the dynamic oxic/anoxic environmental conditions of the sediment in a laboratory chemostat, a simplified microbial community was selected from the more complex inoculum. Metagenomics, proteomics and fluorescence in situ hybridization showed that this simplified community contained both a potential sulfur-oxidizing gammaproteobacterium (at 24±2% abundance) and a member of the BD1-5/SN-2 candidate phylum (at 7±6% abundance). Despite the abundant supply of organic substrates to the chemostat, proteomic analysis suggested that the selected gammaproteobacterium grew partially autotrophically and performed hydrogen/formate oxidation. The enrichment of a member of the BD1-5/SN-2 candidate phylum enabled, for the first time, direct microscopic observation by fluorescent in situ hybridization and the experimental validation of the previously predicted translation of the stop codon UGA into glycine.
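    The practical effect of reading UGA as glycine can be shown by translating the same sequence with the standard genetic code and with a variant in which only that codon is reassigned. The sketch below builds both tables by hand from the standard code; the function name and the example sequence are made up for illustration and do not come from the study.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G codon order.
BASES = "TCAG"
STANDARD_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), STANDARD_AMINO_ACIDS)
}

# Variant code: identical to the standard code except that UGA (TGA in DNA)
# encodes glycine instead of terminating translation.
RECODED_CODE = {**STANDARD_CODE, "TGA": "G"}


def translate(sequence, code):
    """Translate a DNA or RNA sequence in frame 1; unknown codons become 'X'."""
    dna = sequence.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(code.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# Hypothetical example sequence containing an in-frame TGA codon.
example = "ATGTGAGGCTAA"
print(translate(example, STANDARD_CODE))  # in-frame TGA read as a stop ('*')
print(translate(example, RECODED_CODE))   # in-frame TGA read as glycine ('G')
```

    Gene prediction on such genomes with the standard code would split or truncate genes at every in-frame UGA, which is why the choice of translation table matters for annotation.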

  3. Identification of genomic regions associated with female fertility in Danish Jersey using whole genome sequence data

    DEFF Research Database (Denmark)

    Höglund, Johanna; Guldbrandtsen, Bernt; Lund, Mogens Sandø;

    2015-01-01

    Background: Female fertility is an important trait in cattle breeding programs. In the Nordic countries selection is based on a fertility index (FTI). The fertility index is a weighted combination of four female fertility traits estimated breeding values for number of inseminations per conception...... sires from Denmark with official breeding values for female fertility traits. The association analyses were carried out in two steps: first the cattle genome was scanned for quantitative trait loci using a sire model for FTI using imputed whole genome sequence variants; second the significant...... cows on BTA20, BTA23 and BTA25, IFL for heifers on BTA7 and QTL9-2 on BTA9, NRR for heifers on BTA7 and BTA23, and NRR for cows on BTA23. Conclusion: The genome wide association study presented here revealed 6 genomic regions associated with FTI. Screening these 6 QTL regions for the underlying female...

  4. Applications of Genomic Sequencing in Pediatric CNS Tumors.

    Science.gov (United States)

    Bavle, Abhishek A; Lin, Frank Y; Parsons, D Williams

    2016-05-01

    Recent advances in genome-scale sequencing methods have resulted in a significant increase in our understanding of the biology of human cancers. When applied to pediatric central nervous system (CNS) tumors, these remarkable technological breakthroughs have facilitated the molecular characterization of multiple tumor types, provided new insights into the genetic basis of these cancers, and prompted innovative strategies that are changing the management paradigm in pediatric neuro-oncology. Genomic tests have begun to affect medical decision making in a number of ways, from delineating histopathologically similar tumor types into distinct molecular subgroups that correlate with clinical characteristics, to guiding the addition of novel therapeutic agents for patients with high-risk or poor-prognosis tumors, or alternatively, reducing treatment intensity for those with a favorable prognosis. Genomic sequencing has also had a significant impact on translational research strategies in pediatric CNS tumors, resulting in wide-ranging applications that have the potential to direct the rational preclinical screening of novel therapeutic agents, shed light on tumor heterogeneity and evolution, and highlight differences (or similarities) between pediatric and adult CNS tumors. Finally, in addition to allowing the identification of somatic (tumor-specific) mutations, the analysis of patient-matched constitutional (germline) DNA has facilitated the detection of pathogenic germline alterations in cancer genes in patients with CNS tumors, with critical implications for genetic counseling and tumor surveillance strategies for children with familial predisposition syndromes. As our understanding of the molecular landscape of pediatric CNS tumors continues to advance, innovative applications of genomic sequencing hold significant promise for further improving the care of children with these cancers. PMID:27188671

  5. The complete mitochondrial genome sequence of Schizopygopsis anteroventris (Cypriniformes: Cyprinidae).

    Science.gov (United States)

    Liang, Yangyang; Chen, Yifeng; Li, Chunhua; He, Dekui

    2016-09-01

    Schizopygopsis anteroventris (Cyprinidae: Schizothoracinae) is an ecologically and economically important cyprinid endemic to the Qinghai-Tibet Plateau, China. In this study, we sequenced the complete mitochondrial genome of S. anteroventris by DNA sequencing based on PCR fragments. The mitogenome of S. anteroventris is 16,620 bp in length, containing 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes and two non-coding regions: the control region (D-loop) and the origin of light-strand replication (OL). The gene order in the mitogenome is identical to the common vertebrate form. The complete mitogenome sequence is useful for further genetic studies, phylogenetic analysis and resource protection of S. anteroventris. PMID:25791361

  6. Second generation sequencing of the mesothelioma tumor genome.

    Directory of Open Access Journals (Sweden)

    Raphael Bueno

    Full Text Available The current paradigm for elucidating the molecular etiology of cancers relies on the interrogation of small numbers of genes, which limits the scope of investigation. Emerging second-generation massively parallel DNA sequencing technologies have enabled more precise definition of the cancer genome on a global scale. We examined the genome of a human primary malignant pleural mesothelioma (MPM) tumor and matched normal tissue by using a combination of sequencing-by-synthesis and pyrosequencing methodologies to a 9.6X depth of coverage. Read density analysis uncovered significant aneuploidy and numerous rearrangements. Method-dependent informatics rules, which combined the results of different sequencing platforms, were developed to identify and validate candidate mutations of multiple types. Many more tumor-specific rearrangements than point mutations were uncovered at this depth of sequencing, resulting in novel, large-scale, inter- and intra-chromosomal deletions, inversions, and translocations. Nearly all candidate point mutations appeared to be previously unknown SNPs. Thirty tumor-specific fusions/translocations were independently validated with PCR and Sanger sequencing. Of these, 15 represented disrupted gene-encoding regions, including kinases, transcription factors, and growth factors. One large deletion in DPP10 resulted in altered transcription and expression of DPP10 transcripts in a set of 53 additional MPM tumors correlated with survival. Additionally, three point mutations were observed in the coding regions of NKX6-2, a transcription regulator, and NFRKB, a DNA-binding protein involved in modulating NFKB1. Several regions containing genes such as PCBD2 and DHFR, which are involved in growth factor signaling and nucleotide synthesis, respectively, were selectively amplified in the tumor. Second-generation sequencing uncovered all types of mutations in this MPM tumor, with DNA rearrangements representing the dominant type.

  7. The nucleotide sequence and genome organization of Plasmopara halstedii virus

    Directory of Open Access Journals (Sweden)

    Göpfert Jens C

    2011-03-01

    Full Text Available Abstract Background Only very few viruses of Oomycetes have been studied in detail. Isometric virions were found in different isolates of the oomycete Plasmopara halstedii, the downy mildew pathogen of sunflower. However, complete nucleotide sequences and data on the genome organization were lacking. Methods Viral RNA of different P. halstedii isolates was subjected to nucleotide sequencing and analysis of the viral genome. The N-terminal sequence of the viral coat protein was determined using Top-Down MALDI-TOF analysis. Results The complete nucleotide sequences of both single-stranded RNA segments (RNA1 and RNA2) were established. RNA1 consisted of 2793 nucleotides (nt) exclusive of its 3' poly(A) tract and a single open-reading frame (ORF1) of 2745 nt. ORF1 was framed by a 5' untranslated region (5' UTR) of 18 nt and a 3' untranslated region (3' UTR) of 30 nt. ORF1 contained motifs of RNA-dependent RNA polymerases (RdRp) and showed similarities to RdRp of Sclerophthora macrospora virus A (SmV A) and viruses within the Nodaviridae family. RNA2 consisted of 1526 nt exclusive of its 3' poly(A) tract and a second ORF (ORF2) of 1128 nt. ORF2 coded for the single viral coat protein (CP) and was framed by a 5' UTR of 164 nt and a 3' UTR of 234 nt. The deduced amino acid sequence of ORF2 was verified by nano-LC-ESI-MS/MS experiments. Top-Down MALDI-TOF analysis revealed the N-terminal sequence of the CP. The N-terminal sequence represented a region within ORF2 suggesting a proteolytic processing of the CP in vivo. The CP showed similarities to CP of SmV A and viruses within the Tombusviridae family. Fragments of RNA1 (ca. 1.9 kb) and RNA2 (ca. 1.4 kb) were used to analyze the nucleotide sequence variation of virions in different P. halstedii isolates. Viral sequence variation was 0.3% or less regardless of their host's pathotypes, the geographical origin and the sensitivity towards the fungicide metalaxyl. Conclusions The results showed the presence of a single and new
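    The segment lengths quoted in this abstract can be cross-checked with simple arithmetic: for each RNA, the 5' UTR, ORF and 3' UTR should add up to the stated total (excluding the poly(A) tract), and each ORF length should be a multiple of three. The short sketch below performs that check; the dictionary layout and function names are illustrative only.

```python
# Lengths in nucleotides, as quoted in the abstract (poly(A) tract excluded).
segments = {
    "RNA1": {"total": 2793, "utr5": 18, "orf": 2745, "utr3": 30},
    "RNA2": {"total": 1526, "utr5": 164, "orf": 1128, "utr3": 234},
}


def parts_sum(segment):
    """Sum of 5' UTR, ORF and 3' UTR lengths for one segment."""
    return segment["utr5"] + segment["orf"] + segment["utr3"]


def orf_triplets(segment):
    """Number of complete nucleotide triplets in the ORF."""
    return segment["orf"] // 3


for name, segment in segments.items():
    consistent = parts_sum(segment) == segment["total"]
    in_frame = segment["orf"] % 3 == 0
    print(name, parts_sum(segment), consistent, in_frame, orf_triplets(segment))
```

    Both segments pass: 18 + 2745 + 30 = 2793 and 164 + 1128 + 234 = 1526, and both ORF lengths divide evenly by three.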

  8. Complete genome sequence of Enterococcus faecium strain TX16 and comparative genomic analysis of Enterococcus faecium genomes

    Directory of Open Access Journals (Sweden)

    Qin Xiang

    2012-07-01

    Full Text Available Abstract Background Enterococci are among the leading causes of hospital-acquired infections in the United States and Europe, with Enterococcus faecalis and Enterococcus faecium being the two most common species isolated from enterococcal infections. In the last decade, the proportion of enterococcal infections caused by E. faecium has steadily increased compared to other Enterococcus species. Although the underlying mechanism for the gradual replacement of E. faecalis by E. faecium in the hospital environment is not yet understood, many studies using genotyping and phylogenetic analysis have shown the emergence of a globally dispersed polyclonal subcluster of E. faecium strains in clinical environments. Systematic study of the molecular epidemiology and pathogenesis of E. faecium has been hindered by the lack of closed, complete E. faecium genomes that can be used as references. Results In this study, we report the complete genome sequence of the E. faecium strain TX16, also known as DO, which belongs to multilocus sequence type (ST) 18, and was the first E. faecium strain ever sequenced. Whole genome comparison of the TX16 genome with 21 E. faecium draft genomes confirmed that most clinical, outbreak, and hospital-associated (HA) strains (including STs 16, 17, 18, and 78), in addition to strains of non-hospital origin, group in the same clade (referred to as the HA clade) and are evolutionarily considerably more closely related to each other by phylogenetic and gene content similarity analyses than to isolates in the community-associated (CA) clade, with approximately a 3–4% average nucleotide sequence difference between the two clades at the core genome level. Our study also revealed that many genomic loci in the TX16 genome are unique to the HA clade. 380 ORFs in TX16 are HA-clade specific and antibiotic resistance genes are enriched in HA-clade strains. Mobile elements such as IS16 and transposons were also found almost exclusively in HA strains

  9. Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays.

    Science.gov (United States)

    Drmanac, Radoje; Sparks, Andrew B; Callow, Matthew J; Halpern, Aaron L; Burns, Norman L; Kermani, Bahram G; Carnevali, Paolo; Nazarenko, Igor; Nilsen, Geoffrey B; Yeung, George; Dahl, Fredrik; Fernandez, Andres; Staker, Bryan; Pant, Krishna P; Baccash, Jonathan; Borcherding, Adam P; Brownley, Anushka; Cedeno, Ryan; Chen, Linsu; Chernikoff, Dan; Cheung, Alex; Chirita, Razvan; Curson, Benjamin; Ebert, Jessica C; Hacker, Coleen R; Hartlage, Robert; Hauser, Brian; Huang, Steve; Jiang, Yuan; Karpinchyk, Vitali; Koenig, Mark; Kong, Calvin; Landers, Tom; Le, Catherine; Liu, Jia; McBride, Celeste E; Morenzoni, Matt; Morey, Robert E; Mutch, Karl; Perazich, Helena; Perry, Kimberly; Peters, Brock A; Peterson, Joe; Pethiyagoda, Charit L; Pothuraju, Kaliprasad; Richter, Claudia; Rosenbaum, Abraham M; Roy, Shaunak; Shafto, Jay; Sharanhovich, Uladzislau; Shannon, Karen W; Sheppy, Conrad G; Sun, Michel; Thakuria, Joseph V; Tran, Anne; Vu, Dylan; Zaranek, Alexander Wait; Wu, Xiaodi; Drmanac, Snezana; Oliphant, Arnold R; Banyai, William C; Martin, Bruce; Ballinger, Dennis G; Church, George M; Reid, Clifford A

    2010-01-01

    Genome sequencing of large numbers of individuals promises to advance the understanding, treatment, and prevention of human diseases, among other applications. We describe a genome sequencing platform that achieves efficient imaging and low reagent consumption with combinatorial probe anchor ligation chemistry to independently assay each base from patterned nanoarrays of self-assembling DNA nanoballs. We sequenced three human genomes with this platform, generating an average of 45- to 87-fold coverage per genome and identifying 3.2 to 4.5 million sequence variants per genome. Validation of one genome data set demonstrates a sequence accuracy of about 1 false variant per 100 kilobases. The high accuracy, affordable cost of $4400 for sequencing consumables, and scalability of this platform enable complete human genome sequencing for the detection of rare variants in large-scale genetic studies. PMID:19892942

  10. Early insights into the genome sequence of Uromyces fabae.

    Science.gov (United States)

    Link, Tobias; Seibel, Christian; Voegele, Ralf T

    2014-01-01

    Uromyces fabae is a major pathogen of broad bean, Vicia faba. U. fabae has served as a model among rust fungi to elucidate the development of infection structures, expression and secretion of cell wall degrading enzymes and gene expression. Using U. fabae, enormous progress was made regarding nutrient uptake and metabolism and in the search for secreted proteins and effectors. Here, we present results from a genome survey of U. fabae. Paired end Illumina sequencing provided 53 Gb of data. An assembly gave 59,735 scaffolds with a total length of 216 Mb. K-mer analysis estimated the genome size to be 329 Mb. Of a representative set of 23,153 predicted proteins we could annotate 10,209, and predict 599 secreted proteins. Clustering of the protein set indicates families of highly likely effectors. We also found new homologs of RTP1p, a prototype rust effector. The U. fabae genome will be an important resource for comparative analyses with U. appendiculatus and P. pachyrhizi and provide information regarding the phylogenetic relationship of the genus Uromyces with respect to other rust fungi already sequenced, namely Puccinia graminis f. sp. tritici, P. striiformis f. sp. tritici, Melampsora lini, and Melampsora larici-populina. PMID:25400651
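The figures in this abstract imply a sequencing depth and an assembled fraction that the text does not state. The following is a minimal sketch of that arithmetic, assuming that "53 Gb" means 53 × 10^9 bases of raw reads and that the k-mer estimate of 329 Mb is the true genome size; the function names are illustrative and do not come from the paper.

```python
def fold_coverage(sequenced_bases, genome_size):
    """Average number of times each genome position is covered by reads."""
    return sequenced_bases / genome_size


def assembled_fraction(assembly_length, genome_size):
    """Share of the estimated genome that is represented in the assembly."""
    return assembly_length / genome_size


# Values taken from the abstract; units are an assumption (bases, not bits).
RAW_DATA = 53_000_000_000       # 53 Gb of paired-end Illumina data
ASSEMBLY = 216_000_000          # 216 Mb total scaffold length
GENOME_ESTIMATE = 329_000_000   # 329 Mb from k-mer analysis

if __name__ == "__main__":
    print(f"coverage: {fold_coverage(RAW_DATA, GENOME_ESTIMATE):.0f}x")
    print(f"assembled: {assembled_fraction(ASSEMBLY, GENOME_ESTIMATE):.1%}")
```

Under these assumptions the survey corresponds to roughly 160-fold raw coverage, with about two thirds of the estimated genome present in scaffolds.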

  11. Whole Genome Re-Sequencing of Three Domesticated Chicken Breeds.

    Science.gov (United States)

    Oh, Dongyep; Son, Bongjun; Mun, Seyoung; Oh, Man Hwan; Oh, Sejong; Ha, Jaejung; Yi, Junkoo; Lee, Seunguk; Han, Kyudong

    2016-02-01

    Chicken is one of the most popular domesticated species worldwide, as it serves an important role in agricultural as well as biomedical research fields. Because it inhabits almost every continent and presents diverse morphology and traits, the need for genetic markers to distinguish each breed for various purposes has increased. Whole genome sequencing of three different breeds (White Leghorn, Korean domestic, and Araucana) that show similar coloring patterns, with the exception of the White Leghorn breed, has confirmed previously reported genomic alterations and identified many novel variants. Additionally, the Whole Genome Re-Sequencing (WGRS) approach identified an approximately 4 kb insert within SLCO1B3 responsible for blue egg shell color. Targeted investigation of pigment-related genes corroborated previously reported non-synonymous mutations, and provided deeper insight into chicken coloring, where not a single but a combination of non-synonymous mutations in the MC1R gene is likely to be responsible for altered feather coloring. PMID:26853871

  12. Early insights into the genome sequence of Uromyces fabae

    Directory of Open Access Journals (Sweden)

    Tobias Link

    2014-10-01

    Full Text Available Uromyces fabae is a major pathogen of broad bean, Vicia faba. U. fabae has served as a model among rust fungi to elucidate the development of infection structures, expression and secretion of cell wall degrading enzymes and gene expression. Using U. fabae, enormous progress was made regarding nutrient uptake and metabolism and in the search for secreted proteins and effectors. Here, we present results from a genome survey of U. fabae. Paired end Illumina sequencing provided 53 Gb of data. An assembly gave 59,735 scaffolds with a total length of 216 Mb. K-mer analysis estimated the genome size to be 329 Mb. Of a representative set of 23,153 predicted proteins we could annotate 10,209, and predict 599 secreted proteins. Clustering of the protein set indicates families of highly likely effectors. We also found new homologs of RTP1p, a prototype rust effector. The U. fabae genome will be an important resource for comparative analyses with U. appendiculatus and P. pachyrhizi and provide information regarding the phylogenetic relationship of the genus Uromyces with respect to other rust fungi already sequenced, namely Puccinia graminis f. sp. tritici, P. striiformis f. sp. tritici, Melampsora lini, and Melampsora larici-populina.

  13. The Complete Chloroplast Genome Sequences of Five Epimedium Species: Lights into Phylogenetic and Taxonomic Analyses

    OpenAIRE

    Zhang, Yanjun; Du, Liuwen; Liu, Ao; Chen, Jianjun; Wu, Li; Hu, Weiming; Zhang, Wei; Kim, Kyunghee; Lee, Sang-Choon; Yang, Tae-Jin; Wang, Ying

    2016-01-01

    Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. We here sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology via a combination of de novo and reference-guided assembly, which was also the first comprehensive cp genome analysis on Epimedium combining the cp genome sequence of E. koreanum previously reported. The five Epimedium cp genomes exhibited typical quadripartite and circular struct...

  14. Coverage Bias and Sensitivity of Variant Calling for Four Whole-genome Sequencing Technologies

    OpenAIRE

    Nora Rieber; Marc Zapatka; Bärbel Lasitschka; David Jones; Paul Northcott; Barbara Hutter; Natalie Jäger; Marcel Kool; Michael Taylor; Peter Lichter; Stefan Pfister; Stephan Wolf; Benedikt Brors; Roland Eils

    2013-01-01

    The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina's HiSeq2000, Life Technologies' SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics' technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms...

  15. WebGMAP: a web service for mapping and aligning cDNA sequences to genomes

    OpenAIRE

    Liang, Chun; Liu, Lin; Ji, Guoli

    2009-01-01

    The genomes of thousands of organisms are being sequenced, often with accompanying sequences of cDNAs or ESTs. One of the great challenges in bioinformatics is to make these genomic sequences and genome annotations accessible in a user-friendly manner to general biologists to address interesting biological questions. We have created an open-access web service called WebGMAP (http://www.bioinfolab.org/software/webgmap) that seamlessly integrates cDNA-genome alignment tools, such as GMAP, with ...

  16. Open access to sequence: Browsing the Pichia pastoris genome

    OpenAIRE

    Graf Alexandra; Lin Yao-Cheng; Rouzé Pierre; Callewaert Nico; Mattanovich Diethard; Redl Andreas; Tiels Petra; Gasser Brigitte; De Schutter Kristof

    2009-01-01

    Abstract The first genome sequences of the important yeast protein production host Pichia pastoris have been released into the public domain this spring. In order to provide the scientific community easy and versatile access to the sequence, two web-sites have been installed as a resource for genomic sequence, gene and protein information for P. pastoris: A GBrowse based genome browser was set up at http://www.pichiagenome.org and a genome portal with gene annotation and browsing functionalit...

  17. Complete Genome Sequence of the First Chinese Virulent Infectious Laryngotracheitis Virus

    OpenAIRE

    Kong, Congcong; Zhao, Yan; Cui, Xianlan; Zhang, Xiaomin; Cui, Hongyu; Xue, Mei; Wang, Yunfeng

    2013-01-01

    Infectious laryngotracheitis (ILT) is an acute respiratory disease caused by infectious laryngotracheitis virus (ILTV). The complete genome sequences of five attenuated ILTV vaccine strains and six virulent ILTV strains as well as two Australian ILTV field strains have been published in Australia and the USA so far. To provide the complete genome sequence information of ILTVs from different geographic regions, the whole genome of ILTV LJS09 isolated in China was sequenced. The genome of ILTV ...

  18. Sequence Collection - TMBETA-GENOME | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available TM...ultiple chromosomes, the entry set for each chromosome is given individually. Data file File name: tm...beta_genome_sequence_collection.zip File URL: ftp://ftp.biosciencedbc.jp/archive/tmbeta-genome/LATEST/tm...beta_genome_sequence_collection.zip File size: 8.8 KB Simple search URL http://togodb.biosci...

  19. Draft Genome Sequence of Escherichia coli Strain Nissle 1917 (Serovar O6:K5:H1)

    OpenAIRE

    Cress, Brady F.; Linhardt, Robert J.; Koffas, Mattheos A. G.

    2013-01-01

    We announce the availability of the 5.023-Mbp high-quality draft assembly of the Escherichia coli strain Nissle 1917 (serovar O6:K5:H1) genome. Short genomic segments from this important probiotic strain have been available in public databases, but the full genome sequence has remained inaccessible. Thus, high-coverage, whole genome sequencing of E. coli Nissle 1917 is presented herein. Reannotation and metabolic reconstruction will enable comparative genomics analysis and model-guided predic...

  20. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome.

    OpenAIRE

    Byrappa Venkatesh; Kirkness, Ewen F.; Yong-Hwee Loh; Halpern, Aaron L; Lee, Alison P.; Justin Johnson; Nidhi Dandona; Viswanathan, Lakshmi D; Alice Tay; J Craig Venter; Strausberg, Robert L; Sydney Brenner

    2007-01-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of t...

  1. The canine hookworm genome: analysis and classification of Ancylostoma caninum survey sequences

    OpenAIRE

    Abubucker, Sahar; Martin, John; Yin, Yong; Fulton, Lucinda; Yang, Shiaw-Pyng; Hallsworth-Pepin, Kym; Johnston, J. Spencer; Hawdon, John; McCarter, James P.; Wilson, Richard K.; Mitreva, Makedonka

    2007-01-01

    Hookworms infect nearly a billion people. The Ancylostoma caninum hookworm of canids is a model for studying human infections and information from its genome coupled with functional genomics and proteomics can accelerate progress towards hookworm control. As a step towards a full-scale A. caninum genome project, we generated 104,000 genome survey sequences (GSSs) and determined the genome size of the canine hookworm. GSSs assembled into 57.6 Mb of unique sequence from a ge...

  2. The zebrafish reference genome sequence and its relationship to the human genome.

    Science.gov (United States)

    Howe, Kerstin; Clark, Matthew D; Torroja, Carlos F; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T; Guerra-Assunção, José A; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F; Laird, Gavin K; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Elliot, David; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Begum, Sharmin; Mortimore, Beverley; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Lloyd, Christine; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; 
Woodmansey, Rebecca; Clark, Graham; Cooper, James D; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Lanz, Christa; Raddatz, Günter; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Schuster, Stephan C; Carter, Nigel P; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M J; Enright, Anton; Geisler, Robert; Plasterk, Ronald H A; Lee, Charles; Westerfield, Monte; de Jong, Pieter J; Zon, Leonard I; Postlethwait, John H; Nüsslein-Volhard, Christiane; Hubbard, Tim J P; Roest Crollius, Hugues; Rogers, Jane; Stemple, Derek L

    2013-04-25

    Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination. PMID:23594743

  3. The sequence and analysis of a Chinese pig genome

    Directory of Open Access Journals (Sweden)

    Fang Xiaodong

    2012-11-01

    Full Text Available Abstract Background The pig is an economically important food source, amounting to approximately 40% of all meat consumed worldwide. Pigs also serve as an important model organism because of their similarity to humans at the anatomical, physiological and genetic level, making them very useful for studying a variety of human diseases. A pig strain of particular interest is the miniature pig, specifically the Wuzhishan pig (WZSP), as it has been extensively inbred. Its high level of homozygosity offers increased ease for selective breeding for specific traits and a more straightforward understanding of the genetic changes that underlie its biological characteristics. WZSP also serves as a promising means for applications in surgery, tissue engineering, and xenotransplantation. Here, we report the sequencing and analysis of an inbreeding WZSP genome. Results Our results reveal some unique genomic features, including a relatively high level of homozygosity in the diploid genome, an unusual distribution of heterozygosity, an over-representation of tRNA-derived transposable elements, a small amount of porcine endogenous retrovirus, and a lack of type C retroviruses. In addition, we carried out systematic research on gene evolution, together with a detailed investigation of the counterparts of human drug target genes. Conclusion Our results provide the opportunity to more clearly define the genomic character of pig, which could enhance our ability to create more useful pig models.

  4. IDENTIFICATION OF AVIAN-SPECIFIC FECAL METAGENOMIC SEQUENCES USING GENOME FRAGMENT ENRICHMENTS

    Science.gov (United States)

    Sequence analysis of microbial genomes has provided biologists the opportunity to compare genetic differences between closely related microorganisms. While random sequencing has also been used to study natural microbial communities, metagenomic comparisons via sequencing analysis...

  5. Low-depth shotgun sequencing resolves complete mitochondrial genome sequence of Labeo rohita.

    Science.gov (United States)

    Das, Sofia P; Bit, Amrita; Patnaik, Siddhi; Sahoo, L; Meher, P K; Jayasankar, P; Saha, T M; Patel, A B; Patel, Namrata; Koringa, P; Joshi, C G; Agarwal, Suyash; Pandey, Manmohan; Srivastava, Shreya; Kushwaha, B; Kumar, Ravindra; Nagpure, N S; Iquebal, M A; Jaiswal, Sarika; Kumar, Dinesh; Jena, J K; Das, P

    2016-09-01

    Labeo rohita, popularly known as rohu, is a widely cultured species throughout the Indian subcontinent. In the present study, we used an in-silico approach to resolve the complete mitochondrial genome of rohu. Low-depth shotgun sequencing using Roche 454 GS FLX (Branford, Connecticut, USA) followed by de novo assembly in CLC Genomics Workbench version 7.0.4 (Aarhus, Denmark) revealed the complete mitogenome of L. rohita to be 16 606 bp long (accession No. KR185963). It comprises 13 protein-coding genes, 22 tRNAs, 2 rRNAs and 1 putative control region. The gene order and organization are similar to most vertebrates. The mitogenome in the present investigation has 99% similarity with previously reported mitogenomes of rohu, and this is also evident from the phylogenetic study using the maximum-likelihood (ML) tree method. This study was done to determine the feasibility, accuracy and reliability of low-depth sequence data obtained from an NGS platform as compared to Sanger sequencing. Thus, NGS technology has proven to be a competent and rapid in-silico alternative for resolving the complete mitochondrial genome sequence, thereby reducing labor and time. PMID:26260184

  6. Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences

    Directory of Open Access Journals (Sweden)

    Holland Barbara R

    2006-07-01

    Full Text Available Abstract Background Phylogenetic methods which do not rely on multiple sequence alignments are important tools in inferring trees directly from completely sequenced genomes. Here, we extend the recently described Genome BLAST Distance Phylogeny (GBDP) strategy to compute phylogenetic trees from all completely sequenced plastid genomes currently available and from a selection of mitochondrial genomes representing the major eukaryotic lineages. BLASTN, TBLASTX, or combinations of both are used to locate high-scoring segment pairs (HSPs) between two sequences from which pairwise similarities and distances are computed in different ways resulting in a total of 96 GBDP variants. The suitability of these distance formulae for phylogeny reconstruction is directly estimated by computing a recently described measure of "treelikeness", the so-called δ value, from the respective distance matrices. Additionally, we compare the trees inferred from these matrices using UPGMA, NJ, BIONJ, FastME, or STC, respectively, with the NCBI taxonomy tree of the taxa under study. Results Our results indicate that, at this taxonomic level, plastid genomes are much more valuable for inferring phylogenies than are mitochondrial genomes, and that distances based on breakpoints are of little use. Distances based on the proportion of "matched" HSP length to average genome length were best for tree estimation. Additionally we found that using TBLASTX instead of BLASTN and, particularly, combining TBLASTX and BLASTN leads to a small but significant increase in accuracy. Other factors do not significantly affect the phylogenetic outcome. The BIONJ algorithm results in phylogenies most in accordance with the current NCBI taxonomy, with NJ and FastME performing insignificantly worse, and STC performing as well if applied to high quality distance matrices. δ values are found to be a reliable predictor of phylogenetic accuracy. Conclusion Using the most treelike distance matrices, as
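The distance type this abstract found most useful, the proportion of "matched" HSP length to average genome length, can be illustrated with a simplified sketch. The code below is an assumption-laden illustration, not the published GBDP formula: it considers HSP coordinates on one sequence only, treats them as half-open intervals with non-negative start positions, and uses the hypothetical names `matched_length` and `gbdp_like_distance`.

```python
def matched_length(hsps):
    """Number of positions covered by at least one HSP.

    `hsps` is an iterable of (start, end) half-open intervals on one sequence.
    Overlapping intervals are merged so that no position is counted twice.
    """
    covered = 0
    end_of_last = 0
    for start, end in sorted(hsps):
        start = max(start, end_of_last)  # skip the part already counted
        if end > start:
            covered += end - start
            end_of_last = end
    return covered


def gbdp_like_distance(hsps, len_a, len_b):
    """One minus the ratio of matched HSP length to average genome length."""
    average_length = (len_a + len_b) / 2
    return 1.0 - matched_length(hsps) / average_length


if __name__ == "__main__":
    # Three hypothetical HSPs, two of which overlap.
    example_hsps = [(0, 100), (50, 150), (300, 400)]
    print(gbdp_like_distance(example_hsps, 1000, 1000))
```

Two genomes fully covered by HSPs would have a distance near 0 under this sketch, and genomes with no HSPs a distance of 1; the 96 variants in the paper differ in how matches are located and normalized.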

  7. The complete plastid genome sequence of Eustrephus latifolius (Asparagaceae: Lomandroideae).

    Science.gov (United States)

    Kim, Hyoung Tae; Kim, Jung Sung; Kim, Joo-Hwan

    2016-01-01

    The complete chloroplast (cp) genome sequence of Eustrephus latifolius is the first to be determined in subfamily Lomandroideae of family Asparagaceae. It is 159,736 bp long and contains a large single copy region (82,403 bp) and a small single copy region (13,607 bp), which are separated by two inverted repeat regions (31,863 bp each). In total, 132 genes were identified, consisting of 83 coding genes, 8 rRNA genes, 38 tRNA genes, and 3 pseudogenes. rpl23 and clpP were pseudogenes due to sequence deletions. Among 23 genes containing introns, rps12 and ycf3 contained two introns and the rest had just one intron. The intact ycf68 was identified within an intron of trnI-GAU. The amino acid sequence was almost identical with that of Phoenix dactylifera in Arecales. Ycf1 of E. latifolius was completely located in the IR. It was similar to the cp genome structure of Lemna minor, Spirodela polyrhiza, Wolffiella lingulata, and Wolffia australiana in Alismatales. PMID:25186113
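The region sizes in this abstract can be checked against the reported total, since a quadripartite plastid genome consists of one large single copy region, one small single copy region, and two copies of the inverted repeat. A minimal sketch of that check follows; the function name is illustrative, and the values are those given in the abstract.

```python
def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastid genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Values reported for Eustrephus latifolius (in bp).
TOTAL = plastome_length(lsc=82_403, ssc=13_607, ir=31_863)

# Gene categories reported in the abstract: coding, rRNA, tRNA, pseudogenes.
GENE_COUNT = 83 + 8 + 38 + 3

if __name__ == "__main__":
    print(TOTAL)       # 159736, matching the reported 159,736 bp
    print(GENE_COUNT)  # 132, matching the reported total
```

The sums agree with the reported genome length and gene total, which confirms that the 31,863 bp figure refers to each inverted repeat copy rather than to both combined.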

  8. The complete genome sequence of the Atlantic salmon paramyxovirus (ASPV)

    International Nuclear Information System (INIS)

    The complete RNA genome of the Atlantic salmon paramyxovirus (ASPV), isolated from Atlantic salmon suffering from proliferative gill inflammation (PGI), has been determined. The genome is 16,965 nucleotides in length and consists of six nonoverlapping genes in the order 3'- N - P/C/V - M - F - HN - L -5', coding for the nucleocapsid, phospho-, matrix, fusion, hemagglutinin-neuraminidase and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and trinucleotide intergenic regions similar to those of other Paramyxoviridae. The ASPV P-gene expression strategy is like that of the respiro- and morbilliviruses, which express the phosphoprotein from the primary transcript, and edit a portion of the mRNA to encode the accessory proteins V and W. It also encodes the C-protein by ribosomal choice of translation initiation. Pairwise comparisons of amino acid identities, and phylogenetic analysis of deduced ASPV protein sequences with homologous sequences from other Paramyxoviridae, show that ASPV has an affinity for the genus Respirovirus, but may represent a new genus within the subfamily Paramyxovirinae

  9. Bacillus anthracis genome organization in light of whole transcriptome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.; Bergman, Nicholas; Borodovsky, Mark

    2010-03-22

    Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g. codon adaptation index and RBS score related to gene expression? We used the whole transcriptome shotgun sequencing of a bacterial pathogen Bacillus anthracis to assess correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, as well as with a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessment of accuracy of current annotation of protein-coding genes in the B. anthracis genome we have shown that the transcriptome data indicate existence of more than a hundred genes missing in the annotation though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.

  10. Sequencing ebola and marburg viruses genomes using microarrays.

    Science.gov (United States)

    Hardick, Justin; Woelfel, Roman; Gardner, Warren; Ibrahim, Sofi

    2016-08-01

    Periodic outbreaks of Ebola and Marburg hemorrhagic fevers have occurred in Africa over the past four decades with case fatality rates reaching as high as 90%. The latest Ebola outbreak in West Africa in 2014 raised concerns that these infections can spread across continents and pose serious health risks. Early and accurate identification of the causative agents is necessary to contain outbreaks. In this report, we describe a sequencing-by-hybridization (SBH) technique using high density microarrays to identify Ebola and Marburg viruses. The microarrays were designed to interrogate the sequences of entire viral genomes, and were evaluated with three species of Ebolavirus (Reston, Sudan, and Zaire), and three strains of Marburgvirus (Angola, Musoke, and Ravn). The results showed that the consensus sequences generated with four or more hybridizations had 92.1-98.9% accuracy over 95-99% of the genomes. Additionally, with SBH microarrays it was possible to distinguish between different strains of the Lake Victoria Marburgvirus. J. Med. Virol. 88:1303-1308, 2016. © 2016 Wiley Periodicals, Inc. PMID:26822839

  11. The complete chloroplast genome sequence of Alocasia macrorrhizos.

    Science.gov (United States)

    Wang, Bin; Han, Limin

    2016-09-01

    The complete chloroplast sequence of Alocasia macrorrhizos is 154 995 bp in length, containing a pair of inverted repeats of 25 944 bp separated by a large single-copy (LSC) region and a small single-copy (SSC) region of 87 366 bp and 15 741 bp, respectively. The chloroplast genome encodes 132 predicted functional genes, including 87 protein-coding genes, four ribosomal RNA genes, and 37 transfer RNA genes, 18 of which are duplicated in the inverted repeat regions. Among these genes, 16 contain a single intron and two contain two introns. A maximum-likelihood phylogenetic analysis using the complete chloroplast genome revealed that A. macrorrhizos does not belong to the Araceae family, which implies that A. macrorrhizos is distant from the species in the Araceae family. PMID:26258514

  12. The complete chloroplast genome sequence of Brachypodium distachyon: sequence comparison and phylogenetic analysis of eight grass plastomes

    Directory of Open Access Journals (Sweden)

    Anderson Olin D

    2008-07-01

    Full Text Available Abstract Background Wheat, barley, and rye, of tribe Triticeae in the Poaceae, are among the most important crops worldwide but they present many challenges to genomics-aided crop improvement. Brachypodium distachyon, a close relative of those cereals has recently emerged as a model for grass functional genomics. Sequencing of the nuclear and organelle genomes of Brachypodium is one of the first steps towards making this species available as a tool for researchers interested in cereals biology. Findings The chloroplast genome of Brachypodium distachyon was sequenced by a combinational approach using BAC end and shotgun sequences derived from a selected BAC containing the entire chloroplast genome. Comparative analysis indicated that the chloroplast genome is conserved in gene number and organization with respect to those of other cereals. However, several Brachypodium genes evolve at a faster rate than those in other grasses. Sequence analysis reveals that rice and wheat have a ~2.1 kb deletion in their plastid genomes and this deletion must have occurred independently in both species. Conclusion We demonstrate that BAC libraries can be used to sequence plastid, and likely other organellar, genomes. As expected, the Brachypodium chloroplast genome is very similar to those of other sequenced grasses. The phylogenetic analyses and the pattern of insertions and deletions in the chloroplast genome confirmed that Brachypodium is a close relative of the tribe Triticeae. Nevertheless, we show that some large indels can arise multiple times and may confound phylogenetic reconstruction.

  13. MultiPipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences

    OpenAIRE

    Schwartz, Scott; Elnitski, Laura; Li, Mei; Weirauch, Matt; Riemer, Cathy; Smit, Arian; Green, Eric D; Hardison, Ross C.; Miller, Webb

    2003-01-01

    Analysis of multiple sequence alignments can generate important, testable hypotheses about the phylogenetic history and cellular function of genomic sequences. We describe the MultiPipMaker server, which aligns multiple, long genomic DNA sequences quickly and with good sensitivity (available at http://bio.cse.psu.edu/ since May 2001). Alignments are computed between a contiguous reference sequence and one or more secondary sequences, which can be finished or draft sequence. The outputs includ...

  14. Complete plastid genome sequence of Vaccinium macrocarpon: structure, gene content and rearrangements revealed by next generation sequencing

    Science.gov (United States)

    The complete plastid genome sequence of the American cranberry was reconstructed using next-generation sequencing data by in silico procedures. We used Roche 454 shotgun sequence data to isolate cranberry plastid-specific sequences of the cultivar ‘HyRed’ via homology comparisons with complete seque...

  15. Twenty-One Genome Sequences from Pseudomonas Species and 19 Genome Sequences from Diverse Bacteria Isolated from the Rhizosphere and Endosphere of Populus deltoides

    Energy Technology Data Exchange (ETDEWEB)

    Brown, Steven D [ORNL; Utturkar, Sagar M [ORNL; Klingeman, Dawn Marie [ORNL; Johnson, Courtney M [ORNL; Martin, Stanton [ORNL; Land, Miriam L [ORNL; Lu, Tse-Yuan [ORNL; Schadt, Christopher Warren [ORNL; Doktycz, Mitchel John [ORNL; Pelletier, Dale A [ORNL

    2012-01-01

    To aid in the investigation of the Populus deltoides microbiome, we generated draft genome sequences for twenty-one Pseudomonas and nineteen other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Burkholderia, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium and Variovorax were generated.

  16. Countering Gattaca: Efficient and Secure Testing of Fully-Sequenced Human Genomes (Full Version)

    OpenAIRE

    Baldi, Pierre; Baronio, Roberta; De Cristofaro, Emiliano; Gasti, Paolo; Tsudik, Gene

    2011-01-01

    Recent advances in DNA sequencing technologies have put ubiquitous availability of fully sequenced human genomes within reach. It is no longer hard to imagine the day when everyone will have the means to obtain and store one's own DNA sequence. Widespread and affordable availability of fully sequenced genomes immediately opens up important opportunities in a number of health-related fields. In particular, common genomic applications and tests performed in vitro today will soon be conducted co...

  17. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    OpenAIRE

    Momchilo Vuyisich; Ayesha Arefin; Karen Davenport; Shihai Feng; Cheryl Gleasner; Kim McMurry; Beverly Parson-Quintana; Jennifer Price; Matthew Scholz; Patrick Chain

    2014-01-01

    Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the util...

  18. Organization and Evolution of Primate Centromeric DNA from Whole-Genome Shotgun Sequence Data

    OpenAIRE

    Alkan, Can; Eichler, Evan E.; Ventura, Mario; Archidiacono, Nicoletta; Rocchi, Mariano; Sahinalp, S Cenk

    2007-01-01

    Author Summary Centromeric DNA has been described as the last frontier of genomic sequencing; such regions are typically poorly assembled during the whole-genome shotgun sequence assembly process due to their repetitive complexity. This paper develops a computational algorithm to systematically extract data regarding primate centromeric DNA structure and organization from that ∼5% of sequence that is not included as part of standard genome sequence assemblies. Using this computational approac...

  19. Draft Sequence of the Rice Genome: A Milestone Publication

    Institute of Scientific and Technical Information of China (English)

    Guo Haiyan; Zhao Baohua

    2002-01-01

    Following China's announcement of its completion of the draft genome sequence of the rice indica subspecies on October 12, 2001 (page 126, Bulletin of the Chinese Academy of Sciences Vol. 15 No. 3), Chinese scientists published their findings in the April 5 issue of Science, the journal of the American Association for the Advancement of Science (AAAS). In the same issue, a team of scientists with the Switzerland-based Syngenta company reported a similar achievement for another major rice subspecies, japonica.

  20. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence

    DEFF Research Database (Denmark)

    Cole, S.T.; Krogh, Anders Stærmose

    1998-01-01

    Countless millions of people have died from tuberculosis, a chronic infectious disease caused by the tubercle bacillus. The complete genome sequence of the best-characterized strain of Mycobacterium tuberculosis, H37Rv, has been determined and analysed in order to improve our understanding of the....... tuberculosis differs radically from other bacteria in that a very large portion of its coding capacity is devoted to the production of enzymes involved in lipogenesis and lipolysis, and to two new families of glycine-rich proteins with a repetitive structure that may represent a source of antigenic variation....

  1. Quantification of read species behavior within whole genome sequencing of cancer genomes for the stratification and visualization of genomic variation.

    Science.gov (United States)

    Hibsh, Dror; Buetow, Kenneth H; Yaari, Gur; Efroni, Sol

    2016-05-19

    The cancer genome is an abnormal genome, and the ability to monitor its sequence has undergone a technological revolution. Yet prognosis and diagnosis remain expert-based decisions, with only limited ability to provide machine-based decisions. We introduce a heterogeneity-based method for stratifying and visualizing whole-genome sequencing (WGS) reads. This method uses the heterogeneity within WGS reads to markedly reduce the dimensionality of next-generation sequencing data; it is available through the tool HiBS (Heterogeneity-Based Subclassification), which allows cancer sample classification. We validated HiBS using >200 WGS samples from nine different cancer types from The Cancer Genome Atlas (TCGA). With HiBS, we show progress with two WGS-related issues: (i) differentiation between normal (NB) and tumor (TP) samples based solely on the information structure of their WGS data, and (ii) identification of specific regions of chromosomal amplification/deletion and their association with tumor stage. By comparing results to those obtained through available WGS analysis tools, we demonstrate some of the novelties obtained by the approach implemented in HiBS and also show nearly perfect normal/tumor classification, used to identify known and unknown chromosomal aberrations. Finally, the HiBS index has been associated with breast cancer tumor stage. PMID:26809676

  2. Defining and Evaluating a Core Genome Multilocus Sequence Typing Scheme for Whole-Genome Sequence-Based Typing of Listeria monocytogenes

    OpenAIRE

    Ruppitsch, Werner; Pietzka, Ariane; Prior, Karola; Bletz, Stefan; Fernandez, Haizpea Lasa; Allerberger, Franz; Harmsen, Dag; Mellmann, Alexander

    2015-01-01

    Whole-genome sequencing (WGS) has emerged today as an ultimate typing tool to characterize Listeria monocytogenes outbreaks. However, data analysis and interlaboratory comparability of WGS data are still challenging for most public health laboratories. Therefore, we have developed and evaluated a new L. monocytogenes typing scheme based on genome-wide gene-by-gene comparisons (core genome multilocus sequence typing [cgMLST]) to allow for a unique typing nomenclature. Initially, we determi...

  3. Optimized Protocol for Simple Extraction of High-Quality Genomic DNA from Clostridium difficile for Whole-Genome Sequencing

    OpenAIRE

    Sim, James Heng Chiak; Anikst, Victoria; Lohith, Akshar; Pourmand, Nader; Banaei, Niaz

    2015-01-01

    Successful sequencing of the Clostridium difficile genome requires high-quality genomic DNA (gDNA) as the starting material. gDNA extraction using conventional methods is laborious. We describe here an optimized method for the simple extraction of C. difficile gDNA using the QIAamp DNA minikit, which yielded high-quality sequence reads on the Illumina MiSeq platform.

  4. Effective Normalization for Copy Number Variation Detection from Whole Genome Sequencing

    OpenAIRE

    Janevski Angel; Varadan Vinay; Kamalakaran Sitharthan; Banerjee Nilanjana; Dimitrova Nevenka

    2012-01-01

    Abstract Background: Whole genome sequencing enables a high resolution view of the human genome and provides unique insights into genome structure at an unprecedented scale. There have been a number of tools to infer copy number variation in the genome. These tools, while validated, also include a number of parameters that are configurable to genome data being analyzed. These algorithms allow for normalization to account for individual and population-specific effects on individual genome CNV e...

  5. Whole-genome sequence variation, population structure and demographic history of the Dutch population

    NARCIS (Netherlands)

    The Genome of the Netherlands Consortium; Marschall, T.; Schoenhuth, A.

    2014-01-01

    Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring

  6. Genome Sequence of the Banana Pathogen Dickeya zeae Strain MS1, Which Causes Bacterial Soft Rot

    OpenAIRE

    Zhang, Jing-Xin; Lin, Bi-Run; Shen, Hui-Fang; Pu, Xiao-Ming

    2013-01-01

    We report a draft genome sequence of Dickeya zeae strain MS1, which is the causative agent of banana soft rot in China, and we show several of its specific properties compared with those of other D. zeae strains. Genome sequencing provides a tool for understanding the genomic determination of the pathogenicity and phylogeny placement of this pathogen.

  7. Complete Genome Sequence of Porcine Parvovirus 2 Recovered from Swine Sera

    OpenAIRE

    F.S. Campos; Kluge, M.; A.C. Franco; Giongo, A.; Valdez, F. P.; Saddi, T. M.; Brito, W. M. E. D.; Roehe, P.M.

    2016-01-01

    A complete genomic sequence of porcine parvovirus 2 (PPV-2) was detected by viral metagenome analysis on swine sera. A phylogenetic analysis of this genome reveals that it is highly similar to previously reported North American PPV-2 genomes. The complete PPV-2 sequence is 5,426 nucleotides long.

  8. Complete Genome Sequence of Porcine Parvovirus 2 Recovered from Swine Sera.

    Science.gov (United States)

    Campos, F S; Kluge, M; Franco, A C; Giongo, A; Valdez, F P; Saddi, T M; Brito, W M E D; Roehe, P M

    2016-01-01

    A complete genomic sequence of porcine parvovirus 2 (PPV-2) was detected by viral metagenome analysis on swine sera. A phylogenetic analysis of this genome reveals that it is highly similar to previously reported North American PPV-2 genomes. The complete PPV-2 sequence is 5,426 nucleotides long. PMID:26823583

  9. Genome Sequence of Candida tropicalis no. 121, Used for RNA Production

    OpenAIRE

    Li, Bingbing; Guo, Ting; Chen, Yong; Xie, Jingjing; Niu, Huanqing; Liu, Dong; Cheng, Jian; Chen, Xiaochun; Wu, Jinglan; Zhuang, Wei; Zhu, Chenjie; Ying, Hanjie

    2014-01-01

    We report here the complete genome sequence of Candida tropicalis no. 121. C. tropicalis no. 121 is a high-RNA-producing strain obtained by mutagenesis in our laboratory. The complete genome sequence was determined using the Illumina HiSeq 2000 and contains 6,415 genes. The genome size of C. tropicalis no. 121 is >15.3 Mb.

  10. Draft Genome Sequence of Microbacterium sp. Strain HM58-2, Which Hydrolyzes Acylhydrazides

    Science.gov (United States)

    Akiyama, Tomonori; Ishige, Taichiro; Kanesaki, Yu; Ito, Shinsaku; Oinuma, Ken-Ichi; Takaya, Naoki; Sasaki, Yasuyuki

    2016-01-01

    We report the draft genome sequence of Microbacterium sp. strain HM58-2, which produces hydrazidase, an enzyme hydrolyzing acylhydrazides. The estimated genome size is 3.9 Mb. Genome sequence information of this strain will help to identify an assimilating mechanism of nonnatural compounds in this strain and to develop ecological applications. PMID:27313297

  11. Complete Genome Sequences of the Potyvirus Sweet potato virus 2 from East Timor and Australia

    Science.gov (United States)

    Maina, Solomon; Edwards, Owain R.; de Almeida, Luis; Ximenes, Abel

    2016-01-01

    We present here the first complete genome sequences of Sweet potato virus 2 (SPV2) from sweet potato in Australia and East Timor, and compare these with five complete SPV2 genome sequences from South Korea and one each from Spain and the United States. Both were closely related to SPV2 genomes from South Korea, Spain, and the United States. PMID:27257208

  12. Draft Genome Sequence of Lactobacillus gorillae Strain KZ01T, Isolated from a Western Lowland Gorilla

    OpenAIRE

    TSUCHIDA, Sayaka; Nezuo, Maiko; Tsukahara, Masatoshi; Ogura, Yoshitoshi; Hayashi, Tetsuya; Ushida, Kazunari

    2015-01-01

    Here, we report the draft genome sequence of Lactobacillus gorillae strain KZ01T isolated from a western lowland gorilla (Gorilla gorilla gorilla). This genome sequence will be helpful for the comparative genomics between human and nonhuman primate-associated Lactobacillus.

  13. Draft Genome Sequence of Lactobacillus gorillae Strain KZ01T, Isolated from a Western Lowland Gorilla.

    Science.gov (United States)

    Tsuchida, Sayaka; Nezuo, Maiko; Tsukahara, Masatoshi; Ogura, Yoshitoshi; Hayashi, Tetsuya; Ushida, Kazunari

    2015-01-01

    Here, we report the draft genome sequence of Lactobacillus gorillae strain KZ01(T) isolated from a western lowland gorilla (Gorilla gorilla gorilla). This genome sequence will be helpful for the comparative genomics between human and nonhuman primate-associated Lactobacillus. PMID:26472838

  14. Draft Genome Sequence of Rickettsia sp. Strain MEAM1, Isolated from the Whitefly Bemisia tabaci

    OpenAIRE

    Rao, Qiong; Wang, Shuang; Zhu, Dan-Tong; Wang, Xiao-Wei; Liu, Shu-Sheng

    2012-01-01

    We report the draft genome sequence of the Rickettsia sp. strain MEAM1, which is a facultative symbiont from an invasive species of the whitefly Bemisia tabaci. The total length of the assembled genome is approximately 1.24 Mb, with 335 scaffolds and 1,247 coding sequences predicted within the genome.

  15. Draft Genome Sequence of Zhouia amylolytica AD3, Isolated from Tidal Flat Sediment

    Science.gov (United States)

    Jia, Baolei; Jin, Hyun Mi; Lee, Hyo Jung

    2016-01-01

    Zhouia amylolytica AD3 was isolated from tidal flat sediment at Taean, South Korea. We report here the draft genome sequence of Z. amylolytica AD3, which is the first report of a genome sequence of the genus Zhouia. The genomic information will provide a better understanding of the physiology, adaptation, and evolution of Zhouia species. PMID:27151796

  16. Mapping genomic features to functional traits through microbial whole genome sequences.

    Science.gov (United States)

    Zhang, Wei; Zeng, Erliang; Liu, Dan; Jones, Stuart E; Emrich, Scott

    2014-01-01

    Recently, the utility of trait-based approaches for microbial communities has been identified. The increasing availability of whole genome sequences provides the opportunity to explore the genetic foundations of a variety of functional traits. We proposed a machine learning framework to quantitatively link genomic features with functional traits. Genes from bacterial genomes belonging to different functional traits were grouped into Clusters of Orthologs (COGs), which were used as features. Then, the TF-IDF technique from the text mining domain was applied to transform the data to accommodate the abundance and importance of each COG. After TF-IDF processing, COGs were ranked using feature selection methods to identify their relevance to the functional trait of interest. Extensive experimental results demonstrated that functional trait related genes can be detected using our method. Further, the method has the potential to provide novel biological insights. PMID:24989863
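
    To make the TF-IDF step concrete, the sketch below weights COG counts per genome so that COGs present in every genome score zero while COGs restricted to a few genomes score higher. It is a minimal illustration, not the authors' pipeline: the abstract does not state which TF-IDF variant was used, so the common tf x log(N/df) form is assumed here, and the genome names, COG identifiers and counts are invented for the example.

```python
import math


def tf_idf(counts_by_genome):
    """Map {genome: {cog: count}} to {genome: {cog: tf-idf weight}}.

    Assumed variant: tf = count / total count in the genome,
    idf = log(number of genomes / number of genomes containing the COG).
    """
    n_genomes = len(counts_by_genome)

    # Document frequency: in how many genomes does each COG occur?
    df = {}
    for counts in counts_by_genome.values():
        for cog, count in counts.items():
            if count > 0:
                df[cog] = df.get(cog, 0) + 1

    weights = {}
    for genome, counts in counts_by_genome.items():
        total = sum(counts.values())
        weights[genome] = {
            cog: (count / total) * math.log(n_genomes / df[cog])
            for cog, count in counts.items()
            if count > 0
        }
    return weights


# Hypothetical toy data: three genomes described by COG counts.
toy_counts = {
    "genome_1": {"COG_A": 2, "COG_B": 2},
    "genome_2": {"COG_A": 4},
    "genome_3": {"COG_A": 1, "COG_C": 1},
}
toy_weights = tf_idf(toy_counts)
for genome, cogs in toy_weights.items():
    print(genome, cogs)
```

    In this toy input COG_A occurs in all three genomes and therefore receives weight zero everywhere, whereas COG_B and COG_C, each confined to one genome, receive positive weights and would be the ones passed on to a feature-selection step.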

  17. Identification of transcribed sequences in the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Gardiner, K.

    1992-12-01

    The workshop was held at the National Institutes of Mental Health, Bethesda, Maryland, on October 4 and 5, 1991. Twenty-four investigators attended from England, Germany and the United States. The topics discussed included: genome sequence analysis using computer-assisted detection of open reading frames, splice sites and hexamer patterns; direct exon identification using trapping of internal and 3' exons, and a recombination-based system; cDNA library construction and screening, including the use of normalization and subtraction procedures, Alu and splice donor site PCR from hybrid cell lines, and microdissection clones as probes; use of labeled cDNAs as probes to screen lambda and cosmid libraries; and sequencing of random cDNAs.

  18. Genomic insights into methanotrophy: the complete genome sequence of Methylococcus capsulatus (Bath).

    Directory of Open Access Journals (Sweden)

    Naomi Ward

    2004-10-01

    Full Text Available Methanotrophs are ubiquitous bacteria that can use the greenhouse gas methane as a sole carbon and energy source for growth, thus playing major roles in global carbon cycles, and in particular, substantially reducing emissions of biologically generated methane to the atmosphere. Despite their importance, and in contrast to organisms that play roles in other major parts of the carbon cycle such as photosynthesis, no genome-level studies have been published on the biology of methanotrophs. We report the first complete genome sequence to our knowledge from an obligate methanotroph, Methylococcus capsulatus (Bath), obtained by the shotgun sequencing approach. Analysis revealed a 3.3-Mb genome highly specialized for a methanotrophic lifestyle, including redundant pathways predicted to be involved in methanotrophy and duplicated genes for essential enzymes such as the methane monooxygenases. We used phylogenomic analysis, gene order information, and comparative analysis with the partially sequenced methylotroph Methylobacterium extorquens to detect genes of unknown function likely to be involved in methanotrophy and methylotrophy. Genome analysis suggests the ability of M. capsulatus to scavenge copper (including a previously unreported nonribosomal peptide synthetase) and to use copper in regulation of methanotrophy, but the exact regulatory mechanisms remain unclear. One of the most surprising outcomes of the project is evidence suggesting the existence of previously unsuspected metabolic flexibility in M. capsulatus, including an ability to grow on sugars, oxidize chemolithotrophic hydrogen and sulfur, and live under reduced oxygen tension, all of which have implications for methanotroph ecology. The availability of the complete genome of M. capsulatus (Bath) deepens our understanding of methanotroph biology and its relationship to global carbon cycles. We have gained evidence for greater metabolic flexibility than was previously known, and for

  19. Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and Genome Analyzer systems

    OpenAIRE

    Minoche, André E.; Dohm, Juliane C; Himmelbauer, Heinz

    2011-01-01

    Background: The generation and analysis of high-throughput sequencing data are becoming a major component of many studies in molecular biology and medical research. Illumina's Genome Analyzer (GA) and HiSeq instruments are currently the most widely used sequencing devices. Here, we comprehensively evaluate properties of genomic HiSeq and GAIIx data derived from two plant genomes and one virus, with read lengths of 95 to 150 bases. Results: We provide quantifications and evidence for GC bias, er...

  20. A sample view of the pedunculate oak (Quercus robur) genome from the sequencing of hypomethylated and random genomic libraries

    OpenAIRE

    Lesur, I.; Durand, J.; Sebastiani, F.; Gyllenstrand, N; Bodénès, C; Lascoux, Martin; Kremer, A; Vendramin, GG; Plomion, C

    2011-01-01

    Genomic resources have recently been developed for a number of species of Fagaceae, with the purpose of identifying the genetic factors underlying the adaptation of these long-lived, biologically predominant, commercially and ecologically important species to their environment. The sequencing of genomes of the size of the oak genome (740 Mb/C) is now becoming both possible and affordable due to breakthroughs in sequencing technology. However, an understanding of the composition and structure ...

  1. High-density rhesus macaque oligonucleotide microarray design using early-stage rhesus genome sequence information and human genome annotations

    OpenAIRE

    Magness Charles L; Thomas Matthew J; Proll Sean C; Paeper Bryan; Korth Marcus J; Wallace James C; Iadonato Shawn P; Nelson Charles; Katze Michael G

    2007-01-01

    Abstract Background: Until recently, few genomic reagents specific for non-human primate research have been available. To address this need, we have constructed a macaque-specific high-density oligonucleotide microarray by using highly fragmented low-pass sequence contigs from the rhesus genome project together with the detailed sequence and exon structure of the human genome. Using this method, we designed oligonucleotide probes to over 17,000 distinct rhesus/human gene orthologs and increase...

  2. Optical mapping as a routine tool for bacterial genome sequence finishing

    Directory of Open Access Journals (Sweden)

    Gaudriault Sophie

    2007-09-01

    Full Text Available Abstract Background: In sequencing the genomes of two Xenorhabdus species, we encountered a large number of sequence repeats and assembly anomalies that stalled finishing efforts. This included a stretch of about 12 kb that is over 99.9% identical between the plasmid and chromosome of X. nematophila. Results: Whole genome restriction maps of the sequenced strains were produced through optical mapping technology. These maps allowed rapid resolution of sequence assembly problems, permitted closing of the genome, and allowed correction of a large inversion in a genome assembly that we had considered finished. Conclusion: Our experience suggests that routine use of optical mapping in bacterial genome sequence finishing is warranted. When combined with data produced through 454 sequencing, an optical map can rapidly and inexpensively generate an ordered and oriented set of contigs to produce a nearly complete genome sequence assembly.

  3. The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    Directory of Open Access Journals (Sweden)

    Nicholas R Thomson

    2006-12-01

    Full Text Available The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype O:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus.
Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common

  4. Relationships of wild and domesticated rices (Oryza AA genome species) based upon whole chloroplast genome sequences

    OpenAIRE

    Wambugu, Peterson W.; Marta Brozynska; Agnelo Furtado; Daniel L. Waters; Robert J. Henry

    2015-01-01

    Rice is the most important crop in the world, acting as the staple food for over half of the world’s population. The evolutionary relationships of cultivated rice and its wild relatives have remained contentious and inconclusive. Here we report on the use of whole chloroplast sequences to elucidate the evolutionary and phylogenetic relationships in the AA genome Oryza species, representing the primary gene pool of rice. This is the first study that has produced a well resolved and strongly su...

  5. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction.

    Science.gov (United States)

    Brøndum, R F; Su, G; Janss, L; Sahana, G; Guldbrandtsen, B; Boichard, D; Lund, M S

    2015-06-01

    This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle plus the total merit index itself. Depending on the trait's economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage disequilibrium and assaying performance on the array, a total of 1,623 QTL markers were selected for inclusion on the custom chip. Genomic prediction analyses were performed for Nordic and French Holstein and Nordic Red animals using either a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model including the QTL markers in the analysis, reliability was increased by up to 4 percentage points for production traits in Nordic Holstein animals, up to 3 percentage points for Nordic Reds, and up to 5 percentage points for French Holstein. Smaller gains of up to 1 percentage point were observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model, accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included.
Results from this study indicate that the reliability of genomic prediction can be increased by including markers significant in genome-wide association studies on whole genome

  6. Complete Genome Sequence of the Mycobacterium immunogenum Type Strain CCUG 47286

    Science.gov (United States)

    Jaén-Luchoro, Daniel; Seguí, Carolina; Aliaga-Lozano, Francisco; Salvà-Serra, Francisco; Busquets, Antonio; Gomila, Margarita; Ramírez, Antonio; Ruiz, Mikel; Lalucat, Jorge

    2016-01-01

    Here, we report the complete genome sequence of Mycobacterium immunogenum type strain CCUG 47286, a nontuberculous mycobacterium. The whole genome has 5,573,781 bp and covers as many as 5,484 predicted genes. This genome contributes to the task of closing the still-existing gap of genomes of rapidly growing mycobacterial type strains. PMID:27231356

  7. Draft genome sequences of two closely-related aflatoxigenic Aspergillus species obtained from the Ivory Coast

    Science.gov (United States)

    The genomes of the A. ochraceoroseus and A. rambellii type strains were sequenced using a personal genome machine, followed by annotation of their genes. The genome size for A. ochraceoroseus was found to be approximately 23 Mb and contained 7,837 genes, while the A. rambellii genome was found to be...

  8. First Complete Genome Sequence of Pepper vein yellows virus from Australia.

    Science.gov (United States)

    Maina, Solomon; Edwards, Owain R; Jones, Roger A C

    2016-01-01

    We present here the first complete genomic RNA sequence of the polerovirus Pepper vein yellows virus (PeVYV) obtained from a pepper plant in Australia. We compare it with complete PeVYV genomes from Japan and China. The Australian genome was more closely related to the Japanese than the Chinese genome. PMID:27231375

  9. First Complete Genome Sequence of Pepper vein yellows virus from Australia

    Science.gov (United States)

    Maina, Solomon; Edwards, Owain R.

    2016-01-01

    We present here the first complete genomic RNA sequence of the polerovirus Pepper vein yellows virus (PeVYV) obtained from a pepper plant in Australia. We compare it with complete PeVYV genomes from Japan and China. The Australian genome was more closely related to the Japanese than the Chinese genome. PMID:27231375

  10. Genome sequence surveys of Brachiola algerae and Edhazardia aedis reveal microsporidia with low gene densities.

    Science.gov (United States)

    Microsporidia are well known models of extreme nuclear genome reduction and compaction. The smallest microsporidian genomes have received the most attention, but with a size range of 2.3 Mb to 19.5 Mb the nature of the larger genomes remains unknown. Here we have undertaken genome sequence surveys ...

  11. Complete genome sequence of a virulent Streptococcus agalactiae strain 138P isolated from diseased Nile tilapia

    Science.gov (United States)

    The complete genome of a virulent Streptococcus agalactiae strain 138P is 1838701 bp in size, containing 1831 genes. The genome has 1593 coding sequences, 152 pseudo genes, 16 rRNAs, 69 tRNAs, and 1 non-coding RNA. The annotation of the genome is added by the NCBI Prokaryotic Genome Annotation Pipel...

  12. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  13. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    Science.gov (United States)

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  14. Genome Sequence Analyses of Pseudomonas savastanoi pv. glycinea and Subtractive Hybridization-Based Comparative Genomics with Nine Pseudomonads

    OpenAIRE

    Qi, Mingsheng; Wang, Dongping; Bradley, Carl A.; Zhao, Youfu

    2011-01-01

    Bacterial blight, caused by Pseudomonas savastanoi pv. glycinea (Psg), is a common disease of soybean. In an effort to compare a current field isolate with one isolated in the early 1960s, the genomes of two Psg strains, race 4 and B076, were sequenced using 454 pyrosequencing. The genomes of both Psg strains share more than 4,900 highly conserved genes, indicating very low genetic diversity between Psg genomes. Though conserved, genome rearrangements and recombination events occur commonly w...

  15. The complete mitochondrial genome sequence of Aspergillus flavus.

    Science.gov (United States)

    Yan, Zhengsong; Chen, Dan; Shen, Yiping; Ye, Baodong

    2016-07-01

    Aspergillus flavus is a haploid filamentous fungus that is common in the environment and has been implicated in human infections. The complete mitochondrial genome of A. flavus has been determined by high-throughput sequencing technology in this work. Our study revealed that the mitochondrial genome of A. flavus is 31,602 bp long, with an A + T content of 74.83%, and consists of a usual set of mitochondrial protein and RNA genes, including large and small ribosomal RNAs, 15 proteins, and 20 tRNA genes, and contains two introns. Notably, it also contains two hypothetical proteins without obvious homology to any known proteins. All structural genes are located on one strand and are apparently transcribed in one direction. Codon usage analysis indicated that all protein coding genes employ the standard fungal mitochondrial start and stop codons, and the nucleotide bias toward AT was also reflected in the codon usage. The complete mitochondrial genome of A. flavus should be useful for future investigation of the genetics, evolution, and clinical identification of Aspergillus species. PMID:25922962
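
    The A + T content quoted in this abstract is the share of A and T bases among all bases of the sequence. A minimal Python sketch of that calculation follows, assuming the sequence is available as a plain string; the function name `at_content` and the toy sequence are illustrative assumptions, not data or code from the paper.

```python
def at_content(sequence: str) -> float:
    """Percentage of A and T among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc  # ambiguous symbols such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * at / total


if __name__ == "__main__":
    toy = "ATTAGCATTA"  # hypothetical sequence, not real A. flavus data
    print(f"A + T content: {at_content(toy):.2f}%")
```

    Ignoring ambiguous bases is a design choice of this sketch; other tools may count them in the denominator.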

  16. The Pinus taeda genome is characterized by diverse and highly diverged repetitive sequences

    Directory of Open Access Journals (Sweden)

    Yandell Mark

    2010-07-01

    Full Text Available Abstract Background: In today's age of genomic discovery, no attempt has been made to comprehensively sequence a gymnosperm genome. The largest genus in the coniferous family Pinaceae is Pinus, whose 110-120 species have extremely large genomes (c. 20-40 Gb, 2N = 24). The size and complexity of these genomes have prompted much speculation as to the feasibility of completing a conifer genome sequence. Conifer genomes are reputed to be highly repetitive, but there is little information available on the nature and identity of repetitive units in gymnosperms. The pines have extensive genetic resources, with approximately 329000 ESTs from eleven species and genetic maps in eight species, including a dense genetic map of the twelve linkage groups in Pinus taeda. Results: We present here the Sanger sequence and annotation of ten P. taeda BAC clones and Genome Analyzer II whole genome shotgun (WGS) sequences representing 7.5% of the genome. Computational annotation of ten BACs predicts three putative protein-coding genes and at least fifteen likely pseudogenes in nearly one megabase of sequence. We found three conifer-specific LTR retroelements in the BACs, and tentatively identified at least 15 others based on evidence from the distantly related angiosperms. Alignment of WGS sequences to the BACs indicates that 80% of BAC sequences have similar copies (≥ 75% nucleotide identity) elsewhere in the genome, but only 23% have identical copies (99% identity). The three most common repetitive elements in the genome were identified and, when combined, represent less than 5% of the genome. Conclusions: This study indicates that the majority of repeats in the P. taeda genome are 'novel' and will therefore require additional BAC or genomic sequencing for accurate characterization. The pine genome contains a very large number of diverged and probably defunct repetitive elements. This study also provides new evidence that sequencing a pine genome using a WGS approach is

  17. Illumina Synthetic Long Read Sequencing Allows Recovery of Missing Sequences even in the “Finished” C. elegans Genome

    Science.gov (United States)

    Li, Runsheng; Hsieh, Chia-Ling; Young, Amanda; Zhang, Zhihong; Ren, Xiaoliang; Zhao, Zhongying

    2015-01-01

    Most next-generation sequencing platforms permit acquisition of high-throughput DNA sequences, but the relatively short read length limits their use in genome assembly or finishing. Illumina has recently released a technology called Synthetic Long-Read Sequencing that can produce reads of unusual length, i.e., predominately around 10 Kb. However, a systematic assessment of their use in genome finishing and assembly is still lacking. We evaluate the promise and deficiency of the long reads in these aspects using isogenic C. elegans genome with no gap. First, the reads are highly accurate and capable of recovering most types of repetitive sequences. However, the presence of tandem repetitive sequences prevents pre-assembly of long reads in the relevant genomic region. Second, the reads are able to reliably detect missing but not extra sequences in the C. elegans genome. Third, the reads of smaller size are more capable of recovering repetitive sequences than those of bigger size. Fourth, at least 40 Kbp missing genomic sequences are recovered in the C. elegans genome using the long reads. Finally, an N50 contig size of at least 86 Kbp can be achieved with 24×reads but with substantial mis-assembly errors, highlighting a need for novel assembly algorithm for the long reads. PMID:26039588
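
    The N50 contig size mentioned in this abstract is a standard assembly statistic: the contig length L such that contigs of length L or longer together cover at least half of the total assembled bases. A minimal Python sketch follows; the function name `n50` and the toy contig lengths are illustrative assumptions and do not come from the C. elegans assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the
    assembly size.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer comparison avoids float rounding
            return length


if __name__ == "__main__":
    toy_contigs = [80, 70, 50, 40, 30, 20]  # hypothetical lengths in Kbp
    print("N50:", n50(toy_contigs), "Kbp")
```

    A high N50 says nothing about correctness, which is why the abstract pairs the 86 Kbp figure with a warning about mis-assembly errors.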

  18. Prediction of causative genomic relationships using sequence data of five French and Danish dairy cattle breeds

    DEFF Research Database (Denmark)

    van den Berg, Irene; Boichard, Didier; Lund, Mogens Sandø

    The increasing number of sequenced individuals makes the use of sequences for genomic prediction feasible. With the current SNP chips, the accuracy of genomic selection is limited in populations with low linkage disequilibrium, as is the case in across-breed prediction in dairy cattle. Sequence...... distance to the causative mutations increased. This increase was larger across breed than within breed...

  19. Complete Genome Sequence of a New H9N2 Avian Influenza Virus Isolated in China

    OpenAIRE

    Wang, Jing-Yu; Ren, Juan-Juan; Liu, Wan-Hua; Tang, Pan; Wu, Ning; Wang, Chi-Young; Chang, Ching-Dong; Liu, Hung-Jen

    2013-01-01

    The complete genomic sequence of a new H9N2 avian influenza virus (AIV), isolated in northwestern China, was determined. Sequence and phylogenetic analyses based on the sequences of eight genomic segments revealed that the isolate is phylogenetically related to the Y280-like sublineage.

  20. Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data

    NARCIS (Netherlands)

    Bryant, J.M.; Schürch, A.C.; Deutekom, van H.; Harris, S.R.; Beer, de J.L.; Jager, de V.C.L.; Kremer, K.; Hijum, van S.A.F.T.; Siezen, R.J.; Borgdorff, M.; Bentley, S.D.; Parkhill, J.; Soolingen, van D.

    2013-01-01

    BACKGROUND: Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate knowledge of th

  1. Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data

    NARCIS (Netherlands)

    J.M. Bryant (Josephine); A. Schürch (Anita); H. van Deutekom (Henk); S.R. Harris (Simon); J.L. de Beer (Jessica); V. de Jager (Victor); K. Kremer (Kristin); S.A.F.T. van Hijum (Sacha); R.J. Siezen (Roland); M.W. Borgdorff (Martien ); S.D. Bentley (Stephen); J. Parkhill (Julian); D. van Soolingen (Dick)

    2013-01-01

    Background: Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate kno

  2. Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data.

    NARCIS (Netherlands)

    Bryant, J.M.; Schurch, A.C.; Deutekom, H. van; Harris, S.R.; Beer, J.L. de; Jager, V. de; Kremer, K.; Hijum, S.A.F.T. van; Siezen, R.J.; Borgdorff, M.; Bentley, S.D.; Parkhill, J.; Soolingen, D. van

    2013-01-01

    BACKGROUND: Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate knowledge of th

  3. Complete Genome Sequences of Three Ebola Virus Isolates from the 2014 Outbreak in West Africa

    OpenAIRE

    Hoenen, T.; Groseth, A.; Feldmann, F.; Marzi, A.; Ebihara, H.; Kobinger, G.; Günther, S. (Stefan); Feldmann, H.

    2014-01-01

    Here, we report the complete genome sequences, including the genome termini, of three Ebola virus isolates (species Zaire ebolavirus) originating from Guinea that are now being widely used in laboratories in North America for research regarding West African Ebola viruses.

  4. Draft Genome Sequence of Lysinibacillus sp. Strain A1, Isolated from Malaysian Tropical Soil

    OpenAIRE

    Chan, Kok-Gan; Chen, Jian Woon; Chang, Chien-Yi; Yin, Wai-Fong; Chan, Xin-Yue

    2015-01-01

    In this work, we describe the genome of Lysinibacillus sp. strain A1, which was isolated from tropical soil. Analysis of its genome sequence shows the presence of a gene encoding for a putative peptidase responsible for nitrogen compounds.

  5. Draft Genome Sequence of Paenibacillus pini JCM 16418T, Isolated from the Rhizosphere of Pine Tree

    OpenAIRE

    Yuki, Masahiro; Oshima, Kenshiro; Suda, Wataru; Oshida, Yumi; Kitamura, Keiko; Iida, Toshiya; Hattori, Masahira; Ohkuma, Moriya

    2014-01-01

    Paenibacillus pini strain JCM 16418T is a cellulolytic bacterium isolated from the rhizosphere of pine trees. Here, we report the draft genome sequence of this strain. This genome information will be useful for studies of rhizosphere bacteria.

  6. Complete Genome Sequences of Chrysanthemum Stunt Viroid from a Single Chrysanthemum Cultivar

    Science.gov (United States)

    Choi, Hoseong; Jo, Yeonhwa; Yoon, Ju-Yeon; Choi, Seung-Kook

    2015-01-01

    The chrysanthemum stunt viroid (CSVd), a member of the genus Pospiviroid with a single circular RNA genome, infects many chrysanthemum species. Here, we report 25 complete genome sequences of CSVd in a single chrysanthemum cultivar, revealing 20 variants. PMID:26251489

  7. Accurate Prediction of the Statistics of Repetitions in Random Sequences: A Case Study in Archaea Genomes.

    Science.gov (United States)

    Régnier, Mireille; Chassignet, Philippe

    2016-01-01

    Repetitive patterns in genomic sequences have great biological significance and also algorithmic implications. Analytic combinatorics allows one to derive formulas for the expected length of repetitions in a random sequence. Asymptotic results, which generalize previous work on a binary alphabet, are easily computable. Simulations on random sequences show their accuracy. As an application, the sample case of Archaea genomes illustrates how biological sequences may differ from random sequences. PMID:27376057
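
    The kind of simulation described in this abstract can be sketched in Python: generate a uniform random sequence, measure its longest repeated substring, and compare the result with a prediction. The sketch below uses the classical leading-order estimate 2 log_k(n) for a uniform alphabet of size k, which is an assumption standing in for the refined formulas of the paper (not reproduced in this record); the function names are likewise illustrative.

```python
import math
import random


def has_repeat(seq: str, length: int) -> bool:
    """True if some substring of this length occurs at least twice (overlaps allowed)."""
    seen = set()
    for i in range(len(seq) - length + 1):
        word = seq[i:i + length]
        if word in seen:
            return True
        seen.add(word)
    return False


def longest_repeat_length(seq: str) -> int:
    """Length of the longest substring that occurs at least twice.

    If a repeat of length L exists, one of length L - 1 exists too, so the
    length can be grown one step at a time until no repeat is found.
    This is fine for random sequences, where repeats are short, but slow
    for highly repetitive input.
    """
    length = 0
    while has_repeat(seq, length + 1):
        length += 1
    return length


if __name__ == "__main__":
    random.seed(0)
    n, alphabet = 20_000, "ACGT"
    seq = "".join(random.choices(alphabet, k=n))
    observed = longest_repeat_length(seq)
    estimate = 2 * math.log(n, len(alphabet))  # leading-order heuristic only
    print(f"observed longest repeat: {observed}, 2*log_k(n) = {estimate:.1f}")
```

    Running the same measurement on a real genome and on a random sequence of equal length and composition is one simple way to see how far biological sequences depart from the random model.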

  8. Comparative analysis of whole-genome sequences of Streptococcus suis

    Institute of Scientific and Technical Information of China (English)

    LI Pengli; WEI Wu; LI Yixue; MA Yuanyuan; DING Guohui; LI Xiaoping; WANG Xiaojing; ZHANG Liwen; SUN Jingchun; WANG Yong; TU Kang; WANG Ningning; HAO Pei; WANG Chuan; CAO Zhiwei; SHI Tieliu

    2006-01-01

    The recent outbreak of Streptococcus suis in some districts of Sichuan Province in China has caused over 30 deaths and over 200 infections in human beings. In order to study the pathogenicity mechanism and to prevent the bacteria from spreading and infecting human beings and swine, we have annotated and analyzed the genomes of two strains, Streptococcus suis P1/7 and 89-1591. The whole genome of P1/7 is 2.007 Mb in length and has 1969 ORFs. In contrast, the partial genome sequence of 89-1591 is 1.98 Mb in length and exists in 177 contigs with 1918 ORFs. Analysis shows that the average lengths of CDSs in the two genomes are very close, and that the two strains share 1306 homologous ORFs. Most of the toxicity factors of the two strains are homologous, but there are still some significant differences between the two strains. For example, among the 11 genes (cps2A-cps2K) encoding the capsule in P1/7, 4 (cps2A, 2B, 2I, 2J) are not detected in strain 89-1591. At the same time, the genes encoding EF and haemolysin in P1/7 are also not found in strain 89-1591. Besides, the genes related to DNA replication, repair and recombination differ significantly between the strains, and there also exist certain differences among the surface proteins. These characteristics indicate that the two strains have evolved their own specific functions to adapt to different environments and that the pathogenesis of the two strains is different. We have accumulated comprehensive genomics information for future systematic studies of S. suis. Our results are helpful for disease prevention, vaccine development, as well as drug design for S. suis.

  9. Complete sequence of the mitochondrial genome of Odontamblyopus rubicundus (Perciformes: Gobiidae): genome characterization and phylogenetic analysis

    Indian Academy of Sciences (India)

    Tianxing Liu; Xiaoxiao Jin; Rixin Wang; Tianjun Xu

    2013-12-01

    Odontamblyopus rubicundus is a species of gobiid fish that inhabits muddy-bottomed coastal waters. In this paper, the first complete mitochondrial genome sequence of O. rubicundus is reported. The complete mitochondrial genome sequence is 17119 bp in length and contains 13 protein-coding genes, two rRNA genes, 22 tRNA genes, a control region and an L-strand origin, as in other teleosts. Most mitochondrial genes are encoded on the H-strand, except for ND6 and seven tRNA genes. Some overlaps, ranging from 1 to 7 bp, occur in protein-coding genes and tRNAs. The possibly nonfunctional L-strand origin folded into a typical stem-loop secondary structure, and a conserved motif (5′-GCCGG-3′) was found at the base of the stem within the $tRNA^{Cys}$ gene. The TAS, CSB-2 and CSB-3 could be detected in the control region. However, in contrast to most other fishes, the central conserved sequence block domain and the CSB-1 could not be recognized in O. rubicundus, which is consistent with Acanthogobius hasta (Gobiidae). In addition, phylogenetic analyses based on different sequences of species of Gobiidae and different methods showed that the classification of O. rubicundus into Odontamblyopus based on morphology is debatable.

  10. The complete genome sequence and genome structure of passion fruit mosaic virus.

    Science.gov (United States)

    Song, Yeon Sook; Ryu, Ki Hyun

    2011-06-01

    In this study, we determined the complete sequence of the genomic RNA of a Florida isolate of maracuja mosaic virus (MarMV-FL) and compared it to that of a Peru isolate of the virus (MarMV-P) and those of other known tobamoviruses. Complete sequence analysis revealed that the isolate should be considered a member of a new species and named passion fruit mosaic virus (PafMV). The genomic RNA of PafMV consists of 6,791 nucleotides and encodes four open reading frames (ORFs) coding for proteins of 125 kDa (1,101 aa), 184 kDa (1,612 aa), 34 kDa (311 aa) and 18 kDa (164 aa) in consecutive order from the 5' to the 3' end. The sequence homologies of the four ORFs of PafMV were from 78.8% to 81.6% to those of MarMV-P at the amino acid level. The sequence homologies of the four ORFs of PafMV ranged from 36.0% to 77.9% and from 21.7% to 81.6% to those of other tobamoviruses, at the nucleotide and amino acid level, respectively. Phylogenetic analysis revealed that these PafMV-encoded proteins are closely related to those of MarMV-P. In conclusion, the results indicate that PafMV and MarMV-P belong to different species within the genus Tobamovirus. PMID:21547441

  11. Genome Sequencing of a Mung Bean Plant Growth Promoting Strain of P. aeruginosa with Biocontrol Ability

    OpenAIRE

    Devaraj Illakkiam; Manoharan Shankar; Paramasivan Ponraj; Jeyaprakash Rajendhran; Paramasamy Gunasekaran

    2014-01-01

    Pseudomonas aeruginosa PGPR2 is a mung bean rhizosphere strain that produces secondary metabolites and hydrolytic enzymes contributing to excellent antifungal activity against Macrophomina phaseolina, one of the prevalent fungal pathogens of mung bean. Genome sequencing was performed using the Ion Torrent Personal Genome Machine generating 1,354,732 reads (6,772,433 sequenced bases) achieving ~25-fold coverage of the genome. Reference genome assembly using MIRA 3.4.0 yielded 198 contigs. The ...

  12. Inference of Gorilla Demographic and Selective History from Whole-Genome Sequence Data

    OpenAIRE

    McManus, Kimberly F; Kelley, Joanna L.; Song, Shiya; Veeramah, Krishna R; Woerner, August E.; Stevison, Laurie S.; Ryder, Oliver A.; Great Ape Genome Project; Kidd, Jeffrey M.; Wall, Jeffrey D.; Bustamante, Carlos D.; Hammer, Michael F.

    2015-01-01

    Although population-level genomic sequence data have been gathered extensively for humans, similar data from our closest living relatives are just beginning to emerge. Examination of genomic variation within great apes offers many opportunities to increase our understanding of the forces that have differentially shaped the evolutionary history of hominid taxa. Here, we expand upon the work of the Great Ape Genome Project by analyzing medium to high coverage whole-genome sequences from 14 west...

  13. Personal Genome Sequencing in Ostensibly Healthy Individuals and the PeopleSeq Consortium

    Science.gov (United States)

    Linderman, Michael D.; Nielsen, Daiva E.; Green, Robert C.

    2016-01-01

    Thousands of ostensibly healthy individuals have had their exome or genome sequenced, but a much smaller number of these individuals have received any personal genomic results from that sequencing. We term those projects in which ostensibly healthy participants can receive sequencing-derived genetic findings and may also have access to their genomic data as participatory predispositional personal genome sequencing (PPGS). Here we are focused on genome sequencing applied in a pre-symptomatic context and so define PPGS to exclude diagnostic genome sequencing intended to identify the molecular cause of suspected or diagnosed genetic disease. In this report we describe the design of completed and underway PPGS projects, briefly summarize the results reported to date and introduce the PeopleSeq Consortium, a newly formed collaboration of PPGS projects designed to collect much-needed longitudinal outcome data. PMID:27023617

  14. Improved Complete Genome Sequence of the Extremely Radioresistant Bacterium Deinococcus radiodurans R1 Obtained Using PacBio Single-Molecule Sequencing.

    Science.gov (United States)

    Hua, Xiaoting; Hua, Yuejin

    2016-01-01

    The genome sequence of Deinococcus radiodurans R1 was published in 1999. We resequenced D. radiodurans R1 using PacBio and compared the sequence with the published one. Large insertions and single nucleotide polymorphisms (SNPs) were observed among the genome sequences. A more accurate genome sequence will be helpful to studies of D. radiodurans. PMID:27587813

  15. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    International Nuclear Information System (INIS)

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  16. Whole genome sequence and genome annotation of Colletotrichum acutatum, causal agent of anthracnose in pepper plants in South Korea

    Directory of Open Access Journals (Sweden)

    Joon-Hee Han

    2016-06-01

    Full Text Available Colletotrichum acutatum is a destructive fungal pathogen which causes anthracnose in a wide range of crops. Here we report the whole genome sequence and annotation of C. acutatum strain KC05, isolated from an infected pepper in Kangwon, South Korea. Genomic DNA from the KC05 strain was used for the whole genome sequencing using a PacBio sequencer and the MiSeq system. The KC05 genome was determined to be 52,190,760 bp in size with a G + C content of 51.73% in 27 scaffolds and to contain 13,559 genes with an average length of 1516 bp. Gene prediction and annotation were performed by incorporating RNA-Seq data. The genome sequence of the KC05 was deposited at DDBJ/ENA/GenBank under the accession number LUXP00000000.

  17. Whole genome sequence and genome annotation of Colletotrichum acutatum, causal agent of anthracnose in pepper plants in South Korea.

    Science.gov (United States)

    Han, Joon-Hee; Chon, Jae-Kyung; Ahn, Jong-Hwa; Choi, Ik-Young; Lee, Yong-Hwan; Kim, Kyoung Su

    2016-06-01

    Colletotrichum acutatum is a destructive fungal pathogen which causes anthracnose in a wide range of crops. Here we report the whole genome sequence and annotation of C. acutatum strain KC05, isolated from an infected pepper in Kangwon, South Korea. Genomic DNA from the KC05 strain was used for the whole genome sequencing using a PacBio sequencer and the MiSeq system. The KC05 genome was determined to be 52,190,760 bp in size with a G + C content of 51.73% in 27 scaffolds and to contain 13,559 genes with an average length of 1516 bp. Gene prediction and annotation were performed by incorporating RNA-Seq data. The genome sequence of the KC05 was deposited at DDBJ/ENA/GenBank under the accession number LUXP00000000. PMID:27114908

  18. Sequencing and genome annotation of honey bee microsporidia parasite, Nosema apis and comparative genome analysis with its sympatric congener, N. ceranae

    Science.gov (United States)

    Here we present a draft genome sequence and annotation of the honey bee microsporidian parasite, Nosema apis. We applied the whole-genome shotgun (WGS) sequencing approach to sequence and assemble the genome of N. apis to 22-fold sequence coverage. We predicted 2927 protein-coding genes in the N. ...

  19. Whole genome sequence analysis of unidentified genetically modified papaya for development of a specific detection method.

    Science.gov (United States)

    Nakamura, Kosuke; Kondo, Kazunari; Akiyama, Hiroshi; Ishigaki, Takumi; Noguchi, Akio; Katsumata, Hiroshi; Takasaki, Kazuto; Futo, Satoshi; Sakata, Kozue; Fukuda, Nozomi; Mano, Junichi; Kitta, Kazumi; Tanaka, Hidenori; Akashi, Ryo; Nishimaki-Mogami, Tomoko

    2016-08-15

    Identification of transgenic sequences in an unknown genetically modified (GM) papaya (Carica papaya L.) by whole genome sequence analysis was demonstrated. Whole genome sequence data were generated for a GM-positive fresh papaya fruit commodity detected in monitoring using real-time polymerase chain reaction (PCR). The sequences obtained were mapped against an open database for papaya genome sequence. Transgenic construct- and event-specific sequences were identified as a GM papaya developed to resist infection from a Papaya ringspot virus. Based on the transgenic sequences, a specific real-time PCR detection method for GM papaya applicable to various food commodities was developed. Whole genome sequence analysis enabled identifying unknown transgenic construct- and event-specific sequences in GM papaya and development of a reliable method for detecting them in papaya food commodities. PMID:27006240

  20. Comparative analysis of catfish BAC end sequences with the zebrafish genome

    OpenAIRE

    Abernathy Jason; Xu Peng; Somridhivej Benjaporn; Ninwichian Parichart; Wang Shaolin; Jiang Yanliang; Liu Hong; Kucuktas Huseyin; Liu Zhanjiang

    2009-01-01

    Abstract Background Comparative mapping is a powerful tool to transfer genomic information from sequenced genomes to closely related species for which whole genome sequence data are not yet available. However, such an approach is still very limited in catfish, the most important aquaculture species in the United States. This project was initiated to generate additional BAC end sequences and demonstrate their applications in comparative mapping in catfish. Results We reported the generation of...