WorldWideScience

Sample records for sequenced genomes reveal

  1. Registered Report: Melanoma genome sequencing reveals frequent PREX2 mutations

    OpenAIRE

    sprotocols

    2015-01-01

    Authors: Denise Chroscinski, Darryl Sampey, Alex Hewitt, The Reproducibility Project: Cancer Biology†. Abstract: The [Reproducibility Project: Cancer Biology](https://osf.io/e81xl/wiki/home/) seeks to address growing concerns about reproducibility in scientific research by conducting replications of 50 papers in the field of cancer biology published between 2010 and 2012. This Registered Report describes the proposed replication plan of key experiments from “Melanoma genome sequenci...

  2. Genomic View of Bipolar Disorder Revealed by Whole Genome Sequencing in a Genetic Isolate

    Science.gov (United States)

    Georgi, Benjamin; Craig, David; Kember, Rachel L.; Liu, Wencheng; Lindquist, Ingrid; Nasser, Sara; Brown, Christopher; Egeland, Janice A.; Paul, Steven M.; Bućan, Maja

    2014-01-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders. PMID:24625924

  3. Genomic view of bipolar disorder revealed by whole genome sequencing in a genetic isolate.

    Directory of Open Access Journals (Sweden)

    Benjamin Georgi

    2014-03-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders.

  4. Genome sequence of Thermofilum pendens reveals an exceptional loss of biosynthetic pathways without genome reduction

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Anderson, Iain; Rodriguez, Jason; Susanti, Dwi; Porat, Iris; Reich, Claudia; Ulrich, Luke E.; Elkins, James G.; Mavromatis, Kostas; Lykidis, Athanasios; Kim, Edwin; Thompson, Linda S.; Nolan, Matt; Land, Miriam; Copeland, Alex; Lapidus, Alla; Lucas, Susan; Detter, Chris; Zhulin, Igor B.; Olsen, Gary J.; Whitman, William; Mukhopadhyay, Biswarup; Bristow, James; Kyrpides, Nikos

    2008-01-01

    We report the complete genome of Thermofilum pendens, a deep-branching, hyperthermophilic member of the order Thermoproteales within the archaeal kingdom Crenarchaeota. T. pendens is a sulfur-dependent, anaerobic heterotroph isolated from a solfatara in Iceland. It is an extracellular commensal, requiring an extract of Thermoproteus tenax for growth, and the genome sequence reveals that biosynthetic pathways for purines, most amino acids, and most cofactors are absent. In fact T. pendens has fewer biosynthetic enzymes than obligate intracellular parasites, although it does not display other features common among obligate parasites and thus does not appear to be in the process of becoming a parasite. It appears that T. pendens has adapted to life in an environment rich in nutrients. T. pendens was known to utilize peptides as an energy source, but the genome reveals substantial ability to grow on carbohydrates. T. pendens is the first crenarchaeote and only the second archaeon found to have a transporter of the phosphotransferase system. In addition to fermentation, T. pendens may gain energy from sulfur reduction with hydrogen and formate as electron donors. It may also be capable of sulfur-independent growth on formate with formate hydrogenlyase. Additional novel features are the presence of a monomethylamine:corrinoid methyltransferase, the first time this enzyme has been found outside of Methanosarcinales, and a presenilin-related protein. Predicted highly expressed proteins do not include housekeeping genes, and instead include ABC transporters for carbohydrates and peptides, and CRISPR-associated proteins.

  5. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic

  6. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic diversity
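
The read-frequency cut-off described in this abstract can be illustrated with a minimal sketch. It is not the authors' in-house pipeline: the `VariantCall` record, the example positions and read counts, and the choice to treat the 30 % threshold as inclusive are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical record for one candidate variant at one genome position."""
    position: int      # illustrative coordinate, not taken from the study
    alt_reads: int     # reads supporting the variant allele
    total_reads: int   # all reads covering the position


def passes_cutoff(call: VariantCall, cutoff_percent: int = 30) -> bool:
    """True if the variant read frequency is at least cutoff_percent.

    Integer arithmetic avoids floating-point edge cases at the threshold.
    Treating the cut-off as inclusive is an assumption.
    """
    if call.total_reads == 0:
        return False
    return call.alt_reads * 100 >= cutoff_percent * call.total_reads


def confident_variants(calls, cutoff_percent: int = 30):
    """Keep only the calls whose read frequency reaches the cut-off."""
    return [c for c in calls if passes_cutoff(c, cutoff_percent)]


# Made-up example calls: 12 %, 30 % and 45 % variant read frequency.
example_calls = [
    VariantCall(position=1000, alt_reads=12, total_reads=100),
    VariantCall(position=2000, alt_reads=30, total_reads=100),
    VariantCall(position=3000, alt_reads=45, total_reads=100),
]

kept_positions = [c.position for c in confident_variants(example_calls)]
print(kept_positions)
```

With these made-up counts, only the calls at 30 % and 45 % are retained; the 12 % call falls below the threshold and would be treated as unreliable under this sketch.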

  7. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with or without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and that integrated analysis of WGS and RNA-Seq can facilitate the interpretation of the large number of genomic alterations detected in cancer genomes.

  8. A specific indel marker for the Philippines Schistosoma japonicum revealed by analysis of mitochondrial genome sequences.

    Science.gov (United States)

    Li, Juan; Chen, Fen; Sugiyama, Hiromu; Blair, David; Lin, Rui-Qing; Zhu, Xing-Quan

    2015-07-01

    In the present study, near-complete mitochondrial (mt) genome sequences for Schistosoma japonicum from different regions in the Philippines and Japan were amplified and sequenced. Comparisons among S. japonicum from the Philippines, Japan, and China revealed a geographically based length difference in mt genomes, but the mt genomic organization and gene arrangement were the same. Sequence differences among samples from the Philippines and all samples from the three endemic areas were 0.57-2.12 and 0.76-3.85 %, respectively. The most variable part of the mt genome was the non-coding region. In the coding portion of the genome, protein-coding genes varied more than rRNA genes and tRNAs. The near-complete mt genome sequences for Philippine specimens were identical in length (14,091 bp) which was 4 bp longer than those of S. japonicum samples from Japan and China. This indel provides a unique genetic marker for S. japonicum samples from the Philippines. Phylogenetic analyses based on the concatenated amino acids of 12 protein-coding genes showed that samples of S. japonicum clustered according to their geographical origins. The identified mitochondrial indel marker will be useful for tracing the source of S. japonicum infection in humans and animals in Southeast Asia.

  9. Targeted Genome Sequencing Reveals Varicella-Zoster Virus Open Reading Frame 12 Deletion.

    Science.gov (United States)

    Cohrs, Randall J; Lee, Katherine S; Beach, Addilynn; Sanford, Bridget; Baird, Nicholas L; Como, Christina; Graybill, Chiharu; Jones, Dallas; Tekeste, Eden; Ballard, Mitchell; Chen, Xiaomi; Yalacki, David; Frietze, Seth; Jones, Kenneth; Lenac Rovis, Tihana; Jonjić, Stipan; Haas, Jürgen; Gilden, Don

    2017-10-15

    The neurotropic herpesvirus varicella-zoster virus (VZV) establishes a lifelong latent infection in humans following primary infection. The low abundance of VZV nucleic acids in human neurons has hindered an understanding of the mechanisms that regulate viral gene transcription during latency. To overcome this critical barrier, we optimized a targeted capture protocol to enrich VZV DNA and cDNA prior to whole-genome/transcriptome sequence analysis. Since the VZV genome is remarkably stable, it was surprising to detect that VZV32, a VZV laboratory strain with no discernible growth defect in tissue culture, contained a 2,158-bp deletion in open reading frame (ORF) 12. Consequently, ORF 12 and 13 protein expression was abolished and Akt phosphorylation was inhibited. The discovery of the ORF 12 deletion, revealed through targeted genome sequencing analysis, points to the need to authenticate the VZV genome when the virus is propagated in tissue culture. IMPORTANCE Viruses isolated from clinical samples often undergo genetic modifications when cultured in the laboratory. Historically, VZV is among the most genetically stable herpesviruses, a notion supported by more than 60 complete genome sequences from multiple isolates and following multiple in vitro passages. However, application of enrichment protocols to targeted genome sequencing revealed the unexpected deletion of a significant portion of VZV ORF 12 following propagation in cultured human fibroblast cells. While the enrichment protocol did not introduce bias in either the virus genome or transcriptome, the findings indicate the need for authentication of VZV by sequencing when the virus is propagated in tissue culture. Copyright © 2017 American Society for Microbiology.

  10. cDNA sequences reveal considerable gene prediction inaccuracy in the Plasmodium falciparum genome

    Directory of Open Access Journals (Sweden)

    Valenzuela Jesus G

    2007-07-01

    Abstract Background The completion of the Plasmodium falciparum genome represents a milestone in malaria research. The genome sequence allows for the development of genome-wide approaches such as microarray and proteomics that will greatly facilitate our understanding of the parasite biology and accelerate new drug and vaccine development. Design and application of these genome-wide assays, however, require accurate information on gene prediction and genome annotation. Unfortunately, the genes in the parasite genome databases were mostly identified using computer software that could make some erroneous predictions. Results We aimed to obtain cDNA sequences to examine the accuracy of gene prediction in silico. We constructed cDNA libraries from mixed blood stages of P. falciparum parasite using the SMART cDNA library construction technique and generated 17332 high-quality expressed sequence tags (EST), including 2198 from primer-walking experiments. Assembly of our sequence tags produced 2548 contigs and 2671 singletons versus 5220 contigs and 5910 singletons when our EST were assembled with EST in public databases. Comparison of all the assembled EST/contigs with predicted CDS and genomic sequences in the PlasmoDB database identified 356 genes with predicted coding sequences fully covered by EST, including 85 genes (23.6%) with introns incorrectly predicted. Careful automatic software and manual alignments found an additional 308 genes that have introns different from those predicted, with 152 new introns discovered and 182 introns with sizes or locations different from those predicted. Alternative spliced and antisense transcripts were also detected. Matching cDNA to predicted genes also revealed silent chromosomal regions, mostly at subtelomere regions. Conclusion Our data indicated that approximately 24% of the genes in the current databases were predicted incorrectly, although some of these inaccuracies could represent alternatively

  11. De novo assembly of genomes from long sequence reads reveals uncharted territories of Propionibacterium freudenreichii.

    Science.gov (United States)

    Deptula, Paulina; Laine, Pia K; Roberts, Richard J; Smolander, Olli-Pekka; Vihinen, Helena; Piironen, Vieno; Paulin, Lars; Jokitalo, Eija; Savijoki, Kirsi; Auvinen, Petri; Varmanen, Pekka

    2017-10-16

    Propionibacterium freudenreichii is an industrially important bacterium granted Generally Recognized as Safe (GRAS) status owing to its long history of safe use in food bioprocesses. Despite its recognized role in the food industry and in the production of vitamin B12, as well as its documented health-promoting potential, P. freudenreichii remained poorly characterised at the genomic level. At present, only three complete genome sequences are available for the species. We used the PacBio RS II sequencing platform to generate complete genomes of 20 P. freudenreichii strains and compared them in detail. Comparative analyses revealed both sequence conservation and genome organisational diversity among the strains. Assembly from long reads resulted in the discovery of additional circular elements: two putative conjugative plasmids and three active, lysogenic bacteriophages. It also permitted characterisation of the CRISPR-Cas systems. The use of the PacBio sequencing platform allowed identification of DNA modifications, which in turn allowed characterisation of the restriction-modification systems together with their recognition motifs. The observed genomic differences suggested strain variation in surface piliation and specific mucus binding, which were validated by experimental studies. The phenotypic characterisation displayed large diversity between the strains in their ability to utilise a range of carbohydrates, to grow under unfavourable conditions and to form a biofilm. The complete genome sequencing allowed detailed characterisation of the industrially important species P. freudenreichii by facilitating the discovery of previously unknown features. The results presented here lay a solid foundation for future genetic and functional genomic investigations of this actinobacterial species.

  12. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcriptional evidence. Analysis of repetitive sequences suggests that they are underrepresented in the reference assembly, reflecting an enrichment of gene-rich regions in the current assembly. Characterization of Lotus natural variation by resequencing of L. japonicus accessions and diploid Lotus species is currently ongoing, facilitated by the MG20 reference sequence...

  13. Ultradeep sequencing of a human ultraconserved region reveals somatic and constitutional genomic instability.

    Directory of Open Access Journals (Sweden)

    Anna De Grassi

    2010-01-01

    Early detection of cancer-associated genomic instability is crucial, particularly in tumour types in which this instability represents the essential underlying mechanism of tumourigenesis. Currently used methods require the presence of already established neoplastic cells because they only detect clonal mutations. In principle, parallel sequencing of single DNA filaments could reveal the early phases of tumour initiation by detecting low-frequency mutations, provided an adequate depth of coverage and an effective control of the experimental error. We applied ultradeep sequencing to estimate the genomic instability of individuals with hereditary non-polyposis colorectal cancer (HNPCC). To overcome the experimental error, we used an ultraconserved region (UCR) of the human genome as an internal control. By comparing the mutability outside and inside the UCR, we observed a tendency of the ultraconserved element to accumulate significantly fewer mutations than the flanking segments in both neoplastic and nonneoplastic HNPCC samples. No difference between the two regions was detectable in cells from healthy donors, indicating that all three HNPCC samples have mutation rates higher than the healthy genome. This is the first, to our knowledge, direct evidence of an intrinsic genomic instability of individuals with heterozygous mutations in mismatch repair genes, and constitutes the proof of principle for the development of a more sensitive molecular assay of genomic instability.

  14. The Douglas-fir genome sequence reveals specialization of the photosynthetic apparatus in Pinaceae

    Science.gov (United States)

    David B. Neale; Patrick E. McGuire; Nicholas C. Wheeler; Kristian A. Stevens; Marc W. Crepeau; Charis Cardeno; Aleksey V. Zimin; Daniela Puiu; Geo M. Pertea; U. Uzay Sezen; Claudio Casola; Tomasz E. Koralewski; Robin Paul; Daniel Gonzalez-Ibeas; Sumaira Zaman; Richard Cronn; Mark Yandell; Carson Holt; Charles H. Langley; James A. Yorke; Steven L. Salzberg; Jill L. Wegrzyn

    2017-01-01

    A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceeds that of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50...
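
The contig N50 quoted in this abstract is a standard contiguity statistic: the length L such that contigs of length L or longer together cover at least half of the total assembly. The sketch below shows that computation under the usual definition; the function name `n50` and the toy contig lengths are assumptions for illustration and are unrelated to the Douglas-fir assembly itself.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are walked from longest to shortest, accumulating length, and
    the length of the contig at which the running total first reaches half
    of the assembly size is returned.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Comparing 2 * running with total keeps the arithmetic in integers.
        if 2 * running >= total:
            return length


# Toy assembly (hypothetical lengths in bp); total is 300, half is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

For the toy lengths, the two longest contigs (80 and 70) already reach half of the 300 bp total, so the function returns 70. A higher N50 for the same total length indicates that the assembly is made of fewer, longer pieces.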

  15. Multi-region and single-cell sequencing reveal variable genomic heterogeneity in rectal cancer.

    Science.gov (United States)

    Liu, Mingshan; Liu, Yang; Di, Jiabo; Su, Zhe; Yang, Hong; Jiang, Beihai; Wang, Zaozao; Zhuang, Meng; Bai, Fan; Su, Xiangqian

    2017-11-23

    Colorectal cancer is a heterogeneous group of malignancies with complex molecular subtypes. While colon cancer has been widely investigated, studies on rectal cancer are very limited. Here, we performed multi-region whole-exome sequencing and single-cell whole-genome sequencing to examine the genomic intratumor heterogeneity (ITH) of rectal tumors. We sequenced nine tumor regions and 88 single cells from two rectal cancer patients with tumors of the same molecular classification and characterized their mutation profiles and somatic copy number alterations (SCNAs) at the multi-region and the single-cell levels. A variable extent of genomic heterogeneity was observed between the two patients, and the degree of ITH increased when analyzed at the single-cell level. We found that major SCNAs were early events in cancer development and were inherited steadily. Single-cell sequencing revealed mutations and SCNAs that were hidden in bulk sequencing. In summary, we studied the ITH of rectal cancer at regional and single-cell resolution and demonstrated that variable heterogeneity existed in two patients. The mutational scenarios and SCNA profiles of the two treatment-naïve patients with the same molecular subtype are quite different. Our results suggest that each tumor possesses its own architecture, which may result in different diagnosis, prognosis, and drug responses. Remarkable ITH exists in the two patients we have studied, providing a preliminary impression of ITH in rectal cancer.

  16. Single nucleus genome sequencing reveals high similarity among nuclei of an endomycorrhizal fungus.

    Directory of Open Access Journals (Sweden)

    Kui Lin

    2014-01-01

    Nuclei of arbuscular endomycorrhizal fungi have been described as highly diverse due to their asexual nature and absence of a single cell stage with only one nucleus. This has raised fundamental questions concerning speciation, selection and transmission of the genetic make-up to next generations. Although this concept has become textbook knowledge, it is only based on studying a few loci, including 45S rDNA. To provide a more comprehensive insight into the genetic makeup of arbuscular endomycorrhizal fungi, we applied de novo genome sequencing of individual nuclei of Rhizophagus irregularis. This revealed a surprisingly low level of polymorphism between nuclei. In contrast, within a nucleus, the 45S rDNA repeat unit turned out to be highly diverged. This finding demystifies a long-lasting hypothesis on the complex genetic makeup of arbuscular endomycorrhizal fungi. Subsequent genome assembly resulted in the first draft reference genome sequence of an arbuscular endomycorrhizal fungus. Its length is 141 Mbp, representing over 27,000 protein-coding gene models. We used the genomic sequence to reinvestigate the phylogenetic relationships of Rhizophagus irregularis with other fungal phyla. This unambiguously demonstrated that Glomeromycota are more closely related to Mucoromycotina than to their postulated sister Dikarya.

  17. Complete mitochondrial genome sequencing reveals novel haplotypes in a Polynesian population.

    Directory of Open Access Journals (Sweden)

    Miles Benton

    The high risk of metabolic disease traits in Polynesians may be partly explained by elevated prevalence of genetic variants involved in energy metabolism. The genetics of Polynesian populations has been shaped by island-hopping migration events which have possibly favoured thrifty genes. The aim of this study was to sequence the mitochondrial genome in a group of Maoris in an effort to characterise genome variation in this Polynesian population for use in future disease association studies. We sequenced the complete mitochondrial genomes of 20 non-admixed Maori subjects using Affymetrix technology. DNA diversity analyses showed the Maori group exhibited reduced mitochondrial genome diversity compared to other worldwide populations, which is consistent with historical bottleneck and founder effects. Global phylogenetic analysis positioned these Maori subjects specifically within mitochondrial haplogroup B4a1a1. Interestingly, we identified several novel variants that collectively form new and unique Maori motifs: B4a1a1c, B4a1a1a3 and B4a1a1a5. Compared to ancestral populations we observed an increased frequency of non-synonymous coding variants of several mitochondrial genes in the Maori group, which may be a result of positive selection and/or genetic drift effects. In conclusion, this study reports the first complete mitochondrial genome sequence data for a Maori population. Overall, these new data reveal novel mitochondrial genome signatures in this Polynesian population and enhance the phylogenetic picture of maternal ancestry in Oceania. The increased frequency of several mitochondrial coding variants makes them good candidates for future studies aimed at assessment of metabolic disease risk in Polynesian populations.

  18. Complete genome sequence of Thermus brockianus GE-1 reveals key enzymes of xylan/xylose metabolism.

    Science.gov (United States)

    Schäfers, Christian; Blank, Saskia; Wiebusch, Sigrid; Elleuche, Skander; Antranikian, Garabed

    2017-01-01

    Thermus brockianus strain GE-1 is a thermophilic, Gram-negative, rod-shaped and non-motile bacterium that was isolated from the Geysir geothermal area, Iceland. Like other thermophiles, Thermus species are often used as model organisms to understand the mechanism of action of extremozymes, especially focusing on their heat-activity and thermostability. Genome-specific features of T. brockianus GE-1 and their properties further help to explain processes of the adaptation of extremophiles to elevated temperatures. Here we analyze the first whole genome sequence of T. brockianus strain GE-1. Insights into the genome sequence and the methodologies that were applied during de novo assembly and annotation are given in detail. The finished genome shows a phred quality value of QV50. The complete genome size is 2.38 Mb, comprising the chromosome (2,035,182 bp), the megaplasmid pTB1 (342,792 bp) and the smaller plasmid pTB2 (10,299 bp). Gene prediction revealed 2,511 genes in total, including 2,458 protein-encoding genes, 53 RNA genes and 66 pseudogenes. A unique genomic region on megaplasmid pTB1 was identified encoding key enzymes for xylan depolymerization and xylose metabolism. This is in agreement with the growth experiments in which xylan is utilized as the sole source of carbon. Accordingly, we identified sequences encoding the xylanase Xyn10, an endoglucanase, the membrane ABC sugar transporter XylH, the xylose-binding protein XylF, the xylose isomerase XylA catalyzing the first step of xylose metabolism and the xylulokinase XylB, responsible for the second step of xylose metabolism. Our data indicate that an ancestor of T. brockianus obtained the ability to use xylose as an alternative carbon source by horizontal gene transfer.
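
The size and gene-count figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only adds up the numbers quoted above; the variable names are illustrative, and reading the 2,511 total as protein-encoding plus RNA genes (with the 66 pseudogenes counted separately) is an assumption, not something the abstract states.

```python
# Replicon sizes in bp, as quoted in the abstract.
replicons_bp = {
    "chromosome": 2_035_182,
    "megaplasmid pTB1": 342_792,
    "plasmid pTB2": 10_299,
}

total_bp = sum(replicons_bp.values())
total_mb = total_bp / 1_000_000  # 1 Mb taken as 10**6 bp

# Gene counts as quoted. Assumption: the stated total covers the
# protein-encoding and RNA genes, with pseudogenes reported on top.
protein_genes = 2_458
rna_genes = 53
genes_without_pseudogenes = protein_genes + rna_genes

print(f"total genome size: {total_bp} bp ({total_mb:.3f} Mb)")
print(f"protein + RNA genes: {genes_without_pseudogenes}")
```

The three replicons add up to 2,388,273 bp, which matches the quoted 2.38 Mb when truncated to two decimals (it rounds to 2.39 Mb), and 2,458 + 53 reproduces the stated total of 2,511 genes.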

  19. Whole genome sequencing revealed host adaptation-focused genomic plasticity of pathogenic Leptospira

    Science.gov (United States)

    Xu, Yinghua; Zhu, Yongzhang; Wang, Yuezhu; Chang, Yung-Fu; Zhang, Ying; Jiang, Xiugao; Zhuang, Xuran; Zhu, Yongqiang; Zhang, Jinlong; Zeng, Lingbing; Yang, Minjun; Li, Shijun; Wang, Shengyue; Ye, Qiang; Xin, Xiaofang; Zhao, Guoping; Zheng, Huajun; Guo, Xiaokui; Wang, Junzhi

    2016-01-01

    Leptospirosis, caused by pathogenic Leptospira spp., has recently been recognized as an emerging infectious disease worldwide. Despite its severity and global importance, knowledge about the molecular pathogenesis and virulence evolution of Leptospira spp. remains limited. Here we sequenced and analyzed 102 isolates representing global sources. High genomic variability was observed among different Leptospira species, which was attributed to massive gene gain and loss events allowing for adaptation to specific niche conditions and changing host environments. Horizontal gene transfer and gene duplication allowed the stepwise acquisition of virulence factors in pathogenic Leptospira that evolved from a recent common ancestor. More importantly, the abundant expansion of specific virulence-related protein families, such as metalloproteases-associated paralogs, was exclusively identified in pathogenic species, reflecting the importance of these protein families in the pathogenesis of leptospirosis. Our observations also indicated that positive selection played a crucial role in the adaptation of these bacteria to hosts. These novel findings may lead to greater understanding of the global diversity and virulence evolution of Leptospira spp. PMID:26833181

  20. Dynamic evolution of pathogenicity revealed by sequencing and comparative genomics of 19 Pseudomonas syringae isolates.

    Directory of Open Access Journals (Sweden)

    David A Baltrus

    2011-07-01

    Closely related pathogens may differ dramatically in host range, but the molecular, genetic, and evolutionary basis for these differences remains unclear. In many Gram-negative bacteria, including the phytopathogen Pseudomonas syringae, type III effectors (TTEs) are essential for pathogenicity, instrumental in structuring host range, and exhibit wide diversity between strains. To capture the dynamic nature of virulence gene repertoires across P. syringae, we screened 11 diverse strains for novel TTE families and coupled this nearly saturating screen with the sequencing and assembly of 14 phylogenetically diverse isolates from a broad collection of diseased host plants. TTE repertoires vary dramatically in size and content across all P. syringae clades; surprisingly few TTEs are conserved and present in all strains. Those that are likely provide basal requirements for pathogenicity. We demonstrate that functional divergence within one conserved locus, hopM1, leads to dramatic differences in pathogenicity, and we demonstrate that phylogenetics-informed mutagenesis can be used to identify functionally critical residues of TTEs. The dynamism of the TTE repertoire is mirrored by diversity in pathways affecting the synthesis of secreted phytotoxins, highlighting the likely role of both types of virulence factors in determination of host range. We used these 14 draft genome sequences, plus five additional genome sequences previously reported, to identify the core genome for P. syringae and we compared this core to that of two closely related non-pathogenic pseudomonad species. These data revealed the recent acquisition of a 1 Mb megaplasmid by a sub-clade of cucumber pathogens. This megaplasmid encodes a type IV secretion system and a diverse set of unknown proteins, which dramatically increases both the genomic content of these strains and the pan-genome of the species.

  1. Dynamic evolution of pathogenicity revealed by sequencing and comparative genomics of 19 Pseudomonas syringae isolates.

    Science.gov (United States)

    Baltrus, David A; Nishimura, Marc T; Romanchuk, Artur; Chang, Jeff H; Mukhtar, M Shahid; Cherkis, Karen; Roach, Jeff; Grant, Sarah R; Jones, Corbin D; Dangl, Jeffery L

    2011-07-01

    Closely related pathogens may differ dramatically in host range, but the molecular, genetic, and evolutionary basis for these differences remains unclear. In many Gram-negative bacteria, including the phytopathogen Pseudomonas syringae, type III effectors (TTEs) are essential for pathogenicity, instrumental in structuring host range, and exhibit wide diversity between strains. To capture the dynamic nature of virulence gene repertoires across P. syringae, we screened 11 diverse strains for novel TTE families and coupled this nearly saturating screen with the sequencing and assembly of 14 phylogenetically diverse isolates from a broad collection of diseased host plants. TTE repertoires vary dramatically in size and content across all P. syringae clades; surprisingly few TTEs are conserved and present in all strains. Those that are likely provide basal requirements for pathogenicity. We demonstrate that functional divergence within one conserved locus, hopM1, leads to dramatic differences in pathogenicity, and we demonstrate that phylogenetics-informed mutagenesis can be used to identify functionally critical residues of TTEs. The dynamism of the TTE repertoire is mirrored by diversity in pathways affecting the synthesis of secreted phytotoxins, highlighting the likely role of both types of virulence factors in determination of host range. We used these 14 draft genome sequences, plus five additional genome sequences previously reported, to identify the core genome for P. syringae and we compared this core to that of two closely related non-pathogenic pseudomonad species. These data revealed the recent acquisition of a 1 Mb megaplasmid by a sub-clade of cucumber pathogens. This megaplasmid encodes a type IV secretion system and a diverse set of unknown proteins, which dramatically increases both the genomic content of these strains and the pan-genome of the species. © 2011 Baltrus et al.

  2. Sequencing the CHO DXB11 genome reveals regional variations in genomic stability and haploidy

    DEFF Research Database (Denmark)

    Kaas, Christian Schrøder; Kristensen, Claus; Betenbaugh, Michael J.

    2015-01-01

    in eight additional analyzed CHO genomes (15-20% haploidy) but not in the genome of the Chinese hamster. The dhfr gene is confirmed to be haploid in CHO DXB11; transcriptionally active and the remaining allele contains a G410C point mutation causing a Thr137Arg missense mutation. We find similar to 2.......5 million single nucleotide polymorphisms (SNPs), 44 gene deletions in the CHO DXB11 genome and 9357 SNPs, which interfere with the coding regions of 3458 genes. Copy number variations for nine CHO genomes were mapped to the chromosomes of the Chinese hamster showing unique signatures for each chromosome...

  3. Whole-genome sequencing reveals a potential causal mutation for dwarfism in the Miniature Shetland pony.

    Science.gov (United States)

    Metzger, Julia; Gast, Alana Christina; Schrimpf, Rahel; Rau, Janina; Eikelberg, Deborah; Beineke, Andreas; Hellige, Maren; Distl, Ottmar

    2017-04-01

    The Miniature Shetland pony represents a horse breed with an extremely small body size. Clinical examination of a dwarf Miniature Shetland pony revealed a lowered size at the withers, malformed skull and brachygnathia superior. Computed tomography (CT) showed a shortened maxilla and a cleft of the hard and soft palate which protruded into the nasal passage leading to breathing difficulties. Pathological examination confirmed these findings but did not reveal histopathological signs of premature ossification in limbs or cranial sutures. Whole-genome sequencing of this dwarf Miniature Shetland pony and comparative sequence analysis using 26 reference equids from NCBI Sequence Read Archive revealed three probably damaging missense variants which could be exclusively found in the affected foal. Validation of these three missense mutations in 159 control horses from different horse breeds and five donkeys revealed only the aggrecan (ACAN)-associated g.94370258G>C variant as homozygous wild-type in all control samples. The dwarf Miniature Shetland pony had the homozygous mutant genotype C/C of the ACAN:g.94370258G>C variant and the normal parents were heterozygous G/C. An unaffected full sib and 3/5 unaffected half-sibs were heterozygous G/C for the ACAN:g.94370258G>C variant. In summary, we could demonstrate a dwarf phenotype in a miniature pony breed perfectly associated with a missense mutation within the ACAN gene.

  4. The complete genome sequence of Fibrobacter succinogenes S85 reveals a cellulolytic and metabolic specialist.

    Directory of Open Access Journals (Sweden)

    Garret Suen

    Fibrobacter succinogenes is an important member of the rumen microbial community that converts plant biomass into nutrients usable by its host. This bacterium, which is also one of only two cultivated species in its phylum, is an efficient and prolific degrader of cellulose. Specifically, it has a particularly high activity against crystalline cellulose that requires close physical contact with this substrate. However, unlike other known cellulolytic microbes, it does not degrade cellulose using a cellulosome or by producing high extracellular titers of cellulase enzymes. To better understand the biology of F. succinogenes, we sequenced the genome of the type strain S85 to completion. A total of 3,085 open reading frames were predicted from its 3.84 Mbp genome. Analysis of sequences predicted to encode carbohydrate-degrading enzymes revealed an unusually high number of genes that were classified into 49 different families of glycoside hydrolases, carbohydrate binding modules (CBMs), carbohydrate esterases, and polysaccharide lyases. Of the 31 identified cellulases, none contain CBMs in families 1, 2, and 3, typically associated with crystalline cellulose degradation. Polysaccharide hydrolysis and utilization assays showed that F. succinogenes was able to hydrolyze a number of polysaccharides, but could only utilize the hydrolytic products of cellulose. This suggests that F. succinogenes uses its array of hemicellulose-degrading enzymes to remove hemicelluloses to gain access to cellulose. This is reflected in its genome, as F. succinogenes lacks many of the genes necessary to transport and metabolize the hydrolytic products of non-cellulose polysaccharides. The F. succinogenes genome reveals a bacterium that specializes in cellulose as its sole energy source, and provides insight into a novel strategy for cellulose degradation.

  5. Mitochondrial genome sequences reveal deep divergences among Anopheles punctulatus sibling species in Papua New Guinea

    Directory of Open Access Journals (Sweden)

    Logue Kyle

    2013-02-01

    Background: Members of the Anopheles punctulatus group (AP group) are the primary vectors of human malaria in Papua New Guinea. The AP group includes 13 sibling species, most of them morphologically indistinguishable. Understanding why only certain species are able to transmit malaria requires a better comprehension of their evolutionary history. In particular, understanding relationships and divergence times among Anopheles species may enable assessing how malaria-related traits (e.g. blood feeding behaviours, vector competence) have evolved. Methods: DNA sequences of 14 mitochondrial (mt) genomes from five AP sibling species and two species of the Anopheles dirus complex of Southeast Asia were sequenced. DNA sequences from all concatenated protein coding genes (10,770 bp) were then analysed using a Bayesian approach to reconstruct phylogenetic relationships and date the divergence of the AP sibling species. Results: Phylogenetic reconstruction using the concatenated DNA sequence of all mitochondrial protein coding genes indicates that the ancestors of the AP group arrived in Papua New Guinea 25 to 54 million years ago and rapidly diverged to form the current sibling species. Conclusion: Through evaluation of newly described mt genome sequences, this study has revealed a divergence among members of the AP group in Papua New Guinea that would significantly predate the arrival of humans in this region, 50 thousand years ago. The divergence observed among the mtDNA sequences studied here may have resulted from reproductive isolation during historical changes in sea-level through glacial minima and maxima. This leads to a hypothesis that the AP sibling species have evolved independently for potentially thousands of generations. This suggests that the evolution of many phenotypes, such as insecticide resistance, will arise independently in each of the AP sibling species studied here.

  6. Seventeen new complete mtDNA sequences reveal extensive mitochondrial genome evolution within the Demospongiae.

    Directory of Open Access Journals (Sweden)

    Xiujuan Wang

    Two major transitions in animal evolution--the origins of multicellularity and bilaterality--correlate with major changes in mitochondrial DNA (mtDNA) organization. Demosponges, the largest class in the phylum Porifera, underwent only the first of these transitions and their mitochondrial genomes display a peculiar combination of ancestral and animal-specific features. To get an insight into the evolution of mitochondrial genomes within the Demospongiae, we determined 17 new mtDNA sequences from this group and analyzed them with five previously published sequences. Our analysis revealed that all demosponge mtDNAs are 16- to 25-kbp circular molecules, containing 13-15 protein genes, 2 rRNA genes, and 2-27 tRNA genes. All but four pairs of sampled genomes had unique gene orders, with the number of shared gene boundaries ranging from 1 to 41. Although most demosponge species displayed low rates of mitochondrial sequence evolution, a significant acceleration in evolutionary rates occurred in the G1 group (orders Dendroceratida, Dictyoceratida, and Verticillitida). Large variation in mtDNA organization was also observed within the G0 group (order Homosclerophorida), including gene rearrangements, loss of tRNA genes, and the presence of two introns in Plakortis angulospiculatus. While introns are rare in modern-day demosponge mtDNA, we inferred that at least one intron was present in cox1 of the common ancestor of all demosponges. Our study uncovered an extensive mitochondrial genomic diversity within the Demospongiae. Although all sampled mitochondrial genomes retained some ancestral features, including a minimally modified genetic code, conserved structures of tRNA genes, and presence of multiple non-coding regions, they vary considerably in their size, gene content, gene order, and the rates of sequence evolution. Some of the changes in demosponge mtDNA, such as the loss of tRNA genes and the appearance of hairpin-containing repetitive elements

  7. Genome Sequencing Reveals the Potential of Achromobacter sp. HZ01 for Bioremediation

    Directory of Open Access Journals (Sweden)

    Yue-Hui Hong

    2017-08-01

    Petroleum pollution is a severe environmental issue. Comprehensively revealing the genetic backgrounds of hydrocarbon-degrading microorganisms contributes to developing effective methods for bioremediation of crude oil-polluted environments. Marine bacterium Achromobacter sp. HZ01 is capable of degrading hydrocarbons and producing biosurfactants. In this study, the draft genome (5.5 Mbp) of strain HZ01 has been obtained by Illumina sequencing, containing 5,162 predicted genes. Genome annotation shows that “amino acid metabolism” is the most abundant metabolic pathway. Strain HZ01 is not capable of using some common carbohydrates as the sole carbon sources because it contains few genes associated with carbohydrate transport and lacks some important enzymes related to glycometabolism. It contains abundant proteins directly related to petroleum hydrocarbon degradation. AlkB hydroxylase and its homologs were not identified. It harbors a complete enzyme system of terminal oxidation pathway for n-alkane degradation, which may be initiated by cytochrome P450. The enzymes involved in the catechol pathway are relatively complete for the degradation of aromatic compounds. This bacterium lacks several essential enzymes for methane oxidation, and Baeyer-Villiger monooxygenase involved in the subterminal oxidation pathway and cycloalkane degradation was not identified. These results suggest that strain HZ01 degrades n-alkanes via the terminal oxidation pathway, degrades aromatic compounds primarily via the catechol pathway and cannot perform methane oxidation or cycloalkane degradation. Additionally, strain HZ01 possesses abundant genes related to the metabolism of secondary metabolites, including some genes involved in biosurfactant (such as glycolipids and lipopeptides) synthesis. The genome analysis also reveals its genetic basis for nitrogen metabolism, antibiotic resistance, regulatory responses to environmental changes, cell motility

  8. Next-generation sequencing reveals genomic features in the Japanese quail.

    Science.gov (United States)

    Kawahara-Miki, Ryouka; Sano, Satoshi; Nunome, Mitsuo; Shimmura, Tsuyoshi; Kuwayama, Takehito; Takahashi, Shinji; Kawashima, Takaharu; Matsuda, Yoichi; Yoshimura, Takashi; Kono, Tomohiro

    2013-06-01

    The Japanese quail has several advantages as a laboratory animal for biological and biomedical investigations. In this study, the draft genome of the Japanese quail was sequenced and assembled using next-generation sequencing technology. To improve the quality of the assembly, the sequence reads from the Japanese quail were aligned against the reference genome of the chicken. The final draft assembly consisted of 1.75 Gbp with an N50 contig length of 11,409 bp. On the basis of the draft genome sequence obtained, we developed 100 microsatellite markers and used these markers to evaluate the genetic variability and diversity of 11 lines of Japanese quail. Furthermore, we identified Japanese quail orthologs of spermatogenesis markers and analyzed their expression using in situ hybridization. The Japanese quail genome sequence obtained in the present study could enhance the value of this species as a model animal. Copyright © 2013 Elsevier Inc. All rights reserved.
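
    The N50 contig length quoted above is a standard assembly statistic: the contig length at which contigs of that length or longer account for at least half of the assembled bases. A minimal sketch of the calculation follows; the function name and the toy contig lengths are hypothetical and are not taken from the quail assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and their lengths are
    accumulated; the N50 is the length of the contig at which the running
    total first reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical toy data: nine contigs totalling 54 bases.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)
print(toy_n50)
```

    For the toy data the three longest contigs (10, 9 and 8) sum to 27, exactly half of 54, so the N50 is 8.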

  9. Whole-genome sequencing reveals mutational landscape underlying phenotypic differences between two widespread Chinese cattle breeds.

    Directory of Open Access Journals (Sweden)

    Yao Xu

    Whole-genome sequencing provides a powerful tool to obtain more genetic variability that could produce a range of benefits for cattle breeding industry. Nanyang (Bos indicus) and Qinchuan (Bos taurus) are two important Chinese indigenous cattle breeds with distinct phenotypes. To identify the genetic characteristics responsible for variation in phenotypes between the two breeds, in the present study, we for the first time sequenced the genomes of four Nanyang and four Qinchuan cattle at 10- to 12-fold depth, covering on average 97.86% and 98.98% of the genome, respectively. Comparison with the Bos_taurus_UMD_3.1 reference assembly yielded 9,010,096 SNPs for Nanyang and 6,965,062 for Qinchuan cattle, 51% and 29% of which were novel SNPs, respectively. A total of 154,934 and 115,032 small indels (1 to 3 bp) were found in the Nanyang and Qinchuan genomes, respectively. The SNP and indel distribution revealed that Nanyang showed higher genetic diversity than Qinchuan cattle. Furthermore, a total of 2,907 putative cases of copy number variation (CNV) were identified by aligning Nanyang to Qinchuan genome, 783 of which (27%) encompassed the coding regions of 495 functional genes. The gene ontology (GO) analysis revealed that many CNV genes were enriched in the immune system and environment adaptability. Among several CNV genes related to lipid transport and fat metabolism, the leptin receptor gene (LEPR), overlapping with CNV_1815, showed remarkably higher copy number in Qinchuan than Nanyang (log2 ratio = -2.34988; P value = 1.53E-102). Further qPCR and association analysis showed that the copy number of the LEPR gene was positively correlated with transcriptional expression and phenotypic traits, suggesting the LEPR CNV may contribute to the higher fat deposition in muscles of Qinchuan cattle. Our findings provide evidence that the distinct phenotypes of Nanyang and Qinchuan breeds may be due to the different genetic variations including SNPs
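
    The log2 ratio reported for the LEPR copy number variant can be converted back to a linear fold difference. The sketch below assumes the ratio is expressed as Nanyang relative to Qinchuan, which is inferred from the negative sign together with the statement that copy number is higher in Qinchuan; the abstract does not state the convention explicitly.

```python
# log2 ratio quoted in the abstract for the LEPR copy number variant.
log2_ratio = -2.34988

# Assumption: the ratio is Nanyang / Qinchuan, so a negative value means
# fewer copies in Nanyang than in Qinchuan.
linear_ratio = 2 ** log2_ratio       # Nanyang relative to Qinchuan
fold_in_qinchuan = 1 / linear_ratio  # Qinchuan relative to Nanyang

print(f"Nanyang/Qinchuan ratio: {linear_ratio:.3f}")
print(f"Qinchuan fold excess: {fold_in_qinchuan:.2f}")
```

    Under that assumption the value corresponds to roughly a five-fold higher signal in Qinchuan than in Nanyang.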

  10. Whole-genome sequencing reveals mutational landscape underlying phenotypic differences between two widespread Chinese cattle breeds

    OpenAIRE

    Xu, Yao; Jiang, Yu; Shi, Tao; Cai, Hanfang; Lan, Xianyong; Zhao, Xin; Plath, Martin; Chen, Hong

    2017-01-01

    Whole-genome sequencing provides a powerful tool to obtain more genetic variability that could produce a range of benefits for cattle breeding industry. Nanyang (Bos indicus) and Qinchuan (Bos taurus) are two important Chinese indigenous cattle breeds with distinct phenotypes. To identify the genetic characteristics responsible for variation in phenotypes between the two breeds, in the present study, we for the first time sequenced the genomes of four Nanyang and four Qinchuan cattle with 10 ...

  11. Genetic variation in the Staphylococcus aureus 8325 strain lineage revealed by whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Kristoffer T Bæk

    Staphylococcus aureus strains of the 8325 lineage, especially 8325-4 and derivatives lacking prophage, have been used extensively for decades of research. We report herein the results of our deep sequence analysis of strain 8325-4. Assignment of sequence variants compared with the reference strain 8325 (NRS77/PS47) required correction of errors in the 8325 reference genome, and reassessment of variation previously attributed to chemical mutagenesis of the restriction-defective RN4220. Using an extensive strain pedigree analysis, we discovered that 8325-4 contains 16 single nucleotide polymorphisms (SNPs) arising prior to the construction of RN4220. We identified 5 indels in 8325-4 compared with 8325. Three indels correspond to expected Φ11, 12, 13 excisions, one indel is explained by a sequence assembly artifact, and the final indel (Δ63bp in the spa-sarS intergenic region) is common to only a sub-lineage of 8325-4 strains including SH1000. This deletion was found to significantly decrease (75%) steady state sarS but not spa transcript levels in post-exponential phase. The sub-lineage 8325-4 was also found to harbor 4 additional SNPs. We also found large sequence variation between 8325, 8325-4 and RN4220 in a cluster of repetitive hypothetical proteins (SA0282 homologs) near the Ess secretion cluster. The overall 8325-4 SNP set results in 17 alterations within coding sequences. Remarkably, we discovered that all tested strains of the 8325-4 lineage lack phenol soluble modulin α3 (PSMα3), a virulence determinant implicated in neutrophil chemotaxis, biofilm architecture and surface spreading. Collectively, our results clarify and define the 8325-4 pedigree and reveal clear evidence that mutations existing throughout all branches of this lineage, including the widely used RN6390 and SH1000 strains, could conceivably impact virulence regulation.

  12. Plant genetic archaeology: whole-genome sequencing reveals the pedigree of a classical trisomic line.

    Science.gov (United States)

    Salomé, Patrice A; Weigel, Detlef

    2014-12-18

    The circadian oscillator is astonishingly robust to changes in the environment but also to genomic changes that alter the copy number of its components through genome duplication, gene duplication, and homeologous gene loss. While studying the potential effect of aneuploidy on the Arabidopsis thaliana circadian clock, we discovered that a line thought to be trisomic for chromosome 3 also bears the gi-1 mutation, resulting in a short period and late flowering. With the help of whole-genome sequencing, we uncovered the unexpected complexity of this trisomic stock's history, as its genome shows evidence of past outcrossing with another A. thaliana accession. Our study indicates that although historical aneuploidy lines exist and are available, it might be safer to generate new individuals and confirm their genomes and karyotypes by sequencing. Copyright © 2015 Salomé and Weigel.

  13. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences

    Directory of Open Access Journals (Sweden)

    Alessandra Traini

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  14. Fosmid library end sequencing reveals a rarely known genome structure of marine shrimp Penaeus monodon

    Directory of Open Access Journals (Sweden)

    Chen Ming

    2011-05-01

    Background: The black tiger shrimp (Penaeus monodon) is one of the most important aquaculture species in the world, representing the crustacean lineage which possesses the greatest species diversity among marine invertebrates. Yet, we barely know anything about their genomic structure. To understand the organization and evolution of the P. monodon genome, a fosmid library consisting of 288,000 colonies was constructed, equivalent to 5.3-fold coverage of the 2.17 Gb genome. Approximately 11.1 Mb of fosmid end sequences (FESs) from 20,926 non-redundant reads representing 0.45% of the P. monodon genome were obtained for repetitive and protein-coding sequence analyses. Results: We found that microsatellite sequences were highly abundant in the P. monodon genome, comprising 8.3% of the total length. The density and the average length of microsatellites were evidently higher in comparison to those of other taxa. AT-rich microsatellite motifs, especially poly (AT) and poly (AAT), were the most abundant. High abundance of microsatellite sequences was also found in the transcribed regions. Furthermore, via self-BlastN analysis we identified 103 novel repetitive element families which were categorized into four groups, i.e., 33 WSSV-like repeats, 14 retrotransposons, 5 gene-like repeats, and 51 unannotated repeats. Overall, various types of repeats comprise 51.18% of the P. monodon genome in length. Approximately 7.4% of the FESs contained protein-coding sequences, and the Inhibitor of Apoptosis Protein (IAP) gene and the Innexin 3 gene homologues appear to be present in high abundance in the P. monodon genome. Conclusions: The redundancy of various repeat types in the P. monodon genome illustrates its highly repetitive nature. In particular, long and dense microsatellite sequences as well as abundant WSSV-like sequences highlight the uniqueness of genome organization of penaeid shrimp from those of other taxa. These results provide substantial
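
    The library coverage quoted in this abstract follows from the usual relation: coverage equals the number of clones times the average insert size, divided by the genome size. The sketch below reproduces the figure under the assumption of a 40 kb average fosmid insert, a typical value that the abstract itself does not state.

```python
# Figures quoted in the abstract.
clones = 288_000
genome_size_bp = 2.17e9  # 2.17 Gb

# Assumption: average insert size of 40 kb, typical for fosmid vectors;
# the abstract does not give this number.
assumed_insert_bp = 40_000

library_coverage = clones * assumed_insert_bp / genome_size_bp
print(f"estimated library coverage: {library_coverage:.1f}-fold")
```

    With the assumed insert size the estimate comes out close to the reported 5.3-fold coverage; a different insert size would scale the result proportionally.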

  15. Genome sequencing of Ewing sarcoma patients reveals genetic predisposition | Center for Cancer Research

    Science.gov (United States)

    The largest and most comprehensive genomic analysis of individuals with Ewing sarcoma performed to date reveals that some patients are genetically predisposed to developing the cancer.

  16. Genome sequencing of chimpanzee malaria parasites reveals possible pathways of adaptation to human hosts

    KAUST Repository

    Otto, Thomas D.

    2014-09-09

    Plasmodium falciparum causes most human malaria deaths, having prehistorically evolved from parasites of African Great Apes. Here we explore the genomic basis of P. falciparum adaptation to human hosts by fully sequencing the genome of the closely related chimpanzee parasite species P. reichenowi, and obtaining partial sequence data from a more distantly related chimpanzee parasite (P. gaboni). The close relationship between P. reichenowi and P. falciparum is emphasized by almost complete conservation of genomic synteny, but against this strikingly conserved background we observe major differences at loci involved in erythrocyte invasion. The organization of most virulence-associated multigene families, including the hypervariable var genes, is broadly conserved, but P. falciparum has a smaller subset of rif and stevor genes whose products are expressed on the infected erythrocyte surface. Genome-wide analysis identifies other loci under recent positive selection, but a limited number of changes at the host–parasite interface may have mediated host switching.

  17. Sequencing of bovine herpesvirus 4 v.test strain reveals important genome features

    Directory of Open Access Journals (Sweden)

    Gillet Laurent

    2011-08-01

    Background: Bovine herpesvirus 4 (BoHV-4) is a useful model for the human pathogenic gammaherpesviruses Epstein-Barr virus and Kaposi's Sarcoma-associated Herpesvirus. Although genome manipulations of this virus have been greatly facilitated by the cloning of the BoHV-4 V.test strain as a Bacterial Artificial Chromosome (BAC), the lack of a complete genome sequence for this strain limits its experimental use. Methods: In this study, we have determined the complete sequence of BoHV-4 V.test strain by a pyrosequencing approach. Results: The long unique coding region (LUR) consists of 108,241 bp encoding at least 79 open reading frames and is flanked by several polyrepetitive DNA units (prDNA). As previously suggested, we showed that the prDNA unit located at the left prDNA-LUR junction (prDNA-G) differs from the other prDNA units (prDNA-inner). Namely, the prDNA-G unit lacks the conserved pac-2 cleavage and packaging signal in its right terminal region. Based on the mechanisms of cleavage and packaging of herpesvirus genomes, this feature implies that only genomes bearing left and right end prDNA units are encapsulated into virions. Conclusions: In this study, we have determined the complete genome sequence of the BAC-cloned BoHV-4 V.test strain and identified genome organization features that could be important in other herpesviruses.

  18. Genomic and Functional Characteristics of Human Cytomegalovirus Revealed by Next-Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Steven Sijmons

    2014-03-01

    The complete genome of human cytomegalovirus (HCMV) was elucidated almost 25 years ago using a traditional cloning and Sanger sequencing approach. Analysis of the genetic content of additional laboratory and clinical isolates has led to a better, albeit still incomplete, definition of the coding potential and diversity of wild-type HCMV strains. The introduction of a new generation of massively parallel sequencing technologies, collectively called next-generation sequencing, has profoundly increased the throughput and resolution of the genomics field. These increased possibilities are already leading to a better understanding of the circulating diversity of HCMV clinical isolates. The higher resolution of next-generation sequencing provides new opportunities in the study of intrahost viral population structures. Furthermore, deep sequencing enables novel diagnostic applications for sensitive drug resistance mutation detection. RNA-seq applications have changed the picture of the HCMV transcriptome, which resulted in proof of a vast amount of splicing events and alternative transcripts. This review discusses the application of next-generation sequencing technologies, which has provided a clearer picture of the intricate nature of the HCMV genome. The continuing development and application of novel sequencing technologies will further augment our understanding of this ubiquitous, but elusive, herpesvirus.

  19. Genomic and Functional Characteristics of Human Cytomegalovirus Revealed by Next-Generation Sequencing

    Science.gov (United States)

    Sijmons, Steven; Van Ranst, Marc; Maes, Piet

    2014-01-01

    The complete genome of human cytomegalovirus (HCMV) was elucidated almost 25 years ago using a traditional cloning and Sanger sequencing approach. Analysis of the genetic content of additional laboratory and clinical isolates has led to a better, albeit still incomplete, definition of the coding potential and diversity of wild-type HCMV strains. The introduction of a new generation of massively parallel sequencing technologies, collectively called next-generation sequencing, has profoundly increased the throughput and resolution of the genomics field. These increased possibilities are already leading to a better understanding of the circulating diversity of HCMV clinical isolates. The higher resolution of next-generation sequencing provides new opportunities in the study of intrahost viral population structures. Furthermore, deep sequencing enables novel diagnostic applications for sensitive drug resistance mutation detection. RNA-seq applications have changed the picture of the HCMV transcriptome, which resulted in proof of a vast amount of splicing events and alternative transcripts. This review discusses the application of next-generation sequencing technologies, which has provided a clearer picture of the intricate nature of the HCMV genome. The continuing development and application of novel sequencing technologies will further augment our understanding of this ubiquitous, but elusive, herpesvirus. PMID:24603756

  20. Genome sequences and SNP analyses of Corynespora cassiicola from cotton and soybean in the southeastern United States reveal limited diversity.

    Directory of Open Access Journals (Sweden)

    Sandesh K Shrestha

    Full Text Available Corynespora cassiicola attacks diverse agriculturally important plants, including soybean and cotton, in the US. It is a re-emerging pathogen on cotton in the southeastern US. Whole genome sequences of four cotton and one soybean isolate from Tennessee were used to develop single nucleotide polymorphism markers for cotton isolates. Cotton isolates had little diversity at the genome level and very little differentiation from the soybean isolate. Analysis of 75 isolates from cotton and soybean, using targeted sequencing of 22 polymorphic SNP sites, revealed eight multi-locus genotypes, and it appears that a single clonal lineage predominates across the southeastern region. The cotton and soybean genome sequences were significantly different from the public reference genome derived from a rubber isolate and the utility of these novel resources will be discussed.

  1. Genomic diversity and evolution of Mycobacterium ulcerans revealed by next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Weihong Qi

    2009-09-01

    Full Text Available Mycobacterium ulcerans is the causative agent of Buruli ulcer, the third most common mycobacterial disease after tuberculosis and leprosy. It is an emerging infectious disease that afflicts mainly children and youths in West Africa. Little is known about the evolution and transmission mode of M. ulcerans, partially due to the lack of known genetic polymorphisms among isolates, limiting the application of genetic epidemiology. To systematically profile single nucleotide polymorphisms (SNPs), we sequenced the genomes of three M. ulcerans strains using 454 and Solexa technologies. Comparison with the reference genome of the Ghanaian classical lineage isolate Agy99 revealed 26,564 SNPs in a Japanese strain representing the ancestral lineage. Only 173 SNPs were found when comparing Agy99 with two other Ghanaian isolates, which belong to the two other types previously distinguished in Ghana by variable number tandem repeat typing. We further analyzed a collection of Ghanaian strains using the SNPs discovered. With 68 SNP loci, we were able to differentiate 54 strains into 13 distinct SNP haplotypes. The average SNP nucleotide diversity was low (average 0.06-0.09 across 68 SNP loci), and 96% of the SNP locus pairs were in complete linkage disequilibrium. We estimated that the divergence of the M. ulcerans Ghanaian clade from the Japanese strain occurred 394 to 529 thousand years ago. The Ghanaian subtypes diverged about 1000 to 3000 years ago, or even much more recently, because we found evidence that they evolved significantly faster than average. Our results offer significant insight into the evolution of M. ulcerans and provide a comprehensive report on genetic diversity within a highly clonal M. ulcerans population from a Buruli ulcer endemic region, which can facilitate further epidemiological studies of this pathogen through the development of high-resolution tools.
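    The idea of differentiating strains into SNP haplotypes can be illustrated with a minimal sketch: strains with identical alleles at every typed locus fall into the same haplotype. The function name `group_haplotypes` and the toy genotypes are hypothetical and are not taken from the study; this is one simple way to do such grouping, not the authors' method.

```python
from collections import defaultdict

def group_haplotypes(genotypes):
    """Group strain names by their allele combination across the typed SNP loci."""
    groups = defaultdict(list)
    for strain, alleles in genotypes.items():
        # Strains with the same allele at every locus share one haplotype.
        groups[tuple(alleles)].append(strain)
    return dict(groups)

# Hypothetical toy data: 5 strains typed at 4 SNP loci (not from the study).
toy_genotypes = {
    "s1": "ACGT",
    "s2": "ACGT",
    "s3": "ACGA",
    "s4": "TCGA",
    "s5": "ACGA",
}

haplotypes = group_haplotypes(toy_genotypes)
n_haplotypes = len(haplotypes)
```

    In this toy input the five strains collapse into three haplotypes; the same grouping applied to real genotype calls would need missing-data handling, which is omitted here.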

  2. Metabolic diversity and ecological niches of Achromatium populations revealed with single-cell genomic sequencing

    Directory of Open Access Journals (Sweden)

    Muammar Mansor

    2015-08-01

    Full Text Available Large, sulfur-cycling, calcite-precipitating bacteria in the genus Achromatium represent a significant proportion of bacterial communities near sediment-water interfaces throughout the world. Our understanding of their potentially crucial roles in calcium, carbon, sulfur, nitrogen, and iron cycling is limited because they have not been cultured or sequenced using environmental genomics approaches to date. We utilized single-cell genomic sequencing to obtain one incomplete and two nearly complete draft genomes for Achromatium collected at Warm Mineral Springs, FL. Based on 16S rRNA gene sequences, the three cells represent distinct and relatively distant Achromatium populations (91-92% identity). The draft genomes encode key genes involved in sulfur and hydrogen oxidation; oxygen, nitrogen and polysulfide respiration; carbon and nitrogen fixation; organic carbon assimilation and storage; chemotaxis; twitching motility; antibiotic resistance; and membrane transport. Known genes for iron and manganese energy metabolism were not detected. The presence of pyrophosphatase and vacuolar (V-type) ATPases, which are generally rare in bacterial genomes, suggests a role for these enzymes in calcium transport, proton pumping, and/or energy generation in the membranes of calcite-containing inclusions.

  3. Next generation sequencing and FISH reveal uneven and nonrandom microsatellite distribution in two grasshopper genomes.

    Science.gov (United States)

    Ruiz-Ruano, Francisco J; Cuadrado, Ángeles; Montiel, Eugenia E; Camacho, Juan Pedro M; López-León, María Dolores

    2015-06-01

    Simple sequence repeats (SSRs), also known as microsatellites, are one of the prominent DNA sequences shaping the repeated fraction of eukaryotic genomes. In spite of their profuse use as molecular markers for a variety of genetic and evolutionary studies, their genomic location, distribution, and function are not yet well understood. Here we report the first thorough joint analysis of microsatellite motifs at both genomic and chromosomal levels in animal species, by a combination of 454 sequencing and fluorescent in situ hybridization (FISH) techniques performed on two grasshopper species. The in silico analysis of the 454 reads suggested that microsatellite expansion is not driving size increase of these genomes, as SSR abundance was higher in the species showing the smallest genome. However, the two species showed the same uneven and nonrandom location of SSRs, with clear predominance of dinucleotide motifs and association with several types of repetitive elements, mostly histone gene spacers, ribosomal DNA intergenic spacers (IGS), and transposable elements (TEs). The FISH analysis showed a dispersed chromosome distribution of microsatellite motifs in euchromatic regions, in coincidence with chromosome location patterns previously observed for many mobile elements in these species. However, some SSR motifs were clustered, especially those located in the histone gene cluster.

  4. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing

    Science.gov (United States)

    2013-01-01

    Background Artificial selection played an important role in the origin of modern Glycine max cultivars from the wild soybean Glycine soja. To elucidate the consequences of artificial selection accompanying the domestication and modern improvement of soybean, 25 new and 30 published whole-genome re-sequencing accessions, which represent wild, domesticated landrace, and Chinese elite soybean populations were analyzed. Results A total of 5,102,244 single nucleotide polymorphisms (SNPs) and 707,969 insertion/deletions were identified. Among the SNPs detected, 25.5% were not described previously. We found that artificial selection during domestication led to more pronounced reduction in the genetic diversity of soybean than the switch from landraces to elite cultivars. Only a small proportion (2.99%) of the whole genomic regions appear to be affected by artificial selection for preferred agricultural traits. The selection regions were not distributed randomly or uniformly throughout the genome. Instead, clusters of selection hotspots in certain genomic regions were observed. Moreover, a set of candidate genes (4.38% of the total annotated genes) significantly affected by selection underlying soybean domestication and genetic improvement were identified. Conclusions Given the uniqueness of the soybean germplasm sequenced, this study drew a clear picture of human-mediated evolution of the soybean genomes. The genomic resources and information provided by this study would also facilitate the discovery of genes/loci underlying agronomically important traits. PMID:23984715

  5. Whole-genome sequencing reveals novel insights into sulfur oxidation in the extremophile Acidithiobacillus thiooxidans.

    Science.gov (United States)

    Yin, Huaqun; Zhang, Xian; Li, Xiaoqi; He, Zhili; Liang, Yili; Guo, Xue; Hu, Qi; Xiao, Yunhua; Cong, Jing; Ma, Liyuan; Niu, Jiaojiao; Liu, Xueduan

    2014-07-04

    Acidithiobacillus thiooxidans (A. thiooxidans), a chemolithoautotrophic extremophile, is widely used in the industrial recovery of copper (bioleaching or biomining). The organism grows and survives by autotrophically utilizing energy derived from the oxidation of elemental sulfur and reduced inorganic sulfur compounds (RISCs). However, the lack of genetic manipulation systems has restricted our exploration of its physiology. With the development of high-throughput sequencing technology, the whole genome sequence analysis of A. thiooxidans has allowed preliminary models to be built for genes/enzymes involved in key energy pathways like sulfur oxidation. The genome of A. thiooxidans A01 was sequenced and annotated. It contains key sulfur oxidation enzymes involved in the oxidation of elemental sulfur and RISCs, such as sulfur dioxygenase (SDO), sulfide quinone reductase (SQR), thiosulfate:quinone oxidoreductase (TQO), tetrathionate hydrolase (TetH), sulfur oxidizing protein (Sox) system and their associated electron transport components. Also, the sulfur oxygenase reductase (SOR) gene was detected in the draft genome sequence of A. thiooxidans A01, and multiple sequence alignment was performed to explore the function of groups of related protein sequences. In addition, another putative pathway was found in the cytoplasm of A. thiooxidans, which catalyzes sulfite to sulfate as the final product by phosphoadenosine phosphosulfate (PAPS) reductase and adenylylsulfate (APS) kinase. This differs from its closest relative, Acidithiobacillus caldus, in which this step is performed by sulfate adenylyltransferase (SAT). Furthermore, real-time quantitative PCR analysis showed that most sulfur oxidation genes were more strongly expressed in the S0 medium than in the Na2S2O3 medium at the mid-log phase. A sulfur oxidation model of A. thiooxidans A01 has been constructed based on previous studies from other sulfur oxidizing strains and its genome sequence analyses, providing insights

  6. Genome sequence of Candidatus Nitrososphaera evergladensis from group I.1b enriched from Everglades soil reveals novel genomic features of the ammonia-oxidizing archaea.

    Directory of Open Access Journals (Sweden)

    Kateryna V Zhalnina

    Full Text Available The activity of ammonia-oxidizing archaea (AOA) leads to the loss of nitrogen from soil, pollution of water sources and elevated emissions of greenhouse gas. To date, eight AOA genomes are available in the public databases, seven are from the group I.1a of the Thaumarchaeota and only one is from the group I.1b, isolated from hot springs. Many soils are dominated by AOA from the group I.1b, but the genomes of soil representatives of this group have not been sequenced and functionally characterized. The lack of knowledge of metabolic pathways of soil AOA presents a critical gap in understanding their role in biogeochemical cycles. Here, we describe the first complete genome of soil archaeon Candidatus Nitrososphaera evergladensis, which has been reconstructed from metagenomic sequencing of a highly enriched culture obtained from an agricultural soil. The AOA enrichment was sequenced with the high throughput next generation sequencing platforms from Pacific Biosciences and Ion Torrent. The de novo assembly of sequences resulted in one 2.95 Mb contig. Annotation of the reconstructed genome revealed many similarities of the basic metabolism with the rest of sequenced AOA. Ca. N. evergladensis belongs to the group I.1b and shares only 40% of whole-genome homology with the closest sequenced relative Ca. N. gargensis. Detailed analysis of the genome revealed coding sequences that were completely absent from the group I.1a. These unique sequences code for proteins involved in control of DNA integrity, transporters, two-component systems and versatile CRISPR defense system. Notably, genomes from the group I.1b have more gene duplications compared to the genomes from the group I.1a. We suggest that the presence of these unique genes and gene duplications may be associated with the environmental versatility of this group.

  7. Whole genome sequencing of the monomorphic pathogen Mycobacterium bovis reveals local differentiation of cattle clinical isolates.

    Science.gov (United States)

    Lasserre, Moira; Fresia, Pablo; Greif, Gonzalo; Iraola, Gregorio; Castro-Ramos, Miguel; Juambeltz, Arturo; Nuñez, Álvaro; Naya, Hugo; Robello, Carlos; Berná, Luisa

    2018-01-02

    Bovine tuberculosis (bTB) poses serious risks to animal welfare and economy, as well as to public health as a zoonosis. Its etiological agent, Mycobacterium bovis, belongs to the Mycobacterium tuberculosis complex (MTBC), a group of genetically monomorphic organisms characterized by a remarkably high overall nucleotide identity (99.9%). Indeed, this characteristic is of major concern for correct typing and determination of strain-specific traits based on sequence diversity. Due to its historical economic dependence on cattle production, Uruguay is deeply affected by the prevailing incidence of Mycobacterium bovis. With the world's highest number of cattle per human, and its intensive cattle production, Uruguay represents a particularly suited setting to evaluate genomic variability among isolates, and the diversity traits associated with this pathogen. We compared 186 genomes from MTBC strains isolated worldwide, and found a highly structured population in M. bovis. The analysis of 23 new M. bovis genomes, belonging to strains isolated in Uruguay, evidenced three groups present in the country. Despite presenting an expected highly conserved genomic structure and sequence, these strains segregate in a clustered manner within the worldwide phylogeny. Analysis of the non-pe/ppe differential areas against a reference genome defined four main sources of variability, namely: regions of difference (RD), variable genes, duplications and novel genes. RDs and variant analysis segregated the strains into clusters that are concordant with their spoligotype identities. Due to its high homoplasy rate, spoligotyping failed to reflect the true genomic diversity among worldwide representative strains; however, it remains a good indicator for closely related populations. This study introduces a comprehensive population structure analysis of worldwide M. bovis isolates. The incorporation and analysis of 23 novel Uruguayan M. bovis genomes sheds light on the genomic diversity of this

  8. Single-cell paired-end genome sequencing reveals structural variation per cell cycle

    Science.gov (United States)

    Voet, Thierry; Kumar, Parveen; Van Loo, Peter; Cooke, Susanna L.; Marshall, John; Lin, Meng-Lay; Zamani Esteki, Masoud; Van der Aa, Niels; Mateiu, Ligia; McBride, David J.; Bignell, Graham R.; McLaren, Stuart; Teague, Jon; Butler, Adam; Raine, Keiran; Stebbings, Lucy A.; Quail, Michael A.; D’Hooghe, Thomas; Moreau, Yves; Futreal, P. Andrew; Stratton, Michael R.; Vermeesch, Joris R.; Campbell, Peter J.

    2013-01-01

    The nature and pace of genome mutation is largely unknown. Because standard methods sequence DNA from populations of cells, the genetic composition of individual cells is lost, de novo mutations in cells are concealed within the bulk signal and per cell cycle mutation rates and mechanisms remain elusive. Although single-cell genome analyses could resolve these problems, such analyses are error-prone because of whole-genome amplification (WGA) artefacts and are limited in the types of DNA mutation that can be discerned. We developed methods for paired-end sequence analysis of single-cell WGA products that enable (i) detecting multiple classes of DNA mutation, (ii) distinguishing DNA copy number changes from allelic WGA-amplification artefacts by the discovery of matching aberrantly mapping read pairs among the surfeit of paired-end WGA and mapping artefacts and (iii) delineating the break points and architecture of structural variants. By applying the methods, we capture DNA copy number changes acquired over one cell cycle in breast cancer cells and in blastomeres derived from a human zygote after in vitro fertilization. Furthermore, we were able to discover and fine-map a heritable inter-chromosomal rearrangement t(1;16)(p36;p12) by sequencing a single blastomere. The methods will expedite applications in basic genome research and provide a stepping stone to novel approaches for clinical genetic diagnosis. PMID:23630320

  9. Genome sequencing of a virulent avian Pasteurella multocida strain GX-Pm reveals the candidate genes involved in the pathogenesis.

    Science.gov (United States)

    Yu, Chengjie; Sizhu, Suolang; Luo, Qingping; Xu, Xuewen; Fu, Lei; Zhang, Anding

    2016-04-01

    Pasteurella multocida (P. multocida) was first shown to be the causative agent of fowl cholera by Louis Pasteur in 1881. The first genomic study was performed on an avirulent avian strain Pm70, and until 2013, two genomes of virulent avian strains X73 and P1059 were sequenced. Comparative genome study supplied important information for further study on the pathogenesis of fowl cholera. In the previous study, a capsular serotype A strain GX-Pm was isolated from the liver of a chicken, which died during an outbreak of fowl cholera in 2011. The strain showed multiple drug resistance and was highly virulent to chickens. Therefore, the present study performed the genome sequencing and a comparative genomic analysis to reveal the candidate genes involved in virulence of P. multocida. The draft genome sequence of GX-Pm was 2,292,886 bp and contained 2941 protein-coding genes, 5 genomic islands, 4 IS elements and 2 prophage regions. Notably, all the predicted drug-resistance genes were included in predicted genomic islands. A comparative genome study on virulent avian strains P1059, X73 and GX-Pm with the avirulent avian strain Pm70 indicated that 475 unique genes were identified in at least one of the virulent strains but were absent in the avirulent strain. Among these genes, 20 genes were contained within genomes of all three virulent strains, including a few putative virulence genes. Further characterization of the pathogenic functions of these genes would benefit the understanding of pathogenesis of fowl cholera.
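    The comparison described above (genes found in a virulent strain but absent from the avirulent strain, and the subset shared by all virulent strains) amounts to set arithmetic on gene presence lists. The sketch below assumes genes have already been clustered into shared identifiers; the function name `candidate_gene_sets` and all gene identifiers are hypothetical, and this is not the pipeline used in the study.

```python
def candidate_gene_sets(virulent, avirulent):
    """Return (genes in at least one virulent strain, genes in all virulent
    strains), each with the avirulent strain's genes removed."""
    in_any = set().union(*virulent.values()) - avirulent
    in_all = set.intersection(*virulent.values()) - avirulent
    return in_any, in_all

# Hypothetical gene identifiers, for illustration only.
virulent = {
    "P1059": {"g1", "g2", "g3", "g5"},
    "X73": {"g1", "g2", "g4"},
    "GX-Pm": {"g1", "g2", "g3", "g6"},
}
avirulent_pm70 = {"g1", "g7"}

in_any, in_all = candidate_gene_sets(virulent, avirulent_pm70)
```

    With the toy input, five genes occur in at least one virulent strain only, and one of them occurs in all three; these correspond in role to the 475 and 20 genes reported in the abstract.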

  10. Comparative genomic sequence analysis of strawberry and other rosids reveals significant microsynteny

    Directory of Open Access Journals (Sweden)

    Abbott Albert

    2010-06-01

    Full Text Available Abstract Background Fragaria belongs to the Rosaceae, an economically important family that includes a number of important fruit producing genera such as Malus and Prunus. Using genomic sequences from 50 Fragaria fosmids, we have examined the microsynteny between Fragaria and other plant models. Results In more than half of the strawberry fosmids, we found syntenic regions that are conserved in Populus, Vitis, Medicago and/or Arabidopsis with Populus containing the greatest number of syntenic regions with Fragaria. The longest syntenic region was between LG VIII of the poplar genome and the strawberry fosmid 72E18, where seven out of twelve predicted genes were collinear. We also observed an unexpectedly high level of conserved synteny between Fragaria (rosid I) and Vitis (basal rosid). One of the strawberry fosmids, 34E24, contained a cluster of R gene analogs (RGAs) with NBS and LRR domains. We detected clusters of RGAs with high sequence similarity to those in 34E24 in all the genomes compared. In the phylogenetic tree we have generated, all the NBS-LRR genes grouped together with Arabidopsis CNL-A type NBS-LRR genes. The Fragaria RGA grouped together with those of Vitis and Populus in the phylogenetic tree. Conclusions Our analysis shows considerable microsynteny between Fragaria and other plant genomes such as Populus, Medicago, Vitis, and Arabidopsis to a lesser degree. We also detected a cluster of NBS-LRR type genes that are conserved in all the genomes compared.

  11. Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters

    Science.gov (United States)

    Schorn, Michelle A.; Alanjary, Mohammad M.; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R.; Ziemert, Nadine

    2016-01-01

    Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites. PMID:27902408

  12. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima

    NARCIS (Netherlands)

    Chipman, Ariel D; Ferrier, David E K; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S T; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C; Alonso, Claudio R; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C J; Blankenburg, Kerstin P; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K; Du Pasquier, Louis; Duncan, Elizabeth J; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D; Extavour, Cassandra G; Francisco, Liezl; Gabaldón, Toni; Gillis, William J; Goodwin-Horn, Elizabeth A; Green, Jack E; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J P; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H L; Hunn, Julia P; Hunnekuhl, Vera S; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N; Jiggins, Francis M; Jones, Tamsin E; Kaiser, Tobias S; Kalra, Divya; Kenny, Nathan J; Korchina, Viktoriya; Kovar, Christie L; Kraus, F Bernhard; Lapraz, François; Lee, Sandra L; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C; Robertson, Helen E; Robertson, Hugh M; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E; Schurko, Andrew M; Siggens, Kenneth W; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M; Willis, Judith H; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M; Worley, Kim C; Gibbs, Richard A; Akam, Michael; Richards, Stephen

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present

  13. Combining genomic sequencing methods to explore viral diversity and reveal potential virus-host interactions

    Directory of Open Access Journals (Sweden)

    Cheryl-Emiliane Tien Chow

    2015-04-01

    Full Text Available Viral diversity and virus-host interactions in oxygen-starved regions of the ocean, also known as oxygen minimum zones (OMZs), remain relatively unexplored. Microbial community metabolism in OMZs alters nutrient and energy flow through marine food webs, resulting in biological nitrogen loss and greenhouse gas production. Thus, viruses infecting OMZ microbes have the potential to modulate community metabolism with resulting feedback on ecosystem function. Here, we describe viral communities inhabiting oxic surface (10 m) and oxygen-starved basin (200 m) waters of Saanich Inlet, a seasonally anoxic fjord on the coast of Vancouver Island, British Columbia using viral metagenomics and complete viral fosmid sequencing on samples collected between April 2007 and April 2010. Of 6459 open reading frames (ORFs) predicted across all 34 viral fosmids, 77.6% (n=5010) had no homology to reference viral genomes. These fosmids recruited a higher proportion of viral metagenomic sequences from Saanich Inlet than from nearby northeastern subarctic Pacific Ocean (Line P) waters, indicating differences in the viral communities between coastal and open ocean locations. While functional annotations of fosmid ORFs were limited, recruitment to NCBI’s non-redundant ‘nr’ database and publicly available single-cell genomes identified putative viruses infecting marine thaumarchaeal and SUP05 proteobacteria to provide potential host linkages with relevance to coupled biogeochemical cycling processes in OMZ waters. Taken together, these results highlight the power of coupled analyses of multiple sequence data types, such as viral metagenomic and fosmid sequence data with prokaryotic single cell genomes, to chart viral diversity, elucidate genomic and ecological contexts for previously unclassifiable viral sequences, and identify novel host interactions in natural and engineered ecosystems.

  14. Ultra-deep sequencing reveals the subclonal structure and genomic evolution of oral squamous cell carcinoma

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    Background: Oral squamous cell carcinoma (OSCC), a subgroup of head and neck squamous cell carcinoma (HNSCC), is primarily caused by alcohol consumption and tobacco use. Recent DNA sequencing studies suggest that HNSCC are very heterogeneous between patients; however the intra-patient subclonal...... structure remains unexplored due to lack of sampling multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with targeted ultra-deep sequencing on biopsies from 5 stage IV...... OSCC patients. From each patient, a series of biopsies were sampled from 3 distinct geographical sites in primary tumor and 1 lymph node metastasis. A whole blood sample was taken as the matched reference. Results and discussion: Our results demonstrate that ultra-deep sequencing gives a level...

  15. Major soybean maturity gene haplotypes revealed by SNPViz analysis of 72 sequenced soybean genomes.

    Directory of Open Access Journals (Sweden)

    Tiffany Langewisch

    Full Text Available In this Genomics Era, vast amounts of next-generation sequencing data have become publicly available for multiple genomes across hundreds of species. Analyses of these large-scale datasets can become cumbersome, especially when comparing nucleotide polymorphisms across many samples within a dataset and among different datasets or organisms. To facilitate the exploration of allelic variation and diversity, we have developed and deployed an in-house computer software to categorize and visualize these haplotypes. The SNPViz software enables users to analyze region-specific haplotypes from single nucleotide polymorphism (SNP) datasets for different sequenced genomes. The examination of allelic variation and diversity of important soybean [Glycine max (L.) Merr.] flowering time and maturity genes may provide additional insight into flowering time regulation and enhance researchers' ability to target soybean breeding for particular environments. For this study, we utilized two available soybean genomic datasets for a total of 72 soybean genotypes encompassing cultivars, landraces, and the wild species Glycine soja. The major soybean maturity genes E1, E2, E3, and E4 along with the Dt1 gene for plant growth architecture were analyzed in an effort to determine the number of major haplotypes for each gene, to evaluate the consistency of the haplotypes with characterized variant alleles, and to identify evidence of artificial selection. The results indicated classification of a small number of predominant haplogroups for each gene and important insights into possible allelic diversity for each gene within the context of known causative mutations. The software has both a stand-alone and web-based version and can be used to analyze other genes, examine additional soybean datasets, and view similar genome sequence and SNP datasets from other species.

  16. Major soybean maturity gene haplotypes revealed by SNPViz analysis of 72 sequenced soybean genomes.

    Science.gov (United States)

    Langewisch, Tiffany; Zhang, Hongxin; Vincent, Ryan; Joshi, Trupti; Xu, Dong; Bilyeu, Kristin

    2014-01-01

    In this Genomics Era, vast amounts of next-generation sequencing data have become publicly available for multiple genomes across hundreds of species. Analyses of these large-scale datasets can become cumbersome, especially when comparing nucleotide polymorphisms across many samples within a dataset and among different datasets or organisms. To facilitate the exploration of allelic variation and diversity, we have developed and deployed an in-house computer software to categorize and visualize these haplotypes. The SNPViz software enables users to analyze region-specific haplotypes from single nucleotide polymorphism (SNP) datasets for different sequenced genomes. The examination of allelic variation and diversity of important soybean [Glycine max (L.) Merr.] flowering time and maturity genes may provide additional insight into flowering time regulation and enhance researchers' ability to target soybean breeding for particular environments. For this study, we utilized two available soybean genomic datasets for a total of 72 soybean genotypes encompassing cultivars, landraces, and the wild species Glycine soja. The major soybean maturity genes E1, E2, E3, and E4 along with the Dt1 gene for plant growth architecture were analyzed in an effort to determine the number of major haplotypes for each gene, to evaluate the consistency of the haplotypes with characterized variant alleles, and to identify evidence of artificial selection. The results indicated classification of a small number of predominant haplogroups for each gene and important insights into possible allelic diversity for each gene within the context of known causative mutations. The software has both a stand-alone and web-based version and can be used to analyze other genes, examine additional soybean datasets, and view similar genome sequence and SNP datasets from other species.
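The idea of sorting samples into haplogroups by their alleles across a region can be illustrated with a minimal sketch. This is not the SNPViz implementation; the function name `group_haplotypes`, the sample names, and the genotype calls are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_haplotypes(genotypes):
    """Group sample names by their allele string across a region's SNPs.

    `genotypes` maps a sample name to a list of alleles, one per SNP position.
    This is an illustrative helper, not part of SNPViz.
    """
    groups = defaultdict(list)
    for sample, alleles in genotypes.items():
        groups["".join(alleles)].append(sample)
    # Largest haplogroups first; ties broken by haplotype string for stable output.
    return sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))


# Hypothetical genotype calls at four SNP positions (not data from the study).
example = {
    "cultivar_A": ["A", "C", "G", "T"],
    "cultivar_B": ["A", "C", "G", "T"],
    "landrace_C": ["A", "T", "G", "T"],
    "soja_D": ["G", "T", "G", "C"],
}
haplogroups = group_haplotypes(example)
for haplotype, samples in haplogroups:
    print(haplotype, samples)
```

Sorting by group size puts the predominant haplogroups at the top, which mirrors the kind of summary the abstract describes for each gene.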

  17. Giraffe genome sequence reveals clues to its unique morphology and physiology

    OpenAIRE

    Agaba, Morris; Ishengoma, Edson; Miller, Webb C.; McGrath, Barbara C.; Hudson, Chelsea N.; Bedoya Reina, Oscar C.; Ratan, Aakrosh; Burhans, Rico; Chikhi, Rayan; Medvedev, Paul; Praul, Craig A.; Wu-Cavener, Lan; Wood, Brendan; Robertson, Heather; Penfold, Linda

    2016-01-01

    The origins of giraffe's imposing stature and associated cardiovascular adaptations are unknown. Okapi, which lacks these unique features, is giraffe's closest relative and provides a useful comparison, to identify genetic variation underlying giraffe's long neck and cardiovascular system. The genomes of giraffe and okapi were sequenced, and through comparative analyses genes and pathways were identified that exhibit unique genetic changes and likely contribute to giraffe's unique features. S...

  18. Shotgun Bisulfite Sequencing of the Betula platyphylla Genome Reveals the Tree’s DNA Methylation Patterning

    Directory of Open Access Journals (Sweden)

    Chang Su

    2014-12-01

    DNA methylation plays a critical role in the regulation of gene expression. Most studies of DNA methylation have been performed in herbaceous plants, and little is known about the methylation patterns in tree genomes. In the present study, we generated a map of methylated cytosines at single base pair resolution for Betula platyphylla (white birch) by bisulfite sequencing combined with transcriptomics to analyze DNA methylation and its effects on gene expression. We obtained a detailed view of the function of DNA methylation sequence composition and distribution in the genome of B. platyphylla. There are 34,460 genes in the whole genome of birch, and 31,297 genes are methylated. Conservatively, we estimated that 14.29% of genomic cytosines are methylcytosines in birch. Among the methylation sites, the CHH context accounts for 48.86%, and is the largest proportion. Combined transcriptome and methylation analysis showed that the genes with moderate methylation levels had higher expression levels than genes with high and low methylation. In addition, methylated genes are highly enriched for the GO subcategories of binding activities, catalytic activities, cellular processes, response to stimulus and cell death, suggesting that methylation mediates these pathways in birch trees.
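The CG, CHG and CHH contexts mentioned above (where H stands for A, C or T) can be assigned from the two bases that follow a cytosine. The following is a minimal sketch that considers the forward strand only and uses a made-up sequence; the function names are hypothetical and this is not the pipeline used in the study.

```python
from collections import Counter


def cytosine_context(seq, i):
    """Return 'CG', 'CHG', 'CHH', or None for the forward-strand base at index i."""
    seq = seq.upper()
    if seq[i] != "C":
        return None
    downstream = seq[i + 1:i + 3]
    if downstream[:1] == "G":
        return "CG"
    # CHG and CHH need two unambiguous downstream bases.
    if len(downstream) < 2 or any(b not in "ACGT" for b in downstream):
        return None
    return "CHG" if downstream[1] == "G" else "CHH"


def context_counts(seq):
    """Count cytosines per context along the forward strand of seq."""
    contexts = (cytosine_context(seq, i) for i in range(len(seq)))
    return Counter(c for c in contexts if c is not None)


# Made-up sequence for illustration only.
counts = context_counts("CGACAGCTTC")
print(dict(counts))
```

A real analysis would also scan the reverse strand and would weight each site by its bisulfite read support; both are omitted here to keep the example short.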

  19. Genome sequencing reveals complex secondary metabolome in the marine actinomycete Salinispora tropica

    Energy Technology Data Exchange (ETDEWEB)

    Udwary, Daniel W.; Zeigler, Lisa; Asolkar, Ratnakar; Singan, Vasanth; Lapidus, Alla; Fenical, William; Jensen, Paul R.; Moore, Bradley S.

    2007-05-01

    Recent fermentation studies have identified actinomycetes of the marine-dwelling genus Salinispora as prolific natural product producers. To further evaluate their biosynthetic potential, we analyzed all identifiable secondary natural product gene clusters from the recently sequenced 5,184,724 bp S. tropica CNB-440 circular genome. Our analysis shows that biosynthetic potential meets or exceeds that shown by previous Streptomyces genome sequences as well as other natural product-producing actinomycetes. The S. tropica genome features nine polyketide synthase systems of every known formally classified family, non-ribosomal peptide synthetases and several hybrid clusters. While a few clusters appear to encode molecules previously identified in Streptomyces species, the majority of the 15 biosynthetic loci are novel. Specific chemical information about putative and observed natural product molecules is presented and discussed. In addition, our bioinformatic analysis was critical for the structure elucidation of the novel polyene macrolactam salinilactam A. This study demonstrates the potential for genomic analysis to complement and strengthen traditional natural product isolation studies and firmly establishes the genus Salinispora as a rich source of novel drug-like molecules.

  20. Ultra Deep Sequencing of a Baculovirus Population Reveals Widespread Genomic Variations

    Directory of Open Access Journals (Sweden)

    Aurélien Chateigner

    2015-07-01

    Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%). K-means clustering was used to classify the mutations in four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs). Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential.

  1. Analysis of genome sequences from plant pathogenic Rhodococcus reveals genetic novelties in virulence loci.

    Directory of Open Access Journals (Sweden)

    Allison L Creason

    Members of Gram-positive Actinobacteria cause economically important diseases to plants. Within the Rhodococcus genus, some members can cause growth deformities and persist as pathogens on a wide range of host plants. The current model predicts that phytopathogenic isolates require a cluster of three loci present on a linear plasmid, with the fas operon central to virulence. The Fas proteins synthesize, modify, and activate a mixture of growth regulating cytokinins, which cause a hormonal imbalance in plants, resulting in abnormal growth. We sequenced and compared the genomes of 20 isolates of Rhodococcus to gain insights into the mechanisms and evolution of virulence in these bacteria. Horizontal gene transfer was identified as critical but limited in the scale of virulence evolution, as few loci are conserved and exclusive to phytopathogenic isolates. Although the fas operon is present in most phytopathogenic isolates, it is absent from phytopathogenic isolate A21d2. Instead, this isolate has a horizontally acquired gene chimera that encodes a novel fusion protein with isopentyltransferase and phosphoribohydrolase domains, predicted to be capable of catalyzing and activating cytokinins, respectively. Cytokinin profiling of the archetypal D188 isolate revealed only one active cytokinin type that was specifically synthesized in a fas-dependent manner. These results suggest that only the isopentenyladenine cytokinin type is synthesized and necessary for Rhodococcus phytopathogenicity, which is not consistent with the extant model stating that a mixture of cytokinins is necessary for Rhodococcus to cause leafy gall symptoms. In all, data indicate that only four horizontally acquired functions are sufficient to confer the trait of phytopathogenicity to members of the genetically diverse clade of Rhodococcus.

  2. Analysis of genome sequences from plant pathogenic Rhodococcus reveals genetic novelties in virulence loci.

    Science.gov (United States)

    Creason, Allison L; Vandeputte, Olivier M; Savory, Elizabeth A; Davis, Edward W; Putnam, Melodie L; Hu, Erdong; Swader-Hines, David; Mol, Adeline; Baucher, Marie; Prinsen, Els; Zdanowska, Magdalena; Givan, Scott A; El Jaziri, Mondher; Loper, Joyce E; Mahmud, Taifo; Chang, Jeff H

    2014-01-01

    Members of Gram-positive Actinobacteria cause economically important diseases to plants. Within the Rhodococcus genus, some members can cause growth deformities and persist as pathogens on a wide range of host plants. The current model predicts that phytopathogenic isolates require a cluster of three loci present on a linear plasmid, with the fas operon central to virulence. The Fas proteins synthesize, modify, and activate a mixture of growth regulating cytokinins, which cause a hormonal imbalance in plants, resulting in abnormal growth. We sequenced and compared the genomes of 20 isolates of Rhodococcus to gain insights into the mechanisms and evolution of virulence in these bacteria. Horizontal gene transfer was identified as critical but limited in the scale of virulence evolution, as few loci are conserved and exclusive to phytopathogenic isolates. Although the fas operon is present in most phytopathogenic isolates, it is absent from phytopathogenic isolate A21d2. Instead, this isolate has a horizontally acquired gene chimera that encodes a novel fusion protein with isopentyltransferase and phosphoribohydrolase domains, predicted to be capable of catalyzing and activating cytokinins, respectively. Cytokinin profiling of the archetypal D188 isolate revealed only one active cytokinin type that was specifically synthesized in a fas-dependent manner. These results suggest that only the isopentenyladenine cytokinin type is synthesized and necessary for Rhodococcus phytopathogenicity, which is not consistent with the extant model stating that a mixture of cytokinins is necessary for Rhodococcus to cause leafy gall symptoms. In all, data indicate that only four horizontally acquired functions are sufficient to confer the trait of phytopathogenicity to members of the genetically diverse clade of Rhodococcus.

  3. Complete genome sequence of Fer-de-Lance Virus reveals a novel gene in reptilian Paramyxoviruses

    Science.gov (United States)

    Kurath, G.; Batts, W.N.; Ahne, W.; Winton, J.R.

    2004-01-01

    The complete RNA genome sequence of the archetype reptilian paramyxovirus, Fer-de-Lance virus (FDLV), has been determined. The genome is 15,378 nucleotides in length and consists of seven nonoverlapping genes in the order 3′ N-U-P-M-F-HN-L 5′, coding for the nucleocapsid, unknown, phospho-, matrix, fusion, hemagglutinin-neuraminidase, and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and tri-nucleotide intergenic regions similar to those of other Paramyxoviridae. The FDLV P gene expression strategy is like that of rubulaviruses, which express the accessory V protein from the primary transcript and edit a portion of the mRNA to encode P and I proteins. There is also an overlapping open reading frame potentially encoding a small basic protein in the P gene. The gene designated U (unknown), encodes a deduced protein of 19.4 kDa that has no counterpart in other paramyxoviruses and has no similarity with sequences in the National Center for Biotechnology Information database. Active transcription of the U gene in infected cells was demonstrated by Northern blot analysis, and bicistronic N-U mRNA was also evident. The genomes of two other snake paramyxovirus genotypes were also found to have U genes, with 11 to 16% nucleotide divergence from the FDLV U gene. Pairwise comparisons of amino acid identities and phylogenetic analyses of all deduced FDLV protein sequences with homologous sequences from other Paramyxoviridae indicate that FDLV represents a new genus within the subfamily Paramyxovirinae. We suggest the name Ferlavirus for the new genus, with FDLV as the type species.

  4. Ultra-deep sequencing reveals the subclonal structure and genomic evolution of oral squamous cell carcinoma

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    Background: Oral squamous cell carcinoma (OSCC), a subgroup of head and neck squamous cell carcinoma (HNSCC), is primarily caused by alcohol consumption and tobacco use. Recent DNA sequencing studies suggests that HNSCC are very heterogeneous between patients; however the intra-patient subclonal...... structure remains unexplored due to lack of sampling multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with targeted ultra-deep targeted sequencing on biopsies from 5stage IV...... of unprecedented high resolution enabling clear detection of subclonal structure and observation of otherwise undetectable mutations. Furthermore, we demonstrate that OSCC show a high degree of inter-patient heterogeneity but a low degree of intra-patient/tumor heterogeneity. However, some OSCC cancers contain...

  5. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima.

    Directory of Open Access Journals (Sweden)

    Ariel D Chipman

    2014-11-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations

  6. The First Myriapod Genome Sequence Reveals Conservative Arthropod Gene Content and Genome Organisation in the Centipede Strigamia maritima

    Science.gov (United States)

    Chipman, Ariel D.; Ferrier, David E. K.; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S. T.; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C.; Alonso, Claudio R.; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C. J.; Blankenburg, Kerstin P.; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K.; Du Pasquier, Louis; Duncan, Elizabeth J.; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D.; Extavour, Cassandra G.; Francisco, Liezl; Gabaldón, Toni; Gillis, William J.; Goodwin-Horn, Elizabeth A.; Green, Jack E.; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J. P.; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H. L.; Hunn, Julia P.; Hunnekuhl, Vera S.; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N.; Jiggins, Francis M.; Jones, Tamsin E.; Kaiser, Tobias S.; Kalra, Divya; Kenny, Nathan J.; Korchina, Viktoriya; Kovar, Christie L.; Kraus, F. Bernhard; Lapraz, François; Lee, Sandra L.; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N.; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J.; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H.; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C.; Robertson, Helen E.; Robertson, Hugh M.; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E.; Schurko, Andrew M.; Siggens, Kenneth W.; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J.; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M.; Willis, Judith H.; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M.; Worley, Kim C.; Gibbs, Richard A.; Akam, Michael; Richards, Stephen

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific

  7. Genome and transcriptome sequencing of lung cancers reveal diverse mutational and splicing events.

    Science.gov (United States)

    Liu, Jinfeng; Lee, William; Jiang, Zhaoshi; Chen, Zhongqiang; Jhunjhunwala, Suchit; Haverty, Peter M; Gnad, Florian; Guan, Yinghui; Gilbert, Houston N; Stinson, Jeremy; Klijn, Christiaan; Guillory, Joseph; Bhatt, Deepali; Vartanian, Steffan; Walter, Kimberly; Chan, Jocelyn; Holcomb, Thomas; Dijkgraaf, Peter; Johnson, Stephanie; Koeman, Julie; Minna, John D; Gazdar, Adi F; Stern, Howard M; Hoeflich, Klaus P; Wu, Thomas D; Settleman, Jeff; de Sauvage, Frederic J; Gentleman, Robert C; Neve, Richard M; Stokoe, David; Modrusan, Zora; Seshagiri, Somasekar; Shames, David S; Zhang, Zemin

    2012-12-01

    Lung cancer is a highly heterogeneous disease in terms of both underlying genetic lesions and response to therapeutic treatments. We performed deep whole-genome sequencing and transcriptome sequencing on 19 lung cancer cell lines and three lung tumor/normal pairs. Overall, our data show that cell line models exhibit similar mutation spectra to human tumor samples. Smoker and never-smoker cancer samples exhibit distinguishable patterns of mutations. A number of epigenetic regulators, including KDM6A, ASH1L, SMARCA4, and ATAD2, are frequently altered by mutations or copy number changes. A systematic survey of splice-site mutations identified 106 splice site mutations associated with cancer specific aberrant splicing, including mutations in several known cancer-related genes. RAC1b, an isoform of the RAC1 GTPase that includes one additional exon, was found to be preferentially up-regulated in lung cancer. We further show that its expression is significantly associated with sensitivity to a MAP2K (MEK) inhibitor PD-0325901. Taken together, these data present a comprehensive genomic landscape of a large number of lung cancer samples and further demonstrate that cancer-specific alternative splicing is a widespread phenomenon that has potential utility as therapeutic biomarkers. The detailed characterizations of the lung cancer cell lines also provide genomic context to the vast amount of experimental data gathered for these lines over the decades, and represent highly valuable resources for cancer biology.

  8. Complete genome sequence analysis of novel human bocavirus reveals genetic recombination between human bocavirus 2 and human bocavirus 4.

    Science.gov (United States)

    Khamrin, Pattara; Okitsu, Shoko; Ushijima, Hiroshi; Maneekarn, Niwat

    2013-07-01

    Epidemiological surveillance of human bocavirus (HBoV) was conducted on fecal specimens collected from hospitalized children with diarrhea in Chiang Mai, Thailand in 2011. By partial sequence analysis of VP1 gene, an unusual strain of HBoV (CMH-S011-11), was initially identified as HBoV4. The complete genome sequence of CMH-S011-11 was performed and analyzed further to clarify whether it was a recombinant strain or a new HBoV variant. Analysis of complete genome sequence revealed that the coding sequence starting from NS1, NP1 to VP1/VP2 was 4795 nucleotides long. Interestingly, the nucleotide sequence of NS1 gene of CMH-S011-11 was most closely related to the HBoV2 reference strains detected in Pakistan, which contradicted to the initial genotyping result of the partial VP1 region in the previous study. In addition, comparison of NP1 nucleotide sequence of CMH-S011-11 with those of other HBoV1-4 reference strains also revealed a high level of sequence identity with HBoV2. On the other hand, nucleotide sequence of VP1/VP2 gene of CMH-S011-11 was most closely related to those of HBoV4 reference strains detected in Nigeria. The overall full-length sequence analysis revealed that this CMH-S011-11 was grouped within HBoV4 species, but located in a separate branch from other HBoV4 prototype strains. Recombination analysis revealed that CMH-S011-11 was the result of recombination between HBoV2 and HBoV4 strains with the break point located near the start codon of VP2. Copyright © 2013 Elsevier B.V. All rights reserved.

  9. Complete mitochondrial genome sequences of three bats species and whole genome mitochondrial analyses reveal patterns of codon bias and lend support to a basal split in Chiroptera.

    Science.gov (United States)

    Meganathan, P R; Pagan, Heidi J T; McCulloch, Eve S; Stevens, Richard D; Ray, David A

    2012-01-15

    Order Chiroptera is a unique group of mammals whose members have attained self-powered flight as their main mode of locomotion. Much speculation persists regarding bat evolution; however, lack of sufficient molecular data hampers evolutionary and conservation studies. Of ~1200 species, complete mitochondrial genome sequences are available for only eleven. Additional sequences should be generated if we are to resolve many questions concerning these fascinating mammals. Herein, we describe the complete mitochondrial genomes of three bats: Corynorhinus rafinesquii, Lasiurus borealis and Artibeus lituratus. We also compare the currently available mitochondrial genomes and analyze codon usage in Chiroptera. C. rafinesquii, L. borealis and A. lituratus mitochondrial genomes are 16438 bp, 17048 bp and 16709 bp, respectively. Genome organization and gene arrangements are similar to other bats. Phylogenetic analyses using complete mitochondrial genome sequences support previously established phylogenetic relationships and suggest utility in future studies focusing on the evolutionary aspects of these species. Comprehensive analyses of available bat mitochondrial genomes reveal distinct nucleotide patterns and synonymous codon preferences corresponding to different chiropteran families. These patterns suggest that mutational and selection forces are acting to different extents within Chiroptera and shape their mitochondrial genomes. Copyright © 2011 Elsevier B.V. All rights reserved.
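Synonymous codon preference, as analyzed in the record above, comes down to counting how often each codon for the same amino acid appears in a coding sequence. The sketch below uses a made-up sequence and the four CTN codons, which encode leucine in both the standard and the vertebrate mitochondrial genetic codes. The function names are hypothetical and this is not the authors' method.

```python
from collections import Counter


def codon_counts(cds):
    """Count in-frame codons in a coding sequence, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def codon_fractions(cds, synonymous):
    """Return the share of each codon within one synonymous family."""
    counts = codon_counts(cds)
    total = sum(counts[c] for c in synonymous)
    if total == 0:
        return {}
    return {c: counts[c] / total for c in synonymous}


# Made-up coding sequence for illustration only.
LEU_CTN = ["CTA", "CTC", "CTG", "CTT"]
fractions = codon_fractions("ATGCTACTACTTCTATAA", LEU_CTN)
print(fractions)
```

Comparing such fractions across species or families is one simple way to make differences in codon preference visible.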

  10. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing

    DEFF Research Database (Denmark)

    Li, Ying-hui; Zhao, Shan-cen; Ma, Jian-xin

    2013-01-01

    BACKGROUND: Artificial selection played an important role in the origin of modern Glycine max cultivars from the wild soybean Glycine soja. To elucidate the consequences of artificial selection accompanying the domestication and modern improvement of soybean, 25 new and 30 published whole-genome re......-sequencing accessions, which represent wild, domesticated landrace, and Chinese elite soybean populations were analyzed. RESULTS: A total of 5,102,244 single nucleotide polymorphisms (SNPs) and 707,969 insertion/deletions were identified. Among the SNPs detected, 25.5% were not described previously. We found...... that artificial selection during domestication led to more pronounced reduction in the genetic diversity of soybean than the switch from landraces to elite cultivars. Only a small proportion (2.99%) of the whole genomic regions appear to be affected by artificial selection for preferred agricultural traits...

  11. Oil palm genome sequence reveals divergence of interfertile species in old and new worlds

    Science.gov (United States)

    Singh, Rajinder; Ong-Abdullah, Meilina; Low, Eng-Ti Leslie; Manaf, Mohamad Arif Abdul; Rosli, Rozana; Nookiah, Rajanaidu; Ooi, Leslie Cheng-Li; Ooi, Siew–Eng; Chan, Kuang-Lim; Halim, Mohd Amin; Azizi, Norazah; Nagappan, Jayanthi; Bacher, Blaire; Lakey, Nathan; Smith, Steven W; He, Dong; Hogan, Michael; Budiman, Muhammad A; Lee, Ernest K; DeSalle, Rob; Kudrna, David; Goicoechea, Jose Louis; Wing, Rod; Wilson, Richard K; Fulton, Robert S; Ordway, Jared M; Martienssen, Robert A; Sambanthamurthi, Ravigadevi

    2013-01-01

    Oil palm is the most productive oil-bearing crop. Planted on only 5% of the total vegetable oil acreage, palm oil accounts for 33% of vegetable oil, and 45% of edible oil worldwide, but increased cultivation competes with dwindling rainforest reserves. We report the 1.8 gigabase (Gb) genome sequence of the African oil palm Elaeis guineensis, the predominant source of worldwide oil production. 1.535 Gb of assembled sequence and transcriptome data from 30 tissue types were used to predict at least 34,802 genes, including oil biosynthesis genes and homologues of WRINKLED1 (WRI1), and other transcriptional regulators, which are highly expressed in the kernel. We also report the draft sequence of the S. American oil palm Elaeis oleifera, which has the same number of chromosomes (2n=32) and produces fertile interspecific hybrids with E. guineensis, but appears to have diverged in the new world. Segmental duplications of chromosome arms define the palaeotetraploid origin of palm trees. The oil palm sequence enables the discovery of genes for important traits as well as somaclonal epigenetic alterations which restrict the use of clones in commercial plantings, and thus helps achieve sustainability for biofuels and edible oils, reducing the rainforest footprint of this tropical plantation crop. PMID:23883927

  12. Infidelity of SARS-CoV Nsp14-Exonuclease Mutant Virus Replication Is Revealed by Complete Genome Sequencing

    Science.gov (United States)

    Eckerle, Lance D.; Becker, Michelle M.; Halpin, Rebecca A.; Li, Kelvin; Venter, Eli; Lu, Xiaotao; Scherbakova, Sana; Graham, Rachel L.; Baric, Ralph S.; Stockwell, Timothy B.; Spiro, David J.; Denison, Mark R.

    2010-01-01

    Most RNA viruses lack the mechanisms to recognize and correct mutations that arise during genome replication, resulting in quasispecies diversity that is required for pathogenesis and adaptation. However, it is not known how viruses encoding large viral RNA genomes such as the Coronaviridae (26 to 32 kb) balance the requirements for genome stability and quasispecies diversity. Further, the limits of replication infidelity during replication of large RNA genomes and how decreased fidelity impacts virus fitness over time are not known. Our previous work demonstrated that genetic inactivation of the coronavirus exoribonuclease (ExoN) in nonstructural protein 14 (nsp14) of murine hepatitis virus results in a 15-fold decrease in replication fidelity. However, it is not known whether nsp14-ExoN is required for replication fidelity of all coronaviruses, nor the impact of decreased fidelity on genome diversity and fitness during replication and passage. We report here the engineering and recovery of nsp14-ExoN mutant viruses of severe acute respiratory syndrome coronavirus (SARS-CoV) that have stable growth defects and demonstrate a 21-fold increase in mutation frequency during replication in culture. Analysis of complete genome sequences from SARS-ExoN mutant viral clones revealed unique mutation sets in every genome examined from the same round of replication and a total of 100 unique mutations across the genome. Using novel bioinformatic tools and deep sequencing across the full-length genome following 10 population passages in vitro, we demonstrate retention of ExoN mutations and continued increased diversity and mutational load compared to wild-type SARS-CoV. The results define a novel genetic and bioinformatics model for introduction and identification of multi-allelic mutations in replication competent viruses that will be powerful tools for testing the effects of decreased fidelity and increased quasispecies diversity on viral replication, pathogenesis, and

  13. The complete genome sequence of Staphylothermus marinus reveals differences in sulfur metabolism among heterotrophic Crenarchaeota

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain J.; Dharmarajan, Lakshmi; Rodriguez, Jason; Hooper, Sean; Porat, Iris; Ulrich, Luke E.; Elkins, James G.; Mavromatis, Kostas; Sun, Hui; Land, Miriam; Lapidus, Alla; Lucas, Susan; Barry, Kerrie; Huber, Harald; Zhulin, Igor B.; Whitman, William B.; Mukhopadhyay, Biswarup; Woese, Carl; Bristow, James; Kyrpides, Nikos

    2008-09-05

    Staphylothermus marinus is an anaerobic, sulfur-reducing peptide fermenter of the archaeal phylum Crenarchaeota. It is the third heterotrophic, obligate sulfur reducing crenarchaeote to be sequenced and provides an opportunity for comparative analysis of the three genomes. The 1.57 Mbp genome of the hyperthermophilic crenarchaeote Staphylothermus marinus has been completely sequenced. The main energy generating pathways likely involve 2-oxoacid:ferredoxin oxidoreductases and ADP-forming acetyl-CoA synthases. S. marinus possesses several enzymes not present in other crenarchaeotes including a sodium ion-translocating decarboxylase likely to be involved in amino acid degradation. S. marinus lacks sulfur-reducing enzymes present in the other two sulfur-reducing crenarchaeotes that have been sequenced - Thermofilum pendens and Hyperthermus butylicus. Instead it has three operons similar to the mbh and mbx operons of Pyrococcus furiosus, which may play a role in sulfur reduction and/or hydrogen production. The two marine organisms, S. marinus and H. butylicus, possess more sodium-dependent transporters than T. pendens and use symporters for potassium uptake while T. pendens uses an ATP-dependent potassium transporter. T. pendens has adapted to a nutrient-rich environment while H. butylicus is adapted to a nutrient-poor environment, and S. marinus lies between these two extremes. The three heterotrophic sulfur-reducing crenarchaeotes have adapted to their habitats, terrestrial vs. marine, via their transporter content, and they have also adapted to environments with differing levels of nutrients. Despite the fact that they all use sulfur as an electron acceptor, they are likely to have different pathways for sulfur reduction.

  14. The complete genome sequence of Staphylothermus marinus reveals differences in sulfur metabolism among heterotrophic Crenarchaeota

    Directory of Open Access Journals (Sweden)

    Barry Kerrie

    2009-04-01

    Full Text Available Abstract Background: Staphylothermus marinus is an anaerobic, sulfur-reducing peptide fermenter of the archaeal phylum Crenarchaeota. It is the third heterotrophic, obligate sulfur reducing crenarchaeote to be sequenced and provides an opportunity for comparative analysis of the three genomes. Results: The 1.57 Mbp genome of the hyperthermophilic crenarchaeote Staphylothermus marinus has been completely sequenced. The main energy generating pathways likely involve 2-oxoacid:ferredoxin oxidoreductases and ADP-forming acetyl-CoA synthases. S. marinus possesses several enzymes not present in other crenarchaeotes including a sodium ion-translocating decarboxylase likely to be involved in amino acid degradation. S. marinus lacks sulfur-reducing enzymes present in the other two sulfur-reducing crenarchaeotes that have been sequenced – Thermofilum pendens and Hyperthermus butylicus. Instead it has three operons similar to the mbh and mbx operons of Pyrococcus furiosus, which may play a role in sulfur reduction and/or hydrogen production. The two marine organisms, S. marinus and H. butylicus, possess more sodium-dependent transporters than T. pendens and use symporters for potassium uptake while T. pendens uses an ATP-dependent potassium transporter. T. pendens has adapted to a nutrient-rich environment while H. butylicus is adapted to a nutrient-poor environment, and S. marinus lies between these two extremes. Conclusion: The three heterotrophic sulfur-reducing crenarchaeotes have adapted to their habitats, terrestrial vs. marine, via their transporter content, and they have also adapted to environments with differing levels of nutrients. Despite the fact that they all use sulfur as an electron acceptor, they are likely to have different pathways for sulfur reduction.

  15. Complex Routes of Nosocomial Vancomycin-Resistant Enterococcus faecium Transmission Revealed by Genome Sequencing.

    Science.gov (United States)

    Raven, Kathy E; Gouliouris, Theodore; Brodrick, Hayley; Coll, Francesc; Brown, Nicholas M; Reynolds, Rosy; Reuter, Sandra; Török, M Estée; Parkhill, Julian; Peacock, Sharon J

    2017-04-01

    Vancomycin-resistant Enterococcus faecium (VREfm) is a leading cause of nosocomial infection. Here, we describe the utility of whole-genome sequencing in defining nosocomial VREfm transmission. A retrospective study at a single hospital in the United Kingdom identified 342 patients with E. faecium bloodstream infection over 7 years. Of these, 293 patients had a stored isolate and formed the basis for the study. The first stored isolate from each case was sequenced (200 VREfm [197 vanA, 2 vanB, and 1 isolate containing both vanA and vanB], 93 vancomycin-susceptible E. faecium) and epidemiological data were collected. Genomes were also available for E. faecium associated with bloodstream infections in 15 patients in neighboring hospitals, and 456 patients across the United Kingdom and Ireland. The majority of infections in the 293 patients were hospital-acquired (n = 249) or healthcare-associated (n = 42). Phylogenetic analysis showed that 291 of 293 isolates resided in a hospital-associated clade that contained numerous discrete clusters of closely related isolates, indicative of multiple introductions into the hospital followed by clonal expansion associated with transmission. Fine-scale analysis of 6 exemplar phylogenetic clusters containing isolates from 93 patients (32%) identified complex transmission routes that spanned numerous wards and years, extending beyond the detection of conventional infection control. These contained both vancomycin-resistant and -susceptible isolates. We also identified closely related isolates from patients at Cambridge University Hospitals NHS Foundation Trust and regional and national hospitals, suggesting interhospital transmission. These findings provide important insights for infection control practice and signpost areas for interventions. We conclude that sequencing represents a powerful tool for the enhanced surveillance and control of nosocomial E. faecium transmission and infection.

  16. Staphylococcus epidermidis pan-genome sequence analysis reveals diversity of skin commensal and hospital infection-associated isolates.

    Science.gov (United States)

    Conlan, Sean; Mijares, Lilia A; Becker, Jesse; Blakesley, Robert W; Bouffard, Gerard G; Brooks, Shelise; Coleman, Holly; Gupta, Jyoti; Gurson, Natalie; Park, Morgan; Schmidt, Brian; Thomas, Pamela J; Otto, Michael; Kong, Heidi H; Murray, Patrick R; Segre, Julia A

    2012-07-25

    While Staphylococcus epidermidis is commonly isolated from healthy human skin, it is also the most frequent cause of nosocomial infections on indwelling medical devices. Despite its importance, few genome sequences existed and the most frequent hospital-associated lineage, ST2, had not been fully sequenced. We cultivated 71 commensal S. epidermidis isolates from 15 skin sites and compared them with 28 nosocomial isolates from venous catheters and blood cultures. We produced 21 commensal and 9 nosocomial draft genomes, and annotated and compared their gene content, phylogenetic relatedness and biochemical functions. The commensal strains had an open pan-genome with 80% core genes and 20% variable genes. The variable genome was characterized by an overabundance of transposable elements, transcription factors and transporters. Biochemical diversity, as assayed by antibiotic resistance and in vitro biofilm formation, demonstrated the varied phenotypic consequences of this genomic diversity. The nosocomial isolates exhibited both large-scale rearrangements and single-nucleotide variation. We showed that S. epidermidis genomes separate into two phylogenetic groups, one consisting only of commensals. The formate dehydrogenase gene, present only in commensals, is a discriminatory marker between the two groups. Commensal skin S. epidermidis have an open pan-genome and show considerable diversity between isolates, even when derived from a single individual or body site. For ST2, the most common nosocomial lineage, we detect variation between three independent isolates sequenced. Finally, phylogenetic analyses revealed a previously unrecognized group of S. epidermidis strains characterized by reduced virulence and formate dehydrogenase, which we propose as a clinical molecular marker.

  17. Giraffe genome sequence reveals clues to its unique morphology and physiology.

    Science.gov (United States)

    Agaba, Morris; Ishengoma, Edson; Miller, Webb C; McGrath, Barbara C; Hudson, Chelsea N; Bedoya Reina, Oscar C; Ratan, Aakrosh; Burhans, Rico; Chikhi, Rayan; Medvedev, Paul; Praul, Craig A; Wu-Cavener, Lan; Wood, Brendan; Robertson, Heather; Penfold, Linda; Cavener, Douglas R

    2016-05-17

    The origins of giraffe's imposing stature and associated cardiovascular adaptations are unknown. Okapi, which lacks these unique features, is giraffe's closest relative and provides a useful comparison, to identify genetic variation underlying giraffe's long neck and cardiovascular system. The genomes of giraffe and okapi were sequenced, and through comparative analyses genes and pathways were identified that exhibit unique genetic changes and likely contribute to giraffe's unique features. Some of these genes are in the HOX, NOTCH and FGF signalling pathways, which regulate both skeletal and cardiovascular development, suggesting that giraffe's stature and cardiovascular adaptations evolved in parallel through changes in a small number of genes. Mitochondrial metabolism and volatile fatty acids transport genes are also evolutionarily diverged in giraffe and may be related to its unusual diet that includes toxic plants. Unexpectedly, substantial evolutionary changes have occurred in giraffe and okapi in double-strand break repair and centrosome functions.

  18. De novo gene evolution of antifreeze glycoproteins in codfishes revealed by whole genome sequence data.

    Science.gov (United States)

    Baalsrud, Helle Tessand; Tørresen, Ole Kristian; Hongrø Solbakken, Monica; Salzburger, Walter; Hanel, Reinhold; Jakobsen, Kjetill S; Jentoft, Sissel

    2017-12-05

    New genes can arise through duplication of a pre-existing gene or de novo from non-coding DNA, providing raw material for evolution of new functions in response to a changing environment. A prime example is the independent evolution of antifreeze glycoprotein genes (afgps) in the Arctic codfishes and Antarctic notothenioids to prevent freezing. However, the highly repetitive nature of these genes complicates studies of their organization. In notothenioids, afgps evolved from an extant gene, yet the evolutionary origin of afgps in codfishes is unknown. Here, we demonstrate that afgps in codfishes have evolved de novo from non-coding DNA 13-18 Ma, coinciding with the cooling of the Northern Hemisphere. Using whole-genome sequence data from several codfishes and notothenioids, we find higher copy number of afgp in species exposed to more severe freezing suggesting a gene dosage effect. Notably, antifreeze function is lost in one lineage of codfishes analogous to the afgp losses in non-Antarctic notothenioids. This indicates that selection can eliminate the antifreeze function when freezing is no longer imminent. Additionally, we show that evolution of afgp-assisting antifreeze potentiating protein genes (afpps) in notothenioids coincides with origin and lineage-specific losses of afgp. The origin of afgps in codfishes is one of the first examples of an essential gene born from non-coding DNA in a non-model species. Our study underlines the power of comparative genomics to uncover past molecular signatures of genome evolution, and further highlights the impact of de novo gene origin in response to a changing selection regime. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  19. Genomic DNA sequences from mastodon and woolly mammoth reveal deep speciation of forest and savanna elephants.

    Directory of Open Access Journals (Sweden)

    Nadin Rohland

    2010-12-01

    Full Text Available To elucidate the history of living and extinct elephantids, we generated 39,763 bp of aligned nuclear DNA sequence across 375 loci for African savanna elephant, African forest elephant, Asian elephant, the extinct American mastodon, and the woolly mammoth. Our data establish that the Asian elephant is the closest living relative of the extinct mammoth in the nuclear genome, extending previous findings from mitochondrial DNA analyses. We also find that savanna and forest elephants, which some have argued are the same species, are as or more divergent in the nuclear genome as mammoths and Asian elephants, which are considered to be distinct genera, thus resolving a long-standing debate about the appropriate taxonomic classification of the African elephants. Finally, we document a much larger effective population size in forest elephants compared with the other elephantid taxa, likely reflecting species differences in ancient geographic structure and range and differences in life history traits such as variance in male reproductive success.

  20. Assembly of the Lactuca sativa, L. cv. Tizian draft genome sequence reveals differences within major resistance complex 1 as compared to the cv. Salinas reference genome.

    Science.gov (United States)

    Verwaaijen, Bart; Wibberg, Daniel; Nelkner, Johanna; Gordin, Miriam; Rupp, Oliver; Winkler, Anika; Bremges, Andreas; Blom, Jochen; Grosch, Rita; Pühler, Alfred; Schlüter, Andreas

    2018-02-10

    Lettuce (Lactuca sativa, L.) is an important annual plant of the family Asteraceae (Compositae). The commercial lettuce cultivar Tizian has been used in various scientific studies investigating the interaction of the plant with phytopathogens or biological control agents. Here, we present the de novo draft genome sequencing and gene prediction for this specific cultivar derived from transcriptome sequence data. The assembled scaffolds amount to a size of 2.22 Gb. Based on RNAseq data, 31,112 transcript isoforms were identified. Functional predictions for these transcripts were determined within the GenDBE annotation platform. Comparison with the cv. Salinas reference genome revealed a high degree of sequence similarity on genome and transcriptome levels, with an average amino acid identity of 99%. Furthermore, it was observed that two large regions are either missing or are highly divergent within the cv. Tizian genome compared to cv. Salinas. One of these regions covers the major resistance complex 1 region of cv. Salinas. The cv. Tizian draft genome sequence provides a valuable resource for future functional and transcriptome analyses focused on this lettuce cultivar. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Analysis of Draft Genome Sequence of Pseudomonas sp. QTF5 Reveals Its Benzoic Acid Degradation Ability and Heavy Metal Tolerance

    Directory of Open Access Journals (Sweden)

    Yang Li

    2017-01-01

    Full Text Available Pseudomonas sp. QTF5 was isolated from the continuous permafrost near the bitumen layers in the Qiangtang basin of Qinghai-Tibetan Plateau in China (5,111 m above sea level). It is psychrotolerant and highly and widely tolerant to heavy metals and has the ability to metabolize benzoic acid and salicylic acid. To gain insight into the genetic basis for its adaptation, we performed whole genome sequencing and analyzed the resistant genes and metabolic pathways. Based on 120 published and annotated genomes representing 31 species in the genus Pseudomonas, in silico genomic DNA-DNA hybridization (<54%) and average nucleotide identity calculation (<94%) revealed that QTF5 is closest to Pseudomonas lini and should be classified into a novel species. This study provides the genetic basis to identify the genes linked to its specific mechanisms for adaptation to extreme environment and application of this microorganism in environmental conservation.

  2. Time-Resolved Transposon Insertion Sequencing Reveals Genome-Wide Fitness Dynamics during Infection.

    Science.gov (United States)

    Yang, Guanhua; Billings, Gabriel; Hubbard, Troy P; Park, Joseph S; Yin Leung, Ka; Liu, Qin; Davis, Brigid M; Zhang, Yuanxing; Wang, Qiyao; Waldor, Matthew K

    2017-10-03

    Transposon insertion sequencing (TIS) is a powerful high-throughput genetic technique that is transforming functional genomics in prokaryotes, because it enables genome-wide mapping of the determinants of fitness. However, current approaches for analyzing TIS data assume that selective pressures are constant over time and thus do not yield information regarding changes in the genetic requirements for growth in dynamic environments (e.g., during infection). Here, we describe structured analysis of TIS data collected as a time series, termed pattern analysis of conditional essentiality (PACE). From a temporal series of TIS data, PACE derives a quantitative assessment of each mutant's fitness over the course of an experiment and identifies mutants with related fitness profiles. In so doing, PACE circumvents major limitations of existing methodologies, specifically the need for artificial effect size thresholds and enumeration of bacterial population expansion. We used PACE to analyze TIS samples of Edwardsiella piscicida (a fish pathogen) collected over a 2-week infection period from a natural host (the flatfish turbot). PACE uncovered more genes that affect E. piscicida's fitness in vivo than were detected using a cutoff at a terminal sampling point, and it identified subpopulations of mutants with distinct fitness profiles, one of which informed the design of new live vaccine candidates. Overall, PACE enables efficient mining of time series TIS data and enhances the power and sensitivity of TIS-based analyses. IMPORTANCE: Transposon insertion sequencing (TIS) enables genome-wide mapping of the genetic determinants of fitness, typically based on observations at a single sampling point. Here, we move beyond analysis of endpoint TIS data to create a framework for analysis of time series TIS data, termed pattern analysis of conditional essentiality (PACE). We applied PACE to identify genes that contribute to colonization of a natural host by the fish pathogen

  3. Evolution of the RH gene family in vertebrates revealed by brown hagfish (Eptatretus atami) genome sequences.

    Science.gov (United States)

    Suzuki, Akinori; Komata, Hidero; Iwashita, Shogo; Seto, Shotaro; Ikeya, Hironobu; Tabata, Mitsutoshi; Kitano, Takashi

    2017-02-01

    In vertebrates, there are four major genes in the RH (Rhesus) gene family, RH, RHAG, RHBG, and RHCG. These genes are thought to have been formed by the two rounds of whole-genome duplication (2R-WGD) in the common ancestor of all vertebrates. In our previous work, where we analyzed details of the gene duplication process of this gene family, three nucleotide sequences belonging to this family were identified in Far Eastern brook lamprey (Lethenteron reissneri), and the phylogenetic positions of the genes were determined. Lampreys, along with hagfishes, are cyclostomata (jawless fishes), which is a sister group of gnathostomata (jawed vertebrates). Although those results suggested that one gene was orthologous to the gnathostome RHCG genes, we did not identify clear orthologues for other genes. In this study, therefore, we identified three novel cDNA sequences that belong to the RH gene family using de novo transcriptome analysis of another cyclostome: the brown hagfish (Eptatretus atami). We also determined the nucleotide sequences for the RHBG and RHCG genes in a red stingray (Dasyatis akajei), which belongs to the cartilaginous fishes. The phylogenetic tree showed that two brown hagfish genes, which were probably duplicated in the cyclostome lineage, formed a cluster with the gnathostome RHAG genes, whereas another brown hagfish gene formed a cluster with the gnathostome RHCG genes. We estimated that the RH genes had a higher evolutionary rate than the RHAG, RHBG, and RHCG genes. Interestingly, in the RHBG genes, only the bird lineage showed a higher rate of nonsynonymous substitutions. It is likely that this higher rate was caused by a state of relaxed functional constraints rather than by positive selection or pseudogenization. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. Mitochondrial genome sequencing reveals potential origins of the scabies mite Sarcoptes scabiei infesting two iconic Australian marsupials.

    Science.gov (United States)

    Fraser, Tamieka A; Shao, Renfu; Fountain-Jones, Nicholas M; Charleston, Michael; Martin, Alynn; Whiteley, Pam; Holme, Roz; Carver, Scott; Polkinghorne, Adam

    2017-11-28

    Debilitating skin infestations caused by the mite, Sarcoptes scabiei, have a profound impact on human and animal health globally. In Australia, this impact is evident across different segments of Australian society, with a growing recognition that it can contribute to rapid declines of native Australian marsupials. Cross-host transmission has been suggested to play a significant role in the epidemiology and origin of mite infestations in different species but a chronic lack of genetic resources has made further inferences difficult. To investigate the origins and molecular epidemiology of S. scabiei in Australian wildlife, we sequenced the mitochondrial genomes of S. scabiei from diseased wombats (Vombatus ursinus) and koalas (Phascolarctos cinereus) spanning New South Wales, Victoria and Tasmania, and compared them with the recently sequenced mitochondrial genome sequences of S. scabiei from humans. We found unique S. scabiei haplotypes among individual wombat and koala hosts with high sequence similarity (99.1% - 100%). Phylogenetic analysis of near full-length mitochondrial genomes revealed three clades of S. scabiei (one human and two marsupial), with no apparent geographic or host species pattern, suggestive of multiple introductions. The availability of additional mitochondrial gene sequences also enabled a re-evaluation of a range of putative molecular markers of S. scabiei, revealing that cox1 is the most informative gene for molecular epidemiological investigations. Utilising this gene target, we provide additional evidence to support cross-host transmission between different animal hosts. Our results suggest a history of parasite invasion through colonisation of Australia from hosts across the globe and the potential for cross-host transmission being a common feature of the epidemiology of this neglected pathogen. If this is the case, comparable patterns may exist elsewhere in the 'New World'. This work provides a basis for expanded molecular studies into

  5. Whole-genome sequencing reveals the mechanisms for evolution of streptomycin resistance in Lactobacillus plantarum.

    Science.gov (United States)

    Zhang, Fuxin; Gao, Jiayuan; Wang, Bini; Huo, Dongxue; Wang, Zhaoxia; Zhang, Jiachao; Shao, Yuyu

    2018-04-01

    In this research, we investigated the evolution of streptomycin resistance in Lactobacillus plantarum ATCC14917, which was passaged in medium containing a gradually increasing concentration of streptomycin. After 25 d, the minimum inhibitory concentration (MIC) of L. plantarum ATCC14917 had reached 131,072 µg/mL, which was 8,192-fold higher than the MIC of the original parent isolate. The highly resistant L. plantarum ATCC14917 isolate was then passaged in antibiotic-free medium to determine the stability of resistance. The MIC value of the L. plantarum ATCC14917 isolate decreased to 2,048 µg/mL after 35 d but remained constant thereafter, indicating that resistance was irreversible even in the absence of selection pressure. Whole-genome sequencing of parent isolates, control isolates, and isolates following passage was used to study the resistance mechanism of L. plantarum ATCC14917 to streptomycin and adaptation in the presence and absence of selection pressure. Five mutated genes (single nucleotide polymorphisms and structural variants) were verified in highly resistant L. plantarum ATCC14917 isolates, which were related to ribosomal protein S12, LPXTG-motif cell wall anchor domain protein, LrgA family protein, Ser/Thr phosphatase family protein, and a hypothetical protein that may correlate with resistance to streptomycin. After passage in streptomycin-free medium, only the mutant gene encoding ribosomal protein S12 remained; the other 4 mutant genes had reverted to the wild type as found in the parent isolate. Although the MIC value of L. plantarum ATCC14917 was reduced in the absence of selection pressure, it remained 128-fold higher than the MIC value of the parent isolate, indicating that ribosomal protein S12 may play an important role in streptomycin resistance. Using the mobile elements database, we demonstrated that streptomycin resistance-related genes in L. plantarum ATCC14917 were not located on mobile elements. This research offers a way of

  6. Insights into Genome Plasticity and Pathogenicity of the Plant Pathogenic Bacterium Xanthomonas campestris pv. vesicatoria Revealed by the Complete Genome Sequence

    Science.gov (United States)

    Thieme, Frank; Koebnik, Ralf; Bekel, Thomas; Berger, Carolin; Boch, Jens; Büttner, Daniela; Caldana, Camila; Gaigalat, Lars; Goesmann, Alexander; Kay, Sabine; Kirchner, Oliver; Lanz, Christa; Linke, Burkhard; McHardy, Alice C.; Meyer, Folker; Mittenhuber, Gerhard; Nies, Dietrich H.; Niesbach-Klösgen, Ulla; Patschkowski, Thomas; Rückert, Christian; Rupp, Oliver; Schneiker, Susanne; Schuster, Stephan C.; Vorhölter, Frank-Jörg; Weber, Ernst; Pühler, Alfred; Bonas, Ulla; Bartels, Daniela; Kaiser, Olaf

    2005-01-01

    The gram-negative plant-pathogenic bacterium Xanthomonas campestris pv. vesicatoria is the causative agent of bacterial spot disease in pepper and tomato plants, which leads to economically important yield losses. This pathosystem has become a well-established model for studying bacterial infection strategies. Here, we present the whole-genome sequence of the pepper-pathogenic Xanthomonas campestris pv. vesicatoria strain 85-10, which comprises a 5.17-Mb circular chromosome and four plasmids. The genome has a high G+C content (64.75%) and signatures of extensive genome plasticity. Whole-genome comparisons revealed a gene order similar to both Xanthomonas axonopodis pv. citri and Xanthomonas campestris pv. campestris and a structure completely different from Xanthomonas oryzae pv. oryzae. A total of 548 coding sequences (12.2%) are unique to X. campestris pv. vesicatoria. In addition to a type III secretion system, which is essential for pathogenicity, the genome of strain 85-10 encodes all other types of protein secretion systems described so far in gram-negative bacteria. Remarkably, one of the putative type IV secretion systems encoded on the largest plasmid is similar to the Icm/Dot systems of the human pathogens Legionella pneumophila and Coxiella burnetii. Comparisons with other completely sequenced plant pathogens predicted six novel type III effector proteins and several other virulence factors, including adhesins, cell wall-degrading enzymes, and extracellular polysaccharides. PMID:16237009

  7. Whole Genome Sequencing Reveals Potential New Targets for Improving Nitrogen Uptake and Utilization in Sorghum bicolor

    Directory of Open Access Journals (Sweden)

    Karen Massel

    2016-10-01

    Full Text Available Nitrogen (N) fertilizers are a major agricultural input where more than 100 million tons are supplied annually. Cereals are particularly inefficient at soil N uptake, where the unrecovered nitrogen causes serious environmental damage. Sorghum bicolor (sorghum) is an important cereal crop, particularly in resource-poor semi-arid regions, and is known to have a high NUE in comparison to other major cereals under limited N conditions. This study provides the first assessment of genetic diversity and signatures of selection across 230 fully sequenced genes putatively involved in the uptake and mobilization of N from a diverse panel of sorghum lines. This comprehensive analysis reveals an overall reduction in diversity as a result of domestication and a total of 128 genes displaying signatures of purifying selection, thereby revealing possible gene targets to improve NUE in sorghum and cereals alike. A number of key genes appear to have been involved in selective sweeps, reducing their sequence diversity. The ammonium transporter (AMT) genes generally had low allelic diversity, whereas a substantial number of nitrate/peptide transporter 1 (NRT1/PTR) genes had higher nucleotide diversity in domesticated germplasm. Interestingly, members of the distinct race Guinea margaritiferum contained a number of unique alleles, and along with the wild sorghum species, represent a rich resource of new variation for plant improvement of NUE in sorghum.

  8. Exploring evidence of positive selection reveals genetic basis of meat quality traits in Berkshire pigs through whole genome sequencing.

    Science.gov (United States)

    Jeong, Hyeonsoo; Song, Ki-Duk; Seo, Minseok; Caetano-Anollés, Kelsey; Kim, Jaemin; Kwak, Woori; Oh, Jae-Don; Kim, EuiSoo; Jeong, Dong Kee; Cho, Seoae; Kim, Heebal; Lee, Hak-Kyo

    2015-08-20

    Natural and artificial selection following domestication has led to the existence of more than a hundred pig breeds, as well as incredible variation in phenotypic traits. Berkshire pigs are regarded as having superior meat quality compared to other breeds. As the meat production industry seeks selective breeding approaches to improve profitable traits such as meat quality, information about genetic determinants of these traits is in high demand. However, most of the studies have been performed using trained sensory panel analysis without investigating the underlying genetic factors. Here we investigate the relationship between genomic composition and this phenotypic trait by scanning for signatures of positive selection in whole-genome sequencing data. We generated genomes of 10 Berkshire pigs at a total of 100.6 coverage depth, using the Illumina Hiseq2000 platform. Along with the genomes of 11 Landrace and 13 Yorkshire pigs, we identified genomic variants of 18.9 million SNVs and 3.4 million Indels in the mapped regions. We identified several associated genes related to lipid metabolism, intramuscular fatty acid deposition, and muscle fiber type which contribute to pork quality (TG, FABP1, AKIRIN2, GLP2R, TGFBR3, JPH3, ICAM2, and ERN1) by applying between-population statistical tests (XP-EHH and XP-CLR). A statistical enrichment test was also conducted to detect breed-specific genetic variation. In addition, a de novo short sequence read assembly strategy identified several candidate genes (SLC25A14, IGF1, PI4KA, CACNA1A) as also contributing to lipid metabolism. Results revealed several candidate genes involved in Berkshire meat quality; most of these genes are involved in lipid metabolism and intramuscular fat deposition. These results can provide a basis for future research on the genomic characteristics of Berkshire pigs.

  9. Mitochondrial genome sequences reveal evolutionary relationships of the Phytophthora 1c clade species.

    Science.gov (United States)

    Lassiter, Erica S; Russ, Carsten; Nusbaum, Chad; Zeng, Qiandong; Saville, Amanda C; Olarte, Rodrigo A; Carbone, Ignazio; Hu, Chia-Hui; Seguin-Orlando, Andaine; Samaniego, Jose A; Thorne, Jeffrey L; Ristaino, Jean B

    2015-11-01

    Phytophthora infestans is one of the most destructive plant pathogens of potato and tomato globally. The pathogen is closely related to four other Phytophthora species in the 1c clade including P. phaseoli, P. ipomoeae, P. mirabilis and P. andina that are important pathogens of other wild and domesticated hosts. P. andina is an interspecific hybrid between P. infestans and an unknown Phytophthora species. We have sequenced mitochondrial genomes of the sister species of P. infestans and examined the evolutionary relationships within the clade. Phylogenetic analysis indicates that the P. phaseoli mitochondrial lineage is basal within the clade. P. mirabilis and P. ipomoeae are sister lineages and share a common ancestor with the Ic mitochondrial lineage of P. andina. These lineages in turn are sister to the P. infestans and P. andina Ia mitochondrial lineages. The P. andina Ic lineage diverged much earlier than the P. andina Ia mitochondrial lineage and P. infestans. The presence of two mitochondrial lineages in P. andina supports the hybrid nature of this species. The ancestral state of the P. andina Ic lineage in the tree and its occurrence only in the Andean regions of Ecuador, Colombia and Peru suggest that this hybrid species may have originated there.

  10. Whole genome sequencing reveals mycobacterial microevolution among concurrent isolates from sputum and blood in HIV infected TB patients.

    Science.gov (United States)

    Ssengooba, Willy; de Jong, Bouke C; Joloba, Moses L; Cobelens, Frank G; Meehan, Conor J

    2016-08-05

    In the context of advanced immunosuppression, M. tuberculosis is known to cause detectable mycobacteremia. However, little is known about the intra-patient mycobacterial microevolution and the direction of seeding between the sputum and blood compartments. From a diagnostic study of HIV-infected TB patients, 51 pairs of concurrent blood and sputum M. tuberculosis isolates from the same patient were available. In a previous analysis, we identified a subset with genotypic concordance, based on spoligotyping and 24-locus MIRU-VNTR. These paired isolates with identical genotypes were analyzed by whole genome sequencing and phylogenetic analysis. Of the 25 concordant pairs (49% of the 51 paired isolates), 15 (60%) remained viable for extraction of high quality DNA for whole genome sequencing. Two patient pairs were excluded due to poor quality sequence reads. The median CD4 cell count was 32 (IQR 16-101)/mm³ and ten (77%) patients were on ART. No drug resistance mutations were identified in any of the sequences analyzed. Three (23.1%) of 13 patients had SNPs separating paired isolates from blood and sputum compartments, indicating evidence of microevolution. Using a phylogenetic approach to identify the ancestral compartment, in two (15%) patients the blood isolate was ancestral to the sputum isolate, in one (8%) it was the opposite, and ten (77%) of the pairs were identical. Among HIV-infected patients with poor cellular immunity, infection with multiple strains of M. tuberculosis was found in half of the patients. In those patients with identical strains, whole genome sequencing indicated that M. tuberculosis intra-patient microevolution does occur in a few patients, yet did not reveal a consistent direction of spread between sputum and blood. This suggests that these compartments are highly connected and potentially seed each other repeatedly.

  11. Lichen Biosynthetic Gene Clusters. Part I. Genome Sequencing Reveals a Rich Biosynthetic Potential.

    Science.gov (United States)

    Bertrand, Robert L; Abdel-Hameed, Mona; Sorensen, John L

    2018-02-27

    Lichens are symbionts of fungi and algae that produce diverse secondary metabolites with useful properties. Little is known of lichen natural product biosynthesis because of the challenges of working with lichenizing fungi. We describe the first attempt to comprehensively profile the genetic secondary metabolome of a lichenizing fungus. An Illumina platform combined with the Antibiotics and Secondary Metabolites Analysis Shell (FungiSMASH, version 4.0) was used to sequence and annotate assembled contigs of the fungal partner of Cladonia uncialis. Up to 48 putative gene clusters are described comprising type I and type III polyketide synthases (PKS), nonribosomal peptide synthetases (NRPS), hybrid PKS-NRPS, and terpene synthases. The number of gene clusters revealed by this work dwarfs the number of known secondary metabolites from C. uncialis, suggesting that lichenizing fungi have an unexplored biosynthetic potential.

  12. The complete genome sequence of Bacillus velezensis 9912D reveals its biocontrol mechanism as a novel commercial biological fungicide agent.

    Science.gov (United States)

    Pan, Hua-Qi; Li, Qing-Lian; Hu, Jiang-Chun

    2017-04-10

    A Bacillus sp. 9912 mutant, 9912D, was approved as a new biological fungicide agent by the Ministry of Agriculture of the People's Republic of China in 2016 owing to its excellent inhibitory effect on various plant pathogens and being environment-friendly. Here, we present the genome of 9912D with a circular chromosome having 4436 coding DNA sequences (CDSs), and a circular plasmid encoding 59 CDSs. This strain was finally designated as Bacillus velezensis based on phylogenomic analyses. Genome analysis revealed a total of 19 candidate gene clusters involved in secondary metabolite biosynthesis, including potential new type II lantibiotics. The absence of fengycin biosynthetic gene cluster is noteworthy. Our data offer insights into the genetic, biological and physiological characteristics of this strain and aid in deeper understanding of its biocontrol mechanism. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Genome sequencing of Listeria monocytogenes "Quargel" listeriosis outbreak strains reveals two different strains with distinct in vitro virulence potential.

    Directory of Open Access Journals (Sweden)

    Kathrin Rychli

    Full Text Available A large listeriosis outbreak occurred in Austria, Germany and the Czech Republic in 2009 and 2010. The outbreak was traced back to a traditional Austrian curd cheese called "Quargel" which was contaminated with two distinct serovar 1/2a Listeria monocytogenes strains (QOC1 and QOC2). In this study we sequenced and analysed the genomes of both outbreak strains in order to investigate the extent of genetic diversity between the two strains belonging to MLST sequence types 398 (QOC2) and 403 (QOC1). Both genomes are highly similar, but also display distinct properties: The QOC1 genome is approximately 74 kbp larger than the QOC2 genome. In addition, the strains harbour 93 (QOC1) and 45 (QOC2) genes encoding strain-specific proteins. A 21 kbp region showing highest similarity to plasmid pLMIV encoding three putative internalins is integrated in the QOC1 genome. In contrast to QOC1, strain QOC2 harbours a vip homologue, which encodes a LPXTG surface protein involved in cell invasion. In accordance, in vitro virulence assays revealed distinct differences in invasion efficiency and intracellular proliferation within different cell types. The higher virulence potential of QOC1 in non-phagocytic cells may be explained by the presence of additional internalins in the pLMIV-like region, whereas the higher invasion capability of QOC2 into phagocytic cells may be due to the presence of a vip homologue. In addition, both strains show differences in stress-related gene content. Strain QOC1 encodes a so-called stress survival islet 1, whereas strain QOC2 harbours a homologue of the uncharacterized LMOf2365_0481 gene. Consistently, QOC1 shows higher resistance to acidic, alkaline and gastric stress. In conclusion, our results show that strain QOC1 and QOC2 are distinct and did not recently evolve from a common ancestor.

  14. The complete genome sequence of Bacillus velezensis strain GH1-13 reveals agriculturally beneficial properties and a unique plasmid.

    Science.gov (United States)

    Kim, Sang Yoon; Song, Hajin; Sang, Mee Kyung; Weon, Hang-Yeon; Song, Jaekyeong

    2017-10-10

    The bacterial strain Bacillus velezensis GH1-13, isolated from rice paddy soil in Korea, has been shown to promote plant growth and have strong antagonistic activities against pathogens. Here, we report the complete genome sequence of GH1-13, revealing that it possesses a single 4,071,980-bp circular chromosome with 46.2% GC-content. The chromosome encodes 3,930 genes, and we have also identified a unique plasmid in the strain that encodes a further 104 genes (71,628 bp and 31.7% GC-content). The genome was found to contain various enzyme-encoding operons, including indole-3-acetic acid (IAA) biosynthesis proteins, 2,3-butanediol dehydrogenase, various non-ribosomal peptide synthetases, and several polyketide synthases. These properties are responsible for the promotion of plant growth and the biosynthesis of secondary metabolites. They therefore have multiple beneficial effects that could be applied to agriculture. Through curing, we found that the unique plasmid of GH1-13 has important roles in the production of phytohormones, such as IAA, and in shaping phenotypic and physiological characteristics. The plasmid therefore likely influences the biological activities of GH1-13. The complete genome sequence of B. velezensis GH1-13 contributes to our understanding of this beneficial strain and will encourage research into its development for agricultural or biotechnological applications, enhancing productivity and crop quality. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Genomic fossils reveal adaptation of non-autonomous pararetroviruses driven by concerted evolution of noncoding regulatory sequences.

    Science.gov (United States)

    Chen, Sunlu; Zheng, Huizhen; Kishima, Yuji

    2017-06-01

    The interplay of different virus species in a host cell after infection can affect the adaptation of each virus. Endogenous viral elements, such as endogenous pararetroviruses (PRVs), have arisen from vertical inheritance of viral sequences integrated into host germline genomes. As viral genomic fossils, these sequences can thus serve as valuable paleogenomic data to study the long-term evolutionary dynamics of virus-virus interactions, but they have rarely been applied for this purpose. All extant PRVs have been considered autonomous species in their parasitic life cycle in host cells. Here, we provide evidence for multiple non-autonomous PRV species with structural defects in viral activity that have frequently infected ancient grass hosts and adapted through interplay between viruses. Our paleogenomic analyses using endogenous PRVs in grass genomes revealed that these non-autonomous PRV species have participated in interplay with autonomous PRVs in a possible commensal partnership, or, alternatively, with one another in a possible mutualistic partnership. These partnerships, which have been established by the sharing of noncoding regulatory sequences (NRSs) in intergenic regions between two partner viruses, have been further maintained and altered by the sequence homogenization of NRSs between partners. Strikingly, we found that frequent region-specific recombination, rather than mutation selection, is the main causative mechanism of NRS homogenization. Our results, obtained from ancient DNA records of viruses, suggest that adaptation of PRVs has occurred by concerted evolution of NRSs between different virus species in the same host. Our findings further imply that evaluation of within-host NRS interactions within and between populations of viral pathogens may be important.

  16. Whole Genome Sequencing of Danish Staphylococcus argenteus Reveals a Genetically Diverse Collection with Clear Separation from Staphylococcus aureus

    DEFF Research Database (Denmark)

    Hansen, Thomas A.; Bartels, Mette D.; Hogh, Silje V.

    2017-01-01

    Staphylococcus argenteus (S. argenteus) is a newly identified Staphylococcus species that has been misidentified as Staphylococcus aureus (S. aureus) and is clinically relevant. We identified 25 S. argenteus genomes in our collection of whole genome sequenced S. aureus. These genomes were compare...

  17. Chromosome-specific sequencing reveals an extensive dispensable genome component in wheat

    Czech Academy of Sciences Publication Activity Database

    Liu, M.; Stiller, J.; Holušová, Kateřina; Vrána, Jan; Liu, D.; Doležel, Jaroslav; Liu, C.

    2016-01-01

    Vol. 6, Nov 8 (2016), article no. 36398. ISSN 2045-2322 R&D Projects: GA MŠk(CZ) LO1204; GA ČR GBP501/12/G090 Institutional support: RVO:61389030 Keywords : triticum-aestivum l. * fusarium crown rot * pan-genome * hexaploid wheat * bread wheat * draft genome * rna-seq * maize * transcriptome Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 4.259, year: 2016

  18. Sequence-Based Mapping and Genome Editing Reveal Mutations in Stickleback Hps5 Cause Oculocutaneous Albinism and the casper Phenotype

    Directory of Open Access Journals (Sweden)

    James C. Hart

    2017-09-01

    Full Text Available Here, we present and characterize the spontaneous X-linked recessive mutation casper, which causes oculocutaneous albinism in threespine sticklebacks (Gasterosteus aculeatus). In humans, Hermansky-Pudlak syndrome results in pigmentation defects due to disrupted formation of the melanin-containing lysosomal-related organelle (LRO), the melanosome. casper mutants display not only reduced pigmentation of melanosomes in melanophores, but also reductions in the iridescent silver color from iridophores, while the yellow pigmentation from xanthophores appears unaffected. We mapped casper using high-throughput sequencing of genomic DNA from bulked casper mutants to a region of the stickleback X chromosome (chromosome 19) near the stickleback ortholog of Hermansky-Pudlak syndrome 5 (Hps5). casper mutants have an insertion of a single nucleotide in the sixth exon of Hps5, predicted to generate an early frameshift. Genome editing using CRISPR/Cas9 induced lesions in Hps5 and phenocopied the casper mutation. Injecting single or paired Hps5 guide RNAs revealed higher incidences of genomic deletions from paired guide RNAs compared to single gRNAs. Stickleback Hps5 provides a genetic system where a hemizygous locus in XY males and a diploid locus in XX females can be used to generate an easily scored visible phenotype, facilitating quantitative studies of different genome editing approaches. Lastly, we show the ability to better visualize patterns of fluorescent transgenic reporters in Hps5 mutant fish. Thus, Hps5 mutations present an opportunity to study pigmented LROs in the emerging stickleback model system, as well as a tool to aid in assaying genome editing and visualizing enhancer activity in transgenic fish.

  19. Genome sequence comparison reveals a candidate gene involved in male-hermaphrodite differentiation in papaya (Carica papaya) trees.

    Science.gov (United States)

    Ueno, Hiroki; Urasaki, Naoya; Natsume, Satoshi; Yoshida, Kentaro; Tarora, Kazuhiko; Shudo, Ayano; Terauchi, Ryohei; Matsumura, Hideo

    2015-04-01

    The sex type of papaya (Carica papaya) is determined by the pair of sex chromosomes (XX, female; XY, male; and XY(h), hermaphrodite), in which there is a non-recombining genomic region in the Y and Y(h) chromosomes. This region is presumed to be involved in determination of males and hermaphrodites; it is designated as the male-specific region in the Y chromosome (MSY) and the hermaphrodite-specific region in the Y(h) chromosome (HSY). Here, we identified the genes determining male and hermaphrodite sex types by comparing MSY and HSY genomic sequences. In the MSY and HSY genomic regions, we identified 14,528 nucleotide substitutions and 965 short indels with a large gap and two highly diverged regions. In the predicted genes expressed in flower buds, we found no nucleotide differences leading to amino acid changes between the MSY and HSY. However, we found an HSY-specific transposon insertion in a gene (SVP like) showing a similarity to the Short Vegetative Phase (SVP) gene. Study of SVP-like transcripts revealed that the MSY allele encoded an intact protein, while the HSY allele encoded a truncated protein. Our findings demonstrated that the SVP-like gene is a candidate gene for male-hermaphrodite determination in papaya.

  20. Genome Sequencing and Mapping Reveal Loss of Heterozygosity as a Mechanism for Rapid Adaptation in the Vegetable Pathogen Phytophthora capsici

    Energy Technology Data Exchange (ETDEWEB)

    Lamour, Kurt H.; Mudge, Joann; Gobena, Daniel; Hurtado-Gonzales, Oscar P.; Schmutz, Jeremy; Kuo, Alan; Miller, Neil A.; Rice, Brandon J.; Raffaele, Sylvain; Cano, Liliana M.; Bharti, Arvind K.; Donahoo, Ryan S.; Finely, Sabra; Huitema, Edgar; Hulvey, Jon; Platt, Darren; Salamov, Asaf; Savidor, Alon; Sharma, Rahul; Stam, Remco; Sotrey, Dylan; Thines, Marco; Win, Joe; Haas, Brian J.; Dinwiddie, Darrell L.; Jenkins, Jerry; Knight, James R.; Affourtit, Jason P.; Han, Cliff S.; Chertkov, Olga; Lindquist, Erika A.; Detter, Chris; Grigoriev, Igor V.; Kamoun, Sophien; Kingsmore, Stephen F.

    2012-02-07

    The oomycete vegetable pathogen Phytophthora capsici has shown remarkable adaptation to fungicides and new hosts. Like other members of this destructive genus, P. capsici has an explosive epidemiology, rapidly producing massive numbers of asexual spores on infected hosts. In addition, P. capsici can remain dormant for years as sexually recombined oospores, making it difficult to produce crops at infested sites, and allowing outcrossing populations to maintain significant genetic variation. Genome sequencing, development of a high-density genetic map, and integrative genomic or genetic characterization of P. capsici field isolates and intercross progeny revealed significant mitotic loss of heterozygosity (LOH) in diverse isolates. LOH was detected in clonally propagated field isolates and sexual progeny, cumulatively affecting >30% of the genome. LOH altered genotypes for more than 11,000 single-nucleotide variant sites and showed a strong association with changes in mating type and pathogenicity. Overall, it appears that LOH may provide a rapid mechanism for fixing alleles and may be an important component of adaptability for P. capsici.

  1. Nuclear genomic sequences reveal that polar bears are an old and distinct bear lineage.

    Science.gov (United States)

    Hailer, Frank; Kutschera, Verena E; Hallström, Björn M; Klassert, Denise; Fain, Steven R; Leonard, Jennifer A; Arnason, Ulfur; Janke, Axel

    2012-04-20

    Recent studies have shown that the polar bear matriline (mitochondrial DNA) evolved from a brown bear lineage since the late Pleistocene, potentially indicating rapid speciation and adaption to arctic conditions. Here, we present a high-resolution data set from multiple independent loci across the nuclear genomes of a broad sample of polar, brown, and black bears. Bayesian coalescent analyses place polar bears outside the brown bear clade and date the divergence much earlier, in the middle Pleistocene, about 600 (338 to 934) thousand years ago. This provides more time for polar bear evolution and confirms previous suggestions that polar bears carry introgressed brown bear mitochondrial DNA due to past hybridization. Our results highlight that multilocus genomic analyses are crucial for an accurate understanding of evolutionary history.

  2. Complete genome sequence of Enterobacter cloacae R11 reveals multiple genes potentially associated with high-level polymyxin E resistance.

    Science.gov (United States)

    Zhong, Chuanqing; Zhang, Chao; Fu, Jiafang; Chen, Wenbing; Jiang, Tianyi; Cao, Guangxiang

    2018-01-01

    Enterobacter cloacae strain R11 is a multidrug-resistant bacterium isolated from sewage water near a swine feedlot in China. Strain R11 can survive in medium containing up to 192 μg/mL polymyxin E, indicating a tolerance for this antibiotic that is significantly higher than that reported for other gram-negative bacteria. In this study, conjugation experiments showed that partial polymyxin E resistance could be transferred from strain R11 to Escherichia coli strain 25922, revealing that some genes related to polymyxin E resistance are plasmid-based. The complete genome sequence of this strain was determined, yielding a total of 4 993 008 bp (G+C content, 53.15%) and 4908 genes for the circular chromosome and 4 circular plasmids. Genome analysis revealed a total of 73 putative antibiotic resistance genes, including several polymyxin E resistance genes and genes potentially involved in multidrug resistance. These data provide insights into the genetic basis of the polymyxin E resistance and multidrug resistance of E. cloacae.

  3. Genome sequence of the Asian Tiger mosquito, Aedes albopictus, reveals insights into its biology, genetics, and evolution

    NARCIS (Netherlands)

    Chen, X.G.; Jiang, X.; Gu, J.; Xu, M.; Wu, Y.; Deng, Y.; Zhang, C.; Bonizzoni, M.; Dermauw, W.; Vontas, J.; Armbruster, P.; Huang, X.; Yang, Y.; Zhang, H.; He, W.; Peng, H.; Liu, Y.; Wu, K.; Chen, J.; Lirakis, M.; Topalis, P.; Van Leeuwen, T.; Hall, B.A.; Thorpe, C.; Mueller, R.L.; Sun, C.; Waterhouse, R.M.; Yan, G.; Tu, Z.J.; Fang, X.; James, A.A.

    2015-01-01

    The Asian tiger mosquito, Aedes albopictus, is a highly successful invasive species that transmits a number of human viral diseases, including dengue and Chikungunya fevers. This species has a large genome with significant population-based size variation. The complete genome sequence was determined

  4. Reduced representation bisulphite sequencing of the cattle genome reveals DNA methylation patterns

    Science.gov (United States)

    Using reduced representation bisulphite sequencing (RRBS), we obtained the first single-base-resolution maps of bovine DNA methylation in ten somatic tissues. In total, we observed 1,868,049 cytosines in the CG-enriched regions. Similar to the methylation patterns in other species, the CG context wa...

  5. Genome re-sequencing of semi-wild soybean reveals a complex Soja population structure and deep introgression.

    Science.gov (United States)

    Qiu, Jie; Wang, Yu; Wu, Sanling; Wang, Ying-Ying; Ye, Chu-Yu; Bai, Xuefei; Li, Zefeng; Yan, Chenghai; Wang, Weidi; Wang, Ziqiang; Shu, Qingyao; Xie, Jiahua; Lee, Suk-Ha; Fan, Longjiang

    2014-01-01

    Semi-wild soybean is a unique type of soybean that retains both wild and domesticated characteristics, which provides an important intermediate type for understanding the evolution of the subgenus Soja population in the Glycine genus. In this study, a semi-wild soybean line (Maliaodou) and a wild line (Lanxi 1) collected from the lower Yangtze regions were deeply sequenced while nine other semi-wild lines were sequenced to a 3-fold genome coverage. Sequence analysis revealed that (1) no independent phylogenetic branch covering all 10 semi-wild lines was observed in the Soja phylogenetic tree; (2) besides two distinct subpopulations of wild and cultivated soybean in the Soja population structure, all semi-wild lines were mixed with some wild lines into a subpopulation rather than an independent one or an intermediate transition type of soybean domestication; (3) high heterozygous rates (0.19-0.49) were observed in several semi-wild lines; and (4) over 100 putative selective regions were identified by selective sweep analysis, including those related to the development of seed size. Our results suggested a hybridization origin for the semi-wild soybean, which makes a complex Soja population structure.

  6. Genome re-sequencing of semi-wild soybean reveals a complex Soja population structure and deep introgression.

    Directory of Open Access Journals (Sweden)

    Jie Qiu

    Full Text Available Semi-wild soybean is a unique type of soybean that retains both wild and domesticated characteristics, which provides an important intermediate type for understanding the evolution of the subgenus Soja population in the Glycine genus. In this study, a semi-wild soybean line (Maliaodou) and a wild line (Lanxi 1) collected from the lower Yangtze regions were deeply sequenced while nine other semi-wild lines were sequenced to a 3-fold genome coverage. Sequence analysis revealed that (1) no independent phylogenetic branch covering all 10 semi-wild lines was observed in the Soja phylogenetic tree; (2) besides two distinct subpopulations of wild and cultivated soybean in the Soja population structure, all semi-wild lines were mixed with some wild lines into a subpopulation rather than an independent one or an intermediate transition type of soybean domestication; (3) high heterozygous rates (0.19-0.49) were observed in several semi-wild lines; and (4) over 100 putative selective regions were identified by selective sweep analysis, including those related to the development of seed size. Our results suggested a hybridization origin for the semi-wild soybean, which makes a complex Soja population structure.

  7. Genome sequence of the button mushroom Agaricus bisporus reveals mechanisms governing adaptation to a humic-rich ecological niche

    Energy Technology Data Exchange (ETDEWEB)

    Morin, Emmanuelle; Kohler, Annegret; Baker, Adam R.; Foulongne-Oriol, Marie; Lombard, Vincent; Nagy, Laszlo G.; Ohm, Robin A.; Patyshakuliyeva, Aleksandrina; Brun, Annick; Aerts, Andrea L.; Bailey, Andrew M.; Billette, Christophe; Coutinho, Pedro M.; Deakin, Greg; Doddapaneni, Harshavardhan; Floudas, Dimitrios; Grimwood, Jane; Hilden, Kristiina; Kues, Ursula; LaButti, Kurt M.; Lapidus, Alla; Lindquist, Erika A.; Lucas, Susan M.; Murat, Claude; Riley, Robert W.; Salamov, Asaf A.; Schmutz, Jeremy; Subramanian, Venkataramanan; Wosten, Han A. B.; Xu, Jianping; Eastwood, Daniel C.; Foster, Gary D.; Sonnenberg, Anton S. M.; Cullen, Dan; de Vries, Ronald P.; Lundell, Taina; Hibbett, David S.; Henrissat, Bernard; Burton, Kerry S.; Kerrigan, Richard W.; Challen, Michael P.; Grigoriev, Igor V.; Martin, Francis

    2012-04-27

    Agaricus bisporus is the model fungus for the adaptation, persistence, and growth in the humic-rich leaf-litter environment. Aside from its ecological role, A. bisporus has been an important component of the human diet for over 200 y and worldwide cultivation of the button mushroom forms a multibillion dollar industry. We present two A. bisporus genomes, their gene repertoires and transcript profiles on compost and during mushroom formation. The genomes encode a full repertoire of polysaccharide-degrading enzymes similar to that of wood-decayers. Comparative transcriptomics of mycelium grown on defined medium, casing-soil, and compost revealed genes encoding enzymes involved in xylan, cellulose, pectin, and protein degradation are more highly expressed in compost. The striking expansion of heme-thiolate peroxidases and etherases is distinctive from Agaricomycotina wood-decayers and suggests a broad attack on decaying lignin and related metabolites found in humic acid-rich environment. Similarly, up-regulation of these genes together with a lignolytic manganese peroxidase, multiple copper radical oxidases, and cytochrome P450s is consistent with challenges posed by complex humic-rich substrates. The gene repertoire and expression of hydrolytic enzymes in A. bisporus is substantially different from the taxonomically related ectomycorrhizal symbiont Laccaria bicolor. A common promoter motif was also identified in genes very highly expressed in humic-rich substrates. These observations reveal genetic and enzymatic mechanisms governing adaptation to the humic-rich ecological niche formed during plant degradation, further defining the critical role such fungi contribute to soil structure and carbon sequestration in terrestrial ecosystems. Genome sequence will expedite mushroom breeding for improved agronomic characteristics.

  8. Whole-genome sequencing reveals novel insights into sulfur oxidation in the extremophile Acidithiobacillus thiooxidans

    OpenAIRE

    Yin, Huaqun; Zhang, Xian; Li, Xiaoqi; He, Zhili; Liang, Yili; Guo, Xue; Hu, Qi; Xiao, Yunhua; Cong, Jing; Ma, Liyuan; Niu, Jiaojiao; Liu, Xueduan

    2014-01-01

    Background Acidithiobacillus thiooxidans (A. thiooxidans), a chemolithoautotrophic extremophile, is widely used in the industrial recovery of copper (bioleaching or biomining). The organism grows and survives by autotrophically utilizing energy derived from the oxidation of elemental sulfur and reduced inorganic sulfur compounds (RISCs). However, the lack of genetic manipulation systems has restricted our exploration of its physiology. With the development of high-throughput sequencing techno...

  9. Comparative and genetic analysis of the four sequenced Paenibacillus polymyxa genomes reveals a diverse metabolism and conservation of genes relevant to plant-growth promotion and competitiveness.

    Science.gov (United States)

    Eastman, Alexander W; Heinrichs, David E; Yuan, Ze-Chun

    2014-10-03

    Members of the genus Paenibacillus are important plant growth-promoting rhizobacteria that can serve as bio-reactors. Paenibacillus polymyxa promotes the growth of a variety of economically important crops. Our lab recently completed the genome sequence of Paenibacillus polymyxa CR1. As of January 2014, four P. polymyxa genomes have been completely sequenced but no comparative genomic analyses have been reported. Here we report the comparative and genetic analyses of four sequenced P. polymyxa genomes, which revealed a significantly conserved core genome. Complex metabolic pathways and regulatory networks were highly conserved and allow P. polymyxa to rapidly respond to dynamic environmental cues. Genes responsible for phytohormone synthesis, phosphate solubilization, iron acquisition, transcriptional regulation, σ-factors, stress responses, transporters and biomass degradation were well conserved, indicating an intimate association with plant hosts and the rhizosphere niche. In addition, genes responsible for antimicrobial resistance and non-ribosomal peptide/polyketide synthesis are present in both the core and accessory genome of each strain. Comparative analyses also reveal variations in the accessory genome, including large plasmids present in strains M1 and SC2. Furthermore, a considerable number of strain-specific genes and genomic islands are irregularly distributed throughout each genome. Although a variety of plant-growth promoting traits are encoded by all strains, only P. polymyxa CR1 encodes the unique nitrogen fixation cluster found in other Paenibacillus sp. Our study revealed that genomic loci relevant to host interaction and ecological fitness are highly conserved within the P. polymyxa genomes analysed, despite variations in the accessory genome. This work suggests that plant-growth promotion by P. polymyxa is mediated largely through phytohormone production, increased nutrient availability and bio-control mechanisms. This study provides an in

  10. Selfish supernumerary chromosome reveals its origin as a mosaic of host genome and organellar sequences

    Czech Academy of Sciences Publication Activity Database

    Martis, M.M.; Klemme, S.; Banaei-Moghaddam, A.M.; Blattner, F.R.; Macas, Jiří; Schmutzer, T.; Scholz, U.; Gundlach, H.; Wicker, T.; Šimková, Hana; Novák, Petr; Neumann, Pavel; Kubaláková, Marie; Bauer, E.; Haseneyer, G.; Fuchs, J.; Doležel, Jaroslav; Stein, N.; Mayer, K.F.X.; Houben, A.

    2012-01-01

    Vol. 109, No. 33 (2012), pp. 13343-13346 ISSN 0027-8424 R&D Projects: GA ČR GBP501/12/G090; GA MŠk(CZ) OC10037 Institutional research plan: CEZ:AV0Z50510513 Institutional support: RVO:60077344 ; RVO:61389030 Keywords : FULL-LENGTH CDNAS * SECALE-CEREALE L. * B-CHROMOSOMES * REPETITIVE SEQUENCES Subject RIV: EB - Genetics ; Molecular Biology; EB - Genetics ; Molecular Biology (UEB-Q) Impact factor: 9.737, year: 2012

  11. Analysis of 90 Mb of the potato genome reveals conservation of gene structures and order with tomato but divergence in repetitive sequence composition

    Directory of Open Access Journals (Sweden)

    O'Brien Kimberly

    2008-06-01

    Full Text Available Abstract Background The Solanaceae family contains a number of important crop species including potato (Solanum tuberosum), which is grown for its underground storage organ known as a tuber. Although potato is the 4th most important food crop in the world, limited genomic sequence information is currently available for it other than a collection of ~220,000 Expressed Sequence Tags, and advances in potato yield and nutrition content would be greatly assisted through access to a complete genome sequence. While morphologically diverse, Solanaceae species such as potato, tomato, pepper, and eggplant share not only genes but also gene order, thereby permitting highly informative comparative genomic analyses. Results In this study, we report on analysis of 89.9 Mb of potato genomic sequence representing 10.2% of the genome generated through end sequencing of a potato bacterial artificial chromosome (BAC) clone library (87 Mb) and sequencing of 22 potato BAC clones (2.9 Mb). The GC content of potato is very similar to Solanum lycopersicon (tomato) and other dicotyledonous species yet distinct from the monocotyledonous grass species, Oryza sativa. Parallel analyses of repetitive sequences in potato and tomato revealed substantial differences in their abundance, 34.2% in potato versus 46.3% in tomato, which is consistent with the increased genome size per haploid genome of these two Solanum species. Specific classes and types of repetitive sequences were also differentially represented between these two species including a telomeric-related repetitive sequence, ribosomal DNA, and a number of unclassified repetitive sequences. Comparative analyses between tomato and potato at the gene level revealed a high level of conservation of gene content, genic feature, and gene order although discordances in synteny were observed.
Conclusion Genomic level analyses of potato and tomato confirm that gene sequence and gene order are conserved between these solanaceous species and that

  12. Genome Sequencing of Museum Specimens Reveals Rapid Changes in the Genetic Composition of Honey Bees in California.

    Science.gov (United States)

    Cridland, Julie M; Ramirez, Santiago R; Dean, Cheryl A; Sciligo, Amber; Tsutsui, Neil D

    2018-02-01

    The western honey bee, Apis mellifera, is an enormously influential pollinator in both natural and managed ecosystems. In North America, this species has been introduced numerous times from a variety of different source populations in Europe and Africa. Since then, feral populations have expanded into many different environments across their broad introduced range. Here, we used whole genome sequencing of historical museum specimens and newly collected modern populations from California (USA) to analyze the impact of demography and selection on introduced populations during the past 105 years. We find that populations from both northern and southern California exhibit pronounced genetic changes, but have changed in different ways. In northern populations, honey bees underwent a substantial shift from western European to eastern European ancestry since the 1960s, whereas southern populations are dominated by the introgression of Africanized genomes during the past two decades. Additionally, we identify an isolated island population that has experienced comparatively little change over a large time span. Fine-scale comparison of different populations and time points also revealed SNPs that differ in frequency, highlighting a number of genes that may be important for recent adaptations in these introduced populations. © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  13. The complete mitochondrial genome sequence of the spider Habronattus oregonensis reveals rearranged and extremely truncated tRNAs

    Energy Technology Data Exchange (ETDEWEB)

    Masta, Susan E.; Boore, Jeffrey L.

    2004-01-31

    We sequenced the entire mitochondrial genome of the jumping spider Habronattus oregonensis of the arachnid order Araneae (Arthropoda: Chelicerata). A number of unusual features distinguish this genome from other chelicerate and arthropod mitochondrial genomes. Most of the transfer RNA gene sequences are greatly reduced in size and cannot be folded into typical cloverleaf-shaped secondary structures. At least nine of the tRNA sequences lack the potential to form TΨC arm stem pairings, and instead are inferred to have TV-replacement loops. Furthermore, sequences that could encode the 3' aminoacyl acceptor stems in at least 10 tRNAs appear to be lacking, because fully paired acceptor stems are not possible and because the downstream sequences instead encode adjacent genes. Hence, these appear to be among the smallest known tRNA genes. We postulate that an RNA editing mechanism must exist to restore the 3' aminoacyl acceptor stems in order to allow the tRNAs to function. At least seven tRNAs are rearranged with respect to the chelicerate Limulus polyphemus, although the arrangement of the protein-coding genes is identical. Most mitochondrial protein-coding genes of H. oregonensis have ATN as initiation codons, as commonly found in arthropod mtDNAs, but cytochrome oxidase subunit 2 and 3 genes apparently use UUG as an initiation codon. Finally, many of the gene sequences overlap one another and are truncated. This 14,381 bp genome, the first mitochondrial genome of a spider yet sequenced, is one of the smallest arthropod mitochondrial genomes known. We suggest that post-transcriptional RNA editing can likely maintain function of the tRNAs while permitting the accumulation of mutations that would otherwise be deleterious. Such mechanisms may have allowed for the minimization of the spider mitochondrial genome.

  14. Rapid genome-wide evolution in Brassica rapa populations following drought revealed by sequencing of ancestral and descendant gene pools.

    Science.gov (United States)

    Franks, Steven J; Kane, Nolan C; O'Hara, Niamh B; Tittes, Silas; Rest, Joshua S

    2016-08-01

    There is increasing evidence that evolution can occur rapidly in response to selection. Recent advances in sequencing suggest the possibility of documenting genetic changes as they occur in populations, thus uncovering the genetic basis of evolution, particularly if samples are available from both before and after selection. Here, we had a unique opportunity to directly assess genetic changes in natural populations following an evolutionary response to a fluctuation in climate. We analysed genome-wide differences between ancestors and descendants of natural populations of Brassica rapa plants from two locations that rapidly evolved changes in multiple phenotypic traits, including flowering time, following a multiyear late-season drought in California. These ancestor-descendant comparisons revealed evolutionary shifts in allele frequencies in many genes. Some genes showing evolutionary shifts have functions related to drought stress and flowering time, consistent with an adaptive response to selection. Loci differentiated between ancestors and descendants (FST outliers) were generally different from those showing signatures of selection based on site frequency spectrum analysis (Tajima's D), indicating that the loci that evolved in response to the recent drought and those under historical selection were generally distinct. Very few genes showed similar evolutionary responses between two geographically distinct populations, suggesting independent genetic trajectories of evolution yielding parallel phenotypic changes. The results show that selection can result in rapid genome-wide evolutionary shifts in allele frequencies in natural populations, and highlight the usefulness of combining resurrection experiments in natural populations with genomics for studying the genetic basis of adaptive evolution. © 2016 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  15. The complete genome sequence of Dickeya zeae EC1 reveals substantial divergence from other Dickeya strains and species.

    Science.gov (United States)

    Zhou, Jianuan; Cheng, Yingying; Lv, Mingfa; Liao, Lisheng; Chen, Yufan; Gu, Yanfang; Liu, Shiyin; Jiang, Zide; Xiong, Yuanyan; Zhang, Lianhui

    2015-08-04

    Dickeya zeae is a bacterial species that infects monocotyledons and dicotyledons. Two antibiotic-like phytotoxins named zeamine and zeamine II were reported to play an important role in rice seed germination, and two genes associated with zeamine production, i.e., zmsA and zmsK, have been thoroughly characterized. However, other virulence factors and the molecular mechanisms of host specificity and pathogenesis are hardly known. The complete genome of D. zeae strain EC1 isolated from diseased rice plants was sequenced, annotated, and compared with the genomes of other Dickeya spp. The pathogen contains a chromosome of 4,532,364 bp with 4,154 predicted protein-coding genes. Comparative genomics analysis indicates that D. zeae EC1 is most co-linear with D. chrysanthemi Ech1591, most conserved with D. zeae Ech586 and least similar to D. paradisiaca Ech703. Substantial genomic rearrangement was revealed by comparing EC1 with Ech586 and Ech703. Most virulence genes were well-conserved in Dickeya strains except Ech703. Significantly, the zms gene cluster involved in biosynthesis of zeamines, which were shown previously to be key virulence determinants, is present in D. zeae strains isolated from rice, and some D. solani strains, but absent in other Dickeya species and the D. zeae strains isolated from other plants or sources. In addition, a DNA fragment containing 9 genes associated with fatty acid biosynthesis was found inserted in the fli gene cluster encoding flagellar biosynthesis of strain EC1 and two other rice isolates but not in other strains. This gene cluster shares a high protein similarity to the fatty acid genes from Pantoea ananatis. Our findings delineate the genetic background of D. zeae EC1, which infects both dicotyledons and monocotyledons, and suggest that D. zeae strains isolated from rice could be grouped into a distinct pathovar, i.e., D. zeae subsp. oryzae.
In addition, the results of this study also unveiled that the zms gene cluster presented in

  16. Nuclear species-diagnostic SNP markers mined from 454 amplicon sequencing reveal admixture genomic structure of modern citrus varieties.

    Directory of Open Access Journals (Sweden)

    Franck Curk

    Full Text Available Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species

  17. Genome sequencing reveals metabolic and cellular interdependence in an amoeba-kinetoplastid symbiosis

    Czech Academy of Sciences Publication Activity Database

    Tanifuji, G.; Cenci, U.; Moog, D.; Dean, S.; Nakayama, T.; David, Vojtěch; Fiala, Ivan; Curtis, B.A.; Sibbald, S. J.; Onodera, N. T.; Colp, M.; Flegontov, Pavel; Johnson-MacKinnon, J.; McPhee, M.; Inagaki, Y.; Hashimoto, T.; Kelly, S.; Gull, K.; Lukeš, Julius; Archibald, J.M.

    2017-01-01

    Vol. 7, SEP 15 (2017), article No. 11688. ISSN 2045-2322 R&D Projects: GA ČR(CZ) GA14-23986S; GA MŠk LL1601 Institutional support: RVO:60077344 Keywords : trypanosoma-brucei reveals * hidden markov model * neoparamoeba-pemaquidensis * gill disease * phylogenetic analyses * ichthyobodo-necator * gene prediction * host control * evolution * proteomics Subject RIV: EB - Genetics ; Molecular Biology OBOR OECD: Biochemistry and molecular biology Impact factor: 4.259, year: 2016

  18. Revealing Genomic Profile That Underlies Tropism of Myeloma Cells Using Whole Exome Sequencing

    Directory of Open Access Journals (Sweden)

    Youngil Koh

    2015-01-01

    Full Text Available Background. Previously we established two cell lines (SNU_MM1393_BM and SNU_MM1393_SC) from different tissues (bone marrow and subcutis) of mice which were injected with a single patient’s myeloma sample. We tried to define genetic changes specific for each cell line using whole exome sequencing (WES). Materials and Methods. We extracted DNA from SNU_MM1393_BM and SNU_MM1393_SC and performed WES. For single nucleotide variant (SNV) calling, we used Varscan2. Annotation of mutations was performed using ANNOVAR. Results. When calling of somatic mutations was performed, 68 genes were nonsynonymously mutated only in SNU_MM1393_SC, while 136 genes were nonsynonymously mutated only in SNU_MM1393_BM. KIAA1199, FRY, AP3B2, and OPTC were representative genes specifically mutated in SNU_MM1393_SC. When comparison analysis was performed using TCGA data, the mutational pattern of SNU_MM1393_SC most closely resembled that of melanoma. Pathway analysis using the KEGG database showed that mutated genes specific to SNU_MM1393_BM were related to differentiation, while those of SNU_MM1393_SC were related to tumorigenesis. Conclusion. We identified genetic changes that underlie tropism of myeloma cells using WES. The genetic signature of cutaneous plasmacytoma resembles that of melanoma, implying a common mechanism for skin tropism. KIAA1199, FRY, AP3B2, and OPTC are candidate genes for skin tropism of cancers.

  19. Phylogenetic diversity and genotypical complexity of H9N2 influenza A viruses revealed by genomic sequence analysis.

    Directory of Open Access Journals (Sweden)

    Guoying Dong

    Full Text Available H9N2 influenza A viruses have become established worldwide in terrestrial poultry and wild birds, and are occasionally transmitted to mammals including humans and pigs. To comprehensively elucidate the genetic and evolutionary characteristics of H9N2 influenza viruses, we performed a large-scale sequence analysis of 571 viral genomes from the NCBI Influenza Virus Resource Database, representing the spectrum of H9N2 influenza viruses isolated from 1966 to 2009. Our study provides a panoramic framework for better understanding the genesis and evolution of H9N2 influenza viruses, and for describing the history of H9N2 viruses circulating in diverse hosts. Panorama phylogenetic analysis of the eight viral gene segments revealed the complexity and diversity of H9N2 influenza viruses. The 571 H9N2 viral genomes were classified into 74 separate lineages, which had marked host and geographical differences in phylogeny. Panorama genotypical analysis also revealed that H9N2 viruses include at least 98 genotypes, which were further divided according to their HA lineages into seven series (A-G). Phylogenetic analysis of the internal genes showed that H9N2 viruses are closely related to H3, H4, H5, H7, H10, and H14 subtype influenza viruses. Our results indicate that H9N2 viruses have undergone extensive reassortments to generate multiple reassortants and genotypes, suggesting that the continued circulation of multiple genotypical H9N2 viruses throughout the world in diverse hosts has the potential to cause future influenza outbreaks in poultry and epidemics in humans. We propose a nomenclature system for identifying and unifying all lineages and genotypes of H9N2 influenza viruses in order to facilitate international communication on the evolution, ecology and epidemiology of H9N2 influenza viruses.

  20. Draft genome sequence of Streptomyces coelicoflavus ZG0656 reveals the putative biosynthetic gene cluster of acarviostatin family α-amylase inhibitors.

    Science.gov (United States)

    Guo, X; Geng, P; Bai, F; Bai, G; Sun, T; Li, X; Shi, L; Zhong, Q

    2012-08-01

    The aims of this study are to obtain the draft genome sequence of Streptomyces coelicoflavus ZG0656, which produces novel acarviostatin family α-amylase inhibitors, and then to reveal the putative acarviostatin-related gene cluster and the biosynthetic pathway. The draft genome sequence of S. coelicoflavus ZG0656 was generated using a shotgun approach employing a combination of 454 and Solexa sequencing technologies. Genome analysis revealed a putative gene cluster for acarviostatin biosynthesis, termed sct-cluster. The cluster contains 13 acarviostatin synthetic genes, six transporter genes, four starch degrading or transglycosylation enzyme genes and two regulator genes. On the basis of bioinformatic analysis, we proposed a putative biosynthetic pathway of acarviostatins. The intracellular steps produce a structural core, acarviostatin I00-7-P, and the extracellular assemblies lead to diverse acarviostatin end products. The draft genome sequence of S. coelicoflavus ZG0656 revealed the putative biosynthetic gene cluster of acarviostatins and a putative pathway of acarviostatin production. To our knowledge, S. coelicoflavus ZG0656 is the first strain in this species for which a genome sequence has been reported. The analysis of sct-cluster provided important insights into the biosynthesis of acarviostatins. This work will be a platform for producing novel variants and yield improvement. © 2012 The Authors. Letters in Applied Microbiology © 2012 The Society for Applied Microbiology.

  1. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... that the minimum number of genes from each species that need to be compared to produce a reliable phylogeny is about 20. Yeast has also become an attractive model to study speciation in eukaryotes, especially to understand molecular mechanisms behind the establishment of reproductive isolation. Comparison...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  2. Staphylococcus epidermidis pan-genome sequence analysis reveals diversity of skin commensal and hospital infection-associated isolates

    OpenAIRE

    Conlan, Sean; Mijares, Lilia A; Becker, Jesse; Blakesley, Robert W; Bouffard, Gerard G; Brooks, Shelise; Coleman, Holly; Gupta, Jyoti; Gurson, Natalie; Park, Morgan; Schmidt, Brian; Thomas, Pamela J; Otto, Michael; Kong, Heidi H; Murray, Patrick R

    2012-01-01

    Background While Staphylococcus epidermidis is commonly isolated from healthy human skin, it is also the most frequent cause of nosocomial infections on indwelling medical devices. Despite its importance, few genome sequences existed and the most frequent hospital-associated lineage, ST2, had not been fully sequenced. Results We cultivated 71 commensal S. epidermidis isolates from 15 skin sites and compared them with 28 nosocomial isolates from venous catheters and blood cultures. We produced...

  3. Next-Generation Sequencing Reveals the Impact of Repetitive DNA Across Phylogenetically Closely Related Genomes of Orobanchaceae

    Science.gov (United States)

    Piednoël, Mathieu; Aberer, Andre J.; Schneeweiss, Gerald M.; Macas, Jiri; Novak, Petr; Gundlach, Heidrun; Temsch, Eva M.; Renner, Susanne S.

    2013-01-01

    We used next-generation sequencing to characterize the genomes of nine species of Orobanchaceae of known phylogenetic relationships, different life forms, and including a polyploid species. The study species are the autotrophic, nonparasitic Lindenbergia philippensis, the hemiparasitic Schwalbea americana, and seven nonphotosynthetic parasitic species of Orobanche (Orobanche crenata, Orobanche cumana, Orobanche gracilis (tetraploid), and Orobanche pancicii) and Phelipanche (Phelipanche lavandulacea, Phelipanche purpurea, and Phelipanche ramosa). Ty3/Gypsy elements comprise 1.93%–28.34% of the nine genomes and Ty1/Copia elements comprise 8.09%–22.83%. When compared with L. philippensis and S. americana, the nonphotosynthetic species contain higher proportions of repetitive DNA sequences, perhaps reflecting relaxed selection on genome size in parasitic organisms. Among the parasitic species, those in the genus Orobanche have smaller genomes but higher proportions of repetitive DNA than those in Phelipanche, mostly due to a diversification of repeats and an accumulation of Ty3/Gypsy elements. Genome downsizing in the tetraploid O. gracilis probably led to sequence loss across most repeat types. PMID:22723303

  4. Genome sequence of the Asian Tiger mosquito, Aedes albopictus, reveals insights into its biology, genetics, and evolution.

    Science.gov (United States)

    Chen, Xiao-Guang; Jiang, Xuanting; Gu, Jinbao; Xu, Meng; Wu, Yang; Deng, Yuhua; Zhang, Chi; Bonizzoni, Mariangela; Dermauw, Wannes; Vontas, John; Armbruster, Peter; Huang, Xin; Yang, Yulan; Zhang, Hao; He, Weiming; Peng, Hongjuan; Liu, Yongfeng; Wu, Kun; Chen, Jiahua; Lirakis, Manolis; Topalis, Pantelis; Van Leeuwen, Thomas; Hall, Andrew Brantley; Jiang, Xiaofang; Thorpe, Chevon; Mueller, Rachel Lockridge; Sun, Cheng; Waterhouse, Robert Michael; Yan, Guiyun; Tu, Zhijian Jake; Fang, Xiaodong; James, Anthony A

    2015-11-03

    The Asian tiger mosquito, Aedes albopictus, is a highly successful invasive species that transmits a number of human viral diseases, including dengue and Chikungunya fevers. This species has a large genome with significant population-based size variation. The complete genome sequence was determined for the Foshan strain, an established laboratory colony derived from wild mosquitoes from southeastern China, a region within the historical range of the origin of the species. The genome comprises 1,967 Mb, the largest mosquito genome sequenced to date, and its size results principally from an abundance of repetitive DNA classes. In addition, expansions of the numbers of members in gene families involved in insecticide-resistance mechanisms, diapause, sex determination, immunity, and olfaction also contribute to the larger size. Portions of integrated flavivirus-like genomes support a shared evolutionary history of association of these viruses with their vector. The large genome repertory may contribute to the adaptability and success of Ae. albopictus as an invasive species.

  5. The first complete mitochondrial genome sequences of Amblypygi (Chelicerata: Arachnida) reveal conservation of the ancestral arthropod gene order.

    Science.gov (United States)

    Fahrein, Kathrin; Masta, Susan E; Podsiadlowski, Lars

    2009-05-01

    Amblypygi (whip spiders) are terrestrial chelicerates inhabiting the subtropics and tropics. In morphological and rRNA-based phylogenetic analyses, Amblypygi cluster with Uropygi (whip scorpions) and Araneae (spiders) to form the taxon Tetrapulmonata, but there is controversy regarding the interrelationship of these three taxa. Mitochondrial genomes provide an additional large data set of phylogenetic information (sequences, gene order, RNA secondary structure), but in arachnids, mitochondrial genome data are missing for some of the major orders. In the course of an ongoing project concerning arachnid mitochondrial genomics, we present the first two complete mitochondrial genomes from Amblypygi. Both genomes were found to be typical circular duplex DNA molecules with all 37 genes usually present in bilaterian mitochondrial genomes. In both species, gene order is identical to that of Limulus polyphemus (Xiphosura), which is assumed to reflect the putative arthropod ground pattern. All tRNA gene sequences have the potential to fold into structures that are typical of metazoan mitochondrial tRNAs, except for tRNA-Ala, which lacks the D arm in both amblypygids, suggesting the loss of this feature early in amblypygid evolution. Phylogenetic analysis resulted in weak support for Uropygi being the sister group of Amblypygi.

  6. Next-Generation Sequencing Reveals the Impact of Repetitive DNA Across Phylogenetically Closely Related Genomes of Orobanchaceae

    Czech Academy of Sciences Publication Activity Database

    Piednoël, M.; Aberer, A.J.; Schneeweiss, G. M.; Macas, Jiří; Novák, Petr; Gundlach, H.; Temsch, E.M.; Renner, S.S.

    2012-01-01

    Vol. 29, No. 11 (2012), pp. 3601-3611 ISSN 0737-4038 Institutional research plan: CEZ:AV0Z50510513 Institutional support: RVO:60077344 Keywords : next-generation sequencing * polyploidy * genome size * Ty3/Gypsy * transposable elements Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 10.353, year: 2012

  7. Application of SMRT genome sequencing to reveal the methylomes of bacteria associated with respiratory disease outbreaks in beef cattle

    Science.gov (United States)

    DNA base modification systems are common in bacteria and can modulate gene expression as well as act in defense against invading viruses. Recent advances in the direct identification of modified bases in the genome via Single Molecule Real Time (SMRT) sequencing supports an integrated analytical ap...

  8. Illumina next-generation sequencing reveals the complete mitochondrial genome of Psenopsis anomala (Perciformes: Centrolophidae) with phylogenetic consideration.

    Science.gov (United States)

    Chen, Huapu; Che, Zhiwei; Li, Jiantao; Dai, Mingli; Xiang, Ling; Deng, Siping; Zhu, Chunhua; Huang, Hai; Li, Guangli

    2016-09-01

    Using Illumina next-generation sequencing (NGS), the complete mitochondrial genome of Psenopsis anomala was sequenced in the present study. The mitochondrial genome of P. anomala is 16,528 bp long and consists of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes, and a control region. The gene order and composition of the P. anomala mitochondrial genome are similar to those of most other vertebrates. The nucleotide composition of the light strand, in descending order, is 29.18% T, 27.97% G, 27.06% A, and 15.79% C. With the exception of the NADH dehydrogenase subunit 6 (ND6) gene and eight tRNA genes, the mitochondrial genes are encoded on the heavy strand. Phylogenetic analysis by the maximum-likelihood (ML) method showed that P. anomala is most closely related to Peprilus triacanthus.

  9. Determination of the melon chloroplast and mitochondrial genome sequences reveals that the largest reported mitochondrial genome in plants contains a significant amount of DNA having a nuclear origin

    Directory of Open Access Journals (Sweden)

    Aranda Miguel A

    2011-08-01

    Full Text Available Abstract Background The melon belongs to the Cucurbitaceae family, whose economic importance among vegetable crops is second only to Solanaceae. The melon has a small genome size (454 Mb), which makes it suitable for molecular and genetic studies. Despite similar nuclear and chloroplast genome sizes, cucurbits show great variation when their mitochondrial genomes are compared. The melon possesses the largest plant mitochondrial genome, as much as eight times larger than that of other cucurbits. Results The nucleotide sequences of the melon chloroplast and mitochondrial genomes were determined. The chloroplast genome (156,017 bp) included 132 genes, with 98 single-copy genes dispersed between the small (SSC) and large (LSC) single-copy regions and 17 duplicated genes in the inverted repeat regions (IRa and IRb). A comparison of the cucumber and melon chloroplast genomes showed differences in only approximately 5% of nucleotides, mainly due to short indels and SNPs. Additionally, 2.74 Mb of mitochondrial sequence, accounting for 95% of the estimated mitochondrial genome size, were assembled into five scaffolds and four additional unscaffolded contigs. Eighty-four percent of the mitochondrial genome is contained in a single scaffold. The gene-coding region accounted for 1.7% (45,926 bp) of the total sequence, including 51 protein-coding genes, 4 conserved ORFs, 3 rRNA genes and 24 tRNA genes. Despite the differences observed in the mitochondrial genome sizes of cucurbit species, Citrullus lanatus (379 kb), Cucurbita pepo (983 kb) and Cucumis melo (2,740 kb) share 120 kb of sequence, including the predicted protein-coding regions. Nevertheless, melon contained a high number of repetitive sequences and a high content of DNA of nuclear origin, which represented 42% and 47% of the total sequence, respectively.
Conclusions Whereas the size and gene organisation of chloroplast genomes are similar among the cucurbit species, mitochondrial genomes show a wide variety of sizes

  10. Determination of the melon chloroplast and mitochondrial genome sequences reveals that the largest reported mitochondrial genome in plants contains a significant amount of DNA having a nuclear origin.

    Science.gov (United States)

    Rodríguez-Moreno, Luis; González, Víctor M; Benjak, Andrej; Martí, M Carmen; Puigdomènech, Pere; Aranda, Miguel A; Garcia-Mas, Jordi

    2011-08-20

    The melon belongs to the Cucurbitaceae family, whose economic importance among vegetable crops is second only to Solanaceae. The melon has a small genome size (454 Mb), which makes it suitable for molecular and genetic studies. Despite similar nuclear and chloroplast genome sizes, cucurbits show great variation when their mitochondrial genomes are compared. The melon possesses the largest plant mitochondrial genome, as much as eight times larger than that of other cucurbits. The nucleotide sequences of the melon chloroplast and mitochondrial genomes were determined. The chloroplast genome (156,017 bp) included 132 genes, with 98 single-copy genes dispersed between the small (SSC) and large (LSC) single-copy regions and 17 duplicated genes in the inverted repeat regions (IRa and IRb). A comparison of the cucumber and melon chloroplast genomes showed differences in only approximately 5% of nucleotides, mainly due to short indels and SNPs. Additionally, 2.74 Mb of mitochondrial sequence, accounting for 95% of the estimated mitochondrial genome size, were assembled into five scaffolds and four additional unscaffolded contigs. Eighty-four percent of the mitochondrial genome is contained in a single scaffold. The gene-coding region accounted for 1.7% (45,926 bp) of the total sequence, including 51 protein-coding genes, 4 conserved ORFs, 3 rRNA genes and 24 tRNA genes. Despite the differences observed in the mitochondrial genome sizes of cucurbit species, Citrullus lanatus (379 kb), Cucurbita pepo (983 kb) and Cucumis melo (2,740 kb) share 120 kb of sequence, including the predicted protein-coding regions. Nevertheless, melon contained a high number of repetitive sequences and a high content of DNA of nuclear origin, which represented 42% and 47% of the total sequence, respectively.
Whereas the size and gene organisation of chloroplast genomes are similar among the cucurbit species, mitochondrial genomes show a wide variety of sizes, with a non-conserved structure both in gene number

  11. Candidate genes revealed by a genome scan for mosquito resistance to a bacterial insecticide: sequence and gene expression variations

    Directory of Open Access Journals (Sweden)

    David Jean-Philippe

    2009-11-01

    Full Text Available Background: Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often helpless at identifying the specific gene(s) or mutation(s) targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim to look for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti) insecticidal toxins. Results: The genomes of a Bti-resistant and a Bti-susceptible strain were surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations), as well as a significant under-expression compared to the susceptible strain. Conclusion: Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full

  12. The Complete Sequence of the Acacia ligulata Chloroplast Genome Reveals a Highly Divergent clpP1 Gene.

    Directory of Open Access Journals (Sweden)

    Anna V Williams

    Full Text Available Legumes are a highly diverse angiosperm family that includes many agriculturally important species. To date, 21 complete chloroplast genomes have been sequenced from legume crops confined to the Papilionoideae subfamily. Here we report the first chloroplast genome from the Mimosoideae, Acacia ligulata, and compare it to the previously sequenced legume genomes. The A. ligulata chloroplast genome is 174,233 bp in size, comprising inverted repeats of 38,225 bp and single-copy regions of 92,798 bp and 4,985 bp [corrected]. Acacia ligulata lacks the inversion present in many of the Papilionoideae, but is not otherwise significantly different in terms of gene and repeat content. The key feature is its highly divergent clpP1 gene, normally considered essential in chloroplast genomes. In A. ligulata, although transcribed and spliced, it probably encodes a catalytically inactive protein. This study provides a significant resource for further genetic research into Acacia and the Mimosoideae. The divergent clpP1 gene suggests that Acacia will provide an interesting source of information on the evolution and functional diversity of the chloroplast Clp protease complex.
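
    The region sizes quoted in the record above can be cross-checked against the reported genome size. The sketch below is only an illustrative arithmetic check, not part of the cited study: it assumes the usual quadripartite chloroplast layout (two copies of the inverted repeat plus one large and one small single-copy region) and assumes that the larger single-copy figure is the large single-copy region. The function and constant names are hypothetical.

```python
# Region lengths in bp, copied from the abstract above.
INVERTED_REPEAT = 38_225      # length of one inverted-repeat copy
LARGE_SINGLE_COPY = 92_798    # assumed to be the large single-copy region
SMALL_SINGLE_COPY = 4_985     # assumed to be the small single-copy region
REPORTED_TOTAL = 174_233      # genome size stated in the abstract


def quadripartite_length(ir: int, lsc: int, ssc: int) -> int:
    """Total length of a plastome with two IR copies plus the LSC and SSC regions."""
    return 2 * ir + lsc + ssc


computed = quadripartite_length(INVERTED_REPEAT, LARGE_SINGLE_COPY, SMALL_SINGLE_COPY)
# Report whether the summed regions reproduce the stated genome size.
print(f"computed = {computed} bp, matches reported total: {computed == REPORTED_TOTAL}")
```

    Under these assumptions the four regions sum to the stated 174,233 bp, so the figures in the record are internally consistent.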

  13. The genome sequence of the rumen methanogen Methanobrevibacter ruminantium reveals new possibilities for controlling ruminant methane emissions.

    Directory of Open Access Journals (Sweden)

    Sinead C Leahy

    Full Text Available BACKGROUND: Methane (CH4) is a potent greenhouse gas (GHG), having a global warming potential 21 times that of carbon dioxide (CO2). Methane emissions from agriculture represent around 40% of the emissions produced by human-related activities, the single largest source being enteric fermentation, mainly in ruminant livestock. Technologies to reduce these emissions are lacking. Ruminant methane is formed by the action of methanogenic archaea typified by Methanobrevibacter ruminantium, which is present in ruminants fed a wide variety of diets worldwide. To gain more insight into the lifestyle of a rumen methanogen, and to identify genes and proteins that can be targeted to reduce methane production, we have sequenced the 2.93 Mb genome of M. ruminantium M1, the first rumen methanogen genome to be completed. METHODOLOGY/PRINCIPAL FINDINGS: The M1 genome was sequenced, annotated and subjected to comparative genomic and metabolic pathway analyses. Conserved and methanogen-specific gene sets suitable as targets for vaccine development or chemogenomic-based inhibition of rumen methanogens were identified. The feasibility of using a synthetic peptide-directed vaccinology approach to target epitopes of methanogen surface proteins was demonstrated. A prophage genome was described and its lytic enzyme, endoisopeptidase PeiR, was shown to lyse M1 cells in pure culture. A predicted stimulation of M1 growth by alcohols was demonstrated and microarray analyses indicated up-regulation of methanogenesis genes during co-culture with a hydrogen (H2)-producing rumen bacterium. We also report the discovery of non-ribosomal peptide synthetases in M. ruminantium M1, the first reported in archaeal species. CONCLUSIONS/SIGNIFICANCE: The M1 genome sequence provides new insights into the lifestyle and cellular processes of this important rumen methanogen. It also defines vaccine and chemogenomic targets for broad inhibition of rumen methanogens and represents a significant

  14. Genome sequence of the pathogenic intestinal spirochete Brachyspira hyodysenteriae reveals adaptations to its lifestyle in the porcine large intestine.

    Directory of Open Access Journals (Sweden)

    Matthew I Bellgard

    Full Text Available Brachyspira hyodysenteriae is an anaerobic intestinal spirochete that colonizes the large intestine of pigs and causes swine dysentery, a disease of significant economic importance. The genome sequence of B. hyodysenteriae strain WA1 was determined, making it the first representative of the genus Brachyspira to be sequenced, and the seventeenth spirochete genome to be reported. The genome consisted of a circular 3,000,694 base pair (bp) chromosome, and a 35,940 bp circular plasmid that has not previously been described. The spirochete had 2,122 protein-coding sequences. Of the predicted proteins, more had similarities to proteins of the enteric Escherichia coli and Clostridium species than they did to proteins of other spirochetes. Many of these genes were associated with transport and metabolism, and they may have been gradually acquired through horizontal gene transfer in the environment of the large intestine. A reconstruction of central metabolic pathways identified a complete set of coding sequences for glycolysis, gluconeogenesis, a non-oxidative pentose phosphate pathway, nucleotide metabolism, lipooligosaccharide biosynthesis, and a respiratory electron transport chain. A notable finding was the presence on the plasmid of the genes involved in rhamnose biosynthesis. Potential virulence genes included those for 15 proteases and six hemolysins. Other adaptations to an enteric lifestyle included the presence of large numbers of genes associated with chemotaxis and motility. B. hyodysenteriae has diverged from other spirochetes in the process of accommodating to its habitat in the porcine large intestine.

  15. Whole genome sequence of two Rathayibacter toxicus strains reveals a tunicamycin biosynthetic cluster similar to Streptomyces chartreusis.

    Directory of Open Access Journals (Sweden)

    Aaron J Sechler

    Full Text Available Rathayibacter toxicus is a forage grass associated Gram-positive bacterium of major concern to food safety and agriculture. This species is listed by USDA-APHIS as a plant pathogen select agent because it produces a tunicamycin-like toxin that is lethal to livestock and may be vectored by nematode species native to the U.S. The complete genomes of two strains of R. toxicus, including the type strain FH-79, were sequenced and analyzed in comparison with all available, complete R. toxicus genomes. Genome sizes ranged from 2,343,780 to 2,394,755 nucleotides, with 2079 to 2137 predicted open reading frames; all four strains showed remarkable synteny over nearly the entire genome, with only a small transposed region. A cluster of genes with similarity to the tunicamycin biosynthetic cluster from Streptomyces chartreusis was identified. The tunicamycin gene cluster (TGC) in R. toxicus contained 14 genes in two transcriptional units, with all of the functional elements for tunicamycin biosynthesis present. The TGC had a significantly lower GC content (52%) than the rest of the genome (61.5%), suggesting that the TGC may have originated from a horizontal transfer event. Further analysis indicated numerous remnants of other potential horizontal transfer events are present in the genome. In addition to the TGC, genes potentially associated with carotenoid and exopolysaccharide production, bacteriocins and secondary metabolites were identified. A CRISPR array is evident. There were relatively few plant-associated cell-wall hydrolyzing enzymes, but there were numerous secreted serine proteases that share sequence homology to the pathogenicity-associated protein Pat-1 of Clavibacter michiganensis. Overall, the genome provides clear insight into the possible mechanisms for toxin production in R. toxicus, providing a basis for future genetic approaches.

  16. Whole genome sequence of two Rathayibacter toxicus strains reveals a tunicamycin biosynthetic cluster similar to Streptomyces chartreusis.

    Science.gov (United States)

    Sechler, Aaron J; Tancos, Matthew A; Schneider, David J; King, Jonas G; Fennessey, Christine M; Schroeder, Brenda K; Murray, Timothy D; Luster, Douglas G; Schneider, William L; Rogers, Elizabeth E

    2017-01-01

    Rathayibacter toxicus is a forage grass associated Gram-positive bacterium of major concern to food safety and agriculture. This species is listed by USDA-APHIS as a plant pathogen select agent because it produces a tunicamycin-like toxin that is lethal to livestock and may be vectored by nematode species native to the U.S. The complete genomes of two strains of R. toxicus, including the type strain FH-79, were sequenced and analyzed in comparison with all available, complete R. toxicus genomes. Genome sizes ranged from 2,343,780 to 2,394,755 nucleotides, with 2079 to 2137 predicted open reading frames; all four strains showed remarkable synteny over nearly the entire genome, with only a small transposed region. A cluster of genes with similarity to the tunicamycin biosynthetic cluster from Streptomyces chartreusis was identified. The tunicamycin gene cluster (TGC) in R. toxicus contained 14 genes in two transcriptional units, with all of the functional elements for tunicamycin biosynthesis present. The TGC had a significantly lower GC content (52%) than the rest of the genome (61.5%), suggesting that the TGC may have originated from a horizontal transfer event. Further analysis indicated numerous remnants of other potential horizontal transfer events are present in the genome. In addition to the TGC, genes potentially associated with carotenoid and exopolysaccharide production, bacteriocins and secondary metabolites were identified. A CRISPR array is evident. There were relatively few plant-associated cell-wall hydrolyzing enzymes, but there were numerous secreted serine proteases that share sequence homology to the pathogenicity-associated protein Pat-1 of Clavibacter michiganensis. Overall, the genome provides clear insight into the possible mechanisms for toxin production in R. toxicus, providing a basis for future genetic approaches.

  17. Whole Genome Sequencing Reveals the Islands of Novel Polymorphisms in Two Native Aromatic Japonica Rice Landraces from Vietnam.

    Science.gov (United States)

    Trung, Khuat Huu; Nguyen, Truong Khoa; Khuat, Hoang Bao Truc; Nguyen, Thuy Diep; Khanh, Tran Dang; Xuan, Tran Dang; Nguyen, Xuan-Hung

    2017-06-01

    Elucidation of the rice genome will not only broaden our understanding of genetic characterization of the agronomic characteristics but also facilitate the rice genetic improvement through marker assisted breeding. However, the genome resources of aromatic rice varieties are largely unexploited. Therefore, the whole genomes of two elite aromatic traditional japonica rice landraces in North Vietnam, Tam Xoan Bac Ninh (TXBN) and Tam Xoan Hai Hau (TXHH), were sequenced to identify their genome-wide polymorphisms. Overall, we identified over 40,000 novel polymorphisms in each aromatic rice landrace. Although a discontinuous 8-bp deletion and an A/T SNP just upstream of the 5-bp deletion in exon 7 of the BADH2 gene were present in both rice landraces, the number of SNP high resolution regions of TXBN was six times higher than that of TXHH. Furthermore, several hot spot regions of novel SNPs and indels were found in both genomes, providing their potential gene pools related to aroma formation. The genomic information of two aromatic rice landraces described in this study will facilitate the identification of fragrance-related genes and the genetic improvement of rice. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  18. The subclonal structure and genomic evolution of oral squamous cell carcinoma revealed by ultra-deep sequencing

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin J

    2017-01-01

    Recent studies suggest that head and neck squamous cell carcinomas are very heterogeneous between patients; however the subclonal structure remains unexplored mainly due to studies using only a single biopsy per patient. To deconvolute the clonal structure and describe the genomic cancer evolution......, we applied whole-exome sequencing combined with ultra-deep targeted sequencing on oral squamous cell carcinomas (OSCC). From each patient, a set of biopsies was sampled from distinct geographical sites in primary tumor and lymph node metastasis. We demonstrate that the included OSCCs show a high...

  19. The genome sequence of the leaf-cutter ant Atta cephalotes reveals insights into its obligate symbiotic lifestyle.

    Directory of Open Access Journals (Sweden)

    Garret Suen

    2011-02-01

    Full Text Available Leaf-cutter ants are one of the most important herbivorous insects in the Neotropics, harvesting vast quantities of fresh leaf material. The ants use leaves to cultivate a fungus that serves as the colony's primary food source. This obligate ant-fungus mutualism is one of the few occurrences of farming by non-humans and likely facilitated the formation of their massive colonies. Mature leaf-cutter ant colonies contain millions of workers ranging in size from small garden tenders to large soldiers, resulting in one of the most complex polymorphic caste systems within ants. To begin uncovering the genomic underpinnings of this system, we sequenced the genome of Atta cephalotes using 454 pyrosequencing. One prediction from this ant's lifestyle is that it has undergone genetic modifications that reflect its obligate dependence on the fungus for nutrients. Analysis of this genome sequence is consistent with this hypothesis, as we find evidence for reductions in genes related to nutrient acquisition. These include extensive reductions in serine proteases (which are likely unnecessary because proteolysis is not a primary mechanism used to process nutrients obtained from the fungus), a loss of genes involved in arginine biosynthesis (suggesting that this amino acid is obtained from the fungus), and the absence of a hexamerin (which sequesters amino acids during larval development in other insects). Following recent reports of genome sequences from other insects that engage in symbioses with beneficial microbes, the A. cephalotes genome provides new insights into the symbiotic lifestyle of this ant and advances our understanding of host-microbe symbioses.

  20. Complete genome sequence analysis of Nocardia brasiliensis HUJEG-1 reveals a saprobic lifestyle and the genes needed for human pathogenesis.

    Science.gov (United States)

    Vera-Cabrera, Lucio; Ortiz-Lopez, Rocio; Elizondo-Gonzalez, Ramiro; Ocampo-Candiani, Jorge

    2013-01-01

    Nocardia brasiliensis is an important etiologic agent of mycetoma. These bacteria live as a saprobe in soil or organic material and enter the tissue via minor trauma. Mycetoma is characterized by tumefaction and the production of fistula and abscesses, with no spontaneous cure. By using mass sequencing, we determined the complete genomic nucleotide sequence of the bacteria. According to our data, the genome is a circular chromosome 9,436,348-bp long with 68% G+C content that encodes 8,414 proteins. We observed orthologs for virulence factors, a higher number of genes involved in lipid biosynthesis and catabolism, and gene clusters for the synthesis of bioactive compounds, such as antibiotics, terpenes, and polyketides. An in silico analysis of the sequence supports the conclusion that the bacteria acquired diverse genes by horizontal transfer from other soil bacteria, even from eukaryotic organisms. The genome composition reflects the evolution of bacteria via the acquisition of a large amount of DNA, which allows it to survive in new ecological niches, including humans.

  1. Complete genome sequence analysis of Nocardia brasiliensis HUJEG-1 reveals a saprobic lifestyle and the genes needed for human pathogenesis.

    Directory of Open Access Journals (Sweden)

    Lucio Vera-Cabrera

    Full Text Available Nocardia brasiliensis is an important etiologic agent of mycetoma. These bacteria live as a saprobe in soil or organic material and enter the tissue via minor trauma. Mycetoma is characterized by tumefaction and the production of fistula and abscesses, with no spontaneous cure. By using mass sequencing, we determined the complete genomic nucleotide sequence of the bacteria. According to our data, the genome is a circular chromosome 9,436,348-bp long with 68% G+C content that encodes 8,414 proteins. We observed orthologs for virulence factors, a higher number of genes involved in lipid biosynthesis and catabolism, and gene clusters for the synthesis of bioactive compounds, such as antibiotics, terpenes, and polyketides. An in silico analysis of the sequence supports the conclusion that the bacteria acquired diverse genes by horizontal transfer from other soil bacteria, even from eukaryotic organisms. The genome composition reflects the evolution of bacteria via the acquisition of a large amount of DNA, which allows it to survive in new ecological niches, including humans.

  2. First Complete Mitochondrial Genome Sequence from a Box Jellyfish Reveals a Highly Fragmented Linear Architecture and Insights into Telomere Evolution

    Science.gov (United States)

    Smith, David Roy; Kayal, Ehsan; Yanagihara, Angel A.; Collins, Allen G.; Pirro, Stacy; Keeling, Patrick J.

    2012-01-01

    Animal mitochondrial DNAs (mtDNAs) are typically single circular chromosomes, with the exception of those from medusozoan cnidarians (jellyfish and hydroids), which are linear and sometimes fragmented. Most medusozoans have linear monomeric or linear bipartite mitochondrial genomes, but preliminary data have suggested that box jellyfish (cubozoans) have mtDNAs that consist of many linear chromosomes. Here, we present the complete mtDNA sequence from the winged box jellyfish Alatina moseri (the first from a cubozoan). This genome contains unprecedented levels of fragmentation: 18 unique genes distributed over eight 2.9- to 4.6-kb linear chromosomes. The telomeres are identical within and between chromosomes, and recombination between subtelomeric sequences has led to many genes initiating or terminating with sequences from other genes (the most extreme case being 150 nt of a ribosomal RNA containing the 5′ end of nad2), providing evidence for a gene conversion–based model of telomere evolution. The silent-site nucleotide variation within the A. moseri mtDNA is among the highest observed from a eukaryotic genome and may be associated with elevated rates of recombination. PMID:22117085

  3. Whole genome sequencing reveals complex evolution patterns of multidrug-resistant Mycobacterium tuberculosis Beijing strains in patients.

    Directory of Open Access Journals (Sweden)

    Matthias Merker

    Full Text Available Multidrug-resistant (MDR) Mycobacterium tuberculosis complex (MTBC) strains represent a major threat for tuberculosis (TB) control. Treatment of MDR-TB patients is long and less effective, resulting in a significant number of treatment failures. The development of further resistances leads to extensively drug-resistant (XDR) variants. However, data on the individual reasons for treatment failure, e.g. an induced mutational burst, and on the evolution of bacteria in the patient are only sparsely available. To address this question, we investigated the intra-patient evolution of serial MTBC isolates obtained from three MDR-TB patients undergoing longitudinal treatment, finally leading to XDR-TB. Sequential isolates displayed identical IS6110 fingerprint patterns, suggesting the absence of exogenous re-infection. We utilized whole genome sequencing (WGS) to screen for variations in three isolates from Patient A and four isolates from Patient B and C, respectively. Acquired polymorphisms were subsequently validated in up to 15 serial isolates by Sanger sequencing. We determined eight (Patient A) and nine (Patient B) polymorphisms, which occurred in a stepwise manner during the course of the therapy and were linked to resistance or a potential compensatory mechanism. For both patients, our analysis revealed the long-term co-existence of clonal subpopulations that displayed different drug resistance allele combinations. Out of these, the most resistant clone was fixed in the population. In contrast, baseline and follow-up isolates of Patient C were distinguished each by eleven unique polymorphisms, indicating an exogenous re-infection with an XDR strain not detected by IS6110 RFLP typing. Our study demonstrates that intra-patient microevolution of MDR-MTBC strains under longitudinal treatment is more complex than previously anticipated. However, a mutator phenotype was not detected. The presence of different subpopulations might confound phenotypic and

  4. Whole genome sequencing of mutation accumulation lines reveals a low mutation rate in the social amoeba Dictyostelium discoideum.

    Directory of Open Access Journals (Sweden)

    Gerda Saxer

    Full Text Available Spontaneous mutations play a central role in evolution. Despite their importance, mutation rates are some of the most elusive parameters to measure in evolutionary biology. The combination of mutation accumulation (MA) experiments and whole-genome sequencing now makes it possible to estimate mutation rates by directly observing new mutations at the molecular level across the whole genome. We performed an MA experiment with the social amoeba Dictyostelium discoideum and sequenced the genomes of three randomly chosen lines using high-throughput sequencing to estimate the spontaneous mutation rate in this model organism. The mitochondrial mutation rate of 6.76×10⁻⁹, with a Poisson confidence interval of 4.1×10⁻⁹ to 9.5×10⁻⁹, per nucleotide per generation is slightly lower than estimates for other taxa. The mutation rate estimate for the nuclear DNA of 2.9×10⁻¹¹, with a Poisson confidence interval ranging from 7.4×10⁻¹³ to 1.6×10⁻¹⁰, is the lowest reported for any eukaryote. These results are consistent with low microsatellite mutation rates previously observed in D. discoideum and low levels of genetic variation observed in wild D. discoideum populations. In addition, D. discoideum has been shown to be quite resistant to DNA damage, which suggests an efficient DNA-repair mechanism that could be an adaptation to life in soil and frequent exposure to intracellular and extracellular mutagenic compounds. The social aspect of the life cycle of D. discoideum and a large portion of the genome under relaxed selection during vegetative growth could also select for a low mutation rate. This hypothesis is supported by a significantly lower mutation rate per cell division in multicellular eukaryotes compared with unicellular eukaryotes.

  5. The Subclonal Structure and Genomic Evolution of Oral Squamous Cell Carcinoma Revealed by Ultra-deep Sequencing

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    Background: Oral squamous cell carcinoma (OSCC), a subgroup of head and neck squamous cell carcinoma (HNSCC), is primarily caused by alcohol consumption and tobacco use. Recent DNA sequencing studies suggest that HNSCC are very heterogeneous between patients; however the intra-patient subclonal...... structure remains unexplored due to lack of sampling multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with ultra-deep targeted sequencing on biopsies from 5 stage IV...... of unprecedented high resolution enabling clear detection of subclonal structure and observation of otherwise undetectable mutations. Furthermore, we demonstrate that OSCC show a high degree of inter-patient heterogeneity but a low degree of intra-patient/tumor heterogeneity. However, some OSCC cancers contain...

  6. Whole genome sequencing and analysis of Campylobacter coli YH502 from retail chicken reveals a plasmid-borne type VI secretion system

    Directory of Open Access Journals (Sweden)

    Sandeep Ghatak

    2017-03-01

    Full Text Available Campylobacter is a major cause of foodborne illnesses worldwide. Campylobacter infections, commonly caused by ingestion of undercooked poultry and meat products, can lead to gastroenteritis and chronic reactive arthritis in humans. Whole genome sequencing (WGS) is a powerful technology that provides comprehensive genetic information about bacteria and is increasingly being applied to study foodborne pathogens: e.g., evolution, epidemiology/outbreak investigation, and detection. Herein we report the complete genome sequence of Campylobacter coli strain YH502 isolated from retail chicken in the United States. WGS, de novo assembly, and annotation of the genome revealed a chromosome of 1,718,974 bp and a mega-plasmid (pCOS502) of 125,964 bp. GC content of the genome was 31.2% with 1931 coding sequences and 53 non-coding RNAs. Multiple virulence factors including a plasmid-borne type VI secretion system and antimicrobial resistance genes (beta-lactams, fluoroquinolones, and aminoglycosides) were found. The presence of T6SS in a mobile genetic element (plasmid) suggests plausible horizontal transfer of these virulence genes to other organisms. The C. coli YH502 genome also harbors CRISPR sequences and associated proteins. Phylogenetic analysis based on average nucleotide identity and single nucleotide polymorphisms identified closely related C. coli genomes available in the NCBI database. Taken together, the analyzed genomic data of this potentially virulent strain of C. coli will facilitate further understanding of this important foodborne pathogen most likely leading to better control strategies. The chromosome and plasmid sequences of C. coli YH502 have been deposited in GenBank under the accession numbers CP018900.1 and CP018901.1, respectively.

  7. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima

    DEFF Research Database (Denmark)

    Chipman, Ariel D.; Ferrier, David E.K.; Brena, Carlo

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We pres...

  8. The Genomic Sequence of the Oral Pathobiont Strain NI1060 Reveals Unique Strategies for Bacterial Competition and Pathogenicity.

    Directory of Open Access Journals (Sweden)

    Youssef Darzi

    Full Text Available Strain NI1060 is an oral bacterium responsible for periodontitis in a murine ligature-induced disease model. To better understand its pathogenicity, we have determined the complete sequence of its 2,553,982 bp genome. Although closely related to Pasteurella pneumotropica, a pneumonia-associated rodent commensal based on its 16S rRNA, the NI1060 genomic content suggests that they are different species thriving on different energy sources via alternative metabolic pathways. Genomic and phylogenetic analyses showed that strain NI1060 is distinct from the genera currently described in the family Pasteurellaceae, and is likely to represent a novel species. In addition, we found putative virulence genes involved in lipooligosaccharide synthesis, adhesins and bacteriotoxic proteins. These genes are potentially important for host adaption and for the induction of dysbiosis through bacterial competition and pathogenicity. Importantly, strain NI1060 strongly stimulates Nod1, an innate immune receptor, but is defective in two peptidoglycan recycling genes due to a frameshift mutation. The in-depth analysis of its genome thus provides critical insights for the development of NI1060 as a prime model system for infectious disease.

  9. Analysis of The Cancer Genome Atlas sequencing data reveals novel properties of the human papillomavirus 16 genome in head and neck squamous cell carcinoma.

    Science.gov (United States)

    Nulton, Tara J; Olex, Amy L; Dozmorov, Mikhail; Morgan, Iain M; Windle, Brad

    2017-03-14

    Human papillomavirus (HPV) DNA is detected in up to 80% of oropharyngeal carcinomas (OPC) and this HPV positive disease has reached epidemic proportions. To increase our understanding of the disease, we investigated the status of the HPV16 genome in HPV-positive head and neck cancers (HNC). Raw RNA-Seq and Whole Genome Sequence data from The Cancer Genome Atlas HNC samples were analyzed to gain a full understanding of the HPV genome status for these tumors. Several remarkable and novel observations were made following this analysis. Firstly, there are three main HPV genome states in these tumors that are split relatively evenly: An episomal only state, an integrated state, and a state in which the viral genome exists as a hybrid episome with human DNA. Secondly, none of the tumors expressed high levels of E6; E6*I is the dominant variant expressed in all tumors. The most striking conclusion from this study is that around three quarters of HPV16 positive HNC contain episomal versions of the viral genome that are likely replicating in an E1-E2 dependent manner. The clinical and therapeutic implications of these observations are discussed.

  10. Genomic library screening for viruses from the human dental plaque revealed pathogen-specific lytic phage sequences.

    Science.gov (United States)

    Al-Jarbou, Ahmed Nasser

    2012-01-01

    Bacterial pathogens possess an astounding arsenal of virulence factors that allow them to conquer many different niches throughout the course of infection. Particularly fascinating is the fact that some bacterial species are able to induce different diseases by expression of different combinations of virulence factors. Nevertheless, studies aiming at screening for the presence of bacteriophages in humans have been limited. Such screening procedures would eventually lead to identification of phage-encoded properties that impart increased bacterial fitness and/or virulence in a particular niche, and hence, would potentially be used to reverse the course of bacterial infections. The human oral cavity represents a rich and dynamic ecosystem for several upper respiratory tract pathogens; however, little is known about virus diversity in human dental plaque, which is an important reservoir. We applied a culture-independent approach to characterize virus diversity in human dental plaque, making a library from a virus DNA fraction amplified using a multiple displacement method, and sequenced 80 clones. The resulting sequences showed 44% significant identities to GenBank databases by TBLASTX analysis. TBLAST homology comparisons showed that 66% were viral, 18% eukaryal, 10% bacterial, and 6% mobile elements. These sequences were sorted into 6 contigs and 45 single sequences, of which 4 contigs and a single sequence showed significant identity to a small region of a putative prophage in the Corynebacterium diphtheriae genome. These findings highlight the uniqueness of over half of the sequences, whilst the dominance of pathogen-specific prophage sequences implies a role in virulence.

  11. Whole genome re-sequencing reveals genome-wide variations among parental lines of 16 mapping populations in chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K

    2016-01-27

    Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic (drought, heat, salinity), biotic stresses (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important (protein content) traits using a whole genome re-sequencing approach. A total of 192.19 Gb of data, comprising 973.13 million reads, was generated on 35 genotypes of chickpea, with an average sequencing depth of ~10X for each line. On average, 92.18 % of reads from each genotype were aligned to the chickpea reference genome with 82.17 % coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected while comparing with the reference chickpea genome. The highest number of SNPs was identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line specific variations (144,888 SNPs and 19,968 Indels) with the highest percentage were identified in coding regions in ICC 1496 (21 %) followed by ICCV 97105 (12 %). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV) respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for

  12. Full Genome Sequencing Reveals New Southern African Territories Genotypes Bringing Us Closer to Understanding True Variability of Foot-and-Mouth Disease Virus in Africa.

    Science.gov (United States)

    Lasecka-Dykes, Lidia; Wright, Caroline F; Di Nardo, Antonello; Logan, Grace; Mioulet, Valerie; Jackson, Terry; Tuthill, Tobias J; Knowles, Nick J; King, Donald P

    2018-04-13

    Foot-and-mouth disease virus (FMDV) causes a highly contagious disease of cloven-hooved animals that poses a constant burden on farmers in endemic regions and threatens the livestock industries in disease-free countries. Despite the increased number of publicly available whole genome sequences, FMDV data are biased by the opportunistic nature of sampling. Since whole genomic sequences of Southern African Territories (SAT) are particularly underrepresented, this study sequenced 34 isolates from eastern and southern Africa. Phylogenetic analyses revealed two novel genotypes (that comprised 8/34 of these SAT isolates) which contained unusual 5′ untranslated and non-structural encoding regions. While recombination has occurred between these sequences, phylogeny violation analyses indicated that the high degree of sequence diversity for the novel SAT genotypes has not solely arisen from recombination events. Based on estimates of the timing of ancestral divergence, these data are interpreted as being representative of un-sampled FMDV isolates that have been subjected to geographical isolation within Africa by the effects of the Great African Rinderpest Pandemic (1887–1897), which caused a mass die-out of FMDV-susceptible hosts. These findings demonstrate that further sequencing of African FMDV isolates is likely to reveal more unusual genotypes and will allow for better understanding of natural variability and evolution of FMDV.

  13. Complete genome sequence of avian paramyxovirus (APMV) serotype 5 completes the analysis of nine APMV serotypes and reveals the longest APMV genome.

    Directory of Open Access Journals (Sweden)

    Arthur S Samuel

    2010-02-01

    Avian paramyxoviruses (APMV) consist of nine known serotypes. The genomes of representatives of all APMV serotypes except APMV type 5 have recently been fully sequenced. Here, we report the complete genome sequence of the APMV-5 prototype strain budgerigar/Kunitachi/74. APMV-5 Kunitachi virus is unusual in that it lacks a virion hemagglutinin and does not grow in the allantoic cavity of embryonated chicken eggs. However, the virus grew in the amniotic cavity of embryonated chicken eggs and in twelve different established cell lines and two primary cell cultures. The genome is 17,262 nucleotides (nt) long, which is the longest among members of genus Avulavirus, and encodes six non-overlapping genes in the order of 3'N-P/V/W-M-F-HN-L-5' with intergenic regions of 4-57 nt. The genome length follows the 'rule of six' and contains a 55-nt leader sequence at the 3' end and a 552-nt trailer sequence at the 5' end. The phosphoprotein (P) gene contains a conserved RNA editing site and is predicted to encode P, V, and W proteins. The cleavage site of the F protein (G-K-R-K-K-R↓F) conforms to the cleavage site motif of the ubiquitous cellular protease furin. Consistent with this, exogenous protease was not required for virus replication in vitro. However, the intracerebral pathogenicity index of APMV-5 strain Kunitachi in one-day-old chicks was found to be zero, indicating that the virus is avirulent for chickens despite the presence of a polybasic F cleavage site. Phylogenetic analysis of the sequences of the APMV-5 genome and proteins versus those of the other APMV serotypes showed that APMV-5 is more closely related to APMV-6 than to the other APMVs. Furthermore, these comparisons provided evidence of extensive genome-wide divergence that supports the classification of the APMVs into nine separate serotypes. The structure of the F cleavage site does not appear to be a reliable indicator of virulence among APMV serotypes 2-9. The availability of

  14. Genome sequencing of mouse induced pluripotent stem cells reveals retroelement stability and infrequent DNA rearrangement during reprogramming.

    Science.gov (United States)

    Quinlan, Aaron R; Boland, Michael J; Leibowitz, Mitchell L; Shumilina, Svetlana; Pehrson, Sidney M; Baldwin, Kristin K; Hall, Ira M

    2011-10-04

    The biomedical utility of induced pluripotent stem cells (iPSCs) will be diminished if most iPSC lines harbor deleterious genetic mutations. Recent microarray studies have shown that human iPSCs carry elevated levels of DNA copy number variation compared with those in embryonic stem cells, suggesting that these and other classes of genomic structural variation (SV), including inversions, smaller duplications and deletions, complex rearrangements, and retroelement transpositions, may frequently arise as a consequence of reprogramming. Here we employ whole-genome paired-end DNA sequencing and sensitive mapping algorithms to identify all classes of SV in three fully pluripotent mouse iPSC lines. Despite the improved scope and resolution of this study, we find few spontaneous mutations per line (one or two) and no evidence for endogenous retroelement transposition. These results show that genome stability can persist throughout reprogramming, and argue that it is possible to generate iPSCs lacking gene-disrupting mutations using current reprogramming methods. Copyright © 2011 Elsevier Inc. All rights reserved.

  15. Complete sequencing and pan-genomic analysis of Lactobacillus delbrueckii subsp. bulgaricus reveal its genetic basis for industrial yogurt production.

    Directory of Open Access Journals (Sweden)

    Pei Hao

    Lactobacillus delbrueckii subsp. bulgaricus (Lb. bulgaricus) is an important species of Lactic Acid Bacteria (LAB) used for cheese and yogurt fermentation. The genome of Lb. bulgaricus 2038, an industrial strain mainly used for yogurt production, was completely sequenced and compared against the other two ATCC collection strains of the same subspecies. Specific physiological properties of strain 2038, such as lysine biosynthesis, formate production, aspartate-related carbon-skeleton intermediate metabolism, unique EPS synthesis and efficient DNA restriction/modification systems, are all different from those of the collection strains that might benefit the industrial production of yogurt. Other common features shared by Lb. bulgaricus strains, such as efficient protocooperation with Streptococcus thermophilus and lactate production as well as well-equipped stress tolerance mechanisms may account for it being selected originally for yogurt fermentation industry. Multiple lines of evidence suggested that Lb. bulgaricus 2038 was genetically closer to the common ancestor of the subspecies than the other two sequenced collection strains, probably due to a strict industrial maintenance process for strain 2038 that might have halted its genome decay and sustained a gene network suitable for large scale yogurt production.

  16. Complete sequencing and pan-genomic analysis of Lactobacillus delbrueckii subsp. bulgaricus reveal its genetic basis for industrial yogurt production.

    Science.gov (United States)

    Hao, Pei; Zheng, Huajun; Yu, Yao; Ding, Guohui; Gu, Wenyi; Chen, Shuting; Yu, Zhonghao; Ren, Shuangxi; Oda, Munehiro; Konno, Tomonobu; Wang, Shengyue; Li, Xuan; Ji, Zai-Si; Zhao, Guoping

    2011-01-17

    Lactobacillus delbrueckii subsp. bulgaricus (Lb. bulgaricus) is an important species of Lactic Acid Bacteria (LAB) used for cheese and yogurt fermentation. The genome of Lb. bulgaricus 2038, an industrial strain mainly used for yogurt production, was completely sequenced and compared against the other two ATCC collection strains of the same subspecies. Specific physiological properties of strain 2038, such as lysine biosynthesis, formate production, aspartate-related carbon-skeleton intermediate metabolism, unique EPS synthesis and efficient DNA restriction/modification systems, are all different from those of the collection strains that might benefit the industrial production of yogurt. Other common features shared by Lb. bulgaricus strains, such as efficient protocooperation with Streptococcus thermophilus and lactate production as well as well-equipped stress tolerance mechanisms may account for it being selected originally for yogurt fermentation industry. Multiple lines of evidence suggested that Lb. bulgaricus 2038 was genetically closer to the common ancestor of the subspecies than the other two sequenced collection strains, probably due to a strict industrial maintenance process for strain 2038 that might have halted its genome decay and sustained a gene network suitable for large scale yogurt production.

  17. Whole genome sequencing reveals a 7 base-pair deletion in DMD exon 42 in a dog with muscular dystrophy.

    Science.gov (United States)

    Nghiem, Peter P; Bello, Luca; Balog-Alvarez, Cindy; López, Sara Mata; Bettis, Amanda; Barnett, Heather; Hernandez, Briana; Schatzberg, Scott J; Piercy, Richard J; Kornegay, Joe N

    2017-04-01

    Dystrophin is a key cytoskeletal protein coded by the Duchenne muscular dystrophy (DMD) gene located on the X-chromosome. Truncating mutations in the DMD gene cause loss of dystrophin and the classical DMD clinical syndrome. Spontaneous DMD gene mutations and associated phenotypes occur in several other species. The mdx mouse model and the golden retriever muscular dystrophy (GRMD) canine model have been used extensively to study DMD disease pathogenesis and show efficacy and side effects of putative treatments. Certain DMD gene mutations in high-risk, so-called hot spot, areas can be particularly helpful in modeling molecular therapies. Identification of specific mutations has been greatly enhanced by new genomic methods. Whole genome, next generation sequencing (WGS) has recently been used to define DMD patient mutations, but has not been used in dystrophic dogs. A dystrophin-deficient Cavalier King Charles Spaniel (CKCS) dog was evaluated at the functional, histopathological, biochemical, and molecular level. The affected dog's phenotype was compared to the previously reported canine dystrophinopathies. WGS was then used to detect a 7 base pair deletion in DMD exon 42 (c.6051-6057delTCTCAAT mRNA), predicting a frameshift in gene transcription and truncation of dystrophin protein translation. The deletion was confirmed with conventional PCR and Sanger sequencing. This mutation is in a secondary DMD gene hotspot area distinct from the one identified earlier at the 5' donor splice site of intron 50 in the CKCS breed.

  18. Transmission of Staphylococcus aureus from Humans to Green Monkeys in The Gambia as Revealed by Whole-Genome Sequencing.

    Science.gov (United States)

    Senghore, Madikay; Bayliss, Sion C; Kwambana-Adams, Brenda A; Foster-Nyarko, Ebenezer; Manneh, Jainaba; Dione, Michel; Badji, Henry; Ebruke, Chinelo; Doughty, Emma L; Thorpe, Harry A; Jasinska, Anna J; Schmitt, Christopher A; Cramer, Jennifer D; Turner, Trudy R; Weinstock, George; Freimer, Nelson B; Pallen, Mark J; Feil, Edward J; Antonio, Martin

    2016-10-01

    Staphylococcus aureus is an important pathogen of humans and animals. We genome sequenced 90 S. aureus isolates from The Gambia: 46 isolates from invasive disease in humans, 13 human carriage isolates, and 31 monkey carriage isolates. We inferred multiple anthroponotic transmissions of S. aureus from humans to green monkeys (Chlorocebus sabaeus) in The Gambia over different time scales. We report a novel monkey-associated clade of S. aureus that emerged from a human-to-monkey switch estimated to have occurred 2,700 years ago. Adaptation of this lineage to the monkey host is accompanied by the loss of phage-carrying genes that are known to play an important role in human colonization. We also report recent anthroponotic transmission of the well-characterized human lineages sequence type 6 (ST6) and ST15 to monkeys, probably because of steadily increasing encroachment of humans into the monkeys' habitat. Although we have found no evidence of transmission of S. aureus from monkeys to humans, as the two species come into ever-closer contact, there might be an increased risk of additional interspecies exchanges of potential pathogens. The population structures of Staphylococcus aureus in humans and monkeys in sub-Saharan Africa have been previously described using multilocus sequence typing (MLST). However, these data lack the power to accurately infer details regarding the origin and maintenance of new adaptive lineages. Here, we describe the use of whole-genome sequencing to detect transmission of S. aureus between humans and nonhuman primates and to document the genetic changes accompanying host adaptation. We note that human-to-monkey switches tend to be more common than the reverse and that a novel monkey-associated clade is likely to have emerged from such a switch approximately 2,700 years ago. Moreover, analysis of the accessory genome provides important clues as to the genetic changes underpinning host adaptation and, in particular, shows that human

  19. Evolution of novel wood decay mechanisms in Agaricales revealed by the genome sequences of Fistulina hepatica and Cylindrobasidium torrendii.

    Science.gov (United States)

    Floudas, Dimitrios; Held, Benjamin W; Riley, Robert; Nagy, Laszlo G; Koehler, Gage; Ransdell, Anthony S; Younus, Hina; Chow, Julianna; Chiniquy, Jennifer; Lipzen, Anna; Tritt, Andrew; Sun, Hui; Haridas, Sajeet; LaButti, Kurt; Ohm, Robin A; Kües, Ursula; Blanchette, Robert A; Grigoriev, Igor V; Minto, Robert E; Hibbett, David S

    2015-03-01

    Wood decay mechanisms in Agaricomycotina have been traditionally separated in two categories termed white and brown rot. Recently the accuracy of such a dichotomy has been questioned. Here, we present the genome sequences of the white-rot fungus Cylindrobasidium torrendii and the brown-rot fungus Fistulina hepatica both members of Agaricales, combining comparative genomics and wood decay experiments. C. torrendii is closely related to the white-rot root pathogen Armillaria mellea, while F. hepatica is related to Schizophyllum commune, which has been reported to cause white rot. Our results suggest that C. torrendii and S. commune are intermediate between white-rot and brown-rot fungi, but at the same time they show characteristics of decay that resembles soft rot. Both species cause weak wood decay and degrade all wood components but leave the middle lamella intact. Their gene content related to lignin degradation is reduced, similar to brown-rot fungi, but both have maintained a rich array of genes related to carbohydrate degradation, similar to white-rot fungi. These characteristics appear to have evolved from white-rot ancestors with stronger ligninolytic ability. F. hepatica shows characteristics of brown rot both in terms of wood decay genes found in its genome and the decay that it causes. However, genes related to cellulose degradation are still present, which is a plesiomorphic characteristic shared with its white-rot ancestors. Four wood degradation-related genes, homologs of which are frequently lost in brown-rot fungi, show signs of pseudogenization in the genome of F. hepatica. These results suggest that transition toward a brown-rot lifestyle could be an ongoing process in F. hepatica. Our results reinforce the idea that wood decay mechanisms are more diverse than initially thought and that the dichotomous separation of wood decay mechanisms in Agaricomycotina into white rot and brown rot should be revisited. Copyright © 2015 Elsevier Inc. 

  20. Evolution of novel wood decay mechanisms in Agaricales revealed by the genome sequences of Fistulina hepatica and Cylindrobasidium torrendii

    Science.gov (United States)

    Floudas, Dimitrios; Held, Benjamin W.; Riley, Robert; Nagy, Laszlo G.; Koehler, Gage; Ransdell, Anthony S.; Younus, Hina; Chow, Julianna; Chiniquy, Jennifer; Lipzen, Anna; Tritt, Andrew; Sun, Hui; Haridas, Sajeet; LaButti, Kurt; Ohm, Robin A.; Kües, Ursula; Blanchette, Robert A.; Grigoriev, Igor V.; Minto, Robert E.; Hibbett, David S.

    2015-01-01

    Wood decay mechanisms in Agaricomycotina have been traditionally separated in two categories termed white and brown rot. Recently the accuracy of such a dichotomy has been questioned. Here, we present the genome sequences of the white rot fungus Cylindrobasidium torrendii and the brown rot fungus Fistulina hepatica both members of Agaricales, combining comparative genomics and wood decay experiments. Cylindrobasidium torrendii is closely related to the white-rot root pathogen Armillaria mellea, while F. hepatica is related to Schizophyllum commune, which has been reported to cause white rot. Our results suggest that C. torrendii and S. commune are intermediate between white-rot and brown-rot fungi, but at the same time they show characteristics of decay that resembles soft rot. Both species cause weak wood decay and degrade all wood components but leave the middle lamella intact. Their gene content related to lignin degradation is reduced, similar to brown-rot fungi, but both have maintained a rich array of genes related to carbohydrate degradation, similar to white-rot fungi. These characteristics appear to have evolved from white-rot ancestors with stronger ligninolytic ability. Fistulina hepatica shows characteristics of brown rot both in terms of wood decay genes found in its genome and the decay that it causes. However, genes related to cellulose degradation are still present, which is a plesiomorphic characteristic shared with its white-rot ancestors. Four wood degradation-related genes, homologs of which are frequently lost in brown-rot fungi, show signs of pseudogenization in the genome of F. hepatica. These results suggest that transition towards a brown rot lifestyle could be an ongoing process in F. hepatica. Our results reinforce the idea that wood decay mechanisms are more diverse than initially thought and that the dichotomous separation of wood decay mechanisms in Agaricomycotina into white rot and brown rot should be revisited. PMID:25683379

  1. The Genome Sequences of Cellulomonas fimi and "Cellvibrio gilvus" Reveal the Cellulolytic Strategies of Two Facultative Anaerobes, Transfer of "Cellvibrio gilvus" to the Genus Cellulomonas, and Proposal of Cellulomonas gilvus sp. nov

    OpenAIRE

    Christopherson, Melissa R.; Suen, Garret; Bramhacharya, Shanti; Jewell, Kelsea A.; Aylward, Frank O.; Mead, David; Brumm, Phillip J.

    2013-01-01

    Actinobacteria in the genus Cellulomonas are the only known and reported cellulolytic facultative anaerobes. To better understand the cellulolytic strategy employed by these bacteria, we sequenced the genome of the Cellulomonas fimi ATCC 484(T). For comparative purposes, we also sequenced the genome of the aerobic cellulolytic "Cellvibrio gilvus" ATCC 13127(T). An initial analysis of these genomes using phylogenetic and whole-genome comparison revealed that "Cellvibrio gilvus" belongs to the ...

  2. Novel RAD sequence data reveal a lack of genomic divergence between dietary ecotypes in a landlocked salmonid population

    Science.gov (United States)

    Limborg, Morten T.; Larson, Wesley; Shedd, Kyle; Seeb, Lisa W.; Seeb, James E.

    2017-01-01

    Preservation of heritable ecological diversity within species and populations is a key challenge for managing natural resources and wild populations. Salmonid fish are iconic and socio-economically important species for commercial, aquaculture, and recreational fisheries across the globe. Many salmonids are known to exhibit ecological divergence within species, including distinct feeding ecotypes within the same lakes. Here we used 5559 SNPs, derived from RAD sequencing, to perform population genetic comparisons between two dietary ecotypes of sockeye salmon (Oncorhynchus nerka) in Jo-Jo Lake, Alaska (USA). We tested the standing hypothesis that these two ecotypes are currently diverging as a result of adaptation to distinct dietary niches; results support earlier conclusions of a single panmictic population. The RAD sequence data revealed 40 new SNPs not previously detected in the species, and our sequence data can be used in future studies of ecotypic diversity in salmonid species.

  3. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.

  4. Whole Genome Sequencing of the Symbiont Pseudovibrio sp. from the Intertidal Marine Sponge Polymastia penicillus Revealed a Gene Repertoire for Host-Switching Permissive Lifestyle.

    Science.gov (United States)

    Alex, Anoop; Antunes, Agostinho

    2015-10-31

    Sponges harbor a complex consortium of microbial communities living in symbiotic relationship benefiting each other through the integration of metabolites. The mechanisms influencing a successful microbial association with a sponge partner are yet to be fully understood. Here, we sequenced the genome of Pseudovibrio sp. POLY-S9 strain isolated from the intertidal marine sponge Polymastia penicillus sampled from the Atlantic coast of Portugal to identify the genomic features favoring the symbiotic relationship. The draft genome revealed an exceptionally large genome size of 6.6 Mbp compared with the previously reported genomes of the genus Pseudovibrio isolated from a coral and a sponge larva. Our genomic study detected the presence of several biosynthetic gene clusters (polyketide synthase, nonribosomal peptide synthetase and siderophore), affirming the potential ability of the genus Pseudovibrio to produce a wide variety of metabolic compounds. Moreover, we identified a repertoire of genes encoding adaptive symbioses factors (eukaryotic-like proteins), such as the ankyrin repeats, tetratricopeptide repeats, and Sel1 repeats that improve the attachment to the eukaryotic hosts and the avoidance of the host's immune response. The genome also harbored a large number of mobile elements (∼5%) and gene transfer agents, which explains the massive genome expansion and suggests a possible mechanism of horizontal gene transfer. In conclusion, the genome of POLY-S9 exhibited an increase in size, number of mobile DNA, multiple metabolite gene clusters, and secretion systems, likely to influence the genome diversification and the evolvability. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  5. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko

    2018-02-14

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  6. Transmission of single HIV-1 genomes and dynamics of early immune escape revealed by ultra-deep sequencing.

    Directory of Open Access Journals (Sweden)

    Will Fischer

    2010-08-01

    We used ultra-deep sequencing to obtain tens of thousands of HIV-1 sequences from regions targeted by CD8+ T lymphocytes from longitudinal samples from three acutely infected subjects, and modeled viral evolution during the critical first weeks of infection. Previous studies suggested that a single virus established productive infection, but these conclusions were tempered because of limited sampling; now, we have greatly increased our confidence in this observation through modeling the observed earliest sample diversity based on vastly more extensive sampling. Conventional sequencing of HIV-1 from acute/early infection has shown different patterns of escape at different epitopes; we investigated the earliest escapes in exquisite detail. Over 3-6 weeks, ultradeep sequencing revealed that the virus explored an extraordinary array of potential escape routes in the process of evading the earliest CD8 T-lymphocyte responses. Using 454 sequencing, we identified over 50 variant forms of each targeted epitope during early immune escape, while only 2-7 variants were detected in the same samples via conventional sequencing. In contrast to the diversity seen within epitopes, non-epitope regions, including the Envelope V3 region, which was sequenced as a control in each subject, displayed very low levels of variation. In early infection, in the regions sequenced, the consensus forms did not have a fitness advantage large enough to trigger reversion to consensus amino acids in the absence of immune pressure. In one subject, a genetic bottleneck was observed, with extensive diversity at the second time point narrowing to two dominant escape forms by the third time point, all within two months of infection. Traces of immune escape were observed in the earliest samples, suggesting that immune pressure is present and effective earlier than previously reported; quantifying the loss rate of the founder virus suggests a direct role for CD8 T-lymphocyte responses

  7. The Genome Sequence of Methanohalophilus mahii SLPT Reveals Differences in the Energy Metabolism among Members of the Methanosarcinaceae Inhabiting Freshwater and Saline Environments

    Directory of Open Access Journals (Sweden)

    Stefan Spring

    2010-01-01

    Methanohalophilus mahii is the type species of the genus Methanohalophilus, which currently comprises three distinct species with validly published names. Mhp. mahii represents moderately halophilic methanogenic archaea with a strictly methylotrophic metabolism. The type strain SLPT was isolated from hypersaline sediments collected from the southern arm of Great Salt Lake, Utah. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,012,424 bp genome is a single replicon with 2032 protein-coding and 63 RNA genes and part of the Genomic Encyclopedia of Bacteria and Archaea project. A comparison of the reconstructed energy metabolism in the halophilic species Mhp. mahii with other representatives of the Methanosarcinaceae reveals some interesting differences to freshwater species.

  8. Whole genome sequencing of pairwise human subjects reveals DNA mutations specific to developmental dysplasia of the hip.

    Science.gov (United States)

    Zhu, Lun-Qing; Su, Guang-Hao; Dai, Jin; Zhang, Wen-Yan; Yin, Chun-Hua; Zhang, Fu-Yong; Zhu, Zhen-Hua; Guo, Zhi-Xiong; Fang, Jian-Feng; Zou, Cheng-da; Chen, Xing-Guang; Zhang, Ya; Xu, Cai-Ying; Zhen, Yun-Fang; Wang, Xiao-Dong

    2018-02-25

    Developmental dysplasia of the hip (DDH) is a common congenital malformation characterized by a mismatch in shape between the femoral head and acetabulum, which leads to hip dysplasia. To date, the pathogenesis of DDH is poorly understood and may involve multiple factors, including genetic predisposition. However, comprehensive genetic analysis has not been applied to investigate a genetic component of DDH. In the present study, 10 pairs of healthy fathers and DDH daughters were enrolled to identify genetic hallmarks of DDH using high-throughput whole genome sequencing. DDH-specific DNA mutations were found in each patient. Overall, 1344 genes contained DDH-specific mutations. Functional enrichment analysis showed that these genes played important roles in the cytoskeleton, microtubule cytoskeleton, sarcoplasm and microtubule-associated complex. These functions affected osteoblast and osteoclast development. Therefore, we proposed that the DDH-specific mutations might affect bone development and cause DDH. Our pairwise high-throughput sequencing results comprehensively delineated genetic hallmarks of DDH. Further research into the biological impact of these mutations may inform the development of DDH diagnostic tools and allow neonatal gene screening. Copyright © 2018. Published by Elsevier Inc.

  9. Whole-genome sequencing of multiple myeloma from diagnosis to plasma cell leukemia reveals genomic initiating events, evolution, and clonal tides.

    Science.gov (United States)

    Egan, Jan B; Shi, Chang-Xin; Tembe, Waibhav; Christoforides, Alexis; Kurdoglu, Ahmet; Sinari, Shripad; Middha, Sumit; Asmann, Yan; Schmidt, Jessica; Braggio, Esteban; Keats, Jonathan J; Fonseca, Rafael; Bergsagel, P Leif; Craig, David W; Carpten, John D; Stewart, A Keith

    2012-08-02

    The longitudinal evolution of a myeloma genome from diagnosis to plasma cell leukemia has not previously been reported. We used whole-genome sequencing (WGS) on 4 purified tumor samples and patient germline DNA drawn over a 5-year period in a t(4;14) multiple myeloma patient. Tumor samples were acquired at diagnosis, first relapse, second relapse, and end-stage secondary plasma cell leukemia (sPCL). In addition to the t(4;14), all tumor time points also shared 10 common single-nucleotide variants (SNVs) on WGS comprising shared initiating events. Interestingly, we observed genomic sequence variants that waxed and waned with time in progressive tumors, suggesting the presence of multiple independent, yet related, clones at diagnosis that rose and fell in dominance. Five newly acquired SNVs, including truncating mutations of RB1 and ZKSCAN3, were observed only in the final sPCL sample suggesting leukemic transformation events. This longitudinal WGS characterization of the natural history of a high-risk myeloma patient demonstrated tumor heterogeneity at diagnosis with shifting dominance of tumor clones over time and has also identified potential mutations contributing to myelomagenesis as well as transformation from myeloma to overt extramedullary disease such as sPCL.

  10. Genomic sequencing of Pleistocene cave bears

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  11. Whole-genome sequencing reveals that Shewanella haliotis Kim et al. 2007 can be considered a later heterotypic synonym of Shewanella algae Simidu et al. 1990.

    Science.gov (United States)

    Szeinbaum, Nadia; Kellum, Cailin E; Glass, Jennifer B; Janda, J Michael; DiChristina, Thomas J

    2018-04-01

    Previously, experimental DNA-DNA hybridization (DDH) between Shewanella haliotis JCM 14758T and Shewanella algae JCM 21037T had suggested that the two strains could be considered different species, despite minimal phenotypic differences. The recent isolation of Shewanella sp. MN-01, with 99 % 16S rRNA gene identity to S. algae and S. haliotis, revealed a potential taxonomic problem between these two species. In this study, we reassessed the nomenclature of S. haliotis and S. algae using available whole-genome sequences. The whole-genome sequence of S. haliotis JCM 14758T and ten S. algae strains showed ≥97.7 % average nucleotide identity and >78.9 % digital DDH, clearly above the recommended species thresholds. According to the rules of priority and in view of the results obtained, S. haliotis is to be considered a later heterotypic synonym of S. algae. Because the whole-genome sequence of Shewanella sp. strain MN-01 shares >99 % ANI with S. algae JCM 14758T, it can be confidently identified as S. algae.
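
    The species call in this abstract rests on comparing genome-wide similarity values against cutoffs. A minimal sketch of that comparison follows. The cutoff values (95 % ANI, 70 % digital DDH) are the commonly cited species boundaries and are an assumption here, since the abstract does not state which thresholds were applied; the function name is hypothetical.

```python
# Commonly cited species-level cutoffs. These are assumptions for
# illustration; the abstract only says "recommended species thresholds".
ANI_SPECIES_THRESHOLD = 95.0    # percent average nucleotide identity
DDDH_SPECIES_THRESHOLD = 70.0   # percent digital DNA-DNA hybridization


def same_species(ani_percent: float, dddh_percent: float) -> bool:
    """Return True when both similarity measures meet the species cutoffs."""
    return (ani_percent >= ANI_SPECIES_THRESHOLD
            and dddh_percent >= DDDH_SPECIES_THRESHOLD)


# Lower bounds reported in the abstract for S. haliotis vs. S. algae strains.
print(same_species(ani_percent=97.7, dddh_percent=78.9))
```

    With the lower bounds quoted in the abstract, both measures exceed the assumed cutoffs, which is consistent with treating the two names as one species.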

  12. Genome sequencing of the lizard parasite Leishmania tarentolae reveals loss of genes associated to the intracellular stage of human pathogenic species

    OpenAIRE

    Raymond, Frédéric; Boisvert, Sébastien; Roy, Gaétan; Ritt, Jean-François; Légaré, Danielle; Isnard, Amandine; Stanke, Mario; Olivier, Martin; Tremblay, Michel J.; Papadopoulou, Barbara; Ouellette, Marc; Corbeil, Jacques

    2011-01-01

    The Leishmania tarentolae Parrot-TarII strain genome sequence was resolved to a mean 16-fold coverage by next-generation DNA sequencing technologies. This is the first genome of a kinetoplastid protozoan non-pathogenic to humans to be described, thus providing an opportunity for comparison with the completed genomes of pathogenic Leishmania species. A high synteny was observed between all sequenced Leishmania species. A limited number of chromosomal regions diverged between L. tarentolae a...

  13. Genome-Wide Footprints of Pig Domestication and Selection Revealed through Massive Parallel Sequencing of Pooled DNA

    NARCIS (Netherlands)

    Amaral, A.J.; Ferretti, L.; Megens, H.J.W.C.; Crooijmans, R.P.M.A.; Nie, H.; Ramos-Onsins, S.E.; Perez-Enciso, M.; Schook, L.B.; Groenen, M.A.M.

    2011-01-01

    Background Artificial selection has caused rapid evolution in domesticated species. The identification of selection footprints across domesticated genomes can help uncover the genetic basis of phenotypic diversity. Methodology/Main Findings Genome wide footprints of pig domestication and

  14. Complete genome sequence and transcriptomic analysis of a novel marine strain Bacillus weihaiensis reveals the mechanism of brown algae degradation.

    Science.gov (United States)

    Zhu, Yueming; Chen, Peng; Bao, Yunjuan; Men, Yan; Zeng, Yan; Yang, Jiangang; Sun, Jibin; Sun, Yuanxia

    2016-11-30

    A novel marine strain that efficiently degrades brown algae was isolated, identified, and assigned to Bacillus weihaiensis Alg07. Alga-associated marine bacteria promote the nutrient cycle and perform important functions in the marine ecosystem. De novo sequencing of the B. weihaiensis Alg07 genome was carried out. Results of gene annotation and carbohydrate-active enzyme analysis showed that the strain harbored enzymes that can completely degrade alginate and laminarin, which are the specific polysaccharides of brown algae. We also found genes for the utilization of mannitol, the major storage monosaccharide in the cell of brown algae. To understand the process of brown algae decomposition by B. weihaiensis Alg07, RNA-seq transcriptome analysis and qRT-PCR were performed. The genes involved in alginate metabolism were all up-regulated in the initial stage of kelp degradation, suggesting that strain Alg07 first degrades alginate to disrupt the cell wall so that the laminarin and mannitol are released and subsequently decomposed. The key genes involved in alginate and laminarin degradation were expressed in Escherichia coli and characterized. Overall, a model of brown algae degradation by the marine strain Alg07 was established, and novel alginate lyases and a laminarinase were discovered.

  15. Bioinformatic analysis of whole genome sequencing data

    OpenAIRE

    Maqbool, Khurram

    2014-01-01

    Evolution has shaped life forms for billions of years. Domestication is an accelerated process that can be used as a model for evolutionary changes. The aim of this thesis project has been to carry out extensive bioinformatic analyses of whole genome sequencing data to reveal SNPs, InDels and selective sweeps in the chicken, pig and dog genomes. Pig genome sequencing revealed loci under selection for elongation of back and increased number of vertebrae, associated with the NR6A1, PLAG1,...

  16. Lactococcus lactis Diversity in Undefined Mixed Dairy Starter Cultures as Revealed by Comparative Genome Analyses and Targeted Amplicon Sequencing of epsD.

    Science.gov (United States)

    Frantzen, Cyril A; Kleppen, Hans Petter; Holo, Helge

    2018-02-01

    Undefined mesophilic mixed (DL) starter cultures are used in the production of continental cheeses and contain unknown strain mixtures of Lactococcus lactis and leuconostocs. The choice of starter culture affects the taste, aroma, and quality of the final product. To gain insight into the diversity of Lactococcus lactis strains in starter cultures, we whole-genome sequenced 95 isolates from three different starter cultures. Pan-genomic analyses, which included 30 publicly available complete genomes, grouped the strains into 21 L. lactis subsp. lactis and 28 L. lactis subsp. cremoris lineages. Only one of the 95 isolates grouped with previously sequenced strains, and the three starter cultures showed no overlap in lineage distributions. The culture diversity was assessed by targeted amplicon sequencing using purR, a core gene, and epsD, present in 93 of the 95 starter culture isolates but absent in most of the reference strains. This enabled an unprecedented discrimination of starter culture Lactococcus lactis and revealed substantial differences between the three starter cultures and compositional shifts during the cultivation of cultures in milk. IMPORTANCE In contemporary cheese production, standardized frozen seed stock starter cultures are used to ensure production stability, reproducibility, and quality control of the product. The dairy industry experiences significant disruptions of cheese production due to phage attacks, and one commonly used countermeasure to phage attack is to employ a starter rotation strategy, in which two or more starters with minimal overlap in phage sensitivity are used alternately. A culture-independent analysis of the lactococcal diversity in complex undefined starter cultures revealed large differences between the three starter cultures and temporal shifts in lactococcal composition during the production of bulk starters. A better understanding of the lactococcal diversity in starter cultures will enable the development of

  17. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.
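
    The pan- and core-genome figures in this abstract come down to set operations over per-strain gene content. The sketch below illustrates that idea, assuming genes have already been clustered into families; the strain names and gene-family labels are hypothetical, and the abstract does not describe the clustering method the authors used.

```python
def pan_and_core(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (pan-genome, core genome) from per-strain gene-family sets.

    The pan-genome is the union of all gene families; the core genome is
    the intersection, i.e. the families present in every strain.
    """
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


# Hypothetical toy input: three strains with overlapping gene families.
toy = {
    "strain_A": {"gyrB", "recA", "luxA", "tnpX"},
    "strain_B": {"gyrB", "recA", "luxA"},
    "strain_C": {"gyrB", "recA", "casZ"},
}
pan, core = pan_and_core(toy)
print(sorted(core), len(core) / len(pan))
```

    The ratio of core to pan-genome size in the toy data plays the same role as the conserved fraction reported for the genus, although the real analysis works on thousands of gene families.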

  18. Bioinformatics Analysis of the Complete Genome Sequence of the Mango Tree Pathogen Pseudomonas syringae pv. syringae UMAF0158 Reveals Traits Relevant to Virulence and Epiphytic Lifestyle.

    Directory of Open Access Journals (Sweden)

    Pedro Manuel Martínez-García

    The genomes of more than 100 Pseudomonas syringae strains have been sequenced to date; however, only a few of them have been fully assembled, including P. syringae pv. syringae B728a. Different strains of pv. syringae cause different diseases and have different host specificities; so, UMAF0158 is a P. syringae pv. syringae strain related to B728a but instead of being a bean pathogen it causes apical necrosis of mango trees, and the two strains belong to different phylotypes of pv. syringae and clades of P. syringae. In this study we report the complete sequence and annotation of P. syringae pv. syringae UMAF0158 chromosome and plasmid pPSS158. A comparative analysis with the available sequenced genomes of 25 other P. syringae strains, both closed (the reference genomes DC3000, 1448A and B728a) and draft genomes, was performed. The 5.8 Mb UMAF0158 chromosome has 59.3% GC content and comprises 5017 predicted protein-coding genes. Bioinformatics analysis revealed the presence of genes potentially implicated in the virulence and epiphytic fitness of this strain. We identified several genetic features, which are absent in B728a, that may explain the ability of UMAF0158 to colonize and infect mango trees: the mangotoxin biosynthetic operon mbo, a gene cluster for cellulose production, two different type III and two type VI secretion systems, and a particular T3SS effector repertoire. A mutant strain defective in the rhizobial-like T3SS Rhc showed no differences compared to wild-type during its interaction with host and non-host plants and worms. Here we report the first complete sequence of the chromosome of a pv. syringae strain pathogenic to a woody plant host. Our data also shed light on the genetic factors that possibly determine the pathogenic and epiphytic lifestyle of UMAF0158. This work provides the basis for further analysis on specific mechanisms that enable this strain to infect woody plants and for the functional analysis of host

  19. Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis.

    Science.gov (United States)

    Wee, Wei Yee; Tan, Tze King; Jakubovics, Nicholas S; Choo, Siew Woh

    2016-01-01

    Mycobacterium brisbanense is a member of the Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY, which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7 Mbp) is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work on M. brisbanense.

  20. Sequence-Based Mapping and Genome Editing Reveal Mutations in Stickleback Hps5 Cause Oculocutaneous Albinism and the casper Phenotype.

    Science.gov (United States)

    Hart, James C; Miller, Craig T

    2017-09-07

    Here, we present and characterize the spontaneous X-linked recessive mutation casper, which causes oculocutaneous albinism in threespine sticklebacks (Gasterosteus aculeatus). In humans, Hermansky-Pudlak syndrome results in pigmentation defects due to disrupted formation of the melanin-containing lysosomal-related organelle (LRO), the melanosome. casper mutants display not only reduced pigmentation of melanosomes in melanophores, but also reductions in the iridescent silver color from iridophores, while the yellow pigmentation from xanthophores appears unaffected. We mapped casper using high-throughput sequencing of genomic DNA from bulked casper mutants to a region of the stickleback X chromosome (chromosome 19) near the stickleback ortholog of Hermansky-Pudlak syndrome 5 (Hps5). casper mutants have an insertion of a single nucleotide in the sixth exon of Hps5, predicted to generate an early frameshift. Genome editing using CRISPR/Cas9 induced lesions in Hps5 and phenocopied the casper mutation. Injecting single or paired Hps5 guide RNAs revealed higher incidences of genomic deletions from paired guide RNAs compared to single gRNAs. Stickleback Hps5 provides a genetic system where a hemizygous locus in XY males and a diploid locus in XX females can be used to generate an easily scored visible phenotype, facilitating quantitative studies of different genome editing approaches. Lastly, we show the ability to better visualize patterns of fluorescent transgenic reporters in Hps5 mutant fish. Thus, Hps5 mutations present an opportunity to study pigmented LROs in the emerging stickleback model system, as well as a tool to aid in assaying genome editing and visualizing enhancer activity in transgenic fish. Copyright © 2017 Hart and Miller.

  1. Genome sequence and analysis of Lactobacillus helveticus

    Directory of Open Access Journals (Sweden)

    Paola Cremonesi

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of L. helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genomes of five L. helveticus strains were sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones.

  2. Next generation sequencing reveals genome downsizing in allotetraploid Nicotiana tabacum, predominantly through the elimination of paternally derived repetitive DNAs

    Czech Academy of Sciences Publication Activity Database

    Renny-Byfield, S.; Chester, M.; Kovařík, Aleš; Le Comber, S.C.; Grandbastien, M.-A.; Deloger, M.; Nichols, R.A.; Macas, Jiří; Novák, Petr; Chase, M.W.; Leitch, A.R.

    2011-01-01

    Vol. 28, No. 10 (2011), pp. 2843-2854 ISSN 0737-4038 R&D Projects: GA MŠk(CZ) OC10037 Institutional research plan: CEZ:AV0Z50040507; CEZ:AV0Z50040702; CEZ:AV0Z50510513 Keywords: allopolyploidy * evolution * genome structure Subject RIV: BO - Biophysics Impact factor: 5.550, year: 2011

  3. Genome-wide-analyses of Listeria monocytogenes from food-processing plants reveals clonal diversity and dates the emergence of persisting sequence types

    DEFF Research Database (Denmark)

    Knudsen, Gitte Maegaard; Nielsen, Jesper Boye; Marvig, Rasmus Lykke

    2017-01-01

    Whole genome sequencing is increasingly used in epidemiology, e.g. for tracing outbreaks of food-borne diseases. This requires in-depth understanding of pathogen emergence, persistence, and genomic diversity along the food production chain including in food processing plants. We sequenced the genomes...... of 80 isolates of Listeria monocytogenes sampled from Danish food processing plants over a time-period of 20 years, and analyzed the sequences together with 10 publicly available reference genomes to advance our understanding of inter- and intra-plant genomic diversity of L. monocytogenes. Except....... Using time-based phylogenetic analyses of the persistent STs, we estimate the L. monocytogenes evolutionary rate to be 0.18-0.35 SNPs/year, suggesting that the persistent STs emerged approximately 100 years ago, which correlates with the onset of industrialization and globalization of the food market....

  4. The complete genome sequence of Dickeya zeae EC1 reveals substantial divergence from other Dickeya strains and species

    OpenAIRE

    Zhou, Jianuan; Cheng, Yingying; Lv, Mingfa; Liao, Lisheng; Chen, Yufan; Gu, Yanfang; Liu, Shiyin; Jiang, Zide; Xiong, Yuanyan; Zhang, Lianhui

    2015-01-01

    Background Dickeya zeae is a bacterial species that infects monocotyledons and dicotyledons. Two antibiotic-like phytotoxins named zeamine and zeamine II were reported to play an important role in rice seed germination, and two genes associated with zeamine production, i.e., zmsA and zmsK, have been thoroughly characterized. However, its other virulence factors and the molecular mechanisms of its host specificity and pathogenesis remain largely unknown. Results The complete genome of D. zeae strain EC1 is...

  5. The complete chloroplast DNA sequence of the green alga Oltmannsiellopsis viridis reveals a distinctive quadripartite architecture in the chloroplast genome of early diverging ulvophytes

    Directory of Open Access Journals (Sweden)

    Lemieux Claude

    2006-02-01

    Background The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. The basal position of the Prasinophyceae has been well documented, but the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae is currently debated. The four complete chloroplast DNA (cpDNA) sequences presently available for representatives of these classes have revealed extensive variability in overall structure, gene content, intron composition and gene order. The chloroplast genome of Pseudendoclonium (Ulvophyceae), in particular, is characterized by an atypical quadripartite architecture that deviates from the ancestral type by a large inverted repeat (IR) featuring an inverted rRNA operon and a small single-copy (SSC) region containing 14 genes normally found in the large single-copy (LSC) region. To gain insights into the nature of the events that led to the reorganization of the chloroplast genome in the Ulvophyceae, we have determined the complete cpDNA sequence of Oltmannsiellopsis viridis, a representative of a distinct, early diverging lineage. Results The 151,933 bp IR-containing genome of Oltmannsiellopsis differs considerably from Pseudendoclonium and other chlorophyte cpDNAs in intron content and gene order, but shares close similarities with its ulvophyte homologue at the levels of quadripartite architecture, gene content and gene density. Oltmannsiellopsis cpDNA encodes 105 genes, contains five group I introns, and features many short dispersed repeats. As in Pseudendoclonium cpDNA, the rRNA genes in the IR are transcribed toward the single copy region featuring the genes typically found in the ancestral LSC region, and the opposite single copy region harbours genes characteristic of both the ancestral SSC and LSC regions. The 52 genes that were transferred from the ancestral LSC to SSC region include 12 of those observed in Pseudendoclonium cpDNA. Surprisingly, the overall gene organization of

  6. Genome sequencing reveals a new lineage associated with lablab bean and genetic exchange between Xanthomonas axonopodis pv. phaseoli and Xanthomonas fuscans subsp. fuscans

    Directory of Open Access Journals (Sweden)

    Valente Aritua

    2015-10-01

    Common bacterial blight is a devastating seed-borne disease of common beans that also occurs on other legume species including lablab and Lima beans. We sequenced and analysed the genomes of 26 isolates of Xanthomonas axonopodis pv. phaseoli and X. fuscans subsp. fuscans, the causative agents of this disease, collected over four decades and six continents. This revealed considerable genetic variation within both taxa, encompassing both single-nucleotide variants and differences in gene content, that could be exploited for tracking pathogen spread. The bacterial isolate from Lima bean fell within the previously described Genetic Lineage 1, along with the pathovar type isolate (NCPPB 3035). The isolates from lablab represent a new, previously unknown genetic lineage closely related to strains of X. axonopodis pv. glycines. Finally, we identified more than 100 genes that appear to have been recently acquired by Xanthomonas axonopodis pv. phaseoli from X. fuscans subsp. fuscans.

  7. Genome sequence of a diabetes-prone rodent reveals a mutation hotspot around the ParaHox gene cluster

    DEFF Research Database (Denmark)

    Hargreaves, Adam D.; Zhou, Long; Christensen, Josef

    2017-01-01

    The sand rat Psammomys obesus is a gerbil species native to deserts of North Africa and the Middle East, and is constrained in its ecology because high carbohydrate diets induce obesity and type II diabetes that, in extreme cases, can lead to pancreatic failure and death. We report the sequencing...

  8. Genetic control of environmental variation of two quantitative traits of Drosophila melanogaster revealed by whole-genome sequencing

    DEFF Research Database (Denmark)

    Sørensen, Peter; de los Campos, Gustavo; Morgante, Fabio

    2015-01-01

    and others more volatile performance. Understanding the mechanisms responsible for environmental variability not only informs medical questions but is relevant in evolution and in agricultural science. In this work fully sequenced inbred lines of Drosophila melanogaster were analyzed to study the nature...

  9. Malaria Genome Sequencing Project

    Science.gov (United States)

    2004-01-01

    proteins in plastid segregation mutants of Toxoplasma gondii. J. Biol. Parasitol. Today 11, 1-4 (1995). Chem. 276, 28436-28442 (2001). 11. Su, X. et al... parasito - gene mapping studies have shown that regions of gene synteny exist phorous vacuole membrane29 . between species of rodent malaria9 and between...Carucci, D. J. Rodent models of malaria in the genomics era. Trends Parasitol. 18, selection of karyotype mutants and non-gametocyte producer mutants

  10. Genome Re-sequencing of Diverse Sweet Cherry (Prunus avium) Individuals Reveals a Modifier Gene Mutation Conferring Pollen-part Self-compatibility.

    Science.gov (United States)

    Ono, Kentaro; Akagi, Takashi; Morimoto, Takuya; Wünsch, Ana; Tao, Ryutaro

    2018-04-04

    The S-RNase-based gametophytic self-incompatibility (GSI) reproduction barrier is important for maintaining genetic diversity in species of the families Solanaceae, Plantaginaceae, and Rosaceae. Among the plant taxa with S-RNase-based GSI, Prunus species in the family Rosaceae exhibit Prunus-specific self-incompatibility (SI). Although pistil S and pollen S determinants have been identified, the mechanism underlying SI remains uncharacterized in Prunus species. A putative pollen-part modifier was identified in this study. Disruption of this modifier supposedly confers self-compatibility (SC) to sweet cherry (Prunus avium) 'Cristobalina'. To identify the modifier, genome re-sequencing experiments were completed involving sweet cherry individuals from 18 cultivars and 43 individuals in two segregating populations. Cataloging of subsequences (35-bp kmers) from the obtained genomic reads, while referring to the mRNA-sequencing data, enabled the identification of a candidate gene [M locus-encoded GST (MGST)]. Additionally, the insertion of a transposon-like sequence in the putative MGST promoter region in 'Cristobalina' down-regulated MGST expression levels, likely leading to the SC of this cultivar. Phylogenetic, evolutionary, and gene expression analyses revealed that MGST may have undergone lineage-specific evolution, and the encoded protein may function differently from the corresponding proteins encoded by GST orthologs in other species, including members of the subfamily Maloideae (Rosaceae). Thus, MGST may be important for Prunus-specific SI. The identification of this novel modifier will expand our understanding of the Prunus-specific GSI system. We herein discuss the possible functions of MGST in the Prunus-specific GSI system.
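
    The k-mer cataloging step mentioned in this abstract can be pictured as counting every fixed-length subsequence in the reads and then asking which k-mers occur in one group of individuals but not another. The sketch below is only an illustration of that general technique, not the authors' pipeline; the function names are hypothetical, and the toy call uses k=4 for readability where the study used 35-bp k-mers.

```python
from collections import Counter
from typing import Iterable


def count_kmers(reads: Iterable[str], k: int = 35) -> Counter:
    """Count every k-length subsequence across all reads, skipping k-mers with N."""
    counts: Counter = Counter()
    for read in reads:
        read = read.upper()
        # Reads shorter than k yield an empty range and contribute nothing.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def group_specific_kmers(group_a: Iterable[str], group_b: Iterable[str],
                         k: int = 35) -> set[str]:
    """Return k-mers observed in group_a reads but never in group_b reads."""
    counts_a = count_kmers(group_a, k)
    counts_b = count_kmers(group_b, k)
    return {kmer for kmer in counts_a if kmer not in counts_b}


# Hypothetical toy reads standing in for two groups of individuals.
print(sorted(group_specific_kmers(["ACGTAC"], ["ACGTTT"], k=4)))
```

    In practice such candidate k-mers would then be cross-checked against expression data, as the abstract describes with the mRNA-sequencing reads.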

  11. Harnessing Whole Genome Sequencing in Medical Mycology.

    Science.gov (United States)

    Cuomo, Christina A

    2017-01-01

    Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.

  12. The complete nucleotide sequence of the genome of Barley yellow dwarf virus-RMV reveals it to be a new Polerovirus distantly related to other yellow dwarf viruses

    Directory of Open Access Journals (Sweden)

    Elizabeth N. Krueger

    2013-07-01

    The yellow dwarf viruses (YDVs) of the Luteoviridae family represent the most widespread group of cereal viruses worldwide. They include the Barley yellow dwarf viruses (BYDVs) of genus Luteovirus, and the Cereal yellow dwarf viruses (CYDVs) and Wheat yellow dwarf virus (WYDV) of genus Polerovirus. All of these viruses are obligately aphid transmitted and phloem-limited. The first described YDVs (initially all called BYDV) were classified by their most efficient vector. One of these viruses, BYDV-RMV, is transmitted most efficiently by the corn leaf aphid, Rhopalosiphum maidis. Here we report the complete 5612 nucleotide sequence of the genomic RNA of a Montana isolate of BYDV-RMV (isolate RMV MTFE87, GenBank accession no. KC921392). The sequence revealed that BYDV-RMV is a polerovirus, but it is quite distantly related to the CYDVs or WYDV, which are very closely related to each other. Nor is BYDV-RMV closely related to any other particular polerovirus. Depending on the gene that is compared, different poleroviruses (none of them a YDV) share the most sequence similarity to BYDV-RMV. Because of its distant relationship to other YDVs, and because it commonly infects maize via its vector, R. maidis, we propose that BYDV-RMV be renamed Maize yellow dwarf virus-RMV (MYDV-RMV).

  13. Genome sequence and analysis of the tuber crop potato

    DEFF Research Database (Denmark)

    Xu, X.; Pan, S.; Cheng, S.

    2011-01-01

    and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade...... contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop....

  14. Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome

    Directory of Open Access Journals (Sweden)

    Ritland Carol

    2009-08-01

    Background: Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. Results: We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4), from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. Conclusion: We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene

  15. Mining olive genome through library sequencing and bioinformatics ...

    African Journals Online (AJOL)

    As one of the initial steps of olive (Olea europaea L.) genome analysis, a small insert genomic DNA library was constructed (digesting olive genomic DNA with SmaI and cloning the digestion products into pUC19 vector) and randomly picked 83 colonies were sequenced. Analysis of the insert sequences revealed 12 clones ...

  16. Genome sequencing and comparative genomics of enterohemorrhagic Escherichia coli O145:H25 and O145:H28 reveal distinct evolutionary paths and marked variations in traits associated with virulence & colonization.

    Science.gov (United States)

    Lorenz, Sandra C; Gonzalez-Escalona, Narjol; Kotewicz, Michael L; Fischer, Markus; Kase, Julie A

    2017-08-22

    Enterohemorrhagic Escherichia coli (EHEC) O145 are among the top non-O157 serogroups associated with severe human disease worldwide. Two serotypes, O145:H25 and O145:H28, have been isolated from human patients, but little information is available regarding the virulence repertoire, origin and evolutionary relatedness of O145:H25. Hence, we sequenced the complete genomes of two O145:H25 strains associated with hemolytic uremic syndrome (HUS) and compared the genomes with those of previously sequenced O145:H28 and other EHEC strains. The genomes of the two O145:H25 strains were 5.3 Mbp in size, slightly smaller than those of O145:H28 and other EHEC strains. Both strains contained three nearly identical plasmids and several prophages and integrative elements, many of which differed significantly in size, gene content and organization as compared to those present in O145:H28 and other EHECs. Furthermore, notable variations were observed in several fimbrial gene clusters and intimin types possessed by O145:H25 and O145:H28, indicating potential adaptation to distinct areas of host colonization. Comparative genomics further revealed that O145:H25 are genetically more similar to other non-O157 EHEC strains than to O145:H28. Phylogenetic analysis accompanied by comparative genomics revealed that O145:H25 and O145:H28 evolved from two separate clonal lineages and that horizontal gene transfer and gene loss played a major role in the divergence of these EHEC serotypes. The data provide further evidence that ruminants might be a possible reservoir for O145:H25 but that they might be impaired in their ability to establish a persistent colonization as compared to other EHEC strains.

  17. Revealing the inventory of type III effectors in Pantoea agglomerans gall-forming pathovars using draft genome sequences and a machine-learning approach.

    Science.gov (United States)

    Nissan, Gal; Gershovits, Michael; Morozov, Michael; Chalupowicz, Laura; Sessa, Guido; Manulis-Sasson, Shulamit; Barash, Isaac; Pupko, Tal

    2018-02-01

    Pantoea agglomerans, a widespread epiphytic bacterium, has evolved into a hypersensitive response and pathogenicity (hrp)-dependent and host-specific gall-forming pathogen by the acquisition of a pathogenicity plasmid containing a type III secretion system (T3SS) and its effectors (T3Es). Pantoea agglomerans pv. betae (Pab) elicits galls on beet (Beta vulgaris) and gypsophila (Gypsophila paniculata), whereas P. agglomerans pv. gypsophilae (Pag) incites galls on gypsophila and a hypersensitive response (HR) on beet. Draft genome sequences were generated and employed in combination with a machine-learning approach and a translocation assay into beet roots to identify the pools of T3Es in the two pathovars. The genomes of the sequenced Pab4188 and Pag824-1 strains have a similar size (∼5 MB) and GC content (∼55%). Mutational analysis revealed that, in Pab4188, eight T3Es (HsvB, HsvG, PseB, DspA/E, HopAY1, HopX2, HopAF1 and HrpK) contribute to pathogenicity on beet and gypsophila. In Pag824-1, nine T3Es (HsvG, HsvB, PthG, DspA/E, HopAY1, HopD1, HopX2, HopAF1 and HrpK) contribute to pathogenicity on gypsophila, whereas the PthG effector triggers HR on beet. HsvB, HsvG, PthG and PseB appear to endow pathovar specificities to Pab and Pag, and no homologous T3Es were identified for these proteins in other phytopathogenic bacteria. Conversely, the remaining T3Es contribute to the virulence of both pathovars, and homologous T3Es were found in other phytopathogenic bacteria. Remarkably, HsvG and HsvB, which act as host-specific transcription factors, displayed the largest contribution to disease development. © 2016 BSPP AND JOHN WILEY & SONS LTD.

  18. Genome sequences of lower Great Lakes Microcystis sp. reveal strain-specific genes that are present and expressed in western Lake Erie blooms.

    Directory of Open Access Journals (Sweden)

    Kevin Anthony Meyer

    Blooms of the potentially toxic cyanobacterium Microcystis are increasing worldwide. In the Laurentian Great Lakes they pose major socioeconomic, ecological, and human health threats, particularly in western Lake Erie. However, the interpretation of "omics" data is constrained by the highly variable genome of Microcystis and the small number of reference genome sequences from strains isolated from the Great Lakes. To address this, we sequenced two Microcystis isolates from Lake Erie (Microcystis aeruginosa LE3 and M. wesenbergii LE013-01) and one from upstream Lake St. Clair (M. cf aeruginosa LSC13-02), and compared these data to the genomes of seventeen Microcystis spp. from across the globe as well as one metagenome and seven metatranscriptomes from a 2014 Lake Erie Microcystis bloom. For the publicly available strains analyzed, the core genome is ~1900 genes, representing ~11% of total genes in the pan-genome and ~45% of each strain's genome. The flexible genome content was related to Microcystis subclades defined by phylogenetic analysis of both housekeeping genes and total core genes. To our knowledge this is the first evidence that the flexible genome is linked to the core genome of the Microcystis species complex. The majority of strain-specific genes were present and expressed in bloom communities in Lake Erie. Roughly 8% of these genes from the lower Great Lakes are involved in genome plasticity (rapid gain, loss, or rearrangement of genes) and resistance to foreign genetic elements (such as CRISPR-Cas systems). Intriguingly, strain-specific genes from Microcystis cultured from around the world were also present and expressed in the Lake Erie blooms, suggesting that the Microcystis pangenome is truly global. The presence and expression of flexible genes, including strain-specific genes, suggests that strain-level genomic diversity may be important in maintaining Microcystis abundance during bloom events.

  19. Whole-Genome Sequencing of Human Clinical Klebsiella pneumoniae Isolates Reveals Misidentification and Misunderstandings of Klebsiella pneumoniae, Klebsiella variicola, and Klebsiella quasipneumoniae

    Science.gov (United States)

    Linson, Sarah E.; Ojeda Saavedra, Matthew; Cantu, Concepcion; Davis, James J.; Brettin, Thomas; Olsen, Randall J.

    2017-01-01

    ABSTRACT Klebsiella pneumoniae is a major threat to public health, causing significant morbidity and mortality worldwide. The emergence of highly drug-resistant strains is particularly concerning. There has been a recognition and division of Klebsiella pneumoniae into three distinct phylogenetic groups: Klebsiella pneumoniae, Klebsiella variicola, and Klebsiella quasipneumoniae. K. variicola and K. quasipneumoniae have often been described as opportunistic pathogens that have less virulence in humans than K. pneumoniae does. We recently sequenced the genomes of 1,777 extended-spectrum-beta-lactamase (ESBL)-producing K. pneumoniae isolates recovered from human infections and discovered that 28 strains were phylogenetically related to K. variicola and K. quasipneumoniae. Whole-genome sequencing of 95 additional non-ESBL-producing K. pneumoniae isolates recovered from patients found 12 K. quasipneumoniae strains. Matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) analysis initially identified all patient isolates as K. pneumoniae, suggesting a potential pitfall in conventional clinical microbiology laboratory identification methods. Whole-genome sequence analysis revealed extensive sharing of core gene content and plasmid replicons among the Klebsiella species. For the first time, strains of both K. variicola and K. quasipneumoniae were found to carry the Klebsiella pneumoniae carbapenemase (KPC) gene, while another K. variicola strain was found to carry the New Delhi metallo-beta-lactamase 1 (NDM-1) gene. K. variicola and K. quasipneumoniae infections were not less virulent than K. pneumoniae infections, as assessed by in-hospital mortality and infection type. We also discovered evidence of homologous recombination in one K. variicola strain, as well as one strain from a novel Klebsiella species, which challenge the current understanding of interrelationships between clades of Klebsiella. IMPORTANCE Klebsiella

  20. The draft genome sequence of the ascomycete fungus Penicillium subrubescens reveals a highly enriched content of plant biomass related CAZymes compared to related fungi.

    Science.gov (United States)

    Peng, Mao; Dilokpimol, Adiphol; Mäkelä, Miia R; Hildén, Kristiina; Bervoets, Sander; Riley, Robert; Grigoriev, Igor V; Hainaut, Matthieu; Henrissat, Bernard; de Vries, Ronald P; Granchi, Zoraide

    2017-03-20

    Here we report the genome sequence of the ascomycete saprobic fungus Penicillium subrubescens FBCC1632/CBS132785 isolated from a Jerusalem artichoke field in Finland. The 39.75 Mb genome containing 14,188 gene models is highly similar to those reported for other Penicillium species, but contains a significantly higher number of putative carbohydrate active enzyme (CAZyme) encoding genes. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Complete genome sequence of a new member of the genus Badnavirus, Dioscorea bacilliform RT virus 3, reveals the first evidence of recombination in yam badnaviruses.

    Science.gov (United States)

    Bömer, Moritz; Rathnayake, Ajith I; Visendi, Paul; Silva, Gonçalo; Seal, Susan E

    2018-02-01

    Yams (Dioscorea spp.) host a diverse range of badnaviruses (genus Badnavirus, family Caulimoviridae). The first complete genome sequence of Dioscorea bacilliform RT virus 3 (DBRTV3), which belongs to the monophyletic species group K5, is described. This virus is most closely related to Dioscorea bacilliform SN virus (DBSNV, group K4) based on a comparison of genome sequences. Recombination analysis identified a unique recombination event in DBRTV3, with DBSNV likely to be the major parent and Dioscorea bacilliform AL virus (DBALV) the minor parent, providing the first evidence for recombination in yam badnaviruses. This has important implications for yam breeding programmes globally.

  2. Whole genome sequencing of Microbacterium sp. AISO3 from polluted San Jacinto River sediment reveals high bacterial mobility, metabolic versatility and heavy metal resistance

    Directory of Open Access Journals (Sweden)

    Rupa Iyer

    2017-12-01

    The genus Microbacterium is composed of high GC content, Gram-positive bacteria of the phylum Actinobacteria known for their antibiotic production. Microbacterium species commonly colonize agricultural rhizospheres and more infrequently have been found to colonize and infect human tissues as well. Here we report the 3,696,310 bp draft genome (chromosome and plasmids) sequence assembled at the scaffold level from 232 contigs of Microbacterium sp. strain AISO3, isolated from polluted San Jacinto River sediment in Channelview, Texas. The nucleotide sequence of this genome was deposited into NCBI GenBank under the accession NHRF00000000.

  3. Genome-wide sequencing of small RNAs reveals a tissue-specific loss of conserved microRNA families in Echinococcus granulosus

    OpenAIRE

    Bai, Yun; Zhang, Zhuangzhi; Jin, Lei; Kang, Hui; Zhu, Yongqiang; Zhang, Lu; Li, Xia; Ma, Fengshou; Zhao, Li; Shi, Baoxin; Li, Jun; McManus, Donald P; Zhang, Wenbao; Wang, Shengyue

    2014-01-01

    Background MicroRNAs (miRNAs) are important post-transcriptional regulators which control growth and development in eukaryotes. The cestode Echinococcus granulosus has a complex life-cycle involving different development stages but the mechanisms underpinning this development, including the involvement of miRNAs, remain unknown. Results Using Illumina next generation sequencing technology, we sequenced at the genome-wide level three small RNA populations from the adult, protoscolex and cyst m...

  4. Challenges to genome sequence dissection in sweetpotato

    Science.gov (United States)

    Isobe, Sachiko; Shirasawa, Kenta; Hirakawa, Hideki

    2017-01-01

    The development of next generation sequencing (NGS) technologies has enabled the determination of whole genome sequences in many non-model plant species. However, genome sequencing in sweetpotato (Ipomoea batatas (L.) Lam) is still difficult because of the hexaploid genome structure. Previous studies suggested that a diploid wild relative, I. trifida (H.B.K.) Don., is the most likely ancestor of sweetpotato. Therefore, the genetic and genomic features of I. trifida have been studied as a potential reference for sweetpotato. Meanwhile, several research groups have begun the challenging task of directly sequencing the sweetpotato genome. In this manuscript, we review the recent results and activities of large-scale genome and transcriptome analysis related to genome sequence dissection in sweetpotato under the following sections: I. trifida genome and transcript sequencing, genome sequences of I. nil (Japanese morning glory), transcript sequences in sweetpotato, chloroplast sequences, transposable elements and transfer DNA. The recent international activities of de novo whole genome sequencing in sweetpotato are also described. The large-scale publicly available genome and transcript sequence resources and the international genome sequencing streams are expected to promote genome sequence dissection in sweetpotato. PMID:28465666

  5. The genome sequences of Cellulomonas fimi and "Cellvibrio gilvus" reveal the cellulolytic strategies of two facultative anaerobes, transfer of "Cellvibrio gilvus" to the genus Cellulomonas, and proposal of Cellulomonas gilvus sp. nov.

    Science.gov (United States)

    Christopherson, Melissa R; Suen, Garret; Bramhacharya, Shanti; Jewell, Kelsea A; Aylward, Frank O; Mead, David; Brumm, Phillip J

    2013-01-01

    Actinobacteria in the genus Cellulomonas are the only known and reported cellulolytic facultative anaerobes. To better understand the cellulolytic strategy employed by these bacteria, we sequenced the genome of the Cellulomonas fimi ATCC 484(T). For comparative purposes, we also sequenced the genome of the aerobic cellulolytic "Cellvibrio gilvus" ATCC 13127(T). An initial analysis of these genomes using phylogenetic and whole-genome comparison revealed that "Cellvibrio gilvus" belongs to the genus Cellulomonas. We thus propose to assign "Cellvibrio gilvus" to the genus Cellulomonas. A comparative genomics analysis between these two Cellulomonas genome sequences and the recently completed genome for Cellulomonas flavigena ATCC 482(T) showed that these cellulomonads do not encode cellulosomes but appear to degrade cellulose by secreting multi-domain glycoside hydrolases. Despite the minimal number of carbohydrate-active enzymes encoded by these genomes, as compared to other known cellulolytic organisms, these bacteria were found to be proficient at degrading and utilizing a diverse set of carbohydrates, including crystalline cellulose. Moreover, they also encode for proteins required for the fermentation of hexose and xylose sugars into products such as ethanol. Finally, we found relatively few significant differences between the predicted carbohydrate-active enzymes encoded by these Cellulomonas genomes, in contrast to previous studies reporting differences in physiological approaches for carbohydrate degradation. Our sequencing and analysis of these genomes sheds light onto the mechanism through which these facultative anaerobes degrade cellulose, suggesting that the sequenced cellulomonads use secreted, multidomain enzymes to degrade cellulose in a way that is distinct from known anaerobic cellulolytic strategies.

  6. The Genome Sequences of Cellulomonas fimi and “Cellvibrio gilvus” Reveal the Cellulolytic Strategies of Two Facultative Anaerobes, Transfer of “Cellvibrio gilvus” to the Genus Cellulomonas, and Proposal of Cellulomonas gilvus sp. nov

    Science.gov (United States)

    Bramhacharya, Shanti; Jewell, Kelsea A.; Aylward, Frank O.; Mead, David; Brumm, Phillip J.

    2013-01-01

    Actinobacteria in the genus Cellulomonas are the only known and reported cellulolytic facultative anaerobes. To better understand the cellulolytic strategy employed by these bacteria, we sequenced the genome of the Cellulomonas fimi ATCC 484T. For comparative purposes, we also sequenced the genome of the aerobic cellulolytic “Cellvibrio gilvus” ATCC 13127T. An initial analysis of these genomes using phylogenetic and whole-genome comparison revealed that “Cellvibrio gilvus” belongs to the genus Cellulomonas. We thus propose to assign “Cellvibrio gilvus” to the genus Cellulomonas. A comparative genomics analysis between these two Cellulomonas genome sequences and the recently completed genome for Cellulomonas flavigena ATCC 482T showed that these cellulomonads do not encode cellulosomes but appear to degrade cellulose by secreting multi-domain glycoside hydrolases. Despite the minimal number of carbohydrate-active enzymes encoded by these genomes, as compared to other known cellulolytic organisms, these bacteria were found to be proficient at degrading and utilizing a diverse set of carbohydrates, including crystalline cellulose. Moreover, they also encode for proteins required for the fermentation of hexose and xylose sugars into products such as ethanol. Finally, we found relatively few significant differences between the predicted carbohydrate-active enzymes encoded by these Cellulomonas genomes, in contrast to previous studies reporting differences in physiological approaches for carbohydrate degradation. Our sequencing and analysis of these genomes sheds light onto the mechanism through which these facultative anaerobes degrade cellulose, suggesting that the sequenced cellulomonads use secreted, multidomain enzymes to degrade cellulose in a way that is distinct from known anaerobic cellulolytic strategies. PMID:23342046

  7. The genome sequences of Cellulomonas fimi and "Cellvibrio gilvus" reveal the cellulolytic strategies of two facultative anaerobes, transfer of "Cellvibrio gilvus" to the genus Cellulomonas, and proposal of Cellulomonas gilvus sp. nov.

    Directory of Open Access Journals (Sweden)

    Melissa R Christopherson

    Actinobacteria in the genus Cellulomonas are the only known and reported cellulolytic facultative anaerobes. To better understand the cellulolytic strategy employed by these bacteria, we sequenced the genome of the Cellulomonas fimi ATCC 484(T). For comparative purposes, we also sequenced the genome of the aerobic cellulolytic "Cellvibrio gilvus" ATCC 13127(T). An initial analysis of these genomes using phylogenetic and whole-genome comparison revealed that "Cellvibrio gilvus" belongs to the genus Cellulomonas. We thus propose to assign "Cellvibrio gilvus" to the genus Cellulomonas. A comparative genomics analysis between these two Cellulomonas genome sequences and the recently completed genome for Cellulomonas flavigena ATCC 482(T) showed that these cellulomonads do not encode cellulosomes but appear to degrade cellulose by secreting multi-domain glycoside hydrolases. Despite the minimal number of carbohydrate-active enzymes encoded by these genomes, as compared to other known cellulolytic organisms, these bacteria were found to be proficient at degrading and utilizing a diverse set of carbohydrates, including crystalline cellulose. Moreover, they also encode for proteins required for the fermentation of hexose and xylose sugars into products such as ethanol. Finally, we found relatively few significant differences between the predicted carbohydrate-active enzymes encoded by these Cellulomonas genomes, in contrast to previous studies reporting differences in physiological approaches for carbohydrate degradation. Our sequencing and analysis of these genomes sheds light onto the mechanism through which these facultative anaerobes degrade cellulose, suggesting that the sequenced cellulomonads use secreted, multidomain enzymes to degrade cellulose in a way that is distinct from known anaerobic cellulolytic strategies.

  8. Whole genome sequencing and analysis of Campylobacter coli YH502 from retail chicken reveals a plasmid-borne type VI secretion system

    Science.gov (United States)

    Campylobacter is a major cause of foodborne illnesses worldwide. Campylobacter infections, commonly caused by ingestion of undercooked poultry and meat products, can lead to gastroenteritis and chronic reactive arthritis in humans. Whole genome sequencing (WGS) is a powerful technology that provides...

  9. Genome sequencing of the lizard parasite Leishmania tarentolae reveals loss of genes associated to the intracellular stage of human pathogenic species

    Science.gov (United States)

    Raymond, Frédéric; Boisvert, Sébastien; Roy, Gaétan; Ritt, Jean-François; Légaré, Danielle; Isnard, Amandine; Stanke, Mario; Olivier, Martin; Tremblay, Michel J.; Papadopoulou, Barbara; Ouellette, Marc; Corbeil, Jacques

    2012-01-01

    The Leishmania tarentolae Parrot-TarII strain genome sequence was resolved to an average 16-fold mean coverage by next-generation DNA sequencing technologies. This is the first non-pathogenic to humans kinetoplastid protozoan genome to be described thus providing an opportunity for comparison with the completed genomes of pathogenic Leishmania species. A high synteny was observed between all sequenced Leishmania species. A limited number of chromosomal regions diverged between L. tarentolae and L. infantum, while remaining syntenic to L. major. Globally, >90% of the L. tarentolae gene content was shared with the other Leishmania species. We identified 95 predicted coding sequences unique to L. tarentolae and 250 genes that were absent from L. tarentolae. Interestingly, many of the latter genes were expressed in the intracellular amastigote stage of pathogenic species. In addition, genes coding for products involved in antioxidant defence or participating in vesicular-mediated protein transport were underrepresented in L. tarentolae. In contrast to other Leishmania genomes, two gene families were expanded in L. tarentolae, namely the zinc metallo-peptidase surface glycoprotein GP63 and the promastigote surface antigen PSA31C. Overall, L. tarentolae's gene content appears better adapted to the promastigote insect stage rather than the amastigote mammalian stage. PMID:21998295

  10. Whole genome sequencing reveals an outbreak of Salmonella Enteritidis associated with reptile feeder mice in the United Kingdom, 2012-2015.

    Science.gov (United States)

    Kanagarajah, Sanch; Waldram, Alison; Dolan, Gayle; Jenkins, Claire; Ashton, Philip M; Carrion Martin, Antonio Isidro; Davies, Robert; Frost, Andrew; Dallman, Timothy J; De Pinna, Elizabeth M; Hawker, Jeremy I; Grant, Kathie A; Elson, Richard

    2018-05-01

    Analysis of whole genome sequencing data uncovered a previously undetected outbreak of Salmonella Enteritidis that had been on-going for four years. Cases were resident in all countries of the United Kingdom and 40% of the cases were aged less than 11 years old. Initial investigations revealed that 30% of cases reported exposure to pet snakes. A case-control study was designed to test the hypothesis that exposure to reptiles or their feed were risk factors. A robust case-definition, based on the single nucleotide polymorphism (SNP) profile, increased the power of the analytical study. Following univariable and multivariable analysis, exposure to snakes was the only variable independently associated with infection (odds ratio 810; 95% CI 85-7715; p < 0.001). Isolates of S. Enteritidis belonging to the outbreak profile were recovered from reptile feeder mice sampled at the retail and wholesale level. Control measures included improved public health messaging at point of sale, press releases and engagement with public health and veterinary counterparts across Europe. Mice destined to be fed to reptiles are not regarded as pet food and are not routinely tested for pathogenic bacteria. Routine microbiological testing to ensure feeder mice are free from Salmonella is recommended. Crown Copyright © 2017. Published by Elsevier Ltd. All rights reserved.

  11. Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats.

    Science.gov (United States)

    Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang

    2016-09-01

    Genomic information about Muscovy duck parvovirus (MDPV) is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITR) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strains FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based on the protein-coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion in the ITR may contribute to the genome evolution of MDPV under immunization pressure.
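
    The two spellings of the repeated motif quoted in the abstract above, "TTCCGGT" and "ACCGGAA", are reverse complements of each other, which is what one would expect for a motif read from either strand of a palindromic hairpin stem. A minimal sketch of that check follows; the helper name `reverse_complement` is an illustrative assumption, not something taken from the study.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (hypothetical helper)."""
    return seq.translate(_COMPLEMENT)[::-1]


# The motif as quoted in the abstract, read on the opposite strand.
print(reverse_complement("TTCCGGT"))  # prints ACCGGAA
```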

  12. Pig genome sequence - analysis and publication strategy

    DEFF Research Database (Denmark)

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol

    2010-01-01

    BACKGROUND: The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. RESULTS: Assemblies...... of the BAC clone derived genome sequence have been annotated using the Pre-Ensembl and Ensembl automated pipelines and made accessible through the Pre-Ensembl/Ensembl browsers. The current annotated genome assembly (Sscrofa9) was released with Ensembl 56 in September 2009. A revised assembly (Sscrofa10......) is under construction and will incorporate whole genome shotgun sequence (WGS) data providing > 30x genome coverage. The WGS sequence, most of which comprise short Illumina/Solexa reads, were generated from DNA from the same single Duroc sow as the source of the BAC library from which clones were...

  13. Complete genome sequence and comparative genomic analysis of Mycobacterium massiliense JCM 15300 in the Mycobacterium abscessus group reveal a conserved genomic island MmGI-1 related to putative lipid metabolism.

    Directory of Open Access Journals (Sweden)

    Tsuyoshi Sekizuka

    Members of the Mycobacterium abscessus group, such as M. massiliense, M. abscessus sensu stricto and M. bolletii, are environmental organisms found in soil, water and other ecological niches, and have been isolated from respiratory tract infections, skin and soft tissue infections, and postoperative infections following cosmetic surgery. To determine the unique genetic features of M. massiliense, we sequenced the complete genome of M. massiliense type strain JCM 15300 (corresponding to CCUG 48898). Comparative genomic analysis was performed among Mycobacterium spp. and among M. abscessus group subspp., showing that additional β-oxidation-related genes and, notably, the mammalian cell entry (mce) operon were located on a genomic island, M. massiliense Genomic Island 1 (MmGI-1), in M. massiliense. In addition, putative anaerobic respiration system-related genes and additional mycolic acid cyclopropane synthetase-related genes were found uniquely in M. massiliense. Japanese isolates of M. massiliense also frequently possess MmGI-1 (14/44, approximately 32%) and three unique conserved regions (26/44, approximately 60%; 34/44, approximately 77%; and 40/44, approximately 91%), as do isolates from other countries (Malaysia, France, United Kingdom and United States). The well-conserved genomic island MmGI-1 may play an important role in high growth potential with additional lipid metabolism, extra factors for survival in the environment or synthesis of complex membrane-associated lipids. ORFs on MmGI-1 showed similarities to ORFs of the phylogenetically distant M. avium complex (MAC), suggesting that horizontal gene transfer or genetic recombination events might have occurred within MmGI-1 among M. massiliense and MAC.

  14. Intraspecific sequence comparisons reveal similar rates of non-collinear gene insertion in the B and D genomes of bread wheat

    Czech Academy of Sciences Publication Activity Database

    Bartoš, Jan; Vlček, Čestmír; Choulet, F.; Džunková, Mária; Cviková, Kateřina; Šafář, Jan; Šimková, Hana; Pačes, Jan; Strnad, Hynek; Sourdille, P.; Berges, H.; Cattonaro, F.; Feuillet, C.; Doležel, Jaroslav

    2012-01-01

    Vol. 12, No. 155 (2012), pp. 1-10. ISSN 1471-2229. R&D Projects: GA ČR GAP501/10/1778. Grant - others: GA MŠk(CZ) ED0007/01/01. Program: ED. Institutional research plan: CEZ:AV0Z50380511; CEZ:AV0Z50520514. Keywords: Wheat * BAC sequencing * Homoeologous genomes. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 4.354, year: 2012

  15. Candidate genes and genetic architecture of symbiotic and agronomic traits revealed by whole-genome, sequence-based association genetics in Medicago truncatula.

    Directory of Open Access Journals (Sweden)

    John Stanton-Geddes

    Genome-wide association study (GWAS) has revolutionized the search for the genetic basis of complex traits. To date, GWAS have generally relied on relatively sparse sampling of nucleotide diversity, which is likely to bias results by preferentially sampling high-frequency SNPs not in complete linkage disequilibrium (LD) with causative SNPs. To avoid these limitations we conducted GWAS with >6 million SNPs identified by sequencing the genomes of 226 accessions of the model legume Medicago truncatula. We used these data to identify candidate genes and the genetic architecture underlying phenotypic variation in plant height, trichome density, flowering time, and nodulation. The characteristics of candidate SNPs differed among traits, with candidates for flowering time and trichome density in distinct clusters of high LD and the minor allele frequencies (MAF) of candidates underlying variation in flowering time and height significantly greater than the MAF of candidates underlying variation in other traits. Candidate SNPs tagged several characterized genes, including the nodulation-related genes SERK2, MtnodGRP3, MtMMPL1, NFP, CaML3 and MtnodGRP3A and the flowering time gene MtFD, as well as uncharacterized genes that become candidates for further molecular characterization. By comparing sequence-based candidates to candidates identified by in silico 250K SNP arrays, we provide an empirical example of how reliance on even high-density reduced-representation genomic markers can bias GWAS results. Depending on the trait, only 30-70% of the top 20 in silico array candidates were within 1 kb of sequence-based candidates. Moreover, the sequence-based candidates tagged by array candidates were heavily biased towards common variants; these comparisons underscore the need for caution when interpreting results from GWAS conducted with sparsely covered genomes.
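The "within 1 kb" comparison described above can be illustrated with a small sketch. The positions below are invented for illustration and are not data from the study. The function name `fraction_within` and the single-chromosome simplification are likewise assumptions; a real analysis would compare positions per chromosome.

```python
from bisect import bisect_left


def fraction_within(array_positions, sequence_positions, window=1000):
    """Fraction of array candidates lying within `window` bp of any
    sequence-based candidate (all positions on one chromosome)."""
    if not array_positions:
        return 0.0
    seq_sorted = sorted(sequence_positions)
    hits = 0
    for pos in array_positions:
        # Only the sorted neighbours on either side can be the nearest one.
        i = bisect_left(seq_sorted, pos)
        neighbours = seq_sorted[max(i - 1, 0):i + 1]
        if any(abs(pos - q) <= window for q in neighbours):
            hits += 1
    return hits / len(array_positions)


# Hypothetical candidate positions (bp), for illustration only.
array_candidates = [1000, 5000, 20000, 50000]
sequence_candidates = [1500, 4200, 30000]

overlap = fraction_within(array_candidates, sequence_candidates)
print(f"{overlap:.0%} of array candidates are within 1 kb")
```

With these made-up positions, two of the four array candidates fall within 1 kb of a sequence-based candidate.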

  16. Analysis of the human cytomegalovirus genomic region from UL146 through UL147A reveals sequence hypervariability, genotypic stability, and overlapping transcripts

    Directory of Open Access Journals (Sweden)

    Huang Diana D

    2006-01-01

    Background: Although the sequence of the human cytomegalovirus (HCMV) genome is generally conserved among unrelated clinical strains, some open reading frames (ORFs) are highly variable. UL146 and UL147, which encode CXC chemokine homologues, are among these variable ORFs. Results: The region of the HCMV genome from UL146 through UL147A was analyzed in clinical strains for sequence variability, genotypic stability, and transcriptional expression. The UL146 sequences in clinical strains from two geographically distant sites were assigned to 12 sequence groups that differ by over 60% at the amino acid level. The same groups were generated by sequences from the UL146-UL147 intergenic region and the UL147 ORF. In contrast to the high level of sequence variability among unrelated clinical strains, the sequences of UL146 through UL147A from isolates of the same strain were highly stable after repeated passage both in vitro and in vivo. Riboprobes homologous to these ORFs detected multiple overlapping transcripts differing in temporal expression. UL146 sequences are present only on the largest transcript, which also contains all of the downstream ORFs including UL148 and UL132. The sizes and hybridization patterns of the transcripts are consistent with a common 3'-terminus downstream of the UL132 ORF. Early-late expression of the transcripts associated with UL146 and UL147 is compatible with the potential role of CXC chemokines in pathogenesis associated with viral replication. Conclusion: Clinical isolates from two different geographic sites cluster in the same groups based on the hypervariability of the UL146, UL147, or the intergenic sequences, which provides strong evidence for linkage and no evidence for interstrain recombination within this region. The sequence of individual strains was absolutely stable in vitro and in vivo, which indicates that sequence drift is not a mechanism for the observed sequence hypervariability. There is also no

  17. Whole Genome Sequencing Reveals Novel Non-Synonymous Mutation in Ectodysplasin A (EDA) Associated with Non-Syndromic X-Linked Dominant Congenital Tooth Agenesis

    Science.gov (United States)

    Sarkar, Tanmoy; Bansal, Rajesh; Das, Parimal

    2014-01-01

    Congenital tooth agenesis in humans is characterized by failure of tooth development during tooth organogenesis. So far, 300 genes in mouse and 30 genes in human are known to regulate tooth development. However, only five genes, viz. PAX9, MSX1, AXIN2, WNT10A and EDA, have been experimentally established as candidates for congenitally missing teeth such as hypodontia and oligodontia. In this study an Indian family with multiple congenital tooth agenesis was identified. The pattern of inheritance was apparently autosomal dominant, with a rare possibility of being X-linked. Whole genome sequencing of two affected individuals was carried out, which revealed 119 novel non-synonymous single nucleotide variations (SNVs) distributed among 117 genes. Of these, only one variation (c.956G>T), located in exon 9 of the X-linked EDA gene, was considered pathogenic and was validated among all the affected and unaffected family members and unrelated controls. This variation leads to a p.Ser319Ile change in the TNF homology domain of the EDA (transcript variant 1) protein. In silico analysis predicts that Ser319 is well conserved across different vertebrate species and is part of a putative receptor binding site. Structure-based homology modeling predicts that this amino acid residue, together with four nearby residues that are known to cause selective tooth agenesis when mutated, forms a cluster that may have functional significance. Taken together, these results suggest that the c.956G>T (p.Ser319Ile) mutation plausibly reduces the receptor binding activity of EDA, leading to distinct tooth agenesis in this family. PMID:25203534
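The correspondence between c.956G>T and p.Ser319Ile can be checked with simple codon arithmetic. The sketch below assumes standard coding-DNA numbering (c.1 is the A of the ATG start codon) and the standard genetic code. It does not use the actual EDA reference sequence, which the abstract does not give; it only lists which serine codons are compatible with the reported change. The function names are illustrative.

```python
def codon_of(cds_pos):
    """Return (codon number, position within the codon), both 1-based,
    for a 1-based coding-sequence position."""
    index = cds_pos - 1
    return index // 3 + 1, index % 3 + 1


# Only the standard-genetic-code entries needed for this check.
SER_CODONS = ("TCT", "TCC", "TCA", "TCG", "AGT", "AGC")
ILE_CODONS = {"ATT", "ATC", "ATA"}


def compatible_ser_codons(offset, ref, alt):
    """Serine codons that carry `ref` at `offset` and encode isoleucine
    once that base is replaced by `alt`."""
    hits = []
    for codon in SER_CODONS:
        if codon[offset - 1] != ref:
            continue
        mutated = codon[:offset - 1] + alt + codon[offset:]
        if mutated in ILE_CODONS:
            hits.append(codon)
    return hits


codon_number, offset = codon_of(956)
candidates = compatible_ser_codons(offset, "G", "T")
print(codon_number, offset, candidates)
```

Position 956 is the second base of codon 319, and a G-to-T change there turns the serine codons AGT or AGC into the isoleucine codons ATT or ATC, consistent with the reported p.Ser319Ile.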

  18. The mitochondrial genome sequence of the ciliate Paramecium caudatum reveals a shift in nucleotide composition and codon usage within the genus Paramecium

    Directory of Open Access Journals (Sweden)

    Berendonk Thomas U

    2011-05-01

    Background: Despite the fact that the organization of the ciliate mitochondrial genome is exceptional, only a few ciliate mitochondrial genomes have been sequenced to date. All ciliate mitochondrial genomes are linear. They are 40 kb to 47 kb long and contain some 50 tightly packed genes without introns. Earlier studies documented that the mitochondrial guanine + cytosine contents are very different between Paramecium tetraurelia and all studied Tetrahymena species. This raises the question of whether the high mitochondrial G+C content observed in P. tetraurelia is a characteristic property of Paramecium mtDNA, or whether it is an exception among the ciliate mitochondrial genomes known so far. To address this question, we determined the mitochondrial genome sequence of Paramecium caudatum and compared the gene content and sequence properties to the closely related P. tetraurelia. Results: The guanine + cytosine content of the P. caudatum mitochondrial genome was significantly lower than that of P. tetraurelia (22.4% vs. 41.2%). This difference in the mitochondrial nucleotide composition was accompanied by significantly different codon usage patterns in both species, i.e. within P. caudatum clearly A/T ending codons dominated, whereas for P. tetraurelia the synonymous codons were more balanced with a higher number of G/C ending codons. Further analyses indicated that the nucleotide composition of most members of the genus Paramecium resembles that of P. caudatum and that the shift observed in P. tetraurelia is restricted to the P. aurelia species complex. Conclusions: Surprisingly, the codon usage bias in the P. caudatum mitochondrial genome, exemplified by the effective number of codons, is more similar to the distantly related T. pyriformis and other single-celled eukaryotes such as Chlamydomonas, than to the closely related P. tetraurelia. These differences in base composition and codon usage bias were, however, not reflected in the amino

  19. The complete chloroplast genome sequence of the chlorophycean green alga Scenedesmus obliquus reveals a compact gene organization and a biased distribution of genes on the two DNA strands

    Directory of Open Access Journals (Sweden)

    Lemieux Claude

    2006-04-01

    Background: The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. While the basal position of the Prasinophyceae is well established, the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae (UTC) remains uncertain. The five complete chloroplast DNA (cpDNA) sequences currently available for representatives of these classes display considerable variability in overall structure, gene content, gene density, intron content and gene order. Among these genomes, that of the chlorophycean green alga Chlamydomonas reinhardtii has retained the least ancestral features. The two single-copy regions, which are separated from one another by the large inverted repeat (IR), have similar sizes, rather than unequal sizes, and differ radically in both gene contents and gene organizations relative to the single-copy regions of prasinophyte and ulvophyte cpDNAs. To gain insights into the various changes that the chloroplast genome underwent during the evolution of chlorophycean green algae, we have sequenced the cpDNA of Scenedesmus obliquus, a member of a distinct chlorophycean lineage. Results: The 161,452 bp IR-containing genome of Scenedesmus features single-copy regions of similar sizes, encodes 96 genes, i.e. only two additional genes (infA and rpl12) relative to its Chlamydomonas homologue, and contains seven group I and two group II introns. It is clearly more compact than the four UTC algal cpDNAs that have been examined so far, displays the lowest proportion of short repeats among these algae and shows a stronger bias in clustering of genes on the same DNA strand compared to Chlamydomonas cpDNA. Like the latter genome, Scenedesmus cpDNA displays only a few ancestral gene clusters. The two chlorophycean genomes share 11 gene clusters that are not found in previously sequenced trebouxiophyte and ulvophyte cpDNAs as well as a few genes that have an unusual structure; however, their single

  20. Elucidating population histories using genomic DNA sequences.

    Science.gov (United States)

    Vigilant, Linda

    2009-04-01

    In 1993, Cliff Jolly suggested that rather than debating species definitions and classifications, energy would be better spent investigating multidimensional patterns of variation and gene flow among populations. Until now, however, genetic studies of wild primate populations have been limited to very small portions of the genome. Access to complete genome sequences of humans, chimpanzees, macaques, and other primates makes it possible to design studies surveying substantial amounts of DNA sequence variation at multiple genetic loci in representatives of closely related but distinct wild primate populations. Such data can be analyzed with new approaches that estimate not only when populations diverged but also the relative amounts and directions of subsequent gene flow. These analyses will reemphasize the difficulty of achieving consistent species and subspecies definitions by revealing the extent of variation in the amount and duration of gene flow accompanying population divergences.

  1. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Advancement in high-throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled "intractable", resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC-rich secondary structures, making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  2. Genome-wide allele-specific expression analysis using Massively Parallel Signature Sequencing (MPSS) reveals cis- and trans-effects on gene expression in maize hybrid meristem tissue.

    Science.gov (United States)

    Guo, Mei; Yang, Sean; Rupe, Mary; Hu, Bin; Bickel, David R; Arthur, Lane; Smith, Oscar

    2008-03-01

    Allelic differences in expression are important genetic factors contributing to quantitative trait variation in various organisms. However, the extent of genome-wide allele-specific expression by different modes of gene regulation has not been well characterized in plants. In this study we developed a new methodology for allele-specific expression analysis by applying Massively Parallel Signature Sequencing (MPSS), an open ended and sequencing based mRNA profiling technology. This methodology enabled a genome-wide evaluation of cis- and trans-effects on allelic expression in six meristem stages of the maize hybrid. Summarization of data from nearly 400 pairs of MPSS allelic signature tags showed that 60% of the genes in the hybrid meristems exhibited differential allelic expression. Because both alleles are subjected to the same trans-acting factors in the hybrid, the data suggest the abundance of cis-regulatory differences in the genome. Comparing the same allele expressed in the hybrid versus its inbred parents showed that 40% of the genes were differentially expressed, suggesting different trans-acting effects present in different genotypes. Such trans-acting effects may result in gene expression in the hybrid different from allelic additive expression. With this approach we quantified gene expression in the hybrid relative to its inbred parents at the allele-specific level. As compared to measuring total transcript levels, this study provides a new level of understanding of different modes of gene regulation in the hybrid and the molecular basis of heterosis.

  3. The complete genome sequencing of Prevotella intermedia strain OMA14 and a subsequent fine-scale, intra-species genomic comparison reveal an unusual amplification of conjugative and mobile transposons and identify a novel Prevotella-lineage-specific repeat.

    Science.gov (United States)

    Naito, Mariko; Ogura, Yoshitoshi; Itoh, Takehiko; Shoji, Mikio; Okamoto, Masaaki; Hayashi, Tetsuya; Nakayama, Koji

    2016-02-01

    Prevotella intermedia is a pathogenic bacterium involved in periodontal diseases. Here, we present the complete genome sequence of a clinical strain, OMA14, of this bacterium along with the results of comparative genome analysis with strain 17 of the same species, whose genome has also been sequenced but not yet fully analysed. The genomes of both strains consist of two circular chromosomes: the larger chromosomes are similar in size and exhibit a high overall linearity of gene organization, whereas the smaller chromosomes show a significant size variation and have undergone remarkable genome rearrangements. Unique features of the Pre. intermedia genomes are the presence of a remarkable number of essential genes on the second chromosomes and the abundance of conjugative and mobilizable transposons (CTns and MTns). The CTns/MTns are particularly abundant in the second chromosomes, are involved in their extensive genome rearrangement, and have introduced a number of strain-specific genes into each strain. We also found a novel 188-bp repeat sequence that has been highly amplified in Pre. intermedia and is specifically distributed among the Pre. intermedia-related species. These findings expand our understanding of the genetic features of Pre. intermedia and the roles of CTns and MTns in the evolution of bacteria. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  4. Snake Genome Sequencing: Results and Future Prospects

    Directory of Open Access Journals (Sweden)

    Harald M. I. Kerkkamp

    2016-12-01

    Snake genome sequencing is in its infancy, very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  5. Snake Genome Sequencing: Results and Future Prospects.

    Science.gov (United States)

    Kerkkamp, Harald M I; Kini, R Manjunatha; Pospelov, Alexey S; Vonk, Freek J; Henkel, Christiaan V; Richardson, Michael K

    2016-12-01

    Snake genome sequencing is in its infancy, very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  6. Comparison of complete genome sequences of dog rabies viruses isolated from China and Mexico reveals key amino acid changes that may be associated with virus replication and virulence.

    Science.gov (United States)

    Yu, Fulai; Zhang, Guoqing; Zhong, Xiangfu; Han, Na; Song, Yunfeng; Zhao, Ling; Cui, Min; Rayner, Simon; Fu, Zhen F

    2014-07-01

    Rabies is a global problem, but its impact and prevalence vary across different regions. In some areas, such as parts of Africa and Asia, the virus is prevalent in the domestic dog population, leading to epidemic waves and large numbers of human fatalities. In other regions, such as the Americas, the virus predominates in wildlife and bat populations, with sporadic spillover into domestic animals. In this work, we investigated whether these distinct environments led to selective pressures that result in measurable changes within the genome at the amino acid level. To this end, we collected and sequenced the full genomes of two isolates from divergent environments. The first isolate (DRV-AH08) was from China, where the virus is present in the dog population and the country is experiencing a serious epidemic. The second isolate (DRV-Mexico) was taken from Mexico, where the virus is present in both wildlife and domestic dog populations, but at low levels as a consequence of an effective vaccination program. We then combined and compared these with other full genome sequences to identify distinct amino acid changes that might be associated with environment. Phylogenetic analysis identified strain DRV-AH08 as belonging to the China-I lineage, which has emerged to become the dominant lineage in the current epidemic. The Mexico strain was placed in the D11 Mexico lineage, associated with the West USA-Mexico border clade. Amino acid sequence analysis identified only 17 amino acid differences in the N, G and L proteins. These differences may be associated with virus replication and virulence, for example the short incubation period observed in the current epidemic in China.

  7. Experimental evolution, genetic analysis and genome re-sequencing reveal the mutation conferring artemisinin resistance in an isogenic lineage of malaria parasites

    KAUST Repository

    Hunt, Paul

    2010-09-16

    Background: Classical and quantitative linkage analyses of genetic crosses have traditionally been used to map genes of interest, such as those conferring chloroquine or quinine resistance in malaria parasites. Next-generation sequencing technologies now present the possibility of determining genome-wide genetic variation at single base-pair resolution. Here, we combine in vivo experimental evolution, a rapid genetic strategy and whole genome re-sequencing to identify the precise genetic basis of artemisinin resistance in a lineage of the rodent malaria parasite, Plasmodium chabaudi. Such genetic markers will further the investigation of resistance and its control in natural infections of the human malaria, P. falciparum. Results: A lineage of isogenic in vivo drug-selected mutant P. chabaudi parasites was investigated. By measuring the artemisinin responses of these clones, the appearance of an in vivo artemisinin resistance phenotype within the lineage was defined. The underlying genetic locus was mapped to a region of chromosome 2 by Linkage Group Selection in two different genetic crosses. Whole-genome deep coverage short-read re-sequencing (Illumina Solexa) defined the point mutations, insertions, deletions and copy-number variations arising in the lineage. Eight point mutations arise within the mutant lineage, only one of which appears on chromosome 2. This missense mutation arises contemporaneously with artemisinin resistance and maps to a gene encoding a de-ubiquitinating enzyme. Conclusions: This integrated approach facilitates the rapid identification of mutations conferring selectable phenotypes, without prior knowledge of biological and molecular mechanisms. For malaria, this model can identify candidate genes before resistant parasites are commonly observed in natural human malaria populations. © 2010 Hunt et al; licensee BioMed Central Ltd.

  8. Experimental evolution, genetic analysis and genome re-sequencing reveal the mutation conferring artemisinin resistance in an isogenic lineage of malaria parasites

    Directory of Open Access Journals (Sweden)

    Hunt Paul

    2010-09-01

    Background: Classical and quantitative linkage analyses of genetic crosses have traditionally been used to map genes of interest, such as those conferring chloroquine or quinine resistance in malaria parasites. Next-generation sequencing technologies now present the possibility of determining genome-wide genetic variation at single base-pair resolution. Here, we combine in vivo experimental evolution, a rapid genetic strategy and whole genome re-sequencing to identify the precise genetic basis of artemisinin resistance in a lineage of the rodent malaria parasite, Plasmodium chabaudi. Such genetic markers will further the investigation of resistance and its control in natural infections of the human malaria, P. falciparum. Results: A lineage of isogenic in vivo drug-selected mutant P. chabaudi parasites was investigated. By measuring the artemisinin responses of these clones, the appearance of an in vivo artemisinin resistance phenotype within the lineage was defined. The underlying genetic locus was mapped to a region of chromosome 2 by Linkage Group Selection in two different genetic crosses. Whole-genome deep coverage short-read re-sequencing (Illumina® Solexa) defined the point mutations, insertions, deletions and copy-number variations arising in the lineage. Eight point mutations arise within the mutant lineage, only one of which appears on chromosome 2. This missense mutation arises contemporaneously with artemisinin resistance and maps to a gene encoding a de-ubiquitinating enzyme. Conclusions: This integrated approach facilitates the rapid identification of mutations conferring selectable phenotypes, without prior knowledge of biological and molecular mechanisms. For malaria, this model can identify candidate genes before resistant parasites are commonly observed in natural human malaria populations.

  9. Inconsistencies of genome annotations in apicomplexan parasites revealed by 5'-end-one-pass and full-length sequences of oligo-capped cDNAs

    Directory of Open Access Journals (Sweden)

    Sugano Sumio

    2009-07-01

    Background: Apicomplexan parasites are causative agents of various diseases including malaria and have been targets of extensive genomic sequencing. We generated 5'-EST collections for six apicomplexan parasites using our full-length oligo-capping cDNA library method. To improve upon the current genome annotations, as well as to validate the importance of physical cDNA clone resources, we generated a large-scale collection of full-length cDNAs for several apicomplexan parasites. Results: In this study, we used a total of 61,056 5'-end-single-pass cDNA sequences from Plasmodium falciparum, P. vivax, P. yoelii, P. berghei, Cryptosporidium parvum, and Toxoplasma gondii. We compared these partially sequenced cDNA sequences with the currently annotated gene models and observed significant inconsistencies between the two datasets. In particular, we found that on average 14% of the exons in the current gene models were not supported by any cDNA evidence, and that 16% of the current gene models may contain at least one mis-annotation and should be re-evaluated. We also identified a large number of transcripts that had been previously unidentified. For 732 cDNAs in T. gondii, the entire sequences were determined in order to evaluate the annotated gene models at the complete full-length transcript level. We found that 41% of the T. gondii gene models contained at least one inconsistency. We also identified and confirmed by RT-PCR 140 previously unidentified transcripts found in the intergenic regions of the current gene annotations. We show that the majority of these discrepancies are due to questionable predictions of one or two extra exons in the upstream or downstream regions of the genes. Conclusion: Our data indicate that the current gene models are likely to still be incomplete and have much room for improvement. Our unique full-length cDNA information is especially useful for further refinement of the annotations for the genomes of

  10. Whole genome sequencing of field isolates reveals a common duplication of the Duffy binding protein gene in Malagasy Plasmodium vivax strains.

    Directory of Open Access Journals (Sweden)

    Didier Menard

    2013-11-01

    Plasmodium vivax is the most prevalent human malaria parasite, causing serious public health problems in malaria-endemic countries. Until recently the Duffy-negative blood group phenotype was considered to confer resistance to vivax malaria for most African ethnicities. We and others have reported that P. vivax strains in African countries from Madagascar to Mauritania display capacity to cause clinical vivax malaria in Duffy-negative people. New insights must now explain Duffy-independent P. vivax invasion of human erythrocytes. Through recent whole genome sequencing we obtained ≥ 70× coverage of the P. vivax genome from five field isolates, resulting in ≥ 93% of the Sal I reference sequenced at coverage greater than 20×. Combined with sequences from one additional Malagasy field isolate and from five monkey-adapted strains, we describe here identification of DNA sequence rearrangements in the P. vivax genome, including discovery of a duplication of the P. vivax Duffy binding protein (PvDBP) gene. A survey of Malagasy patients infected with P. vivax showed that the PvDBP duplication was present in numerous locations in Madagascar and found in over 50% of infected patients evaluated. Extended geographic surveys showed that the PvDBP duplication was detected frequently in vivax patients living in East Africa and in some residents of non-African P. vivax-endemic countries. Additionally, the PvDBP duplication was observed in travelers seeking treatment of vivax malaria upon returning home. PvDBP duplication prevalence was highest in west-central Madagascar sites where the highest frequencies of P. vivax-infected, Duffy-negative people were reported. The highly conserved nature of the sequence involved in the PvDBP duplication suggests that it has occurred in a recent evolutionary time frame. These data suggest that PvDBP, a merozoite surface protein involved in red cell adhesion, is rapidly evolving, possibly in response to constraints imposed by

  11. Salmo salar and Esox lucius full-length cDNA sequences reveal changes in evolutionary pressures on a post-tetraploidization genome

    Directory of Open Access Journals (Sweden)

    Holt Robert A

    2010-04-01

    Full Text Available Abstract Background: Salmonids are among the most intensely studied fish, in part due to their economic and environmental importance, and in part due to a recent whole genome duplication in the common ancestor of salmonids. This duplication greatly impacts species diversification, functional specialization, and adaptation. Extensive new genomic resources have recently become available for Atlantic salmon (Salmo salar), but documentation of allelic versus duplicate reference genes remains a major uncertainty in the complete characterization of its genome and its evolution. Results: From existing expressed sequence tag (EST) resources and three new full-length cDNA libraries, 9,057 reference quality full-length gene insert clones were identified for Atlantic salmon. A further 1,365 reference full-length clones were annotated from 29,221 northern pike (Esox lucius) ESTs. Pairwise dN/dS comparisons within each of 408 sets of duplicated salmon genes using northern pike as a diploid out-group show asymmetric relaxation of selection on salmon duplicates. Conclusions: 9,057 full-length reference genes were characterized in S. salar and can be used to identify alleles and gene family members. Comparisons of duplicated genes show that while purifying selection is the predominant force acting on both duplicates, consistent with retention of functionality in both copies, some relaxation of pressure on gene duplicates can be identified. In addition, there is evidence that evolution has acted asymmetrically on paralogs, allowing one of the pair to diverge at a faster rate.

  12. A verification protocol for the probe sequences of Affymetrix genome arrays reveals high probe accuracy for studies in mouse, human and rat

    Directory of Open Access Journals (Sweden)

    Nap Jan-Peter

    2007-04-01

    Full Text Available Abstract Background: The Affymetrix GeneChip technology uses multiple probes per gene to measure its expression level. Individual probe signals can vary widely, which hampers proper interpretation. This variation can be caused by probes that do not properly match their target gene or that match multiple genes. To determine the accuracy of Affymetrix arrays, we developed an extensive verification protocol for mouse arrays, incorporating the NCBI RefSeq, NCBI UniGene Unique, NIA Mouse Gene Index, and UCSC mouse genome databases. Results: Applying this protocol to Affymetrix Mouse Genome arrays (the earlier U74Av2 and the newer 430 2.0 array), the number of sequence-verified probes with perfect matches was no less than 85% and 95%, respectively; and for 74% and 85% of the probe sets all probes were sequence verified. The latter percentages increased to 80% and 94% after discarding one or two unverifiable probes per probe set, and even further to 84% and 97% when, in addition, allowing for one or two mismatches between probe and target gene. Similar results were obtained for other mouse arrays, as well as for human and rat arrays. Based on these data, refined chip definition files for all arrays are provided online. Researchers can choose the version appropriate for their study to (re)analyze expression data. Conclusion: The accuracy of Affymetrix probe sequences is higher than previously reported, particularly on newer arrays. Yet, refined probe set definitions have clear effects on the detection of differentially expressed genes. We demonstrate that the interpretation of the results of Affymetrix arrays is improved when the new chip definition files are used.

  13. Sequencing skippy: the genome sequence of an Australian kangaroo, Macropus eugenii

    Science.gov (United States)

    2011-01-01

    Sequencing of the tammar wallaby (Macropus eugenii) reveals insights into genome evolution, and mammalian reproduction and development. See research article: http://genomebiology.com/2011/12/8/R81 PMID:21861852

  14. Whole-Genome Sequencing of Drug-Resistant Salmonella enterica Isolates from Dairy Cattle and Humans in New York and Washington States Reveals Source and Geographic Associations.

    Science.gov (United States)

    Carroll, Laura M; Wiedmann, Martin; den Bakker, Henk; Siler, Julie; Warchocki, Steven; Kent, David; Lyalina, Svetlana; Davis, Margaret; Sischo, William; Besser, Thomas; Warnick, Lorin D; Pereira, Richard V

    2017-06-15

    Multidrug-resistant (MDR) Salmonella enterica can be spread from cattle to humans through direct contact with animals shedding Salmonella as well as through the food chain, making MDR Salmonella a serious threat to human health. The objective of this study was to use whole-genome sequencing to compare antimicrobial-resistant (AMR) Salmonella enterica serovars Typhimurium, Newport, and Dublin isolated from dairy cattle and humans in Washington State and New York State at the genotypic and phenotypic levels. A total of 90 isolates were selected for the study (37 S. Typhimurium, 32 S. Newport, and 21 S. Dublin isolates). All isolates were tested for phenotypic antibiotic resistance to 12 drugs using Kirby-Bauer disk diffusion. AMR genes were detected in the assembled genome of each isolate using nucleotide BLAST and ARG-ANNOT. Genotypic prediction of phenotypic resistance resulted in a mean sensitivity of 97.2 and specificity of 85.2. Sulfamethoxazole-trimethoprim resistance was observed only in human isolates ( P enterica in humans and farm animals in different regions. IMPORTANCE The use of antibiotics in food-producing animals has been hypothesized to select for AMR Salmonella enterica and associated AMR determinants, which can be transferred to humans through different routes. Previous studies have sought to assess the degree to which AMR livestock- and human-associated Salmonella strains overlap, as well as the spatial distribution of Salmonella's associated AMR determinants, but have often been limited by the degree of resolution at which isolates can be compared. Here, a comparative genomics study of livestock- and human-associated Salmonella strains from different regions of the United States shows that while many AMR genes and phenotypes were confined to human isolates, overlaps between the resistomes of bovine and human-associated Salmonella isolates were observed on numerous occasions, particularly for S. Newport. We have also shown that whole-genome
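
    The sensitivity and specificity quoted in the record above compare genotype-based predictions of resistance against the disk-diffusion phenotype. A minimal sketch of how such figures are typically computed from a confusion matrix follows; the function name and all counts are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as percentages.

    tp: resistant by phenotype and predicted resistant by genotype
    fn: resistant by phenotype but predicted susceptible
    tn: susceptible by phenotype and predicted susceptible
    fp: susceptible by phenotype but predicted resistant
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for a single drug (illustration only, not study data).
sens, spec = sensitivity_specificity(tp=18, fn=2, tn=60, fp=10)
print(f"sensitivity = {sens:.1f}%, specificity = {spec:.1f}%")
```

    A mean over several drugs, as reported in the abstract, would presumably average such per-drug values; that averaging step is an assumption here.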

  15. Complete sequence of the mitochondrial genome of ...

    Indian Academy of Sciences (India)

    Supplementary data: Complete sequence of the mitochondrial genome of Odontamblyopus rubicundus (Perciformes: Gobiidae): genome characterization and phylogenetic analysis. Tianxing Liu, Xiaoxiao Jin, Rixin Wang and Tianjun Xu. J. Genet. 92, 423–432. Figure 1. Gene map of O. rubicundus mitochondrial genome.

  16. Towards a reference pecan genome sequence

    Science.gov (United States)

    The cost of generating DNA sequence data has declined dramatically over the previous 15 years as a result of the Human Genome Project and the potential applications of genome sequencing for human medicine. This cost reduction has generated renewed interest among crop breeding scientists in applying...

  17. Genome-wide sequencing of small RNAs reveals a tissue-specific loss of conserved microRNA families in Echinococcus granulosus.

    Science.gov (United States)

    Bai, Yun; Zhang, Zhuangzhi; Jin, Lei; Kang, Hui; Zhu, Yongqiang; Zhang, Lu; Li, Xia; Ma, Fengshou; Zhao, Li; Shi, Baoxin; Li, Jun; McManus, Donald P; Zhang, Wenbao; Wang, Shengyue

    2014-08-29

    MicroRNAs (miRNAs) are important post-transcriptional regulators which control growth and development in eukaryotes. The cestode Echinococcus granulosus has a complex life-cycle involving different development stages but the mechanisms underpinning this development, including the involvement of miRNAs, remain unknown. Using Illumina next generation sequencing technology, we sequenced at the genome-wide level three small RNA populations from the adult, protoscolex and cyst membrane of E. granulosus. A total of 94 pre-miRNA candidates (coding 91 mature miRNAs and 39 miRNA stars) were in silico predicted. Through comparison of expression profiles, we found 42 mature miRNAs and 23 miRNA stars expressed with different patterns in the three life stages examined. Furthermore, considering both the previously reported and newly predicted miRNAs, 25 conserved miRNA families were identified in the E. granulosus genome. Comparing the presence or absence of these miRNA families with the free-living Schmidtea mediterranea, we found 13 conserved miRNAs are lost in E. granulosus, most of which are tissue-specific and involved in the development of ciliated cells, the gut and sensory organs. Finally, GO enrichment analysis of the differentially expressed miRNAs and their potential targets indicated that they may be involved in bi-directional development, nutrient metabolism and nervous system development in E. granulosus. This study has, for the first time, provided a comprehensive description of the different expression patterns of miRNAs in three distinct life cycle stages of E. granulosus. The analysis supports earlier suggestions that the loss of miRNAs in the Platyhelminths might be related to morphological simplification. These results may help in the exploration of the mechanism of interaction between this parasitic worm and its definitive and intermediate hosts, providing information that can be used to develop new interventions and therapeutics for the control of cystic

  18. Value of a newly sequenced bacterial genome

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Aburjaile, Flavia F; Ramos, Rommel Tj

    2014-01-01

    and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses...... the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information....

  19. Human Genome Sequencing in Health and Disease

    Science.gov (United States)

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  20. Genomic Investigation Reveals Highly Conserved, Mosaic, Recombination Events Associated with Capsular Switching among Invasive Neisseria meningitidis Serogroup W Sequence Type (ST)-11 Strains.

    Science.gov (United States)

    Mustapha, Mustapha M; Marsh, Jane W; Krauland, Mary G; Fernandez, Jorge O; de Lemos, Ana Paula S; Dunning Hotopp, Julie C; Wang, Xin; Mayer, Leonard W; Lawrence, Jeffrey G; Hiller, N Luisa; Harrison, Lee H

    2016-07-03

    Neisseria meningitidis is an important cause of meningococcal disease globally. Sequence type (ST)-11 clonal complex (cc11) is a hypervirulent meningococcal lineage historically associated with serogroup C capsule and is believed to have acquired the W capsule through a C to W capsular switching event. We studied the sequence of the capsule gene cluster (cps) and adjoining genomic regions of 524 invasive W cc11 strains isolated globally. We identified recombination breakpoints corresponding to two distinct recombination events within W cc11: an 8.4-kb recombinant region likely acquired from W cc22, including the sialic acid/glycosyl-transferase gene csw, resulted in a C→W change in capsular phenotype, and a 13.7-kb recombinant segment likely acquired from the Y cc23 lineage includes 4.5 kb of cps genes and 8.2 kb downstream of the cps cluster, resulting in allelic changes in capsule translocation genes. A vast majority of W cc11 strains (497/524, 94.8%) retain both recombination events as evidenced by sharing identical or very closely related capsular allelic profiles. These data suggest that the W cc11 capsular switch involved two separate recombination events and that current global W cc11 meningococcal disease is caused by strains bearing this mosaic capsular switch. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  1. Whole-genome sequencing of mutants with increased resistance against the two-peptide bacteriocin plantaricin JK reveals a putative receptor and potential docking site.

    Science.gov (United States)

    Ekblad, Bie; Nissen-Meyer, Jon; Kristensen, Tom

    2017-01-01

    By whole-genome sequencing of resistant mutants, a putative receptor for plantaricin JK, a two-peptide bacteriocin produced by some Lactobacillus plantarum strains, was identified in Lactobacillus plantarum NCFB 965 and Weissella viridescens NCFB 1655. The receptors of the two species had 66% identical amino acid sequences and belong to the amino acid-polyamine-organocation (APC) transporter protein family. The resistant mutants contained point mutations in the protein-encoding gene resulting in either premature stop codons, leading to truncated versions of the protein, or single amino acid substitutions. The secondary structure of the W. viridescens protein was predicted to contain 12 transmembrane (TM) helices, a core structure shared by most members of the APC protein family. The single amino acid substitutions that resulted in resistant strains were located in a confined region of the protein that consists of TM helix 10, which is predicted to be part of an inner membrane pore, and an extracellular loop between TM helix 11 and 12. By use of template-based modeling, a 3D structure model of the protein was obtained, which visualizes this mutational hotspot region and further strengthens the hypothesis that it represents a docking site for plantaricin JK.

  2. Sequencing and analysis of an Irish human genome.

    LENUS (Irish Health Repository)

    Tong, Pin

    2010-01-01

    Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.

  3. Targeted capture sequencing in whitebark pine reveals range-wide demographic and adaptive patterns despite challenges of a large, repetitive genome

    Directory of Open Access Journals (Sweden)

    John eSyring

    2016-04-01

    Full Text Available Whitebark pine (Pinus albicaulis) inhabits an expansive range in western North America, and it is a keystone species of subalpine environments. Whitebark is susceptible to multiple threats – climate change, white pine blister rust, mountain pine beetle, and fire exclusion – and it is suffering significant mortality range-wide, prompting the tree to be listed as ‘globally endangered’ by the International Union for Conservation of Nature (IUCN) and ‘endangered’ by the Canadian government. Conservation collections (in situ and ex situ) are being initiated to preserve the genetic legacy of the species. Reliable, transferrable, and highly variable genetic markers are essential for quantifying the genetic profiles of seed collections relative to natural stands, and ensuring the completeness of conservation collections. We evaluated the use of hybridization-based target capture to enrich specific genomic regions from the 30+ GB genome of whitebark pine, and to evaluate genetic variation across loci, trees, and geography. Probes were designed to capture 7,849 distinct genes, and screening was performed on 48 trees. Despite the inclusion of repetitive elements in the probe pool, the resulting dataset provided information on 4,452 genes and 32% of targeted positions (528,873 bp), and we were able to identify 12,390 segregating sites from 47 trees. Variations reveal strong geographic trends in heterozygosity and allelic richness, with trees from the southern Cascade and Sierra Range showing the greatest distinctiveness and differentiation. Our results show that even under non-optimal conditions (low enrichment efficiency; inclusion of repetitive elements in baits), targeted enrichment produces high quality, codominant genotypes from large genomes. The resulting data can be readily integrated into management and gene conservation activities for whitebark pine, and have the potential to be applied to other members of 5-needle pine group (Pinus subsect

  4. Genetic basis of olfactory cognition: extremely high level of DNA sequence polymorphism in promoter regions of the human olfactory receptor genes revealed using the 1000 Genomes Project dataset.

    Science.gov (United States)

    Ignatieva, Elena V; Levitsky, Victor G; Yudin, Nikolay S; Moshkin, Mikhail P; Kolchanov, Nikolay A

    2014-01-01

    The molecular mechanism of olfactory cognition is very complicated. Olfactory cognition is initiated by olfactory receptor proteins (odorant receptors), which are activated by olfactory stimuli (ligands). Olfactory receptors are the initial players in the signal transduction cascade producing a nerve impulse, which is transmitted to the brain. The sensitivity to a particular ligand depends on the expression level of multiple proteins involved in the process of olfactory cognition: olfactory receptor proteins, proteins that participate in the signal transduction cascade, etc. The expression level of each gene is controlled by its regulatory regions, and especially by the promoter [a region of DNA about 100-1000 base pairs long located upstream of the transcription start site (TSS)]. We analyzed single nucleotide polymorphisms using human whole-genome data from the 1000 Genomes Project and revealed an extremely high level of single nucleotide polymorphisms in promoter regions of olfactory receptor genes and HLA genes. We hypothesized that the high level of polymorphisms in olfactory receptor promoters was responsible for the diversity in regulatory mechanisms controlling the expression levels of olfactory receptor proteins. Such diversity of regulatory mechanisms may cause the great variability of olfactory cognition of numerous environmental olfactory stimuli perceived by human beings (air pollutants, human body odors, culinary odors, etc.). In turn, this variability may provide a wide range of emotional and behavioral reactions related to the vast variety of olfactory stimuli.

  5. Evolution of extensively drug-resistant tuberculosis over four decades revealed by whole genome sequencing of Mycobacterium tuberculosis from KwaZulu-Natal, South Africa

    Directory of Open Access Journals (Sweden)

    Keira A Cohen

    2015-01-01

    Full Text Available The largest global outbreak of extensively drug-resistant (XDR) tuberculosis (TB) was identified in Tugela Ferry, KwaZulu-Natal (KZN), South Africa in 2005. The antecedents and timing of the emergence of drug resistance in this fatal epidemic XDR outbreak are unknown, and it is unclear whether drug resistance in this region continues to be driven by clonal spread or by the development of de novo resistance. Whole genome sequencing and drug susceptibility testing (DST) were performed on 337 clinical isolates of Mycobacterium tuberculosis (M.tb) collected in KZN from 2008 to 2013, in addition to three historical isolates, one of which was isolated during the Tugela Ferry outbreak. Using a variety of whole genome comparative approaches, 11 drug-resistant clones of M.tb circulating from 2008 to 2013 were identified, including a 50-member clone of XDR M.tb that was highly related to the Tugela Ferry XDR outbreak strain. It was calculated that the evolutionary trajectory from first-line drug resistance to XDR in this clone spanned more than four decades and began at the start of the antibiotic era. It was also observed that frequent de novo evolution of MDR and XDR was present, with 56 and 9 independent evolutions, respectively. Thus, ongoing amplification of drug-resistance in KwaZulu-Natal is driven by both clonal spread and de novo acquisition of resistance. In drug-resistant TB, isoniazid resistance was overwhelmingly the initial resistance mutation to be acquired, which would not be detected by current rapid molecular diagnostics that assess only rifampicin resistance.

  6. The complete chloroplast genome sequence of Zanthoxylum piperitum.

    Science.gov (United States)

    Lee, Jonghoon; Lee, Hyeon Ju; Kim, Kyunghee; Lee, Sang-Choon; Sung, Sang Hyun; Yang, Tae-Jin

    2016-09-01

    The complete chloroplast genome sequence of Zanthoxylum piperitum, a plant species with useful aromatic oils in family Rutaceae, was generated in this study by de novo assembly with whole-genome sequence data. The chloroplast genome was 158 154 bp in length with a typical quadripartite structure containing a pair of inverted repeats of 27 644 bp, separated by large single copy and small single copy regions of 85 340 bp and 17 526 bp, respectively. The chloroplast genome harbored 112 genes consisting of 78 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Phylogenetic analysis of the complete chloroplast genome sequences with those of known relatives revealed that Z. piperitum is most closely related to the Citrus species.
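
    The quadripartite structure described above implies that the total genome length equals the large single copy region plus the small single copy region plus two copies of the inverted repeat. The short check below uses only the figures quoted in the abstract; the variable names are illustrative.

```python
# Lengths in base pairs, as quoted in the abstract.
total_length = 158_154
inverted_repeat = 27_644      # length of one copy; the genome carries a pair
large_single_copy = 85_340
small_single_copy = 17_526

# Quadripartite structure: LSC + SSC + 2 x IR should give the full length.
reconstructed = large_single_copy + small_single_copy + 2 * inverted_repeat
print(reconstructed, reconstructed == total_length)

# The gene categories should likewise add up to the reported 112 genes.
gene_counts = {"protein_coding": 78, "tRNA": 30, "rRNA": 4}
total_genes = sum(gene_counts.values())
print(total_genes)
```

    Both sums agree with the totals reported in the record (158 154 bp and 112 genes).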

  7. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Directory of Open Access Journals (Sweden)

    Roger Barthelson

    Full Text Available BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly
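
    The record above mentions "statistics of the assembly sequences" without naming them. One widely used statistic of this kind is N50, sketched below as an assumption; the abstract does not confirm that Plantagora reports it, and the function name and contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downward until half the bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (illustration only).
example = [100, 80, 60, 40, 20]
print(n50(example))
```

    For the example list, the two longest contigs (100 and 80) cover 180 of 300 bases, so the function returns 80.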

  8. The genome sequence of Bacillus cereus ATCC 10987 reveals metabolic adaptations and a large plasmid related to Bacillus anthracis pXO1.

    Science.gov (United States)

    Rasko, David A; Ravel, Jacques; Økstad, Ole Andreas; Helgason, Erlendur; Cer, Regina Z; Jiang, Lingxia; Shores, Kelly A; Fouts, Derrick E; Tourasse, Nicolas J; Angiuoli, Samuel V; Kolonay, James; Nelson, William C; Kolstø, Anne-Brit; Fraser, Claire M; Read, Timothy D

    2004-01-01

    We sequenced the complete genome of Bacillus cereus ATCC 10987, a non-lethal dairy isolate in the same genetic subgroup as Bacillus anthracis. Comparison of the chromosomes demonstrated that B. cereus ATCC 10987 was more similar to B. anthracis Ames than B. cereus ATCC 14579, while containing a number of unique metabolic capabilities such as urease and xylose utilization and lacking the ability to utilize nitrate and nitrite. Additionally, genetic mechanisms for variation of capsule carbohydrate and flagella surface structures were identified. Bacillus cereus ATCC 10987 contains a single large plasmid (pBc10987), of approximately 208 kb, that is similar in gene content and organization to B. anthracis pXO1 but is lacking the pathogenicity-associated island containing the anthrax lethal and edema toxin complex genes. The chromosomal similarity of B. cereus ATCC 10987 to B. anthracis Ames, as well as the fact that it contains a large pXO1-like plasmid, may make it a possible model for studying B. anthracis plasmid biology and regulatory cross-talk.

  9. Comparison of 61 Sequenced Escherichia coli Genomes

    DEFF Research Database (Denmark)

    Lukjancenko, Oksana; Wassenaar, T. M.; Ussery, David

    2010-01-01

    Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publicly available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics......% of the pan-genome and about 80% of a typical genome; some of these variable genes tend to be co-localized on genomic islands. The diversity within the species E. coli, and the overlap in gene content between this and related species, suggests a continuum rather than sharp species borders in this group...

  10. Complete Genome Sequencing of Influenza A Viruses within Swine Farrow-to-Wean Farms Reveals the Emergence, Persistence, and Subsidence of Diverse Viral Genotypes.

    Science.gov (United States)

    Diaz, Andres; Marthaler, Douglas; Culhane, Marie; Sreevatsan, Srinand; Alkhamis, Moh; Torremorell, Montserrat

    2017-09-15

    Influenza A viruses (IAVs) are endemic in swine and represent a public health risk. However, there is limited information on the genetic diversity of swine IAVs within farrow-to-wean farms, which is where most pigs are born. In this longitudinal study, we sampled 5 farrow-to-wean farms for a year and collected 4,190 individual nasal swabs from three distinct pig subpopulations. Of these, 207 (4.9%) samples tested PCR positive for IAV, and 124 IAVs were isolated. We sequenced the complete genomes of 123 IAV isolates and found 31 H1N1, 26 H1N2, 63 H3N2, and 3 mixed IAVs. Based on the IAV hemagglutinin, seven different influenza A viral groups (VGs) were identified. Most of the remaining IAV gene segments allowed us to differentiate the same VGs, although an additional viral group was identified for gene segment 3 (PA). Moreover, the codetection of more than one IAV VG was documented at different levels (farm, subpopulation, and individual pigs), highlighting the environment for potential IAV reassortment. Additionally, 3 out of 5 farms contained IAV isolates (n = 5) with gene segments from more than one VG, and 79% of all the IAVs sequenced contained a signature mutation (S31N) in the matrix gene that has been associated with resistance to the antiviral amantadine. Within farms, some IAVs were detected only once, while others were detected for 283 days. Our results illustrate the maintenance and subsidence of different IAVs within swine farrow-to-wean farms over time, demonstrating that pig subpopulation dynamics are important to better understand the diversity and epidemiology of swine IAVs. IMPORTANCE On a global scale, swine are one of the main reservoir species for influenza A viruses (IAVs) and play a key role in the transmission of IAVs between species. Additionally, the 2009 IAV pandemics highlighted the role of pigs in the emergence of IAVs with pandemic potential. However, limited information is available regarding the diversity and distribution of swine

  11. Ultra-Deep Sequencing of HIV-1 near Full-Length and Partial Proviral Genomes Reveals High Genetic Diversity among Brazilian Blood Donors.

    Science.gov (United States)

    Pessôa, Rodrigo; Loureiro, Paula; Esther Lopes, Maria; Carneiro-Proietti, Anna B F; Sabino, Ester C; Busch, Michael P; Sanabani, Sabri S

    2016-01-01

    Here, we aimed to gain a comprehensive picture of the HIV-1 diversity in the northeast and southeast parts of Brazil. To this end, a high-throughput sequencing-by-synthesis protocol and instrument were used to characterize the near full length (NFLG) and partial HIV-1 proviral genome in 259 HIV-1 infected blood donors at four major blood centers in Brazil: Pro-Sangue foundation (São Paulo state (SP), n = 51), Hemominas foundation (Minas Gerais state (MG), n = 41), Hemope foundation (Recife state (PE), n = 96) and Hemorio blood bank (Rio de Janeiro (RJ), n = 70). A total of 259 blood samples were obtained from 195 donors with long-standing infections and 64 donors with a lack of stage information. DNA was extracted from the peripheral blood mononuclear cells (PBMCs) to amplify the HIV-1 NFLGs from five overlapping fragments. The amplicons were molecularly bar-coded, pooled, and sequenced by Illumina paired-end protocol. Of the 259 samples studied, 208 (80%) NFLGs and 49 (18.8%) partial fragments were de novo assembled into contiguous sequences and successfully subtyped. Of these 257 samples, 183 (71.2%) were pure subtypes consisting of clade B (n = 167, 65%), C (n = 10, 3.9%), F1 (n = 4, 1.5%), and D (n = 2, 0.7%). Recombinant viruses were detected in 74 (28.8%) samples and consist of unique BF1 (n = 41, 15.9%), BC (n = 7, 2.7%), BCF1 (n = 4, 1.5%), CF1 and CDK (n = 1, 0.4%, each), CRF70_BF1 (n = 4, 1.5%), CRF71_BF1 (n = 12, 4.7%), and CRF72_BF1 (n = 4, 1.5%). Evidence of dual infection was detected in four patients coinfected with the same subtype (n = 3) and distinct subtype (n = 1). Based on this work, subtype B appears to be the prevalent subtype followed by a high proportion of intersubtype recombinants that appeared to be arising continually in this country. Our study represents the largest analysis of the viral NFLG ever undertaken worldwide and provides insights into understanding the genesis of the HIV-1 epidemic in this particular area of South America and
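
    The subtype counts in the record above can be cross-checked with simple arithmetic: the pure and recombinant categories should sum to the 257 subtyped samples, and the percentages follow from dividing by that total. The sketch below uses only the counts quoted in the abstract; the variable names are illustrative.

```python
# Counts quoted in the abstract (257 successfully subtyped samples).
subtyped = 257
pure = {"B": 167, "C": 10, "F1": 4, "D": 2}
recombinant = {
    "BF1": 41, "BC": 7, "BCF1": 4, "CF1": 1, "CDK": 1,
    "CRF70_BF1": 4, "CRF71_BF1": 12, "CRF72_BF1": 4,
}

pure_total = sum(pure.values())
recombinant_total = sum(recombinant.values())

# Share of each group among the subtyped samples, in percent.
pure_pct = 100.0 * pure_total / subtyped
recombinant_pct = 100.0 * recombinant_total / subtyped

print(pure_total, recombinant_total, pure_total + recombinant_total)
print(f"{pure_pct:.1f}% pure, {recombinant_pct:.1f}% recombinant")
```

    The totals come to 183 and 74, matching the 71.2% and 28.8% reported in the abstract.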

  12. The characterization of twenty sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Kimberly Pelak

    2010-09-01

    We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways.

  13. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    Energy Technology Data Exchange (ETDEWEB)

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  14. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    DEFF Research Database (Denmark)

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls...

  15. Genes but not genomes reveal bacterial domestication of Lactococcus lactis.

    Directory of Open Access Journals (Sweden)

    Delphine Passerini

    BACKGROUND: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST) scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE). METHODOLOGY/PRINCIPAL FINDINGS: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content) did not correlate to the MLST-based phylogeny, with strains from the same sequence type (ST) differing by up to 230 kb in genome size. CONCLUSION/SIGNIFICANCE: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between "environmental" strains, the main contributors to the genetic diversity within the subspecies, and "domesticated" strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the "domesticated" strains essentially arose through substantial genomic flux within the dispensable

  16. Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria

    DEFF Research Database (Denmark)

    Larsen, Mette Voldby; Cosentino, Salvatore; Rasmussen, Simon

    2012-01-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS...

  17. Mycobacterium tuberculosis H37Ra genome sequencing

    Indian Academy of Sciences (India)

    2007-02-09

    Journal of Biosciences. Commentary: The value of comparative genomics in understanding mycobacterial virulence: Mycobacterium tuberculosis H37Ra genome sequencing – a worthwhile endeavour. Deepak Sharma, Jaya Sivaswami Tyagi. Volume 32, Issue 2, March ...

  18. Simple sequence repeats in mycobacterial genomes

    Indian Academy of Sciences (India)

    Prakash

    J. Biosci. 32(1), January 2007. The list of microsatellite rich as well as poor regions in the five mycobacterial genomes. Local GC%. Repeat rich(+)/. Repeat poor(-). Total ORFs. Number of ... Simple sequence repeats in mycobacterial genomes. VATTIPALLY .... heat shock protein (grpE) (15839737), heat shock protein (dnaJ) ...

  19. Refined Pichia pastoris reference genome sequence

    Science.gov (United States)

    Sturmberger, Lukas; Chappell, Thomas; Geier, Martina; Krainer, Florian; Day, Kasey J.; Vide, Ursa; Trstenjak, Sara; Schiefer, Anja; Richardson, Toby; Soriaga, Leah; Darnhofer, Barbara; Birner-Gruenberger, Ruth; Glick, Benjamin S.; Tolstorukov, Ilya; Cregg, James; Madden, Knut; Glieder, Anton

    2016-01-01

    Strains of the species Komagataella phaffii are the most frequently used “Pichia pastoris” strains employed for recombinant protein production as well as studies on peroxisome biogenesis, autophagy and secretory pathway analyses. Genome sequencing of several different P. pastoris strains has provided the foundation for understanding these cellular functions in recent genomics, transcriptomics and proteomics experiments. This experimentation has identified mistakes, gaps and incorrectly annotated open reading frames in the previously published draft genome sequences. Here, a refined reference genome is presented, generated with genome and transcriptome sequencing data from multiple P. pastoris strains. Twelve major sequence gaps from 20 to 6000 base pairs were closed and 5111 out of 5256 putative open reading frames were manually curated and confirmed by RNA-seq and published LC-MS/MS data, including the addition of new open reading frames (ORFs) and a reduction in the number of spliced genes from 797 to 571. One chromosomal fragment of 76 kbp between two previous gaps on chromosome 1 and another 134 kbp fragment at the end of chromosome 4, as well as several shorter fragments needed re-orientation. In total more than 500 positions in the genome have been corrected. This reference genome is presented with new chromosomal numbering, positioning ribosomal repeats at the distal ends of the four chromosomes, and includes predicted chromosomal centromeres as well as the sequence of two linear cytoplasmic plasmids of 13.1 and 9.5 kbp found in some strains of P. pastoris. PMID:27084056

  20. Sequencing and comparing whole mitochondrial genomes of animals

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying it by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  1. Whole-genome sequencing of veterinary pathogens

    DEFF Research Database (Denmark)

    Ronco, Troels

    using whole-genome sequencing. The results showed that NELoc-1 and -3 and the two virulence genes netB and cnaA were significantly more associated with NE isolates from chickens compared to NE isolates from turkeys. Only NELoc-2 was associated with NE isolates from both turkeys and chickens. A putative......-electrophoresis and single-locus sequencing has been widely used to characterize such types of veterinary pathogens. However, DNA sequencing techniques have become fast and cost effective in recent years and whole-genome sequencing data provide a much higher discriminative power and reproducibility than any...... of the traditional molecular techniques. In this PhD project three important veterinary pathogens (Clostridium perfringens, Escherichia coli and Staphylococcus aureus) were investigated using whole-genome sequencing. This was done in five different scientific papers which all have been published. Paper I and II...

  2. Complete genomic sequence analyses of the first group A giraffe rotavirus reveals close evolutionary relationship with rotaviruses infecting other members of the Artiodactyla.

    Science.gov (United States)

    O'Shea, Helen; Mulherin, Emily; Matthijnssens, Jelle; McCusker, Matthew P; Collins, P J; Cashman, Olivia; Gunn, Lynda; Beltman, Marijke E; Fanning, Séamus

    2014-05-14

    Group A Rotaviruses (RVA) have been established as significant contributory agents of acute gastroenteritis in young children and many animal species. In 2008, we described the first RVA strain detected in a giraffe calf (RVA/Giraffe-wt/IRL/GirRV/2008/G10P[11]), presenting with acute diarrhoea. Molecular characterisation of the VP7 and VP4 genes revealed the bovine-like genotypes G10 and P[11], respectively. To further investigate the origin of this giraffe RVA strain, the 9 remaining gene segments were sequenced and analysed, revealing the following genotype constellation: G10-P[11]-I2-R2-C2-M2-A3-N2-T6-E2-H3. This genotype constellation is very similar to RVA strains isolated from cattle or other members of the artiodactyls. Phylogenetic analyses confirmed the close relationship between GirRV and RVA strains with a bovine-like genotype constellation detected from several host species, including humans. These results suggest that RVA strain GirRV was the result of an interspecies transmission from a bovine host to the giraffe calf. However, we cannot rule out completely that this bovine-like RVA genotype constellation may be enzootic in giraffes. Future RVA surveillance in giraffes may answer this intriguing question. Copyright © 2014 Elsevier B.V. All rights reserved.

  3. The complete chloroplast genome sequence of Abies nephrolepis (Pinaceae: Abietoideae)

    Directory of Open Access Journals (Sweden)

    Dong-Keun Yi

    2016-06-01

    The plant chloroplast (cp) genome has maintained a relatively conserved structure and gene content throughout evolution. Cp genome sequences have been used widely for resolving evolutionary and phylogenetic issues at various taxonomic levels of plants. Here, we report the complete cp genome of Abies nephrolepis. The A. nephrolepis cp genome is 121,336 base pairs (bp) in length, including a pair of short inverted repeat regions (IRa and IRb) of 139 bp each, separated by a small single copy (SSC) region of 54,323 bp and a large single copy (LSC) region of 66,735 bp. It contains 114 genes, 68 of which are protein coding genes, 35 tRNA and four rRNA genes, six open reading frames, and one pseudogene. Seventeen repeat units and 64 simple sequence repeats (SSRs) have been detected in the A. nephrolepis cp genome. Large IR sequences (1186 bp) are located at the 42-kb inversion points. The A. nephrolepis cp genome is nearly identical to that of Abies koreana, a closely related taxon. Pairwise comparison between the two cp genomes revealed 140 polymorphic sites in each. The complete cp genome sequence of A. nephrolepis has significant potential to provide information on the evolutionary pattern of Abietoideae and valuable data for the development of DNA markers for easy identification and classification.

  4. The genome sequence of sweet cherry (Prunus avium) for use in genomics-assisted breeding.

    Science.gov (United States)

    Shirasawa, Kenta; Isuzugawa, Kanji; Ikenaga, Mitsunobu; Saito, Yutaro; Yamamoto, Toshiya; Hirakawa, Hideki; Isobe, Sachiko

    2017-10-01

    We determined the genome sequence of sweet cherry (Prunus avium) using next-generation sequencing technology. The total length of the assembled sequences was 272.4 Mb, consisting of 10,148 scaffold sequences with an N50 length of 219.6 kb. The sequences covered 77.8% of the 352.9 Mb sweet cherry genome, as estimated by k-mer analysis, and included >96.0% of the core eukaryotic genes. We predicted 43,349 complete and partial protein-encoding genes. A high-density consensus map with 2,382 loci was constructed using double-digest restriction site-associated DNA sequencing. Comparing the genetic maps of sweet cherry and peach revealed high synteny between the two genomes; thus the scaffolds were integrated into pseudomolecules using map- and synteny-based strategies. Whole-genome resequencing of six modern cultivars found 1,016,866 SNPs and 162,402 insertions/deletions, out of which 0.7% were deleterious. The sequence variants, as well as simple sequence repeats, can be used as DNA markers. The genomic information helps us to identify agronomically important genes and will accelerate genetic studies and breeding programs for sweet cherries. Further information on the genomic sequences and DNA markers is available in DBcherry (http://cherry.kazusa.or.jp (8 May 2017, date last accessed)). © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
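    The N50 statistic quoted in the sweet cherry record (219.6 kb) is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The following is a minimal sketch of that calculation; the function name `n50` and the toy scaffold lengths are illustrative assumptions and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    summed assembly length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (not real data): total length 30, half is 15.
toy_scaffolds = [10, 8, 6, 4, 2]
print(n50(toy_scaffolds))  # 10 + 8 = 18 >= 15, so the N50 is 8
```

    Because N50 depends only on the length distribution, it says nothing about correctness of the assembly, which is why the record also reports coverage of core eukaryotic genes.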

  5. 10KP: A phylodiverse genome sequencing plan

    Science.gov (United States)

    Cheng, Shifeng; Melkonian, Michael; Brockington, Samuel; Archibald, John M; Delaux, Pierre-Marc; Melkonian, Barbara; Mavrodiev, Evgeny V; Sun, Wenjing; Fu, Yuan; Yang, Huanming; Soltis, Douglas E; Graham, Sean W; Soltis, Pamela S; Liu, Xin; Xu, Xun

    2018-01-01

    Understanding plant evolution and diversity in a phylogenomic context is an enormous challenge due, in part, to limited availability of genome-scale data across phylodiverse species. The 10KP (10,000 Plants) Genome Sequencing Project will sequence and characterize representative genomes from every major clade of embryophytes, green algae, and protists (excluding fungi) within the next 5 years. By implementing and continuously improving leading-edge sequencing technologies and bioinformatics tools, 10KP will catalogue the genome content of plant and protist diversity and make these data freely available as an enduring foundation for future scientific discoveries and applications. 10KP is structured as an international consortium, open to the global community, including botanical gardens, plant research institutes, universities, and private industry. Our immediate goal is to establish a policy framework for this endeavor, the principles of which are outlined here. PMID:29618049

  6. 10KP: A phylodiverse genome sequencing plan.

    Science.gov (United States)

    Cheng, Shifeng; Melkonian, Michael; Smith, Stephen A; Brockington, Samuel; Archibald, John M; Delaux, Pierre-Marc; Li, Fay-Wei; Melkonian, Barbara; Mavrodiev, Evgeny V; Sun, Wenjing; Fu, Yuan; Yang, Huanming; Soltis, Douglas E; Graham, Sean W; Soltis, Pamela S; Liu, Xin; Xu, Xun; Wong, Gane Ka-Shu

    2018-03-01

    Understanding plant evolution and diversity in a phylogenomic context is an enormous challenge due, in part, to limited availability of genome-scale data across phylodiverse species. The 10KP (10,000 Plants) Genome Sequencing Project will sequence and characterize representative genomes from every major clade of embryophytes, green algae, and protists (excluding fungi) within the next 5 years. By implementing and continuously improving leading-edge sequencing technologies and bioinformatics tools, 10KP will catalogue the genome content of plant and protist diversity and make these data freely available as an enduring foundation for future scientific discoveries and applications. 10KP is structured as an international consortium, open to the global community, including botanical gardens, plant research institutes, universities, and private industry. Our immediate goal is to establish a policy framework for this endeavor, the principles of which are outlined here.

  7. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    DEFF Research Database (Denmark)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand...... the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two...

  8. Complete Genome Sequences of Chrysanthemum Stunt Viroid from a Single Chrysanthemum Cultivar

    OpenAIRE

    Choi, Hoseong; Jo, Yeonhwa; Yoon, Ju-Yeon; Choi, Seung-Kook; Cho, Won Kyong

    2015-01-01

    The chrysanthemum stunt viroid (CSVd), a member of the genus Pospiviroid with a single circular RNA genome, infects many chrysanthemum species. Here, we report 25 complete genome sequences of CSVd in a single chrysanthemum cultivar, revealing 20 variants.

  9. Genome Sequence of the Palaeopolyploid soybean

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

    2009-08-03

    Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

  10. Genome sequence of M6, a diploid inbred clone of the high glycoalkaloid-producing tuber-bearing potato species Solanum chacoense, reveals residual heterozygosity

    Science.gov (United States)

    Background: Potato (Solanum tuberosum) is the world’s most important vegetable crop and central to global food security. Cultivated potato is a highly heterozygous autotetraploid that presents challenges in genome analyses and breeding. Numerous wild potato species serve as a resource for introgress...

  11. Comparative genomics reveals insights into avian genome evolution and adaptation

    DEFF Research Database (Denmark)

    Zhang, Guojie; Li, Cai; Li, Qiye

    2014-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size...... this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits....

  12. Genome Sequence of Bacillus glycinifermentans TH008, Isolated from Ohio Soil

    OpenAIRE

    Zeigler, Daniel R.

    2016-01-01

    The genome sequence of an Ohio soil isolate, TH008, was determined. The sequence reveals a close relationship between TH008 and domesticated Bacillus glycinifermentans strains found in a traditional Korean fermented soybean food.

  13. Genome sequence of the model medicinal mushroom Ganoderma lucidum

    Science.gov (United States)

    Chen, Shilin; Xu, Jiang; Liu, Chang; Zhu, Yingjie; Nelson, David R.; Zhou, Shiguo; Li, Chunfang; Wang, Lizhi; Guo, Xu; Sun, Yongzhen; Luo, Hongmei; Li, Ying; Song, Jingyuan; Henrissat, Bernard; Levasseur, Anthony; Qian, Jun; Li, Jianqin; Luo, Xiang; Shi, Linchun; He, Liu; Xiang, Li; Xu, Xiaolan; Niu, Yunyun; Li, Qiushi; Han, Mira V.; Yan, Haixia; Zhang, Jin; Chen, Haimei; Lv, Aiping; Wang, Zhen; Liu, Mingzhu; Schwartz, David C.; Sun, Chao

    2012-01-01

    Ganoderma lucidum is a widely used medicinal macrofungus in traditional Chinese medicine that creates a diverse set of bioactive compounds. Here we report its 43.3-Mb genome, encoding 16,113 predicted genes, obtained using next-generation sequencing and optical mapping approaches. The sequence analysis reveals an impressive array of genes encoding cytochrome P450s (CYPs), transporters and regulatory proteins that cooperate in secondary metabolism. The genome also encodes one of the richest sets of wood degradation enzymes among all of the sequenced basidiomycetes. In all, 24 physical CYP gene clusters are identified. Moreover, 78 CYP genes are coexpressed with lanosterol synthase, and 16 of these show high similarity to fungal CYPs that specifically hydroxylate testosterone, suggesting their possible roles in triterpenoid biosynthesis. The elucidation of the G. lucidum genome makes this organism a potential model system for the study of secondary metabolic pathways and their regulation in medicinal fungi. PMID:22735441

  14. Transforming clinical microbiology with bacterial genome sequencing

    OpenAIRE

    Didelot, Xavier; Bowden, Rory; Wilson, Daniel J.; Peto, Tim E. A.; Crook, Derrick W.

    2012-01-01

    Whole genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here we review the current status of clinical microbiology and how it has already begun to be transformed by the use of next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pa...

  15. Escherichia Coli: From Genome Sequences to Consequence

    Directory of Open Access Journals (Sweden)

    Mark Pallen

    2006-01-01

    The present article summarizes a presentation given by Professor Mark Pallen of the School of Medicine at the University of Birmingham (Birmingham, United Kingdom) for the Fourth Stanier Lecture held in Regina, Saskatchewan, on November 9, 2004. Professor Pallen's lecture, entitled 'Escherichia coli: From genome sequences to consequences', provides a summary of the important discoveries of his team of research scientists in the area of genetic sequencing and variations in phenotypic expression.

  16. Phytophthora Genome Sequences Uncover Evolutionary Origins and Mechanisms of Pathogenesis

    Energy Technology Data Exchange (ETDEWEB)

    Lamour, Kurt H [ORNL; McDonald, W Hayes [ORNL; Savidor, Alon [ORNL

    2006-01-01

    Genome sequences of the soybean pathogen, Phytophthora sojae, and the sudden oak death pathogen, Phytophthora ramorum, suggest a photosynthetic past and reveal recent massive expansion and diversification of potential pathogenicity gene families. Abstract: Draft genome sequences of the soybean pathogen, Phytophthora sojae, and the sudden oak death pathogen, Phytophthora ramorum, have been determined. Oomycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors and, in particular, a superfamily of 700 proteins with similarity to known oomycete avirulence genes.

  17. De novo assembly of the carrot mitochondrial genome using next generation sequencing of whole genomic DNA provides first evidence of DNA transfer into an angiosperm plastid genome

    Directory of Open Access Journals (Sweden)

    Iorizzo Massimo

    2012-05-01

    Background: Sequence analysis of organelle genomes has revealed important aspects of plant cell evolution. The scope of this study was to develop an approach for de novo assembly of the carrot mitochondrial genome using next generation sequence data from total genomic DNA. Results: Sequencing data from a carrot 454 whole genome library were used to develop a de novo assembly of the mitochondrial genome. Development of a new bioinformatic tool allowed visualizing contig connections and elucidation of the de novo assembly. Southern hybridization demonstrated recombination across two large repeats. Genome annotation allowed identification of 44 protein coding genes, three rRNA and 17 tRNA. Identification of the plastid genome sequence allowed organelle genome comparison. Mitochondrial intergenic sequence analysis allowed detection of a fragment of DNA specific to the carrot plastid genome. PCR amplification and sequence analysis across different Apiaceae species revealed consistent conservation of this fragment in the mitochondrial genomes and an insertion in Daucus plastid genomes, giving evidence of a mitochondrial to plastid transfer of DNA. Sequence similarity with a retrotransposon element suggests a possibility that a transposon-like event transferred this sequence into the plastid genome. Conclusions: This study confirmed that whole genome sequencing is a practical approach for de novo assembly of higher plant mitochondrial genomes. In addition, a new aspect of intercompartmental genome interaction was reported, providing the first evidence for DNA transfer into an angiosperm plastid genome. The approach used here could be used more broadly to sequence and assemble mitochondrial genomes of diverse species. This information will allow us to better understand intercompartmental interactions and cell evolution.

  18. Genomic multiple sequence alignments: refinement using a genetic algorithm

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2005-08-01

    Background: Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics, the practice of comparing genomic sequences from different species, plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To offset the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. Results: We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. Overall sequence identity increased only

  19. The diploid genome sequence of an individual human.

    Directory of Open Access Journals (Sweden)

    Samuel Levy

    2007-09-01

    Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
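    The "22% of all events" figure in the record above can be checked against the variant counts the abstract itself lists. The sketch below is only a rough arithmetic check: the dictionary keys are illustrative labels, and segmental duplications and copy number variation regions are left out because the abstract gives no counts for them.

```python
# Variant counts as quoted in the abstract; categories without a stated
# count (segmental duplications, copy number variation) are omitted.
variant_counts = {
    "SNPs": 3_213_401,
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}

total_events = sum(variant_counts.values())
non_snp_events = total_events - variant_counts["SNPs"]
non_snp_fraction = non_snp_events / total_events

print(f"{total_events:,} events, {non_snp_fraction:.1%} non-SNP")
```

    The listed categories sum to roughly 4.1 million events, of which about 22% are not SNPs, in line with the figures stated in the abstract.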

  20. Genome sequencing and population genomics in non-model organisms.

    Science.gov (United States)

    Ellegren, Hans

    2014-01-01

    High-throughput sequencing technologies are revolutionizing the life sciences. The past 12 months have seen a burst of genome sequences from non-model organisms, in each case representing a fundamental source of data of significant importance to biological research. This has bearing on several aspects of evolutionary biology, and we are now beginning to see patterns emerging from these studies. These include significant heterogeneity in the rate of recombination that affects adaptive evolution and base composition, the role of population size in adaptive evolution, and the importance of expansion of gene families in lineage-specific adaptation. Moreover, resequencing of population samples (population genomics) has enabled the identification of the genetic basis of critical phenotypes and cast light on the landscape of genomic divergence during speciation. Copyright © 2013 Elsevier Ltd. All rights reserved.

  1. Genome shotgun sequencing and development of microsatellite ...

    African Journals Online (AJOL)

    Analysis of the gerbera genome DNA ('Raon') general library showed that sequences of (AT), (AG), (AAG) and (AAT) repeats appeared most often, whereas (AC), (AAC) and (ACC) were the least frequent. Primer pairs were designed for 80 loci. Only eight primer pairs produced reproducible polymorphic bands in the 28 ...

  2. Genome sequencing for obstetricians & gynaecologists | Kent ...

    African Journals Online (AJOL)

    The medical profession has been waiting for a decade to be invigorated by the sequencing of the human genome, arguably the greatest scientific project ever. The technology has been spectacular but the results of the project have yielded more unexpected results than definitive answers – many about the very nature of our ...

  3. Sequencing of 50 human exomes reveals adaptation to high altitude

    DEFF Research Database (Denmark)

    Yi, Xin; Liang, Yu; Huerta-Sanchez, Emilia

    2010-01-01

    Residents of the Tibetan Plateau show heritable adaptations to extreme altitude. We sequenced 50 exomes of ethnic Tibetans, encompassing coding sequences of 92% of human genes, with an average coverage of 18x per individual. Genes showing population-specific allele frequency changes, which...... difference between Tibetan and Han samples, representing the fastest allele frequency change observed at any human gene to date. This SNP's association with erythrocyte abundance supports the role of EPAS1 in adaptation to hypoxia. Thus, a population genomic survey has revealed a functionally important locus...

  4. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    Directory of Open Access Journals (Sweden)

    Weitemier Kevin

    2011-05-01

    Background: Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results: A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions: The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species

  5. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

    Science.gov (United States)

    Straub, Shannon C K; Fishbein, Mark; Livshultz, Tatyana; Foster, Zachary; Parks, Matthew; Weitemier, Kevin; Cronn, Richard C; Liston, Aaron

    2011-05-04

    Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first

  6. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    Science.gov (United States)

    2011-01-01

    Background: Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results: A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions: The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives

  7. Demographically idiosyncratic responses to climate change and rapid Pleistocene diversification of the walnut genus Juglans (Juglandaceae) revealed by whole-genome sequences.

    Science.gov (United States)

    Bai, Wei-Ning; Yan, Peng-Cheng; Zhang, Bo-Wen; Woeste, Keith E; Lin, Kui; Zhang, Da-Yong

    2018-03-01

    Whether species demography and diversification are driven primarily by extrinsic environmental changes such as climatic oscillations in the Quaternary or by intrinsic biological interactions like coevolution between antagonists is a matter of active debate. In fact, their relative importance can be assessed by tracking past population fluctuations over considerable time periods. We applied the pairwise sequentially Markovian coalescent approach to the genomes of 11 temperate Juglans species to estimate trajectories of changes in effective population size (Ne) and used a Bayesian coalescent-based approach that simultaneously considers multiple genomes (G-PhoCS) to estimate divergence times between lineages. Ne curves of all study species converged 1.0 million yr ago, probably reflecting the time when the walnut genus last shared a common ancestor. This estimate was confirmed by the G-PhoCS estimates of divergence times. However, not all species reacted similarly to the dramatic climatic oscillations following early Pleistocene cooling, so the timing and amplitude of changes in Ne differed among species and even among conspecific lineages. The population histories of temperate walnut species were not driven by extrinsic environmental changes alone, and a key role was probably played by species-specific factors such as coevolutionary interactions with specialized pathogens. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.

  8. Whole-Genome Sequencing of Methicillin-Resistant Staphylococcus aureus Resistant to Fifth-Generation Cephalosporins Reveals Potential Non-mecA Mechanisms of Resistance.

    Science.gov (United States)

    Greninger, Alexander L; Chatterjee, Som S; Chan, Liana C; Hamilton, Stephanie M; Chambers, Henry F; Chiu, Charles Y

    2016-01-01

    Fifth-generation cephalosporins, ceftobiprole and ceftaroline, are promising drugs for treatment of bacterial infections from methicillin-resistant Staphylococcus aureus (MRSA). These antibiotics are able to bind native PBP2a, the penicillin-binding protein encoded by the mecA resistance determinant that mediates broad class resistance to nearly all other beta-lactam antibiotics, at clinically achievable concentrations. Mechanisms of resistance to ceftaroline based on mecA mutations have been previously described. Here we compare the genomes of 11 total parent-daughter strains of Staphylococcus aureus for which specific selection by serial passaging with ceftaroline or ceftobiprole was used to identify novel non-mecA mechanisms of resistance. All 5 ceftaroline-resistant strains, derived from 5 different parental strains, contained mutations directly upstream of the pbp4 gene (coding for the PBP4 protein), including four with the same thymidine insertion located 377 nucleotides upstream of the promoter site. In 4 of 5 independent ceftaroline-driven selections, we also isolated mutations to the same residue (Asn138) in PBP4. In addition, mutations in additional candidate genes such as ClpX endopeptidase, PP2C protein phosphatase and transcription terminator Rho, previously undescribed in the context of resistance to ceftaroline or ceftobiprole, were detected in multiple selections. These genomic findings suggest that non-mecA mechanisms, while yet to be encountered in the clinical setting, may also be important in mediating resistance to 5th-generation cephalosporins.

  9. Whole-Genome Sequencing of Methicillin-Resistant Staphylococcus aureus Resistant to Fifth-Generation Cephalosporins Reveals Potential Non-mecA Mechanisms of Resistance.

    Directory of Open Access Journals (Sweden)

    Alexander L Greninger

    Fifth-generation cephalosporins, ceftobiprole and ceftaroline, are promising drugs for treatment of bacterial infections from methicillin-resistant Staphylococcus aureus (MRSA). These antibiotics are able to bind native PBP2a, the penicillin-binding protein encoded by the mecA resistance determinant that mediates broad class resistance to nearly all other beta-lactam antibiotics, at clinically achievable concentrations. Mechanisms of resistance to ceftaroline based on mecA mutations have been previously described. Here we compare the genomes of 11 total parent-daughter strains of Staphylococcus aureus for which specific selection by serial passaging with ceftaroline or ceftobiprole was used to identify novel non-mecA mechanisms of resistance. All 5 ceftaroline-resistant strains, derived from 5 different parental strains, contained mutations directly upstream of the pbp4 gene (coding for the PBP4 protein), including four with the same thymidine insertion located 377 nucleotides upstream of the promoter site. In 4 of 5 independent ceftaroline-driven selections, we also isolated mutations to the same residue (Asn138) in PBP4. In addition, mutations in additional candidate genes such as ClpX endopeptidase, PP2C protein phosphatase and transcription terminator Rho, previously undescribed in the context of resistance to ceftaroline or ceftobiprole, were detected in multiple selections. These genomic findings suggest that non-mecA mechanisms, while yet to be encountered in the clinical setting, may also be important in mediating resistance to 5th-generation cephalosporins.

  10. Genomics analysis of genes expressed reveals differential ...

    African Journals Online (AJOL)

    Genomics analysis of genes expressed reveals differential responses to low chronic nitrogen stress in maize. ... Most induced clones were largely involved in various metabolism processes including physiological process, organelle regulation of biological process, nutrient reservoir activity, transcription regulator activity and ...

  11. Genome sequence and characterization of the Tsukamurella bacteriophage TPA2.

    Science.gov (United States)

    Petrovski, Steve; Seviour, Robert J; Tillett, Daniel

    2011-02-01

    The formation of stable foam in activated sludge plants is a global problem for which control is difficult. These foams are often stabilized by hydrophobic mycolic acid-synthesizing Actinobacteria, among which are Tsukamurella spp. This paper describes the isolation from activated sludge of the novel double-stranded DNA phage TPA2. This polyvalent Siphoviridae family phage is lytic for most Tsukamurella species. Whole-genome sequencing reveals that the TPA2 genome is circularly permuted (61,440 bp) and that 70% of its sequence is novel. We have identified 78 putative open reading frames, 95 pairs of inverted repeats, and 6 palindromes. The TPA2 genome has a modular gene structure that shares some similarity to those of Mycobacterium phages. A number of the genes display a mosaic architecture, suggesting that the TPA2 genome has evolved at least in part from genetic recombination events. The genome sequence reveals many novel genes that should inform any future discussion on Tsukamurella phage evolution.

  12. Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences

    Directory of Open Access Journals (Sweden)

    Shade Larry L

    2006-06-01

    Background: Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. Results: Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however, substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10^-9 change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10^-9 change/site/year) was approximately half of the overall rate (1.9–2.0 × 10^-9 change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%. Conclusion: This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.
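    The rate comparison in this abstract is simple arithmetic and can be reproduced as a rough sketch. The first part uses only the figures quoted above to check that the coding rate is "approximately half" of the overall rate. The second part is an assumption that the abstract does not make: it applies the textbook strict-molecular-clock relation d = 2 × rate × T, which the study itself may not have used, to show how a pairwise divergence and a per-lineage rate would relate to a split time. The function names are hypothetical.

```python
# Rates in changes/site/year, as quoted in the abstract.
CODING_RATE = 1.1e-9
OVERALL_RATE_RANGE = (1.9e-9, 2.0e-9)

# Pairwise divergence in changes/site, as quoted in the abstract.
DIVERGENCE_RANGE = (0.32, 0.37)


def coding_to_overall_ratios(coding, overall_range):
    """Ratio of the coding rate to each end of the overall-rate range."""
    return [coding / overall for overall in overall_range]


def implied_split_time(divergence, rate_per_lineage):
    """Years since divergence under a strict molecular clock.

    ASSUMPTION (not stated in the abstract): both lineages accumulate
    changes at the same constant rate, so d = 2 * rate * T.
    """
    return divergence / (2.0 * rate_per_lineage)


ratios = coding_to_overall_ratios(CODING_RATE, OVERALL_RATE_RANGE)
print("coding / overall rate:", [round(r, 2) for r in ratios])

# Illustrative only: one corner of the quoted ranges.
example_time = implied_split_time(DIVERGENCE_RANGE[0], OVERALL_RATE_RANGE[1])
print(f"illustrative split time: {example_time / 1e6:.0f} million years")
```

    The ratios come out slightly above one half, consistent with the wording of the abstract. The split time is shown only to make the relation between the two kinds of quantity concrete and should not be read as a result of the study.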

  13. Mitochondrial genome sequences and comparative genomics of Phytophthora ramorum and P. sojae

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Frank N.; Bensasson, Douda; Tyler, Brett M.; Boore, Jeffrey L.

    2007-01-01

    The complete sequences of the mitochondrial genomes of the oomycetes Phytophthora ramorum and P. sojae were determined during the course of their complete nuclear genome sequencing (Tyler et al. 2006). Both are circular, with sizes of 39,314 bp for P. ramorum and 42,975 bp for P. sojae. Each contains a total of 37 identifiable protein-encoding genes, 25 or 26 tRNAs (P. sojae and P. ramorum, respectively) specifying 19 amino acids, and a variable number of ORFs (7 for P. ramorum and 12 for P. sojae) which are potentially additional functional genes. Non-coding regions comprise approximately 11.5 percent and 18.4 percent of the genomes of P. ramorum and P. sojae, respectively. Relative to P. sojae, there is an inverted repeat of 1,150 bp in P. ramorum that includes an unassigned unique ORF, a tRNA gene, and adjacent non-coding sequences, but otherwise the gene order in both species is identical. Comparisons of these genomes with published sequences of the P. infestans mitochondrial genome reveal a number of similarities, but the gene order in P. infestans differs in two adjacent locations due to inversions. Sequence alignments of the three genomes indicated sequence conservation ranging from 75 to 85 percent and that specific regions were more variable than others.

  14. Whole genome sequence and phylogenetic analyses reveal human rotavirus G3P[3] strains Ro1845 and HCR3A are examples of direct virion transmission of canine/feline rotaviruses to humans.

    Science.gov (United States)

    Tsugawa, Takeshi; Hoshino, Yasutaka

    2008-10-25

    Rotaviruses, the major causative agents of infantile diarrhea worldwide, are, in general, highly species-specific. Interspecies virus transmission is thought to be one of the important contributors involved in the evolution and diversity of rotaviruses in nature. Human rotavirus (HRV) G3P[3] strains Ro1845 and HCR3A have been reported to be closely related genetically to certain canine and feline rotaviruses (RVs). Whole genome sequence and phylogenetic analyses of each of these 2 HRVs as well as 3 canine RVs (CU-1, K9 and A79-10, each with G3P[3] specificity) and 2 feline RVs (Cat97 with G3P[3] specificity and Cat2 with G3P[9] specificity) revealed that (i) each of 11 genes of the Ro1845 and HCR3A was of canine/feline origin; (ii) canine and feline rotaviruses with G3P[3] specificity bore highly conserved species-specific genomes; and (iii) the Cat2 strain may have evolved via multiple reassortment events involving canine, feline, human and bovine rotaviruses.

  15. Genome Sequence Analysis of Vibrio cholerae clinical isolates from 2013 in Mexico reveals the presence of the strain responsible for the 2010 Haiti outbreak.

    Science.gov (United States)

    Díaz-Quiñonez, José Alberto

    2017-01-01

    In the first week of September 2013, the National Epidemiological Surveillance System identified two cases of cholera in Mexico City. The cultures of both samples were confirmed as Vibrio cholerae serogroup O1, serotype Ogawa, biotype El Tor. Initial analyses by pulsed-field gel electrophoresis and by polymerase chain reaction amplification of the virulence genes suggested that both strains were similar, but different from those previously reported in Mexico. The following week, four more cases were identified in a community in the state of Hidalgo, located 121 km northeast of Mexico City. Thereafter a cholera outbreak started in the region of La Huasteca. Genomic analyses of the strains obtained in this study confirmed the presence of pathogenicity islands VPI-1 and

  16. Automated genome sequence analysis and annotation.

    Science.gov (United States)

    Andrade, M A; Brown, N P; Leroy, C; Hoersch, S; de Daruvar, A; Reich, C; Franchini, A; Tamames, J; Valencia, A; Ouzounis, C; Sander, C

    1999-05-01

    Large-scale genome projects generate a rapidly increasing number of sequences, most of them biochemically uncharacterized. Research in bioinformatics contributes to the development of methods for the computational characterization of these sequences. However, the installation and application of these methods require experience and are time consuming. We present here an automatic system for preliminary functional annotation of protein sequences that has been applied to the analysis of sets of sequences from complete genomes, both to refine overall performance and to make new discoveries comparable to those made by human experts. The GeneQuiz system includes a Web-based browser that allows examination of the evidence leading to an automatic annotation and offers additional information, views of the results, and links to biological databases that complement the automatic analysis. System structure and operating principles concerning the use of multiple sequence databases, underlying sequence analysis tools, lexical analyses of database annotations and decision criteria for functional assignments are detailed. The system makes automatic quality assessments of results based on prior experience with the underlying sequence analysis tools; overall error rates in functional assignment are estimated at 2.5-5% for cases annotated with highest reliability ('clear' cases). Sources of over-interpretation of results are discussed with proposals for improvement. A conservative definition for reporting 'new findings' that takes account of database maturity is presented along with examples of possible kinds of discoveries (new function, family and superfamily) made by the system. System performance in relation to sequence database coverage, database dynamics and database search methods is analysed, demonstrating the inherent advantages of an integrated automatic approach using multiple databases and search methods applied in an objective and repeatable manner. 
The GeneQuiz system

  17. The first genome sequences of human bocaviruses from Vietnam.

    Science.gov (United States)

    Thanh, Tran Tan; Van, Hoang Minh Tu; Hong, Nguyen Thi Thu; Nhu, Le Nguyen Truc; Anh, Nguyen To; Tuan, Ha Manh; Hien, Ho Van; Tuong, Nguyen Manh; Kien, Trinh Trung; Khanh, Truong Huu; Nhan, Le Nguyen Thanh; Hung, Nguyen Thanh; Chau, Nguyen Van Vinh; Thwaites, Guy; van Doorn, H Rogier; Tan, Le Van

    2016-01-01

    As part of an ongoing effort to generate complete genome sequences of hand, foot and mouth disease-causing enteroviruses directly from clinical specimens, two complete coding sequences and two partial genomic sequences of human bocavirus 1 (n=3) and 2 (n=1) were co-amplified and sequenced, representing the first genome sequences of human bocaviruses from Vietnam. The sequences may aid future study aiming at understanding the evolution of the pathogen.

  18. The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes

    Science.gov (United States)

    Liu, Shengyi; Liu, Yumei; Yang, Xinhua; Tong, Chaobo; Edwards, David; Parkin, Isobel A. P.; Zhao, Meixia; Ma, Jianxin; Yu, Jingyin; Huang, Shunmou; Wang, Xiyin; Wang, Junyi; Lu, Kun; Fang, Zhiyuan; Bancroft, Ian; Yang, Tae-Jin; Hu, Qiong; Wang, Xinfa; Yue, Zhen; Li, Haojie; Yang, Linfeng; Wu, Jian; Zhou, Qing; Wang, Wanxin; King, Graham J; Pires, J. Chris; Lu, Changxin; Wu, Zhangyan; Sampath, Perumal; Wang, Zhuo; Guo, Hui; Pan, Shengkai; Yang, Limei; Min, Jiumeng; Zhang, Dong; Jin, Dianchuan; Li, Wanshun; Belcram, Harry; Tu, Jinxing; Guan, Mei; Qi, Cunkou; Du, Dezhi; Li, Jiana; Jiang, Liangcai; Batley, Jacqueline; Sharpe, Andrew G; Park, Beom-Seok; Ruperao, Pradeep; Cheng, Feng; Waminal, Nomar Espinosa; Huang, Yin; Dong, Caihua; Wang, Li; Li, Jingping; Hu, Zhiyong; Zhuang, Mu; Huang, Yi; Huang, Junyan; Shi, Jiaqin; Mei, Desheng; Liu, Jing; Lee, Tae-Ho; Wang, Jinpeng; Jin, Huizhe; Li, Zaiyun; Li, Xun; Zhang, Jiefu; Xiao, Lu; Zhou, Yongming; Liu, Zhongsong; Liu, Xuequn; Qin, Rui; Tang, Xu; Liu, Wenbin; Wang, Yupeng; Zhang, Yangyong; Lee, Jonghoon; Kim, Hyun Hee; Denoeud, France; Xu, Xun; Liang, Xinming; Hua, Wei; Wang, Xiaowu; Wang, Jun; Chalhoub, Boulos; Paterson, Andrew H

    2014-01-01

    Polyploidization has provided much genetic variation for plant adaptive evolution, but the mechanisms by which the molecular evolution of polyploid genomes establishes genetic architecture underlying species differentiation are unclear. Brassica is an ideal model to increase knowledge of polyploid evolution. Here we describe a draft genome sequence of Brassica oleracea, comparing it with that of its sister species B. rapa to reveal numerous chromosome rearrangements and asymmetrical gene loss in duplicated genomic blocks, asymmetrical amplification of transposable elements, differential gene co-retention for specific pathways and variation in gene expression, including alternative splicing, among a large number of paralogous and orthologous genes. Genes related to the production of anticancer phytochemicals and morphological variations illustrate consequences of genome duplication and gene divergence, imparting biochemical and morphological variation to B. oleracea. This study provides insights into Brassica genome evolution and will underpin research into the many important crops in this genus. PMID:24852848

  19. Identifying driver mutations in sequenced cancer genomes

    DEFF Research Database (Denmark)

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, noise, and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein...

  20. Whole genome sequence analysis of Mycobacterium suricattae

    KAUST Repository

    Dippenaar, Anzaan

    2015-10-21

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  1. Phytophthora Genome Sequences Uncover Evolutionary Origins and Mechanisms of Pathogenesis

    Energy Technology Data Exchange (ETDEWEB)

    Tyler, Brett M.; Tripathy, Sucheta; Zhang, Xuemin; Dehal, Paramvir; Jiang, Rays H. Y.; Aerts, Andrea; Arredondo, Felipe D.; Baxter, Laura; Bensasson, Douda; Beynon, Jim L.; Chapman, Jarrod; Damasceno, Cynthia M. B.; Dorrance, Anne E.; Dou, Daolong; Dickerman, Allan W.; Dubchak, Inna L.; Garbelotto, Matteo; Gijzen, Mark; Gordon, Stuart G.; Govers, Francine; Grunwald, Niklaus J.; Huang, Wayne; Ivors, Kelly L.; Jones, Richard W.; Kamoun, Sophien; Krampis, Konstantinos; Lamour, Kurt H.; Lee, Mi-Kyung; McDonald, W. Hayes; Medina, Monica; Meijer, Harold J. G.; Nordberg, Erik K.; Maclean, Donald J.; Ospina-Giraldo, Manuel D.; Morris, Paul F.; Phuntumart, Vipaporn; Putnam, Nicholas J.; Rash, Sam; Rose, Jocelyn K. C.; Sakihama, Yasuko; Salamov, Asaf A.; Savidor, Alon; Scheuring, Chantel F.; Smith, Brian M.; Sobral, Bruno W. S.; Terry, Astrid; Torto-Alalibo, Trudy A.; Win, Joe; Xu, Zhanyou; Zhang, Hongbin; Grigoriev, Igor V.; Rokhsar, Daniel S.; Boore, Jeffrey L.

    2006-04-17

    Draft genome sequences have been determined for the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum. Oömycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors, and, in particular, a superfamily of 700 proteins with similarity to known oömycete avirulence genes.

  2. Whole Genome Sequencing of a Healthy Aging Cohort

    OpenAIRE

    Erikson, Galina A.; Bodian, Dale L.; Rueda, Manuel; Molparia, Bhuvan; Scott, Erick R.; Scott-Van Zeeland, Ashley A.; Topol, Sarah E.; Wineinger, Nathan E.; Niederhuber, John E.; Topol, Eric J.; Torkamani, Ali

    2016-01-01

    Studies of long-lived individuals have revealed few genetic mechanisms for protection against age-associated disease. Therefore, we pursued genome sequencing of a related phenotype – healthy aging – to understand the genetics of disease-free aging without medical intervention. In contrast with studies of exceptional longevity, usually focused on centenarians, healthy aging is not associated with known longevity variants but is associated with reduced genetic susceptibility to Alzheimer and co...

  3. Genome Sequence of the Human Pathogen Vibrio cholerae Amazonia

    Science.gov (United States)

    Thompson, Cristiane C.; Marin, Michel A.; Dias, Graciela M.; Dutilh, Bas E.; Edwards, Robert A.; Iida, Tetsuya; Thompson, Fabiano L.; Vicente, Ana Carolina P.

    2011-01-01

    Vibrio cholerae O1 Amazonia is a pathogen that was isolated from cholera-like diarrhea cases in at least two countries, Brazil and Ghana. Based on multilocus sequence analysis, this lineage belongs to a distinct profile compared to strains from El Tor and classical biotypes. The genomic analysis revealed that it contains Vibrio pathogenicity island 2 and a set of genes related to pathogenesis and fitness, such as the type VI secretion system, present in choleragenic V. cholerae strains. PMID:21952545

  4. Draft Genome Sequence of Lactobacillus plantarum wikim18, Isolated from Korean Kimchi

    Science.gov (United States)

    Jang, Ja Young; Lim, Hyeong In; Park, Hae Woong; Choi, Hak-Jong; Kim, Tae-Woon; Kang, Miran

    2014-01-01

    This report describes the draft genome sequence of Lactobacillus plantarum strain wikim18, isolated from the traditional Korean food kimchi. The reads generated by Ion Torrent PGM were assembled into 327 contigs. RAST annotation of the genome revealed 12 tRNAs and 3,316 protein-coding gene sequences. PMID:24855305

  5. Complete Genome Sequence of Photobacterium sp. Strain J15, Isolated from Seawater of Southwestern Johor, Malaysia.

    Science.gov (United States)

    Roslan, Noordiyanah Nadhirah; Sabri, Suriana; Oslan, Siti Nurbaya; Baharum, Syarul Nataqain; Leow, Thean Chor

    2016-07-28

    Here, we report the genome sequence of Photobacterium sp. strain J15, isolated from seawater in Johor, Malaysia, which can produce lipase and asparaginase. Analysis of the PacBio genome sequence of strain J15 revealed its potential to produce enzymes with different catalytic functions. Copyright © 2016 Roslan et al.

  6. Transforming clinical microbiology with bacterial genome sequencing.

    Science.gov (United States)

    Didelot, Xavier; Bowden, Rory; Wilson, Daniel J; Peto, Tim E A; Crook, Derrick W

    2012-09-01

    Whole-genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here, we review the current status of clinical microbiology and how it has already begun to be transformed by using next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties, such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. We predict that the application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow.

  7. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans

    DEFF Research Database (Denmark)

    Raghavan, Maanasa; Skoglund, Pontus; Graf, Kelly E.

    2014-01-01

    The origins of the First Americans remain contentious. Although Native Americans seem to be genetically most closely related to east Asians, there is no consensus with regard to which specific Old World populations they are closest to. Here we sequence the draft genome of an approximately 24,000-year-old individual (MA-1), from Mal'ta in south-central Siberia, to an average depth of 1×. To our knowledge this is the oldest anatomically modern human genome reported to date. The MA-1 mitochondrial genome belongs to haplogroup U, which has also been found at high frequency among Upper Palaeolithic ... that the region was continuously occupied by humans throughout the Last Glacial Maximum. Our findings reveal that western Eurasian genetic signatures in modern-day Native Americans derive not only from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the First Americans....

  8. An automated annotation tool for genomic DNA sequences using ...

    Indian Academy of Sciences (India)

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated ...

  10. The Whole-Genome Sequence of Bacillus velezensis Strain SB1216 Isolated from the Great Salt Plains of Oklahoma Reveals the Presence of a Novel Extracellular RNase with Antitumor Activity.

    Science.gov (United States)

    Marasini, Daya; Cornell, Carolyn R; Oyewole, Opeoluwa; Sheaff, Robert J; Fakhr, Mohamed K

    2017-11-22

    The whole-genome sequence of Bacillus velezensis strain SB1216, isolated from the Great Salt Plains of Oklahoma, showed the presence of a 3,814,720-bp circular chromosome and no plasmids. The presence of a novel 870-bp extracellular RNase gene is predicted to be responsible for this strain's antitumor activity. Copyright © 2017 Marasini et al.

  11. The Whole-Genome Sequence of Bacillus velezensis Strain SB1216 Isolated from the Great Salt Plains of Oklahoma Reveals the Presence of a Novel Extracellular RNase with Antitumor Activity

    OpenAIRE

    Marasini, Daya; Cornell, Carolyn R.; Oyewole, Opeoluwa; Sheaff, Robert J.; Fakhr, Mohamed K.

    2017-01-01

    The whole-genome sequence of Bacillus velezensis strain SB1216, isolated from the Great Salt Plains of Oklahoma, showed the presence of a 3,814,720-bp circular chromosome and no plasmids. The presence of a novel 870-bp extracellular RNase gene is predicted to be responsible for this strain’s antitumor activity.

  12. Draft Genome Sequence of Mycobacterium chimaera Type ...

    Science.gov (United States)

    We report the draft genome sequence of the type strain Mycobacterium chimaera Fl-0169T, a member of the Mycobacterium avium complex (MAC). M. chimaera Fl-0169T was isolated from a patient in Italy and is highly similar to strains of M. chimaera isolated in Ireland, though Fl-0169T possesses unique virulence genes. Evidence suggests that M. avium, M. intracellulare, and M. chimaera are differently virulent and a comparative genomic analysis is critically needed to identify diagnostic targets that reliably differentiate species of MAC. With treatment costs for Mycobacterium infections estimated to be >$1.8 B annually in the U.S., correct species identification will result in improved treatment selection, lower costs, and improved patient outcomes.

  13. Genomic signal processing for DNA sequence clustering

    Directory of Open Access Journals (Sweden)

    Gerardo Mendizabal-Ruiz

    2018-01-01

    Genomic signal processing (GSP) methods, which convert DNA data to numerical values, have recently been proposed; they offer the opportunity to apply existing digital signal processing methods to genomic data. One of the most widely used methods for exploring data is cluster analysis, which refers to the unsupervised classification of patterns in data. In this paper, we propose a novel approach for performing cluster analysis of DNA sequences that is based on the use of GSP methods and the K-means algorithm. We also propose a visualization method that facilitates the easy inspection and analysis of the results and possible hidden behaviors. Our results support the feasibility of employing the proposed method to find and easily visualize interesting features of sets of DNA data.

  14. Genomic signal processing for DNA sequence clustering.

    Science.gov (United States)

    Mendizabal-Ruiz, Gerardo; Román-Godínez, Israel; Torres-Ramos, Sulema; Salido-Ruiz, Ricardo A; Vélez-Pérez, Hugo; Morales, J Alejandro

    2018-01-01

    Genomic signal processing (GSP) methods which convert DNA data to numerical values have recently been proposed, which would offer the opportunity of employing existing digital signal processing methods for genomic data. One of the most used methods for exploring data is cluster analysis which refers to the unsupervised classification of patterns in data. In this paper, we propose a novel approach for performing cluster analysis of DNA sequences that is based on the use of GSP methods and the K-means algorithm. We also propose a visualization method that facilitates the easy inspection and analysis of the results and possible hidden behaviors. Our results support the feasibility of employing the proposed method to find and easily visualize interesting features of sets of DNA data.
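
    The pipeline described in this abstract (map DNA to numbers, then cluster with K-means) can be sketched as follows. This is only an illustrative sketch, not the authors' method: the abstract does not say which numerical mapping or features were used, so the Voss-style indicator mapping, the DFT power-spectrum features, the farthest-point seeding, and all function names here are assumptions. Sequences are assumed to have equal length.

```python
import cmath
import math

BASES = "ACGT"


def indicator_signals(seq):
    """Assumed mapping: one 0/1 indicator signal per nucleotide (Voss-style)."""
    return [[1.0 if ch == b else 0.0 for ch in seq] for b in BASES]


def power_spectrum(seq):
    """Summed |DFT|^2 of the four indicator signals for k = 1..N//2."""
    n = len(seq)
    signals = indicator_signals(seq)
    spectrum = []
    for k in range(1, n // 2 + 1):
        total = 0.0
        for signal in signals:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(signal))
            total += abs(coeff) ** 2
        spectrum.append(total / n)
    return spectrum


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain Lloyd iterations with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_sequences(seqs, k):
    """Cluster equal-length DNA strings by their power-spectrum features."""
    return kmeans([power_spectrum(s) for s in seqs], k)


# Toy input: two period-3 sequences and two period-2 sequences.
seqs = ["ACG" * 8, "ACG" * 7 + "ACT", "AT" * 12, "AT" * 11 + "AA"]
print(cluster_sequences(seqs, 2))
```

    On this toy input the period-3 and period-2 sequences fall into separate clusters, because their spectral peaks sit at different frequencies.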

  15. Genomic Analysis of Two Phylogenetically Distinct Nitrospira Species Reveals Their Genomic Plasticity and Functional Diversity.

    Science.gov (United States)

    Ushiki, Norisuke; Fujitani, Hirotsugu; Shimada, Yu; Morohoshi, Tomohiro; Sekiguchi, Yuji; Tsuneda, Satoshi

    2017-01-01

    The genus Nitrospira represents a dominant group of nitrite-oxidizing bacteria in natural and engineered ecosystems. This genus is phylogenetically divided into six lineages, for which vast phylogenetic and functional diversity has been revealed by recent molecular ecophysiological analyses. However, the genetic basis underlying these phenotypic differences remains largely unknown because of the lack of genome sequences representing their diversity. To gain a more comprehensive understanding of Nitrospira, we performed genomic comparisons between two Nitrospira strains (ND1 and NJ1 belonging to lineages I and II, respectively) previously isolated from activated sludge. In addition, the genomes of these strains were systematically compared with six previously reported Nitrospira genomes to reveal their similarity and the presence/absence of several functional genes/operons. Comparisons of Nitrospira genomes indicated that their genomic diversity reflects phenotypic differences and versatile nitrogen metabolisms. Although most genes involved in key metabolic pathways were conserved between strains ND1 and NJ1, assimilatory nitrite reduction pathways of the two Nitrospira strains were different. In addition, the genomes of both strains contain a phylogenetically different urease locus and we confirmed their ureolytic activity. During gene annotation of strain NJ1, we found a gene cluster encoding a quorum-sensing system. From the enriched supernatant of strain NJ1, we successfully identified seven types of acyl-homoserine lactones with a range of C10-C14. In addition, the genome of strain NJ1 lacks genes relevant to flagella and the clustered regularly interspaced short palindromic repeat (CRISPR)-Cas (CRISPR-associated genes) systems, whereas most nitrifying bacteria including strain ND1 possess these genomic elements. These findings enhance our understanding of genomic plasticity and functional diversity among members of the genus Nitrospira.

  16. PennCNV in whole-genome sequencing data

    OpenAIRE

    de Araújo Lima, Leandro; Wang, Kai

    2017-01-01

    Background: The use of high-throughput sequencing data has improved the results of genomic analysis due to the resolution of mapping algorithms. Although several tools for copy-number variation calling in whole genome sequencing have been published, the noisy nature of sequencing data is still a limitation for accuracy and concordance among such tools. To assess the performance of the original PennCNV algorithm for array data in whole genome sequencing data, we processed mapping (BAM) files to ext...

  17. The complete chloroplast genome sequence of Euonymus japonicus (Celastraceae).

    Science.gov (United States)

    Choi, Kyoung Su; Park, SeonJoo

    2016-09-01

    This study reports the complete chloroplast (cp) genome sequence of Euonymus japonicus, the first sequenced in the genus Euonymus. The total length was 157 637 bp, including a pair of 26 678 bp inverted repeat (IR) regions separated by a small single copy (SSC) region of 18 340 bp and a large single copy (LSC) region of 85 941 bp. The genome contains 107 unique genes, including 74 coding genes, four rRNA genes, and 29 tRNA genes. Seventeen genes contain introns, of which three (clpP, ycf3, and rps12) contain two introns. The maximum likelihood (ML) phylogenetic analysis revealed that E. japonicus was closely related to Manihot and Populus.
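
    The region lengths reported in this abstract can be checked against the stated total with a few lines of arithmetic. The sketch below assumes the usual quadripartite layout (LSC, SSC, and two IR copies); the function name is illustrative only.

```python
# Region lengths (bp) as reported in the abstract above.
LSC = 85_941            # large single copy region
SSC = 18_340            # small single copy region
IR = 26_678             # one inverted repeat; the genome carries two copies
REPORTED_TOTAL = 157_637


def quadripartite_length(lsc, ssc, ir):
    """Total length of a circular plastome laid out as LSC + IR + SSC + IR."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)

# Gene categories reported in the abstract: coding, rRNA, and tRNA genes.
unique_genes = 74 + 4 + 29
print(unique_genes)
```

    Both checks agree with the abstract: the four regions sum to 157 637 bp and the three gene categories sum to 107 unique genes.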

  18. The genome sequence of Blochmannia floridanus: Comparative analysis of reduced genomes

    Science.gov (United States)

    Gil, Rosario; Silva, Francisco J.; Zientz, Evelyn; Delmotte, François; González-Candelas, Fernando; Latorre, Amparo; Rausell, Carolina; Kamerbeek, Judith; Gadau, Jürgen; Hölldobler, Bert; van Ham, Roeland C. H. J.; Gross, Roy; Moya, Andrés

    2003-01-01

    Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely has a nutritional basis: Blochmannia is able to supply nitrogen and sulfur compounds to the host while it takes advantage of the host metabolic machinery. Remarkably, these bacteria lack all known genes involved in replication initiation (dnaA, priA, and recA). The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far. Comparative analysis of the five known genomes from insect endosymbiotic bacteria reveals they share only 313 genes, a number that may be close to the minimum gene set necessary to sustain endosymbiotic life. PMID:12886019
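
    The idea of a gene set shared by all compared genomes can be illustrated as a set intersection. The sketch below is not the authors' pipeline: it assumes orthologous genes have already been given common identifiers, and the genome names and gene labels are hypothetical toy data, not data from the study.

```python
from functools import reduce


def core_genes(gene_sets):
    """Return the genes present in every genome (the shared set)."""
    return reduce(set.intersection, (set(genes) for genes in gene_sets))


# Hypothetical toy inventories with made-up gene labels.
genomes = {
    "genome_A": {"gene1", "gene2", "gene3", "gene4"},
    "genome_B": {"gene1", "gene2", "gene4", "gene5"},
    "genome_C": {"gene1", "gene2", "gene3", "gene6"},
}

shared = core_genes(genomes.values())
print(sorted(shared))
```

    In a real comparison the hard part is the ortholog assignment that precedes this step; the intersection itself is trivial once identifiers are consistent.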

  19. Building the sequence map of the human pan-genome

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Li, Yingrui; Zheng, Hancheng

    2010-01-01

    Here we integrate the de novo assembly of an Asian and an African genome with the NCBI reference human genome, as a step toward constructing the human pan-genome. We identified approximately 5 Mb of novel sequences not present in the reference genome in each of these assemblies. Most novel ... analysis of predicted genes indicated that the novel sequences contain potentially functional coding regions. We estimate that a complete human pan-genome would contain approximately 19-40 Mb of novel sequence not present in the extant reference genome. The extensive amount of novel sequence contributing to the genetic variation of the pan-genome indicates the importance of using complete genome sequencing and de novo assembly....

  20. Initial sequencing and comparative analysis of the mouse genome.

    Science.gov (United States)

    Waterston, Robert H; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R; Brown, Daniel G; Brown, Stephen D; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T; Church, Deanna M; Clamp, Michele; Clee, Christopher; Collins, Francis S; Cook, Lisa L; Copley, Richard R; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D; Deri, Justin; Dermitzakis, Emmanouil T; Dewey, Colin; Dickens, Nicholas J; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M; Eddy, Sean R; Elnitski, Laura; Emes, Richard D; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A; Flicek, Paul; Foley, Karen; Frankel, Wayne N; Fulton, Lucinda A; Fulton, Robert S; Furey, Terrence S; Gage, Diane; Gibbs, Richard A; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A; Green, Eric D; Gregory, Simon; Guigó, Roderic; Guyer, Mark; Hardison, Ross C; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B; Johnson, L Steven; Jones, Matthew; Jones, Thomas A; Joy, Ann; Kamal, Michael; Karlsson, Elinor K; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W James; Kirby, Andrew; Kolbe, Diana L; Korf, Ian; Kucherlapati, Raju S; Kulbokas, Edward J; Kulp, David; Landers, Tom; Leger, J P; Leonard, Steven; Letunic, Ivica; Levine, Rosie; Li, Jia; Li, Ming; Lloyd, Christine; Lucas, Susan; Ma, Bin; Maglott, Donna R; Mardis, Elaine R; Matthews, Lucy; Mauceli, Evan; Mayer, John H; McCarthy, Megan; McCombie, W Richard; McLaren, Stuart; McLay, Kirsten; McPherson, John D; Meldrim, Jim; Meredith, Beverley; Mesirov, Jill P; Miller, Webb; Miner, Tracie L; Mongin, Emmanuel; Montgomery, Kate T; Morgan, Michael; Mott, Richard; Mullikin, James C; Muzny, Donna M; Nash, William E; Nelson, Joanne O; Nhan, Michael N; Nicol, Robert; Ning, Zemin; Nusbaum, Chad; O'Connor, Michael J; Okazaki, Yasushi; Oliver, Karen; Overton-Larty, Emma; Pachter, Lior; Parra, Genís; Pepin, Kymberlie H; Peterson, Jane; Pevzner, Pavel; Plumb, Robert; Pohl, Craig S; Poliakov, Alex; Ponce, Tracy C; Ponting, Chris P; Potter, Simon; Quail, Michael; Reymond, Alexandre; Roe, Bruce A; Roskin, Krishna M; Rubin, Edward M; Rust, Alistair G; Santos, Ralph; Sapojnikov, Victor; Schultz, Brian; Schultz, Jörg; Schwartz, Matthias S; Schwartz, Scott; Scott, Carol; Seaman, Steven; Searle, Steve; Sharpe, Ted; Sheridan, Andrew; Shownkeen, Ratna; Sims, Sarah; Singer, Jonathan B; Slater, Guy; Smit, Arian; Smith, Douglas R; Spencer, Brian; Stabenau, Arne; Stange-Thomann, Nicole; Sugnet, Charles; Suyama, Mikita; Tesler, Glenn; Thompson, Johanna; Torrents, David; Trevaskis, Evanne; Tromp, John; Ucla, Catherine; Ureta-Vidal, Abel; Vinson, Jade P; Von Niederhausern, Andrew C; Wade, Claire M; Wall, Melanie; Weber, Ryan J; Weiss, Robert B; Wendl, Michael C; West, Anthony P; Wetterstrand, Kris; Wheeler, Raymond; Whelan, Simon; Wierzbowski, Jamey; Willey, David; Williams, Sophie; Wilson, Richard K; Winter, Eitan; Worley, Kim C; Wyman, Dudley; Yang, Shan; Yang, Shiaw-Pyng; Zdobnov, Evgeny M; Zody, Michael C; Lander, Eric S

    2002-12-05

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  1. Draft Genome Sequence of the Fungus Trametes hirsuta 072.

    Science.gov (United States)

    Pavlov, Andrey R; Tyazhelova, Tatiana V; Moiseenko, Konstantin V; Vasina, Daria V; Mosunova, Olga V; Fedorova, Tatiana V; Maloshenok, Lilya G; Landesman, Elena O; Bruskin, Sergei A; Psurtseva, Nadezhda V; Slesarev, Alexei I; Kozyavkin, Sergei A; Koroleva, Olga V

    2015-11-19

    A standard draft genome sequence of the white rot saprotrophic fungus Trametes hirsuta 072 (Basidiomycota, Polyporales) is presented. The genome sequence contains about 33.6 Mb assembled in 141 scaffolds with a G+C content of ~57.6%. The draft genome annotation predicts 14,598 putative protein-coding open reading frames (ORFs). Copyright © 2015 Pavlov et al.

  2. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    LENUS (Irish Health Repository)

    Potnis, Neha

    2011-03-11

    Background: Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv) has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results: We sequenced the genomes of X. vesicatoria (Xv) strain 1111 (ATCC 35937), X. perforans (Xp) strain 91-118 and X. gardneri (Xg) strain 101 (ATCC 19865). The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, molecular features that may be required for pathogenicity were predicted, including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions: Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the lipopolysaccharide cluster

  3. Rapid whole genome sequencing and precision neonatology.

    Science.gov (United States)

    Petrikin, Joshua E; Willig, Laurel K; Smith, Laurie D; Kingsmore, Stephen F

    2015-12-01

    Traditionally, genetic testing has been too slow, or perceived as too impractical, for the initial management of the critically ill neonate. Technological advances have led to the ability to sequence and interpret the entire genome of a neonate in as little as 26 h. As the cost and turnaround time of testing decrease, the utility of whole genome sequencing (WGS) of neonates for acute and latent genetic illness increases. Analyzing the entire genome allows for concomitant evaluation of the currently identified 5588 single gene diseases. When applied to a select population of ill infants in a level IV neonatal intensive care unit, WGS yielded a diagnosis of a causative genetic disease in 57% of patients. These diagnoses may lead to clinical management changes ranging from transition to palliative care for uniformly lethal conditions to alteration or initiation of medical or surgical therapy to improve outcomes in others. Thus, institution of 2-day WGS at time of acute presentation opens the possibility of early implementation of precision medicine. This implementation may create opportunities for early interventional, frequently novel or off-label therapies that may alter disease trajectory in infants with what would otherwise be fatal disease. Widespread deployment of rapid WGS and precision medicine will raise ethical issues pertaining to interpretation of variants of unknown significance, discovery of incidental findings related to adult onset conditions and carrier status, and implementation of medical therapies for which little is known in terms of risks and benefits. Despite these challenges, precision neonatology has significant potential both to decrease infant mortality related to genetic diseases with onset in newborns and to facilitate parental decision making regarding transition to palliative care. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome

    OpenAIRE

    Margulies, Elliott H.; Cooper, Gregory M.; Asimenos, George; Thomas, Daryl J.; Dewey, Colin N.; Siepel, Adam; Birney, Ewan; Keefe, Damian; Schwartz, Ariel S.; Hou, Minmei; Taylor, James; Nikolaev, Sergey; Montoya-Burgos, Juan I.; Löytynoja, Ari; Whelan, Simon

    2007-01-01

    A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequenc...

  5. Next Generation Sequencing at the University of Chicago Genomics Core

    Energy Technology Data Exchange (ETDEWEB)

    Faber, Pieter [University of Chicago

    2013-04-24

    The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to state-of-the-art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

  6. Complete genome sequence of Pectobacterium carotovorum subsp. carotovorum bacteriophage My1.

    Science.gov (United States)

    Lee, Dong Hwan; Lee, Ju-Hoon; Shin, Hakdong; Ji, Samnyu; Roh, Eunjung; Jung, Kyusuk; Ryu, Sangryeol; Choi, Jaehyuk; Heu, Sunggi

    2012-10-01

    Pectobacterium carotovorum subsp. carotovorum, a member of the Enterobacteriaceae family, is an important plant-pathogenic bacterium causing significant economic losses worldwide. P. carotovorum subsp. carotovorum bacteriophage My1 was isolated from a soil sample. Its genome was completely sequenced and analyzed for the development of an effective biological control agent. Sequence and morphological analyses revealed that phage My1 is a T5-like bacteriophage and belongs to the family Siphoviridae. To date, there is no report of a Pectobacterium-targeting siphovirus genome sequence. Here, we announce the complete genome sequence of phage My1 and report the results of our analysis.

  7. Comparison of two Next Generation sequencing platforms for full genome sequencing of Classical Swine Fever Virus

    DEFF Research Database (Denmark)

    Fahnøe, Ulrik; Pedersen, Anders Gorm; Höper, Dirk

    2013-01-01

    ... to the consensus sequence. Additionally, we got an average sequence depth for the genome of 4000 for the Ion Torrent PGM and 400 for the FLX platform, making the mapping suitable for single nucleotide variant (SNV) detection. The analysis revealed a single non-silent SNV, A10665G, leading to the amino acid change D3431G in the RNA-dependent RNA polymerase NS5B. This SNV was present at 100% frequency in the 12th passage and only at 55% in the 4th passage, which could explain the difference in growth kinetics between the passages....
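
    The per-passage SNV frequencies quoted above are, in general, the fraction of reads at a position that carry the variant base. A minimal sketch of that calculation follows; the read counts are hypothetical values chosen only to mirror the reported 55% and 100% figures, and the function name is illustrative, not taken from the study.

```python
def variant_frequency(base_counts, variant_base):
    """Fraction of reads at one position that carry the variant base."""
    depth = sum(base_counts.values())
    if depth == 0:
        raise ValueError("no reads cover this position")
    return base_counts.get(variant_base, 0) / depth


# Hypothetical pileup counts at one genome position (not data from the study).
passage_4 = {"A": 1800, "G": 2200}    # mixed population
passage_12 = {"G": 4000}              # variant fixed

print(variant_frequency(passage_4, "G"))
print(variant_frequency(passage_12, "G"))
```

    Higher sequencing depth narrows the uncertainty on such a frequency, which is one reason depth matters for SNV detection.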

  8. Draft Genome Sequence of Bacillus amyloliquefaciens B-1895

    OpenAIRE

    Karlyshev, Andrey V.; Melnikov, Vyacheslav G.; Chistyakov, Vladimir A.

    2014-01-01

    In this report, we present a draft genome sequence of Bacillus amyloliquefaciens strain B-1895. Comparison with the genome of a reference strain demonstrated similar overall organization, as well as differences involving large gene clusters.

  9. The house spider genome reveals an ancient whole-genome duplication during arachnid evolution.

    Science.gov (United States)

    Schwager, Evelyn E; Sharma, Prashant P; Clarke, Thomas; Leite, Daniel J; Wierschin, Torsten; Pechmann, Matthias; Akiyama-Oda, Yasuko; Esposito, Lauren; Bechsgaard, Jesper; Bilde, Trine; Buffry, Alexandra D; Chao, Hsu; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dugan, Shannon; Eibner, Cornelius; Extavour, Cassandra G; Funch, Peter; Garb, Jessica; Gonzalez, Luis B; Gonzalez, Vanessa L; Griffiths-Jones, Sam; Han, Yi; Hayashi, Cheryl; Hilbrant, Maarten; Hughes, Daniel S T; Janssen, Ralf; Lee, Sandra L; Maeso, Ignacio; Murali, Shwetha C; Muzny, Donna M; Nunes da Fonseca, Rodrigo; Paese, Christian L B; Qu, Jiaxin; Ronshaugen, Matthew; Schomburg, Christoph; Schönauer, Anna; Stollewerk, Angelika; Torres-Oliva, Montserrat; Turetzek, Natascha; Vanthournout, Bram; Werren, John H; Wolff, Carsten; Worley, Kim C; Bucher, Gregor; Gibbs, Richard A; Coddington, Jonathan; Oda, Hiroki; Stanke, Mario; Ayoub, Nadia A; Prpic, Nikola-Michael; Flot, Jean-François; Posnien, Nico; Richards, Stephen; McGregor, Alistair P

    2017-07-31

    The duplication of genes can occur through various mechanisms and is thought to make a major contribution to the evolutionary diversification of organisms. There is increasing evidence for a large-scale duplication of genes in some chelicerate lineages including two rounds of whole genome duplication (WGD) in horseshoe crabs. To investigate this further, we sequenced and analyzed the genome of the common house spider Parasteatoda tepidariorum. We found pervasive duplication of both coding and non-coding genes in this spider, including two clusters of Hox genes. Analysis of synteny conservation across the P. tepidariorum genome suggests that there has been an ancient WGD in spiders. Comparison with the genomes of other chelicerates, including that of the newly sequenced bark scorpion Centruroides sculpturatus, suggests that this event occurred in the common ancestor of spiders and scorpions, and is probably independent of the WGDs in horseshoe crabs. Furthermore, characterization of the sequence and expression of the Hox paralogs in P. tepidariorum suggests that many have been subject to neo-functionalization and/or sub-functionalization since their duplication. Our results reveal that spiders and scorpions are likely the descendants of a polyploid ancestor that lived more than 450 MYA. Given the extensive morphological diversity and ecological adaptations found among these animals, rivaling those of vertebrates, our study of the ancient WGD event in Arachnopulmonata provides a new comparative platform to explore common and divergent evolutionary outcomes of polyploidization events across eukaryotes.

  10. The diploid genome sequence of an Asian individual

    DEFF Research Database (Denmark)

    Wang, Jun; Wang, Wei; Li, Ruiqiang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we...... used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP...... identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J...

  11. Whole Genome Sequencing: Innovation Dream or Privacy Nightmare?

    OpenAIRE

    De Cristofaro, Emiliano

    2012-01-01

    Over the past several years, DNA sequencing has emerged as one of the driving forces in life-sciences, paving the way for affordable and accurate whole genome sequencing. As genomes represent the entirety of an organism's hereditary information, the availability of complete human genomes prompts a wide range of revolutionary applications. The hope for improving modern healthcare and better understanding the human genome propels many interesting and challenging research frontiers. Unfortunatel...

  12. Reconstructing cancer genomes from paired-end sequencing data

    Directory of Open Access Journals (Sweden)

    Oesper Layla

    2012-04-01

    Background: A cancer genome is derived from the germline genome through a series of somatic mutations. Somatic structural variants - including duplications, deletions, inversions, translocations, and other rearrangements - result in a cancer genome that is a scrambling of intervals, or "blocks" of the germline genome sequence. We present an efficient algorithm for reconstructing the block organization of a cancer genome from paired-end DNA sequencing data. Results: By aligning paired reads from a cancer genome - and a matched germline genome, if available - to the human reference genome, we derive: (i) a partition of the reference genome into intervals; (ii) adjacencies between these intervals in the cancer genome; (iii) an estimated copy number for each interval. We formulate the Copy Number and Adjacency Genome Reconstruction Problem of determining the cancer genome as a sequence of the derived intervals that is consistent with the measured adjacencies and copy numbers. We design an efficient algorithm, called Paired-end Reconstruction of Genome Organization (PREGO), to solve this problem by reducing it to an optimization problem on an interval-adjacency graph constructed from the data. The solution to the optimization problem results in an Eulerian graph, containing an alternating Eulerian tour that corresponds to a cancer genome that is consistent with the sequencing data. We apply our algorithm to five ovarian cancer genomes that were sequenced as part of The Cancer Genome Atlas. We identify numerous rearrangements, or structural variants, in these genomes, analyze reciprocal vs. non-reciprocal rearrangements, and identify rearrangements consistent with known mechanisms of duplication such as tandem duplications and breakage/fusion/bridge (B/F/B) cycles. Conclusions: We demonstrate that PREGO efficiently identifies complex and biologically relevant rearrangements in cancer genome sequencing data. An implementation of the PREGO algorithm is

  13. Genome sequencing and annotation of Stenotrophomonas sp. SAM8

    Directory of Open Access Journals (Sweden)

    Samy Selim

    2015-12-01

    We report the draft genome sequence of Stenotrophomonas sp. strain SAM8, isolated from environmental water. The draft genome size is 3,665,538 bp with a G + C content of 67.2% and contains 6 rRNA sequences (single copies of 5S, 16S & 23S rRNA). The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LDAV00000000.
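
    The G + C content quoted in this record is the fraction of called bases that are G or C. A minimal sketch of that calculation follows; the function name `gc_content` and the toy contig are hypothetical illustrations, not taken from the record, and the choice to ignore ambiguous bases such as N is an assumption.

```python
def gc_content(sequence: str) -> float:
    """Return the G + C fraction of a DNA sequence.

    Assumption: ambiguous bases (for example N) are excluded from the
    denominator, so only A, C, G and T are counted.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    called = gc + at
    if called == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / called


# Hypothetical toy contig, used only to show the call.
toy_contig = "GCGCCGATGCNNGCCGTA"
print(f"G + C content: {gc_content(toy_contig):.1%}")
```

    For a multi-contig draft assembly, the same counts would be summed over all contigs before dividing.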

  14. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Lincoln D Stein

    2003-11-01

    The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C

  15. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions.

    Directory of Open Access Journals (Sweden)

    Wei Li

    Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are cost-wise inefficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of genome termed Selected Target Regions (SeTRs). In our experiments, the SeTRs are covered by 99.73%~99.95% with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information.

  16. Genomic Resources for Water Yam (Dioscorea alata L.): Analyses of EST-Sequences, De Novo Sequencing and GBS Libraries.

    Science.gov (United States)

    Saski, Christopher A; Bhattacharjee, Ranjana; Scheffler, Brian E; Asiedu, Robert

    2015-01-01

    The reducing cost and rapid progress in next-generation sequencing techniques coupled with high performance computational approaches have resulted in large-scale discovery of advanced genomic resources in several model and non-model plant species. Yam (Dioscorea spp.) is a major food and cash crop in many countries but research efforts have been limited to understand the genetics and generate genomic information for the crop. The availability of a large number of genomic resources including genome-wide molecular markers will accelerate the breeding efforts and application of genomic selection in yams. In the present study, several methods including expressed sequence tags (EST)-sequencing, de novo sequencing, and genotyping-by-sequencing (GBS) profiles on two yam (Dioscorea alata L.) genotypes (TDa 95/00328 and TDa 95-310) was performed to generate genomic resources for use in its improvement programs. This includes a comprehensive set of EST-SSRs, genomic SSRs, whole genome SNPs, and reduced representation SNPs. A total of 1,152 EST-SSRs were developed from >40,000 EST-sequences generated from the two genotypes. A set of 388 EST-SSRs were validated as polymorphic showing a polymorphism rate of 34% when tested on two diverse parents targeted for anthracnose disease. In addition, approximately 40X de novo whole genome sequence coverage was generated for each of the two genotypes, and a total of 18,584 and 15,952 genomic SSRs were identified for TDa 95/00328 and TDa 95-310, respectively. A custom made pipeline resulted in the selection of 573 genomic SSRs common across the two genotypes, of which only eight failed, 478 being polymorphic and 62 monomorphic indicating a polymorphic rate of 83.5%. Additionally, 288,505 high quality SNPs were also identified between these two genotypes. Genotyping by sequencing reads on these two genotypes also revealed 36,790 overlapping SNP positions that are distributed throughout the genome. 
Our efforts in using different approaches

  17. Evolutionary insights from suffix array-based genome sequence ...

    Indian Academy of Sciences (India)

    2007-08-06

    Keywords: Biological language modelling toolkit (BLMT); genome sequence analysis; n-grams; pattern matching; suffix arrays; suffix trees; short peptide sequences genetic code bias ...

  18. Genome Sequence of Lactobacillus plantarum Strain UCMA 3037

    OpenAIRE

    Naz, Saima; Tareb, Raouf; Bernardeau, Marion; Vaisse, Melissa; Lucchetti-Miganeh, Celine; Rechenmann, Mathias; Vernoux, Jean-Paul

    2013-01-01

    Nucleic acid of the strain Lactobacillus plantarum UCMA 3037, isolated from raw milk camembert cheese in our laboratory, was sequenced. We present its draft genome sequence with the aim of studying its functional properties and relationship to the cheese ecosystem.

  19. A Snapshot of the Emerging Tomato Genome Sequence

    Directory of Open Access Journals (Sweden)

    Lukas A. Mueller

    2009-03-01

    The genome of tomato (Solanum lycopersicum L.) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy, and the United States) as part of the larger “International Solanaceae Genome Project (SOL): Systems Approach to Diversity and Adaptation” initiative. The tomato genome sequencing project uses an ordered bacterial artificial chromosome (BAC) approach to generate a high-quality tomato euchromatic genome sequence for use as a reference genome for the Solanaceae and euasterids. Sequence is deposited at GenBank and at the SOL Genomics Network (SGN). Currently, there are around 1000 BACs finished or in progress, representing more than a third of the projected euchromatic portion of the genome. An annotation effort is also underway by the International Tomato Annotation Group. The expected number of genes in the euchromatin is ∼40,000, based on an estimate from a preliminary annotation of 11% of finished sequence. Here, we present this first snapshot of the emerging tomato genome and its annotation, a short comparison with potato (Solanum tuberosum L.) sequence data, and the tools available for researchers to exploit this new resource. In the future, whole-genome shotgun techniques will be combined with the BAC-by-BAC approach to cover the entire tomato genome. The high-quality reference euchromatic tomato sequence is expected to be near completion by 2010.

  20. Differential metabolism of Mycoplasma species as revealed by their genomes

    Directory of Open Access Journals (Sweden)

    Fabricio B.M. Arraes

    2007-01-01

    The annotation and comparative analyses of the genomes of Mycoplasma synoviae and Mycoplasma hyopneumoniae, as well as of other Mollicutes (a group of bacteria devoid of a rigid cell wall), have set the grounds for a global understanding of their metabolism and infection mechanisms. According to the annotation data, M. synoviae and M. hyopneumoniae are able to perform glycolytic metabolism, but do not possess the enzymatic machinery for citrate and glyoxylate cycles, gluconeogenesis and the pentose phosphate pathway. Both can synthesize ATP by lactic fermentation, but only M. synoviae can convert acetaldehyde to acetate. Also, our genome analysis revealed that M. synoviae and M. hyopneumoniae are not expected to synthesize polysaccharides, but they can take up a variety of carbohydrates via the phosphoenolpyruvate-dependent phosphotransferase system (PEP-PTS). Our data showed that these two organisms are unable to synthesize purine and pyrimidine de novo, since they only possess the sequences which encode salvage pathway enzymes. Comparative analyses of M. synoviae and M. hyopneumoniae with other Mollicutes have revealed differential genes in the former two genomes coding for enzymes that participate in carbohydrate, amino acid and nucleotide metabolism and host-pathogen interaction. The identification of these metabolic pathways will provide a better understanding of the biology and pathogenicity of these organisms.

  1. A plant pathology perspective of fungal genome sequencing.

    Science.gov (United States)

    Aylward, Janneke; Steenkamp, Emma T; Dreyer, Léanne L; Roets, Francois; Wingfield, Brenda D; Wingfield, Michael J

    2017-06-01

    The majority of plant pathogens are fungi and many of these adversely affect food security. This mini-review aims to provide an analysis of the plant pathogenic fungi for which genome sequences are publicly available, to assess their general genome characteristics, and to consider how genomics has impacted plant pathology. A list of sequenced fungal species was assembled, the taxonomy of all species verified, and the potential reason for sequencing each of the species considered. The genomes of 1090 fungal species are currently (October 2016) in the public domain and this number is rapidly rising. Pathogenic species comprised the largest category (35.5%) and, amongst these, plant pathogens are predominant. Of the 191 plant pathogenic fungal species with available genomes, 61.3% cause diseases on food crops, more than half of which are staple crops. The genomes of plant pathogens are slightly larger than those of other fungal species sequenced to date and they contain fewer coding sequences in relation to their genome size. Both of these factors can be attributed to the expansion of repeat elements. Sequenced genomes of plant pathogens provide blueprints from which potential virulence factors were identified and from which genes associated with different pathogenic strategies could be predicted. Genome sequences have also made it possible to evaluate adaptability of pathogen genomes and genomic regions that experience selection pressures. Some genomic patterns, however, remain poorly understood and plant pathogen genomes alone are not sufficient to unravel complex pathogen-host interactions. Genomes, therefore, cannot replace experimental studies that can be complex and tedious. Ultimately, the most promising application lies in using fungal plant pathogen genomics to inform disease management and risk assessment strategies. This will ultimately minimize the risks of future disease outbreaks and assist in preparation for emerging pathogen outbreaks.

  2. Complete genome sequence of Enterococcus faecium strain TX16 and comparative genomic analysis of Enterococcus faecium genomes

    Science.gov (United States)

    2012-01-01

    Background Enterococci are among the leading causes of hospital-acquired infections in the United States and Europe, with Enterococcus faecalis and Enterococcus faecium being the two most common species isolated from enterococcal infections. In the last decade, the proportion of enterococcal infections caused by E. faecium has steadily increased compared to other Enterococcus species. Although the underlying mechanism for the gradual replacement of E. faecalis by E. faecium in the hospital environment is not yet understood, many studies using genotyping and phylogenetic analysis have shown the emergence of a globally dispersed polyclonal subcluster of E. faecium strains in clinical environments. Systematic study of the molecular epidemiology and pathogenesis of E. faecium has been hindered by the lack of closed, complete E. faecium genomes that can be used as references. Results In this study, we report the complete genome sequence of the E. faecium strain TX16, also known as DO, which belongs to multilocus sequence type (ST) 18, and was the first E. faecium strain ever sequenced. Whole genome comparison of the TX16 genome with 21 E. faecium draft genomes confirmed that most clinical, outbreak, and hospital-associated (HA) strains (including STs 16, 17, 18, and 78), in addition to strains of non-hospital origin, group in the same clade (referred to as the HA clade) and are evolutionally considerably more closely related to each other by phylogenetic and gene content similarity analyses than to isolates in the community-associated (CA) clade with approximately a 3–4% average nucleotide sequence difference between the two clades at the core genome level. Our study also revealed that many genomic loci in the TX16 genome are unique to the HA clade. 380 ORFs in TX16 are HA-clade specific and antibiotic resistance genes are enriched in HA-clade strains. 
Mobile elements such as IS16 and transposons were also found almost exclusively in HA strains, as previously reported

  3. Single-Cell (Meta-)Genomics of a Dimorphic Candidatus Thiomargarita nelsonii Reveals Genomic Plasticity

    Directory of Open Access Journals (Sweden)

    Beverly E. Flood

    2016-05-01

    The genus Thiomargarita includes the world’s largest bacteria. But as uncultured organisms, their physiology, metabolism, and basis for their gigantism are not well understood. Thus a genomics approach, applied to a single Candidatus Thiomargarita nelsonii cell, was employed to explore the genetic potential of one of these enigmatic giant bacteria. The Thiomargarita cell was obtained from an assemblage of budding Ca. T. nelsonii attached to a provannid gastropod shell from Hydrate Ridge, a methane seep offshore of Oregon, USA. Here we present a manually curated genome of Bud S10 resulting from a hybrid assembly of long Pacific Biosciences and short Illumina sequencing reads. With respect to inorganic carbon fixation and sulfur oxidation pathways, the Ca. T. nelsonii Hydrate Ridge Bud S10 genome was similar to marine sister taxa within the family Beggiatoaceae. However, the Bud S10 genome contains genes suggestive of the genetic potential for lithotrophic growth on arsenite and perhaps hydrogen. The genome also revealed that Bud S10 likely respires nitrate via two pathways: a complete denitrification pathway and a dissimilatory nitrate reduction to ammonia pathway. Both pathways have been predicted, but not previously fully elucidated, in the genomes of other large, vacuolated, sulfur-oxidizing bacteria. Surprisingly, the genome also had a high number of unusual features for a bacterium, including the largest number of metacaspases and introns ever reported in a bacterium. Also present are a large number of other mobile genetic elements, such as insertion sequence transposable elements and miniature inverted-repeat transposable elements (MITEs). In some cases, mobile genetic elements disrupted key genes in metabolic pathways. For example, a MITE interrupts hupL, which encodes the large subunit of the hydrogenase in hydrogen oxidation. Moreover, we detected a group I intron in one of the most critical genes in the sulfur oxidation pathway, dsr

  4. A genome-wide BAC-end sequence survey provides first insights into sweetpotato (Ipomoea batatas (L.) Lam.) genome composition.

    Science.gov (United States)

    Si, Zengzhi; Du, Bing; Huo, Jinxi; He, Shaozhen; Liu, Qingchang; Zhai, Hong

    2016-11-21

    Sweetpotato, Ipomoea batatas (L.) Lam., is an important food crop widely grown in the world. However, little is known about the genome of this species because it is a highly heterozygous hexaploid. Gaining a more in-depth knowledge of sweetpotato genome is therefore necessary and imperative. In this study, the first bacterial artificial chromosome (BAC) library of sweetpotato was constructed. Clones from the BAC library were end-sequenced and analyzed to provide genome-wide information about this species. The BAC library contained 240,384 clones with an average insert size of 101 kb and had a 7.93-10.82 × coverage of the genome, and the probability of isolating any single-copy DNA sequence from the library was more than 99%. Both ends of 8310 BAC clones randomly selected from the library were sequenced to generate 11,542 high-quality BAC-end sequences (BESs), with an accumulative length of 7,595,261 bp and an average length of 658 bp. Analysis of the BESs revealed that 12.17% of the sweetpotato genome were known repetitive DNA, including 7.37% long terminal repeat (LTR) retrotransposons, 1.15% Non-LTR retrotransposons and 1.42% Class II DNA transposons etc., 18.31% of the genome were identified as sweetpotato-unique repetitive DNA and 10.00% of the genome were predicted to be coding regions. In total, 3,846 simple sequences repeats (SSRs) were identified, with a density of one SSR per 1.93 kb, from which 288 SSRs primers were designed and tested for length polymorphism using 20 sweetpotato accessions, 173 (60.07%) of them produced polymorphic bands. Sweetpotato BESs had significant hits to the genome sequences of I. trifida and more matches to the whole-genome sequences of Solanum lycopersicum than those of Vitis vinifera, Theobroma cacao and Arabidopsis thaliana. The first BAC library for sweetpotato has been successfully constructed. The high quality BESs provide first insights into sweetpotato genome composition, and have significant hits to the genome
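
    The library figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes the Clarke-Carbon relation P = 1 - (1 - I/G)^N for the probability of recovering a single-copy sequence; the abstract does not say which formula the authors used, and the genome sizes here are only those implied by the reported clone count, insert size and coverage range. Function names are hypothetical.

```python
def implied_genome_size_bp(n_clones: int, insert_bp: float, coverage: float) -> float:
    """Genome size implied by coverage = n_clones * insert_bp / genome_bp."""
    return n_clones * insert_bp / coverage


def recovery_probability(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Clarke-Carbon probability that a given single-copy sequence is in the library.

    Assumption: clones are sampled uniformly and independently from the genome.
    """
    return 1.0 - (1.0 - insert_bp / genome_bp) ** n_clones


# Figures reported in the abstract.
N_CLONES = 240_384
INSERT_BP = 101_000          # average insert size of 101 kb
COVERAGES = (7.93, 10.82)    # reported coverage range

for coverage in COVERAGES:
    genome_bp = implied_genome_size_bp(N_CLONES, INSERT_BP, coverage)
    p = recovery_probability(N_CLONES, INSERT_BP, genome_bp)
    print(f"coverage {coverage}x: implied genome {genome_bp / 1e9:.2f} Gb, P = {p:.4f}")

# Mean BAC-end sequence length and SSR primer polymorphism rate from the same abstract.
mean_bes_bp = 7_595_261 / 11_542
polymorphic_fraction = 173 / 288
print(f"mean BES length {mean_bes_bp:.0f} bp, polymorphic primers {polymorphic_fraction:.2%}")
```

    Under these assumptions even the lower coverage bound gives a recovery probability above 99%, in line with the abstract's statement.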

  5. The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    Directory of Open Access Journals (Sweden)

    Nicholas R Thomson

    2006-12-01

    The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype O:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common

  6. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    LENUS (Irish Health Repository)

    Edwards, Ceiridwen J

    2010-01-01

    BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738 ± 68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously identified

  7. The genomic scrapheap challenge; extracting relevant data from unmapped whole genome sequencing reads, including strain specific genomic segments, in rats

    NARCIS (Netherlands)

    Van Der Weide, Robin H.; Simonis, Marieke; Hermsen, Roel; Toonen, Pim; Cuppen, Edwin; de Ligt, Joep

    2016-01-01

    Unmapped next-generation sequencing reads are typically ignored while they contain biologically relevant information. We systematically analyzed unmapped reads from whole genome sequencing of 33 inbred rat strains. High quality reads were selected and enriched for biologically relevant sequences;

  8. Group-specific amplification of HLA-DQA1 revealed a number of genomic full-length sequences including the novel HLA alleles DQA1*01:10 and DQA1*01:11.

    Science.gov (United States)

    Witter, K; Halliwell, J A; Mautner, J; Jolesch, A; von Welser, G; Rampp, I; Spannagl, M; Kauke, T; Dick, A

    2014-01-01

    In this article, we describe a subgroup-specific amplification assay for HLA-DQA1 that encompasses the whole coding region and allows us to sequence full-length HLA-DQA1 genes. We introduce the novel alleles HLA-DQA1*01:10 and HLA-DQA1*01:11. Moreover, we were able to confirm the full-length genomic sequence data of the alleles HLA-DQA1*01:07, HLA-DQA1*03:01:01, HLA-DQA1*03:02, HLA-DQA1*04:01:02, HLA-DQA1*04:02, HLA-DQA1*05:03, HLA-DQA1*05:05:01:02 and HLA-DQA1*06:01:01. A complete genomic overview of all six HLA-DQA1 allele groups is now available from the submission of our data to the IMGT/HLA database. Because our approach facilitates the analysis of all HLA-DQA1 allele sequences, HLA-DQA1 may become the first HLA locus from which all subgroup members will be known in detail in the near future. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  9. The secondary structure of the 5′ end of the FIV genome reveals a long-range interaction between R/U5 and gag sequences, and a large, stable stem–loop

    Science.gov (United States)

    Kenyon, Julia C.; Ghazawi, Akela; Cheung, Winsome K.S.; Phillip, Pretty S.; Rizvi, Tahir A.; Lever, Andrew M.L.

    2008-01-01

    Feline immunodeficiency virus (FIV) is a lentivirus that infects cats and is related to human immunodeficiency virus (HIV). Although it is a common worldwide infection, and has potential uses as a human gene therapy vector and as a nonprimate model for HIV infection, little detail is known of the viral life cycle. Previous experiments have shown that its packaging signal includes two or more regions within the first 511 nucleotides of the genomic RNA. We have undertaken a secondary structural analysis of this RNA by minimal free-energy structural prediction, biochemical mapping, and phylogenetic analysis, and show that it contains five conserved stem–loops and a conserved long-range interaction between heptanucleotide sequences 5′-CCCUGUC-3′ in R/U5 and 5′-GACAGGG-3′ in gag. This long-range interaction is similar to that seen in primate lentiviruses where it is thought to be functionally important. Along with strains that infect domestic cats, this heptanucleotide interaction can also occur in species-specific FIV strains that infect pumas, lions, and Pallas' cats where the heptanucleotide sequences involved vary. We have analyzed spliced and genomic FIV RNAs and see little structural change or sequence conservation within single-stranded regions of the 5′ UTR that are important for viral packaging, suggesting that FIV may employ a cotranslational packaging mechanism. PMID:18974279
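
    The long-range interaction described in this abstract relies on the two heptanucleotides being able to base-pair. A minimal check, assuming only canonical Watson-Crick pairs (G-U wobble pairs are deliberately left out), is to confirm that one sequence is the reverse complement of the other. The function name below is a hypothetical illustration.

```python
# Canonical RNA base-pairing table (A-U, C-G); wobble pairs are not modeled.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return seq.upper().translate(RNA_COMPLEMENT)[::-1]


# Heptanucleotides quoted in the abstract, both written 5' to 3'.
R_U5_HEPTAMER = "CCCUGUC"
GAG_HEPTAMER = "GACAGGG"

fully_complementary = reverse_complement_rna(R_U5_HEPTAMER) == GAG_HEPTAMER
print(f"R/U5 and gag heptamers fully complementary: {fully_complementary}")
```

    The same function could be applied to the variant heptanucleotides of other FIV strains mentioned in the abstract, although those sequences are not given here.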

  10. Microbial genome sequencing using optical mapping and Illumina sequencing

    Science.gov (United States)

    Introduction Optical mapping is a technique in which strands of genomic DNA are digested with one or more restriction enzymes, and a physical map of the genome constructed from the resulting image. In outline, genomic DNA is extracted from a pure culture, linearly arrayed on a specialized glass sli...

  11. An overview of wheat genome sequencing and its implications for ...

    Indian Academy of Sciences (India)

    Taken together, the availability of high-quality reference genome and transcriptome data would invariably expedite the crop improvement programmes. Using the existing genetic and genomic resources and tools, wheat researchers shall be able to integrate and apply the genome sequence information to explore the ...

  12. Why size really matters when sequencing plant genomes

    Czech Academy of Sciences Publication Activity Database

    Kelly, L.J.; Leitch, A.R.; Fay, M. F.; Renny-Byfield, S.; Pellicer, J.; Macas, Jiří; Leitch, I.J.

    2012-01-01

    Roč. 5, č. 4 (2012), s. 415-425 ISSN 1755-0874 Institutional research plan: CEZ:AV0Z50510513 Institutional support: RVO:60077344 Keywords : C-value * genome assembly * genome size evolution * genome sequencing Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 0.924, year: 2012

  13. Genome sequencing and annotation of Cellulomonas sp. HZM

    OpenAIRE

    Chua, Patric; Har, Zi Mei; Austin, Christopher M.; Yule, Catherine M.; Dykes, Gary A.; Lee, Sui Mae

    2015-01-01

    We report the draft genome sequence of Cellulomonas sp. HZM, isolated from a tropical peat swamp forest. The draft genome size is 3,559,280 bp with a G + C content of 73% and contains 3 rRNA sequences (single copies of 5S, 16S and 23S rRNA).

  14. Genome sequencing and annotation of Cellulomonas sp. HZM.

    Science.gov (United States)

    Chua, Patric; Har, Zi Mei; Austin, Christopher M; Yule, Catherine M; Dykes, Gary A; Lee, Sui Mae

    2015-09-01

    We report the draft genome sequence of Cellulomonas sp. HZM, isolated from a tropical peat swamp forest. The draft genome size is 3,559,280 bp with a G + C content of 73% and contains 3 rRNA sequences (single copies of 5S, 16S and 23S rRNA).

  15. Genome sequencing and annotation of Cellulomonas sp. HZM

    Directory of Open Access Journals (Sweden)

    Patric Chua

    2015-09-01

    We report the draft genome sequence of Cellulomonas sp. HZM, isolated from a tropical peat swamp forest. The draft genome size is 3,559,280 bp with a G + C content of 73% and contains 3 rRNA sequences (single copies of 5S, 16S and 23S rRNA).

  16. Draft Genome Sequence of NDM-1-Producing Leclercia adecarboxylata

    OpenAIRE

    Hoyos-Mallecot, Yannick; Rojo-Martín, María Dolores; Bonnin, Rémy A.; Creton, Elodie; Navarro Marí, Jose María; Naas, Thierry

    2017-01-01

    Here, we provide the first draft genome sequence of NDM-1-producing Leclercia adecarboxylata, a human-opportunistic pathogen. The draft genome sequence consists of a total length of 5.13 Mbp, with an average G+C content of 55.2%.

  17. A computational genomics pipeline for prokaryotic sequencing projects.

    Science.gov (United States)

    Kislyuk, Andrey O; Katz, Lee S; Agrawal, Sonia; Hagen, Matthew S; Conley, Andrew B; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C; Sammons, Scott A; Govil, Dhwani; Mair, Raydel D; Tatti, Kathleen M; Tondella, Maria L; Harcourt, Brian H; Mayer, Leonard W; Jordan, I King

    2010-08-01

    New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems.

  18. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...

  19. Complete Genome Sequence of the Human Gut Symbiont Roseburia hominis

    DEFF Research Database (Denmark)

    Travis, Anthony J.; Kelly, Denise; Flint, Harry J

    2015-01-01

    We report here the complete genome sequence of the human gut symbiont Roseburia hominis A2-183(T) (= DSM 16839(T) = NCIMB 14029(T)), isolated from human feces. The genome is represented by a 3,592,125-bp chromosome with 3,405 coding sequences. A number of potential functions contributing to host...

  20. On the current status of Phakopsora pachyrhizi genome sequencing.

    Science.gov (United States)

    Loehrer, Marco; Vogel, Alexander; Huettel, Bruno; Reinhardt, Richard; Benes, Vladimir; Duplessis, Sébastien; Usadel, Björn; Schaffrath, Ulrich

    2014-01-01

    Recent advances in the field of sequencing technologies and bioinformatics allow a more rapid access to genomes of non-model organisms at sinking costs. Accordingly, draft genomes of several economically important cereal rust fungi have been released in the last 3 years. Aside from the very recent flax rust and poplar rust draft assemblies there are no genomic data available for other dicot-infecting rust fungi. In this article we outline rust fungus sequencing efforts and comment on the current status of Phakopsora pachyrhizi (Asian soybean rust) genome sequencing.

  1. Comprehensive Genomic Profiling of Esthesioneuroblastoma Reveals Additional Treatment Options.

    Science.gov (United States)

    Gay, Laurie M; Kim, Sungeun; Fedorchak, Kyle; Kundranda, Madappa; Odia, Yazmin; Nangia, Chaitali; Battiste, James; Colon-Otero, Gerardo; Powell, Steven; Russell, Jeffery; Elvin, Julia A; Vergilio, Jo-Anne; Suh, James; Ali, Siraj M; Stephens, Philip J; Miller, Vincent A; Ross, Jeffrey S

    2017-07-01

    Esthesioneuroblastoma (ENB), also known as olfactory neuroblastoma, is a rare malignant neoplasm of the olfactory mucosa. Despite surgical resection combined with radiotherapy and adjuvant chemotherapy, ENB often relapses with rapid progression. Current multimodality, nontargeted therapy for relapsed ENB is of limited clinical benefit. We queried whether comprehensive genomic profiling (CGP) of relapsed or refractory ENB can uncover genomic alterations (GA) that could identify potential targeted therapies for these patients. CGP was performed on formalin-fixed, paraffin-embedded sections from 41 consecutive clinical cases of ENBs using a hybrid-capture, adaptor ligation-based next-generation sequencing assay to a mean coverage depth of 593X. The results were analyzed for base substitutions, insertions and deletions, select rearrangements, and copy number changes (amplifications and homozygous deletions). Clinically relevant GA (CRGA) were defined as GA linked to drugs on the market or under evaluation in clinical trials. A total of 28 ENBs harbored GA, with a mean of 1.5 GA per sample. Approximately half of the ENBs (21, 51%) featured at least one CRGA, with an average of 1 CRGA per sample. The most commonly altered gene was TP53 (17%), with GA in PIK3CA, NF1, CDKN2A, and CDKN2C occurring in 7% of samples. We report comprehensive genomic profiles for 41 ENB tumors. CGP revealed potential new therapeutic targets, including targetable GA in the mTOR, CDK and growth factor signaling pathways, highlighting the clinical value of genomic profiling in ENB. Comprehensive genomic profiling of 41 relapsed or refractory ENBs reveals recurrent alterations or classes of mutation, including amplification of tyrosine kinases encoded on chromosome 5q and mutations affecting genes in the mTOR/PI3K pathway.

  2. Full genome sequence of a Danish isolate of Mycobacterium avium subspecies paratuberculosis, strain Ejlskov2007

    DEFF Research Database (Denmark)

    Afzal, Mamuna; Abidi, Soad; Mikkelsen, Heidi

    We have sequenced a Danish isolate of Mycobacterium avium subspecies paratuberculosis, strain Ejlskov2007. The strain was isolated from faecal material of a 48-month-old second-parity Danish Holstein cow, with clinical symptoms of chronic diarrhoea and emaciation. The cultures were grown on Löwen... ..., consisting of 4317 unique gene families. Comparison with M. avium paratuberculosis strain K10 revealed only 3436 genes in common (~70%). We have used GenomeAtlases to show conserved (and unique) regions along the Ejlskov2007 chromosome, compared to 2 other Mycobacterium avium sequenced genomes. Pan-genome analyses of the sequenced Mycobacterium genomes reveal a surprisingly open and diverse set of genes for this bacterial genus.

  3. The sequence and analysis of a Chinese pig genome

    Directory of Open Access Journals (Sweden)

    Fang Xiaodong

    2012-11-01

    Background: The pig is an economically important food source, amounting to approximately 40% of all meat consumed worldwide. Pigs also serve as an important model organism because of their similarity to humans at the anatomical, physiological and genetic level, making them very useful for studying a variety of human diseases. A pig strain of particular interest is the miniature pig, specifically the Wuzhishan pig (WZSP), as it has been extensively inbred. Its high level of homozygosity offers increased ease for selective breeding for specific traits and a more straightforward understanding of the genetic changes that underlie its biological characteristics. WZSP also serves as a promising means for applications in surgery, tissue engineering, and xenotransplantation. Here, we report the sequencing and analysis of an inbred WZSP genome. Results: Our results reveal some unique genomic features, including a relatively high level of homozygosity in the diploid genome, an unusual distribution of heterozygosity, an over-representation of tRNA-derived transposable elements, a small amount of porcine endogenous retrovirus, and a lack of type C retroviruses. In addition, we carried out systematic research on gene evolution, together with a detailed investigation of the counterparts of human drug target genes. Conclusion: Our results provide the opportunity to more clearly define the genomic character of pig, which could enhance our ability to create more useful pig models.

  4. Investigation of genome sequences within the family Pasteurellaceae

    DEFF Research Database (Denmark)

    Angen, Øystein; Ussery, David

    Introduction: The bacterial genome sequences are now available for an increasing number of strains within the family Pasteurellaceae. At present, 24 Pasteurellaceae genomes are publicly available through internet databases, and another 40 genomes are being sequenced. This investigation will describe the core genome for both the family Pasteurellaceae and for the species Haemophilus influenzae. Methods: Twenty genome sequences from the following species were included: Haemophilus influenzae (11 strains), Haemophilus ducreyi (1 strain), Histophilus somni (2 strains), Haemophilus parasuis (1 strain... ...shows that the genomic diversity within species of Pasteurellaceae is somewhat lower than the level which has been found for E. coli and B. cereus. This might reflect that most members of Pasteurellaceae have relatively small genomes and that the species of the family often are adapted to host specific...

  5. Pan-Cancer Analysis of Genomic Sequencing Among the Elderly.

    Science.gov (United States)

    Wahl, Daniel R; Nguyen, Paul L; Santiago, Maria; Yousefi, Kasra; Davicioni, Elai; Shumway, Dean A; Speers, Corey; Mehra, Rohit; Feng, Felix Y; Osborne, Joseph R; Spratt, Daniel E

    2017-07-15

    We hypothesized that elderly patients might have age-specific genetic abnormalities yet be underrepresented in currently available sequencing repositories, which could limit the effect of sequencing efforts for this population. Leveraging The Cancer Genome Atlas (TCGA) data portal, 9 tumor types were analyzed. The frequency distribution of cancer by age was determined and compared with Surveillance, Epidemiology, and End Results data. Using the estimated median somatic mutational frequency of each tumor type, the samples needed beyond TCGA to detect a 10% mutational frequency were calculated. Microarray data from a separate prospective cohort were obtained from primary prostatectomy samples to determine whether elderly-specific transcriptomic alterations could be identified. Of the 5236 TCGA samples, 73% were from patients aged ... elderly patients with cancer were likely to harbor age-specific molecular abnormalities, we accessed transcriptomic data from a separate, larger database of >2000 prostate cancer samples. That analysis revealed significant differences in the expression of 10 genes in patients aged ≥70 years compared with those ... Elderly patients have been underrepresented in genomic sequencing studies. Our data suggest the presence of elderly-specific molecular alterations. Further dedicated efforts to understand the biology of cancer among the elderly will be important moving forward. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Complete nucleotide sequences of avian metapneumovirus subtype B genome.

    Science.gov (United States)

    Sugiyama, Miki; Ito, Hiroshi; Hata, Yusuke; Ono, Eriko; Ito, Toshihiro

    2010-12-01

    Complete nucleotide sequences were determined for subtype B avian metapneumovirus (aMPV), the attenuated vaccine strain VCO3/50 and its parental pathogenic strain VCO3/60616. The genomes of both strains comprised 13,508 nucleotides (nt), with a 42-nt leader at the 3'-end and a 46-nt trailer at the 5'-end. The genome contains eight genes in the order 3'-N-P-M-F-M2-SH-G-L-5', which is the same order shown in the other metapneumoviruses. The genes are flanked on either side by conserved transcriptional start and stop signals and have intergenic sequences varying in length from 1 to 88 nt. Comparison of nt and predicted amino acid (aa) sequences of VCO3/60616 with those of other metapneumoviruses revealed higher homology with aMPV subtype A virus than with other metapneumoviruses. A total of 18 nt and 10 deduced aa differences were seen between the strains, and one or a combination of several differences could be associated with attenuation of VCO3/50.

  7. Genome Sequence of Human Rhinovirus A22, Strain Lancaster/2015.

    Science.gov (United States)

    Atkinson, Kate V; Bishop, Lisa A; Rhodes, Glenn; Salez, Nicolas; McEwan, Neil R; Hegarty, Matthew J; Robey, Julie; Harding, Nicola; Wetherell, Simon; Lauder, Robert M; Pickup, Roger W; Wilkinson, Mark; Gatherer, Derek

    2017-03-23

    The genome of human rhinovirus A22 (HRV-A22) was assembled by deep sequencing RNA samples from nasopharyngeal swabs. The assembled genome is 8.7% divergent from the HRV-A22 reference strain over its full length, and it is only the second full-length genome sequence for HRV-A22. The new strain is designated strain HRV-A22/Lancaster/2015. Copyright © 2017 Atkinson et al.

  8. On the current status of Phakopsora pachyrhizi genome sequencing

    OpenAIRE

    Loehrer, Marco; Vogel, Alexander; Huettel, Bruno; Reinhardt, Richard; Benes, Vladimir; Duplessis, Sébastien; Usadel, Björn; Schaffrath, Ulrich

    2014-01-01

    Recent advances in the field of sequencing technologies and bioinformatics allow a more rapid access to genomes of non-model organisms at sinking costs. Accordingly, draft genomes of several economically important cereal rust fungi have been released in the last 3 years. Aside from the very recent flax rust and poplar rust draft assemblies there are no genomic data available for other dicot-infecting rust fungi. In this article we outline rust fungus sequencing efforts and comment on the curr...

  9. Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi

    Energy Technology Data Exchange (ETDEWEB)

    Schutzer S. E.; Dunn J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

    2011-02-01

    Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain 31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

  10. Genome Sequence of Lactobacillus paracasei Strain LC-Ikematsu, Isolated from a Pineapple in Okinawa, Japan.

    Science.gov (United States)

    Tada, Ipputa; Saitoh, Seikoh; Aoyama, Hiroaki; Shinzato, Naoya; Yamamoto, Norikuni; Arita, Masanori; Ikematsu, Shinya

    2017-02-02

    The draft genome sequence of Lactobacillus paracasei strain LC-Ikematsu, isolated from a pineapple in Okinawa, was determined. The total length of the 87 contigs was 3.08 Mb with a G+C content of 46.2% and 2,946 coding sequences. The genome analysis revealed its biosynthetic ability of 11 amino acids. Copyright © 2017 Tada et al.

  11. Computational Profiling of Microbial Genomes using Short Sequences

    Science.gov (United States)

    Doering, Dale; Tsukuda, Toyoko

    2001-03-01

    The genomes of a number of microbial species have now been completely sequenced. We have developed a program for the statistical analysis of the appearance frequency and location of short DNA segments within an entire microbial genome. Using this program, the genomes of Methanococcus jannaschii (1.66 Mbase; 68... radiodurans (3.28 Mbase; 66... and compared to a randomly generated genomic pattern. The random sequence shows the expected statistical frequency distribution about the average, which equals the genome size divided by the total number of possible short segments of size N (4^N). In contrast, the microbial genomes are radically skewed, with a large number of segments that rarely occur and a few that are highly represented in the genome. The specific distribution profile of the segments is strongly dependent on the overall bias in the organism. The biased appearance frequency allows us to develop a genome signature of each microbial species.
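
    The comparison described above (observed short-segment counts against the average expected in a random sequence, i.e. genome size divided by 4^N) can be sketched as follows. This is a minimal illustration and not the authors' program; the function names are hypothetical and the toy sequence is invented for the example.

```python
from collections import Counter


def kmer_counts(genome: str, n: int) -> Counter:
    """Count every overlapping segment of length n in the sequence."""
    genome = genome.upper()
    return Counter(genome[i:i + n] for i in range(len(genome) - n + 1))


def expected_count(genome_length: int, n: int) -> float:
    """Average count per segment in a random sequence: genome size / 4^N."""
    return genome_length / 4 ** n


if __name__ == "__main__":
    toy_genome = "ACGTACGTAC"  # invented toy sequence, not real data
    counts = kmer_counts(toy_genome, 2)
    average = expected_count(len(toy_genome), 2)
    # Report each observed segment with its ratio to the random expectation.
    for segment, count in counts.most_common():
        print(segment, count, round(count / average, 2))
```

    Segments whose ratio is far above or below 1 are the over- and under-represented ones that would contribute to a genome signature.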

  12. The complete chloroplast genome sequence of Podocarpus lambertii: genome structure, evolutionary aspects, gene content and SSR detection.

    Directory of Open Access Journals (Sweden)

    Leila do Nascimento Vieira

    BACKGROUND: Podocarpus lambertii (Podocarpaceae) is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp) genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. METHODOLOGY/PRINCIPAL FINDINGS: The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR). It contains 118 unique genes and one duplicated tRNA (trnN-GUU), which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi) and Araucariaceae (Agathis dammara). Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. CONCLUSION: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparent loss of the rps16 gene in the Podocarpaceae cp genome. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of

  13. Genomic libraries: II. Subcloning, sequencing, and assembling large-insert genomic DNA clones.

    Science.gov (United States)

    Quail, Mike A; Matthews, Lucy; Sims, Sarah; Lloyd, Christine; Beasley, Helen; Baxter, Simon W

    2011-01-01

    Sequencing large insert clones to completion is useful for characterizing specific genomic regions, identifying haplotypes, and closing gaps in whole genome sequencing projects. Despite being a standard technique in molecular laboratories, DNA sequencing using the Sanger method can be highly problematic when complex secondary structures or sequence repeats are encountered in genomic clones. Here, we describe methods to isolate DNA from a large insert clone (fosmid or BAC), subclone the sample, and sequence the region to the highest industry standard. Troubleshooting solutions for sequencing difficult templates are discussed.

  14. Insights into three whole-genome duplications gleaned from the Paramecium caudatum genome sequence.

    Science.gov (United States)

    McGrath, Casey L; Gout, Jean-Francois; Doak, Thomas G; Yanagi, Akira; Lynch, Michael

    2014-08-01

    Paramecium has long been a model eukaryote. The sequence of the Paramecium tetraurelia genome reveals a history of three successive whole-genome duplications (WGDs), and the sequences of P. biaurelia and P. sexaurelia suggest that these WGDs are shared by all members of the aurelia species complex. Here, we present the genome sequence of P. caudatum, a species closely related to the P. aurelia species group. P. caudatum shares only the most ancient of the three WGDs with the aurelia complex. We found that P. caudatum maintains twice as many paralogs from this early event as the P. aurelia species, suggesting that post-WGD gene retention is influenced by subsequent WGDs and supporting the importance of selection for dosage in gene retention. The availability of P. caudatum as an outgroup allows an expanded analysis of the aurelia intermediate and recent WGD events. Both the Guanine+Cytosine (GC) content and the expression level of preduplication genes are significant predictors of duplicate retention. We find widespread asymmetrical evolution among aurelia paralogs, which is likely caused by gradual pseudogenization rather than by neofunctionalization. Finally, cases of divergent resolution of intermediate WGD duplicates between aurelia species implicate this process as an ongoing reinforcement mechanism of reproductive isolation long after a WGD event. Copyright © 2014 by the Genetics Society of America.

  15. A sequence-based survey of the complex structural organization of tumor genomes

    Energy Technology Data Exchange (ETDEWEB)

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  16. Comparative Genomics of Methanopyrus sp. SNP6 and KOL6 Revealing Genomic Regions of Plasticity Implicated in Extremely Thermophilic Profiles

    Directory of Open Access Journals (Sweden)

    Zhiliang Yu

    2017-07-01

    Methanopyrus spp. are usually isolated from harsh niches, such as high osmotic pressure and extreme temperature. However, the molecular mechanisms for their environmental adaptation are poorly understood. Archaeal species are commonly considered primitive organisms. The evolutionary placement of archaea is a fundamental and intriguing scientific question. We sequenced the genomes of Methanopyrus strains SNP6 and KOL6 isolated from the Atlantic and Iceland, respectively. Comparative genomic analysis revealed genetic diversity and instability implicated in niche adaptation, including a number of transporter- and integrase/transposase-related genes. Pan-genome analysis also defined the gene pool of Methanopyrus spp., in addition to a ~120-kb genomic region of plasticity impacting cognate genomic architecture. We believe that Methanopyrus genomics could facilitate efficient investigation/recognition of archaeal phylogenetic diverse patterns, as well as improve understanding of biological roles and significance of these versatile microbes.

  17. Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry.

    Science.gov (United States)

    Asara, John M; Schweitzer, Mary H; Freimark, Lisa M; Phillips, Matthew; Cantley, Lewis C

    2007-04-13

    Fossilized bones from extinct taxa harbor the potential for obtaining protein or DNA sequences that could reveal evolutionary links to extant species. We used mass spectrometry to obtain protein sequences from bones of a 160,000- to 600,000-year-old extinct mastodon (Mammut americanum) and a 68-million-year-old dinosaur (Tyrannosaurus rex). The presence of T. rex sequences indicates that their peptide bonds were remarkably stable. Mass spectrometry can thus be used to determine unique sequences from ancient organisms from peptide fragmentation patterns, a valuable tool to study the evolution and adaptation of ancient taxa from which genomic sequences are unlikely to be obtained.

  18. Real-time, portable genome sequencing for Ebola surveillance.

    Science.gov (United States)

    Quick, Joshua; Loman, Nicholas J; Duraffour, Sophie; Simpson, Jared T; Severi, Ettore; Cowley, Lauren; Bore, Joseph Akoi; Koundouno, Raymond; Dudas, Gytis; Mikhail, Amy; Ouédraogo, Nobila; Afrough, Babak; Bah, Amadou; Baum, Jonathan Hj; Becker-Ziaja, Beate; Boettcher, Jan-Peter; Cabeza-Cabrerizo, Mar; Camino-Sanchez, Alvaro; Carter, Lisa L; Doerrbecker, Juiliane; Enkirch, Theresa; Dorival, Isabel Graciela García; Hetzelt, Nicole; Hinzmann, Julia; Holm, Tobias; Kafetzopoulou, Liana Eleni; Koropogui, Michel; Kosgey, Abigail; Kuisma, Eeva; Logue, Christopher H; Mazzarelli, Antonio; Meisel, Sarah; Mertens, Marc; Michel, Janine; Ngabo, Didier; Nitzsche, Katja; Pallash, Elisa; Patrono, Livia Victoria; Portmann, Jasmine; Repits, Johanna Gabriella; Rickett, Natasha Yasmin; Sachse, Andrea; Singethan, Katrin; Vitoriano, Inês; Yemanaberhan, Rahel L; Zekeng, Elsa G; Trina, Racine; Bello, Alexander; Sall, Amadou Alpha; Faye, Ousmane; Faye, Oumar; Magassouba, N'Faly; Williams, Cecelia V; Amburgey, Victoria; Winona, Linda; Davis, Emily; Gerlach, Jon; Washington, Franck; Monteil, Vanessa; Jourdain, Marine; Bererd, Marion; Camara, Alimou; Somlare, Hermann; Camara, Abdoulaye; Gerard, Marianne; Bado, Guillaume; Baillet, Bernard; Delaune, Déborah; Nebie, Koumpingnin Yacouba; Diarra, Abdoulaye; Savane, Yacouba; Pallawo, Raymond Bernard; Gutierrez, Giovanna Jaramillo; Milhano, Natacha; Roger, Isabelle; Williams, Christopher J; Yattara, Facinet; Lewandowski, Kuiama; Taylor, Jamie; Rachwal, Philip; Turner, Daniel; Pollakis, Georgios; Hiscox, Julian A; Matthews, David A; O'Shea, Matthew K; Johnston, Andrew McD; Wilson, Duncan; Hutley, Emma; Smit, Erasmus; Di Caro, Antonino; Woelfel, Roman; Stoecker, Kilian; Fleischmann, Erna; Gabriel, Martin; Weller, Simon A; Koivogui, Lamine; Diallo, Boubacar; Keita, Sakoba; Rambaut, Andrew; Formenty, Pierre; Gunther, Stephan; Carroll, Miles W

    2016-02-11

    The Ebola virus disease epidemic in West Africa is the largest on record, responsible for over 28,599 cases and more than 11,299 deaths. Genome sequencing in viral outbreaks is desirable to characterize the infectious agent and determine its evolutionary rate. Genome sequencing also allows the identification of signatures of host adaptation, identification and monitoring of diagnostic targets, and characterization of responses to vaccines and treatments. The Ebola virus (EBOV) genome substitution rate in the Makona strain has been estimated at between 0.87 × 10^-3 and 1.42 × 10^-3 mutations per site per year. This is equivalent to 16-27 mutations in each genome, meaning that sequences diverge rapidly enough to identify distinct sub-lineages during a prolonged epidemic. Genome sequencing provides a high-resolution view of pathogen evolution and is increasingly sought after for outbreak surveillance. Sequence data may be used to guide control measures, but only if the results are generated quickly enough to inform interventions. Genomic surveillance during the epidemic has been sporadic owing to a lack of local sequencing capacity coupled with practical difficulties transporting samples to remote sequencing facilities. To address this problem, here we devise a genomic surveillance system that utilizes a novel nanopore DNA sequencing instrument. In April 2015 this system was transported in standard airline luggage to Guinea and used for real-time genomic surveillance of the ongoing epidemic. We present sequence data and analysis of 142 EBOV samples collected during the period March to October 2015. We were able to generate results less than 24 h after receiving an Ebola-positive sample, with the sequencing process taking as little as 15-60 min. We show that real-time genomic surveillance is possible in resource-limited settings and can be established rapidly to monitor outbreaks.
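
    The step from a per-site substitution rate to "16-27 mutations in each genome" is a multiplication by genome length. A minimal sketch of that arithmetic follows; the genome length of roughly 18,959 nucleotides is an assumption not stated in the abstract, and the result is read here as mutations per genome per year.

```python
# Assumption: EBOV genome length of about 18,959 nt (not given in the abstract).
EBOV_GENOME_LENGTH_NT = 18_959


def substitutions_per_genome_per_year(rate_per_site_per_year: float,
                                      genome_length_nt: int = EBOV_GENOME_LENGTH_NT) -> float:
    """Scale a per-site yearly substitution rate to a whole genome."""
    return rate_per_site_per_year * genome_length_nt


if __name__ == "__main__":
    # Lower and upper rate estimates quoted in the abstract.
    low = substitutions_per_genome_per_year(0.87e-3)
    high = substitutions_per_genome_per_year(1.42e-3)
    print(round(low, 1), round(high, 1))
```

    Under this assumed length the two bounds round to the 16 and 27 quoted in the abstract.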

  19. From Sequence to Morphology - Long-Range Correlations in Complete Sequenced Genomes

    NARCIS (Netherlands)

    T.A. Knoch (Tobias)

    2004-01-01

    The largely unresolved sequential organization, i.e. the relations within DNA sequences, and its connection to the three-dimensional organization of genomes was investigated by correlation analyses of completely sequenced chromosomes from Viroids, Archaea, Bacteria, Arabidopsis

  20. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    KAUST Repository

    Doan, Ryan

    2012-02-17

    BACKGROUND: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. RESULTS: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. CONCLUSIONS: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.
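
    Average sequence coverage is conventionally the total number of sequenced bases divided by the genome size. The sketch below applies that relationship to the two figures in the abstract (59.6 Gb and 24.7X) to back out the genome size they imply. The abstract does not say which genome size the authors used, so the derived value is an inference, and the function names are hypothetical.

```python
def mean_coverage(total_bases: float, genome_size: float) -> float:
    """Average depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


def implied_genome_size(total_bases: float, coverage: float) -> float:
    """Invert the coverage formula to recover the genome size it implies."""
    return total_bases / coverage


TOTAL_BASES = 59.6e9        # 59.6 Gb of sequence, from the abstract
REPORTED_COVERAGE = 24.7    # 24.7X, from the abstract

# Derived here, not reported in the abstract.
genome_size = implied_genome_size(TOTAL_BASES, REPORTED_COVERAGE)
print(f"Implied genome size: {genome_size / 1e9:.2f} Gb")
```

    The two reported numbers are mutually consistent with a genome of roughly 2.4 Gb.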

  1. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    Energy Technology Data Exchange (ETDEWEB)

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

  2. The genome of Tetranychus urticae reveals herbivorous pest adaptations

    NARCIS (Netherlands)

    Grbić, M.; Van Leeuwen, T.; Clark, R.M.; Rombauts, S.; Grbić, V.; Osborne, E.J.; Dermauw, W.; Phuong, C.T.N.; Ortego, F.; Hernández-Crespo, P.; Diaz, I.; Martinez, M.; Navajas, M.; Sucena, E.; Magalhães, S.; Nagy, L.; Pace, R.M.; Djuranović, S.; Smagghe, G.; Iga, M.; Christiaens, O.; Veenstra, J.A.; Ewer, J.; Villalobos, R.M.; Hutter, J.L.; Hudson, S.D.; Velez, M.; Yi, S.V.; Zeng, J.; Pires-dasilva, A.; Roch, F.; Cazaux, M.; Navarro, M.; Zhurov, V.; Acevedo, G.; Bjelica, A.; Fawcett, J.A.; Bonnet, E.; Martens, C.; Baele, G.; Wissler, L.; Sanchez-Rodriguez, A.; Tirry, L.; Blais, C.; Demeestere, K.; Henz, S.R.; Gregory, T.R.; Mathieu, J.; Verdon, L.; Farinelli, L.; Schmutz, J.; Lindquist, E.; Feyereisen, R.; Van de Peer, Y.

    2011-01-01

    The spider mite Tetranychus urticae is a cosmopolitan agricultural pest with an extensive host plant range and an extreme record of pesticide resistance. Here we present the completely sequenced and annotated spider mite genome, representing the first complete chelicerate genome. At 90 megabases T.

  3. Complete chloroplast genome sequences of important oilseed crop Sesamum indicum L.

    Science.gov (United States)

    Yi, Dong-Keun; Kim, Ki-Joong

    2012-01-01

    Sesamum indicum is an important oil-yielding crop species. The complete chloroplast (cp) genome of S. indicum (GenBank acc no. JN637766) is 153,324 bp in length, and has a pair of inverted repeat (IR) regions consisting of 25,141 bp each. The lengths of the large single copy (LSC) and the small single copy (SSC) regions are 85,170 bp and 17,872 bp, respectively. Comparative cp DNA sequence analyses of S. indicum with other cp genomes reveal that the genome structure, gene order, gene and intron contents, AT contents, codon usage, and transcription units are similar to those of typical angiosperm cp genomes. Nucleotide diversity of the IR region between Sesamum and three other cp genomes is much lower than that of the LSC and SSC regions in both the coding region and noncoding region. In summary, regional constraints strongly affect the sequence evolution of the cp genomes, while functional constraints weakly affect it. Five short inversions associated with short palindromic sequences that form stem-loop structures were observed in the chloroplast genome of S. indicum. Twenty-eight different simple sequence repeat (SSR) loci have been detected in the chloroplast genome of S. indicum. Almost all of the SSR loci were composed of A or T, so this may also contribute to the A-T richness of the cp genome of S. indicum. Seven large repeated loci in the chloroplast genome of S. indicum were also identified, and these loci are useful for developing S. indicum-specific cp genome vectors. The complete cp DNA sequences of S. indicum reported in this paper are a prerequisite to modifying this important oilseed crop by cp genetic engineering techniques.
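
    The region lengths reported above can be cross-checked against the total genome length, because a quadripartite chloroplast genome consists of one LSC, one SSC and two IR copies. The sketch below is a simple consistency check; the function name is hypothetical and all numbers come from the abstract.

```python
# Region lengths in bp, as reported in the abstract.
LSC_BP = 85_170
SSC_BP = 17_872
IR_BP = 25_141              # length of each of the two inverted repeats
REPORTED_TOTAL_BP = 153_324


def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


computed_total = plastome_length(LSC_BP, SSC_BP, IR_BP)
print(computed_total == REPORTED_TOTAL_BP)
```

    The four regions sum exactly to the reported 153,324 bp.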

  4. Complete genome sequence of Acidimicrobium ferrooxidans type strain (ICPT)

    Energy Technology Data Exchange (ETDEWEB)

    Clum, Alicia; Nolan, Matt; Lang, Elke; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Goker, Markus; Spring, Stefan; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

    2009-05-20

    Acidimicrobium ferrooxidans (Clark and Norris 1996) is the sole and type species of the genus, which until recently was the only genus within the actinobacterial family Acidimicrobiaceae and in the order Acidimicrobiales. Rapid oxidation of iron pyrite during autotrophic growth in the absence of an enhanced CO2 concentration is characteristic for A. ferrooxidans. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of the order Acidimicrobiales, and the 2,158,157 bp long single replicon genome with its 2038 protein coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  5. Oxford Nanopore MinION Sequencing and Genome Assembly

    Directory of Open Access Journals (Sweden)

    Hengyun Lu

    2016-10-01

    The revolution in genome sequencing is continuing after the success of second-generation sequencing (SGS) technology. Third-generation sequencing (TGS) technology, led by Pacific Biosciences (PacBio), is progressing rapidly, moving from a technology once only capable of providing data for small genome analysis, or for performing targeted screening, to one that promises high-quality de novo assembly and structural variation detection for human-sized genomes. In 2014, the MinION, the first commercial sequencer using nanopore technology, was released by Oxford Nanopore Technologies (ONT). MinION identifies DNA bases by measuring the changes in electrical conductivity generated as DNA strands pass through a biological pore. Its portability, affordability, and speed in data production make it suitable for real-time applications, and the release of this long-read sequencer has thus generated much excitement and interest in the genomics community. While de novo genome assemblies can be cheaply produced from SGS data, assembly continuity is often relatively poor, due to the limited ability of short reads to handle long repeats. Assembly quality can be greatly improved by using TGS long reads, since longer reads can span repetitive regions, despite having higher error rates at the base level. The potential of nanopore sequencing has been demonstrated by various studies in genome surveillance at locations where rapid and reliable sequencing is needed, but where resources are limited.

  6. First large genomic inversion in familial cerebral cavernous malformation identified by whole genome sequencing.

    Science.gov (United States)

    Spiegler, Stefanie; Rath, Matthias; Hoffjan, Sabine; Dammann, Philipp; Sure, Ulrich; Pagenstecher, Axel; Strom, Tim; Felbor, Ute

    2018-01-01

    Familial cerebral cavernous malformations (CCMs) predispose to seizures and hemorrhagic stroke. Molecular genetic analyses of CCM1, CCM2, and CCM3 result in a mutation detection rate of up to 98%. However, only whole genome sequencing (WGS) in combination with the Manta algorithm for analyses of structural variants revealed a heterozygous 24 kb inversion including exon 1 of CCM2 in a 12-year-old boy with familial CCMs. Its breakpoints were fine-mapped, and quantitative analysis on RNA confirmed reduced CCM2 expression. Our data expand the spectrum of CCM mutations and indicate that the existence of a fourth CCM disease gene is rather unlikely.

  7. Complete genome sequence of “Enterobacter lignolyticus” SCF1

    Energy Technology Data Exchange (ETDEWEB)

    DeAngelis, Kristen M.; D' Haeseleer, Patrik; Chivian, Dylan; Fortney, Julian L.; Khudyakov, Jane I.; Simmons, Blake A.; Woo, Hannah; Arkin, Adam P.; Davenport, Karen W.; Goodwin, Lynne A.; Chen, Amy; Ivanova, Natalia; Kyrpides, Nikos C.; Mavromatis, Konstantinos; Woyke, Tanja; Hazen, Terry C.

    2011-09-23

    In an effort to discover anaerobic bacteria capable of lignin degradation, we isolated 'Enterobacter lignolyticus' SCF1 on minimal media with alkali lignin as the sole source of carbon. This organism was isolated anaerobically from tropical forest soils collected from the Short Cloud Forest site in the El Yunque National Forest in Puerto Rico, USA, part of the Luquillo Long-Term Ecological Research Station. At this site, the soils experience strong fluctuations in redox potential and are net methane producers. Because of its ability to grow on lignin anaerobically, we sequenced the genome. The genome of 'E. lignolyticus' SCF1 is 4.81 Mbp with no detected plasmids, and includes a relatively small arsenal of lignocellulolytic carbohydrate active enzymes. Lignin degradation was observed in culture, and the genome revealed two putative laccases, a putative peroxidase, and a complete 4-hydroxyphenylacetate degradation pathway encoded in a single gene cluster.

  8. A Genome Sequencing Program for Novel Undiagnosed Diseases

    Science.gov (United States)

    Bloss, Cinnamon S.; Scott-Van Zeeland, Ashley A.; Topol, Sarah E.; Darst, Burcu F.; Boeldt, Debra L.; Erikson, Galina A.; Bethel, Kelly J.; Bjork, Robert L.; Friedman, Jennifer R.; Hwynn, Nelson; Patay, Bradley A.; Pockros, Paul J.; Scott, Erick R.; Simon, Ronald A.; Williams, Gary W.; Schork, Nicholas J.; Topol, Eric J.; Torkamani, Ali

    2015-01-01

    Purpose: The Scripps Idiopathic Diseases of huMan (IDIOM) study aims to discover novel gene-disease relationships and provide molecular genetic diagnosis and treatment guidance for individuals with novel diseases using genome sequencing integrated with clinical assessment and multidisciplinary case review. Methods: Here we describe the IDIOM study operational protocol and initial results. Results: 121 cases underwent first tier review by the principal investigators to determine if the primary inclusion criteria were satisfied, 59 (48.8%) underwent second tier review by our clinician-scientist review panel, and 17 (14.0%) patients and their family members were enrolled. 60% of cases resulted in a plausible molecular diagnosis. 18% of cases resulted in a confirmed molecular diagnosis. 2 of 3 confirmed cases led to the identification of novel gene-disease relationships. In the third confirmed case, a previously described but unrecognized disease was revealed. In all three confirmed cases, a new clinical management strategy was initiated based on the genetic findings. Conclusions: Genome sequencing provides tangible clinical benefit for individuals with idiopathic genetic disease, not only in the context of molecular genetic diagnosis of known rare conditions, but also in cases where prior clinical information regarding a new genetic disorder is lacking. PMID:25790160

  9. Phylogenomic analysis of 11 complete African swine fever virus genome sequences

    International Nuclear Information System (INIS)

    Villiers, Etienne P. de; Gallardo, Carmina; Arias, Marisa; Silva, Melissa da; Upton, Chris; Martin, Raquel; Bishop, Richard P.

    2010-01-01

    Viral molecular epidemiology has traditionally analyzed variation in single genes. Whole genome phylogenetic analysis of 123 concatenated genes from 11 ASFV genomes, including E75, a newly sequenced virulent isolate from Spain, identified two clusters. One contained South African isolates from ticks and warthog, suggesting derivation from a sylvatic transmission cycle. The second contained isolates from West Africa and the Iberian Peninsula. Two isolates, from Kenya and Malawi, were outliers. Of the nine genomes within the clusters, seven were within p72 genotype 1. The 11 genomes sequenced comprised only 5 of the 22 p72 genotypes. Comparison of synonymous and non-synonymous mutations at the genome level identified 20 genes subject to selection pressure for diversification. A novel gene of the E75 virus evolved by the fusion of two genes within the 360 multicopy family. Comparative genomics reveals high diversity within a limited sample of the ASFV viral gene pool.

  10. Suspected cases of intracontinental Burkholderia pseudomallei sequence type homoplasy resolved using whole-genome sequencing.

    Science.gov (United States)

    Aziz, Ammar; Sarovich, Derek S; Harris, Tegan M; Kaestli, Mirjam; McRobb, Evan; Mayo, Mark; Currie, Bart J; Price, Erin P

    2017-11-01

    Burkholderia pseudomallei is a Gram-negative environmental bacterium that causes melioidosis, a disease of high mortality in humans and animals. Multilocus sequence typing (MLST) is a popular and portable genotyping method that has been used extensively to characterise the genetic diversity of B. pseudomallei populations. MLST has been central to our understanding of the underlying phylogeographical signal present in the B. pseudomallei genome, revealing distinct populations on both the intra- and the inter-continental level. However, due to its high recombination rate, it is possible for B. pseudomallei isolates to share the same multilocus sequence type (ST) despite being genetically and geographically distinct, with two cases of 'ST homoplasy' recently reported between Cambodian and Australian B. pseudomallei isolates. This phenomenon can dramatically confound conclusions about melioidosis transmission patterns and source attribution, a critical issue for bacteria such as B. pseudomallei that are of concern due to their potential for use as bioweapons. In this study, we used whole-genome sequencing to identify the first reported instances of intracontinental ST homoplasy, which involved ST-722 and ST-804 B. pseudomallei isolates separated by large geographical distances. In contrast, a third suspected homoplasy case was shown to be a true long-range (460 km) dispersal event between a remote Australian island and the Australian mainland. Our results show that, whilst a highly useful and portable method, MLST can occasionally lead to erroneous conclusions about isolate origin and disease attribution. In cases where a shared ST is identified between geographically distant locales, whole-genome sequencing should be used to resolve strain origin.
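
    The distinction the abstract draws between a shared sequence type and true genetic relatedness can be sketched in code. The example below is a toy illustration, not the authors' pipeline: the locus names, allele numbers, aligned sequences and SNP threshold are all hypothetical, and a plain Hamming distance over equal-length strings stands in for a real whole-genome SNP comparison.

```python
from typing import Dict


def same_sequence_type(profile_a: Dict[str, int], profile_b: Dict[str, int]) -> bool:
    """Two isolates share an ST when their allele numbers match at every MLST locus."""
    return profile_a == profile_b


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def is_homoplasy_candidate(profile_a: Dict[str, int], profile_b: Dict[str, int],
                           seq_a: str, seq_b: str, snp_threshold: int) -> bool:
    """Flag pairs with an identical ST whose genome-wide distance is still large."""
    return (same_sequence_type(profile_a, profile_b)
            and snp_distance(seq_a, seq_b) > snp_threshold)


# HYPOTHETICAL toy data: seven placeholder loci and short aligned sequences.
profile_1 = {f"locus{i}": allele for i, allele in enumerate([1, 4, 2, 3, 1, 2, 5], start=1)}
profile_2 = dict(profile_1)      # identical allele profile, hence the same ST
genome_1 = "ACGTACGTAC"
genome_2 = "ACGAACCTAT"

flagged = is_homoplasy_candidate(profile_1, profile_2, genome_1, genome_2, snp_threshold=2)
print(flagged)
```

    In practice the threshold would have to be calibrated against the diversity observed within genuinely related isolates; the value used here is arbitrary.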

  11. Genome Sequence of Australian Indigenous Wine Yeast Torulaspora delbrueckii COFT1 Using Nanopore Sequencing.

    Science.gov (United States)

    Tondini, Federico; Jiranek, Vladimir; Grbin, Paul R; Onetto, Cristobal A

    2018-04-26

    Here, we report the first sequenced genome of an indigenous Australian wine isolate of Torulaspora delbrueckii using the Oxford Nanopore MinION and Illumina HiSeq sequencing platforms. The genome size is 9.4 Mb and contains 4,831 genes. Copyright © 2018 Tondini et al.

  12. Genome sequencing of ovine isolates of Mycobacterium avium subspecies paratuberculosis offers insights into host association

    Directory of Open Access Journals (Sweden)

    Bannantine John P

    2012-03-01

    Background: The genome of Mycobacterium avium subspecies paratuberculosis (MAP) is remarkably homogeneous among the genomes of bovine, human and wildlife isolates. However, previous work in our laboratories with the bovine K-10 strain has revealed substantial differences compared to sheep isolates. To systematically characterize all genomic differences that may be associated with the specific hosts, we sequenced the genomes of three U.S. sheep isolates and also obtained an optical map. Results: Our analysis of one of the isolates, MAP S397, revealed a genome 4.8 Mb in size with 4,700 open reading frames (ORFs). Comparative analysis of the MAP S397 isolate showed it acquired approximately 10 large sequence regions that are shared with the human M. avium subsp. hominissuis strain 104 and lost 2 large regions that are present in the bovine strain. In addition, optical mapping defined the presence of 7 large inversions between the bovine and ovine genomes (~2.36 Mb). Whole-genome sequencing of 2 additional sheep strains of MAP (JTC1074 and JTC7565) further confirmed genomic homogeneity of the sheep isolates despite the presence of polymorphisms on the nucleotide level. Conclusions: Comparative sequence analysis employed here provided a better understanding of the host association, evolution of members of the M. avium complex and could help in deciphering the phenotypic differences observed among sheep and cattle strains of MAP. A similar approach based on whole-genome sequencing combined with optical mapping could be employed to examine closely related pathogens. We propose an evolutionary scenario for M. avium complex strains based on these genome sequences.

  13. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2012-11-01

    Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of chloroplast genomes.

  14. Simple sequence repeats in mycobacterial genomes

    Indian Academy of Sciences (India)

    2006-12-18

    Although prokaryotic genomes derive some plasticity from microsatellite mutations, they have in-built mechanisms to arrest undue expansions of microsatellites, and one such mechanism is constituted by the post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these ...

  15. Genome-scale validation of deep-sequencing libraries.

    Directory of Open Access Journals (Sweden)

    Dominic Schmidt

    Chromatin immunoprecipitation followed by high-throughput (HTP) sequencing (ChIP-seq) is a powerful tool to establish protein-DNA interactions genome-wide. The primary limitation of its broad application at present is the often-limited access to sequencers. Here we report a protocol, Mab-seq, that generates genome-scale quality evaluations for nucleic acid libraries intended for deep-sequencing. We show how commercially available genomic microarrays can be used to maximize the efficiency of library creation and quickly generate reliable preliminary data on a chromosomal scale in advance of deep sequencing. We also exploit this technique to compare enriched regions identified using microarrays with those identified by sequencing, demonstrating that they agree on a core set of clearly identified enriched regions, while characterizing the additional enriched regions identifiable using HTP sequencing.

  16. The Release 6 reference sequence of the Drosophila melanogaster genome.

    Science.gov (United States)

    Hoskins, Roger A; Carlson, Joseph W; Wan, Kenneth H; Park, Soo; Mendez, Ivonne; Galle, Samuel E; Booth, Benjamin W; Pfeiffer, Barret D; George, Reed A; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V; Andreyeva, Evgeniya N; Boldyreva, Lidiya V; Marra, Marco; Carvalho, A Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F; Rubin, Gerald M; Karpen, Gary H; Celniker, Susan E

    2015-03-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. © 2015 Hoskins et al.; Published by Cold Spring Harbor Laboratory Press.

  17. Complete genome sequence of Cellulomonas flavigena type strain (134T)

    Science.gov (United States)

    Abt, Birte; Foster, Brian; Lapidus, Alla; Clum, Alicia; Sun, Hui; Pukall, Rüdiger; Lucas, Susan; Glavina Del Rio, Tijana; Nolan, Matt; Tice, Hope; Cheng, Jan-Fang; Pitluck, Sam; Liolios, Konstantinos; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Goodwin, Lynne; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Rohde, Manfred; Göker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

    2010-01-01

    Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304688

  18. Complete genome sequence of Cellulomonas flavigena type strain (134T)

    Energy Technology Data Exchange (ETDEWEB)

    Abt, Birte [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Foster, Brian [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Clum, Alicia [U.S. Department of Energy, Joint Genome Institute; Sun, Hui [U.S. Department of Energy, Joint Genome Institute; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany

    2010-01-01

    Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  19. Human papillomavirus 1a complete DNA sequence: a novel type of genome organization among papovaviridae.

    OpenAIRE

    Danos, O; Katinka, M; Yaniv, M

    1982-01-01

    The complete nucleotide sequence of human papillomavirus type 1a (7811 nucleotides) has been established. The overall organization of the viral genome is different from that of other related papovaviruses (SV40, BKV, polyoma). Firstly, genetic information seems to be coded by one strand. Secondly, no significant homology is found with SV40 or polyoma coding sequence for either DNA or deduced protein sequences. The relatedness of human and bovine papillomaviruses is revealed by a conserved co...

  20. Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

    Directory of Open Access Journals (Sweden)

    Martijn Staats

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more

  1. Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

    Science.gov (United States)

    Staats, Martijn; Erkens, Roy H J; van de Vossenberg, Bart; Wieringa, Jan J; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E; Bakker, Freek T

    2013-01-01

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal

  2. Complete genome sequencing of Agrobacterium sp. H13-3, the former Rhizobium lupini H13-3, reveals a tripartite genome consisting of a circular and a linear chromosome and an accessory plasmid but lacking a tumor-inducing Ti-plasmid.

    Science.gov (United States)

    Wibberg, Daniel; Blom, Jochen; Jaenicke, Sebastian; Kollin, Florian; Rupp, Oliver; Scharf, Birgit; Schneiker-Bekel, Susanne; Szczepanowski, Rafael; Goesmann, Alexander; Setubal, Joao Carlos; Schmitt, Rüdiger; Pühler, Alfred; Schlüter, Andreas

    2011-08-20

    Agrobacterium sp. H13-3, formerly known as Rhizobium lupini H13-3, is a soil bacterium that was isolated from the rhizosphere of Lupinus luteus. The isolate has been established as a model system for studying novel features of flagellum structure, motility and chemotaxis within the family Rhizobiaceae. The complete genome sequence of Agrobacterium sp. H13-3 has been established and the genome structure and phylogenetic assignment of the organism were analysed. For de novo sequencing of the Agrobacterium sp. H13-3 genome, a combined strategy comprising 454-pyrosequencing on the Genome Sequencer FLX platform and PCR-based amplicon sequencing for gap closure was applied. The finished genome consists of three replicons and comprises 5,573,770 bases. Based on phylogenetic analyses, the isolate could be assigned to the genus Agrobacterium biovar I and represents a genomic species G1 strain within this biovariety. The highly conserved circular chromosome (2.82 Mb) of Agrobacterium sp. H13-3 mainly encodes housekeeping functions characteristic of an aerobic, heterotrophic bacterium. Agrobacterium sp. H13-3 is a motile bacterium driven by the rotation of several complex flagella. Its behaviour towards external stimuli is regulated by a large chemotaxis regulon and a total of 17 chemoreceptors. Comparable to the genome of Agrobacterium tumefaciens C58, Agrobacterium sp. H13-3 possesses a linear chromosome (2.15 Mb) that is related to its reference replicon and features chromosomal and plasmid-like properties. The accessory plasmid pAspH13-3a (0.6 Mb) is only distantly related to the plasmid pAtC58 of A. tumefaciens C58 and shows a mosaic structure. A tumor-inducing Ti-plasmid is missing in the sequenced strain H13-3, indicating that it is a non-virulent isolate. Copyright © 2011 Elsevier B.V. All rights reserved.

  3. Genome sequence of vanilla distortion mosaic virus infecting Coriandrum sativum.

    Science.gov (United States)

    Adams, I P; Rai, S; Deka, M; Harju, V; Hodges, T; Hayward, G; Skelton, A; Fox, A; Boonham, N

    2014-12-01

    The 9573-nucleotide genome of a potyvirus was sequenced from a Coriandrum sativum plant from India with viral symptoms. On analysis, this virus was shown to have greater than 85 % nucleotide sequence identity to vanilla distortion mosaic virus (VDMV). Analysis of the putative coat protein sequence confirmed that this virus was in fact VDMV, with greater than 91 % amino acid sequence identity. The genome appears to encode a 3083-amino-acid polyprotein potentially cleaved into the 10 mature proteins expected in potyviruses. Phylogenetic analysis confirmed that VDMV is a distinct but ungrouped member of the genus Potyvirus.
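
    The identity thresholds quoted above (greater than 85 % nucleotide and greater than 91 % amino acid identity) are percentages of matching positions between aligned sequences. The sketch below shows one common way to compute such a value. It is an illustration only: the function name `percent_identity` is hypothetical, the sequences are assumed to be already aligned to equal length with `-` as the gap character, and the choice of denominator (all columns that are not gaps in both sequences) is an assumption, since the abstract does not state which definition the authors used.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a gap opposite
    a residue counts as a mismatch (one convention among several).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real VDMV data.
    print(round(percent_identity("ACGT-A", "ACGTTA"), 1))
```

    Other conventions divide by the length of the shorter sequence or exclude every gapped column, so reported identities are only comparable when the same definition is used.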

  4. Draft Genome Sequence of Type Strain Streptococcus gordonii ATCC 10558

    DEFF Research Database (Denmark)

    Rasmussen, Louise Hesselbjerg; Dargis, Rimtas; Christensen, Jens Jørgen Elmer

    2016-01-01

    Streptococcus gordonii ATCC 10558T was isolated from a patient with infective endocarditis in 1946 and announced as a type strain in 1989. Here, we report the 2,154,510-bp draft genome sequence of S. gordonii ATCC 10558T. This sequence will contribute to knowledge about the pathogenesis of infect...

  5. The Capsaspora genome reveals a complex unicellular prehistory of animals.

    Science.gov (United States)

    Suga, Hiroshi; Chen, Zehua; de Mendoza, Alex; Sebé-Pedrós, Arnau; Brown, Matthew W; Kramer, Eric; Carr, Martin; Kerner, Pierre; Vervoort, Michel; Sánchez-Pons, Núria; Torruella, Guifré; Derelle, Romain; Manning, Gerard; Lang, B Franz; Russ, Carsten; Haas, Brian J; Roger, Andrew J; Nusbaum, Chad; Ruiz-Trillo, Iñaki

    2013-01-01

    To reconstruct the evolutionary origin of multicellular animals from their unicellular ancestors, the genome sequences of diverse unicellular relatives are essential. However, only the genome of the choanoflagellate Monosiga brevicollis has been reported to date. Here we completely sequence the genome of the filasterean Capsaspora owczarzaki, the closest known unicellular relative of metazoans besides choanoflagellates. Analyses of this genome alter our understanding of the molecular complexity of metazoans' unicellular ancestors showing that they had a richer repertoire of proteins involved in cell adhesion and transcriptional regulation than previously inferred only with the choanoflagellate genome. Some of these proteins were secondarily lost in choanoflagellates. In contrast, most intercellular signalling systems controlling development evolved later concomitant with the emergence of the first metazoans. We propose that the acquisition of these metazoan-specific developmental systems and the co-option of pre-existing genes drove the evolutionary transition from unicellular protists to metazoans.

  6. Cancer Genome Sequencing and Its Implications for Personalized Cancer Vaccines

    International Nuclear Information System (INIS)

    Li, Lijin; Goedegebuure, Peter; Mardis, Elaine R.; Ellis, Matthew J.C.; Zhang, Xiuli; Herndon, John M.; Fleming, Timothy P.; Carreno, Beatriz M.; Hansen, Ted H.; Gillanders, William E.

    2011-01-01

    New DNA sequencing platforms have revolutionized human genome sequencing. The dramatic advances in genome sequencing technologies predict that the $1,000 genome will become a reality within the next few years. Applied to cancer, the availability of cancer genome sequences permits real-time decision-making with the potential to affect diagnosis, prognosis, and treatment, and has opened the door towards personalized medicine. A promising strategy is the identification of mutated tumor antigens, and the design of personalized cancer vaccines. Supporting this notion are preliminary analyses of the epitope landscape in breast cancer suggesting that individual tumors express significant numbers of novel antigens to the immune system that can be specifically targeted through cancer vaccines.

  7. Cancer Genome Sequencing and Its Implications for Personalized Cancer Vaccines

    Directory of Open Access Journals (Sweden)

    William E. Gillanders

    2011-11-01

    New DNA sequencing platforms have revolutionized human genome sequencing. The dramatic advances in genome sequencing technologies predict that the $1,000 genome will become a reality within the next few years. Applied to cancer, the availability of cancer genome sequences permits real-time decision-making with the potential to affect diagnosis, prognosis, and treatment, and has opened the door towards personalized medicine. A promising strategy is the identification of mutated tumor antigens, and the design of personalized cancer vaccines. Supporting this notion are preliminary analyses of the epitope landscape in breast cancer suggesting that individual tumors express significant numbers of novel antigens to the immune system that can be specifically targeted through cancer vaccines.

  8. Cancer Genome Sequencing and Its Implications for Personalized Cancer Vaccines

    Energy Technology Data Exchange (ETDEWEB)

    Li, Lijin [Department of Surgery, Washington University School of Medicine, St. Louis, MO 63110 (United States); Goedegebuure, Peter [Department of Surgery, Washington University School of Medicine, St. Louis, MO 63110 (United States); The Alvin J. Siteman Cancer Center at Barnes-Jewish Hospital and Washington University School of Medicine, St. Louis, MO 63110 (United States); Mardis, Elaine R. [The Alvin J. Siteman Cancer Center at Barnes-Jewish Hospital and Washington University School of Medicine, St. Louis, MO 63110 (United States); The Genome Institute at Washington University School of Medicine, St. Louis, MO 63108 (United States); Ellis, Matthew J.C. [The Alvin J. Siteman Cancer Center at Barnes-Jewish Hospital and Washington University School of Medicine, St. Louis, MO 63110 (United States); Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110 (United States); Zhang, Xiuli; Herndon, John M. [Department of Surgery, Washington University School of Medicine, St. Louis, MO 63110 (United States); Fleming, Timothy P. [Department of Surgery, Washington University School of Medicine, St. Louis, MO 63110 (United States); The Alvin J. Siteman Cancer Center at Barnes-Jewish Hospital and Washington University School of Medicine, St. Louis, MO 63110 (United States); Carreno, Beatriz M. [The Alvin J. Siteman Cancer Center at Barnes-Jewish Hospital and Washington University School of Medicine, St. Louis, MO 63110 (United States); Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110 (United States); Hansen, Ted H. [The Alvin J. Siteman Cancer Center at Barnes-Jewish Hospital and Washington University School of Medicine, St. Louis, MO 63110 (United States); Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO 63110 (United States); Gillanders, William E., E-mail: gillandersw@wudosis.wustl.edu [Department of Surgery, Washington University School of Medicine, St. Louis, MO 63110 (United States); The Alvin J. Siteman Cancer Center at Barnes-Jewish Hospital and Washington University School of Medicine, St. Louis, MO 63110 (United States)

    2011-11-25

    New DNA sequencing platforms have revolutionized human genome sequencing. The dramatic advances in genome sequencing technologies predict that the $1,000 genome will become a reality within the next few years. Applied to cancer, the availability of cancer genome sequences permits real-time decision-making with the potential to affect diagnosis, prognosis, and treatment, and has opened the door towards personalized medicine. A promising strategy is the identification of mutated tumor antigens, and the design of personalized cancer vaccines. Supporting this notion are preliminary analyses of the epitope landscape in breast cancer suggesting that individual tumors express significant numbers of novel antigens to the immune system that can be specifically targeted through cancer vaccines.

  9. Complete Genome Sequence of Mycobacterium phlei Type Strain RIVM601174

    KAUST Repository

    Abdallah, A. M.

    2012-05-24

    Mycobacterium phlei is a rapidly growing nontuberculous Mycobacterium species that is typically nonpathogenic, with few reported cases of human disease. Here we report the whole genome sequence of M. phlei type strain RIVM601174.

  10. Editing site analysis in a gymnosperm mitochondrial genome reveals similarities with angiosperm mitochondrial genomes

    OpenAIRE

    Salmans, Michael Lee; Chaw, Shu-Miaw; Lin, Ching-Ping; Shih, Arthur Chun-Chieh; Wu, Yu-Wei; Mulligan, R. Michael

    2010-01-01

    Sequence analysis of organelle genomes and comprehensive analysis of C-to-U editing sites from flowering and non-flowering plants have provided extensive sequence information from diverse taxa. This study includes the first comprehensive analysis of RNA editing sites from a gymnosperm mitochondrial genome, and utilizes informatics analyses to determine conserved features in the RNA sequence context around editing sites. We have identified 565 editing sites in 21 full-length and 4 partial cDNA...

  11. Draft Genome Sequences of 1,183 Salmonella Strains from the 100K Pathogen Genome Project.

    Science.gov (United States)

    Kong, Nguyet; Davis, Matthew; Arabyan, Narine; Huang, Bihua C; Weis, Allison M; Chen, Poyin; Thao, Kao; Ng, Whitney; Chin, Ning; Foutouhi, Soraya; Foutouhi, Azarene; Kaufman, James; Xie, Yi; Storey, Dylan B; Weimer, Bart C

    2017-07-13

    Salmonella is a common food-associated bacterium that has substantial impact on worldwide human health and the global economy. This is the public release of 1,183 Salmonella draft genome sequences as part of the 100K Pathogen Genome Project. These isolates represent global genomic diversity in the Salmonella genus. Copyright © 2017 Kong et al.

  12. A genome-wide analysis of FRT-like sequences in the human genome.

    Science.gov (United States)

    Shultz, Jeffry L; Voziyanova, Eugenia; Konieczka, Jay H; Voziyanov, Yuri

    2011-03-23

    Efficient and precise genome manipulations can be achieved by the Flp/FRT system of site-specific DNA recombination. Applications of this system are limited, however, to cases when target sites for Flp recombinase, FRT sites, are pre-introduced into a genome locale of interest. To expand use of the Flp/FRT system in genome engineering, variants of Flp recombinase can be evolved to recognize pre-existing genomic sequences that resemble FRT and thus can serve as recombination sites. To understand the distribution and sequence properties of genomic FRT-like sites, we performed a genome-wide analysis of FRT-like sites in the human genome using the experimentally-derived parameters. Out of 642,151 identified FRT-like sequences, 581,157 sequences were unique and 12,452 sequences had at least one exact duplicate. Duplicated FRT-like sequences are located mostly within LINE1, but also within LTRs of endogenous retroviruses, Alu repeats and other repetitive DNA sequences. The unique FRT-like sequences were classified based on the number of matches to FRT within the first four proximal base pairs of the Flp binding elements of FRT and the nature of mismatched base pairs in the same region. The data obtained will be useful for the emerging field of genome engineering.
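
    The split into unique and duplicated sequences can be illustrated with a small tally. The sketch below is not the authors' pipeline: the function name `summarize_duplicates` and the toy input are hypothetical, and it assumes "duplicate" means an exact, case-insensitive string match. Under that reading (an assumption, since the abstract does not spell it out), the 642,151 − 581,157 = 60,994 remaining occurrences would be copies of the 12,452 distinct sequences that occur more than once.

```python
from collections import Counter


def summarize_duplicates(sites):
    """Tally site sequences that occur once versus more than once.

    Returns the total number of occurrences, the number of sequences
    seen exactly once, and the number of distinct sequences seen
    two or more times.
    """
    counts = Counter(s.upper() for s in sites)
    unique = sum(1 for n in counts.values() if n == 1)
    duplicated = sum(1 for n in counts.values() if n > 1)
    return {
        "total": sum(counts.values()),
        "unique": unique,
        "duplicated": duplicated,
    }


if __name__ == "__main__":
    # Toy strings standing in for site sequences, not real FRT-like sites.
    toy_sites = ["GAAG", "GAAG", "TTCC", "ctag", "CTAG", "ATAT"]
    print(summarize_duplicates(toy_sites))
```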

  13. A genome-wide analysis of FRT-like sequences in the human genome.

    Directory of Open Access Journals (Sweden)

    Jeffry L Shultz

    2011-03-01

    Efficient and precise genome manipulations can be achieved by the Flp/FRT system of site-specific DNA recombination. Applications of this system are limited, however, to cases when target sites for Flp recombinase, FRT sites, are pre-introduced into a genome locale of interest. To expand use of the Flp/FRT system in genome engineering, variants of Flp recombinase can be evolved to recognize pre-existing genomic sequences that resemble FRT and thus can serve as recombination sites. To understand the distribution and sequence properties of genomic FRT-like sites, we performed a genome-wide analysis of FRT-like sites in the human genome using the experimentally-derived parameters. Out of 642,151 identified FRT-like sequences, 581,157 sequences were unique and 12,452 sequences had at least one exact duplicate. Duplicated FRT-like sequences are located mostly within LINE1, but also within LTRs of endogenous retroviruses, Alu repeats and other repetitive DNA sequences. The unique FRT-like sequences were classified based on the number of matches to FRT within the first four proximal base pairs of the Flp binding elements of FRT and the nature of mismatched base pairs in the same region. The data obtained will be useful for the emerging field of genome engineering.

  14. Bos taurus strain:dairy beef (cattle): 1000 Bull Genomes Run 2, Bovine Whole Genome Sequence

    NARCIS (Netherlands)

    Bouwman, A.C.; Daetwyler, H.D.; Chamberlain, Amanda J.; Ponce, Carla Hurtado; Sargolzaei, Mehdi; Schenkel, Flavio S.; Sahana, Goutam; Govignon-Gion, Armelle; Boitard, Simon; Dolezal, Marlies; Pausch, Hubert; Brøndum, Rasmus F.; Bowman, Phil J.; Thomsen, Bo; Guldbrandtsen, Bernt; Lund, Mogens S.; Servin, Bertrand; Garrick, Dorian J.; Reecy, James M.; Vilkki, Johanna; Bagnato, Alessandro; Wang, Min; Hoff, Jesse L.; Schnabel, Robert D.; Taylor, Jeremy F.; Vinkhuyzen, Anna A.E.; Panitz, Frank; Bendixen, Christian; Holm, Lars-Erik; Gredler, Birgit; Hozé, Chris; Boussaha, Mekki; Sanchez, Marie Pierre; Rocha, Dominique; Capitan, Aurelien; Tribout, Thierry; Barbat, Anne; Croiseau, Pascal; Drögemüller, Cord; Jagannathan, Vidhya; Vander Jagt, Christy; Crowley, John J.; Bieber, Anna; Purfield, Deirdre C.; Berry, Donagh P.; Emmerling, Reiner; Götz, Kay Uwe; Frischknecht, Mirjam; Russ, Ingolf; Sölkner, Johann; Tassell, van Curtis P.; Fries, Ruedi; Stothard, Paul; Veerkamp, R.F.; Boichard, Didier; Goddard, Mike E.; Hayes, Ben J.

    2014-01-01

    Whole genome sequence data (BAM format) of 234 bovine individuals aligned to UMD3.1. The aim of the study was to identify genetic variants (SNPs and indels) for downstream analysis such as imputation, GWAS, and detection of lethal recessives. Additional sequences for later 1000 bull genomes runs can

  15. The Phaeodactylum genome reveals the evolutionary history of diatom genomes

    Czech Academy of Sciences Publication Activity Database

    Bowler, Ch.; Allen, A. E.; Badger, J. H.; Grimwood, J.; Jabbari, K.; Kuo, A.; Maheswari, U.; Martens, C.; Maumus, F.; Otillar, R. P.; Rayko, E.; Salamov, A.; Vandepoele, K.; Beszteri, B.; Gruber, A.; Heijde, M.; Katinka, M.; Mock, T.; Valentin, K.; Verret, F.; Berges, J. A.; Brownlee, C.; Cadoret, J.-P.; Chiovitti, A.; Choi, Ch. J.; Coesel, S.; De Martino, A.; Detter, J. Ch.; Durkin, C.; Falciatore, A.; Fournet, J.; Haruta, M.; Huysman, M. J. J.; Jenkins, B. D.; Jiroutová, Kateřina; Jorgensen, R. E.; Joubert, Y.; Kaplan, A.; Kröger, N.; Kroth, P. G.; La Roche, J.; Lindquist, E.; Lommer, M.; Martin–Jézéquel, V.; Lopez, P. J.; Lucas, S.; Mangogna, M.; McGinnis, K.; Medlin, L. K.; Montsant, A.; Oudot–Le Secq, M.-P.; Napoli, C.; Oborník, Miroslav; Schnitzler Parker, M.; Petit, J.-L.; Porcel, B. M.; Poulsen, N.; Robison, M.; Rychlewski, L.; Rynearson, T. A.; Schmutz, J.; Shapiro, H.; Siaut, M.; Stanley, M.; Sussman, M. R.; Taylor, A. R.; Vardi, A.; von Dassow, P.; Vyverman, W.; Willis, A.; Wyrwicz, L. S.; Rokhsar, D. S.; Weissenbach, J.; Armbrust, E. V.; Green, B. R.; Van de Peer, Y.; Grigoriev, I. V.

    2008-01-01

    Vol. 456, 13-11-2008 (2008), pp. 239-244. ISSN 0028-0836. Institutional research plan: CEZ:AV0Z60220518. Keywords: Phaeodactylum * genome * evolution * diatom. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 31.434, year: 2008

  16. Fetal anatomy revealed with fast MR sequences.

    Science.gov (United States)

    Levine, D; Hatabu, H; Gaa, J; Atkinson, M W; Edelman, R R

    1996-10-01

    Although all the imaging studies in this pictorial essay were done for maternal rather than fetal indications, fetal anatomy was well visualized. However, when scans are undertaken for fetal indications, fetal motion in between scout views and imaging sequences may make specific image planes difficult to obtain. Of the different techniques described in this review, we preferred the HASTE technique and use it almost exclusively for scanning pregnant patients. The T2-weighting is ideal for delineating fetal organs. Also, the HASTE technique allows images to be obtained in 430 msec, limiting artifacts arising from maternal and fetal motion. MR imaging should play a more important role in evaluating equivocal sonographic cases as fast scanning techniques are more widely used. Obstetric MR imaging no longer will be limited by fetal motion artifacts. When complex anatomy requires definition in a complicated pregnant patient, MR imaging should be considered as a useful adjunct to sonography.

  17. Unique sequence features of the Human Adenovirus 31 complete genomic sequence are conserved in clinical isolates

    Directory of Open Access Journals (Sweden)

    Darr Sebastian

    2009-11-01

    Background: Human adenoviruses (HAdV) cause a broad spectrum of diseases. One of the most severe forms of adenovirus infection is a disseminated disease resulting in significant morbidity and mortality. Several reports in recent years have identified HAdV-31 from species A (HAdV-A31) as a cause of disseminated disease in children following haematopoietic stem cell transplantation (hSCT) and liver transplantation. We sequenced and analyzed the complete genome of the HAdV-A31 prototype strain to uncover unique sequence motifs associated with its high virulence. Moreover, we sequenced coding regions known to be essential for tropism and virulence (early transcription units E1A, E3, E4, the fiber knob and the penton base) of HAdV-A31 clinical isolates from patients with disseminated disease. Results: The HAdV-A31 genome is 33763 base pairs (bp) long with a GC content of 46.36%. Nucleotide alignment to the closely related HAdV-A12 revealed an overall homology of 84.2%. The genome organization into early, intermediate and late regions is similar to HAdV-A12. Sequence analysis of the prototype strain showed unique sequence features such as an immunoglobulin-like domain in the species A specific gene product E3 CR1 beta and a potentially integrin binding RGD motif in the C-terminal region of the protein IX. These features were conserved in all analyzed clinical isolates. Overall, amino acid sequences of clinical isolates were highly conserved compared to the prototype (99.2 to 100%), but a synonymous/non synonymous ratio (S/N) of 2.36 in E3 CR1 beta suggested positive selection. Conclusion: Unique sequence features of HAdV-A31 may enhance its ability to escape the host's immune surveillance and may facilitate a promiscuous tropism for various tissues. Moderate evolution of clinical isolates did not indicate the emergence of new HAdV-A31 subtypes in recent years.

  18. Unique sequence features of the Human adenovirus 31 complete genomic sequence are conserved in clinical isolates.

    Science.gov (United States)

    Hofmayer, Soeren; Madisch, Ijad; Darr, Sebastian; Rehren, Fabienne; Heim, Albert

    2009-11-25

    Human adenoviruses (HAdV) cause a broad spectrum of diseases. One of the most severe forms of adenovirus infection is a disseminated disease resulting in significant morbidity and mortality. Several reports in recent years have identified HAdV-31 from species A (HAdV-A31) as a cause of disseminated disease in children following haematopoietic stem cell transplantation (hSCT) and liver transplantation. We sequenced and analyzed the complete genome of the HAdV-A31 prototype strain to uncover unique sequence motifs associated with its high virulence. Moreover, we sequenced coding regions known to be essential for tropism and virulence (early transcription units E1A, E3, E4, the fiber knob and the penton base) of HAdV-A31 clinical isolates from patients with disseminated disease. The HAdV-A31 genome is 33763 base pairs (bp) long with a GC content of 46.36%. Nucleotide alignment to the closely related HAdV-A12 revealed an overall homology of 84.2%. The genome organization into early, intermediate and late regions is similar to HAdV-A12. Sequence analysis of the prototype strain showed unique sequence features such as an immunoglobulin-like domain in the species A specific gene product E3 CR1 beta and a potentially integrin binding RGD motif in the C-terminal region of the protein IX. These features were conserved in all analyzed clinical isolates. Overall, amino acid sequences of clinical isolates were highly conserved compared to the prototype (99.2 to 100%), but a synonymous/non synonymous ratio (S/N) of 2.36 in E3 CR1 beta suggested positive selection. Unique sequence features of HAdV-A31 may enhance its ability to escape the host's immune surveillance and may facilitate a promiscuous tropism for various tissues. Moderate evolution of clinical isolates did not indicate the emergence of new HAdV-A31 subtypes in recent years.
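
    A GC content such as the 46.36% reported above is the share of G and C bases among the called bases of a sequence. The sketch below shows that calculation on a toy string. The function name `gc_content` is hypothetical, and excluding ambiguous symbols such as N from the denominator is an assumption, since the abstract does not say how such positions were treated.

```python
def gc_content(seq: str) -> float:
    """GC content in percent, computed over unambiguous A/C/G/T bases only."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    called = gc + at
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / called


if __name__ == "__main__":
    # Toy sequence, not HAdV-A31 data.
    print(round(gc_content("ATGCGCNNAT"), 2))
```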

  19. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity.

    Science.gov (United States)

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F

    2015-04-28

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis were performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery.
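
    Rarefaction, as mentioned above, asks how the number of distinct items observed grows as more samples are added; a curve that keeps rising suggests the sampled population is open. The sketch below is a generic illustration under stated assumptions, not the method used in the study: the function name `rarefaction_curve`, the representation of each genome as a set of gene-family labels, and the averaging over random sampling orders are all choices made here for the example.

```python
import random


def rarefaction_curve(genomes, permutations=100, seed=0):
    """Mean number of distinct gene families seen after sampling 1..n genomes.

    `genomes` is a list of sets of gene-family labels. The genomes are
    shuffled `permutations` times and the cumulative family count is
    averaged at each sampling depth.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(genomes)
    totals = [0] * n
    for _ in range(permutations):
        order = list(genomes)
        rng.shuffle(order)
        seen = set()
        for depth, families in enumerate(order):
            seen |= set(families)
            totals[depth] += len(seen)
    return [t / permutations for t in totals]


if __name__ == "__main__":
    # Toy gene-family sets, not real mycobacteriophage data.
    toy = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
    print(rarefaction_curve(toy))
```

    If the averaged curve flattens as the sampling depth approaches the full data set, new genomes add few new families; a curve that is still climbing at the last point is the pattern the abstract describes as a population that is not closed.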

  20. The sequencing of the complete genome of a Tomato black ring virus (TBRV) and of the RNA2 of three Grapevine chrome mosaic virus (GCMV) isolates from grapevine reveals the possible recombinant origin of GCMV.

    Science.gov (United States)

    Digiaro, M; Yahyaoui, E; Martelli, G P; Elbeaino, T

    2015-02-01

    The complete genome of a Tomato black ring virus isolate (TBRV-Mirs) (RNA1, 7,366 nt and RNA2, 4,640 nt) and the RNA2 sequences (4,437; 4,445; and 4,442 nts) of three Grapevine chrome mosaic virus isolates (GCMV-H6, -H15, and -H27) were determined. All RNAs contained a single open reading frame encoding polyproteins of 254 kDa (p1) and 149 kDa (p2) for TBRV-Mirs RNA1 and RNA2, respectively, and 146 kDa for GCMV RNA2. p1 of TBRV-Mirs showed the highest identity with TBRV-MJ (94 %), Beet ringspot virus (BRSV, 82 %), and Grapevine Anatolian ringspot virus (GARSV, 66 %), while p2 showed the highest identity with TBRV isolates MJ (89 %) and ED (85 %), followed by BRSV (65 %), GCMV (58 %), and GARSV (57 %). The amino acid identity of RNA2 sequences of four GCMV isolates (three from this study and one from GenBank) ranged from 91 to 98 %, the homing protein being the most variable. The RDP3 program predicted putative intra-species recombination events for GCMV-H6 and recognized GCMV as a putative inter-species recombinant between GARSV and TBRV. In both cases, the recombination events were at the movement protein level.

  1. Advancing Eucalyptus Genomics: Cytogenomics Reveals Conservation of Eucalyptus Genomes

    Science.gov (United States)

    Ribeiro, Teresa; Barrela, Ricardo M.; Bergès, Hélène; Marques, Cristina; Loureiro, João; Morais-Cecílio, Leonor; Paiva, Jorge A. P.

    2016-01-01

    The genus Eucalyptus comprises several species with high ecological and economic value, the subgenus Symphyomyrtus being one of the most important. Species such as E. grandis and E. globulus are well characterized at the molecular level but knowledge regarding genome and chromosome organization is very scarce. Here we characterized and compared the karyotypes of three economically important species, E. grandis, E. globulus, and E. camaldulensis, and three with ecological relevance, E. pulverulenta, E. cornuta, and E. occidentalis, through an integrative approach including genome size estimation, fluorochrome banding, rDNA FISH, and BAC landing comprising genes involved in lignin biosynthesis. All karyotypes show a high degree of conservation with pericentromeric 35S and 5S rDNA loci in the first and third pairs, respectively. GC-rich heterochromatin was restricted to the 35S rDNA locus while the AT-rich heterochromatin pattern was species-specific. The slight differences in karyotype formulas and distribution of AT-rich heterochromatin, along with genome size estimations, support the idea of Eucalyptus genome evolution by local expansions of heterochromatin clusters. The unusual co-localization of both rDNA with AT-rich heterochromatin was attributed mainly to the presence of silent transposable elements in those loci. The cinnamoyl CoA reductase gene (CCR1) previously assigned to linkage group 10 (LG10) was clearly localized distally at the long arm of chromosome 9, establishing an unexpected correlation between the cytogenetic chromosome 9 and the LG10. Our work is novel and contributes to the understanding of Eucalyptus genome organization, which is essential to develop successful advanced breeding strategies for this genus. PMID:27148332

  2. Genome Sequence of the Biocontrol Strain Pseudomonas fluorescens F113

    Science.gov (United States)

    Redondo-Nieto, Miguel; Barret, Matthieu; Morrisey, John P.; Germaine, Kieran; Martínez-Granero, Francisco; Barahona, Emma; Navazo, Ana; Sánchez-Contreras, María; Moynihan, Jennifer A.; Giddens, Stephen R.; Coppoolse, Eric R.; Muriel, Candela; Stiekema, Willem J.; Rainey, Paul B.; Dowling, David; O'Gara, Fergal; Martín, Marta

    2012-01-01

    Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms. PMID:22328765

  3. Genome Sequence of the Milbemycin-Producing Bacterium Streptomyces bingchenggensis

    OpenAIRE

    Wang, Xiang-Jing; Yan, Yi-Jun; Zhang, Bo; An, Jing; Wang, Ji-Jia; Tian, Jun; Jiang, Ling; Chen, Yi-Hua; Huang, Sheng-Xiong; Yin, Min; Zhang, Ji; Gao, Ai-Li; Liu, Chong-Xi; Zhu, Zhao-Xiang; Xiang, Wen-Sheng

    2010-01-01

    Streptomyces bingchenggensis is a soil-dwelling bacterium producing the commercially important anthelmintic macrolide milbemycins. Besides milbemycins, the insecticidal polyether antibiotic nanchangmycin and some other antibiotics have also been isolated from this strain. Here we report the complete genome sequence of S. bingchenggensis. The availability of the genome sequence of S. bingchenggensis should enable us to understand the biosynthesis of these structurally intricate antibiotics bet...

  4. Genome sequence of Pantoea agglomerans strain IG1.

    Science.gov (United States)

    Matsuzawa, Tomohiko; Mori, Kazuki; Kadowaki, Takeshi; Shimada, Misato; Tashiro, Kosuke; Kuhara, Satoru; Inagawa, Hiroyuki; Soma, Gen-ichiro; Takegawa, Kaoru

    2012-03-01

    Pantoea agglomerans is a gram-negative bacterium that grows symbiotically with various plants. Here we report the 4.8-Mb genome sequence of P. agglomerans strain IG1. The lipopolysaccharides derived from P. agglomerans IG1 have been shown to be effective in the prevention of various diseases, such as bacterial or viral infection and lifestyle-related diseases. This genome sequence represents a substantial step toward the elucidation of pathways for production of lipopolysaccharides.

  5. Whole genome and transcriptome sequencing of a B3 thymoma.

    Directory of Open Access Journals (Sweden)

    Iacopo Petrini

    Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma.

  6. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  7. Draft genome sequence of Penicillium marneffei strain PM1.

    Science.gov (United States)

    Woo, Patrick C Y; Lau, Susanna K P; Liu, Bin; Cai, James J; Chong, Ken T K; Tse, Herman; Kao, Richard Y T; Chan, Che-Man; Chow, Wang-Ngai; Yuen, Kwok-Yung

    2011-12-01

    Penicillium marneffei is the most important thermal dimorphic, pathogenic fungus endemic in China and Southeast Asia and is particularly important in HIV-positive patients. We report the 28,887,485-bp draft genome sequence of P. marneffei, which contains its complete mitochondrial genome, sexual cycle genes, a high diversity of Mp1p homologues, and polyketide synthase genes.

  8. Whole-Genome Sequences of Three Symbiotic Endozoicomonas Bacteria

    KAUST Repository

    Neave, Matthew J.

    2014-08-14

    Members of the genus Endozoicomonas associate with a wide range of marine organisms. Here, we report on the whole-genome sequencing, assembly, and annotation of three Endozoicomonas type strains. These data will assist in exploring interactions between Endozoicomonas organisms and their hosts, and it will aid in the assembly of genomes from uncultivated Endozoicomonas spp.

  9. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome seque...

  10. Genome sequences of Listeria monocytogenes strains with resistance to arsenic

    Science.gov (United States)

    Listeria monocytogenes frequently exhibits resistance to arsenic. We report here the draft genome sequences of eight genetically diverse arsenic-resistant L. monocytogenes strains from human listeriosis and food-associated environments. Availability of these genomes would help to elucidate the role ...

  11. A bibliometric analysis of global research on genome sequencing ...

    African Journals Online (AJOL)

    The results show that disease and protein related researches were the leading research focuses, and comparative genomics and evolution related research had strong potential in the near future. Key words: Genome sequencing, research trend, scientometrics, science citation index expanded (SCI-Expanded), word cluster ...

  12. Complete Genome Sequence of Pediococcus pentosaceus Strain SL4

    DEFF Research Database (Denmark)

    Dantoft, Shruti Harnal; Bielak, Eliza Maria; Seo, Jae-Gu

    2013-01-01

    Pediococcus pentosaceus SL4 was isolated from a Korean fermented vegetable product, kimchi. We report here the whole-genome sequence (WGS) of P. pentosaceus SL4. The genome consists of a 1.79-Mb circular chromosome (G+C content of 37.3%) and seven distinct plasmids ranging in size from 4 kb to 50...

  13. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis

    DEFF Research Database (Denmark)

    Carlton, Jane M.; Hirt, Robert P.; Silva, Joana C.

    2007-01-01

    We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the approximately 160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunct...

  14. Complete genome sequence of pronghorn virus, a pestivirus

    Science.gov (United States)

    The complete genome sequence of Pronghorn virus, a member of the Pestivirus genus of the Flaviviridae, was determined. The virus, originally isolated from a pronghorn antelope, had a genome of 12,287 nucleotides with a single open reading frame of 11,694 bases encoding 3898 amino acids....
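    The figures in this record can be cross-checked with simple arithmetic: an open reading frame of 11,694 bases contains 11,694 / 3 = 3,898 codons, which matches the 3898 amino acids reported. The short sketch below does that check. The function name is hypothetical, and the record does not say whether a stop codon is counted in the ORF length, so the function only reports the codon count.

```python
def orf_codons(orf_length_nt: int) -> int:
    """Return the number of codons in an ORF of the given nucleotide length."""
    if orf_length_nt <= 0:
        raise ValueError("ORF length must be positive")
    if orf_length_nt % 3 != 0:
        # A complete reading frame is a whole number of 3-base codons.
        raise ValueError("ORF length is not a multiple of 3")
    return orf_length_nt // 3


# ORF length taken from the record above.
print(orf_codons(11_694))  # 3898
```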

  15. Algal genomes reveal evolutionary mosaicism and the fate of nucleomorphs

    Energy Technology Data Exchange (ETDEWEB)

    Curtis, Bruce A.; Tanifuji, Goro; Burki, Fabien; Gruber, Ansgar; Irimia, Manuel; Maruyama, Shinichiro; Arias, Maria C.; Ball, Steven G.; Gile, Gillian H.; Hirakawa, Yoshihisa; Hopkins, Julia F.; Kuo, Alan; Rensing, Stefan A.; Schmutz, Jeremy; Symeonidi, Aikaterini; Elias, Marek; Eveleigh, Robert J. M.; Herman, Emily K.; Klute, Mary J.; Nakayama, Takuro; Obornik, Miroslav; Reyes-Prieto, Adrian; Armbrust, E. Virginia; Aves, Stephen J.; Beiko, Robert G.; Coutinho, Pedro; Dacks, Joel B.; Durnford, Dion G.; Fast, Naomi M.; Green, Beverley R.; Grisdale, Cameron J.; Hempel, Franziska; Henrissat, Bernard; Hoppner, Marc P.; Ishida, Ken-Ichiro; Kim, Eunsoo; Koreny, Ludek; Kroth, Peter G.; Liu, Yuan; Malik, Shehre-Banoo; Maier, Uwe G.; McRose, Darcy; Mock, Thomas; Neilson, Jonathan A. D.; Onodera, Naoko T.; Poole, Anthony M.; Pritham, Ellen J.; Richards, Thomas A.; Rocap, Gabrielle; Roy, Scott W.; Sarai, Chihiro; Schaack, Sarah; Shirato, Shu; Slamovits, Claudio H.; Spencer, Davie F.; Suzuki, Shigekatsu; Worden, Alexandra Z.; Zauner, Stefan; Barry, Kerrie; Bell, Callum; Bharti, Arvind K.; Crow, John A.; Grimwood, Jane; Kramer, Robin; Lindquist, Erika; Lucas, Susan; Salamov, Asaf; McFadden, Geoffrey I.; Lane, Christopher E.; Keeling, Patrick J.; Gray, Michael W.; Grigoriev, Igor V.; Archibald, John M.

    2012-08-10

    Cryptophyte and chlorarachniophyte algae are transitional forms in the widespread secondary endosymbiotic acquisition of photosynthesis by engulfment of eukaryotic algae. Unlike most secondary plastid-bearing algae, miniaturized versions of the endosymbiont nuclei (nucleomorphs) persist in cryptophytes and chlorarachniophytes. To determine why, and to address other fundamental questions about eukaryote-eukaryote endosymbiosis, we sequenced the nuclear genomes of the cryptophyte Guillardia theta and the chlorarachniophyte Bigelowiella natans. Both genomes have 21,000 protein genes and are intron rich, and B. natans exhibits unprecedented alternative splicing for a single-celled organism. Phylogenomic analyses and subcellular targeting predictions reveal extensive genetic and biochemical mosaicism, with both host- and endosymbiont-derived genes servicing the mitochondrion, the host cell cytosol, the plastid and the remnant endosymbiont cytosol of both algae. Mitochondrion-to-nucleus gene transfer still occurs in both organisms but plastid-to-nucleus and nucleomorph-to-nucleus transfers do not, which explains why a small residue of essential genes remains locked in each nucleomorph.

  16. Genome sequence of the olive tree, Olea europaea.

    Science.gov (United States)

    Cruz, Fernando; Julca, Irene; Gómez-Garrido, Jèssica; Loska, Damian; Marcet-Houben, Marina; Cano, Emilio; Galán, Beatriz; Frias, Leonor; Ribeca, Paolo; Derdak, Sophia; Gut, Marta; Sánchez-Fernández, Manuel; García, Jose Luis; Gut, Ivo G; Vargas, Pablo; Alioto, Tyler S; Gabaldón, Toni

    2016-06-27

    The Mediterranean olive tree (Olea europaea subsp. europaea) was one of the first trees to be domesticated and is currently of major agricultural importance in the Mediterranean region as the source of olive oil. The molecular bases underlying the phenotypic differences among domesticated cultivars, or between domesticated olive trees and their wild relatives, remain poorly understood. Both wild and cultivated olive trees have 46 chromosomes (2n). A total of 543 Gb of raw DNA sequence from whole genome shotgun sequencing, and a fosmid library containing 155,000 clones from a 1,000+ year-old olive tree (cv. Farga) were generated by Illumina sequencing using different combinations of mate-pair and pair-end libraries. Assembly gave a final genome with a scaffold N50 of 443 kb, and a total length of 1.31 Gb, which represents 95 % of the estimated genome length (1.38 Gb). In addition, the associated fungus Aureobasidium pullulans was partially sequenced. Genome annotation, assisted by RNA sequencing from leaf, root, and fruit tissues at various stages, resulted in 56,349 unique protein coding genes, suggesting recent genomic expansion. Genome completeness, as estimated using the CEGMA pipeline, reached 98.79 %. The assembled draft genome of O. europaea will provide a valuable resource for the study of the evolution and domestication processes of this important tree, and allow determination of the genetic bases of key phenotypic traits. Moreover, it will enhance breeding programs and the formation of new varieties.
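    The scaffold N50 quoted in this record is a standard assembly statistic: the length L such that scaffolds of length L or longer together cover at least half of the assembly. The sketch below is not taken from the paper. It is a minimal illustration of the usual definition, with a hypothetical function name and toy scaffold lengths, followed by a check of the reported assembled fraction (1.31 Gb out of an estimated 1.38 Gb).

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length
    raise ValueError("no sequence lengths given")


# Hypothetical toy scaffold lengths (not data from the olive assembly).
toy_scaffolds = [2, 3, 4, 5, 6]
print(n50(toy_scaffolds))  # 5

# Assembled fraction from the figures in the record, as a rounded percentage.
print(round(1.31 / 1.38 * 100))  # 95
```

    Sorting in descending order and stopping at the first scaffold that pushes the running total past half the assembly keeps the calculation to a single pass after the sort.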

  17. Comparative sequence analyses of genome and transcriptome ...

    Indian Academy of Sciences (India)

    /fulltext/jbsc/040/05/0891-0907. Keywords. Asian elephant; comparative genomics; gene prediction; transcriptome. Abstract. The Asian elephant Elephas maximus and the African elephant Loxodonta africana that diverged 5-7 million years ...

  18. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks

    OpenAIRE

    Rusconi Brigida; Sanjar Fatemeh; Koenig Sara SK; Mammel Mark K; Tarr Phillip I; Eppinger Mark

    2016-01-01

    Multi-isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and m...

  19. Molecular cloning and organization of two leghaemoglobin genomic sequences of soybean

    Science.gov (United States)

    Sullivan, D.; Brisson, N.; Goodchild, B.; Verma, D. P. S.

    1981-02-01

    The leghaemoglobins (Lb) are myoglobin-like proteins found in all nitrogen-fixing root nodules of legumes1-3. They are encoded by plant nuclear genes4 which are specifically induced and form the predominant protein in nodules developed in symbiosis with the appropriate species of Rhizobium. The Lb is located in the host-cell cytoplasm of the infected cell5 and is thought to facilitate oxygen diffusion6,7. Amino acid sequencing of the soybean Lbs has revealed at least four primary structures differing only in a few amino acids8-10. We have previously estimated about 40 copies of Lb sequences in the soybean (Glycine max L.) genome by cDNA hybridization4. To investigate Lb gene organization and function, we prepared and characterized a Lb cDNA recombinant molecule, pLb1, and used it to isolate two genomic Lb sequences from a library constructed in Charon 4. We report here that the organization of the two genomic Lb sequences is quite distinct and one of them seems to have an intervening sequence(s). Hybridization of pLb1 with genomic DNA from various tissues showed that Lb sequences are dispersed through more than 30 kilobases of genomic DNA and that there is no apparent sequence rearrangement or methylation changes following induction of Lb genes.

  20. Simultaneous Whole Mitochondrial Genome Sequencing with Short Overlapping Amplicons Suitable for Degraded DNA Using the Ion Torrent Personal Genome Machine.

    Science.gov (United States)

    Chaitanya, Lakshmi; Ralf, Arwin; van Oven, Mannis; Kupiec, Tomasz; Chang, Joseph; Lagacé, Robert; Kayser, Manfred

    2015-12-01

    Whole mitochondrial (mt) genome analysis enables a considerable increase in analysis throughput, and improves the discriminatory power to the maximum possible phylogenetic resolution. Most established protocols on the different massively parallel sequencing (MPS) platforms, however, invariably involve the PCR amplification of large fragments, typically several kilobases in size, which may fail due to mtDNA fragmentation in the available degraded materials. We introduce an MPS tiling approach for simultaneous whole human mt genome sequencing using 161 short overlapping amplicons (average 200 bp) with the Ion Torrent Personal Genome Machine. We illustrate the performance of this new method by sequencing 20 DNA samples belonging to different worldwide mtDNA haplogroups. Additional quality control, particularly regarding the potential detection of nuclear insertions of mtDNA (NUMTs), was performed by comparative MPS analysis using the conventional long-range amplification method. Preliminary sensitivity testing revealed that detailed haplogroup inference was feasible with 100 pg genomic input DNA. Complete mt genome coverage was achieved from DNA samples experimentally degraded down to genomic fragment sizes of about 220 bp, and up to 90% coverage from naturally degraded samples. Overall, we introduce a new approach for whole mt genome MPS analysis from degraded and nondegraded materials relevant to resolve and infer maternal genetic ancestry at complete resolution in anthropological, evolutionary, medical, and forensic applications. © 2015 The Authors. Human Mutation published by Wiley Periodicals, Inc.

  1. First fungal genome sequence from Africa: A preliminary analysis

    Directory of Open Access Journals (Sweden)

    Rene Sutherland

    2012-01-01

    Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the ‘tsunami’ of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is less than that of the related species, Fusarium oxysporum, but more than that for Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumonisin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists...

  2. Symbiodinium genomes reveal adaptive evolution of functions related to symbiosis

    KAUST Repository

    Liu, Huanle

    2017-10-06

    Symbiosis between dinoflagellates of the genus Symbiodinium and reef-building corals forms the trophic foundation of the world's coral reef ecosystems. Here we present the first draft genome of Symbiodinium goreaui (Clade C, type C1: 1.03 Gbp), one of the most ubiquitous endosymbionts associated with corals, and an improved draft genome of Symbiodinium kawagutii (Clade F, strain CS-156: 1.05 Gbp), previously sequenced as strain CCMP2468, to further elucidate genomic signatures of this symbiosis. Comparative analysis of four available Symbiodinium genomes against other dinoflagellate genomes led to the identification of 2460 nuclear gene families that show evidence of positive selection, including genes involved in photosynthesis, transmembrane ion transport, synthesis and modification of amino acids and glycoproteins, and stress response. Further, we identified extensive sets of genes for meiosis and response to light stress. These draft genomes provide a foundational resource for advancing our understanding of Symbiodinium biology and the coral-algal symbiosis.

  3. Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.

    Science.gov (United States)

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mitsuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with the N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species.

  4. Genomic Analysis of the Basal Lineage Fungus Rhizopus oryzae Reveals a Whole-Genome Duplication

    Science.gov (United States)

    Ma, Li-Jun; Ibrahim, Ashraf S.; Skory, Christopher; Grabherr, Manfred G.; Burger, Gertraud; Butler, Margi; Elias, Marek; Idnurm, Alexander; Lang, B. Franz; Sone, Teruo; Abe, Ayumi; Calvo, Sarah E.; Corrochano, Luis M.; Engels, Reinhard; Fu, Jianmin; Hansberg, Wilhelm; Kim, Jung-Mi; Kodira, Chinnappa D.; Koehrsen, Michael J.; Liu, Bo; Miranda-Saavedra, Diego; O'Leary, Sinead; Ortiz-Castellanos, Lucila; Poulter, Russell; Rodriguez-Romero, Julio; Ruiz-Herrera, José; Shen, Yao-Qing; Zeng, Qiandong; Galagan, James; Birren, Bruce W.

    2009-01-01

    Rhizopus oryzae is the primary cause of mucormycosis, an emerging, life-threatening infection characterized by rapid angioinvasive growth with an overall mortality rate that exceeds 50%. As a representative of the paraphyletic basal group of the fungal kingdom called “zygomycetes,” R. oryzae is also used as a model to study fungal evolution. Here we report the genome sequence of R. oryzae strain 99–880, isolated from a fatal case of mucormycosis. The highly repetitive 45.3 Mb genome assembly contains abundant transposable elements (TEs), comprising approximately 20% of the genome. We predicted 13,895 protein-coding genes not overlapping TEs, many of which are paralogous gene pairs. The order and genomic arrangement of the duplicated gene pairs and their common phylogenetic origin provide evidence for an ancestral whole-genome duplication (WGD) event. The WGD resulted in the duplication of nearly all subunits of the protein complexes associated with respiratory electron transport chains, the V-ATPase, and the ubiquitin–proteasome systems. The WGD, together with recent gene duplications, resulted in the expansion of multiple gene families related to cell growth and signal transduction, as well as secreted aspartic protease and subtilase protein families, which are known fungal virulence factors. The duplication of the ergosterol biosynthetic pathway, especially the major azole target, lanosterol 14α-demethylase (ERG11), could contribute to the variable responses of R. oryzae to different azole drugs, including voriconazole and posaconazole. Expanded families of cell-wall synthesis enzymes, essential for fungal cell integrity but absent in mammalian hosts, reveal potential targets for novel and R. oryzae-specific diagnostic and therapeutic treatments. PMID:19578406

  5. A parts list for fungal cellulosomes revealed by comparative genomics

    Energy Technology Data Exchange (ETDEWEB)

    Haitjema, Charles H.; Gilmore, Sean P.; Henske, John K.; Solomon, Kevin V.; de Groot, Randall; Kuo, Alan; Mondo, Stephen J.; Salamov, Asaf A.; LaButti, Kurt; Zhao, Zhiying; Chiniquy, Jennifer; Barry, Kerrie; Brewer, Heather M.; Purvine, Samuel O.; Wright, Aaron T.; Hainaut, Matthieu; Boxma, Brigitte; van Alen, Theo; Hackstein, Johannes H. P.; Henrissat, Bernard; Baker, Scott E.; Grigoriev, Igor V.; O'Malley, Michelle A.

    2017-05-26

    Cellulosomes are large, multi-protein complexes that tether plant biomass degrading enzymes together for improved hydrolysis1. These complexes were first described in anaerobic bacteria where species specific dockerin domains mediate assembly of enzymes onto complementary cohesin motifs interspersed within non-catalytic protein scaffolds1. The versatile protein assembly mechanism conferred by the bacterial cohesin-dockerin interaction is now a standard design principle for synthetic protein-scale pathways2,3. For decades, analogous structures have been reported in the early branching anaerobic fungi, which are known to assemble by sequence divergent non-catalytic dockerin domains (NCDD)4. However, the enzyme components, modular assembly mechanism, and functional role of fungal cellulosomes remain unknown5,6. Here, we describe the comprehensive set of proteins critical to fungal cellulosome assembly, including novel, conserved scaffolding proteins unique to the Neocallimastigomycota. High quality genomes of the anaerobic fungi Anaeromyces robustus, Neocallimastix californiae and Piromyces finnis were assembled with long-read, single molecule technology to overcome their repeat-richness and extremely low GC content. Genomic analysis coupled with proteomic validation revealed an average 320 NCDD-containing proteins per fungal strain that were overwhelmingly carbohydrate active enzymes (CAZymes), with 95 large fungal scaffoldins identified across 4 genera that contain a conserved amino acid sequence repeat that binds to NCDDs. Fungal dockerin and scaffoldin domains have no similarity to their bacterial counterparts, yet several catalytic domains originated via horizontal gene transfer with gut bacteria. Though many catalytic domains are shared with bacteria, the biocatalytic activity of anaerobic fungi is expanded by the inclusion of GH3, GH6, and GH45 enzymes in the enzyme complexes. 
    Collectively, these findings suggest that the fungal cellulosome is an evolutionarily...

  6. Genome sequence of the date palm Phoenix dactylifera L.

    Science.gov (United States)

    Al-Mssallem, Ibrahim S; Hu, Songnian; Zhang, Xiaowei; Lin, Qiang; Liu, Wanfei; Tan, Jun; Yu, Xiaoguang; Liu, Jiucheng; Pan, Linlin; Zhang, Tongwu; Yin, Yuxin; Xin, Chengqi; Wu, Hao; Zhang, Guangyu; Ba Abdullah, Mohammed M; Huang, Dawei; Fang, Yongjun; Alnakhli, Yasser O; Jia, Shangang; Yin, An; Alhuzimi, Eman M; Alsaihati, Burair A; Al-Owayyed, Saad A; Zhao, Duojun; Zhang, Sun; Al-Otaibi, Noha A; Sun, Gaoyuan; Majrashi, Majed A; Li, Fusen; Tala; Wang, Jixiang; Yun, Quanzheng; Alnassar, Nafla A; Wang, Lei; Yang, Meng; Al-Jelaify, Rasha F; Liu, Kan; Gao, Shenghan; Chen, Kaifu; Alkhaldi, Samiyah R; Liu, Guiming; Zhang, Meng; Guo, Haiyan; Yu, Jun

    2013-01-01

    Date palm (Phoenix dactylifera L.) is a cultivated woody plant species with agricultural and economic importance. Here we report a genome assembly for an elite variety (Khalas), which is 605.4 Mb in size and covers >90% of the genome (~671 Mb) and >96% of its genes (~41,660 genes). Genomic sequence analysis demonstrates that P. dactylifera experienced a clear genome-wide duplication after either ancient whole genome duplications or massive segmental duplications. Genetic diversity analysis indicates that its stress resistance and sugar metabolism-related genes tend to be enriched in the chromosomal regions where the density of single-nucleotide polymorphisms is relatively low. Using transcriptomic data, we also illustrate the date palm's unique sugar metabolism that underlies fruit development and ripening. Our large-scale genomic and transcriptomic data pave the way for further genomic studies not only on P. dactylifera but also other Arecaceae plants.

  7. Global Genomic Diversity of Human Papillomavirus 11 Based on 433 Isolates and 78 Complete Genome Sequences

    Science.gov (United States)

    Jelen, Mateja M.; Chen, Zigui; Kocjan, Boštjan J.; Hošnjak, Lea; Burt, Felicity J.; Chan, Paul K. S.; Chouhy, Diego; Combrinck, Catharina E.; Estrade, Christine; Fiander, Alison; Garland, Suzanne M.; Giri, Adriana A.; González, Joaquín Víctor; Gröning, Arndt; Hibbitts, Sam; Luk, Tommy N. M.; Marinic, Karina; Matsukura, Toshihiko; Neumann, Anna; Oštrbenk, Anja; Picconi, Maria Alejandra; Sagadin, Martin; Sahli, Roland; Seedat, Riaz Y.; Seme, Katja; Severini, Alberto; Sinchi, Jessica L.; Smahelova, Jana; Tabrizi, Sepehr N.; Tachezy, Ruth; Tohme Faybush, Sarah; Uloza, Virgilijus; Uloziene, Ingrida; Wong, Yong Wee; Židovec Lepej, Snježana; Burk, Robert D.

    2016-01-01

    ABSTRACT Human papillomavirus 11 (HPV11) is an etiological agent of anogenital warts and laryngeal papillomas and is included in the 4-valent and 9-valent prophylactic HPV vaccines. We established the largest collection of globally circulating HPV11 isolates to date and examined the genomic diversity of 433 isolates and 78 complete genomes (CGs) from six continents. The genomic variation within the 2,800-bp E5a-E5b-L1-upstream regulatory region was initially studied in 181/207 (87.4%) HPV11 isolates collected for this study. Of these, the CGs of 30 HPV11 variants containing unique single nucleotide polymorphisms (SNPs), indels (insertions or deletions), or amino acid changes were fully sequenced. A maximum likelihood tree based on the global alignment of 78 HPV11 CGs (30 CGs from our study and 48 CGs from GenBank) revealed two HPV11 lineages (lineages A and B) and four sublineages (sublineages A1, A2, A3, and A4). HPV11 (sub)lineage-specific SNPs within the CG were identified, as well as the 208-bp representative region for CG-based phylogenetic clustering within the partial E2 open reading frame and noncoding region 2. Globally, sublineage A2 was the most prevalent, followed by sublineages A1, A3, and A4 and lineage B.

    IMPORTANCE This collaborative international study defined the global heterogeneity of HPV11 and established the largest collection of globally circulating HPV11 genomic variants to date. Thirty novel complete HPV11 genomes were determined and submitted to the available sequence repositories. Global phylogenetic analysis revealed two HPV11 variant lineages and four sublineages. The HPV11 (sub)lineage-specific SNPs and the representative region identified within the partial genomic region E2/noncoding region 2 (NCR2) will enable the simpler identification and comparison of HPV11 variants worldwide. This study provides an important knowledge base for HPV11 for future studies in HPV epidemiology, evolution, pathogenicity, prevention, and molecular assay

  8. AluScan: a method for genome-wide scanning of sequence and structure variations in the human genome

    Directory of Open Access Journals (Sweden)

    Mei Lingling

    2011-11-01

    Abstract

    Background: To complement next-generation sequencing technologies, there is a pressing need for efficient pre-sequencing capture methods with reduced costs and DNA requirement. The Alu family of short interspersed nucleotide elements is the most abundant type of transposable elements in the human genome and a recognized source of genome instability. With over one million Alu elements distributed throughout the genome, they are well positioned to facilitate genome-wide sequence amplification and capture of regions likely to harbor genetic variation hotspots of biological relevance.

    Results: Here we report on the use of inter-Alu PCR with an enhanced range of amplicons in conjunction with next-generation sequencing to generate an Alu-anchored scan, or 'AluScan', of DNA sequences between Alu transposons, where Alu consensus sequence-based 'H-type' PCR primers that elongate outward from the head of an Alu element are combined with 'T-type' primers elongating from the poly-A containing tail to achieve huge amplicon range. To illustrate the method, glioma DNA was compared with white blood cell control DNA of the same patient by means of AluScan. The over 10 Mb sequences obtained, derived from more than 8,000 genes spread over all the chromosomes, revealed a highly reproducible capture of genomic sequences enriched in genic sequences and cancer candidate gene regions. Requiring only sub-micrograms of sample DNA, the power of AluScan as a discovery tool for genetic variations was demonstrated by the identification of 357 instances of loss of heterozygosity, 341 somatic indels, 274 somatic SNVs, and seven potential somatic SNV hotspots between control and glioma DNA.

    Conclusions: AluScan, implemented with just a small number of H-type and T-type inter-Alu PCR primers, provides an effective capture of a diversity of genome-wide sequences for analysis. The method, by enabling an examination of gene-enriched regions containing exons, introns, and

  9. Genome sequence analyses of Pseudomonas savastanoi pv. glycinea and subtractive hybridization-based comparative genomics with nine pseudomonads.

    Science.gov (United States)

    Qi, Mingsheng; Wang, Dongping; Bradley, Carl A; Zhao, Youfu

    2011-01-27

    Bacterial blight, caused by Pseudomonas savastanoi pv. glycinea (Psg), is a common disease of soybean. In an effort to compare a current field isolate with one isolated in the early 1960s, the genomes of two Psg strains, race 4 and B076, were sequenced using 454 pyrosequencing. The genomes of both Psg strains share more than 4,900 highly conserved genes, indicating very low genetic diversity between Psg genomes. Though the genomes are conserved, genome rearrangements and recombination events occur commonly within the two Psg genomes. When compared to each other, 437 and 163 specific genes were identified in B076 and race 4, respectively. Most specific genes are plasmid-borne, indicating that acquisition and maintenance of plasmids may represent a major mechanism to change the genetic composition of the genome and even acquire new virulence factors. Type three secretion gene clusters of Psg strains are nearly identical to that of P. savastanoi pv. phaseolicola (Pph) strain 1448A, and they share 20 common effector genes. Furthermore, the coronatine biosynthetic cluster is present on a large plasmid in strain B076, but not in race 4. In silico subtractive hybridization-based comparative genomic analyses with nine sequenced phytopathogenic pseudomonads identified dozens of specific islands (SIs), and revealed that the genomes of Psg strains are more similar to those belonging to the same genomospecies such as Pph 1448A than to other phytopathogenic pseudomonads. The number of highly