WorldWideScience

Sample records for genome sequences reveal

  1. Sequence analysis reveals mosaic genome of Aichi virus

    Directory of Open Access Journals (Sweden)

    Han Xiaohong

    2011-08-01

    Aichi virus is a positive-sense, single-stranded RNA virus that has been shown to be associated with diarrhea in children. In the present study, phylogenetic and recombination analysis based on the complete Aichi virus genomes available in GenBank reveals a mosaic genome sequence [GenBank: FJ890523], in which the nt 261-852 region (nt positions are based on the aligned sequence file) shows a close relationship with AB010145/Japan (97.9% sequence identity), while the other genomic regions show a close relationship with AY747174/German (90.1% sequence identity). Our results will provide valuable hints for future research on Aichi virus diversity. Aichi virus is a member of the Kobuvirus genus of the Picornaviridae family [1, 2]. Its presence in fecal specimens of children suffering from diarrhea has been demonstrated in several Asian countries [3-6], in Brazil and Germany [7], in France [8] and in Tunisia [9]. Some reports showed a high level of seroprevalence in adults [7, 10], suggesting widespread exposure to Aichi virus during childhood. The genome of Aichi virus contains 8,280 nucleotides and a poly(A) tail. The single large open reading frame (nt 713-8014 according to the strain AB010145) encodes a polyprotein of 2,432 amino acids that is cleaved into the typical picornavirus structural proteins VP0, VP3 and VP1 and the nonstructural proteins 2A, 2B, 2C, 3A, 3B, 3C and 3D [2, 11]. Based on the phylogenetic analysis of 519-bp sequences at the 3C-3D (3CD) junction, Aichi viruses can be divided into two genotypes, A and B, with approximately 90% sequence homology [12]. Although only six complete genomes of Aichi virus are deposited in GenBank at present, mosaic genomes can be found in strains from different countries.
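
The region-by-region comparison described in this abstract can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length; the function name `percent_identity`, the toy sequences and the choice to skip gap columns are illustrative assumptions, not the authors' recombination analysis.

```python
def percent_identity(a: str, b: str, start: int, end: int) -> float:
    """Percent identity of two aligned sequences over columns [start, end), 0-based.

    Columns in which either sequence has a gap ('-') are skipped (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(a[start:end].upper(), b[start:end].upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned sequences (hypothetical, not real Aichi virus data).
query = "ACGTACGTACGT"
parent_1 = "ACGTACTTTCGA"
parent_2 = "TCGAACGTACGT"

for name, ref in (("parent_1", parent_1), ("parent_2", parent_2)):
    left = percent_identity(query, ref, 0, 6)
    right = percent_identity(query, ref, 6, 12)
    print(f"{name}: columns 0-5 {left:.1f}%, columns 6-11 {right:.1f}%")
```

In this toy case the query is closest to `parent_1` in the first window and to `parent_2` in the second, which is the kind of pattern a mosaic genome would produce.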

  2. Coelacanth genome sequence reveals the evolutionary history of vertebrate genes.

    Science.gov (United States)

    Noonan, James P; Grimwood, Jane; Danke, Joshua; Schmutz, Jeremy; Dickson, Mark; Amemiya, Chris T; Myers, Richard M

    2004-12-01

    The coelacanth is one of the nearest living relatives of tetrapods. However, a teleost species such as zebrafish or Fugu is typically used as the outgroup in current tetrapod comparative sequence analyses. Such studies are complicated by the fact that teleost genomes have undergone a whole-genome duplication event, as well as individual gene-duplication events. Here, we demonstrate the value of coelacanth genome sequence by complete sequencing and analysis of the protocadherin gene cluster of the Indonesian coelacanth, Latimeria menadoensis. We found that coelacanth has 49 protocadherin cluster genes organized in the same three ordered subclusters, alpha, beta, and gamma, as the 54 protocadherin cluster genes in human. In contrast, whole-genome and tandem duplications have generated two zebrafish protocadherin clusters comprised of at least 97 genes. Additionally, zebrafish protocadherins are far more prone to homogenizing gene conversion events than coelacanth protocadherins, suggesting that recombination- and duplication-driven plasticity may be a feature of teleost genomes. Our results indicate that coelacanth provides the ideal outgroup sequence against which tetrapod genomes can be measured. We therefore present L. menadoensis as a candidate for whole-genome sequencing.

  3. Registered Report: Melanoma genome sequencing reveals frequent PREX2 mutations

    OpenAIRE

    2015-01-01

    Authors: Denise Chroscinski, Darryl Sampey, Alex Hewitt, The Reproducibility Project: Cancer Biology. Abstract: The Reproducibility Project: Cancer Biology (https://osf.io/e81xl/wiki/home/) seeks to address growing concerns about reproducibility in scientific research by conducting replications of 50 papers in the field of cancer biology published between 2010 and 2012. This Registered Report describes the proposed replication plan of key experiments from “Melanoma genome sequenci...

  4. Genomic view of bipolar disorder revealed by whole genome sequencing in a genetic isolate.

    Directory of Open Access Journals (Sweden)

    Benjamin Georgi

    2014-03-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as of smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders.

  5. Genome sequence of Thermofilum pendens reveals an exceptional loss of biosynthetic pathways without genome reduction

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain; Rodriguez, Jason; Susanti, Dwi; Porat, Iris; Reich, Claudia; Ulrich, Luke E.; Elkins, James G.; Mavromatis, Kostas; Lykidis, Athanasios; Kim, Edwin; Thompson, Linda S.; Nolan, Matt; Land, Miriam; Copeland, Alex; Lapidus, Alla; Lucas, Susan; Detter, Chris; Zhulin, Igor B.; Olsen, Gary J.; Whitman, William; Mukhopadhyay, Biswarup; Bristow, James; Kyrpides, Nikos

    2008-01-01

    We report the complete genome of Thermofilum pendens, a deep-branching, hyperthermophilic member of the order Thermoproteales within the archaeal kingdom Crenarchaeota. T. pendens is a sulfur-dependent, anaerobic heterotroph isolated from a solfatara in Iceland. It is an extracellular commensal, requiring an extract of Thermoproteus tenax for growth, and the genome sequence reveals that biosynthetic pathways for purines, most amino acids, and most cofactors are absent. In fact T. pendens has fewer biosynthetic enzymes than obligate intracellular parasites, although it does not display other features common among obligate parasites and thus does not appear to be in the process of becoming a parasite. It appears that T. pendens has adapted to life in an environment rich in nutrients. T. pendens was known to utilize peptides as an energy source, but the genome reveals substantial ability to grow on carbohydrates. T. pendens is the first crenarchaeote and only the second archaeon found to have a transporter of the phosphotransferase system. In addition to fermentation, T. pendens may gain energy from sulfur reduction with hydrogen and formate as electron donors. It may also be capable of sulfur-independent growth on formate with formate hydrogenlyase. Additional novel features are the presence of a monomethylamine:corrinoid methyltransferase, the first time this enzyme has been found outside of Methanosarcinales, and a presenilin-related protein. Predicted highly expressed proteins do not include housekeeping genes, and instead include ABC transporters for carbohydrates and peptides, and CRISPR-associated proteins.

  6. Plasmodium knowlesi genome sequences from clinical isolates reveal extensive genomic dimorphism.

    Directory of Open Access Journals (Sweden)

    Miguel M Pinheiro

    Plasmodium knowlesi is a newly described zoonosis that causes malaria in humans, which can be severe and fatal. The study of P. knowlesi parasites from human clinical isolates is relatively new and, in order to obtain maximum information from patient sample collections, we explored the possibility of generating P. knowlesi genome sequences from archived clinical isolates. Our patient sample collection consisted of frozen whole blood samples that contained excessive human DNA contamination and, in that form, were not suitable for parasite genome sequencing. We developed a method to reduce the amount of human DNA in the thawed blood samples in preparation for high throughput parasite genome sequencing using the Illumina HiSeq and MiSeq sequencing platforms. Seven of fifteen samples processed had sufficiently pure P. knowlesi DNA for whole genome sequencing. The reads were mapped to the P. knowlesi H strain reference genome and an average mapping of 90% was obtained. Genes with low coverage were removed, leaving 4,623 genes for subsequent analyses. Previously we identified a DNA sequence dimorphism on a small fragment of the P. knowlesi normocyte binding protein xa gene on chromosome 14. We used the genome data to assemble full-length Pknbpxa sequences and discovered that the dimorphism extended along the gene. An in-house algorithm was developed to detect SNP sites co-associating with the dimorphism. More than half of the P. knowlesi genome was dimorphic, involving genes on all chromosomes and suggesting that two distinct types of P. knowlesi infect the human population in Sarawak, Malaysian Borneo. We use P. knowlesi clinical samples to demonstrate that Plasmodium DNA from archived patient samples can produce high quality genome data. We show that analyses of even small numbers of difficult clinical malaria isolates can generate comprehensive genomic information that will improve our understanding of malaria parasite diversity and

  7. Genome sequence of the necrotrophic plant pathogen Pythium ultimum reveals original pathogenicity mechanisms and effector repertoire.

    Science.gov (United States)

    The P. ultimum DAOM BR144 (=CBS 805.95 = ATCC200006) genome (42.8 Mb) encodes 15,290 genes, and has extensive sequence similarity and synteny with related Phytophthora spp., including the potato late blight pathogen Phytophthora infestans. Whole transcriptome sequencing revealed expression of 86 % o...

  8. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic diversity
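
The read frequency cut-off described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `VariantCall` record, the function names, the toy read counts and the use of an inclusive (>=) comparison are all hypothetical, and the authors' in-house pipeline is not shown in the source.

```python
from typing import NamedTuple


class VariantCall(NamedTuple):
    position: int     # genome coordinate of the variant (toy values below)
    alt_reads: int    # reads supporting the variant allele
    total_reads: int  # all reads covering the position


def read_frequency(call: VariantCall) -> float:
    """Fraction of covering reads that support the variant allele."""
    return call.alt_reads / call.total_reads if call.total_reads else 0.0


def high_confidence(calls, cutoff=0.30):
    """Keep calls at or above the cut-off (inclusive comparison is an assumption)."""
    return [c for c in calls if read_frequency(c) >= cutoff]


# Hypothetical calls, not data from the study.
calls = [
    VariantCall(position=1000, alt_reads=45, total_reads=100),
    VariantCall(position=2000, alt_reads=12, total_reads=100),
    VariantCall(position=3000, alt_reads=98, total_reads=100),
]

for call in high_confidence(calls):
    print(call.position, f"{read_frequency(call):.0%}")
```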

  9. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic

  10. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq data revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with or without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and that integrated analysis of WGS and RNA-Seq can facilitate the interpretation of the large number of genomic alterations detected in cancer genomes.

  11. The Douglas-Fir Genome Sequence Reveals Specialization of the Photosynthetic Apparatus in Pinaceae

    Directory of Open Access Journals (Sweden)

    David B. Neale

    2017-09-01

    A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceed those of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp). Incremental improvements in sequencing and assembly technologies are in part responsible for the higher quality reference genome, but it may also be due to a slightly lower exact repeat content in Douglas-fir vs. pine and spruce. Comparative genome annotation with angiosperm species reveals gene-family expansion and contraction in Douglas-fir and other conifers which may account for some of the major morphological and physiological differences between the two major plant groups. Notable differences in the size of the NDH-complex gene family and genes underlying the functional basis of shade tolerance/intolerance were observed. This reference genome sequence not only provides an important resource for Douglas-fir breeders and geneticists but also sheds additional light on the evolutionary processes that have led to the divergence of modern angiosperms from the more ancient gymnosperms.
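
For readers unfamiliar with the N50 statistic quoted in this abstract, the following Python sketch shows the usual definition: the length of the shortest contig (or scaffold) in the smallest set of longest sequences that together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions; the Douglas-fir figures themselves are not recomputed here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly (hypothetical lengths in bp).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

A higher N50 at the same total length means the assembly is broken into fewer, longer pieces, which is why the abstract uses it as a contiguity measure.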

  12. Partial sequencing of the bottle gourd genome reveals markers useful for phylogenetic analysis and breeding

    Directory of Open Access Journals (Sweden)

    Wang Sha

    2011-09-01

    Background Bottle gourd [Lagenaria siceraria (Mol.) Standl.] is an important cucurbit crop worldwide. Archaeological research indicates that bottle gourd was domesticated more than 10,000 years ago, making it one of the earliest plants cultivated by man. In spite of its widespread importance and long history of cultivation, almost nothing is known about the genome of this species thus far. Results We report here the partial sequencing of the bottle gourd genome using the 454 GS-FLX Titanium sequencing platform. A total of 150,253 sequence reads were generated and assembled into 3,994 contigs and 82,522 singletons. The total length of the non-redundant singletons/assemblies is 32 Mb, theoretically covering ~10% of the bottle gourd genome. Functional annotation of the sequences revealed a broad range of functional types, covering all three top-level ontologies. Comparison of the gene sequences between bottle gourd and the model cucurbit cucumber (Cucumis sativus) revealed 90% sequence similarity on average. Using the sequence information, 4,395 microsatellite-containing sequences were identified and 400 SSR markers were developed, of which 94% amplified bands of the anticipated sizes. Transferability of these markers to four other cucurbit species showed an obvious decline with increasing phylogenetic distance. From analyzing polymorphisms of a subset of 14 SSR markers assayed on 44 representative Chinese bottle gourd varieties/landraces, a principal coordinates (PCo) analysis output and a UPGMA-based dendrogram were constructed. Bottle gourd accessions tended to group by fruit shape rather than geographic origin, although in certain subclades the lines from the same or close origin did tend to cluster. Conclusions This work provides an initial basis for genome characterization, gene isolation and comparative genomics analysis in bottle gourd. The SSR markers developed would facilitate marker assisted breeding schemes for efficient
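
Identifying microsatellite-containing sequences, as described in this abstract, amounts to scanning for short motifs repeated in tandem. The Python sketch below shows one simple way to do this with a regular expression. The function name `find_ssrs`, the motif length range (2 to 6 bp) and the minimum of five repeats are assumptions chosen for illustration; the abstract does not state which tool or thresholds the authors used.

```python
import re


def find_ssrs(seq, min_repeats=5, min_motif=2, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq.

    The lazy quantifier makes the regex try the shortest motif first, so
    ATATATATAT is reported as (AT)5 rather than as a longer compound motif.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Toy sequence (hypothetical): an (AT)6 repeat and a (GAA)5 repeat.
toy_seq = "GGC" + "AT" * 6 + "CCG" + "GAA" * 5 + "TT"
print(find_ssrs(toy_seq))
```

Primer design around each hit, which is what turns a detected repeat into an SSR marker, is outside the scope of this sketch.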

  13. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  14. Genome sequencing and comparative genomics reveal a repertoire of putative pathogenicity genes in chilli anthracnose fungus Colletotrichum truncatum.

    Science.gov (United States)

    Rao, Soumya; Nandineni, Madhusudan R

    2017-01-01

    Colletotrichum truncatum, a major fungal phytopathogen, causes the anthracnose disease on an economically important spice crop chilli (Capsicum annuum), resulting in huge economic losses in tropical and sub-tropical countries. It follows a subcuticular intramural infection strategy on chilli with a short, asymptomatic, endophytic phase, which contrasts with the intracellular hemibiotrophic lifestyle adopted by most of the Colletotrichum species. However, little is known about the molecular determinants and the mechanism of pathogenicity in this fungus. A high quality whole genome sequence and gene annotation based on transcriptome data of an Indian isolate of C. truncatum from chilli has been obtained. Analysis of the genome sequence revealed a rich repertoire of pathogenicity genes in C. truncatum encoding secreted proteins, effectors, plant cell wall degrading enzymes, secondary metabolism associated proteins, with potential roles in the host-specific infection strategy, placing it next only to the Fusarium species. The size of genome assembly, number of predicted genes and some of the functional categories were similar to other sequenced Colletotrichum species. The comparative genomic analyses with other species and related fungi identified some unique genes and certain highly expanded gene families of CAZymes, proteases and secondary metabolism associated genes in the genome of C. truncatum. The draft genome assembly and functional annotation of potential pathogenicity genes of C. truncatum provide an important genomic resource for understanding the biology and lifestyle of this important phytopathogen and will pave the way for designing efficient disease control regimens.

  15. Next generation sequencing reveals the antibiotic resistant variants in the genome of Pseudomonas aeruginosa.

    Science.gov (United States)

    Ramanathan, Babu; Jindal, Hassan Mahmood; Le, Cheng Foh; Gudimella, Ranganath; Anwar, Arif; Razali, Rozaimi; Poole-Johnson, Johan; Manikam, Rishya; Sekaran, Shamala Devi

    2017-01-01

    Rapid progress in next generation sequencing and allied computational tools has aided the identification of single nucleotide variants in the genomes of several organisms. In the present study, we investigated single nucleotide polymorphisms (SNPs) in ten multi-antibiotic resistant Pseudomonas aeruginosa clinical isolates. All the draft genomes were submitted to the Rapid Annotations using Subsystems Technology (RAST) web server and the predicted protein sequences were used for comparison. Non-synonymous single nucleotide polymorphisms (nsSNPs) found in the clinical isolates relative to the reference genome (PAO1), and the comparison of nsSNPs between antibiotic resistant and susceptible clinical isolates, revealed insights into the genome variation. The nsSNPs identified in the multi-drug resistant clinical isolates were found to alter a single amino acid in several antibiotic resistance genes. We found mutations in genes associated with efflux pump systems, the cell wall and DNA replication, and in genes involved in repair mechanisms. In addition, nucleotide deletions in the genome and mutations leading to the generation of stop codons were also observed in the antibiotic resistant clinical isolates. Next generation sequencing is a powerful tool to compare whole genomes and analyse the single base pair variations found within antibiotic resistance genes. We identified specific mutations within antibiotic resistance genes compared to the susceptible strain of the same bacterial species, and these findings may provide insights into the role of single nucleotide variants in antibiotic resistance.

  16. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat

    Directory of Open Access Journals (Sweden)

    Huajing Teng

    2016-07-01

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches.

  17. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat.

    Science.gov (United States)

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-07-07

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches.

  18. High resolution genetic mapping by genome sequencing reveals genome duplication and tetraploid genetic structure of the diploid Miscanthus sinensis.

    Directory of Open Access Journals (Sweden)

    Xue-Feng Ma

    We have created a high-resolution linkage map of Miscanthus sinensis using genotyping-by-sequencing (GBS), identifying all 19 linkage groups for the first time. The result is technically significant since Miscanthus has a very large and highly heterozygous genome, but little or no genomic information is available for it to date. The composite linkage map containing markers from both parental linkage maps is composed of 3,745 SNP markers spanning 2,396 cM on 19 linkage groups with a 0.64 cM average resolution. Comparative genomics analyses of the M. sinensis composite linkage map against the genomes of sorghum, maize, rice, and Brachypodium distachyon indicate that sorghum has the closest syntenic relationship to Miscanthus among these species. The comparative results revealed that each pair of the 19 M. sinensis linkage groups aligned to one sorghum chromosome, except for LG8, which mapped to two sorghum chromosomes (4 and 7), presumably due to a chromosome fusion event after genome duplication. The data also revealed several other chromosome rearrangements relative to sorghum, including two telomere-centromere inversions of the sorghum syntenic chromosome 7 in LG8 of M. sinensis and two paracentric inversions of sorghum syntenic chromosome 4 in LG7 and LG8 of M. sinensis. The results clearly demonstrate, for the first time, that the diploid M. sinensis is of tetraploid origin, consisting of two sub-genomes. This complete and high resolution composite linkage map will not only serve as a useful resource for novel QTL discoveries, but also enable informed deployment of the wealth of existing genomics resources of other species for the improvement of Miscanthus as a high biomass energy crop. In addition, it has utility as a reference for genome sequence assembly in the forthcoming whole genome sequencing of the Miscanthus genus.
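
The average resolution quoted in this abstract can be checked from the other reported figures. The short Python sketch below does the arithmetic two ways, per marker and per marker interval; which convention the authors used is not stated, so both are shown as assumptions, and the variable names are illustrative.

```python
map_length_cm = 2396    # total map length reported in the abstract (cM)
n_markers = 3745        # SNP markers on the composite map
n_linkage_groups = 19   # linkage groups

# Convention 1 (assumed): total length divided by the number of markers.
per_marker = map_length_cm / n_markers

# Convention 2 (assumed): total length divided by the number of intervals;
# each linkage group with k markers has k - 1 intervals.
per_interval = map_length_cm / (n_markers - n_linkage_groups)

print(f"per marker:   {per_marker:.2f} cM")
print(f"per interval: {per_interval:.2f} cM")
```

Both conventions round to the same two-decimal value here, so the reported figure is consistent with either.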

  19. Characterization and phylogenetic analysis of α-gliadin gene sequences reveals significant genomic divergence in Triticeae species

    Indian Academy of Sciences (India)

    Guang-Rong Li; Tao Lang; En-Nian Yang; Cheng Liu; Zu-Jun Yang

    2014-12-01

    Although the unique properties of the wheat α-gliadin gene family are well characterized, little is known about the evolution and genomic divergence of the α-gliadin gene family within the Triticeae. We isolated a total of 203 α-gliadin gene sequences from 11 representative diploid and polyploid Triticeae species, and found 108 sequences to be putatively functional. Our results indicate that α-gliadin genes may have originated from wild Secale species, where the sequences contain the shortest repetitive domains and display minimum variation. A miniature inverted-repeat transposable element insertion is reported for the first time in an α-gliadin gene sequence of Thinopyrum intermedium in this study, indicating that the transposable element might have contributed to the diversification of the α-gliadin gene family among Triticeae genomes. The phylogenetic analyses revealed that the α-gliadin gene sequences of Dasypyrum, Australopyrum, Lophopyrum, Eremopyrum and Pseudoroegneria species have amplified several times. A search for four typical toxic epitopes for celiac disease within the Triticeae α-gliadin gene sequences showed that the α-gliadins of wild Secale, Australopyrum and Agropyron genomes lack all four epitopes, while other Triticeae species have accumulated these epitopes, suggesting that the evolution of these toxic epitope sequences occurred during the course of speciation, domestication or polyploidization of Triticeae.

  20. The Complete Genome Sequences, Unique Mutational Spectra, and Developmental Potency of Adult Neurons Revealed by Cloning.

    Science.gov (United States)

    Hazen, Jennifer L; Faust, Gregory G; Rodriguez, Alberto R; Ferguson, William C; Shumilina, Svetlana; Clark, Royden A; Boland, Michael J; Martin, Greg; Chubukov, Pavel; Tsunemoto, Rachel K; Torkamani, Ali; Kupriyanov, Sergey; Hall, Ira M; Baldwin, Kristin K

    2016-03-16

    Somatic mutation in neurons is linked to neurologic disease and implicated in cell-type diversification. However, the origin, extent, and patterns of genomic mutation in neurons remain unknown. We established a nuclear transfer method to clonally amplify the genomes of neurons from adult mice for whole-genome sequencing. Comprehensive mutation detection and independent validation revealed that individual neurons harbor ∼100 unique mutations from all classes but lack recurrent rearrangements. Most neurons contain at least one gene-disrupting mutation and rare (0-2) mobile element insertions. The frequency and gene bias of neuronal mutations differ from other lineages, potentially due to novel mechanisms governing postmitotic mutation. Fertile mice were cloned from several neurons, establishing the compatibility of mutated adult neuronal genomes with reprogramming to pluripotency and development.

  1. Maize (Zea mays L.) genome diversity as revealed by RNA-sequencing.

    Directory of Open Access Journals (Sweden)

    Candice N Hansey

    Full Text Available Maize is rich in genetic and phenotypic diversity. Understanding the sequence, structural, and expression variation that contributes to phenotypic diversity would facilitate more efficient varietal improvement. RNA based sequencing (RNA-seq) is a powerful approach for transcriptional analysis, assessing sequence variation, and identifying novel transcript sequences, particularly in large, complex, repetitive genomes such as maize. In this study, we sequenced RNA from whole seedlings of 21 maize inbred lines representing diverse North American and exotic germplasm. Single nucleotide polymorphism (SNP) detection identified 351,710 polymorphic loci distributed throughout the genome covering 22,830 annotated genes. Tight clustering of two distinct heterotic groups and exotic lines was evident using these SNPs as genetic markers. Transcript abundance analysis revealed minimal variation in the total number of genes expressed across these 21 lines (57.1% to 66.0%). However, the transcribed gene set among the 21 lines varied, with 48.7% expressed in all of the lines, 27.9% expressed in one to 20 lines, and 23.4% expressed in none of the lines. De novo assembly of RNA-seq reads that did not map to the reference B73 genome sequence revealed 1,321 high confidence novel transcripts, of which 564 loci were present in all 21 lines, including B73, and 757 loci were restricted to a subset of the lines. RT-PCR validation demonstrated 87.5% concordance with the computational prediction of these expressed novel transcripts. Intriguingly, 145 of the novel de novo assembled loci were present in lines from only one of the two heterotic groups, consistent with the hypothesis that, in addition to sequence polymorphisms and transcript abundance, transcript presence/absence variation is present and, thereby, may be a mechanism contributing to the genetic basis of heterosis.

  2. Complexity of genome evolution by segmental rearrangement in Brassica rapa revealed by sequence-level analysis

    Directory of Open Access Journals (Sweden)

    Paterson Andrew H

    2009-11-01

    Full Text Available Abstract Background: The Brassica species, related to Arabidopsis thaliana, include an important group of crops and represent an excellent system for studying the evolutionary consequences of polyploidy. Previous studies have led to a proposed structure for an ancestral karyotype and models for the evolution of the B. rapa genome by triplication and segmental rearrangement, but these have not been validated at the sequence level. Results: We developed computational tools to analyse the public collection of B. rapa BAC end sequence, in order to identify candidates for representing collinearity discontinuities between the genomes of B. rapa and A. thaliana. For each putative discontinuity, one of the BACs was sequenced and analysed for collinearity with the genome of A. thaliana. Additional BAC clones were identified and sequenced as part of ongoing efforts to sequence four chromosomes of B. rapa. Strikingly few of the 19 inter-chromosomal rearrangements corresponded to the set of collinearity discontinuities anticipated on the basis of previous studies. Our analyses revealed numerous instances of newly detected collinearity blocks. For B. rapa linkage group A8, we were able to develop a model for the derivation of the chromosome from the ancestral karyotype. We were also able to identify a rearrangement event in the ancestor of B. rapa that was not shared with the ancestor of A. thaliana, and is represented in triplicate in the B. rapa genome. In addition to inter-chromosomal rearrangements, we identified and analysed 32 BACs containing the end points of segmental inversion events. Conclusion: Our results show that previous studies of segmental collinearity between the A. thaliana, Brassica and ancestral karyotype genomes, although very useful, represent over-simplifications of their true relationships. The presence of numerous cryptic collinear genome segments and the frequent occurrence of segmental inversions mean that inference of the positions

  3. Single nucleus genome sequencing reveals high similarity among nuclei of an endomycorrhizal fungus.

    Directory of Open Access Journals (Sweden)

    Kui Lin

    2014-01-01

    Full Text Available Nuclei of arbuscular endomycorrhizal fungi have been described as highly diverse due to their asexual nature and absence of a single cell stage with only one nucleus. This has raised fundamental questions concerning speciation, selection and transmission of the genetic make-up to next generations. Although this concept has become textbook knowledge, it is only based on studying a few loci, including 45S rDNA. To provide a more comprehensive insight into the genetic makeup of arbuscular endomycorrhizal fungi, we applied de novo genome sequencing of individual nuclei of Rhizophagus irregularis. This revealed a surprisingly low level of polymorphism between nuclei. In contrast, within a nucleus, the 45S rDNA repeat unit turned out to be highly diverged. This finding demystifies a long-lasting hypothesis on the complex genetic makeup of arbuscular endomycorrhizal fungi. Subsequent genome assembly resulted in the first draft reference genome sequence of an arbuscular endomycorrhizal fungus. Its length is 141 Mbps, representing over 27,000 protein-coding gene models. We used the genomic sequence to reinvestigate the phylogenetic relationships of Rhizophagus irregularis with other fungal phyla. This unambiguously demonstrated that Glomeromycota are more closely related to Mucoromycotina than to its postulated sister Dikarya.

  4. Single Nucleus Genome Sequencing Reveals High Similarity among Nuclei of an Endomycorrhizal Fungus

    Science.gov (United States)

    Zhang, Zhonghua; Ivanov, Sergey; Saunders, Diane G. O.; Mu, Desheng; Pang, Erli; Cao, Huifen; Cha, Hwangho; Lin, Tao; Zhou, Qian; Shang, Yi; Li, Ying; Sharma, Trupti; van Velzen, Robin; de Ruijter, Norbert; Aanen, Duur K.; Win, Joe; Kamoun, Sophien; Bisseling, Ton; Geurts, René; Huang, Sanwen

    2014-01-01

    Nuclei of arbuscular endomycorrhizal fungi have been described as highly diverse due to their asexual nature and absence of a single cell stage with only one nucleus. This has raised fundamental questions concerning speciation, selection and transmission of the genetic make-up to next generations. Although this concept has become textbook knowledge, it is only based on studying a few loci, including 45S rDNA. To provide a more comprehensive insight into the genetic makeup of arbuscular endomycorrhizal fungi, we applied de novo genome sequencing of individual nuclei of Rhizophagus irregularis. This revealed a surprisingly low level of polymorphism between nuclei. In contrast, within a nucleus, the 45S rDNA repeat unit turned out to be highly diverged. This finding demystifies a long-lasting hypothesis on the complex genetic makeup of arbuscular endomycorrhizal fungi. Subsequent genome assembly resulted in the first draft reference genome sequence of an arbuscular endomycorrhizal fungus. Its length is 141 Mbps, representing over 27,000 protein-coding gene models. We used the genomic sequence to reinvestigate the phylogenetic relationships of Rhizophagus irregularis with other fungal phyla. This unambiguously demonstrated that Glomeromycota are more closely related to Mucoromycotina than to its postulated sister Dikarya. PMID:24415955

  5. De Novo Sequences of Haloquadratum walsbyi from Lake Tyrrell, Australia, Reveal a Variable Genomic Landscape

    Directory of Open Access Journals (Sweden)

    Benjamin J. Tully

    2015-01-01

    Full Text Available Hypersaline systems near salt saturation levels represent an extreme environment, in which organisms grow and survive near the limits of life. One of the abundant members of the microbial communities in hypersaline systems is the square archaeon, Haloquadratum walsbyi. Utilizing a short-read metagenome from Lake Tyrrell, a hypersaline ecosystem in Victoria, Australia, we performed a comparative genomic analysis of H. walsbyi to better understand the extent of variation between strains/subspecies. Results revealed that previously isolated strains/subspecies do not fully describe the complete repertoire of the genomic landscape present in H. walsbyi. Rearrangements, insertions, and deletions were observed for the Lake Tyrrell derived Haloquadratum genomes and were supported by environmental de novo sequences, including shifts in the dominant genomic landscape of the two most abundant strains. Analysis pertaining to halomucins indicated that homologs for this large protein are not a feature common for all species of Haloquadratum. Further, we analyzed ATP-binding cassette transporters (ABC-type transporters) for evidence of niche partitioning between different strains/subspecies. We were able to identify unique and variable transporter subunits from all five genomes analyzed and the de novo environmental sequences, suggesting that differences in nutrient and carbon source acquisition may play a role in maintaining distinct strains/subspecies.

  6. Sequencing and analyses of all known human rhinovirus genomes reveal structure and evolution.

    Science.gov (United States)

    Palmenberg, Ann C; Spiro, David; Kuzmickas, Ryan; Wang, Shiliang; Djikeng, Appolinaire; Rathe, Jennifer A; Fraser-Liggett, Claire M; Liggett, Stephen B

    2009-04-03

    Infection by human rhinovirus (HRV) is a major cause of upper and lower respiratory tract disease worldwide and displays considerable phenotypic variation. We examined diversity by completing the genome sequences for all known serotypes (n = 99). Superimposition of capsid crystal structure and optimal-energy RNA configurations established alignments and phylogeny. These revealed conserved motifs; clade-specific diversity, including a potential newly identified species (HRV-D); mutations in field isolates; and recombination. In analogy with poliovirus, a hypervariable 5' untranslated region tract may affect virulence. A configuration consistent with nonscanning internal ribosome entry was found in all HRVs and may account for rapid translation. The data density from complete sequences of the reference HRVs provided high resolution for this degree of modeling and serves as a platform for full genome-based epidemiologic studies and antiviral or vaccine development.

  7. Complete mitochondrial genome sequencing reveals novel haplotypes in a Polynesian population.

    Directory of Open Access Journals (Sweden)

    Miles Benton

    Full Text Available The high risk of metabolic disease traits in Polynesians may be partly explained by elevated prevalence of genetic variants involved in energy metabolism. The genetics of Polynesian populations has been shaped by island-hopping migration events, which have possibly favoured thrifty genes. The aim of this study was to sequence the mitochondrial genome in a group of Maoris in an effort to characterise genome variation in this Polynesian population for use in future disease association studies. We sequenced the complete mitochondrial genomes of 20 non-admixed Maori subjects using Affymetrix technology. DNA diversity analyses showed the Maori group exhibited reduced mitochondrial genome diversity compared to other worldwide populations, which is consistent with historical bottleneck and founder effects. Global phylogenetic analysis positioned these Maori subjects specifically within mitochondrial haplogroup B4a1a1. Interestingly, we identified several novel variants that collectively form new and unique Maori motifs: B4a1a1c, B4a1a1a3 and B4a1a1a5. Compared to ancestral populations, we observed an increased frequency of non-synonymous coding variants of several mitochondrial genes in the Maori group, which may be a result of positive selection and/or genetic drift effects. In conclusion, this study reports the first complete mitochondrial genome sequence data for a Maori population. Overall, these new data reveal novel mitochondrial genome signatures in this Polynesian population and enhance the phylogenetic picture of maternal ancestry in Oceania. The increased frequency of several mitochondrial coding variants makes them good candidates for future studies aimed at assessment of metabolic disease risk in Polynesian populations.

  8. Rapid genome evolution in Pms1 region of rice revealed by comparative sequence analysis

    Institute of Scientific and Technical Information of China (English)

    YU JinSheng; FAN YouRong; LIU Nan; SHAN Yan; LI XiangHua; ZHANG QiFa

    2007-01-01

    Pms1, a locus for photoperiod sensitive genic male sterility in rice, was identified and mapped to chromosome 7 in previous studies. Here we report an effort to identify the candidate genes for Pms1 by comparative sequencing of BAC clones from two cultivars, Minghui 63 and Nongken 58, the parents for the initial mapping population. Annotation and comparison of the sequences of the two clones resulted in a total of five potential candidates which should be functionally tested. We also conducted comparative analysis of sequences of these two cultivars with two other cultivars, Nipponbare and 93-11, for which sequence data were available in public databases. The analysis revealed large differences in sequence composition among the four genotypes in the Pms1 region primarily due to retroelement activity leading to rapid recent growth and divergence of the genomes. High levels of polymorphism in the forms of indels and SNPs were found both in intra- and inter-subspecific comparisons. Dating analysis using LTRs of the retroelements in this region showed that the substitution rate of LTRs was much higher than reported in the literature. The results provided strong evidence for rapid genomic evolution of this region as a consequence of natural and artificial selection.

  9. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans.

    Science.gov (United States)

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-07-20

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures.

  10. Whole-genome sequence comparisons reveal the evolution of Vibrio cholerae O1.

    Science.gov (United States)

    Kim, Eun Jin; Lee, Chan Hee; Nair, G Balakrish; Kim, Dong Wook

    2015-08-01

    The analysis of the whole-genome sequences of Vibrio cholerae strains from previous and current cholera pandemics has demonstrated that genomic changes and alterations in phage CTX (particularly in the gene encoding the B subunit of cholera toxin) were major features in the evolution of V. cholerae. Recent studies have revealed the genetic mechanisms in these bacteria by which new variants of V. cholerae are generated from type-specific strains; these mechanisms suggest that certain strains are selected by environmental or human factors over time. By understanding the mechanisms and driving forces of historical and current changes in the V. cholerae population, it would be possible to predict the direction of such changes and the evolution of new variants; this has implications for the battle against cholera. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Correction: Synergism between genome sequencing, tandem mass spectrometry and bio-inspired synthesis reveals insights into nocardioazine B biogenesis.

    Science.gov (United States)

    Alqahtani, Norah; Porwal, Suheel K; James, Elle D; Bis, Dana M; Karty, Jonathan A; Lane, Amy L; Viswanathan, Rajesh

    2015-09-21

    Correction for 'Synergism between genome sequencing, tandem mass spectrometry and bio-inspired synthesis reveals insights into nocardioazine B biogenesis' by Norah Alqahtani et al., Org. Biomol. Chem., 2015, 13, 7177-7192.

  12. The complete genome sequence of Fibrobacter succinogenes S85 reveals a cellulolytic and metabolic specialist.

    Directory of Open Access Journals (Sweden)

    Garret Suen

    Full Text Available Fibrobacter succinogenes is an important member of the rumen microbial community that converts plant biomass into nutrients usable by its host. This bacterium, which is also one of only two cultivated species in its phylum, is an efficient and prolific degrader of cellulose. Specifically, it has a particularly high activity against crystalline cellulose that requires close physical contact with this substrate. However, unlike other known cellulolytic microbes, it does not degrade cellulose using a cellulosome or by producing high extracellular titers of cellulase enzymes. To better understand the biology of F. succinogenes, we sequenced the genome of the type strain S85 to completion. A total of 3,085 open reading frames were predicted from its 3.84 Mbp genome. Analysis of sequences predicted to encode carbohydrate-degrading enzymes revealed an unusually high number of genes that were classified into 49 different families of glycoside hydrolases, carbohydrate binding modules (CBMs), carbohydrate esterases, and polysaccharide lyases. Of the 31 identified cellulases, none contain CBMs in families 1, 2, and 3, typically associated with crystalline cellulose degradation. Polysaccharide hydrolysis and utilization assays showed that F. succinogenes was able to hydrolyze a number of polysaccharides, but could only utilize the hydrolytic products of cellulose. This suggests that F. succinogenes uses its array of hemicellulose-degrading enzymes to remove hemicelluloses to gain access to cellulose. This is reflected in its genome, as F. succinogenes lacks many of the genes necessary to transport and metabolize the hydrolytic products of non-cellulose polysaccharides. The F. succinogenes genome reveals a bacterium that specializes in cellulose as its sole energy source, and provides insight into a novel strategy for cellulose degradation.

  13. Mitochondrial genome sequences reveal deep divergences among Anopheles punctulatus sibling species in Papua New Guinea

    Directory of Open Access Journals (Sweden)

    Logue Kyle

    2013-02-01

    Full Text Available Abstract Background: Members of the Anopheles punctulatus group (AP group) are the primary vectors of human malaria in Papua New Guinea. The AP group includes 13 sibling species, most of them morphologically indistinguishable. Understanding why only certain species are able to transmit malaria requires a better comprehension of their evolutionary history. In particular, understanding relationships and divergence times among Anopheles species may enable assessing how malaria-related traits (e.g. blood feeding behaviours, vector competence) have evolved. Methods: Fourteen mitochondrial (mt) genomes from five AP sibling species and two species of the Anopheles dirus complex of Southeast Asia were sequenced. DNA sequences from all concatenated protein coding genes (10,770 bp) were then analysed using a Bayesian approach to reconstruct phylogenetic relationships and date the divergence of the AP sibling species. Results: Phylogenetic reconstruction using the concatenated DNA sequence of all mitochondrial protein coding genes indicates that the ancestors of the AP group arrived in Papua New Guinea 25 to 54 million years ago and rapidly diverged to form the current sibling species. Conclusion: Through evaluation of newly described mt genome sequences, this study has revealed a divergence among members of the AP group in Papua New Guinea that would significantly predate the arrival of humans in this region, 50 thousand years ago. The divergence observed among the mtDNA sequences studied here may have resulted from reproductive isolation during historical changes in sea-level through glacial minima and maxima. This leads to a hypothesis that the AP sibling species have evolved independently for potentially thousands of generations. This suggests that the evolution of many phenotypes, such as insecticide resistance, will arise independently in each of the AP sibling species studied here.

  14. Genome-wide translocation sequencing reveals mechanisms of chromosome breaks and rearrangements in B cells.

    Science.gov (United States)

    Chiarle, Roberto; Zhang, Yu; Frock, Richard L; Lewis, Susanna M; Molinie, Benoit; Ho, Yu-Jui; Myers, Darienne R; Choi, Vivian W; Compagno, Mara; Malkin, Daniel J; Neuberg, Donna; Monti, Stefano; Giallourakis, Cosmas C; Gostissa, Monica; Alt, Frederick W

    2011-09-30

    Whereas chromosomal translocations are common pathogenetic events in cancer, mechanisms that promote them are poorly understood. To elucidate translocation mechanisms in mammalian cells, we developed high-throughput, genome-wide translocation sequencing (HTGTS). We employed HTGTS to identify tens of thousands of independent translocation junctions involving fixed I-SceI meganuclease-generated DNA double-strand breaks (DSBs) within the c-myc oncogene or IgH locus of B lymphocytes induced for activation-induced cytidine deaminase (AID)-dependent IgH class switching. DSBs translocated widely across the genome but were preferentially targeted to transcribed chromosomal regions. Additionally, numerous AID-dependent and AID-independent hot spots were targeted, with the latter comprising mainly cryptic I-SceI targets. Comparison of translocation junctions with genome-wide nuclear run-ons revealed a marked association between transcription start sites and translocation targeting. The majority of translocation junctions were formed via end-joining with short microhomologies. Our findings have implications for diverse fields, including gene therapy and cancer genomics.

  15. Unique features of a Japanese 'Candidatus Liberibacter asiaticus' strain revealed by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Hiroshi Katoh

    Full Text Available Citrus greening (huanglongbing) is the most destructive disease of citrus worldwide. It is spread by citrus psyllids and is associated with phloem-limited bacteria of three species of α-Proteobacteria, namely, 'Candidatus Liberibacter asiaticus', 'Ca. L. americanus', and 'Ca. L. africanus'. Recent findings suggested that some Japanese strains lack the bacteriophage-type DNA polymerase region (DNA pol), in contrast to the Floridian psy62 strain. The whole genome sequence of the pol-negative 'Ca. L. asiaticus' Japanese isolate Ishi-1 was determined by metagenomic analysis of DNA extracted from 'Ca. L. asiaticus'-infected psyllids and leaf midribs. The 1.19-Mb genome has an average 36.32% GC content. Annotation revealed 13 operons encoding rRNA and 44 tRNA genes, but no typical bacterial pathogenesis-related genes were located within the genome, similar to the Floridian psy62 and Chinese gxpsy. In contrast to other 'Ca. L. asiaticus' strains, the genome of the Japanese Ishi-1 strain lacks a prophage-related region.

  16. Whole genome sequence of Staphylococcus saprophyticus reveals the pathogenesis of uncomplicated urinary tract infection.

    Science.gov (United States)

    Kuroda, Makoto; Yamashita, Atsushi; Hirakawa, Hideki; Kumano, Miyuki; Morikawa, Kazuya; Higashide, Masato; Maruyama, Atsushi; Inose, Yumiko; Matoba, Kimio; Toh, Hidehiro; Kuhara, Satoru; Hattori, Masahira; Ohta, Toshiko

    2005-09-13

    Staphylococcus saprophyticus is a uropathogenic Staphylococcus frequently isolated from young female outpatients presenting with uncomplicated urinary tract infections. We sequenced the whole genome of S. saprophyticus type strain ATCC 15305, which harbors a circular chromosome of 2,516,575 bp with 2,446 ORFs and two plasmids. Comparative genomic analyses with the strains of two other species, Staphylococcus aureus and Staphylococcus epidermidis, as well as experimental data, revealed the following characteristics of the S. saprophyticus genome. S. saprophyticus does not possess any virulence factors found in S. aureus, such as coagulase, enterotoxins, exoenzymes, and extracellular matrix-binding proteins, although it does have a remarkable paralog expansion of transport systems related to highly variable ion contents in the urinary environment. A further unique feature is that only a single ORF is predictable as a cell wall-anchored protein, and it shows positive hemagglutination and adherence to human bladder cells, associated with initial colonization in the urinary tract. S. saprophyticus also shows significantly high urease activity. The uropathogenicity of S. saprophyticus can be attributed to its genome, which is needed for its survival in the human urinary tract by means of a novel cell wall-anchored adhesin and redundant uro-adaptive transport systems, together with urease.

  17. Seventeen new complete mtDNA sequences reveal extensive mitochondrial genome evolution within the Demospongiae.

    Directory of Open Access Journals (Sweden)

    Xiujuan Wang

    Full Text Available Two major transitions in animal evolution, the origins of multicellularity and bilaterality, correlate with major changes in mitochondrial DNA (mtDNA) organization. Demosponges, the largest class in the phylum Porifera, underwent only the first of these transitions and their mitochondrial genomes display a peculiar combination of ancestral and animal-specific features. To get an insight into the evolution of mitochondrial genomes within the Demospongiae, we determined 17 new mtDNA sequences from this group and analyzed them with five previously published sequences. Our analysis revealed that all demosponge mtDNAs are 16- to 25-kbp circular molecules, containing 13-15 protein genes, 2 rRNA genes, and 2-27 tRNA genes. All but four pairs of sampled genomes had unique gene orders, with the number of shared gene boundaries ranging from 1 to 41. Although most demosponge species displayed low rates of mitochondrial sequence evolution, a significant acceleration in evolutionary rates occurred in the G1 group (orders Dendroceratida, Dictyoceratida, and Verticillitida). Large variation in mtDNA organization was also observed within the G0 group (order Homosclerophorida) including gene rearrangements, loss of tRNA genes, and the presence of two introns in Plakortis angulospiculatus. While introns are rare in modern-day demosponge mtDNA, we inferred that at least one intron was present in cox1 of the common ancestor of all demosponges. Our study uncovered an extensive mitochondrial genomic diversity within the Demospongiae. Although all sampled mitochondrial genomes retained some ancestral features, including a minimally modified genetic code, conserved structures of tRNA genes, and presence of multiple non-coding regions, they vary considerably in their size, gene content, gene order, and the rates of sequence evolution. Some of the changes in demosponge mtDNA, such as the loss of tRNA genes and the appearance of hairpin-containing repetitive elements

  18. Seventeen New Complete mtDNA Sequences Reveal Extensive Mitochondrial Genome Evolution within the Demospongiae

    Science.gov (United States)

    Wang, Xiujuan; Lavrov, Dennis V.

    2008-01-01

    Two major transitions in animal evolution–the origins of multicellularity and bilaterality–correlate with major changes in mitochondrial DNA (mtDNA) organization. Demosponges, the largest class in the phylum Porifera, underwent only the first of these transitions and their mitochondrial genomes display a peculiar combination of ancestral and animal-specific features. To get an insight into the evolution of mitochondrial genomes within the Demospongiae, we determined 17 new mtDNA sequences from this group and analyzed them with five previously published sequences. Our analysis revealed that all demosponge mtDNAs are 16- to 25-kbp circular molecules, containing 13–15 protein genes, 2 rRNA genes, and 2–27 tRNA genes. All but four pairs of sampled genomes had unique gene orders, with the number of shared gene boundaries ranging from 1 to 41. Although most demosponge species displayed low rates of mitochondrial sequence evolution, a significant acceleration in evolutionary rates occurred in the G1 group (orders Dendroceratida, Dictyoceratida, and Verticillitida). Large variation in mtDNA organization was also observed within the G0 group (order Homosclerophorida) including gene rearrangements, loss of tRNA genes, and the presence of two introns in Plakortis angulospiculatus. While introns are rare in modern-day demosponge mtDNA, we inferred that at least one intron was present in cox1 of the common ancestor of all demosponges. Our study uncovered an extensive mitochondrial genomic diversity within the Demospongiae. Although all sampled mitochondrial genomes retained some ancestral features, including a minimally modified genetic code, conserved structures of tRNA genes, and presence of multiple non-coding regions, they vary considerably in their size, gene content, gene order, and the rates of sequence evolution. Some of the changes in demosponge mtDNA, such as the loss of tRNA genes and the appearance of hairpin-containing repetitive elements, occurred in

  19. Sequencing the CHO DXB11 genome reveals regional variations in genomic stability and haploidy

    DEFF Research Database (Denmark)

    Kaas, Christian Schrøder; Kristensen, Claus; Betenbaugh, Michael J.

    2015-01-01

    Background: The DHFR negative CHO DXB11 cell line (also known as DUX-B11 and DUKX) was historically the first CHO cell line to be used for large scale production of heterologous proteins and is still used for production of a number of complex proteins.  Results: Here we present the genomic sequen...

  20. Genome and transcriptome sequences reveal the specific parasitism of the nematophagous Purpureocillium lilacinum 36-1

    Directory of Open Access Journals (Sweden)

    Jialian Xie

    2016-07-01

    Full Text Available Purpureocillium lilacinum is a promising nematophagous ascomycete able to adapt to diverse environments and it is also an opportunistic fungus that infects humans. A microbial inoculant of P. lilacinum has been registered to control plant parasitic nematodes. However, the molecular mechanism of the toxicological processes is still unclear because of the relatively few reports on the subject. In this study, using Illumina paired-end sequencing, the draft genome sequence and the transcriptome of P. lilacinum strain 36-1 infecting nematode-eggs were determined. Whole genome alignment indicated that P. lilacinum 36-1 possessed a more dynamic genome in comparison with P. lilacinum India strain. Moreover, a phylogenetic analysis showed that the P. lilacinum 36-1 had a closer relation to entomophagous fungi. The protein-coding genes in P. lilacinum 36-1 occurred much more frequently than they did in other fungi, which was a result of the depletion of repeat-induced point mutations (RIP). Comparative genome and transcriptome analyses revealed the genes that were involved in pathogenicity, particularly in the recognition, adhesion of nematode-eggs, downstream signal transduction pathways and hydrolase genes. By contrast, certain numbers of cellulose and xylan degradation genes and a lack of polysaccharide lyase genes showed the potential of P. lilacinum 36-1 as an endophyte. Notably, the expression of appressorium-formation and antioxidants-related genes exhibited similar infection patterns in P. lilacinum strain 36-1 to those of the model entomophagous fungi Metarhizium spp. These results uncovered the specific parasitism of P. lilacinum and presented the genes responsible for the infection of nematode-eggs.

  1. A survey of genomic traces reveals a common sequencing error, RNA editing, and DNA editing.

    Directory of Open Access Journals (Sweden)

    Alexander Wait Zaranek

    2010-05-01

    Full Text Available While it is widely held that an organism's genomic information should remain constant, several protein families are known to modify it. Members of the AID/APOBEC protein family can deaminate DNA. Similarly, members of the ADAR family can deaminate RNA. Characterizing the scope of these events is challenging. Here we use large genomic data sets, such as the two billion sequences in the NCBI Trace Archive, to look for clusters of mismatches of the same type, which are a hallmark of editing events caused by APOBEC3 and ADAR. We align 603,249,815 traces from the NCBI trace archive to their reference genomes. In clusters of mismatches of increasing size, at least one systematic sequencing error dominates the results (G-to-A). It is still present in mismatches with 99% accuracy and only vanishes in mismatches at 99.99% accuracy or higher. The error appears to have entered into about 1% of the HapMap, possibly affecting other users that rely on this resource. Further investigation, using stringent quality thresholds, uncovers thousands of mismatch clusters with no apparent defects in their chromatograms. These traces provide the first reported candidates of endogenous DNA editing in human, further elucidating RNA editing in human and mouse and also revealing, for the first time, extensive RNA editing in Xenopus tropicalis. We show that the NCBI Trace Archive provides a valuable resource for the investigation of the phenomena of DNA and RNA editing, as well as setting the stage for a comprehensive mapping of editing events in large-scale genomic datasets.
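
    The idea of looking for "clusters of mismatches of the same type" can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the `min_size` and `max_gap` parameters, and the assumption that the read and reference are already aligned to equal length without gaps are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def mismatch_clusters(reference, read, min_size=3, max_gap=20):
    """Return runs of same-type mismatches (e.g. "G>A") in a pre-aligned read.

    Hypothetical helper: `reference` and `read` must have equal length.
    A cluster is a run of at least `min_size` mismatches of one type in
    which neighbouring mismatches are at most `max_gap` positions apart.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to the same length")

    # Collect mismatch positions per substitution type, ignoring N and gaps.
    by_type = defaultdict(list)
    for pos, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != read_base and ref_base in "ACGT" and read_base in "ACGT":
            by_type[ref_base + ">" + read_base].append(pos)

    clusters = []
    for kind, positions in by_type.items():
        group = [positions[0]]
        for pos in positions[1:]:
            if pos - group[-1] <= max_gap:
                group.append(pos)  # still within the same cluster
            else:
                if len(group) >= min_size:
                    clusters.append((kind, group))
                group = [pos]  # start a new candidate cluster
        if len(group) >= min_size:
            clusters.append((kind, group))
    return clusters
```

    A real analysis would also filter on base-quality scores, since the abstract reports that a systematic G-to-A error only vanishes at very high accuracy thresholds; that step is omitted here.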

  2. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima.

    OpenAIRE

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologue...

  3. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima

    OpenAIRE

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologue...

  4. The First Myriapod Genome Sequence Reveals Conservative Arthropod Gene Content and Genome Organisation in the Centipede Strigamia maritima

    OpenAIRE

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologue...

  5. Genomic and polyploid evolution in genus Avena as revealed by RFLPs of repeated DNA sequences.

    Science.gov (United States)

    Morikawa, Toshinobu; Nishihara, Miho

    2009-06-01

    Phylogenetic relationships and genome affinities were investigated by utilizing all the biological Avena species consisting of 11 diploid species (15 accessions), 8 tetraploid species (9 accessions) and 4 hexaploid species (5 accessions). Genomic DNA regions of As120a, avenin, and globulin were amplified by PCR. A total of 130 polymorphic fragments were detected out of 156 fragments generated by digesting the PCR-amplified fragments with 11 restriction enzymes. The number of fragments generated by PCR-amplification followed by digestion with restriction enzymes was almost the same as those among the three repeated DNA sequences. A high level of genetic distance was detected between A. damascena (Ad) and A. canariensis (Ac) genomes, which reflected their different morphology and reproductive isolation. The A. longiglumis (Al) and A. prostrata (Ap) genomes were closely related to the As genome group. The AB genome species formed a cluster with the AsAs genome artificial autotetraploid and the As genome diploids indicating near-autotetraploid origin. The A. macrostachya is an outbreeding autotetraploid closely related with the C genome diploid and the AC genome tetraploid species. The differences of genetic distances estimated from the repeated DNA sequence divergence among the Avena species were consistent with genome divergences and it was possible to compare the genetic intra- and inter-ploidy relationships produced by RFLPs. These results suggested that the PCR-mediated analysis of repeated DNA polymorphism can be used as a tool to examine genomic relationships of polyploid species.

  6. Sequencing of Australian wild rice genomes reveals ancestral relationships with domesticated rice.

    Science.gov (United States)

    Brozynska, Marta; Copetti, Dario; Furtado, Agnelo; Wing, Rod A; Crayn, Darren; Fox, Glen; Ishikawa, Ryuji; Henry, Robert J

    2016-11-27

    The related A genome species of the Oryza genus are the effective gene pool for rice. Here, we report draft genomes for two Australian wild A genome taxa: O. rufipogon-like population, referred to as Taxon A, and O. meridionalis-like population, referred to as Taxon B. These two taxa were sequenced and assembled by integration of short- and long-read next-generation sequencing (NGS) data to create a genomic platform for a wider rice gene pool. Here, we report that, despite the distinct chloroplast genome, the nuclear genome of the Australian Taxon A has a sequence that is much closer to that of domesticated rice (O. sativa) than to the other Australian wild populations. Analysis of 4643 genes in the A genome clade showed that the Australian annual, O. meridionalis, and related perennial taxa have the most divergent (around 3 million years) genome sequences relative to domesticated rice. A test for admixture showed possible introgression into the Australian Taxon A (diverged around 1.6 million years ago) especially from the wild indica/O. nivara clade in Asia. These results demonstrate that northern Australia may be the centre of diversity of the A genome Oryza and suggest the possibility that this might also be the centre of origin of this group and represent an important resource for rice improvement.

  7. Whole-Genome Sequencing Analysis from the Chikungunya Virus Caribbean Outbreak Reveals Novel Evolutionary Genomic Elements.

    Directory of Open Access Journals (Sweden)

    Kenneth A Stapleford

    2016-01-01

    Full Text Available Chikungunya virus (CHIKV), an alphavirus and member of the Togaviridae family, is capable of causing severe febrile disease in humans. In December of 2013 the Asian Lineage of CHIKV spread from the Old World to the Americas, spreading rapidly throughout the New World. Given this new emergence in naïve populations we studied the viral genetic diversity present in infected individuals to understand how CHIKV may have evolved during this continuing outbreak. We used deep-sequencing technologies coupled with well-established bioinformatics pipelines to characterize the minority variants and diversity present in CHIKV infected individuals from Guadeloupe and Martinique, two islands in the center of the epidemic. We observed changes in the consensus sequence as well as a diverse range of minority variants present at various levels in the population. Furthermore, we found that overall diversity was dramatically reduced after single passages in cell lines. Finally, we constructed an infectious clone from this outbreak and identified a novel 3' untranslated region (UTR) structure, not previously found in nature, that led to increased replication in insect cells. Here we performed an intrahost quasispecies analysis of the new CHIKV outbreak in the Caribbean. We identified novel variants present in infected individuals, as well as a new 3'UTR structure, suggesting that CHIKV has rapidly evolved in a short period of time once it entered this naïve population. These studies highlight the need to continue viral diversity surveillance over time as this epidemic evolves in order to understand the evolutionary potential of CHIKV.

  8. The Arthrobacter arilaitensis Re117 genome sequence reveals its genetic adaptation to the surface of cheese.

    Directory of Open Access Journals (Sweden)

    Christophe Monnet

    Full Text Available Arthrobacter arilaitensis is one of the major bacterial species found at the surface of cheeses, especially in smear-ripened cheeses, where it contributes to the typical colour, flavour and texture properties of the final product. The A. arilaitensis Re117 genome is composed of a 3,859,257 bp chromosome and two plasmids of 50,407 and 8,528 bp. The chromosome shares large regions of synteny with the chromosomes of three environmental Arthrobacter strains for which genome sequences are available: A. aurescens TC1, A. chlorophenolicus A6 and Arthrobacter sp. FB24. In contrast, however, 4.92% of the A. arilaitensis chromosome is composed of IS elements, a portion that is at least 15-fold higher than for the other Arthrobacter strains. Comparative genomic analyses reveal an extensive loss of genes associated with catabolic activities, presumably as a result of adaptation to the properties of the cheese surface habitat. Like the environmental Arthrobacter strains, A. arilaitensis Re117 is well-equipped with enzymes required for the catabolism of major carbon substrates present at cheese surfaces such as fatty acids, amino acids and lactic acid. However, A. arilaitensis has several specificities which seem to be linked to its adaptation to its particular niche. These include the ability to catabolize D-galactonate, a high number of glycine betaine and related osmolyte transporters, two siderophore biosynthesis gene clusters and a high number of Fe(3+)/siderophore transport systems. In model cheese experiments, addition of small amounts of iron strongly stimulated the growth of A. arilaitensis, indicating that cheese is a highly iron-restricted medium. We suggest that there is a strong selective pressure at the surface of cheese for strains with efficient iron acquisition and salt-tolerance systems together with abilities to catabolize substrates such as lactic acid, lipids and amino acids.

  9. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences

    Directory of Open Access Journals (Sweden)

    Alessandra Traini

    2013-01-01

    Full Text Available Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  10. Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication.

    Science.gov (United States)

    Wu, G Albert; Prochnik, Simon; Jenkins, Jerry; Salse, Jerome; Hellsten, Uffe; Murat, Florent; Perrier, Xavier; Ruiz, Manuel; Scalabrin, Simone; Terol, Javier; Takita, Marco Aurélio; Labadie, Karine; Poulain, Julie; Couloux, Arnaud; Jabbari, Kamel; Cattonaro, Federica; Del Fabbro, Cristian; Pinosio, Sara; Zuccolo, Andrea; Chapman, Jarrod; Grimwood, Jane; Tadeo, Francisco R; Estornell, Leandro H; Muñoz-Sanz, Juan V; Ibanez, Victoria; Herrero-Ortega, Amparo; Aleza, Pablo; Pérez-Pérez, Julián; Ramón, Daniel; Brunel, Dominique; Luro, François; Chen, Chunxian; Farmerie, William G; Desany, Brian; Kodira, Chinnappa; Mohiuddin, Mohammed; Harkins, Tim; Fredrikson, Karin; Burns, Paul; Lomsadze, Alexandre; Borodovsky, Mark; Reforgiato, Giuseppe; Freitas-Astúa, Juliana; Quetier, Francis; Navarro, Luis; Roose, Mikeal; Wincker, Patrick; Schmutz, Jeremy; Morgante, Michele; Machado, Marcos Antonio; Talon, Manuel; Jaillon, Olivier; Ollitrault, Patrick; Gmitter, Frederick; Rokhsar, Daniel

    2014-07-01

    Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes--a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange and sour-orange genomes--and show that cultivated types derive from two progenitor species. Although cultivated pummelos represent selections from one progenitor species, Citrus maxima, cultivated mandarins are introgressions of C. maxima into the ancestral mandarin species Citrus reticulata. The most widely cultivated citrus, sweet orange, is the offspring of previously admixed individuals, but sour orange is an F1 hybrid of pure C. maxima and C. reticulata parents, thus implying that wild mandarins were part of the early breeding germplasm. A Chinese wild 'mandarin' diverges substantially from C. reticulata, thus suggesting the possibility of other unrecognized wild citrus species. Understanding citrus phylogeny through genome analysis clarifies taxonomic relationships and facilitates sequence-directed genetic improvement.

  11. Genome sequencing of chimpanzee malaria parasites reveals possible pathways of adaptation to human hosts

    KAUST Repository

    Otto, Thomas D.

    2014-09-09

    Plasmodium falciparum causes most human malaria deaths, having prehistorically evolved from parasites of African Great Apes. Here we explore the genomic basis of P. falciparum adaptation to human hosts by fully sequencing the genome of the closely related chimpanzee parasite species P. reichenowi, and obtaining partial sequence data from a more distantly related chimpanzee parasite (P. gaboni). The close relationship between P. reichenowi and P. falciparum is emphasized by almost complete conservation of genomic synteny, but against this strikingly conserved background we observe major differences at loci involved in erythrocyte invasion. The organization of most virulence-associated multigene families, including the hypervariable var genes, is broadly conserved, but P. falciparum has a smaller subset of rif and stevor genes whose products are expressed on the infected erythrocyte surface. Genome-wide analysis identifies other loci under recent positive selection, but a limited number of changes at the host–parasite interface may have mediated host switching.

  12. Genome sequencing of chimpanzee malaria parasites reveals possible pathways of adaptation to human hosts.

    Science.gov (United States)

    Otto, Thomas D; Rayner, Julian C; Böhme, Ulrike; Pain, Arnab; Spottiswoode, Natasha; Sanders, Mandy; Quail, Michael; Ollomo, Benjamin; Renaud, François; Thomas, Alan W; Prugnolle, Franck; Conway, David J; Newbold, Chris; Berriman, Matthew

    2014-09-09

    Plasmodium falciparum causes most human malaria deaths, having prehistorically evolved from parasites of African Great Apes. Here we explore the genomic basis of P. falciparum adaptation to human hosts by fully sequencing the genome of the closely related chimpanzee parasite species P. reichenowi, and obtaining partial sequence data from a more distantly related chimpanzee parasite (P. gaboni). The close relationship between P. reichenowi and P. falciparum is emphasized by almost complete conservation of genomic synteny, but against this strikingly conserved background we observe major differences at loci involved in erythrocyte invasion. The organization of most virulence-associated multigene families, including the hypervariable var genes, is broadly conserved, but P. falciparum has a smaller subset of rif and stevor genes whose products are expressed on the infected erythrocyte surface. Genome-wide analysis identifies other loci under recent positive selection, but a limited number of changes at the host-parasite interface may have mediated host switching.

  13. Sequencing of bovine herpesvirus 4 V.test strain reveals important genome features

    Directory of Open Access Journals (Sweden)

    Gillet Laurent

    2011-08-01

    Full Text Available Abstract Background Bovine herpesvirus 4 (BoHV-4) is a useful model for the human pathogenic gammaherpesviruses Epstein-Barr virus and Kaposi's Sarcoma-associated Herpesvirus. Although genome manipulations of this virus have been greatly facilitated by the cloning of the BoHV-4 V.test strain as a Bacterial Artificial Chromosome (BAC), the lack of a complete genome sequence for this strain limits its experimental use. Methods In this study, we have determined the complete sequence of BoHV-4 V.test strain by a pyrosequencing approach. Results The long unique coding region (LUR) consists of 108,241 bp encoding at least 79 open reading frames and is flanked by several polyrepetitive DNA units (prDNA). As previously suggested, we showed that the prDNA unit located at the left prDNA-LUR junction (prDNA-G) differs from the other prDNA units (prDNA-inner). Namely, the prDNA-G unit lacks the conserved pac-2 cleavage and packaging signal in its right terminal region. Based on the mechanisms of cleavage and packaging of herpesvirus genomes, this feature implies that only genomes bearing left and right end prDNA units are encapsulated into virions. Conclusions In this study, we have determined the complete genome sequence of the BAC-cloned BoHV-4 V.test strain and identified genome organization features that could be important in other herpesviruses.

  14. Genomic and Functional Characteristics of Human Cytomegalovirus Revealed by Next-Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Steven Sijmons

    2014-03-01

    Full Text Available The complete genome of human cytomegalovirus (HCMV) was elucidated almost 25 years ago using a traditional cloning and Sanger sequencing approach. Analysis of the genetic content of additional laboratory and clinical isolates has led to a better, albeit still incomplete, definition of the coding potential and diversity of wild-type HCMV strains. The introduction of a new generation of massively parallel sequencing technologies, collectively called next-generation sequencing, has profoundly increased the throughput and resolution of the genomics field. These increased possibilities are already leading to a better understanding of the circulating diversity of HCMV clinical isolates. The higher resolution of next-generation sequencing provides new opportunities in the study of intrahost viral population structures. Furthermore, deep sequencing enables novel diagnostic applications for sensitive drug resistance mutation detection. RNA-seq applications have changed the picture of the HCMV transcriptome, which resulted in proof of a vast amount of splicing events and alternative transcripts. This review discusses the application of next-generation sequencing technologies, which has provided a clearer picture of the intricate nature of the HCMV genome. The continuing development and application of novel sequencing technologies will further augment our understanding of this ubiquitous, but elusive, herpesvirus.

  15. A 100%-complete sequence reveals unusually simple genomic features in the hot-spring red alga Cyanidioschyzon merolae

    Directory of Open Access Journals (Sweden)

    Yoshida Yamato

    2007-07-01

    Full Text Available Abstract Background All previously reported eukaryotic nuclear genome sequences have been incomplete, especially in highly repeated units and chromosomal ends. Because repetitive DNA is important for many aspects of biology, complete chromosomal structures are fundamental for understanding eukaryotic cells. Our earlier, nearly complete genome sequence of the hot-spring red alga Cyanidioschyzon merolae revealed several unique features, including just three ribosomal DNA copies, very few introns, and a small total number of genes. However, because the exact structures of certain functionally important repeated elements remained ambiguous, that sequence was not complete. Obviously, those ambiguities needed to be resolved before the unique features of the C. merolae genome could be summarized, and the ambiguities could only be resolved by completing the sequence. Therefore, we aimed to complete all previous gaps and sequence all remaining chromosomal ends, and now report the first nuclear-genome sequence for any eukaryote that is 100% complete. Results Our present complete sequence consists of 16546747 nucleotides covering 100% of the 20 linear chromosomes from telomere to telomere, representing the simple and unique chromosomal structures of the eukaryotic cell. We have unambiguously established that the C. merolae genome contains the smallest known histone-gene cluster, a unique telomeric repeat for all chromosomal ends, and an extremely low number of transposons. Conclusion By virtue of these attributes and others that we had discovered previously, C. merolae appears to have the simplest nuclear genome of the non-symbiotic eukaryotes. These unusually simple genomic features in the 100% complete genome sequence of C. merolae are extremely useful for further studies of eukaryotic cells.

  16. A 100%-complete sequence reveals unusually simple genomic features in the hot-spring red alga Cyanidioschyzon merolae

    Science.gov (United States)

    Nozaki, Hisayoshi; Takano, Hiroyoshi; Misumi, Osami; Terasawa, Kimihiro; Matsuzaki, Motomichi; Maruyama, Shinichiro; Nishida, Keiji; Yagisawa, Fumi; Yoshida, Yamato; Fujiwara, Takayuki; Takio, Susumu; Tamura, Katsunori; Chung, Sung Jin; Nakamura, Soichi; Kuroiwa, Haruko; Tanaka, Kan; Sato, Naoki; Kuroiwa, Tsuneyoshi

    2007-01-01

    Background All previously reported eukaryotic nuclear genome sequences have been incomplete, especially in highly repeated units and chromosomal ends. Because repetitive DNA is important for many aspects of biology, complete chromosomal structures are fundamental for understanding eukaryotic cells. Our earlier, nearly complete genome sequence of the hot-spring red alga Cyanidioschyzon merolae revealed several unique features, including just three ribosomal DNA copies, very few introns, and a small total number of genes. However, because the exact structures of certain functionally important repeated elements remained ambiguous, that sequence was not complete. Obviously, those ambiguities needed to be resolved before the unique features of the C. merolae genome could be summarized, and the ambiguities could only be resolved by completing the sequence. Therefore, we aimed to complete all previous gaps and sequence all remaining chromosomal ends, and now report the first nuclear-genome sequence for any eukaryote that is 100% complete. Results Our present complete sequence consists of 16546747 nucleotides covering 100% of the 20 linear chromosomes from telomere to telomere, representing the simple and unique chromosomal structures of the eukaryotic cell. We have unambiguously established that the C. merolae genome contains the smallest known histone-gene cluster, a unique telomeric repeat for all chromosomal ends, and an extremely low number of transposons. Conclusion By virtue of these attributes and others that we had discovered previously, C. merolae appears to have the simplest nuclear genome of the non-symbiotic eukaryotes. These unusually simple genomic features in the 100% complete genome sequence of C. merolae are extremely useful for further studies of eukaryotic cells. PMID:17623057

  17. Metabolic diversity and ecological niches of Achromatium populations revealed with single-cell genomic sequencing

    Directory of Open Access Journals (Sweden)

    Muammar Mansor

    2015-08-01

    Full Text Available Large, sulfur-cycling, calcite-precipitating bacteria in the genus Achromatium represent a significant proportion of bacterial communities near sediment-water interfaces throughout the world. Our understanding of their potentially crucial roles in calcium, carbon, sulfur, nitrogen, and iron cycling is limited because they have not been cultured or sequenced using environmental genomics approaches to date. We utilized single-cell genomic sequencing to obtain one incomplete and two nearly complete draft genomes for Achromatium collected at Warm Mineral Springs, FL. Based on 16S rRNA gene sequences, the three cells represent distinct and relatively distant Achromatium populations (91-92% identity). The draft genomes encode key genes involved in sulfur and hydrogen oxidation; oxygen, nitrogen and polysulfide respiration; carbon and nitrogen fixation; organic carbon assimilation and storage; chemotaxis; twitching motility; antibiotic resistance; and membrane transport. Known genes for iron and manganese energy metabolism were not detected. The presence of pyrophosphatase and vacuolar (V-type) ATPases, which are generally rare in bacterial genomes, suggests a role for these enzymes in calcium transport, proton pumping, and/or energy generation in the membranes of calcite-containing inclusions.
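
    A figure such as "91-92% identity" between 16S rRNA gene sequences is typically derived from a pairwise alignment. The Python sketch below shows one common way to compute it, assuming the two sequences are already aligned with "-" as the gap character and that gapped columns are excluded from the denominator. The abstract does not state which definition the authors used, so the function name and the gap handling are illustrative assumptions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the ungapped columns of a pairwise alignment.

    Hypothetical helper: both inputs must be the same length and use "-"
    for gaps. Columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip gapped columns
        compared += 1
        if base_a == base_b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

    Counting gaps as mismatches instead would give a lower value for the same alignment, which is one reason identity figures from different studies are not always directly comparable.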

  18. Sequencing the genome of Marssonina brunnea reveals fungus-poplar co-evolution

    Directory of Open Access Journals (Sweden)

    Zhu Sheng

    2012-08-01

    Full Text Available Abstract Background The fungus Marssonina brunnea is a causal pathogen of Marssonina leaf spot that devastates poplar plantations by defoliating susceptible trees before normal fall leaf drop. Results We sequence the genome of M. brunnea with a size of 52 Mb assembled into 89 scaffolds, representing the first sequenced Dermateaceae genome. By inoculating this fungus onto a poplar hybrid clone, we investigate how M. brunnea interacts and co-evolves with its host to colonize poplar leaves. While a handful of virulence genes in M. brunnea, mostly from the LysM family, are found to be up-regulated during infection, the poplar down-regulates its resistance genes, such as nucleotide binding site domains and leucine rich repeats, in response to infection. From 10,027 predicted proteins of M. brunnea in a comparison with those from poplar, we identify four poplar transferases that stimulate the host to resist M. brunnea. These transferase-encoding genes may have driven the co-evolution of M. brunnea and Populus during the process of infection and anti-infection. Conclusions Our results from the draft sequence of the M. brunnea genome provide evidence for genome-genome interactions that play an important role in poplar-pathogen co-evolution. This knowledge could help to design effective strategies for controlling Marssonina leaf spot in poplar.

  19. Next generation sequencing and FISH reveal uneven and nonrandom microsatellite distribution in two grasshopper genomes.

    Science.gov (United States)

    Ruiz-Ruano, Francisco J; Cuadrado, Ángeles; Montiel, Eugenia E; Camacho, Juan Pedro M; López-León, María Dolores

    2015-06-01

    Simple sequence repeats (SSRs), also known as microsatellites, are one of the prominent DNA sequences shaping the repeated fraction of eukaryotic genomes. In spite of their profuse use as molecular markers for a variety of genetic and evolutionary studies, their genomic location, distribution, and function are not yet well understood. Here we report the first thorough joint analysis of microsatellite motifs at both genomic and chromosomal levels in animal species, by a combination of 454 sequencing and fluorescent in situ hybridization (FISH) techniques performed on two grasshopper species. The in silico analysis of the 454 reads suggested that microsatellite expansion is not driving size increase of these genomes, as SSR abundance was higher in the species showing the smallest genome. However, the two species showed the same uneven and nonrandom location of SSRs, with clear predominance of dinucleotide motifs and association with several types of repetitive elements, mostly histone gene spacers, ribosomal DNA intergenic spacers (IGS), and transposable elements (TEs). The FISH analysis showed a dispersed chromosome distribution of microsatellite motifs in euchromatic regions, in coincidence with chromosome location patterns previously observed for many mobile elements in these species. However, some SSR motifs were clustered, especially those located in the histone gene cluster.

  20. Whole-genome sequencing reveals the diversity of cattle copy number variations and multicopy genes

    Science.gov (United States)

    Structural and functional impacts of copy number variations (CNVs) on livestock genomes are not yet well understood. We identified 1853 CNV regions using population-scale sequencing data generated from 75 cattle representing 8 breeds (Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, Romagnol...

  1. Ecophysiology of Thioploca ingrica as revealed by the complete genome sequence supplemented with proteomic evidence.

    Science.gov (United States)

    Kojima, Hisaya; Ogura, Yoshitoshi; Yamamoto, Nozomi; Togashi, Tomoaki; Mori, Hiroshi; Watanabe, Tomohiro; Nemoto, Fumiko; Kurokawa, Ken; Hayashi, Tetsuya; Fukui, Manabu

    2015-05-01

    Large sulfur-oxidizing bacteria, which accumulate a high concentration of nitrate, are important constituents of aquatic sediment ecosystems. No representative of this group has been isolated in pure culture, and only fragmented draft genome sequences are available for these microorganisms. In this study, we successfully reconstituted the genome of Thioploca ingrica from metagenomic sequences, thereby generating the first complete genome sequence from this group. The Thioploca samples for the metagenomic analysis were obtained from a freshwater lake in Japan. A PCR-free paired-end library was constructed from the DNA extracted from the samples and was sequenced on the Illumina MiSeq platform. By closing gaps within and between the scaffolds, we obtained a circular chromosome and a plasmid-like element. The reconstituted chromosome was 4.8 Mbp in length with a 41.2% GC content. A sulfur oxidation pathway identical to that suggested for the closest relatives of Thioploca was deduced from the reconstituted genome. A full set of genes required for respiratory nitrate reduction to dinitrogen gas was also identified. We further performed a proteomic analysis of the Thioploca sample and detected many enzymes/proteins involved in sulfur oxidation, nitrate respiration and inorganic carbon fixation as major components of the protein extracts from the sample, suggesting that these metabolic activities are strongly associated with the physiology of T. ingrica in lake sediment.

  2. Genome Sequencing Reveals Loci under Artificial Selection that Underlie Disease Phenotypes in the Laboratory Rat

    NARCIS (Netherlands)

    Atanur, Santosh S.; Diaz, Ana Garcia; Maratou, Klio; Sarkis, Allison; Rotival, Maxime; Game, Laurence; Tschannen, Michael R.; Kaisaki, Pamela J.; Otto, Georg W.; Ma, Man Chun John; Keane, Thomas M.; Hummel, Oliver; Saar, Kathrin; Chen, Wei; Guryev, Victor; Gopalakrishnan, Kathirvel; Garrett, Michael R.; Joe, Bina; Citterio, Lorena; Bianchi, Giuseppe; McBride, Martin; Dominiczak, Anna; Adams, David J.; Serikawa, Tadao; Flicek, Paul; Cuppen, Edwin; Hubner, Norbert; Petretto, Enrico; Gauguier, Dominique; Kwitek, Anne; Jacob, Howard; Aitman, Timothy J.

    2013-01-01

    Large numbers of inbred laboratory rat strains have been developed for a range of complex disease phenotypes. To gain insights into the evolutionary pressures underlying selection for these phenotypes, we sequenced the genomes of 27 rat strains, including 11 models of hypertension, diabetes, and ins

  3. Genome sequencing reveals loci under artificial selection that underlie disease phenotypes in the laboratory rat

    NARCIS (Netherlands)

    Atanur, S.S.; Diaz, A.G.; Maratou, K.; Sarkis, A.; Rotival, M.; Game, L.; Tschannen, M.R.; Kaisaki, P.J.; Otto, G.W.; Ma, M.C.; Keane, T.M.; Hummel, O.; Saar, K.; Chen, W.; Guryev, V.; Gopalakrishnan, K.; Garrett, M.R.; Joe, B.; Citterio, L.; Bianchi, G.; McBride, M.; Dominiczak, A.; Adams, D.J.; Serikawa, T.; Flicek, P.; Cuppen, E.; Hubner, N.; Petretto, E.; Gauguier, D.; Kwitek, A.; Jacob, H.; Aitman, T.J.

    2013-01-01

    Large numbers of inbred laboratory rat strains have been developed for a range of complex disease phenotypes. To gain insights into the evolutionary pressures underlying selection for these phenotypes, we sequenced the genomes of 27 rat strains, including 11 models of hypertension, diabetes, and ins

  4. Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication

    Science.gov (United States)

    Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes—a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange and sour-o...

  5. Full Genome Sequence Analysis of Two Isolates Reveals a Novel Xanthomonas Species Close to the Sugarcane Pathogen Xanthomonas albilineans

    Directory of Open Access Journals (Sweden)

    Isabelle Pieretti

    2015-07-01

    Full Text Available Xanthomonas albilineans is the bacterium responsible for leaf scald, a lethal disease of sugarcane. Within the Xanthomonas genus, X. albilineans exhibits distinctive genomic characteristics including the presence of significant genome erosion, a non-ribosomal peptide synthesis (NRPS) locus involved in albicidin biosynthesis, and a type 3 secretion system (T3SS) of the Salmonella pathogenicity island-1 (SPI-1) family. We sequenced two X. albilineans-like strains isolated from unusual environments, i.e., from dew droplets on sugarcane leaves and from the wild grass Paspalum dilatatum, and compared these genome sequences with those of two strains of X. albilineans and three of Xanthomonas sacchari. Average nucleotide identity (ANI) and multi-locus sequence analysis (MLSA) showed that both X. albilineans-like strains belong to a new species close to X. albilineans that we have named “Xanthomonas pseudalbilineans”. X. albilineans and “X. pseudalbilineans” share many genomic features including (i) the lack of genes encoding a hypersensitive response and pathogenicity type 3 secretion system (Hrp-T3SS), and (ii) genome erosion that probably occurred in a common progenitor of both species. Our comparative analyses also revealed specific genomic features that may help X. albilineans interact with sugarcane, e.g., a PglA endoglucanase, three TonB-dependent transporters and a glycogen metabolism gene cluster. Other specific genomic features found in the “X. pseudalbilineans” genome may contribute to its fitness and specific ecological niche.

  6. Heteroplasmy in the mitochondrial genomes of human lice and ticks revealed by high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Haoyu Xiong

    Full Text Available The typical mitochondrial (mt) genomes of bilateral animals consist of 37 genes on a single circular chromosome. The mt genomes of the human body louse, Pediculus humanus, and the human head louse, Pediculus capitis, however, are extensively fragmented and contain 20 minichromosomes, with one to three genes on each minichromosome. Heteroplasmy, i.e. nucleotide polymorphisms in the mt genome within individuals, has been shown to be significantly higher in the mt cox1 gene of human lice than in humans and other animals that have the typical mt genomes. To understand whether the extent of heteroplasmy in human lice is associated with mt genome fragmentation, we sequenced the entire coding regions of all of the mt minichromosomes of six human body lice and six human head lice from Ethiopia, China and France with an Illumina HiSeq platform. For comparison, we also sequenced the entire coding regions of the mt genomes of seven species of ticks, which have the typical mitochondrial genome organization of bilateral animals. We found that the level of heteroplasmy varies significantly both among the human lice and among the ticks. The human lice from Ethiopia have a significantly higher level of heteroplasmy than those from China and France (Pt<0.05). The tick Amblyomma cajennense has a significantly higher level of heteroplasmy than other ticks (Pt<0.05). Our results indicate that heteroplasmy level can be substantially variable within a species and among closely related species, and does not appear to be determined by single factors such as genome fragmentation.

  7. Genome sequence of Candidatus Nitrososphaera evergladensis from group I.1b enriched from Everglades soil reveals novel genomic features of the ammonia-oxidizing archaea.

    Directory of Open Access Journals (Sweden)

    Kateryna V Zhalnina

    Full Text Available The activity of ammonia-oxidizing archaea (AOA) leads to the loss of nitrogen from soil, pollution of water sources and elevated emissions of greenhouse gas. To date, eight AOA genomes are available in the public databases; seven are from the group I.1a of the Thaumarchaeota and only one is from the group I.1b, isolated from hot springs. Many soils are dominated by AOA from the group I.1b, but the genomes of soil representatives of this group have not been sequenced and functionally characterized. The lack of knowledge of metabolic pathways of soil AOA presents a critical gap in understanding their role in biogeochemical cycles. Here, we describe the first complete genome of the soil archaeon Candidatus Nitrososphaera evergladensis, which has been reconstructed from metagenomic sequencing of a highly enriched culture obtained from an agricultural soil. The AOA enrichment was sequenced with the high throughput next generation sequencing platforms from Pacific Biosciences and Ion Torrent. The de novo assembly of sequences resulted in one 2.95 Mb contig. Annotation of the reconstructed genome revealed many similarities of the basic metabolism with the rest of the sequenced AOA. Ca. N. evergladensis belongs to the group I.1b and shares only 40% of whole-genome homology with the closest sequenced relative Ca. N. gargensis. Detailed analysis of the genome revealed coding sequences that were completely absent from the group I.1a. These unique sequences code for proteins involved in control of DNA integrity, transporters, two-component systems and a versatile CRISPR defense system. Notably, genomes from the group I.1b have more gene duplications compared to the genomes from the group I.1a. We suggest that the presence of these unique genes and gene duplications may be associated with the environmental versatility of this group.

  8. Complete Genome Sequence of the Filamentous Fungus Aspergillus westerdijkiae Reveals the Putative Biosynthetic Gene Cluster of Ochratoxin A

    Science.gov (United States)

    Chakrabortti, Alolika; Li, Jinming

    2016-01-01

    Ochratoxin A (OTA) is a common mycotoxin that contaminates food and agricultural products. Sequencing of the complete genome of Aspergillus westerdijkiae, a major producer of OTA, reveals more than 50 biosynthetic gene clusters, including a putative OTA biosynthetic gene cluster that encodes a dozen enzymes, transporters, and regulatory proteins. PMID:27635003

  9. Comparative genomic sequence analysis of strawberry and other rosids reveals significant microsynteny

    Directory of Open Access Journals (Sweden)

    Abbott Albert

    2010-06-01

    Full Text Available Abstract Background: Fragaria belongs to the Rosaceae, an economically important family that includes a number of important fruit producing genera such as Malus and Prunus. Using genomic sequences from 50 Fragaria fosmids, we have examined the microsynteny between Fragaria and other plant models. Results: In more than half of the strawberry fosmids, we found syntenic regions that are conserved in Populus, Vitis, Medicago and/or Arabidopsis with Populus containing the greatest number of syntenic regions with Fragaria. The longest syntenic region was between LG VIII of the poplar genome and the strawberry fosmid 72E18, where seven out of twelve predicted genes were collinear. We also observed an unexpectedly high level of conserved synteny between Fragaria (rosid I) and Vitis (basal rosid). One of the strawberry fosmids, 34E24, contained a cluster of R gene analogs (RGAs) with NBS and LRR domains. We detected clusters of RGAs with high sequence similarity to those in 34E24 in all the genomes compared. In the phylogenetic tree we have generated, all the NBS-LRR genes grouped together with Arabidopsis CNL-A type NBS-LRR genes. The Fragaria RGA grouped together with those of Vitis and Populus in the phylogenetic tree. Conclusions: Our analysis shows considerable microsynteny between Fragaria and other plant genomes such as Populus, Medicago, Vitis, and Arabidopsis to a lesser degree. We also detected a cluster of NBS-LRR type genes that are conserved in all the genomes compared.

  10. Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters.

    Science.gov (United States)

    Schorn, Michelle A; Alanjary, Mohammad M; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R; Ziemert, Nadine; Moore, Bradley S

    2016-12-01

    Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites.

  11. Whole-genome sequencing of uropathogenic Escherichia coli reveals long evolutionary history of diversity and virulence.

    Science.gov (United States)

    Lo, Yancy; Zhang, Lixin; Foxman, Betsy; Zöllner, Sebastian

    2015-08-01

    Uropathogenic Escherichia coli (UPEC) are phenotypically and genotypically very diverse. This diversity makes it challenging to understand the evolution of UPEC adaptations responsible for causing urinary tract infections (UTI). To gain insight into the relationship between evolutionary divergence and adaptive paths to uropathogenicity, we sequenced at deep coverage (190×) the genomes of 19 E. coli strains from urinary tract infection patients from the same geographic area. Our sample consisted of 14 UPEC isolates and 5 non-UTI-causing (commensal) rectal E. coli isolates. After identifying strain variants using de novo assembly-based methods, we clustered the strains based on pairwise sequence differences using a neighbor-joining algorithm. We examined evolutionary signals on the whole-genome phylogeny and contrasted these signals with those found on gene trees constructed based on specific uropathogenic virulence factors. The whole-genome phylogeny showed that the divergence between UPEC and commensal E. coli strains without known UPEC virulence factors happened over 32 million generations ago. Pairwise diversity between any two strains was also high, suggesting multiple genetic origins of uropathogenic strains in a small geographic region. Contrasting the whole-genome phylogeny with three gene trees constructed from common uropathogenic virulence factors, we detected no selective advantage of these virulence genes over other genomic regions. These results suggest that UPEC acquired uropathogenicity a long time ago and used it opportunistically to cause extraintestinal infections.

  12. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima

    NARCIS (Netherlands)

    Chipman, Ariel D; Ferrier, David E K; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S T; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C; Alonso, Claudio R; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C J; Blankenburg, Kerstin P; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K; Du Pasquier, Louis; Duncan, Elizabeth J; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D; Extavour, Cassandra G; Francisco, Liezl; Gabaldón, Toni; Gillis, William J; Goodwin-Horn, Elizabeth A; Green, Jack E; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J P; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H L; Hunn, Julia P; Hunnekuhl, Vera S; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N; Jiggins, Francis M; Jones, Tamsin E; Kaiser, Tobias S; Kalra, Divya; Kenny, Nathan J; Korchina, Viktoriya; Kovar, Christie L; Kraus, F Bernhard; Lapraz, François; Lee, Sandra L; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C; Robertson, Helen E; Robertson, Hugh M; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E; Schurko, Andrew M; Siggens, Kenneth W; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M; Willis, Judith H; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M; Worley, Kim C; Gibbs, Richard A; Akam, Michael; Richards, Stephen

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present

  14. Combining genomic sequencing methods to explore viral diversity and reveal potential virus-host interactions

    Directory of Open Access Journals (Sweden)

    Cheryl-Emiliane Tien Chow

    2015-04-01

    Full Text Available Viral diversity and virus-host interactions in oxygen-starved regions of the ocean, also known as oxygen minimum zones (OMZs), remain relatively unexplored. Microbial community metabolism in OMZs alters nutrient and energy flow through marine food webs, resulting in biological nitrogen loss and greenhouse gas production. Thus, viruses infecting OMZ microbes have the potential to modulate community metabolism with resulting feedback on ecosystem function. Here, we describe viral communities inhabiting oxic surface (10 m) and oxygen-starved basin (200 m) waters of Saanich Inlet, a seasonally anoxic fjord on the coast of Vancouver Island, British Columbia, using viral metagenomics and complete viral fosmid sequencing on samples collected between April 2007 and April 2010. Of 6459 open reading frames (ORFs) predicted across all 34 viral fosmids, 77.6% (n=5010) had no homology to reference viral genomes. These fosmids recruited a higher proportion of viral metagenomic sequences from Saanich Inlet than from nearby northeastern subarctic Pacific Ocean (Line P) waters, indicating differences in the viral communities between coastal and open ocean locations. While functional annotations of fosmid ORFs were limited, recruitment to NCBI’s non-redundant ‘nr’ database and publicly available single-cell genomes identified putative viruses infecting marine thaumarchaeal and SUP05 proteobacteria to provide potential host linkages with relevance to coupled biogeochemical cycling processes in OMZ waters. Taken together, these results highlight the power of coupled analyses of multiple sequence data types, such as viral metagenomic and fosmid sequence data with prokaryotic single cell genomes, to chart viral diversity, elucidate genomic and ecological contexts for previously unclassifiable viral sequences, and identify novel host interactions in natural and engineered ecosystems.

  15. Genome sequencing reveals complex secondary metabolome in the marine actinomycete Salinispora tropica

    Energy Technology Data Exchange (ETDEWEB)

    Udwary, Daniel W.; Zeigler, Lisa; Asolkar, Ratnakar; Singan, Vasanth; Lapidus, Alla; Fenical, William; Jensen, Paul R.; Moore, Bradley S.

    2007-05-01

    Recent fermentation studies have identified actinomycetes of the marine-dwelling genus Salinispora as prolific natural product producers. To further evaluate their biosynthetic potential, we analyzed all identifiable secondary natural product gene clusters from the recently sequenced 5,184,724 bp S. tropica CNB-440 circular genome. Our analysis shows that biosynthetic potential meets or exceeds that shown by previous Streptomyces genome sequences as well as other natural product-producing actinomycetes. The S. tropica genome features nine polyketide synthase systems of every known formally classified family, non-ribosomal peptide synthetases and several hybrid clusters. While a few clusters appear to encode molecules previously identified in Streptomyces species, the majority of the 15 biosynthetic loci are novel. Specific chemical information about putative and observed natural product molecules is presented and discussed. In addition, our bioinformatic analysis was critical for the structure elucidation of the novel polyene macrolactam salinilactam A. This study demonstrates the potential for genomic analysis to complement and strengthen traditional natural product isolation studies and firmly establishes the genus Salinispora as a rich source of novel drug-like molecules.

  16. Shotgun Bisulfite Sequencing of the Betula platyphylla Genome Reveals the Tree’s DNA Methylation Patterning

    Directory of Open Access Journals (Sweden)

    Chang Su

    2014-12-01

    Full Text Available DNA methylation plays a critical role in the regulation of gene expression. Most studies of DNA methylation have been performed in herbaceous plants, and little is known about the methylation patterns in tree genomes. In the present study, we generated a map of methylated cytosines at single base pair resolution for Betula platyphylla (white birch) by bisulfite sequencing combined with transcriptomics to analyze DNA methylation and its effects on gene expression. We obtained a detailed view of the function of DNA methylation sequence composition and distribution in the genome of B. platyphylla. There are 34,460 genes in the whole genome of birch, and 31,297 genes are methylated. Conservatively, we estimated that 14.29% of genomic cytosines are methylcytosines in birch. Among the methylation sites, the CHH context accounts for 48.86%, and is the largest proportion. Combined transcriptome and methylation analysis showed that the genes with moderate methylation levels had higher expression levels than genes with high and low methylation. In addition, methylated genes are highly enriched for the GO subcategories of binding activities, catalytic activities, cellular processes, response to stimulus and cell death, suggesting that methylation mediates these pathways in birch trees.

  17. Ultra Deep Sequencing of a Baculovirus Population Reveals Widespread Genomic Variations

    Directory of Open Access Journals (Sweden)

    Aurélien Chateigner

    2015-07-01

    Full Text Available Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%). K-means clustering was used to classify the mutations in four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs). Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential.

  18. Ultra-deep sequencing reveals the subclonal structure and genomic evolution of oral squamous cell carcinoma

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    Background: Oral squamous cell carcinoma (OSCC), a subgroup of head and neck squamous cell carcinoma (HNSCC), is primarily caused by alcohol consumption and tobacco use. Recent DNA sequencing studies suggest that HNSCC are very heterogeneous between patients; however the intra-patient subclonal...... structure remains unexplored due to lack of sampling multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with ultra-deep targeted sequencing on biopsies from 5 stage IV...... complex subclonal architectures comprising distinct subclones only found in geographically distinct regions of the tumors. The metastatic potential of the tumor is acquired early in the tumor evolution, as indicated by the lymph node sharing the majority of the mutations with the tumor biopsies, while...

  19. The First Myriapod Genome Sequence Reveals Conservative Arthropod Gene Content and Genome Organisation in the Centipede Strigamia maritima

    Science.gov (United States)

    Chipman, Ariel D.; Ferrier, David E. K.; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S. T.; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C.; Alonso, Claudio R.; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C. J.; Blankenburg, Kerstin P.; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K.; Du Pasquier, Louis; Duncan, Elizabeth J.; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D.; Extavour, Cassandra G.; Francisco, Liezl; Gabaldón, Toni; Gillis, William J.; Goodwin-Horn, Elizabeth A.; Green, Jack E.; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J. P.; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H. L.; Hunn, Julia P.; Hunnekuhl, Vera S.; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N.; Jiggins, Francis M.; Jones, Tamsin E.; Kaiser, Tobias S.; Kalra, Divya; Kenny, Nathan J.; Korchina, Viktoriya; Kovar, Christie L.; Kraus, F. Bernhard; Lapraz, François; Lee, Sandra L.; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N.; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J.; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H.; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C.; Robertson, Helen E.; Robertson, Hugh M.; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E.; Schurko, Andrew M.; Siggens, Kenneth W.; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J.; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M.; Willis, Judith H.; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M.; Worley, Kim C.; Gibbs, Richard A.; Akam, Michael; Richards, Stephen

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific

  20. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima.

    Directory of Open Access Journals (Sweden)

    Ariel D Chipman

    2014-11-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations

  1. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima.

    Science.gov (United States)

    Chipman, Ariel D; Ferrier, David E K; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S T; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C; Alonso, Claudio R; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C J; Blankenburg, Kerstin P; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K; Du Pasquier, Louis; Duncan, Elizabeth J; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D; Extavour, Cassandra G; Francisco, Liezl; Gabaldón, Toni; Gillis, William J; Goodwin-Horn, Elizabeth A; Green, Jack E; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J P; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H L; Hunn, Julia P; Hunnekuhl, Vera S; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N; Jiggins, Francis M; Jones, Tamsin E; Kaiser, Tobias S; Kalra, Divya; Kenny, Nathan J; Korchina, Viktoriya; Kovar, Christie L; Kraus, F Bernhard; Lapraz, François; Lee, Sandra L; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C; Robertson, Helen E; Robertson, Hugh M; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E; Schurko, Andrew M; Siggens, Kenneth W; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M; Willis, Judith H; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M; Worley, Kim C; Gibbs, Richard A; Akam, Michael; Richards, Stephen

    2014-11-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific

  2. Genome Sequencing and Comparative Genomics Analysis Revealed Pathogenic Potential in Penicillium capsulatum as a Novel Fungal Pathogen Belonging to Eurotiales

    Science.gov (United States)

    Yang, Ying; Chen, Min; Li, Zongwei; Al-Hatmi, Abdullah M. S.; de Hoog, Sybren; Pan, Weihua; Ye, Qiang; Bo, Xiaochen; Li, Zhen; Wang, Shengqi; Wang, Junzhi; Chen, Huipeng; Liao, Wanqing

    2016-01-01

    Penicillium capsulatum is a rare Penicillium species used in paper manufacturing, but recently it has been reported to cause invasive infection. To research the pathogenicity of the clinical Penicillium strain, we sequenced the genomes and transcriptomes of the clinical and environmental strains of P. capsulatum. Comparative analyses of these two P. capsulatum strains and closely related strains belonging to Eurotiales were performed. The assembled genome sizes of P. capsulatum are approximately 34.4 Mbp in length and encode 11,080 predicted genes. The different isolates of P. capsulatum are highly similar, with the exception of several unique genes, INDELs or SNPs in the genes coding for glycosyl hydrolases, amino acid transporters and circumsporozoite protein. A phylogenomic analysis was performed based on the whole genome data of 38 strains belonging to Eurotiales. By comparing the whole genome sequences and the virulence-related genes from 20 important related species, including fungal pathogens and non-human pathogens belonging to Eurotiales, we found meaningful pathogenicity characteristics between P. capsulatum and its closely related species. Our research indicated that P. capsulatum may be a neglected opportunistic pathogen. This study is beneficial for mycologists, geneticists and epidemiologists to achieve a deeper understanding of the genetic basis of the role of P. capsulatum as a newly reported fungal pathogen. PMID:27761131

  3. Reconstruction of the lipid metabolism for the microalga Monoraphidium neglectum from its genome sequence reveals characteristics suitable for biofuel production.

    Science.gov (United States)

    Bogen, Christian; Al-Dilaimi, Arwa; Albersmeier, Andreas; Wichmann, Julian; Grundmann, Michael; Rupp, Oliver; Lauersen, Kyle J; Blifernez-Klassen, Olga; Kalinowski, Jörn; Goesmann, Alexander; Mussgnug, Jan H; Kruse, Olaf

    2013-12-28

    Microalgae are gaining importance as sustainable production hosts in the fields of biotechnology and bioenergy. A robust biomass accumulating strain of the genus Monoraphidium (SAG 48.87) was investigated in this work as a potential feedstock for biofuel production. The genome was sequenced, annotated, and key enzymes for triacylglycerol formation were elucidated. Monoraphidium neglectum was identified as an oleaginous species with favourable growth characteristics as well as a high potential for crude oil production, based on neutral lipid contents of approximately 21% (dry weight) under nitrogen starvation, composed of predominantly C18:1 and C16:0 fatty acids. Further characterization revealed growth in a relatively wide pH range and salt concentrations of up to 1.0% NaCl, in which the cells exhibited larger structures. This first full genome sequencing of a member of the Selenastraceae revealed a diploid, approximately 68 Mbp genome with a G + C content of 64.7%. The circular chloroplast genome was assembled to a 135,362 bp single contig, containing 67 protein-coding genes. The assembly of the mitochondrial genome resulted in two contigs with an approximate total size of 94 kb, the largest known mitochondrial genome within algae. 16,761 protein-coding genes were assigned to the nuclear genome. Comparison of gene sets with respect to functional categories revealed a higher gene number assigned to the category "carbohydrate metabolic process" and in "fatty acid biosynthetic process" in M. neglectum when compared to Chlamydomonas reinhardtii and Nannochloropsis gaditana, indicating a higher metabolic diversity for applications in carbohydrate conversions of biotechnological relevance. The genome of M. neglectum, as well as the metabolic reconstruction of crucial lipid pathways, provides new insights into the diversity of the lipid metabolism in microalgae. The results of this work provide a platform to encourage the development of this strain for

  4. Complete mitochondrial genome sequences of three bats species and whole genome mitochondrial analyses reveal patterns of codon bias and lend support to a basal split in Chiroptera.

    Science.gov (United States)

    Meganathan, P R; Pagan, Heidi J T; McCulloch, Eve S; Stevens, Richard D; Ray, David A

    2012-01-15

    Order Chiroptera is a unique group of mammals whose members have attained self-powered flight as their main mode of locomotion. Much speculation persists regarding bat evolution; however, lack of sufficient molecular data hampers evolutionary and conservation studies. Of ~1200 species, complete mitochondrial genome sequences are available for only eleven. Additional sequences should be generated if we are to resolve many questions concerning these fascinating mammals. Herein, we describe the complete mitochondrial genomes of three bats: Corynorhinus rafinesquii, Lasiurus borealis and Artibeus lituratus. We also compare the currently available mitochondrial genomes and analyze codon usage in Chiroptera. C. rafinesquii, L. borealis and A. lituratus mitochondrial genomes are 16438 bp, 17048 bp and 16709 bp, respectively. Genome organization and gene arrangements are similar to other bats. Phylogenetic analyses using complete mitochondrial genome sequences support previously established phylogenetic relationships and suggest utility in future studies focusing on the evolutionary aspects of these species. Comprehensive analyses of available bat mitochondrial genomes reveal distinct nucleotide patterns and synonymous codon preferences corresponding to different chiropteran families. These patterns suggest that mutational and selection forces are acting to different extents within Chiroptera and shape their mitochondrial genomes.

  5. Whole genome sequencing reveals a de novo SHANK3 mutation in familial autism spectrum disorder.

    Directory of Open Access Journals (Sweden)

    Sergio I Nemirovsky

    Clinical genomics promises to be especially suitable for the study of etiologically heterogeneous conditions such as Autism Spectrum Disorder (ASD). Here we present three siblings with ASD where we evaluated the usefulness of Whole Genome Sequencing (WGS) for the diagnostic approach to ASD. We identified a family segregating ASD in three siblings with an unidentified cause. We performed WGS in the three probands and used a state-of-the-art comprehensive bioinformatic analysis pipeline and prioritized the identified variants located in genes likely to be related to ASD. We validated the finding by Sanger sequencing in the probands and their parents. Three male siblings presented a syndrome characterized by severe intellectual disability, absence of language, autism spectrum symptoms and epilepsy with negative family history for mental retardation, language disorders, ASD or other psychiatric disorders. We found germline mosaicism for a heterozygous deletion of a cytosine in the exon 21 of the SHANK3 gene, resulting in a missense sequence of 5 codons followed by a premature stop codon (NM_033517:c.3259_3259delC, p.Ser1088Profs*6). We reported an infrequent form of familial ASD where WGS proved useful in the clinic. We identified a mutation in SHANK3 that underscores its relevance in Autism Spectrum Disorder.

  6. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing

    DEFF Research Database (Denmark)

    Li, Ying-hui; Zhao, Shan-cen; Ma, Jian-xin

    2013-01-01

    BACKGROUND: Artificial selection played an important role in the origin of modern Glycine max cultivars from the wild soybean Glycine soja. To elucidate the consequences of artificial selection accompanying the domestication and modern improvement of soybean, 25 new and 30 published whole-genome re-sequencing accessions, which represent wild, domesticated landrace, and Chinese elite soybean populations were analyzed. RESULTS: A total of 5,102,244 single nucleotide polymorphisms (SNPs) and 707,969 insertion/deletions were identified. Among the SNPs detected, 25.5% were not described previously. We found that artificial selection during domestication led to more pronounced reduction in the genetic diversity of soybean than the switch from landraces to elite cultivars. Only a small proportion (2.99%) of the whole genomic regions appear to be affected by artificial selection for preferred agricultural traits...

  7. Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication

    Science.gov (United States)

    Wu, G. Albert; Prochnik, Simon; Jenkins, Jerry; Salse, Jerome; Hellsten, Uffe; Murat, Florent; Perrier, Xavier; Ruiz, Manuel; Scalabrin, Simone; Terol, Javier; Takita, Marco Aurélio; Labadie, Karine; Poulain, Julie; Couloux, Arnaud; Jabbari, Kamel; Cattonaro, Federica; Del Fabbro, Cristian; Pinosio, Sara; Zuccolo, Andrea; Chapman, Jarrod; Grimwood, Jane; Tadeo, Francisco R.; Estornell, Leandro H.; Muñoz-Sanz, Juan V.; Ibanez, Victoria; Herrero-Ortega, Amparo; Aleza, Pablo; Pérez-Pérez, Julián; Ramón, Daniel; Brunel, Dominique; Luro, François; Chen, Chunxian; Farmerie, William G.; Desany, Brian; Kodira, Chinnappa; Mohiuddin, Mohammed; Harkins, Tim; Fredrikson, Karin; Burns, Paul; Lomsadze, Alexandre; Borodovsky, Mark; Reforgiato, Giuseppe; Freitas-Astúa, Juliana; Quetier, Francis; Navarro, Luis; Roose, Mikeal; Wincker, Patrick; Schmutz, Jeremy; Morgante, Michele; Machado, Marcos Antonio; Talon, Manuel; Jaillon, Olivier; Ollitrault, Patrick; Gmitter, Frederick; Rokhsar, Daniel

    2014-01-01

    The domestication of citrus is poorly understood. Cultivated types are selections from, or hybrids of, wild progenitor species, whose identities and contributions remain controversial. By comparative analysis of a collection of citrus genomes, including a high quality haploid reference, we show that cultivated types were derived from two progenitor species. Though cultivated pummelos represent selections from a single progenitor species, C. maxima, cultivated mandarins are introgressions of C. maxima into the ancestral mandarin species, C. reticulata. The most widely cultivated citrus, sweet orange, is the offspring of previously admixed individuals, but sour orange is an F1 hybrid of pure C. maxima and C. reticulata parents, implying that wild mandarins were part of the early breeding germplasm. A wild “mandarin” from China exhibited substantial divergence from C. reticulata, suggesting the possibility of other unrecognized wild citrus species. Understanding citrus phylogeny through genome analysis clarifies taxonomic relationships and enables sequence-directed genetic improvement. PMID:24908277

  8. Genomic sequencing reveals historical, demographic and selective factors associated with the diversification of the fire-associated fungus Neurospora discreta.

    Science.gov (United States)

    Gladieux, Pierre; Wilson, Benjamin A; Perraudeau, Fanny; Montoya, Liliam A; Kowbel, David; Hann-Soden, Christopher; Fischer, Monika; Sylvain, Iman; Jacobson, David J; Taylor, John W

    2015-11-01

    Delineating microbial populations, discovering ecologically relevant phenotypes and identifying migrants, hybrids or admixed individuals have long proved notoriously difficult, thereby limiting our understanding of the evolutionary forces at play during the diversification of microbial species. However, recent advances in sequencing and computational methods have enabled an unbiased approach whereby incipient species and the genetic correlates of speciation can be identified by examining patterns of genomic variation within and between lineages. We present here a population genomic study of a phylogenetic species in the Neurospora discreta species complex, based on the resequencing of full genomes (~37 Mb) for 52 fungal isolates from nine sites in three continents. Population structure analyses revealed two distinct lineages in South-East Asia, and three lineages in North America/Europe with a broad longitudinal and latitudinal range and limited admixture between lineages. Genome scans for selective sweeps and comparisons of the genomic landscapes of diversity and recombination provided no support for a role of selection at linked sites on genomic heterogeneity in levels of divergence between lineages. However, demographic inference indicated that the observed genomic heterogeneity in divergence was generated by varying rates of gene flow between lineages following a period of isolation. Many putative cases of exchange of genetic material between phylogenetically divergent fungal lineages have been discovered, and our work highlights the quantitative importance of genetic exchanges between more closely related taxa to the evolution of fungal genomes. Our study also supports the role of allopatric isolation as a driver of diversification in saprobic microbes.

  9. Deep Sequencing Reveals the Complete Genome and Evidence for Transcriptional Activity of the First Virus-Like Sequences Identified in Aristotelia chilensis (Maqui Berry)

    Directory of Open Access Journals (Sweden)

    Javier Villacreses

    2015-04-01

    Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%–73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggests that AcV1 could be a new member of Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  10. Deep sequencing reveals the complete genome and evidence for transcriptional activity of the first virus-like sequences identified in Aristotelia chilensis (Maqui Berry).

    Science.gov (United States)

    Villacreses, Javier; Rojas-Herrera, Marcelo; Sánchez, Carolina; Hewstone, Nicole; Undurraga, Soledad F; Alzate, Juan F; Manque, Patricio; Maracaja-Coutinho, Vinicius; Polanco, Victor

    2015-04-03

    Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%-73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggests that AcV1 could be a new member of Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  11. Plastome sequences of Lygodium japonicum and Marsilea crenata reveal the genome organization transformation from basal ferns to core leptosporangiates.

    Science.gov (United States)

    Gao, Lei; Wang, Bo; Wang, Zhi-Wei; Zhou, Yuan; Su, Ying-Juan; Wang, Ting

    2013-01-01

    Previous studies have shown that core leptosporangiates, the most species-rich group of extant ferns (monilophytes), have a distinct plastid genome (plastome) organization pattern from basal fern lineages. However, the details of genome structure transformation from ancestral ferns to core leptosporangiates remain unclear because of limited plastome data available. Here, we have determined the complete chloroplast genome sequences of Lygodium japonicum (Lygodiaceae), a member of schizaeoid ferns (Schizaeales), and Marsilea crenata (Marsileaceae), a representative of heterosporous ferns (Salviniales). The two species represent the sister and the basal lineages of core leptosporangiates, respectively, for which the plastome sequences are currently unavailable. Comparative genomic analysis of all sequenced fern plastomes reveals that the gene order of L. japonicum plastome occupies an intermediate position between that of basal ferns and core leptosporangiates. The two exons of the fern ndhB gene have a unique pattern of intragenic copy number variances. Specifically, the substitution rate heterogeneity between the two exons is congruent with their copy number changes, confirming the constraint role that inverted repeats may play on the substitution rate of chloroplast gene sequences.

  12. Oil palm genome sequence reveals divergence of interfertile species in Old and New worlds.

    Science.gov (United States)

    Singh, Rajinder; Ong-Abdullah, Meilina; Low, Eng-Ti Leslie; Manaf, Mohamad Arif Abdul; Rosli, Rozana; Nookiah, Rajanaidu; Ooi, Leslie Cheng-Li; Ooi, Siew-Eng; Chan, Kuang-Lim; Halim, Mohd Amin; Azizi, Norazah; Nagappan, Jayanthi; Bacher, Blaire; Lakey, Nathan; Smith, Steven W; He, Dong; Hogan, Michael; Budiman, Muhammad A; Lee, Ernest K; DeSalle, Rob; Kudrna, David; Goicoechea, Jose Luis; Wing, Rod A; Wilson, Richard K; Fulton, Robert S; Ordway, Jared M; Martienssen, Robert A; Sambanthamurthi, Ravigadevi

    2013-08-15

    Oil palm is the most productive oil-bearing crop. Although it is planted on only 5% of the total world vegetable oil acreage, palm oil accounts for 33% of vegetable oil and 45% of edible oil worldwide, but increased cultivation competes with dwindling rainforest reserves. We report the 1.8-gigabase (Gb) genome sequence of the African oil palm Elaeis guineensis, the predominant source of worldwide oil production. A total of 1.535 Gb of assembled sequence and transcriptome data from 30 tissue types were used to predict at least 34,802 genes, including oil biosynthesis genes and homologues of WRINKLED1 (WRI1), and other transcriptional regulators, which are highly expressed in the kernel. We also report the draft sequence of the South American oil palm Elaeis oleifera, which has the same number of chromosomes (2n = 32) and produces fertile interspecific hybrids with E. guineensis but seems to have diverged in the New World. Segmental duplications of chromosome arms define the palaeotetraploid origin of palm trees. The oil palm sequence enables the discovery of genes for important traits as well as somaclonal epigenetic alterations that restrict the use of clones in commercial plantings, and should therefore help to achieve sustainability for biofuels and edible oils, reducing the rainforest footprint of this tropical plantation crop.

  13. Infidelity of SARS-CoV Nsp14-Exonuclease Mutant Virus Replication Is Revealed by Complete Genome Sequencing

    Science.gov (United States)

    Eckerle, Lance D.; Becker, Michelle M.; Halpin, Rebecca A.; Li, Kelvin; Venter, Eli; Lu, Xiaotao; Scherbakova, Sana; Graham, Rachel L.; Baric, Ralph S.; Stockwell, Timothy B.; Spiro, David J.; Denison, Mark R.

    2010-01-01

    Most RNA viruses lack the mechanisms to recognize and correct mutations that arise during genome replication, resulting in quasispecies diversity that is required for pathogenesis and adaptation. However, it is not known how viruses encoding large viral RNA genomes such as the Coronaviridae (26 to 32 kb) balance the requirements for genome stability and quasispecies diversity. Further, the limits of replication infidelity during replication of large RNA genomes and how decreased fidelity impacts virus fitness over time are not known. Our previous work demonstrated that genetic inactivation of the coronavirus exoribonuclease (ExoN) in nonstructural protein 14 (nsp14) of murine hepatitis virus results in a 15-fold decrease in replication fidelity. However, it is not known whether nsp14-ExoN is required for replication fidelity of all coronaviruses, nor the impact of decreased fidelity on genome diversity and fitness during replication and passage. We report here the engineering and recovery of nsp14-ExoN mutant viruses of severe acute respiratory syndrome coronavirus (SARS-CoV) that have stable growth defects and demonstrate a 21-fold increase in mutation frequency during replication in culture. Analysis of complete genome sequences from SARS-ExoN mutant viral clones revealed unique mutation sets in every genome examined from the same round of replication and a total of 100 unique mutations across the genome. Using novel bioinformatic tools and deep sequencing across the full-length genome following 10 population passages in vitro, we demonstrate retention of ExoN mutations and continued increased diversity and mutational load compared to wild-type SARS-CoV. The results define a novel genetic and bioinformatics model for introduction and identification of multi-allelic mutations in replication competent viruses that will be powerful tools for testing the effects of decreased fidelity and increased quasispecies diversity on viral replication, pathogenesis, and

  14. Whole genome sequence analysis of Cryptococcus gattii from the Pacific Northwest reveals unexpected diversity.

    Directory of Open Access Journals (Sweden)

    John D Gillece

    A recent emergence of Cryptococcus gattii in the Pacific Northwest involves strains that fall into three primarily clonal molecular subtypes: VGIIa, VGIIb and VGIIc. Multilocus sequence typing (MLST) and variable number tandem repeat analysis appear to identify little diversity within these molecular subtypes. Given the apparent expansion of these subtypes into new geographic areas and their ability to cause disease in immunocompetent individuals, differentiation of isolates belonging to these subtypes could be very important from a public health perspective. We used whole genome sequence typing (WGST) to perform fine-scale phylogenetic analysis on 20 C. gattii isolates, 18 of which are from the VGII molecular type largely responsible for the Pacific Northwest emergence. Analysis both including and excluding (289,586 SNPs and 56,845 SNPs, respectively) molecular types VGI and VGIII isolates resulted in phylogenetic reconstructions consistent, for the most part, with MLST analysis but with far greater resolution among isolates. The WGST analysis presented here resulted in identification of over 100 SNPs among eight VGIIc isolates as well as unique genotypes for each of the VGIIa, VGIIb and VGIIc isolates. Similar levels of genetic diversity were found within each of the molecular subtype isolates, despite the fact that the VGIIb clade is thought to have emerged much earlier. The analysis presented here is the first multi-genome WGST study to focus on the C. gattii molecular subtypes involved in the Pacific Northwest emergence and describes the tools that will further our understanding of this emerging pathogen.

  15. Whole-genome sequencing reveals the effect of vaccination on the evolution of Bordetella pertussis.

    Science.gov (United States)

    Xu, Yinghua; Liu, Bin; Gröndahl-Yli-Hannuksila, Kirsi; Tan, Yajun; Feng, Lu; Kallonen, Teemu; Wang, Lichan; Peng, Ding; He, Qiushui; Wang, Lei; Zhang, Shumin

    2015-08-18

    Herd immunity can potentially induce a change of circulating viruses. However, it remains largely unknown how bacterial pathogens adapt to vaccination. In this study, Bordetella pertussis, the causative agent of whooping cough, was selected as an example to explore the possible effect of vaccination on the bacterial pathogen. We sequenced and analysed the complete genomes of 40 B. pertussis strains from Finland and China, as well as 11 previously sequenced strains from the Netherlands, where different vaccination strategies have been used over the past 50 years. The results showed that the molecular clock moved at different rates in these countries and in distinct periods, which suggested that evolution of the B. pertussis population was closely associated with the country vaccination coverage. Comparative whole-genome analyses indicated that evolution in this human-restricted pathogen was mainly characterised by ongoing genetic shift and gene loss. Furthermore, 116 SNPs were specifically detected in currently circulating ptxP3-containing strains. The finding might explain the successful emergence of this lineage and its spread worldwide. Collectively, our results suggest that the immune pressure of vaccination is one major driving force for the evolution of B. pertussis, which facilitates further exploration of the pathogenicity of B. pertussis.

  16. The complete genome sequence of Staphylothermus marinus reveals differences in sulfur metabolism among heterotrophic Crenarchaeota.

    Science.gov (United States)

    Anderson, Iain J; Dharmarajan, Lakshmi; Rodriguez, Jason; Hooper, Sean; Porat, Iris; Ulrich, Luke E; Elkins, James G; Mavromatis, Kostas; Sun, Hui; Land, Miriam; Lapidus, Alla; Lucas, Susan; Barry, Kerrie; Huber, Harald; Zhulin, Igor B; Whitman, William B; Mukhopadhyay, Biswarup; Woese, Carl; Bristow, James; Kyrpides, Nikos

    2009-04-02

    Staphylothermus marinus is an anaerobic, sulfur-reducing peptide fermenter of the archaeal phylum Crenarchaeota. It is the third heterotrophic, obligate sulfur reducing crenarchaeote to be sequenced and provides an opportunity for comparative analysis of the three genomes. The 1.57 Mbp genome of the hyperthermophilic crenarchaeote Staphylothermus marinus has been completely sequenced. The main energy generating pathways likely involve 2-oxoacid:ferredoxin oxidoreductases and ADP-forming acetyl-CoA synthases. S. marinus possesses several enzymes not present in other crenarchaeotes including a sodium ion-translocating decarboxylase likely to be involved in amino acid degradation. S. marinus lacks sulfur-reducing enzymes present in the other two sulfur-reducing crenarchaeotes that have been sequenced -- Thermofilum pendens and Hyperthermus butylicus. Instead it has three operons similar to the mbh and mbx operons of Pyrococcus furiosus, which may play a role in sulfur reduction and/or hydrogen production. The two marine organisms, S. marinus and H. butylicus, possess more sodium-dependent transporters than T. pendens and use symporters for potassium uptake while T. pendens uses an ATP-dependent potassium transporter. T. pendens has adapted to a nutrient-rich environment while H. butylicus is adapted to a nutrient-poor environment, and S. marinus lies between these two extremes. The three heterotrophic sulfur-reducing crenarchaeotes have adapted to their habitats, terrestrial vs. marine, via their transporter content, and they have also adapted to environments with differing levels of nutrients. Despite the fact that they all use sulfur as an electron acceptor, they are likely to have different pathways for sulfur reduction.

  17. The complete genome sequence of Staphylothermus marinus reveals differences in sulfur metabolism among heterotrophic Crenarchaeota

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain [U.S. Department of Energy, Joint Genome Institute; Lakshmi, Lakshmi Dharmarajan [Virginia Polytechnic Institute and State University (Virginia Tech); Rodriquez, Jason [Virginia Polytechnic Institute and State University (Virginia Tech); Hooper, Sean [U.S. Department of Energy, Joint Genome Institute; Porat, I. [University of Georgia, Athens, GA; Ulrich, Luke [ORNL; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Sun, Hui [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Barry, Kerrie [U.S. Department of Energy, Joint Genome Institute; Huber, Harald [Universitat Regensburg, Regensburg, Germany; Zhulin, Igor B [University of Tennessee, Knoxville (UTK) & Oak Ridge National Laboratory (ORNL); Whitman, W. B. [University of Georgia, Athens, GA; Mukhopadhyay, Biswarup [Virginia Polytechnic Institute and State University (Virginia Tech); Woese, Carl [University of Illinois, Urbana-Champaign; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute

    2009-01-01

    Background: Staphylothermus marinus is an anaerobic, sulfur-reducing peptide fermenter of the archaeal phylum Crenarchaeota. It is the third heterotrophic, obligate sulfur reducing crenarchaeote to be sequenced and provides an opportunity for comparative analysis of the three genomes. Results: The 1.57 Mbp genome of the hyperthermophilic crenarchaeote Staphylothermus marinus has been completely sequenced. The main energy generating pathways likely involve 2-oxoacid:ferredoxin oxidoreductases and ADP-forming acetyl-CoA synthases. S. marinus possesses several enzymes not present in other crenarchaeotes including a sodium ion-translocating decarboxylase likely to be involved in amino acid degradation. S. marinus lacks sulfur-reducing enzymes present in the other two sulfur-reducing crenarchaeotes that have been sequenced - Thermofilum pendens and Hyperthermus butylicus. Instead it has three operons similar to the mbh and mbx operons of Pyrococcus furiosus, which may play a role in sulfur reduction and/or hydrogen production. The two marine organisms, S. marinus and H. butylicus, possess more sodium-dependent transporters than T. pendens and use symporters for potassium uptake while T. pendens uses an ATP-dependent potassium transporter. T. pendens has adapted to a nutrient-rich environment while H. butylicus is adapted to a nutrient-poor environment, and S. marinus lies between these two extremes. Conclusion: The three heterotrophic sulfur-reducing crenarchaeotes have adapted to their habitats, terrestrial vs. marine, via their transporter content, and they have also adapted to environments with differing levels of nutrients. Despite the fact that they all use sulfur as an electron acceptor, they are likely to have different pathways for sulfur reduction.

  18. The complete genome sequence of Staphylothermus marinus reveals differences in sulfur metabolism among heterotrophic Crenarchaeota

    Directory of Open Access Journals (Sweden)

    Barry Kerrie

    2009-04-01

    Full Text Available Abstract Background: Staphylothermus marinus is an anaerobic, sulfur-reducing peptide fermenter of the archaeal phylum Crenarchaeota. It is the third heterotrophic, obligate sulfur reducing crenarchaeote to be sequenced and provides an opportunity for comparative analysis of the three genomes. Results: The 1.57 Mbp genome of the hyperthermophilic crenarchaeote Staphylothermus marinus has been completely sequenced. The main energy generating pathways likely involve 2-oxoacid:ferredoxin oxidoreductases and ADP-forming acetyl-CoA synthases. S. marinus possesses several enzymes not present in other crenarchaeotes including a sodium ion-translocating decarboxylase likely to be involved in amino acid degradation. S. marinus lacks sulfur-reducing enzymes present in the other two sulfur-reducing crenarchaeotes that have been sequenced – Thermofilum pendens and Hyperthermus butylicus. Instead it has three operons similar to the mbh and mbx operons of Pyrococcus furiosus, which may play a role in sulfur reduction and/or hydrogen production. The two marine organisms, S. marinus and H. butylicus, possess more sodium-dependent transporters than T. pendens and use symporters for potassium uptake while T. pendens uses an ATP-dependent potassium transporter. T. pendens has adapted to a nutrient-rich environment while H. butylicus is adapted to a nutrient-poor environment, and S. marinus lies between these two extremes. Conclusion: The three heterotrophic sulfur-reducing crenarchaeotes have adapted to their habitats, terrestrial vs. marine, via their transporter content, and they have also adapted to environments with differing levels of nutrients. Despite the fact that they all use sulfur as an electron acceptor, they are likely to have different pathways for sulfur reduction.

  19. The complete genome sequence of Staphylothermus marinus reveals differences in sulfur metabolism among heterotrophic Crenarchaeota

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain J.; Dharmarajan, Lakshmi; Rodriguez, Jason; Hooper, Sean; Porat, Iris; Ulrich, Luke E.; Elkins, James G.; Mavromatis, Kostas; Sun, Hui; Land, Miriam; Lapidus, Alla; Lucas, Susan; Barry, Kerrie; Huber, Harald; Zhulin, Igor B.; Whitman, William B.; Mukhopadhyay, Biswarup; Woese, Carl; Bristow, James; Kyrpides, Nikos

    2008-09-05

    Staphylothermus marinus is an anaerobic, sulfur-reducing peptide fermenter of the archaeal phylum Crenarchaeota. It is the third heterotrophic, obligate sulfur reducing crenarchaeote to be sequenced and provides an opportunity for comparative analysis of the three genomes. The 1.57 Mbp genome of the hyperthermophilic crenarchaeote Staphylothermus marinus has been completely sequenced. The main energy generating pathways likely involve 2-oxoacid:ferredoxin oxidoreductases and ADP-forming acetyl-CoA synthases. S. marinus possesses several enzymes not present in other crenarchaeotes including a sodium ion-translocating decarboxylase likely to be involved in amino acid degradation. S. marinus lacks sulfur-reducing enzymes present in the other two sulfur-reducing crenarchaeotes that have been sequenced - Thermofilum pendens and Hyperthermus butylicus. Instead it has three operons similar to the mbh and mbx operons of Pyrococcus furiosus, which may play a role in sulfur reduction and/or hydrogen production. The two marine organisms, S. marinus and H. butylicus, possess more sodium-dependent transporters than T. pendens and use symporters for potassium uptake while T. pendens uses an ATP-dependent potassium transporter. T. pendens has adapted to a nutrient-rich environment while H. butylicus is adapted to a nutrient-poor environment, and S. marinus lies between these two extremes. The three heterotrophic sulfur-reducing crenarchaeotes have adapted to their habitats, terrestrial vs. marine, via their transporter content, and they have also adapted to environments with differing levels of nutrients. Despite the fact that they all use sulfur as an electron acceptor, they are likely to have different pathways for sulfur reduction.

  20. Culture-independent genome sequencing of clinical samples reveals an unexpected heterogeneity of infections by Chlamydia pecorum.

    Science.gov (United States)

    Bachmann, Nathan L; Sullivan, Mitchell J; Jelocnik, Martina; Myers, Garry S A; Timms, Peter; Polkinghorne, Adam

    2015-05-01

    Chlamydia pecorum is an important global pathogen of livestock, and it is also a significant threat to the long-term survival of Australia's koala populations. This study employed a culture-independent DNA capture approach to sequence C. pecorum genomes directly from clinical swab samples collected from koalas with chlamydial disease as well as from sheep with arthritis and conjunctivitis. Investigations into single-nucleotide polymorphisms within each of the swab samples revealed that a portion of the reads in each sample belonged to separate C. pecorum strains, suggesting that all of the clinical samples analyzed contained mixed populations of genetically distinct C. pecorum isolates. This observation was independent of the anatomical site sampled and the host species. Using the genomes of strains identified in each of these samples, whole-genome phylogenetic analysis revealed that a clade containing a bovine and a koala isolate is distinct from other clades comprised of livestock or koala C. pecorum strains. Providing additional evidence to support exposure of koalas to Australian livestock strains, two minor strains assembled from the koala swab samples clustered with livestock strains rather than koala strains. Culture-independent probe-based genome capture and sequencing of clinical samples provides the strongest evidence yet to suggest that naturally occurring chlamydial infections are comprised of multiple genetically distinct strains.

  1. Comparative Genome Sequencing Reveals Within-Host Genetic Changes in Neisseria meningitidis during Invasive Disease

    Science.gov (United States)

    Klughammer, Johanna; Dittrich, Marcus; Blom, Jochen; Mitesser, Vera; Vogel, Ulrich; Frosch, Matthias; Goesmann, Alexander; Müller, Tobias

    2017-01-01

    Some members of the physiological human microbiome occasionally cause life-threatening disease even in immunocompetent individuals. A prime example of such a commensal pathogen is Neisseria meningitidis, which normally resides in the human nasopharynx but is also a leading cause of sepsis and epidemic meningitis. Using N. meningitidis as a model organism, we tested the hypothesis that virulence of commensal pathogens is a consequence of within-host evolution and selection of invasive variants due to mutations at contingency genes, a mechanism called phase variation. In line with the hypothesis that phase variation evolved as an adaptation to colonize diverse hosts, computational comparisons of all 27 meningococcal genomes completely sequenced and annotated to date, retrieved from public databases, showed that contingency genes are indeed enriched for genes involved in host interactions. To assess within-host genetic changes in meningococci, we further used ultra-deep whole-genome sequencing of throat-blood strain pairs isolated from four patients suffering from invasive meningococcal disease. We detected up to three mutations per strain pair, affecting predominantly contingency genes involved in type IV pilus biogenesis. However, there was not a single (set) of mutation(s) that could invariably be found in all four pairs of strains. Phenotypic assays further showed that these genetic changes were generally not associated with increased serum resistance, higher fitness in human blood ex vivo or differences in the interaction with human epithelial and endothelial cells in vitro. In conclusion, we hypothesize that virulence of meningococci results from accidental emergence of invasive variants during carriage and without within-host evolution of invasive phenotypes during disease progression in vivo. PMID:28081260

  2. Genomic DNA sequences from mastodon and woolly mammoth reveal deep speciation of forest and savanna elephants.

    Directory of Open Access Journals (Sweden)

    Nadin Rohland

    Full Text Available To elucidate the history of living and extinct elephantids, we generated 39,763 bp of aligned nuclear DNA sequence across 375 loci for African savanna elephant, African forest elephant, Asian elephant, the extinct American mastodon, and the woolly mammoth. Our data establish that the Asian elephant is the closest living relative of the extinct mammoth in the nuclear genome, extending previous findings from mitochondrial DNA analyses. We also find that savanna and forest elephants, which some have argued are the same species, are as or more divergent in the nuclear genome as mammoths and Asian elephants, which are considered to be distinct genera, thus resolving a long-standing debate about the appropriate taxonomic classification of the African elephants. Finally, we document a much larger effective population size in forest elephants compared with the other elephantid taxa, likely reflecting species differences in ancient geographic structure and range and differences in life history traits such as variance in male reproductive success.

  3. Systematic Inference of Copy-Number Genotypes from Personal Genome Sequencing Data Reveals Extensive Olfactory Receptor Gene Content Diversity

    Science.gov (United States)

    Waszak, Sebastian M.; Hasin, Yehudit; Zichner, Thomas; Olender, Tsviya; Keydar, Ifat; Khen, Miriam; Stütz, Adrian M.; Schlattl, Andreas; Lancet, Doron; Korbel, Jan O.

    2010-01-01

    Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate paired-end and breakpoint junction analysis based CNV-analysis approaches, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low-coverage. The assessed copy-number genotypes were highly concordant with our performed qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95–99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes in the CNV-enriched >800 olfactory receptor (OR) human gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ∼15% and ∼20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies. CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high
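    The depth-of-coverage idea mentioned in the CopySeq record above can be illustrated with a minimal sketch. This is not CopySeq's statistical framework, which the abstract does not specify; it is a simple ratio-and-round heuristic, and the function name `copy_number_genotype`, the `ploidy` default and the `max_copies` cap are assumptions made for illustration only.

```python
from statistics import median


def copy_number_genotype(locus_depth, reference_depths, ploidy=2, max_copies=9):
    """Estimate an integer copy number for one locus from read depth.

    Illustrative heuristic only (not the CopySeq algorithm):
    the locus depth is scaled by the median depth of regions assumed
    to be present at `ploidy` copies, then rounded to an integer.
    """
    baseline = median(reference_depths)  # depth expected at `ploidy` copies
    if baseline <= 0:
        raise ValueError("reference depth must be positive")
    estimate = ploidy * locus_depth / baseline
    # Clamp to a plausible range; the cap is an arbitrary assumption.
    return min(max_copies, max(0, round(estimate)))


if __name__ == "__main__":
    diploid_regions = [30, 32, 28, 31, 29]  # hypothetical read depths
    for depth in (0, 15, 31, 60):
        print(depth, copy_number_genotype(depth, diploid_regions))
```

    Under these assumptions, a locus at half the baseline depth is called as one copy (a heterozygous deletion) and a locus with no reads as zero copies (a homozygous deletion), which is the kind of discrimination the abstract describes.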

  4. Evolution of the RH gene family in vertebrates revealed by brown hagfish (Eptatretus atami) genome sequences.

    Science.gov (United States)

    Suzuki, Akinori; Komata, Hidero; Iwashita, Shogo; Seto, Shotaro; Ikeya, Hironobu; Tabata, Mitsutoshi; Kitano, Takashi

    2017-02-01

    In vertebrates, there are four major genes in the RH (Rhesus) gene family, RH, RHAG, RHBG, and RHCG. These genes are thought to have been formed by the two rounds of whole-genome duplication (2R-WGD) in the common ancestor of all vertebrates. In our previous work, where we analyzed details of the gene duplication process of this gene family, three nucleotide sequences belonging to this family were identified in Far Eastern brook lamprey (Lethenteron reissneri), and the phylogenetic positions of the genes were determined. Lampreys, along with hagfishes, are cyclostomata (jawless fishes), which is a sister group of gnathostomata (jawed vertebrates). Although those results suggested that one gene was orthologous to the gnathostome RHCG genes, we did not identify clear orthologues for other genes. In this study, therefore, we identified three novel cDNA sequences that belong to the RH gene family using de novo transcriptome analysis of another cyclostome: the brown hagfish (Eptatretus atami). We also determined the nucleotide sequences for the RHBG and RHCG genes in a red stingray (Dasyatis akajei), which belongs to the cartilaginous fishes. The phylogenetic tree showed that two brown hagfish genes, which were probably duplicated in the cyclostome lineage, formed a cluster with the gnathostome RHAG genes, whereas another brown hagfish gene formed a cluster with the gnathostome RHCG genes. We estimated that the RH genes had a higher evolutionary rate than the RHAG, RHBG, and RHCG genes. Interestingly, in the RHBG genes, only the bird lineage showed a higher rate of nonsynonymous substitutions. It is likely that this higher rate was caused by a state of relaxed functional constraints rather than by positive selection or pseudogenization.

  5. The genome sequence of black cottonwood (Populus trichocarpa) reveals 18 conserved cellulose synthase (CesA) genes.

    Science.gov (United States)

    Djerbi, Soraya; Lindskog, Mats; Arvestad, Lars; Sterky, Fredrik; Teeri, Tuula T

    2005-07-01

    The genome sequence of Populus trichocarpa was screened for genes encoding cellulose synthases by using full-length cDNA sequences and ESTs previously identified in the tissue specific cDNA libraries of other poplars. The data obtained revealed 18 distinct CesA gene sequences in P. trichocarpa. The identified genes were grouped in seven gene pairs, one group of three sequences and one single gene. Evidence from gene expression studies of hybrid aspen suggests that both copies of at least one pair, CesA3-1 and CesA3-2, are actively transcribed. No sequences corresponding to the gene pair, CesA6-1 and CesA6-2, were found in Arabidopsis or hybrid aspen, while one homologous gene has been identified in the rice genome and an active transcript in Populus tremuloides. A phylogenetic analysis suggests that the CesA genes previously associated with secondary cell wall synthesis originate from a single ancestor gene and group in three distinct subgroups. The newly identified copies of CesA genes in P. trichocarpa give rise to a number of new questions concerning the mechanism of cellulose synthesis in trees.

  6. Whole Genome Sequencing Reveals Potential New Targets for Improving Nitrogen Uptake and Utilization in Sorghum bicolor

    Directory of Open Access Journals (Sweden)

    Karen Massel

    2016-10-01

    Full Text Available Nitrogen (N) fertilizers are a major agricultural input where more than 100 million tons are supplied annually. Cereals are particularly inefficient at soil N uptake, where the unrecovered nitrogen causes serious environmental damage. Sorghum bicolor (sorghum) is an important cereal crop, particularly in resource-poor semi-arid regions, and is known to have a high NUE in comparison to other major cereals under limited N conditions. This study provides the first assessment of genetic diversity and signatures of selection across 230 fully sequenced genes putatively involved in the uptake and mobilization of N from a diverse panel of sorghum lines. This comprehensive analysis reveals an overall reduction in diversity as a result of domestication and a total of 128 genes displaying signatures of purifying selection, thereby revealing possible gene targets to improve NUE in sorghum and cereals alike. A number of key genes appear to have been involved in selective sweeps, reducing their sequence diversity. The ammonium transporter (AMT) genes generally had low allelic diversity, whereas a substantial number of nitrate/peptide transporter 1 (NRT1/PTR) genes had higher nucleotide diversity in domesticated germplasm. Interestingly, members of the distinct race Guinea margaritiferum contained a number of unique alleles, and along with the wild sorghum species, represent a rich resource of new variation for plant improvement of NUE in sorghum.

  7. Whole Genome Sequencing Reveals Potential New Targets for Improving Nitrogen Uptake and Utilization in Sorghum bicolor

    Science.gov (United States)

    Massel, Karen; Campbell, Bradley C.; Mace, Emma S.; Tai, Shuaishuai; Tao, Yongfu; Worland, Belinda G.; Jordan, David R.; Botella, Jose R.; Godwin, Ian D.

    2016-01-01

    Nitrogen (N) fertilizers are a major agricultural input where more than 100 million tons are supplied annually. Cereals are particularly inefficient at soil N uptake, where the unrecovered nitrogen causes serious environmental damage. Sorghum bicolor (sorghum) is an important cereal crop, particularly in resource-poor semi-arid regions, and is known to have a high NUE in comparison to other major cereals under limited N conditions. This study provides the first assessment of genetic diversity and signatures of selection across 230 fully sequenced genes putatively involved in the uptake and utilization of N from a diverse panel of sorghum lines. This comprehensive analysis reveals an overall reduction in diversity as a result of domestication and a total of 128 genes displaying signatures of purifying selection, thereby revealing possible gene targets to improve NUE in sorghum and cereals alike. A number of key genes appear to have been involved in selective sweeps, reducing their sequence diversity. The ammonium transporter (AMT) genes generally had low allelic diversity, whereas a substantial number of nitrate/peptide transporter 1 (NRT1/PTR) genes had higher nucleotide diversity in domesticated germplasm. Interestingly, members of the distinct race Guinea margaritiferum contained a number of unique alleles, and along with the wild sorghum species, represent a rich resource of new variation for plant improvement of NUE in sorghum. PMID:27826302

  8. Comparative genomics analysis of completely sequenced microbial genomes reveals the ubiquity of N-linked glycosylation in prokaryotes.

    Science.gov (United States)

    Kumar, Manjeet; Balaji, Petety V

    2011-05-01

    Glycosylation of proteins in prokaryotes has been known for the last few decades. Glycan structures and/or the glycosylation pathways have been experimentally characterized in only a small number of prokaryotes. Even this has become possible only during the last decade or so, primarily due to technological and methodological developments. Glycosylated proteins are diverse in their function and localization. Glycosylation has been shown to be associated with a wide range of biological phenomena. Characterization of the various types of glycans and the glycosylation machinery is critical to understand such processes. Such studies can help in the identification of novel targets for designing drugs, diagnostics, and engineering of therapeutic proteins. In view of this, the experimentally characterized pgl system of Campylobacter jejuni, responsible for N-linked glycosylation, has been used in this study to identify glycosylation loci in 865 prokaryotes whose genomes have been completely sequenced. Results from the present study show that only a small number of organisms have homologs for all the pgl enzymes and a few others have homologs for none of the pgl enzymes. Most of the organisms have homologs for only a subset of the pgl enzymes. There is no specific pattern for the presence or absence of pgl homologs vis-à-vis the 16S rRNA sequence-based phylogenetic tree. This may be due to differences in the glycan structures, high sequence divergence, horizontal gene transfer or non-orthologous gene displacement. Overall, the presence of homologs for pgl enzymes in a large number of organisms irrespective of their habitat, pathogenicity, energy generation mechanism, etc., hints towards the ubiquity of N-linked glycosylation in prokaryotes.

  9. Next-Generation Sequencing of Two Mitochondrial Genomes from Family Pompilidae (Hymenoptera: Vespoidea) Reveal Novel Patterns of Gene Arrangement

    Directory of Open Access Journals (Sweden)

    Peng-Yan Chen

    2016-10-01

    Full Text Available Animal mitochondrial genomes have provided large and diverse datasets for evolutionary studies. Here, the first two representative mitochondrial genomes from the family Pompilidae (Hymenoptera: Vespoidea) were determined using next-generation sequencing. The sequenced region of these two mitochondrial genomes from the species Auplopus sp. and Agenioideus sp. was 16,746 bp long with an A + T content of 83.12% and 16,596 bp long with an A + T content of 78.64%, respectively. In both species, all of the 37 typical mitochondrial genes were determined. The secondary structures of tRNA genes and rRNA genes were predicted and compared with those of other insects. Atypical trnS1 using abnormal anticodons TCT and lacking D-stem pairings was identified. There were 49 helices belonging to six domains in rrnL and 30 helices belonging to three domains in rrnS. Compared with the ancestral organization, four and two tRNA genes were rearranged in mitochondrial genomes of Auplopus and Agenioideus, respectively. In both species, trnM was shuffled upstream of the trnI-trnQ-trnM cluster, and trnA was translocated from the cluster trnA-trnR-trnN-trnS1-trnE-trnF to the region between nad1 and trnL1, which is novel to the Vespoidea. In Auplopus, the tRNA cluster trnW-trnC-trnY was shuffled to trnW-trnY-trnC. Phylogenetic analysis within Vespoidea revealed that Pompilidae and Mutillidae formed a sister lineage, which in turn was sister to Formicidae. The genomes presented in this study have enriched the knowledge base of molecular markers, which is valuable in respect to studies about the gene rearrangement mechanism, genomic evolutionary processes and phylogeny of Hymenoptera.
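    The A + T content figures quoted in the record above are a simple base-composition percentage. A minimal sketch of that calculation follows; the function name `at_content` and the choice to ignore ambiguous bases such as N are assumptions for illustration, not details taken from the study.

```python
def at_content(sequence):
    """Return the A + T percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else
    (e.g. N or gap characters) is ignored. That choice is an
    assumption of this sketch, not a detail from the study.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for b in bases if b in "AT")
    return 100.0 * at_count / len(bases)


if __name__ == "__main__":
    # Hypothetical toy sequence, not data from the record above.
    print(round(at_content("ATTAATGCAT"), 2))
```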

  10. Mitochondrial genome sequences reveal evolutionary relationships of the Phytophthora 1c clade species.

    Science.gov (United States)

    Lassiter, Erica S; Russ, Carsten; Nusbaum, Chad; Zeng, Qiandong; Saville, Amanda C; Olarte, Rodrigo A; Carbone, Ignazio; Hu, Chia-Hui; Seguin-Orlando, Andaine; Samaniego, Jose A; Thorne, Jeffrey L; Ristaino, Jean B

    2015-11-01

    Phytophthora infestans is one of the most destructive plant pathogens of potato and tomato globally. The pathogen is closely related to four other Phytophthora species in the 1c clade including P. phaseoli, P. ipomoeae, P. mirabilis and P. andina that are important pathogens of other wild and domesticated hosts. P. andina is an interspecific hybrid between P. infestans and an unknown Phytophthora species. We have sequenced mitochondrial genomes of the sister species of P. infestans and examined the evolutionary relationships within the clade. Phylogenetic analysis indicates that the P. phaseoli mitochondrial lineage is basal within the clade. P. mirabilis and P. ipomoeae are sister lineages and share a common ancestor with the Ic mitochondrial lineage of P. andina. These lineages in turn are sister to the P. infestans and P. andina Ia mitochondrial lineages. The P. andina Ic lineage diverged much earlier than the P. andina Ia mitochondrial lineage and P. infestans. The presence of two mitochondrial lineages in P. andina supports the hybrid nature of this species. The ancestral state of the P. andina Ic lineage in the tree and its occurrence only in the Andean regions of Ecuador, Colombia and Peru suggests that the origin of this species hybrid in nature may occur there.

  11. Complete genome sequence and transcriptomics analyses reveal pigment biosynthesis and regulatory mechanisms in an industrial strain, Monascus purpureus YY-1.

    Science.gov (United States)

    Yang, Yue; Liu, Bin; Du, Xinjun; Li, Ping; Liang, Bin; Cheng, Xiaozhen; Du, Liangcheng; Huang, Di; Wang, Lei; Wang, Shuo

    2015-02-09

    Monascus has been used to produce natural colorants and food supplements for more than one thousand years, and more than one billion people eat Monascus-fermented products in their daily life. In this study, using next-generation sequencing and optical mapping approaches, a 24.1-Mb complete genome of an industrial strain, Monascus purpureus YY-1, was obtained. This genome consists of eight chromosomes and 7,491 genes. Phylogenetic analysis at the genome level provides convincing evidence for the evolutionary position of M. purpureus. We provide the first comprehensive prediction of the biosynthetic pathway for Monascus pigment. Comparative genomic analyses show that the genome of M. purpureus is 13.6-40% smaller than those of closely related filamentous fungi and has undergone significant gene losses, most of which likely occurred during its specialized adaptation to starch-based foods. Comparative transcriptome analysis reveals that carbon starvation stress, resulting from the use of relatively low-quality carbon sources, contributes to the high yield of pigments by repressing central carbon metabolism and augmenting the acetyl-CoA pool. Our work provides important insights into the evolution of this economically important fungus and lays a foundation for future genetic manipulation and engineering of this strain.

  12. Microbiota present in cystic fibrosis lungs as revealed by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Philippe M Hauser

    Full Text Available Determination of the precise composition and variation of microbiota in cystic fibrosis lungs is crucial since chronic inflammation due to microorganisms leads to lung damage and ultimately, death. However, this constitutes a major technical challenge. Culturing of microorganisms does not provide a complete representation of a microbiota, even when using culturomics (high-throughput culture). So far, only PCR-based metagenomics have been investigated. However, these methods are biased towards certain microbial groups, and suffer from uncertain quantification of the different microbial domains. We have explored whole genome sequencing (WGS) using the Illumina high-throughput technology applied directly to DNA extracted from sputa obtained from two cystic fibrosis patients. To detect all microorganism groups, we used four procedures for DNA extraction, each with a different lysis protocol. We avoided biases due to whole DNA amplification thanks to the high efficiency of current Illumina technology. Phylogenomic classification of the reads by three different methods produced similar results. Our results suggest that WGS provides, in a single analysis, a better qualitative and quantitative assessment of microbiota compositions than cultures and PCRs. WGS identified a high quantity of Haemophilus spp. (patient 1) or Staphylococcus spp. plus Streptococcus spp. (patient 2) together with low amounts of anaerobic (Veillonella, Prevotella, Fusobacterium) and aerobic bacteria (Gemella, Moraxella, Granulicatella). WGS suggested that fungal members represented very low proportions of the microbiota, which were detected by cultures and PCRs because of their selectivity. The future increase of reads' sizes and decrease in cost should ensure the usefulness of WGS for the characterisation of microbiota.

  13. Whole-genome sequencing reveals complex mechanisms of intrinsic resistance to BRAF inhibition.

    Science.gov (United States)

    Turajlic, S; Furney, S J; Stamp, G; Rana, S; Ricken, G; Oduko, Y; Saturno, G; Springer, C; Hayes, A; Gore, M; Larkin, J; Marais, R

    2014-05-01

    BRAF is mutated in ∼42% of human melanomas (COSMIC. http://www.sanger.ac.uk/genetics/CGP/cosmic/) and pharmacological BRAF inhibitors such as vemurafenib and dabrafenib achieve dramatic responses in patients whose tumours harbour BRAF(V600) mutations. Objective responses occur in ∼50% of patients and disease stabilisation in a further ∼30%, but ∼20% of patients present primary or innate resistance and do not respond. Here, we investigated the underlying cause of treatment failure in a patient with BRAF mutant melanoma who presented primary resistance. We carried out whole-genome sequencing and single nucleotide polymorphism (SNP) array analysis of five metastatic tumours from the patient. We validated mechanisms of resistance in a cell line derived from the patient's tumour. We observed that the majority of the single-nucleotide variants identified were shared across all tumour sites, but also saw site-specific copy-number alterations in discrete cell populations at different sites. We found that two ubiquitous mutations mediated resistance to BRAF inhibition in these tumours. A mutation in GNAQ sustained mitogen-activated protein kinase (MAPK) signalling, whereas a mutation in PTEN activated the PI3K/AKT pathway. Inhibition of both pathways synergised to block the growth of the cells. Our analyses show that the five metastases arose from a common progenitor and acquired additional alterations after disease dissemination. We demonstrate that a distinct combination of mutations mediated primary resistance to BRAF inhibition in this patient. These mutations were present in all five tumours and in a tumour sample taken before BRAF inhibitor treatment was administered. Inhibition of both pathways was required to block tumour cell growth, suggesting that combined targeting of these pathways could have been a valid therapeutic approach for this patient.

  14. The draft genome sequence of Corynebacterium diphtheriae bv. mitis NCTC 3529 reveals significant diversity between the primary disease-causing biovars.

    Science.gov (United States)

    Sangal, Vartul; Tucker, Nicholas P; Burkovski, Andreas; Hoskisson, Paul A

    2012-06-01

    We report the draft genome of the human pathogen Corynebacterium diphtheriae bv. mitis NCTC 3529. This is the first C. diphtheriae bv. mitis strain to be sequenced and reveals significant differences from the other primary biovar, C. diphtheriae bv. gravis.

  15. The Draft Genome Sequence of Corynebacterium diphtheriae bv. mitis NCTC 3529 Reveals Significant Diversity between the Primary Disease-Causing Biovars

    OpenAIRE

    Sangal, Vartul; Nicholas P Tucker; Burkovski, Andreas; Hoskisson, Paul A.

    2012-01-01

    We report the draft genome of the human pathogen Corynebacterium diphtheriae bv. mitis NCTC 3529. This is the first C. diphtheriae bv. mitis strain to be sequenced and reveals significant differences from the other primary biovar, C. diphtheriae bv. gravis.

  16. Genome-Wide Sequencing Reveals MicroRNAs Downregulated in Cerebral Cavernous Malformations.

    Science.gov (United States)

    Kar, Souvik; Bali, Kiran Kumar; Baisantry, Arpita; Geffers, Robert; Samii, Amir; Bertalanffy, Helmut

    2017-02-01

    Cerebral cavernous malformations (CCM) are vascular lesions associated with loss-of-function mutations in one of the three genes encoding KRIT1 (CCM1), CCM2, and PDCD10. Current understanding of the molecular mechanisms that lead to CCM development is limited. The role of microRNAs (miRNAs) has been demonstrated in vascular pathologies resulting in loss of tight junction proteins, increased vascular permeability and endothelial cell dysfunction. Since the relevance of miRNAs in CCM pathophysiology has not been elucidated, the primary aim of the study was to identify the miRNA-mRNA expression network associated with CCM. Using small RNA sequencing, we identified a total of 764 mature miRNAs expressed in CCM patients compared with healthy brains. The expression of the selected miRNAs was validated by qRT-PCR, and the results were found to be consistent with the sequencing data. Upon application of additional statistical stringency, five miRNAs (let-7b-5p, miR-361-5p, miR-370-3p, miR-181a-2-3p, and miR-95-3p) were prioritized as the top CCM-relevant miRNAs. Further in silico analyses revealed that the prioritized miRNAs have a direct functional relation with mRNAs, such as MIB1, HIF1A, PDCD10, TJP1, OCLN, HES1, MAPK1, VEGFA, EGFL7, NF1, and ENG, which were previously characterized as key regulators of CCM pathology. To date, this is the first study to investigate the role of miRNAs in CCM pathology. By employing cutting-edge molecular and in silico analyses on clinical samples, the current study reports global miRNA expression changes in CCM patients and provides a rich data set for understanding the detailed molecular machinery involved in CCM pathophysiology.

  17. Pseudo-De Novo Assembly and Analysis of Unmapped Genome Sequence Reads in Wild Zebrafish Reveal Novel Gene Content.

    Science.gov (United States)

    Faber-Hammond, Joshua J; Brown, Kim H

    2016-04-01

    Zebrafish is the third vertebrate with an officially completed genome, yet the genome remains incomplete: additions and corrections continue, and in the current release, GRCz10, 13% of zebrafish cDNA sequences remain unmapped. This disparity may result from population differences, given that the genome reference was generated from clonal individuals with limited genetic diversity. This is supported by the recent analysis of a single wild zebrafish, which identified over 5.2 million SNPs and 1.6 million in/dels in the previous genome build, zv9. Re-examination of this sequence data set indicated that 13.8% of quality sequence reads failed to align to GRCz10. Using a novel bioinformatics de novo assembly pipeline on these unmappable reads, we identified 1,514,491 novel contigs covering ∼224 Mb of genomic sequence. Among these, 1083 contigs were found to contain a potential gene coding sequence. RNA-seq data comparison confirmed that 362 contigs contained a transcribed DNA sequence, suggesting that a large amount of functional genomic sequence remains unannotated in the zebrafish reference genome. By utilizing the bioinformatics pipeline developed in this study, the zebrafish genome will be bolstered as a model for human disease research. Adaptation of the pipeline described here also offers a cost-efficient and effective method to identify and map novel genetic content across any genome and will ultimately aid in the completion of additional genomes for a broad range of species.

  18. Exome sequencing and genome-wide copy number variant mapping reveal novel associations with sensorineural hereditary hearing loss.

    Science.gov (United States)

    Haraksingh, Rajini R; Jahanbani, Fereshteh; Rodriguez-Paris, Juan; Gelernter, Joel; Nadeau, Kari C; Oghalai, John S; Schrijver, Iris; Snyder, Michael P

    2014-12-20

    The genetic diversity of loci and mutations underlying hereditary hearing loss is an active area of investigation. To identify loci associated with predominantly non-syndromic sensorineural hearing loss, we performed exome sequencing of families and of single probands, as well as copy number variation (CNV) mapping in a case-control cohort. Analysis of three distinct families revealed several candidate loci in two families and a single strong candidate gene, MYH7B, for hearing loss in one family. MYH7B encodes a Type II myosin, consistent with a role for cytoskeletal proteins in hearing. High-resolution genome-wide CNV analysis of 150 cases and 157 controls revealed deletions in genes known to be involved in hearing (e.g. GJB6, OTOA, and STRC, encoding connexin 30, otoancorin, and stereocilin, respectively), supporting CNV contributions to hearing loss phenotypes. Additionally, a novel region on chromosome 16 containing part of the PDXDC1 gene was found to be frequently deleted in hearing loss patients (OR=3.91, 95% CI: 1.62-9.40, p=1.45×10⁻⁷). We conclude that many known as well as novel loci and distinct types of mutations not typically tested in clinical settings can contribute to the etiology of hearing loss. Our study also demonstrates the challenges of exome sequencing and genome-wide CNV mapping for direct clinical application, and illustrates the need for functional and clinical follow-up as well as curated open-access databases.

  19. Low-coverage MiSeq next generation sequencing reveals the mitochondrial genome of the Eastern Rock Lobster, Sagmariasus verreauxi.

    Science.gov (United States)

    Doyle, Stephen R; Griffith, Ian S; Murphy, Nick P; Strugnell, Jan M

    2015-01-01

    The complete mitochondrial genome of the Eastern Rock Lobster, Sagmariasus verreauxi, is reported for the first time. Using low-coverage, long-read MiSeq next generation sequencing, we constructed and determined the mtDNA genome organization of the 15,470 bp sequence from two isolates from Eastern Tasmania, Australia and Northern New Zealand, and identified 46 polymorphic nucleotides between the two sequences. This genome sequence and its genetic polymorphisms will likely be useful in understanding the distribution and population connectivity of the Eastern Rock Lobster, and in the fisheries management of this commercially important species.

  20. Identification of Bari Transposons in 23 Sequenced Drosophila Genomes Reveals Novel Structural Variants, MITEs and Horizontal Transfer.

    Directory of Open Access Journals (Sweden)

    Antonio Palazzo

    Bari elements are members of the Tc1-mariner superfamily of DNA transposons, originally discovered in Drosophila melanogaster, and subsequently identified in silico in 11 sequenced Drosophila genomes and experimentally isolated in four non-sequenced Drosophila species. Bari-like elements have also been studied for their mobility both in vivo and in vitro. We analyzed 23 Drosophila genomes and carried out a detailed characterization of the Bari elements identified, including those from the heterochromatic Bari1 cluster in D. melanogaster. We have annotated 401 copies of Bari elements classified either as putatively autonomous or inactive according to the structure of the terminal sequences and the presence of a complete transposase-coding region. Analyses of the integration sites revealed that Bari transposase prefers AT-rich sequences in which the TA target is cleaved and duplicated. Furthermore, evaluation of transposon co-occurrence near the integration sites of Bari elements showed a non-random distribution of other transposable elements. We also unveil the existence of a putatively autonomous Bari1 variant characterized by two identical long Terminal Inverted Repeats, in D. rhopaloa. In addition, we detected MITEs related to Bari transposons in 9 species. Phylogenetic analyses based on the transposase gene and the terminal sequences confirmed that Bari-like elements are distributed into three subfamilies. A few inconsistencies in the Bari phylogenetic tree with respect to the Drosophila species tree could be explained by the occurrence of horizontal transfer events, as also suggested by the results of dS analyses. This study further clarifies the evolutionary dynamics of Bari transposons and increases our understanding of the biology of Tc1-mariner elements.

  1. Genome sequencing of normal cells reveals developmental lineages and mutational processes

    NARCIS (Netherlands)

    Behjati, Sam; Huch, Meritxell; van Boxtel, Ruben; Karthaus, Wouter; Wedge, David C; Tamuri, Asif U; Martincorena, Iñigo; Petljak, Mia; Alexandrov, Ludmil B; Gundem, Gunes; Tarpey, Patrick S; Roerink, Sophie; Blokker, Joyce; Maddison, Mark; Mudie, Laura; Robinson, Ben; Nik-Zainal, Serena; Campbell, Peter; Goldman, Nick; van de Wetering, Marc; Cuppen, Edwin; Clevers, Hans; Stratton, Michael R

    2014-01-01

    The somatic mutations present in the genome of a cell accumulate over the lifetime of a multicellular organism. These mutations can provide insights into the developmental lineage tree, the number of divisions that each cell has undergone and the mutational processes that have been operative. Here w

  2. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing

    DEFF Research Database (Denmark)

    Li, Ying-hui; Zhao, Shan-cen; Ma, Jian-xin;

    2013-01-01

    BACKGROUND: Artificial selection played an important role in the origin of modern Glycine max cultivars from the wild soybean Glycine soja. To elucidate the consequences of artificial selection accompanying the domestication and modern improvement of soybean, 25 new and 30 published whole-genome re...

  3. Genome sequence comparison reveals a candidate gene involved in male-hermaphrodite differentiation in papaya (Carica papaya) trees.

    Science.gov (United States)

    Ueno, Hiroki; Urasaki, Naoya; Natsume, Satoshi; Yoshida, Kentaro; Tarora, Kazuhiko; Shudo, Ayano; Terauchi, Ryohei; Matsumura, Hideo

    2015-04-01

    The sex type of papaya (Carica papaya) is determined by a pair of sex chromosomes (XX, female; XY, male; and XY(h), hermaphrodite), in which there is a non-recombining genomic region in the Y and Y(h) chromosomes. This region is presumed to be involved in determination of males and hermaphrodites; it is designated as the male-specific region in the Y chromosome (MSY) and the hermaphrodite-specific region in the Y(h) chromosome (HSY). Here, we identified the genes determining male and hermaphrodite sex types by comparing MSY and HSY genomic sequences. In the MSY and HSY genomic regions, we identified 14,528 nucleotide substitutions and 965 short indels with a large gap and two highly diverged regions. In the predicted genes expressed in flower buds, we found no nucleotide differences leading to amino acid changes between the MSY and HSY. However, we found an HSY-specific transposon insertion in a gene (SVP-like) showing similarity to the Short Vegetative Phase (SVP) gene. Study of SVP-like transcripts revealed that the MSY allele encoded an intact protein, while the HSY allele encoded a truncated protein. Our findings demonstrated that the SVP-like gene is a candidate gene for male-hermaphrodite determination in papaya.

  4. Genome Sequencing and Mapping Reveal Loss of Heterozygosity as a Mechanism for Rapid Adaptation in the Vegetable Pathogen Phytophthora capsici

    Energy Technology Data Exchange (ETDEWEB)

    Lamour, Kurt H.; Mudge, Joann; Gobena, Daniel; Hurtado-Gonzales, Oscar P.; Schmutz, Jeremy; Kuo, Alan; Miller, Neil A.; Rice, Brandon J.; Raffaele, Sylvain; Cano, Liliana M.; Bharti, Arvind K.; Donahoo, Ryan S.; Finely, Sabra; Huitema, Edgar; Hulvey, Jon; Platt, Darren; Salamov, Asaf; Savidor, Alon; Sharma, Rahul; Stam, Remco; Sotrey, Dylan; Thines, Marco; Win, Joe; Haas, Brian J.; Dinwiddie, Darrell L.; Jenkins, Jerry; Knight, James R.; Affourtit, Jason P.; Han, Cliff S.; Chertkov, Olga; Lindquist, Erika A.; Detter, Chris; Grigoriev, Igor V.; Kamoun, Sophien; Kingsmore, Stephen F.

    2012-02-07

    The oomycete vegetable pathogen Phytophthora capsici has shown remarkable adaptation to fungicides and new hosts. Like other members of this destructive genus, P. capsici has an explosive epidemiology, rapidly producing massive numbers of asexual spores on infected hosts. In addition, P. capsici can remain dormant for years as sexually recombined oospores, making it difficult to produce crops at infested sites, and allowing outcrossing populations to maintain significant genetic variation. Genome sequencing, development of a high-density genetic map, and integrative genomic or genetic characterization of P. capsici field isolates and intercross progeny revealed significant mitotic loss of heterozygosity (LOH) in diverse isolates. LOH was detected in clonally propagated field isolates and sexual progeny, cumulatively affecting >30% of the genome. LOH altered genotypes for more than 11,000 single-nucleotide variant sites and showed a strong association with changes in mating type and pathogenicity. Overall, it appears that LOH may provide a rapid mechanism for fixing alleles and may be an important component of adaptability for P. capsici.

  5. Analysis of the Complete Mycoplasma hominis LBD-4 Genome Sequence Reveals Strain-Variable Prophage Insertion and Distinctive Repeat-Containing Surface Protein Arrangements

    OpenAIRE

    2015-01-01

    The complete genome sequence of Mycoplasma hominis LBD-4 has been determined and the gene content ascribed. The 715,165-bp chromosome contains 620 genes, including 14 carried by a strain-variable prophage genome related to Mycoplasma fermentans MFV-1 and Mycoplasma arthritidis MAV-1. Comparative analysis with the genome of M. hominis PG21(T) reveals distinctive arrangements of repeat-containing surface proteins.

  6. Analysis of the Complete Mycoplasma hominis LBD-4 Genome Sequence Reveals Strain-Variable Prophage Insertion and Distinctive Repeat-Containing Surface Protein Arrangements.

    Science.gov (United States)

    Calcutt, Michael J; Foecking, Mark F

    2015-02-26

    The complete genome sequence of Mycoplasma hominis LBD-4 has been determined and the gene content ascribed. The 715,165-bp chromosome contains 620 genes, including 14 carried by a strain-variable prophage genome related to Mycoplasma fermentans MFV-1 and Mycoplasma arthritidis MAV-1. Comparative analysis with the genome of M. hominis PG21(T) reveals distinctive arrangements of repeat-containing surface proteins.

  7. Penicillium arizonense, a new, genome sequenced fungal species, reveals a high chemical diversity in secreted metabolites

    DEFF Research Database (Denmark)

    Grijseels, Sietske; Nielsen, Jens Christian; Randelovic, Milica;

    2016-01-01

    confirmed the grouping of P. arizonense within section Canescentia. Compared to related species, P. arizonense proved to encode a high number of proteins involved in carbohydrate metabolism, in particular hemicellulases. Mining the genome for genes involved in secondary metabolite biosynthesis resulted...... of biosynthetic gene clusters in P. arizonense responsible for the synthesis of all detected compounds except curvulinic acid. The capacity to produce biomass degrading enzymes and the identification of a high chemical diversity in secreted bioactive secondary metabolites, offers a broad range of potential...

  8. Reduced representation bisulphite sequencing of the cattle genome reveals DNA methylation patterns

    Science.gov (United States)

    Using reduced representation bisulphite sequencing (RRBS), we obtained the first single-base-resolution maps of bovine DNA methylation in ten somatic tissues. In total, we observed 1,868,049 cytosines in the CG-enriched regions. Similar to the methylation patterns in other species, the CG context wa...

  9. Genome sequence of the Asian Tiger mosquito, Aedes albopictus, reveals insights into its biology, genetics, and evolution

    NARCIS (Netherlands)

    Chen, X.G.; Jiang, X.; Gu, J.; Xu, M.; Wu, Y.; Deng, Y.; Zhang, C.; Bonizzoni, M.; Dermauw, W.; Vontas, J.; Armbruster, P.; Huang, X.; Yang, Y.; Zhang, H.; He, W.; Peng, H.; Liu, Y.; Wu, K.; Chen, J.; Lirakis, M.; Topalis, P.; Van Leeuwen, T.; Hall, B.A.; Thorpe, C.; Mueller, R.L.; Sun, C.; Waterhouse, R.M.; Yan, G.; Tu, Z.J.; Fang, X.; James, A.A.

    2015-01-01

    The Asian tiger mosquito, Aedes albopictus, is a highly successful invasive species that transmits a number of human viral diseases, including dengue and Chikungunya fevers. This species has a large genome with significant population-based size variation. The complete genome sequence was determined

  10. Genome sequence of the button mushroom Agaricus bisporus reveals mechanisms governing adaptation to a humic-rich ecological niche

    Energy Technology Data Exchange (ETDEWEB)

    Morin, Emmanuelle; Kohler, Annegret; Baker, Adam R.; Foulongne-Oriol, Marie; Lombard, Vincent; Nagy, Laszlo G.; Ohm, Robin A.; Patyshakuliyeva, Aleksandrina; Brun, Annick; Aerts, Andrea L.; Bailey, Andrew M.; Billette, Christophe; Coutinho, Pedro M.; Deakin, Greg; Doddapaneni, Harshavardhan; Floudas, Dimitrios; Grimwood, Jane; Hilden, Kristiina; Kues, Ursula; LaButti, Kurt M.; Lapidus, Alla; Lindquist, Erika A.; Lucas, Susan M.; Murat, Claude; Riley, Robert W.; Salamov, Asaf A.; Schmutz, Jeremy; Subramanian, Venkataramanan; Wosten, Han A. B.; Xu, Jianping; Eastwood, Daniel C.; Foster, Gary D.; Sonnenberg, Anton S. M.; Cullen, Dan; de Vries, Ronald P.; Lundell, Taina; Hibbett, David S.; Henrissat, Bernard; Burton, Kerry S.; Kerrigan, Richard W.; Challen, Michael P.; Grigoriev, Igor V.; Martin, Francis

    2012-04-27

    Agaricus bisporus is the model fungus for the adaptation, persistence, and growth in the humic-rich leaf-litter environment. Aside from its ecological role, A. bisporus has been an important component of the human diet for over 200 y and worldwide cultivation of the button mushroom forms a multibillion dollar industry. We present two A. bisporus genomes, their gene repertoires and transcript profiles on compost and during mushroom formation. The genomes encode a full repertoire of polysaccharide-degrading enzymes similar to that of wood-decayers. Comparative transcriptomics of mycelium grown on defined medium, casing-soil, and compost revealed genes encoding enzymes involved in xylan, cellulose, pectin, and protein degradation are more highly expressed in compost. The striking expansion of heme-thiolate peroxidases and β-etherases is distinctive from Agaricomycotina wood-decayers and suggests a broad attack on decaying lignin and related metabolites found in humic acid-rich environment. Similarly, up-regulation of these genes together with a lignolytic manganese peroxidase, multiple copper radical oxidases, and cytochrome P450s is consistent with challenges posed by complex humic-rich substrates. The gene repertoire and expression of hydrolytic enzymes in A. bisporus is substantially different from the taxonomically related ectomycorrhizal symbiont Laccaria bicolor. A common promoter motif was also identified in genes very highly expressed in humic-rich substrates. These observations reveal genetic and enzymatic mechanisms governing adaptation to the humic-rich ecological niche formed during plant degradation, further defining the critical role such fungi contribute to soil structure and carbon sequestration in terrestrial ecosystems. Genome sequence will expedite mushroom breeding for improved agronomic characteristics.

  11. Genome sequence of the button mushroom Agaricus bisporus reveals mechanisms governing adaptation to a humic-rich ecological niche.

    Science.gov (United States)

    Morin, Emmanuelle; Kohler, Annegret; Baker, Adam R; Foulongne-Oriol, Marie; Lombard, Vincent; Nagy, Laszlo G; Ohm, Robin A; Patyshakuliyeva, Aleksandrina; Brun, Annick; Aerts, Andrea L; Bailey, Andrew M; Billette, Christophe; Coutinho, Pedro M; Deakin, Greg; Doddapaneni, Harshavardhan; Floudas, Dimitrios; Grimwood, Jane; Hildén, Kristiina; Kües, Ursula; Labutti, Kurt M; Lapidus, Alla; Lindquist, Erika A; Lucas, Susan M; Murat, Claude; Riley, Robert W; Salamov, Asaf A; Schmutz, Jeremy; Subramanian, Venkataramanan; Wösten, Han A B; Xu, Jianping; Eastwood, Daniel C; Foster, Gary D; Sonnenberg, Anton S M; Cullen, Dan; de Vries, Ronald P; Lundell, Taina; Hibbett, David S; Henrissat, Bernard; Burton, Kerry S; Kerrigan, Richard W; Challen, Michael P; Grigoriev, Igor V; Martin, Francis

    2012-10-23

    Agaricus bisporus is the model fungus for the adaptation, persistence, and growth in the humic-rich leaf-litter environment. Aside from its ecological role, A. bisporus has been an important component of the human diet for over 200 y and worldwide cultivation of the "button mushroom" forms a multibillion dollar industry. We present two A. bisporus genomes, their gene repertoires and transcript profiles on compost and during mushroom formation. The genomes encode a full repertoire of polysaccharide-degrading enzymes similar to that of wood-decayers. Comparative transcriptomics of mycelium grown on defined medium, casing-soil, and compost revealed genes encoding enzymes involved in xylan, cellulose, pectin, and protein degradation are more highly expressed in compost. The striking expansion of heme-thiolate peroxidases and β-etherases is distinctive from Agaricomycotina wood-decayers and suggests a broad attack on decaying lignin and related metabolites found in humic acid-rich environment. Similarly, up-regulation of these genes together with a lignolytic manganese peroxidase, multiple copper radical oxidases, and cytochrome P450s is consistent with challenges posed by complex humic-rich substrates. The gene repertoire and expression of hydrolytic enzymes in A. bisporus is substantially different from the taxonomically related ectomycorrhizal symbiont Laccaria bicolor. A common promoter motif was also identified in genes very highly expressed in humic-rich substrates. These observations reveal genetic and enzymatic mechanisms governing adaptation to the humic-rich ecological niche formed during plant degradation, further defining the critical role such fungi contribute to soil structure and carbon sequestration in terrestrial ecosystems. Genome sequence will expedite mushroom breeding for improved agronomic characteristics.

  12. Whole Genome Sequencing of Field Isolates Reveals Extensive Genetic Diversity in Plasmodium vivax from Colombia.

    Science.gov (United States)

    Winter, David J; Pacheco, M Andreína; Vallejo, Andres F; Schwartz, Rachel S; Arevalo-Herrera, Myriam; Herrera, Socrates; Cartwright, Reed A; Escalante, Ananias A

    2015-12-01

    Plasmodium vivax is the most prevalent malarial species in South America and exerts a substantial burden on the populations it affects. The control and eventual elimination of P. vivax are global health priorities. Genomic research contributes to this objective by improving our understanding of the biology of P. vivax and through the development of new genetic markers that can be used to monitor efforts to reduce malaria transmission. Here we analyze whole-genome data from eight field samples from a region in Córdoba, Colombia where malaria is endemic. We find considerable genetic diversity within this population, a result that contrasts with earlier studies suggesting that P. vivax had limited diversity in the Americas. We also identify a selective sweep around a substitution known to confer resistance to sulphadoxine-pyrimethamine (SP). This is the first observation of a selective sweep for SP resistance in this species. These results indicate that P. vivax has been exposed to SP pressure even when the drug is not in use as a first line treatment for patients afflicted by this parasite. We identify multiple non-synonymous substitutions in three other genes known to be involved with drug resistance in Plasmodium species. Finally, we found extensive microsatellite polymorphisms. Using this information we developed 18 polymorphic and easy to score microsatellite loci that can be used in epidemiological investigations in South America.

  13. Genome re-sequencing of semi-wild soybean reveals a complex Soja population structure and deep introgression.

    Directory of Open Access Journals (Sweden)

    Jie Qiu

    Semi-wild soybean is a unique type of soybean that retains both wild and domesticated characteristics, which provides an important intermediate type for understanding the evolution of the subgenus Soja population in the Glycine genus. In this study, a semi-wild soybean line (Maliaodou) and a wild line (Lanxi 1) collected from the lower Yangtze regions were deeply sequenced while nine other semi-wild lines were sequenced to a 3-fold genome coverage. Sequence analysis revealed that (1) no independent phylogenetic branch covering all 10 semi-wild lines was observed in the Soja phylogenetic tree; (2) besides two distinct subpopulations of wild and cultivated soybean in the Soja population structure, all semi-wild lines were mixed with some wild lines into a subpopulation rather than an independent one or an intermediate transition type of soybean domestication; (3) high heterozygous rates (0.19-0.49) were observed in several semi-wild lines; and (4) over 100 putative selective regions were identified by selective sweep analysis, including those related to the development of seed size. Our results suggested a hybridization origin for the semi-wild soybean, which makes a complex Soja population structure.

  14. Genome re-sequencing of semi-wild soybean reveals a complex Soja population structure and deep introgression.

    Science.gov (United States)

    Qiu, Jie; Wang, Yu; Wu, Sanling; Wang, Ying-Ying; Ye, Chu-Yu; Bai, Xuefei; Li, Zefeng; Yan, Chenghai; Wang, Weidi; Wang, Ziqiang; Shu, Qingyao; Xie, Jiahua; Lee, Suk-Ha; Fan, Longjiang

    2014-01-01

    Semi-wild soybean is a unique type of soybean that retains both wild and domesticated characteristics, which provides an important intermediate type for understanding the evolution of the subgenus Soja population in the Glycine genus. In this study, a semi-wild soybean line (Maliaodou) and a wild line (Lanxi 1) collected from the lower Yangtze regions were deeply sequenced while nine other semi-wild lines were sequenced to a 3-fold genome coverage. Sequence analysis revealed that (1) no independent phylogenetic branch covering all 10 semi-wild lines was observed in the Soja phylogenetic tree; (2) besides two distinct subpopulations of wild and cultivated soybean in the Soja population structure, all semi-wild lines were mixed with some wild lines into a subpopulation rather than an independent one or an intermediate transition type of soybean domestication; (3) high heterozygous rates (0.19-0.49) were observed in several semi-wild lines; and (4) over 100 putative selective regions were identified by selective sweep analysis, including those related to the development of seed size. Our results suggested a hybridization origin for the semi-wild soybean, which makes a complex Soja population structure.

  15. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  16. The complete mitochondrial genome sequence of the spider Habronattus oregonensis reveals rearranged and extremely truncated tRNAs

    Energy Technology Data Exchange (ETDEWEB)

    Masta, Susan E.; Boore, Jeffrey L.

    2004-01-31

    We sequenced the entire mitochondrial genome of the jumping spider Habronattus oregonensis of the arachnid order Araneae (Arthropoda: Chelicerata). A number of unusual features distinguish this genome from other chelicerate and arthropod mitochondrial genomes. Most of the transfer RNA gene sequences are greatly reduced in size and cannot be folded into typical cloverleaf-shaped secondary structures. At least nine of the tRNA sequences lack the potential to form TΨC arm stem pairings, and instead are inferred to have TV-replacement loops. Furthermore, sequences that could encode the 3' aminoacyl acceptor stems in at least 10 tRNAs appear to be lacking, because fully paired acceptor stems are not possible and because the downstream sequences instead encode adjacent genes. Hence, these appear to be among the smallest known tRNA genes. We postulate that an RNA editing mechanism must exist to restore the 3' aminoacyl acceptor stems in order to allow the tRNAs to function. At least seven tRNAs are rearranged with respect to the chelicerate Limulus polyphemus, although the arrangement of the protein-coding genes is identical. Most mitochondrial protein-coding genes of H. oregonensis have ATN as initiation codons, as commonly found in arthropod mtDNAs, but cytochrome oxidase subunit 2 and 3 genes apparently use UUG as an initiation codon. Finally, many of the gene sequences overlap one another and are truncated. This 14,381 bp genome, the first mitochondrial genome of a spider yet sequenced, is one of the smallest arthropod mitochondrial genomes known. We suggest that post-transcriptional RNA editing can likely maintain function of the tRNAs while permitting the accumulation of mutations that would otherwise be deleterious. Such mechanisms may have allowed for the minimization of the spider mitochondrial genome.

  17. The complete genome sequence of Dickeya zeae EC1 reveals substantial divergence from other Dickeya strains and species.

    Science.gov (United States)

    Zhou, Jianuan; Cheng, Yingying; Lv, Mingfa; Liao, Lisheng; Chen, Yufan; Gu, Yanfang; Liu, Shiyin; Jiang, Zide; Xiong, Yuanyan; Zhang, Lianhui

    2015-08-04

    Dickeya zeae is a bacterial species that infects monocotyledons and dicotyledons. Two antibiotic-like phytotoxins named zeamine and zeamine II were reported to play an important role in rice seed germination, and two genes associated with zeamines production, i.e., zmsA and zmsK, have been thoroughly characterized. However, other virulence factors and its molecular mechanisms of host specificity and pathogenesis are hardly known. The complete genome of D. zeae strain EC1 isolated from diseased rice plants was sequenced, annotated, and compared with the genomes of other Dickeya spp. The pathogen contains a chromosome of 4,532,364 bp with 4,154 predicted protein-coding genes. Comparative genomics analysis indicates that D. zeae EC1 is most co-linear with D. chrysanthemi Ech1591, most conserved with D. zeae Ech586 and least similar to D. paradisiaca Ech703. Substantial genomic rearrangement was revealed by comparing EC1 with Ech586 and Ech703. Most virulence genes were well-conserved in Dickeya strains except Ech703. Significantly, the zms gene cluster involved in biosynthesis of zeamines, which were shown previously as key virulence determinants, is present in D. zeae strains isolated from rice, and some D. solani strains, but absent in other Dickeya species and the D. zeae strains isolated from other plants or sources. In addition, a DNA fragment containing 9 genes associated with fatty acid biosynthesis was found inserted in the fli gene cluster encoding flagellar biosynthesis of strain EC1 and two other rice isolates but not in other strains. This gene cluster shares a high protein similarity to the fatty acid genes from Pantoea ananatis. Our findings delineate the genetic background of D. zeae EC1, which infects both dicotyledons and monocotyledons, and suggest that D. zeae strains isolated from rice could be grouped into a distinct pathovar, i.e., D. zeae subsp. oryzae.
In addition, the results of this study also unveiled that the zms gene cluster presented in

  18. Nuclear species-diagnostic SNP markers mined from 454 amplicon sequencing reveal admixture genomic structure of modern citrus varieties.

    Science.gov (United States)

    Curk, Franck; Ancillo, Gema; Ollitrault, Frédérique; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Garcia-Lor, Andres; Navarro, Luis; Ollitrault, Patrick

    2015-01-01

    Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species-diagnostic SNP

  19. Whole-Genome Sequencing of Measles Virus Genotypes H1 and D8 During Outbreaks of Infection Following the 2010 Olympic Winter Games Reveals Viral Transmission Routes.

    Science.gov (United States)

    Gardy, Jennifer L; Naus, Monika; Amlani, Ashraf; Chung, Walter; Kim, Hochan; Tan, Malcolm; Severini, Alberto; Krajden, Mel; Puddicombe, David; Sahni, Vanita; Hayden, Althea S; Gustafson, Reka; Henry, Bonnie; Tang, Patrick

    2015-11-15

    We used whole-genome sequencing to investigate a dual-genotype outbreak of measles occurring after the XXI Olympic Winter Games in Vancouver, Canada. By sequencing 27 complete genomes from H1 and D8 genotype measles viruses isolated from outbreak cases, we estimated the virus mutation rate, determined that person-to-person transmission is typically associated with 0 mutations between isolates, and established that a single introduction of H1 virus led to the expansion of the outbreak beyond Vancouver. This is the largest measles genomics project to date, revealing novel aspects of measles virus genetics and providing new insights into transmission of this reemerging viral pathogen.
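    The comparison described in this record, counting mutations between pairs of outbreak isolates, can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: it assumes the genomes have already been aligned to equal length, the isolate names and sequences are invented, and any position carrying a gap or ambiguous base in either sequence is skipped.

```python
from itertools import combinations

VALID_BASES = set("ACGT")

def count_differences(seq_a, seq_b):
    """Count aligned positions where both sequences have an unambiguous base and differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID_BASES and b in VALID_BASES and a != b:
            diffs += 1
    return diffs

def pairwise_differences(alignment):
    """Return {(name_a, name_b): difference count} for every pair of isolates."""
    return {
        (a, b): count_differences(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }

# Hypothetical toy alignment; real measles genomes are roughly 16 kb long.
toy_alignment = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAC",
    "isolate_3": "ACGTTCGNAC",
}
distances = pairwise_differences(toy_alignment)
```

    In the toy data, isolate_1 and isolate_2 are identical, which is the pattern the abstract reports as typical for direct person-to-person transmission.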

  20. Phylogenetic diversity and genotypical complexity of H9N2 influenza A viruses revealed by genomic sequence analysis.

    Directory of Open Access Journals (Sweden)

    Guoying Dong

    Full Text Available H9N2 influenza A viruses have become established worldwide in terrestrial poultry and wild birds, and are occasionally transmitted to mammals including humans and pigs. To comprehensively elucidate the genetic and evolutionary characteristics of H9N2 influenza viruses, we performed a large-scale sequence analysis of 571 viral genomes from the NCBI Influenza Virus Resource Database, representing the spectrum of H9N2 influenza viruses isolated from 1966 to 2009. Our study provides a panoramic framework for better understanding the genesis and evolution of H9N2 influenza viruses, and for describing the history of H9N2 viruses circulating in diverse hosts. Panorama phylogenetic analysis of the eight viral gene segments revealed the complexity and diversity of H9N2 influenza viruses. The 571 H9N2 viral genomes were classified into 74 separate lineages, which had marked host and geographical differences in phylogeny. Panorama genotypical analysis also revealed that H9N2 viruses include at least 98 genotypes, which were further divided according to their HA lineages into seven series (A-G). Phylogenetic analysis of the internal genes showed that H9N2 viruses are closely related to H3, H4, H5, H7, H10, and H14 subtype influenza viruses. Our results indicate that H9N2 viruses have undergone extensive reassortments to generate multiple reassortants and genotypes, suggesting that the continued circulation of multiple genotypical H9N2 viruses throughout the world in diverse hosts has the potential to cause future influenza outbreaks in poultry and epidemics in humans. We propose a nomenclature system for identifying and unifying all lineages and genotypes of H9N2 influenza viruses in order to facilitate international communication on the evolution, ecology and epidemiology of H9N2 influenza viruses.

  1. Repetitive sequence analysis and karyotyping reveal different genome evolution and speciation of diploid and tetraploid Tripsacum dactyloides

    Directory of Open Access Journals (Sweden)

    Qilin Zhu

    2016-08-01

    Full Text Available In the subtribe Maydeae, Tripsacum and Zea are closely related genera. Tripsacum is a horticultural crop widely used as pasture forage. Previous studies suggested that Tripsacum might play an important role in maize origin and evolution. However, our understanding of the genomics and the evolution of Tripsacum remains limited. In this study, two diploids, T. dactyloides var. meridionale (2n = 36, MR) and T. dactyloides (2n = 36, DD), and one tetraploid, T. dactyloides (2n = 72, DL), were sequenced by low-coverage genome sequencing followed by graph-based cluster analysis. The results showed that 63.23%, 59.20%, and 61.57% of the respective genomes of MR, DD, and DL were repetitive DNA sequence. The proportions of different repetitive sequences varied greatly among the three species. Fluorescence in situ hybridization (FISH) analysis of mitotic metaphase chromosomes with satellite repeats as the probes showed that the FISH signal patterns of DL were more similar to those of DD than to those of MR. Comparative analysis of the repeats also showed that DL shared more common repeat families with DD than with MR. Phylogenetic analysis of internal transcribed spacer region sequences further supported the evolutionary relationship among the three species. Repetitive sequences comparison showed that Tripsacum shared more repeat families with Zea than with Coix and Sorghum. Our study sheds new light on the genomics of Tripsacum and differential speciation in the Poaceae family.

  2. Complete mitochondrial genome sequence of three Tetrahymena species reveals mutation hot spots and accelerated nonsynonymous substitutions in Ymf genes.

    Directory of Open Access Journals (Sweden)

    Mike M Moradian

    Full Text Available The ciliate Tetrahymena, a model organism, contains a divergent mitochondrial (Mt) genome with unusual properties, where half of its 44 genes still remain without a definitive function. These genes could be categorized into two major groups: KPC (known protein coding) and Ymf (genes without an identified function). To gain insights into the mechanisms underlying gene divergence and molecular evolution of Tetrahymena (T.) Mt genomes, we sequenced three Mt genomes of T. paravorax, T. pigmentosa, and T. malaccensis. These genomes were aligned and the analyses were carried out using several programs that calculate distance, nucleotide substitution (dn/ds), and their rate ratios (omega) on individual codon sites and via a sliding window approach. Comparative genomic analysis indicated a conserved putative transcription control sequence, a GC box, in a region where presumably transcription and replication initiate. We also found distinct features in the Mt genome of T. paravorax despite similar genome organization among these approximately 47 kb long linear genomes. Another significant finding was the presence of one or more highly variable regions in Ymf genes where the majority of substitutions were concentrated. These regions were mutation hotspots where elevated distances and the dn/ds ratios were primarily due to an increase in the number of nonsynonymous substitutions, suggesting relaxed selective constraint. However, in a few Ymf genes, accelerated rates of nonsynonymous substitutions may be due to positive selection. Similarly, at the protein level the majority of amino acid replacements occurred in these regions. Ymf genes comprise half of the genes in Tetrahymena Mt genomes, so understanding why they have not been assigned definitive functions is an important aspect of molecular evolution. Importantly, nucleotide substitution types and rates suggest possible reasons for not being able to find homologues for Ymf genes. Additionally, comparative genomic

  3. Whole-Genome Sequences of Xanthomonas euvesicatoria Strains Clarify Taxonomy and Reveal a Stepwise Erosion of Type 3 Effectors

    Science.gov (United States)

    Barak, Jeri D.; Vancheva, Taca; Lefeuvre, Pierre; Jones, Jeffrey B.; Timilsina, Sujan; Minsavage, Gerald V.; Vallad, Gary E.; Koebnik, Ralf

    2016-01-01

    Multiple species of Xanthomonas cause bacterial spot of tomato (BST) and pepper. We sequenced five Xanthomonas euvesicatoria strains isolated from three continents (Africa, Asia, and South America) to provide a set of representative genomes with temporal and geographic diversity. LMG strains 667, 905, 909, and 933 were pathogenic on tomato and pepper, whereas LMG 918 elicited a hypersensitive reaction (HR) on tomato. Furthermore, LMG 667, 909, and 918 elicited an HR on Early Cal Wonder 30R containing Bs3. We examined pectolytic activity and starch hydrolysis, two tests which are useful in differentiating X. euvesicatoria from X. perforans, both causal agents of BST. LMG strains 905, 909, 918, and 933 were nonpectolytic while only LMG 918 was amylolytic. These results suggest that LMG 918 is atypical of X. euvesicatoria. Sequence analysis of all the publicly available X. euvesicatoria and X. perforans strains comparing seven housekeeping genes identified seven haplotypes with few polymorphisms. Whole genome comparison by average nucleotide identity (ANI) resulted in values of >99% among the LMG strains 667, 905, 909, 918, and 933 and X. euvesicatoria strains and >99.6% among the LMG strains and a subset of X. perforans strains. These results suggest that X. euvesicatoria and X. perforans should be considered a single species. ANI values between strains of X. euvesicatoria, X. perforans, X. allii, X. alfalfae subsp. citrumelonis, X. dieffenbachiae, and a recently described pathogen of rose were >97.8%, suggesting these pathogens should be a single species and recognized as X. euvesicatoria. Analysis of the newly sequenced X. euvesicatoria strains revealed interesting findings among the type 3 (T3) effectors, relatively ancient stepwise erosion of some T3 effectors, additional X. euvesicatoria-specific T3 effectors among the causal agents of BST, orthologs of avrBs3 and avrBs4, and T3 effectors shared among xanthomonads pathogenic against various hosts.
The results from

  4. Genome-wide footprints of pig domestication and selection revealed through massive parallel sequencing of pooled DNA.

    Directory of Open Access Journals (Sweden)

    Andreia J Amaral

    Full Text Available BACKGROUND: Artificial selection has caused rapid evolution in domesticated species. The identification of selection footprints across domesticated genomes can contribute to uncover the genetic basis of phenotypic diversity. METHODOLOGY/MAIN FINDINGS: Genome wide footprints of pig domestication and selection were identified using massive parallel sequencing of pooled reduced representation libraries (RRL) representing ∼2% of the genome from wild boar and four domestic pig breeds (Large White, Landrace, Duroc and Pietrain) which have been under strong selection for muscle development, growth, behavior and coat color. Using specifically developed statistical methods that account for DNA pooling, low mean sequencing depth, and sequencing errors, we provide genome-wide estimates of nucleotide diversity and genetic differentiation in pig. Widespread signals suggestive of positive and balancing selection were found and the strongest signals were observed in Pietrain, one of the breeds most intensively selected for muscle development. Most signals were population-specific but affected genomic regions which harbored genes for common biological categories including coat color, brain development, muscle development, growth, metabolism, olfaction and immunity. Genetic differentiation in regions harboring genes related to muscle development and growth was higher between breeds than between a given breed and the wild boar. CONCLUSIONS/SIGNIFICANCE: These results suggest that although domesticated breeds have experienced similar selective pressures, selection has acted upon different genes. This might reflect the multiple domestication events of European breeds or could be the result of subsequent introgression of Asian alleles. Overall, it was estimated that approximately 7% of the porcine genome has been affected by selection events. This study illustrates that the massive parallel sequencing of genomic pools is a cost-effective approach to identify

  5. Transcriptome sequencing and genome-wide association analyses reveal lysosomal function and actin cytoskeleton remodeling in schizophrenia and bipolar disorder.

    Science.gov (United States)

    Zhao, Z; Xu, J; Chen, J; Kim, S; Reimers, M; Bacanu, S-A; Yu, H; Liu, C; Sun, J; Wang, Q; Jia, P; Xu, F; Zhang, Y; Kendler, K S; Peng, Z; Chen, X

    2015-05-01

    Schizophrenia (SCZ) and bipolar disorder (BPD) are severe mental disorders with high heritability. Clinicians have long noticed the similarities of clinic symptoms between these disorders. In recent years, accumulating evidence indicates some shared genetic liabilities. However, what is shared remains elusive. In this study, we conducted whole transcriptome analysis of post-mortem brain tissues (cingulate cortex) from SCZ, BPD and control subjects, and identified differentially expressed genes in these disorders. We found 105 and 153 genes differentially expressed in SCZ and BPD, respectively. By comparing the t-test scores, we found that many of the genes differentially expressed in SCZ and BPD are concordant in their expression level (q⩽0.01, 53 genes; q⩽0.05, 213 genes; q⩽0.1, 885 genes). Using genome-wide association data from the Psychiatric Genomics Consortium, we found that these differentially and concordantly expressed genes were enriched in association signals for both SCZ (Pgenes show concordant expression and association for both SCZ and BPD. Pathway analyses of these genes indicated that they are involved in the lysosome, Fc gamma receptor-mediated phagocytosis, regulation of actin cytoskeleton pathways, along with several cancer pathways. Functional analyses of these genes revealed an interconnected pathway network centered on lysosomal function and the regulation of actin cytoskeleton. These pathways and their interacting network were principally confirmed by an independent transcriptome sequencing data set of the hippocampus. Dysregulation of lysosomal function and cytoskeleton remodeling has direct impacts on endocytosis, phagocytosis, exocytosis, vesicle trafficking, neuronal maturation and migration, neurite outgrowth and synaptic density and plasticity, and different aspects of these processes have been implicated in SCZ and BPD.

  6. The first complete mitochondrial genome sequences of Amblypygi (Chelicerata: Arachnida) reveal conservation of the ancestral arthropod gene order.

    Science.gov (United States)

    Fahrein, Kathrin; Masta, Susan E; Podsiadlowski, Lars

    2009-05-01

    Amblypygi (whip spiders) are terrestrial chelicerates inhabiting the subtropics and tropics. In morphological and rRNA-based phylogenetic analyses, Amblypygi cluster with Uropygi (whip scorpions) and Araneae (spiders) to form the taxon Tetrapulmonata, but there is controversy regarding the interrelationship of these three taxa. Mitochondrial genomes provide an additional large data set of phylogenetic information (sequences, gene order, RNA secondary structure), but in arachnids, mitochondrial genome data are missing for some of the major orders. In the course of an ongoing project concerning arachnid mitochondrial genomics, we present the first two complete mitochondrial genomes from Amblypygi. Both genomes were found to be typical circular duplex DNA molecules with all 37 genes usually present in bilaterian mitochondrial genomes. In both species, gene order is identical to that of Limulus polyphemus (Xiphosura), which is assumed to reflect the putative arthropod ground pattern. All tRNA gene sequences have the potential to fold into structures that are typical of metazoan mitochondrial tRNAs, except for tRNA-Ala, which lacks the D arm in both amblypygids, suggesting the loss of this feature early in amblypygid evolution. Phylogenetic analysis resulted in weak support for Uropygi being the sister group of Amblypygi.

  7. Genome sequence of the Asian Tiger mosquito, Aedes albopictus, reveals insights into its biology, genetics, and evolution.

    Science.gov (United States)

    Chen, Xiao-Guang; Jiang, Xuanting; Gu, Jinbao; Xu, Meng; Wu, Yang; Deng, Yuhua; Zhang, Chi; Bonizzoni, Mariangela; Dermauw, Wannes; Vontas, John; Armbruster, Peter; Huang, Xin; Yang, Yulan; Zhang, Hao; He, Weiming; Peng, Hongjuan; Liu, Yongfeng; Wu, Kun; Chen, Jiahua; Lirakis, Manolis; Topalis, Pantelis; Van Leeuwen, Thomas; Hall, Andrew Brantley; Jiang, Xiaofang; Thorpe, Chevon; Mueller, Rachel Lockridge; Sun, Cheng; Waterhouse, Robert Michael; Yan, Guiyun; Tu, Zhijian Jake; Fang, Xiaodong; James, Anthony A

    2015-11-03

    The Asian tiger mosquito, Aedes albopictus, is a highly successful invasive species that transmits a number of human viral diseases, including dengue and Chikungunya fevers. This species has a large genome with significant population-based size variation. The complete genome sequence was determined for the Foshan strain, an established laboratory colony derived from wild mosquitoes from southeastern China, a region within the historical range of the origin of the species. The genome comprises 1,967 Mb, the largest mosquito genome sequenced to date, and its size results principally from an abundance of repetitive DNA classes. In addition, expansions of the numbers of members in gene families involved in insecticide-resistance mechanisms, diapause, sex determination, immunity, and olfaction also contribute to the larger size. Portions of integrated flavivirus-like genomes support a shared evolutionary history of association of these viruses with their vector. The large genome repertory may contribute to the adaptability and success of Ae. albopictus as an invasive species.

  8. Determination of the melon chloroplast and mitochondrial genome sequences reveals that the largest reported mitochondrial genome in plants contains a significant amount of DNA having a nuclear origin

    Science.gov (United States)

    2011-01-01

    Background: The melon belongs to the Cucurbitaceae family, whose economic importance among vegetable crops is second only to Solanaceae. The melon has a small genome size (454 Mb), which makes it suitable for molecular and genetic studies. Despite similar nuclear and chloroplast genome sizes, cucurbits show great variation when their mitochondrial genomes are compared. The melon possesses the largest plant mitochondrial genome, as much as eight times larger than that of other cucurbits. Results: The nucleotide sequences of the melon chloroplast and mitochondrial genomes were determined. The chloroplast genome (156,017 bp) included 132 genes, with 98 single-copy genes dispersed between the small (SSC) and large (LSC) single-copy regions and 17 duplicated genes in the inverted repeat regions (IRa and IRb). A comparison of the cucumber and melon chloroplast genomes showed differences in only approximately 5% of nucleotides, mainly due to short indels and SNPs. Additionally, 2.74 Mb of mitochondrial sequence, accounting for 95% of the estimated mitochondrial genome size, were assembled into five scaffolds and four additional unscaffolded contigs. Of the mitochondrial genome, 84% is contained in a single scaffold. The gene-coding region accounted for 1.7% (45,926 bp) of the total sequence, including 51 protein-coding genes, 4 conserved ORFs, 3 rRNA genes and 24 tRNA genes. Despite the differences observed in the mitochondrial genome sizes of cucurbit species, Citrullus lanatus (379 kb), Cucurbita pepo (983 kb) and Cucumis melo (2,740 kb) share 120 kb of sequence, including the predicted protein-coding regions. Nevertheless, melon contained a high number of repetitive sequences and a high content of DNA of nuclear origin, which represented 42% and 47% of the total sequence, respectively. Conclusions: Whereas the size and gene organisation of chloroplast genomes are similar among the cucurbit species, mitochondrial genomes show a wide variety of sizes, with a non

  9. The walnut (Juglans regia) genome sequence reveals diversity in genes coding for the biosynthesis of non-structural polyphenols.

    Science.gov (United States)

    Martínez-García, Pedro J; Crepeau, Marc W; Puiu, Daniela; Gonzalez-Ibeas, Daniel; Whalen, Jeanne; Stevens, Kristian A; Paul, Robin; Butterfield, Timothy S; Britton, Monica T; Reagan, Russell L; Chakraborty, Sandeep; Walawage, Sriema L; Vasquez-Gross, Hans A; Cardeno, Charis; Famula, Randi A; Pratt, Kevin; Kuruganti, Sowmya; Aradhya, Mallikarjuna K; Leslie, Charles A; Dandekar, Abhaya M; Salzberg, Steven L; Wegrzyn, Jill L; Langley, Charles H; Neale, David B

    2016-09-01

    The Persian walnut (Juglans regia L.), a diploid species native to the mountainous regions of Central Asia, is the major walnut species cultivated for nut production and is one of the most widespread tree nut species in the world. The high nutritional value of J. regia nuts is associated with a rich array of polyphenolic compounds, whose complete biosynthetic pathways are still unknown. A J. regia genome sequence was obtained from the cultivar 'Chandler' to discover target genes and additional unknown genes. The 667-Mbp genome was assembled using two different methods (SOAPdenovo2 and MaSuRCA), with an N50 scaffold size of 464 955 bp (based on a genome size of 606 Mbp), 221 640 contigs and a GC content of 37%. Annotation with MAKER-P and other genomic resources yielded 32 498 gene models. Previous studies in walnut relying on tissue-specific methods have only identified a single polyphenol oxidase (PPO) gene (JrPPO1). Enabled by the J. regia genome sequence, a second homolog of PPO (JrPPO2) was discovered. In addition, about 130 genes in the large gallate 1-β-glucosyltransferase (GGT) superfamily were detected. Specifically, two genes, JrGGT1 and JrGGT2, were significantly homologous to the GGT from Quercus robur (QrGGT), which is involved in the synthesis of 1-O-galloyl-β-d-glucose, a precursor for the synthesis of hydrolysable tannins. The reference genome for J. regia provides meaningful insight into the complex pathways required for the synthesis of polyphenols. The walnut genome sequence provides important tools and methods to accelerate breeding and to facilitate the genetic dissection of complex traits.
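    The N50 statistic quoted in this record is computed against a stated genome size (606 Mbp) rather than against the assembly total. A minimal sketch of that calculation is given below, under assumptions: the function name n50 and its total parameter are illustrative, the scaffold lengths are invented toy values, and this is not the procedure used by SOAPdenovo2 or MaSuRCA.

```python
def n50(lengths, total=None):
    """Return the length L such that sequences of length >= L
    together cover at least half of `total`.

    If `total` is omitted, the sum of `lengths` is used (the usual N50).
    Passing an estimated genome size gives the genome-size-based variant.
    Returns None when the sequences cover less than half of `total`.
    """
    if total is None:
        total = sum(lengths)
    half = total / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return None

# Hypothetical scaffold lengths in bp, for illustration only.
toy_scaffolds = [100, 80, 60, 40, 20]
assembly_n50 = n50(toy_scaffolds)               # half of 300 is reached at the 80 bp scaffold
genome_based_n50 = n50(toy_scaffolds, total=400)  # half of 400 is reached at the 60 bp scaffold
```

    Because the reference total changes the halfway point, the same set of scaffolds can yield different values, which is why the abstract states the genome size alongside the N50.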

  10. Complete genome sequence of the biocontrol strain Pseudomonas protegens Cab57 discovered in Japan reveals strain-specific diversity of this species.

    Directory of Open Access Journals (Sweden)

    Kasumi Takeuchi

    Full Text Available The biocontrol strain Pseudomonas sp. Cab57 was isolated from the rhizosphere of shepherd's purse growing in a field in Hokkaido by screening the antibiotic producers. The whole genome sequence of this strain was obtained by paired-end and whole-genome shotgun sequencing, and the gaps between the contigs were closed using gap-spanning PCR products. The P. sp. Cab57 genome is organized into a single circular chromosome with 6,827,892 bp, 63.3% G+C content, and 6,186 predicted protein-coding sequences. Based on 16S rRNA gene analysis and whole genome analysis, strain Cab57 was identified as P. protegens. As reported in P. protegens CHA0 and Pf-5, four gene clusters (phl, prn, plt, and hcn) encoding the typical antibiotic metabolites and the reported genes associated with the Gac/Rsm signal transduction pathway of these strains are fully conserved in the Cab57 genome. Indeed, strain Cab57 exhibited typical Gac/Rsm activities and antibiotic production, and these activities were enhanced by knocking out the retS gene (for a sensor kinase acting as an antagonist of GacS). Two large segments (79 and 115 kb) lacking in the Cab57 genome, as compared with the Pf-5 genome, accounted for the majority of the difference (247 kb) between these genomes. One of these segments was the complete rhizoxin analog biosynthesis gene cluster (ca. 79 kb) and the other was the 115-kb mobile genomic island. A whole genome comparison of those relative strains revealed that each strain has unique gene clusters involved in metabolism such as nitrite/nitrate assimilation, which was identified in the Cab57 genome. These findings suggest that P. protegens is a ubiquitous bacterium that controls its biocontrol traits while building up strain-specific genomic repertoires for the biosynthesis of secondary metabolites and niche adaptation.

  11. Candidate genes revealed by a genome scan for mosquito resistance to a bacterial insecticide: sequence and gene expression variations

    Directory of Open Access Journals (Sweden)

    David Jean-Philippe

    2009-11-01

    Full Text Available Background: Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often helpless at identifying the specific gene(s) or mutation(s) targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim to look for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti) insecticidal toxins. Results: The genomes of a Bti-resistant and a Bti-susceptible strain were surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations), as well as a significant under-expression compared to the susceptible strain. Conclusion: Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full

  12. Malaria Genome Sequencing Project

    Science.gov (United States)

    2004-01-01

    million cases and up to 2.7 million deaths from malaria each year. The mortality levels are greatest in sub-Saharan Africa... A whole chromosome shotgun sequencing strategy was used to determine the genome sequence of P. falciparum clone 3D7. This...aminolevulinic acid dehydratase. Curr. Genet. 40, 391-398 (2002). 15. Lasonder, E. et al. Analysis of the Plasmodium falciparum proteome by high-accuracy mass

  13. Genome sequencing conference II

    Energy Technology Data Exchange (ETDEWEB)

    1990-01-01

    Genome Sequencing Conference 2 was held September 30 to October 30, 1990. 26 speaker abstracts and 33 poster presentations were included in the program report. New and improved methods for DNA sequencing and genetic mapping were presented. Many of the papers were concerned with accuracy and speed of acquisition of data with computers and automation playing an increasing role. Individual papers have been processed separately for inclusion on the database.

  14. Clinical whole-genome sequencing in severe early-onset epilepsy reveals new genes and improves molecular diagnosis.

    Science.gov (United States)

    Martin, Hilary C; Kim, Grace E; Pagnamenta, Alistair T; Murakami, Yoshiko; Carvill, Gemma L; Meyer, Esther; Copley, Richard R; Rimmer, Andrew; Barcia, Giulia; Fleming, Matthew R; Kronengold, Jack; Brown, Maile R; Hudspith, Karl A; Broxholme, John; Kanapin, Alexander; Cazier, Jean-Baptiste; Kinoshita, Taroh; Nabbout, Rima; Bentley, David; McVean, Gil; Heavin, Sinéad; Zaiwalla, Zenobia; McShane, Tony; Mefford, Heather C; Shears, Deborah; Stewart, Helen; Kurian, Manju A; Scheffer, Ingrid E; Blair, Edward; Donnelly, Peter; Kaczmarek, Leonard K; Taylor, Jenny C

    2014-06-15

    In severe early-onset epilepsy, precise clinical and molecular genetic diagnosis is complex, as many metabolic and electro-physiological processes have been implicated in disease causation. The clinical phenotypes share many features such as complex seizure types and developmental delay. Molecular diagnosis has historically been confined to sequential testing of candidate genes known to be associated with specific sub-phenotypes, but the diagnostic yield of this approach can be low. We conducted whole-genome sequencing (WGS) on six patients with severe early-onset epilepsy who had previously been refractory to molecular diagnosis, and their parents. Four of these patients had a clinical diagnosis of Ohtahara Syndrome (OS) and two patients had severe non-syndromic early-onset epilepsy (NSEOE). In two OS cases, we found de novo non-synonymous mutations in the genes KCNQ2 and SCN2A. In a third OS case, WGS revealed paternal isodisomy for chromosome 9, leading to identification of the causal homozygous missense variant in KCNT1, which produced a substantial increase in potassium channel current. The fourth OS patient had a recessive mutation in PIGQ that led to exon skipping and defective glycophosphatidyl inositol biosynthesis. The two patients with NSEOE had likely pathogenic de novo mutations in CBL and CSNK1G1, respectively. Mutations in these genes were not found among 500 additional individuals with epilepsy. This work reveals two novel genes for OS, KCNT1 and PIGQ. It also uncovers unexpected genetic mechanisms and emphasizes the power of WGS as a clinical tool for making molecular diagnoses, particularly for highly heterogeneous disorders.

  15. Next-Generation Sequencing Techniques Reveal that Genomic Imprinting Is Absent in Day-Old Gallus gallus domesticus Brains.

    Science.gov (United States)

    Wang, Qiong; Li, Kaiyang; Zhang, Daixi; Li, Junying; Xu, Guiyun; Zheng, Jiangxia; Yang, Ning; Qu, Lujiang

    2015-01-01

    Genomic imprinting is a phenomenon characterized by parent-of-origin-specific gene expression. While widely documented in viviparous mammals and plants, imprinting in oviparous birds remains controversial. Because genomic imprinting is temporal- and tissue-specific, we investigated this phenomenon only in the brain tissues of 1-day-old chickens (Gallus gallus). We used next-generation sequencing technology to compare four transcriptomes pooled from 11 chickens, generated from reciprocally crossed families, to the DNA sequences of their parents. Candidate imprinted genes were then selected from these sequence alignments and subjected to verification experiments that excluded all but one SNP. Subsequent experiments performed with two new sets of reciprocally crossed families resulted in the exclusion of that candidate SNP as well. Attempts to find evidence of genomic imprinting from long non-coding RNAs yielded negative results. We therefore conclude that genomic imprinting is absent in the brains of 1-day-old chickens. However, due to the temporal and tissue specificity of imprinting, our results cannot be extended to all growth stages and tissue types.

  16. A Year of Infection in the Intensive Care Unit: Prospective Whole Genome Sequencing of Bacterial Clinical Isolates Reveals Cryptic Transmissions and Novel Microbiota.

    Science.gov (United States)

    Roach, David J; Burton, Joshua N; Lee, Choli; Stackhouse, Bethany; Butler-Wu, Susan M; Cookson, Brad T; Shendure, Jay; Salipante, Stephen J

    2015-07-01

    Bacterial whole genome sequencing holds promise as a disruptive technology in clinical microbiology, but it has not yet been applied systematically or comprehensively within a clinical context. Here, over the course of one year, we performed prospective collection and whole genome sequencing of nearly all bacterial isolates obtained from a tertiary care hospital's intensive care units (ICUs). This unbiased collection of 1,229 bacterial genomes from 391 patients enables detailed exploration of several features of clinical pathogens. A sizable fraction of isolates identified as clinically relevant corresponded to previously undescribed species: 12% of isolates assigned a species-level classification by conventional methods actually qualified as distinct, novel genomospecies on the basis of genomic similarity. Pan-genome analysis of the most frequently encountered pathogens in the collection revealed substantial variation in pan-genome size (1,420 to 20,432 genes) and the rate of gene discovery (1 to 152 genes per isolate sequenced). Surprisingly, although potential nosocomial transmission of actively surveilled pathogens was rare, 8.7% of isolates belonged to genomically related clonal lineages that were present among multiple patients, usually with overlapping hospital admissions, and were associated with clinically significant infection in 62% of patients from which they were recovered. Multi-patient clonal lineages were particularly evident in the neonatal care unit, where seven separate Staphylococcus epidermidis clonal lineages were identified, including one lineage associated with bacteremia in 5/9 neonates. Our study highlights key differences in the information made available by conventional microbiological practices versus whole genome sequencing, and motivates the further integration of microbial genome sequencing into routine clinical care.

  17. The genome sequence of the rumen methanogen Methanobrevibacter ruminantium reveals new possibilities for controlling ruminant methane emissions.

    Directory of Open Access Journals (Sweden)

    Sinead C Leahy

    Full Text Available BACKGROUND: Methane (CH4) is a potent greenhouse gas (GHG), having a global warming potential 21 times that of carbon dioxide (CO2). Methane emissions from agriculture represent around 40% of the emissions produced by human-related activities, the single largest source being enteric fermentation, mainly in ruminant livestock. Technologies to reduce these emissions are lacking. Ruminant methane is formed by the action of methanogenic archaea typified by Methanobrevibacter ruminantium, which is present in ruminants fed a wide variety of diets worldwide. To gain more insight into the lifestyle of a rumen methanogen, and to identify genes and proteins that can be targeted to reduce methane production, we have sequenced the 2.93 Mb genome of M. ruminantium M1, the first rumen methanogen genome to be completed. METHODOLOGY/PRINCIPAL FINDINGS: The M1 genome was sequenced, annotated and subjected to comparative genomic and metabolic pathway analyses. Conserved and methanogen-specific gene sets suitable as targets for vaccine development or chemogenomic-based inhibition of rumen methanogens were identified. The feasibility of using a synthetic peptide-directed vaccinology approach to target epitopes of methanogen surface proteins was demonstrated. A prophage genome was described and its lytic enzyme, endoisopeptidase PeiR, was shown to lyse M1 cells in pure culture. A predicted stimulation of M1 growth by alcohols was demonstrated and microarray analyses indicated up-regulation of methanogenesis genes during co-culture with a hydrogen (H2) producing rumen bacterium. We also report the discovery of non-ribosomal peptide synthetases in M. ruminantium M1, the first reported in archaeal species. CONCLUSIONS/SIGNIFICANCE: The M1 genome sequence provides new insights into the lifestyle and cellular processes of this important rumen methanogen. It also defines vaccine and chemogenomic targets for broad inhibition of rumen methanogens and represents a significant

  18. Bos taurus genome sequence reveals the assortment of immunoglobulin and surrogate light chain genes in domestic cattle

    Directory of Open Access Journals (Sweden)

    Liljavirta Jenni

    2009-04-01

    Full Text Available Abstract Background The assortment of cattle immunoglobulin and surrogate light chain genes has been extracted from version 3.1 of the Bos taurus genome sequence as a part of an international effort to sequence and annotate the bovine genome. Results 63 variable lambda chain and 22 variable kappa chain genes were identified and phylogenetically assigned to 8 and 4 subgroups, respectively. The specified phylogenetic relationships are compatible with the established ruminant light chain variable gene families or subgroups. Because of gaps and uncertainties in the assembled genome sequence, the number of genes might change in future versions of the genome sequence. In addition, three bovine surrogate light chain genes were identified. The corresponding cDNAs were cloned and the expression of the surrogate light chain genes was demonstrated from fetal material. Conclusion The bovine kappa gene locus is compact and simple, which may reflect the preferential use of the lambda chain in cattle. The relative orientation of variable and joining genes in both loci is consistent with a deletion mechanism in VJ joining. The orientation of some variable genes cannot be determined from the data available. The number of functional variable genes is moderate when compared to man or mouse. Thus, post-recombinatorial mechanisms might contribute to the generation of the bovine pre-immune antibody repertoire. The heavy chains probably contribute more to recombinational immunoglobulin repertoire diversity than the light chains, but the heavy chain locus could not be annotated from version 3.1 of the Bos taurus genome.

  19. Genome-wide analyses of Epstein-Barr virus reveal conserved RNA structures and a novel stable intronic sequence RNA

    OpenAIRE

    2013-01-01

    Background Epstein-Barr virus (EBV) is a human herpesvirus implicated in cancer and autoimmune disorders. Little is known concerning the roles of RNA structure in this important human pathogen. This study provides the first comprehensive genome-wide survey of RNA and RNA structure in EBV. Results Novel EBV RNAs and RNA structures were identified by computational modeling and RNA-Seq analyses of EBV. Scans of the genomic sequences of four EBV strains (EBV-1, EBV-2, GD1, and GD2) and of the clo...

  20. Whole genome sequence of two Rathayibacter toxicus strains reveals a tunicamycin biosynthetic cluster similar to Streptomyces chartreusis

    Science.gov (United States)

    Sechler, Aaron J.; Tancos, Matthew A.; Schneider, David J.; King, Jonas G.; Fennessey, Christine M.; Schroeder, Brenda K.; Murray, Timothy D.; Luster, Douglas G.; Schneider, William L.

    2017-01-01

    Rathayibacter toxicus is a forage grass associated Gram-positive bacterium of major concern to food safety and agriculture. This species is listed by USDA-APHIS as a plant pathogen select agent because it produces a tunicamycin-like toxin that is lethal to livestock and may be vectored by nematode species native to the U.S. The complete genomes of two strains of R. toxicus, including the type strain FH-79, were sequenced and analyzed in comparison with all available, complete R. toxicus genomes. Genome sizes ranged from 2,343,780 to 2,394,755 nucleotides, with 2079 to 2137 predicted open reading frames; all four strains showed remarkable synteny over nearly the entire genome, with only a small transposed region. A cluster of genes with similarity to the tunicamycin biosynthetic cluster from Streptomyces chartreusis was identified. The tunicamycin gene cluster (TGC) in R. toxicus contained 14 genes in two transcriptional units, with all of the functional elements for tunicamycin biosynthesis present. The TGC had a significantly lower GC content (52%) than the rest of the genome (61.5%), suggesting that the TGC may have originated from a horizontal transfer event. Further analysis indicated numerous remnants of other potential horizontal transfer events are present in the genome. In addition to the TGC, genes potentially associated with carotenoid and exopolysaccharide production, bacteriocins and secondary metabolites were identified. A CRISPR array is evident. There were relatively few plant-associated cell-wall hydrolyzing enzymes, but there were numerous secreted serine proteases that share sequence homology to the pathogenicity-associated protein Pat-1 of Clavibacter michiganensis. Overall, the genome provides clear insight into the possible mechanisms for toxin production in R. toxicus, providing a basis for future genetic approaches. PMID:28796837
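
    The GC-content argument in the record above (a 52% cluster inside a 61.5% genome) can be illustrated with a sliding-window scan. The following is a minimal sketch, not the authors' method: the function names, the window, step and margin parameters, and the toy sequence are all assumptions made for illustration.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def low_gc_windows(genome, window, step, margin):
    """Return (start, gc) for windows whose GC fraction lies at least
    `margin` below the genome-wide GC fraction (hypothetical threshold)."""
    overall = gc_fraction(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_fraction(genome[start:start + window])
        if overall - gc >= margin:
            hits.append((start, gc))
    return hits


# Toy sequence (not real data): a GC-rich background with an AT-rich insert.
toy_genome = "GC" * 50 + "AT" * 10 + "GC" * 50
hits = low_gc_windows(toy_genome, window=20, step=10, margin=0.3)
print(hits)
```

    A region flagged this way is only a candidate for horizontal acquisition; compositional deviation alone does not establish the origin of a gene cluster.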

  1. Whole Genome Sequencing of the Symbiont Pseudovibrio sp. from the Intertidal Marine Sponge Polymastia penicillus Revealed a Gene Repertoire for Host-Switching Permissive Lifestyle.

    Science.gov (United States)

    Alex, Anoop; Antunes, Agostinho

    2015-10-31

    Sponges harbor a complex consortium of microbial communities living in symbiotic relationship benefiting each other through the integration of metabolites. The mechanisms influencing a successful microbial association with a sponge partner are yet to be fully understood. Here, we sequenced the genome of Pseudovibrio sp. POLY-S9 strain isolated from the intertidal marine sponge Polymastia penicillus sampled from the Atlantic coast of Portugal to identify the genomic features favoring the symbiotic relationship. The draft genome revealed an exceptionally large genome size of 6.6 Mbp compared with the previously reported genomes of the genus Pseudovibrio isolated from a coral and a sponge larva. Our genomic study detected the presence of several biosynthetic gene clusters (polyketide synthase, nonribosomal peptide synthetase and siderophore), affirming the potential ability of the genus Pseudovibrio to produce a wide variety of metabolic compounds. Moreover, we identified a repertoire of genes encoding adaptive symbiosis factors (eukaryotic-like proteins), such as the ankyrin repeats, tetratrico peptide repeats, and Sel1 repeats that improve the attachment to the eukaryotic hosts and the avoidance of the host's immune response. The genome also harbored a large number of mobile elements (∼5%) and gene transfer agents, which explains the massive genome expansion and suggests a possible mechanism of horizontal gene transfer. In conclusion, the genome of POLY-S9 exhibited an increase in size, number of mobile DNA, multiple metabolite gene clusters, and secretion systems, likely to influence the genome diversification and the evolvability.

  2. Ion torrent next-generation sequencing reveals the complete mitochondrial genome of endangered mahseer Tor khudree (Sykes, 1839).

    Science.gov (United States)

    Raman, Sudhanshu; Pavan-Kumar, A; Koringa, Prakash G; Patel, Namrata; Shah, Tejas; Singh, Rajeev K; Krishna, Gopal; Joshi, C G; Gireesh-Babu, P; Chaudhari, Aparna; Lakra, W S

    2016-07-01

    The complete mitochondrial genome of an endangered mahseer (Deccan mahseer), Tor khudree, was sequenced using the Ion Torrent platform for the first time. The genome sequence was 16 573 bp in size, and consisted of 13 protein coding genes, 22 tRNAs, 2 rRNA genes and 1 control region. The gene organization and its order were similar to other vertebrates. The overall base composition was A: 31.9%, G: 15.6%, C: 27.68%, T: 24.76%, A + T content 56.6% and the G + C content 43.32%. The phylogenetic tree constructed using a maximum likelihood model showed sister relationship between T. khudree and Tor tambroides.
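
    Base-composition figures of the kind reported above are simple per-base percentages. As a minimal sketch (the function name and the toy sequence are assumptions, and this is not the pipeline used in the study), they could be computed as follows:

```python
from collections import Counter


def base_composition(seq):
    """Percentages of A, C, G and T, plus A+T and G+C content.

    Symbols other than A, C, G, T (for example N) are ignored.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    pct = {base: 100.0 * counts[base] / total for base in "ACGT"}
    pct["A+T"] = pct["A"] + pct["T"]
    pct["G+C"] = pct["G"] + pct["C"]
    return pct


# Toy sequence (not the T. khudree mitogenome): 4 A, 3 C, 2 G, 1 T.
composition = base_composition("AAAACCCGGT")
print(composition)
```

    Applied to the rounded per-base values quoted in the record, A + T is 31.9 + 24.76 = 56.66 and G + C is 15.6 + 27.68 = 43.28, which is close to, though not identical with, the stated totals; the small gap presumably reflects rounding in the source.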

  3. The genome sequence of the leaf-cutter ant Atta cephalotes reveals insights into its obligate symbiotic lifestyle.

    Directory of Open Access Journals (Sweden)

    Garret Suen

    Full Text Available Leaf-cutter ants are one of the most important herbivorous insects in the Neotropics, harvesting vast quantities of fresh leaf material. The ants use leaves to cultivate a fungus that serves as the colony's primary food source. This obligate ant-fungus mutualism is one of the few occurrences of farming by non-humans and likely facilitated the formation of their massive colonies. Mature leaf-cutter ant colonies contain millions of workers ranging in size from small garden tenders to large soldiers, resulting in one of the most complex polymorphic caste systems within ants. To begin uncovering the genomic underpinnings of this system, we sequenced the genome of Atta cephalotes using 454 pyrosequencing. One prediction from this ant's lifestyle is that it has undergone genetic modifications that reflect its obligate dependence on the fungus for nutrients. Analysis of this genome sequence is consistent with this hypothesis, as we find evidence for reductions in genes related to nutrient acquisition. These include extensive reductions in serine proteases (which are likely unnecessary because proteolysis is not a primary mechanism used to process nutrients obtained from the fungus), a loss of genes involved in arginine biosynthesis (suggesting that this amino acid is obtained from the fungus), and the absence of a hexamerin (which sequesters amino acids during larval development in other insects). Following recent reports of genome sequences from other insects that engage in symbioses with beneficial microbes, the A. cephalotes genome provides new insights into the symbiotic lifestyle of this ant and advances our understanding of host-microbe symbioses.

  4. Sequencing the maize genome.

    Science.gov (United States)

    Martienssen, Robert A; Rabinowicz, Pablo D; O'Shaughnessy, Andrew; McCombie, W Richard

    2004-04-01

    Sequencing of complex genomes can be accomplished by enriching shotgun libraries for genes. In maize, gene-enrichment by copy-number normalization (high C(0)t) and methylation filtration (MF) have been used to generate up to two-fold coverage of the gene-space with less than 1 million sequencing reads. Simulations using sequenced bacterial artificial chromosome (BAC) clones predict that 5x coverage of gene-rich regions, accompanied by less than 1x coverage of subclones from BAC contigs, will generate high-quality mapped sequence that meets the needs of geneticists while accommodating unusually high levels of structural polymorphism. By sequencing several inbred strains, we propose a strategy for capturing this polymorphism to investigate hybrid vigor or heterosis.

  5. Complete Sequencing and Pan-Genomic Analysis of Lactobacillus delbrueckii subsp. bulgaricus Reveal Its Genetic Basis for Industrial Yogurt Production

    OpenAIRE

    Pei Hao; Huajun Zheng; Yao Yu; Guohui Ding; Wenyi Gu; Shuting Chen; Zhonghao Yu; Shuangxi Ren; Munehiro Oda; Tomonobu Konno; Shengyue Wang; Xuan Li; Zai-Si Ji; Guoping Zhao

    2011-01-01

    Lactobacillus delbrueckii subsp. bulgaricus (Lb. bulgaricus) is an important species of Lactic Acid Bacteria (LAB) used for cheese and yogurt fermentation. The genome of Lb. bulgaricus 2038, an industrial strain mainly used for yogurt production, was completely sequenced and compared against the other two ATCC collection strains of the same subspecies. Specific physiological properties of strain 2038, such as lysine biosynthesis, formate production, aspartate-related carbon-skeleton intermedi...

  6. Whole genome sequencing reveals complex evolution patterns of multidrug-resistant Mycobacterium tuberculosis Beijing strains in patients.

    Directory of Open Access Journals (Sweden)

    Matthias Merker

    Full Text Available Multidrug-resistant (MDR) Mycobacterium tuberculosis complex (MTBC) strains represent a major threat for tuberculosis (TB) control. Treatment of MDR-TB patients is long and less effective, resulting in a significant number of treatment failures. The development of further resistances leads to extensively drug-resistant (XDR) variants. However, data on the individual reasons for treatment failure, e.g. an induced mutational burst, and on the evolution of bacteria in the patient are only sparsely available. To address this question, we investigated the intra-patient evolution of serial MTBC isolates obtained from three MDR-TB patients undergoing longitudinal treatment, finally leading to XDR-TB. Sequential isolates displayed identical IS6110 fingerprint patterns, suggesting the absence of exogenous re-infection. We utilized whole genome sequencing (WGS) to screen for variations in three isolates from Patient A and four isolates from Patient B and C, respectively. Acquired polymorphisms were subsequently validated in up to 15 serial isolates by Sanger sequencing. We determined eight (Patient A) and nine (Patient B) polymorphisms, which occurred in a stepwise manner during the course of the therapy and were linked to resistance or a potential compensatory mechanism. For both patients, our analysis revealed the long-term co-existence of clonal subpopulations that displayed different drug resistance allele combinations. Out of these, the most resistant clone was fixed in the population. In contrast, baseline and follow-up isolates of Patient C were distinguished each by eleven unique polymorphisms, indicating an exogenous re-infection with an XDR strain not detected by IS6110 RFLP typing. Our study demonstrates that intra-patient microevolution of MDR-MTBC strains under longitudinal treatment is more complex than previously anticipated. However, a mutator phenotype was not detected. The presence of different subpopulations might confound phenotypic and

  7. Comparative sequence analyses of genome and transcriptome reveal novel transcripts and variants in the Asian elephant Elephas maximus

    Indian Academy of Sciences (India)

    Puli Chandramouli Reddy; Ishani Sinha; Ashwin Kelkar; Farhat Habib; Saurabh J Pradhan; Raman Sukumar; Sanjeev Galande

    2015-12-01

    The Asian elephant Elephas maximus and the African elephant Loxodonta africana that diverged 5-7 million years ago exhibit differences in their physiology, behaviour and morphology. A comparative genomics approach would be useful and necessary for evolutionary and functional genetic studies of elephants. We performed sequencing of E. maximus and mapped it to L. africana at ~15X coverage. Through comparative sequence analyses, we have identified Asian elephant specific homozygous, non-synonymous single nucleotide variants (SNVs) that map to 1514 protein coding genes, many of which are involved in olfaction. We also present the first report of a high-coverage transcriptome sequence in E. maximus from peripheral blood lymphocytes. We have identified 103 novel protein coding transcripts and 66 long non-coding (lnc)RNAs. We also report the presence of 181 protein domains unique to elephants when compared to other Afrotheria species. Each of these findings can be further investigated to gain a better understanding of functional differences unique to elephant species, as well as those unique to elephantids in comparison with other mammals. This work therefore provides a valuable resource to explore the immense research potential of comparative analyses of transcriptome and genome sequences in the Asian elephant.

  8. Comparative sequence analyses of genome and transcriptome reveal novel transcripts and variants in the Asian elephant Elephas maximus.

    Science.gov (United States)

    Reddy, Puli Chandramouli; Sinha, Ishani; Kelkar, Ashwin; Habib, Farhat; Pradhan, Saurabh J; Sukumar, Raman; Galande, Sanjeev

    2015-12-01

    The Asian elephant Elephas maximus and the African elephant Loxodonta africana that diverged 5-7 million years ago exhibit differences in their physiology, behaviour and morphology. A comparative genomics approach would be useful and necessary for evolutionary and functional genetic studies of elephants. We performed sequencing of E. maximus and mapped it to L. africana at ~15X coverage. Through comparative sequence analyses, we have identified Asian elephant specific homozygous, non-synonymous single nucleotide variants (SNVs) that map to 1514 protein coding genes, many of which are involved in olfaction. We also present the first report of a high-coverage transcriptome sequence in E. maximus from peripheral blood lymphocytes. We have identified 103 novel protein coding transcripts and 66 long non-coding (lnc)RNAs. We also report the presence of 181 protein domains unique to elephants when compared to other Afrotheria species. Each of these findings can be further investigated to gain a better understanding of functional differences unique to elephant species, as well as those unique to elephantids in comparison with other mammals. This work therefore provides a valuable resource to explore the immense research potential of comparative analyses of transcriptome and genome sequences in the Asian elephant.

  9. Whole genome sequencing of mutation accumulation lines reveals a low mutation rate in the social amoeba Dictyostelium discoideum.

    Directory of Open Access Journals (Sweden)

    Gerda Saxer

    Full Text Available Spontaneous mutations play a central role in evolution. Despite their importance, mutation rates are some of the most elusive parameters to measure in evolutionary biology. The combination of mutation accumulation (MA) experiments and whole-genome sequencing now makes it possible to estimate mutation rates by directly observing new mutations at the molecular level across the whole genome. We performed an MA experiment with the social amoeba Dictyostelium discoideum and sequenced the genomes of three randomly chosen lines using high-throughput sequencing to estimate the spontaneous mutation rate in this model organism. The mitochondrial mutation rate of 6.76×10^-9, with a Poisson confidence interval of 4.1×10^-9 to 9.5×10^-9, per nucleotide per generation is slightly lower than estimates for other taxa. The mutation rate estimate for the nuclear DNA of 2.9×10^-11, with a Poisson confidence interval ranging from 7.4×10^-13 to 1.6×10^-10, is the lowest reported for any eukaryote. These results are consistent with low microsatellite mutation rates previously observed in D. discoideum and low levels of genetic variation observed in wild D. discoideum populations. In addition, D. discoideum has been shown to be quite resistant to DNA damage, which suggests an efficient DNA-repair mechanism that could be an adaptation to life in soil and frequent exposure to intracellular and extracellular mutagenic compounds. The social aspect of the life cycle of D. discoideum and a large portion of the genome under relaxed selection during vegetative growth could also select for a low mutation rate. This hypothesis is supported by a significantly lower mutation rate per cell division in multicellular eukaryotes compared with unicellular eukaryotes.
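
    A per-nucleotide, per-generation mutation rate with a Poisson confidence interval, as quoted in the record above, is typically obtained by dividing an observed mutation count and its Poisson bounds by the number of site-generations surveyed. The sketch below assumes the exact (Garwood-type) Poisson interval, which the abstract does not specify; the function names and the site and generation numbers are hypothetical and are not taken from the study.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); intended for small counts."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def _bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


def poisson_exact_ci(k, alpha=0.05):
    """Exact two-sided confidence interval for a Poisson mean given count k."""
    half = alpha / 2
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= k | lam) = alpha / 2
        lower = _bisect(lambda lam: 1 - poisson_cdf(k - 1, lam) - half, 0.0, float(k))
    # Upper bound: P(X <= k | lam) = alpha / 2
    upper_start = k + 10 * math.sqrt(k + 1) + 10
    upper = _bisect(lambda lam: poisson_cdf(k, lam) - half, float(k), upper_start)
    return lower, upper


def per_site_rate(k, sites, generations, alpha=0.05):
    """Point estimate and interval for mutations per site per generation."""
    denom = sites * generations
    lower, upper = poisson_exact_ci(k, alpha)
    return k / denom, lower / denom, upper / denom


# Hypothetical inputs: 1 mutation over 30 million sites and 1,000 generations.
estimate, low, high = per_site_rate(1, sites=30_000_000, generations=1_000)
print(estimate, low, high)
```

    With a single observed event the exact 95% bounds are roughly 0.025 and 5.57 times the count, which is why intervals built on very few mutations span more than two orders of magnitude. The nuclear interval quoted above has a similar spread, although the abstract does not state the underlying count.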

  10. The Subclonal Structure and Genomic Evolution of Oral Squamous Cell Carcinoma Revealed by Ultra-deep Sequencing

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    Background: Oral squamous cell carcinoma (OSCC), a subgroup of head and neck squamous cell carcinoma (HNSCC), is primarily caused by alcohol consumption and tobacco use. Recent DNA sequencing studies suggest that HNSCC are very heterogeneous between patients; however the intra-patient subclonal...... structure remains unexplored due to lack of sampling multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with ultra-deep targeted sequencing on biopsies from 5 stage IV...... complex subclonal architectures comprising distinct subclones only found in geographically distinct regions of the tumors. The metastatic potential of the tumor is acquired early in the tumor evolution, as indicated by the lymph node sharing the majority of the mutations with the tumor biopsies, while...

  11. Whole genome sequencing and analysis of Campylobacter coli YH502 from retail chicken reveals a plasmid-borne type VI secretion system

    Directory of Open Access Journals (Sweden)

    Sandeep Ghatak

    2017-03-01

    Full Text Available Campylobacter is a major cause of foodborne illnesses worldwide. Campylobacter infections, commonly caused by ingestion of undercooked poultry and meat products, can lead to gastroenteritis and chronic reactive arthritis in humans. Whole genome sequencing (WGS) is a powerful technology that provides comprehensive genetic information about bacteria and is increasingly being applied to study foodborne pathogens: e.g., evolution, epidemiology/outbreak investigation, and detection. Herein we report the complete genome sequence of Campylobacter coli strain YH502 isolated from retail chicken in the United States. WGS, de novo assembly, and annotation of the genome revealed a chromosome of 1,718,974 bp and a mega-plasmid (pCOS502) of 125,964 bp. GC content of the genome was 31.2% with 1931 coding sequences and 53 non-coding RNAs. Multiple virulence factors including a plasmid-borne type VI secretion system and antimicrobial resistance genes (beta-lactams, fluoroquinolones, and aminoglycosides) were found. The presence of T6SS in a mobile genetic element (plasmid) suggests plausible horizontal transfer of these virulence genes to other organisms. The C. coli YH502 genome also harbors CRISPR sequences and associated proteins. Phylogenetic analysis based on average nucleotide identity and single nucleotide polymorphisms identified closely related C. coli genomes available in the NCBI database. Taken together, the analyzed genomic data of this potentially virulent strain of C. coli will facilitate further understanding of this important foodborne pathogen, most likely leading to better control strategies. The chromosome and plasmid sequences of C. coli YH502 have been deposited in GenBank under the accession numbers CP018900.1 and CP018901.1, respectively.

  12. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima

    DEFF Research Database (Denmark)

    Chipman, Ariel D.; Ferrier, David E.K.; Brena, Carlo

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We pres...

  13. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima

    DEFF Research Database (Denmark)

    Chipman, Ariel D.; Ferrier, David E.K.; Brena, Carlo;

    2014-01-01

    many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air......Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We...... present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates...

  14. The Genomic Sequence of the Oral Pathobiont Strain NI1060 Reveals Unique Strategies for Bacterial Competition and Pathogenicity.

    Directory of Open Access Journals (Sweden)

    Youssef Darzi

    Full Text Available Strain NI1060 is an oral bacterium responsible for periodontitis in a murine ligature-induced disease model. To better understand its pathogenicity, we have determined the complete sequence of its 2,553,982 bp genome. Although NI1060 is closely related, based on its 16S rRNA, to Pasteurella pneumotropica, a pneumonia-associated rodent commensal, its genomic content suggests that they are different species thriving on different energy sources via alternative metabolic pathways. Genomic and phylogenetic analyses showed that strain NI1060 is distinct from the genera currently described in the family Pasteurellaceae, and is likely to represent a novel species. In addition, we found putative virulence genes involved in lipooligosaccharide synthesis, adhesins and bacteriotoxic proteins. These genes are potentially important for host adaptation and for the induction of dysbiosis through bacterial competition and pathogenicity. Importantly, strain NI1060 strongly stimulates Nod1, an innate immune receptor, but is defective in two peptidoglycan recycling genes due to a frameshift mutation. The in-depth analysis of its genome thus provides critical insights for the development of NI1060 as a prime model system for infectious disease.

  15. Species-wide whole genome sequencing reveals historical global spread and recent local persistence in Shigella flexneri.

    Science.gov (United States)

    Connor, Thomas R; Barker, Clare R; Baker, Kate S; Weill, François-Xavier; Talukder, Kaisar Ali; Smith, Anthony M; Baker, Stephen; Gouali, Malika; Pham Thanh, Duy; Jahan Azmi, Ishrat; Dias da Silveira, Wanderley; Semmler, Torsten; Wieler, Lothar H; Jenkins, Claire; Cravioto, Alejandro; Faruque, Shah M; Parkhill, Julian; Wook Kim, Dong; Keddy, Karen H; Thomson, Nicholas R

    2015-08-04

    Shigella flexneri is the most common cause of bacterial dysentery in low-income countries. Despite this, S. flexneri remains largely unexplored from a genomic standpoint and is still described using a vocabulary based on serotyping reactions developed over half a century ago. Here we combine whole genome sequencing with geographical and temporal data to examine the natural history of the species. Our analysis subdivides S. flexneri into seven phylogenetic groups (PGs), each containing two or more serotypes and characterised by a distinct virulence gene complement and geographic range. Within the S. flexneri PGs we identify geographically restricted sub-lineages that appear to have persistently colonised regions for many decades to over 100 years. Although we found abundant evidence of antimicrobial resistance (AMR) determinant acquisition, our dataset shows no evidence of subsequent intercontinental spread of antimicrobial resistant strains. The pattern of colonisation and AMR gene acquisition suggests that S. flexneri has a distinct life-cycle involving local persistence.

  16. Comparative analysis of the complete genome sequence of the California MSW strain of myxoma virus reveals potential host adaptations.

    Science.gov (United States)

    Kerr, Peter J; Rogers, Matthew B; Fitch, Adam; Depasse, Jay V; Cattadori, Isabella M; Hudson, Peter J; Tscharke, David C; Holmes, Edward C; Ghedin, Elodie

    2013-11-01

    Myxomatosis is a rapidly lethal disease of European rabbits that is caused by myxoma virus (MYXV). The introduction of a South American strain of MYXV into the European rabbit population of Australia is the classic case of host-pathogen coevolution following cross-species transmission. The most virulent strains of MYXV for European rabbits are the Californian viruses, found in the Pacific states of the United States and the Baja Peninsula, Mexico. The natural host of Californian MYXV is the brush rabbit, Sylvilagus bachmani. We determined the complete sequence of the MSW strain of Californian MYXV and performed a comparative analysis with other MYXV genomes. The MSW genome is larger than that of the South American Lausanne (type) strain of MYXV due to an expansion of the terminal inverted repeats (TIRs) of the genome, with duplication of the M156R, M154L, M153R, M152R, and M151R genes and part of the M150R gene from the right-hand (RH) end of the genome at the left-hand (LH) TIR. Despite the extreme virulence of MSW, no novel genes were identified; five genes were disrupted by multiple indels or mutations to the ATG start codon, including two genes, M008.1L/R and M152R, with major virulence functions in European rabbits, and a sixth gene, M000.5L/R, was absent. The loss of these gene functions suggests that S. bachmani is a relatively recent host for MYXV and that duplication of virulence genes in the TIRs, gene loss, or sequence variation in other genes can compensate for the loss of M008.1L/R and M152R in infections of European rabbits.

  17. Analysis of The Cancer Genome Atlas sequencing data reveals novel properties of the human papillomavirus 16 genome in head and neck squamous cell carcinoma.

    Science.gov (United States)

    Nulton, Tara J; Olex, Amy L; Dozmorov, Mikhail; Morgan, Iain M; Windle, Brad

    2017-03-14

    Human papillomavirus (HPV) DNA is detected in up to 80% of oropharyngeal carcinomas (OPC) and this HPV-positive disease has reached epidemic proportions. To increase our understanding of the disease, we investigated the status of the HPV16 genome in HPV-positive head and neck cancers (HNC). Raw RNA-Seq and Whole Genome Sequence data from The Cancer Genome Atlas HNC samples were analyzed to gain a full understanding of the HPV genome status for these tumors. Several remarkable and novel observations were made following this analysis. Firstly, there are three main HPV genome states in these tumors that are split relatively evenly: an episomal-only state, an integrated state, and a state in which the viral genome exists as a hybrid episome with human DNA. Secondly, none of the tumors expressed high levels of E6; E6*I is the dominant variant expressed in all tumors. The most striking conclusion from this study is that around three quarters of HPV16-positive HNC contain episomal versions of the viral genome that are likely replicating in an E1-E2 dependent manner. The clinical and therapeutic implications of these observations are discussed.

  18. Long-read sequencing improves assembly of Trichinella genomes 10-fold, revealing substantial synteny between lineages diverged over seven million years

    Science.gov (United States)

    Genome evolution influences a parasite's pathogenicity, host-pathogen interactions, environmental constraints, and invasion biology, while genome assemblies form the basis of comparative sequence analyses. Given that closely related organisms typically maintain appreciable synteny, the genome asse...

  19. Genomic library screening for viruses from the human dental plaque revealed pathogen-specific lytic phage sequences.

    Science.gov (United States)

    Al-Jarbou, Ahmed Nasser

    2012-01-01

    Bacterial pathogens present an astounding arsenal of virulence factors that allow them to conquer many different niches throughout the course of infection. Particularly fascinating is the fact that some bacterial species are able to induce different diseases by expressing different combinations of virulence factors. Nevertheless, studies aiming at screening for the presence of bacteriophages in humans have been limited. Such screening procedures would eventually lead to identification of phage-encoded properties that impart increased bacterial fitness and/or virulence in a particular niche, and hence could potentially be used to reverse the course of bacterial infections. The human oral cavity represents a rich and dynamic ecosystem for several upper respiratory tract pathogens; however, little is known about virus diversity in human dental plaque, which is an important reservoir. We applied a culture-independent approach to characterize virus diversity in human dental plaque, making a library from a virus DNA fraction amplified using a multiple displacement method, and sequenced 80 clones. Of the resulting sequences, 44% showed significant identity to GenBank databases by TBLASTX analysis. TBLAST homology comparisons showed that 66% were viral, 18% eukarya, 10% bacterial, and 6% mobile elements. These sequences were sorted into 6 contigs and 45 single sequences, of which 4 contigs and a single sequence showed significant identity to a small region of a putative prophage in the Corynebacterium diphtheriae genome. These findings highlight the uniqueness of over half of the sequences, whilst the dominance of pathogen-specific prophage sequences implies a role in virulence.

  20. Genomic convergence analysis of schizophrenia: mRNA sequencing reveals altered synaptic vesicular transport in post-mortem cerebellum.

    Directory of Open Access Journals (Sweden)

    Joann Mudge

    Full Text Available Schizophrenia (SCZ) is a common, disabling mental illness with high heritability but complex, poorly understood genetic etiology. As the first phase of a genomic convergence analysis of SCZ, we generated 16.7 billion nucleotides of short read, shotgun sequences of cDNA from post-mortem cerebellar cortices of 14 patients and six matched controls. A rigorous analysis pipeline was developed for analysis of digital gene expression studies. Sequences aligned to approximately 33,200 transcripts in each sample, with average coverage of 450 reads per gene. Following adjustments for confounding clinical, sample and experimental sources of variation, 215 genes differed significantly in expression between cases and controls. Golgi apparatus, vesicular transport, membrane association, zinc binding and regulation of transcription were over-represented among differentially expressed genes. Twenty-three genes with altered expression and involvement in presynaptic vesicular transport, Golgi function and GABAergic neurotransmission define a unifying molecular hypothesis for dysfunction in cerebellar cortex in SCZ.

  1. Whole genome re-sequencing reveals genome-wide variations among parental lines of 16 mapping populations in chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K

    2016-01-27

    Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource-poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic (drought, heat, salinity) and biotic (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) stresses and nutritionally important (protein content) traits using a whole genome re-sequencing approach. A total of 192.19 Gb of data, comprising 973.13 million reads, was generated on the 35 genotypes of chickpea, with an average sequencing depth of ~10X for each line. On average, 92.18% of reads from each genotype were aligned to the chickpea reference genome with 82.17% coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected in comparison with the reference chickpea genome. The highest number of SNPs was identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line-specific variations (144,888 SNPs and 19,968 Indels) with the highest percentage were identified in coding regions in ICC 1496 (21%) followed by ICCV 97105 (12%). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV), respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for

  2. Genome sequence reveals that Pseudomonas fluorescens F113 possesses a large and diverse array of systems for rhizosphere function and host interaction

    OpenAIRE

    2013-01-01

    Redondo-Nieto et al.: Genome sequence reveals that Pseudomonas fluorescens F113 possesses a large and diverse array of systems for rhizosphere function and host interaction. BMC Genomics 2013 14:54. The electronic version of this article is the complete one and can be found online at http://www.biomedcentral.com/1471-2164/14/54 Background: Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) isolated from the sugar-beet rhizosphere. This bacterium has been extensiv...

  3. Classifying Genomic Sequences by Sequence Feature Analysis

    Institute of Scientific and Technical Information of China (English)

    Zhi-Hua Liu; Dian Jiao; Xiao Sun

    2005-01-01

    Traditional sequence analysis depends on sequence alignment. In this study, we analyzed various functional regions of the human genome based on sequence features, including word frequency, dinucleotide relative abundance, and base-base correlation. We analyzed human chromosome 22 and classified the upstream, exon, intron, downstream, and intergenic regions by principal component analysis and discriminant analysis of these features. The results show that the functional regions of the genome can be classified based on sequence features and discriminant analysis.
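
    Two of the alignment-free features named in this abstract, word frequency and dinucleotide relative abundance, can be illustrated with a minimal sketch. The abstract does not give the study's exact definitions, so the code below assumes the commonly used single-strand form rho_xy = f_xy / (f_x * f_y); the function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def word_frequencies(seq, k):
    """Relative frequency of every overlapping k-letter word over ACGT.

    Words containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product(BASES, repeat=k)]
    total = sum(counts[w] for w in words)
    return {w: (counts[w] / total if total else 0.0) for w in words}


def dinucleotide_relative_abundance(seq):
    """Assumed definition: rho_xy = f_xy / (f_x * f_y), single strand only."""
    mono = word_frequencies(seq, 1)
    di = word_frequencies(seq, 2)
    rho = {}
    for xy, f_xy in di.items():
        expected = mono[xy[0]] * mono[xy[1]]
        rho[xy] = f_xy / expected if expected else 0.0
    return rho


if __name__ == "__main__":
    rho = dinucleotide_relative_abundance("ACGTACGT")
    print(round(rho["AC"], 3), rho["AA"])
```

    Each region (upstream, exon, intron, and so on) would yield one such feature vector; the vectors could then be passed to a principal component or discriminant analysis, a step not shown here.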

  4. Complete genome sequence of avian paramyxovirus (APMV) serotype 5 completes the analysis of nine APMV serotypes and reveals the longest APMV genome.

    Directory of Open Access Journals (Sweden)

    Arthur S Samuel

    Full Text Available BACKGROUND: Avian paramyxoviruses (APMV) consist of nine known serotypes. The genomes of representatives of all APMV serotypes except APMV type 5 have recently been fully sequenced. Here, we report the complete genome sequence of the APMV-5 prototype strain budgerigar/Kunitachi/74. METHODOLOGY/PRINCIPAL FINDINGS: APMV-5 Kunitachi virus is unusual in that it lacks a virion hemagglutinin and does not grow in the allantoic cavity of embryonated chicken eggs. However, the virus grew in the amniotic cavity of embryonated chicken eggs and in twelve different established cell lines and two primary cell cultures. The genome is 17,262 nucleotides (nt) long, which is the longest among members of genus Avulavirus, and encodes six non-overlapping genes in the order of 3'N-P/V/W-M-F-HN-L-5' with intergenic regions of 4-57 nt. The genome length follows the 'rule of six' and contains a 55-nt leader sequence at the 3' end and a 552-nt trailer sequence at the 5' end. The phosphoprotein (P) gene contains a conserved RNA editing site and is predicted to encode P, V, and W proteins. The cleavage site of the F protein (G-K-R-K-K-R↓F) conforms to the cleavage site motif of the ubiquitous cellular protease furin. Consistent with this, exogenous protease was not required for virus replication in vitro. However, the intracerebral pathogenicity index of APMV-5 strain Kunitachi in one-day-old chicks was found to be zero, indicating that the virus is avirulent for chickens despite the presence of a polybasic F cleavage site. CONCLUSIONS/SIGNIFICANCE: Phylogenetic analysis of the sequences of the APMV-5 genome and proteins versus those of the other APMV serotypes showed that APMV-5 is more closely related to APMV-6 than to the other APMVs. Furthermore, these comparisons provided evidence of extensive genome-wide divergence that supports the classification of the APMVs into nine separate serotypes. The structure of the F cleavage site does not appear to be a

  5. KSHV 2.0: a comprehensive annotation of the Kaposi's sarcoma-associated herpesvirus genome using next-generation sequencing reveals novel genomic and functional features.

    Directory of Open Access Journals (Sweden)

    Carolina Arias

    2014-01-01

    Full Text Available Productive herpesvirus infection requires a profound, time-controlled remodeling of the viral transcriptome and proteome. To gain insights into the genomic architecture and gene expression control in Kaposi's sarcoma-associated herpesvirus (KSHV), we performed a systematic genome-wide survey of viral transcriptional and translational activity throughout the lytic cycle. Using mRNA-sequencing and ribosome profiling, we found that transcripts encoding lytic genes are promptly bound by ribosomes upon lytic reactivation, suggesting their regulation is mainly transcriptional. Our approach also uncovered new genomic features such as ribosome occupancy of viral non-coding RNAs, numerous upstream and small open reading frames (ORFs), and unusual strategies to expand the virus coding repertoire that include alternative splicing, dynamic viral mRNA editing, and the use of alternative translation initiation codons. Furthermore, we provide a refined and expanded annotation of transcription start sites, polyadenylation sites, splice junctions, and initiation/termination codons of known and new viral features in the KSHV genomic space, which we have termed KSHV 2.0. Our results represent a comprehensive genome-scale image of gene regulation during lytic KSHV infection that substantially expands our understanding of the genomic architecture and coding capacity of the virus.

  6. Complete sequencing and pan-genomic analysis of Lactobacillus delbrueckii subsp. bulgaricus reveal its genetic basis for industrial yogurt production.

    Directory of Open Access Journals (Sweden)

    Pei Hao

    Full Text Available Lactobacillus delbrueckii subsp. bulgaricus (Lb. bulgaricus) is an important species of Lactic Acid Bacteria (LAB) used for cheese and yogurt fermentation. The genome of Lb. bulgaricus 2038, an industrial strain mainly used for yogurt production, was completely sequenced and compared against the other two ATCC collection strains of the same subspecies. Specific physiological properties of strain 2038, such as lysine biosynthesis, formate production, aspartate-related carbon-skeleton intermediate metabolism, unique EPS synthesis and efficient DNA restriction/modification systems, are all different from those of the collection strains that might benefit the industrial production of yogurt. Other common features shared by Lb. bulgaricus strains, such as efficient protocooperation with Streptococcus thermophilus and lactate production as well as well-equipped stress tolerance mechanisms may account for it being selected originally for yogurt fermentation industry. Multiple lines of evidence suggested that Lb. bulgaricus 2038 was genetically closer to the common ancestor of the subspecies than the other two sequenced collection strains, probably due to a strict industrial maintenance process for strain 2038 that might have halted its genome decay and sustained a gene network suitable for large scale yogurt production.

  7. Complete sequencing and pan-genomic analysis of Lactobacillus delbrueckii subsp. bulgaricus reveal its genetic basis for industrial yogurt production.

    Science.gov (United States)

    Hao, Pei; Zheng, Huajun; Yu, Yao; Ding, Guohui; Gu, Wenyi; Chen, Shuting; Yu, Zhonghao; Ren, Shuangxi; Oda, Munehiro; Konno, Tomonobu; Wang, Shengyue; Li, Xuan; Ji, Zai-Si; Zhao, Guoping

    2011-01-17

    Lactobacillus delbrueckii subsp. bulgaricus (Lb. bulgaricus) is an important species of Lactic Acid Bacteria (LAB) used for cheese and yogurt fermentation. The genome of Lb. bulgaricus 2038, an industrial strain mainly used for yogurt production, was completely sequenced and compared against the other two ATCC collection strains of the same subspecies. Specific physiological properties of strain 2038, such as lysine biosynthesis, formate production, aspartate-related carbon-skeleton intermediate metabolism, unique EPS synthesis and efficient DNA restriction/modification systems, are all different from those of the collection strains that might benefit the industrial production of yogurt. Other common features shared by Lb. bulgaricus strains, such as efficient protocooperation with Streptococcus thermophilus and lactate production as well as well-equipped stress tolerance mechanisms may account for it being selected originally for yogurt fermentation industry. Multiple lines of evidence suggested that Lb. bulgaricus 2038 was genetically closer to the common ancestor of the subspecies than the other two sequenced collection strains, probably due to a strict industrial maintenance process for strain 2038 that might have halted its genome decay and sustained a gene network suitable for large scale yogurt production.

  8. Comparison of C. elegans and C. briggsae genome sequences reveals extensive conservation of chromosome organization and synteny.

    Directory of Open Access Journals (Sweden)

    LaDeana W Hillier

    2007-07-01

    Full Text Available To determine whether the distinctive features of Caenorhabditis elegans chromosomal organization are shared with the C. briggsae genome, we constructed a single nucleotide polymorphism-based genetic map to order and orient the whole genome shotgun assembly along the six C. briggsae chromosomes. Although these species are of the same genus, their most recent common ancestor existed 80-110 million years ago, and thus they are more evolutionarily distant than, for example, human and mouse. We found that, like C. elegans chromosomes, C. briggsae chromosomes exhibit high levels of recombination on the arms along with higher repeat density, a higher fraction of intronic sequence, and a lower fraction of exonic sequence compared with chromosome centers. Despite extensive intrachromosomal rearrangements, 1:1 orthologs tend to remain in the same region of the chromosome, and colinear blocks of orthologs tend to be longer in chromosome centers compared with arms. More strikingly, the two species show an almost complete conservation of synteny, with 1:1 orthologs present on a single chromosome in one species also found on a single chromosome in the other. The conservation of both chromosomal organization and synteny between these two distantly related species suggests roles for chromosome organization in the fitness of an organism that are only poorly understood presently.

  9. Genome sequencing reveals diversification of virulence factor content and possible host adaptation in distinct subpopulations of Salmonella enterica

    Directory of Open Access Journals (Sweden)

    Rodriguez-Rivera Lorraine D

    2011-08-01

    Full Text Available Abstract Background Divergence of bacterial populations into distinct subpopulations is often the result of ecological isolation. While some studies have suggested the existence of Salmonella enterica subsp. enterica subclades, evidence for these subdivisions has been ambiguous. Here we used a comparative genomics approach to define the population structure of Salmonella enterica subsp. enterica, and identify clade-specific genes that may be the result of ecological specialization. Results Multi-locus sequence analysis (MLSA) and single nucleotide polymorphisms (SNPs) data for 16 newly sequenced and 30 publicly available genomes showed an unambiguous subdivision of S. enterica subsp. enterica into at least two subpopulations, which we refer to as clade A and clade B. Clade B strains contain several clade-specific genes or operons, including a β-glucuronidase operon, a S-fimbrial operon, and cell surface related genes, which strongly suggests niche specialization of this subpopulation. An additional set of 123 isolates was assigned to clades A and B by using qPCR assays targeting subpopulation-specific SNPs and genes of interest. Among 98 serovars examined, approximately 20% belonged to clade B. All clade B isolates contained two pathogenicity related genomic islands, SPI-18 and a cytolethal distending toxin islet; a combination of these two islands was previously thought to be exclusive to serovars Typhi and Paratyphi A. Presence of β-glucuronidase in clade B isolates specifically suggests an adaptation of this clade to the vertebrate gastrointestinal environment. Conclusions S. enterica subsp. enterica consists of at least two subpopulations that differ specifically in genes involved in host and tissue tropism, utilization of host specific carbon and nitrogen sources and are therefore likely to differ in ecology and transmission characteristics.

  10. Transmission of Staphylococcus aureus from Humans to Green Monkeys in The Gambia as Revealed by Whole-Genome Sequencing

    Science.gov (United States)

    Senghore, Madikay; Bayliss, Sion C.; Kwambana-Adams, Brenda A.; Foster-Nyarko, Ebenezer; Manneh, Jainaba; Dione, Michel; Badji, Henry; Ebruke, Chinelo; Doughty, Emma L.; Thorpe, Harry A.; Jasinska, Anna J.; Schmitt, Christopher A.; Cramer, Jennifer D.; Turner, Trudy R.; Weinstock, George; Freimer, Nelson B.; Feil, Edward J.; Antonio, Martin

    2016-01-01

    ABSTRACT Staphylococcus aureus is an important pathogen of humans and animals. We genome sequenced 90 S. aureus isolates from The Gambia: 46 isolates from invasive disease in humans, 13 human carriage isolates, and 31 monkey carriage isolates. We inferred multiple anthroponotic transmissions of S. aureus from humans to green monkeys (Chlorocebus sabaeus) in The Gambia over different time scales. We report a novel monkey-associated clade of S. aureus that emerged from a human-to-monkey switch estimated to have occurred 2,700 years ago. Adaptation of this lineage to the monkey host is accompanied by the loss of phage-carrying genes that are known to play an important role in human colonization. We also report recent anthroponotic transmission of the well-characterized human lineages sequence type 6 (ST6) and ST15 to monkeys, probably because of steadily increasing encroachment of humans into the monkeys' habitat. Although we have found no evidence of transmission of S. aureus from monkeys to humans, as the two species come into ever-closer contact, there might be an increased risk of additional interspecies exchanges of potential pathogens. IMPORTANCE The population structures of Staphylococcus aureus in humans and monkeys in sub-Saharan Africa have been previously described using multilocus sequence typing (MLST). However, these data lack the power to accurately infer details regarding the origin and maintenance of new adaptive lineages. Here, we describe the use of whole-genome sequencing to detect transmission of S. aureus between humans and nonhuman primates and to document the genetic changes accompanying host adaptation. We note that human-to-monkey switches tend to be more common than the reverse and that a novel monkey-associated clade is likely to have emerged from such a switch approximately 2,700 years ago. Moreover, analysis of the accessory genome provides important clues as to the genetic changes underpinning host adaptation and, in particular, shows

  11. Evolution of novel wood decay mechanisms in Agaricales revealed by the genome sequences of Fistulina hepatica and Cylindrobasidium torrendii

    Science.gov (United States)

    Floudas, Dimitrios; Held, Benjamin W.; Riley, Robert; Nagy, Laszlo G.; Koehler, Gage; Ransdell, Anthony S.; Younus, Hina; Chow, Julianna; Chiniquy, Jennifer; Lipzen, Anna; Tritt, Andrew; Sun, Hui; Haridas, Sajeet; LaButti, Kurt; Ohm, Robin A.; Kües, Ursula; Blanchette, Robert A.; Grigoriev, Igor V.; Minto, Robert E.; Hibbett, David S.

    2015-01-01

    Wood decay mechanisms in Agaricomycotina have been traditionally separated in two categories termed white and brown rot. Recently the accuracy of such a dichotomy has been questioned. Here, we present the genome sequences of the white rot fungus Cylindrobasidium torrendii and the brown rot fungus Fistulina hepatica both members of Agaricales, combining comparative genomics and wood decay experiments. Cylindrobasidium torrendii is closely related to the white-rot root pathogen Armillaria mellea, while F. hepatica is related to Schizophyllum commune, which has been reported to cause white rot. Our results suggest that C. torrendii and S. commune are intermediate between white-rot and brown-rot fungi, but at the same time they show characteristics of decay that resembles soft rot. Both species cause weak wood decay and degrade all wood components but leave the middle lamella intact. Their gene content related to lignin degradation is reduced, similar to brown-rot fungi, but both have maintained a rich array of genes related to carbohydrate degradation, similar to white-rot fungi. These characteristics appear to have evolved from white-rot ancestors with stronger ligninolytic ability. Fistulina hepatica shows characteristics of brown rot both in terms of wood decay genes found in its genome and the decay that it causes. However, genes related to cellulose degradation are still present, which is a plesiomorphic characteristic shared with its white-rot ancestors. Four wood degradation-related genes, homologs of which are frequently lost in brown-rot fungi, show signs of pseudogenization in the genome of F. hepatica. These results suggest that transition towards a brown rot lifestyle could be an ongoing process in F. hepatica. Our results reinforce the idea that wood decay mechanisms are more diverse than initially thought and that the dichotomous separation of wood decay mechanisms in Agaricomycotina into white rot and brown rot should be revisited. PMID:25683379

  12. Whole genome deep sequencing of HIV-1 reveals the impact of early minor variants upon immune recognition during acute infection.

    Directory of Open Access Journals (Sweden)

    Matthew R Henn

    Full Text Available Deep sequencing technologies have the potential to transform the study of highly variable viral pathogens by providing a rapid and cost-effective approach to sensitively characterize rapidly evolving viral quasispecies. Here, we report on a high-throughput whole HIV-1 genome deep sequencing platform that combines 454 pyrosequencing with novel assembly and variant detection algorithms. In one subject we combined these genetic data with detailed immunological analyses to comprehensively evaluate viral evolution and immune escape during the acute phase of HIV-1 infection. The majority of early, low frequency mutations represented viral adaptation to host CD8+ T cell responses, evidence of strong immune selection pressure occurring during the early decline from peak viremia. CD8+ T cell responses capable of recognizing these low frequency escape variants coincided with the selection and evolution of more effective secondary HLA-anchor escape mutations. Frequent, and in some cases rapid, reversion of transmitted mutations was also observed across the viral genome. When located within restricted CD8 epitopes these low frequency reverting mutations were sufficient to prime de novo responses to these epitopes, again illustrating the capacity of the immune response to recognize and respond to low frequency variants. More importantly, rapid viral escape from the most immunodominant CD8+ T cell responses coincided with plateauing of the initial viral load decline in this subject, suggestive of a potential link between maintenance of effective, dominant CD8 responses and the degree of early viremia reduction. We conclude that the early control of HIV-1 replication by immunodominant CD8+ T cell responses may be substantially influenced by rapid, low frequency viral adaptations not detected by conventional sequencing approaches, which warrants further investigation. These data support the critical need for vaccine-induced CD8+ T cell responses to target more

  13. Large scale full-length cDNA sequencing reveals a unique genomic landscape in a lepidopteran model insect, Bombyx mori.

    Science.gov (United States)

    Suetsugu, Yoshitaka; Futahashi, Ryo; Kanamori, Hiroyuki; Kadono-Okuda, Keiko; Sasanuma, Shun-ichi; Narukawa, Junko; Ajimura, Masahiro; Jouraku, Akiya; Namiki, Nobukazu; Shimomura, Michihiko; Sezutsu, Hideki; Osanai-Futahashi, Mizuko; Suzuki, Masataka G; Daimon, Takaaki; Shinoda, Tetsuro; Taniai, Kiyoko; Asaoka, Kiyoshi; Niwa, Ryusuke; Kawaoka, Shinpei; Katsuma, Susumu; Tamura, Toshiki; Noda, Hiroaki; Kasahara, Masahiro; Sugano, Sumio; Suzuki, Yutaka; Fujiwara, Haruhiko; Kataoka, Hiroshi; Arunkumar, Kallare P; Tomar, Archana; Nagaraju, Javaregowda; Goldsmith, Marian R; Feng, Qili; Xia, Qingyou; Yamamoto, Kimiko; Shimada, Toru; Mita, Kazuei

    2013-09-01

    The establishment of a complete genomic sequence of silkworm, the model species of Lepidoptera, laid a foundation for its functional genomics. A more complete annotation of the genome will benefit functional and comparative studies and accelerate extensive industrial applications for this insect. To realize these goals, we embarked upon a large-scale full-length cDNA collection from 21 full-length cDNA libraries derived from 14 tissues of the domesticated silkworm and performed full sequencing by primer walking for 11,104 full-length cDNAs. The large average intron size was 1904 bp, resulting from a high accumulation of transposons. Using gene models predicted by GLEAN and published mRNAs, we identified 16,823 gene loci on the silkworm genome assembly. Orthology analysis of 153 species, including 11 insects, revealed that among three Lepidoptera including Monarch and Heliconius butterflies, the 403 largest silkworm-specific genes were composed mainly of protective immunity, hormone-related, and characteristic structural proteins. Analysis of testis-/ovary-specific genes revealed distinctive features of sexual dimorphism, including depletion of ovary-specific genes on the Z chromosome in contrast to an enrichment of testis-specific genes. More than 40% of genes expressed in specific tissues mapped in tissue-specific chromosomal clusters. The newly obtained FL-cDNA sequences enabled us to annotate the genome of this lepidopteran model insect more accurately, enhancing genomic and functional studies of Lepidoptera and comparative analyses with other insect orders, and yielding new insights into the evolution and organization of lepidopteran-specific genes.

  14. Whole genome duplication and enrichment of metal cation transporters revealed by de novo genome sequencing of extremely halotolerant black yeast Hortaea werneckii.

    Directory of Open Access Journals (Sweden)

    Metka Lenassi

    Full Text Available Hortaea werneckii, an ascomycetous yeast from the order Capnodiales, shows an exceptional adaptability to osmotically stressful conditions. To investigate this unusual phenotype we obtained a draft genomic sequence of an H. werneckii strain isolated from the hypersaline water of a solar saltern. Two of its most striking characteristics that may be associated with a halotolerant lifestyle are the large genetic redundancy and the expansion of genes encoding metal cation transporters. Although no sexual state of H. werneckii has yet been described, a mating locus with characteristics of heterothallic fungi was found. The total assembly size of the genome is 51.6 Mb, larger than most phylogenetically related fungi, coding for almost twice the usual number of predicted genes (23333). The genome appears to have experienced a relatively recent whole genome duplication, and contains two highly identical gene copies of almost every protein. This is consistent with some previous studies that reported increases in genomic DNA content triggered by exposure to salt stress. In hypersaline conditions transmembrane ion transport is of utmost importance. The analysis of predicted metal cation transporters showed that most types of transporters experienced several gene duplications at various points during their evolution. Consequently they are present in much higher numbers than expected. The resulting diversity of transporters presents interesting biotechnological opportunities for improvement of halotolerance of salt-sensitive species. The involvement of plasma P-type H⁺ ATPases in adaptation to different concentrations of salt was indicated by their salt dependent transcription. This was not the case with vacuolar H⁺ ATPases, which were transcribed constitutively. The availability of this genomic sequence is expected to promote the research of H. werneckii. Studying its extreme halotolerance will not only contribute to our understanding of life in hypersaline

  15. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    DEFF Research Database (Denmark)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity...

  16. Whole-genome sequencing of six Mauritian Cynomolgus macaques (Macaca fascicularis) reveals a genome-wide pattern of polymorphisms under extreme population bottleneck.

    Science.gov (United States)

    Osada, Naoki; Hettiarachchi, Nilmini; Adeyemi Babarinde, Isaac; Saitou, Naruya; Blancher, Antoine

    2015-03-23

    Cynomolgus macaques (Macaca fascicularis) were introduced to the island of Mauritius by humans around the 16th century. The unique demographic history of the Mauritian cynomolgus macaques provides the opportunity to not only examine the genetic background of well-established nonhuman primates for biomedical research but also understand the effect of an extreme population bottleneck on the pattern of polymorphisms in genomes. We sequenced the whole genomes of six Mauritian cynomolgus macaques and obtained an average of 20-fold coverage of the genome sequences for each individual. The overall level of nucleotide diversity was 23% smaller than that of the Malaysian cynomolgus macaques, and a reduction of low-frequency polymorphisms was observed. In addition, we also confirmed that the Mauritian cynomolgus macaques were genetically closer to a representative of the Malaysian population than to a representative of the Indochinese population. Excess of nonsynonymous polymorphisms in low frequency, which has been observed in many other species, was not very strong in the Mauritian samples, and the proportion of heterozygous nonsynonymous polymorphisms relative to synonymous polymorphisms is higher within individuals in Mauritian than Malaysian cynomolgus macaques. Those patterns indicate that the extreme population bottleneck made purifying selection overwhelmed by the power of genetic drift in the population. Finally, we estimated the number of founding individuals by using the genome-wide site frequency spectrum of the six samples. Assuming a simple demographic scenario with a single bottleneck followed by exponential growth, the estimated number of founders (∼20 individuals) is largely consistent with previous estimates.

  17. Whole-Genome Bisulfite Sequencing of Human Pancreatic Islets Reveals Novel Differentially Methylated Regions in Type 2 Diabetes Pathogenesis.

    Science.gov (United States)

    Volkov, Petr; Bacos, Karl; Ofori, Jones K; Esguerra, Jonathan Lou S; Eliasson, Lena; Rönn, Tina; Ling, Charlotte

    2017-04-01

    Current knowledge about the role of epigenetics in type 2 diabetes (T2D) remains limited. Only a few studies have investigated DNA methylation of selected candidate genes or a very small fraction of genomic CpG sites in human pancreatic islets, the tissue of primary pathogenic importance for diabetes. Our aim was to characterize the whole-genome DNA methylation landscape in human pancreatic islets, to identify differentially methylated regions (DMRs) in diabetic islets, and to investigate the function of DMRs in islet biology. Here, we performed whole-genome bisulfite sequencing, which is a comprehensive and unbiased method to study DNA methylation throughout the genome at a single nucleotide resolution, in pancreatic islets from donors with T2D and control subjects without diabetes. We identified 25,820 DMRs in islets from individuals with T2D. These DMRs cover loci with known islet function, e.g., PDX1, TCF7L2, and ADCY5. Importantly, binding sites previously identified by ChIP-seq for islet-specific transcription factors, enhancer regions, and different histone marks were enriched in the T2D-associated DMRs. We also identified 457 genes, including NR4A3, PARK2, PID1, SLC2A2, and SOCS2, that had both DMRs and significant expression changes in T2D islets. To mimic the situation in T2D islets, candidate genes were overexpressed or silenced in cultured β-cells. This resulted in impaired insulin secretion, thereby connecting differential methylation to islet dysfunction. We further explored the islet methylome and found a strong link between methylation levels and histone marks. Additionally, DNA methylation in different genomic regions and of different transcript types (i.e., protein coding, noncoding, and pseudogenes) was associated with islet expression levels. Our study provides a comprehensive picture of the islet DNA methylome in individuals with and without diabetes and highlights the importance of epigenetic dysregulation in pancreatic islets and T2D

  18. Whole-genome bisulfite sequencing maps from multiple human tissues reveal novel CpG islands associated with tissue-specific regulation.

    Science.gov (United States)

    Mendizabal, Isabel; Yi, Soojin V

    2016-01-01

    CpG islands (CGIs) are one of the most widely studied regulatory features of the human genome, with critical roles in development and disease. Despite such significance and the original epigenetic definition, currently used CGI sets are typically predicted from DNA sequence characteristics. Although CGIs are deeply implicated in practical analyses of DNA methylation, recent studies have shown that such computational annotations suffer from inaccuracies. Here we used whole-genome bisulfite sequencing from 10 diverse human tissues to identify a comprehensive, experimentally obtained, single-base resolution CGI catalog. In addition to the unparalleled annotation precision, our method is free from potential bias due to arbitrary sequence features or probe affinity differences. In addition to clarifying substantial false positives in the widely used University of California Santa Cruz (UCSC) annotations, our study identifies numerous novel epigenetic loci. In particular, we reveal significant impact of transposable elements on the epigenetic regulatory landscape of the human genome and demonstrate ubiquitous presence of transcription initiation at CGIs, including alternative promoters in gene bodies and non-coding RNAs in intergenic regions. Moreover, coordinated DNA methylation and chromatin modifications mark tissue-specific enhancers at novel CGIs. Enrichment of specific transcription factor binding from ChIP-seq supports mechanistic roles of CGIs on the regulation of tissue-specific transcription. The new CGI catalog provides a comprehensive and integrated list of genomic hotspots of epigenetic regulation. © The Author 2015. Published by Oxford University Press.

  19. Involvement of two latex-clearing proteins during rubber degradation and insights into the subsequent degradation pathway revealed by the genome sequence of Gordonia polyisoprenivorans strain VH2.

    Science.gov (United States)

    Hiessl, Sebastian; Schuldes, Jörg; Thürmer, Andrea; Halbsguth, Tobias; Bröker, Daniel; Angelov, Angel; Liebl, Wolfgang; Daniel, Rolf; Steinbüchel, Alexander

    2012-04-01

    The increasing production of synthetic and natural poly(cis-1,4-isoprene) rubber leads to huge challenges in waste management. Only a few bacteria are known to degrade rubber, and little is known about the mechanism of microbial rubber degradation. The genome of Gordonia polyisoprenivorans strain VH2, which is one of the most effective rubber-degrading bacteria, was sequenced and annotated to elucidate the degradation pathway and other features of this actinomycete. The genome consists of a circular chromosome of 5,669,805 bp and a circular plasmid of 174,494 bp with average GC contents of 67.0% and 65.7%, respectively. It contains 5,110 putative protein-coding sequences, including many candidate genes responsible for rubber degradation and other biotechnically relevant pathways. Furthermore, we detected two homologues of a latex-clearing protein, which is supposed to be a key enzyme in rubber degradation. The deletion of these two genes for the first time revealed clear evidence that latex-clearing protein is essential for the microbial utilization of rubber. Based on the genome sequence, we predict a pathway for the microbial degradation of rubber which is supported by previous and current data on transposon mutagenesis, deletion mutants, applied comparative genomics, and literature search.
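    The replicon sizes and GC contents quoted in this abstract can be combined into a single genome-wide GC figure by weighting each replicon by its length. The following is a minimal sketch of that arithmetic, not a calculation taken from the paper; the names `REPLICONS` and `weighted_gc` are hypothetical and introduced only for illustration.

```python
# Replicon lengths (bp) and GC fractions as quoted in the abstract above.
# The dictionary layout and function name are illustrative assumptions.
REPLICONS = {
    "chromosome": (5_669_805, 0.670),
    "plasmid": (174_494, 0.657),
}


def weighted_gc(replicons):
    """Return the length-weighted GC fraction across all replicons."""
    total_bp = sum(length for length, _ in replicons.values())
    gc_bp = sum(length * gc for length, gc in replicons.values())
    return gc_bp / total_bp


total_length = sum(length for length, _ in REPLICONS.values())
overall_gc = weighted_gc(REPLICONS)
print(f"{total_length} bp in total, length-weighted GC = {overall_gc:.2%}")
```

    Because the plasmid contributes only about 3% of the total length, the weighted value stays very close to the chromosomal GC content.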

  20. The Genome Sequence of Methanohalophilus mahii SLPT Reveals Differences in the Energy Metabolism among Members of the Methanosarcinaceae Inhabiting Freshwater and Saline Environments

    Directory of Open Access Journals (Sweden)

    Stefan Spring

    2010-01-01

    Full Text Available Methanohalophilus mahii is the type species of the genus Methanohalophilus, which currently comprises three distinct species with validly published names. Mhp. mahii represents moderately halophilic methanogenic archaea with a strictly methylotrophic metabolism. The type strain SLPT was isolated from hypersaline sediments collected from the southern arm of Great Salt Lake, Utah. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,012,424 bp genome is a single replicon with 2032 protein-coding and 63 RNA genes and part of the Genomic Encyclopedia of Bacteria and Archaea project. A comparison of the reconstructed energy metabolism in the halophilic species Mhp. mahii with other representatives of the Methanosarcinaceae reveals some interesting differences to freshwater species.

  1. The Genome Sequence of Methanohalophilus mahii SLPT Reveals Differences in the Energy Metabolism among Members of the Methanosarcinaceae Inhabiting Freshwater and Saline Environments

    Energy Technology Data Exchange (ETDEWEB)

    Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Scheuner, Carmen [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [Joint Genome Institute, Walnut Creek, California; Lucas, Susan [Joint Genome Institute, Walnut Creek, California; Glavina Del Rio, Tijana [Joint Genome Institute, Walnut Creek, California; Tice, Hope [Joint Genome Institute, Walnut Creek, California; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [Joint Genome Institute, Walnut Creek, California; Chen, Feng [Joint Genome Institute, Walnut Creek, California; Nolan, Matt [Joint Genome Institute, Walnut Creek, California; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Pitluck, Samuel [ORNL; Liolios, Konstantinos [Joint Genome Institute, Walnut Creek, California; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Lykidis, A [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [Joint Genome Institute, Walnut Creek, California; Palaniappan, Krishna [Joint Genome Institute, Walnut Creek, California; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia D [ORNL; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Detter, J. Chris [Joint Genome Institute, Walnut Creek, California; Brettin, Thomas S [ORNL; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [ORNL; Bristow, James [Joint Genome Institute, Walnut Creek, California; Eisen, Jonathan [Joint Genome Institute, Walnut Creek, California; Markowitz, Victor [Joint Genome Institute, Walnut Creek, California; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [ORNL; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

    2010-12-01

    Methanohalophilus mahii is the type species of the genus Methanohalophilus, which currently comprises three distinct species with validly published names. Mhp. mahii represents moderately halophilic methanogenic archaea with a strictly methylotrophic metabolism. The type strain SLPT was isolated from hypersaline sediments collected from the southern arm of Great Salt Lake, Utah. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,012,424 bp genome is a single replicon with 2032 protein-coding and 63 RNA genes and part of the Genomic Encyclopedia of Bacteria and Archaea project. A comparison of the reconstructed energy metabolism in the halophilic species Mhp. mahii with other representatives of the Methanosarcinaceae reveals some interesting differences to freshwater species.

  2. The Genome Sequence of Methanohalophilus mahii SLPT Reveals Differences in the Energy Metabolism among Members of the Methanosarcinaceae Inhabiting Freshwater and Saline Environments

    Energy Technology Data Exchange (ETDEWEB)

    Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Scheuner, Carmen [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Lykidis, A [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Brettin, Thomas S [ORNL; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

    2010-01-01

    Methanohalophilus mahii is the type species of the genus Methanohalophilus, which currently comprises three distinct species with validly published names. Mhp. mahii represents moderately halophilic methanogenic archaea with a strictly methylotrophic metabolism. The type strain SLPT was isolated from hypersaline sediments collected from the southern arm of Great Salt Lake, Utah. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,012,424 bp genome is a single replicon with 2032 protein-coding and 63 RNA genes and part of the Genomic Encyclopedia of Bacteria and Archaea project. A comparison of the reconstructed energy metabolism in the halophilic species Mhp. mahii with other representatives of the Methanosarcinaceae reveals some interesting differences to freshwater species.

  3. Genome sequence reveals that Pseudomonas fluorescens F113 possesses a large and diverse array of systems for rhizosphere function and host interaction

    Directory of Open Access Journals (Sweden)

    Redondo-Nieto Miguel

    2013-01-01

    Full Text Available Background: Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) isolated from the sugar-beet rhizosphere. This bacterium has been extensively studied as a model strain for genetic regulation of secondary metabolite production in P. fluorescens, as a candidate biocontrol agent against phytopathogens, and as a heterologous host for expression of genes with biotechnological application. The F113 genome sequence and annotation has been recently reported. Results: Comparative analysis of 50 genome sequences of strains belonging to the P. fluorescens group has revealed the existence of five distinct subgroups. F113 belongs to subgroup I, which is mostly composed of strains classified as P. brassicacearum. The core genome of these five strains is highly conserved and represents approximately 76% of the protein-coding genes in any given genome. Despite this strong conservation, F113 also contains a large number of unique protein-coding genes that encode traits potentially involved in the rhizocompetence of this strain. These features include protein coding genes required for denitrification, diterpenoids catabolism, motility and chemotaxis, protein secretion and production of antimicrobial compounds and insect toxins. Conclusions: The genome of P. fluorescens F113 is composed of numerous protein-coding genes, not usually found together in previously sequenced genomes, which are potentially decisive during the colonisation of the rhizosphere and/or interaction with other soil organisms. This includes genes encoding proteins involved in the production of a second flagellar apparatus, the use of abietic acid as a growth substrate, the complete denitrification pathway, the possible production of a macrolide antibiotic and the assembly of multiple protein secretion systems.

  4. Next-Generation Sequencing of Genomic DNA Fragments Bound to a Transcription Factor in Vitro Reveals Its Regulatory Potential

    Directory of Open Access Journals (Sweden)

    Yukio Kurihara

    2014-12-01

    Full Text Available Several transcription factors (TFs) coordinate to regulate expression of specific genes at the transcriptional level. In Arabidopsis thaliana it is estimated that approximately 10% of all genes encode TFs or TF-like proteins. It is important to identify target genes that are directly regulated by TFs in order to understand the complete picture of a plant’s transcriptome profile. Here, we investigate the role of the LONG HYPOCOTYL5 (HY5) transcription factor that acts as a regulator of photomorphogenesis. We used an in vitro genomic DNA binding assay coupled with immunoprecipitation and next-generation sequencing (gDB-seq) instead of the in vivo chromatin immunoprecipitation (ChIP)-based methods. The results demonstrate that the HY5-binding motif predicted here was similar to the motif reported previously and that in vitro HY5-binding loci largely overlapped with the HY5-targeted candidate genes identified in previous ChIP-chip analysis. By combining these results with microarray analysis, we identified hundreds of HY5-binding genes that were differentially expressed in hy5. We also observed delayed induction of some transcripts of HY5-binding genes in hy5 mutants in response to blue-light exposure after dark treatment. Thus, an in vitro gDNA-binding assay coupled with sequencing is a convenient and powerful method to bridge the gap between identifying TF binding potential and establishing function.

  5. Maize genome sequencing by methylation filtration.

    Science.gov (United States)

    Palmer, Lance E; Rabinowicz, Pablo D; O'Shaughnessy, Andrew L; Balija, Vivekanand S; Nascimento, Lidia U; Dike, Sujit; de la Bastide, Melissa; Martienssen, Robert A; McCombie, W Richard

    2003-12-19

    Gene enrichment strategies offer an alternative to sequencing large and repetitive genomes such as that of maize. We report the generation and analysis of nearly 100,000 undermethylated (or methylation filtration) maize sequences. Comparison with the rice genome reveals that methylation filtration results in a more comprehensive representation of maize genes than those that result from expressed sequence tags or transposon insertion sites sequences. About 7% of the repetitive DNA is unmethylated and thus selected in our libraries, but potentially active transposons and unmethylated organelle genomes can be identified. Reverse transcription polymerase chain reaction can be used to finish the maize transcriptome.

  6. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three-dimensional structure of the DNA molecule, biologists have tried to decode the secrets of life hidden within these long molecules, and technologists have invented and improved methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Although automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome is a genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
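    To put the two quoted error rates in perspective, the expected number of consensus errors can be worked out for a genome of a given size. The sketch below is illustrative only: the 5 Mb genome length is a hypothetical value chosen for the example, and the names `expected_errors`, `DRAFT_ERROR_RATE` and `FINISHED_ERROR_RATE` are not from the source.

```python
from fractions import Fraction

# Error rates as quoted in the abstract above (errors per base pair).
DRAFT_ERROR_RATE = Fraction(1, 2_000)
FINISHED_ERROR_RATE = Fraction(1, 10_000)


def expected_errors(genome_length_bp, error_rate):
    """Expected number of consensus errors, assuming a uniform per-base rate."""
    return genome_length_bp * error_rate


# Hypothetical genome length, chosen only to make the arithmetic concrete.
GENOME_LENGTH_BP = 5_000_000

draft_errors = expected_errors(GENOME_LENGTH_BP, DRAFT_ERROR_RATE)
finished_errors = expected_errors(GENOME_LENGTH_BP, FINISHED_ERROR_RATE)

print(f"draft assembly:    ~{draft_errors} expected errors")
print(f"finished assembly: ~{finished_errors} expected errors")
print(f"reduction factor:  {draft_errors / finished_errors}")
```

    Under the uniform-rate assumption the ratio between the two figures does not depend on genome length, so finishing reduces the expected error count fivefold for any genome size.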

  7. Entire genome sequence analysis of genotype IX Newcastle disease viruses reveals their early-genotype phylogenetic position and recent-genotype genome size

    Directory of Open Access Journals (Sweden)

    Hu Shunling

    2011-03-01

    Full Text Available Background: A six-nucleotide (nt) insertion in the 5'-noncoding region (NCR) of the nucleoprotein (NP) gene of Newcastle disease virus (NDV) is considered to be a genetic marker for recent genotypes of NDV, which emerged after 1960. However, F48-like NDVs from China, identified as carrying a 6-nt insert in the NP gene, have been previously classified into genotype III or genotype IX. Results: In order to clarify their phylogenetic position and explore the origin of NDVs with the 6-nt insert and its significance in NDV evolution, we determined the entire genome sequences of five F48-like viruses isolated in China between 1946 and 2002 by RT-PCR amplification of overlapping fragments of full-length genome and rapid amplification of cDNA ends. All five NDV isolates shared the same genome size of 15,192 nt with the recent genotype V-VIII viruses, whereas they had the highest homology with early genotype III and IV isolates. Conclusions: The unique characteristic of the genome size and phylogenetic position of F48-like viruses warrants placing them in a separate geno-group, genotype IX. Results in this study also suggest that genotype IX viruses most likely originate from a genotype III virus by insertion of a 6-nt motif in the 5'-NCR of the NP gene, which had occurred as early as the 1940s, and might be the common origin of genotype V-VIII viruses.

  8. Comparative sequence analysis of Solanum and Arabidopsis in a hot spot for pathogen resistance on potato chromosome V reveals a patchwork of conserved and rapidly evolving genome segments

    Directory of Open Access Journals (Sweden)

    Bruggmann Rémy

    2007-05-01

    Full Text Available Abstract Background Quantitative phenotypic variation of agronomic characters in crop plants is controlled by environmental and genetic factors (quantitative trait loci = QTL. To understand the molecular basis of such QTL, the identification of the underlying genes is of primary interest and DNA sequence analysis of the genomic regions harboring QTL is a prerequisite for that. QTL mapping in potato (Solanum tuberosum has identified a region on chromosome V tagged by DNA markers GP21 and GP179, which contains a number of important QTL, among others QTL for resistance to late blight caused by the oomycete Phytophthora infestans and to root cyst nematodes. Results To obtain genomic sequence for the targeted region on chromosome V, two local BAC (bacterial artificial chromosome contigs were constructed and sequenced, which corresponded to parts of the homologous chromosomes of the diploid, heterozygous genotype P6/210. Two contiguous sequences of 417,445 and 202,781 base pairs were assembled and annotated. Gene-by-gene co-linearity was disrupted by non-allelic insertions of retrotransposon elements, stretches of diverged intergenic sequences, differences in gene content and gene order. The latter was caused by inversion of a 70 kbp genomic fragment. These features were also found in comparison to orthologous sequence contigs from three homeologous chromosomes of Solanum demissum, a wild tuber bearing species. Functional annotation of the sequence identified 48 putative open reading frames (ORF in one contig and 22 in the other, with an average of one ORF every 9 kbp. Ten ORFs were classified as resistance-gene-like, 11 as F-box-containing genes, 13 as transposable elements and three as transcription factors. Comparing potato to Arabidopsis thaliana annotated proteins revealed five micro-syntenic blocks of three to seven ORFs with A. thaliana chromosomes 1, 3 and 5. 
Conclusion Comparative sequence analysis revealed highly conserved collinear regions
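
    The "one ORF every 9 kbp" figure in the abstract above can be checked from the contig lengths and ORF counts it reports. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative and do not come from the paper, and pooling both contigs is an assumption about how the authors averaged.

```python
# Contig lengths (bp) and putative ORF counts as reported in the abstract.
contig_lengths_bp = [417_445, 202_781]
orfs_per_contig = [48, 22]


def mean_orf_spacing_bp(lengths, orf_counts):
    """Average number of base pairs per ORF, pooled over all contigs."""
    return sum(lengths) / sum(orf_counts)


spacing_bp = mean_orf_spacing_bp(contig_lengths_bp, orfs_per_contig)
print(f"{spacing_bp / 1000:.1f} kbp per ORF")
```

    Pooling gives about 8.9 kbp per ORF, which agrees with the rounded value of 9 kbp stated in the abstract.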

  9. Genomic sequencing of Pleistocene cave bears

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  10. Genome sequencing reveals unique mutations in characteristic metabolic pathways and the transfer of virulence genes between V. mimicus and V. cholerae.

    Science.gov (United States)

    Wang, Duochun; Wang, Haiyin; Zhou, Yanyan; Zhang, Qiuxiang; Zhang, Fanfei; Du, Pengcheng; Wang, Shujing; Chen, Chen; Kan, Biao

    2011-01-01

    Vibrio mimicus, the species most similar to V. cholerae, is a microbe present in the natural environment and sometimes causes diarrhea and internal infections in humans. It shows similar phenotypes to V. cholerae but differs in some biochemical characteristics. The molecular mechanisms underlying the differences in biochemical metabolism between V. mimicus and V. cholerae are currently unclear. Several V. mimicus isolates have been found that carry cholera toxin genes (ctxAB) and cause cholera-like diarrhea in humans. Here, the genome of the V. mimicus isolate SX-4, which carries an intact CTX element, was sequenced and annotated. Analysis of its genome, together with those of other Vibrio species, revealed extensive differences within the Vibrionaceae. Common mutations in gene clusters involved in three biochemical metabolism pathways that are used for discrimination between V. mimicus and V. cholerae were found in V. mimicus strains. We also constructed detailed genomic structures and evolution maps for the general types of genomic drift associated with pathogenic characters in polysaccharides, CTX elements and toxin co-regulated pilus (TCP) gene clusters. Overall, the whole-genome sequencing of the V. mimicus strain carrying the cholera toxin gene provides detailed information for understanding genomic differences among Vibrio spp. V. mimicus has a large number of diverse gene and nucleotide differences from its nearest neighbor, V. cholerae. The observed mutations in the characteristic metabolism pathways may indicate different adaptations to different niches for these species and may be caused by ancient events in evolution before the divergence of V. cholerae and V. mimicus. Horizontal transfers of virulence-related genes from an uncommon clone of V. cholerae, rather than the seventh pandemic strains, have generated the pathogenic V. mimicus strain carrying cholera toxin genes.

  11. Hyperlipidemia-associated gene variations and expression patterns revealed by whole-genome and transcriptome sequencing of rabbit models

    Science.gov (United States)

    Wang, Zhen; Zhang, Jifeng; Li, Hong; Li, Junyi; Niimi, Manabu; Ding, Guohui; Chen, Haifeng; Xu, Jie; Zhang, Hongjiu; Xu, Ze; Dai, Yulin; Gui, Tuantuan; Li, Shengdi; Liu, Zhi; Wu, Sujuan; Cao, Mushui; Zhou, Lu; Lu, Xingyu; Wang, Junxia; Yang, Jing; Fu, Yunhe; Yang, Dongshan; Song, Jun; Zhu, Tianqing; Li, Shen; Ning, Bo; Wang, Ziyun; Koike, Tomonari; Shiomi, Masashi; Liu, Enqi; Chen, Luonan; Fan, Jianglin; Chen, Y. Eugene; Li, Yixue

    2016-01-01

    The rabbit (Oryctolagus cuniculus) is an important experimental animal for studying human diseases, such as hypercholesterolemia and atherosclerosis. Despite this, genetic information and RNA expression profiling of laboratory rabbits are lacking. Here, we characterized the whole-genome variants of three breeds of the most popular experimental rabbits, New Zealand White (NZW), Japanese White (JW) and Watanabe heritable hyperlipidemic (WHHL) rabbits. Although the genetic diversity of WHHL rabbits was relatively low, they accumulated a large proportion of high-frequency deleterious mutations due to the small population size. Some of the deleterious mutations were associated with the pathophysiology of WHHL rabbits in addition to the LDLR deficiency. Furthermore, we conducted transcriptome sequencing of different organs of both WHHL and cholesterol-rich diet (Chol)-fed NZW rabbits. We found that gene expression profiles of the two rabbit models were essentially similar in the aorta, even though they exhibited different types of hypercholesterolemia. In contrast, Chol-fed rabbits, but not WHHL rabbits, exhibited pronounced inflammatory responses and abnormal lipid metabolism in the liver. These results provide valuable insights into identifying therapeutic targets of hypercholesterolemia and atherosclerosis with rabbit models. PMID:27245873

  12. Complete genome sequence and transcriptomic analysis of a novel marine strain Bacillus weihaiensis reveals the mechanism of brown algae degradation.

    Science.gov (United States)

    Zhu, Yueming; Chen, Peng; Bao, Yunjuan; Men, Yan; Zeng, Yan; Yang, Jiangang; Sun, Jibin; Sun, Yuanxia

    2016-11-30

    A novel marine strain representing efficient degradation ability toward brown algae was isolated, identified, and assigned to Bacillus weihaiensis Alg07. The alga-associated marine bacteria promote the nutrient cycle and perform important functions in the marine ecosystem. The de novo sequencing of the B. weihaiensis Alg07 genome was carried out. Results of gene annotation and carbohydrate-active enzyme analysis showed that the strain harbored enzymes that can completely degrade alginate and laminarin, which are the specific polysaccharides of brown algae. We also found genes for the utilization of mannitol, the major storage monosaccharide in the cell of brown algae. To understand the process of brown algae decomposition by B. weihaiensis Alg07, RNA-seq transcriptome analysis and qRT-PCR were performed. The genes involved in alginate metabolism were all up-regulated in the initial stage of kelp degradation, suggesting that the strain Alg07 first degrades alginate to destruct the cell wall so that the laminarin and mannitol are released and subsequently decomposed. The key genes involved in alginate and laminarin degradation were expressed in Escherichia coli and characterized. Overall, the model of brown algae degradation by the marine strain Alg07 was established, and novel alginate lyases and laminarinase were discovered.

  13. Synergism between genome sequencing, tandem mass spectrometry and bio-inspired synthesis reveals insights into nocardioazine B biogenesis.

    Science.gov (United States)

    Alqahtani, Norah; Porwal, Suheel K; James, Elle D; Bis, Dana M; Karty, Jonathan A; Lane, Amy L; Viswanathan, Rajesh

    2015-07-14

    Marine actinomycete-derived natural products continue to inspire chemical and biological investigations. Nocardioazines A and B (3 and 4), from Nocardiopsis sp. CMB-M0232, are structurally unique alkaloids featuring a 2,5-diketopiperazine (DKP) core functionalized with indole C3-prenyl as well as indole C3- and N-methyl groups. The logic of their assembly remains cryptic. Bioinformatics analyses of the Nocardiopsis sp. CMB-M0232 draft genome afforded the noz cluster, split across two regions of the genome, and encoding putative open reading frames with roles in nocardioazine biosynthesis, including cyclodipeptide synthase (CDPS), prenyltransferase, methyltransferase, and cytochrome P450 homologs. Heterologous expression of a twelve gene contig from the noz cluster in Streptomyces coelicolor resulted in accumulation of cyclo-l-Trp-l-Trp DKP (5). This experimentally connected the noz cluster to indole alkaloid natural product biosynthesis. Results from bioinformatics analyses of the noz pathway along with challenges in actinomycete genetics prompted us to use asymmetric synthesis and mass spectrometry to determine biosynthetic intermediates in the noz pathway. The structures of hypothesized biosynthetic intermediates 5 and 12-17 were firmly established through chemical synthesis. LC-MS and MS-MS comparison of these synthetic compounds with metabolites present in chemical extracts from Nocardiopsis sp. CMB-M0232 revealed which of these hypothesized intermediates were relevant in the nocardioazine biosynthetic pathway. This established the early and mid-stages of the biosynthetic pathway, demonstrating that Nocardiopsis performs indole C3-methylation prior to indole C3-normal prenylation and indole N1'-methylation in nocardioazine B assembly. These results highlight the utility of merging bioinformatics analyses, asymmetric synthetic approaches, and mass spectrometric metabolite profiling in probing natural product biosynthesis.

  14. Evaluation of methods for de novo genome assembly from high-throughput sequencing reads reveals dependencies that affect the quality of the results.

    Science.gov (United States)

    Haiminen, Niina; Kuhn, David N; Parida, Laxmi; Rigoutsos, Isidore

    2011-01-01

    Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of increasingly larger numbers of usable sequences per instrument-run continue to make whole-genome assembly an appealing target application. In this paper we evaluate the feasibility of de novo genome assembly from short reads (≤100 nucleotides) through a detailed study involving genomic sequences of various lengths and origin, in conjunction with several of the currently popular assembly programs. Our extensive analysis demonstrates that, in addition to sequencing coverage, attributes such as the architecture of the target genome, the identity of the used assembly program, the average read length and the observed sequencing error rates are powerful variables that affect the best achievable assembly of the target sequence in terms of size and correctness.

  15. Deep sequencing revealed genome-wide single-nucleotide polymorphism and plasmid content of Erwinia amylovora strains isolated in Middle Atlas, Morocco.

    Science.gov (United States)

    Hannou, Najat; Mondy, Samuel; Planamente, Sara; Moumni, Mohieddine; Llop, Pablo; López, María; Manceau, Charles; Barny, Marie-Anne; Faure, Denis

    2013-10-01

    Erwinia amylovora causes economic losses that affect pear and apple production in Morocco. Here, we report comparative genomics of four Moroccan E. amylovora strains with the European strain CFBP1430 and North-American strain ATCC49946. Analysis of single nucleotide polymorphisms (SNPs) revealed genetic homogeneity of the Moroccan strains and their proximity to the European strain CFBP1430. Moreover, the collected sequences allowed the assembly of a 65 kbp plasmid, which is highly similar to the plasmid pEI70 harbored by several European E. amylovora isolates. This plasmid was found in 33% of the 40 E. amylovora strains collected from several host plants in 2009 and 2010 in Morocco.

  16. The Genome Sequence of Bacillus cereus ATCC 10987 Reveals Metabolic Adaptations and a Large Plasmid Related to Bacillus anthracis pXO1

    Science.gov (United States)

    2004-01-01

    R.L. and Waites,K.B. (2003) Bacillus cereus bacteremia in a preterm neonate. J. Clin. Microbiol., 41, 3441–3444. 9. Ginsburg,A.S., Salazar,L.G., True... bacteremia and pneumonia due to Bacillus cereus. J. Clin. Microbiol., 35, 504–507. 12. Okinaka,R., Cloud,K., Hampton,O., Hoffmaster,A., Hill,K., Keim,P... The genome sequence of Bacillus cereus ATCC 10987 reveals metabolic adaptations and a large plasmid related to Bacillus anthracis pXO1 David A. Rasko

  17. Genome-wide DNA polymorphism in the indica rice varieties RGD-7S and Taifeng B as revealed by whole genome re-sequencing.

    Science.gov (United States)

    Fu, Chong-Yun; Liu, Wu-Ge; Liu, Di-Lin; Li, Ji-Hua; Zhu, Man-Shan; Liao, Yi-Long; Liu, Zhen-Rong; Zeng, Xue-Qin; Wang, Feng

    2016-03-01

    Next-generation sequencing technologies provide opportunities to further understand genetic variation, even within closely related cultivars. We performed whole genome resequencing of two elite indica rice varieties, RGD-7S and Taifeng B, whose F1 progeny showed hybrid weakness and hybrid vigor when grown in the early- and late-cropping seasons, respectively. Approximately 150 million 100-bp paired-end reads were generated, which covered ∼86% of the rice (Oryza sativa L. japonica 'Nipponbare') reference genome. A total of 2,758,740 polymorphic sites including 2,408,845 SNPs and 349,895 InDels were detected in RGD-7S and Taifeng B, respectively. Applying stringent parameters, we identified 961,791 SNPs and 46,640 InDels between RGD-7S and Taifeng B (RGD-7S/Taifeng B). The density of DNA polymorphisms was 256.8 SNPs and 12.5 InDels per 100 kb for RGD-7S/Taifeng B. Copy number variations (CNVs) were also investigated. In RGD-7S, 1989 of 2727 CNVs were overlapped in 218 genes, and 1231 of 2010 CNVs were annotated in 175 genes in Taifeng B. In addition, we verified a subset of InDels in the interval of hybrid weakness genes, Hw3 and Hw4, and obtained some polymorphic InDel markers, which will provide a sound foundation for cloning hybrid weakness genes. Analysis of genomic variations will also contribute to understanding the genetic basis of hybrid weakness and heterosis.
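
    The counts and densities quoted in the abstract above can be cross-checked with simple arithmetic. The sketch below is an illustration only: the names are hypothetical, and the genome length it back-calculates is an inference from the reported densities, not a value stated in the abstract.

```python
# Counts reported in the abstract.
snps_total = 2_408_845
indels_total = 349_895
polymorphic_sites = snps_total + indels_total  # abstract states 2,758,740

# Stringently filtered differences between RGD-7S and Taifeng B.
snps_between = 961_791
indels_between = 46_640

# Reported densities, in variants per 100 kb.
snp_density_per_100kb = 256.8
indel_density_per_100kb = 12.5


def implied_length_bp(count, density_per_100kb):
    """Sequence length (bp) implied by a variant count and a per-100-kb density."""
    return count / density_per_100kb * 100_000


implied_from_snps = implied_length_bp(snps_between, snp_density_per_100kb)
implied_from_indels = implied_length_bp(indels_between, indel_density_per_100kb)

print(f"polymorphic sites: {polymorphic_sites:,}")
print(f"implied length from SNP density:   {implied_from_snps / 1e6:.1f} Mb")
print(f"implied length from InDel density: {implied_from_indels / 1e6:.1f} Mb")
```

    The SNP and InDel totals sum to the stated number of polymorphic sites, and both densities point to a denominator of roughly 373 to 375 Mb, so the reported figures appear mutually consistent up to rounding.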

  18. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  19. Whole Genome Sequencing of 39 Invasive Streptococcus pneumoniae Sequence Type 199 Isolates Revealed Switches from Serotype 19A to 15B

    Science.gov (United States)

    Lucas, Marie; Brandt, Christian; Herrmann, Leonie; Albersmeier, Andreas; Blom, Jochen; Goesmann, Alexander

    2017-01-01

    Streptococcus pneumoniae is a major pathogen that causes different invasive pneumococcal diseases (IPD). The pneumococcal polysaccharide capsule is a main virulence factor. More than 94 capsule types have been described, but only a limited number of capsule types accounted for the majority of IPD cases before the introduction of pneumococcal vaccines. After the introduction of the conjugated pneumococcal vaccine PCV7, which covered the seven most frequent serotypes in IPD in the USA, an increase in IPD caused by non-vaccine serotypes was observed, and serotype 19A, which belongs to sequence type (ST) 199, was among the most prevalent STs. After the introduction of the extended vaccine PCV13, which includes serotype 19A, serogroup 15B/C increased in IPD. Therefore, whole genome sequences of 39 isolates of ST199 from Germany (collected between 1998 and 2011) with serotype 19A (n = 24) and serogroup 15B/C (n = 15) were obtained using an Illumina platform and were analysed to identify capsular switches within ST199. Two 19A to 15B/C serotype switch events were identified. Both events occurred before the introduction of PCV7, which indicates that a capsular switch from 19A to 15B among ST199 isolates is not unusual and is not directly linked to the vaccination. The observed serotype replacement appears to be the result of a vacant niche due to the displacement of vaccine serotypes that is now successfully occupied by ST199 clones. PMID:28046133

  20. Complete genome sequence of the rifamycin SV-producing Amycolatopsis mediterranei U32 revealed its genetic characteristics in phylogeny and metabolism.

    Science.gov (United States)

    Zhao, Wei; Zhong, Yi; Yuan, Hua; Wang, Jin; Zheng, Huajun; Wang, Ying; Cen, Xufeng; Xu, Feng; Bai, Jie; Han, Xiaobiao; Lu, Gang; Zhu, Yongqiang; Shao, Zhihui; Yan, Han; Li, Chen; Peng, Nanqiu; Zhang, Zilong; Zhang, Yunyi; Lin, Wei; Fan, Yun; Qin, Zhongjun; Hu, Yongfei; Zhu, Baoli; Wang, Shengyue; Ding, Xiaoming; Zhao, Guo-Ping

    2010-10-01

    Amycolatopsis mediterranei is used for industry-scale production of rifamycin, which plays a vital role in antimycobacterial therapy. As the first sequenced genome of the genus Amycolatopsis, the chromosome of strain U32, comprising 10,236,715 base pairs, is one of the largest prokaryotic genomes sequenced so far. Unlike the linear topology found in streptomycetes, this chromosome is circular, particularly similar to that of Saccharopolyspora erythraea and Nocardia farcinica, representing their close relationship in phylogeny and taxonomy. Although the predicted 9,228 protein-coding genes in the A. mediterranei genome shared the greatest number of orthologs with those of S. erythraea, it was unexpectedly followed by Streptomyces coelicolor rather than N. farcinica, indicating the distinct metabolic characteristics evolved via adaptation to diverse ecological niches. Besides a core region analogous to that common in streptomycetes, a novel 'quasi-core' with typical core characteristics is defined within the non-core region, where 21 out of the total 26 gene clusters for secondary metabolite production are located. The rifamycin biosynthesis gene cluster located in the core encodes a cytochrome P450 enzyme essential for the conversion of rifamycin SV to B, revealed by comparing to the highly homologous cluster of the rifamycin B-producing strain S699 and further confirmed by genetic complementation. The genomic information of A. mediterranei demonstrates a metabolic network orchestrated not only for extensive utilization of various carbon sources and inorganic nitrogen compounds but also for effective funneling of metabolic intermediates into the secondary antibiotic synthesis process under the control of a seemingly complex regulatory mechanism.

  1. The Genome Sequence of the Tomato-Pathogenic Actinomycete Clavibacter michiganensis subsp. michiganensis NCPPB382 Reveals a Large Island Involved in Pathogenicity

    Science.gov (United States)

    Gartemann, Karl-Heinz; Abt, Birte; Bekel, Thomas; Burger, Annette; Engemann, Jutta; Flügel, Monika; Gaigalat, Lars; Goesmann, Alexander; Gräfen, Ines; Kalinowski, Jörn; Kaup, Olaf; Kirchner, Oliver; Krause, Lutz; Linke, Burkhard; McHardy, Alice; Meyer, Folker; Pohle, Sandra; Rückert, Christian; Schneiker, Susanne; Zellermann, Eva-Maria; Pühler, Alfred; Eichenlaub, Rudolf; Kaiser, Olaf; Bartels, Daniela

    2008-01-01

    Clavibacter michiganensis subsp. michiganensis is a plant-pathogenic actinomycete that causes bacterial wilt and canker of tomato. The nucleotide sequence of the genome of strain NCPPB382 was determined. The chromosome is circular, consists of 3.298 Mb, and has a high G+C content (72.6%). Annotation revealed 3,080 putative protein-encoding sequences; only 26 pseudogenes were detected. Two rrn operons, 45 tRNAs, and three small stable RNA genes were found. The two circular plasmids, pCM1 (27.4 kbp) and pCM2 (70.0 kbp), which carry pathogenicity genes and thus are essential for virulence, have lower G+C contents (66.5 and 67.6%, respectively). In contrast to the genome of the closely related organism Clavibacter michiganensis subsp. sepedonicus, the genome of C. michiganensis subsp. michiganensis lacks complete insertion elements and transposons. The 129-kb chp/tomA region with a low G+C content near the chromosomal origin of replication was shown to be necessary for pathogenicity. This region contains numerous genes encoding proteins involved in uptake and metabolism of sugars and several serine proteases. There is evidence that single genes located in this region, especially genes encoding serine proteases, are required for efficient colonization of the host. Although C. michiganensis subsp. michiganensis grows mainly in the xylem of tomato plants, no evidence for pronounced genome reduction was found. C. michiganensis subsp. michiganensis seems to have as many transporters and regulators as typical soil-inhabiting bacteria. However, the apparent lack of a sulfate reduction pathway, which makes C. michiganensis subsp. michiganensis dependent on reduced sulfur compounds for growth, is probably the reason for the poor survival of C. michiganensis subsp. michiganensis in soil. PMID:18192381

  2. The genome sequence of the tomato-pathogenic actinomycete Clavibacter michiganensis subsp. michiganensis NCPPB382 reveals a large island involved in pathogenicity.

    Science.gov (United States)

    Gartemann, Karl-Heinz; Abt, Birte; Bekel, Thomas; Burger, Annette; Engemann, Jutta; Flügel, Monika; Gaigalat, Lars; Goesmann, Alexander; Gräfen, Ines; Kalinowski, Jörn; Kaup, Olaf; Kirchner, Oliver; Krause, Lutz; Linke, Burkhard; McHardy, Alice; Meyer, Folker; Pohle, Sandra; Rückert, Christian; Schneiker, Susanne; Zellermann, Eva-Maria; Pühler, Alfred; Eichenlaub, Rudolf; Kaiser, Olaf; Bartels, Daniela

    2008-03-01

    Clavibacter michiganensis subsp. michiganensis is a plant-pathogenic actinomycete that causes bacterial wilt and canker of tomato. The nucleotide sequence of the genome of strain NCPPB382 was determined. The chromosome is circular, consists of 3.298 Mb, and has a high G+C content (72.6%). Annotation revealed 3,080 putative protein-encoding sequences; only 26 pseudogenes were detected. Two rrn operons, 45 tRNAs, and three small stable RNA genes were found. The two circular plasmids, pCM1 (27.4 kbp) and pCM2 (70.0 kbp), which carry pathogenicity genes and thus are essential for virulence, have lower G+C contents (66.5 and 67.6%, respectively). In contrast to the genome of the closely related organism Clavibacter michiganensis subsp. sepedonicus, the genome of C. michiganensis subsp. michiganensis lacks complete insertion elements and transposons. The 129-kb chp/tomA region with a low G+C content near the chromosomal origin of replication was shown to be necessary for pathogenicity. This region contains numerous genes encoding proteins involved in uptake and metabolism of sugars and several serine proteases. There is evidence that single genes located in this region, especially genes encoding serine proteases, are required for efficient colonization of the host. Although C. michiganensis subsp. michiganensis grows mainly in the xylem of tomato plants, no evidence for pronounced genome reduction was found. C. michiganensis subsp. michiganensis seems to have as many transporters and regulators as typical soil-inhabiting bacteria. However, the apparent lack of a sulfate reduction pathway, which makes C. michiganensis subsp. michiganensis dependent on reduced sulfur compounds for growth, is probably the reason for the poor survival of C. michiganensis subsp. michiganensis in soil.

  3. Whole-exome/genome sequencing and genomics.

    Science.gov (United States)

    Grody, Wayne W; Thompson, Barry H; Hudgins, Louanne

    2013-12-01

    As medical genetics has progressed from a descriptive entity to one focused on the functional relationship between genes and clinical disorders, emphasis has been placed on genomics. Genomics, a subelement of genetics, is the study of the genome, the sum total of all the genes of an organism. The human genome, which is contained in the 23 pairs of nuclear chromosomes and in the mitochondrial DNA of each cell, comprises >6 billion nucleotides of genetic code. There are some 23,000 protein-coding genes, a surprisingly small fraction of the total genetic material, with the remainder composed of noncoding DNA, regulatory sequences, and introns. The Human Genome Project, launched in 1990, produced a draft of the genome in 2001 and then a finished sequence in 2003, on the 50th anniversary of the initial publication of Watson and Crick's paper on the double-helical structure of DNA. Since then, this mass of genetic information has been translated at an ever-increasing pace into useable knowledge applicable to clinical medicine. The recent advent of massively parallel DNA sequencing (also known as shotgun, high-throughput, and next-generation sequencing) has brought whole-genome analysis into the clinic for the first time, and most of the current applications are directed at children with congenital conditions that are undiagnosable by using standard genetic tests for single-gene disorders. Thus, pediatricians must become familiar with this technology, what it can and cannot offer, and its technical and ethical challenges. Here, we address the concepts of human genomic analysis and its clinical applicability for primary care providers.

  4. The subclonal structure and genomic evolution of oral squamous cell carcinoma revealed by ultra-deep sequencing

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin J

    2017-01-01

    Recent studies suggest that head and neck squamous cell carcinomas are very heterogeneous between patients; however the subclonal structure remains unexplored mainly due to studies using only a single biopsy per patient. To deconvolute the clonal structure and describe the genomic cancer evolution...

  5. Genome sequence of a diabetes-prone rodent reveals a mutation hotspot around the ParaHox gene cluster

    DEFF Research Database (Denmark)

    Hargreaves, Adam D.; Zhou, Long; Christensen, Josef

    2017-01-01

    Pdx1 has been grossly affected by GC-biased mutation, leading to the highest divergence observed for this gene across the Bilateria. In addition to genomic insights into restricted caloric intake in a desert species, the discovery of a localized chromosomal region subject to elevated mutation suggests...

  6. Genome sequence and analysis of Lactobacillus helveticus

    Directory of Open Access Journals (Sweden)

    Paola Cremonesi

    2013-01-01

    Full Text Available The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of L. helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genome of five L. helveticus strains was sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones.

  7. Genome size analyses of Pucciniales reveal the largest fungal genomes

    Directory of Open Access Journals (Sweden)

    Silvia Tavares

    2014-08-01

    Full Text Available Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 151.5 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1,800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi reached 44.2 Mbp. Despite the fact that no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7,000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research.

  8. The complete chloroplast DNA sequence of the green alga Oltmannsiellopsis viridis reveals a distinctive quadripartite architecture in the chloroplast genome of early diverging ulvophytes

    Directory of Open Access Journals (Sweden)

    Lemieux Claude

    2006-02-01

    Full Text Available Abstract Background The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. The basal position of the Prasinophyceae has been well documented, but the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae is currently debated. The four complete chloroplast DNA (cpDNA) sequences presently available for representatives of these classes have revealed extensive variability in overall structure, gene content, intron composition and gene order. The chloroplast genome of Pseudendoclonium (Ulvophyceae), in particular, is characterized by an atypical quadripartite architecture that deviates from the ancestral type by a large inverted repeat (IR) featuring an inverted rRNA operon and a small single-copy (SSC) region containing 14 genes normally found in the large single-copy (LSC) region. To gain insights into the nature of the events that led to the reorganization of the chloroplast genome in the Ulvophyceae, we have determined the complete cpDNA sequence of Oltmannsiellopsis viridis, a representative of a distinct, early diverging lineage. Results The 151,933 bp IR-containing genome of Oltmannsiellopsis differs considerably from Pseudendoclonium and other chlorophyte cpDNAs in intron content and gene order, but shares close similarities with its ulvophyte homologue at the levels of quadripartite architecture, gene content and gene density. Oltmannsiellopsis cpDNA encodes 105 genes, contains five group I introns, and features many short dispersed repeats. As in Pseudendoclonium cpDNA, the rRNA genes in the IR are transcribed toward the single copy region featuring the genes typically found in the ancestral LSC region, and the opposite single copy region harbours genes characteristic of both the ancestral SSC and LSC regions. The 52 genes that were transferred from the ancestral LSC to SSC region include 12 of those observed in Pseudendoclonium cpDNA. Surprisingly, the overall gene organization of

  9. Genome sequencing reveals a new lineage associated with lablab bean and genetic exchange between Xanthomonas axonopodis pv. phaseoli and Xanthomonas fuscans subsp. fuscans

    Directory of Open Access Journals (Sweden)

    Valente Aritua

    2015-10-01

    Full Text Available Common bacterial blight is a devastating seed-borne disease of common beans that also occurs on other legume species including lablab and Lima beans. We sequenced and analysed the genomes of 26 isolates of Xanthomonas axonopodis pv. phaseoli and X. fuscans subsp. fuscans, the causative agents of this disease, collected over four decades and six continents. This revealed considerable genetic variation within both taxa, encompassing both single-nucleotide variants and differences in gene content, that could be exploited for tracking pathogen spread. The bacterial isolate from Lima bean fell within the previously described Genetic Lineage 1, along with the pathovar type isolate (NCPPB 3035. The isolates from lablab represent a new, previously unknown genetic lineage closely related to strains of X. axonopodis pv. glycines. Finally, we identified more than 100 genes that appear to have been recently acquired by Xanthomonas axonopodis pv. phaseoli from X. fuscans subsp. fuscans.

  10. Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome

    Directory of Open Access Journals (Sweden)

    Ritland Carol

    2009-08-01

    Full Text Available Abstract Background Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. Results We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4), from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. Conclusion We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene

  11. The Diversity of Sequence and Chromosomal Distribution of New Transposable Element-Related Segments in the Rye Genome Revealed by FISH and Lineage Annotation

    Directory of Open Access Journals (Sweden)

    Yingxin Zhang

    2017-10-01

    Full Text Available Transposable elements (TEs) in plant genomes exhibit a great variety of structure, sequence content and copy number, making them important drivers of species diversity and genome evolution. Even though a genome-wide statistical summary of TEs in rye has been obtained using high-throughput DNA sequencing technology, the accurate diversity of TEs in rye, as well as their chromosomal distribution and evolution, remains elusive due to the problems of assembling repetitive sequences and the highly dynamic and nested nature of TEs. In this study, using genomic plasmid library construction combined with dot-blot hybridization and fluorescence in situ hybridization (FISH) analysis, we successfully isolated 70 unique FISH-positive TE-related sequences including 47 rye genome specific ones: 30 showed homology or partial homology with previously FISH characterized sequences and 40 have not been characterized. Among the 70 sequences, 48 sequences carried Ty3/gypsy-derived segments, 7 sequences carried Ty1/copia-derived segments and 15 sequences carried segments homologous with multiple TE families. 26 TE lineages were found in the 70 sequences, and among these lineages, Wilma was found in sequences dispersed in all chromosome regions except telomeric positions; Abiba was found in sequences predominantly located at pericentromeric and centromeric positions; Wis, Carmilla, and Inga were found in sequences displaying signals dispersed from distal regions toward pericentromeric positions; except for DNA transposon lineages, all the other lineages were found in sequences displaying signals dispersed from proximal regions toward distal regions. A high percentage (21.4%) of chimeric sequences was identified in this study, and their high abundance in the rye genome suggests that new TEs might form through recombination and nested transposition. Our results also provide evidence that diverse TE lineages are arranged at centromeric and pericentromeric positions in rye, and lineages like

  12. Sequencing and Analysis of a Genomic Fragment Provide an Insight into the Dunaliella viridis Genomic Sequence

    Institute of Scientific and Technical Information of China (English)

    Xiao-Ming SUN; Yuan-Ping TANG; Xiang-Zong MENG; Wen-Wen ZHANG; Shan LI; Zhi-Rui DENG; Zheng-Kai XU; Ren-Tao SONG

    2006-01-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green algae. Its exceptional resistance to salt and various other stresses has made it an ideal model for the study of stress tolerance. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to sequences in the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to entries in the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large numbers of simple sequence repeats (microsatellites) were found, with a strong bias towards the (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (25,749 bp in total) was in agreement with these findings. These sequence features make it difficult to sequence Dunaliella genomic DNA. Further investigation is needed to reveal the biological significance of these unique sequence features.
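    The GC content figure quoted in this record is a simple base-composition ratio. As a minimal sketch (not the authors' pipeline; the function name gc_content and the choice to ignore ambiguous bases such as N are assumptions made here for illustration), it could be computed as follows:

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous A/C/G/T bases.

    Hypothetical helper for illustration; ambiguous symbols (e.g. N) are
    excluded from both the numerator and the denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy sequence, not data from the study.
    print(gc_content("GGCCAATT"))
```

    Excluding ambiguous bases keeps the percentage comparable between fragments of different assembly quality; counting them in the denominator would be an equally defensible convention.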

  13. Simple sequence repeats in mycobacterial genomes

    Indian Academy of Sciences (India)

    Vattipally B Sreenu; Pankaj Kumar; Javaregowda Nagaraju; Hampapathalu A Nagarajaram

    2007-01-01

    Simple sequence repeats (SSRs) or microsatellites are repetitive nucleotide sequences with motifs 1–6 bp in length. They are scattered throughout the genomes of all known organisms, ranging from viruses to eukaryotes. Microsatellites undergo mutations in the form of insertions and deletions (INDELS) of their repeat units, with some bias towards insertions that lead to microsatellite tract expansion. Although prokaryotic genomes derive some plasticity from microsatellite mutations, they have in-built mechanisms to arrest undue expansions of microsatellites, and one such mechanism is constituted by the post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these enzymes and, as a null hypothesis, one could expect these genomes to harbour many long tracts. It is therefore interesting to analyse the mycobacterial genomes for the distribution and abundance of microsatellite tracts and to look for potentially polymorphic microsatellites. Available mycobacterial genomes, Mycobacterium avium, M. leprae, M. bovis and the two strains of M. tuberculosis (CDC1551 and H37Rv), were analysed for frequencies and abundance of SSRs. Our analysis revealed that the SSRs are distributed throughout the mycobacterial genomes at an average of 220–230 SSR tracts per kb. All the mycobacterial genomes contain few regions that are conspicuously denser or poorer in microsatellites compared to their expected genome averages. The genomes distinctly show scarcity of long microsatellites despite the absence of a post-replicative DNA repair system. Such severe scarcity of long microsatellites could arise as a result of strong selection pressures operating against long and unstable sequences, although the influence of GC-content and the role of point mutations in arresting microsatellite expansions cannot be ruled out. Nonetheless, the long tracts occasionally found in coding as well as non-coding regions may account for limited genome plasticity in these genomes.
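    The definition above (tandem repeats of a 1–6 bp motif) can be illustrated with a regular-expression scan. This is a minimal sketch under stated assumptions, not the software used in the study: the function name find_ssrs, the min_repeats threshold and the (start, motif, repeat count) output are all hypothetical choices made here.

```python
import re


def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper for illustration. The lazy quantifier makes the
    regex prefer the shortest motif, so "AAAAAA" is reported as six
    copies of "A" rather than three copies of "AA".
    """
    seq = seq.upper()
    # A motif of 1..max_motif bases followed by at least
    # (min_repeats - 1) further exact copies of itself.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq)
    ]


if __name__ == "__main__":
    # Toy sequence, not data from the study.
    print(find_ssrs("TTACACACACGG"))
```

    Only perfect, non-overlapping repeats are reported; tolerating interrupted tracts or applying motif-specific length thresholds would need a different approach.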

  14. Genome sequence and analysis of the tuber crop potato

    DEFF Research Database (Denmark)

    Xu, X.; Pan, S.; Cheng, S.

    2011-01-01

    and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade...

  15. Pig genome sequence - analysis and publication strategy

    NARCIS (Netherlands)

    Archibald, A.L.; Bolund, L.; Churcher, C.; Fredholm, M.; Groenen, M.A.M.; Harlizius, B.

    2010-01-01

    Background - The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. Results - Assemblies of the B

  16. Biochemical and genome sequence analyses of Megasphaera sp. strain DISK18 from dental plaque of a healthy individual reveals commensal lifestyle

    Science.gov (United States)

    Nallabelli, Nayudu; Patil, Prashant P.; Pal, Vijay Kumar; Singh, Namrata; Jain, Ashish; Patil, Prabhu B.; Grover, Vishakha; Korpole, Suresh

    2016-01-01

    Much of the work in periodontal microbiology in recent years has focused on identifying and understanding periodontal pathogens. As the majority of oral microbes have not yet been isolated in pure form, it is essential to understand the phenotypic characteristics of microbes to decipher their role in oral environment. In this study, strain DISK18 was isolated from gingival sulcus and identified as a Megasphaera species. Although metagenomics studies revealed Megasphaera species as a major group within the oral habitat, they have never been isolated in cultivable form to date. Therefore, we have characterized the DISK18 strain to better understand its role in the periodontal ecosystem. Strain Megasphaera sp. DISK18 displayed the ability to adhere and self-aggregate, which are essential requisite features for inhabiting and persisting in oral cavity. It also coaggregated with other pioneer oral colonizers like Streptococcus and Lactobacillus species but not with Veillonella. This behaviour points towards its role in the ecologic succession of a multispecies biofilm as an early colonizer. The absence of virulence determining genes as observed in whole genome sequence analysis coupled with an inability to degrade collagen reveals that Megasphaera sp. strain DISK18 is likely not a pathogenic species and emphasizes its commensal lifestyle. PMID:27651180

  17. Sequencing of Pax6 loci from the elephant shark reveals a family of Pax6 genes in vertebrate genomes, forged by ancient duplications and divergences.

    Directory of Open Access Journals (Sweden)

    Vydianathan Ravi

    Full Text Available Pax6 is a developmental control gene essential for eye development throughout the animal kingdom. In addition, Pax6 plays key roles in other parts of the CNS, olfactory system, and pancreas. In mammals a single Pax6 gene encoding multiple isoforms delivers these pleiotropic functions. Here we provide evidence that the genomes of many other vertebrate species contain multiple Pax6 loci. We sequenced Pax6-containing BACs from the cartilaginous elephant shark (Callorhinchus milii and found two distinct Pax6 loci. Pax6.1 is highly similar to mammalian Pax6, while Pax6.2 encodes a paired-less Pax6. Using synteny relationships, we identify homologs of this novel paired-less Pax6.2 gene in lizard and in frog, as well as in zebrafish and in other teleosts. In zebrafish two full-length Pax6 duplicates were known previously, originating from the fish-specific genome duplication (FSGD and expressed in divergent patterns due to paralog-specific loss of cis-elements. We show that teleosts other than zebrafish also maintain duplicate full-length Pax6 loci, but differences in gene and regulatory domain structure suggest that these Pax6 paralogs originate from a more ancient duplication event and are hence renamed as Pax6.3. Sequence comparisons between mammalian and elephant shark Pax6.1 loci highlight the presence of short- and long-range conserved noncoding elements (CNEs. Functional analysis demonstrates the ancient role of long-range enhancers for Pax6 transcription. We show that the paired-less Pax6.2 ortholog in zebrafish is expressed specifically in the developing retina. Transgenic analysis of elephant shark and zebrafish Pax6.2 CNEs with homology to the mouse NRE/Pα internal promoter revealed highly specific retinal expression. Finally, morpholino depletion of zebrafish Pax6.2 resulted in a "small eye" phenotype, supporting a role in retinal development. 
In summary, our study reveals that the pleiotropic functions of Pax6 in vertebrates are served by

  18. Genotyping by sequencing reveals the interspecific C. maxima / C. reticulata admixture along the genomes of modern citrus varieties of mandarins, tangors, tangelos, orangelos and grapefruits.

    Science.gov (United States)

    Oueslati, Amel; Salhi-Hannachi, Amel; Luro, François; Vignes, Hélène; Mournet, Pierre; Ollitrault, Patrick

    2017-01-01

    The mandarin horticultural group is an important component of world citrus production for the fresh fruit market. This group, formerly classified as C. reticulata, is highly polymorphic, and recent molecular studies have suggested that numerous cultivated mandarins were introgressed by C. maxima (the pummelos). C. maxima and C. reticulata are also the ancestors of sweet and sour oranges, grapefruit, and therefore of all the "small citrus" modern varieties (mandarins, tangors, tangelos) derived from sexual hybridization between these horticultural groups. Recently, NGS technologies have greatly modified how plant evolution and genomic structure are analyzed, moving from phylogenetics to phylogenomics. The objective of this work was to develop a workflow for phylogenomic inference from Genotyping By Sequencing (GBS) data and to analyze the interspecific admixture along the nine citrus chromosomes for horticultural groups and recent varieties resulting from the combination of the C. reticulata and C. maxima gene pools. A GBS library was established from 55 citrus varieties, using the ApeKI restriction enzyme and selective PCR to improve the read depth. Diagnostic polymorphisms (DPs) of C. reticulata/C. maxima differentiation were identified and used to decipher the phylogenomic structure of the 55 varieties. The GBS approach was powerful and revealed 30,289 SNPs and 8,794 Indels with 12.6% of missing data. 11,133 DPs were selected covering the nine chromosomes with a higher density in genic regions. GBS combined with the detection of DPs was powerful for deciphering the "phylogenomic karyotypes" of cultivars derived from admixture of the two ancestral species after a limited number of interspecific recombinations. All the mandarins, mandarin hybrids, tangelos and tangors analyzed displayed introgression of C. maxima in different parts of the genome. C. reticulata/C. maxima admixture should be a major component of the high phenotypic variability of this germplasm opening

  19. The genome sequences of Cellulomonas fimi and "Cellvibrio gilvus" reveal the cellulolytic strategies of two facultative anaerobes, transfer of "Cellvibrio gilvus" to the genus Cellulomonas, and proposal of Cellulomonas gilvus sp. nov.

    Directory of Open Access Journals (Sweden)

    Melissa R Christopherson

    Full Text Available Actinobacteria in the genus Cellulomonas are the only known and reported cellulolytic facultative anaerobes. To better understand the cellulolytic strategy employed by these bacteria, we sequenced the genome of Cellulomonas fimi ATCC 484(T). For comparative purposes, we also sequenced the genome of the aerobic cellulolytic "Cellvibrio gilvus" ATCC 13127(T). An initial analysis of these genomes using phylogenetic and whole-genome comparison revealed that "Cellvibrio gilvus" belongs to the genus Cellulomonas. We thus propose to assign "Cellvibrio gilvus" to the genus Cellulomonas. A comparative genomics analysis between these two Cellulomonas genome sequences and the recently completed genome for Cellulomonas flavigena ATCC 482(T) showed that these cellulomonads do not encode cellulosomes but appear to degrade cellulose by secreting multi-domain glycoside hydrolases. Despite the minimal number of carbohydrate-active enzymes encoded by these genomes, as compared to other known cellulolytic organisms, these bacteria were found to be proficient at degrading and utilizing a diverse set of carbohydrates, including crystalline cellulose. Moreover, they also encode proteins required for the fermentation of hexose and xylose sugars into products such as ethanol. Finally, we found relatively few significant differences between the predicted carbohydrate-active enzymes encoded by these Cellulomonas genomes, in contrast to previous studies reporting differences in physiological approaches for carbohydrate degradation. Our sequencing and analysis of these genomes shed light on the mechanism through which these facultative anaerobes degrade cellulose, suggesting that the sequenced cellulomonads use secreted, multidomain enzymes to degrade cellulose in a way that is distinct from known anaerobic cellulolytic strategies.

  20. Whole genome sequencing and analysis of Campylobacter coli YH502 from retail chicken reveals a plasmid-borne type VI secretion system

    Science.gov (United States)

    Campylobacter is a major cause of foodborne illnesses worldwide. Campylobacter infections, commonly caused by ingestion of undercooked poultry and meat products, can lead to gastroenteritis and chronic reactive arthritis in humans. Whole genome sequencing (WGS) is a powerful technology that provides...

  1. Pilot Sequencing of Onion Genomic DNA Reveals Fragments of Transposable Elements, Low Gene Densities, and Significant Gene Enrichment After Methyl Filtration

    Science.gov (United States)

    Onion (Allium cepa) is a diploid (2n=2x=16) monocot with one of the largest nuclear genomes among cultivated plants, over 6 and 16 times that of maize and rice, respectively. In this study, we sequenced onion BACs to estimate gene densities and investigate the nature and distribution of repetitive ...

  2. Pig genome sequence - analysis and publication strategy

    DEFF Research Database (Denmark)

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol;

    2010-01-01

    BACKGROUND: The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. RESULTS: Assemblies......) is under construction and will incorporate whole genome shotgun sequence (WGS) data providing > 30x genome coverage. The WGS sequence, most of which comprise short Illumina/Solexa reads, were generated from DNA from the same single Duroc sow as the source of the BAC library from which clones were...

  3. Genome-wide analysis reveals diverged patterns of codon bias, gene expression, and rates of sequence evolution in Picea gene families.

    Science.gov (United States)

    De La Torre, Amanda R; Lin, Yao-Cheng; Van de Peer, Yves; Ingvarsson, Pär K

    2015-03-05

    The recent sequencing of several gymnosperm genomes has greatly facilitated studying the evolution of their genes and gene families. In this study, we examine the evidence for expression-mediated selection in the first two fully sequenced representatives of the gymnosperm plant clade (Picea abies and Picea glauca). We use genome-wide estimates of gene expression (>50,000 expressed genes) to study the relationship between gene expression, codon bias, rates of sequence divergence, protein length, and gene duplication. We found that gene expression is correlated with rates of sequence divergence and codon bias, suggesting that natural selection is acting on Picea protein-coding genes for translational efficiency. Gene expression, rates of sequence divergence, and codon bias are correlated with the size of gene families, with large multicopy gene families having, on average, a lower expression level and breadth, lower codon bias, and higher rates of sequence divergence than single-copy gene families. Tissue-specific patterns of gene expression were more common in large gene families with large gene expression divergence than in single-copy families. Recent family expansions combined with large gene expression variation in paralogs and increased rates of sequence evolution suggest that some Picea gene families are rapidly evolving to cope with biotic and abiotic stress. Our study highlights the importance of gene expression and natural selection in shaping the evolution of protein-coding genes in Picea species, and sets the ground for further studies investigating the evolution of individual gene families in gymnosperms.

  4. Human-mouse comparative genomics: successes and failures to reveal functional regions of the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Pennacchio, Len A.; Baroukh, Nadine; Rubin, Edward M.

    2003-05-15

    Deciphering the genetic code embedded within the human genome remains a significant challenge despite the human genome consortium's recent success at defining its linear sequence (Lander et al. 2001; Venter et al. 2001). While useful strategies exist to identify a large percentage of protein encoding regions, efforts to accurately define functional sequences in the remaining approximately 97 percent of the genome lag. Our primary interest has been to utilize the evolutionary relationship and the universal nature of genomic sequence information in vertebrates to reveal functional elements in the human genome. This has been achieved through the combined use of vertebrate comparative genomics to pinpoint highly conserved sequences as candidates for biological activity and transgenic mouse studies to address the functionality of defined human DNA fragments. Accordingly, we describe strategies and insights into functional sequences in the human genome through the use of comparative genomics coupled with functional studies in the mouse.

  5. Sequence Analysis of Staphylococcus hyicus ATCC 11249T, an Etiological Agent of Exudative Epidermitis in Swine, Reveals a Type VII Secretion System Locus and a Novel 116-Kilobase Genomic Island Harboring Toxin-Encoding Genes.

    Science.gov (United States)

    Calcutt, Michael J; Foecking, Mark F; Hsieh, Hsin-Yeh; Adkins, Pamela R F; Stewart, George C; Middleton, John R

    2015-02-19

    Staphylococcus hyicus is the primary etiological agent of exudative epidermitis in swine. Analysis of the complete genome sequence of the type strain revealed a locus encoding a type VII secretion system and a large chromosomal island harboring the genes encoding exfoliative toxin ExhA and an EDIN toxin homolog.

  6. Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats.

    Science.gov (United States)

    Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang

    2016-09-01

    Genomic information about Muscovy duck parvovirus (MDPV) is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITR) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strains FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based on the protein-coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion in the ITR may contribute to the genome evolution of MDPV under immunization pressure.

  7. Genome Sequences of Eight Morphologically Diverse Alphaproteobacteria

    OpenAIRE

    Brown, Pamela J.B.; Kysela, David T.; Buechlein, Aaron; Hemmerich, Chris; Brun, Yves V

    2011-01-01

    The Alphaproteobacteria comprise morphologically diverse bacteria, including many species of stalked bacteria. Here we announce the genome sequences of eight alphaproteobacteria, including the first genome sequences of species belonging to the genera Asticcacaulis, Hirschia, Hyphomicrobium, and Rhodomicrobium.

  8. Genome sequences of eight morphologically diverse Alphaproteobacteria.

    Science.gov (United States)

    Brown, Pamela J B; Kysela, David T; Buechlein, Aaron; Hemmerich, Chris; Brun, Yves V

    2011-09-01

    The Alphaproteobacteria comprise morphologically diverse bacteria, including many species of stalked bacteria. Here we announce the genome sequences of eight alphaproteobacteria, including the first genome sequences of species belonging to the genera Asticcacaulis, Hirschia, Hyphomicrobium, and Rhodomicrobium.

  9. Genome Sequences of Eight Morphologically Diverse Alphaproteobacteria

    Science.gov (United States)

    Brown, Pamela J. B.; Kysela, David T.; Buechlein, Aaron; Hemmerich, Chris; Brun, Yves V.

    2011-01-01

    The Alphaproteobacteria comprise morphologically diverse bacteria, including many species of stalked bacteria. Here we announce the genome sequences of eight alphaproteobacteria, including the first genome sequences of species belonging to the genera Asticcacaulis, Hirschia, Hyphomicrobium, and Rhodomicrobium. PMID:21705585

  10. Genome Sequence of Mycobacteriophage Momo.

    Science.gov (United States)

    Pope, Welkin H; Bina, Elizabeth A; Brahme, Indraneel S; Hill, Amy B; Himmelstein, Philip H; Hunsicker, Sara M; Ish, Amanda R; Le, Tinh S; Martin, Mary M; Moscinski, Catherine N; Shetty, Sameer A; Swierzewski, Tomasz; Iyengar, Varun B; Kim, Hannah; Schafer, Claire E; Grubb, Sarah R; Warner, Marcie H; Bowman, Charles A; Russell, Daniel A; Hatfull, Graham F

    2015-06-18

    Momo is a newly discovered phage of Mycobacterium smegmatis mc(2)155. Momo has a double-stranded DNA genome 154,553 bp in length, with 233 predicted protein-encoding genes, 34 tRNA genes, and one transfer-messenger RNA (tmRNA) gene. Momo has a myoviral morphology and shares extensive nucleotide sequence similarity with subcluster C1 mycobacteriophages. Copyright © 2015 Pope et al.

  11. Genome sequence of the Chlamydophila abortus variant strain LLG.

    Science.gov (United States)

    Sait, Michelle; Clark, Ewan M; Wheelhouse, Nick; Livingstone, Morag; Spalding, Lucy; Siarkou, Victoria I; Vretou, Evangelia; Smith, David G E; Lainson, F Alex; Longbottom, David

    2011-08-01

    Chlamydophila abortus is a common cause of ruminant abortion. Here we report the genome sequence of strain LLG, which differs genotypically and phenotypically from the wild-type strain S26/3. Genome sequencing revealed differences between LLG and S26/3 to occur in pseudogene content, in transmembrane head/inc family proteins, and in biotin biosynthesis genes.

  12. Genome Polymorphisms Between Indica and Japonica Revealed by RFLP

    Institute of Scientific and Technical Information of China (English)

    WANG Song-wen; LIU Xia; XU Cai-guo; SHI Li-li; ZHANG Xin; DING De-liang; WANG Yong

    2007-01-01

    To reveal the genome polymorphisms between the indica and japonica subspecies, RFLP markers located across the 12 chromosomes of rice were used to analyze indica-japonica differentiation in different rice varieties. At the same time, genome sequence variations at the screened loci were analyzed by bioinformatic methods. Twenty-eight RFLP probes that can classify indica-japonica rice were confirmed. Subspecies genome polymorphisms at the screened loci were found by analyzing the published genome sequence data of rice. The study indicated that these screened markers can be used for classifying the indica and japonica subspecies. With the publication of the genome sequences of rice, marker polymorphisms between the indica and japonica subspecies can be revealed by genome differentiation.

  13. The genomic sequence of Exiguobacterium chiriqhucha str. N139 reveals a species that thrives in cold waters and extreme environmental conditions

    Directory of Open Access Journals (Sweden)

    Ana Gutiérrez-Preciado

    2017-04-01

    Full Text Available We report the genome sequence of Exiguobacterium chiriqhucha str. N139, isolated from a high-altitude Andean lake. Comparative genomic analyses of the Exiguobacterium genomes available suggest that our strain belongs to the same species as the previously reported E. pavilionensis str. RW-2 and Exiguobacterium str. GIC 31. We describe this species and propose the chiriqhucha name to group them. ‘Chiri qhucha’ in Quechua means ‘cold lake’, which is a common origin of these three cosmopolitan Exiguobacteria. The 2,952,588-bp E. chiriqhucha str. N139 genome contains one chromosome and three megaplasmids. The genome analysis of the Andean strain suggests the presence of enzymes that confer E. chiriqhucha str. N139 the ability to grow under multiple environmental extreme conditions, including high concentrations of different metals, high ultraviolet B radiation, scavenging for phosphorous and coping with high salinity. Moreover, the regulation of its tryptophan biosynthesis suggests that novel pathways remain to be discovered, and that these pathways might be fundamental in the amino acid metabolism of the microbial community from Laguna Negra, Argentina.

  14. Whole genome sequencing of Guzerá cattle reveals genetic variants in candidate genes for production, disease resistance, and heat tolerance.

    Science.gov (United States)

    Rosse, Izinara C; Assis, Juliana G; Oliveira, Francislon S; Leite, Laura R; Araujo, Flávio; Zerlotini, Adhemar; Volpini, Angela; Dominitini, Anderson J; Lopes, Beatriz C; Arbex, Wagner A; Machado, Marco A; Peixoto, Maria G C D; Verneque, Rui S; Martins, Marta F; Coimbra, Roney S; Silva, Marcos V G B; Oliveira, Guilherme; Carvalho, Maria Raquel S

    2017-02-01

    In bovines, artificial selection has produced a large number of breeds which differ in production, environmental adaptation, and health characteristics. To investigate the genetic basis of these phenotypical differences, several bovine breeds have been sequenced. Millions of new SNVs were described for every new breed sequenced, suggesting that every breed should be sequenced. Guzerat or Guzerá is an indicine breed resistant to drought and parasites that has been the base for some important breeds such as Brahman. Here, we describe the sequence of the Guzerá genome and the in silico functional analyses of intragenic breed-specific variations. Mate-paired libraries were generated using the ABI SOLiD system. Sequences were mapped to the Bos taurus reference genome (UMD 3.1) and 87% of the reference genome was covered at 26X coverage. Among the variants identified, 2,676,067 SNVs and 463,158 INDELs were homozygous, not found in any database searched, and may represent true differences between Guzerá and B. taurus. Functional analyses investigated with the NGS-SNP package focused on 1069 new, non-synonymous SNVs, splice-site variants (including acceptor and donor sites, and the conserved regions at both intron borders, referred to here as splice regions) and coding INDELs (NS/SS/I). These NS/SS/I map to 935 genes belonging to cell communication, environmental adaptation, signal transduction, sensory, and immune systems pathways. These pathways have been involved in phenotypes related to health, adaptation to the environment and behavior, and particularly, disease resistance and heat tolerance. Indeed, 105 of these genes are known QTLs for milk, meat and carcass, production, reproduction, and health traits. Therefore, in addition to describing new genetic variants, our approach provided groundwork for unraveling key candidate genes and mutations.

  15. The mitochondrial genome sequence of the ciliate Paramecium caudatum reveals a shift in nucleotide composition and codon usage within the genus Paramecium

    Directory of Open Access Journals (Sweden)

    Berendonk Thomas U

    2011-05-01

    Full Text Available Abstract Background Despite the fact that the organization of the ciliate mitochondrial genome is exceptional, only a few ciliate mitochondrial genomes have been sequenced to date. All ciliate mitochondrial genomes are linear. They are 40 kb to 47 kb long and contain some 50 tightly packed genes without introns. Earlier studies documented that the mitochondrial guanine + cytosine contents are very different between Paramecium tetraurelia and all studied Tetrahymena species. This raises the question of whether the high mitochondrial G+C content observed in P. tetraurelia is a characteristic property of Paramecium mtDNA, or whether it is an exception of the ciliate mitochondrial genomes known so far. To test this question, we determined the mitochondrial genome sequence of Paramecium caudatum and compared the gene content and sequence properties to the closely related P. tetraurelia. Results The guanine + cytosine content of the P. caudatum mitochondrial genome was significantly lower than that of P. tetraurelia (22.4% vs. 41.2%). This difference in the mitochondrial nucleotide composition was accompanied by significantly different codon usage patterns in both species, i.e. within P. caudatum clearly A/T ending codons dominated, whereas for P. tetraurelia the synonymous codons were more balanced with a higher number of G/C ending codons. Further analyses indicated that the nucleotide composition of most members of the genus Paramecium resembles that of P. caudatum and that the shift observed in P. tetraurelia is restricted to the P. aurelia species complex. Conclusions Surprisingly, the codon usage bias in the P. caudatum mitochondrial genome, exemplified by the effective number of codons, is more similar to the distantly related T. pyriformis and other single-celled eukaryotes such as Chlamydomonas, than to the closely related P. tetraurelia. These differences in base composition and codon usage bias were, however, not reflected in the amino

  16. The complete chloroplast genome sequence of the chlorophycean green alga Scenedesmus obliquus reveals a compact gene organization and a biased distribution of genes on the two DNA strands

    Directory of Open Access Journals (Sweden)

    Lemieux Claude

    2006-04-01

    Full Text Available Abstract Background The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. While the basal position of the Prasinophyceae is well established, the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae (UTC) remains uncertain. The five complete chloroplast DNA (cpDNA) sequences currently available for representatives of these classes display considerable variability in overall structure, gene content, gene density, intron content and gene order. Among these genomes, that of the chlorophycean green alga Chlamydomonas reinhardtii has retained the least ancestral features. The two single-copy regions, which are separated from one another by the large inverted repeat (IR), have similar sizes, rather than unequal sizes, and differ radically in both gene contents and gene organizations relative to the single-copy regions of prasinophyte and ulvophyte cpDNAs. To gain insights into the various changes that the chloroplast genome underwent during the evolution of chlorophycean green algae, we have sequenced the cpDNA of Scenedesmus obliquus, a member of a distinct chlorophycean lineage. Results The 161,452 bp IR-containing genome of Scenedesmus features single-copy regions of similar sizes, encodes 96 genes, i.e. only two additional genes (infA and rpl12) relative to its Chlamydomonas homologue and contains seven group I and two group II introns. It is clearly more compact than the four UTC algal cpDNAs that have been examined so far, displays the lowest proportion of short repeats among these algae and shows a stronger bias in clustering of genes on the same DNA strand compared to Chlamydomonas cpDNA. Like the latter genome, Scenedesmus cpDNA displays only a few ancestral gene clusters. The two chlorophycean genomes share 11 gene clusters that are not found in previously sequenced trebouxiophyte and ulvophyte cpDNAs as well as a few genes that have an unusual structure; however, their single

  17. Agaricus bisporus genome sequence: a commentary.

    Science.gov (United States)

    Kerrigan, Richard W; Challen, Michael P; Burton, Kerry S

    2013-06-01

    The genomes of two isolates of Agaricus bisporus have been sequenced recently. This soil-inhabiting fungus has a wide geographical distribution in nature and it is also cultivated in an industrialized indoor process ($4.7bn annual worldwide value) to produce edible mushrooms. Previously this lignocellulosic fungus has resisted precise econutritional classification, i.e. into white- or brown-rot decomposers. The generation of the genome sequence and transcriptomic analyses has revealed a new classification, 'humicolous', for species adapted to grow in humic-rich, partially decomposed leaf material. The Agaricus bisporus genomes contain a collection of polysaccharide and lignin-degrading genes and more interestingly an expanded number of genes (relative to other lignocellulosic fungi) that enhance degradation of lignin derivatives, i.e. heme-thiolate peroxidases and β-etherases. A motif that is hypothesized to be a promoter element in the humicolous adaptation suite is present in a large number of genes specifically up-regulated when the mycelium is grown on humic-rich substrate. The genome sequence of A. bisporus offers a platform to explore fungal biology in carbon-rich soil environments and terrestrial cycling of carbon, nitrogen, phosphorus and potassium.

  18. Genome sequence of the deep-rooted Yersinia pestis strain Angola reveals new insights into the evolution and pangenome of the plague bacterium.

    Science.gov (United States)

    Eppinger, Mark; Worsham, Patricia L; Nikolich, Mikeljon P; Riley, David R; Sebastian, Yinong; Mou, Sherry; Achtman, Mark; Lindler, Luther E; Ravel, Jacques

    2010-03-01

    To gain insights into the origin and genome evolution of the plague bacterium Yersinia pestis, we have sequenced the deep-rooted strain Angola, a virulent Pestoides isolate. Its ancient nature makes this atypical isolate of particular importance in understanding the evolution of plague pathogenicity. Its chromosome features a unique genetic make-up intermediate between modern Y. pestis isolates and its evolutionary ancestor, Y. pseudotuberculosis. Our genotypic and phenotypic analyses led us to conclude that Angola belongs to one of the most ancient Y. pestis lineages thus far sequenced. The mobilome carries the first reported chimeric plasmid combining the two species-specific virulence plasmids. Genomic findings were validated in virulence assays demonstrating that its pathogenic potential is distinct from modern Y. pestis isolates. Human infection with this particular isolate would not be diagnosed by the standard clinical tests, as Angola lacks the plasmid-borne capsule, and a possible emergence of this genotype raises major public health concerns. To assess the genomic plasticity in Y. pestis, we investigated the global gene reservoir and estimated the pangenome at 4,844 unique protein-coding genes. As shown by the genomic analysis of this evolutionary key isolate, we found that the genomic plasticity within Y. pestis clearly was not as limited as previously thought, which is strengthened by the detection of the largest number of isolate-specific single-nucleotide polymorphisms (SNPs) currently reported in the species. This study identified numerous novel genetic signatures, some of which seem to be intimately associated with plague virulence. These markers are valuable in the development of a robust typing system critical for forensic, diagnostic, and epidemiological studies.

  19. Translational genomics for plant breeding with the genome sequence explosion.

    Science.gov (United States)

    Kang, Yang Jae; Lee, Taeyoung; Lee, Jayern; Shim, Sangrea; Jeong, Haneul; Satyawan, Dani; Kim, Moon Young; Lee, Suk-Ha

    2016-04-01

    The use of next-generation sequencers and advanced genotyping technologies has propelled the field of plant genomics in model crops and plants and enhanced the discovery of hidden bridges between genotypes and phenotypes. The newly generated reference sequences of unstudied minor plants can be annotated by the knowledge of model plants via translational genomics approaches. Here, we reviewed the strategies of translational genomics and suggested perspectives on the current databases of genomic resources and the database structures of translated information on the new genome. As a draft picture of phenotypic annotation, translational genomics on newly sequenced plants will provide valuable assistance for breeders and researchers who are interested in genetic studies.

  20. Sequencing intractable DNA to close microbial genomes.

    Science.gov (United States)

    Hurt, Richard A; Brown, Steven D; Podar, Mircea; Palumbo, Anthony V; Elias, Dwayne A

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  1. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Full Text Available Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  2. Fungal genome sequencing: basic biology to biotechnology.

    Science.gov (United States)

    Sharma, Krishna Kant

    2016-08-01

    The genome sequences provide a first glimpse into the genomic basis of the biological diversity of filamentous fungi and yeast. The genome sequence of the budding yeast, Saccharomyces cerevisiae, with a small genome size, unicellular growth, and rich history of genetic and molecular analyses was a milestone of early genomics in the 1990s. The subsequent completion of fission yeast, Schizosaccharomyces pombe and genetic model, Neurospora crassa initiated a revolution in the genomics of the fungal kingdom. In due course of time, a substantial number of fungal genomes have been sequenced and publicly released, representing the widest sampling of genomes from any eukaryotic kingdom. An ambitious genome-sequencing program provides a wealth of data on metabolic diversity within the fungal kingdom, thereby enhancing research into medical science, agriculture science, ecology, bioremediation, bioenergy, and the biotechnology industry. Fungal genomics have higher potential to positively affect human health, environmental health, and the planet's stored energy. With a significant increase in sequenced fungal genomes, the known diversity of genes encoding organic acids, antibiotics, enzymes, and their pathways has increased exponentially. Currently, over a hundred fungal genome sequences are publicly available; however, no inclusive review has been published. This review is an initiative to address the significance of the fungal genome-sequencing program and provides the road map for basic and applied research.

  3. Value of a newly sequenced bacterial genome

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Aburjaile, Flavia F; Ramos, Rommel Tj

    2014-01-01

    and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses...

  4. Synaptotagmin gene content of the sequenced genomes

    Directory of Open Access Journals (Sweden)

    Craxton Molly

    2004-07-01

    Full Text Available Abstract Background Synaptotagmins exist as a large gene family in mammals. There is much interest in the function of certain family members which act crucially in the regulated synaptic vesicle exocytosis required for efficient neurotransmission. Knowledge of the functions of other family members is relatively poor and the presence of Synaptotagmin genes in plants indicates a role for the family as a whole which is wider than neurotransmission. Identification of the Synaptotagmin genes within completely sequenced genomes can provide the entire Synaptotagmin gene complement of each sequenced organism. Defining the detailed structures of all the Synaptotagmin genes and their encoded products can provide a useful resource for functional studies and a deeper understanding of the evolution of the gene family. The current rapid increase in the number of sequenced genomes from different branches of the tree of life, together with the public deposition of evolutionarily diverse transcript sequences make such studies worthwhile. Results I have compiled a detailed list of the Synaptotagmin genes of Caenorhabditis, Anopheles, Drosophila, Ciona, Danio, Fugu, Mus, Homo, Arabidopsis and Oryza by examining genomic and transcript sequences from public sequence databases together with some transcript sequences obtained by cDNA library screening and RT-PCR. I have compared all of the genes and investigated the relationship between plant Synaptotagmins and their non-Synaptotagmin counterparts. Conclusions I have identified and compared 98 Synaptotagmin genes from 10 sequenced genomes. Detailed comparison of transcript sequences reveals abundant and complex variation in Synaptotagmin gene expression and indicates the presence of Synaptotagmin genes in all animals and land plants. Amino acid sequence comparisons indicate patterns of conservation and diversity in function. Phylogenetic analysis shows the origin of Synaptotagmins in multicellular eukaryotes and their

  5. Experimental evolution, genetic analysis and genome re-sequencing reveal the mutation conferring artemisinin resistance in an isogenic lineage of malaria parasites

    KAUST Repository

    Hunt, Paul

    2010-09-16

    Background: Classical and quantitative linkage analyses of genetic crosses have traditionally been used to map genes of interest, such as those conferring chloroquine or quinine resistance in malaria parasites. Next-generation sequencing technologies now present the possibility of determining genome-wide genetic variation at single base-pair resolution. Here, we combine in vivo experimental evolution, a rapid genetic strategy and whole genome re-sequencing to identify the precise genetic basis of artemisinin resistance in a lineage of the rodent malaria parasite, Plasmodium chabaudi. Such genetic markers will further the investigation of resistance and its control in natural infections of the human malaria, P. falciparum. Results: A lineage of isogenic in vivo drug-selected mutant P. chabaudi parasites was investigated. By measuring the artemisinin responses of these clones, the appearance of an in vivo artemisinin resistance phenotype within the lineage was defined. The underlying genetic locus was mapped to a region of chromosome 2 by Linkage Group Selection in two different genetic crosses. Whole-genome deep coverage short-read re-sequencing (Illumina/Solexa) defined the point mutations, insertions, deletions and copy-number variations arising in the lineage. Eight point mutations arise within the mutant lineage, only one of which appears on chromosome 2. This missense mutation arises contemporaneously with artemisinin resistance and maps to a gene encoding a de-ubiquitinating enzyme. Conclusions: This integrated approach facilitates the rapid identification of mutations conferring selectable phenotypes, without prior knowledge of biological and molecular mechanisms. For malaria, this model can identify candidate genes before resistant parasites are commonly observed in natural human malaria populations. 2010 Hunt et al; licensee BioMed Central Ltd.

  6. Draft Genome Sequence of Lactobacillus rhamnosus 2166.

    OpenAIRE

    Karlyshev, Andrey V.; Melnikov, Vyacheslav G.; Kosarev, Igor V.; Abramov, Vyacheslav M.

    2014-01-01

    In this report, we present a draft sequence of the genome of Lactobacillus rhamnosus strain 2166, a potential novel probiotic. Genome annotation and read mapping onto a reference genome of L. rhamnosus strain GG allowed for the identification of the differences and similarities in the genomic contents and gene arrangements of these strains.

  8. Value of a newly sequenced bacterial genome

    Institute of Scientific and Technical Information of China (English)

    Eudes GV Barbosa; Flavia F Aburjaile; Rommel TJ Ramos; Adriana R Carneiro; Yves Le Loir; Jan Baumbach; Anderson Miyoshi; Artur Silva; Vasco Azevedo

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information.

  9. Whole genome sequencing of field isolates reveals a common duplication of the Duffy binding protein gene in Malagasy Plasmodium vivax strains.

    Directory of Open Access Journals (Sweden)

    Didier Menard

    2013-11-01

    Full Text Available BACKGROUND: Plasmodium vivax is the most prevalent human malaria parasite, causing serious public health problems in malaria-endemic countries. Until recently the Duffy-negative blood group phenotype was considered to confer resistance to vivax malaria for most African ethnicities. We and others have reported that P. vivax strains in African countries from Madagascar to Mauritania display capacity to cause clinical vivax malaria in Duffy-negative people. New insights must now explain Duffy-independent P. vivax invasion of human erythrocytes. METHODS/PRINCIPAL FINDINGS: Through recent whole genome sequencing we obtained ≥ 70× coverage of the P. vivax genome from five field-isolates, resulting in ≥ 93% of the Sal I reference sequenced at coverage greater than 20×. Combined with sequences from one additional Malagasy field isolate and from five monkey-adapted strains, we describe here identification of DNA sequence rearrangements in the P. vivax genome, including discovery of a duplication of the P. vivax Duffy binding protein (PvDBP) gene. A survey of Malagasy patients infected with P. vivax showed that the PvDBP duplication was present in numerous locations in Madagascar and found in over 50% of infected patients evaluated. Extended geographic surveys showed that the PvDBP duplication was detected frequently in vivax patients living in East Africa and in some residents of non-African P. vivax-endemic countries. Additionally, the PvDBP duplication was observed in travelers seeking treatment of vivax malaria upon returning home. PvDBP duplication prevalence was highest in west-central Madagascar sites where the highest frequencies of P. vivax-infected, Duffy-negative people were reported. CONCLUSIONS/SIGNIFICANCE: The highly conserved nature of the sequence involved in the PvDBP duplication suggests that it has occurred in a recent evolutionary time frame. These data suggest that PvDBP, a merozoite surface protein involved in red cell adhesion

  10. Inconsistencies of genome annotations in apicomplexan parasites revealed by 5'-end-one-pass and full-length sequences of oligo-capped cDNAs

    Directory of Open Access Journals (Sweden)

    Sugano Sumio

    2009-07-01

    Full Text Available Abstract Background Apicomplexan parasites are causative agents of various diseases including malaria and have been targets of extensive genomic sequencing. We generated 5'-EST collections for six apicomplexa parasites using our full-length oligo-capping cDNA library method. To improve upon the current genome annotations, as well as to validate the importance for physical cDNA clone resources, we generated a large-scale collection of full-length cDNAs for several apicomplexa parasites. Results In this study, we used a total of 61,056 5'-end-single-pass cDNA sequences from Plasmodium falciparum, P. vivax, P. yoelii, P. berghei, Cryptosporidium parvum, and Toxoplasma gondii. We compared these partially sequenced cDNA sequences with the currently annotated gene models and observed significant inconsistencies between the two datasets. In particular, we found that on average 14% of the exons in the current gene models were not supported by any cDNA evidence, and that 16% of the current gene models may contain at least one mis-annotation and should be re-evaluated. We also identified a large number of transcripts that had been previously unidentified. For 732 cDNAs in T. gondii, the entire sequences were determined in order to evaluate the annotated gene models at the complete full-length transcript level. We found that 41% of the T. gondii gene models contained at least one inconsistency. We also identified and confirmed by RT-PCR 140 previously unidentified transcripts found in the intergenic regions of the current gene annotations. We show that the majority of these discrepancies are due to questionable predictions of one or two extra exons in the upstream or downstream regions of the genes. Conclusion Our data indicates that the current gene models are likely to still be incomplete and have much room for improvement. Our unique full-length cDNA information is especially useful for further refinement of the annotations for the genomes of

  11. Salmo salar and Esox lucius full-length cDNA sequences reveal changes in evolutionary pressures on a post-tetraploidization genome

    Directory of Open Access Journals (Sweden)

    Holt Robert A

    2010-04-01

    Full Text Available Abstract Background Salmonids are one of the most intensely studied fish, in part due to their economic and environmental importance, and in part due to a recent whole genome duplication in the common ancestor of salmonids. This duplication greatly impacts species diversification, functional specialization, and adaptation. Extensive new genomic resources have recently become available for Atlantic salmon (Salmo salar), but documentation of allelic versus duplicate reference genes remains a major uncertainty in the complete characterization of its genome and its evolution. Results From existing expressed sequence tag (EST) resources and three new full-length cDNA libraries, 9,057 reference quality full-length gene insert clones were identified for Atlantic salmon. A further 1,365 reference full-length clones were annotated from 29,221 northern pike (Esox lucius) ESTs. Pairwise dN/dS comparisons within each of 408 sets of duplicated salmon genes using northern pike as a diploid out-group show asymmetric relaxation of selection on salmon duplicates. Conclusions 9,057 full-length reference genes were characterized in S. salar and can be used to identify alleles and gene family members. Comparisons of duplicated genes show that while purifying selection is the predominant force acting on both duplicates, consistent with retention of functionality in both copies, some relaxation of pressure on gene duplicates can be identified. In addition, there is evidence that evolution has acted asymmetrically on paralogs, allowing one of the pair to diverge at a faster rate.

  12. The Genome Sequence of Psychrobacter arcticus 273-4, a Psychroactive Siberian Permafrost Bacterium, Reveals Mechanisms for Adaptation to Low-Temperature Growth

    Energy Technology Data Exchange (ETDEWEB)

    Ayala-del-Rio, Hector L. [Michigan State University, East Lansing; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Grzymski, Joseph J. [Desert Research Institute, Reno, NV; Ponder, Monica [Michigan State University, East Lansing; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Bergholz, Peter [Michigan State University, East Lansing; Bartolo, Genevive [U.S. Department of Energy, Joint Genome Institute; Hauser, Loren John [ORNL; Land, Miriam L [ORNL; Bakermans, Corien [Michigan State University, East Lansing; Rodrigues, Debora [Michigan State University, East Lansing; Klappenbach, Joel [Michigan State University, East Lansing; Zarka, Dan [Michigan State University, East Lansing; Larimer, Frank W [ORNL; Richardson, P M [U.S. Department of Energy, Joint Genome Institute; Murray, Alison [Desert Research Institute, Reno, NV; Thomashow, Michael [Michigan State University, East Lansing; Tiedje, James M. [Michigan State University, East Lansing

    2010-01-01

    Psychrobacter arcticus strain 273-4, which grows at temperatures as low as -10 degrees C, is the first cold-adapted bacterium from a terrestrial environment whose genome was sequenced. Analysis of the 2.65-Mb genome suggested that some of the strategies employed by P. arcticus 273-4 for survival under cold and stress conditions are changes in membrane composition, synthesis of cold shock proteins, and the use of acetate as an energy source. Comparative genome analysis indicated that in a significant portion of the P. arcticus proteome there is reduced use of the acidic amino acids and proline and arginine, which is consistent with increased protein flexibility at low temperatures. Differential amino acid usage occurred in all gene categories, but it was more common in gene categories essential for cell growth and reproduction, suggesting that P. arcticus evolved to grow at low temperatures. Amino acid adaptations and the gene content likely evolved in response to the long-term freezing temperatures (-10 degrees C to -12 degrees C) of the Kolyma (Siberia) permafrost soil from which this strain was isolated. Intracellular water likely does not freeze at these in situ temperatures, which allows P. arcticus to live at subzero temperatures.
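
    The amino acid usage comparison described above can be illustrated with a small calculation. The following is only a minimal sketch, not the authors' method (the record does not describe their pipeline): the function name `residue_fraction` and the example peptide are hypothetical, and the default residue set simply collects the acidic amino acids (D, E), proline (P), and arginine (R) named in the abstract.

```python
from collections import Counter


def residue_fraction(protein, residues="DEPR"):
    """Fraction of a protein made up of the given one-letter residues.

    Hypothetical helper: the default set is D and E (acidic), P (proline),
    and R (arginine), the residues the abstract reports as less used.
    """
    seq = protein.upper()
    if not seq:
        raise ValueError("protein sequence must not be empty")
    counts = Counter(seq)
    # Counter returns 0 for residues that never occur in the sequence.
    return sum(counts[r] for r in residues) / len(seq)


# Illustrative peptide only; it is not taken from the P. arcticus proteome.
example_fraction = residue_fraction("MDEPRK")
```

    Comparing such fractions between a cold-adapted proteome and a mesophilic reference, gene category by gene category, is one simple way to look for the kind of differential usage the abstract reports.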

  13. The whole genome sequence of the Mediterranean fruit fly, Ceratitis capitata (Wiedemann), reveals insights into the biology and adaptive evolution of a highly invasive pest species.

    Science.gov (United States)

    Papanicolaou, Alexie; Schetelig, Marc F; Arensburger, Peter; Atkinson, Peter W; Benoit, Joshua B; Bourtzis, Kostas; Castañera, Pedro; Cavanaugh, John P; Chao, Hsu; Childers, Christopher; Curril, Ingrid; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dolan, Amanda; Dugan, Shannon; Friedrich, Markus; Gasperi, Giuliano; Geib, Scott; Georgakilas, Georgios; Gibbs, Richard A; Giers, Sarah D; Gomulski, Ludvik M; González-Guzmán, Miguel; Guillem-Amat, Ana; Han, Yi; Hatzigeorgiou, Artemis G; Hernández-Crespo, Pedro; Hughes, Daniel S T; Jones, Jeffery W; Karagkouni, Dimitra; Koskinioti, Panagiota; Lee, Sandra L; Malacrida, Anna R; Manni, Mosè; Mathiopoulos, Kostas; Meccariello, Angela; Murali, Shwetha C; Murphy, Terence D; Muzny, Donna M; Oberhofer, Georg; Ortego, Félix; Paraskevopoulou, Maria D; Poelchau, Monica; Qu, Jiaxin; Reczko, Martin; Robertson, Hugh M; Rosendale, Andrew J; Rosselot, Andrew E; Saccone, Giuseppe; Salvemini, Marco; Savini, Grazia; Schreiner, Patrick; Scolari, Francesca; Siciliano, Paolo; Sim, Sheina B; Tsiamis, George; Ureña, Enric; Vlachos, Ioannis S; Werren, John H; Wimmer, Ernst A; Worley, Kim C; Zacharopoulou, Antigone; Richards, Stephen; Handler, Alfred M

    2016-09-22

    The Mediterranean fruit fly (medfly), Ceratitis capitata, is a major destructive insect pest due to its broad host range, which includes hundreds of fruits and vegetables. It exhibits a unique ability to invade and adapt to ecological niches throughout tropical and subtropical regions of the world, though medfly infestations have been prevented and controlled by the sterile insect technique (SIT) as part of integrated pest management programs (IPMs). The genetic analysis and manipulation of medfly has been subject to intensive study in an effort to improve SIT efficacy and other aspects of IPM control. The 479 Mb medfly genome is sequenced from adult flies from lines inbred for 20 generations. A high-quality assembly is achieved having a contig N50 of 45.7 kb and scaffold N50 of 4.06 Mb. In-depth curation of more than 1800 messenger RNAs shows specific gene expansions that can be related to invasiveness and host adaptation, including gene families for chemoreception, toxin and insecticide metabolism, cuticle proteins, opsins, and aquaporins. We identify genes relevant to IPM control, including those required to improve SIT. The medfly genome sequence provides critical insights into the biology of one of the most serious and widespread agricultural pests. This knowledge should significantly advance the means of controlling the size and invasive potential of medfly populations. Its close relationship to Drosophila, and other insect species important to agriculture and human health, will further comparative functional and structural studies of insect genomes that should broaden our understanding of gene family evolution.
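
    The contig and scaffold N50 values quoted above follow a standard definition: the length L such that sequences of length L or longer together cover at least half of the assembly. A minimal sketch of that calculation is given below; the function name `n50` and the toy lengths are assumptions for illustration and are not taken from the medfly assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths only: the total is 32, and the two longest sequences
# (8 + 8 = 16) already reach half of it, so the N50 is 8.
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

    Because N50 depends only on the length distribution, a high value indicates contiguity but says nothing by itself about assembly correctness.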

  14. The Genomic Scrapheap Challenge; Extracting Relevant Data from Unmapped Whole Genome Sequencing Reads, Including Strain Specific Genomic Segments, in Rats.

    Science.gov (United States)

    van der Weide, Robin H; Simonis, Marieke; Hermsen, Roel; Toonen, Pim; Cuppen, Edwin; de Ligt, Joep

    2016-01-01

    Unmapped next-generation sequencing reads are typically ignored even though they contain biologically relevant information. We systematically analyzed unmapped reads from whole genome sequencing of 33 inbred rat strains. High quality reads were selected and enriched for biologically relevant sequences; similarity-based analysis revealed clustering similar to previously reported phylogenetic trees. Our results demonstrate that on average 20% of all unmapped reads harbor sequences that can be used to improve reference genomes and generate hypotheses on potential genotype-phenotype relationships. Analysis pipelines would benefit from incorporating the described methods and reference genomes would benefit from inclusion of the genomic segments obtained through these efforts.

  15. Identification of ancient remains through genomic sequencing

    Science.gov (United States)

    Blow, Matthew J.; Zhang, Tao; Woyke, Tanja; Speller, Camilla F.; Krivoshapkin, Andrei; Yang, Dongya Y.; Derevianko, Anatoly; Rubin, Edward M.

    2008-01-01

    Studies of ancient DNA have been hindered by the preciousness of remains, the small quantities of undamaged DNA accessible, and the limitations associated with conventional PCR amplification. In these studies, we developed and applied a genomewide adapter-mediated emulsion PCR amplification protocol for ancient mammalian samples estimated to be between 45,000 and 69,000 yr old. Using 454 Life Sciences (Roche) and Illumina sequencing (formerly Solexa sequencing) technologies, we examined over 100 megabases of DNA from amplified extracts, revealing unbiased sequence coverage with substantial amounts of nonredundant nuclear sequences from the sample sources and negligible levels of human contamination. We consistently recorded over 500-fold increases, such that nanogram quantities of starting material could be amplified to microgram quantities. Application of our protocol to a 50,000-yr-old uncharacterized bone sample that was unsuccessful in mitochondrial PCR provided sufficient nuclear sequences for comparison with extant mammals and subsequent phylogenetic classification of the remains. The combined use of emulsion PCR amplification and high-throughput sequencing allows for the generation of large quantities of DNA sequence data from ancient remains. Using such techniques, even small amounts of ancient remains with low levels of endogenous DNA preservation may yield substantial quantities of nuclear DNA, enabling novel applications of ancient DNA genomics to the investigation of extinct phyla. PMID:18426903

  16. Microbial genomics: from sequence to function.

    OpenAIRE

    Schwartz, I

    2000-01-01

    The era of genomics (the study of genes and their function) began a scant dozen years ago with a suggestion by James Watson that the complete DNA sequence of the human genome be determined. Since that time, the human genome project has attracted a great deal of attention in the scientific world and the general media; the scope of the sequencing effort, and the extraordinary value that it will provide, has served to mask the enormous progress in sequencing other genomes. Microbial genome seque...

  17. Genome sequence analysis of five Canadian isolates of strawberry mottle virus reveals extensive intra-species diversity and a longer RNA2 with increased coding capacity compared to a previously characterized European isolate.

    Science.gov (United States)

    Bhagwat, Basdeo; Dickison, Virginia; Ding, Xinlun; Walker, Melanie; Bernardy, Michael; Bouthillier, Michel; Creelman, Alexa; DeYoung, Robyn; Li, Yinzi; Nie, Xianzhou; Wang, Aiming; Xiang, Yu; Sanfaçon, Hélène

    2016-06-01

    In this study, we report the genome sequence of five isolates of strawberry mottle virus (family Secoviridae, order Picornavirales) from strawberry field samples with decline symptoms collected in Eastern Canada. The Canadian isolates differed from the previously characterized European isolate 1134 in that they had a longer RNA2, resulting in a 239-amino-acid extension of the C-terminal region of the polyprotein. Sequence analysis suggests that reassortment and recombination occurred among the isolates. Phylogenetic analysis revealed that the Canadian isolates are diverse, grouping in two separate branches along with isolates from Europe and the Americas.

  18. Differentiation of Indica-Japonica rice revealed by insertion/deletion (InDel) fragments obtained from the comparative genomic study of DNA sequences between 93-11 (Indica) and Nipponbare (Japonica)

    Institute of Scientific and Technical Information of China (English)

    CAI Xingxing; LIU Jing; QIU Yinqiu; ZHAO Wei; SONG Zhiping; LU Baorong

    2007-01-01

    DNA polymorphisms from nucleotide insertion/deletions (InDels) in genomic sequences are the basis for developing InDel molecular markers. To validate the InDel primer pairs on the basis of the comparative genomic study on DNA sequences between an Indica rice 93-11 and a Japonica rice Nipponbare for identifying Indica and Japonica rice varieties and studying wild Oryza species, we studied 49 Indica, 43 Japonica, and 24 wild rice accessions collected from ten Asian countries using 45 InDel primer pairs. Results indicated that of the 45 InDel primer pairs, 41 can accurately identify Indica and Japonica rice varieties with a reliability of over 80%. The scatter plotting data of the principal component analysis (PCA) indicated that: (i) the InDel primer pairs can easily distinguish Indica from Japonica rice varieties, in addition to revealing their genetic differentiation; (ii) the AA-genome wild rice species showed a relatively close genetic relationship with the Indica rice varieties; and (iii) the non-AA genome wild rice species did not show evident differentiation into the Indica and Japonica types. It is concluded from the study that most of the InDel primer pairs obtained from DNA sequences of 93-11 and Nipponbare can be used for identifying Indica and Japonica rice varieties, and for studying genetic relationships of wild rice species, particularly in terms of the Indica-Japonica differentiation.

  19. Complete genome sequence of Klebsiella pneumoniae phage JD001.

    Science.gov (United States)

    Cui, Zelin; Shen, Wenbin; Wang, Zheng; Zhang, Haotian; Me, Rao; Wang, Yanchun; Zeng, Lingbin; Zhu, Yongzhang; Qin, Jinhong; He, Ping; Guo, Xiaokui

    2012-12-01

    Klebsiella pneumoniae is a member of the family Enterobacteriaceae, opportunistic pathogens that are among the eight most prevalent infectious agents in hospitals. The emergence of multidrug-resistant strains of K. pneumoniae has become a public health problem globally. To develop an effective antimicrobial agent, we isolated a bacteriophage, named JD001, from seawater and sequenced its genome. Comparative genome analysis of phage JD001 with other K. pneumoniae bacteriophages revealed that phage JD001 has little similarity to previously published K. pneumoniae phages KP15, KP32, KP34, and phiKO2. Here we announce the complete genome sequence of JD001 and report major findings from the genomic analysis.

  20. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    OpenAIRE

    Henrique Machado; Lone Gram

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationship...

  1. Complete Genome Sequence of Phytopathogenic Pectobacterium atrosepticum Bacteriophage Peat1.

    Science.gov (United States)

    Kalischuk, Melanie; Hachey, John; Kawchuk, Lawrence

    2015-08-13

    Pectobacterium atrosepticum is a common phytopathogen causing significant economic losses worldwide. To develop a biocontrol strategy for this blackleg pathogen of solanaceous plants, P. atrosepticum bacteriophage Peat1 was isolated and its genome completely sequenced. Interestingly, morphological and sequence analyses of the 45,633-bp genome revealed that phage Peat1 is a member of the family Podoviridae and most closely resembles the Klebsiella pneumoniae bacteriophage KP34. This is the first published complete genome sequence of a phytopathogenic P. atrosepticum bacteriophage, and details provide important information for the development of biocontrol by advancing our understanding of phage-phytopathogen interactions.

  2. Genomic Investigation Reveals Highly Conserved, Mosaic, Recombination Events Associated with Capsular Switching among Invasive Neisseria meningitidis Serogroup W Sequence Type (ST)-11 Strains.

    Science.gov (United States)

    Mustapha, Mustapha M; Marsh, Jane W; Krauland, Mary G; Fernandez, Jorge O; de Lemos, Ana Paula S; Dunning Hotopp, Julie C; Wang, Xin; Mayer, Leonard W; Lawrence, Jeffrey G; Hiller, N Luisa; Harrison, Lee H

    2016-07-03

    Neisseria meningitidis is an important cause of meningococcal disease globally. Sequence type (ST)-11 clonal complex (cc11) is a hypervirulent meningococcal lineage historically associated with serogroup C capsule and is believed to have acquired the W capsule through a C to W capsular switching event. We studied the sequence of the capsule gene cluster (cps) and adjoining genomic regions of 524 invasive W cc11 strains isolated globally. We identified recombination breakpoints corresponding to two distinct recombination events within W cc11. An 8.4-kb recombinant region likely acquired from W cc22, including the sialic acid/glycosyl-transferase gene csw, resulted in a C→W change in capsular phenotype. A 13.7-kb recombinant segment likely acquired from the Y cc23 lineage includes 4.5 kb of cps genes and 8.2 kb downstream of the cps cluster, resulting in allelic changes in capsule translocation genes. A vast majority of W cc11 strains (497/524, 94.8%) retain both recombination events, as evidenced by sharing identical or very closely related capsular allelic profiles. These data suggest that the W cc11 capsular switch involved two separate recombination events and that current global W cc11 meningococcal disease is caused by strains bearing this mosaic capsular switch.

  3. Genome Sequence of Lactobacillus rhamnosus ATCC 8530

    OpenAIRE

    Pittet, Vanessa; Ewen, Emily; Bushell, Barry R.; Ziola, Barry

    2012-01-01

    Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences.

  4. Phylogenetic and genome-wide deep-sequencing analyses of canine parvovirus reveal co-infection with field variants and emergence of a recent recombinant strain.

    Science.gov (United States)

    Pérez, Ruben; Calleros, Lucía; Marandino, Ana; Sarute, Nicolás; Iraola, Gregorio; Grecco, Sofia; Blanc, Hervé; Vignuzzi, Marco; Isakov, Ofer; Shomron, Noam; Carrau, Lucía; Hernández, Martín; Francia, Lourdes; Sosa, Katia; Tomás, Gonzalo; Panzera, Yanina

    2014-01-01

    Canine parvovirus (CPV), a fast-evolving single-stranded DNA virus, comprises three antigenic variants (2a, 2b, and 2c) with different frequencies and genetic variability among countries. The contribution of co-infection and recombination to the genetic variability of CPV is far from being fully elucidated. Here we took advantage of a natural CPV population, recently formed by the convergence of divergent CPV-2c and CPV-2a strains, to study co-infection and recombination. Complete sequences of the viral coding region of CPV-2a and CPV-2c strains from 40 samples were generated and analyzed using phylogenetic tools. Two samples showed co-infection and were further analyzed by deep sequencing. The sequence profile of one of the samples revealed the presence of CPV-2c and CPV-2a strains that differed at 29 nucleotides. The other sample included a minor CPV-2a strain (13.3% of the viral population) and a major recombinant strain (86.7%). The recombinant strain arose from inter-genotypic recombination between CPV-2c and CPV-2a strains within the VP1/VP2 gene boundary. Our findings highlight the importance of deep-sequencing analysis to provide a better understanding of CPV molecular diversity.

  5. Phylogenetic and genome-wide deep-sequencing analyses of canine parvovirus reveal co-infection with field variants and emergence of a recent recombinant strain.

    Directory of Open Access Journals (Sweden)

    Ruben Pérez

    Full Text Available Canine parvovirus (CPV), a fast-evolving single-stranded DNA virus, comprises three antigenic variants (2a, 2b, and 2c) with different frequencies and genetic variability among countries. The contribution of co-infection and recombination to the genetic variability of CPV is far from being fully elucidated. Here we took advantage of a natural CPV population, recently formed by the convergence of divergent CPV-2c and CPV-2a strains, to study co-infection and recombination. Complete sequences of the viral coding region of CPV-2a and CPV-2c strains from 40 samples were generated and analyzed using phylogenetic tools. Two samples showed co-infection and were further analyzed by deep sequencing. The sequence profile of one of the samples revealed the presence of CPV-2c and CPV-2a strains that differed at 29 nucleotides. The other sample included a minor CPV-2a strain (13.3% of the viral population) and a major recombinant strain (86.7%). The recombinant strain arose from inter-genotypic recombination between CPV-2c and CPV-2a strains within the VP1/VP2 gene boundary. Our findings highlight the importance of deep-sequencing analysis to provide a better understanding of CPV molecular diversity.

  6. Targeted capture sequencing in whitebark pine reveals range-wide demographic and adaptive patterns despite challenges of a large, repetitive genome

    Directory of Open Access Journals (Sweden)

    John Syring

    2016-04-01

    Full Text Available Whitebark pine (Pinus albicaulis) inhabits an expansive range in western North America, and it is a keystone species of subalpine environments. Whitebark is susceptible to multiple threats – climate change, white pine blister rust, mountain pine beetle, and fire exclusion – and it is suffering significant mortality range-wide, prompting the tree to be listed as ‘globally endangered’ by the International Union for Conservation of Nature (IUCN) and ‘endangered’ by the Canadian government. Conservation collections (in situ and ex situ) are being initiated to preserve the genetic legacy of the species. Reliable, transferrable, and highly variable genetic markers are essential for quantifying the genetic profiles of seed collections relative to natural stands, and ensuring the completeness of conservation collections. We evaluated the use of hybridization-based target capture to enrich specific genomic regions from the 30+ GB genome of whitebark pine, and to evaluate genetic variation across loci, trees, and geography. Probes were designed to capture 7,849 distinct genes, and screening was performed on 48 trees. Despite the inclusion of repetitive elements in the probe pool, the resulting dataset provided information on 4,452 genes and 32% of targeted positions (528,873 bp), and we were able to identify 12,390 segregating sites from 47 trees. Variations reveal strong geographic trends in heterozygosity and allelic richness, with trees from the southern Cascade and Sierra Range showing the greatest distinctiveness and differentiation. Our results show that even under non-optimal conditions (low enrichment efficiency; inclusion of repetitive elements in baits), targeted enrichment produces high quality, codominant genotypes from large genomes. The resulting data can be readily integrated into management and gene conservation activities for whitebark pine, and have the potential to be applied to other members of 5-needle pine group (Pinus subsect

  7. Draft genome sequence of Thermincola potens strain JR

    Energy Technology Data Exchange (ETDEWEB)

    Byrne-Bailey, K.G.; Wrighton, K.C.; Melnyk, R.A.; Agbo, P.; Hazen, T.C.; Coates, J.D.

    2010-07-01

    'Thermincola potens' strain JR is one of the first Gram-positive dissimilatory metal-reducing bacteria (DMRB) for which there is a complete genome sequence. Consistent with the physiology of this organism, preliminary annotation revealed an abundance of multiheme c-type cytochromes that are putatively associated with the periplasm and cell surface in a Gram-positive bacterium. Here we report the complete genome sequence of strain JR.

  8. Complete Genome Sequence of Phytopathogenic Pectobacterium atrosepticum Bacteriophage Peat1

    OpenAIRE

    Kalischuk, Melanie; Hachey, John; Kawchuk, Lawrence

    2015-01-01

    Pectobacterium atrosepticum is a common phytopathogen causing significant economic losses worldwide. To develop a biocontrol strategy for this blackleg pathogen of solanaceous plants, P. atrosepticum bacteriophage Peat1 was isolated and its genome completely sequenced. Interestingly, morphological and sequence analyses of the 45,633-bp genome revealed that phage Peat1 is a member of the family Podoviridae and most closely resembles the Klebsiella pneumoniae bacteriophage KP34. This is the fir...

  9. Human Genome Sequencing in Health and Disease

    Science.gov (United States)

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  10. The genome sequence of parrot bornavirus 5.

    Science.gov (United States)

    Guo, Jianhua; Tizard, Ian

    2015-12-01

    Although several new avian bornaviruses have recently been described, information on their evolution, virulence, and sequence is often limited. Here we report the complete genome sequence of parrot bornavirus 5 (PaBV-5) isolated from a case of proventricular dilatation disease in a Palm cockatoo (Probosciger aterrimus). The complete genome consists of 8842 nucleotides with distinct 5' and 3' end sequences. This virus shares nucleotide sequence identities of 69-74% with other bornaviruses in the genomic regions excluding the 5' and 3' terminal sequences. Phylogenetic analysis based on the genomic regions demonstrated that this new isolate forms an isolated branch within the clade that includes the aquatic bird bornaviruses and the passerine bornaviruses. Based on phylogenetic analyses and its low nucleotide sequence identities with other bornaviruses, we support the proposal that PaBV-5 be assigned to a new bornavirus species: Psittaciform 2 bornavirus.

  11. Evolution of extensively drug-resistant tuberculosis over four decades revealed by whole genome sequencing of Mycobacterium tuberculosis from KwaZulu-Natal, South Africa

    Directory of Open Access Journals (Sweden)

    Keira A Cohen

    2015-01-01

    Full Text Available The largest global outbreak of extensively drug-resistant (XDR) tuberculosis (TB) was identified in Tugela Ferry, KwaZulu-Natal (KZN), South Africa in 2005. The antecedents and timing of the emergence of drug resistance in this fatal epidemic XDR outbreak are unknown, and it is unclear whether drug resistance in this region continues to be driven by clonal spread or by the development of de novo resistance. Whole genome sequencing and drug susceptibility testing (DST) were performed on 337 clinical isolates of Mycobacterium tuberculosis (M.tb) collected in KZN from 2008 to 2013, in addition to three historical isolates, one of which was isolated during the Tugela Ferry outbreak. Using a variety of whole genome comparative approaches, 11 drug-resistant clones of M.tb circulating from 2008 to 2013 were identified, including a 50-member clone of XDR M.tb that was highly related to the Tugela Ferry XDR outbreak strain. It was calculated that the evolutionary trajectory from first-line drug resistance to XDR in this clone spanned more than four decades and began at the start of the antibiotic era. It was also observed that frequent de novo evolution of MDR and XDR was present, with 56 and 9 independent evolutions, respectively. Thus, ongoing amplification of drug-resistance in KwaZulu-Natal is driven by both clonal spread and de novo acquisition of resistance. In drug-resistant TB, isoniazid resistance was overwhelmingly the initial resistance mutation to be acquired, which would not be detected by current rapid molecular diagnostics that assess only rifampicin resistance.

  12. Sequencing and analysis of an Irish human genome.

    LENUS (Irish Health Repository)

    Tong, Pin

    2010-01-01

    Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.

  13. Whole-genome sequencing reveals a link between β-lactam resistance and synthetases of the alarmone (p)ppGpp in Staphylococcus aureus.

    Science.gov (United States)

    Mwangi, Michael M; Kim, Choonkeun; Chung, Marilyn; Tsai, Jennifer; Vijayadamodar, Govindan; Benitez, Michelle; Jarvie, Thomas P; Du, Lei; Tomasz, Alexander

    2013-06-01

    The overwhelming majority of methicillin-resistant Staphylococcus aureus (MRSA) clinical isolates exhibit a peculiar heterogeneous resistance to β-lactam antibiotics: in cultures of such strains, the majority of cells display only a low level of methicillin resistance, often close to the MIC breakpoint of susceptible strains. Yet, in the same cultures, subpopulations of bacteria exhibiting very high levels of resistance are also present with variable frequencies, which are characteristic of the particular MRSA lineage. The mechanism of heterogeneous resistance is not understood. We describe here an experimental system for exploring the mechanism of heterogeneous resistance. Copies of the resistance gene mecA cloned into a temperature-sensitive plasmid were introduced into the fully sequenced methicillin-susceptible clinical isolate S. aureus strain 476. Transductants of strain 476 expressed methicillin resistance in a heterogeneous fashion: the great majority of cells showed only low MIC (0.75 μg/ml) for the antibiotic, but a minority population of highly resistant bacteria (MIC >300 μg/ml) was also present with a frequency of ∼10^-4. The genetic backgrounds of the majority and minority cells were compared by whole-genome sequencing: the only differences detectable were two point mutations in relA of the highly resistant minority population of bacteria. The relA gene codes for the synthesis of (p)ppGpp, an effector of the stringent stress response. Titration of (p)ppGpp showed increased amounts of this effector in the highly resistant cells. Involvement of (p)ppGpp synthesis genes may explain some of the perplexing aspects of β-lactam resistance in MRSA, since many environmental and genetic changes can modulate cellular levels of (p)ppGpp.

  14. The genome of Tetranychus urticae reveals herbivorous pest adaptations

    Science.gov (United States)

    Grbić, Miodrag; Van Leeuwen, Thomas; Clark, Richard M.; Rombauts, Stephane; Rouzé, Pierre; Grbić, Vojislava; Osborne, Edward J.; Dermauw, Wannes; Ngoc, Phuong Cao Thi; Ortego, Félix; Hernández-Crespo, Pedro; Diaz, Isabel; Martinez, Manuel; Navajas, Maria; Sucena, Élio; Magalhães, Sara; Nagy, Lisa; Pace, Ryan M.; Djuranović, Sergej; Smagghe, Guy; Iga, Masatoshi; Christiaens, Olivier; Veenstra, Jan A.; Ewer, John; Villalobos, Rodrigo Mancilla; Hutter, Jeffrey L.; Hudson, Stephen D.; Velez, Marisela; Yi, Soojin V.; Zeng, Jia; Pires-daSilva, Andre; Roch, Fernando; Cazaux, Marc; Navarro, Marie; Zhurov, Vladimir; Acevedo, Gustavo; Bjelica, Anica; Fawcett, Jeffrey A.; Bonnet, Eric; Martens, Cindy; Baele, Guy; Wissler, Lothar; Sanchez-Rodriguez, Aminael; Tirry, Luc; Blais, Catherine; Demeestere, Kristof; Henz, Stefan R.; Gregory, T. Ryan; Mathieu, Johannes; Verdon, Lou; Farinelli, Laurent; Schmutz, Jeremy; Lindquist, Erika; Feyereisen, René; Van de Peer, Yves

    2016-01-01

    The spider mite Tetranychus urticae is a cosmopolitan agricultural pest with an extensive host plant range and an extreme record of pesticide resistance. Here we present the completely sequenced and annotated spider mite genome, representing the first complete chelicerate genome. At 90 megabases T. urticae has the smallest sequenced arthropod genome. Compared with other arthropods, the spider mite genome shows unique changes in the hormonal environment and organization of the Hox complex, and also reveals evolutionary innovation of silk production. We find strong signatures of polyphagy and detoxification in gene families associated with feeding on different hosts and in new gene families acquired by lateral gene transfer. Deep transcriptome analysis of mites feeding on different plants shows how this pest responds to a changing host environment. The T. urticae genome thus offers new insights into arthropod evolution and plant–herbivore interactions, and provides unique opportunities for developing novel plant protection strategies. PMID:22113690

  15. Genes but not genomes reveal bacterial domestication of Lactococcus lactis.

    Directory of Open Access Journals (Sweden)

    Delphine Passerini

    Full Text Available BACKGROUND: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST) scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE). METHODOLOGY/PRINCIPAL FINDINGS: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content) did not correlate to the MLST-based phylogeny, with strains from the same sequence type (ST) differing by up to 230 kb in genome size. CONCLUSION/SIGNIFICANCE: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between "environmental" strains, the main contributors to the genetic diversity within the subspecies, and "domesticated" strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the "domesticated" strains essentially arose through substantial genomic flux within the dispensable

  16. Complete Genome Sequencing of Influenza A Viruses within Swine Farrow-to-Wean Farms Reveals the Emergence, Persistence, and Subsidence of Diverse Viral Genotypes.

    Science.gov (United States)

    Diaz, Andres; Marthaler, Douglas; Culhane, Marie; Sreevatsan, Srinand; Alkhamis, Moh; Torremorell, Montserrat

    2017-09-15

    Influenza A viruses (IAVs) are endemic in swine and represent a public health risk. However, there is limited information on the genetic diversity of swine IAVs within farrow-to-wean farms, which is where most pigs are born. In this longitudinal study, we sampled 5 farrow-to-wean farms for a year and collected 4,190 individual nasal swabs from three distinct pig subpopulations. Of these, 207 (4.9%) samples tested PCR positive for IAV, and 124 IAVs were isolated. We sequenced the complete genomes of 123 IAV isolates and found 31 H1N1, 26 H1N2, 63 H3N2, and 3 mixed IAVs. Based on the IAV hemagglutinin, seven different influenza A viral groups (VGs) were identified. Most of the remaining IAV gene segments allowed us to differentiate the same VGs, although an additional viral group was identified for gene segment 3 (PA). Moreover, the codetection of more than one IAV VG was documented at different levels (farm, subpopulation, and individual pigs), highlighting the environment for potential IAV reassortment. Additionally, 3 out of 5 farms contained IAV isolates (n = 5) with gene segments from more than one VG, and 79% of all the IAVs sequenced contained a signature mutation (S31N) in the matrix gene that has been associated with resistance to the antiviral amantadine. Within farms, some IAVs were detected only once, while others were detected for 283 days. Our results illustrate the maintenance and subsidence of different IAVs within swine farrow-to-wean farms over time, demonstrating that pig subpopulation dynamics are important to better understand the diversity and epidemiology of swine IAVs. IMPORTANCE: On a global scale, swine are one of the main reservoir species for influenza A viruses (IAVs) and play a key role in the transmission of IAVs between species. Additionally, the 2009 IAV pandemic highlighted the role of pigs in the emergence of IAVs with pandemic potential. However, limited information is available regarding the diversity and distribution of swine IAVs

  17. Strategies for complete plastid genome sequencing.

    Science.gov (United States)

    Twyford, Alex D; Ness, Rob W

    2016-10-28

    Plastid sequencing is an essential tool in the study of plant evolution. This high-copy organelle is one of the most technically accessible regions of the genome, and its sequence conservation makes it a valuable region for comparative genome evolution, phylogenetic analysis and population studies. Here, we discuss recent innovations and approaches for de novo plastid assembly that harness genomic tools. We focus on technical developments including low-cost sequence library preparation approaches for genome skimming, enrichment via hybrid baits and methylation-sensitive capture, sequence platforms with higher read outputs and longer read lengths, and automated tools for assembly. These developments allow for a much more streamlined assembly than via conventional short-range PCR. Although newer methods make complete plastid sequencing possible for any land plant or green alga, there are still challenges for producing finished plastomes particularly from herbarium material or from structurally divergent plastids such as those of parasitic plants.

  18. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Directory of Open Access Journals (Sweden)

    Roger Barthelson

    Full Text Available BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly

  19. Perspectives of Integrative Cancer Genomics in Next Generation Sequencing Era

    Directory of Open Access Journals (Sweden)

    So Mee Kwon

    2012-06-01

    Full Text Available The explosive development of genomics technologies, including microarrays and next generation sequencing (NGS), has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of genetic aberrations could reveal candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish a large cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of structured integration, sharing, and interpretation of big omics data remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on study design and analysis strategies. This might be helpful for understanding the current trends and strategies of the rapidly evolving field of cancer genomics research.

  20. Complete genomic sequence analyses of the first group A giraffe rotavirus reveals close evolutionary relationship with rotaviruses infecting other members of the Artiodactyla.

    Science.gov (United States)

    O'Shea, Helen; Mulherin, Emily; Matthijnssens, Jelle; McCusker, Matthew P; Collins, P J; Cashman, Olivia; Gunn, Lynda; Beltman, Marijke E; Fanning, Séamus

    2014-05-14

    Group A Rotaviruses (RVA) have been established as significant contributory agents of acute gastroenteritis in young children and many animal species. In 2008, we described the first RVA strain detected in a giraffe calf (RVA/Giraffe-wt/IRL/GirRV/2008/G10P[11]), presenting with acute diarrhoea. Molecular characterisation of the VP7 and VP4 genes revealed the bovine-like genotypes G10 and P[11], respectively. To further investigate the origin of this giraffe RVA strain, the 9 remaining gene segments were sequenced and analysed, revealing the following genotype constellation: G10-P[11]-I2-R2-C2-M2-A3-N2-T6-E2-H3. This genotype constellation is very similar to RVA strains isolated from cattle or other members of the artiodactyls. Phylogenetic analyses confirmed the close relationship between GirRV and RVA strains with a bovine-like genotype constellation detected from several host species, including humans. These results suggest that RVA strain GirRV was the result of an interspecies transmission from a bovine host to the giraffe calf. However, we cannot rule out completely that this bovine-like RVA genotype constellation may be enzootic in giraffes. Future RVA surveillance in giraffes may answer this intriguing question.

  1. Microbial species delineation using whole genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Mukherjee, Supratim; Ivanova, Natalia; Mavrommatics, Kostas; Pati, Amrita; Konstantinidis, Konstantinos

    2014-10-20

    Species assignments in prokaryotes use a manual, poly-phasic approach utilizing both phenotypic traits and sequence information of phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform and scalable approach exploiting the rich genomic information in whole genome sequences is desired, at least for the initial assignment of species to an organism. We have evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, identifying robust and widely applicable hard cut-offs for species assignments based on AF and gANI. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species mis-assignments and facilitated the assignment of species for organisms without species definitions.

  2. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    DEFF Research Database (Denmark)

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls...

  3. Genomic prediction using QTL derived from whole genome sequence data

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Janss, Luc

    This study investigated the gain in accuracy of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k SNP data. Analyses were performed for Nordic Holstein and Danish Jersey animals, using eithe...

  4. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    Energy Technology Data Exchange (ETDEWEB)

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  5. Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria

    DEFF Research Database (Denmark)

    Larsen, Mette Voldby; Cosentino, Salvatore; Rasmussen, Simon

    2012-01-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS...... the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56...... MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types...
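
    The final step described above (the sequence type follows from the combination of alleles identified) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the locus names, the three-locus scheme, the profile table, and the ranking rule (aligned fraction first, then percent identity). The abstract says only that a BLAST-based ranking method is used and does not give its details.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical three-locus scheme and profile table (illustrative, not real MLST data).
LOCI = ("locA", "locB", "locC")
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 5): 7,
}

# A hit is (allele_number, percent_identity, aligned_fraction_of_allele).
Hit = Tuple[int, float, float]


def best_allele(hits: List[Hit]) -> Optional[int]:
    """Return the top-ranked allele for one locus, or None if there are no hits.

    Assumed ranking: prefer hits covering more of the allele, then higher identity.
    """
    if not hits:
        return None
    ranked = sorted(hits, key=lambda h: (h[2], h[1]), reverse=True)
    return ranked[0][0]


def sequence_type(hits_by_locus: Dict[str, List[Hit]]) -> Optional[int]:
    """Combine the best allele per locus and look the profile up in the table."""
    alleles = []
    for locus in LOCI:
        allele = best_allele(hits_by_locus.get(locus, []))
        if allele is None:
            return None  # a locus without any hit cannot be typed
        alleles.append(allele)
    return PROFILES.get(tuple(alleles))  # None for an unknown allele combination


example_hits = {
    "locA": [(1, 99.0, 1.0), (3, 100.0, 0.8)],
    "locB": [(2, 100.0, 1.0)],
    "locC": [(1, 100.0, 1.0)],
}
print(sequence_type(example_hits))
```

    Returning None for a missing locus or an unlisted allele combination keeps "not typeable" distinct from any real sequence type, which seems the safer default in a sketch like this.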

  6. The characterization of twenty sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Kimberly Pelak

    2010-09-01

    Full Text Available We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways.

  7. Integrated genomics of Mucorales reveals novel therapeutic targets

    Science.gov (United States)

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. We sequenced 30 fungal genomes and performed transcriptomics with three representative Rhizopus and Mucor strains with human airway epithelial cells during fungal invasion to reveal key host and fungal determinants contributing ...

  8. Heavy-metal resistance of a France vineyard soil bacterium, Pseudomonas mendocina strain S5.2, revealed by whole-genome sequencing.

    Science.gov (United States)

    Chong, Teik Min; Yin, Wai-Fong; Mondy, Samuel; Grandclément, Catherine; Dessaux, Yves; Chan, Kok-Gan

    2012-11-01

    Here we present the draft genome of Pseudomonas mendocina strain S5.2, which tolerates a high concentration of copper. In addition to conferring copper resistance, the genome of P. mendocina strain S5.2 contains a number of heavy-metal resistance genes known to confer resistance to multiple heavy-metal ions.

  9. High-Quality Draft Genome Sequence of Kallotenue papyrolyticum JKG1T Reveals Broad Heterotrophic Capacity Focused on Carbohydrate and Amino Acid Metabolism.

    Science.gov (United States)

    Hedlund, Brian P; Murugapiran, Senthil K; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Stamatis, Dimitrios; Reddy, T B K; Ngan, Chew Yee; Daum, Chris; Duffy, Kecia; Shapiro, Nicole; Markowitz, Victor; Ivanova, Natalia; Kyrpides, Nikos; Williams, Amanda J; Cole, Jessica K; Dodsworth, Jeremy A; Woyke, Tanja

    2015-12-03

    The draft genome of Kallotenue papyrolyticum JKG1(T), a member of the order Kallotenuales, class Chloroflexia, consists of 4,475,263 bp in 4 contigs and encodes 4,010 predicted genes, 49 tRNA-encoding genes, and 3 rRNA operons. The genome is consistent with a heterotrophic lifestyle including catabolism of polysaccharides and amino acids.

  10. Sequencing and comparing whole mitochondrial genomes of animals

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  11. The complete chloroplast genome sequence of Abies nephrolepis (Pinaceae: Abietoideae)

    Directory of Open Access Journals (Sweden)

    Dong-Keun Yi

    2016-06-01

    Full Text Available The plant chloroplast (cp) genome has maintained a relatively conserved structure and gene content throughout evolution. Cp genome sequences have been used widely for resolving evolutionary and phylogenetic issues at various taxonomic levels of plants. Here, we report the complete cp genome of Abies nephrolepis. The A. nephrolepis cp genome is 121,336 base pairs (bp) in length, including a pair of short inverted repeat regions (IRa and IRb) of 139 bp each, separated by a small single copy (SSC) region of 54,323 bp and a large single copy (LSC) region of 66,735 bp. It contains 114 genes: 68 protein-coding genes, 35 tRNA genes, four rRNA genes, six open reading frames, and one pseudogene. Seventeen repeat units and 64 simple sequence repeats (SSRs) have been detected in the A. nephrolepis cp genome. Large IR sequences (1186 bp) are located at the 42-kb inversion points. The A. nephrolepis cp genome is nearly identical to that of Abies koreana, a closely related taxon. Pairwise comparison between the two cp genomes revealed 140 polymorphic sites in each. The complete cp genome sequence of A. nephrolepis has significant potential to provide information on the evolutionary pattern of Abietoideae and valuable data for the development of DNA markers for easy identification and classification.
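
    As a quick consistency check on the figures reported above, the total length of a quadripartite plastome should equal the two single-copy regions plus two copies of the inverted repeat. The short Python sketch below applies that to the reported values; the function name is an illustrative assumption and is not taken from the study.

```python
def quadripartite_length(lsc_bp: int, ssc_bp: int, ir_bp: int) -> int:
    """Total plastome length: LSC + SSC + two inverted repeat copies (IRa, IRb)."""
    return lsc_bp + ssc_bp + 2 * ir_bp


# Region sizes as reported in the abstract for Abies nephrolepis.
total = quadripartite_length(lsc_bp=66_735, ssc_bp=54_323, ir_bp=139)
print(total, total == 121_336)
```

    The region sizes given in the abstract sum to the stated genome length of 121,336 bp, so the reported numbers are internally consistent.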

  12. Comparative Genomics Reveals the Core and Accessory Genomes of Streptomyces Species.

    Science.gov (United States)

    Kim, Ji-Nu; Kim, Yeonbum; Jeong, Yujin; Roe, Jung-Hye; Kim, Byung-Gee; Cho, Byung-Kwan

    2015-10-01

    The development of rapid and efficient genome sequencing methods has enabled us to study the evolutionary background of bacterial genetic information. Here, we present comparative genomic analysis of 17 Streptomyces species, for which the genome has been completely sequenced, using the pan-genome approach. The analysis revealed that 34,592 ortholog clusters constituted the pan-genome of these Streptomyces species, including 2,018 in the core genome, 11,743 in the dispensable genome, and 20,831 in the unique genome. The core genome was converged to a smaller number of genes than reported previously, with 3,096 gene families. Functional enrichment analysis showed that genes involved in transcription were most abundant in the Streptomyces pan-genome. Finally, we investigated core genes for the sigma factors, mycothiol biosynthesis pathway, and secondary metabolism pathways; our data showed that many genes involved in stress response and morphological differentiation were commonly expressed in Streptomyces species. Elucidation of the core genome offers a basis for understanding the functional evolution of Streptomyces species and provides insights into target selection for the construction of industrial strains.
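
    The three pan-genome partitions named above can be sketched in Python under the usual working definitions, which are assumed here rather than taken from the paper: a cluster is "core" if it is present in every genome, "unique" if present in exactly one, and "dispensable" otherwise. The cluster and genome identifiers in the toy input are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def classify_clusters(cluster_to_genomes: Dict[str, Iterable[str]], n_genomes: int) -> Counter:
    """Count ortholog clusters as core, dispensable, or unique by genome presence."""
    counts: Counter = Counter()
    for genomes in cluster_to_genomes.values():
        present_in = len(set(genomes))  # paralogs within one genome count once
        if present_in == n_genomes:
            counts["core"] += 1
        elif present_in == 1:
            counts["unique"] += 1
        else:
            counts["dispensable"] += 1
    return counts


# Toy input with hypothetical cluster and genome identifiers.
toy = {
    "c1": ["g1", "g2", "g3"],
    "c2": ["g1"],
    "c3": ["g1", "g2"],
    "c4": ["g3", "g3"],
}
print(classify_clusters(toy, n_genomes=3))

# The counts reported in the abstract add up to the stated pan-genome size.
print(2_018 + 11_743 + 20_831 == 34_592)
```

    The last line only checks arithmetic: the core, dispensable, and unique cluster counts reported in the abstract sum to the 34,592 clusters of the pan-genome.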

  13. Complete genome sequence of arracacha mottle virus.

    Science.gov (United States)

    Orílio, Anelise F; Lucinda, Natalia; Dusi, André N; Nagata, Tatsuya; Inoue-Nagata, Alice K

    2013-01-01

    Arracacha mottle virus (AMoV) is the only potyvirus reported to infect arracacha (Arracacia xanthorrhiza) in Brazil. Here, the complete genome sequence of an isolate of AMoV was determined to be 9,630 nucleotides in length, excluding the 3' poly-A tail, and encoding a polyprotein of 3,135 amino acids and a putative P3N-PIPO protein. Its genomic organization is typical of a member of the genus Potyvirus, containing all conserved motifs. Its full genome sequence shared 56.2 % nucleotide identity with sunflower chlorotic mottle virus and verbena virus Y, the most closely related viruses.

  14. Sequencing the Cotton Genomes-Gossypium spp.

    Institute of Scientific and Technical Information of China (English)

    PATERSON Andrew H

    2008-01-01

    The genomes of most major crops, including cotton, will be fully sequenced in the next few years. Cotton is unusual, although not unique, in that we will need to sequence not only cultivated (tetraploid) genotypes but their diploid progenitors, to understand how elite cottons have surpassed the productivity and quality of their progenitors.

  15. Whole-genome sequencing for comparative genomics and de novo genome assembly.

    Science.gov (United States)

    Benjak, Andrej; Sala, Claudia; Hartkoorn, Ruben C

    2015-01-01

    Next-generation sequencing technologies for whole-genome sequencing of mycobacteria are rapidly becoming an attractive alternative to more traditional sequencing methods. In particular this technology is proving useful for genome-wide identification of mutations in mycobacteria (comparative genomics) as well as for de novo assembly of whole genomes. Next-generation sequencing however generates a vast quantity of data that can only be transformed into a usable and comprehensible form using bioinformatics. Here we describe the methodology one would use to prepare libraries for whole-genome sequencing, and the basic bioinformatics to identify mutations in a genome following Illumina HiSeq or MiSeq sequencing, as well as de novo genome assembly following sequencing using Pacific Biosciences (PacBio).

  16. Genome Sequence of the Palaeopolyploid soybean

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

    2009-08-03

    Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

  17. Viral genome sequencing by random priming methods

    Directory of Open Access Journals (Sweden)

    Zhang Xinsheng

    2008-01-01

    Full Text Available Abstract Background: Most emerging health threats are of zoonotic origin. For the overwhelming majority, their causative agents are RNA viruses which include but are not limited to HIV, Influenza, SARS, Ebola, Dengue, and Hantavirus. Of increasing importance therefore is a better understanding of global viral diversity to enable better surveillance and prediction of pandemic threats; this will require rapid and flexible methods for complete viral genome sequencing. Results: We have adapted the SISPA methodology [1-3] to genome sequencing of RNA and DNA viruses. We have demonstrated the utility of the method on various types and sources of viruses, obtaining near complete genome sequence of viruses ranging in size from 3,000–15,000 kb with a median depth of coverage of 14.33. We used this technique to generate full viral genome sequence in the presence of host contaminants, using viral preparations from cell culture supernatant, allantoic fluid and fecal matter. Conclusion: The method described is of great utility in generating whole genome assemblies for viruses with little or no available sequence information, viruses from greatly divergent families, previously uncharacterized viruses, or to more fully describe mixed viral infections.

  18. Complete genome sequence of the San Miguel sea lion virus-8 reveals that it is not a member of the vesicular exanthema of swine virus/San Miguel Sea Lion virus species of the Caliciviridae

    Science.gov (United States)

    The complete genome sequence of the San Miguel sea lion virus-8 (SMSV-8) was determined. Comparison of this sequence to other calicivirus sequences in GenBank showed that this virus was genetically distinct from the VESV/SMSV viruses and belonged to a novel clade within the Vesivirus genus....

  19. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    Science.gov (United States)

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  20. Whole genome sequencing of diverse Shiga toxin-producing and non-producing Escherichia coli strains reveals a variety of virulence and novel antibiotic resistance plasmids

    Science.gov (United States)

    The genomes of a diverse set of Shiga toxin-producing E. coli strains and the presence of 38 plasmids among all the isolates were determined. Among the novel plasmids found, there were eight that encoded resistance genes to antibiotics, including aminoglycosides, carbapenems, penicillins, cephalosp...

  1. Analysis of ATP6 sequence diversity in the Triticum-Aegilops group of species reveals the crucial role of rearrangement in mitochondrial genome evolution

    Science.gov (United States)

    Mutation and chromosomal rearrangements are the two main forces of increasing genetic diversity for natural selection to act upon, and ultimately drive the evolutionary process. Although genome evolution is a function of both forces, simultaneously, the ratio of each can be varied among different ge...

  2. Complete Genome Sequence of Porcine Parvovirus 2 Recovered from Swine Sera

    Science.gov (United States)

    Kluge, M.; Franco, A. C.; Giongo, A.; Valdez, F. P.; Saddi, T. M.; Brito, W. M. E. D.; Roehe, P. M.

    2016-01-01

    A complete genomic sequence of porcine parvovirus 2 (PPV-2) was detected by viral metagenome analysis on swine sera. A phylogenetic analysis of this genome reveals that it is highly similar to previously reported North American PPV-2 genomes. The complete PPV-2 sequence is 5,426 nucleotides long. PMID:26823583

  3. The genome sequence of Trypanosoma cruzi, etiologic agent of Chagas disease

    NARCIS (Netherlands)

    El-Sayed, NM; Myler, PJ; Bartholomeu, DC; Nilsson, D; Aggarwal, G; Tran, AN; Ghedin, E; Worthey, EA; Delcher, AL; Blandin, G; Westenberger, SJ; Caler, E; Cerqueira, GC; Branche, C; Haas, B; Anupama, A; Arner, E; Aslund, L; Attipoe, P; Bontempi, E; Bringaud, F; Burton, P; Cadag, E; Campbell, DA; Carrington, M; Crabtree, J; Darban, H; da Silveira, JF; de Jong, P; Edwards, K; Englund, PT; Fazelina, G; Feldblyum, T; Ferella, M; Frasch, AC; Gull, K; Horn, D; Hou, LH; Huang, YT; Kindlund, E; Ktingbeil, M; Kluge, S; Koo, H; Lacerda, D; Levin, MJ; Lorenzi, H; Louie, T; Machado, CR; McCulloch, R; McKenna, A; Mizuno, Y; Mottram, JC; Nelson, S; Ochaya, S; Osoegawa, K; Pai, G; Parsons, M; Pentony, M; Pettersson, U; Pop, M; Ramirez, JL; Rinta, J; Robertson, L; Salzberg, SL; Sanchez, DO; Seyler, A; Sharma, R; Shetty, J; Simpson, AJ; Sisk, E; Tammi, MT; Tarteton, R; Teixeira, S; Van Aken, S; Vogt, C; Ward, PN; Wickstead, B; Wortman, J; White, O; Fraser, CM; Stuart, KD; Andersson, B

    2005-01-01

    Whole-genome sequencing of the protozoan pathogen Trypanosoma cruzi revealed that the diploid genome contains a predicted 22,570 proteins encoded by genes, of which 12,570 represent allelic pairs. Over 50% of the genome consists of repeated sequences, such as retrotransposons and genes for large, fa

  5. Combining two technologies for full genome sequencing of human.

    Science.gov (United States)

    Skryabin, K G; Prokhortchouk, E B; Mazur, A M; Boulygina, E S; Tsygankova, S V; Nedoluzhko, A V; Rastorguev, S M; Matveev, V B; Chekanov, N N; D A, Goranskaya; Teslyuk, A B; Gruzdeva, N M; Velikhov, V E; Zaridze, D G; Kovalchuk, M V

    2009-10-01

    New DNA sequencing technologies are developing rapidly, allowing quick and efficient characterisation of organisms at the level of genome structure. In this study, whole-genome sequencing of a human (a Russian man) was performed using two technologies currently on the market: Sequencing by Oligonucleotide Ligation and Detection (SOLiD™; Applied Biosystems) and sequencing of molecular clusters using fluorescently labeled precursors (Illumina). In total, 108.3 billion base pairs of data were generated (60.2 billion from Illumina technology and 48.1 billion from SOLiD technology). Statistics on the reads generated by GAII and SOLiD showed that they covered 75% and 96% of the genome, respectively. Short polymorphic regions were detected with comparable accuracy; however, the absolute number revealed by SOLiD was several times lower than that revealed by GAII. An optimal algorithm for using the latest sequencing methods was established for the analysis of individual human genomes. The study is the first Russian effort towards whole human genome sequencing.

  6. Sequence analysis of the genome of carnation (Dianthus caryophyllus L.).

    Science.gov (United States)

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-06-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. 'Francesco' was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568,887,315 bp, consisting of 45,088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16,644 bp and 60,737 bp, respectively, and the longest scaffold was 1,287,144 bp. The average GC content of the contig sequences was 36%. A total of 1,050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNAs and miRNAs, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43,266 complete and partial gene structures, excluding those in transposable elements, were deduced. Gene coverage was ∼98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp.
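    The N50 and genome-coverage figures quoted in this abstract follow standard definitions, which can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function name `n50` and the toy scaffold lengths are hypothetical, and only the two base-pair totals are taken from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly (NOT the carnation scaffolds).
toy_scaffolds = [8, 8, 4, 3, 3, 2, 2, 2]
print("toy N50:", n50(toy_scaffolds))

# Share of the estimated genome represented by the assembly,
# using the two totals quoted in the abstract.
assembled_bp = 568_887_315
estimated_genome_bp = 622_000_000
print("percent of estimated genome:", round(100 * assembled_bp / estimated_genome_bp))
```

    Sorting in descending order and stopping at the first length whose running total reaches half of the assembly is the usual way to compute N50; the percentage printed at the end rounds to the 91% reported in the abstract.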

  7. Sequencing and comparative analysis of the gorilla MHC genomic sequence.

    Science.gov (United States)

    Wilming, Laurens G; Hart, Elizabeth A; Coggill, Penny C; Horton, Roger; Gilbert, James G R; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L

    2013-01-01

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC.

  8. Genomic Sequence Variation Markup Language (GSVML).

    Science.gov (United States)

    Nakaya, Jun; Kimura, Michio; Hiroi, Kaei; Ido, Keisuke; Yang, Woosung; Tanaka, Hiroshi

    2010-02-01

    To make good use of internationally accumulated genomic sequence variation data, which is growing rapidly with the current surge in genomic research, an interoperable data exchange format and its international standardization are necessary. Genomic Sequence Variation Markup Language (GSVML) focuses on genomic sequence variation data and human health applications, such as gene-based medicine or pharmacogenomics. We developed GSVML through eight steps, based on case analysis and domain investigations. By restricting the design scope to human health applications and genomic sequence variation, we attempted to eliminate ambiguity and to ensure practicability. We intended to satisfy the requirements derived from the use case analysis of human-based clinical genomic applications. Based on database investigations, we attempted to minimize the redundancy of the data format while maximizing the range of data covered. We also attempted to ensure communication and interface ability with other markup languages, for exchange of omics data among various omics researchers or facilities. The interface ability with developing clinical standards, such as the Health Level Seven Genotype Information model, was analyzed. We developed the human health-oriented GSVML comprising variation data, direct annotation, and indirect annotation categories; the variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information, and have internal relationships. For the design, we examined 6 cases against three criteria for human health applications and 15 data elements against three criteria for data formats for genomic sequence variation data exchange. The data formats of five international SNP databases and six markup languages, and the interface ability to the Health Level Seven Genotype Model in terms of 317 items, were investigated. GSVML was developed as

  9. Sorghum genome sequencing by methylation filtration.

    Directory of Open Access Journals (Sweden)

    Joseph A Bedell

    2005-01-01

    Sorghum bicolor is a close relative of maize and is a staple crop in Africa and much of the developing world because of its superior tolerance of arid growth conditions. We have generated sequence from the hypomethylated portion of the sorghum genome by applying methylation filtration (MF) technology. The evidence suggests that 96% of the genes have been sequence tagged, with an average coverage of 65% across their length. Remarkably, this level of gene discovery was accomplished after generating a raw coverage of less than 300 megabases of the 735-megabase genome. MF preferentially captures exons and introns, promoters, microRNAs, and simple sequence repeats, and minimizes interspersed repeats, thus providing a robust view of the functional parts of the genome. The sorghum MF sequence set is beneficial to research on sorghum and is also a powerful resource for comparative genomics among the grasses and across the entire plant kingdom. Thousands of hypothetical gene predictions in rice and Arabidopsis are supported by the sorghum dataset, and genomic similarities highlight evolutionarily conserved regions that will lead to a better understanding of rice and Arabidopsis.

  10. Sorghum genome sequencing by methylation filtration.

    Science.gov (United States)

    Bedell, Joseph A; Budiman, Muhammad A; Nunberg, Andrew; Citek, Robert W; Robbins, Dan; Jones, Joshua; Flick, Elizabeth; Rholfing, Theresa; Fries, Jason; Bradford, Kourtney; McMenamy, Jennifer; Smith, Michael; Holeman, Heather; Roe, Bruce A; Wiley, Graham; Korf, Ian F; Rabinowicz, Pablo D; Lakey, Nathan; McCombie, W Richard; Jeddeloh, Jeffrey A; Martienssen, Robert A

    2005-01-01

    Sorghum bicolor is a close relative of maize and is a staple crop in Africa and much of the developing world because of its superior tolerance of arid growth conditions. We have generated sequence from the hypomethylated portion of the sorghum genome by applying methylation filtration (MF) technology. The evidence suggests that 96% of the genes have been sequence tagged, with an average coverage of 65% across their length. Remarkably, this level of gene discovery was accomplished after generating a raw coverage of less than 300 megabases of the 735-megabase genome. MF preferentially captures exons and introns, promoters, microRNAs, and simple sequence repeats, and minimizes interspersed repeats, thus providing a robust view of the functional parts of the genome. The sorghum MF sequence set is beneficial to research on sorghum and is also a powerful resource for comparative genomics among the grasses and across the entire plant kingdom. Thousands of hypothetical gene predictions in rice and Arabidopsis are supported by the sorghum dataset, and genomic similarities highlight evolutionarily conserved regions that will lead to a better understanding of rice and Arabidopsis.

  11. Whole genome sequencing of Chinese clearhead icefish, Protosalanx hyalocranius.

    Science.gov (United States)

    Liu, Kai; Xu, Dongpo; Li, Jia; Bian, Chao; Duan, Jinrong; Zhou, Yanfeng; Zhang, Minying; You, Xinxin; You, Yang; Chen, Jieming; Yu, Hui; Xu, Gangchun; Fang, Di-An; Qiang, Jun; Jiang, Shulun; He, Jie; Xu, Junmin; Shi, Qiong; Zhang, Zhiyong; Xu, Pao

    2017-04-01

    Chinese clearhead icefish, Protosalanx hyalocranius, is a representative icefish species of economic importance and distinctive appearance. Due to its great economic value in China, the fish was introduced into Lake Dianchi and several other lakes from Lake Taihu half a century ago. Similar to the Sinocyclocheilus cavefish, the clearhead icefish has certain cavefish-like traits, such as a transparent body and nearly scaleless skin. Here, we provide the whole genome sequence of this surface-dwelling fish and a draft genome assembly, with the aim of exploring the molecular mechanisms underlying these traits. A total of 252.1 Gb of raw reads were sequenced. Subsequently, a novel draft genome assembly was generated, with the scaffold N50 reaching 1.163 Mb. The genome completeness was estimated to be 98.39% using the CEGMA evaluation. Finally, we annotated 19,884 protein-coding genes and observed that repeat sequences account for 24.43% of the genome assembly. We report the first draft genome of the Chinese clearhead icefish. The genome assembly will provide a solid foundation for further molecular breeding and germplasm resource protection in Chinese clearhead icefish, as well as other icefishes. It is also a valuable genetic resource for revealing the molecular mechanisms behind the cavefish-like characters.

  12. De novo assembly of the carrot mitochondrial genome using next generation sequencing of whole genomic DNA provides first evidence of DNA transfer into an angiosperm plastid genome

    Directory of Open Access Journals (Sweden)

    Iorizzo Massimo

    2012-05-01

    Background: Sequence analysis of organelle genomes has revealed important aspects of plant cell evolution. The scope of this study was to develop an approach for de novo assembly of the carrot mitochondrial genome using next generation sequence data from total genomic DNA. Results: Sequencing data from a carrot 454 whole genome library were used to develop a de novo assembly of the mitochondrial genome. Development of a new bioinformatic tool allowed visualizing contig connections and elucidation of the de novo assembly. Southern hybridization demonstrated recombination across two large repeats. Genome annotation allowed identification of 44 protein coding genes, three rRNA and 17 tRNA. Identification of the plastid genome sequence allowed organelle genome comparison. Mitochondrial intergenic sequence analysis allowed detection of a fragment of DNA specific to the carrot plastid genome. PCR amplification and sequence analysis across different Apiaceae species revealed consistent conservation of this fragment in the mitochondrial genomes and an insertion in Daucus plastid genomes, giving evidence of a mitochondrial to plastid transfer of DNA. Sequence similarity with a retrotransposon element suggests a possibility that a transposon-like event transferred this sequence into the plastid genome. Conclusions: This study confirmed that whole genome sequencing is a practical approach for de novo assembly of higher plant mitochondrial genomes. In addition, a new aspect of intercompartmental genome interaction was reported providing the first evidence for DNA transfer into an angiosperm plastid genome. The approach used here could be used more broadly to sequence and assemble mitochondrial genomes of diverse species. This information will allow us to better understand intercompartmental interactions and cell evolution.

  13. Phytophthora Genome Sequences Uncover Evolutionary Origins and Mechanisms of Pathogenesis

    Energy Technology Data Exchange (ETDEWEB)

    Lamour, Kurt H [ORNL; McDonald, W Hayes [ORNL; Savidor, Alon [ORNL

    2006-01-01

    Genome sequences of the soybean pathogen, Phytophthora sojae, and the sudden oak death pathogen, Phytophthora ramorum, suggest a photosynthetic past and reveal recent massive expansion and diversification of potential pathogenicity gene families. Abstract: Draft genome sequences of the soybean pathogen, Phytophthora sojae, and the sudden oak death pathogen, Phytophthora ramorum, have been determined. Oomycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors and, in particular, a superfamily of 700 proteins with similarity to known oomycete avirulence genes.

  14. Draft Genome Sequence of Ustilago trichophora RK089, a Promising Malic Acid Producer

    Science.gov (United States)

    Zambanini, Thiemo; Buescher, Joerg M.; Meurer, Guido; Blank, Lars M.

    2016-01-01

    The basidiomycetous smut fungus Ustilago trichophora RK089 produces malate from glycerol. De novo genome sequencing revealed a 20.7-Mbp genome (301 gap-closed contigs, 246 scaffolds). A comparison to the genome of Ustilago maydis 521 revealed all essential genes for malate production from glycerol contributing to metabolic engineering for improving malate production. PMID:27469969

  15. Cactus: Algorithms for genome multiple sequence alignment

    OpenAIRE

    Paten, Benedict; Earl, Dent; Nguyen, Ngan; Diekhans, Mark; Zerbino, Daniel; Haussler, David

    2011-01-01

    Much attention has been given to the problem of creating reliable multiple sequence alignments in a model incorporating substitutions, insertions, and deletions. Far less attention has been paid to the problem of optimizing alignments in the presence of more general rearrangement and copy number variation. Using Cactus graphs, recently introduced for representing sequence alignments, we describe two complementary algorithms for creating genomic alignments. We have implemented these algorithms...

  16. Biochemical and full genome sequence analyses of clinical Vibrio cholerae isolates in Mexico reveals the presence of novel V. cholerae strains.

    Science.gov (United States)

    Díaz-Quiñonez, José Alberto; Hernández-Monroy, Irma; Montes-Colima, Norma Angélica; Moreno-Pérez, María Asunción; Galicia-Nicolás, Adriana Guadalupe; López-Martínez, Irma; Ruiz-Matus, Cuitláhuac; Kuri-Morales, Pablo; Ortíz-Alcántara, Joanna María; Garcés-Ayala, Fabiola; Ramírez-González, José Ernesto

    2016-05-01

    The first week of September 2013, the National Epidemiological Surveillance System identified two cases of cholera in Mexico City. The cultures of both samples were confirmed as Vibrio cholerae serogroup O1, serotype Ogawa, biotype El Tor. Initial analyses by PFGE and by PCR-amplification of the virulence genes, suggested that both strains were similar, but different from those previously reported in Mexico. The following week, four more cases were identified in a community in the state of Hidalgo, located 121 km northeast of Mexico City. Thereafter a cholera outbreak started in the region of La Huasteca. Genomic analyses of the four strains obtained in this study confirmed the presence of Pathogenicity Islands VPI-1 and -2, VSP-1 and -2, and of the integrative element SXT. The genomic structure of the 4 isolates was similar to that of V. cholerae strain 2010 EL-1786, identified during the epidemic in Haiti in 2010. Copyright © 2016 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  17. Whole-Genome Sequencing of Methicillin-Resistant Staphylococcus aureus Resistant to Fifth-Generation Cephalosporins Reveals Potential Non-mecA Mechanisms of Resistance.

    Directory of Open Access Journals (Sweden)

    Alexander L Greninger

    Fifth-generation cephalosporins, ceftobiprole and ceftaroline, are promising drugs for treatment of bacterial infections from methicillin-resistant Staphylococcus aureus (MRSA). These antibiotics are able to bind native PBP2a, the penicillin-binding protein encoded by the mecA resistance determinant that mediates broad class resistance to nearly all other beta-lactam antibiotics, at clinically achievable concentrations. Mechanisms of resistance to ceftaroline based on mecA mutations have been previously described. Here we compare the genomes of 11 total parent-daughter strains of Staphylococcus aureus for which specific selection by serial passaging with ceftaroline or ceftobiprole was used to identify novel non-mecA mechanisms of resistance. All 5 ceftaroline-resistant strains, derived from 5 different parental strains, contained mutations directly upstream of the pbp4 gene (coding for the PBP4 protein), including four with the same thymidine insertion located 377 nucleotides upstream of the promoter site. In 4 of 5 independent ceftaroline-driven selections, we also isolated mutations to the same residue (Asn138) in PBP4. In addition, mutations in additional candidate genes such as ClpX endopeptidase, PP2C protein phosphatase and transcription terminator Rho, previously undescribed in the context of resistance to ceftaroline or ceftobiprole, were detected in multiple selections. These genomic findings suggest that non-mecA mechanisms, while yet to be encountered in the clinical setting, may also be important in mediating resistance to 5th-generation cephalosporins.

  18. Whole-Genome Sequencing of Methicillin-Resistant Staphylococcus aureus Resistant to Fifth-Generation Cephalosporins Reveals Potential Non-mecA Mechanisms of Resistance

    Science.gov (United States)

    Chan, Liana C.; Hamilton, Stephanie M.; Chambers, Henry F.; Chiu, Charles Y.

    2016-01-01

    Fifth-generation cephalosporins, ceftobiprole and ceftaroline, are promising drugs for treatment of bacterial infections from methicillin-resistant Staphylococcus aureus (MRSA). These antibiotics are able to bind native PBP2a, the penicillin-binding protein encoded by the mecA resistance determinant that mediates broad class resistance to nearly all other beta-lactam antibiotics, at clinically achievable concentrations. Mechanisms of resistance to ceftaroline based on mecA mutations have been previously described. Here we compare the genomes of 11 total parent-daughter strains of Staphylococcus aureus for which specific selection by serial passaging with ceftaroline or ceftobiprole was used to identify novel non-mecA mechanisms of resistance. All 5 ceftaroline-resistant strains, derived from 5 different parental strains, contained mutations directly upstream of the pbp4 gene (coding for the PBP4 protein), including four with the same thymidine insertion located 377 nucleotides upstream of the promoter site. In 4 of 5 independent ceftaroline-driven selections, we also isolated mutations to the same residue (Asn138) in PBP4. In addition, mutations in additional candidate genes such as ClpX endopeptidase, PP2C protein phosphatase and transcription terminator Rho, previously undescribed in the context of resistance to ceftaroline or ceftobiprole, were detected in multiple selections. These genomic findings suggest that non-mecA mechanisms, while yet to be encountered in the clinical setting, may also be important in mediating resistance to 5th-generation cephalosporins. PMID:26890675

  19. Genomic multiple sequence alignments: refinement using a genetic algorithm

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2005-08-01

    Background: Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics, the practice of comparing genomic sequences from different species, plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To offset the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. Results: We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as to refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. Overall sequence identity increased only

  20. The diploid genome sequence of an individual human.

    Directory of Open Access Journals (Sweden)

    Samuel Levy

    2007-09-01

    Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
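    The variant counts quoted in this abstract can be checked against its summary figures ("more than 4.1 million" variants, 22% non-SNP events) with a short tally. This is only a sketch: it assumes the totals refer to the five enumerated classes, leaving out the segmental duplications and copy number variation regions, for which the abstract gives no counts. The dictionary keys and the function name `non_snp_share` are illustrative.

```python
# Variant counts as quoted in the abstract (five enumerated classes only).
variant_counts = {
    "SNPs": 3_213_401,
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}


def non_snp_share(counts):
    """Return (total events, non-SNP events, non-SNP fraction)."""
    total = sum(counts.values())
    non_snp = total - counts["SNPs"]
    return total, non_snp, non_snp / total


total, non_snp, share = non_snp_share(variant_counts)
print(f"total events:   {total:,}")
print(f"non-SNP events: {non_snp:,}")
print(f"non-SNP share:  {share:.1%}")
```

    Under that assumption the tally gives 4,118,889 events, of which 905,488 (about 22%) are not SNPs, in line with both summary figures in the abstract.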

  1. Mapping and Sequencing the Human Genome

    Science.gov (United States)

    1988-01-01

    Numerous meetings have been held and a debate has developed in the biological community over the merits of mapping and sequencing the human genome. In response, a committee to examine the desirability and feasibility of mapping and sequencing the human genome was formed to suggest options for implementing the project. The committee asked many questions. Should the analysis of the human genome be left entirely to the traditionally uncoordinated, but highly successful, support systems that fund the vast majority of biomedical research? Or should a more focused and coordinated additional support system be developed that is limited to encouraging and facilitating the mapping and eventual sequencing of the human genome? If so, how can this be done without distorting the broader goals of biological research that are crucial for any understanding of the data generated in such a human genome project? As the committee became better informed on the many relevant issues, the opinions of its members coalesced, producing a shared consensus of what should be done. This report reflects that consensus.

  2. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    Directory of Open Access Journals (Sweden)

    Weitemier Kevin

    2011-05-01

    Background: Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results: A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions: The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species

  3. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

    Science.gov (United States)

    Straub, Shannon C K; Fishbein, Mark; Livshultz, Tatyana; Foster, Zachary; Parks, Matthew; Weitemier, Kevin; Cronn, Richard C; Liston, Aaron

    2011-05-04

    Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first
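    To give a feel for what depths such as 0.5×, 0.29× and 0.14× mean, the sketch below uses the idealized Lander-Waterman (Poisson) approximation, in which the expected fraction of bases hit by at least one read at depth c is 1 - exp(-c). This model is an assumption introduced here for illustration, not a calculation from the study; it ignores repeats, sequencing bias and read-length effects, and the function name is hypothetical.

```python
import math


def expected_fraction_covered(depth):
    """Expected fraction of bases covered by at least one read,
    assuming reads land uniformly at random (Poisson approximation)."""
    return 1.0 - math.exp(-depth)


# Depths quoted in the abstract: whole genome, unigenes, COSII orthologs.
for depth in (0.5, 0.29, 0.14):
    fraction = expected_fraction_covered(depth)
    print(f"{depth:.2f}x depth -> about {fraction:.0%} of bases expected to be covered")
```

    Under this idealization a 0.5× data set would touch only about 39% of single-copy positions, which is consistent with the abstract's point that low coverage sequencing characterizes the high copy fraction well while only sampling the low copy fraction.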

  4. [Mapping and human genome sequence program].

    Science.gov (United States)

    Weissenbach, J

    1997-03-01

    Until recently, human genome programs focused primarily on establishing maps that would provide signposts to researchers seeking to identify genes responsible for inherited diseases, as well as a basis for genome sequencing studies. Preestablished gene mapping goals have been reached. The over 7,000 microsatellite markers identified to date provide a map of sufficient density to allow localization of the gene of a monogenic disease with a precision of 1 to 2 million base pairs. The physical map, based on systematically arranged overlapping sets of yeast artificial chromosomes (YACs), has also made considerable headway during the last few years. The most recently published map covers more than 90% of the genome. However, currently available physical maps cannot be used for sequencing studies because multiple rearrangements occur in YACs. The recently developed sets of radiation hybrids are extremely useful for incorporating genes into existing maps. A network of American and European laboratories has successfully used these radiation hybrids to map 15,000 gene tags from large-scale cDNA library sequencing programs. There are increasingly pressing reasons for initiating large scale human genome sequencing studies.

  5. Genome sequence of Lactobacillus farciminis KCTC 3681.

    Science.gov (United States)

    Nam, Seong-Hyeuk; Choi, Sang-Haeng; Kang, Aram; Kim, Dong-Wook; Kim, Ryong Nam; Kim, Aeri; Kim, Dae-Soo; Park, Hong-Seog

    2011-04-01

    Lactobacillus farciminis is one of the most prevalent lactic acid bacterial species present during the manufacturing process of kimchi, the best-known traditional Korean dish. Here, we present the draft genome sequence of the type strain Lactobacillus farciminis KCTC 3681 (2,498,309 bp, with a G+C content of 36.4%), which consists of 5 scaffolds.

  6. The diploid genome sequence of an Asian individual

    DEFF Research Database (Denmark)

    Wang, Jun; Wang, Wei; Li, Ruiqiang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we...

  7. Ancient DNA sequence revealed by error-correcting codes.

    Science.gov (United States)

    Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

    2015-07-10

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.

  8. A Draft Sequence of the Neandertal Genome

    Science.gov (United States)

    Green, Richard E.; Li, Heng; Zhai, Weiwei; Fritz, Markus Hsi-Yang; Hansen, Nancy F.; Durand, Eric Y.; Malaspinas, Anna-Sapfo; Jensen, Jeffrey D.; Marques-Bonet, Tomas; Alkan, Can; Prüfer, Kay; Meyer, Matthias; Burbano, Hernán A.; Good, Jeffrey M.; Schultz, Rigo; Aximu-Petri, Ayinuer; Butthof, Anne; Höber, Barbara; Höffner, Barbara; Siegemund, Madlen; Weihmann, Antje; Nusbaum, Chad; Lander, Eric S.; Russ, Carsten; Novod, Nathaniel; Affourtit, Jason; Egholm, Michael; Verna, Christine; Rudan, Pavao; Brajkovic, Dejana; Kucan, Željko; Gušic, Ivan; Doronichev, Vladimir B.; Golovanova, Liubov V.; Lalueza-Fox, Carles; de la Rasilla, Marco; Fortea, Javier; Rosas, Antonio; Schmitz, Ralf W.; Johnson, Philip L. F.; Eichler, Evan E.; Falush, Daniel; Birney, Ewan; Mullikin, James C.; Slatkin, Montgomery; Nielsen, Rasmus; Kelso, Janet; Lachmann, Michael; Reich, David; Pääbo, Svante

    2016-01-01

    Neandertals, the closest evolutionary relatives of present-day humans, lived in large parts of Europe and western Asia before disappearing 30,000 years ago. We present a draft sequence of the Neandertal genome composed of more than 4 billion nucleotides from three individuals. Comparisons of the Neandertal genome to the genomes of five present-day humans from different parts of the world identify a number of genomic regions that may have been affected by positive selection in ancestral modern humans, including genes involved in metabolism and in cognitive and skeletal development. We show that Neandertals shared more genetic variants with present-day humans in Eurasia than with present-day humans in sub-Saharan Africa, suggesting that gene flow from Neandertals into the ancestors of non-Africans occurred before the divergence of Eurasian groups from each other. PMID:20448178

  9. Complete Genome Sequences of Campylobacter jejuni Strains OD267 and WP2202 Isolated from Retail Chicken Livers and Gizzards Reveal the Presence of Novel 116-Kilobase and 119-Kilobase Megaplasmids with Type VI Secretion Systems

    Science.gov (United States)

    Marasini, Daya

    2016-01-01

    Genome sequences of Campylobacter jejuni strains OD267 and WP2202, isolated from chicken livers and gizzards, showed the presence of novel 116-kb and 119-kb megaplasmids, respectively. The two megaplasmids carry a type VI secretion system and tetracycline resistance genes. These are the largest sequenced Campylobacter plasmids to date. PMID:27688318

  10. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome.

    Directory of Open Access Journals (Sweden)

    Byrappa Venkatesh

    2007-04-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element-like and long interspersed element-like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes are higher than that between human and teleost fish genomes. Elephant shark contains putative four Hox clusters indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

  11. Survey Sequencing and Comparative Analysis of the Elephant Shark (Callorhinchus milii) Genome

    Science.gov (United States)

    Venkatesh, Byrappa; Kirkness, Ewen F; Loh, Yong-Hwee; Halpern, Aaron L; Lee, Alison P; Johnson, Justin; Dandona, Nidhi; Viswanathan, Lakshmi D; Tay, Alice; Venter, J. Craig; Strausberg, Robert L; Brenner, Sydney

    2007-01-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element–like and long interspersed element–like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes are higher than that between human and teleost fish genomes. Elephant shark contains putative four Hox clusters indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes. PMID:17407382

  12. Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences

    Directory of Open Access Journals (Sweden)

    Shade Larry L

    2006-06-01

    Background: Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. Results: Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however, substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10(-9) change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10(-9) change/site/year) was approximately half of the overall rate (1.9–2.0 × 10(-9) change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%. Conclusion: This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.
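
    The divergences and rates quoted in this record can be related with simple arithmetic. The sketch below is only an illustration and assumes a constant-rate (molecular clock) model in which pairwise divergence accumulates along both lineages, so d = 2 × rate × time. The abstract does not say that the authors used this relation, and the function names are hypothetical.

```python
def divergence_time_years(pairwise_divergence, rate_per_site_per_year):
    """Time since two lineages split under an assumed constant rate.

    Pairwise divergence accumulates along both branches, hence the factor 2.
    No correction for multiple substitutions at a site is applied here.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return pairwise_divergence / (2.0 * rate_per_site_per_year)


def rate_ratio(rate, reference_rate):
    """Ratio of one substitution rate to another (e.g. coding vs. overall)."""
    return rate / reference_rate


# Values taken from the abstract: 0.32-0.37 change/site, 1.9-2.0e-9 change/site/year.
low_estimate = divergence_time_years(0.32, 2.0e-9)
high_estimate = divergence_time_years(0.37, 1.9e-9)

# Coding rate (1.1e-9) relative to the midpoint of the overall rate (1.95e-9).
coding_vs_overall = rate_ratio(1.1e-9, 1.95e-9)

if __name__ == "__main__":
    print(f"implied split time: {low_estimate / 1e6:.0f}-{high_estimate / 1e6:.0f} million years")
    print(f"coding / overall rate: {coding_vs_overall:.2f}")
```

    Under these assumptions the quoted figures are mutually consistent with a split on the order of 80 to 100 million years, and the coding rate comes out at a little over half of the overall rate, in line with the "approximately half" wording.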

  13. Genome sequence and characterization of the Tsukamurella bacteriophage TPA2.

    Science.gov (United States)

    Petrovski, Steve; Seviour, Robert J; Tillett, Daniel

    2011-02-01

    The formation of stable foam in activated sludge plants is a global problem for which control is difficult. These foams are often stabilized by hydrophobic mycolic acid-synthesizing Actinobacteria, among which are Tsukamurella spp. This paper describes the isolation from activated sludge of the novel double-stranded DNA phage TPA2. This polyvalent Siphoviridae family phage is lytic for most Tsukamurella species. Whole-genome sequencing reveals that the TPA2 genome is circularly permuted (61,440 bp) and that 70% of its sequence is novel. We have identified 78 putative open reading frames, 95 pairs of inverted repeats, and 6 palindromes. The TPA2 genome has a modular gene structure that shares some similarity to those of Mycobacterium phages. A number of the genes display a mosaic architecture, suggesting that the TPA2 genome has evolved at least in part from genetic recombination events. The genome sequence reveals many novel genes that should inform any future discussion on Tsukamurella phage evolution.

  14. Mitochondrial genome sequences and comparative genomics of Phytophthora ramorum and P. sojae

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Frank N.; Bensasson, Douda; Tyler, Brett M.; Boore, Jeffrey L.

    2007-01-01

    The complete sequences of the mitochondrial genomes of the oomycetes Phytophthora ramorum and P. sojae were determined during the course of their complete nuclear genome sequencing (Tyler, et al. 2006). Both are circular, with sizes of 39,314 bp for P. ramorum and 42,975 bp for P. sojae. Each contains a total of 37 identifiable protein-encoding genes, 25 or 26 tRNAs (P. sojae and P. ramorum, respectively) specifying 19 amino acids, and a variable number of ORFs (7 for P. ramorum and 12 for P. sojae) which are potentially additional functional genes. Non-coding regions comprise approximately 11.5 percent and 18.4 percent of the genomes of P. ramorum and P. sojae, respectively. Relative to P. sojae, there is an inverted repeat of 1,150 bp in P. ramorum that includes an unassigned unique ORF, a tRNA gene, and adjacent non-coding sequences, but otherwise the gene order in both species is identical. Comparisons of these genomes with published sequences of the P. infestans mitochondrial genome reveal a number of similarities, but the gene order in P. infestans differs in two adjacent locations due to inversions. Sequence alignments of the three genomes indicated sequence conservation ranging from 75 to 85 percent and that specific regions were more variable than others.

  15. The genome sequence of the colonial chordate, Botryllus schlosseri

    Science.gov (United States)

    Voskoboynik, Ayelet; Neff, Norma F; Sahoo, Debashis; Newman, Aaron M; Pushkarev, Dmitry; Koh, Winston; Passarelli, Benedetto; Fan, H Christina; Mantalas, Gary L; Palmeri, Karla J; Ishizuka, Katherine J; Gissi, Carmela; Griggio, Francesca; Ben-Shlomo, Rachel; Corey, Daniel M; Penland, Lolita; White, Richard A; Weissman, Irving L; Quake, Stephen R

    2013-01-01

    Botryllus schlosseri is a colonial urochordate that follows the chordate plan of development following sexual reproduction, but invokes a stem cell-mediated budding program during subsequent rounds of asexual reproduction. As urochordates are considered to be the closest living invertebrate relatives of vertebrates, they are ideal subjects for whole genome sequence analyses. Using a novel method for high-throughput sequencing of eukaryotic genomes, we sequenced and assembled 580 Mbp of the B. schlosseri genome. The genome assembly is comprised of nearly 14,000 intron-containing predicted genes, and 13,500 intron-less predicted genes, 40% of which could be confidently parceled into 13 (of 16 haploid) chromosomes. A comparison of homologous genes between B. schlosseri and other diverse taxonomic groups revealed genomic events underlying the evolution of vertebrates and lymphoid-mediated immunity. The B. schlosseri genome is a community resource for studying alternative modes of reproduction, natural transplantation reactions, and stem cell-mediated regeneration. DOI: http://dx.doi.org/10.7554/eLife.00569.001 PMID:23840927

  16. Complete coding sequences of the rabbitpox virus genome.

    Science.gov (United States)

    Li, G; Chen, N; Roper, R L; Feng, Z; Hunter, A; Danila, M; Lefkowitz, E J; Buller, R M L; Upton, C

    2005-11-01

    Rabbitpox virus (RPXV) is highly virulent for rabbits and it has long been suspected to be a close relative of vaccinia virus. To explore these questions, the complete coding region of the rabbitpox virus genome was sequenced to permit comparison with sequenced strains of vaccinia virus and other orthopoxviruses. The genome of RPXV strain Utrecht (RPXV-UTR) is 197 731 nucleotides long, excluding the terminal hairpin structures at each end of the genome. The RPXV-UTR genome has 66.5 % A + T content, 184 putative functional genes and 12 fragmented ORF regions that are intact in other orthopoxviruses. The sequence of the RPXV-UTR genome reveals that two RPXV-UTR genes have orthologues in variola virus (VARV; the causative agent of smallpox), but not in vaccinia virus (VACV) strains. These genes are a zinc RING finger protein gene (RPXV-UTR-008) and an ankyrin repeat family protein gene (RPXV-UTR-180). A third gene, encoding a chemokine-binding protein (RPXV-UTR-001/184), is complete in VARV but functional only in some VACV strains. Examination of the evolutionary relationship between RPXV and other orthopoxviruses was carried out using the central 143 kb DNA sequence conserved among all completely sequenced orthopoxviruses and also the protein sequences of 49 gene products present in all completely sequenced chordopoxviruses. The results of these analyses both confirm that RPXV-UTR is most closely related to VACV and suggest that RPXV has not evolved directly from any of the sequenced VACV strains, since RPXV contains a 719 bp region not previously identified in any VACV.

  17. Genome-wide analysis of short interspersed nuclear elements SINES revealed high sequence conservation, gene association and retrotranspositional activity in wheat.

    Science.gov (United States)

    Ben-David, Smadar; Yaakov, Beery; Kashkush, Khalil

    2013-10-01

    Short interspersed nuclear elements (SINEs) are non-autonomous non-LTR retroelements that are present in most eukaryotic species. While SINEs have been intensively investigated in humans and other animal systems, they are poorly studied in plants, especially in wheat (Triticum aestivum). We used quantitative PCR of various wheat species to determine the copy number of a wheat SINE family, termed Au SINE, combined with computer-assisted analyses of the publicly available 454 pyrosequencing database of T. aestivum. In addition, we utilized site-specific PCR on 57 Au SINE insertions, transposon methylation display and transposon display on newly formed wheat polyploids to assess retrotranspositional activity, epigenetic status and genetic rearrangements in Au SINE, respectively. We retrieved 3706 different insertions of Au SINE from the 454 pyrosequencing database of T. aestivum, and found that most of the elements are inserted in A/T-rich regions, while approximately 38% of the insertions are associated with transcribed regions, including known wheat genes. We observed typical retrotransposition of Au SINE in the second generation of a newly formed wheat allohexaploid, and massive hypermethylation in CCGG sites surrounding Au SINE in the third generation. Finally, we observed huge differences in the copy numbers in diploid Triticum and Aegilops species, and a significant increase in the copy numbers in natural wheat polyploids, but no significant increase in the copy number of Au SINE in the first four generations for two of three newly formed allopolyploid species used in this study. Our data indicate that SINEs may play a prominent role in the genomic evolution of wheat through stress-induced activation. © 2013 Ben-Gurion University The Plant Journal © 2013 John Wiley & Sons Ltd.

  18. Genome Calligrapher: A Web Tool for Refactoring Bacterial Genome Sequences for de Novo DNA Synthesis.

    Science.gov (United States)

    Christen, Matthias; Deutsch, Samuel; Christen, Beat

    2015-08-21

    Recent advances in synthetic biology have resulted in an increasing demand for the de novo synthesis of large-scale DNA constructs. Any process improvement that enables fast and cost-effective streamlining of digitized genetic information into fabricable DNA sequences holds great promise to study, mine, and engineer genomes. Here, we present Genome Calligrapher, a computer-aided design web tool intended for whole genome refactoring of bacterial chromosomes for de novo DNA synthesis. By applying a neutral recoding algorithm, Genome Calligrapher optimizes GC content and removes obstructive DNA features known to interfere with the synthesis of double-stranded DNA and the higher order assembly into large DNA constructs. Subsequent bioinformatics analysis revealed that synthesis constraints are prevalent among bacterial genomes. However, a low level of codon replacement is sufficient for refactoring bacterial genomes into easy-to-synthesize DNA sequences. To test the algorithm, 168 kb of synthetic DNA comprising approximately 20 percent of the synthetic essential genome of the cell-cycle bacterium Caulobacter crescentus was streamlined and then ordered from a commercial supplier of low-cost de novo DNA synthesis. The successful assembly into eight 20 kb segments indicates that the Genome Calligrapher algorithm can be efficiently used to refactor difficult-to-synthesize DNA. Genome Calligrapher is broadly applicable to recode biosynthetic pathways, DNA sequences, and whole bacterial genomes, thus offering new opportunities to use synthetic biology tools to explore the functionality of microbial diversity. The Genome Calligrapher web tool can be accessed at https://christenlab.ethz.ch/GenomeCalligrapher.
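
    To make the idea of a GC-content synthesis constraint concrete, the following is a minimal sketch of a sliding-window GC scan. It is not the Genome Calligrapher algorithm, whose details are not given in this record. The window size and the 30%/70% thresholds are hypothetical values chosen for illustration.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_gc_windows(seq, window=20, low=0.3, high=0.7):
    """Return (start, gc) for every window whose GC fraction is outside [low, high].

    The thresholds are illustrative assumptions, not published synthesis limits.
    Sequences shorter than the window yield an empty list.
    """
    flagged = []
    for start in range(len(seq) - window + 1):
        gc = gc_fraction(seq[start:start + window])
        if gc < low or gc > high:
            flagged.append((start, gc))
    return flagged


if __name__ == "__main__":
    example = "AAAAAGGGGGGGGGG"
    for start, gc in flag_gc_windows(example, window=5):
        print(start, f"{gc:.1f}")
```

    A recoding tool would then have to choose synonymous codons that bring the flagged windows back into range. That step is left out because it depends on the codon table and on design rules that this record does not describe.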

  19. BSMAP: whole genome bisulfite sequence MAPping program

    Directory of Open Access Journals (Sweden)

    Li Wei

    2009-07-01

    Background: Bisulfite sequencing is a powerful technique to study DNA cytosine methylation. Bisulfite treatment followed by PCR amplification specifically converts unmethylated cytosines to thymine. Coupled with next generation sequencing technology, it is able to detect the methylation status of every cytosine in the genome. However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the increased searching space, reduced complexity of bisulfite sequence, asymmetric cytosine to thymine alignments, and multiple CpG heterogeneous methylation. Results: We developed an efficient bisulfite reads mapping algorithm BSMAP to address the above issues. BSMAP combines genome hashing and bitwise masking to achieve fast and accurate bisulfite mapping. Compared with existing bisulfite mapping approaches, BSMAP is faster, more sensitive and more flexible. Conclusion: BSMAP is the first general-purpose bisulfite mapping software. It is able to map high-throughput bisulfite reads at whole genome level with feasible memory and CPU usage. It is freely available under GPL v3 license at http://code.google.com/p/bsmap/.
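
    The asymmetric cytosine-to-thymine alignment mentioned in this record can be illustrated with a small sketch. It is not BSMAP's hashing and bitwise-masking implementation. It only shows, character by character and for a single strand, why a T in a read may align to a C in the reference while the reverse is a mismatch. All function names are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """In-silico conversion: every unmethylated C is read as T after treatment and PCR."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq.upper())
    )


def bisulfite_match(reference, read):
    """True if `read` could derive from `reference` (same strand, equal length).

    A reference C may appear as C (methylated) or T (converted) in the read,
    but a reference T can never appear as C, so the comparison is asymmetric.
    """
    if len(reference) != len(read):
        return False
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        if ref_base == read_base:
            continue
        if ref_base == "C" and read_base == "T":
            continue
        return False
    return True


def call_methylation(reference, read):
    """Map each reference C position to True if the read kept a C there."""
    reference, read = reference.upper(), read.upper()
    return {i: read[i] == "C" for i, base in enumerate(reference) if base == "C"}


if __name__ == "__main__":
    ref = "ACGTC"
    read = bisulfite_convert(ref, methylated_positions={1})
    print(read, bisulfite_match(ref, read), call_methylation(ref, read))
```

    A real mapper also has to handle the reverse strand (where the conversion appears as G to A), sequencing errors and indels, all of which this sketch ignores.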

  20. Improved genome sequencing using an engineered transposase.

    Science.gov (United States)

    Kia, Amirali; Gloeckner, Christian; Osothprarop, Trina; Gormley, Niall; Bomati, Erin; Stephenson, Michelle; Goryshin, Igor; He, Molly Min

    2017-01-17

    Next-generation sequencing (NGS) has transformed genomic research by reducing turnaround time and cost. However, no major breakthrough has been made in the upstream library preparation methods until the transposase-based Nextera method was invented. Nextera combines DNA fragmentation and barcoding in a single tube reaction and therefore enables a very fast workflow to sequencing-ready DNA libraries within a couple of hours. When compared to the traditional ligation-based methods, transposase-based Nextera has a slight insertion bias. Here we present the discovery of a mutant transposase (Tn5-059) with a lowered GC insertion bias through protein engineering. We demonstrate Tn5-059 reduces AT dropout and increases uniformity of genome coverage in both bacterial genomes and human genome. We also observe higher library diversity generated by Tn5-059 when compared to Nextera v2 for human exomes, which leads to less sequencing and lower cost per genome. In addition, when used for human exomes, Tn5-059 delivers consistent library insert size over a range of input DNA, allowing up to a tenfold variance from the 50 ng input recommendation. Enhanced DNA input tolerance of Tn5-059 can translate to flexibility and robustness of workflow. DNA input tolerance together with superior uniformity of coverage and lower AT dropouts extend the applications of transposase-based library preps. We discuss possible mechanisms of improvements in Tn5-059, and potential advantages of using the new mutant in varieties of applications including microbiome sequencing and chromatin profiling.

  1. A genome wide dosage suppressor network reveals genomic robustness

    Science.gov (United States)

    Patra, Biranchi; Kon, Yoshiko; Yadav, Gitanjali; Sevold, Anthony W.; Frumkin, Jesse P.; Vallabhajosyula, Ravishankar R.; Hintze, Arend; Østman, Bjørn; Schossau, Jory; Bhan, Ashish; Marzolf, Bruz; Tamashiro, Jenna K.; Kaur, Amardeep; Baliga, Nitin S.; Grayhack, Elizabeth J.; Adami, Christoph; Galas, David J.; Raval, Alpan; Phizicky, Eric M.; Ray, Animesh

    2017-01-01

    Genomic robustness is the extent to which an organism has evolved to withstand the effects of deleterious mutations. We explored the extent of genomic robustness in budding yeast by genome wide dosage suppressor analysis of 53 conditional lethal mutations in cell division cycle and RNA synthesis related genes, revealing 660 suppressor interactions of which 642 are novel. This collection has several distinctive features, including high co-occurrence of mutant-suppressor pairs within protein modules, highly correlated functions between the pairs and higher diversity of functions among the co-suppressors than previously observed. Dosage suppression of essential genes encoding RNA polymerase subunits and chromosome cohesion complex suggests a surprising degree of functional plasticity of macromolecular complexes, and the existence of numerous degenerate pathways for circumventing the effects of potentially lethal mutations. These results imply that organisms and cancer are likely able to exploit the genomic robustness properties, due to the persistence of cryptic gene and pathway functions, to generate variation and adapt to selective pressures. PMID:27899637

  2. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans

    DEFF Research Database (Denmark)

    Raghavan, Maanasa; Skoglund, Pontus; Graf, Kelly E.

    2014-01-01

    The origins of the First Americans remain contentious. Although Native Americans seem to be genetically most closely related to east Asians, there is no consensus with regard to which specific Old World populations they are closest to. Here we sequence the draft genome of an approximately 24,000-year-old individual (MA-1), from Mal'ta in south-central Siberia, to an average depth of 1×. To our knowledge this is the oldest anatomically modern human genome reported to date. The MA-1 mitochondrial genome belongs to haplogroup U, which has also been found at high frequency among Upper Palaeolithic... that the region was continuously occupied by humans throughout the Last Glacial Maximum. Our findings reveal that western Eurasian genetic signatures in modern-day Native Americans derive not only from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the First Americans.

  3. Identifying driver mutations in sequenced cancer genomes

    DEFF Research Database (Denmark)

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, noise, and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein...

  4. Whole genome sequence analysis of Mycobacterium suricattae

    KAUST Repository

    Dippenaar, Anzaan

    2015-10-21

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  5. Genome sequence of the pea aphid Acyrthosiphon pisum

    DEFF Research Database (Denmark)

    Richards, S.; Gibbs, R. A.; Gerardo, N. M.;

    2010-01-01

    Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we...... include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired...

  6. Draft Genome Sequence of Rubrivivax gelatinosus CBS

    Energy Technology Data Exchange (ETDEWEB)

    Hu, P. S.; Lang, J.; Wawrousek, K.; Yu, J. P.; Maness, P. C.; Chen, J.

    2012-06-01

    Rubrivivax gelatinosus CBS, a purple nonsulfur photosynthetic bacterium, can grow photosynthetically using CO and N₂ as the sole carbon and nitrogen nutrients, respectively. R. gelatinosus CBS is of particular interest due to its ability to metabolize CO and yield H₂. We present the 5-Mb draft genome sequence of R. gelatinosus CBS with the goal of providing genetic insight into the metabolic properties of this bacterium.

  7. Genome sequence of Psychrobacter cibarius strain W1

    DEFF Research Database (Denmark)

    Raghupathi, Prem Krishnan; Herschend, Jakob; Røder, Henriette Lyng

    2016-01-01

    Here, we report the draft genome sequence of Psychrobacter cibarius strain W1, which was isolated at a slaughterhouse in Denmark. The 3.63-Mb genome sequence was assembled into 241 contigs.

  8. Phytophthora Genome Sequences Uncover Evolutionary Origins and Mechanisms of Pathogenesis

    Energy Technology Data Exchange (ETDEWEB)

    Tyler, Brett M.; Tripathy, Sucheta; Zhang, Xuemin; Dehal, Paramvir; Jiang, Rays H. Y.; Aerts, Andrea; Arredondo, Felipe D.; Baxter, Laura; Bensasson, Douda; Beynon, Jim L.; Chapman, Jarrod; Damasceno, Cynthia M. B.; Dorrance, Anne E.; Dou, Daolong; Dickerman, Allan W.; Dubchak, Inna L.; Garbelotto, Matteo; Gijzen, Mark; Gordon, Stuart G.; Govers, Francine; Grunwald, Niklaus J.; Huang, Wayne; Ivors, Kelly L.; Jones, Richard W.; Kamoun, Sophien; Krampis, Konstantinos; Lamour, Kurt H.; Lee, Mi-Kyung; McDonald, W. Hayes; Medina, Monica; Meijer, Harold J. G.; Nordberg, Erik K.; Maclean, Donald J.; Ospina-Giraldo, Manuel D.; Morris, Paul F.; Phuntumart, Vipaporn; Putnam, Nicholas J.; Rash, Sam; Rose, Jocelyn K. C.; Sakihama, Yasuko; Salamov, Asaf A.; Savidor, Alon; Scheuring, Chantel F.; Smith, Brian M.; Sobral, Bruno W. S.; Terry, Astrid; Torto-Alalibo, Trudy A.; Win, Joe; Xu, Zhanyou; Zhang, Hongbin; Grigoriev, Igor V.; Rokhsar, Daniel S.; Boore, Jeffrey L.

    2006-04-17

    Draft genome sequences have been determined for the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum. Oömycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors, and, in particular, a superfamily of 700 proteins with similarity to known oömycete avirulence genes.

  9. Draft genome sequence of an aflatoxigenic Aspergillus species, A. bombycis

    Science.gov (United States)

    The genome of the A. bombycis Type strain was sequenced using a Personal Genome Machine, followed by annotation of its predicted genes. The genome size for A. bombycis was found to be approximately 37 Mb and contained 12,266 genes. This announcement introduces a sequenced genome for an aflatoxigenic...

  10. What Will We Do with a Cotton Genome Sequence?

    Institute of Scientific and Technical Information of China (English)

    BRUBAKER Curt

    2008-01-01

    With the publication of "Toward Sequencing Cotton (Gossypium) Genomes" [Chen et al., Plant Physiology, 2007, 145:1303-1310], a clear consensus emerged from the cotton genomics community not only that cotton genome sequences were a critical resource for research and commercial innovation in cotton genomics, but also that there was a logical means of achieving this goal.

  11. Sequencing of a Cultivated Diploid Cotton Genome-Gossypium arboreum

    Institute of Scientific and Technical Information of China (English)

    WILKINS Thea A

    2008-01-01

    Sequencing the genomes of crop species and model systems contributes significantly to our understanding of the organization, structure and function of plant genomes. In a 'white paper' published in 2007, the cotton community set forth a strategic plan for sequencing the AD genome of cultivated upland cotton that initially targets less complex diploid genomes. This strategy banks on the high degree

  12. Characterizing the citrus cultivar Carrizo genome through 454 shotgun sequencing.

    Science.gov (United States)

    Belknap, William R; Wang, Yi; Huo, Naxin; Wu, Jiajie; Rockhold, David R; Gu, Yong Q; Stover, Ed

    2011-12-01

    The citrus cultivar Carrizo is the single most important rootstock to the US citrus industry and has resistance or tolerance to a number of major citrus diseases, including citrus tristeza virus, foot rot, and Huanglongbing (HLB, citrus greening). A Carrizo genomic sequence database providing approximately 3.5× genome coverage (haploid genome size approximately 367 Mb) was populated through 454 GS FLX shotgun sequencing. Analysis of the repetitive DNA fraction indicated a total interspersed repeat fraction of 36.5%. Assembly and characterization of abundant citrus Ty3/gypsy elements revealed a novel type of element containing open reading frames encoding a viral RNA-silencing suppressor protein (RNA binding protein, rbp) and a plant cytokinin riboside 5′-monophosphate phosphoribohydrolase-related protein (LONELY GUY, log). Similar gypsy elements were identified in the Populus trichocarpa genome. Gene-coding region analysis indicated that 24.4% of the nonrepetitive reads contained genic regions. The depth of genome coverage was sufficient to allow accurate assembly of constituent genes, including a putative phloem-expressed gene. The development of the Carrizo database (http://citrus.pw.usda.gov/) will contribute to characterization of agronomically significant loci and provide a publicly available genomic resource to the citrus research community.

  13. Genome sequence of the stramenopile Blastocystis, a human anaerobic parasite

    Science.gov (United States)

    2011-01-01

    Background Blastocystis is a highly prevalent anaerobic eukaryotic parasite of humans and animals that is associated with various gastrointestinal and extraintestinal disorders. Epidemiological studies have identified different subtypes but no one subtype has been definitively correlated with disease. Results Here we report the 18.8 Mb genome sequence of a Blastocystis subtype 7 isolate, which is the smallest stramenopile genome sequenced to date. The genome is highly compact and contains intriguing rearrangements. Comparisons with other available stramenopile genomes (plant pathogenic oomycete and diatom genomes) revealed effector proteins potentially involved in the adaptation to the intestinal environment, which were likely acquired via horizontal gene transfer. Moreover, Blastocystis living in anaerobic conditions harbors mitochondria-like organelles. An incomplete oxidative phosphorylation chain, a partial Krebs cycle, amino acid and fatty acid metabolisms and an iron-sulfur cluster assembly are all predicted to occur in these organelles. Predicted secretory proteins possess putative activities that may alter host physiology, such as proteases, protease-inhibitors, immunophilins and glycosyltransferases. This parasite also possesses the enzymatic machinery to tolerate oxidative bursts resulting from its own metabolism or induced by the host immune system. Conclusions This study provides insights into the genome architecture of this unusual stramenopile. It also proposes candidate genes with which to study the physiopathology of this parasite and thus may lead to further investigations into Blastocystis-host interactions. PMID:21439036

  14. Enhanced Dynamic Algorithm of Genome Sequence Alignments

    Directory of Open Access Journals (Sweden)

    Arabi E. Keshk

    2014-05-01

    Full Text Available The merging of biology and computer science has created a new field called computational biology, or bioinformatics, that explores the capacity of computers to gain knowledge from biological data. Computational biology is rooted in the life sciences as well as in computer science, information science, and technology. A central problem in computational biology is sequence alignment, a way of arranging DNA, RNA or protein sequences to identify regions of similarity and relationships between sequences. This paper introduces an enhancement of the dynamic algorithm for genome sequence alignment, called EDAGSA. It fills only the three main diagonals rather than filling the entire matrix with unused data. It obtains the optimal solution while decreasing execution time, thereby improving performance. To illustrate its effectiveness, the proposed algorithm is compared with traditional methods such as the Needleman-Wunsch, Smith-Waterman and longest common subsequence algorithms. In addition, a database is implemented so that the algorithm can be used in multi-sequence alignments to search for the optimal sequence that matches a given sequence.
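
    The idea of filling only a few diagonals of the alignment matrix can be illustrated with a generic banded Needleman-Wunsch scorer. This is a minimal sketch, not the EDAGSA algorithm itself: the function name `nw_score`, the scoring values, and the `band` parameter are assumptions made for illustration. With `band=1` only the main diagonal and its two neighbours are filled; with `band=None` the full matrix is filled.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1, band=None):
    """Global alignment score (Needleman-Wunsch).

    If band is an integer, only cells with |i - j| <= band are filled;
    band=1 corresponds to the three central diagonals. Scoring values
    are illustrative assumptions, not taken from the paper.
    """
    n, m = len(a), len(b)
    if band is not None and abs(n - m) > band:
        raise ValueError("band too narrow to reach the final cell")
    neg_inf = float("-inf")
    # score[i][j] = best score for aligning a[:i] with b[:j].
    # The full matrix is allocated for clarity; only in-band cells are filled.
    score = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0
    for i in range(n + 1):
        lo = 0 if band is None else max(0, i - band)
        hi = m if band is None else min(m, i + band)
        for j in range(lo, hi + 1):
            if i == 0 and j == 0:
                continue
            best = neg_inf
            if i > 0 and j > 0:  # align a[i-1] with b[j-1]
                step = match if a[i - 1] == b[j - 1] else mismatch
                best = max(best, score[i - 1][j - 1] + step)
            if i > 0:  # gap in b
                best = max(best, score[i - 1][j] + gap)
            if j > 0:  # gap in a
                best = max(best, score[i][j - 1] + gap)
            score[i][j] = best
    return score[n][m]


full = nw_score("ACGT", "AGT")            # full matrix
banded = nw_score("ACGT", "AGT", band=1)  # three diagonals only
```

    A band this narrow returns the same score as the full matrix only when an optimal alignment stays within one cell of the main diagonal, as it does for the short example above; for more divergent sequences the banded score is a lower bound.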

  15. Full genome sequence of a Danish isolate of Mycobacterium avium subspecies paratuberculosis, strain Ejlskov2007

    DEFF Research Database (Denmark)

    Afzal, Mamuna; Abidi, Soad; Mikkelsen, Heidi

    , consisting of 4317 unique gene families. Comparison with M. avium paratuberculosis strain K10 revealed only 3436 genes in common (~70%). We have used GenomeAtlases to show conserved (and unique) regions along the Ejlskov2007 chromosome, compared to 2 other Mycobacterium avium sequenced genomes. Pan-genome...

  16. Draft Genome Sequence of Phytopathogenic Fungus Fusarium fujikuroi CF-295141, Isolated from Pinus sylvestris

    Science.gov (United States)

    Bertoni-Mann, Michele; Sánchez-Hidalgo, Marina; González-Menéndez, Victor

    2016-01-01

    Here, we report the draft genome sequence of a new strain of Fusarium fujikuroi, isolated from Pinus sylvestris, which was also found to produce the mycotoxin beauvericin. The Illumina-based sequence analysis revealed an approximate genome size of 44.2 Mbp, containing 164 secondary metabolite biosynthetic clusters. PMID:27795279

  17. Transforming clinical microbiology with bacterial genome sequencing.

    Science.gov (United States)

    Didelot, Xavier; Bowden, Rory; Wilson, Daniel J; Peto, Tim E A; Crook, Derrick W

    2012-09-01

    Whole-genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here, we review the current status of clinical microbiology and how it has already begun to be transformed by using next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties, such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. We predict that the application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow.

  18. The complete chloroplast genome sequence of Brachypodium distachyon: sequence comparison and phylogenetic analysis of eight grass plastomes

    Directory of Open Access Journals (Sweden)

    Anderson Olin D

    2008-07-01

    Full Text Available Abstract Background Wheat, barley, and rye, of tribe Triticeae in the Poaceae, are among the most important crops worldwide but they present many challenges to genomics-aided crop improvement. Brachypodium distachyon, a close relative of those cereals has recently emerged as a model for grass functional genomics. Sequencing of the nuclear and organelle genomes of Brachypodium is one of the first steps towards making this species available as a tool for researchers interested in cereals biology. Findings The chloroplast genome of Brachypodium distachyon was sequenced by a combinational approach using BAC end and shotgun sequences derived from a selected BAC containing the entire chloroplast genome. Comparative analysis indicated that the chloroplast genome is conserved in gene number and organization with respect to those of other cereals. However, several Brachypodium genes evolve at a faster rate than those in other grasses. Sequence analysis reveals that rice and wheat have a ~2.1 kb deletion in their plastid genomes and this deletion must have occurred independently in both species. Conclusion We demonstrate that BAC libraries can be used to sequence plastid, and likely other organellar, genomes. As expected, the Brachypodium chloroplast genome is very similar to those of other sequenced grasses. The phylogenetic analyses and the pattern of insertions and deletions in the chloroplast genome confirmed that Brachypodium is a close relative of the tribe Triticeae. Nevertheless, we show that some large indels can arise multiple times and may confound phylogenetic reconstruction.

  19. Detecting overlapping coding sequences in virus genomes

    Directory of Open Access Journals (Sweden)

    Brown Chris M

    2006-02-01

    Full Text Available Abstract Background Detecting new coding sequences (CDSs) in viral genomes can be difficult for several reasons. The typically compact genomes often contain a number of overlapping coding and non-coding functional elements, which can result in unusual patterns of codon usage; conservation between related sequences can be difficult to interpret – especially within overlapping genes; and viruses often employ non-canonical translational mechanisms – e.g. frameshifting, stop codon read-through, leaky-scanning and internal ribosome entry sites – which can conceal potentially coding open reading frames (ORFs). Results In a previous paper we introduced a new statistic – MLOGD (Maximum Likelihood Overlapping Gene Detector) – for detecting and analysing overlapping CDSs. Here we present (a) an improved MLOGD statistic, (b) a greatly extended suite of software using MLOGD, (c) a database of results for 640 virus sequence alignments, and (d) a web-interface to the software and database. Tests show that, from an alignment with just 20 mutations, MLOGD can discriminate non-overlapping CDSs from non-coding ORFs with a typical accuracy of up to 98%, and can detect CDSs overlapping known CDSs with a typical accuracy of 90%. In addition, the software produces a variety of statistics and graphics, useful for analysing an input multiple sequence alignment. Conclusion MLOGD is an easy-to-use tool for virus genome annotation, detecting new CDSs – in particular overlapping or short CDSs – and for analysing overlapping CDSs following frameshift sites. The software, web-server, database and supplementary material are available at http://guinevere.otago.ac.nz/mlogd.html.

  20. Uncovering genomic features and maternal origin of Korean native chicken by whole genome sequencing.

    Science.gov (United States)

    Kwak, Woori; Song, Ki-Duk; Oh, Jae-Don; Heo, Kang-Nyeong; Lee, Jun-Heon; Lee, Woon Kyu; Yoon, Sook Hee; Kim, Heebal; Cho, Seoae; Lee, Hak-Kyo

    2014-01-01

    The Korean Native Chicken (KNC) is an important endemic biological resource in Korea. While numerous studies have been conducted exploring this breed, none have used next-generation sequencing to identify its specific genomic features. We sequenced five strains of KNC and identified 10.9 million SNVs and 1.3 million InDels. Through the analysis, we found that the highly variable region common to all 5 strains had genes like PCHD15, CISD1, PIK3C2A, and NUCB2 that might be related to the phenotypic traits of the chicken such as auditory sense, growth rate and egg traits. In addition, we assembled unaligned reads that could not be mapped to the reference genome. By assembling the unaligned reads, we were able to present genomic sequences characteristic to the KNC. Based on this, we also identified genes related to the olfactory receptors and antigen that are common to all 5 strains. Finally, through the reconstructed mitochondrial genome sequences, we performed phylogenomic analysis and elucidated the maternal origin of the artificially restored KNC. Our results revealed that the KNC has multiple maternal origins which are in agreement with Korea's history of chicken breed imports. The results presented here provide a valuable basis for future research on genomic features of KNC and further understanding of KNC's origin.

  1. Uncovering genomic features and maternal origin of Korean native chicken by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Woori Kwak

    Full Text Available The Korean Native Chicken (KNC) is an important endemic biological resource in Korea. While numerous studies have been conducted exploring this breed, none have used next-generation sequencing to identify its specific genomic features. We sequenced five strains of KNC and identified 10.9 million SNVs and 1.3 million InDels. Through the analysis, we found that the highly variable region common to all 5 strains had genes like PCHD15, CISD1, PIK3C2A, and NUCB2 that might be related to the phenotypic traits of the chicken such as auditory sense, growth rate and egg traits. In addition, we assembled unaligned reads that could not be mapped to the reference genome. By assembling the unaligned reads, we were able to present genomic sequences characteristic to the KNC. Based on this, we also identified genes related to the olfactory receptors and antigen that are common to all 5 strains. Finally, through the reconstructed mitochondrial genome sequences, we performed phylogenomic analysis and elucidated the maternal origin of the artificially restored KNC. Our results revealed that the KNC has multiple maternal origins which are in agreement with Korea's history of chicken breed imports. The results presented here provide a valuable basis for future research on genomic features of KNC and further understanding of KNC's origin.

  2. An evaluation of Comparative Genome Sequencing (CGS) by comparing two previously-sequenced bacterial genomes

    Directory of Open Access Journals (Sweden)

    Herring Christopher D

    2007-08-01

    Full Text Available Abstract Background With the development of new technology, it has recently become practical to resequence the genome of a bacterium after experimental manipulation. It is critical though to know the accuracy of the technique used, and to establish confidence that all of the mutations were detected. Results In order to evaluate the accuracy of genome resequencing using the microarray-based Comparative Genome Sequencing service provided by Nimblegen Systems Inc., we resequenced the E. coli strain W3110 Kohara using MG1655 as a reference, both of which have been completely sequenced using traditional sequencing methods. CGS detected 7 of 8 small sequence differences, one large deletion, and 9 of 12 IS element insertions present in W3110, but did not detect a large chromosomal inversion. In addition, we confirmed that CGS also detected 2 SNPs, one deletion and 7 IS element insertions that are not present in the genome sequence, which we attribute to changes that occurred after the creation of the W3110 lambda clone library. The false positive rate for SNPs was one per 244 Kb of genome sequence. Conclusion CGS is an effective way to detect multiple mutations present in one bacterium relative to another, and while highly cost-effective, is prone to certain errors. Mutations occurring in repeated sequences or in sequences with a high degree of secondary structure may go undetected. It is also critical to follow up on regions of interest in which SNPs were not called because they often indicate deletions or IS element insertions.

  3. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    LENUS (Irish Health Repository)

    Potnis, Neha

    2011-03-11

    Abstract Background Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv) has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results We sequenced the genomes of X. vesicatoria (Xv) strain 1111 (ATCC 35937), X. perforans (Xp) strain 91-118 and X. gardneri (Xg) strain 101 (ATCC 19865). The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, the molecular features were predicted that may be required for pathogenicity including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the lipopolysaccharide cluster

  4. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    Directory of Open Access Journals (Sweden)

    Koebnik Ralf

    2011-03-01

    Full Text Available Abstract Background Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv) has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results We sequenced the genomes of X. vesicatoria (Xv) strain 1111 (ATCC 35937), X. perforans (Xp) strain 91-118 and X. gardneri (Xg) strain 101 (ATCC 19865). The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, the molecular features were predicted that may be required for pathogenicity including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the

  5. Why Assembling Plant Genome Sequences Is So Challenging

    Directory of Open Access Journals (Sweden)

    Pedro Seoane

    2012-09-01

    Full Text Available In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed.

  6. Why Assembling Plant Genome Sequences Is So Challenging

    Science.gov (United States)

    Claros, Manuel Gonzalo; Bautista, Rocío; Guerrero-Fernández, Darío; Benzerki, Hicham; Seoane, Pedro; Fernández-Pozo, Noé

    2012-01-01

    In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed. PMID:24832233

  7. Global genomic diversity of human papillomavirus 6 based on 724 isolates and 190 complete genome sequences.

    Science.gov (United States)

    Jelen, Mateja M; Chen, Zigui; Kocjan, Boštjan J; Burt, Felicity J; Chan, Paul K S; Chouhy, Diego; Combrinck, Catharina E; Coutlée, François; Estrade, Christine; Ferenczy, Alex; Fiander, Alison; Franco, Eduardo L; Garland, Suzanne M; Giri, Adriana A; González, Joaquín Víctor; Gröning, Arndt; Heidrich, Kerstin; Hibbitts, Sam; Hošnjak, Lea; Luk, Tommy N M; Marinic, Karina; Matsukura, Toshihiko; Neumann, Anna; Oštrbenk, Anja; Picconi, Maria Alejandra; Richardson, Harriet; Sagadin, Martin; Sahli, Roland; Seedat, Riaz Y; Seme, Katja; Severini, Alberto; Sinchi, Jessica L; Smahelova, Jana; Tabrizi, Sepehr N; Tachezy, Ruth; Tohme, Sarah; Uloza, Virgilijus; Vitkauskiene, Astra; Wong, Yong Wee; Zidovec Lepej, Snježana; Burk, Robert D; Poljak, Mario

    2014-07-01

    Human papillomavirus type 6 (HPV6) is the major etiological agent of anogenital warts and laryngeal papillomas and has been included in both the quadrivalent and nonavalent prophylactic HPV vaccines. This study investigated the global genomic diversity of HPV6, using 724 isolates and 190 complete genomes from six continents, and the association of HPV6 genomic variants with geographical location, anatomical site of infection/disease, and gender. Initially, a 2,800-bp E5a-E5b-L1-LCR fragment was sequenced from 492/530 (92.8%) HPV6-positive samples collected for this study. Among them, 130 exhibited at least one single nucleotide polymorphism (SNP), indel, or amino acid change in the E5a-E5b-L1-LCR fragment and were sequenced in full. A global alignment and maximum likelihood tree of 190 complete HPV6 genomes (130 fully sequenced in this study and 60 obtained from sequence repositories) revealed two variant lineages, A and B, and five B sublineages: B1, B2, B3, B4, and B5. HPV6 (sub)lineage-specific SNPs and a 960-bp representative region for whole-genome-based phylogenetic clustering within the L2 open reading frame were identified. Multivariate logistic regression analysis revealed that lineage B predominated globally. Sublineage B3 was more common in Africa and North and South America, and lineage A was more common in Asia. Sublineages B1 and B3 were associated with anogenital infections, indicating a potential lesion-specific predilection of some HPV6 sublineages. Females had higher odds for infection with sublineage B3 than males. In conclusion, a global HPV6 phylogenetic analysis revealed the existence of two variant lineages and five sublineages, showing some degree of ethnogeographic, gender, and/or disease predilection in their distribution. This study established the largest database of globally circulating HPV6 genomic variants and contributed a total of 130 new, complete HPV6 genome sequences to available sequence repositories. 

  8. Draft genome sequences of the Pseudomonas fluorescens biocontrol strains Wayne1R and Wood1R.

    Science.gov (United States)

    Rong, Xiaoqing; Gurel, Fulya Baysal; Meulia, Tea; McSpadden Gardener, Brian B

    2012-02-01

    Pseudomonas fluorescens strains Wayne1R and Wood1R have proven capacities to improve plant health. Here we report the draft genome sequences and automatic annotations of both strains. Genome comparisons reveal similarities with P. fluorescens strain Pf-5, reveal the novelty of Wood1R, and indicate some genes that may be related to biocontrol.

  9. The genome sequence of Blochmannia floridanus: Comparative analysis of reduced genomes

    Science.gov (United States)

    Gil, Rosario; Silva, Francisco J.; Zientz, Evelyn; Delmotte, François; González-Candelas, Fernando; Latorre, Amparo; Rausell, Carolina; Kamerbeek, Judith; Gadau, Jürgen; Hölldobler, Bert; van Ham, Roeland C. H. J.; Gross, Roy; Moya, Andrés

    2003-01-01

    Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely has a nutritional basis: Blochmannia is able to supply nitrogen and sulfur compounds to the host while it takes advantage of the host metabolic machinery. Remarkably, these bacteria lack all known genes involved in replication initiation (dnaA, priA, and recA). The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far. Comparative analysis of the five known genomes from insect endosymbiotic bacteria reveals they share only 313 genes, a number that may be close to the minimum gene set necessary to sustain endosymbiotic life. PMID:12886019

  10. How evolution of genomes is reflected in exact DNA sequence match statistics.

    Science.gov (United States)

    Massip, Florian; Sheinman, Michael; Schbath, Sophie; Arndt, Peter F

    2015-02-01

    Genome evolution is shaped by a multitude of mutational processes, including point mutations, insertions, and deletions of DNA sequences, as well as segmental duplications. These mutational processes can leave distinctive qualitative marks in the statistical features of genomic DNA sequences. One such feature is the match length distribution (MLD) of exactly matching sequence segments within an individual genome or between the genomes of related species. These have been observed to exhibit characteristic power law decays in many species. Here, we show that simple dynamical models consisting solely of duplication and mutation processes can already explain the characteristic features of MLDs observed in genomic sequences. Surprisingly, we find that these features are largely insensitive to details of the underlying mutational processes and do not necessarily rely on the action of natural selection. Our results demonstrate how analyzing statistical features of DNA sequences can help us reveal and quantify the different mutational processes that underlie genome evolution.
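
    As a rough illustration of what a match length distribution is, the sketch below counts maximal gap-free exact matches between two sequences by scanning every diagonal of their comparison matrix. It is a naive quadratic-time toy, not the authors' method; the function name `match_length_distribution` and the `min_len` parameter are assumptions introduced here.

```python
from collections import Counter


def match_length_distribution(x, y, min_len=1):
    """Count maximal exact matches between x and y by length.

    A match is a run of identical characters along one diagonal
    (fixed offset j - i) that cannot be extended in either direction.
    min_len should be at least 1. Runs in O(len(x) * len(y)) time.
    """
    counts = Counter()
    n, m = len(x), len(y)
    for offset in range(-(n - 1), m):  # offset = j - i
        i = max(0, -offset)
        j = i + offset
        run = 0
        while i < n and j < m:
            if x[i] == y[j]:
                run += 1
            else:
                if run >= min_len:
                    counts[run] += 1
                run = 0
            i += 1
            j += 1
        if run >= min_len:  # run that reaches the end of a diagonal
            counts[run] += 1
    return counts


mld = match_length_distribution("AAB", "AB")
```

    Plotting the counts against match length on log-log axes is the usual way to look for the power-law decay described above; real genomes would need an index-based method (for example suffix arrays) rather than this diagonal scan.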

  11. Complete genome sequence of the fish pathogen Flavobacterium branchiophilum.

    Science.gov (United States)

    Touchon, Marie; Barbier, Paul; Bernardet, Jean-François; Loux, Valentin; Vacherie, Benoit; Barbe, Valérie; Rocha, Eduardo P C; Duchaud, Eric

    2011-11-01

    Members of the genus Flavobacterium occur in a variety of ecological niches and represent an interesting diversity of lifestyles. Flavobacterium branchiophilum is the main causative agent of bacterial gill disease, a severe condition affecting various cultured freshwater fish species worldwide, in particular salmonids in Canada and Japan. We report here the complete genome sequence of strain FL-15 isolated from a diseased sheatfish (Silurus glanis) in Hungary. The analysis of the F. branchiophilum genome revealed putative mechanisms of pathogenicity strikingly different from those of the other, closely related fish pathogen Flavobacterium psychrophilum, including the first cholera-like toxin in a non-Proteobacteria and a wealth of adhesins. The comparison with available genomes of other Flavobacterium species revealed a small genome size, large differences in chromosome organization, and fewer rRNA and tRNA genes, in line with its more fastidious growth. In addition, horizontal gene transfer shaped the evolution of F. branchiophilum, as evidenced by its virulence factors, genomic islands, and CRISPR (clustered regularly interspaced short palindromic repeats) systems. Further functional analysis should help in the understanding of host-pathogen interactions and in the development of rational diagnostic tools and control strategies in fish farms.

  12. Building the sequence map of the human pan-genome

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Li, Yingrui; Zheng, Hancheng

    2010-01-01

    Here we integrate the de novo assembly of an Asian and an African genome with the NCBI reference human genome, as a step toward constructing the human pan-genome. We identified approximately 5 Mb of novel sequences not present in the reference genome in each of these assemblies. Most novel...... analysis of predicted genes indicated that the novel sequences contain potentially functional coding regions. We estimate that a complete human pan-genome would contain approximately 19-40 Mb of novel sequence not present in the extant reference genome. The extensive amount of novel sequence contributing...... to the genetic variation of the pan-genome indicates the importance of using complete genome sequencing and de novo assembly....

  13. Detecting long tandem duplications in genomic sequences

    Directory of Open Access Journals (Sweden)

    Audemard Eric

    2012-05-01

    Full Text Available Abstract Background: Detecting duplication segments within completely sequenced genomes provides valuable information to address genome evolution and in particular the important question of the emergence of novel functions. The usual approach to gene duplication detection, based on all-pairs protein gene comparisons, provides only a restricted view of duplication. Results: In this paper, we introduce ReD Tandem, software that uses a flow-based chaining algorithm targeted at detecting tandem duplication arrays of moderate to longer length regions, with possibly locally weak similarities, directly at the DNA level. On the A. thaliana genome, using a reference set of tandem duplicated genes built using TAIR, we show that ReD Tandem is able to predict a large fraction of recently duplicated genes (dS ... Conclusions: ReD Tandem makes it possible to identify large tandem duplications without any annotation, leading to agnostic identification of tandem duplications. This approach nicely complements the usual protein gene based approach, which ignores duplications involving non-coding regions. It is however inherently restricted to relatively recent duplications. By recovering otherwise ignored events, ReD Tandem gives a more comprehensive view of existing evolutionary processes and may also help improve existing annotations.

  14. Rapid whole genome sequencing and precision neonatology.

    Science.gov (United States)

    Petrikin, Joshua E; Willig, Laurel K; Smith, Laurie D; Kingsmore, Stephen F

    2015-12-01

    Traditionally, genetic testing has been too slow or perceived to be impractical to initial management of the critically ill neonate. Technological advances have led to the ability to sequence and interpret the entire genome of a neonate in as little as 26 h. As the cost and turnaround time of testing decrease, the utility of whole genome sequencing (WGS) of neonates for acute and latent genetic illness increases. Analyzing the entire genome allows for concomitant evaluation of the currently identified 5588 single gene diseases. When applied to a select population of ill infants in a level IV neonatal intensive care unit, WGS yielded a diagnosis of a causative genetic disease in 57% of patients. These diagnoses may lead to clinical management changes ranging from transition to palliative care for uniformly lethal conditions to alteration or initiation of medical or surgical therapy to improve outcomes in others. Thus, institution of 2-day WGS at time of acute presentation opens the possibility of early implementation of precision medicine. This implementation may create opportunities for early interventional, frequently novel or off-label therapies that may alter disease trajectory in infants with what would otherwise be fatal disease. Widespread deployment of rapid WGS and precision medicine will raise ethical issues pertaining to interpretation of variants of unknown significance, discovery of incidental findings related to adult onset conditions and carrier status, and implementation of medical therapies for which little is known in terms of risks and benefits. Despite these challenges, precision neonatology has significant potential both to decrease infant mortality related to genetic diseases with onset in newborns and to facilitate parental decision making regarding transition to palliative care.

  15. Genomic Sequence Comparisons, 1987-2003 Final Report

    Energy Technology Data Exchange (ETDEWEB)

    George M. Church

    2004-07-29

    This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988), and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

  16. Draft Genome Sequence of Phenylobacterium immobile Strain E (DSM 1986), Isolated from Uncontaminated Soil in Ecuador

    OpenAIRE

    Reznicek, Ondrej; Luesken, Francisca; Facey, Sandra J.; Hauer, Bernhard

    2015-01-01

    We report the draft genome sequence of 3.3 Mb and the sequence (19.2 kb) of a natural plasmid isolated from Phenylobacterium immobile strain E (DSM 1986), able to degrade xenobiotic compounds as the sole carbon source. The sequences reveal a large number of novel Rieske nonheme iron aromatic ring-hydroxylating oxygenases (RHOs).

  17. Differential metabolism of Mycoplasma species as revealed by their genomes

    Directory of Open Access Journals (Sweden)

    Fabricio B.M. Arraes

    2007-01-01

    Full Text Available The annotation and comparative analyses of the genomes of Mycoplasma synoviae and Mycoplasma hyopneumoniae, as well as of other Mollicutes (a group of bacteria devoid of a rigid cell wall), have set the grounds for a global understanding of their metabolism and infection mechanisms. According to the annotation data, M. synoviae and M. hyopneumoniae are able to perform glycolytic metabolism, but do not possess the enzymatic machinery for citrate and glyoxylate cycles, gluconeogenesis and the pentose phosphate pathway. Both can synthesize ATP by lactic fermentation, but only M. synoviae can convert acetaldehyde to acetate. Also, our genome analysis revealed that M. synoviae and M. hyopneumoniae are not expected to synthesize polysaccharides, but they can take up a variety of carbohydrates via the phosphoenolpyruvate-dependent phosphotransferase system (PEP-PTS). Our data showed that these two organisms are unable to synthesize purine and pyrimidine de novo, since they only possess the sequences which encode salvage pathway enzymes. Comparative analyses of M. synoviae and M. hyopneumoniae with other Mollicutes have revealed differential genes in the former two genomes coding for enzymes that participate in carbohydrate, amino acid and nucleotide metabolism and host-pathogen interaction. The identification of these metabolic pathways will provide a better understanding of the biology and pathogenicity of these organisms.

  18. Two genome sequences of the same bacterial strain, Gluconacetobacter diazotrophicus PAl 5, suggest a new standard in genome sequence submission.

    Science.gov (United States)

    Giongo, Adriana; Tyler, Heather L; Zipperer, Ursula N; Triplett, Eric W

    2010-06-15

    Gluconacetobacter diazotrophicus PAl 5 is of agricultural significance due to its ability to provide fixed nitrogen to plants. Consequently, its genome sequence has been eagerly anticipated to enhance understanding of endophytic nitrogen fixation. Two groups have sequenced the PAl 5 genome from the same source (ATCC 49037), though the resulting sequences contain a surprisingly high number of differences. Therefore, an optical map of PAl 5 was constructed in order to determine which genome assembly more closely resembles the chromosomal DNA by aligning each sequence against a physical map of the genome. While one sequence aligned very well, over 98% of the second sequence contained numerous rearrangements. The many differences observed between these two genome sequences could be owing to either assembly errors or rapid evolutionary divergence. The extent of the differences derived from sequence assembly errors could be assessed if the raw sequencing reads were provided by both genome centers at the time of genome sequence submission. Hence, a new genome sequence standard is proposed whereby the investigator supplies the raw reads along with the closed sequence so that the community can make more accurate judgments on whether differences observed in a single strain may be of biological origin or are simply caused by differences in genome assembly procedures.

  19. Structure and sequence of the saimiriine herpesvirus 1 genome.

    Science.gov (United States)

    Tyler, Shaun; Severini, Alberto; Black, Darla; Walker, Matthew; Eberle, R

    2011-02-05

    We report here the complete genome sequence of the squirrel monkey α-herpesvirus saimiriine herpesvirus 1 (HVS1). Unlike the simplexviruses of other primate species, only the unique short region of the HVS1 genome is bounded by inverted repeats. While all Old World simian simplexviruses characterized to date lack the herpes simplex virus RL1 (γ34.5) gene, HVS1 has an RL1 gene. HVS1 lacks several genes that are present in other primate simplexviruses (US8.5, US10-12, UL43/43.5 and UL49A). Although the overall genome structure appears more like that of varicelloviruses, the encoded HVS1 proteins are most closely related to homologous proteins of the primate simplexviruses. Phylogenetic analyses confirm that HVS1 is a simplexvirus. Limited comparison of two HVS1 strains revealed a very low degree of sequence variation more typical of varicelloviruses. HVS1 is thus unique among the primate α-herpesviruses in that its genome has properties of both simplexviruses and varicelloviruses.

  20. Genome Sequence of Stachybotrys chartarum Strain 51-11

    OpenAIRE

    Betancourt, Doris A.; Dean, Timothy R.; Kim, Jean; Levy, Josh

    2015-01-01

    The Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina HiSeq 2000 and PacBio technologies. Since S. chartarum has been implicated as having health impacts within water-damaged buildings, any information extracted from the genomic sequence data relating to toxins or the metabolism of the fungus might be useful.

  1. Complete Genome Sequence of Rift Valley Fever Virus Strain Lunyo.

    Science.gov (United States)

    Lumley, Sarah; Horton, Daniel L; Marston, Denise A; Johnson, Nicholas; Ellis, Richard J; Fooks, Anthony R; Hewson, Roger

    2016-04-14

    Using next-generation sequencing technologies, the first complete genome sequence of Rift Valley fever virus strain Lunyo is reported here. Originally reported as an attenuated antigenic variant strain from Uganda, genomic sequence analysis shows that Lunyo clusters together with other Ugandan isolates.

  2. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis

    DEFF Research Database (Denmark)

    Carlton, Jane M.; Hirt, Robert P.; Silva, Joana C.

    2007-01-01

    We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the approximately 160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion...

  3. Coevolution between simple sequence repeats (SSRs) and virus genome size

    Directory of Open Access Journals (Sweden)

    Zhao Xiangyan

    2012-08-01

    Full Text Available Abstract Background: The relationship between the level of repetitiveness in genomic sequence and genome size has been investigated using complete prokaryotic and eukaryotic genomes, but few such studies have been made in virus genomes. Results: In this study, a total of 257 viruses were examined, covering 90% of genera. The results showed that simple sequence repeat (SSR) content is strongly, positively and significantly correlated with genome size. Certain repeat classes are distributed within certain ranges of genome sequence length. Mono-, di- and tri- repeats are widely distributed in all virus genomes; tetra- SSRs are a common component of genomes larger than 100 kb; in the range of genome ... Conclusions: We conducted this research at the level of viruses as a whole. We concluded that genome size is an important factor affecting the occurrence of SSRs; hosts are also responsible, to a certain degree, for the variance in SSR content.
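    The repeat classes named in the abstract above (mono-, di-, tri- and tetra- SSRs) can be located with a simple pattern search. The following is a minimal sketch, not the authors' method: the function names `find_ssrs` and `is_primitive` and the minimum repeat thresholds used in the example are assumptions chosen for illustration, since the abstract does not state the criteria that were applied.

```python
import re


def is_primitive(unit: str) -> bool:
    """True if `unit` is not itself a repetition of a shorter motif (e.g. 'AA')."""
    n = len(unit)
    return not any(n % d == 0 and unit == unit[:d] * (n // d) for d in range(1, n))


def find_ssrs(sequence: str, unit_len: int, min_repeats: int):
    """Return (start, unit, repeat_count) for each run of a motif of length
    `unit_len` repeated at least `min_repeats` times (hypothetical thresholds)."""
    # Back-reference \1 forces the same motif to repeat immediately.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        if is_primitive(unit):  # skip e.g. 'AA' runs when looking for di- repeats
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits


# Example on a toy sequence: one di-nucleotide repeat, (AC) x 4.
print(find_ssrs("TTACACACACGG", unit_len=2, min_repeats=3))
```

    Counting the hits per genome for each motif length gives one SSR tally per repeat class, which is the kind of quantity that could then be compared against genome size.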

  4. Nullomers and High Order Nullomers in Genomic Sequences

    Science.gov (United States)

    Vergni, Davide; Santoni, Daniele

    2016-01-01

    A nullomer is an oligomer that does not occur as a subsequence in a given DNA sequence, i.e. it is an absent word of that sequence. The importance of nullomers in several applications, from drug discovery to forensic practice, is now debated in the literature. Here, we investigated the nature of nullomers, whether their absence in genomes has just a statistical explanation or it is a peculiar feature of genomic sequences. We introduced an extension of the notion of nullomer, namely high order nullomers, which are nullomers whose mutated sequences are still nullomers. We studied different aspects of them: comparison with nullomers of random sequences, CpG distribution and mean helical rise. In agreement with previous results we found that the number of nullomers in the human genome is much larger than expected by chance. Nevertheless antithetical results were found when considering a random DNA sequence preserving dinucleotide frequencies. The analysis of CpG frequencies in nullomers and high order nullomers revealed, as expected, a high CpG content but it also highlighted a strong dependence of CpG frequencies on the dinucleotide position, suggesting that nullomers have their own peculiar structure and are not simply sequences whose CpG frequency is biased. Furthermore, phylogenetic trees were built on eleven species based on both the similarities between the dinucleotide frequencies and the number of nullomers two species share, showing that nullomers are fairly conserved among close species. Finally the study of mean helical rise of nullomer sequences revealed significantly high mean rise values, reinforcing the hypothesis that those sequences have some peculiar structural features. The obtained results show that nullomers are the consequence of the peculiar structure of DNA (also including biased CpG frequency and CpG islands), so that the hypermutability model, also taking into account CpG islands, seems to be not sufficient to explain the nullomer phenomenon.
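    The definition at the start of the abstract above translates directly into code: enumerate every word of length k over the DNA alphabet and keep those that never appear in the sequence. The sketch below is illustrative only. It scans a single strand, and it interprets "mutated sequences" as all single-base substitutions; both are assumptions, as the abstract does not specify either point. The function names are hypothetical.

```python
from itertools import product


def find_nullomers(sequence: str, k: int) -> list[str]:
    """Return all length-k DNA words absent from `sequence` (one strand only)."""
    seq = sequence.upper()
    present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    all_words = ("".join(p) for p in product("ACGT", repeat=k))
    return [word for word in all_words if word not in present]


def is_high_order(word: str, nullomers: set[str]) -> bool:
    """True if every single-base substitution of `word` is also a nullomer.

    Assumption: 'mutated sequences' means single-base substitutions."""
    for i, base in enumerate(word):
        for alt in "ACGT":
            if alt != base and word[:i] + alt + word[i + 1:] not in nullomers:
                return False
    return True


absent = set(find_nullomers("ACGT", k=2))
print(len(absent))                  # 13 of the 16 possible 2-mers are absent
print(is_high_order("TA", absent))  # True: AA, CA, GA, TC, TG, TT are all absent
```

    The number of candidate words grows as 4^k, so this exhaustive enumeration is only practical for short k.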

  5. Draft Genome Sequences of Klebsiella variicola Plant Isolates.

    Science.gov (United States)

    Martínez-Romero, Esperanza; Silva-Sanchez, Jesús; Barrios, Humberto; Rodríguez-Medina, Nadia; Martínez-Barnetche, Jesús; Téllez-Sosa, Juan; Gómez-Barreto, Rosa Elena; Garza-Ramos, Ulises

    2015-09-10

    Three endophytic Klebsiella variicola isolates (T29A, 3, and 6A2, obtained from sugar cane stem, maize shoots, and banana leaves, respectively) were used for whole-genome sequencing. Here, we report the draft genome sequences of circular chromosomes and plasmids. The genomes contain plant colonization and cellulase genes. This study will help toward understanding the genomic basis of K. variicola interaction with plant hosts.

  6. Single-Cell (Meta-)Genomics of a Dimorphic Candidatus Thiomargarita nelsonii Reveals Genomic Plasticity

    Directory of Open Access Journals (Sweden)

    Beverly E. Flood

    2016-05-01

    Full Text Available The genus Thiomargarita includes the world’s largest bacteria. But as uncultured organisms, their physiology, metabolism, and basis for their gigantism are not well understood. Thus a genomics approach, applied to a single Candidatus Thiomargarita nelsonii cell was employed to explore the genetic potential of one of these enigmatic giant bacteria. The Thiomargarita cell was obtained from an assemblage of budding Ca. T. nelsonii attached to a provannid gastropod shell from Hydrate Ridge, a methane seep offshore of Oregon, USA. Here we present a manually curated genome of Bud S10 resulting from a hybrid assembly of long Pacific Biosciences and short Illumina sequencing reads. With respect to inorganic carbon fixation and sulfur oxidation pathways, the Ca. T. nelsonii Hydrate Ridge Bud S10 genome was similar to marine sister taxa within the family Beggiatoaceae. However, the Bud S10 genome contains genes suggestive of the genetic potential for lithotrophic growth on arsenite and perhaps hydrogen. The genome also revealed that Bud S10 likely respires nitrate via two pathways: a complete denitrification pathway and a dissimilatory nitrate reduction to ammonia pathway. Both pathways have been predicted, but not previously fully elucidated, in the genomes of other large, vacuolated, sulfur-oxidizing bacteria. Surprisingly, the genome also had a high number of unusual features for a bacterium to include the largest number of metacaspases and introns ever reported in a bacterium. Also present, are a large number of other mobile genetic elements, such as insertion sequence transposable elements and miniature inverted-repeat transposable elements (MITEs). In some cases, mobile genetic elements disrupted key genes in metabolic pathways. For example, a MITE interrupts hupL, which encodes the large subunit of the hydrogenase in hydrogen oxidation. Moreover, we detected a group I intron in one of the most critical genes in the sulfur oxidation pathway, dsr

  7. Single-Cell (Meta-)Genomics of a Dimorphic Candidatus Thiomargarita nelsonii Reveals Genomic Plasticity

    Science.gov (United States)

    Flood, Beverly E.; Fliss, Palmer; Jones, Daniel S.; Dick, Gregory J.; Jain, Sunit; Kaster, Anne-Kristin; Winkel, Matthias; Mußmann, Marc; Bailey, Jake

    2016-01-01

    The genus Thiomargarita includes the world's largest bacteria. But as uncultured organisms, their physiology, metabolism, and basis for their gigantism are not well understood. Thus, a genomics approach, applied to a single Candidatus Thiomargarita nelsonii cell was employed to explore the genetic potential of one of these enigmatic giant bacteria. The Thiomargarita cell was obtained from an assemblage of budding Ca. T. nelsonii attached to a provannid gastropod shell from Hydrate Ridge, a methane seep offshore of Oregon, USA. Here we present a manually curated genome of Bud S10 resulting from a hybrid assembly of long Pacific Biosciences and short Illumina sequencing reads. With respect to inorganic carbon fixation and sulfur oxidation pathways, the Ca. T. nelsonii Hydrate Ridge Bud S10 genome was similar to marine sister taxa within the family Beggiatoaceae. However, the Bud S10 genome contains genes suggestive of the genetic potential for lithotrophic growth on arsenite and perhaps hydrogen. The genome also revealed that Bud S10 likely respires nitrate via two pathways: a complete denitrification pathway and a dissimilatory nitrate reduction to ammonia pathway. Both pathways have been predicted, but not previously fully elucidated, in the genomes of other large, vacuolated, sulfur-oxidizing bacteria. Surprisingly, the genome also had a high number of unusual features for a bacterium to include the largest number of metacaspases and introns ever reported in a bacterium. Also present, are a large number of other mobile genetic elements, such as insertion sequence (IS) transposable elements and miniature inverted-repeat transposable elements (MITEs). In some cases, mobile genetic elements disrupted key genes in metabolic pathways. For example, a MITE interrupts hupL, which encodes the large subunit of the hydrogenase in hydrogen oxidation. Moreover, we detected a group I intron in one of the most critical genes in the sulfur oxidation pathway, dsrA. 
The dsrA group

  8. Single-Cell (Meta-)Genomics of a Dimorphic Candidatus Thiomargarita nelsonii Reveals Genomic Plasticity.

    Science.gov (United States)

    Flood, Beverly E; Fliss, Palmer; Jones, Daniel S; Dick, Gregory J; Jain, Sunit; Kaster, Anne-Kristin; Winkel, Matthias; Mußmann, Marc; Bailey, Jake

    2016-01-01

    The genus Thiomargarita includes the world's largest bacteria. But as uncultured organisms, their physiology, metabolism, and basis for their gigantism are not well understood. Thus, a genomics approach, applied to a single Candidatus Thiomargarita nelsonii cell was employed to explore the genetic potential of one of these enigmatic giant bacteria. The Thiomargarita cell was obtained from an assemblage of budding Ca. T. nelsonii attached to a provannid gastropod shell from Hydrate Ridge, a methane seep offshore of Oregon, USA. Here we present a manually curated genome of Bud S10 resulting from a hybrid assembly of long Pacific Biosciences and short Illumina sequencing reads. With respect to inorganic carbon fixation and sulfur oxidation pathways, the Ca. T. nelsonii Hydrate Ridge Bud S10 genome was similar to marine sister taxa within the family Beggiatoaceae. However, the Bud S10 genome contains genes suggestive of the genetic potential for lithotrophic growth on arsenite and perhaps hydrogen. The genome also revealed that Bud S10 likely respires nitrate via two pathways: a complete denitrification pathway and a dissimilatory nitrate reduction to ammonia pathway. Both pathways have been predicted, but not previously fully elucidated, in the genomes of other large, vacuolated, sulfur-oxidizing bacteria. Surprisingly, the genome also had a high number of unusual features for a bacterium to include the largest number of metacaspases and introns ever reported in a bacterium. Also present, are a large number of other mobile genetic elements, such as insertion sequence (IS) transposable elements and miniature inverted-repeat transposable elements (MITEs). In some cases, mobile genetic elements disrupted key genes in metabolic pathways. For example, a MITE interrupts hupL, which encodes the large subunit of the hydrogenase in hydrogen oxidation. Moreover, we detected a group I intron in one of the most critical genes in the sulfur oxidation pathway, dsrA. 
The dsrA group

  9. Next-generation sequencing strategies for characterizing the turkey genome.

    Science.gov (United States)

    Dalloul, Rami A; Zimin, Aleksey V; Settlage, Robert E; Kim, Sungwon; Reed, Kent M

    2014-02-01

    The turkey genome sequencing project was initiated in 2008 and has relied primarily on next-generation sequencing (NGS) technologies. Our first efforts used a synergistic combination of 2 NGS platforms (Roche/454 and Illumina GAII), detailed bacterial artificial chromosome (BAC) maps, and unique assembly tools to sequence and assemble the genome of the domesticated turkey, Meleagris gallopavo. Since the first release in 2010, efforts to improve the genome assembly, gene annotation, and genomic analyses continue. The initial assembly build (2.01) represented about 89% of the genome sequence with 17X coverage depth (931 Mb). Sequence contigs were assigned to 30 of the 40 chromosomes with approximately 10% of the assembled sequence corresponding to unassigned chromosomes (ChrUn). The sequence has been refined through both genome-wide and area-focused sequencing, including shotgun and paired-end sequencing, and targeted sequencing of chromosomal regions with low or incomplete coverage. These additional efforts have improved the sequence assembly resulting in 2 subsequent genome builds of higher genome coverage (25X/Build3.0 and 30X/Build4.0) with a current sequence totaling 1,010 Mb. Further, BAC with end sequences assigned to the Z/W and MG18 (MHC) chromosomes, ChrUn, or not placed in the previous build were isolated, deeply sequenced (Hi-Seq), and incorporated into the latest build (5.0). To aid in the annotation and to generate a gene expression atlas of major tissues, a comprehensive set of RNA samples was collected at various developmental stages of female and male turkeys. Transcriptome sequencing data (using Illumina Hi-Seq) will provide information to enhance the final assembly and ultimately improve sequence annotation. The most current sequence covers more than 95% of the turkey genome and should yield a much improved gene level of annotation, making it a valuable resource for studying genetic variations underlying economically important traits in poultry.

  10. Reconstructing cancer genomes from paired-end sequencing data

    Directory of Open Access Journals (Sweden)

    Oesper Layla

    2012-04-01

    Full Text Available Abstract Background: A cancer genome is derived from the germline genome through a series of somatic mutations. Somatic structural variants - including duplications, deletions, inversions, translocations, and other rearrangements - result in a cancer genome that is a scrambling of intervals, or "blocks" of the germline genome sequence. We present an efficient algorithm for reconstructing the block organization of a cancer genome from paired-end DNA sequencing data. Results: By aligning paired reads from a cancer genome - and a matched germline genome, if available - to the human reference genome, we derive: (i) a partition of the reference genome into intervals; (ii) adjacencies between these intervals in the cancer genome; (iii) an estimated copy number for each interval. We formulate the Copy Number and Adjacency Genome Reconstruction Problem of determining the cancer genome as a sequence of the derived intervals that is consistent with the measured adjacencies and copy numbers. We design an efficient algorithm, called Paired-end Reconstruction of Genome Organization (PREGO), to solve this problem by reducing it to an optimization problem on an interval-adjacency graph constructed from the data. The solution to the optimization problem results in an Eulerian graph, containing an alternating Eulerian tour that corresponds to a cancer genome that is consistent with the sequencing data. We apply our algorithm to five ovarian cancer genomes that were sequenced as part of The Cancer Genome Atlas. We identify numerous rearrangements, or structural variants, in these genomes, analyze reciprocal vs. non-reciprocal rearrangements, and identify rearrangements consistent with known mechanisms of duplication such as tandem duplications and breakage/fusion/bridge (B/F/B) cycles. Conclusions: We demonstrate that PREGO efficiently identifies complex and biologically relevant rearrangements in cancer genome sequencing data.
An implementation of the PREGO algorithm is
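    The consistency condition in the abstract above (a genome written as a sequence of intervals must agree with the measured copy numbers and adjacencies) can be illustrated with a small checker. This is not the PREGO algorithm, which solves an optimization problem on an interval-adjacency graph; it is only a sketch that verifies a candidate answer. The encoding is an assumption made for illustration: blocks are signed integers (the sign gives orientation), and each block has a tail "t" and head "h" extremity.

```python
from collections import Counter


def extremities(block: int):
    """Return the (entry, exit) extremities traversed for a signed block."""
    b = abs(block)
    # A forward block is entered at its tail and left at its head;
    # an inverted block is traversed the other way round.
    return ((b, "t"), (b, "h")) if block > 0 else ((b, "h"), (b, "t"))


def summarize(genome):
    """Count block copies and adjacencies implied by a linear block sequence."""
    copies = Counter(abs(b) for b in genome)
    adjacencies = Counter()
    for left, right in zip(genome, genome[1:]):
        pair = frozenset((extremities(left)[1], extremities(right)[0]))
        adjacencies[pair] += 1
    return copies, adjacencies


def is_consistent(genome, copy_numbers, adjacency_counts) -> bool:
    """True if `genome` reproduces exactly the given copy numbers and adjacencies."""
    copies, adjacencies = summarize(genome)
    return copies == Counter(copy_numbers) and adjacencies == Counter(adjacency_counts)


# Toy data: a tandem duplication of block 2 in the reference order 1, 2, 3.
measured_copies = {1: 1, 2: 2, 3: 1}
measured_adjacencies = {
    frozenset({(1, "h"), (2, "t")}): 1,
    frozenset({(2, "h"), (2, "t")}): 1,  # head of 2 joined to tail of 2
    frozenset({(2, "h"), (3, "t")}): 1,
}
print(is_consistent([1, 2, 2, 3], measured_copies, measured_adjacencies))  # True
print(is_consistent([1, 2, 3], measured_copies, measured_adjacencies))     # False
```

    In real data the measurements are noisy estimates, which is why the abstract describes an optimization rather than an exact match of this kind.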

  11. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Lincoln D Stein

    2003-11-01

    Full Text Available The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C

  12. Functional noncoding sequences derived from SINEs in the mammalian genome.

    Science.gov (United States)

    Nishihara, Hidenori; Smit, Arian F A; Okada, Norihiro

    2006-07-01

    Recent comparative analyses of mammalian sequences have revealed that a large number of nonprotein-coding genomic regions are under strong selective constraint. Here, we report that some of these loci have been derived from a newly defined family of ancient SINEs (short interspersed repetitive elements). This is a surprising result, as SINEs and other transposable elements are commonly thought to be genomic parasites. We named the ancient SINE family AmnSINE1, for Amniota SINE1, because we found it to be present in mammals as well as in birds, and some copies predate the mammalian-bird split 310 million years ago (Mya). AmnSINE1 has a chimeric structure of a 5S rRNA and a tRNA-derived SINE, and is related to five tRNA-derived SINE families that we characterized here in the coelacanth, dogfish shark, hagfish, and amphioxus genomes. All of the newly described SINE families have a common central domain that is also shared by zebrafish SINE3, and we collectively name them the DeuSINE (Deuterostomia SINE) superfamily. Notably, of the approximately 1000 still identifiable copies of AmnSINE1 in the human genome, 105 correspond to loci phylogenetically highly conserved among mammalian orthologs. The conservation is strongest over the central domain. Thus, AmnSINE1 appears to be the best example of a transposable element of which a significant fraction of the copies have acquired genomic functionality.

  13. Next-generation sequencing and large genome assemblies

    OpenAIRE

    Henson, Joseph; Tischler, German; Ning, Zemin

    2012-01-01

    The next-generation sequencing (NGS) revolution has drastically reduced time and cost requirements for sequencing of large genomes, and also qualitatively changed the problem of assembly. This article reviews the state of the art in de novo genome assembly, paying particular attention to mammalian-sized genomes. The strengths and weaknesses of the main sequencing platforms are highlighted, leading to a discussion of assembly and the new challenges associated with NGS data. Current approaches ...

  14. Genome sequencing and annotation of Morganella sp. SA36

    Directory of Open Access Journals (Sweden)

    Samy Selim

    2015-12-01

    Full Text Available We report the draft genome sequence of Morganella sp. strain SA36, isolated from a water spring in the Aljouf region, Saudi Arabia. The draft genome size is 2,564,439 bp with a G + C content of 51.1% and contains 6 rRNA sequences (single copies of 5S, 16S & 23S rRNA). The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LDNQ00000000.

  15. Genome sequencing and annotation of Stenotrophomonas sp. SAM8

    Directory of Open Access Journals (Sweden)

    Samy Selim

    2015-12-01

    Full Text Available We report the draft genome sequence of Stenotrophomonas sp. strain SAM8, isolated from environmental water. The draft genome size is 3,665,538 bp with a G + C content of 67.2% and contains 6 rRNA sequences (single copies of 5S, 16S & 23S rRNA). The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LDAV00000000.

  16. Genome sequencing and annotation of Proteus sp. SAS71

    Directory of Open Access Journals (Sweden)

    Samy Selim

    2015-12-01

    Full Text Available We report the draft genome sequence of Proteus sp. strain SAS71, isolated from a water spring in the Aljouf region, Saudi Arabia. The draft genome size is 3,037,704 bp with a G + C content of 39.3% and contains 6 rRNA sequences (single copies of 5S, 16S & 23S rRNA). The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LDIU00000000.
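    The G + C content quoted in genome announcements such as the three above is the percentage of bases that are guanine or cytosine. A minimal sketch of that calculation follows; the function name `gc_content` is hypothetical, and the choice to ignore ambiguous bases such as N is an assumption, since conventions differ between tools.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among the unambiguous bases of `sequence`."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy example: 4 of 8 bases are G or C.
print(gc_content("GGCCAATT"))  # 50.0
```

    For a draft assembly, the same function would be applied to the concatenation of all contigs, or the G and C counts would be summed across contigs before dividing.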

  17. Complete Genome Sequence of Corynebacterium pseudotuberculosis Viscerotropic Strain N1

    Science.gov (United States)

    Portela, Ricardo W.; Sousa, Thiago J.; Rocha, Flávia; Pereira, Felipe L.; Dorella, Fernanda A.; Carvalho, Alex F.; Menezes, Nildo; Macedo, Eduardo S.; Moura-Costa, Lilia F.; Meyer, Roberto; Leal, Carlos A. G.; Figueiredo, Henrique C.; Azevedo, Vasco

    2016-01-01

    We present the complete genome sequence of Corynebacterium pseudotuberculosis strain N1. The sequencing was performed with the Ion Torrent Personal Genome Machine system. The genome is a circular chromosome with 2,337,845 bp, a G+C content of 52.85%, and a total of 2,045 coding sequences, 12 rRNAs, 49 tRNAs, and 58 pseudogenes. PMID:26823597
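    The G+C content quoted in the genome announcements above is the percentage of guanine and cytosine among the sequenced bases. The following is a minimal sketch of that calculation, not the tool any of these authors used; the function name and the toy sequence are illustrative assumptions, and ambiguous symbols such as N are simply ignored.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else
    (for example N in a draft assembly) is ignored. This convention
    is an assumption of the sketch.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Hypothetical toy sequence, not taken from any genome listed here.
toy = "ATGCGCGCTTAA"
print(f"G+C content: {gc_content(toy):.2f}%")
```

    For a real assembly the same function would be applied to the concatenated contigs or to the closed chromosome sequence.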

  18. Insights from 20 years of bacterial genome sequencing

    DEFF Research Database (Denmark)

    Land, Miriam; Hauser, Loren; Jun, Se-Ran

    2015-01-01

    the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative...... (close to 90 % of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident...

  19. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions.

    Directory of Open Access Journals (Sweden)

    Wei Li

    Full Text Available Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are cost-wise inefficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of genome termed Selected Target Regions (SeTRs). In our experiments, the SeTRs are covered by 99.73%~99.95% with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information.

  20. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions.

    Science.gov (United States)

    Wang, Yu; Li, Wei; Xia, Yingying; Wang, Chongzhi; Tang, Y Tom; Guo, Wenying; Li, Jinliang; Zhao, Xia; Sun, Yepeng; Hu, Juan; Zhen, Hefu; Zhang, Xiandong; Chen, Chao; Shi, Yujian; Li, Lin; Cao, Hongzhi; Du, Hongli; Li, Jian

    2014-01-01

    Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are cost-wise inefficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of genome termed Selected Target Regions (SeTRs). In our experiments, the SeTRs are covered by 99.73%~99.95% with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information.

  1. Complete genome sequence of Enterococcus faecium strain TX16 and comparative genomic analysis of Enterococcus faecium genomes

    Directory of Open Access Journals (Sweden)

    Qin Xiang

    2012-07-01

    Full Text Available Background: Enterococci are among the leading causes of hospital-acquired infections in the United States and Europe, with Enterococcus faecalis and Enterococcus faecium being the two most common species isolated from enterococcal infections. In the last decade, the proportion of enterococcal infections caused by E. faecium has steadily increased compared to other Enterococcus species. Although the underlying mechanism for the gradual replacement of E. faecalis by E. faecium in the hospital environment is not yet understood, many studies using genotyping and phylogenetic analysis have shown the emergence of a globally dispersed polyclonal subcluster of E. faecium strains in clinical environments. Systematic study of the molecular epidemiology and pathogenesis of E. faecium has been hindered by the lack of closed, complete E. faecium genomes that can be used as references. Results: In this study, we report the complete genome sequence of the E. faecium strain TX16, also known as DO, which belongs to multilocus sequence type (ST) 18, and was the first E. faecium strain ever sequenced. Whole genome comparison of the TX16 genome with 21 E. faecium draft genomes confirmed that most clinical, outbreak, and hospital-associated (HA) strains (including STs 16, 17, 18, and 78), in addition to strains of non-hospital origin, group in the same clade (referred to as the HA clade) and are evolutionarily considerably more closely related to each other by phylogenetic and gene content similarity analyses than to isolates in the community-associated (CA) clade, with approximately a 3–4% average nucleotide sequence difference between the two clades at the core genome level. Our study also revealed that many genomic loci in the TX16 genome are unique to the HA clade. 380 ORFs in TX16 are HA-clade specific and antibiotic resistance genes are enriched in HA-clade strains. Mobile elements such as IS16 and transposons were also found almost exclusively in HA strains

  2. A taste of pineapple evolution through genome sequencing.

    Science.gov (United States)

    Xu, Qing; Liu, Zhong-Jian

    2015-12-01

    The genome sequence assembly of the highly heterozygous Ananas comosus and its varieties is an impressive technical achievement. The sequence opens the door to a greater understanding of pineapple morphology and evolution.

  3. Comparative genomics reveals insights into avian genome evolution and adaptation

    DEFF Research Database (Denmark)

    Zhang, Guojie; Li, Cai; Li, Qiye

    2014-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, ...

  4. A Snapshot of the Emerging Tomato Genome Sequence

    Directory of Open Access Journals (Sweden)

    Lukas A. Mueller

    2009-03-01

    Full Text Available The genome of tomato (Solanum lycopersicum L.) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy, and the United States) as part of the larger “International Solanaceae Genome Project (SOL): Systems Approach to Diversity and Adaptation” initiative. The tomato genome sequencing project uses an ordered bacterial artificial chromosome (BAC) approach to generate a high-quality tomato euchromatic genome sequence for use as a reference genome for the Solanaceae and euasterids. Sequence is deposited at GenBank and at the SOL Genomics Network (SGN). Currently, there are around 1000 BACs finished or in progress, representing more than a third of the projected euchromatic portion of the genome. An annotation effort is also underway by the International Tomato Annotation Group. The expected number of genes in the euchromatin is ∼40,000, based on an estimate from a preliminary annotation of 11% of finished sequence. Here, we present this first snapshot of the emerging tomato genome and its annotation, a short comparison with potato (Solanum tuberosum L.) sequence data, and the tools available for researchers to exploit this new resource. In the future, whole-genome shotgun techniques will be combined with the BAC-by-BAC approach to cover the entire tomato genome. The high-quality reference euchromatic tomato sequence is expected to be near completion by 2010.

  5. Whole-Genome Sequence Assembly for Mammalian Genomes: Arachne 2

    OpenAIRE

    Jaffe, David B.; Butler, Jonathan; Gnerre, Sante; Mauceli, Evan; Lindblad-Toh, Kerstin; Jill P. Mesirov; Michael C Zody; Lander, Eric S.

    2003-01-01

    We previously described the whole-genome assembly program Arachne, presenting assemblies of simulated data for small to mid-sized genomes. Here we describe algorithmic adaptations to the program, allowing for assembly of mammalian-size genomes, and also improving the assembly of smaller genomes. Three principal changes were simultaneously made and applied to the assembly of the mouse genome, during a six-month period of development: (1) Supercontigs (scaffolds) were iteratively broken and rej...

  6. Genomic Resources for Water Yam (Dioscorea alata L.): Analyses of EST-Sequences, De Novo Sequencing and GBS Libraries.

    Science.gov (United States)

    Saski, Christopher A; Bhattacharjee, Ranjana; Scheffler, Brian E; Asiedu, Robert

    2015-01-01

    The reducing cost and rapid progress in next-generation sequencing techniques coupled with high performance computational approaches have resulted in large-scale discovery of advanced genomic resources in several model and non-model plant species. Yam (Dioscorea spp.) is a major food and cash crop in many countries but research efforts have been limited to understand the genetics and generate genomic information for the crop. The availability of a large number of genomic resources including genome-wide molecular markers will accelerate the breeding efforts and application of genomic selection in yams. In the present study, several methods including expressed sequence tags (EST)-sequencing, de novo sequencing, and genotyping-by-sequencing (GBS) profiles on two yam (Dioscorea alata L.) genotypes (TDa 95/00328 and TDa 95-310) was performed to generate genomic resources for use in its improvement programs. This includes a comprehensive set of EST-SSRs, genomic SSRs, whole genome SNPs, and reduced representation SNPs. A total of 1,152 EST-SSRs were developed from >40,000 EST-sequences generated from the two genotypes. A set of 388 EST-SSRs were validated as polymorphic showing a polymorphism rate of 34% when tested on two diverse parents targeted for anthracnose disease. In addition, approximately 40X de novo whole genome sequence coverage was generated for each of the two genotypes, and a total of 18,584 and 15,952 genomic SSRs were identified for TDa 95/00328 and TDa 95-310, respectively. A custom made pipeline resulted in the selection of 573 genomic SSRs common across the two genotypes, of which only eight failed, 478 being polymorphic and 62 monomorphic indicating a polymorphic rate of 83.5%. Additionally, 288,505 high quality SNPs were also identified between these two genotypes. Genotyping by sequencing reads on these two genotypes also revealed 36,790 overlapping SNP positions that are distributed throughout the genome. 
Our efforts in using different approaches

  7. Genomic Resources for Water Yam (Dioscorea alata L.): Analyses of EST-Sequences, De Novo Sequencing and GBS Libraries.

    Directory of Open Access Journals (Sweden)

    Christopher A Saski

    Full Text Available The reducing cost and rapid progress in next-generation sequencing techniques coupled with high performance computational approaches have resulted in large-scale discovery of advanced genomic resources in several model and non-model plant species. Yam (Dioscorea spp.) is a major food and cash crop in many countries but research efforts have been limited to understand the genetics and generate genomic information for the crop. The availability of a large number of genomic resources including genome-wide molecular markers will accelerate the breeding efforts and application of genomic selection in yams. In the present study, several methods including expressed sequence tags (EST)-sequencing, de novo sequencing, and genotyping-by-sequencing (GBS) profiles on two yam (Dioscorea alata L.) genotypes (TDa 95/00328 and TDa 95-310) was performed to generate genomic resources for use in its improvement programs. This includes a comprehensive set of EST-SSRs, genomic SSRs, whole genome SNPs, and reduced representation SNPs. A total of 1,152 EST-SSRs were developed from >40,000 EST-sequences generated from the two genotypes. A set of 388 EST-SSRs were validated as polymorphic showing a polymorphism rate of 34% when tested on two diverse parents targeted for anthracnose disease. In addition, approximately 40X de novo whole genome sequence coverage was generated for each of the two genotypes, and a total of 18,584 and 15,952 genomic SSRs were identified for TDa 95/00328 and TDa 95-310, respectively. A custom made pipeline resulted in the selection of 573 genomic SSRs common across the two genotypes, of which only eight failed, 478 being polymorphic and 62 monomorphic indicating a polymorphic rate of 83.5%. Additionally, 288,505 high quality SNPs were also identified between these two genotypes. Genotyping by sequencing reads on these two genotypes also revealed 36,790 overlapping SNP positions that are distributed throughout the genome. Our efforts in using

  8. Insights from twenty years of bacterial genome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Jun, Se Ran [ORNL; Nookaew, Intawat [ORNL; Leuze, Michael Rex [ORNL; Ahn, Tae-Hyuk [ORNL; Karpinets, Tatiana V [ORNL; Lund, Ole [Technical University of Denmark; Kora, Guruprasad H [ORNL; Wassenaar, Trudy [Molecular Microbiology & Genomics Consultants, Zotzenheim, Germany; Poudel, Suresh [ORNL; Ussery, David W [ORNL

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90 % of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. 
Why do we care about bacterial genome

  9. A mitochondrial genome sequence of the Tibetan antelope (Pantholops hodgsonii)

    DEFF Research Database (Denmark)

    Xu, Shu Qing; Yang, Ying Zhong; Zhou, Jun

    2005-01-01

    To investigate genetic mechanisms of high altitude adaptations of native mammals on the Tibetan Plateau, we compared mitochondrial sequences of the endangered Pantholops hodgsonii with its lowland distant relatives Ovis aries and Capra hircus, as well as other mammals. The complete mitochondrial...... genome of P. hodgsonii (16,498 bp) revealed a similar gene order as of other mammals. Because of tandem duplications, the control region of P. hodgsonii mitochondrial genome is shorter than those of O. aries and C. hircus, but longer than those of Bos species. Phylogenetic analysis based on alignments...... that the COXI (cytochrome c oxidase subunit I) gene was under positive selection in P. hodgsonii and Bos grunniens. Considering the same climates and environments shared by these two mammalian species, we proposed that the mitochondrial COXI gene is probably relevant for these native mammals to adapt the high...

  10. Genome-wide BAC-end sequencing of Cucumis melo using two BAC libraries

    Directory of Open Access Journals (Sweden)

    Puigdomènech Pere

    2010-11-01

    Full Text Available Background: Although melon (Cucumis melo L.) is an economically important fruit crop, no genome-wide sequence information is openly available at the current time. We therefore sequenced BAC-ends representing a total of 33,024 clones, half of them from a previously described melon BAC library generated with restriction endonucleases and the remainder from a new random-shear BAC library. Results: We generated a total of 47,140 high-quality BAC-end sequences (BES), 91.7% of which were paired-BES. Both libraries were assembled independently and then cross-assembled to obtain a final set of 33,372 non-redundant, high-quality sequences. These were grouped into 6,411 contigs (4.5 Mb) and 26,961 non-assembled BES (14.4 Mb), representing ~4.2% of the melon genome. The sequences were used to screen genomic databases, identifying 7,198 simple sequence repeats (corresponding to one microsatellite every 2.6 kb) and 2,484 additional repeats of which 95.9% represented transposable elements. The sequences were also used to screen expressed sequence tag (EST) databases, revealing 11,372 BES that were homologous to ESTs. This suggests that ~30% of the melon genome consists of coding DNA. We observed regions of microsynteny between melon paired-BES and six other dicotyledonous plant genomes. Conclusion: The analysis of nearly 50,000 BES from two complementary genomic libraries covered ~4.2% of the melon genome, providing insight into properties such as microsatellite and transposable element distribution, and the percentage of coding DNA. The observed synteny between melon paired-BES and six other plant genomes showed that useful comparative genomic data can be derived through large scale BAC-end sequencing by anchoring a small proportion of the melon genome to other sequenced genomes.
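    The figures in the melon BAC-end abstract above can be cross-checked with simple arithmetic. The sketch below only recombines numbers stated in the abstract; the variable names are illustrative, and the implied genome size is a back-of-the-envelope consequence of the "~4.2%" figure, not a value the authors report here.

```python
# Values copied from the abstract.
contigs = 6_411             # assembled contigs
singletons = 26_961         # non-assembled BES
contig_mb = 4.5             # Mb covered by contigs
singleton_mb = 14.4         # Mb covered by non-assembled BES
ssr_count = 7_198           # simple sequence repeats found
genome_fraction = 0.042     # "~4.2% of the melon genome"

# Contigs plus singletons should give the non-redundant sequence set.
nonredundant_sequences = contigs + singletons   # 33,372 as stated

# Total sampled sequence and the resulting microsatellite spacing.
sampled_mb = contig_mb + singleton_mb
kb_per_ssr = sampled_mb * 1000 / ssr_count      # about 2.6 kb per SSR

# Genome size implied by the sampled fraction (rough estimate only).
implied_genome_mb = sampled_mb / genome_fraction

print(nonredundant_sequences)
print(round(sampled_mb, 1), round(kb_per_ssr, 1), round(implied_genome_mb))
```

    The computed spacing agrees with the stated "one microsatellite every 2.6 kb", and the sampled 18.9 Mb at ~4.2% coverage corresponds to a genome on the order of 450 Mb.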

  11. Evolutionary growth process of highly conserved sequences in vertebrate genomes.

    Science.gov (United States)

    Ishibashi, Minaka; Noda, Akiko Ogura; Sakate, Ryuichi; Imanishi, Tadashi

    2012-08-01

    Genome sequence comparison between evolutionarily distant species revealed ultraconserved elements (UCEs) among mammals under strong purifying selection. Most of them were also conserved among vertebrates. Because they tend to be located in the flanking regions of developmental genes, they would have fundamental roles in creating vertebrate body plans. However, the evolutionary origin and selection mechanism of these UCEs remain unclear. Here we report that UCEs arose in primitive vertebrates, and gradually grew in vertebrate evolution. We searched for UCEs in two teleost fishes, Tetraodon nigroviridis and Oryzias latipes, and found 554 UCEs with 100% identity over 100 bps. Comparison of teleost and mammalian UCEs revealed 43 pairs of common, jawed-vertebrate UCEs (jUCE) with high sequence identities, ranging from 83.1% to 99.2%. Ten of them retain lower similarities to the Petromyzon marinus genome, and the substitution rates of four non-exonic jUCEs were reduced after the teleost-mammal divergence, suggesting that robust conservation had been acquired in the jawed vertebrate lineage. Our results indicate that prototypical UCEs originated before the divergence of jawed and jawless vertebrates and have been frozen as perfect conserved sequences in the jawed vertebrate lineage. In addition, our comparative sequence analyses of UCEs and neighboring regions resulted in a discovery of lineage-specific conserved sequences. They were added progressively to prototypical UCEs, suggesting step-wise acquisition of novel regulatory roles. Our results indicate that conserved non-coding elements (CNEs) consist of blocks with distinct evolutionary history, each having been frozen since different evolutionary era along the vertebrate lineage. Copyright © 2012 Elsevier B.V. All rights reserved.
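    The criterion used above, "100% identity over 100 bps", amounts to finding exact shared substrings of at least 100 bases between two genomes. The following is a small sketch of that idea using k-mer seeding and extension; it is not the authors' pipeline, which the abstract does not describe, and the function name, the in-memory index, and the toy sequences are assumptions suited only to short inputs.

```python
from collections import defaultdict


def shared_exact_matches(a: str, b: str, min_len: int = 100):
    """Return maximal exact matches as (start_a, start_b, length) tuples.

    Every window of length min_len in `a` is indexed; each window of `b`
    that hits the index is extended to the right. Seeds that could be
    extended to the left are skipped, because the seed one position
    earlier already reports the same match.
    """
    index = defaultdict(list)
    for i in range(len(a) - min_len + 1):
        index[a[i:i + min_len]].append(i)

    matches = []
    for j in range(len(b) - min_len + 1):
        for i in index.get(b[j:j + min_len], ()):
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue  # not the left end of a maximal match
            length = min_len
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            matches.append((i, j, length))
    return matches


# Hypothetical toy example with a 12-base shared core and min_len=10.
core = "ACGTTGCATGCA"
seq_a = "TTTT" + core + "GG"
seq_b = "CC" + core + "AAAA"
print(shared_exact_matches(seq_a, seq_b, min_len=10))
```

    Genome-scale searches would need a more memory-efficient index (for example a suffix array) and handling of the reverse strand, both omitted here.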

  12. The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    Directory of Open Access Journals (Sweden)

    Nicholas R Thomson

    2006-12-01

    Full Text Available The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype 0:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common

  13. Draft Genome Sequence of Carnobacterium divergens V41, a Bacteriocin-Producing Strain

    Science.gov (United States)

    Remenant, Benoît; Borges, Frédéric; Cailliez-Grimal, Catherine; Revol-Junelles, Anne-Marie; Marché, Laurent; Lajus, Aurélie; Médigue, Claudine; Pilet, Marie-France; Prévost, Hervé

    2016-01-01

    In this study, we present the draft genome sequence of Carnobacterium divergens V41. This strain was previously reported as producing divercin V41, a bacteriocin of interest for food biopreservation. Its genome also revealed the presence of a gene cluster putatively involved in polyketide production, which is unique in lactic acid bacteria. PMID:27738030

  14. Complete Genome Sequence of Streptococcus iniae 89353, a Virulent Strain Isolated from Diseased Tilapia in Taiwan

    Science.gov (United States)

    Wu, Sheng-Han; Chen, Chun-Yao; Huang, Chang-Wen; Lu, Jenn-Kan; Chou, Hsin-Yiu

    2017-01-01

    Streptococcus iniae 89353 is a virulent strain isolated from diseased tilapia in Taiwan. The full-genome sequence of S. iniae 89353 is 2,098,647 bp. The revealed genome information will be beneficial for identification and understanding of potential virulence genes of Streptococcus iniae and possible immunogens for vaccine development against streptococcosis. PMID:28126946

  15. Draft Genome Sequence of Ochroconis constricta UM 578, Isolated from Human Skin Scraping.

    Science.gov (United States)

    Chan, Chai Ling; Yew, Su Mei; Na, Shiang Ling; Tan, Yung-Chie; Lee, Kok Wei; Yee, Wai-Yan; Ngeow, Yun Fong; Ng, Kee Peng

    2014-04-17

    Ochroconis constricta is a soilborne dematiaceous fungus that has never been reported to be associated with human infection. Here we report the first draft genome sequence of strain UM 578, isolated from human skin scraping. The genomic information revealed will contribute to a better understanding of this species.

  16. Draft Genome Sequence of Marine-Derived Aeromonas caviae CHZ306, a Potential Chitinase Producer Strain

    Science.gov (United States)

    Zimpel, Cristina Kraemer; Guimaraes, Ana Marcia Sa; Pessoa, Adalberto; Rivera, Irma Nelly Gutierrez

    2016-01-01

    We report here a draft genome sequence of Aeromonas caviae CHZ306, a marine-derived bacterium with the ability to hydrolyze chitin and express high levels of chitinases. The assembly resulted in 65 scaffolds with approximately 4.78 Mb. Genomic analysis revealed different genes encoding chitin-degrading enzymes that can be used for chitin derivative production. PMID:27856589

  17. Sequence imputation of HPV16 genomes for genetic association studies.

    Directory of Open Access Journals (Sweden)

    Benjamin Smith

    Full Text Available BACKGROUND: Human Papillomavirus type 16 (HPV16) causes over half of all cervical cancer and some HPV16 variants are more oncogenic than others. The genetic basis for the extraordinary oncogenic properties of HPV16 compared to other HPVs is unknown. In addition, we neither know which nucleotides vary across and within HPV types and lineages, nor which of the single nucleotide polymorphisms (SNPs) determine oncogenicity. METHODS: A reference set of 62 HPV16 complete genome sequences was established and used to examine patterns of evolutionary relatedness amongst variants using a pairwise identity heatmap and HPV16 phylogeny. A BLAST-based algorithm was developed to impute complete genome data from partial sequence information using the reference database. To interrogate the oncogenic risk of determined and imputed HPV16 SNPs, odds-ratios for each SNP were calculated in a case-control viral genome-wide association study (VWAS) using biopsy confirmed high-grade cervix neoplasia and self-limited HPV16 infections from Guanacaste, Costa Rica. RESULTS: HPV16 variants display evolutionarily stable lineages that contain conserved diagnostic SNPs. The imputation algorithm indicated that an average of 97.5±1.03% of SNPs could be accurately imputed. The VWAS revealed specific HPV16 viral SNPs associated with variant lineages and elevated odds ratios; however, individual causal SNPs could not be distinguished with certainty due to the nature of HPV evolution. CONCLUSIONS: Conserved and lineage-specific SNPs can be imputed with a high degree of accuracy from limited viral polymorphic data due to the lack of recombination and the stochastic mechanism of variation accumulation in the HPV genome. However, to determine the role of novel variants or non-lineage-specific SNPs by VWAS will require direct sequence analysis. The investigation of patterns of genetic variation and the identification of diagnostic SNPs for lineages of HPV16 variants provides a valuable
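    The per-SNP odds ratios mentioned in the METHODS above come from a 2x2 table of allele counts in cases and controls. The sketch below shows one common way to compute such an odds ratio with a Woolf (log-scale) 95% confidence interval; the abstract does not say which estimator or correction the authors used, so the Haldane-Anscombe 0.5 correction for empty cells, the function name, and the counts are all assumptions for illustration.

```python
import math


def odds_ratio(case_alt: int, case_ref: int, ctrl_alt: int, ctrl_ref: int):
    """Return (odds ratio, CI lower bound, CI upper bound) for a 2x2 table.

    case_alt / case_ref: cases carrying the variant / reference allele.
    ctrl_alt / ctrl_ref: controls carrying the variant / reference allele.
    If any cell is zero, 0.5 is added to every cell (Haldane-Anscombe
    correction, an assumption of this sketch).
    """
    cells = [case_alt, case_ref, ctrl_alt, ctrl_ref]
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) from Woolf's formula.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se)
    upper = math.exp(math.log(estimate) + 1.96 * se)
    return estimate, lower, upper


# Hypothetical counts, not data from the study.
estimate, lower, upper = odds_ratio(30, 70, 10, 90)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

    In a genome-wide scan this calculation would be repeated for every variable position, with a multiple-testing adjustment that is outside the scope of this sketch.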

  18. High throughput sequencing reveals a novel fabavirus infecting sweet cherry.

    Science.gov (United States)

    Villamor, D E V; Pillai, S S; Eastwell, K C

    2017-03-01

    The genus Fabavirus currently consists of five species represented by viruses that infect a wide range of hosts but none reported from temperate climate fruit trees. A virus with genomic features resembling fabaviruses (tentatively named Prunus virus F, PrVF) was revealed by high throughput sequencing of extracts from a sweet cherry tree (Prunus avium). PrVF was subsequently shown to be graft transmissible and further identified in three other non-symptomatic Prunus spp. from different geographical locations. Two genetic variants of RNA1 and RNA2 coexisted in the same samples. RNA1 consisted of 6,165 and 6,163 nucleotides, and RNA2 consisted of 3,622 and 3,468 nucleotides.

  19. Genome Project Standards in a New Era of Sequencing

    Energy Technology Data Exchange (ETDEWEB)

    GSC Consortia; HMP Jumpstart Consortia; Chain, P. S. G.; Grafham, D. V.; Fulton, R. S.; FitzGerald, M. G.; Hostetler, J.; Muzny, D.; Detter, J. C.; Ali, J.; Birren, B.; Bruce, D. C.; Buhay, C.; Cole, J. R.; Ding, Y.; Dugan, S.; Field, D.; Garrity, G. M.; Gibbs, R.; Graves, T.; Han, C. S.; Harrison, S. H.; Highlander, S.; Hugenholtz, P.; Khouri, H. M.; Kodira, C. D.; Kolker, E.; Kyrpides, N. C.; Lang, D.; Lapidus, A.; Malfatti, S. A.; Markowitz, V.; Metha, T.; Nelson, K. E.; Parkhill, J.; Pitluck, S.; Qin, X.; Read, T. D.; Schmutz, J.; Sozhamannan, S.; Strausberg, R.; Sutton, G.; Thomson, N. R.; Tiedje, J. M.; Weinstock, G.; Wollam, A.

    2009-06-01

    For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft'; however, these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1), hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences. The following represents a set of six community-defined categories of genome sequence standards that better

  20. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    LENUS (Irish Health Repository)

    Edwards, Ceiridwen J

    2010-01-01

    BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically verified and exceptionally well-preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738 ± 68 calibrated years before present. These extracts were prepared for both Sanger and next-generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously identified