WorldWideScience

Sample records for chloroplast genome sequence

  1. The complete chloroplast genome sequence of Zanthoxylum piperitum.

    Science.gov (United States)

    Lee, Jonghoon; Lee, Hyeon Ju; Kim, Kyunghee; Lee, Sang-Choon; Sung, Sang Hyun; Yang, Tae-Jin

    2016-09-01

    The complete chloroplast genome sequence of Zanthoxylum piperitum, a plant species with useful aromatic oils in the family Rutaceae, was generated in this study by de novo assembly of whole-genome sequence data. The chloroplast genome was 158 154 bp in length with a typical quadripartite structure containing a pair of inverted repeats of 27 644 bp, separated by large single-copy and small single-copy regions of 85 340 bp and 17 526 bp, respectively. The chloroplast genome harbored 112 genes, consisting of 78 protein-coding genes, 30 tRNA genes, and 4 rRNA genes. Phylogenetic analysis of the complete chloroplast genome sequences with those of known relatives revealed that Z. piperitum is most closely related to the Citrus species. PMID:26260183
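
    The quadripartite structure described in records like this one implies a simple consistency check: the total length should equal the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The sketch below illustrates that arithmetic with the Z. piperitum figures; the function name `quadripartite_length` is hypothetical and is not taken from any of the cited studies.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported in the Zanthoxylum piperitum abstract.
total = quadripartite_length(lsc=85_340, ssc=17_526, ir=27_644)
print(total)  # 158154, matching the reported genome length
```

    The same check can be applied to any record that reports all four numbers; a mismatch may indicate a typographical error in the abstract or a different convention for reporting the repeat length.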

  2. Complete Chloroplast Genome Sequence of Phagomixotrophic Green Alga Cymbomonas tetramitiformis

    Science.gov (United States)

    Paasch, Amber E.; Graham, Linda E.; Kim, Eunsoo

    2016-01-01

    We report here the complete chloroplast genome sequence of Cymbomonas tetramitiformis strain PLY262, which is a prasinophycean green alga that retains a phagomixotrophic mode of nutrition. The genome is 84,524 bp in length, with a G+C content of 37%, and contains 3 rRNAs, 26 tRNAs, and 76 protein-coding genes. PMID:27313295

  3. The complete chloroplast genome sequence of Alocasia macrorrhizos.

    Science.gov (United States)

    Wang, Bin; Han, Limin

    2016-09-01

    The complete chloroplast sequence of Alocasia macrorrhizos is 154 995 bp in length, containing a pair of inverted repeats of 25 944 bp separated by a large single-copy (LSC) region and a small single-copy (SSC) region of 87 366 bp and 15 741 bp, respectively. The chloroplast genome encodes 132 predicted functional genes, including 87 protein-coding genes, four ribosomal RNA genes, and 37 transfer RNA genes, 18 of which are duplicated in the inverted repeat regions. Among these genes, 16 contain a single intron and two contain two introns. A maximum-likelihood phylogenetic analysis using complete chloroplast genomes indicated that A. macrorrhizos does not belong to the Araceae family, which implies that A. macrorrhizos is distant from the species in the Araceae family. PMID:26258514

  4. The complete chloroplast genome sequence of Curcuma flaviflora (Curcuma).

    Science.gov (United States)

    Zhang, Yan; Deng, Jiabin; Li, Yangyi; Gao, Gang; Ding, Chunbang; Zhang, Li; Zhou, Yonghong; Yang, Ruiwu

    2016-09-01

    The complete chloroplast (cp) genome of Curcuma flaviflora, a medicinal plant in Southeast Asia, was sequenced. The genome was 160 478 bp in length, with 36.3% GC content. A pair of inverted repeats (IRs) of 26 946 bp was separated by a large single-copy (LSC) region of 88 008 bp and a small single-copy (SSC) region of 18 578 bp. The cp genome contained 132 annotated genes, including 79 protein-coding genes, 30 tRNA genes, and four rRNA genes. Nineteen of these genes were duplicated in the inverted repeat regions. PMID:26367332

  5. The complete chloroplast genome sequence of Hibiscus syriacus.

    Science.gov (United States)

    Kwon, Hae-Yun; Kim, Joon-Hyeok; Kim, Sea-Hyun; Park, Ji-Min; Lee, Hyoshin

    2016-09-01

    The complete chloroplast genome sequence of Hibiscus syriacus L. is presented in this study. The genome is 161 019 bp in length, with a typical circular structure containing a pair of inverted repeats of 25 745 bp separated by a large single-copy region and a small single-copy region of 89 698 bp and 19 831 bp, respectively. The overall GC content is 36.8%. One hundred and fourteen genes were annotated, including 81 protein-coding genes, 4 ribosomal RNA genes and 29 transfer RNA genes. PMID:26357910

  6. The complete chloroplast genome sequence of Abies nephrolepis (Pinaceae: Abietoideae)

    Directory of Open Access Journals (Sweden)

    Dong-Keun Yi

    2016-06-01

    The plant chloroplast (cp) genome has maintained a relatively conserved structure and gene content throughout evolution. Cp genome sequences have been used widely for resolving evolutionary and phylogenetic issues at various taxonomic levels of plants. Here, we report the complete cp genome of Abies nephrolepis. The A. nephrolepis cp genome is 121,336 base pairs (bp) in length, including a pair of short inverted repeat regions (IRa and IRb) of 139 bp each, separated by a small single-copy (SSC) region of 54,323 bp and a large single-copy (LSC) region of 66,735 bp. It contains 114 genes: 68 protein-coding genes, 35 tRNA genes, four rRNA genes, six open reading frames, and one pseudogene. Seventeen repeat units and 64 simple sequence repeats (SSRs) were detected in the A. nephrolepis cp genome. Large IR sequences (1186 bp) are located at the 42-kb inversion points. The A. nephrolepis cp genome is identical to that of Abies koreana, a closely related taxon. Pairwise comparison between the two cp genomes revealed 140 polymorphic sites in each. The complete cp genome sequence of A. nephrolepis has significant potential to provide information on the evolutionary pattern of Abietoideae and valuable data for the development of DNA markers for easy identification and classification.

  7. The complete chloroplast genome sequence of Anoectochilus emeiensis.

    Science.gov (United States)

    Zhu, Shuying; Niu, Zhitao; Yan, Wenjin; Xue, Qingyun; Ding, Xiaoyu

    2016-09-01

    The complete chloroplast (cp) genome sequence of Anoectochilus emeiensis, an extremely endangered medicinal plant with important economic value, was determined and characterized. The genome size was 152 650 bp, containing a pair of inverted repeats (IRs) (26 319 bp) separated by a large single-copy (LSC) region (82 670 bp) and a small single-copy (SSC) region (17 342 bp). The cpDNA of A. emeiensis contained 113 unique genes, including 79 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Among them, 18 genes contained one or two introns. The overall AT content of the genome was 63.1%. PMID:26403535
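
    Several records in this list report base composition as either GC or AT content; the two are complementary, so an AT content of 63.1% corresponds to a GC content of 36.9%. A minimal sketch of how such a percentage could be computed from a sequence string follows. The helper `gc_percent` and the toy sequence are illustrative assumptions, not data or code from the cited studies, and ambiguous bases are simply ignored here.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / (gc + at)


toy = "ATGCATATTA"  # toy sequence for illustration only, not real data
print(gc_percent(toy))  # 20.0

# GC content implied by the 63.1% AT content reported for A. emeiensis.
print(round(100.0 - 63.1, 1))  # 36.9
```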

  8. The complete chloroplast genome sequence of Euonymus japonicus (Celastraceae).

    Science.gov (United States)

    Choi, Kyoung Su; Park, SeonJoo

    2016-09-01

    The complete chloroplast (cp) genome sequence of Euonymus japonicus, the first sequenced in the genus Euonymus, is reported in this study. The total length was 157 637 bp, containing a pair of 26 678 bp inverted repeat (IR) regions separated by a small single-copy (SSC) region and a large single-copy (LSC) region of 18 340 bp and 85 941 bp, respectively. This genome contains 107 unique genes, including 74 coding genes, four rRNA genes, and 29 tRNA genes. Seventeen genes contain introns, of which three (clpP, ycf3, and rps12) include two introns. The maximum likelihood (ML) phylogenetic analysis revealed that E. japonicus was closely related to Manihot and Populus. PMID:26407184

  9. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2012-11-01

    Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the chloroplast genome.
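
    The coverage percentages quoted above describe the fraction of reference positions spanned by at least one aligned read. The abstract does not say how that fraction was computed, so the following is only a sketch of one common way to do it: merge the aligned intervals and divide the covered length by the reference length. The function name `covered_fraction`, the 0-based half-open interval convention, and the toy intervals are all assumptions made for illustration.

```python
def covered_fraction(intervals, ref_length):
    """Fraction of reference positions covered by at least one interval.

    intervals: iterable of (start, end) pairs, 0-based and half-open.
    """
    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in sorted(intervals):
        start = max(start, current_end, 0)  # skip the part already counted
        end = min(end, ref_length)          # clip to the reference
        if end > start:
            covered += end - start
            current_end = end
    return covered / ref_length


# Toy alignments on a 100-bp reference: two overlapping reads and one isolated read.
toy_alignments = [(0, 40), (30, 70), (90, 100)]
print(covered_fraction(toy_alignments, 100))  # 0.8
```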

  10. Chloroplast Genome Sequence of the Moss Tortula ruralis: Gene Content and Structural Arrangement Relative to Other Green Plant Chloroplast Genomes

    Science.gov (United States)

    Tortula ruralis, a widely distributed moss species in the family Pottiaceae, is increasingly being used as a model organism for the study of desiccation tolerance and mechanisms of cellular repair. In this paper, we present the chloroplast genome sequence of Tortula ruralis, only the second publishe...

  11. Whole genome sequencing of enriched chloroplast DNA using the Illumina GAII platform

    Directory of Open Access Journals (Sweden)

    Shepherd Lara D

    2010-09-01

    Background: Complete chloroplast genome sequences provide a valuable source of molecular markers for studies in molecular ecology and evolution of plants. To obtain complete genome sequences, recent studies have made use of the polymerase chain reaction to amplify overlapping fragments from conserved gene loci. However, this approach is time consuming and can be more difficult to implement where gene organisation differs among plants. An alternative approach is to first isolate chloroplasts and then use the capacity of high-throughput sequencing to obtain complete genome sequences. We report our findings from studies of the latter approach, which used a simple chloroplast isolation procedure, multiply-primed rolling circle amplification of chloroplast DNA, Illumina Genome Analyzer II sequencing, and de novo assembly of paired-end sequence reads. Results: A modified rapid chloroplast isolation protocol was used to obtain plant DNA that was enriched for chloroplast DNA, but nevertheless contained nuclear and mitochondrial DNA. Multiply-primed rolling circle amplification of this mixed template produced sufficient quantities of chloroplast DNA, even when the amount of starting material was small, and improved the template quality for Illumina Genome Analyzer II (hereafter Illumina GAII) sequencing. We demonstrate, using independent samples of karaka (Corynocarpus laevigatus), that there is high fidelity in the sequence obtained from this template. Although less than 20% of our sequenced reads could be mapped to the chloroplast genome, it was relatively easy to assemble complete chloroplast genome sequences from the mixture of nuclear, mitochondrial and chloroplast reads. Conclusions: We report successful whole genome sequencing of chloroplast DNA from karaka, obtained efficiently and with high fidelity.

  12. Two complete chloroplast genome sequences of Cannabis sativa varieties.

    Science.gov (United States)

    Oh, Hyehyun; Seo, Boyoung; Lee, Seunghwan; Ahn, Dong-Ha; Jo, Euna; Park, Jin-Kyoung; Min, Gi-Sik

    2016-07-01

    In this study, we determined the complete chloroplast (cp) genomes from two varieties of Cannabis sativa. The genome sizes were 153,848 bp (the Korean non-drug variety, Cheungsam) and 153,854 bp (the African variety, Yoruba Nigeria). The genome structures were identical, with 131 individual genes [86 protein-coding genes (PCGs), eight rRNA, and 37 tRNA genes]. Further, except for the presence of an intron in the rps3 genes of the two C. sativa varieties, the cp genomes of C. sativa had conserved features similar to those of all known species in the order Rosales. To verify the position of C. sativa within the order Rosales, we conducted phylogenetic analysis using concatenated sequences of all PCGs from 17 complete cp genomes. The resulting tree strongly supported monophyly of Rosales. Further, the family Cannabaceae, represented by C. sativa, showed a close relationship with the family Moraceae. The phylogenetic relationship outlined in our study is congruent with those previously shown for the order Rosales. PMID:26104156

  13. High-Throughput Sequencing of Three Lemnoideae (Duckweeds) Chloroplast Genomes from Total DNA

    OpenAIRE

    Wang, Wenqin; Messing, Joachim

    2011-01-01

    Background: Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaptation of aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily Lemnoideae represents such a collection of aquatic species, which because of photosynthesis are among the fastest growing plant species on earth. Methods: We sequenced the chloroplast genomes...

  14. Complete chloroplast genome sequence of Fritillaria unibracteata var. wabuensis based on SMRT Sequencing Technology.

    Science.gov (United States)

    Li, Ying; Li, Qiushi; Li, Xiwen; Song, Jingyuan; Sun, Chao

    2016-09-01

    Fritillaria unibracteata var. wabuensis is an important medicinal plant used for the treatment of cough symptoms related to the respiratory system. The chloroplast genome of F. unibracteata var. wabuensis (GenBank accession no. KF769142) was assembled using the PacBio RS platform (Pacific Biosciences, Beverly, MA) as a circular sequence of 151 009 bp. The assembled genome contains 133 genes, including 88 protein-coding, 37 tRNA, and eight rRNA genes. This genome sequence will provide an important resource for further studies on the evolution of the genus Fritillaria and the molecular identification of Fritillaria herbs and their adulterants. This work suggests that PacBio RS is a powerful tool to sequence and assemble chloroplast genomes. PMID:26370383

  15. Sonication-based isolation and enrichment of Chlorella protothecoides chloroplasts for Illumina genome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Angelova, Angelina [University of Arizona; Park, Sang-Hycuk [University of Arizona; Kyndt, John [Bellevue University; Fitzsimmons, Kevin [University of Arizona; Brown, Judith K [University of Arizona

    2013-09-01

    With the increasing world demand for biofuel, a number of oleaginous algal species are being considered as renewable sources of oil. Chlorella protothecoides Krüger synthesizes triacylglycerols (TAGs) as storage compounds that can be converted into renewable fuel, utilizing an anabolic pathway that is poorly understood. The paucity of algal chloroplast genome sequences has been an important constraint on chloroplast transformation and on studying gene expression in TAG pathways. In this study, intact chloroplasts were released from algal cells using sonication followed by sucrose gradient centrifugation, resulting in a 2.36-fold enrichment of chloroplasts from C. protothecoides, based on qPCR analysis. The C. protothecoides chloroplast genome (cpDNA) was determined using the Illumina HiSeq 2000 sequencing platform and found to be 84,576 bp (about 84.6 kb) in size, with a GC content of 30.8%. This is the first report of an optimized protocol that uses a sonication step, followed by sucrose gradient centrifugation, to release and enrich intact chloroplasts from a microalga (C. protothecoides) of sufficient quality to permit chloroplast genome sequencing with high coverage, while minimizing nuclear genome contamination. The approach is expected to guide chloroplast isolation from other oleaginous algal species for a variety of uses that benefit from enrichment of chloroplasts, ranging from biochemical analysis to genomics studies.

  16. The complete chloroplast genome sequence of Fagopyrum cymosum.

    Science.gov (United States)

    Yang, Jun; Lu, Chaolong; Shen, Qi; Yan, Yuying; Xu, Changjiang; Song, Chi

    2016-07-01

    Fagopyrum cymosum is a traditional medicinal plant. In this study, the complete chloroplast genome of Fagopyrum cymosum is presented. The genome is 160,546 bp in length, containing a pair of inverted repeats (IRs) of 32,598 bp, separated by large single-copy (LSC) and small single-copy (SSC) regions of 84,237 bp and 11,014 bp, respectively. The overall GC content of the genome is 36.9%. The chloroplast genome harbors 126 annotated genes, including 91 protein-coding genes, 29 tRNA genes, and six rRNA genes. Eighteen genes contain one or two introns. Phylogenetic analyses indicated a clear evolutionary relationship among species of Caryophyllales. PMID:26119127

  17. Complete chloroplast genome sequences of Mongolia medicine Artemisia frigida and phylogenetic relationships with other plants.

    Directory of Open Access Journals (Sweden)

    Yue Liu

    BACKGROUND: Artemisia frigida Willd. is an important Mongolian traditional medicinal plant with pharmacological functions of stanch and detumescence. However, there is little sequence and genomic information available for Artemisia frigida, which makes phylogenetic identification, evolutionary studies, and genetic improvement of its value very difficult. We report the complete chloroplast genome sequence of Artemisia frigida based on 454 pyrosequencing. METHODOLOGY/PRINCIPAL FINDINGS: The complete chloroplast genome of Artemisia frigida is 151,076 bp, including a large single-copy (LSC) region of 82,740 bp, a small single-copy (SSC) region of 18,394 bp and a pair of inverted repeats (IRs) of 24,971 bp. The genome contains 114 unique genes and 18 duplicated genes. The chloroplast genome of Artemisia frigida contains a small 3.4 kb inversion within a large 23 kb inversion in the LSC region, a unique feature in Asteraceae. The gene order in the SSC region of Artemisia frigida is inverted compared with the other 6 Asteraceae species whose chloroplast genomes have been sequenced. This inversion is likely caused by an intramolecular recombination event that occurred only in Artemisia frigida. The existence of rich SSR loci in the Artemisia frigida chloroplast genome provides a rare opportunity to study the population genetics of this Mongolian medicinal plant. Phylogenetic analysis demonstrates a sister relationship between Artemisia frigida and four other species in Asteraceae, including Ageratina adenophora, Helianthus annuus, Guizotia abyssinica and Lactuca sativa, based on 61 protein-coding sequences. Furthermore, Artemisia frigida was placed in the tribe Anthemideae in the subfamily Asteroideae (Asteraceae) based on ndhF and trnL-F sequence comparisons. CONCLUSION: The chloroplast genome sequence of Artemisia frigida was assembled and analyzed in this study, representing the first plastid genome sequenced in the Anthemideae tribe. This complete chloroplast genome

  18. The Complete Chloroplast Genome Sequences of Five Epimedium Species: Lights into Phylogenetic and Taxonomic Analyses

    OpenAIRE

    Zhang, Yanjun; Du, Liuwen; Liu, Ao; Chen, Jianjun; Wu, Li; Hu, Weiming; Zhang, Wei; Kim, Kyunghee; Lee, Sang-Choon; Yang, Tae-Jin; Wang, Ying

    2016-01-01

    Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. We here sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology via a combination of de novo and reference-guided assembly, which was also the first comprehensive cp genome analysis on Epimedium combining the cp genome sequence of E. koreanum previously reported. The five Epimedium cp genomes exhibited typical quadripartite and circular struct...

  19. Chloroplast genome sequence of the moss Tortula ruralis: gene content, polymorphism, and structural arrangement relative to other green plant chloroplast genomes

    Directory of Open Access Journals (Sweden)

    Wolf Paul G

    2010-02-01

    Background: Tortula ruralis, a widely distributed species in the moss family Pottiaceae, is increasingly used as a model organism for the study of desiccation tolerance and mechanisms of cellular repair. In this paper, we present the chloroplast genome sequence of T. ruralis, only the second published chloroplast genome for a moss, and the first for a vegetatively desiccation-tolerant plant. Results: The Tortula chloroplast genome is ~123,500 bp, and differs in a number of ways from that of Physcomitrella patens, the first published moss chloroplast genome. For example, Tortula lacks the ~71 kb inversion found in the large single copy region of the Physcomitrella genome and other members of the Funariales. Also, the Tortula chloroplast genome lacks petN, a gene found in all known land plant plastid genomes. In addition, an unusual case of nucleotide polymorphism was discovered. Conclusions: Although the chloroplast genome of Tortula ruralis differs from that of the only other sequenced moss, Physcomitrella patens, we have yet to determine the biological significance of the differences. The polymorphisms we have uncovered in the sequencing of the genome offer a rare possibility (for mosses) of the generation of DNA markers for fine-level phylogenetic studies, or to investigate individual variation within populations.

  20. The complete chloroplast genome sequence of an important medicinal plant Cynanchum wilfordii (Maxim.) Hemsl. (Apocynaceae).

    Science.gov (United States)

    Park, Hyun-Seung; Kim, Kyu-Yeob; Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Seong, Rack Seon; Shim, Young Hun; Sung, Sang Hyun; Yang, Tae-Jin

    2016-09-01

    Cynanchum wilfordii (Maxim.) Hemsl. is a traditional medicinal herb belonging to the Asclepiadoideae subfamily, whose dried roots have been used as traditional medicine in Asia. The complete chloroplast genome of C. wilfordii was generated by de novo assembly using a small amount of whole genome sequencing data. The chloroplast genome of C. wilfordii was 161 241 bp long, composed of a large single-copy region (91 995 bp), a small single-copy region (19 930 bp) and a pair of inverted repeat regions (24 658 bp). The overall GC content of the chloroplast genome was 37.8%. A total of 114 genes were annotated, which included 80 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Phylogenetic analysis with the reported chloroplast genomes revealed that C. wilfordii is most closely related to Asclepias nivea (Caribbean milkweed) and Asclepias syriaca (common milkweed) within the Asclepiadoideae subfamily. PMID:26358391

  1. Relationships of wild and domesticated rices (Oryza AA genome species) based upon whole chloroplast genome sequences

    OpenAIRE

    Wambugu, Peterson W.; Marta Brozynska; Agnelo Furtado; Daniel L. Waters; Robert J. Henry

    2015-01-01

    Rice is the most important crop in the world, acting as the staple food for over half of the world’s population. The evolutionary relationships of cultivated rice and its wild relatives have remained contentious and inconclusive. Here we report on the use of whole chloroplast sequences to elucidate the evolutionary and phylogenetic relationships in the AA genome Oryza species, representing the primary gene pool of rice. This is the first study that has produced a well resolved and strongly su...

  2. The complete sequence of the chloroplast genome of the green microalga Lobosphaera (Parietochloris) incisa.

    Science.gov (United States)

    Tourasse, Nicolas J; Barbi, Tommaso; Waterhouse, Janet C; Shtaida, Nastassia; Leu, Stefan; Boussiba, Sammy; Purton, Saul; Vallon, Olivier

    2016-05-01

    We hereby report the complete chloroplast genome sequence of the green unicellular alga Lobosphaera (Parietochloris) incisa (strain SAG 2468). The genome consists of a circular chromosome of 156,028 bp, which is 72% A-T rich and does not contain a large rRNA-encoding inverted repeat. It is predicted to encode a total of 111 genes including 78 protein-coding, three rRNA, and 30 tRNA genes. The genome sequence also carries a self-splicing group I intron and a group II intron remnant. Overall, the gene and intron content of the L. incisa chloroplast genome is highly similar to that of other species of Trebouxiophyceae. In contrast, the L. incisa chloroplast genome harbors 88 copies of various intergenic dispersed DNA repeat sequences that are all unique to L. incisa. PMID:25423517

  3. The complete chloroplast genome sequence of Brachypodium distachyon: sequence comparison and phylogenetic analysis of eight grass plastomes

    Directory of Open Access Journals (Sweden)

    Anderson Olin D

    2008-07-01

    Background: Wheat, barley, and rye, of tribe Triticeae in the Poaceae, are among the most important crops worldwide but they present many challenges to genomics-aided crop improvement. Brachypodium distachyon, a close relative of those cereals, has recently emerged as a model for grass functional genomics. Sequencing of the nuclear and organelle genomes of Brachypodium is one of the first steps towards making this species available as a tool for researchers interested in cereals biology. Findings: The chloroplast genome of Brachypodium distachyon was sequenced by a combinational approach using BAC end and shotgun sequences derived from a selected BAC containing the entire chloroplast genome. Comparative analysis indicated that the chloroplast genome is conserved in gene number and organization with respect to those of other cereals. However, several Brachypodium genes evolve at a faster rate than those in other grasses. Sequence analysis reveals that rice and wheat have a ~2.1 kb deletion in their plastid genomes and this deletion must have occurred independently in both species. Conclusion: We demonstrate that BAC libraries can be used to sequence plastid, and likely other organellar, genomes. As expected, the Brachypodium chloroplast genome is very similar to those of other sequenced grasses. The phylogenetic analyses and the pattern of insertions and deletions in the chloroplast genome confirmed that Brachypodium is a close relative of the tribe Triticeae. Nevertheless, we show that some large indels can arise multiple times and may confound phylogenetic reconstruction.

  4. The Complete Chloroplast Genome Sequences of the Medicinal Plant Pogostemon cablin.

    Science.gov (United States)

    He, Yang; Xiao, Hongtao; Deng, Cao; Xiong, Liang; Yang, Jian; Peng, Cheng

    2016-01-01

    Pogostemon cablin, the natural source of patchouli alcohol, is an important herb in the Lamiaceae family. Here, we present the entire chloroplast genome of P. cablin. This genome, with 38.24% GC content, is 152,460 bp in length. The genome presents a typical quadripartite structure with two inverted repeats (each 25,417 bp in length), separated by one small and one large single-copy region (17,652 and 83,974 bp in length, respectively). The chloroplast genome encodes 127 genes, of which 107 genes are single-copy, including 79 protein-coding genes, four rRNA genes, and 24 tRNA genes. The genome structure, GC content, and codon usage of this chloroplast genome are similar to those of other species in the family, except that it encodes fewer protein-coding genes and tRNA genes. Phylogenetic analysis reveals that P. cablin diverged from the Scutellarioideae clade about 29.45 million years ago (Mya). Furthermore, most of the simple sequence repeats (SSRs) are short polyadenine or polythymine repeats that contribute to high AT content in the chloroplast genome. Complete sequences and annotation of the P. cablin chloroplast genome will facilitate phylogenetic, population and genetic engineering research involving this particular species. PMID:27275817
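
    The polyadenine and polythymine SSRs mentioned above are runs of a single repeated base. The abstract does not state which tool or length threshold the authors used, so the following is only an illustrative sketch: it scans a sequence with a regular expression and reports A or T runs of at least `min_len` bases. The function name `find_at_homopolymers`, the default threshold of 10, and the toy sequence are assumptions.

```python
import re


def find_at_homopolymers(seq: str, min_len: int = 10):
    """Return (start, base, length) for each run of A or T at least min_len long."""
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [
        (m.start(), m.group()[0], len(m.group()))
        for m in pattern.finditer(seq.upper())
    ]


# Toy sequence for illustration: a 12-base A run, a 9-base T run (too short), a 10-base T run.
toy = "GC" + "A" * 12 + "CG" + "T" * 9 + "G" + "T" * 10
print(find_at_homopolymers(toy))  # [(2, 'A', 12), (26, 'T', 10)]
```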

  5. The Complete Chloroplast Genome Sequences of the Medicinal Plant Pogostemon cablin

    Directory of Open Access Journals (Sweden)

    Yang He

    2016-06-01

    Pogostemon cablin, the natural source of patchouli alcohol, is an important herb in the Lamiaceae family. Here, we present the entire chloroplast genome of P. cablin. This genome, with 38.24% GC content, is 152,460 bp in length. The genome presents a typical quadripartite structure with two inverted repeats (each 25,417 bp in length), separated by one small and one large single-copy region (17,652 and 83,974 bp in length, respectively). The chloroplast genome encodes 127 genes, of which 107 genes are single-copy, including 79 protein-coding genes, four rRNA genes, and 24 tRNA genes. The genome structure, GC content, and codon usage of this chloroplast genome are similar to those of other species in the family, except that it encodes fewer protein-coding genes and tRNA genes. Phylogenetic analysis reveals that P. cablin diverged from the Scutellarioideae clade about 29.45 million years ago (Mya). Furthermore, most of the simple sequence repeats (SSRs) are short polyadenine or polythymine repeats that contribute to high AT content in the chloroplast genome. Complete sequences and annotation of the P. cablin chloroplast genome will facilitate phylogenetic, population and genetic engineering research involving this particular species.

  6. High-throughput sequencing of three Lemnoideae (duckweeds) chloroplast genomes from total DNA.

    Directory of Open Access Journals (Sweden)

    Wenqin Wang

    BACKGROUND: Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaptation of aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily Lemnoideae represents such a collection of aquatic species, which because of photosynthesis are among the fastest growing plant species on earth. METHODS: We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana, by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. CONCLUSIONS: This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. More than 1,000-fold coverage of the chloroplast genome from total DNA was reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution and abundant deletions and insertions occurred in non-coding regions of these genomes, indicating greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Notably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.
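
    The remark about transition bias refers to the two classes of nucleotide substitution: transitions stay within purines (A, G) or within pyrimidines (C, T), while transversions switch between the two classes. The abstract does not describe how substitutions were tallied, so the following is only a sketch of counting both classes between two already-aligned sequences. The function name `ts_tv_counts` and the toy alignment are assumptions; gaps and ambiguous bases are skipped.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_counts(seq_a: str, seq_b: str):
    """Count (transitions, transversions) between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip matches, gaps, and ambiguous symbols.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Toy alignment for illustration only: A>G and G>A are transitions, G>T and T>A are transversions.
print(ts_tv_counts("ACGTACGT", "GCTTACAA"))  # (2, 2)
```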

  7. The complete chloroplast genome sequence of Lilium hansonii Leichtlin ex D.D.T.Moore.

    Science.gov (United States)

    Kim, Kyunghee; Hwang, Yoon-Jung; Lee, Sang-Choon; Yang, Tae-Jin; Lim, Ki-Byung

    2016-09-01

    Lilium hansonii is a lily species native to Korea and an important wild species for lily breeding. The chloroplast genome of L. hansonii was completed by de novo assembly using a small amount of whole genome sequencing data. The chloroplast genome of L. hansonii was 152 655 bp long and consisted of a large single-copy region (82 051 bp), a small single-copy region (17 620 bp) and a pair of inverted repeat regions (26 492 bp). A total of 115 genes were annotated, which included 81 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Phylogenetic analysis with the reported chloroplast genomes revealed that L. hansonii is most closely related to L. superbum (Turk's-cap lily) and L. longiflorum (Easter lily). PMID:26404645

  8. The complete chloroplast genome sequence of the medicinal plant Rheum palmatum L. (Polygonaceae).

    Science.gov (United States)

    Fan, Kai; Sun, Xiao-Jie; Huang, Min; Wang, Xu-Mei

    2016-07-01

    The complete chloroplast genome of the medicinal plant Rheum palmatum L. (Polygonaceae) has been reconstructed from the whole-genome Illumina sequencing data. The genome is 161 541 bp in length, and exhibits a typical quadripartite structure of the large (LSC, 86 518 bp) and small (SSC, 13 111 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 30 956 bp each). The chloroplast genome contains 131 genes, including 84 protein-coding genes (78 PCG species), eight ribosomal RNA genes (four rRNA species) and 37 transfer RNA genes (28 tRNA species). Phylogenetic tree based on the maximum parsimony (MP) analysis of 65 chloroplast protein-coding genes for 13 taxa demonstrated a close relationship between R. palmatum and Fagopyrum esculentum subsp. ancestrale in Polygonaceae. PMID:26153751

  9. The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae)

    Energy Technology Data Exchange (ETDEWEB)

    Wolf, Paul G.; Karol, Kenneth G.; Mandoli, Dina F.; Kuehl, Jennifer V.; Arumuganathan, K.; Ellis, Mark W.; Mishler, Brent D.; Kelch, Dean G.; Olmstead, Richard G.; Boore, Jeffrey L.

    2005-02-01

    We used a unique combination of techniques to sequence the first complete chloroplast genome of a lycophyte, Huperzia lucidula. This plant belongs to a significant clade hypothesized to represent the sister group to all other vascular plants. We used fluorescence-activated cell sorting (FACS) to isolate the organelles, rolling circle amplification (RCA) to amplify the genome, and shotgun sequencing to 8x depth coverage to obtain the complete chloroplast genome sequence. The genome is 154,373 bp, containing inverted repeats of 15,314 bp each, a large single-copy region of 104,088 bp, and a small single-copy region of 19,671 bp. Gene order is more similar to those of mosses, liverworts, and hornworts than to gene order for other vascular plants. For example, the Huperzia chloroplast genome possesses the bryophyte gene order for a previously characterized 30 kb inversion, thus supporting the hypothesis that lycophytes are sister to all other extant vascular plants. The lycophyte chloroplast genome data also enable a better reconstruction of the basal tracheophyte genome, which is useful for inferring relationships among bryophyte lineages. Several unique characters are observed in Huperzia, such as movement of the gene ndhF from the small single copy region into the inverted repeat. We present several analyses of evolutionary relationships among land plants by using nucleotide data, amino acid sequences, and by comparing gene arrangements from chloroplast genomes. The results, while still tentative pending the large number of chloroplast genomes from other key lineages that are soon to be sequenced, are intriguing in themselves, and contribute to a growing comparative database of genomic and morphological data across the green plants.

  10. Complete chloroplast genome sequence of Omani lime (Citrus aurantiifolia) and comparative analysis within the rosids.

    Directory of Open Access Journals (Sweden)

    Huei-Jiun Su

    The genus Citrus contains many economically important fruits that are grown worldwide for their high nutritional and medicinal value. Due to frequent hybridizations among species and cultivars, the exact number of natural species and the taxonomic relationships within this genus are unclear. To compare the differences between the Citrus chloroplast genomes and to develop useful genetic markers, we used a reference-assisted approach to assemble the complete chloroplast genome of Omani lime (C. aurantiifolia). The complete C. aurantiifolia chloroplast genome is 159,893 bp in length; the organization and gene content are similar to most of the rosids lineages characterized to date. Through comparison with the sweet orange (C. sinensis) chloroplast genome, we identified three intergenic regions and 94 simple sequence repeats (SSRs) that are potentially informative markers with resolution for interspecific relationships. These markers can be utilized to better understand the origin of cultivated Citrus. A comparison among 72 species belonging to 10 families of representative rosids lineages also provides new insights into their chloroplast genome evolution.

  11. The complete chloroplast genome sequence of Tetrastigma hemsleyanum Diels at Gilg.

    Science.gov (United States)

    Li, Mengzhu; Chen, Qinyi; Yang, Bingxian; Ma, Ji; Li, Baoguo; Zhang, Lin

    2016-09-01

    The complete chloroplast genome sequence of Tetrastigma hemsleyanum Diels at Gilg, a critical Chinese medicine, is reported here. The complete chloroplast genome of Tetrastigma hemsleyanum Diels at Gilg is 159 914 bp in length with 37.55% overall GC content. A pair of IRs (inverted repeats) of 26 510 bp were separated by LSC (87 927 bp) and SSC (18 967 bp). The phylogenetic analysis of 40 taxa showed a strong sister relationship with all other rosids. However, the placement of Myrtales still needs further verification. PMID:26329851
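
    Overall GC content, as reported in the record above, is the share of G and C among the bases of the assembled sequence. A minimal sketch of that calculation follows; the function name and the toy sequence are assumptions for illustration, and ambiguous bases such as N are simply ignored here, which may differ from the convention used by any particular study.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T).

    Hypothetical helper; ambiguity codes such as N are excluded
    from both numerator and denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / acgt


# Toy sequence, not real plastome data: 4 G/C out of 10 unambiguous bases.
toy = "ATGCGCNNATAT"
print(round(gc_content(toy), 2))  # 40.0
```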

  12. The complete chloroplast genome sequence of Clematis terniflora DC. (Ranunculaceae).

    Science.gov (United States)

    Li, Mengzhu; Yang, Bingxian; Chen, Qinyi; Zhu, Wei; Ma, Ji; Tian, Jingkui

    2016-07-01

    Clematis terniflora DC. is an important medicinal plant used in the treatment of inflammatory symptoms related to respiratory and urinary systems. In this study, we found that the complete cp genome of C. terniflora DC. is 159,528 bp. The phylogenetic analysis of 32 taxa showed a strong sister relationship with Ranunculus macranthus, which also strongly supports the position of Ranunculales. The complete cp genome sequence of Clematis terniflora DC. reported here has the potential to advance population and phylogenetic studies of this medicinal plant. PMID:25865739

  13. A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

    Energy Technology Data Exchange (ETDEWEB)

    Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.; Jansen, Robert K.

    2006-01-20

    Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast-evolving chloroplast gene ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnL-trnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would be very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Euasterids II, that of Panax ginseng (Araliaceae, 156,318 bp, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. A later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at a fine level and of RNA editing by comparison to available EST sequences. In addition, since

  14. Complete chloroplast and ribosomal sequences for 30 accessions elucidate evolution of Oryza AA genome species

    OpenAIRE

    Kyunghee Kim; Sang-Choon Lee; Junki Lee; Yeisoo Yu; Kiwoung Yang; Beom-Soon Choi; Hee-Jong Koh; Nomar Espinosa Waminal; Hong-Il Choi; Nam-Hoon Kim; Woojong Jang; Hyun-Seung Park; Jonghoon Lee; Hyun Oh Lee; Ho Jun Joh

    2015-01-01

    Cytoplasmic chloroplast (cp) genomes and nuclear ribosomal DNA (nR) are the primary sequences used to understand plant diversity and evolution. We introduce a high-throughput method to simultaneously obtain complete cp and nR sequences using Illumina platform whole-genome sequence. We applied the method to 30 rice specimens belonging to nine Oryza species. Concurrent phylogenomic analysis using cp and nR of several specimens of the same Oryza AA genome species provides insight into the evo...

  15. The complete chloroplast genome sequence of Aster spathulifolius (Asteraceae); genomic features and relationship with Asteraceae.

    Science.gov (United States)

    Choi, Kyoung Su; Park, SeonJoo

    2015-11-10

    Aster spathulifolius, a member of the Asteraceae family, is distributed along the coast of Japan and Korea. This plant is used for medicinal and ornamental purposes. The complete chloroplast (cp) genome of A. spathulifolius consists of 149,473 bp that include a pair of inverted repeats of 24,751 bp separated by a large single copy region of 81,998 bp and a small single copy region of 17,973 bp. The chloroplast genome contains 78 coding genes, four rRNA genes and 29 tRNA genes. When compared to other cpDNA sequences of Asteraceae, A. spathulifolius showed the closest relationship with Jacobaea vulgaris, and its atpB gene was found to be a pseudogene, unlike that of J. vulgaris. Furthermore, evaluation of the gene compositions of J. vulgaris, Helianthus annuus, Guizotia abyssinica and A. spathulifolius revealed a 13.6-kb inversion from ndhF to rps15, unlike in Lactuca of Asteraceae. Comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates with J. vulgaris revealed that synonymous rates were highest for genes related to the small subunit of the ribosome (0.1558), while nonsynonymous rates were highest for ATP synthase genes (0.0118). These findings revealed that substitution has occurred at similar rates in most genes, and the substitution rates suggested that most genes are under purifying selection. PMID:26164759
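
    Inferences about selection from substitution rates, as in the record above, conventionally rest on the ratio Ka/Ks: values well below 1 are read as purifying selection, values near 1 as neutral evolution, and values above 1 as positive selection. The sketch below illustrates that rule of thumb only; the function name, the tolerance band, and the per-gene values are hypothetical and are not taken from the study.

```python
def selection_regime(ka: float, ks: float, tol: float = 0.05) -> str:
    """Classify a gene by its Ka/Ks ratio.

    Hypothetical helper: `tol` is an arbitrary band around 1 that is
    treated as "neutral"; real analyses use statistical tests instead.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    omega = ka / ks
    if omega < 1 - tol:
        return "purifying"
    if omega > 1 + tol:
        return "positive"
    return "neutral"


# Made-up per-gene rates for illustration (not values from the abstract).
print(selection_regime(ka=0.004, ks=0.080))  # purifying
```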

  16. The complete chloroplast genome sequence of Ledebouriella seseloides (Hoffm.) H. Wolff.

    Science.gov (United States)

    Lee, Hyun Oh; Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Lee, Jonghoon; Kim, Soonok; Yang, Tae-Jin

    2016-09-01

    Ledebouriella seseloides (Hoffm.) H.Wolff is a traditional medicinal herb belonging to Apiaceae family, whose dried roots and rhizomes have been used as traditional medicine in East Asian countries. The complete chloroplast genome of L. seseloides was obtained by de novo assembly using the small amount of whole genome sequencing data. The chloroplast genome of L. seseloides was 147 880 bp in length, which consisted of large single copy region (93 222 bp), small single copy region (17 324 bp), and a pair of inverted repeat regions (18 667 bp). The overall GC contents of the chloroplast genome were 37.5%. A total of 113 genes were annotated, which included 79 protein-coding genes, 30 tRNA genes, and four rRNA genes. Phylogenetic analysis with the reported chloroplast genomes revealed that L. seseloides is most closely related to Petroselinum crispum (parsley), an herb widely used in cooking. PMID:26218226

  17. Integration of complete chloroplast genome sequences with small amplicon datasets improves phylogenetic resolution in Acacia.

    Science.gov (United States)

    Williams, Anna V; Miller, Joseph T; Small, Ian; Nevill, Paul G; Boykin, Laura M

    2016-03-01

    Combining whole genome data with previously obtained amplicon sequences has the potential to increase the resolution of phylogenetic analyses, particularly at low taxonomic levels or where recent divergence, rapid speciation or slow genome evolution has resulted in limited sequence variation. However, the integration of these types of data for large scale phylogenetic studies has rarely been investigated. Here we conduct a phylogenetic analysis of the whole chloroplast genome and two nuclear ribosomal loci for 65 Acacia species from across the most recent Acacia phylogeny. We then combine this data with previously generated amplicon sequences (four chloroplast loci and two nuclear ribosomal loci) for 508 Acacia species. We use several phylogenetic methods, including maximum likelihood bootstrapping (with and without constraint) and ExaBayes, in order to determine the success of combining a dataset of 4000bp with one of 189,000bp. The results of our study indicate that the inclusion of whole genome data gave a far better resolved and well supported representation of the phylogenetic relationships within Acacia than using only amplicon sequences, with the greatest support observed when using a whole genome phylogeny as a constraint on the amplicon sequences. Our study therefore provides methods for optimal integration of genomic and amplicon sequences. PMID:26702955

  18. Chloroplast genome sequencing analysis of Heterosigma akashiwo CCMP452 (West Atlantic) and NIES293 (West Pacific) strains

    Directory of Open Access Journals (Sweden)

    Lybrand Terry

    2008-05-01

    Background: Heterokont algae form a monophyletic group within the stramenopile branch of the tree of life. These organisms display wide morphological diversity, ranging from minute unicells to massive, bladed forms. Surprisingly, chloroplast genome sequences are available only for diatoms, representing two (Coscinodiscophyceae and Bacillariophyceae) of approximately 18 classes of algae that comprise this taxonomic cluster. A universal challenge to chloroplast genome sequencing studies is the retrieval of highly purified DNA in quantities sufficient for analytical processing. To circumvent this problem, we have developed a simplified method for sequencing chloroplast genomes, using fosmids selected from a total cellular DNA library. The technique has been used to sequence chloroplast DNA of two Heterosigma akashiwo strains. This raphidophyte has served as a model system for studies of stramenopile chloroplast biogenesis and evolution. Results: H. akashiwo strain CCMP452 (West Atlantic) chloroplast DNA is 160,149 bp in size with a 21,822-bp inverted repeat, whereas NIES293 (West Pacific) chloroplast DNA is 159,370 bp in size and has an inverted repeat of 21,665 bp. The fosmid cloning technique reveals that both strains contain an isomeric chloroplast DNA population resulting from an inversion of their single copy domains. Both strains contain multiple small inverted and tandem repeats, non-randomly distributed within the genomes. Although both CCMP452 and NIES293 chloroplast DNAs contain 197 genes, multiple nucleotide polymorphisms are present in both coding and intergenic regions. Several protein-coding genes contain large, in-frame inserts relative to orthologous genes in other plastids. These inserts are maintained in mRNA products. Two genes of interest in H. akashiwo, not previously reported in any chloroplast genome, include tyrC, a tyrosine recombinase, which we hypothesize may be a result of a lateral gene transfer event, and an

  19. The Complete Chloroplast Genome Sequences of Five Epimedium Species: Lights into Phylogenetic and Taxonomic Analyses

    Science.gov (United States)

    Zhang, Yanjun; Du, Liuwen; Liu, Ao; Chen, Jianjun; Wu, Li; Hu, Weiming; Zhang, Wei; Kim, Kyunghee; Lee, Sang-Choon; Yang, Tae-Jin; Wang, Ying

    2016-01-01

    Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. Here we sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology via a combination of de novo and reference-guided assembly. Combined with the previously reported cp genome sequence of E. koreanum, this is the first comprehensive cp genome analysis of Epimedium. The five Epimedium cp genomes exhibited a typical quadripartite, circular structure that was rather conserved in genomic structure and synteny of gene order. However, these cp genomes showed obvious variation at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. The trnQ-UUG duplication occurred in the five Epimedium cp genomes but was not found in the other basal eudicotyledons. Rapidly evolving cp genome regions were detected among the five cp genomes, and differences in simple sequence repeats (SSRs) and repeat sequences were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes were on the whole in accordance with the updated system of the genus, but indicated that the evolutionary relationships and the divisions of the genus need further investigation with more evidence. The availability of these cp genomes provides valuable genetic information for accurate species identification, taxonomy, phylogenetic resolution and evolutionary studies of Epimedium, and will assist in the exploration and utilization of Epimedium plants. PMID:27014326

  20. Relationships of wild and domesticated rices (Oryza AA genome species) based upon whole chloroplast genome sequences.

    Science.gov (United States)

    Wambugu, Peterson W; Brozynska, Marta; Furtado, Agnelo; Waters, Daniel L; Henry, Robert J

    2015-01-01

    Rice is the most important crop in the world, acting as the staple food for over half of the world's population. The evolutionary relationships of cultivated rice and its wild relatives have remained contentious and inconclusive. Here we report on the use of whole chloroplast sequences to elucidate the evolutionary and phylogenetic relationships in the AA genome Oryza species, representing the primary gene pool of rice. This is the first study that has produced a well resolved and strongly supported phylogeny of the AA genome species. The pantropical distribution of these rice relatives was found to be explained by long distance dispersal within the last million years. The analysis resulted in a clustering pattern that showed strong geographical differentiation. The species were defined in two primary clades with a South American/African clade with two species, O. glumaepatula and O. longistaminata, distinguished from all other species. The largest clade comprised an Australian clade including newly identified taxa and the African and Asian clades. This refined knowledge of the relationships between cultivated rice and the related wild species provides a strong foundation for more targeted use of wild genetic resources in rice improvement and efforts to ensure their conservation. PMID:26355750

  1. The Complete Chloroplast Genome of Capsicum annuum var. glabriusculum Using Illumina Sequencing

    Directory of Open Access Journals (Sweden)

    Sebastin Raveendar

    2015-07-01

    Chloroplast (cp) genome sequences provide a valuable source for DNA barcoding. Molecular phylogenetic studies have concentrated on DNA sequencing of conserved gene loci. However, this approach is time consuming and more difficult to implement when gene organization differs among species. Here we report the complete re-sequencing of the cp genome of Capsicum pepper (Capsicum annuum var. glabriusculum) using the Illumina platform. The total length of the cp genome is 156,817 bp with a 37.7% overall GC content. A pair of inverted repeats (IRs) of 50,284 bp were separated by a small single copy (SSC; 18,948 bp) and a large single copy (LSC; 87,446 bp). The number of cp genes in C. annuum var. glabriusculum is the same as that in other Capsicum species. Variations in the lengths of LSC, SSC and IR regions were the main contributors to the size variation in the cp genome of this species. A total of 125 simple sequence repeats (SSRs) and 48 insertion or deletion variants were found by sequence alignment of Capsicum cp genomes. These findings provide a foundation for further investigation of cp genome evolution in Capsicum and other higher plants.

  2. Comparative genomics of four Liliales families inferred from the complete chloroplast genome sequence of Veratrum patulum O. Loes. (Melanthiaceae).

    Science.gov (United States)

    Do, Hoang Dang Khoa; Kim, Jung Sung; Kim, Joo-Hwan

    2013-11-10

    The sequence of the chloroplast genome, which is inherited maternally, contains useful information for many scientific fields such as plant systematics, biogeography and biotechnology because its characteristics are highly conserved among species. The number of sequenced angiosperm chloroplast genomes has increased in recent years. In this study, the nucleotide sequence of the chloroplast genome (cpDNA) of Veratrum patulum Loes. (Melanthiaceae, Liliales) was analyzed completely. The circular double-stranded DNA of 153,699 bp consists of two inverted repeat (IR) regions of 26,360 bp each, a large single copy of 83,372 bp, and a small single copy of 17,607 bp. This plastome contains 81 protein-coding genes, 30 distinct tRNA genes and four rRNA genes. In addition, there are six hypothetical coding regions (ycf1, ycf2, ycf3, ycf4, ycf15 and ycf68) and two open reading frames (ORF42 and ORF56), which are also found in the chloroplast genomes of the other species. The gene order and gene content of the V. patulum plastid genome are similar to those of Smilax china, Lilium longiflorum and Alstroemeria aurea, members of the Smilacaceae, Liliaceae and Alstroemeriaceae (Liliales), respectively. However, the loss of rps16 exon 2 in V. patulum results in a difference in the large single copy region in comparison with other species. The base substitution rate is quite similar among genes of these species. Additionally, the base substitution rate of the inverted repeat region was smaller than that of the single copy regions in all observed species of Liliales. The IR regions were expanded to trnH-GUG in V. patulum, a part of rps19 in L. longiflorum and A. aurea, and the whole sequence of rps19 in S. china. Furthermore, the IGS lengths of the rbcL-accD-psaI region were variable among Liliales species, suggesting that this region might be a hotspot of indel events and an informative site for phylogenetic studies in Liliales. In general, the whole chloroplast genome of V. patulum, a

  3. Complete Chloroplast Genome Sequence of Aquilaria sinensis (Lour.) Gilg and the Evolution Analysis within the Malvales order

    Directory of Open Access Journals (Sweden)

    Ying Wang

    2016-03-01

    Aquilaria sinensis (Lour.) Gilg is an important medicinal woody plant producing agarwood, which is widely used in traditional Chinese medicine. High-throughput sequencing of chloroplast (cp) genomes has enhanced the understanding of evolutionary relationships within plant families. In this study, we determined the complete cp genome sequence of A. sinensis. The size of the A. sinensis cp genome was 159,565 bp. This genome included a large single-copy region of 87,482 bp, a small single-copy region of 19,857 bp, and a pair of inverted repeats (IRa and IRb) of 26,113 bp each. The GC content of the genome was 37.11%. The A. sinensis cp genome encoded 113 functional genes, including 82 protein-coding genes, 27 tRNA genes, and 4 rRNA genes. Seven genes were duplicated in the protein-coding genes, whereas 11 genes were duplicated in the RNA genes. A total of 45 polymorphic simple-sequence repeat loci and 60 pairs of large repeats were identified. Most simple-sequence repeats were located in the noncoding sections of the large single-copy/small single-copy region and exhibited high A/T content. Moreover, 33 pairs of large repeat sequences were located in the protein-coding genes, whereas 27 pairs were located in the intergenic regions. Codon usage in the A. sinensis cp genome was biased toward codons ending in A/T. The distribution of codon usage in the A. sinensis cp genome was most similar to that in the Gonystylus bancanus cp genome. Comparative results of 82 protein-coding genes from 29 species of cp genomes demonstrated that A. sinensis was a sister species to G. bancanus within the Malvales order. The A. sinensis cp genome presented the highest sequence similarity of >90% with the G. bancanus cp genome by using CGView Comparison Tool. This finding strongly supports the placement of A. sinensis as a sister to G. bancanus within the Malvales order. The complete A. sinensis cp genome information will be highly beneficial for further studies on this traditional

  4. Insights into phylogeny, sex function and age of Fragaria based on whole chloroplast genome sequencing.

    Science.gov (United States)

    Njuguna, Wambui; Liston, Aaron; Cronn, Richard; Ashman, Tia-Lynn; Bassil, Nahla

    2013-01-01

    The cultivated strawberry is one of the youngest domesticated plants, developed in France in the 1700s from chance hybridization between two western hemisphere octoploid species. However, little is known about the evolution of the species that gave rise to this important fruit crop. Phylogenetic analysis of chloroplast genome sequences of 21 Fragaria species and subspecies resolves the western North American diploid F. vesca subsp. bracteata as sister to the clade of octoploid/decaploid species. No extant tetraploids or hexaploids are directly involved in the maternal ancestry of the octoploids. There is strong geographic segregation of chloroplast haplotypes in subsp. bracteata, and the gynodioecious Pacific Coast populations are implicated as both the maternal lineage and the source of male-sterility in the octoploid strawberries. Analysis of sexual system evolution in Fragaria provides evidence that the loss of male and female function can follow polyploidization, but does not seem to be associated with loss of self-incompatibility following genome doubling. Character-state mapping provided insight into sexual system evolution and its association with loss of self-incompatibility and genome doubling/merger. Fragaria attained its circumboreal and amphitropical distribution within the past one to four million years and the rise of the octoploid clade is dated at 0.372-2.05 million years ago. PMID:22982444

  5. The complete chloroplast genome sequence of the medicinal plant Glehnia littoralis F.Schmidt ex Miq. (Apiaceae).

    Science.gov (United States)

    Lee, Sang-Choon; Oh Lee, Hyun; Kim, Kyunghee; Kim, Soonok; Yang, Tae-Jin

    2016-09-01

    Glehnia littoralis F. Schmidt ex Miq. is an oriental medicinal herb belonging to the Apiaceae family, and its dried roots and rhizomes are known to show various pharmacological effects. The complete chloroplast genome of G. littoralis was generated by de novo assembly using whole genome sequencing data. The chloroplast genome of G. littoralis was 147 467 bp in length and divided into four distinct regions: large single copy region (93 493 bp), small single copy region (17 546 bp) and a pair of inverted repeat regions (18 214 bp). A total of 114 genes including 80 protein-coding genes, 30 tRNA genes and 4 rRNA genes were predicted and accounted for 57.1% of the chloroplast genome. Phylogenetic analysis with the reported chloroplast genomes revealed that G. littoralis is an herbal species closely related to Ledebouriella seseloides, an herbal medicinal plant. PMID:26367483

  6. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes

    Directory of Open Access Journals (Sweden)

    Cronn Richard

    2009-12-01

    Background: Molecular evolutionary studies share the common goal of elucidating historical relationships, and the common challenge of adequately sampling taxa and characters. Particularly at low taxonomic levels, recent divergence, rapid radiations, and conservative genome evolution yield limited sequence variation, and dense taxon sampling is often desirable. Recent advances in massively parallel sequencing make it possible to rapidly obtain large amounts of sequence data, and multiplexing makes extensive sampling of megabase sequences feasible. Is it possible to efficiently apply massively parallel sequencing to increase phylogenetic resolution at low taxonomic levels? Results: We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome) generated using multiplexed massively parallel sequencing. 30/33 ingroup nodes resolved with ≥ 95% bootstrap support; this is a substantial improvement relative to prior studies, and shows massively parallel sequencing-based strategies can produce sufficient high quality sequence to reach support levels originally proposed for the phylogenetic bootstrap. Resampling simulations show that at least the entire plastome is necessary to fully resolve Pinus, particularly in rapidly radiating clades. Meta-analysis of 99 published infrageneric phylogenies shows that whole plastome analysis should provide similar gains across a range of plant genera. A disproportionate amount of phylogenetic information resides in two loci (ycf1, ycf2), highlighting their unusual evolutionary properties. Conclusion: Plastome sequencing is now an efficient option for increasing phylogenetic resolution at lower taxonomic levels in plant phylogenetic and population genetic analyses. With continuing improvements in sequencing capacity, the strategies herein should revolutionize efforts requiring dense taxon and character sampling

  7. Complete Chloroplast Genome Sequence of Omani Lime (Citrus aurantiifolia) and Comparative Analysis within the Rosids

    OpenAIRE

    Huei-Jiun Su; Hogenhout, Saskia A.; Al-Sadi, Abdullah M.; Chih-Horng Kuo

    2014-01-01

    The genus Citrus contains many economically important fruits that are grown worldwide for their high nutritional and medicinal value. Due to frequent hybridizations among species and cultivars, the exact number of natural species and the taxonomic relationships within this genus are unclear. To compare the differences between the Citrus chloroplast genomes and to develop useful genetic markers, we used a reference-assisted approach to assemble the complete chloroplast genome of Omani lime (C....

  8. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences

    Directory of Open Access Journals (Sweden)

    Liu Chang

    2012-12-01

    Background: The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results: We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tools. The edited annotations can then be uploaded to CPGAVAS for update and re-analysis repeatedly. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions: CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas.
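
    Because the record above mentions annotation results in GFF3 format and summary statistics for the annotated genome, a small sketch of how such a summary could be tallied from GFF3 text may help. This is not CPGAVAS code, and the server's actual output layout is not described in the abstract; the function name and the example records below are made up. The only format fact relied on is that GFF3 feature lines have nine tab-separated columns with the feature type in the third.

```python
from collections import Counter


def count_feature_types(gff3_text: str) -> Counter:
    """Count GFF3 features by type (column 3).

    Hypothetical helper: skips blank lines, comment/directive lines
    starting with '#', and lines that do not have nine columns.
    """
    counts = Counter()
    for line in gff3_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        counts[cols[2]] += 1
    return counts


# Made-up annotation records for illustration only.
example = "\n".join([
    "##gff-version 3",
    "cp\tdemo\tgene\t1\t100\t.\t+\t.\tID=gene1;Name=psbA",
    "cp\tdemo\ttRNA\t150\t220\t.\t-\t.\tID=trna1;Name=trnH-GUG",
    "cp\tdemo\tgene\t300\t900\t.\t+\t.\tID=gene2;Name=matK",
    "cp\tdemo\trRNA\t1000\t2500\t.\t+\t.\tID=rrna1;Name=rrn16",
])
summary = count_feature_types(example)
print(dict(summary))
```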

  9. In silico analysis of Simple Sequence Repeats from chloroplast genomes of Solanaceae species

    Directory of Open Access Journals (Sweden)

    Evandro Vagner Tambarussi

    2009-01-01

    Full Text Available The availability of chloroplast genome (cpDNA) sequences of Atropa belladonna, Nicotiana sylvestris, N. tabacum, N. tomentosiformis, Solanum bulbocastanum, S. lycopersicum and S. tuberosum, which are Solanaceae species, allowed us to analyze the organization of cpSSRs in their genic and intergenic regions. In general, the number of cpSSRs in cpDNA ranged from 161 in S. tuberosum to 226 in N. tabacum, and the number of intergenic cpSSRs was higher than that of genic cpSSRs. Mononucleotide repeats were the most frequent in the studied species, but we also identified di-, tri-, tetra-, penta- and hexanucleotide repeats. Multiple alignments of all cpSSR sequences from Solanaceae species made the identification of nucleotide variability possible, and the phylogeny was estimated by maximum parsimony. Our study showed that the plastome database can be exploited for phylogenetic analysis and biotechnological approaches.
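
    To make the idea of scanning a sequence for mono- to hexanucleotide repeats concrete, here is a minimal Python sketch based on regular expressions. It is not the method used in the study: the minimum repeat counts in `MIN_REPEATS`, the function name `find_ssrs` and the demo sequence are all assumptions chosen for illustration.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 8, 2: 4, 3: 3, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip motifs that are repeats of a shorter motif (e.g. "AA", "ATAT"),
            # so each tract is reported only under its shortest motif.
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


# Made-up sequence: a run of ten A's and five copies of AT.
demo = "GGC" + "A" * 10 + "CGC" + "AT" * 5 + "GCC"
ssrs = find_ssrs(demo)
print(ssrs)
```

    Start positions are 0-based. A real survey would also need to handle imperfect and compound repeats, which this sketch ignores.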

  10. Complete chloroplast and ribosomal sequences for 30 accessions elucidate evolution of Oryza AA genome species

    Science.gov (United States)

    Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Yu, Yeisoo; Yang, Kiwoung; Choi, Beom-Soon; Koh, Hee-Jong; Waminal, Nomar Espinosa; Choi, Hong-Il; Kim, Nam-Hoon; Jang, Woojong; Park, Hyun-Seung; Lee, Jonghoon; Lee, Hyun Oh; Joh, Ho Jun; Lee, Hyeon Ju; Park, Jee Young; Perumal, Sampath; Jayakodi, Murukarthick; Lee, Yun Sun; Kim, Backki; Copetti, Dario; Kim, Soonok; Kim, Sunggil; Lim, Ki-Byung; Kim, Young-Dong; Lee, Jungho; Cho, Kwang-Su; Park, Beom-Seok; Wing, Rod A.; Yang, Tae-Jin

    2015-01-01

    Cytoplasmic chloroplast (cp) genomes and nuclear ribosomal DNA (nR) are the primary sequences used to understand plant diversity and evolution. We introduce a high-throughput method to simultaneously obtain complete cp and nR sequences using Illumina platform whole-genome sequence. We applied the method to 30 rice specimens belonging to nine Oryza species. Concurrent phylogenomic analysis using cp and nR of several specimens of the same Oryza AA genome species provides insight into the evolution and domestication of cultivated rice, clarifying three ambiguous but important issues in the evolution of wild Oryza species. First, cp-based trees clearly classify each lineage but can be biased by inter-subspecies cross-hybridization events during speciation. Second, O. glumaepatula, a South American wild rice, includes two cytoplasm types, one of which is derived from a recent interspecies hybridization with O. longistaminata. Third, the Australian O. rufipogon-type rice is a perennial form of O. meridionalis. PMID:26506948

  11. The complete chloroplast genome sequence of Pelargonium x hortorum: Organization and evolution of the largest and most highly rearranged chloroplast genome of land plants

    Energy Technology Data Exchange (ETDEWEB)

    Chumley, Timothy W.; Palmer, Jeffrey D.; Mower, Jeffrey P.; Fourcade, H. Matthew; Calie, Patrick J.; Boore, Jeffrey L.; Jansen, Robert K.

    2006-01-20

    The chloroplast genome of Pelargonium x hortorum has been completely sequenced. It maps as a circular molecule of 217,942 bp, and is both the largest and most rearranged land plant chloroplast genome yet sequenced. It features two copies of a greatly expanded inverted repeat (IR) of 75,741 bp each, and consequently diminished single copy regions of 59,710 bp and 6,750 bp. It also contains two different associations of repeated elements that contribute about 10 percent to the overall size and account for the majority of repeats found in the genome. They represent hotspots for rearrangements and gene duplications and include a large number of pseudogenes. We propose simple models that account for the major rearrangements with a minimum of eight IR boundary changes and 12 inversions in addition to several insertions of duplicated sequence. The major processes at work (duplication, IR expansion, and inversion) have disrupted at least one and possibly two or three transcriptional operons, and the genes involved in these disruptions form the core of the two major repeat associations. Despite the vast increase in size and complexity of the genome, the gene content is similar to that of other angiosperms, with the exceptions of a large number of pseudogenes as part of the repeat associations, the recognition of two open reading frames (ORF56 and ORF42) in the trnA intron with similarities to previously identified mitochondrial products (ACRS and pvs-trnA), the loss of accD and trnT-GGU, and in particular, the lack of a recognizably functional rpoA. One or all of three similar open reading frames may possibly encode the latter, however.
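
    The region sizes quoted in this abstract can be checked against the reported total, since a quadripartite genome consists of two single copy regions plus two copies of the inverted repeat. The short Python sketch below does that arithmetic with the figures from the abstract. The function and variable names are hypothetical, and treating the 59,710 bp region as the large single copy is an assumption based only on its size.

```python
def quadripartite_total(lsc, ssc, ir):
    """Genome length implied by one LSC, one SSC and two IR copies (all in bp)."""
    return lsc + ssc + 2 * ir


# Figures quoted in the Pelargonium abstract above.
lsc_bp, ssc_bp, ir_bp, reported_bp = 59_710, 6_750, 75_741, 217_942

implied_bp = quadripartite_total(lsc_bp, ssc_bp, ir_bp)
ir_fraction = 2 * ir_bp / reported_bp  # share of the genome inside the two IR copies

print(implied_bp == reported_bp)
print(f"IR share of genome: {ir_fraction:.1%}")
```

    With these figures the implied and reported totals agree, and the two IR copies make up roughly 70% of the genome.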

  12. The complete chloroplast genome sequence of Ampelopsis: gene organization, comparative analysis and phylogenetic relationships to other angiosperms

    Directory of Open Access Journals (Sweden)

    Gurusamy Raman

    2016-03-01

    Full Text Available Ampelopsis brevipedunculata is an economically important plant that belongs to the Vitaceae family of angiosperms. The phylogenetic placement of Vitaceae is still unresolved. Recent phylogenetic studies suggested that it should be placed in various alternative families including Caryophyllaceae, Asteraceae, Saxifragaceae, Dilleniaceae, or with the rest of the rosid families. However, these analyses provided weak supportive results because they were based on only one of several genes. Accordingly, complete chloroplast genome sequences are required to resolve the phylogenetic relationships among angiosperms. Recent phylogenetic analyses based on the complete chloroplast genome sequence suggested strong support for the position of Vitaceae as the earliest diverging lineage of rosids and placed it as a sister to the remaining rosids. These studies also revealed relationships among several major lineages of angiosperms; however, they highlighted the significance of taxon sampling for obtaining accurate phylogenies. In the present study, we sequenced the complete chloroplast genome of A. brevipedunculata and used these data to assess the relationships among 32 angiosperms, including 18 taxa of rosids. The Ampelopsis chloroplast genome is 161,090 bp in length, and includes a pair of inverted repeats of 26,394 bp that are separated by small and large single copy regions of 19,036 bp and 89,266 bp, respectively. The gene content and order of Ampelopsis is identical to many other unrearranged angiosperm chloroplast genomes, including Vitis and tobacco. A phylogenetic tree constructed based on 70 protein-coding genes of 33 angiosperms showed that both Saxifragales and Vitaceae diverged from the rosid clade and formed two clades with 100% bootstrap value. The position of the Vitaceae is sister to Saxifragales, and both are the basal and earliest diverging lineages. Moreover, Saxifragales forms a sister clade to Vitaceae of rosids. Overall, the results of

  13. Direct Chloroplast Sequencing: Comparison of Sequencing Platforms and Analysis Tools for Whole Chloroplast Barcoding

    OpenAIRE

    Marta Brozynska; Agnelo Furtado; Robert James Henry

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genom...

  14. Congruent Deep Relationships in the Grape Family (Vitaceae) Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

    Directory of Open Access Journals (Sweden)

    Ning Zhang

    Full Text Available Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study, next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina HiSeq 2500 instrument. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data, especially from chloroplast and mitochondrial DNAs.

  15. The complete chloroplast genome sequence of Citrus sinensis (L.) Osbeck var 'Ridge Pineapple': organization and phylogenetic relationships to other angiosperms

    Directory of Open Access Journals (Sweden)

    Jansen Robert K

    2006-09-01

    Full Text Available Abstract Background The production of Citrus, the largest fruit crop of international economic value, has recently been imperiled due to the introduction of the bacterial disease Citrus canker. No significant improvements have been made to combat this disease by plant breeding and nuclear transgenic approaches. Chloroplast genetic engineering has a number of advantages over nuclear transformation; it not only increases transgene expression but also facilitates transgene containment, which is one of the major impediments for development of transgenic trees. We have sequenced the Citrus chloroplast genome to facilitate genetic improvement of this crop and to assess phylogenetic relationships among major lineages of angiosperms. Results The complete chloroplast genome sequence of Citrus sinensis is 160,129 bp in length, and contains 133 genes (89 protein-coding, 4 rRNAs and 30 distinct tRNAs). Genome organization is very similar to the inferred ancestral angiosperm chloroplast genome. However, in Citrus the infA gene is absent. The inverted repeat region has expanded to duplicate rps19 and the first 84 amino acids of rpl22. The rpl22 gene in the IRb region has a nonsense mutation resulting in 9 stop codons. This was confirmed by PCR amplification and sequencing using primers that flank the IR/LSC boundaries. Repeat analysis identified 29 direct and inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Comparison of protein-coding sequences with expressed sequence tags revealed six putative RNA edits, five of which resulted in non-synonymous modifications in petL, psbH, ycf2 and ndhA. Phylogenetic analyses using maximum parsimony (MP) and maximum likelihood (ML) methods of a dataset composed of 61 protein-coding genes for 30 taxa provide strong support for the monophyly of several major clades of angiosperms, including monocots, eudicots, rosids and asterids. The MP and ML trees are incongruent in three areas: the position of Amborella and

  16. Complete chloroplast genome sequence of green foxtail (Setaria viridis), a promising model system for C4 photosynthesis.

    Science.gov (United States)

    Wang, Shuo; Gao, Li-Zhi

    2016-09-01

    The complete chloroplast genome of green foxtail (Setaria viridis), a promising model system for C4 photosynthesis, is reported for the first time in this study. The genome harbors a large single copy (LSC) region of 81 016 bp and a small single copy (SSC) region of 12 456 bp separated by a pair of inverted repeat (IRa and IRb) regions of 22 315 bp. GC content is 38.92%. The proportion of coding sequence is 57.97%, comprising 111 unique genes (19 duplicated in the IR regions), 71 of which are protein-coding genes, four are rRNA genes, and 36 are tRNA genes. Phylogenetic analysis indicated that S. viridis clustered with its cultivated relative S. italica in the tribe Paniceae of the family Poaceae. This newly determined chloroplast genome will provide valuable genetic resources to assist future studies on C4 photosynthesis in grasses. PMID:26305916

  17. The complete chloroplast genome sequence of the medicinal plant Andrographis paniculata.

    Science.gov (United States)

    Ding, Ping; Shao, Yanhua; Li, Qian; Gao, Junli; Zhang, Runjing; Lai, Xiaoping; Wang, Deqin; Zhang, Huiye

    2016-07-01

    The complete chloroplast genome of Andrographis paniculata, an important medicinal plant with great economic value, has been studied in this article. The genome is 150,249 bp in length, with 38.3% GC content. A pair of inverted repeats (IRs, 25,300 bp) is separated by a large single-copy region (LSC, 82,459 bp) and a small single-copy region (SSC, 17,190 bp). The chloroplast genome contains 114 unique genes, including 80 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Among these genes, 15 contain 1 intron and 3 contain 2 introns. PMID:25856518

  18. A comparison of rice chloroplast genomes

    DEFF Research Database (Denmark)

    Tang, Jiabin; Xia, Hong'ai; Cao, Mengliang; Zhang, Xiuqing; Zeng, Wanyong; Hu, Songnian; Tong, Wei; Wang, Jun; Wang, Jian; Yu, Jun; Yang, Huanming; Zhu, Lihuang

    2004-01-01

    ), which are both parental varieties of the super-hybrid rice, LYP9. Based on the patterns of high sequence coverage, we partitioned chloroplast sequence variations into two classes, intravarietal and intersubspecific polymorphisms. Intravarietal polymorphisms refer to variations within 93-11 or PA64S...... intersubspecific polymorphisms. In our study, we found that the intersubspecific variations of 93-11 (indica) and PA64S (japonica) chloroplast genomes consisted of 72 single nucleotide polymorphisms and 27 insertions or deletions. The intersubspecific polymorphism rates between 93-11 and PA64S were 0.05% for...... single nucleotide polymorphisms and 0.02% for insertions or deletions, nearly 8 and 10 times lower than their respective nuclear genomes. Based on the total number of nucleotide substitutions between the two chloroplast genomes, we dated the divergence of indica and japonica chloroplast genomes as...

  19. Genomics and chloroplast evolution: what did cyanobacteria do for plants?

    OpenAIRE

    Raven, J.A.; Allen, John

    2003-01-01

    The complete genome sequences of cyanobacteria and of the higher plant Arabidopsis thaliana leave no doubt that the plant chloroplast originated, through endosymbiosis, from a cyanobacterium. But the genomic legacy of cyanobacterial ancestry extends far beyond the chloroplast itself, and persists in organisms that have lost chloroplasts completely.

  20. The chloroplast genome sequence of the green alga Leptosira terrestris: multiple losses of the inverted repeat and extensive genome rearrangements within the Trebouxiophyceae

    Directory of Open Access Journals (Sweden)

    Turmel Monique

    2007-07-01

    Full Text Available Abstract Background In the Chlorophyta – the green algal phylum comprising the classes Prasinophyceae, Ulvophyceae, Trebouxiophyceae and Chlorophyceae – the chloroplast genome displays a highly variable architecture. While chlorophycean chloroplast DNAs (cpDNAs) deviate considerably from the ancestral pattern described for the prasinophyte Nephroselmis olivacea, the degree of remodelling sustained by the two ulvophyte cpDNAs completely sequenced to date is intermediate relative to those observed for chlorophycean and trebouxiophyte cpDNAs. Chlorella vulgaris (Chlorellales) is currently the only photosynthetic trebouxiophyte whose complete cpDNA sequence has been reported. To gain insights into the evolutionary trends of the chloroplast genome in the Trebouxiophyceae, we sequenced cpDNA from the filamentous alga Leptosira terrestris (Ctenocladales). Results The 195,081-bp Leptosira chloroplast genome resembles the 150,613-bp Chlorella genome in lacking a large inverted repeat (IR) but differs greatly in gene order. Six of the conserved genes present in Chlorella cpDNA are missing from the Leptosira gene repertoire. The 106 conserved genes, four introns and 11 free standing open reading frames (ORFs) account for 48.3% of the genome sequence. This is the lowest gene density yet observed among chlorophyte cpDNAs. Contrary to the situation in Chlorella but similar to that in the chlorophycean Scenedesmus obliquus, the gene distribution is highly biased over the two DNA strands in Leptosira. Nine genes, compared to only three in Chlorella, have significantly expanded coding regions relative to their homologues in ancestral-type green algal cpDNAs. As observed in chlorophycean genomes, the rpoB gene is fragmented into two ORFs. Short repeats account for 5.1% of the Leptosira genome sequence and are present mainly in intergenic regions. Conclusion Our results highlight the great plasticity of the chloroplast genome in the Trebouxiophyceae and indicate

  1. Transcriptional Slippage and RNA Editing Increase the Diversity of Transcripts in Chloroplasts: Insight from Deep Sequencing of Vigna radiata Genome and Transcriptome.

    Directory of Open Access Journals (Sweden)

    Ching-Ping Lin

    Full Text Available We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R2 = 0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. It showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants.

  2. The complete nucleotide sequence of the coffee (Coffea arabica L.) chloroplast genome: organization and implications for biotechnology and phylogenetic relationships among angiosperms.

    Science.gov (United States)

    The chloroplast genome sequence of Coffea arabica L., first member of family Rubiaceae (fourth largest family of angiosperms) is reported. The genome is 155,189 bp in length, including a pair of inverted repeats of 25,943 bp, separated by a small single copy region of 18,137 bp and a large single co...

  3. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum).

    Directory of Open Access Journals (Sweden)

    Kwang-Soo Cho

    Full Text Available We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. The copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Coding sequences are highly conserved at the nucleotide and amino acid levels, with about 98% homology, and four genes (rpoC2, ycf3, accD, and clpP) have high synonymous (Ks) values. PCR-based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum.

  4. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum).

    Science.gov (United States)

    Cho, Kwang-Soo; Yun, Bong-Kyoung; Yoon, Young-Ho; Hong, Su-Young; Mekapogu, Manjulatha; Kim, Kyung-Hee; Yang, Tae-Jin

    2015-01-01

    We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. The copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Coding sequences are highly conserved at the nucleotide and amino acid levels, with about 98% homology, and four genes (rpoC2, ycf3, accD, and clpP) have high synonymous (Ks) values. PCR-based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum. PMID:25966355

  5. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Full Text Available Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  6. Chloroplast genome variation in upland and lowland switchgrass

    Science.gov (United States)

    Switchgrass (Panicum virgatum L.) exists at multiple ploidies and two phenotypically distinct ecotypes. To facilitate interploidal comparisons and to understand the extent of sequence variation within existing breeding pools, two complete switchgrass chloroplast genomes were sequenced from individu...

  7. The complete chloroplast genome of the Dendrobium strongylanthum (Orchidaceae: Epidendroideae).

    Science.gov (United States)

    Li, Jing; Chen, Chen; Wang, Zhe-Zhi

    2016-07-01

    A complete chloroplast genome sequence is very useful for studying the phylogeny and evolution of species. In this study, the complete chloroplast genome of Dendrobium strongylanthum was constructed from whole-genome Illumina sequencing data. The chloroplast genome is 153 058 bp in length with 37.6% GC content and consists of two inverted repeats (IRs) of 26 316 bp. The IR regions are separated by a large single-copy (LSC, 85 836 bp) region and a small single-copy (SSC, 14 590 bp) region. A total of 130 chloroplast genes were successfully annotated, including 84 protein coding genes, 38 tRNA genes, and eight rRNA genes. Phylogenetic analyses showed that the chloroplast genome of Dendrobium strongylanthum is related to that of Dendrobium officinale. PMID:26153739

  8. The complete chloroplast genome of Capsicum frutescens (Solanaceae)

    OpenAIRE

    Shim, Donghwan; Raveendar, Sebastin; Lee, Jung-Ro; Lee, Gi-An; Ro, Na-Young; Jeon, Young-Ah; Cho, Gyu-Taek; Lee, Ho-Sun; Ma, Kyung-Ho; Chung, Jong-Wook

    2016-01-01

    Premise of the study: We report the complete sequence of the chloroplast genome of Capsicum frutescens (Solanaceae), a species of chili pepper. Methods and Results: Using an Illumina platform, we sequenced the chloroplast genome of C. frutescens. The total length of the genome is 156,817 bp, and the overall GC content is 37.7%. A pair of 25,792-bp inverted repeats is separated by small (17,853 bp) and large (87,380 bp) single-copy regions. The C. frutescens chloroplast genome encodes 132 uniq...

  9. Chloroplast DNA sequence of the green alga Oedogonium cardiacum (Chlorophyceae): Unique genome architecture, derived characters shared with the Chaetophorales and novel genes acquired through horizontal transfer

    Directory of Open Access Journals (Sweden)

    Lemieux Claude

    2008-06-01

    Full Text Available Abstract Background To gain insight into the branching order of the five main lineages currently recognized in the green algal class Chlorophyceae and to expand our understanding of chloroplast genome evolution, we have undertaken the sequencing of chloroplast DNA (cpDNA) from representative taxa. The complete cpDNA sequences previously reported for Chlamydomonas (Chlamydomonadales), Scenedesmus (Sphaeropleales), and Stigeoclonium (Chaetophorales) revealed tremendous variability in their architecture, the retention of only few ancestral gene clusters, and derived clusters shared by Chlamydomonas and Scenedesmus. Unexpectedly, our recent phylogenies inferred from these cpDNAs and the partial sequences of three other chlorophycean cpDNAs disclosed two major clades, one uniting the Chlamydomonadales and Sphaeropleales (CS clade) and the other uniting the Oedogoniales, Chaetophorales and Chaetopeltidales (OCC clade). Although molecular signatures provided strong support for this dichotomy and for the branching of the Oedogoniales as the earliest-diverging lineage of the OCC clade, more data are required to validate these phylogenies. We describe here the complete cpDNA sequence of Oedogonium cardiacum (Oedogoniales). Results Like its three chlorophycean homologues, the 196,547-bp Oedogonium chloroplast genome displays a distinctive architecture. This genome is one of the most compact among photosynthetic chlorophytes. It has an atypical quadripartite structure, is intron-rich (17 group I and 4 group II introns), and displays 99 different conserved genes and four long open reading frames (ORFs), three of which are clustered in the spacious inverted repeat of 35,493 bp. Intriguingly, two of these ORFs (int and dpoB) revealed high similarities to genes not usually found in cpDNA. At the gene content and gene order levels, the Oedogonium genome most closely resembles its Stigeoclonium counterpart. Characters shared by these chlorophyceans but missing in members

  10. Complete chloroplast genome sequences of Drimys, Liriodendron, and Piper: Implications for the phylogeny of magnoliids and the evolution of GC content

    Energy Technology Data Exchange (ETDEWEB)

    Zhengqiu, C.; Penaflor, C.; Kuehl, J.V.; Leebens-Mack, J.; Carlson, J.; dePamphilis, C.W.; Boore, J.L.; Jansen, R.K.

    2006-06-01

    The magnoliids represent the largest basal angiosperm clade with four orders, 19 families and 8,500 species. Although several recent angiosperm molecular phylogenies have supported the monophyly of magnoliids and suggested relationships among the orders, the limited number of genes examined resulted in only weak support, and these issues remain controversial. Furthermore, considerable incongruence has resulted in phylogenies supporting three different sets of relationships among magnoliids and the two large angiosperm clades, monocots and eudicots. This is one of the most important remaining issues concerning relationships among basal angiosperms. We sequenced the chloroplast genomes of three magnoliids, Drimys (Canellales), Liriodendron (Magnoliales), and Piper (Piperales), and used these data in combination with 32 other completed angiosperm chloroplast genomes to assess phylogenetic relationships among magnoliids. The Drimys and Piper chloroplast genomes are nearly identical in size at 160,606 and 160,624 bp, respectively. The genomes include a pair of inverted repeats of 26,649 bp (Drimys) and 27,039 (Piper), separated by a small single copy region of 18,621 (Drimys) and 18,878 (Piper) and a large single copy region of 88,685 bp (Drimys) and 87,666 bp (Piper). The gene order of both taxa is nearly identical to many other unrearranged angiosperm chloroplast genomes, including Calycanthus, the other published magnoliid genome. Comparisons of angiosperm chloroplast genomes indicate that GC content is not uniformly distributed across the genome. Overall GC content ranges from 34-39%, and coding regions have a substantially higher GC content than non-coding regions (both intergenic spacers and introns). Among protein-coding genes, GC content varies by codon position with 1st codon > 2nd codon > 3rd codon, and it varies by functional group with photosynthetic genes having the highest percentage and NADH genes the lowest. Across the genome, GC content is highest in

  11. Local repeat sequence organization of an intergenic spacer in the chloroplast genome of Chlamydomonas reinhardtii leads to DNA expansion and sequence scrambling: a complex mode of “copy-choice replication”?

    Indian Academy of Sciences (India)

    Mahendra D Wagle; Subhojit Sen; Basuthkar J Rao

    2001-12-01

    Parent-specific, randomly amplified polymorphic DNA (RAPD) markers were obtained from total genomic DNA of Chlamydomonas reinhardtii. Such parent-specific RAPD bands (genomic fingerprints) segregated uniparentally (through mt+) in a cross between a pair of polymorphic interfertile strains of Chlamydomonas (C. reinhardtii and C. minnesotti), suggesting that they originated from the chloroplast genome. Southern analysis mapped the RAPD-markers to the chloroplast genome. One of the RAPD-markers, "P2" (1.6 kb) was cloned, sequenced and was fine mapped to the 3 kb region encompassing 3′ end of 23S, full 5S and intergenic region between 5S and psbA. This region seems divergent enough between the two parents, such that a specific PCR designed for a parental specific chloroplast sequence within this region, amplified a marker in that parent only and not in the other, indicating the utility of RAPD-scan for locating the genomic regions of sequence divergence. Remarkably, the RAPD-product, "P2" seems to have originated from a PCR-amplification of a much smaller (about 600 bp), but highly repeat-rich (direct and inverted) domain of the 3 kb region in a manner that yielded no linear sequence alignment with its own template sequence. The amplification yielded the same uniquely "sequence-scrambled" product, whether the template used for PCR was total cellular DNA, chloroplast DNA or a plasmid clone DNA corresponding to that region. The PCR product, a "unique" new sequence, had lost the repetitive organization of the template genome where it had originated from and perhaps represented a "complex path" of copy-choice replication.

  12. The complete chloroplast genome of Capsicum frutescens (Solanaceae)

    Science.gov (United States)

    Shim, Donghwan; Raveendar, Sebastin; Lee, Jung-Ro; Lee, Gi-An; Ro, Na-Young; Jeon, Young-Ah; Cho, Gyu-Taek; Lee, Ho-Sun; Ma, Kyung-Ho; Chung, Jong-Wook

    2016-01-01

    Premise of the study: We report the complete sequence of the chloroplast genome of Capsicum frutescens (Solanaceae), a species of chili pepper. Methods and Results: Using an Illumina platform, we sequenced the chloroplast genome of C. frutescens. The total length of the genome is 156,817 bp, and the overall GC content is 37.7%. A pair of 25,792-bp inverted repeats is separated by small (17,853 bp) and large (87,380 bp) single-copy regions. The C. frutescens chloroplast genome encodes 132 unique genes, including 87 protein-coding genes, 37 transfer RNA (tRNA) genes, and eight ribosomal RNA (rRNA) genes. Of these, seven genes are duplicated in the inverted repeats and 12 genes contain one or two introns. Comparative analysis with the reference chloroplast genome revealed 125 simple sequence repeat motifs and 34 variants, mostly located in the noncoding regions. Conclusions: The complete chloroplast genome sequence of C. frutescens reported here is a valuable genetic resource for Capsicum species. PMID:27213127
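    The region sizes in this record can be checked against the reported total, since a quadripartite chloroplast genome consists of one large single-copy region, one small single-copy region, and two copies of the inverted repeat. The following is a minimal sketch of that arithmetic; the helper name `quadripartite_length` is hypothetical and introduced here for illustration only, not taken from the paper.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported in the C. frutescens abstract above.
reported_total = 156_817
computed_total = quadripartite_length(lsc=87_380, ssc=17_853, ir=25_792)

# Prints the computed length and whether it matches the reported total.
print(computed_total, computed_total == reported_total)
```

    The same check can be applied to other records in this listing that report all four region sizes; a mismatch would suggest a typographical error in the abstract rather than in the genome itself.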

  13. The complete chloroplast genome of Schrenkiella parvula (Brassicaceae).

    Science.gov (United States)

    He, Qi; Hao, Guoqian; Wang, Xiaojuan; Bi, Hao; Li, Yuanshuo; Guo, Xinyi; Ma, Tao

    2016-09-01

    Schrenkiella parvula is an Arabidopsis-related model species used here for studying plant stress tolerance. In this study, the complete chloroplast genome sequence of S. parvula has been reported for the first time. The total length of the chloroplast genome was 153 979 bp, which had a typical quadripartite structure. The annotated plastid genome includes 87 protein-coding genes, 39 tRNA genes and 8 ribosomal RNA genes. The evolutionary relationships revealed by our phylogenetic analysis indicated that S. parvula is closer to the Brassiceae species when compared with Eutrema salsugineum. PMID:26260181

  14. Dynamics of chloroplast genomes in green plants.

    Science.gov (United States)

    Xu, Jian-Hong; Liu, Qiuxiang; Hu, Wangxiong; Wang, Tingzhang; Xue, Qingzhong; Messing, Joachim

    2015-10-01

    Chloroplasts are essential organelles whose genes have been widely used in the phylogenetic analysis of green plants. Here, we took advantage of the breadth of species with sequenced plastid genomes (cpDNAs) to investigate their dynamic changes. Our study showed that gene rearrangements occurred more frequently in the cpDNAs of green algae than in land plants. Phylogenetic trees were generated using 55 conserved protein-coding genes including 33 genes for photosynthesis, 16 ribosomal protein genes and 6 other genes, which supported the monophyletic evolution of vascular plants, land plants, seed plants, and angiosperms. Moreover, we could show that seed plants were more closely related to bryophytes than to pteridophytes. Furthermore, the substitution rate for cpDNA genes was calculated to be 3.3×10⁻¹⁰, which was almost 10 times lower than that of nuclear genes, probably because of the plastid homologous recombination machinery. PMID:26206079

  15. Development of chloroplast genomic resources for Cynara.

    Science.gov (United States)

    Curci, Pasquale L; De Paola, Domenico; Sonnante, Gabriella

    2016-03-01

    In this study, new chloroplast (cp) resources were developed for the genus Cynara, using whole cp genomes from 20 genotypes, by means of high-throughput sequencing technologies. Our target species included seven globe artichokes, two cultivated cardoons, eight wild artichokes, and three other wild Cynara species (C. baetica, C. cornigera and C. syriaca). One complete cp genome was isolated using short reads from a whole-genome sequencing project, while the others were obtained by means of long-range PCR, for which primer pairs are provided here. A de novo assembly strategy combined with a reference-based assembly allowed us to reconstruct each cp genome. Comparative analyses among the newly sequenced genotypes and two additional Cynara cp genomes ('Brindisino' artichoke and C. humilis) retrieved from public databases revealed 126 parsimony informative characters and 258 singletons in Cynara, for a total of 384 variable characters. Thirty-nine SSR loci and 34 other INDEL events were detected. After data analysis, 37 primer pairs for SSR amplification were designed, and these molecular markers were subsequently validated in our Cynara genotypes. Phylogenetic analysis based on all cp variable characters provided the best resolution when compared to what was observed using only parsimony informative characters, or only short 'variable' cp regions. The evaluation of the molecular resources obtained from this study led us to support the 'super-barcode' theory and consider the total cp sequence of Cynara as a reliable and valuable molecular marker for exploring species diversity and examining variation below the species level. PMID:26354522

  16. The complete chloroplast genome of North American ginseng, Panax quinquefolius.

    Science.gov (United States)

    Han, Zeng-Jie; Li, Wei; Liu, Yuan; Gao, Li-Zhi

    2016-09-01

    We report the complete nucleotide sequence of the Panax quinquefolius chloroplast genome, obtained using next-generation sequencing technology. The genome size is 156 359 bp, including two inverted repeats (IRs) of 52 153 bp, separated by the large single-copy (LSC 86 184 bp) and small single-copy (SSC 18 081 bp) regions. This cp genome encodes 114 unigenes (80 protein-coding genes, four rRNA genes, and 30 tRNA genes), of which 18 are duplicated in the IR regions. Overall GC content of the genome is 38.08%. A phylogenomic analysis of the 10 complete chloroplast genomes from Araliaceae using Daucus carota from Apiaceae as outgroup showed that P. quinquefolius is closely related to the other two members of the genus Panax, P. ginseng and P. notoginseng. PMID:27158867

  17. The complete chloroplast genome of Torreya fargesii (Taxaceae).

    Science.gov (United States)

    Tao, Ke; Gao, Lei; Li, Jia; Chen, Shanshan; Su, Yingjuan; Wang, Ting

    2016-09-01

    The complete chloroplast genome sequence of Torreya fargesii (Taxaceae), a relic plant endemic to China, is presented in this study. The genome is 137 075 bp in length, with 35.47% average GC content. One copy of the large inverted repeats is lost from this genome. The T. fargesii chloroplast genome encodes 118 unique genes, in which trnI-CAU, trnQ-UUG, trnN-GUU are duplicated. Protein-coding, tRNA and rRNA genes represent 54.7%, 1.9% and 3.4% of the genome, respectively. There are 17 intron-containing genes, of which 6 are tRNA genes. A maximum likelihood phylogenetic analysis revealed a strong sister relationship between Torreya and Amentotaxus. PMID:27158868

  18. The complete chloroplast genome of Eleutherococcus gracilistylus (W.W.Sm.) S.Y.Hu (Araliaceae).

    Science.gov (United States)

    Kim, Kyunghee; Lee, Junki; Lee, Sang-Choon; Kim, Nam-Hoon; Jang, Woojong; Kim, Soonok; Sung, Sangmin; Lee, Jungho; Yang, Tae-Jin

    2016-09-01

    Eleutherococcus gracilistylus is a plant species that is close to E. senticosus, a famous medicinal plant called Siberian ginseng. The complete chloroplast genome sequence of E. gracilistylus was determined by de novo assembly using whole genome next generation sequences. The chloroplast genome of E. gracilistylus was 156 770 bp long and showed a distinct quadripartite structure comprising a large single copy region of 86 729 bp, a small single copy region of 18 175 bp, and a pair of inverted repeat regions of 25 933 bp each. The overall GC content of the genome sequence was 36.8%. The chloroplast genome of E. gracilistylus contains 79 protein-coding sequences, 30 tRNA genes, and four rRNA genes. The phylogenetic analysis with the reported chloroplast genomes confirmed the close taxonomic relationship of E. gracilistylus with E. senticosus. PMID:26358682

  19. Nucleotide sequence of a Euglena gracilis chloroplast genome region coding for the elongation factor Tu; evidence for a spliced mRNA.

    OpenAIRE

    Montandon, P E; Stutz, E

    1983-01-01

    We characterize a 1.95 kb transcription product of the Euglena gracilis chloroplast DNA fragment Eco-N + Q by S1 nuclease analysis and DNA sequencing and show that it is the product of three splicing events. Exon 1 (0.45 kb), exon 2 (0.74 kb) and 175 nucleotides of exon 3 (0.53 kb) code for the chloroplast elongation factor protein (EF-Tu). The remaining part of exon 3 and exon 4 (0.23 kb) have unidentified open reading frames. The chloroplast EF-Tu protein has 408 amino acids and is to 70% ho...

  20. The Complete Chloroplast Genome of Banana (Musa acuminata, Zingiberales): Insight into Plastid Monocotyledon Evolution

    OpenAIRE

    Guillaume Martin; Franc-Christophe Baurens; Céline Cardi; Jean-Marc Aury; Angélique D'Hont

    2013-01-01

    BACKGROUND: Banana (genus Musa) is a crop of major economic importance worldwide. It is a monocotyledonous member of the Zingiberales, a sister group of the widely studied Poales. Most cultivated bananas are natural Musa inter-(sub-)specific triploid hybrids. A Musa acuminata reference nuclear genome sequence was recently produced based on sequencing of genomic DNA enriched in nucleus. METHODOLOGY/PRINCIPAL FINDINGS: The Musa acuminata chloroplast genome was assembled with chloroplast reads e...

  1. The complete chloroplast genome of banana (Musa acuminata, Zingiberales): insight into plastid monocotyledon evolution.

    Directory of Open Access Journals (Sweden)

    Guillaume Martin

    BACKGROUND: Banana (genus Musa) is a crop of major economic importance worldwide. It is a monocotyledonous member of the Zingiberales, a sister group of the widely studied Poales. Most cultivated bananas are natural Musa inter-(sub-)specific triploid hybrids. A Musa acuminata reference nuclear genome sequence was recently produced based on sequencing of genomic DNA enriched in nucleus. METHODOLOGY/PRINCIPAL FINDINGS: The Musa acuminata chloroplast genome was assembled with chloroplast reads extracted from whole-genome-shotgun sequence data. The Musa chloroplast genome is a circular molecule of 169,972 bp with a quadripartite structure containing two single copy regions, a Large Single Copy region (LSC, 88,338 bp) and a Small Single Copy region (SSC, 10,768 bp) separated by Inverted Repeat regions (IRs, 35,433 bp). Two forms of the chloroplast genome relative to the orientation of SSC versus LSC were found. The Musa chloroplast genome shows an extreme IR expansion at the IR/SSC boundary relative to the most common structures found in angiosperms. This expansion consists of the integration of three additional complete genes (rps15, ndhH and ycf1) and part of the ndhA gene. No such expansion has been observed in monocots so far. Simple Sequence Repeats were identified in the Musa chloroplast genome and a new set of Musa chloroplastic markers was designed. CONCLUSION: The complete sequence of M. acuminata ssp malaccensis chloroplast we reported here is the first one for the Zingiberales order. As such it provides new insight in the evolution of the chloroplast of monocotyledons. In particular, it reinforces that IR/SSC expansion has occurred independently several times within monocotyledons. The discovery of new polymorphic markers within Musa chloroplast opens new perspectives to better understand the origin of cultivated triploid bananas.

  2. The chloroplast genome of a Symbiodinium sp. clade C3 isolate

    KAUST Repository

    Barbrook, Adrian C.

    2014-01-01

    Dinoflagellate algae of the genus Symbiodinium form important symbioses within corals and other benthic marine animals. Dinoflagellates possess an extremely reduced plastid genome relative to those examined in plants and other algae. In dinoflagellates the plastid genes are located on small plasmids, commonly referred to as 'minicircles'. However, the chloroplast genomes of dinoflagellates have only been extensively characterised from a handful of species. There is also evidence of considerable variation in the chloroplast genome organisation across those species that have been examined. We therefore characterised the chloroplast genome from an environmental coral isolate, in this case containing a symbiont belonging to the Symbiodinium sp. clade C3. The gene content of the genome is well conserved with respect to previously characterised genomes. However, unlike previously characterised dinoflagellate chloroplast genomes we did not identify any 'empty' minicircles. The sequences of this chloroplast genome show a high rate of evolution relative to other algal species. Particularly notable was a surprisingly high level of sequence divergence within the core polypeptides of photosystem I, the reasons for which are currently unknown. This chloroplast genome also possesses distinctive codon usage and GC content. These features suggest that chloroplast genomes in Symbiodinium are highly plastic. © 2013 Adrian C. Barbrook.

  3. Comparative analyses of chloroplast genome data representing nine green algae in Sphaeropleales (Chlorophyceae, Chlorophyta).

    Science.gov (United States)

    Fučíková, Karolina; Lewis, Louise A; Lewis, Paul O

    2016-06-01

    The chloroplast genomes of green algae are highly variable in their architecture. In this article we summarize gene content across newly obtained and published chloroplast genomes in Chlorophyceae, including new data from nine species in Sphaeropleales (Chlorophyceae, Chlorophyta). We present genome architecture information, including genome synteny analysis across two groups of species. Also, we provide a phylogenetic tree obtained from analysis of gene order data for species in Chlorophyceae with fully sequenced chloroplast genomes. Further analyses and interpretation of the data can be found in "Chloroplast phylogenomic data from the green algal order Sphaeropleales (Chlorophyceae, Chlorophyta) reveal complex patterns of sequence evolution" (Fučíková et al., In review) [1]. PMID:27054159

  4. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  5. Insights from the complete chloroplast genome into the evolution of Sesamum indicum L.

    Directory of Open Access Journals (Sweden)

    Haiyang Zhang

    Sesame (Sesamum indicum L.) is one of the oldest oilseed crops. In order to investigate the evolutionary characters according to the Sesame Genome Project, apart from sequencing its nuclear genome, we sequenced the complete chloroplast genome of S. indicum cv. Yuzhi 11 (white seeded) using Illumina and 454 sequencing. Comparisons of chloroplast genomes between S. indicum and the 18 other higher plants were then analyzed. The chloroplast genome of cv. Yuzhi 11 contains 153,338 bp and a total of 114 unique genes (KC569603). The number of chloroplast genes in sesame is the same as that in Nicotiana tabacum, Vitis vinifera and Platanus occidentalis. The variation in the length of the large single-copy (LSC) regions and inverted repeats (IR) in sesame compared to 18 other higher plant species was the main contributor to size variation in the cp genome in these species. The 77 functional chloroplast genes, except for ycf1 and ycf2, were highly conserved. The deletion of the cp ycf1 gene sequence in cp genomes may be due either to its transfer to the nuclear genome, as has occurred in sesame, or direct deletion, as has occurred in Panax ginseng and Cucumis sativus. The sesame ycf2 gene is only 5,721 bp in length and has lost about 1,179 bp. Nucleotides 1-585 of ycf2 when queried in BLAST had hits in the sesame draft genome. Five repeats (R10, R12, R13, R14 and R17) were unique to the sesame chloroplast genome. We also found that IR contraction/expansion in the cp genome alters its rate of evolution. Chloroplast genes and repeats display the signature of convergent evolution in sesame and other species. These findings provide a foundation for further investigation of cp genome evolution in Sesamum and other higher plants.

  6. Complete Chloroplast and Mitochondrial Genome Sequences of the Hydrocarbon Oil-Producing Green Microalga Botryococcus braunii Race B (Showa)

    Science.gov (United States)

    Blifernez-Klassen, Olga; Wibberg, Daniel; Winkler, Anika; Blom, Jochen; Goesmann, Alexander; Kalinowski, Jörn

    2016-01-01

    The green alga Botryococcus braunii is capable of the production and excretion of high quantities of long-chain hydrocarbons and exopolysaccharides. In this study, we present the complete plastid and mitochondrial genomes of the hydrocarbon-producing microalga Botryococcus braunii race B (Showa), with a total length of 156,498 and 129,356 bp, respectively. PMID:27284138

  7. The complete chloroplast genome of Cinnamomum kanehirae Hayata (Lauraceae).

    Science.gov (United States)

    Wu, Chia-Chen; Ho, Cheng-Kuen; Chang, Shu-Hwa

    2016-07-01

    The complete chloroplast genome of Cinnamomum kanehirae (Hayata), the first in the Lauraceae family to be completely sequenced, is presented in this study. The total genome size is 152,700 bp, with a typical circular structure including a pair of inverted repeats (IRa/b) of 20,107 bp each, separated by a large single-copy region (LSC) of 93,642 bp and a small single-copy region (SSC) of 18,844 bp. The overall GC content of the genome is 39.1%. The nucleotide sequence shows 91% identity with that of Liriodendron tulipifera in the Magnoliaceae. In total, the 123 annotated genes consist of 79 coding genes, eight rRNA genes, and 36 tRNA genes. Among all 79 coding genes, seven genes (rpoC1, atpF, rpl2, ndhB, ndhA, rps16, and rpl2) contain one intron, while two genes (ycf3 and clpP) contain two introns. The maximum likelihood phylogenetic analysis revealed that the C. kanehirae chloroplast genome is closely related to that of Calycanthus fertilis within the order Laurales. PMID:26053940

  8. Comparative chloroplast genomics and phylogenetics of Fagopyrum esculentum ssp. ancestrale – A wild ancestor of cultivated buckwheat

    Directory of Open Access Journals (Sweden)

    Dhingra Amit

    2008-05-01

    BACKGROUND: Chloroplast genome sequences are extremely informative about species-interrelationships owing to their non-meiotic and often uniparental inheritance over generations. The subject of our study, Fagopyrum esculentum, is a member of the family Polygonaceae belonging to the order Caryophyllales. An uncertainty remains regarding the affinity of Caryophyllales and the asterids that could be due to undersampling of the taxa. With that background, having access to the complete chloroplast genome sequence for Fagopyrum becomes quite pertinent. RESULTS: We report the complete chloroplast genome sequence of a wild ancestor of cultivated buckwheat, Fagopyrum esculentum ssp. ancestrale. The sequence was rapidly determined using a previously described approach that utilized a PCR-based method and employed universal primers, designed on the scaffold of multiple sequence alignment of chloroplast genomes. The gene content and order in the buckwheat chloroplast genome is similar to Spinacia oleracea. However, some unique structural differences exist: the presence of an intron in the rpl2 gene, a frameshift mutation in the rpl23 gene and extension of the inverted repeat region to include the ycf1 gene. Phylogenetic analysis of 61 protein-coding gene sequences from 44 complete plastid genomes provided strong support for the sister relationships of Caryophyllales (including Polygonaceae) to asterids. Further, our analysis also provided support for Amborella as sister to all other angiosperms, but interestingly, in the Bayesian phylogeny inference based on first two codon positions Amborella united with Nymphaeales. CONCLUSION: Comparative genomics analyses revealed that the Fagopyrum chloroplast genome harbors the characteristic gene content and organization as has been described for several other chloroplast genomes. However, it has some unique structural features distinct from previously reported complete chloroplast genome sequences. Phylogenetic

  9. Chloroplast phylogenomic data from the green algal order Sphaeropleales (Chlorophyceae, Chlorophyta) reveal complex patterns of sequence evolution.

    Science.gov (United States)

    Fučíková, Karolina; Lewis, Paul O; Lewis, Louise A

    2016-05-01

    Chloroplast sequence data are widely used to infer phylogenies of plants and algae. With the increasing availability of complete chloroplast genome sequences, the opportunity arises to resolve ancient divergences that were heretofore problematic. On the flip side, properly analyzing large multi-gene data sets can be a major challenge, as these data may be riddled with systematic biases and conflicting signals. Our study contributes new data from nine complete and four fragmentary chloroplast genome sequences across the green algal order Sphaeropleales. Our phylogenetic analyses of a 56-gene data set show that analyzing these data on a nucleotide level yields a well-supported phylogeny - yet one that is quite different from a corresponding amino acid analysis. We offer some possible explanations for this conflict through a range of analyses of modified data sets. In addition, we characterize the newly sequenced genomes in terms of their structure and content, thereby further contributing to the knowledge of chloroplast genome evolution. PMID:26903036

  10. Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution.

    Directory of Open Access Journals (Sweden)

    Jia-Yee S Yap

    The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine.

  11. Whole Genome Sequencing

    Science.gov (United States)

    What is whole genome sequencing? Whole genome sequencing is the mapping out ...

  12. The complete chloroplast genome of Gracilariopsis lemaneiformis (Rhodophyta) gives new insight into the evolution of family Gracilariaceae.

    Science.gov (United States)

    Du, Qingwei; Bi, Guiqi; Mao, Yunxiang; Sui, Zhenghong

    2016-06-01

    The complete chloroplast genome of Gracilariopsis lemaneiformis was recovered from a Next Generation Sequencing data set. Without quadripartite structure, this chloroplast genome (183,013 bp, 27.40% GC content) contains 202 protein-coding genes, 34 tRNA genes, 3 rRNA genes, and 1 tmRNA gene. Synteny analysis showed plasmid incorporation regions in chloroplast genomes of three species of family Gracilariaceae and in Grateloupia taiwanensis of family Halymeniaceae. Combined with reported red algal plasmid sequences in nuclear and mitochondrial genomes, we postulated that red algal plasmids may have played an important role in ancient horizontal gene transfer among nuclear, chloroplast, and mitochondrial genomes. Substitution rate analysis showed that purifying selective forces maintaining stability of protein-coding genes of nine red algal chloroplast genomes over long periods must be strong and that the forces acting on gene groups and single genes of nine red algal chloroplast genomes were similar and consistent. The divergence of Gp. lemaneiformis occurred ~447.98 million years ago (Mya), close to the divergence time of genus Pyropia and Porphyra (443.62 Mya). PMID:27273536

  13. Sequence evidence for the symbiotic origins of chloroplasts and mitochondria

    Science.gov (United States)

    George, D. G.; Hunt, L. T.; Dayhoff, M. O.

    1983-01-01

    The origin of mitochondria and chloroplasts is investigated on the basis of prokaryotic and early-eukaryotic evolutionary trees derived from protein and nucleic-acid sequences by the method of Dayhoff (1979). Trees for bacterial ferredoxins, 5S ribosomal RNA, c-type cytochromes, the lipid-binding subunit of ATPase, and dihydrofolate reductase are presented and discussed. Good agreement among the trees is found, and it is argued that the mitochondria and chloroplasts evolved by multiple symbiotic events.

  14. Combined analysis of the chloroplast genome and transcriptome of the Antarctic vascular plant Deschampsia antarctica Desv.

    Directory of Open Access Journals (Sweden)

    Jungeun Lee

    BACKGROUND: Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been researched as an important ecological marker and as an extremophile plant for studies on stress tolerance. Despite its importance, little genomic information is available for D. antarctica. Here, we report the complete chloroplast genome, transcriptome profiles of the coding/noncoding genes, and the posttranscriptional processing by RNA editing in the chloroplast system. RESULTS: The complete chloroplast genome of D. antarctica is 135,362 bp in length with a typical quadripartite structure, including the large (LSC: 79,881 bp) and small (SSC: 12,519 bp) single-copy regions, separated by a pair of identical inverted repeats (IR: 21,481 bp). It contains 114 unique genes, including 81 unique protein-coding genes, 29 tRNA genes, and 4 rRNA genes. Sequence divergence analysis with other plastomes from the BEP clade of the grass family suggests a sister relationship between D. antarctica, Festuca arundinacea and Lolium perenne of the Poeae tribe, based on the whole plastome. In addition, we conducted high-resolution mapping of the chloroplast-derived transcripts. Thus, we created an expression profile for 81 protein-coding genes and identified ndhC, psbJ, rps19, psaJ, and psbA as the most highly expressed chloroplast genes. Small RNA-seq analysis identified 27 small noncoding RNAs of chloroplast origin that were preferentially located near the 5'- or 3'-ends of genes. We also found >30 RNA-editing sites in the D. antarctica chloroplast genome, with a dominance of C-to-U conversions. CONCLUSIONS: We assembled and characterized the complete chloroplast genome sequence of D. antarctica and investigated the features of the plastid transcriptome. These data may contribute to a better understanding of the evolution of D. antarctica within the Poaceae family for use in molecular phylogenetic studies and may also help researchers

  15. The complete chloroplast and mitochondrial genomes of the green macroalga Ulva sp. UNA00071828 (Ulvophyceae, Chlorophyta).

    Directory of Open Access Journals (Sweden)

    James T Melton

    Sequencing mitochondrial and chloroplast genomes has become an integral part in understanding the genomic machinery and the phylogenetic histories of green algae. Previously, only three chloroplast genomes (Oltmannsiellopsis viridis, Pseudendoclonium akinetum, and Bryopsis hypnoides) and two mitochondrial genomes (O. viridis and P. akinetum) from the class Ulvophyceae have been published. Here, we present the first chloroplast and mitochondrial genomes from the ecologically and economically important marine, green algal genus Ulva. The chloroplast genome of Ulva sp. was 99,983 bp in a circular-mapping molecule that lacked inverted repeats, and thus far, was the smallest ulvophycean plastid genome. This cpDNA was a highly compact, AT-rich genome that contained a total of 102 identified genes (71 protein-coding genes, 28 tRNA genes, and three ribosomal RNA genes). Additionally, five introns were annotated in four genes: atpA (1), petB (1), psbB (2), and rrl (1). The circular-mapping mitochondrial genome of Ulva sp. was 73,493 bp and follows the expanded pattern also seen in other ulvophyceans and trebouxiophyceans. The Ulva sp. mtDNA contained 29 protein-coding genes, 25 tRNA genes, and two rRNA genes for a total of 56 identifiable genes. Ten introns were annotated in this mtDNA: cox1 (4), atp1 (1), nad3 (1), nad5 (1), and rrs (3). Double-cut-and-join (DCJ) values showed that organellar genomes across Chlorophyta are highly rearranged, in contrast to the highly conserved organellar genomes of the red algae (Rhodophyta). A phylogenomic investigation of 51 plastid protein-coding genes showed that Ulvophyceae is not monophyletic, and also placed Oltmannsiellopsis (Oltmannsiellopsidales) and Tetraselmis (Chlorodendrophyceae) closely to Ulva (Ulvales) and Pseudendoclonium (Ulothrichales).

  16. Localized hypermutation and associated gene losses in legume chloroplast genomes

    OpenAIRE

    KAVANAGH, THOMAS; WOLFE, KENNETH; POWELL, ANTOINETTE

    2010-01-01

    Point mutations result from errors made during DNA replication or repair, so they are usually expected to be homogeneous across all regions of a genome. However, we have found a region of chloroplast DNA in plants related to sweetpea (Lathyrus) whose local point mutation rate is at least 20 times higher than elsewhere in the same molecule. There are very few precedents for such heterogeneity in any genome, and we suspect that the hypermutable region may be subject to an unusual p...

  17. The complete chloroplast genome sequence of Citrus sinensis (L.) Osbeck var 'Ridge Pineapple': organization and phylogenetic relationships to other angiosperms

    OpenAIRE

    Jansen Robert K; Lee Seung-Bum; Singh Nameirakpam D; Bausher Michael G; Daniell Henry

    2006-01-01

    Background: The production of Citrus, the largest fruit crop of international economic value, has recently been imperiled due to the introduction of the bacterial disease Citrus canker. No significant improvements have been made to combat this disease by plant breeding and nuclear transgenic approaches. Chloroplast genetic engineering has a number of advantages over nuclear transformation; it not only increases transgene expression but also facilitates transgene containment, which is ...

  18. Nucleotide sequence of a spinach chloroplast valine tRNA.

    OpenAIRE

    Sprouse, H M; Kashdan, M; Otis, L; Dudock, B

    1981-01-01

    The nucleotide sequence of a spinach chloroplast valine tRNA (sp. chl. tRNA Val) has been determined. This tRNA shows essentially equal homology to prokaryotic valine tRNAs (58-65% homology) and to the mitochondrial valine tRNAs of lower eukaryotes (yeast and N. crassa, 61-62% homology). Sp. chl. tRNA Val shows distinctly lower homology to mouse mitochondrial valine tRNA (53% homology) and to eukaryotic cytoplasmic valine tRNAs (47-53% homology). Sp. chl. tRNA Val, like all other chloroplast ...

  19. A tiling microarray for global analysis of chloroplast genome expression in cucumber and other plants

    Directory of Open Access Journals (Sweden)

    Pląder Wojciech

    2011-09-01

    Plastids are small organelles equipped with their own genomes (plastomes). Although these organelles are involved in numerous plant metabolic pathways, current knowledge about the transcriptional activity of plastomes is limited. To solve this problem, we constructed a plastid tiling microarray (PlasTi-microarray) consisting of 1629 oligonucleotide probes. The oligonucleotides were designed based on the cucumber chloroplast genomic sequence and targeted both strands of the plastome in a non-contiguous arrangement. Up to 4 specific probes were designed for each gene/exon, and the intergenic regions were covered regularly, with 70-nt intervals. We also developed a protocol for direct chemical labeling and hybridization of as little as 2 micrograms of chloroplast RNA. We used this protocol for profiling the expression of the cucumber chloroplast plastome on the PlasTi-microarray. Owing to the high sequence similarity of plant plastomes, the newly constructed microarray can be used to study plants other than cucumber. Comparative hybridization of chloroplast transcriptomes from cucumber, Arabidopsis, tomato and spinach showed that the PlasTi-microarray is highly versatile.

  20. The chloroplast genome of the hexaploid Spartina maritima (Poaceae, Chloridoideae): Comparative analyses and molecular dating.

    Science.gov (United States)

    Rousseau-Gueutin, M; Bellot, S; Martin, G E; Boutte, J; Chelaifa, H; Lima, O; Michon-Coudouel, S; Naquin, D; Salmon, A; Ainouche, K; Ainouche, M

    2015-12-01

    The history of many plant lineages is complicated by reticulate evolution with cases of hybridization often followed by genome duplication (allopolyploidy). In such a context, the inference of phylogenetic relationships and biogeographic scenarios based on molecular data is easier using haploid markers like chloroplast genome sequences. Hybridization and polyploidization occurred recurrently in the genus Spartina (Poaceae, Chloridoideae), as illustrated by the recent formation of the invasive allododecaploid S. anglica during the 19th century in Europe. Until now, only a few plastid markers were available to explore the history of this genus and their low variability limited the resolution of species relationships. We sequenced the complete chloroplast genome (plastome) of S. maritima, the native European parent of S. anglica, and compared it to the plastomes of other Poaceae. Our analysis revealed the presence of fast-evolving regions of potential taxonomic, phylogeographic and phylogenetic utility at various levels within the Poaceae family. Using secondary calibrations, we show that the tetraploid and hexaploid lineages of Spartina diverged 6-10 my ago, and that the two parents of the invasive allopolyploid S. anglica separated 2-4 my ago via long distance dispersal of the ancestor of S. maritima over the Atlantic Ocean. Finally, we discuss the meaning of divergence times between chloroplast genomes in the context of reticulate evolution. PMID:26182838

  1. Complete chloroplast genome of Trachelium caeruleum: extensive rearrangements are associated with repeats and tRNAs

    Energy Technology Data Exchange (ETDEWEB)

    Haberle, Rosemarie C.; Fourcade, Matthew L.; Boore, Jeffrey L.; Jansen, Robert K.

    2006-01-09

    Chloroplast genome structure, gene order and content are highly conserved in land plants. We sequenced the complete chloroplast genome sequence of Trachelium caeruleum (Campanulaceae) a member of an angiosperm family known for highly rearranged chloroplast genomes. The total genome size is 162,321 bp with an IR of 27,273 bp, LSC of 100,113 bp and SSC of 7,661 bp. The genome encodes 115 unique genes, with 19 duplicated in the IR, a tRNA (trnI-CAU) duplicated once in the LSC and a protein coding gene (psbJ) duplicated twice, for a total of 137 genes. Four genes (ycf15, rpl23, infA and accD) are truncated and likely nonfunctional; three others (clpP, ycf1 and ycf2) are so highly diverged that they may now be pseudogenes. The most conspicuous feature of the Trachelium genome is the presence of eighteen internally unrearranged blocks of genes that have been inverted or relocated within the genome, relative to the typical gene order of most angiosperm chloroplast genomes. Recombination between repeats or tRNAs has been suggested as two means of chloroplast genome rearrangements. We compared the relative number of repeats in Trachelium to eight other angiosperm chloroplast genomes, and evaluated the location of repeats and tRNAs in relation to rearrangements. Trachelium has the highest number and largest repeats, which are concentrated near inversion endpoints or other rearrangements. tRNAs occur at many but not all inversion endpoints. There is likely no single mechanism responsible for the remarkable number of alterations in this genome, but both repeats and tRNAs are clearly associated with these rearrangements. Land plant chloroplast genomes are highly conserved in structure, gene order and content. The chloroplast genomes of ferns, the gymnosperm Ginkgo, and most angiosperms are nearly collinear, reflecting the gene order in lineages that diverged from lycopsids and the ancestral chloroplast gene order over 350 million years ago (Raubeson and Jansen, 1992). Although earlier mapping studies

  2. The complete chloroplast genomes of Cannabis sativa and Humulus lupulus.

    Science.gov (United States)

    Vergara, Daniela; White, Kristin H; Keepers, Kyle G; Kane, Nolan C

    2016-09-01

    Cannabis and Humulus are sister genera comprising the entirety of the Cannabaceae sensu stricto, including C. sativa L. (marijuana, hemp), and H. lupulus L. (hops) as two economically important crops. These two plants have been used by humans for many purposes including as a fiber, food, medicine, or inebriant in the case of C. sativa, and as a flavoring component in beer brewing in the case of H. lupulus. In this study, we report the complete chloroplast genomes for two distinct hemp varieties of C. sativa, Italian "Carmagnola" and Russian "Dagestani", and one Czech variety of H. lupulus "Saazer". Both C. sativa genomes are 153 871 bp in length, while the H. lupulus genome is 153 751 bp. The genomes from the two C. sativa varieties differ in 16 single nucleotide polymorphisms (SNPs), while the H. lupulus genome differs in 1722 SNPs from both C. sativa cultivars. PMID:26329384

  3. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae

    Directory of Open Access Journals (Sweden)

    Daniell Henry

    2010-04-01

    Background: Oncidium spp. produce commercially important orchid cut flowers. However, they are amenable to intergeneric and inter-specific crossing, making phylogenetic identification very difficult. Molecular markers derived from the chloroplast genome can provide useful tools for phylogenetic resolution. Results: The complete chloroplast genome of the economically important Oncidium variety Onc. Gower Ramsey (Accession no. GQ324949) was determined using polymerase chain reaction (PCR) and Sanger based ABI sequencing. The length of the Oncidium chloroplast genome is 146,484 bp. Genome structure, gene order and orientation are similar to Phalaenopsis, but differ from typical Poaceae, other monocots for which there are several published chloroplast (cp) genomes. The Onc. Gower Ramsey chloroplast-encoded NADH dehydrogenase (ndh) genes, except ndhE, lack apparent functions. Deletion and other types of mutations were also found in the ndh genes of 15 other economically important Oncidiinae varieties, except ndhE in some species. The positions of some species in the evolution and taxonomy of Oncidiinae are difficult to identify. To identify the relationships between the 15 Oncidiinae hybrids, eight regions of the Onc. Gower Ramsey chloroplast genome were amplified by PCR for phylogenetic analysis. A total of 7042 bp derived from the eight regions could identify the relationships at the species level, which were supported by high bootstrap values. One particular 1846 bp region, derived from two PCR products (trnHGUG-psbA and trnFGAA-ndhJ), was adequate for correct phylogenetic placement of 13 of the 15 varieties (with the exception of Degarmoara Flying High and Odontoglossum Violetta von Holm). Thus the chloroplast genome provides a useful molecular marker for species identifications. Conclusion: In this report, we used Phalaenopsis aphrodite as a prototype for primer design to complete the Onc. Gower Ramsey genome sequence. Gene annotation showed

  4. Comprehensive Survey of Genetic Diversity in Chloroplast Genomes and 45S nrDNAs within Panax ginseng Species

    OpenAIRE

    Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Lee, Hyun Oh; Joh, Ho Jun; Kim, Nam-Hoon; Park, Hyun-Seung; Yang, Tae-Jin

    2015-01-01

    We report complete sequences of the chloroplast (cp) genome and 45S nuclear ribosomal DNA (45S nrDNA) for 11 Panax ginseng cultivars. We have obtained complete sequences of cp and 45S nrDNA, the representative barcoding target sequences for cytoplasm and nuclear genome, respectively, based on low coverage NGS sequence of each cultivar. The cp genome sizes ranged from 156,241 to 156,425 bp and the major size variation was derived from differences in copy number of tandem repeats in the ycf1 gene ...

  5. Chloroplast gene sequences and the study of plant evolution.

    OpenAIRE

    Clegg, M T

    1993-01-01

    A large body of sequence data has accumulated for the chloroplast-encoded gene ribulose-1,5-biphosphate carboxylase/oxygenase (rbcL) as the result of a cooperative effort involving many laboratories. The data span all seed plants, including most major lineages from the angiosperms, and as such they provide an unprecedented opportunity to study plant evolutionary history. The full analysis of this large data set poses many problems and opportunities for plant evolutionary biologists and for bi...

  6. Towards resolving Lamiales relationships: insights from rapidly evolving chloroplast sequences

    Directory of Open Access Journals (Sweden)

    Heubl Günther

    2010-11-01

    Background: In the large angiosperm order Lamiales, a diverse array of highly specialized life strategies such as carnivory, parasitism, epiphytism, and desiccation tolerance occur, and some lineages possess drastically accelerated DNA substitutional rates or miniaturized genomes. However, understanding the evolution of these phenomena in the order, and clarifying borders of and relationships among lamialean families, has been hindered by largely unresolved trees in the past. Results: Our analysis of the rapidly evolving trnK/matK, trnL-F and rps16 chloroplast regions enabled us to infer more precise phylogenetic hypotheses for the Lamiales. Relationships among the nine first-branching families in the Lamiales tree are now resolved with very strong support. Subsequent to Plocospermataceae, a clade consisting of Carlemanniaceae plus Oleaceae branches, followed by Tetrachondraceae and a newly inferred clade composed of Gesneriaceae plus Calceolariaceae, which is also supported by morphological characters. Plantaginaceae (incl. Gratioleae) and Scrophulariaceae are well separated in the backbone grade; Lamiaceae and Verbenaceae appear in distant clades, while the recently described Linderniaceae are confirmed to be monophyletic and in an isolated position. Conclusions: Confidence about deep nodes of the Lamiales tree is an important step towards understanding the evolutionary diversification of a major clade of flowering plants. The degree of resolution obtained here now provides a first opportunity to discuss the evolution of morphological and biochemical traits in Lamiales. The multiple independent evolution of the carnivorous syndrome, once in Lentibulariaceae and a second time in Byblidaceae, is strongly supported by all analyses and topological tests. The evolution of selected morphological characters such as flower symmetry is discussed. The addition of further sequence data from introns and spacers holds promise to eventually obtain a

  7. The whole chloroplast genomes of two Eutrema species (Brassicaceae).

    Science.gov (United States)

    Hao, Guoqian; Bi, Hao; Li, Yuanshuo; He, Qi; Ma, Yazhen; Guo, Xinyi; Ma, Tao

    2016-09-01

    In this study, we determined the complete chloroplast genomes from two crucifer species of the Eutrema genus. The sizes of the two cp genomes were 153 948 bp (E. yunnanense) and 153 876 bp (E. heterophyllum). Both genomes have the typical quadripartite structure consisting of a large single copy region, a small single copy region and two inverted repeats. Gene contents and their relative positions of the 132 individual genes (87 protein-coding genes, eight rRNA, and 37 tRNA genes) of either genome were identical to each other. Phylogenetic analysis supports the idea that the currently recognized Eutrema genus is monophyletic and that E. salsugineum and Schrenkiella parvula evolved salt tolerance independently. PMID:26329763

  8. Chloroplast DNA rearrangements in Campanulaceae: phylogenetic utility of highly rearranged genomes

    Directory of Open Access Journals (Sweden)

    Jansen Robert K

    2004-08-01

    Background: The Campanulaceae (the "hare bell" or "bellflower" family) is a derived angiosperm family comprising about 600 species treated in 35 to 55 genera. Taxonomic treatments vary widely and little phylogenetic work has been done in the family. Gene order in the chloroplast genome usually varies little among vascular plants. However, chloroplast genomes of Campanulaceae represent an exception and phylogenetic analyses solely based on chloroplast rearrangement characters support a reasonably well-resolved tree. Results: Chloroplast DNA physical maps were constructed for eighteen representatives of the family. So many gene order changes have occurred among the genomes that characterizing individual mutational events was not always possible. Therefore, we examined different, novel scoring methods to prepare data matrices for cladistic analysis. These approaches yielded largely congruent results but varied in amounts of resolution and homoplasy. The strongly supported nodes were common to all gene order analyses as well as to parallel analyses based on ITS and rbcL sequence data. The results suggest some interesting and unexpected intrafamilial relationships. For example, fifteen of the taxa form a derived clade, whereas the remaining three taxa – Platycodon, Codonopsis, and Cyananthus – form the basal clade. This major subdivision of the family corresponds to the distribution of pollen morphology characteristics but is not compatible with previous taxonomic treatments. Conclusions: Our use of gene order data in the Campanulaceae provides the most highly resolved phylogeny as yet developed for a plant family using only cpDNA rearrangements. The gene order data showed markedly less homoplasy than sequence data for the same taxa but did not resolve quite as many nodes. The rearrangement characters, though relatively few in number, support robust and meaningful phylogenetic hypotheses and provide new insights into evolutionary

  9. A novel class of heat-responsive small RNAs derived from the chloroplast genome of Chinese cabbage (Brassica rapa)

    Directory of Open Access Journals (Sweden)

    de Ruiter Marjo

    2011-06-01

    Background: Non-coding small RNAs play critical roles in various cellular processes in a wide spectrum of eukaryotic organisms. Their responses to abiotic stress have become a popular topic of economic and scientific importance in biological research. Several studies in recent years have reported a small number of non-coding small RNAs that map to chloroplast genomes. However, it remains uncertain whether small RNAs are generated from the chloroplast genome and how they respond to environmental stress, such as high temperature. Chinese cabbage is an important vegetable crop, and heat stress usually causes great losses in yields and quality. Under heat stress, the leaves become etiolated due to the disruption and disassembly of chloroplasts. In an attempt to determine the heat-responsive small RNAs in the chloroplast genome of Chinese cabbage, we carried out deep sequencing, using heat-treated samples, and analysed the proportion of small RNAs that were matched to the chloroplast genome. Results: Deep sequencing provided evidence that a novel subset of small RNAs were derived from the chloroplast genome of Chinese cabbage. The chloroplast small RNAs (csRNAs) include those derived from mRNA, rRNA, tRNA and intergenic RNA. The rRNA-derived csRNAs were preferentially located at the 3'-ends of the rRNAs, while the tRNA-derived csRNAs were mainly located at 5'-termini of the tRNAs. After heat treatment, the abundance of csRNAs decreased in seedlings, except those of 24 nt in length. The novel heat-responsive csRNAs and their locations in the chloroplast were verified by Northern blotting. The regulation of some csRNAs to the putative target genes were identified by real-time PCR. Our results reveal that high temperature suppresses the production of some csRNAs, which have potential roles in transcriptional or post-transcriptional regulation. Conclusions: In addition to nucleus, the chloroplast is another important organelle that generates a number of small

  10. Manipulating the chloroplast genome of Chlamydomonas: Present realities and future prospects

    Energy Technology Data Exchange (ETDEWEB)

    Boynton, J.; Gillham, N.; Hauser, C.; Heifetz, P.; Lers, A.; Newman, S.; Osmond, B.

    1992-12-31

    Biotechnology is being applied to the in vitro modification and stable reintroduction of chloroplast genes in Chlamydomonas reinhardtii and Nicotiana tabacum by homologous recombination. We are attempting functional analyses of plastid-encoded proteins involved in photosynthesis, characterization of sequences which regulate expression of plastid genes at the transcriptional and translational levels, targeted disruption of chloroplast genes, and molecular analysis of processes involved in chloroplast recombination.

  11. Research Progress of Sugarcane Chloroplast Genome

    Institute of Scientific and Technical Information of China (English)

    吴杨; 周会

    2013-01-01

    With the development of modern molecular biology technologies, complete chloroplast genomes have now been sequenced in many plant species, and the structure, function and expression of their genes have been determined. The chloroplast genome structure in most higher plants is stable, since gene number, arrangement and composition are conserved. The determination of the sugarcane chloroplast genome sequence laid a good foundation for sugarcane chloroplast related research. This article reviews the research progress on the sugarcane chloroplast genome, covering the chloroplast genome map, gene structure and function, chloroplast RNA editing, and phylogenetic analysis in Saccharum and related genera. It also outlines future research directions, including sugarcane chloroplast genetic transformation, complete chloroplast genome sequencing in Saccharum and closely related genera, and the development and application of cpSSRs.

  12. Comparative analyses of chloroplast genome data representing nine green algae in Sphaeropleales (Chlorophyceae, Chlorophyta)

    OpenAIRE

    Fučíková, Karolina; Lewis, Louise A.; Lewis, Paul O.

    2016-01-01

    The chloroplast genomes of green algae are highly variable in their architecture. In this article we summarize gene content across newly obtained and published chloroplast genomes in Chlorophyceae, including new data from nine species in Sphaeropleales (Chlorophyceae, Chlorophyta). We present genome architecture information, including genome synteny analysis across two groups of species. Also, we provide a phylogenetic tree obtained from analysis of gene order data for species in Chlorophy...

  13. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  14. An efficient procedure for plant organellar genome assembly, based on whole genome data from the 454 GS FLX sequencing platform

    Directory of Open Access Journals (Sweden)

    Zhang Tongwu

    2011-11-01

    Motivation: Complete organellar genome sequences (chloroplasts and mitochondria) provide valuable resources and information for studying plant molecular ecology and evolution. As high-throughput sequencing technology advances, it becomes the norm that a shotgun approach is used to obtain complete genome sequences. Therefore, to assemble organellar sequences from the whole genome, shotgun reads are inevitable. However, associated techniques are often cumbersome, time-consuming, and difficult, because true organellar DNA is difficult to separate efficiently from nuclear copies, which have been transferred to the nucleus through the course of evolution. Results: We report a new, rapid procedure for plant chloroplast and mitochondrial genome sequencing and assembly using the Roche/454 GS FLX platform. Plant cells can contain multiple copies of the organellar genomes, and there is a significant correlation between the depth of sequence reads in contigs and the number of copies of the genome. Without isolating organellar DNA from the mixture of nuclear and organellar DNA for sequencing, we retrospectively extracted assembled contigs of either chloroplast or mitochondrial sequences from the whole genome shotgun data. Moreover, the contig connection graph property of Newbler (a platform-specific sequence assembler) ensures an efficient final assembly. Using this procedure, we assembled both chloroplast and mitochondrial genomes of a resurrection plant, Boea hygrometrica, with high fidelity. We also present information and a minimal sequence dataset as a reference for the assembly of other plant organellar genomes.
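    The idea that read depth tracks genome copy number can be illustrated with a minimal sketch. This is not the published procedure: the function name, the fold-change cut-offs, and the assumption that chloroplast contigs are deeper than mitochondrial ones are all hypothetical choices made for illustration, and real data would need thresholds chosen from the observed depth distribution.

```python
from statistics import median


def classify_contigs_by_depth(depths, cp_fold=50.0, mt_fold=5.0):
    """Label contigs by read depth relative to the median contig depth.

    depths  : dict mapping contig name -> mean read depth
    cp_fold : hypothetical fold-over-median cut-off for chloroplast-like contigs
    mt_fold : hypothetical fold-over-median cut-off for mitochondrial-like contigs

    Assumes most contigs are nuclear, so the median approximates nuclear depth.
    """
    baseline = median(depths.values())
    if baseline <= 0:
        raise ValueError("median depth must be positive")

    labels = {}
    for name, depth in depths.items():
        ratio = depth / baseline
        if ratio >= cp_fold:
            labels[name] = "chloroplast-like"
        elif ratio >= mt_fold:
            labels[name] = "mitochondrial-like"
        else:
            labels[name] = "nuclear-like"
    return labels


# Toy depths, invented for the example.
example = {"c1": 10, "c2": 12, "c3": 11, "c4": 9, "c5": 100, "c6": 1000}
print(classify_contigs_by_depth(example))
```

    Candidate contigs picked this way would still need confirmation, for example by similarity to known organellar sequences, since nuclear repeats can also show elevated depth.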

  15. Complete chloroplast genome of Prunus yedoensis Matsum. (Rosaceae), wild and endemic flowering cherry on Jeju Island, Korea.

    Science.gov (United States)

    Cho, Myong-Suk; Hyun Cho, Chung; Yeon Kim, Su; Su Yoon, Hwan; Kim, Seung-Chul

    2016-09-01

    The complete chloroplast genome sequence of the wild flowering cherry, Prunus yedoensis Matsum., which is native and endemic to Jeju Island, Korea, is reported in this study. The genome is 157 786 bp in length with 36.7% GC content, and is composed of an LSC region of 85 908 bp, an SSC region of 19 120 bp and two IR copies of 26 379 bp each. The cp genome contains 131 genes, including 86 coding genes, 8 rRNA genes and 37 tRNA genes. A maximum likelihood analysis was conducted to verify the phylogenetic position of the newly sequenced cp genome of P. yedoensis using 11 representatives of complete cp genome sequences within the family Rosaceae. The genus Prunus exhibited monophyly and the resulting phylogenetic relationships agreed with previous phylogenetic analyses within Rosaceae. PMID:26329800
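    The region sizes quoted above can be checked against the reported total, since a quadripartite plastome consists of one LSC, one SSC and two IR copies. The short sketch below does that arithmetic; the function name is an illustrative choice, not something from the paper.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length for one LSC, one SSC and two IR copies (all in bp)."""
    return lsc + ssc + 2 * ir


# Region sizes as reported in the abstract for Prunus yedoensis.
total = quadripartite_length(lsc=85_908, ssc=19_120, ir=26_379)
print(total)  # 157786, matching the reported genome size of 157 786 bp
```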

  16. Population structure and diversity of the AA genome of rice based on simple sequence repeats variation in organelle genome

    International Nuclear Information System (INIS)

    Simple Sequence Repeat (SSR) variations in the maternally inherited mitochondrial and chloroplast genomes were examined for their contribution to the diversity of the rice genome. Population structure and diversity based on the mitochondrial and chloroplast genomes have been studied less than those based on the nuclear genome. The present study was designed to evaluate the population structure and diversity of rice grown in Pakistan, along with rice from other countries, based on the maternally inherited mitochondrial and chloroplast genomes. The mitochondrial and chloroplast genomes were analyzed using 42 mitochondrial and 20 chloroplast pairs of SSR primers. A slightly higher percentage of polymorphism was observed in chloroplast (30%) than in mitochondria (28.57%). The average gene diversity for both mitochondrial and chloroplast markers was 0.32, ranging from 0.041 to 0.620. The Polymorphism Information Content (PIC) value ranged from 0.040 to 0.543 with an average of 0.282, while the allelic richness ranged from two to four alleles with an average of 2.779 alleles. Mononucleotide repeats were the most polymorphic in the organelle genomes (50%), followed by tri- (25%), tetra- (14.29%) and dinucleotide (12.5%) repeats. Cluster and population structure analysis revealed two groups of accessions. On the basis of our results, the AA genome of Asian cultivated rice diverged from the same origin during evolution. (author)
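    The abstract does not say how gene diversity and PIC were computed. As a hedged illustration, the sketch below assumes the commonly used definitions: gene diversity as 1 minus the sum of squared allele frequencies, and PIC as gene diversity minus the sum over allele pairs of 2 p_i^2 p_j^2. The function names and the example frequencies are illustrative only.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2) over allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism Information Content:
    1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2.
    """
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - cross


# Two equally frequent alleles at one SSR locus (made-up example).
print(gene_diversity([0.5, 0.5]))  # 0.5
print(pic([0.5, 0.5]))             # 0.375
```

    Under these definitions PIC is never larger than gene diversity, which is consistent with the averages reported above (0.282 versus 0.32).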

  17. Data characterizing the chloroplast genomes of extinct and endangered Hawaiian endemic mints (Lamiaceae) and their close relatives

    Science.gov (United States)

    Welch, Andreanna J.; Collins, Katherine; Ratan, Aakrosh; Drautz-Moses, Daniela I.; Schuster, Stephan C.; Lindqvist, Charlotte

    2016-01-01

    These data are presented in support of a plastid phylogenomic analysis of the recent radiation of the Hawaiian endemic mints (Lamiaceae), and their close relatives in the genus Stachys, “The quest to resolve recent radiations: Plastid phylogenomics of extinct and endangered Hawaiian endemic mints (Lamiaceae)” [1]. Here we describe the chloroplast genome sequences for 12 mint taxa. Data presented include summaries of gene content and length for these taxa, structural comparison of the mint chloroplast genomes with published sequences from other species in the order Lamiales, and comparisons of variability among three Hawaiian taxa vs. three outgroup taxa. Finally, we provide a list of 108 primer pairs targeting the most variable regions within this group and designed specifically for amplification of DNA extracted from degraded herbarium material. PMID:27077093

  18. Data characterizing the chloroplast genomes of extinct and endangered Hawaiian endemic mints (Lamiaceae) and their close relatives.

    Science.gov (United States)

    Welch, Andreanna J; Collins, Katherine; Ratan, Aakrosh; Drautz-Moses, Daniela I; Schuster, Stephan C; Lindqvist, Charlotte

    2016-06-01

    These data are presented in support of a plastid phylogenomic analysis of the recent radiation of the Hawaiian endemic mints (Lamiaceae), and their close relatives in the genus Stachys, "The quest to resolve recent radiations: Plastid phylogenomics of extinct and endangered Hawaiian endemic mints (Lamiaceae)" [1]. Here we describe the chloroplast genome sequences for 12 mint taxa. Data presented include summaries of gene content and length for these taxa, structural comparison of the mint chloroplast genomes with published sequences from other species in the order Lamiales, and comparisons of variability among three Hawaiian taxa vs. three outgroup taxa. Finally, we provide a list of 108 primer pairs targeting the most variable regions within this group and designed specifically for amplification of DNA extracted from degraded herbarium material. PMID:27077093

  19. Comparative chloroplast genomics and phylogenetics of Fagopyrum esculentum ssp. ancestrale – A wild ancestor of cultivated buckwheat

    OpenAIRE

    Dhingra Amit; Samigullin Tahir H; Logacheva Maria D; Penin Aleksey A

    2008-01-01

    Background: Chloroplast genome sequences are extremely informative about species interrelationships owing to their non-meiotic and often uniparental inheritance over generations. The subject of our study, Fagopyrum esculentum, is a member of the family Polygonaceae belonging to the order Caryophyllales. An uncertainty remains regarding the affinity of Caryophyllales and the asterids that could be due to undersampling of the taxa. With that background, having access to the complete chlor...

  20. OrgConv: detection of gene conversion using consensus sequences and its application in plant mitochondrial and chloroplast homologs

    Directory of Open Access Journals (Sweden)

    Hao Weilong

    2010-03-01

    Background: The ancestry of mitochondria and chloroplasts traces back to separate endosymbioses of once free-living bacteria. The highly reduced genomes of these two organelles therefore contain very distant homologs that only recently have been shown to recombine inside the mitochondrial genome. Detection of gene conversion between mitochondrial and chloroplast homologs was previously impossible due to the lack of suitable computer programs. Recently, I developed a novel method and have, for the first time, discovered recurrent gene conversion between chloroplast and mitochondrial genes. The method will further our understanding of plant organellar genome evolution and help identify and remove gene regions with incongruent phylogenetic signals for several genes widely used in plant systematics. Here, I implement such a method that is available in a user-friendly web interface. Results: OrgConv (Organellar Conversion) is a computer package developed for detection of gene conversion between mitochondrial and chloroplast homologous genes. OrgConv is available in two forms; source code can be installed and run on a Linux platform and a web interface is available on multiple operating systems. The input files of the feature program are two multiple sequence alignments from different organellar compartments in FASTA format. The program compares every examined sequence against the consensus sequence of each sequence alignment rather than exhaustively examining every possible combination. Making use of consensus sequences significantly reduces the number of comparisons and therefore reduces overall computational time, which allows for analysis of very large datasets. Most importantly, with the significantly reduced number of comparisons, the statistical power remains high in the face of correction for multiple tests. Conclusions: Both the source code and the web interface of OrgConv are available for free from the OrgConv website http

  1. Nelumbonaceae: Systematic position and species diversification revealed by the complete chloroplast genome

    Institute of Scientific and Technical Information of China (English)

    Jian-Hua XUE; Wen-Pan DONG; Tao CHENG; Shi-Liang ZHOU

    2012-01-01

    Nelumbonaceae is a morphologically unique family of angiosperms and was traditionally placed in Nymphaeales; more recently, it was placed in Proteales based on molecular data, or in an order of its own, Nelumbonales. To determine the systematic position of the family and to date the divergence time of the family and the divergence time of its two intercontinentally disjunct species, we sequenced the entire chloroplast genome of Nelumbo lutea and most of the chloroplast genes of N. nucifera. We carried out phylogenetic and molecular dating analyses of the two species and representatives of 47 other plant families, representing the major lineages of angiosperms, using 83 plastid genes. The N. lutea genome was 163 510 bp long, with a total of 130 coding genes and an overall GC content of 38%. No significant structural differences among the genomes of N. lutea, Nymphaea alba, and Platanus occidentalis were observed. The phylogenetic relationships based on the 83 plastid genes revealed a close relationship between Nelumbonaceae and Platanaceae. The divergence times were estimated to be 109 Ma between the two families and 1.5 Ma between the two Nelumbo species. The estimated time was only slightly longer than the age of known Nelumbo fossils, suggesting morphological stasis within Nelumbonaceae. We conclude that Nelumbonaceae holds a position in or close to Proteales. We further conclude that the two species of Nelumbo diverged recently from a common ancestor and do not represent ancient relicts on different continents.

  2. Classifying Genomic Sequences by Sequence Feature Analysis

    Institute of Scientific and Technical Information of China (English)

    Zhi-Hua Liu; Dian Jiao; Xiao Sun

    2005-01-01

    Traditional sequence analysis depends on sequence alignment. In this study, we analyzed various functional regions of the human genome based on sequence features, including word frequency, dinucleotide relative abundance, and base-base correlation. We analyzed human chromosome 22 and classified the upstream, exon, intron, downstream, and intergenic regions by principal component analysis and discriminant analysis of these features. The results show that the functional regions of the genome can be classified based on sequence features and discriminant analysis.
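One of the alignment-free features named above, dinucleotide relative abundance, can be sketched as follows. The abstract does not give the formula used by the authors; this sketch assumes the common odds-ratio definition, rho(xy) = f(xy) / (f(x) f(y)), computed on a single strand, and the function name is hypothetical.

```python
from collections import Counter


def dinucleotide_relative_abundance(seq):
    """Odds ratio f(xy) / (f(x) * f(y)) for each dinucleotide observed in seq.

    Assumption: single-strand frequencies, overlapping dinucleotides.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    # Mononucleotide frequencies.
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    # Overlapping dinucleotide counts; there are n - 1 of them.
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    total_pairs = n - 1
    return {
        pair: (count / total_pairs) / (base_freq[pair[0]] * base_freq[pair[1]])
        for pair, count in pair_counts.items()
    }


# Hypothetical toy sequence; real analyses would use long genomic regions.
for pair, rho in sorted(dinucleotide_relative_abundance("ACGCGTTACG").items()):
    print(pair, round(rho, 3))
```

Values near 1 indicate that a dinucleotide occurs about as often as expected from its base composition; vectors of such values per region are the kind of input that principal component or discriminant analysis could then classify.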

  3. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (approximately 1 error/10,000 bp), validated through a number of computer and laboratory experiments.
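To make the two error rates quoted above concrete, the following sketch converts them into expected error counts. The 5,000,000 bp genome length is a hypothetical value chosen for illustration and does not come from the abstract; only the rates of 1 error per 2000 bp (draft) and about 1 error per 10,000 bp (finished) do.

```python
def expected_errors(genome_bp, bp_per_error):
    """Expected number of consensus errors for a genome of the given length."""
    if bp_per_error <= 0:
        raise ValueError("bp_per_error must be positive")
    return genome_bp / bp_per_error


GENOME_BP = 5_000_000  # hypothetical bacterial-sized genome, not from the source

draft = expected_errors(GENOME_BP, 2_000)      # draft assembly rate from the abstract
finished = expected_errors(GENOME_BP, 10_000)  # finished assembly rate from the abstract
print(draft, finished)
```

For this assumed genome size the draft consensus would carry about 2500 expected errors against about 500 after finishing, a fivefold reduction that follows directly from the ratio of the two rates.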

  4. Comprehensive Survey of Genetic Diversity in Chloroplast Genomes and 45S nrDNAs within Panax ginseng Species

    Science.gov (United States)

    Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Lee, Hyun Oh; Joh, Ho Jun; Kim, Nam-Hoon; Park, Hyun-Seung; Yang, Tae-Jin

    2015-01-01

    We report complete sequences of chloroplast (cp) genome and 45S nuclear ribosomal DNA (45S nrDNA) for 11 Panax ginseng cultivars. We have obtained complete sequences of cp and 45S nrDNA, the representative barcoding target sequences for cytoplasm and nuclear genome, respectively, based on low coverage NGS sequence of each cultivar. The cp genome sizes ranged from 156,241 to 156,425 bp and the major size variation was derived from differences in copy number of tandem repeats in the ycf1 gene and in the intergenic regions of rps16-trnUUG and rpl32-trnUAG. The complete 45S nrDNA unit sequences were 11,091 bp, representing a consensus single transcriptional unit with an intergenic spacer region. Comparative analysis of these sequences as well as those previously reported for three Chinese accessions identified very rare but unique polymorphism in the cp genome within P. ginseng cultivars. There were 12 intra-species polymorphisms (six SNPs and six InDels) among 14 cultivars. We also identified five SNPs from 45S nrDNA of 11 Korean ginseng cultivars. From the 17 unique informative polymorphic sites, we developed six reliable markers for analysis of ginseng diversity and cultivar authentication. PMID:26061692

  5. The complete chloroplast genome of the Taiwan red pine Pinus taiwanensis (Pinaceae).

    Science.gov (United States)

    Fang, Min-Feng; Wang, Yu-Jin; Zu, Yu-Meng; Dong, Wan-Lin; Wang, Ruo-Nan; Deng, Tuan-Tuan; Li, Zhong-Hu

    2016-07-01

    The complete nucleotide sequence of the Taiwan red pine Pinus taiwanensis Hayata chloroplast genome (cpDNA) is determined in this study. The genome is 119,741 bp in length, containing a pair of very short inverted repeat (IRa and IRb) regions of 495 bp, which are separated by a large single-copy (LSC) region of 65,670 bp and a small single-copy (SSC) region of 53,080 bp in length. The cpDNA contained 115 genes, including 74 protein-coding genes (73 PCG species), 4 ribosomal RNA genes (four rRNA species) and 37 tRNA genes (22 tRNA species). Out of these genes, 12 harbored a single intron, and one (rps12) contained a couple of introns. The overall AT content of the Taiwan red pine cpDNA is 61.5%, while the corresponding values of the LSC, SSC and IR regions are 62.2%, 60.6% and 63.6%, respectively. A maximum parsimony phylogenetic analysis suggested that the genera Pinus, Picea, Abies and Larix were strongly supported as monophyletic, and the cpDNA of P. taiwanensis is closely related to that of P. thunbergii. PMID:26057016

  6. Origin and evolution of the colonial volvocales (Chlorophyceae) as inferred from multiple, chloroplast gene sequences.

    Science.gov (United States)

    Nozaki, H; Misawa, K; Kajita, T; Kato, M; Nohara, S; Watanabe, M M

    2000-11-01

    A combined data set of DNA sequences (6021 bp) from five protein-coding genes of the chloroplast genome (rbcL, atpB, psaA, psaB, and psbC genes) were analyzed for 42 strains representing 30 species of the colonial Volvocales (Volvox and its relatives) and 5 related species of green algae to deduce robust phylogenetic relationships within the colonial green flagellates. The 4-celled family Tetrabaenaceae was robustly resolved as the most basal group within the colonial Volvocales. The sequence data also suggested that all five volvocacean genera with 32 or more cells in a vegetative colony (all four of the anisogamous/oogamous genera, Eudorina, Platydorina, Pleodorina, and Volvox, plus the isogamous genus Yamagishiella) constituted a large monophyletic group, in which 2 Pleodorina species were positioned distally to 3 species of Volvox. Therefore, most of the evolution of the colonial Volvocales appears to constitute a gradual progression in colonial complexity and in types of sexual reproduction, as in the traditional volvocine lineage hypothesis, although reverse evolution must be considered for the origin of certain species of Pleodorina. Data presented here also provide robust support for a monophyletic family Goniaceae consisting of two genera: Gonium and Astrephomene. PMID:11083939

  7. Phylogenomic analysis of transcriptomic sequences of mitochondria and chloroplasts of essential brown algae (Phaeophyceae) in China

    Institute of Scientific and Technical Information of China (English)

    JIA Shangang; LIU Tao; WU Shuangxiu; WANG Xumin; LI Tianyong; QIAN Hao; SUN Jing; WANG Liang; YU Jun; REN Lufeng; YIN Jinlong

    2014-01-01

    The chloroplast and mitochondrion of brown algae (Class Phaeophyceae of Phylum Ochrophyta) may have originated from different endosymbiosis. In this study, we carried out phylogenomic analysis to distinguish their evolutionary lineages by using algal RNA-seq datasets of the 1 000 Plants (1KP) Project and publicly available complete genomes of mitochondria and chloroplasts of Kingdom Chromista. We have found that there is a split between Class Phaeophyceae of Phylum Ochrophyta and the others (Phylum Cryptophyta and Haptophyta) in Kingdom Chromista, and identified more diversity in chloroplast genes than mitochondrial ones in their phylogenetic trees. Taxonomy resolution for Class Phaeophyceae showed that it was divided into Laminariales-Ectocarpales clade and Fucales clade, and phylogenetic positions of Kjellmaniella crassifolia, Hizikia fusiforme and Ishige okamurai were confirmed. Our analysis provided the basic phylogenetic relationships of Chromista algae, and demonstrated their potential ability to study endosymbiotic events.

  8. The complete plastid genome sequence of Abies koreana (Pinaceae: Abietoideae).

    Science.gov (United States)

    Yi, Dong-Keun; Yang, Jong Cheol; So, Soonku; Joo, Minjung; Kim, Dong-Kap; Shin, Chang Ho; Lee, You-Mi; Choi, Kyung

    2016-07-01

    The nucleotide sequence of the chloroplast genome from Abies koreana is the first complete chloroplast genome sequence from the genus Abies of family Pinaceae. The circular double-stranded DNA, which consists of 121,373 base pairs (bp), contains a pair of very short inverted repeat regions (IRa and IRb) of 264 bp each, which are separated by small and large single-copy regions (SSC and LSC) of 54,197 and 66,648 bp, respectively. The genome contents of 114 genes (68 peptide-encoding genes, 35 tRNA genes, four rRNA genes, six open reading frames and one pseudogene) are similar to the chloroplast DNA of other species of Abietoideae. Loss of ndh genes was also identified in the genome of A. koreana, as in other genomes in the family Pinaceae. Thirteen genes contain one (11 genes) or two (rps12 and ycf3 genes) introns. In phylogenetic analysis, the tree confirms that Abies, Keteleeria and Cedrus are strongly supported as monophyletic. Other inverted repeat sequences located in 42-kb inversion points (1186 bp) include trnS-psaM-ycf12-ψtrnG genes. PMID:25812052
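The region sizes reported in this record can be cross-checked with a short worked example. It assumes the usual quadripartite layout, in which the total length equals LSC + SSC + 2 x IR; the function name `plastome_length` is hypothetical.

```python
def plastome_length(lsc_bp, ssc_bp, ir_bp):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc_bp + ssc_bp + 2 * ir_bp


# Values reported for Abies koreana in the abstract above.
LSC = 66_648
SSC = 54_197
IR = 264
REPORTED_TOTAL = 121_373

total = plastome_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)
```

The same check is a quick way to catch transcription errors in region sizes when comparing plastome records, since the four parts should account for the whole circular molecule.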

  9. A Cyan Fluorescent Reporter Expressed from the Chloroplast Genome of Marchantia polymorpha

    Science.gov (United States)

    Boehm, Christian R.; Ueda, Minoru; Nishimura, Yoshiki; Shikanai, Toshiharu; Haseloff, Jim

    2016-01-01

    Recently, the liverwort Marchantia polymorpha has received increasing attention as a basal plant model for multicellular studies. Its ease of handling, well-characterized plastome and proven protocols for biolistic plastid transformation qualify M. polymorpha as an attractive platform to study the evolution of chloroplasts during the transition from water to land. In addition, chloroplasts of M. polymorpha provide a convenient test-bed for the characterization of genetic elements involved in plastid gene expression due to the absence of mechanisms for RNA editing. While reporter genes have proven valuable to the qualitative and quantitative study of gene expression in chloroplasts, expression of green fluorescent protein (GFP) in chloroplasts of M. polymorpha has proven problematic. We report the design of a codon-optimized gfp variant, mturq2cp, which allowed successful expression of a cyan fluorescent protein under control of the tobacco psbA promoter from the chloroplast genome of M. polymorpha. We demonstrate the utility of mturq2cp in (i) early screening for transplastomic events following biolistic transformation of M. polymorpha spores; (ii) visualization of stromules as elements of plastid structure in Marchantia; and (iii) quantitative microscopy for the analysis of promoter activity. PMID:26634291

  10. A Cyan Fluorescent Reporter Expressed from the Chloroplast Genome of Marchantia polymorpha.

    Science.gov (United States)

    Boehm, Christian R; Ueda, Minoru; Nishimura, Yoshiki; Shikanai, Toshiharu; Haseloff, Jim

    2016-02-01

    Recently, the liverwort Marchantia polymorpha has received increasing attention as a basal plant model for multicellular studies. Its ease of handling, well-characterized plastome and proven protocols for biolistic plastid transformation qualify M. polymorpha as an attractive platform to study the evolution of chloroplasts during the transition from water to land. In addition, chloroplasts of M. polymorpha provide a convenient test-bed for the characterization of genetic elements involved in plastid gene expression due to the absence of mechanisms for RNA editing. While reporter genes have proven valuable to the qualitative and quantitative study of gene expression in chloroplasts, expression of green fluorescent protein (GFP) in chloroplasts of M. polymorpha has proven problematic. We report the design of a codon-optimized gfp variant, mturq2cp, which allowed successful expression of a cyan fluorescent protein under control of the tobacco psbA promoter from the chloroplast genome of M. polymorpha. We demonstrate the utility of mturq2cp in (i) early screening for transplastomic events following biolistic transformation of M. polymorpha spores; (ii) visualization of stromules as elements of plastid structure in Marchantia; and (iii) quantitative microscopy for the analysis of promoter activity. PMID:26634291

  11. Comparative Chloroplast Genome Analyses of Streptophyte Green Algae Uncover Major Structural Alterations in the Klebsormidiophyceae, Coleochaetophyceae and Zygnematophyceae

    Science.gov (United States)

    Lemieux, Claude; Otis, Christian; Turmel, Monique

    2016-01-01

    The Streptophyta comprises all land plants and six main lineages of freshwater green algae: Mesostigmatophyceae, Chlorokybophyceae, Klebsormidiophyceae, Charophyceae, Coleochaetophyceae and Zygnematophyceae. Previous comparisons of the chloroplast genome from nine streptophyte algae (including four zygnematophyceans) revealed that, although land plant chloroplast DNAs (cpDNAs) inherited most of their highly conserved structural features from green algal ancestors, considerable cpDNA changes took place during the evolution of the Zygnematophyceae, the sister group of land plants. To gain deeper insights into the evolutionary dynamics of the chloroplast genome in streptophyte algae, we sequenced the cpDNAs of nine additional taxa: two klebsormidiophyceans (Entransia fimbriata and Klebsormidium sp. SAG 51.86), one coleochaetophycean (Coleochaete scutata) and six zygnematophyceans (Cylindrocystis brebissonii, Netrium digitus, Roya obtusa, Spirogyra maxima, Cosmarium botrytis and Closterium baillyanum). Our comparative analyses of these genomes with their streptophyte algal counterparts indicate that the large inverted repeat (IR) encoding the rDNA operon experienced loss or expansion/contraction in all three sampled classes and that genes were extensively shuffled in both the Klebsormidiophyceae and Zygnematophyceae. The klebsormidiophycean genomes boast greatly expanded IRs, with the Entransia 60,590-bp IR being the largest known among green algae. The 206,025-bp Entransia cpDNA, which is one of the largest genomes among streptophytes, encodes 118 standard genes, i.e., four additional genes compared to its Klebsormidium flaccidum homolog. We inferred that seven of the 21 group II introns usually found in land plants were already present in the common ancestor of the Klebsormidiophyceae and its sister lineages. At 107,236 bp and with 117 standard genes, the Coleochaete IR-less genome is both the smallest and most compact among the streptophyte algal cpDNAs analyzed thus

  12. The complete plastid genome sequence of Picea jezoensis (Pinaceae: Piceoideae).

    Science.gov (United States)

    Yang, Jong Cheol; Joo, Minjung; So, Soonku; Yi, Dong-Keun; Shin, Chang Ho; Lee, You-Mi; Choi, Kyung

    2016-09-01

    The nucleotide sequence of the complete chloroplast genome of P. jezoensis was determined. The total genome size was 124 146 bp, containing a pair of very short inverted repeats (IRa and IRb) of 422 bp, which were separated by large single copy (LSC) and small single copy (SSC) regions of 66 956 bp and 56 346 bp, respectively. The overall GC content of the plastid genome was determined as 38.8%. One hundred fifteen genes including 68 peptide-encoding genes, 35 tRNA genes, four rRNA genes, six open-reading frames, and two pseudogenes were annotated. Of these genes, 15 contained one or two introns. Phylogenetic analyses using maximum likelihood (ML) methods were performed on a dataset of 69 protein-coding genes from fully sequenced gymnosperms and other species. PMID:26332576

  13. Towards resolving Lamiales relationships: insights from rapidly evolving chloroplast sequences

    OpenAIRE

    Heubl Günther; Borsch Thomas; Albach Dirk C; Fischer Eberhard; Fleischmann Andreas; Schäferhoff Bastian; Müller Kai F

    2010-01-01

    Abstract Background In the large angiosperm order Lamiales, a diverse array of highly specialized life strategies such as carnivory, parasitism, epiphytism, and desiccation tolerance occur, and some lineages possess drastically accelerated DNA substitutional rates or miniaturized genomes. However, understanding the evolution of these phenomena in the order, and clarifying borders of and relationships among lamialean families, has been hindered by largely unresolved trees in the past. Results ...

  14. The complete chloroplast genome of Cupressus gigantea, an endemic conifer species to Qinghai-Tibetan Plateau.

    Science.gov (United States)

    Li, Huie; Guo, Qiqiang; Zheng, Weilie

    2016-09-01

    The complete chloroplast genome of the wild Cupressus gigantea (Cupressaceae) is determined in this study. The circular genome is 128 244 bp in length with 115 single copy genes and two duplicated genes (trnI-CAU and trnQ-UUG). This genome contains 82 protein-coding genes, four ribosomal RNA genes and 31 transfer RNA genes. In these genes, eight genes (atpF, rpoC1, ndhA, ndhB, petB, petD, rpl16 and rpl2) harbor a single intron and two genes (rps12 and ycf3) harbor two introns. This genome does not contain canonical IRs, and the overall GC content is 34.7%. A maximum parsimony phylogenetic analysis revealed that C. gigantea and C. sempervirens are more closely related. PMID:26359779

  15. The nucleotide sequence of Scenedesmus obliquus chloroplast tRNAfMet.

    OpenAIRE

    McCoy, J M; Jones, D S

    1980-01-01

    The chloroplast initiator tRNAfMet from the green alga Scenedesmus obliquus has been purified and its sequence shown to be p C-G-C-A-G-G-A-U-A-G-A-G-C-A-G-U-C-U-Gm-G-D-A-G-C-U-C-m2(2)G-psi-G-G-G-G-C-U-C-A-U-A-A-psi-C-C-C-A-A-U-m7G-D-C-G-C-A-G-G-T-psi-C-A-A-A-U-C-C-U-G-C-U-C-C-U-G-C-A-A-C-C-A-OH. This structure is prokaryotic in character and displays close homologies with a blue green algal initiator tRNAfMet and bean chloroplast initiator tRNAfMet.

  16. Phylogenomic analysis of transcriptomic sequences of mitochondria and chloroplasts for marine red algae (Rhodophyta) in China

    Institute of Scientific and Technical Information of China (English)

    JIA Shangang; LIU Tao; WU Shuangxiu; WANG Xumin; QIAN Hao; LI Tianyong; SUN Jing; WANG Liang; YU Jun; LI Xingang; YIN Jinlong

    2014-01-01

    The chloroplast and mitochondrion of red algae (Phylum Rhodophyta) may have originated from different endosymbiosis. In this study, we carried out phylogenomic analysis to distinguish their evolutionary lineages by using red algal RNA-seq datasets of the 1 000 Plants (1KP) Project and publicly available complete genomes of mitochondria and chloroplasts of Rhodophyta. We have found that red algae were divided into three clades of orders, Florideophyceae, Bangiophyceae and Cyanidiophyceae. Taxonomy resolution for Class Florideophyceae showed that Order Gigartinales was close to Order Halymeniales, while Order Gracilariales was in a clade of Order Ceramials. We confirmed Prionitis divaricata (Family Halymeniaceae) was closely related to the clade of Order Gracilariales, rather than to genus Grateloupia of Order Halymeniales as reported before. Furthermore, we found both mitochondrial and chloroplastic genes in Rhodophyta under negative selection (Ka/Ks<1), suggesting that red algae, as one primitive group of eukaryotic algae, might share joint evolutionary history with these two organelles for a long time, although we identified some differences in their phylogenetic trees. Our analysis provided the basic phylogenetic relationships of red algae, and demonstrated their potential ability to study endosymbiotic events.
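The Ka/Ks<1 criterion mentioned in this abstract can be written as a small decision rule. This is a minimal sketch of the conventional interpretation of the ratio, not the authors' pipeline; the function name, the tolerance parameter and the example rates are hypothetical.

```python
def selection_regime(ka, ks, tol=1e-9):
    """Classify selection from nonsynonymous (Ka) and synonymous (Ks) rates.

    Conventional reading: Ka/Ks < 1 suggests negative (purifying) selection,
    Ka/Ks = 1 neutrality, Ka/Ks > 1 positive selection.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    omega = ka / ks
    if abs(omega - 1.0) <= tol:
        return "neutral"
    return "negative (purifying)" if omega < 1.0 else "positive"


# Hypothetical substitution rates for one organellar gene.
print(selection_regime(ka=0.02, ks=0.20))
```

In practice Ka and Ks would be estimated from codon alignments by a dedicated tool, and a ratio below 1 for most genes is what the abstract reports for both organelles.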

  17. RNA Editing Sites Exist in Protein-coding Genes in the Chloroplast Genome of Cycas taitungensis

    Institute of Scientific and Technical Information of China (English)

    Haiyan Chen; Likun Deng; Yuan Jiang; Ping Lu; Jianing Yu

    2011-01-01

    RNA editing is a post-transcriptional process that results in modifications of ribonucleotides at specific locations. In land plants editing can occur in both mitochondria and chloroplasts and most commonly involves C-to-U changes, especially in seed plants. Using prediction and experimental determination, we investigated RNA editing in 40 protein-coding genes from the chloroplast genome of Cycas taitungensis. A total of 85 editing sites were identified in 25 transcripts. Comparison analysis of the published editotypes of these 25 transcripts in eight species showed that RNA editing events gradually disappear during plant evolution. The editing in the first and third codon position disappeared quicker than that in the second codon position. The ndh genes have the highest editing frequency, while serine and proline codons were more frequently edited than the codons of other amino acids. These results imply that retained RNA editing sites have imbalanced distribution in genes and most of them may function by changing protein structure or interaction. Mitochondrion protein-coding genes have three times the editing sites compared with chloroplast genes of Cycas, most likely due to slower evolution speed.

  18. Identifying the Basal Angiosperm Node in Chloroplast Genome Phylogenies: Sampling One's Way Out of the Felsenstein Zone

    Energy Technology Data Exchange (ETDEWEB)

    Leebens-Mack, Jim; Raubeson, Linda A.; Cui, Liying; Kuehl, Jennifer V.; Fourcade, Matthew H.; Chumley, Timothy W.; Boore, Jeffrey L.; Jansen, Robert K.; dePamphilis, Claude W.

    2005-05-27

    While there has been strong support for Amborella and Nymphaeales (water lilies) as branching from basal-most nodes in the angiosperm phylogeny, this hypothesis has recently been challenged by phylogenetic analyses of 61 protein-coding genes extracted from the chloroplast genome sequences of Amborella, Nymphaea and 12 other available land plant chloroplast genomes. These character-rich analyses placed the monocots, represented by three grasses (Poaceae), as sister to all other extant angiosperm lineages. We have extracted protein-coding regions from draft sequences for six additional chloroplast genomes to test whether this surprising result could be an artifact of long-branch attraction due to limited taxon sampling. The added taxa include three monocots (Acorus, Yucca and Typha), a water lily (Nuphar), a ranunculid (Ranunculus), and a gymnosperm (Ginkgo). Phylogenetic analyses of the expanded DNA and protein datasets together with microstructural characters (indels) provided unambiguous support for Amborella and the Nymphaeales as branching from the basal-most nodes in the angiosperm phylogeny. However, their relative positions proved to be dependent on method of analysis, with parsimony favoring Amborella as sister to all other angiosperms, and maximum likelihood and neighbor-joining methods favoring an Amborella + Nymphaeales clade as sister. The maximum likelihood phylogeny supported the latter hypothesis, but the likelihood for the former hypothesis was not significantly different. Parametric bootstrap analysis, single gene phylogenies, estimated divergence dates and conflicting indel characters all help to illuminate the nature of the conflict in resolution of the most basal nodes in the angiosperm phylogeny. Molecular dating analyses provided median age estimates of 161 mya for the most recent common ancestor of all extant angiosperms and 145 mya for the most recent common ancestor of monocots, magnoliids and eudicots. Whereas long sequences reduce variance in

  19. The diploid genome sequence of Candida albicans

    OpenAIRE

    Jones, Ted; Federspiel, Nancy A.; Chibana, Hiroji; Dungan, Jan; Kalman, Sue; Magee, B. B.; Newport, George; Thorstenson, Yvonne R.; Agabian, Nina; Magee, P T; Davis, Ronald W.; Scherer, Stewart

    2004-01-01

    We present the diploid genome sequence of the fungal pathogen Candida albicans. Because C. albicans has no known haploid or homozygous form, sequencing was performed as a whole-genome shotgun of the heterozygous diploid genome in strain SC5314, a clinical isolate that is the parent of strains widely used for molecular analysis. We developed computational methods to assemble a diploid genome sequence in good agreement with available physical mapping data. We provide a whole-genome description ...

  20. A hybrid swarm population of Pinus densiflora × P. sylvestris inferred from sequence analysis of chloroplast DNA and morphological characters

    Institute of Scientific and Technical Information of China (English)

    Young Hee Joung; Jerry L. Hill; Jung Oh Hyun; Ding Mu; Juchun Luo; Do Hyung Lee; Takayuki Kawahara; Jeung Keun Suh; Mark S. Roh

    2013-01-01

    To confirm a hybrid swarm population of Pinus densiflora × P. sylvestris in Jilin, China, we used needles and seeds from P. densiflora, P. sylvestris, and P. densiflora × P. sylvestris collected from natural stands or experimental stations to study whether shoot apex morphology of 4-year old seedlings can be correlated with the sequence of a chloroplast DNA simple sequence repeat marker (cpDNA SSRs). Total genomic DNA was extracted and subjected to sequence analysis of the pine cpDNA SSR marker Pt15169. Results show that morphological characters from 4-year old seedlings did not correlate with sequence variants of this marker. Marker haplotypes from all P. sylvestris trees had a CTAT element that was absent from all sampled P. densiflora trees. However, both haplotype classes involving this insertion/deletion element were found in a P. densiflora × P. sylvestris population and its seedling progeny. It was concluded that the P. densiflora × P. sylvestris accessions sampled from Jilin, China resulted from bi-directional crosses, as evidenced by both species' cpDNA haplotypes within the hybrid swarm population.

  1. Maternal inheritance of mitochondrial genomes and complex inheritance of chloroplast genomes in Actinidia Lind.: evidences from interspecific crosses.

    Science.gov (United States)

    Li, Dawei; Qi, Xiaoqiong; Li, Xinwei; Li, Li; Zhong, Caihong; Huang, Hongwen

    2013-04-01

    The inheritance pattern of chloroplast and mitochondria is a critical determinant in studying plant phylogenetics, biogeography and hybridization. To better understand chloroplast and mitochondrial inheritance patterns in Actinidia (traditionally called kiwifruit), we performed 11 artificial interspecific crosses and studied the ploidy levels, morphology, and sequence polymorphisms of chloroplast DNA (cpDNA) and mitochondrial DNA (mtDNA) of parents and progenies. Sequence analysis showed that the mtDNA haplotypes of F1 hybrids entirely matched those of the female parents, indicating strictly maternal inheritance of Actinidia mtDNA. However, the cpDNA haplotypes of F1 hybrids, which were predominantly derived from the male parent (9 crosses), could also originate from the mother (1 cross) or both parents (1 cross), demonstrating paternal, maternal, and biparental inheritance of Actinidia cpDNA. The inheritance patterns of the cpDNA in Actinidia hybrids differed according to the species and genotypes chosen to be the parents, rather than the ploidy levels of the parent selected. The multiple inheritance modes of Actinidia cpDNA contradicted the strictly paternal inheritance patterns observed in previous studies, and provided new insights into the use of cpDNA markers in studies of phylogenetics, biogeography and introgression in Actinidia and other angiosperms. PMID:23337924

  2. Association between Chloroplast and Mitochondrial DNA sequences in Chinese Prunus genotypes (Prunus persica, Prunus domestica, and Prunus avium)

    OpenAIRE

    Pervaiz, Tariq; Sun, Xin; Zhang, Yanyi; Tao, Ran; Zhang, Junhuan; Fang, Jinggui

    2015-01-01

    Background The nuclear DNA is conventionally used to assess the diversity and relatedness among different species, but variations at the DNA genome level have also been used to study the relationship among different organisms. In most species, mitochondrial and chloroplast genomes are inherited maternally; therefore it is anticipated that organelle DNA remains completely associated. Many research studies were conducted simultaneously on organelle genome. The objective of this study was to ana...

  3. Development in Rice Genome Research Based on Accurate Genome Sequence

    OpenAIRE

    2008-01-01

    Rice is one of the most important crops in the world. Although genetic improvement is a key technology for the acceleration of rice breeding, a lack of genome information had restricted efforts in molecular-based breeding until the completion of the high-quality rice genome sequence, which opened new opportunities for research in various areas of genomics. The syntenic relationship of the rice genome to other cereal genomes makes the rice genome invaluable for understanding how cereal genomes...

  4. Transfer of a eubacteria-type cell division site-determining factor CrMinD gene to the nucleus from the chloroplast genome in Chlamydomonas reinhardtii

    Institute of Scientific and Technical Information of China (English)

    LIU WeiZhong; HU Yong; ZHANG RunJie; ZHOU WeiWei; ZHU JiaYing; LIU XiangLin; HE YiKun

    2007-01-01

    MinD is a ubiquitous ATPase that plays a crucial role in selection of the division site in eubacteria, chloroplasts, and probably Archaea. In four green algae, Mesostigma viride, Nephroselmis olivacea, Chlorella vulgaris and Prototheca wickerhamii, MinD homologues are encoded in the plastid genome. However, in Arabidopsis, MinD is a nucleus-encoded, chloroplast-targeted protein involved in chloroplast division, which suggests that MinD has been transferred to the nucleus in higher land plants. Yet the lateral gene transfer (LGT) of MinD from plastid to nucleus during plastid evolution remains poorly understood. Here, we identified a nucleus-encoded MinD homologue from the unicellular green alga Chlamydomonas reinhardtii, a basal species in the green plant lineage. Overexpression of CrMinD in wild type E. coli inhibited cell division and resulted in filamentous cell formation, clearly demonstrating the conservation of the MinD protein during the evolution of photosynthetic eukaryotes. The transient expression of CrMinD-egfp confirmed the role of CrMinD protein in the regulation of plastid division. Searching all the published plastid genomic sequences of land plants, no MinD homologues were found, which suggests that the transfer of MinD from plastid to nucleus might have occurred before the evolution of land plants.

  5. Molecular Phylogeny of Asian Meconopsis Based on Nuclear Ribosomal and Chloroplast DNA Sequence Data

    OpenAIRE

    Liu, Yu-cheng; Liu, Ya-Nan; Yang, Fu-Sheng; Wang, Xiao-Quan

    2014-01-01

    The taxonomy and phylogeny of Asian Meconopsis (Himalayan blue poppy) remain largely unresolved. We used the internal transcribed spacer (ITS) region of nuclear ribosomal DNA (nrDNA) and the chloroplast DNA (cpDNA) trnL-F region for phylogenetic reconstruction of Meconopsis and its close relatives Papaver, Roemeria, and Stylomecon. We identified five main clades, which were well-supported in the gene trees reconstructed with the nrDNA ITS and cpDNA trnL-F sequences. We found that 41 species o...

  6. Value of a newly sequenced bacterial genome

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Aburjaile, Flavia F; Ramos, Rommel Tj;

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome...

  7. The nucleotide sequences of the initiator transfer RNAs from bean cytoplasm and chloroplasts.

    OpenAIRE

    Canaday, J; Guillemaut, P; Weil, J H

    1980-01-01

    The initiator tRNAsMet from the cytoplasm and chloroplasts of Phaseolus vulgaris have been purified and sequenced. The sequence of bean cytoplasmic initiator tRNAiMet is : pA-U-C-A-G-A-G-U-m1G-m2G-C-G-C-A-G-C-G-G-A-A-G-C-G-U-m2G-G-U-G-G-G2-C-C-C-A-U-t6A-A-C-C-C-A-C-A-G-m7G-D-m5C-C-C-A-G-G-A-psi-C-G-m1A-A-A-C-C-U-Gm-G-C-U-C-U-G-A-U-A-C-C-AOH. The sequence of bean cytoplasmic tRNAiMet is almost identical to that of wheat germ and shows a high degree of homology with other cytoplasmic initiator ...

  8. Phylogenetic relationships among Acanthaceae: evidence from noncoding trnL-trnF chloroplast DNA sequences.

    Science.gov (United States)

    McDade, L A; Moody, M L

    1999-01-01

    We used sequence data from the intron and spacer of the trnL-trnF chloroplast region to study phylogenetic relationships among Acanthaceae. This region is more variable than other chloroplast loci that have been sequenced for members of Acanthaceae (rbcL and ndhF), is more prone to length mutations, and is less homoplasious than these genes. Our results indicate that this region is likely to be useful in addressing phylogenetic questions among but not within genera in these and related plants. In terms of phylogenetic relationships, Elytraria (representing Nelsonioideae) is more distantly related to Acanthaceae sensu stricto (s.s.) than Thunbergia and Mendoncia. These last two genera are strongly supported as sister taxa. Molecular evidence does not support monophyly of Acanthaceae s.s., although there is strong morphological evidence for this relationship. There is strong support for monophyly of four major lineages within Acanthaceae s.s.: the Acanthus, Barleria, Ruellia, and Justicia lineages as here defined. The last three of these comprise a strongly supported monophyletic group, and there is weaker evidence linking the Ruellia and Justicia lineages as closest relatives. Within the Acanthus lineage, our results confirm the existence of monophyletic lineages representing Aphelandreae and Acantheae. Lastly, within the Justicia lineage, we develop initial hypotheses regarding the definition of sublineages; some of these correspond to earlier ideas, whereas others do not. All of these hypotheses need to be tested against more data. PMID:21680347

  9. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled "intractable", resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  10. Fungal genome sequencing: basic biology to biotechnology.

    Science.gov (United States)

    Sharma, Krishna Kant

    2016-08-01

    The genome sequences provide a first glimpse into the genomic basis of the biological diversity of filamentous fungi and yeast. The genome sequence of the budding yeast, Saccharomyces cerevisiae, with a small genome size, unicellular growth, and rich history of genetic and molecular analyses was a milestone of early genomics in the 1990s. The subsequent completion of fission yeast, Schizosaccharomyces pombe and genetic model, Neurospora crassa initiated a revolution in the genomics of the fungal kingdom. In due course of time, a substantial number of fungal genomes have been sequenced and publicly released, representing the widest sampling of genomes from any eukaryotic kingdom. An ambitious genome-sequencing program provides a wealth of data on metabolic diversity within the fungal kingdom, thereby enhancing research into medical science, agriculture science, ecology, bioremediation, bioenergy, and the biotechnology industry. Fungal genomics have higher potential to positively affect human health, environmental health, and the planet's stored energy. With a significant increase in sequenced fungal genomes, the known diversity of genes encoding organic acids, antibiotics, enzymes, and their pathways has increased exponentially. Currently, over a hundred fungal genome sequences are publicly available; however, no inclusive review has been published. This review is an initiative to address the significance of the fungal genome-sequencing program and provides the road map for basic and applied research. PMID:25721271

  11. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    Directory of Open Access Journals (Sweden)

    Weitemier Kevin

    2011-05-01

    Background: Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results: A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions: The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species.

  12. Sequence of the chloroplast 16S rRNA gene and its surrounding regions of Chlamydomonas reinhardii.

    OpenAIRE

    Dron, M; Rahire, M; Rochaix, J D

    1982-01-01

    The sequence of a 2 kb DNA fragment containing the chloroplast 16S ribosomal RNA gene from Chlamydomonas reinhardii and its flanking regions has been determined. The algal 16S rRNA sequence (1475 nucleotides) and secondary structure are highly related to those found in bacteria and in the chloroplasts of higher plants. In contrast, the flanking regions are very different. In C. reinhardii the 16S rRNA gene is surrounded by AT rich segments of about 180 bases, which are followed by a long stre...

  13. Draft Genome Sequence of Lactobacillus rhamnosus 2166.

    OpenAIRE

    Karlyshev, Andrey V.; Melnikov, Vyacheslav G.; Kosarev, Igor V.; Abramov, Vyacheslav M.

    2014-01-01

    In this report, we present a draft sequence of the genome of Lactobacillus rhamnosus strain 2166, a potential novel probiotic. Genome annotation and read mapping onto a reference genome of L. rhamnosus strain GG allowed for the identification of the differences and similarities in the genomic contents and gene arrangements of these strains.

  14. Update on chloroplast research

    OpenAIRE

    Armbruster, Ute; Pesaresi, Paolo; Pribil, Mathias; Hertle, Alexander; Leister, Dario

    2010-01-01

    Chloroplasts, the green differentiation form of plastids, are the sites of photosynthesis and other important plant functions. Genetic and genomic technologies have greatly boosted the rate of discovery and functional characterization of chloroplast proteins during the past decade. Indeed, data obtained using high-throughput methodologies, in particular proteomics and transcriptomics, are now routinely used to assign functions to chloroplast proteins. Our knowledge of many chloroplast process...

  15. Phylogeny of Populus (Salicaceae) based on nucleotide sequences of chloroplast trnT-trnF region and nuclear rDNA.

    Science.gov (United States)

    Hamzeh, Mona; Dayanandan, Selvadurai

    2004-09-01

    The species of the genus Populus, collectively known as poplars, are widely distributed over the northern hemisphere and well known for their ecological, economical, and evolutionary importance. The extensive interspecific hybridization and high morphological diversity in this group pose difficulties in identifying taxonomic units for comparative evolutionary studies and systematics. To understand the evolutionary relationships among poplars and to provide a framework for biosystematic classification, we reconstructed a phylogeny of the genus Populus based on nucleotide sequences of three noncoding regions of the chloroplast DNA (intron of trnL and intergenic regions of trnT-trnL and trnL-trnF) and ITS1 and ITS2 of the nuclear rDNA. The resulting phylogenetic trees showed polyphyletic relationships among species in the sections Tacamahaca and Aigeiros. Based on chloroplast DNA sequence data, P. nigra had a close affinity to species of section Populus, whereas nuclear DNA sequence data suggested a close relationship between P. nigra and species of the section Aigeiros, suggesting a possible hybrid origin for P. nigra. Similarly, the chloroplast DNA sequences of P. tristis and P. szechuanica were similar to that of the species of section Aigeiros, while the nuclear sequences revealed a close affinity to species of the section Tacamahaca, suggesting a hybrid origin for these two Asiatic balsam poplars. The incongruence between phylogenetic trees based on nuclear- and chloroplast-DNA sequence data suggests a reticulate evolution in the genus Populus. PMID:21652373

  16. Value of a newly sequenced bacterial genome.

    Science.gov (United States)

    Barbosa, Eudes Gv; Aburjaile, Flavia F; Ramos, Rommel Tj; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2014-05-26

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

  17. Value of a newly sequenced bacterial genome

    Institute of Scientific and Technical Information of China (English)

    Barbosa, Eudes GV; Aburjaile, Flavia F; Ramos, Rommel TJ; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information.

  18. [Analysis of microsatellite loci of the chloroplast genome in the genus Capsicum (Pepper)].

    Science.gov (United States)

    Ryzhova, N N; Kochieva, E Z

    2004-08-01

    Six plastome microsatellites were examined in 43 accessions of the genus Capsicum. In total, 33 allelic variants were detected. A specific haplotype of chloroplast DNA was identified for each Capsicum species. Species-specific allelic variants were found for most wild Capsicum species. The highest intraspecific variation was observed for the C. baccatum plastome. Low cpDNA polymorphism was characteristic of C. annuum: the cpSSRs were either monomorphic or dimorphic. The vast majority of C. annuum accessions each had alleles of one type. Another allele type was rare and occurred only in wild accessions. The results testified again to genetic conservation of C. annuum and especially its cultivated forms. The phylogenetic relationships established for the Capsicum species on the basis of plastome analysis were similar to those inferred from the morphological traits, isozyme patterns, and molecular analysis of the nuclear genome. PMID:15523848

  19. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing

    Directory of Open Access Journals (Sweden)

    Zdepski Anna

    2011-05-01

    Background: High throughput sequencing (HTS) technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants that contain nuclear DNA, as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. Results: We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR). We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Conclusions: Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants.

  20. Maternal inheritance of chloroplast genome and paternal inheritance of mitochondrial genome in bananas (Musa acuminata).

    Science.gov (United States)

    Fauré, S; Noyer, J L; Carreel, F; Horry, J P; Bakry, F; Lanaud, C

    1994-03-01

    Restriction fragment length polymorphisms (RFLPs) were used as markers to determine the transmission of cytoplasmic DNA in diploid banana crosses. Progenies from two controlled crosses were studied with heterologous cytoplasmic probes. This analysis provided evidence for a strong bias towards maternal transmission of chloroplast DNA and paternal transmission of mitochondrial DNA in Musa acuminata. These results suggest the existence of two separate mechanisms of organelle transmission and selection, but no model to explain this can be proposed at the present time. Knowledge of the organelle mode of inheritance constitutes an important point for phylogeny analyses in bananas and may offer a powerful tool to confirm hybrid origins. PMID:7923414

  1. Accurate and comprehensive sequencing of personal genomes

    OpenAIRE

    Ajay, Subramanian S.; Parker, Stephen C.J.; Ozel Abaan, Hatice; Fuentes Fajardo, Karin V.; Margulies, Elliott H.

    2011-01-01

    As whole-genome sequencing becomes commoditized and we begin to sequence and analyze personal genomes for clinical and diagnostic purposes, it is necessary to understand what constitutes a complete sequencing experiment for determining genotypes and detecting single-nucleotide variants. Here, we show that the current recommendation of ∼30× coverage is not adequate to produce genotype calls across a large fraction of the genome with acceptably low error rates. Our results are based on analyses...

  2. Automated correction of genome sequence errors

    OpenAIRE

    Gajer, Pawel; Schatz, Michael; Salzberg, Steven L

    2004-01-01

    By using information from an assembly of a genome, a new program called AutoEditor significantly improves base calling accuracy over that achieved by previous algorithms. This in turn improves the overall accuracy of genome sequences and facilitates the use of these sequences for polymorphism discovery. We describe the algorithm and its application in a large set of recent genome sequencing projects. The number of erroneous base calls in these projects was reduced by 80%. In an analysis of ov...

  3. A clade uniting the green algae Mesostigma viride and Chlorokybus atmophyticus represents the deepest branch of the Streptophyta in chloroplast genome-based phylogenies

    Directory of Open Access Journals (Sweden)

    Turmel Monique

    2007-01-01

    Background: The Viridiplantae comprise two major phyla: the Streptophyta, containing the charophycean green algae and all land plants, and the Chlorophyta, containing the remaining green algae. Despite recent progress in unravelling phylogenetic relationships among major green plant lineages, problematic nodes still remain in the green tree of life. One of the major issues concerns the scaly biflagellate Mesostigma viride, which is either regarded as representing the earliest divergence of the Streptophyta or a separate lineage that diverged before the Chlorophyta and Streptophyta. Phylogenies based on chloroplast and mitochondrial genomes support the latter view. Because some green plant lineages are not represented in these phylogenies, sparse taxon sampling has been suspected to yield misleading topologies. Here, we describe the complete chloroplast DNA (cpDNA) sequence of the early-diverging charophycean alga Chlorokybus atmophyticus and present chloroplast genome-based phylogenies with an expanded taxon sampling. Results: The 152,254 bp Chlorokybus cpDNA closely resembles its Mesostigma homologue at the gene content and gene order levels. Using various methods of phylogenetic inference, we analyzed amino acid and nucleotide data sets that were derived from 45 protein-coding genes common to the cpDNAs of 37 green algal/land plant taxa and eight non-green algae. Unexpectedly, all best trees recovered a robust clade uniting Chlorokybus and Mesostigma. In protein trees, this clade was sister to all streptophytes and chlorophytes and this placement received moderate support. In contrast, gene trees provided unequivocal support to the notion that the Mesostigma + Chlorokybus clade represents the earliest-diverging branch of the Streptophyta. Independent analyses of structural data (gene content and/or gene order) and of subsets of amino acid data progressively enriched in slow-evolving sites led us to conclude that the latter topology

  4. The complete plastid genome sequence of Panax notoginseng, a famous traditional Chinese medicinal plant of the family Araliaceae.

    Science.gov (United States)

    Zhang, Dan; Li, Wei; Gao, Chengwen; Liu, Yuan; Gao, Li-Zhi

    2016-09-01

    We report the complete nucleotide sequence of the Panax notoginseng chloroplast genome, determined using next-generation sequencing technology. The genome consists of 156 324 bp containing a pair of inverted repeats (IRs) of 26 105 bp, which are separated by a large single-copy region and a small single-copy region of 86 082 bp and 18 032 bp, respectively. The P. notoginseng cp genome encodes 114 unigenes (80 protein-coding genes, 4 rRNA genes, and 30 tRNA genes), of which 18 are duplicated in the IR regions. The genic regions account for 51.1% of the whole cp genome, and the GC content of the plastome was 38.1%. A phylogenomic analysis of the 10 complete chloroplast genomes from Araliaceae, with Hydrocotyle verticillata as the outgroup, showed that P. notoginseng is closely related to P. ginseng, which also belongs to the genus Panax. PMID:26365031
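
    The region sizes quoted in this abstract can be cross-checked with simple arithmetic: a quadripartite plastome has one large single-copy region, one small single-copy region, and two copies of the inverted repeat. The short Python sketch below is only an illustrative check; the function name is an assumption made here, and the numbers are the ones reported above.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: one LSC, one SSC and two inverted-repeat copies."""
    return lsc + ssc + 2 * ir


# Values quoted in the abstract (bp).
LSC, SSC, IR, REPORTED_TOTAL = 86_082, 18_032, 26_105, 156_324

total = quadripartite_length(LSC, SSC, IR)
gene_count = 80 + 4 + 30  # protein-coding + rRNA + tRNA genes

print(total, total == REPORTED_TOTAL)  # 156324 True
print(gene_count)                      # 114
```

    Both the total length and the gene count agree with the figures stated in the abstract.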

  5. The Bryopsis hypnoides plastid genome: multimeric forms and complete nucleotide sequence.

    Directory of Open Access Journals (Sweden)

    Fang Lü

    BACKGROUND: Bryopsis hypnoides Lamouroux is a siphonous green alga, and its extruded protoplasm can aggregate spontaneously in seawater and develop into mature individuals. The chloroplast of B. hypnoides is the biggest organelle in the cell and shows strong autonomy. To better understand this organelle, we sequenced and analyzed the chloroplast genome of this green alga. PRINCIPAL FINDINGS: A total of 111 functional genes, including 69 potential protein-coding genes, 5 ribosomal RNA genes, and 37 tRNA genes were identified. The genome size (153,429 bp), arrangement, and inverted-repeat (IR)-lacking structure of the B. hypnoides chloroplast DNA (cpDNA) closely resembles that of Chlorella vulgaris. Furthermore, our cytogenomic investigations using pulsed-field gel electrophoresis (PFGE) and Southern blotting methods showed that the B. hypnoides cpDNA had multimeric forms, including monomer, dimer, trimer, tetramer, and even higher multimers, which is similar to the higher order organization observed previously for higher plant cpDNA. The relative amounts of the four multimeric cpDNA forms were estimated to be about 1, 1/2, 1/4, and 1/8 based on molecular hybridization analysis. Phylogenetic analyses based on a concatenated alignment of chloroplast protein sequences suggested that B. hypnoides is sister to all Chlorophyceae and this placement received moderate support. CONCLUSION: All of the results suggest that the autonomy of the chloroplasts of B. hypnoides has little to do with the size and gene content of the cpDNA, and the IR-lacking structure of the chloroplasts indirectly demonstrated that the multimeric molecules might result from the random cleavage and fusion of replication intermediates instead of recombinational events.

  6. Sequence Maneuverer: tool for sequence extraction from genomes

    OpenAIRE

    Yasmin, Tayyaba; Rehman, Inayat Ur; Ansari, Adnan Ahmad; Liaqat, Khurrum; Khan, Muhammad Irfan

    2012-01-01

    The availability of genomic sequences of many organisms has opened new challenges in many aspects, particularly in terms of genome analysis. Sequence extraction is a vital step and many tools have been developed to solve this issue. These tools are publicly available but have limitations with reference to the sequence extraction, length of the sequence to be extracted, organism specificity and lack of a user-friendly interface. We have developed a Java-based software package having three modul...

  7. The nucleotide sequence of 4.5S ribosomal RNA from tobacco chloroplasts.

    OpenAIRE

    Takaiwa, F; Sugiura, M

    1980-01-01

    The nucleotide sequence of tobacco chloroplast 4.5S ribosomal RNA has been determined to be: OHG-A-A-G-G-U-C-A-C-G-G-C-G-A-G-A-C-G-A-G-C-C-G-U-U-U-A-U-C-A-U-U-A-C-G-A-U-A-G-G-U-G-U-C-A-A-G-U-G-G-A-A-G-U-G-C-A-G-U-G-A-U-G-U-A-U-G-C-(G-A)-C-U-G-A-G-G-C-A-U-C-C-U-A-A-C-A-G-A-C-C-G-G-U-A-G-A-C-U-U-G-A-A-COH. The 4.5S RNA is 103 nucleotides long and its 5'-terminus is not phosphorylated.
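
    The stated length can be checked directly against the sequence as printed. The Python sketch below is an illustrative check only: it assumes the leading "OH" and trailing "OH" mark the unphosphorylated termini rather than residues, and that the parenthesised "(G-A)" contributes two nucleotides; the helper name is hypothetical.

```python
from collections import Counter

# Sequence exactly as printed in the abstract, split over three literals.
RAW = (
    "OHG-A-A-G-G-U-C-A-C-G-G-C-G-A-G-A-C-G-A-G-C-C-G-U-U-U-A-U-C-A-U-U-A-C-G-A-U-A-G-G-"
    "U-G-U-C-A-A-G-U-G-G-A-A-G-U-G-C-A-G-U-G-A-U-G-U-A-U-G-C-(G-A)-C-U-G-A-G-G-C-A-U-C-"
    "C-U-A-A-C-A-G-A-C-C-G-G-U-A-G-A-C-U-U-G-A-A-COH"
)


def parse_residues(raw: str) -> list:
    """Split the hyphenated notation into single-letter residues."""
    tokens = raw.replace("(", "").replace(")", "").split("-")
    # Assumption: "OH" at either end denotes a terminal hydroxyl, not a base.
    if tokens[0].startswith("OH"):
        tokens[0] = tokens[0][2:]
    if tokens[-1].endswith("OH"):
        tokens[-1] = tokens[-1][:-2]
    bad = [t for t in tokens if t not in ("A", "C", "G", "U")]
    if bad:
        raise ValueError(f"unexpected tokens: {bad}")
    return tokens


residues = parse_residues(RAW)
print(len(residues))      # 103
print(Counter(residues))  # base composition
```

    Under these assumptions the printed sequence has 103 residues, matching the length given in the abstract.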

  8. CURE-Chloroplast: A chloroplast C-to-U RNA editing predictor for seed plants

    Directory of Open Access Journals (Sweden)

    Li Yanda

    2009-05-01

    Background: RNA editing is a type of post-transcriptional modification of RNA and belongs to the class of mechanisms that contribute to the complexity of transcriptomes. C-to-U RNA editing is commonly observed in plant mitochondria and chloroplasts. The in vivo mechanism of recognizing C-to-U RNA editing sites is still unknown. In recent years, many efforts have been made to computationally predict C-to-U RNA editing sites in the mitochondria of seed plants, but there is still no algorithm available for C-to-U RNA editing site prediction in the chloroplasts of seed plants. Results: In this paper, we extend our algorithm CURE, which can accurately predict the C-to-U RNA editing sites in mitochondria, to predict C-to-U RNA editing sites in the chloroplasts of seed plants. The algorithm achieves over 80% sensitivity and over 99% specificity. We implement the algorithm as an online service called CURE-Chloroplast (http://bioinfo.au.tsinghua.edu.cn/pure). Conclusion: CURE-Chloroplast is an online service for predicting the C-to-U RNA editing sites in the chloroplasts of seed plants. The online service allows the processing of entire chloroplast genome sequences. Since CURE-Chloroplast performs very well, it could be a helpful tool in the study of C-to-U RNA editing in the chloroplasts of seed plants.
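
    For readers less familiar with the two performance measures quoted above, the Python sketch below shows the standard definitions of sensitivity and specificity. It is not the CURE-Chloroplast algorithm, and the counts are hypothetical values chosen only to illustrate a predictor that would clear the reported thresholds.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true editing sites that are predicted as edited."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of unedited cytidines that are correctly left unpredicted."""
    return tn / (tn + fp)


# Hypothetical confusion-matrix counts, not taken from the paper.
tp, fn = 85, 15      # true editing sites: found / missed
tn, fp = 9_950, 50   # unedited cytidines: rejected / falsely predicted

print(f"sensitivity = {sensitivity(tp, fn):.3f}")  # sensitivity = 0.850
print(f"specificity = {specificity(tn, fp):.3f}")  # specificity = 0.995
```

    Because unedited cytidines vastly outnumber editing sites in a chloroplast genome, even a specificity above 99% can leave a noticeable number of false positives, which is why both measures are reported together.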

  9. Comparative chloroplast genomics: Analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus

    Energy Technology Data Exchange (ETDEWEB)

    Raubeson, Linda A.; Peery, Rhiannon; Chumley, Timothy W.; Dziubek, Chris; Fourcade, H. Matthew; Boore, Jeffrey L.; Jansen, Robert K.

    2007-03-01

    The number of completely sequenced plastid genomes available is growing rapidly. This new array of sequences presents new opportunities to perform comparative analyses. In comparative studies, it is most useful to compare across wide phylogenetic spans and, within angiosperms, to include representatives from basally diverging lineages such as the new genomes reported here: Nuphar advena (from a basal-most lineage) and Ranunculus macranthus (from the basal group of eudicots). We report these two new plastid genome sequences and make comparisons (within angiosperms, seed plants, or all photosynthetic lineages) to evaluate features such as the status of ycf15 and ycf68 as protein coding genes, the distribution of simple sequence repeats (SSRs) and longer dispersed repeats (SDR), and patterns of nucleotide composition.

  10. Towards a reference pecan genome sequence

    Science.gov (United States)

    The cost of generating DNA sequence data has declined dramatically over the previous 15 years as a result of the Human Genome Project and the potential applications of genome sequencing for human medicine. This cost reduction has generated renewed interest among crop breeding scientists in applying...

  11. Phylogeny of the quadriflagellate Volvocales (Chlorophyceae) based on chloroplast multigene sequences.

    Science.gov (United States)

    Nozaki, Hisayoshi; Misumi, Osami; Kuroiwa, Tsuneyoshi

    2003-10-01

    Since the phylogenetic relationships of the green plants (green algae and land plants) have been extensively studied using 18S ribosomal RNA sequences, change in the arrangement of basal bodies in flagellate cells is considered to be one of the major evolutionary events in the green plants. However, the phylogenetic relationships between biflagellate and quadriflagellate species within the Volvocales remain uncertain. This study examined the phylogeny of three genera of quadriflagellate Volvocales (Carteria, Pseudocarteria, and Hafniomonas) using concatenated sequences from three chloroplast genes. Using these multigene sequences, all three quadriflagellate genera were basal to other members (biflagellates) of the CW (clockwise) group (the Volvocales and their relatives, the Chlorophyceae) and formed three robust clades. Since the flagellar apparatuses of these three quadriflagellate lineages are diverse, including counter clockwise (CCW) and CW orientation of the basal bodies, the CW orientation of the basal bodies might have evolved from the CCW orientation in the ancestral quadriflagellate volvocalean algae, giving rise to the biflagellates, major members of the CW group. PMID:12967607

  12. The complete plastid genome sequence of Eustrephus latifolius (Asparagaceae: Lomandroideae).

    Science.gov (United States)

    Kim, Hyoung Tae; Kim, Jung Sung; Kim, Joo-Hwan

    2016-01-01

    The complete chloroplast (cp) genome sequence of Eustrephus latifolius is the first to be determined in subfamily Lomandroideae of family Asparagaceae. It was 159,736 bp and contained a large single copy region (82,403 bp) and a small single copy region (13,607 bp) which were separated by two inverted repeat regions (31,863 bp each). In total, 132 genes were identified, consisting of 83 coding genes, 8 rRNA genes, 38 tRNA genes, and 3 pseudogenes. rpl23 and clpP were pseudogenes due to sequence deletions. Among 23 genes containing introns, rps12 and ycf3 contained two introns and the rest had just one intron. The intact ycf68 was identified within an intron of trnI-GAU. Its amino acid sequence was almost identical to that of Phoenix dactylifera in Arecales. Ycf1 of E. latifolius was located entirely in the IR, similar to the cp genome structure of Lemna minor, Spirodela polyrhiza, Wolffiella lingulata, and Wolffia australiana in Alismatales. PMID:25186113
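    The region lengths reported for E. latifolius can be checked against the stated total, since a quadripartite plastid genome is one large single copy region, one small single copy region, and two inverted repeat copies. The sketch below is illustrative only; the function name `plastome_length` is an assumption and not part of the cited work, and the figures are those quoted in the abstract.

```python
def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastid genome.

    lsc: large single copy region length (bp)
    ssc: small single copy region length (bp)
    ir:  length of ONE inverted repeat copy (bp); it is counted twice
    """
    return lsc + ssc + 2 * ir


# Region lengths quoted in the abstract for Eustrephus latifolius.
total = plastome_length(lsc=82_403, ssc=13_607, ir=31_863)
print(total)  # 159736, matching the reported genome size
```

    The same check applies to any record in this list that reports LSC, SSC and IR lengths alongside a total.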

  13. Sequence Maneuverer: tool for sequence extraction from genomes

    Science.gov (United States)

    Yasmin, Tayyaba; Rehman, Inayat Ur; Ansari, Adnan Ahmad; Liaqat, Khurrum; Khan, Muhammad Irfan

    2012-01-01

    The availability of genomic sequences of many organisms has opened new challenges in many aspects, particularly in terms of genome analysis. Sequence extraction is a vital step, and many tools have been developed for it. These tools are publicly available but have limitations with respect to sequence extraction, the length of the sequence to be extracted, organism specificity, and the lack of a user-friendly interface. We have developed a Java-based software package with three modules that can be used independently or sequentially. The tool efficiently extracts sequences from large datasets in a few simple steps. It can efficiently extract multiple sequences of any desired length from the genome of any organism. The results were cross-checked against published data. Availability URL 1: http://ww3.comsats.edu.pk/bio/ResearchProjects.aspx URL 2: http://ww3.comsats.edu.pk/bio/SequenceManeuverer.aspx PMID:23275734

  14. A Phylogenetic Analysis of 34 Chloroplast Genomes Elucidates the Relationships between Wild and Domestic Species within the Genus Citrus

    OpenAIRE

    Carbonell-Caballero, Jose; Alonso, Roberto; Ibañez, Victoria; Terol, Javier; Talon, Manuel; Dopazo, Joaquin

    2015-01-01

    Citrus genus includes some of the most important cultivated fruit trees worldwide. Despite being extensively studied because of its commercial relevance, the origin of cultivated citrus species and the history of its domestication still remain an open question. Here, we present a phylogenetic analysis of the chloroplast genomes of 34 citrus genotypes which constitutes the most comprehensive and detailed study to date on the evolution and variability of the genus Citrus. A statistical model wa...

  15. Genomic sequencing of Pleistocene cave bears

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  16. A Phylogenetic Analysis of 34 Chloroplast Genomes Elucidates the Relationships between Wild and Domestic Species within the Genus Citrus.

    Science.gov (United States)

    Carbonell-Caballero, Jose; Alonso, Roberto; Ibañez, Victoria; Terol, Javier; Talon, Manuel; Dopazo, Joaquin

    2015-08-01

    Citrus genus includes some of the most important cultivated fruit trees worldwide. Despite being extensively studied because of its commercial relevance, the origin of cultivated citrus species and the history of its domestication still remain an open question. Here, we present a phylogenetic analysis of the chloroplast genomes of 34 citrus genotypes which constitutes the most comprehensive and detailed study to date on the evolution and variability of the genus Citrus. A statistical model was used to estimate divergence times between the major citrus groups. Additionally, a complete map of the variability across the genome of different citrus species was produced, including single nucleotide variants, heteroplasmic positions, indels (insertions and deletions), and large structural variants. The distribution of all these variants provided further independent support to the phylogeny obtained. An unexpected finding was the high level of heteroplasmy found in several of the analyzed genomes. The use of the complete chloroplast DNA not only paves the way for a better understanding of the phylogenetic relationships within the Citrus genus but also provides original insights into other elusive evolutionary processes, such as chloroplast inheritance, heteroplasmy, and gene selection. PMID:25873589

  17. Systematic positions of Lamiophlomis and Paraphlomis (Lamiaceae) based on nuclear and chloroplast sequences

    Institute of Scientific and Technical Information of China (English)

    Yue-Zhi PAN; Li-Qin FANG; Gang HAO; Jie CAI; Xun GONG

    2009-01-01

    Genera Lamiophlomis and Paraphlomis were originally separated from genus Phlomis s.l. on the basis of particular morphological characteristics. However, their relationship was highly contentious, as evidenced by the literature. In the present paper, the systematic positions of Lamiophlomis, Paraphlomis, and their related genera were assessed based on nuclear internal transcribed spacer (ITS) and chloroplast rpl16 and trnL-F sequence data using maximum parsimony (MP) and Bayesian methods. In total, 24 species representing six genera of the ingroup and outgroup were sampled. Analyses of both separate and combined sequence data were conducted to resolve the systematic relationships of these genera. The results reveal that Lamiophlomis is nested within Phlomis sect. Phlomoides and its generic status is not supported. With the inclusion of Lamiophlomis rotata in sect. Phlomoides, sections Phlomis and Phlomoides of Phlomis were resolved as monophyletic. Paraphlomis was supported as an independent genus. However, the resolution of its monophyly conflicted between MP and Bayesian analyses, suggesting the need for expanded sampling and further evidence.

  18. Complex chloroplast RNA metabolism: just debugging the genetic programme?

    Directory of Open Access Journals (Sweden)

    Schmitz-Linneweber Christian

    2008-08-01

    Background: The gene expression system of chloroplasts is far more complex than that of their cyanobacterial progenitor. This gain in complexity affects in particular RNA metabolism, specifically the transcription and maturation of RNA. Mature chloroplast RNA is generated by a plethora of nuclear-encoded proteins acquired or recruited during plant evolution, comprising additional RNA polymerases and sigma factors, and sequence-specific RNA maturation factors promoting RNA splicing, editing, end formation and translatability. Despite years of intensive research, we still lack a comprehensive explanation for this complexity. Results: We inspected the available literature and genome databases for information on components of RNA metabolism in land plant chloroplasts. In particular, new inventions of chloroplast-specific mechanisms and the expansion of some gene/protein families detected in land plants lead us to suggest that the primary function of the additional nuclear-encoded components found in chloroplasts is the transgenomic suppression of point mutations, fixation of which occurred due to an enhanced genetic drift exhibited by chloroplast genomes. We further speculate that a fast evolution of transgenomic suppressors occurred after the water-to-land transition of plants. Conclusion: Our inspections indicate that several chloroplast-specific mechanisms evolved in land plants to remedy point mutations that occurred after the water-to-land transition. Thus, the complexity of chloroplast gene expression evolved to guarantee the functionality of chloroplast genetic information and may not, with some exceptions, be involved in regulatory functions.

  19. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Directory of Open Access Journals (Sweden)

    Roger Barthelson

    BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly

  20. Comparison of 61 Sequenced Escherichia coli Genomes

    DEFF Research Database (Denmark)

    Lukjancenko, Oksana; Wassenaar, T. M.; Ussery, David

    2010-01-01

    Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publicly available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics...... the pan-genome and about 80% of a typical genome; some of these variable genes tend to be co-localized on genomic islands. The diversity within the species E. coli, and the overlap in gene content between this and related species, suggests a continuum rather than sharp species borders in this group of...

  1. Complete sequence of heterogenous-composition mitochondrial genome (Brassica napus) and its exogenous source

    Directory of Open Access Journals (Sweden)

    Wang Juan

    2012-11-01

    Background: Unlike maternal inheritance of mitochondria in sexual reproduction, somatic hybrids follow no obvious pattern. The introgressed segment orf138 from the mitochondrial genome of radish (Raphanus sativus) to its counterpart in rapeseed (Brassica napus) demonstrates that this inheritance mode derives from the cytoplasm of both parents. Sequencing of the complete mitochondrial genome of five species from the Brassica family allowed the prediction of other extraneous sources of the cybrids from the radish parent, and the determination of their mitochondrial rearrangement. Results: We obtained the complete mitochondrial genome of Ogura-cms-cybrid (oguC) rapeseed. To date, this is the first time that a heterogeneously composed mitochondrial genome was sequenced. The 258,473 bp master circle consisted of 33 protein-coding genes, 3 rRNA sequences, and 23 tRNA sequences. This mitotype noticeably holds two copies of atp9 and is devoid of cox2-2. Relative to the nap mitochondrial genome, 40 point mutations were scattered in the 23 protein-coding genes. atp6 even has an abnormal start locus whereas tatC has an abnormal end locus. The rearrangement of the 22 syntenic regions that comprised 80.11% of the genome was influenced by short repeats. A pair of large repeats (9731 bp) was responsible for the multipartite structure. Nine unique regions were detected when compared with other published Brassica mitochondrial genome sequences. We also found six homologous chloroplast segments (Brassica napus). Conclusions: The mitochondrial genome of oguC is quite divergent from nap and pol, which are more similar to each other. We analyzed the unique regions of every genome of the Brassica family, and found that very few segments were specific for these six mitotypes, especially cam, jun, and ole, which have no specific segments at all. Therefore, we conclude that the most specific regions of oguC possibly came from radish. Compared with the chloroplast genome

  2. Microbial species delineation using whole genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Mukherjee, Supratim; Ivanova, Natalia; Mavrommatics, Kostas; Pati, Amrita; Konstantinidis, Konstantinos

    2014-10-20

    Species assignments in prokaryotes use a manual, poly-phasic approach utilizing both phenotypic traits and sequence information of phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform and scalable approach exploiting the rich genomic information in whole genome sequences is desired, at least for the initial assignment of species to an organism. We have evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, identifying robust and widely applicable hard cut-offs for species assignments based on AF and gANI. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species mis-assignments and facilitated the assignment of species for organisms without species definitions.

  3. The characterization of twenty sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Kimberly Pelak

    2010-09-01

    We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways.

  4. Genome sequence and analysis of Lactobacillus helveticus

    Directory of Open Access Journals (Sweden)

    Paola Cremonesi

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of L. helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genomes of five L. helveticus strains were sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones.

  5. Molecular phylogeny of Asian Meconopsis based on nuclear ribosomal and chloroplast DNA sequence data.

    Science.gov (United States)

    Liu, Yu-Cheng; Liu, Ya-Nan; Yang, Fu-Sheng; Wang, Xiao-Quan

    2014-01-01

    The taxonomy and phylogeny of Asian Meconopsis (Himalayan blue poppy) remain largely unresolved. We used the internal transcribed spacer (ITS) region of nuclear ribosomal DNA (nrDNA) and the chloroplast DNA (cpDNA) trnL-F region for phylogenetic reconstruction of Meconopsis and its close relatives Papaver, Roemeria, and Stylomecon. We identified five main clades, which were well-supported in the gene trees reconstructed with the nrDNA ITS and cpDNA trnL-F sequences. We found that 41 species of Asian Meconopsis did not constitute a monophyletic clade, but formed two solid clades (I and V) separated in the phylogenetic tree by three clades (II, III and IV) of Papaver and its allies. Clade V includes only four Asian Meconopsis species, with the remaining 90 percent of Asian species included in clade I. In this core Asian Meconopsis clade, five subclades (Ia-Ie) were recognized in the nrDNA ITS tree. Three species (Meconopsis discigera, M. pinnatifolia, and M. torquata) of subgenus Discogyne were embedded in subclade Ia, indicating that the present definition of subgenera in Meconopsis should be rejected. These subclades are inconsistent with any series or sections of the present classifications, suggesting that classifications of the genus should be completely revised. Finally, proposals for further revision of the genus Meconopsis were put forward based on molecular, morphological, and biogeographical evidence. PMID:25118100

  6. Sequencing and comparing whole mitochondrial genomes of animals

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  7. Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria

    DEFF Research Database (Denmark)

    Larsen, Mette Voldby; Cosentino, Salvatore; Rasmussen, Simon;

    2012-01-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS)...

  8. Genomic prediction using QTL derived from whole genome sequence data

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Janss, Luc;

    This study investigated the gain in accuracy of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k SNP data. Analyses were performed for Nordic Holstein and Danish Jersey animals, using either...... a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model, results showed increases in accuracy of up to two percentage points for production traits in both Holstein and Jersey animals by including the extra variants in the analysis, and an extra 1.5 percentage points...

  9. Simple sequence repeats in mycobacterial genomes

    Indian Academy of Sciences (India)

    Vattipally B Sreenu; Pankaj Kumar; Javaregowda Nagaraju; Hampapathalu A Nagarajaram

    2007-01-01

    Simple sequence repeats (SSRs) or microsatellites are repetitive nucleotide sequences with motifs of length 1–6 bp. They are scattered throughout the genomes of all known organisms, ranging from viruses to eukaryotes. Microsatellites undergo mutations in the form of insertions and deletions (INDELS) of their repeat units, with some bias towards insertions that lead to microsatellite tract expansion. Although prokaryotic genomes derive some plasticity from microsatellite mutations, they have in-built mechanisms to arrest undue expansions of microsatellites, and one such mechanism is constituted by the post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these enzymes, and as a null hypothesis one could expect these genomes to harbour many long tracts. It is therefore interesting to analyse the mycobacterial genomes for distribution and abundance of microsatellite tracts and to look for potentially polymorphic microsatellites. Available mycobacterial genomes, Mycobacterium avium, M. leprae, M. bovis and the two strains of M. tuberculosis (CDC1551 and H37Rv), were analysed for frequencies and abundance of SSRs. Our analysis revealed that the SSRs are distributed throughout the mycobacterial genomes at an average of 220–230 SSR tracts per kb. All the mycobacterial genomes contain a few regions that are conspicuously denser or poorer in microsatellites compared to their expected genome averages. The genomes distinctly show scarcity of long microsatellites despite the absence of a post-replicative DNA repair system. Such severe scarcity of long microsatellites could arise as a result of strong selection pressures operating against long and unstable sequences, although the influence of GC-content and the role of point mutations in arresting microsatellite expansions cannot be ruled out. Nonetheless, the long tracts occasionally found in coding as well as non-coding regions may account for limited genome plasticity in these genomes.
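    As an illustration of the SSR definition used above (tandem repeats of a 1–6 bp motif), the following is a minimal sketch of a perfect-repeat scanner. It is not the authors' method: the function name `find_ssrs` and the `min_repeats` threshold are assumptions made for this example, and the abstract does not state which minimum tract lengths were used.

```python
import re


def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    min_repeats is a hypothetical threshold (must be >= 2); max_motif=6
    follows the 1-6 bp motif definition quoted in the abstract.
    """
    # The lazy {1,n}? makes the regex try the shortest motif first, so
    # "AAAAAA" is reported as A x 6 rather than AA x 3.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        # Whole match length divided by motif length gives the copy number.
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


print(find_ssrs("GGACACACACTTTTTTG"))
```

    The scanner reports only perfect, non-overlapping tracts; imperfect repeats with interruptions would need a different approach.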

  10. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    DEFF Research Database (Denmark)

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D;

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls......, for imputing sequence variant genotypes into reference sets for genomic prediction. Run 3.0 included 429 sequences, with 31.8 million variants detected. BayesRC, a new method for genomic prediction, addresses some challenges associated with using the sequence data, and takes advantage of biological...... information. In a dairy data set, predictions using BayesRC and imputed sequence data from 1000 Bull Genomes were 2% more accurate than with 800k data. We could demonstrate the method identified causal mutations in some cases. Further improvements will come from more accurate imputation of sequence variant...

  11. Common sequence motifs coding for higher-plant and prokaryotic O-acetylserine (thiol)-lyases: bacterial origin of a chloroplast transit peptide?

    Science.gov (United States)

    Rolland, N; Job, D; Douce, R

    1993-08-01

    A comparison of the amino acid sequence of O-acetylserine (thiol)-lyase (EC 4.2.99.8) from Escherichia coli and the isoforms of this enzyme found in the cytosolic and chloroplastic compartments of spinach (Spinacia oleracea) leaf cells allows the essential lysine residue involved in the binding of the pyridoxal 5'-phosphate cofactor to be identified. The results of further sequence comparison of cDNAs coding for these proteins are discussed in the frame of the endosymbiotic theory of chloroplast evolution. The results are compatible with a mechanism in which the chloroplast enzyme originated from the cytosolic enzyme and both plant genes originated from a common prokaryotic ancestor. The comparison also suggests that the 5'-non-coding sequence of the bacterial gene was transferred to the plant cell nucleus and that it has been used to create the N-terminal portions of both plant enzymes, and possibly the transit peptide of the chloroplast enzyme. PMID:7916619

  12. Genome Sequence of Pseudomonas chlororaphis Strain 189

    Science.gov (United States)

    Town, Jennifer; Audy, Patrice; Boyetchko, Susan M.

    2016-01-01

    Pseudomonas chlororaphis strain 189 is a potent inhibitor of the growth of the potato pathogen Phytophthora infestans. We determined the complete, finished sequence of the 6.8-Mbp genome of this strain, consisting of a single contiguous molecule. Strain 189 is closely related to previously sequenced strains of P. chlororaphis. PMID:27340063

  13. Next-generation sequencing: applications beyond genomes

    OpenAIRE

    Marguerat, Samuel; Wilhelm, Brian T.; Bähler, Jürg

    2008-01-01

    The development of DNA sequencing more than 30 years ago has profoundly impacted biological research. In the last couple of years, remarkable technological innovations have emerged that allow the direct and cost-effective sequencing of complex samples at unprecedented scale and speed. These next-generation technologies make it feasible to sequence not only static genomes, but also entire transcriptomes expressed under different conditions. These and other powerful applications of next-generat...

  14. Genome Sequence of the Palaeopolyploid soybean

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

    2009-08-03

    Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70 percent more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78 percent of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75 percent of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

  15. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    Science.gov (United States)

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  16. Viral genome sequencing by random priming methods

    Directory of Open Access Journals (Sweden)

    Zhang Xinsheng

    2008-01-01

    Background: Most emerging health threats are of zoonotic origin. For the overwhelming majority, their causative agents are RNA viruses which include but are not limited to HIV, Influenza, SARS, Ebola, Dengue, and Hantavirus. Of increasing importance therefore is a better understanding of global viral diversity to enable better surveillance and prediction of pandemic threats; this will require rapid and flexible methods for complete viral genome sequencing. Results: We have adapted the SISPA methodology [1-3] to genome sequencing of RNA and DNA viruses. We have demonstrated the utility of the method on various types and sources of viruses, obtaining near complete genome sequence of viruses ranging in size from 3,000–15,000 kb with a median depth of coverage of 14.33. We used this technique to generate full viral genome sequence in the presence of host contaminants, using viral preparations from cell culture supernatant, allantoic fluid and fecal matter. Conclusion: The method described is of great utility in generating whole genome assemblies for viruses with little or no available sequence information, viruses from greatly divergent families, previously uncharacterized viruses, or to more fully describe mixed viral infections.

  17. Sequencing and Analysis of a Genomic Fragment Provide an Insight into the Dunaliella viridis Genomic Sequence

    Institute of Scientific and Technical Information of China (English)

    Xiao-Ming SUN; Yuan-Ping TANG; Xiang-Zong MENG; Wen-Wen ZHANG; Shan LI; Zhi-Rui DENG; Zheng-Kai XU; Ren-Tao SONG

    2006-01-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green algae. Its exceptional resistance to salt and various other stresses has made it an ideal model for stress-tolerance studies. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large numbers of simple sequence repeats (microsatellites) were found, with a strong bias towards the (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (total 25,749 bp) was in agreement with these findings. These sequence features made Dunaliella genomic DNA difficult to sequence. Further investigation should be made to reveal the biological significance of these unique sequence features.
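
    The two summary statistics quoted in this abstract, percent GC content and the prevalence of (AC)n microsatellites, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `min_copies` threshold and the toy sequences are assumptions made for illustration only.

```python
import re


def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def dinucleotide_repeats(seq: str, unit: str = "AC", min_copies: int = 3):
    """Find tandem runs of `unit` with at least `min_copies` copies.

    Returns a list of (start_index, copy_number) tuples.
    The min_copies threshold is an illustrative assumption.
    """
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    return [
        (m.start(), len(m.group()) // len(unit))
        for m in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    toy = "TTACACACACGG"  # hypothetical fragment, not real Dunaliella data
    print(gc_content(toy))
    print(dinucleotide_repeats(toy))
```

    Applied to a real fragment, the same two functions would give the kind of figures the abstract reports, although the authors' repeat-detection criteria are not stated in the record.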

  18. Sequencing and analysis of a genomic fragment provide an insight into the Dunaliella viridis genomic sequence.

    Science.gov (United States)

    Sun, Xiao-Ming; Tang, Yuan-Ping; Meng, Xiang-Zong; Zhang, Wen-Wen; Li, Shan; Deng, Zhi-Rui; Xu, Zheng-Kai; Song, Ren-Tao

    2006-11-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green algae. Its exceptional resistance to salt and various other stresses has made it an ideal model for stress-tolerance studies. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large numbers of simple sequence repeats (microsatellites) were found, with a strong bias towards the (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (total 25,749 bp) was in agreement with these findings. These sequence features made Dunaliella genomic DNA difficult to sequence. Further investigation should be made to reveal the biological significance of these unique sequence features. PMID:17091199

  19. Sorghum genome sequencing by methylation filtration.

    Directory of Open Access Journals (Sweden)

    Joseph A Bedell

    2005-01-01

    Sorghum bicolor is a close relative of maize and is a staple crop in Africa and much of the developing world because of its superior tolerance of arid growth conditions. We have generated sequence from the hypomethylated portion of the sorghum genome by applying methylation filtration (MF) technology. The evidence suggests that 96% of the genes have been sequence tagged, with an average coverage of 65% across their length. Remarkably, this level of gene discovery was accomplished after generating a raw coverage of less than 300 megabases of the 735-megabase genome. MF preferentially captures exons and introns, promoters, microRNAs, and simple sequence repeats, and minimizes interspersed repeats, thus providing a robust view of the functional parts of the genome. The sorghum MF sequence set is beneficial to research on sorghum and is also a powerful resource for comparative genomics among the grasses and across the entire plant kingdom. Thousands of hypothetical gene predictions in rice and Arabidopsis are supported by the sorghum dataset, and genomic similarities highlight evolutionarily conserved regions that will lead to a better understanding of rice and Arabidopsis.

  20. Sorghum genome sequencing by methylation filtration.

    Science.gov (United States)

    Bedell, Joseph A; Budiman, Muhammad A; Nunberg, Andrew; Citek, Robert W; Robbins, Dan; Jones, Joshua; Flick, Elizabeth; Rholfing, Theresa; Fries, Jason; Bradford, Kourtney; McMenamy, Jennifer; Smith, Michael; Holeman, Heather; Roe, Bruce A; Wiley, Graham; Korf, Ian F; Rabinowicz, Pablo D; Lakey, Nathan; McCombie, W Richard; Jeddeloh, Jeffrey A; Martienssen, Robert A

    2005-01-01

    Sorghum bicolor is a close relative of maize and is a staple crop in Africa and much of the developing world because of its superior tolerance of arid growth conditions. We have generated sequence from the hypomethylated portion of the sorghum genome by applying methylation filtration (MF) technology. The evidence suggests that 96% of the genes have been sequence tagged, with an average coverage of 65% across their length. Remarkably, this level of gene discovery was accomplished after generating a raw coverage of less than 300 megabases of the 735-megabase genome. MF preferentially captures exons and introns, promoters, microRNAs, and simple sequence repeats, and minimizes interspersed repeats, thus providing a robust view of the functional parts of the genome. The sorghum MF sequence set is beneficial to research on sorghum and is also a powerful resource for comparative genomics among the grasses and across the entire plant kingdom. Thousands of hypothetical gene predictions in rice and Arabidopsis are supported by the sorghum dataset, and genomic similarities highlight evolutionarily conserved regions that will lead to a better understanding of rice and Arabidopsis. PMID:15660154

  1. Population genetic inference from genomic sequence variation

    OpenAIRE

    Pool, John E.; Hellmann, Ines; Jeffrey D. Jensen; Nielsen, Rasmus

    2010-01-01

    Population genetics has evolved from a theory-driven field with little empirical data into a data-driven discipline in which genome-scale data sets test the limits of available models and computational analysis methods. In humans and a few model organisms, analyses of whole-genome sequence polymorphism data are currently under way. And in light of the falling costs of next-generation sequencing technologies, such studies will soon become common in many other organisms as well. Here, we assess...

  2. An International Plan to Sequence the Onion Genome

    Science.gov (United States)

    The cost of DNA sequencing continues to decline and, in the near future, it will become reasonable to undertake sequencing of the enormous nuclear genome of onion. We undertook sequencing of expressed and genomic regions of the onion genome to learn about the structure of the onion genome, as well a...

  3. The diploid genome sequence of an Asian individual

    DEFF Research Database (Denmark)

    Wang, Jun; Wang, Wei; Li, Ruiqiang;

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we...

  4. Multilocus sequence typing of total-genome-sequenced bacteria.

    Science.gov (United States)

    Larsen, Mette V; Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W; Aarestrup, Frank M; Lund, Ole

    2012-04-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST. PMID:22238442
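
    The final step described above, assigning a sequence type from the combination of alleles identified, can be sketched in Python as a table lookup. This is not the code behind the published web service. The locus names, the profile table and the hit-ranking rule (highest identity, then longest alignment) are hypothetical stand-ins for an MLST scheme and for the BLAST-based ranking mentioned in the abstract.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical three-locus scheme; real MLST schemes typically use more loci.
LOCI: Tuple[str, ...] = ("locA", "locB", "locC")

# Hypothetical profile table: allele combination -> sequence type (ST).
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 4): 7,
}

# A hit is (allele_number, percent_identity, alignment_length).
Hit = Tuple[int, float, int]


def best_allele(hits: List[Hit]) -> int:
    """Pick the best-matching allele for one locus.

    Assumed ranking: highest identity first, then longest alignment.
    """
    if not hits:
        raise ValueError("no hits for locus")
    return max(hits, key=lambda h: (h[1], h[2]))[0]


def sequence_type(alleles: Dict[str, int]) -> Optional[int]:
    """Look up the ST for an allele profile; None if the profile is unknown."""
    key = tuple(alleles[locus] for locus in LOCI)
    return PROFILES.get(key)


if __name__ == "__main__":
    hits_per_locus = {
        "locA": [(1, 99.1, 450), (3, 100.0, 450)],
        "locB": [(2, 100.0, 480)],
        "locC": [(4, 100.0, 420), (1, 100.0, 400)],
    }
    profile = {locus: best_allele(h) for locus, h in hits_per_locus.items()}
    print(profile, sequence_type(profile))
```

    Returning `None` for an unseen combination keeps a novel profile distinct from a known sequence type.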

  5. Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements

    OpenAIRE

    Darling, Aaron C.E.; Mau, Bob; Blattner, Frederick R.; Perna, Nicole T.

    2004-01-01

    As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments conserved among all the genomes under considera...

  6. Advancing Eucalyptus genomics: identification and sequencing of lignin biosynthesis genes from deep-coverage BAC libraries

    Directory of Open Access Journals (Sweden)

    Kudrna David

    2011-03-01

    Background: Eucalyptus species are among the most planted hardwoods in the world because of their rapid growth, adaptability and valuable wood properties. The development and integration of genomic resources into breeding practice will be increasingly important in the decades to come. Bacterial artificial chromosome (BAC) libraries are key genomic tools that enable positional cloning of important traits, synteny evaluation, and the development of genome framework physical maps for genetic linkage and genome sequencing. Results: We describe the construction and characterization of two deep-coverage BAC libraries, EG_Ba and EG_Bb, obtained from nuclear DNA fragments of E. grandis (clone BRASUZ1) digested with HindIII and BstYI, respectively. Genome coverages of 17 and 15 haploid genome equivalents were estimated for EG_Ba and EG_Bb, respectively. Both libraries contained large inserts, with average sizes ranging from 135 kb (EG_Bb) to 157 kb (EG_Ba), and very low extra-nuclear genome contamination, providing a probability of finding a single-copy gene ≥ 99.99%. Libraries were screened for the presence of several genes of interest via hybridizations to high-density BAC filters followed by PCR validation. Five selected BAC clones were sequenced and assembled using the Roche GS FLX technology, providing the whole sequence of the E. grandis chloroplast genome and complete genomic sequences of important lignin biosynthesis genes. Conclusions: The two E. grandis BAC libraries described in this study represent an important milestone for the advancement of Eucalyptus genomics and forest tree research. These BAC resources have a highly redundant genome coverage (> 15×), contain large average inserts and have a very low percentage of clones with organellar DNA or empty vectors. These publicly available BAC libraries are thus suitable for a broad range of applications in genetic and genomic research in Eucalyptus and possibly in related species of Myrtaceae.

  7. Hidden ribozymes in eukaryotic genome sequence

    OpenAIRE

    Sean P Ryder

    2010-01-01

    The small self-cleaving ribozymes fold into complex tertiary structures to promote autocatalytic cleavage or ligation at a precise position within their sequence. Until recently, relatively few examples had been identified. Two papers now reveal that self-cleaving ribozymes are prevalent in eukaryotic genomes and, in some cases, might play a role in regulating gene expression.

  8. Whole genome sequences of four Brucella strains.

    Science.gov (United States)

    Ding, Jiabo; Pan, Yuanlong; Jiang, Hai; Cheng, Junsheng; Liu, Taotao; Qin, Nan; Yang, Yi; Cui, Buyun; Chen, Chen; Liu, Cuihua; Mao, Kairong; Zhu, Baoli

    2011-07-01

    Brucella melitensis and Brucella suis are intracellular pathogens of livestock and humans. Here we report four genome sequences, those of the virulent strain B. melitensis M28-12 and vaccine strains B. melitensis M5 and M111 and B. suis S2, which show different virulences and pathogenicities, which will help to design a more effective brucellosis vaccine. PMID:21602346

  9. Genome Sequence of Lactobacillus amylovorus GRL1112

    OpenAIRE

    Kant, R.; Paulin, L.; Alatalo, E.; DE VOS W.M.; Palva, A.

    2010-01-01

    Lactobacillus amylovorus is a common member of the normal gastrointestinal tract (GIT) microbiota in pigs. Here, we report the genome sequence of L. amylovorus GRL1112, a porcine feces isolate displaying strong adherence to the pig intestinal epithelial cells. The strain is of interest, as it is a potential probiotic bacterium.

  10. Genome sequence of Lactobacillus amylovorus GRL1112.

    Science.gov (United States)

    Kant, Ravi; Paulin, Lars; Alatalo, Edward; de Vos, Willem M; Palva, Airi

    2011-02-01

    Lactobacillus amylovorus is a common member of the normal gastrointestinal tract (GIT) microbiota in pigs. Here, we report the genome sequence of L. amylovorus GRL1112, a porcine feces isolate displaying strong adherence to the pig intestinal epithelial cells. The strain is of interest, as it is a potential probiotic bacterium. PMID:21131492

  11. Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes

    Science.gov (United States)

    Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these s...

  12. Multiplexed shotgun sequencing reveals congruent three-genome phylogenetic signals for four botanical sections of the flax genus Linum.

    Science.gov (United States)

    Fu, Yong-Bi; Dong, Yibo; Yang, Mo-Hua

    2016-08-01

    A genome-wide detection of phylogenetic signals by next generation sequencing (NGS) has recently emerged as a promising genomic approach for phylogenetic analysis of non-model organisms. Here we explored the use of a multiplexed shotgun sequencing method to assess the phylogenetic relationships of 18 Linum samples representing 16 species within four botanical sections of the flax genus Linum. The whole genome DNAs of 18 Linum samples were fragmented, tagged, and sequenced using an Illumina MiSeq. Acquired sequencing reads per sample were further separated into chloroplast, mitochondrial and nuclear sequence reads. SNP calls upon genome-specific sequence data sets revealed 6143 chloroplast, 2673 mitochondrial, and 19,562 nuclear SNPs. Phylogenetic analyses based on three-genome SNP data sets with and without missing observations showed congruent three-genome phylogenetic signals for four botanical sections of the Linum genus. Specifically, two major lineages showing a separation of Linum-Dasylinum sections and Linastrum-Syllinum sections were confirmed. The Linum section displayed three major branches representing two major evolutionary stages leading to cultivated flax. Cultivated flax and its immediate progenitor were formed as its own branch, genetically more closely related to L. decumbens and L. grandiflorum with chromosome count of eight, and distantly apart from six other species with chromosome count of nine. Five species of the Linastrum and Syllinum sections were genetically more distant from cultivated flax, but they appeared to be more closely related to each other, even with variable chromosome counts. These findings not only provide the first evidence of congruent three-genome phylogenetic pathways within the Linum genus, but also demonstrate the utility of the multiplexed shotgun sequencing in acquisition of three-genome phylogenetic signals of non-model organisms. PMID:27165939

  13. The Theory and Practice of Genome Sequence Assembly.

    Science.gov (United States)

    Simpson, Jared T; Pop, Mihai

    2015-01-01

    The current genomic revolution was made possible by joint advances in genome sequencing technologies and computational approaches for analyzing sequence data. The close interaction between biologists and computational scientists is perhaps most apparent in the development of approaches for sequencing entire genomes, a feat that would not be possible without sophisticated computational tools called genome assemblers (short for genome sequence assemblers). Here, we survey the key developments in algorithms for assembling genome sequences since the development of the first DNA sequencing methods more than 35 years ago. PMID:25939056

  14. Complexity of rice Hsp100 gene family: lessons from rice genome sequence data

    Indian Academy of Sciences (India)

    Gaurav Batra; Vineeta Singh Chauhan; Amanjot Singh; Neelam K Sarkar; Anil Grover

    2007-04-01

    Elucidation of genome sequence provides an excellent platform to understand the detailed complexity of the various gene families. Hsp100 is an important family of chaperones in diverse living systems. There are eight putative gene loci encoding Hsp100 proteins in the Arabidopsis genome. In rice, two full-length Hsp100 cDNAs have been isolated and sequenced so far. Analysis of rice genomic sequence by an in silico approach showed that the two isolated rice Hsp100 cDNAs correspond to the Os05g44340 and Os02g32520 genes in the rice genome database. There appear to be three additional proteins (encoded by the Os03g31300, Os04g32560 and Os04g33210 gene loci) that are variably homologous to Os05g44340 and Os02g32520 throughout the entire amino acid sequence. The above five rice Hsp100 genes show significant similarities in the signature sequences known to be conserved among Hsp100 proteins. While Os05g44340 encodes a cytoplasmic Hsp100 protein, those encoded by the other four genes are predicted to have chloroplast transit peptides.

  15. Whole Genome Sequencing: Cracking the Genetic Code for Foodborne Illness

    Science.gov (United States)

    ... Bacteria that cause disease have millions of different genomes, or sequences of genetic code, each as unique ...

  16. The first symbiont-free genome sequence of marine red alga, Susabi-nori (Pyropia yezoensis).

    Directory of Open Access Journals (Sweden)

    Yoji Nakamura

    Nori, a marine red alga, is one of the most profitable mariculture crops in the world. However, the biological properties of this macroalga are poorly understood at the molecular level. In this study, we determined the draft genome sequence of susabi-nori (Pyropia yezoensis) using next-generation sequencing platforms. For sequencing, thalli of P. yezoensis were washed to remove bacteria attached on the cell surface and enzymatically prepared as purified protoplasts. The assembled contig size of the P. yezoensis nuclear genome was approximately 43 megabases (Mb), which is an order of magnitude smaller than the previously estimated genome size. A total of 10,327 gene models were predicted and about 60% of the genes validated lack introns and the other genes have shorter introns compared to large-genome algae, which is consistent with the compact size of the P. yezoensis genome. A sequence homology search showed that 3,611 genes (35%) are functionally unknown and only 2,069 gene groups are in common with those of the unicellular red alga, Cyanidioschyzon merolae. As color trait determinants of red algae, light-harvesting genes involved in the phycobilisome were predicted from the P. yezoensis nuclear genome. In particular, we found a second homolog of phycobilisome-degradation gene, which is usually chloroplast-encoded, possibly providing a novel target for color fading of susabi-nori in aquaculture. These findings shed light on unexplained features of macroalgal genes and genomes, and suggest that the genome of P. yezoensis is a promising model genome of marine red algae.

  17. Sequence analysis and editing for bisulphite genomic sequencing projects

    OpenAIRE

    Carr, IM; Valleley, EMA; Cordery, SF; Markham, AF; Bonthron, DT

    2007-01-01

    Bisulphite genomic sequencing is a widely used technique for detailed analysis of the methylation status of a region of DNA. It relies upon the selective deamination of unmethylated cytosine to uracil after treatment with sodium bisulphite, usually followed by PCR amplification of the chosen target region. Since this two-step procedure replaces all unmethylated cytosine bases with thymine, PCR products derived from unmethylated templates contain only three types of nucleotide, in unequal prop...
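
    The conversion described above can be mimicked in silico with a short Python sketch: every unmethylated cytosine reads as thymine after bisulphite treatment and PCR, while methylated cytosines stay as C. The function name and the `methylated_positions` argument are illustrative assumptions and are not taken from the authors' software.

```python
from typing import Iterable


def bisulphite_convert(seq: str, methylated_positions: Iterable[int] = ()) -> str:
    """Simulate bisulphite treatment plus PCR on one DNA strand.

    Unmethylated C is read as T; C at any 0-based index listed in
    `methylated_positions` is protected and remains C.
    """
    protected = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in protected:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


if __name__ == "__main__":
    template = "ACGTCC"  # hypothetical template, not real data
    print(bisulphite_convert(template))        # fully unmethylated template
    print(bisulphite_convert(template, {1}))   # C at index 1 is methylated
```

    Comparing the converted read with the original template position by position then shows which cytosines were protected, which is the basis for calling methylation status.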

  18. Genetically Unstable Mutants as Novel Sources of Genetic Variability: The Chloroplast Mutator Genotype in Barley as a Tool for Exploring the Plastid Genome

    International Nuclear Information System (INIS)

    The presence of clonally variegated seedlings was used as a criterion to isolate putative genetically unstable mutants (GUMs) from the M2 or further generations arising from X-rays and/or chemical treatments applied to barley seeds. Analysis of seedlings in the glasshouse revealed that in some of the families isolated, a particular spectrum of mutant phenotypes was repeatedly observed over several generations of self-pollination. By reciprocal crosses it was noticed that some of these GUMs produced maternally-inherited changes and they were classified in two groups manifesting either a narrow or a wide spectrum of mutant phenotypes. One case of the latter, designated as a 'chloroplast mutator' genotype, has been studied in our Institute since 1985. In several mutants obtained from this GUM, evidence of major plastid-DNA changes was not detected, but interestingly, sequencing of some plastid genes showed that single nucleotide mutations were present. Mutational changes, consisting of transitions T/A - C/G which were located at three different positions in the plastid gene infA, were detected in three independently-originated albo-viridis mutants. Additionally, one transition and one base insertion on the ycf3 locus were observed in a temperature-sensitive viridis type and one transition on the plastid gene psbA was observed in families selected for atrazine tolerance. Both the wide spectrum of mutants and the subtle DNA changes induced in this barley chloroplast mutator genotype suggest that it can be an exceptionally valuable tool to explore the potential functionality of the otherwise highly conserved plastid genome. (author)

  19. Genetically unstable mutants as novel sources of genetic variability: The chloroplast mutator genotype in barley as a tool for exploring the plastid genome

    International Nuclear Information System (INIS)

    The presence of clonally variegated seedlings was used as a criterion to isolate putative genetically unstable mutants (GUMs) from M2 or further generations coming from X-rays and/or chemical treatments applied to barley seeds. Seedling analysis in the greenhouse revealed that in some of those isolated families a particular spectrum of mutant phenotypes was repeatedly observed during several generations of self-pollination. By reciprocal crosses it was noticed that some of those GUMs produced maternally-inherited changes and, according to the width of the spectrum induced by them, they were classified in two groups, inducing either a narrow or a wide spectrum of mutant phenotypes. One case of the latter, designated as 'chloroplast mutator genotype', has been studied in our Institute since 1985. In several mutants obtained from this GUM, evidence of major plastid-DNA changes was not detected but, interestingly, sequencing of some plastid genes showed that single nucleotide mutations were induced. Three different transitions on the plastid gene infA were detected in three independently originated albo-viridis mutants. One transition and one base insertion on the ycf3 locus were observed in a temperature-sensitive viridis type. In addition, one transition on the plastid gene psbA was observed in families selected for atrazine tolerance. Both the wide spectrum of mutants and the subtle DNA changes induced by this barley chloroplast mutator genotype suggest that it can be an exceptionally valuable tool to explore the potential functionality of the otherwise highly conserved plastid genome. (author)

  20. Phylogeny of the genus Pistacia as determined from analysis of the chloroplast genome.

    Science.gov (United States)

    Parfitt, D E; Badenes, M L

    1997-07-22

    Classification within the genus Pistacia has been based on leaf morphology and geographical distribution. Molecular genetic tools (PCR amplification followed by restriction analysis of a 3.2-kb region of variable chloroplast DNA, and restriction fragment length polymorphism analysis of the Pistacia cpDNA with tobacco chloroplast DNA probes) provided a new set of variables to study the phylogenetic relationships of 10 Pistacia species. Both parsimony and cluster analyses were used to divide the genus into two major groups. P. vera was determined to be the least derived species. P. weinmannifolia, an Asian species, is most closely related to P. texana and P. mexicana, New World species. These three species share a common origin, suggesting that a common ancestor of P. texana and P. mexicana originated in Asia. P. integerrima and P. chinensis were shown to be distinct whereas the pairs of species were monophyletic within each of two tertiary groups, P. vera:P. khinjuk and P. mexicana:P. texana. An evolutionary trend from large to small nuts and leaves with few, large leaflets to many, small leaflets was supported. The genus Pistacia was shown to have a low chloroplast DNA mutation rate: 0.05-0.16 times that expected of annual plants. PMID:9223300

  1. Sequence motif discovery with computational genome-wide analysis

    OpenAIRE

    Akashi, Hirofumi; Aoki, Fumio; Toyota, Minoru; Maruyama, Reo; Sasaki, Yasushi; Mita, Hiroaki; Tokura, Hajime; Imai, Kohzoh; Tatsumi, Haruyuki

    2006-01-01

    As a result of the human genome project and advancements in DNA sequencing technology, we can utilize a huge amount of nucleotide sequence data and can search DNA sequence motifs in whole human genome. However, searching motifs with the naked eye is an enormous task and searching throughout the whole genome is absolutely impossible. Therefore, we have developed a computational genome-wide analyzing system for detecting DNA sequence motifs with biological significance. We used a multi-parallel...

  2. What Will We Do with a Cotton Genome Sequence?

    Institute of Scientific and Technical Information of China (English)

    BRUBAKER Curt

    2008-01-01

    With the publication of "Toward Sequencing Cotton (Gossypium) Genomes" [Chen et al. Plant Physiology, 2007, 145:1303-1310] a clear consensus emerged from the cotton genomics community not only that cotton genome sequences were a critical resource for research and commercial innovation in cotton genomics, but that there was a logical means of achieving this goal.

  3. The predictive capacity of personal genome sequencing.

    Science.gov (United States)

    Roberts, Nicholas J; Vogelstein, Joshua T; Parmigiani, Giovanni; Kinzler, Kenneth W; Vogelstein, Bert; Velculescu, Victor E

    2012-05-01

    New DNA sequencing methods will soon make it possible to identify all germline variants in any individual at a reasonable cost. However, the ability of whole-genome sequencing to predict predisposition to common diseases in the general population is unknown. To estimate this predictive capacity, we use the concept of a "genometype." A specific genometype represents the genomes in the population conferring a specific level of genetic risk for a specified disease. Using this concept, we estimated the maximum capacity of whole-genome sequencing to identify individuals at clinically significant risk for 24 different diseases. Our estimates were derived from the analysis of large numbers of monozygotic twin pairs; twins of a pair share the same genometype and therefore identical genetic risk factors. Our analyses indicate that (i) for 23 of the 24 diseases, most of the individuals will receive negative test results; (ii) these negative test results will, in general, not be very informative, because the risk of developing 19 of the 24 diseases in those who test negative will still be, at minimum, 50 to 80% of that in the general population; and (iii) on the positive side, in the best-case scenario, more than 90% of tested individuals might be alerted to a clinically significant predisposition to at least one disease. These results have important implications for the valuation of genetic testing by industry, health insurance companies, public policy-makers, and consumers. PMID:22472521

  4. Enhanced Dynamic Algorithm of Genome Sequence Alignments

    Directory of Open Access Journals (Sweden)

    Arabi E. keshk

    2014-05-01

    The merging of biology and computer science has created a new field, computational biology (bioinformatics), which explores the capacity of computers to gain knowledge from biological data. Computational biology is rooted in the life sciences as well as in computer and information sciences and technologies. A central problem in computational biology is sequence alignment, a way of arranging DNA, RNA or protein sequences to identify regions of similarity and relationships between sequences. This paper introduces an enhanced dynamic algorithm for genome sequence alignment, called EDAGSA. It fills only the three main diagonals rather than filling the entire matrix with unused data. It obtains the optimal solution while decreasing the execution time, and therefore performance is increased. To illustrate the effectiveness of the proposed algorithm, it is compared with traditional methods such as the Needleman-Wunsch, Smith-Waterman and longest common subsequence algorithms. In addition, a database is implemented so that the algorithm can be used in multi-sequence alignment to search for the optimal sequence that matches a given sequence.
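
    For orientation, the classical Needleman-Wunsch global alignment that the abstract names as a baseline can be sketched in Python as below. This is the standard full-matrix dynamic programme, kept to two rows of memory, and not the EDAGSA variant, whose details are not given in this record. The scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Return the optimal global alignment score of sequences a and b.

    Standard Needleman-Wunsch recurrence with a linear gap penalty;
    only the previous and current rows of the score matrix are stored.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("ACGT", "ACT"))
```

    The full recurrence visits every cell of the matrix, which is the cost that restricting the computation to a few diagonals, as the abstract describes, is meant to reduce.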

  5. Swine Genome Sequencing Consortium (SGSC): A Strategic Roadmap for Sequencing The Pig Genome

    Directory of Open Access Journals (Sweden)

    Kellye Eversole

    2006-04-01

    The Swine Genome Sequencing Consortium (SGSC) was formed in September 2003 by academic, government and industry representatives to provide international coordination for sequencing the pig genome. The SGSC’s mission is to advance biomedical research for animal production and health by the development of DNA-based tools and products resulting from the sequencing of the swine genome. During the past 2 years, the SGSC has met bi-annually to develop a strategic roadmap for creating the required scientific resources, to integrate existing physical maps, and to create a sequencing strategy that captured international participation and a broad funding base. During the past year, SGSC members have integrated their respective physical mapping data with the goal of creating a minimal tiling path (MTP) that will be used as the sequencing template. During the recent Plant and Animal Genome meeting (January 16, 2005, San Diego, CA), presentations demonstrated that a human–pig comparative map has been completed, that BAC fingerprint contigs (FPC) for each of the autosomes and the X chromosome have been constructed, and that BAC end-sequencing has permitted, through BLAST analysis and RH-mapping, anchoring of the contigs. Thus, significant progress has been made towards the creation of an MTP. In addition, whole-genome (WG) shotgun libraries have been constructed and are currently being sequenced in various laboratories around the globe. Thus, a hybrid sequencing approach, in which 3x coverage of the BACs comprising the MTP and 3x coverage of the WG shotgun libraries are combined, will be used to develop a draft 6x coverage of the pig genome.

  6. Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™

    OpenAIRE

    Seo, Seung Bum; Zeng, Xiangpei; King, Jonathan L.; Larue, Bobby L; Assidi, Mourad; Al-Qahtani, Mohamed H; Sajantila, Antti; Budowle, Bruce

    2015-01-01

    Background: Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (L...

  7. Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™

    OpenAIRE

    Seo, Seung Bum; Zeng, Xiangpei; King, Jonathan L.; Larue, Bobby L; Assidi, Mourad; Al-Qahtani, Mohamed H; Sajantila, Antti; Budowle, Bruce

    2015-01-01

    Background: Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, S...

  8. Identifying driver mutations in sequenced cancer genomes

    DEFF Research Database (Denmark)

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla;

    2014-01-01

    ..., and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein sequence or structure. Finally, we review techniques to identify recurrent combinations of somatic mutations, including approaches that examine mutations in known pathways or protein-interaction networks, as well as de novo approaches that identify combinations of mutations according to...

  9. Why Assembling Plant Genome Sequences Is So Challenging

    Directory of Open Access Journals (Sweden)

    Pedro Seoane

    2012-09-01

    In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed.

  10. Primers for the Amplification of the Circular Chloroplast DNA from the A-genome Group of Cultivated Cotton

    Institute of Scientific and Technical Information of China (English)

    IBRAHIM Rashid Ismael Hag; AZUMA Jun-Ichi; SAKAMOTO Masahiro

    2008-01-01

    The availability of plastid genome sequences is one of the bases for comparative, functional, and structural genomic studies of plastid-containing organisms, in addition to the application of plastid genetic engineering technology. Past efforts to sequence plastid genomes involved complicated preparation protocols. One procedure starts with the isolation of plastids, which is tiresome and time-consuming, followed by a second step to extract plastid DNA from the isolated plastids, and finally the construction of a plasmid or bacterial artificial chromosome (BAC) library.

  11. Phylogeny of Ptychostomum (Bryaceae, Musci) inferred from sequences of nuclear ribosomal DNA internal transcribed spacer (ITS) and chloroplast rps4

    Institute of Scientific and Technical Information of China (English)

    Chen-Ying WANG; Jian-Cheng ZHAO

    2009-01-01

    The phylogeny of Ptychostomum was first undertaken based on analysis of the internal transcribed spacer (ITS) region of the nuclear ribosomal (nr) DNA and by combining data from nrDNA ITS and chloroplast DNA rps4 sequences. Maximum parsimony, maximum likelihood, and Bayesian analyses all support the conclusion that the reinstated genus Ptychostomum is not monophyletic. Ptychostomum funkii (Schwagr.) J. R. Spence (= Bryum funkii Schwagr.) is placed within a clade containing the type species of Bryum, B. argenteum Hedw. The remaining members of Ptychostomum investigated in the present study constitute another well-supported clade. The results are congruent with previous molecular analyses. On the basis of phylogenetic evidence, we agree with the transfer of Bryum lonchocaulon Mull. Hal., Bryum pallescens Schleich. ex Schwagr., and Bryum pallens Sw. to Ptychostomum.

  12. Simple sequence repeats in bryophyte mitochondrial genomes.

    Science.gov (United States)

    Zhao, Chao-Xian; Zhu, Rui-Liang; Liu, Yang

    2016-01-01

    Simple sequence repeats (SSRs) are thought to be common in plant mitochondrial (mt) genomes, but have yet to be fully described for bryophytes. We screened the mt genomes of two liverworts (Marchantia polymorpha and Pleurozia purpurea), two mosses (Physcomitrella patens and Anomodon rugelii) and two hornworts (Phaeoceros laevis and Nothoceros aenigmaticus), and detected 475 SSRs. Some SSRs are conserved during evolution; of these, one is found in both liverworts and mosses, while all others are shared only by the two liverworts, the two mosses or the two hornworts. SSRs are known as DNA tracts with high mutation rates; however, according to our observations, they can still evolve slowly. The conservation of these SSRs suggests that they are under strong selection and could play critical roles in maintaining gene function. PMID:24491104
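    Screening a genome for SSRs of the kind counted in this abstract can be sketched with a regular-expression scan for perfect tandem repeats. This is an illustrative sketch only: the function name `find_ssrs` and the minimum copy numbers per motif length are assumptions, not the criteria used by the authors, and imperfect or compound repeats are not handled.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    min_repeats maps motif length to the minimum number of copies;
    the defaults below are hypothetical thresholds chosen for illustration.
    """
    if min_repeats is None:
        min_repeats = {1: 8, 2: 5, 3: 4}
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # One motif of length k followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # For motif lengths 2 and 3, a motif made of a single base
            # (e.g. "AA") is really a mononucleotide run; skip it here.
            if k > 1 and len(set(motif)) == 1:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

    Positions are 0-based offsets into the input string. Motif lengths above three would need a more general check for motifs that are themselves repeats of a shorter unit.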

  13. Four cleaved amplified polymorphic sequence (CAPS) markers for the detection of the Juglans ailantifolia chloroplast in putatively native J. cinerea populations.

    Science.gov (United States)

    McCleary, Tim S; Robichaud, Rodney L; Nuanes, Steve; Anagnostakis, Sandra L; Schlarbaum, Scott E; Romero-Severson, Jeanne

    2009-03-01

    Hybridization between butternut (Juglans cinerea), a forest tree native to eastern North America, and Japanese walnut (J. ailantifolia), a tree tolerant to the lethal fungal disease butternut canker, casts doubt on the genetic identity of the remaining butternuts. We report a diagnostic test to distinguish the J. cinerea chloroplast from the J. ailantifolia chloroplast using cleaved amplified polymorphic sequences resolvable in 1.5% agarose gels. J. ailantifolia maternal ancestry in naturally regenerated stands provides a site selection criterion for studies of introgression dynamics when the non-native parent and the hybrids tolerate a disease to which the native species is susceptible. PMID:21564682

  14. Initial sequencing and comparative analysis of the mouse genome

    Energy Technology Data Exchange (ETDEWEB)

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  15. Genome sequence of Haemophilus parasuis strain 29755

    OpenAIRE

    Mullins, Michael A.; Register, Karen B.; Bayles, Darrell O; Dyer, David W.; Joanna S Kuehn; Phillips, Gregory J.

    2011-01-01

    Haemophilus parasuis is a member of the family Pasteurellaceae and is the etiologic agent of Glässer’s disease in pigs, a systemic syndrome associated with only a subset of isolates. The genetic basis for virulence and systemic spread of particular H. parasuis isolates is currently unknown. Strain 29755 is an invasive isolate that has long been used in the study of Glässer’s disease. Accordingly, the genome sequence of strain 29755 is of considerable importance to investigators endeavoring to...

  16. Building the sequence map of the human pan-genome

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Li, Yingrui; Zheng, Hancheng;

    2009-01-01

    Here we integrate the de novo assembly of an Asian and an African genome with the NCBI reference human genome, as a step toward constructing the human pan-genome. We identified approximately 5 Mb of novel sequences not present in the reference genome in each of these assemblies. Most novel sequences are individual or population specific, as revealed by their comparison to all available human DNA sequence and by PCR validation using the human genome diversity cell line panel. We found novel sequences present in patterns consistent with known human migration paths. Cross-species conservation...

  17. Insights from 20 years of bacterial genome sequencing

    DEFF Research Database (Denmark)

    Land, Miriam; Hauser, Loren; Jun, Se-Ran;

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in...

  18. Organization of a large gene cluster encoding ribosomal proteins in the cyanobacterium Synechococcus sp. strain PCC 6301: comparison of gene clusters among cyanobacteria, eubacteria and chloroplast genomes.

    Science.gov (United States)

    Sugita, M; Sugishita, H; Fujishiro, T; Tsuboi, M; Sugita, C; Endo, T; Sugiura, M

    1997-08-11

    The structure of a large gene cluster containing 22 ribosomal protein (r-protein) genes of the cyanobacterium Synechococcus sp. strain PCC6301 is presented. Based on DNA and protein sequence analyses, genes encoding r-proteins L3, L4, L23, L2, S19, L22, S3, L16, L29, S17, L14, L24, L5, S8, L6, L18, S5, L15, L36, S13, S11, L17, SecY, adenylate kinase (AK) and the alpha subunit of RNA polymerase were identified. The gene order is similar to that of the E. coli S10, spc and alpha operons. Unlike the corresponding E. coli operons, the genes for r-proteins S4, S10, S14 and L30 are not present in this cluster. The organization of Synechococcus r-protein genes also resembles that of chloroplast (cp) r-protein genes of red and brown algal species. This strongly supports the endosymbiotic theory that the cp genome evolved from an ancient photosynthetic bacterium. PMID:9300823

  19. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii

    Directory of Open Access Journals (Sweden)

    Maier Uwe G

    2007-08-01

    Background: The holoparasitic plant genus Cuscuta comprises species with photosynthetic capacity and functional chloroplasts as well as achlorophyllous and intermediate forms with restricted photosynthetic activity and degenerated chloroplasts. Previous data indicated significant differences with respect to the plastid genome coding capacity in different Cuscuta species that could correlate with their photosynthetic activity. In order to shed light on the molecular changes accompanying the parasitic lifestyle, we sequenced the plastid chromosomes of the two species Cuscuta reflexa and Cuscuta gronovii. Both species are capable of performing photosynthesis, albeit with varying efficiencies. Together with the plastid genome of Epifagus virginiana, an achlorophyllous parasitic plant whose plastid genome has been sequenced, these species represent a series of progression towards total dependency on the host plant, ranging from reduced levels of photosynthesis in C. reflexa to a restricted photosynthetic activity and degenerated chloroplasts in C. gronovii to an achlorophyllous state in E. virginiana. Results: The newly sequenced plastid genomes of C. reflexa and C. gronovii reveal that the chromosome structures are generally very similar to that of non-parasitic plants, although a number of species-specific insertions, deletions (indels) and sequence inversions were identified. However, we observed a gradual adaptation of the plastid genome to the different degrees of parasitism. The changes are particularly evident in C. gronovii and include (a) the parallel losses of genes for the subunits of the plastid-encoded RNA polymerase and the corresponding promoters from the plastid genome, (b) the first documented loss of the gene for a putative splicing factor, MatK, from the plastid genome and (c) a significant reduction of RNA editing. Conclusion: Overall, the comparative genomic analysis of plastid DNA from parasitic plants indicates a bias towards

  20. The complete plastid genome sequence of Welwitschia mirabilis: an unusually compact plastome with accelerated divergence rates

    Directory of Open Access Journals (Sweden)

    Boore Jeffrey L

    2008-05-01

    Background: Welwitschia mirabilis is the only extant member of the family Welwitschiaceae, one of three lineages of gnetophytes, an enigmatic group of gymnosperms variously allied with flowering plants or conifers. Limited sequence data and rapid divergence rates have precluded consensus on the evolutionary placement of gnetophytes based on molecular characters. Here we report on the first complete gnetophyte chloroplast genome sequence, from Welwitschia mirabilis, as well as analyses on divergence rates of protein-coding genes, comparisons of gene content and order, and phylogenetic implications. Results: The chloroplast genome of Welwitschia mirabilis [GenBank: EU342371] is comprised of 119,726 base pairs and exhibits large and small single copy regions and two copies of the large inverted repeat (IR). Only 101 unique gene species are encoded. The Welwitschia plastome is the most compact photosynthetic land plant plastome sequenced to date; 66% of the sequence codes for product. The genome also exhibits a slightly expanded IR, a minimum of 9 inversions that modify gene order, and 19 genes that are lost or present as pseudogenes. Phylogenetic analyses, including one representative of each extant seed plant lineage and based on 57 concatenated protein-coding sequences, place Welwitschia at the base of all seed plants (distance, maximum parsimony) or as the sister to Pinus (the only conifer representative) in a monophyletic gymnosperm clade (maximum likelihood, Bayesian). Relative rate tests on these gene sequences show the Welwitschia sequences to be evolving at faster rates than other seed plants. For these genes individually, a comparison of average pairwise distances indicates that relative divergence in Welwitschia ranges from amounts about equal to other seed plants to amounts almost three times greater than the average for non-gnetophyte seed plants. Conclusion: Although the basic organization of the Welwitschia plastome is typical, its

  1. Draft Genome Sequence of Alternaria alternata ATCC 34957.

    Science.gov (United States)

    Nguyen, Hai D T; Lewis, Christopher T; Lévesque, C André; Gräfenhan, Tom

    2016-01-01

    We report the draft genome sequence of Alternaria alternata ATCC 34957. This strain was previously reported to produce alternariol and alternariol monomethyl ether on weathered grain sorghum. The genome was sequenced with PacBio technology and assembled into 27 scaffolds with a total genome size of 33.5 Mb. PMID:26769939

  2. Draft Genome Sequence of Fungus Clonostachys rosea Strain YKD0085.

    Science.gov (United States)

    Liu, Shuai; Chang, Yaowen; Hu, Xujia; Gong, Xuanyun; Di, Yingtong; Dong, Jinyan; Hao, Xiaojiang

    2016-01-01

    Here, we report the draft genome sequence of Clonostachys rosea (strain YKD0085). The functional annotation of C. rosea provides important information related to its ability to produce secondary metabolites. The genome sequence presented here builds the basis for further genome mining. PMID:27340057

  3. Complete Genome Sequence of Staphylococcus aureus Siphovirus Phage JS01

    OpenAIRE

    Jia, Hongying; Bai, Qinqin; Yang, Yongchun; Yao, Huochun

    2013-01-01

    Staphylococcus aureus is the most prevalent and economically significant pathogen causing bovine mastitis. We isolated and characterized one staphylophage from the milk of mastitis-affected cattle and sequenced its genome. Transmission electron microscopy (TEM) observation shows that it belongs to the family Siphovirus. We announce here its complete genome sequence and report major findings from the genomic analysis.

  4. First Draft Genome Sequence of Staphylococcus condimenti F-2T

    Science.gov (United States)

    Zheng, Beiwen; Hu, Xinjun; Jiang, Xiawei; Li, Ang; Yao, Jian

    2016-01-01

    This report describes the draft genome sequence of S. condimenti strain F-2T (DSM 11674), a potential starter culture. The genome assembly comprised 2,616,174 bp with 34.6% GC content. To the best of our knowledge, this is the first documentation that reports the whole-genome sequence of S. condimenti. PMID:27257207

  5. Draft Genome Sequence of Streptomyces hygroscopicus subsp. hygroscopicus NBRC 16556.

    Science.gov (United States)

    Komaki, Hisayuki; Ichikawa, Natsuko; Oguchi, Akio; Hamada, Moriyuki; Tamura, Tomohiko; Suzuki, Ken-Ichiro; Fujita, Nobuyuki

    2016-01-01

    Here, we report the draft genome sequence of strain NBRC 16556, deposited as Streptomyces hygroscopicus subsp. hygroscopicus into the NBRC culture collection. An average nucleotide identity analysis confirmed that the taxonomic identification is correct. The genome sequence will serve as a valuable reference for genome mining to search new secondary metabolites. PMID:27198007

  6. Whole-Genome Shotgun Sequencing of a Colonizing Multilocus Sequence Type 17 Streptococcus agalactiae Strain

    Science.gov (United States)

    Singh, Pallavi; Springman, A. Cody; Davies, H. Dele

    2012-01-01

    This report highlights the whole-genome shotgun draft sequence for a Streptococcus agalactiae strain representing multilocus sequence type (ST) 17, isolated from a colonized woman at 8 weeks postpartum. This sequence represents an important addition to the published genomes and will promote comparative genomic studies of S. agalactiae recovered from diverse sources. PMID:23045509

  7. Whole-Genome Shotgun Sequencing of a Colonizing Multilocus Sequence Type 17 Streptococcus agalactiae Strain

    OpenAIRE

    Singh, Pallavi; Springman, A. Cody; Davies, H Dele; Manning, Shannon D.

    2012-01-01

    This report highlights the whole-genome shotgun draft sequence for a Streptococcus agalactiae strain representing multilocus sequence type (ST) 17, isolated from a colonized woman at 8 weeks postpartum. This sequence represents an important addition to the published genomes and will promote comparative genomic studies of S. agalactiae recovered from diverse sources.

  8. Phylogeny and divergence of Chinese Angiopteridaceae based on chloroplast DNA sequence data (rbcL and trnL-F)

    Institute of Scientific and Technical Information of China (English)

    LI ChunXiang; LU ShuGang

    2007-01-01

    Marattioid ferns are an ancient lineage of primitive vascular plants that first appeared in the middle Carboniferous. Extant members are almost exclusively restricted to tropical regions, and the species-rich family Angiopteridaceae are limited in their distribution to the eastern hemisphere; relationships within the group are currently vague. Here the phylogenetic relationship between Angiopteris Hoffm. and Archangiopteris Christ et Gies. was evaluated based on the sequence analysis of chloroplast rbcL gene and trnL-F intergenic spacer with MEGA2 and MrBayes v3.0b4. On the basis of the phylogenetic pattern and fossil record, we further estimated the divergence time for the two genera. The phylogenetic trees revealed that all species of Angiopteris and Archangiopteris in this study formed a monophyletic group with strong statistical support, but the relationship between the two genera remained unresolved based on individual sequence analysis. On the other hand, the sequence analyses of combined data set revealed that Archangiopteris species diverged first, indicating that Archangiopteris may not be a direct derivative as traditionally assumed. The clade of Angiopteris and Archangiopteris appears to have diversified in the late Oligocene (≈26 Ma) based on the molecular estimate. Thus, the evolutionary history of extant Angiopteris and Archangiopteris has been characterized by ancient origin and recent diversification, and these groups are not relic and endangered lineages as traditionally considered.

  9. Genome Sequence of Stachybotrys chartarum Strain 51-11

    OpenAIRE

    Betancourt, Doris A; Dean, Timothy R.; Kim, Jean; Levy, Josh

    2015-01-01

    The Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina HiSeq 2000 and PacBio technologies. Since S. chartarum has been implicated as having health impacts within water-damaged buildings, any information extracted from the genomic sequence data relating to toxins or the metabolism of the fungus might be useful.

  10. First Complete Genome Sequence of Cherry virus A.

    Science.gov (United States)

    Koinuma, Hiroaki; Nijo, Takamichi; Iwabuchi, Nozomu; Yoshida, Tetsuya; Keima, Takuya; Okano, Yukari; Maejima, Kensaku; Yamaji, Yasuyuki; Namba, Shigetou

    2016-01-01

    The 5'-terminal genomic sequence of Cherry virus A (CVA) has long been unknown. We determined the first complete genome sequence of an apricot isolate of CVA (7,434 nucleotides [nt]). The 5'-untranslated region was 107 nt in length, which was 53 nt longer than those of known CVA sequences. PMID:27284130

  11. Complete Genome Sequence of Rift Valley Fever Virus Strain Lunyo

    OpenAIRE

    Lumley, Sarah; Horton, Daniel L.; Marston, Denise A.; Johnson, Nicholas; Ellis, Richard J.; Fooks, Anthony R.; Hewson, Roger

    2016-01-01

    Using next-generation sequencing technologies, the first complete genome sequence of Rift Valley fever virus strain Lunyo is reported here. Originally reported as an attenuated antigenic variant strain from Uganda, genomic sequence analysis shows that Lunyo clusters together with other Ugandan isolates.

  12. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis

    DEFF Research Database (Denmark)

    Carlton, Jane M.; Hirt, Robert P.; Silva, Joana C.;

    2007-01-01

    We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the approximately 160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion...

  13. Draft Genome Sequence of Brevibacterium massiliense Strain 541308T

    OpenAIRE

    Roux, Véronique; Robert, Catherine; Gimenez, Grégory; Raoult, Didier

    2012-01-01

    A draft genome sequence of Brevibacterium massiliense, an aerobic bacterium isolated from a human ankle discharge, is described here. CRISPR-associated proteins were found to be encoded in the genome, and analysis of transport proteins was performed.

  14. First complete genome sequence of infectious laryngotracheitis virus

    Directory of Open Access Journals (Sweden)

    Ficorilli Nino P

    2011-04-01

    Background: Infectious laryngotracheitis virus (ILTV) is an alphaherpesvirus that causes acute respiratory disease in chickens worldwide. To date, only one complete genomic sequence of ILTV has been reported. This sequence was generated by concatenating partial sequences from six different ILTV strains. Thus, the full genomic sequence of a single (individual) strain of ILTV has not been determined previously. This study aimed to use high throughput sequencing technology to determine the complete genomic sequence of a live attenuated vaccine strain of ILTV. Results: The complete genomic sequence of the Serva vaccine strain of ILTV was determined, annotated and compared to the concatenated ILTV reference sequence. The genome size of the Serva strain was 152,628 bp, with a G + C content of 48%. A total of 80 predicted open reading frames were identified. The Serva strain had 96.5% DNA sequence identity with the concatenated ILTV sequence. Notably, the concatenated ILTV sequence was found to lack four large regions of sequence, including 528 bp and 594 bp of sequence in the UL29 and UL36 genes, respectively, and two copies of a 1,563 bp sequence in the repeat regions. Considerable differences in the size of the predicted translation products of 4 other genes (UL54, UL30, UL37 and UL38) were also identified. More than 530 single-nucleotide polymorphisms (SNPs) were identified. Most SNPs were located within three genomic regions, corresponding to sequence from the SA-2 ILTV vaccine strain in the concatenated ILTV sequence. Conclusions: This is the first complete genomic sequence of an individual ILTV strain. This sequence will facilitate future comparative genomic studies of ILTV by providing an appropriate reference sequence for the sequence analysis of other ILTV strains.

  15. First complete genome sequence of infectious laryngotracheitis virus

    OpenAIRE

    Ficorilli Nino P; Browning Glenn F; Petermann Ivonne; Noormohammadi Amir H; Markham John F; Markham Philip F; Lee Sang-Won; Hartley Carol A; Devlin Joanne M

    2011-01-01

    Background: Infectious laryngotracheitis virus (ILTV) is an alphaherpesvirus that causes acute respiratory disease in chickens worldwide. To date, only one complete genomic sequence of ILTV has been reported. This sequence was generated by concatenating partial sequences from six different ILTV strains. Thus, the full genomic sequence of a single (individual) strain of ILTV has not been determined previously. This study aimed to use high throughput sequencing technology to determine t...

  16. Development of chloroplast simple sequence repeats (cpSSRs) for the intraspecific study of Gracilaria tenuistipitata (Gracilariales, Rhodophyta) from different populations

    OpenAIRE

    Song, Sze-Looi; Lim, Phaik-Eem; Phang, Siew-Moi; Lee, Weng-Wah; Hong, Dang Diem; Prathep, Anchana

    2014-01-01

    Background: Gracilaria tenuistipitata is an agarophyte with substantial economic potential because of its high growth rate and tolerance to a wide range of environmental factors. This red seaweed is intensively cultured in China for the production of agar and fodder for abalone. Microsatellite markers were developed from the chloroplast genome of G. tenuistipitata var. liui to differentiate G. tenuistipitata obtained from six different localities: four from Peninsular Malaysia, one from Thailand...

  17. Coevolution between simple sequence repeats (SSRs) and virus genome size

    Directory of Open Access Journals (Sweden)

    Zhao Xiangyan

    2012-08-01

    Background: The relationship between the level of repetitiveness in genomic sequence and genome size has been investigated using complete prokaryotic and eukaryotic genomes, but relevant studies have rarely been made in virus genomes. Results: In this study, a total of 257 viruses were examined, which cover 90% of genera. The results showed that simple sequence repeats (SSRs) are strongly, positively and significantly correlated with genome size. Certain repeat classes are distributed within certain ranges of genome sequence length. Mono-, di- and tri- repeats are widely distributed in all virus genomes; tetra- SSRs are a common component of genomes that are more than 100 kb in size; in the range of genome ... Conclusions: We conducted this research at the level of viruses as a whole. We concluded that genome size is an important factor affecting the occurrence of SSRs; hosts are also responsible for the variance in SSR content to a certain degree.
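
    The repeat classes named in this abstract (mono-, di-, tri- and tetra-nucleotide SSRs) can be located with a simple pattern search. The following is a minimal sketch and not the method used in the study: the function name `find_ssrs` and the minimum repeat counts are hypothetical choices made for illustration, and only perfect (uninterrupted) repeats are reported.

```python
import re

# Hypothetical minimum repeat counts per motif length (not taken from the study).
DEFAULT_MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3}

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for unit_len, min_n in sorted(min_repeats.items()):
        # One copy of the motif followed by at least (min_n - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves built from a shorter unit,
            # e.g. "AA" (mono-repeat) or "ATAT" (di-repeat).
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits

# Toy sequence assembled from known pieces so the repeats are easy to see.
toy = "GG" + "A" * 7 + "CG" + "AT" * 4 + "TT" + "CAG" * 3 + "T"
for start, motif, count in find_ssrs(toy):
    print(start, motif, count)
```

    Counting the hits per genome and comparing the totals against genome length would be one way to examine the kind of correlation the abstract describes; the thresholds strongly affect the counts and should be chosen to suit the data.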

  18. Chloroplast protein targeting involves localized translation in Chlamydomonas

    OpenAIRE

    Uniacke, James; Zerges, William

    2009-01-01

    The compartmentalization of eukaryotic cells requires that newly synthesized proteins be targeted to the compartments in which they function. In chloroplasts, a few thousand proteins function in photosynthesis, expression of the chloroplast genome, and other processes. Most chloroplast proteins are synthesized in the cytoplasm, imported, and then targeted to a specific chloroplast compartment. The remainder are encoded by the chloroplast genome, synthesized within the organelle, and targeted ...

  19. Whole Genome Sequencing: Innovation Dream or Privacy Nightmare?

    OpenAIRE

    De Cristofaro, Emiliano

    2012-01-01

    Over the past several years, DNA sequencing has emerged as one of the driving forces in life-sciences, paving the way for affordable and accurate whole genome sequencing. As genomes represent the entirety of an organism's hereditary information, the availability of complete human genomes prompts a wide range of revolutionary applications. The hope for improving modern healthcare and better understanding the human genome propels many interesting and challenging research frontiers. Unfortunatel...

  20. Draft Genome Sequences of Klebsiella variicola Plant Isolates

    OpenAIRE

    Martínez-Romero, Esperanza; Silva-Sanchez, Jesús; Barrios, Humberto; Rodríguez-Medina, Nadia; Martínez-Barnetche, Jesús; Téllez-Sosa, Juan; Gómez-Barreto, Rosa Elena; Garza-Ramos, Ulises

    2015-01-01

    Three endophytic Klebsiella variicola isolates—T29A, 3, and 6A2, obtained from sugar cane stem, maize shoots, and banana leaves, respectively—were used for whole-genome sequencing. Here, we report the draft genome sequences of circular chromosomes and plasmids. The genomes contain plant colonization and cellulases genes. This study will help toward understanding the genomic basis of K. variicola interaction with plant hosts.

  1. An Improved Protocol for Intact Chloroplasts and cpDNA Isolation in Conifers

    OpenAIRE

    Vieira, Leila do Nascimento; Faoro, Helisson; Fraga, Hugo Pacheco de Freitas; Rogalski, Marcelo; de Souza, Emanuel Maltempi; de Oliveira Pedrosa, Fábio; Nodari, Rubens Onofre; Guerra, Miguel Pedro

    2014-01-01

    Background: Performing chloroplast DNA (cpDNA) isolation is considered a major challenge among different plant groups, especially conifers. Isolating chloroplasts in conifers by such conventional methods as sucrose gradient and high salt has not been successful. So far, plastid genome sequencing protocols for conifer species have been based mainly on long-range PCR, which is known to be time-consuming and difficult to implement. Methodology/Principal Findings: We developed a protocol for cpDNA ...

  2. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database.

    Science.gov (United States)

    Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla; Demeter, Janos; Engel, Stacia; Hellerstedt, Sage T; Karra, Kalpana; Hitz, Benjamin C; Nash, Robert S; Paskov, Kelley; Sheppard, Travis; Skrzypek, Marek; Weng, Shuai; Wong, Edith; Michael Cherry, J

    2016-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains are now available on the Sequence and Protein tab of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences. Database URL: www.yeastgenome.org. PMID:27252399

  3. Next-generation sequencing and large genome assemblies

    OpenAIRE

    Henson, Joseph; Tischler, German; Ning, Zemin

    2012-01-01

    The next-generation sequencing (NGS) revolution has drastically reduced time and cost requirements for sequencing of large genomes, and also qualitatively changed the problem of assembly. This article reviews the state of the art in de novo genome assembly, paying particular attention to mammalian-sized genomes. The strengths and weaknesses of the main sequencing platforms are highlighted, leading to a discussion of assembly and the new challenges associated with NGS data. Current approaches ...

  4. Genome and Metagenome Sequencing: Using the Human Methyl-Binding Domain to Partition Genomic DNA Derived from Plant Tissues

    Directory of Open Access Journals (Sweden)

    Erbay Yigit

    2014-11-01

    Premise of the study: Variation in the distribution of methylated CpG (methyl-CpG) in genomic DNA (gDNA) across the tree of life is biologically interesting and useful in genomic studies. We illustrate the use of human methyl-CpG-binding domain (MBD2) to fractionate angiosperm DNA into eukaryotic nuclear (methyl-CpG-rich) vs. organellar and prokaryotic (methyl-CpG-poor) elements for genomic and metagenomic sequencing projects. Methods: MBD2 has been used to enrich prokaryotic DNA in animal systems. Using gDNA from five model angiosperm species, we apply a similar approach to identify whether MBD2 can fractionate plant gDNA into methyl-CpG-depleted vs. enriched methyl-CpG elements. For each sample, three gDNA libraries were sequenced: (1) untreated gDNA, (2) a methyl-CpG-depleted fraction, and (3) a methyl-CpG-enriched fraction. Results: Relative to untreated gDNA, the methyl-depleted libraries showed a 3.2–11.2-fold and 3.4–11.3-fold increase in chloroplast DNA (cpDNA) and mitochondrial DNA (mtDNA), respectively. Methyl-enriched fractions showed a 1.8–31.3-fold and 1.3–29.0-fold decrease in cpDNA and mtDNA, respectively. Discussion: The application of MBD2 enabled fractionation of plant gDNA. The effectiveness was particularly striking for monocot gDNA (Poaceae). When sufficiently effective on a sample, this approach can increase the cost efficiency of sequencing plant genomes as well as prokaryotes living in or on plant tissues.
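
    Fold changes of the kind reported in this abstract compare the share of reads assigned to a compartment (for example cpDNA) in a treated library against the untreated library. The sketch below shows that arithmetic under stated assumptions: the function name `fold_change` is hypothetical, the read counts are made up for illustration, and the study's actual read-assignment and normalization steps are not described in the abstract.

```python
def fold_change(treated_hits, treated_total, untreated_hits, untreated_total):
    """Fold change in the fraction of reads assigned to one compartment."""
    if min(treated_total, untreated_total) <= 0 or untreated_hits <= 0:
        raise ValueError("totals and untreated hits must be positive")
    treated_fraction = treated_hits / treated_total
    untreated_fraction = untreated_hits / untreated_total
    return treated_fraction / untreated_fraction

# Made-up read counts, for illustration only (not data from the study).
untreated = (50_000, 1_000_000)    # cpDNA reads, total reads
depleted = (200_000, 1_000_000)    # methyl-CpG-depleted library
enriched = (5_000, 1_000_000)      # methyl-CpG-enriched library

increase = fold_change(*depleted, *untreated)
# A value below 1 is usually reported as its reciprocal, a "fold decrease".
decrease = 1 / fold_change(*enriched, *untreated)
print(f"{increase:.1f}-fold increase, {decrease:.1f}-fold decrease")
```

    Working with fractions rather than raw counts keeps libraries of different sequencing depth comparable.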

  5. Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum

    OpenAIRE

    Grativol, Clícia; Regulski, Michael; Bertalan, Marcelo; McCombie, W Richard; da Silva, Felipe Rodrigues; Neto, Adhemar Zerlotini; Vicentini, Renato; Farinelli, Laurent; Hemerly, Adriana Silva; Martienssen, Robert A; Ferreira, Paulo Cavalcanti Gomes

    2014-01-01

    Many economically important crops have large and complex genomes, which hampers sequencing of their genome by standard methods such as WGS. Large tracts of methylated repeats occur at plant genomes interspersed by hypomethylated gene-rich regions. Gene enrichment strategies based on methylation profile offer an alternative to sequencing repetitive genomes. Here, we have applied methyl filtration (MF) with McrBC digestion to enrich for euchromatic regions of sugarcane genome. To verify the eff...

  6. Phylogeny of the sundews, Drosera (Droseraceae), based on chloroplast rbcL and nuclear 18S ribosomal DNA Sequences.

    Science.gov (United States)

    Rivadavia, Fernando; Kondo, Katsuhiko; Kato, Masahiro; Hasebe, Mitsuyasu

    2003-01-01

    The sundew genus Drosera consists of carnivorous plants with active flypaper traps and includes nearly 150 species distributed mainly in Australia, Africa, and South America, with some Northern Hemisphere species. In addition to confused intrageneric classification of Drosera, the intergeneric relationships among the Drosera and two other genera in the Droseraceae with snap traps, Dionaea and Aldrovanda, are problematic. We conducted phylogenetic analyses of DNA sequences of the chloroplast rbcL gene for 59 species of Drosera, covering all sections except one. These analyses revealed that five of 11 sections, including three monotypic sections, are polyphyletic. Combined rbcL and 18S rDNA sequence data were used to infer phylogenetic relationships among Drosera, Dionaea, and Aldrovanda. This analysis revealed that all Drosera species form a clade sister to a clade including Dionaea and Aldrovanda, suggesting that the snap traps of Aldrovanda and Dionaea are homologous despite their morphological differences. MacClade reconstructions indicated that multiple episodes of aneuploidy occurred in a clade that includes mainly Australian species, while the chromosome numbers in the other clades are not as variable. Drosera regia, which is native to South Africa, and most species native to Australia, were clustered basally, suggesting that Drosera originated in Africa or Australia. The rbcL tree indicates that Australian species expanded their distribution to South America and then to Africa. Expansion of distribution to the Northern Hemisphere from the Southern Hemisphere occurred in a few different lineages. PMID:21659087

  7. Endosymbiotic origin and codon bias of the nuclear gene for chloroplast glyceraldehyde-3-phosphate dehydrogenase from maize.

    Science.gov (United States)

    Brinkmann, H; Martinez, P; Quigley, F; Martin, W; Cerff, R

    1987-01-01

    The nuclei of plant cells harbor genes for two types of glyceraldehyde-3-phosphate dehydrogenases (GAPDH) displaying a sequence divergence corresponding to the prokaryote/eukaryote separation. This strongly supports the endosymbiotic theory of chloroplast evolution and in particular the gene transfer hypothesis suggesting that the gene for the chloroplast enzyme, initially located in the genome of the endosymbiotic chloroplast progenitor, was transferred during the course of evolution into the nuclear genome of the endosymbiotic host. Codon usage in the gene for chloroplast GAPDH of maize is radically different from that employed by present-day chloroplasts and from that of the cytosolic (glycolytic) enzyme from the same cell. This reveals the presence of subcellular selective pressures which appear to be involved in the optimization of gene expression in the economically important graminaceous monocots. PMID:3131533

  8. Circumscription and phylogeny of Apiaceae subfamily Saniculoideae based on chloroplast DNA sequences.

    Science.gov (United States)

    Calviño, Carolina I; Downie, Stephen R

    2007-07-01

    An estimate of phylogenetic relationships within Apiaceae subfamily Saniculoideae was inferred using data from the chloroplast DNA trnQ-trnK 5'-exon region to clarify the circumscription of the subfamily and to assess the monophyly of its constituent genera. Ninety-one accessions representing 14 genera and 82 species of Apiaceae were examined, including the genera Steganotaenia, Polemanniopsis, and Lichtensteinia which have been traditionally treated in subfamily Apioideae but determined in recent studies to be more closely related to or included within subfamily Saniculoideae. The trnQ-trnK 5'-exon region includes two intergenic spacers heretofore underutilized in molecular systematic studies and the rps16 intron. Analyses of these loci permitted an assessment of the relative utility of these noncoding regions (including the use of indel characters) for phylogenetic study at different hierarchical levels. The use of indels in phylogenetic analyses of both combined and partitioned data sets improves resolution of relationships, increases bootstrap support values, and decreases levels of overall homoplasy. Intergeneric relationships derived from maximum parsimony, Bayesian, and maximum likelihood analyses, as well as from maximum parsimony analysis of indel data alone, are fully resolved and consistent with one another and generally very well supported. We confirm the expansion of subfamily Saniculoideae to include Steganotaenia and Polemanniopsis (as the new tribe Steganotaenieae C.I. Calviño and S.R. Downie) but not Lichtensteinia. Sister group to tribe Steganotaenieae is tribe Saniculeae, redefined to include the genera Actinolema, Alepidea, Arctopus, Astrantia, Eryngium, Petagnaea, and Sanicula. With the synonymization of Hacquetia into Sanicula, all genera are monophyletic. Eryngium is divided into "Old World" and "New World" subclades and within Astrantia sections Astrantia and Astrantiella are monophyletic. PMID:17321762

  9. Insights from twenty years of bacterial genome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Jun, Se Ran [ORNL; Nookaew, Intawat [ORNL; Leuze, Michael Rex [ORNL; Ahn, Tae-Hyuk [ORNL; Karpinets, Tatiana V [ORNL; Lund, Ole [Technical University of Denmark; Kora, Guruprasad H [ORNL; Wassenaar, Trudy [Molecular Microbiology & Genomics Consultants, Zotzenheim, Germany; Poudel, Suresh [ORNL; Ussery, David W [ORNL

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90 % of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families.
Why do we care about bacterial genome
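
    The pan- and core-genome calculation mentioned in this abstract reduces to set operations once each genome has been summarized as a set of gene families. The sketch below assumes that clustering of genes into families has already been done by some other tool; the function name `pan_and_core`, the strain names and the family labels are all hypothetical.

```python
def pan_and_core(genomes):
    """Return (pan, core) gene-family sets for a mapping of genome -> families.

    pan  = families present in at least one genome (set union)
    core = families present in every genome (set intersection)
    """
    family_sets = [set(families) for families in genomes.values()]
    if not family_sets:
        return set(), set()
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core

# Hypothetical toy input: three strains described by gene-family labels.
toy_genomes = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam2"},
}
pan, core = pan_and_core(toy_genomes)
print(len(pan), "families in the pan-genome;", len(core), "in the core genome")
```

    As more genomes are added the union can only grow and the intersection can only shrink, which is consistent with the abstract's contrast between about 3100 core families and about 89,000 families in total for E. coli.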

  10. Genome Project Standards in a New Era of Sequencing

    Energy Technology Data Exchange (ETDEWEB)

    GSC Consortia; HMP Jumpstart Consortia; Chain, P. S. G.; Grafham, D. V.; Fulton, R. S.; FitzGerald, M. G.; Hostetler, J.; Muzny, D.; Detter, J. C.; Ali, J.; Birren, B.; Bruce, D. C.; Buhay, C.; Cole, J. R.; Ding, Y.; Dugan, S.; Field, D.; Garrity, G. M.; Gibbs, R.; Graves, T.; Han, C. S.; Harrison, S. H.; Highlander, S.; Hugenholtz, P.; Khouri, H. M.; Kodira, C. D.; Kolker, E.; Kyrpides, N. C.; Lang, D.; Lapidus, A.; Malfatti, S. A.; Markowitz, V.; Metha, T.; Nelson, K. E.; Parkhill, J.; Pitluck, S.; Qin, X.; Read, T. D.; Schmutz, J.; Sozhamannan, S.; Strausberg, R.; Sutton, G.; Thomson, N. R.; Tiedje, J. M.; Weinstock, G.; Wollam, A.

    2009-06-01

    For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft', however these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1), hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences. The following represents a set of six community-defined categories of genome sequence standards that better

  11. Molecular phylogeny of horsetails (Equisetum) including chloroplast atpB sequences.

    Science.gov (United States)

    Guillon, Jean-Michel

    2007-07-01

    Equisetum is a genus of 15 extant species that are the sole surviving representatives of the class Sphenopsida. The generally accepted taxonomy of Equisetum recognizes two subgenera: Equisetum and Hippochaete. Two recent phylogenetical studies have independently questioned the monophyly of subgenus Equisetum. Here, I use original (atpB) and published (rbcL, trnL-trnF, rps4) sequence data to investigate the phylogeny of the genus. Analyses of atpB sequences give an unusual topology, with E. bogotense branching within Hippochaete. A Bayesian analysis based on all available sequences yields a tree with increased resolution, favoring the sister relationships of E. bogotense with subgenus Hippochaete. PMID:17476459

  12. Genome Wide Characterization of Simple Sequence Repeats in Cucumber

    Science.gov (United States)

    The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...

  13. Why size really matters when sequencing plant genomes

    Czech Academy of Sciences Publication Activity Database

    Kelly, L.J.; Leitch, A.R.; Fay, M. F.; Renny-Byfield, S.; Pellicer, J.; Macas, Jiří; Leitch, I.J.

    2012-01-01

    Vol. 5, No. 4 (2012), pp. 415-425. ISSN 1755-0874 Institutional research plan: CEZ:AV0Z50510513 Institutional support: RVO:60077344 Keywords : C-value * genome assembly * genome size evolution * genome sequencing Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 0.924, year: 2012

  14. Genome Sequence of Mushroom Soft-Rot Pathogen Janthinobacterium agaricidamnosum.

    Science.gov (United States)

    Graupner, Katharina; Lackner, Gerald; Hertweck, Christian

    2015-01-01

    Janthinobacterium agaricidamnosum causes soft-rot disease of the cultured button mushroom Agaricus bisporus and is thus responsible for agricultural losses. Here, we present the genome sequence of J. agaricidamnosum DSM 9628. The 5.9-Mb genome harbors several secondary metabolite biosynthesis gene clusters, which renders this neglected bacterium a promising source for genome mining approaches. PMID:25883287

  15. Genome Sequence of Mushroom Soft-Rot Pathogen Janthinobacterium agaricidamnosum

    OpenAIRE

    Graupner, Katharina; Lackner, Gerald; Hertweck, Christian

    2015-01-01

    Janthinobacterium agaricidamnosum causes soft-rot disease of the cultured button mushroom Agaricus bisporus and is thus responsible for agricultural losses. Here, we present the genome sequence of J. agaricidamnosum DSM 9628. The 5.9-Mb genome harbors several secondary metabolite biosynthesis gene clusters, which renders this neglected bacterium a promising source for genome mining approaches.

  16. Nucleotide sequence and genome organization of carnation mottle virus RNA.

    OpenAIRE

    Guilley, H; Carrington, J C; Balàzs, E; Jonard, G; Richards, K; Morris, T J

    1985-01-01

    The complete nucleotide sequence of carnation mottle genomic RNA (4003 nucleotides) is presented. The sequence was determined for cloned cDNA copies of viral RNA containing over 99% of the sequence and was completed by direct sequence analysis of RNA and cDNA transcripts. The sequence contains two long open reading frames which together can account for observed translation products. One translation product would arise by suppression of an amber termination codon and the sequence raises the po...

  17. Sequence resources at the Candida Genome Database

    OpenAIRE

    Arnaud, Martha B.; Costanzo, Maria C.; Skrzypek, Marek S.; Shah, Prachi; Binkley, Gail; Lane, Christopher; Miyasato, Stuart R.; SHERLOCK, Gavin

    2006-01-01

    The Candida Genome Database (CGD, ) contains a curated collection of genomic information and community resources for researchers who are interested in the molecular biology of the opportunistic pathogen Candida albicans. With the recent release of a new assembly of the C. albicans genome, Assembly 20, C. albicans genomics has entered a new era. Although the C. albicans genome assembly continues to undergo refinement, multiple assemblies and gene nomenclatures will remain in widespread use by the...

  18. Complete Genome Sequence of the Human Gut Symbiont Roseburia hominis

    DEFF Research Database (Denmark)

    Travis, Anthony J.; Kelly, Denise; Flint, Harry J; Aminov, Rustam

    2015-01-01

    We report here the complete genome sequence of the human gut symbiont Roseburia hominis A2-183(T) (= DSM 16839(T) = NCIMB 14029(T)), isolated from human feces. The genome is represented by a 3,592,125-bp chromosome with 3,405 coding sequences. A number of potential functions contributing to host...

  19. Draft Genome Sequence of the Fish Pathogen Piscirickettsia salmonis

    OpenAIRE

    Eppinger, Mark; McNair, Katelyn; Zogaj, Xhavit; Dinsdale, Elizabeth A.; Edwards, Robert A.; Klose, Karl E.

    2013-01-01

    Piscirickettsia salmonis is a Gram-negative intracellular fish pathogen that has a significant impact on the salmon industry. Here, we report the genome sequence of P. salmonis strain LF-89. This is the first draft genome sequence of P. salmonis, and it reveals interesting attributes, including flagellar genes, despite this bacterium being considered nonmotile.

  20. Draft Genome Sequence of the Fish Pathogen Piscirickettsia salmonis.

    Science.gov (United States)

    Eppinger, Mark; McNair, Katelyn; Zogaj, Xhavit; Dinsdale, Elizabeth A; Edwards, Robert A; Klose, Karl E

    2013-01-01

    Piscirickettsia salmonis is a Gram-negative intracellular fish pathogen that has a significant impact on the salmon industry. Here, we report the genome sequence of P. salmonis strain LF-89. This is the first draft genome sequence of P. salmonis, and it reveals interesting attributes, including flagellar genes, despite this bacterium being considered nonmotile. PMID:24201203

  1. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.;

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...

  2. Complete genome sequence of ‘Candidatus Liberibacter africanus’

    Science.gov (United States)

    The complete genome sequence of ‘Candidatus Liberibacter africanus’ (Laf), strain ptsapsy, was obtained by an Illumina HiSeq 2000. The Laf genome comprises 1,192,232 nucleotides, 34.5% GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S and 5S) ...

  3. Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii

    OpenAIRE

    Siozios, Stefanos; Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

    2013-01-01

    Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named “wSuzi” that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii.

  4. Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii.

    Science.gov (United States)

    Siozios, Stefanos; Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

    2013-01-01

    Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named "wSuzi" that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

  5. Draft Genome Sequence of Klebsiella pneumoniae Isolate PR04

    OpenAIRE

    Zulkifli, M. H.; L. K. Teh; L. S. Lee; Z. A. Zakaria; Salleh, M. Z.

    2013-01-01

    Klebsiella pneumoniae PR04 was isolated from a patient hospitalized in Malaysia. The draft genome sequence of K. pneumoniae PR04 shows differences compared to the reference sequences of K. pneumoniae strains MGH 78578 and NTUH-K2044 in terms of their genomic structures.

  6. The carrot genome sequence brings colors out of the dark.

    Science.gov (United States)

    Garcia-Mas, Jordi; Rodriguez-Concepcion, Manuel

    2016-05-27

    The genome sequence of carrot (Daucus carota L.) is the first completed for an Apiaceae species, furthering knowledge of the evolution of the important euasterid II clade. Analyzing the whole-genome sequence allowed for the identification of a gene that may regulate the accumulation of carotenoids in the root. PMID:27230684

  7. Complete Genome Sequences of Five Paenibacillus larvae Bacteriophages.

    Science.gov (United States)

    Sheflo, Michael A; Gardner, Adam V; Merrill, Bryan D; Fisher, Joshua N B; Lunt, Bryce L; Breakwell, Donald P; Grose, Julianne H; Burnett, Sandra H

    2013-01-01

    Paenibacillus larvae is a pathogen of honeybees that causes American foulbrood (AFB). We isolated bacteriophages from soil containing bee debris collected near beehives in Utah. We announce five high-quality complete genome sequences, which represent the first completed genome sequences submitted to GenBank for any P. larvae bacteriophage. PMID:24233582

  8. Genome sequence of Kocuria palustris strain W4

    DEFF Research Database (Denmark)

    Herschend, Jakob; Raghupathi, Prem Krishnan; Røder, Henriette Lyng;

    2016-01-01

    We report the 3.09 Mb draft genome sequence of Kocuria palustris W4, isolated from a slaughterhouse in Denmark.

  9. Nearly Complete Genome Sequence of Lactobacillus plantarum Strain NIZO2877

    NARCIS (Netherlands)

    Martino, M.E.; Bayjanov, J.R.; Joncour, P.; Hughes, S.; Gillet, B.; Kleerebezem, M; Siezen, R.; Hijum, S.A.F.T. van; Leulier, F.

    2015-01-01

    Lactobacillus plantarum is a versatile bacterial species that is isolated mostly from foods. Here, we present the first genome sequence of L. plantarum strain NIZO2877 isolated from a hot dog in Vietnam. Its two contigs represent a nearly complete genome sequence.

  10. On the current status of Phakopsora pachyrhizi genome sequencing

    Directory of Open Access Journals (Sweden)

    Marco Loehrer

    2014-08-01

    Recent advances in the field of sequencing technologies and bioinformatics allow more rapid access to genomes of non-model organisms at falling costs. Accordingly, draft genomes of several economically important cereal rust fungi have been released in the last three years. Aside from the very recent flax rust and poplar rust draft assemblies there are no genomic data available for other dicot-infecting rust fungi. In this article we outline rust fungus sequencing efforts and comment on the current status of Phakopsora pachyrhizi (Asian soybean rust) genome sequencing.

  11. Comparison of intraspecific, interspecific and intergeneric chloroplast diversity in Cycads.

    Science.gov (United States)

    Jiang, Guo-Feng; Hinsinger, Damien Daniel; Strijk, Joeri Sergej

    2016-01-01

    Cycads are among the most threatened plant species. Increasing the availability of genomic information by adding whole chloroplast data is a fundamental step in supporting phylogenetic studies and conservation efforts. Here, we assemble a dataset encompassing three taxonomic levels in cycads, including ten genera, three species in the genus Cycas and two individuals of C. debaoensis. Repeated sequences, SSRs and variations of the chloroplast were analyzed at the intraspecific, interspecific and intergeneric scale, and using our sequence data, we reconstruct a phylogenomic tree for cycads. The chloroplast was 162,094 bp in length, with 133 genes annotated, including 87 protein-coding, 37 tRNA and 8 rRNA genes. We found 7 repeated sequences and 39 SSRs. Seven loci showed promising levels of variations for application in DNA-barcoding. The chloroplast phylogeny confirmed the division of Cycadales in two suborders, each of them being monophyletic, revealing a contradiction with the current family circumscription and its evolution. Finally, 10 intraspecific SNPs were found. Our results showed that despite the extremely restricted distribution range of C. debaoensis, using complete chloroplast data is useful not only in intraspecific studies, but also to improve our understanding of cycad evolution and in defining conservation strategies for this emblematic group. PMID:27558458

  12. Unexpected cross-species contamination in genome sequencing projects

    Directory of Open Access Journals (Sweden)

    Samier Merchant

    2014-11-01

    The raw data from a genome sequencing project sometimes contains DNA from contaminating organisms, which may be introduced during sample collection or sequence preparation. In some instances, these contaminants remain in the sequence even after assembly and deposition of the genome into public databases. As a result, searches of these databases may yield erroneous and confusing results. We used efficient microbiome analysis software to scan the draft assembly of domestic cow, Bos taurus, and identify 173 small contigs that appeared to derive from microbial contaminants. In the course of verifying these findings, we discovered that one genome, Neisseria gonorrhoeae TCDC-NG08107, although putatively a complete genome, contained multiple sequences that actually derived from the cow and sheep genomes. Our findings illustrate the need to carefully validate findings of anomalous DNA that rely on comparisons to either draft or finished genomes.

  13. Minimum taxonomic criteria for bacterial genome sequence depositions and announcements.

    Science.gov (United States)

    Bull, Matthew J; Marchesi, Julian R; Vandamme, Peter; Plummer, Sue; Mahenthiralingam, Eshwar

    2012-04-01

    Multiple bioinformatic methods are available to analyse the information encoded within the complete genome sequence of a bacterium and accurately assign its species status or nearest phylogenetic neighbour. However, it is clear that even now in what is the third decade of bacterial genomics, taxonomically incorrect genome sequence depositions are still being made. We outline a simple scheme of bioinformatic analysis and a set of minimum criteria that should be applied to all bacterial genomic data to ensure that they are accurately assigned to the species or genus level prior to database deposition. To illustrate the utility of the bioinformatic workflow, we analysed the recently deposited genome sequence of Lactobacillus acidophilus 30SC and demonstrated that this DNA was in fact derived from a strain of Lactobacillus amylovorus. Using these methods researchers can ensure that the taxonomic accuracy of genome sequence depositions is maintained within the ever increasing nucleic acid datasets. PMID:22366464

  14. Phylogeography of Spiraea alpina (Rosaceae) in the Qinghai-Tibetan Plateau inferred from chloroplast DNA sequence variations

    Institute of Scientific and Technical Information of China (English)

    Fa-Qi ZHANG; Qing-Bo GAO; De-Jun ZHANG; Yi-Zhong DUAN; Yin-Hu LI; Peng-Cheng FU; Rui XING; Khan GULZAR; Shi-Long CHEN

    2012-01-01

    The aim of the present study was to investigate the phylogeographic patterns of Spiraea alpina (Rosaceae) and clarify its response to past climatic changes in the climate-sensitive Qinghai-Tibetan Plateau (QTP). We sequenced a chloroplast DNA fragment (trnL-trnF) from 528 individuals representing 43 populations. We identified 10 haplotypes, which were tentatively divided into three groups. These haplotypes or groups were distributed in the different regions of the QTP. Only half the populations were fixed for a single haplotype, whereas the others contained two or more. In the central and eastern regions, adjacent populations at the local scale shared the same haplotype. Our phylogeographic analyses suggest that this alpine shrub survived in multiple refugia during the Last Glacial Maximum and that earlier glaciations may have triggered deep intraspecific divergences. Post-glacial expansions occurred only within populations or across multiple populations within a local range. The findings of the present study, together with previous phylogeographic reports, suggest that evolutionary histories of plants in the QTP are complex and variable depending on the species investigated.

  15. Genome sequencing and annotation of Serratia sp. strain TEL.

    Science.gov (United States)

    Lephoto, Tiisetso E; Gray, Vincent M

    2015-12-01

    We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000. PMID:26697332

  16. Comparative Copy Number Variation From Whole Genome Sequencing

    OpenAIRE

    Janevski, A.; Varadan, V.; Kamalakaran, S.; Banerjee, N.; Dimitrova, D

    2011-01-01

    Whole genome sequencing enables a high resolution view of the human genome and enables unique insights into copy number variations at an unprecedented scale. Numerous tools and studies have already been introduced that provide confirmatory and new genomic variability data in individuals and across populations. We investigate two such methods, CNV-seq and FREEC, and compare their outputs when applied to five whole genome sequences representing four populations. We focus on the ability of these tool...

  17. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby;

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome...... association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function....

  18. Marsupial Genome Sequences: Providing Insight into Evolution and Disease

    OpenAIRE

    Deakin, Janine E.

    2012-01-01

    Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be seq...

  19. Whole-genome sequencing in bacteriology: state of the art

    OpenAIRE

    Dark, Michael

    2013-01-01

    Michael J Dark. Department of Infectious Diseases and Pathology and Emerging Pathogens Institute, University of Florida, Gainesville, FL, USA. Abstract: Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and b...

  20. Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities

    OpenAIRE

    Chen, Kevin; Pachter, Lior

    2005-01-01

    The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fe...

  1. Perspectives of Integrative Cancer Genomics in Next Generation Sequencing Era

    OpenAIRE

    Kwon, So Mee; Cho, Hyunwoo; Choi, Ji Hye; Jee, Byul A; Jo, Yuna; Woo, Hyun Goo

    2012-01-01

    The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of the genetic aberrations could reveal the candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establis...

  2. Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi

    Energy Technology Data Exchange (ETDEWEB)

    Schutzer S. E.; Dunn J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

    2011-02-01

    Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain B31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

  3. Generation of physical map contig-specific sequences useful for whole genome sequence scaffolding.

    Directory of Open Access Journals (Sweden)

    Yanliang Jiang

    Along with the rapid advances of nextgen sequencing technologies, more and more species are being added to the list of organisms whose whole genomes are sequenced. However, the assembled draft genome of many organisms consists of numerous small contigs, due to the short length of the reads generated by nextgen sequencing platforms. In order to improve the assembly and bring the genome contigs together, more genome resources are needed. In this study, we developed a strategy to generate a valuable genome resource, physical map contig-specific sequences, which are randomly distributed genome sequences in each physical contig. A two-dimensional tagging method was used to create specific tags for 1,824 physical contigs, which dramatically reduced the cost. A total of 94,111,841 100-bp reads and 315,277 assembled contigs were identified as containing physical map contig-specific tags. The physical map contig-specific sequences, along with the currently available BAC end sequences, were then used to anchor the catfish draft genome contigs. A total of 156,457 genome contigs (~79% of the whole genome sequencing assembly) were anchored and grouped into 1,824 pools, in which 16,680 unique genes were annotated. The physical map contig-specific sequences are valuable resources to link the physical map, genetic linkage map and draft whole genome sequences, and consequently have the capability to improve whole genome sequence assembly and scaffolding, as well as genome-wide comparative analysis. The strategy developed in this study could also be adopted in other species whose whole genome assembly is still facing a challenge.

  4. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    OpenAIRE

    Tettelin, Hervé; Masignani, Vega; Cieslewicz, Michael J.; Eisen, Jonathan A.; Peterson, Scott; Wessels, Michael R.; Paulsen, Ian T.; Nelson, Karen E.; Margarit, Immaculada; Read, Timothy D.; Madoff, Lawrence C.; Wolf, Alex M.; Beanan, Maureen J; Brinkac, Lauren M.; Daugherty, Sean C

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the other completely sequenced genomes identified genes specific to the streptococci and to S. agalactiae. These in silico analyses, combined with comparative genome hybridization experiments between the ...

  5. Progress in Understanding and Sequencing the Genome of Brassica rapa

    OpenAIRE

    Hong, Chang Pyo; Kwon, Soo-Jin; Kim, Jung Sun; Yang, Tae-Jin; Park, Beom-Seok; Lim, Yong Pyo

    2008-01-01

    Brassica rapa, which is closely related to Arabidopsis thaliana, is an important crop and a model plant for studying genome evolution via polyploidization. We report the current understanding of the genome structure of B. rapa and efforts for the whole-genome sequencing of the species. The tribe Brassiceae, which comprises ca. 240 species, descended from a common hexaploid ancestor with a basic genome similar to that of Arabidopsis. Chromosome rearrangements, including fusions and/or fissio...

  6. Complete Genome Sequence of Probiotic Strain Lactobacillus acidophilus La-14.

    Science.gov (United States)

    Stahl, Buffy; Barrangou, Rodolphe

    2013-01-01

    We present the 1,991,830-bp complete genome sequence of Lactobacillus acidophilus strain La-14 (SD-5212). Comparative genomic analysis revealed 99.98% similarity overall to the L. acidophilus NCFM genome. Globally, 111 single nucleotide polymorphisms (SNPs) (95 SNPs, 16 indels) were observed throughout the genome. Also, a 416-bp deletion in the LA14_1146 sugar ABC transporter was identified. PMID:23788546

  7. Complete Genome Sequence of Probiotic Strain Lactobacillus acidophilus La-14

    OpenAIRE

    Stahl, Buffy; Barrangou, Rodolphe

    2013-01-01

    We present the 1,991,830-bp complete genome sequence of Lactobacillus acidophilus strain La-14 (SD-5212). Comparative genomic analysis revealed 99.98% similarity overall to the L. acidophilus NCFM genome. Globally, 111 single nucleotide polymorphisms (SNPs) (95 SNPs, 16 indels) were observed throughout the genome. Also, a 416-bp deletion in the LA14_1146 sugar ABC transporter was identified.

  8. Analysis of the Thermotoga maritima genome combining a variety of sequence similarity and genome context tools

    OpenAIRE

    Kyrpides, Nikos C; Ouzounis, Christos A; Iliopoulos, Ioannis; Vonstein, Veronika; Overbeek, Ross

    2000-01-01

    The proliferation of genome sequence data has led to the development of a number of tools and strategies that facilitate computational analysis. These methods include the identification of motif patterns, membership of the query sequences in family databases, metabolic pathway involvement and gene proximity. We re-examined the completely sequenced genome of Thermotoga maritima by employing the combined use of the above methods. By analyzing all 1877 proteins encoded in this genome, we identif...

  9. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    Energy Technology Data Exchange (ETDEWEB)

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

  10. Real-time, portable genome sequencing for Ebola surveillance.

    Science.gov (United States)

    Quick, Joshua; Loman, Nicholas J; Duraffour, Sophie; Simpson, Jared T; Severi, Ettore; Cowley, Lauren; Bore, Joseph Akoi; Koundouno, Raymond; Dudas, Gytis; Mikhail, Amy; Ouédraogo, Nobila; Afrough, Babak; Bah, Amadou; Baum, Jonathan H J; Becker-Ziaja, Beate; Boettcher, Jan Peter; Cabeza-Cabrerizo, Mar; Camino-Sánchez, Álvaro; Carter, Lisa L; Doerrbecker, Juliane; Enkirch, Theresa; García-Dorival, Isabel; Hetzelt, Nicole; Hinzmann, Julia; Holm, Tobias; Kafetzopoulou, Liana Eleni; Koropogui, Michel; Kosgey, Abigael; Kuisma, Eeva; Logue, Christopher H; Mazzarelli, Antonio; Meisel, Sarah; Mertens, Marc; Michel, Janine; Ngabo, Didier; Nitzsche, Katja; Pallasch, Elisa; Patrono, Livia Victoria; Portmann, Jasmine; Repits, Johanna Gabriella; Rickett, Natasha Y; Sachse, Andreas; Singethan, Katrin; Vitoriano, Inês; Yemanaberhan, Rahel L; Zekeng, Elsa G; Racine, Trina; Bello, Alexander; Sall, Amadou Alpha; Faye, Ousmane; Faye, Oumar; Magassouba, N'Faly; Williams, Cecelia V; Amburgey, Victoria; Winona, Linda; Davis, Emily; Gerlach, Jon; Washington, Frank; Monteil, Vanessa; Jourdain, Marine; Bererd, Marion; Camara, Alimou; Somlare, Hermann; Camara, Abdoulaye; Gerard, Marianne; Bado, Guillaume; Baillet, Bernard; Delaune, Déborah; Nebie, Koumpingnin Yacouba; Diarra, Abdoulaye; Savane, Yacouba; Pallawo, Raymond Bernard; Gutierrez, Giovanna Jaramillo; Milhano, Natacha; Roger, Isabelle; Williams, Christopher J; Yattara, Facinet; Lewandowski, Kuiama; Taylor, James; Rachwal, Phillip; Turner, Daniel J; Pollakis, Georgios; Hiscox, Julian A; Matthews, David A; O'Shea, Matthew K; Johnston, Andrew McD; Wilson, Duncan; Hutley, Emma; Smit, Erasmus; Di Caro, Antonino; Wölfel, Roman; Stoecker, Kilian; Fleischmann, Erna; Gabriel, Martin; Weller, Simon A; Koivogui, Lamine; Diallo, Boubacar; Keïta, Sakoba; Rambaut, Andrew; Formenty, Pierre; Günther, Stephan; Carroll, Miles W

    2016-02-11

    The Ebola virus disease epidemic in West Africa is the largest on record, responsible for over 28,599 cases and more than 11,299 deaths. Genome sequencing in viral outbreaks is desirable to characterize the infectious agent and determine its evolutionary rate. Genome sequencing also allows the identification of signatures of host adaptation, identification and monitoring of diagnostic targets, and characterization of responses to vaccines and treatments. The Ebola virus (EBOV) genome substitution rate in the Makona strain has been estimated at between 0.87 × 10⁻³ and 1.42 × 10⁻³ mutations per site per year. This is equivalent to 16-27 mutations in each genome, meaning that sequences diverge rapidly enough to identify distinct sub-lineages during a prolonged epidemic. Genome sequencing provides a high-resolution view of pathogen evolution and is increasingly sought after for outbreak surveillance. Sequence data may be used to guide control measures, but only if the results are generated quickly enough to inform interventions. Genomic surveillance during the epidemic has been sporadic owing to a lack of local sequencing capacity coupled with practical difficulties transporting samples to remote sequencing facilities. To address this problem, here we devise a genomic surveillance system that utilizes a novel nanopore DNA sequencing instrument. In April 2015 this system was transported in standard airline luggage to Guinea and used for real-time genomic surveillance of the ongoing epidemic. We present sequence data and analysis of 142 EBOV samples collected during the period March to October 2015. We were able to generate results less than 24 h after receiving an Ebola-positive sample, with the sequencing process taking as little as 15-60 min. We show that real-time genomic surveillance is possible in resource-limited settings and can be established rapidly to monitor outbreaks. PMID:26840485
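
    The conversion from a per-site substitution rate to mutations per genome in the abstract above can be checked with a short calculation. This is only an illustrative sketch: the genome length of roughly 18,959 nucleotides is an assumption that the abstract does not state, and the function name is hypothetical.

```python
# Assumption (not stated in the abstract): an EBOV genome of ~18,959 nucleotides.
GENOME_LENGTH_NT = 18_959

# Substitution rate bounds quoted in the abstract (mutations per site per year).
RATE_LOW = 0.87e-3
RATE_HIGH = 1.42e-3


def mutations_per_genome_per_year(rate, genome_length=GENOME_LENGTH_NT):
    """Expected number of substitutions across the whole genome in one year."""
    return rate * genome_length


low = mutations_per_genome_per_year(RATE_LOW)
high = mutations_per_genome_per_year(RATE_HIGH)
print(f"{low:.1f} to {high:.1f} substitutions per genome per year")
```

    Under the assumed genome length, the two bounds round to 16 and 27, which agrees with the 16-27 range quoted in the abstract.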

  11. Determining and comparing protein function in Bacterial genome sequences

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla

    predictions were made in about 60% of the cases. This project has highlighted the difficulties and challenges in functional annotation and computational analysis of sequence data. It has provided possible solutions for creating reproducible pipelines for comparative genomics as well as constructed a number of......In November 2013, there were around 21,000 different prokaryotic genomes sequenced and publicly available, and the number is growing daily with another 20,000 or more genomes expected to be sequenced and deposited by the end of 2014. An important part of the analysis of this data is the functional...... known functions. This thesis describes the development of new tools for comparative functional annotation and a system for comparative genomics in general. As novel sequenced genomes are becoming more readily available, there is a need for standard analysis tools. The system CMG-biotools is presented...

  12. Genetic variation of Kaempferia (Zingiberaceae) in Thailand based on chloroplast DNA (psbA-trnH and petA-psbJ) sequences.

    Science.gov (United States)

    Techaprasan, J; Klinbunga, S; Ngamriabsakul, C; Jenjittikul, T

    2010-01-01

    Genetic variation and species authentication of 71 Kaempferia accessions (representing 15 recognized, six new, and four unidentified species) found indigenously in Thailand were examined by determining chloroplast psbA-trnH and partial petA-psbJ spacer sequences. Ten closely related species (Boesenbergia rotunda, Gagnepainia godefroyi, G. thoreliana, Globba substrigosa, Smithatris myanmarensis, S. supraneanae, Scaphochlamys biloba, S. minutiflora, S. rubescens, and Stahlianthus sp) were also included. After sequence alignments, 1010 and 865 bp in length were obtained for the respective chloroplast DNA sequences. Intraspecific sequence variation was not observed in Kaempferia candida, K. angustifolia, K. laotica, K. galanga, K. pardi sp nov., K. bambusetorum sp nov., K. albomaculata sp nov., K. minuta sp nov., Kaempferia sp nov. 1, and G. thoreliana, for which more than one specimen was available. In contrast, intraspecific sequence polymorphisms were observed in various populations of K. fallax, K. filifolia, K. elegans, K. pulchra, K. rotunda, K. marginata, K. parviflora, K. larsenii, K. roscoeana, K. siamensis, and G. godefroyi. A strict consensus tree based on combined psbA-trnH and partial petA-psbJ sequences revealed four major groups of Kaempferia species. We suggest that the genus Kaempferia is a polyphyletic group, as K. candida was distantly related and did not group with other Kaempferia species. Polymorphic sites and indels of psbA-trnH and petA-psbJ can be used as DNA barcodes for species diagnosis of most Kaempferia and outgroup species. Nuclear DNA polymorphism should be examined to determine if there has been interspecific hybridization and chloroplast DNA introgression in these taxa. PMID:20927714

  13. Marsupial genome sequences: providing insight into evolution and disease.

    Science.gov (United States)

    Deakin, Janine E

    2012-01-01

    Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be sequenced in the near future, making this a particularly exciting time in marsupial genomics. The emergence of a transmissible cancer, which is obliterating the Tasmanian devil population, has increased the importance of obtaining and analysing marsupial genome sequence for understanding such diseases as well as for conservation efforts. In addition, these genome sequences have facilitated studies aimed at answering questions regarding gene and genome evolution and provided insight into the evolution of epigenetic mechanisms. Here I highlight the major advances in our understanding of evolution and disease, facilitated by marsupial genome projects, and speculate on the future contributions to be made by such sequences. PMID:24278712

  14. Draft genome sequence of Enterococcus faecium strain LMG 8148.

    Science.gov (United States)

    Michiels, Joran E; Van den Bergh, Bram; Fauvart, Maarten; Michiels, Jan

    2016-01-01

    Enterococcus faecium, traditionally considered a harmless gut commensal, is emerging as an important nosocomial pathogen showing increasing rates of multidrug resistance. We report the draft genome sequence of E. faecium strain LMG 8148, isolated in 1968 from a human in Gothenburg, Sweden. The draft genome has a total length of 2,697,490 bp, a GC-content of 38.3 %, and 2,402 predicted protein-coding sequences. The isolation of this strain predates the emergence of E. faecium as a nosocomial pathogen. Consequently, its genome can be useful in comparative genomic studies investigating the evolution of E. faecium as a pathogen. PMID:27610213

  15. Complete genome sequence of Acidimicrobium ferrooxidans type strain (ICPT)

    Energy Technology Data Exchange (ETDEWEB)

    Clum, Alicia; Nolan, Matt; Lang, Elke; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Goker, Markus; Spring, Stefan; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

    2009-05-20

    Acidimicrobium ferrooxidans (Clark and Norris 1996) is the sole and type species of the genus, which until recently was the only genus within the actinobacterial family Acidimicrobiaceae and in the order Acidimicrobiales. Rapid oxidation of iron pyrite during autotrophic growth in the absence of an enhanced CO2 concentration is characteristic for A. ferrooxidans. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of the order Acidimicrobiales, and the 2,158,157 bp long single replicon genome with its 2038 protein coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  16. Puzzling sequences: studying microbial genomes from 'Ötzi'

    International Nuclear Information System (INIS)

    Ancient remains, and mummies in particular, are of central value for archaeological research. The Tyrolean iceman “Ötzi” was conserved in a glacier of the Ötztal Alps about 5000 years ago. Aside from morphological and phenotypical classification, the determination of DNA sequences and the subsequent genome analyses have been first applied to mitochondrial DNA and then been extended to genomic DNA. Typically also ancient microbial DNA is sequenced. These sequences allow the identification of pathogens as well as studying the evolution of microorganisms. The talk will explain the metagenomic aspects of the “Ötzi” genome project and discuss the first results. (author)

  17. Analysis of codon usage in the chloroplast genome of Medicago truncatula

    Institute of Scientific and Technical Information of China (English)

    杨国锋; 苏昆龙; 赵怡然; 宋智斌; 孙娟

    2015-01-01

    The complete nucleotide sequence of the chloroplast genome of Medicago truncatula was investigated. Fifty CDS (coding DNA sequences) selected from the chloroplast genome sequence of M. truncatula were analyzed using CodonW software. The results show that the third codon position was rich in A and U. ENC ranged from 37.1 to 51.9, meaning that the codon bias was weak. There were 23 codons with relative synonymous codon usage greater than 1, 20 of them ending with A or U. ENC-plot analysis showed that GC3 was not correlated with GC12; ENC ratios of most genes ranged from -0.05 to 0.05. In the correspondence analysis of the first group of four axes, the first axis showed 10.3% variation. The correlation coefficients for axis 1 between ENC and GC3 were 0.091 and -0.092 respectively (not significant). Synonymous codon usage bias was found, mainly due to the effect of mutation pressure, but there were other factors. In addition, analysis of the high expression codons enabled 23 to be affirmed as the "optimal codons", such as UAA, UUG and CCU. The results provide evidence for molecular modification of exogenous genes to increase the expression efficiency in M. truncatula chloroplasts. This paper analyzes the codons of the complete chloroplast genome sequence of Medicago truncatula; 50 CDS (coding DNA sequences) were selected and their codon usage patterns were analyzed with CodonW software. The results show that the GC content at the third codon position was 26.9%, i.e. the third position was rich in A and U, and that ENC values ranged from 37.11 to 51.91, indicating weak codon bias. Relative synonymous codon usage analysis showed 23 codons with RSCU values greater than 1, of which 20 ended in A or U. Neutrality plot analysis showed a correlation coefficient of 0.341 between GC12 and GC3, which was not significant, with a regression coefficient of 0.4843; the ENC ratios of individual genes mostly fell between -0.05 and 0.05, i.e. the ENC values of most genes were close to their expected ENC values; in the correspondence analysis, the first axis accounted for 12.50% of the variation and was the main factor, and the first axis with ENC and GC3...
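
    The relative synonymous codon usage (RSCU) and third-position GC content (GC3) statistics named in this abstract can be illustrated with a minimal sketch. It is not the CodonW implementation; the function names are hypothetical, and the codon assignments of the standard genetic code are assumed to apply to plastid coding sequences. RSCU is computed here as the observed count of a codon divided by the mean count of all codons encoding the same amino acid.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds):
    """Split a coding sequence into complete codons, treating U as T."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def rscu(cds):
    """Observed codon count divided by the mean count of its synonymous family."""
    counts = Counter(c for c in split_codons(cds) if c in CODON_TABLE)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        mean = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / mean
    return values


def gc3(cds):
    """Fraction of codons whose third base is G or C."""
    codons = [c for c in split_codons(cds) if c in CODON_TABLE]
    return sum(c[2] in "GC" for c in codons) / len(codons)


example = rscu("TTTTTTTTC")  # two TTT codons and one TTC codon (phenylalanine)
print(example["TTT"], example["TTC"], gc3("TTTTTTTTC"))
```

    An RSCU value above 1 means the codon is used more often than expected under equal use of synonymous codons, which is the criterion the abstract applies when it counts 23 such codons.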

  18. Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

    Directory of Open Access Journals (Sweden)

    Martijn Staats

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylogenetic, demographic and genetic) hypotheses, that become increasingly more

  19. BAC-pool 454-sequencing: A rapid and efficient approach to sequence complex tetraploid cotton genomes

    Science.gov (United States)

    New and emerging next generation sequencing technologies have been promising in reducing sequencing costs, but not significantly for complex polyploid plant genomes such as cotton. The large and highly repetitive genome of G. hirsutum (~2.5 Gb) is less amenable and cost-intensive with traditional BAC-by...

  20. Genome sequencing and analysis of the model grass Brachypodium distachyon.

    Science.gov (United States)

    2010-02-11

    Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops. PMID:20148030

  1. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  2. Perspectives of integrative cancer genomics in next generation sequencing era.

    Science.gov (United States)

    Kwon, So Mee; Cho, Hyunwoo; Choi, Ji Hye; Jee, Byul A; Jo, Yuna; Woo, Hyun Goo

    2012-06-01

    The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of the genetic aberrations could reveal the candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish the huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding on the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of the structured integration, sharing, and interpretation of the big omics data still remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on the study design and analysis strategies. This might be helpful to understand the current trends and strategies of the rapidly evolving cancer genomics research. PMID:23105932

  3. Analysis of Simple Sequence Repeats in Genomes of Rhizobia

    Institute of Scientific and Technical Information of China (English)

    GAO Ya-mei; HAN Yi-qiang; TANG Hui; SUN Dong-mei; WANG Yan-jie; WANG Wei-dong

    2008-01-01

    Simple sequence repeats (SSRs) or microsatellites, as genetic markers, are ubiquitous in genomes of various organisms. The analysis of SSRs in rhizobial genomes provides useful information for a variety of applications in the population genetics of rhizobia. We analyzed the occurrences, relative abundance, and relative density of SSRs, the most common in Bradyrhizobium japonicum, Mesorhizobium loti, and Sinorhizobium meliloti genomes sequenced in the microorganisms tandem repeats database, and SSRs in the three species genomes were compared with each other. The results showed that there were 1 410, 859, and 638 SSRs in the B. japonicum, M. loti, and S. meliloti genomes, respectively. In the genomes of B. japonicum, M. loti, and S. meliloti, tetranucleotide, pentanucleotide, and hexanucleotide repeats were more abundant and indicated higher mutation rates in these species. Mononucleotide repeats were the least abundant. The SSR types and distributions were similar among these species.
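
    The kind of SSR census described in this abstract can be sketched in a few lines. This is only an illustration, not the procedure used by the authors or by the tandem repeats database: the function names `find_ssrs` and `ssr_summary` are hypothetical, the minimum repeat counts in `MIN_REPEATS` are assumed thresholds, and only perfect (uninterrupted) repeats of 1 to 6 bp motifs are detected.

```python
import re
from collections import Counter

# Assumed minimum number of motif copies for a run to count as an SSR,
# keyed by motif length. These thresholds are illustrative, not from the abstract.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves built from a shorter unit
            # (e.g. "ATAT"), so a run is reported once, under its shortest motif.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


def ssr_summary(seq):
    """Count SSRs by motif length and report relative abundance (SSRs per kb)."""
    hits = find_ssrs(seq)
    by_motif_length = Counter(len(motif) for _, motif, _ in hits)
    per_kb = 1000.0 * len(hits) / len(seq) if seq else 0.0
    return by_motif_length, per_kb
```

    Applied to a genome sequence read from a FASTA file, `ssr_summary` would give per-class counts comparable in spirit to the mono- to hexanucleotide tallies reported above, although the absolute numbers depend heavily on the chosen thresholds.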

  4. Draft Genome Sequences of Gammaproteobacterial Methanotrophs Isolated from Marine Ecosystems.

    Science.gov (United States)

    Flynn, James D; Hirayama, Hisako; Sakai, Yasuyoshi; Dunfield, Peter F; Klotz, Martin G; Knief, Claudia; Op den Camp, Huub J M; Jetten, Mike S M; Khmelenina, Valentina N; Trotsenko, Yuri A; Murrell, J Colin; Semrau, Jeremy D; Svenning, Mette M; Stein, Lisa Y; Kyrpides, Nikos; Shapiro, Nicole; Woyke, Tanja; Bringel, Françoise; Vuilleumier, Stéphane; DiSpirito, Alan A; Kalyuzhnaya, Marina G

    2016-01-01

    The genome sequences of Methylobacter marinus A45, Methylobacter sp. strain BBA5.1, and Methylomarinum vadi IT-4 were obtained. These aerobic methanotrophs are typical members of coastal and hydrothermal vent marine ecosystems. PMID:26798114

  5. Draft Genome Sequences of Gammaproteobacterial Methanotrophs Isolated from Marine Ecosystems

    OpenAIRE

    Flynn, James D.; Hirayama, Hisako; Sakai, Yasuyoshi; Dunfield, Peter F.; Klotz, Martin G.; Knief, Claudia; Op Den Camp, Huub J M; Jetten, Mike S. M.; Khmelenina, Valentina N; Trotsenko, Yuri A.; Murrell, J. Colin; Semrau, Jeremy D.; Svenning, Mette M.; Stein, Lisa Y.; Kyrpides, Nikos

    2016-01-01

    The genome sequences of Methylobacter marinus A45, Methylobacter sp. strain BBA5.1, and Methylomarinum vadi IT-4 were obtained. These aerobic methanotrophs are typical members of coastal and hydrothermal vent marine ecosystems.

  6. Draft Genome Sequences of Gammaproteobacterial Methanotrophs Isolated from Marine Ecosystems

    Science.gov (United States)

    Flynn, James D.; Hirayama, Hisako; Sakai, Yasuyoshi; Dunfield, Peter F.; Knief, Claudia; Op den Camp, Huub J. M.; Jetten, Mike S. M.; Khmelenina, Valentina N.; Trotsenko, Yuri A.; Murrell, J. Colin; Semrau, Jeremy D.; Svenning, Mette M.; Stein, Lisa Y.; Kyrpides, Nikos; Shapiro, Nicole; Woyke, Tanja; Bringel, Françoise; Vuilleumier, Stéphane; DiSpirito, Alan A.

    2016-01-01

    The genome sequences of Methylobacter marinus A45, Methylobacter sp. strain BBA5.1, and Methylomarinum vadi IT-4 were obtained. These aerobic methanotrophs are typical members of coastal and hydrothermal vent marine ecosystems. PMID:26798114

  7. Draft Genome Sequence of Paecilomyces hepiali, Isolated from Cordyceps sinensis.

    Science.gov (United States)

    Yu, Yi; Wang, Wenting; Wang, Linping; Pang, Fang; Guo, Lanping; Song, Lai; Liu, Guiming; Feng, Chengqiang

    2016-01-01

    Paecilomyces hepiali is an endoparasitic fungus that commonly exists in the natural Cordyceps sinensis. Here, we report the draft genome sequence of P. hepiali, which will facilitate the exploitation of medicinal compounds produced by the fungus. PMID:27389266

  8. Draft Genome Sequence of Paecilomyces hepiali, Isolated from Cordyceps sinensis

    Science.gov (United States)

    Yu, Yi; Wang, Wenting; Wang, Linping; Pang, Fang; Guo, Lanping; Song, Lai

    2016-01-01

    Paecilomyces hepiali is an endoparasitic fungus that commonly exists in the natural Cordyceps sinensis. Here, we report the draft genome sequence of P. hepiali, which will facilitate the exploitation of medicinal compounds produced by the fungus. PMID:27389266

  9. First Draft Genome Sequence of a Mycobacterium gordonae Clinical Isolate

    Science.gov (United States)

    Smirnova, T.; Blagodatskikh, K.; Varlamov, D.; Sochivko, D.; Larionova, E.; Andreevskaya, S.; Andrievskaya, I.; Chernousova, L.

    2016-01-01

    Here, we report the first draft genome sequence of the clinically relevant species Mycobacterium gordonae. The clinical isolate Mycobacterium gordonae 14-8773 was obtained from the sputum of a patient with mycobacteriosis. PMID:27365356

  10. Genome Sequence of Bacillus thuringiensis subsp. kurstaki Strain HD-1

    OpenAIRE

    Day, Michael; Ibrahim, Mohamed; Dyer, David; Bulla, Lee

    2014-01-01

    We report here the complete genome sequence of Bacillus thuringiensis subsp. kurstaki strain HD-1, which serves as the primary U.S. reference standard for all commercial insecticidal formulations of B. thuringiensis manufactured around the world.

  11. Bacterial epidemiology and biology - lessons from genome sequencing.

    OpenAIRE

    Parkhill, J.; Wren, BW

    2011-01-01

    Next-generation sequencing has ushered in a new era of microbial genomics, enabling the detailed historical and geographical tracing of bacteria. This is helping to shape our understanding of bacterial evolution.

  12. Seeing chordate evolution through the Ciona genome sequence

    OpenAIRE

    Cañestro, Cristian; Bassham, Susan; Postlethwait, John H.

    2003-01-01

    A draft sequence of the compact genome of the sea squirt Ciona intestinalis, a non-vertebrate chordate that diverged very early from other chordates, including vertebrates, illuminates how chordates originated and how vertebrate developmental innovations evolved.

  13. Complete Genome Sequence of Rahnella aquatilis CIP 78.65

    Energy Technology Data Exchange (ETDEWEB)

    Martinez, Robert J [University of Alabama, Tuscaloosa; Bruce, David [Los Alamos National Laboratory (LANL); Detter, J C [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Held, Brittany [Los Alamos National Laboratory (LANL); Land, Miriam L [ORNL; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Pennacchio, Len [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Sobeckya, Patricia A. [University of Alabama, Tuscaloosa

    2012-01-01

    Rahnella aquatilis CIP 78.65 is a gammaproteobacterium isolated from a drinking water source in Lille, France. Here we report the complete genome sequence of Rahnella aquatilis CIP 78.65, the type strain of R. aquatilis.

  14. Cancer Genome Sequencing and Its Implications for Personalized Cancer Vaccines

    International Nuclear Information System (INIS)

    New DNA sequencing platforms have revolutionized human genome sequencing. The dramatic advances in genome sequencing technologies predict that the $1,000 genome will become a reality within the next few years. Applied to cancer, the availability of cancer genome sequences permits real-time decision-making with the potential to affect diagnosis, prognosis, and treatment, and has opened the door towards personalized medicine. A promising strategy is the identification of mutated tumor antigens, and the design of personalized cancer vaccines. Supporting this notion are preliminary analyses of the epitope landscape in breast cancer suggesting that individual tumors express significant numbers of novel antigens to the immune system that can be specifically targeted through cancer vaccines

  15. Brucella abortus S19 genome sequenced, points toward virulence genes

    OpenAIRE

    Whyte, Barry James

    2008-01-01

    Researchers at the Virginia Bioinformatics Institute at Virginia Tech; the National Animal Disease Center in Ames, Iowa; and collaborators at 454 Life Sciences, Branford, Conn., have sequenced the genome of Brucella abortus strain S19.

  16. Complete Genome Sequence of Mycobacterium phlei Type Strain RIVM601174

    KAUST Repository

    Abdallah, A. M.

    2012-05-24

    Mycobacterium phlei is a rapidly growing nontuberculous Mycobacterium species that is typically nonpathogenic, with few reported cases of human disease. Here we report the whole genome sequence of M. phlei type strain RIVM601174.

  17. Complete Genome Sequences of Six Strains of the Genus Methylobacterium

    Energy Technology Data Exchange (ETDEWEB)

    Marx, Christopher J [Harvard University; Bringel, Francoise O. [University of Strasbourg; Christoserdova, Ludmila [University of Washington, Seattle; Moulin, Lionel [UMR, France; UI Hague, Muhammad Farhan [University of Strasbourg; Fleischman, Darrell E. [Wright State University, Dayton, OH; Gruffaz, Christelle [CNRS, Strasbourg, France; Jourand, Philippe [UMR, France; Knief, Claudia [ETH Zurich, Switzerland; Lee, Ming-Chun [Harvard University; Muller, Emilie E. L. [CNRS, Strasbourg, France; Nadalig, Thierry [CNRS, Strasbourg, France; Peyraud, Remi [ETH Zurich, Switzerland; Roselli, Sandro [CNRS, Strasbourg, France; Russ, Lina [ETH Zurich, Switzerland; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Ivanov, Pavel S. [University of Wyoming, Laramie; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lajus, Aurelie [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Land, Miriam L [ORNL; Medigue, Claudine [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Stolyar, Sergey [University of Washington; Vorholt, Julia A. [ETH Zurich, Switzerland; Vuilleumier, Stephane [University of Strasbourg

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  18. Complete genome sequences of six strains of the genus Methylobacterium

    Energy Technology Data Exchange (ETDEWEB)

    Marx, Christopher J [Harvard University; Bringel, Francoise O. [University of Strasbourg; Christoserdova, Ludmila [University of Washington, Seattle; Moulin, Lionel [UMR, France; Farhan Ul Haque, Muhammad [CNRS, Strasbourg, France; Fleischman, Darrell E. [Wright State University, Dayton, OH; Gruffaz, Christelle [CNRS, Strasbourg, France; Jourand, Philippe [UMR, France; Knief, Claudia [ETH Zurich, Switzerland; Lee, Ming-Chun [Harvard University; Muller, Emilie E. L. [CNRS, Strasbourg, France; Nadalig, Thierry [CNRS, Strasbourg, France; Peyraud, Remi [ETH Zurich, Switzerland; Roselli, Sandro [CNRS, Strasbourg, France; Russ, Lina [ETH Zurich, Switzerland; Aguero, Fernan [Universidad Nacional de General San Martin; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lajus, Aurelie [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Land, Miriam L [ORNL; Medigue, Claudine [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Stolyar, Sergey [University of Washington; Vorholt, Julia A. [ETH Zurich, Switzerland; Vuilleumier, Stephane [University of Strasbourg

    2012-01-01

    The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

  19. Sequencing of Wheat Chromosome 6B: Toward Functional Genomics

    Czech Academy of Sciences Publication Activity Database

    Tanaka, T.; Kobayashi, F.; Joshi, G.P.; Onuki, R.; Šimková, Hana; Nasuda, S.; Doležel, Jaroslav; Ogihara, Y.; Itoh, T.; Handa, H.

    Publisher: Springer, 2015 - (Handa, H.), pp. 111-116 ISBN 978-4-431-55674-9 Institutional support: RVO:61389030 Keywords : Chromosome 6B * Genome sequencing * Marker construction Subject RIV: EB - Genetics ; Molecular Biology

  20. Sequence analysis of the complete mitochondrial genome of Youxian sheldrake.

    Science.gov (United States)

    He, Shao-Ping; Liu, Li-Li; Yu, Qi-Fang; Li, Si; He, Jian-Hua

    2016-01-01

    Youxian sheldrake is an excellent native breed of Hunan province, China. The complete mitochondrial (mt) genome sequence plays an important role in the accurate determination of phylogenetic relationships among metazoans. This is the first study to determine the complete mitochondrial genome sequence of Youxian sheldrake using PCR-based amplification and Sanger sequencing. The characteristics of the entire mitochondrial genome were analyzed in detail. The total length of the mitogenome is 16,605 bp, with a base composition of 29.21% A, 22.18% T, 32.84% C and 15.77% G. It contains 2 ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and a major non-coding control region (D-loop region). The complete mitochondrial genome sequence of Youxian sheldrake provides important data for further study of the phylogenetics of poultry, as well as for genetics and breeding. PMID:25090395
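
    The base composition quoted in this abstract can be checked with simple arithmetic, and the same quantity can be computed from any sequence. The sketch below is illustrative only: `base_composition` is a hypothetical helper, and the `reported` values are copied from the abstract rather than recomputed from the deposited sequence.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, T, C and G in seq (other symbols are ignored)."""
    counts = Counter(base for base in seq.upper() if base in "ATCG")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, T, C or G")
    return {base: 100.0 * counts[base] / total for base in "ATCG"}


# Percentages as reported in the abstract for the 16,605 bp mitogenome.
reported = {"A": 29.21, "T": 22.18, "C": 32.84, "G": 15.77}

# The four percentages should account for the whole sequence.
total_percent = round(sum(reported.values()), 2)

# G+C content implied by the reported values.
gc_percent = round(reported["G"] + reported["C"], 2)
```

    With these figures the four percentages sum to 100%, and the implied G+C content is 48.61%.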

  1. Complete genome sequence of Treponema pallidum strain DAL-1

    Science.gov (United States)

    Zobaníková, Marie; Mikolka, Pavol; Čejková, Darina; Pospíšilová, Petra; Chen, Lei; Strouhal, Michal; Qin, Xiang; Weinstock, George M.; Šmajs, David

    2012-01-01

    Treponema pallidum strain DAL-1 is a human uncultivable pathogen causing the sexually transmitted disease syphilis. Strain DAL-1 was isolated from the amniotic fluid of a pregnant woman in the secondary stage of syphilis. Here we describe the 1,139,971 bp long genome of T. pallidum strain DAL-1 which was sequenced using two independent sequencing methods (454 pyrosequencing and Illumina). In rabbits, strain DAL-1 replicated better than the T. pallidum strain Nichols. The comparison of the complete DAL-1 genome sequence with the Nichols sequence revealed a list of genetic differences that are potentially responsible for the increased rabbit virulence of the DAL-1 strain. PMID:23449808

  2. Complete genome sequencing and comparative genomic analysis of functionally diverse Lysinibacillus sphaericus III(3)7.

    Science.gov (United States)

    Rey, Andrés; Silva-Quintero, Laura; Dussán, Jenny

    2016-09-01

    Lysinibacillus sphaericus III(3)7 is a native Colombian strain, the first one isolated from soil samples. This strain has shown high levels of pathogenic activity against Culex quinquefasciatus larvae in laboratory assays compared to other members of the same species. Using Pacific Biosciences sequencing technology we sequenced, annotated (de novo) and described the genome of strain III(3)7, achieving a complete genome sequence status. We then performed a comparative analysis between the newly sequenced genome and the ones previously reported for Colombian isolates L. sphaericus OT4b.31, CBAM5 and OT4b.25, with the inclusion of L. sphaericus C3-41, which has been used as a reference genome for most previous genome sequencing projects. We concluded that L. sphaericus III(3)7 is highly similar to strain OT4b.25 and shares high levels of synteny with isolates CBAM5 and C3-41. PMID:27419068

  3. Analysis of the bread wheat genome using whole-genome shotgun sequencing

    OpenAIRE

    Brenchley R.; Brenchley, Rachel; Spannagl M.; Spannagl, Manuel; Pfeifer M; Pfeifer, Matthias; Barker, Gary L. A.; Barker G.L.A.; D'Amore R.; D'Amore, Rosalinda; Allen A.M.; Allen, Alexandra M.; McKenzie, Neil; McKenzie N.; Kramer, Melissa

    2012-01-01

    Bread wheat (Triticum aestivum) is a globally important crop, accounting for 20% of the calories consumed by mankind. We sequenced its large and challenging 17 Gb hexaploid genome using 454 pyrosequencing and compared this with the sequences of diploid ancestral and progenitor genomes. Between 94,000 and 96,000 genes were identified, and two-thirds were assigned to the A, B and D genomes. High-resolution synteny maps identified many small disruptions to conserved gene order. We show the h...

  4. Intra-species sequence comparisons for annotating genomes

    Energy Technology Data Exchange (ETDEWEB)

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study has been submitted to GenBank under accession nos. AY667278-AY667407.
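
    The idea of using low within-species variation to flag constrained regions can be illustrated with a toy calculation. This is a simplified proxy and not the statistical model used in the study: it assumes the resequenced intervals are already aligned to equal length, scores each column by the fraction of sequences that differ from the majority base, and reports windows whose mean score falls below a threshold. The function names, the window size and the threshold are all hypothetical.

```python
def column_variability(alignment):
    """Fraction of sequences differing from the majority base at each column."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for i in range(length):
        column = [seq[i] for seq in alignment]
        majority = max(set(column), key=column.count)
        scores.append(1.0 - column.count(majority) / len(column))
    return scores


def low_variation_windows(scores, window=5, threshold=0.05):
    """Start positions of windows whose mean variability is below threshold."""
    starts = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean < threshold:
            starts.append(start)
    return starts
```

    On real data the flagged windows would be candidate constrained sequences, to be checked against coding annotations or tested experimentally, as the abstract describes for the enhancer assays.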

  5. Whole genome sequencing in clinical and public health microbiology

    OpenAIRE

    Kwong, J. C.; McCallum, N; Sintchenko, V.; Howden, B. P.

    2015-01-01

    Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laborat...

  6. Biogeography of the Pistia clade (Araceae): based on chloroplast and mitochondrial DNA sequences and Bayesian divergence time inference.

    Science.gov (United States)

    Renner, Susanne S; Zhang, Li-Bing

    2004-06-01

    Pistia stratiotes (water lettuce) and Lemna (duckweeds) are the only free-floating aquatic Araceae. The geographic origin and phylogenetic placement of these unrelated aroids present long-standing problems because of their highly modified reproductive structures and wide geographical distributions. We sampled chloroplast (trnL-trnF and rpl20-rps12 spacers, trnL intron) and mitochondrial sequences (nad1 b/c intron) for all genera implicated as close relatives of Pistia by morphological, restriction site, and sequencing data, and present a hypothesis about its geographic origin based on the consensus of trees obtained from the combined data, using Bayesian, maximum likelihood, parsimony, and distance analyses. Of the 14 genera closest to Pistia, only Alocasia, Arisaema, and Typhonium are species-rich, and the latter two were studied previously, facilitating the choice of representatives that span the roots of these genera. Results indicate that Pistia and the Seychelles endemic Protarum sechellarum are the basalmost branches in a grade comprising the tribes Colocasieae (Ariopsis, Steudnera, Remusatia, Alocasia, Colocasia), Arisaemateae (Arisaema, Pinellia), and Areae (Arum, Biarum, Dracunculus, Eminium, Helicodiceros, Theriophonum, Typhonium). Unexpectedly, all Areae genera are embedded in Typhonium, which throws new light on the geographic history of Areae. A Bayesian analysis of divergence times that explores the effects of multiple fossil and geological calibration points indicates that the Pistia lineage is 90 to 76 million years (my) old. The oldest fossils of the Pistia clade, though not Pistia itself, are 45-my-old leaves from Germany; the closest outgroup, Peltandreae (comprising a few species in Florida, the Mediterranean, and Madagascar), is known from 60-my-old leaves from Europe, Kazakhstan, North Dakota, and Tennessee. Based on the geographic ranges of close relatives, Pistia likely originated in the Tethys region, with Protarum then surviving on the

  7. Genome sequence and comparative analysis of Avibacterium paragallinarum

    OpenAIRE

    Requena, David; Chumbe, Ana; Torres, Michael; Alzamora, Ofelia; Ramirez, Manuel; Valdivia-Olarte, Hugo; Gutierrez, Andres Hazaet; Izquierdo-Lara, Ray; Saravia, Luis Enrique; Zavaleta, Milagros; Tataje-Lavanda, Luis; Best, Ivan; Fernández-Sánchez, Manolo; Icochea, Eliana; Zimic, Mirko

    2013-01-01

    Background: Avibacterium paragallinarum is the causative agent of infectious coryza, a highly contagious acute respiratory disease of poultry that affects commercial chickens, laying hens and broilers worldwide. Methodology: In this study, we performed the whole genome sequencing, assembly and annotation of a Peruvian isolate of A. paragallinarum. The genome was sequenced on a 454 GS FLX Titanium system. De novo assembly was performed and annotation was completed with GS De Novo Assembler 2.6 ...

  8. Mapping Challenging Mutations by Whole-Genome Sequencing

    OpenAIRE

    Smith, Harold E.; Fabritius, Amy S.; Jaramillo-Lambert, Aimee; Golden, Andy

    2016-01-01

    Whole-genome sequencing provides a rapid and powerful method for identifying mutations on a global scale, and has spurred a renewed enthusiasm for classical genetic screens in model organisms. The most commonly characterized category of mutation consists of monogenic, recessive traits, due to their genetic tractability. Therefore, most of the mapping methods for mutation identification by whole-genome sequencing are directed toward alleles that fulfill those criteria (i.e., single-gene, homoz...

  9. Whole genome and transcriptome sequencing of a B3 thymoma.

    Directory of Open Access Journals (Sweden)

    Iacopo Petrini

    Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma.

  10. Genome Sequence of the Biocontrol Strain Pseudomonas fluorescens F113

    OpenAIRE

    Redondo-Nieto, M.; Barret, M.; Morrisey, J; Germaine, K.; Martínez-Granero, F.; Barahona, E.; Navazo, A.; Sánchez-Contreras, M.; Moynihan, J.; Giddens, S.; Coppoolse, E.; Muriel, C.; Stiekema, W.; Rainey, P; Dowling, D

    2012-01-01

    Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms.

  11. Draft genome sequence of Thermincola potens strain JR

    Energy Technology Data Exchange (ETDEWEB)

    Byrne-Bailey, K.G.; Wrighton, K.C.; Melnyk, R.A.; Agbo, P.; Hazen, T.C.; Coates, J.D.

    2010-07-01

    'Thermincola potens' strain JR is one of the first Gram-positive dissimilatory metal-reducing bacteria (DMRB) for which there is a complete genome sequence. Consistent with the physiology of this organism, preliminary annotation revealed an abundance of multiheme c-type cytochromes that are putatively associated with the periplasm and cell surface in a Gram-positive bacterium. Here we report the complete genome sequence of strain JR.

  12. Genome Sequence of Pantoea agglomerans Strain IG1

    OpenAIRE

    Matsuzawa, Tomohiko; Mori, Kazuki; Kadowaki, Takeshi; Shimada, Misato; Tashiro, Kosuke; Kuhara, Satoru; Inagawa, Hiroyuki; Soma, Gen-Ichiro; Takegawa, Kaoru

    2012-01-01

    Pantoea agglomerans is a Gram-negative bacterium that grows symbiotically with various plants. Here we report the 4.8-Mb genome sequence of P. agglomerans strain IG1. The lipopolysaccharides derived from P. agglomerans IG1 have been shown to be effective in the prevention of various diseases, such as bacterial or viral infections and lifestyle-related diseases. This genome sequence represents a substantial step toward the elucidation of pathways for production of lipopolysaccharides.

  13. Complete Genome Sequence of Pseudomonas aeruginosa Phage AAT-1.

    Science.gov (United States)

    Andrade-Domínguez, Andrés; Kolter, Roberto

    2016-01-01

    Aspects of the interaction between phages and animals are of interest and importance for medical applications. Here, we report the genome sequence of the lytic Pseudomonas phage AAT-1, isolated from mammalian serum. AAT-1 is a double-stranded DNA phage, with a genome of 57,599 bp, containing 76 predicted open reading frames. PMID:27563032

  14. Draft Genome Sequence of Avibacterium paragallinarum Strain 221

    OpenAIRE

    Xu, Fuzhou; Miao, Deyuan; Du, Yu; Chen, Xiaoling; Zhang, Peijun; Sun, Huiling

    2013-01-01

    Avibacterium paragallinarum is the causative agent of infectious coryza. Here we report the draft genome sequence of reference strain 221 of A. paragallinarum serovar A. The genome is composed of 135 contigs for 2,685,568 bp with a 41% G+C content.
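    As a rough illustration of the assembly figures above, the mean contig length follows directly from the reported totals (135 contigs, 2,685,568 bp). The sketch below is a hypothetical helper, not part of the cited work; the function name is an assumption. It says nothing about the contig length distribution, which the abstract does not report.

```python
def mean_contig_length(total_bp, n_contigs):
    """Average contig length in bp, from total assembly size and contig count."""
    if n_contigs <= 0:
        raise ValueError("n_contigs must be positive")
    return total_bp / n_contigs


# Totals reported in the abstract for A. paragallinarum strain 221.
avg = mean_contig_length(2_685_568, 135)
print(f"mean contig length: {avg:,.0f} bp")
```

    A mean is sensitive to many small contigs, so it is usually read alongside a length-weighted statistic when one is available.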

  15. Draft Genome Sequence of Amycolatopsis decaplanina Strain DSM 44594T

    OpenAIRE

    Kaur, Navjot; Kumar, Shailesh; Bala, Monu; Raghava, Gajendra Pal Singh; Mayilraj, Shanmugam

    2013-01-01

    We report the 8.5-Mb genome sequence of Amycolatopsis decaplanina strain DSM 44594T, isolated from a soil sample from India. The draft genome of strain DSM 44594T consists of 8,533,276 bp with a 68.6% G+C content, 7,899 protein-coding genes, and 57 RNAs.

  16. Complete genome sequence of Aeromonas hydrophila AL06-06

    Science.gov (United States)

    Aeromonas hydrophila occurs in freshwater environments and infects fish and mammals. In this work, we report the complete genome sequence of Aeromonas hydrophila AL06-06, which was isolated from diseased goldfish and is being used for comparative genomic studies with A. hydrophila strains causing ba...

  17. A snapshot of the emerging tomato genome sequence

    NARCIS (Netherlands)

    Mueller, L.A.; Klein Lankhorst, R.M.; Tanksley, S.D.; Peters, R.M.; Staveren, van M.J.; Datema, E.; Fiers, M.W.E.J.; Ham, van R.C.H.J.; Szinay, D.; Jong, de J.H.S.G.M.

    2009-01-01

    The genome of tomato (Solanum lycopersicum L.) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy, and the United States) as part of the larger “International Solanaceae Genome Project (SOL): System

  18. Draft Genome Sequence of Rhodococcus sp. Strain 311R

    Science.gov (United States)

    Ehsani, Elham; Jauregui, Ruy; Geffers, Robert; Jareck, Michael; Boon, Nico; Pieper, Dietmar H.

    2015-01-01

    Here, we report the draft genome sequence of Rhodococcus sp. strain 311R, which was isolated from a site contaminated with alkanes and aromatic compounds. Strain 311R shares 90% of the genome of Rhodococcus erythropolis SK121, which is the most closely related bacterium. PMID:25999565

  19. Whole-Genome Sequences of Three Symbiotic Endozoicomonas Bacteria

    KAUST Repository

    Neave, Matthew J.

    2014-08-14

    Members of the genus Endozoicomonas associate with a wide range of marine organisms. Here, we report on the whole-genome sequencing, assembly, and annotation of three Endozoicomonas type strains. These data will assist in exploring interactions between Endozoicomonas organisms and their hosts, and will aid in the assembly of genomes from uncultivated Endozoicomonas spp.

  20. Complete Genome Sequence of Pediococcus pentosaceus Strain SL4

    DEFF Research Database (Denmark)

    Dantoft, Shruti Harnal; Bielak, Eliza Maria; Seo, Jae-Gu;

    2013-01-01

    Pediococcus pentosaceus SL4 was isolated from a Korean fermented vegetable product, kimchi. We report here the whole-genome sequence (WGS) of P. pentosaceus SL4. The genome consists of a 1.79-Mb circular chromosome (G+C content of 37.3%) and seven distinct plasmids ranging in size from 4 kb to 50...

  1. Genome Sequence of Chinese Porcine Parvovirus Strain PPV2010

    OpenAIRE

    Cui, Jin; Wang, Xin; Ren, Yudong; Cui, Shangjin; Li, Guangxing; Ren, Xiaofeng

    2012-01-01

    Porcine parvovirus (PPV) isolate PPV2010 has recently emerged in China. Herein, we analyze the complete genome sequence of PPV2010. Our results indicate that the genome of PPV2010 bears mixed characteristics of virulent PPV and vaccine strains. Importantly, PPV2010 has the potential to be a naturally attenuated candidate vaccine strain.

  2. The tomato genome sequence provides insight into fleshy fruit evolution

    Science.gov (United States)

    The genome of the inbred tomato cultivar ‘Heinz 1706’ was sequenced and assembled using a combination of Sanger and “next generation” technologies. The predicted genome size is ~900 Mb, consistent with prior estimates, of which 760 Mb were assembled in 91 scaffolds aligned to the 12 tomato chromosom...

  3. Genome sequence of the cultivated cotton Gossypium arboreum

    Science.gov (United States)

    Cotton is one of the most economically important natural fiber crops in the world, and the complex tetraploid nature of its genome (AADD, 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled 98.3% of the 1.7-gigabase G. arboreum (AA, 2n = 26...

  4. Complete Genome Sequence of Bacillus thuringiensis Bacteriophage Smudge

    Science.gov (United States)

    Cornell, Jessica L.; Breslin, Eileen; Schuhmacher, Zachary; Himelright, Madison; Berluti, Cassandra; Boyd, Charles; Carson, Rachel; Del Gallo, Elle; Giessler, Caris; Gilliam, Benjamin; Heatherly, Catherine; Nevin, Julius; Nguyen, Bryan; Nguyen, Justin; Parada, Jocelyn; Sutterfield, Blake; Tukruni, Muruj

    2016-01-01

    Smudge, a bacteriophage enriched from soil using Bacillus thuringiensis DSM-350 as the host, had its complete genome sequenced. Smudge is a myovirus with a genome consisting of 292 genes and was identified as belonging to the C1 cluster of Bacillus phages. PMID:27540049

  5. Complete Genome Sequence of Mycobacterium bovis Strain BCG-1 (Russia).

    Science.gov (United States)

    Sotnikova, Evgeniya A; Shitikov, Egor A; Malakhova, Maja V; Kostryukova, Elena S; Ilina, Elena N; Atrasheuskaya, Alena V; Ignatyev, Georgy M; Vinokurova, Nataliya V; Gorbachyov, Vyacheslav Y

    2016-01-01

    Mycobacterium bovis BCG (Bacille Calmette-Guérin) is a vaccine strain used for protection against tuberculosis. Here, we announce the complete genome sequence of M. bovis strain BCG-1 (Russia). Extensive use of this strain necessitates the study of its genome stability by comparative analysis. PMID:27034492

  6. Complete genome sequence of Campylobacter gracilis ATCC 33236T

    Science.gov (United States)

    The human oral pathogen Campylobacter gracilis has been isolated from periodontal and endodontal infections, and also from non-oral head, neck or lung infections. This study describes the whole-genome sequence of the human periodontal isolate ATCC 33236T (=FDC 1084), which is the first closed genome...

  7. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis

    NARCIS (Netherlands)

    Carlton, Jane M.; Hirt, Robert P.; Silva, Joana C.; Delcher, Arthur L.; Schatz, Michael; Zhao, Qi; Wortman, Jennifer R.; Bidwell, Shelby L.; Alsmark, U. Cecilia M.; Besteiro, Sebastien; Sicheritz-Ponten, Thomas; Noel, Christophe J.; Dacks, Joel B.; Foster, Peter G.; Simillion, Cedric; Van de Peer, Yves; Miranda-Saavedra, Diego; Barton, Geoffrey J.; Westrop, Gareth D.; Mueller, Sylke; Dessi, Daniele; Fiori, Pier Luigi; Ren, Qinghu; Paulsen, Ian; Zhang, Hanbang; Bastida-Corcuera, Felix D.; Simoes-Barbosa, Augusto; Brown, Mark T.; Hayes, Richard D.; Mukherjee, Mandira; Okumura, Cheryl Y.; Schneider, Rachel; Smith, Alias J.; Vanacova, Stepanka; Villalvazo, Maria; Haas, Brian J.; Pertea, Mihaela; Feldblyum, Tamara V.; Utterback, Terry R.; Shu, Chung-Li; Osoegawa, Kazutoyo; de Jong, Pieter J.; Hrdy, Ivan; Horvathova, Lenka; Zubacova, Zuzana; Dolezal, Pavel; Malik, Shehre-Banoo; Logsdon, John M.; Henze, Katrin; Gupta, Arti; Wang, Ching C.; Dunne, Rebecca L.; Upcroft, Jacqueline A.; Upcroft, Peter; White, Owen; Salzberg, Steven L.; Tang, Petrus; Chiu, Cheng-Hsun; Lee, Ying-Shiung; Embley, T. Martin; Coombs, Graham H.; Mottram, Jeremy C.; Tachezy, Jan; Fraser-Liggett, Claire M.; Johnson, Patricia J.

    2007-01-01

    We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the ~160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunction wi

  8. Complete Genome Sequence of Bacillus thuringiensis Strain 407 Cry-

    OpenAIRE

    Poehlein, Anja; Liesegang, Heiko

    2013-01-01

    Bacillus thuringiensis is an insect pathogen that has been used widely as a biopesticide. Here, we report the genome sequence of strain 407 Cry-, which is used to study the genetic determinants of pathogenicity. The genome consists of a 5.5-Mb chromosome and nine plasmids, including a novel 502-kb megaplasmid.

  9. Complete Genome Sequence of Cyanobacterial Siphovirus KBS2A.

    Science.gov (United States)

    Ponsero, Alise J; Chen, Feng; Lennon, Jay T; Wilhelm, Steven W

    2013-01-01

    We present the genome of a cyanosiphovirus (KBS2A) that infects a marine Synechococcus sp. (strain WH7803). Unique to this genome, relative to other sequenced cyanosiphoviruses, is the absence of elements associated with integration into the host chromosome, suggesting this virus may not be able to establish a lysogenic relationship. PMID:23969045

  10. Complete Genome Sequence of Cyanobacterial Siphovirus KBS2A

    OpenAIRE

    Ponsero, Alise J.; Chen, Feng; Lennon, Jay T.; Wilhelm, Steven W.

    2013-01-01

    We present the genome of a cyanosiphovirus (KBS2A) that infects a marine Synechococcus sp. (strain WH7803). Unique to this genome, relative to other sequenced cyanosiphoviruses, is the absence of elements associated with integration into the host chromosome, suggesting this virus may not be able to establish a lysogenic relationship.

  11. Complete Genome Sequence of Bacillus thuringiensis Bacteriophage Smudge.

    Science.gov (United States)

    Cornell, Jessica L; Breslin, Eileen; Schuhmacher, Zachary; Himelright, Madison; Berluti, Cassandra; Boyd, Charles; Carson, Rachel; Del Gallo, Elle; Giessler, Caris; Gilliam, Benjamin; Heatherly, Catherine; Nevin, Julius; Nguyen, Bryan; Nguyen, Justin; Parada, Jocelyn; Sutterfield, Blake; Tukruni, Muruj; Temple, Louise

    2016-01-01

    Smudge, a bacteriophage enriched from soil using Bacillus thuringiensis DSM-350 as the host, had its complete genome sequenced. Smudge is a myovirus with a genome consisting of 292 genes and was identified as belonging to the C1 cluster of Bacillus phages. PMID:27540049

  12. Complete Genome Sequence of Cyanobacterium Leptolyngbya sp. NIES-3755

    Science.gov (United States)

    Fujisawa, Takatomo; Ohtsubo, Yoshiyuki; Katayama, Mitsunori; Misawa, Naomi; Wakazuki, Sachiko; Shimura, Yohei; Nakamura, Yasukazu; Kawachi, Masanobu; Yoshikawa, Hirofumi; Eki, Toshihiko

    2016-01-01

    Cyanobacterial genus Leptolyngbya comprises genetically diverse species, but the availability of their complete genome information is limited. Here, we isolated Leptolyngbya sp. strain NIES-3755 from soil at the Toyohashi University of Technology, Japan. We determined the complete genome sequence of the NIES-3755 strain, which is composed of one chromosome and three plasmids. PMID:26988037

  13. Finished Genome Sequence of Collimonas arenae Cal35

    NARCIS (Netherlands)

    Wu, Je-Jia; de Jager, Victor; Deng, Wen-ling; Leveau, Johan

    2015-01-01

    We announce the finished genome sequence of soil forest isolate Collimonas arenae Cal35, which comprises a 5.6-Mbp chromosome and 41-kb plasmid. The Cal35 genome is the second one published for the bacterial genus Collimonas and represents the first opportunity for high-resolution comparison of geno

  14. Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.

    Science.gov (United States)

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb, with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021
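    The N50 length quoted above is a length-weighted summary of an assembly: it is the largest length L such that sequences of length L or longer together cover at least half of the total assembled bases. A minimal sketch of that calculation follows, assuming only a list of sequence lengths; the function name and the example lengths are hypothetical and are not taken from the FANhybrid_r1.2 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # half of all bases reached
            return length


# Hypothetical contig lengths, for illustration only.
example_lengths = [8000, 6000, 5137, 3000, 2000, 1000]
print(n50(example_lengths))
```

    Because the statistic is weighted by length, a few long sequences raise it more than many short ones do, which is why it is often reported next to total assembly length.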

  15. Comparative genomics beyond sequence-based alignments

    DEFF Research Database (Denmark)

    Þórarinsson, Elfar; Yao, Zizhen; Wiklund, Eric D.;

    2008-01-01

    Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure--frequent compensating base changes--is increasingly likely to cause sequence-based alignment me...

  16. First fungal genome sequence from Africa: A preliminary analysis

    Directory of Open Access Journals (Sweden)

    Rene Sutherland

    2012-01-01

    Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the 'tsunami' of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is less than that of the related species, Fusarium oxysporum, but more than that for Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumosin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists

  17. Enabling technologies of genomic-scale sequence enrichment for targeted high-throughput sequencing

    OpenAIRE

    Summerer, Daniel

    2009-01-01

    Next-generation sequencing has still not reached its full potential due to the technical inability of effectively targeting desired genomic regions of interest. Once available, methods addressing this bottleneck will dramatically reduce cost and enable the efficient analysis of complex samples. Recently, a number of possible approaches for genomic-scale sequence enrichment have been reported using different strategies. All methods basically rely on sequence-specific nucleic acid hybridization,...

  18. Ancient Human Genome Sequence of an Extinct Palaeo-Eskimo

    DEFF Research Database (Denmark)

    Rasmussen, Morten; Li, Yingrui; Lindgreen, Stinus;

    2010-01-01

    We report here the genome sequence of an ancient human. Obtained from approximately 4,000-year-old permafrost-preserved hair, the genome represents a male individual from the first known culture to settle in Greenland. Sequenced to an average depth of 20x, we recover 79% of the diploid genome, an...... possible phenotypic characteristics of the individual that belonged to a culture whose location has yielded only trace human remains. We compare the high-confidence SNPs to those of contemporary populations to find the populations most closely related to the individual. This provides evidence for a...

  19. Complete mitochondrial genome sequence of Aoluguya reindeer (Rangifer tarandus).

    Science.gov (United States)

    Ju, Yan; Liu, Huamiao; Rong, Min; Yang, Yifeng; Wei, Haijun; Shao, Yuanchen; Chen, Xiumin; Xing, Xiumei

    2016-05-01

    The complete mitochondrial genome of the reindeer, Rangifer tarandus, was determined by accurate polymerase chain reaction. The entire genome is 16,357 bp in length and contains 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a D-loop region, all of which are arranged in a typical vertebrate manner. The overall base composition of the reindeer's mitochondrial genome is 33.7% A, 23.1% C, 30.1% T and 13.2% G. A termination-associated sequence and several conserved central sequence block domains were discovered within the control region. PMID:25469816
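    Base composition figures like those above are simple per-base percentages over the sequence. The sketch below shows one way to compute them and to derive G+C content from the reported values; the function name and the toy sequence are hypothetical, and ambiguous bases such as N are assumed to be excluded from the denominator. The reported percentages sum to 100.1%, which is consistent with independent rounding to one decimal place.

```python
from collections import Counter


def base_composition(seq):
    """Percentage of A, C, G and T in a DNA sequence, ignoring other symbols."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


# Values reported in the abstract for the reindeer mitochondrial genome.
reported = {"A": 33.7, "C": 23.1, "T": 30.1, "G": 13.2}
gc_content = round(reported["G"] + reported["C"], 1)
reported_sum = round(sum(reported.values()), 1)

print(base_composition("AACGTN"))   # toy sequence, not real data
print(gc_content, reported_sum)
```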

  20. Monitoring genomic sequences during SELEX using high-throughput sequencing: neutral SELEX.

    Directory of Open Access Journals (Sweden)

    Bob Zimmermann

    BACKGROUND: SELEX is a well established in vitro selection tool to analyze the structure of ligand-binding nucleic acid sequences called aptamers. Genomic SELEX transforms SELEX into a tool to discover novel, genomically encoded RNA or DNA sequences binding a ligand of interest, called genomic aptamers. Concerns have been raised regarding requirements imposed on RNA sequences undergoing SELEX selection. METHODOLOGY/PRINCIPAL FINDINGS: To evaluate SELEX and assess the extent of these effects, we designed and performed a Neutral SELEX experiment omitting the selection step, such that the sequences are under the sole selective pressure of SELEX's amplification steps. Using high-throughput sequencing, we obtained thousands of full-length sequences from the initial genomic library and the pools after each of the 10 rounds of Neutral SELEX. We compared these to sequences obtained from a Genomic SELEX experiment deriving from the same initial library, but screening for RNAs binding with high affinity to the E. coli regulator protein Hfq. With each round of Neutral SELEX, sequences became less stable and changed in nucleotide content, but no sequences were enriched. In contrast, we detected substantial enrichment in the Hfq-selected set with enriched sequences having structural stability similar to the neutral sequences but with significantly different nucleotide selection. CONCLUSIONS/SIGNIFICANCE: Our data indicate that positive selection in SELEX acts independently of the neutral selective requirements imposed on the sequences. We conclude that Genomic SELEX, when combined with high-throughput sequencing of positively and neutrally selected pools, as well as the genomic library, is a powerful method to identify genomic aptamers.

  1. Genetic Analysis of Chloroplast Translation

    Energy Technology Data Exchange (ETDEWEB)

    Barkan, Alice

    2005-08-15

    The assembly of the photosynthetic apparatus requires the concerted action of hundreds of genes distributed between the two physically separate genomes in the nucleus and chloroplast. Nuclear genes coordinate this process by controlling the expression of chloroplast genes in response to developmental and environmental cues. However, few regulatory factors have been identified. We used mutant phenotypes to identify nuclear genes in maize that modulate chloroplast translation, a key control point in chloroplast gene expression. This project focused on the nuclear gene crp1, required for the translation of two chloroplast mRNAs. CRP1 is related to fungal proteins involved in the translation of mitochondrial mRNAs, and is the founding member of a large gene family in plants, with ~450 members. Members of the CRP1 family are defined by a repeated 35 amino acid motif called a "PPR" motif. The PPR motif is closely related to the TPR motif, which mediates protein-protein interactions. We and others have speculated that PPR tracts adopt a structure similar to that of TPR tracts, but with a substrate binding surface adapted to bind RNA instead of protein. To understand how CRP1 influences the translation of specific chloroplast mRNAs, we sought proteins that interact with CRP1, and identified the RNAs associated with CRP1 in vivo. We showed that CRP1 is associated in vivo with the mRNAs whose translation it activates. To explore the functions of PPR proteins more generally, we sought mutations in other PPR-encoding genes: mutations in the maize PPR2 and PPR4 were shown to disrupt chloroplast ribosome biogenesis and chloroplast trans-splicing, respectively. These and other results suggest that the nuclear-encoded PPR family plays a major role in modulating the expression of the chloroplast genome in higher plants.

  2. Specialized microbial databases for inductive exploration of microbial genome sequences

    Directory of Open Access Journals (Sweden)

    Cabau Cédric

    2005-02-01

    Background: The enormous amount of genome sequence data calls for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods: The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subqueries, have been implemented. Results: Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore http://bioinfo.hku.hk/genochore.html, a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya), has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition, they provide a weekly update of searches against the world-wide protein sequence data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion: This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tengcongensis; LeptoList, with two different genomes of Leptospira interrogans; and SepiList, Staphylococcus epidermidis) associated with related organisms for comparison.

  3. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    Energy Technology Data Exchange (ETDEWEB)

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  4. Sequencing and comparative analyses of the genomes of zoysiagrasses.

    Science.gov (United States)

    Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei

    2016-04-01

    Zoysia is a warm-season turfgrass, which comprises 11 allotetraploid species (2n = 4x = 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession 'Nagirizaki' (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella 'Wakaba' and Z. pacifica 'Zanpa' were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica 'Kyoto', Z. japonica 'Miyagi' and Z. matrella 'Chiba Fair Green', were accumulated, and aligned against the reference genome of 'Nagirizaki' along with those from 'Wakaba' and 'Zanpa'. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the 'Zoysia Genome Database' at http://zoysia.kazusa.or.jp. PMID:26975196

  5. Human genome and genetic sequencing research and informed consent

    International Nuclear Information System (INIS)

    On March 29, 2001, the Ethical Guidelines for Human Genome and Genetic Sequencing Research were established. They have intended to serve as ethical guidelines for all human genome and genetic sequencing research practice, for the purpose of upholding respect for human dignity and rights and enforcing use of proper methods in the pursuit of human genome and genetic sequencing research, with the understanding and cooperation of the public. The RadGenomics Project has prepared a research protocol and informed consent document that follow these ethical guidelines. We have endeavored to protect the privacy of individual information, and have established a procedure for examination of research practices by an ethics committee. Here we report our procedure in order to offer this concept to the patients. (authors)

  6. Open access to sequence: Browsing the Pichia pastoris genome

    Directory of Open Access Journals (Sweden)

    Graf Alexandra

    2009-10-01

    The first genome sequences of the important yeast protein production host Pichia pastoris have been released into the public domain this spring. In order to provide the scientific community easy and versatile access to the sequence, two websites have been set up as a resource for genomic sequence, gene and protein information for P. pastoris: a GBrowse-based genome browser at http://www.pichiagenome.org and a genome portal with gene annotation and browsing functionality at http://bioinformatics.psb.ugent.be/webtools/bogas. Both websites offer information on gene annotation and function, regulation and structure. In addition, a wiki-based platform allows all users to create additional information on genes, proteins, physiology and other items of P. pastoris research, so that the Pichia community can benefit from exchange of knowledge, data and materials.

  7. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    DEFF Research Database (Denmark)

    Have, Christian Theil; Mørk, Søren

    We introduce a new type of probabilistic sequence model that models the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome-level -- effectively producing a sequential genome annotation as...... output. The model can be used to obtain the most probable genome annotation based on a combination of i: a gene finder score of each gene candidate and ii: the sequence of the reading frames of gene candidates through a genome. The model --- as well as a higher order variant --- is developed and tested...... using the probabilistic logic programming language and machine learning system PRISM - a fast and efficient model prototyping environment, using bacterial gene finding performance as a benchmark of signal strength. The model is used to prune a set of gene predictions from an underlying gene finder and...

  8. Complete Genome Sequence of the Alfalfa latent virus.

    Science.gov (United States)

    Nemchinov, Lev G; Shao, Jonathan; Postnikova, Olga A

    2015-01-01

    The first complete genome sequence of the Alfalfa latent carlavirus (ALV) was obtained by primer walking and Illumina RNA sequencing. The virus differs substantially from the Czech ALV isolate and the Pea streak virus isolate from Wisconsin. The absence of a clear nucleic acid-binding protein indicates ALV divergence from other carlaviruses. PMID:25883281

  9. Draft Genome Sequence of Biocontrol Agent Bacillus cereus UW85.

    Science.gov (United States)

    Lozano, Gabriel L; Holt, Jonathan; Ravel, Jacques; Rasko, David A; Thomas, Michael G; Handelsman, Jo

    2016-01-01

    Bacillus cereus UW85 was isolated from a root of a field-grown alfalfa plant from Arlington, WI, and identified for its ability to suppress damping off, a disease caused by Phytophthora megasperma f. sp. medicaginis on alfalfa. Here, we report the draft genome sequence of B. cereus UW85, obtained by a combination of Sanger and Illumina sequencing. PMID:27587823

  10. Genome sequence of Stachybotrys chartarum Strain 51-11

    Science.gov (United States)

    Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina Hiseq 2000 and PacBio long read technology. Since Stachybotrys chartarum has been implicated in health impacts within water-damaged buildings, any information extracted from the geno...

  11. Complete Genome Sequence of Vibrio alginolyticus ZJ-T.

    Science.gov (United States)

    Deng, Yiqin; Chen, Chang; Zhao, Zhe; Huang, Xiaochun; Yang, Yiying; Ding, Xiongqi

    2016-01-01

    Vibrio alginolyticus is a ubiquitous Gram-negative bacterium that is normally distributed in coastal and estuarine environments. It has been suggested to be an opportunistic pathogen of both marine animals and humans. Here, the complete genome sequence of V. alginolyticus ZJ-T was determined by Illumina high-throughput sequencing. PMID:27587824

  12. Sequencing and analysis of an Irish human genome.

    LENUS (Irish Health Repository)

    Tong, Pin

    2010-01-01

    Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.

  13. Draft Genome Sequence of Type Strain Streptococcus gordonii ATCC 10558

    DEFF Research Database (Denmark)

    Rasmussen, Louise Hesselbjerg; Dargis, Rimtas; Christensen, Jens Jørgen Elmer;

    2016-01-01

    Streptococcus gordonii ATCC 10558T was isolated from a patient with infective endocarditis in 1946 and announced as a type strain in 1989. Here, we report the 2,154,510-bp draft genome sequence of S. gordonii ATCC 10558T. This sequence will contribute to knowledge about the pathogenesis of...

  14. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Directory of Open Access Journals (Sweden)

    Wang Jintu

    2011-04-01

    Background: Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result: To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on the zebrafish genome, which demonstrated the high similarity between zebrafish and common carp genomes, indicating the feasibility of comparative mapping between zebrafish and common carp once a physical map of common carp is available. Conclusion: BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3
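    The coverage figures quoted in the record above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the variable names are illustrative, and the implied genome size is an inference from the reported total and the 2.5% fraction, not a value given in the record.

```python
# Figures quoted in the abstract above.
bac_clones = 40_224        # BAC clones sequenced on both ends
clean_bes = 65_720         # clean BAC end sequences (BES) retained
mean_read_bp = 647         # average read length after processing
total_bp = 42_522_168      # total BES length reported
genome_fraction = 0.025    # "2.5% of common carp genome"

# Count times the (rounded) mean length; close to the reported total,
# but not identical because the mean is rounded to whole base pairs.
approx_total_bp = clean_bes * mean_read_bp

# Share of the attempted end reads (two per clone) that survived cleaning.
retained_share = clean_bes / (2 * bac_clones)

# Genome size implied by the reported total and fraction (an inference).
implied_genome_bp = total_bp / genome_fraction

print(approx_total_bp)
print(f"{retained_share:.1%} of end reads retained")
print(f"implied genome size: {implied_genome_bp / 1e9:.2f} Gb")
```

    Under these assumptions the product of count and mean length differs from the reported total by roughly 1,300 bp, which is consistent with rounding of the average read length.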

  15. Genomic Sequencing of Single Microbial Cells from Environmental Samples

    Energy Technology Data Exchange (ETDEWEB)

    Ishoey, Thomas; Woyke, Tanja; Stepanauskas, Ramunas; Novotny, Mark; Lasken, Roger S.

    2008-02-01

    Recently developed techniques allow genomic DNA sequencing from single microbial cells [Lasken RS: Single-cell genomic sequencing using multiple displacement amplification, Curr Opin Microbiol 2007, 10:510-516]. Here, we focus on research strategies for putting these methods into practice in the laboratory setting. An immediate consequence of single-cell sequencing is that it provides an alternative to culturing organisms as a prerequisite for genomic sequencing. The microgram amounts of DNA required as template are amplified from a single bacterium by a method called multiple displacement amplification (MDA) avoiding the need to grow cells. The ability to sequence DNA from individual cells will likely have an immense impact on microbiology considering the vast numbers of novel organisms, which have been inaccessible unless culture-independent methods could be used. However, special approaches have been necessary to work with amplified DNA. MDA may not recover the entire genome from the single copy present in most bacteria. Also, some sequence rearrangements can occur during the DNA amplification reaction. Over the past two years many research groups have begun to use MDA, and some practical approaches to single-cell sequencing have been developed. We review the consensus that is emerging on optimum methods, reliability of amplified template, and the proper interpretation of 'composite' genomes which result from the necessity of combining data from several single-cell MDA reactions in order to complete the assembly. Preferred laboratory methods are considered on the basis of experience at several large sequencing centers where >70% of genomes are now often recovered from single cells. Methods are reviewed for preparation of bacterial fractions from environmental samples, single-cell isolation, DNA amplification by MDA, and DNA sequencing.

  16. Draft Genome Sequence of Neisseria gonorrhoeae Sequence Type 1407, a Multidrug-Resistant Clinical Isolate.

    Science.gov (United States)

    Anselmo, A; Ciammaruconi, A; Carannante, A; Neri, A; Fazio, C; Fortunato, A; Palozzi, A M; Vacca, P; Fillo, S; Lista, F; Stefanelli, P

    2015-01-01

    Gonorrhea may become untreatable due to the spread of resistant or multidrug-resistant strains. Cefixime-resistant gonococci belonging to sequence type 1407 have been described worldwide. We report the genome sequence of Neisseria gonorrhoeae strain G2891, a multidrug-resistant isolate of sequence type 1407, collected in Italy in 2013. PMID:26272575

  17. Comparison of two Next Generation sequencing platforms for full genome sequencing of Classical Swine Fever Virus

    DEFF Research Database (Denmark)

    Fahnøe, Ulrik; Pedersen, Anders Gorm; Höper, Dirk;

    2013-01-01

    Next Generation Sequencing (NGS) is increasingly being adopted in viral research and will be the preferred technology in the years to come. We have recently sequenced several strains of Classical Swine Fever Virus (CSFV) by NGS on both Genome Sequencer FLX (GS FLX) and Ion Torrent PGM platforms. In...

  18. Chemical rationale for selection of isolates for genome sequencing

    DEFF Research Database (Denmark)

    Rank, Christian; Larsen, Thomas Ostenfeld; Frisvad, Jens Christian

    The advances in gene sequencing will in the near future enable researchers to affordably acquire the full genomes of handpicked isolates. We here present a method to evaluate the chemical potential of an entire species and select representatives for genome sequencing. The selection criteria for new...... strains to be sequenced can be manifold, but for studying the functional phenotype, a metabolome-based approach offers a cheap and rapid assessment of critical strains to cover the chemical diversity. We have applied this methodology to the complex A. flavus/A. oryzae group. Though these two species...

  19. Complete genome sequence of Treponema pallidum, the syphilis spirochete.

    Science.gov (United States)

    Fraser, C M; Norris, S J; Weinstock, G M; White, O; Sutton, G G; Dodson, R; Gwinn, M; Hickey, E K; Clayton, R; Ketchum, K A; Sodergren, E; Hardham, J M; McLeod, M P; Salzberg, S; Peterson, J; Khalak, H; Richardson, D; Howell, J K; Chidambaram, M; Utterback, T; McDonald, L; Artiach, P; Bowman, C; Cotton, M D; Fujii, C; Garland, S; Hatch, B; Horst, K; Roberts, K; Sandusky, M; Weidman, J; Smith, H O; Venter, J C

    1998-07-17

    The complete genome sequence of Treponema pallidum was determined and shown to be 1,138,006 base pairs containing 1041 predicted coding sequences (open reading frames). Systems for DNA replication, transcription, translation, and repair are intact, but catabolic and biosynthetic activities are minimized. The number of identifiable transporters is small, and no phosphoenolpyruvate:phosphotransferase carbohydrate transporters were found. Potential virulence factors include a family of 12 potential membrane proteins and several putative hemolysins. Comparison of the T. pallidum genome sequence with that of another pathogenic spirochete, Borrelia burgdorferi, the agent of Lyme disease, identified unique and common genes and substantiates the considerable diversity observed among pathogenic spirochetes. PMID:9665876

  20. Pig genome sequence - analysis and publication strategy

    DEFF Research Database (Denmark)

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol;

    2010-01-01

    preferentially selected for sequencing. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement the data have been released into public sequence repositories (Genbank/EMBL, NCBI/Ensembl trace repositories) in a timely manner and in advance of publication. CONCLUSIONS...

  1. DNA sequencing leads to genomics progress in China

    Institute of Scientific and Technical Information of China (English)

    WU JiaYan; XIAO JingFa; ZHANG RuoSi; YU Jun

    2011-01-01

    1 Science in the large-scale sequencing era Ten years ago, the first draft sequence assembly of the human genome was completed [1], bringing biomedical research one step closer toward the goal of revolutionizing diagnosis, prevention, and treatment of human diseases. Recently, journalists from the journal Nature surveyed more than 1000 life scientists regarding this laudable aim [2], obtaining substantially negative responses [3]. However, almost all of those surveyed had been influenced, in one way or another, by the availability of the human genome sequence, and they also agreed with the notion that the "sequence is the start." The complexity of genome biology and almost every aspect of human biology is far greater than previously thought [4].

  2. Variable copy number DNA sequences in rice.

    Science.gov (United States)

    Kikuchi, S; Takaiwa, F; Oono, K

    1987-12-01

    We have cloned two types of variable copy number DNA sequences from the rice embryo genome. One of these sequences, which was cloned in pRB301, was amplified about 50-fold during callus formation and diminished in copy number to the embryonic level during regeneration. The other clone, named pRB401, showed the reciprocal pattern. The copy numbers of both sequences were changed even in the early developmental stage and eliminated from nuclear DNA along with growth of the plant. Sequencing analysis of the pRB301 insert revealed some open reading frames and direct repeat structures, but corresponding sequences were not identified in the EMBL and LASL DNA databases. Sequencing of the nuclear genomic fragment cloned in pRB401 revealed the presence of the 3'rps12-rps7 region of rice chloroplast DNA. Our observations suggest that during callus formation (dedifferentiation), regeneration and the growth process the copy numbers of some DNA sequences are variable and that nuclear integrated chloroplast DNA acts as a variable copy number sequence in the rice genome. Based on data showing a common sequence in mitochondria and chloroplast DNA of maize (Stern and Lonsdale 1982) and that the rps12 gene of tobacco chloroplast DNA is a divided gene (Torazawa et al. 1986), it is suggested that the sequence on the inverted repeat structure of chloroplast DNA may have the character of a movable genetic element. PMID:3481021

  3. Genomic multiple sequence alignments: refinement using a genetic algorithm

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2005-08-01

    Background: Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics – the practice of comparing genomic sequences from different species – plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To offset the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. Results: We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as to refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. Overall sequence identity increased only

  4. Molecular epidemiology of dengue viruses from complete genome sequences

    OpenAIRE

    Ong, Swee Hoe

    2010-01-01

    The availability of the complete genetic blueprint of the dengue virus is essential in molecular epidemiological studies to uncover the role of the virus in dengue pathogenesis. During the course of this project, over two hundred complete genomes of the dengue virus were generated from clinical samples collected in three dengue-endemic Southeast Asian countries. In addition, a bioinformatics platform integrating a sequence database, sequence retrieval tools, sequence annotation data and a var...

  5. Information-theoretic View of Sequence Organization in a Genome

    OpenAIRE

    Luo, Liaofu; Gao, Yang; Lu, Jun

    2010-01-01

    Sequence organizations are viewed from two points: one is informational redundancy or informational correlation (IC) and the other is k-mer frequency statistics. Two problems are investigated. The first is how the ICs exceed the fluctuation bound and order emerges from fluctuation in a genome when the sequence length attains some critical value. We demonstrated that the transition from fluctuation to order takes place at a sequence length of about 200-300 thousand bases for human and ...

  6. Physical map-assisted whole-genome shotgun sequence assemblies

    OpenAIRE

    Warren, René L.; Varabei, Dmitry; Platt, Darren; Huang, Xiaoqiu; Messina, David; Yang, Shiaw-Pyng; Kronstad, James W.; Krzywinski, Martin; Warren, Wesley C; Wallis, John W.; Hillier, LaDeana W.; Chinwalla, Asif T.; Schein, Jacqueline E.; Siddiqui, Asim S.; Marra, Marco A.

    2006-01-01

    We describe a targeted approach to improve the contiguity of whole-genome shotgun sequence (WGS) assemblies at run-time, using information from Bacterial Artificial Chromosome (BAC)-based physical maps. Clone sizes and overlaps derived from clone fingerprints are used for the calculation of length constraints between any two BAC neighbors sharing 40% of their size. These constraints are used to promote the linkage and guide the arrangement of sequence contigs within a sequence scaffold at the...

  7. Draft Genome Sequences of Two Virulent Serotypes of Avian Pasteurella multocida

    OpenAIRE

    Abrahante, Juan E.; Johnson, Timothy J.; Hunter, Samuel S.; Maheswaran, Samuel K.; Hauglund, Melissa J.; Bayles, Darrell O.; Tatum, Fred M.; Briggs, Robert E.

    2013-01-01

    Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent P. multocida strain Pm70.

  8. Mitochondrial genome sequencing helps show the evolutionary mechanism of mitochondrial genome formation in Brassica

    Directory of Open Access Journals (Sweden)

    Yan Jiyong

    2011-10-01

    Background: Angiosperm mitochondrial genomes are more complex than those of other organisms. Analyses of the mitochondrial genome sequences of at least 11 angiosperm species have shown several common properties; these cannot easily explain, however, how the diverse mitotypes evolved within each genus or species. We analyzed the evolutionary relationships of Brassica mitotypes by sequencing. Results: We sequenced the mitotypes of cam (Brassica rapa), ole (B. oleracea), jun (B. juncea), and car (B. carinata) and analyzed them together with two previously sequenced mitotypes of B. napus (pol and nap). The sizes of whole single circular genomes of cam, jun, ole, and car are 219,747 bp, 219,766 bp, 360,271 bp, and 232,241 bp, respectively. The mitochondrial genome of ole is the largest as a result of the duplication of a 141.8 kb segment. The jun mitotype is the result of an inherited cam mitotype, and pol is also derived from the cam mitotype with evolutionary modifications. Genes with known functions are conserved in all mitotypes, but clear variation in open reading frames (ORFs) with unknown functions among the six mitotypes was observed. Sequence relationship analysis showed that there has been genome compaction and inheritance in the course of Brassica mitotype evolution. Conclusions: We have sequenced four Brassica mitotypes, compared six Brassica mitotypes and suggested a mechanism for mitochondrial genome formation in Brassica, including evolutionary events such as inheritance, duplication, rearrangement, genome compaction, and mutation.

  9. The diploid genome sequence of an individual human.

    Directory of Open Access Journals (Sweden)

    Samuel Levy

    2007-09-01

    Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
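    The variant tally in the record above can be reproduced from the itemized counts. This is a minimal sketch that assumes the "more than 4.1 million" total and the 22% non-SNP share are computed over the five itemized categories only; segmental duplications and copy number variation regions are not enumerated in the abstract and are left out. The dictionary keys are illustrative labels.

```python
# Itemized variant counts quoted in the abstract above.
variants = {
    "SNPs": 3_213_401,
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}

total = sum(variants.values())        # all itemized variant events
non_snp = total - variants["SNPs"]    # every category except SNPs
non_snp_share = non_snp / total       # fraction of events that are non-SNP

print(f"total itemized variants: {total:,}")
print(f"non-SNP events: {non_snp:,} ({non_snp_share:.0%})")
```

    Under that assumption the itemized counts sum to just over 4.1 million and the non-SNP share rounds to 22%, matching the figures stated in the abstract.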

  10. Genome-wide sequence variations among Mycobacterium avium subspecies paratuberculosis.

    Directory of Open Access Journals (Sweden)

    Adel M Talaat

    2011-12-01

    Mycobacterium avium subspecies paratuberculosis (M. ap), the causative agent of Johne’s disease (JD), infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium) strains isolated from various hosts and environments. Using next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98%) to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77) showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels) were not detected, and only unique single nucleotide polymorphisms (SNPs) were observed among the 6 M. ap strains. While most of the SNPs (~100) in M. ap genomes were non-synonymous, a total of ~6000 SNPs were detected among M. avium genomes, most of which were synonymous, suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNP-based phylogenomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10) strain while the human isolate (M. ap 4B) is closely related to the environmental strains, indicating an environmental source of human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne’s disease in animals.

  11. CGKB: an annotation knowledge base for cowpea (Vigna unguiculata L.) methylation filtered genomic genespace sequences

    Directory of Open Access Journals (Sweden)

    Spraggins Thomas A

    2007-04-01

    Background: Cowpea [Vigna unguiculata (L.) Walp.] is one of the most important food and forage legumes in the semi-arid tropics because of its ability to tolerate drought and grow on poor soils. It is cultivated mostly by poor farmers in developing countries, with 80% of production taking place in the dry savannah of tropical West and Central Africa. Cowpea is largely an underexploited crop with relatively little genomic information available for use in applied plant breeding. The goal of the Cowpea Genomics Initiative (CGI), funded by the Kirkhouse Trust, a UK-based charitable organization, is to leverage modern molecular genetic tools for gene discovery and cowpea improvement. One aspect of the initiative is the sequencing of the gene-rich region of the cowpea genome (termed the genespace) recovered using methylation filtration technology and providing annotation and analysis of the sequence data. Description: CGKB, Cowpea Genespace/Genomics Knowledge Base, is an annotation knowledge base developed under the CGI. The database is based on information derived from 298,848 cowpea genespace sequences (GSS) isolated by methylation filtering of genomic DNA. The CGKB consists of three knowledge bases: GSS annotation and comparative genomics knowledge base, GSS enzyme and metabolic pathway knowledge base, and GSS simple sequence repeats (SSRs) knowledge base for molecular marker discovery. A homology-based approach was applied for annotations of the GSS, mainly using BLASTX against four public FASTA formatted protein databases (NCBI GenBank Proteins, UniProtKB-Swiss-Prot, UniProtKB-PIR (Protein Information Resource), and UniProtKB-TrEMBL). Comparative genome analysis was done by BLASTX searches of the cowpea GSS against four plant proteomes from Arabidopsis thaliana, Oryza sativa, Medicago truncatula, and Populus trichocarpa. The possible exons and introns on each cowpea GSS were predicted using the HMM-based Genscan gene prediction program and the

  12. Comparison of methods for genomic localization of gene trap sequences

    Directory of Open Access Journals (Sweden)

    Ferrin Thomas E

    2006-09-01

    Background: Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. Results: In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion: The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.

  13. Evolution Analysis of Simple Sequence Repeats in Plant Genome.

    Directory of Open Access Journals (Sweden)

    Zhen Qin

    Simple sequence repeats (SSRs) are widespread units in genome sequences and play many important roles in plants. In order to reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analysis of twelve sequenced plant genomes. First, in the twelve studied plant genomes, the main SSRs were those containing repeats of 1-3 nucleotide combinations. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With the increase of SSR repeat number, the percentage of A/T in C. reinhardtii showed no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom and with increasing repeat number in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to that of AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T showed a rising trend along with the evolution of the plant kingdom; meanwhile, with the increase of SSR repeat number in plant species, different species chose different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only had a general pattern in the evolution of the plant kingdom, but were also associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provided new insights for the analysis of plant genome evolution.
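    To make the mono-, di- and trinucleotide SSR classes discussed in the record above concrete, here is a minimal sketch of how perfect tandem repeats with 1-3 nt units might be located with a regular expression. It is not the method used in the study: the function name `find_ssrs`, the `min_repeats=4` threshold and the toy sequence are all assumptions chosen for illustration.

```python
import re

def find_ssrs(seq, max_unit=3, min_repeats=4):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest unit first,
    so a run such as AAAAAA is reported as unit 'A', not 'AA' or 'AAA'.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

# Toy sequence (hypothetical): one mono-, one di- and one trinucleotide SSR.
toy = "GC" + "A" * 6 + "CG" + "AT" * 5 + "GC" + "AAG" * 4 + "TC"
ssrs = find_ssrs(toy)
for start, unit, count in ssrs:
    print(start, unit, count)
```

    Grouping the reported units by length and base composition (for example, A/T versus C/G mononucleotide runs) would then give the kind of percentages the abstract compares across species; real SSR surveys typically use class-specific repeat thresholds rather than a single cut-off.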

  14. Enhanced Dynamic Algorithm of Genome Sequence Alignments

    OpenAIRE

    Arabi E. keshk

    2014-01-01

    The merging of biology and computer science has created a new field called computational biology, or bioinformatics, that explores the capacity of computers to gain knowledge from biological data. Computational biology is rooted in life sciences as well as computers, information sciences, and technologies. The main problem in computational biology is sequence alignment, which is a way of arranging the sequences of DNA, RNA or protein to identify the regions of similarity and relationship between se...

  15. Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms.

    Science.gov (United States)

    Jeong, Haeyoung; Lee, Dae-Hee; Ryu, Choong-Min; Park, Seung-Hwan

    2016-01-01

    PacBio's long-read sequencing technologies can be successfully used for a complete bacterial genome assembly using recently developed non-hybrid assemblers in the absence of second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM-based genome sequencing results derived from previous studies, we generated further sequencing data, including from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach revealed that the hierarchical genome assembly process (HGAP) non-hybrid assembler resulted in nearly complete assemblies at a moderate coverage of ~75x, but that different versions produced non-compatible results requiring post-processing. The other two platforms further improved the PacBio assembly through scaffolding and a final error correction. PMID:26464377

  16. Chloroplast ribosomes and protein synthesis.

    OpenAIRE

    Harris, E. H.; Boynton, J E; Gillham, N W

    1994-01-01

    Consistent with their postulated origin from endosymbiotic cyanobacteria, chloroplasts of plants and algae have ribosomes whose component RNAs and proteins are strikingly similar to those of eubacteria. Comparison of the secondary structures of 16S rRNAs of chloroplasts and bacteria has been particularly useful in identifying highly conserved regions likely to have essential functions. Comparative analysis of ribosomal protein sequences may likewise prove valuable in determining their roles i...

  17. cpSSR: a New Tool to Analyze Chloroplast Genome of Citrus Somatic Hybrids

    Institute of Scientific and Technical Information of China (English)

    程运江; 郭文武; 邓秀新

    2003-01-01

    Chloroplast simple sequence repeat (cpSSR) markers in Citrus were developed and successfully used to analyze chloroplast genome inheritance of Citrus somatic hybrids. Twenty-two previously reported cpSSR primer pairs from pine (Pinus thunbergii Parl.), rice (Oryza sativa L.) and tobacco (Nicotiana tabacum L.) were tested in Citrus, nine of which yielded strong PCR products on agarose gel electrophoresis. Chloroplast genome inheritance of Citrus somatic hybrids from nine fusions was then analyzed, and five of the nine pre-screened primer pairs showed polymorphisms by polyacrylamide gel electrophoresis. The results revealed the random inheritance nature of the chloroplast genome in all analyzed Citrus somatic hybrids, which was in agreement with previous reports based on RFLP or CAPS analyses. It was also shown that cpSSR is a more efficient tool in chloroplast genome analyses of somatic hybrids in higher plants, compared with the conventional RFLP or CAPS analyses.

  18. Sequence-specific binding of a chloroplast pentatricopeptide repeat protein to its native group II intron ligand

    OpenAIRE

    Williams-Carrier, Rosalind; Kroeger, Tiffany; Barkan, Alice

    2008-01-01

    Pentatricopeptide repeat (PPR) proteins are defined by degenerate 35-amino acid repeats that are related to the tetratricopeptide repeat (TPR). Most characterized PPR proteins mediate specific post-transcriptional steps in gene expression in mitochondria or chloroplasts. However, little is known about the structure of PPR proteins or the biochemical mechanisms through which they act. Here we establish features of PPR protein structure and nucleic acid binding activity through in vitro experim...

  19. Screening Potential DNA Barcode Regions of Chloroplast Coding Genome for Citrus and Its Related Genera

    Institute of Scientific and Technical Information of China (English)

    于杰; 闫化学; 鲁振华; 周志钦

    2011-01-01

    [Objective] Four coding regions of the chloroplast genome of Citrus and its close relatives were analyzed to find suitable DNA barcoding markers for species identification and to lay a foundation for further study of non-coding regions. [Method] Four chloroplast DNA regions (matK, rpoB, rpoC1 and rbcL) of 59 accessions of Citrus and its related genera were sequenced; the sequences were aligned and manually corrected; intergeneric, interspecific and intraspecific genetic distances were calculated; and a phylogenetic tree of all accessions tested was built from the distance data. [Result] Intergeneric and interspecific sequence variation was highest in matK among the four coding regions tested and differed significantly from that of the other regions. In contrast, no obvious variation was found in the rpoB and rpoC1 regions. Sequence variation in rbcL was intermediate among the fragments sequenced. [Conclusion] The matK sequence is a potential candidate fragment for future DNA barcoding studies of Citrus and its closely related genera.

  20. Plasmodium knowlesi genome sequences from clinical isolates reveal extensive genomic dimorphism.

    Directory of Open Access Journals (Sweden)

    Miguel M Pinheiro

    Plasmodium knowlesi is a newly described zoonosis that causes malaria in the human population that can be severe and fatal. The study of P. knowlesi parasites from human clinical isolates is relatively new and, in order to obtain maximum information from patient sample collections, we explored the possibility of generating P. knowlesi genome sequences from archived clinical isolates. Our patient sample collection consisted of frozen whole blood samples that contained excessive human DNA contamination and, in that form, were not suitable for parasite genome sequencing. We developed a method to reduce the amount of human DNA in the thawed blood samples in preparation for high throughput parasite genome sequencing using Illumina HiSeq and MiSeq sequencing platforms. Seven of fifteen samples processed had sufficiently pure P. knowlesi DNA for whole genome sequencing. The reads were mapped to the P. knowlesi H strain reference genome and an average mapping of 90% was obtained. Genes with low coverage were removed, leaving 4623 genes for subsequent analyses. Previously we identified a DNA sequence dimorphism on a small fragment of the P. knowlesi normocyte binding protein xa gene on chromosome 14. We used the genome data to assemble full-length Pknbpxa sequences and discovered that the dimorphism extended along the gene. An in-house algorithm was developed to detect SNP sites co-associating with the dimorphism. More than half of the P. knowlesi genome was dimorphic, involving genes on all chromosomes and suggesting that two distinct types of P. knowlesi infect the human population in Sarawak, Malaysian Borneo. We use P. knowlesi clinical samples to demonstrate that Plasmodium DNA from archived patient samples can produce high quality genome data. We show that analyses of even small numbers of difficult clinical malaria isolates can generate comprehensive genomic information that will improve our understanding of malaria parasite diversity and

  1. Complete genome sequence of Arcobacter nitrofigilis type strain (CIT)

    Energy Technology Data Exchange (ETDEWEB)

    Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Gronow, Sabine [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Chertkov, Olga [Los Alamos National Laboratory (LANL); Bruce, David [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. 
Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]

    2010-01-01

    Arcobacter nitrofigilis (McClung et al. 1983) Vandamme et al. 1991 is the type species of the genus Arcobacter in the epsilonproteobacterial family Campylobacteraceae. The species was first described in 1983 as Campylobacter nitrofigilis [1] after its detection as a free-living, nitrogen-fixing Campylobacter species associated with Spartina alterniflora Loisel. roots [2]. It is of phylogenetic interest because of its lifestyle as a symbiotic organism in a marine environment, in contrast to many other Arcobacter species, which are associated with warm-blooded animals and tend to be pathogenic. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of a type strain of the genus Arcobacter. The 3,192,235 bp genome with its 3,154 protein-coding and 70 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  2. The complete plastid genome sequence of Bomarea edulis (Alstroemeriaceae: Liliales).

    Science.gov (United States)

    Kim, Jung Sung; Kim, Hyoung Tae; Yoon, Chang Young; Kim, Joo-Hwan

    2016-05-01

    Bomarea, a member of the family Alstroemeriaceae, is distributed from Chile to Mexico and includes approximately 120 species. Recent molecular phylogenetic studies have clarified the monophyly of the family within the order Liliales and its sister relationship with the family Colchicaceae. At this time, five plastid genomes of Liliales have been analyzed at the familial level. To examine plastid genome variation at the generic level, we sequenced the plastid genome of Bomarea edulis, which is the most widely distributed species in the genus, and compared it with that of Alstroemeria aurea. The plastid genome sequence of B. edulis was 154,925 bp in length, with a structure similar to that of A. aurea except at the IR-LSC junction. Ycf68 and infA were pseudogenes caused by frameshift mutations, and the ycf15 gene was deleted, as in A. aurea. PMID:25319309

  3. The complete mitochondrial genome sequence of Malus hupehensis var. pinyiensis.

    Science.gov (United States)

    Duan, Naibin; Sun, Honghe; Wang, Nan; Fei, Zhangjun; Chen, Xuesen

    2016-07-01

    The complete mitochondrial genome sequence of Malus hupehensis var. pinyiensis, a widely used apple rootstock, was determined using the Illumina high-throughput sequencing approach. The genome is 422,555 bp in length and has a GC content of 45.21%. It is separated by a pair of inverted repeats of 32,504 bp, to form a large single copy region of 213,055 bp and a small single copy region of 144,492 bp. The genome contains 38 protein-coding genes, four pseudogenes, 25 tRNA genes, and three rRNA genes. The genome is 25,608 bp longer than that of M. domestica, and several structural variations between these two mitogenomes were detected. PMID:26539696
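
    The region lengths quoted in this record can be cross-checked with simple arithmetic: in a genome split by a pair of inverted repeats, the total length should equal the large single copy region plus the small single copy region plus two copies of the repeat. The sketch below is illustrative only; the function name `quadripartite_total` is a hypothetical helper, not something taken from the cited paper, and the numbers are those given in the abstract.

```python
# Hypothetical helper (not from the cited paper): checks that the region
# lengths reported for a genome with two inverted repeats add up to the
# reported total length.
def quadripartite_total(lsc: int, ssc: int, ir: int) -> int:
    """Total length = LSC + SSC + two copies of the inverted repeat (IR)."""
    return lsc + ssc + 2 * ir


# Values quoted in the abstract above, in bp.
LSC, SSC, IR, REPORTED_TOTAL = 213_055, 144_492, 32_504, 422_555

total = quadripartite_total(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)  # prints: 422555 True
```

    The same check can be applied to any record in this listing that reports single-copy and inverted-repeat lengths alongside a total.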

  4. Draft genome sequence of the rubber tree Hevea brasiliensis

    Directory of Open Access Journals (Sweden)

    Rahman Ahmad Yamin Abdul

    2013-02-01

    Background: Hevea brasiliensis, a member of the Euphorbiaceae family, is the major commercial source of natural rubber (NR). NR is a latex polymer with high elasticity, flexibility, and resilience that has played a critical role in the world economy since 1876. Results: Here, we report the draft genome sequence of H. brasiliensis. The assembly spans ~1.1 Gb of the estimated 2.15 Gb haploid genome. Overall, ~78% of the genome was identified as repetitive DNA. Gene prediction shows 68,955 gene models, of which 12.7% are unique to Hevea. Most of the key genes associated with rubber biosynthesis, rubberwood formation, disease resistance, and allergenicity have been identified. Conclusions: The knowledge gained from this genome sequence will aid in the future development of high-yielding clones to keep up with the ever-increasing need for natural rubber.

  5. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Indian Academy of Sciences (India)

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-04-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  6. Accuracy of genomic prediction using imputed whole-genome sequence data in white layers.

    Science.gov (United States)

    Heidaritabar, M; Calus, M P L; Megens, H-J; Vereijken, A; Groenen, M A M; Bastiaansen, J W M

    2016-06-01

    There is an increasing interest in using whole-genome sequence data in genomic selection breeding programmes. Prediction of breeding values is expected to be more accurate when whole-genome sequence is used, because the causal mutations are assumed to be in the data. We performed genomic prediction for the number of eggs in white layers using imputed whole-genome resequence data including ~4.6 million SNPs. The prediction accuracies based on sequence data were compared with the accuracies from the 60 K SNP panel. Predictions were based on genomic best linear unbiased prediction (GBLUP) as well as a Bayesian variable selection model (BayesC). Moreover, the prediction accuracy from using different types of variants (synonymous, non-synonymous and non-coding SNPs) was evaluated. Genomic prediction using the 60 K SNP panel resulted in a prediction accuracy of 0.74 when GBLUP was applied. With sequence data, there was a small increase (~1%) in prediction accuracy over the 60 K genotypes. With both 60 K SNP panel and sequence data, GBLUP slightly outperformed BayesC in predicting the breeding values. Selection of SNPs more likely to affect the phenotype (i.e. non-synonymous SNPs) did not improve the accuracy of genomic prediction. The fact that sequence data were based on imputation from a small number of sequenced animals may have limited the potential to improve the prediction accuracy. A small reference population (n = 1004) and possible exclusion of many causal SNPs during quality control can be other possible reasons for limited benefit of sequence data. We expect, however, that the limited improvement is because the 60 K SNP panel was already sufficiently dense to accurately determine the relationships between animals in our data. PMID:26776363

  7. Genome and exome sequencing in the clinic: unbiased genomic approaches with a high diagnostic yield

    NARCIS (Netherlands)

    Nelen, M.; Veltman, J.A.

    2012-01-01

    For the reasons discussed here, we think whole-genome- or exome-based approaches are currently most suited for diagnostic implementation in genetically heterogeneous diseases, initially to complement and later to replace Sanger sequencing, qPCR and genomic microarrays. Patients do need to be counsel

  8. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    NARCIS (Netherlands)

    Tettelin, H; Masignani; Cieslewicz, MJ; Eisen, JA; Peterson, S; Paulsen, IT; Nelson, KE; Margarit; Read, TD; Madoff, LC; Beanan, MJ; Brinkac, LM; Daugherty, SC; DeBoy, RT; Durkin, AS; Kolonay, JF; Madupu, R; Lewis, MR; Radune, D; Fedorova, NB; Scanlan, D; Khouri, H; Mulligan, S; Carty, HA; Cline, RT; Van Aken, SE; Gill, J; Scarselli, M; Mora, M; Iacobini, ET; Brettoni, C; Galli, G; Mariani, M; Vegni, F; Maione, D; Rinaudo, D; Rappuoli, R; Telford, JL; Kasper, DL; Grandi, G; Fraser, CM

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the oth

  9. Candida albicans genome sequence: a platform for genomics in the absence of genetics

    OpenAIRE

    Odds, Frank C.; Brown, Alistair JP; Gow, Neil AR

    2004-01-01

    Publication of the complete diploid genome sequence of the yeast Candida albicans will accelerate research into the pathogenesis of Candida infections. Comparative genomic analysis highlights genes that may contribute to C. albicans survival and its fitness as a human commensal and pathogen.

  10. Complete Genome Sequence of Streptomyces ambofaciens DSM 40697, a Paradigm for Genome Plasticity Studies

    Science.gov (United States)

    Thibessard, Annabelle

    2016-01-01

    The sequence of Streptomyces ambofaciens DSM 40697 was completely determined. The genome consists of an 8.1-Mbp linear chromosome with terminal inverted repeats of 210 kb. Genomic islands were identified, one of which corresponds to a new putative integrative and conjugative element (ICE) called pSAM3. PMID:27257195

  11. Complete genome sequence of Croceibacter atlanticus HTCC2559T.

    Science.gov (United States)

    Oh, Hyun-Myung; Kang, Ilnam; Ferriera, Steve; Giovannoni, Stephen J; Cho, Jang-Cheon

    2010-09-01

    Here we announce the complete genome sequence of Croceibacter atlanticus HTCC2559(T), which was isolated by high-throughput dilution-to-extinction culturing from the Bermuda Atlantic Time Series station in the Western Sargasso Sea. Strain HTCC2559(T) contained genes for carotenoid biosynthesis, flavonoid biosynthesis, and several macromolecule-degrading enzymes. The genome confirmed physiological observations of cultivated Croceibacter atlanticus strain HTCC2559(T), which identified it as an obligate chemoheterotroph. PMID:20639333

  12. The genome sequence of the filamentous fungus Neurospora crassa

    OpenAIRE

    Read, Nick D; et al.

    2003-01-01

    Neurospora crassa is a central organism in the history of twentieth-century genetics, biochemistry and molecular biology. Here, we report a high-quality draft sequence of the N. crassa genome. The approximately 40-megabase genome encodes about 10,000 protein-coding genes—more than twice as many as in the fission yeast Schizosaccharomyces pombe and only about 25% fewer than in the fruitfly Drosophila melanogaster. Analysis of the gene set yields insights into unexpected aspects of Neu...

  13. Complete Genome Sequence of a Novel Porcine Parvovirus in China

    OpenAIRE

    Dai, Xiao-Fang; Wang, Qiu-Ju; Jiang, Shi-Jin; Xie, Zhi-Jing

    2012-01-01

    The porcine parvovirus JT strain (PPV-JT) was isolated from a piglet showing nonsuppurative myocarditis in Shandong, China, in 2010. The complete genomic sequence of PPV-JT, 4,941 bp long, was determined from clones made from replicative form (RF) DNA. The genomic analysis demonstrated that PPV-JT might have been involved in a recombination event, which will help us understand the molecular characteristics and evolution of PPV in China.

  14. Complete Genome Sequence of the Endophytic Fungus Diaporthe (Phomopsis) ampelina.

    Science.gov (United States)

    Savitha, J; Bhargavi, S D; Praveen, V K

    2016-01-01

    Diaporthe ampelina was isolated as an endophytic fungus from the root of Commiphora wightii, a medicinal plant collected from Dhanvantri Vana, Bangalore University, Bangalore, India. The whole genome is 59 Mb, contains a total of 905 scaffolds, and has a G+C content of 51.74%. The genome sequence of D. ampelina shows a complete absence of the lovastatin (an anticholesterol drug) gene cluster. PMID:27257198

  15. Whole Genome and Transcriptome Sequencing of a B3 Thymoma

    OpenAIRE

    Iacopo Petrini; Arun Rajan; Trung Pham; Donna Voeller; Sean Davis; James Gao; Yisong Wang; Giuseppe Giaccone

    2013-01-01

    Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomi...

  16. Microsatellite evolution inferred from human–chimpanzee genomic sequence alignments

    OpenAIRE

    Webster, Matthew T.; Smith, Nick G.C.; Ellegren, Hans

    2002-01-01

    Most studies of microsatellite evolution utilize long, highly mutable loci, which are unrepresentative of the majority of simple repeats in the human genome. Here we use an unbiased sample of 2,467 microsatellite loci derived from alignments of 5.1 Mb of genomic sequence from human and chimpanzee to investigate the mutation process of tandemly repetitive DNA. The results indicate that the process of microsatellite evolution is highly heterogeneous, exhibiting differences between loci of diffe...

  17. Complete genome sequence of the European sheatfish virus

    OpenAIRE

    Mavian, Carla; López-Bueno, Alberto; Somalo, María Pilar Fernández; Alcamí, Antonio; Alejo, Alí

    2012-01-01

    Viral diseases are an increasing threat to the thriving aquaculture industry worldwide. An emerging group of fish pathogens is formed by several ranaviruses, which have been isolated at different locations from freshwater and seawater fish species since 1985. We report the complete genome sequence of European sheatfish ranavirus (ESV), the first ranavirus isolated in Europe, which causes high mortality rates in infected sheatfish (Silurus glanis) and in other species. Analysis of the genome se...

  18. Complete genome sequence of the European sheatfish virus

    OpenAIRE

    Mavian, Carla; López-Bueno, Alberto; Alcamí, Antonio; Alejo, Alí; Fernández Somalo, María Pilar

    2012-01-01

    Viral diseases are an increasing threat to the thriving aquaculture industry worldwide. An emerging group of fish pathogens is formed by several ranaviruses, which have been isolated at different locations from freshwater and seawater fish species since 1985. We report the complete genome sequence of European sheatfish ranavirus (ESV), the first ranavirus isolated in Europe, which causes high mortality rates in infected sheatfish (Silurus glanis) and in other species. Analysis of the genome seq...

  19. Arrangement of repetitive sequences in the genome of herpesvirus Sylvilagus.

    OpenAIRE

    Medveczky, M M; Geck, P; Clarke, C; Byrnes, J; Sullivan, J L; Medveczky, P G

    1989-01-01

    Herpesvirus sylvilagus is a lymphotropic (type gamma) herpesvirus of cottontail rabbits (Sylvilagus floridanus). Analysis of virion DNA of herpesvirus sylvilagus has revealed that the genome consists of one stretch of about 120 kilobase pairs of internal, unique DNA flanked by a variable number of 553-base-pair tandem repeats. The G + C content of the repetitive DNA is extremely high (83%), as determined by sequencing. The organization of the herpesvirus sylvilagus genome is, therefore, simil...

  20. Complete Genome Sequence of the Endophytic Fungus Diaporthe (Phomopsis) ampelina

    Science.gov (United States)

    Bhargavi, S. D.; Praveen, V. K.

    2016-01-01

    Diaporthe ampelina was isolated as an endophytic fungus from the root of Commiphora wightii, a medicinal plant collected from Dhanvantri Vana, Bangalore University, Bangalore, India. The whole genome is 59 Mb, contains a total of 905 scaffolds, and has a G+C content of 51.74%. The genome sequence of D. ampelina shows a complete absence of the lovastatin (an anticholesterol drug) gene cluster. PMID:27257198

  1. Genome sequencing, annotation of Citrobacter freundii strain GTC 09479

    Directory of Open Access Journals (Sweden)

    Kazuyuki Kimura

    2014-12-01

    We report the 4.9-Mb genome sequence of Citrobacter freundii strain GTC 09479, isolated from a urine sample collected in 1983 at Gifu University Graduate School of Medicine, Japan. The draft genome consists of 4,899,578 bp with 51.62% G + C, 4,574 predicted CDSs, 72 tRNAs and 10 rRNAs.
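
    G + C percentages such as the one reported in this record are the share of guanine and cytosine among the bases of the assembled sequence. A minimal sketch of that calculation follows, assuming a plain nucleotide string as input; the function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions made for illustration, not the authors' pipeline.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among unambiguous A/C/G/T bases.

    Ambiguous symbols (e.g. N) are ignored; this is an illustrative
    convention, and other tools may count them differently.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy sequence: 8 unambiguous bases, 4 of them G or C.
print(round(gc_content("ATGCGCNNTA"), 2))  # prints: 50.0
```

    For a real draft genome, the per-contig sequences would be concatenated (or their counts summed) before computing the percentage.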

  2. Insights into phylogeny, sex function and age of Fragaria based on whole chloroplast genome sequencing

    Science.gov (United States)

    The cultivated strawberry is one of the youngest domesticated plants, developed in France in the 1700s from chance hybridization between two western hemisphere octoploid species. However, little is known about the evolution of the species that gave rise to this important fruit crop. Phylogenetic an...

  3. Standardized metadata for human pathogen/vector genomic sequences.

    Directory of Open Access Journals (Sweden)

    Vivien G Dugan

    High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium's minimal information (MIxS) and NCBI's BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards. The use of this metadata standard by all ongoing and future GSCID sequencing projects will

  4. Sequence analysis reveals mosaic genome of Aichi virus

    Directory of Open Access Journals (Sweden)

    Han Xiaohong

    2011-08-01

    Aichi virus is a positive-sense, single-stranded RNA virus that has been shown to be associated with diarrhea in children. In the present study, phylogenetic and recombination analyses based on the complete Aichi virus genomes available in GenBank reveal a mosaic genome sequence [GenBank: FJ890523], of which the nt 261-852 region (nt positions are based on the aligned sequence file) is closely related to AB010145/Japan with 97.9% sequence identity, while the other genomic regions are closely related to AY747174/German with 90.1% sequence identity. Our results provide valuable hints for future research on Aichi virus diversity. Aichi virus is a member of the Kobuvirus genus of the Picornaviridae family [1,2] and is a positive-sense, single-stranded RNA virus. Its presence in fecal specimens of children suffering from diarrhea has been demonstrated in several Asian countries [3,4,5,6], in Brazil and Germany [7], in France [8] and in Tunisia [9]. Some reports showed a high level of seroprevalence in adults [7,10], suggesting widespread exposure to Aichi virus during childhood. The genome of Aichi virus contains 8,280 nucleotides and a poly(A) tail. The single large open reading frame (nt 713-8014 according to strain AB010145) encodes a polyprotein of 2,432 amino acids that is cleaved into the typical picornavirus structural proteins VP0, VP3 and VP1, and nonstructural proteins 2A, 2B, 2C, 3A, 3B, 3C and 3D [2,11]. Based on phylogenetic analysis of 519-bp sequences at the 3C-3D (3CD) junction, Aichi viruses can be divided into two genotypes, A and B, with approximately 90% sequence homology [12]. Although only six complete genomes of Aichi virus have been deposited in GenBank at present, mosaic genomes can be found in strains from different countries.

  5. Transcription of densovirus endogenous sequences in the Myzus persicae genome.

    Science.gov (United States)

    Clavijo, Gabriel; van Munster, Manuella; Monsion, Baptiste; Bochet, Nicole; Brault, Véronique

    2016-04-01

    Integration of non-retroviral sequences in the genome of different organisms has been observed and, in some cases, a relationship of these integrations with immunity has been established. The genome of the green peach aphid, Myzus persicae (clone G006), was screened for densovirus-like sequence (DLS) integrations. A total of 21 DLSs localized on 10 scaffolds were retrieved that mostly shared sequence identity with two aphid-infecting viruses, Myzus persicae densovirus (MpDNV) and Dysaphis plantaginea densovirus (DplDNV). In some cases, uninterrupted potential ORFs corresponding to non-structural viral proteins or capsid proteins were found within DLSs identified in the aphid genome. In particular, one scaffold harboured a complete virus-like genome, while another scaffold contained two virus-like genomes in reverse orientation. Remarkably, transcription of some of these ORFs was observed in M. persicae, suggesting a biological effect of these viral integrations. In contrast to most of the other densoviruses identified so far that induce acute host infection, it has been reported previously that MpDNV has only a minor effect on M. persicae fitness, while DplDNV can even have a beneficial effect on its aphid host. This suggests that DLS integration in the M. persicae genome may be responsible for the latency of MpDNV infection in the aphid host. PMID:26758080

  6. Low-pass sequencing for microbial comparative genomics

    Directory of Open Access Journals (Sweden)

    Kennedy Sean

    2004-01-01

    Background: We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results: As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion: Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genomes of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the

  7. The complete mitochondrial genome sequence of the budgerigar, Melopsittacus undulatus.

    Science.gov (United States)

    Guan, Xiaojing; Xu, Jun; Smith, Edward J

    2016-01-01

    Here, we describe the budgie's mitochondrial genome sequence, a resource that can facilitate this parrot's use as a model organism as well as for determining its phylogenetic relatedness to other parrots/Psittaciformes. The estimated total length of the sequence was 18,193 bp. In addition to the 13 protein, tRNA and rRNA coding regions, the sequence also includes a duplicated hypervariable region, a feature found in only a few birds. The two hypervariable regions shared a sequence identity of about 86%. PMID:24660934

  8. Pittosporum cryptic virus 1: genome sequence completion using next-generation sequencing.

    Science.gov (United States)

    Elbeaino, Toufic; Kubaa, Raied Abou; Tuzlali, Hasan Tuna; Digiaro, Michele

    2016-07-01

    Next-generation sequencing (NGS) was applied to dsRNAs extracted from an Italian pittosporum plant infected with pittosporum cryptic virus 1 (PiCV1). NGS allowed assembly of the full genome sequence of PiCV1, comprising dsRNA1 (1.9 kbp) and dsRNA2 (1.5 kbp), which encode the RNA-dependent RNA polymerase and capsid protein genes, respectively. Phylogenetic and sequence analyses confirmed that PiCV1 is a new member of the genus Deltapartitivirus, family Partitiviridae. From the same plant, NGS also permitted assembly of the complete genome sequence of eggplant mottled dwarf virus (EMDV), which shared 86% to 98% nucleotide sequence identity with complete and partial sequences (ca. 6750 nt) of other known EMDV isolates with sequences available in the GenBank database. PMID:27087112

  9. Sequence Classification: 892357 [TMBETA-GENOME[Archive]]

    Lifescience Database Archive (English)

    Full Text Available Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB >gi|6322971|ref|NP_013043.1| Fe(II)-dependent su ... dioxygenase, involved in sulfonate catabolism for use ... as a sulfur source, contains sequence that closely ...

  10. Genome Sequencing and Annotation of Mycobacterium tuberculosis PR08 strain

    Directory of Open Access Journals (Sweden)

    Mohammad Maaruf Jaafar

    2016-03-01

    Full Text Available Mycobacterium tuberculosis is an acid-fast bacterial species in the family Mycobacteriaceae and is the causative agent of most cases of tuberculosis. Here, we report the genomic features of Mycobacterium tuberculosis isolated from the cerebrospinal fluid (CSF) of a patient diagnosed with both pulmonary and extrapulmonary tuberculosis (TB). The isolated strain was identified as Mycobacterium tuberculosis PR08 (MTB PR08). Genomic DNA of the MTB PR08 strain was extracted and subjected to whole genome sequencing using MiSeq (Illumina, CA, USA). The draft genome size of the MTB PR08 strain is 4,292,364 bp with a G + C content of 65.2%. This strain was annotated to have 4723 genes and 48 RNAs. This whole genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number CP010895.

  11. Sequencing and analysis of the giant panda genome

    Institute of Scientific and Technical Information of China (English)

    YANG HuanMing

    2010-01-01

    The giant panda (Ailuropoda melanoleuca) is loved all over the world and is considered a symbol of China, as illustrated by its being one of the mascots for the Beijing 2008 Olympic Games. It is also one of the world's most endangered animals and a flagship species for conservation. Using next-generation sequencing technology (Illumina Genome Analyzer) and our in-house assembly software, we have generated the first map of the giant panda genome sequence. This map will provide an unparalleled amount of information to aid in understanding the genetic and biological nature of this unique species and will contribute significantly to disease control and conservation efforts for this endangered species. In March 2008, the giant panda genome sequencing and analysis project was started at the Beijing Genomics Institute (BGI) in Shenzhen with collaborators from the Kunming Institute of Zoology and the Chengdu Research Base of Giant Panda Breeding. On 21 Jan. 2010, this collaboration resulted in the publication, as a cover story in the journal Nature, of the sequencing and analysis of the giant panda genome.

  12. Genome sequence of the pea aphid Acyrthosiphon pisum

    DEFF Research Database (Denmark)

    Richards, S.; Gibbs, R. A.; Gerardo, N. M.;

    2010-01-01

    Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first...... published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we...

  13. The complete mitochondrial genome sequence of Emperor Penguins (Aptenodytes forsteri).

    Science.gov (United States)

    Xu, Qiwu; Xia, Yan; Dang, Xiao; Chen, Xiaoli

    2016-09-01

    The emperor penguin (Aptenodytes forsteri) is the largest living species of penguin. Herein, we report for the first time the complete mitochondrial genome of the emperor penguin. The mitochondrial genome is a circular molecule of 17 301 bp in length, consisting of 13 protein-coding genes, 22 tRNA genes, two rRNA genes, and one control region. To verify the accuracy and utility of the newly determined mitogenome sequence, we constructed a species phylogenetic tree of the emperor penguin together with 10 other closely related species. This is the second complete mitochondrial genome of a penguin, and it will be an important resource for studying the mitochondrial evolution of birds. PMID:26403091

  14. Draft genome sequence of the rubber tree Hevea brasiliensis

    OpenAIRE

    Rahman Ahmad Yamin Abdul; Usharraj Abhilash O; Misra Biswapriya B; Thottathil Gincy P; Jayasekaran Kandakumar; Feng Yun; Hou Shaobin; Ong Su Yean; Ng Fui Ling; Lee Ling Sze; Tan Hock Siew; Sakaff Muhd Khairul Luqman Muhd; Teh Beng Soon; Khoo Bee; Badai Siti Suriawati

    2013-01-01

    Abstract Background Hevea brasiliensis, a member of the Euphorbiaceae family, is the major commercial source of natural rubber (NR). NR is a latex polymer with high elasticity, flexibility, and resilience that has played a critical role in the world economy since 1876. Results Here, we report the draft genome sequence of H. brasiliensis. The assembly spans ~1.1 Gb of the estimated 2.15 Gb haploid genome. Overall, ~78% of the genome was identified as repetitive DNA. Gene prediction shows 68,95...

  15. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation-related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied to protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
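
    The "constrained column" idea in this abstract can be illustrated with a minimal Python sketch. It assumes that the preserved property is membership in a physicochemical class of amino acids; the abstract does not say which properties were used, so the class table and the names PROPERTY_CLASSES and constrained_columns are hypothetical and are not the Prolog toolkit's interface.

```python
# Hypothetical property classes (an assumption for illustration only).
PROPERTY_CLASSES = {
    "hydrophobic": set("AVLIMFWC"),
    "polar": set("STNQYH"),
    "positive": set("KRH"),
    "negative": set("DE"),
}


def constrained_columns(alignment):
    """Return (column index, property) pairs for columns whose residues
    differ between sequences but all fall into one property class."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for col in range(width):
        residues = {row[col] for row in alignment}
        # Skip gapped columns and fully invariant columns.
        if "-" in residues or len(residues) < 2:
            continue
        for name, members in PROPERTY_CLASSES.items():
            if residues <= members:
                hits.append((col, name))
                break
    return hits


# Toy alignment, invented for illustration.
alignment = ["MKVLD", "MRILE", "MKLLD"]
print(constrained_columns(alignment))
# [(1, 'positive'), (2, 'hydrophobic'), (4, 'negative')]
```

    Columns 0 and 3 are skipped because they are invariant; the remaining columns mutate but stay within a single class, which is the pattern the abstract describes as evolutionary pressure to preserve a property.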

  16. Complete Plastid Genome Sequence of the Brown Alga Undaria pinnatifida.

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    Full Text Available In this study, we fully sequenced the circular plastid genome of a brown alga, Undaria pinnatifida. The genome is 130,383 base pairs (bp) in size; it contains a large single-copy (LSC, 76,598 bp) and a small single-copy region (SSC, 42,977 bp), separated by two inverted repeats (IRa and IRb: 5,404 bp). The genome contains 139 protein-coding, 28 tRNA, and 6 rRNA genes; none of these genes contains introns. Organization and gene contents of the U. pinnatifida plastid genome were similar to those of Saccharina japonica. There is a co-linear relationship between the plastid genome of U. pinnatifida and that of three previously sequenced large brown algal species. Phylogenetic analyses of 43 taxa based on 23 plastid protein-coding genes grouped all plastids into a red or green lineage. In the large brown algae branch, U. pinnatifida and S. japonica formed a sister clade with much closer relationship to Ectocarpus siliculosus than to Fucus vesiculosus. For the first time, the start codon ATT was identified in the plastid genome of large brown algae, in the atpA gene of U. pinnatifida. In addition, we found a gene-length change induced by a 3-bp repetitive DNA in ycf35 and ilvB genes of the U. pinnatifida plastid genome.
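
    As a quick consistency check on the region sizes quoted in this abstract, the short Python sketch below adds the LSC, the SSC and both inverted-repeat copies, assuming the 5,404 bp figure is the length of each repeat. The function name quadripartite_total is illustrative only.

```python
def quadripartite_total(lsc_bp, ssc_bp, ir_bp):
    """Total length of a circular plastid genome made of one LSC,
    one SSC and two inverted-repeat copies of ir_bp each."""
    return lsc_bp + ssc_bp + 2 * ir_bp


total = quadripartite_total(lsc_bp=76_598, ssc_bp=42_977, ir_bp=5_404)
print(total)  # 130383
```

    The sum reproduces the reported genome size of 130,383 bp, so the four region lengths are mutually consistent.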

  17. Sequence modelling and an extensible data model for genomic database

    Energy Technology Data Exchange (ETDEWEB)

    Li, Peter Wei-Der [California Univ., San Francisco, CA (United States); Lawrence Berkeley Lab., CA (United States)]

    1992-01-01

    The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object-oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need for a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented a query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.

  18. Sequence modelling and an extensible data model for genomic database

    Energy Technology Data Exchange (ETDEWEB)

    Li, Peter Wei-Der (California Univ., San Francisco, CA (United States); Lawrence Berkeley Lab., CA (United States))

    1992-01-01

    The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object-oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need for a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented a query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.

  19. Next-Gen phylogeography of rainforest trees: exploring landscape-level cpDNA variation from whole-genome sequencing.

    Science.gov (United States)

    van der Merwe, M; McPherson, H; Siow, J; Rossetto, M

    2014-01-01

    Standardized phylogeographic studies across codistributed taxa can identify important refugia and biogeographic barriers, and potentially uncover how changes in adaptive constraints through space and time impact on the distribution of genetic diversity. The combination of next-generation sequencing and methodologies that enable uncomplicated analysis of the full chloroplast genome may provide an invaluable resource for such studies. Here, we assess the potential of a shotgun-based method across twelve nonmodel rainforest trees sampled from two evolutionarily distinct regions. Whole genomic shotgun sequencing libraries consisting of pooled individuals were used to assemble species-specific chloroplast references (in silico). For each species, the pooled libraries allowed for the detection of variation within and between data sets (each representing a geographic region). The potential use of nuclear rDNA as an additional marker from the NGS libraries was investigated by mapping reads against available references. We successfully obtained phylogeographically informative sequence data from a range of previously unstudied rainforest trees. Greater levels of diversity were found in northern refugial rainforests than in southern expansion areas. The genetic signatures of varying evolutionary histories were detected, and interesting associative patterns between functional characteristics and genetic diversity were identified. This approach can suit a wide range of landscape-level studies. As the key laboratory-based steps do not require prior species-specific knowledge and can be easily outsourced, the techniques described here are even suitable for researchers without access to wet-laboratory facilities, making evolutionary ecology questions increasingly accessible to the research community. PMID:24119022

  20. Mitochondrial DNA sequences in the nuclear genome of a locust.

    Science.gov (United States)

    Gellissen, G; Bradfield, J Y; White, B N; Wyatt, G R

    The endosymbiotic theory of the origin of mitochondria is widely accepted, and implies that loss of genes from the mitochondria to the nucleus of eukaryotic cells has occurred over evolutionary time. However, evidence at the DNA sequence level for gene transfer between these organelles has so far been limited to a single example, the demonstration that a mitochondrial ATPase subunit gene of Neurospora crassa has an homologous partner in the nuclear genome. From a gene library of the insect, Locusta migratoria, we have now isolated two clones, representing separate fragments of nuclear DNA, which contain sequences homologous to the mitochondrial genes for ribosomal RNA, as well as regions of homology with highly repeated nuclear sequences. The results suggest the transfer of sequences between mitochondrial and nuclear genomes, followed by evolutionary divergence. PMID:6298629

  1. Complete genome sequence of Allochromatium vinosum DSM 180T

    Energy Technology Data Exchange (ETDEWEB)

    Weissgerber, Thomas [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany; Zigann, Renate [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany; Bruce, David [Los Alamos National Laboratory (LANL); Chang, Yun-Juan [ORNL; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Hauser, Loren John [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Land, Miriam L [ORNL; Munk, Christine [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Dahl, Christiane [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany

    2011-01-01

    Allochromatium vinosum, formerly Chromatium vinosum, is a mesophilic purple sulfur bacterium belonging to the family Chromatiaceae in the bacterial class Gammaproteobacteria. The genus Allochromatium currently contains five species. All members were isolated from freshwater, brackish water or marine habitats and are predominantly obligate phototrophs. Here we describe the features of the organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the Chromatiaceae within the purple sulfur bacteria thriving in globally occurring habitats. The 3,669,074 bp genome with its 3,302 protein-coding and 64 RNA genes was sequenced within the Joint Genome Institute Community Sequencing Program.

  2. Complete genome sequence of Thauera aminoaromatica strain MZ1T

    Energy Technology Data Exchange (ETDEWEB)

    Sanseverino, John [ORNL; Chauhan, Archana [University of Tennessee, Knoxville (UTK); Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Dalin, Eileen [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Sims, David [Los Alamos National Laboratory (LANL); Brettin, Thomas S [ORNL; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Chang, Yun-Juan [ORNL; Larimer, Frank W [ORNL; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Moser, Scott [University of Tennessee, Knoxville (UTK); Jegier, Patricia [University of Tennessee, Knoxville (UTK); Close, Dan [University of Tennessee, Knoxville (UTK); Wang, Ying [University of Tennessee, Knoxville (UTK); Layton, Alice [University of Tennessee, Knoxville (UTK); Allen, Michael S. [University of Tennessee, Knoxville (UTK); Sayler, Gary [University of Tennessee, Knoxville (UTK)

    2012-01-01

    Thauera aminoaromatica strain MZ1T, an isolate belonging to the genus Thauera, of the family Rhodocyclaceae and the class Betaproteobacteria, has been characterized for its ability to produce abundant exopolysaccharide and degrade various aromatic compounds with nitrate as an electron acceptor. These properties, if fully understood at the genome-sequence level, can aid in environmental processing of organic matter in anaerobic cycles by short-circuiting a central anaerobic metabolite, acetate, from microbiological conversion to methane, a critical greenhouse gas. Strain MZ1T is the first strain from the genus Thauera with a completely sequenced genome. The 4,496,212 bp chromosome and 78,374 bp plasmid contain 4,071 protein-coding and 71 RNA genes, and were sequenced as part of the DOE Community Sequencing Program CSP_776774.

  3. Complete Genome Sequence of Streptococcus agalactiae CNCTC 10/84, a Hypervirulent Sequence Type 26 Strain

    OpenAIRE

    Hooven, Thomas A.; Randis, Tara M.; Sean C Daugherty; Narechania, Apurva; Planet, Paul J.; Tettelin, Hervé; Ratner, Adam J.

    2014-01-01

    Streptococcus agalactiae (group B Streptococcus [GBS]) is a human pathogen with a propensity to cause neonatal infections. We report the complete genome sequence of GBS strain CNCTC 10/84, a hypervirulent clinical isolate frequently used to study GBS pathogenesis. Comparative analysis of this sequence may shed light on novel pathogenic mechanisms.

  4. Transcriptome analysis of ectopic chloroplast development in green curd cauliflower (Brassica oleracea L. var. botrytis)

    Directory of Open Access Journals (Sweden)

    Zhou Xiangjun

    2011-11-01

    Full Text Available Abstract Background Chloroplasts are the green plastids where photosynthesis takes place. The biogenesis of chloroplasts requires the coordinate expression of both nuclear and chloroplast genes and is regulated by developmental and environmental signals. Despite extensive studies of this process, the genetic basis and the regulatory control of chloroplast biogenesis and development remain to be elucidated. Results The green cauliflower mutant causes ectopic development of chloroplasts in the curd tissue of the plant, turning the otherwise white curd green. To investigate the transcriptional control of chloroplast development, we compared gene expression between green and white curds using the RNA-seq approach. Deep sequencing produced over 15 million reads with lengths of 86 base pairs from each cDNA library. A total of 7,155 genes were found to exhibit at least 3-fold changes in expression between green and white curds. These included light-regulated genes, genes encoding chloroplast constituents, and genes involved in chlorophyll biosynthesis. Moreover, we discovered that the cauliflower ELONGATED HYPOCOTYL5 (BoHY5) was more highly expressed in green curds than in white curds and that 2616 HY5-targeted genes, including 1600 up-regulated genes and 1016 down-regulated genes, were differentially expressed in green in comparison to white curd tissue. All these 1600 up-regulated genes were HY5-targeted genes in the light. Conclusions The genome-wide profiling of gene expression by RNA-seq in green curds led to the identification of large numbers of genes associated with chloroplast development, and suggested the role of regulatory genes in the high hierarchy of light signaling pathways in mediating the ectopic chloroplast development in the green curd cauliflower mutant.
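
    The "at least 3-fold change" criterion in this abstract can be sketched in Python as follows. The abstract does not state how expression was normalised or how zero values were handled, so the pseudocount, the function name fold_change_hits and the toy expression values are assumptions made for illustration, not the authors' pipeline.

```python
import math


def fold_change_hits(green, white, min_fold=3.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes whose expression differs
    by at least min_fold in either direction between the two tissues."""
    threshold = math.log2(min_fold)
    hits = {}
    for gene in green.keys() & white.keys():
        # The pseudocount (an assumption) avoids division by zero.
        ratio = (green[gene] + pseudocount) / (white[gene] + pseudocount)
        log2_fc = math.log2(ratio)
        if abs(log2_fc) >= threshold:
            hits[gene] = log2_fc
    return hits


# Toy normalised expression values, invented for illustration only.
green = {"geneA": 89.0, "geneB": 10.0, "geneC": 4.0}
white = {"geneA": 9.0, "geneB": 12.0, "geneC": 49.0}

hits = fold_change_hits(green, white)
print(sorted(hits))  # ['geneA', 'geneC']

# The HY5 target counts quoted in the abstract add up as reported.
assert 1600 + 1016 == 2616
```

    Working on the log2 scale treats up- and down-regulation symmetrically, so a 3-fold decrease passes the same threshold as a 3-fold increase.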

  5. Genome sequencing highlights the dynamic early history of dogs.

    OpenAIRE

    Freedman, Adam H.; Ilan Gronau; Schweizer, Rena M.; Diego Ortega-Del Vecchyo; Eunjung Han; Silva, Pedro M.; Marco Galaverni; Zhenxin Fan; Peter Marx; Belen Lorente-Galdos; Holly Beale; Oscar Ramirez; Farhad Hormozdiari; Can Alkan; Carles Vilà

    2014-01-01

    To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we generated high-quality genome sequences from three gray wolves, one from each of the three putative centers of dog domestication, two basal dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. Analysis of these sequences supports a demographic model in which dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-di...

  6. Genome Sequencing Highlights the Dynamic Early History of Dogs

    OpenAIRE

    Freedman, A.H.; Gronau, I.; Schweizer, R.M.; Han, E; Silva, P.M.; Galaverni, M.; Fan, Z; Marx, P; Lorente-Galdos, B.; Beale, H.; Ramirez, O.; Hormozdiari, Fereydoun; Alkan, Can; Vilà, Carles; Geffen, E

    2014-01-01

    To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we generated high-quality genome sequences from three gray wolves, one from each of the three putative centers of dog domestication, two basal dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. Analysis of these sequences supports a demographic model in which dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-di...

  7. Sequencing Crop Genomes: A Gateway to Improve Tropical Agriculture

    OpenAIRE

    Thottathil, Gincy Paily; Jayasekaran, Kandakumar; Othman, Ahmad Sofiman

    2016-01-01

    Agricultural development in the tropics lags behind development in the temperate latitudes due to the lack of advanced technology, and various biotic and abiotic factors. To cope with the increasing demand for food and other plant-based products, improved crop varieties have to be developed. To breed improved varieties, a better understanding of crop genetics is necessary. With the advent of next-generation DNA sequencing technologies, many important crop genomes have been sequenced. Primary ...

  8. The impact of next-generation sequencing on genomics

    OpenAIRE

    Zhang, Jun; Chiodini, Rod; Badr, Ahmed; Zhang, Genfa

    2011-01-01

    This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. But, the massive data produced by NGS also presents a significan...

  9. Analysis of Chimpanzee History Based on Genome Sequence Alignments

    OpenAIRE

    Caswell, Jennifer L.; Richter, Daniel J.; Neubauer, Julie; Schirmer, Christine; Gnerre, Sante; Mallick, Swapan; Reich, David Emil

    2008-01-01

    Population geneticists often study small numbers of carefully chosen loci, but it has become possible to obtain orders of magnitude more data from overlaps of genome sequences. Here, we generate tens of millions of base pairs of multiple sequence alignments from combinations of three western chimpanzees, three central chimpanzees, an eastern chimpanzee, a bonobo, a human, an orangutan, and a macaque. Analysis provides a more precise understanding of demographic history than was previously...

  10. Complete mitochondrial genome sequence of Romanogobio tenuicorpus (Amur whitefin gudgeon).

    Science.gov (United States)

    Dong, Fang; Tong, Guang-Xiang; Kuang, You-Yi; Sun, Xiao-Wen

    2015-01-01

    Amur whitefin gudgeon (Romanogobio tenuicorpus) belongs to the family Cyprinidae and is a freshwater aquaculture species in China. In this report, we determined the complete mitochondrial genome sequence of Romanogobio tenuicorpus, which is a 16,600 bp circular molecule with 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and a control region. The conserved sequence blocks CSB1, CSB2 and CSB3 were also detected. PMID:24409923

  11. Molecular evolution of herpesviruses: genomic and protein sequence comparisons.

    OpenAIRE

    Karlin, S; Mocarski, E S; Schachtel, G A

    1994-01-01

    Phylogenetic reconstruction of herpesvirus evolution is generally founded on amino acid sequence comparisons of specific proteins. These are relevant to the evolution of the specific gene (or set of genes), but the resulting phylogeny may vary depending on the particular sequence chosen for analysis (or comparison). In the first part of this report, we compare 13 herpesvirus genomes by using a new multidimensional methodology based on distance measures and partial orderings of dinucleotide re...

  12. Mitochondrial genome sequences and comparative genomics of Phytophthora ramorum and P. sojae

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Frank N.; Douda, Bensasson; Tyler, Brett M.; Boore, Jeffrey L.

    2007-01-01

    The complete sequences of the mitochondrial genomes of the oomycetes Phytophthora ramorum and P. sojae were determined during the course of their complete nuclear genome sequencing (Tyler et al. 2006). Both are circular, with sizes of 39,314 bp for P. ramorum and 42,975 bp for P. sojae. Each contains a total of 37 identifiable protein-encoding genes, 25 or 26 tRNAs (P. sojae and P. ramorum, respectively) specifying 19 amino acids, and a variable number of ORFs (7 for P. ramorum and 12 for P. sojae) which are potentially additional functional genes. Non-coding regions comprise approximately 11.5 percent and 18.4 percent of the genomes of P. ramorum and P. sojae, respectively. Relative to P. sojae, there is an inverted repeat of 1,150 bp in P. ramorum that includes an unassigned unique ORF, a tRNA gene, and adjacent non-coding sequences, but otherwise the gene order in both species is identical. Comparisons of these genomes with published sequences of the P. infestans mitochondrial genome reveal a number of similarities, but the gene order in P. infestans differs in two adjacent locations due to inversions. Sequence alignments of the three genomes indicated sequence conservation ranging from 75 to 85 percent and that specific regions were more variable than others.

  13. Targeted or whole genome sequencing of formalin fixed tissue samples: potential applications in cancer genomics.

    Science.gov (United States)

    Munchel, Sarah; Hoang, Yen; Zhao, Yue; Cottrell, Joseph; Klotzle, Brandy; Godwin, Andrew K; Koestler, Devin; Beyerlein, Peter; Fan, Jian-Bing; Bibikova, Marina; Chien, Jeremy

    2015-09-22

    Current genomic studies are limited by the poor availability of fresh-frozen tissue samples. Although formalin-fixed diagnostic samples are in abundance, they are seldom used in current genomic studies because of concern about formalin-fixation artifacts. Better characterization of these artifacts will allow the use of archived clinical specimens in translational and clinical research studies. To provide a systematic analysis of formalin-fixation artifacts on Illumina sequencing, we generated 26 DNA sequencing data sets from 13 pairs of matched formalin-fixed paraffin-embedded (FFPE) and fresh-frozen (FF) tissue samples. The results indicate a high rate of concordant calls between matched FF/FFPE pairs at reference and variant positions in three commonly used sequencing approaches (whole genome, whole exome, and targeted exon sequencing). Global mismatch rates and C · G > T · A substitutions were comparable between matched FF/FFPE samples, and discordant rates were low (<0.26%) in all samples. Finally, low-pass whole genome sequencing produces similar patterns of copy number alterations between FF/FFPE pairs. The results from our studies suggest the potential use of diagnostic FFPE samples for cancer genomic studies to characterize and catalog variations in cancer genomes. PMID:26305677

  14. Identification of cancer-driver genes in focal genomic alterations from whole genome sequencing data.

    Science.gov (United States)

    Jang, Ho; Hur, Youngmi; Lee, Hyunju

    2016-01-01

    DNA copy number alterations (CNAs) are the main genomic events that occur during the initiation and development of cancer. Distinguishing driver aberrant regions from passenger regions, which might contain candidate target genes for cancer therapies, is an important issue. Several methods for identifying cancer-driver genes from multiple cancer patients have been developed for single nucleotide polymorphism (SNP) arrays. However, for NGS data, methods for the SNP array cannot be directly applied because of different characteristics of NGS such as higher resolutions of data without predefined probes and incorrectly mapped reads to reference genomes. In this study, we developed a wavelet-based method for identification of focal genomic alterations for sequencing data (WIFA-Seq). We applied WIFA-Seq to whole genome sequencing data from glioblastoma multiforme, ovarian serous cystadenocarcinoma and lung adenocarcinoma, and identified focal genomic alterations, which contain candidate cancer-related genes as well as previously known cancer-driver genes. PMID:27156852

  16. Sequence Determination from Overlapping Fragments: A Simple Model of Whole-Genome Shotgun Sequencing

    Science.gov (United States)

    Derrida, Bernard; Fink, Thomas M.

    2002-02-01

    Assembling fragments randomly sampled from along a sequence is the basis of whole-genome shotgun sequencing, a technique used to map the DNA of the human and other genomes. We calculate the probability that a random sequence can be recovered from a collection of overlapping fragments. We provide an exact solution for an infinite alphabet and in the case of constant overlaps. For the general problem we apply two assembly strategies and give the probability that the assembly puzzle can be solved in the limit of infinitely many fragments.

  17. A Pan-HIV Strategy for Complete Genome Sequencing.

    Science.gov (United States)

    Berg, Michael G; Yamaguchi, Julie; Alessandri-Gradt, Elodie; Tell, Robert W; Plantier, Jean-Christophe; Brennan, Catherine A

    2016-04-01

    Molecular surveillance is essential to monitor HIV diversity and track emerging strains. We have developed a universal library preparation method (HIV-SMART [i.e., switching mechanism at 5' end of RNA transcript]) for next-generation sequencing that harnesses the specificity of HIV-directed priming to enable full genome characterization of all HIV-1 groups (M, N, O, and P) and HIV-2. Broad application of the HIV-SMART approach was demonstrated using a panel of diverse cell-cultured virus isolates. HIV-1 non-subtype B-infected clinical specimens from Cameroon were then used to optimize the protocol to sequence directly from plasma. When multiplexing 8 or more libraries per MiSeq run, full genome coverage at a median ∼2,000× depth was routinely obtained for either sample type. The method reproducibly generated the same consensus sequence, consistently identified viral sequence heterogeneity present in specimens, and at viral loads of ≤4.5 log copies/ml yielded sufficient coverage to permit strain classification. HIV-SMART provides an unparalleled opportunity to identify diverse HIV strains in patient specimens and to determine phylogenetic classification based on the entire viral genome. Easily adapted to sequence any RNA virus, this technology illustrates the utility of next-generation sequencing (NGS) for viral characterization and surveillance. PMID:26699702

  18. A shot in the genome: how accurately do shotgun 454 sequences represent a genome?

    Directory of Open Access Journals (Sweden)

    Meglécz Emese

    2012-05-01

    Background: Next generation sequencing (NGS) provides a valuable method to quickly obtain sequence information from non-model organisms at a genomic scale. In principle, if sequencing is not targeted for a genomic region or sequence type (e.g. coding region, microsatellites), NGS reads can be used as a genome snapshot and provide information on the different types of sequences in the genome. However, no study has ascertained if a typical 454 dataset of low coverage (1/4-1/8 of a PicoTiter plate, leading to generally less than 0.1x of coverage) represents all parts of genomes equally. Findings: Partial genome shotgun sequencing of total DNA (without enrichment) on a 454 NGS platform was used to obtain reads of Apis mellifera (454 reads hereafter). These 454 reads were compared to the assembled chromosomes of this species in three different aspects: (i) dimer and trimer compositions, (ii) the distribution of mapped 454 sequences along the chromosomes and (iii) the numbers of different classes of microsatellites. Highly significant chi-square tests for all three types of analyses indicated that the 454 data is not a perfect random sample of the genome. Only the number of 454 reads mapped to each of the 16 chromosomes and the number of microsatellites pooled by motif (repeat unit length) were not significantly different from the expected values. However, a very strong correlation (correlation coefficients greater than 0.97) was observed between most of the 454 variables (the number of different dimers and trimers, the number of 454 reads mapped to each chromosome fragment of one Mb, the number of 454 reads mapped to each chromosome, the number of microsatellites of each class) and their corresponding genomic variables. Conclusions: The results of chi square tests suggest that 454 shotgun reads cannot be regarded as a perfect representation of the genome especially if the comparison is done on a finer scale (e.g. chromosome fragments instead of whole

  19. Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum.

    Science.gov (United States)

    Grativol, Clícia; Regulski, Michael; Bertalan, Marcelo; McCombie, W Richard; da Silva, Felipe Rodrigues; Zerlotini Neto, Adhemar; Vicentini, Renato; Farinelli, Laurent; Hemerly, Adriana Silva; Martienssen, Robert A; Ferreira, Paulo Cavalcanti Gomes

    2014-07-01

    Many economically important crops have large and complex genomes that hamper their sequencing by standard methods such as whole genome shotgun (WGS). Large tracts of methylated repeats occur in plant genomes that are interspersed by hypomethylated gene-rich regions. Gene-enrichment strategies based on methylation profiles offer an alternative to sequencing repetitive genomes. Here, we have applied methyl filtration with McrBC endonuclease digestion to enrich for euchromatic regions in the sugarcane genome. To verify the efficiency of methylation filtration and the assembly quality of sequences submitted to gene-enrichment strategy, we have compared assemblies using methyl-filtered (MF) and unfiltered (UF) libraries. The use of methyl filtration allowed a better assembly by filtering out 35% of the sugarcane genome and by producing 1.5× more scaffolds and 1.7× more assembled Mb in length compared with the unfiltered dataset. The coverage of sorghum coding sequences (CDS) by MF scaffolds was at least 36% higher than by the use of UF scaffolds. Using MF technology, we increased by 134× the coverage of gene regions of the monoploid sugarcane genome. The MF reads assembled into scaffolds that covered all genes of the sugarcane bacterial artificial chromosomes (BACs), 97.2% of sugarcane expressed sequence tags (ESTs), 92.7% of sugarcane RNA-seq reads and 98.4% of sorghum protein sequences. Analysis of MF scaffolds from encoded enzymes of the sucrose/starch pathway discovered 291 single-nucleotide polymorphisms (SNPs) in the wild sugarcane species, S. spontaneum and S. officinarum. A large number of microRNA genes was also identified in the MF scaffolds. The information achieved by the MF dataset provides a valuable tool for genomic research in the genus Saccharum and for improvement of sugarcane as a biofuel crop. PMID:24773339

  20. [Genome sequencing and personalized medicine: perspectives and limitations].

    Science.gov (United States)

    Le Gall, Jean-Yves; Debré, Patrice

    2014-01-01

    DNA sequencing technologies have advanced at an exponential rate in recent years: the first human genome was sequenced in 2001 after many years of effort by dozens of international laboratories at a cost of tens of millions of dollars, while in 2013 a genome can be sequenced within 24 hours for a few hundred dollars (exome sequencing takes only a few hours). More and more hospital laboratories are acquiring new high-throughput sequencing devices ("next-generation sequencers", NGS), allowing them to analyze tens or hundreds of genes, or even the entire exome. This is having a major impact on medical concepts and practices, especially with respect to genetics and oncology. This ability to search for mutations simultaneously in a large number of genes is finding applications in the diagnosis of Mendelian diseases (including at birth), routine screening for heterozygotes, and pre-conception diagnosis. NGS is now sufficiently sensitive to analyze circulating fetal DNA in maternal blood (cell-free fetal DNA, cffDNA), enabling applications such as non-invasive diagnosis of fetal sex (and X-linked diseases), fetal rhesus among rhesus-negative women, trisomy and, in the near future, Mendelian mutations. Data on multifactorial diseases are still preliminary, but it should soon be possible to identify "strong" factors of genetic predisposition that have so far been beyond the scope of genome-wide association studies (GWAS). In the field of constitutional oncogenetics, NGS can also be used for simultaneous analysis of genes involved in "hereditary" cancers (21 breast cancer genes, 6 colon cancer genes, etc.). More generally, NGS can identify all genomic abnormalities (deletions, translocations, mutations) in a given malignant tissue (hemopathy or solid tumor), and has the potential to distinguish important mutations (those that drive tumor progression) from "bystander" or accessory mutations, and also to identify "druggable" mutations amenable to targeted therapies

  1. Draft Genome Sequence of Rice Isolate Pseudomonas chlororaphis EA105

    OpenAIRE

    McCully, Lucy M.; Bitzer, Adam S.; Spence, Carla A.; Bais, Harsh P.; Silby, Mark W.

    2014-01-01

    Pseudomonas chlororaphis EA105, a strain isolated from rice rhizosphere, has shown antagonistic activities against a rice fungal pathogen, and could be important in defense against rice blast. We report the draft genome sequence of EA105, which has an estimated size of 6.6 Mb.

  2. Complete Genome Sequence of the Haloalkaliphilic, Hydrogen Producing Halanaerobium hydrogenoformans

    Energy Technology Data Exchange (ETDEWEB)

    Brown, Steven D [ORNL; Begemann, Matthew B [University of Wisconsin, Madison; Mormile, Dr. Melanie R. [Missouri University of Science and Technology; Wall, Judy D. [University of Missouri; Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Samual [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Elias, Dwayne A [ORNL

    2011-01-01

    Halanaerobium hydrogenoformans is an alkaliphilic bacterium capable of biohydrogen production at pH 11 and 7% (w/v) salt. We present the 2.6 Mb genome sequence to provide insights into its physiology and potential for bioenergy applications.

  3. Genome Sequence of the Yeast Cyberlindnera fabianii (Hansenula fabianii)

    OpenAIRE

    Freel, Kelle C.; Sarilar, Veronique; Neuvéglise, Cécile; Devillers, Hugo; Friedrich, Anne; Schacherer, Joseph

    2014-01-01

    The yeast Cyberlindnera fabianii is used in wastewater treatment and in the fermentation of alcoholic beverages, and it has caused blood infections. To assist in the accurate identification of this species, and to determine the genetic basis for properties involved in fermentation and water treatment, we sequenced and annotated the genome of C. fabianii (YJS4271).

  4. Complete Genome Sequence of Haemophilus parasuis SH0165

    OpenAIRE

    Yue, Min; Yang, Fan; Yang, Jian; Bei, Weicheng; Cai, Xuwang; Chen, Lihong; Dong, Jie; Zhou, Rui; Jin, Meilin; Jin, Qi; Chen, Huanchun

    2008-01-01

    Haemophilus parasuis is the causative agent of Glässer's disease, which produces big losses in swine populations worldwide. H. parasuis SH0165, belonging to the dominant serovar 5 in China, is a clinically isolated strain with high-level virulence. Here, we report the first completed genome sequence of this species.

  5. Complete Genome Sequence of a Novel Human Betapapillomavirus, HPV-159

    OpenAIRE

    Kocjan, Boštjan J.; Hošnjak, Lea; Seme, Katja; Poljak, Mario

    2013-01-01

    A novel human papillomavirus (HPV), now officially recognized as HPV-159, isolated from an anal swab, was fully cloned, sequenced, and genetically characterized. HPV-159 has a genomic organization that is typical of cutaneotrophic HPV types, and it belongs to the genus Betapapillomavirus.

  6. Complete Genome Sequence of Beijerinckia indica subsp. indica

    OpenAIRE

    Tamas, Ivica; Dedysh, Svetlana N.; Liesack, Werner; Stott, Matthew B.; Alam, Maqsudul; Murrell, J. Colin; Dunfield, Peter F.

    2010-01-01

    Beijerinckia indica subsp. indica is an aerobic, acidophilic, exopolysaccharide-producing, N2-fixing soil bacterium. It is a generalist chemoorganotroph that is phylogenetically closely related to facultative and obligate methanotrophs of the genera Methylocella and Methylocapsa. Here we report the full genome sequence of this bacterium.

  7. The mitochondrial genome sequence of the Tasmanian tiger (Thylacinus cynocephalus)

    DEFF Research Database (Denmark)

    Miller, Webb; Drautz, Daniela I; Janecka, Jan E; Lesk, Arthur M; Ratan, Aakrosh; Tomsho, Lynn P; Packard, Mike; Zhang, Yeting; McClellan, Lindsay R; Qi, Ji; Zhao, Fangqing; Gilbert, M Thomas P; Dalén, Love; Arsuaga, Juan Luis; Ericson, Per G P; Huson, Daniel H; Helgen, Kristofer M; Murphy, William J; Götherström, Anders; Schuster, Stephan C

    2009-01-01

    We report the first two complete mitochondrial genome sequences of the thylacine (Thylacinus cynocephalus), or so-called Tasmanian tiger, extinct since 1936. The thylacine's phylogenetic position within australidelphian marsupials has long been debated, and here we provide strong support for the...

  8. Complete Genome Sequence of Robiginitalea biformata HTCC2501

    OpenAIRE

    Oh, Hyun-Myung; Giovannoni, Stephen J.; Lee, Kiyoung; Ferriera, Steve; Johnson, Justin; Cho, Jang-Cheon

    2009-01-01

    Robiginitalea biformata HTCC2501, isolated from the Sargasso Sea by dilution-to-extinction culturing, has been known as an aerobic chemoheterotroph with carotenoid pigments and dimorphic growth phases. Here, we announce the complete sequence of the R. biformata HTCC2501 genome, which contains genes for carotenoid biosynthesis and several macromolecule-degrading enzymes.

  9. Genome sequence of the human pathogen Vibrio cholerae Amazonia.

    NARCIS (Netherlands)

    Thompson, C.C.; Marin, M.A.; Dias, G.M.; Dutilh, B.E.; Edwards, R.A.; Iida, T.; Thompson, F.L.; Vicente, A.C.

    2011-01-01

    Vibrio cholerae O1 Amazonia is a pathogen that was isolated from cholera-like diarrhea cases in at least two countries, Brazil and Ghana. Based on multilocus sequence analysis, this lineage belongs to a distinct profile compared to strains from El Tor and classical biotypes. The genomic analysis rev

  10. Genome Sequence of the Tick-Borne Pathogen Rickettsia raoultii.

    Science.gov (United States)

    El Karkouri, Khalid; Mediannikov, Oleg; Robert, Catherine; Raoult, Didier; Fournier, Pierre-Edouards

    2016-01-01

    Rickettsia raoultii is a tick-associated spotted fever group (SFG) organism, causing scalp eschar and neck lymphadenopathy after tick bite (SENLAT) in humans. We report here the genome sequence of R. raoultii strain Khabarovsk(T) (CSUR R3(T), ATCC VR-1596(T)), which was isolated from a Dermacentor silvarum tick collected in Russia. PMID:27103706

  11. Genome Sequence of the Tick-Borne Pathogen Rickettsia raoultii

    OpenAIRE

    El Karkouri, Khalid; Mediannikov, Oleg; Robert, Catherine; Raoult, Didier; Fournier, Pierre-Edouards

    2016-01-01

    Rickettsia raoultii is a tick-associated spotted fever group (SFG) organism, causing scalp eschar and neck lymphadenopathy after tick bite (SENLAT) in humans. We report here the genome sequence of R. raoultii strain KhabarovskT (CSUR R3T, ATCC VR-1596T), which was isolated from a Dermacentor silvarum tick collected in Russia.

  12. Genome Sequence of the Paleopolyploid Soybean (Glycine max (L.) Merr.)

    Science.gov (United States)

    We report the genome sequence for soybean (Glycine max var. Williams 82), one of the most important crop plants worldwide because of its ability to produce both protein and oil. Soybean is a recently domesticated legume that plays a vital role in crop rotation as it fixes atmospheric nitrogen via s...

  13. Draft Genome Sequence of Bacillus subtilis strain KATMIRA1933

    OpenAIRE

    Karlyshev, Andrey V.; Melnikov, Vyacheslav G.; Chikindas, Michael L.

    2014-01-01

    In this report, we present a draft sequence of Bacillus subtilis KATMIRA1933. Previous studies demonstrated probiotic properties of this strain partially attributed to production of an antibacterial compound, subtilosin. Comparative analysis of this strain’s genome with that of a commercial probiotic strain, B. subtilis Natto, is presented.

  14. Complete Genome Sequences of Four Isolates of Plutella xylostella Granulovirus

    Science.gov (United States)

    2016-01-01

    Granuloviruses are widespread pathogens of Plutella xylostella L. (diamondback moth) and potential biopesticides for control of this global insect pest. We report the complete genomes of four Plutella xylostella granulovirus isolates from China, Malaysia, and Taiwan exhibiting pairs of noncoding, homologous repeat regions with significant sequence variation but equivalent length. PMID:27365355

  15. Draft Genome Sequence of Streptococcus agalactiae PR06

    OpenAIRE

    MZ, Irma Syakina; L. K. Teh; Salleh, M. Z.

    2013-01-01

    Streptococcus agalactiae (group B streptococcus [GBS]) is a Gram-positive bacterium that was first recognized as a causative agent of bovine mastitis. S. agalactiae has subsequently emerged as a significant cause of human diseases. Here, we report the draft genome sequence of S. agalactiae PR06, which was isolated from a septicemic patient in a local hospital in Malaysia.

  16. Draft Genome Sequence of Pectobacterium wasabiae Strain CFIA1002.

    Science.gov (United States)

    Yuan, Kat Xiaoli; Adam, Zaky; Tambong, James; Lévesque, C André; Chen, Wen; Lewis, Christopher T; De Boer, Solke H; Li, Xiang Sean

    2014-01-01

    Pectobacterium wasabiae, originally causing soft rot disease in horseradish in Japan, was recently found to cause blackleg-like symptoms on potato in the United States, Canada, and Europe. A draft genome sequence of a Canadian potato isolate of P. wasabiae CFIA1002 will enhance the characterization of its pathogenicity and host specificity features. PMID:24831134

  17. Draft Genome Sequence of Pectobacterium wasabiae Strain CFIA1002

    OpenAIRE

    Yuan, Kat (Xiaoli); Adam, Zaky; Tambong, James; Lévesque, C. André; Chen, Wen; Lewis, Christopher T.; De Boer, Solke H.; Li, Xiang

    2014-01-01

    Pectobacterium wasabiae, originally causing soft rot disease in horseradish in Japan, was recently found to cause blackleg-like symptoms on potato in the United States, Canada, and Europe. A draft genome sequence of a Canadian potato isolate of P. wasabiae CFIA1002 will enhance the characterization of its pathogenicity and host specificity features.

  18. Draft Genome Sequences of the Turfgrass Pathogen Sclerotinia homoeocarpa.

    Science.gov (United States)

    Green, Robert; Sang, Hyunkyu; Chang, Taehyun; Allan-Perkins, Elisha; Petit, Elsa; Jung, Geunhwa

    2016-01-01

    Sclerotinia homoeocarpa (F. T. Bennett) is one of the most economically important pathogens on high-amenity cool-season turfgrasses, where it causes dollar spot. To understand the genetic mechanisms of fungicide resistance, which has become highly prevalent, the whole genomes of two isolates with varied resistance levels to fungicides were sequenced. PMID:26868400

  19. Complete Genome Sequence of Bacillus thuringiensis Bacteriophage BMBtp2

    OpenAIRE

    Dong, Zhaoxia; Peng, Donghai; Wang, Yueying; Zhu, Lei; Ruan, Lifang; Sun, Ming

    2013-01-01

    Bacillus thuringiensis is an insect pathogen which has been widely used for biocontrol. During B. thuringiensis fermentation, lysogenic bacteriophages cause severe losses of yield. Here, we announce the complete genome sequence of a bacteriophage, BMBtp2, induced from B. thuringiensis strain YBT-1765; the sequence may help to clarify the mechanism involved in bacteriophage contamination.

  20. Draft Genome Sequence of Halomonas smyrnensis AAD6T

    OpenAIRE

    Sogutcu, Elif; Emrence, Zeliha; Arikan, Muzzaffer; Cakiris, Aris; Abaci, Neslihan; Öner, Ebru Toksoy; Üstek, Duran; Arga, Kazim Yalcin

    2012-01-01

    Halomonas smyrnensis AAD6T is a Gram-negative, aerobic, exopolysaccharide-producing, and moderately halophilic bacterium that produces levan, a fructose homopolymer with many potential uses in various industries. We report the draft genome sequence of H. smyrnensis AAD6T, which will accelerate research on the rational design and optimization of microbial levan production.