WorldWideScience

Sample records for chloroplast dna sequence

  1. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2012-11-01

    Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of chloroplast genomes.
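
The coverage percentages quoted in this abstract are fractions of reference positions spanned by at least one aligned read. The sketch below shows one way such a figure could be computed from alignment intervals. It is not the authors' pipeline: the function name `covered_fraction`, the half-open interval convention, and the toy intervals are all assumptions made for illustration.

```python
def covered_fraction(intervals, ref_length):
    """Fraction of a reference covered by half-open [start, end) intervals.

    Hypothetical helper: intervals stand in for read alignments on a
    reference of length ref_length (both invented for this example).
    """
    covered = 0
    current_end = 0  # right edge of the region already counted
    for start, end in sorted(intervals):
        start = max(start, current_end)  # skip positions already counted
        end = min(end, ref_length)       # clip to the reference
        if end > start:
            covered += end - start
            current_end = end
    return covered / ref_length


# Toy data, not taken from the study: a 1,000 bp reference with one gap.
toy_intervals = [(0, 400), (350, 900), (950, 1000)]
fraction = covered_fraction(toy_intervals, ref_length=1000)
print(f"{fraction:.1%}")  # prints 95.0%
```

Sorting the intervals first lets overlaps be handled in a single pass, so overlapping reads are never counted twice.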

  2. GENETIC POLYMORPHISM IN GYMNODINIUM GALATHEANUM CHLOROPLAST DNA SEQUENCES AND DEVELOPMENT OF A MOLECULAR DETECTION ASSAY. (R827084)

    Science.gov (United States)

    Nuclear and chloroplast-encoded small subunit ribosomal DNA sequences were obtained from several strains of the toxic dinoflagellate Gymnodinium galatheanum. Phylogenetic analyses and comparison of sequences indicate that the chloroplast sequences show a higher degree of se...

  3. Genetic polymorphism in Gymnodinium galatheanum chloroplast DNA sequences and development of a molecular detection assay.

    Science.gov (United States)

    Tengs, T; Bowers, H A; Ziman, A P; Stoecker, D K; Oldach, D W

    2001-02-01

    Nuclear and chloroplast-encoded small subunit ribosomal DNA sequences were obtained from several strains of the toxic dinoflagellate Gymnodinium galatheanum. Phylogenetic analyses and comparison of sequences indicate that the chloroplast sequences show a higher degree of sequence divergence than the nuclear homologue. The chloroplast sequences were chosen as targets for the development of a 5'--3' exonuclease assay for detection of the organism. The assay has a very high degree of specificity and has been used to screen environmental water samples from a fish farm where the presence of this dinoflagellate species has previously been associated with fish kills. Various hypotheses for the derived nature of the chloroplast sequences are discussed, as well as what is known about the toxicity of the species.

  4. High-throughput sequencing of three Lemnoideae (duckweeds) chloroplast genomes from total DNA.

    Directory of Open Access Journals (Sweden)

    Wenqin Wang

    BACKGROUND: Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaptation of aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily Lemnoideae is such a collection of aquatic species, which because of photosynthesis are among the fastest growing plants on earth. METHODS: We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana, by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. CONCLUSIONS: This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. More than 1,000-fold coverage of the chloroplast genome was reached from total DNA by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.

  5. Complete chloroplast DNA sequence from a Korean endemic genus, Megaleranthis saniculifolia, and its evolutionary implications.

    Science.gov (United States)

    Kim, Young-Kyu; Park, Chong-wook; Kim, Ki-Joong

    2009-03-31

    The chloroplast DNA sequence of Megaleranthis saniculifolia, an endemic and monotypic endangered plant species, was completed in this study (GenBank FJ597983). The genome is 159,924 bp in length. It harbors a pair of IR regions consisting of 26,608 bp each. The lengths of the LSC and SSC regions are 88,326 bp and 18,382 bp, respectively. The structural organizations, gene and intron contents, gene orders, AT contents, codon usages, and transcription units of the Megaleranthis chloroplast genome are similar to those of typical land plant cp DNAs. However, the detailed features of the Megaleranthis chloroplast genome are substantially different from those of Ranunculus, which belongs to the same family, the Ranunculaceae. First, the Megaleranthis cp DNA was 4,797 bp longer than that of Ranunculus due to an expansion of the IR region into the SSC region and duplicated sequence elements in several spacer regions of the Megaleranthis cp genome. Second, the chloroplast genomes of Megaleranthis and Ranunculus show 5.6% sequence divergence in the coding regions, 8.9% sequence divergence in the intron regions, and 18.7% sequence divergence in the intergenic spacer regions. In both the coding and noncoding regions, average nucleotide substitution rates differed markedly, depending on the genome position. Our data strongly implicate the positional effects of the evolutionary modes of chloroplast genes. The genes showing higher levels of base substitutions also have higher incidences of indel mutations and low Ka/Ks ratios. A total of 54 simple sequence repeat loci were identified from the Megaleranthis cp genome. The existence of rich cp SSR loci in the Megaleranthis cp genome provides a rare opportunity to study the population genetic structures of this endangered species. Our phylogenetic trees based on the two independent markers, the nuclear ITS and chloroplast matK sequences, strongly support the inclusion of Megaleranthis in Trollius. Therefore, our
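
The region lengths reported in this abstract can be checked against the stated genome size, since a quadripartite chloroplast genome consists of the LSC, the SSC and two copies of the IR. A minimal sketch of that arithmetic follows; the numbers are taken from the abstract, while the function and variable names are illustrative only.

```python
# Region lengths (bp) as reported in the abstract above.
LSC = 88_326
SSC = 18_382
IR = 26_608            # length of one IR copy; the genome carries two
REPORTED_TOTAL = 159_924


def quadripartite_length(lsc, ssc, ir):
    """Total length of a genome made of LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
ir_share = 2 * IR / total  # fraction of the genome occupied by both IRs
print(total, total == REPORTED_TOTAL, f"{ir_share:.1%}")
# prints: 159924 True 33.3%
```

The sum agrees with the reported 159,924 bp, and the two IR copies together account for roughly a third of the genome.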

  6. Conflict amongst chloroplast DNA sequences obscures the phylogeny of a group of Asplenium ferns.

    Science.gov (United States)

    Shepherd, Lara D; Holland, Barbara R; Perrie, Leon R

    2008-07-01

    A previous study of the relationships amongst three subgroups of the Austral Asplenium ferns found conflicting signal between the two chloroplast loci investigated. Because organelle genomes like those of chloroplasts and mitochondria are thought to be non-recombining, with a single evolutionary history, we sequenced four additional chloroplast loci with the expectation that this would resolve these relationships. Instead, the conflict was only magnified. Although tree-building analyses favoured one of the three possible trees, one of the alternative trees actually had one more supporting site (six versus five) and received greater support in spectral and neighbor-net analyses. Simulations suggested that chance alone was unlikely to produce strong support for two of the possible trees and none for the third. Likelihood permutation tests indicated that the concatenated chloroplast sequence data appeared to have experienced recombination. However, recombination between the chloroplast genomes of different species would be highly atypical, and corollary supporting observations, like chloroplast heteroplasmy, are lacking. Wider taxon sampling clarified the composition of the Austral group, but the conflicting signal meant analyses (e.g., morphological evolution, biogeographic) conditional on a well-supported phylogeny could not be performed.

  7. The complete chloroplast DNA sequence of the green alga Oltmannsiellopsis viridis reveals a distinctive quadripartite architecture in the chloroplast genome of early diverging ulvophytes

    Directory of Open Access Journals (Sweden)

    Lemieux Claude

    2006-02-01

    Background: The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. The basal position of the Prasinophyceae has been well documented, but the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae is currently debated. The four complete chloroplast DNA (cpDNA) sequences presently available for representatives of these classes have revealed extensive variability in overall structure, gene content, intron composition and gene order. The chloroplast genome of Pseudendoclonium (Ulvophyceae), in particular, is characterized by an atypical quadripartite architecture that deviates from the ancestral type by a large inverted repeat (IR) featuring an inverted rRNA operon and a small single-copy (SSC) region containing 14 genes normally found in the large single-copy (LSC) region. To gain insights into the nature of the events that led to the reorganization of the chloroplast genome in the Ulvophyceae, we have determined the complete cpDNA sequence of Oltmannsiellopsis viridis, a representative of a distinct, early diverging lineage. Results: The 151,933 bp IR-containing genome of Oltmannsiellopsis differs considerably from Pseudendoclonium and other chlorophyte cpDNAs in intron content and gene order, but shares close similarities with its ulvophyte homologue at the levels of quadripartite architecture, gene content and gene density. Oltmannsiellopsis cpDNA encodes 105 genes, contains five group I introns, and features many short dispersed repeats. As in Pseudendoclonium cpDNA, the rRNA genes in the IR are transcribed toward the single copy region featuring the genes typically found in the ancestral LSC region, and the opposite single copy region harbours genes characteristic of both the ancestral SSC and LSC regions. The 52 genes that were transferred from the ancestral LSC to SSC region include 12 of those observed in Pseudendoclonium cpDNA. Surprisingly, the overall gene organization of

  8. Complete chloroplast genome and 45S nrDNA sequences of the medicinal plant species Glycyrrhiza glabra and Glycyrrhiza uralensis.

    Science.gov (United States)

    Kang, Sang-Ho; Lee, Jeong-Hoon; Lee, Hyun Oh; Ahn, Byoung Ohg; Won, So Youn; Sohn, Seong-Han; Kim, Jung Sun

    2017-10-06

    Glycyrrhiza uralensis and G. glabra, members of the Fabaceae, are medicinally important species that are native to Asia and Europe. Extracts from these plants are widely used as natural sweeteners because of their much greater sweetness than sucrose. In this study, the three complete chloroplast genomes and five 45S nuclear ribosomal (nr)DNA sequences of these two licorice species and an interspecific hybrid are presented. The chloroplast genomes of G. glabra, G. uralensis and G. glabra × G. uralensis were 127,895 bp, 127,716 bp and 127,939 bp, respectively. The three chloroplast genomes harbored 110 annotated genes, including 76 protein-coding genes, 30 tRNA genes and 4 rRNA genes. The 45S nrDNA sequences were either 5,947 or 5,948 bp in length. Glycyrrhiza glabra and G. glabra × G. uralensis showed two types of nrDNA, while G. uralensis contained a single type. The complete 45S nrDNA sequence unit contains 18S rRNA, ITS1, 5.8S rRNA, ITS2 and 26S rRNA. We identified simple sequence repeat and tandem repeat sequences. We also developed four reliable markers for analysis of Glycyrrhiza diversity authentication.

  9. Accumulation of chloroplast DNA sequences on the Y chromosome of Silene latifolia

    Czech Academy of Sciences Publication Activity Database

    Kejnovský, Eduard; Kubát, Zdeněk; Hobza, Roman; Lengerová, Martina; Sato, S.; Tabata, S.; Fukui, K.; Matsunaga, S.; Vyskot, Boris

    2006-01-01

    Vol. 128, No. 1-3 (2006), pp. 167-175 ISSN 0016-6707 R&D Projects: GA ČR(CZ) GA204/05/2097; GA ČR(CZ) GD204/05/H505; GA AV ČR(CZ) 1QS500040507 Institutional research plan: CEZ:AV0Z50040507 Keywords: accumulation * chloroplast DNA * Y chromosome Subject RIV: BO - Biophysics Impact factor: 1.492, year: 2006

  10. A hybrid swarm population of Pinus densiflora x P. sylvestris hybrids inferred from sequence analysis of chloroplast DNA and morphological characters

    Science.gov (United States)

    To confirm a hybrid swarm population of Pinus densiflora × P. sylvestris in Jilin, China and to study whether shoot apex morphology of 4-year old seedlings can be correlated with the sequence of a chloroplast DNA simple sequence repeat marker (cpDNA SSR), needles and seeds from P. densiflora, P. syl...

  11. Re-exploration of U's Triangle Brassica Species Based on Chloroplast Genomes and 45S nrDNA Sequences.

    Science.gov (United States)

    Kim, Chang-Kug; Seol, Young-Joo; Perumal, Sampath; Lee, Jonghoon; Waminal, Nomar Espinosa; Jayakodi, Murukarthick; Lee, Sang-Choon; Jin, Seungwoo; Choi, Beom-Soon; Yu, Yeisoo; Ko, Ho-Cheol; Choi, Ji-Weon; Ryu, Kyoung-Yul; Sohn, Seong-Han; Parkin, Isobel; Yang, Tae-Jin

    2018-05-09

    The concept of U's triangle, which revealed the importance of polyploidization in plant genome evolution, described natural allopolyploidization events in Brassica using three diploids [B. rapa (A genome), B. nigra (B), and B. oleracea (C)] and derived allotetraploids [B. juncea (AB genome), B. napus (AC), and B. carinata (BC)]. However, comprehensive understanding of Brassica genome evolution has not been fully achieved. Here, we performed low-coverage (2-6×) whole-genome sequencing of 28 accessions of Brassica as well as of Raphanus sativus [R genome] to explore the evolution of six Brassica species based on chloroplast genome and ribosomal DNA variations. Our phylogenomic analyses led to two main conclusions. (1) Intra-species-level chloroplast genome variations are low in the three allotetraploids (2~7 SNPs), but rich and variable in each diploid species (7~193 SNPs). (2) Three allotetraploids maintain two 45S nrDNA types derived from both ancestral species with maternal dominance. Furthermore, this study sheds light on the maternal origin of the AC chloroplast genome. Overall, this study clarifies the genetic relationships of U's triangle species based on a comprehensive genomics approach and provides important genomic resources for correlative and evolutionary studies.

  12. De Novo Assembly of Complete Chloroplast Genomes from Non-model Species Based on a K-mer Frequency-Based Selection of Chloroplast Reads from Total DNA Sequences

    Directory of Open Access Journals (Sweden)

    Shairul Izan

    2017-08-01

    Whole Genome Shotgun (WGS) sequences of plant species often contain an abundance of reads that are derived from the chloroplast genome. Up to now these reads have generally been identified and assembled into chloroplast genomes based on homology to chloroplasts from related species. This re-sequencing approach may select against structural differences between the genomes, especially in non-model species for which no close relatives have been sequenced before. The alternative approach is to de novo assemble the chloroplast genome from total genomic DNA sequences. In this study, we used k-mer frequency tables to identify and extract the chloroplast reads from the WGS reads and assemble these using a highly integrated and automated custom pipeline. Our strategy includes steps aimed at optimizing assemblies and filling gaps which are left due to coverage variation in the WGS dataset. We have successfully de novo assembled three complete chloroplast genomes from plant species with a range of nuclear genome sizes to demonstrate the universality of our approach: Solanum lycopersicum (0.9 Gb), Aegilops tauschii (4 Gb) and Paphiopedilum henryanum (25 Gb). We also highlight the need to optimize the choice of k and the amount of data used. This new and cost-effective method for de novo short read assembly will facilitate the study of complete chloroplast genomes with more accurate analyses and inferences, especially in non-model plant genomes.
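
The idea behind k-mer frequency-based read selection is that chloroplast-derived k-mers occur far more often in total-DNA reads than nuclear ones, because each cell carries many plastid genome copies. The sketch below illustrates that principle only. It is not the authors' pipeline, whose details the abstract does not give: the median-count rule, the threshold, the function names and the toy reads are all assumptions.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """All overlapping substrings of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def select_high_copy_reads(reads, k, min_median_count):
    """Keep reads whose k-mers are, by median, frequent in the dataset.

    Hypothetical selection rule for illustration: a read is kept when
    the median dataset-wide count of its k-mers reaches the threshold.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    selected = []
    for read in reads:
        read_counts = [counts[km] for km in kmers(read, k)]
        if read_counts and median(read_counts) >= min_median_count:
            selected.append(read)
    return selected


# Toy dataset (invented): a high-copy "plastid" fragment sampled many
# times, plus two single-copy "nuclear" reads.
plastid = "ATGGCGTACGTTAGC"
plastid_reads = [plastid[0:10], plastid[3:13], plastid[5:15]] * 5
nuclear_reads = ["TTTTCCCCGG", "CAGTCAGTCA"]
reads = plastid_reads + nuclear_reads

selected = select_high_copy_reads(reads, k=5, min_median_count=5)
print(len(selected), all(r not in selected for r in nuclear_reads))
```

As the abstract notes, both k and the amount of input data affect the result, so in practice the threshold would have to be tuned per dataset rather than fixed as it is here.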

  13. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  14. Identification of Dendrobium species by a candidate DNA barcode sequence: the chloroplast psbA-trnH intergenic region.

    Science.gov (United States)

    Yao, Hui; Song, Jing-Yuan; Ma, Xin-Ye; Liu, Chang; Li, Ying; Xu, Hong-Xi; Han, Jian-Ping; Duan, Li-Sheng; Chen, Shi-Lin

    2009-05-01

    DNA barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Although a consensus has not been reached regarding which DNA sequences can be used as the best plant barcodes, the psbA-trnH spacer region has been tested extensively in recent years. In this study, we hypothesize that the psbA-trnH spacer regions are also effective barcodes for Dendrobium species. We have sequenced the chloroplast psbA-trnH intergenic spacers of 17 Dendrobium species to test this hypothesis. The sequences were found to be significantly different from those of other species, with percentages of variation ranging from 0.3 % to 2.3 % and an average of 1.2 %. In contrast, the intraspecific variation among the Dendrobium species studied ranged from 0 % to 0.1 %. The sequence difference between the psbA-trnH sequences of 17 Dendrobium species and one Bulbophyllum odoratissimum ranged from 2.0 % to 3.1 %, with an average of 2.5 %. Our results support the notion that the psbA-trnH intergenic spacer region could be used as a barcode to distinguish various Dendrobium species and to differentiate Dendrobium species from other adulterating species. Copyright Georg Thieme Verlag KG Stuttgart. New York.
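
Percentages of sequence variation such as those quoted in this abstract are typically derived from pairwise comparisons of aligned sequences. The abstract does not state which distance measure was used, so the sketch below assumes the simplest one, the uncorrected p-distance (proportion of differing sites, ignoring gapped columns). The function name and the toy sequences are illustrative and unrelated to the study's data.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption;
    other gap treatments are possible).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy aligned sequences (invented for the example).
ref = "ACGTACGTAC"
alt = "ACGTTCGTAC"   # one substitution relative to ref
gapped = "ACG-ACGTAC"  # one gap, otherwise identical to ref

print(f"{p_distance(ref, alt):.1%}")     # prints 10.0%
print(f"{p_distance(ref, gapped):.1%}")  # prints 0.0%
```

A barcode region is useful when such distances between species are clearly larger than those within a species, which is the pattern the abstract reports for psbA-trnH.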

  15. Phylogeography of the endangered orchid Dendrobium moniliforme in East Asia inferred from chloroplast DNA sequences.

    Science.gov (United States)

    Ye, Meirong; Liu, Wei; Xue, Qingyun; Hou, Beiwei; Luo, Jing; Ding, Xiaoyu

    2017-11-01

    The aim of the current study was to elucidate the phylogeographic history of Dendrobium moniliforme, an endangered orchid species, based on two chloroplast DNA (cpDNA) markers (trnC-petN and trnE-trnT). One hundred and thirty-five samples were collected from 18 natural populations of D. moniliforme covering the entire range of the Sino-Japanese Floristic Region (SJFR) of East Asia. A total of 35 distinct cpDNA haplotypes were identified in these populations, of which 23 haplotypes were each present in only one sample and thus restricted to a single population. The significantly larger NST value (0.586) than GST (0.328) (p < 0.05) demonstrated the presence of strong phylogeographic structure. Phylogenetic analyses indicated that all haplotypes were clustered into two lineages. The genetic diversity of D. moniliforme was high at the species level, reflected in its haplotype diversity (Hd = 0.8862), nucleotide diversity (Pi = 0.00361), total genetic diversity (HT = 0.9011), and significant differentiation (ΦST = 0.5482). Based on mismatch distribution analysis and neutrality tests, population expansion was evident in all sampled populations and also in all populations sampled in mainland China. Three refuge areas were identified, one each in southwestern China, central-southeastern China, and the CKJ (Taiwan, Japan and Korea) Islands. The results supported the hypothesis that glacial refugia were maintained on different spatial-temporal scales in the SJFR during the last glacial maximum or earlier cold periods, suggesting that Quaternary refugial isolation promoted allopatric speciation of D. moniliforme in East Asia.

  16. Genetic diversity in Breonadia salicina based on intra-species sequence variation of chloroplast DNA spacer sequence

    International Nuclear Information System (INIS)

    Qurainy, F.A.; Gaafar, A.R.Z.

    2014-01-01

    Assessment and knowledge of the genetic diversity and variation within and between populations of rare and endangered plants are very important for effective conservation. Variation in the intergenic spacer sequence of the psbA-trnH locus of the chloroplast genome was assessed within Breonadia salicina (Rubiaceae), a critically endangered plant species endemic to the southwestern part of the Kingdom of Saudi Arabia. Sequence data from 19 individuals in three populations were aligned over 355 bp and revealed nine haplotypes. A high level of haplotype diversity (Hd = 0.842) and a low level of nucleotide diversity (Pi = 0.0058) were detected. Consistently, both hierarchical analysis of molecular variance (AMOVA) and the constructed neighbor-joining tree indicated no genetic differentiation among populations. This level of differentiation between populations or between regions in psbA-trnH sequences may be due to the abundance of ancestral haplotype sharing and the presence of private haplotypes fixed for each population. Furthermore, the results revealed almost the same level of genetic diversity as in Yemeni accessions, with Saudi accessions sharing three of the four haplotypes found in Yemeni accessions. (author)
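
Haplotype diversity (Hd), reported in this record and the previous one, is commonly computed with Nei's estimator, Hd = n/(n-1) × (1 - Σ p_i²), where n is the sample size and p_i the frequency of haplotype i. A minimal sketch is given below. The abstracts do not list per-haplotype counts, so the sample used here is invented and is not expected to reproduce the published values.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


# Hypothetical sample of four individuals carrying three haplotypes.
sample = ["H1", "H1", "H2", "H3"]
print(round(haplotype_diversity(sample), 4))  # prints 0.8333
```

The estimator is 0 when every individual carries the same haplotype and 1 when every individual carries a different one, which is why a high Hd can coexist with low nucleotide diversity when haplotypes differ at only a few sites.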

  17. Phylogenetic inferences of Nepenthes species in Peninsular Malaysia revealed by chloroplast (trnL intron) and nuclear (ITS) DNA sequences

    OpenAIRE

    Bunawan, Hamidun; Yen, Choong Chee; Yaakop, Salmah; Noor, Normah Mohd

    2017-01-01

    Background: The chloroplastic trnL intron and the nuclear internal transcribed spacer (ITS) region were sequenced for 11 Nepenthes species recorded in Peninsular Malaysia to examine their phylogenetic relationship and to evaluate the usage of trnL intron and ITS sequences for phylogenetic reconstruction of this genus. Results: Phylogeny reconstruction was carried out using neighbor-joining, maximum parsimony and Bayesian analyses. All the trees revealed two major clusters, a lowland group consi...

  18. De Novo Assembly of Complete Chloroplast Genomes from Non-model Species Based on a K-mer Frequency-Based Selection of Chloroplast Reads from total DNA Sequences.

    NARCIS (Netherlands)

    Izan, Shairul; Esselink, G.; Visser, R.G.F.; Smulders, M.J.M.; Borm, T.J.A.

    2017-01-01

    Whole Genome Shotgun (WGS) sequences of plant species often contain an abundance of reads that are derived from the chloroplast genome. Up to now these reads have generally been identified and assembled into chloroplast genomes based on homology to chloroplasts from related species. This

  19. Chloroplast DNA sequence of the green alga Oedogonium cardiacum (Chlorophyceae): Unique genome architecture, derived characters shared with the Chaetophorales and novel genes acquired through horizontal transfer

    Directory of Open Access Journals (Sweden)

    Lemieux Claude

    2008-06-01

    Background: To gain insight into the branching order of the five main lineages currently recognized in the green algal class Chlorophyceae and to expand our understanding of chloroplast genome evolution, we have undertaken the sequencing of chloroplast DNA (cpDNA) from representative taxa. The complete cpDNA sequences previously reported for Chlamydomonas (Chlamydomonadales), Scenedesmus (Sphaeropleales), and Stigeoclonium (Chaetophorales) revealed tremendous variability in their architecture, the retention of only few ancestral gene clusters, and derived clusters shared by Chlamydomonas and Scenedesmus. Unexpectedly, our recent phylogenies inferred from these cpDNAs and the partial sequences of three other chlorophycean cpDNAs disclosed two major clades, one uniting the Chlamydomonadales and Sphaeropleales (CS clade) and the other uniting the Oedogoniales, Chaetophorales and Chaetopeltidales (OCC clade). Although molecular signatures provided strong support for this dichotomy and for the branching of the Oedogoniales as the earliest-diverging lineage of the OCC clade, more data are required to validate these phylogenies. We describe here the complete cpDNA sequence of Oedogonium cardiacum (Oedogoniales). Results: Like its three chlorophycean homologues, the 196,547-bp Oedogonium chloroplast genome displays a distinctive architecture. This genome is one of the most compact among photosynthetic chlorophytes. It has an atypical quadripartite structure, is intron-rich (17 group I and 4 group II introns), and displays 99 different conserved genes and four long open reading frames (ORFs), three of which are clustered in the spacious inverted repeat of 35,493 bp. Intriguingly, two of these ORFs (int and dpoB) revealed high similarities to genes not usually found in cpDNA. At the gene content and gene order levels, the Oedogonium genome most closely resembles its Stigeoclonium counterpart. Characters shared by these chlorophyceans but missing in members

  20. Phylogeography of Thlaspi arvense (Brassicaceae) in China Inferred from Chloroplast and Nuclear DNA Sequences and Ecological Niche Modeling

    Directory of Open Access Journals (Sweden)

    Miao An

    2015-06-01

    Thlaspi arvense is a well-known annual farmland weed with worldwide distribution, which can be found from sea level to above 4000 m high on the Qinghai-Tibetan Plateau (QTP). In this paper, a phylogeographic history of T. arvense including 19 populations from China was inferred by using three chloroplast (cp) DNA segments (trnL-trnF, rpl32-trnL and rps16) and one nuclear (n) DNA segment (Fe-regulated transporter-like protein, ZIP). A total of 11 chloroplast haplotypes and six nuclear alleles were identified, and haplotypes unique to the QTP were recognized (C4, C5, C7 and N4). On the basis of molecular dating, haplotypes C4, C5 and C7 separated from the others around 1.58 Ma for cpDNA, which corresponds to the QTP uplift. In addition, this article suggests that the T. arvense populations in China are a mixture of diverged subpopulations as inferred by the hT/vT test (hT ≤ vT, cpDNA) and positive Tajima’s D values (1.87, 0.05 < p < 0.10 for cpDNA and 3.37, p < 0.01 for nDNA). Multimodal mismatch distribution curves and a relatively large shared area of suitable environmental conditions between the Last Glacial Maximum (LGM) and the present time, as recognized by MaxEnt software, reject the sudden expansion population model.

  1. The complete chloroplast genome sequence of the chlorophycean green alga Scenedesmus obliquus reveals a compact gene organization and a biased distribution of genes on the two DNA strands

    Science.gov (United States)

    de Cambiaire, Jean-Charles; Otis, Christian; Lemieux, Claude; Turmel, Monique

    2006-01-01

    Background: The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. While the basal position of the Prasinophyceae is well established, the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae (UTC) remains uncertain. The five complete chloroplast DNA (cpDNA) sequences currently available for representatives of these classes display considerable variability in overall structure, gene content, gene density, intron content and gene order. Among these genomes, that of the chlorophycean green alga Chlamydomonas reinhardtii has retained the least ancestral features. The two single-copy regions, which are separated from one another by the large inverted repeat (IR), have similar sizes, rather than unequal sizes, and differ radically in both gene contents and gene organizations relative to the single-copy regions of prasinophyte and ulvophyte cpDNAs. To gain insights into the various changes that the chloroplast genome underwent during the evolution of chlorophycean green algae, we have sequenced the cpDNA of Scenedesmus obliquus, a member of a distinct chlorophycean lineage. Results: The 161,452 bp IR-containing genome of Scenedesmus features single-copy regions of similar sizes, encodes 96 genes, i.e. only two additional genes (infA and rpl12) relative to its Chlamydomonas homologue, and contains seven group I and two group II introns. It is clearly more compact than the four UTC algal cpDNAs that have been examined so far, displays the lowest proportion of short repeats among these algae and shows a stronger bias in clustering of genes on the same DNA strand compared to Chlamydomonas cpDNA. Like the latter genome, Scenedesmus cpDNA displays only a few ancestral gene clusters. The two chlorophycean genomes share 11 gene clusters that are not found in previously sequenced trebouxiophyte and ulvophyte cpDNAs as well as a few genes that have an unusual structure; however, their single-copy regions differ

  2. Phylogenetic inferences of Nepenthes species in Peninsular Malaysia revealed by chloroplast (trnL intron) and nuclear (ITS) DNA sequences.

    Science.gov (United States)

    Bunawan, Hamidun; Yen, Choong Chee; Yaakop, Salmah; Noor, Normah Mohd

    2017-01-26

    The chloroplastic trnL intron and the nuclear internal transcribed spacer (ITS) region were sequenced for 11 Nepenthes species recorded in Peninsular Malaysia to examine their phylogenetic relationship and to evaluate the usage of trnL intron and ITS sequences for phylogenetic reconstruction of this genus. Phylogeny reconstruction was carried out using neighbor-joining, maximum parsimony and Bayesian analyses. All the trees revealed two major clusters, a lowland group consisting of N. ampullaria, N. mirabilis, N. gracilis and N. rafflesiana, and another containing both intermediately distributed species (N. albomarginata and N. benstonei) and four highland species (N. sanguinea, N. macfarlanei, N. ramispina and N. alba). The trnL intron and ITS sequences proved to provide phylogenetic informative characters for deriving a phylogeny of Nepenthes species in Peninsular Malaysia. To our knowledge, this is the first molecular phylogenetic study of Nepenthes species occurring along an altitudinal gradient in Peninsular Malaysia.

  3. The demise of chloroplast DNA in Arabidopsis.

    Science.gov (United States)

    Rowan, Beth A; Oldenburg, Delene J; Bendich, Arnold J

    2004-09-01

    Although it might be expected that chloroplast DNA (cpDNA) would be stably maintained in mature leaves, we report the surprising observation that cpDNA levels decline during plastid development in Arabidopsis thaliana (Col.) until most of the leaves contain little or no DNA long before the onset of senescence. We measured the cpDNA content in developing cotyledons, rosette leaves, and cauline leaves. The amount of cpDNA per chloroplast decreases as the chloroplasts develop, reaching undetectable levels in mature leaves. In young cauline leaves, most individual molecules of cpDNA are found in complex, branched forms. In expanded cauline leaves, cpDNA is present in smaller branched forms only at the base of the leaf and is virtually absent in the distal part of the leaf. We conclude that photosynthetic activity may persist long after the demise of the cpDNA. Copyright 2004 Springer-Verlag

  4. Biogeography of the Pistia clade (Araceae): based on chloroplast and mitochondrial DNA sequences and Bayesian divergence time inference.

    Science.gov (United States)

    Renner, Susanne S; Zhang, Li-Bing

    2004-06-01

    Pistia stratiotes (water lettuce) and Lemna (duckweeds) are the only free-floating aquatic Araceae. The geographic origin and phylogenetic placement of these unrelated aroids present long-standing problems because of their highly modified reproductive structures and wide geographical distributions. We sampled chloroplast (trnL-trnF and rpl20-rps12 spacers, trnL intron) and mitochondrial sequences (nad1 b/c intron) for all genera implicated as close relatives of Pistia by morphological, restriction site, and sequencing data, and present a hypothesis about its geographic origin based on the consensus of trees obtained from the combined data, using Bayesian, maximum likelihood, parsimony, and distance analyses. Of the 14 genera closest to Pistia, only Alocasia, Arisaema, and Typhonium are species-rich, and the latter two were studied previously, facilitating the choice of representatives that span the roots of these genera. Results indicate that Pistia and the Seychelles endemic Protarum sechellarum are the basalmost branches in a grade comprising the tribes Colocasieae (Ariopsis, Steudnera, Remusatia, Alocasia, Colocasia), Arisaemateae (Arisaema, Pinellia), and Areae (Arum, Biarum, Dracunculus, Eminium, Helicodiceros, Theriophonum, Typhonium). Unexpectedly, all Areae genera are embedded in Typhonium, which throws new light on the geographic history of Areae. A Bayesian analysis of divergence times that explores the effects of multiple fossil and geological calibration points indicates that the Pistia lineage is 90 to 76 million years (my) old. The oldest fossils of the Pistia clade, though not Pistia itself, are 45-my-old leaves from Germany; the closest outgroup, Peltandreae (comprising a few species in Florida, the Mediterranean, and Madagascar), is known from 60-my-old leaves from Europe, Kazakhstan, North Dakota, and Tennessee. Based on the geographic ranges of close relatives, Pistia likely originated in the Tethys region, with Protarum then surviving on the

  5. Chloroplast DNA analysis of Tunisian cork oak populations (Quercus suber L.): sequence variations and molecular evolution of the trnL (UAA)-trnF (GAA) region.

    Science.gov (United States)

    Abdessamad, A; Baraket, G; Sakka, H; Ammari, Y; Ksontini, M; Hannachi, A Salhi

    2016-10-24

    Sequences of the trnL-trnF spacer and combined trnL-trnF region in chloroplast DNA of cork oak (Quercus suber L.) were analyzed to detect polymorphisms and to elucidate molecular evolution and demographic history. The aligned sequences varied in length and nucleotide composition. The overall ratio of transition/transversion (ti/tv) of 0.724 for the intergenic spacer and 0.258 for the pooled sequences were estimated, and indicated that transversions are more frequent than transitions. The molecular evolution and demographic history of Q. suber were investigated. Neutrality tests (Tajima's D and Fu and Li) ruled out the null hypothesis of a strictly neutral model, and Fu's Fs and Ramos-Onsins and Rozas' R2 confirmed the recent expansion of cork oak trees, validating its persistency in North Africa since the last glaciation during the Quaternary. The observed uni-modal mismatch distribution and the Harpending's raggedness index confirmed the demographic history model for cork oak. A phylogenetic dendrogram showed that the distribution of Q. suber trees occurs independently of geographical origin, the relief of the population site, and the bioclimatic stages. The molecular history and cytoplasmic diversity suggest that in situ and ex situ conservation strategies can be recommended for preserving landscape value and facing predictable future climatic changes.
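The transition/transversion (ti/tv) ratio reported above can be illustrated with a minimal pairwise sketch. The abstract does not say how the ratio was estimated, so the simple site-by-site count below is an assumption made for illustration only; the function name `ti_tv_counts` and the toy sequences are hypothetical.

```python
from typing import Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ti_tv_counts(seq_a: str, seq_b: str) -> Tuple[int, int]:
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical sites, gaps, and ambiguity codes.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition;
        # any purine<->pyrimidine change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Toy alignment (hypothetical data, not from the study).
ti, tv = ti_tv_counts("ACGTACGT", "GCGTTCGA")
ratio = ti / tv if tv else float("inf")
print(ti, tv, ratio)
```

A ratio below 1, as reported for both the spacer and the pooled sequences, means transversions outnumber transitions among the observed differences.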

  6. [Reconstruction of the phylogenetic position of larch (Larix sukaczewii Dylis) by sequencing data for the trnK intron of chloroplast DNA].

    Science.gov (United States)

    Bashalkhanov, S I; Konstantinov, Iu M; Verbitskiĭ, D S; Kobzev, V F

    2003-10-01

    To reconstruct the systematic relationships of larch Larix sukaczewii, we used the chloroplast trnK intron sequences of L. decidua, L. sukaczewii, L. sibirica, L. czekanovskii, and L. gmelinii. Analysis of phylogenetic trees constructed using the maximum parsimony and maximum likelihood methods showed a clear divergence of the trnK intron sequences between L. sukaczewii and L. sibirica. This divergence reaches intraspecific level, which supports a previously published hypothesis on the taxonomic isolation of L. sukaczewii.

  7. Sonication-based isolation and enrichment of Chlorella protothecoides chloroplasts for illumina genome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Angelova, Angelina [University of Arizona]; Park, Sang-Hycuk [University of Arizona]; Kyndt, John [Bellevue University]; Fitzsimmons, Kevin [University of Arizona]; Brown, Judith K [University of Arizona]

    2013-09-01

    With the increasing world demand for biofuel, a number of oleaginous algal species are being considered as renewable sources of oil. Chlorella protothecoides Krüger synthesizes triacylglycerols (TAGs) as storage compounds that can be converted into renewable fuel utilizing an anabolic pathway that is poorly understood. The paucity of algal chloroplast genome sequences has been an important constraint to chloroplast transformation and for studying gene expression in TAGs pathways. In this study, the intact chloroplasts were released from algal cells using sonication followed by sucrose gradient centrifugation, resulting in a 2.36-fold enrichment of chloroplasts from C. protothecoides, based on qPCR analysis. The C. protothecoides chloroplast genome (cpDNA) was determined using the Illumina HiSeq 2000 sequencing platform and found to be 84,576 bp (about 84.6 kb) in size, with a GC content of 30.8%. This is the first report of an optimized protocol that uses a sonication step, followed by sucrose gradient centrifugation, to release and enrich intact chloroplasts from a microalga (C. protothecoides) of sufficient quality to permit chloroplast genome sequencing with high coverage, while minimizing nuclear genome contamination. The approach is expected to guide chloroplast isolation from other oleaginous algal species for a variety of uses that benefit from enrichment of chloroplasts, ranging from biochemical analysis to genomics studies.
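GC content, as quoted for the C. protothecoides chloroplast genome, is the percentage of G and C bases among the unambiguous bases of a sequence. The sketch below is a generic illustration, not the authors' pipeline; the function name `gc_content` and the toy sequence are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / total


# Toy sequence (hypothetical): 2 of 8 bases are G or C.
print(gc_content("ATGCATAT"))
```

Ambiguity codes such as N are excluded from the denominator here; other tools may count them, which is one reason reported values can differ slightly.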

  8. Inferring Genetic Variation and Demographic History of Michelia yunnanensis Franch. (Magnoliaceae) from Chloroplast DNA Sequences and Microsatellite Markers

    Directory of Open Access Journals (Sweden)

    Shikang Shen

    2017-04-01

    Michelia yunnanensis Franch. is a traditional ornamental, aromatic, and medicinal shrub that is endemic to Yunnan Province in southwest China. Although the species has a large distribution pattern and is abundant in Yunnan Province, the populations are dramatically declining because of overexploitation and habitat destruction. Studies on the genetic variation and demography of endemic species are necessary to develop effective conservation and management strategies. To generate such knowledge, we used 3 pairs of universal cpDNA markers and 10 pairs of microsatellite markers to assess the genetic diversity, genetic structure, and demographic history of 7 M. yunnanensis populations. We calculated a total of 88 alleles for 10 polymorphic loci and 10 haplotypes for a combined 2,089 bp of cpDNA. M. yunnanensis populations showed high genetic diversity (Ho = 0.551 for nuclear markers and Hd = 0.471 for cpDNA markers) and low genetic differentiation (FST = 0.058). Geographical structure was not found among M. yunnanensis populations. Genetic distance and geographic distance were not correlated (P > 0.05), which indicated that geographic isolation is not the primary cause of the low genetic differentiation of M. yunnanensis. Additionally, M. yunnanensis populations contracted ~20,000–30,000 years ago, and no recent expansion occurred in current populations. Results indicated that the high genetic diversity of the species and within its populations holds promise for effective genetic resource management and sustainable utilization. Thus, we suggest that the conservation and management of M. yunnanensis should address exotic overexploitation and habitat destruction.
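The haplotype diversity (Hd) quoted above is commonly computed with Nei's estimator, Hd = n/(n-1) * (1 - sum of squared haplotype frequencies), where n is the number of sampled individuals. The abstract does not state which software or estimator was used, so the sketch below is an assumption for illustration; the function name and the toy sample are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def haplotype_diversity(haplotypes: Sequence[Hashable]) -> float:
    """Nei's haplotype diversity for one haplotype label per individual."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled individuals")
    # Relative frequency of each distinct haplotype in the sample.
    freqs = [count / n for count in Counter(haplotypes).values()]
    # The n / (n - 1) factor corrects for small sample size.
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


# Toy sample (hypothetical): four individuals carrying three haplotypes.
print(haplotype_diversity(["H1", "H1", "H2", "H3"]))
```

Hd is 0 when every individual carries the same haplotype and approaches 1 when nearly every individual carries a different one.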

  9. The complete chloroplast genome sequence of Dendrobium officinale.

    Science.gov (United States)

    Yang, Pei; Zhou, Hong; Qian, Jun; Xu, Haibin; Shao, Qingsong; Li, Yonghua; Yao, Hui

    2016-01-01

    The complete chloroplast sequence of Dendrobium officinale, an endangered and economically important traditional Chinese medicine, was reported and characterized. The genome size is 152,018 bp, with 37.5% GC content. A pair of inverted repeats (IRs) of 26,284 bp are separated by a large single-copy region (LSC, 84,944 bp) and a small single-copy region (SSC, 14,506 bp). The complete cp DNA contains 83 protein-coding genes, 39 tRNA genes and 8 rRNA genes. Fourteen genes contained one or two introns.
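In a quadripartite chloroplast genome, the total length equals the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The short check below applies that arithmetic to the figures quoted in this abstract; the function name `quadripartite_total` is a hypothetical helper introduced only for illustration.

```python
def quadripartite_total(lsc: int, ssc: int, ir: int) -> int:
    """Total genome length implied by LSC, SSC and one IR copy (in bp)."""
    return lsc + ssc + 2 * ir


# Region sizes for Dendrobium officinale as given in the abstract (bp).
total = quadripartite_total(lsc=84_944, ssc=14_506, ir=26_284)
print(total)  # 152018, matching the reported genome size of 152,018 bp
```

The same check is a quick way to catch transcription errors in the region sizes reported by other complete-genome records.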

  10. Phylogeny and character evolution of the fern genus Tectaria (Tectariaceae) in the Old World inferred from chloroplast DNA sequences.

    Science.gov (United States)

    Ding, Hui-Hui; Chao, Yi-Shan; Callado, John Rey; Dong, Shi-Yong

    2014-11-01

    In this study we provide a phylogeny for the pantropical fern genus Tectaria, with emphasis on the Old World species, based on sequences of five plastid regions (atpB, ndhF plus ndhF-trnL, rbcL, rps16-matK plus matK, and trnL-F). Maximum parsimony, maximum likelihood, and Bayesian inference are used to analyze 115 individuals, representing ca. 56 species of Tectaria s.l. and 36 species of ten related genera. The results strongly support the monophyly of Tectaria in a broad sense, in which Ctenitopsis, Hemigramma, Heterogonium, Psomiocarpa, Quercifilix, Stenosemia, and Tectaridium should be submerged. Such a broadly circumscribed Tectaria is supported by the arising pattern of veinlets and the base chromosome number (x=40). Four primary clades are well resolved within Tectaria, one from the Neotropics (T. trifoliata clade) and three from the Old World (T. subtriphylla clade, Ctenitopsis clade, and T. crenata clade). The Tectaria crenata clade is the largest, including six subclades. Of the genera previously recognized as tectarioid ferns, Ctenitis, Lastreopsis, and Pleocnemia are confirmed to be members of Dryopteridaceae, while Pteridrys and Triplophyllum are supported in Tectariaceae. To infer morphological evolution, 13 commonly used characters are optimized on the resulting phylogenetic trees; all of them prove to be homoplastic in Tectaria. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. The complete chloroplast genome sequence of Dianthus superbus var. longicalycinus.

    Science.gov (United States)

    Gurusamy, Raman; Lee, Do-Hyung; Park, SeonJoo

    2016-05-01

    The complete chloroplast genome (cpDNA) sequence of Dianthus superbus var. longicalycinus, an economically important traditional Chinese medicine, was reported and characterized. The cpDNA of Dianthus superbus var. longicalycinus is 149,539 bp, with 36.3% GC content. A pair of inverted repeats (IRs) of 24,803 bp is separated by a large single-copy region (LSC, 82,805 bp) and a small single-copy region (SSC, 17,128 bp). It encodes 85 protein-coding genes, 36 tRNA genes and 8 rRNA genes. Of 129 individual genes, 13 contain one intron and three contain two introns.

  12. Complete sequencing of five araliaceae chloroplast genomes and the phylogenetic implications.

    Directory of Open Access Journals (Sweden)

    Rong Li

    BACKGROUND: The ginseng family (Araliaceae) includes a number of economically important plant species. Previous phylogenetic studies circumscribed three major clades within the core ginseng plant family, yet the internal relationships of each major group have been poorly resolved, perhaps due to rapid radiation of these lineages. Recent studies have shown that phylogenomics based on chloroplast genomes provides a viable way to resolve complex relationships. METHODOLOGY/PRINCIPAL FINDINGS: We report the complete nucleotide sequences of five Araliaceae chloroplast genomes using next-generation sequencing technology. The five chloroplast genomes are 156,333-156,459 bp in length including a pair of inverted repeats (25,551-26,108 bp) separated by the large single-copy (86,028-86,566 bp) and small single-copy (18,021-19,117 bp) regions. Each chloroplast genome contains the same 114 unique genes consisting of 30 transfer RNA genes, four ribosomal RNA genes, and 80 protein coding genes. Gene size, content, and order, AT content, and IR/SC boundary structure are similar among all Araliaceae chloroplast genomes. A total of 140 repeats were identified in the five chloroplast genomes with palindromic repeat as the most common type. Phylogenomic analyses using parsimony, likelihood, and Bayesian inference based on the complete chloroplast genomes strongly supported the monophyly of the Asian Palmate group and the Aralia-Panax group. Furthermore, the relationships among the sampled taxa within the Asian Palmate group were well resolved. Twenty-six DNA markers with the percentage of variable sites higher than 5% were identified, which may be useful for phylogenetic studies of Araliaceae. CONCLUSION: The chloroplast genomes of Araliaceae are highly conserved in all aspects of genome features. The large-scale phylogenomic data based on the complete chloroplast DNA sequences is shown to be effective for the phylogenetic reconstruction of Araliaceae.

  13. The complete chloroplast genome sequence of Abies nephrolepis (Pinaceae: Abietoideae)

    Directory of Open Access Journals (Sweden)

    Dong-Keun Yi

    2016-06-01

    The plant chloroplast (cp) genome has maintained a relatively conserved structure and gene content throughout evolution. Cp genome sequences have been used widely for resolving evolutionary and phylogenetic issues at various taxonomic levels of plants. Here, we report the complete cp genome of Abies nephrolepis. The A. nephrolepis cp genome is 121,336 base pairs (bp) in length including a pair of short inverted repeat regions (IRa and IRb) of 139 bp each separated by a small single copy (SSC) region of 54,323 bp and a large single copy (LSC) region of 66,735 bp. It contains 114 genes, 68 of which are protein coding genes, 35 tRNA and four rRNA genes, six open reading frames, and one pseudogene. Seventeen repeat units and 64 simple sequence repeats (SSR) have been detected in A. nephrolepis cp genome. Large IR sequences locate in 42-kb inversion points (1186 bp). The A. nephrolepis cp genome is identical to Abies koreana’s which is closely related to taxa. Pairwise comparison between two cp genomes revealed 140 polymorphic sites in each. Complete cp genome sequence of A. nephrolepis has a significant potential to provide information on the evolutionary pattern of Abietoideae and valuable data for development of DNA markers for easy identification and classification.

  14. [Study of Chloroplast DNA Polymorphism in the Sunflower (Helianthus L.)].

    Science.gov (United States)

    Markina, N V; Usatov, A V; Logacheva, M D; Azarin, K V; Gorbachenko, C F; Kornienko, I V; Gavrilova, V A; Tihobaeva, V E

    2015-08-01

    The polymorphism of microsatellite loci of chloroplast genome in six Helianthus species and 46 lines of cultivated sunflower H. annuus (17 CMS lines and 29 Rf-lines) were studied. The differences between species are confined to four SSR loci. Within cultivated forms of the sunflower H. annuus, the polymorphism is absent. A comparative analysis was performed on sequences of the cpDNA inbred line 3629, line 398941 of the wild sunflower, and the American line HA383 H. annuus. As a result, 52 polymorphic loci represented by 27 SSR and 25 SNP were found; they can be used for genotyping of H. annuus samples, including cultural varieties: twelve polymorphic positions, of which eight are SSR and four are SNP.

  15. A set of 100 chloroplast DNA primer pairs to study population genetics and phylogeny in monocotyledons

    DEFF Research Database (Denmark)

    Scarcelli, Nora; Bernaud, Adeline; Eiserhardt, Wolf L.

    2011-01-01

    Chloroplast DNA sequences are of great interest for population genetics and phylogenetic studies. However, only a small set of markers are commonly used. Most of them have been designed for amplification in a large range of Angiosperms and are located in the Large Single Copy (LSC). Here we...... anticipate that it will also be useful for phylogeny and bar-coding studies....

  16. Isolation and characterisation of the cDNA encoding a glycosylated accessory protein of pea chloroplast DNA polymerase.

    OpenAIRE

    Gaikwad, A; Tewari, K K; Kumar, D; Chen, W; Mukherjee, S K

    1999-01-01

    The cDNA encoding p43, a DNA binding protein from pea chloroplasts (ct) that binds to cognate DNA polymerase and stimulates the polymerase activity, has been cloned and characterised. The characteristic sequence motifs of hydroxyproline-rich glycoproteins (HRGP) are present in the cDNA corresponding to the N-terminal domain of the mature p43. The protein was found to be highly O-arabinosylated. Chemically deglycosylated p43 (i.e. p29) retains its binding to both DNA and pea ct-DNA polymeras...

  17. Contribution of chloroplast DNA in the biodiversity of some Aegilops ...

    African Journals Online (AJOL)

    Four Aegilops species (Aegilops longissima, Aegilops speltoides, Aegilops searsii and Aegilops caudata) belonging to the family Poaceae were used in this study. Nucleotides of 1651 bp from 5.8 S rRNA gene and the intergenic spacers trnT-trnL and trnL-trnF from the chloroplast DNA were combined together in order to ...

  18. The complete chloroplast genome sequence of Helwingia himalaica (Helwingiaceae, Aquifoliales) and a chloroplast phylogenomic analysis of the Campanulidae

    OpenAIRE

    Yao, Xin; Liu, Ying-Ying; Tan, Yun-Hong; Song, Yu; Corlett, Richard T.

    2016-01-01

    Complete chloroplast genome sequences have been very useful for understanding phylogenetic relationships in angiosperms at the family level and above, but there are currently large gaps in coverage. We report the chloroplast genome for Helwingia himalaica, the first in the distinctive family Helwingiaceae and only the second genus to be sequenced in the order Aquifoliales. We then combine this with 36 published sequences in the large (c. 35,000 species) subclass Campanulidae in order to inves...

  19. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis.

    Science.gov (United States)

    Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

    2015-01-01

    Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var. deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5' portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16 taxa angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids.

  20. Phylogeography of Quercus variabilis Based on Chloroplast DNA Sequence in East Asia: Multiple Glacial Refugia and Mainland-Migrated Island Populations

    Science.gov (United States)

    Kang, Hongzhang; Sun, Xiao; Yin, Shan; Du, Hongmei; Yamanaka, Norikazu; Gapare, Washington; Wu, Harry X.; Liu, Chunjiang

    2012-01-01

    The biogeographical relationships between far-separated populations, in particular, those in the mainland and islands, remain unclear for widespread species in eastern Asia where the current distribution of plants was greatly influenced by the Quaternary climate. Deciduous Oriental oak (Quercus variabilis) is one of the most widely distributed species in eastern Asia. In this study, leaf material of 528 Q. variabilis trees from 50 populations across the whole distribution (Mainland China, Korea Peninsular as well as Japan, Zhoushan and Taiwan Islands) was collected, and three cpDNA intergenic spacer fragments were sequenced using universal primers. A total of 26 haplotypes were detected, and it showed a weak phylogeographical structure in eastern Asia populations at species level, however, in the central-eastern region of Mainland China, the populations had more haplotypes than those in other regions, with a significant phylogeographical structure (NST = 0.751 > GST = 0.690, P < 0.05). Q. variabilis displayed high interpopulation and low intrapopulation genetic diversity across the distribution range. Both unimodal mismatch distribution and significant negative Fu's FS indicated a demographic expansion of Q. variabilis populations in East Asia. A fossil calibrated phylogenetic tree showed a rapid speciation during Pleistocene, with a population augment occurred in Middle Pleistocene. Both diversity patterns and ecological niche modelling indicated there could be multiple glacial refugia and possible bottleneck or founder effects occurred in the southern Japan. We dated major spatial expansion of Q. variabilis population in eastern Asia to the last glacial cycle(s), a period with sea-level fluctuations and land bridges in East China Sea as possible dispersal corridors. This study showed that geographical heterogeneity combined with climate and sea-level changes have shaped the genetic structure of this wide-ranging tree species in East Asia. PMID:23115642

  1. Phylogeography of Quercus variabilis based on chloroplast DNA sequence in East Asia: multiple glacial refugia and Mainland-migrated island populations.

    Directory of Open Access Journals (Sweden)

    Dongmei Chen

    The biogeographical relationships between far-separated populations, in particular, those in the mainland and islands, remain unclear for widespread species in eastern Asia where the current distribution of plants was greatly influenced by the Quaternary climate. Deciduous Oriental oak (Quercus variabilis) is one of the most widely distributed species in eastern Asia. In this study, leaf material of 528 Q. variabilis trees from 50 populations across the whole distribution (Mainland China, Korea Peninsular as well as Japan, Zhoushan and Taiwan Islands) was collected, and three cpDNA intergenic spacer fragments were sequenced using universal primers. A total of 26 haplotypes were detected, and it showed a weak phylogeographical structure in eastern Asia populations at species level, however, in the central-eastern region of Mainland China, the populations had more haplotypes than those in other regions, with a significant phylogeographical structure (NST = 0.751 > GST = 0.690, P < 0.05). Q. variabilis displayed high interpopulation and low intrapopulation genetic diversity across the distribution range. Both unimodal mismatch distribution and significant negative Fu's FS indicated a demographic expansion of Q. variabilis populations in East Asia. A fossil calibrated phylogenetic tree showed a rapid speciation during Pleistocene, with a population augment occurred in Middle Pleistocene. Both diversity patterns and ecological niche modelling indicated there could be multiple glacial refugia and possible bottleneck or founder effects occurred in the southern Japan. We dated major spatial expansion of Q. variabilis population in eastern Asia to the last glacial cycle(s), a period with sea-level fluctuations and land bridges in East China Sea as possible dispersal corridors. This study showed that geographical heterogeneity combined with climate and sea-level changes have shaped the genetic structure of this wide-ranging tree species in East Asia.

  2. Phylogenetic relationships among Vietnamese cocoa accessions using a non-coding region of the chloroplast DNA

    International Nuclear Information System (INIS)

    Ha, L.T.V.; Dung, T.N.; Phuoc, P.H.D.

    2017-01-01

    Cocoa cultivation has increased in tropical areas around the world, including Vietnam, due to the high demand for cocoa beans for chocolate production. The genetic diversity of cocoa genotypes is recognized to be complex; however, their phylogenetic relationships need to be clarified. The present study aimed to classify the cocoa genotypes that are imported and cultivated in Vietnam, based on a chloroplast DNA region. Sixty-three Vietnamese cocoa accessions were collected from different regions in Southern Vietnam. Their phylogenetic relationships were identified using the universal primers c-B49317 and d-A49855 from the chloroplast DNA region. The sequences were situated in the trnL intron, a chloroplast genome region used to identify the closest terrestrial plant species. DNA sequences were determined and subjected to an analysis of the phylogenetic relationship using the maximum evolution method. The genetic analysis showed clustering of the 63 cocoa accessions in three groups: the domestically cultivated Trinitario group, the indigenous cultivars, and the cultivations from Peru. The analyzed sequencing data also illustrated that the TD accessions and CT accessions were genetically closely related. Based on those results, the genetic relation between PA and NA accessions was established as the hybrid origins of the TD and CT accessions. The genetic relationships of some foreign accessions, including UIT, SCA and IMC accessions, were also confirmed. The present study is the first report of phylogenetic relationships of Vietnamese cocoa collections. The cocoa program in Vietnam has been in development for thirty years. (author)

  3. A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

    Energy Technology Data Exchange (ETDEWEB)

    Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.; Jansen, Robert K.

    2006-01-20

    Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast-evolving chloroplast gene ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnL-trnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would be very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Euasterids II, that of Panax ginseng (Araliaceae, 156,318 bp, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. A later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at fine level and of RNA editing by comparison to available EST sequences. In addition, since

  4. Towards resolving Lamiales relationships: insights from rapidly evolving chloroplast sequences

    Directory of Open Access Journals (Sweden)

    Heubl Günther

    2010-11-01

    Background In the large angiosperm order Lamiales, a diverse array of highly specialized life strategies such as carnivory, parasitism, epiphytism, and desiccation tolerance occur, and some lineages possess drastically accelerated DNA substitutional rates or miniaturized genomes. However, understanding the evolution of these phenomena in the order, and clarifying borders of and relationships among lamialean families, has been hindered by largely unresolved trees in the past. Results Our analysis of the rapidly evolving trnK/matK, trnL-F and rps16 chloroplast regions enabled us to infer more precise phylogenetic hypotheses for the Lamiales. Relationships among the nine first-branching families in the Lamiales tree are now resolved with very strong support. Subsequent to Plocospermataceae, a clade consisting of Carlemanniaceae plus Oleaceae branches, followed by Tetrachondraceae and a newly inferred clade composed of Gesneriaceae plus Calceolariaceae, which is also supported by morphological characters. Plantaginaceae (incl. Gratioleae) and Scrophulariaceae are well separated in the backbone grade; Lamiaceae and Verbenaceae appear in distant clades, while the recently described Linderniaceae are confirmed to be monophyletic and in an isolated position. Conclusions Confidence about deep nodes of the Lamiales tree is an important step towards understanding the evolutionary diversification of a major clade of flowering plants. The degree of resolution obtained here now provides a first opportunity to discuss the evolution of morphological and biochemical traits in Lamiales. The multiple independent evolution of the carnivorous syndrome, once in Lentibulariaceae and a second time in Byblidaceae, is strongly supported by all analyses and topological tests. The evolution of selected morphological characters such as flower symmetry is discussed. The addition of further sequence data from introns and spacers holds promise to eventually obtain a

  5. Towards resolving Lamiales relationships: insights from rapidly evolving chloroplast sequences

    Science.gov (United States)

    2010-01-01

    Background In the large angiosperm order Lamiales, a diverse array of highly specialized life strategies such as carnivory, parasitism, epiphytism, and desiccation tolerance occur, and some lineages possess drastically accelerated DNA substitutional rates or miniaturized genomes. However, understanding the evolution of these phenomena in the order, and clarifying borders of and relationships among lamialean families, has been hindered by largely unresolved trees in the past. Results Our analysis of the rapidly evolving trnK/matK, trnL-F and rps16 chloroplast regions enabled us to infer more precise phylogenetic hypotheses for the Lamiales. Relationships among the nine first-branching families in the Lamiales tree are now resolved with very strong support. Subsequent to Plocospermataceae, a clade consisting of Carlemanniaceae plus Oleaceae branches, followed by Tetrachondraceae and a newly inferred clade composed of Gesneriaceae plus Calceolariaceae, which is also supported by morphological characters. Plantaginaceae (incl. Gratioleae) and Scrophulariaceae are well separated in the backbone grade; Lamiaceae and Verbenaceae appear in distant clades, while the recently described Linderniaceae are confirmed to be monophyletic and in an isolated position. Conclusions Confidence about deep nodes of the Lamiales tree is an important step towards understanding the evolutionary diversification of a major clade of flowering plants. The degree of resolution obtained here now provides a first opportunity to discuss the evolution of morphological and biochemical traits in Lamiales. The multiple independent evolution of the carnivorous syndrome, once in Lentibulariaceae and a second time in Byblidaceae, is strongly supported by all analyses and topological tests. The evolution of selected morphological characters such as flower symmetry is discussed. 
    The addition of further sequence data from introns and spacers holds promise to eventually obtain a fully resolved plastid tree of

  6. Nucleotide sequence of soybean chloroplast DNA regions which contain the psb A and trn H genes and cover the ends of the large single copy region and one end of the inverted repeats.

    Science.gov (United States)

    Spielmann, A; Stutz, E

    1983-10-25

    The soybean chloroplast psb A gene (photosystem II thylakoid membrane protein of Mr 32 000, lysine-free) and the trn H gene (tRNAHisGUG), which both map in the large single copy region adjacent to one of the inverted repeat structures (IR1), have been sequenced including flanking regions. The structural part of the psb A gene shows 92% sequence homology with the corresponding genes of spinach and N. debneyi and also contains an open reading frame for 353 amino acids. The amino acid sequence of a potential primary translation product (calculated Mr 38 904, no lysine) diverges from that of spinach and N. debneyi in only two positions in the C-terminal part. The trn H gene has the same polarity as the psb A gene and the coding region is located at the very end of the large single copy region. The deduced sequence of the soybean chloroplast tRNAHisGUG is identical with that of Zea mays chloroplasts. Both ends of the large single copy region were sequenced including a small segment of the adjacent IR1 and IR2.

  7. Phylogeny of palaeotropic Derris-like taxa (Fabaceae) based on chloroplast and nuclear DNA sequences shows reorganization of (infra)generic classifications is needed.

    Science.gov (United States)

    Sirichamorn, Yotsawate; Adema, Frits A C B; Gravendeel, Barbara; van Welzen, Peter C

    2012-11-01

    Palaeotropic Derris-like taxa (family Fabaceae, tribe Millettieae) comprise 6-9 genera. They are well known as important sources of the toxin rotenone, which is used as an organic insecticide and fish poison. However, their phylogenetic relationships and classification are still problematic due to insufficient sampling and high morphological variability. Fifty species of palaeotropic Derris-like taxa were sampled, which is more than in former studies. Three chloroplast genes (trnK-matK, trnL-F IGS, and psbA-trnH IGS) and nuclear ribosomal ITS/5.8S were analyzed using parsimony and Bayesian methods. Parsimony and Bayesian analyses of individual and combined markers show more or less similar tree topologies (only varying in terminal branches). The old-world monophyletic genera Aganope, Brachypterum, and Leptoderris are distinct from Derris s.s., and their generic status is here confirmed. Aganope may be classified into two or three subgeneric taxa. Paraderris has to be included in Derris s.s. to form a monophyletic group. The genera Philenoptera, Deguelia, and Lonchocarpus are monophyletic and distinct from each other and clearly separate from Derris s.s. Morphologically highly similar species of Derris s.s. are shown to be unrelated. Our study shows that previous infrageneric classifications of Derris are incorrect. Paraderris elliptica may contain several cryptic lineages that need further investigation. The concept of the genus Derris s.s. should be reorganized with a new generic circumscription by including Paraderris but excluding Brachypterum. Synapomorphic morphological features will be examined in future studies, and the status of the newly defined Derris and its closely related taxa will be formalized.

  8. A set of primers for analyzing chloroplast DNA diversity in Citrus and related genera.

    Science.gov (United States)

    Cheng, Yunjiang; de Vicente, M Carmen; Meng, Haijun; Guo, Wenwu; Tao, Nengguo; Deng, Xiuxin

    2005-06-01

    Chloroplast simple sequence repeat (cpSSR) markers in Citrus were developed and used to analyze chloroplast diversity of Citrus and closely related genera. Fourteen cpSSR primer pairs from the chloroplast genomes of tobacco (Nicotiana tabacum L.) and Arabidopsis were found useful for analyzing the Citrus chloroplast genome (cpDNA) and recoded with the prefix SPCC (SSR Primers for Citrus Chloroplast). Eleven of the 14 primer pairs revealed some degree of polymorphism among 34 genotypes of Citrus, Fortunella, Poncirus and some of their hybrids, with polymorphism information content (PIC) values ranging from 0.057 to 0.732, and 18 haplotypes were identified. The cpSSR data were analyzed with NTSYS-pc software, and the genetic relationships suggested by the unweighted pair group method based on arithmetic means (UPGMA) dendrogram were congruent with previous taxonomic investigations: the results showed that all samples fell into seven major clusters, i.e., Citrus medica L., Poncirus, Fortunella, C. ichangensis Blanco, C. reticulata Swingle, C. aurantifolia (Christm.) Swingle and C. grandis (L.) Osbeck. The results of previous studies combined with our cpSSR analyses revealed that: (1) Calamondin (C. madurensis Swingle) is the result of hybridization between kumquat (Fortunella) and mandarin (C. reticulata), where kumquat acted as the female parent; (2) Ichang papeda (C. ichangensis) has a unique taxonomic status; and (3) although Bendiguangju mandarin (C. reticulata) and Satsuma mandarin (C. reticulata) are similar in fruit shape and leaf morphology, they have different maternal parents. Bendiguangju mandarin has the same cytoplasm as sweet orange (C. sinensis), whereas Satsuma mandarin has the cytoplasm of C. reticulata. Seventeen PCR products from SPCC1 and 21 from SPCC11 were cloned and sequenced. 
    The results revealed that mononucleotide repeats as well as insertions and deletions of small segments of DNA were associated with SPCC1 polymorphism, whereas polymorphism

  9. The complete chloroplast genome sequence of Hibiscus syriacus.

    Science.gov (United States)

    Kwon, Hae-Yun; Kim, Joon-Hyeok; Kim, Sea-Hyun; Park, Ji-Min; Lee, Hyoshin

    2016-09-01

    The complete chloroplast genome sequence of Hibiscus syriacus L. is presented in this study. The genome is 161 019 bp in length, with a typical circular structure containing a pair of inverted repeats of 25 745 bp each, separated by a large single-copy region of 89 698 bp and a small single-copy region of 19 831 bp. The overall GC content is 36.8%. One hundred and fourteen genes were annotated, including 81 protein-coding genes, 4 ribosomal RNA genes and 29 transfer RNA genes.
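
The region lengths reported above can be checked against the total, because a quadripartite plastome is made up of the large single-copy region, the small single-copy region, and two copies of the inverted repeat. The following is a minimal Python sketch of that arithmetic; the function name is illustrative and does not come from the study.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) as reported for Hibiscus syriacus in the record above.
total = quadripartite_length(lsc=89_698, ssc=19_831, ir=25_745)
print(total)  # 161019, matching the reported genome length
```

The same check can be applied to any record in this list that reports LSC, SSC and IR lengths alongside a total genome size.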

  10. The complete chloroplast genome sequence of Curcuma flaviflora (Curcuma).

    Science.gov (United States)

    Zhang, Yan; Deng, Jiabin; Li, Yangyi; Gao, Gang; Ding, Chunbang; Zhang, Li; Zhou, Yonghong; Yang, Ruiwu

    2016-09-01

    The complete chloroplast (cp) genome of Curcuma flaviflora, a medicinal plant in Southeast Asia, was sequenced. The genome was 160 478 bp in length, with 36.3% GC content. A pair of inverted repeats (IRs) of 26 946 bp each was separated by a large single copy (LSC) region of 88 008 bp and a small single copy (SSC) region of 18 578 bp. The cp genome contained 132 annotated genes, including 79 protein coding genes, 30 tRNA genes, and four rRNA genes. Nineteen of these genes were duplicated in the inverted repeat regions.

  11. Assembly of the Complete Sitka Spruce Chloroplast Genome Using 10X Genomics' GemCode Sequencing Data.

    Directory of Open Access Journals (Sweden)

    Lauren Coombe

    Full Text Available The linked read sequencing library preparation platform by 10X Genomics produces barcoded sequencing libraries, which are subsequently sequenced using the Illumina short read sequencing technology. In this new approach, long fragments of DNA are partitioned into separate micro-reactions, where the same index sequence is incorporated into each of the sequencing fragment inserts derived from a given long fragment. In this study, we exploited this property by using reads from index sequences associated with a large number of reads, to assemble the chloroplast genome of the Sitka spruce tree (Picea sitchensis). Here we report on the first Sitka spruce chloroplast genome assembled exclusively from P. sitchensis genomic libraries prepared using the 10X Genomics protocol. We show that the resulting 124,049 base pair long genome shares high sequence similarity with the related white spruce and Norway spruce chloroplast genomes, but diverges substantially from a previously published P. sitchensis-P. thunbergii chimeric genome. The use of reads from high-frequency indices enabled separation of the nuclear genome reads from that of the chloroplast, which resulted in the simplification of the de Bruijn graphs used at the various stages of assembly.
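
The read selection described above, keeping only reads whose index (barcode) is associated with many reads, can be sketched as follows. This is a minimal Python illustration and not the authors' pipeline: the function name, the `(barcode, sequence)` tuple layout, and the threshold value are all assumptions made for the example.

```python
from collections import Counter


def reads_from_frequent_barcodes(reads, min_reads_per_barcode):
    """Keep reads whose barcode occurs at least min_reads_per_barcode times.

    reads is a list of (barcode, sequence) tuples (hypothetical layout).
    """
    counts = Counter(barcode for barcode, _ in reads)
    return [(b, s) for b, s in reads if counts[b] >= min_reads_per_barcode]


# Toy input: barcode AAAC tags three reads, barcode GGGT tags one.
toy_reads = [("AAAC", "r1"), ("AAAC", "r2"), ("AAAC", "r3"), ("GGGT", "r4")]
selected = reads_from_frequent_barcodes(toy_reads, min_reads_per_barcode=3)
print(len(selected))
```

The rationale, under these assumptions, is that barcodes tagging high-copy chloroplast fragments accumulate far more reads than barcodes tagging nuclear fragments, so a frequency cutoff enriches for chloroplast reads before assembly.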

  12. Chloroplast DNA Structural Variation, Phylogeny, and Age of Divergence among Diploid Cotton Species

    Science.gov (United States)

    Li, Pengbo; Liu, Fang; Wang, Yumei; Xu, Qin; Shang, Mingzhao; Zhou, Zhongli; Cai, Xiaoyan; Wang, Xingxing; Wendel, Jonathan F.; Wang, Kunbo

    2016-01-01

    The cotton genus (Gossypium spp.) contains 8 monophyletic diploid genome groups (A, B, C, D, E, F, G, K) and a single allotetraploid clade (AD). To gain insight into the phylogeny of Gossypium and molecular evolution of the chloroplast genome in this group, we performed a comparative analysis of 19 Gossypium chloroplast genomes, six reported here for the first time. Nucleotide distance in non-coding regions was about three times that of coding regions. As expected, distances were smaller within than among genome groups. Phylogenetic topologies based on nucleotide and indel data support the resolution of the 8 genome groups into 6 clades. Phylogenetic analysis of indel distribution among the 19 genomes demonstrates contrasting evolutionary dynamics in different clades, with a parallel genome downsizing in two genome groups and a biased accumulation of insertions in the clade containing the cultivated cottons leading to large (for Gossypium) chloroplast genomes. Divergence time estimates derived from the cpDNA sequence suggest that the major diploid clades had diverged approximately 10 to 11 million years ago. The complete nucleotide sequences of 6 cpDNA genomes are provided, offering a resource for cytonuclear studies in Gossypium. PMID:27309527

  13. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  14. Complete chloroplast genome sequence of MD-2 pineapple and its comparative analysis among nine other plants from the subclass Commelinidae.

    Science.gov (United States)

    Redwan, R M; Saidin, A; Kumar, S V

    2015-08-12

    Pineapple (Ananas comosus var. comosus) is known as the king of fruits for its crown and is the third most important tropical fruit after banana and citrus. The plant, which is indigenous to South America, is the most important species in the Bromeliaceae family and is largely traded for fresh fruit consumption. Here, we report the complete chloroplast sequence of the MD-2 pineapple that was sequenced using the PacBio sequencing technology. In this study, the high error rate of PacBio long sequence reads of A. comosus total genomic DNA was reduced by using highly accurate but short Illumina reads for error correction via the latest error correction module from Novocraft. Error-corrected long PacBio reads were assembled by using a single tool to produce a contig representing the pineapple chloroplast genome. The genome, 159,636 bp in length, features the conserved quadripartite structure of chloroplasts, containing a large single copy region (LSC) of 87,482 bp, a small single copy region (SSC) of 18,622 bp and two inverted repeat regions (IRA and IRB) of 26,766 bp each. Overall, the genome contained 117 unique coding regions and 30 were repeated in the IR region, with gene content, structure and arrangement similar to those of its sister taxon, Typha latifolia. A total of 35 repeat structures were detected in both the coding and non-coding regions, with the majority being tandem repeats. In addition, 205 SSRs were detected in the genome, with six protein-coding genes containing more than two SSRs. Comparison of chloroplast genomes from the subclass Commelinidae revealed a conserved protein-coding gene, albeit located in a highly divergent region. Analysis of selection pressure on protein-coding genes using the Ka/Ks ratio showed significant positive selection exerted on the rps7 gene of the pineapple chloroplast, with P less than 0.05. Phylogenetic analysis confirmed the recent taxonomical relation among the member of

  15. The complete chloroplast genome sequence of Euonymus japonicus (Celastraceae).

    Science.gov (United States)

    Choi, Kyoung Su; Park, SeonJoo

    2016-09-01

    The complete chloroplast (cp) genome sequence of Euonymus japonicus, the first sequenced in the genus Euonymus, is reported in this study. The total length was 157 637 bp, containing a pair of 26 678 bp inverted repeat (IR) regions, which were separated by a small single copy (SSC) region and a large single copy (LSC) region of 18 340 bp and 85 941 bp, respectively. This genome contains 107 unique genes, including 74 coding genes, four rRNA genes, and 29 tRNA genes. Seventeen E. japonicus genes contain introns, of which three genes (clpP, ycf3, and rps12) include two introns. The maximum likelihood (ML) phylogenetic analysis revealed that E. japonicus was closely related to Manihot and Populus.

  16. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences

    Science.gov (United States)

    2012-01-01

    Background The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results We have developed a web server CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tools. The edited annotations can then be uploaded to CPGAVAS for update and re-analyses repeatedly. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application DOGMA, while having several superior functionalities. Conclusions CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas. PMID:23256920

  17. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences

    Directory of Open Access Journals (Sweden)

    Liu Chang

    2012-12-01

    Full Text Available Abstract Background The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results We have developed a web server CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tools. The edited annotations can then be uploaded to CPGAVAS for update and re-analyses repeatedly. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application DOGMA, while having several superior functionalities. Conclusions CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas.

  18. In silico analysis of Simple Sequence Repeats from chloroplast genomes of Solanaceae species

    Directory of Open Access Journals (Sweden)

    Evandro Vagner Tambarussi

    2009-01-01

    Full Text Available The availability of chloroplast genome (cpDNA) sequences of Atropa belladonna, Nicotiana sylvestris, N. tabacum, N. tomentosiformis, Solanum bulbocastanum, S. lycopersicum and S. tuberosum, which are Solanaceae species, allowed us to analyze the organization of cpSSRs in their genic and intergenic regions. In general, the number of cpSSRs in cpDNA ranged from 161 in S. tuberosum to 226 in N. tabacum, and the number of intergenic cpSSRs was higher than genic cpSSRs. The mononucleotide repeats were the most frequent in studied species, but we also identified di-, tri-, tetra-, penta- and hexanucleotide repeats. Multiple alignments of all cpSSRs sequences from Solanaceae species made the identification of nucleotide variability possible and the phylogeny was estimated by maximum parsimony. Our study showed that the plastome database can be exploited for phylogenetic analysis and biotechnological approaches.
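
As an illustration of how mononucleotide repeats, the most frequent cpSSR class reported above, can be located in silico, here is a minimal Python sketch using a regular expression. It is not the tool used in the study; the function name and the default threshold of eight repeats are assumptions chosen for the example.

```python
import re


def find_mononucleotide_ssrs(seq: str, min_repeats: int = 8):
    """Return (start, base, length) for each single-base run of at least min_repeats.

    start is a 0-based index into seq; min_repeats=8 is an assumed threshold.
    """
    # One base followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


# Toy sequence: a run of ten A's and a run of eight T's.
toy = "GC" + "A" * 10 + "GC" + "T" * 8 + "C"
print(find_mononucleotide_ssrs(toy))
```

Di- to hexanucleotide repeats could be found the same way by widening the captured motif, although dedicated SSR tools handle overlapping and compound repeats more carefully.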

  19. The complete chloroplast genome sequence of Dendrobium nobile.

    Science.gov (United States)

    Yan, Wenjin; Niu, Zhitao; Zhu, Shuying; Ye, Meirong; Ding, Xiaoyu

    2016-11-01

    The complete chloroplast (cp) genome sequence of Dendrobium nobile, an endangered plant used in traditional Chinese medicine with important economic value, is presented in this article. The total genome size is 150,793 bp, containing a large single copy (LSC) region (84,939 bp) and a small single copy (SSC) region (13,310 bp), which were separated by two inverted repeat (IR) regions (26,272 bp). The overall GC content of the plastid genome was 38.8%. In total, 130 unique genes were annotated, consisting of 76 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Fourteen genes contained one or two introns.

  20. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing

    Directory of Open Access Journals (Sweden)

    Zdepski Anna

    2011-05-01

    Full Text Available Abstract Background High throughput sequencing (HTS) technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants that contain nuclear DNA, as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. Results We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR). We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Conclusions Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants.
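
The abstract does not give the formula used to express organellar copies relative to nuclear DNA. One common way to do so from qPCR data is the delta-Ct calculation sketched below in Python. It assumes a single-copy nuclear reference gene and an amplification factor of 2 per cycle (100% efficiency) for both targets; the function and parameter names are hypothetical.

```python
def relative_copy_number(ct_organelle: float, ct_nuclear: float,
                         amplification_factor: float = 2.0) -> float:
    """Estimate organellar copies per copy of a nuclear reference gene.

    Assumes equal amplification efficiency for both targets, so each cycle
    of difference in Ct corresponds to one fold-change of amplification_factor.
    """
    return amplification_factor ** (ct_nuclear - ct_organelle)


# Toy values: the chloroplast target crosses threshold 7 cycles earlier.
print(relative_copy_number(ct_organelle=15.0, ct_nuclear=22.0))  # 128.0
```

Under these assumptions, a nuclei-based preparation with fewer organellar copies would show a smaller Ct difference, and therefore a smaller ratio, than a CTAB total-DNA preparation from the same tissue.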

  1. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  2. The Complete Chloroplast Genome Sequences of Six Rehmannia Species

    Directory of Open Access Journals (Sweden)

    Shuyun Zeng

    2017-03-01

    Full Text Available Rehmannia is a non-parasitic genus in Orobanchaceae including six species mainly distributed in central and north China. Its phylogenetic position and infrageneric relationships remain uncertain due to potential hybridization and polyploidization. In this study, we sequenced and compared the complete chloroplast genomes of six Rehmannia species using Illumina sequencing technology to elucidate the interspecific variations. Rehmannia plastomes exhibited typical quadripartite and circular structures with good synteny of gene order. The complete genomes ranged from 153,622 bp to 154,055 bp in length, including 133 genes encoding 88 proteins, 37 tRNAs, and 8 rRNAs. Three genes (rpoA, rpoC2, accD) have potentially experienced positive selection. Plastome size variation of Rehmannia was mainly ascribed to the expansion and contraction of the border regions between the inverted repeat (IR) region and the single-copy (SC) regions. Despite the conserved structure of Rehmannia plastomes, sequence variations provide useful phylogenetic information. Phylogenetic trees of 23 Lamiales species reconstructed with the complete plastomes suggested that Rehmannia was monophyletic and sister to the clade of Lindenbergia and the parasitic taxa in Orobanchaceae. The interspecific relationships within Rehmannia were completely different from those in previous studies. In future, population phylogenomic works based on plastomes are urgently needed to clarify the evolutionary history of Rehmannia.

  3. Chloroplast DNA footprints of postglacial recolonization by oaks

    Science.gov (United States)

    Petit, Rémy J.; Pineau, Emmanuel; Demesure, Brigitte; Bacilieri, Roberto; Ducousso, Alexis; Kremer, Antoine

    1997-01-01

    Recolonization of Europe by forest tree species after the last glaciation is well documented in the fossil pollen record. This spread may have been achieved at low densities by rare events of long-distance dispersal, rather than by a compact wave of advance, generating a patchy genetic structure through founder effects. In long-lived oak species, this structure could still be discernible by using maternally transmitted genetic markers. To test this hypothesis, a fine-scale study of chloroplast DNA (cpDNA) variability of two sympatric oak species was carried out in western France. The distributions of six cpDNA length variants were analyzed at 188 localities over a 200 × 300 km area. A cpDNA map was obtained by applying geostatistics methods to the complete data set. Patches of several hundred square kilometers exist which are virtually fixed for a single haplotype for both oak species. This local systematic interspecific sharing of the maternal genome strongly suggests that long-distance seed dispersal events followed by interspecific exchanges were involved at the time of colonization, about 10,000 years ago. PMID:11038572

  4. Genetic diversity and differentiation in Prunus species (Rosaceae) using chloroplast and mitochondrial DNA CAPS markers.

    Science.gov (United States)

    Ben Mustapha, S; Ben Tamarzizt, H; Baraket, G; Abdallah, D; Salhi Hannachi, A

    2015-04-27

    Chloroplast (cpDNA) and mitochondrial DNA (mtDNA) were analyzed to establish genetic relationships among Tunisian plum cultivars using the polymerase chain reaction restriction fragment length polymorphism (PCR-RFLP) technique. Two mtDNA regions (nad 1 b/c and nad 4 1/2) and a cpDNA region (trnL-trnF) were amplified and digested using restriction enzymes. Seventy and six polymorphic sites were revealed in cpDNA and mtDNA, respectively. As a consequence, cpDNA appears to be more polymorphic than mtDNA. The unweighted pair group method with arithmetic mean (UPGMA) dendrogram showed that accessions were distributed independently of their geographical origin, and introduced and local cultivars appear to be closely related. Both UPGMA and principal component analysis grouped Tunisian plum accessions into similar clusters. The analysis of the pooled sequences allowed the detection of 17 chlorotypes and 12 mitotypes. The unique haplotypes detected for cultivars are valuable for management and preservation of the plum local resources. From this study, PCR-RFLP analysis appears to be a useful approach to detect and identify cytoplasmic variation in plum trees. Our results also provide useful information for the management of genetic resources and to establish a program to improve the genetic resources available for plums.

  5. The complete chloroplast genome of Capsicum annuum var. glabriusculum using Illumina sequencing.

    Science.gov (United States)

    Raveendar, Sebastin; Na, Young-Wang; Lee, Jung-Ro; Shim, Donghwan; Ma, Kyung-Ho; Lee, Sok-Young; Chung, Jong-Wook

    2015-07-20

    Chloroplast (cp) genome sequences provide a valuable source for DNA barcoding. Molecular phylogenetic studies have concentrated on DNA sequencing of conserved gene loci. However, this approach is time consuming and more difficult to implement when gene organization differs among species. Here we report the complete re-sequencing of the cp genome of Capsicum pepper (Capsicum annuum var. glabriusculum) using the Illumina platform. The total length of the cp genome is 156,817 bp with a 37.7% overall GC content. A pair of inverted repeats (IRs) of 50,284 bp were separated by a small single copy (SSC; 18,948 bp) and a large single copy (LSC; 87,446 bp). The number of cp genes in C. annuum var. glabriusculum is the same as that in other Capsicum species. Variations in the lengths of the LSC, SSC and IR regions were the main contributors to the size variation in the cp genome of this species. A total of 125 simple sequence repeats (SSRs) and 48 insertion or deletion variants were found by sequence alignment of Capsicum cp genomes. These findings provide a foundation for further investigation of cp genome evolution in Capsicum and other higher plants.

  6. Comparative analysis of complete chloroplast genome sequence and inversion variation in Lasthenia burkei (Madieae, Asteraceae).

    Science.gov (United States)

    Walker, Joseph F; Zanis, Michael J; Emery, Nancy C

    2014-04-01

    Complete chloroplast genome studies can help resolve relationships among large, complex plant lineages such as Asteraceae. We present the first whole plastome from the Madieae tribe and compare its sequence variation to other chloroplast genomes in Asteraceae. We used high throughput sequencing to obtain the Lasthenia burkei chloroplast genome. We compared sequence structure and rates of molecular evolution in the small single copy (SSC), large single copy (LSC), and inverted repeat (IR) regions to those for eight Asteraceae accessions and one Solanaceae accession. The chloroplast sequence of L. burkei is 150 746 bp and contains 81 unique protein coding genes and 4 coding ribosomal RNA sequences. We identified three major inversions in the L. burkei chloroplast, all of which have been found in other Asteraceae lineages, and a previously unreported inversion in Lactuca sativa. Regions flanking inversions contained tRNA sequences, but did not have particularly high G + C content. Substitution rates varied among the SSC, LSC, and IR regions, and rates of evolution within each region varied among species. Some observed differences in rates of molecular evolution may be explained by the relative proportion of coding to noncoding sequence within regions. Rates of molecular evolution vary substantially within and among chloroplast genomes, and major inversion events may be promoted by the presence of tRNAs. Collectively, these results provide insight into different mechanisms that may promote intramolecular recombination and the inversion of large genomic regions in the plastome.

  7. Two complete chloroplast genome sequences of Cannabis sativa varieties.

    Science.gov (United States)

    Oh, Hyehyun; Seo, Boyoung; Lee, Seunghwan; Ahn, Dong-Ha; Jo, Euna; Park, Jin-Kyoung; Min, Gi-Sik

    2016-07-01

    In this study, we determined the complete chloroplast (cp) genomes from two varieties of Cannabis sativa. The genome sizes were 153,848 bp (the Korean non-drug variety, Cheungsam) and 153,854 bp (the African variety, Yoruba Nigeria). The genome structures were identical with 131 individual genes [86 protein-coding genes (PCGs), eight rRNA, and 37 tRNA genes]. Further, except for the presence of an intron in the rps3 genes of two C. sativa varieties, the cp genomes of C. sativa had conservative features similar to that of all known species in the order Rosales. To verify the position of C. sativa within the order Rosales, we conducted phylogenetic analysis by using concatenated sequences of all PCGs from 17 complete cp genomes. The resulting tree strongly supported monophyly of Rosales. Further, the family Cannabaceae, represented by C. sativa, showed close relationship with the family Moraceae. The phylogenetic relationship outlined in our study is well congruent with those previously shown for the order Rosales.

  8. Regulation of chloroplast number and DNA synthesis in higher plants. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Mullet, J.E.

    1995-11-10

    The long term objective of this research is to understand the process of chloroplast development and its coordination with leaf development in higher plants. This is important because the photosynthetic capacity of plants is directly related to leaf and chloroplast development. This research focuses on obtaining a detailed description of leaf development and the early steps in chloroplast development including activation of plastid DNA synthesis, changes in plastid DNA copy number, activation of chloroplast transcription and increases in plastid number per cell. The grant will also begin analysis of specific biochemical mechanisms by isolation of the plastid DNA polymerase, and identification of genetic mutants which are altered in their accumulation of plastid DNA and plastid number per cell.

  9. The complete chloroplast genome sequence of Aster spathulifolius (Asteraceae); genomic features and relationship with Asteraceae.

    Science.gov (United States)

    Choi, Kyoung Su; Park, SeonJoo

    2015-11-10

    Aster spathulifolius, a member of the Asteraceae family, is distributed along the coast of Japan and Korea. This plant is used for medicinal and ornamental purposes. The complete chloroplast (cp) genome of A. spathulifolius consists of 149,473 bp that include a pair of inverted repeats of 24,751 bp separated by a large single copy region of 81,998 bp and a small single copy region of 17,973 bp. The chloroplast genome contains 78 coding genes, four rRNA genes and 29 tRNA genes. When compared to other cpDNA sequences of Asteraceae, A. spathulifolius showed the closest relationship with Jacobaea vulgaris, and its atpB gene was found to be a pseudogene, unlike in J. vulgaris. Furthermore, evaluation of the gene compositions of J. vulgaris, Helianthus annuus, Guizotia abyssinica and A. spathulifolius revealed that a 13.6-kb region from ndhF to rps15 was inverted, unlike in Lactuca of Asteraceae. Comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates with J. vulgaris revealed that synonymous rates were highest in genes related to the small subunit of the ribosome (0.1558), while nonsynonymous rates were highest in genes related to ATP synthase (0.0118). These findings revealed that substitution has occurred at similar rates in most genes, and the substitution rates suggested that most genes are under purifying selection. Copyright © 2015 Elsevier B.V. All rights reserved.
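
    The region lengths reported in this abstract can be cross-checked with simple arithmetic, since a quadripartite plastome consists of one LSC, one SSC and two copies of the IR. The sketch below is only an illustration; the function name is hypothetical and is not taken from the cited study.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) as reported in the A. spathulifolius abstract.
total = quadripartite_length(lsc=81_998, ssc=17_973, ir=24_751)
print(total)  # 149473, matching the reported genome size
```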

  10. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes

    Directory of Open Access Journals (Sweden)

    Cronn Richard

    2009-12-01

    Full Text Available Abstract Background: Molecular evolutionary studies share the common goal of elucidating historical relationships, and the common challenge of adequately sampling taxa and characters. Particularly at low taxonomic levels, recent divergence, rapid radiations, and conservative genome evolution yield limited sequence variation, and dense taxon sampling is often desirable. Recent advances in massively parallel sequencing make it possible to rapidly obtain large amounts of sequence data, and multiplexing makes extensive sampling of megabase sequences feasible. Is it possible to efficiently apply massively parallel sequencing to increase phylogenetic resolution at low taxonomic levels? Results: We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome) generated using multiplexed massively parallel sequencing. 30/33 ingroup nodes resolved with ≥ 95% bootstrap support; this is a substantial improvement relative to prior studies, and shows massively parallel sequencing-based strategies can produce sufficient high quality sequence to reach support levels originally proposed for the phylogenetic bootstrap. Resampling simulations show that at least the entire plastome is necessary to fully resolve Pinus, particularly in rapidly radiating clades. Meta-analysis of 99 published infrageneric phylogenies shows that whole plastome analysis should provide similar gains across a range of plant genera. A disproportionate amount of phylogenetic information resides in two loci (ycf1, ycf2), highlighting their unusual evolutionary properties. Conclusion: Plastome sequencing is now an efficient option for increasing phylogenetic resolution at lower taxonomic levels in plant phylogenetic and population genetic analyses. With continuing improvements in sequencing capacity, the strategies herein should revolutionize efforts requiring dense taxon and character sampling.

  11. Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes

    Science.gov (United States)

    Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu

    2009-01-01

    Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593

  12. The complete chloroplast genome sequence of Helwingia himalaica (Helwingiaceae, Aquifoliales) and a chloroplast phylogenomic analysis of the Campanulidae.

    Science.gov (United States)

    Yao, Xin; Liu, Ying-Ying; Tan, Yun-Hong; Song, Yu; Corlett, Richard T

    2016-01-01

    Complete chloroplast genome sequences have been very useful for understanding phylogenetic relationships in angiosperms at the family level and above, but there are currently large gaps in coverage. We report the chloroplast genome for Helwingia himalaica, the first in the distinctive family Helwingiaceae and only the second genus to be sequenced in the order Aquifoliales. We then combine this with 36 published sequences in the large (c. 35,000 species) subclass Campanulidae in order to investigate relationships at the order and family levels. The Helwingia genome consists of 158,362 bp containing a pair of inverted repeat (IR) regions of 25,996 bp separated by a large single-copy (LSC) region and a small single-copy (SSC) region which are 87,810 and 18,560 bp, respectively. There are 142 known genes, including 94 protein-coding genes, eight ribosomal RNA genes, and 40 tRNA genes. The topology of the phylogenetic relationships between Apiales, Asterales, and Dipsacales differed between analyses based on complete genome sequences and on 36 shared protein-coding genes, showing that further studies of campanulid phylogeny are needed.

  13. The complete chloroplast genome sequence of Helwingia himalaica (Helwingiaceae, Aquifoliales) and a chloroplast phylogenomic analysis of the Campanulidae

    Directory of Open Access Journals (Sweden)

    Xin Yao

    2016-11-01

    Full Text Available Complete chloroplast genome sequences have been very useful for understanding phylogenetic relationships in angiosperms at the family level and above, but there are currently large gaps in coverage. We report the chloroplast genome for Helwingia himalaica, the first in the distinctive family Helwingiaceae and only the second genus to be sequenced in the order Aquifoliales. We then combine this with 36 published sequences in the large (c. 35,000 species) subclass Campanulidae in order to investigate relationships at the order and family levels. The Helwingia genome consists of 158,362 bp containing a pair of inverted repeat (IR) regions of 25,996 bp separated by a large single-copy (LSC) region and a small single-copy (SSC) region which are 87,810 and 18,560 bp, respectively. There are 142 known genes, including 94 protein-coding genes, eight ribosomal RNA genes, and 40 tRNA genes. The topology of the phylogenetic relationships between Apiales, Asterales, and Dipsacales differed between analyses based on complete genome sequences and on 36 shared protein-coding genes, showing that further studies of campanulid phylogeny are needed.

  14. Complete chloroplast genome sequence of Elodea canadensis and comparative analyses with other monocot plastid genomes.

    Science.gov (United States)

    Huotari, Tea; Korpelainen, Helena

    2012-10-15

    Elodea canadensis is an aquatic angiosperm native to North America. It has attracted great attention due to its invasive nature when transported to new areas in its non-native range. We have determined the complete nucleotide sequence of the chloroplast (cp) genome of Elodea. Taxonomically, Elodea is a basal monocot, and only a few monocot cp genomes representing early lineages of monocots have been sequenced so far. The genome is a circular double-stranded DNA molecule 156,700 bp in length, and has a typical structure with large (LSC 86,194 bp) and small (SSC 17,810 bp) single-copy regions separated by a pair of inverted repeats (IRs 26,348 bp each). The Elodea cp genome contains 113 unique genes and 16 duplicated genes in the IR regions. A comparative analysis showed that the gene order and organization of the Elodea cp genome is almost identical to that of Amborella trichopoda, a basal angiosperm. The structure of IRs in Elodea is unique among monocot species with the whole cp genome sequenced. In Elodea and another monocot, Lemna minor, the borders between IRs and LSC are located upstream of the rps19 gene and downstream of the trnH-GUG gene, while in most monocots, the IR has extended to include both the trnH and rps19 genes. A phylogenetic analysis conducted using a Bayesian method, based on the DNA sequences of 81 chloroplast genes from 17 monocot taxa, provided support for the placement of Elodea together with Lemna as a basal monocot and the next diverging lineage of monocots after Acorales. In comparison with other monocots, the Elodea cp genome has gone through only a few rearrangements or gene losses. The IR of Elodea has a unique structure among the monocot species studied so far, as its structure is similar to that of the basal angiosperm Amborella. This result, together with phylogenetic analyses, supports the placement of Elodea as a basal monocot and the next diverging lineage of monocots after Acorales.

  15. Mitochondrial DNA, chloroplast DNA and the origins of development in eukaryotic organisms

    Directory of Open Access Journals (Sweden)

    Bendich Arnold J

    2010-06-01

    Full Text Available Abstract Background: Several proposals have been made to explain the rise of multicellular life forms. An internal environment can be created and controlled, germ cells can be protected in novel structures, and increased organismal size allows a "division of labor" among cell types. These proposals describe advantages of multicellular versus unicellular organisms at levels of organization at or above the individual cell. I focus on a subsequent phase of evolution, when multicellular organisms initiated the process of development that later became the more complex embryonic development found in animals and plants. The advantage here is realized at the level of the mitochondrion and chloroplast. Hypothesis: The extreme instability of DNA in mitochondria and chloroplasts has not been widely appreciated even though it was first reported four decades ago. Here, I show that the evolutionary success of multicellular animals and plants can be traced to the protection of organellar DNA. Three stages are envisioned. Sequestration allowed mitochondria and chloroplasts to be placed in "quiet" germ line cells so that their DNA is not exposed to the oxidative stress produced by these organelles in "active" somatic cells. This advantage then provided Opportunity, a period of time during which novel processes arose for signaling within and between cells and (in animals) for cell-cell recognition molecules to evolve. Development then led to the enormous diversity of animals and plants. Implications: The potency of a somatic stem cell is its potential to generate cell types other than itself, and this is a systems property. One of the biochemical properties required for stemness to emerge from a population of cells might be the metabolic quiescence that protects organellar DNA from oxidative stress. Reviewers: This article was reviewed by John Logsdon, Arcady Mushegian, and Patrick Forterre.

  16. Real-Time PCR Quantification of Chloroplast DNA Supports DNA Barcoding of Plant Species.

    Science.gov (United States)

    Kikkawa, Hitomi S; Tsuge, Kouichiro; Sugita, Ritsuko

    2016-03-01

    Species identification from extracted DNA is sometimes needed for botanical samples. DNA quantification is required for an accurate and effective examination. If a quantitative assay provides unreliable estimates, a higher quantity of DNA than the estimated amount may be used in additional analyses to avoid failure to analyze samples from which extracting DNA is difficult. Compared with conventional methods, real-time quantitative PCR (qPCR) requires a low amount of DNA and enables accurate quantification of dilute DNA solutions. The aim of this study was to develop a qPCR assay for quantification of chloroplast DNA from taxonomically diverse plant species. An absolute quantification method was developed using primers targeting the ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit (rbcL) gene using SYBR Green I-based qPCR. The calibration curve was generated using the PCR amplicon as the template. DNA extracts from representatives of 13 plant families common in Japan. This demonstrates that qPCR analysis is an effective method for quantification of DNA from plant samples. The results of qPCR assist in the decision-making that will determine the success or failure of DNA analysis, indicating the possibility of optimization of the procedure for downstream reactions.
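
    Absolute quantification with a calibration curve, as described in this abstract, is commonly done by fitting threshold cycle (Ct) against log10 of the template copy number and inverting the fit for unknown samples. The sketch below shows that general approach only; the dilution series and Ct values are hypothetical illustration data, and the function names are assumptions, none of them taken from the cited study.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_copies(ct, slope, intercept):
    """Invert the standard curve to estimate template copies for a sample Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold dilution series of amplicon standards (copies per reaction).
standards = [1e2, 1e3, 1e4, 1e5, 1e6]
cts = [30.0, 26.7, 23.4, 20.1, 16.8]  # made-up Ct values, not measured data

slope, intercept = fit_standard_curve(standards, cts)
efficiency = 10 ** (-1 / slope) - 1  # amplification efficiency implied by the slope
print(slope, intercept, efficiency)
print(estimate_copies(25.0, slope, intercept))
```

    A slope near -3.32 corresponds to an efficiency near 100%; a real assay would also report the goodness of fit of the curve.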

  17. Two major groups of chloroplast DNA haplotypes in diploid and tetraploid Aconitum subgen. Aconitum (Ranunculaceae) in the Carpathians

    Directory of Open Access Journals (Sweden)

    J. Mitka

    2016-04-01

    Full Text Available Aconitum in Europe is represented by ca. 10% of the total number of species and the Carpathian Mts. are the center of the genus variability in the subcontinent. We studied the chloroplast DNA intergenic spacer trnL(UAG)-rpl32-ndhF (cpDNA) variability of Aconitum subgen. Aconitum in the Carpathians: diploids (2n=16, sect. Cammarum), tetraploids (2n=32, sect. Aconitum) and triploids (2n=24, nothosect. Acomarum). Altogether 25 Aconitum accessions representing the whole taxonomic variability of the subgenus were sequenced and subjected to phylogenetic analyses. Parsimony, Bayesian and character network analyses all showed two distinct types of cpDNA, one typical of the diploid and the other of the tetraploid group. Some specimens had identical cpDNA sequences (haplotypes) and were scattered across the whole mountain arc. In sect. Aconitum 9 specimens shared one haplotype, while in sect. Cammarum one haplotype represented 4 accessions and a second 5 accessions. The diploids and tetraploids diverged by 6 mutations, while the intrasectional variability amounted to at most 3 polymorphisms. Taking into consideration the different types of cpDNA haplotypes and the ecological profiles of the sections (tetraploids: high-mountain species; diploids: species from the forest montane belt), we speculate on the different and independent history of the sections in the Carpathians.

  18. Chloroplast DNA variation of oaks in western Central Europe and genetic consequences of human influences

    NARCIS (Netherlands)

    König, A.O.; Ziegenhagen, B.; Dam, van B.C.; Csaikl, U.M.; Coart, E.; Degen, B.; Burg, K.; Vries, de S.M.G.; Petit, R.J.

    2002-01-01

    Oak chloroplast DNA (cpDNA) variation was studied in a grid-based inventory in western Central Europe, including Belgium, The Netherlands, Luxembourg, Germany, the Czech Republic, and the northern parts of Upper and Lower Austria. A total of 2155 trees representing 426 populations of Quercus robur

  19. Complete chloroplast genome sequence of a major allogamous forage species, perennial ryegrass (Lolium perenne L.).

    Science.gov (United States)

    Diekmann, Kerstin; Hodkinson, Trevor R; Wolfe, Kenneth H; van den Bekerom, Rob; Dix, Philip J; Barth, Susanne

    2009-06-01

    Lolium perenne L. (perennial ryegrass) is globally one of the most important forage and grassland crops. We sequenced the chloroplast (cp) genome of Lolium perenne cultivar Cashel. The L. perenne cp genome is 135 282 bp with a typical quadripartite structure. It contains genes for 76 unique proteins, 30 tRNAs and four rRNAs. As in other grasses, the genes accD, ycf1 and ycf2 are absent. The genome is of average size within its subfamily Pooideae and of medium size within the Poaceae. Genome size differences are mainly due to length variations in non-coding regions. However, considerable length differences of 1-27 codons in comparison of L. perenne to other Poaceae and 1-68 codons among all Poaceae were also detected. Within the cp genome of this outcrossing cultivar, 10 insertion/deletion polymorphisms and 40 single nucleotide polymorphisms were detected. Two of the polymorphisms involve tiny inversions within hairpin structures. By comparing the genome sequence with RT-PCR products of transcripts for 33 genes, 31 mRNA editing sites were identified, five of them unique to Lolium. The cp genome sequence of L. perenne is available under Accession number AM777385 at the European Molecular Biology Laboratory, National Center for Biotechnology Information and DNA DataBank of Japan.

  20. Repeated DNA sequences in fungi

    Energy Technology Data Exchange (ETDEWEB)

    Dutta, S K

    1974-11-01

    Several fungal species, representatives of all broad groups like basidiomycetes, ascomycetes and phycomycetes, were examined for the nature of repeated DNA sequences by DNA:DNA reassociation studies using hydroxyapatite chromatography. All of the fungal species tested contained 10 to 20 percent repeated DNA sequences. There are approximately 100 to 110 copies of repeated DNA sequences of approximately 4 x 10^7 daltons piece size of each. Repeated DNA sequence homoduplexes showed on average 5 °C difference of T_e50 (temperature at which 50 percent duplexes dissociate) values from the corresponding homoduplexes of unfractionated whole DNA. It is suggested that a part of repetitive sequences in fungi constitutes mitochondrial DNA and a part of it constitutes nuclear DNA. (auth)

  1. Genetic population structure of the desert shrub species Lycium ruthenicum inferred from chloroplast DNA

    International Nuclear Information System (INIS)

    Chen, H.; Yonezawa, T.

    2014-01-01

    Lycium ruthenicum (Solanaceae), a spiny shrub mostly distributed in the desert regions of north and northwest China, has been shown to exhibit high tolerance to the extreme environment. In this study, the phylogeography and evolutionary history of L. ruthenicum were examined on the basis of 80 individuals from eight populations. Using the sequence variations of two spacer regions of chloroplast DNA (trnH-psbA and rps16-trnK), the absence of a geographic component in the chloroplast DNA genetic structure was identified (GST = 0.351, NST = 0.304, NST < GST), which was consistent with the result of SAMOVA, suggesting weak phylogeographic structure of this species. Phylogenetic and network analyses showed that a total of 10 haplotypes identified in the present study clustered into two clades, in which clade I harbored the ancestral haplotypes, from which two independent glacial refugia in the middle of Qaidam Basin and the western Inner Mongolia were inferred. The existence of regional evolutionary differences was supported by GENETREE, which revealed that one of the populations in Qaidam Basin and the two populations in Tarim Basin had experienced rapid expansion, and the other populations retained relatively stable population size during the Pleistocene. Given the results of long-term gene flow and pairwise differences, strong gene flow was insufficient to reduce the genetic differentiation among populations or within populations, probably due to the genetic composition containing a common haplotype and the high number of private haplotypes fixed for most of the population. The divergence times of different lineages were consistent with the rapid uplift phases of the Qinghai-Tibetan Plateau and the initiation and expansion of deserts in northern China, suggesting that the origin and evolution of L. ruthenicum were strongly influenced by Quaternary environment changes. (author)

  2. Utilization of complete chloroplast genomes for phylogenetic studies

    NARCIS (Netherlands)

    Ramlee, Shairul Izan Binti

    2016-01-01

    Chloroplast DNA sequence polymorphisms are a primary source of data in many plant phylogenetic studies. The chloroplast genome is relatively conserved in its evolution making it an ideal molecule to retain phylogenetic signals. The chloroplast genome is also largely, but not completely, free from

  3. [Isolation and partial characterization of DNA topoisomerase I from the nucleoids of white mustard chloroplasts].

    Science.gov (United States)

    Belkina, G G; Pogul'skaia, E V; Iurina, N P

    2004-01-01

    DNA topoisomerase was isolated for the first time from nucleoids of white mustard (Sinapis alba L.) chloroplasts. The enzyme had a molecular weight of 70 kDa; it was ATP-independent, required the presence of mono- (K+) and bivalent (Mg2+) cations, and was capable of relaxing both negatively and positively supercoiled DNA. These results suggest that the enzyme isolated belongs to type IB DNA topoisomerases.

  4. Biosensors for DNA sequence detection

    Science.gov (United States)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  5. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes

    Science.gov (United States)

    Matthew Parks; Richard Cronn; Aaron Liston

    2009-01-01

    We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome) generated using multiplexed massively parallel sequencing. We found that 30/33 ingroup nodes resolved with > 95-percent bootstrap support; this is a substantial improvement relative...

  6. Insights into phylogeny, sex function and age of Fragaria based on whole chloroplast genome sequencing

    Science.gov (United States)

    Wambui Njunguna; Aaron Liston; Richard Cronn; Tia-Lynn Ashman; Nahla Bassil

    2013-01-01

    The cultivated strawberry is one of the youngest domesticated plants, developed in France in the 1700s from chance hybridization between two western hemisphere octoploid species. However, little is known about the evolution of the species that gave rise to this important fruit crop. Phylogenetic analysis of chloroplast genome sequences of 21 Fragaria...

  7. The Complete Chloroplast Genome Sequences of the Medicinal Plant Forsythia suspensa (Oleaceae)

    Directory of Open Access Journals (Sweden)

    Wenbin Wang

    2017-10-01

    Full Text Available Forsythia suspensa is an important medicinal plant and traditionally applied for the treatment of inflammation, pyrexia, gonorrhea, diabetes, and so on. However, there is limited sequence and genomic information available for F. suspensa. Here, we produced the complete chloroplast genomes of F. suspensa using Illumina sequencing technology. F. suspensa is the first sequenced member within the genus Forsythia (Oleaceae). The gene order and organization of the chloroplast genome of F. suspensa are similar to other Oleaceae chloroplast genomes. The F. suspensa chloroplast genome is 156,404 bp in length, exhibits a conserved quadripartite structure with a large single-copy (LSC; 87,159 bp) region, and a small single-copy (SSC; 17,811 bp) region interspersed between inverted repeat (IRa/b; 25,717 bp) regions. A total of 114 unique genes were annotated, including 80 protein-coding genes, 30 tRNA, and four rRNA. The low GC content (37.8%) and codon usage bias for A- or T-ending codons may largely affect gene codon usage. Sequence analysis identified a total of 26 forward repeats, 23 palindrome repeats with lengths >30 bp (identity > 90%), and 54 simple sequence repeats (SSRs) with an average rate of 0.35 SSRs/kb. We predicted 52 RNA editing sites in the chloroplast of F. suspensa, all for C-to-U transitions. IR expansion or contraction and the divergent regions were analyzed among several species including the reported F. suspensa in this study. Phylogenetic analysis based on whole-plastome revealed that F. suspensa, as a member of the Oleaceae family, diverged relatively early from Lamiales. This study will contribute to strengthening medicinal resource conservation, molecular phylogenetic, and genetic engineering research investigations of this species.
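
    The SSR density quoted in this abstract follows directly from the reported SSR count and genome length. A minimal check, using a hypothetical helper name, might look like this:

```python
def ssr_density_per_kb(ssr_count: int, genome_length_bp: int) -> float:
    """Number of SSRs per kilobase of genome sequence."""
    return ssr_count / (genome_length_bp / 1000)


# Values reported in the F. suspensa abstract: 54 SSRs in a 156,404 bp genome.
density = ssr_density_per_kb(54, 156_404)
print(round(density, 2))  # 0.35
```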

  8. Unique haplotypes of cacao trees as revealed by trnH-psbA chloroplast DNA

    Directory of Open Access Journals (Sweden)

    Nidia Gutiérrez-López

    2016-04-01

    Full Text Available Cacao trees have been cultivated in Mesoamerica for at least 4,000 years. In this study, we analyzed sequence variation in the chloroplast DNA trnH-psbA intergenic spacer from 28 cacao trees from different farms in the Soconusco region in southern Mexico. Genetic relationships were established by two analysis approaches based on geographic origin (five populations) and genetic origin (based on a previous study). We identified six polymorphic sites, including five insertion/deletion (indel) types and one transversion. The overall nucleotide diversity was low for both approaches (geographic = 0.0032 and genetic = 0.0038). Conversely, we obtained moderate to high haplotype diversity (0.66 and 0.80) with 10 and 12 haplotypes, respectively. The common haplotype (H1) for both networks included cacao trees from all geographic locations (geographic approach) and four genetic groups (genetic approach). This common haplotype (ancient) derived a set of intermediate haplotypes and singletons interconnected by one or two mutational steps, which suggested directional selection and event purification from the expansion of narrow populations. Cacao trees from Soconusco region were grouped into one cluster without any evidence of subclustering based on AMOVA (FST = 0) and SAMOVA (FST = 0.04393) results. One population (Mazatán) showed a high haplotype frequency; thus, this population could be considered an important reservoir of genetic material. The indels located in the trnH-psbA intergenic spacer of cacao trees could be useful as markers for the development of DNA barcoding.

  9. The chloroplast genome sequence of the green alga Leptosira terrestris: multiple losses of the inverted repeat and extensive genome rearrangements within the Trebouxiophyceae

    Directory of Open Access Journals (Sweden)

    Turmel Monique

    2007-07-01

    Full Text Available Abstract Background: In the Chlorophyta – the green algal phylum comprising the classes Prasinophyceae, Ulvophyceae, Trebouxiophyceae and Chlorophyceae – the chloroplast genome displays a highly variable architecture. While chlorophycean chloroplast DNAs (cpDNAs) deviate considerably from the ancestral pattern described for the prasinophyte Nephroselmis olivacea, the degree of remodelling sustained by the two ulvophyte cpDNAs completely sequenced to date is intermediate relative to those observed for chlorophycean and trebouxiophyte cpDNAs. Chlorella vulgaris (Chlorellales) is currently the only photosynthetic trebouxiophyte whose complete cpDNA sequence has been reported. To gain insights into the evolutionary trends of the chloroplast genome in the Trebouxiophyceae, we sequenced cpDNA from the filamentous alga Leptosira terrestris (Ctenocladales). Results: The 195,081-bp Leptosira chloroplast genome resembles the 150,613-bp Chlorella genome in lacking a large inverted repeat (IR) but differs greatly in gene order. Six of the conserved genes present in Chlorella cpDNA are missing from the Leptosira gene repertoire. The 106 conserved genes, four introns and 11 free standing open reading frames (ORFs) account for 48.3% of the genome sequence. This is the lowest gene density yet observed among chlorophyte cpDNAs. Contrary to the situation in Chlorella but similar to that in the chlorophycean Scenedesmus obliquus, the gene distribution is highly biased over the two DNA strands in Leptosira. Nine genes, compared to only three in Chlorella, have significantly expanded coding regions relative to their homologues in ancestral-type green algal cpDNAs. As observed in chlorophycean genomes, the rpoB gene is fragmented into two ORFs. Short repeats account for 5.1% of the Leptosira genome sequence and are present mainly in intergenic regions. Conclusion: Our results highlight the great plasticity of the chloroplast genome in the Trebouxiophyceae and indicate

  10. ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants.

    Science.gov (United States)

    Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh

    2014-01-01

    Simple sequence repeats (SSRs) are regions of a DNA sequence that contain repeating motifs 1-6 nucleotides long. These repeats are ubiquitous and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences of Viridiplantae (as of 18 September 2014) are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for SSRs and store them in a database. To manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, a relational database developed using SQL Server 2008 and accessed through ASP.NET. It provides information on all three types of SSRs (perfect, imperfect and compound). At present, ChloroSSRdb contains 124 430 mined SSRs, the majority lying in non-coding regions. PCR primers were designed for 118 249 of these SSRs. Tetranucleotide repeats (47 079) were the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, statistical analyses were performed for each species to calculate the relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb should prove a useful resource for developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ © The Author(s) 2014. Published by Oxford University Press.
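    The abstract above defines a perfect SSR as a tandem run of a motif 1-6 nucleotides long. The following is a minimal sketch of how such runs might be located with a regular-expression backreference. It is not the ChloroSSRdb pipeline: the function name `find_perfect_ssrs` and the minimum repeat counts in `MIN_REPEATS` are illustrative assumptions, and imperfect or compound SSRs are not handled.

```python
import re

# Minimum number of tandem copies per motif length.
# These thresholds are illustrative assumptions, not ChloroSSRdb's settings.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 3, 6: 3}


def find_perfect_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs.

    `start` is the 0-based position of the first base of the repeat run.
    """
    sequence = sequence.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n-1,} demands n-1 or more extra copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" is (A)n.
            if any(motif == motif[:d] * (motif_len // d)
                   for d in range(1, motif_len) if motif_len % d == 0):
                continue
            copies = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, copies))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "A" * 12 + "CGC" + "AT" * 7 + "GTT" + "AGC" * 5 + "T"
    for start, motif, copies in find_perfect_ssrs(demo):
        print(start, motif, copies)
```

    Because the scan runs left to right, a run may be reported under a rotated motif (for example TA instead of AT) when the flanking base happens to extend it; a production tool would normalise motifs before counting repeat types.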

  11. Complete chloroplast genome sequence of a tree fern Alsophila spinulosa: insights into evolutionary changes in fern chloroplast genomes.

    Science.gov (United States)

    Gao, Lei; Yi, Xuan; Yang, Yong-Xia; Su, Ying-Juan; Wang, Ting

    2009-06-11

    Ferns have generally been neglected in studies of chloroplast genomics. Before this study, only one polypod and two basal ferns had their complete chloroplast (cp) genome reported. Tree ferns represent an ancient fern lineage that first occurred in the Late Triassic. In recent phylogenetic analyses, tree ferns were shown to be the sister group of polypods, the most diverse group of living ferns. Availability of cp genome sequence from a tree fern will facilitate interpretation of the evolutionary changes of fern cp genomes. Here we have sequenced the complete cp genome of a scaly tree fern Alsophila spinulosa (Cyatheaceae). The Alsophila cp genome is 156,661 base pairs (bp) in size, and has a typical quadripartite structure with the large (LSC, 86,308 bp) and small single copy (SSC, 21,623 bp) regions separated by two copies of an inverted repeat (IRs, 24,365 bp each). This genome contains 117 different genes encoding 85 proteins, 4 rRNAs and 28 tRNAs. Pseudogenes of ycf66 and trnT-UGU are also detected in this genome. A unique trnR-UCG gene (derived from trnR-CCG) is found between rbcL and accD. The Alsophila cp genome shares some unusual characteristics with the previously sequenced cp genome of the polypod fern Adiantum capillus-veneris, including the absence of 5 tRNA genes that exist in most other cp genomes. The genome shows a high degree of synteny with that of Adiantum, but differs considerably from two basal ferns (Angiopteris evecta and Psilotum nudum). At one endpoint of an ancient inversion we detected a highly repeated 565-bp-region that is absent from the Adiantum cp genome. An additional minor inversion of the trnD-GUC, which is possibly shared by all ferns, was identified by comparison between the fern and other land plant cp genomes. By comparing four fern cp genome sequences it was confirmed that two major rearrangements distinguish higher leptosporangiate ferns from basal fern lineages. The Alsophila cp genome is very similar to that of the

  12. Complete chloroplast genome sequence of a tree fern Alsophila spinulosa: insights into evolutionary changes in fern chloroplast genomes

    Directory of Open Access Journals (Sweden)

    Yang Yong-Xia

    2009-06-01

    Full Text Available Abstract. Background: Ferns have generally been neglected in studies of chloroplast genomics. Before this study, only one polypod and two basal ferns had their complete chloroplast (cp) genome reported. Tree ferns represent an ancient fern lineage that first occurred in the Late Triassic. In recent phylogenetic analyses, tree ferns were shown to be the sister group of polypods, the most diverse group of living ferns. Availability of cp genome sequence from a tree fern will facilitate interpretation of the evolutionary changes of fern cp genomes. Here we have sequenced the complete cp genome of a scaly tree fern Alsophila spinulosa (Cyatheaceae). Results: The Alsophila cp genome is 156,661 base pairs (bp) in size, and has a typical quadripartite structure with the large (LSC, 86,308 bp) and small single copy (SSC, 21,623 bp) regions separated by two copies of an inverted repeat (IRs, 24,365 bp each). This genome contains 117 different genes encoding 85 proteins, 4 rRNAs and 28 tRNAs. Pseudogenes of ycf66 and trnT-UGU are also detected in this genome. A unique trnR-UCG gene (derived from trnR-CCG) is found between rbcL and accD. The Alsophila cp genome shares some unusual characteristics with the previously sequenced cp genome of the polypod fern Adiantum capillus-veneris, including the absence of 5 tRNA genes that exist in most other cp genomes. The genome shows a high degree of synteny with that of Adiantum, but differs considerably from two basal ferns (Angiopteris evecta and Psilotum nudum). At one endpoint of an ancient inversion we detected a highly repeated 565-bp-region that is absent from the Adiantum cp genome. An additional minor inversion of the trnD-GUC, which is possibly shared by all ferns, was identified by comparison between the fern and other land plant cp genomes. Conclusion: By comparing four fern cp genome sequences it was confirmed that two major rearrangements distinguish higher leptosporangiate ferns from basal fern lineages. The

  13. The Role of Heterologous Chloroplast Sequence Elements in Transgene Integration and Expression

    Science.gov (United States)

    Ruhlman, Tracey; Verma, Dheeraj; Samson, Nalapalli; Daniell, Henry

    2010-01-01

    Heterologous regulatory elements and flanking sequences have been used in chloroplast transformation of several crop species, but their roles and mechanisms have not yet been investigated. Nucleotide sequence identity in the photosystem II protein D1 (psbA) upstream region is 59% across all taxa; similar variation was consistent across all genes and taxa examined. Secondary structure and predicted Gibbs free energy values of the psbA 5′ untranslated region (UTR) among different families reflected this variation. Therefore, chloroplast transformation vectors were made for tobacco (Nicotiana tabacum) and lettuce (Lactuca sativa), with endogenous (Nt-Nt, Ls-Ls) or heterologous (Nt-Ls, Ls-Nt) psbA promoter, 5′ UTR and 3′ UTR, regulating expression of the anthrax protective antigen (PA) or human proinsulin (Pins) fused with the cholera toxin B-subunit (CTB). Unique lettuce flanking sequences were completely eliminated during homologous recombination in the transplastomic tobacco genomes but not unique tobacco sequences. Nt-Ls or Ls-Nt transplastomic lines showed reduction of 80% PA and 97% CTB-Pins expression when compared with endogenous psbA regulatory elements, which accumulated up to 29.6% total soluble protein PA and 72.0% total leaf protein CTB-Pins, 2-fold higher than Rubisco. Transgene transcripts were reduced by 84% in Ls-Nt-CTB-Pins and by 72% in Nt-Ls-PA lines. Transcripts containing endogenous 5′ UTR were stabilized in nonpolysomal fractions. Stromal RNA-binding proteins were preferentially associated with endogenous psbA 5′ UTR. A rapid and reproducible regeneration system was developed for lettuce commercial cultivars by optimizing plant growth regulators. These findings underscore the need for sequencing complete crop chloroplast genomes, utilization of endogenous regulatory elements and flanking sequences, as well as optimization of plant growth regulators for efficient chloroplast transformation. PMID:20130101

  14. The role of heterologous chloroplast sequence elements in transgene integration and expression.

    Science.gov (United States)

    Ruhlman, Tracey; Verma, Dheeraj; Samson, Nalapalli; Daniell, Henry

    2010-04-01

    Heterologous regulatory elements and flanking sequences have been used in chloroplast transformation of several crop species, but their roles and mechanisms have not yet been investigated. Nucleotide sequence identity in the photosystem II protein D1 (psbA) upstream region is 59% across all taxa; similar variation was consistent across all genes and taxa examined. Secondary structure and predicted Gibbs free energy values of the psbA 5' untranslated region (UTR) among different families reflected this variation. Therefore, chloroplast transformation vectors were made for tobacco (Nicotiana tabacum) and lettuce (Lactuca sativa), with endogenous (Nt-Nt, Ls-Ls) or heterologous (Nt-Ls, Ls-Nt) psbA promoter, 5' UTR and 3' UTR, regulating expression of the anthrax protective antigen (PA) or human proinsulin (Pins) fused with the cholera toxin B-subunit (CTB). Unique lettuce flanking sequences were completely eliminated during homologous recombination in the transplastomic tobacco genomes but not unique tobacco sequences. Nt-Ls or Ls-Nt transplastomic lines showed reduction of 80% PA and 97% CTB-Pins expression when compared with endogenous psbA regulatory elements, which accumulated up to 29.6% total soluble protein PA and 72.0% total leaf protein CTB-Pins, 2-fold higher than Rubisco. Transgene transcripts were reduced by 84% in Ls-Nt-CTB-Pins and by 72% in Nt-Ls-PA lines. Transcripts containing endogenous 5' UTR were stabilized in nonpolysomal fractions. Stromal RNA-binding proteins were preferentially associated with endogenous psbA 5' UTR. A rapid and reproducible regeneration system was developed for lettuce commercial cultivars by optimizing plant growth regulators. These findings underscore the need for sequencing complete crop chloroplast genomes, utilization of endogenous regulatory elements and flanking sequences, as well as optimization of plant growth regulators for efficient chloroplast transformation.

  15. Motif analysis unveils the possible co-regulation of chloroplast genes and nuclear genes encoding chloroplast proteins.

    Science.gov (United States)

    Wang, Ying; Ding, Jun; Daniell, Henry; Hu, Haiyan; Li, Xiaoman

    2012-09-01

    Chloroplasts play critical roles in land plant cells. Despite their importance and the availability of at least 200 sequenced chloroplast genomes, the number of known DNA regulatory sequences in chloroplast genomes is limited. In this paper, we designed computational methods to systematically study putative DNA regulatory sequences in intergenic regions near chloroplast genes in seven plant species and in promoter sequences of nuclear genes in Arabidopsis and rice. We found that -35/-10 elements alone cannot explain the transcriptional regulation of chloroplast genes. We also concluded that motifs shared by the intergenic sequences of most chloroplast genes are unlikely to exist, indicating that these genes are regulated differently. Finally, and surprisingly, we found that five conserved motifs, each of which occurs in no more than six chloroplast intergenic sequences, are significantly shared by promoters of nuclear genes encoding chloroplast proteins. By integrating information from gene function annotation, protein subcellular localization analyses, protein-protein interaction data, and gene expression data, we further found support for the functionality of these conserved motifs. Our study implies the existence of unknown nuclear-encoded transcription factors that regulate both chloroplast genes and nuclear genes encoding chloroplast proteins, which sheds light on the transcriptional regulation of chloroplast genes.

  16. The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae)

    Energy Technology Data Exchange (ETDEWEB)

    Wolf, Paul G.; Karol, Kenneth G.; Mandoli, Dina F.; Kuehl, Jennifer V.; Arumuganathan, K.; Ellis, Mark W.; Mishler, Brent D.; Kelch, Dean G.; Olmstead, Richard G.; Boore, Jeffrey L.

    2005-02-01

    We used a unique combination of techniques to sequence the first complete chloroplast genome of a lycophyte, Huperzia lucidula. This plant belongs to a significant clade hypothesized to represent the sister group to all other vascular plants. We used fluorescence-activated cell sorting (FACS) to isolate the organelles, rolling circle amplification (RCA) to amplify the genome, and shotgun sequencing to 8x depth coverage to obtain the complete chloroplast genome sequence. The genome is 154,373 bp, containing inverted repeats of 15,314 bp each, a large single-copy region of 104,088 bp, and a small single-copy region of 19,671 bp. Gene order is more similar to those of mosses, liverworts, and hornworts than to gene order for other vascular plants. For example, the Huperzia chloroplast genome possesses the bryophyte gene order for a previously characterized 30 kb inversion, thus supporting the hypothesis that lycophytes are sister to all other extant vascular plants. The lycophyte chloroplast genome data also enable a better reconstruction of the basal tracheophyte genome, which is useful for inferring relationships among bryophyte lineages. Several unique characters are observed in Huperzia, such as movement of the gene ndhF from the small single copy region into the inverted repeat. We present several analyses of evolutionary relationships among land plants by using nucleotide data, amino acid sequences, and by comparing gene arrangements from chloroplast genomes. The results, while still tentative pending the large number of chloroplast genomes from other key lineages that are soon to be sequenced, are intriguing in themselves, and contribute to a growing comparative database of genomic and morphological data across the green plants.

  17. The complete chloroplast genome sequence of Dodonaea viscosa: comparative and phylogenetic analyses.

    Science.gov (United States)

    Saina, Josphat K; Gichira, Andrew W; Li, Zhi-Zhong; Hu, Guang-Wan; Wang, Qing-Feng; Liao, Kuo

    2018-02-01

    The plant chloroplast (cp) genome is a highly conserved structure, which makes it useful for evolutionary and systematic research. Numerous complete cp genome sequences have now been reported thanks to high-throughput sequencing technology. However, no complete chloroplast genome of the genus Dodonaea has been reported before. To better understand the molecular basis of the Dodonaea viscosa chloroplast, we used Illumina sequencing technology to sequence its complete genome. The whole length of the cp genome is 159,375 base pairs (bp), with a pair of inverted repeats (IRs) of 27,099 bp separated by a large single copy (LSC) region of 87,204 bp and a small single copy (SSC) region of 17,972 bp. The annotation analysis revealed a total of 115 unique genes, of which 81 were protein-coding genes, 30 were tRNA genes, and four were ribosomal RNA genes. Comparative genome analysis with other closely related Sapindaceae members showed conserved gene order in the inverted and single copy regions. Phylogenetic analysis clustered D. viscosa with other species of Sapindaceae with strong bootstrap support. Finally, a total of 249 SSRs were detected. Moreover, a comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates in D. viscosa showed very low values. The cp genome reported here provides a valuable genetic resource for further comprehensive studies of genetic variation, taxonomy and phylogenetic evolution in the Sapindaceae family. In addition, the SSR markers detected will be used in further phylogeographic and population structure studies of the species in this genus.

  18. Comparative chloroplast genomes of eleven Schima (Theaceae) species: Insights into DNA barcoding and phylogeny.

    Science.gov (United States)

    Yu, Xiang-Qin; Drew, Bryan T; Yang, Jun-Bo; Gao, Lian-Ming; Li, De-Zhu

    2017-01-01

    Schima is an ecologically and economically important woody genus in the tea family (Theaceae). Unresolved species delimitations and phylogenetic relationships within Schima limit our understanding of the genus and hinder utilization of the genus for economic purposes. In the present study, we conducted a comparative analysis of the complete chloroplast (cp) genomes of 11 Schima species. Our results indicate that Schima cp genomes possess a typical quadripartite structure, with conserved genomic structure and gene order. The size of the Schima cp genome is about 157 kilobase pairs (kb). The genomes consistently encode 114 unique genes, including 80 protein-coding genes, 30 tRNAs, and 4 rRNAs, with 17 duplicated in the inverted repeat (IR). These cp genomes are highly conserved and do not show obvious expansion or contraction of the IR region. The percent variability of the 68 coding and 93 noncoding (>150 bp) fragments is consistently less than 3%. The seven most widely touted DNA barcode regions as well as one promising barcode candidate showed low sequence divergence. Eight mutational hotspots were identified from the 11 cp genomes. These hotspots may potentially be useful as specific DNA barcodes for species identification of Schima. The 58 cpSSR loci reported here are complementary to the microsatellite markers identified from the nuclear genome, and will be leveraged for further population-level studies. Phylogenetic relationships among the 11 Schima species were resolved with strong support based on the cp genome data set, which corresponds well with the species distribution pattern. The data presented here will serve as a foundation to facilitate species identification, DNA barcoding and phylogenetic reconstructions for future exploration of Schima.

  19. Complete sequence and comparative analysis of the chloroplast genome of coconut palm (Cocos nucifera).

    Science.gov (United States)

    Huang, Ya-Yi; Matzke, Antonius J M; Matzke, Marjori

    2013-01-01

    Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available.

  20. Identification of the "A" Genome of Finger Millet Using Chloroplast DNA

    Science.gov (United States)

    Hilu, K. W.

    1988-01-01

    Finger millet (Eleusine coracana subsp. coracana), an important cereal in East Africa and India, is a tetraploid species with unknown genomic components. A recent cytogenetic study confirmed the direct origin of this millet from the tetraploid E. coracana subsp. africana but questioned Eleusine indica as a genomic donor. Chloroplast (ct) DNA sequence analysis using restriction fragment patterns was used to examine the phylogenetic relationships between E. coracana subsp. coracana (domesticated finger millet), E. coracana subsp. africana (wild finger millet), and E. indica. Eleusine tristachya was included since it is the only other annual diploid species in the genus with a basic chromosome number of x = 9 like finger millet. Eight of the ten restriction endonucleases used had 16 to over 30 restriction sites per genome and were informative. E. coracana subsp. coracana and subsp. africana and E. indica were identical in all the restriction sites surveyed, while the ct genome of E. tristachya differed consistently by at least one mutational event for each restriction enzyme surveyed. This random survey of the ct genomes of these species points to E. indica as one of the genome donors (the maternal genome donor) of domesticated finger millet, contrary to a previous cytogenetic study. The data also substantiate E. coracana subsp. africana as the progenitor of domesticated finger millet. The disparity between the cytogenetic and the molecular approaches is discussed in light of the problems associated with chromosome pairing and polyploidy. PMID:8608927

  1. Identification of the "A" genome of finger millet using chloroplast DNA.

    Science.gov (United States)

    Hilu, K W

    1988-01-01

    Finger millet (Eleusine coracana subsp. coracana), an important cereal in East Africa and India, is a tetraploid species with unknown genomic components. A recent cytogenetic study confirmed the direct origin of this millet from the tetraploid E. coracana subsp. africana but questioned Eleusine indica as a genomic donor. Chloroplast (ct) DNA sequence analysis using restriction fragment patterns was used to examine the phylogenetic relationships between E. coracana subsp. coracana (domesticated finger millet), E. coracana subsp. africana (wild finger millet), and E. indica. Eleusine tristachya was included since it is the only other annual diploid species in the genus with a basic chromosome number of x = 9 like finger millet. Eight of the ten restriction endonucleases used had 16 to over 30 restriction sites per genome and were informative. E. coracana subsp. coracana and subsp. africana and E. indica were identical in all the restriction sites surveyed, while the ct genome of E. tristachya differed consistently by at least one mutational event for each restriction enzyme surveyed. This random survey of the ct genomes of these species points to E. indica as one of the genome donors (the maternal genome donor) of domesticated finger millet, contrary to a previous cytogenetic study. The data also substantiate E. coracana subsp. africana as the progenitor of domesticated finger millet. The disparity between the cytogenetic and the molecular approaches is discussed in light of the problems associated with chromosome pairing and polyploidy.

  2. Phylogenomic Analysis and Dynamic Evolution of Chloroplast Genomes in Salicaceae

    Directory of Open Access Journals (Sweden)

    Yuan Huang

    2017-06-01

    Full Text Available Chloroplast genomes of plants are highly conserved in both gene order and gene content. Analysis of the whole chloroplast genome is known to provide many more informative DNA sites and thus yields higher resolution for plant phylogenies. Here, we report the complete chloroplast genomes of three Salix species in the family Salicaceae. The phylogeny of Salicaceae inferred from complete chloroplast genomes is generally consistent with previous studies but resolved with higher statistical support. Incongruences of phylogeny, however, are observed in the genus Populus, which most likely result from homoplasy. By comparing three Salix chloroplast genomes with the published chloroplast genomes of other Salicaceae species, we demonstrate that the synteny and length of chloroplast genomes in Salicaceae are highly conserved but have experienced dynamic evolution among species. We identify seven positively selected chloroplast genes in Salicaceae, which might be related to the adaptive evolution of Salicaceae species. Comparative chloroplast genome analysis within the family also indicates that some chloroplast genes have been lost or have become pseudogenes, from which we infer that these chloroplast genes were horizontally transferred to the nuclear genome. Based on the complete nuclear genome sequences of two Salicaceae species, we find, remarkably, that the entire chloroplast genome has indeed been transferred to and integrated into the nuclear genome at least once in the individual used for the P. trichocarpa reference genome. This observation, along with the presence of large nuclear plastid DNAs (NUPTs) and of NUPTs containing multiple chloroplast genes in their original chloroplast order, favors the DNA-mediated hypothesis of organelle to nucleus DNA transfer. Overall, the phylogenomic analysis using complete chloroplast genomes clearly elucidates the phylogeny of Salicaceae. The identification of positively selected chloroplast genes and dynamic chloroplast-to-nucleus gene transfers in

  3. Genetic diversity of sago palm in Indonesia based on chloroplast DNA (cpDNA) markers

    Directory of Open Access Journals (Sweden)

    MEMEN SURAHMAN

    2010-07-01

    Full Text Available Abbas B, Renwarin Y, Bintoro MH, Sudarsono, Surahman M, Ehara H (2010) Genetic diversity of sago palm in Indonesia based on chloroplast DNA (cpDNA) markers. Biodiversitas 11: 112-117. Sago palm (Metroxylon sagu Rottb.) is believed to be capable of accumulating a high carbohydrate content in its trunk. The capability of sago palm to produce high carbohydrate yields should be an appropriate criterion for defining alternative crops in anticipation of food crises. The objective of this research was to study the genetic diversity of sago palm in Indonesia based on cpDNA markers. Total genome extraction was done following the Qiagen DNA isolation protocols 2003. Single Nucleotide Fragments (SNF) analyses were performed using ABI Prism GeneScan® 3.7. SNF analyses detected polymorphism revealing eleven alleles and ten haplotypes from a total of 97 individual samples of sago palm. Specific haplotypes were found in the populations from Papua, Sulawesi, and Kalimantan. Therefore, these three islands are considered origins of sago palm diversity in Indonesia. The highest number of haplotypes and the highest number of specific haplotypes were found in the population from Papua, suggesting this island as the centre and origin of sago palm diversity in Indonesia. However, the data are not yet sufficient to conclude that sago palm originated in Papua. Genetic differentiation of sago palm samples was significant within populations (P=0.04574), among populations (P=0.04772), and among populations within islands (P=0.03366), but not among islands (P=0.63069).

  4. Duplication in DNA Sequences

    Science.gov (United States)

    Ito, Masami; Kari, Lila; Kincaid, Zachary; Seki, Shinnosuke

    The duplication and repeat-deletion operations are the basis of a formal language theoretic model of errors that can occur during DNA replication. During DNA replication, subsequences of a strand of DNA may be copied several times (resulting in duplications) or skipped (resulting in repeat-deletions). As formal language operations, iterated duplication and repeat-deletion of words and languages have been well studied in the literature. However, little is known about single-step duplications and repeat-deletions. In this paper, we investigate several properties of these operations, including closure properties of language families in the Chomsky hierarchy and equations involving these operations. We also make progress toward a characterization of regular languages that are generated by duplicating a regular language.
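    The two single-step operations described above can be illustrated on finite words. The sketch below assumes the usual formulation, in which duplication maps a word xyz to xyyz for a non-empty factor y and repeat-deletion maps xyyz back to xyz; the function names `duplications` and `repeat_deletions` are hypothetical and are not taken from the paper.

```python
def duplications(word):
    """Return every word reachable from `word` by one duplication step.

    One step copies a non-empty factor y in place: xyz -> xyyz.
    """
    results = set()
    n = len(word)
    for i in range(n):
        for j in range(i + 1, n + 1):
            factor = word[i:j]  # the non-empty factor y = word[i:j]
            results.add(word[:j] + factor + word[j:])
    return results


def repeat_deletions(word):
    """Return every word reachable from `word` by one repeat-deletion step.

    One step removes one copy of an adjacent repeat: xyyz -> xyz.
    """
    results = set()
    n = len(word)
    for i in range(n):
        for length in range(1, (n - i) // 2 + 1):
            first = word[i:i + length]
            second = word[i + length:i + 2 * length]
            if first == second:
                results.add(word[:i + length] + word[i + 2 * length:])
    return results


if __name__ == "__main__":
    print(sorted(duplications("ab")))
    print(sorted(repeat_deletions("abab")))
```

    Under this formulation the two operations are inverse in the sense that every word produced by `duplications(u)` has `u` among its repeat-deletions, which the sketch makes easy to check on small examples.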

  5. Graphene nanodevices for DNA sequencing

    NARCIS (Netherlands)

    Heerema, S.J.; Dekker, C.

    2016-01-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with

  6. PHYLOGENETIC RELATIONSHIPS AMONG VIETNAMESE COCOA ACCESSIONS USING A NON-CODING REGION OF THE CHLOROPLAST DNA

    OpenAIRE

    Lam Thi, Viet Ha; D.T., Khang; Everaert, Helena; T.N, Dung; P.H.D, Phuoc; H.T., Toan; Dewettinck, Koen; Messens, Kathy

    2017-01-01

    Cocoa (Theobroma cacao L.) cultivation has increased in tropical areas around the world, including Vietnam, due to the high demand for cocoa beans for chocolate production. The genetic diversity of cocoa genotypes is recognized to be complex; however, their phylogenetic relationships need to be clarified. The present study aimed to classify the cocoa genotypes that are imported and cultivated in Vietnam based on a chloroplast DNA region. Sixty-three Vietnamese Cocoa accessions were collected f...

  7. Variability of chloroplast DNA and nuclear ribosomal DNA in cassava (Manihot esculenta Crantz) and its wild relatives.

    Science.gov (United States)

    Fregene, M A; Vargas, J; Ikea, J; Angel, F; Tohme, J; Asiedu, R A; Akoroda, M O; Roca, W M

    1994-11-01

    Chloroplast DNA (cpDNA) and nuclear ribosomal DNA (rDNA) variation was investigated in 45 accessions of cultivated and wild Manihot species. Ten independent mutations (8 point mutations and 2 length mutations) were identified using eight restriction enzymes and 12 heterologous cpDNA probes from mungbean. Restriction fragment length polymorphism analysis defined nine distinct chloroplast types, three of which were found among the cultivated accessions and six among the wild species. Cladistic analysis of the cpDNA data using parsimony yielded a hypothetical phylogeny of lineages among the cpDNAs of cassava and its wild relatives that is congruent with morphological evolutionary differentiation in the genus. The results of our survey of cpDNA, together with rDNA restriction site change at the intergenic spacer region and rDNA repeat unit length variation (using rDNA cloned fragments from taro as probe), suggest that cassava might have arisen from the domestication of wild tuberous accessions of some Manihot species, followed by intensive selection. M. esculenta subsp. flabellifolia is probably a wild progenitor. Introgressive hybridization with wild forms and pressures to adapt to the widely varying climates and topography in which cassava is found might have enhanced the crop's present-day variability.

  8. Restriction endonuclease analysis of chloroplast DNA in interspecies somatic Hybrids of Petunia.

    Science.gov (United States)

    Kumar, A; Cocking, E C; Bovenberg, W A; Kool, A J

    1982-12-01

    Restriction endonuclease cleavage pattern analysis of chloroplast DNA (cpDNA) of three different interspecific somatic hybrid plants revealed that the cytoplasms of the hybrids contained only cpDNA of P. parodii. The somatic hybrid plants analysed were those between P. parodii (wild type) + P. hybrida (wild type); P. parodii (wild type) + P. inflata (cytoplasmic albino mutant); P. parodii (wild type) + P. parviflora (nuclear albino mutant). The presence of only P. parodii chloroplasts in the somatic hybrid of P. parodii + P. inflata is possibly due to the stringent selection used for somatic hybrid production. However, in the case of the two other somatic hybrids P. parodii + P. hybrida and P. parodii + P. parviflora it was not possible to determine whether the presence of only P. parodii chloroplasts in these somatic hybrid plants was due to the nature of the selection schemes used or simply occurred by chance. The relevance of such somatic hybrid material for the study of genomic-cytoplasmic interaction is discussed, as well as the use of restriction endonuclease fragment patterns for the analysis of taxonomic and evolutionary inter-relationships in the genus Petunia.

  9. Transcriptional regulation and DNA methylation in plastids during transitional conversion of chloroplasts to chromoplasts.

    Science.gov (United States)

    Kobayashi, H; Ngernprasirtsiri, J; Akazawa, T

    1990-01-01

    During transitional conversion of chloroplasts to chromoplasts in ripening tomato (Lycopersicon esculentum) fruits, transcripts for several plastid genes for photosynthesis decreased to undetectable levels. Run-on transcription of plastids indicated that transcriptional regulation operated as a predominant factor. We found that most of the genes in chloroplasts were actively transcribed in vitro by Escherichia coli and soluble plastid RNA polymerases, but some genes in chromoplasts seemed to be silent when assayed by the in vitro systems. The regulatory step, therefore, was ascribed to DNA templates. The analysis of modified base composition revealed the presence of methylated bases in chromoplast DNA, in which 5-methylcytosine was most abundant. The presence of 5-methylcytosine detected by isoschizomeric endonucleases and Southern hybridization was correlated with the undetectable transcription activity of each gene in the run-on assay and in vitro transcription experiments. It is thus concluded that the suppression of transcription mediated by DNA methylation is one of the mechanisms governing gene expression in plastids converting from chloroplasts to chromoplasts. PMID:2303026

  10. Complete Chloroplast Genome Sequence of Coptis chinensis Franch. and Its Evolutionary History

    Science.gov (United States)

    He, Yang; Deng, Cao; Fan, Gang; Qin, Shishang

    2017-01-01

    The Coptis chinensis Franch. is an important medicinal plant from the Ranunculales. We used next generation sequencing technology to determine the complete chloroplast genome of C. chinensis. This genome is 155,484 bp long with 38.17% GC content. Two 26,758 bp long inverted repeats separated the genome into a typical quadripartite structure. The C. chinensis chloroplast genome consists of 128 gene loci, including eight rRNA gene loci, 28 tRNA gene loci, and 92 protein-coding gene loci. Most of the SSRs in C. chinensis are poly-A/T. The numbers of mononucleotide SSRs in C. chinensis and other Ranunculaceae species are fewer than those in Berberidaceae species, while the number of dinucleotide SSRs is greater than that in the Berberidaceae. C. chinensis diverged from other Ranunculaceae species an estimated 81 million years ago (Mya). The divergence between Ranunculaceae and Berberidaceae was ~111 Mya, while the Ranunculales and Magnoliaceae shared a common ancestor during the Jurassic, ~153 Mya. Position 104 of the C. chinensis ndhG protein was identified as a positively selected site, indicating possible selection for the photosystem-chlororespiration system in C. chinensis. In summary, the complete sequencing and annotation of the C. chinensis chloroplast genome will facilitate future studies on this important medicinal species. PMID:28698879

  11. Complete Chloroplast Genome Sequence of Coptis chinensis Franch. and Its Evolutionary History

    Directory of Open Access Journals (Sweden)

    Yang He

    2017-01-01

    The Coptis chinensis Franch. is an important medicinal plant from the Ranunculales. We used next generation sequencing technology to determine the complete chloroplast genome of C. chinensis. This genome is 155,484 bp long with 38.17% GC content. Two 26,758 bp long inverted repeats separated the genome into a typical quadripartite structure. The C. chinensis chloroplast genome consists of 128 gene loci, including eight rRNA gene loci, 28 tRNA gene loci, and 92 protein-coding gene loci. Most of the SSRs in C. chinensis are poly-A/T. The numbers of mononucleotide SSRs in C. chinensis and other Ranunculaceae species are fewer than those in Berberidaceae species, while the number of dinucleotide SSRs is greater than that in the Berberidaceae. C. chinensis diverged from other Ranunculaceae species an estimated 81 million years ago (Mya). The divergence between Ranunculaceae and Berberidaceae was ~111 Mya, while the Ranunculales and Magnoliaceae shared a common ancestor during the Jurassic, ~153 Mya. Position 104 of the C. chinensis ndhG protein was identified as a positively selected site, indicating possible selection for the photosystem-chlororespiration system in C. chinensis. In summary, the complete sequencing and annotation of the C. chinensis chloroplast genome will facilitate future studies on this important medicinal species.

  12. Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids.

    Science.gov (United States)

    Jansen, Robert K; Kaittanis, Charalambos; Saski, Christopher; Lee, Seung-Bum; Tomkins, Jeffrey; Alverson, Andrew J; Daniell, Henry

    2006-04-09

    The Vitaceae (grape) is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade. However, maximum likelihood analyses place

  13. Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids

    Directory of Open Access Journals (Sweden)

    Alverson Andrew J

    2006-04-01

    Background: The Vitaceae (grape) is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. Results: The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade.

  14. Species identification of medicinal pteridophytes by a DNA barcode marker, the chloroplast psbA-trnH intergenic region.

    Science.gov (United States)

    Ma, Xin-Ye; Xie, Cai-Xiang; Liu, Chang; Song, Jing-Yuan; Yao, Hui; Luo, Kun; Zhu, Ying-Jie; Gao, Ting; Pang, Xiao-Hui; Qian, Jun; Chen, Shi-Lin

    2010-01-01

    Medicinal pteridophytes are an important group used in traditional Chinese medicine; however, there is no simple and universal way to differentiate various species of this group by morphological traits. A novel technology termed "DNA barcoding" could discriminate species by a standard DNA sequence with universal primers and sufficient variation. To determine whether DNA barcoding would be effective for differentiating pteridophyte species, we first analyzed five DNA sequence markers (psbA-trnH intergenic region, rbcL, rpoB, rpoC1, and matK) using six chloroplast genomic sequences from GenBank and found the psbA-trnH intergenic region to be the best candidate for availability of universal primers. Next, we amplified the psbA-trnH region from 79 samples of medicinal pteridophyte plants. These samples represented 51 species from 24 families, including all the authentic pteridophyte species listed in the Chinese pharmacopoeia (2005 version) and some commonly used adulterants. We found that the sequence of the psbA-trnH intergenic region can be determined with both high polymerase chain reaction (PCR) amplification efficiency (94.1%) and high direct sequencing success rate (81.3%). Combined with GenBank data (54 species across 12 pteridophyte families), species discriminative power analysis showed that 90.2% of species could be separated/identified successfully by the TaxonGap method in conjunction with the Basic Local Alignment Search Tool 1 (BLAST1) method. The TaxonGap method results further showed that, for 37 out of 39 separable species with at least two samples each, between-species variation was higher than the relevant within-species variation. Thus, the psbA-trnH intergenic region is a suitable DNA marker for species identification in medicinal pteridophytes.

  15. Chloroplast DNA variation in European white oaks: phylogeography and patterns of diversity based on data from over 2600 populations

    NARCIS (Netherlands)

    Petit, R.J.; Csaikl, U.M.; Bordács, S.; Burg, K.; Coart, E.; Cottrell, J.; Dam, van B.C.; Deans, J.D.; Dumolin-Lapègue, S.; Fineschi, S.; Finkeldey, R.; Gillies, A.; Glaz, I.; Goicoechea, P.G.; Jensen, J.S.; König, A.O.; Lowe, A.J.; Madsen, S.F.; Mátyás, G.; Munro, R.C.; Olalde, M.; Pemonge, M.H.; Popescu, F.; Slade, D.; Tabbener, H.; Taurchini, D.; Vries, de S.G.M.; Ziegenhagen, B.; Kremer, A.

    2002-01-01

    A consortium of 16 laboratories have studied chloroplast DNA (cpDNA) variation in European white oaks. A common strategy for molecular screening, based on restriction analysis of four PCR-amplified cpDNA fragments, was used to allow comparison among the different laboratories. A total of 2613 oak

  16. Complete chloroplast genome sequence of a major economic species, Ziziphus jujuba (Rhamnaceae).

    Science.gov (United States)

    Ma, Qiuyue; Li, Shuxian; Bi, Changwei; Hao, Zhaodong; Sun, Congrui; Ye, Ning

    2017-02-01

    Ziziphus jujuba is an important woody plant with high economic and medicinal value. Here, we analyzed and characterized the complete chloroplast (cp) genome of Z. jujuba, the first member of the Rhamnaceae family for which the chloroplast genome sequence has been reported. We also built a web browser for navigating the cp genome of Z. jujuba ( http://bio.njfu.edu.cn/gb2/gbrowse/Ziziphus_jujuba_cp/ ). Sequence analysis showed that this cp genome is 161,466 bp long and has a typical quadripartite structure of large (LSC, 89,120 bp) and small (SSC, 19,348 bp) single-copy regions separated by a pair of inverted repeats (IRs, 26,499 bp). The sequence contained 112 unique genes, including 78 protein-coding genes, 30 transfer RNAs, and four ribosomal RNAs. The genome structure, gene order, GC content, and codon usage are similar to other typical angiosperm cp genomes. A total of 38 tandem repeats, two forward repeats, and three palindromic repeats were detected in the Z. jujuba cp genome. Simple sequence repeat (SSR) analysis revealed that most SSRs were AT-rich. The homopolymer regions in the cp genome of Z. jujuba were verified and manually corrected by Sanger sequencing. One-third of mononucleotide repeats were found to be erroneously sequenced by the 454 pyrosequencing, which resulted in sequences of 1-4 bases shorter than that by the Sanger sequencing. Analyzing the cp genome of Z. jujuba revealed that the IR contraction and expansion events resulted in ycf1 and rps19 pseudogenes. A phylogenetic analysis based on 64 protein-coding genes showed that Z. jujuba was closely related to members of the Elaeagnaceae family, which will be helpful for phylogenetic studies of other Rosales species. The complete cp genome sequence of Z. jujuba will facilitate population, phylogenetic, and cp genetic engineering studies of this economic plant.
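
    The region sizes quoted in this abstract can be cross-checked with simple arithmetic: in a quadripartite plastome the total length is the LSC plus the SSC plus two copies of the IR. The short sketch below is only an illustrative check; the function name is hypothetical and not taken from the paper.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length (bp) of a plastome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported in the Z. jujuba abstract above.
total = quadripartite_length(lsc=89_120, ssc=19_348, ir=26_499)
print(total)  # 161466, matching the reported genome length
```

    The same check applied to the Vitis vinifera record (89,147 + 19,065 + 2 x 26,358) gives 160,928 bp, which also agrees with the reported total.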

  17. Sequence analysis of Leukemia DNA

    Science.gov (United States)

    Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad Isa

    2018-03-01

    Cancer is a deadly disease; one form is leukemia, better known as blood cancer. Cancer cells can be detected by analyzing DNA in a laboratory test. This study focused on local alignment of leukemia and non-leukemia DNA sequences obtained from NCBI, using the Smith-Waterman algorithm, which was introduced by T. F. Smith and M. S. Waterman in 1981. The algorithm seeks the greatest possible similarity between a pair of sequences by assigning a negative value to unequal base pairs (mismatches) and a positive value to identical base pairs (matches). The maximum positive value marks the end of the alignment, and the minimum value marks its start. This study will use sequences of leukemia and 3 sequences of non-leukemia.
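
    As a rough illustration of the Smith-Waterman local alignment described in this abstract, here is a minimal sketch. The scoring values (match +2, mismatch -1, linear gap -2) and the function name are assumptions chosen for the example; the study's own parameters are not given in the abstract.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for the best local alignment.

    Scoring values are illustrative assumptions, not the study's settings.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # score matrix, first row/column stay 0
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below zero: this is what makes the alignment local.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the maximum cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = smith_waterman("TTTACGTAAA", "GGACGTCC")
print(score, top, bottom)
```

    The traceback starts at the highest-scoring cell and stops at the first zero, which corresponds to the abstract's description of the maximum value marking the end of the alignment.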

  18. DNA Sequencing by Capillary Electrophoresis

    Science.gov (United States)

    Karger, Barry L.; Guttman, Andras

    2009-01-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA sequencing methods have evolved from the labor intensive slab gel electrophoresis, through automated multicapillary electrophoresis systems using fluorophore labeling with multispectral imaging, to the “next generation” technologies of cyclic array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes was only possible by the advent of modern sequencing technologies that was a result of step by step advances with a contribution of academics, medical personnel and instrument companies. While next generation sequencing is moving ahead at break-neck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this prospective, we wish to overview the role of capillary electrophoresis in DNA sequencing based in part on several of our articles in this journal. PMID:19517496

  19. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance, and even in biology. There has been an increasing interest in unravelling the mysteries of DNA; for example, how can we distinguish coding and noncoding sequences, and the problems of classification and evolution relationship of organisms are key problems in bioinformatics. Although much research has been carried out by taking into consideration the long-range correlations in DNA sequences, and the global fractal dimension has been used in these works by other people, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view) to DNA sequence analysis. We have also used fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

  20. PineElm_SSRdb: a microsatellite marker database identified from genomic, chloroplast, mitochondrial and EST sequences of pineapple (Ananas comosus (L.) Merrill).

    Science.gov (United States)

    Chaudhary, Sakshi; Mishra, Bharat Kumar; Vivek, Thiruvettai; Magadum, Santoshkumar; Yasin, Jeshima Khan

    2016-01-01

    Simple Sequence Repeats or microsatellites are resourceful molecular genetic markers. There are only few reports of SSR identification and development in pineapple. Complete genome sequence of pineapple available in the public domain can be used to develop numerous novel SSRs. Therefore, an attempt was made to identify SSRs from genomic, chloroplast, mitochondrial and EST sequences of pineapple which will help in deciphering genetic makeup of its germplasm resources. A total of 359511 SSRs were identified in pineapple (356385 from genome sequence, 45 from chloroplast sequence, 249 in mitochondrial sequence and 2832 from EST sequences). The list of EST-SSR markers and their details are available in the database. PineElm_SSRdb is an open source database available for non-commercial academic purpose at http://app.bioelm.com/ with a mapping tool which can develop circular maps of selected marker set. This database will be of immense use to breeders, researchers and graduates working on Ananas spp. and to others working on cross-species transferability of markers, investigating diversity, mapping and DNA fingerprinting.
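
    To make the idea of SSR identification concrete, the sketch below scans a sequence for perfect mono-, di- and trinucleotide repeats with a regular expression. The minimum repeat counts and the function name are assumptions for illustration only; the abstract does not state which tool or thresholds were used to build PineElm_SSRdb.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative only).
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least n-1 further copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip homopolymer runs reported as a longer motif (e.g. "AA").
            if unit_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((m.start(), motif, len(m.group(1)) // unit_len))
    return sorted(hits)


example = "GGC" + "A" * 12 + "CCG" + "AT" * 7 + "GGC"
print(find_ssrs(example))
```

    This simple scan reports only perfect repeats; dedicated SSR tools also handle compound and interrupted repeats, which this sketch does not attempt.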

  1. Chloroplast DNA codon use: evidence for selection at the psbA locus based on tRNA availability.

    Science.gov (United States)

    Morton, B R

    1993-09-01

    Codon use in the three sequenced chloroplast genomes (Marchantia, Oryza, and Nicotiana) is examined. The chloroplast has a bias in that codons NNA and NNT are favored over synonymous NNC and NNG codons. This appears to be a consequence of an overall high A + T content of the genome. This pattern of codon use is not followed by the psbA gene of all three genomes and other psbA sequences examined. In this gene, the codon use favors NNC over NNT for twofold degenerate amino acids. In each case the only tRNA coded by the genome is complementary to the NNC codon. This codon use is similar to the codon use by chloroplast genes examined from Chlamydomonas reinhardtii. Since psbA is the major translation product of the chloroplast, this suggests that selection is acting on the codon use of this gene to adapt codons to tRNA availability, as previously suggested for unicellular organisms.
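
    The NNC versus NNT comparison described in this abstract can be sketched as a simple codon count over a coding sequence. The example below assumes the standard genetic code, in which Phe, Tyr, His, Asn, Asp and Cys are the twofold degenerate amino acids with pyrimidine-ending codons; the function name and the toy sequence are hypothetical and are not data from the paper.

```python
from collections import Counter

# First two bases of the twofold degenerate NNY codon families (standard code).
TWOFOLD_NNY = {"TT": "Phe", "TA": "Tyr", "CA": "His",
               "AA": "Asn", "GA": "Asp", "TG": "Cys"}


def nnc_vs_nnt(cds):
    """Count (NNC, NNT) codons for twofold degenerate amino acids in a CDS."""
    cds = cds.upper()
    counts = Counter()
    # Walk the sequence codon by codon, ignoring any trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon[:2] in TWOFOLD_NNY and codon[2] in "CT":
            counts[codon[2]] += 1
    return counts["C"], counts["T"]


# Toy reading frame: ATG TTC TTT GAC TAC TAA
print(nnc_vs_nnt("ATGTTCTTTGACTACTAA"))
```

    A gene that follows the general chloroplast bias would show more NNT than NNC codons in this tally, whereas the abstract reports the opposite pattern for psbA.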

  2. Sequencing and annotation of the chloroplast DNAs and identification of polymorphisms distinguishing normal male-fertile and male-sterile cytoplasms of onion.

    Science.gov (United States)

    von Kohn, Christopher; Kiełkowska, Agnieszka; Havey, Michael J

    2013-12-01

    Male-sterile (S) cytoplasm of onion is an alien cytoplasm introgressed into onion in antiquity and is widely used for hybrid seed production. Owing to the biennial generation time of onion, classical crossing takes at least 4 years to classify cytoplasms as S or normal (N) male-fertile. Molecular markers in the organellar DNAs that distinguish N and S cytoplasms are useful to reduce the time required to classify onion cytoplasms. In this research, we completed next-generation sequencing of the chloroplast DNAs of N- and S-cytoplasmic onions; we assembled and annotated the genomes in addition to identifying polymorphisms that distinguish these cytoplasms. The sizes (153 538 and 153 355 base pairs) and GC contents (36.8%) were very similar for the chloroplast DNAs of N and S cytoplasms, respectively, as expected given their close phylogenetic relationship. The size difference was primarily due to small indels in intergenic regions and a deletion in the accD gene of N-cytoplasmic onion. The structures of the onion chloroplast DNAs were similar to those of most land plants with large and small single copy regions separated by inverted repeats. Twenty-eight single nucleotide polymorphisms, two polymorphic restriction-enzyme sites, and one indel distributed across 20 chloroplast genes in the large and small single copy regions were selected and validated using diverse onion populations previously classified as N or S cytoplasmic using restriction fragment length polymorphisms. Although cytoplasmic male sterility is likely associated with the mitochondrial DNA, maternal transmission of the mitochondrial and chloroplast DNAs allows for polymorphisms in either genome to be useful for classifying onion cytoplasms to aid the development of hybrid onion cultivars.

  3. Finding a (pine) needle in a haystack: chloroplast genome sequence divergence in rare and widespread pines

    Science.gov (United States)

    J.B. Whittall; J. Syring; M. Parks; J. Buenrostro; C. Dick; A. Liston; R. Cronn

    2010-01-01

    Critical to conservation efforts and other investigations at low taxonomic levels, DNA sequence data offer important insights into the distinctiveness, biogeographic partitioning, and evolutionary histories of species. The resolving power of DNA sequences is often limited by insufficient variability at the intraspecific level. This is particularly true of studies...

  4. Strong Accumulation of Chloroplast DNA in the Y Chromosomes of Rumex acetosa and Silene latifolia

    Czech Academy of Sciences Publication Activity Database

    Šteflová, Pavlína; Hobza, Roman; Vyskot, Boris; Kejnovský, Eduard

    2014-01-01

    Vol. 142, No. 1 (2014), pp. 59-65 ISSN 1424-8581 R&D Projects: GA ČR(CZ) GAP305/10/0930; GA ČR(CZ) GAP501/10/0102; GA ČR(CZ) GBP501/12/G090; GA ČR GAP501/12/2220; GA ČR(CZ) GA522/09/0083; GA MŠk(CZ) LO1204 Institutional research plan: CEZ:AV0Z50040702 Institutional support: RVO:68081707 Keywords: Chloroplast DNA * Rumex acetosa * Sex chromosomes Subject RIV: BO - Biophysics Impact factor: 1.561, year: 2014

  5. Fast and secure retrieval of DNA sequences

    NARCIS (Netherlands)

    2014-01-01

    Sequence models are retrieved from a sequences index. The sequence models model DNA or RNA sequences stored in a database, and each comprises a finite memory tree source model and parameters for the finite memory tree source model. One or more DNA or RNA sequences stored in the database are

  6. The complete chloroplast genome sequence of Podocarpus lambertii: genome structure, evolutionary aspects, gene content and SSR detection.

    Directory of Open Access Journals (Sweden)

    Leila do Nascimento Vieira

    BACKGROUND: Podocarpus lambertii (Podocarpaceae) is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp) genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. METHODOLOGY/PRINCIPAL FINDINGS: The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR). It contains 118 unique genes and one duplicated tRNA (trnN-GUU), which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi) and Araucariaceae (Agathis dammara). Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. CONCLUSION: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparent loss of the rps16 gene in Podocarpaceae cp genomes. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of

  7. Entropic fluctuations in DNA sequences

    Science.gov (United States)

    Thanos, Dimitrios; Li, Wentian; Provata, Astero

    2018-03-01

    The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that despite this reduction of information, LSE allows to extract meaningful information related to the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
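
    A block-wise Shannon entropy of the kind described in this abstract can be sketched as follows. This version assumes single-base frequencies in non-overlapping blocks, which may differ from the authors' exact definition of LSE; the function names are hypothetical.

```python
import math
from collections import Counter


def block_entropy(block):
    """Shannon entropy (bits) of the base composition of one block."""
    n = len(block)
    counts = Counter(block)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def local_shannon_entropy(seq, block_size):
    """Entropy of each non-overlapping block; a trailing partial block is dropped."""
    return [block_entropy(seq[i:i + block_size])
            for i in range(0, len(seq) - block_size + 1, block_size)]


profile = local_shannon_entropy("AAAAACGT", block_size=4)
print(profile)
```

    A homopolymer block scores zero bits and a block with all four bases in equal proportion scores two bits, so long stretches of low, nearly constant values would be consistent with the repeat-rich regions the abstract mentions.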

  8. A comparison of chloroplast genome sequences in Aconitum (Ranunculaceae): a traditional herbal medicinal genus

    Directory of Open Access Journals (Sweden)

    Hanghui Kong

    2017-11-01

    The herbal medicinal genus Aconitum L., belonging to the Ranunculaceae family, represents the earliest diverging lineage within the eudicots. It currently comprises two subgenera, A. subgenus Lycoctonum and A. subg. Aconitum. The complete chloroplast (cp) genome sequences were characterized in three species: A. angustius, A. finetianum, and A. sinomontanum in subg. Lycoctonum and compared to other Aconitum species to clarify their phylogenetic relationship and provide molecular information for utilization of Aconitum species particularly in Eastern Asia. The lengths of the chloroplast genome sequences were 156,109 bp in A. angustius, 155,625 bp in A. finetianum and 157,215 bp in A. sinomontanum, with each species possessing 126 genes with 84 protein coding genes (PCGs). While genomic rearrangements were absent, structural variation was detected in the LSC/IR/SSC boundaries. Five pseudogenes were identified, among which Ψrps19 and Ψycf1 were in the LSC/IR/SSC boundaries, Ψrps16 and ΨinfA in the LSC region, and Ψycf15 in the IRb region. The nucleotide variability (Pi) of Aconitum was estimated to be 0.00549, with comparably higher variations in the LSC and SSC than the IR regions. Eight intergenic regions were revealed to be highly variable and a total of 58–62 simple sequence repeats (SSRs) were detected in all three species. More than 80% of SSRs were present in the LSC region. Altogether, 64.41% and 46.81% of SSRs are mononucleotides in subg. Lycoctonum and subg. Aconitum, respectively, while a higher percentage of di-, tri-, tetra-, and penta- SSRs were present in subg. Aconitum. Most species of subg. Aconitum in Eastern Asia were first used for phylogenetic analyses. The availability of the complete cp genome sequences of these species in subg. Lycoctonum will benefit future phylogenetic analyses and aid in germplasm utilization in Aconitum species.

  9. A comparison of chloroplast genome sequences in Aconitum (Ranunculaceae): a traditional herbal medicinal genus.

    Science.gov (United States)

    Kong, Hanghui; Liu, Wanzhen; Yao, Gang; Gong, Wei

    2017-01-01

    The herbal medicinal genus Aconitum L., belonging to the Ranunculaceae family, represents the earliest diverging lineage within the eudicots. It currently comprises two subgenera, A. subgenus Lycoctonum and A. subg. Aconitum. The complete chloroplast (cp) genome sequences were characterized in three species: A. angustius, A. finetianum, and A. sinomontanum in subg. Lycoctonum and compared to other Aconitum species to clarify their phylogenetic relationship and provide molecular information for utilization of Aconitum species particularly in Eastern Asia. The lengths of the chloroplast genome sequences were 156,109 bp in A. angustius, 155,625 bp in A. finetianum and 157,215 bp in A. sinomontanum, with each species possessing 126 genes with 84 protein coding genes (PCGs). While genomic rearrangements were absent, structural variation was detected in the LSC/IR/SSC boundaries. Five pseudogenes were identified, among which Ψrps19 and Ψycf1 were in the LSC/IR/SSC boundaries, Ψrps16 and ΨinfA in the LSC region, and Ψycf15 in the IRb region. The nucleotide variability (Pi) of Aconitum was estimated to be 0.00549, with comparably higher variations in the LSC and SSC than the IR regions. Eight intergenic regions were revealed to be highly variable and a total of 58-62 simple sequence repeats (SSRs) were detected in all three species. More than 80% of SSRs were present in the LSC region. Altogether, 64.41% and 46.81% of SSRs are mononucleotides in subg. Lycoctonum and subg. Aconitum, respectively, while a higher percentage of di-, tri-, tetra-, and penta- SSRs were present in subg. Aconitum. Most species of subg. Aconitum in Eastern Asia were first used for phylogenetic analyses. The availability of the complete cp genome sequences of these species in subg. Lycoctonum will benefit future phylogenetic analyses and aid in germplasm utilization in Aconitum species.

  10. DNA Replication Profiling Using Deep Sequencing.

    Science.gov (United States)

    Saayman, Xanita; Ramos-Pérez, Cristina; Brown, Grant W

    2018-01-01

    Profiling of DNA replication during progression through S phase allows a quantitative snap-shot of replication origin usage and DNA replication fork progression. We present a method for using deep sequencing data to profile DNA replication in S. cerevisiae.

  11. The complete chloroplast genome sequence of an endemic monotypic genus Hagenia (Rosaceae): structural comparative analysis, gene content and microsatellite detection

    Directory of Open Access Journals (Sweden)

    Andrew W. Gichira

    2017-01-01

    Full Text Available Hagenia is an endangered monotypic genus endemic to the tropical mountains of Africa. The only species, Hagenia abyssinica (Bruce) J.F. Gmel, is an important medicinal plant producing bioactive compounds that have been traditionally used by African communities as a remedy for gastrointestinal ailments in both humans and animals. Complete chloroplast genomes have been applied in resolving phylogenetic relationships within plant families. We employed high-throughput sequencing technologies to determine the complete chloroplast genome sequence of H. abyssinica. The genome is a circular molecule of 154,961 base pairs (bp), with a pair of Inverted Repeats (IR) of 25,971 bp each, separated by two single-copy regions: a large (LSC, 84,320 bp) and a small single copy (SSC, 18,696 bp). H. abyssinica's chloroplast genome has a 37.1% GC content and encodes 112 unique genes, 78 of which code for proteins, 30 are tRNA genes and four are rRNA genes. A comparative analysis with twenty other species, sequenced to date from the family Rosaceae, revealed similarities in structural organization, gene content and arrangement. The observed size differences are attributed to the contraction/expansion of the inverted repeats. The translational initiation factor gene (infA), which had been previously reported in other chloroplast genomes, was conspicuously missing in H. abyssinica. A total of 172 microsatellites and 49 large repeat sequences were detected in the chloroplast genome. A maximum likelihood analysis of 71 protein-coding genes placed Hagenia in Rosoideae. The availability of a complete chloroplast genome, the first in the Sanguisorbeae tribe, is beneficial for further molecular studies on taxonomic and phylogenomic resolution within the Rosaceae family.

  12. The complete chloroplast genome sequence of an endemic monotypic genus Hagenia (Rosaceae): structural comparative analysis, gene content and microsatellite detection.

    Science.gov (United States)

    Gichira, Andrew W; Li, Zhizhong; Saina, Josphat K; Long, Zhicheng; Hu, Guangwan; Gituru, Robert W; Wang, Qingfeng; Chen, Jinming

    2017-01-01

    Hagenia is an endangered monotypic genus endemic to the tropical mountains of Africa. The only species, Hagenia abyssinica (Bruce) J.F. Gmel, is an important medicinal plant producing bioactive compounds that have been traditionally used by African communities as a remedy for gastrointestinal ailments in both humans and animals. Complete chloroplast genomes have been applied in resolving phylogenetic relationships within plant families. We employed high-throughput sequencing technologies to determine the complete chloroplast genome sequence of H. abyssinica. The genome is a circular molecule of 154,961 base pairs (bp), with a pair of Inverted Repeats (IR) of 25,971 bp each, separated by two single-copy regions: a large (LSC, 84,320 bp) and a small single copy (SSC, 18,696 bp). H. abyssinica's chloroplast genome has a 37.1% GC content and encodes 112 unique genes, 78 of which code for proteins, 30 are tRNA genes and four are rRNA genes. A comparative analysis with twenty other species, sequenced to date from the family Rosaceae, revealed similarities in structural organization, gene content and arrangement. The observed size differences are attributed to the contraction/expansion of the inverted repeats. The translational initiation factor gene (infA), which had been previously reported in other chloroplast genomes, was conspicuously missing in H. abyssinica. A total of 172 microsatellites and 49 large repeat sequences were detected in the chloroplast genome. A maximum likelihood analysis of 71 protein-coding genes placed Hagenia in Rosoideae. The availability of a complete chloroplast genome, the first in the Sanguisorbeae tribe, is beneficial for further molecular studies on taxonomic and phylogenomic resolution within the Rosaceae family.

  13. The Complete Chloroplast Genome Sequences of Five Epimedium Species: Lights into Phylogenetic and Taxonomic Analyses

    Science.gov (United States)

    Zhang, Yanjun; Du, Liuwen; Liu, Ao; Chen, Jianjun; Wu, Li; Hu, Weiming; Zhang, Wei; Kim, Kyunghee; Lee, Sang-Choon; Yang, Tae-Jin; Wang, Ying

    2016-01-01

    Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. Here we sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology and a combination of de novo and reference-guided assembly. Combined with the previously reported cp genome sequence of E. koreanum, this is the first comprehensive cp genome analysis of Epimedium. The five Epimedium cp genomes exhibited a typical quadripartite, circular structure that was rather conserved in genomic structure and gene-order synteny. However, these cp genomes showed obvious variation at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. The trnQ-UUG duplication occurred in the five Epimedium cp genomes and was not found in the other basal eudicotyledons. Rapidly evolving cp genome regions were detected among the five cp genomes, and differences in simple sequence repeats (SSRs) and repeat sequences were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes were, on the whole, in accordance with the updated system of the genus, but indicated that the evolutionary relationships and divisions of the genus need further investigation with more evidence. The availability of these cp genomes provides valuable genetic information for accurate species identification, taxonomy, phylogenetic resolution and evolutionary studies of Epimedium, and will assist in the exploration and utilization of Epimedium plants. PMID:27014326

  14. The complete chloroplast genome sequences of five Epimedium species: lights into phylogenetic and taxonomic analyses

    Directory of Open Access Journals (Sweden)

    Yanjun eZhang

    2016-03-01

    Full Text Available Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. Here we sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology and a combination of de novo and reference-guided assembly. Combined with the previously reported cp genome sequence of E. koreanum, this is the first comprehensive cp genome analysis of Epimedium. The five Epimedium cp genomes exhibited a typical quadripartite, circular structure that was rather conserved in genomic structure and gene-order synteny. However, these cp genomes showed obvious variation at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. The trnQ-UUG duplication occurred in the five Epimedium cp genomes and was not found in the other basal eudicotyledons. Rapidly evolving cp genome regions were detected among the five cp genomes, and differences in simple sequence repeats (SSRs) and repeat sequences were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes were, on the whole, in accordance with the updated system of the genus, but indicated that the evolutionary relationships and divisions of the genus need further investigation with more evidence. The availability of these cp genomes provides valuable genetic information for accurate species identification, taxonomy, phylogenetic resolution and evolutionary studies of Epimedium, and will assist in the exploration and utilization of Epimedium plants.

  15. The complete chloroplast genome sequences of Lychnis wilfordii and Silene capitata and comparative analyses with other Caryophyllaceae genomes.

    Science.gov (United States)

    Kang, Jong-Soo; Lee, Byoung Yoon; Kwak, Myounghai

    2017-01-01

    The complete chloroplast genomes of Lychnis wilfordii and Silene capitata were determined and compared with ten previously reported Caryophyllaceae chloroplast genomes. The chloroplast genome sequences of L. wilfordii and S. capitata contain 152,320 bp and 150,224 bp, respectively. The gene contents and orders among 12 Caryophyllaceae species are consistent, but several microstructural changes have occurred. Expansion of the inverted repeat (IR) regions at the large single copy (LSC)/IRb and small single copy (SSC)/IR boundaries led to partial or entire gene duplications. Additionally, rearrangements of the LSC region were caused by gene inversions and/or transpositions. The 18 kb inversions, which occurred three times in different lineages of tribe Sileneae, were thought to be facilitated by the intermolecular duplicated sequences. Sequence analyses of the L. wilfordii and S. capitata genomes revealed 39 and 43 repeats, respectively, including forward, palindromic, and reverse repeats. In addition, a total of 67 and 56 simple sequence repeats were discovered in the L. wilfordii and S. capitata chloroplast genomes, respectively. Finally, we constructed phylogenetic trees of the 12 Caryophyllaceae species and two Amaranthaceae species based on 73 protein-coding genes using both maximum parsimony and likelihood methods.

  16. The complete chloroplast genome sequence of Maddenia hypoleuca Koehne (Prunoideae, Rosaceae).

    Science.gov (United States)

    Chen, Tao; Zhang, Jing; Liu, Yin; Wang, Hao; Wang, Juan; Chen, Qing; Tang, Hao-Ru; Wang, Xiao-Rong

    2016-11-01

    Maddenia hypoleuca Koehne, belonging to the family Rosaceae, is a species native to China. The complete chloroplast (cp) genome was generated by de novo assembly using low-coverage whole-genome sequencing data and manual correction. The cp genome was 158 084 bp in length, with a GC content of 36.63%. It exhibited a typical quadripartite structure: a pair of large inverted repeat regions (IRs, 26 246 bp each), a large single-copy region (LSC, 86 713 bp), and a small single-copy region (SSC, 18 879 bp). A total of 114 genes were predicted, which included 80 protein-coding genes, 30 tRNA genes, and four rRNA genes. Phylogenetic analysis indicated that M. hypoleuca is most closely related to Prunus padus within the Prunoideae subfamily, which conforms to the traditional classification.
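
    In a quadripartite chloroplast genome the total length should equal the LSC plus the SSC plus two copies of the IR. The sketch below applies that arithmetic to the region sizes reported for M. hypoleuca as a consistency check. The function and variable names are illustrative and are not taken from any published tool.

```python
def quadripartite_length(lsc_bp, ssc_bp, ir_bp):
    """Total genome length implied by one LSC, one SSC and two IR copies."""
    return lsc_bp + ssc_bp + 2 * ir_bp


# Region sizes reported in the Maddenia hypoleuca abstract (bp).
maddenia_lsc = 86_713
maddenia_ssc = 18_879
maddenia_ir = 26_246
maddenia_reported_total = 158_084

maddenia_implied_total = quadripartite_length(maddenia_lsc, maddenia_ssc, maddenia_ir)
maddenia_consistent = maddenia_implied_total == maddenia_reported_total
```

    The same check can be run on any record in this list that reports all four region sizes. A mismatch of a few base pairs usually points to rounding or a transcription slip in the abstract rather than a problem with the assembly.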

  17. High-Throughput DNA sequencing of ancient wood.

    Science.gov (United States)

    Wagner, Stefanie; Lagane, Frédéric; Seguin-Orlando, Andaine; Schubert, Mikkel; Leroy, Thibault; Guichoux, Erwan; Chancerel, Emilie; Bech-Hebelstrup, Inger; Bernard, Vincent; Billard, Cyrille; Billaud, Yves; Bolliger, Matthias; Croutsch, Christophe; Čufar, Katarina; Eynaud, Frédérique; Heussner, Karl Uwe; Köninger, Joachim; Langenegger, Fabien; Leroy, Frédéric; Lima, Christine; Martinelli, Nicoletta; Momber, Garry; Billamboz, André; Nelle, Oliver; Palomo, Antoni; Piqué, Raquel; Ramstein, Marianne; Schweichel, Roswitha; Stäuble, Harald; Tegel, Willy; Terradas, Xavier; Verdin, Florence; Plomion, Christophe; Kremer, Antoine; Orlando, Ludovic

    2018-03-01

    Reconstructing the colonization and demographic dynamics that gave rise to extant forests is essential to forecasts of forest responses to environmental changes. Classical approaches to map how tree populations changed through space and time largely rely on pollen distribution patterns, with only a limited number of studies exploiting DNA molecules preserved in archaeological and subfossil wood remains. Here, we advance such analyses by applying high-throughput DNA sequencing (HTS) to archaeological and subfossil wood material for the first time, using a comprehensive sample of 167 European white oak waterlogged remains spanning a large temporal (from 550 to 9,800 years) and geographical range across Europe. The successful characterization of the endogenous DNA and exogenous microbial DNA of 140 (~83%) samples helped identify environmental conditions favouring long-term DNA preservation in wood remains, and started to unveil the first trends in the DNA decay process in wood material. Additionally, the maternally inherited chloroplast haplotypes of 21 samples from three periods of human-induced forest use (Neolithic, Bronze Age and Middle Ages) were found to be consistent with those of modern populations growing in the same geographic areas. Our work paves the way for further studies aiming at using ancient DNA preserved in wood to reconstruct the micro-evolutionary response of trees to climate change and human forest management. © 2018 John Wiley & Sons Ltd.

  18. Norgal: extraction and de novo assembly of mitochondrial DNA from whole-genome sequencing data.

    Science.gov (United States)

    Al-Nakeeb, Kosai; Petersen, Thomas Nordahl; Sicheritz-Pontén, Thomas

    2017-11-21

    Whole-genome sequencing (WGS) projects provide short-read nucleotide sequences from nuclear and possibly organelle DNA, depending on the source of origin. Mitochondrial DNA is present in animals and fungi, while plants contain DNA from both mitochondria and chloroplasts. Current techniques for separating organelle reads from nuclear reads in WGS data require full reference or partial seed sequences for assembly. Norgal (de Novo ORGAneLle extractor) avoids this requirement by identifying a high-frequency subset of k-mers that are predominantly of mitochondrial origin and performing a de novo assembly on the subset of reads that contain these k-mers. The method was applied to WGS data from a panda, brown algae seaweed, butterfly and filamentous fungus. We were able to extract full circular mitochondrial genomes and obtained sequence identities to the reference sequences in the range from 98.5 to 99.5%. We also assembled the chloroplasts of grape vines and cucumbers using Norgal together with seed-based de novo assemblers. Norgal is a pipeline that can extract and assemble full or partial mitochondrial and chloroplast genomes from WGS short reads without prior knowledge. The program is available at: https://bitbucket.org/kosaidtu/norgal.
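
    The idea of selecting reads by high-frequency k-mers can be illustrated in a few lines. Organelle genomes are present in many copies per cell, so their k-mers tend to appear far more often in WGS reads than nuclear k-mers. This sketch is not Norgal's implementation: the function names, the fixed count threshold and the toy reads are all assumptions made for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of a sequence."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def select_high_frequency_reads(reads, k, min_count):
    """Keep reads containing at least one k-mer seen min_count or more times.

    min_count is a hypothetical fixed cutoff. A real pipeline would derive
    the cutoff from the observed k-mer depth distribution.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    frequent = {km for km, n in counts.items() if n >= min_count}
    return [r for r in reads if any(km in frequent for km in kmers(r, k))]


# Toy reads (hypothetical): the first two share the 4-mer AAAA.
toy_reads = ["AAAAC", "AAAAG", "CGTCG"]
toy_selected = select_high_frequency_reads(toy_reads, k=4, min_count=2)
```

    The selected reads would then be passed to a de novo assembler, which is the step the abstract describes next.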

  19. Origin and diversification of Hibiscus glaber, species endemic to the oceanic Bonin Islands, revealed by chloroplast DNA polymorphism.

    Science.gov (United States)

    Takayama, Koji; Ohi-Toma, Tetsuo; Kudoh, Hiroshi; Kato, Hidetoshi

    2005-04-01

    Two woody Hibiscus species co-occur in the Bonin Islands of the northwestern Pacific Ocean: Hibiscus glaber Matsum. is endemic to the islands, and its putative ancestral species, Hibiscus tiliaceus L., is widely distributed in coastal areas of the tropics and subtropics. To infer isolating mechanisms that led to speciation of H. glaber and the processes that resulted in co-occurrence of the two closely related species on the Bonin Islands, we conducted molecular phylogenetic analyses on chloroplast DNA (cpDNA) sequences. Materials collected from a wide area of the Pacific and Indian Oceans were used, and two closely related species, Hibiscus hamabo Siebold & Zucc. and Hibiscus macrophyllus Roxb., were also included in the analyses. The constructed tree suggested that H. glaber has been derived from H. tiliaceus, and that most of the modern Bonin populations of H. tiliaceus did not share most recent ancestry with H. glaber. Geographic isolation appears to be the most important mechanism in the speciation of H. glaber. The co-occurrence of the two species can be attributed to multiple migrations of different lineages into the islands. While a wide and overlapping geographical distribution of haplotypes was found in H. tiliaceus, localized geographical distribution of haplotypes was detected in H. glaber. It is hypothesized that a shift to inland habitats may have affected the mode of seed dispersal from ocean currents to gravity and hence resulted in geographical structuring of H. glaber haplotypes.

  20. Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora

    Directory of Open Access Journals (Sweden)

    Maria Eguiluz

    2017-11-01

    Full Text Available Plinia trunciflora is a Brazilian native fruit tree from the Myrtaceae family, also known as jaboticaba. This species has great potential for fruit production. Due to the high content of essential oils in its leaves and of anthocyanins in its fruits, there is also increasing interest from the pharmaceutical industry. Nevertheless, there are few studies focusing on its molecular biology and genetic characterization. We herein report the complete chloroplast (cp) genome of P. trunciflora using high-throughput sequencing and compare it to other previously sequenced Myrtaceae genomes. The cp genome of P. trunciflora is 159,512 bp in size, comprising inverted repeats of 26,414 bp and single-copy regions of 88,097 bp (LSC) and 18,587 bp (SSC). The genome contains 111 single-copy genes (77 protein-coding, 30 tRNA and four rRNA genes). Phylogenetic analysis using 57 cp protein-coding genes demonstrated that P. trunciflora, Eugenia uniflora and Acca sellowiana form a cluster more closely related to Syzygium cumini than to Eucalyptus. The complete cp sequence reported here can be used in evolutionary and population genetics studies, helping to resolve the complex taxonomy of this species and fill the gap in genetic characterization.

  1. Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora

    Science.gov (United States)

    Eguiluz, Maria; Yuyama, Priscila Mary; Guzman, Frank; Rodrigues, Nureyev Ferreira; Margis, Rogerio

    2017-01-01

    Plinia trunciflora is a Brazilian native fruit tree from the Myrtaceae family, also known as jaboticaba. This species has great potential for fruit production. Due to the high content of essential oils in its leaves and of anthocyanins in its fruits, there is also increasing interest from the pharmaceutical industry. Nevertheless, there are few studies focusing on its molecular biology and genetic characterization. We herein report the complete chloroplast (cp) genome of P. trunciflora using high-throughput sequencing and compare it to other previously sequenced Myrtaceae genomes. The cp genome of P. trunciflora is 159,512 bp in size, comprising inverted repeats of 26,414 bp and single-copy regions of 88,097 bp (LSC) and 18,587 bp (SSC). The genome contains 111 single-copy genes (77 protein-coding, 30 tRNA and four rRNA genes). Phylogenetic analysis using 57 cp protein-coding genes demonstrated that P. trunciflora, Eugenia uniflora and Acca sellowiana form a cluster more closely related to Syzygium cumini than to Eucalyptus. The complete cp sequence reported here can be used in evolutionary and population genetics studies, helping to resolve the complex taxonomy of this species and fill the gap in genetic characterization. PMID:29111566

  2. Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora.

    Science.gov (United States)

    Eguiluz, Maria; Yuyama, Priscila Mary; Guzman, Frank; Rodrigues, Nureyev Ferreira; Margis, Rogerio

    2017-01-01

    Plinia trunciflora is a Brazilian native fruit tree from the Myrtaceae family, also known as jaboticaba. This species has great potential for fruit production. Due to the high content of essential oils in its leaves and of anthocyanins in its fruits, there is also increasing interest from the pharmaceutical industry. Nevertheless, there are few studies focusing on its molecular biology and genetic characterization. We herein report the complete chloroplast (cp) genome of P. trunciflora using high-throughput sequencing and compare it to other previously sequenced Myrtaceae genomes. The cp genome of P. trunciflora is 159,512 bp in size, comprising inverted repeats of 26,414 bp and single-copy regions of 88,097 bp (LSC) and 18,587 bp (SSC). The genome contains 111 single-copy genes (77 protein-coding, 30 tRNA and four rRNA genes). Phylogenetic analysis using 57 cp protein-coding genes demonstrated that P. trunciflora, Eugenia uniflora and Acca sellowiana form a cluster more closely related to Syzygium cumini than to Eucalyptus. The complete cp sequence reported here can be used in evolutionary and population genetics studies, helping to resolve the complex taxonomy of this species and fill the gap in genetic characterization.

  3. The chloroplast DNA locus psbZ-trnfM as a potential barcode marker in Phoenix L. (Arecaceae)

    Directory of Open Access Journals (Sweden)

    Marco Ballardini

    2013-12-01

    Full Text Available The genus Phoenix (Arecaceae) comprises 14 species distributed from the Cape Verde Islands to SE Asia. It includes the economically important species Phoenix dactylifera. The paucity of useful diagnostic morphological and anatomical characters, together with interspecific hybridization, makes identification of Phoenix species difficult. In this context, the development of reliable DNA markers for species and hybrid identification would be of great utility. Previous studies identified a 12 bp polymorphic chloroplast minisatellite in the trnG(GCC)-trnfM(CAU) spacer, and showed its potential for species identification in Phoenix. In this work, in order to develop an efficient DNA barcode marker for Phoenix, a longer cpDNA region (700 bp), comprising the mentioned minisatellite and located between the psbZ and trnfM(CAU) genes, was sequenced. One hundred and thirty-six individuals, representing all Phoenix species except P. andamanensis, were analysed. The minisatellite showed 2-7 repetitions of the 12 bp motif, with 1-3 out of seven haplotypes per species. Phoenix reclinata and P. canariensis had species-specific haplotypes. Additional polymorphisms were found in the flanking regions of the minisatellite, including substitutions, indels and homopolymers. All this information allowed us to identify unambiguously eight out of the 13 species, and overall 80% of the individuals sampled. Phoenix rupicola and P. theophrasti had the same haplotype, as did P. atlantica, P. dactylifera, and P. sylvestris (the “date palm complex” sensu Pintaud et al. 2013). For these species, additional molecular markers will be required for their unambiguous identification. The psbZ-trnfM(CAU) region could therefore be considered a good basis for the establishment of a DNA barcoding system in Phoenix, and is potentially useful for the identification of the female parent in Phoenix hybrids.

  4. DNA sequence modeling based on context trees

    NARCIS (Netherlands)

    Kusters, C.J.; Ignatenko, T.; Roland, J.; Horlin, F.

    2015-01-01

    Genomic sequences contain instructions for protein and cell production. Therefore understanding and identification of biologically and functionally meaningful patterns in DNA sequences is of paramount importance. Modeling of DNA sequences in its turn can help to better understand and identify such

  5. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Science.gov (United States)

    Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 simple sequence repeats were identified. Four hypervariable regions, namely, the intergenic regions trnQ_rps16, trnV_ndhC, and psbD_trnT, and the intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses, together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  6. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Directory of Open Access Journals (Sweden)

    Jianmin Fu

    Full Text Available Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 simple sequence repeats were identified. Four hypervariable regions, namely, the intergenic regions trnQ_rps16, trnV_ndhC, and psbD_trnT, and the intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses, together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  7. The complete chloroplast genome sequence of Ampelopsis: gene organization, comparative analysis and phylogenetic relationships to other angiosperms

    Directory of Open Access Journals (Sweden)

    Gurusamy eRaman

    2016-03-01

    Full Text Available Ampelopsis brevipedunculata is an economically important plant that belongs to the Vitaceae family of angiosperms. The phylogenetic placement of Vitaceae is still unresolved. Recent phylogenetic studies suggested that it should be placed in various alternative families including Caryophyllaceae, Asteraceae, Saxifragaceae, Dilleniaceae, or with the rest of the rosid families. However, these analyses provided weak supportive results because they were based on only one of several genes. Accordingly, complete chloroplast genome sequences are required to resolve the phylogenetic relationships among angiosperms. Recent phylogenetic analyses based on the complete chloroplast genome sequence suggested strong support for the position of Vitaceae as the earliest diverging lineage of rosids and placed it as a sister to the remaining rosids. These studies also revealed relationships among several major lineages of angiosperms; however, they highlighted the significance of taxon sampling for obtaining accurate phylogenies. In the present study, we sequenced the complete chloroplast genome of A. brevipedunculata and used these data to assess the relationships among 32 angiosperms, including 18 taxa of rosids. The Ampelopsis chloroplast genome is 161,090 bp in length, and includes a pair of inverted repeats of 26,394 bp that are separated by small and large single copy regions of 19,036 bp and 89,266 bp, respectively. The gene content and order of Ampelopsis is identical to many other unrearranged angiosperm chloroplast genomes, including Vitis and tobacco. A phylogenetic tree constructed based on 70 protein-coding genes of 33 angiosperms showed that both Saxifragales and Vitaceae diverged from the rosid clade and formed two clades with 100% bootstrap value. The position of the Vitaceae is sister to Saxifragales, and both are the basal and earliest diverging lineages. Moreover, Saxifragales forms a sister clade to Vitaceae of rosids. Overall, the results of

  8. Complete Chloroplast Genome Sequences and Comparative Analysis of Chenopodium quinoa and C. album.

    Science.gov (United States)

    Hong, Su-Young; Cheon, Kyeong-Sik; Yoo, Ki-Oug; Lee, Hyun-Oh; Cho, Kwang-Soo; Suh, Jong-Taek; Kim, Su-Jeong; Nam, Jeong-Hwan; Sohn, Hwang-Bae; Kim, Yul-Ho

    2017-01-01

    The Chenopodium genus comprises ~150 species, including Chenopodium quinoa and Chenopodium album, two important crops with high nutritional value. To elucidate the phylogenetic relationship between the two species, the complete chloroplast (cp) genomes of these species were obtained by next-generation sequencing. We performed comparative analysis of the sequences and, using InDel markers, inferred the phylogeny and genetic diversity of the Chenopodium genus. The cp genome is 152,099 bp (C. quinoa) and 152,167 bp (C. album) long. In total, 119 genes (78 protein-coding, 37 tRNA, and 4 rRNA) were identified. We found 14 (C. quinoa) and 15 (C. album) tandem repeats (TRs); 14 TRs were present in both species and C. album and C. quinoa each had one species-specific TR. The trnI-GAU intron sequences contained one (C. quinoa) or two (C. album) copies of TRs (66 bp); the InDel marker was designed based on the copy number variation in TRs. Using the InDel markers, we detected this variation in the TR copy number in four species, Chenopodium hybridum, Chenopodium pumilio, Chenopodium ficifolium, and Chenopodium koraiense, but not in Chenopodium glaucum. A comparison of coding and non-coding regions between C. quinoa and C. album revealed divergent sites. Nucleotide diversity >0.025 was found in 17 regions: 14 were located in the large single copy region (LSC), one in the inverted repeats, and two in the small single copy region (SSC). A phylogenetic analysis based on 59 protein-coding genes from 25 taxa resolved Chenopodioideae as monophyletic and sister to Betoideae. The complete plastid genome sequences and molecular markers based on divergence hotspot regions in the two Chenopodium taxa will help to resolve the phylogenetic relationships of Chenopodium.

  9. "First generation" automated DNA sequencing technology.

    Science.gov (United States)

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.

  10. Plant DNA sequences from feces: potential means for assessing diets of wild primates.

    Science.gov (United States)

    Bradley, Brenda J; Stiller, Mathias; Doran-Sheehy, Diane M; Harris, Tara; Chapman, Colin A; Vigilant, Linda; Poinar, Hendrik

    2007-06-01

    Analyses of plant DNA in feces provides a promising, yet largely unexplored, means of documenting the diets of elusive primates. Here we demonstrate the promise and pitfalls of this approach using DNA extracted from fecal samples of wild western gorillas (Gorilla gorilla) and black and white colobus monkeys (Colobus guereza). From these DNA extracts we amplified, cloned, and sequenced small segments of chloroplast DNA (part of the rbcL gene) and plant nuclear DNA (ITS-2). The obtained sequences were compared to sequences generated from known plant samples and to those in GenBank to identify plant taxa in the feces. With further optimization, this method could provide a basic evaluation of minimum primate dietary diversity even when knowledge of local flora is limited. This approach may find application in studies characterizing the diets of poorly-known, unhabituated primate species or assaying consumer-resource relationships in an ecosystem. (c) 2007 Wiley-Liss, Inc.

  11. The complete chloroplast genome sequence of the relict woody plant Metasequoia glyptostroboides Hu et Cheng

    Directory of Open Access Journals (Sweden)

    Jinhui Chen

    2015-06-01

    Full Text Available Metasequoia glyptostroboides Hu et Cheng is the only species in the genus Metasequoia Miki ex Hu et Cheng, which belongs to the Cupressaceae family. There were around ten species in the Metasequoia genus, which were widely spread across the Northern Hemisphere during the Cretaceous of the Mesozoic and in the Cenozoic. M. glyptostroboides is the only remaining representative of this genus. Here, we report the complete chloroplast (cp) genome sequence and the cp genomic features of M. glyptostroboides. The M. glyptostroboides cp genome is 131,887 bp in length, with a total of 117 genes comprised of 82 protein-coding genes, 31 tRNA genes and four rRNA genes. In this genome, 11 forward repeats, nine palindromic repeats and 15 tandem repeats were detected. A total of 188 perfect microsatellites were detected through simple sequence repeat (SSR) analysis and these were distributed unevenly within the cp genome. Comparison of the cp genome structure and gene order to those of several other land plants indicated that a copy of the inverted repeat (IR) region, which was found to be IR region A (IRA), was lost in the M. glyptostroboides cp genome. The five most divergent and five most conserved genes were determined and further phylogenetic analysis was performed among plant species, especially for related species in conifers. Finally, phylogenetic analysis demonstrated that M. glyptostroboides is a sister species to Cryptomeria japonica (L. F.) D. Don and to Taiwania cryptomerioides Hayata. The complete cp genome sequence information of M. glyptostroboides will be greatly helpful for further investigations of this endemic relict woody plant and for in-depth understanding of the evolutionary history of the coniferous cp genomes, especially for the position of M. glyptostroboides in plant systematics and evolution.

  12. The complete chloroplast genome sequence of the relict woody plant Metasequoia glyptostroboides Hu et Cheng.

    Science.gov (United States)

    Chen, Jinhui; Hao, Zhaodong; Xu, Haibin; Yang, Liming; Liu, Guangxin; Sheng, Yu; Zheng, Chen; Zheng, Weiwei; Cheng, Tielong; Shi, Jisen

    2015-01-01

    Metasequoia glyptostroboides Hu et Cheng is the only species in the genus Metasequoia Miki ex Hu et Cheng, which belongs to the Cupressaceae family. There were around 10 species in the Metasequoia genus, which were widely spread across the Northern Hemisphere during the Cretaceous of the Mesozoic and in the Cenozoic. M. glyptostroboides is the only remaining representative of this genus. Here, we report the complete chloroplast (cp) genome sequence and the cp genomic features of M. glyptostroboides. The M. glyptostroboides cp genome is 131,887 bp in length, with a total of 117 genes comprised of 82 protein-coding genes, 31 tRNA genes and four rRNA genes. In this genome, 11 forward repeats, nine palindromic repeats, and 15 tandem repeats were detected. A total of 188 perfect microsatellites were detected through simple sequence repeat (SSR) analysis and these were distributed unevenly within the cp genome. Comparison of the cp genome structure and gene order to those of several other land plants indicated that a copy of the inverted repeat (IR) region, which was found to be IR region A (IRA), was lost in the M. glyptostroboides cp genome. The five most divergent and five most conserved genes were determined and further phylogenetic analysis was performed among plant species, especially for related species in conifers. Finally, phylogenetic analysis demonstrated that M. glyptostroboides is a sister species to Cryptomeria japonica (L. F.) D. Don and to Taiwania cryptomerioides Hayata. The complete cp genome sequence information of M. glyptostroboides will be greatly helpful for further investigations of this endemic relict woody plant and for in-depth understanding of the evolutionary history of the coniferous cp genomes, especially for the position of M. glyptostroboides in plant systematics and evolution.

  13. The complete chloroplast genome sequence of Pelargonium x hortorum: Organization and evolution of the largest and most highly rearranged chloroplast genome of land plants

    Energy Technology Data Exchange (ETDEWEB)

    Chumley, Timothy W.; Palmer, Jeffrey D.; Mower, Jeffrey P.; Fourcade, H. Matthew; Calie, Patrick J.; Boore, Jeffrey L.; Jansen, Robert K.

    2006-01-20

    The chloroplast genome of Pelargonium x hortorum has been completely sequenced. It maps as a circular molecule of 217,942 bp, and is both the largest and most rearranged land plant chloroplast genome yet sequenced. It features two copies of a greatly expanded inverted repeat (IR) of 75,741 bp each, and consequently diminished single copy regions of 59,710 bp and 6,750 bp. It also contains two different associations of repeated elements that contribute about 10 percent to the overall size and account for the majority of repeats found in the genome. They represent hotspots for rearrangements and gene duplications and include a large number of pseudogenes. We propose simple models that account for the major rearrangements with a minimum of eight IR boundary changes and 12 inversions in addition to several insertions of duplicated sequence. The major processes at work (duplication, IR expansion, and inversion) have disrupted at least one and possibly two or three transcriptional operons, and the genes involved in these disruptions form the core of the two major repeat associations. Despite the vast increase in size and complexity of the genome, the gene content is similar to that of other angiosperms, with the exceptions of a large number of pseudogenes as part of the repeat associations, the recognition of two open reading frames (ORF56 and ORF42) in the trnA intron with similarities to previously identified mitochondrial products (ACRS and pvs-trnA), the loss of accD and trnT-GGU, and in particular, the lack of a recognizably functional rpoA. One or all of three similar open reading frames may possibly encode the latter, however.
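
    The region sizes quoted in this abstract can be checked against the reported total, since a quadripartite plastid genome consists of a large single copy region, a small single copy region and two copies of the inverted repeat. The short sketch below does that arithmetic; the function name is illustrative, and labelling the 59,710 bp and 6,750 bp regions as LSC and SSC is an assumption based on their relative sizes.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total genome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Values quoted in the Pelargonium abstract (bp).
LSC, SSC, IR = 59_710, 6_750, 75_741
REPORTED_TOTAL = 217_942

total = quadripartite_length(LSC, SSC, IR)
ir_fraction = 2 * IR / total  # share of the genome occupied by the two IR copies

print(total, total == REPORTED_TOTAL)
print(f"IR copies make up {ir_fraction:.1%} of the genome")
```

    The same check applied to the common bermudagrass record in this listing (79,732 + 12,521 + 2 x 21,022 bp) also reproduces its reported length of 134,297 bp.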

  14. Complete Chloroplast Genome Sequence of Aquilaria sinensis (Lour.) Gilg and Evolution Analysis within the Malvales Order.

    Science.gov (United States)

    Wang, Ying; Zhan, Di-Feng; Jia, Xian; Mei, Wen-Li; Dai, Hao-Fu; Chen, Xiong-Ting; Peng, Shi-Qing

    2016-01-01

    Aquilaria sinensis (Lour.) Gilg is an important medicinal woody plant producing agarwood, which is widely used in traditional Chinese medicine. High-throughput sequencing of chloroplast (cp) genomes enhanced the understanding about evolutionary relationships within plant families. In this study, we determined the complete cp genome sequences for A. sinensis. The size of the A. sinensis cp genome was 159,565 bp. This genome included a large single-copy region of 87,482 bp, a small single-copy region of 19,857 bp, and a pair of inverted repeats (IRa and IRb) of 26,113 bp each. The GC content of the genome was 37.11%. The A. sinensis cp genome encoded 113 functional genes, including 82 protein-coding genes, 27 tRNA genes, and 4 rRNA genes. Seven genes were duplicated in the protein-coding genes, whereas 11 genes were duplicated in the RNA genes. A total of 45 polymorphic simple-sequence repeat loci and 60 pairs of large repeats were identified. Most simple-sequence repeats were located in the noncoding sections of the large single-copy/small single-copy region and exhibited high A/T content. Moreover, 33 pairs of large repeat sequences were located in the protein-coding genes, whereas 27 pairs were located in the intergenic regions. Aquilaria sinensis cp genome bias ended with A/T on the basis of codon usage. The distribution of codon usage in A. sinensis cp genome was most similar to that in the Gonystylus bancanus cp genome. Comparative results of 82 protein-coding genes from 29 species of cp genomes demonstrated that A. sinensis was a sister species to G. bancanus within the Malvales order. Aquilaria sinensis cp genome presented the highest sequence similarity of >90% with the G. bancanus cp genome by using CGView Comparison Tool. This finding strongly supports the placement of A. sinensis as a sister to G. bancanus within the Malvales order. The complete A. sinensis cp genome information will be highly beneficial for further studies on this traditional medicinal

  15. Nucleotide sequence preservation of human mitochondrial DNA

    International Nuclear Information System (INIS)

    Monnat, R.J. Jr.; Loeb, L.A.

    1985-01-01

    Recombinant DNA techniques have been used to quantitate the amount of nucleotide sequence divergence in the mitochondrial DNA population of individual normal humans. Mitochondrial DNA was isolated from the peripheral blood lymphocytes of five normal humans and cloned in M13 mp11; 49 kilobases of nucleotide sequence information was obtained from 248 independently isolated clones from the five normal donors. Both between- and within-individual differences were identified. Between-individual differences were identified in approximately 1/200 nucleotides. In contrast, only one within-individual difference was identified in 49 kilobases of nucleotide sequence information. This high degree of mitochondrial nucleotide sequence homogeneity in human somatic cells is in marked contrast to the rapid evolutionary divergence of human mitochondrial DNA and suggests the existence of mechanisms for the concerted preservation of mammalian mitochondrial DNA sequences in single organisms.

  16. Complete chloroplast genome sequences of Drimys, Liriodendron, and Piper: Implications for the phylogeny of magnoliids and the evolution of GC content

    Energy Technology Data Exchange (ETDEWEB)

    Zhengqiu, C.; Penaflor, C.; Kuehl, J.V.; Leebens-Mack, J.; Carlson, J.; dePamphilis, C.W.; Boore, J.L.; Jansen, R.K.

    2006-06-01

    the inverted repeat due to the presence of rRNA genes and lowest in the small single copy region where most NADH genes are located. Phylogenetic analyses using maximum parsimony and maximum likelihood methods were performed on DNA sequences of 61 protein-coding genes. Trees from both analyses provided strong support for the monophyly of magnoliids and two strongly supported groups were identified, the Canellales/Piperales and the Laurales/Magnoliales. The phylogenies also provided moderate to strong support for the basal position of Amborella, and a sister relationship of magnoliids to a clade that includes monocots and eudicots. The complete sequences of three magnoliid chloroplast genomes provide new data from the largest basal angiosperm clade. Evolutionary comparisons of these new genome sequences, combined with other published angiosperm genomes, confirm that GC content is unevenly distributed across the genome by location, codon position, and functional group. Furthermore, phylogenetic analyses provide the strongest support so far for the hypothesis that the magnoliids are sister to a large clade that includes both monocots and eudicots.

  17. Human Chromosome 7: DNA Sequence and Biology

    OpenAIRE

    Scherer, Stephen W.; Cheung, Joseph; MacDonald, Jeffrey R.; Osborne, Lucy R.; Nakabayashi, Kazuhiko; Herbrick, Jo-Anne; Carson, Andrew R.; Parker-Katiraee, Layla; Skaug, Jennifer; Khaja, Razi; Zhang, Junjun; Hudek, Alexander K.; Li, Martin; Haddad, May; Duggan, Gavin E.

    2003-01-01

    DNA sequence and annotation of the entire human chromosome 7, encompassing nearly 158 million nucleotides of DNA and 1917 gene structures, are presented. To generate a higher order description, additional structural features such as imprinted genes, fragile sites, and segmental duplications were integrated at the level of the DNA sequence with medical genetic data, including 440 chromosome rearrangement breakpoints associated with disease. This approach enabled the discovery of candidate gene...

  18. An assessment of Wx microsatellite allele, alkali degradation and differentiation of chloroplast DNA in traditional black rice (Oryza sativa L.) from Thailand and Lao PDR.

    Science.gov (United States)

    Prathepha, Preecha

    2007-01-15

    Thailand and Lao PDR are countries rich in rice diversity. To contribute knowledge for the development of new rice varieties, a collection of 142 black rice (Oryza sativa) accessions was assessed for variation in physico-chemical properties, Wx microsatellite allele, Wx allele and chloroplast DNA type. The results showed that the amylose content of the black rice accessions ranged from 1.9 to 6.8%. All of the alkali disintegration types (high, intermediate and low) were observed in these accessions, with an average of 1.75 on the 1-3 digestibility scale. The unique Wx microsatellite allele (CT)17 was found in these samples, and all black rice strains carried the Wx(b) allele. In addition, all black rice accessions carried the duplication of the 23 bp sequence motif in exon 2 of the wx gene, which is common in glutinous rice. The chloroplast DNA type differed between the two growing conditions for black rice, rainfed lowland and rainfed upland: all strains from rainfed lowland had the deletion plastotype, whereas all rainfed upland strains were non-deletion types.

  19. Multiple tag labeling method for DNA sequencing

    Science.gov (United States)

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  20. Restriction enzyme analysis of the chloroplast DNA of Phaseolus vulgaris L. vr. Rio Negro

    Directory of Open Access Journals (Sweden)

    Sergio Echeverrigaray

    1996-12-01

    Full Text Available The chloroplast DNA of Phaseolus vulgaris L. vr. Rio Negro was isolated from chloroplasts obtained by discontinuous sucrose gradient centrifugation. Restriction analysis with the enzymes HindIII, EcoRI and BamHI, alone and in combination, identified more than 20 fragments ranging from 18 to 0.65 kb. The size of the Phaseolus vulgaris L. cp DNA was estimated at 140 kb, with a repeat sequence of about 22 kb.
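
    Restriction analysis of this kind sizes a circular genome by summing the fragments from single and combined digests. The sketch below simulates such digests in silico on a toy circular sequence. It is an illustration only: the recognition sites are the standard ones for HindIII (A^AGCTT), EcoRI (G^AATTC) and BamHI (G^GATCC), while the function names and the toy sequence are hypothetical and unrelated to the Phaseolus data.

```python
# Recognition sites; each of these enzymes cuts after the first base of its site.
SITES = {"HindIII": "AAGCTT", "EcoRI": "GAATTC", "BamHI": "GGATCC"}
CUT_OFFSET = 1


def cut_positions(seq, enzymes):
    """Sorted cut coordinates on a circular sequence for the given enzymes."""
    seq = seq.upper()
    n = len(seq)
    positions = set()
    for name in enzymes:
        site = SITES[name]
        # Append the start of the sequence so sites spanning the origin are found.
        extended = seq + seq[:len(site) - 1]
        start = extended.find(site)
        while start != -1:
            positions.add((start + CUT_OFFSET) % n)
            start = extended.find(site, start + 1)
    return sorted(positions)


def circular_fragments(seq, enzymes):
    """Fragment lengths (largest first) from digesting a circular molecule."""
    cuts = cut_positions(seq, enzymes)
    n = len(seq)
    if not cuts:
        return [n]  # uncut circle
    frags = [b - a for a, b in zip(cuts, cuts[1:])]
    frags.append(n - cuts[-1] + cuts[0])  # fragment that wraps past the origin
    return sorted(frags, reverse=True)


# Toy 30 bp circle with one EcoRI site and one HindIII site (hypothetical data).
toy = "GAATTC" + "A" * 10 + "AAGCTT" + "C" * 8
print(circular_fragments(toy, ["EcoRI"]))
print(circular_fragments(toy, ["EcoRI", "HindIII"]))
```

    In every digest the fragment lengths sum to the length of the circle, which is the property used to estimate genome size from gel data.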

  1. Complete chloroplast genome sequence of green foxtail (Setaria viridis), a promising model system for C4 photosynthesis.

    Science.gov (United States)

    Wang, Shuo; Gao, Li-Zhi

    2016-09-01

    The complete chloroplast genome of green foxtail (Setaria viridis), a promising model system for C4 photosynthesis, is first reported in this study. The genome harbors a large single copy (LSC) region of 81 016 bp and a small single copy (SSC) region of 12 456 bp separated by a pair of inverted repeat (IRa and IRb) regions of 22 315 bp. GC content is 38.92%. The proportion of coding sequence is 57.97%, comprising 111 (19 duplicated in IR regions) unique genes, 71 of which are protein-coding genes, four are rRNA genes, and 36 are tRNA genes. Phylogenetic analysis indicated that S. viridis was clustered with its cultivated species S. italica in the tribe Paniceae of the family Poaceae. This newly determined chloroplast genome will provide valuable genetic resources to assist future studies on C4 photosynthesis in grasses.

  2. Phylogenetic relationships in the genus Leonardoxa (Leguminosae: Caesalpinioideae) inferred from chloroplast trnL intron and trnL-trnF intergenic spacer sequences.

    Science.gov (United States)

    Brouat, Carine; Gielly, Ludovic; McKey, Doyle

    2001-01-01

    The African genus Leonardoxa (Leguminosae: Caesalpinioideae) comprises two Congolean species and a group of four mostly allopatric subspecies principally located in Cameroon and clustered together in the L. africana complex. Leonardoxa provides a good opportunity to investigate the evolutionary history of ant-plant mutualisms, as it exhibits various grades of ant-plant interactions from diffuse to obligate and symbiotic associations. We present in this paper the first molecular phylogenetic study of this genus. We sequenced both the chloroplast DNA trnL intron (677 aligned base pairs [bp]) and trnL-trnF intergenic spacer (598 aligned bp). Inferred phylogenetic relationships suggested first that the genus is paraphyletic. The L. africana complex is clearly separated from the two Congolean species, and the integrity of the genus is thus in question. In the L. africana complex, our data showed a lack of congruence between clades suggested by morphological and chloroplast characters. This, and the low level of molecular divergence found between subspecies, suggests gene flow and introgressive events in the L. africana complex.

  3. Complete chloroplast genome sequence of common bermudagrass (Cynodon dactylon (L.) Pers.) and comparative analysis within the family Poaceae.

    Science.gov (United States)

    Huang, Ya-Yi; Cho, Shu-Ting; Haryono, Mindia; Kuo, Chih-Horng

    2017-01-01

    Common bermudagrass (Cynodon dactylon (L.) Pers.) belongs to the subfamily Chloridoideae of the Poaceae family, one of the most important plant families ecologically and economically. This grass has a long connection with human culture but its systematics is relatively understudied. In this study, we sequenced and investigated the chloroplast genome of common bermudagrass, which is 134,297 bp in length with two single copy regions (LSC: 79,732 bp; SSC: 12,521 bp) and a pair of inverted repeat (IR) regions (21,022 bp). The annotation contains a total of 128 predicted genes, including 82 protein-coding, 38 tRNA, and 8 rRNA genes. Additionally, our in silico analyses identified 10 sets of repeats longer than 20 bp and predicted the presence of 36 RNA editing sites. Overall, the chloroplast genome of common bermudagrass resembles those from other Poaceae lineages. Compared to most angiosperms, the accD gene and the introns of both clpP and rpoC1 genes are missing. Additionally, the ycf1, ycf2, ycf15, and ycf68 genes are pseudogenized and two genome rearrangements exist. Our phylogenetic analysis based on 47 chloroplast protein-coding genes supported the placement of common bermudagrass within Chloridoideae. Our phylogenetic character mapping based on the parsimony principle further indicated that the loss of the accD gene and clpP introns, the pseudogenization of four ycf genes, and the two rearrangements occurred only once after the most recent common ancestor of the Poaceae diverged from other monocots, which could explain the unusual long branch leading to the Poaceae when phylogeny is inferred based on chloroplast sequences.

  4. Phylogeographical variation of chloroplast DNA in holm oak (Quercus ilex L.).

    Science.gov (United States)

    Lumaret, R; Mir, C; Michaud, H; Raynal, V

    2002-11-01

    Variation in the lengths of restriction fragments (RFLPs) of the whole chloroplast DNA molecule was studied in 174 populations of Quercus ilex L. sampled over the entire distribution of this evergreen and mainly Mediterranean oak species. By using five endonucleases, 323 distinct fragments were obtained. From the 29 and 17 cpDNA changes identified as site and length mutations, respectively, 25 distinct chlorotypes were distinguished, mapped and treated cladistically with a parsimony analysis, using as an outgroup Q. alnifolia Poech, a closely related evergreen oak species endemic to Cyprus where Q. ilex does not grow. The predominant role of Q. ilex as maternal parent in hybridization with other species was reflected by the occurrence of a single very specific lineage of related chlorotypes, the most ancestral and recent ones being located in the southeastern and in the northwestern parts of the species' geographical distribution, respectively. The lineage was constituted of two clusters of chlorotypes observed in the 'ilex' morphotyped populations of the Balkan and Italian Peninsulas (including the contiguous French Riviera), respectively. A third cluster was divided into two subclusters identified in the 'rotundifolia' morphotyped populations of North Africa, and of Iberia and the adjacent French regions, respectively. Postglacial colonization probably started from three distinct southerly refugia located in each of the three European peninsulas, and a contact area between the Italian and the Iberian migration routes was identified in the Rhône valley (France). Chlorotypes identical or related to those of the Iberian cluster were identified in the populations from Catalonia and the French Languedoc region, which showed intermediate morphotypes, and in the French Atlantic populations which possessed the 'ilex' morphotype, suggesting the occurrence of adaptive morphological changes in the northern part of the species' distribution.

  5. The Complete Chloroplast and Mitochondrial Genome Sequences of Boea hygrometrica: Insights into the Evolution of Plant Organellar Genomes

    Science.gov (United States)

    Wang, Xumin; Deng, Xin; Zhang, Xiaowei; Hu, Songnian; Yu, Jun

    2012-01-01

    The complete nucleotide sequences of the chloroplast (cp) and mitochondrial (mt) genomes of resurrection plant Boea hygrometrica (Bh, Gesneriaceae) have been determined with the lengths of 153,493 bp and 510,519 bp, respectively. The smaller chloroplast genome contains more genes (147) with a 72% coding sequence, and the larger mitochondrial genome has fewer genes (65) with a coding fraction of 12%. Similar to other seed plants, the Bh cp genome has a typical quadripartite organization with a conserved gene in each region. The Bh mt genome has three recombinant sequence repeats of 222 bp, 843 bp, and 1474 bp in length, which divide the genome into a single master circle (MC) and four isomeric molecules. Compared to other angiosperms, one remarkable feature of the Bh mt genome is the frequent transfer of genetic material from the cp genome during recent Bh evolution. We also analyzed organellar genome evolution in general regarding genome features as well as compositional dynamics of sequence and gene structure/organization, providing clues for the understanding of the evolution of organellar genomes in plants. The cp-derived sequences including tRNAs found in angiosperm mt genomes support the conclusion that frequent gene transfer events may have begun early in the land plant lineage. PMID:22291979

  6. EGNAS: an exhaustive DNA sequence design algorithm

    Directory of Open Access Journals (Sweden)

    Kick Alfred

    2012-06-01

    Full Text Available Abstract Background: Molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA) is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that generates sets containing a maximum number of sequences with defined properties. EGNAS (Exhaustive Generation of Nucleic Acid Sequences) offers the possibility of controlling both interstrand and intrastrand properties. The guanine-cytosine content can be adjusted. Sequences can be forced to start and end with guanine or cytosine. This option reduces the risk of “fraying” of DNA strands. It is possible to limit cross hybridizations of a defined length, and to adjust the uniqueness of sequences. Self-complementarity and hairpin structures of certain length can be avoided. Sequences and subsequences can optionally be forbidden. Furthermore, sequences can be designed to have minimum interactions with predefined strands and neighboring sequences. Results: The algorithm is realized in a C++ program. TAG sequences can be generated and combined with primers for single-base extension reactions, which were described for multiplexed genotyping of single nucleotide polymorphisms. Thereby, possible foldback through intrastrand interaction of TAG-primer pairs can be limited. The design of sequences for specific attachment of molecular constructs to DNA origami is presented. Conclusions: We developed a new software tool called EGNAS for the design of unique nucleic acid sequences. The presented exhaustive algorithm generates greater sets of sequences than previous software under equal constraints. EGNAS is freely available for noncommercial use at http://www.chm.tu-dresden.de/pc6/EGNAS.
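
    The constraints listed in this abstract (GC content, G/C at both ends, limits on self-complementarity and on shared subsequences) can be illustrated with a toy enumeration. The sketch below is not the EGNAS algorithm and makes no claim about how the C++ program works: it enumerates every sequence of a short length and accepts candidates greedily, so the resulting set is not guaranteed to be maximal. All function names, parameters and thresholds are assumptions chosen for illustration.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmers(seq, k):
    """Set of all substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def passes_filters(seq, k, gc_min, gc_max):
    """Per-sequence checks: GC ends, GC window, no internal complementary k-mers."""
    if seq[0] not in "GC" or seq[-1] not in "GC":
        return False
    if not gc_min <= gc_fraction(seq) <= gc_max:
        return False
    own = kmers(seq, k)
    # Intrastrand check: a k-mer whose reverse complement also occurs could fold back.
    return not any(reverse_complement(m) in own for m in own)


def design(length, k, gc_min=0.4, gc_max=0.6):
    """Greedy pass over all 4**length sequences (hypothetical simplification)."""
    chosen = []
    used = set()  # k-mers (and their reverse complements) of accepted sequences
    for letters in product("ACGT", repeat=length):
        seq = "".join(letters)
        if not passes_filters(seq, k, gc_min, gc_max):
            continue
        own = kmers(seq, k)
        footprint = own | {reverse_complement(m) for m in own}
        if footprint & used:
            continue  # would share or cross-hybridize over k bases with an accepted sequence
        chosen.append(seq)
        used |= footprint
    return chosen


result = design(length=6, k=3)
print(len(result), result[:5])
```

    Because the search space grows as 4 to the power of the sequence length, this brute-force loop is only practical for short sequences.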

  7. The Complete Chloroplast Genome Sequence of Tree of Heaven (Ailanthus altissima (Mill.) (Sapindales: Simaroubaceae), an Important Pantropical Tree

    Directory of Open Access Journals (Sweden)

    Josphat K. Saina

    2018-03-01

    Full Text Available Ailanthus altissima (Mill.) Swingle (Simaroubaceae) is a deciduous tree widely distributed throughout temperate regions in China, hence suitable for genetic diversity and evolutionary studies. Previous studies in A. altissima have mainly focused on its biological activities, genetic diversity and genetic structure. However, until now there has been no published report regarding the genome of this plant species or the Simaroubaceae family. Therefore, in this paper, we first characterized the A. altissima complete chloroplast genome sequence. The tree of heaven chloroplast genome was found to be a circular molecule 160,815 base pairs (bp) in size that possesses a quadripartite structure. The A. altissima chloroplast genome contains 113 unique genes, of which 79 and 30 are protein-coding and transfer RNA (tRNA) genes, respectively, and 4 are ribosomal RNA (rRNA) genes, with an overall GC content of 37.6%. Microsatellite marker detection identified A/T mononucleotides as the majority of SSRs in all the seven analyzed genomes. Repeat analyses of seven Sapindales revealed a total of 49 repeats in A. altissima, Rhus chinensis, Dodonaea viscosa, Leitneria floridana, while Azadirachta indica, Boswellia sacra, and Citrus aurantiifolia had a total of 48 repeats. The phylogenetic analysis using protein coding genes revealed that A. altissima is a sister to Leitneria floridana and also suggested that Simaroubaceae is a sister to Rutaceae family. The genome information reported here could be further applied for evolution and invasion, population genetics, and molecular studies in this plant species and family.

  8. Chromatid interchanges at intrachromosomal telomeric DNA sequences

    International Nuclear Information System (INIS)

    Fernandez, J.L.; Vazquez-Gundin, F.; Bilbao, A.; Gosalvez, J.; Goyanes, V.

    1997-01-01

    Chinese hamster Don cells were exposed to X-rays, mitomycin C and teniposide (VM-26) to induce chromatid exchanges (quadriradials and triradials). After fluorescence in situ hybridization (FISH) of telomere sequences it was found that interstitial telomere-like DNA sequence arrays presented around five times more breakage-rearrangements than the genome overall. This high recombinogenic capacity was independent of the clastogen, suggesting that this susceptibility is not related to the initial mechanisms of DNA damage. (author)

  9. Transcriptional regulation and DNA methylation in plastids during transitional conversion of chloroplasts to chromoplasts.

    OpenAIRE

    Kobayashi, H; Ngernprasirtsiri, J; Akazawa, T

    1990-01-01

    During transitional conversion of chloroplasts to chromoplasts in ripening tomato (Lycopersicon esculentum) fruits, transcripts for several plastid genes for photosynthesis decreased to undetectable levels. Run-on transcription of plastids indicated that transcriptional regulation operated as a predominant factor. We found that most of the genes in chloroplasts were actively transcribed in vitro by Escherichia coli and soluble plastid RNA polymerases, but some genes in chromoplasts seemed to ...

  10. Mitochondrial DNA sequence evolution in shorebird populations

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons

  11. Recurrence plot analysis of DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Wu Zuobing [State Key Laboratory of Nonlinear Mechanics, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100080 (China)]. E-mail: wuzb@lnm.imech.ac.cn

    2004-11-15

    Recurrence plot technique of DNA sequences is established on metric representation and employed to analyze correlation structure of nucleotide strings. It is found that, in the transference of nucleotide strings, a human DNA fragment has a major correlation distance, but a yeast chromosome's correlation distance has a constant increasing.

  12. On site DNA barcoding by nanopore sequencing.

    Directory of Open Access Journals (Sweden)

    Michele Menegon

    Full Text Available Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet's biological heritage. The use of genetic markers i.e. DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in a mimicking tropical environment, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error-rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time-on-site DNA sequencing thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities.

  13. Transcriptional Slippage and RNA Editing Increase the Diversity of Transcripts in Chloroplasts: Insight from Deep Sequencing of Vigna radiata Genome and Transcriptome.

    Directory of Open Access Journals (Sweden)

    Ching-Ping Lin

    Full Text Available We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R2=0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. It showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants.
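    The C-to-U editing sites described in this record can be illustrated with a small sketch. It is not the authors' pipeline: it assumes the genomic and transcript sequences are already aligned, gap-free, of equal length and on the same strand, and the function name `c_to_u_sites` is hypothetical.

```python
def c_to_u_sites(genomic: str, transcript: str) -> list[int]:
    """Return 0-based positions where the genome has C and the transcript has U.

    Assumption: both sequences are pre-aligned, gap-free and of equal length.
    The transcript may be written with T (cDNA) or U (RNA).
    """
    if len(genomic) != len(transcript):
        raise ValueError("sequences must be aligned to the same length")
    rna = transcript.upper().replace("T", "U")
    dna = genomic.upper()
    return [i for i, (g, r) in enumerate(zip(dna, rna)) if g == "C" and r == "U"]


# Toy example: the second C of the genomic codon "CCA" is read as U in the transcript.
sites = c_to_u_sites("ATGCCACGT", "ATGCTACGT")
print(sites)
```

    A real analysis would start from read alignments and apply coverage and frequency thresholds before calling a site; none of that is modelled here.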

  14. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Full Text Available Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  15. Sequencing Intractable DNA to Close Microbial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Hurt, Jr., Richard Ashley [ORNL; Brown, Steven D [ORNL; Podar, Mircea [ORNL; Palumbo, Anthony Vito [ORNL; Elias, Dwayne A [ORNL

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  16. Bacterial identification and subtyping using DNA microarray and DNA sequencing.

    Science.gov (United States)

    Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

    2012-01-01

    The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies not only will identify, isotype, and serotype pathogenic bacteria, but also it will aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple to use technique that can produce accurate and quantitative analysis of DNA sequences with a great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.

  17. The complete chloroplast genome sequence of Aconitum coreanum and Aconitum carmichaelii and comparative analysis with other Aconitum species.

    Directory of Open Access Journals (Sweden)

    Inkyu Park

    Full Text Available Aconitum species (belonging to the Ranunculaceae) are well known herbaceous medicinal ingredients and have great economic value in Asian countries. However, there are still limited genomic resources available for Aconitum species. In this study, we sequenced the chloroplast (cp) genomes of two Aconitum species, A. coreanum and A. carmichaelii, using the MiSeq platform. The two Aconitum chloroplast genomes were 155,880 and 157,040 bp in length, respectively, and exhibited LSC and SSC regions separated by a pair of inverted repeat regions. Both cp genomes had 38% GC content and contained 131 unique functional genes including 86 protein-coding genes, eight ribosomal RNA genes, and 37 transfer RNA genes. The gene order, content, and orientation of the two Aconitum cp genomes exhibited the general structure of angiosperms, and were similar to those of other Aconitum species. Comparison of the cp genome structure and gene order with that of other Aconitum species revealed general contraction and expansion of the inverted repeat regions and single copy boundary regions. Divergent regions were also identified. In phylogenetic analysis, the position of Aconitum species among the Ranunculaceae was determined with other family cp genomes in the Ranunculales. We obtained a barcoding target sequence in a divergent region, ndhC-trnV, and successfully developed a SCAR (sequence characterized amplified region) marker for discrimination of A. coreanum. Our results provide useful genetic information and a specific barcode for discrimination of Aconitum species.

  18. The complete chloroplast genome sequence of Aconitum coreanum and Aconitum carmichaelii and comparative analysis with other Aconitum species.

    Science.gov (United States)

    Park, Inkyu; Kim, Wook-Jin; Yang, Sungyu; Yeo, Sang-Min; Li, Hulin; Moon, Byeong Cheol

    2017-01-01

    Aconitum species (belonging to the Ranunculaceae) are well known herbaceous medicinal ingredients and have great economic value in Asian countries. However, there are still limited genomic resources available for Aconitum species. In this study, we sequenced the chloroplast (cp) genomes of two Aconitum species, A. coreanum and A. carmichaelii, using the MiSeq platform. The two Aconitum chloroplast genomes were 155,880 and 157,040 bp in length, respectively, and exhibited LSC and SSC regions separated by a pair of inverted repeat regions. Both cp genomes had 38% GC content and contained 131 unique functional genes including 86 protein-coding genes, eight ribosomal RNA genes, and 37 transfer RNA genes. The gene order, content, and orientation of the two Aconitum cp genomes exhibited the general structure of angiosperms, and were similar to those of other Aconitum species. Comparison of the cp genome structure and gene order with that of other Aconitum species revealed general contraction and expansion of the inverted repeat regions and single copy boundary regions. Divergent regions were also identified. In phylogenetic analysis, the position of Aconitum species among the Ranunculaceae was determined with other family cp genomes in the Ranunculales. We obtained a barcoding target sequence in a divergent region, ndhC-trnV, and successfully developed a SCAR (sequence characterized amplified region) marker for discrimination of A. coreanum. Our results provide useful genetic information and a specific barcode for discrimination of Aconitum species.

  19. RANDNA: a random DNA sequence generator.

    Science.gov (United States)

    Piva, Francesco; Principato, Giovanni

    2006-01-01

    Monte Carlo simulations are useful to verify the significance of data. Genomic regularities, such as nucleotide correlations or the non-uniform distribution of motifs throughout genomic or mature mRNA sequences, exist, and their significance can be checked by means of the Monte Carlo test. The test needs good-quality random sequences in order to work; moreover, they should have the same nucleotide distribution as the sequences in which the regularities have been found. Random DNA sequences are also useful to estimate the background score of an alignment, that is, a threshold below which the resulting score is merely due to chance. We have developed RANDNA, free software that produces random DNA or RNA sequences with a user-defined length and nucleotide composition. Sequences having the same nucleotide distribution as exonic, intronic or intergenic sequences can be generated. Its graphic interface makes it possible to easily set the parameters that characterize the sequences being produced and saved in a text format file. The pseudo-random number generator function of Borland Delphi 6 is used, since it guarantees a good randomness, a long cycle length and a high speed. We have checked the quality of sequences generated by the software, by means of well-known tests, both by themselves and versus genuine random sequences. We show the good quality of the generated sequences. The software, complete with examples and documentation, is freely available to users from: http://www.introni.it/en/software.

  20. Compressing DNA sequence databases with coil

    Directory of Open Access Journals (Sweden)

    Hendy Michael D

    2008-05-01

    Full Text Available Abstract Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.
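    The gzip baseline that this record compares against can be measured with a few lines of Python. The sketch below does not implement coil or edit-tree coding; it only computes the compression ratio of general-purpose gzip on a made-up FASTA-like flat file. The helper names and the toy data are assumptions for illustration.

```python
import gzip
import random


def gzip_ratio(text: str) -> float:
    """Uncompressed size divided by gzip-compressed size (higher is better)."""
    raw = text.encode("ascii")
    return len(raw) / len(gzip.compress(raw))


def toy_flat_file(n_records: int, length: int, seed: int = 0) -> str:
    """Build a FASTA-like flat file of random sequences (toy data, not real ESTs)."""
    rng = random.Random(seed)
    records = []
    for i in range(n_records):
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        records.append(f">toy_{i}\n{seq}\n")
    return "".join(records)


flat = toy_flat_file(200, 400)
print(f"gzip compression ratio: {gzip_ratio(flat):.2f}")
```

    Real sequence databases contain related sequences, which is the redundancy a database-level compressor can exploit and a sliding-window compressor largely misses; the random toy file here has none, so it only shows how the ratio is computed.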

  1. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    Science.gov (United States)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of a unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shifts depending on the pH, with the most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of the beta-lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  2. Understanding human DNA sequence variation.

    Science.gov (United States)

    Kidd, K K; Pakstis, A J; Speed, W C; Kidd, J R

    2004-01-01

    Over the past century researchers have identified normal genetic variation and studied that variation in diverse human populations to determine the amounts and distributions of that variation. That information is being used to develop an understanding of the demographic histories of the different populations and the species as a whole, among other studies. With the advent of DNA-based markers in the last quarter century, these studies have accelerated. One of the challenges for the next century is to understand that variation. One component of that understanding will be population genetics. We present here examples of many of the ways these new data can be analyzed from a population perspective using results from our laboratory on multiple individual DNA-based polymorphisms, many clustered in haplotypes, studied in multiple populations representing all major geographic regions of the world. These data support an "out of Africa" hypothesis for human dispersal around the world and begin to refine the understanding of population structures and genetic relationships. We are also developing baseline information against which we can compare findings at different loci to aid in the identification of loci subject, now and in the past, to selection (directional or balancing). We do not yet have a comprehensive understanding of the extensive variation in the human genome, but some of that understanding is coming from population genetics.

  3. The complete chloroplast genome sequence of the medicinal plant Andrographis paniculata.

    Science.gov (United States)

    Ding, Ping; Shao, Yanhua; Li, Qian; Gao, Junli; Zhang, Runjing; Lai, Xiaoping; Wang, Deqin; Zhang, Huiye

    2016-07-01

    The complete chloroplast genome of Andrographis paniculata, an important medicinal plant with great economic value, has been studied in this article. The genome is 150,249 bp in length, with 38.3% GC content. A pair of inverted repeats (IRs, 25,300 bp each) is separated by a large single-copy region (LSC, 82,459 bp) and a small single-copy region (SSC, 17,190 bp). The chloroplast genome contains 114 unique genes: 80 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Of these genes, 15 contain one intron and 3 contain two introns.
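    The figures in this record can be cross-checked with simple arithmetic: in a quadripartite chloroplast genome the total length is the LSC plus the SSC plus two copies of the inverted repeat. The sketch below applies that check to the values quoted above; the function name is hypothetical.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total genome length for one LSC, one SSC and two inverted-repeat copies."""
    return lsc + ssc + 2 * ir


# Values quoted in the Andrographis paniculata record.
total = quadripartite_length(lsc=82_459, ssc=17_190, ir=25_300)
print(total)  # 150249, matching the reported genome size

# Gene categories quoted in the same record should add up to the unique-gene total.
assert 80 + 30 + 4 == 114
```

    The same check is a quick way to spot transcription errors in region sizes when comparing records from different sources.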

  4. DNA Sequencing in Cultural Heritage.

    Science.gov (United States)

    Vai, Stefania; Lari, Martina; Caramelli, David

    2016-02-01

    During the last three decades, DNA analysis on degraded samples revealed itself as an important research tool in anthropology, archaeozoology, molecular evolution, and population genetics. Application on topics such as determination of species origin of prehistoric and historic objects, individual identification of famous personalities, characterization of particular samples important for historical, archeological, or evolutionary reconstructions, confers to the paleogenetics an important role also for the enhancement of cultural heritage. A really fast improvement in methodologies in recent years led to a revolution that permitted recovering even complete genomes from highly degraded samples with the possibility to go back in time 400,000 years for samples from temperate regions and 700,000 years for permafrozen remains and to analyze even more recent material that has been subjected to hard biochemical treatments. Here we propose a review on the different methodological approaches used so far for the molecular analysis of degraded samples and their application on some case studies.

  5. Enhanced throughput for infrared automated DNA sequencing

    Science.gov (United States)

    Middendorf, Lyle R.; Gartside, Bill O.; Humphrey, Pat G.; Roemer, Stephen C.; Sorensen, David R.; Steffens, David L.; Sutter, Scott L.

    1995-04-01

    Several enhancements have been developed and applied to infrared automated DNA sequencing resulting in significantly higher throughput. A 41 cm sequencing gel (31 cm well-to-read distance) combines high resolution of DNA sequencing fragments with optimized run times yielding two runs per day of 500 bases per sample. A 66 cm sequencing gel (56 cm well-to-read distance) produces sequence read lengths of up to 1000 bases for ds and ss templates using either T7 polymerase or cycle-sequencing protocols. Using a multichannel syringe to load 64 lanes allows 16 samples (compatible with 96-well format) to be visualized for each run. The 41 cm gel configuration allows 16,000 bases per day (16 samples × 500 bases/sample × 2 ten-hour runs/day) to be sequenced with the advantages of infrared technology. Enhancements to internal labeling techniques using an infrared-labeled dATP molecule (Boehringer Mannheim GmbH, Penzberg, Germany; Sequenase (U.S. Biochemical)) have also been made. The inclusion of glycerol in the sequencing reactions yields greatly improved results for some primer and template combinations. The inclusion of α-thio-dNTPs in the labeling reaction increases signal intensity two- to three-fold.

  6. The DnaJ-like zinc finger domain protein PSA2 affects light acclimation and chloroplast development in Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Yan-Wen Wang

    2016-03-01

    Full Text Available The biosynthesis of chlorophylls and carotenoids and the assembly of thylakoid membranes are critical for the photoautotrophic growth of plants. Different factors are involved in these two processes. In recent years, members of the DnaJ-like zinc finger domain proteins have been found to take part in the biogenesis and/or the maintenance of plastids. One member of this family of proteins, PSA2, was recently found to localize to the thylakoid lumen and regulate the accumulation of photosystem I. In this study, we report that the silencing of PSA2 in Arabidopsis thaliana resulted in variegated leaves and retarded growth. Although both chlorophylls and total carotenoids decreased in the psa2 mutant, violaxanthin and zeaxanthin accumulated in the mutant seedlings grown under normal growth conditions. Lower levels of non-photochemical quenching and electron transport rate were also found in the psa2 mutant seedlings under these conditions compared with those of the wild-type plants, indicating an impaired capability to acclimate to normal light irradiance when PSA2 was silenced. Moreover, we also observed an abnormal assembly of grana thylakoids and poorly developed stroma thylakoids in psa2 chloroplasts. Taken together, our results demonstrate that PSA2 is a member of the DnaJ-like zinc finger domain protein family that affects light acclimation and chloroplast development.

  7. Chapter 2: Genetic Variability in Nuclear Ribosomal and Chloroplast DNA in Utah (Juniperus osteosperma) and Western (J. occidentalis) Juniper (Cupressaceae): Evidence for Interspecific Gene Flow

    Energy Technology Data Exchange (ETDEWEB)

    Terry, Randall G.; Tausch, Robin J.; Nowak, Robert S.

    1998-02-14

    Early studies of evolutionary change in chloroplast DNA indicated limited variability within species. This finding has been attributed to relatively low rates of sequence evolution and has been maintained as justification for the lack of intraspecific sampling in studies examining relationships at the species level and above. However, documentation of intraspecific variation in cpDNA has become increasingly common and has been attributed in many cases to "chloroplast capture" following genetic exchange across species boundaries. Rieseberg and Wendel (1993) list 37 cases of proposed hybridization in plants that include intraspecific variation in cpDNA, 24 (65%) of which they considered to be probable instances of introgression. Rieseberg (1995) suspected that a review of the literature at that time would reveal over 100 cases of intraspecific variation in cpDNA that could be attributed to hybridization and introgression. That intraspecific variation in cpDNA is potentially indicative of hybridization is founded on the expectation that slowly evolving loci or genomes will produce greater molecular variation between than within species. In cases where a species is polymorphic for cpDNA and at least one of the molecular variants is diagnostic for a second species, interspecific hybridization is a plausible explanation. Incongruence between relationships suggested by cpDNA variation and those supported by other types of data (e.g., morphology or molecular data from an additional locus) provides additional support for introgression. One aspect of hybridization in both animals and plants that has become increasingly evident is incongruence in the phylogenetic and geographic distribution of cytoplasmic and nuclear markers. In most cases cytoplasmic introgression appears to be more pervasive than nuclear exchange. This discordance appears attributable to several factors including differences in the mutation rate, number of effective alleles, and modes

  8. The complete chloroplast genome sequence of strawberry (Fragaria × ananassa Duch.) and comparison with related species of Rosaceae.

    Science.gov (United States)

    Cheng, Hui; Li, Jinfeng; Zhang, Hong; Cai, Binhua; Gao, Zhihong; Qiao, Yushan; Mi, Lin

    2017-01-01

    Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F. × ananassa 'Benihoppe' using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp) separated by large (LSC, 85,531 bp) and small (SSC, 18,146 bp) single-copy (SC) regions. The length of the F. × ananassa 'Benihoppe' chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria, particularly among three octoploid strawberries which were F. × ananassa 'Benihoppe', F. chiloensis (GP33) and F. virginiana (O477). However, when the sequences of the coding and non-coding regions of F. × ananassa 'Benihoppe' were compared in detail with those of F. chiloensis (GP33) and F. virginiana (O477), a number of SNPs and InDels were revealed by MEGA 7. Six non-coding regions (trnK-matK, trnS-trnG, atpF-atpH, trnC-petN, trnT-psbD and trnP-psaJ) with a percentage of variable sites greater than

  9. The complete chloroplast genome sequence of strawberry (Fragaria × ananassa Duch.) and comparison with related species of Rosaceae

    Directory of Open Access Journals (Sweden)

    Hui Cheng

    2017-10-01

    Full Text Available Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F. × ananassa ‘Benihoppe’ using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp) separated by large (LSC, 85,531 bp) and small (SSC, 18,146 bp) single-copy (SC) regions. The length of the F. × ananassa ‘Benihoppe’ chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria, particularly among three octoploid strawberries which were F. × ananassa ‘Benihoppe’, F. chiloensis (GP33) and F. virginiana (O477). However, when the sequences of the coding and non-coding regions of F. × ananassa ‘Benihoppe’ were compared in detail with those of F. chiloensis (GP33) and F. virginiana (O477), a number of SNPs and InDels were revealed by MEGA 7. Six non-coding regions (trnK-matK, trnS-trnG, atpF-atpH, trnC-petN, trnT-psbD and trnP-psaJ) with a percentage of variable sites greater than 1

  10. Chloroplast Genome Sequence of pigeonpea (Cajanus cajan (L.) Millspaugh) and Cajanus scarabaeoides: Genome organization and Comparison with other legumes

    Directory of Open Access Journals (Sweden)

    Tanvi Kaila

    2016-12-01

    Full Text Available Pigeonpea (Cajanus cajan (L.) Millspaugh), a diploid (2n = 22) legume crop with a genome size of 852 Mbp, serves as an important source of human dietary protein especially in South East Asian and African regions. In this study, the draft chloroplast genomes of Cajanus cajan and Cajanus scarabaeoides were sequenced. Cajanus scarabaeoides is an important species of the Cajanus gene pool and has also been used for developing promising CMS system by different groups. A male sterile genotype harbouring the Cajanus scarabaeoides cytoplasm was used for sequencing the plastid genome. The cp genome of Cajanus cajan is 152,242 bp long, having a quadripartite structure with LSC of 83,455 bp and SSC of 17,871 bp separated by IRs of 25,398 bp. Similarly, the cp genome of Cajanus scarabaeoides is 152,201 bp long, having a quadripartite structure in which IRs of 25,402 bp length separates 83,423 bp of LSC and 17,854 bp of SSC. The pigeonpea cp genome contains 116 unique genes, including 30 tRNA, 4 rRNA, 78 predicted protein coding genes and 5 pseudogenes. A 50 kb inversion was observed in the LSC region of pigeonpea cp genome, consistent with other legumes. Comparison of cp genome with other legumes revealed the contraction of IR boundaries due to the absence of rps19 gene in the IR region. Chloroplast SSRs were mined and a total of 280 and 292 cpSSRs were identified in Cajanus scarabaeoides and Cajanus cajan respectively. RNA editing was observed at 37 sites in both Cajanus scarabaeoides and Cajanus cajan, with maximum occurrence in the ndh genes. The pigeonpea cp genome sequence would be beneficial in providing informative molecular markers which can be utilized for genetic diversity analysis and aid in understanding the plant systematics studies among major grain legumes.
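    Mining chloroplast SSRs, as mentioned in this record, amounts to scanning a sequence for short motifs repeated in tandem. The abstract does not state which tool or thresholds were used, so the sketch below is only an illustration: the minimum repeat counts (10 for mono-, 5 for di-, 4 for trinucleotide motifs) and the function name `find_ssrs` are assumptions.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS = ((1, 10), (2, 5), (3, 4))


def find_ssrs(seq, min_repeats=DEFAULT_MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats in `seq`."""
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in min_repeats:
        # Group 2 captures the motif; \2{n,} requires at least n further copies of it.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # A di- or trinucleotide motif made of one base is really a homopolymer run,
            # which the mononucleotide pass already reports.
            if unit_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(1)) // unit_len))
    return sorted(hits)


toy = "GG" + "A" * 12 + "C" + "AT" * 6 + "T" + "CAG" * 4 + "T"
for start, motif, copies in find_ssrs(toy):
    print(start, motif, copies)
```

    Counts such as the 280 and 292 cpSSRs reported above depend strongly on the thresholds chosen, so numbers from different studies are only comparable when the same criteria are applied.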

  11. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum).

    Directory of Open Access Journals (Sweden)

    Kwang-Soo Cho

    Full Text Available We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. The copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Coding sequences are highly conserved at the nucleotide and amino acid levels, with about 98% homology, and four genes (rpoC2, ycf3, accD, and clpP) have high synonymous (Ks) values. PCR-based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum.

  12. Population genetics, phylogenomics and hybrid speciation of Juglans in China determined from whole chloroplast genomes, transcriptomes, and genotyping-by-sequencing (GBS)

    Science.gov (United States)

    Peng Zhao; Hui-Juan Zhou; Daniel Potter; Yi-Heng Hu; Xiao-Jia Feng; Meng Dang; Li Feng; Saman Zulfiqar; Wen-Zhe Liu; Gui-Fang Zhao; Keith Woeste

    2018-01-01

    Genomic data are a powerful tool for elucidating the processes involved in the evolution and divergence of species. The speciation and phylogenetic relationships among Chinese Juglans remain unclear. Here, we used results from phylogenomic and population genetic analyses, transcriptomics, Genotyping-By-Sequencing (GBS), and whole chloroplast...

  13. Dog Y chromosomal DNA sequence: identification, sequencing and SNP discovery

    Directory of Open Access Journals (Sweden)

    Kirkness Ewen

    2006-10-01

    Full Text Available Abstract Background Population genetic studies of dogs have so far mainly been based on analysis of mitochondrial DNA, describing only the history of female dogs. To get a picture of the male history, as well as a second independent marker, there is a need for studies of biallelic Y-chromosome polymorphisms. However, there are no biallelic polymorphisms reported, and only 3200 bp of non-repetitive dog Y-chromosome sequence deposited in GenBank, necessitating the identification of dog Y chromosome sequence and the search for polymorphisms therein. The genome has been only partially sequenced for one male dog, disallowing mapping of the sequence into specific chromosomes. However, by comparing the male genome sequence to the complete female dog genome sequence, candidate Y-chromosome sequence may be identified by exclusion. Results The male dog genome sequence was analysed by Blast search against the human genome to identify sequences with a best match to the human Y chromosome and to the female dog genome to identify those absent in the female genome. Candidate sequences were then tested for male specificity by PCR of five male and five female dogs. 32 sequences from the male genome, with a total length of 24 kbp, were identified as male specific, based on a match to the human Y chromosome, absence in the female dog genome and male specific PCR results. 14437 bp were then sequenced for 10 male dogs originating from Europe, Southwest Asia, Siberia, East Asia, Africa and America. Nine haplotypes were found, which were defined by 14 substitutions. The genetic distance between the haplotypes indicates that they originate from at least five wolf haplotypes. There was no obvious trend in the geographic distribution of the haplotypes. Conclusion We have identified 24159 bp of dog Y-chromosome sequence to be used for population genetic studies. We sequenced 14437 bp in a worldwide collection of dogs, identifying 14 SNPs for future SNP analyses, and

  14. The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.

    Science.gov (United States)

    Khoe, Clairine V; Chung, Long H; Murray, Vincent

    2018-06-01

    The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
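
    The two consensus tetranucleotides reported above can be located in an arbitrary sequence with a short pattern search. The following is a minimal sketch, not the authors' analysis pipeline: Y is expanded to the character class [CT], the asterisk of the source notation is dropped for matching, and the names SIX_FOUR_PP, CPD and find_sites are hypothetical helpers introduced only for illustration.

```python
import re

# Y is the IUPAC code for a pyrimidine (C or T).
# A lookahead is used so that overlapping sites are all reported.
SIX_FOUR_PP = re.compile(r"(?=([CT]TC[CT]))")  # 5'-YTC*Y consensus (6-4PP sites)
CPD = re.compile(r"(?=([CT]TTC))")             # 5'-YTT*C consensus (CPD sites)


def find_sites(pattern, seq):
    """Return (0-based start, matched tetranucleotide) for every hit in seq."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    demo = "GGTTCCCATTTTCGA"  # made-up sequence, not taken from the study
    print(find_sites(SIX_FOUR_PP, demo))
    print(find_sites(CPD, demo))
```

    Such a scan only lists candidate sites that fit the consensus; it says nothing about the measured damage intensity at each site.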

  15. Identification of refugia and post-glacial colonisation routes of European white oaks based on chloroplast DNA and fossil pollen evidence

    NARCIS (Netherlands)

    Petit, R.J.; Brewer, S.; Bordács, S.; Burg, K.; Cheddadi, R.; Coart, E.; Cottrell, J.; Csaikl, U.M.; Dam, van B.C.; Deans, J.D.; Espinel, S.; Fineschi, S.; Finkeldey, R.; Glaz, I.; Goicoechea, P.G.; Jensen, J.S.; König, A.O.; Lowe, A.J.; Madsen, S.F.; Mátyás, G.; Munro, R.C.; Popescu, F.; Slade, D.; Tabbener, H.; Vries, de S.G.M.; Ziegenhagen, B.; Beaulieu, de J.L.; Kremer, A.

    2002-01-01

    The geographic distribution throughout Europe of each of 32 chloroplast DNA variants belonging to eight white oak species sampled from 2613 populations is presented. Clear-cut geographic patterns were revealed by the survey. These distributions, together with the available palynological information,

  16. DNA Barcoding: Amplification and sequence analysis of rbcl and matK genome regions in three divergent plant species

    Directory of Open Access Journals (Sweden)

    Javed Iqbal Wattoo

    2016-11-01

    Full Text Available Background: DNA barcoding is a novel method of species identification based on nucleotide diversity of conserved sequences. The establishment and refining of plant DNA barcoding systems is more challenging due to high genetic diversity among different species. Therefore, targeting the conserved nuclear transcribed regions would be more reliable for plant scientists to reveal genetic diversity, species discrimination and phylogeny. Methods: In this study, we amplified and sequenced the chloroplast DNA regions (matk+rbcl) of Solanum nigrum, Euphorbia helioscopia and Dalbergia sissoo to study the functional annotation, homology modeling and sequence analysis to allow a more efficient utilization of these sequences among different plant species. These three species represent three families: Solanaceae, Euphorbiaceae and Fabaceae, respectively. Biological sequence homology and divergence of amplified sequences were studied using the Basic Local Alignment Search Tool (BLAST). Results: Both primers (matk+rbcl) showed good amplification in the three species. The sequenced regions revealed conserved genome information for future identification of different medicinal plants belonging to these species. The amplified conserved barcodes revealed different levels of biological homology after sequence analysis. The results clearly showed that the use of these conserved DNA sequences as barcode primers would be an accurate way for species identification and discrimination. Conclusion: The amplification and sequencing of conserved genome regions identified a novel sequence of matK in native species of Solanum nigrum. The findings of the study would be applicable in the medicinal industry to establish DNA-based identification of different medicinal plant species to monitor adulteration.

  17. Special Issue: Next Generation DNA Sequencing

    Directory of Open Access Journals (Sweden)

    Paul Richardson

    2010-10-01

    Full Text Available Next Generation Sequencing (NGS) refers to technologies that do not rely on traditional dideoxy-nucleotide (Sanger) sequencing where labeled DNA fragments are physically resolved by electrophoresis. These new technologies rely on different strategies, but essentially all of them make use of real-time data collection of a base level incorporation event across a massive number of reactions (on the order of millions versus 96 for capillary electrophoresis, for instance). The major commercial NGS platforms available to researchers are the 454 Genome Sequencer (Roche), Illumina (formerly Solexa) Genome Analyzer, the SOLiD system (Applied Biosystems/Life Technologies) and the Heliscope (Helicos Corporation). The techniques and different strategies utilized by these platforms are reviewed in a number of the papers in this special issue. These technologies are enabling new applications that take advantage of the massive data produced by this next generation of sequencing instruments. [...

  18. Simulating efficiently the evolution of DNA sequences.

    Science.gov (United States)

    Schöniger, M; von Haeseler, A

    1995-02-01

    Two menu-driven FORTRAN programs are described that simulate the evolution of DNA sequences in accordance with a user-specified model. This general stochastic model allows for an arbitrary stationary nucleotide composition and any transition-transversion bias during the process of base substitution. In addition, the user may define any hypothetical model tree according to which a family of sequences evolves. The programs suggest the computationally most inexpensive approach to generate nucleotide substitutions. Either reproducible or non-repeatable simulations, depending on the method of initializing the pseudo-random number generator, can be performed. The corresponding options are offered by the interface menu.
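
    The ingredients named in this abstract (stationary nucleotide composition, transition-transversion bias, a user-defined tree, seeded or unseeded random numbers) can be illustrated with a small discrete-time Markov chain. This is a sketch under stated assumptions, not a port of the FORTRAN programs: the tree encoding (nested lists of (child, steps) pairs), the parameters rate and kappa, and all function names are hypothetical.

```python
import random

BASES = "ACGT"
PURINES = {"A", "G"}


def step_probs(base, rate, kappa, freqs):
    """One-step transition probabilities out of `base`.

    The probability of changing to `target` is
    rate * freqs[target] * (kappa for a transition, 1 for a transversion).
    These entries satisfy detailed balance, so `freqs` is the stationary
    composition of the chain.
    """
    probs = {}
    for target in BASES:
        if target != base:
            same_class = (base in PURINES) == (target in PURINES)
            probs[target] = rate * freqs[target] * (kappa if same_class else 1.0)
    probs[base] = 1.0 - sum(probs.values())
    if probs[base] < 0:
        raise ValueError("rate is too large for these parameters")
    return probs


def evolve_branch(seq, steps, rate, kappa, freqs, rng):
    """Apply `steps` rounds of substitution to every site of `seq`."""
    table = {b: step_probs(b, rate, kappa, freqs) for b in BASES}
    out = list(seq)
    for _ in range(steps):
        for i, b in enumerate(out):
            p = table[b]
            out[i] = rng.choices(list(p), weights=list(p.values()))[0]
    return "".join(out)


def simulate(tree, length, rate, kappa, freqs, seed=None):
    """Evolve a random root sequence down `tree`; return {leaf name: sequence}.

    A node is either a leaf name (str) or a list of (child, steps) pairs.
    Passing a seed gives a reproducible run; seed=None gives a fresh one.
    """
    rng = random.Random(seed)
    root = "".join(rng.choices(BASES, weights=[freqs[b] for b in BASES], k=length))

    def walk(node, seq):
        if isinstance(node, str):
            return {node: seq}
        leaves = {}
        for child, steps in node:
            leaves.update(walk(child, evolve_branch(seq, steps, rate, kappa, freqs, rng)))
        return leaves

    return walk(tree, root)


if __name__ == "__main__":
    tree = [("sp1", 5), ([("sp2", 3), ("sp3", 3)], 2)]
    freqs = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
    for name, seq in simulate(tree, 40, rate=0.02, kappa=4.0, freqs=freqs, seed=1).items():
        print(name, seq)
```

    Branch lengths are counted in whole steps here for simplicity; a continuous-time model would replace the per-step table with a matrix exponential.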

  19. Complete chloroplast genome of Gracilaria firma (Gracilariaceae, Rhodophyta), with discussion on the use of chloroplast phylogenomics in the subclass Rhodymeniophycidae.

    Science.gov (United States)

    Ng, Poh-Kheng; Lin, Showe-Mei; Lim, Phaik-Eem; Liu, Li-Chia; Chen, Chien-Ming; Pai, Tun-Wen

    2017-01-06

    The chloroplast genome of Gracilaria firma was sequenced in view of its role as an economically important marine crop with wide industrial applications. To date, there are only 15 chloroplast genomes published for the Florideophyceae. Apart from presenting the complete chloroplast genome of G. firma, this study also assessed the utility of genome-scale data to address the phylogenetic relationships within the subclass Rhodymeniophycidae. The synteny and genome structure of the chloroplast genomes across the taxa of Eurhodophytina was also examined. The chloroplast genome of Gracilaria firma maps as a circular molecule of 187,001 bp and contains 252 genes, which are distributed on both strands and consist of 35 RNA genes (3 rRNAs, 30 tRNAs, tmRNA and a ribonuclease P RNA component) and 217 protein-coding genes, including the unidentified open reading frames. The chloroplast genome of G. firma is by far the largest reported for Gracilariaceae, featuring a unique intergenic region of about 7000 bp with discontinuous vestiges of red algal plasmid DNA sequences interspersed between the nblA and cpeB genes. This chloroplast genome shows similar gene content and order to other Florideophycean taxa. Phylogenomic analyses based on the concatenated amino acid sequences of 146 protein-coding genes confirmed the monophyly of the classes Bangiophyceae and Florideophyceae with full nodal support. Relationships within the subclass Rhodymeniophycidae in Florideophyceae received moderate to strong nodal support, and the monotypic family of Gracilariales were resolved with maximum support. Chloroplast genomes hold substantial information that can be tapped for resolving the phylogenetic relationships of difficult regions in the Rhodymeniophycidae, which are perceived to have experienced rapid radiation and thus received low nodal support, as exemplified in this study. The present study shows that chloroplast genome of G. firma could serve as a key link to the full resolution of

  20. Genomic signal processing for DNA sequence clustering.

    Science.gov (United States)

    Mendizabal-Ruiz, Gerardo; Román-Godínez, Israel; Torres-Ramos, Sulema; Salido-Ruiz, Ricardo A; Vélez-Pérez, Hugo; Morales, J Alejandro

    2018-01-01

    Genomic signal processing (GSP) methods which convert DNA data to numerical values have recently been proposed, which would offer the opportunity of employing existing digital signal processing methods for genomic data. One of the most used methods for exploring data is cluster analysis which refers to the unsupervised classification of patterns in data. In this paper, we propose a novel approach for performing cluster analysis of DNA sequences that is based on the use of GSP methods and the K-means algorithm. We also propose a visualization method that facilitates the easy inspection and analysis of the results and possible hidden behaviors. Our results support the feasibility of employing the proposed method to find and easily visualize interesting features of sets of DNA data.

  1. Google matrix analysis of DNA sequences.

    Science.gov (United States)

    Kandiah, Vivek; Shepelyansky, Dima L

    2013-01-01

    For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.
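
    A Google matrix of word-to-word transitions and its PageRank vector can be sketched in a few lines. The code below is an illustration under assumptions that the abstract does not confirm: words are taken as adjacent non-overlapping blocks of m letters, the damping factor alpha = 0.85 is the conventional web value, and the function names are hypothetical.

```python
from itertools import product


def google_matrix(seq, m=2, alpha=0.85):
    """Build a column-stochastic Google matrix from word transitions in seq.

    G[i][j] is the probability of moving from word j to word i.
    Columns with no observed transitions are replaced by a uniform column.
    """
    words = ["".join(p) for p in product("ACGT", repeat=m)]
    index = {w: i for i, w in enumerate(words)}
    n = len(words)
    counts = [[0.0] * n for _ in range(n)]
    chunks = [seq[i:i + m] for i in range(0, len(seq) - m + 1, m)]
    for a, b in zip(chunks, chunks[1:]):
        if a in index and b in index:  # skip words with ambiguous letters
            counts[index[b]][index[a]] += 1.0
    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        col_sum = sum(counts[i][j] for i in range(n))
        for i in range(n):
            s = counts[i][j] / col_sum if col_sum else 1.0 / n
            G[i][j] = alpha * s + (1.0 - alpha) / n
    return words, G


def pagerank(G, iters=200):
    """Power iteration for the leading eigenvector of G (sums to 1)."""
    n = len(G)
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [sum(G[i][j] * p[j] for j in range(n)) for i in range(n)]
    return p


if __name__ == "__main__":
    demo = "ACGTACGTTTGACA" * 20  # made-up sequence
    words, G = google_matrix(demo)
    ranked = sorted(zip(pagerank(G), words), reverse=True)
    print(ranked[:3])
```

    With m = 2 the matrix is only 16 x 16; longer words grow the matrix as 4^m on a side, which is where sparse storage would become necessary.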

  2. Google matrix analysis of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Vivek Kandiah

    Full Text Available For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.

  3. A plastome mutation affects processing of both chloroplast and nuclear DNA-encoded plastid proteins.

    Science.gov (United States)

    Johnson, E M; Schnabelrauch, L S; Sears, B B

    1991-01-01

    Immunoblotting of a chloroplast mutant (pm7) of Oenothera showed that three proteins, cytochrome f and the 23 kDa and 16 kDa subunits of the oxygen-evolving subcomplex of photosystem II, were larger than the corresponding mature proteins of the wild type and, thus, appear to be improperly processed in pm7. The mutant is also chlorotic and has little or no internal membrane development in the plastids. The improperly processed proteins, and other proteins that are completely missing, represent products of both the plastid and nuclear genomes. To test for linkage of these defects, a green revertant of pm7 was isolated from cultures in which the mutant plastids were maintained in a nuclear background homozygous for the plastome mutator (pm) gene. In this revertant, all proteins analyzed co-reverted to the wild-type condition, indicating that a single mutation in a plastome gene is responsible for the complex phenotype of pm7. These results suggest that the defect in pm7 lies in a gene that affects a processing protease encoded in the chloroplast genome.

  4. Aspects of coverage in medical DNA sequencing

    Directory of Open Access Journals (Sweden)

    Wilson Richard K

    2008-05-01

    Full Text Available Abstract Background DNA sequencing is now emerging as an important component in biomedical studies of diseases like cancer. Short-read, highly parallel sequencing instruments are expected to be used heavily for such projects, but many design specifications have yet to be conclusively established. Perhaps the most fundamental of these is the redundancy required to detect sequence variations, which bears directly upon genomic coverage and the consequent resolving power for discerning somatic mutations. Results We address the medical sequencing coverage problem via an extension of the standard mathematical theory of haploid coverage. The expected diploid multi-fold coverage, as well as its generalization for aneuploidy are derived and these expressions can be readily evaluated for any project. The resulting theory is used as a scaling law to calibrate performance to that of standard BAC sequencing at 8× to 10× redundancy, i.e. for expected coverages that exceed 99% of the unique sequence. A differential strategy is formalized for tumor/normal studies wherein tumor samples are sequenced more deeply than normal ones. In particular, both tumor alleles should be detected at least twice, while both normal alleles are detected at least once. Our theory predicts these requirements can be met for tumor and normal redundancies of approximately 26× and 21×, respectively. We explain why these values do not differ by a factor of 2, as might intuitively be expected. Future technology developments should prompt even deeper sequencing of tumors, but the 21× value for normal samples is essentially a constant. Conclusion Given the assumptions of standard coverage theory, our model gives pragmatic estimates for required redundancy. The differential strategy should be an efficient means of identifying potential somatic mutations for further study.
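
    The redundancy figures quoted above can be sanity-checked with a simple Poisson approximation. This is a back-of-the-envelope sketch, not the derivation in the paper: it assumes the number of reads covering a position is Poisson-distributed and that reads split evenly and independently between the two alleles, and the function names are hypothetical.

```python
import math


def poisson_at_least(k, mean):
    """P(X >= k) for X ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


def haploid_coverage(redundancy):
    """Expected fraction of a haploid target covered at least once."""
    return poisson_at_least(1, redundancy)


def both_alleles_detected(redundancy, min_reads):
    """Probability that each of two alleles is seen at least min_reads times.

    Each allele is assumed to receive half of the total redundancy.
    """
    per_allele = poisson_at_least(min_reads, redundancy / 2.0)
    return per_allele ** 2


if __name__ == "__main__":
    print("haploid, 8x:", haploid_coverage(8))
    print("tumor, 26x, each allele >= 2 reads:", both_alleles_detected(26, 2))
    print("normal, 21x, each allele >= 1 read:", both_alleles_detected(21, 1))
```

    Under these assumptions the tumor and normal settings give nearly the same detection probability, which is one way to see why the two redundancies do not differ by a factor of 2: requiring a second read per allele adds only a few units of depth.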

  5. cDNA sequence quality data - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available Budding yeast cDNA sequencing project. Data name: cDNA sequence quality data. DOI: 10.18908/lsdba.nbdc00838-003. Description of data contents: Phred's quality score. P...

  6. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    Energy Technology Data Exchange (ETDEWEB)

    Winston Chen, C.H.; Taranenko, N.I.; Zhu, Y.F.; Chung, C.N.; Allman, S.L.

    1997-03-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, the authors recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. The preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, the authors applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  7. Method for priming and DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Mugasimangalam, R.C.; Ulanovsky, L.E.

    1997-12-01

    A method is presented for improving the priming specificity of an oligonucleotide primer that is non-unique in a nucleic acid template which includes selecting a continuous stretch of several nucleotides in the template DNA where one of the four bases does not occur in the stretch. This also includes bringing the template DNA in contact with a non-unique primer partially or fully complementary to the sequence immediately upstream of the selected sequence stretch. This results in polymerase-mediated differential extension of the primer in the presence of a subset of deoxyribonucleotide triphosphates that does not contain the base complementary to the base absent in the selected sequence stretch. These reactions occur at a temperature sufficiently low for allowing the extension of the non-unique primer. The method causes polymerase-mediated extension reactions in the presence of all four natural deoxyribonucleotide triphosphates or modifications. At this high temperature discrimination occurs against priming sites of the non-unique primer where the differential extension has not made the primer sufficiently stable to prime. However, the primer extended at the selected stretch is sufficiently stable to prime.

  8. Phylogenomic relationship of feijoa (Acca sellowiana (O.Berg) Burret) with other Myrtaceae based on complete chloroplast genome sequences.

    Science.gov (United States)

    Machado, Lilian de Oliveira; Vieira, Leila do Nascimento; Stefenon, Valdir Marcos; Oliveira Pedrosa, Fábio de; Souza, Emanuel Maltempi de; Guerra, Miguel Pedro; Nodari, Rubens Onofre

    2017-04-01

    Given their distribution, importance, and richness, Myrtaceae species comprise a model system for studying the evolution of tropical plant diversity. In addition, chloroplast (cp) genome sequencing is an efficient tool for phylogenetic relationship studies. Feijoa [Acca sellowiana (O. Berg) Burret; CN: pineapple-guava] is a Myrtaceae species that occurs naturally in southern Brazil and northern Uruguay. Feijoa is known for its exquisite perfume and flavorful fruits, pharmacological properties, ornamental value and increasing economic relevance. In the present work, we reported the complete cp genome of feijoa. The feijoa cp genome is a circular molecule of 159,370 bp with a quadripartite structure containing two single copy regions, a Large Single Copy region (LSC 88,028 bp) and a Small Single Copy region (SSC 18,598 bp) separated by Inverted Repeat regions (IRs 26,372 bp). The genome structure, gene order, GC content and codon usage are similar to those of typical angiosperm cp genomes. When compared to other cp genome sequences of Myrtaceae, feijoa showed closest relationship with pitanga (Eugenia uniflora L.). Furthermore, a comparison of pitanga synonymous (Ks) and nonsynonymous (Ka) substitution rates revealed extremely low values. Maximum Likelihood and Bayesian Inference analyses produced phylogenomic trees identical in topology. These trees supported monophyly of three Myrtoideae clades.
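
    The quadripartite structure described above implies a simple length identity: the total is the LSC plus the SSC plus two copies of the inverted repeat. The short check below uses the feijoa figures from the abstract and reads the quoted IR value as the length of each copy, which is an interpretation; the function name is hypothetical.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total plastome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


if __name__ == "__main__":
    feijoa = quadripartite_length(lsc=88_028, ssc=18_598, ir=26_372)
    print(feijoa)  # 159370, the genome size reported in the abstract
```

    The agreement with the reported 159,370 bp supports reading the IR figure as a per-copy length.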

  9. Phylogenetic Relationships of the Fern Cyrtomium falcatum (Dryopteridaceae) from Dokdo Island Based on Chloroplast Genome Sequencing

    Directory of Open Access Journals (Sweden)

    Gurusamy Raman

    2016-12-01

    Full Text Available Cyrtomium falcatum is a popular ornamental fern cultivated worldwide. Native to the Korean Peninsula, Japan, and Dokdo Island in the Sea of Japan, it is the only fern present on Dokdo Island. We isolated and characterized the chloroplast (cp) genome of C. falcatum, and compared it with those of closely related species. The genes trnV-GAC and trnV-GAU were found to be present within the cp genome of C. falcatum, whereas trnP-GGG and rpl21 were lacking. Moreover, cp genomes of Cyrtomium devexiscapulae and Adiantum capillus-veneris lack trnP-GGG and rpl21, suggesting these are not conserved among angiosperm cp genomes. The deletion of trnR-UCG, trnR-CCG, and trnSeC in the cp genomes of C. falcatum and other eupolypod ferns indicates these genes are restricted to tree ferns, non-core leptosporangiates, and basal ferns. The C. falcatum cp genome also encoded ndhF and rps7, with GUG start codons that were only conserved in polypod ferns, and it shares two significant inversions with other ferns, including a minor inversion of the trnD-GUC region and an approximate 3 kb inversion of the trnG-trnT region. Phylogenetic analyses showed that Equisetum is a sister clade to Psilotales-Ophioglossales with a 100% bootstrap (BS) value. The sister relationship between Pteridaceae and eupolypods was also strongly supported by a 100% BS, but Bayesian molecular clock analyses suggested that C. falcatum diversified in the mid-Paleogene period (45.15 ± 4.93 million years ago) and might have moved from Eurasia to Dokdo Island.

  10. A database of PCR primers for the chloroplast genomes of higher plants

    Science.gov (United States)

    Heinze, Berthold

    2007-01-01

    Background Chloroplast genomes evolve slowly and many primers for PCR amplification and analysis of chloroplast sequences can be used across a wide array of genera. In some cases 'universal' primers have been designed for the purpose of working across species boundaries. However, the essential information on these primer sequences is scattered throughout the literature. Results A database is presented here which assembles published primer information for chloroplast DNA. Additional primers were designed to fill gaps where little or no primer information could be found. Amplicons are either the genes themselves (typically useful in studies of sequence variation in higher-order phylogeny) or they are spacers, introns, and intergenic regions (for studies of phylogeographic patterns within and among species). The current list of 'generic' primers consists of more than 700 sequences. Wherever possible, we give the locations of the primers in the thirteen fully sequenced chloroplast genomes (Nicotiana tabacum, Atropa belladonna, Spinacia oleracea, Arabidopsis thaliana, Populus trichocarpa, Oryza sativa, Pinus thunbergii, Marchantia polymorpha, Zea mays, Oenothera elata, Acorus calamus, Eucalyptus globulus, Medicago truncatula). Conclusion The database described here is designed to serve as a resource for researchers who are venturing into the study of poorly described chloroplast genomes, whether for large- or small-scale DNA sequencing projects, to study molecular variation or to investigate chloroplast evolution. PMID:17326828

  11. A database of PCR primers for the chloroplast genomes of higher plants

    Directory of Open Access Journals (Sweden)

    Heinze Berthold

    2007-02-01

    Full Text Available Abstract Background Chloroplast genomes evolve slowly and many primers for PCR amplification and analysis of chloroplast sequences can be used across a wide array of genera. In some cases 'universal' primers have been designed for the purpose of working across species boundaries. However, the essential information on these primer sequences is scattered throughout the literature. Results A database is presented here which assembles published primer information for chloroplast DNA. Additional primers were designed to fill gaps where little or no primer information could be found. Amplicons are either the genes themselves (typically useful in studies of sequence variation in higher-order phylogeny) or they are spacers, introns, and intergenic regions (for studies of phylogeographic patterns within and among species). The current list of 'generic' primers consists of more than 700 sequences. Wherever possible, we give the locations of the primers in the thirteen fully sequenced chloroplast genomes (Nicotiana tabacum, Atropa belladonna, Spinacia oleracea, Arabidopsis thaliana, Populus trichocarpa, Oryza sativa, Pinus thunbergii, Marchantia polymorpha, Zea mays, Oenothera elata, Acorus calamus, Eucalyptus globulus, Medicago truncatula). Conclusion The database described here is designed to serve as a resource for researchers who are venturing into the study of poorly described chloroplast genomes, whether for large- or small-scale DNA sequencing projects, to study molecular variation or to investigate chloroplast evolution.

  12. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    Science.gov (United States)

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  13. Poincaré recurrences of DNA sequences

    Science.gov (United States)

    Frahm, K. M.; Shepelyansky, D. L.

    2012-01-01

    We analyze the statistical properties of Poincaré recurrences of Homo sapiens, mammalian, and other DNA sequences taken from the Ensembl Genome data base with up to 15 billion base pairs. We show that the probability of Poincaré recurrences decays in an algebraic way with the Poincaré exponent β≈4 even if the oscillatory dependence is well pronounced. The correlations between recurrences decay with an exponent ν≈0.6 that leads to an anomalous superdiffusive walk. However, for Homo sapiens sequences, with the largest available statistics, the diffusion coefficient converges to a finite value on distances larger than one million base pairs. We argue that the approach based on Poincaré recurrences determines new proximity features between different species and sheds new light on their evolutionary history.

  14. Molecular phylogeny and systematics of the banana family (Musaceae) inferred from multiple nuclear and chloroplast DNA fragments, with a special reference to the genus Musa.

    Science.gov (United States)

    Li, Lin-Feng; Häkkinen, Markku; Yuan, Yong-Ming; Hao, Gang; Ge, Xue-Jun

    2010-10-01

    Musaceae is a small paleotropical family. Three genera have been recognised within this family although the generic delimitations remain controversial. Most species of the family (around 65 species) have been placed under the genus Musa and its infrageneric classification has long been disputed. In this study, we obtained nuclear ribosomal ITS and chloroplast (atpB-rbcL, rps16, and trnL-F) DNA sequences of 36 species (42 accessions of ingroups representing three genera) together with 10 accessions of ingroups retrieved from GenBank database and 4 accessions of outgroups, to construct the phylogeny of the family, with a special reference to the infrageneric classification of the genus Musa. Our phylogenetic analyses elaborated previous results in supporting the monophyly of the family and suggested that Musella and Ensete may be congeneric or at least closely related, but refuted the previous infrageneric classification of Musa. None of the five sections of Musa previously defined based on morphology was recovered as monophyletic group in the molecular phylogeny. Two infrageneric clades were identified, which corresponded well to the basic chromosome numbers of x=11 and 10/9/7, respectively: the former clade comprises species from the sections Musa and Rhodochlamys while the latter contains sections of Callimusa, Australimusa, and Ingentimusa. Copyright 2010 Elsevier Inc. All rights reserved.

  15. Image correlation method for DNA sequence alignment.

    Science.gov (United States)

    Curilem Saldías, Millaray; Villarroel Sassarini, Felipe; Muñoz Poblete, Carlos; Vargas Vásquez, Asticio; Maureira Butler, Iván

    2012-01-01

    The complexity of searches and the volume of genomic data make sequence alignment one of bioinformatics' most active research areas. New alignment approaches have incorporated digital signal processing techniques. Among these, correlation methods are highly sensitive. This paper proposes a novel sequence alignment method based on 2-dimensional images, where each nucleic acid base is represented as a fixed gray intensity pixel. Query and known database sequences are coded to their pixel representation and sequence alignment is handled as an object-recognition-in-a-scene problem. Query and database become object and scene, respectively. An image correlation process is carried out in order to search for the best match between them. Given that this procedure can be implemented in an optical correlator, the correlation could eventually be accomplished at light speed. This paper shows an initial research stage where results were "digitally" obtained by simulating an optical correlation of DNA sequences represented as images. A total of 303 queries (variable lengths from 50 to 4500 base pairs) and 100 scenes represented by 100 x 100 images each (in total, one million base pair database) were considered for the image correlation analysis. The results showed that correlations reached very high sensitivity (99.01%) and specificity (98.99%) and outperformed BLAST when mutation numbers increased. However, digital correlation processes were a hundred times slower than BLAST. We are currently starting an initiative to evaluate the correlation speed process of a real experimental optical correlator. By doing this, we expect to fully exploit optical correlation light properties. As the optical correlator works jointly with the computer, digital algorithms should also be optimized. The results presented in this paper are encouraging and support the study of image correlation methods on sequence alignment.
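The pixel-coding idea in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' 2-D or optical implementation: the gray levels in `GRAY`, the function names, and the use of a Pearson-style normalized correlation as the matching score are all assumptions made for illustration.

```python
import math

# Hypothetical gray levels; the abstract only says each base gets a fixed intensity.
GRAY = {"A": 0, "C": 85, "G": 170, "T": 255}


def encode(seq):
    """Map a DNA string to a list of gray intensities."""
    return [GRAY[base] for base in seq.upper()]


def normalized_correlation(x, y):
    """Pearson correlation of two equal-length intensity lists (0.0 if either is flat)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_match(query, scene):
    """Slide the query (object) over the scene; return (offset, score) of the first best window."""
    q = encode(query)
    s = encode(scene)
    best_offset, best_score = -1, -2.0
    for offset in range(len(s) - len(q) + 1):
        score = normalized_correlation(q, s[offset:offset + len(q)])
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score
```

Calling `best_match("AAGGCT", "ACGTTGCAAGGCTTAC")` locates the window where the query occurs exactly. A mutated query would still score highest near its true position as long as most intensities agree, which is the intuition behind the tolerance to mutations reported above.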

  16. The complete chloroplast genome of banana (Musa acuminata, Zingiberales): insight into plastid monocotyledon evolution.

    Science.gov (United States)

    Martin, Guillaume; Baurens, Franc-Christophe; Cardi, Céline; Aury, Jean-Marc; D'Hont, Angélique

    2013-01-01

    Banana (genus Musa) is a crop of major economic importance worldwide. It is a monocotyledonous member of the Zingiberales, a sister group of the widely studied Poales. Most cultivated bananas are natural Musa inter-(sub-)specific triploid hybrids. A Musa acuminata reference nuclear genome sequence was recently produced based on sequencing of genomic DNA enriched in nucleus. The Musa acuminata chloroplast genome was assembled with chloroplast reads extracted from whole-genome-shotgun sequence data. The Musa chloroplast genome is a circular molecule of 169,972 bp with a quadripartite structure containing two single copy regions, a Large Single Copy region (LSC, 88,338 bp) and a Small Single Copy region (SSC, 10,768 bp) separated by Inverted Repeat regions (IRs, 35,433 bp). Two forms of the chloroplast genome relative to the orientation of SSC versus LSC were found. The Musa chloroplast genome shows an extreme IR expansion at the IR/SSC boundary relative to the most common structures found in angiosperms. This expansion consists of the integration of three additional complete genes (rps15, ndhH and ycf1) and part of the ndhA gene. No such expansion has been observed in monocots so far. Simple Sequence Repeats were identified in the Musa chloroplast genome and a new set of Musa chloroplastic markers was designed. The complete sequence of M. acuminata ssp malaccensis chloroplast we reported here is the first one for the Zingiberales order. As such it provides new insight in the evolution of the chloroplast of monocotyledons. In particular, it reinforces that IR/SSC expansion has occurred independently several times within monocotyledons. The discovery of new polymorphic markers within Musa chloroplast opens new perspectives to better understand the origin of cultivated triploid bananas.
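As a quick consistency check on the quadripartite structure described above, the total length should equal one LSC plus one SSC plus two copies of the IR. The helper name below is an assumption for illustration; the region sizes are the ones given in the abstract.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total plastome length: one LSC, one SSC, and two copies of the IR."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported in the abstract for Musa acuminata.
total = quadripartite_length(lsc=88_338, ssc=10_768, ir=35_433)
print(total)  # 169972
```

The result agrees with the reported genome size of 169,972 bp, which confirms that the 35,433 bp figure refers to a single IR copy.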

  17. The complete chloroplast genome of banana (Musa acuminata, Zingiberales): insight into plastid monocotyledon evolution.

    Directory of Open Access Journals (Sweden)

    Guillaume Martin

    Full Text Available Banana (genus Musa) is a crop of major economic importance worldwide. It is a monocotyledonous member of the Zingiberales, a sister group of the widely studied Poales. Most cultivated bananas are natural Musa inter-(sub-)specific triploid hybrids. A Musa acuminata reference nuclear genome sequence was recently produced based on sequencing of genomic DNA enriched in nucleus. The Musa acuminata chloroplast genome was assembled with chloroplast reads extracted from whole-genome-shotgun sequence data. The Musa chloroplast genome is a circular molecule of 169,972 bp with a quadripartite structure containing two single copy regions, a Large Single Copy region (LSC, 88,338 bp) and a Small Single Copy region (SSC, 10,768 bp) separated by Inverted Repeat regions (IRs, 35,433 bp). Two forms of the chloroplast genome relative to the orientation of SSC versus LSC were found. The Musa chloroplast genome shows an extreme IR expansion at the IR/SSC boundary relative to the most common structures found in angiosperms. This expansion consists of the integration of three additional complete genes (rps15, ndhH and ycf1) and part of the ndhA gene. No such expansion has been observed in monocots so far. Simple Sequence Repeats were identified in the Musa chloroplast genome and a new set of Musa chloroplastic markers was designed. The complete sequence of M. acuminata ssp malaccensis chloroplast we reported here is the first one for the Zingiberales order. As such it provides new insight in the evolution of the chloroplast of monocotyledons. In particular, it reinforces that IR/SSC expansion has occurred independently several times within monocotyledons. The discovery of new polymorphic markers within Musa chloroplast opens new perspectives to better understand the origin of cultivated triploid bananas.

  18. Is Didymosphenia geminata an introduced species in New Zealand? Evidence from trends in water chemistry, and chloroplast DNA.

    Science.gov (United States)

    Kilroy, Cathy; Novis, Phil

    2018-01-01

    Defining the geographic origins of free-living aquatic microorganisms can be problematic because many such organisms have ubiquitous distributions, and proving absence from a region is practically impossible. Geographic origins become important if microorganisms have invasive characteristics. The freshwater diatom Didymosphenia geminata is a potentially ubiquitous microorganism for which the recent global expansion of nuisance proliferations has been attributed to environmental change. The changes may include declines in dissolved reactive phosphorus (DRP) to low levels (e.g., 10 mg/m 3 because both these nutrient conditions are associated with nuisance proliferations of D. geminata. Proliferations of D. geminata have been observed in South Island, New Zealand, since 2004. We aimed to address the ubiquity hypothesis for D. geminata in New Zealand using historical river water nutrient data and new molecular analyses. We used 15 years of data at 77 river sites to assess whether trends in DRP or DIN prior to the spread of D. geminata were consistent with a transition from a rare, undetected species to a nuisance species. We used new sequences of chloroplast regions to examine the genetic similarity of D. geminata populations from New Zealand and six overseas locations. We found no evidence for declines in DRP concentrations since 1989 that could explain the spread of proliferations since 2004. At some affected sites, lowest DRP occurred before 2004. Trends in DIN also did not indicate enhanced suitability for D. geminata. Lack of diversity in the chloroplast intergenic regions of New Zealand populations and populations from western North America is consistent with recent dispersal to New Zealand. Our analyses did not support the proposal that D. geminata was historically present in New Zealand rivers. These results provide further evidence countering proposals of general ubiquity in freshwater diatoms and indicate that, as assumed in 2004, D. geminata is a

  19. Light-dependent, plastome-wide association of the plastid-encoded RNA polymerase with chloroplast DNA.

    Science.gov (United States)

    Finster, Sabrina; Eggert, Erik; Zoschke, Reimo; Weihe, Andreas; Schmitz-Linneweber, Christian

    2013-12-01

    Plastid genes are transcribed by two types of RNA polymerases: a plastid-encoded eubacterial-type RNA polymerase (PEP) and nuclear-encoded phage-type RNA polymerases (NEPs). To investigate the spatio-temporal expression of PEP, we tagged its α-subunit with a hemagglutinin epitope (HA). Transplastomic tobacco plants were generated and analyzed for the distribution of the tagged polymerase in plastid sub-fractions, and associated genes were identified under various light conditions. RpoA:HA was detected as early as the 3rd day after imbibition, and was constitutively expressed in green tissue over 60 days of plant development. We found that the tagged polymerase subunit preferentially associated with the plastid membranes, and was less abundant in the soluble stroma fraction. Attachment of RpoA:HA to the membrane fraction during early seedling development was independent of DNA, but at later stages of development, DNA appears to facilitate attachment of the polymerase to membranes. To survey PEP-dependent transcription units, we probed for nucleic acids enriched in RpoA:HA precipitates using a tobacco chloroplast whole-genome tiling array. The most strongly co-enriched DNA fragments represent photosynthesis genes (e.g. psbA, psbC, psbD and rbcL), whose expression is known to be driven by PEP promoters, while NEP-dependent genes were less abundant in RpoA:HA precipitates. Additionally, we demonstrate that the association of PEP with photosynthesis-related genes was reduced during the dark period, indicating that plastome-wide PEP-DNA association is a light-dependent process. © 2013 The Authors The Plant Journal © 2013 John Wiley & Sons Ltd.

  20. Relationship between mRNA secondary structure and sequence variability in Chloroplast genes: possible life history implications.

    Science.gov (United States)

    Krishnan, Neeraja M; Seligmann, Hervé; Rao, Basuthkar J

    2008-01-28

    Synonymous sites are freer to vary because of redundancy in the genetic code. Messenger RNA secondary structure restricts this freedom, as revealed by previous findings in mitochondrial genes that mutations at third codon position nucleotides in helices are more selected against than those in loops. This motivated us to explore the constraints imposed by mRNA secondary structure on evolutionary variability at all codon positions in general, in chloroplast systems. We found that the evolutionary variability and intrinsic secondary structure stability of these sequences share an inverse relationship. Simulations of most likely single nucleotide evolution in Psilotum nudum and Nephroselmis olivacea mRNAs indicate that helix-forming propensities of mutated mRNAs are greater than those of the natural mRNAs for short sequences and vice-versa for long sequences. Moreover, helix-forming propensity estimated by the percentage of total mRNA in helices increases gradually with mRNA length, saturating beyond 1000 nucleotides. Protection levels of functionally important sites vary across plants and proteins: r-strategists minimize mutation costs in large genes; K-strategists do the opposite. mRNA length presumably predisposes shorter mRNAs to evolve under different constraints than longer mRNAs. The positive correlation between secondary structure protection and functional importance of sites suggests that some sites might be conserved due to packing-protection constraints at the nucleic acid level in addition to protein level constraints. Consequently, nucleic acid secondary structure a priori biases mutations. The converse (exposure of conserved sites) apparently occurs in a smaller number of cases, indicating a different evolutionary adaptive strategy in these plants. The differences between the protection levels of functionally important sites for r- and K-strategists reflect their respective molecular adaptive strategies. These converge with increasing domestication levels of

  1. The complete chloroplast genome sequence of Mahonia bealei (Berberidaceae) reveals a significant expansion of the inverted repeat and phylogenetic relationship with other angiosperms.

    Science.gov (United States)

    Ma, Ji; Yang, Bingxian; Zhu, Wei; Sun, Lianli; Tian, Jingkui; Wang, Xumin

    2013-10-10

    Mahonia bealei (Berberidaceae) is a frequently-used traditional Chinese medicinal plant with efficient anti-inflammatory ability. This plant is one of the sources of berberine, a new cholesterol-lowering drug with anti-diabetic activity. We have sequenced the complete nucleotide sequence of the chloroplast (cp) genome of M. bealei. The complete cp genome of M. bealei is 164,792 bp in length, and has a typical structure with large (LSC 73,052 bp) and small (SSC 18,591 bp) single-copy regions separated by a pair of inverted repeats (IRs 36,501 bp) of large size. The Mahonia cp genome contains 111 unique genes and 39 genes are duplicated in the IR regions. The gene order and content of M. bealei are almost unrearranged, which is consistent with the hypothesis that large IRs stabilize the cp genome and reduce gene loss-and-gain probabilities during the evolutionary process. A large IR expansion of over 12 kb has occurred in M. bealei; 15 genes (rps19, rpl22, rps3, rpl16, rpl14, rps8, infA, rpl36, rps11, petD, petB, psbH, psbN, psbT and psbB) have expanded to have an additional copy in the IRs. The IR expansion rearrangement occurred via a double-strand DNA break and subsequent repair, which is different from the ordinary gene conversion mechanism. Repeat analysis identified 39 direct/inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Analysis also revealed 75 simple sequence repeat (SSR) loci and almost all are composed of A or T, contributing to a distinct bias in base composition. Comparison of protein-coding sequences with ESTs reveals 9 putative RNA edits and 5 of them resulted in non-synonymous modifications in rpoC1, rps2, rps19 and ycf1. Phylogenetic analysis using maximum parsimony (MP) and maximum likelihood (ML) was performed on a dataset composed of 65 protein-coding genes from 25 taxa, which yields a tree topology identical to that of previous plastid-based trees, and provides strong support for the sister relationship between Ranunculaceae and Berberidaceae
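The SSR survey mentioned in this abstract can be approximated, for the mononucleotide case only, with a short regular-expression scan. This is a sketch under stated assumptions: the abstract does not give the detection thresholds or the software used, so the `min_repeats=10` default and both function names are hypothetical.

```python
import re


def find_mono_ssrs(seq, min_repeats=10):
    """Return (start, base, length) for each single-base run of at least min_repeats."""
    # One captured base followed by at least (min_repeats - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [(m.start(), m.group(1), len(m.group(0))) for m in pattern.finditer(seq.upper())]


def at_fraction(ssrs):
    """Fraction of the detected runs that consist of A or T."""
    if not ssrs:
        return 0.0
    return sum(1 for _, base, _ in ssrs if base in "AT") / len(ssrs)
```

Applied to a plastome sequence, `at_fraction(find_mono_ssrs(seq))` gives a rough measure of the A/T bias among mononucleotide repeats. Di- and trinucleotide motifs would need additional patterns.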

  2. The chloroplast and mitochondrial DNA type are correlated with the nuclear composition of somatic hybrid calli of Solanum tuberosum and Nicotiana plumbaginifolia.

    Science.gov (United States)

    Wolters, A M; Koornneef, M; Gilissen, L J

    1993-09-01

    This paper describes the analysis of chloroplast (cp) DNA and mitochondrial (mt) DNA in 21 somatic hybrid calli of Solanum tuberosum and Nicotiana plumbaginifolia by means of Southern-blot hybridization. Each of these calli contained only one type of cpDNA; 14 had the N. plumbaginifolia (Np) type and seven the S. tuberosum (St) type. N. plumbaginifolia cpDNA was present in hybrids previously shown to contain predominantly N. plumbaginifolia chromosomes whereas hybrids in which S. tuberosum chromosomes predominated possessed cpDNA from potato. We have analyzed the mtDNA of these 21 somatic hybrid calli using four restriction enzyme/probe combinations. Most fusion products had only, or mostly, mtDNA fragments from the parent that predominated in the nucleus. The hybrids containing mtDNA fragments from only one parent (and new fragments) also possessed chloroplasts from the same species. The results suggest the existence of a strong nucleo-cytoplasmic incongruity which affects the genome composition of somatic hybrids between distantly related species.

  3. A microfabricated hybrid device for DNA sequencing.

    Science.gov (United States)

    Liu, Shaorong

    2003-11-01

    We have created a hybrid device of a microfabricated round-channel twin-T injector incorporated with a separation capillary in order to extend the straight separation distance for high speed and long readlength DNA sequencing. Semicircular grooves on glass wafers are obtained using a photomask with a narrow line-width and a standard isotropic photolithographic etching process. Round channels are made when two etched wafers are face-to-face aligned and bonded. A two-mask fabrication process has been developed to make channels of two different diameters. The twin-T injector is formed by the smaller channels whose diameter matches the bore of the separation capillary, and the "usual" separation channel, now called the connection channel, is formed by the larger ones whose diameter matches the outer diameter of the separation capillary. The separation capillary is inserted through the connection channel all the way to the twin-T injector so that the capillary bore is flush with the twin-T injector channels. The total dead-volume of the connection is estimated to be approximately 5 pL. To demonstrate the efficiency of this hybrid device, we have performed four-color DNA sequencing on it. Using a 200 µm twin-T injector coupled with a separation capillary of 20 cm effective separation distance, we have obtained readlengths of 800-plus bases at an accuracy of 98.5% in 56 min, compared to about 650 bases in 100 min on a conventional 40 cm long capillary sequencing machine under similar conditions. At an increased separation field strength and using a diluted sieving matrix, the separation time has been reduced to 20 min with a readlength of 700 bases at 98.5% base-calling accuracy.
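The speed comparison above is easier to read as bases per minute. The small calculation below uses only the figures quoted in the abstract; the run labels and the helper name are assumptions, and since "800 plus" and "about 650" are approximate the rates are indicative only.

```python
def bases_per_minute(read_length, minutes):
    """Average sequencing rate for one run."""
    return read_length / minutes


# (read length in bases, run time in minutes), taken from the abstract.
runs = {
    "hybrid device, 20 cm capillary": (800, 56),
    "conventional 40 cm capillary": (650, 100),
    "hybrid device, higher field strength": (700, 20),
}

for label, (bases, minutes) in runs.items():
    print(f"{label}: {bases_per_minute(bases, minutes):.1f} bases/min")
```

On these numbers the hybrid device runs at roughly 14 bases per minute against 6.5 for the conventional instrument, and at 35 bases per minute under the faster separation conditions.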

  4. Phylogenetic relationships and generic delimitation in Inuleae subtribe Inulinae (Asteraceae) based on ITS and cpDNA sequence data

    DEFF Research Database (Denmark)

    Englund, Marcus; Pornpongrungrueng, Pimwadee; Gustafsson, Mats

    2009-01-01

    Phylogenetic relationships in Inuleae subtribe Inulinae (Asteraceae) were investigated. DNA sequence data from three chloroplast regions (ndhF, trnL-F and psbA-trnH) and the nuclear ribosomal internal transcribed spacer (ITS) region were analysed separately and in combination using parsimony...... and Bayesian inference. A total of 163 ingroup taxa were included, of which 60 were sampled for all four markers. Conflicts between chloroplast and nuclear data were assessed using partitioned Bremer support (PBS). Rather than averaging PBS over several trees from constrained searches, individual trees were...... considered by evaluating PBS ranges. Criteria to be used in the detection of a significant conflict between data partitions are proposed. Three nodes in the total data tree were found to encompass significant conflict that could result from ancient hybridization. Neither of the large, heterogeneous...

  5. The complete chloroplast genome sequence of Gentiana lawrencei var. farreri (Gentianaceae) and comparative analysis with its congeneric species.

    Science.gov (United States)

    Fu, Peng-Cheng; Zhang, Yan-Zhao; Geng, Hui-Min; Chen, Shi-Long

    2016-01-01

    The chloroplast (cp) genome is useful in plant systematics, genetic diversity analysis, molecular identification and divergence dating. The genus Gentiana contains 362 species, but there are only two valuable complete cp genomes. The purpose of this study is to report the characterization of the complete cp genome of G. lawrencei var. farreri, which is endemic to the Qinghai-Tibetan Plateau (QTP). Using high throughput sequencing technology, we got the complete nucleotide sequence of the G. lawrencei var. farreri cp genome. The comparison analysis including genome difference and gene divergence was performed with its congeneric species G. straminea. The simple sequence repeats (SSRs) and phylogenetics were studied as well. The cp genome of G. lawrencei var. farreri is a circular molecule of 138,750 bp, containing a pair of 24,653 bp inverted repeats which are separated by small and large single-copy regions of 11,365 and 78,082 bp, respectively. The cp genome contains 130 known genes, including 85 protein coding genes (PCGs), eight ribosomal RNA genes and 37 tRNA genes. Comparative analyses indicated that G. lawrencei var. farreri is 10,241 bp shorter than its congeneric species G. straminea. Four large gaps were detected that are responsible for 85% of the total sequence loss. Further detailed analyses revealed that 10 PCGs were included in the four gaps that encode nine NADH dehydrogenase subunits. The cp gene content, order and orientation are similar to those of its congeneric species, but with some variation among the PCGs. Three genes, ndhB, ndhF and clpP, have high nonsynonymous to synonymous values. There are 34 SSRs in the G. lawrencei var. farreri cp genome, of which 25 are mononucleotide repeats: no dinucleotide repeats were detected. Comparison with the G. straminea cp genome indicated that five SSRs have length polymorphisms and 23 SSRs are species-specific. The phylogenetic analysis of 48 PCGs from 12 Gentianales taxa cp genomes clearly identified

  6. Congruent Deep Relationships in the Grape Family (Vitaceae) Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

    Science.gov (United States)

    Zhang, Ning; Wen, Jun; Zimmer, Elizabeth A

    2015-01-01

    Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study, next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument [corrected]. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data, especially from chloroplast and mitochondrial DNAs.

  7. Congruent Deep Relationships in the Grape Family (Vitaceae) Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

    Directory of Open Access Journals (Sweden)

    Ning Zhang

    Full Text Available Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study, next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument [corrected]. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data, especially from chloroplast and mitochondrial DNAs.

  8. Identification of Meconopsis species by a DNA barcode sequence ...

    African Journals Online (AJOL)

    Deoxyribonucleic acid (DNA) barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Species identification is necessary for the authentication of traditional plant based medicines. Although a consensus has not been agreed regarding which DNA sequences can be used as ...

  9. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    Full Text Available DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT), the bionic wavelet transform (BWT), for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA sequences. Results show that BWT performs better than traditional WT in presenting greater energy distribution. This new BWT method should be useful for the detection of the latent structural features in future DNA sequence analysis.
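The "four channels of indicator sequences" step can be sketched as follows. Only the channel split is taken from the abstract; the function names are assumptions, and the fixed `weights` argument is a stand-in for the paper's adaptive symbol-to-number mapping, which is not reproduced here.

```python
def indicator_sequences(seq):
    """Split a DNA string into four binary channels, one per base."""
    seq = seq.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


def weighted_signal(seq, weights):
    """Combine the four channels into one numeric signal using per-base weights."""
    channels = indicator_sequences(seq)
    return [sum(weights[b] * channels[b][i] for b in "ACGT") for i in range(len(seq))]
```

For example, `indicator_sequences("ACGTA")["A"]` is `[1, 0, 0, 0, 1]`. A numeric signal built this way is what a wavelet transform would then take as input, and changing the weights changes how much each base contributes to the output energy.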

  10. Generic boundaries and evolution of characters in the Arctium group: a nuclear and chloroplast DNA analysis

    Directory of Open Access Journals (Sweden)

    Susanna, A.

    2003-12-01

    Full Text Available Generic delineation within the Arctium group (Compositae, Cardueae-Carduinae), formed by the genera Arctium, Cousinia, Hypacanthium and Schmalhausenia, has proven a complicated task. In particular, the precise limits between Arctium and Cousinia are very difficult to establish. Therefore, we have carried out a molecular survey of DNA sequences of two regions, the chloroplast gene matK and the nuclear-ribosomal spacers ITS 1 and 2, of a representation of all the genera of the group (in the case of Cousinia, centered on the species more obviously related to Arctium). Our results show a precise correlation between molecular phylogeny and two very important characters, pollen type and chromosome numbers: all the investigated species with the Arctiastrum pollen type and x = 18, characteristics of Arctium sensu stricto, form a monophyletic clade, sister to another monophyletic clade formed by all the investigated species of Cousinia sensu stricto. However, the resulting "Arctioid" clade cannot be defined on macroscopic morphological characters, because the main trait for segregating Arctium and Cousinia, the spiny pinnatifid-pinnatisect leaves of Cousinia, is adaptive and of scarce systematic relevance. In fact, our results suggest that spines have appeared in at least two different lineages: the genera Hypacanthium and Schmalhausenia, spiny and thus morphologically closer to Cousinia, are unambiguously related to the unarmed genus Arctium. A hypothesis on the evolution of morphology, pollen and chromosome numbers in the group is formulated. The systematic implications of this incongruence between molecular, pollen and karyology, on the one hand, and morphology, on the other hand, are evaluated. Some possible solutions are proposed, but none of them is totally satisfactory: more studies are necessary with the inclusion

  11. Sequence analysis of Maturase K (matK): A chloroplast-encoding ...

    African Journals Online (AJOL)

    The application and utilization of sequence data has been found very informative in the characterization and phylogenetic relationship of different crop species. This study aimed to use bioinformatics tools to characterize the matK gene in some selected legumes with special reference to pigeon pea [Cajanus cajan ...

  12. Next Generation DNA Sequencing and the Future of Genomic Medicine

    OpenAIRE

    Anderson, Matthew W.; Schrijver, Iris

    2010-01-01

    In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpreta...

  13. SWORDS: A statistical tool for analysing large DNA sequences

    Indian Academy of Sciences (India)

    Unknown

    These techniques are based on frequency distributions of DNA words in a large sequence, and have been packaged into a software called SWORDS. Using sequences available in ... tions with the cellular processes like recombination, replication .... in DNA sequences using certain specific probability laws. (Pevzner et al ...
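The basic quantity behind this record, the frequency distribution of DNA words in a sequence, can be sketched in a few lines. This is not the SWORDS implementation; the function name and the choice of overlapping windows are assumptions for illustration.

```python
from collections import Counter


def word_frequencies(seq, k):
    """Count every overlapping DNA word of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
```

For instance, `word_frequencies("ACGACGT", 3)` counts `ACG` twice and `CGA`, `GAC` and `CGT` once each. Statistical tests on such counts would then compare them with the frequencies expected under a chosen probability model.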

  14. Chloroplast DNA polymorphism and evolutional relationships between Asian cultivated rice (Oryza sativa) and its wild relatives (O. rufipogon).

    Science.gov (United States)

    Li, W J; Zhang, B; Huang, G W; Kang, G P; Liang, M Z; Chen, L B

    2012-12-17

    We analyzed chloroplast DNA (cpDNA) polymorphism and phylogenic relationships between 6 typical indica rice, 4 japonica rice, 8 javanica rice, and 12 Asian common wild rice (Oryza rufipogon) strains collected from different latitudes in China by comparing polymorphism at 9 highly variable regions. One hundred and forty-four polymorphic bases were detected. The O. rufipogon samples had 117 polymorphic bases, showing rich genetic diversity. One hundred and thirty-one bases at 13 sites were identified with indica/japonica characteristics; they showed differences between the indica and japonica subspecies at these sites. The javanica strains and japonica shared similar bases at these 131 polymorphic sites, suggesting that javanica is closely related to japonica. On the basis of length analyses of the open reading frame (ORF)100 and (ORF)29-tRNA-Cys(GCA) (TrnC(GCA)) fragments, the O. rufipogon strains were classified into indica/japonica subgroups, which was consistent with the results of the phylogenic tree assay based on concatenated datasets. These results indicated that differences in indica and japonica also exist in the cpDNA genome of the O. rufipogon strains. However, these differences demonstrated a certain degree of primitiveness and incompleteness, as an O. rufipogon line may show different indica/japonica attributes at different sites. Consequently, O. rufipogon cannot be simply classified into the indica/japonica types as O. sativa. Our data support the hypothesis that Asian cultivated rice, O. indica and O. japonica, separately evolved from Asian common wild rice (O. rufipogon) strains, which have different indica-japonica differentiation trends.
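Counting polymorphic bases, as done in this study, comes down to finding alignment columns where the strains differ. The sketch below assumes the sequences are already aligned to equal length; the function name and the decision to ignore gaps and N are illustrative assumptions, not the authors' procedure.

```python
def polymorphic_sites(aligned_seqs):
    """Return 0-based column indices where the aligned sequences differ (gaps and N ignored)."""
    sites = []
    for i, column in enumerate(zip(*aligned_seqs)):
        bases = {b.upper() for b in column if b not in "-Nn"}
        if len(bases) > 1:
            sites.append(i)
    return sites
```

With the toy alignment `["ACGTAC", "ACGTTC", "ATGTAC"]`, columns 1 and 4 are reported as polymorphic. Sites that separate two predefined groups, such as indica and japonica, could be found by additionally requiring each group to be fixed for a different base.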

  15. Nuclear and cpDNA sequences combined provide strong inference of higher phylogenetic relationships in the phlox family (Polemoniaceae).

    Science.gov (United States)

    Johnson, Leigh A; Chan, Lauren M; Weese, Terri L; Busby, Lisa D; McMurry, Samuel

    2008-09-01

    Members of the phlox family (Polemoniaceae) serve as useful models for studying various evolutionary and biological processes. Despite its biological importance, no family-wide phylogenetic estimate based on multiple DNA regions with complete generic sampling is available. Here, we analyze one nuclear and five chloroplast DNA sequence regions (nuclear ITS, chloroplast matK, trnL intron plus trnL-trnF intergenic spacer, and the trnS-trnG, trnD-trnT, and psbM-trnD intergenic spacers) using parsimony and Bayesian methods, as well as assessments of congruence and long branch attraction, to explore phylogenetic relationships among 84 ingroup species representing all currently recognized Polemoniaceae genera. Relationships inferred from the ITS and concatenated chloroplast regions are similar overall. A combined analysis provides strong support for the monophyly of Polemoniaceae and subfamilies Acanthogilioideae, Cobaeoideae, and Polemonioideae. Relationships among subfamilies, and thus for the precise root of Polemoniaceae, remain poorly supported. Within the largest subfamily, Polemonioideae, four clades corresponding to tribes Polemonieae, Phlocideae, Gilieae, and Loeselieae receive strong support. The monogeneric Polemonieae appears sister to Phlocideae. Relationships within Polemonieae, Phlocideae, and Gilieae are mostly consistent between analyses and data permutations. Many relationships within Loeselieae remain uncertain. Overall, inferred phylogenetic relationships support a higher-level classification for Polemoniaceae proposed in 2000.

  16. Euglena gracilis chloroplast DNA: analysis of a 1.6 kb intron of the psb C gene containing an open reading frame of 458 codons.

    Science.gov (United States)

    Montandon, P E; Vasserot, A; Stutz, E

    1986-01-01

    We retrieved a 1.6 kbp intron separating two exons of the psb C gene which codes for the 44 kDa reaction center protein of photosystem II. This intron is 3 to 4 times the size of all previously sequenced Euglena gracilis chloroplast introns. It contains an open reading frame of 458 codons potentially coding for a basic protein of 54 kDa of yet unknown function. The intron boundaries follow consensus sequences established for chloroplast introns related to class II and nuclear pre-mRNA introns. Its 3'-terminal segment has structural features similar to class II mitochondrial introns with an invariant base A as possible branch point for lariat formation.

  17. Variability of silver fir (Abies alba Mill.) progeny from the Tisovik Reserve expressed in needle traits and chloroplast microsatellite DNA

    Directory of Open Access Journals (Sweden)

    Pawlaczyk Ewa M.

    2017-01-01

    Progeny from nineteen family lines of silver fir (Abies alba Mill.) from the Tisovik Reserve growing in an experimental plot were analyzed based on 4 chloroplast microsatellite DNA loci and 12 morphological and anatomical needle traits. The Tisovik Reserve is located in Białowieża Primeval Forest, 120 km north of the natural range limit of this species, and embraces a small and isolated natural population of silver fir. The aim of this study was to determine genetic variation within and between progeny lines. Analysis of phenotypic variation showed that the traits which differed most among individuals were the needle width and the distance from resin canals to vascular bundle. The traits which differed most between the progeny lines were the number of endodermic cells around the vascular bundle and the weight of hypodermic cells. In Tisovik progeny, we detected 107 different haplotypes. In progeny lines, we detected more haplotypes than in maternal trees, and most haplotypes did not exist in maternal trees. This may be the result of pollen influx from other silver fir stands. Progeny from Tisovik showed a higher level of variability in comparison with maternal trees.

  18. Combined Analyses of Chloroplast DNA Haplotypes and Microsatellite Markers Reveal New Insights Into the Origin and Dissemination Route of Cultivated Pears Native to East Asia

    Directory of Open Access Journals (Sweden)

    Xiaoyan Yue

    2018-05-01

    Asian pear plays an important role in the world pear industry, accounting for over 70% of world total production volume. Commercial Asian pear production relies on four major pear cultivar groups, Japanese pear (JP), Chinese white pear (CWP), Chinese sand pear (CSP), and Ussurian pear (UP), but their origins remain controversial. We estimated the genetic diversity levels and structures in a large sample of existing local cultivars to investigate the origins of Asian pears using twenty-five genome-covering nuclear microsatellite (simple sequence repeats, nSSR) markers and two non-coding chloroplast DNA (cpDNA) regions (trnL-trnF and accD-psaI). High levels of genetic diversity were detected for both nSSRs (HE = 0.744) and cpDNAs (Hd = 0.792). The major variation was found within geographic populations of cultivated pear groups, demonstrating a close relationship among cultivar groups. CSPs showed a greater genetic diversity than CWPs and JPs, and lowest levels of genetic differentiation were detected among them. Phylogeographical analyses indicated that the CSP, CWP, and JP were derived from the same progenitor of Pyrus pyrifolia in China. A dissemination route of cultivated P. pyrifolia estimated by approximate Bayesian computation suggested that cultivated P. pyrifolia from the Middle Yangtze River Valley area contributed the major genetic resources to the cultivars, excluding those of southwestern China. Three major genetic groups of cultivated Pyrus pyrifolia were revealed using nSSRs and a Bayesian statistical inference: (a) JPs; (b) cultivars from South-Central China northward to northeastern China, covering the main pear production area in China; (c) cultivars from southwestern China to southeastern China, including Yunnan, Guizhou, Guangdong, Guangxi, and Fujian Provinces. This reflected the synergistic effects of ecogeographical factors and human selection during cultivar spread and improvement. The analyses indicated that UP cultivars might be

  19. [Phylogenetic relationships of the species of Oxytropis DC. subg. Oxytropis and Phacoxytropis (Fabaceae) from Asian Russia inferred from the nucleotide sequence analysis of the intergenic spacers of the chloroplast genome].

    Science.gov (United States)

    Kholina, A B; Kozyrenko, M M; Artyukova, E V; Sandanov, D V; Andrianova, E A

    2016-08-01

    The nucleotide sequence analysis of trnH–psbA, trnL–trnF, and trnS–trnG intergenic spacer regions of chloroplast DNA performed in the representatives of the genus Oxytropis from Asian Russia provided clarification of the phylogenetic relationships of some species and sections in the subgenera Oxytropis and Phacoxytropis and in the genus Oxytropis as a whole. Only the section Mesogaea corresponds to the subgenus Phacoxytropis, while the section Janthina of the same subgenus groups together with the sections of the subgenus Oxytropis. The sections Chrysantha and Ortholoma of the subgenus Oxytropis are not only closely related to each other, but together with the section Mesogaea, they are grouped into the subgenus Phacoxytropis. It seems likely that the sections Chrysantha and Ortholoma should be assigned to the subgenus Phacoxytropis, and the section Janthina should be assigned to the subgenus Oxytropis. The molecular differences were identified between O. coerulea and O. mandshurica from the section Janthina that were indicative of considerable divergence of their chloroplast genomes and the species independence of the taxa. The species independence of O. czukotica belonging to the section Arctobia was also confirmed.

  20. RFLP of analyses of an intergenic spacer region of chloroplast DNA ...

    African Journals Online (AJOL)

    user

    2006-11-16

    Nov 16, 2006 ... amplified with PCR and digested with 6 restriction endonucleases (Hpa II, Alu I, Hinc II, Ava III, Nde I and. Hae III). According to results ... the environment, but DNA based techniques represent reliable tools and do not have many of ... existence of some variations in gene content (Downie and Palmer, 1992).

  1. A novel constraint for thermodynamically designing DNA sequences.

    Directory of Open Access Journals (Sweden)

    Qiang Zhang

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.
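
    The Hamming-distance constraint that this abstract treats as the existing baseline can be sketched as follows. This is a minimal illustration only: the function names and the threshold parameter `d_min` are hypothetical and are not taken from the paper, and the thermodynamic (free energy gap) constraint the authors propose is not modelled here.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def satisfies_hamming_constraint(seqs, d_min: int) -> bool:
    """True if every pair of sequences differs in at least d_min positions.

    d_min is a hypothetical design parameter chosen by the user.
    """
    return all(hamming_distance(a, b) >= d_min for a, b in combinations(seqs, 2))
```

    A set such as AAAA, CCCC, GGGG passes with `d_min = 4`, whereas AAAA and AAAC fail with `d_min = 2`. The check looks only at mismatches between symbols, which is the limitation the abstract points to when it argues for a thermodynamic constraint.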

  2. An automated annotation tool for genomic DNA sequences using

    Indian Academy of Sciences (India)

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated ...

  3. Phylogenetic relationships in the Festuca-Lolium complex (Loliinae; Poaceae): New insights from chloroplast sequences

    Directory of Open Access Journals (Sweden)

    Yajuan Cheng

    2016-07-01

    The species within the Lolium/Festuca grass complex have dispersed and colonized large areas of temperate global grasslands both naturally and by human intervention. The species within this grass complex represent some of the most important grass species both for amenity and agricultural use worldwide. There has been renewed interest by grass breeders in producing hybrid combinations between these species and several countries now market Festulolium varieties as a combination of genes from both genera. The two genera have been differentiated by their inflorescence structure, but controversy has surrounded the taxonomic classification of the Lolium-Festuca complex species for several decades. In order to better understand the complexities within the Lolium/Festuca complex and their genetic background, the phylogeny of important exemplars from the Lolium-Festuca complex was reconstructed. In total 40 taxa representing the Festuca and Lolium species with Vulpia myuros and Brachypodium distachyon as outgroups were sampled, using two noncoding intergenic spacers (trnQ-rps16, trnH-psbA) and one coding gene (rbcL). Maximum parsimony (MP) and Bayesian inference (BI) analyses based on each partition and combined plastid DNA dataset, and median-joining network analysis were employed. The outcomes strongly suggested that the subgen. Schedonorus has a close relationship to Lolium, and it is also proposed to move the sect. Leucopoa from subgen. Leucopoa to subgen. Schedonorus and to separate sect. Breviaristatae from the subgen. Leucopoa. We found that F. californica could be a lineage of hybrid origin because of its intermediate placement between the broad-leaved and fine-leaved clade.

  4. [Phylogenetic relationships among the genera of Taxodiaceae and Cupressaceae from 28S rDNA sequences].

    Science.gov (United States)

    Li, Chun-Xiang; Yang, Qun

    2003-03-01

    DNA sequences from 28S rDNA were used to assess relationships between and within traditional Taxodiaceae and Cupressaceae s.s. The MP tree and NJ tree generally are similar to one another. The results show that Taxodiaceae and Cupressaceae s.s. form a monophyletic conifer lineage excluding Sciadopitys. In the Taxodiaceae-Cupressaceae s.s. monophyletic group, the Taxodiaceae is paraphyletic. Taxodium, Glyptostrobus and Cryptomeria form a clade (Taxodioideae), in which Glyptostrobus and Taxodium are closely related and sister to Cryptomeria; Sequoia, Sequoiadendron and Metasequoia are closely related to each other, forming another clade (Sequoioideae), in which Sequoia and Sequoiadendron are closely related and sister to Metasequoia; the seven genera of Cupressaceae s.s. are found to be closely related and form a monophyletic lineage (Cupressoideae). These results are basically similar to analyses from chloroplast gene data. However, the relationships among Taiwania, Sequoioideae, Taxodioideae, and Cupressoideae remain unclear because of the slow evolution rate of 28S rDNA; this question might best be answered by sequencing more rapidly evolving nuclear genes.

  5. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    NARCIS (Netherlands)

    Buschmann, Tilo; Bystrykh, Leonid V.

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fraction of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called

  6. Molecular design of sequence specific DNA alkylating agents.

    Science.gov (United States)

    Minoshima, Masafumi; Bando, Toshikazu; Shinohara, Ken-ichi; Sugiyama, Hiroshi

    2009-01-01

    Sequence-specific DNA alkylating agents are of great interest as a novel approach to cancer chemotherapy. We designed conjugates between pyrrole (Py)-imidazole (Im) polyamides and the DNA alkylating chlorambucil moiety attached at different positions. The sequence-specific DNA alkylation by the conjugates was investigated by using high-resolution denaturing polyacrylamide gel electrophoresis (PAGE). The results showed that polyamide chlorambucil conjugates alkylate DNA at flanking adenines in recognition sequences of Py-Im polyamides; however, the reactivities and alkylation sites were influenced by the positions of conjugation. In addition, we synthesized a conjugate between Py-Im polyamide and another alkylating agent, 1-(chloromethyl)-5-hydroxy-1,2-dihydro-3H-benz[e]indole (seco-CBI). DNA alkylation reactivities by both alkylating polyamides were almost comparable. In contrast, cytotoxicities against cell lines differed greatly. These comparative studies would promote development of appropriate sequence-specific DNA alkylating polyamides against specific cancer cells.

  7. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd- and 4th-year undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  8. DNA Polymerases Drive DNA Sequencing-by-Synthesis Technologies: Both Past and Present

    Directory of Open Access Journals (Sweden)

    Cheng-Yao eChen

    2014-06-01

    Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. E. coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger's dideoxy chain terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public database. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third generation, real-time single molecule sequencing platform requires slightly different enzyme properties. Enterobacterial phage φ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, φ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore-, and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  9. Sequence of human protamine 2 cDNA

    Energy Technology Data Exchange (ETDEWEB)

    Domenjoud, L; Fronia, C; Uhde, F; Engel, W [Universitaet Goettingen (West Germany)

    1988-08-11

    The authors report the cloning and sequencing of a cDNA clone for human protamine 2 (hp2), isolated from a human testis cDNA library cloned in the vector λ-gt11. A 66mer oligonucleotide that corresponds to an amino acid sequence which is highly conserved between hp2 and mouse protamine 2 (mp2) served as hybridization probe. The homology between the amino acid sequence deduced from our cDNA and the published amino acid sequence for hp2 is 100%.

  10. Sequence periodicity in nucleosomal DNA and intrinsic curvature.

    Science.gov (United States)

    Nair, T Murlidharan

    2010-05-17

    Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility of context dependent curvature of varying degrees to be associated with nucleosomal DNA.

  11. Rapid Multiplex Small DNA Sequencing on the MinION Nanopore Sequencing Platform

    Directory of Open Access Journals (Sweden)

    Shan Wei

    2018-05-01

    Real-time sequencing of short DNA reads has a wide variety of clinical and research applications including screening for mutations, target sequences and aneuploidy. We recently demonstrated that MinION, a nanopore-based DNA sequencing device the size of a USB drive, could be used for short-read DNA sequencing. In this study, an ultra-rapid multiplex library preparation and sequencing method for the MinION is presented and applied to accurately test normal diploid and aneuploidy samples’ genomic DNA in under three hours, including library preparation and sequencing. This novel method shows great promise as a clinical diagnostic test for applications requiring rapid short-read DNA sequencing.

  12. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    International Nuclear Information System (INIS)

    Chechetkin, V.R.; Lobzin, V.V.

    2004-01-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions
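
    As a rough illustration of the idea in this abstract, the sketch below computes a Shannon-type entropy over the normalized Fourier power spectrum of a DNA sequence. It is one common formulation and an assumption on our part, not necessarily the authors' exact definition: the four-indicator-sequence encoding, the exclusion of the zero harmonic, the normalization, and the function names are all illustrative choices.

```python
import cmath
import math


def power_spectrum(seq: str, base: str):
    """Squared DFT amplitudes (harmonics 1..N/2) of the indicator sequence for one base."""
    n = len(seq)
    indicator = [1.0 if s == base else 0.0 for s in seq]
    spectrum = []
    for k in range(1, n // 2 + 1):
        amp = sum(indicator[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        spectrum.append(abs(amp) ** 2)
    return spectrum


def spectral_entropy(seq: str) -> float:
    """Entropy of the normalized power spectrum summed over A, C, G and T.

    Low values indicate that spectral power is concentrated in a few
    harmonics (ordered, periodic sequence); high values indicate a flat
    spectrum (disordered sequence).
    """
    total = [0.0] * (len(seq) // 2)
    for base in "ACGT":
        for i, p in enumerate(power_spectrum(seq, base)):
            total[i] += p
    norm = sum(total)
    if norm < 1e-12:  # e.g. a homopolymer: no power outside the zero harmonic
        return 0.0
    probs = [p / norm for p in total]
    return -sum(p * math.log(p) for p in probs if p > 0.0)
```

    Under these assumptions a strictly periodic sequence such as ACACAC... concentrates its power in a single harmonic and scores close to zero, while an irregular sequence of the same length scores higher. The direct DFT is quadratic in sequence length, so a real analysis would use an FFT routine instead.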

  13. Adenoviral DNA replication: DNA sequences and enzymes required for initiation in vitro

    International Nuclear Information System (INIS)

    Stillman, B.W.; Tamanoi, F.

    1983-01-01

    In this paper evidence is provided that the 140,000-dalton DNA polymerase is encoded by the adenoviral genome and is required for the initiation of DNA replication in vitro. The DNA sequences in the template DNA that are required for the initiation of replication have also been identified, using both plasmid DNAs and synthetic oligodeoxyribonucleotides. 48 references, 7 figures, 1 table

  14. Order and correlations in genomic DNA sequences. The spectral approach

    International Nuclear Information System (INIS)

    Lobzin, Vasilii V; Chechetkin, Vladimir R

    2000-01-01

    The structural analysis of genomic DNA sequences is discussed in the framework of the spectral approach, which is sufficiently universal due to the reciprocal correspondence and mutual complementarity of Fourier transform length scales. The spectral characteristics of random sequences of the same nucleotide composition possess the property of self-averaging for relatively short sequences of length M≥100-300. Comparison with the characteristics of random sequences determines the statistical significance of the structural features observed. Apart from traditional applications to the search for hidden periodicities, spectral methods are also efficient in studying mutual correlations in DNA sequences. By combining spectra for structure factors and correlation functions, not only integral correlations can be estimated but also their origin identified. Using the structural spectral entropy approach, the regularity of a sequence can be quantitatively assessed. A brief introduction to the problem is also presented and other major methods of DNA sequence analysis described. (reviews of topical problems)

  15. Toward a Better Compression for DNA Sequences Using Huffman Encoding.

    Science.gov (United States)

    Al-Okaily, Anas; Almarri, Badar; Al Yami, Sultan; Huang, Chun-Hsi

    2017-04-01

    Due to the significant amount of DNA data that are being generated by next-generation sequencing machines for genomes of lengths ranging from megabases to gigabases, there is an increasing need to compress such data so that they occupy less space and can be transmitted faster. Different implementations of Huffman encoding incorporating the characteristics of DNA sequences prove to better compress DNA data. These implementations center on the concepts of selecting frequent repeats so as to force a skewed Huffman tree, as well as the construction of multiple Huffman trees when encoding. The implementations demonstrate improvements on the compression ratios for five genomes with lengths ranging from 5 to 50 Mbp, compared with the standard Huffman tree algorithm. The research hence suggests an improvement on all such DNA sequence compression algorithms that use the conventional Huffman encoding. Accompanying software is publicly available (Al-Okaily, 2016).
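
    For orientation, the conventional Huffman baseline that this abstract compares against can be sketched as below. This is a textbook single-tree construction over individual nucleotides, written for illustration; it does not implement the repeat-selection or multiple-tree variants the authors describe, and the function names are hypothetical.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict:
    """Build a prefix code with the standard Huffman algorithm."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:  # a single symbol still needs a one-bit code
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial_code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text: str, codes: dict) -> str:
    """Concatenate the code word of every symbol."""
    return "".join(codes[s] for s in text)


def decode(bits: str, codes: dict) -> str:
    """Invert the prefix code by reading bits until a code word matches."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

    With a skewed composition (for example 8 A, 4 C, 2 G and 2 T) the encoded length drops below the 2 bits per base of a fixed-width code; with a uniform composition it does not, which is why the skew-forcing strategies mentioned in the abstract matter.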

  16. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    Directory of Open Access Journals (Sweden)

    Michael Knapp

    2010-07-01

    The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions.

  17. Thermodynamics of sequence-specific binding of PNA to DNA

    DEFF Research Database (Denmark)

    Ratilainen, T; Holmén, A; Tuite, E

    2000-01-01

    For further characterization of the hybridization properties of peptide nucleic acids (PNAs), the thermodynamics of hybridization of mixed sequence PNA-DNA duplexes have been studied. We have characterized the binding of PNA to DNA in terms of binding affinity (perfectly matched duplexes) and seq...

  18. Local repeat sequence organization of an intergenic spacer

    Indian Academy of Sciences (India)

    The amplification yielded the same uniquely "sequence-scrambled" product, whether the template used for PCR was total cellular DNA, chloroplast DNA or a plasmid clone DNA corresponding to that region. The PCR product, a "unique" new sequence, had lost the repetitive organization of the template genome where it ...

  19. Characteristics of alternating current hopping conductivity in DNA sequences

    International Nuclear Information System (INIS)

    Song-Shan, Ma; Hui, Xu; Huan-You, Wang; Rui, Guo

    2009-01-01

    This paper presents a model to describe alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system, and electrons transport via hopping between localized states. It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises, and it takes the form $\sigma_{ac}(\omega) \sim \omega^{2}\ln^{2}(1/\omega)$. AC conductivity of DNA sequences also increases with the increase of temperature; this phenomenon presents characteristics of weak temperature-dependence. Meanwhile, the AC conductivity in an off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit at low temperatures, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences. For p < 0.5, the conductivity of DNA sequence decreases with the increase of p, while for p ≥ 0.5, the conductivity increases with the increase of p. (cross-disciplinary physics and related areas of science and technology)

  20. Sequence-dependent DNA deformability studied using molecular dynamics simulations.

    Science.gov (United States)

    Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori

    2007-01-01

    Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained by the simulations is consistent with that by the crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can highly depend on the flanking base pairs. The conformations of xATx tetramers show the most rigidity and are not affected by the flanking base pairs and the xYRx show by contrast the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.

  1. Characteristics of alternating current hopping conductivity in DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Ma Song-Shan; Xu Hui; Wang Huan-You; Guo Rui

    2009-01-01

    This paper presents a model to describe alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system, and electrons transport via hopping between localized states. It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises, and it takes the form $\sigma_{ac}(\omega) \sim \omega^{2}\ln^{2}(1/\omega)$. AC conductivity of DNA sequences also increases with the increase of temperature; this phenomenon presents characteristics of weak temperature-dependence. Meanwhile, the AC conductivity in an off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit at low temperatures, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences. For p < 0.5, the conductivity of DNA sequence decreases with the increase of p, while for p > 0.5, the conductivity increases with the increase of p.

  2. Nucleotide sequence analysis of regions of adenovirus 5 DNA containing the origins of DNA replication

    International Nuclear Information System (INIS)

    Steenbergh, P.H.

    1979-01-01

    The purpose of the investigations described is the determination of nucleotide sequences at the molecular ends of the linear adenovirus type 5 DNA. Knowledge of the primary structure at the termini of this DNA molecule is of particular interest in the study of the mechanism of replication of adenovirus DNA. The initiation- and termination sites of adenovirus DNA replication are located at the ends of the DNA molecule. (Auth.)

  3. High-Throughput Block Optical DNA Sequence Identification.

    Science.gov (United States)

    Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant

    2018-01-01

    Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm⁻¹. Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
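
The idea of describing a sequence by the A, T, G, and C content of its k-mer blocks, rather than letter by letter, can be illustrated in Python. This is a sketch on plain strings, not the optical method itself; the function name and the choice of non-overlapping blocks are assumptions.

```python
from collections import Counter


def block_content(sequence: str, k: int) -> list[dict[str, int]]:
    """Split a DNA sequence into consecutive, non-overlapping k-mer blocks
    and count A, T, G, and C in each block (a trailing partial block is dropped)."""
    seq = sequence.upper()
    blocks = [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]
    contents = []
    for block in blocks:
        counts = Counter(block)
        contents.append({base: counts.get(base, 0) for base in "ATGC"})
    return contents


print(block_content("ATGCGGTA", 4))
# Two different 4-mers with the same composition give the same content,
# which is why the representation is lossy.
print(block_content("ATGC", 4) == block_content("CGTA", 4))
```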

  4. Metaxa: a software tool for automated detection and discrimination among ribosomal small subunit (12S/16S/18S) sequences of archaea, bacteria, eukaryotes, mitochondria, and chloroplasts in metagenomes and environmental sequencing datasets.

    Science.gov (United States)

    Bengtsson, Johan; Eriksson, K Martin; Hartmann, Martin; Wang, Zheng; Shenoy, Belle Damodara; Grelet, Gwen-Aëlle; Abarenkov, Kessy; Petri, Anna; Rosenblad, Magnus Alm; Nilsson, R Henrik

    2011-10-01

    The ribosomal small subunit (SSU) rRNA gene has emerged as an important genetic marker for taxonomic identification in environmental sequencing datasets. In addition to being present in the nucleus of eukaryotes and the core genome of prokaryotes, the gene is also found in the mitochondria of eukaryotes and in the chloroplasts of photosynthetic eukaryotes. These three sets of genes are conceptually paralogous and should in most situations not be aligned and analyzed jointly. To identify the origin of SSU sequences in complex sequence datasets has hitherto been a time-consuming and largely manual undertaking. However, the present study introduces Metaxa ( http://microbiology.se/software/metaxa/ ), an automated software tool to extract full-length and partial SSU sequences from larger sequence datasets and assign them to an archaeal, bacterial, nuclear eukaryote, mitochondrial, or chloroplast origin. Using data from reference databases and from full-length organelle and organism genomes, we show that Metaxa detects and scores SSU sequences for origin with very low proportions of false positives and negatives. We believe that this tool will be useful in microbial and evolutionary ecology as well as in metagenomics.

  5. Cloning, sequencing and expression of cDNA encoding growth ...

    Indian Academy of Sciences (India)

    Unknown

    of medicine, animal husbandry, fish farming and animal ..... northern pike (Esox lucius) growth hormone; Mol. Mar. Biol. ... prolactin 1-luciferase fusion gene in African catfish and ... 1988 Cloning and sequencing of cDNA that encodes goat.

  6. DNA Nucleotide Sequence Restricted by the RI Endonuclease

    Science.gov (United States)

    Hedgpeth, Joe; Goodman, Howard M.; Boyer, Herbert W.

    1972-01-01

    The sequence of DNA base pairs adjacent to the phosphodiester bonds cleaved by the RI restriction endonuclease in unmodified DNA from coliphage λ has been determined. The 5′-terminal nucleotide labeled with 32P and oligonucleotides up to the heptamer were analyzed from a pancreatic DNase digest. The following sequence of nucleotides adjacent to the RI break made in λ DNA was deduced from these data and from the 3′-dinucleotide sequence and nearest-neighbor analysis obtained from repair synthesis with the DNA polymerase of Rous sarcoma virus [Formula: see text] The RI endonuclease cleavage of the phosphodiester bonds (indicated by arrows) generates 5′-phosphoryls and short cohesive termini of four nucleotides, pApApTpT. The most striking feature of the sequence is its symmetry. PMID:4343974

  7. Capillary gel electrophoresis for rapid, high resolution DNA sequencing.

    OpenAIRE

    Swerdlow, H; Gesteland, R

    1990-01-01

    Capillary gel electrophoresis has been demonstrated for the separation and detection of DNA sequencing samples. Enzymatic dideoxy nucleotide chain termination was employed, using fluorescently tagged oligonucleotide primers and laser based on-column detection (limit of detection is 6,000 molecules per peak). Capillary gel separations were shown to be three times faster, with better resolution (2.4 x), and higher separation efficiency (5.4 x) than a conventional automated slab gel DNA sequenci...

  8. Phylogenetic relationships in Solanaceae and related species based on cpDNA sequence from plastid trnE-trnT region

    Directory of Open Access Journals (Sweden)

    Danila Montewka Melotto-Passarin

    2008-01-01

    Full Text Available Intergenic spacers of chloroplast DNA (cpDNA) are very useful in phylogenetic and population genetic studies of plant species, to study their potential integration in phylogenetic analysis. The non-coding trnE-trnT intergenic spacer of cpDNA was analyzed to assess the nucleotide sequence polymorphism of 16 Solanaceae species and to estimate its ability to contribute to the resolution of phylogenetic studies of this group. Multiple alignments of DNA sequences of the trnE-trnT intergenic spacer made the identification of nucleotide variability in this region possible, and the phylogeny was estimated by maximum parsimony and rooted with Convolvulaceae Ipomoea batatas, the most closely related family. Besides, this intergenic spacer was tested for the phylogenetic ability to differentiate taxonomic levels. For this purpose, species from four other families were analyzed and compared with Solanaceae species. Results confirmed polymorphism in the trnE-trnT region at different taxonomic levels.

  9. Recurrence time statistics: versatile tools for genomic DNA sequence analysis.

    Science.gov (United States)

    Cao, Yinhe; Tung, Wen-Wen; Gao, J B

    2004-01-01

    With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.
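
A recurrence time, read in the simplest sense as the distance between successive occurrences of the same word, can be computed as in the Python sketch below. The abstract does not give the authors' definition or their codon index, so the function and its output format are assumptions for illustration only.

```python
from collections import defaultdict


def recurrence_times(sequence: str, k: int) -> dict[str, list[int]]:
    """Map each k-mer to the list of gaps between its successive occurrences."""
    last_seen: dict[str, int] = {}
    gaps: dict[str, list[int]] = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        if word in last_seen:
            # Distance from the previous occurrence of the same word.
            gaps[word].append(i - last_seen[word])
        last_seen[word] = i
    return dict(gaps)


# A sequence with period 3: every repeated 3-mer recurs at distance 3.
print(recurrence_times("ATGATGATG", 3))
```

A histogram of these gaps over a whole genome would show peaks at the periods of tandem repeats, which is the kind of feature the abstract describes.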

  10. An extended sequence specificity for UV-induced DNA damage.

    Science.gov (United States)

    Chung, Long H; Murray, Vincent

    2018-01-01

    The sequence specificity of UV-induced DNA damage was determined with a higher precision and accuracy than previously reported. UV light induces two major damage adducts: cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). Employing capillary electrophoresis with laser-induced fluorescence and taking advantage of the distinct properties of the CPDs and 6-4PPs, we studied the sequence specificity of UV-induced DNA damage in a purified DNA sequence using two approaches: end-labelling and a polymerase stop/linear amplification assay. A mitochondrial DNA sequence that contained a random nucleotide composition was employed as the target DNA sequence. With previous methodology, the UV sequence specificity was determined at a dinucleotide or trinucleotide level; however, in this paper, we have extended the UV sequence specificity to a hexanucleotide level. With the end-labelling technique (for 6-4PPs), the consensus sequence was found to be 5'-GCTC*AC (where C* is the breakage site); while with the linear amplification procedure, it was 5'-TCTT*AC. With end-labelling, the dinucleotide frequency of occurrence was highest for 5'-TC*, 5'-TT* and 5'-CC*; whereas it was 5'-TT* for linear amplification. The influence of neighbouring nucleotides on the degree of UV-induced DNA damage was also examined. The core sequences consisted of pyrimidine nucleotides 5'-CTC* and 5'-CTT* while an A at position "1" and C at position "2" enhanced UV-induced DNA damage. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.

  11. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sec...

  12. Protein import into chloroplasts requires a chloroplast ATPase

    International Nuclear Information System (INIS)

    Pain, D.; Blobel, G.

    1987-01-01

    The authors have transcribed mRNA from a cDNA clone coding for pea ribulose-1,5-bisphosphate carboxylase, translated the mRNA in a wheat germ cell-free system, and studied the energy requirement for posttranslational import of the [35S]methionine-labeled protein into the stroma of pea chloroplasts. They found that import depends on ATP hydrolysis within the stroma. Import is not inhibited when H+, K+, Na+, or divalent cation gradients across the chloroplast membranes are dissipated by ionophores, as long as exogenously added ATP is also present during the import reaction. The data suggest that protein import into the chloroplast stroma requires a chloroplast ATPase that does not function to generate a membrane potential for driving the import reaction but that exerts its effect in another, yet-to-be-determined, mode. They have carried out a preliminary characterization of this ATPase regarding its nucleotide specificity and the effects of various ATPase inhibitors.

  13. PREDICTION OF CHROMATIN STATES USING DNA SEQUENCE PROPERTIES

    KAUST Repository

    Bahabri, Rihab R.

    2013-06-01

    Activities of DNA are to a great extent controlled epigenetically through the internal structure of chromatin. This structure is dynamic and is influenced by different modifications of histone proteins. Various combinations of epigenetic modifications of histones point to different functional regions of the DNA, determining the so-called chromatin states. However, the characterization of chromatin states by DNA sequence properties remains largely unknown. In this study we aim to explore whether DNA sequence patterns in the human genome can characterize different chromatin states. Using DNA sequence motifs, we built binary classifiers for each chromatin state to evaluate whether a given genomic sequence is a good candidate for belonging to a particular chromatin state. Of the four classification algorithms (C4.5, Naive Bayes, Random Forest, and SVM) used for this purpose, the decision tree based classifiers (C4.5 and Random Forest) yielded the best results. Our results suggest that in general these models lack sufficient predictive power, although for four chromatin states (insulators, heterochromatin, and two types of copy number variation) we found that the presence of certain motifs in DNA sequences does imply an increased probability that such a sequence is one of these chromatin states.

  14. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    DEFF Research Database (Denmark)

    Nielsen, Peter E.

    2008-01-01

    sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technol. of protein dsDNA structures. (c) 2008 American Institute of Physics. [on SciFinder (R)] Udgivelsesdato...

  15. AU2EU : Privacy-preserving matching of DNA sequences

    NARCIS (Netherlands)

    Ignatenko, T.; Petkovic, M.; Naccache, D.; Sauveron, D.

    2014-01-01

    Advances in DNA sequencing create new opportunities for the use of DNA data in healthcare for diagnostic and treatment purposes, but also in many other health and well-being services. This brings new challenges with regard to the protection and use of this sensitive data. Thus, special technical

  16. Close sequence identity between ribosomal DNA episomes of the ...

    Indian Academy of Sciences (India)

    Unknown

    The restriction map of the E. dispar rDNA circle showed close simi- larity to EhR1 .... for 30 cycles in a DNA Thermal cycler (MJ Research,. USA). 3. .... by asterisk. The gaps show the variation between E. dispar and E. histolytica sequences.

  17. DNA Sequences of RAPD Fragments in the Egyptian cotton ...

    African Journals Online (AJOL)

    Random Amplified Polymorphic DNAs (RAPDs) is a DNA polymorphism assay based on the amplification of random DNA segments with single primers of arbitrary nucleotide sequence. Despite the fact that the RAPD technique has become a very powerful tool and has found use in numerous applications, yet, the nature of ...

  18. Effects of sequence on DNA wrapping around histones

    Science.gov (United States)

    Ortiz, Vanessa

    2011-03-01

    A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).

  19. Nuclear and Chloroplast DNA Variation Provides Insights into Population Structure and Multiple Origin of Native Aromatic Rices of Odisha, India.

    Directory of Open Access Journals (Sweden)

    Pritesh Sundar Roy

    Full Text Available A large number of short grain aromatic rices suited to the agro-climatic conditions and local preferences are grown in niche areas of different parts of India, and their diversity has evolved over centuries as a result of selection by traditional farmers. Systematic characterization of these specialty rices has not been attempted. An effort was made to characterize 126 aromatic short grain rice landraces, collected from 19 different districts in the State of Odisha, from eastern India. A high level of variation for grain quality and agronomic traits among these aromatic rices was observed, and genotypes having desirable phenotypic traits like erect flag leaf, thick culm, compact and dense panicles, short plant stature, early duration, superior yield and grain quality traits were identified. A total of 24 SSR markers corresponding to the hyper variable regions of rice chromosomes were used to understand the genetic diversity and to establish the genetic relationship among the aromatic short grain rice landraces at the nuclear genome level. SSR analysis of 126 genotypes from Odisha and 10 genotypes from other states revealed 110 alleles with an average of 4.583 per marker, and the Nei's genetic diversity value (He) was in the range of 0.034-0.880, revealing two sub-populations, SP 1 (membership percentage 27.1%) and SP 2 (72.9%). At the organelle genomic level for the C/A repeats in the PS1D sequence of chloroplasts, eight different plastid subtypes and 33 haplotypes were detected. The japonica (Nipponbare) subtype (6C7A) was detected in 100 genotypes, followed by the O. rufipogon (KF428978) subtype (6C6A) in 13 genotypes, while the indica (93-11) subtype (8C8A) was seen in 14 genotypes. The tree constructed based on haplotypes suggests that short grain aromatic landraces might have independent origin of these plastid subtypes. Notably a wide range of diversity was observed among these landraces cultivated in different parts confined to the State of Odisha.
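
Nei's gene diversity for a single marker is commonly computed as one minus the sum of squared allele frequencies. The Python sketch below uses that basic form; the abstract does not say which estimator the authors applied, so the omission of any small-sample correction is an assumption, and the allele counts are made up for illustration.

```python
def gene_diversity(allele_counts: list[int]) -> float:
    """Return He = 1 - sum(p_i^2) for one marker, from raw allele counts."""
    total = sum(allele_counts)
    if total == 0:
        raise ValueError("at least one allele must be observed")
    return 1.0 - sum((count / total) ** 2 for count in allele_counts)


# Hypothetical markers: one fixed allele, two equal alleles, four equal alleles.
for counts in ([10], [5, 5], [3, 3, 3, 3]):
    print(counts, round(gene_diversity(counts), 3))
```

A marker with one allele scores 0, and the value approaches 1 as the number of equally frequent alleles grows, which matches the wide 0.034-0.880 range reported across markers.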

  20. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    Science.gov (United States)

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  1. Highly multiplexed targeted DNA sequencing from single nuclei.

    Science.gov (United States)

    Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E

    2016-02-01

    Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.

  2. Googling DNA sequences on the World Wide Web.

    Science.gov (United States)

    Hajibabaei, Mehrdad; Singer, Gregory A C

    2009-11-10

    New web-based technologies provide an excellent opportunity for sharing and accessing information and using the web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment-independent, character-based algorithm based on dividing a sequence library (DNA barcodes) and a query sequence into words. The actual search is conducted by conventional search tools such as the freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre- and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows a high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
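
Dividing a library and a query into words, then ranking library entries by shared words, can be sketched in Python with an in-memory index standing in for the text search engine. The overlapping fixed-length words, the scoring by shared-word count, and the example names and sequences are all assumptions; the abstract does not specify the authors' word scheme.

```python
from collections import Counter, defaultdict


def split_into_words(sequence: str, word_length: int) -> list[str]:
    """Cut a sequence into overlapping words of a fixed length."""
    return [sequence[i:i + word_length]
            for i in range(len(sequence) - word_length + 1)]


def build_index(library: dict[str, str], word_length: int) -> dict[str, set[str]]:
    """Map every word to the names of the library sequences that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, sequence in library.items():
        for word in split_into_words(sequence, word_length):
            index[word].add(name)
    return index


def search(query: str, index: dict[str, set[str]], word_length: int) -> list[tuple[str, int]]:
    """Rank library sequences by the number of distinct query words they share."""
    hits: Counter = Counter()
    for word in set(split_into_words(query, word_length)):
        for name in index.get(word, ()):
            hits[name] += 1
    return hits.most_common()


# Hypothetical two-entry barcode library.
library = {"species_a": "ATGCGTACGTTAGC", "species_b": "TTGACCGTAAGGCT"}
index = build_index(library, 4)
print(search("GCGTACGT", index, 4))
```

No alignment is performed: the ranking depends only on exact word matches, which is what lets a general-purpose text search tool do the lookup.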

  3. Complete cDNA sequence coding for human docking protein

    Energy Technology Data Exchange (ETDEWEB)

    Hortsch, M; Labeit, S; Meyer, D I

    1988-01-11

    Docking protein (DP, or SRP receptor) is a rough endoplasmic reticulum (ER)-associated protein essential for the targeting and translocation of nascent polypeptides across this membrane. It specifically interacts with a cytoplasmic ribonucleoprotein complex, the signal recognition particle (SRP). The nucleotide sequence of cDNA encoding the entire human DP and its deduced amino acid sequence are given.

  4. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences...

  5. Sequence specificity of DNA cleavage by Micrococcus luteus γ endonuclease

    International Nuclear Information System (INIS)

    Hentosh, P.; Henner, W.D.; Reynolds, R.J.

    1985-01-01

    DNA fragments of defined sequence have been used to determine the sites of cleavage by γ-endonuclease activity in extracts prepared from Micrococcus luteus. End-labeled DNA restriction fragments of pBR322 DNA that had been irradiated under nitrogen in the presence of potassium iodide or t-butanol were treated with M. luteus γ endonuclease and analyzed on irradiated DNA preferentially at the positions of cytosines and thymines. DNA cleavage occurred immediately to the 3' side of pyrimidines in irradiated DNA and resulted in fragments that terminate in a 5'-phosphoryl group. These studies indicate that both altered cytosines and thymines may be important DNA lesions requiring repair after exposure to γ radiation

  6. Next-generation sequencing offers new insights into DNA degradation

    DEFF Research Database (Denmark)

    Overballe-Petersen, Søren; Orlando, Ludovic Antoine Alexandre; Willerslev, Eske

    2012-01-01

    The processes underlying DNA degradation are central to various disciplines, including cancer research, forensics and archaeology. The sequencing of ancient DNA molecules on next-generation sequencing platforms provides direct measurements of cytosine deamination, depurination and fragmentation...... rates that previously were obtained only from extrapolations of results from in vitro kinetic experiments performed over short timescales. For example, recent next-generation sequencing of ancient DNA reveals purine bases as one of the main targets of postmortem hydrolytic damage, through base...... elimination and strand breakage. It also shows substantially increased rates of DNA base-loss at guanosine. In this review, we argue that the latter results from an electron resonance structure unique to guanosine rather than adenosine having an extra resonance structure over guanosine as previously suggested....

  7. Mapping Base Modifications in DNA by Transverse-Current Sequencing

    Science.gov (United States)

    Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.

    2018-02-01

    Sequencing DNA modifications and lesions, such as methylation of cytosine and oxidation of guanine, is even more important and challenging than sequencing the genome itself. The traditional methods for detecting DNA modifications are either insensitive to these modifications or require additional processing steps to identify a particular type of modification. Transverse-current sequencing in nanopores can potentially identify the canonical bases and base modifications in the same run. In this work, we demonstrate that the most common DNA epigenetic modifications and lesions can be detected with any predefined accuracy based on their tunneling current signature. Our results are based on simulations of the nanopore tunneling current through DNA molecules, calculated using nonequilibrium electron-transport methodology within an effective multiorbital model derived from first-principles calculations, followed by a base-calling algorithm accounting for neighbor current-current correlations. This methodology can be integrated with existing experimental techniques to improve base-calling fidelity.

  8. Directed PCR-free engineering of highly repetitive DNA sequences

    Directory of Open Access Journals (Sweden)

    Preissler Steffen

    2011-09-01

    Full Text Available Background: Highly repetitive nucleotide sequences are commonly found in nature, e.g. in telomeres, microsatellite DNA, and polyadenine (poly(A)) tails of eukaryotic messenger RNA, as well as in several inherited human disorders linked to trinucleotide repeat expansions in the genome. Therefore, studying repetitive sequences is of biological, biotechnological and medical relevance. However, cloning of such repetitive DNA sequences is challenging because specific PCR-based amplification is hampered by the lack of unique primer binding sites, resulting in unspecific products. Results: For the PCR-free generation of repetitive DNA sequences we used antiparallel oligonucleotides flanked by restriction sites of Type IIS endonucleases. The arrangement of recognition sites allowed for stepwise and seamless elongation of repetitive sequences. This facilitated the assembly of repetitive DNA segments and open reading frames encoding polypeptides with periodic amino acid sequences of any desired length. By this strategy we cloned a series of polyglutamine encoding sequences as well as highly repetitive polyadenine tracts. Such repetitive sequences can be used for diverse biotechnological applications. As an example, the polyglutamine sequences were expressed as His6-SUMO fusion proteins in Escherichia coli cells to study their aggregation behavior in vitro. The His6-SUMO moiety enabled affinity purification of the polyglutamine proteins, increased their solubility, and allowed controlled induction of the aggregation process. We successfully purified the fusion proteins and provide an example of their applicability in filter retardation assays. Conclusion: Our seamless cloning strategy is PCR-free and allows the directed and efficient generation of highly repetitive DNA sequences of defined lengths by simple standard cloning procedures.

  9. Spreadsheet-based program for alignment of overlapping DNA sequences.

    Science.gov (United States)

    Anbazhagan, R; Gabrielson, E

    1999-06-01

    Molecular biology laboratories frequently face the challenge of aligning small overlapping DNA sequences derived from a long DNA segment. Here, we present a short program that can be used to adapt Excel spreadsheets as a tool for aligning DNA sequences, regardless of their orientation. The program runs on any Windows or Macintosh operating system computer with Excel 97 or Excel 98. The program is available for use as an Excel file, which can be downloaded from the BioTechniques Web site. Upon execution, the program opens a specially designed customized workbook and is capable of identifying overlapping regions between two sequence fragments and displaying the sequence alignment. It also performs a number of specialized functions such as recognition of restriction enzyme cutting sites and CpG island mapping without costly specialized software.
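
Finding the overlap between two fragments regardless of orientation can be sketched in Python as an exact suffix-prefix match that also tries the reverse complement. This is not the spreadsheet program described above; the exact-match rule, the minimum overlap of 4 bases, and the example fragments are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def suffix_prefix_overlap(left: str, right: str, min_length: int = 4) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for length in range(min(len(left), len(right)), min_length - 1, -1):
        if left[-length:] == right[:length]:
            return length
    return 0


def best_overlap(a: str, b: str, min_length: int = 4) -> tuple[int, str, str]:
    """Try b in both orientations on either side of a.

    Returns (overlap length, arrangement, merged sequence); the merged
    sequence is empty when no overlap of at least min_length exists.
    """
    candidates = []
    for label, other in (("forward", b), ("reverse complement", reverse_complement(b))):
        n = suffix_prefix_overlap(a, other, min_length)
        candidates.append((n, f"a then b ({label})", a + other[n:]))
        m = suffix_prefix_overlap(other, a, min_length)
        candidates.append((m, f"b ({label}) then a", other + a[m:]))
    best = max(candidates, key=lambda c: c[0])
    return best if best[0] > 0 else (0, "no overlap", "")


# Hypothetical fragments: b was read from the opposite strand.
a = "ATGCGTACGTTAGC"
b = "TCATGGCTAAC"
print(best_overlap(a, b))
```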

  10. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Directory of Open Access Journals (Sweden)

    Jason D Thompson

    Full Text Available Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  11. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Science.gov (United States)

    Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

    2012-01-01

    Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  12. Chaos game representation (CGR)-walk model for DNA sequences

    International Nuclear Information System (INIS)

    Jie, Gao; Zhen-Yuan, Xu

    2009-01-01

    Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to determine the coordinates of their positions in a continuous space. This distribution of positions has two features: it is unique, and the source sequence can be recovered from the coordinates, so that the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model based on CGR coordinates is proposed for DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA (p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into the DNA sequence analysis. This model is applied to simulating real CGR-walk sequence data of ten genomic sequences. Remarkably long-range correlations are uncovered in the data, and the results from these models are reasonably fitted with those from the ARFIMA (p, d, q) model. (cross-disciplinary physics and related areas of science and technology)
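
    The CGR mapping described in this record can be sketched as follows. This is a minimal illustration, not the authors' implementation: the corner layout (A, C, G, T on the unit square) and the starting point (1/2, 1/2) are assumed conventions that the abstract does not specify, and the function names are hypothetical. Exact fractions are used so that the source sequence can be recovered from the final coordinate without rounding error.

```python
from fractions import Fraction

# Assumed corner layout (a common convention, not stated in the abstract).
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_points(seq, start=(Fraction(1, 2), Fraction(1, 2))):
    """Return the CGR coordinate reached after each nucleotide of seq."""
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_decode(point, length):
    """Recover the last `length` bases from a single CGR coordinate."""
    by_corner = {corner: base for base, corner in CORNERS.items()}
    x, y = point
    bases = []
    for _ in range(length):
        # The half of the square a coordinate lies in identifies the corner.
        cx, cy = int(x > Fraction(1, 2)), int(y > Fraction(1, 2))
        bases.append(by_corner[(cx, cy)])
        x, y = 2 * x - cx, 2 * y - cy  # undo one halving step
    return "".join(reversed(bases))
```

    For example, `cgr_decode(cgr_points("GATTACA")[-1], 7)` gives back the input string, which illustrates the recoverability property mentioned in the abstract. The conversion of the coordinates into a time series and the ARFIMA fit are not sketched here, because the abstract does not describe them in enough detail.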

  13. Mitochondrial DNA sequence-based phylogenetic relationship ...

    Indian Academy of Sciences (India)

    cophaga ranges from 0.037–0.106 and 0.049–0.207 for COI and ND5 genes, respectively (tables 2 and 3). Analysis of genetic distance on the basis of sequence difference for both the mitochondrial genes shows very little genetic difference. The discrepancy in the phylogenetic trees based on individual genes may be due ...

  14. Novel DNA sequence detection method based on fluorescence energy transfer

    International Nuclear Information System (INIS)

    Kobayashi, S.; Tamiya, E.; Karube, I.

    1987-01-01

    Detection of specific DNA sequences (DNA analysis) has recently become more important for the diagnosis of viral genomes causing infectious diseases and of human sequences related to inherited disorders. These methods typically involve electrophoresis, immobilization of DNA on a solid support, hybridization to a complementary probe, and detection using probes labeled with 32P or labeled nonisotopically with a biotin-avidin-enzyme system. These techniques are highly effective, but they are very time-consuming and expensive. The principle of fluorescence energy transfer is that light energy from an excited donor fluorophore is transferred to an acceptor fluorophore if the acceptor is in the vicinity of the donor and the emission spectrum of the donor overlaps the excitation spectrum of the acceptor. In this study, fluorescence energy transfer was applied to the detection of specific DNA sequences using the hybridization method. The analyte, single-stranded DNA labeled with the donor fluorophore, is hybridized to a probe DNA labeled with the acceptor. Because of complementary DNA duplex formation, the two fluorophores are brought close to each other and fluorescence energy transfer occurs.

  15. Management of High-Throughput DNA Sequencing Projects: Alpheus.

    Science.gov (United States)

    Miller, Neil A; Kingsmore, Stephen F; Farmer, Andrew; Langley, Raymond J; Mudge, Joann; Crow, John A; Gonzalez, Alvaro J; Schilkey, Faye D; Kim, Ryan J; van Velkinburgh, Jennifer; May, Gregory D; Black, C Forrest; Myers, M Kathy; Utsey, John P; Frost, Nicholas S; Sugarbaker, David J; Bueno, Raphael; Gullans, Stephen R; Baxter, Susan M; Day, Steve W; Retzel, Ernest F

    2008-12-26

    High-throughput DNA sequencing has enabled systems biology to begin to address areas in health, agricultural and basic biological research. Concomitant with the opportunities is an absolute necessity to manage significant volumes of high-dimensional and inter-related data and analysis. Alpheus is an analysis pipeline, database and visualization software for use with massively parallel DNA sequencing technologies that feature multi-gigabase throughput characterized by relatively short reads, such as Illumina-Solexa (sequencing-by-synthesis), Roche-454 (pyrosequencing) and Applied Biosystems' SOLiD (sequencing-by-ligation). Alpheus enables alignment to reference sequence(s), detection of variants and enumeration of sequence abundance, including expression levels in transcriptome sequence. Alpheus is able to detect several types of variants, including non-synonymous and synonymous single nucleotide polymorphisms (SNPs), insertions/deletions (indels), premature stop codons, and splice isoforms. Variant detection is aided by the ability to filter variant calls based on consistency, expected allele frequency, sequence quality, coverage, and variant type in order to minimize false positives while maximizing the identification of true positives. Alpheus also enables comparisons of genes with variants between cases and controls or bulk segregant pools. Sequence-based differential expression comparisons can be developed, with data export to SAS JMP Genomics for statistical analysis.

  16. DNA-PK dependent targeting of DNA-ends to a protein complex assembled on matrix attachment region DNA sequences

    International Nuclear Information System (INIS)

    Mauldin, S.K.; Getts, R.C.; Perez, M.L.; DiRienzo, S.; Stamato, T.D.

    2003-01-01

    Full text: We find that nuclear protein extracts from mammalian cells contain an activity that allows DNA ends to associate with circular pUC18 plasmid DNA. This activity requires the catalytic subunit of DNA-PK (DNA-PKcs) and Ku since it was not observed in mutants lacking Ku or DNA-PKcs but was observed when purified Ku/DNA-PKcs was added to these mutant extracts. Competition experiments between pUC18 and pUC18 plasmids containing various nuclear matrix attachment region (MAR) sequences suggest that DNA ends preferentially associate with plasmids containing MAR DNA sequences. At a 1:5 mass ratio of MAR to pUC18, approximately equal amounts of DNA end binding to the two plasmids were observed, while at a 1:1 ratio no pUC18 end-binding was observed. Calculation of relative binding activities indicates that DNA-end binding activity to MAR sequences was 7- to 21-fold higher than to pUC18. Western analysis of proteins bound to pUC18 and MAR plasmids indicates that XRCC4, DNA ligase IV, scaffold attachment factor A, topoisomerase II, and poly(ADP-ribose) polymerase preferentially associate with the MAR plasmid in the absence or presence of DNA ends. In contrast, Ku and DNA-PKcs were found on the MAR plasmid only in the presence of DNA ends. After electroporation of a 32P-labeled DNA probe into human cells and cell fractionation, 87% of the total intercellular radioactivity remained in nuclei after a 0.5M NaCl extraction, suggesting the probe was strongly bound in the nucleus. The above observations raise the possibility that DNA-PK targets DNA ends to a repair and/or DNA damage signaling complex which is assembled on MAR sites in the nucleus.

  17. Transcriptome analysis of ectopic chloroplast development in green curd cauliflower (Brassica oleracea L. var. botrytis)

    Directory of Open Access Journals (Sweden)

    Zhou Xiangjun

    2011-11-01

    Full Text Available Abstract. Background: Chloroplasts are the green plastids where photosynthesis takes place. The biogenesis of chloroplasts requires the coordinate expression of both nuclear and chloroplast genes and is regulated by developmental and environmental signals. Despite extensive studies of this process, the genetic basis and the regulatory control of chloroplast biogenesis and development remain to be elucidated. Results: The green cauliflower mutant causes ectopic development of chloroplasts in the curd tissue of the plant, turning the otherwise white curd green. To investigate the transcriptional control of chloroplast development, we compared gene expression between green and white curds using the RNA-seq approach. Deep sequencing produced over 15 million reads with lengths of 86 base pairs from each cDNA library. A total of 7,155 genes were found to exhibit at least 3-fold changes in expression between green and white curds. These included light-regulated genes, genes encoding chloroplast constituents, and genes involved in chlorophyll biosynthesis. Moreover, we discovered that the cauliflower ELONGATED HYPOCOTYL5 (BoHY5) was more highly expressed in green curds than in white curds and that 2616 HY5-targeted genes, including 1600 up-regulated genes and 1016 down-regulated genes, were differentially expressed in green in comparison to white curd tissue. All these 1600 up-regulated genes were HY5-targeted genes in the light. Conclusions: The genome-wide profiling of gene expression by RNA-seq in green curds led to the identification of large numbers of genes associated with chloroplast development, and suggested the role of regulatory genes in the high hierarchy of light signaling pathways in mediating the ectopic chloroplast development in the green curd cauliflower mutant.

  18. Transcriptome analysis of ectopic chloroplast development in green curd cauliflower (Brassica oleracea L. var. botrytis).

    Science.gov (United States)

    Zhou, Xiangjun; Fei, Zhangjun; Thannhauser, Theodore W; Li, Li

    2011-11-23

    Chloroplasts are the green plastids where photosynthesis takes place. The biogenesis of chloroplasts requires the coordinate expression of both nuclear and chloroplast genes and is regulated by developmental and environmental signals. Despite extensive studies of this process, the genetic basis and the regulatory control of chloroplast biogenesis and development remain to be elucidated. The green cauliflower mutant causes ectopic development of chloroplasts in the curd tissue of the plant, turning the otherwise white curd green. To investigate the transcriptional control of chloroplast development, we compared gene expression between green and white curds using the RNA-seq approach. Deep sequencing produced over 15 million reads with lengths of 86 base pairs from each cDNA library. A total of 7,155 genes were found to exhibit at least 3-fold changes in expression between green and white curds. These included light-regulated genes, genes encoding chloroplast constituents, and genes involved in chlorophyll biosynthesis. Moreover, we discovered that the cauliflower ELONGATED HYPOCOTYL5 (BoHY5) was more highly expressed in green curds than in white curds and that 2616 HY5-targeted genes, including 1600 up-regulated genes and 1016 down-regulated genes, were differentially expressed in green in comparison to white curd tissue. All these 1600 up-regulated genes were HY5-targeted genes in the light. The genome-wide profiling of gene expression by RNA-seq in green curds led to the identification of large numbers of genes associated with chloroplast development, and suggested the role of regulatory genes in the high hierarchy of light signaling pathways in mediating the ectopic chloroplast development in the green curd cauliflower mutant.

  19. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Directory of Open Access Journals (Sweden)

    Baldwin Stephen A

    2011-03-01

    Full Text Available Abstract. Background: Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results: The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions: PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  20. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities.

    Science.gov (United States)

    Troshin, Peter V; Postis, Vincent Lg; Ashworth, Denise; Baldwin, Stephen A; McPherson, Michael J; Barton, Geoffrey J

    2011-03-07

    Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  1. Dialects of the DNA uptake sequence in Neisseriaceae.

    Directory of Open Access Journals (Sweden)

    Stephan A Frye

    2013-04-01

    Full Text Available In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS-dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5'-CTG-3' is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS-dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic

  2. Dialects of the DNA Uptake Sequence in Neisseriaceae

    Science.gov (United States)

    Frye, Stephan A.; Nilsen, Mariann; Tønjum, Tone; Ambur, Ole Herman

    2013-01-01

    In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS–dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5′-CTG-3′ is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS–dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic transformation

  3. The complete chloroplast genome sequence of Taxus chinensis var. mairei (Taxaceae): loss of an inverted repeat region and comparative analysis with related species.

    Science.gov (United States)

    Zhang, Yanzhen; Ma, Ji; Yang, Bingxian; Li, Ruyi; Zhu, Wei; Sun, Lianli; Tian, Jingkui; Zhang, Lin

    2014-05-01

    Taxus chinensis var. mairei (Taxaceae) is a yew variety native to China. This plant is one of the sources of paclitaxel, which has been a promising antineoplastic chemotherapy drug over the last decade. We have determined the complete nucleotide sequence of the chloroplast (cp) genome of T. chinensis var. mairei. The T. chinensis var. mairei cp genome is 129,513 bp in length, with 113 single copy genes and two duplicated genes (trnI-CAU, trnQ-UUG). Among the 113 single copy genes, 9 are intron-containing. Compared to other land plant cp genomes, the T. chinensis var. mairei cp genome has lost one of the large inverted repeats (IRs) found in angiosperms, ferns, liverworts, and gymnosperms such as Cycas revoluta and Ginkgo biloba L. Compared to related species, the gene order of T. chinensis var. mairei has a large inversion of ~110 kb including 91 genes (from rps18 to accD) with gene contents unarranged. Repeat analysis identified 48 direct and 2 inverted repeats 30 bp long or longer with a sequence identity greater than 90%. Repeated short segments were found in genes rps18, rps19 and clpP. Analysis also revealed 22 simple sequence repeat (SSR) loci, almost all of which are composed of A or T.
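
    As an illustration of the kind of SSR scan mentioned in this record, the sketch below finds mononucleotide repeats with a regular expression. It is not the authors' pipeline: the function name is hypothetical, and the minimum run length of 10 is an assumed threshold that the abstract does not state. Only single-base repeats are covered, which matches the observation that almost all reported SSR loci consist of A or T.

```python
import re


def find_mono_ssrs(seq, min_len=10):
    """Return (start, base, length) for each single-base run of at least min_len.

    min_len=10 is an assumed threshold, not a value taken from the record.
    """
    # One captured base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.group(1), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]
```

    Positions are 0-based. Extending the scan to di- or trinucleotide motifs would need a different pattern and further threshold choices.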

  4. Mitochondrial DNA sequence evolution in the Arctoidea.

    OpenAIRE

    Zhang, Y P; Ryder, O A

    1993-01-01

    Some taxa in the superfamily Arctoidea, such as the giant panda and the lesser panda, have presented puzzles to taxonomists. In the present study, approximately 397 bases of the cytochrome b gene, 364 bases of the 12S rRNA gene, and 74 bases of the tRNA(Thr) and tRNA(Pro) genes from the giant panda, lesser panda, kinkajou, raccoon, coatimundi, and all species of the Ursidae were sequenced. The high transition/transversion ratios in cytochrome b and RNA genes prior to saturation suggest that t...

  5. Noninvasive prenatal paternity testing (NIPAT) through maternal plasma DNA sequencing

    DEFF Research Database (Denmark)

    Jiang, Haojun; Xie, Yifan; Li, Xuchao

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have already been used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we developed a noninvasive prenatal paternity testing (NIPAT) method based on SNP typing with maternal plasma DNA sequencing. We evaluated the influencing factors (minor allele frequency (MAF), the number of total SNPs, fetal fraction and effective sequencing depth) and designed three different selective SNP panels...... paternity test using STR multiplex system. Our study here proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future.

  6. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.

    2008-01-01

    We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data that the method outperforms BLAST searches as a measure of confidence and can help eliminate 80% of all false assignments based on best BLAST hit. However, the most important advance of the method is that it provides statistically meaningful measures of confidence. We apply the method to a re-analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are composed of a combination of both Neanderthal and modern human DNA.

  7. Network clustering coefficient approach to DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gerhardt, Guenther J.L. [Universidade Federal do Rio Grande do Sul-Hospital de Clinicas de Porto Alegre, Rua Ramiro Barcelos 2350/sala 2040/90035-003 Porto Alegre (Brazil); Departamento de Fisica e Quimica da Universidade de Caxias do Sul, Rua Francisco Getulio Vargas 1130, 95001-970 Caxias do Sul (Brazil); Lemke, Ney [Programa Interdisciplinar em Computacao Aplicada, Unisinos, Av. Unisinos, 950, 93022-000 Sao Leopoldo, RS (Brazil); Corso, Gilberto [Departamento de Biofisica e Farmacologia, Centro de Biociencias, Universidade Federal do Rio Grande do Norte, Campus Universitario, 59072 970 Natal, RN (Brazil)]. E-mail: corso@dfte.ufrn.br

    2006-05-15

    In this work we propose an alternative DNA sequence analysis tool based on graph theoretical concepts. The methodology investigates the path topology of an organism's genome through a triplet network. In this network, triplets in the DNA sequence are vertices, and two vertices are connected if they occur juxtaposed on the genome. We characterize this network topology by measuring the clustering coefficient. We test our methodology against two main biases: the guanine-cytosine (GC) content and the 3-bp (base pair) periodicity of the DNA sequence. We perform the test by constructing random networks with variable GC content and imposed 3-bp periodicity. A test group of organisms is constructed, and we investigate the methodology in the light of the constructed random networks. We conclude that the clustering coefficient is a valuable tool since it gives information that is not trivially contained in either the 3-bp periodicity or the variable GC content.
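
    The triplet network and its clustering coefficient can be sketched as below. Several choices here are assumptions rather than details from the abstract: triplets are read without overlap in a single frame, edges are undirected, self-loops are ignored, and vertices with fewer than two neighbours contribute zero to the average. The function names are hypothetical.

```python
from itertools import combinations


def triplet_network(seq):
    """Build an undirected graph linking triplets that are adjacent in seq."""
    seq = seq.upper()
    # Assumption: non-overlapping triplets read in frame 0.
    triplets = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    graph = {t: set() for t in triplets}
    for a, b in zip(triplets, triplets[1:]):
        if a != b:  # ignore self-loops
            graph[a].add(b)
            graph[b].add(a)
    return graph


def average_clustering(graph):
    """Mean local clustering coefficient over all vertices of the graph."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue  # such vertices count as 0 in the average
        # Count edges that exist among the neighbours of this vertex.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
        total += 2 * links / (k * (k - 1))
    return total / len(graph) if graph else 0.0
```

    A sequence whose triplets form a closed triangle, such as "AAACCCGGGAAAGGG", yields an average clustering coefficient of 1, while a simple path such as "AAACCCGGG" yields 0. Comparing the value for a genome with values from randomized sequences of matched GC content would follow the spirit of the test described above.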

  8. Statistical properties and fractals of nucleotide clusters in DNA sequences

    International Nuclear Information System (INIS)

    Sun Tingting; Zhang Linxi; Chen Jin; Jiang Zhouting

    2004-01-01

    Statistical properties of nucleotide clusters in DNA sequences and their fractals are investigated in this paper. The average size of nucleotide clusters in non-coding sequence is larger than that in coding sequence. We investigate the cluster-size distribution P(S) for human chromosomes 21 and 22, and the results are different from previous works. The cluster-size distribution P(S1 + S2), where S1 + S2 is the total size of a sequential Pu-cluster and Py-cluster, is studied. We observe that P(S1 + S2) follows an exponential decay in both coding and non-coding sequences. However, we get different results for human chromosomes 21 and 22. The probability distribution P(S1, S2) of nucleotide clusters, with S1 and S2 the sizes of a sequential Pu-cluster and Py-cluster respectively, is also examined. In addition, some linear correlations are obtained in the double logarithmic plots of the fluctuation F(l) versus nucleotide cluster distance l along the DNA chain. The power spectra of nucleotide clusters are also discussed, and it is concluded that the curves are flat and hardly change, and that the 1/3 frequency is observed in neither coding nor non-coding sequence. These investigations can provide some insights into the nucleotide clusters of DNA sequences.
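
    A cluster-size distribution of the kind discussed in this record can be sketched as follows, assuming that a nucleotide cluster means a maximal run of consecutive purines (A, G) or pyrimidines (C, T). The abstract does not define the term precisely, so this reading, like the function names, is an assumption. The input is assumed to contain only A, C, G and T.

```python
from collections import Counter
from itertools import groupby

PURINES = set("AG")


def cluster_sizes(seq):
    """Return (kind, size) for each maximal purine ('Pu') or pyrimidine ('Py') run."""
    # Any base that is not A or G is labelled 'Py', so input should be A/C/G/T only.
    kinds = ("Pu" if base in PURINES else "Py" for base in seq.upper())
    return [(kind, sum(1 for _ in run)) for kind, run in groupby(kinds)]


def size_distribution(seq):
    """Empirical P(S): fraction of clusters that have each size S."""
    sizes = [size for _, size in cluster_sizes(seq)]
    counts = Counter(sizes)
    return {s: n / len(sizes) for s, n in sorted(counts.items())}
```

    For "AAGCTTAG" the clusters are a purine run of 3, a pyrimidine run of 3 and a purine run of 2. Summing the sizes of neighbouring Pu and Py clusters from `cluster_sizes` would give the S1 + S2 values whose distribution the abstract describes.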

  9. DNA sequence responsible for the amplification of adjacent genes.

    Science.gov (United States)

    Pasion, S G; Hartigan, J A; Kumar, V; Biswas, D K

    1987-10-01

    A 10.3-kb DNA fragment in the 5'-flanking region of the rat prolactin (rPRL) gene was isolated from F1BGH(1)2C1, a strain of rat pituitary tumor cells (GH cells) that produces prolactin in response to 5-bromodeoxyuridine (BrdU). Following transfection and integration into genomic DNA of recipient mouse L cells, this DNA induced amplification of the adjacent thymidine kinase gene from Herpes simplex virus type 1 (HSV1TK). We confirmed the ability of this "Amplicon" sequence to induce amplification of other linked or unlinked genes in DNA-mediated gene transfer studies. When transferred into the mouse L cells with the 10.3-5'rPRL gene sequence of BrdU-responsive cells, both the human growth hormone and the HSV1TK genes are amplified in response to 5-bromodeoxyuridine. This observation is substantiated by BrdU-induced amplification of the cotransferred bacterial Neo gene. Cotransfection studies reveal that the BrdU-induced amplification capability is associated with a 4-kb DNA sequence in the 5'-flanking region of the rPRL gene of BrdU-responsive cells. These results demonstrate that genes of heterologous origin, linked or unlinked, and selected or unselected, can be coamplified when located within the amplification boundary of the Amplicon sequence.

  10. Anaplasma phagocytophilum in Danish sheep: confirmation by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Thamsborg Stig M

    2009-12-01

    Full Text Available Abstract. Background: The presence of Anaplasma phagocytophilum, an Ixodes ricinus transmitted bacterium, was investigated in two flocks of Danish grazing lambs. Direct PCR detection was performed on DNA extracted from blood and serum with subsequent confirmation by DNA sequencing. Methods: 31 samples obtained from clinically normal lambs in 2000 from Fussingø, Jutland and 12 samples from ten lambs and two ewes from a clinical outbreak at Feddet, Zealand in 2006 were included in the study. Some of the animals from Feddet had shown clinical signs of polyarthritis and general unthriftiness prior to sampling. DNA extraction was optimized from blood and serum and detection achieved by a 16S rRNA targeted PCR with verification of the product by DNA sequencing. Results: Five DNA extracts were found positive by PCR, including two samples from 2000 and three from 2006. For both series of samples the product was verified as A. phagocytophilum by DNA sequencing. Conclusions: A. phagocytophilum was detected by molecular methods for the first time in Danish grazing lambs during the two seasons investigated (2000 and 2006).

  11. Isolation of a sex-linked DNA sequence in cranes.

    Science.gov (United States)

    Duan, W; Fuerst, P A

    2001-01-01

    A female-specific DNA fragment (CSL-W; crane sex-linked DNA on W chromosome) was cloned from female whooping cranes (Grus americana). From the nucleotide sequence of CSL-W, a set of polymerase chain reaction (PCR) primers was identified which amplify a 227-230 bp female-specific fragment from all existing crane species and some other noncrane species. A duplicated version of the DNA segment, which is found to have a larger size (231-235 bp) than CSL-W in both sexes, was also identified, and was designated CSL-NW (crane sex-linked DNA on non-W chromosome). The nucleotide similarity between the sequences of CSL-W and CSL-NW from whooping cranes was 86.3%. The CSL primers do not amplify any sequence from mammalian DNA, limiting the potential for contamination from human sources. Using the CSL primers in combination with a quick DNA extraction method allows the noninvasive identification of crane gender in less than 10 h. A test of the methodology was carried out on fully developed body feathers from 18 captive cranes and resulted in 100% successful identification.

  12. DNA Qualification Workflow for Next Generation Sequencing of Histopathological Samples

    Science.gov (United States)

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T.; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  13. DNA qualification workflow for next generation sequencing of histopathological samples.

    Directory of Open Access Journals (Sweden)

    Michele Simbolo

    Full Text Available Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard

  14. Light-stimulated accumulation of transcripts of nuclear and chloroplast genes for ribulosebisphosphate carboxylase

    Energy Technology Data Exchange (ETDEWEB)

    Smith, S M; Ellis, R J

    1981-01-01

    The chloroplast enzyme, ribulosebisphosphate carboxylase, consists of large subunit polypeptides encoded in the chloroplast genome and small subunit polypeptides encoded in the nuclear genome. Cloned DNA complementary to the small subunit mRNA hybridizes to a single RNA species of 900-1000 nucleotides in both total and poly(A)-containing RNA from leaves of Pisum sativum, but does not hybridize to chloroplast RNA. Small subunit cDNA hybridizes to at least three RNA species from nuclei, two of which are of higher molecular weight than the mature mRNA. A cloned large subunit DNA sequence hybridizes to a single species of Pisum chloroplast RNA containing approximately 1700 nucleotides, but does not hybridize to nuclear RNA. The light-stimulation of carboxylase accumulation reflects increases in the amounts of transcripts for both subunits in total leaf RNA. Transcripts of the small subunit gene are more abundant in nuclear RNA from light-grown leaves than in that from dark-grown leaves. These results suggest that the stimulation of carboxylase accumulation by light is mediated at the level of either transcription or RNA turnover in both nucleus and chloroplast.

  15. Compilation and analysis of Escherichia coli promoter DNA sequences.

    OpenAIRE

    Hawley, D K; McClure, W R

    1983-01-01

    The DNA sequences of 168 promoter regions (-50 to +10) for Escherichia coli RNA polymerase were compiled. The complete listing was divided into two groups depending upon whether or not the promoter had been defined by genetic (promoter mutations) or biochemical (5' end determination) criteria. A consensus promoter sequence based on homologies among 112 well-defined promoters was determined that was in substantial agreement with previous compilations. In addition, we have tabulated 98 promoter ...

  16. Polyfluorophore Labels on DNA: Dramatic Sequence Dependence of Quenching

    Science.gov (United States)

    Teo, Yin Nah; Wilson, James N.

    2010-01-01

    We describe studies carried out in the DNA context to test how a common fluorescence quencher, dabcyl, interacts with oligodeoxynucleoside fluorophores (ODFs), a system of stacked, electronically interacting fluorophores built on a DNA scaffold. We tested twenty different tetrameric ODF sequences containing varied combinations and orderings of pyrene (Y), benzopyrene (B), perylene (E), dimethylaminostilbene (D), and spacer (S) monomers conjugated to the 3′ end of a DNA oligomer. Hybridization of this probe sequence to a dabcyl-labeled complementary strand resulted in strong quenching of fluorescence in 85% of the twenty ODF sequences. The high efficiency of quenching was also established by their large Stern–Volmer constants (KSV) of between 2.1 × 10⁴ and 4.3 × 10⁵ M⁻¹, measured with a free dabcyl quencher. Interestingly, quenching of ODFs displayed strong sequence dependence. This was particularly evident in anagrams of ODF sequences; for example, the sequence BYDS had a KSV that was approximately two orders of magnitude greater than that of BSDY, which has the same dye composition. Other anagrams, for example EDSY and ESYD, also displayed different responses upon quenching by dabcyl. Analysis of spectra showed that apparent excimer and exciplex emission bands were quenched with much greater efficiency compared to monomer emission bands by at least an order of magnitude. This suggests an important role played by delocalized excited states of the π stack of fluorophores in the amplified quenching of fluorescence. PMID:19780115
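    To give a feel for the size of the Stern–Volmer constants quoted above, the following sketch applies the standard linear Stern–Volmer relation, F0/F = 1 + KSV·[Q]. It is an illustration only: it assumes purely linear quenching, and the function names are hypothetical rather than taken from the paper. Only the two KSV bounds come from the abstract.

```python
# Linear Stern-Volmer relation: F0 / F = 1 + Ksv * [Q]
# Assumption: quenching follows the simple linear model at all concentrations.

KSV_LOW = 2.1e4   # M^-1, lower bound reported in the abstract
KSV_HIGH = 4.3e5  # M^-1, upper bound reported in the abstract


def fraction_quenched(ksv_per_molar: float, quencher_molar: float) -> float:
    """Fraction of fluorescence lost, 1 - F/F0, at a given quencher concentration."""
    f0_over_f = 1.0 + ksv_per_molar * quencher_molar
    return 1.0 - 1.0 / f0_over_f


def half_quenching_concentration(ksv_per_molar: float) -> float:
    """Quencher concentration (M) at which F = F0 / 2, i.e. [Q] = 1 / Ksv."""
    return 1.0 / ksv_per_molar


for ksv in (KSV_LOW, KSV_HIGH):
    q_half_micromolar = half_quenching_concentration(ksv) * 1e6
    print(f"Ksv = {ksv:.1e} M^-1 -> half quenching at about {q_half_micromolar:.1f} uM")
```

    Under this model, a larger KSV simply means that less free quencher is needed to halve the emission, which is why the two-orders-of-magnitude spread between anagram sequences is notable.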

  17. Detecting differential DNA methylation from sequencing of bisulfite converted DNA of diverse species.

    Science.gov (United States)

    Huh, Iksoo; Wu, Xin; Park, Taesung; Yi, Soojin V

    2017-07-21

    DNA methylation is one of the most extensively studied epigenetic modifications of genomic DNA. In recent years, sequencing of bisulfite-converted DNA, particularly via next-generation sequencing technologies, has become a widely popular method to study DNA methylation. This method can be readily applied to a variety of species, dramatically expanding the scope of DNA methylation studies beyond the traditionally studied human and mouse systems. In parallel to the increasing wealth of genomic methylation profiles, many statistical tools have been developed to detect differentially methylated loci (DMLs) or differentially methylated regions (DMRs) between biological conditions. We discuss and summarize several key properties of currently available tools to detect DMLs and DMRs from sequencing of bisulfite-converted DNA. However, the majority of the statistical tools developed for DML/DMR analyses have been validated using only mammalian data sets, and less priority has been placed on the analyses of invertebrate or plant DNA methylation data. We demonstrate that genomic methylation profiles of non-mammalian species are often highly distinct from those of mammalian species using examples of honey bees and humans. We then discuss how such differences in data properties may affect statistical analyses. Based on these differences, we provide three specific recommendations to improve the power and accuracy of DML and DMR analyses of invertebrate data when using currently available statistical tools. These considerations should facilitate systematic and robust analyses of DNA methylation from diverse species, thus advancing our understanding of DNA methylation. © The Author 2017. Published by Oxford University Press.

  18. Population genetics, phylogenomics and hybrid speciation of Juglans in China determined from whole chloroplast genomes, transcriptomes, and genotyping-by-sequencing (GBS).

    Science.gov (United States)

    Zhao, Peng; Zhou, Hui-Juan; Potter, Daniel; Hu, Yi-Heng; Feng, Xiao-Jia; Dang, Meng; Feng, Li; Zulfiqar, Saman; Liu, Wen-Zhe; Zhao, Gui-Fang; Woeste, Keith

    2018-04-18

    Genomic data are a powerful tool for elucidating the processes involved in the evolution and divergence of species. The speciation and phylogenetic relationships among Chinese Juglans remain unclear. Here, we used results from phylogenomic and population genetic analyses, transcriptomics, Genotyping-By-Sequencing (GBS), and whole chloroplast genomes (Cp genome) data to infer processes of lineage formation among the five native Chinese species of the walnut genus (Juglans, Juglandaceae), a widespread, economically important group. We found that the processes of isolation generated diversity during glaciations, but that the recent range expansion of J. regia, probably from multiple refugia, led to hybrid formation both within and between sections of the genus. In southern China, human dispersal of J. regia brought it into contact with J. sigillata, which we determined to be an ecotype of J. regia that is now maintained as a landrace. In northern China, walnut hybridized with a distinct lineage of J. mandshurica to form J. hopeiensis, a controversial taxon (considered threatened) that our data indicate is a horticultural variety. Comparisons among whole chloroplast genomes and nuclear transcriptome analyses provided conflicting evidence for the timing of the divergence of Chinese Juglans taxa. J. cathayensis and J. mandshurica are poorly differentiated based on our genomic data. Reconstruction of Juglans evolutionary history indicates that episodes of climatic variation over the past 4.5 to 33.80 million years, associated with glacial advances and retreats and population isolation, have shaped Chinese walnut demography and evolution, even in the presence of gene flow and introgression. Copyright © 2018 Elsevier Inc. All rights reserved.

  19. The cDNA sequence of a neutral horseradish peroxidase.

    Science.gov (United States)

    Bartonek-Roxå, E; Eriksson, H; Mattiasson, B

    1991-02-16

    A cDNA clone encoding a horseradish (Armoracia rusticana) peroxidase has been isolated and characterized. The cDNA contains 1378 nucleotides excluding the poly(A) tail and the deduced protein contains 327 amino acids which includes a 28 amino acid leader sequence. The predicted amino acid sequence is nine amino acids shorter than the major isoenzyme belonging to the horseradish peroxidase C group (HRP-C) and the sequence shows 53.7% identity with this isoenzyme. The described clone encodes nine cysteines of which eight correspond well with the cysteines found in HRP-C. Five potential N-glycosylation sites with the general sequence Asn-X-Thr/Ser are present in the deduced sequence. Compared to the earlier described HRP-C this is three glycosylation sites less. The shorter sequence and fewer N-glycosylation sites give the native isoenzyme a molecular weight of several thousands less than the horseradish peroxidase C isoenzymes. Comparison with the net charge value of HRP-C indicates that the described cDNA clone encodes a peroxidase which has either the same or a slightly less basic pI value, depending on whether the encoded protein is N-terminally blocked or not. This excludes the possibility that HRP-n could belong to either the HRP-A, -D or -E groups. The low sequence identity (53.7%) with HRP-C indicates that the described clone does not belong to the HRP-C isoenzyme group and comparison of the total amino acid composition with the HRP-B group does not place the described clone within this isoenzyme group. Our conclusion is that the described cDNA clone encodes a neutral horseradish peroxidase which belongs to a new, not earlier described, horseradish peroxidase group.

  20. RNA-DNA sequence differences spell genetic code ambiguities

    DEFF Research Database (Denmark)

    Bentin, Thomas; Nielsen, Michael L

    2013-01-01

    A recent paper in Science by Li et al. 2011(1) reports widespread sequence differences in the human transcriptome between RNAs and their encoding genes termed RNA-DNA differences (RDDs). The findings could add a new layer of complexity to gene expression but the study has been criticized. ...

  1. cDNA, genomic sequence cloning and overexpression of ribosomal ...

    African Journals Online (AJOL)

    RPS16 of eukaryote is a component of the 40S small ribosomal subunit encoded by RPS16 gene and is also a homolog of prokaryotic RPS9. The cDNA and genomic sequence of RPS16 was cloned successfully for the first time from the Giant Panda (Ailuropoda melanoleuca) using reverse transcription-polymerase chain ...

  2. DNA sequence and prokaryotic expression analysis of vitellogenin ...

    African Journals Online (AJOL)

    In this study, the DNA sequence of vitellogenin from Antheraea pernyi (Ap-Vg) was identified and its functional domain (30-740 aa, Ap-Vg-1) was expressed in Escherichia coli BL21 (DE3) cells. The recombinant Ap-Vg-1 proteins were purified and used for antibody preparation. The results showed that the intact DNA ...

  3. (Brassicaceae) based on nuclear ribosomal ITS DNA sequences

    Indian Academy of Sciences (India)

    Phylogeny and biogeography of Alyssum (Brassicaceae) based on nuclear ribosomal ITS DNA sequences. Yan Li, Yan Kong, Zhe Zhang, Yanqiang Yin, Bin Liu, Guanghui Lv, Xiyong Wang. Research Article. Journal of Genetics, Volume 93, Issue 2, August 2014, pp 313-323 ...

  4. The toxic dinoflagellate Dinophysis acuminata harbors permanent chloroplasts of cryptomonad origin, not kleptochloroplasts

    DEFF Research Database (Denmark)

    Garcia, Lydia; Moestrup, Øjvind; Hansen, Per Juel

    2010-01-01

    of Dinophysis acuminata was established by feeding it the phototrophic ciliate Mesodinium rubrum (= Myrionecta rubra), which again was fed the cryptophyte Teleaulax amphioxeia. Molecular analysis comprising the nucleomorph LSU and two chloroplast markers (tufA gene and a fragment from the end of 16S rDNA to the beginning of 23S rDNA) resulted in identical sequences for the three organisms. Yet, transmission electron microscopy of the three organisms revealed that several chloroplast features separated D. acuminata from both T. amphioxeia and M. rubrum. The thylakoid arrangement, the number of membranes around...

  5. The complete chloroplast genome of Sinopodophyllum hexandrum Ying (Berberidaceae).

    Science.gov (United States)

    Meng, Lihua; Liu, Ruijuan; Chen, Jianbing; Ding, Chenxu

    2017-05-01

    The complete nucleotide sequence of the Sinopodophyllum hexandrum Ying chloroplast genome (cpDNA) was determined based on next-generation sequencing technologies in this study. The genome was 157 203 bp in length, containing a pair of inverted repeat (IRa and IRb) regions of 25 960 bp, which were separated by a large single-copy (LSC) region of 87 065 bp and a small single-copy (SSC) region of 18 218 bp, respectively. The cpDNA contained 148 genes, including 96 protein-coding genes, 8 ribosomal RNA genes, and 44 tRNA genes. Of these genes, eight harbored a single intron, and two (ycf3 and clpP) contained two introns. The AT content of the S. hexandrum cpDNA is 61.5%.
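    The figures in the abstract above can be cross-checked with simple arithmetic: a quadripartite plastid genome is the sum of the LSC, the SSC, and two inverted-repeat copies. The sketch below uses only the numbers reported in the abstract. The function and variable names are illustrative, and the GC value is merely the complement of the stated AT content, not a figure given by the authors.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 87_065  # large single-copy region
SSC = 18_218  # small single-copy region
IR = 25_960   # one inverted repeat (IRa or IRb)


def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastid genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total_bp = plastome_length(LSC, SSC, IR)

# Gene counts reported in the abstract: protein-coding + rRNA + tRNA.
gene_total = 96 + 8 + 44

# GC content implied by the reported AT content of 61.5%.
gc_percent = 100.0 - 61.5

print(f"genome length: {total_bp} bp")
print(f"genes: {gene_total}")
print(f"implied GC content: {gc_percent:.1f}%")
```

    Both sums agree with the totals stated in the abstract (157 203 bp and 148 genes).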

  6. High Performance Systolic Array Core Architecture Design for DNA Sequencer

    Directory of Open Access Journals (Sweden)

    Saiful Nurdin Dayana

    2018-01-01

    Full Text Available This paper presents a high performance systolic array (SA) core architecture design for a Deoxyribonucleic Acid (DNA) sequencer. The core implements the affine gap penalty score Smith-Waterman (SW) algorithm. This time-consuming local alignment algorithm guarantees optimal alignment between DNA sequences, but it requires quadratic computation time when performed on standard desktop computers. The use of linear SA decreases the time complexity from quadratic to linear. In addition, with the exponential growth of DNA databases, the SA architecture is used to overcome the timing issue. In this work, the SW algorithm has been captured using Verilog Hardware Description Language (HDL) and simulated using Xilinx ISIM simulator. The proposed design has been implemented in a Xilinx Virtex-6 Field Programmable Gate Array (FPGA) and achieves a 90% reduction in core area.
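    For readers unfamiliar with the algorithm named in the abstract above, here is a minimal software sketch of Smith-Waterman local alignment scoring with affine gap penalties (the Gotoh recurrences). It is not the paper's Verilog design. The scoring values (match, mismatch, gap_open, gap_extend) and the function name are assumptions chosen for illustration. The two nested loops make the quadratic cost on a sequential machine visible.

```python
def sw_affine_score(a: str, b: str, match: int = 2, mismatch: int = -1,
                    gap_open: int = 3, gap_extend: int = 1) -> int:
    """Best local alignment score of a and b under affine gap penalties.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    Scoring parameters are illustrative defaults, not values from the paper.
    """
    neg_inf = float("-inf")
    n_cols = len(b) + 1
    h_prev = [0] * n_cols        # H: best score ending at (i-1, j)
    f_prev = [neg_inf] * n_cols  # F: best score ending in a vertical gap
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * n_cols
        f_cur = [neg_inf] * n_cols
        e = neg_inf              # E: best score ending in a horizontal gap
        for j in range(1, n_cols):
            # Extend an existing gap or open a new one, whichever is better.
            e = max(e - gap_extend, h_cur[j - 1] - gap_open)
            f_cur[j] = max(f_prev[j] - gap_extend, h_prev[j] - gap_open)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0, h_prev[j - 1] + sub, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


print(sw_affine_score("ACGTACGT", "ACGTTACGT"))
```

    Each cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal can be computed at the same time. That independence is what a linear systolic array exploits to bring the run time down from quadratic to linear in the sequence length.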

  7. Sequence heterogeneity accelerates protein search for targets on DNA

    International Nuclear Information System (INIS)

    Shvets, Alexey A.; Kolomeisky, Anatoly B.

    2015-01-01

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome

  8. Sequence heterogeneity accelerates protein search for targets on DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shvets, Alexey A.; Kolomeisky, Anatoly B., E-mail: tolya@rice.edu [Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005 (United States)

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  9. Engineering the Chloroplast Genome of Oleaginous Marine Microalga Nannochloropsis oceanica

    Directory of Open Access Journals (Sweden)

    Qinhua Gan

    2018-04-01

    Full Text Available Plastid engineering offers an important tool to fill the gap between the technical and the enormous potential of microalgal photosynthetic cell factory. However, to date, few reports on plastid engineering in industrial microalgae have been documented. This is largely due to the small cell sizes and complex cell-wall structures which make these species intractable to current plastid transformation methods (i.e., biolistic transformation and polyethylene glycol-mediated transformation). Here, employing the industrial oleaginous microalga Nannochloropsis oceanica as a model, an electroporation-mediated chloroplast transformation approach was established. Fluorescent microscopy and laser confocal scanning microscopy confirmed the expression of the green fluorescence protein, driven by the endogenous plastid promoter and terminator. Zeocin-resistance selection led to an acquisition of homoplasmic strains of which a stable and site-specific recombination within the chloroplast genome was revealed by sequencing and DNA gel blotting. This demonstration of electroporation-mediated chloroplast transformation opens many doors for plastid genome editing in industrial microalgae, particularly species of which the chloroplasts are recalcitrant to chemical and microparticle bombardment transformation.

  10. DNA watermarks in non-coding regulatory sequences

    Directory of Open Access Journals (Sweden)

    Pyka Martin

    2009-07-01

    Full Text Available Abstract. Background: DNA watermarks can be applied to identify the unauthorized use of genetically modified organisms. It has been shown that coding regions can be used to encrypt information into living organisms by using the DNA-Crypt algorithm. Yet, if the sequence of interest presents a non-coding DNA sequence, either the function of a resulting functional RNA molecule or a regulatory sequence, such as a promoter, could be affected. For our studies we used the small cytoplasmic RNA 1 in yeast and the lac promoter region of Escherichia coli. Findings: The lac promoter was deactivated by the integrated watermark. In addition, the RNA molecules displayed altered configurations after introducing a watermark, but surprisingly were functionally intact, which has been verified by analyzing the growth characteristics of both wild type and watermarked scR1 transformed yeast cells. In a third approach we introduced a second overlapping watermark into the lac promoter, which did not affect the promoter activity. Conclusion: Even though the watermarked RNA and one of the watermarked promoters did not show any significant differences compared to the wild type RNA and wild type promoter region, respectively, it cannot be generalized that other RNA molecules or regulatory sequences behave accordingly. Therefore, we do not recommend integrating watermark sequences into regulatory regions.

  11. VoSeq: a voucher and DNA sequence web application.

    Directory of Open Access Journals (Sweden)

    Carlos Peña

    Full Text Available There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  12. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    Energy Technology Data Exchange (ETDEWEB)

    Hidajat, Rachmat; Nickols, Brian [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States); Forrester, Naomi [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States); Tretyakova, Irina [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States); Weaver, Scott [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States); Pushko, Peter, E-mail: ppushko@medigen-usa.com [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States)

    2016-03-15

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.
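    The reversion frequencies quoted in the abstract above can be turned into fold differences with a few lines of arithmetic. The sketch below uses only the four percentages reported there. The dictionary layout and variable names are illustrative, and the ratios are a reader's calculation rather than figures stated by the authors.

```python
# Reversion frequencies (percent of reads) at the two attenuating E2 residues,
# as reported in the abstract.
reversion_percent = {
    "E2-12": {"CHIKV-7": 0.064, "181/25": 0.179},
    "E2-82": {"CHIKV-7": 0.086, "181/25": 0.133},
}

# Fold difference: how many times more frequent reversion is in 181/25
# than in the DNA-launched CHIKV-7 virus.
fold_difference = {
    site: freqs["181/25"] / freqs["CHIKV-7"]
    for site, freqs in reversion_percent.items()
}

for site, fold in fold_difference.items():
    print(f"{site}: 181/25 reverts about {fold:.2f}x as often as CHIKV-7")
```

    By this calculation the difference is larger at E2-12 than at E2-82, although both sites show fewer reversions in the DNA-launched virus.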

  13. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    International Nuclear Information System (INIS)

    Hidajat, Rachmat; Nickols, Brian; Forrester, Naomi; Tretyakova, Irina; Weaver, Scott; Pushko, Peter

    2016-01-01

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.

  14. Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

    Science.gov (United States)

    Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

    2018-01-10

    Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and enzyme methyltransferase DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies depicted that, upon binding of DNMT1 methylates to this consensus DNA motif (C residues of CpG islands), mutation was likely to occur. Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing

  15. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii.

    Science.gov (United States)

    Funk, Helena T; Berg, Sabine; Krupinska, Karin; Maier, Uwe G; Krause, Kirsten

    2007-08-22

    The holoparasitic plant genus Cuscuta comprises species with photosynthetic capacity and functional chloroplasts as well as achlorophyllous and intermediate forms with restricted photosynthetic activity and degenerated chloroplasts. Previous data indicated significant differences with respect to the plastid genome coding capacity in different Cuscuta species that could correlate with their photosynthetic activity. In order to shed light on the molecular changes accompanying the parasitic lifestyle, we sequenced the plastid chromosomes of the two species Cuscuta reflexa and Cuscuta gronovii. Both species are capable of performing photosynthesis, albeit with varying efficiencies. Together with the plastid genome of Epifagus virginiana, an achlorophyllous parasitic plant whose plastid genome has been sequenced, these species represent a series of progression towards total dependency on the host plant, ranging from reduced levels of photosynthesis in C. reflexa to a restricted photosynthetic activity and degenerated chloroplasts in C. gronovii to an achlorophyllous state in E. virginiana. The newly sequenced plastid genomes of C. reflexa and C. gronovii reveal that the chromosome structures are generally very similar to that of non-parasitic plants, although a number of species-specific insertions, deletions (indels) and sequence inversions were identified. However, we observed a gradual adaptation of the plastid genome to the different degrees of parasitism. The changes are particularly evident in C. gronovii and include (a) the parallel losses of genes for the subunits of the plastid-encoded RNA polymerase and the corresponding promoters from the plastid genome, (b) the first documented loss of the gene for a putative splicing factor, MatK, from the plastid genome and (c) a significant reduction of RNA editing. Overall, the comparative genomic analysis of plastid DNA from parasitic plants indicates a bias towards a simplification of the plastid gene expression

  16. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii

    Directory of Open Access Journals (Sweden)

    Maier Uwe G

    2007-08-01

    Full Text Available Background: The holoparasitic plant genus Cuscuta comprises species with photosynthetic capacity and functional chloroplasts as well as achlorophyllous and intermediate forms with restricted photosynthetic activity and degenerated chloroplasts. Previous data indicated significant differences with respect to the plastid genome coding capacity in different Cuscuta species that could correlate with their photosynthetic activity. In order to shed light on the molecular changes accompanying the parasitic lifestyle, we sequenced the plastid chromosomes of the two species Cuscuta reflexa and Cuscuta gronovii. Both species are capable of performing photosynthesis, albeit with varying efficiencies. Together with the plastid genome of Epifagus virginiana, an achlorophyllous parasitic plant whose plastid genome has been sequenced, these species represent a series of progression towards total dependency on the host plant, ranging from reduced levels of photosynthesis in C. reflexa to a restricted photosynthetic activity and degenerated chloroplasts in C. gronovii to an achlorophyllous state in E. virginiana. Results: The newly sequenced plastid genomes of C. reflexa and C. gronovii reveal that the chromosome structures are generally very similar to that of non-parasitic plants, although a number of species-specific insertions, deletions (indels) and sequence inversions were identified. However, we observed a gradual adaptation of the plastid genome to the different degrees of parasitism. The changes are particularly evident in C. gronovii and include (a) the parallel losses of genes for the subunits of the plastid-encoded RNA polymerase and the corresponding promoters from the plastid genome, (b) the first documented loss of the gene for a putative splicing factor, MatK, from the plastid genome and (c) a significant reduction of RNA editing. Conclusion: Overall, the comparative genomic analysis of plastid DNA from parasitic plants indicates a bias towards

  17. Micropatterning stretched and aligned DNA for sequence-specific nanolithography

    Science.gov (United States)

    Petit, Cecilia Anna Paulette

    Techniques for fabricating nanostructured materials can be categorized as either "top-down" or "bottom-up". Top-down techniques use lithography and contact printing to create patterned surfaces and microfluidic channels that can corral and organize nanoscale structures, such as molecules and nanorods; in contrast, bottom-up techniques use self-assembly or molecular recognition to direct the organization of materials. A central goal in nanotechnology is the integration of bottom-up and top-down assembly strategies for materials development, device design, and process integration. With this goal in mind, we have developed strategies that will allow this integration by using DNA as a template for nanofabrication; two top-down approaches allow the placement of these templates, while the bottom-up technique uses the specific sequence of bases to pattern materials along each strand of DNA. Our first top-down approach, termed combing of molecules in microchannels (COMMIC), produces microscopic patterns of stretched and aligned molecules of DNA on surfaces. This process consists of passing an air-water interface over end-adsorbed molecules inside microfabricated channels. The geometry of the microchannel directs the placement of the DNA molecules, while the geometry of the air-water interface directs the local orientation and curvature of the molecules. We developed another top-down strategy for creating micropatterns of stretched and aligned DNA using surface chemistry. Because DNA stretching occurs on hydrophobic surfaces, this technique uses photolithography to pattern vinyl-terminated silanes on glass. When these surfaces are immersed in DNA solution, molecules adhere preferentially to the silanized areas. This approach has also proven useful in patterning protein for cell adhesion studies. Finally, we describe the use of these stretched and aligned molecules of DNA as templates for the subsequent bottom-up construction of hetero-structures through hybridization

  18. The ANGULATA7 gene encodes a DnaJ-like zinc finger-domain protein involved in chloroplast function and leaf development in Arabidopsis.

    Science.gov (United States)

    Muñoz-Nortes, Tamara; Pérez-Pérez, José Manuel; Ponce, María Rosa; Candela, Héctor; Micol, José Luis

    2017-03-01

    The characterization of mutants with altered leaf shape and pigmentation has previously allowed the identification of nuclear genes that encode plastid-localized proteins that perform essential functions in leaf growth and development. A large-scale screen previously allowed us to isolate ethyl methanesulfonate-induced mutants with small rosettes and pale green leaves with prominent marginal teeth, which were assigned to a phenotypic class that we dubbed Angulata. The molecular characterization of the 12 genes assigned to this phenotypic class should help us to advance our understanding of the still poorly understood relationship between chloroplast biogenesis and leaf morphogenesis. In this article, we report the phenotypic and molecular characterization of the angulata7-1 (anu7-1) mutant of Arabidopsis thaliana, which we found to be a hypomorphic allele of the EMB2737 gene, which was previously known only for its embryonic-lethal mutations. ANU7 encodes a plant-specific protein that contains a domain similar to the central cysteine-rich domain of DnaJ proteins. The observed genetic interaction of anu7-1 with a loss-of-function allele of GENOMES UNCOUPLED1 suggests that the anu7-1 mutation triggers a retrograde signal that leads to changes in the expression of many genes that normally function in the chloroplasts. Many such genes are expressed at higher levels in anu7-1 rosettes, with a significant overrepresentation of those required for the expression of plastid genome genes. Like in other mutants with altered expression of plastid-encoded genes, we found that anu7-1 exhibits defects in the arrangement of thylakoidal membranes, which appear locally unappressed. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.

  19. Description of a New Planktonic Mixotrophic Dinoflagellate Paragymnodinium shiwhaense n. gen., n. sp. from the Coastal Waters off Western Korea: Morphology, Pigments, and Ribosomal DNA Gene Sequence

    DEFF Research Database (Denmark)

    Kang, Nam Seon; Jeong, Hae Jin; Moestrup, Øjvind

    2010-01-01

    The mixotrophic dinoflagellate Paragymnodinium shiwhaense n. gen., n. sp. is described from living cells and from cells prepared by light, scanning electron, and transmission electron microscopy. In addition, sequences of the small subunit (SSU) and large subunit (LSU) rDNA and photosynthetic...... extension-like furrow. The cingulum is as wide as 0.2-0.3 × cell length and displaced by 0.2-0.3 × cell length. Cell length and width of live cells fed Amphidinium carterae were 8.4-19.3 and 6.1-16.0 µm, respectively. Paragymnodinium shiwhaense does not have a nuclear envelope chamber nor a nuclear...... fibrous connective (NFC). Cells contain chloroplasts, nematocysts, trichocysts, and peduncle, though eyespots, pyrenoids, and pusules are absent. The main accessory pigment is peridinin. The sequence of the SSU rDNA of this dinoflagellate (GenBank AM408889) is 4% different from that of Gymnodinium...

  20. The influence of DNA sequence on epigenome-induced pathologies

    Directory of Open Access Journals (Sweden)

    Meagher Richard B

    2012-07-01

    Full Text Available Clear cause-and-effect relationships are commonly established between genotype and the inherited risk of acquiring human and plant diseases and aberrant phenotypes. By contrast, few such cause-and-effect relationships are established linking a chromatin structure (that is, the epitype) with the transgenerational risk of acquiring a disease or abnormal phenotype. It is not entirely clear how epitypes are inherited from parent to offspring as populations evolve, even though epigenetics is proposed to be fundamental to evolution and the likelihood of acquiring many diseases. This article explores the hypothesis that, for transgenerationally inherited chromatin structures, “genotype predisposes epitype”, and that epitype functions as a modifier of gene expression within the classical central dogma of molecular biology. Evidence for the causal contribution of genotype to inherited epitypes and epigenetic risk comes primarily from two different kinds of studies discussed herein. The first and direct method of research proceeds by the examination of the transgenerational inheritance of epitype and the penetrance of phenotype among genetically related individuals. The second approach identifies epitypes that are duplicated (as DNA sequences are duplicated) and evolutionarily conserved among repeated patterns in the DNA sequence. The body of this article summarizes particularly robust examples of these studies from humans, mice, Arabidopsis, and other organisms. The bulk of the data from both areas of research support the hypothesis that genotypes predispose the likelihood of displaying various epitypes, but for only a few classes of epitype. This analysis suggests that renewed efforts are needed in identifying polymorphic DNA sequences that determine variable nucleosome positioning and DNA methylation as the primary cause of inherited epigenome-induced pathologies. By contrast, there is very little evidence that DNA sequence directly

  1. Pericentric satellite DNA sequences in Pipistrellus pipistrellus (Vespertilionidae; Chiroptera).

    Science.gov (United States)

    Barragán, M J L; Martínez, S; Marchal, J A; Fernández, R; Bullejos, M; Díaz de la Guardia, R; Sánchez, A

    2003-09-01

    This paper reports the molecular and cytogenetic characterization of a HindIII family of satellite DNA in the bat species Pipistrellus pipistrellus. This satellite is organized in tandem repeats of 418 bp monomer units, and represents approximately 3% of the whole genome. The consensus sequence from five cloned monomer units has an A-T content of 62.20%. We have found differences in the ladder pattern of bands between two populations of the same species. These differences are probably because of the absence of the target sites for the HindIII enzyme in most monomer units of one population, but not in the other. Fluorescent in situ hybridization (FISH) localized the satellite DNA in the pericentromeric regions of all autosomes and the X chromosome, but it was absent from the Y chromosome. Digestion of genomic DNAs with HpaII and its isoschizomer MspI demonstrated that these repetitive DNA sequences are not methylated. Other bat species were tested for the presence of this repetitive DNA. It was absent in five Vespertilionidae and one Rhinolophidae species, indicating that it could be a species/genus specific, repetitive DNA family.
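    The A-T content quoted in this abstract is a simple base-composition ratio. As a rough check, 62.20% of a 418-bp monomer corresponds to 260 A or T positions (260/418 ≈ 0.6220). The sketch below shows one way such a value could be computed; the function name and the 10-bp example fragment are hypothetical and are not taken from the paper.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T bases among the unambiguous bases of seq."""
    counted = [b for b in seq.upper() if b in "ACGT"]  # skip N, gaps, etc.
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    at = sum(1 for b in counted if b in "AT")
    return 100.0 * at / len(counted)


# Hypothetical 10-bp fragment used only to exercise the function.
example = "AATTGCATTA"
print(f"{at_content(example):.2f}% A-T")

# Consistency check on the figures quoted in the abstract.
print(f"{100.0 * 260 / 418:.2f}% for 260 A/T positions in 418 bp")
```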

  2. Early Lyme disease with spirochetemia - diagnosed by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Jones William

    2010-11-01

    Full Text Available Background: A sensitive and analytically specific nucleic acid amplification test (NAAT) is valuable in confirming the diagnosis of early Lyme disease at the stage of spirochetemia. Findings: Venous blood drawn from patients with clinical presentations of Lyme disease was tested for the standard 2-tier screen and Western Blot serology assay for Lyme disease, and also by a nested polymerase chain reaction (PCR) for B. burgdorferi sensu lato 16S ribosomal DNA. The PCR amplicon was sequenced for B. burgdorferi genomic DNA validation. A total of 130 patients visiting emergency room (ER) or Walk-in clinic (WALKIN), and 333 patients referred through the private physicians' offices were studied. While 5.4% of the ER/WALKIN patients showed DNA evidence of spirochetemia, none (0%) of the patients referred from private physicians' offices were DNA-positive. In contrast, while 8.4% of the patients referred from private physicians' offices were positive for the 2-tier Lyme serology assay, only 1.5% of the ER/WALKIN patients were positive for this antibody test. The 2-tier serology assay missed 85.7% of the cases of early Lyme disease with spirochetemia. The latter diagnosis was confirmed by DNA sequencing. Conclusion: Nested PCR followed by automated DNA sequencing is a valuable supplement to the standard 2-tier antibody assay in the diagnosis of early Lyme disease with spirochetemia. The best time to test for Lyme spirochetemia is when the patients living in the Lyme disease endemic areas develop unexplained symptoms or clinical manifestations that are consistent with Lyme disease early in the course of their illness.
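    The percentages in this abstract can be related back to approximate patient counts. The short calculation below is only a consistency check: the counts are inferred by rounding the reported percentages and are not stated in the abstract itself, so they should be read as an assumption.

```python
er_patients = 130  # ER/WALKIN group size reported in the abstract

# Counts inferred by rounding the reported percentages (an assumption).
dna_positive = round(0.054 * er_patients)   # 5.4% of 130
sero_positive = round(0.015 * er_patients)  # 1.5% of 130
missed = round(0.857 * dna_positive)        # 85.7% of the DNA-positive cases

print("DNA-positive (inferred):", dna_positive)
print("2-tier seropositive (inferred):", sero_positive)
print("DNA-positive cases missed by serology (inferred):", missed)
print(f"missed fraction: {100.0 * missed / dna_positive:.1f}%")
```

    Under this reading, the 85.7% figure corresponds to six of seven DNA-positive cases being negative in the 2-tier assay.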

  3. Spectral sum rules and search for periodicities in DNA sequences

    International Nuclear Information System (INIS)

    Chechetkin, V.R.

    2011-01-01

    Periodic patterns play important regulatory and structural roles in genomic DNA sequences. Commonly, the underlying periodicities should be understood in a broad statistical sense, since the corresponding periodic patterns have been strongly distorted by random point mutations and insertions/deletions during molecular evolution. The latent periodicities in DNA sequences can be efficiently displayed by Fourier transform. The criteria of significance for observed periodicities are obtained by comparison with the corresponding characteristics of reference random sequences. We show that the restrictions imposed on the significance criteria by the rigorous spectral sum rules can be rationally described with the De Finetti distribution. This distribution provides a convenient intermediate asymptotic form between the Rayleigh distribution and exact combinatoric theory. - Highlights: → We study the significance criteria for latent periodicities in DNA sequences. → The constraints imposed by sum rules can be described with the De Finetti distribution. → It is intermediate between the Rayleigh distribution and exact combinatoric theory. → Theory is applicable to the study of correlations between different periodicities. → The approach can be generalized to the arbitrary discrete Fourier transform.
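    The Fourier-based search for latent periodicities described in this abstract can be sketched in a few lines. The following is a minimal illustration and not the author's code: it assumes a 0/1 indicator sequence per nucleotide, a 1/N normalization of the squared DFT amplitude, and a made-up test sequence. The final check uses Parseval's theorem, which fixes the sum of the spectrum over all non-zero harmonics; treating this identity as an example of the spectral sum rules mentioned above is an assumption.

```python
import cmath
import math


def structure_factors(seq, base):
    """Return |DFT|^2 / N of the 0/1 indicator sequence of `base`, for n = 0..N-1."""
    n_total = len(seq)
    indicator = [1.0 if b == base else 0.0 for b in seq]
    spectrum = []
    for n in range(n_total):
        q = 2.0 * math.pi * n / n_total  # wave number q_n = 2*pi*n/N
        amplitude = sum(x * cmath.exp(-1j * q * m) for m, x in enumerate(indicator))
        spectrum.append(abs(amplitude) ** 2 / n_total)
    return spectrum


def dominant_period(spectrum):
    """Return N/n for the strongest harmonic with 1 <= n <= N/2."""
    n_total = len(spectrum)
    best_n = max(range(1, n_total // 2 + 1), key=lambda n: spectrum[n])
    return n_total / best_n


# Hypothetical sequence with an exact period of 3 (not data from the paper).
seq = "ACG" * 20
spec_a = structure_factors(seq, "A")
print("dominant period for A:", dominant_period(spec_a))

# Parseval check: the sum over n = 1..N-1 equals N_A * (N - N_A) / N.
n_a, n_total = seq.count("A"), len(seq)
print(sum(spec_a[1:]), n_a * (n_total - n_a) / n_total)
```

    The direct sum costs O(N^2) operations, which is fine for a short illustration; a fast Fourier transform would be the usual choice for genome-scale sequences. Judging whether a peak is significant, which is the actual subject of the paper, requires comparison with random reference sequences and is not attempted here.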

  4. Phylogenetic relationships of the Gomphales based on nuc-25S-rDNA, mit-12S-rDNA, and mit-atp6-DNA combined sequences

    Science.gov (United States)

    Admir J. Giachini; Kentaro Hosaka; Eduardo Nouhra; Joseph Spatafora; James M. Trappe

    2010-01-01

    Phylogenetic relationships among Geastrales, Gomphales, Hysterangiales, and Phallales were estimated via combined sequences: nuclear large subunit ribosomal DNA (nuc-25S-rDNA), mitochondrial small subunit ribosomal DNA (mit-12S-rDNA), and mitochondrial atp6 DNA (mit-atp6-DNA). Eighty-one taxa comprising 19 genera and 58 species...

  5. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, Natasha V. (Okemos, MI); Broekaert, Willem F. (Dilbeek, BE); Chua, Nam-Hai (Scarsdale, NY); Kush, Anil (New York, NY)

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu... Government rights: This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.

  6. cDNA sequences of two apolipoproteins from lamprey

    International Nuclear Information System (INIS)

    Pontes, M.; Xu, X.; Graham, D.; Riley, M.; Doolittle, R.F.

    1987-01-01

    The messages for two small but abundant apolipoproteins found in lamprey blood plasma were cloned with the aid of oligonucleotide probes based on amino-terminal sequences. In both cases, numerous clones were identified in a lamprey liver cDNA library, consistent with the great abundance of these proteins in lamprey blood. One of the cDNAs (LAL1) has a coding region of 105 amino acids that corresponds to a 21-residue signal peptide, a putative 8-residue propeptide, and the 76-residue mature protein found in blood. The other cDNA (LAL2) codes for a total of 191 residues, the first 23 of which constitute a signal peptide. The two proteins, which occur in the high-density lipoprotein fraction of ultracentrifuged plasma, have amino acid compositions similar to those of apolipoproteins found in mammalian blood; computer analysis indicates that the sequences are largely helix-permissive. When the sequences were searched against an amino acid sequence data base, rat apolipoprotein IV was the best matching candidate in both cases. Although a reasonable alignment can be made with that sequence and LAL1, definitive assignment of the two lamprey proteins to typical mammalian classes cannot be made at this point.

  7. An AU-rich element in the 3′ untranslated region of the spinach chloroplast petD gene participates in sequence-specific RNA-protein complex formation

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Qiuyun; Adams, C.C.; Usack, L. [Cornell Univ., Ithaca, NY (United States)] [and others]

    1995-04-01

    In chloroplasts, the 3′ untranslated regions of most mRNAs contain a stem-loop-forming inverted repeat (IR) sequence that is required for mRNA stability and correct 3′-end formation. The IR regions of several mRNAs are also known to bind chloroplast proteins, as judged from in vitro gel mobility shift and UV cross-linking assays, and these RNA-protein interactions may be involved in the regulation of chloroplast mRNA processing and/or stability. Here we describe in detail the RNA and protein components that are involved in 3′ IR-containing RNA (3′ IR-RNA)-protein complex formation for the spinach chloroplast petD gene, which encodes subunit IV of the cytochrome b6/f complex. We show that the complex contains 55-, 41-, and 29-kDa RNA-binding proteins (ribonucleoproteins [RNPs]). These proteins together protect a 90-nucleotide segment of RNA from RNase T1 digestion; this RNA contains the IR and downstream flanking sequences. Competition experiments using 3′ IR-RNAs from the psbA or rbcL gene demonstrate that the RNPs have a strong specificity for the petD sequence. Site-directed mutagenesis was carried out to define the RNA sequence elements required for complex formation. These studies identified an 8-nucleotide AU-rich sequence downstream of the IR; mutations within this sequence had moderate to severe effects on RNA-protein complex formation. Although other similar sequences are present in the petD 3′ untranslated region, only a single copy, which we have termed box II, appears to be essential for in vivo protein binding. In addition, the IR itself is necessary for optimal complex formation. These two sequence elements together with an RNP complex may direct correct 3′-end processing and/or influence the stability of petD mRNA in chloroplasts. 48 refs., 9 figs., 2 tabs.

  8. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...... in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions....

  9. Development of a defined-sequence DNA system for use in DNA misrepair studies

    International Nuclear Information System (INIS)

    Sutton, S.; Tobias, C.A.

    1984-01-01

    The authors have developed a system that allows them to study cellular DNA repair processes at the molecular level. In particular, the authors are using this system to examine the consequences of a misrepair of radiation-induced DNA damage, as a function of dose. The cells being used are specially engineered haploid yeast cells. Maintained in the cells, at one copy per cell, is a CEN plasmid, a plasmid that behaves like a functional chromosome. This plasmid carries a small defined sequence of DNA from the E. coli lacZ gene. It is this lacZ region (called the alpha region) that serves as the target for radiation damage. Two copies of the complementary portion of the lacZ gene are integrated into the yeast genome. Irradiated cells are screened for possible mutation in the alpha region by testing the cells' ability to hydrolyze X-gal, a lactose analog. The DNA of interest is then extracted from the cells, sequenced, and the sequence is compared to that of the control. Unlike the usual defined-sequence DNA systems, theirs is an in vivo system. A disadvantage is the relatively high background mutation rate. Results achieved with this system, as well as future applications, are discussed.

  10. Rapid DNA sequencing by horizontal ultrathin gel electrophoresis.

    OpenAIRE

    Brumley, R L; Smith, L M

    1991-01-01

    A horizontal polyacrylamide gel electrophoresis apparatus has been developed that decreases the time required to separate the DNA fragments produced in enzymatic sequencing reactions. The configuration of this apparatus and the use of circulating coolant directly under the glass plates result in heat exchange that is approximately nine times more efficient than passive thermal transfer methods commonly used. Bubble-free gels as thin as 25 microns can be routinely cast on this device. The appl...

  11. Is photocleavage of DNA by YOYO-1 using a synchrotron radiation light source sequence dependent?

    DEFF Research Database (Denmark)

    Gilroy, Emma L.; Hoffmann, Søren Vrønning; Jones, Nykola C.

    2011-01-01

    ) throughout the irradiation period. The dependence of LD signals on DNA sequences and on time in the intense light beam was explored and quantified for single-stranded poly(dA), poly[(dA-dT)2], calf thymus DNA (ctDNA) and Micrococcus luteus DNA (mlDNA). The DNA and ligand regions of the spectrum showed...

  12. Comparison of DNA Quantification Methods for Next Generation Sequencing.

    Science.gov (United States)

    Robin, Jérôme D; Ludlow, Andrew T; LaRanger, Ryan; Wright, Woodring E; Shay, Jerry W

    2016-04-06

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C, 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChIA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library's heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR, ddPCR; ddPCR-Tail) with standard methods for the titration of NGS libraries. ddPCR-Tail is comparable to qPCR and fluorometry (Qubit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality.

  13. Application of synthetic DNA probes to the analysis of DNA sequence variants in man

    International Nuclear Information System (INIS)

    Wallace, R.B.; Petz, L.D.; Yam, P.Y.

    1986-01-01

    Oligonucleotide probes provide a tool to discriminate between any two alleles on the basis of hybridization. Random sampling of the genome with different oligonucleotide probes should reveal polymorphism in a certain percentage of the cases. In the hope of identifying polymorphic regions more efficiently, we chose to take advantage of the proposed hypermutability of repeated DNA sequences and the specificity of oligonucleotide hybridization. Since, under appropriate conditions, oligonucleotide probes require complete base pairing for hybridization to occur, they will only hybridize to a subset of the members of a repeat family when all members of the family are not identical. The results presented here suggest that oligonucleotide hybridization can be used to extend the genomic sequences that can be tested for the presence of RFLPs. This expands the tools available to human genetics. In addition, the results suggest that repeated DNA sequences are indeed more polymorphic than single-copy sequences. 28 references, 2 figures

  14. Development of cleaved amplified polymorphic sequence (CAPS) and high-resolution melting (HRM) markers from the chloroplast genome of Glycyrrhiza species.

    Science.gov (United States)

    Jo, Ick-Hyun; Sung, Jwakyung; Hong, Chi-Eun; Raveendar, Sebastin; Bang, Kyong-Hwan; Chung, Jong-Wook

    2018-05-01

    Licorice (Glycyrrhiza glabra) is an important medicinal crop often used as a health food or medicine worldwide. Molecular genetic studies of licorice are scarce owing to a lack of molecular markers. Here, we have developed cleaved amplified polymorphic sequence (CAPS) and high-resolution melting (HRM) markers based on single nucleotide polymorphisms (SNPs) by comparing the chloroplast genomes of two Glycyrrhiza species (G. glabra and G. lepidota). The CAPS and HRM markers were tested for diversity analysis with 24 Glycyrrhiza accessions. The restriction profiles generated with CAPS markers classified the accessions (2-4 genotypes) and melting curves (2-3) were obtained from the HRM markers. The number of alleles and major allele frequency were 2-6 and 0.31-0.92, respectively. The genetic distance and polymorphism information content values were 0.16-0.76 and 0.15-0.72, respectively. The phylogenetic relationships among the 24 accessions were estimated using a dendrogram, which classified them into four clades. Except for clade III, the remaining three clades included the same species, confirming interspecies genetic correlation. These 18 CAPS and HRM markers might be helpful for genetic diversity assessment and rapid identification of licorice species.
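    Marker statistics such as major allele frequency and polymorphism information content (PIC) are computed from the allele frequencies at each locus. The abstract does not state which formulas or software were used, so the sketch below assumes the commonly used definitions: gene diversity as 1 − Σp², and PIC as 1 − Σp² − ΣΣ 2p_i²p_j² over allele pairs. The function names and the example frequencies are hypothetical.

```python
def major_allele_frequency(freqs):
    """Frequency of the most common allele at a locus."""
    return max(freqs)


def gene_diversity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content under the commonly used definition."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross


# Hypothetical locus with two equally frequent alleles (not data from the paper).
freqs = [0.5, 0.5]
print(major_allele_frequency(freqs), gene_diversity(freqs), pic(freqs))
```

    For two equally frequent alleles this gives a gene diversity of 0.5 and a PIC of 0.375, the maximum possible for a biallelic marker under these definitions.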

  15. Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.

    Science.gov (United States)

    Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N

    1984-03-26

    The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic DNA methylation between sperm and oocyte DNA. The methylation levels of the minor satellite sequences did not change during spermiogenesis, and were not associated with the onset of meiosis or a specific stage in sperm development.

  16. Fidelity and mutational spectrum of Pfu DNA polymerase on a human mitochondrial DNA sequence.

    Science.gov (United States)

    André, P; Kim, A; Khrapko, K; Thilly, W G

    1997-08-01

    The study of rare genetic changes in human tissues requires specialized techniques. Point mutations at fractions at or below 10^-6 must be observed to discover even the most prominent features of the point mutational spectrum. PCR permits the increase in number of mutant copies but does so at the expense of creating many additional mutations or "PCR noise". Thus, each DNA sequence studied must be characterized with regard to the DNA polymerase and conditions used to avoid interpreting a PCR-generated mutation as one arising in human tissue. The thermostable DNA polymerase derived from Pyrococcus furiosus designated Pfu has the highest fidelity of any thermostable DNA polymerase studied to date, and this property recommends it for analyses of tissue mutational spectra. Here, we apply constant denaturant capillary electrophoresis (CDCE) to separate and isolate the products of DNA amplification. This new strategy permitted direct enumeration and identification of point mutations created by Pfu DNA polymerase in a 96-bp low melting domain of a human mitochondrial sequence despite the very low mutant fractions generated in the PCR process. This sequence, containing part of the tRNA glycine and NADH dehydrogenase subunit 3 genes, is the target of our studies of mitochondrial mutagenesis in human cells and tissues. Incorrectly synthesized sequences were separated from the wild type as mutant/wild-type heteroduplexes by sequential enrichment on CDCE. An artificially constructed mutant was used as an internal standard to permit calculation of the mutant fraction. Our study found that the average error rate (mutations per base pair duplication) of Pfu was 6.5 × 10^-7, and five of its more frequent mutations (hot spots) consisted of three transversions (GC→TA, AT→TA, and AT→CG), one transition (AT→GC), and one 1-bp deletion (in an AAAAAA sequence). To achieve an even higher sensitivity, the number of Pfu-induced mutants must be reduced.
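    An error rate expressed as mutations per base pair per duplication can be turned into a rough estimate of the total polymerase-induced mutant fraction in a PCR product. The sketch below assumes that errors accumulate linearly with the number of template doublings and are spread uniformly over the target, which is a simplification (the abstract itself reports hot spots). The function names and the choice of 20 doublings are hypothetical; only the error rate and the 96-bp target length come from the abstract.

```python
def expected_mutant_fraction(error_rate, target_bp, doublings):
    """Approximate total mutant fraction: error rate x target length x doublings."""
    return error_rate * target_bp * doublings


def error_rate_from_fraction(mutant_fraction, target_bp, doublings):
    """Invert the same linear approximation to estimate the error rate."""
    return mutant_fraction / (target_bp * doublings)


PFU_ERROR_RATE = 6.5e-7  # mutations per bp per duplication (from the abstract)
TARGET_BP = 96           # length of the low melting domain (from the abstract)
DOUBLINGS = 20           # hypothetical number of template doublings

mf = expected_mutant_fraction(PFU_ERROR_RATE, TARGET_BP, DOUBLINGS)
print(f"expected total mutant fraction: {mf:.3e}")
```

    Under these assumptions the summed mutant fraction is on the order of 10^-3, which illustrates why polymerase-induced "PCR noise" limits the detection of tissue mutations present at fractions near 10^-6.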

  17. DNA barcode and identification of the varieties and provenances of Taiwan's domestic and imported made teas using ribosomal internal transcribed spacer 2 sequences.

    Science.gov (United States)

    Lee, Shih-Chieh; Wang, Chia-Hsiang; Yen, Cheng-En; Chang, Chieh

    2017-04-01

    The major aim of made tea identification is to identify the variety and provenance of the tea plant. The present experiment used 113 tea plants [Camellia sinensis (L.) O. Kuntze] housed at the Tea Research and Extension Substation, from which 113 internal transcribed spacer 2 (ITS2) fragments, 104 trnL intron, and 98 trnL-trnF intergenic sequence region DNA sequences were successfully sequenced. The similarity of the ITS2 nucleotide sequences between tea plants housed at the Tea Research and Extension Substation was 0.379-0.994. In this polymerase chain reaction-amplified noncoding region, no varieties possessed identical sequences. Compared with the trnL intron and trnL-trnF intergenic sequence fragments of chloroplast DNA (cpDNA), the proportion of ITS2 nucleotide sequence variation was large and is more suitable for establishing a DNA barcode database to identify tea plant varieties. After establishing the database, 30 imported teas and 35 domestic made teas were used in this model system to explore the feasibility of using ITS2 sequences to identify the varieties and provenances of made teas. A phylogenetic tree was constructed using ITS2 sequences with the unweighted pair group method with arithmetic mean, which indicated that the same variety of tea plant is likely to be successfully categorized into one cluster, but contamination from other tea plants was also detected. This result provides molecular evidence that the similarity between important tea varieties in Taiwan remains high. We suggest a direct, wide collection of made tea and original samples of tea plants to establish an ITS2 sequence molecular barcode identification database to identify the varieties and provenances of tea plants. The DNA barcode comparison method can satisfy the need for a rapid, low-cost, frontline differentiation of the large amount of made teas from Taiwan and abroad, and can provide molecular evidence of their varieties and provenances. Copyright © 2016. Published by Elsevier B.V.

  18. A sequence-dependent rigid-base model of DNA

    Science.gov (United States)

    Gonzalez, O.; Petkevičiutė, D.; Maddocks, J. H.

    2013-02-01

    A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. 
For example, due to the presence of frustration, the model can

  19. A sequence-dependent rigid-base model of DNA.

    Science.gov (United States)

    Gonzalez, O; Petkevičiūtė, D; Maddocks, J H

    2013-02-07

    A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. 
For example, due to the presence of frustration, the model can

  20. Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.

    OpenAIRE

    Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N

    1984-01-01

    The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic...

  1. Chimeric TALE recombinases with programmable DNA sequence specificity.

    Science.gov (United States)

    Mercer, Andrew C; Gaj, Thomas; Fuller, Roberta P; Barbas, Carlos F

    2012-11-01

    Site-specific recombinases are powerful tools for genome engineering. Hyperactivated variants of the resolvase/invertase family of serine recombinases function without accessory factors, and thus can be re-targeted to sequences of interest by replacing native DNA-binding domains (DBDs) with engineered zinc-finger proteins (ZFPs). However, imperfect modularity with particular domains, lack of high-affinity binding to all DNA triplets, and difficulty in construction has hindered the widespread adoption of ZFPs in unspecialized laboratories. The discovery of a novel type of DBD in transcription activator-like effector (TALE) proteins from Xanthomonas provides an alternative to ZFPs. Here we describe chimeric TALE recombinases (TALERs): engineered fusions between a hyperactivated catalytic domain from the DNA invertase Gin and an optimized TALE architecture. We use a library of incrementally truncated TALE variants to identify TALER fusions that modify DNA with efficiency and specificity comparable to zinc-finger recombinases in bacterial cells. We also show that TALERs recombine DNA in mammalian cells. The TALER architecture described herein provides a platform for insertion of customized TALE domains, thus significantly expanding the targeting capacity of engineered recombinases and their potential applications in biotechnology and medicine.

  2. Structural properties of replication origins in yeast DNA sequences

    International Nuclear Information System (INIS)

    Cao Xiaoqin; Zeng Jia; Yan Hong

    2008-01-01

    Sequence-dependent DNA flexibility is an important structural property originating from the DNA 3D structure. In this paper, we investigate the DNA flexibility of the budding yeast (S. cerevisiae) replication origins on a genome-wide scale using flexibility parameters from two different models, the trinucleotide and the tetranucleotide models. Based on analyzing average flexibility profiles of 270 replication origins, we find that yeast replication origins are significantly rigid compared with their surrounding genomic regions. To further understand the highly distinctive property of replication origins, we compare the flexibility patterns between yeast replication origins and promoters, and find that they both contain significantly rigid DNAs. Our results suggest that DNA flexibility is an important factor that helps proteins recognize and bind the target sites in order to initiate DNA replication. Inspired by the role of the rigid region in promoters, we speculate that the rigid replication origins may facilitate binding of proteins, including the origin recognition complex (ORC), Cdc6, Cdt1 and the MCM2-7 complex.

  3. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  4. Characterization of mango (Mangifera indica L.) transcriptome and chloroplast genome.

    Science.gov (United States)

    Azim, M Kamran; Khan, Ishtaiq A; Zhang, Yong

    2014-05-01

    We characterized mango leaf transcriptome and chloroplast genome using next generation DNA sequencing. The RNA-seq output of mango transcriptome generated >12 million reads (total nucleotides sequenced >1 Gb). De novo transcriptome assembly generated 30,509 unigenes with lengths in the range of 300 to ≥3,000 nt and 67× depth of coverage. BLAST searching against nonredundant nucleotide databases and several Viridiplantae genomic datasets annotated 24,593 mango unigenes (80% of total) and identified Citrus sinensis as closest neighbor of mango with 9,141 (37%) matched sequences. The annotation with gene ontology and Clusters of Orthologous Group terms categorized unigene sequences into 57 and 25 classes, respectively. More than 13,500 unigenes were assigned to 293 KEGG pathways. Besides major plant biology related pathways, KEGG based gene annotation pointed out active presence of an array of biochemical pathways involved in (a) biosynthesis of bioactive flavonoids, flavones and flavonols, (b) biosynthesis of terpenoids and lignins and (c) plant hormone signal transduction. The mango transcriptome sequences revealed 235 proteases belonging to five catalytic classes of proteolytic enzymes. The draft genome of mango chloroplast (cp) was obtained by a combination of Sanger and next generation sequencing. The draft mango cp genome size is 151,173 bp with a pair of inverted repeats of 27,093 bp separated by small and large single copy regions. Out of 139 genes in the mango cp genome, 91 were found to be protein coding. Sequence analysis revealed the cp genome of C. sinensis as closest neighbor of mango. We found 51 short repeats in the mango cp genome that are supposed to be associated with extensive rearrangements. This is the first report of transcriptome and chloroplast genome analysis of any Anacardiaceae family member.

  5. Phylogeny of the Serrasalmidae (Characiformes) based on mitochondrial DNA sequences

    Directory of Open Access Journals (Sweden)

    Guillermo Ortí

    2008-01-01

    Full Text Available Previous studies based on DNA sequences of mitochondrial (mt) rRNA genes showed three main groups within the subfamily Serrasalminae: (1) a "pacu" clade of herbivores (Colossoma, Mylossoma, Piaractus); (2) the "Myleus" clade (Myleus, Mylesinus, Tometes, Ossubtus); and (3) the "piranha" clade (Serrasalmus, Pygocentrus, Pygopristis, Pristobrycon, Catoprion, Metynnis). The genus Acnodon was placed as the sister taxon of clade (2+3). However, poor resolution within each clade was obtained due to low levels of variation among rRNA gene sequences. Complete sequences of the hypervariable mtDNA control region for a total of 45 taxa, and additional sequences of 12S and 16S rRNA from a total of 74 taxa representing all genera in the family are now presented to address intragroup relationships. Control region sequences of several serrasalmid species exhibit tandem repeats of short motifs (12 to 33 bp) in the 3' end of this region, accounting for substantial length variation. Bayesian inference and maximum parsimony analyses of these sequences identify the same groupings as before and provide further evidence to support the following observations: (a) Serrasalmus gouldingi and species of Pristobrycon (non-striolatus) form a monophyletic group that is the sister group to other species of Serrasalmus and Pygocentrus; (b) Catoprion, Pygopristis, and Pristobrycon striolatus form a well supported clade, sister to the group described above; (c) some taxa assigned to the genus Myloplus (M. asterias, M. tiete, M. ternetzi, and M. rubripinnis) form a well supported group whereas other Myloplus species remain with uncertain affinities; (d) Mylesinus, Tometes and Myleus setiger form a monophyletic group.

  6. Bacterial DNA Sequence Compression Models Using Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Armando J. Pinho

    2013-08-01

    Full Text Available It is widely accepted that the advances in DNA sequencing techniques have contributed to an unprecedented growth of genomic data. This fact has increased the interest in DNA compression, not only from the information theory and biology points of view, but also from a practical perspective, since such sequences require storage resources. Several compression methods exist, and particularly, those using finite-context models (FCMs) have received increasing attention, as they have been proven to effectively compress DNA sequences with low bits-per-base, as well as low encoding/decoding time-per-base. However, the amount of run-time memory required to store high-order finite-context models may become impractical, since a context-order as low as 16 requires a maximum of 17.2 × 10^9 memory entries. This paper presents a method to reduce such a memory requirement by using a novel application of artificial neural networks (ANN) to build such probabilistic models in a compact way and shows how to use them to estimate the probabilities. Such a system was implemented, and its performance compared against state-of-the-art compressors, such as XM-DNA (expert model) and FCM-Mx (mixture of finite-context models), as well as with general-purpose compressors. Using a combination of order-10 FCM and ANN, similar encoding results to those of FCM, up to order-16, are obtained using only 17 megabytes of memory, whereas the latter, even employing hash-tables, uses several hundreds of megabytes.
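
    The memory figure quoted in this record can be checked with a short calculation. The sketch below assumes a dense table with one counter per (context, next symbol) pair over the four-letter DNA alphabet; that counting convention is an assumption, chosen because it reproduces the roughly 17.2 billion entries cited for order 16. The function name `fcm_table_entries` is hypothetical.

```python
def fcm_table_entries(order: int, alphabet_size: int = 4) -> int:
    """Number of counters in a dense order-`order` finite-context model.

    Assumption: one counter for every (context, next symbol) pair, so the
    table holds alphabet_size ** order contexts times alphabet_size symbols.
    """
    return alphabet_size ** (order + 1)


if __name__ == "__main__":
    for k in (10, 12, 16):
        # Growth is exponential in the context order.
        print(f"order {k}: {fcm_table_entries(k):,} entries")
```

    Under this assumption each extra context base multiplies the table size by four, which is why the record looks for a compact substitute for high-order tables.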

  7. Using Synthetic Nanopores for Single-Molecule Analyses: Detecting SNPs, Trapping DNA Molecules, and the Prospects for Sequencing DNA

    Science.gov (United States)

    Dimitrov, Valentin V.

    2009-01-01

    This work focuses on studying properties of DNA molecules and DNA-protein interactions using synthetic nanopores, and it examines the prospects of sequencing DNA using synthetic nanopores. We have developed a method for discriminating between alleles that uses a synthetic nanopore to measure the binding of a restriction enzyme to DNA. There exists…

  8. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  9. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 12 figs.

  10. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, Natasha V. (Okemos, MI); Broekaert, Willem F. (Dilbeek, BE); Chua, Nam-Hai (Scarsdale, NY); Kush, Anil (New York, NY)

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  11. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 11 figures.

  12. PISMA: A Visual Representation of Motif Distribution in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Rogelio Alcántara-Silva

    2017-03-01

    Full Text Available Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code–like, as a gene-map–like, and as a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf .
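
    The kind of motif search this record describes can be illustrated with a few lines of Python. This is a minimal sketch and not PISMA's own code (PISMA is a Java program whose internals are not given here); the function names `motif_positions` and `barcode` are hypothetical, and the 2 to 10 base motif limit is taken from the abstract.

```python
def motif_positions(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    seq, mot = sequence.upper(), motif.upper()
    if not 2 <= len(mot) <= 10:
        raise ValueError("motif length must be between 2 and 10 bases")
    positions = []
    start = seq.find(mot)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping occurrences are also reported.
        start = seq.find(mot, start + 1)
    return positions


def barcode(sequence: str, motif: str) -> str:
    """Text rendering of a bar-code-like view: '|' marks a motif start, '.' elsewhere."""
    marks = ["."] * len(sequence)
    for pos in motif_positions(sequence, motif):
        marks[pos] = "|"
    return "".join(marks)


if __name__ == "__main__":
    example = "ACGCGTTCGA"  # made-up sequence for illustration
    print(motif_positions(example, "CG"))
    print(barcode(example, "CG"))
```

    Counting CpG sites, as in the papillomavirus analysis mentioned above, would then amount to taking the length of the list returned for the motif "CG".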

  13. Population structure and post-glacial migration routes of Quercus robur and Quercus petraea in Denmark, based on chloroplast DNA analysis

    Energy Technology Data Exchange (ETDEWEB)

    Joehnk, N.; Siegismund, H.R. [The Royal Veterinary and Agricultural Univ., Hoersholm (Denmark). The Arboretum

    1997-07-01

    Populations of Quercus robur L. and Quercus petraea [Matt.] Liebl. were shown previously to be fixed for the same chloroplast DNA marker in western Europe and for another form of this marker in eastern Europe. Application of this marker to 17 Danish populations of Q. robur showed significant population differentiation (G_ST 0.6). Restricted gene flow, low effective population size, restricted colonization ability of oak in dense forest and historical data might explain this. In addition, the genetic structure in eastern and western Denmark was quite different. In Jutland the populations were homogeneous for the western marker, in eastern Denmark, significant population differentiation and high diversity within populations were found. Post-glacial migration is likely to explain the geographical structure. Oaks have immigrated to Jutland from the west, whereas eastern Denmark was colonized from both east and west, forming a hybrid zone where immigrants met. Data from three populations of Q. petraea and from two hybrid populations also support this. 22 refs, 2 figs, 2 tabs

  14. Evidence of Natural Hybridization and Introgression between Vasconcellea Species (Caricaceae) from Southern Ecuador Revealed by Chloroplast, Mitochondrial and Nuclear DNA Markers

    Science.gov (United States)

    VAN DROOGENBROECK, B.; KYNDT, T.; ROMEIJN-PEETERS, E.; VAN THUYNE, W.; GOETGHEBEUR, P.; ROMERO-MOTOCHI, J. P.; GHEYSEN, G.

    2006-01-01

    • Background and Aims Vasconcellea × heilbornii is believed to be of natural hybrid origin between V. cundinamarcensis and V. stipulata, and is often difficult to discriminate from V. stipulata on morphological grounds. The aim of this paper is to examine individuals of these three taxa and of individuals from the closely related species V. parviflora and V. weberbaueri, which all inhabit a hybrid zone in southern Ecuador. • Methods Molecular data from mitochondrial, chloroplast and nuclear DNA from 61 individuals were analysed. • Key Results Molecular analysis confirmed occasional contemporary hybridization between V. stipulata, V. cundinamarcensis and V. × heilbornii and suggested the possible involvement of V. weberbaueri in the origin of V. × heilbornii. In addition, the molecular data indicated unidirectional introgression of the V. cundinamarcensis nuclear genome into that of V. stipulata. Several of the individuals examined with morphology similar to that of V. stipulata had genetic traces of hybridization with V. cundinamarcensis, which only seems to act as pollen donor in interspecific hybridization events. Molecular analyses also strongly suggested that most of the V. × heilbornii individuals are not F1 hybrids but instead are progeny of repeated backcrosses with V. stipulata. • Conclusions The results of the present study point to the need for re-evaluation of natural populations of V. stipulata and V. × heilbornii. In general, this analysis demonstrates the complex patterns of genetic and morphological diversity found in natural plant hybrid zones. PMID:16500954

  15. DNA Sequencing as a Tool to Monitor Marine Ecological Status

    Directory of Open Access Journals (Sweden)

    Kelly D. Goodwin

    2017-05-01

    Full Text Available Many ocean policies mandate integrated, ecosystem-based approaches to marine monitoring, driving a global need for efficient, low-cost bioindicators of marine ecological quality. Most traditional methods to assess biological quality rely on specialized expertise to provide visual identification of a limited set of specific taxonomic groups, a time-consuming process that can provide a narrow view of ecological status. In addition, microbial assemblages drive food webs but are not amenable to visual inspection and thus are largely excluded from detailed inventory. Molecular-based assessments of biodiversity and ecosystem function offer advantages over traditional methods and are increasingly being generated for a suite of taxa using a “microbes to mammals” or “barcodes to biomes” approach. Progress in these efforts coupled with continued improvements in high-throughput sequencing and bioinformatics paves the way for sequence data to be employed in formal integrated ecosystem evaluation, including food web assessments, as called for in the European Union Marine Strategy Framework Directive. DNA sequencing of bioindicators, both traditional (e.g., benthic macroinvertebrates, ichthyoplankton) and emerging (e.g., microbial assemblages, fish via eDNA), promises to improve assessment of marine biological quality by increasing the breadth, depth, and throughput of information and by reducing costs and reliance on specialized taxonomic expertise.

  16. Protein import into chloroplasts requires a chloroplast ATPase

    Energy Technology Data Exchange (ETDEWEB)

    Pain, D.; Blobel, G.

    1987-05-01

    The authors have transcribed mRNA from a cDNA clone coding for pea ribulose-1,5-bisphosphate carboxylase, translated the mRNA in a wheat germ cell-free system, and studied the energy requirement for posttranslational import of the [³⁵S]methionine-labeled protein into the stroma of pea chloroplasts. They found that import depends on ATP hydrolysis within the stroma. Import is not inhibited when H⁺, K⁺, Na⁺, or divalent cation gradients across the chloroplast membranes are dissipated by ionophores, as long as exogenously added ATP is also present during the import reaction. The data suggest that protein import into the chloroplast stroma requires a chloroplast ATPase that does not function to generate a membrane potential for driving the import reaction but that exerts its effect in another, yet-to-be-determined, mode. They have carried out a preliminary characterization of this ATPase regarding its nucleotide specificity and the effects of various ATPase inhibitors.

  17. A MapReduce Framework for DNA Sequencing Data Processing

    Directory of Open Access Journals (Sweden)

    Samy Ghoneimy

    2016-12-01

    Full Text Available Genomics and Next Generation Sequencers (NGS) like Illumina HiSeq produce data on the order of 200 billion base pairs in a single one-week run for 60x human genome coverage, which requires modern high-throughput experimental technologies that can only be tackled with high performance computing (HPC) and specialized software algorithms called “short read aligners”. This paper focuses on the implementation of DNA sequencing as a set of MapReduce programs that accept a DNA data set as a FASTQ file and finally generate a VCF (variant call format) file, which holds the variants for a given DNA data set. In this paper MapReduce/Hadoop, along with the Burrows-Wheeler Aligner (BWA) and Sequence Alignment/Map (SAM) tools, are fully utilized to provide various utilities for manipulating alignments, including sorting, merging, indexing, and generating alignments. The Map-Sort-Reduce process is designed to suit a Hadoop framework in which each cluster is a traditional N-node Hadoop cluster, so as to utilize all of the Hadoop features like HDFS, program management and fault tolerance. The Map step performs multiple instances of the short read alignment algorithm (Bowtie) that run in parallel in Hadoop. The ordered list of the sequence reads is used as input tuples and the output tuples are the alignments of the short reads. In the Reduce step many parallel instances of the Short Oligonucleotide Analysis Package for SNP (SOAPsnp) algorithm run in the cluster. Input tuples are sorted alignments for a partition and the output tuples are SNP calls. Results are stored via HDFS, and then archived in SOAPsnp format. The proposed framework enables extremely fast discovery of somatic mutations, inference of population genetic parameters, and association tests directly based on sequencing data without explicit genotyping or linkage-based imputation. It also demonstrates that this method achieves comparable
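
    The Map-Sort-Reduce flow described in this record can be sketched in plain Python on a single machine. This is only an illustration of the data flow under stated assumptions: the brute-force aligner stands in for Bowtie, the majority-vote caller stands in for SOAPsnp, no Hadoop or HDFS is involved, and the reference string, function names, and thresholds are all hypothetical.

```python
from collections import Counter, defaultdict

REFERENCE = "ACGTACGTTAGC"  # hypothetical toy reference sequence


def map_align(read, reference=REFERENCE, max_mismatches=1):
    """Map step (toy aligner): return (position, read) for the best placement, or None."""
    best = None
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[0]):
            best = (mismatches, pos)
    return None if best is None else (best[1], read)


def reduce_call_snps(alignments, reference=REFERENCE, min_support=2):
    """Reduce step (toy caller): report sites whose majority base differs from the reference."""
    pileup = defaultdict(Counter)
    for pos, read in alignments:
        for offset, base in enumerate(read):
            pileup[pos + offset][base] += 1
    calls = []
    for site in sorted(pileup):
        base, support = pileup[site].most_common(1)[0]
        if base != reference[site] and support >= min_support:
            calls.append((site, reference[site], base))
    return calls


def run_pipeline(reads):
    """Map each read, sort alignments by position, then reduce them to SNP calls."""
    mapped = [hit for hit in map(map_align, reads) if hit is not None]
    mapped.sort()  # Sort step: order alignments by reference position
    return reduce_call_snps(mapped)


if __name__ == "__main__":
    reads = ["GTATGT", "TATGTT", "ACGTAC", "GGGGGG"]  # made-up reads
    print(run_pipeline(reads))
```

    In a real cluster the map and reduce functions would run as many parallel tasks over partitions of the reads, which is the part this single-process sketch leaves out.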

  18. Retroviral DNA Sequences as a Means for Determining Ancient Diets.

    Directory of Open Access Journals (Sweden)

    Jessica I Rivera-Perez

    Full Text Available For ages, specialists from varying fields have studied the diets of the primeval inhabitants of our planet, detecting diet remains in archaeological specimens using a range of morphological and biochemical methods. Recently, metagenomic ancient DNA studies have allowed for the comparison of the fecal and gut microbiomes associated with archaeological specimens from various regions of the world; however, the complex dynamics represented in those microbial communities still remain unclear. Theoretically, similar to eukaryote DNA, the presence of genes from key microbes or enzymes, as well as the presence of DNA from viruses specific to key organisms, may suggest the ingestion of specific diet components. In this study we demonstrate that ancient virus DNA obtained from coprolites also provides information for reconstructing the host's diet, as inferred from sequences obtained from pre-Columbian coprolites. This depicts a novel and reliable approach to determine new components as well as validate the previously suggested diets of extinct cultures and animals. Furthermore, to our knowledge this represents the first description of the eukaryotic viral diversity found in paleofaeces belonging to pre-Columbian cultures.

  19. Mitochondrial DNA sequencing of cat hair: an informative forensic tool.

    Science.gov (United States)

    Tarditi, Christy R; Grahn, Robert A; Evans, Jeffrey J; Kurushima, Jennifer D; Lyons, Leslie A

    2011-01-01

    Approximately 81.7 million cats are in 37.5 million U.S. households. Shed fur can be criminal evidence because of transfer to victims, suspects, and/or their belongings. To improve cat hairs as forensic evidence, the mtDNA control region from single hairs, with and without root tags, was sequenced. A dataset of a 402-bp control region segment from 174 random-bred cats representing four U.S. geographic areas was generated to determine the informativeness of the mtDNA region. Thirty-two mtDNA mitotypes were observed ranging in frequencies from 0.6-27%. Four common types occurred in all populations. Low heteroplasmy, 1.7%, was determined. Unique mitotypes were found in 18 individuals, 10.3% of the population studied. The calculated discrimination power implied that 8.3 of 10 randomly selected individuals can be excluded by this region. The genetic characteristics of the region and the generated dataset support the use of this cat mtDNA region in forensic applications. 2010 American Academy of Forensic Sciences. Published 2010. This article is a U.S. Government work and is in the public domain in the U.S.A.
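
    The discrimination power quoted in the abstract above is commonly computed from mitotype frequencies as one minus the sum of squared frequencies, that is, the probability that two randomly drawn individuals carry different types. The following is a minimal Python sketch under that assumption; the function name and the sample counts are hypothetical and are not the study's data.

```python
from collections import Counter

def discrimination_power(mitotypes):
    """Probability that two individuals drawn at random (with replacement)
    carry different mitotypes: 1 - sum(p_i^2)."""
    counts = Counter(mitotypes)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# Hypothetical sample of 10 cats with 4 mitotypes (illustrative only).
sample = ["A"] * 4 + ["B"] * 3 + ["C"] * 2 + ["D"]
dp = discrimination_power(sample)
print(f"discrimination power = {dp:.2f}")
```

    A value near 1 means that most random pairs can be told apart by this region, whereas a few very common mitotypes pull the value down.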

  20. Transcription blockage by homopurine DNA sequences: role of sequence composition and single-strand breaks

    Science.gov (United States)

    Belotserkovskii, Boris P.; Neil, Alexander J.; Saleh, Syed Shayon; Shin, Jane Hae Soo; Mirkin, Sergei M.; Hanawalt, Philip C.

    2013-01-01

    The ability of DNA to adopt non-canonical structures can affect transcription and has broad implications for genome functioning. We have recently reported that guanine-rich (G-rich) homopurine-homopyrimidine sequences cause significant blockage of transcription in vitro in a strictly orientation-dependent manner: when the G-rich strand serves as the non-template strand [Belotserkovskii et al. (2010) Mechanisms and implications of transcription blockage by guanine-rich DNA sequences., Proc. Natl Acad. Sci. USA, 107, 12816–12821]. We have now systematically studied the effect of the sequence composition and single-stranded breaks on this blockage. Although substitution of guanine by any other base reduced the blockage, cytosine and thymine reduced the blockage more significantly than adenine substitutions, affirming the importance of both G-richness and the homopurine-homopyrimidine character of the sequence for this effect. A single-strand break in the non-template strand adjacent to the G-rich stretch dramatically increased the blockage. Breaks in the non-template strand result in much weaker blockage signals extending downstream from the break even in the absence of the G-rich stretch. Our combined data support the notion that transcription blockage at homopurine-homopyrimidine sequences is caused by R-loop formation. PMID:23275544

  1. Automated methods for single-stranded DNA isolation and dideoxynucleotide DNA sequencing reactions on a robotic workstation

    International Nuclear Information System (INIS)

    Mardis, E.R.; Roe, B.A.

    1989-01-01

    Automated procedures have been developed both for the simultaneous isolation of 96 single-stranded M13 chimeric template DNAs in less than two hours and for simultaneously pipetting 24 dideoxynucleotide sequencing reactions on a commercially available laboratory workstation. The DNA sequencing results obtained by either radiolabeled or fluorescent methods are consistent with the premise that automation of these portions of DNA sequencing projects will improve the reproducibility of the DNA isolation, and the procedures for these normally labor-intensive steps provide an approach for rapid acquisition of large amounts of high-quality, reproducible DNA sequence data.

  2. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure...... that the data produced is optimal. Although much of the procedure can be followed directly from the manufacturer's protocols, the key differences lie in the library preparation steps. This chapter presents an optimized protocol for the sequencing of fossil remains and museum specimens, commonly referred...

  3. Horizontal gene transfer of a chloroplast DnaJ-Fer protein to Thaumarchaeota and the evolutionary history of the DnaK chaperone system in Archaea.

    Science.gov (United States)

    Petitjean, Céline; Moreira, David; López-García, Purificación; Brochier-Armanet, Céline

    2012-11-26

    In 2004, we discovered an atypical protein in metagenomic data from marine thaumarchaeotal species. This protein, referred to as DnaJ-Fer, is composed of a J domain fused to a Ferredoxin (Fer) domain. Surprisingly, the same protein was also found in Viridiplantae (green algae and land plants). Because J domain-containing proteins are known to interact with the major chaperone DnaK/Hsp70, this suggested that a DnaK protein was present in Thaumarchaeota. DnaK/Hsp70, its co-chaperone DnaJ and the nucleotide exchange factor GrpE are involved in, among other processes, heat shock and heavy metal cellular stress responses. Using phylogenomic approaches we have investigated the evolutionary history of the DnaJ-Fer protein and of interacting proteins DnaK, DnaJ and GrpE in Thaumarchaeota. These proteins have very complex histories, involving several inter-domain horizontal gene transfers (HGTs) to explain the contemporary distribution of these proteins in archaea. These transfers include one from Cyanobacteria to Viridiplantae and one from Viridiplantae to Thaumarchaeota for the DnaJ-Fer protein, as well as independent HGTs from Bacteria to mesophilic archaea for the DnaK/DnaJ/GrpE system, followed by HGTs among mesophilic and thermophilic archaea. We highlight the chimerical origin of the set of proteins DnaK, DnaJ, GrpE and DnaJ-Fer in Thaumarchaeota and suggest that the HGT of these proteins has played an important role in the adaptation of several archaeal groups to mesophilic and thermophilic environments from hyperthermophilic ancestors. Finally, the evolutionary history of DnaJ-Fer provides information useful for the relative dating of the diversification of Archaeplastida and Thaumarchaeota.

  4. Horizontal gene transfer of a chloroplast DnaJ-Fer protein to Thaumarchaeota and the evolutionary history of the DnaK chaperone system in Archaea

    Directory of Open Access Journals (Sweden)

    Petitjean Céline

    2012-11-01

    Full Text Available Background: In 2004, we discovered an atypical protein in metagenomic data from marine thaumarchaeotal species. This protein, referred to as DnaJ-Fer, is composed of a J domain fused to a Ferredoxin (Fer) domain. Surprisingly, the same protein was also found in Viridiplantae (green algae and land plants). Because J domain-containing proteins are known to interact with the major chaperone DnaK/Hsp70, this suggested that a DnaK protein was present in Thaumarchaeota. DnaK/Hsp70, its co-chaperone DnaJ and the nucleotide exchange factor GrpE are involved in, among other processes, heat shock and heavy metal cellular stress responses. Results: Using phylogenomic approaches we have investigated the evolutionary history of the DnaJ-Fer protein and of interacting proteins DnaK, DnaJ and GrpE in Thaumarchaeota. These proteins have very complex histories, involving several inter-domain horizontal gene transfers (HGTs) to explain the contemporary distribution of these proteins in archaea. These transfers include one from Cyanobacteria to Viridiplantae and one from Viridiplantae to Thaumarchaeota for the DnaJ-Fer protein, as well as independent HGTs from Bacteria to mesophilic archaea for the DnaK/DnaJ/GrpE system, followed by HGTs among mesophilic and thermophilic archaea. Conclusions: We highlight the chimerical origin of the set of proteins DnaK, DnaJ, GrpE and DnaJ-Fer in Thaumarchaeota and suggest that the HGT of these proteins has played an important role in the adaptation of several archaeal groups to mesophilic and thermophilic environments from hyperthermophilic ancestors. Finally, the evolutionary history of DnaJ-Fer provides information useful for the relative dating of the diversification of Archaeplastida and Thaumarchaeota.

  5. DNA interaction with platinum-based cytostatics revealed by DNA sequencing.

    Science.gov (United States)

    Smerkova, Kristyna; Vaculovic, Tomas; Vaculovicova, Marketa; Kynicky, Jindrich; Brtnicky, Martin; Eckschlager, Tomas; Stiborova, Marie; Hubalek, Jaromir; Adam, Vojtech

    2017-12-15

    The main mechanism of action of platinum-based cytostatic drugs (cisplatin, oxaliplatin and carboplatin) is the formation of DNA cross-links, which restrict transcription because the cross-linked DNA cannot enter the active site of the polymerase. The polymerase chain reaction (PCR) was employed as a simplified model of the amplification process in the cell nucleus. PCR with fluorescently labelled dideoxynucleotides, as commonly employed for DNA sequencing, was used to monitor the effect of platinum-based cytostatics on DNA in terms of the decrease in labelling efficiency caused by the presence of DNA-drug cross-links. Significantly different amounts of the drugs, cisplatin (0.21 μg/mL), oxaliplatin (5.23 μg/mL), and carboplatin (71.11 μg/mL), were required to cause the same quenching effect (50%) on the fluorescent labelling of 50 μg/mL of DNA. Moreover, even though the amounts of the drugs applied to the reaction mixture differed by several orders of magnitude, the amount of incorporated platinum, quantified by inductively coupled plasma mass spectrometry, was in all cases at the level of tenths of μg per 5 μg of DNA. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. New scoring schema for finding motifs in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Nowzari-Dalini Abbas

    2009-03-01

    Full Text Available Background: Pattern discovery in DNA sequences is one of the most fundamental problems in molecular biology, with important applications in finding regulatory signals and transcription factor binding sites. An important task in this problem is to search for (or predict) known binding sites in a new DNA sequence. For this purpose, all subsequences of the given DNA sequence are scored with a scoring function, and the prediction is made by selecting the best score. Most of the available tools for known binding site prediction are designed on the assumption of no dependency between binding site base positions. Recently Tomovic and Oakeley investigated the statistical basis for either a claim of dependence or independence, to determine whether such a claim is generally true, and they presented a scoring function for binding site prediction based on the dependency between binding site base positions. Our primary objective is to investigate the scoring functions which can be used in known binding site prediction based on the assumption of dependency or independency in binding site base positions. Results: We propose a new scoring function based on the dependency between all base positions in the binding site. This scoring function uses joint information content and mutual information as a measure of dependency between positions in a transcription factor binding site. Our method for modeling dependencies is simply an extension of position independency methods. We evaluate our new scoring function on real data sets extracted from the JASPAR and TRANSFAC databases, and compare the obtained results with two other well-known scoring functions. Conclusion: The results demonstrate that the new approach improves known binding site discovery and show that the joint information content and mutual information provide a better and more general criterion to investigate the relationships between positions in the TFBS. Our scoring function is formulated by simple
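
    The two quantities named in the abstract above, per-position information content and mutual information between positions, can be illustrated on a small alignment of binding sites. The following Python sketch uses the standard textbook definitions with a uniform background over A, C, G and T; it is not the authors' scoring function, and the example sites are hypothetical rather than taken from JASPAR or TRANSFAC.

```python
import math
from collections import Counter

def information_content(column):
    """Information content (bits) of one alignment column,
    assuming a uniform background over A, C, G, T: 2 - H(column)."""
    n = len(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(column).values())
    return 2.0 - entropy

def mutual_information(col_i, col_j):
    """Mutual information (bits) between two alignment columns."""
    n = len(col_i)
    p_i, p_j = Counter(col_i), Counter(col_j)
    p_ij = Counter(zip(col_i, col_j))
    return sum(
        (c / n) * math.log2((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
        for (a, b), c in p_ij.items()
    )

# Hypothetical aligned binding sites (illustrative only).
sites = ["ACGT", "ACGA", "TGGT", "TGGA"]
columns = list(zip(*sites))  # one tuple of bases per position

mi_dependent = mutual_information(columns[0], columns[1])    # positions 1 and 2 co-vary
mi_independent = mutual_information(columns[0], columns[3])  # positions 1 and 4 do not
```

    In this toy alignment the first two positions always change together, so their mutual information is high, while the first and last positions vary independently. A position-independent score would treat both pairs the same way.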

  7. A comparison of rice chloroplast genomes

    DEFF Research Database (Denmark)

    Tang, Jiabin; Xia, Hong'ai; Cao, Mengliang

    2004-01-01

    Using high quality sequence reads extracted from our whole genome shotgun repository, we assembled two chloroplast genome sequences from two rice (Oryza sativa) varieties, one from 93-11 (a typical indica variety) and the other from PA64S (an indica-like variety with maternal origin of japonica......), which are both parental varieties of the super-hybrid rice, LYP9. Based on the patterns of high sequence coverage, we partitioned chloroplast sequence variations into two classes, intravarietal and intersubspecific polymorphisms. Intravarietal polymorphisms refer to variations within 93-11 or PA64S...

  8. The complete chloroplast genome sequence of the CAM epiphyte Spanish moss (Tillandsia usneoides, Bromeliaceae) and its comparative analysis.

    Directory of Open Access Journals (Sweden)

    Péter Poczai

    Full Text Available Spanish moss (Tillandsia usneoides) is an epiphytic bromeliad widely distributed throughout tropical and warm temperate America. This plant is highly adapted to extreme environmental conditions. Striking features of this species include specialized trichomes (scales) covering the surface of its shoots aiding the absorption of water and nutrients directly from the atmosphere and a specific photosynthesis using crassulacean acid metabolism (CAM). Here we report the plastid genome of Spanish moss and present the comparison of genome organization and sequence evolution within Poales. The plastome of Spanish moss has a quadripartite structure consisting of a large single copy (LSC, 87,439 bp), two inverted regions (IRa and IRb, 26,803 bp) and short single copy (SSC, 18,612 bp) region. The plastid genome had 37.2% GC content and 134 genes with 88 being unique protein-coding genes and 20 of these are duplicated in the IR, similar to other reported bromeliads. Our study shows that early diverging lineages of Poales do not have high substitution rates as compared to grasses, and plastid genomes of bromeliads show structural features considered to be ancestral in graminids. These include the loss of the introns in the clpP and rpoC1 genes and the complete loss or partial degradation of accD and ycf genes in the Graminid clade. Further structural rearrangements appeared in the graminids lacking in Spanish moss, which include a 28-kb inversion between the trnG-UCC-rps14 region and 6-kb in the trnG-UCC-psbD, followed by a third <1kb inversion in the trnT sequence.

  9. The complete chloroplast genome sequence of the CAM epiphyte Spanish moss (Tillandsia usneoides, Bromeliaceae) and its comparative analysis.

    Science.gov (United States)

    Poczai, Péter; Hyvönen, Jaakko

    2017-01-01

    Spanish moss (Tillandsia usneoides) is an epiphytic bromeliad widely distributed throughout tropical and warm temperate America. This plant is highly adapted to extreme environmental conditions. Striking features of this species include specialized trichomes (scales) covering the surface of its shoots aiding the absorption of water and nutrients directly from the atmosphere and a specific photosynthesis using crassulacean acid metabolism (CAM). Here we report the plastid genome of Spanish moss and present the comparison of genome organization and sequence evolution within Poales. The plastome of Spanish moss has a quadripartite structure consisting of a large single copy (LSC, 87,439 bp), two inverted regions (IRa and IRb, 26,803 bp) and short single copy (SSC, 18,612 bp) region. The plastid genome had 37.2% GC content and 134 genes with 88 being unique protein-coding genes and 20 of these are duplicated in the IR, similar to other reported bromeliads. Our study shows that early diverging lineages of Poales do not have high substitution rates as compared to grasses, and plastid genomes of bromeliads show structural features considered to be ancestral in graminids. These include the loss of the introns in the clpP and rpoC1 genes and the complete loss or partial degradation of accD and ycf genes in the Graminid clade. Further structural rearrangements appeared in the graminids lacking in Spanish moss, which include a 28-kb inversion between the trnG-UCC-rps14 region and 6-kb in the trnG-UCC-psbD, followed by a third <1kb inversion in the trnT sequence.
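
    The region lengths quoted in the abstract above imply a total plastome length if the 26,803 bp figure is read as the length of each inverted-repeat copy. That reading is an assumption; the abstract does not state the total. The following Python sketch does the arithmetic and adds a generic GC-content helper. Both function names are illustrative.

```python
# Region lengths (bp) as quoted in the abstract above.
LSC = 87_439
IR = 26_803   # assumed to be the length of EACH inverted-repeat copy
SSC = 18_612

def plastome_length(lsc, ir, ssc):
    """Total length of a quadripartite plastome: LSC + IRa + IRb + SSC."""
    return lsc + 2 * ir + ssc

def gc_content(seq):
    """Percentage of G and C bases in a sequence string."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

total = plastome_length(LSC, IR, SSC)
print(f"implied plastome length: {total:,} bp")
```

    Under that assumption the four regions sum to 159,657 bp.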

  10. Transient foreign gene expression in chloroplasts of cultured tobacco cells after biolistic delivery of chloroplast vectors.

    Science.gov (United States)

    Daniell, H; Vivekananda, J; Nielsen, B L; Ye, G N; Tewari, K K; Sanford, J C

    1990-01-01

    Expression of chloramphenicol acetyltransferase (cat) by suitable vectors in chloroplasts of cultured tobacco cells, delivered by high-velocity microprojectiles, is reported here. Several chloroplast expression vectors containing bacterial cat genes, placed under the control of either psbA promoter region from pea (pHD series) or rbcL promoter region from maize (pAC series) have been used in this study. In addition, chloroplast expression vectors containing replicon fragments from pea, tobacco, or maize chloroplast DNA have also been tested for efficiency and duration of cat expression in chloroplasts of tobacco cells. Cultured NT1 tobacco cells collected on filter papers were bombarded with tungsten particles coated with pUC118 (negative control), 35S-CAT (nuclear expression vector), pHD312 (repliconless chloroplast expression vector), and pHD407, pACp18, and pACp19 (chloroplast expression vectors with replicon). Sonic extracts of cells bombarded with pUC118 showed no detectable cat activity in the autoradiograms. Nuclear expression of cat reached two-thirds of the maximal 48 hr after bombardment and the maximal at 72 hr. Cells bombarded with chloroplast expression vectors showed a low level of expression until 48 hr of incubation. A dramatic increase in the expression of cat was observed 24 hr after the addition of fresh medium to cultured cells in samples bombarded with pHD407; the repliconless vector pHD312 showed about 50% of this maximal activity. The expression of nuclear cat and the repliconless chloroplast vector decreased after 72 hr, but a high level of chloroplast cat expression was maintained in cells bombarded with pHD407. Organelle-specific expression of cat in appropriate compartments was checked by introducing various plasmid constructions into tobacco protoplasts by electroporation. Although the nuclear expression vector 35S-CAT showed expression of cat, no activity was observed with any chloroplast vectors.

  11. Hypervariable minisatellite DNA sequences in the Indian peafowl Pavo cristatus.

    Science.gov (United States)

    Hanotte, O; Burke, T; Armour, J A; Jeffreys, A J

    1991-04-01

    We report here for the first time the large-scale isolation of hypervariable minisatellite DNA sequences from a non-human species, the Indian peafowl (Pavo cristatus). A size-selected genomic DNA fraction, rich in hypervariable minisatellites, was cloned into Charomid 9-36. This library was screened using two multilocus hypervariable probes, 33.6 and 33.15 and also, in a "probe-walking" approach, with five of the peafowl minisatellites initially isolated. Forty-eight positively hybridizing clones were characterized and found to originate from 30 different loci, 18 of which were polymorphic. Five of these variable minisatellite loci were studied further. They all showed Mendelian inheritance. The heterozygosities of these loci were relatively low (range 22-78%) in comparison with those of previously cloned human loci, as expected in view of inbreeding in our semicaptive study population. No new length allele mutations were observed in families and the mean mutation rate per locus is low (less than 0.004, 95% confidence maximum). These loci were also investigated by cross-species hybridization in related taxa. The ability of the probes to detect hypervariable sequences in other species within the same avian family was found to vary, from those probes that are species-specific to those that are apparently general to the family. We also illustrate the potential usefulness of these probes for paternity analysis in a study of sexual selection, and discuss the general application of specific hypervariable probes in behavioral and evolutionary studies.

  12. A pneumatic device for rapid loading of DNA sequencing gels.

    Science.gov (United States)

    Panussis, D A; Cook, M W; Rifkin, L L; Snider, J E; Strong, J T; McGrane, R M; Wilson, R K; Mardis, E R

    1998-05-01

    This work describes the design and construction of a device that facilitates the loading of DNA samples onto polyacrylamide gels for detection in the Perkin Elmer/Applied Biosystems (PE/ABI) 373 and 377 DNA sequencing instruments. The device is mounted onto the existing gel cassettes and makes the process of loading high-density gels less cumbersome while the associated time and errors are reduced. The principle of operation includes the simultaneous transfer of the entire batch of samples, in which a spring-loaded air cylinder generates positive pressure and flexible silica capillaries transfer the samples. A retractable capillary array carrier allows the delivery ends of the capillaries to be held up clear of the gel during loader attachment on the gel plates, while enabling their insertion in the gel wells once the device is securely mounted. Gel-loading devices capable of simultaneously transferring 72 samples onto the PE/ABI 373 and 377 are currently being used in our production sequencing groups while a 96-sample transfer prototype undergoes testing.

  13. Phylogenomics of Phrynosomatid Lizards: Conflicting Signals from Sequence Capture versus Restriction Site Associated DNA Sequencing

    Science.gov (United States)

    Leaché, Adam D.; Chavez, Andreas S.; Jones, Leonard N.; Grummer, Jared A.; Gottscho, Andrew D.; Linkem, Charles W.

    2015-01-01

    Sequence capture and restriction site associated DNA sequencing (RADseq) are popular methods for obtaining large numbers of loci for phylogenetic analysis. These methods are typically used to collect data at different evolutionary timescales; sequence capture is primarily used for obtaining conserved loci, whereas RADseq is designed for discovering single nucleotide polymorphisms (SNPs) suitable for population genetic or phylogeographic analyses. Phylogenetic questions that span both “recent” and “deep” timescales could benefit from either type of data, but studies that directly compare the two approaches are lacking. We compared phylogenies estimated from sequence capture and double digest RADseq (ddRADseq) data for North American phrynosomatid lizards, a species-rich and diverse group containing nine genera that began diversifying approximately 55 Ma. Sequence capture resulted in 584 loci that provided a consistent and strong phylogeny using concatenation and species tree inference. However, the phylogeny estimated from the ddRADseq data was sensitive to the bioinformatics steps used for determining homology, detecting paralogs, and filtering missing data. The topological conflicts among the SNP trees were not restricted to any particular timescale, but instead were associated with short internal branches. Species tree analysis of the largest SNP assembly, which also included the most missing data, supported a topology that matched the sequence capture tree. This preferred phylogeny provides strong support for the paraphyly of the earless lizard genera Holbrookia and Cophosaurus, suggesting that the earless morphology either evolved twice or evolved once and was subsequently lost in Callisaurus. PMID:25663487

  14. DNA hybridization kinetics: zippering, internal displacement and sequence dependence.

    Science.gov (United States)

    Ouldridge, Thomas E; Sulc, Petr; Romano, Flavio; Doye, Jonathan P K; Louis, Ard A

    2013-10-01

    Although the thermodynamics of DNA hybridization is generally well established, the kinetics of this classic transition is less well understood. Providing such understanding has new urgency because DNA nanotechnology often depends critically on binding rates. Here, we explore DNA oligomer hybridization kinetics using a coarse-grained model. Strand association proceeds through a complex set of intermediate states, with successful binding events initiated by a few metastable base-pairing interactions, followed by zippering of the remaining bonds. But despite reasonably strong interstrand interactions, initial contacts frequently dissociate because typical configurations in which they form differ from typical states of similar enthalpy in the double-stranded equilibrium ensemble. Initial contacts must be stabilized by two or three base pairs before full zippering is likely, resulting in negative effective activation enthalpies. Non-Arrhenius behavior arises because the number of base pairs required for nucleation increases with temperature. In addition, we observe two alternative pathways, pseudoknot and inchworm internal displacement, through which misaligned duplexes can rearrange to form duplexes. These pathways accelerate hybridization. Our results explain why experimentally observed association rates of GC-rich oligomers are higher than rates of AT-rich equivalents, and more generally demonstrate how association rates can be modulated by sequence choice.

  15. A Novel Computational Method for Detecting DNA Methylation Sites with DNA Sequence Information and Physicochemical Properties.

    Science.gov (United States)

    Pan, Gaofeng; Jiang, Limin; Tang, Jijun; Guo, Fei

    2018-02-08

    DNA methylation is an important biochemical process, and it has a close connection with many types of cancer. Research about DNA methylation can help us to understand the regulation mechanism and epigenetic reprogramming. Therefore, it becomes very important to recognize the methylation sites in the DNA sequence. In the past several decades, many computational methods, especially machine learning methods, have been developed since high-throughput sequencing technology became widely used in research and industry. In order to accurately identify whether or not a nucleotide residue is methylated under the specific DNA sequence context, we propose a novel method that overcomes the shortcomings of previous methods for predicting methylation sites. We use k-gram, multivariate mutual information, discrete wavelet transform, and pseudo amino acid composition to extract features, and train a sparse Bayesian learning model to do DNA methylation prediction. Five criteria, namely area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), accuracy (ACC), sensitivity (SN), and specificity, are used to evaluate the prediction results of our method. On the benchmark dataset, we could reach 0.8632 on AUC, 0.8017 on ACC, 0.5558 on MCC, and 0.7268 on SN. Additionally, the best results on two scBS-seq profiled mouse embryonic stem cells datasets were 0.8896 and 0.9511 by AUC, respectively. When compared with other outstanding methods, our method surpassed them on the accuracy of prediction. The improvement of AUC by our method compared to other methods was at least 0.0399. For the convenience of other researchers, our code has been uploaded to a file hosting service, and can be downloaded from: https://figshare.com/s/0697b692d802861282d3.
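
    Four of the five evaluation criteria listed in the abstract above (ACC, SN, specificity and MCC) follow directly from a binary confusion matrix. The following Python sketch uses the standard definitions. The function name and the counts are hypothetical and do not reproduce the paper's results; AUC is omitted because it needs ranked prediction scores rather than counts.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and Matthews correlation
    coefficient from the four cells of a binary confusion matrix."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    sn = tp / (tp + fn)   # sensitivity: fraction of methylated sites recovered
    sp = tn / (tn + fp)   # specificity: fraction of unmethylated sites recovered
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}

# Hypothetical counts for 100 candidate sites (illustrative only).
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in m.items():
    print(f"{name} = {value:.4f}")
```

    MCC is often reported alongside accuracy because it stays informative when methylated and unmethylated sites are unevenly represented.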

  16. A Novel Computational Method for Detecting DNA Methylation Sites with DNA Sequence Information and Physicochemical Properties

    Directory of Open Access Journals (Sweden)

    Gaofeng Pan

    2018-02-01

    Full Text Available DNA methylation is an important biochemical process, and it has a close connection with many types of cancer. Research about DNA methylation can help us to understand the regulation mechanism and epigenetic reprogramming. Therefore, it becomes very important to recognize the methylation sites in the DNA sequence. In the past several decades, many computational methods, especially machine learning methods, have been developed since high-throughput sequencing technology became widely used in research and industry. In order to accurately identify whether or not a nucleotide residue is methylated under the specific DNA sequence context, we propose a novel method that overcomes the shortcomings of previous methods for predicting methylation sites. We use k-gram, multivariate mutual information, discrete wavelet transform, and pseudo amino acid composition to extract features, and train a sparse Bayesian learning model to do DNA methylation prediction. Five criteria, namely area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), accuracy (ACC), sensitivity (SN), and specificity, are used to evaluate the prediction results of our method. On the benchmark dataset, we could reach 0.8632 on AUC, 0.8017 on ACC, 0.5558 on MCC, and 0.7268 on SN. Additionally, the best results on two scBS-seq profiled mouse embryonic stem cells datasets were 0.8896 and 0.9511 by AUC, respectively. When compared with other outstanding methods, our method surpassed them on the accuracy of prediction. The improvement of AUC by our method compared to other methods was at least 0.0399. For the convenience of other researchers, our code has been uploaded to a file hosting service, and can be downloaded from: https://figshare.com/s/0697b692d802861282d3.

  17. Determination of cDNA and genomic DNA sequences of hevamine, a chitinase from the rubber tree Hevea brasiliensis

    NARCIS (Netherlands)

    Bokma, E; Spiering, M; Chow, KS; Mulder, PPMFA; Subroto, T; Beintema, JJ

    Hevamine is a chitinase from the rubber tree Hevea brasiliensis and belongs to the family 18 glycosyl hydrolases. This paper describes the cloning of hevamine DNA and cDNA sequences. Hevamine contains a signal peptide at the N-terminus and a putative vacuolar targeting sequence at the C-terminus

  18. Anti-DNA antibodies: Sequencing, cloning, and expression

    Energy Technology Data Exchange (ETDEWEB)

    Barry, M.M.

    1992-01-01

    To gain some insight into the mechanism of systemic lupus erythematosus (SLE), and the interactions involved in proteins binding to DNA, four anti-DNA antibodies have been investigated. Two of the antibodies, Hed 10 and Jel 242, have previously been prepared from female NZB/NZW mice, which develop an autoimmune disease resembling human SLE. The remaining two antibodies, Jel 72 and Jel 318, have previously been produced via immunization of C57BL/6 mice. The isotypes of the four antibodies investigated in this thesis were determined by an enzyme-linked immunosorbent assay. All four antibodies contained κ light chains and γ2a heavy chains, except Jel 318, which contains a γ2b heavy chain. The complete variable regions of the heavy and light chains of these four antibodies were sequenced from their respective mRNAs. The gene segments and variable gene families expressed in each antibody were identified. Analysis of the genes used in the autoimmune anti-DNA antibodies and those produced by immunization indicated no obvious differences to account for their different origins. Examination of the amino acid residues present in the complementarity-determining regions of these four antibodies indicates a preference for aromatic amino acids. Jel 72 and Jel 242 contain three arginine residues in the third complementarity-determining region. A single-chain Fv and the variable region of the heavy chain of Hed 10 were expressed in Escherichia coli. Expression resulted in the production of a 26,000 M_r protein and a 15,000 M_r protein. An immunoblot indicated that the 26,000 M_r protein was the Fv for Hed 10, while the 15,000 M_r protein was shown to bind poly(dT). The contribution of the heavy chain to DNA binding was assessed.

  19. Comparative d2/d3 LSU–rDNA sequence study of some Iranian ...

    African Journals Online (AJOL)

    SERVER

    2007-11-05

    Nov 5, 2007 ... segments yielded one fragment at over all sequenced isolates as 787 bp in size. The DNA sequences were aligned .... expansion segments of the 28S rDNA subunit (D2/D3. LSU-rDNA) are the ... isolated from different geographical location from tea shrubs infested roots of Guilan province, Iran (Table 1).

  20. Sequence specificity and biological consequences of drugs that bind covalently in the minor groove of DNA

    International Nuclear Information System (INIS)

    Hurley, L.H.; Needham-VanDevanter, D.R.

    1986-01-01

    DNA ligands which bind within the minor groove of DNA exhibit varying degrees of sequence selectivity. Factors which contribute to nucleotide sequence recognition by minor groove ligands have been extensively investigated. Electrostatic interactions, ligand and DNA dehydration energies, hydrophobic interactions and steric factors all play significant roles in sequence selectivity in the minor groove. Interestingly, ligand recognition of nucleotide sequence in the minor groove does not involve significant hydrogen bonding. This is in sharp contrast to cellular enzyme and protein recognition of nucleotide sequence, which is achieved in the major groove via specific hydrogen bond formation between individual bases and the ligand. The ability to read nucleotide sequence via hydrogen bonding allows precise binding of proteins to specific DNA sequences. Minor groove ligands examined to date exhibit a much lower sequence specificity, generally binding to a subset of possible sequences, rather than a single sequence. 19 refs., 7 figs

  1. The DNA sequence of the human X chromosome

    Science.gov (United States)

    Ross, Mark T.; Grafham, Darren V.; Coffey, Alison J.; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R.; Burrows, Christine; Bird, Christine P.; Frankish, Adam; Lovell, Frances L.; Howe, Kevin L.; Ashurst, Jennifer L.; Fulton, Robert S.; Sudbrak, Ralf; Wen, Gaiping; Jones, Matthew C.; Hurles, Matthew E.; Andrews, T. Daniel; Scott, Carol E.; Searle, Stephen; Ramser, Juliane; Whittaker, Adam; Deadman, Rebecca; Carter, Nigel P.; Hunt, Sarah E.; Chen, Rui; Cree, Andrew; Gunaratne, Preethi; Havlak, Paul; Hodgson, Anne; Metzker, Michael L.; Richards, Stephen; Scott, Graham; Steffen, David; Sodergren, Erica; Wheeler, David A.; Worley, Kim C.; Ainscough, Rachael; Ambrose, Kerrie D.; Ansari-Lari, M. Ali; Aradhya, Swaroop; Ashwell, Robert I. S.; Babbage, Anne K.; Bagguley, Claire L.; Ballabio, Andrea; Banerjee, Ruby; Barker, Gary E.; Barlow, Karen F.; Barrett, Ian P.; Bates, Karen N.; Beare, David M.; Beasley, Helen; Beasley, Oliver; Beck, Alfred; Bethel, Graeme; Blechschmidt, Karin; Brady, Nicola; Bray-Allen, Sarah; Bridgeman, Anne M.; Brown, Andrew J.; Brown, Mary J.; Bonnin, David; Bruford, Elspeth A.; Buhay, Christian; Burch, Paula; Burford, Deborah; Burgess, Joanne; Burrill, Wayne; Burton, John; Bye, Jackie M.; Carder, Carol; Carrel, Laura; Chako, Joseph; Chapman, Joanne C.; Chavez, Dean; Chen, Ellson; Chen, Guan; Chen, Yuan; Chen, Zhijian; Chinault, Craig; Ciccodicola, Alfredo; Clark, Sue Y.; Clarke, Graham; Clee, Chris M.; Clegg, Sheila; Clerc-Blankenburg, Kerstin; Clifford, Karen; Cobley, Vicky; Cole, Charlotte G.; Conquer, Jen S.; Corby, Nicole; Connor, Richard E.; David, Robert; Davies, Joy; Davis, Clay; Davis, John; Delgado, Oliver; DeShazo, Denise; Dhami, Pawandeep; Ding, Yan; Dinh, Huyen; Dodsworth, Steve; Draper, Heather; Dugan-Rocha, Shannon; Dunham, Andrew; Dunn, Matthew; Durbin, K. 
James; Dutta, Ireena; Eades, Tamsin; Ellwood, Matthew; Emery-Cohen, Alexandra; Errington, Helen; Evans, Kathryn L.; Faulkner, Louisa; Francis, Fiona; Frankland, John; Fraser, Audrey E.; Galgoczy, Petra; Gilbert, James; Gill, Rachel; Glöckner, Gernot; Gregory, Simon G.; Gribble, Susan; Griffiths, Coline; Grocock, Russell; Gu, Yanghong; Gwilliam, Rhian; Hamilton, Cerissa; Hart, Elizabeth A.; Hawes, Alicia; Heath, Paul D.; Heitmann, Katja; Hennig, Steffen; Hernandez, Judith; Hinzmann, Bernd; Ho, Sarah; Hoffs, Michael; Howden, Phillip J.; Huckle, Elizabeth J.; Hume, Jennifer; Hunt, Paul J.; Hunt, Adrienne R.; Isherwood, Judith; Jacob, Leni; Johnson, David; Jones, Sally; de Jong, Pieter J.; Joseph, Shirin S.; Keenan, Stephen; Kelly, Susan; Kershaw, Joanne K.; Khan, Ziad; Kioschis, Petra; Klages, Sven; Knights, Andrew J.; Kosiura, Anna; Kovar-Smith, Christie; Laird, Gavin K.; Langford, Cordelia; Lawlor, Stephanie; Leversha, Margaret; Lewis, Lora; Liu, Wen; Lloyd, Christine; Lloyd, David M.; Loulseged, Hermela; Loveland, Jane E.; Lovell, Jamieson D.; Lozado, Ryan; Lu, Jing; Lyne, Rachael; Ma, Jie; Maheshwari, Manjula; Matthews, Lucy H.; McDowall, Jennifer; McLaren, Stuart; McMurray, Amanda; Meidl, Patrick; Meitinger, Thomas; Milne, Sarah; Miner, George; Mistry, Shailesh L.; Morgan, Margaret; Morris, Sidney; Müller, Ines; Mullikin, James C.; Nguyen, Ngoc; Nordsiek, Gabriele; Nyakatura, Gerald; O’Dell, Christopher N.; Okwuonu, Geoffery; Palmer, Sophie; Pandian, Richard; Parker, David; Parrish, Julia; Pasternak, Shiran; Patel, Dina; Pearce, Alex V.; Pearson, Danita M.; Pelan, Sarah E.; Perez, Lesette; Porter, Keith M.; Ramsey, Yvonne; Reichwald, Kathrin; Rhodes, Susan; Ridler, Kerry A.; Schlessinger, David; Schueler, Mary G.; Sehra, Harminder K.; Shaw-Smith, Charles; Shen, Hua; Sheridan, Elizabeth M.; Shownkeen, Ratna; Skuce, Carl D.; Smith, Michelle L.; Sotheran, Elizabeth C.; Steingruber, Helen E.; Steward, Charles A.; Storey, Roy; Swann, R. 
Mark; Swarbreck, David; Tabor, Paul E.; Taudien, Stefan; Taylor, Tineace; Teague, Brian; Thomas, Karen; Thorpe, Andrea; Timms, Kirsten; Tracey, Alan; Trevanion, Steve; Tromans, Anthony C.; d’Urso, Michele; Verduzco, Daniel; Villasana, Donna; Waldron, Lenee; Wall, Melanie; Wang, Qiaoyan; Warren, James; Warry, Georgina L.; Wei, Xuehong; West, Anthony; Whitehead, Siobhan L.; Whiteley, Mathew N.; Wilkinson, Jane E.; Willey, David L.; Williams, Gabrielle; Williams, Leanne; Williamson, Angela; Williamson, Helen; Wilming, Laurens; Woodmansey, Rebecca L.; Wray, Paul W.; Yen, Jennifer; Zhang, Jingkun; Zhou, Jianling; Zoghbi, Huda; Zorilla, Sara; Buck, David; Reinhardt, Richard; Poustka, Annemarie; Rosenthal, André; Lehrach, Hans; Meindl, Alfons; Minx, Patrick J.; Hillier, LaDeana W.; Willard, Huntington F.; Wilson, Richard K.; Waterston, Robert H.; Rice, Catherine M.; Vaudin, Mark; Coulson, Alan; Nelson, David L.; Weinstock, George; Sulston, John E.; Durbin, Richard; Hubbard, Tim; Gibbs, Richard A.; Beck, Stephan; Rogers, Jane; Bentley, David R.

    2009-01-01

    The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence. PMID:15772651

  2. RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

    Science.gov (United States)

    Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

    2012-01-01

    RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, and sequence-specific information (total number of nucleotide bases and A, T, G, C base contents with their respective percentages), along with a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available at http://www.cemb.edu.pk/sw.html. Abbreviations: RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
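
    The abstract names the Nussinov dynamic programming algorithm and reverse complement generation. The sketch below is a minimal, textbook-style version of both in Python, not RDNAnalyzer's own C# implementation or its extensions. The function names and the `min_loop` parameter (minimum number of unpaired bases inside a hairpin) are assumptions made for illustration.

```python
# Minimal Nussinov-style base-pair maximisation for DNA (illustrative only).
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def nussinov_pairs(seq, min_loop=3):
    """Return the maximum number of nested Watson-Crick pairs in seq.

    min_loop is a hypothetical parameter: paired bases i and j must
    enclose at least min_loop unpaired positions.
    """
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = max(best[i + 1][j], best[i][j - 1])  # i or j unpaired
            if (seq[i], seq[j]) in WATSON_CRICK:          # i pairs with j
                score = max(score, best[i + 1][j - 1] + 1)
            for k in range(i + 1, j):                     # bifurcation
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(nussinov_pairs("GGGAAATCCC"))
```

    A traceback over the `best` table would recover the actual pairings; it is omitted here to keep the sketch short.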

  3. Diversity of chloroplast genome among local clones of cocoa (Theobroma cacao, L.) from Central Sulawesi

    Science.gov (United States)

    Suwastika, I. Nengah; Pakawaru, Nurul Aisyah; Rifka, Rahmansyah, Muslimin, Ishizaki, Yoko; Cruz, André Freire; Basri, Zainuddin; Shiina, Takashi

    2017-02-01

    Chloroplast genomes typically range in size from 120 to 170 kilobase pairs (kb) and are relatively conserved among plant species. Recent evaluations of several species have shown that certain unique regions are highly variable and can be utilized in phylogenetic analysis. Many fragments of coding regions, introns, and intergenic spacers, such as atpB-rbcL, ndhF, rbcL, rpl16, trnH-psbA, trnL-F, trnS-G, etc., have been used for phylogenetic reconstructions at various taxonomic levels. On that basis, we aim to analyze the diversity of the chloroplast genome within local cacao (Theobroma cacao L.) from Central Sulawesi. Our recent data showed that there were more than 20 clones from local farming in Central Sulawesi, which can be distinguished by phenotypic and nuclear-genome-based characterization with RAPD (Random Amplified Polymorphic DNA) and SSR (Simple Sequence Repeat) markers. In developing DNA markers for this local cacao, we also included analysis based on variation in the chloroplast genome. At least several regions, such as rpl32-trnL, can be considered as chloroplast markers for our local clones of cocoa. Furthermore, we could develop a phylogenetic analysis among clones of cocoa.

  4. Genome-Wide Prediction of DNA Methylation Using DNA Composition and Sequence Complexity in Human.

    Science.gov (United States)

    Wu, Chengchao; Yao, Shixin; Li, Xinghao; Chen, Chujia; Hu, Xuehai

    2017-02-16

    DNA methylation plays a significant role in transcriptional regulation by repressing activity. Change of the DNA methylation level is an important factor affecting the expression of target genes and downstream phenotypes. Because current experimental technologies can only assay a small proportion of CpG sites in the human genome, it is urgent to develop reliable computational models for predicting genome-wide DNA methylation. Here, we proposed a novel algorithm that accurately extracted sequence complexity features (seven features) and developed a support-vector-machine-based prediction model with integration of the reported DNA composition features (trinucleotide frequency and GC content, 65 features) by utilizing the methylation profiles of embryonic stem cells in human. The prediction results from 22 human chromosomes with size-varied windows showed that the 600-bp window achieved the best average accuracy of 94.7%. Moreover, comparisons with two existing methods further showed the superiority of our model, and cross-species predictions on mouse data also demonstrated that our model has certain generalization ability. Finally, a statistical test of the experimental data and the predicted data on functional regions annotated by ChromHMM found that six out of 10 regions were consistent, which implies reliable prediction of unassayed CpG sites. Accordingly, we believe that our novel model will be useful and reliable in predicting DNA methylation.
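
    The 65 DNA composition features mentioned above (64 trinucleotide frequencies plus GC content) can be sketched in a few lines of Python. This is an illustrative reading of the abstract, not the authors' code: the function name, the feature ordering, and the choice to skip trinucleotides containing ambiguous bases such as N are all assumptions.

```python
from itertools import product

# All 64 trinucleotides in a fixed (alphabetical) order; the order is an assumption.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def composition_features(window):
    """Return 64 trinucleotide frequencies followed by GC content (65 values)."""
    window = window.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    for i in range(len(window) - 2):
        tri = window[i:i + 3]
        if tri in counts:  # skip trinucleotides containing N or other symbols
            counts[tri] += 1
            total += 1
    freqs = [counts[t] / total if total else 0.0 for t in TRINUCLEOTIDES]
    valid = sum(window.count(b) for b in "ACGT")
    gc = (window.count("G") + window.count("C")) / valid if valid else 0.0
    return freqs + [gc]


if __name__ == "__main__":
    features = composition_features("ACGTACGT")
    print(len(features), features[-1])
```

    In a setting like the one described, such a vector would be computed per window (for example the 600-bp window reported above) and passed, together with the sequence complexity features, to a support vector machine.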

  5. Complete Chloroplast Genomes of Papaver rhoeas and Papaver orientale: Molecular Structures, Comparative Analysis, and Phylogenetic Analysis

    Directory of Open Access Journals (Sweden)

    Jianguo Zhou

    2018-02-01

    Full Text Available Papaver rhoeas L. and P. orientale L., which belong to the family Papaveraceae, are used as ornamental and medicinal plants. The chloroplast genome has been used for molecular markers, evolutionary biology, and barcoding identification. In this study, the complete chloroplast genome sequences of P. rhoeas and P. orientale are reported. Results show that the complete chloroplast genomes of P. rhoeas and P. orientale have typical quadripartite structures, which are comprised of circular 152,905 and 152,799-bp-long molecules, respectively. A total of 130 genes were identified in each genome, including 85 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Sequence divergence analysis of four species from Papaveraceae indicated that the most divergent regions are found in the non-coding spacers with minimal differences among three Papaver species. These differences include the ycf1 gene and intergenic regions, such as rpoB-trnC, trnD-trnT, petA-psbJ, psbE-petL, and ccsA-ndhD. These regions are hypervariable regions, which can be used as specific DNA barcodes. This finding suggested that the chloroplast genome could be used as a powerful tool to resolve the phylogenetic positions and relationships of Papaveraceae. These results offer valuable information for future research in the identification of Papaver species and will benefit further investigations of these species.

  6. An Efficient Approach to Mining Maximal Contiguous Frequent Patterns from Large DNA Sequence Databases

    Directory of Open Access Journals (Sweden)

    Md. Rezaul Karim

    2012-03-01

    Full Text Available Mining interesting patterns from DNA sequences is one of the most challenging tasks in bioinformatics and computational biology. Maximal contiguous frequent patterns are preferable for expressing the function and structure of DNA sequences and hence can capture the common data characteristics among related sequences. Biologists are interested in finding frequent orderly arrangements of motifs that are responsible for similar expression of a group of genes. In order to reduce mining time and complexity, however, most existing sequence mining algorithms either focus on finding short DNA sequences or require explicit specification of sequence lengths in advance. The challenge is to find longer sequences without specifying sequence lengths in advance. In this paper, we propose an efficient approach to mining maximal contiguous frequent patterns from large DNA sequence datasets. The experimental results show that our proposed approach is memory-efficient and mines maximal contiguous frequent patterns within a reasonable time.
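
    To make the term concrete, the sketch below enumerates maximal contiguous frequent patterns by brute force. It is deliberately naive and is not the efficient approach proposed in the paper, whose details the abstract does not give. It assumes that a pattern is frequent when it occurs as a contiguous substring in at least `min_support` sequences, and maximal when no longer frequent pattern contains it; both definitions and all names are assumptions for illustration.

```python
def frequent_substrings(sequences, min_support):
    """Collect every contiguous substring found in >= min_support sequences."""
    frequent = set()
    length = 1
    while True:
        support = {}
        for seq in sequences:
            # Count each substring once per sequence (sequence-level support).
            seen = {seq[i:i + length] for i in range(len(seq) - length + 1)}
            for sub in seen:
                support[sub] = support.get(sub, 0) + 1
        level = {sub for sub, count in support.items() if count >= min_support}
        if not level:  # no frequent pattern of this length, so none longer
            break
        frequent |= level
        length += 1
    return frequent


def maximal_contiguous_patterns(sequences, min_support):
    """Keep the frequent patterns that no one-base extension keeps frequent."""
    frequent = frequent_substrings(sequences, min_support)
    alphabet = set("".join(sequences))
    # Every substring of a frequent pattern is frequent, so checking
    # one-base extensions on either side is enough to test maximality.
    maximal = [
        p for p in frequent
        if not any(b + p in frequent or p + b in frequent for b in alphabet)
    ]
    return sorted(maximal, key=lambda p: (-len(p), p))


if __name__ == "__main__":
    reads = ["ACGTAC", "TACGTT", "GGACGT"]
    print(maximal_contiguous_patterns(reads, min_support=3))
```

    The level-wise loop needs no preset pattern length, which mirrors the goal stated in the abstract, but its cost grows quickly with sequence length; that cost is what a dedicated mining algorithm aims to avoid.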

  7. Beyond DNA Sequencing in Space: Current and Future Omics Capabilities of the Biomolecule Sequencer Payload

    Science.gov (United States)

    Wallace, Sarah

    2017-01-01

    Why do we need a DNA sequencer to support the human exploration of space? (A) Operational environmental monitoring; (1) Identification of contaminating microbes, (2) Infectious disease diagnosis, (3) Reduce down mass (sample return for environmental monitoring, crew health, etc.). (B) Research; (1) Human, (2) Animal, (3) Microbes/Cell lines, (4) Plant. (C) Med Ops; (1) Response to countermeasures, (2) Radiation, (3) Real-time analysis can influence medical intervention. (D) Support astrobiology science investigations; (1) Technology superiorly suited to in situ nucleic acid-based life detection, (2) Functional testing for integration into robotics for extraplanetary exploration mission.

  8. Biomolecule Sequencer: Next-Generation DNA Sequencing Technology for In-Flight Environmental Monitoring, Research, and Beyond

    Science.gov (United States)

    Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.

    2016-01-01

    On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. We will present our latest results from the ISS flight experiment, the first time DNA has ever been sequenced in space, and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human

  9. Mechanism of sequence-specific template binding by the DNA primase of bacteriophage T7

    KAUST Repository

    Lee, Seung-Joo; Zhu, Bin; Hamdan, Samir; Richardson, Charles C.

    2010-01-01

    DNA primases catalyze the synthesis of the oligoribonucleotides required for the initiation of lagging strand DNA synthesis. Biochemical studies have elucidated the mechanism for the sequence-specific synthesis of primers. However, the physical

  10. Assessing the fidelity of ancient DNA sequences amplified from nuclear genes

    DEFF Research Database (Denmark)

    Binladen, Jonas; Wiuf, Carsten Henrik; Gilbert, M. Thomas P.

    2006-01-01

    To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved...... in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from...... adenine), respectively. Type 2 transitions are by far the most dominant and increase relative to those of type 1 with damage load. The results suggest that the deamination of cytosine (and 5-methyl cytosine) to uracil (and thymine) is the main cause of miscoding lesions in both ancient mtDNA and nu...

  11. Salinity inhibits post transcriptional processing of chloroplast 16S rRNA in shoot cultures of jojoba (Simmondsia chinesis).

    Science.gov (United States)

    Mizrahi-Aviv, Ela; Mills, David; Benzioni, Aliza; Bar-Zvi, Dudy

    2005-03-01

    Chloroplast metabolism is rapidly affected by salt stress. Photosynthesis is one of the first processes known to be affected by salinity. Here, we report that salinity inhibits chloroplast post-transcriptional RNA processing. A differentially expressed 680-bp cDNA, containing the 3' sequence of 16S rRNA, transcribed intergenic spacer, exon 1 and intron of tRNA(Ile), was isolated by differential display reverse transcriptase PCR from salt-grown jojoba (Simmondsia chinesis) shoot cultures. Northern blot analysis indicated that although most rRNA appears to be fully processed, partially processed chloroplast 16S rRNA accumulates in salt-grown cultures. Thus, salinity appears to decrease the processing of the rrn transcript. The possible effect of this decreased processing on physiological processes is, as yet, unknown.

  12. Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

    Directory of Open Access Journals (Sweden)

    Moore JE

    2006-01-01

    Full Text Available Abstract Background At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. When cloning, sequencing and comparison of the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France, nucleotide sequence differences were demonstrated at the six loci in the 1,469 nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted.

  13. Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

    Science.gov (United States)

    Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC

    2006-01-01

    Background At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. When cloning, sequencing and comparison of the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France, nucleotide sequence differences were demonstrated at the six loci in the 1,469 nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted. PMID:16398935
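
    Counts such as "12 polymorphic sites among 23 sequences" and similarity figures such as "99.5% or more" are typically read off a multiple sequence alignment. The Python sketch below shows one simple way to obtain both. It assumes the sequences are already aligned to equal length with "-" as the gap symbol and treats a gap as a distinct state so that deletions count as differences. These choices and the function names are illustrative assumptions, not the authors' procedure.

```python
def polymorphic_sites(aligned):
    """Return 1-based alignment columns where the sequences are not all identical."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for col in range(length):
        states = {seq[col] for seq in aligned}  # gaps count as a state
        if len(states) > 1:
            sites.append(col + 1)
    return sites


def percent_identity(seq_a, seq_b):
    """Percentage of alignment columns at which two aligned sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Toy alignment, not real T. equigenitalis data.
    alignment = ["ACGTACGT", "ACGTACGA", "AC-TACGT"]
    print(polymorphic_sites(alignment))
    print(percent_identity(alignment[0], alignment[1]))
```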

  14. Sequence Capture versus Restriction Site Associated DNA Sequencing for Shallow Systematics.

    Science.gov (United States)

    Harvey, Michael G; Smith, Brian Tilston; Glenn, Travis C; Faircloth, Brant C; Brumfield, Robb T

    2016-09-01

    Sequence capture and restriction site associated DNA sequencing (RAD-Seq) are two genomic enrichment strategies for applying next-generation sequencing technologies to systematics studies. At shallow timescales, such as within species, RAD-Seq has been widely adopted among researchers, although there has been little discussion of the potential limitations and benefits of RAD-Seq and sequence capture. We discuss a series of issues that may impact the utility of sequence capture and RAD-Seq data for shallow systematics in non-model species. We review prior studies that used both methods, and investigate differences between the methods by re-analyzing existing RAD-Seq and sequence capture data sets from a Neotropical bird (Xenops minutus). We suggest that the strengths of RAD-Seq data sets for shallow systematics are the wide dispersion of markers across the genome, the relative ease and cost of laboratory work, the deep coverage and read overlap at recovered loci, and the high overall information that results. Sequence capture's benefits include flexibility and repeatability in the genomic regions targeted, success using low-quality samples, more straightforward read orthology assessment, and higher per-locus information content. The utility of a method in systematics, however, rests not only on its performance within a study, but on the comparability of data sets and inferences with those of prior work. In RAD-Seq data sets, comparability is compromised by low overlap of orthologous markers across species and the sensitivity of genetic diversity in a data set to an interaction between the level of natural heterozygosity in the samples examined and the parameters used for orthology assessment. In contrast, sequence capture of conserved genomic regions permits interrogation of the same loci across divergent species, which is preferable for maintaining comparability among data sets and studies for the purpose of drawing general conclusions about the impact of

  15. repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects.

    Science.gov (United States)

    Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen

    2015-04-15

    In order to develop powerful computational predictors for identifying the biological features or attributes of DNAs, one of the most challenging problems is to find a suitable approach to effectively represent the DNA sequences. To facilitate the studies of DNAs and nucleotides, we developed a Python package called representations of DNAs (repDNA) for generating the widely used features reflecting the physicochemical properties and sequence-order effects of DNAs and nucleotides. There are three feature groups composed of 15 features. The first group calculates three nucleic acid composition features describing the local sequence information by means of kmers; the second group calculates six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties; the third group calculates six pseudo nucleotide composition features, which can be used to represent a DNA sequence with a discrete model or vector yet still keep considerable sequence-order information via the physicochemical properties of its constituent oligonucleotides. In addition, these features can be easily calculated based on both the built-in and user-defined properties using repDNA. The repDNA Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repDNA/. Contact: bliu@insun.hit.edu.cn or kcchou@gordonlifescience.org. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
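
    As an illustration of the second feature group, the sketch below computes one common form of dinucleotide auto-covariance in plain Python. It does not call repDNA and makes no claim about that package's API. The property table is a toy stand-in (the number of G or C bases in each dinucleotide) rather than a measured physicochemical index, and the function name is hypothetical.

```python
from itertools import product

# Toy "property": count of G/C bases per dinucleotide. This is a stand-in for
# a real physicochemical index and is NOT taken from repDNA.
TOY_PROPERTY = {
    "".join(pair): sum(base in "GC" for base in pair)
    for pair in product("ACGT", repeat=2)
}


def dinucleotide_autocovariance(seq, prop, lag):
    """Average product of mean-centred property values lag positions apart."""
    values = [prop[seq[i:i + 2]] for i in range(len(seq) - 1)]
    n_terms = len(values) - lag
    if lag < 1 or n_terms <= 0:
        raise ValueError("lag must be >= 1 and shorter than the dinucleotide count")
    mean = sum(values) / len(values)
    covariance = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n_terms)
    )
    return covariance / n_terms


if __name__ == "__main__":
    print(dinucleotide_autocovariance("ACGTACGT", TOY_PROPERTY, lag=4))
```

    Evaluating the function for several lags and several properties yields a fixed-length vector regardless of sequence length, which is the point of such features.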

  16. [Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].

    Science.gov (United States)

    Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y

    2017-08-01

    To analyze and detect the whole genome sequence of human mitochondrial DNA (mtDNA) by Ion Torrent PGM™ platform and to study the differences of mtDNA sequence in different tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costal cartilage, nail, skeletal muscle and oral epithelium. Amplification of the whole genome sequence of mtDNA was performed with 4 pairs of primers. Libraries were constructed with Ion Shear™ Plus Reagents kit and Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions on HVⅠ region. The whole genome sequences of mtDNA from all samples were amplified successfully. Six unrelated individuals belonged to 6 different haplotypes. Different tissues in one individual had heteroplasmy differences. The heteroplasmy positions and the mutation positions on HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, it was found that the results of mtDNA sequence had a high consistency in different tissues. The testing method used in the present study for sequencing the whole genome of human mtDNA can detect heteroplasmy differences between tissues, and the results show good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine
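
    The abstract does not say which kappa statistic was used for the consistency check. Assuming it was Cohen's kappa applied to paired calls from two tissues, the computation would look like the Python sketch below. The function name and the toy base calls are illustrative assumptions, not data from the study.

```python
from collections import Counter


def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa: agreement between two call lists, corrected for chance."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("call lists must be non-empty and of equal length")
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    # Chance agreement: probability both lists give the same label at random.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n ** 2
    if expected == 1:  # both lists contain a single, identical label
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Toy base calls at four positions in two tissues (not real data).
    blood = ["A", "G", "A", "G"]
    hair = ["A", "G", "G", "G"]
    print(cohen_kappa(blood, hair))
```

    Values near 1 indicate agreement well beyond chance, which is how a "high consistency" result would be read under this assumption.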

  17. Mechanism of protein import across the chloroplast envelope.

    Science.gov (United States)

    Chen, K; Chen, X; Schnell, D J

    2000-01-01

    The development and maintenance of chloroplasts relies on the contribution of protein subunits from both plastid and nuclear genomes. Most chloroplast proteins are encoded by nuclear genes and are post-translationally imported into the organelle across the double membrane of the chloroplast envelope. Protein import into the chloroplast consists of two essential elements: the specific recognition of the targeting signals (transit sequences) of cytoplasmic preproteins by receptors at the outer envelope membrane and the subsequent translocation of preproteins simultaneously across the double membrane of the envelope. These processes are mediated via the co-ordinate action of protein translocon complexes in the outer (Toc apparatus) and inner (Tic apparatus) envelope membranes.

  18. Developmental and Subcellular Organization of Single-Cell C₄ Photosynthesis in Bienertia sinuspersici Determined by Large-Scale Proteomics and cDNA Assembly from 454 DNA Sequencing.

    Science.gov (United States)

    Offermann, Sascha; Friso, Giulia; Doroshenk, Kelly A; Sun, Qi; Sharpe, Richard M; Okita, Thomas W; Wimmer, Diana; Edwards, Gerald E; van Wijk, Klaas J

    2015-05-01

    Kranz C4 species strictly depend on separation of primary and secondary carbon fixation reactions in different cell types. In contrast, the single-cell C4 (SCC4) species Bienertia sinuspersici utilizes intracellular compartmentation including two physiologically and biochemically different chloroplast types; however, information on identity, localization, and induction of proteins required for this SCC4 system is currently very limited. In this study, we determined the distribution of photosynthesis-related proteins and the induction of the C4 system during development by label-free proteomics of subcellular fractions and leaves of different developmental stages. This was enabled by inferring a protein sequence database from 454 sequencing of Bienertia cDNAs. Large-scale proteome rearrangements were observed as C4 photosynthesis developed during leaf maturation. The proteomes of the two chloroplasts are different with differential accumulation of linear and cyclic electron transport components, primary and secondary carbon fixation reactions, and a triose-phosphate shuttle that is shared between the two chloroplast types. This differential protein distribution pattern suggests the presence of a mRNA or protein-sorting mechanism for nuclear-encoded, chloroplast-targeted proteins in SCC4 species. The combined information was used to provide a comprehensive model for NAD-ME type carbon fixation in SCC4 species.

  19. Plant DNA Detection from Grasshopper Guts: A Step-by-Step Protocol, from Tissue Preparation to Obtaining Plant DNA Sequences

    Directory of Open Access Journals (Sweden)

    Alina Avanesyan

    2014-02-01

    Full Text Available Premise of the study: A PCR-based method of identifying ingested plant DNA in gut contents of Melanoplus grasshoppers was developed. Although previous investigations have focused on a variety of insects, there are no protocols available for plant DNA detection developed for grasshoppers, agricultural pests that significantly influence plant community composition. Methods and Results: The developed protocol successfully used the noncoding region of the chloroplast trnL (UAA) gene and was tested in several feeding experiments. Plant DNA was obtained at seven time points post-ingestion from whole guts and separate gut sections, and was detectable up to 12 h post-ingestion in nymphs and 22 h post-ingestion in adult grasshoppers. Conclusions: The proposed protocol is an effective, relatively quick, and low-cost method of detecting plant DNA from the grasshopper gut and its different sections. This has important applications, from exploring plant “movement” during food consumption, to detecting plant–insect interactions.

  20. RevTrans: multiple alignment of coding DNA from aligned amino acid sequences

    DEFF Research Database (Denmark)

    Wernersson, Rasmus; Pedersen, Anders Gorm

    2003-01-01

    The simple fact that proteins are built from 20 amino acids while DNA only contains four different bases, means that the 'signal-to-noise ratio' in protein sequence alignments is much better than in alignments of DNA. Besides this information-theoretical advantage, protein alignments also benefit...... proteins. It is therefore preferable to align coding DNA at the amino acid level and it is for this purpose we have constructed the program RevTrans. RevTrans constructs a multiple DNA alignment by: (i) translating the DNA; (ii) aligning the resulting peptide sequences; and (iii) building a multiple DNA...
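
    The three steps named in the abstract can be illustrated with a minimal sketch. It is not the RevTrans program itself: the function names are hypothetical, the standard genetic code is assumed, and step (ii) is stood in for by a hand-made peptide alignment, since a real run would call an external protein aligner.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Step (i): translate coding DNA, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def thread_codons(dna, aligned_peptide):
    """Step (iii): map an aligned peptide (gaps as '-') back onto its codons."""
    codons = iter(dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join("---" if aa == "-" else next(codons) for aa in aligned_peptide)


# Toy input: two short coding sequences (illustrative, not real data).
dna_a, dna_b = "ATGAAAGGG", "ATGGGG"
peptides = [translate(dna_a), translate(dna_b)]  # "MKG" and "MG"

# Step (ii) placeholder: an alignment of the peptides written by hand.
aligned_peptides = ["MKG", "M-G"]

dna_alignment = [thread_codons(d, p) for d, p in zip((dna_a, dna_b), aligned_peptides)]
print(dna_alignment)
```

    Because gaps are inserted three bases at a time, the resulting DNA alignment keeps every codon intact and in frame.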

  1. DNA interactions with a Methylene Blue redox indicator depend on the DNA length and are sequence specific.

    Science.gov (United States)

    Farjami, Elaheh; Clima, Lilia; Gothelf, Kurt V; Ferapontova, Elena E

    2010-06-01

    A DNA molecular beacon approach was used for the analysis of interactions between DNA and Methylene Blue (MB) as a redox indicator of a hybridization event. DNA hairpin structures of different length and guanine (G) content were immobilized onto gold electrodes in their folded states through the alkanethiol linker at the 5'-end. Binding of MB to the folded hairpin DNA was electrochemically studied and compared with binding to the duplex structure formed by hybridization of the hairpin DNA to a complementary DNA strand. Variation of the electrochemical signal from the DNA-MB complex was shown to depend primarily on the DNA length and sequence used: the G-C base pairs were the preferential sites of MB binding in the duplex. For short 20 nts long DNA sequences, the increased electrochemical response from MB bound to the duplex structure was consistent with the increased amount of bound and electrochemically readable MB molecules (i.e. MB molecules that are available for the electron transfer (ET) reaction with the electrode). With longer DNA sequences, the balance between the amounts of the electrochemically readable MB molecules bound to the hairpin DNA and to the hybrid was opposite: a part of the MB molecules bound to the long-sequence DNA duplex seem to be electrochemically mute due to long ET distance. The increasing electrochemical response from MB bound to the short-length DNA hybrid contrasts with the decreasing signal from MB bound to the long-length DNA hybrid and allows an "off"-"on" genosensor development.

  2. OPTSDNA: Performance evaluation of an efficient distributed bioinformatics system for DNA sequence analysis.

    Science.gov (United States)

    Khan, Mohammad Ibrahim; Sheel, Chotan

    2013-01-01

    Storing sequence data is a major concern because the volume of data generated at many sites is growing exponentially. There is therefore a need for storage techniques based on compression algorithms. Here we describe an optimal storage algorithm (OPTSDNA) for storing large numbers of DNA sequences of varying length, and we analyze its performance within a distributed bioinformatics computing system for DNA sequence analysis. The algorithm was used to store DNA sequences ranging from very small to very large in a database, and it calculates both the storage size and the response time. In terms of storage size (expressed as a percentage), the efficiency and performance of the algorithm are high compared with other known sequential approaches.

  3. Sequencing of megabase plus DNA by hybridization: Method development ENT. Final technical progress report

    Energy Technology Data Exchange (ETDEWEB)

    Crkvenjakov, R.; Drmanac, R.

    1991-01-31

    Sequencing by hybridization (SBH) is the only sequencing method based on the experimental determination of the content of oligonucleotide sequences. The data acquisition relies on the natural process of base pairing. It is possible to determine the content of complementary oligosequences in the target DNA by the process of hybridization with oligonucleotide probes of known sequences.
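
    As a rough illustration of the idea, the set of probes expected to hybridize can be computed for a known toy target. This is only a sketch under simplifying assumptions (perfect base pairing, one probe length k, a single-stranded target); the function names are hypothetical and the report does not describe its method in these terms.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizing_probes(target, k):
    """List every k-mer probe whose reverse complement occurs in the target."""
    present = {target[i:i + k] for i in range(len(target) - k + 1)}
    probes = ("".join(p) for p in product("ACGT", repeat=k))
    return sorted(p for p in probes if reverse_complement(p) in present)


# Toy target containing the 3-mers ACG, CGT and GTT.
print(hybridizing_probes("ACGTT", 3))
```

    The list of positive probes is the oligonucleotide content of the target; reconstructing the full sequence from that content is a separate assembly step not shown here.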

  4. Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine

    Science.gov (United States)

    Craig S Echt; Surya Saha; Dennis L Deemer; C Dana Nelson

    2011-01-01

    Genomic DNA sequence databases are a potential and growing resource for simple sequence repeat (SSR) marker development in loblolly pine (Pinus taeda L.). Loblolly pine also has many expressed sequence tags (ESTs) available for microsatellite (SSR) marker development. We compared loblolly pine SSR densities in genome survey sequences (GSSs) to those in non-redundant...

  5. A 28,000 Years Old Cro-Magnon mtDNA Sequence Differs from All Potentially Contaminating Modern Sequences

    Science.gov (United States)

    Caramelli, David; Milani, Lucio; Vai, Stefania; Modi, Alessandra; Pecchioli, Elena; Girardi, Matteo; Pilli, Elena; Lari, Martina; Lippi, Barbara; Ronchitelli, Annamaria; Mallegni, Francesco; Casoli, Antonella; Bertorelle, Giorgio; Barbujani, Guido

    2008-01-01

    Background DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. Conclusions/Significance: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans. PMID:18628960

  6. A 28,000 years old Cro-Magnon mtDNA sequence differs from all potentially contaminating modern sequences.

    Directory of Open Access Journals (Sweden)

    David Caramelli

    Full Text Available BACKGROUND: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. METHODOLOGY/PRINCIPAL FINDINGS: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. CONCLUSIONS/SIGNIFICANCE: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans.

  7. Sequence analysis of mitochondrial DNA hypervariable region III of ...

    African Journals Online (AJOL)

    Aghomotsegin

    2015-07-01

    Jul 1, 2015 ... population genetics research, studies based on mitochondrial DNA (mtDNA) and Y-chromosome DNA are an excellent way of illustrating population structure .... avoid landing investigators into serious situations of medical genetic privacy and ethnics, especially for. mtDNA coding area whose mutation often ...

  8. Protein and DNA sequence determinants of thermophilic adaptation.

    Directory of Open Access Journals (Sweden)

    Konstantin B Zeldovich

    2007-01-01

    Full Text Available There have been considerable attempts in the past to relate a phenotypic trait (the habitat temperature of organisms) to their genotypes, most importantly the compositions of their genomes and proteomes. However, despite accumulation of anecdotal evidence, an exact and conclusive relationship between the former and the latter has been elusive. We present an exhaustive study of the relationship between amino acid composition of proteomes, nucleotide composition of DNA, and optimal growth temperature (OGT) of prokaryotes. Based on 204 complete proteomes of archaea and bacteria spanning the temperature range from -10 °C to 110 °C, we performed an exhaustive enumeration of all possible sets of amino acids and found a set of amino acids whose total fraction in a proteome is correlated, to a remarkable extent, with the OGT. The universal set is Ile, Val, Tyr, Trp, Arg, Glu, Leu (IVYWREL), and the correlation coefficient is as high as 0.93. We also found that the G + C content in 204 complete genomes does not exhibit a significant correlation with OGT (R = -0.10). On the other hand, the fraction of A + G in coding DNA is correlated with temperature, to a considerable extent, due to codon patterns of IVYWREL amino acids. Further, we found strong and independent correlation between OGT and the frequency with which pairs of A and G nucleotides appear as nearest neighbors in genome sequences. This adaptation is achieved via codon bias. These findings present a direct link between principles of protein structure and stability and evolutionary mechanisms of thermophilic adaptation. On the nucleotide level, the analysis provides an example of how nature utilizes codon bias for evolutionary adaptation to extreme conditions. Together these results provide a complete picture of how compositions of proteomes and genomes in prokaryotes adjust to the extreme conditions of the environment.
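
    The quantity at the centre of this abstract, the total IVYWREL fraction of a proteome, and its correlation with OGT can be sketched in a few lines. The function names are hypothetical and the example inputs are toy values, not data from the study.

```python
from math import sqrt

IVYWREL = set("IVYWREL")


def ivywrel_fraction(proteome):
    """Fraction of all residues in a proteome (list of protein strings) in IVYWREL."""
    total = sum(len(protein) for protein in proteome)
    hits = sum(aa in IVYWREL for protein in proteome for aa in protein)
    return hits / total


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy proteome: 4 of its 8 residues belong to the IVYWREL set.
print(ivywrel_fraction(["IVYW", "AAAA"]))
```

    Applying `ivywrel_fraction` to each organism and passing the fractions together with the corresponding OGT values to `pearson` would give the kind of coefficient the abstract reports.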

  9. Phylogeographical structure inferred from cpDNA sequence variation of Zygophyllum xanthoxylon across north-west China.

    Science.gov (United States)

    Shi, Xiao-Jun; Zhang, Ming-Li

    2015-03-01

    Zygophyllum xanthoxylon, a desert species displaying a broad, continuous east-west distribution in arid northwestern China, can be considered a model species for investigating the biogeographical history of this region. We sequenced two chloroplast DNA spacers (psbK-psbI and rpl32-trnL) in 226 individuals from 31 populations to explore the phylogeographical structure. A median-joining network was constructed, and AMOVA, SAMOVA, neutrality tests and mismatch distribution analysis were used to examine genetic structure and potential range expansion. Using species distribution modeling, the geographical distribution of Z. xanthoxylon was modeled for the present and for the Last Glacial Maximum (LGM). Among 26 haplotypes, one was widely distributed, but most were restricted to either the eastern or western region. The populations with the highest levels of haplotype diversity were found in the Tianshan Mountains and their surroundings in the west, and the Helan Mountains and Alxa Plateau in the east. AMOVA and SAMOVA showed that, over all populations, the species lacks phylogeographical structure, which is speculated to be the result of its specific biology. Neutrality tests and mismatch distribution analysis support past range expansions of the species. Compared with its current distribution, Z. xanthoxylon had a shrunken and more fragmented range under the cold and dry conditions of the LGM. Based on the evidence from phylogeographical patterns, the distribution of genetic variability, and paleodistribution modeling, Z. xanthoxylon most likely originated in the east and migrated westward via the Hexi Corridor.

  10. New chloroplast microsatellite markers suitable for assessing genetic diversity of Lolium perenne and other related grass species.

    Science.gov (United States)

    Diekmann, Kerstin; Hodkinson, Trevor R; Barth, Susanne

    2012-11-01

    Lolium perenne (perennial ryegrass) is the most important forage grass species of temperate regions. We have previously released the chloroplast genome sequence of L. perenne 'Cashel'. Here nine chloroplast microsatellite markers are published, which were designed based on knowledge about genetically variable regions within the L. perenne chloroplast genome. These markers were successfully used for characterizing the genetic diversity in Lolium and different grass species. Chloroplast genomes of 14 Poaceae taxa were screened for mononucleotide microsatellite repeat regions and primers designed for their amplification from nine loci. The potential of these markers to assess genetic diversity was evaluated on a set of 16 Irish and 15 European L. perenne ecotypes, nine L. perenne cultivars, other Lolium taxa and other grass species. All analysed Poaceae chloroplast genomes contained more than 200 mononucleotide repeats (chloroplast simple sequence repeats, cpSSRs) of at least 7 bp in length, concentrated mainly in the large single copy region of the genome. Nucleotide composition varied considerably among subfamilies (with Pooideae biased towards poly A repeats). The nine new markers distinguish L. perenne from all non-Lolium taxa. TeaCpSSR28 was able to distinguish between all Lolium species and Lolium multiflorum due to an elongation of an A(8) mononucleotide repeat in L. multiflorum. TeaCpSSR31 detected a considerable degree of microsatellite length variation and single nucleotide polymorphism. TeaCpSSR27 revealed variation within some L. perenne accessions due to a 44-bp indel and was hence readily detected by simple agarose gel electrophoresis. Smaller insertion/deletion events or single nucleotide polymorphisms detected by these new markers could be visualized by polyacrylamide gel electrophoresis or DNA sequencing, respectively. The new markers are a valuable tool for plant breeding companies, seed testing agencies and the wider scientific community due to
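
    The screening criterion mentioned in this abstract, mononucleotide repeats of at least 7 bp, can be illustrated with a short regular-expression sketch. The function name and the toy sequence are assumptions for illustration; the authors' actual screening tool is not described here.

```python
import re


def mononucleotide_repeats(seq, min_len=7):
    """Return (start, base, length) for each run of one base at least min_len long."""
    # One captured base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0))) for m in pattern.finditer(seq.upper())]


# Toy sequence with an A(8) run and a T(7) run (not real chloroplast data).
toy = "GG" + "A" * 8 + "TC" + "T" * 7 + "GC"
print(mononucleotide_repeats(toy))
```

    Start positions are 0-based. Length differences in such runs between taxa, like the A(8) elongation noted above, are what the cpSSR markers detect.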

  11. Complete sequence analysis of 18S rDNA based on genomic DNA extraction from individual Demodex mites (Acari: Demodicidae).

    Science.gov (United States)

    Zhao, Ya-E; Xu, Ji-Ru; Hu, Li; Wu, Li-Ping; Wang, Zheng-Hang

    2012-05-01

    The study for the first time attempted to accomplish 18S ribosomal DNA (rDNA) complete sequence amplification and analysis for three Demodex species (Demodex folliculorum, Demodex brevis and Demodex canis) based on gDNA extraction from individual mites. The mites were treated by DNA Release Additive and Hot Start II DNA Polymerase so as to promote mite disruption and increase PCR specificity. Determination of D. folliculorum gDNA showed that the gDNA yield reached the highest at 1 mite, tending to descend with the increase of mite number. The individual mite gDNA was successfully used for 18S rDNA fragment (about 900 bp) amplification examination. The alignments of 18S rDNA complete sequences of individual mite samples and those of pooled mite samples (≥1000 mites/sample) showed over 97% identities for each species, indicating that the gDNA extracted from a single individual mite was as satisfactory as that from pooled mites for PCR amplification. Further pairwise sequence analyses showed that average divergence, genetic distance, transition/transversion or phylogenetic tree could not effectively identify the three Demodex species, largely due to the differentiation in the D. canis isolates. It can be concluded that the individual Demodex mite gDNA can satisfy the molecular study of Demodex. 18S rDNA complete sequence is suitable for interfamily identification in Cheyletoidea, but whether it is suitable for intrafamily identification cannot be confirmed until the ascertainment of the types of Demodex mites parasitizing in dogs. Copyright © 2012 Elsevier Inc. All rights reserved.

  12. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution

    NARCIS (Netherlands)

    Falconer, Ester; Hills, Mark; Naumann, Ulrike; Poon, Steven S. S.; Chavez, Elizabeth A.; Sanders, Ashley D.; Zhao, Yongjun; Hirst, Martin; Lansdorp, Peter M.

    DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it

  13. ASAP: Amplification, sequencing & annotation of plastomes

    Directory of Open Access Journals (Sweden)

    Folta Kevin M

    2005-12-01

    Full Text Available Abstract Background Availability of DNA sequence information is vital for pursuing structural, functional and comparative genomics studies in plastids. Traditionally, the first step in mining the valuable information within a chloroplast genome requires sequencing a chloroplast plasmid library or BAC clones. These activities involve complicated preparatory procedures like chloroplast DNA isolation or identification of the appropriate BAC clones to be sequenced. Rolling circle amplification (RCA) is being used currently to amplify the chloroplast genome from purified chloroplast DNA and the resulting products are sheared and cloned prior to sequencing. Herein we present a universal high-throughput, rapid PCR-based technique to amplify, sequence and assemble plastid genome sequence from diverse species in a short time and at reasonable cost from total plant DNA, using the large inverted repeat region from strawberry and peach as proof of concept. The method exploits the highly conserved coding regions or intergenic regions of plastid genes. Using an informatics approach, chloroplast DNA sequence information from 5 available eudicot plastomes was aligned to identify the most conserved regions. Cognate primer pairs were then designed to generate ~1-1.2 kb overlapping amplicons from the inverted repeat region in 14 diverse genera. Results 100% coverage of the inverted repeat region was obtained from Arabidopsis, tobacco, orange, strawberry, peach, lettuce, tomato and Amaranthus. Over 80% coverage was obtained from distant species, including Ginkgo, loblolly pine and Equisetum. Sequence from the inverted repeat region of strawberry and peach plastome was obtained, annotated and analyzed. Additionally, a polymorphic region identified from gel electrophoresis was sequenced from tomato and Amaranthus. Sequence analysis revealed large deletions in these species relative to tobacco plastome thus exhibiting the utility of this method for structural and

  14. The DNA sequence and biology of human chromosome 19

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, J; Gordon, L A; Olsen, A; Terry, A; Schmutz, J; Lamerdin, J; Hellsten, U; Goodstein, D; Couronne, O; Tran-Gyamfi, M

    2004-04-06

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high GC content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in Mendelian disorders, including familial hypercholesterolemia and insulin-resistant diabetes. Nearly one quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  15. The complete chloroplast genome of the Dendrobium strongylanthum (Orchidaceae: Epidendroideae).

    Science.gov (United States)

    Li, Jing; Chen, Chen; Wang, Zhe-Zhi

    2016-07-01

    Complete chloroplast genome sequences are very useful for studying the phylogeny and evolution of species. In this study, the complete chloroplast genome of Dendrobium strongylanthum was constructed from whole-genome Illumina sequencing data. The chloroplast genome is 153 058 bp in length with 37.6% GC content and consists of two inverted repeats (IRs) of 26 316 bp. The IR regions are separated by a large single-copy (LSC, 85 836 bp) region and a small single-copy (SSC, 14 590 bp) region. A total of 130 chloroplast genes were successfully annotated, including 84 protein-coding genes, 38 tRNA genes, and eight rRNA genes. Phylogenetic analyses showed that the chloroplast genome of Dendrobium strongylanthum is related to that of Dendrobium officinale.
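
    The figures in this abstract are internally consistent, which a short check makes explicit: a quadripartite plastome is the LSC plus the SSC plus two copies of the IR. The function name is hypothetical; all numbers are taken from the abstract.

```python
def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes reported for Dendrobium strongylanthum (bp).
LSC, SSC, IR = 85_836, 14_590, 26_316
total = plastome_length(LSC, SSC, IR)
print(total)  # 85 836 + 14 590 + 2 x 26 316 = 153 058

# Gene counts reported: protein-coding + tRNA + rRNA.
gene_total = 84 + 38 + 8
print(gene_total)  # 130
```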

  16. Capturing the Biofuel Wellhead and Powerhouse: The Chloroplast and Mitochondrial Genomes of the Leguminous Feedstock Tree Pongamia pinnata

    OpenAIRE

    Kazakoff, Stephen H.; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T.; Gresshoff, Peter M.

    2012-01-01

    Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® 'Second Generation DNA Sequencing (2GS)' and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data,...

  17. Analyses of charophyte chloroplast genomes help characterize the ancestral chloroplast genome of land plants.

    Science.gov (United States)

    Civaň, Peter; Foster, Peter G; Embley, Martin T; Séneca, Ana; Cox, Cymon J

    2014-04-01

    Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes.

  18. High Interlaboratory Reproducibility of DNA Sequence-based Typing of Bacteria in a Multicenter Study

    DEFF Research Database (Denmark)

    Sousa, MA de; Boye, Kit; Lencastre, H de

    2006-01-01

    Current DNA amplification-based typing methods for bacterial pathogens often lack interlaboratory reproducibility. In this international study, DNA sequence-based typing of the Staphylococcus aureus protein A gene (spa, 110 to 422 bp) showed 100% intra- and interlaboratory reproducibility without...... extensive harmonization of protocols for 30 blind-coded S. aureus DNA samples sent to 10 laboratories. Specialized software for automated sequence analysis ensured a common typing nomenclature....

  19. Targeted DNA Methylation Analysis by High Throughput Sequencing in Porcine Peri-attachment Embryos

    OpenAIRE

    MORRILL, Benson H.; COX, Lindsay; WARD, Anika; HEYWOOD, Sierra; PRATHER, Randall S.; ISOM, S. Clay

    2013-01-01

    Abstract The purpose of this experiment was to implement and evaluate the effectiveness of a next-generation sequencing-based method for DNA methylation analysis in porcine embryonic samples. Fourteen discrete genomic regions were amplified by PCR using bisulfite-converted genomic DNA derived from day 14 in vivo-derived (IVV) and parthenogenetic (PA) porcine embryos as template DNA. Resulting PCR products were subjected to high-throughput sequencing using the Illumina Genome Analyzer IIx plat...

  20. Sequence analysis of mitochondrial DNA hypervariable region III of ...

    African Journals Online (AJOL)

    The aims of this research were to study mitochondrial DNA hypervariable region III and establish the degree of variation characteristic of a fragment. The mitochondrial DNA (mtDNA) is a small circular genome located within the mitochondria in the cytoplasm of the cell and a smaller 1.2 kb pair fragment, called the control ...

  1. Low-Energy Electron-Induced Strand Breaks in Telomere-Derived DNA Sequences-Influence of DNA Sequence and Topology.

    Science.gov (United States)

    Rackwitz, Jenny; Bald, Ilko

    2018-03-26

    During cancer radiation therapy high-energy radiation is used to reduce tumour tissue. The irradiation produces a shower of secondary low-energy electrons, which damage DNA very efficiently by dissociative electron attachment. Recently, it was suggested that low-energy electron-induced DNA strand breaks strongly depend on the specific DNA sequence with a high sensitivity of G-rich sequences. Here, we use DNA origami platforms to expose G-rich telomere sequences to low-energy (8.8 eV) electrons to determine absolute cross sections for strand breakage and to study the influence of sequence modifications and topology of telomeric DNA on the strand breakage. We find that the telomeric DNA 5'-(TTA GGG)₂ is more sensitive to low-energy electrons than an intermixed sequence 5'-(TGT GTG A)₂, confirming the unique electronic properties resulting from G-stacking. With increasing length of the oligonucleotide (i.e., going from 5'-(GGG ATT)₂ to 5'-(GGG ATT)₄), both the variety of topology and the electron-induced strand break cross sections increase. Addition of K⁺ ions decreases the strand break cross section for all sequences that are able to fold G-quadruplexes or G-intermediates, whereas the strand break cross section for the intermixed sequence remains unchanged. These results indicate that telomeric DNA is rather sensitive towards low-energy electron-induced strand breakage, suggesting significant telomere shortening that can also occur during cancer radiation therapy. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. cDNA sequencing improves the detection of P53 missense mutations in colorectal cancer

    International Nuclear Information System (INIS)

    Szybka, Malgorzata; Kordek, Radzislaw; Zakrzewska, Magdalena; Rieske, Piotr; Pasz-Walczak, Grazyna; Kulczycka-Wojdala, Dominika; Zawlik, Izabela; Stawski, Robert; Jesionek-Kupnicka, Dorota; Liberski, Pawel P

    2009-01-01

    Recently published data showed discrepancies between P53 cDNA and DNA sequencing in glioblastomas. We hypothesised that similar discrepancies may be observed in other human cancers. To this end, we analyzed 23 colorectal cancers for P53 mutations and gene expression using both DNA and cDNA sequencing, real-time PCR and immunohistochemistry. We found P53 gene mutations in 16 cases (15 missense and 1 nonsense). Two of the 15 cases with missense mutations showed alterations based only on cDNA, and not DNA sequencing. Moreover, in 6 of the 15 cases with a cDNA mutation those mutations were difficult to detect in the DNA sequencing, so the results of DNA analysis alone could be misinterpreted if the cDNA sequencing results had not also been available. In all those 15 cases, we observed a higher ratio of the mutated to the wild type template by cDNA analysis, but not by the DNA analysis. Interestingly, a similar overexpression of P53 mRNA was present in samples with and without P53 mutations. In terms of colorectal cancer, those discrepancies might be explained under three conditions: 1, overexpression of mutated P53 mRNA in cancer cells as compared with normal cells; 2, a higher content of cells without P53 mutation (normal cells and cells showing K-RAS and/or APC but not P53 mutation) in samples presenting P53 mutation; 3, heterozygous or hemizygous mutations of P53 gene. Additionally, for heterozygous mutations unknown mechanism(s) causing selective overproduction of mutated allele should also be considered. Our data offer new clues for studying discrepancy in P53 cDNA and DNA sequencing analysis

  3. Complex chloroplast RNA metabolism: just debugging the genetic programme?

    Directory of Open Access Journals (Sweden)

    Schmitz-Linneweber Christian

    2008-08-01

    Full Text Available Abstract Background The gene expression system of chloroplasts is far more complex than that of their cyanobacterial progenitor. This gain in complexity affects in particular RNA metabolism, specifically the transcription and maturation of RNA. Mature chloroplast RNA is generated by a plethora of nuclear-encoded proteins acquired or recruited during plant evolution, comprising additional RNA polymerases and sigma factors, and sequence-specific RNA maturation factors promoting RNA splicing, editing, end formation and translatability. Despite years of intensive research, we still lack a comprehensive explanation for this complexity. Results We inspected the available literature and genome databases for information on components of RNA metabolism in land plant chloroplasts. In particular, new inventions of chloroplast-specific mechanisms and the expansion of some gene/protein families detected in land plants lead us to suggest that the primary function of the additional nuclear-encoded components found in chloroplasts is the transgenomic suppression of point mutations, fixation of which occurred due to an enhanced genetic drift exhibited by chloroplast genomes. We further speculate that a fast evolution of transgenomic suppressors occurred after the water-to-land transition of plants. Conclusion Our inspections indicate that several chloroplast-specific mechanisms evolved in land plants to remedy point mutations that occurred after the water-to-land transition. Thus, the complexity of chloroplast gene expression evolved to guarantee the functionality of chloroplast genetic information and may not, with some exceptions, be involved in regulatory functions.

  4. Functional role of a highly repetitive DNA sequence in anchorage of the mouse genome.

    Science.gov (United States)

    Neuer-Nitsche, B; Lu, X N; Werner, D

    1988-09-12

    The major portion of the eukaryotic genome consists of various categories of repetitive DNA sequences which have been studied with respect to their base compositions, organizations, copy numbers, transcription and species specificities; their biological roles, however, are still unclear. A novel quality of a highly repetitive mouse DNA sequence is described which points to a functional role: All copies (approximately 50,000 per haploid genome) of this DNA sequence reside on genomic Alu I DNA fragments each associated with nuclear polypeptides that are not released from DNA by proteinase K, SDS and phenol extraction. By this quality the repetitive DNA sequence is classified as a member of the sub-set of DNA sequences involved in tight DNA-polypeptide complexes which have been previously shown to be components of the subnuclear structure termed 'nuclear matrix'. From these results it has to be concluded that the repetitive DNA sequence characterized in this report represents or comprises a signal for a large number of site specific attachment points of the mouse genome in the nuclear matrix.

  5. Sequencing historical specimens: successful preparation of small specimens with low amounts of degraded DNA.

    Science.gov (United States)

    Sproul, John S; Maddison, David R

    2017-11-01

    Despite advances that allow DNA sequencing of old museum specimens, sequencing small-bodied, historical specimens can be challenging and unreliable as many contain only small amounts of fragmented DNA. Dependable methods to sequence such specimens are especially critical if the specimens are unique. We attempt to sequence small-bodied (3-6 mm) historical specimens (including nomenclatural types) of beetles that have been housed, dried, in museums for 58-159 years, and for which few or no suitable replacement specimens exist. To better understand ideal approaches of sample preparation and produce preparation guidelines, we compared different library preparation protocols using low amounts of input DNA (1-10 ng). We also explored low-cost optimizations designed to improve library preparation efficiency and sequencing success of historical specimens with minimal DNA, such as enzymatic repair of DNA. We report successful sample preparation and sequencing for all historical specimens despite our low-input DNA approach. We provide a list of guidelines related to DNA repair, bead handling, reducing adapter dimers and library amplification. We present these guidelines to facilitate more economical use of valuable DNA and enable more consistent results in projects that aim to sequence challenging, irreplaceable historical specimens. © 2017 John Wiley & Sons Ltd.

  6. Sequence-Dependent Diastereospecific and Diastereodivergent Crosslinking of DNA by Decarbamoylmitomycin C.

    Science.gov (United States)

    Aguilar, William; Paz, Manuel M; Vargas, Anayatzinc; Clement, Cristina C; Cheng, Shu-Yuan; Champeil, Elise

    2018-04-20

    Mitomycin C (MC), a potent antitumor drug, and decarbamoylmitomycin C (DMC), a derivative lacking the carbamoyl group, form highly cytotoxic DNA interstrand crosslinks. The major interstrand crosslink formed by DMC is the C1'' epimer of the major crosslink formed by MC. The molecular basis for the stereochemical configuration exhibited by DMC was investigated using biomimetic synthesis. The formation of DNA-DNA crosslinks by DMC is diastereospecific and diastereodivergent: Only the 1''S-diastereomer of the initially formed monoadduct can form crosslinks at GpC sequences, and only the 1''R-diastereomer of the monoadduct can form crosslinks at CpG sequences. We also show that CpG and GpC sequences react with divergent diastereoselectivity in the first alkylation step: 1"S stereochemistry is favored at GpC sequences and 1''R stereochemistry is favored at CpG sequences. Therefore, the first alkylation step results, at each sequence, in the selective formation of the diastereomer able to generate an interstrand DNA-DNA crosslink after the "second arm" alkylation. Examination of the known DNA adduct pattern obtained after treatment of cancer cell cultures with DMC indicates that the GpC sequence is the major target for the formation of DNA-DNA crosslinks in vivo by this drug. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Cloning, sequencing, and expression of dnaK-operon proteins from the thermophilic bacterium Thermus thermophilus.

    Science.gov (United States)

    Osipiuk, J; Joachimiak, A

    1997-09-12

    We propose that the dnaK operon of Thermus thermophilus HB8 is composed of three functionally linked genes: dnaK, grpE, and dnaJ. The dnaK and dnaJ gene products are most closely related to their cyanobacterial homologs. The DnaK protein sequence places T. thermophilus in the plastid Hsp70 subfamily. In contrast, the grpE translated sequence is most similar to GrpE from Clostridium acetobutylicum, a Gram-positive anaerobic bacterium. A single promoter region, with homology to the Escherichia coli consensus promoter sequences recognized by the sigma70 and sigma32 transcription factors, precedes the postulated operon. This promoter is heat-shock inducible. The dnaK mRNA level increased more than 30 times upon 10 min of heat shock (from 70 degrees C to 85 degrees C). A strong transcription terminating sequence was found between the dnaK and grpE genes. The individual genes were cloned into pET expression vectors and the thermophilic proteins were overproduced at high levels in E. coli and purified to homogeneity. The recombinant T. thermophilus DnaK protein was shown to have a weak ATP-hydrolytic activity, with an optimum at 90 degrees C. The ATPase was stimulated by the presence of GrpE and DnaJ. Another open reading frame, coding for ClpB heat-shock protein, was found downstream of the dnaK operon.

  8. Analysis of T-DNA/Host-Plant DNA Junction Sequences in Single-Copy Transgenic Barley Lines

    Directory of Open Access Journals (Sweden)

    Joanne G. Bartlett

    2014-01-01

    Full Text Available Sequencing across the junction between an integrated transfer DNA (T-DNA) and a host plant genome provides two important pieces of information. The junctions themselves provide information regarding the proportion of T-DNA which has integrated into the host plant genome, whilst the transgene flanking sequences can be used to study the local genetic environment of the integrated transgene. In addition, this information is important in the safety assessment of GM crops and essential for GM traceability. In this study, a detailed analysis was carried out on the right-border T-DNA junction sequences of single-copy independent transgenic barley lines. T-DNA truncations at the right border were found to be relatively common and affected 33.3% of the lines. In addition, 14.3% of lines had rearranged construct sequence after the right-border break-point. An in-depth analysis of the host-plant flanking sequences revealed that a significant proportion of the T-DNAs integrated into or close to known repetitive elements. However, this integration into repetitive DNA did not have a negative effect on transgene expression.

  9. Two dimensional molecular electronics spectroscopy for molecular fingerprinting, DNA sequencing, and cancerous DNA recognition.

    Science.gov (United States)

    Rajan, Arunkumar Chitteth; Rezapour, Mohammad Reza; Yun, Jeonghun; Cho, Yeonchoo; Cho, Woo Jong; Min, Seung Kyu; Lee, Geunsik; Kim, Kwang S

    2014-02-25

    Laser-driven molecular spectroscopy of low spatial resolution is widely used, while electronic current-driven molecular spectroscopy of atomic scale resolution has been limited because currents provide only minimal information. However, electron transmission of a graphene nanoribbon on which a molecule is adsorbed shows molecular fingerprints of Fano resonances, i.e., characteristic features of frontier orbitals and conformations of physisorbed molecules. Utilizing these resonance profiles, here we demonstrate two-dimensional molecular electronics spectroscopy (2D MES). The differential conductance with respect to bias and gate voltages not only distinguishes different types of nucleobases for DNA sequencing but also recognizes methylated nucleobases which could be related to cancerous cell growth. This 2D MES could open an exciting field to recognize single molecule signatures at atomic resolution. The advantages of the 2D MES over the one-dimensional (1D) current analysis can be comparable to those of 2D NMR over 1D NMR analysis.

  10. Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

    Science.gov (United States)

    M.N. Islam-Faridi; C.D. Nelson; S.P. DiFazio; L.E. Gunter; G.A. Tuskan

    2009-01-01

    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones...

  11. cDNA sequence of human transforming gene hst and identification of the coding sequence required for transforming activity

    International Nuclear Information System (INIS)

    Taira, M.; Yoshida, T.; Miyagawa, K.; Sakamoto, H.; Terada, M.; Sugimura, T.

    1987-01-01

    The hst gene was originally identified as a transforming gene in DNAs from human stomach cancers and from a noncancerous portion of stomach mucosa by DNA-mediated transfection assay using NIH3T3 cells. cDNA clones of hst were isolated from the cDNA library constructed from poly(A)+ RNA of a secondary transformant induced by the DNA from a stomach cancer. The sequence analysis of the hst cDNA revealed the presence of two open reading frames. When this cDNA was inserted into an expression vector containing the simian virus 40 promoter, it efficiently induced the transformation of NIH3T3 cells upon transfection. It was found that one of the reading frames, which coded for 206 amino acids, was responsible for the transforming activity.

  12. Real sequence effects on the search dynamics of transcription factors on DNA

    DEFF Research Database (Denmark)

    Bauer, Maximilian; Rasmussen, Emil S.; Lomholt, Michael A.

    2015-01-01

    Recent experiments show that transcription factors (TFs) indeed use the facilitated diffusion mechanism to locate their target sequences on DNA in living bacteria cells: TFs alternate between sliding motion along DNA and relocation events through the cytoplasm. From simulations and theoretical...... analysis we study the TF-sliding motion for a large section of the DNA-sequence of a common E. coli strain, based on the two-state TF-model with a fast-sliding search state and a recognition state enabling target detection. For the probability to detect the target before dissociating from DNA the TF...... on the underlying nucleotide sequence is varied. A moderate dependence maximises the capability to distinguish between the main operator and similar sequences. Moreover, these auxiliary operators serve as starting points for DNA looping with the main operator, yielding a spectrum of target detection times spanning...

  13. Sequence-specific activation of the DNA sensor cGAS by Y-form DNA structures as found in primary HIV-1 cDNA.

    Science.gov (United States)

    Herzner, Anna-Maria; Hagmann, Cristina Amparo; Goldeck, Marion; Wolter, Steven; Kübler, Kirsten; Wittmann, Sabine; Gramberg, Thomas; Andreeva, Liudmila; Hopfner, Karl-Peter; Mertens, Christina; Zillinger, Thomas; Jin, Tengchuan; Xiao, Tsan Sam; Bartok, Eva; Coch, Christoph; Ackermann, Damian; Hornung, Veit; Ludwig, Janos; Barchet, Winfried; Hartmann, Gunther; Schlee, Martin

    2015-10-01

    Cytosolic DNA that emerges during infection with a retrovirus or DNA virus triggers antiviral type I interferon responses. So far, only double-stranded DNA (dsDNA) over 40 base pairs (bp) in length has been considered immunostimulatory. Here we found that unpaired DNA nucleotides flanking short base-paired DNA stretches, as in stem-loop structures of single-stranded DNA (ssDNA) derived from human immunodeficiency virus type 1 (HIV-1), activated the type I interferon-inducing DNA sensor cGAS in a sequence-dependent manner. DNA structures containing unpaired guanosines flanking short (12- to 20-bp) dsDNA (Y-form DNA) were highly stimulatory and specifically enhanced the enzymatic activity of cGAS. Furthermore, we found that primary HIV-1 reverse transcripts represented the predominant viral cytosolic DNA species during early infection of macrophages and that these ssDNAs were highly immunostimulatory. Collectively, our study identifies unpaired guanosines in Y-form DNA as a highly active, minimal cGAS recognition motif that enables detection of HIV-1 ssDNA.

  14. Local repeat sequence organization of an intergenic spacer in the ...

    Indian Academy of Sciences (India)

    Unknown

    chloroplast genome of Chlamydomonas reinhardtii leads to DNA expansion and sequence ... The discovery of uniparentally inherited streptomycin resistant mutants ... resembles yeast, mitochondrial and phage recombination in that it is typically ...... Sager R and Lane D 1972 Molecular basis of maternal inheritance; Proc.

  15. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors.

    Science.gov (United States)

    Adalsteinsson, Viktor A; Ha, Gavin; Freeman, Samuel S; Choudhury, Atish D; Stover, Daniel G; Parsons, Heather A; Gydush, Gregory; Reed, Sarah C; Rotem, Denisse; Rhoades, Justin; Loginov, Denis; Livitz, Dimitri; Rosebrock, Daniel; Leshchiner, Ignaty; Kim, Jaegil; Stewart, Chip; Rosenberg, Mara; Francis, Joshua M; Zhang, Cheng-Zhong; Cohen, Ofir; Oh, Coyin; Ding, Huiming; Polak, Paz; Lloyd, Max; Mahmud, Sairah; Helvie, Karla; Merrill, Margaret S; Santiago, Rebecca A; O'Connor, Edward P; Jeong, Seong H; Leeson, Rachel; Barry, Rachel M; Kramkowski, Joseph F; Zhang, Zhenwei; Polacek, Laura; Lohr, Jens G; Schleicher, Molly; Lipscomb, Emily; Saltzman, Andrea; Oliver, Nelly M; Marini, Lori; Waks, Adrienne G; Harshman, Lauren C; Tolaney, Sara M; Van Allen, Eliezer M; Winer, Eric P; Lin, Nancy U; Nakabayashi, Mari; Taplin, Mary-Ellen; Johannessen, Cory M; Garraway, Levi A; Golub, Todd R; Boehm, Jesse S; Wagle, Nikhil; Getz, Gad; Love, J Christopher; Meyerson, Matthew

    2017-11-06

    Whole-exome sequencing of cell-free DNA (cfDNA) could enable comprehensive profiling of tumors from blood but the genome-wide concordance between cfDNA and tumor biopsies is uncertain. Here we report ichorCNA, software that quantifies tumor content in cfDNA from 0.1× coverage whole-genome sequencing data without prior knowledge of tumor mutations. We apply ichorCNA to 1439 blood samples from 520 patients with metastatic prostate or breast cancers. In the earliest tested sample for each patient, 34% of patients have ≥10% tumor-derived cfDNA, sufficient for standard coverage whole-exome sequencing. Using whole-exome sequencing, we validate the concordance of clonal somatic mutations (88%), copy number alterations (80%), mutational signatures, and neoantigens between cfDNA and matched tumor biopsies from 41 patients with ≥10% cfDNA tumor content. In summary, we provide methods to identify patients eligible for comprehensive cfDNA profiling, revealing its applicability to many patients, and demonstrate high concordance of cfDNA and metastatic tumor whole-exome sequencing.

  16. Targeting and tracing of specific DNA sequences with dTALEs in living cells

    Science.gov (United States)

    Thanisch, Katharina; Schneider, Katrin; Morbitzer, Robert; Solovei, Irina; Lahaye, Thomas; Bultmann, Sebastian; Leonhardt, Heinrich

    2014-01-01

    Epigenetic regulation of gene expression involves, besides DNA and histone modifications, the relative positioning of DNA sequences within the nucleus. To trace specific DNA sequences in living cells, we used programmable sequence-specific DNA binding of designer transcription activator-like effectors (dTALEs). We designed a recombinant dTALE (msTALE) with variable repeat domains to specifically bind a 19-bp target sequence of major satellite DNA. The msTALE was fused with green fluorescent protein (GFP) and stably expressed in mouse embryonic stem cells. Hybridization with a major satellite probe (3D-fluorescent in situ hybridization) and co-staining for known cellular structures confirmed in vivo binding of the GFP-msTALE to major satellite DNA present at nuclear chromocenters. Dual tracing of major satellite DNA and the replication machinery throughout S-phase showed co-localization during mid to late S-phase, directly demonstrating the late replication timing of major satellite DNA. Fluorescence bleaching experiments indicated a relatively stable but still dynamic binding, with mean residence times in the range of minutes. Fluorescently labeled dTALEs open new perspectives to target and trace DNA sequences and to monitor dynamic changes in subnuclear positioning as well as interactions with functional nuclear structures during cell cycle progression and cellular differentiation. PMID:24371265

  17. Targeting and tracing of specific DNA sequences with dTALEs in living cells.

    Science.gov (United States)

    Thanisch, Katharina; Schneider, Katrin; Morbitzer, Robert; Solovei, Irina; Lahaye, Thomas; Bultmann, Sebastian; Leonhardt, Heinrich

    2014-04-01

    Epigenetic regulation of gene expression involves, besides DNA and histone modifications, the relative positioning of DNA sequences within the nucleus. To trace specific DNA sequences in living cells, we used programmable sequence-specific DNA binding of designer transcription activator-like effectors (dTALEs). We designed a recombinant dTALE (msTALE) with variable repeat domains to specifically bind a 19-bp target sequence of major satellite DNA. The msTALE was fused with green fluorescent protein (GFP) and stably expressed in mouse embryonic stem cells. Hybridization with a major satellite probe (3D-fluorescent in situ hybridization) and co-staining for known cellular structures confirmed in vivo binding of the GFP-msTALE to major satellite DNA present at nuclear chromocenters. Dual tracing of major satellite DNA and the replication machinery throughout S-phase showed co-localization during mid to late S-phase, directly demonstrating the late replication timing of major satellite DNA. Fluorescence bleaching experiments indicated a relatively stable but still dynamic binding, with mean residence times in the range of minutes. Fluorescently labeled dTALEs open new perspectives to target and trace DNA sequences and to monitor dynamic changes in subnuclear positioning as well as interactions with functional nuclear structures during cell cycle progression and cellular differentiation.

  18. MotifMark: Finding Regulatory Motifs in DNA Sequences

    OpenAIRE

    Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L.; Wang, May D.

    2017-01-01

    The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity be...

  19. Sequence analysis of the canine mitochondrial DNA control region from shed hair samples in criminal investigations.

    Science.gov (United States)

    Berger, C; Berger, B; Parson, W

    2012-01-01

    In recent years, evidence from domestic dogs has increasingly been analyzed by forensic DNA testing. Especially, canine hairs have proved most suitable and practical due to the high rate of hair transfer occurring between dogs and humans. Starting with the description of a contamination-free sample handling procedure, we give a detailed workflow for sequencing hypervariable segments (HVS) of the mtDNA control region from canine evidence. After the hair material is lysed and the DNA extracted by Phenol/Chloroform, the amplification and sequencing strategy comprises the HVS I and II of the canine control region and is optimized for DNA of medium-to-low quality and quantity. The sequencing procedure is based on the Sanger Big-dye deoxy-terminator method and the separation of the sequencing reaction products is performed on a conventional multicolor fluorescence detection capillary electrophoresis platform. Finally, software-aided base calling and sequence interpretation are addressed exemplarily.

  20. Chloroplast genes as genetic markers for inferring patterns of change, maternal ancestry and phylogenetic relationships among Eleusine species.

    Science.gov (United States)

    Agrawal, Renuka; Agrawal, Nitin; Tandon, Rajesh; Raina, Soom Nath

    2014-01-01

    Assessment of phylogenetic relationships is an important component of any successful crop improvement programme, as wild relatives of the crop species often carry agronomically beneficial traits. Since its domestication in East Africa, Eleusine coracana (2n = 4x = 36), a species belonging to the genus Eleusine (x = 8, 9, 10), has held a prominent place in the semi-arid regions of India, Nepal and Africa. The patterns of variation between the cultivated and wild species reported so far and the interpretations based upon them have been considered primarily in terms of nuclear events. We analysed, for the first time, the phylogenetic relationship between finger millet (E. coracana) and its wild relatives by species-specific chloroplast deoxyribonucleic acid (cpDNA) polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) and chloroplast simple sequence repeat (cpSSR) markers/sequences. Restriction fragment length polymorphism of the seven amplified chloroplast genes/intergenic spacers (trnK, psbD, psaA, trnH-trnK, trnL-trnF, 16S and trnS-psbC), nucleotide sequencing of the chloroplast trnK gene and chloroplast microsatellite polymorphism were analysed in all nine known species of Eleusine. The RFLP of all seven amplified chloroplast genes/intergenic spacers and trnK gene sequences in the diploid (2n = 16, 18, 20) and allotetraploid (2n = 36, 38) species resulted in well-resolved phylogenetic trees with high bootstrap values. Eleusine coracana, E. africana, E. tristachya, E. indica and E. kigeziensis did not show even a single change in restriction site. Eleusine intermedia and E. floccifolia were also shown to have identical cpDNA fragment patterns. The cpDNA diversity in Eleusine multiflora was found to be more extensive than that of the other eight species. The trnK gene sequence data complemented the results obtained by PCR-RFLP. The maternal lineage of all three allotetraploid species (AABB, AADD) was the same, with E. indica being the

  1. A duplex DNA-gold nanoparticle probe composed as a colorimetric biosensor for sequence-specific DNA-binding proteins.

    Science.gov (United States)

    Ahn, Junho; Choi, Yeonweon; Lee, Ae-Ree; Lee, Joon-Hwa; Jung, Jong Hwa

    2016-03-21

    Using duplex DNA-AuNP aggregates, a sequence-specific DNA-binding protein, SQUAMOSA Promoter-binding-Like protein 12 (SPL-12), was directly determined by SPL-12-duplex DNA interaction-based colorimetric actions of DNA-Au assemblies. In order to prepare duplex DNA-Au aggregates, thiol-modified DNA 1 and DNA 2 were attached onto the surface of AuNPs, respectively, by the salt-aging method and then the DNA-attached AuNPs were mixed. Duplex-DNA-Au aggregates having the average size of 160 nm diameter and the maximum absorption at 529 nm were able to recognize SPL-12 and reached the equivalent state by the addition of ∼30 equivalents of SPL-12 accompanying a color change from red to blue with a red shift of the maximum absorption at 570 nm. As a result, the aggregation size grew to about 247 nm. Also, at higher temperatures of the mixture of duplex-DNA-Au aggregate solution and SPL-12, the equivalent state was reached rapidly. On the contrary, in the control experiment using Bovine Serum Albumin (BSA), no absorption band shift of duplex-DNA-Au aggregates was observed.

  2. Ecological niche modelling and nDNA sequencing support a new, morphologically cryptic beetle species unveiled by DNA barcoding.

    Science.gov (United States)

    Hawlitschek, Oliver; Porch, Nick; Hendrich, Lars; Balke, Michael

    2011-02-09

    DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct, but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp.n. In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species.

  3. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Namhai Chua; Kush, A.

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids.

  4. Bisulfite sequencing reveals that Aspergillus flavus holds a hollow in DNA methylation.

    Directory of Open Access Journals (Sweden)

    Si-Yang Liu

    Full Text Available Aspergillus flavus first gained scientific attention for its production of aflatoxin. The underlying regulation of aflatoxin biosynthesis has served as a theoretical model for the biosynthesis of other microbial secondary metabolites. Nevertheless, for several decades, the status of DNA methylation, one of the important epigenomic modifications involved in gene regulation, has remained controversial in A. flavus. Here, we applied bisulfite sequencing in conjunction with a biological replicate strategy to investigate the DNA methylation profile of the A. flavus genome. Both the bisulfite sequencing data and the methylome comparisons with other fungi confirm that the DNA methylation level of this fungus is negligible. Further investigation into the DNA methyltransferase of Aspergillus uncovers its close relationship with RID-like enzymes as well as its divergence from the methyltransferases of species with validated DNA methylation. The lack of repeat content in the A. flavus genome and the high RIP-index of the small number of remnant repeats potentially support our speculation that DNA methylation may be absent in A. flavus or that it may possess de novo DNA methylation which occurs very transiently during the obscure sexual stage of this fungal species. This work contributes to our understanding of the DNA methylation status of A. flavus and reinforces our views on DNA methylation in fungal species. In addition, our strategy of applying bisulfite sequencing to DNA methylation detection in species with low DNA methylation may serve as a reference for later scientific investigations in other hypomethylated species.

  5. A family of selfish minicircular chromosomes with jumbled chloroplast gene fragments from a dinoflagellate.

    Science.gov (United States)

    Zhang, Z; Cavalier-Smith, T; Green, B R

    2001-08-01

    Chloroplast genes of several dinoflagellate species are located on unigenic DNA minicircular chromosomes. We have now completely sequenced five aberrant minicircular chromosomes from the dinoflagellate Heterocapsa triquetra. These probably nonfunctional DNA circles lack complete genes, with each being composed of several short fragments of two or three different chloroplast genes and a common conserved region with a tripartite 9G-9A-9G core like the putative replicon origin of functional single-gene circular chloroplast chromosomes. Their sequences imply that all five circles evolved by differential deletions and duplications from common ancestral circles bearing fragments of four genes: psbA, psbC, 16S rRNA, and 23S rRNA. It appears that recombination between separate unigenic chromosomes initially gave intermediate heterodimers, which were subsequently stabilized by deletions that included part or all of one putative replicon origin. We suggest that homologous recombination at the 9G-9A-9G core regions produced a psbA/psbC heterodimer which generated two distinct chimeric circles by differential deletions and duplications. A 23S/16S rRNA heterodimer more likely formed by illegitimate recombination between 16S and 23S rRNA genes. Homologous recombination between the 9G-9A-9G core regions of both heterodimers and additional differential deletions and duplications could then have yielded the other three circles. Near identity of the gene fragments and 9G-9A-9G cores, despite diverging adjacent regions, may be maintained by gene conversion. The conserved organization of the 9G-9A-9G cores alone favors the idea that they are replicon origins and suggests that they may enable the aberrant minicircles to parasitize the chloroplast's replication machinery as selfish circles.

  6. Two new computational methods for universal DNA barcoding: a benchmark using barcode sequences of bacteria, archaea, animals, fungi, and land plants.

    Science.gov (United States)

    Tanabe, Akifumi S; Toju, Hirokazu

    2013-01-01

    Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archaeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used "1-nearest-neighbor" (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. Therefore, we need to accelerate
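
    The 1-NN idea described in this record can be illustrated with a minimal Python sketch. It is not the authors' software and not the QCauto method: the function names, the toy reference list, and the `min_identity` cutoff are all hypothetical, and `difflib.SequenceMatcher` is used only as a rough stand-in for a real alignment-based similarity such as a BLAST score. The optional cutoff hints at why plain 1-NN can misidentify a query whose species is missing from the reference set: the top hit is always returned unless something rejects weak matches.

```python
from difflib import SequenceMatcher
from typing import Optional


def similarity(a: str, b: str) -> float:
    """Rough sequence similarity in [0, 1]; a stand-in for an alignment score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def one_nearest_neighbor(query: str, references: list[tuple[str, str]]) -> tuple[str, float]:
    """Return (taxon, score) of the single most similar reference sequence.

    `references` is a list of (taxon_name, sequence) pairs (toy data here).
    """
    if not references:
        raise ValueError("reference list is empty")
    scored = [(taxon, similarity(query, seq)) for taxon, seq in references]
    return max(scored, key=lambda pair: pair[1])


def assign_taxon(query: str, references: list[tuple[str, str]],
                 min_identity: float = 0.97) -> Optional[str]:
    """1-NN assignment that refuses to name a taxon when the top hit is weak.

    `min_identity` is an illustrative assumption, not a value from the record.
    """
    taxon, score = one_nearest_neighbor(query, references)
    return taxon if score >= min_identity else None


if __name__ == "__main__":
    refs = [("Taxon A", "ACGTACGTACGT"), ("Taxon B", "TTTTGGGGCCCC")]
    print(one_nearest_neighbor("ACGTACGTACGA", refs))
    print(assign_taxon("ACGTACGTACGA", refs, min_identity=0.9))
```

    With an empty or poorly populated reference list the cutoff variant returns no assignment instead of a misleading top hit, which is the failure mode the benchmark attributes to 1-NN.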

  7. Substrate sequence selectivity of APOBEC3A implicates intra-DNA interactions.

    Science.gov (United States)

    Silvas, Tania V; Hou, Shurong; Myint, Wazo; Nalivaika, Ellen; Somasundaran, Mohan; Kelch, Brian A; Matsuo, Hiroshi; Kurt Yilmaz, Nese; Schiffer, Celia A

    2018-05-14

    The APOBEC3 (A3) family of human cytidine deaminases is renowned for providing a first line of defense against many exogenous and endogenous retroviruses. However, the ability of these proteins to deaminate deoxycytidines in ssDNA makes A3s a double-edged sword. When overexpressed, A3s can mutate endogenous genomic DNA resulting in a variety of cancers. Although the sequence context for mutating DNA varies among A3s, the mechanism for substrate sequence specificity is not well understood. To characterize substrate specificity of A3A, a systematic approach was used to quantify the affinity for substrate as a function of sequence context, length, secondary structure, and solution pH. We identified the A3A ssDNA binding motif as (T/C)TC(A/G), which correlated with enzymatic activity. We also validated that A3A binds RNA in a sequence-specific manner. A3A bound more tightly to the substrate binding motif within a hairpin loop than to a linear oligonucleotide, suggesting A3A affinity is modulated by substrate structure. Based on these findings and previously published A3A-ssDNA co-crystal structures, we propose a new model with intra-DNA interactions for the molecular mechanism underlying A3A sequence preference. Overall, the sequence and structural preferences identified for A3A lead to a new paradigm for identifying A3A's involvement in mutation of endo
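
    The binding motif reported in this record, (T/C)TC(A/G), maps directly onto a regular expression, so candidate sites in a sequence can be located with a short Python sketch. The function name and the example sequence are hypothetical, and the scan covers only the given strand; it says nothing about hairpin context or binding affinity, which the record identifies as important.

```python
import re

# (T/C)TC(A/G) written as a character-class regular expression.
A3A_MOTIF = re.compile(r"[TC]TC[AG]")


def find_motif_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start index, matched tetranucleotide) for each motif hit.

    Input is upper-cased first; only the strand as given is scanned.
    """
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in A3A_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence chosen for illustration only.
    for start, site in find_motif_sites("AATTCAGGCTCGTT"):
        print(start, site)
```

    To cover both strands, the same function could be applied to the reverse complement of the input.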