WorldWideScience

Sample records for subfossil dna sequences

  1. Absence of ancient DNA in sub-fossil insect inclusions preserved in 'Anthropocene' Colombian copal.

    Science.gov (United States)

    Penney, David; Wadsworth, Caroline; Fox, Graeme; Kennedy, Sandra L; Preziosi, Richard F; Brown, Terence A

    2013-01-01

    Insects preserved in copal, the sub-fossilized resin precursor of amber, have potential value in molecular ecological studies of recently-extinct species and of extant species that have never been collected as living specimens. The objective of the work reported in this paper was therefore to determine if ancient DNA is present in insects preserved in copal. We prepared DNA libraries from two stingless bees (Apidae: Meliponini: Trigonisca ameliae) preserved in 'Anthropocene' Colombian copal, dated to 'post-Bomb' and 10,612±62 cal yr BP, respectively, and obtained sequence reads using the GS Junior 454 System. Read numbers were low, but were significantly higher for DNA extracts prepared from crushed insects compared with extracts obtained by a non-destructive method. The younger specimen yielded sequence reads up to 535 nucleotides in length, but searches of these sequences against the nucleotide database revealed very few significant matches. None of these hits was to stingless bees though one read of 97 nucleotides aligned with two non-contiguous segments of the mitochondrial cytochrome oxidase subunit I gene of the East Asia bumblebee Bombus hypocrita. The most significant hit was for 452 nucleotides of a 470-nucleotide read that aligned with part of the genome of the root-nodulating bacterium Bradyrhizobium japonicum. The other significant hits were to proteobacteria and an actinomycete. Searches directed specifically at Apidae nucleotide sequences only gave short and insignificant alignments. All of the reads from the older specimen appeared to be artefacts. We were therefore unable to obtain any convincing evidence for the preservation of ancient DNA in either of the two copal inclusions that we studied, and conclude that DNA is not preserved in this type of material. Our results raise further doubts about claims of DNA extraction from fossil insects in amber, many millions of years older than copal.

  2. Absence of Ancient DNA in Sub-Fossil Insect Inclusions Preserved in ‘Anthropocene’ Colombian Copal

    Science.gov (United States)

    Penney, David; Wadsworth, Caroline; Fox, Graeme; Kennedy, Sandra L.; Preziosi, Richard F.; Brown, Terence A.

    2013-01-01

    Insects preserved in copal, the sub-fossilized resin precursor of amber, have potential value in molecular ecological studies of recently-extinct species and of extant species that have never been collected as living specimens. The objective of the work reported in this paper was therefore to determine if ancient DNA is present in insects preserved in copal. We prepared DNA libraries from two stingless bees (Apidae: Meliponini: Trigonisca ameliae) preserved in ‘Anthropocene’ Colombian copal, dated to ‘post-Bomb’ and 10,612±62 cal yr BP, respectively, and obtained sequence reads using the GS Junior 454 System. Read numbers were low, but were significantly higher for DNA extracts prepared from crushed insects compared with extracts obtained by a non-destructive method. The younger specimen yielded sequence reads up to 535 nucleotides in length, but searches of these sequences against the nucleotide database revealed very few significant matches. None of these hits was to stingless bees though one read of 97 nucleotides aligned with two non-contiguous segments of the mitochondrial cytochrome oxidase subunit I gene of the East Asia bumblebee Bombus hypocrita. The most significant hit was for 452 nucleotides of a 470-nucleotide read that aligned with part of the genome of the root-nodulating bacterium Bradyrhizobium japonicum. The other significant hits were to proteobacteria and an actinomycete. Searches directed specifically at Apidae nucleotide sequences only gave short and insignificant alignments. All of the reads from the older specimen appeared to be artefacts. We were therefore unable to obtain any convincing evidence for the preservation of ancient DNA in either of the two copal inclusions that we studied, and conclude that DNA is not preserved in this type of material. Our results raise further doubts about claims of DNA extraction from fossil insects in amber, many millions of years older than copal. PMID:24039876

  3. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  4. Automated DNA Sequencing System

    Energy Technology Data Exchange (ETDEWEB)

    Armstrong, G.A.; Ekkebus, C.P.; Hauser, L.J.; Kress, R.L.; Mural, R.J.

    1999-04-25

    Oak Ridge National Laboratory (ORNL) is developing a core DNA sequencing facility to support biological research endeavors at ORNL and to conduct basic sequencing automation research. This facility is novel because its development is based on existing standard biology laboratory equipment; thus, the development process is of interest to the many small laboratories trying to use automation to control costs and increase throughput. Before automation, biology laboratory personnel purified DNA, completed cycle sequencing, and prepared 96-well sample plates with commercially available hardware designed specifically for each step in the process. Following purification and thermal cycling, an automated sequencing machine was used for the sequencing. A technician handled all movement of the 96-well sample plates between machines. To automate the process, ORNL is adding a CRS Robotics A-465 arm, ABI 377 sequencing machine, automated centrifuge, automated refrigerator, and possibly an automated SpeedVac. The entire system will be integrated with one central controller that will direct each machine and the robot. The goal of this system is to completely automate the sequencing procedure from bacterial cell samples through ready-to-be-sequenced DNA and ultimately to completed sequence. The system will be flexible and will accommodate different chemistries than existing automated sequencing lines. The system will be expanded in the future to include colony picking and/or actual sequencing. This discrete event, DNA sequencing system will demonstrate that smaller sequencing labs can achieve cost-effective the laboratory grow.

  5. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  6. Evolution of DNA sequencing.

    Science.gov (United States)

    Tipu, Hamid Nawaz; Shabbir, Ambreen

    2015-03-01

    Sanger and coworkers introduced DNA sequencing for the first time in the 1970s. It principally relied on termination of the growing nucleotide chain when a dideoxythymidine triphosphate (ddTTP) was inserted in it. Detection of terminated sequences was done radiographically on Polyacrylamide Gel Electrophoresis (PAGE). Improvements that have evolved over time in original Sanger sequencing include replacement of radiography with fluorescence, use of separate fluorescent markers for each nucleotide, use of capillary electrophoresis instead of polyacrylamide gel electrophoresis and then introduction of capillary array electrophoresis. However, this technique suffered from a few inherent limitations, such as decreased sensitivity for low-level mutant alleles, complexities in analyzing highly polymorphic regions like the Major Histocompatibility Complex (MHC), and the high DNA concentrations required. Several Next Generation Sequencing (NGS) technologies have been introduced by Roche, Illumina and other commercial manufacturers that tend to overcome Sanger sequencing limitations and have been reviewed. Introduction of NGS in clinical research and medical diagnostics is expected to change the entire diagnostic approach. Applications include the study of cancer variants, detection of minimal residual disease, exome sequencing, detection of Single Nucleotide Polymorphisms (SNPs) and their disease association, epigenetic regulation of gene expression, and sequencing of microorganisms' genomes.

  7. Transposon facilitated DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Berg, D.E.; Berg, C.M.; Huang, H.V.

    1990-01-01

    The purpose of this research is to investigate and develop methods that exploit the power of bacterial transposable elements for large scale DNA sequencing. Our premise is that the use of transposons to put primer binding sites randomly in target DNAs should provide access to all portions of large DNA fragments, without the inefficiencies of methods involving random subcloning and attendant repetitive sequencing, or of sequential synthesis of many oligonucleotide primers that are used to march systematically along a DNA molecule. Two unrelated bacterial transposons, Tn5 and γδ, are being used because they have both proven useful for molecular analyses, and because they differ sufficiently in mechanism and specificity of transposition to merit parallel development.

  8. Comparative and population mitogenomic analyses of Madagascar's extinct, giant 'subfossil' lemurs.

    Science.gov (United States)

    Kistler, Logan; Ratan, Aakrosh; Godfrey, Laurie R; Crowley, Brooke E; Hughes, Cris E; Lei, Runhua; Cui, Yinqiu; Wood, Mindy L; Muldoon, Kathleen M; Andriamialison, Haingoson; McGraw, John J; Tomsho, Lynn P; Schuster, Stephan C; Miller, Webb; Louis, Edward E; Yoder, Anne D; Malhi, Ripan S; Perry, George H

    2015-02-01

    Humans first arrived on Madagascar only a few thousand years ago. Subsequent habitat destruction and hunting activities have had significant impacts on the island's biodiversity, including the extinction of megafauna. For example, we know of 17 recently extinct 'subfossil' lemur species, all of which were substantially larger (body mass ∼11-160 kg) than any living population of the ∼100 extant lemur species (largest body mass ∼6.8 kg). We used ancient DNA and genomic methods to study subfossil lemur extinction biology and update our understanding of extant lemur conservation risk factors by i) reconstructing a comprehensive phylogeny of extinct and extant lemurs, and ii) testing whether low genetic diversity is associated with body size and extinction risk. We recovered complete or near-complete mitochondrial genomes from five subfossil lemur taxa, and generated sequence data from population samples of two extinct and eight extant lemur species. Phylogenetic comparisons resolved prior taxonomic uncertainties and confirmed that the extinct subfossil species did not comprise a single clade. Genetic diversity estimates for the two sampled extinct species were relatively low, suggesting small historical population sizes. Low genetic diversity and small population sizes are both risk factors that would have rendered giant lemurs especially susceptible to extinction. Surprisingly, among the extant lemurs, we did not observe a relationship between body size and genetic diversity. The decoupling of these variables suggests that risk factors other than body size may have as much or more meaning for establishing future lemur conservation priorities. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Repeated DNA sequences in fungi

    Energy Technology Data Exchange (ETDEWEB)

    Dutta, S.K.

    1974-11-01

    Several fungal species, representatives of all broad groups like basidiomycetes, ascomycetes and phycomycetes, were examined for the nature of repeated DNA sequences by DNA:DNA reassociation studies using hydroxyapatite chromatography. All of the fungal species tested contained 10 to 20 percent repeated DNA sequences. There are approximately 100 to 110 copies of repeated DNA sequences of approximately 4 x 10^7 daltons piece size of each. Repeated DNA sequence homoduplexes showed on average 5 °C difference of T_e50 (temperature at which 50 percent duplexes dissociate) values from the corresponding homoduplexes of unfractionated whole DNA. It is suggested that a part of repetitive sequences in fungi constitutes mitochondrial DNA and a part of it constitutes nuclear DNA. (auth)

  10. Information Theory of DNA Sequencing

    CERN Document Server

    Motahari, Abolfazl; Tse, David

    2012-01-01

    DNA sequencing is the basic workhorse of modern day biology and medicine. Shotgun sequencing is the dominant technique used: many randomly located short fragments called reads are extracted from the DNA sequence, and these reads are assembled to reconstruct the original sequence. By drawing an analogy between the DNA sequencing problem and the classic communication problem, we define an information theoretic notion of sequencing capacity. This is the maximum number of DNA base pairs that can be resolved reliably per read, and provides a fundamental limit to the performance that can be achieved by any assembly algorithm. We compute the sequencing capacity explicitly for a simple statistical model of the DNA sequence and the read process. Using this framework, we also study the impact of noise in the read process on the sequencing capacity.
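The shotgun read model described in the abstract above (many short reads taken at random locations) can be illustrated with a minimal sketch. It only simulates error-free read sampling and measures how much of the sequence the reads cover; it does not compute the sequencing capacity defined in the paper. The function names, the uniform start-position model, and all parameter values are assumptions chosen for illustration.

```python
import random

def sample_reads(genome, read_len, n_reads, rng):
    """Draw n_reads substrings of length read_len at uniformly random start positions.
    Returns (start, read) pairs; the uniform, error-free model is an assumption."""
    max_start = len(genome) - read_len
    starts = [rng.randint(0, max_start) for _ in range(n_reads)]
    return [(s, genome[s:s + read_len]) for s in starts]

def covered_fraction(genome_len, reads):
    """Fraction of genome positions covered by at least one read."""
    covered = [False] * genome_len
    for start, seq in reads:
        for i in range(start, start + len(seq)):
            covered[i] = True
    return sum(covered) / genome_len

# Hypothetical example: a random 2,000-base sequence sampled with 400 reads of 50 bases.
rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(2000))
reads = sample_reads(genome, read_len=50, n_reads=400, rng=rng)
print(covered_fraction(len(genome), reads))
```

Raising `n_reads` or `read_len` pushes the covered fraction toward 1, which gives an informal feel for why read count and read length both limit what any assembler can reconstruct.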

  11. Sanger dideoxy sequencing of DNA.

    Science.gov (United States)

    Walker, Sarah E; Lorsch, Jon

    2013-01-01

    While the ease and reduced cost of automated DNA sequencing has largely obviated the need for manual dideoxy sequencing for routine purposes, specific applications require manual DNA sequencing. For instance, in studies of enzymes or proteins that bind or modify DNA, a DNA ladder is often used to map the site at which an enzyme is bound or a modification occurs. In these cases, the Sanger method for dideoxy sequencing provides a rapid and facile method for producing a labeled DNA ladder. Copyright © 2013 Elsevier Inc. All rights reserved.

  12. The Dynamics of DNA Sequencing.

    Science.gov (United States)

    Morvillo, Nancy

    1997-01-01

    Describes a paper-and-pencil activity that helps students understand DNA sequencing and expands student understanding of DNA structure, replication, and gel electrophoresis. Appropriate for advanced biology students who are familiar with the Sanger method. (DDR)

  13. Biosensors for DNA sequence detection

    Science.gov (United States)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  14. Graphene nanodevices for DNA sequencing

    Science.gov (United States)

    Heerema, Stephanie J.; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology.

  15. Graphene nanodevices for DNA sequencing

    NARCIS (Netherlands)

    Heerema, S.J.; Dekker, C.

    2016-01-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with

  16. DNA Sequencing Sensors: An Overview

    Directory of Open Access Journals (Sweden)

    Jose Antonio Garrido-Cardenas

    2017-03-01

    The first sequencing of a complete genome was published forty years ago by the double Nobel Prize in Chemistry winner Frederick Sanger. That corresponded to the small sized genome of a bacteriophage, but since then there have been many complex organisms whose DNA has been sequenced. This was possible thanks to continuous advances in the fields of biochemistry and molecular genetics, but also in other areas such as nanotechnology and computing. Nowadays, sequencing sensors based on genetic material have little to do with those used by Sanger. The emergence of mass sequencing sensors, or new generation sequencing (NGS), meant a quantitative leap both in the volume of genetic material that was able to be sequenced in each trial, as well as in the time per run and its cost. One can envisage that incoming technologies, already known as fourth generation sequencing, will continue to cheapen the trials by increasing DNA reading lengths in each run. All of this would be impossible without sensors and detection systems becoming smaller and more precise. This article provides a comprehensive overview on sensors for DNA sequencing developed within the last 40 years.

  17. Statistical properties of DNA sequences

    Science.gov (United States)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
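The detrended fluctuation analysis mentioned in the abstract above can be sketched in a few functions. This is a generic first-order DFA, not the authors' code: the purine/pyrimidine step rule, the mean subtraction, the non-overlapping boxes, and the chosen box sizes are all assumptions made for illustration.

```python
import math
import random

def dna_walk(seq):
    """Map bases to +1/-1 steps (assumed rule: purines A/G -> +1, pyrimidines C/T -> -1)
    and return the cumulative, mean-subtracted profile."""
    steps = [1.0 if b in "AG" else -1.0 for b in seq]
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for s in steps:
        total += s - mean
        profile.append(total)
    return profile

def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def fluctuation(profile, box):
    """Root-mean-square deviation of the profile from the local linear trend,
    computed over consecutive non-overlapping boxes of the given size."""
    xs = list(range(box))
    sq_sum, count = 0.0, 0
    for start in range(0, len(profile) - box + 1, box):
        ys = profile[start:start + box]
        slope, intercept = linear_fit(xs, ys)
        sq_sum += sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
        count += box
    return math.sqrt(sq_sum / count)

def dfa_exponent(seq, boxes):
    """Slope of log F(n) against log n over the given box sizes."""
    profile = dna_walk(seq)
    log_n = [math.log(n) for n in boxes]
    log_f = [math.log(fluctuation(profile, n)) for n in boxes]
    slope, _ = linear_fit(log_n, log_f)
    return slope

# Hypothetical input: an uncorrelated random sequence, for which the exponent
# is expected to lie near 0.5.
rng = random.Random(1)
random_seq = "".join(rng.choice("ACGT") for _ in range(20000))
alpha = dfa_exponent(random_seq, boxes=[8, 16, 32, 64, 128, 256])
print(round(alpha, 2))
```

An exponent clearly above 0.5 on real sequence data would be the kind of long-range correlation signal the abstract refers to; the random control here only shows the baseline behaviour of the method.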

  18. DNA sequences at a glance.

    Directory of Open Access Journals (Sweden)

    Armando J Pinho

    Data summarization and triage is one of the current top challenges in visual analytics. The goal is to let users visually inspect large data sets and examine or request data with particular characteristics. The need for summarization and visual analytics is also felt when dealing with digital representations of DNA sequences. Genomic data sets are growing rapidly, making their analysis increasingly more difficult, and raising the need for new, scalable tools. For example, being able to look at very large DNA sequences while immediately identifying potentially interesting regions would provide the biologist with a flexible exploratory and analytical tool. In this paper we present a new concept, the "information profile", which provides a quantitative measure of the local complexity of a DNA sequence, independently of the direction of processing. The computation of the information profiles is computationally tractable: we show that it can be done in time proportional to the length of the sequence. We also describe a tool to compute the information profiles of a given DNA sequence, and use the genome of the fission yeast Schizosaccharomyces pombe strain 972 h(-) and five human chromosomes 22 for illustration. We show that information profiles are useful for detecting large-scale genomic regularities by visual inspection. Several discovery strategies are possible, including the standalone analysis of single sequences, the comparative analysis of sequences from individuals from the same species, and the comparative analysis of sequences from different organisms. The comparison scale can be varied, allowing the users to zoom-in on specific details, or obtain a broad overview of a long segment. Software applications have been made available for non-commercial use at http://bioinformatics.ua.pt/software/dna-at-glance.

  19. A Demonstration of Automated DNA Sequencing.

    Science.gov (United States)

    Latourelle, Sandra; Seidel-Rogol, Bonnie

    1998-01-01

    Details a simulation that employs a paper-and-pencil model to demonstrate the principles behind automated DNA sequencing. Discusses the advantages of automated sequencing as well as the chemistry of automated DNA sequencing. (DDR)

  20. The sequence of sequencers: The history of sequencing DNA

    Science.gov (United States)

    Heather, James M.; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401

  1. Entropic fluctuations in DNA sequences

    Science.gov (United States)

    Thanos, Dimitrios; Li, Wentian; Provata, Astero

    2018-03-01

    The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that despite this reduction of information, LSE allows to extract meaningful information related to the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
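A block-wise entropy profile of the kind described in the abstract above might be sketched as follows. The abstract does not give the exact LSE definition, so this sketch assumes the simplest variant: the Shannon entropy of single-base frequencies in consecutive non-overlapping blocks. The function names and the example sequence are hypothetical.

```python
import math
from collections import Counter

def block_entropy(block):
    """Shannon entropy (in bits) of the base frequencies within one block."""
    counts = Counter(block)
    total = len(block)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def local_entropy_profile(seq, block_size):
    """Entropy of each consecutive, non-overlapping block of length block_size."""
    return [block_entropy(seq[i:i + block_size])
            for i in range(0, len(seq) - block_size + 1, block_size)]

# Hypothetical sequence: a mixed block, a homopolymer block, and a dinucleotide repeat.
seq = "ACGT" * 4 + "A" * 16 + "AC" * 8
profile = local_entropy_profile(seq, 16)
print(profile)  # one entropy value per 16-base block
```

Single-base frequencies ignore the order of bases within a block, so a measure that captures base arrangement (for example, entropy over short words rather than single bases) would be needed to reflect the "local base arrangement information" the abstract mentions.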

  2. Perspectives in Biochemistry: Methods for DNA Sequencing.

    Science.gov (United States)

    Wood, Anne T.

    1984-01-01

    Describes two frequently used DNA sequencing methods: Sanger's enzymatic dideoxy method and Maxam and Gilbert's chemical sequencing method. Indicates that studying these methods provides students with knowledge of the chemical structure of DNA and how DNA sequence data are obtained. (JN)

  3. DNA sequencing technologies: 2006-2016.

    Science.gov (United States)

    Mardis, Elaine R

    2017-02-01

    Recent advances in the field of genomics have largely been due to the ability to sequence DNA at increasing throughput and decreasing cost. DNA sequencing was first introduced in 1977, and next-generation sequencing technologies have been available only during the past decade, but the diverse experiments and corresponding analyses facilitated by these techniques have transformed biological and biomedical research. Here, I review developments in DNA sequencing technologies over the past 10 years and look to the future for further applications.

  4. Plant DNA sequencing for phylogenetic analyses: from plants to sequences.

    Science.gov (United States)

    Neves, Susana S; Forrest, Laura L

    2011-01-01

    DNA sequences are important sources of data for phylogenetic analysis. Nowadays, DNA sequencing is a routine technique in molecular biology laboratories. However, there are specific questions associated with project design and sequencing of plant samples for phylogenetic analysis, which may not be familiar to researchers starting in the field. This chapter gives an overview of methods and protocols involved in the sequencing of plant samples, including general recommendations on the selection of species/taxa and DNA regions to be sequenced, and field collection of plant samples. Protocols of plant sample preparation, DNA extraction, PCR and cloning, which are critical to the success of molecular phylogenetic projects, are described in detail. Common problems of sequencing (using the Sanger method) are also addressed. Possible applications of second-generation sequencing techniques in plant phylogenetics are briefly discussed. Finally, orientation on the preparation of sequence data for phylogenetic analyses and submission to public databases is also given.

  5. "First generation" automated DNA sequencing technology.

    Science.gov (United States)

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.

  6. DNA Sequencing in Undergraduate Laboratory Courses.

    Science.gov (United States)

    Hamilton, Robert G.

    1997-01-01

    Discusses strategies to duplicate current research protocols using biochemical methods of analysis. Describes the use of the Silver Sequence kit that provides a technically simple and relatively inexpensive DNA sequencing exercise. (JRH)

  7. Fibonacci Sequence and Supramolecular Structure of DNA.

    Science.gov (United States)

    Shabalkin, I P; Grigor'eva, E Yu; Gudkova, M V; Shabalkin, P I

    2016-05-01

    We proposed a new model of supramolecular DNA structure. Similar to the model of primary DNA structure that we developed previously [11-15], the 3D structure of the DNA molecule is assembled in accordance with a mathematical rule known as the Fibonacci sequence. Unlike primary DNA structure, supramolecular 3D structure is assembled from complex moieties including a regular tetrahedron and a regular octahedron consisting of monomers, elements of the primary DNA structure. The moieties of the supramolecular DNA structure forming fragments of regular spatial lattice are bound via linker (joint) sequences of the DNA chain. The lattice perceives and transmits information signals over a considerable distance without acoustic aberrations. Linker sequences expand conformational space between lattice segments, allowing them to slide relative to each other under the action of external forces. In this case, sliding is provided by stretching of the stacked linker sequences.

  8. Analysis of Subfossil Molecular Remains of Purple Sulfur Bacteria in a Lake Sediment

    Science.gov (United States)

    Coolen, Marco J. L.; Overmann, Jörg

    1998-01-01

    Molecular remains of purple sulfur bacteria (Chromatiaceae) were detected in Holocene sediment layers of a meromictic salt lake (Mahoney Lake, British Columbia, Canada). The carotenoid okenone and bacteriophaeophytin a were present in sediments up to 11,000 years old. Okenone is specific for only a few species of Chromatiaceae, including Amoebobacter purpureus, which presently predominates in the chemocline bacterial community of the lake. With a primer set specific for Chromatiaceae in combination with denaturing gradient gel electrophoresis, 16S rRNA gene sequences of four different Chromatiaceae species were retrieved from different depths of the sediment. One of the sequences, which originated from a 9,100-year-old sample, was 99.2% identical to the 16S rRNA gene sequence of A. purpureus ML1 isolated from the chemocline. Employing primers specific for A. purpureus ML1 and dot blot hybridization of the PCR products, the detection limit for A. purpureus ML1 DNA could be lowered to 0.004% of the total community DNA. With this approach the DNA of the isolate was detected in 7 of 10 sediment layers, indicating that A. purpureus ML1 constituted at least a part of the ancient purple sulfur bacterial community. The concentrations of A. purpureus DNA and okenone in the sediment were not correlated, and the ratio of DNA to okenone was much lower in the subfossil sediment layers (2.7 × 10^-6) than in intact cells (1.4). This indicates that degradation rates are significantly higher for genomic DNA than for hydrocarbon cell constituents, even under anoxic conditions and at the very high sulfide concentrations present in Mahoney Lake. PMID:9797316

  9. Production and use of bovine DNA libraries: DNA-sequencing.

    Science.gov (United States)

    Sallmann, H P; Fuhrmann, H; Huttel, K; Geldermann, H

    1990-03-01

    An important part of using genomic DNA libraries is sequencing the identified clones to obtain detailed information. In this study, methods for DNA sequence analysis were elaborated and employed for the κ-casein gene, which encodes a bovine milk protein. The results encourage further research.

  10. Sequence Affects the Cyclization of DNA Minicircles.

    Science.gov (United States)

    Wang, Qian; Pettitt, B Montgomery

    2016-03-17

    Understanding how the sequence of a DNA molecule affects its dynamic properties is a central problem affecting biochemistry and biotechnology. The process of cyclizing short DNA, as a critical step in molecular cloning, lacks a comprehensive picture of the kinetic process containing sequence information. We have elucidated this process by using coarse-grained simulations, enhanced sampling methods, and recent theoretical advances. We are able to identify the types and positions of structural defects during the looping process at a base-pair level. Correlations along a DNA molecule dictate critical sequence positions that can affect the looping rate. Structural defects change the bending elasticity of the DNA molecule from a harmonic to subharmonic potential with respect to bending angles. We explore the subelastic chain as a possible model in loop formation kinetics. A sequence-dependent model is developed to qualitatively predict the relative loop formation time as a function of DNA sequence.

  11. Molecular dating of caprines using ancient DNA sequences of Myotragus balearicus, an extinct endemic Balearic mammal.

    Science.gov (United States)

    Lalueza-Fox, Carles; Castresana, Jose; Sampietro, Lourdes; Marquès-Bonet, Tomàs; Alcover, Josep Antoni; Bertranpetit, Jaume

    2005-12-06

    Myotragus balearicus was an endemic bovid from the Balearic Islands (Western Mediterranean) that became extinct around 6,000-4,000 years ago. The Myotragus evolutionary lineage became isolated in the islands most probably at the end of the Messinian crisis, when the desiccation of the Mediterranean ended, in a geological date established at 5.35 Mya. Thus, the sequences of Myotragus could be very valuable for calibrating the mammalian mitochondrial DNA clock and, in particular, the tree of the Caprinae subfamily, to which Myotragus belongs. We have retrieved the complete mitochondrial cytochrome b gene (1,143 base pairs), plus fragments of the mitochondrial 12S gene and the nuclear 28S rDNA multi-copy gene from a well preserved Myotragus subfossil bone. The best resolved phylogenetic trees, obtained with the cytochrome b gene, placed Myotragus in a position basal to the Ovis group. Using the calibration provided by the isolation of Balearic Islands, we calculated that the initial radiation of caprines can be dated at 6.2 +/- 0.4 Mya. In addition, alpine and southern chamois, considered until recently the same species, split around 1.6 +/- 0.3 Mya, indicating that the two chamois species have been separated much longer than previously thought. Since there are almost no extant endemic mammals in Mediterranean islands, the sequence of the extinct Balearic endemic Myotragus has been crucial for allowing us to use the Messinian crisis calibration point for dating the caprines phylogenetic tree.

  12. Dynamics and Control of DNA Sequence Amplification

    CERN Document Server

    Marimuthu, Karthikeyan

    2014-01-01

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction (PCR) as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal tempe...

  13. EGNAS: an exhaustive DNA sequence design algorithm

    Directory of Open Access Journals (Sweden)

    Kick Alfred

    2012-06-01

    Background: Molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA) is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that generates sets containing a maximum number of sequences with defined properties. EGNAS (Exhaustive Generation of Nucleic Acid Sequences) offers the possibility of controlling both interstrand and intrastrand properties. The guanine-cytosine content can be adjusted. Sequences can be forced to start and end with guanine or cytosine. This option reduces the risk of “fraying” of DNA strands. It is possible to limit cross hybridizations of a defined length, and to adjust the uniqueness of sequences. Self-complementarity and hairpin structures of certain length can be avoided. Sequences and subsequences can optionally be forbidden. Furthermore, sequences can be designed to have minimum interactions with predefined strands and neighboring sequences. Results: The algorithm is realized in a C++ program. TAG sequences can be generated and combined with primers for single-base extension reactions, which were described for multiplexed genotyping of single nucleotide polymorphisms. Thereby, possible foldback through intrastrand interaction of TAG-primer pairs can be limited. The design of sequences for specific attachment of molecular constructs to DNA origami is presented. Conclusions: We developed a new software tool called EGNAS for the design of unique nucleic acid sequences. The presented exhaustive algorithm generates larger sets of sequences than previous software under equal constraints. EGNAS is freely available for noncommercial use at http://www.chm.tu-dresden.de/pc6/EGNAS.
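
    The kinds of constraints listed in this abstract (GC content, G/C at both ends, no self-complementary stretches) can be illustrated with a small candidate filter. This is only a sketch under stated assumptions, not the EGNAS algorithm: the function names, the default thresholds and the window-based self-complementarity test are all hypothetical.

```python
# Hypothetical candidate filter illustrating the constraint types named in
# the abstract. It is NOT the EGNAS implementation.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def has_gc_ends(seq: str) -> bool:
    """True if the sequence starts and ends with G or C."""
    return seq[0] in "GC" and seq[-1] in "GC"


def has_self_complementary_stretch(seq: str, length: int) -> bool:
    """True if some window of `length` bases also occurs reverse-complemented."""
    windows = {seq[i:i + length] for i in range(len(seq) - length + 1)}
    return any(reverse_complement(w) in windows for w in windows)


def passes_filters(seq: str, gc_min: float = 0.4, gc_max: float = 0.6,
                   stem: int = 4) -> bool:
    """Apply all three illustrative constraints (thresholds are assumptions)."""
    return (gc_min <= gc_content(seq) <= gc_max
            and has_gc_ends(seq)
            and not has_self_complementary_stretch(seq, stem))
```

    An exhaustive designer would enumerate candidates and also compare them pairwise; the filter above only covers the per-sequence checks.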

  14. Inconsistencies in Neanderthal genomic DNA sequences.

    Directory of Open Access Journals (Sweden)

    Jeffrey D Wall

    2007-10-01

    Two recently published papers describe nuclear DNA sequences that were obtained from the same Neanderthal fossil. Our reanalyses of the data from these studies show that they are not consistent with each other and point to serious problems with the data quality in one of the studies, possibly due to modern human DNA contaminants and/or a high rate of sequencing errors.

  15. Mitochondrial DNA sequence evolution in shorebird populations

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons

  16. On site DNA barcoding by nanopore sequencing.

    Directory of Open Access Journals (Sweden)

    Michele Menegon

    Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet's biological heritage. The use of genetic markers, i.e. DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in a simulated tropical environment, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time, on-site DNA sequencing, thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities.

  17. DNA sequencing using fluorescence background electroblotting membrane

    Science.gov (United States)

    Caldwell, K.D.; Chu, T.J.; Pitt, W.G.

    1992-05-12

    A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferred, and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through amino groups contained on the surface. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to the target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times.

  18. Nanogrid rolling circle DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Church, George M.; Porreca, Gregory J.; Shendure, Jay; Rosenbaum, Abraham Meir

    2017-04-18

    The present invention relates to methods for sequencing a polynucleotide immobilized on an array having a plurality of specific regions each having a defined diameter size, including synthesizing a concatemer of a polynucleotide by rolling circle amplification, wherein the concatemer has a cross-sectional diameter greater than the diameter of a specific region, immobilizing the concatemer to the specific region to make an immobilized concatemer, and sequencing the immobilized concatemer.

  19. DNA content and distribution in ancient feathers and potential to reconstruct the plumage of extinct avian taxa.

    Science.gov (United States)

    Rawlence, Nicolas J; Wood, Jamie R; Armstrong, Kyle N; Cooper, Alan

    2009-10-07

    Feathers are known to contain amplifiable DNA at their base (calamus) and have provided an important genetic source from museum specimens. However, feathers in subfossil deposits generally only preserve the upper shaft and feather 'vane' which are thought to be unsuitable for DNA analysis. We analyse subfossil moa feathers from Holocene New Zealand rockshelter sites and demonstrate that both ancient DNA and plumage information can be recovered from their upper portion, allowing species identification and a means to reconstruct the appearance of extinct taxa. These ancient DNA sequences indicate that the distal portions of feathers are an untapped resource for studies of museum, palaeontological and modern specimens. We investigate the potential to reconstruct the plumage of pre-historically extinct avian taxa using subfossil remains, rather than assuming morphological uniformity with closely related extant taxa. To test the notion of colour persistence in subfossil feathers, we perform digital comparisons of feathers of the red-crowned parakeet (Cyanoramphus novaezelandiae novaezelandiae) excavated from the same horizons as the moa feathers, with modern samples. The results suggest that the coloration of the moa feathers is authentic, and computer software is used to perform plumage reconstructions of moa based on subfossil remains.

  20. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  1. Osmylated DNA, a novel concept for sequencing DNA using nanopores.

    Science.gov (United States)

    Kanavarioti, Anastassia

    2015-03-27

    Sanger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically comparable to each other which leads to small differences in current obstruction. 'Base calling' becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable, if only the bases were two, and chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5-C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base, creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1), and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV-vis assay to measure extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complementary of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA.
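
    The logic by which four binary reads could define the base sequence can be sketched in a few lines. This is an idealised illustration, not the authors' procedure: it assumes perfectly selective labelling, error-free reads, and complement-strand reads already aligned position by position to the target. The function names are hypothetical.

```python
# Idealised sketch: recover a base sequence from four binary reads.
# Assumptions (not from the source): perfect selectivity, no read errors,
# complement reads given in the same positional order as the target.

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def label(strand: str, reactive: str) -> str:
    """Binary read of a strand: '1' where the base is osmylated, else '0'."""
    return "".join("1" if base in reactive else "0" for base in strand)


def four_reads(target: str) -> tuple:
    """Simulate Protocol A (T only) and Protocol B (T and C) on both strands."""
    partner = "".join(PAIR[base] for base in target)  # aligned complement
    return (label(target, "T"), label(target, "TC"),
            label(partner, "T"), label(partner, "TC"))


def decode(a_target: str, b_target: str, a_partner: str, b_partner: str) -> str:
    """Combine the four binary reads into one base call per position."""
    calls = []
    for at, bt, ap, bp in zip(a_target, b_target, a_partner, b_partner):
        if at == "1":
            calls.append("T")   # labelled under mild conditions on the target
        elif bt == "1":
            calls.append("C")   # labelled only under harsh conditions
        elif ap == "1":
            calls.append("A")   # partner base is T, so target base is A
        elif bp == "1":
            calls.append("G")   # partner base is C, so target base is G
        else:
            calls.append("N")   # no label anywhere: undetermined
    return "".join(calls)
```

    For example, `decode(*four_reads("GATTACA"))` reconstructs the input string under these assumptions.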

  2. Visual DNA -- identification of DNA sequence variations by bead trapping.

    Science.gov (United States)

    Ståhl, Patrik L; Gantelius, Jesper; Natanaelsson, Christian; Ahmadian, Afshin; Andersson-Svahn, Helene; Lundeberg, Joakim

    2007-12-01

    In this paper we describe a method that uses the nearly covalent strength biotin-streptavidin interaction to attach a paramagnetic bead of micrometer size to a DNA molecule of nanometer size, scaling up the spatial size of a query DNA strand by a factor of 1000, making it visible to the human eye. The use of magnetic principles enables rapid binding and washing of detector beads, facilitating a readout of amplified DNA sequences in a few minutes. Here we exemplify the method on mitochondrial DNA variations using an array platform. Visual identification and documentation can be performed with an ordinary mobile phone equipped with a built-in camera.

  3. Compressing DNA sequence databases with coil

    Directory of Open Access Journals (Sweden)

    Hendy Michael D

    2008-05-01

    Background: Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression, an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results: We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression; the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion: coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.

  4. A subfossil half-mandible of a Grey Seal

    NARCIS (Netherlands)

    Bree, van P.J.H.; Bosscha Erdbrink, D.P.

    1995-01-01

    The fortuitous discovery, in the collections of the National Museum of Natural History at Leiden, of a probably subfossil right half-mandible of a Grey Seal is reported. A short description of the piece is given and it is compared with some other recent, subfossil and fossil material.

  5. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    Science.gov (United States)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shifts depending on the pH, with most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  6. Understanding human DNA sequence variation.

    Science.gov (United States)

    Kidd, K K; Pakstis, A J; Speed, W C; Kidd, J R

    2004-01-01

    Over the past century researchers have identified normal genetic variation and studied that variation in diverse human populations to determine the amounts and distributions of that variation. That information is being used to develop an understanding of the demographic histories of the different populations and the species as a whole, among other studies. With the advent of DNA-based markers in the last quarter century, these studies have accelerated. One of the challenges for the next century is to understand that variation. One component of that understanding will be population genetics. We present here examples of many of the ways these new data can be analyzed from a population perspective using results from our laboratory on multiple individual DNA-based polymorphisms, many clustered in haplotypes, studied in multiple populations representing all major geographic regions of the world. These data support an "out of Africa" hypothesis for human dispersal around the world and begin to refine the understanding of population structures and genetic relationships. We are also developing baseline information against which we can compare findings at different loci to aid in the identification of loci subject, now and in the past, to selection (directional or balancing). We do not yet have a comprehensive understanding of the extensive variation in the human genome, but some of that understanding is coming from population genetics.

  7. DNA Sequencing in Cultural Heritage.

    Science.gov (United States)

    Vai, Stefania; Lari, Martina; Caramelli, David

    2016-02-01

    During the last three decades, DNA analysis on degraded samples revealed itself as an important research tool in anthropology, archaeozoology, molecular evolution, and population genetics. Application on topics such as determination of species origin of prehistoric and historic objects, individual identification of famous personalities, characterization of particular samples important for historical, archeological, or evolutionary reconstructions, confers to the paleogenetics an important role also for the enhancement of cultural heritage. A really fast improvement in methodologies in recent years led to a revolution that permitted recovering even complete genomes from highly degraded samples with the possibility to go back in time 400,000 years for samples from temperate regions and 700,000 years for permafrozen remains and to analyze even more recent material that has been subjected to hard biochemical treatments. Here we propose a review on the different methodological approaches used so far for the molecular analysis of degraded samples and their application on some case studies.

  8. Chromosome number 9 specific repetitive DNA sequence

    Energy Technology Data Exchange (ETDEWEB)

    Joste, N.E.; Cram, L.S.; Hildebrand, C.E.; Jones, M.; Longmire, J.; Robinson, T.; Moyzis, R.K.

    1986-05-01

    Human repetitive DNA libraries have been constructed and various recombinant DNA clones isolated that are likely candidates for chromosome specific sequences. The first clone tested (pHuR 98; plasmid human repeat 98) was biotinylated and hybridized to human chromosomes in situ. The hybridized recombinant probe was detected with fluoresceinated avidin, and chromosomes were counter-stained with either propidium iodide or distamycin-DAPI. Specific hybridization to chromosome band 9q1 was obtained. The localization was confirmed by hybridizing radiolabeled pHuR 98 DNA to human chromosomes sorted by flow cytometry. Various methods, including orthogonal field pulsed gel electrophoresis analysis indicate that 75 kilobase blocks of this sequence are interspersed with other repetitive DNA sequences in this chromosome band. This study is the first to report a human repetitive DNA sequence uniquely localized to a specific chromosome. This clone provides an easily detected and highly specific chromosomal marker for molecular cytogenetic analyses in numerous basic research and clinical studies.

  9. Statistical and linguistic features of DNA sequences

    Science.gov (United States)

    Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
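
    The entropy-based notion of redundancy mentioned in this abstract can be illustrated with a short calculation. This is a minimal sketch, not the authors' procedure: it assumes overlapping n-tuples ("words") and defines redundancy as one minus the ratio of the observed entropy to the maximum of 2n bits for a four-letter alphabet. Both function names are hypothetical.

```python
# Sketch of an n-tuple Shannon entropy and a derived redundancy measure.
# Assumptions (not from the source): overlapping windows; redundancy is
# defined as 1 - H / (2 * n), where 2 * n bits is the maximum for A/C/G/T.

from collections import Counter
from math import log2


def ntuple_entropy(seq: str, n: int) -> float:
    """Shannon entropy in bits of the overlapping n-tuple distribution."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def redundancy(seq: str, n: int) -> float:
    """0.0 for a uniform n-tuple distribution, 1.0 for a single repeated tuple."""
    return 1 - ntuple_entropy(seq, n) / (2 * n)
```

    On real data the estimate is biased downward for short sequences and large n, because most of the 4^n possible tuples are never observed.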

  10. A Bioluminometric Method of DNA Sequencing

    Science.gov (United States)

    Ronaghi, Mostafa; Pourmand, Nader; Stolc, Viktor; Arnold, Jim (Technical Monitor)

    2001-01-01

    Pyrosequencing is a bioluminometric single-tube DNA sequencing method that takes advantage of co-operativity between four enzymes to monitor DNA synthesis. In this sequencing-by-synthesis method, a cascade of enzymatic reactions yields detectable light, which is proportional to incorporated nucleotides. Pyrosequencing has the advantages of accuracy, flexibility and parallel processing. It can be easily automated. Furthermore, the technique dispenses with the need for labeled primers, labeled nucleotides and gel-electrophoresis. In this chapter, the use of this technique for different applications is discussed.

  11. The DNA sequence specificity of bleomycin cleavage in a systematically altered DNA sequence.

    Science.gov (United States)

    Gautam, Shweta D; Chen, Jon K; Murray, Vincent

    2017-08-01

    Bleomycin is an anti-tumour agent that is clinically used to treat several types of cancers. Bleomycin cleaves DNA at specific DNA sequences and recent genome-wide DNA sequencing specificity data indicated that the sequence 5'-RTGT*AY (where T* is the site of bleomycin cleavage, R is G/A and Y is T/C) is preferentially cleaved by bleomycin in human cells. Based on this DNA sequence, we constructed a plasmid clone to explore this bleomycin cleavage preference. By systematic variation of single nucleotides in the 5'-RTGT*AY sequence, we were able to investigate the effect of nucleotide changes on bleomycin cleavage efficiency. We observed that the preferred consensus DNA sequence for bleomycin cleavage in the plasmid clone was 5'-YYGT*AW (where W is A/T). The most highly cleaved sequence was 5'-TCGT*AT and, in fact, the seven most highly cleaved sequences conformed to the consensus sequence 5'-YYGT*AW. A comparison with genome-wide results was also performed and while the core sequence was similar in both environments, the surrounding nucleotides were different.
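
    Consensus motifs such as 5'-RTGT*AY and 5'-YYGT*AW use IUPAC ambiguity codes, and can be located in a sequence by translating them into regular expressions. The sketch below is illustrative only: it scans the given strand alone, omits the `*` cleavage marker from the motif string, and covers just the ambiguity codes defined in the abstract. The function names are hypothetical.

```python
# Sketch: locate IUPAC consensus motifs by translating them to a regex.
# Only the codes defined in the abstract are included (R = G/A, Y = T/C,
# W = A/T). The cleaved T sits at offset 3 within each 6-base match.

import re

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",  # purine
    "Y": "[CT]",  # pyrimidine
    "W": "[AT]",  # weak pair
}


def motif_to_pattern(motif: str) -> str:
    """Translate an IUPAC motif such as 'RTGTAY' into a regex fragment."""
    return "".join(IUPAC[code] for code in motif)


def find_sites(seq: str, motif: str) -> list:
    """Return 0-based start positions of all matches, overlaps included."""
    # A lookahead makes each match zero-width, so overlapping sites are found.
    pattern = re.compile("(?=" + motif_to_pattern(motif) + ")")
    return [m.start() for m in pattern.finditer(seq)]
```

    For instance, the most highly cleaved sequence reported above, TCGTAT, matches YYGTAW but not RTGTAY, since its first base is a pyrimidine.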

  12. Footprinting with an automated capillary DNA sequencer.

    Science.gov (United States)

    Yindeeyoungyeon, W; Schell, M A

    2000-11-01

    Footprinting is a valuable tool for studying DNA-protein contacts. However, it usually involves expensive, tedious and hazardous steps such as radioactive labeling and analyses on polyacrylamide sequencing gels. We have developed an easy four-step footprinting method involving (i) the generation and purification of a PCR fragment that is fluorescently labeled at one end with 6-carboxyfluorescein; (ii) brief exposure of the fragment to a DNA-binding protein and then DNase I; (iii) spin-column purification; and (iv) analysis of partial digestion products on the ABI Prism 310 capillary DNA sequencer/genetic analyzer. Very detailed and sensitive footprints of large (> 400 bp) DNA fragments can be easily obtained, as illustrated by our use of this method to characterize binding of PhcA, a LysR-type activator, to two sites greater than 100 bp apart in the 5' untranslated region of xpsR, one of its regulated target genes. The advantages of this new method are that it (i) uses long-lived, safe and easy-to-make fluorescently labeled target fragments; (ii) uses sensitive, robust and highly reproducible fragment analysis using an automated DNA sequencer, instead of gel electrophoresis and autoradiography; and (iii) is cost effective.

  13. Vander Lugt correlation of DNA sequence data

    Science.gov (United States)

    Christens-Barry, William A.; Hawk, James F.; Martin, James C.

    1990-12-01

    DNA, the molecule containing the genetic code of an organism, is a linear chain of subunits. It is the sequence of subunits, of which there are four kinds, that constitutes the unique blueprint of an individual. This sequence is the focus of a large number of analyses performed by an army of geneticists, biologists, and computer scientists. Most of these analyses entail searches for specific subsequences within the larger set of sequence data. Thus, most analyses are essentially pattern recognition or correlation tasks. Yet, there are special features to such analysis that influence the strategy and methods of an optical pattern recognition approach. While the serial processing employed in digital electronic computers remains the main engine of sequence analyses, there is no fundamental reason that more efficient parallel methods cannot be used. We describe an approach using optical pattern recognition (OPR) techniques based on matched spatial filtering. This allows parallel comparison of large blocks of sequence data. In this study we have simulated a Vander Lugt architecture implementing our approach. Searches for specific target sequence strings within a block of DNA sequence from the ColE1 plasmid are performed.

  14. Molecular dating of caprines using ancient DNA sequences of Myotragus balearicus, an extinct endemic Balearic mammal

    Directory of Open Access Journals (Sweden)

    Alcover Josep Antoni

    2005-12-01

    Background: Myotragus balearicus was an endemic bovid from the Balearic Islands (Western Mediterranean) that became extinct around 6,000-4,000 years ago. The Myotragus evolutionary lineage became isolated in the islands most probably at the end of the Messinian crisis, when the desiccation of the Mediterranean ended, in a geological date established at 5.35 Mya. Thus, the sequences of Myotragus could be very valuable for calibrating the mammalian mitochondrial DNA clock and, in particular, the tree of the Caprinae subfamily, to which Myotragus belongs. Results: We have retrieved the complete mitochondrial cytochrome b gene (1,143 base pairs), plus fragments of the mitochondrial 12S gene and the nuclear 28S rDNA multi-copy gene from a well preserved Myotragus subfossil bone. The best resolved phylogenetic trees, obtained with the cytochrome b gene, placed Myotragus in a position basal to the Ovis group. Using the calibration provided by the isolation of Balearic Islands, we calculated that the initial radiation of caprines can be dated at 6.2 ± 0.4 Mya. In addition, alpine and southern chamois, considered until recently the same species, split around 1.6 ± 0.3 Mya, indicating that the two chamois species have been separated much longer than previously thought. Conclusion: Since there are almost no extant endemic mammals in Mediterranean islands, the sequence of the extinct Balearic endemic Myotragus has been crucial for allowing us to use the Messinian crisis calibration point for dating the caprines phylogenetic tree.

  15. Sequence-Dependent Persistence Lengths of DNA.

    Science.gov (United States)

    Mitchell, Jonathan S; Glowacki, Jaroslaw; Grandchamp, Alexandre E; Manning, Robert S; Maddocks, John H

    2017-04-11

    A Monte Carlo code applied to the cgDNA coarse-grain rigid-base model of B-form double-stranded DNA is used to predict a sequence-averaged persistence length of l_F = 53.5 nm in the sense of Flory, and of l_p = 160 bp or 53.5 nm in the sense of apparent tangent-tangent correlation decay. These estimates are slightly higher than the consensus experimental values of 150 bp or 50 nm, but we believe the agreement to be good given that the cgDNA model is itself parametrized from molecular dynamics simulations of short fragments of length 10-20 bp, with no explicit fit to persistence length. Our Monte Carlo simulations further predict that there can be substantial dependence of persistence lengths on the specific sequence [Formula: see text] of a fragment. We propose, and confirm the numerical accuracy of, a simple factorization that separates the part of the apparent tangent-tangent correlation decay [Formula: see text] attributable to intrinsic shape, from a part [Formula: see text] attributable purely to stiffness, i.e., a sequence-dependent version of what has been called sequence-averaged dynamic persistence length l̅_d (= 58.8 nm within the cgDNA model). For ensembles of both random and λ-phage fragments, the apparent persistence length [Formula: see text] has a standard deviation of 4 nm over sequence, whereas our dynamic persistence length [Formula: see text] has a standard deviation of only 1 nm. However, there are notable dynamic persistence length outliers, including poly(A) (exceptionally straight and stiff), poly(TA) (tightly coiled and exceptionally soft), and phased A-tract sequence motifs (exceptionally bent and stiff). The results of our numerical simulations agree reasonably well with both molecular dynamics simulation and diverse experimental data including minicircle cyclization rates and stereo cryo-electron microscopy images.

  16. Dog Y chromosomal DNA sequence: identification, sequencing and SNP discovery

    Directory of Open Access Journals (Sweden)

    Kirkness Ewen

    2006-10-01

    Background: Population genetic studies of dogs have so far mainly been based on analysis of mitochondrial DNA, describing only the history of female dogs. To get a picture of the male history, as well as a second independent marker, there is a need for studies of biallelic Y-chromosome polymorphisms. However, there are no biallelic polymorphisms reported, and only 3200 bp of non-repetitive dog Y-chromosome sequence deposited in GenBank, necessitating the identification of dog Y chromosome sequence and the search for polymorphisms therein. The genome has been only partially sequenced for one male dog, disallowing mapping of the sequence into specific chromosomes. However, by comparing the male genome sequence to the complete female dog genome sequence, candidate Y-chromosome sequence may be identified by exclusion. Results: The male dog genome sequence was analysed by BLAST search against the human genome to identify sequences with a best match to the human Y chromosome and to the female dog genome to identify those absent in the female genome. Candidate sequences were then tested for male specificity by PCR of five male and five female dogs. 32 sequences from the male genome, with a total length of 24 kbp, were identified as male specific, based on a match to the human Y chromosome, absence in the female dog genome and male specific PCR results. 14437 bp were then sequenced for 10 male dogs originating from Europe, Southwest Asia, Siberia, East Asia, Africa and America. Nine haplotypes were found, which were defined by 14 substitutions. The genetic distance between the haplotypes indicates that they originate from at least five wolf haplotypes. There was no obvious trend in the geographic distribution of the haplotypes. Conclusion: We have identified 24159 bp of dog Y-chromosome sequence to be used for population genetic studies. We sequenced 14437 bp in a worldwide collection of dogs, identifying 14 SNPs for future SNP analyses, and

  17. Special Issue: Next Generation DNA Sequencing

    Directory of Open Access Journals (Sweden)

    Paul Richardson

    2010-10-01

    Next Generation Sequencing (NGS) refers to technologies that do not rely on traditional dideoxy-nucleotide (Sanger) sequencing where labeled DNA fragments are physically resolved by electrophoresis. These new technologies rely on different strategies, but essentially all of them make use of real-time data collection of a base level incorporation event across a massive number of reactions (on the order of millions versus 96 for capillary electrophoresis, for instance). The major commercial NGS platforms available to researchers are the 454 Genome Sequencer (Roche), the Illumina (formerly Solexa) Genome Analyzer, the SOLiD system (Applied Biosystems/Life Technologies) and the Heliscope (Helicos Corporation). The techniques and different strategies utilized by these platforms are reviewed in a number of the papers in this special issue. These technologies are enabling new applications that take advantage of the massive data produced by this next generation of sequencing instruments. [...

  18. DNA sequencing versus standard prenatal aneuploidy screening.

    Science.gov (United States)

    Bianchi, Diana W; Parker, R Lamar; Wentworth, Jeffrey; Madankumar, Rajeevi; Saffer, Craig; Das, Anita F; Craig, Joseph A; Chudova, Darya I; Devers, Patricia L; Jones, Keith W; Oliver, Kelly; Rava, Richard P; Sehnert, Amy J

    2014-02-27

    In high-risk pregnant women, noninvasive prenatal testing with the use of massively parallel sequencing of maternal plasma cell-free DNA (cfDNA testing) accurately detects fetal autosomal aneuploidy. Its performance in low-risk women is unclear. At 21 centers in the United States, we collected blood samples from women with singleton pregnancies who were undergoing standard aneuploidy screening (serum biochemical assays with or without nuchal translucency measurement). We performed massively parallel sequencing in a blinded fashion to determine the chromosome dosage for each sample. The primary end point was a comparison of the false positive rates of detection of fetal trisomies 21 and 18 with the use of standard screening and cfDNA testing. Birth outcomes or karyotypes were the reference standard. The primary series included 1914 women (mean age, 29.6 years) with an eligible sample, a singleton fetus without aneuploidy, results from cfDNA testing, and a risk classification based on standard screening. For trisomies 21 and 18, the false positive rates with cfDNA testing were significantly lower than those with standard screening (0.3% vs. 3.6% for trisomy 21, P ... aneuploidy (5 for trisomy 21, 2 for trisomy 18, and 1 for trisomy 13; negative predictive value, 100% [95% confidence interval, 99.8 to 100]). The positive predictive values for cfDNA testing versus standard screening were 45.5% versus 4.2% for trisomy 21 and 40.0% versus 8.3% for trisomy 18. In a general obstetrical population, prenatal testing with the use of cfDNA had significantly lower false positive rates and higher positive predictive values for detection of trisomies 21 and 18 than standard screening. (Funded by Illumina; ClinicalTrials.gov number, NCT01663350.)

  19. Genomic signal processing for DNA sequence clustering.

    Science.gov (United States)

    Mendizabal-Ruiz, Gerardo; Román-Godínez, Israel; Torres-Ramos, Sulema; Salido-Ruiz, Ricardo A; Vélez-Pérez, Hugo; Morales, J Alejandro

    2018-01-01

    Genomic signal processing (GSP) methods which convert DNA data to numerical values have recently been proposed, which would offer the opportunity of employing existing digital signal processing methods for genomic data. One of the most used methods for exploring data is cluster analysis which refers to the unsupervised classification of patterns in data. In this paper, we propose a novel approach for performing cluster analysis of DNA sequences that is based on the use of GSP methods and the K-means algorithm. We also propose a visualization method that facilitates the easy inspection and analysis of the results and possible hidden behaviors. Our results support the feasibility of employing the proposed method to find and easily visualize interesting features of sets of DNA data.
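    The abstract does not say which numerical mapping or signal features the authors used, so the following is only a minimal sketch of the general idea. It assumes a 0/1 indicator mapping per base, takes the power of each indicator signal at period 3 as a hypothetical four-number feature vector, and groups the vectors with a small hand-written K-means. The function names `indicator_power` and `kmeans` are illustrative and not taken from the paper.

```python
import cmath
import math

BASES = "ACGT"


def indicator_power(seq, period=3):
    """Return, for each base, the normalised power of its 0/1 indicator
    signal at the given period (a single DFT bin per base)."""
    n = len(seq)
    feats = []
    for base in BASES:
        # Sum the complex exponential over the positions where this base occurs.
        coeff = sum(
            cmath.exp(-2j * math.pi * i / period)
            for i, s in enumerate(seq)
            if s == base
        )
        feats.append(abs(coeff) ** 2 / n ** 2)
    return feats


def kmeans(points, k, iters=20):
    """Plain K-means; the first k points seed the centroids (an assumption
    made here only to keep the sketch deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy input: two period-3 sequences and two period-2 sequences.
sequences = ["ACG" * 10, "AC" * 15, "ACG" * 9 + "ACT", "CA" * 15]
labels = kmeans([indicator_power(s) for s in sequences], k=2)
```

    On this toy input the period-3 sequences and the period-2 sequences should fall into separate clusters. Real data would need a richer feature set and a less naive initialisation.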

  20. Google matrix analysis of DNA sequences.

    Science.gov (United States)

    Kandiah, Vivek; Shepelyansky, Dima L

    2013-01-01

    For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.
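    To make the construction concrete, here is a minimal sketch of a Google matrix built from word-to-word transitions, followed by PageRank computed by power iteration. Several choices are assumptions made for illustration and are not stated in the abstract: words are read as consecutive non-overlapping blocks of `m` letters, the damping factor is 0.85, and columns with no observed transitions are replaced by a uniform column.

```python
from itertools import product


def google_matrix(seq, m=2, alpha=0.85):
    """Build a column-stochastic Google matrix G[j][i] for transitions
    from word i to the following word j."""
    words = ["".join(w) for w in product("ACGT", repeat=m)]
    index = {w: i for i, w in enumerate(words)}
    n = len(words)
    counts = [[0.0] * n for _ in range(n)]
    # Assumed reading scheme: consecutive, non-overlapping words of m letters.
    tokens = [seq[i:i + m] for i in range(0, len(seq) - m + 1, m)]
    for a, b in zip(tokens, tokens[1:]):
        if a in index and b in index:
            counts[index[b]][index[a]] += 1
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        col_sum = sum(counts[j][i] for j in range(n))
        for j in range(n):
            # Dangling columns (no outgoing transitions) become uniform.
            s = counts[j][i] / col_sum if col_sum else 1.0 / n
            G[j][i] = alpha * s + (1 - alpha) / n
    return words, G


def pagerank(G, iters=100):
    """Approximate the leading eigenvector of G by power iteration."""
    n = len(G)
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [sum(G[j][i] * p[i] for i in range(n)) for j in range(n)]
    return p


words, G = google_matrix("ACGTACGTTTGACA" * 5, m=2)
ranks = pagerank(G)
top_word = words[max(range(len(ranks)), key=ranks.__getitem__)]
```

    Because every column of `G` sums to one, the PageRank vector stays normalised at each iteration. For longer words the matrix grows as 4^m by 4^m, so a sparse representation would be needed in practice.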

  1. Google matrix analysis of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Vivek Kandiah

    For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.

  2. What Advances Are Being Made in DNA Sequencing?

    Science.gov (United States)

    ... diagnosis in the future. For more information about DNA sequencing technologies and their use: Genetics Home Reference discusses ... illustration of the decline in the cost of DNA sequencing , including that caused by the introduction of new ...

  3. Next-generation sequencing offers new insights into DNA degradation

    DEFF Research Database (Denmark)

    Overballe-Petersen, Søren; Orlando, Ludovic Antoine Alexandre; Willerslev, Eske

    2012-01-01

    The processes underlying DNA degradation are central to various disciplines, including cancer research, forensics and archaeology. The sequencing of ancient DNA molecules on next-generation sequencing platforms provides direct measurements of cytosine deamination, depurination and fragmentation...

  4. Dog Y chromosomal DNA sequence: identification, sequencing and SNP discovery

    OpenAIRE

    Natanaelsson, Christian; Oskarsson, Mattias CR; Angleby, Helen; Lundeberg, Joakim; Kirkness, Ewen; Savolainen, Peter

    2006-01-01

    Abstract Background Population genetic studies of dogs have so far mainly been based on analysis of mitochondrial DNA, describing only the history of female dogs. To get a picture of the male history, as well as a second independent marker, there is a need for studies of biallelic Y-chromosome polymorphisms. However, there are no biallelic polymorphisms reported, and only 3200 bp of non-repetitive dog Y-chromosome sequence deposited in GenBank, necessitating the identification of dog Y chromo...

  5. cDNA sequence quality data - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Budding yeast cDNA sequencing project: cDNA sequence quality data. DOI: 10.18908/lsdba.nbdc00838-003. Description of data contents: Phred's quality score. ...
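    For readers unfamiliar with Phred quality scores, the standard definition is Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The short sketch below converts in both directions and sums per-base error probabilities over a read. The function names are illustrative and the example scores are made up; nothing here is taken from the archived data set.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to the base-call error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert a base-call error probability to a Phred quality score."""
    return -10 * math.log10(p)


def expected_errors(scores):
    """Expected number of miscalled bases in a read with these scores."""
    return sum(phred_to_error_prob(q) for q in scores)


# Hypothetical per-base scores for a short read.
example_scores = [30, 30, 20, 10]
example_expected = expected_errors(example_scores)
```

    A score of 20 therefore corresponds to a 1 in 100 chance of error, and a score of 30 to 1 in 1000.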

  6. Predicting DNA hybridization kinetics from sequence

    Science.gov (United States)

    Zhang, Jinny X.; Fang, John Z.; Duan, Wei; Wu, Lucia R.; Zhang, Angela W.; Dalchau, Neil; Yordanov, Boyan; Petersen, Rasmus; Phillips, Andrew; Zhang, David Yu

    2018-01-01

    Hybridization is a key molecular process in biology and biotechnology, but so far there is no predictive model for accurately determining hybridization rate constants based on sequence information. Here, we report a weighted neighbour voting (WNV) prediction algorithm, in which the hybridization rate constant of an unknown sequence is predicted based on similarity reactions with known rate constants. To construct this algorithm we first performed 210 fluorescence kinetics experiments to observe the hybridization kinetics of 100 different DNA target and probe pairs (36 nt sub-sequences of the CYCS and VEGF genes) at temperatures ranging from 28 to 55 °C. Automated feature selection and weighting optimization resulted in a final six-feature WNV model, which can predict hybridization rate constants of new sequences to within a factor of 3 with ∼91% accuracy, based on leave-one-out cross-validation. Accurate prediction of hybridization kinetics allows the design of efficient probe sequences for genomics research.
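    The abstract does not list the six features or the weighting scheme of the final model, so the following is only a sketch of the general weighted-neighbour idea. It assumes two stand-in features (GC fraction and relative length of the longest single-base run), an exponential distance kernel with a hypothetical bandwidth, and invented training values for log10 of the rate constant. None of these come from the paper.

```python
import math


def features(seq):
    """Two illustrative features: GC fraction and longest homopolymer run
    relative to sequence length (both are stand-ins, not the paper's)."""
    gc = sum(seq.count(b) for b in "GC") / len(seq)
    longest = run = 1
    for a, b in zip(seq, seq[1:]):
        run = run + 1 if a == b else 1
        longest = max(longest, run)
    return (gc, longest / len(seq))


def predict_log_k(query, known, feature_weights=(1.0, 1.0), bandwidth=0.1):
    """Predict log10(rate constant) as a similarity-weighted average of the
    known values; closer neighbours in feature space get larger votes."""
    fq = features(query)
    num = den = 0.0
    for seq, log_k in known:
        fs = features(seq)
        dist = sum(w * abs(a - b) for w, a, b in zip(feature_weights, fq, fs))
        vote = math.exp(-dist / bandwidth)
        num += vote * log_k
        den += vote
    return num / den


# Invented training pairs: (sequence, log10 of hybridization rate constant).
known = [("ACGTACGTAC", 6.5), ("GGGGGCCCCC", 5.0), ("ATATATATAT", 6.0)]
prediction = predict_log_k("ACGTACGTAC", known)
```

    In a real model the feature weights and bandwidth would be tuned, for example by the leave-one-out cross-validation mentioned in the abstract.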

  7. Nucleosome DNA sequence structure of isochores

    Directory of Open Access Journals (Sweden)

    Trifonov Edward N

    2011-04-01

    Background: Significant differences in G+C content between different isochore types suggest that the nucleosome positioning patterns in DNA of the isochores should be different as well. Results: Extraction of the patterns from the isochore DNA sequences by Shannon N-gram extension reveals that while the general motif YRRRRRYYYYYR is characteristic for all isochore types, the dominant positioning patterns of the isochores vary between TAAAAATTTTTA and CGGGGGCCCCCG due to the large differences in G+C composition. This is observed in human, mouse and chicken isochores, demonstrating that the variations of the positioning patterns are largely G+C dependent rather than species-specific. The species-specificity of nucleosome positioning patterns is revealed by dinucleotide periodicity analyses in isochore sequences. While human sequences show CG periodicity, chicken isochores display AG (CT) periodicity. Mouse isochores show very weak CG periodicity only. Conclusions: The nucleosome positioning pattern as revealed by Shannon N-gram extension is strongly dependent on G+C content and differs between isochores. Species-specificity of the pattern is subtle. It is reflected in the choice of preferentially periodical dinucleotides.

  8. Rapid quantification of DNA libraries for next-generation sequencing.

    Science.gov (United States)

    Buehler, Bernd; Hogrefe, Holly H; Scott, Graham; Ravi, Harini; Pabón-Peña, Carlos; O'Brien, Scott; Formosa, Rachel; Happe, Scott

    2010-04-01

    The next-generation DNA sequencing workflows require an accurate quantification of the DNA molecules to be sequenced which assures optimal performance of the instrument. Here, we demonstrate the use of qPCR for quantification of DNA libraries used in next-generation sequencing. In addition, we find that qPCR quantification may allow improvements to current NGS workflows, including reducing the amount of library DNA required, increasing the accuracy in quantifying amplifiable DNA, and avoiding amplification bias by reducing or eliminating the need to amplify DNA before sequencing. Copyright 2010. Published by Elsevier Inc.

  9. DNA sequence pattern recognition methods in GRAIL

    Energy Technology Data Exchange (ETDEWEB)

    Uberbacher, E.C.; Xu, Ying; Shah, M.; Matis, S.; Guan, X.; Mural, R.J.

    1995-12-31

    The goal of the GRAIL project has been to create a comprehensive analysis environment where a host of questions about genes and genome structure can be answered as quickly and accurately as possible. Constructing this system has entailed solving a number of significant technical challenges including: (a) making coding recognition in sequence more sensitive and accurate, (b) compensating for isochore base compositional effects in coding prediction, (c) developing methods to determine which parts of each strand of a long genomic DNA are the coding strand, (d) improving the accuracy of splice site prediction and recognizing non-consensus sites, and (e) recognizing variable regulatory structures such as polymerase II promoters. An additional challenge has been to construct algorithms which compensate for the deleterious effects of insertion or deletion (indel) errors in the coding region recognition process. This paper addresses progress on these technical issues and the current state of sequence feature recognition methods.

  10. DNA sequencing using biotinylated dideoxynucleotides and mass spectrometry

    Science.gov (United States)

    Edwards, John R.; Itagaki, Yasuhiro; Ju, Jingyue

    2001-01-01

    Matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MS) has been explored widely for DNA sequencing. The major requirement for this method is that the DNA sequencing fragments must be free from alkaline and alkaline earth salts as well as other contaminants for accurately measuring the masses of the DNA fragments. We report here the development of a novel MS DNA sequencing method that generates Sanger-sequencing fragments in one tube using biotinylated dideoxynucleotides. The DNA sequencing fragments that carry a biotin at the 3′-end are made free from salts and other components in the sequencing reaction by capture with streptavidin-coated magnetic beads. Only correctly terminated biotinylated DNA fragments are subsequently released and loaded onto a mass spectrometer to obtain accurate DNA sequencing data. Compared with gel electrophoresis-based sequencing systems, MS produces a very high resolution of DNA-sequencing fragments, fast separation on microsecond time scales, and completely eliminates the compressions associated with gel electrophoresis. The high resolution of MS allows accurate mutation and heterozygote detection. This optimized solid-phase DNA-sequencing chemistry plus future improvements in detector sensitivity for large DNA fragments in MS instrumentation will further improve MS for DNA sequencing. PMID:11691941

  11. Image correlation method for DNA sequence alignment.

    Directory of Open Access Journals (Sweden)

    Millaray Curilem Saldías

    The complexity of searches and the volume of genomic data make sequence alignment one of bioinformatics' most active research areas. New alignment approaches have incorporated digital signal processing techniques. Among these, correlation methods are highly sensitive. This paper proposes a novel sequence alignment method based on 2-dimensional images, where each nucleic acid base is represented as a fixed gray intensity pixel. Query and known database sequences are coded to their pixel representation and sequence alignment is handled as an object-recognition-in-a-scene problem. Query and database become object and scene, respectively. An image correlation process is carried out in order to search for the best match between them. Given that this procedure can be implemented in an optical correlator, the correlation could eventually be accomplished at light speed. This paper shows an initial research stage where results were "digitally" obtained by simulating an optical correlation of DNA sequences represented as images. A total of 303 queries (variable lengths from 50 to 4500 base pairs) and 100 scenes represented by 100 x 100 images each (in total, a one million base pair database) were considered for the image correlation analysis. The results showed that correlations reached very high sensitivity (99.01%) and specificity (98.99%), and outperformed BLAST when mutation numbers increased. However, digital correlation processes were a hundred times slower than BLAST. We are currently starting an initiative to evaluate the correlation speed of a real experimental optical correlator. By doing this, we expect to fully exploit the properties of optical correlation. As the optical correlator works jointly with the computer, digital algorithms should also be optimized. The results presented in this paper are encouraging and support the study of image correlation methods for sequence alignment.
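    The idea of treating alignment as correlation between pixel-coded sequences can be sketched in one dimension. The version below is a simplification under stated assumptions: it uses a 1-D strip of pixels rather than the 2-D images of the paper, the gray levels assigned to each base are arbitrary, and the score is a normalised cross-correlation computed digitally with a sliding window. The names `GRAY`, `ncc` and `best_match` are illustrative.

```python
import math

# Assumed gray intensity per base; the paper's actual levels are not given.
GRAY = {"A": 0.0, "C": 85.0, "G": 170.0, "T": 255.0}


def to_pixels(seq):
    """Code a DNA string as a list of gray intensities."""
    return [GRAY[b] for b in seq]


def ncc(a, b):
    """Normalised cross-correlation of two equal-length pixel strips."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den else 0.0


def best_match(query, database):
    """Slide the query (object) over the database (scene) and return the
    offset with the highest correlation, together with that score."""
    q = to_pixels(query)
    d = to_pixels(database)
    scores = [ncc(q, d[i:i + len(q)]) for i in range(len(d) - len(q) + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


offset, score = best_match("CGTAGGCT", "TTGACCGTAGGCTAACGT")
```

    An exact occurrence of the query gives a correlation of 1 at its offset. With mutations the peak drops below 1 but can remain the maximum, which is the property the abstract relies on.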

  12. Silicene nanoribbon as a new DNA sequencing device

    Science.gov (United States)

    Alesheikh, Sara; Shahtahmassebi, Nasser; Roknabadi, Mahmood Rezaee; Pilevar Shahri, Raheleh

    2018-02-01

    The importance of DNA sequencing in many fields has prompted a search for fast and cheap methods. Nanotechnology supports this development by introducing nanostructures that can be used for DNA sequencing. In this work we study the interaction between a zigzag silicene nanoribbon and DNA nucleobases using DFT and the non-equilibrium Green's function approach, to investigate the possibility of using zigzag silicene nanoribbons as biosensors for DNA sequencing.

  13. Identification of Meconopsis species by a DNA barcode sequence ...

    African Journals Online (AJOL)

    Deoxyribonucleic acid (DNA) barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Species identification is necessary for the authentication of traditional plant based medicines. Although a consensus has not been agreed regarding which DNA sequences can be used as ...

  14. The Subfossil Trunk of Chiarano (TV

    Directory of Open Access Journals (Sweden)

    Tiziana Urso

    2017-06-01

    This paper reports the results of the characterization of a subfossil trunk found buried in the mud of the Piavon canal, at Chiarano (TV), when dredging took place in 2008. The trunk, of imposing dimensions, lacking branches and bark, has a black, deeply cracked and strongly deteriorated outer surface with a carbonized appearance, while internally it has the typical blackish colour of the so-called drowned oak. The studies have demonstrated that it is a tree belonging to the genus Quercus, common oak or sessile oak, that may have been felled between the end of the 12th and early 14th century A.D. Determination of the MWC and residual basic density indicates that the deterioration decreases from the outside inwards; the ash content is high externally and diminishes moving toward the centre. Nowadays, the Piavon is an irrigation canal, but in Venetian times it was navigable and was used for the transport of goods and timber. There were extensive woodlands of common oak and sessile oak all along the Piavon, the size and composition of which is documented in the Venetian cadastres, which also report the distances between the woodlands and the nearest water courses, proof of the importance of river transport for the timber. In particular, an 18 hectare oak woodland is recorded in the Surian cadastre (1569-70) for the villa at Chiarano. The oaks were used by the Republic of Venice mainly for the construction and maintenance of the shipping fleet. The Chiarano trunk, given its age and the area where it was found, may therefore be a trunk felled in Venetian times, perhaps destined for naval use, which was lost during its transport by floating.

  15. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT), the bionic wavelet transform (BWT), for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA sequences. Results show that BWT performs better than traditional WT in presenting greater energy distribution. This new BWT method should be useful for the detection of the latent structural features in future DNA sequence analysis.
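    The separation into four indicator channels and the weighted recombination can be sketched as follows. This is not the BWT itself: the weights here are fixed by hand rather than adapted to the sequence, and a single level of the ordinary Haar transform stands in for the wavelet stage. All function names are illustrative.

```python
def indicator_channels(seq):
    """Split a DNA string into four 0/1 indicator sequences, one per base."""
    return {b: [1 if s == b else 0 for s in seq] for b in "ACGT"}


def weighted_signal(seq, weights):
    """Combine the four channels into one numeric signal using a weight per
    base (the paper adapts these weights; here they are fixed by hand)."""
    channels = indicator_channels(seq)
    return [
        sum(weights[b] * channels[b][i] for b in "ACGT")
        for i in range(len(seq))
    ]


def haar_step(signal):
    """One level of the Haar transform: pairwise averages (approximation)
    and pairwise half-differences (detail). A trailing odd sample is dropped."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


# Hand-picked weights, purely for illustration.
weights = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}
signal = weighted_signal("ACGT", weights)
approx, detail = haar_step(signal)
```

    Each pair of samples can be recovered from its average and half-difference, so no information is lost at this step. The adaptive part of the published method lies in choosing the channel weights, which this sketch leaves out.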

  16. Short-sequence DNA repeats in prokaryotic genomes

    NARCIS (Netherlands)

    A.F. van Belkum (Alex); S. Scherer; L. van Alphen (Loek); H.A. Verbrugh (Henri)

    1998-01-01

    Short-sequence DNA repeat (SSR) loci can be identified in all eukaryotic and many prokaryotic genomes. These loci harbor short or long stretches of repeated nucleotide sequence motifs. DNA sequence motifs in a single locus can be identical and/or

  17. SWORDS: A statistical tool for analysing large DNA sequences

    Indian Academy of Sciences (India)

    In this article, we present some simple yet effective statistical techniques for analysing and comparing large DNA sequences. These techniques are based on frequency distributions of DNA words in a large sequence, and have been packaged into a software called SWORDS. Using sequences available in public domain ...

  18. Toward a Better Compression for DNA Sequences Using Huffman Encoding

    National Research Council Canada - National Science Library

    Al-Okaily, Anas; Almarri, Badar; Al Yami, Sultan; Huang, Chun-Hsi

    ... to compress such data to a less space and a faster transmission. Different implementations of Huffman encoding incorporating the characteristics of DNA sequences prove to better compress DNA data...
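    For orientation, a plain Huffman coder over the four-letter DNA alphabet can be sketched as below. This is the textbook algorithm only; the DNA-specific variants the authors compare are not described in this excerpt and are not reproduced here. The function names are illustrative.

```python
import heapq
from collections import Counter


def huffman_code(seq):
    """Build a prefix code in which frequent symbols get shorter codewords."""
    freq = Counter(seq)
    if len(freq) == 1:
        # Degenerate case: a single symbol still needs a one-bit codeword.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, partial code table).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codewords.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(seq, code):
    """Concatenate the codeword of every symbol."""
    return "".join(code[s] for s in seq)


def decode(bits, code):
    """Read bits until they match a codeword, emit the symbol, repeat."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


sequence = "AAAAAAAACCCGGT"
code = huffman_code(sequence)
bits = encode(sequence, code)
```

    A fixed two-bit code needs 2 bits per base, so Huffman coding only helps when base frequencies are skewed, as in this toy sequence. For near-uniform base composition the gain disappears, which is presumably why DNA-specific variants are of interest.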

  19. An automated annotation tool for genomic DNA sequences using ...

    Indian Academy of Sciences (India)

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated ...

  20. Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences

    Science.gov (United States)

    2008-07-01

    Dates covered: 6 Jul 08 to 11 Jul 08. ... sequences which are generalizations of the Fibonacci sequences. Subject terms: DNA Codes, Fibonacci Ensembles, DNA Computing, Code Optimization.

  1. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    NARCIS (Netherlands)

    Buschmann, Tilo; Bystrykh, Leonid V.

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fraction of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called

  2. Design of sequence-specific DNA-binding molecules.

    Science.gov (United States)

    Dervan, P B

    1986-04-25

    Base sequence information can be stored in the local structure of right-handed double-helical DNA (B-DNA). The question arises as to whether a set of rules for the three-dimensional readout of the B-DNA helix can be developed. This would allow the design of synthetic molecules that bind DNA of any specific sequence and site size. There are four stages of development for each new synthetic sequence-specific DNA-binding molecule: design, synthesis, testing for sequence specificity, and reevaluation of the design. This approach has produced bis(distamycin)fumaramide, a synthetic, crescent-shaped oligopeptide that binds nine contiguous adenine-thymine base pairs in the minor groove of double-helical DNA.

  3. Detection of DNA Methylation by Whole-Genome Bisulfite Sequencing.

    Science.gov (United States)

    Li, Qing; Hermanson, Peter J; Springer, Nathan M

    2018-01-01

    DNA methylation plays an important role in the regulation of the expression of transposons and genes. Various methods have been developed to assay DNA methylation levels. Bisulfite sequencing is considered to be the "gold standard" for single-base resolution measurement of DNA methylation levels. Coupled with next-generation sequencing, whole-genome bisulfite sequencing (WGBS) allows DNA methylation to be evaluated at a genome-wide scale. Here, we described a protocol for WGBS in plant species with large genomes. This protocol has been successfully applied to assay genome-wide DNA methylation levels in maize and barley. This protocol has also been successfully coupled with sequence capture technology to assay DNA methylation levels in a targeted set of genomic regions.

  4. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd- and 4th-year undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  5. Affordable Hands-On DNA Sequencing and Genotyping: An Exercise for Teaching DNA Analysis to Undergraduates

    Science.gov (United States)

    Shah, Kushani; Thomas, Shelby; Stein, Arnold

    2013-01-01

    In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…

  6. Mesoscopic Model for Free Energy Landscape Analysis of DNA sequences

    CERN Document Server

    Tapia-Rojo, R; Mazo, J J; Falo, F; 10.1103/PhysRevE.86.021908

    2012-01-01

    A mesoscopic model which allows us to identify and quantify the strength of binding sites in DNA sequences is proposed. The model is based on the Peyrard-Bishop-Dauxois model for the DNA chain coupled to a Brownian particle which explores the sequence interacting more importantly with open base pairs of the DNA chain. We apply the model to promoter sequences of different organisms. The free energy landscape obtained for these promoters shows a complex structure that is strongly connected to their biological behavior. The analysis method used is able to quantify free energy differences of sites within genome sequences.

  7. Sequence-Specific DNA Binding by a Short Peptide Dimer

    Science.gov (United States)

    Talanian, Robert V.; McKnight, C. James; Kim, Peter S.

    1990-08-01

    A recently described class of DNA binding proteins is characterized by the "bZIP" motif, which consists of a basic region that contacts DNA and an adjacent "leucine zipper" that mediates protein dimerization. A peptide model for the basic region of the yeast transcriptional activator GCN4 has been developed in which the leucine zipper has been replaced by a disulfide bond. The 34-residue peptide dimer, but not the reduced monomer, binds DNA with nanomolar affinity at 4 °C. DNA binding is sequence-specific as judged by deoxyribonuclease I footprinting. Circular dichroism spectroscopy suggests that the peptide adopts a helical structure when bound to DNA. These results demonstrate directly that the GCN4 basic region is sufficient for sequence-specific DNA binding and suggest that a major function of the GCN4 leucine zipper is simply to mediate protein dimerization. Our approach provides a strategy for the design of short sequence-specific DNA binding peptides.

  8. DNA Polymerases Drive DNA Sequencing-by-Synthesis Technologies: Both Past and Present

    Directory of Open Access Journals (Sweden)

    Cheng-Yao Chen

    2014-06-01

    Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. E. coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger's dideoxy chain-terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public database. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third generation, real-time single molecule sequencing platform requires slightly different enzyme properties. Enterobacterial phage ϕ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, ϕ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore-, and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  9. DNA polymerases drive DNA sequencing-by-synthesis technologies: both past and present

    Science.gov (United States)

    Chen, Cheng-Yao

    2014-01-01

    Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. Escherichia coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger’s dideoxy chain-terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public database. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today’s standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third generation, real-time single molecule sequencing platform requires slightly different enzyme properties. Enterobacterial phage ϕ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, ϕ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore-, and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies. PMID:25009536

  10. Next-generation sequencing technologies for environmental DNA research.

    Science.gov (United States)

    Shokralla, Shadi; Spall, Jennifer L; Gibson, Joel F; Hajibabaei, Mehrdad

    2012-04-01

    Since 2005, advances in next-generation sequencing technologies have revolutionized biological science. The analysis of environmental DNA through the use of specific gene markers such as species-specific DNA barcodes has been a key application of next-generation sequencing technologies in ecological and environmental research. Access to parallel, massive amounts of sequencing data, as well as subsequent improvements in read length and throughput of different sequencing platforms, is leading to a better representation of sample diversity at a reasonable cost. New technologies are being developed rapidly and have the potential to dramatically accelerate ecological and environmental research. The fast pace of development and improvements in next-generation sequencing technologies can reflect on broader and more robust applications in environmental DNA research. Here, we review the advantages and limitations of current next-generation sequencing technologies in regard to their application for environmental DNA analysis. © 2012 Blackwell Publishing Ltd.

  11. Development of an automated procedure for fluorescent DNA sequencing.

    Science.gov (United States)

    Wilson, R K; Chen, C; Avdalovic, N; Burns, J; Hood, L

    1990-04-01

    We describe here the development of a procedure for complete automation of the dideoxynucleotide DNA sequencing chemistry using fluorescent dye-labeled oligonucleotide primers. This procedure combines rapid preparation of template DNA using a modification of the polymerase chain reaction, automation of the DNA sequencing reactions using a robotic laboratory workstation, and subsequent analysis of the fluorescent-labeled reaction products on a commercial automated fluorescent sequencer. Using this procedure, we were able to produce sufficient quantities of template DNA directly from bacterial colonies or bacteriophage plaques, perform the DNA sequencing reactions on these templates, and load the reaction products on the fluorescent DNA sequencer in a single work day. This scheme for automation of the fluorescent DNA sequencing method allows the fluorescent sequencer to be run at its full capacity every day and eliminates much of the labor required to obtain a high level of data output. Currently, we are able to perform and analyze 16 fluorescent-labeled reactions every day, with an average output of over 7000 bp per sequencer run.

  12. [Study on factors influencing DNA sequencing by automatic genetic analyzer].

    Science.gov (United States)

    Yan, Shaofei; Wang, Wei; Xu, Jin; Bai, Li; Gan, Xin; Li, Fengqin

    2015-05-01

    To acquire accurate and successful DNA sequencing in a cost-effective way by ABI3500xl automatic genetic analyzer. BigDye was diluted to 8, 16 and 32 times in PCR product sequencing. Three different methods including CENTRI-SEP kit, BigDye cleaning beads and ethanol-NaAc-EDTA were used to purify the sequencing PCR products. The results of DNA sequencing were correct when BigDye was diluted up to 16 times. The misreading of nucleic acid bases was found as BigDye was diluted to 32 times. All three purification methods provided acceptable DNA sequencing results. In terms of method for purification of PCR products, the CENTRI-SEP Kit was the most expensive but time-saving (0.5 h), while ethanol-NaAc-EDTA method was the most economical but time-consuming (2 h). The BigDye cleaning beads method was of a suitable purification time (1 h) but not fit for high-throughput DNA sequencing. BigDye should be diluted up to 16 times in DNA sequencing by ABI3500xl DNA analyzer. Although all three purification methods may promise DNA sequencing results with good quality, it is necessary to choose an appropriate one to keep the balance between time and cost on the basis of the lab condition.

  13. Collection and extraction of saliva DNA for next generation sequencing.

    Science.gov (United States)

    Goode, Michael R; Cheong, Soo Yeon; Li, Ning; Ray, William C; Bartlett, Christopher W

    2014-08-27

    The preferred source of DNA in human genetics research is blood, or cell lines derived from blood, as these sources yield large quantities of high quality DNA. However, DNA extraction from saliva can yield high quality DNA with little to no degradation/fragmentation that is suitable for a variety of DNA assays without the expense of a phlebotomist and can even be acquired through the mail. However, at present, no saliva DNA collection/extraction protocols for next generation sequencing have been presented in the literature. This protocol optimizes parameters of saliva collection/storage and DNA extraction to be of sufficient quality and quantity for DNA assays with the highest standards, including microarray genotyping and next generation sequencing.

  14. Cross-utilizing hyperchaotic and DNA sequences for image encryption

    Science.gov (United States)

    Zhan, Kun; Wei, Dong; Shi, Jinhui; Yu, Jun

    2017-01-01

    The hyperchaotic sequence and the DNA sequence are utilized jointly for image encryption. A four-dimensional hyperchaotic system is used to generate a pseudorandom sequence. The main idea is to apply the hyperchaotic sequence to almost all steps of the encryption. All intensity values of an input image are converted to a serial binary digit stream, and the bitstream is scrambled globally by the hyperchaotic sequence. DNA algebraic operation and complementation are performed between the hyperchaotic sequence and the DNA sequence to obtain a robust encryption performance. The experiment results demonstrate that the encryption algorithm achieves the performance of the state-of-the-art methods in term of quality, security, and robustness against noise and cropping attack.

  15. An auditory display tool for DNA sequence analysis.

    Science.gov (United States)

    Temple, Mark D

    2017-04-24

    DNA Sonification refers to the use of an auditory display to convey the information content of DNA sequence data. Six sonification algorithms are presented that each produce an auditory display. These algorithms are logically designed from the simple through to the more complex. Three of these parse individual nucleotides, nucleotide pairs or codons into musical notes to give rise to 4, 16 or 64 notes, respectively. Codons may also be parsed degenerately into 20 notes with respect to the genetic code. Lastly, nucleotide pairs can be parsed as two separate frames or codons can be parsed as three reading frames, giving rise to multiple streams of audio. The most informative sonification algorithm reads the DNA sequence as codons in three reading frames to produce three concurrent streams of audio in an auditory display. This approach is advantageous since start and stop codons in any one frame have a direct effect, starting or stopping the audio in that frame while leaving the other frames unaffected. Using these methods, DNA sequences such as open reading frames or repetitive DNA sequences can be distinguished from one another. These sonification tools are available through a webpage interface in which an input DNA sequence can be processed in real time to produce an auditory display playable directly within the browser. The potential of this approach as an analytical tool is discussed with reference to auditory displays derived from test sequences including simple nucleotide sequences, repetitive DNA sequences and coding or non-coding genes. This study presents a proof-of-concept that some properties of a DNA sequence can be identified through sonification alone and argues for their inclusion within the toolkit of DNA sequence browsers as an adjunct to existing visual and analytical tools.
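
    The three-reading-frame idea in this abstract can be sketched roughly as follows. This is only an illustration: the abstract does not give the actual note assignments, so the base-4 index used here as a "note number", the function names, and the choice to play the stop codon before falling silent are all assumptions.

```python
# Hypothetical sketch: map k-mers to note indices and gate each reading
# frame's stream on start/stop codons. Not the published tool's mapping.
BASES = "ACGT"
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def kmer_index(kmer):
    """Base-4 index of a k-mer: 4 values for k=1, 16 for k=2, 64 for k=3."""
    idx = 0
    for base in kmer:
        idx = idx * 4 + BASES.index(base)
    return idx

def frame_stream(seq, frame):
    """Note indices for one reading frame (0, 1 or 2); None means silence.

    Audio starts at a start codon and stops after a stop codon; the other
    frames are computed independently, so they are unaffected.
    """
    notes = []
    playing = False
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if not playing and codon == START:
            playing = True
        notes.append(kmer_index(codon) if playing else None)
        if playing and codon in STOPS:
            playing = False
    return notes

def three_frame_streams(seq):
    """One stream per reading frame, as three concurrent 'voices'."""
    return [frame_stream(seq, f) for f in range(3)]
```

    For a sequence such as "CCATGAAATAGCC", only frame 2 contains a start codon, so only that stream produces notes while the other two stay silent.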

  16. Simulations Using Random-Generated DNA and RNA Sequences

    Science.gov (United States)

    Bryce, C. F. A.

    1977-01-01

    Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
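
    The exercises listed in this record are easy to reproduce today. The sketch below is a modern Python stand-in, not the original BASIC program; all function names are illustrative assumptions.

```python
import random
from collections import Counter

PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}

def random_dna(length, seed=None):
    """Random DNA sequence with equal probability for each base."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))

def complement(seq):
    """Base-by-base complementary strand, written in the same direction."""
    return "".join(PAIRING[b] for b in seq)

def reverse_complement(seq):
    """Complementary strand read 5' to 3'."""
    return complement(seq)[::-1]

def transcribe(seq):
    """RNA version of a DNA sequence (T replaced by U)."""
    return seq.replace("T", "U")

def base_composition(seq):
    """Fraction of each base in the sequence."""
    counts = Counter(seq)
    return {b: counts[b] / len(seq) for b in "ACGT"}

def codon_counts(seq):
    """Counts of triplet codons in reading frame 0; trailing bases ignored."""
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))
```

    Passing a fixed `seed` makes a classroom exercise repeatable, so every student works on the same "random" sequence.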

  17. Illumina Sequencing of Bisulfite-Converted DNA Libraries.

    Science.gov (United States)

    Lizardi, Paul M; Yan, Qin; Wajapeyee, Narendra

    2017-11-01

    Here we describe a standard MethylC-seq protocol using single-read sequencing on an Illumina Genome Analyzer II platform. The protocol involves ligation of methylated sequencing adaptors to sonicated genomic DNA, gel purification, sodium bisulfite conversion, polymerase chain reaction (PCR) amplification, and sequencing. © 2017 Cold Spring Harbor Laboratory Press.

  18. Haplogrouping mitochondrial DNA sequences in Legal Medicine/Forensic Genetics.

    Science.gov (United States)

    Bandelt, Hans-Jürgen; van Oven, Mannis; Salas, Antonio

    2012-11-01

    Haplogrouping refers to the classification of (partial) mitochondrial DNA (mtDNA) sequences into haplogroups using the current knowledge of the worldwide mtDNA phylogeny. Haplogroup assignment of mtDNA control-region sequences assists in the focused comparison with closely related complete mtDNA sequences and thus serves two main goals in forensic genetics: first is the a posteriori quality analysis of sequencing results and second is the prediction of relevant coding-region sites for confirmation or further refinement of haplogroup status. The latter may be important in forensic casework where discrimination power needs to be as high as possible. However, most articles published in forensic genetics perform haplogrouping only in a rudimentary or incorrect way. The present study features PhyloTree as the key tool for assigning control-region sequences to haplogroups and elaborates on additional Web-based searches for finding near-matches with complete mtDNA genomes in the databases. In contrast, none of the automated haplogrouping tools available can yet compete with manual haplogrouping using PhyloTree plus additional Web-based searches, especially when confronted with artificial recombinants still present in forensic mtDNA datasets. We review and classify the various attempts at haplogrouping by using a multiplex approach or relying on automated haplogrouping. Furthermore, we re-examine a few articles in forensic journals providing mtDNA population data where appropriate haplogrouping following PhyloTree immediately highlights several kinds of sequence errors.

  19. A mathematical model and numerical method for thermoelectric DNA sequencing

    Science.gov (United States)

    Shi, Liwei; Guilbeau, Eric J.; Nestorova, Gergana; Dai, Weizhong

    2014-05-01

    Single nucleotide polymorphisms (SNPs) are single base pair variations within the genome that are important indicators of genetic predisposition towards specific diseases. This study explores the feasibility of SNP detection using a thermoelectric sequencing method that measures the heat released when DNA polymerase inserts a deoxyribonucleoside triphosphate into a DNA strand. We propose a three-dimensional mathematical model that governs the DNA sequencing device with a reaction zone that contains DNA template/primer complex immobilized to the surface of the lower channel wall. The model is then solved numerically. Concentrations of reactants and the temperature distribution are obtained. Results indicate that when the nucleoside is complementary to the next base in the DNA template, polymerization occurs lengthening the complementary polymer and releasing thermal energy with a measurable temperature change, implying that the thermoelectric conceptual device for sequencing DNA may be feasible for identifying specific genes in individuals.

  20. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    Directory of Open Access Journals (Sweden)

    Michael Knapp

    2010-07-01

    Full Text Available The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions.

  1. An Evolution Based Biosensor Receptor DNA Sequence Generation Algorithm

    Directory of Open Access Journals (Sweden)

    Yupeng Zang

    2009-12-01

    Full Text Available A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.

  2. An evolution based biosensor receptor DNA sequence generation algorithm.

    Science.gov (United States)

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.

  3. Mitochondrial DNA sequence-based phylogenetic relationship ...

    Indian Academy of Sciences (India)

    Introduction. Mitochondrial DNA (mtDNA) has been one of the most widely used molecular markers for phylogenetic studies in animals, because of its simple genomic structure (Avise 2004). Among insects, the maximum .... 2007 Population structure of the malaria vector Anopheles darlingi in Rondonia, Brazilian Amazon, ...

  4. Efficiency of methylated DNA immunoprecipitation bisulphite sequencing for whole-genome DNA methylation analysis.

    Science.gov (United States)

    Jeong, Hae Min; Lee, Sangseon; Chae, Heejoon; Kim, RyongNam; Kwon, Mi Jeong; Oh, Ensel; Choi, Yoon-La; Kim, Sun; Shin, Young Kee

    2016-08-01

    We compared four common methods for measuring DNA methylation levels and recommended the most efficient method in terms of cost and coverage. The DNA methylation status of liver and stomach tissues was profiled using four different methods, whole-genome bisulphite sequencing (WG-BS), targeted bisulphite sequencing (Targeted-BS), methylated DNA immunoprecipitation sequencing (MeDIP-seq) and methylated DNA immunoprecipitation bisulphite sequencing (MeDIP-BS). We calculated DNA methylation levels using each method and compared the results. MeDIP-BS yielded the most similar DNA methylation profile to WG-BS, with 20 times less data, suggesting remarkable cost savings and coverage efficiency compared with the other methods. MeDIP-BS is a practical cost-effective method for analyzing whole-genome DNA methylation that is highly accurate at base-pair resolution.

  5. Plasmonic Nanopores for Trapping, Controlling Displacement, and Sequencing of DNA.

    Science.gov (United States)

    Belkin, Maxim; Chao, Shu-Han; Jonsson, Magnus P; Dekker, Cees; Aksimentiev, Aleksei

    2015-11-24

    With the aim of developing a DNA sequencing methodology, we theoretically examine the feasibility of using nanoplasmonics to control the translocation of a DNA molecule through a solid-state nanopore and to read off sequence information using surface-enhanced Raman spectroscopy. Using molecular dynamics simulations, we show that high-intensity optical hot spots produced by a metallic nanostructure can arrest DNA translocation through a solid-state nanopore, thus providing a physical knob for controlling the DNA speed. Switching the plasmonic field on and off can displace the DNA molecule in discrete steps, sequentially exposing neighboring fragments of a DNA molecule to the pore as well as to the plasmonic hot spot. Surface-enhanced Raman scattering from the exposed DNA fragments contains information about their nucleotide composition, possibly allowing the identification of the nucleotide sequence of a DNA molecule transported through the hot spot. The principles of plasmonic nanopore sequencing can be extended to detection of DNA modifications and RNA characterization.

  6. Plasmonic Nanopores for Trapping, Controlling Displacement, and Sequencing of DNA

    Science.gov (United States)

    2015-01-01

    With the aim of developing a DNA sequencing methodology, we theoretically examine the feasibility of using nanoplasmonics to control the translocation of a DNA molecule through a solid-state nanopore and to read off sequence information using surface-enhanced Raman spectroscopy. Using molecular dynamics simulations, we show that high-intensity optical hot spots produced by a metallic nanostructure can arrest DNA translocation through a solid-state nanopore, thus providing a physical knob for controlling the DNA speed. Switching the plasmonic field on and off can displace the DNA molecule in discrete steps, sequentially exposing neighboring fragments of a DNA molecule to the pore as well as to the plasmonic hot spot. Surface-enhanced Raman scattering from the exposed DNA fragments contains information about their nucleotide composition, possibly allowing the identification of the nucleotide sequence of a DNA molecule transported through the hot spot. The principles of plasmonic nanopore sequencing can be extended to detection of DNA modifications and RNA characterization. PMID:26401685

  7. Detecting DNA modifications from SMRT sequencing data by modeling sequence context dependence of polymerase kinetic.

    Directory of Open Access Journals (Sweden)

    Zhixing Feng

    Full Text Available DNA modifications such as methylation and DNA damage can play critical regulatory roles in biological systems. Single molecule, real time (SMRT) sequencing technology generates DNA sequences as well as DNA polymerase kinetic information that can be used for the direct detection of DNA modifications. We demonstrate that local sequence context has a strong impact on DNA polymerase kinetics in the neighborhood of the incorporation site during the DNA synthesis reaction, allowing for the possibility of estimating the expected kinetic rate of the enzyme at the incorporation site using kinetic rate information collected from existing SMRT sequencing data (historical data) covering the same local sequence contexts of interest. We develop an Empirical Bayesian hierarchical model for incorporating historical data. Our results show that the model could greatly increase DNA modification detection accuracy and reduce the required coverage of control data. For some DNA modifications that have a strong signal, a control sample is not even needed when historical data are used as an alternative to the control. Thus, sequencing costs can be greatly reduced by using the model. We implemented the model in an R package named seqPatch, which is available at https://github.com/zhixingfeng/seqPatch.

  8. Semiconductor-based DNA sequencing of histone modification states

    Science.gov (United States)

    Cheng, Christine S.; Rai, Kunal; Garber, Manuel; Hollinger, Andrew; Robbins, Dana; Anderson, Scott; Macbeth, Alyssa; Tzou, Austin; Carneiro, Mauricio O.; Raychowdhury, Raktima; Russ, Carsten; Hacohen, Nir; Gershenwald, Jeffrey E.; Lennon, Niall; Nusbaum, Chad; Chin, Lynda; Regev, Aviv; Amit, Ido

    2013-01-01

    The recent development of a semiconductor-based, non-optical DNA sequencing technology promises scalable, low-cost and rapid sequence data production. The technology has previously been applied mainly to genomic sequencing and targeted re-sequencing. Here we demonstrate the utility of Ion Torrent semiconductor-based sequencing for sensitive, efficient and rapid chromatin immunoprecipitation followed by sequencing (ChIP-seq) through the application of sample preparation methods that are optimized for ChIP-seq on the Ion Torrent platform. We leverage this method for epigenetic profiling of tumour tissues. PMID:24157732

  9. ATRF Houses the Latest DNA Sequencing Technologies | Poster

    Science.gov (United States)

    By Ashley DeVine, Staff Writer By the end of October, the Advanced Technology Research Facility (ATRF) will be one of the few facilities in the world to house all of the latest DNA sequencing technologies.

  10. DNA sequence and prokaryotic expression analysis of vitellogenin ...

    African Journals Online (AJOL)

    2010-02-08

    Feb 8, 2010 ... In this study, the DNA sequence of vitellogenin from Antheraea pernyi (Ap-Vg) was identified and its functional ..... silk-producing insects based on 16S ribosomal RNA and cytochrome oxidase subunit I genes. J. Genet.

  11. DNA sequencing using polymerase substrate-binding kinetics.

    Science.gov (United States)

    Previte, Michael John Robert; Zhou, Chunhong; Kellinger, Matthew; Pantoja, Rigo; Chen, Cheng-Yao; Shi, Jin; Wang, BeiBei; Kia, Amirali; Etchin, Sergey; Vieceli, John; Nikoomanzar, Ali; Bomati, Erin; Gloeckner, Christian; Ronaghi, Mostafa; He, Molly Min

    2015-01-23

    Next-generation sequencing (NGS) has transformed genomic research by decreasing the cost of sequencing. However, whole-genome sequencing is still costly and complex for diagnostics purposes. In the clinical space, targeted sequencing has the advantage of allowing researchers to focus on specific genes of interest. Routine clinical use of targeted NGS mandates inexpensive instruments, fast turnaround time and an integrated and robust workflow. Here we demonstrate a version of the Sequencing by Synthesis (SBS) chemistry that potentially can become a preferred targeted sequencing method in the clinical space. This sequencing chemistry uses natural nucleotides and is based on real-time recording of the differential polymerase/DNA-binding kinetics in the presence of correct or mismatch nucleotides. This ensemble SBS chemistry has been implemented on an existing Illumina sequencing platform with integrated cluster amplification. We discuss the advantages of this sequencing chemistry for targeted sequencing as well as its limitations for other applications.

  12. Levenshtein error-correcting barcodes for multiplexed DNA sequencing.

    Science.gov (United States)

    Buschmann, Tilo; Bystrykh, Leonid V

    2013-09-11

    High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fractions of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called multiplexing approach relies on a specific DNA tag or barcode that is attached to the sequencing or amplification primer and hence appears at the beginning of the sequence in every read. After sequencing, each sample read is identified on the basis of the respective barcode sequence. Alterations of DNA barcodes during synthesis, primer ligation, DNA amplification, or sequencing may lead to incorrect sample identification unless the error is revealed and corrected. This can be accomplished by implementing error correcting algorithms and codes. This barcoding strategy increases the total number of correctly identified samples, thus improving overall sequencing efficiency. Two popular sets of error-correcting codes are Hamming codes and Levenshtein codes. Levenshtein codes operate only on words of known length. Since a DNA sequence with an embedded barcode is essentially one continuous long word, application of the classical Levenshtein algorithm is problematic. In this paper we demonstrate the decreased error correction capability of Levenshtein codes in a DNA context and suggest an adaptation of Levenshtein codes that is proven capable of efficiently correcting nucleotide errors in DNA sequences. In our adaptation we take the DNA context into account and redefine the word length whenever an insertion or deletion is revealed. In simulations we show the superior error correction capability of the new method compared to traditional Levenshtein and Hamming based codes in the presence of multiple errors. We present an adaptation of Levenshtein codes to DNA contexts capable of correction of a pre-defined number of insertion, deletion, and substitution mutations.
Our improved method is additionally capable
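
    For orientation, the classical Levenshtein (edit) distance that these codes build on can be sketched as below, together with a naive nearest-barcode assignment. This is the textbook dynamic-programming algorithm, not the authors' adapted method; the function names, the `max_dist` threshold, and the tie-handling rule are assumptions made for illustration.

```python
def levenshtein(a, b):
    """Classical edit distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]

def assign_barcode(observed, barcodes, max_dist=1):
    """Return the unique closest barcode within max_dist, else None.

    Assumes `observed` is a fixed-length prefix cut from the read, which is
    the known-word-length assumption the abstract identifies as problematic
    once insertions or deletions shift the barcode boundary.
    """
    scored = sorted((levenshtein(observed, bc), bc) for bc in barcodes)
    best_dist, best = scored[0]
    if best_dist > max_dist:
        return None                         # too many errors to trust
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None                         # ambiguous: two equally close
    return best
```

    With barcodes "AAAA", "CCCC" and "GGGG", the observed prefix "AAAT" is assigned to "AAAA", while "ACGT" is rejected because every barcode is three edits away.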

  13. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    Science.gov (United States)

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fractions of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called multiplexing approach relies on a specific DNA tag or barcode that is attached to the sequencing or amplification primer and hence appears at the beginning of the sequence in every read. After sequencing, each sample read is identified on the basis of the respective barcode sequence. Alterations of DNA barcodes during synthesis, primer ligation, DNA amplification, or sequencing may lead to incorrect sample identification unless the error is revealed and corrected. This can be accomplished by implementing error correcting algorithms and codes. This barcoding strategy increases the total number of correctly identified samples, thus improving overall sequencing efficiency. Two popular sets of error-correcting codes are Hamming codes and Levenshtein codes. Results: Levenshtein codes operate only on words of known length. Since a DNA sequence with an embedded barcode is essentially one continuous long word, application of the classical Levenshtein algorithm is problematic. In this paper we demonstrate the decreased error correction capability of Levenshtein codes in a DNA context and suggest an adaptation of Levenshtein codes that is proven capable of efficiently correcting nucleotide errors in DNA sequences. In our adaptation we take the DNA context into account and redefine the word length whenever an insertion or deletion is revealed. In simulations we show the superior error correction capability of the new method compared to traditional Levenshtein and Hamming based codes in the presence of multiple errors. Conclusion: We present an adaptation of Levenshtein codes to DNA contexts capable of correction of a pre-defined number of insertion, deletion, and substitution mutations.
Our improved

  14. Directed Evolution of DNA Polymerases for Next Generation Sequencing

    Science.gov (United States)

    Leconte, Aaron M.; Patel, Maha P.; Sass, Lauryn E.; McInerney, Peter; Jarosz, Mirna; Kung, Li; Bowers, Jayson L.; Buzby, Philip R.; Efcavitch, J. William; Romesberg, Floyd E.

    2011-01-01

    We present the application of an activity-based phage display method to identify DNA polymerases tailored for next generation sequencing applications. Using this approach, we identify a mutant of Taq DNA polymerase that incorporates the fluorophore-labeled dA, dT, dC, and dG substrates ~50 to 400-fold more efficiently into scarred primers in solution and that also demonstrates significantly improved performance under actual sequencing conditions. PMID:20629059

  15. Local alignment of two-base encoded DNA sequence.

    Science.gov (United States)

    Homer, Nils; Merriman, Barry; Nelson, Stanley F

    2009-06-09

    DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data.

  16. PREDICTION OF CHROMATIN STATES USING DNA SEQUENCE PROPERTIES

    KAUST Repository

    Bahabri, Rihab R.

    2013-06-01

    Activities of DNA are to a great extent controlled epigenetically through the internal structure of chromatin. This structure is dynamic and is influenced by different modifications of histone proteins. Various combinations of epigenetic modifications of histones point to different functional regions of the DNA, determining the so-called chromatin states. However, the characterization of chromatin states by DNA sequence properties remains largely unknown. In this study we aim to explore whether DNA sequence patterns in the human genome can characterize different chromatin states. Using DNA sequence motifs, we built binary classifiers for each chromatin state to evaluate whether a given genomic sequence is a good candidate for belonging to a particular chromatin state. Of the four classification algorithms (C4.5, Naive Bayes, Random Forest, and SVM) used for this purpose, the decision-tree-based classifiers (C4.5 and Random Forest) yielded the best results. Our results suggest that in general these models lack sufficient predictive power, although for four chromatin states (insulators, heterochromatin, and two types of copy number variation) we found that the presence of certain motifs in DNA sequences does imply an increased probability that such a sequence is one of these chromatin states.

  17. Nuclear and mitochondrial DNA sequences from two Denisovan individuals.

    Science.gov (United States)

    Sawyer, Susanna; Renaud, Gabriel; Viola, Bence; Hublin, Jean-Jacques; Gansauge, Marie-Theres; Shunkov, Michael V; Derevianko, Anatoly P; Prüfer, Kay; Kelso, Janet; Pääbo, Svante

    2015-12-22

    Denisovans, a sister group of Neandertals, have been described on the basis of a nuclear genome sequence from a finger phalanx (Denisova 3) found in Denisova Cave in the Altai Mountains. The only other Denisovan specimen described to date is a molar (Denisova 4) found at the same site. This tooth carries a mtDNA sequence similar to that of Denisova 3. Here we present nuclear DNA sequences from Denisova 4 and a morphological description, as well as mitochondrial and nuclear DNA sequence data, from another molar (Denisova 8) found in Denisova Cave in 2010. This new molar is similar to Denisova 4 in being very large and lacking traits typical of Neandertals and modern humans. Nuclear DNA sequences from the two molars form a clade with Denisova 3. The mtDNA of Denisova 8 is more diverged and has accumulated fewer substitutions than the mtDNAs of the other two specimens, suggesting Denisovans were present in the region over an extended period. The nuclear DNA sequence diversity among the three Denisovans is comparable to that among six Neandertals, but lower than that among present-day humans.

  18. SWORDS: A statistical tool for analysing large DNA sequences

    Indian Academy of Sciences (India)

    Unknown

    Unusual frequencies of certain DNA words in Escherichia coli and virus genomes and possible statistical and biological implications of such over- and under-representation of those words have been studied in the literature based on Markov chain models for DNA sequences (Phillips et al 1987a,b; Prum et al 1995; Leung.

  19. Maternal Plasma DNA and RNA Sequencing for Prenatal Testing

    NARCIS (Netherlands)

    Tamminga, Saskia; van Maarle, Merel; Henneman, Lidewij; Oudejans, Cees B. M.; Cornel, Martina C.; Sistermans, Erik A.

    2016-01-01

    Cell-free DNA (cfDNA) testing has recently become indispensable in diagnostic testing and screening. In the prenatal setting, this type of testing is often called noninvasive prenatal testing (NIPT). With a number of techniques, using either next-generation sequencing or single nucleotide

  20. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    DEFF Research Database (Denmark)

    Nielsen, Peter E.

    2008-01-01

    sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technol. of protein dsDNA structures. (c) 2008 American Institute of Physics.

  1. DNA Sequences of RAPD Fragments in the Egyptian cotton ...

    African Journals Online (AJOL)

    Random Amplified Polymorphic DNAs (RAPDs) is a DNA polymorphism assay based on the amplification of random DNA segments with single primers of arbitrary nucleotide sequence. Despite the fact that the RAPD technique has become a very powerful tool and has found use in numerous applications, yet, the nature of ...

  2. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sec...

  3. A method for cloning and sequencing long palindromic DNA junctions.

    Science.gov (United States)

    Rattray, Alison J

    2004-11-08

    DNA sequences containing long adjacent inverted repeats (palindromes) are inherently unstable and are associated with many types of chromosomal rearrangements. The instability associated with palindromic sequences also creates difficulties in their molecular analysis: long palindromes (>250 bp/arm) are highly unstable in Escherichia coli, and cannot be directly PCR amplified or sequenced due to their propensity to form intra-strand hairpins. Here, we show that DNA molecules containing long palindromes (>900 bp/arm) can be transformed and stably maintained in Saccharomyces cerevisiae cells lacking a functional SAE2 gene. Treatment of the palindrome-containing DNA with sodium bisulfite at high temperature results in deamination of cytosine, converting it to uracil and thus reducing the propensity to form intra-strand hairpins. The bisulfite-treated DNA can then be PCR amplified, cloned and sequenced, allowing determination of the nucleotide sequence of the junctions. Our data demonstrate that long palindromes with either no spacer (perfect) or a 2 bp spacer can be stably maintained, recovered and sequenced from sae2Δ yeast cells. Since DNA sequences from mammalian cells can be gap repaired by their co-transformation into yeast cells with an appropriate vector, the methods described in this manuscript should provide some of the necessary tools to isolate and characterize palindromic junctions from mammalian cells.
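
    The effect of cytosine deamination on hairpin formation can be illustrated with a toy calculation. The Python sketch below is a crude model under stated assumptions, not the authors' analysis: it assumes complete C-to-U conversion (read as T after PCR), counts only Watson-Crick pairs, and uses a short made-up palindrome; the names `bisulfite_convert` and `hairpin_pairs` are hypothetical.

```python
WC_PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def bisulfite_convert(seq):
    """Model full conversion: every C is read as T after PCR (C -> U -> T)."""
    return seq.replace("C", "T")


def hairpin_pairs(seq):
    """Count Watson-Crick pairs formed if the strand folds back on itself."""
    half = len(seq) // 2
    return sum((seq[i], seq[-1 - i]) in WC_PAIRS for i in range(half))


# Made-up perfect palindrome: a 6-base arm followed by its reverse complement.
palindrome = "GACCTGCAGGTC"
print(hairpin_pairs(palindrome))                     # 6
print(hairpin_pairs(bisulfite_convert(palindrome)))  # 2
```

    In this toy case the converted strand keeps only the A-T pairs, which is consistent with the abstract's point that bisulfite treatment reduces intra-strand pairing enough to allow amplification.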

  4. Sequencing and Analysis of Neanderthal Genomic DNA

    OpenAIRE

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo, Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2006-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library a...

  5. Cloning, sequencing and expression of cDNA encoding growth ...

    Indian Academy of Sciences (India)

    Journal of Biosciences, Volume 26, Issue 3. The full-length cDNA clone is 1132 bp in length, coding for an open reading frame (ORF) of 603 bp; the reading frame encodes a putative polypeptide of 200 amino acids including the signal sequence of 22 amino acids. The 5′ and 3′ ...

  6. The properties and applications of single-molecule DNA sequencing

    Science.gov (United States)

    2011-01-01

    Single-molecule sequencing enables DNA or RNA to be sequenced directly from biological samples, making it well-suited for diagnostic and clinical applications. Here we review the properties and applications of this rapidly evolving and promising technology. PMID:21349208

  7. Mitochondrial DNA sequence-based phylogenetic relationship ...

    Indian Academy of Sciences (India)

    The phylogenetic relationships among flesh flies of the family Sarcophagidae have been based mainly on the morphology of male genitalia. However, the male genitalic character-based relationships are far from satisfactory. Therefore, in the present study mitochondrial DNA has been used as a marker to unravel genetic ...

  8. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2012-11-01

    Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the chloroplast genome.

  9. High-throughput DNA sequencing errors are reduced by orders of magnitude using circle sequencing

    Science.gov (United States)

    Lou, Dianne I.; Hussmann, Jeffrey A.; McBee, Ross M.; Acevedo, Ashley; Andino, Raul; Press, William H.; Sawyer, Sara L.

    2013-01-01

    A major limitation of high-throughput DNA sequencing is the high rate of erroneous base calls produced. For instance, Illumina sequencing machines produce errors at a rate of ∼0.1–1 × 10^-2 per base sequenced. These technologies typically produce billions of base calls per experiment, translating to millions of errors. We have developed a unique library preparation strategy, “circle sequencing,” which allows for robust downstream computational correction of these errors. In this strategy, DNA templates are circularized, copied multiple times in tandem with a rolling circle polymerase, and then sequenced on any high-throughput sequencing machine. Each read produced is computationally processed to obtain a consensus sequence of all linked copies of the original molecule. Physically linking the copies ensures that each copy is independently derived from the original molecule and allows for efficient formation of consensus sequences. The circle-sequencing protocol precedes standard library preparations and is therefore suitable for a broad range of sequencing applications. We tested our method using the Illumina MiSeq platform and obtained errors in our processed sequencing reads at a rate as low as 7.6 × 10^-6 per base sequenced, dramatically improving the error rate of Illumina sequencing and putting error on par with low-throughput, but highly accurate, Sanger sequencing. Circle sequencing also had substantially higher efficiency and lower cost than existing barcode-based schemes for correcting sequencing errors. PMID:24243955
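
    The consensus step described above can be sketched in a few lines of Python. This is a simplified illustration rather than the published pipeline: it assumes the repeat length (`unit_len`) is already known and uses a plain majority vote, whereas a real implementation would have to infer the period from the read and would likely weigh base qualities. The function name `consensus_from_tandem` is hypothetical.

```python
from collections import Counter


def consensus_from_tandem(read, unit_len):
    """Majority-vote consensus over tandem copies of length unit_len."""
    copies = [read[i:i + unit_len] for i in range(0, len(read), unit_len)]
    # Drop a trailing partial copy so every column has the same depth.
    copies = [c for c in copies if len(c) == unit_len]
    consensus = []
    for column in zip(*copies):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Three linked copies of "ACGTTG"; the middle copy carries one base-call error.
read = "ACGTTG" + "ACGTAG" + "ACGTTG"
print(consensus_from_tandem(read, 6))  # ACGTTG
```

    With three copies, an error that occurs in only one copy is outvoted by the other two, which is the intuition behind the drop in error rate reported in the abstract.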

  10. Comparison of DNA fragments as donor DNAs upon sequence conversion of cleaved target DNA.

    Science.gov (United States)

    Suzuki, Tetsuya; Imada, Takashi; Komatsu, Yasuo; Kamiya, Hiroyuki

    2017-06-03

    Pinpoint sequence alteration (genome editing) by the combination of the site-specific cleavage of a target DNA and a donor nucleic acid has attracted much attention and the sequence of the target DNA is expected to be changed to that of a donor nucleic acid. In most cases, oligodeoxyribonucleotides (ODNs) and plasmid DNAs have been used as donors. However, a several hundred-base single-stranded (ss) DNA fragment and a 5'-tailed duplex (TD) accomplished the desired sequence changes without DNA cleavage, and might serve as better donors for the cleaved target DNA than ODNs and plasmid DNAs. In this study, sequence conversion efficiencies were compared with various donor DNAs in model sequence alteration experiments, using episomal DNA. The efficiencies with the ss and TD fragments were higher than those with the ODN and plasmid DNA. The sequence change by the TD seemed somewhat less efficient but slightly more accurate than that by the ss DNA fragment. These results suggested that the ss and TD fragments are better donors for targeted sequence alteration.

  11. Sequencing of adenine in DNA by scanning tunneling microscopy

    Science.gov (United States)

    Tanaka, Hiroyuki; Taniguchi, Masateru

    2017-08-01

    The development of DNA sequencing technology utilizing the detection of a tunnel current is important for next-generation sequencer technologies based on single-molecule analysis technology. Using a scanning tunneling microscope, we previously reported that dI/dV measurements and dI/dV mapping revealed that the guanine base (purine base) of DNA adsorbed onto the Cu(111) surface has a characteristic peak at V_s = -1.6 V. If, in addition to guanine, the other purine base of DNA, namely, adenine, can be distinguished, then by reading all the purine bases of each single strand of a DNA double helix, the entire base sequence of the original double helix can be determined due to the complementarity of the DNA base pair. Therefore, the ability to read adenine is important from the viewpoint of sequencing. Here, we report on the identification of adenine by STM topographic and spectroscopic measurements using a synthetic DNA oligomer and viral DNA.

  12. cDNA cloning and sequencing of ostrich Growth hormone

    Directory of Open Access Journals (Sweden)

    Doosti Abbas

    2012-01-01

    In recent years, industrial breeding of ostrich (Struthio camelus) has been widely developed in Iran. Growth hormone (GH) is a peptide hormone that stimulates growth and cell reproduction in different animals. The aim of this study was to clone and sequence the ostrich growth hormone gene in E. coli, done for the first time in Iran. The cDNA that encodes ostrich growth hormone was isolated from total mRNA of the pituitary gland and amplified by RT-PCR using GH-specific PCR primers. The GH cDNA was then cloned by the T/A cloning technique and the construct was transformed into E. coli. Finally, the GH cDNA sequence was submitted to GenBank (accession number: JN559394). The results of the present study showed that GH cDNA was successfully cloned in E. coli. Sequencing confirmed that GH cDNA was cloned and that the length of ostrich GH cDNA was 672 bp; a BLAST search showed that the sequence of growth hormone cDNA of the ostrich from Iran has 100% homology with other records existing in GenBank.

  13. Sequence-Dependent Persistence Length of Long DNA

    Science.gov (United States)

    Chuang, Hui-Min; Reifenberger, Jeffrey G.; Cao, Han; Dorfman, Kevin D.

    2017-12-01

    Using a high-throughput genome-mapping approach, we obtained circa 50 million measurements of the extension of internal human DNA segments in a 41 nm × 41 nm nanochannel. The underlying DNA sequences, obtained by mapping to the reference human genome, are 2.5–393 kilobase pairs long and contain percent GC contents between 32.5% and 60%. Using Odijk's theory for a channel-confined wormlike chain, these data reveal that the DNA persistence length increases by almost 20% as the percent GC content increases. The increased persistence length is rationalized by a model, containing no adjustable parameters, that treats the DNA as a statistical terpolymer with a sequence-dependent intrinsic persistence length and a sequence-independent electrostatic persistence length.

  14. Thermodynamics of sequence-specific binding of PNA to DNA

    DEFF Research Database (Denmark)

    Ratilainen, T; Holmén, A; Tuite, E

    2000-01-01

    For further characterization of the hybridization properties of peptide nucleic acids (PNAs), the thermodynamics of hybridization of mixed sequence PNA-DNA duplexes have been studied. We have characterized the binding of PNA to DNA in terms of binding affinity (perfectly matched duplexes) and seq... relative to that of the perfectly matched sequence with a corresponding free energy penalty of about 15 kJ mol(-1) bp(-1). The average cost of a single mismatch is therefore estimated to be on the order of or larger than the gain of two matched base pairs, resulting in an apparent binding constant of only...

  15. Reduced-stringency DNA reassociation: sequence specific duplex formation.

    OpenAIRE

    Burr, H E; Schimke, R T

    1982-01-01

    Reduced-stringency DNA reassociation conditions allow low stability duplexes to be detected in prokaryotic, plant, fish, avian, mammalian, and primate genomes. Highly diverged families of sequences can be detected in avian, mouse, and human unique sequence DNAs. Such a family has been described among twelve species of birds; based on species specific melting profiles and fractionation of sequences belonging to this family, it was concluded that permissive reassociation conditions did not arti...

  16. Efficient depletion of host DNA contamination in malaria clinical sequencing.

    Science.gov (United States)

    Oyola, Samuel O; Gu, Yong; Manske, Magnus; Otto, Thomas D; O'Brien, John; Alcock, Daniel; Macinnis, Bronwyn; Berriman, Matthew; Newbold, Chris I; Kwiatkowski, Dominic P; Swerdlow, Harold P; Quail, Michael A

    2013-03-01

    The cost of whole-genome sequencing (WGS) is decreasing rapidly as next-generation sequencing technology continues to advance, and the prospect of making WGS available for public health applications is becoming a reality. So far, a number of studies have demonstrated the use of WGS as an epidemiological tool for typing and controlling outbreaks of microbial pathogens. Success of these applications is hugely dependent on efficient generation of clean genetic material that is free from host DNA contamination for rapid preparation of sequencing libraries. The presence of large amounts of host DNA severely affects the efficiency of characterizing pathogens using WGS and is therefore a serious impediment to clinical and epidemiological sequencing for health care and public health applications. We have developed a simple enzymatic treatment method that takes advantage of the methylation of human DNA to selectively deplete host contamination from clinical samples prior to sequencing. Using malaria clinical samples with over 80% human host DNA contamination, we show that the enzymatic treatment enriches Plasmodium falciparum DNA up to ∼9-fold and generates high-quality, nonbiased sequence reads covering >98% of 86,158 catalogued typeable single-nucleotide polymorphism loci.

  17. Directed PCR-free engineering of highly repetitive DNA sequences

    Directory of Open Access Journals (Sweden)

    Preissler Steffen

    2011-09-01

    Background: Highly repetitive nucleotide sequences are commonly found in nature, e.g. in telomeres, microsatellite DNA, polyadenine (poly(A)) tails of eukaryotic messenger RNA, as well as in several inherited human disorders linked to trinucleotide repeat expansions in the genome. Therefore, studying repetitive sequences is of biological, biotechnological and medical relevance. However, cloning of such repetitive DNA sequences is challenging because specific PCR-based amplification is hampered by the lack of unique primer binding sites, resulting in unspecific products. Results: For the PCR-free generation of repetitive DNA sequences we used antiparallel oligonucleotides flanked by restriction sites of Type IIS endonucleases. The arrangement of recognition sites allowed for stepwise and seamless elongation of repetitive sequences. This facilitated the assembly of repetitive DNA segments and open reading frames encoding polypeptides with periodic amino acid sequences of any desired length. By this strategy we cloned a series of polyglutamine encoding sequences as well as highly repetitive polyadenine tracts. Such repetitive sequences can be used for diverse biotechnological applications. As an example, the polyglutamine sequences were expressed as His6-SUMO fusion proteins in Escherichia coli cells to study their aggregation behavior in vitro. The His6-SUMO moiety enabled affinity purification of the polyglutamine proteins, increased their solubility, and allowed controlled induction of the aggregation process. We successfully purified the fusion proteins and provide an example for their applicability in filter retardation assays. Conclusion: Our seamless cloning strategy is PCR-free and allows the directed and efficient generation of highly repetitive DNA sequences of defined lengths by simple standard cloning procedures.

  18. Low fluorescence background electroblotting membrane for DNA sequencing.

    Science.gov (United States)

    Chu, T J; Caldwell, K D; Weiss, R B; Gesteland, R F; Pitt, W G

    1992-03-01

    A low fluorescence background polypropylene (PP) membrane has been developed for ultimate use as an electroblotting membrane in DNA sequencing based on fluorescence detection. The DNA binding capacity of this membrane is improved by a surface modification using radio frequency plasma discharge (RFPD) in ammonia gas. The RFPD operational parameters are evaluated both in terms of membrane nitrogen content and in terms of the product's capacity for binding radioisotope-labeled DNA fragments. The surface morphologies of the derivatized membranes are examined by scanning electron microscopy; their mechanical and electrical properties, which are important for the subsequent sequencing procedures, are likewise established. Due to the goal of developing a membrane suitable for multiplex processing, in which the electroblotted DNA must withstand dozens of hybridization/stripping cycles, special attention is given to the covalent attachment of DNA to the membrane. The modified PP membrane is evaluated in a multiplex sequencing application using radioisotope-labeled DNA probes, and found to yield somewhat better binding of a given amount of electroblotted DNA than the commonly used GeneScreen membrane. A tenfold repetition of the probing indicates little loss of signal; the membrane-bound DNA is stable upon storage and shows no detectable loss in probing efficiency after one month.

  19. Bioinformatics analysis of circulating cell-free DNA sequencing data.

    Science.gov (United States)

    Chan, Landon L; Jiang, Peiyong

    2015-10-01

    The discovery of cell-free DNA molecules in plasma has opened up numerous opportunities in noninvasive diagnosis. Cell-free DNA molecules have become increasingly recognized as promising biomarkers for detection and management of many diseases. The advent of next generation sequencing has provided unprecedented opportunities to scrutinize the characteristics of cell-free DNA molecules in plasma in a genome-wide fashion and at single-base resolution. Consequently, clinical applications of circulating cell-free DNA analysis have not only revolutionized noninvasive prenatal diagnosis but also facilitated cancer detection and monitoring toward an era of blood-based personalized medicine. With the remarkably increasing throughput and lowering cost of next generation sequencing, bioinformatics analysis becomes increasingly demanding to understand the large amount of data generated by these sequencing platforms. In this Review, we highlight the major bioinformatics algorithms involved in the analysis of cell-free DNA sequencing data. Firstly, we briefly describe the biological properties of these molecules and provide an overview of the general bioinformatics approach for the analysis of cell-free DNA. Then, we discuss the specific upstream bioinformatics considerations concerning the analysis of sequencing data of circulating cell-free DNA, followed by further detailed elaboration on each key clinical situation in noninvasive prenatal diagnosis and cancer management where downstream bioinformatics analysis is heavily involved. We also discuss bioinformatics analysis as well as clinical applications of the newly developed massively parallel bisulfite sequencing of cell-free DNA. Finally, we offer our perspectives on the future development of bioinformatics in noninvasive diagnosis. Copyright © 2015 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.

  20. Identification of repeats in DNA sequences using nucleotide distribution uniformity.

    Science.gov (United States)

    Yin, Changchuan

    2017-01-07

    Repetitive elements are important in genomic structures, functions and regulations, yet effective methods in precisely identifying repetitive elements in DNA sequences are not fully accessible, and the relationship between repetitive elements and periodicities of genomes is not clearly understood. We present an ab initio method to quantitatively detect repetitive elements and infer the consensus repeat pattern in repetitive elements. The method uses the measure of the distribution uniformity of nucleotides at periodic positions in DNA sequences or genomes. It can identify periodicities, consensus repeat patterns, copy numbers and perfect levels of repetitive elements. The results of using the method on different DNA sequences and genomes demonstrate efficacy and accuracy in identifying repeat patterns and periodicities. The complexity of the method is linear with respect to the lengths of the analyzed sequences. The Python programs in this study are freely available to the public upon request or at https://github.com/cyinbox/DNADU. Copyright © 2016 Elsevier Ltd. All rights reserved.

  1. Palindromic sequence artifacts generated during next generation sequencing library preparation from historic and ancient DNA.

    Directory of Open Access Journals (Sweden)

    Bastiaan Star

    Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5' and 3'-ends of sequencing reads. The palindromic sequences themselves have specific properties: the bases at the 5'-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3'-end. The terminal 3' bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3'-end of DNA strands, with the 5'-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias.
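
    Reads carrying this kind of artifact, reverse complementary sequence at the two ends, can be flagged with a simple screen. The Python sketch below is illustrative only and is not the authors' pipeline: the function names, the exact-match criterion and the `min_len` threshold are assumptions, the example read is made up, and real data would probably need to tolerate mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(COMPLEMENT)[::-1]


def terminal_palindrome_length(read, min_len=4):
    """Longest k >= min_len with read[-k:] == revcomp(read[:k]), else 0."""
    max_k = len(read) // 2
    for k in range(max_k, min_len - 1, -1):
        if read[-k:] == reverse_complement(read[:k]):
            return k
    return 0


# Made-up read: 8-base 5' end, 3-base interruption, then the reverse
# complement of the 5' end appended at the 3' end.
five_prime = "AACCGGTA"
read = five_prime + "CTC" + reverse_complement(five_prime)
print(terminal_palindrome_length(read))            # 8
print(terminal_palindrome_length("ACGTACGTAAAA"))  # 0
```

    Reads with a non-zero result could then be trimmed at the 3'-end or set aside before alignment; which of the two is appropriate is a design decision this sketch does not settle.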

  2. Isolation and enrichment of Cryptosporidium DNA and verification of DNA purity for whole-genome sequencing.

    Science.gov (United States)

    Guo, Yaqiong; Li, Na; Lysén, Colleen; Frace, Michael; Tang, Kevin; Sammons, Scott; Roellig, Dawn M; Feng, Yaoyu; Xiao, Lihua

    2015-02-01

    Whole-genome sequencing of Cryptosporidium spp. is hampered by difficulties in obtaining sufficient, highly pure genomic DNA from clinical specimens. In this study, we developed procedures for the isolation and enrichment of Cryptosporidium genomic DNA from fecal specimens and verification of DNA purity for whole-genome sequencing. The isolation and enrichment of genomic DNA were achieved by a combination of three oocyst purification steps and whole-genome amplification (WGA) of DNA from purified oocysts. Quantitative PCR (qPCR) analysis of WGA products was used as an initial quality assessment of amplified genomic DNA. The purity of WGA products was assessed by Sanger sequencing of cloned products. Next-generation sequencing tools were used in final evaluations of genome coverage and of the extent of contamination. Altogether, 24 fecal specimens of Cryptosporidium parvum, C. hominis, C. andersoni, C. ubiquitum, C. tyzzeri, and Cryptosporidium chipmunk genotype I were processed with the procedures. As expected, WGA products with low (sequences in Sanger sequencing. The cloning-sequencing analysis, however, showed significant contamination in 5 WGA products (proportion of positive colonies derived from Cryptosporidium genomic DNA, ≤25%). Following this strategy, 20 WGA products from six Cryptosporidium species or genotypes with low (mostly sequencing, generating sequence data covering 94.5% to 99.7% of Cryptosporidium genomes, with mostly minor contamination from bacterial, fungal, and host DNA. These results suggest that the described strategy can be used effectively for the isolation and enrichment of Cryptosporidium DNA from fecal specimens for whole-genome sequencing. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  3. PCR primers for metazoan mitochondrial 12S ribosomal DNA sequences.

    Directory of Open Access Journals (Sweden)

    Ryuji J Machida

    BACKGROUND: Assessment of the biodiversity of communities of small organisms is most readily done using PCR-based analysis of environmental samples consisting of mixtures of individuals. Known as metagenetics, this approach has transformed understanding of microbial communities and is beginning to be applied to metazoans as well. Unlike microbial studies, where analysis of the 16S ribosomal DNA sequence is standard, the best gene for metazoan metagenetics is less clear. In this study we designed a set of PCR primers for the mitochondrial 12S ribosomal DNA sequence based on 64 complete mitochondrial genomes and then tested their efficacy. METHODOLOGY/PRINCIPAL FINDINGS: A total of 64 complete mitochondrial genome sequences representing all metazoan classes available in GenBank were downloaded using the NCBI Taxonomy Browser. Alignment of sequences was performed for the excised mitochondrial 12S ribosomal DNA sequences, and conserved regions were identified for all 64 mitochondrial genomes. These regions were used to design a primer pair that flanks a more variable region in the gene. Then all of the complete metazoan mitochondrial genomes available in NCBI's Organelle Genome Resources database were used to determine the percentage of taxa that would likely be amplified using these primers. Results suggest that these primers will amplify target sequences for many metazoans. CONCLUSIONS/SIGNIFICANCE: Newly designed 12S ribosomal DNA primers have considerable potential for metazoan metagenetic analysis because of their ability to amplify sequences from many metazoans.

  4. Picidae in the European fossil, subfossil and recent bird faunas and their osteological characteristics

    Directory of Open Access Journals (Sweden)

    Kessler Jenő (Eugen)

    2016-06-01

    Full Text Available This paper presents the European fossil, subfossil and recent representatives of the Picidae family. Following the list of fossil and subfossil remains, the author analyzes and presents images of the osteological characteristics of the order’s 10 recent European species.

  5. Probabilistic models for semisupervised discriminative motif discovery in DNA sequences.

    Science.gov (United States)

    Kim, Jong Kyoung; Choi, Seungjin

    2011-01-01

    Methods for discriminative motif discovery in DNA sequences identify transcription factor binding sites (TFBSs), searching only for patterns that differentiate two sets (positive and negative sets) of sequences. On one hand, discriminative methods increase the sensitivity and specificity of motif discovery, compared to generative models. On the other hand, generative models can easily exploit unlabeled sequences to better detect functional motifs when labeled training samples are limited. In this paper, we develop a hybrid generative/discriminative model which enables us to make use of unlabeled sequences in the framework of discriminative motif discovery, leading to semisupervised discriminative motif discovery. Numerical experiments on yeast ChIP-chip data for discovering DNA motifs demonstrate that the best performance is obtained between the purely generative and the purely discriminative extremes, and that semisupervised learning improves performance when labeled sequences are limited.

  6. High-throughput sequencing in mitochondrial DNA research.

    Science.gov (United States)

    Ye, Fei; Samuels, David C; Clark, Travis; Guo, Yan

    2014-07-01

    Next-generation sequencing, also known as high-throughput sequencing, has greatly enhanced researchers' ability to conduct biomedical research on all levels. Mitochondrial research has also benefitted greatly from high-throughput sequencing; sequencing technology now allows for screening of all 16,569 base pairs of the mitochondrial genome simultaneously for SNPs and low level heteroplasmy and, in some cases, the estimation of mitochondrial DNA copy number. It is important to realize the full potential of high-throughput sequencing for the advancement of mitochondrial research. To this end, we review how high-throughput sequencing has impacted mitochondrial research in the categories of SNPs, low level heteroplasmy, copy number, and structural variants. We also discuss the different types of mitochondrial DNA sequencing and their pros and cons. Based on previous studies conducted by various groups, we provide strategies for processing mitochondrial DNA sequencing data, including assembly, variant calling, and quality control. Copyright © 2014 Elsevier B.V. and Mitochondria Research Society. All rights reserved.

  7. Improved Algorithm for Analysis of DNA Sequences Using Multiresolution Transformation

    Directory of Open Access Journals (Sweden)

    T. M. Inbamalar

    2015-01-01

    Full Text Available Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solution with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP) representation. Then discrete wavelet transformation is taken. Absolute value of the energy is found followed by proper threshold. The test is conducted using the data bases available in the National Centre for Biotechnology Information (NCBI) site. The comparative analysis is done and it ensures the efficiency of the proposed system.

  8. Improved algorithm for analysis of DNA sequences using multiresolution transformation.

    Science.gov (United States)

    Inbamalar, T M; Sivakumar, R

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solution with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP) representation. Then discrete wavelet transformation is taken. Absolute value of the energy is found followed by proper threshold. The test is conducted using the data bases available in the National Centre for Biotechnology Information (NCBI) site. The comparative analysis is done and it ensures the efficiency of the proposed system.

  9. A fast algorithm for exonic regions prediction in DNA sequences.

    Science.gov (United States)

    Saberkari, Hamidreza; Shamsi, Mousa; Heravi, Hamed; Sedaaghi, Mohammad Hossein

    2013-07-01

    The main purpose of this paper is to introduce a fast method for gene prediction in DNA sequences based on the period-3 property in exons. First, the symbolic DNA sequences were converted to a digital signal using the electron ion interaction potential method. Then, to reduce the effect of background noise in the period-3 spectrum, we used the discrete wavelet transform at three levels and applied it to the input digital signal. Finally, the Goertzel algorithm was used to extract period-3 components in the filtered DNA sequence. The proposed algorithm reduces computational complexity and hence increases the speed of the process. Accurate detection of small exons in DNA sequences is another advantage of the algorithm. The ability of the proposed algorithm to predict exons was compared with several existing methods at the nucleotide level using: (i) specificity - sensitivity values; (ii) receiver operating characteristic (ROC) curves; and (iii) area under the ROC curve. Simulation results confirmed that the proposed method can be used as a promising tool for exon prediction in DNA sequences.
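    Two of the steps named in this record, the EIIP numeric mapping and the Goertzel extraction of the period-3 component, can be sketched as below. This is an illustrative sketch only: the wavelet denoising step is omitted, the function names are hypothetical, and the 351-base window is an assumed default rather than a value taken from the record. The EIIP values are the ones commonly quoted for the four nucleotides.

```python
import math

# EIIP values commonly quoted for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_signal(seq):
    """Map a DNA string to its numeric EIIP sequence."""
    return [EIIP[base] for base in seq.upper()]


def goertzel_power(samples, k):
    """Squared magnitude of DFT bin k of `samples` via the Goertzel recurrence."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def period3_profile(seq, window=351, step=3):
    """Period-3 power in sliding windows (assumed window length, multiple of 3)."""
    if window % 3 != 0:
        raise ValueError("window must be a multiple of 3")
    signal = eiip_signal(seq)
    k = window // 3  # DFT bin corresponding to a period of 3 bases
    return [goertzel_power(signal[i:i + window], k)
            for i in range(0, len(signal) - window + 1, step)]


# A strictly periodic toy sequence carries power at period 3; a homopolymer does not.
print(goertzel_power(eiip_signal("ATG" * 40), 40))
print(goertzel_power(eiip_signal("A" * 120), 40))
```

    Goertzel evaluates a single DFT bin in one pass over the window, which is why it is cheaper than a full FFT when only the N/3 component is needed.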

  10. Predicting target DNA sequences of DNA-binding proteins based on unbound structures.

    Directory of Open Access Journals (Sweden)

    Chien-Yu Chen

    Full Text Available DNA-binding proteins such as transcription factors use DNA-binding domains (DBDs) to bind to specific sequences in the genome to initiate many important biological functions. Accurate prediction of such target sequences, often represented by position weight matrices (PWMs), is an important step to understand many biological processes. Recent studies have shown that knowledge-based potential functions can be applied on protein-DNA co-crystallized structures to generate PWMs that are considerably consistent with experimental data. However, this success has not been extended to DNA-binding proteins lacking co-crystallized structures. This study aims at investigating the possibility of predicting the DNA sequences bound by DNA-binding proteins from the proteins' unbound structures (structures of the unbound state). Given an unbound query protein and a template complex, the proposed method first employs structure alignment to generate synthetic protein-DNA complexes for the query protein. Once a complex is available, an atomic-level knowledge-based potential function is employed to predict PWMs characterizing the sequences to which the query protein can bind. The evaluation of the proposed method is based on seven DNA-binding proteins, which have structures of both DNA-bound and unbound forms for prediction as well as annotated PWMs for validation. Since this work is the first attempt to predict target sequences of DNA-binding proteins from their unbound structures, three types of structural variations that presumably influence the prediction accuracy were examined and discussed. Based on the analyses conducted in this study, the conformational change of proteins upon binding DNA was shown to be the key factor. This study sheds light on the challenge of predicting the target DNA sequences of a protein lacking co-crystallized structures, which encourages more efforts on the structure alignment-based approaches in addition to docking- and homology

  11. How effective is graphene nanopore geometry on DNA sequencing?

    CERN Document Server

    Satarifard, Vahid; Ejtehadi, Mohammad Reza

    2015-01-01

    In this paper we investigate the effects of graphene nanopore geometry on homopolymer ssDNA pulling process through nanopore using steered molecular dynamic (SMD) simulations. Different graphene nanopores are examined including axially symmetric and asymmetric monolayer graphene nanopores as well as five layer graphene polyhedral crystals (GPC). The pulling force profile, moving fashion of ssDNA, work done in irreversible DNA pulling and orientations of DNA bases near the nanopore are assessed. Simulation results demonstrate the strong effect of the pore shape as well as geometrical symmetry on free energy barrier, orientations and dynamic of DNA translocation through graphene nanopore. Our study proposes that the symmetric circular geometry of monolayer graphene nanopore with high pulling velocity can be used for DNA sequencing.

  12. Development of Active DNA Control Technique for DNA Sequencer With a Solid-state Nanopore

    Science.gov (United States)

    Akahori, Rena; Harada, Kunio; Goto, Yusuke; Yanagi, Itaru; Yokoi, Takahide; Oura, Takeshi; Shibahara, Masashi; Takeda, Ken-Ichi

    We have developed a technique that can control the arbitrary speeds of DNA passing through a solid-state nanopore of a DNA sequencer. For this active DNA control technique, we used a DNA-immobilized Si probe, larger than the membrane with a nanopore, and used a piezoelectric actuator and stepper motor to drive the probe. This probe enables a user to adjust the relative position between the nanopore and DNA immobilized on the probe without the need for precise lateral control. In this presentation, we demonstrate how DNA (block copolymer ([(dT)25-(dC)25-(dA)50]m)), immobilized on the probe, slid through a nanopore and was pulled out using the active DNA control technique. As the DNA-immobilized probe was being pulled out, we obtained various ion-current signal levels corresponding to the number of different nucleotides in a single strand of DNA.

  13. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Directory of Open Access Journals (Sweden)

    Baldwin Stephen A

    2011-03-01

    Full Text Available Background: Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results: The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions: PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  14. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities.

    Science.gov (United States)

    Troshin, Peter V; Postis, Vincent Lg; Ashworth, Denise; Baldwin, Stephen A; McPherson, Michael J; Barton, Geoffrey J

    2011-03-07

    Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  15. VoSeq: a voucher and DNA sequence web application.

    Science.gov (United States)

    Peña, Carlos; Malm, Tobias

    2012-01-01

    There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  16. Dialects of the DNA uptake sequence in Neisseriaceae.

    Directory of Open Access Journals (Sweden)

    Stephan A Frye

    2013-04-01

    Full Text Available In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS-dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5'-CTG-3' is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS-dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic

  17. Dialects of the DNA Uptake Sequence in Neisseriaceae

    Science.gov (United States)

    Frye, Stephan A.; Nilsen, Mariann; Tønjum, Tone; Ambur, Ole Herman

    2013-01-01

    In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS–dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5′-CTG-3′ is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS–dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic transformation

  18. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.

    2008-01-01

    We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data that the method outperforms Blast searches as a measure of confidence and can help eliminate 80% of all false assignment based on best Blast hit. However, the most important advance of the method is that it provides statistically meaningful measures of confidence. We apply the method to a re-analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are comprised of a combination of both Neanderthal and modern human DNA.

  19. Multiplexed DNA sequence capture of mitochondrial genomes using PCR products.

    Directory of Open Access Journals (Sweden)

    Tomislav Maricic

    Full Text Available BACKGROUND: To utilize the power of high-throughput sequencers, target enrichment methods have been developed. The majority of these require reagents and equipment that are only available from commercial vendors and are not suitable for the targets that are a few kilobases in length. METHODOLOGY/PRINCIPAL FINDINGS: We describe a novel and economical method in which custom made long-range PCR products are used to capture complete human mitochondrial genomes from complex DNA mixtures. We use the method to capture 46 complete mitochondrial genomes in parallel and we sequence them on a single lane of an Illumina GA(II) instrument. CONCLUSIONS/SIGNIFICANCE: This method is economical and simple and particularly suitable for targets that can be amplified by PCR and do not contain highly repetitive sequences such as mtDNA. It has applications in population genetics and forensics, as well as studies of ancient DNA.

  20. Noninvasive prenatal paternity testing (NIPAT) through maternal plasma DNA sequencing

    DEFF Research Database (Denmark)

    Jiang, Haojun; Xie, Yifan; Li, Xuchao

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have already been used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we developed a noninvasive prenatal paternity testing (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influence factors (minor allele frequency (MAF), the number of total SNP, fetal fraction and effective sequencing depth) and designed three different selective SNP panels...... paternity test using STR multiplex system. Our study here proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future.

  1. Clinical DNA Sequencer for Ultra-Low Cost Testing

    Science.gov (United States)

    Church, George; Olejnik, Jerzy; Werner, Martina; Guggenheim, Evan; DiMeo, James; Marma, Mong Sano; Visalakshi, Visa; Hagerott, Thomas; Golaski, Edmund; Veatch, Philip; Stoops, David; Gordon, Steven

    2012-01-01

    We present a new sequencing instrument, the MINI, for sequencing DNA in the clinic or core research laboratory. Unlike all other DNA sequencing systems, which run only one or two samples at a time, the MINI can simultaneously run any number of flow cells between one and twenty. Each flow cell is designed to be disposable, low-cost and use very little reagent; thus, DNA from a single patient or specimen may be cost effectively sequenced without the need for indexing multiple samples in a single flow cell. This is an important feature for the clinic, as in addition to simplifying the sample preparation process, different samples may be kept physically separate (meters) from one another, thereby significantly reducing the chance of contamination or false diagnosis. Low cost (about $100 per sequencing test) is achieved through a unique sequencing by synthesis chemistry and low reagent consumption. Parallel flow cell processing and fluidics design results in high throughput (tens of tests per day). In addition to sequence-based clinical testing, the system supports targeted resequencing up to an exome per flow cell. Read lengths are driven by application requirements and are between 35 and 100 bp.

  2. Comment on "Linguistic features of noncoding DNA sequences"

    CERN Document Server

    Israeloff, N E; Kagalenko, M; Chan, K

    1995-01-01

    In a recent Physical Review Letter, Mantegna et al. report that certain statistical signatures of natural language can be found in non-coding DNA sequences. In this comment we show that random noise with power-law correlation, similar to 1/f noise, exhibits the same "linguistic" signatures as those found in non-coding DNA. We conclude that these signatures cannot distinguish languages from noise.

  3. Anaplasma phagocytophilum in Danish sheep: confirmation by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Thamsborg Stig M

    2009-12-01

    Full Text Available Background: The presence of Anaplasma phagocytophilum, an Ixodes ricinus transmitted bacterium, was investigated in two flocks of Danish grazing lambs. Direct PCR detection was performed on DNA extracted from blood and serum with subsequent confirmation by DNA sequencing. Methods: 31 samples obtained from clinically normal lambs in 2000 from Fussingø, Jutland and 12 samples from ten lambs and two ewes from a clinical outbreak at Feddet, Zealand in 2006 were included in the study. Some of the animals from Feddet had shown clinical signs of polyarthritis and general unthriftiness prior to sampling. DNA extraction was optimized from blood and serum and detection achieved by a 16S rRNA targeted PCR with verification of the product by DNA sequencing. Results: Five DNA extracts were found positive by PCR, including two samples from 2000 and three from 2006. For both series of samples the product was verified as A. phagocytophilum by DNA sequencing. Conclusions: A. phagocytophilum was detected by molecular methods for the first time in Danish grazing lambs during the two seasons investigated (2000 and 2006).

  4. Defining the sequence requirements for the positioning of base J in DNA using SMRT sequencing.

    Science.gov (United States)

    Genest, Paul-Andre; Baugh, Loren; Taipale, Alex; Zhao, Wanqi; Jan, Sabrina; van Luenen, Henri G A M; Korlach, Jonas; Clark, Tyson; Luong, Khai; Boitano, Matthew; Turner, Steve; Myler, Peter J; Borst, Piet

    2015-02-27

    Base J (β-D-glucosyl-hydroxymethyluracil) replaces 1% of T in the Leishmania genome and is only found in telomeric repeats (99%) and in regions where transcription starts and stops. This highly restricted distribution must be co-determined by the thymidine hydroxylases (JBP1 and JBP2) that catalyze the initial step in J synthesis. To determine the DNA sequences recognized by JBP1/2, we used SMRT sequencing of DNA segments inserted into plasmids grown in Leishmania tarentolae. We show that SMRT sequencing recognizes base J in DNA. Leishmania DNA segments that normally contain J also picked up J when present in the plasmid, whereas control sequences did not. Even a segment of only 10 telomeric (GGGTTA) repeats was modified in the plasmid. We show that J modification usually occurs at pairs of Ts on opposite DNA strands, separated by 12 nucleotides. Modifications occur near G-rich sequences capable of forming G-quadruplexes and JBP2 is needed, as it does not occur in JBP2-null cells. We propose a model whereby de novo J insertion is mediated by JBP2. JBP1 then binds to J and hydroxylates another T 13 bp downstream (but not upstream) on the complementary strand, allowing JBP1 to maintain existing J following DNA replication. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
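    The spacing rule reported in this record (pairs of Ts on opposite strands separated by 12 nucleotides, i.e. a T 13 bp downstream on the complementary strand) can be turned into a simple candidate-site scan. The sketch below is illustrative only: the function name is hypothetical, and it lists positions that satisfy the spacing rule without modelling G-quadruplex context or JBP1/JBP2 activity.

```python
def opposite_strand_t_pairs(top_strand, spacer=12):
    """Return (i, j) pairs with a T at i on the top strand and a T at j on the bottom.

    A T on the bottom strand appears as an A in the top-strand sequence.
    `spacer` counts the nucleotides lying between the two Ts, so j = i + spacer + 1.
    Positions are 0-based indices into `top_strand`.
    """
    seq = top_strand.upper()
    offset = spacer + 1
    pairs = []
    for i in range(len(seq) - offset):
        if seq[i] == "T" and seq[i + offset] == "A":
            pairs.append((i, i + offset))
    return pairs


# Ten telomeric repeats, the shortest modified segment mentioned in the record.
telomeric = "GGGTTA" * 10
for top_t, bottom_t in opposite_strand_t_pairs(telomeric):
    print(top_t, bottom_t)
```

    Because a bottom-strand T is read as an A on the top strand, a single left-to-right pass over one strand is enough to enumerate every pair that fits the rule.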

  5. Identification of Bacterial Species in Kuwaiti Waters Through DNA Sequencing

    Science.gov (United States)

    Chen, K.

    2017-01-01

    With the objective of identifying the bacterial diversity associated with the ecosystems of various Kuwaiti seas, bacteria were cultured and isolated from 3 water samples. Owing to difficulties in culturing and isolating fecal coliforms on the selective agar plates, bacterial isolates from marine agar plates were selected for molecular identification. 16S rRNA genes were successfully amplified from the genomes of the selected isolates using universal eubacterial 16S rRNA primers. The resulting amplification products were subjected to automated DNA sequencing. The partial 16S rDNA sequences obtained were compared directly with sequences in the NCBI database using BLAST, as well as with the sequences available from the Ribosomal Database Project (RDP).

  6. DNA qualification workflow for next generation sequencing of histopathological samples.

    Directory of Open Access Journals (Sweden)

    Michele Simbolo

    Full Text Available Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard

  7. Restriction and Sequence Alterations Affect DNA Uptake Sequence-Dependent Transformation in Neisseria meningitidis

    Science.gov (United States)

    Ambur, Ole Herman; Frye, Stephan A.; Nilsen, Mariann; Hovland, Eirik; Tønjum, Tone

    2012-01-01

    Transformation is a complex process that involves several interactions from the binding and uptake of naked DNA to homologous recombination. Some actions affect transformation favourably whereas others act to limit it. Here, meticulous manipulation of a single type of transforming DNA allowed for quantifying the impact of three different mediators of meningococcal transformation: NlaIV restriction, homologous recombination and the DNA Uptake Sequence (DUS). In the wildtype, an inverse relationship between the transformation frequency and the number of NlaIV restriction sites in DNA was observed when the transforming DNA harboured a heterologous region for selection (ermC) but not when the transforming DNA was homologous with only a single nucleotide heterology. The influence of homologous sequence in transforming DNA was further studied using plasmids with a small interruption or larger deletions in the recombinogenic region and these alterations were found to impair transformation frequency. In contrast, a particularly potent positive driver of DNA uptake in Neisseria sp. are short DUS in the transforming DNA. However, the molecular mechanism(s) responsible for DUS specificity remains unknown. Increasing the number of DUS in the transforming DNA was here shown to exert a positive effect on transformation. Furthermore, an influence of variable placement of DUS relative to the homologous region in the donor DNA was documented for the first time. No effect of altering the orientation of DUS was observed. These observations suggest that DUS is important at an early stage in the recognition of DNA, but does not exclude the existence of more than one level of DUS specificity in the sequence of events that constitute transformation. New knowledge on the positive and negative drivers of transformation may in a larger perspective illuminate both the mechanisms and the evolutionary role(s) of one of the most conserved mechanisms in nature: homologous recombination. PMID

  8. Restriction and sequence alterations affect DNA uptake sequence-dependent transformation in Neisseria meningitidis.

    Directory of Open Access Journals (Sweden)

    Ole Herman Ambur

    Full Text Available Transformation is a complex process that involves several interactions from the binding and uptake of naked DNA to homologous recombination. Some actions affect transformation favourably whereas others act to limit it. Here, meticulous manipulation of a single type of transforming DNA allowed for quantifying the impact of three different mediators of meningococcal transformation: NlaIV restriction, homologous recombination and the DNA Uptake Sequence (DUS). In the wildtype, an inverse relationship between the transformation frequency and the number of NlaIV restriction sites in DNA was observed when the transforming DNA harboured a heterologous region for selection (ermC) but not when the transforming DNA was homologous with only a single nucleotide heterology. The influence of homologous sequence in transforming DNA was further studied using plasmids with a small interruption or larger deletions in the recombinogenic region and these alterations were found to impair transformation frequency. In contrast, a particularly potent positive driver of DNA uptake in Neisseria sp. are short DUS in the transforming DNA. However, the molecular mechanism(s) responsible for DUS specificity remains unknown. Increasing the number of DUS in the transforming DNA was here shown to exert a positive effect on transformation. Furthermore, an influence of variable placement of DUS relative to the homologous region in the donor DNA was documented for the first time. No effect of altering the orientation of DUS was observed. These observations suggest that DUS is important at an early stage in the recognition of DNA, but does not exclude the existence of more than one level of DUS specificity in the sequence of events that constitute transformation. New knowledge on the positive and negative drivers of transformation may in a larger perspective illuminate both the mechanisms and the evolutionary role(s) of one of the most conserved mechanisms in nature: homologous

  9. Massively parallel DNA sequencing: the new frontier in biogeography

    Directory of Open Access Journals (Sweden)

    Luiz A. Rocha

    2013-04-01

    Full Text Available The advent of Sanger sequencing represented a scientific break-through that greatly advanced biogeographic studies. However, this technology has several limitations that have hampered more advanced studies in the field. The development of novel techniques which more fully exploit the potential of Massively Parallel Sequencing (MPS) to deliver sequence data at a fraction of the cost of Sanger sequencing promises to revolutionize biogeographic studies. Approaches like Restriction-site Associated DNA sequencing (RADseq) and UltraConserved Element (UCE) sequencing enable the collection of unprecedented amounts of data for multi-locus studies of population genetics and phylogenetics respectively, which in turn can be used for biogeographic analysis. Here we review those and other methods related to MPS, and provide examples of how they can be used in tropical Atlantic biogeography.

  10. Alignment of DNA and protein sequences containing frameshift errors

    Energy Technology Data Exchange (ETDEWEB)

    Guan, X.; Uberbacher, E.C.

    1995-04-01

    Molecular sequences, like all experimental data, are subject to error. Many current DNA sequencing protocols have very significant error rates and often generate artifactual insertions and deletions of bases (indels) which corrupt the translation of sequences and compromise the detection of protein homologies. The impact of these errors on the utility of molecular sequence data is dependent on the analytic technique used to interpret the data. In the presence of frameshift errors, standard algorithms using six frame translation can miss important homologies because only sub-fragments of the correct translation are available in any given frame. We present a new algorithm which can detect and correct frameshift errors in DNA sequences during comparison of translated sequences with protein sequences in the databases. This algorithm can recognize homologous proteins sharing 30% identity even in the presence of a 7% frameshift error rate. Our algorithm uses dynamic programming, producing a guaranteed optimal alignment in the presence of frameshifts, and has a sensitivity equivalent to Smith-Waterman. The computational efficiency of the algorithm is O(nm) where n and m are the sizes of two sequences being compared. The algorithm does not rely on prior knowledge or heuristic rules and performs significantly better than any previously reported method.
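    The record above compares its algorithm to Smith-Waterman and quotes an O(nm) dynamic-programming cost. As a rough illustration of that baseline only, here is a minimal sketch of plain Smith-Waterman local alignment scoring with a linear gap penalty. It is not the frameshift-aware algorithm of Guan and Uberbacher; the function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between strings a and b.

    Plain Smith-Waterman with a linear gap penalty; the scoring values are
    illustrative assumptions, not taken from the record above.
    Time is O(len(a) * len(b)); memory is O(len(b)) because only two rows
    of the dynamic-programming table are kept.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

    The two nested loops over the sequences are where the O(nm) cost comes from; a frameshift-aware variant would add further transitions to each cell, which this sketch does not attempt.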

  11. Validation of human papillomavirus genotyping by signature DNA sequence analysis.

    Science.gov (United States)

    Lee, Sin Hang; Vigliotti, Veronica S; Vigliotti, Jessica S; Pappu, Suri

    2009-05-22

    Screening with combined cytologic and HPV testing has led to the highest number of excessive colposcopic referrals due to high false positive rates of the current HPV testing in the USA. How best to capitalize on the enhanced sensitivity of HPV DNA testing while minimizing false-positive results from its lower specificity is an important task for the clinical pathologists. The HPV L1 gene DNA in liquid-based Pap cytology specimens was initially amplified by the degenerate MY09/MY11 PCR primers and then re-amplified by the nested GP5+/GP6+ primers, or the heminested GP6/MY11, heminested GP5/MY09 primers or their modified equivalent without sample purification or DNA extraction. The nested PCR products were used for direct automated DNA sequencing. A 34- to 50-base sequence including the GP5+ priming site was selected as the signature sequence for routine genotyping by online BLAST sequence alignment algorithms. Of 3,222 specimens, 352 were found to contain HPV DNA, with 92% of the positive samples infected by only 1 of the 35 HPV genotypes detected and 8% by more than 1 HPV genotype. The most common genotype was HPV-16 (68 isolates), followed by HPV-52 (25 isolates). More than half (53.7%) of the total number of HPV isolates relied on a nested PCR for detection although the majority of HPV-16, -18, -31, -33, -35 and -58 isolates were detected by a single MY09/MY11 PCR. Alignment of a 34-base sequence downstream of the GP5+ site failed to distinguish some isolates of HPV-16, -31 and -33. Novel variants of HPV with less than "100% identities" signature sequence match with those stored in the GenBank database were also detected by signature DNA sequencing in this rural and suburban population of the United States. Laboratory staff must be familiar with the limitations of the consensus PCR primers, the locations of the signature sequence in the L1 gene for some HPV genotypes, and HPV genotype sequence variants in order to perform accurate HPV genotyping.

  12. Validation of human papillomavirus genotyping by signature DNA sequence analysis

    Directory of Open Access Journals (Sweden)

    Vigliotti Jessica S

    2009-05-01

    Full Text Available Abstract. Background: Screening with combined cytologic and HPV testing has led to the highest number of excessive colposcopic referrals due to high false positive rates of the current HPV testing in the USA. How best to capitalize on the enhanced sensitivity of HPV DNA testing while minimizing false-positive results from its lower specificity is an important task for the clinical pathologists. Methods: The HPV L1 gene DNA in liquid-based Pap cytology specimens was initially amplified by the degenerate MY09/MY11 PCR primers and then re-amplified by the nested GP5+/GP6+ primers, or the heminested GP6/MY11, heminested GP5/MY09 primers or their modified equivalent without sample purification or DNA extraction. The nested PCR products were used for direct automated DNA sequencing. A 34- to 50-base sequence including the GP5+ priming site was selected as the signature sequence for routine genotyping by online BLAST sequence alignment algorithms. Results: Of 3,222 specimens, 352 were found to contain HPV DNA, with 92% of the positive samples infected by only 1 of the 35 HPV genotypes detected and 8% by more than 1 HPV genotype. The most common genotype was HPV-16 (68 isolates), followed by HPV-52 (25 isolates). More than half (53.7%) of the total number of HPV isolates relied on a nested PCR for detection although the majority of HPV-16, -18, -31, -33, -35 and -58 isolates were detected by a single MY09/MY11 PCR. Alignment of a 34-base sequence downstream of the GP5+ site failed to distinguish some isolates of HPV-16, -31 and -33. Novel variants of HPV with less than "100% identities" signature sequence match with those stored in the GenBank database were also detected by signature DNA sequencing in this rural and suburban population of the United States. Conclusion: Laboratory staff must be familiar with the limitations of the consensus PCR primers, the locations of the signature sequence in the L1 gene for some HPV genotypes, and HPV genotype sequence

  13. DNA shotgun sequencing analysis of Garcinia mangostana L. variety Mesta

    Directory of Open Access Journals (Sweden)

    Syuhaidah Abu Bakar

    2017-06-01

    Full Text Available Mangosteen (Garcinia mangostana Linn.) is an ultra-tropical tree characterized by its unique dark purple fruits with white flesh. The xanthone-rich purple pericarp tissue contains valuable compounds with medicinal properties. Following previously reported genome sequencing of a common variety of mangosteen [1], we performed another whole genome sequencing of a commercially popular variety of this fruit species (var. Mesta) for comparative analysis of its genome composition. Raw reads of the DNA sequencing project were deposited to SRA database with the accession number SRX2709728.

  14. High-throughput DNA sequencing: a genomic data manufacturing process.

    Science.gov (United States)

    Huang, G M

    1999-01-01

    The progress trends in automated DNA sequencing operation are reviewed. Technological development in sequencing instruments, enzymatic chemistry and robotic stations has resulted in ever-increasing capacity of sequence data production. This progress leads to a higher demand on laboratory information management and data quality assessment. High-throughput laboratories face the challenge of organizational management, as well as technology management. Engineering principles of process control should be adopted in this biological data manufacturing procedure. While various systems attempt to provide solutions to automate different parts of, or even the entire process, new technical advances will continue to change the paradigm and provide new challenges.

  15. cDNA, genomic sequence cloning and overexpression of ribosomal ...

    African Journals Online (AJOL)

    RPS16 of eukaryote is a component of the 40S small ribosomal subunit encoded by RPS16 gene and is also a homolog of prokaryotic RPS9. The cDNA and genomic sequence of RPS16 was cloned successfully for the first time from the Giant Panda (Ailuropoda melanoleuca) using reverse transcription-polymerase chain ...

  16. cDNA, genomic sequence cloning and overexpression of ribosomal ...

    African Journals Online (AJOL)

    PRECIOUS

    2009-11-02

    RPS20 is a component of the 40S small ribosomal subunit encoded by RPS20 gene, which is conserved between eukaryotes, prokaryotes and archaebacteria. The cDNA and the genomic sequence of RPS20 were cloned successfully from the Giant Panda (Ailuropoda melanoleuca) using RT-PCR ...

  17. POSA : Perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, JA; Jungerius, BJ; Groenen, MA

    2004-01-01

    Background: Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  18. POSA: perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, J.A.; Jungerius, B.J.; Groenen, M.A.M.

    2004-01-01

    Background - Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  19. DNA sequence and prokaryotic expression analysis of vitellogenin ...

    African Journals Online (AJOL)

    In this study, the DNA sequence of vitellogenin from Antheraea pernyi (Ap-Vg) was identified and its functional domain (30-740 aa, Ap-Vg-1) was expressed in Escherichia coli BL21 (DE3) cells. The recombinant Ap-Vg-1 proteins were purified and used for antibody preparation. The results showed that the intact DNA ...

  20. RNA-DNA sequence differences spell genetic code ambiguities

    DEFF Research Database (Denmark)

    Bentin, Thomas; Nielsen, Michael L

    2013-01-01

    A recent paper in Science by Li et al. 2011(1) reports widespread sequence differences in the human transcriptome between RNAs and their encoding genes termed RNA-DNA differences (RDDs). The findings could add a new layer of complexity to gene expression but the study has been criticized. ...

  1. cDNA, genomic sequence cloning and overexpression of ...

    African Journals Online (AJOL)

    Cytochrome c oxidase (COX) is a component of the mitochondrial respiratory chain. COX6b1 is one of the COX small subunits encoded by nuclear genes. In the current study, the cDNA and the genomic sequence of COX6b1 were successfully cloned from Ailuropoda melanoleuca using RT-PCR technology and ...

  2. Direct multiplex sequencing (DMPS)--a novel method for targeted high-throughput sequencing of ancient and highly degraded DNA

    National Research Council Canada - National Science Library

    Stiller, Mathias; Knapp, Michael; Stenzel, Udo; Hofreiter, Michael; Meyer, Matthias

    2009-01-01

    Although the emergence of high-throughput sequencing technologies has enabled whole-genome sequencing from extinct organisms, little progress has been made in accelerating targeted sequencing from highly degraded DNA...

  3. Sequence heterogeneity accelerates protein search for targets on DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shvets, Alexey A.; Kolomeisky, Anatoly B., E-mail: tolya@rice.edu [Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005 (United States)

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  4. Quality assessment of DNA sequence data: autopsy of a mis-sequenced mtDNA population sample.

    Science.gov (United States)

    Bandelt, H-J; Kivisild, T

    2006-05-01

    Published DNA data sets constitute a body of sequencing results resting in silico that are supposed to reflect the variation of (once) living cells. In cases where the DNA variation reported is suspected to be fraught with artefacts, an autopsy of the full body of data is needed to clarify the amount and causes of mis-sequencing. In this paper we elaborate on strategies that allow a clear-cut identification of the problems in severely flawed mtDNA data. This approach is applied, by way of example, to a data set of HVS-I sequences from the Caucasus, published by Nasidze & Stoneking in 2001. These data bear numerous ambiguous nucleotide positions and suffer from an even higher number of phantom mutations, indicating that severe biochemical problems adversely influenced those sequencing results at the time. Furthermore, systematic omission of sequences with a long C-stretch (incurred by a transition at position 16189) must have severely biased the data set. Since no complete correction of these data has appeared to date, this example of mis-sequencing necessitates circumstantial evidence that is bullet-proof.

  5. DNA watermarks in non-coding regulatory sequences

    Directory of Open Access Journals (Sweden)

    Pyka Martin

    2009-07-01

    Full Text Available Abstract. Background: DNA watermarks can be applied to identify the unauthorized use of genetically modified organisms. It has been shown that coding regions can be used to encrypt information into living organisms by using the DNA-Crypt algorithm. Yet, if the sequence of interest presents a non-coding DNA sequence, either the function of a resulting functional RNA molecule or a regulatory sequence, such as a promoter, could be affected. For our studies we used the small cytoplasmic RNA 1 in yeast and the lac promoter region of Escherichia coli. Findings: The lac promoter was deactivated by the integrated watermark. In addition, the RNA molecules displayed altered configurations after introducing a watermark, but surprisingly were functionally intact, which has been verified by analyzing the growth characteristics of both wild type and watermarked scR1 transformed yeast cells. In a third approach we introduced a second overlapping watermark into the lac promoter, which did not affect the promoter activity. Conclusion: Even though the watermarked RNA and one of the watermarked promoters did not show any significant differences compared to the wild type RNA and wild type promoter region, respectively, it cannot be generalized that other RNA molecules or regulatory sequences behave accordingly. Therefore, we do not recommend integrating watermark sequences into regulatory regions.

  6. Environmental DNA sequencing primers for eutardigrades and bdelloid rotifers.

    Science.gov (United States)

    Robeson, Michael S; Costello, Elizabeth K; Freeman, Kristen R; Whiting, Jeremy; Adams, Byron; Martin, Andrew P; Schmidt, Steve K

    2009-12-11

    The time it takes to isolate individuals from environmental samples and then extract DNA from each individual is one of the problems with generating molecular data from meiofauna such as eutardigrades and bdelloid rotifers. The lack of consistent morphological information and the extreme abundance of these classes make morphological identification of rare, or even common cryptic taxa a large and unwieldy task. This limits the ability to perform large-scale surveys of the diversity of these organisms. Here we demonstrate a culture-independent molecular survey approach that enables the generation of large amounts of eutardigrade and bdelloid rotifer sequence data directly from soil. Our PCR primers, specific to the 18S small-subunit rRNA gene, were developed for both eutardigrades and bdelloid rotifers. The developed primers successfully amplified DNA of their target organism from various soil DNA extracts. This was confirmed by both the BLAST similarity searches and phylogenetic analyses. Tardigrades showed much better phylogenetic resolution than bdelloids. Both groups of organisms exhibited varying levels of endemism. The development of clade-specific primers for characterizing eutardigrades and bdelloid rotifers from environmental samples should greatly increase our ability to characterize the composition of these taxa in environmental samples. Environmental sequencing as shown here differs from other molecular survey methods in that there is no need to pre-isolate the organisms of interest from soil in order to amplify their DNA. The DNA sequences obtained from methods that do not require culturing can be identified post-hoc and placed phylogenetically as additional closely related sequences are obtained from morphologically identified conspecifics. Our non-cultured environmental sequence based approach will be able to provide a rapid and large-scale screening of the presence, absence and diversity of Bdelloidea and Eutardigrada in a variety of soils.

  7. DNA sequence analysis using hierarchical ART-based classification networks

    Energy Technology Data Exchange (ETDEWEB)

    LeBlanc, C.; Hruska, S.I. [Florida State Univ., Tallahassee, FL (United States); Katholi, C.R.; Unnasch, T.R. [Univ. of Alabama, Birmingham, AL (United States)

    1994-12-31

    Adaptive resonance theory (ART) describes a class of artificial neural network architectures that act as classification tools which self-organize, work in real-time, and require no retraining to classify novel sequences. We have adapted ART networks to provide support to scientists attempting to categorize tandem repeat DNA fragments from Onchocerca volvulus. In this approach, sequences of DNA fragments are presented to multiple ART-based networks which are linked together into two (or more) tiers; the first provides coarse sequence classification while the subsequent tiers refine the classifications as needed. The overall rating of the resulting classification of fragments is measured using statistical techniques based on those introduced to validate results from traditional phylogenetic analysis. Tests of the Hierarchical ART-based Classification Network, or HABclass network, indicate its value as a fast, easy-to-use classification tool which adapts to new data without retraining on previously classified data.

  8. Complete nucleotide sequence of minicircle kinetoplast DNA from Trypanosoma equiperdum.

    Science.gov (United States)

    Barrois, M; Riou, G; Galibert, F

    1981-06-01

    The kinetoplast DNA of Trypanosoma equiperdum is composed of about 3000 supercoiled minicircles of 1000 base pairs and about 50 supercoiled maxicircles of 23,000 base pairs topologically interlocked so as to form a compact network. Minicircles of T. equiperdum, which are homogeneous in base sequence, were purified by equilibrium CsCl centrifugation and used as starting material for DNA sequence analysis. One minicircle is composed of 1012 base pairs and has an adenine-thymine base pair content of 72.8%. The termination codons are uniformly distributed along the molecule and restrict the coding potentiality of the molecule to oligopeptides of about 20 amino acids. The molecule contains three dyad symmetries and a sequence of 12 nucleotides is repeated six times. We also noted the presence of a region of about 130 base pairs that is almost perfectly homologous with that of the minicircles from the closely related species T. brucei.
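    The base-composition and repeat statistics quoted in this record (A+T content, a 12-nucleotide sequence repeated six times) are simple to compute from a sequence string. The sketch below shows one way to do so; the function names are hypothetical, the toy input is not the minicircle sequence, and the sequence is treated as linear, so windows spanning the junction of a circular molecule are not counted.

```python
from collections import Counter


def at_content(seq):
    """Percentage of A and T bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def repeated_kmers(seq, k=12, min_count=2):
    """Map each k-mer occurring at least min_count times to its count.

    Overlapping windows are counted, and the sequence is treated as linear
    (an assumption: a circular molecule would also need wrap-around windows).
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    toy = "ACGTACGTACGT"  # toy input, not real minicircle data
    print(at_content(toy))
    print(repeated_kmers(toy, k=4, min_count=3))
```

    With the real 1012 bp sequence as input, `repeated_kmers(seq, k=12, min_count=6)` would be the natural call for locating a 12-mer of the kind the abstract describes.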

  9. VoSeq: a voucher and DNA sequence web application.

    Directory of Open Access Journals (Sweden)

    Carlos Peña

    Full Text Available There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  10. VoSeq: A Voucher and DNA Sequence Web Application

    Science.gov (United States)

    Peña, Carlos; Malm, Tobias

    2012-01-01

    There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit). PMID:22720030

  11. A juvenile subfossil crocodylian from Anjohibe Cave, Northwestern Madagascar

    Directory of Open Access Journals (Sweden)

    Joshua C. Mathews

    2016-09-01

    Full Text Available Madagascar’s subfossil record preserves a diverse community of animals including elephant birds, pygmy hippopotamus, giant lemurs, turtles, crocodiles, bats, rodents, and carnivorans. These fossil accumulations give us a window into the island’s past from 80,000 years ago to a mere few hundred years ago, recording the extinction of some groups and the persistence of others. The crocodylian subfossil record is limited to two taxa, Voay robustus and Crocodylus niloticus, found at sites distributed throughout the island. V. robustus is extinct while C. niloticus is still found on the island today, but whether these two species overlapped temporally, or if Voay was driven to extinction by competing with Crocodylus remains unknown. While their size and presumed behavior were similar to each other, nearly nothing is known about the growth and development of Voay, as the overwhelming majority of fossil specimens represent mature adult individuals. Here we describe a nearly complete juvenile crocodylian specimen from Anjohibe Cave, northwestern Madagascar. The specimen is referred to Crocodylus based on the presence of caviconchal recesses on the medial wall of the maxillae, and to C. niloticus based on the presence of an oval shaped internal choana, lack of rostral ornamentation and a long narrow snout. However, as there are currently no described juvenile specimens of Voay robustus, it is important to recognize that some of the defining characteristics of that genus may have changed through ontogeny. Elements include a nearly complete skull and many postcranial elements (cervical, thoracic, sacral, and caudal vertebrae, pectoral elements, pelvic elements, forelimb and hindlimb elements, osteoderms). Crocodylus niloticus currently inhabits Madagascar but is locally extinct from this particular region; radiometric dating indicates an age of ∼460–310 years before present (BP). This specimen clearly represents a juvenile based on the extremely small

  12. Entire Mitochondrial DNA Sequencing on Massively Parallel Sequencing for the Korean Population.

    Science.gov (United States)

    Park, Sohyung; Cho, Sohee; Seo, Hee Jin; Lee, Ji Hyun; Kim, Moon Young; Lee, Soong Deok

    2017-04-01

    Mitochondrial DNA (mtDNA) genome analysis has been a potent tool in forensic practice as well as in the understanding of human phylogeny in the maternal lineage. The traditional mtDNA analysis is focused on the control region, but the introduction of massively parallel sequencing (MPS) has made the typing of the entire mtDNA genome (mtGenome) more accessible for routine analysis. The complete mtDNA information can provide large amounts of novel genetic data for diverse populations as well as improved discrimination power for identification. The genetic diversity of the mtDNA sequence in different ethnic populations has been revealed through MPS analysis, but MPS data for the entire mtGenome remain limited for the Korean population, and the existing data focus mainly on the control region. In this study, the complete mtGenome data for 186 Koreans, obtained using Ion Torrent Personal Genome Machine (PGM) technology and retrieved from rather common mtDNA haplogroups based on the control region sequence, are described. The results showed that 24 haplogroups, determined with hypervariable regions only, branched into 47 subhaplogroups, and point heteroplasmy was more frequent in the coding regions. In addition, sequence variations in the coding regions observed in this study were compared with those presented in other reports on different populations, and there were similar features observed in the sequence variants for the predominant haplogroups among East Asian populations, such as Haplogroup D and macrohaplogroups M9, G, and D. This study is expected to be the trigger for the development of Korean specific mtGenome data followed by numerous future studies. © 2017 The Korean Academy of Medical Sciences.

  13. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    Energy Technology Data Exchange (ETDEWEB)

    Hidajat, Rachmat; Nickols, Brian [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States); Forrester, Naomi [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States); Tretyakova, Irina [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States); Weaver, Scott [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States); Pushko, Peter, E-mail: ppushko@medigen-usa.com [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States)

    2016-03-15

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.

  14. Assessing the fidelity of ancient DNA sequences amplified from nuclear genes

    DEFF Research Database (Denmark)

    Binladen, Jonas; Wiuf, Carsten Henrik; Gilbert, M. Thomas P.

    2006-01-01

    To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in ph...

  15. Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

    Science.gov (United States)

    Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

    2018-01-10

    Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and enzyme methyltransferase DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies depicted that, upon binding of DNMT1 methylates to this consensus DNA motif (C residues of CpG islands), mutation was likely to occur. Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing

  16. The influence of DNA sequence on epigenome-induced pathologies

    Science.gov (United States)

    2012-01-01

    Clear cause-and-effect relationships are commonly established between genotype and the inherited risk of acquiring human and plant diseases and aberrant phenotypes. By contrast, few such cause-and-effect relationships are established linking a chromatin structure (that is, the epitype) with the transgenerational risk of acquiring a disease or abnormal phenotype. It is not entirely clear how epitypes are inherited from parent to offspring as populations evolve, even though epigenetics is proposed to be fundamental to evolution and the likelihood of acquiring many diseases. This article explores the hypothesis that, for transgenerationally inherited chromatin structures, “genotype predisposes epitype”, and that epitype functions as a modifier of gene expression within the classical central dogma of molecular biology. Evidence for the causal contribution of genotype to inherited epitypes and epigenetic risk comes primarily from two different kinds of studies discussed herein. The first and direct method of research proceeds by the examination of the transgenerational inheritance of epitype and the penetrance of phenotype among genetically related individuals. The second approach identifies epitypes that are duplicated (as DNA sequences are duplicated) and evolutionarily conserved among repeated patterns in the DNA sequence. The body of this article summarizes particularly robust examples of these studies from humans, mice, Arabidopsis, and other organisms. The bulk of the data from both areas of research support the hypothesis that genotypes predispose the likelihood of displaying various epitypes, but for only a few classes of epitype. This analysis suggests that renewed efforts are needed in identifying polymorphic DNA sequences that determine variable nucleosome positioning and DNA methylation as the primary cause of inherited epigenome-induced pathologies. By contrast, there is very little evidence that DNA sequence directly determines the inherited

  17. The influence of DNA sequence on epigenome-induced pathologies

    Directory of Open Access Journals (Sweden)

    Meagher Richard B

    2012-07-01

    Full Text Available Abstract Clear cause-and-effect relationships are commonly established between genotype and the inherited risk of acquiring human and plant diseases and aberrant phenotypes. By contrast, few such cause-and-effect relationships are established linking a chromatin structure (that is, the epitype) with the transgenerational risk of acquiring a disease or abnormal phenotype. It is not entirely clear how epitypes are inherited from parent to offspring as populations evolve, even though epigenetics is proposed to be fundamental to evolution and the likelihood of acquiring many diseases. This article explores the hypothesis that, for transgenerationally inherited chromatin structures, “genotype predisposes epitype”, and that epitype functions as a modifier of gene expression within the classical central dogma of molecular biology. Evidence for the causal contribution of genotype to inherited epitypes and epigenetic risk comes primarily from two different kinds of studies discussed herein. The first and direct method of research proceeds by the examination of the transgenerational inheritance of epitype and the penetrance of phenotype among genetically related individuals. The second approach identifies epitypes that are duplicated (as DNA sequences are duplicated) and evolutionarily conserved among repeated patterns in the DNA sequence. The body of this article summarizes particularly robust examples of these studies from humans, mice, Arabidopsis, and other organisms. The bulk of the data from both areas of research support the hypothesis that genotypes predispose the likelihood of displaying various epitypes, but for only a few classes of epitype. This analysis suggests that renewed efforts are needed in identifying polymorphic DNA sequences that determine variable nucleosome positioning and DNA methylation as the primary cause of inherited epigenome-induced pathologies. By contrast, there is very little evidence that DNA sequence directly

  18. Early Lyme disease with spirochetemia - diagnosed by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Jones William

    2010-11-01

    Full Text Available Abstract Background: A sensitive and analytically specific nucleic acid amplification test (NAAT) is valuable in confirming the diagnosis of early Lyme disease at the stage of spirochetemia. Findings: Venous blood drawn from patients with clinical presentations of Lyme disease was tested for the standard 2-tier screen and Western Blot serology assay for Lyme disease, and also by a nested polymerase chain reaction (PCR) for B. burgdorferi sensu lato 16S ribosomal DNA. The PCR amplicon was sequenced for B. burgdorferi genomic DNA validation. A total of 130 patients visiting the emergency room (ER) or Walk-in clinic (WALKIN), and 333 patients referred through the private physicians' offices were studied. While 5.4% of the ER/WALKIN patients showed DNA evidence of spirochetemia, none (0%) of the patients referred from private physicians' offices were DNA-positive. In contrast, while 8.4% of the patients referred from private physicians' offices were positive for the 2-tier Lyme serology assay, only 1.5% of the ER/WALKIN patients were positive for this antibody test. The 2-tier serology assay missed 85.7% of the cases of early Lyme disease with spirochetemia. The latter diagnosis was confirmed by DNA sequencing. Conclusion: Nested PCR followed by automated DNA sequencing is a valuable supplement to the standard 2-tier antibody assay in the diagnosis of early Lyme disease with spirochetemia. The best time to test for Lyme spirochetemia is when the patients living in the Lyme disease endemic areas develop unexplained symptoms or clinical manifestations that are consistent with Lyme disease early in the course of their illness.

  19. PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context

    OpenAIRE

    Zhou, Jiyun; Xu, Ruifeng; He, Yulan; Lu, Qin; Wang, Hongpeng; Kong, Bing

    2016-01-01

    Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predi...

  20. cDNA encoding a polypeptide including a hevein sequence

    Science.gov (United States)

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu GOVERNMENT RIGHTS This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.

  1. Cloning and sequencing of cDNA and genomic DNA encoding PDM phosphatase of Fusarium moniliforme.

    Science.gov (United States)

    Yoshida, Hiroshi; Iizuka, Mari; Narita, Takao; Norioka, Naoko; Norioka, Shigemi

    2006-12-01

    PDM phosphatase was purified approximately 500-fold through six steps from the extract of dried powder of the culture filtrate of Fusarium moniliforme. The purified preparation appeared homogeneous on SDS-PAGE although the protein band was broad. Amino acid sequence information was collected on tryptic peptides from this preparation. cDNA cloning was carried out based on the information. A full-length cDNA was obtained and sequenced. The sequence had an open reading frame of 651 amino acid residues with a molecular mass of 69,988 Da. Cloning and sequencing of the genomic DNA corresponding to the cDNA was also conducted. The deduced amino acid sequence could account for many but not all of the tryptic peptides, suggesting presence of contaminant protein(s). SDS-PAGE analysis after chemical deglycosylation showed two proteins with molecular masses of 58 and 68 kDa. This implied that the 58 kDa protein had been copurified with PDM phosphatase. Homology search showed that PDM phosphatase belongs to the purple acid phosphatase family, which is widely distributed in the biosphere. Sequence data of fungal purple acid phosphatases were collected from the database. Processing of the data revealed presence of two types, whose evolutionary relationships were discussed.

  2. Phylogenetic relationships of the Gomphales based on nuc-25S-rDNA, mit-12S-rDNA, and mit-atp6-DNA combined sequences

    Science.gov (United States)

    Admir J. Giachini; Kentaro Hosaka; Eduardo Nouhra; Joseph Spatafora; James M. Trappe

    2010-01-01

    Phylogenetic relationships among Geastrales, Gomphales, Hysterangiales, and Phallales were estimated via combined sequences: nuclear large subunit ribosomal DNA (nuc-25S-rDNA), mitochondrial small subunit ribosomal DNA (mit-12S-rDNA), and mitochondrial atp6 DNA (mit-atp6-DNA). Eighty-one taxa comprising 19 genera and 58 species...

  3. Multilocus DNA Sequence Comparisons Rapidly Identify Pathogenic Molds

    Science.gov (United States)

    Rakeman, Jennifer L.; Bui, Uyen; LaFe, Karen; Chen, Yi-Ching; Honeycutt, Rhonda J.; Cookson, Brad T.

    2005-01-01

    The increasing incidence of opportunistic fungal infections necessitates rapid and accurate identification of the associated fungi to facilitate optimal patient treatment. Traditional phenotype-based identification methods utilized in clinical laboratories rely on the production and recognition of reproductive structures, making identification difficult or impossible when these structures are not observed. We hypothesized that DNA sequence analysis of multiple loci is useful for rapidly identifying medically important molds. Our study included the analysis of the D1/D2 hypervariable region of the 28S ribosomal gene and the internal transcribed spacer (ITS) regions 1 and 2 of the rRNA operon. Two hundred one strains, including 143 clinical isolates and 58 reference and type strains, representing 43 recognized species and one possible new species, were examined. We generated a phenotypically validated database of 118 diagnostic alleles. DNA length polymorphisms detected among ITS1 and ITS2 PCR products can differentiate 20 of 33 species of molds tested, and ITS DNA sequence analysis permits identification of all species tested. For 42 of 44 species tested, conspecific strains displayed >99% sequence identity at ITS1 and ITS2; sequevars were detected in two species. For all 44 species, identifications by genotypic and traditional phenotypic methods were 100% concordant. Because dendrograms based on ITS sequence analysis are similar in topology to 28S-based trees, we conclude that ITS sequences provide phylogenetically valid information and can be utilized to identify clinically important molds. Additionally, this phenotypically validated database of ITS sequences will be useful for identifying new species of pathogenic molds. PMID:16000456

  4. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences over all chromosomes of H. vulgare and the wild barley species H. bulbosum, H. marinum and H. murinum. Southern blot hybridization revealed different levels of polymorphism among barley species and the RFLP data were used to generate a phylogenetic tree for the genus Hordeum. Our data are in a good...

  5. The implementation of bit-parallelism for DNA sequence alignment

    Science.gov (United States)

    Setyorini; Kuspriyanto; Widyantoro, D. H.; Pancoro, A.

    2017-05-01

    Dynamic programming (DP) remains the central algorithm of biological sequence alignment. Computing the matching scores is its most time-consuming step. Bit-parallelism is an approximate string matching technique that transforms the cell-by-cell processing of the DP matrix into word-unit processing (groups of cells) and computes the scores column-wise. By borrowing the word-level operations of computer hardware, the technique promises to reduce the time spent computing scores in the DP matrix. In this paper, we implement the bit-parallelism technique for DNA sequence alignment. Our bit-parallel implementation spends less time on score computation but still needs improvement in the reconstruction process.
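
    The abstract does not say which bit-parallel recurrence the authors implemented. As an illustration only, the following sketch uses the widely known Myers-style bit-vector formulation of edit distance, in which one DP column is packed into the bits of an integer. The function name is hypothetical, and Python's unbounded integers stand in for a machine word, so the sketch shows the logic rather than the speed-up.

```python
def bitparallel_edit_distance(pattern: str, text: str) -> int:
    """Levenshtein distance using a Myers-style bit-vector recurrence.

    Illustrative sketch: not the implementation described in the record.
    """
    m = len(pattern)
    if m == 0:
        return len(text)
    mask = (1 << m) - 1      # keep only the m bits that encode one DP column
    top = 1 << (m - 1)       # bit of the last pattern row
    # peq[c] has bit i set wherever pattern[i] == c
    peq = {}
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)
    pv, mv = mask, 0         # vertical deltas of column 0 are all +1
    score = m                # D[m][0]
    for ch in text:
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask   # rows whose horizontal delta is +1
        mh = pv & xh                    # rows whose horizontal delta is -1
        if ph & top:
            score += 1
        elif mh & top:
            score -= 1
        # shift in +1 for row 0 because D[0][j] = j in global alignment
        ph = ((ph << 1) | 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
    return score
```

    Each text character updates the whole column with a constant number of word operations, which is where the time saving over cell-by-cell DP comes from. Recovering the alignment itself (the reconstruction step the abstract mentions) needs extra bookkeeping that this sketch omits.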

  6. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...... in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions....

  7. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    Directory of Open Access Journals (Sweden)

    Kato Mikio

    2003-01-01

    Full Text Available Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective for characterizing the mode of evolution of satellite DNA.
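
    The two calculations named in this abstract can be sketched as follows. The sketch assumes the member sequences are already aligned to equal length and uses the simple proportion of differing sites (p-distance) as the pairwise measure. The abstract does not state which distance the authors used, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def position_frequencies(aligned):
    """Per-column nucleotide frequencies for equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    freqs = []
    for column in zip(*aligned):
        counts = Counter(column)
        freqs.append({base: counts.get(base, 0) / len(column) for base in "ACGT"})
    return freqs


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_distance(aligned):
    """Average p-distance over all pairs of member sequences."""
    pairs = list(combinations(aligned, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

    Comparing the mean pairwise distance within a species with the distance between species is one simple way to see the homogenization that concerted evolution produces.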

  8. Nonrepetitive DNA Sequence Representation in Sea Urchin Embryo Messenger RNA

    Science.gov (United States)

    Goldberg, Robert B.; Galau, Glenn A.; Britten, Roy J.; Davidson, Eric H.

    1973-01-01

    Messenger RNA was prepared from developing sea urchin gastrulae by puromycin release from polyribosomes. Approximately 60% of the total mRNA radioactivity of the postnuclear supernatant was recovered and shown to be free of any other labeled RNA species such as ribosomal and nuclear RNA. The mRNA was examined by hybridization to DNA present in great excess. The mRNA hybridizes almost exclusively with nonrepetitive DNA. Almost all of the messenger RNA molecules of sea urchin gastrulae therefore consist of transcripts from nonrepetitive sequences. It appears that the structural genes expressed at this stage are typically not repeated in the genome and the mRNA does not include recognizable repetitive sequence. PMID:4519642

  9. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure tha...

  10. The exceptional genomic word symmetry along DNA sequences

    OpenAIRE

    Afreixo, Vera; Rodrigues, João M. O. S.; Bastos, Carlos A. C.; Silva, Raquel M.

    2016-01-01

    Background The second Chargaff's parity rule and its extensions are recognized as universal phenomena in DNA sequences. However, parity of the frequencies of reverse complementary oligonucleotides could be a mere consequence of the single nucleotide parity rule, if nucleotide independence is assumed. Exceptional symmetry (symmetry beyond that expected under an independent nucleotide assumption) was proposed previously as a meaningful measure of the extension of the second parity rule to oligo...
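
    The parity being discussed can be made concrete with a short sketch that counts each word of length k on a single strand and pairs it with the count of its reverse complement. This only tabulates the raw counts. The "exceptional symmetry" measure itself is not defined in the truncated abstract and is not reproduced here. Function names are hypothetical, and the input is assumed to contain only A, C, G and T.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Reverse complement of a DNA word written with A, C, G, T."""
    return word.translate(COMPLEMENT)[::-1]


def kmer_counts(seq, k):
    """Counts of all overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def parity_table(seq, k):
    """Map each word (the alphabetically smaller of the pair) to the tuple
    (count of the word, count of its reverse complement)."""
    counts = kmer_counts(seq, k)
    table = {}
    for word in counts:
        key = min(word, reverse_complement(word))
        table[key] = (counts.get(key, 0), counts.get(reverse_complement(key), 0))
    return table
```

    Under the second parity rule the two numbers in each tuple are expected to be close for long natural sequences. Words that are their own reverse complement, such as CG, trivially give equal counts.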

  11. Computational optimisation of targeted DNA sequencing for cancer detection

    Science.gov (United States)

    Martinez, Pierre; McGranahan, Nicholas; Birkbak, Nicolai Juul; Gerlinger, Marco; Swanton, Charles

    2013-12-01

    Despite recent progress thanks to next-generation sequencing technologies, personalised cancer medicine is still hampered by intra-tumour heterogeneity and drug resistance. As most patients with advanced metastatic disease face poor survival, there is a need to improve early diagnosis. Analysing circulating tumour DNA (ctDNA) might represent a non-invasive method to detect mutations in patients, facilitating early detection. In this article, we define reduced gene panels from publicly available datasets as a first step to assess and optimise the potential of targeted ctDNA scans for early tumour detection. Dividing 4,467 samples into one discovery and two independent validation cohorts, we show that up to 76% of 10 cancer types harbour at least one mutation in a panel of only 25 genes, with high sensitivity across most tumour types. Our analyses demonstrate that targeting "hotspot" regions would introduce biases towards in-frame mutations and would compromise the reproducibility of tumour detection.
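
    The headline figure in this abstract is a panel sensitivity: the share of samples carrying at least one mutation in a chosen gene set. A minimal sketch of that calculation is given below. The data layout (a mapping from sample identifier to the set of mutated genes) and the function name are assumptions made for illustration, and the abstract does not describe how the authors selected their panel.

```python
def panel_sensitivity(samples, panel):
    """Fraction of samples with at least one mutated gene in the panel.

    samples: dict mapping a sample identifier to an iterable of mutated genes
             (hypothetical layout, not the authors' data format).
    panel:   iterable of gene names.
    """
    if not samples:
        raise ValueError("no samples given")
    panel = set(panel)
    detected = sum(1 for genes in samples.values() if panel & set(genes))
    return detected / len(samples)
```

    Evaluating the same panel on cohorts that were not used to choose it, as the discovery and validation split in the abstract does, guards against an over-optimistic estimate.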

  12. Comparison of DNA Quantification Methods for Next Generation Sequencing.

    Science.gov (United States)

    Robin, Jérôme D; Ludlow, Andrew T; LaRanger, Ryan; Wright, Woodring E; Shay, Jerry W

    2016-04-06

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library's heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality.

  13. Targeted DNA methylation analysis by next-generation sequencing.

    Science.gov (United States)

    Masser, Dustin R; Stanford, David R; Freeman, Willard M

    2015-02-24

    The role of epigenetic processes in the control of gene expression has been known for a number of years. DNA methylation at cytosine residues is of particular interest for epigenetic studies as it has been demonstrated to be both a long lasting and a dynamic regulator of gene expression. Efforts to examine epigenetic changes in health and disease have been hindered by the lack of high-throughput, quantitatively accurate methods. With the advent and popularization of next-generation sequencing (NGS) technologies, these tools are now being applied to epigenomics in addition to existing genomic and transcriptomic methodologies. For epigenetic investigations of cytosine methylation where regions of interest, such as specific gene promoters or CpG islands, have been identified and there is a need to examine significant numbers of samples with high quantitative accuracy, we have developed a method called Bisulfite Amplicon Sequencing (BSAS). This method combines bisulfite conversion with targeted amplification of regions of interest, transposome-mediated library construction and benchtop NGS. BSAS offers a rapid and efficient method for analysis of up to 10 kb of targeted regions in up to 96 samples at a time that can be performed by most research groups with basic molecular biology skills. The results provide absolute quantitation of cytosine methylation with base specificity. BSAS can be applied to any genomic region from any DNA source. This method is useful for hypothesis testing studies of target regions of interest as well as confirmation of regions identified in genome-wide methylation analyses such as whole genome bisulfite sequencing, reduced representation bisulfite sequencing, and methylated DNA immunoprecipitation sequencing.
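
    The base-specific quantitation mentioned here rests on a simple property of bisulfite conversion: unmethylated cytosines are read as T, while methylated cytosines are still read as C. The sketch below computes the methylated fraction at each CpG of a reference from reads that are assumed to be already aligned, forward-strand and the same length as the reference. It is not the BSAS analysis pipeline, and the function names are hypothetical.

```python
def cpg_positions(reference):
    """Zero-based positions of the C in every CG dinucleotide."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_levels(reference, reads):
    """Methylated fraction C / (C + T) at each CpG cytosine.

    Assumes every read is aligned to the full reference without gaps;
    positions with no C or T calls are left out of the result.
    """
    levels = {}
    for pos in cpg_positions(reference):
        c_calls = sum(1 for read in reads if read[pos] == "C")
        t_calls = sum(1 for read in reads if read[pos] == "T")
        if c_calls + t_calls:
            levels[pos] = c_calls / (c_calls + t_calls)
    return levels
```

    With deep coverage of a short amplicon, the ratio at each position becomes a direct per-site estimate of methylation.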

  14. Correlation approach to identify coding regions in DNA sequences

    Science.gov (United States)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
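
    The abstract does not describe how the correlations are measured. One approach commonly used in this literature, assumed here purely for illustration, is the "DNA walk": step up for a pyrimidine, down for a purine, and examine how the fluctuation of the walk grows with window length. The function names are hypothetical and the input is assumed to contain only A, C, G and T.

```python
import math


def dna_walk(seq):
    """Cumulative walk: +1 for a pyrimidine (C, T), -1 for a purine (A, G)."""
    position, walk = 0, [0]
    for base in seq:
        position += 1 if base in "CT" else -1
        walk.append(position)
    return walk


def fluctuation(walk, window):
    """Standard deviation of the displacement over all windows of a given length."""
    deltas = [walk[i + window] - walk[i] for i in range(len(walk) - window)]
    mean = sum(deltas) / len(deltas)
    variance = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return math.sqrt(variance)
```

    If the fluctuation grows roughly as a power of the window length, an exponent near 0.5 indicates at most short-range correlation, while a clearly larger exponent indicates long-range correlation. Estimating such an exponent locally along a sequence is one way a coding-region detector of this kind could be built.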

  15. Bisulfite sequencing of chromatin immunoprecipitated DNA (BisChIP-seq) directly informs methylation status of histone-modified DNA

    NARCIS (Netherlands)

    Statham, A.L.; Robinson, M.D.; Song, J.Z.; Coolen, M.W.; Stirzaker, C.; Clark, S. J.

    2012-01-01

    The complex relationship between DNA methylation, chromatin modification, and underlying DNA sequence is often difficult to unravel with existing technologies. Here, we describe a novel technique based on high-throughput sequencing of bisulfite-treated chromatin immunoprecipitated DNA (BisChIP-seq),

  16. Mixed sequence reader: a program for analyzing DNA sequences with heterozygous base calling.

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  17. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Directory of Open Access Journals (Sweden)

    Chun-Tien Chang

    2012-01-01

    Full Text Available The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  18. A sequence-dependent rigid-base model of DNA.

    Science.gov (United States)

    Gonzalez, O; Petkevičiūtė, D; Maddocks, J H

    2013-02-07

    A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. 
For example, due to the presence of frustration, the model can

  19. Considering DNA damage when interpreting mtDNA heteroplasmy in deep sequencing data.

    Science.gov (United States)

    Rathbun, Molly M; McElhoe, Jennifer A; Parson, Walther; Holland, Mitchell M

    2017-01-01

    Resolution of mitochondrial (mt) DNA heteroplasmy is now possible when applying a massively parallel sequencing (MPS) approach, including minor components down to 1%. However, reporting thresholds and interpretation criteria will need to be established for calling heteroplasmic variants that address a number of important topics, one of which is DNA damage. We assessed the impact of increasing amounts of DNA damage on the interpretation of minor component sequence variants in the mtDNA control region, including low-level mixed sites. A passive approach was used to evaluate the impact of storage conditions, and an active approach was employed to accelerate the process of hydrolytic damage (for example, replication errors associated with depurination events). The patterns of damage were compared and assessed in relation to damage typically encountered in poor quality samples. As expected, the number of miscoding lesions increased as conditions worsened. Single nucleotide polymorphisms (SNPs) associated with miscoding lesions were indistinguishable from innate heteroplasmy and were most often observed as 1-2% of the total sequencing reads. Numerous examples of miscoding lesions above 2% were identified, including two complete changes in the nucleotide sequence, presenting a challenge when assessing the placement of reporting thresholds for heteroplasmy. To mitigate the impact, replication of miscoding lesions was not observed in stored samples, and was rarely seen in data associated with accelerated hydrolysis. In addition, a significant decrease in the expected transition:transversion ratio was observed, providing a useful tool for predicting the presence of damage-induced lesions. The results of this study directly impact MPS analysis of minor sequence variants from poorly preserved DNA extracts, and when biological samples have been exposed to agents that induce DNA damage. These findings are particularly relevant to clinical and forensic investigations. 

  20. Structural properties of replication origins in yeast DNA sequences

    Science.gov (United States)

    Cao, Xiao-Qin; Zeng, Jia; Yan, Hong

    2008-09-01

    Sequence-dependent DNA flexibility is an important structural property originating from the DNA 3D structure. In this paper, we investigate the DNA flexibility of the budding yeast (S. cerevisiae) replication origins on a genome-wide scale using flexibility parameters from two different models, the trinucleotide and the tetranucleotide models. Based on analyzing average flexibility profiles of 270 replication origins, we find that yeast replication origins are significantly rigid compared with their surrounding genomic regions. To further understand the highly distinctive property of replication origins, we compare the flexibility patterns between yeast replication origins and promoters, and find that they both contain significantly rigid DNAs. Our results suggest that DNA flexibility is an important factor that helps proteins recognize and bind the target sites in order to initiate DNA replication. Inspired by the role of the rigid region in promoters, we speculate that the rigid replication origins may facilitate binding of proteins, including the origin recognition complex (ORC), Cdc6, Cdt1 and the MCM2-7 complex.

  1. DNA Sequence Determinants Controlling Affinity, Stability and Shape of DNA Complexes Bound by the Nucleoid Protein Fis.

    Science.gov (United States)

    Hancock, Stephen P; Stella, Stefano; Cascio, Duilio; Johnson, Reid C

    2016-01-01

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. The affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.

  2. Characterization of North American Armillaria species: Genetic relationships determined by ribosomal DNA sequences and AFLP markers

    Science.gov (United States)

    M. -S. Kim; N. B. Klopfenstein; J. W. Hanna; G. I. McDonald

    2006-01-01

    Phylogenetic and genetic relationships among 10 North American Armillaria species were analysed using sequence data from ribosomal DNA (rDNA), including intergenic spacer (IGS-1), internal transcribed spacers with associated 5.8S (ITS + 5.8S), and nuclear large subunit rDNA (nLSU), and amplified fragment length polymorphism (AFLP) markers. Based on rDNA sequence data,...

  3. Sequencing of mitochondrial HV1 and HV2 DNA with length heteroplasmy

    DEFF Research Database (Denmark)

    Rasmussen, E. Michael; Eriksen, Birthe; Larsen, Hans Jakob

    2003-01-01

    This study presents a fast method for sequencing the poly C/G regions in HV1 and HV2 in the mitochondrial DNA (mtDNA)...

  4. DNA Targeting Sequence Improves Magnetic Nanoparticle-Based Plasmid DNA Transfection Efficiency in Model Neurons.

    Science.gov (United States)

    Vernon, Matthew M; Dean, David A; Dobson, Jon

    2015-08-17

    Efficient non-viral plasmid DNA transfection of most stem cells, progenitor cells and primary cell lines currently presents an obstacle for many applications within gene therapy research. From a standpoint of efficiency and cell viability, magnetic nanoparticle-based DNA transfection is a promising gene vectoring technique because it has demonstrated rapid and improved transfection outcomes when compared to alternative non-viral methods. Recently, our research group introduced oscillating magnet arrays that resulted in further improvements to this novel plasmid DNA (pDNA) vectoring technology. Continued improvements to nanomagnetic transfection techniques have focused primarily on magnetic nanoparticle (MNP) functionalization and transfection parameter optimization: cell confluence, growth media, serum starvation, magnet oscillation parameters, etc. Noting that none of these parameters can assist in the nuclear translocation of delivered pDNA following MNP-pDNA complex dissociation in the cell's cytoplasm, inclusion of a cassette feature for pDNA nuclear translocation is theoretically justified. In this study incorporation of a DNA targeting sequence (DTS) feature in the transfecting plasmid improved transfection efficiency in model neurons, presumably from increased nuclear translocation. This observation became most apparent when comparing the response of the dividing SH-SY5Y precursor cell to the non-dividing and differentiated SH-SY5Y neuroblastoma cells.

  5. Using Synthetic Nanopores for Single-Molecule Analyses: Detecting SNPs, Trapping DNA Molecules, and the Prospects for Sequencing DNA

    Science.gov (United States)

    Dimitrov, Valentin V.

    2009-01-01

    This work focuses on studying properties of DNA molecules and DNA-protein interactions using synthetic nanopores, and it examines the prospects of sequencing DNA using synthetic nanopores. We have developed a method for discriminating between alleles that uses a synthetic nanopore to measure the binding of a restriction enzyme to DNA. There exists…

  6. Generating Exome Enriched Sequencing Libraries from Formalin-Fixed, Paraffin-Embedded Tissue DNA for Next Generation Sequencing

    Science.gov (United States)

    Marosy, Beth A.; Craig, Brian D.; Hetrick, Kurt N.; Witmer, P. Dane; Ling, Hua; Griffith, Sean M.; Myers, Ben; Ostrander, Elaine A.; Stanford, Janet L.; Brody, Lawrence C.; Doheny, Kimberly F.

    2016-01-01

    This unit describes a protocol for generating exome-enriched sequencing libraries using DNA extracted from Formalin Fixed Paraffin Embedded (FFPE) samples. Utilizing commercially available kits, we present a low-input FFPE workflow starting with 50 ng of DNA. This procedure includes a repair step that addresses damage caused by FFPE preservation and improves sequence quality. Subsequently, libraries undergo an in-solution targeted selection for exons, followed by sequencing using the Illumina next generation short read sequencing platform. PMID:28075488

  7. Random-breakage mapping method applied to human DNA sequences

    Science.gov (United States)

    Lobrich, M.; Rydberg, B.; Cooper, P. K.; Chatterjee, A. (Principal Investigator)

    1996-01-01

    The random-breakage mapping method [Game et al. (1990) Nucleic Acids Res., 18, 4453-4461] was applied to DNA sequences in human fibroblasts. The methodology involves NotI restriction endonuclease digestion of DNA from irradiated cells, followed by pulsed-field gel electrophoresis, Southern blotting and hybridization with DNA probes recognizing the single copy sequences of interest. The Southern blots show a band for the unbroken restriction fragments and a smear below this band due to radiation induced random breaks. This smear pattern contains two discontinuities in intensity at positions that correspond to the distance of the hybridization site to each end of the restriction fragment. By analyzing the positions of those discontinuities we confirmed the previously mapped position of the probe DXS1327 within a NotI fragment on the X chromosome, thus demonstrating the validity of the technique. We were also able to position the probes D21S1 and D21S15 with respect to the ends of their corresponding NotI fragments on chromosome 21. A third chromosome 21 probe, D21S11, has previously been reported to be close to D21S1, although an uncertainty about a second possible location existed. Since both probes D21S1 and D21S11 hybridized to a single NotI fragment and yielded a similar smear pattern, this uncertainty is removed by the random-breakage mapping method.

  8. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing.

    Science.gov (United States)

    Lutz, Kerry A; Wang, Wenqin; Zdepski, Anna; Michael, Todd P

    2011-05-20

    High throughput sequencing (HTS) technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants that contain nuclear DNA, as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR). We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants.

  9. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing

    Directory of Open Access Journals (Sweden)

    Zdepski Anna

    2011-05-01

    Background High throughput sequencing (HTS) technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants that contain nuclear DNA, as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. Results We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR). We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Conclusions Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants.

  10. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing

    Science.gov (United States)

    2011-01-01

    Background High throughput sequencing (HTS) technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants that contain nuclear DNA, as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. Results We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR). We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Conclusions Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants. PMID:21599914

  11. Genetic variability of Taenia saginata inferred from mitochondrial DNA sequences.

    Science.gov (United States)

    Rostami, Sima; Salavati, Reza; Beech, Robin N; Babaei, Zahra; Sharbatkhori, Mitra; Harandi, Majid Fasihi

    2015-04-01

    Taenia saginata is an important tapeworm, infecting humans in many parts of the world. The present study was undertaken to identify inter- and intraspecific variation of T. saginata isolated from cattle in different parts of Iran using two mitochondrial genes, CO1 and 12S rRNA. Up to 105 bovine specimens of T. saginata were collected from 20 slaughterhouses in three provinces of Iran. DNA was extracted from the metacestode Cysticercus bovis. After PCR amplification, the CO1 and 12S rRNA genes were sequenced, and two phylogenetic analyses of the sequence data were generated by Bayesian inference on the CO1 and 12S rRNA sequences. Sequence analyses of the CO1 and 12S rRNA genes showed 11 and 29 representative profiles, respectively. The level of pairwise nucleotide variation between individual haplotypes of the CO1 gene was 0.3-2.4%, while the overall nucleotide variation among all 11 haplotypes was 4.6%. For the 12S rRNA sequence data, the level of pairwise nucleotide variation was 0.2-2.5% and the overall nucleotide variation was 5.8% among the 29 haplotypes. Considerable genetic diversity was found in both mitochondrial genes, particularly in the 12S rRNA gene.

  12. Hardware Accelerator for the Multifractal Analysis of DNA Sequences.

    Science.gov (United States)

    Duarte-Sanchez, Jorge E; Velasco-Medina, Jaime; Moreno, Pedro A

    2017-07-24

    Multifractal analysis has made it possible to quantify the genetic variability and non-linear stability along the human genome sequence. It has some implications for explaining several genetic diseases caused by chromosome abnormalities, among other genetic particularities. The multifractal analysis of a genome is carried out by dividing the complete DNA sequence into smaller fragments and calculating the generalized dimension spectrum of each fragment using the chaos game representation and the box-counting method. This is a time-consuming process because it involves the processing of large data sets using floating-point representation. In order to reduce the computation time, we designed an application-specific processor, here called the multifractal processor, which is based on our proposed hardware-oriented algorithm for efficiently calculating the generalized dimension spectrum of DNA sequences. The multifractal processor was implemented on a low-cost SoC-FPGA and was verified by processing a complete human genome. The execution time and numeric results of the multifractal processor were compared with the results obtained from the software implementation executed on a 20-core workstation, achieving a speedup of 2.6x and an average error of 0.0003%.
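
    The two steps named in this abstract, the chaos game representation (CGR) and box counting, can be illustrated with a minimal software sketch. It is not the authors' hardware-oriented algorithm. The corner assignment in `CORNERS`, the function names, and the single-scale estimate of the generalized dimension are assumptions made for illustration; a real analysis would normally fit a slope across several grid resolutions.

```python
import math
from collections import Counter

# Assumed corner layout of the unit square; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Chaos game representation: move halfway toward each base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def generalized_dimension(points, q, level):
    """Single-scale estimate of D_q on a 2^level x 2^level box grid."""
    if not points:
        raise ValueError("no CGR points to analyse")
    n = 2 ** level
    # Box counting: how many points fall into each grid cell.
    counts = Counter(
        (min(int(x * n), n - 1), min(int(y * n), n - 1)) for x, y in points
    )
    total = len(points)
    probs = [c / total for c in counts.values()]
    log_eps = math.log(1.0 / n)  # box side length is 1/n
    if q == 1:
        # Information dimension, the q -> 1 limit.
        return sum(p * math.log(p) for p in probs) / log_eps
    return math.log(sum(p ** q for p in probs)) / ((q - 1) * log_eps)
```

    For a fragment whose points fill the grid evenly, every `D_q` is close to 2; a spectrum that varies with `q` signals multifractal structure in that fragment.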

  13. DNA Sequencing as a Tool to Monitor Marine Ecological Status

    Directory of Open Access Journals (Sweden)

    Kelly D. Goodwin

    2017-05-01

    Many ocean policies mandate integrated, ecosystem-based approaches to marine monitoring, driving a global need for efficient, low-cost bioindicators of marine ecological quality. Most traditional methods to assess biological quality rely on specialized expertise to provide visual identification of a limited set of specific taxonomic groups, a time-consuming process that can provide a narrow view of ecological status. In addition, microbial assemblages drive food webs but are not amenable to visual inspection and thus are largely excluded from detailed inventory. Molecular-based assessments of biodiversity and ecosystem function offer advantages over traditional methods and are increasingly being generated for a suite of taxa using a “microbes to mammals” or “barcodes to biomes” approach. Progress in these efforts coupled with continued improvements in high-throughput sequencing and bioinformatics pave the way for sequence data to be employed in formal integrated ecosystem evaluation, including food web assessments, as called for in the European Union Marine Strategy Framework Directive. DNA sequencing of bioindicators, both traditional (e.g., benthic macroinvertebrates, ichthyoplankton) and emerging (e.g., microbial assemblages, fish via eDNA), promises to improve assessment of marine biological quality by increasing the breadth, depth, and throughput of information and by reducing costs and reliance on specialized taxonomic expertise.

  14. Discovering motifs in ranked lists of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Eran Eden

    2007-03-01

    Computational methods for discovery of sequence elements that are enriched in a target set compared with a background set are fundamental in molecular biology research. One example is the discovery of transcription factor binding motifs that are inferred from ChIP-chip (chromatin immuno-precipitation on a microarray) measurements. Several major challenges in sequence motif discovery still require consideration: (i) the need for a principled approach to partitioning the data into target and background sets; (ii) the lack of rigorous models and of an exact p-value for measuring motif enrichment; (iii) the need for an appropriate framework for accounting for motif multiplicity; (iv) the tendency, in many of the existing methods, to report presumably significant motifs even when applied to randomly generated data. In this paper we present a statistical framework for discovering enriched sequence elements in ranked lists that resolves these four issues. We demonstrate the implementation of this framework in a software application, termed DRIM (discovery of rank imbalanced motifs), which identifies sequence motifs in lists of ranked DNA sequences. We applied DRIM to ChIP-chip and CpG methylation data and obtained the following results. (i) Identification of 50 novel putative transcription factor (TF) binding sites in yeast ChIP-chip data. The biological function of some of them was further investigated to gain new insights on transcription regulation networks in yeast. For example, our discoveries enable the elucidation of the network of the TF ARO80. Another finding concerns a systematic TF binding enhancement to sequences containing CA repeats. (ii) Discovery of novel motifs in human cancer CpG methylation data. Remarkably, most of these motifs are similar to DNA sequence elements bound by the Polycomb complex that promotes histone methylation. Our findings thus support a model in which histone methylation and CpG methylation are mechanistically linked
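
    The abstract does not spell out the statistic behind "rank imbalanced" enrichment. One way to score it without fixing a target/background cutoff in advance is a minimum hypergeometric score, sketched below under that assumption. The function names and the 0/1 `occurrence` encoding are illustrative, not DRIM's interface. The raw minimum is also not an exact p-value, because it is taken over many cutoffs and would still need a correction.

```python
from math import comb


def hypergeom_tail(total, successes, draws, observed):
    """P(X >= observed) for a hypergeometric draw without replacement."""
    upper = min(draws, successes)
    favourable = sum(
        comb(successes, k) * comb(total - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / comb(total, draws)


def min_hypergeom_score(occurrence):
    """Scan every cutoff of a ranked 0/1 list and keep the smallest tail.

    occurrence[i] is 1 if the motif occurs in the sequence ranked i-th.
    Returns (score, cutoff); cutoff is 0 when no prefix is enriched.
    """
    total = len(occurrence)
    successes = sum(occurrence)
    best_score, best_cutoff = 1.0, 0
    seen = 0
    for cutoff in range(1, total + 1):
        seen += occurrence[cutoff - 1]
        tail = hypergeom_tail(total, successes, cutoff, seen)
        if tail < best_score:
            best_score, best_cutoff = tail, cutoff
    return best_score, best_cutoff
```

    A motif concentrated at the top of the ranking gives a small score at an early cutoff, while a motif found only at the bottom leaves the score at 1.0.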

  15. A MapReduce Framework for DNA Sequencing Data Processing

    Directory of Open Access Journals (Sweden)

    Samy Ghoneimy

    2016-12-01

    Genomics and Next Generation Sequencers (NGS) like the Illumina HiSeq produce data on the order of 200 billion base pairs in a single one-week run for 60x human genome coverage, which requires modern high-throughput experimental technologies and can only be tackled with high performance computing (HPC) and specialized software algorithms called “short read aligners”. This paper focuses on the implementation of DNA sequencing data processing as a set of MapReduce programs that accept a DNA data set as a FASTQ file and finally generate a VCF (variant call format) file, which contains the variants for the given DNA data set. In this paper MapReduce/Hadoop, along with the Burrows-Wheeler Aligner (BWA) and Sequence Alignment/Map (SAM) tools, are fully utilized to provide various utilities for manipulating alignments, including sorting, merging, indexing, and generating alignments. The Map-Sort-Reduce process is designed to suit a Hadoop framework in which each cluster is a traditional N-node Hadoop cluster, so as to utilize all of the Hadoop features like HDFS, program management and fault tolerance. The Map step performs multiple instances of the short read alignment algorithm (Bowtie) that run in parallel in Hadoop. The ordered list of the sequence reads is used as input tuples and the output tuples are the alignments of the short reads. In the Reduce step many parallel instances of the Short Oligonucleotide Analysis Package for SNP (SOAPsnp) algorithm run in the cluster. Input tuples are sorted alignments for a partition and the output tuples are SNP calls. Results are stored via HDFS, and then archived in SOAPsnp format. The proposed framework enables extremely fast discovery of somatic mutations, inference of population genetic parameters, and association tests directly based on sequencing data without explicit genotyping or linkage-based imputation. It also demonstrates that this method achieves comparable
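
    The Map-Sort-Reduce flow described here can be mimicked in a few lines of sequential code. The sketch below is a toy under stated assumptions: a brute-force mismatch scan stands in for the short read aligner, a majority vote over a pileup stands in for the SNP caller, and plain Python lists stand in for Hadoop partitions. All names and thresholds (`map_align`, `reduce_call_snps`, `max_mismatches`, `min_support`) are hypothetical.

```python
from collections import Counter, defaultdict


def map_align(read, reference, max_mismatches=1):
    """Map step: emit (position, read) for the best-matching offset, if any."""
    best = None
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (pos, mismatches)
    return [] if best is None else [(best[0], read)]


def reduce_call_snps(sorted_alignments, reference, min_support=2):
    """Reduce step: pile up aligned bases and report majority non-reference bases."""
    pileup = defaultdict(Counter)
    for pos, read in sorted_alignments:
        for offset, base in enumerate(read):
            pileup[pos + offset][base] += 1
    calls = []
    for ref_pos in sorted(pileup):
        base, support = pileup[ref_pos].most_common(1)[0]
        if support >= min_support and base != reference[ref_pos]:
            calls.append((ref_pos, reference[ref_pos], base))
    return calls


def run_pipeline(reads, reference):
    """Map every read, sort the alignments by position, then reduce to SNP calls."""
    mapped = [pair for read in reads for pair in map_align(read, reference)]
    mapped.sort(key=lambda pair: pair[0])  # the Sort step between Map and Reduce
    return reduce_call_snps(mapped, reference)
```

    In a real cluster the sort would also partition alignments by genomic region so that each reducer sees only its own slice; here a single reducer handles everything.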

  16. Legume genomics: understanding biology through DNA and RNA sequencing

    Science.gov (United States)

    O'Rourke, Jamie A.; Bolon, Yung-Tsi; Bucciarelli, Bruna; Vance, Carroll P.

    2014-01-01

    Background The legume family (Leguminosae) consists of approx. 17 000 species. A few of these species, including, but not limited to, Phaseolus vulgaris, Cicer arietinum and Cajanus cajan, are important dietary components, providing protein for approx. 300 million people worldwide. Additional species, including soybean (Glycine max) and alfalfa (Medicago sativa), are important crops utilized mainly in animal feed. In addition, legumes are important contributors to biological nitrogen, forming symbiotic relationships with rhizobia to fix atmospheric N2 and providing up to 30 % of available nitrogen for the next season of crops. The application of high-throughput genomic technologies including genome sequencing projects, genome re-sequencing (DNA-seq) and transcriptome sequencing (RNA-seq) by the legume research community has provided major insights into genome evolution, genomic architecture and domestication. Scope and Conclusions This review presents an overview of the current state of legume genomics and explores the role that next-generation sequencing technologies play in advancing legume genomics. The adoption of next-generation sequencing and implementation of associated bioinformatic tools has allowed researchers to turn each species of interest into their own model organism. To illustrate the power of next-generation sequencing, an in-depth overview of the transcriptomes of both soybean and white lupin (Lupinus albus) is provided. The soybean transcriptome focuses on analysing seed development in two near-isogenic lines, examining the role of transporters, oil biosynthesis and nitrogen utilization. The white lupin transcriptome analysis examines how phosphate deficiency alters gene expression patterns, inducing the formation of cluster roots. Such studies illustrate the power of next-generation sequencing and bioinformatic analyses in elucidating the gene networks underlying biological processes. PMID:24769535

  17. Simulating DNA coding sequence evolution with EvolveAGene 3.

    Science.gov (United States)

    Hall, Barry G

    2008-04-01

    Phylogenetic reconstruction based upon multiple alignments of molecular sequences is important to most branches of modern biology and is central to molecular evolution. Understanding the historical relationships among macromolecules depends upon computer programs that implement a variety of analytical methods. Because it is impossible to know those historical relationships with certainty, assessment of the accuracy of methods and the programs that implement them requires the use of programs that realistically simulate the evolution of DNA sequences. EvolveAGene 3 is a realistic coding sequence simulation program that separates mutation from selection and allows the user to set selection conditions, including variable regions of selection intensity within the sequence and variation in intensity of selection over branches. Variation includes base substitutions, insertions, and deletions. To the best of my knowledge, it is the only program available that simulates the evolution of intact coding sequences. Output includes the true tree and true alignments of the resulting coding sequence and corresponding protein sequences. A log file reports the frequencies of each kind of base substitution, the ratio of transition to transversion substitutions, the ratio of indel to base substitution mutations, and the numbers of silent and amino acid replacement mutations. The realism of the data sets has been assessed by comparing the d(N)/d(S) ratio, the ratio of transition to transversion substitutions, and the ratio of indel to base substitution mutations of the simulated data sets with those parameters of real data sets from the "gold standard" BaliBase collection of structural alignments. Results show that the data sets produced by EvolveAGene 3 are very similar to real data sets, and EvolveAGene 3 is therefore a realistic simulation program that can be used to evaluate a variety of programs and methods in molecular evolution.
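
    One of the log-file quantities mentioned above, the ratio of transition to transversion substitutions, is simple to compute from a pair of aligned sequences. The sketch below is an illustration, not EvolveAGene 3 code; the function name and the choice to skip gaps and ambiguous symbols are assumptions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_counts(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical sites, gaps and ambiguous symbols are skipped
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    return transitions, transversions
```

    Dividing the first count by the second gives the ratio; the caller should handle the case of zero transversions.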

  18. Mitochondrial DNA sequencing of cat hair: an informative forensic tool.

    Science.gov (United States)

    Tarditi, Christy R; Grahn, Robert A; Evans, Jeffrey J; Kurushima, Jennifer D; Lyons, Leslie A

    2011-01-01

    Approximately 81.7 million cats are in 37.5 million U.S. households. Shed fur can be criminal evidence because of transfer to victims, suspects, and/or their belongings. To improve cat hairs as forensic evidence, the mtDNA control region from single hairs, with and without root tags, was sequenced. A dataset of a 402-bp control region segment from 174 random-bred cats representing four U.S. geographic areas was generated to determine the informativeness of the mtDNA region. Thirty-two mtDNA mitotypes were observed ranging in frequencies from 0.6-27%. Four common types occurred in all populations. Low heteroplasmy, 1.7%, was determined. Unique mitotypes were found in 18 individuals, 10.3% of the population studied. The calculated discrimination power implied that 8.3 of 10 randomly selected individuals can be excluded by this region. The genetic characteristics of the region and the generated dataset support the use of this cat mtDNA region in forensic applications. 2010 American Academy of Forensic Sciences. Published 2010. This article is a U.S. Government work and is in the public domain in the U.S.A.

  19. Retroviral DNA Sequences as a Means for Determining Ancient Diets.

    Directory of Open Access Journals (Sweden)

    Jessica I Rivera-Perez

    For ages, specialists from varying fields have studied the diets of the primeval inhabitants of our planet, detecting diet remains in archaeological specimens using a range of morphological and biochemical methods. Recently, metagenomic ancient DNA studies have allowed for the comparison of the fecal and gut microbiomes associated with archaeological specimens from various regions of the world; however, the complex dynamics represented in those microbial communities remain unclear. Theoretically, as with eukaryotic DNA, the presence of genes from key microbes or enzymes, as well as the presence of DNA from viruses specific to key organisms, may suggest the ingestion of specific diet components. In this study we demonstrate that ancient virus DNA obtained from coprolites also provides information for reconstructing the host's diet, as inferred from sequences obtained from pre-Columbian coprolites. This represents a novel and reliable approach to determine new components as well as validate the previously suggested diets of extinct cultures and animals. Furthermore, to our knowledge this represents the first description of the eukaryotic viral diversity found in paleofaeces belonging to pre-Columbian cultures.

  20. Peptide Synthesis on a Next-Generation DNA Sequencing Platform.

    Science.gov (United States)

    Svensen, Nina; Peersen, Olve B; Jaffrey, Samie R

    2016-09-02

    Methods for displaying large numbers of peptides on solid surfaces are essential for high-throughput characterization of peptide function and binding properties. Here we describe a method for converting the >10^7 flow cell-bound clusters of identical DNA strands generated by the Illumina DNA sequencing technology into clusters of complementary RNA, and subsequently peptide clusters. We modified the flow-cell-bound primers with ribonucleotides, thus enabling them to be used by poliovirus polymerase 3D(pol). The primers hybridize to the clustered DNA, thus leading to RNA clusters. The RNAs fold into functional protein- or small molecule-binding aptamers. We used the mRNA-display approach to synthesize flow-cell-tethered peptides from these RNA clusters. The peptides showed selective binding to cognate antibodies. The methods described here provide an approach for using DNA clusters to template peptide synthesis on an Illumina flow cell, thus providing new opportunities for massively parallel peptide-based assays. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. Targeted deep DNA methylation analysis of circulating cell-free DNA in plasma using massively parallel semiconductor sequencing.

    Science.gov (United States)

    Vaca-Paniagua, Felipe; Oliver, Javier; Nogueira da Costa, Andre; Merle, Philippe; McKay, James; Herceg, Zdenko; Holmila, Reetta

    2015-01-01

    To set up a targeted methylation analysis using semiconductor sequencing and evaluate the potential for studying methylation in circulating cell-free DNA (cfDNA). Methylation of VIM, FBLN1, LTBP2, HINT2, h19 and IGF2 was analyzed in plasma cfDNA and white blood cell DNA obtained from eight hepatocellular carcinoma patients and eight controls using Ion Torrent™ PGM sequencer. h19 and IGF2 showed consistent methylation levels and methylation was detected for VIM and FBLN1, whereas LTBP2 and HINT2 did not show methylation for target regions. VIM gene promoter methylation was higher in HCC cfDNA than in cfDNA of controls or white blood cell DNA. Semiconductor sequencing is a suitable method for analyzing methylation profiles in cfDNA. Furthermore, differences in cfDNA methylation can be detected between controls and hepatocellular carcinoma cases, even though due to the small sample set these results need further validation.

  2. A Simulation of DNA Sequencing Utilizing 3M Post-It[R] Notes

    Science.gov (United States)

    Christensen, Doug

    2009-01-01

    An inexpensive and equipment-free approach to teaching the technical aspects of DNA sequencing is described. The activity requires an instructor with a familiarity with DNA sequencing technology but provides a straightforward method of teaching the technical aspects of sequencing in the absence of expensive sequencing equipment. The final sequence…

  3. Entropy and long-range correlations in DNA sequences.

    Science.gov (United States)

    Melnik, S S; Usatenko, O V

    2014-12-01

    We analyze the structure of DNA molecules of different organisms by using the additive Markov chain approach. Transforming nucleotide sequences into binary strings, we perform statistical analysis of the corresponding "texts". We develop the theory of N-step additive binary stationary ergodic Markov chains and analyze their differential entropy. Supposing that the correlations are weak, we express the conditional probability function of the chain by means of the pair correlation function and represent the entropy as a functional of the pair correlator. Since the model uses two-point correlators instead of the probabilities of block occurrence, it makes it possible to calculate the entropy of subsequences at much longer distances than with the standard methods. We utilize the obtained analytical result for numerical evaluation of the entropy of coarse-grained DNA texts. We believe that the entropy study can be used for biological classification of living species. Copyright © 2014. Published by Elsevier Ltd.
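
    The first two steps of this approach, mapping nucleotides to a binary string and estimating a pair correlation function, can be sketched as follows. The abstract does not state which binary mapping the authors use, so the purine/pyrimidine coding below is an assumption, as are the function names; the entropy functional itself is not reproduced here.

```python
def to_binary(dna):
    """Map a DNA string to 0/1 (assumed coding: purines A, G -> 1; pyrimidines C, T -> 0)."""
    coding = {"A": 1, "G": 1, "C": 0, "T": 0}
    return [coding[base] for base in dna.upper()]


def pair_correlation(bits, distance):
    """Estimate C(r) = <(a_i - mean)(a_{i+r} - mean)> over all pairs at the given distance."""
    n = len(bits)
    if not 0 <= distance < n:
        raise ValueError("distance must be between 0 and len(bits) - 1")
    mean = sum(bits) / n
    pairs = n - distance
    return sum((bits[i] - mean) * (bits[i + distance] - mean) for i in range(pairs)) / pairs


if __name__ == "__main__":
    bits = to_binary("ACACACAC")
    for r in range(3):
        print(r, pair_correlation(bits, r))
```

    For the strictly alternating toy string the estimate changes sign with every step in distance, which is the kind of structure a pair correlator captures without enumerating blocks.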

  4. A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer

    DEFF Research Database (Denmark)

    Álvarez-Martos, Isabel; Ferapontova, Elena

    2017-01-01

    of dopamine is a 57-nucleotide-long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained...... by the replacement of the RNA aptamer bases for their DNA analogues is not capable of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence, and, thus...

  5. Automated methods for single-stranded DNA isolation and dideoxynucleotide DNA sequencing reactions on a robotic workstation.

    Science.gov (United States)

    Mardis, E R; Roe, B A

    1989-09-01

    Automated procedures have been developed for both the simultaneous isolation of 96 single-stranded M13 chimeric template DNAs in less than two hours, and for simultaneously pipetting 24 dideoxynucleotide sequencing reactions on a commercially available laboratory workstation. The DNA sequencing results obtained by either radiolabeled or fluorescent methods are consistent with the premise that automation of these portions of DNA sequencing projects will improve the reproducibility of the DNA isolation and the procedures for these normally labor-intensive steps provides an approach for rapid acquisition of large amounts of high quality, reproducible DNA sequence data.

  6. DNA interaction with platinum-based cytostatics revealed by DNA sequencing.

    Science.gov (United States)

    Smerkova, Kristyna; Vaculovic, Tomas; Vaculovicova, Marketa; Kynicky, Jindrich; Brtnicky, Martin; Eckschlager, Tomas; Stiborova, Marie; Hubalek, Jaromir; Adam, Vojtech

    2017-12-15

    The main mechanism of action of platinum-based cytostatic drugs (cisplatin, oxaliplatin and carboplatin) is the formation of DNA cross-links, which restricts transcription because the DNA is unable to enter the active site of the polymerase. The polymerase chain reaction (PCR) was employed as a simplified model of the amplification process in the cell nucleus. PCR with fluorescently labelled dideoxynucleotides commonly employed for DNA sequencing was used to monitor the effect of platinum-based cytostatics on DNA in terms of the decrease in labelling efficiency dependent on the presence of the DNA-drug cross-link. It was found that significantly different amounts of the drugs, cisplatin (0.21 μg/mL), oxaliplatin (5.23 μg/mL), and carboplatin (71.11 μg/mL), were required to cause the same quenching effect (50%) on the fluorescent labelling of 50 μg/mL of DNA. Moreover, it was found that even though the amounts of the drugs applied to the reaction mixture differed by several orders of magnitude, the amount of incorporated platinum, quantified by inductively coupled plasma mass spectrometry, was in all cases at the level of tenths of μg per 5 μg of DNA. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens.

    Science.gov (United States)

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-09-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16-lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content. © 2014 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
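
    The tagging step described above implies a demultiplexing pass that assigns each read back to its specimen by its 10-mer tag. The sketch below shows one simple way to do this, assuming the tag sits at the 5' end of every read and is matched exactly; the function name, the tag sequences and the specimen labels are all hypothetical, and real pipelines usually tolerate sequencing errors in the tag.

```python
from collections import defaultdict


def demultiplex(reads, tag_to_specimen, tag_len=10):
    """Group reads by the tag at their 5' end (exact match only).

    Returns (reads per specimen with the tag trimmed, list of unassigned reads).
    """
    bins = defaultdict(list)
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        specimen = tag_to_specimen.get(tag)
        if specimen is None:
            unassigned.append(read)  # unknown or damaged tag
        else:
            bins[specimen].append(insert)
    return dict(bins), unassigned


if __name__ == "__main__":
    # Hypothetical 10-mer tags and specimen labels, for illustration only.
    tags = {"ACGTACGTAC": "specimen_001", "TTGCAATGCC": "specimen_002"}
    reads = [
        "ACGTACGTACGGATTTCCA",
        "TTGCAATGCCGGATCTCCA",
        "ACGTACGTACGGATTTCCG",
        "NNNNNNNNNNGGATTTCCA",
    ]
    assigned, leftover = demultiplex(reads, tags)
    print({name: len(group) for name, group in assigned.items()}, len(leftover))
```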

  8. PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context.

    Science.gov (United States)

    Zhou, Jiyun; Xu, Ruifeng; He, Yulan; Lu, Qin; Wang, Hongpeng; Kong, Bing

    2016-06-10

    Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machines (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context and that the combination of them gives a further performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web server for our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely accessible to the biological research community.

  9. 18S rDNA sequences and the holometabolous insects.

    Science.gov (United States)

    Carmean, D; Kimsey, L S; Berbee, M L

    1992-12-01

    The Holometabola (insects with complete metamorphosis: beetles, wasps, flies, fleas, butterflies, lacewings, and others) is a monophyletic group that includes the majority of the world's animal species. Holometabolous orders are well defined by morphological characters, but relationships among orders are unclear. In a search for a region of DNA that will clarify the interordinal relationships we sequenced approximately 1080 nucleotides of the 5' end of the 18S ribosomal RNA gene from representatives of 14 families of insects in the orders Hymenoptera (sawflies and wasps), Neuroptera (lacewing and antlion), Siphonaptera (flea), and Mecoptera (scorpionfly). We aligned the sequences with the published sequences of insects from the orders Coleoptera (beetle) and Diptera (mosquito and Drosophila), and the outgroups aphid, shrimp, and spider. Unlike the other insects examined in this study, the neuropterans have A-T rich insertions or expansion regions: one in the antlion was approximately 260 bp long. The dipteran 18S rDNA evolved rapidly, with over 3 times as many substitutions among the aligned sequences, and 2-3 times more unalignable nucleotides than other Holometabola, in violation of an insect-wide molecular clock. When we excluded the long-branched taxa (Diptera, shrimp, and spider) from the analysis, the most parsimonious (minimum-length) trees placed the beetle basal to other holometabolous orders, and supported a morphologically monophyletic clade including the fleas+scorpionflies (96% bootstrap support). However, most interordinal relationships were not significantly supported when tested by maximum likelihood or bootstrapping and were sensitive to the taxa included in the analysis. The most parsimonious and maximum-likelihood trees both separated the Coleoptera and Neuroptera, but this separation was not statistically significant.(ABSTRACT TRUNCATED AT 250 WORDS)

  10. Maternal Plasma DNA and RNA Sequencing for Prenatal Testing.

    Science.gov (United States)

    Tamminga, Saskia; van Maarle, Merel; Henneman, Lidewij; Oudejans, Cees B M; Cornel, Martina C; Sistermans, Erik A

    2016-01-01

    Cell-free DNA (cfDNA) testing has recently become indispensable in diagnostic testing and screening. In the prenatal setting, this type of testing is often called noninvasive prenatal testing (NIPT). With a number of techniques, using either next-generation sequencing or single nucleotide polymorphism-based approaches, fetal cfDNA in maternal plasma can be analyzed to screen for rhesus D genotype, common chromosomal aneuploidies, and increasingly for testing other conditions, including monogenic disorders. With regard to screening for common aneuploidies, challenges arise when implementing NIPT in current prenatal settings. Depending on the method used (targeted or nontargeted), chromosomal anomalies other than trisomy 21, 18, or 13 can be detected, either of fetal or maternal origin, also referred to as unsolicited or incidental findings. For various biological reasons, there is a small chance of having either a false-positive or false-negative NIPT result, or no result, also referred to as a "no-call." Both pre- and posttest counseling for NIPT should include discussing potential discrepancies. Since NIPT remains a screening test, a positive NIPT result should be confirmed by invasive diagnostic testing (either by chorionic villus biopsy or by amniocentesis). As the scope of NIPT is widening, professional guidelines need to discuss the ethics of what to offer and how to offer. In this review, we discuss the current biochemical, clinical, and ethical challenges of cfDNA testing in the prenatal setting and its future perspectives including novel applications that target RNA instead of DNA. © 2016 Elsevier Inc. All rights reserved.

  11. DNA from extinct giant lemurs links archaeolemurids to extant indriids

    Directory of Open Access Journals (Sweden)

    Hänni Catherine

    2008-04-01

    Background: Although today 15% of living primates are endemic to Madagascar, their diversity was even greater in the recent past since dozens of extinct species have been recovered from Holocene excavation sites. Among them were the so-called "giant lemurs", some of which weighed up to 160 kg. Although extensively studied, the phylogenetic relationships between extinct and extant lemurs are still difficult to decipher, mainly due to morphological specializations that reflect ecology more than phylogeny, resulting in rampant homoplasy. Results: Ancient DNA recovered from subfossils recently supported a sister relationship between giant "sloth" lemurs and extant indriids and helped to revise the phylogenetic position of Megaladapis edwardsi among lemuriformes, but several taxa, such as the Archaeolemuridae, still await analysis. We therefore used ancient DNA technology to address the phylogenetic status of the two archaeolemurid genera (Archaeolemur and Hadropithecus). Despite poor DNA preservation conditions in subtropical environments, we managed to recover 94- to 539-bp sequences for two mitochondrial genes among 5 subfossil samples. Conclusion: This new sequence information provides evidence for the proximity of Archaeolemur and Hadropithecus to extant indriids, in agreement with earlier assessments of their taxonomic status (Primates, Indrioidea) and in contrast to recent suggestions of a closer relationship to the Lemuridae made on the basis of analyses of dental developmental and postcranial characters. These data provide new insights into the evolution of the locomotor apparatus among lemurids and indriids.

  12. Is photocleavage of DNA by YOYO-1 using a synchrotron radiation light source sequence dependent?

    DEFF Research Database (Denmark)

    Gilroy, Emma L.; Hoffmann, Søren Vrønning; Jones, Nykola C.

    2011-01-01

    ) throughout the irradiation period. The dependence of LD signals on DNA sequences and on time in the intense light beam was explored and quantified for single-stranded poly(dA), poly[(dA-dT)2], calf thymus DNA (ctDNA) and Micrococcus luteus DNA (mlDNA). The DNA and ligand regions of the spectrum showed...... was predominantly responsible for the catalysis of DNA cleavage. In homopolymeric DNAs, intercalated YOYO was unable to cleave DNA. In mixed-sequence DNAs the data suggest that YOYO in some but not all intercalated binding sites can cause cleavage. It is also likely that cleavage occurs at transient single...

  13. Recent progress in atomistic simulation of electrical current DNA sequencing.

    Science.gov (United States)

    Kim, Han Seul; Kim, Yong-Hoon

    2015-07-15

    We review recent advances in the DNA sequencing method based on measurements of transverse electrical currents. Device configurations proposed in the literature are classified according to whether the molecular fingerprints appear as the major (Mode I) or perturbing (Mode II) current signals. Scanning tunneling microscope and tunneling electrode gap configurations belong to the former category, while the nanochannels with or without an embedded nanopore belong to the latter. The molecular sensing mechanisms of Modes I and II roughly correspond to the electron tunneling and electrochemical gating, respectively. Special emphasis will be given on the computer simulation studies, which have been playing a critical role in the initiation and development of the field. We also highlight low-dimensional nanomaterials such as carbon nanotubes, graphene, and graphene nanoribbons that allow the novel Mode II approach. Finally, several issues in previous computational studies are discussed, which points to future research directions toward more reliable simulation of electrical current DNA sequencing devices. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. Mylodon darwinii DNA sequences from ancient fecal hair shafts.

    Science.gov (United States)

    Clack, Andrew A; MacPhee, Ross D E; Poinar, Hendrik N

    2012-01-20

    Preserved hair has been increasingly used as an ancient DNA source in high throughput sequencing endeavors, and it may actually offer several advantages compared to more traditional ancient DNA substrates like bone. However, cold environments have yielded the most informative ancient hair specimens, while its preservation, and thus utility, in temperate regions is not well documented. Coprolites could represent a previously underutilized preservation substrate for hairs, which, if present therein, represent macroscopic packages of specific cells that are relatively simple to separate, clean and process. In this pilot study, we report amplicons 147-152 base pairs in length (with primers) from hair shafts preserved in a south Chilean coprolite attributed to Darwin's extinct ground sloth, Mylodon darwinii. Our results suggest that hairs preserved in coprolites from temperate cave environments can serve as an effective source of ancient DNA. This bodes well for potential molecular-based population and phylogeographic studies on sloths, several species of which have been understudied despite leaving numerous coprolites in caves across the Americas. Copyright © 2011. Published by Elsevier GmbH.

  15. The Eemian/Early Vistulian development of the Solniki paleolake (north-eastern Poland) as shown by subfossil Cladocera

    Directory of Open Access Journals (Sweden)

    Monika Magdalena Niska

    2016-12-01

    This work presents results of a paleolimnological study focussed on subfossil Cladocera analysis and on different aspects of the evolution of the Solniki paleolake during the Eemian/Early Vistulian period. The study aimed at the reconstruction of the long-term dynamics of this paleoecosystem and at defining the conditions (e.g., water level, trophic status and water temperature) of the ancient lake. Paleolacustrine deposits of ca. 10 m thickness were discovered at Solniki during cartographic works for the Trześcianka sheet of the Detailed Geological Map of Poland. This archive recorded one full interglacial sequence (Eemian Interglacial), one interstadial warming (Brørup) and two stadial coolings (Herning and Rederstall stages), which were confirmed by palynological analyses. The subfossil Cladocera fauna from the Solniki paleolake consisted of 17 species belonging to the families Bosminidae, Chydoridae, Sididae and Daphniidae. Littoral species were dominant (52%), the most frequent of which were Alona affinis and Camptocercus rectirostris. The most abundant pelagic species were Eubosmina coregoni and Bosmina longirostris. The sediment species composition was quite similar to that of contemporary Central European lakes. The early and the late stages of the Eemian Interglacial were likely the most favourable periods for Cladocera development in the paleolake, in relation to higher water level, moderate water temperature and the mesotrophic state of the water. A further ecologically favourable period was the Brørup Interstadial. The highest species richness, abundance, and diversity during the whole paleolake existence were recorded during these three periods. Surprisingly, the middle of the Middle Eemian Interglacial climate optimum appeared as an unfavourable period for Cladocera growth, as it was associated with decreasing water level and pronounced climate fluctuations. This sequence was also recorded by other studies of Eemian lakes in Central

  16. A novel method for comparative analysis of DNA sequences by Ramanujan-Fourier transform.

    Science.gov (United States)

    Yin, Changchuan; Yin, Xuemeng E; Wang, Jiasong

    2014-12-01

    Alignment-free sequence analysis approaches provide important alternatives to multiple sequence alignment (MSA) in biological sequence analysis because alignment-free approaches have low computational complexity and do not depend on a high level of sequence identity. However, most of the existing alignment-free methods do not employ the full information content of sequences and thus cannot accurately reveal similarities and differences among DNA sequences. We present a novel alignment-free computational method for sequence analysis based on the Ramanujan-Fourier transform (RFT), in which complete information of DNA sequences is retained. We represent DNA sequences as four binary indicator sequences and apply the RFT to the indicator sequences to convert them into the frequency domain. The Euclidean distances between the complete RFT coefficients of DNA sequences are used as similarity measures. To address the different lengths of RFT coefficients in Euclidean space, we pad zeros to short DNA binary sequences so that the binary sequences equal the longest length in the comparison sequence data. Thus, the DNA sequences are compared in the same dimensional frequency space without information loss. We demonstrate the usefulness of the proposed method by presenting experimental results on hierarchical clustering of genes and genomes. The proposed method opens a new channel to biological sequence analysis, classification, and structural module identification.
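
    The pipeline described in this abstract (four binary indicator sequences, zero padding to a common length, RFT, Euclidean distance) can be sketched in a few lines. This is an illustration under stated assumptions and not the authors' implementation: the finite-length normalization X(q) = (1/(φ(q)·N)) Σ x(n)·c_q(n), the choice of computing coefficients for q = 1..N, and all function names are assumptions made here. The brute-force sums are only practical for short sequences.

```python
from math import cos, gcd, pi, sqrt


def ramanujan_sum(q, n):
    """c_q(n): sum of cos(2*pi*k*n/q) over k in 1..q coprime to q (always an integer)."""
    return round(sum(cos(2 * pi * k * n / q) for k in range(1, q + 1) if gcd(k, q) == 1))


def totient(q):
    """Euler's phi function by direct counting."""
    return sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)


def rft(signal, max_q):
    """Finite-length RFT coefficients X(1..max_q) under the assumed normalization."""
    n_len = len(signal)
    return [
        sum(x * ramanujan_sum(q, n) for n, x in enumerate(signal, start=1))
        / (totient(q) * n_len)
        for q in range(1, max_q + 1)
    ]


def indicator_sequences(dna, length):
    """One 0/1 sequence per base, zero-padded on the right to the requested length."""
    dna = dna.upper()
    return {
        base: [1 if c == base else 0 for c in dna] + [0] * (length - len(dna))
        for base in "ACGT"
    }


def rft_distance(dna_a, dna_b):
    """Euclidean distance between the RFT coefficients of two DNA strings."""
    length = max(len(dna_a), len(dna_b))
    ind_a = indicator_sequences(dna_a, length)
    ind_b = indicator_sequences(dna_b, length)
    total = 0.0
    for base in "ACGT":
        coeffs_a = rft(ind_a[base], length)
        coeffs_b = rft(ind_b[base], length)
        total += sum((u - v) ** 2 for u, v in zip(coeffs_a, coeffs_b))
    return sqrt(total)


if __name__ == "__main__":
    print(rft_distance("ACGTACGT", "ACGTACGA"))
```

    A matrix of such pairwise distances is the kind of input a hierarchical clustering routine would take.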

  17. A Novel Computational Method for Detecting DNA Methylation Sites with DNA Sequence Information and Physicochemical Properties.

    Science.gov (United States)

    Pan, Gaofeng; Jiang, Limin; Tang, Jijun; Guo, Fei

    2018-02-08

    DNA methylation is an important biochemical process, and it has a close connection with many types of cancer. Research about DNA methylation can help us to understand the regulation mechanism and epigenetic reprogramming. Therefore, it becomes very important to recognize the methylation sites in the DNA sequence. In the past several decades, many computational methods, especially machine learning methods, have been developed since high-throughput sequencing technology became widely used in research and industry. In order to accurately identify whether or not a nucleotide residue is methylated under the specific DNA sequence context, we propose a novel method that overcomes the shortcomings of previous methods for predicting methylation sites. We use k-gram, multivariate mutual information, discrete wavelet transform, and pseudo amino acid composition to extract features, and train a sparse Bayesian learning model to do DNA methylation prediction. Five criteria, namely area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), accuracy (ACC), sensitivity (SN), and specificity, are used to evaluate the prediction results of our method. On the benchmark dataset, we could reach 0.8632 on AUC, 0.8017 on ACC, 0.5558 on MCC, and 0.7268 on SN. Additionally, the best results on two scBS-seq profiled mouse embryonic stem cells datasets were 0.8896 and 0.9511 by AUC, respectively. When compared with other outstanding methods, our method surpassed them on the accuracy of prediction. The improvement of AUC by our method compared to other methods was at least 0.0399. For the convenience of other researchers, our code has been uploaded to a file hosting service, and can be downloaded from: https://figshare.com/s/0697b692d802861282d3.
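
    Four of the five evaluation criteria named above follow directly from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not taken from the paper. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Return ACC, SN (sensitivity), SP (specificity) and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total if total else 0.0
    sn = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    sp = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(binary_metrics(tp=40, fp=10, tn=40, fn=10))
```

    MCC stays informative when methylated and unmethylated sites are unbalanced, which is one reason it is commonly reported alongside accuracy.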

  18. Potential use of DNA barcoding for the identification of Salvia based on cpDNA and nrDNA sequences.

    Science.gov (United States)

    Wang, Meng; Zhao, Hong-Xia; Wang, Long; Wang, Tao; Yang, Rui-Wu; Wang, Xiao-Li; Zhou, Yong-Hong; Ding, Chun-Bang; Zhang, Li

    2013-10-10

    An effective DNA marker for authenticating the genus Salvia was screened using seven DNA regions (rbcL, matK, trnL-F, and psbA-trnH from the chloroplast genome, and ITS, ITS1, and ITS2 from the nuclear genome) and three combinations (rbcL+matK, psbA-trnH+ITS1, and trnL-F+ITS1). The present study collected 232 sequences from 27 Salvia species through DNA sequencing and 77 sequences within the same taxa from GenBank. The discriminatory capabilities of these regions were evaluated in terms of PCR amplification success, intraspecific and interspecific divergence, DNA barcoding gaps, and identification efficiency via a tree-based method. ITS1 was superior to the other markers for discriminating between species, with an accuracy of 81.48%. The three combinations did not increase species discrimination. Finally, we found that ITS1 is a powerful barcode for identifying Salvia species, especially Salvia miltiorrhiza. © 2013.

  19. Detection and mapping of mtDNA SNPs in Atlantic salmon using high throughput DNA sequencing

    Directory of Open Access Journals (Sweden)

    Olafsdottir Gudbjorg

    2011-04-01

    Full Text Available Abstract Background: Approximately half of the mitochondrial genome inherent within 546 individual Atlantic salmon (Salmo salar), derived from across the species' North Atlantic range, was selectively amplified with a novel combination of standard PCR and pyrosequencing in a single run using 454 Titanium FLX technology (Roche, 454 Life Sciences). A unique combination of barcoded primers and a partitioned sequencing plate was employed to designate each sequence read to its original sample. The sequence reads were aligned according to the S. salar mitochondrial reference sequence (NC_001960.1), with the objective of identifying single nucleotide polymorphisms (SNPs). They were validated if they met the following three stringent criteria: (i) sequence reads were produced from both DNA strands; (ii) SNPs were confirmed in a minimum of 90% of replicate sequence reads; and (iii) SNPs occurred in more than one individual. Results: Pyrosequencing generated a total of 179,826,884 bp of data, and 10,765 of the total 10,920 S. salar sequences (98.6%) were assigned back to their original samples. The approach taken resulted in a total of 216 SNPs and 2 indels, which were validated and mapped onto the S. salar mitochondrial genome, including 107 SNPs and one indel not previously reported. An average of 27.3 sequence reads with a standard deviation of 11.7 supported each SNP per individual. Conclusion: The study generated a mitochondrial SNP panel from a large sample group across a broad geographical area, reducing the potential for ascertainment bias, which has hampered previous studies. The SNPs identified here validate those identified in previous studies, and also contribute additional potentially informative loci for the future study of phylogeography and evolution in the Atlantic salmon. The overall success experienced with this novel application of HT sequencing of targeted regions suggests that the same approach could be successfully applied for SNP mining
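
    The three validation criteria listed in this abstract can be sketched as a small filter. This is an illustration only, not the authors' pipeline: the record type `SnpObservation`, its field names, and the reading of criterion (ii) as "variant-supporting reads divided by all reads covering the position in that individual" are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SnpObservation:
    """Hypothetical per-individual read counts at one candidate SNP position."""
    individual: str
    forward_reads: int  # variant-supporting reads from the forward strand
    reverse_reads: int  # variant-supporting reads from the reverse strand
    total_reads: int    # all reads covering the position in this individual


def individual_supports_snp(obs, min_fraction=0.90):
    """Criteria (i) and (ii): both strands seen, and >= 90% of reads agree."""
    if obs.total_reads == 0:
        return False
    both_strands = obs.forward_reads > 0 and obs.reverse_reads > 0
    fraction = (obs.forward_reads + obs.reverse_reads) / obs.total_reads
    return both_strands and fraction >= min_fraction


def validate_snp(observations, min_fraction=0.90, min_individuals=2):
    """Criterion (iii): the SNP must be supported in more than one individual."""
    supporting = {
        obs.individual
        for obs in observations
        if individual_supports_snp(obs, min_fraction)
    }
    return len(supporting) >= min_individuals
```

    With these assumptions, a candidate seen on both strands in two salmon passes, while one seen in a single fish, or on one strand only, is rejected.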

  20. Determination of cDNA and genomic DNA sequences of hevamine, a chitinase from the rubber tree Hevea brasiliensis

    NARCIS (Netherlands)

    Bokma, E; Spiering, M; Chow, KS; Mulder, PPMFA; Subroto, T; Beintema, JJ

    Hevamine is a chitinase from the rubber tree Hevea brasiliensis and belongs to the family 18 glycosyl hydrolases. This paper describes the cloning of hevamine DNA and cDNA sequences. Hevamine contains a signal peptide at the N-terminus and a putative vacuolar targeting sequence at the C-terminus

  1. Construction of a Sequencing Library from Circulating Cell-Free DNA.

    Science.gov (United States)

    Fang, Nan; Löffert, Dirk; Akinci-Tolun, Rumeysa; Heitz, Katja; Wolf, Alexander

    2016-04-01

    Circulating DNA is cell-free DNA (cfDNA) in serum or plasma that can be used for non-invasive prenatal testing, as well as cancer diagnosis, prognosis, and stratification. High-throughput sequence analysis of the cfDNA with next-generation sequencing technologies has proven to be a highly sensitive and specific method in detecting and characterizing mutations in cancer and other diseases, as well as aneuploidy during pregnancy. This unit describes detailed procedures to extract circulating cfDNA from human serum and plasma and generate sequencing libraries from a wide concentration range of circulating DNA. Copyright © 2016 John Wiley & Sons, Inc.

  2. The evolution processes of DNA sequences, languages and carols

    Science.gov (United States)

    Hauck, Jürgen; Henkel, Dorothea; Mika, Klaus

    2001-04-01

    The sequences of bases A, T, C and G of about 100 enolase, secA and cytochrome DNA were analyzed for attractive or repulsive interactions by the numbers T1, T2, T3 of nearest, next-nearest and third neighbor bases of the same kind and the concentration r = other bases/analyzed base. The area of possible T1, T2 values is limited by the linear borders T2 = 2T1 - 2, T2 = 0 or T1 = 0 for clustering, attractive or repulsive interactions and the border T2 = -2T1 + 2(2 - r) for a variation from repulsive to attractive interactions at r ≤ 2. Clustering is preferred by most bases in sequences of enolases and secA's. Major deviations with repulsive interactions of some bases are observed for archaea bacteria in secA and for highly developed animals and the human species in enolase sequences. The borders of the structure map for enthalpy stabilized structures with maximum interactions are approached in few cases. Most letters of the natural languages and some music notes are at the borders of the structure map.

  3. Long-range correlations and charge transport properties of DNA sequences

    Science.gov (United States)

    Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui

    2010-04-01

    By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5sequence displays a transition from correlation behavior to anticorrelation behavior. The resonant peaks of the transmission coefficient in genomic sequences can survive in longer sequence length than in random sequences but in shorter sequence length than in quasiperiodic sequences. It is shown that the genomic sequences have long-range correlation properties to some extent but the correlations are not strong enough to maintain the scale invariance properties.
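
    The rescaled-range (R/S) analysis named in this abstract can be illustrated with a minimal sketch. It is not the authors' code. The purine/pyrimidine mapping (A, G to +1; C, T to -1), the window sizes, and the function names are assumptions chosen for the example. The Hurst exponent is estimated as the slope of log(R/S) against log(window size); values below 0.5 indicate anticorrelation and values above 0.5 indicate correlation.

```python
import math


def rescaled_range(values):
    """R/S statistic: range of cumulative mean deviations over the std deviation."""
    n = len(values)
    mean = sum(values) / n
    cumulative, running = [], 0.0
    for v in values:
        running += v - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return spread / std if std > 0 else 0.0


def hurst_exponent(sequence, window_sizes=(8, 16, 32, 64)):
    """Least-squares slope of log(mean R/S) versus log(window size)."""
    # Assumed mapping: purines (A, G) -> +1, pyrimidines (C, T) -> -1.
    walk = [1.0 if base in "AG" else -1.0 for base in sequence.upper()]
    xs, ys = [], []
    for n in window_sizes:
        chunks = [walk[i:i + n] for i in range(0, len(walk) - n + 1, n)]
        rs = [r for r in (rescaled_range(c) for c in chunks) if r > 0]
        if rs:
            xs.append(math.log(n))
            ys.append(math.log(sum(rs) / len(rs)))
    if len(xs) < 2:
        raise ValueError("sequence too short for the requested window sizes")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
```

    A strictly alternating purine/pyrimidine sequence is the extreme anticorrelated case: its R/S value does not grow with window size, so the estimated slope is zero.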

  4. Long-range correlations and charge transport properties of DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Liu Xiaoliang, E-mail: xlliucsu@yahoo.com.c [College of Physical Science and Technology and College of Metallurgical Science and Engineering, Central South University, Changsha 410083 (China); Ren, Yi [College of Physical Science and Technology and College of Metallurgical Science and Engineering, Central South University, Changsha 410083 (China); Xie, Qiong-tao [Key Laboratory of Low Dimensional Quantum Structures and Quantum Control of Ministry of Education (Hunan Normal University), Changsha 410081 (China); Deng, Chao-sheng; Xu, Hui [College of Physical Science and Technology and College of Metallurgical Science and Engineering, Central South University, Changsha 410083 (China)

    2010-04-26

    By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that lambda-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5sequence displays a transition from correlation behavior to anticorrelation behavior. The resonant peaks of the transmission coefficient in genomic sequences can survive in longer sequence length than in random sequences but in shorter sequence length than in quasiperiodic sequences. It is shown that the genomic sequences have long-range correlation properties to some extent but the correlations are not strong enough to maintain the scale invariance properties.

  5. Structural analysis of DNA sequence: evidence for lateral gene transfer in Thermotoga maritima

    DEFF Research Database (Denmark)

    Worning, Peder; Jensen, Lars Juhl; Nelson, K. E.

    2000-01-01

    The recently published complete DNA sequence of the bacterium Thermotoga maritima provides evidence, based on protein sequence conservation, for lateral gene transfer between Archaea and Bacteria. We introduce a new method of periodicity analysis of DNA sequences, based on structural parameters......, which brings independent evidence for the lateral gene transfer in the genome of T. maritima. The structural analysis relates the Archaea-like DNA sequences to the genome of Pyrococcus horikoshii. Analysis of 24 complete genomic DNA sequences shows different periodicity patterns for organisms...

  6. Transfection of the inner cell mass and lack of a unique DNA sequence affecting the uptake of exogenous DNA by sperm as shown by dideoxy sequencing analogues.

    Science.gov (United States)

    Cabrera, M; Chan, P J; Kalugdan, T H; King, A

    1997-02-01

    The purpose of this study was to determine whether exogenous DNA internalized into blastocysts after transference from DNA-carrier sperm are localized at the inner cell mass or trophoblast cells and to identify differences in uptake of exogenous DNA fragments by sperm due to unique DNA sequences. Mouse blastocysts at the hatching stage were exposed to migrating human sperm cells carrying exogenous DNA fragments synthesized from the E6-E7 conserved gene regions of human papillomavirus (HPV) types 16 and 18. After an interaction period of 2 hr, the transfected blastocysts were washed several times to remove extraneous sperm and the blastocysts were dissected into groups of cells derived from the inner cell mass and trophoblasts. The cells were analyzed by polymerase chain reaction (PCR) for the presence of HPV DNA fragments. In the second part of the experiment, thawed donor (N = 10) sperm cells were pooled, washed, and divided into two fractions. The first (control) fraction was added with formalin and further divided and added with a 35S-radiolabeled G, A, T, or C sequencing mixture. The second fraction was similarly treated but the formalin step was omitted from the treatment. After an hour of incubation at 37 degrees C, the sperm specimens were washed several times by centrifugation and DNA extracted by the GeneReleaser method. The extracted DNA were processed on sequence gels, and the autoradiographs analyzed. Mouse blastocysts transfected by carrier sperm with DNA from HPV types 16 and 18 showed localization of the HPV DNA to both the inner cell mass and trophoblast cells. Negative controls consisting of untreated human sperm and untreated mouse blastocysts did not reveal any evidence of HPV DNA. The positive sperm control generated expected DNA fragments from HPV types 16 and 18. In the second experiment, the intensities of the DNA fragments in the G, A, T, and C columns from low to high molecular weights were not different from the positive control bands

  7. Repetitive sequence analysis and karyotyping reveals centromere-associated DNA sequences in radish (Raphanus sativus L.).

    Science.gov (United States)

    He, Qunyan; Cai, Zexi; Hu, Tianhua; Liu, Huijun; Bao, Chonglai; Mao, Weihai; Jin, Weiwei

    2015-04-18

    Radish (Raphanus sativus L., 2n = 2x = 18) is a major root vegetable crop especially in eastern Asia. Radish root contains various nutrients which play an important role in strengthening immunity. Repetitive elements are primary components of the genomic sequence and the most important factors in genome size variations in higher eukaryotes. To date, studies about repetitive elements of radish are still limited. To better understand genome structure of radish, we undertook a study to evaluate the proportion of repetitive elements and their distribution in radish. We conducted genome-wide characterization of repetitive elements in radish with low coverage genome sequencing followed by similarity-based cluster analysis. Results showed that about 31% of the genome was composed of repetitive sequences. Satellite repeats were the most dominating elements of the genome. The distribution pattern of three satellite repeat sequences (CL1, CL25, and CL43) on radish chromosomes was characterized using fluorescence in situ hybridization (FISH). CL1 was predominantly located at the centromeric region of all chromosomes, CL25 located at the subtelomeric region, and CL43 was a telomeric satellite. FISH signals of two satellite repeats, CL1 and CL25, together with 5S rDNA and 45S rDNA, provide useful cytogenetic markers to identify each individual somatic metaphase chromosome. The centromere-specific histone H3 (CENH3) has been used as a marker to identify centromere DNA sequences. One putative CENH3 (RsCENH3) was characterized and cloned from radish. Its deduced amino acid sequence shares high similarities to those of the CENH3s in Brassica species. An antibody against B. rapa CENH3 specifically stained radish centromeres. Immunostaining and chromatin immunoprecipitation (ChIP) tests with anti-BrCENH3 antibody demonstrated that both the centromere-specific retrotransposon (CR-Radish) and satellite repeat (CL1) are directly associated with RsCENH3 in radish. Proportions

  8. New radiocarbon dates for Milu (Elaphurus davidianus) sub-fossils from southeast China

    Energy Technology Data Exchange (ETDEWEB)

    Ding, X.F. [State Key Laboratory of Nuclear Physics and Technology and Institute of Heavy Ion Physics, School of Physics, Peking University, Beijing 100871 (China); Shen, C.D., E-mail: cdshen@gig.ac.cn [State Key Laboratory of Isotope Geochemistry, Guangzhou Institute of Geochemistry, Chinese Academy of Sciences, 510640 Guangzhou (China); Ding, P.; Yi, W.X. [State Key Laboratory of Isotope Geochemistry, Guangzhou Institute of Geochemistry, Chinese Academy of Sciences, 510640 Guangzhou (China); Fu, D.P.; Liu, K.X. [State Key Laboratory of Nuclear Physics and Technology and Institute of Heavy Ion Physics, School of Physics, Peking University, Beijing 100871 (China)

    2013-01-15

    Milu (Elaphurus davidianus, Père David's deer) is one of the few species of large mammals that became extinct in the wild, but survived domestically. A good understanding of expansion and habitat is required if the reintroduction of Milu into the wild is to be implemented. Among the widely reported findings of Milu sub-fossils, only a small fraction have been dated. Here we report new AMS radiocarbon dates on Milu sub-fossil samples unearthed from two sites at Qingdun, Jiangsu and Fujiashan, Zhejiang in southeast China. These AMS 14C ages of Milu sub-fossils provide new evidence for the presence of Milu expansion in the lower reaches of the Yangtze River during the Holocene Optimum interval from 5000 yr BC to 3000 yr BC. These new ages also have important implications for the reconstruction of the paleoclimate and paleogeography during the Neolithic Period in southeast China.

  9. New radiocarbon dates for Milu (Elaphurus davidianus) sub-fossils from southeast China

    Science.gov (United States)

    Ding, X. F.; Shen, C. D.; Ding, P.; Yi, W. X.; Fu, D. P.; Liu, K. X.

    2013-01-01

    Milu (Elaphurus davidianus, Père David’s deer) is one of the few species of large mammals that became extinct in the wild, but survived domestically. A good understanding of expansion and habitat is required if the reintroduction of Milu into the wild is to be implemented. Among the widely reported findings of Milu sub-fossils, only a small fraction have been dated. Here we report new AMS radiocarbon dates on Milu sub-fossil samples unearthed from two sites at Qingdun, Jiangsu and Fujiashan, Zhejiang in southeast China. These AMS 14C ages of Milu sub-fossils provide new evidence for the presence of Milu expansion in the lower reaches of the Yangtze River during the Holocene Optimum interval from 5000 yr BC to 3000 yr BC. These new ages also have important implications for the reconstruction of the paleoclimate and paleogeography during the Neolithic Period in southeast China.

  10. Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping

    Science.gov (United States)

    K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale

    1998-01-01

    DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...

  11. Bloom DNA helicase facilitates homologous recombination between diverged homologous sequences.

    Science.gov (United States)

    Kikuchi, Koji; Abdel-Aziz, H Ismail; Taniguchi, Yoshihito; Yamazoe, Mitsuyoshi; Takeda, Shunichi; Hirota, Kouji

    2009-09-25

    Bloom syndrome caused by inactivation of the Bloom DNA helicase (Blm) is characterized by increases in the level of sister chromatid exchange, homologous recombination (HR) associated with cross-over. It is therefore believed that Blm works as an anti-recombinase. Meanwhile, in Drosophila, DmBlm is required specifically to promote the synthesis-dependent strand anneal (SDSA), a type of HR not associating with cross-over. However, conservation of Blm function in SDSA through higher eukaryotes has been a matter of debate. Here, we demonstrate the function of Blm in SDSA type HR in chicken DT40 B lymphocyte line, where Ig gene conversion diversifies the immunoglobulin V gene through intragenic HR between diverged homologous segments. This reaction is initiated by the activation-induced cytidine deaminase enzyme-mediated uracil formation at the V gene, which in turn converts into abasic site, presumably leading to a single strand gap. Ig gene conversion frequency was drastically reduced in BLM(-/-) cells. In addition, BLM(-/-) cells used limited donor segments harboring higher identity compared with other segments in Ig gene conversion event, suggesting that Blm can promote HR between diverged sequences. To further understand the role of Blm in HR between diverged homologous sequences, we measured the frequency of gene targeting induced by an I-SceI-endonuclease-mediated double-strand break. BLM(-/-) cells showed a severer defect in the gene targeting frequency as the number of heterologous sequences increased at the double-strand break site. Conversely, the overexpression of Blm, even an ATPase-defective mutant, strongly stimulated gene targeting. In summary, Blm promotes HR between diverged sequences through a novel ATPase-independent mechanism.

  12. Genome-Wide Prediction of DNA Methylation Using DNA Composition and Sequence Complexity in Human

    Directory of Open Access Journals (Sweden)

    Chengchao Wu

    2017-02-01

    Full Text Available DNA methylation plays a significant role in transcriptional regulation by repressing activity. Change of the DNA methylation level is an important factor affecting the expression of target genes and downstream phenotypes. Because current experimental technologies can only assay a small proportion of CpG sites in the human genome, it is urgent to develop reliable computational models for predicting genome-wide DNA methylation. Here, we proposed a novel algorithm that accurately extracted sequence complexity features (seven features) and developed a support-vector-machine-based prediction model with integration of the reported DNA composition features (trinucleotide frequency and GC content, 65 features) by utilizing the methylation profiles of embryonic stem cells in human. The prediction results from 22 human chromosomes with size-varied windows showed that the 600-bp window achieved the best average accuracy of 94.7%. Moreover, comparisons with two existing methods further showed the superiority of our model, and cross-species predictions on mouse data also demonstrated that our model has certain generalization ability. Finally, a statistical test of the experimental data and the predicted data on functional regions annotated by ChromHMM found that six out of 10 regions were consistent, which implies reliable prediction of unassayed CpG sites. Accordingly, we believe that our novel model will be useful and reliable in predicting DNA methylation.
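
    The DNA composition features mentioned in this abstract can be sketched as follows. This is a hedged illustration, not the published feature extractor: it assumes the 65 features are the 64 trinucleotide frequencies plus one GC-content value, and the function name `composition_features` is hypothetical. Trinucleotides containing characters other than A, C, G, T (for example N) are skipped.

```python
from itertools import product

# All 64 trinucleotides over the alphabet ACGT, in a fixed order.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def composition_features(window):
    """Return 64 trinucleotide frequencies followed by GC content (65 values)."""
    seq = window.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    # Slide a 3-base window along the sequence, one base at a time.
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if tri in counts:
            counts[tri] += 1
            total += 1
    freqs = [counts[t] / total if total else 0.0 for t in TRINUCLEOTIDES]
    # GC content is computed over unambiguous bases only.
    acgt = sum(seq.count(b) for b in "ACGT")
    gc = (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0
    return freqs + [gc]
```

    A vector like this, computed per window (the abstract reports 600 bp as the best window size), could then be passed to any classifier; the classifier itself is outside this sketch.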

  13. Beyond DNA Sequencing in Space: Current and Future Omics Capabilities of the Biomolecule Sequencer Payload

    Science.gov (United States)

    Wallace, Sarah

    2017-01-01

    Why do we need a DNA sequencer to support the human exploration of space? (A) Operational environmental monitoring; (1) Identification of contaminating microbes, (2) Infectious disease diagnosis, (3) Reduce down mass (sample return for environmental monitoring, crew health, etc.). (B) Research; (1) Human, (2) Animal, (3) Microbes/Cell lines, (4) Plant. (C) Med Ops; (1) Response to countermeasures, (2) Radiation, (3) Real-time analysis can influence medical intervention. (D) Support astrobiology science investigations; (1) Technology superiorly suited to in situ nucleic acid-based life detection, (2) Functional testing for integration into robotics for extraplanetary exploration mission.

  14. Exploring possible DNA structures in real-time polymerase kinetics using Pacific Biosciences sequencer data.

    Science.gov (United States)

    Sawaya, Sterling; Boocock, James; Black, Michael A; Gemmell, Neil J

    2015-01-28

    Pausing of DNA polymerase can indicate the presence of a DNA structure that differs from the canonical double-helix. Here we detail a method to investigate how polymerase pausing in the Pacific Biosciences sequencer reads can be related to DNA sequences. The Pacific Biosciences sequencer uses optics to view a polymerase and its interaction with a single DNA molecule in real-time, offering a unique way to detect potential alternative DNA structures. We have developed a new way to examine polymerase kinetics data and relate it to the DNA sequence by using a wavelet transform of read information from the sequencer. We use this method to examine how polymerase kinetics are related to nucleotide base composition. We then examine tandem repeat sequences known for their ability to form different DNA structures: (CGG)n and (CG)n repeats which can, respectively, form G-quadruplex DNA and Z-DNA. We find pausing around the (CGG)n repeat that may indicate the presence of G-quadruplexes in some of the sequencer reads. The (CG)n repeat does not appear to cause polymerase pausing, but its kinetics signature nevertheless suggests the possibility that alternative nucleotide conformations may sometimes be present. We discuss the implications of using our method to discover DNA sequences capable of forming alternative structures. The analyses presented here can be reproduced on any Pacific Biosciences kinetics data for any DNA pattern of interest using an R package that we have made publicly available.

  15. Episodic Statistics of Evolutionary Substitutions in DNA Sequences

    Science.gov (United States)

    West, Bruce J.

    1998-03-01

    The number of molecular substitutions occurring in a DNA sequence in a given time interval is described by a fractional-difference equation whose statistics are described by a truncated Levy distribution and which has an inverse power law correlation function. This is an empirically motivated stochastic model of molecular evolution and does not address the evolutionary mechanisms that lead to substitutions. The Levy stable process yields a Fano Factor, the ratio of the variance to the mean in the number of molecular substitutions, that increases as a power law in time. This prediction agrees with the observed statistics across 49 different genes in mammals. This model of molecular evolution is episodic and is consistent with the punctuated equilibrium model of macroevolution without making additional statistical assumptions.
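
    The Fano factor is defined in this abstract as the ratio of the variance to the mean of the number of substitutions. A minimal sketch of that ratio follows; the function name and the choice of population variance (rather than sample variance) are assumptions made for the example, and the Lévy-process model itself is not reproduced here.

```python
from statistics import mean, pvariance


def fano_factor(counts):
    """Variance-to-mean ratio of substitution counts (population variance)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("Fano factor is undefined when the mean count is zero")
    return pvariance(counts) / m
```

    For a Poisson process the ratio is 1; the abstract's claim is that for observed substitution counts it instead grows as a power law in time.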

  16. Biomolecule Sequencer: Next-Generation DNA Sequencing Technology for In-Flight Environmental Monitoring, Research, and Beyond

    Science.gov (United States)

    Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.

    2016-01-01

    On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA, will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. We will present our latest results from the ISS flight experiment, the first time DNA has ever been sequenced in space, and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human

  17. Length-independent DNA packing into nanopore zero-mode waveguides for low-input DNA sequencing

    Science.gov (United States)

    Larkin, Joseph; Henley, Robert Y.; Jadhav, Vivek; Korlach, Jonas; Wanunu, Meni

    2017-12-01

    Compared with conventional methods, single-molecule real-time (SMRT) DNA sequencing exhibits longer read lengths, less GC bias, and the ability to read DNA base modifications. However, reading DNA sequence from sub-nanogram quantities is impractical owing to inefficient delivery of DNA molecules into the confines of zero-mode waveguides, zeptolitre optical cavities in which DNA sequencing proceeds. Here, we show that the efficiency of voltage-induced DNA loading into waveguides equipped with nanopores at their floors is five orders of magnitude greater than existing methods. In addition, we find that DNA loading is nearly length-independent, unlike diffusive loading, which is biased towards shorter fragments. We demonstrate here loading and proof-of-principle four-colour sequence readout of a polymerase-bound 20,000-base-pair-long DNA template within seconds from a sub-nanogram input quantity, a step towards low-input DNA sequencing and mammalian epigenomic mapping of native DNA samples.

  18. Analysis of mitochondrial DNA sequences in patients with isolated or combined oxidative phosphorylation system deficiency.

    NARCIS (Netherlands)

    Hinttala, R.; Smeets, R.; Moilanen, J.S.; Ugalde, C.; Uusimaa, J.; Smeitink, J.A.M.; Majamaa, K.

    2006-01-01

    BACKGROUND: Enzyme deficiencies of the oxidative phosphorylation (OXPHOS) system may be caused by mutations in the mitochondrial DNA (mtDNA) or in the nuclear DNA. OBJECTIVE: To analyse the sequences of the mtDNA coding region in 25 patients with OXPHOS system deficiency to identify the underlying

  19. An Efficient Approach to Mining Maximal Contiguous Frequent Patterns from Large DNA Sequence Databases

    Directory of Open Access Journals (Sweden)

    Md. Rezaul Karim

    2012-03-01

    Full Text Available Mining interesting patterns from DNA sequences is one of the most challenging tasks in bioinformatics and computational biology. Maximal contiguous frequent patterns are preferable for expressing the function and structure of DNA sequences and hence can capture the common data characteristics among related sequences. Biologists are interested in finding frequent orderly arrangements of motifs that are responsible for similar expression of a group of genes. In order to reduce mining time and complexity, however, most existing sequence mining algorithms either focus on finding short DNA sequences or require explicit specification of sequence lengths in advance. The challenge is to find longer sequences without specifying sequence lengths in advance. In this paper, we propose an efficient approach to mining maximal contiguous frequent patterns from large DNA sequence datasets. The experimental results show that our proposed approach is memory-efficient and mines maximal contiguous frequent patterns within a reasonable time.
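
    To make the term concrete, here is a naive sketch of mining maximal contiguous frequent patterns. It is explicitly not the efficient approach proposed in the paper, whose details are not given in this abstract. The definitions used are assumptions for the example: a pattern is a contiguous substring, its support is the number of sequences containing it, it is frequent when support reaches `min_support`, and it is maximal when no one-base extension to the left or right is also frequent.

```python
def maximal_contiguous_frequent_patterns(sequences, min_support):
    """Naive level-wise search; no pattern length needs to be given in advance."""

    def support(pattern):
        # Number of sequences that contain the pattern as a substring.
        return sum(pattern in seq for seq in sequences)

    alphabet = sorted(set("".join(sequences)))
    frequent = set()
    level = [c for c in alphabet if support(c) >= min_support]
    while level:
        frequent.update(level)
        # Every frequent pattern of length k+1 has a frequent prefix of length k,
        # so extending each frequent pattern by one base on the right is enough.
        candidates = {p + c for p in level for c in alphabet}
        level = [p for p in candidates if support(p) >= min_support]
    # Keep only patterns with no frequent one-base extension on either side.
    return sorted(
        p
        for p in frequent
        if not any((c + p) in frequent or (p + c) in frequent for c in alphabet)
    )
```

    For the three toy sequences "ACGTA", "TACGT" and "GGACG", requiring support in all three leaves only "ACG"; relaxing the threshold to two sequences yields "ACGT" and "TA".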

  20. Evolutionary History of Terrestrial Pathogens and Endoparasites as Revealed in Fossils and Subfossils

    Directory of Open Access Journals (Sweden)

    George Poinar

    2014-01-01

    Full Text Available The present work uses fossils and subfossils to decipher the origin and evolution of terrestrial pathogens and endoparasites. Fossils, as interpreted by morphology or specific features of their hosts, furnish minimum dates for the origin of infectious agents, coevolution with hosts, and geographical locations. Subfossils, those that can be C14 dated (roughly under 50,000 years) and are identified by morphology as well as molecular and immunological techniques, provide time periods when humans became infected with various diseases. The pathogen groups surveyed include viruses, bacteria, protozoa, fungi, and select multicellular endoparasites including nematodes, trematodes, cestodes, and insect parasitoids in the terrestrial environment.

  1. Mitochondrial DNA sequence diversity in a sedentary population from Egypt.

    Science.gov (United States)

    Stevanovitch, A; Gilles, A; Bouzaid, E; Kefi, R; Paris, F; Gayraud, R P; Spadoni, J L; El-Chenawi, F; Béraud-Colomb, E

    2004-01-01

    The mitochondrial DNA (mtDNA) diversity of 58 individuals from Upper Egypt, more than half (34 individuals) from Gurna, whose population has an ancient cultural history, were studied by sequencing the control-region and screening diagnostic RFLP markers. This sedentary population presented similarities to the Ethiopian population by the L1 and L2 macrohaplogroup frequency (20.6%), by the West Eurasian component (defined by haplogroups H to K and T to X) and particularly by a high frequency (17.6%) of haplogroup M1. We statistically and phylogenetically analysed and compared the Gurna population with other Egyptian, Near East and sub-Saharan Africa populations; AMOVA and Minimum Spanning Network analysis showed that the Gurna population was not isolated from neighbouring populations. Our results suggest that the Gurna population has conserved the trace of an ancestral genetic structure from an ancestral East African population, characterized by a high M1 haplogroup frequency. The current structure of the Egyptian population may be the result of further influence of neighbouring populations on this ancestral population.

  2. Sequence Capture versus Restriction Site Associated DNA Sequencing for Shallow Systematics.

    Science.gov (United States)

    Harvey, Michael G; Smith, Brian Tilston; Glenn, Travis C; Faircloth, Brant C; Brumfield, Robb T

    2016-09-01

    Sequence capture and restriction site associated DNA sequencing (RAD-Seq) are two genomic enrichment strategies for applying next-generation sequencing technologies to systematics studies. At shallow timescales, such as within species, RAD-Seq has been widely adopted among researchers, although there has been little discussion of the potential limitations and benefits of RAD-Seq and sequence capture. We discuss a series of issues that may impact the utility of sequence capture and RAD-Seq data for shallow systematics in non-model species. We review prior studies that used both methods, and investigate differences between the methods by re-analyzing existing RAD-Seq and sequence capture data sets from a Neotropical bird (Xenops minutus). We suggest that the strengths of RAD-Seq data sets for shallow systematics are the wide dispersion of markers across the genome, the relative ease and cost of laboratory work, the deep coverage and read overlap at recovered loci, and the high overall information that results. Sequence capture's benefits include flexibility and repeatability in the genomic regions targeted, success using low-quality samples, more straightforward read orthology assessment, and higher per-locus information content. The utility of a method in systematics, however, rests not only on its performance within a study, but on the comparability of data sets and inferences with those of prior work. In RAD-Seq data sets, comparability is compromised by low overlap of orthologous markers across species and the sensitivity of genetic diversity in a data set to an interaction between the level of natural heterozygosity in the samples examined and the parameters used for orthology assessment. In contrast, sequence capture of conserved genomic regions permits interrogation of the same loci across divergent species, which is preferable for maintaining comparability among data sets and studies for the purpose of drawing general conclusions about the impact of

  3. Correcting for Sample Contamination in Genotype Calling of DNA Sequence Data

    OpenAIRE

    Flickinger, Matthew; Jun, Goo; Abecasis, Gonçalo R.; Boehnke, Michael; Kang, Hyun Min

    2015-01-01

    DNA sample contamination is a frequent problem in DNA sequencing studies and can result in genotyping errors and reduced power for association testing. We recently described methods to identify within-species DNA sample contamination based on sequencing read data, showed that our methods can reliably detect and estimate contamination levels as low as 1%, and suggested strategies to identify and remove contaminated samples from sequencing studies. Here we propose methods to model contamination...

  4. Detecting and Estimating Contamination of Human DNA Samples in Sequencing and Array-Based Genotype Data

    OpenAIRE

    Jun, Goo; Flickinger, Matthew; Hetrick, Kurt N.; Romm, Jane M.; Doheny, Kimberly F.; Abecasis, Gonçalo R.; Boehnke, Michael; Kang, Hyun Min

    2012-01-01

    DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads al...

  5. Differential conductance as a promising approach for rapid DNA sequencing with nanopore-embedded electrodes

    Science.gov (United States)

    He, Yuhui; Shao, Lubing; Scheicher, Ralph H.; Grigoriev, Anton; Ahuja, Rajeev; Long, Shibing; Ji, Zhuoyu; Yu, Zhaoan; Liu, Ming

    2010-07-01

    We propose an approach for nanopore-based DNA sequencing using characteristic transverse differential conductance. Molecular dynamics and electron transport simulations show that the transverse differential conductance during the translocation of DNA through the nanopore is distinguishable enough for the detection of the base sequence and can withstand electrical noise caused by DNA structure fluctuation. Our findings demonstrate several advantages of the transverse conductance approach, which may lead to important applications in rapid genome sequencing.

  6. A novel DNA sequence database for analyzing human demographic history.

    Science.gov (United States)

    Wall, Jeffrey D; Cox, Murray P; Mendez, Fernando L; Woerner, August; Severson, Tesa; Hammer, Michael F

    2008-08-01

    While there are now extensive databases of human genomic sequences from both private and public efforts to catalog human nucleotide variation, there are very few large-scale surveys designed for the purpose of analyzing human population history. Demographic inference from patterns of SNP variation in current large public databases is complicated by ascertainment biases associated with SNP discovery and the ways that populations and regions of the genome are sampled. Here, we present results from a resequencing survey of 40 independent intergenic regions on the autosomes and X chromosome comprising ~210 kb from each of 90 humans from six geographically diverse populations (i.e., a total of ~18.9 Mb). Unlike other public DNA sequence databases, we include multiple indigenous populations that serve as important reservoirs of human genetic diversity, such as the San of Namibia, the Biaka of the Central African Republic, and Melanesians from Papua New Guinea. In fact, only 20% of the SNPs that we find are contained in the HapMap database. We identify several key differences in patterns of variability in our database compared with other large public databases, including higher levels of nucleotide diversity within populations, greater levels of differentiation between populations, and significant differences in the frequency spectrum. Because variants at loci included in this database are less likely to be subject to ascertainment biases or linked to sites under selection, these data will be more useful for accurately reconstructing past changes in size and structure of human populations.

  7. Z-DNA-forming sequences generate large-scale deletions in mammalian cells

    OpenAIRE

    Wang, Guliang; Christensen, Laura A.; Vasquez, Karen M.

    2006-01-01

    Spontaneous chromosomal breakages frequently occur at genomic hot spots in the absence of DNA damage and can result in translocation-related human disease. Chromosomal breakpoints are often mapped near purine–pyrimidine Z-DNA-forming sequences in human tumors. However, it is not known whether Z-DNA plays a role in the generation of these chromosomal breakages. Here, we show that Z-DNA-forming sequences induce high levels of genetic instability in both bacterial and mammalian cells. In mammali...

  8. High Interlaboratory Reproducibility of DNA Sequence-based Typing of Bacteria in a Multicenter Study

    DEFF Research Database (Denmark)

    Sousa, MA de; Boye, Kit; Lencastre, H de

    2006-01-01

    Current DNA amplification-based typing methods for bacterial pathogens often lack interlaboratory reproducibility. In this international study, DNA sequence-based typing of the Staphylococcus aureus protein A gene (spa, 110 to 422 bp) showed 100% intra- and interlaboratory reproducibility without extensive harmonization of protocols for 30 blind-coded S. aureus DNA samples sent to 10 laboratories. Specialized software for automated sequence analysis ensured a common typing nomenclature.

  9. Generating Exome Enriched Sequencing Libraries from Formalin-Fixed, Paraffin-Embedded Tissue DNA for Next-Generation Sequencing.

    Science.gov (United States)

    Marosy, Beth A; Craig, Brian D; Hetrick, Kurt N; Witmer, P Dane; Ling, Hua; Griffith, Sean M; Myers, Benjamin; Ostrander, Elaine A; Stanford, Janet L; Brody, Lawrence C; Doheny, Kimberly F

    2017-01-11

    This unit describes a technique for generating exome-enriched sequencing libraries using DNA extracted from formalin-fixed paraffin-embedded (FFPE) samples. Utilizing commercially available kits, we present a low-input FFPE workflow starting with 50 ng of DNA. This procedure includes a repair step to address damage caused by FFPE preservation that improves sequence quality. Subsequently, libraries undergo an in-solution-targeted selection for exons, followed by sequencing using the Illumina next-generation short-read sequencing platform. © 2017 by John Wiley & Sons, Inc.

  10. A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences.

    Science.gov (United States)

    Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L

    2017-06-19

    An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5'-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5'-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. © 2017 by John Wiley & Sons, Inc.

  11. Cloning and molecular genetics analyses of Deschampsia antarctica Desv. chloroplast and mitochondrial DNA sequence

    Directory of Open Access Journals (Sweden)

    O.P. Savchuk

    2012-03-01

    Chloroplast and mitochondrial DNA sequences of Deschampsia antarctica were studied. We compared them with the completely sequenced genomes of other temperate plants to identify homology.

  12. [Images of Alu-sequence in 7 DNA clones from the human genome].

    Science.gov (United States)

    Korotkov, E V

    1987-01-01

    Information theory methods were used for a computer search of Alu-like sequences in human DNA and RNA. Eight new regions related to the Alu repeat sequence were revealed in 85 clones from the EMBL-5 data bank. Some of these regions are purine-pyrimidine images of the Alu repeat sequence; the rest are more complex images of it. A new definition for the likeness of different sequences, the information image of a sequence, was introduced. This application of information theory greatly increases the power of computer analysis of DNA sequences.

  13. True single-molecule DNA sequencing of a pleistocene horse bone

    Science.gov (United States)

    Orlando, Ludovic; Ginolhac, Aurelien; Raghavan, Maanasa; Vilstrup, Julia; Rasmussen, Morten; Magnussen, Kim; Steinmann, Kathleen E.; Kapranov, Philipp; Thompson, John F.; Zazula, Grant; Froese, Duane; Moltke, Ida; Shapiro, Beth; Hofreiter, Michael; Al-Rasheid, Khaled A.S.; Gilbert, M. Thomas P.; Willerslev, Eske

    2011-01-01

    Second-generation sequencing platforms have revolutionized the field of ancient DNA, opening access to complete genomes of past individuals and extinct species. However, these platforms are dependent on library construction and amplification steps that may result in sequences that do not reflect the original DNA template composition. This is particularly true for ancient DNA, where templates have undergone extensive damage post-mortem. Here, we report the results of the first “true single molecule sequencing” of ancient DNA. We generated 115.9 Mb and 76.9 Mb of DNA sequences from a permafrost-preserved Pleistocene horse bone using the Helicos HeliScope and Illumina GAIIx platforms, respectively. We find that the percentage of endogenous DNA sequences derived from the horse is higher among the Helicos data than Illumina data. This result indicates that the molecular biology tools used to generate sequencing libraries of ancient DNA molecules, as required for second-generation sequencing, introduce biases into the data that reduce the efficiency of the sequencing process and limit our ability to fully explore the molecular complexity of ancient DNA extracts. We demonstrate that simple modifications to the standard Helicos DNA template preparation protocol further increase the proportion of horse DNA for this sample by threefold. Comparison of Helicos-specific biases and sequence errors in modern DNA with those in ancient DNA also reveals extensive cytosine deamination damage at the 3′ ends of ancient templates, indicating the presence of 3′-sequence overhangs. Our results suggest that paleogenomes could be sequenced in an unprecedented manner by combining current second- and third-generation sequencing approaches. PMID:21803858

  14. DNA sequencing validation of Chlamydia trachomatis and Neisseria gonorrhoeae nucleic acid tests.

    Science.gov (United States)

    Lee, Sin Hang; Vigliotti, Veronica S; Pappu, Suri

    2008-06-01

    DNA sequencing was used to confirm Chlamydia trachomatis and Neisseria gonorrhoeae nucleic acids in endocervical swab samples. DNA in residues of the samples with positive results by 2 commercial kits was subjected to nested polymerase chain reaction (PCR) amplification. The nested PCR amplicons were used as templates for direct automated DNA sequencing. A 40-base signature sequence was sufficient to achieve unequivocal validation of C trachomatis cryptic plasmid and gonococcal opa gene DNA. DNA with a signature sequence specific for C trachomatis was identified in all 14 samples and for N gonorrhoeae in all 13 samples with positive results by the commercial kits for these 2 microbes. In a low-prevalence population, PCR retesting of 289 samples with initial negative results by a non-nucleic acid amplification assay revealed 3 samples positive for C trachomatis and 2 samples positive for N gonorrhoeae that were missed by the commercial kit. DNA sequencing is a useful tool in validating molecular diagnostics.

  15. The mitochondrial DNA sequence specificity of the anti-tumour drug bleomycin using end-labeled DNA and capillary electrophoresis and a comparison with genome-wide DNA sequencing.

    Science.gov (United States)

    Chung, Long H; Murray, Vincent

    2016-01-01

    The DNA sequence specificity of the cancer chemotherapeutic agent, bleomycin, was investigated in two human mitochondrial DNA sequences. Bleomycin was found to cleave preferentially at 5'-TGT*A-3' DNA sequences (where * is the cleavage site). The bleomycin analysis using capillary electrophoresis with laser-induced fluorescence was performed on both DNA strands, and each strand was independently fluorescently labelled at the 3'- and 5'-ends. There was a high level of correlation between the intensity of bleomycin cleavage sites analysed by 3'- and 5'-end labelling. This is the first occasion that a comprehensive comparison has been made between these two end-labelling procedures to quantify cleavage by a DNA damaging agent and to investigate end-label bias. A comparison was also made between the bleomycin DNA sequence specificity obtained from genome-wide next-generation sequencing and that obtained from purified plasmid DNA sequences. This was accomplished by cloning sections of human mitochondrial DNA and comparing them with the identical mitochondrial DNA sequences in the human mitochondrial genome. At individual sites, there was a very low level of correlation between bleomycin cleavage in plasmid sequencing and genome-wide sequencing. However, the overall bleomycin DNA sequence specificity was very similar in the two environments, namely 5'-TGT*A-3'. Copyright © 2015 Elsevier B.V. All rights reserved.

  16. A 28,000 years old Cro-Magnon mtDNA sequence differs from all potentially contaminating modern sequences.

    Directory of Open Access Journals (Sweden)

    David Caramelli

    BACKGROUND: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. METHODOLOGY/PRINCIPAL FINDINGS: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23), and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. CONCLUSIONS/SIGNIFICANCE: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans.

  17. Radiocarbon dating with annual-resolution of subfossil trees from the Younger Dryas event in the southern French Alps

    Science.gov (United States)

    Capano, Manuela; Miramont, Cécile; Guibal, Frédéric; Kromer, Bernd; Tuna, Thibaut; Fagault, Yoann; Bard, Edouard

    2017-04-01

    Tree rings are an important archive for the calibration of radiocarbon data. The younger part of the IntCal curve is based essentially on tree-ring chronologies, absolutely dated by dendrochronological analysis. For the Northern Hemisphere (NH), a gap still exists between the absolutely dated sequences and a floating chronology. Based on the Southern Hemisphere (SH) tree-ring chronologies a link has been previously proposed (Reimer et al. 2013, Radiocarbon; see also update in Hogg et al. 2016, Radiocarbon). By measuring radiocarbon at annual resolution in French subfossil pines (Pinus sylvestris L.) we propose to improve the connection between the absolute chronology and the floating chronology. Several subfossil pines have been found in the Southern French Alps; they were buried by flood deposits, allowing their preservation. Some trees discovered in the Barbier riverbed were dated to the Younger Dryas period by previous decadal radiocarbon measurements, performed in Heidelberg and Mannheim. The trees selected for our new study are Barb12 and Barb17 (analyzed sequences of 163 and 152 rings, respectively). These sequences were sampled at annual resolution when permitted by the ring width. As a first step, every third ring was pretreated for radiocarbon analysis. These samples were sliced into small pieces and pretreated by using the ABA-B method before being combusted, graphitized with the AGE system and measured with AixMICADAS (Bard et al. 2015, Nucl. Instr. Meth. B). From the comparison with the kauri sequence, the Barb12-17 sequence can be dated from about 12835 to 12606 cal. BP. It can also be used to calculate the interhemispheric gradient (IHG) over the overlapping period. In order to reduce the inter-annual variability, the Barb12-17 record was smoothed, grouped and averaged over the same decades as in the kauri record. On the basis of twenty values, a mean IHG value of ca. 60 years was calculated. Quantification of the IHG around 50 yr is particularly

  18. Cell-Free DNA Next-Generation Sequencing in Pancreatobiliary Carcinomas.

    Science.gov (United States)

    Zill, Oliver A; Greene, Claire; Sebisanovic, Dragan; Siew, Lai Mun; Leng, Jim; Vu, Mary; Hendifar, Andrew E; Wang, Zhen; Atreya, Chloe E; Kelley, Robin K; Van Loon, Katherine; Ko, Andrew H; Tempero, Margaret A; Bivona, Trever G; Munster, Pamela N; Talasaz, AmirAli; Collisson, Eric A

    2015-10-01

    Patients with pancreatic and biliary carcinomas lack personalized treatment options, in part because biopsies are often inadequate for molecular characterization. Cell-free DNA (cfDNA) sequencing may enable a precision oncology approach in this setting. We attempted to prospectively analyze 54 genes in tumor and cfDNA for 26 patients. Tumor sequencing failed in 9 patients (35%). In the remaining 17, 90.3% (95% confidence interval, 73.1%-97.5%) of mutations detected in tumor biopsies were also detected in cfDNA. The diagnostic accuracy of cfDNA sequencing was 97.7%, with 92.3% average sensitivity and 100% specificity across five informative genes. Changes in cfDNA correlated well with tumor marker dynamics in serial sampling (r = 0.93). We demonstrate that cfDNA sequencing is feasible, accurate, and sensitive in identifying tumor-derived mutations without prior knowledge of tumor genotype or the abundance of circulating tumor DNA. cfDNA sequencing should be considered in pancreatobiliary cancer trials where tissue sampling is unsafe, infeasible, or otherwise unsuccessful. Precision medicine efforts in biliary and pancreatic cancers have been frustrated by difficulties in obtaining adequate tumor tissue for next-generation sequencing. cfDNA sequencing reliably and accurately detects tumor-derived mutations, paving the way for precision oncology approaches in these deadly diseases. ©2015 American Association for Cancer Research.
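
    The sensitivity, specificity and accuracy figures quoted above follow from standard confusion-matrix definitions. The sketch below is a minimal illustration of those definitions; the function name and the counts passed to it are hypothetical and are not taken from the study. Only the 9-of-26 tumor sequencing failure rate comes from the abstract.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),           # true positives among all real positives
        "specificity": tn / (tn + fp),           # true negatives among all real negatives
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = diagnostic_metrics(tp=45, fp=10, tn=40, fn=5)

# Failure rate reported in the abstract: 9 of 26 patients.
failure_rate_percent = round(100 * 9 / 26, 1)
```

    With the abstract's own numbers, 9 failures out of 26 patients is 34.6%, which the abstract rounds to 35%.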

  19. Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine

    Science.gov (United States)

    Craig S Echt; Surya Saha; Dennis L Deemer; C Dana Nelson

    2011-01-01

    Genomic DNA sequence databases are a potential and growing resource for simple sequence repeat (SSR) marker development in loblolly pine (Pinus taeda L.). Loblolly pine also has many expressed sequence tags (ESTs) available for microsatellite (SSR) marker development. We compared loblolly pine SSR densities in genome survey sequences (GSSs) to those in non-redundant...
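
    As a rough illustration of how perfect microsatellites can be located in a sequence, the sketch below uses regular expressions with back-references. The function names and the minimum repeat counts (6 for di-, 4 for tri- and 3 for tetranucleotide motifs) are assumptions made for this example and are not the criteria used by the authors.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ACAC', 'AA')."""
    n = len(motif)
    return not any(n % d == 0 and motif[:d] * (n // d) == motif for d in range(1, n))


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    if min_repeats is None:
        # Illustrative thresholds only: motif length -> minimum number of repeats.
        min_repeats = {2: 6, 3: 4, 4: 3}
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


example = "GGTT" + "AC" * 7 + "TTGACT" + "AGC" * 5 + "T"
ssrs = find_ssrs(example)
```

    Dividing the number of hits by the sequence length in kilobases would give a simple SSR density for comparing sequence sets.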

  20. Sequencing of megabase plus DNA by hybridization: Method development ENT. Final technical progress report

    Energy Technology Data Exchange (ETDEWEB)

    Crkvenjakov, R.; Drmanac, R.

    1991-01-31

    Sequencing by hybridization (SBH) is the only sequencing method based on the experimental determination of the content of oligonucleotide sequences. The data acquisition relies on the natural process of base pairing. It is possible to determine the content of complementary oligosequences in the target DNA by the process of hybridization with oligonucleotide probes of known sequences.
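
    The idea of determining oligonucleotide content by hybridization can be sketched in a few lines: a probe of known sequence gives a signal when its reverse complement occurs in the target. The code below is an idealized illustration that assumes perfect, error-free base pairing; the function names are hypothetical and do not come from the report.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_spectrum(target, k):
    """Set of all length-k oligonucleotides contained in the target strand."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def hybridizing_probes(target, probes):
    """Probes that pair perfectly with some site on the target strand."""
    k = len(probes[0])
    spectrum = kmer_spectrum(target, k)
    return [p for p in probes if reverse_complement(p) in spectrum]


target = "ATGGCGT"
positive = hybridizing_probes(target, ["CAT", "AAA", "ACG", "GCC"])
```

    The set of positive probes is the experimentally observed oligonucleotide content; reconstructing the full target sequence from that set is a separate assembly step that is not shown here.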

  1. DNA interactions with a Methylene Blue redox indicator depend on the DNA length and are sequence specific.

    Science.gov (United States)

    Farjami, Elaheh; Clima, Lilia; Gothelf, Kurt V; Ferapontova, Elena E

    2010-06-01

    A DNA molecular beacon approach was used for the analysis of interactions between DNA and Methylene Blue (MB) as a redox indicator of a hybridization event. DNA hairpin structures of different length and guanine (G) content were immobilized onto gold electrodes in their folded states through the alkanethiol linker at the 5'-end. Binding of MB to the folded hairpin DNA was electrochemically studied and compared with binding to the duplex structure formed by hybridization of the hairpin DNA to a complementary DNA strand. Variation of the electrochemical signal from the DNA-MB complex was shown to depend primarily on the DNA length and sequence used: the G-C base pairs were the preferential sites of MB binding in the duplex. For short, 20-nt-long DNA sequences, the increased electrochemical response from MB bound to the duplex structure was consistent with the increased amount of bound and electrochemically readable MB molecules (i.e. MB molecules that are available for the electron transfer (ET) reaction with the electrode). With longer DNA sequences, the balance between the amounts of the electrochemically readable MB molecules bound to the hairpin DNA and to the hybrid was opposite: a part of the MB molecules bound to the long-sequence DNA duplex seems to be electrochemically mute owing to the long ET distance. The increasing electrochemical response from MB bound to the short-length DNA hybrid contrasts with the decreasing signal from MB bound to the long-length DNA hybrid and allows an "off"-"on" genosensor development.

  2. MultiPipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences

    OpenAIRE

    Schwartz, Scott; Elnitski, Laura; Li, Mei; Weirauch, Matt; Riemer, Cathy; Smit, Arian; Green, Eric D.; Hardison, Ross C.; Miller, Webb

    2003-01-01

    Analysis of multiple sequence alignments can generate important, testable hypotheses about the phylogenetic history and cellular function of genomic sequences. We describe the MultiPipMaker server, which aligns multiple, long genomic DNA sequences quickly and with good sensitivity (available at http://bio.cse.psu.edu/ since May 2001). Alignments are computed between a contiguous reference sequence and one or more secondary sequences, which can be finished or draft sequence. The outputs includ...

  3. OPTSDNA: Performance evaluation of an efficient distributed bioinformatics system for DNA sequence analysis.

    Science.gov (United States)

    Khan, Mohammad Ibrahim; Sheel, Chotan

    2013-01-01

    Storage of sequence data is a major concern because the amount of data generated at many locations is growing exponentially. There is therefore a need for techniques that store data using compression algorithms. Here we describe an optimal storage algorithm (OPTSDNA) for storing large amounts of DNA sequences of varying length, and we analyze its performance within a distributed bioinformatics computing system for DNA sequence analysis. The OPTSDNA algorithm was used to store DNA sequences ranging in size from very small to very large in a database. Both the storage size and the response time were calculated. The efficiency and performance of the algorithm (in terms of size, expressed as a percentage) are high when compared with other known sequential approaches.
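
    The abstract does not describe how OPTSDNA encodes sequences, so the sketch below is not that algorithm. It only illustrates the general idea of compact DNA storage with a common baseline, 2-bit packing, in which four bases fit in one byte. All names are hypothetical, and the code assumes the input contains only A, C, G and T.

```python
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASES = "ACGT"


def pack(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | _CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a final partial chunk
        out.append(byte)
    return bytes(out)


def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(_BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


seq = "ACGTTGCAAC"
packed = pack(seq)
size_percent = 100 * len(packed) / len(seq)  # packed size relative to 1 byte per base
```

    The original length has to be stored alongside the packed bytes, because the last byte may contain padding.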

  4. Sequence-specific electron injection into DNA from an intermolecular electron donor

    OpenAIRE

    Morinaga, Hironobu; Takenaka, Tomohiro; Hashiya, Fumitaka; Kizaki, Seiichiro; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2013-01-01

    Electron transfer in DNA has been intensively studied to elucidate its biological roles and for applications in bottom-up DNA nanotechnology. Recently, mechanisms of electron transfer to DNA have been investigated; however, most of the systems designed are intramolecular. Here, we synthesized pyrene-conjugated pyrrole-imidazole polyamides (PPIs) to achieve sequence-specific electron injection into DNA in an intermolecular fashion. Electron injection from PPIs into DNA was detected using 5-bro...

  5. Methylation-capture and Next-Generation Sequencing of free circulating DNA from human plasma.

    Science.gov (United States)

    Warton, Kristina; Lin, Vita; Navin, Tina; Armstrong, Nicola J; Kaplan, Warren; Ying, Kevin; Gloss, Brian; Mangs, Helena; Nair, Shalima S; Hacker, Neville F; Sutherland, Robert L; Clark, Susan J; Samimi, Goli

    2014-06-15

    Free circulating DNA (fcDNA) has many potential clinical applications, due to the non-invasive way in which it is collected. However, because of the low concentration of fcDNA in blood, genome-wide analysis carries many technical challenges that must be overcome before fcDNA studies can reach their full potential. There are currently no definitive standards for fcDNA collection, processing and whole-genome sequencing. We report novel detailed methodology for the capture of high-quality methylated fcDNA, library preparation and downstream genome-wide Next-Generation Sequencing. We also describe the effects of sample storage, processing and scaling on fcDNA recovery and quality. Use of serum versus plasma, and storage of blood prior to separation resulted in genomic DNA contamination, likely due to leukocyte lysis. Methylated fcDNA fragments were isolated from 5 donors using a methyl-binding protein-based protocol and appear as a discrete band of ~180 bases. This discrete band allows minimal sample loss at the size restriction step in library preparation for Next-Generation Sequencing, allowing for high-quality sequencing from minimal amounts of fcDNA. Following sequencing, we obtained 37 × 10^6 to 86 × 10^6 unique mappable reads, representing more than 50% of total mappable reads. The methylation status of 9 genomic regions as determined by DNA capture and sequencing was independently validated by clonal bisulphite sequencing. Our optimized methods provide high-quality methylated fcDNA suitable for whole-genome sequencing, and allow good library complexity and accurate sequencing, despite using less than half of the recommended minimum input DNA.

  6. Complete Genomic DNA Sequence of the East Asian Spotted Fever Disease Agent Rickettsia japonica

    Science.gov (United States)

    Matsutani, Minenosuke; Ogawa, Motohiko; Takaoka, Naohisa; Hanaoka, Nozomu; Toh, Hidehiro; Yamashita, Atsushi; Oshima, Kenshiro; Hirakawa, Hideki; Kuhara, Satoru; Suzuki, Harumi; Hattori, Masahira; Kishimoto, Toshio; Ando, Shuji; Azuma, Yoshinao; Shirai, Mutsunori

    2013-01-01

    Rickettsia japonica is an obligate intracellular alphaproteobacterium that causes tick-borne Japanese spotted fever, which has spread throughout East Asia. We determined the complete genomic DNA sequence of R. japonica type strain YH (VR-1363), which consists of 1,283,087 base pairs (bp) and 971 protein-coding genes. Comparison of the genomic DNA sequence of R. japonica with other rickettsiae in the public databases showed that 2 regions (4,323 and 216 bp) were conserved in a very narrow range of Rickettsia species, and the shorter one was inserted in, and disrupted, a preexisting open reading frame (ORF). While it is unknown how the DNA sequences were acquired in R. japonica genomes, they may be useful signatures for the diagnosis of Rickettsia species. Instead of the species-specific inserted DNA sequences, rickettsial genomes contain Rickettsia-specific palindromic elements (RPEs), which are also capable of locating in preexisting ORFs. Precise alignments of protein and DNA sequences involving RPEs showed that when a gene contains an inserted DNA sequence, each rickettsial ortholog carried an inserted DNA sequence at the same locus. The sequence, ATGAC, was shown to be highly frequent and thus characteristic in certain RPEs (RPE-4, RPE-6, and RPE-7). This finding implies that RPE-4, RPE-6, and RPE-7 were derived from a common inserted DNA sequence. PMID:24039725

  7. Subfossil bog-pine horizons document climate and ecosystem changes during the Mid-Holocene

    NARCIS (Netherlands)

    Eckstein, J.; Leuschner, H.H.; Bauerochse, A.; Sass-Klaassen, U.

    2009-01-01

    Extended dendrochronological investigations were performed on subfossil pine entombed in peat layers of former raised bogs in Lower Saxony (NW Germany). The aim was to study the dynamics of bog development in response to local environmental conditions and regional changes in climate throughout the

  8. Early Holocene fauna from a new subfossil site: A first assessment ...

    African Journals Online (AJOL)

    Early Holocene fauna from a new subfossil site: A first assessment from Christmas River, south central Madagascar. ... Thus, elevation above sea level may have acted as a filter that limited species dispersal across the island in the past. Such a scenario would explain the distinction between more humid, higher elevation, ...

  9. Representation of DNA sequences in genetic codon context with applications in exon and intron prediction.

    Science.gov (United States)

    Yin, Changchuan

    2015-04-01

    To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences first must be specially mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC) in which the numerical values are optimized by simulated annealing to maximize the 3-periodicity signal to noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction using a Short-Time Fourier Transform (STFT) approach. The results show the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences.
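
    To make the 3-periodicity SNR concrete, the sketch below computes it for the baseline 4D binary representation mentioned in the abstract, in which each base gets its own 0/1 indicator sequence. It does not implement the GCC mapping or its simulated-annealing optimization. The SNR definition used here (power at frequency N/3 divided by the mean power over all non-zero frequencies) is one common convention and is an assumption, as are the function names.

```python
import cmath


def dft_power(x, k):
    """Squared magnitude of the k-th DFT coefficient of the real sequence x."""
    n = len(x)
    coeff = sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(x))
    return abs(coeff) ** 2


def three_periodicity_snr(seq):
    """Power at N/3 relative to mean power over k = 1..N-1 (assumed definition)."""
    seq = seq.upper()
    n = len(seq) - len(seq) % 3  # truncate so that N/3 is an integer
    if n < 3:
        raise ValueError("sequence must contain at least 3 bases")
    seq = seq[:n]
    # 4D binary representation: one indicator sequence per base.
    indicators = [[1.0 if b == base else 0.0 for b in seq] for base in "ACGT"]

    def total_power(k):
        return sum(dft_power(x, k) for x in indicators)

    peak = total_power(n // 3)
    mean = sum(total_power(k) for k in range(1, n)) / (n - 1)
    return peak / mean


periodic_snr = three_periodicity_snr("ATG" * 20)     # strict period-3 signal
aperiodic_snr = three_periodicity_snr("ACGT" * 15)   # period 4, no period-3 component
```

    This direct DFT costs O(N^2) per sequence, which is acceptable for short windows. A sliding-window version of the same calculation would correspond to the STFT setting described in the abstract.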

  10. Repeated DNA sequences in the microbat species Miniopterus schreibersi (Vespertilionidae; Chiroptera).

    Science.gov (United States)

    Barragán, M J L; Martínez, S; Marchal, J A; Bullejos, M; Díaz de la Guardia, R; Sánchez, A

    2002-01-01

    Repetitive DNA sequences represent a substantial component of eukaryotic genomes. These sequences have been described and characterized in many mammalian species. However, little information about repetitive DNA sequences is available in bat species. Here we describe an EcoRI family of repetitive DNA sequences present in the species Miniopterus schreibersi. These repetitive sequences are 57.85% A-T rich, organized in tandem, and have a monomer unit length of 904 bp. Methylation analysis using the isoschizomer pair MspI and HpaII indicates that the cytosines present in the sequence CCGG are partially methylated. Furthermore, Southern blot analysis demonstrated that these DNA sequences are absent from the genomes of four related microbat species, suggesting that they could be specific to the M. schreibersi genome.

  11. Complete sequence analysis of 18S rDNA based on genomic DNA extraction from individual Demodex mites (Acari: Demodicidae).

    Science.gov (United States)

    Zhao, Ya-E; Xu, Ji-Ru; Hu, Li; Wu, Li-Ping; Wang, Zheng-Hang

    2012-05-01

    The study attempted for the first time to accomplish 18S ribosomal DNA (rDNA) complete sequence amplification and analysis for three Demodex species (Demodex folliculorum, Demodex brevis and Demodex canis) based on gDNA extraction from individual mites. The mites were treated with DNA Release Additive and Hot Start II DNA Polymerase so as to promote mite disruption and increase PCR specificity. Determination of D. folliculorum gDNA showed that the gDNA yield was highest at 1 mite and tended to decrease as the number of mites increased. The individual mite gDNA was successfully used for 18S rDNA fragment (about 900 bp) amplification examination. The alignments of 18S rDNA complete sequences of individual mite samples and those of pooled mite samples (≥ 1000 mites/sample) showed over 97% identity for each species, indicating that the gDNA extracted from a single individual mite was as satisfactory as that from pooled mites for PCR amplification. Further pairwise sequence analyses showed that average divergence, genetic distance, transition/transversion or phylogenetic tree could not effectively identify the three Demodex species, largely due to the differentiation in the D. canis isolates. It can be concluded that individual Demodex mite gDNA can satisfy the molecular study of Demodex. The 18S rDNA complete sequence is suitable for interfamily identification in Cheyletoidea, but whether it is suitable for intrafamily identification cannot be confirmed until the types of Demodex mites parasitizing dogs have been ascertained. Copyright © 2012 Elsevier Inc. All rights reserved.

  12. Synergy of Two Assembly Languages in DNA Nanostructures: Self-Assembly of Sequence-Defined Polymers on DNA Cages.

    Science.gov (United States)

    Chidchob, Pongphak; Edwardson, Thomas G W; Serpell, Christopher J; Sleiman, Hanadi F

    2016-04-06

    DNA base-pairing is the central interaction in DNA assembly. However, this simple four-letter (A-T and G-C) language makes it difficult to create complex structures without using a large number of DNA strands of different sequences. Inspired by protein folding, we introduce hydrophobic interactions to expand the assembly language of DNA nanotechnology. To achieve this, DNA cages of different geometries are combined with sequence-defined polymers containing long alkyl and oligoethylene glycol repeat units. Anisotropic decoration of hydrophobic polymers on one face of the cage leads to hydrophobically driven formation of quantized aggregates of DNA cages, where polymer length determines the cage aggregation number. Hydrophobic chains decorated on both faces of the cage can undergo an intrascaffold "handshake" to generate DNA-micelle cages, which have increased structural stability and assembly cooperativity, and can encapsulate small molecules. The polymer sequence order can control the interaction between hydrophobic blocks, leading to unprecedented "doughnut-shaped" DNA cage-ring structures. We thus demonstrate that new structural and functional modes in DNA nanostructures can emerge from the synergy of two interactions, providing an attractive approach to develop protein-inspired assembly modules in DNA nanotechnology.

  13. The DNA sequence and biology of human chromosome 19

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, J; Gordon, L A; Olsen, A; Terry, A; Schmutz, J; Lamerdin, J; Hellsten, U; Goodstein, D; Couronne, O; Tran-Gyamfi, M

    2004-04-06

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high GC content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in Mendelian disorders, including familial hypercholesterolemia and insulin-resistant diabetes. Nearly one quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  14. Phylogeography of Anastrepha obliqua inferred with mtDNA sequencing.

    Science.gov (United States)

    Ruiz-Arce, Raul; Barr, Norman B; Owen, Christopher L; Thomas, Donald B; McPheron, Bruce A

    2012-12-01

    Anastrepha obliqua (Macquart) (Diptera: Tephritidae), the West Indian fruit fly, is a frugivorous pest that occasionally finds its way to commercial growing areas outside its native distribution. It inhabits areas in Mexico, Central and South America, and the Caribbean, with occasional infestations having occurred in the southern tier states (California, Florida, and Texas) of the United States. This fly is associated with many plant species and is a major pest of mango and plum. We examine the genetic diversity of the West Indian fruit fly based on mitochondrial COI and ND6 DNA sequences. Our analysis of 349 individuals from 54 geographic collections from Mexico, Central America, the Caribbean, and South America detected 61 haplotypes that are structured into three phylogenetic clades. The distribution of these clades among populations is associated with geography. Six populations are identified in this analysis: Mesoamerica, Central America, Caribbean, western Mexico, Andean South America, and eastern Brazil. In addition, substantial differences exist among these genetic types that warrant further taxonomic review.

  15. [Patentability of DNA sequences: the debate remains open].

    Science.gov (United States)

    Martín Uranga, Amelia

    2013-01-01

    The patentability of human genes has been, from the beginning of the discussion of the Directive on the legal protection of biotechnological inventions, an issue that provoked debate among politicians, scientists, lawyers and civil society itself. Directive 98/44 tried to settle the matter by stating that, to support the patentability of human genes, it must be known what role they fulfill and which protein they encode, as an essential requirement for demonstrating industrial application. However, following the judgment of 13 June 2013 (Supreme Court of the United States of America in the case of Association for Molecular Pathology et al. versus Myriad Genetics Inc.), the debate on this issue has been reopened. There are several issues to be considered, taking into account that patents on DNA and gene sequences have provided an important incentive to increase interest in biotechnology applied to human health. On the other hand, there has been a paradigm shift in the R&D of biopharmaceutical companies, which has moved from an in-house research model to a model of open innovation, based on collaboration between large corporations, biotech SMEs, and public and private research centers. This model of innovation has an impact on industrial property, and it will therefore be necessary to define clearly what each party brings to the relationship and how the parties are expected to share the results. All of this has the ultimate goal of giving patients access to the most innovative, safe and effective treatments and medications.

  16. Cloning of rat aorta lysyl oxidase cDNA: Complete codons and predicted amino acid sequence

    Energy Technology Data Exchange (ETDEWEB)

    Trackman, P.C.; Pratt, A.M.; Wolanski, A.; Tang, Shiowshih; Offner, G.D.; Troxler, R.F.; Kagan, H.M. (Boston Univ. School of Medicine, MA (USA))

    1990-05-22

    Lysyl oxidase cDNA clones were identified by their reactivity with anti-bovine lysyl oxidase in a neonatal rat aorta cDNA λgt11 expression library. A 500-bp cDNA sequence encoding four of six peptides derived from proteolytic digests of bovine aorta lysyl oxidase was found from the overlapping cDNA sequences of two positive clones. The library was rescreened with a radiolabeled cDNA probe made from one of these clones, thus identifying an additional 13 positive clones. Sequencing of the largest two of these overlapping clones resulted in 2,672 bp of cDNA sequence containing partial 5′- and 3′-untranslated sequences of 286 and 1,159 nucleotides, respectively, and a complete open reading frame of 1,227 bp encoding a polypeptide of 409 amino acids (46 kDa), consistent with the 48 ± 3 kDa cell-free translation product of rat smooth muscle cell RNA that was immunoprecipitated by anti-bovine lysyl oxidase. The rat aorta cDNA-derived amino acid sequence contains the sequence of each of the six peptides isolated and sequenced from the 32-kDa bovine aorta enzyme, including the C-terminal peptide with sequence identity of 96%. Southern blotting of rat genomic DNA with lysyl oxidase cDNA probes indicated that the lysyl oxidase gene is located at a single locus and does not appear to be a member of a multigene family. A potential stem-loop structure was found in the 3′-untranslated region of the cDNA. The deduced amino acid sequence contains a putative signal peptide, in addition to sequences that are similar to those of other known copper proteins.

  17. DNA sequence-dependent mechanics and protein-assisted bending in repressor-mediated loop formation

    Science.gov (United States)

    Boedicker, James Q.; Garcia, Hernan G.; Johnson, Stephanie; Phillips, Rob

    2014-01-01

    As the chief informational molecule of life, DNA is subject to extensive physical manipulations. The energy required to deform double-helical DNA depends on sequence, and this mechanical code of DNA influences gene regulation, such as through nucleosome positioning. Here we examine the sequence-dependent flexibility of DNA in bacterial transcription factor-mediated looping, a context for which the role of sequence remains poorly understood. Using a suite of synthetic constructs repressed by the Lac repressor and two well-known sequences that show large flexibility differences in vitro, we make precise statistical mechanical predictions as to how DNA sequence influences loop formation and test these predictions using in vivo transcription and in vitro single-molecule assays. Surprisingly, sequence-dependent flexibility does not affect in vivo gene regulation. By theoretically and experimentally quantifying the relative contributions of sequence and the DNA-bending protein HU to DNA mechanical properties, we reveal that bending by HU dominates DNA mechanics and masks intrinsic sequence-dependent flexibility. Such a quantitative understanding of how mechanical regulatory information is encoded in the genome will be a key step towards a predictive understanding of gene regulation at single-base pair resolution. PMID:24231252

  18. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution

    NARCIS (Netherlands)

    Falconer, Ester; Hills, Mark; Naumann, Ulrike; Poon, Steven S. S.; Chavez, Elizabeth A.; Sanders, Ashley D.; Zhao, Yongjun; Hirst, Martin; Lansdorp, Peter M.

    2012-01-01

    DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it

  19. Cloning and sequencing of complete τ-crystallin cDNA from ...

    Indian Academy of Sciences (India)

    Unknown

    embryonic tissues namely, brain, heart and GAM tissues. In each case, cDNA made by RT-PCR from total RNA isolated from the respective tissue was used as template. 2.6 Sequence analysis and primer design. The completed cDNA sequences were analysed for open reading frame using NCBI-ORF finder and homology.

  20. Comparative d2/d3 LSU–rDNA sequence study of some Iranian ...

    African Journals Online (AJOL)

    SERVER

    2007-11-05

    Nov 5, 2007 ... Key words: Tea, Pratylenchus loosi, D2/D3 LSU rDNA, sequencing, Iran. INTRODUCTION. The tea root lesion .... Original DNA sequence data were collected from 13 Iranian tea root lesion nematode isolates that verified by ..... from Pasture grass in central Florida. Nematology. 42: 159-172. Jaumot M ...

  1. Basic Gene Grammars and DNA-ChartParser for language processing of Escherichia coli promoter DNA sequences.

    Science.gov (United States)

    Leung, S; Mellish, C; Robertson, D

    2001-03-01

    The field of 'DNA linguistics' has emerged from pioneering work in computational linguistics and molecular biology. Most formal grammars in this field are expressed using Definite Clause Grammars, but these have computational limitations which must be overcome. The present study provides a new DNA parsing system, comprising a logic grammar formalism called Basic Gene Grammars (BGGs) and a bidirectional chart parser, DNA-ChartParser. The use of Basic Gene Grammars is demonstrated in representing many formulations of the knowledge of Escherichia coli promoters, including knowledge acquired from human experts, consensus sequences, statistics (weight matrices), symbolic learning, and neural network learning. The DNA-ChartParser provides bidirectional parsing facilities for BGGs in handling overlapping categories, gap categories, approximate pattern matching, and constraints. Basic Gene Grammars and the DNA-ChartParser allowed different sources of knowledge for recognizing E. coli promoters to be combined to achieve better accuracy, as assessed by parsing these DNA sequences in real-world data sets.
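    The ideas of consensus sequences, gap categories and approximate pattern matching named in this abstract can be illustrated with a much simpler scanner. This is only a sketch and is not the Basic Gene Grammars formalism or the DNA-ChartParser: it assumes the textbook E. coli sigma-70 consensus hexamers (TTGACA at -35, TATAAT at -10) separated by a spacer of 16 to 18 bp, and the function names, the default mismatch tolerance and the returned tuple layout are hypothetical.

```python
def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, box35="TTGACA", box10="TATAAT",
                   spacers=(16, 17, 18), max_mismatch=1):
    """Return (start35, start10, spacer, total_mismatches) for each candidate.

    Each box may differ from its consensus by at most max_mismatch bases
    (approximate matching); the spacer length plays the role of a gap.
    """
    seq = seq.upper()
    k35, k10 = len(box35), len(box10)
    hits = []
    for i in range(len(seq) - k35 + 1):
        m35 = mismatches(seq[i:i + k35], box35)
        if m35 > max_mismatch:
            continue
        for gap in spacers:
            j = i + k35 + gap  # start of the candidate -10 box
            if j + k10 > len(seq):
                continue
            m10 = mismatches(seq[j:j + k10], box10)
            if m10 <= max_mismatch:
                hits.append((i, j, gap, m35 + m10))
    return hits
```

    For example, a sequence built as "GG" + "TTGACA" + 17 C residues + "TATAAT" + "GG" yields a single candidate with the -35 box at index 2, the -10 box at index 25 and no mismatches. A grammar-based parser such as the one described in the abstract can combine several such knowledge sources, which this scanner does not attempt.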

  2. Structural biology of disease-associated repetitive DNA sequences and protein-DNA complexes involved in DNA damage and repair

    Energy Technology Data Exchange (ETDEWEB)

    Gupta, G.; Santhana Mariappan, S.V.; Chen, X.; Catasti, P.; Silks, L.A. III; Moyzis, R.K.; Bradbury, E.M.; Garcia, A.E.

    1997-07-01

    This project is aimed at formulating the sequence-structure-function correlations of various microsatellites in the human (and other eukaryotic) genomes. Here the authors have been able to develop and apply structural biology tools to understand the following: the molecular mechanism of length polymorphism in microsatellites; the molecular mechanism by which the microsatellites in the noncoding regions alter the regulation of the associated gene; and finally, the molecular mechanism by which the expansion of these microsatellites impairs gene expression and causes disease. Their multidisciplinary structural biology approach is quantitative and can be applied to all coding and noncoding DNA sequences associated with any gene. Both NIH and DOE are interested in developing quantitative tools for understanding the function of various human genes for prevention of diseases caused by genetic and environmental effects.

  3. A Microbiome DNA Enrichment Method for Next-Generation Sequencing Sample Preparation.

    Science.gov (United States)

    Yigit, Erbay; Feehery, George R; Langhorst, Bradley W; Stewart, Fiona J; Dimalanta, Eileen T; Pradhan, Sriharsa; Slatko, Barton; Gardner, Andrew F; McFarland, James; Sumner, Christine; Davis, Theodore B

    2016-07-01

    "Microbiome" is used to describe the communities of microorganisms and their genes in a particular environment, including communities in association with a eukaryotic host or part of a host. One challenge in microbiome analysis concerns the presence of host DNA in samples. Removal of host DNA before sequencing results in greater sequence depth of the intended microbiome target population. This unit describes a novel method of microbial DNA enrichment in which methylated host DNA such as human genomic DNA is selectively bound and separated from microbial DNA before next-generation sequencing (NGS) library construction. This microbiome enrichment technique yields a higher fraction of microbial sequencing reads and improved read quality resulting in a reduced cost of downstream data generation and analysis. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.

  4. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals.

    Science.gov (United States)

    Imaizumi, K; Parsons, T J; Yoshino, M; Holland, M M

    2002-04-01

    A database of mitochondrial DNA (mtDNA) hypervariable region 1 (HV1) and region 2 (HV2) sequences of the mtDNA control region was established from 162 unrelated Japanese individuals. The random match probability and the genetic diversity for this database were 0.96% and 0.997, respectively. Length heteroplasmy in the C-stretch regions located around position 16189 in HV1 and 310 in HV2 was observed in 37% and 38% of the samples, respectively. A strategy using internal sequencing primers was devised to obtain confirmed sequences in these length heteroplasmic individuals. This database, combined with other mtDNA sequence databases from the Japanese population, will permit the significance of mtDNA match results to be properly reported in mtDNA typing casework in Japan.
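    The two summary statistics quoted in this abstract are commonly computed from haplotype frequencies. The sketch below assumes the standard definitions (random match probability as the sum of squared haplotype frequencies, and genetic diversity as the unbiased estimator n/(n-1) times one minus that sum); the abstract does not state which formulas the authors used, and the function names are hypothetical.

```python
from collections import Counter


def random_match_probability(haplotypes):
    """Sum of squared haplotype frequencies (assumed standard definition)."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


def genetic_diversity(haplotypes):
    """Unbiased haplotype diversity: n / (n - 1) * (1 - sum of p_i squared)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    return n / (n - 1) * (1.0 - random_match_probability(haplotypes))
```

    Under these assumed definitions the reported figures are mutually consistent: with n = 162 and a random match probability of 0.0096, the diversity is 162/161 × (1 − 0.0096) ≈ 0.9966, which rounds to the 0.997 given in the abstract.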

  5. Sequencing the hypervariable regions of human mitochondrial DNA using massively parallel sequencing: Enhanced data acquisition for DNA samples encountered in forensic testing.

    Science.gov (United States)

    Davis, Carey; Peters, Dixie; Warshauer, David; King, Jonathan; Budowle, Bruce

    2015-03-01

    Mitochondrial DNA testing is a useful tool in the analysis of forensic biological evidence. In cases where nuclear DNA is damaged or limited in quantity, the higher copy number of mitochondrial genomes available in a sample can provide information about the source of a sample. Currently, Sanger-type sequencing (STS) is the primary method to develop mitochondrial DNA profiles. This method is laborious and time consuming. Massively parallel sequencing (MPS) can increase the amount of information obtained from mitochondrial DNA samples while improving turnaround time by decreasing the number of manipulations and, more so, by exploiting high-throughput analyses to obtain interpretable results. In this study, 18 buccal swabs, three different tissue samples from five individuals, and four bone samples from casework were sequenced at hypervariable regions I and II using STS and MPS. Sample enrichment for STS and MPS was PCR-based. Library preparation for MPS was performed using the Nextera® XT DNA Sample Preparation Kit and sequencing was performed on the MiSeq™ (Illumina, Inc.). MPS yielded full concordance of base calls with STS results, and the newer methodology was able to resolve length heteroplasmy in homopolymeric regions. This study demonstrates that short amplicon MPS of mitochondrial DNA is feasible, can provide information not possible with STS, and lays the groundwork for development of a whole genome sequencing strategy for degraded samples. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  6. Carrier molecules and extraction of circulating tumor DNA for next generation sequencing in colorectal cancer.

    Science.gov (United States)

    Beránek, Martin; Sirák, Igor; Vošmik, Milan; Petera, Jiří; Drastíková, Monika; Palička, Vladimír

    The aims of the study were: i) to compare circulating tumor DNA (ctDNA) yields obtained by different manual extraction procedures, ii) to evaluate the addition of various carrier molecules into the plasma to improve ctDNA extraction recovery, and iii) to use next generation sequencing (NGS) technology to analyze KRAS, BRAF, and NRAS somatic mutations in ctDNA from patients with metastatic colorectal cancer. Venous blood was obtained from patients who suffered from metastatic colorectal carcinoma. For plasma ctDNA extraction, the following carriers were tested: carrier RNA, polyadenylic acid, glycogen, linear acrylamide, yeast tRNA, salmon sperm DNA, and herring sperm DNA. Each extract was characterized by quantitative real-time PCR and next generation sequencing. The addition of polyadenylic acid had a significant positive effect on the amount of ctDNA eluted. The sequencing data revealed five cases of ctDNA mutated in KRAS and one patient with a BRAF mutation. An agreement of 86% was found between tumor tissues and ctDNA. Testing somatic mutations in ctDNA seems to be a promising tool to monitor dynamically changing genotypes of tumor cells circulating in the body. The optimized process of ctDNA extraction should help to obtain more reliable sequencing data in patients with metastatic colorectal cancer.

  7. Sequence analysis of three mitochondrial DNA molecules reveals interesting differences among Saccharomyces yeasts

    DEFF Research Database (Denmark)

    Langkjær, Rikke Breinhold; Casaregola, S.; Ussery, David

    2003-01-01

    The complete sequences of mitochondrial DNA (mtDNA) from the two budding yeasts Saccharomyces castellii and Saccharomyces servazzii, consisting of 25 753 and 30 782 bp, respectively, were analysed and compared to Saccharomyces cerevisiae mtDNA. While some of the traits are very similar among Sac...

  8. Sequencing strategy of mitochondrial HV1 and HV2 DNA with length heteroplasmy

    DEFF Research Database (Denmark)

    Rasmussen, Erik Michael; Sørensen, E; Eriksen, Birthe

    2002-01-01

    We describe a method to obtain reliable mitochondrial DNA (mtDNA) sequences downstream of the homopolymeric stretches with length heteroplasmy in the sequencing direction. The method is based on the use of junction primers that bind to a part of the homopolymeric stretch and the first 2-4 bases...... downstream of the homopolymeric region. This junction primer method gave clear and unambiguous results using samples from 21 individuals with length heteroplasmy in the hypervariable regions HV1, HV2 or both. The method is of special value for forensic casework, because sequencing of both strands of an mtDNA...... region is preferable in order to reduce ambiguities in sequence determination....

  9. Identification of DNA Sequences Specific for Vibrio vulnificus Biotype 2 Strains by Suppression Subtractive Hybridization

    OpenAIRE

    Lee, Chung-Te; Amaro, Carmen; Sanjuán, Eva; Hor, Lien-I

    2005-01-01

    Vibrio vulnificus can be divided into three biotypes, and only biotype 2, which is further divided into serovars, contains eel-virulent strains. We compared the genomic DNA of a biotype 2 serovar E isolate (tester) with the genomic DNAs of three biotype 1 strains by suppression subtractive hybridization and then tested the distribution of the tester-specific DNA sequences in a wide collection of bacterial strains. In this way we identified three plasmid-borne DNA sequences that were specific ...

  10. High Interlaboratory Reproducibility of DNA Sequence-based Typing of Bacteria in a Multicenter Study

    DEFF Research Database (Denmark)

    Sousa, MA de; Boye, Kit; Lencastre, H de

    2006-01-01

    Current DNA amplification-based typing methods for bacterial pathogens often lack interlaboratory reproducibility. In this international study, DNA sequence-based typing of the Staphylococcus aureus protein A gene (spa, 110 to 422 bp) showed 100% intra- and interlaboratory reproducibility without...... extensive harmonization of protocols for 30 blind-coded S. aureus DNA samples sent to 10 laboratories. Specialized software for automated sequence analysis ensured a common typing nomenclature....

  11. Improving the performance of true single molecule sequencing for ancient DNA.

    Science.gov (United States)

    Ginolhac, Aurelien; Vilstrup, Julia; Stenderup, Jesper; Rasmussen, Morten; Stiller, Mathias; Shapiro, Beth; Zazula, Grant; Froese, Duane; Steinmann, Kathleen E; Thompson, John F; Al-Rasheid, Khaled A S; Gilbert, Thomas M P; Willerslev, Eske; Orlando, Ludovic

    2012-05-10

    Second-generation sequencing technologies have revolutionized our ability to recover genetic information from the past, allowing the characterization of the first complete genomes from past individuals and extinct species. Recently, third-generation Helicos sequencing platforms, which perform true Single-Molecule DNA Sequencing (tSMS), have shown great potential for sequencing DNA molecules from Pleistocene fossils. Here, we aim to improve the performance of tSMS for ancient DNA even further by testing two novel tSMS template preparation methods for Pleistocene bone fossils, namely oligonucleotide spiking and treatment with DNA phosphatase. We found that a significantly larger fraction of the horse genome could be covered following oligonucleotide spiking, although not reproducibly and at the cost of extra post-sequencing filtering procedures and skewed %GC content. In contrast, we showed that treating ancient DNA extracts with DNA phosphatase improved the amount of endogenous sequence information recovered per sequencing channel by up to 3.3-fold, while still providing molecular signatures of endogenous ancient DNA damage, including cytosine deamination and fragmentation by depurination. Additionally, we confirmed the existence of molecular preservation niches in large bone crystals from which DNA could be preferentially extracted. We propose DNA phosphatase treatment as a mechanism to increase sequence coverage of ancient genomes when using Helicos tSMS as a sequencing platform. Together with mild denaturation temperatures that favor access to endogenous ancient templates over modern DNA contaminants, this simple preparation procedure can improve overall Helicos tSMS performance when damaged DNA templates are targeted.
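    The cytosine deamination signature mentioned in this abstract is usually visualized as the frequency of C-to-T mismatches at each position from the 5' end of aligned reads. The sketch below is not the authors' pipeline: it assumes reads have already been aligned without gaps and are supplied as (reference, read) string pairs in 5' to 3' orientation, and the function name and `max_pos` parameter are hypothetical.

```python
def c_to_t_by_position(pairs, max_pos=10):
    """Fraction of reference C sites read as T, per position from the 5' end.

    pairs: iterable of (reference, read) strings of equal length, gap-free.
    Positions with no reference C in any pair are reported as None.
    """
    c_sites = [0] * max_pos
    c_to_t = [0] * max_pos
    for ref, read in pairs:
        for pos, (r, q) in enumerate(zip(ref.upper(), read.upper())):
            if pos >= max_pos:
                break
            if r == "C":
                c_sites[pos] += 1
                if q == "T":
                    c_to_t[pos] += 1
    return [t / c if c else None for t, c in zip(c_to_t, c_sites)]
```

    In deaminated ancient templates the returned frequencies are typically elevated at the first few positions and decline toward the read interior, whereas modern contaminant reads are expected to show a flat, low profile.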

  12. Sequence analysis of mitochondrial DNA hypervariable region III of ...

    African Journals Online (AJOL)

    The aims of this research were to study mitochondrial DNA hypervariable region III and establish the degree of variation characteristic of a fragment. The mitochondrial DNA (mtDNA) is a small circular genome located within the mitochondria in the cytoplasm of the cell and a smaller 1.2 kb pair fragment, called the control ...

  13. Agarose Gel Size Selection for DNA Sequencing Libraries.

    Science.gov (United States)

    Mardis, Elaine; McCombie, W Richard

    2017-08-01

    Agarose gel electrophoresis may be used to purify fragmented genomic DNA after ligation of adaptors. After electrophoresis, the region of the gel containing the desired size range of DNA is excised, and the DNA is subsequently extracted from the gel and purified by passage through a spin column. © 2017 Cold Spring Harbor Laboratory Press.

  14. Two-label peak-height encoded DNA sequencing by capillary gel electrophoresis: three examples.

    OpenAIRE

    Chen, D.; Harke, H R; Dovichi, N J

    1992-01-01

    We report a modification to the peak-height encoded DNA sequencing technique of Tabor and Richardson. As in the original protocol, the sequencing reaction uses modified T7 polymerase with manganese rather than magnesium to produce very uniform incorporation of each dideoxynucleoside. To improve sequencing accuracy, two fluorescently labeled primers are employed in separate sequencing reactions. As an example, one sequencing reaction uses a FAM-labeled primer with dideoxyadenosine triphosphate...

  15. True single-molecule DNA sequencing of a pleistocene horse bone

    DEFF Research Database (Denmark)

    Orlando, Ludovic Antoine Alexandre; Ginolhac, Aurélien; Raghavan, Maanasa

    2011-01-01

    Second-generation sequencing platforms have revolutionized the field of ancient DNA, opening access to complete genomes of past individuals and extinct species. However, these platforms are dependent on library construction and amplification steps that may result in sequences that do not reflect ......, indicating the presence of 3'-sequence overhangs. Our results suggest that paleogenomes could be sequenced in an unprecedented manner by combining current second- and third- generation sequencing approaches....

  16. Sequencing historical specimens: successful preparation of small specimens with low amounts of degraded DNA.

    Science.gov (United States)

    Sproul, John S; Maddison, David R

    2017-11-01

    Despite advances that allow DNA sequencing of old museum specimens, sequencing small-bodied, historical specimens can be challenging and unreliable as many contain only small amounts of fragmented DNA. Dependable methods to sequence such specimens are especially critical if the specimens are unique. We attempt to sequence small-bodied (3-6 mm) historical specimens (including nomenclatural types) of beetles that have been housed, dried, in museums for 58-159 years, and for which few or no suitable replacement specimens exist. To better understand ideal approaches of sample preparation and produce preparation guidelines, we compared different library preparation protocols using low amounts of input DNA (1-10 ng). We also explored low-cost optimizations designed to improve library preparation efficiency and sequencing success of historical specimens with minimal DNA, such as enzymatic repair of DNA. We report successful sample preparation and sequencing for all historical specimens despite our low-input DNA approach. We provide a list of guidelines related to DNA repair, bead handling, reducing adapter dimers and library amplification. We present these guidelines to facilitate more economical use of valuable DNA and enable more consistent results in projects that aim to sequence challenging, irreplaceable historical specimens. © 2017 John Wiley & Sons Ltd.

  17. Analysis of T-DNA/Host-Plant DNA Junction Sequences in Single-Copy Transgenic Barley Lines

    Directory of Open Access Journals (Sweden)

    Joanne G. Bartlett

    2014-01-01

    Sequencing across the junction between an integrated transfer DNA (T-DNA) and a host plant genome provides two important pieces of information. The junctions themselves provide information regarding the proportion of T-DNA which has integrated into the host plant genome, whilst the transgene flanking sequences can be used to study the local genetic environment of the integrated transgene. In addition, this information is important in the safety assessment of GM crops and essential for GM traceability. In this study, a detailed analysis was carried out on the right-border T-DNA junction sequences of single-copy independent transgenic barley lines. T-DNA truncations at the right border were found to be relatively common and affected 33.3% of the lines. In addition, 14.3% of lines had rearranged construct sequence after the right-border break-point. An in-depth analysis of the host-plant flanking sequences revealed that a significant proportion of the T-DNAs integrated into or close to known repetitive elements. However, this integration into repetitive DNA did not have a negative effect on transgene expression.

  18. Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

    Science.gov (United States)

    M.N. Islam-Faridi; C.D. Nelson; S.P. DiFazio; L.E. Gunter; G.A. Tuskan

    2009-01-01

    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones...

  19. The complete nucleotide sequence of the mitochondrial DNA of the dogfish, Scyliorhinus canicula.

    OpenAIRE

    Delarbre, C; Spruyt, N; Delmarre, C; Gallut, C.; Barriel, V; Janvier, P.; Laudet, V; Gachelin, G

    1998-01-01

    We have determined the complete nucleotide sequence of the mitochondrial DNA (mtDNA) of the dogfish, Scyliorhinus canicula. The 16,697-bp-long mtDNA possesses a gene organization identical to that of the Osteichthyes, but different from that of the sea lamprey Petromyzon marinus. The main features of the mtDNA of osteichthyans were thus established in the common ancestor to chondrichthyans and osteichthyans. The phylogenetic analysis confirms that the Chondrichthyes are the sister group of th...

  20. Detection of reverse transcriptase termination sites using cDNA ligation and massive parallel sequencing

    DEFF Research Database (Denmark)

    Kielpinski, Lukasz J; Boyd, Mette; Sandelin, Albin

    2013-01-01

    Detection of reverse transcriptase termination sites is important in many different applications, such as structural probing of RNAs, rapid amplification of cDNA 5' ends (5' RACE), cap analysis of gene expression, and detection of RNA modifications and protein-RNA cross-links. The throughput...... amplification, Illumina adapters and index sequences are introduced, thereby allowing amplicons to be pooled and sequenced on the standard Illumina platform for genomic DNA sequencing. Moreover, we demonstrate how to map sequencing reads and perform analysis of the sequencing data with freely available tools...

  1. A unique DNA repair and recombination gene (recN) sequence for ...

    Indian Academy of Sciences (India)

    2013-04-23

    Ribosomal gene sequences are a popular choice for identification of bacterial species and, often, for making phylogenetic interpretations. Although very popular, the sequences of 16S rDNA and 16-23S intergenic sequences often fail to differentiate closely related species of bacteria. The availability of ...

  2. A program for reading DNA sequence gels using a small computer equipped with a graphics tablet.

    Science.gov (United States)

    Lautenberger, J A

    1982-01-01

    A program has been written in BASIC that allows DNA sequence gels to be read by a Tektronix model 4052 computer equipped with a graphics tablet. Sequences from each gel are stored on tape for later transfer to a larger computer where they are melded into a complete overall sequence. The program should be adaptable to other small computers. PMID:7063401

  3. A unique DNA repair and recombination gene (recN) sequence for ...

    Indian Academy of Sciences (India)

    Ribosomal gene sequences are a popular choice for identification of bacterial species and, often, for making phylogenetic interpretations. Although very popular, the sequences of 16S rDNA and 16-23S intergenic sequences often fail to differentiate closely related species of bacteria. The availability of complete genome ...

  4. DATEL: A Scarless and Sequence-Independent DNA Assembly Method Using Thermostable Exonucleases and Ligase.

    Science.gov (United States)

    Jin, Peng; Ding, Wenwen; Du, Guocheng; Chen, Jian; Kang, Zhen

    2016-09-16

    DNA assembly is a pivotal technique in synthetic biology. Here, we report a scarless and sequence-independent DNA assembly method using thermal exonucleases (Taq and Pfu DNA polymerases) and Taq DNA ligase (DATEL). Under the optimized conditions, DATEL allows rapid assembly of 2-10 DNA fragments (1-2 h) with high accuracy (between 74 and 100%). Owing to the simple operation system with denaturation-annealing-cleavage-ligation temperature cycles in one tube, DATEL is expected to be a desirable choice for both manual and automated high-throughput assembly of DNA fragments, which will greatly facilitate the rapid progress of synthetic biology and metabolic engineering.

  5. Single-strand DNA library preparation improves sequencing of formalin-fixed and paraffin-embedded (FFPE) cancer DNA

    Science.gov (United States)

    Stiller, Mathias; Sucker, Antje; Griewank, Klaus; Aust, Daniela; Baretton, Gustavo Bruno; Schadendorf, Dirk; Horn, Susanne

    2016-01-01

    DNA derived from formalin-fixed and paraffin-embedded (FFPE) tissue has been a challenge to large-scale genomic sequencing, due to its low quality and quantity. Improved techniques enabling the genome-wide analysis of FFPE material would be of great value, both from a research and clinical perspective. Comparing a single-strand DNA library preparation method originally developed for ancient DNA to conventional protocols using double-stranded DNA derived from FFPE material, we obtain on average 900-fold more library molecules and improved sequence complexity from as little as 5 ng input DNA. FFPE DNA is highly fragmented, usually below 100 bp, and up to 60% of reads start after or end prior to adenine residues, suggesting that crosslinks predominate at adenine residues. Similar to ancient DNA, C > T substitutions are slightly increased with maximum rates up to 3% at the ends of molecules. In whole exome sequencing of single-strand libraries from lung, breast, colorectal, prostate and skin cancers we identify known cancer mutations. In summary, we show that single-strand library preparation enables genomic sequencing, even from low amounts of degraded FFPE DNA. This method provides a clear advantage both in research and clinical settings, where FFPE material (e.g. from biopsies) often is the only source of DNA available. Improving the genetic characterization that can be performed on conventional archived FFPE tissue, the single-strand library preparation allows scarce samples to be used in personalized medicine and enables larger sample sizes in future sequencing studies. PMID:27463017
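    The end-of-molecule C > T pattern mentioned in this abstract can be tabulated with a few lines of code. The following is a minimal sketch, not the authors' pipeline: it assumes reads have already been aligned without gaps and are supplied as (read, reference) string pairs, and the function name and `window` parameter are hypothetical.

```python
def c_to_t_rate_by_position(alignments, window=5):
    """Fraction of reference C bases read as T at each of the first
    `window` positions from the 5' end of the molecule.

    `alignments` is an iterable of (read, reference) pairs of equal
    length (ungapped). This input format is an assumption made for
    illustration. Positions with no reference C yield None.
    """
    ref_c = [0] * window      # reference C bases seen at each position
    c_to_t = [0] * window     # of those, how many were read as T
    for read, ref in alignments:
        if len(read) != len(ref):
            raise ValueError("read and reference must have equal length")
        for i in range(min(window, len(ref))):
            if ref[i].upper() == "C":
                ref_c[i] += 1
                if read[i].upper() == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / ref_c[i] if ref_c[i] else None for i in range(window)]


# Toy alignments, invented for illustration only.
toy = [("TACG", "CACG"), ("CACG", "CACG"), ("ATCG", "ACCG")]
rates = c_to_t_rate_by_position(toy, window=3)
```

    Running the same tally on reverse-complemented pairs would give the corresponding profile for the 3' end.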

  6. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

    Science.gov (United States)

    Schloss, Patrick D; Jenior, Matthew L; Koumpouras, Charles C; Westcott, Sarah L; Highlander, Sarah K

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  7. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

    Directory of Open Access Journals (Sweden)

    Patrick D. Schloss

    2016-03-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina’s MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3–V5, V1–V3, V1–V5, V1–V6, and V1–V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1–V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina’s MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  8. Realistic artificial DNA sequences as negative controls for computational genomics

    National Research Council Canada - National Science Library

    Caballero, Juan; Smit, Arian F A; Hood, Leroy; Glusman, Gustavo

    2014-01-01

    .... This last method can lead to underestimation of false-positive rates. We developed a new method for generating artificial sequences that are modeled after real intergenic sequences in terms of composition, complexity and interspersed repeat content...

  9. The impact of DNA input amount and DNA source on the performance of whole-exome sequencing in cancer epidemiology.

    Science.gov (United States)

    Zhu, Qianqian; Hu, Qiang; Shepherd, Lori; Wang, Jianmin; Wei, Lei; Morrison, Carl D; Conroy, Jeffrey M; Glenn, Sean T; Davis, Warren; Kwan, Marilyn L; Ergas, Isaac J; Roh, Janise M; Kushi, Lawrence H; Ambrosone, Christine B; Liu, Song; Yao, Song

    2015-08-01

    Whole-exome sequencing (WES) has recently emerged as an appealing approach to systematically study coding variants. However, the requirement for a large amount of high-quality DNA poses a barrier that may limit its application in large cancer epidemiologic studies. We evaluated the performance of WES with low input amount and saliva DNA as an alternative source material. Five breast cancer patients were randomly selected from the Pathways Study. From each patient, four samples, including 3 μg, 1 μg, and 0.2 μg blood DNA and 1 μg saliva DNA, were aliquoted for library preparation using the Agilent SureSelect Kit and sequencing using Illumina HiSeq2500. Quality metrics of sequencing and variant calling, as well as concordance of variant calls from the whole exome and 21 known breast cancer genes, were assessed by input amount and DNA source. There was little difference by input amount or DNA source on the quality of sequencing and variant calling. The concordance rate was about 98% for single-nucleotide variant calls and 83% to 86% for short insertion/deletion calls. For the 21 known breast cancer genes, WES based on low input amount and saliva DNA identified the same set of variants in samples from the same patient. Low DNA input amount, as well as saliva DNA, can be used to generate WES data of satisfactory quality. Our findings support the expansion of WES applications in cancer epidemiologic studies where only low DNA amount or saliva samples are available. ©2015 American Association for Cancer Research.

  10. Influence of DNA sequence on the structure of minicircles under torsional stress.

    Science.gov (United States)

    Wang, Qian; Irobalieva, Rossitza N; Chiu, Wah; Schmid, Michael F; Fogg, Jonathan M; Zechiedrich, Lynn; Pettitt, B Montgomery

    2017-07-27

    The sequence dependence of the conformational distribution of DNA under various levels of torsional stress is an important unsolved problem. Combining theory and coarse-grained simulations shows that the DNA sequence and a structural correlation due to topology constraints of a circle are the main factors that dictate the 3D structure of a 336 bp DNA minicircle under torsional stress. We found that DNA minicircle topoisomers can have multiple bend locations under high torsional stress and that the positions of these sharp bends are determined by the sequence, and by a positive mechanical correlation along the sequence. We showed that simulations and theory are able to provide sequence-specific information about individual DNA minicircles observed by cryo-electron tomography (cryo-ET). We provided a sequence-specific cryo-ET tomogram fitting of DNA minicircles, registering the sequence within the geometric features. Our results indicate that the conformational distribution of minicircles under torsional stress can be designed, which has important implications for using minicircle DNA for gene therapy. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. HMGA1a recognition candidate DNA sequences in humans.

    Directory of Open Access Journals (Sweden)

    Takayuki Manabe

    High mobility group protein A1a (HMGA1a) acts as an architectural transcription factor and influences a diverse array of normal biological processes. It binds AT-rich sequences, and previous reports have demonstrated HMGA1a binding to the authentic promoters of various genes. However, the precise sequences that HMGA1a binds to remain to be clarified. Therefore, in this study, we searched for the sequences with the highest affinity for human HMGA1a using an existing SELEX method, and then compared the identified sequences with known human promoter sequences. Based on our results, we propose the sequences "-(G/A)-G-(A/T)-(A/T)-A-T-T-T-" as HMGA1a-binding candidate sequences. Furthermore, these candidate sequences bound native human HMGA1a from SK-N-SH cells. When candidate sequences were analyzed by performing FASTA searches against all known human promoter sequences, 500-900 sequences were hit by each one. Some of the extracted genes have already been proven or suggested as HMGA1a-binding promoters. The candidate sequences presented here represent important information for research into the various roles of HMGA1a, including cell differentiation, death, growth, proliferation, and the pathogenesis of cancer.
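    The candidate consensus proposed in this abstract, (G/A)-G-(A/T)-(A/T)-A-T-T-T, maps directly onto a regular expression. The sketch below is illustrative only: the study itself used FASTA searches against promoter databases, whereas this example swaps in a plain regex scan of one strand, and the function name and example fragment are hypothetical.

```python
import re

# Consensus from the abstract: (G/A)-G-(A/T)-(A/T)-A-T-T-T.
# A lookahead is used so that overlapping occurrences are all reported.
HMGA1A_MOTIF = re.compile(r"(?=([GA]G[AT][AT]ATTT))")


def find_motif_sites(sequence):
    """Return (0-based start, matched 8-mer) for each site on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in HMGA1A_MOTIF.finditer(seq)]


# Hypothetical promoter fragment, invented for illustration only.
fragment = "ccGGAAATTTgcAGTTATTTaa"
sites = find_motif_sites(fragment)
```

    Scanning the reverse complement as well would be needed to cover both strands of a promoter.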

  12. The DNA sequence, annotation and analysis of human chromosome 3

    DEFF Research Database (Denmark)

    Muzny, Donna M; Scherer, Steven E; Kaul, Rajinder

    2006-01-01

    After the completion of a draft human genome sequence, the International Human Genome Sequencing Consortium has proceeded to finish and annotate each of the 24 chromosomes comprising the human genome. Here we describe the sequencing and analysis of human chromosome 3, one of the largest human chr...

  13. cDNA, genomic sequence cloning and overexpression of ribosomal ...

    African Journals Online (AJOL)

    Alignment analysis indicated that the nucleotide sequence of the coding sequence shows high homology to those of Homo sapiens, Pongo abelii, Macaca fascicularis, Mus musculus, Bos taurus and Rattus norvegicus (93.1, 92.5, 92.2, 91.1, 90.6 and 90.0%, respectively). The amino acid sequence encoded by RPS20 ...

  14. Sequence-specific activation of the DNA sensor cGAS by Y-form DNA structures as found in primary HIV-1 cDNA.

    Science.gov (United States)

    Herzner, Anna-Maria; Hagmann, Cristina Amparo; Goldeck, Marion; Wolter, Steven; Kübler, Kirsten; Wittmann, Sabine; Gramberg, Thomas; Andreeva, Liudmila; Hopfner, Karl-Peter; Mertens, Christina; Zillinger, Thomas; Jin, Tengchuan; Xiao, Tsan Sam; Bartok, Eva; Coch, Christoph; Ackermann, Damian; Hornung, Veit; Ludwig, Janos; Barchet, Winfried; Hartmann, Gunther; Schlee, Martin

    2015-10-01

    Cytosolic DNA that emerges during infection with a retrovirus or DNA virus triggers antiviral type I interferon responses. So far, only double-stranded DNA (dsDNA) over 40 base pairs (bp) in length has been considered immunostimulatory. Here we found that unpaired DNA nucleotides flanking short base-paired DNA stretches, as in stem-loop structures of single-stranded DNA (ssDNA) derived from human immunodeficiency virus type 1 (HIV-1), activated the type I interferon-inducing DNA sensor cGAS in a sequence-dependent manner. DNA structures containing unpaired guanosines flanking short (12- to 20-bp) dsDNA (Y-form DNA) were highly stimulatory and specifically enhanced the enzymatic activity of cGAS. Furthermore, we found that primary HIV-1 reverse transcripts represented the predominant viral cytosolic DNA species during early infection of macrophages and that these ssDNAs were highly immunostimulatory. Collectively, our study identifies unpaired guanosines in Y-form DNA as a highly active, minimal cGAS recognition motif that enables detection of HIV-1 ssDNA.

  15. Sequence specificity of DNA alkylation by the antitumor natural product leinamycin.

    Science.gov (United States)

    Zang, Hong; Gates, Kent S

    2003-12-01

    Reaction with thiol converts the antitumor natural product leinamycin to an episulfonium ion that alkylates the N(7)-position of guanine residues in double-stranded DNA. The sequence specificity for DNA alkylation by this structurally novel compound has not previously been examined. It is reported here that leinamycin shows significant (>10-fold) preferences for alkylation at the 5'-G in 5'-GG and 5'-GT sequences. The sequence preferences for activated leinamycin are significantly different from that observed for the structurally simple episulfonium ion generated from 2-chloroethyl ethyl sulfide. DNA alkylation by activated leinamycin is inhibited by addition of salt (100 mM NaClO(4)), although the degree of inhibition is somewhat less than that seen for 2-chloroethyl ethyl sulfide. This result suggests that electrostatic interactions between the activated leinamycin and the N(7)-position of guanine residues facilitate efficient DNA alkylation. However, the observed sequence preferences for DNA alkylation by activated leinamycin do not correlate strongly with calculated sequence-dependent variations in the molecular electrostatic potential at the N(7)-atom of guanine residues in duplex DNA. Thus, electrostatic interactions between activated leinamycin and DNA do not appear to be the primary determinant for sequence specificity. Rather, the results suggest that sequence-specific noncovalent interactions of leinamycin with the DNA double helix on the 3'-side of the alkylated guanine residue play a major role in determining the preferred alkylation sites. Consistent with the notion that noncovalent binding plays an important role in DNA alkylation by leinamycin, experiments with 2'-deoxyoligonucleotide substrates confirm that the natural product does not alkylate single-stranded DNA under conditions where duplex DNA is efficiently alkylated.

  16. Sequence analysis of mitochondrial DNA hypervariable region III of ...

    African Journals Online (AJOL)

    Aghomotsegin

    2015-07-01

    ... degradation; third, higher rate of evolution: DNA alterations (mutations) occur in a number of ... The result is that the rate of change, or evolutionary rate, of mitochondrial DNA is about five times greater .... example mass graves in mass disasters, there are newly discovered forensically validated methods ...

  17. An Automated Sample Preparation System for Large-Scale DNA Sequencing

    Science.gov (United States)

    Marziali, Andre; Willis, Thomas D.; Federspiel, Nancy A.; Davis, Ronald W.

    1999-01-01

    Recent advances in DNA sequencing technologies, both in the form of high lane-density gels and automated capillary systems, will lead to an increased requirement for sample preparation systems that operate at low cost and high throughput. As part of the development of a fully automated sequencing system, we have developed an automated subsystem capable of producing 10,000 sequence-ready ssDNA templates per day from libraries of M13 plaques at a cost of $0.29 per sample. This Front End has been in high throughput operation since June, 1997 and has produced > 400,000 high-quality DNA templates. PMID:10330125

  18. The Study of Correlation Structures of DNA Sequences A Critical Review

    CERN Document Server

    Li, W

    1997-01-01

    The study of correlation structure in the primary sequences of DNA is reviewed. The issues reviewed include: symmetries among 16 base-base correlation functions, accurate estimation of correlation measures, the relationship between $1/f$ and Lorentzian spectra, heterogeneity in DNA sequences, different modeling strategies of the correlation structure of DNA sequences, the difference of correlation structure between coding and non-coding regions (besides the period-3 pattern), and source of broad distribution of domain sizes. Although some of the results remain controversial, a body of work on this topic constitutes a good starting point for future studies.
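    As a concrete illustration of the 16 base-base correlation functions mentioned in this abstract, the sketch below estimates them for a single sequence. It assumes one common definition, Gamma_ab(d) = P_ab(d) - P_a * P_b, where P_ab(d) is the frequency of base a followed by base b at distance d; the review may use other estimators, and the function name is hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def base_base_correlations(sequence, distance):
    """Return {(a, b): P_ab(d) - P_a * P_b} for all 16 ordered base pairs.

    P_ab(d) is estimated from the pairs (seq[i], seq[i + d]); P_a and P_b
    are the single-base frequencies over the whole sequence. This is one
    simple estimator, assumed here for illustration.
    """
    seq = sequence.upper()
    n_pairs = len(seq) - distance
    if distance < 1 or n_pairs < 1:
        raise ValueError("distance must be >= 1 and shorter than the sequence")
    single = Counter(seq)
    pairs = Counter(zip(seq[:n_pairs], seq[distance:]))
    n = len(seq)
    return {
        (a, b): pairs[(a, b)] / n_pairs - (single[a] / n) * (single[b] / n)
        for a in BASES
        for b in BASES
    }


# A strictly period-2 toy sequence: A is always followed by A at distance 2.
corr = base_base_correlations("ACACACAC", distance=2)
```

    For a sequence containing only A, C, G and T, the 16 values sum to zero, which is one of the constraints that reduces the number of independent correlation functions.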

  19. Ecological niche modelling and nDNA sequencing support a new, morphologically cryptic beetle species unveiled by DNA barcoding.

    Directory of Open Access Journals (Sweden)

    Oliver Hawlitschek

    2011-02-01

    DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct, but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp. n. In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species.

  20. Sequencing for complete rDNA sequences (18S, ITS1, 5.8S, ITS2, and 28S rDNA) of Demodex and phylogenetic analysis of Acari based on 18S and 28S rDNA.

    Science.gov (United States)

    Zhao, Ya-E; Wu, Li-Ping; Hu, Li; Xu, Yang; Wang, Zheng-Hang; Liu, Wen-Yan

    2012-11-01

    Due to the difficulty of DNA extraction for Demodex, few studies have dealt with the identification and the phyletic evolution of Demodex at the molecular level. In this study, we amplified, sequenced, and analyzed a complete (Demodex folliculorum) and an almost complete (D12 missing) (Demodex brevis) ribosomal DNA (rDNA) sequence and also analyzed the primary sequences of divergent domains in small-subunit ribosomal RNA (rRNA) of 51 species and in large-subunit rRNA of 43 species from four superfamilies in Acari (Cheyletoidea, Tetranychoidea, Analgoidea, and Ixodoidea). The results revealed that the 18S rDNA sequence was relatively conserved in rDNA-coding regions and was not evolving as rapidly as the 28S rDNA sequence. The evolutionary rates of transcribed spacer regions were much higher than those of the coding regions. The maximum parsimony trees of 18S and 28S rDNA appeared to be almost identical, consistent with their morphological classification. Based on the fact that the resolution capability of sequence length and the divergence of the 13 segments (D1-D6, D7a, D7b, and D8-D12) of 28S rDNA were stronger than that of the nine variable regions (V1-V9) of 18S rDNA, we were able to identify Demodex (Cheyletoidea) by the indels occurring in D2, D6, and D8.

  1. Comparison of sequence of cDNA clone with other genomic and cDNA sequences for human C-reactive protein

    Energy Technology Data Exchange (ETDEWEB)

    Tenchini, M.L.; Bossi, E.; Marchetti, L.; Malcovati, M. (Universita di Milano Via Viotti (Italy)); Lorenzetti, R. (M.M.D.R.I. Via R. Lepetit, Gerenzano (Italy))

    1992-04-01

    A clone for C-reactive protein (CRP) has been isolated from a human liver cDNA library; this clone harbors a plasmid, pC81, which has an insert of 1631 bp. When compared to genomic and cDNA sequences published to date, pC81 has revealed homologies and differences that might help to clarify the structure of this gene and the presence of allelic variants in man.

  2. Plant organellar DNA primase-helicase synthesizes RNA primers for organellar DNA polymerases using a unique recognition sequence.

    Science.gov (United States)

    Peralta-Castro, Antolín; Baruch-Torres, Noe; Brieba, Luis G

    2017-10-13

    DNA primases recognize single-stranded DNA (ssDNA) sequences to synthesize RNA primers during lagging-strand replication. Arabidopsis thaliana encodes an ortholog of the DNA primase-helicase from bacteriophage T7, dubbed AtTwinkle, that localizes in chloroplasts and mitochondria. Herein, we report that AtTwinkle synthesizes RNA primers from a 5'-(G/C)GGA-3' template sequence. Within this sequence, the underlined nucleotides are cryptic, meaning that they are essential for template recognition but are not instructional during RNA synthesis. Thus, in contrast to all primases characterized to date, the sequence recognized by AtTwinkle requires two nucleotides (5'-GA-3') as a cryptic element. The divergent zinc finger binding domain (ZBD) of the primase module of AtTwinkle may be responsible for template sequence recognition. During oligoribonucleotide synthesis, AtTwinkle shows a strong preference for rCTP as its initial ribonucleotide and a moderate preference for rGMP or rCMP incorporation during elongation. RNA products synthesized by AtTwinkle are efficiently used as primers for plant organellar DNA polymerases. In sum, our data strongly suggest that AtTwinkle primes organellar DNA polymerases during lagging-strand synthesis in plant mitochondria and chloroplasts, following a primase-mediated mechanism. This mechanism contrasts with lagging-strand DNA replication in metazoan mitochondria, in which transcripts synthesized by mitochondrial RNA polymerase prime mitochondrial DNA polymerase γ. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Bisulfite sequencing reveals that Aspergillus flavus holds a hollow in DNA methylation.

    Directory of Open Access Journals (Sweden)

    Si-Yang Liu

    Aspergillus flavus first gained scientific attention for its production of aflatoxin. The underlying regulation of aflatoxin biosynthesis has served as a theoretical model for the biosynthesis of other microbial secondary metabolites. Nevertheless, for several decades, the DNA methylation status of A. flavus, one of the important epigenomic modifications involved in gene regulation, has remained controversial. Here, we applied bisulfite sequencing in conjunction with a biological replicate strategy to investigate the DNA methylation profile of the A. flavus genome. Both the bisulfite sequencing data and the methylome comparisons with other fungi confirm that the DNA methylation level of this fungus is negligible. Further investigation into the DNA methyltransferase of Aspergillus uncovers its close relationship with RID-like enzymes as well as its divergence from the methyltransferases of species with validated DNA methylation. The lack of repeat content in the A. flavus genome and the high RIP-index of the small amount of remnant repeats potentially support our speculation that DNA methylation may be absent in A. flavus or that it may possess de novo DNA methylation which occurs very transiently during the obscure sexual stage of this fungal species. This work contributes to our understanding of the DNA methylation status of A. flavus, and reinforces our views on DNA methylation in fungal species. In addition, our strategy of applying bisulfite sequencing to DNA methylation detection in species with low DNA methylation may serve as a reference for later scientific investigations in other hypomethylated species.

  4. Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood

    Science.gov (United States)

    Fan, H. Christina; Blumenfeld, Yair J.; Chitkara, Usha; Hudgins, Louanne; Quake, Stephen R.

    2008-01-01

    We directly sequenced cell-free DNA with high-throughput shotgun sequencing technology from plasma of pregnant women, obtaining, on average, 5 million sequence tags per patient sample. This enabled us to measure the over- and underrepresentation of chromosomes from an aneuploid fetus. The sequencing approach is polymorphism-independent and therefore universally applicable for the noninvasive detection of fetal aneuploidy. Using this method, we successfully identified all nine cases of trisomy 21 (Down syndrome), two cases of trisomy 18 (Edwards syndrome), and one case of trisomy 13 (Patau syndrome) in a cohort of 18 normal and aneuploid pregnancies; trisomy was detected at gestational ages as early as the 14th week. Direct sequencing also allowed us to study the characteristics of cell-free plasma DNA, and we found evidence that this DNA is enriched for sequences from nucleosomes. PMID:18838674
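    The idea of measuring chromosome over- or underrepresentation from sequence tag counts can be sketched as follows. The abstract does not state which statistic the authors used, so this example assumes a simple z-score of a chromosome's tag fraction against a set of reference (presumed euploid) samples; the function names and all numbers are hypothetical.

```python
from statistics import mean, stdev


def chromosome_fraction(tag_counts, chromosome):
    """Fraction of all mapped sequence tags that fall on `chromosome`."""
    total = sum(tag_counts.values())
    return tag_counts[chromosome] / total


def z_score(sample_fraction, reference_fractions):
    """Standard score of a sample's tag fraction relative to reference samples.

    Using a z-score here is an assumption made for illustration; it is not
    taken from the abstract.
    """
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)
    return (sample_fraction - mu) / sigma


# Invented chr21 tag fractions for four reference samples and one test sample.
reference = [0.0130, 0.0131, 0.0129, 0.0130]
test_sample = 0.0138
score = z_score(test_sample, reference)  # large positive value suggests excess chr21
```

    A threshold for calling aneuploidy would have to be chosen and validated separately; none is implied here.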

  5. Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood.

    Science.gov (United States)

    Fan, H Christina; Blumenfeld, Yair J; Chitkara, Usha; Hudgins, Louanne; Quake, Stephen R

    2008-10-21

    We directly sequenced cell-free DNA with high-throughput shotgun sequencing technology from plasma of pregnant women, obtaining, on average, 5 million sequence tags per patient sample. This enabled us to measure the over- and underrepresentation of chromosomes from an aneuploid fetus. The sequencing approach is polymorphism-independent and therefore universally applicable for the noninvasive detection of fetal aneuploidy. Using this method, we successfully identified all nine cases of trisomy 21 (Down syndrome), two cases of trisomy 18 (Edwards syndrome), and one case of trisomy 13 (Patau syndrome) in a cohort of 18 normal and aneuploid pregnancies; trisomy was detected at gestational ages as early as the 14th week. Direct sequencing also allowed us to study the characteristics of cell-free plasma DNA, and we found evidence that this DNA is enriched for sequences from nucleosomes.

  6. Z-DNA-forming sequences generate large-scale deletions in mammalian cells.

    Science.gov (United States)

    Wang, Guliang; Christensen, Laura A; Vasquez, Karen M

    2006-02-21

    Spontaneous chromosomal breakages frequently occur at genomic hot spots in the absence of DNA damage and can result in translocation-related human disease. Chromosomal breakpoints are often mapped near purine-pyrimidine Z-DNA-forming sequences in human tumors. However, it is not known whether Z-DNA plays a role in the generation of these chromosomal breakages. Here, we show that Z-DNA-forming sequences induce high levels of genetic instability in both bacterial and mammalian cells. In mammalian cells, the Z-DNA-forming sequences induce double-strand breaks nearby, resulting in large-scale deletions in 95% of the mutants. These Z-DNA-induced double-strand breaks in mammalian cells are not confined to a specific sequence but rather are dispersed over a 400-bp region, consistent with chromosomal breakpoints in human diseases. This observation is in contrast to the mutations generated in Escherichia coli that are predominantly small deletions within the repeats. We found that the frequency of small deletions is increased by replication in mammalian cell extracts. Surprisingly, the large-scale deletions generated in mammalian cells are, at least in part, replication-independent and are likely initiated by repair processing cleavages surrounding the Z-DNA-forming sequence. These results reveal that mammalian cells process Z-DNA-forming sequences in a strikingly different fashion from that used by bacteria. Our data suggest that Z-DNA-forming sequences may be causative factors for gene translocations found in leukemias and lymphomas and that certain cellular conditions such as active transcription may increase the risk of Z-DNA-related genetic instability.

  7. Rapid and accurate identification of microorganisms contaminating cosmetic products based on DNA sequence homology.

    Science.gov (United States)

    Fujita, Y; Shibayama, H; Suzuki, Y; Karita, S; Takamatsu, S

    2005-12-01

    The aim of this study was to develop rapid and accurate procedures to identify microorganisms contaminating cosmetic products, based on the identity of the nucleotide sequences of the internal transcribed spacer (ITS) region of the ribosomal RNA coding DNA (rDNA). Five types of microorganisms were isolated from the inner portion of lotion bottle caps, skin care lotions, and cleansing gels. The rDNA ITS region of the microorganisms was amplified by colony-direct PCR or ordinal PCR using DNA extracts as templates. The nucleotide sequences of the amplified DNA were determined and subjected to a homology search of a publicly available DNA database. For all five organisms analyzed, the databases returned DNA sequences with high similarity to the query sequences. The traditional identification procedure requires expert skills and approximately 1 month to identify the microorganisms. In contrast, 3-7 days were sufficient to complete all the procedures employed in the current method, including isolation and cultivation of organisms, DNA sequencing, and the database homology search. Moreover, the skills necessary to perform the molecular techniques required for the identification procedures could be developed within 1 week. Consequently, the current method is useful for rapid and accurate identification of microorganisms contaminating cosmetics.

  8. Human Genome Sequencing at the Population Scale: A Primer on High-Throughput DNA Sequencing and Analysis.

    Science.gov (United States)

    Goldfeder, Rachel L; Wall, Dennis P; Khoury, Muin J; Ioannidis, John P A; Ashley, Euan A

    2017-10-15

    Most human diseases have underlying genetic causes. To better understand the impact of genes on disease and its implications for medicine and public health, researchers have pursued methods for determining the sequences of individual genes, then all genes, and now complete human genomes. Massively parallel high-throughput sequencing technology, where DNA is sheared into smaller pieces, sequenced, and then computationally reordered and analyzed, enables fast and affordable sequencing of full human genomes. As the price of sequencing continues to decline, more and more individuals are having their genomes sequenced. This may facilitate better population-level disease subtyping and characterization, as well as individual-level diagnosis and personalized treatment and prevention plans. In this review, we describe several massively parallel high-throughput DNA sequencing technologies and their associated strengths, limitations, and error modes, with a focus on applications in epidemiologic research and precision medicine. We detail the methods used to computationally process and interpret sequence data to inform medical or preventative action. © The Author(s) 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health.

  9. Application of Quaternion in improving the quality of global sequence alignment scores for an ambiguous sequence target in Streptococcus pneumoniae DNA

    Science.gov (United States)

    Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.

    2017-07-01

    A DNA sequence can be defined as a succession of letters representing the order of nucleotides within DNA, using the four base codes adenine (A), guanine (G), cytosine (C), and thymine (T). The precise sequence is determined using DNA sequencing methods, which have been developed since the 1970s and have matured into advanced high-throughput sequencing technologies. DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases sequencing produces ambiguous results, making it difficult to determine whether a given base is A, T, G, or C. To address this problem, in this study we introduce another representation of DNA codes, the Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probabilities that the bases A, T, G, C appear in Q and PA + PT + PG + PC = 1. Using Quaternion representations, we construct an improved scoring matrix for global sequence alignment by applying a dot product method. This scoring matrix produces higher-quality match and mismatch scores between two DNA base codes. We implemented the Needleman-Wunsch global sequence alignment algorithm in Octave to analyze our target sequence, which contains some ambiguous sequence data. The subject sequences are DNA sequences of Streptococcus pneumoniae families obtained from GenBank, while the target DNA sequence was received from our collaborator's database. We found that the Quaternion representations improve the quality of the sequence alignment score, and we conclude that the target DNA sequence has maximum similarity with Streptococcus pneumoniae.
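    The combination of probability vectors, a dot product score, and Needleman-Wunsch alignment described above can be sketched as follows. The abstract reports an Octave implementation; this Python version is a swapped-in illustration only. The way the dot product is turned into a match/mismatch score, the gap penalty, and the uniform vector used for an ambiguous base "N" are all assumptions, not values taken from the study.

```python
# Probability vectors in the order (PA, PT, PG, PC); "N" is an assumed
# fully ambiguous base with equal probability for each nucleotide.
BASES = {
    "A": (1.0, 0.0, 0.0, 0.0),
    "T": (0.0, 1.0, 0.0, 0.0),
    "G": (0.0, 0.0, 1.0, 0.0),
    "C": (0.0, 0.0, 0.0, 1.0),
    "N": (0.25, 0.25, 0.25, 0.25),
}

def encode(seq):
    """Map a string of base codes to a list of probability vectors."""
    return [BASES[b] for b in seq]

def dot(p, q):
    """Dot product: probability that two positions carry the same base."""
    return sum(a * b for a, b in zip(p, q))

def score(p, q, match=1.0, mismatch=-1.0):
    """Expected score under the assumed match/mismatch values."""
    d = dot(p, q)
    return match * d + mismatch * (1.0 - d)

def needleman_wunsch(xs, ys, gap=-2.0):
    """Global alignment score with a linear gap penalty (assumed value)."""
    n, m = len(xs), len(ys)
    F = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + score(xs[i - 1], ys[j - 1]),  # align pair
                F[i - 1][j] + gap,                              # gap in ys
                F[i][j - 1] + gap,                              # gap in xs
            )
    return F[n][m]

print(needleman_wunsch(encode("ACGT"), encode("ACNT")))
```

    Under these assumptions an ambiguous position contributes a score between a full match and a full mismatch, rather than being forced into one of the two.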

  10. mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud.

    Science.gov (United States)

    Weissensteiner, Hansi; Forer, Lukas; Fuchsberger, Christian; Schöpf, Bernd; Kloss-Brandstätter, Anita; Specht, Günther; Kronenberg, Florian; Schönherr, Sebastian

    2016-07-08

    Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. Modeling the early stage of DNA sequence recognition within RecA nucleoprotein filaments.

    Science.gov (United States)

    Saladin, Adrien; Amourda, Christopher; Poulain, Pierre; Férey, Nicolas; Baaden, Marc; Zacharias, Martin; Delalande, Olivier; Prévost, Chantal

    2010-10-01

    Homologous recombination is a fundamental process enabling the repair of double-strand breaks with a high degree of fidelity. In prokaryotes, it is carried out by RecA nucleofilaments formed on single-stranded DNA (ssDNA). These filaments incorporate genomic sequences that are homologous to the ssDNA and exchange the homologous strands. Due to the highly dynamic character of this process and its rapid propagation along the filament, the sequence recognition and strand exchange mechanism remains unknown at the structural level. The recently published structure of the RecA/DNA filament active for recombination (Chen et al., Mechanism of homologous recombination from the RecA-ssDNA/dsDNA structure, Nature 2008, 453, 489) provides a starting point for new exploration of the system. Here, we investigate the possible geometries of association of the early encounter complex between RecA/ssDNA filament and double-stranded DNA (dsDNA). Due to the huge size of the system and its dense packing, we use a reduced representation for protein and DNA together with state-of-the-art molecular modeling methods, including systematic docking and virtual reality simulations. The results indicate that it is possible for the double-stranded DNA to access the RecA-bound ssDNA while initially retaining its Watson-Crick pairing. They emphasize the importance of RecA L2 loop mobility for both recognition and strand exchange.

  12. Protection of the genome and central protein-coding sequences by non-coding DNA against DNA damage from radiation.

    Science.gov (United States)

    Qiu, Guo-Hua

    2015-01-01

    Non-coding DNA comprises a very large proportion of the total genomic content in higher organisms, but its function remains largely unclear. Non-coding DNA sequences constitute the majority of peripheral heterochromatin, which has been hypothesized to be the genome's 'bodyguard' against DNA damage from chemicals and radiation for almost four decades. The bodyguard protective function of peripheral heterochromatin in genome defense has been strengthened by the results from numerous recent studies, which are summarized in this review. These data have suggested that cells and/or organisms with a higher level of heterochromatin and more non-coding DNA sequences, including longer telomeric DNA and rDNAs, exhibit a lower frequency of DNA damage, higher radioresistance and longer lifespan after IR exposure. In addition, the majority of heterochromatin is peripherally located in the three-dimensional structure of genome organization. Therefore, the peripheral heterochromatin with non-coding DNA could play a protective role in genome defense against DNA damage from ionizing radiation by both absorbing the radicals from water radiolysis in the cytosol and reducing the energy of IR. However, the bodyguard protection by heterochromatin has been challenged by the observation that DNA damage is less frequently detected in peripheral heterochromatin than in euchromatin, which is inconsistent with the expectation and simulation results. Previous studies have also shown that the DNA damage in peripheral heterochromatin is rarely repaired and moves more quickly, broadly and outwardly to approach the nuclear pore complex (NPC). Additionally, it has been shown that extrachromosomal circular DNAs (eccDNAs) are formed in the nucleus, highly detectable in the cytoplasm (particularly under stress conditions) and shuttle between the nucleus and the cytoplasm. Based on these studies, this review speculates that the sites of DNA damage in peripheral heterochromatin could occur more

  13. Ancient mtDNA sequences from the First Australians revisited

    National Research Council Canada - National Science Library

    Heupink, Tim H; Subramanian, Sankar; Wright, Joanne L; Endicott, Phillip; Westaway, Michael Carrington; Huynen, Leon; Parson, Walther; Millar, Craig D; Willerslev, Eske; Lambert, David M

    2016-01-01

    ... [Willandra Lakes Hominid (WLH3)]. This landmark study in human ancient DNA suggested that an early modern human mitochondrial lineage emerged in Asia and that the theory of modern human origins could no longer be considered solely...

  14. Decoding the regulatory landscape of medulloblastoma using DNA methylation sequencing

    NARCIS (Netherlands)

    Hovestadt, Volker; Jones, David T. W.; Picelli, Simone; Wang, Wei; Kool, Marcel; Northcott, Paul A.; Sultan, Marc; Stachurski, Katharina; Ryzhova, Marina; Warnatz, Hans-Jörg; Ralser, Meryem; Brun, Sonja; Bunt, Jens; Jäger, Natalie; Kleinheinz, Kortine; Erkek, Serap; Weber, Ursula D.; Bartholomae, Cynthia C.; von Kalle, Christof; Lawerenz, Chris; Eils, Jürgen; Koster, Jan; Versteeg, Rogier; Milde, Till; Witt, Olaf; Schmidt, Sabine; Wolf, Stephan; Pietsch, Torsten; Rutkowski, Stefan; Scheurlen, Wolfram; Taylor, Michael D.; Brors, Benedikt; Felsberg, Jörg; Reifenberger, Guido; Borkhardt, Arndt; Lehrach, Hans; Wechsler-Reya, Robert J.; Eils, Roland; Yaspo, Marie-Laure; Landgraf, Pablo; Korshunov, Andrey; Zapatka, Marc; Radlwimmer, Bernhard; Pfister, Stefan M.; Lichter, Peter

    2014-01-01

    Epigenetic alterations, that is, disruption of DNA methylation and chromatin architecture, are now acknowledged as a universal feature of tumorigenesis. Medulloblastoma, a clinically challenging, malignant childhood brain tumour, is no exception. Despite much progress from recent genomics studies,

  15. Sequences from first settlers reveal rapid evolution in Icelandic mtDNA pool.

    Science.gov (United States)

    Helgason, Agnar; Lalueza-Fox, Carles; Ghosh, Shyamali; Sigurðardóttir, Sigrún; Sampietro, Maria Lourdes; Gigli, Elena; Baker, Adam; Bertranpetit, Jaume; Arnadóttir, Lilja; Þorsteinsdottir, Unnur; Stefánsson, Kári

    2009-01-01

    A major task in human genetics is to understand the nature of the evolutionary processes that have shaped the gene pools of contemporary populations. Ancient DNA studies have great potential to shed light on the evolution of populations because they provide the opportunity to sample from the same population at different points in time. Here, we show that a sample of mitochondrial DNA (mtDNA) control region sequences from 68 early medieval Icelandic skeletal remains is more closely related to sequences from contemporary inhabitants of Scotland, Ireland, and Scandinavia than to those from the modern Icelandic population. Due to a faster rate of genetic drift in the Icelandic mtDNA pool during the last 1,100 years, the sequences carried by the first settlers were better preserved in their ancestral gene pools than among their descendants in Iceland. These results demonstrate the inferential power gained in ancient DNA studies through the application of population genetics analyses to relatively large samples.

  16. Multi-modulus algorithm based on global artificial fish swarm intelligent optimization of DNA encoding sequences.

    Science.gov (United States)

    Guo, Y C; Wang, H; Wu, H P; Zhang, M Q

    2015-12-21

    To address the large mean square error (MSE) and slow convergence of the constant modulus algorithm (CMA) when equalizing multi-modulus signals, a multi-modulus algorithm (MMA) based on global artificial fish swarm (GAFS) intelligent optimization of DNA encoding sequences (GAFS-DNA-MMA) was proposed. To improve the convergence rate and reduce the MSE, the proposed algorithm adopted an encoding method based on DNA nucleotide chains to represent candidate solutions. Furthermore, the GAFS algorithm, with its fast convergence and global search ability, was used to find the best sequence. The real and imaginary parts of the initial optimal weight vector of MMA were obtained through DNA coding of the best sequence. The simulation results show that the proposed algorithm has a faster convergence speed and smaller MSE than the CMA, the MMA, and the AFS-DNA-MMA.

  17. Comparing the performance of three ancient DNA extraction methods for high-throughput sequencing

    DEFF Research Database (Denmark)

    Gamba, Cristina; Hanghøj, Kristian Ebbesen; Gaunitz, Charleen

    2016-01-01

    The DNA molecules that can be extracted from archaeological and palaeontological remains are often degraded and massively contaminated with environmental microbial material. This reduces the efficacy of shotgun approaches for sequencing ancient genomes, despite the decreasing sequencing costs of high-throughput sequencing (HTS). Improving the recovery of endogenous molecules from the DNA extraction and purification steps could, thus, help advance the characterization of ancient genomes. Here, we apply the three most commonly used DNA extraction methods to five ancient bone samples spanning a ~30 thousand year temporal range and originating from a diversity of environments, from South America to Alaska. We show that methods based on the purification of DNA fragments using silica columns are more advantageous than in solution methods and increase not only the total amount of DNA molecules...

  18. 5'-end sequences of budding yeast full-length cDNA clones and quality scores - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Budding yeast cDNA sequencing project 5'-end sequences of budding yeast full-length cDNA clones and quality score...s Data detail Data name 5'-end sequences of budding yeast full-length cDNA clones and quality score...or-capping method, the sequence quality score generated by the Phred software, and links to SGD, dbEST and U...es. FASTA format. Quality Phred's quality score About This Database Database Desc...g yeast full-length cDNA clones and quality scores - Budding yeast cDNA sequencing project | LSDB Archive ...

  19. Genomic signal processing methods for computation of alignment-free distances from DNA sequences.

    Science.gov (United States)

    Borrayo, Ernesto; Mendizabal-Ruiz, E Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P; Morales, J Alejandro

    2014-01-01

    Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.

  20. Designing universal primers for the isolation of DNA sequences encoding Proanthocyanidins biosynthetic enzymes in Crataegus aronia

    Directory of Open Access Journals (Sweden)

    Zuiter Afnan

    2012-08-01

    Full Text Available. Background: Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs) and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Findings: Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. Conclusion: To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants.

  1. Toward a new paradigm of DNA writing using a massively parallel sequencing platform and degenerate oligonucleotide.

    Science.gov (United States)

    Hwang, Byungjin; Bang, Duhee

    2016-11-23

    All synthetic DNA materials require prior programming of the building blocks of the oligonucleotide sequences. The development of a programmable microarray platform provides cost-effective and time-efficient solutions in the field of data storage using DNA. However, the scalability of the synthesis is not on par with the accelerating sequencing capacity. Here, we report on a new paradigm of generating genetic material (writing) using a degenerate oligonucleotide and optomechanical retrieval method that leverages sequencing (reading) throughput to generate the desired number of oligonucleotides. As a proof of concept, we demonstrate the feasibility of our concept in digital information storage in DNA. In simulation, the ability to store data is expected to exponentially increase with increase in degenerate space. The present study highlights the major framework change in conventional DNA writing paradigm as a sequencer itself can become a potential source of making genetic materials.

  2. Designing universal primers for the isolation of DNA sequences encoding Proanthocyanidins biosynthetic enzymes in Crataegus aronia.

    Science.gov (United States)

    Zuiter, Afnan Saeid; Sawwan, Jammal; Al Abdallat, Ayed

    2012-08-10

    Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs) and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants.

  3. Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

    Energy Technology Data Exchange (ETDEWEB)

    Tuskan, Gerald A [ORNL; Gunter, Lee E [ORNL; DiFazio, Stephen P [West Virginia University

    2009-01-01

    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequence assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.

  4. Finite-size effects on long-range correlations: implications for analyzing DNA sequences

    Science.gov (United States)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Simons, M.; Stanley, H. E.

    1993-01-01

    We analyze the fluctuations in the correlation exponents obtained for noncoding DNA sequences. We find prominent sample-to-sample variations as well as variations within a single sample in the scaling exponent. To determine if these fluctuations may result from finite system size, we generate correlated random sequences of comparable length and study the fluctuations in this control system. We find that the DNA exponent fluctuations are consistent with those obtained from the control sequences having long-range power-law correlations. Finally, we compare our exponents for the DNA sequences with the exponents obtained from power-spectrum analysis and correlation-function techniques, and demonstrate that the original "DNA-walk" method is intrinsically more accurate due to reduced noise.
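    A "DNA walk" and its fluctuation function can be sketched as below. This is a minimal illustration under stated assumptions: the step rule (+1 for a pyrimidine, -1 for a purine) follows the commonly cited purine/pyrimidine convention, and the function names are hypothetical. A scaling exponent would then be estimated from the slope of log F(l) against log l, which is not shown here.

```python
import math

def dna_walk(seq):
    """Cumulative displacement: +1 for a pyrimidine (C/T), -1 for a purine (A/G).

    Characters other than A, C, G, T are skipped (an assumption of this sketch).
    """
    y, pos = [0], 0
    for base in seq.upper():
        if base in "CT":
            pos += 1
        elif base in "AG":
            pos -= 1
        else:
            continue
        y.append(pos)
    return y

def fluctuation(y, l):
    """Root-mean-square fluctuation F(l) of the walk over all windows of length l."""
    deltas = [y[i + l] - y[i] for i in range(len(y) - l)]
    if not deltas:
        raise ValueError("window length l exceeds the walk length")
    mean_d = sum(deltas) / len(deltas)
    mean_sq = sum(d * d for d in deltas) / len(deltas)
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(mean_sq - mean_d ** 2, 0.0))

walk = dna_walk("CCAA")
print(walk, fluctuation(walk, 1))
```

    For an uncorrelated sequence F(l) is expected to grow as the square root of l, so deviations from that slope are what a long-range correlation analysis looks for; as the abstract notes, finite sequence length alone can make the fitted exponent fluctuate.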

  5. Sequence-specific RNA Photocleavage by Single-stranded DNA in Presence of Riboflavin.

    Science.gov (United States)

    Zhao, Yongyun; Chen, Gangyi; Yuan, Yi; Li, Na; Dong, Juan; Huang, Xin; Cui, Xin; Tang, Zhuo

    2015-10-13

    Constant efforts have been made to develop new methods for sequence-specific RNA degradation, which could inhibit the expression of a targeted gene. Herein, by using an unmodified short DNA oligonucleotide for sequence recognition and an endogenous small molecule, vitamin B2 (riboflavin), as photosensitizer, we report a simple strategy to realize the sequence-specific photocleavage of targeted RNA. The DNA strand is complementary to the target sequence and forms a DNA/RNA duplex containing a G • U wobble in the middle. The cleavage reaction proceeds through an oxidative elimination mechanism at the nucleoside downstream of the U of the G • U wobble in the duplex to yield an unnatural RNA terminus, and the whole process is under tight control by using light as a switch, which means the cleavage can be carried out according to specific spatial and temporal requirements. The biocompatibility of this method makes the DNA strand in combination with riboflavin a promising molecular tool for RNA manipulation.

  6. The mammalian transcriptome and the function of non-coding DNA sequences

    National Research Council Canada - National Science Library

    Shabalina, Svetlana A; Spiridonov, Nikolay A

    2004-01-01

    .... With the completion of the human and mouse genomes and the accumulation of data on the mammalian transcriptome, the focus now shifts to non-coding DNA sequences, RNA-coding genes and their transcripts...

  7. Repeated DNA sequences in the microbat species Miniopterus schreibersi (Vespertilionidae; Chiroptera)

    National Research Council Canada - National Science Library

    Barragán, M J L; Martínez, S; Marchal, J A; Bullejos, M; Díaz de la Guardia, R; Sánchez, A

    2002-01-01

    .... Furthermore, Southern blot analysis demonstrated that these DNA sequences are absent in the genomes of four related microbat species and suggest that it could be specific to the M. schreibersi genome.

  8. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences

    National Research Council Canada - National Science Library

    Campanella, James J; Bitincka, Ledion; Smalley, John

    2003-01-01

    .... We have developed MatGAT (Matrix Global Alignment Tool), a simple, easy to use computer application that generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data...

  9. Phylogenetic analysis of flatfish (Order Pleuronectiformes) based on mitochondrial 16s rDNA sequences

    National Research Council Canada - National Science Library

    Pardo, Belén G; Machordom, Annie; Foresti, Fausto; Porto-Foresti, Fábio; Azevedo, Marisa F. C; Bañon, Rafael; Sánchez, Laura; Martínez, Paulino

    2005-01-01

    .... In the present study, the phylogenetic relationships of 30 flatfish species pertaining to seven different families were examined by sequence analysis of the first half of the 16S mitochondrial DNA gene...

  10. Identification and cloning of endogenous retroviral sequences present in human DNA.

    OpenAIRE

    Martin, M. A.; Bryan, T; Rasheed, S; Khan, A S

    1981-01-01

    Using nonstringent annealing conditions and a 2.75-kilobase segment of cloned African green monkey DNA that specifically hybridizes to the proviruses of AKR ecotropic murine leukemia virus (MuLV) and baboon endogenous virus (BaEV) as a probe, we detected related sequences in three different preparations of human brain DNA fragments. The blot-hybridization pattern obtained with cleaved human DNA was similar to that previously reported for the interaction of MuLV cDNA and cleaved mouse DNA and ...

  11. An alternative novel tool for DNA editing without target sequence limitation: the structure-guided nuclease.

    Science.gov (United States)

    Xu, Shu; Cao, Shasha; Zou, Bingjie; Yue, Yunyun; Gu, Chun; Chen, Xin; Wang, Pei; Dong, Xiaohua; Xiang, Zheng; Li, Kai; Zhu, Minsheng; Zhao, Qingshun; Zhou, Guohua

    2016-09-15

    Engineered endonucleases are a powerful tool for editing DNA. However, sequence preferences may limit their application. We engineer a structure-guided endonuclease (SGN) composed of flap endonuclease-1 (FEN-1), which recognizes the 3' flap structure, and the cleavage domain of Fok I (Fn1), which cleaves DNA strands. The SGN recognizes the target DNA on the basis of the 3' flap structure formed between the target and the guide DNA (gDNA) and cuts the target through dimerization of its Fn1 domains. Our results show that the SGN, guided by a pair of gDNAs, cleaves a transgenic reporter gene and endogenous genes in the zebrafish embryonic genome.

  12. Efficiency of ITS Sequences for DNA Barcoding in Passiflora (Passifloraceae)

    Directory of Open Access Journals (Sweden)

    Giovanna Câmara Giudicelli

    2015-04-01

    Full Text Available. DNA barcoding is a technique for discriminating and identifying species using short, variable, and standardized DNA regions. Here, we tested for the first time the performance of plastid and nuclear regions as DNA barcodes in Passiflora. This genus is highly variable, with more than 900 species of high ecological, commercial, and ornamental importance. We analyzed 1034 accessions of 222 species representing the four subgenera of Passiflora and evaluated the effectiveness of five plastid regions and three nuclear datasets currently employed as DNA barcodes in plants using barcoding gap, applied similarity-, and tree-based methods. The plastid regions were able to identify less than 45% of species, whereas the nuclear datasets were efficient for more than 50% using “best match” and “best close match” methods of TaxonDNA software. All subgenera presented higher interspecific pairwise distances and did not fully overlap with the intraspecific distance, and similarity-based methods showed better results than tree-based methods. The nuclear ribosomal internal transcribed spacer 1 (ITS1) region presented a higher discrimination power than the other datasets and also showed other desirable characteristics as a DNA barcode for this genus. Therefore, we suggest that this region should be used as a starting point to identify Passiflora species.

  13. Algorithms for mapping high-throughput DNA sequences

    DEFF Research Database (Denmark)

    Frellsen, Jes; Menzel, Peter; Krogh, Anders

    2014-01-01

    of data generation, new bioinformatics approaches have been developed to cope with the large amount of sequencing reads obtained in these experiments. In this chapter, we first introduce HTS technologies and their usage in molecular biology and discuss the problem of mapping sequencing reads...

  14. Solving the Curriculum Sequencing Problem with DNA Computing Approach

    Science.gov (United States)

    Debbah, Amina; Ben Ali, Yamina Mohamed

    2014-01-01

    In e-learning systems, a learning path is a sequence of learning materials linked to each other to help learners achieve their learning goals. As it is impossible to have the same learning path that suits different learners, the Curriculum Sequencing problem (CS) consists of the generation of a personalized learning path for each…

  15. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Science.gov (United States)

    Bertolini, Francesca; Ghionda, Marco Ciro; D'Alessandro, Enrico; Geraci, Claudia; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different pairs of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides, of which 29,109,688 had Q20 (87.43%), in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation in the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low levels of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.
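
    The read statistics quoted in this abstract can be checked with a few lines of arithmetic. The following is only a sketch: the three input numbers are taken from the abstract, while the variable names and the derived mean read length are illustrative additions, not values reported by the authors.

```python
# Figures quoted in the abstract above.
total_bases = 33_294_511   # called nucleotides
q20_bases = 29_109_688     # nucleotides with quality >= Q20
total_reads = 215_944      # sequencing reads


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of called bases that reached Q20.
q20_percent = percent(q20_bases, total_bases)

# Derived here for illustration only; not a figure stated in the abstract.
mean_read_length = total_bases / total_reads

print(f"Q20 bases: {q20_percent:.2f}%")
print(f"Mean called bases per read: {mean_read_length:.1f}")
```

    The Q20 share computed this way rounds to the 87.43% given in the abstract.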

  16. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different pairs of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides, of which 29,109,688 had Q20 (87.43%), in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation in the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low levels of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.

  17. Determining DNA Sequence Specificity of Natural and Artificial Transcription Factors by Cognate Site Identifier Analysis

    Science.gov (United States)

    Ozers, Mary S.; Warren, Christopher L.; Ansari, Aseem Z.

    Artificial transcription factors (ATFs) are designed to mimic natural transcription factors in the control of gene expression and are comprised of domains for DNA binding and gene regulation. ATF domains are modular, interchangeable, and can be composed of protein-based or nonpeptidic moieties, yielding DNA-interacting regulatory molecules that can either activate or inhibit transcription. Sequence-specific targeting is a key determinant in ATF activity, and DNA-binding domains such as natural zinc fingers and synthetic polyamides have emerged as useful DNA targeting molecules. Defining the comprehensive DNA binding specificity of these targeting molecules for accurate manipulations of the genome can be achieved using cognate site identifier DNA microarrays to explore the entire sequence space of binding sites. Design of ATFs that regulate gene expression with temporal control will generate important molecular tools to probe cell- and tissue-specific gene regulation and to function as potential therapeutic agents.

  18. Distribution of repetitive DNA sequences in eubacteria and application to fingerprinting of bacterial genomes.

    Science.gov (United States)

    Versalovic, J; Koeuth, T; Lupski, J R

    1991-12-25

    Dispersed repetitive DNA sequences have been described recently in eubacteria. To assess the distribution and evolutionary conservation of two distinct prokaryotic repetitive elements, consensus oligonucleotides were used in polymerase chain reaction [PCR] amplification and slot blot hybridization experiments with genomic DNA from diverse eubacterial species. Oligonucleotides matching Repetitive Extragenic Palindromic [REP] elements and Enterobacterial Repetitive Intergenic Consensus [ERIC] sequences were synthesized and tested as opposing PCR primers in the amplification of eubacterial genomic DNA. REP and ERIC consensus oligonucleotides produced clearly resolvable bands by agarose gel electrophoresis following PCR amplification. These band patterns provided unambiguous DNA fingerprints of different eubacterial species and strains. Both REP and ERIC probes hybridized preferentially to genomic DNA from Gram-negative enteric bacteria and related species. Widespread distribution of these repetitive DNA elements in the genomes of various microorganisms should enable rapid identification of bacterial species and strains, and be useful for the analysis of prokaryotic genomes.

  19. Draft versus finished sequence data for DNA and protein diagnostic signature development

    Energy Technology Data Exchange (ETDEWEB)

    Gardner, S N; Lam, M W; Smith, J R; Torres, C L; Slezak, T R

    2004-10-29

    Sequencing pathogen genomes is costly, demanding careful allocation of limited sequencing resources. We built a computational Sequencing Analysis Pipeline (SAP) to guide decisions regarding the amount of genomic sequencing necessary to develop high-quality diagnostic DNA and protein signatures. SAP uses simulations to estimate the number of target genomes and close phylogenetic relatives (near neighbors, or NNs) to sequence. We use SAP to assess whether draft data is sufficient or finished sequencing is required using Marburg and variola virus sequences. Simulations indicate that intermediate to high quality draft with error rates of 10^-3 to 10^-5 (approximately 8x coverage) of target organisms is suitable for DNA signature prediction. Low quality draft with error rates of approximately 1% (3x to 6x coverage) of target isolates is inadequate for DNA signature prediction, although low quality draft of NNs is sufficient, as long as the target genomes are of high quality. For protein signature prediction, sequencing errors in target genomes substantially reduce the detection of amino acid sequence conservation, even if the draft is of high quality. In summary, high quality draft of target and low quality draft of NNs appears to be a cost-effective investment for DNA signature prediction, but may lead to underestimation of predicted protein signatures.

  20. Properties of CENP-B and its target sequence in a satellite DNA

    Energy Technology Data Exchange (ETDEWEB)

    Masumoto, H.; Yoda, K.; Ikeno, M.; Kitagawa, K.; Muro, Y.; Okazaki, T. [Nagoya Univ. (Japan)]

    1993-12-31

    The centromere plays an essential role in the proper segregation of eukaryotic chromosomes at mitosis and meiosis. The centromere is the multifunctional domain of the chromosome responsible for sister chromatid association at the inner site and for microtubule attachment at the outer surface. It also acts as a mechanochemical motor for chromosome movement. These multiple centromere functions must, in some way, be directed by a cis-acting DNA sequence located in the centromere region. Indeed, specific centromere DNA sequences (CEN-DNA) were identified in two yeast species. In Saccharomyces cerevisiae, CEN-DNA consists of a sequence of roughly 125 bp composed of three conserved elements. In contrast, the centromere sequence of S. pombe is quite different from that of S. cerevisiae in length and sequence organization. The molecular bases for understanding the structure and function of the centromere/kinetochore domain have not been elucidated in higher eukaryotes. In mammalian cells, satellite DNAs are localized in the centromeric heterochromatin or heterochromatic arm. In all human chromosomes, the alpha satellite or alphoid DNA family, a highly repetitive DNA composed of fundamental monomer repeating units of about 170 bp, is found at the primary constriction. Its function, however, has not been established.

  1. Divergence between samples of chimpanzee and human DNA sequences is 5%, counting indels

    Science.gov (United States)

    Britten, Roy J.

    2002-01-01

    Five chimpanzee bacterial artificial chromosome (BAC) sequences (described in GenBank) have been compared with the best matching regions of the human genome sequence to assay the amount and kind of DNA divergence. The conclusion is that the old saw that we share 98.5% of our DNA sequence with chimpanzee is probably in error. For this sample, a better estimate would be that 95% of the base pairs are exactly shared between chimpanzee and human DNA. In this sample of 779 kb, the divergence due to base substitution is 1.4%, and there is an additional 3.4% difference due to the presence of indels. The gaps in alignment are present in about equal amounts in the chimp and human sequences. They occur equally in repeated and nonrepeated sequences, as detected by repeatmasker (http://ftp.genome.washington.edu/RM/RepeatMasker.html). PMID:12368483
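
    The headline figure follows from adding the two components reported above: 1.4% substitutions plus 3.4% indels gives 4.8%, or roughly 5%. The sketch below shows one simple way such components could be counted from a gapped pairwise alignment. The per-column counting rule, the function name and the toy sequences are assumptions made for illustration; the abstract does not state the exact procedure used.

```python
def divergence(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Return (substitution fraction, indel fraction) per alignment column.

    Both inputs must be the same length, with '-' marking a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    substitutions = 0
    gap_columns = 0
    for base_a, base_b in zip(aligned_a, aligned_b):
        if base_a == "-" or base_b == "-":
            gap_columns += 1          # column lost to an insertion/deletion
        elif base_a != base_b:
            substitutions += 1        # aligned bases that differ
    columns = len(aligned_a)
    return substitutions / columns, gap_columns / columns


# Toy alignment (hypothetical data): one substitution, two gap columns.
sub_frac, indel_frac = divergence("ACGTACGTAC", "ACGAAC--AC")
print(sub_frac, indel_frac)

# Components reported in the abstract, in percent.
reported_total = 1.4 + 3.4
print(f"Reported total divergence: {reported_total:.1f}%")
```

    Counting gap columns alongside mismatches is what raises the estimate well above a substitution-only figure.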

  2. Noninvasive Prenatal Paternity Testing (NIPAT) through Maternal Plasma DNA Sequencing: A Pilot Study.

    Science.gov (United States)

    Jiang, Haojun; Xie, Yifan; Li, Xuchao; Ge, Huijuan; Deng, Yongqiang; Mu, Haofang; Feng, Xiaoli; Yin, Lu; Du, Zhou; Chen, Fang; He, Nongyue

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have already been used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we developed a noninvasive prenatal paternity testing (NIPAT) method based on SNP typing with maternal plasma DNA sequencing. We evaluated the influencing factors (minor allele frequency (MAF), the total number of SNPs, fetal fraction and effective sequencing depth) and designed three different selective SNP panels in order to verify the performance in clinical cases. Combining targeted deep sequencing of selective SNPs and an informative bioinformatics pipeline, we calculated the combined paternity index (CPI) of 17 cases to determine paternity. Sequencing-based NIPAT results fully agreed with the invasive prenatal paternity test using an STR multiplex system. Our study proved that maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future.

  3. Live cell imaging of repetitive DNA sequences via GFP-tagged polydactyl zinc finger proteins

    Science.gov (United States)

    Lindhout, Beatrice I.; Fransz, Paul; Tessadori, Federico; Meckel, Tobias; Hooykaas, Paul J.J.; van der Zaal, Bert J.

    2007-01-01

    Several techniques are available to study chromosomes or chromosomal domains in nuclei of chemically fixed or living cells. Current methods to detect DNA sequences in vivo are limited to trans interactions between a DNA sequence and a transcription factor from natural systems. Here, we expand live cell imaging tools using a novel approach based on zinc finger-DNA recognition codes. We constructed several polydactyl zinc finger (PZF) DNA-binding domains aimed to recognize specific DNA sequences in Arabidopsis and mouse and fused these with GFP. Plants and mouse cells expressing PZF:GFP proteins were subsequently analyzed by confocal microscopy. For Arabidopsis, we designed a PZF:GFP protein aimed to specifically recognize a 9-bp sequence within centromeric 180-bp repeat and monitored centromeres in living roots. Similarly, in mouse cells a PZF:GFP protein was targeted to a 9-bp sequence in the major satellite repeat. Both PZF:GFP proteins localized in chromocenters which represent heterochromatin domains containing centromere and other tandem repeats. The number of PZF:GFP molecules per centromere in Arabidopsis, quantified with near single-molecule precision, approximated the number of expected binding sites. Our data demonstrate that live cell imaging of specific DNA sequences can be achieved with artificial zinc finger proteins in different organisms. PMID:17704126

  4. Realistic artificial DNA sequences as negative controls for computational genomics

    Science.gov (United States)

    Caballero, Juan; Smit, Arian F. A.; Hood, Leroy; Glusman, Gustavo

    2014-01-01

    A common practice in computational genomic analysis is to use a set of ‘background’ sequences as negative controls for evaluating the false-positive rates of prediction tools, such as gene identification programs and algorithms for detection of cis-regulatory elements. Such ‘background’ sequences are generally taken from regions of the genome presumed to be intergenic, or generated synthetically by ‘shuffling’ real sequences. This last method can lead to underestimation of false-positive rates. We developed a new method for generating artificial sequences that are modeled after real intergenic sequences in terms of composition, complexity and interspersed repeat content. These artificial sequences can serve as an inexhaustible source of high-quality negative controls. We used artificial sequences to evaluate the false-positive rates of a set of programs for detecting interspersed repeats, ab initio prediction of coding genes, transcribed regions and non-coding genes. We found that RepeatMasker is more accurate than PClouds, Augustus has the lowest false-positive rate of the coding gene prediction programs tested, and Infernal has a low false-positive rate for non-coding gene detection. A web service, source code and the models for human and many other species are freely available at http://repeatmasker.org/garlic/. PMID:24803667

  5. Realistic artificial DNA sequences as negative controls for computational genomics.

    Science.gov (United States)

    Caballero, Juan; Smit, Arian F A; Hood, Leroy; Glusman, Gustavo

    2014-07-01

    A common practice in computational genomic analysis is to use a set of 'background' sequences as negative controls for evaluating the false-positive rates of prediction tools, such as gene identification programs and algorithms for detection of cis-regulatory elements. Such 'background' sequences are generally taken from regions of the genome presumed to be intergenic, or generated synthetically by 'shuffling' real sequences. This last method can lead to underestimation of false-positive rates. We developed a new method for generating artificial sequences that are modeled after real intergenic sequences in terms of composition, complexity and interspersed repeat content. These artificial sequences can serve as an inexhaustible source of high-quality negative controls. We used artificial sequences to evaluate the false-positive rates of a set of programs for detecting interspersed repeats, ab initio prediction of coding genes, transcribed regions and non-coding genes. We found that RepeatMasker is more accurate than PClouds, Augustus has the lowest false-positive rate of the coding gene prediction programs tested, and Infernal has a low false-positive rate for non-coding gene detection. A web service, source code and the models for human and many other species are freely available at http://repeatmasker.org/garlic/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. DNA stretching and optimization of nucleobase recognition in enzymatic nanopore sequencing

    NARCIS (Netherlands)

    Stoddart, David; Franceschini, Lorenzo; Heron, Andrew; Bayley, Hagan; Maglia, Giovanni

    2015-01-01

    In nanopore sequencing, where single DNA strands are electrophoretically translocated through a nanopore and the resulting ionic signal is used to identify the four DNA bases, an enzyme has been used to ratchet the nucleic acid stepwise through the pore at a controlled speed. In this work, we

  7. Morphology and Small-Subunit Ribosomal DNA Sequence of Henneguya Adiposa (Myxosporea) From Ictalurus punctatus (Siluriformes)

    Science.gov (United States)

    The original description of Henneguya adiposa, a myxozoan parasitizing channel catfish Ictalurus punctatus, is supplemented with new data on spore morphology, including photomicrographs and line drawings, as well as 18S small-subunit (SSU) ribosomal DNA (rDNA) sequence. Elongate, translucent, linear...

  8. Mitochondrial DNA sequence variation in Finnish patients with matrilineal diabetes mellitus

    Directory of Open Access Journals (Sweden)

    Soini Heidi K

    2012-07-01

    Background: The genetic background of type 2 diabetes is complex involving contribution by both nuclear and mitochondrial genes. There is an excess of maternal inheritance in patients with type 2 diabetes and, furthermore, diabetes is a common symptom in patients with mutations in mitochondrial DNA (mtDNA). Polymorphisms in mtDNA have been reported to act as risk factors in several complex diseases. Findings: We examined the nucleotide variation in complete mtDNA sequences of 64 Finnish patients with matrilineal diabetes. We used conformation sensitive gel electrophoresis and sequencing to detect sequence variation. We analysed the pathogenic potential of nonsynonymous variants detected in the sequences and examined the role of the m.16189 T>C variant. Controls consisted of non-diabetic subjects ascertained in the same population. The frequency of mtDNA haplogroup V was 3-fold higher in patients with diabetes. Patients harboured many nonsynonymous mtDNA substitutions that were predicted to be possibly or probably damaging. Furthermore, a novel m.13762 T>G in MTND5 leading to p.Ser476Ala and several rare mtDNA variants were found. Haplogroup H1b harbouring m.16189 T>C and m.3010 G>A was found to be more frequent in patients with diabetes than in controls. Conclusions: Mildly deleterious nonsynonymous mtDNA variants and rare population-specific haplotypes constitute genetic risk factors for maternally inherited diabetes.

  9. Real sequence effects on the search dynamics of transcription factors on DNA

    DEFF Research Database (Denmark)

    Bauer, Maximilian; Rasmussen, Emil S.; Lomholt, Michael A.

    2015-01-01

    Recent experiments show that transcription factors (TFs) indeed use the facilitated diffusion mechanism to locate their target sequences on DNA in living bacteria cells: TFs alternate between sliding motion along DNA and relocation events through the cytoplasm. From simulations and theoretical an...

  10. Mitochondrial DNA sequence variation in Finnish patients with matrilineal diabetes mellitus.

    Science.gov (United States)

    Soini, Heidi K; Moilanen, Jukka S; Finnila, Saara; Majamaa, Kari

    2012-07-10

    The genetic background of type 2 diabetes is complex involving contribution by both nuclear and mitochondrial genes. There is an excess of maternal inheritance in patients with type 2 diabetes and, furthermore, diabetes is a common symptom in patients with mutations in mitochondrial DNA (mtDNA). Polymorphisms in mtDNA have been reported to act as risk factors in several complex diseases. We examined the nucleotide variation in complete mtDNA sequences of 64 Finnish patients with matrilineal diabetes. We used conformation sensitive gel electrophoresis and sequencing to detect sequence variation. We analysed the pathogenic potential of nonsynonymous variants detected in the sequences and examined the role of the m.16189 T>C variant. Controls consisted of non-diabetic subjects ascertained in the same population. The frequency of mtDNA haplogroup V was 3-fold higher in patients with diabetes. Patients harboured many nonsynonymous mtDNA substitutions that were predicted to be possibly or probably damaging. Furthermore, a novel m.13762 T>G in MTND5 leading to p.Ser476Ala and several rare mtDNA variants were found. Haplogroup H1b harbouring m.16189 T > C and m.3010 G > A was found to be more frequent in patients with diabetes than in controls. Mildly deleterious nonsynonymous mtDNA variants and rare population-specific haplotypes constitute genetic risk factors for maternally inherited diabetes.

  11. Cloning and sequence analysis of H. contortus HC58cDNA gene ...

    African Journals Online (AJOL)

    The complete coding sequence of Haemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp, ...

  12. High-Quality Exome Sequencing of Whole-Genome Amplified Neonatal Dried Blood Spot DNA

    DEFF Research Database (Denmark)

    Poulsen, Jesper Buchhave; Lescai, Francesco; Grove, Jakob

    2016-01-01

    be amplified to obtain micrograms of an otherwise limited resource, referred to as whole-genome amplified DNA (wgaDNA). Here we investigate the robustness of exome sequencing of wgaDNA of neonatal DBS samples. We conducted three pilot studies of seven, eight and seven subjects, respectively. For each subject...... from variant calls. No differences were observed substituting 2x3.2 with 2x1.6 mm discs, allowing for additional reduction of sample material in future projects....

  13. Effective DNA extraction method for fragment analysis using capillary sequencer of the kelp, Saccharina

    OpenAIRE

    Maeda, Takashi; Kawai, Tadashi; NAKAOKA, MASAHIRO; Yotsukura, Norishige

    2013-01-01

    DNA fragment analysis can become an effective tool for studying genetic differences not only between species but also between individuals of Saccharina kelp, for which little genetic diversity has been reported. Here, methods for extracting DNA suitable for analysis with a capillary sequencer were examined in Saccharina japonica var. diabolica, which is rich in polysaccharides. When AFLP was performed using genomic DNA extracted by seven different methods: (1) commercial kit, (2) ...

  14. Protocols for 16S rDNA Array Analyses of Microbial Communities by Sequence-Specific Labeling of DNA Probes

    Directory of Open Access Journals (Sweden)

    Knut Rudi

    2003-01-01

    Analyses of complex microbial communities are becoming increasingly important. Bottlenecks in these analyses, however, are the tools to actually describe the biodiversity. Novel protocols for DNA array-based analyses of microbial communities are presented. In these protocols, the specificity obtained by sequence-specific labeling of DNA probes is combined with the possibility of detecting several different probes simultaneously by DNA array hybridization. The gene encoding 16S ribosomal RNA was chosen as the target in these analyses. This gene contains both universally conserved regions and regions with relatively high variability. The universally conserved regions are used for PCR amplification primers, while the variable regions are used for the specific probes. Protocols are presented for DNA purification, probe construction, probe labeling, and DNA array hybridizations.

  15. Ray Wu as Fifth Business: Deconstructing collective memory in the history of DNA sequencing.

    Science.gov (United States)

    Onaga, Lisa A

    2014-06-01

    The concept of 'Fifth Business' is used to analyze a minority standpoint and bring serious attention to the role of scientists who play a galvanizing role in a science but for multiple reasons appear less prominently in more common accounts of any particular development. Biochemist Ray Wu (1928-2008) published a DNA sequencing experiment in March 1970 using DNA polymerase catalysis and specific nucleotide labeling, both of which are foundational to general sequencing methods today. The scant mention of Wu's work in textbooks, research articles, and other accounts of DNA sequencing calls into question how scientific collective memory forms. This alternative history seeks to understand why a key figure in nucleic acid sequence analysis has remained less visibly connected or peripheral to solidifying narratives about the history of DNA sequencing. The study resists predictable dismissals of Wu's work in order to seriously examine the formation of his nucleic acid sequence analysis research program and how he shared his knowledge of sequencing during a period of rapid advancement in the field. An analysis of Wu's work on sequencing the cohesive ends of lambda bacteriophage in the 1960s and 1970s exemplifies how a variety of individuals and groups attempted to develop protocols for sequencing the order of nucleotide base pairs comprising DNA. This historical examination of the sociality of scientific research suggests a way to understand how Wu and others contributed to the very collective memory of DNA sequencing that Wu eventually tried to repair. The study of Wu, who was a Chinese immigrant to the United States, provides a foundation for further critical scholarship on the heterogeneous histories of Asian American bioscientists, the sociality of their scientific works, and how the resulting knowledge produced is preserved, if not evenly, in a scientific field's collective memory. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data.

    Science.gov (United States)

    Jun, Goo; Flickinger, Matthew; Hetrick, Kurt N; Romm, Jane M; Doheny, Kimberly F; Abecasis, Gonçalo R; Boehnke, Michael; Kang, Hyun Min

    2012-11-02

    DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads alone, and (3) array-based genotype data alone. Analysis of sequencing reads allows contamination detection after sequence data is generated but prior to variant calling; analysis of array-based genotype data allows contamination detection prior to generation of costly sequence data. Through a combination of analysis of in silico and experimentally contaminated samples, we show that our methods can reliably detect and estimate levels of contamination as low as 1%. We evaluate the impact of DNA contamination on genotype accuracy and propose effective strategies to screen for and prevent DNA contamination in sequencing studies. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  17. Massively parallel multiplex DNA sequencing for specimen identification using an Illumina MiSeq platform.

    Science.gov (United States)

    Shokralla, Shadi; Porter, Teresita M; Gibson, Joel F; Dobosz, Rafal; Janzen, Daniel H; Hallwachs, Winnie; Golding, G Brian; Hajibabaei, Mehrdad

    2015-04-17

    Genetic information is a valuable component of biosystematics, especially specimen identification through the use of species-specific DNA barcodes. Although many genomics applications have shifted to High-Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies, sample identification (e.g., via DNA barcoding) is still most often done with Sanger sequencing. Here, we present a scalable double dual-indexing approach using an Illumina MiSeq platform to sequence DNA barcode markers. We achieved 97.3% success by using half of an Illumina MiSeq flowcell to obtain 658 base pairs of the cytochrome c oxidase I DNA barcode in 1,010 specimens from eleven orders of arthropods. Our approach recovers a greater proportion of DNA barcode sequences from individuals than does conventional Sanger sequencing, while at the same time reducing both per-specimen costs and labor time by nearly 80%. In addition, the use of HTS allows the recovery of multiple sequences per specimen, for deeper analysis of genetic variation in target gene regions.

  18. Sequence-specific nucleic acid mobility using a reversible block copolymer gel matrix and DNA amphiphiles (lipid-DNA) in capillary and microfluidic electrophoretic separations

    NARCIS (Netherlands)

    Wagler, Patrick; Minero, Gabriel Antonio S.; Tangen, Uwe; de Vries, Jan Willem; Prusty, Deepak; Kwak, Minseok; Herrmann, Andreas; McCaskill, John S.

    2015-01-01

    Reversible noncovalent but sequence-dependent attachment of DNA to gels is shown to allow programmable mobility processing of DNA populations. The covalent attachment of DNA oligomers to polyacrylamide gels using acrydite-modified oligonucleotides has enabled sequence-specific mobility assays for

  19. Multilocus sequence typing of Staphylococcus aureus with DNA array technology

    NARCIS (Netherlands)

    W.B. van Leeuwen (Willem); C. Jay (Corinne); S.V. Snijders (Susan); N. Durin (Nathalia); B. Lacroix (Bruno); H.A. Verbrugh (Henri); M.C. Enright (Mark); A. Troesch (Alain); A.F. van Belkum (Alex)

    2003-01-01

    A newly developed oligonucleotide array suited for multilocus sequence typing (MLST) of Staphylococcus aureus strains was analyzed with two strain collections in a two-center study. MLST allele identification for the first strain collection fully agreed with

  20. DNA methylation and transcription in HERV (K, W, E) and LINE sequences remain unchanged upon foreign DNA insertions.

    Science.gov (United States)

    Weber, Stefanie; Jung, Susan; Doerfler, Walter

    2016-02-01

    DNA methylation and transcriptional profiles were determined in the regulatory sequences of the human endogenous retroviral (HERV-K, -W, -E) and LINE-1.2 elements and were compared between non-transgenomic and plasmid-transgenomic cells. DNA methylation profiles in the HERV (K, W, E) and LINE sequences were determined by bisulfite genomic sequencing. The transcription of these genome segments was assessed by quantitative real-time PCR. In HERV-K, HERV-W and LINE-1.2 the levels of DNA methylation ranged between 75 and 98%, while in HERV-E they were around 60%. Nevertheless, the HERV and LINE-1.2 sequences were actively transcribed. No differences were found in comparisons of HERV and LINE-1.2 CpG methylation and transcription patterns between non-transgenomic and plasmid-transgenomic HCT116 cells. The insertion of a 5.6 kbp plasmid into the HCT116 genome had no effect on the HERV and LINE-1.2 methylation and transcription profiles, although other parts of the HCT116 genome had shown marked changes. These repetitive sequences are transcribed, probably because the large number of HERV and LINE-1.2 elements harbor copies with non- or hypo-methylated long terminal repeat sequences.

  1. Algorithm of detecting structural variations in DNA sequences

    Science.gov (United States)

    Nałecz-Charkiewicz, Katarzyna; Nowak, Robert

    2014-11-01

    Whole genome sequencing makes it possible to use the longest common subsequence algorithm to detect genetic structural variations. We propose searching for the positions of short unique fragments (genetic markers) to achieve acceptable time and space complexity. The markers are generated by algorithms that search the genetic sequence or its Fourier transform. The presented methods are checked on structural variations generated in silico on bacterial genomes, giving comparable or better results than other solutions.
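
    The idea of locating short unique fragments can be illustrated with a minimal sketch. It is not the authors' algorithm: it simply assumes that a marker is a k-mer occurring exactly once in a sequence, and the function names `unique_kmers` and `marker_shifts` are hypothetical. Comparing marker positions between a reference and a sample hints at where insertions or deletions shift the sequence.

```python
from collections import Counter


def unique_kmers(seq, k):
    """Return {kmer: position} for k-mers that occur exactly once in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {
        seq[i:i + k]: i
        for i in range(len(seq) - k + 1)
        if counts[seq[i:i + k]] == 1
    }


def marker_shifts(reference, sample, k):
    """List (marker, ref_pos, sample_pos) for markers unique in both sequences.

    Illustrative assumption: a change in (sample_pos - ref_pos) between
    neighbouring markers suggests an insertion or deletion between them.
    """
    ref = unique_kmers(reference, k)
    smp = unique_kmers(sample, k)
    shared = sorted((pos, marker) for marker, pos in ref.items() if marker in smp)
    return [(marker, pos, smp[marker]) for pos, marker in shared]


if __name__ == "__main__":
    # Toy data: the sample carries two extra bases at its start.
    for marker, ref_pos, smp_pos in marker_shifts("ACGTTGCA", "GGACGTTGCA", 4):
        print(marker, ref_pos, smp_pos, smp_pos - ref_pos)
```

    Storing only marker positions keeps memory proportional to the number of markers rather than to the product of the two genome lengths, which is the cost a full longest-common-subsequence table would incur.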

  2. DNA sequence and analysis of human chromosome 9

    OpenAIRE

    Humphray, S. J.; Oliver, K.; Hunt, A. R.; Plumb, R. W.; Loveland, J. E.; Howe, K. L.; Andrews, T. D.; Searle, S.; Hunt, S. E.; Scott, C. E.; Jones, M. C.; Ainscough, R.; Almeida, J. P.; Ambrose, K. D.; Ashwell, R. I. S.

    2004-01-01

    Chromosome 9 is highly structurally polymorphic. It contains the largest autosomal block of heterochromatin, which is heteromorphic in 6–8% of humans, whereas pericentric inversions occur in more than 1% of the population. The finished euchromatic sequence of chromosome 9 comprises 109,044,351 base pairs and represents >99.6% of the region. Analysis of the sequence reveals many intra- and interchromosomal duplications, including segmental duplications adjacent to both the centromere and the l...

  3. DNA recognition by F factor TraI36: highly sequence-specific binding of single-stranded DNA.

    Science.gov (United States)

    Stern, J C; Schildbach, J F

    2001-09-25

    The TraI protein has two essential roles in transfer of conjugative plasmid F Factor. As part of a complex of DNA-binding proteins, TraI introduces a site- and strand-specific nick at the plasmid origin of transfer (oriT), cutting the DNA strand that is transferred to the recipient cell. TraI also acts as a helicase, presumably unwinding the plasmid strands prior to transfer. As an essential feature of its nicking activity, TraI is capable of binding and cleaving single-stranded DNA oligonucleotides containing an oriT sequence. The specificity of TraI DNA recognition was examined by measuring the binding of oriT oligonucleotide variants to TraI36, a 36-kD amino-terminal domain of TraI that retains the sequence-specific nucleolytic activity. TraI36 recognition is highly sequence-specific for an 11-base region of oriT, with single base changes reducing affinity by as much as 8000-fold. The binding data correlate with plasmid mobilization efficiencies: plasmids containing sequences bound with lower affinities by TraI36 are transferred between cells at reduced frequencies. In addition to the requirement for high affinity binding to oriT, efficient in vitro nicking and in vivo plasmid mobilization requires a pyrimidine immediately 5' of the nick site. The high sequence specificity of TraI single-stranded DNA recognition suggests that despite its recognition of single-stranded DNA, TraI is capable of playing a major regulatory role in initiation and/or termination of plasmid transfer.

  4. Tomato protoplast DNA transformation : physical linkage and recombination of exogenous DNA sequences

    NARCIS (Netherlands)

    Jongsma, Maarten; Koornneef, Maarten; Zabel, Pim; Hille, Jacques

    1987-01-01

    Tomato protoplasts have been transformed with plasmid DNAs containing a chimeric kanamycin resistance gene and putative tomato origins of replication. A calcium phosphate-DNA mediated transformation procedure was employed in combination with either polyethylene glycol or polyvinyl alcohol. There

  5. Is photocleavage of DNA by YOYO-1 using a synchrotron radiation light source sequence dependent?

    Science.gov (United States)

    Gilroy, Emma L; Hoffmann, Søren Vrønning; Jones, Nykola C; Rodger, Alison

    2011-10-01

    The photocleavage of double-stranded and single-stranded DNA by the fluorescent dye YOYO-1 was investigated in real time by using the synchrotron radiation light source ASTRID (ISA, Denmark) both to initiate the reaction and to monitor its progress using Couette flow linear dichroism (LD) throughout the irradiation period. The dependence of LD signals on DNA sequences and on time in the intense light beam was explored and quantified for single-stranded poly(dA), poly[(dA-dT)(2)], calf thymus DNA (ctDNA) and Micrococcus luteus DNA (mlDNA). The DNA and ligand regions of the spectrum showed different LD kinetic behaviors, and there was significant sequence dependence of the kinetics. However, in contrast to expectations from the literature, we found that poly(dA), mlDNA, low salt ctDNA and low salt poly[(dA-dT)(2)] all had significant populations of groove-bound YOYO. It seems that this mode was predominantly responsible for the catalysis of DNA cleavage. In homopolymeric DNAs, intercalated YOYO was unable to cleave DNA. In mixed-sequence DNAs the data suggest that YOYO in some but not all intercalated binding sites can cause cleavage. It is also likely that cleavage occurs at transient single-stranded regions. The reaction rates for a 100 mA beam current of 0.5-μW power varied from 0.6 h(-1) for single-stranded poly(dA) to essentially zero for low salt poly[(dG-dC)(2)] and high salt poly[(dA-dT)(2)]. At the conclusion of the experiments with each kind of DNA, uncleaved DNA with intercalated YOYO remained.

  6. Efficient and specific internal cleavage of a retroviral palindromic DNA sequence by tetrameric HIV-1 integrase.

    Directory of Open Access Journals (Sweden)

    Olivier Delelis

    BACKGROUND: HIV-1 integrase (IN) catalyses the retroviral integration process, removing two nucleotides from each long terminal repeat and inserting the processed viral DNA into the target DNA. It is widely assumed that the strand transfer step has no sequence specificity. However, recently, it has been reported by several groups that integration sites display a preference for palindromic sequences, suggesting that a symmetry in the target DNA may stabilise the tetrameric organisation of IN in the synaptic complex. METHODOLOGY/PRINCIPAL FINDINGS: We assessed the ability of several palindrome-containing sequences to organise tetrameric IN and investigated the ability of IN to catalyse DNA cleavage at internal positions. Only one palindromic sequence was successfully cleaved by IN. Interestingly, this symmetrical sequence corresponded to the 2-LTR junction of retroviral DNA circles, a palindrome similar but not identical to the consensus sequence found at integration sites. This reaction depended strictly on the cognate retroviral sequence of IN and required a full-length wild-type IN. Furthermore, the oligomeric state of IN responsible for this cleavage differed from that involved in the 3'-processing reaction. Palindromic cleavage strictly required the tetrameric form, whereas 3'-processing was efficiently catalysed by a dimer. CONCLUSIONS/SIGNIFICANCE: Our findings suggest that the restriction-like cleavage of palindromic sequences may be a general physiological activity of retroviral INs and that IN tetramerisation is strongly favoured by DNA symmetry, either at the target site for the concerted integration or when the DNA contains the 2-LTR junction in the case of the palindromic internal cleavage.

  7. Genetic alterations of hepatocellular carcinoma by random amplified polymorphic DNA analysis and cloning sequencing of tumor differential DNA fragment

    Science.gov (United States)

    Xian, Zhi-Hong; Cong, Wen-Ming; Zhang, Shu-Hui; Wu, Meng-Chao

    2005-01-01

    AIM: To study the genetic alterations and their association with clinicopathological characteristics of hepatocellular carcinoma (HCC), and to find the tumor related DNA fragments. METHODS: DNA isolated from tumors and corresponding noncancerous liver tissues of 56 HCC patients was amplified by random amplified polymorphic DNA (RAPD) with 10 random 10-mer arbitrary primers. The RAPD bands showing obvious differences in tumor tissue DNA corresponding to that of normal tissue were separated, purified, cloned and sequenced. DNA sequences were analyzed and compared with GenBank data. RESULTS: A total of 56 cases of HCC were demonstrated to have genetic alterations, which were detected by at least one primer. The detectability of genetic alterations ranged from 20% to 70% in each case, and 17.9% to 50% in each primer. Serum HBV infection, tumor size, histological grade, tumor capsule, as well as tumor intrahepatic metastasis, might be correlated with genetic alterations on certain primers. A band of amplified fragments of about 480 bp, with higher intensity in tumor DNA than in normal DNA, could be seen in 27 of 56 tumor samples using primer 4. Sequence analysis of these fragments showed 91% homology with Homo sapiens double homeobox protein DUX10 gene. CONCLUSION: Genetic alterations are a frequent event in HCC, and tumor related DNA fragments have been found in this study, which may be associated with hepatocarcinogenesis. RAPD is an effective method for the identification and analysis of genetic alterations in HCC, and may provide new information for further evaluating the molecular mechanism of hepatocarcinogenesis. PMID:15996039

  8. Automated pneumococcal MLST using liquid-handling robotics and a capillary DNA sequencer.

    Science.gov (United States)

    Jefferies, Johanna; Clarke, Stuart C; Diggle, Mathew A; Smith, Andrew; Dowson, Chris; Mitchell, Tim

    2003-07-01

    Multilocus sequence typing (MLST) is used by the Scottish Meningococcus and Pneumococcus Reference Laboratory (SMPRL) as a routine method for the characterization of certain bacterial pathogens. The SMPRL recently started performing MLST on strains of Streptococcus pneumoniae, and here we describe a fully automated method for MLST using a 96-well-format liquid-handling robot and a 96-capillary automated DNA sequencer.

  9. Noninvasive prenatal diagnosis of fetal trisomy 18 and trisomy 13 by maternal plasma DNA sequencing.

    NARCIS (Netherlands)

    Chen, E.Z.; Chiu, R.W.; Sun, H; Akolekar, R.; Chan, K.C.; Leung, T.Y.; Jiang, P.; Zheng, Y.W.; Lun, F.M.; Chan, L.Y.; Jin, Y.; Go, A.T.; Lau, E.T; To, W.W.; Leung, W.C.; Tang, R.Y.; Au-Yeung, S.K.; Lam, H.; Kung, Y.Y.; Zhang, X.; Vugt, J.M.G. van; Minekawa, R.; Tang, M.H.; Wang, J.; Oudejans, C.B.; Lau, T.K.; Nicolaides, K.H.; Lo, Y.M.

    2011-01-01

    Massively parallel sequencing of DNA molecules in the plasma of pregnant women has been shown to allow accurate and noninvasive prenatal detection of fetal trisomy 21. However, whether the sequencing approach is as accurate for the noninvasive prenatal diagnosis of trisomy 13 and 18 is unclear due

  10. Tandemly repeated sequence in 5'end of mtDNA control region of ...

    African Journals Online (AJOL)

    Extensive length variability was observed in 5' end sequence of the mitochondrial DNA control region of the Japanese Spanish mackerel (Scomberomorus niphonius). This length variability was due to the presence of varying numbers of a 56-bp tandemly repeated sequence and a 46-bp insertion/deletion (indel).
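
    Length variation caused by a varying copy number of a repeat unit can be made concrete with a small sketch. It assumes exact, back-to-back copies of a known unit, which real control-region repeats need not be; the function name `count_tandem_copies` and the toy sequence are hypothetical and do not come from the study.

```python
def count_tandem_copies(seq, unit):
    """Return (start, copies) for the longest run of back-to-back exact copies of unit.

    Returns (-1, 0) when the unit does not occur at all.
    """
    best_start, best_copies = -1, 0
    i = seq.find(unit)
    while i != -1:
        copies = 0
        j = i
        # Walk forward one unit at a time while the next copy follows directly.
        while seq.startswith(unit, j):
            copies += 1
            j += len(unit)
        if copies > best_copies:
            best_start, best_copies = i, copies
        i = seq.find(unit, i + 1)
    return best_start, best_copies


if __name__ == "__main__":
    # Toy example with a 3-bp unit standing in for the 56-bp repeat.
    print(count_tandem_copies("TTACGACGACGTT", "ACG"))
```

    With a 56-bp unit, each additional copy found this way would account for 56 bp of length difference between two sequences.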

  11. 2D-dynamic representation of DNA sequences as a graphical tool in bioinformatics

    Science.gov (United States)

    Bielińska-Wąż, D.; Wąż, P.

    2016-10-01

    2D-dynamic representation of DNA sequences is briefly reviewed. Some new examples of 2D-dynamic graphs which are the graphical tool of the method are shown. Using the examples of the complete genome sequences of the Zika virus it is shown that the present method can be applied for the study of the evolution of viral genomes.

  12. Open source tools to exploit DNA sequence data from livestock species

    Science.gov (United States)

    Next-Generation Sequencing (NGS) is a recent technological development that allows researchers to rapidly determine the DNA sequence of an individual. The decrease in cost of NGS has brought the technology into the realm of practical applications in livestock genomics, where it can be used to genera...

  13. Discovery and genotyping of existing and induced DNA sequence variation in potato

    NARCIS (Netherlands)

    Uitdewilligen, J.G.A.M.L.

    2012-01-01

    In this thesis natural and induced DNA sequence diversity in potato (Solanum tuberosum) for use in marker-trait analysis and potato breeding is assessed. The study addresses the challenges of reliable, high-throughput identification and genotyping of sequence variants in existing tetraploid potato

  14. Sequence-specific protection of duplex DNA against restriction and methylation enzymes by pseudocomplementary PNAs

    DEFF Research Database (Denmark)

    Izvolsky, K I; Demidov, V V; Nielsen, P E

    2000-01-01

    I restriction endonuclease and dam methylase. The pcPNA-assisted protection against enzymatic methylation is more efficient when the PNA-binding site embodies the methylase-recognition site rather than overlaps it. We conclude that pcPNAs may provide the robust tools allowing to sequence-specifically manipulate...... DNA duplexes in a virtually sequence-unrestricted manner....

  15. Sequence of a DNA probe specific for Anopheles quadrimaculatus species A (Diptera: Culicidae).

    Science.gov (United States)

    Johnson, D W; Cockburn, A F; Seawright, J A

    1993-09-01

    The nucleotide sequence was determined for a portion of a 12-kb genomic DNA clone specific for Anopheles quadrimaculatus species A. Four short, internally repeated sequences were identified. Synthetic oligonucleotide probes were prepared based on these four repeats. The oligonucleotides are highly specific and can be reliably used to separate individuals of An. quadrimaculatus species A from members of other species of the complex.

  16. Coinfection of Fusobacterium nucleatum and Actinomyces israelii in Mastoiditis Diagnosed by Next-Generation DNA Sequencing

    Science.gov (United States)

    Hoogestraat, Daniel R.; Abbott, April N.; SenGupta, Dhruba J.; Cummings, Lisa A.; Butler-Wu, Susan M.; Stephens, Karen; Cookson, Brad T.; Hoffman, Noah G.

    2014-01-01

    Some bacterial infections involve potentially complex mixtures of species that can now be distinguished using next-generation DNA sequencing. We present a case of mastoiditis where Gram stain, culture, and molecular diagnosis were nondiagnostic or discrepant. Next-generation sequencing implicated coinfection of Fusobacterium nucleatum and Actinomyces israelii, resolving these diagnostic discrepancies. PMID:24574281

  17. Cloning, sequencing and expression of a novel xylanase cDNA from ...

    African Journals Online (AJOL)

    A strain SH 2016, capable of producing xylanase, was isolated and identified as Aspergillus awamori, based on its physiological and biochemical characteristics as well as its ITS rDNA gene sequence analysis. A xylanase gene of 591 bp was cloned from this newly isolated A. awamori and the ORF sequence predicted a ...

  18. Critical bending torque of DNA is a materials parameter independent of local base sequence.

    Science.gov (United States)

    Wang, Juan; Qu, Hao; Zocchi, Giovanni

    2013-09-01

    Short double-stranded DNA molecules exhibit a softening transition under large bending which is quantitatively described by a critical bending torque τ_{c} at which the molecule develops a kink. Through equilibrium measurements of the elastic energy of short (∼10 nm), highly stressed DNA molecules with a nick at the center we determine τ_{c} for different sequences around the nick. We find that τ_{c} is a robust materials parameter essentially independent of sequence. The measurements also show that, at least for nicked DNA, the local structure at the origin of the softening transition is not a single-stranded "bubble."

  19. WEB-THERMODYN: sequence analysis software for profiling DNA helical stability

    OpenAIRE

    Huang, Yanlin; Kowalski, David

    2003-01-01

    WEB-THERMODYN analyzes DNA sequences and computes the DNA helical stability, i.e. the free energy required to unwind and separate the strands of the double helix. A helical stability profile across a selected DNA region or the entire sequence is generated by sliding-window analysis. WEB-THERMODYN can predict sites of low helical stability present at regulatory regions for transcription and replication and can be used to test the influence of mutations. The program can be accessed at: http://w...
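
    Sliding-window profiling can be sketched as follows. This is not the WEB-THERMODYN implementation: the per-base costs `at_cost` and `gc_cost` are placeholder values in arbitrary units rather than measured thermodynamic parameters, and the function names are hypothetical. The sketch only shows how a window moved along a sequence yields a stability profile whose minimum marks a candidate easily unwound region.

```python
def window_profile(seq, window, step=1, at_cost=1.0, gc_cost=2.0):
    """Sum a per-base unwinding cost over sliding windows.

    at_cost and gc_cost are illustrative placeholders, not real free energies.
    Only A, C, G and T are handled; any other symbol raises KeyError.
    """
    cost = {"A": at_cost, "T": at_cost, "G": gc_cost, "C": gc_cost}
    values = [cost[base] for base in seq.upper()]
    profile = []
    for start in range(0, len(values) - window + 1, step):
        profile.append((start, sum(values[start:start + window])))
    return profile


def least_stable(profile):
    """Return the (start, score) pair with the lowest score (first one on ties)."""
    return min(profile, key=lambda item: item[1])


if __name__ == "__main__":
    profile = window_profile("GGGCATATATGCCC", window=4)
    print(least_stable(profile))
```

    A more faithful version would replace the per-base costs with nearest-neighbor free energies for each dinucleotide step, which this sketch deliberately leaves out.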

  20. DNA fingerprinting based on simple sequence repeat (SSR ...

    African Journals Online (AJOL)

    New varieties of sugarcane are protected using morphological descriptors, which have limitations in identifying morphologically similar cultivars. Development of a reliable DNA fingerprint system for identification of new varieties would contribute greatly to the breeding of these species. Microsatellite markers are tools with ...

  1. Random amplified polymorphic DNA (RAPD) and simple sequence ...

    African Journals Online (AJOL)

    Administrator

    2011-06-06

    Jun 6, 2011 ... of polymorphic bands, average number of alleles per locus, effective .... Materials for DNA isolation were obtained from a set of 5 to 7 plants ..... Among factors that might have contributed to ... Inheritance of RAPDs in F1 hybrids of corn. ... by using cluster analysis of RAPD molecular marker, phenotype and.

  2. Sequence Dependent Interactions Between DNA and Single-Walled Carbon Nanotubes

    Science.gov (United States)

    Roxbury, Daniel

    It is known that single-stranded DNA adopts a helical wrap around a single-walled carbon nanotube (SWCNT), forming a water-dispersible hybrid molecule. The ability to sort mixtures of SWCNTs based on chirality (electronic species) has recently been demonstrated using special short DNA sequences that recognize certain matching SWCNTs of specific chirality. This thesis investigates the intricacies of DNA-SWCNT sequence-specific interactions through both experimental and molecular simulation studies. The DNA-SWCNT binding strengths were experimentally quantified by studying the kinetics of DNA replacement by a surfactant on the surface of particular SWCNTs. Recognition ability was found to correlate strongly with measured binding strength, e.g. DNA sequence (TAT)4 was found to bind 20 times stronger to the (6,5)-SWCNT than sequence (TAT)4T. Next, using replica exchange molecular dynamics (REMD) simulations, equilibrium structures formed by (a) single-strands and (b) multiple-strands of 12-mer oligonucleotides adsorbed on various SWCNTs were explored. A number of structural motifs were discovered in which the DNA strand wraps around the SWCNT and 'stitches' to itself via hydrogen bonding. Great variability among equilibrium structures was observed and shown to be directly influenced by DNA sequence and SWCNT type. For example, the (6,5)-SWCNT DNA recognition sequence, (TAT)4, was found to wrap in a tight single-stranded right-handed helical conformation. In contrast, DNA sequence T12 forms a beta-barrel left-handed structure on the same SWCNT. These are the first theoretical indications that DNA-based SWCNT selectivity can arise on a molecular level. In a biomedical collaboration with the Mayo Clinic, pathways for DNA-SWCNT internalization into healthy human endothelial cells were explored. Through absorbance spectroscopy, TEM imaging, and confocal fluorescence microscopy, we showed that intracellular concentrations of SWCNTs far exceeded those of the incubation

  3. DNA sequence conservation between the Bacillus anthracis pXO2 plasmid and genomic sequence from closely related bacteria

    Directory of Open Access Journals (Sweden)

    Sabin Robert

    2002-12-01

    Background: Complete sequencing and annotation of the 96.2 kb Bacillus anthracis plasmid, pXO2, predicted 85 open reading frames (ORFs). Bacillus cereus and Bacillus thuringiensis isolates that ranged in genomic similarity to B. anthracis, as determined by amplified fragment length polymorphism (AFLP) analysis, were examined by PCR for the presence of sequences similar to 47 pXO2 ORFs. Results: The two most distantly related isolates examined, B. thuringiensis 33679 and B. thuringiensis AWO6, produced the greatest number of ORF sequences similar to pXO2; 10 detected in 33679 and 16 in AWO6. No more than two of the pXO2 ORFs were detected in any one of the remaining isolates. Dot-blot DNA hybridizations between pXO2 ORF fragments and total genomic DNA from AWO6 were consistent with the PCR assay results for this isolate and also revealed nine additional ORFs shared between these two bacteria. Sequences similar to the B. anthracis cap genes or their regulator, acpA, were not detected among any of the examined isolates. Conclusions: The presence of pXO2 sequences in the other Bacillus isolates did not correlate with genomic relatedness established by AFLP analysis. The presence of pXO2 ORF sequences in other Bacillus species suggests the possibility that certain pXO2 plasmid gene functions may also be present in other closely related bacteria.

  4. High-throughput sequencing of nematode communities from total soil DNA extractions

    DEFF Research Database (Denmark)

    Sapkota, Rumakanta; Nicolaisen, Mogens

    2015-01-01

    nematodes without the need for enrichment was developed. Using this strategy on DNA templates from a set of 22 agricultural soils, we obtained 64.4% sequences of nematode origin in total, whereas the remaining sequences were almost entirely from other metazoans. The nematode sequences were derived from...... a broad taxonomic range and most sequences were from nematode taxa that have previously been found to be abundant in soil such as Tylenchida, Rhabditida, Dorylaimida, Triplonchida and Araeolaimida. Conclusions: Our amplification and sequencing strategy for assessing nematode diversity was able to collect...

  5. Complete genome sequence of the mitochondrial DNA of the river lamprey, Lethenteron japonicum.

    Science.gov (United States)

    Kawai, Yuri L; Yura, Kei; Shindo, Miyuki; Kusakabe, Rie; Hayashi, Keiko; Hata, Kenichiro; Nakabayashi, Kazuhiko; Okamura, Kohji

    2015-01-01

    Lampreys are eel-like jawless fishes evolutionarily positioned between invertebrates and vertebrates, and have been used as model organisms to explore vertebrate evolution. In this study we determined the complete genome sequence of the mitochondrial DNA of the Japanese river lamprey, Lethenteron japonicum, using next-generation sequencers. The sequence was 16,272 bp in length. The gene content and order were identical to those of the sea lamprey, Petromyzon marinus, which has been the reference among lamprey species. However, the sequence similarity was less than 90%, suggesting the need for the whole-genome sequencing of L. japonicum.

  6. Application of neural networks and other machine learning algorithms to DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Lapedes, A.; Barnes, C.; Burks, C.; Farber, R.; Sirotkin, K.

    1988-01-01

    In this article we report initial, quantitative results on application of simple neural networks, and simple machine learning methods, to two problems in DNA sequence analysis. The two problems we consider are: (1) determination of whether procaryotic and eucaryotic DNA sequence segments are translated to protein. An accuracy of 99.4% is reported for procaryotic DNA (E. coli) and 98.4% for eucaryotic DNA (H. sapiens genes known to be expressed in liver); (2) determination of whether eucaryotic DNA sequence segments containing the dinucleotides "AG" or "GT" are transcribed to RNA splice junctions. Accuracy of 91.2% was achieved on intron/exon splice junctions (acceptor sites) and 92.8% on exon/intron splice junctions (donor sites). The solution of these two problems, by use of information processing algorithms operating on unannotated base sequences and without recourse to biological laboratory work, is relevant to the Human Genome Project. A variety of neural network, machine learning, and information theoretic algorithms are used. The accuracies obtained exceed those of previous investigations for which quantitative results are available in the literature. They result from an ongoing program of research that applies machine learning algorithms to the problem of determining biological function of DNA sequences. Some predictions of possible new genes using these methods are listed, although a complete survey of the H. sapiens and E. coli sections of GenBank will be given elsewhere. 36 refs., 6 figs., 6 tabs.
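
    The second problem starts from sequence segments containing "AG" or "GT", so a classifier first needs those segments in numeric form. The sketch below is an assumed preprocessing step, not the report's method: the flank length, the one-hot encoding and the function names `one_hot` and `candidate_sites` are illustrative choices.

```python
def one_hot(segment):
    """Encode a DNA segment as a flat list with four 0/1 inputs per base.

    Bases other than A, C, G and T (for example N) become four zeros.
    """
    order = "ACGT"
    encoded = []
    for base in segment.upper():
        encoded.extend(1 if base == ref else 0 for ref in order)
    return encoded


def candidate_sites(seq, dinucleotide, flank):
    """Yield (position, window) for each dinucleotide occurrence with full flanks.

    Occurrences too close to either end of seq are skipped, so every window
    has the same length: 2 * flank + len(dinucleotide).
    """
    seq = seq.upper()
    start = seq.find(dinucleotide)
    while start != -1:
        lo = start - flank
        hi = start + len(dinucleotide) + flank
        if lo >= 0 and hi <= len(seq):
            yield start, seq[lo:hi]
        start = seq.find(dinucleotide, start + 1)


if __name__ == "__main__":
    # Candidate donor-site windows around "GT" in a toy sequence.
    for position, window in candidate_sites("CCAGGTAAGT", "GT", flank=2):
        print(position, window, one_hot(window))
```

    Each encoded window could then be passed, together with a known splice-site label, to whichever classifier is being trained.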

  7. Identification of Chinese Herbs Using a Sequencing-Free Nanostructured Electrochemical DNA Biosensor

    Directory of Open Access Journals (Sweden)

    Yan Lei

    2015-11-01

    Due to the nearly identical phenotypes and chemical constituents, it is often very challenging to accurately differentiate diverse species of a Chinese herbal genus. Although technologies including DNA barcoding have been introduced to help address this problem, they are generally time-consuming and require expensive sequencing. Herein, we present a simple sequencing-free electrochemical biosensor, which enables easy differentiation between two closely related Fritillaria species. To improve its differentiation capability using trace amounts of DNA sample available from herbal extracts, a stepwise electrochemical deposition of reduced graphene oxide (RGO) and gold nanoparticles (AuNPs) was adopted to engineer a synergistic nanostructured sensing interface. By using such a nanofeatured electrochemical DNA (E-DNA) biosensor, two Chinese herbal species of Fritillaria (F. thunbergii and F. cirrhosa) were successfully discriminated at the DNA level, because a fragment of 16-mer sequence at the spacer region of the 5S-rRNA only exists in F. thunbergii. This E-DNA sensor was capable of identifying the target sequence in the range from 100 fM to 10 nM, and a detection limit as low as 11.7 fM (S/N = 3) was obtained. Importantly, this sensor was applied to detect the unique fragment of the PCR products amplified from F. thunbergii and F. cirrhosa, respectively. We anticipate that such a direct, sequencing-free sensing mode will ultimately pave the way towards a new generation of herb-identification strategies.

  8. Contributions of Sequence to the Higher-Order Structures of DNA.

    Science.gov (United States)

    Todolli, Stefjord; Perez, Pamela J; Clauvelin, Nicolas; Olson, Wilma K

    2017-02-07

    One of the critical unanswered questions in genome biophysics is how the primary sequence of DNA bases influences the global properties of very-long-chain molecules. The local sequence-dependent features of DNA found in high-resolution structures introduce irregularities in the disposition of adjacent residues that facilitate the specific binding of proteins and modulate the global folding and interactions of double helices with hundreds of basepairs. These features also determine the positions of nucleosomes on DNA and the lengths of the interspersed DNA linkers. Like the patterns of basepair association within DNA, the arrangements of nucleosomes in chromatin modulate the properties of longer polymers. The intrachromosomal loops detected in genomic studies contain hundreds of nucleosomes, and given that the simulated configurations of chromatin depend on the lengths of linker DNA, the formation of these loops may reflect sequence-dependent information encoded within the positioning of the nucleosomes. With knowledge of the positions of nucleosomes on a given genome, methods are now at hand to estimate the looping propensities of chromatin in terms of the spacing of nucleosomes and to make a direct connection between the DNA base sequence and larger-scale chromatin folding. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  9. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments.

    Science.gov (United States)

    Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias

    2013-09-24

    Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousands of years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp.

  10. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments

    Science.gov (United States)

    Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias

    2013-01-01

    Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousands of years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp. PMID:24019490

  11. Comparing the performance of three ancient DNA extraction methods for high-throughput sequencing.

    Science.gov (United States)

    Gamba, Cristina; Hanghøj, Kristian; Gaunitz, Charleen; Alfarhan, Ahmed H; Alquraishi, Saleh A; Al-Rasheid, Khaled A S; Bradley, Daniel G; Orlando, Ludovic

    2016-03-01

    The DNA molecules that can be extracted from archaeological and palaeontological remains are often degraded and massively contaminated with environmental microbial material. This reduces the efficacy of shotgun approaches for sequencing ancient genomes, despite the decreasing sequencing costs of high-throughput sequencing (HTS). Improving the recovery of endogenous molecules from the DNA extraction and purification steps could, thus, help advance the characterization of ancient genomes. Here, we apply the three most commonly used DNA extraction methods to five ancient bone samples spanning a ~30 thousand year temporal range and originating from a diversity of environments, from South America to Alaska. We show that methods based on the purification of DNA fragments using silica columns are more advantageous than in solution methods and increase not only the total amount of DNA molecules retrieved but also the relative importance of endogenous DNA fragments and their molecular diversity. Therefore, these methods provide a cost-effective solution for downstream applications, including DNA sequencing on HTS platforms. © 2015 John Wiley & Sons Ltd.

  12. Identification of DNA lesions using a third base pair for amplification and nanopore sequencing

    Science.gov (United States)

    Riedl, Jan; Ding, Yun; Fleming, Aaron M.; Burrows, Cynthia J.

    2015-01-01

    Damage to the genome is implicated in the progression of cancer and stress-induced diseases. DNA lesions exist in low levels, and cannot be amplified by standard PCR because they are frequently strong blocks to polymerases. Here, we describe a method for PCR amplification of lesion-containing DNA in which the site and identity could be marked, copied and sequenced. Critical for this method is installation of either the dNaM or d5SICS nucleotides at the lesion site after processing via the base excision repair process. These marker nucleotides constitute an unnatural base pair, allowing large quantities of marked DNA to be made by PCR amplification. Sanger sequencing confirms the potential for this method to locate lesions by marking, amplifying and sequencing a lesion in the KRAS gene. Detection using the α-hemolysin nanopore is also developed to analyse the markers in individual DNA strands with the potential to identify multiple lesions per strand. PMID:26542210

  13. [Relationships among planktons DNA sequence diversity, water quality and fish diseases in Siniperca chuatsi ponds].

    Science.gov (United States)

    Wang, Ya-jun; Wu, Shu-qin; Lin, Wen-hui; Yang, Zhi-hui; Wu, Hui-min; Shi, Cun-bin; Pan, Hou-jun

    2007-01-01

    Using the random amplified polymorphic DNA (RAPD) technique, this paper studied the alpha-diversity of plankton communities and its relationships with water quality and fish diseases in 7 Siniperca chuatsi ponds, as well as the effects of stocking density and a new culture model on the diversity and water quality. The results showed that there was a significant negative correlation between the DNA sequence diversity of plankton communities and the water quality index, and that high stocking density decreased the DNA sequence diversity and increased the water quality index. The new culture model, with a short culture period, low stocking density and high feed input, caused greater damage to the water environment. Hierarchical cluster analysis indicated similarities in the DNA sequences of plankton communities and the physicochemical properties of water bodies among the ponds with fish diseases, which suggests the possibility of predicting disease occurrence in Siniperca chuatsi ponds.

  14. A simple protocol for the automation of DNA cycle sequencing reactions and polymerase chain reactions.

    Science.gov (United States)

    Civitello, A B; Richards, S; Gibbs, R A

    1992-01-01

    Automated DNA sequencing methods using robotic workstations have been previously reported; however, it is often an arduous task to import these technologies into a laboratory. We describe protocols making use of a Beckman Biomek 1000 robotic workstation to prepare polymerase chain reactions (PCRs) and "cycle sequencing" reactions to be performed in a Perkin Elmer Cetus System 9600 thermocycler. The combination of these two instruments allows a high throughput of PCR and DNA sequencing reactions. The programs described are freely available via anonymous file transfer protocol (FTP).

  15. Database Description - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Budding yeast cDNA sequencing project Database Description General information of database D...ases Organism Taxonomy Name: Saccharomyces cerevisiae Taxonomy ID: 4932 Database description 5'-end sequence...nuine 5'-end, mapping the 5'-end sequence to the genome will lead to accurate identification of the transcript... title: A large-scale full-length cDNA analysis to explore the budding yeast transcriptome. Author name(s): ...rvices Not available URL of Web services - Need for user registration - About This Database Database Descript

  16. Repetitive sequences in Eurasian lynx (Lynx lynx L.) mitochondrial DNA control region.

    Science.gov (United States)

    Sindičić, Magda; Gomerčić, Tomislav; Galov, Ana; Polanc, Primož; Huber, Duro; Slavica, Alen

    2012-06-01

    The mitochondrial DNA (mtDNA) control region (CR) of numerous species is known to include up to five different repetitive sequences (RS1-RS5) that are found at various locations, involving motifs of different length and extensive length heteroplasmy. Two repetitive sequences (RS2 and RS3) on opposite sides of the mtDNA central conserved region have been described in the domestic cat (Felis catus) and some other felid species. However, the presence of repetitive sequence RS3 has not yet been detected in Eurasian lynx (Lynx lynx). We analyzed the mtDNA CR of 35 Eurasian lynx (L. lynx L.) samples to characterize repetitive sequences and to compare them with those found in other felid species. We confirmed the presence of an 80 base pair (bp) repetitive sequence (RS2) at the 5' end of the Eurasian lynx mtDNA CR L strand and, for the first time, described the RS3 repetitive sequence at its 3' end, consisting of an array of tandem repeats five to ten bp long. We found that felid species share a similar RS3 repetitive pattern and the fundamental repeat motif TACAC.
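
    As an illustration of how an array of tandem repeats built on the TACAC motif could be located in a sequence, here is a minimal Python sketch. The function name `find_tandem_arrays`, the `min_copies` threshold and the toy sequence are assumptions made for this example; they are not taken from the study, which does not describe its software.

```python
import re

def find_tandem_arrays(seq, motif="TACAC", min_copies=2):
    """Return (start, end, copies) for each run of back-to-back motif copies.

    Hypothetical helper: only exact, uninterrupted copies are counted.
    """
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(motif), min_copies))
    hits = []
    for m in pattern.finditer(seq.upper()):
        copies = (m.end() - m.start()) // len(motif)
        hits.append((m.start(), m.end(), copies))
    return hits

# Made-up toy sequence (not lynx data): three tandem copies, then one isolated copy.
toy = "GGA" + "TACAC" * 3 + "GGT" + "TACAC" + "AA"
print(find_tandem_arrays(toy))  # [(3, 18, 3)]
```

    The isolated single copy is ignored because it falls below `min_copies`; real control-region repeats may be imperfect, which this exact-match sketch would not capture.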

  17. A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.

    Science.gov (United States)

    Álvarez-Martos, Isabel; Ferapontova, Elena E

    2017-08-05

    A unique specificity of the aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57-nucleotide-long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases with their DNA analogues is not capable of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence does, and thus is not an aptamer and can be used neither for in vivo nor for in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. DNA sequencing by two-dimensional materials: As theoretical modeling meets experiments.

    Science.gov (United States)

    Liang, Lijun; Shen, Jia-Wei; Zhang, Zhisen; Wang, Qi

    2017-03-15

    Owing to their extraordinary electrical, chemical, optical, mechanical and structural properties, two-dimensional (2D) materials (mainly including graphene, boron nitride, MoS2 etc.) have stimulated explosive interest in sensor applications. 2D-material based nanoscale DNA sequencing is a single-molecule technique with revolutionary potential. In this paper, we review the methodology of DNA sequencing based on the measurements of ionic current, force peak, and transverse electrical currents etc. by 2D materials. The advantages and disadvantages of DNA sequencing by 2D materials are discussed. Besides the recent development of experiments, we focus on the theoretical calculations of DNA sequencing, which have played a critical role in the development of this field. Special emphasis is placed on the disagreements between experiments and theoretical calculations, and the explanations for the discrepancy are highlighted. Finally, some new plausible sequencing methods from computational studies are discussed, which may be applied in realistic DNA sequencing experiments in the future. Copyright © 2015 Elsevier B.V. All rights reserved.

  19. The chemical structure of DNA sequence signals for RNA transcription

    Science.gov (United States)

    George, D. G.; Dayhoff, M. O.

    1982-01-01

    The proposed recognition sites for RNA transcription for E. coli RNA polymerase, bacteriophage T7 RNA polymerase, and eukaryotic RNA polymerase Pol II are evaluated in the light of the requirements for efficient recognition. It is shown that although there is good experimental evidence that specific nucleic acid sequence patterns are involved in transcriptional regulation in bacteria and bacterial viruses, among the sequences now available, only in the case of the promoters recognized by bacteriophage T7 polymerase does it seem likely that the pattern is sufficient. It is concluded that the eukaryotic pattern that is investigated is not restrictive enough to serve as a recognition site.

  20. Sequence conservation of an avian centromeric repeated DNA component.

    Science.gov (United States)

    Madsen, C S; Brooks, J E; de Kloet, E; de Kloet, S R

    1994-06-01

    The approximately 190-bp centromeric repeat monomers of the spur-winged lapwing (Vanellus spinosus, Charadriidae), the Chilean flamingo (Phoenicopterus chilensis, Phoenicopteridae), the sarus crane (Grus antigone, Gruidae), parrots (Psittacidae), waterfowl (Anatidae), and the merlin (Falco columbarius, Falconidae) contain elements that are interspecifically highly variable, as well as elements (trinucleotides and higher order oligonucleotides) that are highly conserved in sequence and relative location within the repeat. Such conservation suggests that the centromeric repeats of these avian species have evolved from a common ancestral sequence that may date from very early stages of avian radiation.

  1. DNA sequencing with capillary electrophoresis and single cell analysis with mass spectrometry

    Energy Technology Data Exchange (ETDEWEB)

    Fung, N.

    1998-03-27

    Since the first demonstration of the laser in the 1960s, lasers have found numerous applications in analytical chemistry. In this work, two different applications are described, namely, DNA sequencing with capillary gel electrophoresis and single cell analysis with mass spectrometry. Two projects are described in which high-speed DNA separations with capillary gel electrophoresis were demonstrated. In the third project, flow cytometry and mass spectrometry were coupled via a laser vaporization/ionization interface and individual mammalian cells were analyzed. First, DNA Sanger fragments were separated by capillary gel electrophoresis. A separation speed of 20 basepairs per minute was demonstrated with a mixed poly(ethylene oxide) (PEO) sieving solution. In addition, a new capillary wall treatment protocol was developed in which bare (or uncoated) capillaries can be used in DNA sequencing. Second, a temperature programming scheme was used to separate DNA Sanger fragments. Third, flow cytometry and mass spectrometry were coupled with a laser vaporization/ionization interface.

  2. Sequence-specific electron injection into DNA from an intermolecular electron donor.

    Science.gov (United States)

    Morinaga, Hironobu; Takenaka, Tomohiro; Hashiya, Fumitaka; Kizaki, Seiichiro; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2013-04-01

    Electron transfer in DNA has been intensively studied to elucidate its biological roles and for applications in bottom-up DNA nanotechnology. Recently, mechanisms of electron transfer to DNA have been investigated; however, most of the systems designed are intramolecular. Here, we synthesized pyrene-conjugated pyrrole-imidazole polyamides (PPIs) to achieve sequence-specific electron injection into DNA in an intermolecular fashion. Electron injection from PPIs into DNA was detected using 5-bromouracil as an electron acceptor. Twelve different 5-bromouracil-containing oligomers were synthesized to examine the electron-injection ability of PPI. Product analysis demonstrated that the electron transfer from PPIs was localized in a range of 8 bp from the binding site of the PPIs. These results demonstrate that PPIs can be a useful tool for sequence-specific electron injection.

  3. Transposon-like sequences in extrachromosomal circular DNA from mouse thymocytes.

    Science.gov (United States)

    Fujimoto, S; Tsuda, T; Toda, M; Yamagishi, H

    1985-01-01

    Small polydisperse circular (spc) DNA was isolated from mouse thymocytes and cloned into the HindIII site of lambda vector Charon 7. Fifty-six recombinants from this spc DNA library were analyzed. R repeats, which were originally found near immunoglobulin genes, were enriched in spc DNA clones relative to their representation in the chromosome. In one clone, the R sequence was linked to Bam and MIF sequences and the contiguous arrangement was truncated from both ends. In another clone, composite Bam/R and R repeats existed as a pair in inverted repeat orientation. Truncation occurred from the 5' side without affecting the 3' ends. In both clones, short direct repeats flanked the repeated sequences. The possible role of R sequences in transposition and circular formation is discussed. PMID:2984679

  4. Distribution and sequence homogeneity of an abundant satellite DNA in the beetle, Tenebrio molitor.

    Science.gov (United States)

    Davis, C A; Wyatt, G R

    1989-01-01

    The mealworm beetle, Tenebrio molitor, contains an unusually abundant and homogeneous satellite DNA which constitutes up to 60% of its genome. The satellite DNA is shown to be present in all of the chromosomes by in situ hybridization. 18 dimers of the repeat unit were cloned and sequenced. The consensus sequence is 142 nt long and lacks any internal repeat structure. Monomers of the sequence are very similar, showing on average a 2% divergence from the calculated consensus. Variant nucleotides are scattered randomly throughout the sequence although some variants are more common than others. Neighboring repeat units are no more alike than randomly chosen ones. The results suggest that some mechanism, perhaps gene conversion, is acting to maintain the homogeneity of the satellite DNA despite its abundance and distribution on all of the chromosomes. PMID:2762148
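
    The idea of a calculated consensus and an average divergence from it can be sketched in a few lines of Python. This is a simplified illustration that assumes the monomers are already aligned and of equal length; the function names and the toy monomers are invented for the example and do not reproduce the authors' procedure or data.

```python
from collections import Counter

def consensus(monomers):
    """Most frequent base at each column of equal-length, pre-aligned monomers."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*monomers))

def mean_divergence(monomers):
    """Average fraction of positions at which a monomer differs from the consensus."""
    cons = consensus(monomers)
    per_monomer = [
        sum(a != b for a, b in zip(m, cons)) / len(cons) for m in monomers
    ]
    return sum(per_monomer) / len(per_monomer)

# Made-up 10-nt toy monomers; two of the four carry a single variant base.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC"]
print(consensus(toy))
print(round(mean_divergence(toy), 3))
```

    With real cloned repeats, an alignment step that handles insertions and deletions would be needed before a column-wise consensus like this is meaningful.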

  5. Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices.

    Science.gov (United States)

    Li, Guang; Wang, Yadong; Su, Xiaohong

    2012-10-01

    When developing personal DNA databases, there must be an appropriate guarantee of anonymity, which means that the data cannot be related back to individuals. DNA lattice anonymization (DNALA) is a successful method for making personal DNA sequences anonymous. However, it uses time-consuming multiple sequence alignment and a low-accuracy greedy clustering algorithm. Furthermore, DNALA is not an online algorithm, and so it cannot quickly return results when the database is updated. This study improves the DNALA method. Specifically, we replaced the multiple sequence alignment in DNALA with global pairwise sequence alignment to save time, and we designed a hybrid clustering algorithm comprised of a maximum weight matching (MWM)-based algorithm and an online algorithm. The MWM-based algorithm is more accurate than the greedy algorithm in DNALA and has the same time complexity. The online algorithm can process data quickly when the database is updated. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  6. Characterization and assessment of an avian repetitive DNA sequence as an icterid phylogenetic marker.

    Science.gov (United States)

    Quinn, J S; Guglich, E; Seutin, G; Lau, R; Marsolais, J; Parna, L; Boag, P T; White, B N

    1992-02-01

    The first tandemly repeated sequence examined in a passerine bird, a 431-bp PstI fragment named pMAT1, has been cloned from the genome of the brown-headed cowbird (Molothrus ater). The sequence represents about 5-10% of the genome (about 4 x 10^5 copies) and yields prominent ethidium bromide stained bands when genomic DNA cut with a variety of restriction enzymes is electrophoresed in agarose gels. A particularly striking ladder of fragments is apparent when the DNA is cut with HinfI, indicative of a tandem arrangement of the monomer. The cloned PstI monomer has been sequenced, revealing no internal repeated structure. There are sequences that hybridize with pMAT1 found in related nine-primaried oscines but not in more distantly related oscines, suboscines, or nonpasserine species. Little sequence similarity to tandemly repeated PstI cut sequences from the merlin (Falco columbarius), sarus crane (Grus antigone), or Puerto Rican parrot (Amazona vittata) or to HinfI digested sequence from the Toulouse goose (Anser anser) was detected. The isolated sequence was used as a probe to examine DNA samples of eight members of the tribe Icterini. This examination revealed phylogenetically informative characters. The repeat contains cutting sites from a number of restriction enzymes, which, if sufficiently polymorphic, would provide new phylogenetic characters. Sequences like these, conserved within a species, but variable between closely related species, may be very useful for phylogenetic studies of closely related taxa.

  7. Predicting DNA-binding sites of proteins from amino acid sequence

    Directory of Open Access Journals (Sweden)

    Wu Feihong

    2006-05-01

    Background: Understanding the molecular details of protein-DNA interactions is critical for deciphering the mechanisms of gene regulation. We present a machine learning approach for the identification of amino acid residues involved in protein-DNA interactions. Results: We start with a Naïve Bayes classifier trained to predict whether a given amino acid residue is a DNA-binding residue based on its identity and the identities of its sequence neighbors. The input to the classifier consists of the identities of the target residue and 4 sequence neighbors on each side of the target residue. The classifier is trained and evaluated (using leave-one-out cross-validation) on a non-redundant set of 171 proteins. Our results indicate the feasibility of identifying interface residues based on local sequence information. The classifier achieves 71% overall accuracy with a correlation coefficient of 0.24, 35% specificity and 53% sensitivity in identifying interface residues as evaluated by leave-one-out cross-validation. We show that the performance of the classifier is improved by using sequence entropy of the target residue (the entropy of the corresponding column in multiple alignment obtained by aligning the target sequence with its sequence homologs) as additional input. The classifier achieves 78% overall accuracy with a correlation coefficient of 0.28, 44% specificity and 41% sensitivity in identifying interface residues. Examination of the predictions in the context of 3-dimensional structures of proteins demonstrates the effectiveness of this method in identifying DNA-binding sites from sequence information. In 33% (56 out of 171) of the proteins, the classifier identifies the interaction sites by correctly recognizing at least half of the interface residues. In 87% (149 out of 171) of the proteins, the classifier correctly identifies at least 20% of the interface residues. This suggests the possibility of using such classifiers to identify

  8. Development of a Novel Technology for Label Free DNA Sequencing

    Science.gov (United States)

    2012-05-21

    biological processes and cannot be solely explained by the tight binding approximation. In real processes, the vibrations of the lattice and the...Frequency Rabi Oscillation in a Coupled-Double- Quantum-Dot Semiconductor System The mechanism of the charge transfer inside a DNA molecule under the...molecule. To this connection, in terms of density matrix theory, we have studied the Rabi oscillation of the charge transfer in a coupled- double

  9. The complete nucleotide sequence of the mitochondrial DNA of the dogfish, Scyliorhinus canicula.

    Science.gov (United States)

    Delarbre, C; Spruyt, N; Delmarre, C; Gallut, C; Barriel, V; Janvier, P; Laudet, V; Gachelin, G

    1998-09-01

    We have determined the complete nucleotide sequence of the mitochondrial DNA (mtDNA) of the dogfish, Scyliorhinus canicula. The 16,697-bp-long mtDNA possesses a gene organization identical to that of the Osteichthyes, but different from that of the sea lamprey Petromyzon marinus. The main features of the mtDNA of osteichthyans were thus established in the common ancestor to chondrichthyans and osteichthyans. The phylogenetic analysis confirms that the Chondrichthyes are the sister group of the Osteichthyes.

  10. Highly accurate fluorogenic DNA sequencing with information theory-based error correction.

    Science.gov (United States)

    Chen, Zitian; Zhou, Wenxiong; Qiao, Shuo; Kang, Li; Duan, Haifeng; Xie, X Sunney; Huang, Yanyi

    2017-12-01

    Eliminating errors in next-generation DNA sequencing has proved challenging. Here we present error-correction code (ECC) sequencing, a method to greatly improve sequencing accuracy by combining fluorogenic sequencing-by-synthesis (SBS) with an information theory-based error-correction algorithm. ECC embeds redundancy in sequencing reads by creating three orthogonal degenerate sequences, generated by alternate dual-base reactions. This is similar to encoding and decoding strategies that have proved effective in detecting and correcting errors in information communication and storage. We show that, when combined with a fluorogenic SBS chemistry with raw accuracy of 98.1%, ECC sequencing provides single-end, error-free sequences up to 200 bp. ECC approaches should enable accurate identification of extremely rare genomic variations in various applications in biology and medicine.
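
    The redundancy idea behind three degenerate encodings can be illustrated with a deliberately simplified Python sketch. It assumes the three dual-base partitions are the IUPAC classes M/K (A,C vs G,T), R/Y (A,G vs C,T) and W/S (A,T vs C,G), and it labels each base independently; the actual method works on sequencing signals and uses its own decoding algorithm, so the names and logic below are illustrative assumptions only.

```python
# Assumed two-base classes for the three encodings (IUPAC degenerate codes).
SCHEMES = {
    "MK": {"A": "M", "C": "M", "G": "K", "T": "K"},
    "RY": {"A": "R", "G": "R", "C": "Y", "T": "Y"},
    "WS": {"A": "W", "T": "W", "C": "S", "G": "S"},
}
CLASS_MEMBERS = {
    "M": {"A", "C"}, "K": {"G", "T"},
    "R": {"A", "G"}, "Y": {"C", "T"},
    "W": {"A", "T"}, "S": {"C", "G"},
}

def encode(seq):
    """Map a DNA string to its three degenerate label strings."""
    return {name: "".join(table[b] for b in seq) for name, table in SCHEMES.items()}

def decode(encoded):
    """Recover bases by intersecting the three classes; 'N' marks an inconsistency."""
    out = []
    for labels in zip(encoded["MK"], encoded["RY"], encoded["WS"]):
        candidates = set("ACGT")
        for label in labels:
            candidates &= CLASS_MEMBERS[label]
        out.append(candidates.pop() if len(candidates) == 1 else "N")
    return "".join(out)

enc = encode("GATTACA")
print(decode(enc))                      # GATTACA
enc["WS"] = "W" + enc["WS"][1:]         # simulate one wrong label in one encoding
print(decode(enc))                      # NATTACA
```

    Any two of the three labels already determine a base, so the third acts as a parity-like check: a single wrong label makes the intersection empty and the position is flagged rather than silently miscalled.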

  11. Highly parallel translation of DNA sequences into small molecules.

    Directory of Open Access Journals (Sweden)

    Rebecca M Weisinger

    A large body of in vitro evolution work establishes the utility of biopolymer libraries comprising 10^10 to 10^15 distinct molecules for the discovery of nanomolar-affinity ligands to proteins. Small-molecule libraries of comparable complexity will likely provide nanomolar-affinity small-molecule ligands. Unlike biopolymers, small molecules can offer the advantages of cell permeability, low immunogenicity, metabolic stability, rapid diffusion and inexpensive mass production. It is thought that such desirable in vivo behavior is correlated with the physical properties of small molecules, specifically a limited number of hydrogen bond donors and acceptors, a defined range of hydrophobicity, and most importantly, molecular weights less than 500 Daltons. Creating a collection of 10^10 to 10^15 small molecules that meet these criteria requires the use of hundreds to thousands of diversity elements per step in a combinatorial synthesis of three to five steps. With this goal in mind, we have reported a set of mesofluidic devices that enable DNA-programmed combinatorial chemistry in a highly parallel 384-well plate format. Here, we demonstrate that these devices can translate DNA genes encoding 384 diversity elements per coding position into corresponding small-molecule gene products. This robust and efficient procedure yields small molecule-DNA conjugates suitable for in vitro evolution experiments.

  12. Sequence Searcher: A Java tool to perform regular expression and fuzzy searches of multiple DNA and protein sequences

    Directory of Open Access Journals (Sweden)

    Upton Chris

    2009-01-01

    Background: Many sequence-searching tools have limiting factors for their use. For example, they may be platform specific, enforce restrictive size limits on the sequences to be searched, or allow searches of only DNA or only protein. Findings: We present an easy-to-use, fast, platform-independent tool to search for amino acid or nucleotide patterns within one or many protein or nucleic acid sequences. The user can choose to search for regular expressions or perform a fuzzy search in which a particular number of errors is accepted during matching of a sequence. Positions of mismatches in fuzzy searches are displayed graphically to the user. Conclusion: SeqS provides an improved feature set and functions as a stand-alone tool or could be integrated into other bioinformatics platforms.
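
    The two search modes described here, regular-expression matching and fuzzy matching with a bounded number of errors, can be sketched in Python as follows. This is not the SeqS implementation (which is a Java tool); the function names are hypothetical, and the fuzzy search below counts substitutions only, whereas the tool's notion of "errors" may be broader.

```python
import re

def regex_search(pattern, seq):
    """Return (start, end, matched_text) for every non-overlapping regex hit."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, seq)]

def fuzzy_search(query, seq, max_mismatches=1):
    """Slide the query along seq and report windows within the mismatch budget.

    Each hit is (start, window_text, mismatch_positions_within_window).
    """
    hits = []
    for i in range(len(seq) - len(query) + 1):
        window = seq[i:i + len(query)]
        mismatches = [j for j, (a, b) in enumerate(zip(query, window)) if a != b]
        if len(mismatches) <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits

seq = "ATGCGTATGAGTTAA"  # made-up toy sequence
print(regex_search("ATG[AC]GT", seq))  # [(0, 6, 'ATGCGT'), (6, 12, 'ATGAGT')]
print(fuzzy_search("ATGCGT", seq, 1))  # [(0, 'ATGCGT', []), (6, 'ATGAGT', [3])]
```

    Returning the mismatch positions alongside each hit is what would let a front end highlight them, in the spirit of the graphical display mentioned in the abstract.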

  13. Short Note DNA sequences from the Little Brown Bustard Eupodotis ...

    African Journals Online (AJOL)

    Taxonomic classification of birds based exclusively on morphology and plumage traits has often been found to be inconsistent with true evolutionary history when tested with molecular phylogenies based on neutrally evolving markers. Here we present cytochrome-b gene sequences for the poorly known Little Brown ...

  14. Computational optimisation of targeted DNA sequencing for cancer detection

    DEFF Research Database (Denmark)

    Martinez, Pierre; McGranahan, Nicholas; Birkbak, Nicolai Juul

    2013-01-01

    Despite recent progress thanks to next-generation sequencing technologies, personalised cancer medicine is still hampered by intra-tumour heterogeneity and drug resistance. As most patients with advanced metastatic disease face poor survival, there is need to improve early diagnosis. Analysing...... introduce biases towards in-frame mutations and would compromise the reproducibility of tumour detection....

  15. Targeted enrichment of genomic DNA regions for next generation sequencing

    NARCIS (Netherlands)

    Mertens, F.; El-Sharawy, A.; Sauer, S.; Van Helvoort, J.; Van der Zaag, P.J.; Franke, A.; Nilsson, M.; Lehrach, H.; Brookes, A.

    2011-01-01

    In this review we discuss the latest targeted enrichment methods, and aspects of their utilization along with second generation sequencing for complex genome analysis. In doing so we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a

  16. BIOLOG - a DNA sequence analysis system in PROLOG.

    Science.gov (United States)

    Lyall, A; Hammond, P; Brough, D; Glover, D

    1984-01-11

    BIOLOG contains facilities for the analysis of nucleic acid sequences. These facilities are available through queries and commands of the underlying implementation language PROLOG. Familiarity with PROLOG is gained by using the built-in BIOLOG functions. This experience should enable the user to extend the current system and define new facilities.

  17. DNA Methylation Assessed by SMRT Sequencing Is Linked to Mutations in Neisseria meningitidis Isolates.

    Science.gov (United States)

    Sater, Mohamad R Abdul; Lamelas, Araceli; Wang, Guilin; Clark, Tyson A; Röltgen, Katharina; Mane, Shrikant; Korlach, Jonas; Pluschke, Gerd; Schmid, Christoph D

    2015-01-01

    The Gram-negative bacterium Neisseria meningitidis features extensive genetic variability. To date, proposed virulence genotypes are also detected in isolates from asymptomatic carriers, indicating more complex mechanisms underlying variable colonization modes of N. meningitidis. We applied the Single Molecule, Real-Time (SMRT) sequencing method from Pacific Biosciences to assess the genome-wide DNA modification profiles of two genetically related N. meningitidis strains, both of serogroup A. The resulting DNA methylomes revealed clear divergences, represented by the detection of shared and of strain-specific DNA methylation target motifs. The positional distribution of these methylated target sites within the genomic sequences displayed clear biases, which suggest a functional role of DNA methylation related to the regulation of genes. DNA methylation in N. meningitidis has a likely underestimated potential for variability, as evidenced by a careful analysis of the ORF status of a panel of confirmed and predicted DNA methyltransferase genes in an extended collection of N. meningitidis strains of serogroup A. Based on high coverage short sequence reads, we find phase variability as a major contributor to the variability in DNA methylation. Taking into account the phase variable loci, the inferred functional status of DNA methyltransferase genes matched the observed methylation profiles. Towards an elucidation of presently incompletely characterized functional consequences of DNA methylation in N. meningitidis, we reveal a prominent colocalization of methylated bases with Single Nucleotide Polymorphisms (SNPs) detected within our genomic sequence collection. As a novel observation we report increased mutability also at 6mA methylated nucleotides, complementing mutational hotspots previously described at 5mC methylated nucleotides. These findings suggest a more diverse role of DNA methylation and Restriction-Modification (RM) systems in the evolution of

  18. DNA Methylation Assessed by SMRT Sequencing Is Linked to Mutations in Neisseria meningitidis Isolates.

    Directory of Open Access Journals (Sweden)

    Mohamad R Abdul Sater

    The Gram-negative bacterium Neisseria meningitidis features extensive genetic variability. To date, proposed virulence genotypes are also detected in isolates from asymptomatic carriers, indicating more complex mechanisms underlying variable colonization modes of N. meningitidis. We applied the Single Molecule, Real-Time (SMRT) sequencing method from Pacific Biosciences to assess the genome-wide DNA modification profiles of two genetically related N. meningitidis strains, both of serogroup A. The resulting DNA methylomes revealed clear divergences, represented by the detection of shared and of strain-specific DNA methylation target motifs. The positional distribution of these methylated target sites within the genomic sequences displayed clear biases, which suggest a functional role of DNA methylation related to the regulation of genes. DNA methylation in N. meningitidis has a likely underestimated potential for variability, as evidenced by a careful analysis of the ORF status of a panel of confirmed and predicted DNA methyltransferase genes in an extended collection of N. meningitidis strains of serogroup A. Based on high coverage short sequence reads, we find phase variability as a major contributor to the variability in DNA methylation. Taking into account the phase variable loci, the inferred functional status of DNA methyltransferase genes matched the observed methylation profiles. Towards an elucidation of presently incompletely characterized functional consequences of DNA methylation in N. meningitidis, we reveal a prominent colocalization of methylated bases with Single Nucleotide Polymorphisms (SNPs) detected within our genomic sequence collection. As a novel observation we report increased mutability also at 6mA methylated nucleotides, complementing mutational hotspots previously described at 5mC methylated nucleotides. These findings suggest a more diverse role of DNA methylation and Restriction-Modification (RM) systems in the

  19. Genomic DNA sequence and cytosine methylation changes of adult rice leaves after seeds space flight

    Science.gov (United States)

    Shi, Jinming

    In this study, cytosine methylation at CCGG sites and genomic DNA sequence changes in adult leaves of rice after seed space flight were detected by methylation-sensitive amplification polymorphism (MSAP) and amplified fragment length polymorphism (AFLP) techniques, respectively. Rice seeds were planted in the trial field after a 4-day space flight on the Shenzhou-6 spaceship of China. Adult leaves of space-treated rice, including 8 plants chosen randomly and 2 plants with phenotypic mutation, were used for AFLP and MSAP analysis. Polymorphisms of both DNA sequence and cytosine methylation were detected. For MSAP analysis, the average polymorphic frequencies of the on-ground controls, space-treated plants and mutants were 1.3%, 3.1% and 11%, respectively. For AFLP analysis, the average polymorphic frequencies were 1.4%, 2.9% and 8%, respectively. In total, 27 and 22 polymorphic fragments were cloned and sequenced from the MSAP and AFLP analyses, respectively. Nine of the 27 fragments from MSAP analysis show homology to coding sequence. Of the 22 polymorphic fragments from AFLP analysis, none shows homology to mRNA sequence and eight show homology to repeat regions or retrotransposon sequences. These results suggest that although both genomic DNA sequence and cytosine methylation status can be affected by space flight, the genomic regions homologous to the fragments from the genomic DNA and cytosine methylation analyses were different.

  20. Investigation of length heteroplasmy in mitochondrial DNA control region by massively parallel sequencing.

    Science.gov (United States)

    Lin, Chun-Yen; Tsai, Li-Chin; Hsieh, Hsing-Mei; Huang, Chia-Hung; Yu, Yu-Jen; Tseng, Bill; Linacre, Adrian; Lee, James Chun-I

    2017-09-01

    Accurate sequencing of the control region of the mitochondrial genome is notoriously difficult due to the presence of polycytosine bases, termed C-tracts. The precise number of bases that constitute a C-tract and the bases beyond the poly cytosines may not be accurately defined when analyzing Sanger sequencing data separated by capillary electrophoresis. Massively parallel sequencing has the potential to resolve such poor definition and provides the opportunity to discover variants due to length heteroplasmy. In this study, the control region of mitochondrial genomes from 20 samples was sequenced using both standard Sanger methods with separation by capillary electrophoresis and also using massively parallel DNA sequencing technology. After comparison of the two sets of generated sequence, with the exception of the C-tracts where length heteroplasmy was observed, all sequences were concordant. Sequences of three segments 16184-16193, 303-315 and 568-573 with C-tracts in HVI, II and III can be clearly defined from the massively parallel sequencing data using the program SEQ Mapper. Multiple sequence variants were observed in the length of C-tracts longer than 7 bases. Our report illustrates the accurate designation of all the length variants leading to heteroplasmy in the control region of the mitochondrial genome that can be determined by SEQ Mapper based on data generated by massively parallel DNA sequencing. Copyright © 2017 Elsevier B.V. All rights reserved.
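
    The C-tract definition in this abstract (runs of cytosines, with length variants reported for tracts longer than 7 bases) can be illustrated with a short scan. This is only a sketch: it is not the SEQ Mapper program, the function name `find_c_tracts` and the example string are hypothetical, and the default threshold of 8 simply encodes "longer than 7 bases".

```python
import re

def find_c_tracts(seq, min_len=8):
    """Return (start, length) for each run of C with at least min_len bases.

    Starts are 0-based offsets into seq, not rCRS coordinates.
    """
    pattern = re.compile(r"C{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]

# Hypothetical sequence with C runs of 8, 5 and 10 bases.
example = "A" + "C" * 8 + "TATG" + "C" * 5 + "AT" + "C" * 10 + "G"
tracts = find_c_tracts(example)  # only the 8- and 10-base runs qualify
```

    Counting tract lengths per read, rather than per consensus sequence, is what would expose length heteroplasmy; that step is left out here.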

  1. Correcting sequencing errors in DNA coding regions using a dynamic programming approach

    Energy Technology Data Exchange (ETDEWEB)

    Xu, Y.; Mural, R.J.; Uberbacher, E.C.

    1994-12-01

    This paper presents an algorithm for detecting and "correcting" sequencing errors that occur in DNA coding regions. The types of sequencing error addressed include insertions and deletions (indels) of DNA bases. The goal is to provide a capability which makes single-pass or low-redundancy sequence data more informative, reducing the need for high-redundancy sequencing for gene identification and characterization purposes. The algorithm detects sequencing errors by discovering changes in the statistically preferred reading frame within a putative coding region and then inserts a number of "neutral" bases at a perceived reading frame transition point to make the putative exon candidate frame consistent. The authors have implemented the algorithm as a front-end subsystem of the GRAIL DNA sequence analysis system to construct a version which is very error tolerant and also intend to use this as a testbed for further development of sequencing error-correction technology. On a test set consisting of 68 human DNA sequences with 1% randomly generated indels in coding regions, the algorithm detected and corrected 76% of the indels. The average distance between the position of an indel and the predicted one was 9.4 bases. With this subsystem in place, GRAIL correctly predicted 89% of the coding messages with 10% false messages on the "corrected" sequences, compared to 69% correctly predicted coding messages and 11% falsely predicted messages on the "corrupted" sequences using the standard GRAIL II method. The method uses a dynamic programming algorithm, and runs in time and space linear in the size of the input sequence.
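
    The idea of locating indels as changes in the preferred reading frame can be sketched as a small dynamic program. This is an illustration only, not the GRAIL subsystem: the per-position frame scores, the `switch_penalty` value and both function names are assumptions, since the abstract does not describe the scoring function.

```python
def best_frame_path(scores, switch_penalty=2.0):
    """Pick one reading frame (0, 1 or 2) per position.

    scores[i][f] is a hypothetical coding score for frame f at position i.
    The path maximizes the summed scores minus switch_penalty per frame change.
    Runs in time and space linear in len(scores).
    """
    if not scores:
        return []
    best = list(scores[0])
    back = []
    for i in range(1, len(scores)):
        new_best, pointers = [], []
        for f in range(3):
            # Staying in frame f is free; arriving from another frame is penalized.
            cand = [best[g] - (0.0 if g == f else switch_penalty) for g in range(3)]
            g_star = max(range(3), key=lambda g: cand[g])
            new_best.append(cand[g_star] + scores[i][f])
            pointers.append(g_star)
        best = new_best
        back.append(pointers)
    f = max(range(3), key=lambda g: best[g])
    path = [f]
    for pointers in reversed(back):
        f = pointers[f]
        path.append(f)
    path.reverse()
    return path

def frame_transitions(path):
    """Positions where the chosen frame changes (putative indel sites)."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]

# Hypothetical scores: frame 0 is preferred for five positions, then frame 1.
scores = [[1, 0, 0]] * 5 + [[0, 1, 0]] * 5
sites = frame_transitions(best_frame_path(scores))
```

    The penalty keeps a single noisy position from being reported as a frame shift; inserting "neutral" bases at each reported site, as the abstract describes, would be a separate step.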

  2. Ligation bias in illumina next-generation DNA libraries: implications for sequencing ancient genomes.

    Directory of Open Access Journals (Sweden)

    Andaine Seguin-Orlando

    Ancient DNA extracts consist of a mixture of endogenous molecules and contaminant DNA templates, often originating from environmental microbes. These two populations of templates exhibit different chemical characteristics, with the former showing depurination and cytosine deamination by-products, resulting from post-mortem DNA damage. Such chemical modifications can interfere with the molecular tools used for building second-generation DNA libraries, and limit our ability to fully characterize the true complexity of ancient DNA extracts. In this study, we first use fresh DNA extracts to demonstrate that library preparation based on adapter ligation at AT-overhangs is biased against DNA templates starting with thymine residues, in contrast to blunt-end adapter ligation. We observe the same bias on fresh DNA extracts sheared on Bioruptor, Covaris and nebulizers. This contradicts previous reports suggesting that this bias could originate from the methods used for shearing DNA. This also suggests that AT-overhang adapter ligation efficiency is affected in a sequence-dependent manner and results in an uneven representation of different genomic contexts. We then show how this bias could affect the base composition of ancient DNA libraries prepared following AT-overhang ligation, mainly by limiting the ability to ligate DNA templates starting with thymines and therefore deaminated cytosines. This results in particular nucleotide misincorporation damage patterns, deviating from the signature generally expected for authenticating ancient sequence data. Consequently, we show that models adequate for estimating post-mortem DNA damage levels must be robust to the molecular tools used for building ancient DNA libraries.
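
    A bias against templates starting with thymine would show up as a depleted T fraction at the first read position. The sketch below tabulates that fraction; it is a minimal illustration, not the authors' pipeline, and the function name and toy reads are hypothetical.

```python
from collections import Counter

def first_base_frequencies(reads):
    """Fraction of reads starting with A, C, G or T (other symbols are ignored)."""
    counts = Counter(r[0].upper() for r in reads if r)
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        return {}
    return {b: counts[b] / total for b in "ACGT"}

# Toy reads, invented for illustration only.
reads = ["ACGT", "AGGT", "AAAA", "CTTA", "GAAA", "TTTT"]
freqs = first_base_frequencies(reads)
```

    Comparing these fractions between libraries built with AT-overhang and blunt-end ligation from the same extract would be one way to look for the bias described above.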

  3. Sequencing cDNAs: An Introduction to DNA Sequence Analysis in the Undergraduate Molecular Genetics Course.

    Science.gov (United States)

    Galewsky, Samuel

    2000-01-01

    Introduces a series of molecular genetics laboratories where students pick a single colony from a Drosophila melanogaster embryo cDNA library and purify the plasmid, then analyze the insert through restriction digests and gel electrophoresis. (Author/YDS)

  4. Ultra-deep sequencing of mouse mitochondrial DNA: mutational patterns and their origins.

    Directory of Open Access Journals (Sweden)

    Adam Ameur

    2011-03-01

    Somatic mutations of mtDNA are implicated in the aging process, but there is no universally accepted method for their accurate quantification. We have used ultra-deep sequencing to study genome-wide mtDNA mutation load in the liver of normally- and prematurely-aging mice. Mice that are homozygous for an allele expressing a proof-reading-deficient mtDNA polymerase (mtDNA mutator mice) have 10-times-higher point mutation loads than their wildtype siblings. In addition, the mtDNA mutator mice have increased levels of a truncated linear mtDNA molecule, resulting in decreased sequence coverage in the deleted region. In contrast, circular mtDNA molecules with large deletions occur at extremely low frequencies in mtDNA mutator mice and can therefore not drive the premature aging phenotype. Sequence analysis shows that the main proportion of the mutation load in heterozygous mtDNA mutator mice and their wildtype siblings is inherited from their heterozygous mothers, consistent with germline transmission. We found no increase in levels of point mutations or deletions in wildtype C57Bl/6N mice with increasing age, thus questioning the causative role of these changes in aging. In addition, there was no increased frequency of transversion mutations with time in any of the studied genotypes, arguing against oxidative damage as a major cause of mtDNA mutations. Our results from studies of mice thus indicate that most somatic mtDNA mutations occur as replication errors during development and do not result from damage accumulation in adult life.

  5. Routine human papillomavirus genotyping by DNA sequencing in community hospital laboratories

    Directory of Open Access Journals (Sweden)

    Vigliotti Jessica S

    2007-06-01

    Background: Human papillomavirus (HPV) genotyping is important for following up patients with persistent HPV infection and for evaluating prevention strategies for individual patients to be immunized with type-specific HPV vaccines. The aim of this study was to optimize a robust "low-temperature" (LoTemp™) PCR system to streamline the research protocols for HPV DNA nested PCR-amplification followed by genotyping with direct DNA sequencing. The protocol optimization facilitates transferring this molecular technology into clinical laboratory practice. In particular, lowering the temperature by 10°C at each step of thermocycling during in vitro DNA amplification yields more homogeneous PCR products. With this protocol, template purification before enzymatic cycle primer extensions is no longer necessary. Results: The HPV genomic DNA extracted from liquid-based alcohol-preserved cervicovaginal cells was first amplified by the consensus MY09/MY11 primer pair followed by nested PCR with GP5+/GP6+ primers. The 150 bp nested PCR products were subjected to direct DNA sequencing. The hypervariable 34–50 bp DNA sequence downstream of the GP5+ primer site was compared to the known HPV DNA sequences stored in GenBank using on-line BLAST for genotyping. The LoTemp™ ready-to-use PCR polymerase reagents proved to be stable at room temperature for at least 6 weeks. Nested PCR detected 107 isolates of HPV in 513 cervicovaginal clinical samples, all validated by DNA sequencing. HPV-16 was the most prevalent genotype, constituting 29 of 107 positive cases (27.2%), followed by HPV-56 (8.5%). For comparison, the Digene HC2 test detected 62.6% of the 107 HPV isolates and returned 11 (37.9%) of the 29 HPV-16 positive cases as "positive for high-risk HPV". Conclusion: The LoTemp™ ready-to-use PCR polymerase system, which allows thermocycling at 85°C for denaturing, 40°C for annealing and 65°C for primer extension, can be adapted for target HPV DNA

  6. Direct evidence for sequence-dependent attraction between double-stranded DNA controlled by methylation

    Science.gov (United States)

    Yoo, Jejoong; Kim, Hajin; Aksimentiev, Aleksei; Ha, Taekjip

    2016-03-01

    Although proteins mediate highly ordered DNA organization in vivo, theoretical studies suggest that homologous DNA duplexes can preferentially associate with one another even in the absence of proteins. Here we combine molecular dynamics simulations with single-molecule fluorescence resonance energy transfer experiments to examine the interactions between duplex DNA in the presence of spermine, a biological polycation. We find that AT-rich DNA duplexes associate more strongly than GC-rich duplexes, regardless of the sequence homology. Methyl groups of thymine act as a steric block, relocating spermine from major grooves to interhelical regions, thereby increasing DNA-DNA attraction. Indeed, methylation of cytosines makes attraction between GC-rich DNA as strong as that between AT-rich DNA. Recent genome-wide chromosome organization studies showed that remote contact frequencies are higher for AT-rich and methylated DNA, suggesting that the direct DNA-DNA interactions that we report here may play a role in chromosome organization and gene regulation.

  7. High penetrance of sequencing errors and interpretative shortcomings in mtDNA sequence analysis of LHON patients.

    Science.gov (United States)

    Bandelt, Hans-Jürgen; Yao, Yong-Gang; Salas, Antonio; Kivisild, Toomas; Bravi, Claudio M

    2007-01-12

    For identifying mutation(s) that are potentially pathogenic it is essential to determine the entire mitochondrial DNA (mtDNA) sequences from patients suffering from a particular mitochondrial disease, such as Leber hereditary optic neuropathy (LHON). However, such sequencing efforts can, in the worst case, be riddled with errors by imposing phantom mutations or misreporting variant nucleotides, and moreover, by inadvertently regarding some mutations as novel and pathogenic, which are actually known to define minor haplogroups. Under such circumstances it remains unclear whether the disease-associated mutations would have been determined adequately. Here, we re-analyse four problematic LHON studies and propose guidelines by which some of the pitfalls could be avoided.

  8. cDNA sequence and deduced amino acid sequence of a fungal stress protein induced in Rhizopus nigricans by steroids.

    Science.gov (United States)

    Cresnar, B; Plaper, A; Breskvar, K; Hudnik-Plevnik, T

    1998-09-29

    A cDNA clone was isolated from a lambda gt11 library prepared from Rhizopus nigricans after growing the fungus in the presence of progesterone. Northern blot analysis of total RNA showed that expression of the corresponding mRNA was up-regulated in R. nigricans after treatment with different steroids and after exposure of the fungus to heat shock or osmotic stress. Sequence analysis revealed an open reading frame for a 364-amino-acid polypeptide. The predicted amino acid sequence exhibited significant similarity to several sugar epimerases in two domains common to these enzymes. Our results suggest that the analyzed cDNA codes for a fungal stress-inducible protein belonging to the sugar epimerases. Copyright 1998 Academic Press.

  9. Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

    Science.gov (United States)

    Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

    2016-08-01

    Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. Copyright © 2016 Elsevier B.V. All rights reserved.

  10. Isolation and sequence analysis of a cDNA clone encoding the fifth complement component

    DEFF Research Database (Denmark)

    Lundwall, Åke B; Wetsel, Rick A; Kristensen, Torsten

    1985-01-01

    obtained further predicted an arginine-rich sequence (RPRR) immediately upstream of the N-terminal threonine of C5a, indicating that the promolecule form of C5 is synthesized with a beta alpha-chain orientation as previously shown for pro-C3 and pro-C4. The C5 cDNA clone was sheared randomly by sonication......DNA clone of 1.85 kilobase pairs was isolated. Hybridization of the mixed-sequence probe to the complementary strand of the plasmid insert and sequence analysis by the dideoxy method predicted the expected protein sequence of C5a (positions 1-12), amino-terminal to the anticipated priming site. The sequence...

  11. The complete mitochondrial DNA sequence of Crotalus horridus (timber rattlesnake).

    Science.gov (United States)

    Hall, Jacob B; Cobb, Vincent A; Cahoon, A Bruce

    2013-04-01

    The complete mitogenome of the timber rattlesnake (Crotalus horridus) was completed using Sanger sequencing. It is 17,260 bp with 13 protein-coding genes, 21 tRNAs, two rRNAs and two control regions. Gene synteny is consistent with other snakes with the exception of a missing redundant tRNA (Ser). This mitogenome should prove to be a useful addition of a well-known member of the Viperidae snake family.

  12. The DNA sequence and analysis of human chromosome 13

    OpenAIRE

    Dunham, A.; Matthews, L. H.; Burton, J.; Ashurst, J. L.; Howe, K. L.; Ashcroft, K. J.; Beare, D. M.; Burford, D. C.; Hunt, S. E.; Griffiths-Jones, S.; Jones, M. C.; Keenan, S. J.; Oliver, K.; Scott, C. E.; Ainscough, R.

    2004-01-01

    Chromosome 13 is the largest acrocentric human chromosome. It carries genes involved in cancer including the breast cancer type 2 (BRCA2) and retinoblastoma (RB1) genes, is frequently rearranged in B-cell chronic lymphocytic leukaemia, and contains the DAOA locus associated with bipolar disorder and schizophrenia. We describe completion and analysis of 95.5 megabases (Mb) of sequence from chromosome 13, which contains 633 genes and 296 pseudogenes. We estimate that more than 95.4% of the prot...

  13. Semi-automated library preparation for high-throughput DNA sequencing platforms.

    Science.gov (United States)

    Farias-Hesson, Eveline; Erikson, Jonathan; Atkins, Alexander; Shen, Peidong; Davis, Ronald W; Scharfe, Curt; Pourmand, Nader

    2010-01-01

    Next-generation sequencing platforms are powerful technologies, providing gigabases of genetic information in a single run. An important prerequisite for high-throughput DNA sequencing is the development of robust and cost-effective preprocessing protocols for DNA sample library construction. Here we report the development of a semi-automated sample preparation protocol to produce adaptor-ligated fragment libraries. Using a liquid-handling robot in conjunction with Carboxy Terminated Magnetic Beads, we labeled each library sample using a unique 6 bp DNA barcode, which allowed multiplex sample processing and sequencing of 32 libraries in a single run using Applied Biosystems' SOLiD sequencer. We applied our semi-automated pipeline to targeted medical resequencing of nuclear candidate genes in individuals affected by mitochondrial disorders. This novel method is capable of preparing as many as 32 DNA libraries in 2.01 days (8-hour workday) for emulsion PCR/high throughput DNA sequencing, increasing sample preparation production by 8-fold.
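
    Multiplexing with a unique 6 bp barcode per library implies a demultiplexing step after sequencing. The sketch below shows the general idea under a loud simplification: it assumes base-space reads with the barcode at the 5' end and exact matching, which is not how the SOLiD platform reports barcodes. The barcode sequences, sample names and function name are hypothetical.

```python
def demultiplex(reads, barcodes, barcode_len=6):
    """Assign reads to samples by an exact-match prefix barcode.

    barcodes maps barcode sequence -> sample name.
    Returns (bins, unassigned); binned reads have the barcode trimmed off.
    """
    bins = {sample: [] for sample in barcodes.values()}
    unassigned = []
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        sample = barcodes.get(tag)
        if sample is None:
            unassigned.append(read)  # unknown or error-containing barcode
        else:
            bins[sample].append(insert)
    return bins, unassigned

# Hypothetical barcodes and reads.
barcodes = {"ACGTAC": "library_01", "TTGCAA": "library_02"}
reads = ["ACGTACGGGG", "TTGCAACCCC", "NNNNNNAAAA"]
bins, unassigned = demultiplex(reads, barcodes)
```

    Exact matching discards any read with a sequencing error in the barcode; tolerating one mismatch is a common refinement but requires barcodes that differ from each other at several positions.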

  14. Experimental and theoretical studies of sequence effects on the fluctuation and melting of short DNA molecules

    Energy Technology Data Exchange (ETDEWEB)

    Peyrard, M; Cuesta-Lopez, S [Universite de Lyon, Ecole Normale Superieure de Lyon, Laboratoire de Physique, CNRS UMR 5672, 46 allee d' Italie, F-69364 Lyon Cedex 07 (France); Angelov, D [Universite de Lyon, Ecole Normale Superieure de Lyon, Laboratoire de Biologie Moleculaire de la Cellule, CNRS UMR 5239, 46 allee d' Italie, F-69364 Lyon Cedex 07 (France)], E-mail: Michel.Peyrard@ens-lyon.fr

    2009-01-21

    Understanding the melting of short DNA sequences probes DNA at the scale of the genetic code and raises questions which are very different from those posed by very long sequences, which have been extensively studied. We investigate this problem by combining experiments and theory. A new experimental method allows us to make a mapping of the opening of the guanines along the sequence as a function of temperature. The results indicate that non-local effects may be important in DNA because an AT-rich region is able to influence the opening of a base pair which is about 10 base pairs away. An earlier mesoscopic model of DNA is modified to correctly describe the timescales associated with the opening of individual base pairs well below melting, and to properly take into account the sequence. Using this model to analyze some characteristic sequences for which detailed experimental data on the melting is available (Montrichok et al 2003 Europhys. Lett. 62 452), we show that we have to introduce non-local effects of AT-rich regions to get acceptable results. This brings a second indication that the influence of these highly fluctuating regions of DNA on their neighborhood can extend to some distance.

  15. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    Energy Technology Data Exchange (ETDEWEB)

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S. [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa); Arbuthnot, Patrick, E-mail: Patrick.Arbuthnot@wits.ac.za [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa)

    2009-11-20

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  16. DNA extraction, Polymerase Chain Reaction, and Sequencing : Workshop in Clinical Genetics

    Directory of Open Access Journals (Sweden)

    Sumadi Lukman Anwar

    2017-02-01

    DNA extraction, polymerase chain reaction (PCR), and sequencing are basic methods in molecular biology and genetics. They are routinely performed in genetic research and, increasingly, in diagnostic laboratories, especially in pathology and human genetics. With advances in genetics and clinical services for cancer management, mutation analysis is very important not only for diagnosis but also for prediction of therapeutic response. Detection of KRAS, BRAF, EGFR, and c-KIT mutations is presently performed in almost every molecular pathology lab as part of daily clinical service in cancer management. In this workshop we will discuss tips and tricks for these three basic lab methods: how to improve the amount and purity of DNA extracted from blood and tissues, how to avoid DNA degradation during the procedure and storage, how to perform PCR, factors and substances that inhibit polymerases during PCR, how to design effective primer pairs, the basic theory of sequencing, and the interpretation of sequencing results. Although these topics have been widely discussed, this workshop is especially important for clinicians who do not have previous hands-on laboratory experience. In addition, the number of labs able to perform basic genetic and molecular analysis is still limited in Indonesia. With this workshop, we expect to improve knowledge and skills in DNA extraction, PCR, and sequencing. Keywords: DNA, PCR, sequencing

  17. Noninvasive detection of fetal subchromosomal abnormalities by semiconductor sequencing of maternal plasma DNA.

    Science.gov (United States)

    Yin, Ai-hua; Peng, Chun-fang; Zhao, Xin; Caughey, Bennett A; Yang, Jie-xia; Liu, Jian; Huang, Wei-wei; Liu, Chang; Luo, Dong-hong; Liu, Hai-liang; Chen, Yang-yi; Wu, Jing; Hou, Rui; Zhang, Mindy; Ai, Michael; Zheng, Lianghong; Xue, Rachel Q; Mai, Ming-qin; Guo, Fang-fang; Qi, Yi-ming; Wang, Dong-mei; Krawczyk, Michal; Zhang, Daniel; Wang, Yu-nan; Huang, Quan-fei; Karin, Michael; Zhang, Kang

    2015-11-24

    Noninvasive prenatal testing (NIPT) using sequencing of fetal cell-free DNA from maternal plasma has enabled accurate prenatal diagnosis of aneuploidy and become increasingly accepted in clinical practice. We investigated whether NIPT using semiconductor sequencing platform (SSP) could reliably detect subchromosomal deletions/duplications in women carrying high-risk fetuses. We first showed that increasing concentration of abnormal DNA and sequencing depth improved detection. Subsequently, we analyzed plasma from 1,456 pregnant women to develop a method for estimating fetal DNA concentration based on the size distribution of DNA fragments. Finally, we collected plasma from 1,476 pregnant women with fetal structural abnormalities detected on ultrasound who also underwent an invasive diagnostic procedure. We used SSP of maternal plasma DNA to detect subchromosomal abnormalities and validated our results with array comparative genomic hybridization (aCGH). With 3.5 million reads, SSP detected 56 of 78 (71.8%) subchromosomal abnormalities detected by aCGH. With increased sequencing depth up to 10 million reads and restriction of the size of abnormalities to more than 1 Mb, sensitivity improved to 69 of 73 (94.5%). Of 55 false-positive samples, 35 were caused by deletions/duplications present in maternal DNA, indicating the necessity of a validation test to exclude maternal karyotype abnormalities. This study shows that detection of fetal subchromosomal abnormalities is a viable extension of NIPT based on SSP. Although we focused on the application of cell-free DNA sequencing for NIPT, we believe that this method has broader applications for genetic diagnosis, such as analysis of circulating tumor DNA for detection of cancer.
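
    The sensitivity figures quoted in this abstract follow directly from the reported counts, which a few lines of arithmetic can confirm. The helper name `percent` is ours; the counts are taken from the abstract.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

# Counts as reported in the abstract.
sens_3_5m = round(percent(56, 78), 1)      # 3.5 million reads -> 71.8
sens_10m = round(percent(69, 73), 1)       # up to 10 million reads, >1 Mb -> 94.5
maternal_fp = round(percent(35, 55), 1)    # false positives of maternal origin -> 63.6
```

    The last figure is not stated in the abstract; it only restates "35 of 55" as a percentage.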

  18. Scaling behaviors of CG clusters in coding and noncoding DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Zhang Linxi [Department of Physics, Wenzhou Normal College, Wenzhou 325027 (China)]. E-mail: lxzhang@hzcnc.com; Chen Jin [Department of Physics, Wenzhou Normal College, Wenzhou 325027 (China)

    2005-04-01

    In this paper the statistical properties of CG clusters in coding and non-coding DNA sequences are investigated by calculating the cluster-size distribution of CG clusters, $P(S)$, and the breadth of the distribution of the root-mean-square size of CG clusters, $\sigma_m$, in consecutive, non-overlapping blocks of $m$ bases. There are some differences between coding and non-coding sequences. The cluster-size distribution $P(S)$ for both coding and non-coding sequences follows an exponential decay, $P(S) \propto e^{-\alpha S}$; for coding sequences the value of $\alpha$ varies regularly with the percentage of C-G content and can be fitted by a straight line, whereas this is not the case for non-coding sequences. We find that $\xi(m) = \sigma_m/m$ of CG clusters obeys a power-law decay, $\xi(m) \propto m^{-\gamma}$, in both coding and non-coding sequences, with $\gamma$ = 0.949 ± 0.014 and 0.826 ± 0.011 for coding and non-coding sequences, respectively. Therefore, coding and non-coding sequences can be distinguished on the basis of the value of $\gamma$. We also discuss the power law $\xi(m) \propto m^{-\gamma}$ for random sequences and find that the value of $\gamma$ is very close to 1.00. The value of $\gamma$ for coding sequences is thus closer to that of random sequences, and we conclude that coding sequences behave more like random sequences. This investigation can provide some insights into DNA sequences.
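
    The exponential cluster-size law can be illustrated by tabulating cluster sizes and fitting the decay rate on a log scale. This is a sketch under an assumption: a CG cluster is taken here to be a maximal run of consecutive C or G bases, which the abstract does not spell out. Function names and the toy sequence are hypothetical.

```python
import math
import re
from collections import Counter

def cluster_size_distribution(seq):
    """P(S): fraction of CG clusters (maximal runs of C/G) that have size S."""
    sizes = [len(m.group()) for m in re.finditer(r"[CG]+", seq.upper())]
    counts = Counter(sizes)
    return {s: c / len(sizes) for s, c in sorted(counts.items())}

def fit_decay_rate(dist):
    """Least-squares slope of ln P(S) against S, returned as alpha.

    Models P(S) ~ exp(-alpha * S); needs at least two distinct cluster sizes.
    """
    xs = list(dist)
    ys = [math.log(p) for p in dist.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return -sxy / sxx

# Toy sequence: four clusters of size 1, two of size 2, one of size 3.
toy = "A".join(["C"] * 4 + ["CG"] * 2 + ["GCC"])
dist = cluster_size_distribution(toy)
alpha = fit_decay_rate(dist)  # the counts halve at each step, so alpha = ln 2
```

    On real sequences the fit should be restricted to sizes with enough counts, since the sparse tail dominates a log-scale regression.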

  19. Electrochemical direct immobilization of DNA sequences for label-free herpes virus detection

    Energy Technology Data Exchange (ETDEWEB)

    Phuong Dinh Tam; Mai Anh Tuan [International Training Institute for Materials Science (Viet Nam); Tran Trung [Department of Electrochemistry, Hung-Yen University of Technology and Education (Viet Nam); Nguyen Duc Chien [Institute of Engineering Physics, Hanoi University of Technology, 1 Dai Co Viet Road, Hanoi (Viet Nam)], E-mail: tr_trunghut@yahoo.com

    2009-09-01

    DNA sequences/bio-macromolecules of herpes virus (5'-AT CAC CGA CCC GGA GAG GGA C-3') were directly immobilized into a polypyrrole (PPy) matrix by the cyclic voltammetry method and grafted onto arrays of interdigitated platinum microelectrodes. The surface morphology of the obtained PPy/DNA of herpes virus composite films was investigated with a Hitachi S-4800 FESEM. Fourier transform infrared spectroscopy (FTIR) was used to characterize the PPy/DNA film and to study the specific interactions that may exist between DNA biomacromolecules and PPy chains. Attempts to use these PPy/DNA composite films for label-free herpes virus detection revealed a response time of 60 s in solutions containing as little as 2 nM DNA, and a shelf life of six months when immersed in double-distilled water and kept refrigerated.
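
    For a hybridization sensor, the strand expected to pair with the immobilized probe is its reverse complement. The sketch below derives it, together with the GC fraction, for the 21-mer quoted in the abstract; the helper names are ours and nothing here reflects the authors' own analysis.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]

def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)

# Probe from the abstract, with the spacing removed.
probe = "AT CAC CGA CCC GGA GAG GGA C".replace(" ", "")
target = reverse_complement(probe)  # written 5'->3'
gc = gc_fraction(probe)             # 14 of 21 bases are G or C
```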

  20. Structural basis for sequence-specific recognition of DNA by TAL effectors

    KAUST Repository

    Deng, Dong

    2012-01-05

    TAL (transcription activator-like) effectors, secreted by phytopathogenic bacteria, recognize host DNA sequences through a central domain of tandem repeats. Each repeat comprises 33 to 35 conserved amino acids and targets a specific base pair by using two hypervariable residues [known as repeat variable diresidues (RVDs)] at positions 12 and 13. Here, we report the crystal structures of an 11.5-repeat TAL effector in both DNA-free and DNA-bound states. Each TAL repeat comprises two helices connected by a short RVD-containing loop. The 11.5 repeats form a right-handed, superhelical structure that tracks along the sense strand of DNA duplex, with RVDs contacting the major groove. The 12th residue stabilizes the RVD loop, whereas the 13th residue makes a base-specific contact. Understanding DNA recognition by TAL effectors may facilitate rational design of DNA-binding proteins with biotechnological applications.
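
    The one-repeat-to-one-base recognition described here is often summarized as a lookup from RVD to preferred base. The table below is an assumption drawn from the general TAL effector literature rather than from this abstract, it lists only the four most commonly cited RVDs, and real specificities are less strict (NN, for instance, is also reported to accept A). The function name and example RVD list are hypothetical.

```python
# Commonly cited RVD preferences (assumption; not stated in the abstract).
RVD_TO_BASE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "G",
}

def predict_target(rvds):
    """Map a list of RVDs (residues 12 and 13 of each repeat) to a DNA string.

    Unknown RVDs are reported as 'N'.
    """
    return "".join(RVD_TO_BASE.get(rvd.upper(), "N") for rvd in rvds)

# Hypothetical repeat array, not the 11.5-repeat effector from the study.
site = predict_target(["NI", "HD", "NG", "NN", "XX"])
```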