WorldWideScience

Sample records for dna sequences required

  1. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    Directory of Open Access Journals (Sweden)

    Michael Knapp

    2010-07-01

    Full Text Available The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA  research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions.

  2. Dna Sequencing

    Science.gov (United States)

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps off: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions in favoring primer extension to form nucleic acid fragments complementory to the DNA to be sequenced; labelling the nucleic and fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  3. DNA sequences encoding erythropoietin

    Energy Technology Data Exchange (ETDEWEB)

    Lin, F.K.

    1987-10-27

    A purified and isolated DNA sequence is described consisting essentially of a DNA sequence encoding a polypeptide having an amino acid sequence sufficiently duplicative of that of erythropoietin to allow possession of the biological property of causing bone marrow cells to increase production of reticulocytes and red blood cells, and to increase hemoglobin synthesis or iron uptake.

  4. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  5. Automated DNA Sequencing System

    Energy Technology Data Exchange (ETDEWEB)

    Armstrong, G.A.; Ekkebus, C.P.; Hauser, L.J.; Kress, R.L.; Mural, R.J.

    1999-04-25

    Oak Ridge National Laboratory (ORNL) is developing a core DNA sequencing facility to support biological research endeavors at ORNL and to conduct basic sequencing automation research. This facility is novel because its development is based on existing standard biology laboratory equipment; thus, the development process is of interest to the many small laboratories trying to use automation to control costs and increase throughput. Before automation, biology Laboratory personnel purified DNA, completed cycle sequencing, and prepared 96-well sample plates with commercially available hardware designed specifically for each step in the process. Following purification and thermal cycling, an automated sequencing machine was used for the sequencing. A technician handled all movement of the 96-well sample plates between machines. To automate the process, ORNL is adding a CRS Robotics A- 465 arm, ABI 377 sequencing machine, automated centrifuge, automated refrigerator, and possibly an automated SpeedVac. The entire system will be integrated with one central controller that will direct each machine and the robot. The goal of this system is to completely automate the sequencing procedure from bacterial cell samples through ready-to-be-sequenced DNA and ultimately to completed sequence. The system will be flexible and will accommodate different chemistries than existing automated sequencing lines. The system will be expanded in the future to include colony picking and/or actual sequencing. This discrete event, DNA sequencing system will demonstrate that smaller sequencing labs can achieve cost-effective the laboratory grow.

  6. Evolution of DNA sequencing

    National Research Council Canada - National Science Library

    Tipu, Hamid Nawaz; Shabbir, Ambreen

    2015-01-01

    Sanger and coworkers introduced DNA sequencing in 1970s for the first time. It principally relied on termination of growing nucleotide chain when a dideoxythymidine triphosphate (ddTTP) was inserted...

  7. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  8. Sequences in sigma(54) region I required for binding to early melted DNA and their involvement in sigma-DNA isomerisation.

    Science.gov (United States)

    Gallegos, M T; Buck, M

    2000-04-07

    The bacterial sigma(54) RNA polymerase functions in a transcription activation mechanism that fully relies upon nucleotide hydrolysis by an enhancer binding activator protein to stimulate open complex formation. Here, we describe results of DNA-binding assays used to probe the role of the sigma(54) amino terminal region I in activation. Of the 15 region I alanine substitution mutants assayed, several specifically failed to bind to a DNA structure representing an early conformation in DNA melting. The same mutants are defective in activated transcription and in forming an isomerised sigma-DNA complex on the early opened DNA. The mechanism of activation may therefore require tight binding of sigma(54) to particular early melted DNA structures. Where mutant sigma(54) binding to early melted DNA was detected, activator-dependent isomerisation generally occurred as efficiently as with the wild-type protein, suggesting that certain region I sequences are largely uninvolved in sigma isomerisation. DNA-binding, sigma isomerisation and transcription activation assays allow formulation of a functional map of region I.

  9. DNA sequencing by CE.

    Science.gov (United States)

    Karger, Barry L; Guttman, András

    2009-06-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA-sequencing methods have evolved from the labor-intensive slab gel electrophoresis, through automated multiCE systems using fluorophore labeling with multispectral imaging, to the "next-generation" technologies of cyclic-array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes were only possible with the advent of modern sequencing technologies that were a result of step-by-step advances with a contribution of academics, medical personnel and instrument companies. While next-generation sequencing is moving ahead at breakneck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this prospective, we wish to overview the role of CE in DNA sequencing based in part of several of our articles in this journal.

  10. DNA sequences required to induce localized conversion in Streptococcus pneumoniae transformation.

    Science.gov (United States)

    Garcia, P; Gasc, A M; Kyriakidis, X; Baty, D; Sicard, M

    1988-11-01

    In pneumococcal transformation a particular point mutation belonging to the amiA locus is able markedly to enhance recombination frequency when crossed with any other markers of this gene. This results from a polarized conversion of the mutation towards the wild-type sequence. In this report, by site-directed oligonucleotide mutagenesis, we have generated a series of mutants showing various degrees of conversion. We have found that the substitution 5'-ATTCAT----5'-ATTAAT is a sufficient signal for localized conversion. Changing individual bases within this sequence results in decreased conversion frequencies to levels that depend on the mutation, suggesting that there is a family to related sequences which may act as a substrate for a conversion system. Moreover, the length over which this conversion occurs has been estimated to be 12 base pairs on the average.

  11. Sequence characteristics required for cooperative binding and efficient in vivo titration of the replication initiator protein DnaA in E. coli

    DEFF Research Database (Denmark)

    Hansen, Flemming G.; Christensen, Bjarke Bak; Atlung, Tove

    2007-01-01

    was eliminated by insertion of half a helical turn between any of the DnaA boxes. Titration strongly depends on the presence and orientation of the promoter distal R6 DnaA box located 104 bp upstream of the R5 box as well as neighbouring sequences downstream of R6. Titration depends on the integrity of a 43 bp...... was located less than 20 bp upstream of the -35 sequence. Thus, the architectural requirements for titration and for repression of transcription are different. A new set of rules for identifying efficiently titrating DnaA box regions was formulated and used to analyse sequences for which good titration data...

  12. Mu-like prophage strong gyrase site sequences: analysis of properties required for promoting efficient mu DNA replication.

    Science.gov (United States)

    Oram, Mark; Pato, Martin L

    2004-07-01

    The bacteriophage Mu genome contains a centrally located strong gyrase site (SGS) that is required for efficient prophage replication. To aid in studying the unusual properties of the SGS, we sought other gyrase sites that might be able to substitute for the SGS in Mu replication. Five candidate sites were obtained by PCR from Mu-like prophage sequences present in Escherichia coli O157:H7 Sakai, Haemophilus influenzae Rd, Salmonella enterica serovar Typhi CT18, and two strains of Neisseria meningitidis. Each of the sites was used to replace the natural Mu SGS to form recombinant prophages, and the effects on Mu replication and host lysis were determined. The site from the E. coli prophage supported markedly enhanced replication and host lysis over that observed with a Mu derivative lacking the SGS, those from the N. meningitidis prophages allowed a small enhancement, and the sites from the Haemophilus and Salmonella prophages gave none. Each of the candidate sites was cleaved specifically by E. coli DNA gyrase both in vitro and in vivo. Supercoiling assays performed in vitro, with the five sites or the Mu SGS individually cloned into a pUC19 reporter plasmid, showed that the Mu SGS and the E. coli or N. meningitidis sequences allowed an enhancement of processive, gyrase-dependent supercoiling, whereas the H. influenzae or Salmonella serovar Typhi sequences did not. While consistent with a requirement for enhanced processivity of supercoiling for a site to function in Mu replication, these data suggest that other factors are also important. The relevance of these observations to an understanding of the function of the SGS is discussed. Copyright 2004 American Society for Microbiology

  13. Information Theory of DNA Sequencing

    CERN Document Server

    Motahari, Abolfazl; Tse, David

    2012-01-01

    DNA sequencing is the basic workhorse of modern day biology and medicine. Shotgun sequencing is the dominant technique used: many randomly located short fragments called reads are extracted from the DNA sequence, and these reads are assembled to reconstruct the original sequence. By drawing an analogy between the DNA sequencing problem and the classic communication problem, we define an information theoretic notion of sequencing capacity. This is the maximum number of DNA base pairs that can be resolved reliably per read, and provides a fundamental limit to the performance that can be achieved by any assembly algorithm. We compute the sequencing capacity explicitly for a simple statistical model of the DNA sequence and the read process. Using this framework, we also study the impact of noise in the read process on the sequencing capacity.

  14. DNA Sequencing Using capillary Electrophoresis

    Energy Technology Data Exchange (ETDEWEB)

    Dr. Barry Karger

    2011-05-09

    application papers of sequencing up to this level were also published in the mid 1990's. A major interest of the sequencing community has always been read length. The longer the sequence read per run the more efficient the process as well as the ability to read repeat sequences. We therefore devoted a great deal of time to studying the factors influencing read length in capillary electrophoresis, including polymer type and molecule weight, capillary column temperature, applied electric field, etc. In our initial optimization, we were able to demonstrate, for the first time, the sequencing of over 1000 bases with 90% accuracy. The run required 80 minutes for separation. Sequencing of 1000 bases per column was next demonstrated on a multiple capillary instrument. Our studies revealed that linear polyacrylamide produced the longest read lengths because the hydrophilic single strand DNA had minimal interaction with the very hydrophilic linear polyacrylamide. Any interaction of the DNA with the polymer would lead to broader peaks and lower read length. Another important parameter was the molecular weight of the linear chains. High molecular weight (> 1 MDA) was important to allow the long single strand DNA to reptate through the entangled polymer matrix. In an important paper, we showed an inverse emulsion method to prepare reproducibility linear polyacrylamide polymer with an average MWT of 9MDa. This approach was used in the polymer for sequencing the human genome. Another critical factor in the successful use of capillary electrophoresis for sequencing was the sample preparation method. In the Sanger sequencing reaction, high concentration of salts and dideoxynucleotide remained. Since the sample was introduced to the capillary column by electrokinetic injection, these salt ions would be favorably injected into the column over the sequencing fragments, thus reducing the signal for longer fragments and hence reading read length. In two papers, we examined the role of

  15. [DNA sequencing technology and automatization of it].

    Science.gov (United States)

    Kraev, A S

    1991-01-01

    Precise manipulations with genetic material, typical for modern experiments in molecular biology and in new biotechnology, require a capability to determine DNA base sequence. This capability enables today to exploit specific genetic knowledge for the dissection of complex cell processes and for modulation of cell metabolism in transgenic organisms. The review focuses on such DNA sequencing technologies that are widespread in general laboratory practice. They can safely be called, with the availability of commercial reagents, industrial techniques. Modern DNA sequencing requires recurrent breakdown of large genomic DNA into smaller pieces, that are then amplified, sequenced and the initial long stretch reconstructed via overlap of small pieces. The DNA sequencing process has several steps: a DNA fragment is obtained in sufficient quantity and purity, it is converted to a form suitable for a particular sequencing method, a sequencing reaction is performed and its products fractionated; and finally the resultant data are interpreted (i.e. an autoradiograph is read into a computer memory) and a long sequence in reconstructed via overlap of short stretches. These steps are considered in separate parts; an accent is made on sequencing strategies with respect to their biological task. In the last part, possibilities for automation of sequencing experiment are considered, followed by a discussion of domestic problems in DNA sequencing.

  16. Biosensors for DNA sequence detection

    Science.gov (United States)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  17. Graphene nanodevices for DNA sequencing

    Science.gov (United States)

    Heerema, Stephanie J.; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology.

  18. DNA Sequencing Using capillary Electrophoresis

    Energy Technology Data Exchange (ETDEWEB)

    Dr. Barry Karger

    2011-05-09

    application papers of sequencing up to this level were also published in the mid 1990's. A major interest of the sequencing community has always been read length. The longer the sequence read per run the more efficient the process as well as the ability to read repeat sequences. We therefore devoted a great deal of time to studying the factors influencing read length in capillary electrophoresis, including polymer type and molecule weight, capillary column temperature, applied electric field, etc. In our initial optimization, we were able to demonstrate, for the first time, the sequencing of over 1000 bases with 90% accuracy. The run required 80 minutes for separation. Sequencing of 1000 bases per column was next demonstrated on a multiple capillary instrument. Our studies revealed that linear polyacrylamide produced the longest read lengths because the hydrophilic single strand DNA had minimal interaction with the very hydrophilic linear polyacrylamide. Any interaction of the DNA with the polymer would lead to broader peaks and lower read length. Another important parameter was the molecular weight of the linear chains. High molecular weight (> 1 MDA) was important to allow the long single strand DNA to reptate through the entangled polymer matrix. In an important paper, we showed an inverse emulsion method to prepare reproducibility linear polyacrylamide polymer with an average MWT of 9MDa. This approach was used in the polymer for sequencing the human genome. Another critical factor in the successful use of capillary electrophoresis for sequencing was the sample preparation method. In the Sanger sequencing reaction, high concentration of salts and dideoxynucleotide remained. Since the sample was introduced to the capillary column by electrokinetic injection, these salt ions would be favorably injected into the column over the sequencing fragments, thus reducing the signal for longer fragments and hence reading read length. In two papers, we examined the role of

  19. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A. (Oak Ridge National Lab., TN (United States)); Arlinghaus, H.F. (Atom Sciences, Inc., Oak Ridge, TN (United States))

    1993-01-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative altematives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNAwhereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  20. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A. [Oak Ridge National Lab., TN (United States); Arlinghaus, H.F. [Atom Sciences, Inc., Oak Ridge, TN (United States)

    1993-06-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative altematives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNAwhereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  1. Duplication in DNA Sequences

    Science.gov (United States)

    Ito, Masami; Kari, Lila; Kincaid, Zachary; Seki, Shinnosuke

    The duplication and repeat-deletion operations are the basis of a formal language theoretic model of errors that can occur during DNA replication. During DNA replication, subsequences of a strand of DNA may be copied several times (resulting in duplications) or skipped (resulting in repeat-deletions). As formal language operations, iterated duplication and repeat-deletion of words and languages have been well studied in the literature. However, little is known about single-step duplications and repeat-deletions. In this paper, we investigate several properties of these operations, including closure properties of language families in the Chomsky hierarchy and equations involving these operations. We also make progress toward a characterization of regular languages that are generated by duplicating a regular language.

  2. Nucleotide sequence and transcript organization of a region of the vaccinia virus genome which encodes a constitutively expressed gene required for DNA replication.

    Science.gov (United States)

    Roseman, N A; Hruby, D E

    1987-05-01

    A vaccinia virus (VV) gene required for DNA replication has been mapped to the left side of the 16-kilobase (kb) VV HindIII D DNA fragment by marker rescue of a DNA- temperature-sensitive mutant, ts17, using cloned fragments of the viral genome. The region of VV DNA containing the ts17 locus (3.6 kb) was sequenced. This nucleotide sequence contains one complete open reading frame (ORF) and two incomplete ORFs reading from left to right. Analysis of this region at early times revealed that transcription from the incomplete upstream ORF terminates coincidentally with the complete ORF encoding the ts17 gene product, which is directly downstream. The predicted proteins encoded by this region correlate well with polypeptides mapped by in vitro translation of hybrid-selected early mRNA. The nucleotide sequences of a 1.3-kb BglII fragment derived from ts17 and from two ts17 revertants were also determined, and the nature of the ts17 mutation was identified. S1 nuclease protection studies were carried out to determine the 5' and 3' ends of the transcripts and to examine the kinetics of expression of the ts17 gene during viral infection. The ts17 transcript is present at both early and late times postinfection, indicating that this gene is constitutively expressed. Surprisingly, the transcriptional start throughout infection occurs at the proposed late regulatory element TAA, which immediately precedes the putative initiation codon ATG. Although the biological activity of the ts17-encoded polypeptide was not identified, it was noted that in ts17-infected cells, expression of a nonlinked VV immediate-early gene (thymidine kinase) was deregulated at the nonpermissive temperature. This result may indicate that the ts17 gene product is functionally required at an early step of the VV replicative cycle.

  3. DNA Sequencing Sensors: An Overview

    Directory of Open Access Journals (Sweden)

    Jose Antonio Garrido-Cardenas

    2017-03-01

    Full Text Available The first sequencing of a complete genome was published forty years ago by the double Nobel Prize in Chemistry winner Frederick Sanger. That corresponded to the small sized genome of a bacteriophage, but since then there have been many complex organisms whose DNA have been sequenced. This was possible thanks to continuous advances in the fields of biochemistry and molecular genetics, but also in other areas such as nanotechnology and computing. Nowadays, sequencing sensors based on genetic material have little to do with those used by Sanger. The emergence of mass sequencing sensors, or new generation sequencing (NGS meant a quantitative leap both in the volume of genetic material that was able to be sequenced in each trial, as well as in the time per run and its cost. One can envisage that incoming technologies, already known as fourth generation sequencing, will continue to cheapen the trials by increasing DNA reading lengths in each run. All of this would be impossible without sensors and detection systems becoming smaller and more precise. This article provides a comprehensive overview on sensors for DNA sequencing developed within the last 40 years.

  4. Sequencing intractable DNA to close microbial genomes.

    Science.gov (United States)

    Hurt, Richard A; Brown, Steven D; Podar, Mircea; Palumbo, Anthony V; Elias, Dwayne A

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  5. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Full Text Available Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps and the Desulfovibrio africanus genome (1 intractable gap. The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  6. Structural Complexity of DNA Sequence

    Directory of Open Access Journals (Sweden)

    Cheng-Yuan Liou

    2013-01-01

    Full Text Available In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from all those statistical methods. Furthermore, this approach is compared with a topological entropy-based method for consistency and difference of the complexity results.

  7. A microfluidic device for preparing next generation DNA sequencing libraries and for automating other laboratory protocols that require one or more column chromatography steps.

    Directory of Open Access Journals (Sweden)

    Swee Jin Tan

    Full Text Available Library preparation for next-generation DNA sequencing (NGS remains a key bottleneck in the sequencing process which can be relieved through improved automation and miniaturization. We describe a microfluidic device for automating laboratory protocols that require one or more column chromatography steps and demonstrate its utility for preparing Next Generation sequencing libraries for the Illumina and Ion Torrent platforms. Sixteen different libraries can be generated simultaneously with significantly reduced reagent cost and hands-on time compared to manual library preparation. Using an appropriate column matrix and buffers, size selection can be performed on-chip following end-repair, dA tailing, and linker ligation, so that the libraries eluted from the chip are ready for sequencing. The core architecture of the device ensures uniform, reproducible column packing without user supervision and accommodates multiple routine protocol steps in any sequence, such as reagent mixing and incubation; column packing, loading, washing, elution, and regeneration; capture of eluted material for use as a substrate in a later step of the protocol; and removal of one column matrix so that two or more column matrices with different functional properties can be used in the same protocol. The microfluidic device is mounted on a plastic carrier so that reagents and products can be aliquoted and recovered using standard pipettors and liquid handling robots. The carrier-mounted device is operated using a benchtop controller that seals and operates the device with programmable temperature control, eliminating any requirement for the user to manually attach tubing or connectors. In addition to NGS library preparation, the device and controller are suitable for automating other time-consuming and error-prone laboratory protocols requiring column chromatography steps, such as chromatin immunoprecipitation.

  8. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance,and even in biology. There has been an increasing interest in unravelling the mysteries of DNA; for example, how can we distinguish coding and noncoding sequences, and the problems of classification and evolution relationship of organisms are key problems in bioinformatics. Although much research has been carried out by taking into consideration the long-range correlations in DNA sequences, and the global fractal dimension has been used in these works by other people, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view)to DNA sequence analysis. We have also used fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

  9. Improved taboo search algorithm for designing DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Kai Zhang; Jin Xu; Xiutang Geng; Jianhua Xiao; Linqiang Pan

    2008-01-01

    The design of DNA sequences is one of the most practical and important research topics in DNA computing.We adopt taboo search algorithm and improve the method for the systematic design of equal-length DNA sequences,which can satisfy certain combinatorial and thermodynamic constraints.Using taboo search algorithm,our method can avoid trapping into local optimization and can find a set of good DNA sequences satisfying required constraints.

  10. Compressing DNA sequence databases with coil

    Directory of Open Access Journals (Sweden)

    Hendy Michael D

    2008-05-01

    Full Text Available Abstract Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.

  11. Automated Template Quantification for DNA Sequencing Facilities

    Science.gov (United States)

    Ivanetich, Kathryn M.; Yan, Wilson; Wunderlich, Kathleen M.; Weston, Jennifer; Walkup, Ward G.; Simeon, Christian

    2005-01-01

    The quantification of plasmid DNA by the PicoGreen dye binding assay has been automated, and the effect of quantification of user-submitted templates on DNA sequence quality in a core laboratory has been assessed. The protocol pipets, mixes and reads standards, blanks and up to 88 unknowns, generates a standard curve, and calculates template concentrations. For pUC19 replicates at five concentrations, coefficients of variance were 0.1, and percent errors were from 1% to 7% (n = 198). Standard curves with pUC19 DNA were nonlinear over the 1 to 1733 ng/μL concentration range required to assay the majority (98.7%) of user-submitted templates. Over 35,000 templates have been quantified using the protocol. For 1350 user-submitted plasmids, 87% deviated by ≥ 20% from the requested concentration (500 ng/μL). Based on data from 418 sequencing reactions, quantification of user-submitted templates was shown to significantly improve DNA sequence quality. The protocol is applicable to all types of double-stranded DNA, is unaffected by primer (1 pmol/μL), and is user modifiable. The protocol takes 30 min, saves 1 h of technical time, and costs approximately $0.20 per unknown. PMID:16461949

  12. Information Analysis of DNA Sequences

    CERN Document Server

    Mohammed, Riyazuddin

    2010-01-01

    The problem of differentiating the informational content of coding (exons) and non-coding (introns) regions of a DNA sequence is one of the central problems of genomics. The introns are estimated to be nearly 95% of the DNA and since they do not seem to participate in the process of transcription of amino-acids, they have been termed "junk DNA." Although it is believed that the non-coding regions in genomes have no role in cell growth and evolution, demonstration that these regions carry useful information would tend to falsify this belief. In this paper, we consider entropy as a measure of information by modifying the entropy expression to take into account the varying length of these sequences. Exons are usually much shorter in length than introns; therefore the comparison of the entropy values needs to be normalized. A length correction strategy was employed using randomly generated nucleonic base strings built out of the alphabet of the same size as the exons under question. Our analysis shows that intron...

  13. Affordable Hands-On DNA Sequencing and Genotyping: An Exercise for Teaching DNA Analysis to Undergraduates

    Science.gov (United States)

    Shah, Kushani; Thomas, Shelby; Stein, Arnold

    2013-01-01

    In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…

  14. Sequence-specific recognition of DNA nanostructures.

    Science.gov (United States)

    Rusling, David A; Fox, Keith R

    2014-05-15

    DNA is the most exploited biopolymer for the programmed self-assembly of objects and devices that exhibit nanoscale-sized features. One of the most useful properties of DNA nanostructures is their ability to be functionalized with additional non-nucleic acid components. The introduction of such a component is often achieved by attaching it to an oligonucleotide that is part of the nanostructure, or hybridizing it to single-stranded overhangs that extend beyond or above the nanostructure surface. However, restrictions in nanostructure design and/or the self-assembly process can limit the suitability of these procedures. An alternative strategy is to couple the component to a DNA recognition agent that is capable of binding to duplex sequences within the nanostructure. This offers the advantage that it requires little, if any, alteration to the nanostructure and can be achieved after structure assembly. In addition, since the molecular recognition of DNA can be controlled by varying pH and ionic conditions, such systems offer tunable properties that are distinct from simple Watson-Crick hybridization. Here, we describe methodology that has been used to exploit and characterize the sequence-specific recognition of DNA nanostructures, with the aim of generating functional assemblies for bionanotechnology and synthetic biology applications.

  15. Yeast DNA sequences initiating gene expression in Escherichia coli.

    Science.gov (United States)

    Lewin, Astrid; Tran, Thi Tuyen; Jacob, Daniela; Mayer, Martin; Freytag, Barbara; Appel, Bernd

    2004-01-01

    DNA transfer between pro- and eukaryotes occurs either during natural horizontal gene transfer or as a result of the employment of gene technology. We analysed the capacity of DNA sequences from a eukaryotic donor organism (Saccharomyces cerevisiae) to serve as promoter region in a prokaryotic recipient (Escherichia coli) by creating fusions between promoterless luxAB genes from Vibrio harveyi and random DNA sequences from S. cerevisiae and measuring the luminescence of transformed E. coli. Fifty-four out of 100 randomly analysed S. cerevisiae DNA sequences caused considerable gene expression in E. coli. Determination of transcription start sites within six selected yeast sequences in E. coli confirmed the existence of bacterial -10 and -35 consensus sequences at appropriate distances upstream from transcription initiation sites. Our results demonstrate that the probability of transcription of transferred eukaryotic DNA in bacteria is extremely high and does not require the insertion of the transferred DNA behind a promoter of the recipient genome.

  16. Preparing DNA libraries for multiplexed paired-end deep sequencing for Illumina GA sequencers.

    Science.gov (United States)

    Son, Mike S; Taylor, Ronald K

    2011-02-01

    Whole-genome sequencing, also known as deep sequencing, is becoming a more affordable and efficient way to identify SNP mutations, deletions, and insertions in DNA sequences across several different strains. Two major obstacles preventing the widespread use of deep sequencers are the costs involved in services used to prepare DNA libraries for sequencing and the overall accuracy of the sequencing data. This unit describes the preparation of DNA libraries for multiplexed paired-end sequencing using the Illumina GA series sequencer. Self-preparation of DNA libraries can help reduce overall expenses, especially if optimization is required for the different samples, and use of the Illumina GA Sequencer can improve the quality of the data.

  17. A novel constraint for thermodynamically designing DNA sequences.

    Directory of Open Access Journals (Sweden)

    Qiang Zhang

    Full Text Available Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.

  18. A novel constraint for thermodynamically designing DNA sequences.

    Science.gov (United States)

    Zhang, Qiang; Wang, Bin; Wei, Xiaopeng; Zhou, Changjun

    2013-01-01

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.

  19. DNA Polymerases Drive DNA Sequencing-by-Synthesis Technologies: Both Past and Present

    Directory of Open Access Journals (Sweden)

    Cheng-Yao eChen

    2014-06-01

    Full Text Available Next-generation sequencing (NGS technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. E. coli DNA polymerase I proteolytic (Klenow fragment was originally utilized in Sanger's dideoxy chain terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public database. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third generation, real-time single molecule sequencing platform requires slightly different enzyme properties. Enterobacterial phage ⱷ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, ⱷ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore-, and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  20. Fibonacci Sequence and Supramolecular Structure of DNA.

    Science.gov (United States)

    Shabalkin, I P; Grigor'eva, E Yu; Gudkova, M V; Shabalkin, P I

    2016-05-01

    We proposed a new model of supramolecular DNA structure. Similar to the previously developed by us model of primary DNA structure [11-15], 3D structure of DNA molecule is assembled in accordance to a mathematic rule known as Fibonacci sequence. Unlike primary DNA structure, supramolecular 3D structure is assembled from complex moieties including a regular tetrahedron and a regular octahedron consisting of monomers, elements of the primary DNA structure. The moieties of the supramolecular DNA structure forming fragments of regular spatial lattice are bound via linker (joint) sequences of the DNA chain. The lattice perceives and transmits information signals over a considerable distance without acoustic aberrations. Linker sequences expand conformational space between lattice segments allowing their sliding relative to each other under the action of external forces. In this case, sliding is provided by stretching of the stacked linker sequences.

  1. Mitochondrial DNA sequence evolution in shorebird populations.

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons why mtDNA is the molecule of

  2. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group......Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...

  3. Long range correlations in DNA sequences

    CERN Document Server

    Mohanty, A K

    2002-01-01

    The so called long range correlation properties of DNA sequences are studied using the variance analyses of the density distribution of a single or a group of nucleotides in a model independent way. This new method which was suggested earlier has been applied to extract slope parameters that characterize the correlation properties for several intron containing and intron less DNA sequences. An important aspect of all the DNA sequences is the properties of complimentarity by virtue of which any two complimentary distributions (like GA is complimentary to TC or G is complimentary to ATC) have identical fluctuations at all scales although their distribution functions need not be identical. Due to this complimentarity, the famous DNA walk representation whose statistical interpretation is still unresolved is shown to be a special case of the present formalism with a density distribution corresponding to a purine or a pyrimidine group. Another interesting aspect of most of the DNA sequences is that the factorial m...

  4. Dynamics and Control of DNA Sequence Amplification

    CERN Document Server

    Marimuthu, Karthikeyan

    2014-01-01

    DNA amplification is the process of replication of a specified DNA sequence \\emph{in vitro} through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction (PCR) as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal tempe...

  5. DNA display I. Sequence-encoded routing of DNA populations.

    Directory of Open Access Journals (Sweden)

    David R Halpin

    2004-07-01

    Full Text Available Recently reported technologies for DNA-directed organic synthesis and for DNA computing rely on routing DNA populations through complex networks. The reduction of these ideas to practice has been limited by a lack of practical experimental tools. Here we describe a modular design for DNA routing genes, and routing machinery made from oligonucleotides and commercially available chromatography resins. The routing machinery partitions nanomole quantities of DNA into physically distinct subpools based on sequence. Partitioning steps can be iterated indefinitely, with worst-case yields of 85% per step. These techniques facilitate DNA-programmed chemical synthesis, and thus enable a materials biology that could revolutionize drug discovery.

  6. Visible periodicity of strong nucleosome DNA sequences.

    Science.gov (United States)

    Salih, Bilal; Tripathi, Vijay; Trifonov, Edward N

    2015-01-01

    Fifteen years ago, Lowary and Widom assembled nucleosomes on synthetic random sequence DNA molecules, selected the strongest nucleosomes and discovered that the TA dinucleotides in these strong nucleosome sequences often appear at 10-11 bases from one another or at distances which are multiples of this period. We repeated this experiment computationally, on large ensembles of natural genomic sequences, by selecting the strongest nucleosomes--i.e. those with such distances between like-named dinucleotides, multiples of 10.4 bases, the structural and sequence period of nucleosome DNA. The analysis confirmed the periodicity of TA dinucleotides in the strong nucleosomes, and revealed as well other periodic sequence elements, notably classical AA and TT dinucleotides. The matrices of DNA bendability and their simple linear forms--nucleosome positioning motifs--are calculated from the strong nucleosome DNA sequences. The motifs are in full accord with nucleosome positioning sequences derived earlier, thus confirming that the new technique, indeed, detects strong nucleosomes. Species- and isochore-specific variations of the matrices and of the positioning motifs are demonstrated. The strong nucleosome DNA sequences manifest the highest hitherto nucleosome positioning sequence signals, showing the dinucleotide periodicities in directly observable rather than in hidden form.

  7. EGNAS: an exhaustive DNA sequence design algorithm

    Directory of Open Access Journals (Sweden)

    Kick Alfred

    2012-06-01

    Full Text Available Abstract Background The molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that allows to generate sets containing a maximum number of sequences with defined properties. EGNAS (Exhaustive Generation of Nucleic Acid Sequences offers the possibility of controlling both interstrand and intrastrand properties. The guanine-cytosine content can be adjusted. Sequences can be forced to start and end with guanine or cytosine. This option reduces the risk of “fraying” of DNA strands. It is possible to limit cross hybridizations of a defined length, and to adjust the uniqueness of sequences. Self-complementarity and hairpin structures of certain length can be avoided. Sequences and subsequences can optionally be forbidden. Furthermore, sequences can be designed to have minimum interactions with predefined strands and neighboring sequences. Results The algorithm is realized in a C++ program. TAG sequences can be generated and combined with primers for single-base extension reactions, which were described for multiplexed genotyping of single nucleotide polymorphisms. Thereby, possible foldback through intrastrand interaction of TAG-primer pairs can be limited. The design of sequences for specific attachment of molecular constructs to DNA origami is presented. Conclusions We developed a new software tool called EGNAS for the design of unique nucleic acid sequences. The presented exhaustive algorithm allows to generate greater sets of sequences than with previous software and equal constraints. EGNAS is freely available for noncommercial use at http://www.chm.tu-dresden.de/pc6/EGNAS.

  8. In the Staphylococcus aureus two-component system sae, the response regulator SaeR binds to a direct repeat sequence and DNA binding requires phosphorylation by the sensor kinase SaeS.

    Science.gov (United States)

    Sun, Fei; Li, Chunling; Jeong, Dowon; Sohn, Changmo; He, Chuan; Bae, Taeok

    2010-04-01

    Staphylococcus aureus uses the SaeRS two-component system to control the expression of many virulence factors such as alpha-hemolysin and coagulase; however, the molecular mechanism of this signaling has not yet been elucidated. Here, using the P1 promoter of the sae operon as a model target DNA, we demonstrated that the unphosphorylated response regulator SaeR does not bind to the P1 promoter DNA, while its C-terminal DNA binding domain alone does. The DNA binding activity of full-length SaeR could be restored by sensor kinase SaeS-induced phosphorylation. Phosphorylated SaeR is more resistant to digestion by trypsin, suggesting conformational changes. DNase I footprinting assays revealed that the SaeR protection region in the P1 promoter contains a direct repeat sequence (GTTAAN(6)GTTAA [where N is any nucleotide]). This sequence is critical to the binding of phosphorylated SaeR. Mutational changes in the repeat sequence greatly reduced both the in vitro binding of SaeR and the in vivo function of the P1 promoter. From these results, we concluded that SaeR recognizes the direct repeat sequence as a binding site and that binding requires phosphorylation by SaeS.

  9. Nanopore DNA sequencing using kinetic proofreading

    Science.gov (United States)

    Ling, Xinsheng

    We propose a method of DNA sequencing by combining the physical method of nanopore electrical measurements and Southern's sequencing-by-hybridization. The new key ingredient, essential to both lowering the costs and increasing the precision, is an asymmetric nanopore sandwich device capable of measuring the DNA hybridization probe twice separated by a designed waiting time. Those incorrect probes appearing only once in nanopore ionic current traces are discriminated from the correct ones that appear twice. This method of discrimination is similar to the principle of kinetic proofreading proposed by Hopfield and Ninio in gene transcription and translation processes. An error analysis is of this nanopore kinetic proofreading (nKP) technique for DNA sequencing is carried out in comparison with the most precise 3' dideoxy termination method developed by Sanger. Nanopore DNA sequencing using kinetic proofreading.

  10. Extracting biological knowledge from DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    De La Vega, F.M. [CINVESTAV-IPN (Mexico); Thieffry, D. [Universite Libre de Bruxelles, Rhode-Saint-Genese (Belgium)]|[Universidad Nacional Autonoma de Mexico, Morelos (Mexico); Collado-Vides, J. [Universidad Nacional Autonoma de Mexico, Morelos (Mexico)

    1996-12-31

    This session describes the elucidation of information from dna sequences and what challenges computational biologists face in their task of summarizing and deciphering the human genome. Techniques discussed include methods from statistics, information theory, artificial intelligence and linguistics. 1 ref.

  11. An approach to sequence DNA without tagging

    Science.gov (United States)

    Niu, Sanjun; Saraf, Ravi F.

    2002-10-01

    Microarray technology is playing an increasingly important role in biology and medicine and its application to genomics for gene expression analysis has already reached the market with a variety of commercially available instruments. In these combinatorial analysis methods, known probe single-strand DNA (ssDNA) 'primers' are attached in clusters of typically 100 µm × 100 µm pixels. Each pixel of the array has a slightly different sequence. On exposure to 'unknown' target ssDNA, the pixels with the right complementary probe ssDNA sequence convert to double-stranded DNA (dsDNA) by a hybridization reaction. To transduct the conversion of the pixel to dsDNA, the target ssDNA is labelled with a photoluminescent tag during the polymerase chain reaction (PCR) amplification process. Due to the statistical distribution of the tags in the target ssDNA, it becomes significantly difficult to implement these methods as a diagnostic tool in a pathology laboratory. A method to sequence DNA without tagging the molecule is developed. The fabrication process is compatible with current microelectronics and (emerging) soft-material fabrication technologies, allowing the method to be integrable with micro-electromechanical systems (MEMS) and lab-on-a-chip devices. An estimated sensitivity of 10-12 g on a 1 cm2 device area is obtained.

  12. gargammel: a sequence simulator for ancient DNA.

    Science.gov (United States)

    Renaud, Gabriel; Hanghøj, Kristian; Willerslev, Eske; Orlando, Ludovic

    2016-10-29

    Ancient DNA has emerged as a remarkable tool to infer the history of extinct species and past populations. However, many of its characteristics, such as extensive fragmentation, damage and contamination, can influence downstream analyses. To help investigators measure how these could impact their analyses in silico, we have developed gargammel, a package that simulates ancient DNA fragments given a set of known reference genomes. Our package simulates the entire molecular process from post-mortem DNA fragmentation and DNA damage to experimental sequencing errors, and reproduces most common bias observed in ancient DNA datasets.

  13. Inconsistencies in Neanderthal genomic DNA sequences.

    Directory of Open Access Journals (Sweden)

    Jeffrey D Wall

    2007-10-01

    Full Text Available Two recently published papers describe nuclear DNA sequences that were obtained from the same Neanderthal fossil. Our reanalyses of the data from these studies show that they are not consistent with each other and point to serious problems with the data quality in one of the studies, possibly due to modern human DNA contaminants and/or a high rate of sequencing errors.

  14. Mitochondrial DNA sequence evolution in shorebird populations

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons

  15. Nanogrid rolling circle DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Church, George M.; Porreca, Gregory J.; Shendure, Jay; Rosenbaum, Abraham Meir

    2017-04-18

    The present invention relates to methods for sequencing a polynucleotide immobilized on an array having a plurality of specific regions each having a defined diameter size, including synthesizing a concatemer of a polynucleotide by rolling circle amplification, wherein the concatemer has a cross-sectional diameter greater than the diameter of a specific region, immobilizing the concatemer to the specific region to make an immobilized concatemer, and sequencing the immobilized concatemer.

  16. Nanopore-CMOS Interfaces for DNA Sequencing.

    Science.gov (United States)

    Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

    2016-08-06

    DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces.

  17. Osmylated DNA, a novel concept for sequencing DNA using nanopores

    Science.gov (United States)

    Kanavarioti, Anastassia

    2015-03-01

    Saenger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically comparable to each other which leads to small differences in current obstruction. ‘Base calling’ becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable, if only the bases were two, and chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5-C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base, creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1), and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV-vis assay to measure extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complementary of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA.

  18. Electrochemical measurement for analysis of DNA sequence

    Energy Technology Data Exchange (ETDEWEB)

    Cho, S.B.; Hong, J.S.; Pak, J.H. [Korea University, Seoul (Korea); Kim, Y.M. [National Institute of Health, Seoul (Korea)

    2002-02-01

    One of the important roles of a DNA chip is the capability of detecting genetic diseases and mutations by analyzing DNA sequence. For a successful electrochemical genotyping, several aspects should be considered including the chemical treatment of electrode surface, DNA immobilization on electrode, hybridization, choice of an intercalator to be selectively bound to double standed DNA, and an equipment for detecting and analyzing the output singal. Au was used as the electrode material, 2-mercaptoethanol was used for linking DNA to Au electrode, and methylene blue was used as an indicator that can be bound to a double stranded DNA selectively. From the analysis of reductive current of this indicator that was bound to a double stranded DNA on an electrode, a normal double stranded DNA was able to be distinguished from a single stranded DNA in just a few seconds. Also, it was found that the peak reduction current of indicator is proportional to the concentration of target DNA to be hybridized with probe DNA. Therefore, it is possible to realize a simple and cheap DNA sensor using the electrochemical measurement for genotyping. (author). 20 refs., 8 figs., 1 tab.

  19. Dynamics and control of DNA sequence amplification

    Energy Technology Data Exchange (ETDEWEB)

    Marimuthu, Karthikeyan [Department of Chemical Engineering and Center for Advanced Process Decision-Making, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213 (United States); Chakrabarti, Raj, E-mail: raj@pmc-group.com, E-mail: rajc@andrew.cmu.edu [Department of Chemical Engineering and Center for Advanced Process Decision-Making, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213 (United States); Division of Fundamental Research, PMC Advanced Technology, Mount Laurel, New Jersey 08054 (United States)

    2014-10-28

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.

  20. Female-specific DNA sequences in geese.

    Science.gov (United States)

    Huang, M C; Lin, W C; Horng, Y M; Rouvier, R; Huang, C W

    2003-07-01

    1. The OPAE random primers (Operon Technologies, Inc., CA) were used for random amplified polymorphic DNA (RAPD) fingerprinting in Chinese, White Roman and Landaise geese. One of these primers, OPAE-06, produced a 938-bp sex-specific fragment in all females and in no males of Chinese geese only. 2. A novel female-specific DNA sequence in Chinese goose was cloned and sequenced. Two primers, CGSex-F and CGSex-R, were designed in order to amplify a 912-bp sex-specific polymerase chain reaction (PCR) fragment on genomic DNA from female geese. 3. It was shown that a simple and effective PCR-based sexing technique could be used in the three goose breeds studied. 4. Nucleotide sequencing of the sex-specific fragments in White Roman and Landaise geese was performed and sequence differences were observed among these three breeds.

  1. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian eWu

    2012-11-01

    Full Text Available Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362Mb, 361 Mb sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of chloroplast genome.

  2. DNA Sequencing in Cultural Heritage.

    Science.gov (United States)

    Vai, Stefania; Lari, Martina; Caramelli, David

    2016-02-01

    During the last three decades, DNA analysis on degraded samples revealed itself as an important research tool in anthropology, archaeozoology, molecular evolution, and population genetics. Application on topics such as determination of species origin of prehistoric and historic objects, individual identification of famous personalities, characterization of particular samples important for historical, archeological, or evolutionary reconstructions, confers to the paleogenetics an important role also for the enhancement of cultural heritage. A really fast improvement in methodologies in recent years led to a revolution that permitted recovering even complete genomes from highly degraded samples with the possibility to go back in time 400,000 years for samples from temperate regions and 700,000 years for permafrozen remains and to analyze even more recent material that has been subjected to hard biochemical treatments. Here we propose a review on the different methodological approaches used so far for the molecular analysis of degraded samples and their application on some case studies.

  3. cDNA sequence quality data - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Budding yeast cDNA sequencing project cDNA sequence quality data Data detail Data name cDNA sequence quality... data Description of data contents Phred's quality score. PHD format, one file to a single cDNA data, and co...ription Download License Update History of This Database Site Policy | Contact Us cDNA sequence quality data - Budding yeast cDNA sequencing project | LSDB Archive ...

  4. DNA sequencing by synthesis with degenerate primers

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The degenerate primer-based sequencing Was developed by a synthesis method(DP-SBS)for high-throughput DNA sequencing,in which a set of degenerate primers are hybridized on the arrayed DNA templates and extended by DNA polymerase on microarrays.In this method,adifferent set of degenerate primers containing a give nnumber(n)of degenerate nucleotides at the 3'-ends were annealed to the sequenced templates that were immobilized on the solid surface.The nucleotides(n+1)on the template sequences were determined by detecting the incorporation of fluorescent labeled nucleotides.The fluorescent labeled nucleotide was incorporated into the primer in a base-specific manner after the enzymatic primer extension reactions and nine-base length were read out accurately.The main advanmge of the DP-SBS is that the method only uses very conventional biochemical reagents and avoids the complicated special chemical reagents for removing the labeled nucleotides and reactivating the primer for further extension.From the present study,it is found that the DP-SBS method is reliable,simple,and cost-effective for laboratory-sequencing a large amount of short DNA fragments.

  5. Glycome mapping on DNA sequencing equipment.

    Science.gov (United States)

    Laroy, Wouter; Contreras, Roland; Callewaert, Nico

    2006-01-01

    Here we provide a detailed protocol for the analysis of protein-linked glycans on DNA sequencing equipment. This protocol satisfies the glyco-analytical needs of many projects and can form the basis of 'glycomics' studies, in which robustness, high throughput, high sensitivity and reliable quantification are of paramount importance. The protocol routinely resolves isobaric glycan stereoisomers, which is much more difficult by mass spectrometry (MS). Earlier methods made use of polyacrylamide gel-based sequencers, but we have now adapted the technique to multicapillary DNA sequencers, which represent the state of the art today. In addition, we have integrated an option for HPLC-based fractionation of highly anionic 8-amino-1,3,6-pyrenetrisulfonic acid (APTS)-labeled glycans before rapid capillary electrophoretic profiling. This option facilitates either two-dimensional profiling of complex glycan mixtures and exoglycosidase sequencing, or MS analysis of particular compounds of interest rather than of the total pool of glycans in a sample.

  6. The complete DNA sequence of vaccinia virus.

    Science.gov (United States)

    Goebel, S J; Johnson, G P; Perkus, M E; Davis, S W; Winslow, J P; Paoletti, E

    1990-11-01

    The complete DNA sequence of the genome of vaccinia virus has been determined. The genome consisted of 191,636 bp with a base composition of 66.6% A + T. We have identified 198 "major" protein-coding regions and 65 overlapping "minor" regions, for a total of 263 potential genes. Genes encoded by the virus were located by examination of DNA sequence characteristics and compared with existing vaccinia virus mapping analyses, sequence data, and transcription data. These genes were found to be compactly organized along the genome with relatively few regions of noncoding sequences. Whereas several similarities to proteins of known function were discerned, the function of the majority of proteins encoded by these open reading frames is as yet undetermined.

  7. The DNA sequence specificity of bleomycin cleavage in a systematically altered DNA sequence.

    Science.gov (United States)

    Gautam, Shweta D; Chen, Jon K; Murray, Vincent

    2017-08-01

    Bleomycin is an anti-tumour agent that is clinically used to treat several types of cancers. Bleomycin cleaves DNA at specific DNA sequences and recent genome-wide DNA sequencing specificity data indicated that the sequence 5'-RTGT*AY (where T* is the site of bleomycin cleavage, R is G/A and Y is T/C) is preferentially cleaved by bleomycin in human cells. Based on this DNA sequence, we constructed a plasmid clone to explore this bleomycin cleavage preference. By systematic variation of single nucleotides in the 5'-RTGT*AY sequence, we were able to investigate the effect of nucleotide changes on bleomycin cleavage efficiency. We observed that the preferred consensus DNA sequence for bleomycin cleavage in the plasmid clone was 5'-YYGT*AW (where W is A/T). The most highly cleaved sequence was 5'-TCGT*AT and, in fact, the seven most highly cleaved sequences conformed to the consensus sequence 5'-YYGT*AW. A comparison with genome-wide results was also performed and while the core sequence was similar in both environments, the surrounding nucleotides were different.

  8. POSA : Perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, JA; Jungerius, BJ; Groenen, MA

    2004-01-01

    Background: Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  9. POSA: perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, J.A.; Jungerius, B.J.; Groenen, M.A.M.

    2004-01-01

    Background - Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  10. POSA : Perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, JA; Jungerius, BJ; Groenen, MA

    2004-01-01

    Background: Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  11. DNA Sequence Alignment during Homologous Recombination.

    Science.gov (United States)

    Greene, Eric C

    2016-05-27

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination.

  12. DNA Sequence Alignment during Homologous Recombination*

    Science.gov (United States)

    Greene, Eric C.

    2016-01-01

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination. PMID:27129270

  13. Autoantigenic proteins that bind recombinogenic sequences in Epstein-Barr virus and cellular DNA.

    OpenAIRE

    1994-01-01

    We have identified conserved autoantigenic cellular proteins that bind to G-rich sequence motifs in recombinogenic regions of Epstein-Barr virus (EBV) DNA. This binding activity, called TRBP, recognizes the EBV terminal repeats, a locus responsible for interconversion of linear and circular EBV DNA. We found that TRBP also binds to EBV DNA sequences involved in deletion of EBNA2, a gene product required for immortalization. We show that TRBP binds sequences present in repetitive cellular DNA,...

  14. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    Science.gov (United States)

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dips sticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such at TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera which gives a decrease in signal.

  15. The first determination of DNA sequence of a specific gene.

    Science.gov (United States)

    Inouye, Masayori

    2016-05-10

    How and when the first DNA sequence of a gene was determined? In 1977, F. Sanger came up with an innovative technology to sequence DNA by using chain terminators, and determined the entire DNA sequence of the 5375-base genome of bacteriophage φX 174 (Sanger et al., 1977). While this Sanger's achievement has been recognized as the first DNA sequencing of genes, we had determined DNA sequence of a gene, albeit a partial sequence, 11 years before the Sanger's DNA sequence (Okada et al., 1966).

  16. DNA sequencing by nanopores: advances and challenges

    Science.gov (United States)

    Agah, Shaghayegh; Zheng, Ming; Pasquali, Matteo; Kolomeisky, Anatoly B.

    2016-10-01

    Developing inexpensive and simple DNA sequencing methods capable of detecting entire genomes in short periods of time could revolutionize the world of medicine and technology. It will also lead to major advances in our understanding of fundamental biological processes. It has been shown that nanopores have the ability of single-molecule sensing of various biological molecules rapidly and at a low cost. This has stimulated significant experimental efforts in developing DNA sequencing techniques by utilizing biological and artificial nanopores. In this review, we discuss recent progress in the nanopore sequencing field with a focus on the nature of nanopores and on sensing mechanisms during the translocation. Current challenges and alternative methods are also discussed.

  17. Rapid DNA sequencing by horizontal ultrathin gel electrophoresis.

    Science.gov (United States)

    Brumley, R L; Smith, L M

    1991-01-01

    A horizontal polyacrylamide gel electrophoresis apparatus has been developed that decreases the time required to separate the DNA fragments produced in enzymatic sequencing reactions. The configuration of this apparatus and the use of circulating coolant directly under the glass plates result in heat exchange that is approximately nine times more efficient than passive thermal transfer methods commonly used. Bubble-free gels as thin as 25 microns can be routinely cast on this device. The application to these ultrathin gels of electric fields up to 250 volts/cm permits the rapid separation of multiple DNA sequencing reactions in parallel. When used in conjunction with 32P-based autoradiography, the DNA bands appear substantially sharper than those obtained in conventional electrophoresis. This increased sharpness permits shorter autoradiographic exposure times and longer sequence reads. Images PMID:1870968

  18. Preparation of next-generation sequencing libraries from damaged DNA.

    Science.gov (United States)

    Briggs, Adrian W; Heyn, Patricia

    2012-01-01

    Next-generation sequencing (NGS) has revolutionized ancient DNA research, especially when combined with high-throughput target enrichment methods. However, attaining high sequencing depth and accuracy from samples often remains problematic due to the damaged state of ancient DNA, in particular the extremely low copy number of ancient DNA and the abundance of uracil residues derived from cytosine deamination that lead to miscoding errors. It is therefore critical to use a highly efficient procedure for conversion of a raw DNA extract into an adaptor-ligated sequencing library, and equally important to reduce errors from uracil residues. We present a protocol for NGS library preparation that allows highly efficient conversion of DNA fragments into an adaptor-ligated form. The protocol incorporates an option to remove the vast majority of uracil miscoding lesions as part of the library preparation process. The procedure requires only two spin column purification steps and no gel purification or bead handling. Starting from an aliquot of DNA extract, a finished, highly amplified library can be generated in 5 h, or under 3 h if uracil removal is not required.

  19. Mitochondrial DNA sequence variation in Greeks.

    Science.gov (United States)

    Kouvatsi, A; Karaiskou, N; Apostolidis, A; Kirmizidis, G

    2001-12-01

    Mitochondrial DNA (mtDNA) control region sequences were determined in 54 unrelated Greeks, coming from different regions in Greece, for both segments HVR-I and HVR-II. Fifty-two different mtDNA haplotypes were revealed, one of which was shared by three individuals. A very low heterogeneity was found among Greek regions. No one cluster of lineages was specific to individuals coming from a certain region. The average pairwise difference distribution showed a value of 7.599. The data were compared with that for other European or neighbor populations (British, French, Germans, Tuscans, Bulgarians, and Turks). The genetic trees that were constructed revealed homogeneity between Europeans. Median networks revealed that most of the Greek mtDNA haplotypes are clustered to the five known haplogroups and that a number of haplotypes are shared among Greeks and other European and Near Eastern populations.

  20. Local Renyi entropic profiles of DNA sequences

    Directory of Open Access Journals (Sweden)

    Vinga Susana

    2007-10-01

    Full Text Available Abstract Background In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM. Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. Conclusion The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.

  1. New stopping criteria for segmenting DNA sequences

    CERN Document Server

    Li, W

    2001-01-01

    We propose a solution on the stopping criterion in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on Bayesian Information Criterion (BIC) in the model selection framework. When this stopping criterion is applied to a left telomere sequence of yeast Saccharomyces cerevisiae and the complete genome sequence of bacterium Escherichia coli, borders of biologically meaningful units were identified (e.g. subtelomeric units, replication origin, and replication terminus), and a more reasonable number of domains was obtained. We also introduce a measure called segmentation strength which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.

  2. Dialects of the DNA uptake sequence in Neisseriaceae.

    Directory of Open Access Journals (Sweden)

    Stephan A Frye

    2013-04-01

    Full Text Available In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS, which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS-dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5'-CTG-3' is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS-dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic

  3. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure tha...

  4. Sequence-selective DNA recognition with peptide-bisbenzamidine conjugates.

    Science.gov (United States)

    Sánchez, Mateo I; Vázquez, Olalla; Vázquez, M Eugenio; Mascareñas, José L

    2013-07-22

    Transcription factors (TFs) are specialized proteins that play a key role in the regulation of genetic expression. Their mechanism of action involves the interaction with specific DNA sequences, which usually takes place through specialized domains of the protein. However, achieving an efficient binding usually requires the presence of the full protein. This is the case for bZIP and zinc finger TF families, which cannot interact with their target sites when the DNA binding fragments are presented as isolated monomers. Herein it is demonstrated that the DNA binding of these monomeric peptides can be restored when conjugated to aza-bisbenzamidines, which are readily accessible molecules that interact with A/T-rich sites by insertion into their minor groove. Importantly, the fluorogenic properties of the aza-benzamidine unit provide details of the DNA interaction that are eluded in electrophoresis mobility shift assays (EMSA). The hybrids based on the GCN4 bZIP protein preferentially bind to composite sequences containing tandem bisbenzamidine-GCN4 binding sites (TCAT⋅AAATT). Fluorescence reverse titrations show an interesting multiphasic profile consistent with the formation of competitive nonspecific complexes at low DNA/peptide ratios. On the other hand, the conjugate with the DNA binding domain of the zinc finger protein GAGA binds with high affinity (KD≈12 nM) and specificity to a composite AATTT⋅GAGA sequence containing both the bisbenzamidine and the TF consensus binding sites.

  5. The PCNA interaction protein box sequence in Rad54 is an integral part of its ATPase domain and is required for efficient DNA repair and recombination

    DEFF Research Database (Denmark)

    Burgess, Rebecca C; Sebesta, Marek; Sisakova, Alexandra

    2013-01-01

    Rad54 is an ATP-driven translocase involved in the genome maintenance pathway of homologous recombination (HR). Although its activity has been implicated in several steps of HR, its exact role(s) at each step are still not fully understood. We have identified a new interaction between Rad54...... and the replicative DNA clamp, proliferating cell nuclear antigen (PCNA). This interaction was only mildly weakened by the mutation of two key hydrophobic residues in the highly-conserved PCNA interaction motif (PIP-box) of Rad54 (Rad54-AA). Intriguingly, the rad54-AA mutant cells displayed sensitivity to DNA damage...

  6. POSA: Perl Objects for DNA Sequencing Data Analysis

    Directory of Open Access Journals (Sweden)

    Jungerius Bart J

    2004-08-01

    Full Text Available Abstract Background Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide modules that need advanced informatics skills to allow implementation in pipelines. Results Here we present POSA, a pair of new perl objects that describe DNA sequence traces and Phrap contig assemblies in detail. Methods included in POSA include basecalling with quality scores (by Phred, contig assembly (by Phrap, generation of primer3 input and automated SNP annotation (by PolyPhred. Although easily implemented by users with only limited programming experience, these objects considerabily reduce hands-on analysis time compared to using the Staden package for extracting sequence information from raw sequencing files and for SNP discovery. Conclusions The POSA objects allow a flexible and easy design, implementation and usage of perl-based pipelines to handle and analyze DNA sequencing data, while requiring only minor programming skills.

  7. DNA sequence analysis of newly formed telomeres in yeast.

    Science.gov (United States)

    Wang, S S; Pluta, A F; Zakian, V A

    1989-01-01

    A plasmid can be maintained in linear form in baker's yeast if it bears telomeric sequences at each end. Linear plasmids bearing cloned telomeric C4A4 repeats at one end (test end) and a natural DNA terminus with approximately 300 bps of C4A2 repeats at the other or control end were introduced by transformation into yeast. Test-end termini of 28 to 112 bps supported telomere formation. During telomere formation, C4A2 repeats were often transferred to test-end termini. To determine in greater detail the fate of test-end sequences on these plasmids after propagation in yeast, test-end telomeres were subcloned into E. coli and sequenced. DNA sequencing established a number of points about the molecular events involved in telomere formation in yeast. The results suggest that there are at least two mechanisms for telomere formation in yeast. One is mediated by a recombination event that requires neither a long stretch of homology nor the RAD52 gene product. The other mechanism is by addition of C1-3A repeats to the termini of linear DNA molecules. The telomeric sequence required to support C1-3A addition need not be at the very end of a molecule for telomere formation.

  8. Adenovirus sequences required for replication in vivo.

    OpenAIRE

    Wang, K.; Pearson, G D

    1985-01-01

    We have studied the in vivo replication properties of plasmids carrying deletion mutations within cloned adenovirus terminal sequences. Deletion mapping located the adenovirus DNA replication origin entirely within the first 67 bp of the adenovirus inverted terminal repeat. This region could be further subdivided into two functional domains: a minimal replication origin and an adjacent auxillary region which boosted the efficiency of replication by more than 100-fold. The minimal origin occup...

  9. Next-generation sequencing offers new insights into DNA degradation

    DEFF Research Database (Denmark)

    Overballe-Petersen, Søren; Orlando, Ludovic Antoine Alexandre; Willerslev, Eske

    2012-01-01

    The processes underlying DNA degradation are central to various disciplines, including cancer research, forensics and archaeology. The sequencing of ancient DNA molecules on next-generation sequencing platforms provides direct measurements of cytosine deamination, depurination and fragmentation r...

  10. Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences

    Science.gov (United States)

    2008-07-01

    COVERED (From - To) 6 Jul 08 – 11 Jul 08 4. TITLE AND SUBTITLE RANDOM CODING BOUNDS FOR DNA CODES BASED ON FIBONACCI ENSEMBLES OF DNA SEQUENCES ... sequences which are generalizations of the Fibonacci sequences . 15. SUBJECT TERMS DNA Codes, Fibonacci Ensembles, DNA Computing, Code Optimization 16...coding bound on the rate of DNA codes is proved. To obtain the bound, we use some ensembles of DNA sequences which are generalizations of the Fibonacci

  11. An oligonucleotide hybridization approach to DNA sequencing.

    Science.gov (United States)

    Khrapko, K R; Lysov YuP; Khorlyn, A A; Shick, V V; Florentiev, V L; Mirzabekov, A D

    1989-10-09

    We have proposed a DNA sequencing method based on hybridization of a DNA fragment to be sequenced with the complete set of fixed-length oligonucleotides (e.g., 4(8) = 65,536 possible 8-mers) immobilized individually as dots of a 2-D matrix [(1989) Dokl. Akad. Nauk SSSR 303, 1508-1511]. It was shown that the list of hybridizing octanucleotides is sufficient for the computer-assisted reconstruction of the structures for 80% of random-sequence fragments up to 200 bases long, based on the analysis of the octanucleotide overlapping. Here a refinement of the method and some experimental data are presented. We have performed hybridizations with oligonucleotides immobilized on a glass plate, and obtained their dissociation curves down to heptanucleotides. Other approaches, e.g., an additional hybridization of short oligonucleotides which continuously extend duplexes formed between the fragment and immobilized oligonucleotides, should considerably increase either the probability of unambiguous reconstruction, or the length of reconstructed sequences, or decrease the size of immobilized oligonucleotides.

  12. Modified Genetic Algorithm for DNA Sequence Assembly by Shotgun and Hybridization Sequencing Techniques

    Directory of Open Access Journals (Sweden)

    Prof.Narayan Kumar Sahu

    2012-09-01

    Full Text Available Since the advent of rapid DNA sequencing methods in 1976, scientists have had the problem of inferring DNA sequences from sequenced fragments. Shotgun sequencing is a well-established biological and computational method used in practice. Many conventional algorithms for shotgun sequencing are based on the notion of pair wise fragment overlap. While shotgun sequencing infers a DNA sequence given the sequences of overlapping fragments, a recent and complementary method, called sequencing by hybridization (SBH, infers a DNA sequence given the set of oligomers that represents all sub words of some fixed length, k. In this paper, we propose a new computer algorithm for DNA sequence assembly that combines in a novel way the techniques of both shotgun and SBH methods. Based on our preliminary investigations, the algorithm promises- to be very fast and practical for DNA sequence assembly [1].

  13. Nucleosome DNA sequence structure of isochores

    Directory of Open Access Journals (Sweden)

    Trifonov Edward N

    2011-04-01

    Full Text Available Abstract Background Significant differences in G+C content between different isochore types suggest that the nucleosome positioning patterns in DNA of the isochores should be different as well. Results Extraction of the patterns from the isochore DNA sequences by Shannon N-gram extension reveals that while the general motif YRRRRRYYYYYR is characteristic for all isochore types, the dominant positioning patterns of the isochores vary between TAAAAATTTTTA and CGGGGGCCCCCG due to the large differences in G+C composition. This is observed in human, mouse and chicken isochores, demonstrating that the variations of the positioning patterns are largely G+C dependent rather than species-specific. The species-specificity of nucleosome positioning patterns is revealed by dinucleotide periodicity analyses in isochore sequences. While human sequences are showing CG periodicity, chicken isochores display AG (CT periodicity. Mouse isochores show very weak CG periodicity only. Conclusions Nucleosome positioning pattern as revealed by Shannon N-gram extension is strongly dependent on G+C content and different in different isochores. Species-specificity of the pattern is subtle. It is reflected in the choice of preferentially periodical dinucleotides.

  14. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    Energy Technology Data Exchange (ETDEWEB)

    Winston Chen, C.H.; Taranenko, N.I.; Zhu, Y.F.; Chung, C.N.; Allman, S.L.

    1997-03-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, the authors recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Snager`s enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. The preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic disease can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, the authors applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  15. Sequence dependent hole evolution in DNA.

    Science.gov (United States)

    Lakhno, V D

    2004-06-01

    The paper examines thedynamical behavior of a radical cation(G(+*)) generated in adouble stranded DNA for differentoligonucleotide sequences. The resonancehole tunneling through an oligonucleotidesequence is studied by the method ofnumerical integration of self-consistentquantum-mechanical equations. The holemotion is considered quantum mechanicallyand nucleotide base oscillations aretreated classically. The results obtaineddemonstrate a strong dependence of chargetransfer on the type of nucleotidesequence. The rates of the hole transferare calculated for different nucleotidesequences and compared with experimentaldata on the transfer from (G(+*))to a GGG unit.

  16. The PCNA interaction protein box sequence in Rad54 is an integral part of its ATPase domain and is required for efficient DNA repair and recombination.

    Directory of Open Access Journals (Sweden)

    Rebecca C Burgess

    Full Text Available Rad54 is an ATP-driven translocase involved in the genome maintenance pathway of homologous recombination (HR. Although its activity has been implicated in several steps of HR, its exact role(s at each step are still not fully understood. We have identified a new interaction between Rad54 and the replicative DNA clamp, proliferating cell nuclear antigen (PCNA. This interaction was only mildly weakened by the mutation of two key hydrophobic residues in the highly-conserved PCNA interaction motif (PIP-box of Rad54 (Rad54-AA. Intriguingly, the rad54-AA mutant cells displayed sensitivity to DNA damage and showed HR defects similar to the null mutant, despite retaining its ability to interact with HR proteins and to be recruited to HR foci in vivo. We therefore surmised that the PCNA interaction might be impaired in vivo and was unable to promote repair synthesis during HR. Indeed, the Rad54-AA mutant was defective in primer extension at the MAT locus as well as in vitro, but additional biochemical analysis revealed that this mutant also had diminished ATPase activity and an inability to promote D-loop formation. Further mutational analysis of the putative PIP-box uncovered that other phenotypically relevant mutants in this domain also resulted in a loss of ATPase activity. Therefore, we have found that although Rad54 interacts with PCNA, the PIP-box motif likely plays only a minor role in stabilizing the PCNA interaction, and rather, this conserved domain is probably an extension of the ATPase domain III.

  17. Transverse Electronic Signature of DNA for Electronic Sequencing

    Science.gov (United States)

    Xu, Mingsheng; Endres, Robert G.; Arakawa, Yasuhiko

    In recent years, the proliferation of large-scale DNA sequencing projects for applications in clinical medicine and health care has driven the search for new methods that could reduce the time and cost. The commonly used Sanger sequencing method relies on the chemistry to read the bases in DNA and is far too slow and expensive for reading personal genetic codes. There were earlier attempts to sequence DNA by directly visualizing the nucleotide composition of the DNA molecules by scanning tunneling microscopy (STM). However, sequencing DNA based on directly imaging DNA's atomic structure has not yet been successful. In Chap. 9, Xu, Endres, and Arakawa report a potential physical alternative by detecting unique transverse electronic signatures of DNA bases using ultrahigh vacuum STM. Supported by the principles, calculations and statistical analyses, these authors argue that it would be possible to directly sequence DNA by the STM-based technology without any modification of the DNA.

  18. Environmental DNA sequencing primers for eutardigrades and bdelloid rotifers

    Directory of Open Access Journals (Sweden)

    Martin Andrew P

    2009-12-01

    Full Text Available Abstract Background The time it takes to isolate individuals from environmental samples and then extract DNA from each individual is one of the problems with generating molecular data from meiofauna such as eutardigrades and bdelloid rotifers. The lack of consistent morphological information and the extreme abundance of these classes makes morphological identification of rare, or even common cryptic taxa a large and unwieldy task. This limits the ability to perform large-scale surveys of the diversity of these organisms. Here we demonstrate a culture-independent molecular survey approach that enables the generation of large amounts of eutardigrade and bdelloid rotifer sequence data directly from soil. Our PCR primers, specific to the 18s small-subunit rRNA gene, were developed for both eutardigrades and bdelloid rotifers. Results The developed primers successfully amplified DNA of their target organism from various soil DNA extracts. This was confirmed by both the BLAST similarity searches and phylogenetic analyses. Tardigrades showed much better phylogenetic resolution than bdelloids. Both groups of organisms exhibited varying levels of endemism. Conclusion The development of clade-specific primers for characterizing eutardigrades and bdelloid rotifers from environmental samples should greatly increase our ability to characterize the composition of these taxa in environmental samples. Environmental sequencing as shown here differs from other molecular survey methods in that there is no need to pre-isolate the organisms of interest from soil in order to amplify their DNA. The DNA sequences obtained from methods that do not require culturing can be identified post-hoc and placed phylogenetically as additional closely related sequences are obtained from morphologically identified conspecifics. Our non-cultured environmental sequence based approach will be able to provide a rapid and large-scale screening of the presence, absence and diversity of

  19. Environmental DNA sequencing primers for eutardigrades and bdelloid rotifers

    Science.gov (United States)

    2009-01-01

    Background The time it takes to isolate individuals from environmental samples and then extract DNA from each individual is one of the problems with generating molecular data from meiofauna such as eutardigrades and bdelloid rotifers. The lack of consistent morphological information and the extreme abundance of these classes makes morphological identification of rare, or even common cryptic taxa a large and unwieldy task. This limits the ability to perform large-scale surveys of the diversity of these organisms. Here we demonstrate a culture-independent molecular survey approach that enables the generation of large amounts of eutardigrade and bdelloid rotifer sequence data directly from soil. Our PCR primers, specific to the 18s small-subunit rRNA gene, were developed for both eutardigrades and bdelloid rotifers. Results The developed primers successfully amplified DNA of their target organism from various soil DNA extracts. This was confirmed by both the BLAST similarity searches and phylogenetic analyses. Tardigrades showed much better phylogenetic resolution than bdelloids. Both groups of organisms exhibited varying levels of endemism. Conclusion The development of clade-specific primers for characterizing eutardigrades and bdelloid rotifers from environmental samples should greatly increase our ability to characterize the composition of these taxa in environmental samples. Environmental sequencing as shown here differs from other molecular survey methods in that there is no need to pre-isolate the organisms of interest from soil in order to amplify their DNA. The DNA sequences obtained from methods that do not require culturing can be identified post-hoc and placed phylogenetically as additional closely related sequences are obtained from morphologically identified conspecifics. Our non-cultured environmental sequence based approach will be able to provide a rapid and large-scale screening of the presence, absence and diversity of Bdelloidea and Eutardigrada in

  20. A new DNA sequence assembly program.

    Science.gov (United States)

    Bonfield, J K; Smith, K f; Staden, R

    1995-01-01

    We describe the Genome Assembly Program (GAP), a new program for DNA sequence assembly. The program is suitable for large and small projects, a variety of strategies and can handle data from a range of sequencing instruments. It retains the useful components of our previous work, but includes many novel ideas and methods. Many of these methods have been made possible by the program's completely new, and highly interactive, graphical user interface. The program provides many visual clues to the current state of a sequencing project and allows users to interact in intuitive and graphical ways with their data. The program has tools to display and manipulate the various types of data that help to solve and check difficult assemblies, particularly those in repetitive genomes. We have introduced the following new displays: the Contig Selector, the Contig Comparator, the Template Display, the Restriction Enzyme Map and the Stop Codon Map. We have also made it possible to have any number of Contig Editors and Contig Joining Editors running simultaneously even on the same contig. The program also includes a new 'Directed Assembly' algorithm and routines for automatically detecting unfinished segments of sequence, to which it suggests experimental solutions. Images PMID:8559656

  1. Understanding Long-Range Correlations in DNA sequences

    CERN Document Server

    Li, W; Kaneko, K; Wentian Li; Thomas G Marr; Kunihiko Kaneko

    1994-01-01

    Abstract: In this paper, we review the literature on statistical long-range correlation in DNA sequences. We examine the current evidence for these correlations, and conclude that a mixture of many length scales (including some relatively long ones) in DNA sequences is responsible for the observed 1/f-like spectral component. We note the complexity of the correlation structure in DNA sequences. The observed complexity often makes it hard, or impossible, to decompose the sequence into a few statistically stationary regions. We suggest that, based on the complexity of DNA sequences, a fruitful approach to understand long-range correlation is to model duplication, and other rearrangement processes, in DNA sequences. One model, called ``expansion-modification system", contains only point duplication and point mutation. Though simplistic, this model is able to generate sequences with 1/f spectra. We emphasize the importance of DNA duplication in its contribution to the observed long-range correlation in DNA sequen...

  2. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    Science.gov (United States)

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  3. ProteDNA: a sequence-based predictor of sequence-specific DNA-binding residues in transcription factors

    OpenAIRE

    2009-01-01

    This article presents the design of a sequence-based predictor named ProteDNA for identifying the sequence-specific binding residues in a transcription factor (TF). Concerning protein–DNA interactions, there are two types of binding mechanisms involved, namely sequence-specific binding and nonspecific binding. Sequence-specific bindings occur between protein sidechains and nucleotide bases and correspond to sequence-specific recognition of genes. Therefore, sequence-specific bindings are esse...

  4. Bacterial DNA Sequence Compression Models Using Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Armando J. Pinho

    2013-08-01

    Full Text Available It is widely accepted that the advances in DNA sequencing techniques have contributed to an unprecedented growth of genomic data. This fact has increased the interest in DNA compression, not only from the information theory and biology points of view, but also from a practical perspective, since such sequences require storage resources. Several compression methods exist, and particularly, those using finite-context models (FCMs have received increasing attention, as they have been proven to effectively compress DNA sequences with low bits-per-base, as well as low encoding/decoding time-per-base. However, the amount of run-time memory required to store high-order finite-context models may become impractical, since a context-order as low as 16 requires a maximum of 17.2 x 109 memory entries. This paper presents a method to reduce such a memory requirement by using a novel application of artificial neural networks (ANN to build such probabilistic models in a compact way and shows how to use them to estimate the probabilities. Such a system was implemented, and its performance compared against state-of-the art compressors, such as XM-DNA (expert model and FCM-Mx (mixture of finite-context models , as well as with general-purpose compressors. Using a combination of order-10 FCM and ANN, similar encoding results to those of FCM, up to order-16, are obtained using only 17 megabytes of memory, whereas the latter, even employing hash-tables, uses several hundreds of megabytes.

  5. DNA Sequence Optimization Based on Continuous Particle Swarm Optimization for Reliable DNA Computing and DNA Nanotechnology

    Directory of Open Access Journals (Sweden)

    N. K. Khalid

    2008-01-01

    Full Text Available Problem statement: In DNA based computation and DNA nanotechnology, the design of good DNA sequences has turned out to be an essential problem and one of the most practical and important research topics. Basically, the DNA sequence design problem is a multi-objective problem and it can be evaluated using four objective functions, namely, Hmeasure, similarity, continuity and hairpin. Approach: There are several ways to solve multi-objective problem, however, in order to evaluate the correctness of PSO algorithm in DNA sequence design, this problem is converted into single objective problem. Particle Swarm Optimization (PSO is proposed to minimize the objective in the problem, subjected to two constraints: melting temperature and GCcontent. A model is developed to present the DNA sequence design based on PSO computation. Results: Based on experiments and researches done, 20 particles are used in the implementation of the optimization process, where the average values and the standard deviation for 100 runs are shown along with comparison to other existing methods. Conclusion: The results achieve verified that PSO can suitably solves the DNA sequence design problem using the proposed method and model, comparatively better than other approaches.

  6. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    Full Text Available DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT – the bionic wavelet transform (BWT – for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA sequences. Results show that BWT performs better than traditional WT in presenting greater energy distribution. This new BWT method should be useful for the detection of the latent structural features in future DNA sequence analysis.

  7. Sequences sufficient for programming imprinted germline DNA methylation defined.

    Directory of Open Access Journals (Sweden)

    Yoon Jung Park

    Full Text Available Epigenetic marks are fundamental to normal development, but little is known about signals that dictate their placement. Insights have been provided by studies of imprinted loci in mammals, where monoallelic expression is epigenetically controlled. Imprinted expression is regulated by DNA methylation programmed during gametogenesis in a sex-specific manner and maintained after fertilization. At Rasgrf1 in mouse, paternal-specific DNA methylation on a differential methylation domain (DMD requires downstream tandem repeats. The DMD and repeats constitute a binary switch regulating paternal-specific expression. Here, we define sequences sufficient for imprinted methylation using two transgenic mouse lines: One carries the entire Rasgrf1 cluster (RC; the second carries only the DMD and repeats (DR from Rasgrf1. The RC transgene recapitulated all aspects of imprinting seen at the endogenous locus. DR underwent proper DNA methylation establishment in sperm and erasure in oocytes, indicating the DMD and repeats are sufficient to program imprinted DNA methylation in germlines. Both transgenes produce a DMD-spanning pit-RNA, previously shown to be necessary for imprinted DNA methylation at the endogenous locus. We show that when pit-RNA expression is controlled by the repeats, it regulates DNA methylation in cis only and not in trans. Interestingly, pedigree history dictated whether established DR methylation patterns were maintained after fertilization. When DR was paternally transmitted followed by maternal transmission, the unmethylated state that was properly established in the female germlines could not be maintained. This provides a model for transgenerational epigenetic inheritance in mice.

  8. Significance of satellite DNA revealed by conservation of a widespread repeat DNA sequence among angiosperms.

    Science.gov (United States)

    Mehrotra, Shweta; Goel, Shailendra; Raina, Soom Nath; Rajpal, Vijay Rani

    2014-08-01

    The analysis of plant genome structure and evolution requires comprehensive characterization of repetitive sequences that make up the majority of plant nuclear DNA. In the present study, we analyzed the nature of pCtKpnI-I and pCtKpnI-II tandem repeated sequences, reported earlier in Carthamus tinctorius. Interestingly, homolog of pCtKpnI-I repeat sequence was also found to be present in widely divergent families of angiosperms. pCtKpnI-I showed high sequence similarity but low copy number among various taxa of different families of angiosperms analyzed. In comparison, pCtKpnI-II was specific to the genus Carthamus and was not present in any other taxa analyzed. The molecular structure of pCtKpnI-I was analyzed in various unrelated taxa of angiosperms to decipher the evolutionary conserved nature of the sequence and its possible functional role.

  9. Solid-Phase Purification of Synthetic DNA Sequences.

    Science.gov (United States)

    Grajkowski, Andrzej; Cieslak, Jacek; Beaucage, Serge L

    2016-08-05

    Although high-throughput methods for solid-phase synthesis of DNA sequences are currently available for synthetic biology applications and technologies for large-scale production of nucleic acid-based drugs have been exploited for various therapeutic indications, little has been done to develop high-throughput procedures for the purification of synthetic nucleic acid sequences. An efficient process for purification of phosphorothioate and native DNA sequences is described herein. This process consists of functionalizing commercial aminopropylated silica gel with aminooxyalkyl functions to enable capture of DNA sequences carrying a 5'-siloxyl ether linker with a "keto" function through an oximation reaction. Deoxyribonucleoside phosphoramidites functionalized with the 5'-siloxyl ether linker were prepared in yields of 75-83% and incorporated last into the solid-phase assembly of DNA sequences. Capture of nucleobase- and phosphate-deprotected DNA sequences released from the synthesis support is demonstrated to proceed near quantitatively. After shorter than full-length DNA sequences were washed from the capture support, the purified DNA sequences were released from this support upon treatment with tetra-n-butylammonium fluoride in dry DMSO. The purity of released DNA sequences exceeds 98%. The scalability and high-throughput features of the purification process are demonstrated without sacrificing purity of the DNA sequences.

  10. Affinity purification of sequence-specific DNA binding proteins.

    OpenAIRE

    1986-01-01

    We describe a method for affinity purification of sequence-specific DNA binding proteins that is fast and effective. Complementary chemically synthesized oligodeoxynucleotides that contain a recognition site for a sequence-specific DNA binding protein are annealed and ligated to give oligomers. This DNA is then covalently coupled to Sepharose CL-2B with cyanogen bromide to yield the affinity resin. A partially purified protein fraction is combined with competitor DNA and subsequently passed t...

  11. Choosing the best heuristic for seeded alignment of DNA sequences

    Directory of Open Access Journals (Sweden)

    Buhler Jeremy

    2006-03-01

    Full Text Available Abstract Background Seeded alignment is an important component of algorithms for fast, large-scale DNA similarity search. A good seed matching heuristic can reduce the execution time of genomic-scale sequence comparison without degrading sensitivity. Recently, many types of seed have been proposed to improve on the performance of traditional contiguous seeds as used in, e.g., NCBI BLASTN. Choosing among these seed types, particularly those that use information besides the presence or absence of matching residue pairs, requires practical guidance based on a rigorous comparison, including assessment of sensitivity, specificity, and computational efficiency. This work performs such a comparison, focusing on alignments in DNA outside widely studied coding regions. Results We compare seeds of several types, including those allowing transition mutations rather than matches at fixed positions, those allowing transitions at arbitrary positions ("BLASTZ" seeds, and those using a more general scoring matrix. For each seed type, we use an extended version of our Mandala seed design software to choose seeds with optimized sensitivity for various levels of specificity. Our results show that, on a test set biased toward alignments of noncoding DNA, transition information significantly improves seed performance, while finer distinctions between different types of mismatches do not. BLASTZ seeds perform especially well. These results depend on properties of our test set that are not shared by EST-based test sets with a strong bias toward coding DNA. Conclusion Practical seed design requires careful attention to the properties of the alignments being sought. For noncoding DNA sequences, seeds that use transition information, especially BLASTZ-style seeds, are particularly useful. The Mandala seed design software can be found at http://www.cse.wustl.edu/~yanni/mandala/.

  12. DNA sequence chromatogram browsing using JAVA and CORBA.

    Science.gov (United States)

    Parsons, J D; Buehler, E; Hillier, L

    1999-03-01

    DNA sequence chromatograms (traces) are the primary data source for all large-scale genomic and expressed sequence tags (ESTs) sequencing projects. Access to the sequencing trace assists many later analyses, for example contig assembly and polymorphism detection, but obtaining and using traces is problematic. Traces are not collected and published centrally, they are much larger than the base calls derived from them, and viewing them requires the interactivity of a local graphical client with local data. To provide efficient global access to DNA traces, we developed a client/server system based on flexible Java components integrated into other applications including an applet for use in a WWW browser and a stand-alone trace viewer. Client/server interaction is facilitated by CORBA middleware which provides a well-defined interface, a naming service, and location independence. [The software is packaged as a Jar file available from the following URL: http://www.ebi.ac.uk/jparsons. Links to working examples of the trace viewers can be found at http://corba.ebi.ac.uk/EST. All the Washington University mouse EST traces are available for browsing at the same URL.

  13. SWORDS: A statistical tool for analysing large DNA sequences

    Indian Academy of Sciences (India)

    Probal Chaudhuri; Sandip Das

    2002-02-01

    In this article, we present some simple yet effective statistical techniques for analysing and comparing large DNA sequences. These techniques are based on frequency distributions of DNA words in a large sequence, and have been packaged into a software called SWORDS. Using sequences available in public domain databases housed in the Internet, we demonstrate how SWORDS can be conveniently used by molecular biologists and geneticists to unmask biologically important features hidden in large sequences and assess their statistical significance.

  14. DNA Sequence Determination by Hybridization: A Strategy for Efficient Large-Scale Sequencing

    Science.gov (United States)

    Drmanac, R.; Drmanac, S.; Strezoska, Z.; Paunesku, T.; Labat, I.; Zeremski, M.; Snoddy, J.; Funkhouser, W. K.; Koop, B.; Hood, L.; Crkvenjakov, R.

    1993-06-01

    The concept of sequencing by hybridization (SBH) makes use of an array of all possible n-nucleotide oligomers (n-mers) to identify n-mers present in an unknown DNA sequence. Computational approaches can then be used to assemble the complete sequence. As a validation of this concept, the sequences of three DNA fragments, 343 base pairs in length, were determined with octamer oligonucleotides. Possible applications of SBH include physical mapping (ordering) of overlapping DNA clones, sequence checking, DNA fingerprinting comparisons of normal and disease-causing genes, and the identification of DNA fragments with particular sequence motifs in complementary DNA and genomic libraries. The SBH techniques may accelerate the mapping and sequencing phases of the human genome project.

  15. Extraction of high-molecular-weight genomic DNA for long-read sequencing of single molecules.

    Science.gov (United States)

    Mayjonade, Baptiste; Gouzy, Jérôme; Donnadieu, Cécile; Pouilly, Nicolas; Marande, William; Callot, Caroline; Langlade, Nicolas; Muños, Stéphane

    2016-10-01

    De novo sequencing of complex genomes is one of the main challenges for researchers seeking high-quality reference sequences. Many de novo assemblies are based on short reads, producing fragmented genome sequences. Third-generation sequencing, with read lengths >10 kb, will improve the assembly of complex genomes, but these techniques require high-molecular-weight genomic DNA (gDNA), and gDNA extraction protocols used for obtaining smaller fragments for short-read sequencing are not suitable for this purpose. Methods of preparing gDNA for bacterial artificial chromosome (BAC) libraries could be adapted, but these approaches are time-consuming, and commercial kits for these methods are expensive. Here, we present a protocol for rapid, inexpensive extraction of high-molecular-weight gDNA from bacteria, plants, and animals. Our technique was validated using sunflower leaf samples, producing a mean read length of 12.6 kb and a maximum read length of 80 kb.

  16. Optimized Protocol for Simple Extraction of High-Quality Genomic DNA from Clostridium difficile for Whole-Genome Sequencing.

    Science.gov (United States)

    Sim, James Heng Chiak; Anikst, Victoria; Lohith, Akshar; Pourmand, Nader; Banaei, Niaz

    2015-07-01

    Successful sequencing of the Clostridium difficile genome requires high-quality genomic DNA (gDNA) as the starting material. gDNA extraction using conventional methods is laborious. We describe here an optimized method for the simple extraction of C. difficile gDNA using the QIAamp DNA minikit, which yielded high-quality sequence reads on the Illumina MiSeq platform. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  17. New method to study DNA sequences: the languages of evolution.

    Science.gov (United States)

    Spinelli, Gino; Mayer-Foulkes, David

    2008-04-01

    Recently, several authors have reported statistical evidence for deterministic dynamics in the flux of genetic information, suggesting that evolution involves the emergence and maintenance of a fractal landscape in DNA chains. Here we examine the idea that motif repetition lies at the origin of these statistical properties of DNA. To analyse repetition patterns we apply a modification of the BDS statistic, devised to analyze complex economic dynamics and adapted here to DNA sequence analysis. This provides a new method to detect structured signals in genetic information. We compare naturally occurring DNA sequences along the evolutionary tree with randomly generated sequences and also with simulated sequences with repetition motifs. For easier understanding, we also define a new statistic for a DNA sequence that constitutes a specific fingerprint. The new methods are applied to exon and intron DNA sequences, finding specific statistical differences. Moreover, by analysing DNA sequences of different species from Bacteria to Man, we explore the evolution of these linguistic DNA features along the evolutionary tree. The results are consistent with the idea that all the flux of DNA information need not be random, but may be structured along the evolutionary tree. The implications for evolutionary theory are discussed.

  18. True single-molecule DNA sequencing of a pleistocene horse bone

    DEFF Research Database (Denmark)

    Orlando, Ludovic Antoine Alexandre; Ginolhac, Aurélien; Raghavan, Maanasa;

    2011-01-01

    Second-generation sequencing platforms have revolutionized the field of ancient DNA, opening access to complete genomes of past individuals and extinct species. However, these platforms are dependent on library construction and amplification steps that may result in sequences that do not reflect......-preserved Pleistocene horse bone using the Helicos HeliScope and Illumina GAIIx platforms, respectively. We find that the percentage of endogenous DNA sequences derived from the horse is higher among the Helicos data than Illumina data. This result indicates that the molecular biology tools used to generate sequencing...... libraries of ancient DNA molecules as required for second-generation sequencing introduce biases into the data, that reduce the efficiency of the sequencing process and limit our ability to fully explore the molecular complexity of ancient DNA extracts. We demonstrate that simple modifications...

  19. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.;

    2008-01-01

    -analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are comprised of a combination of both Neanderthal and modern human DNA....

  20. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    NARCIS (Netherlands)

    Buschmann, Tilo; Bystrykh, Leonid V.

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fraction of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called multi

  1. DNA sequence-dependent mechanics and protein-assisted bending in repressor-mediated loop formation

    Science.gov (United States)

    Boedicker, James Q.; Garcia, Hernan G.; Johnson, Stephanie; Phillips, Rob

    2013-12-01

    As the chief informational molecule of life, DNA is subject to extensive physical manipulations. The energy required to deform double-helical DNA depends on sequence, and this mechanical code of DNA influences gene regulation, such as through nucleosome positioning. Here we examine the sequence-dependent flexibility of DNA in bacterial transcription factor-mediated looping, a context for which the role of sequence remains poorly understood. Using a suite of synthetic constructs repressed by the Lac repressor and two well-known sequences that show large flexibility differences in vitro, we make precise statistical mechanical predictions as to how DNA sequence influences loop formation and test these predictions using in vivo transcription and in vitro single-molecule assays. Surprisingly, sequence-dependent flexibility does not affect in vivo gene regulation. By theoretically and experimentally quantifying the relative contributions of sequence and the DNA-bending protein HU to DNA mechanical properties, we reveal that bending by HU dominates DNA mechanics and masks intrinsic sequence-dependent flexibility. Such a quantitative understanding of how mechanical regulatory information is encoded in the genome will be a key step towards a predictive understanding of gene regulation at single-base pair resolution.

  2. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd and 4th y undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  3. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd and 4th y undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  4. Mesoscopic Model for Free Energy Landscape Analysis of DNA sequences

    CERN Document Server

    Tapia-Rojo, R; Mazo, J J; Falo, F; 10.1103/PhysRevE.86.021908

    2012-01-01

    A mesoscopic model which allows us to identify and quantify the strength of binding sites in DNA sequences is proposed. The model is based on the Peyrard-Bishop-Dauxois model for the DNA chain coupled to a Brownian particle which explores the sequence interacting more importantly with open base pairs of the DNA chain. We apply the model to promoter sequences of different organisms. The free energy landscape obtained for these promoters shows a complex structure that is strongly connected to their biological behavior. The analysis method used is able to quantify free energy differences of sites within genome sequences.

  5. Cloning and sequencing of mouse GABA transporter complementary DNA

    Institute of Scientific and Technical Information of China (English)

    TAMANTHONYC.W.; LIHEGUO; 等

    1994-01-01

    A cDNA encoding the mouse GABA transporter has been isolated and sequenced.The results show that the mouse GABA transporter cDNA differs from that of the rat by 60 base pairs at the open reading frame region but the deduced amino acid sequences of the two cDNAs are identical and both composed of 599 amino acids.However,the amino acid sequence is different from the sequence deduced from a recently published mouse GABA transporter cDNA.

  6. Effects of Sequence on Transmission Properties of DNA Molecules

    Institute of Scientific and Technical Information of China (English)

    DONG Rui-Xin; YAN Xun-Ling; YANG Bing

    2008-01-01

    A double helix model of charge transport in DNA molecule is given and the transmission spectra of four DNA sequences are obtained. The calculated results show that the transmission characteristics of DNA are not only related to the longitudinal transport but also to the transverse transport of molecule. The periodic sequence with the same composition has stronger conduction ability. With the increasing of bases composition, the conductive ability reduces, but the weight of θ direction rises in charge transfer.

  7. PCR master mixes harbour murine DNA sequences. Caveat emptor!

    Directory of Open Access Journals (Sweden)

    Philip W Tuke

    Full Text Available BACKGROUND: XMRV is the most recently described retrovirus to be found in Man, firstly in patients with prostate cancer (PC and secondly in 67% of patients with chronic fatigue syndrome (CFS and 3.7% of controls. Both disease associations remain contentious. Indeed, a recent publication has concluded that "XMRV is unlikely to be a human pathogen". Subsequently related but different polytropic MLV (pMLV sequences were also reported from the blood of 86.5% of patients with CFS. and 6.8% of controls. Consequently we decided to investigate blood donors for evidence of XMRV/pMLV. METHODOLOGY/PRINCIPAL FINDINGS: Testing of cDNA prepared from the whole blood of 80 random blood donors, generated gag PCR signals from two samples (7C and 9C. These had previously tested negative for XMRV by two other PCR based techniques. To test whether the PCR mix was the source of these sequences 88 replicates of water were amplified using Invitrogen Platinum Taq (IPT and Applied Biosystems Taq Gold LD (ABTG. Four gag sequences (2D, 3F, 7H, 12C were generated with the IPT, a further sequence (12D by ABTG re-amplification of an IPT first round product. Sequence comparisons revealed remarkable similarities between these sequences, endogeous MLVs and the pMLV sequences reported in patients with CFS. CONCLUSIONS/SIGNIFICANCE: Methodologies for the detection of viruses highly homologous to endogenous murine viruses require special caution as the very reagents used in the detection process can be a source of contamination and at a level where it is not immediately apparent. It is suggested that such contamination is likely to explain the apparent presence of pMLV in CFS.

  8. mapDamage: testing for damage patterns in ancient DNA sequences.

    Science.gov (United States)

    Ginolhac, Aurelien; Rasmussen, Morten; Gilbert, M Thomas P; Willerslev, Eske; Orlando, Ludovic

    2011-08-01

    Ancient DNA extracts consist of a mixture of contaminant DNA molecules, most often originating from environmental microbes, and endogenous fragments exhibiting substantial levels of DNA damage. The latter introduce specific nucleotide misincorporations and DNA fragmentation signatures in sequencing reads that could be advantageously used to argue for sequence validity. mapDamage is a Perl script that computes nucleotide misincorporation and fragmentation patterns using next-generation sequencing reads mapped against a reference genome. The Perl script outputs are further automatically processed in embedded R script in order to detect typical patterns of genuine ancient DNA sequences. The Perl script mapDamage is freely available with documentation and example files at http://geogenetics.ku.dk/all_literature/mapdamage/. The script requires prior installation of the SAMtools suite and R environment and has been validated on both GNU/Linux and MacOSX operating systems.

  9. Code domains in tandem repetitive DNA sequence structures.

    Science.gov (United States)

    Vogt, P

    1992-10-01

    Traditionally, many people doing research in molecular biology attribute coding properties to a given DNA sequence if this sequence contains an open reading frame for translation into a sequence of amino acids. This protein coding capability of DNA was detected about 30 years ago. The underlying genetic code is highly conserved and present in every biological species studied so far. Today, it is obvious that DNA has a much larger coding potential for other important tasks. Apart from coding for specific RNA molecules such as rRNA, snRNA and tRNA molecules, specific structural and sequence patterns of the DNA chain itself express distinct codes for the regulation and expression of its genetic activity. A chromatin code has been defined for phasing of the histone-octamer protein complex in the nucleosome. A translation frame code has been shown to exist that determines correct triplet counting at the ribosome during protein synthesis. A loop code seems to organize the single stranded interaction of the nascent RNA chain with proteins during the splicing process, and a splicing code phases successive 5' and 3' splicing sites. Most of these DNA codes are not exclusively based on the primary DNA sequence itself, but also seem to include specific features of the corresponding higher order structures. Based on the view that these various DNA codes are genetically instructive for specific molecular interactions or processes, important in the nucleus during interphase and during cell division, the coding capability of tandem repetitive DNA sequences has recently been reconsidered.

  10. ProteDNA: a sequence-based predictor of sequence-specific DNA-binding residues in transcription factors.

    Science.gov (United States)

    Chu, Wen-Yi; Huang, Yu-Feng; Huang, Chun-Chin; Cheng, Yi-Sheng; Huang, Chien-Kang; Oyang, Yen-Jen

    2009-07-01

    This article presents the design of a sequence-based predictor named ProteDNA for identifying the sequence-specific binding residues in a transcription factor (TF). Concerning protein-DNA interactions, there are two types of binding mechanisms involved, namely sequence-specific binding and nonspecific binding. Sequence-specific bindings occur between protein sidechains and nucleotide bases and correspond to sequence-specific recognition of genes. Therefore, sequence-specific bindings are essential for correct gene regulation. In this respect, ProteDNA is distinctive since it has been designed to identify sequence-specific binding residues. In order to accommodate users with different application needs, ProteDNA has been designed to operate under two modes, namely, the high-precision mode and the balanced mode. According to the experiments reported in this article, under the high-precision mode, ProteDNA has been able to deliver precision of 82.3%, specificity of 99.3%, sensitivity of 49.8% and accuracy of 96.5%. Meanwhile, under the balanced mode, ProteDNA has been able to deliver precision of 60.8%, specificity of 97.6%, sensitivity of 60.7% and accuracy of 95.4%. ProteDNA is available at the following websites: http://protedna.csbb.ntu.edu.tw/, http://protedna.csie.ntu.edu.tw/, http://bio222.esoe.ntu.edu.tw/ProteDNA/.

  11. Discovering motifs in ranked lists of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Eran Eden

    2007-03-01

    Full Text Available Computational methods for discovery of sequence elements that are enriched in a target set compared with a background set are fundamental in molecular biology research. One example is the discovery of transcription factor binding motifs that are inferred from ChIP-chip (chromatin immuno-precipitation on a microarray measurements. Several major challenges in sequence motif discovery still require consideration: (i the need for a principled approach to partitioning the data into target and background sets; (ii the lack of rigorous models and of an exact p-value for measuring motif enrichment; (iii the need for an appropriate framework for accounting for motif multiplicity; (iv the tendency, in many of the existing methods, to report presumably significant motifs even when applied to randomly generated data. In this paper we present a statistical framework for discovering enriched sequence elements in ranked lists that resolves these four issues. We demonstrate the implementation of this framework in a software application, termed DRIM (discovery of rank imbalanced motifs, which identifies sequence motifs in lists of ranked DNA sequences. We applied DRIM to ChIP-chip and CpG methylation data and obtained the following results. (i Identification of 50 novel putative transcription factor (TF binding sites in yeast ChIP-chip data. The biological function of some of them was further investigated to gain new insights on transcription regulation networks in yeast. For example, our discoveries enable the elucidation of the network of the TF ARO80. Another finding concerns a systematic TF binding enhancement to sequences containing CA repeats. (ii Discovery of novel motifs in human cancer CpG methylation data. Remarkably, most of these motifs are similar to DNA sequence elements bound by the Polycomb complex that promotes histone methylation. Our findings thus support a model in which histone methylation and CpG methylation are mechanistically linked

  12. DNA Shape Dominates Sequence Affinity in Nucleosome Formation

    Science.gov (United States)

    Freeman, Gordon S.; Lequieu, Joshua P.; Hinckley, Daniel M.; Whitmer, Jonathan K.; de Pablo, Juan J.

    2014-10-01

    Nucleosomes provide the basic unit of compaction in eukaryotic genomes, and the mechanisms that dictate their position at specific locations along a DNA sequence are of central importance to genetics. In this Letter, we employ molecular models of DNA and proteins to elucidate various aspects of nucleosome positioning. In particular, we show how DNA's histone affinity is encoded in its sequence-dependent shape, including subtle deviations from the ideal straight B-DNA form and local variations of minor groove width. By relying on high-precision simulations of the free energy of nucleosome complexes, we also demonstrate that, depending on DNA's intrinsic curvature, histone binding can be dominated by bending interactions or electrostatic interactions. More generally, the results presented here explain how sequence, manifested as the shape of the DNA molecule, dominates molecular recognition in the problem of nucleosome positioning.

  13. A MapReduce Framework for DNA Sequencing Data Processing

    Directory of Open Access Journals (Sweden)

    Samy Ghoneimy

    2016-12-01

    Full Text Available Genomics and Next Generation Sequencers (NGS like Illumina Hiseq produce data in the order of ‎‎200 billion base pairs in a single one-week run for a 60x human genome coverage, which ‎requires modern high-throughput experimental technologies that can ‎only be tackled with high performance computing (HPC and specialized software algorithms called ‎‎“short read aligners”. This paper focuses on the implementation of the DNA sequencing as a set of MapReduce programs that will accept a DNA data set as a FASTQ file and finally generate a VCF (variant call format file, which has variants for a given DNA data set. In this paper MapReduce/Hadoop along with Burrows-Wheeler Aligner (BWA, Sequence Alignment/Map (SAM ‎tools, are fully utilized to provide various utilities for manipulating alignments, including sorting, merging, indexing, ‎and generating alignments. The Map-Sort-Reduce process is designed to be suited for a Hadoop framework in ‎which each cluster is a traditional N-node Hadoop cluster to utilize all of the Hadoop features like HDFS, program ‎management and fault tolerance. The Map step performs multiple instances of the short read alignment algorithm ‎‎(BoWTie that run in parallel in Hadoop. The ordered list of the sequence reads are used as input tuples and the ‎output tuples are the alignments of the short reads. In the Reduce step many parallel instances of the Short ‎Oligonucleotide Analysis Package for SNP (SOAPsnp algorithm run in the cluster. Input tuples are sorted ‎alignments for a partition and the output tuples are SNP calls. Results are stored via HDFS, and then archived in ‎SOAPsnp format. ‎ The proposed framework enables extremely fast discovering somatic mutations, inferring population genetical ‎parameters, and performing association tests directly based on sequencing data without explicit genotyping or ‎linkage-based imputation. It also demonstrate that this method achieves comparable

  14. An Optimal Seed Based Compression Algorithm for DNA Sequences

    Directory of Open Access Journals (Sweden)

    Pamela Vinitha Eric

    2016-01-01

    Full Text Available This paper proposes a seed based lossless compression algorithm to compress a DNA sequence which uses a substitution method that is similar to the LempelZiv compression scheme. The proposed method exploits the repetition structures that are inherent in DNA sequences by creating an offline dictionary which contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is at par or better than the existing lossless DNA sequence compression algorithms.

  15. DNA splice site sequences clustering method for conservativeness analysis

    Institute of Scientific and Technical Information of China (English)

    Quanwei Zhang; Qinke Peng; Tao Xu

    2009-01-01

    DNA sequences that are near to splice sites have remarkable conservativeness,and many researchers have contributed to the prediction of splice site.In order to mine the underlying biological knowledge,we analyze the conservativeness of DNA splice site adjacent sequences by clustering.Firstly,we propose a kind of DNA splice site sequences clustering method which is based on DBSCAN,and use four kinds of dissimilarity calculating methods.Then,we analyze the conservative feature of the clustering results and the experimental data set.

  16. Thermodynamics of sequence-specific binding of PNA to DNA

    DEFF Research Database (Denmark)

    Ratilainen, T; Holmén, A; Tuite, E

    2000-01-01

    For further characterization of the hybridization properties of peptide nucleic acids (PNAs), the thermodynamics of hybridization of mixed sequence PNA-DNA duplexes have been studied. We have characterized the binding of PNA to DNA in terms of binding affinity (perfectly matched duplexes) and seq......For further characterization of the hybridization properties of peptide nucleic acids (PNAs), the thermodynamics of hybridization of mixed sequence PNA-DNA duplexes have been studied. We have characterized the binding of PNA to DNA in terms of binding affinity (perfectly matched duplexes...

  17. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    Science.gov (United States)

    Nielsen, Peter E.

    2008-10-01

    Peptide nucleic acids (PNA) can be designed to target duplex DNA with very high sequence specificity and efficiency via various binding modes. We have designed three domain PNA clamps, that bind stably to predefined decameric homopurine targets in large dsDNA molecules and via a third PNA domain sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technology of protein dsDNA structures.

  18. Comparative analysis of Lactobacillus plantarum WCFS1 transcriptomes using DNA microarray and next generation sequencing technologies

    NARCIS (Netherlands)

    Leimena, M.M.; Wels, M.; Bongers, R.; Smid, E.J.; Zoetendal, E.G.; Kleerebezem, M.

    2012-01-01

    RNA sequencing is starting to compete with the use of DNA microarrays for transcription analysis in eukaryotes as well as in prokaryotes. Application of RNA sequencing in prokaryotes requires additional steps in the RNA preparation procedure to increase the relative abundance of mRNA and cannot

  19. Current-voltage characteristics of double-strand DNA sequences

    Science.gov (United States)

    Bezerril, L. M.; Moreira, D. A.; Albuquerque, E. L.; Fulco, U. L.; de Oliveira, E. L.; de Sousa, J. S.

    2009-09-01

    We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotides distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro one) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is presented here with the same first neighbors pair correlations of the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.

  20. Current-voltage characteristics of double-strand DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bezerril, L.M.; Moreira, D.A. [Departamento de Fisica, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Albuquerque, E.L., E-mail: eudenilson@dfte.ufrn.b [Departamento de Fisica, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Fulco, U.L. [Departamento de Biofisica e Farmacologia, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Oliveira, E.L. de; Sousa, J.S. de [Departamento de Fisica, Universidade Federal do Ceara, 60455-760, Fortaleza-CE (Brazil)

    2009-09-07

    We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotides distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro one) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is presented here with the same first neighbors pair correlations of the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.

  1. High-throughput sequencing for the identification of binding molecules from DNA-encoded chemical libraries.

    Science.gov (United States)

    Buller, Fabian; Steiner, Martina; Scheuermann, Jörg; Mannocci, Luca; Nissen, Ina; Kohler, Manuel; Beisel, Christian; Neri, Dario

    2010-07-15

    DNA-encoded chemical libraries are large collections of small organic molecules, individually coupled to DNA fragments that serve as amplifiable identification bar codes. The isolation of specific binders requires a quantitative analysis of the distribution of DNA fragments in the library before and after capture on an immobilized target protein of interest. Here, we show how Illumina sequencing can be applied to the analysis of DNA-encoded chemical libraries, yielding over 10 million DNA sequence tags per flow-lane. The technology can be used in a multiplex format, allowing the encoding and subsequent sequencing of multiple selections in the same experiment. The sequence distributions in DNA-encoded chemical library selections were found to be similar to the ones obtained using 454 technology, thus reinforcing the concept that DNA sequencing is an appropriate avenue for the decoding of library selections. The large number of sequences obtained with the Illumina method now enables the study of very large DNA-encoded chemical libraries (>500,000 compounds) and reduces decoding costs.

  2. Characteristics of alternating current hopping conductivity in DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Ma Song-Shan; Xu Hui; Wang Huan-You; Guo Rui

    2009-01-01

    This paper presents a model to describe alternating current (AC) conductivity of DNA sequences,in which DNA is considered as a one-dimensional (1D) disordered system,and electrons transport via hopping between localized states.It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises,and it takes the form of σac(ω)~ω2 ln2(1/ω).Also AC conductivity of DNA sequences increases with the increase of temperature,this phenomenon presents characteristics of weak temperature-dependence.Meanwhile,the AC conductivity in an off diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit in low temperatures,which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity,while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition,the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences.For p<0.5,the conductivity of DNA sequence decreases with the increase of p,while for p > 0.5,the conductivity increases with the increase of p.

  3. Statistical methods for detecting periodic fragments in DNA sequence data

    Directory of Open Access Journals (Sweden)

    Ying Hua

    2011-04-01

    Full Text Available Abstract Background Period 10 dinucleotides are structurally and functionally validated factors that influence the ability of DNA to form nucleosomes, histone core octamers. Robust identification of periodic signals in DNA sequences is therefore required to understand nucleosome organisation in genomes. While various techniques for identifying periodic components in genomic sequences have been proposed or adopted, the requirements for such techniques have not been considered in detail and confirmatory testing for a priori specified periods has not been developed. Results We compared the estimation accuracy and suitability for confirmatory testing of autocorrelation, discrete Fourier transform (DFT, integer period discrete Fourier transform (IPDFT and a previously proposed Hybrid measure. A number of different statistical significance procedures were evaluated but a blockwise bootstrap proved superior. When applied to synthetic data whose period-10 signal had been eroded, or for which the signal was approximately period-10, the Hybrid technique exhibited superior properties during exploratory period estimation. In contrast, confirmatory testing using the blockwise bootstrap procedure identified IPDFT as having the greatest statistical power. These properties were validated on yeast sequences defined from a ChIP-chip study where the Hybrid metric confirmed the expected dominance of period-10 in nucleosome associated DNA but IPDFT identified more significant occurrences of period-10. Application to the whole genomes of yeast and mouse identified ~ 21% and ~ 19% respectively of these genomes as spanned by period-10 nucleosome positioning sequences (NPS. Conclusions For estimating the dominant period, we find the Hybrid period estimation method empirically to be the most effective for both eroded and approximate periodicity. The blockwise bootstrap was found to be effective as a significance measure, performing particularly well in the problem of

  4. Spectroscopic investigation on the telomeric DNA base sequence repeat

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Telomeres are protein-DNA complexes at the terminals of linear chromosomes, which protect chromosomal integrity and maintain cellular replicative capacity.From single-cell organisms to advanced animals and plants,structures and functions of telomeres are both very conservative. In cells of human and vertebral animals, telomeric DNA base sequences all are (TTAGGG)n. In the present work, we have obtained absorption and fluorescence spectra measured from seven synthesized oligonucleotides to simulate the telomeric DNA system and calculated their relative fluorescence quantum yields on which not only telomeric DNA characteristics are predicted but also possibly the shortened telomeric sequences during cell division are imrelative fluorescence quantum yield and remarkable excitation energy innerconversion, which tallies with the telomeric sequence of (TTAGGG)n. This result shows that telomeric DNA has a strong non-radiative or innerconvertible capability.``

  5. ATRF Houses the Latest DNA Sequencing Technologies | Poster

    Science.gov (United States)

    By Ashley DeVine, Staff Writer By the end of October, the Advanced Technology Research Facility (ATRF) will be one of the few facilities in the world to house all of the latest DNA sequencing technologies.

  6. Bacteriophage lambda DNA packaging: DNA site requirements for termination and processivity.

    Science.gov (United States)

    Cue, D; Feiss, M

    2001-08-10

    Bacteriophage lambda chromosomes are processively packaged into preformed shells, using end-to-end multimers of intracellular viral DNA as the packaging substate. A 200 bp long DNA segment, cos, contains all the sequences needed for DNA packaging. The work reported here shows that efficient DNA packaging termination requires cos's I2 segment, in addition to the required termination subsite, cosQ, and the nicking site, cosN. Efficient processivity requires cosB, in addition to cosQ and cosN. An initiation-defective mutant form of cosB sponsored efficient processivity, indicating that the terminase-cosB interactions required for termination are less stringent than those required at initiation. The finding that an initiation-defective form of cosB is functional for processivity allows a re-interpretation of a similar finding, obtained previously, that the initiation-defective cosB of phage 21 is functional for processivity by the lambda packaging machinery. The cosBphi21 result can now be interpreted as indicating that interactions between cosBphi21 and lambda terminase, while insufficient for initiation, function for processivity.

  7. Mitochondrial DNA variant discovery and evaluation in human Cardiomyopathies through next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Michael V Zaragoza

    Full Text Available Mutations in mitochondrial DNA (mtDNA may cause maternally-inherited cardiomyopathy and heart failure. In homoplasmy all mtDNA copies contain the mutation. In heteroplasmy there is a mixture of normal and mutant copies of mtDNA. The clinical phenotype of an affected individual depends on the type of genetic defect and the ratios of mutant and normal mtDNA in affected tissues. We aimed at determining the sensitivity of next-generation sequencing compared to Sanger sequencing for mutation detection in patients with mitochondrial cardiomyopathy. We studied 18 patients with mitochondrial cardiomyopathy and two with suspected mitochondrial disease. We "shotgun" sequenced PCR-amplified mtDNA and multiplexed using a single run on Roche's 454 Genome Sequencer. By mapping to the reference sequence, we obtained 1,300x average coverage per case and identified high-confidence variants. By comparing these to >400 mtDNA substitution variants detected by Sanger, we found 98% concordance in variant detection. Simulation studies showed that >95% of the homoplasmic variants were detected at a minimum sequence coverage of 20x while heteroplasmic variants required >200x coverage. Several Sanger "misses" were detected by 454 sequencing. These included the novel heteroplasmic 7501T>C in tRNA serine 1 in a patient with sudden cardiac death. These results support a potential role of next-generation sequencing in the discovery of novel mtDNA variants with heteroplasmy below the level reliably detected with Sanger sequencing. We hope that this will assist in the identification of mtDNA mutations and key genetic determinants for cardiomyopathy and mitochondrial disease.

  8. Nanopore DNA sequencing and epigenetic detection with a MspA nanopore

    Science.gov (United States)

    Laszlo, Andrew H.

    DNA forms the molecular basis for all known life. Widespread DNA sequencing has the potential to revolutionize healthcare and our understanding of the life sciences. Sequencing has already had a profound effect on our understanding of the molecular basis of life and underpinnings of disease. Current DNA sequencing technologies require costly reagents, can sequence only short DNA strands, and take too long to complete entire genomes. Furthermore, the required DNA sample size limits the types of experiments that can be run. For instance sequencing single cells is extremely difficult. New technologies are key to making DNA sequencing as cheap and accessible as possible and for making new experiments possible. One such new technology is nanopore sequencing. In nanopore sequencing, a thin membrane is used to divide a salt solution into two wells: cis and trans. This membrane contains a single nanometer sized hole that forms the only electrical connection between the two wells. When a voltage is applied across the membrane, ion current flows through the nanopore. This ion current is the primary signal for nanopore sequencing. DNA is negatively charged and can be pulled into the pore. When DNA is pulled into the pore, it occludes the pore and reduces the ion current that can pass through the pore. Individual DNA nucleotides along the DNA strand block the pore to varying degrees. One can measure the degree to which the pore is blocked as DNA passes through the pore and use the ion current signal to read off the DNA sequence. This thesis chronicles recent advances in the Gundlach laboratory in which I have played a leading role. It describes our work testing the biological nanopore Mycobacterium smegmatis porin A (MspA) for nanopore sequencing. The thesis consists of five chapters and three appendices which contain supplemental information for Chapters 2, 3, and 4. Chapter 1 begins with some motivation and defines the current challenges in DNA sequencing. I also introduce

  9. Repetitive DNA Sequences in Wheat and Its Relatives

    Institute of Scientific and Technical Information of China (English)

    ZHANG Xue-yong; LI Da-yong

    2001-01-01

    Repetitive DNA sequences form a large portion of eukaryote genomes. Using wheat ( Triticum )as a model, the classification, features and functions of repetitive DNA sequences in the Tritieeae grass tribe is reviewed as well as the role of these sequences in genome differentiation, control and regulation of homologous chromosome synapsis and pairing. Transposable elements, as an important portion of dispersed repetitives,may play an essential role in gene mutation of the host. Dynamic models for change of copy number and sequences of the repetitive family are also presented after the models of Charlesworth et al. Application of repetitive DNA sequences in the study of evolution, chromosome fingerprinting and marker assisted gene transfer and breeding are described by taking wheat as an example.

  10. Which Are More Random: Coding or Noncoding DNA Sequences?

    Institute of Scientific and Technical Information of China (English)

    WU Fang; ZHENG Wei-Mou

    2002-01-01

    Evidence seems to show that coding DNA is more random than noncoding DNA, but other conflictingevidence also exists. Based on the third-base degeneracy of codons, we regard the third position of codons as a 'noisy'position. By deleting one fixed position of non-overlapping triplets in a given sequence, three masked sequences may bededuced from the sequence. We have investigated the block-to-site mutual information functions of coding and noncodingsequences in yeast without and with the masking. Characteristics that distinguish coding from noncoding DNA havebeen found. It is observed that the strong correlations in the coding regions may be blocked by the third base of codons,and the proper masking can extract the correlations. Distribution of dimeric tandem repeats of unmasked sequences isalso compared with that of masked sequences.

  11. PREDICTION OF CHROMATIN STATES USING DNA SEQUENCE PROPERTIES

    KAUST Repository

    Bahabri, Rihab R.

    2013-06-01

    Activities of DNA are to a great extent controlled epigenetically through the internal struc- ture of chromatin. This structure is dynamic and is influenced by different modifications of histone proteins. Various combinations of epigenetic modification of histones pinpoint to different functional regions of the DNA determining the so-called chromatin states. How- ever, the characterization of chromatin states by the DNA sequence properties remains largely unknown. In this study we aim to explore whether DNA sequence patterns in the human genome can characterize different chromatin states. Using DNA sequence motifs we built binary classifiers for each chromatic state to eval- uate whether a given genomic sequence is a good candidate for belonging to a particular chromatin state. Of four classification algorithms (C4.5, Naive Bayes, Random Forest, and SVM) used for this purpose, the decision tree based classifiers (C4.5 and Random Forest) yielded best results among those we evaluated. Our results suggest that in general these models lack sufficient predictive power, although for four chromatin states (insulators, het- erochromatin, and two types of copy number variation) we found that presence of certain motifs in DNA sequences does imply an increased probability that such a sequence is one of these chromatin states.

  12. Effects of sequence on DNA wrapping around histones

    Science.gov (United States)

    Ortiz, Vanessa

    2011-03-01

    A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).

  13. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    DEFF Research Database (Denmark)

    Nielsen, Peter E.

    2008-01-01

    sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technol. of protein dsDNA structures. (c) 2008 American Institute of Physics. [on SciFinder (R)] Udgivelsesdato...

  14. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    DEFF Research Database (Denmark)

    Nielsen, Peter E.

    2008-01-01

    sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technol. of protein dsDNA structures. (c) 2008 American Institute of Physics. [on SciFinder (R)] Udgivelsesdato...

  15. Designing universal primers for the isolation of DNA sequences encoding Proanthocyanidins biosynthetic enzymes in Crataegus aronia.

    Science.gov (United States)

    Zuiter, Afnan Saeid; Sawwan, Jammal; Al Abdallat, Ayed

    2012-08-10

    Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs) and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants.

  16. Designing universal primers for the isolation of DNA sequences encoding Proanthocyanidins biosynthetic enzymes in Crataegus aronia

    Directory of Open Access Journals (Sweden)

    Zuiter Afnan

    2012-08-01

    Full Text Available Abstract Background Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Findings Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. Conclusion To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants.

  17. Strong physical constraints on sequence-specific target location by proteins on DNA molecules

    DEFF Research Database (Denmark)

    Flyvbjerg, H.; Keatch, S.A.; Dryden, D.T.F

    2006-01-01

    Sequence-specific binding to DNA in the presence of competing non-sequence-specific ligands is a problem faced by proteins in all organisms. It is akin to the problem of parking a truck at a loading bay by the side of a road in the presence of cars parked at random along the road. Cars even...... partially covering the loading bay prevent correct parking of the truck. Similarly on DNA, non-specific ligands interfere with the binding and function of sequence-specific proteins. We derive a formula for the probability that the loading bay is free from parked cars. The probability depends on the size...... of the loading bay and allows an estimation of the size of the footprint on the DNA of the sequence-specific protein by assaying protein binding or function in the presence of increasing concentrations of non-specific ligand. Assaying for function gives an 'activity footprint'; the minimum length of DNA required...

  18. Strong physical constraints on sequence-specific target location by proteins on DNA molecules

    DEFF Research Database (Denmark)

    Flyvbjerg, H.; Keatch, S.A.; Dryden, D.T.F

    2006-01-01

    Sequence-specific binding to DNA in the presence of competing non-sequence-specific ligands is a problem faced by proteins in all organisms. It is akin to the problem of parking a truck at a loading bay by the side of a road in the presence of cars parked at random along the road. Cars even...... partially covering the loading bay prevent correct parking of the truck. Similarly on DNA, non-specific ligands interfere with the binding and function of sequence-specific proteins. We derive a formula for the probability that the loading bay is free from parked cars. The probability depends on the size...... of the loading bay and allows an estimation of the size of the footprint on the DNA of the sequence-specific protein by assaying protein binding or function in the presence of increasing concentrations of non-specific ligand. Assaying for function gives an 'activity footprint'; the minimum length of DNA required...

  19. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross...

  20. Biometric Authentication Using ElGamal Cryptosystem And DNA Sequence

    Directory of Open Access Journals (Sweden)

    V.SAMUEL SUSAN

    2010-06-01

    Full Text Available Biometrics are automated methods of identifying a person or verifying the identity of a person based on a Physiological or behavioral characteristic. Physiological haracteristics include hand or finger images, facial characteristics and iris recognition. Behavioral characteristics include dynamic signature verification, speaker verification and keystroke dynamics. DNA is unique feature among individuals. DNA provides high security level, long term stability, user acceptance and is intrusive. Combining ElGamal cryptosystem and DNA sequence, a novel biometric authentication scheme is proposed.

  1. Protein sequence for clustering DNA based on Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Gamal. F. Elhadi

    2012-01-01

    Full Text Available DNA is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms and some viruses. Clustering is a process that groups a set of objects into clusters so that the similarity among objects in the same cluster is high, while that among the objects in different clusters is low. In this paper, we proposed an approach for clustering DNA sequences using Self-Organizing Map (SOM algorithm and Protein Sequence. The main objective is to analyze biological data and to bunch DNA to many clusters more easily and efficiently. We use the proposed approach to analyze both large and small amount of input DNA sequences. The results show that the similarity of the sequences does not depend on the amount of input sequences. Our approach depends on evaluating the degree of the DNA sequences similarity using the hierarchal representation Dendrogram. Representing large amount of data using hierarchal tree gives the ability to compare large sequences efficiently

  2. Sequencing and Analysis of Neanderthal Genomic DNA

    OpenAIRE

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo, Svante; Pritchard, Jonathan K; Rubin, Edward M.

    2006-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library a...

  3. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    Science.gov (United States)

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  4. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie;

    2014-01-01

    sections for electron induced single strand breaks in specific 13 mer oligonucleotides we used atomic force microscopy analysis of DNA origami based DNA nanoarrays. We investigated the DNA sequences 5'-TT(XYX)3TT with X = A, G, C and Y = T, BrU 5-bromouracil and found absolute strand break cross sections...

  5. Dramatic reduction of sequence artefacts from DNA isolated from formalin-fixed cancer biopsies by treatment with uracil- DNA glycosylase.

    Science.gov (United States)

    Do, Hongdo; Dobrovic, Alexander

    2012-05-01

    Non-reproducible sequence artefacts are frequently detected in DNA from formalinfixed and paraffin-embedded (FFPE) tissues. However, no rational strategy has been developed for reduction of sequence artefacts from FFPE DNA as the underlying causes of the artefacts are poorly understood. As cytosine deamination to uracil is a common form of DNA damage in ancient DNA, we set out to examine whether treatment of FFPE DNA with uracil-DNA glycosylase (UDG) would lead to the reduction of C>T (and G>A) sequence artefacts. Heteroduplex formation in high resolution melting (HRM)-based assays was used for the detection of sequence variants in FFPE DNA samples. A set of samples that gave false positive HRM results for screening for the E17K mutation in exon 4 of the AKT1 gene were chosen for analysis. Sequencing of these samples showed multiple non-reproducible C:G>T:A artefacts. Treatment of the FFPE DNA with UDG prior to PCR amplification led to a very marked reduction of the sequence artefacts as indicated by both HRM and sequencing analysis, indicating that uracil lesions are the major cause of sequence artefacts. Similar results were shown for the BRAF V600 region in the same sample set and EGFR exon 19 in another sample set. UDG treatment specifically suppressed the formation of artefacts in FFPE DNA as it did not affect the detection of true KRAS codon 12 and true EGFR exon 19 and 20 mutations. We conclude that uracil in FFPE DNA leads to a significant proportion of sequence artefacts. These can be minimised by a simple UDG pretreatment which can be readily carried out, in the same tube, as the PCR immediately prior to commencing thermal cycling. HRM is a convenient way of monitoring both the degree of damage and the effectiveness of the UDG treatment. These findings have immediate and important implications for cancer diagnostics where FFPE DNA is used as the primary genetic material for mutational studies guiding personalised medicine strategies and where simple

  6. Algorithms for mapping high-throughput DNA sequences

    DEFF Research Database (Denmark)

    Frellsen, Jes; Menzel, Peter; Krogh, Anders

    2014-01-01

    Abstract High-throughput sequencing (HTS) technologies revolutionized the field of molecular biology by enabling large scale whole genome sequencing as well as a broad range of experiments for studying the cell's inner workings directly on DNA or RNA level. Given the dramatically increased rate...

  7. Ancient DNA sequence revealed by error-correcting codes.

    Science.gov (United States)

    Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

    2015-07-10

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.

  8. Zinc finger recombinases with adaptable DNA sequence specificity.

    Directory of Open Access Journals (Sweden)

    Chris Proudfoot

    Full Text Available Site-specific recombinases have become essential tools in genetics and molecular biology for the precise excision or integration of DNA sequences. However, their utility is currently limited to circumstances where the sites recognized by the recombinase enzyme have been introduced into the DNA being manipulated, or natural 'pseudosites' are already present. Many new applications would become feasible if recombinase activity could be targeted to chosen sequences in natural genomic DNA. Here we demonstrate efficient site-specific recombination at several sequences taken from a 1.9 kilobasepair locus of biotechnological interest (in the bovine β-casein gene, mediated by zinc finger recombinases (ZFRs, chimaeric enzymes with linked zinc finger (DNA recognition and recombinase (catalytic domains. In the "Z-sites" tested here, 22 bp casein gene sequences are flanked by 9 bp motifs recognized by zinc finger domains. Asymmetric Z-sites were recombined by the concomitant action of two ZFRs with different zinc finger DNA-binding specificities, and could be recombined with a heterologous site in the presence of a third recombinase. Our results show that engineered ZFRs may be designed to promote site-specific recombination at many natural DNA sequences.

  9. cDNA cloning and sequencing of ostrich Growth hormone

    Directory of Open Access Journals (Sweden)

    Doosti Abbas

    2012-01-01

    Full Text Available In recent years, industrial breeding of ostrich (Struthio camelus has been widely developed in Iran. Growth hormone (GH is a peptide hormone that stimulates growth and cell reproduction in different animals. The aim of this study was to clone and sequence the ostrich growth hormone gene in E. coli, done for the first time in Iran. The cDNA that encodes ostrich growth hormone was isolated from total mRNA of the pituitary gland and amplified by RT-PCR using GH specific PCR primers. Then GH cDNA was cloned by T/A cloning technique and the construct was transformed into E. coli. Finally, GH cDNA sequence was submitted to the GenBank (Accession number: JN559394. The results of present study showed that GH cDNA was successfully cloned in E. coli. Sequencing confirmed that GH cDNA was cloned and that the length of ostrich GH cDNA was 672 bp; BLAST search showed that the sequence of growth hormone cDNA of the ostrich from Iran has 100% homology with other records existing in GenBank.

  10. Polyamide platinum anticancer complexes designed to target specific DNA sequences.

    Science.gov (United States)

    Jaramillo, David; Wheate, Nial J; Ralph, Stephen F; Howard, Warren A; Tor, Yitzhak; Aldrich-Wright, Janice R

    2006-07-24

    Two new platinum complexes, trans-chlorodiammine[N-(2-aminoethyl)-4-[4-(N-methylimidazole-2-carboxamido)-N-methylpyrrole-2-carboxamido]-N-methylpyrrole-2-carboxamide]platinum(II) chloride (DJ1953-2) and trans-chlorodiammine[N-(6-aminohexyl)-4-[4-(N-methylimidazole-2-carboxamido)-N-methylpyrrole-2-carboxamido]-N-methylpyrrole-2-carboxamide]platinum(II) chloride (DJ1953-6) have been synthesized as proof-of-concept molecules in the design of agents that can specifically target genes in DNA. Coordinate covalent binding to DNA was demonstrated with electrospray ionization mass spectrometry. Using circular dichroism, these complexes were found to show greater DNA binding affinity to the target sequence: d(CATTGTCAGAC)(2), than toward either d(GTCTGTCAATG)(2,) which contains different flanking sequences, or d(CATTGAGAGAC)(2), which contains a double base pair mismatch sequence. DJ1953-2 unwinds the DNA helix by around 13 degrees , but neither metal complex significantly affects the DNA melting temperature. Unlike simple DNA minor groove binders, DJ1953-2 is able to inhibit, in vitro, RNA synthesis. The cytotoxicity of both metal complexes in the L1210 murine leukaemia cell line was also determined, with DJ1953-6 (34 microM) more active than DJ1953-2 (>50 microM). These results demonstrate the potential of polyamide platinum complexes and provide the structural basis for designer agents that are able to recognize biologically relevant sequences and prevent DNA transcription and replication.

  11. Selective binding of anti-DNA antibodies to native dsDNA fragments of differing sequence.

    Science.gov (United States)

    Uccellini, Melissa B; Busto, Patricia; Debatis, Michelle; Marshak-Rothstein, Ann; Viglianti, Gregory A

    2012-03-30

    Systemic autoimmune diseases are characterized by the development of autoantibodies directed against a limited subset of nuclear antigens, including DNA. DNA-specific B cells take up mammalian DNA through their B cell receptor, and this DNA is subsequently transported to an endosomal compartment where it can potentially engage TLR9. We have previously shown that ssDNA-specific B cells preferentially bind to particular DNA sequences, and antibody specificity for short synthetic oligodeoxynucleotides (ODNs). Since CpG-rich DNA, the ligand for TLR9 is found in low abundance in mammalian DNA, we sought to determine whether antibodies derived from DNA-reactive B cells showed binding preference for CpG-rich native dsDNA, and thereby select immunostimulatory DNA for delivery to TLR9. We examined a panel of anti-DNA antibodies for binding to CpG-rich and CpG-poor DNA fragments. We show that a number of anti-DNA antibodies do show preference for binding to certain native dsDNA fragments of differing sequence, but this does not correlate directly with the presence of CpG dinucleotides. An antibody with preference for binding to a fragment containing optimal CpG motifs was able to promote B cell proliferation to this fragment at 10-fold lower antibody concentrations than an antibody that did not selectively bind to this fragment, indicating that antibody binding preference can influence autoreactive B cell responses.

  12. Nanopore-based Fourth-generation DNA Sequencing Technology

    Institute of Scientific and Technical Information of China (English)

    Yanxiao Feng; Yuechuan Zhang; Cuifeng Ying; Deqiang Wang; Chunlei Du

    2015-01-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than$100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications.

  13. Directed PCR-free engineering of highly repetitive DNA sequences

    Directory of Open Access Journals (Sweden)

    Preissler Steffen

    2011-09-01

    Full Text Available Abstract Background Highly repetitive nucleotide sequences are commonly found in nature e.g. in telomeres, microsatellite DNA, polyadenine (poly(A tails of eukaryotic messenger RNA as well as in several inherited human disorders linked to trinucleotide repeat expansions in the genome. Therefore, studying repetitive sequences is of biological, biotechnological and medical relevance. However, cloning of such repetitive DNA sequences is challenging because specific PCR-based amplification is hampered by the lack of unique primer binding sites resulting in unspecific products. Results For the PCR-free generation of repetitive DNA sequences we used antiparallel oligonucleotides flanked by restriction sites of Type IIS endonucleases. The arrangement of recognition sites allowed for stepwise and seamless elongation of repetitive sequences. This facilitated the assembly of repetitive DNA segments and open reading frames encoding polypeptides with periodic amino acid sequences of any desired length. By this strategy we cloned a series of polyglutamine encoding sequences as well as highly repetitive polyadenine tracts. Such repetitive sequences can be used for diverse biotechnological applications. As an example, the polyglutamine sequences were expressed as His6-SUMO fusion proteins in Escherichia coli cells to study their aggregation behavior in vitro. The His6-SUMO moiety enabled affinity purification of the polyglutamine proteins, increased their solubility, and allowed controlled induction of the aggregation process. We successfully purified the fusions proteins and provide an example for their applicability in filter retardation assays. Conclusion Our seamless cloning strategy is PCR-free and allows the directed and efficient generation of highly repetitive DNA sequences of defined lengths by simple standard cloning procedures.

  14. The impact of DNA input amount and DNA source on the performance of whole-exome sequencing in cancer epidemiology.

    Science.gov (United States)

    Zhu, Qianqian; Hu, Qiang; Shepherd, Lori; Wang, Jianmin; Wei, Lei; Morrison, Carl D; Conroy, Jeffrey M; Glenn, Sean T; Davis, Warren; Kwan, Marilyn L; Ergas, Isaac J; Roh, Janise M; Kushi, Lawrence H; Ambrosone, Christine B; Liu, Song; Yao, Song

    2015-08-01

    Whole-exome sequencing (WES) has recently emerged as an appealing approach to systematically study coding variants. However, the requirement for a large amount of high-quality DNA poses a barrier that may limit its application in large cancer epidemiologic studies. We evaluated the performance of WES with low input amount and saliva DNA as an alternative source material. Five breast cancer patients were randomly selected from the Pathways Study. From each patient, four samples, including 3 μg, 1 μg, and 0.2 μg blood DNA and 1 μg saliva DNA, were aliquoted for library preparation using the Agilent SureSelect Kit and sequencing using Illumina HiSeq2500. Quality metrics of sequencing and variant calling, as well as concordance of variant calls from the whole exome and 21 known breast cancer genes, were assessed by input amount and DNA source. There was little difference by input amount or DNA source on the quality of sequencing and variant calling. The concordance rate was about 98% for single-nucleotide variant calls and 83% to 86% for short insertion/deletion calls. For the 21 known breast cancer genes, WES based on low input amount and saliva DNA identified the same set variants in samples from a same patient. Low DNA input amount, as well as saliva DNA, can be used to generate WES data of satisfactory quality. Our findings support the expansion of WES applications in cancer epidemiologic studies where only low DNA amount or saliva samples are available. ©2015 American Association for Cancer Research.

  15. Mitochondrial DNA sequence analysis of two mouse hepatocarcinoma cell lines

    Institute of Scientific and Technical Information of China (English)

    Ji-Gang Dai; Xia Lei; Jia-Xin Min; Guo-Qiang Zhang; Hong Wei

    2005-01-01

    AIM: To study genetic difference of mitochondrial DNA (mtDNA)between two hepatocarcinoma cell lines (Hca-F and Hca-P)with diverse metastatic characteristics and the relationship between mtDNA changes in cancer cells and their oncogenic phenotype.METHODS: Mitochondrial DNA D-loop, tRNAMet+Glu+Ile and ND3gene fragments from the hepatocarcinoma cell lines with 1100, 1126 and 534 bp in length respectively were analysed by PCR amplification and restriction fragment length polymorphism techniques. The D-loop 3' end sequence of the hepatocarcinoma cell lines was determined by sequencing.RESULTS: No amplification fragment length polymorphism and restriction fragment length polymorphism were observed in tRNAMet+Glu+Ile,ND3 and D-loop of mitochondrial DNA of the hepatocarcinoma cells. Sequence differences between Hca-F and Hca-P were found in mtDNA D-loop.CONCLUSION: Deletion mutations of mitochondrial DNA restriction fragment may not play a significant role in carcinogenesis. Genetic difference of mtDNA D-loop between Hca-F and Hca-P, which may reflect the environmental and genetic influences during tumor progression, could be linked to their tumorigenic phenotypes.

  16. Palindromic sequence artifacts generated during next generation sequencing library preparation from historic and ancient DNA.

    Directory of Open Access Journals (Sweden)

    Bastiaan Star

    Full Text Available Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua, which forms interrupted palindromes consisting of reverse complementary sequence at the 5' and 3'-ends of sequencing reads. The palindromic sequences themselves have specific properties - the bases at the 5'-end align well to the reference genome, whereas extensive misalignments exists among the bases at the terminal 3'-end. The terminal 3' bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA, which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3'-end of DNA strands, with the 5'-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias.

  17. PCR primers for metazoan mitochondrial 12S ribosomal DNA sequences.

    Directory of Open Access Journals (Sweden)

    Ryuji J Machida

    Full Text Available BACKGROUND: Assessment of the biodiversity of communities of small organisms is most readily done using PCR-based analysis of environmental samples consisting of mixtures of individuals. Known as metagenetics, this approach has transformed understanding of microbial communities and is beginning to be applied to metazoans as well. Unlike microbial studies, where analysis of the 16S ribosomal DNA sequence is standard, the best gene for metazoan metagenetics is less clear. In this study we designed a set of PCR primers for the mitochondrial 12S ribosomal DNA sequence based on 64 complete mitochondrial genomes and then tested their efficacy. METHODOLOGY/PRINCIPAL FINDINGS: A total of the 64 complete mitochondrial genome sequences representing all metazoan classes available in GenBank were downloaded using the NCBI Taxonomy Browser. Alignment of sequences was performed for the excised mitochondrial 12S ribosomal DNA sequences, and conserved regions were identified for all 64 mitochondrial genomes. These regions were used to design a primer pair that flanks a more variable region in the gene. Then all of the complete metazoan mitochondrial genomes available in NCBI's Organelle Genome Resources database were used to determine the percentage of taxa that would likely be amplified using these primers. Results suggest that these primers will amplify target sequences for many metazoans. CONCLUSIONS/SIGNIFICANCE: Newly designed 12S ribosomal DNA primers have considerable potential for metazoan metagenetic analysis because of their ability to amplify sequences from many metazoans.

  18. Recognizing a Single Base in an Individual DNA Strand: A Step Toward Nanopore DNA Sequencing**

    Science.gov (United States)

    Ashkenasy, N.; Sánchez-Quesada, J.; Ghadiri, M. R.; Bayley, H.

    2007-01-01

    Functional supramolecular chemistry at the single-molecule level. Single strands of DNA can be captured inside α-hemolysin transmembrane pore protein to form single-species α-HL·DNA pseudorotaxanes. This process can be used to identify a single adenine nucleotide at a specific location on a strand of DNA by the characteristic reductions in the α-HL ion conductance. This study suggests that α-HL-mediated single-molecule DNA sequencing might be fundamentally feasible. PMID:15666419

  19. Analysis of sequence variation in Gnathostoma spinigerum mitochondrial DNA by single-strand conformation polymorphism analysis and DNA sequence.

    Science.gov (United States)

    Ngarmamonpirat, Charinthon; Waikagul, Jitra; Petmitr, Songsak; Dekumyoy, Paron; Rojekittikhun, Wichit; Anantapruti, Malinee T

    2005-03-01

    Morphological variations were observed in the advance third stage larvae of Gnathostoma spinigerum collected from swamp eel (Fluta alba), the second intermediate host. Larvae with typical and three atypical types were chosen for partial cytochrome c oxidase subunit I (COI) gene sequence analysis. A 450 bp polymerase chain reaction product of the COI gene was amplified from mitochondrial DNA. The variations were analyzed by single-strand conformation polymorphism and DNA sequencing. The nucleotide variations of the COI gene in the four types of larvae indicated the presence of an intra-specific variation of mitochondrial DNA in the G. spinigerum population.

  20. Draft versus finished sequence data for DNA and protein diagnostic signature development

    Energy Technology Data Exchange (ETDEWEB)

    Gardner, S N; Lam, M W; Smith, J R; Torres, C L; Slezak, T R

    2004-10-29

    Sequencing pathogen genomes is costly, demanding careful allocation of limited sequencing resources. We built a computational Sequencing Analysis Pipeline (SAP) to guide decisions regarding the amount of genomic sequencing necessary to develop high-quality diagnostic DNA and protein signatures. SAP uses simulations to estimate the number of target genomes and close phylogenetic relatives (near neighbors, or NNs) to sequence. We use SAP to assess whether draft data is sufficient or finished sequencing is required using Marburg and variola virus sequences. Simulations indicate that intermediate to high quality draft with error rates of 10{sup -3}-10{sup -5} ({approx} 8x coverage) of target organisms is suitable for DNA signature prediction. Low quality draft with error rates of {approx} 1% (3x to 6x coverage) of target isolates is inadequate for DNA signature prediction, although low quality draft of NNs is sufficient, as long as the target genomes are of high quality. For protein signature prediction, sequencing errors in target genomes substantially reduce the detection of amino acid sequence conservation, even if the draft is of high quality. In summary, high quality draft of target and low quality draft of NNs appears to be a cost-effective investment for DNA signature prediction, but may lead to underestimation of predicted protein signatures.

  1. Efficient and specific internal cleavage of a retroviral palindromic DNA sequence by tetrameric HIV-1 integrase.

    Directory of Open Access Journals (Sweden)

    Olivier Delelis

    Full Text Available BACKGROUND: HIV-1 integrase (IN catalyses the retroviral integration process, removing two nucleotides from each long terminal repeat and inserting the processed viral DNA into the target DNA. It is widely assumed that the strand transfer step has no sequence specificity. However, recently, it has been reported by several groups that integration sites display a preference for palindromic sequences, suggesting that a symmetry in the target DNA may stabilise the tetrameric organisation of IN in the synaptic complex. METHODOLOGY/PRINCIPAL FINDINGS: We assessed the ability of several palindrome-containing sequences to organise tetrameric IN and investigated the ability of IN to catalyse DNA cleavage at internal positions. Only one palindromic sequence was successfully cleaved by IN. Interestingly, this symmetrical sequence corresponded to the 2-LTR junction of retroviral DNA circles-a palindrome similar but not identical to the consensus sequence found at integration sites. This reaction depended strictly on the cognate retroviral sequence of IN and required a full-length wild-type IN. Furthermore, the oligomeric state of IN responsible for this cleavage differed from that involved in the 3'-processing reaction. Palindromic cleavage strictly required the tetrameric form, whereas 3'-processing was efficiently catalysed by a dimer. CONCLUSIONS/SIGNIFICANCE: Our findings suggest that the restriction-like cleavage of palindromic sequences may be a general physiological activity of retroviral INs and that IN tetramerisation is strongly favoured by DNA symmetry, either at the target site for the concerted integration or when the DNA contains the 2-LTR junction in the case of the palindromic internal cleavage.

  2. Probabilistic models for semisupervised discriminative motif discovery in DNA sequences.

    Science.gov (United States)

    Kim, Jong Kyoung; Choi, Seungjin

    2011-01-01

    Methods for discriminative motif discovery in DNA sequences identify transcription factor binding sites (TFBSs), searching only for patterns that differentiate two sets (positive and negative sets) of sequences. On one hand, discriminative methods increase the sensitivity and specificity of motif discovery, compared to generative models. On the other hand, generative models can easily exploit unlabeled sequences to better detect functional motifs when labeled training samples are limited. In this paper, we develop a hybrid generative/discriminative model which enables us to make use of unlabeled sequences in the framework of discriminative motif discovery, leading to semisupervised discriminative motif discovery. Numerical experiments on yeast ChIP-chip data for discovering DNA motifs demonstrate that the best performance is obtained between the purely-generative and the purely-discriminative and the semisupervised learning improves the performance when labeled sequences are limited.

  3. Applications of recursive segmentation to the analysis of DNA sequences.

    Science.gov (United States)

    Li, Wentian; Bernaola-Galván, Pedro; Haghighi, Fatameh; Grosse, Ivo

    2002-07-01

    Recursive segmentation is a procedure that partitions a DNA sequence into domains with a homogeneous composition of the four nucleotides A, C, G and T. This procedure can also be applied to any sequence converted from a DNA sequence, such as to a binary strong(G + C)/weak(A + T) sequence, to a binary sequence indicating the presence or absence of the dinucleotide CpG, or to a sequence indicating both the base and the codon position information. We apply various conversion schemes in order to address the following five DNA sequence analysis problems: isochore mapping, CpG island detection, locating the origin and terminus of replication in bacterial genomes, finding complex repeats in telomere sequences, and delineating coding and noncoding regions. We find that the recursive segmentation procedure can successfully detect isochore borders, CpG islands, and the origin and terminus of replication, but it needs improvement for detecting complex repeats as well as borders between coding and noncoding regions.

  4. Chaos game representation (CGR)-walk model for DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Gao Jie; Xu Zhen-Yuan

    2009-01-01

    Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to determine the coordinates of their positions in a continuous space. This distribution of positions has two features: one is unique, and the other is source sequence that can be recovered from the coordinates so that the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model is proposed based on CGR coordinates for the DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA (p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into the DNA sequence analysis. This model is applied to simulating real CGR-walk sequence data of ten genomic sequences. Remarkably long-range correlations are uncovered in the data, and the results from these models are reasonably fitted with those from the ARFIMA (p, d, q) model.

  5. Improved Algorithm for Analysis of DNA Sequences Using Multiresolution Transformation

    Directory of Open Access Journals (Sweden)

    T. M. Inbamalar

    2015-01-01

    Full Text Available Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA, the ribonucleic acid (RNA, and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP methods provide less accurate and computationally complex solution with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP representation. Then discrete wavelet transformation is taken. Absolute value of the energy is found followed by proper threshold. The test is conducted using the data bases available in the National Centre for Biotechnology Information (NCBI site. The comparative analysis is done and it ensures the efficiency of the proposed system.

  6. Improved algorithm for analysis of DNA sequences using multiresolution transformation.

    Science.gov (United States)

    Inbamalar, T M; Sivakumar, R

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solution with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP) representation. Then discrete wavelet transformation is taken. Absolute value of the energy is found followed by proper threshold. The test is conducted using the data bases available in the National Centre for Biotechnology Information (NCBI) site. The comparative analysis is done and it ensures the efficiency of the proposed system.

  7. How effective is graphene nanopore geometry on DNA sequencing?

    CERN Document Server

    Satarifard, Vahid; Ejtehadi, Mohammad Reza

    2015-01-01

    In this paper we investigate the effects of graphene nanopore geometry on homopolymer ssDNA pulling process through nanopore using steered molecular dynamic (SMD) simulations. Different graphene nanopores are examined including axially symmetric and asymmetric monolayer graphene nanopores as well as five layer graphene polyhedral crystals (GPC). The pulling force profile, moving fashion of ssDNA, work done in irreversible DNA pulling and orientations of DNA bases near the nanopore are assessed. Simulation results demonstrate the strong effect of the pore shape as well as geometrical symmetry on free energy barrier, orientations and dynamic of DNA translocation through graphene nanopore. Our study proposes that the symmetric circular geometry of monolayer graphene nanopore with high pulling velocity can be used for DNA sequencing.

  8. Qualitatively predicting acetylation and methylation areas in DNA sequences.

    Science.gov (United States)

    Pham, Tho Hoan; Tran, Dang Hung; Ho, Tu Bao; Satou, Kenji; Valiente, Gabriel

    2005-01-01

    Eukaryotic genomes are packaged by the wrapping of DNA around histone octamers to form nucleosomes. Nucleosome occupancy, acetylation, and methylation, which have a major impact on all nuclear processes involving DNA, have been recently mapped across the yeast genome using chromatin immunoprecipitation and DNA microarrays. However, this experimental protocol is laborious and expensive. Moreover, experimental methods often produce noisy results. In this paper, we introduce a computational approach to the qualitative prediction of nucleosome occupancy, acetylation, and methylation areas in DNA sequences. Our method uses support vector machines to discriminate between DNA areas with high and low relative occupancy, acetylation, or methylation, and rank k-gram features based on their support for these DNA modifications. Experimental results on the yeast genome reveal genetic area preferences of nucleosome occupancy, acetylation, and methylation that are consistent with previous studies. Supplementary files are available from http://www.jaist.ac.jp/~tran/nucleosome/.

  9. Ribosomal DNA copy number loss and sequence variation in cancer.

    Science.gov (United States)

    Xu, Baoshan; Li, Hua; Perry, John M; Singh, Vijay Pratap; Unruh, Jay; Yu, Zulin; Zakari, Musinu; McDowell, William; Li, Linheng; Gerton, Jennifer L

    2017-06-01

    Ribosomal DNA is one of the most variable regions in the human genome with respect to copy number. Despite the importance of rDNA for cellular function, we know virtually nothing about what governs its copy number, stability, and sequence in the mammalian genome due to challenges associated with mapping and analysis. We applied computational and droplet digital PCR approaches to measure rDNA copy number in normal and cancer states in human and mouse genomes. We find that copy number and sequence can change in cancer genomes. Counterintuitively, human cancer genomes show a loss of copies, accompanied by global copy number co-variation. The sequence can also be more variable in the cancer genome. Cancer genomes with lower copies have mutational evidence of mTOR hyperactivity. The PTEN phosphatase is a tumor suppressor that is critical for genome stability and a negative regulator of the mTOR kinase pathway. Surprisingly, but consistent with the human cancer genomes, hematopoietic cancer stem cells from a Pten-/- mouse model for leukemia have lower rDNA copy number than normal tissue, despite increased proliferation, rRNA production, and protein synthesis. Loss of copies occurs early and is associated with hypersensitivity to DNA damage. Therefore, copy loss is a recurrent feature in cancers associated with mTOR activation. Ribosomal DNA copy number may be a simple and useful indicator of whether a cancer will be sensitive to DNA damaging treatments.

  10. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Science.gov (United States)

    2011-01-01

    Background Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/. PMID:21385349

  11. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Directory of Open Access Journals (Sweden)

    Baldwin Stephen A

    2011-03-01

    Full Text Available Abstract Background Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  12. VoSeq: a voucher and DNA sequence web application.

    Science.gov (United States)

    Peña, Carlos; Malm, Tobias

    2012-01-01

    There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  13. Label-free DNA sequencing using Millikan detection.

    Science.gov (United States)

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-10-15

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucleotides attached to the bead. The velocity of beads tethered via a polymer to a microfluidic channel and subjected to an oscillating electric field was measured using dark-field microscopy and used to determine how many nucleotides were incorporated during each sequencing-by-synthesis cycle. Increases in bead velocity of approximately 1% were reliably detected during DNA polymerization, allowing for sequencing of short DNA templates. The method could lead to a low-cost, high-throughput sequencing platform that could enable routine sequencing in medical applications.

  14. Hiding message into DNA sequence through DNA coding and chaotic maps.

    Science.gov (United States)

    Liu, Guoyan; Liu, Hongjun; Kadir, Abdurahman

    2014-09-01

    The paper proposes an improved reversible substitution method to hide data into deoxyribonucleic acid (DNA) sequence, and four measures have been taken to enhance the robustness and enlarge the hiding capacity, such as encode the secret message by DNA coding, encrypt it by pseudo-random sequence, generate the relative hiding locations by piecewise linear chaotic map, and embed the encoded and encrypted message into a randomly selected DNA sequence using the complementary rule. The key space and the hiding capacity are analyzed. Experimental results indicate that the proposed method has a better performance compared with the competing methods with respect to robustness and capacity.

  15. Mitoxantrone and Analogues Bind and Stabilize i-Motif Forming DNA Sequences

    Science.gov (United States)

    Wright, Elisé P.; Day, Henry A.; Ibrahim, Ali M.; Kumar, Jeethendra; Boswell, Leo J. E.; Huguin, Camille; Stevenson, Clare E. M.; Pors, Klaus; Waller, Zoë A. E.

    2016-12-01

    There are hundreds of ligands which can interact with G-quadruplex DNA, yet very few which target i-motif. To appreciate an understanding between the dynamics between these structures and how they can be affected by intervention with small molecule ligands, more i-motif binding compounds are required. Herein we describe how the drug mitoxantrone can bind, induce folding of and stabilise i-motif forming DNA sequences, even at physiological pH. Additionally, mitoxantrone was found to bind i-motif forming sequences preferentially over double helical DNA. We also describe the stabilisation properties of analogues of mitoxantrone. This offers a new family of ligands with potential for use in experiments into the structure and function of i-motif forming DNA sequences.

  16. Nonlinear Aspects of Coding and Noncoding DNA Sequences

    Science.gov (United States)

    Stanley, H. Eugene

    2001-03-01

    One of the most remarkable features of human DNA is that 97 percent is not coding for proteins. Studying this noncoding DNA is important both for practical reasons (to distinguish it from the coding DNA as the human genome is sequenced), and for scientific reasons (why is the noncoding DNA present at all, if it appears to have little if any purpose?). In this talk we discuss new methods of analyzing coding and noncoding DNA in parallel, with a view to uncovering different statistical properties of the two kinds of DNA. We also speculate on possible roles of noncoding DNA. The work reported here was carried out primarily by P. Bernaola-Galvan, S. V. Buldyrev, P. Carpena, N. Dokholyan, A. L. Goldberger, I. Grosse, S. Havlin, H. Herzel, J. L. Oliver, C.-K. Peng, M. Simons, H. E. Stanley, R. H. R. Stanley, and G. M. Viswanathan. [1] For a brief overview in language that physicists can understand, see H. E. Stanley, S. V. Buldyrev, A. L. Goldberger, S. Havlin, C.-K. Peng, and M. Simons, "Scaling Features of Noncoding DNA" [Proc. XII Max Born Symposium, Wroclaw], Physica A 273, 1-18 (1999). [2] I. Grosse, H. Herzel, S. V. Buldyrev, and H. E. Stanley, "Species Independence of Mutual Information in Coding and Noncoding DNA," Phys. Rev. E 61, 5624-5629 (2000). [3] P. Bernaola-Galvan, I. Grosse, P. Carpena, J. L. Oliver, and H. E. Stanley, "Identification of DNA Coding Regions Using an Entropic Segmentation Method," Phys. Rev. Lett. 84, 1342-1345 (2000). [4] N. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Distributions of Dimeric Tandem Repeats in Non-coding and Coding DNA Sequences," J. Theor. Biol. 202, 273-282 (2000). [5] R. H. R. Stanley, N. V. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Clumping of Identical Oligonucleotides in Coding and Noncoding DNA Sequences," J. Biomol. Structure and Design 17, 79-87 (1999). [6] N. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Distribution of Base Pair Repeats in Coding and Noncoding DNA

  17. Mitochondrial DNA sequence of Onychostoma rara.

    Science.gov (United States)

    Zeng, Chun-Fang; Li, Xiao-Ling; Li, Chuan-Wu; Huang, Xiang-Rong; Wan, Yi-Wen

    2015-01-01

    The complete mitochondrial genome sequence of Onychostoma rara was determined to be 16,590 bp in length and contains 13 protein-coding genes (PCGs), 22 tRNA genes, large (rrnL) and small (rrnS) rRNA and the non-coding control region. Its total A + T content is 55.65%. We also analyzed the structure of control region, 6 CSBs (CSB-1, CSB-2, CSB-3, CSB-D, CSB-E and CSB-F) and 2 bp tandem repeat were detected.

  18. DNA sequence alignment by microhomology sampling during homologous recombination.

    Science.gov (United States)

    Qi, Zhi; Redding, Sy; Lee, Ja Yil; Gibb, Bryan; Kwon, YoungHo; Niu, Hengyao; Gaines, William A; Sung, Patrick; Greene, Eric C

    2015-02-26

    Homologous recombination (HR) mediates the exchange of genetic information between sister or homologous chromatids. During HR, members of the RecA/Rad51 family of recombinases must somehow search through vast quantities of DNA sequence to align and pair single-strand DNA (ssDNA) with a homologous double-strand DNA (dsDNA) template. Here, we use single-molecule imaging to visualize Rad51 as it aligns and pairs homologous DNA sequences in real time. We show that Rad51 uses a length-based recognition mechanism while interrogating dsDNA, enabling robust kinetic selection of 8-nucleotide (nt) tracts of microhomology, which kinetically confines the search to sites with a high probability of being a homologous target. Successful pairing with a ninth nucleotide coincides with an additional reduction in binding free energy, and subsequent strand exchange occurs in precise 3-nt steps, reflecting the base triplet organization of the presynaptic complex. These findings provide crucial new insights into the physical and evolutionary underpinnings of DNA recombination. Copyright © 2015 Elsevier Inc. All rights reserved.

  19. Noninvasive prenatal paternity testing (NIPAT) through maternal plasma DNA sequencing

    DEFF Research Database (Denmark)

    Jiang, Haojun; Xie, Yifan; Li, Xuchao

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have been already used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we...... developed a noninvasive prenatal paternity testing (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influence factors (minor allele frequency (MAF), the number of total SNP, fetal fraction and effective sequencing depth) and designed three different selective SNP panels...... paternity test using STR multiplex system. Our study here proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future....

  20. Accelerating Computation of DNA Sequence Alignment in Distributed Environment

    Science.gov (United States)

    Guo, Tao; Li, Guiyang; Deaton, Russel

    Sequence similarity and alignment are most important operations in computational biology. However, analyzing large sets of DNA sequence seems to be impractical on a regular PC. Using multiple threads with JavaParty mechanism, this project has successfully implemented in extending the capabilities of regular Java to a distributed environment for simulation of DNA computation. With the aid of JavaParty and the design of multiple threads, the results of this study demonstrated that the modified regular Java program could perform parallel computing without using RMI or socket communication. In this paper, an efficient method for modeling and comparing DNA sequences with dynamic programming and JavaParty was firstly proposed. Additionally, results of this method in distributed environment have been discussed.

  1. Facilitated diffusion on mobile DNA: configurational traps and sequence heterogeneity

    CERN Document Server

    Brackley, C A; Marenduzzo, D; 10.1103/PhysRevLett.109.168103

    2012-01-01

    We present Brownian dynamics simulations of the facilitated diffusion of a protein, modelled as a sphere with a binding site on its surface, along DNA, modelled as a semi-flexible polymer. We consider both the effect of DNA organisation in 3D, and of sequence heterogeneity. We find that in a network of DNA loops, as are thought to be present in bacterial DNA, the search process is very sensitive to the spatial location of the target within such loops. Therefore, specific genes might be repressed or promoted by changing the local topology of the genome. On the other hand, sequence heterogeneity creates traps which normally slow down facilitated diffusion. When suitably positioned, though, these traps can, surprisingly, render the search process much more efficient.

  2. Vector sequences - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available [ Credits ] BLAST Search Image Search Home About Archive Update History Contact us ...od - Number of data entries 7 entries - Joomla SEF URLs by Artio About This Database Database Description Download License Update His...tory of This Database Site Policy | Contact Us Vector sequences - Budding yeast cDNA sequencing project | LSDB Archive ...

  3. Comment on "Linguistic features of noncoding DNA sequences"

    CERN Document Server

    Israeloff, N E; Chan, K; Israeloff, N E; Kagalenko, M; Chan, K

    1995-01-01

    In a recent Physical Review Letter, Mantegna et. al., report that certain statistical signatures of natural language can be found in non-coding DNA sequences. In this comment we show that random noise with power-law correlation similar to 1/f noise, exhibits the same "linguistic" signature as those found in non-coding DNA. We conclude that these signa- tures cannot distinguish languages from noise.

  4. An efficient strategy for large-scale high-throughput transposon-mediated sequencing of cDNA clones

    Science.gov (United States)

    Butterfield, Yaron S. N.; Marra, Marco A.; Asano, Jennifer K.; Chan, Susanna Y.; Guin, Ranabir; Krzywinski, Martin I.; Lee, Soo Sen; MacDonald, Kim W. K.; Mathewson, Carrie A.; Olson, Teika E.; Pandoh, Pawan K.; Prabhu, Anna-Liisa; Schnerch, Angelique; Skalska, Ursula; Smailus, Duane E.; Stott, Jeff M.; Tsai, Miranda I.; Yang, George S.; Zuyderduyn, Scott D.; Schein, Jacqueline E.; Jones, Steven J. M.

    2002-01-01

    We describe an efficient high-throughput method for accurate DNA sequencing of entire cDNA clones. Developed as part of our involvement in the Mammalian Gene Collection full-length cDNA sequencing initiative, the method has been used and refined in our laboratory since September 2000. Amenable to large scale projects, we have used the method to generate >7 Mb of accurate sequence from 3695 candidate full-length cDNAs. Sequencing is accomplished through the insertion of Mu transposon into cDNAs, followed by sequencing reactions primed with Mu-specific sequencing primers. Transposon insertion reactions are not performed with individual cDNAs but rather on pools of up to 96 clones. This pooling strategy reduces the number of transposon insertion sequencing libraries that would otherwise be required, reducing the costs and enhancing the efficiency of the transposon library construction procedure. Sequences generated using transposon-specific sequencing primers are assembled to yield the full-length cDNA sequence, with sequence editing and other sequence finishing activities performed as required to resolve sequence ambiguities. Although analysis of the many thousands (22 785) of sequenced Mu transposon insertion events revealed a weak sequence preference for Mu insertion, we observed insertion of the Mu transposon into 1015 of the possible 1024 5mer candidate insertion sites. PMID:12034834

  5. Sequence dependence of transcription factor-mediated DNA looping.

    Science.gov (United States)

    Johnson, Stephanie; Lindén, Martin; Phillips, Rob

    2012-09-01

    DNA is subject to large deformations in a wide range of biological processes. Two key examples illustrate how such deformations influence the readout of the genetic information: the sequestering of eukaryotic genes by nucleosomes and DNA looping in transcriptional regulation in both prokaryotes and eukaryotes. These kinds of regulatory problems are now becoming amenable to systematic quantitative dissection with a powerful dialogue between theory and experiment. Here, we use a single-molecule experiment in conjunction with a statistical mechanical model to test quantitative predictions for the behavior of DNA looping at short length scales and to determine how DNA sequence affects looping at these lengths. We calculate and measure how such looping depends upon four key biological parameters: the strength of the transcription factor binding sites, the concentration of the transcription factor, and the length and sequence of the DNA loop. Our studies lead to the surprising insight that sequences that are thought to be especially favorable for nucleosome formation because of high flexibility lead to no systematically detectable effect of sequence on looping, and begin to provide a picture of the distinctions between the short length scale mechanics of nucleosome formation and looping.

  6. Label-Free DNA Sequencing Using Millikan Detection

    OpenAIRE

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-01-01

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucl...

  7. Anaplasma phagocytophilum in Danish sheep: confirmation by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Thamsborg Stig M

    2009-12-01

    Full Text Available Abstract Background The presence of Anaplasma phagocytophilum, an Ixodes ricinus transmitted bacterium, was investigated in two flocks of Danish grazing lambs. Direct PCR detection was performed on DNA extracted from blood and serum with subsequent confirmation by DNA sequencing. Methods 31 samples obtained from clinically normal lambs in 2000 from Fussingø, Jutland and 12 samples from ten lambs and two ewes from a clinical outbreak at Feddet, Zealand in 2006 were included in the study. Some of the animals from Feddet had shown clinical signs of polyarthritis and general unthriftiness prior to sampling. DNA extraction was optimized from blood and serum and detection achieved by a 16S rRNA targeted PCR with verification of the product by DNA sequencing. Results Five DNA extracts were found positive by PCR, including two samples from 2000 and three from 2006. For both series of samples the product was verified as A. phagocytophilum by DNA sequencing. Conclusions A. phagocytophilum was detected by molecular methods for the first time in Danish grazing lambs during the two seasons investigated (2000 and 2006.

  8. Perspectives of DNA microarray and next-generation DNA sequencing technologies

    Institute of Scientific and Technical Information of China (English)

    TENG XiaoKun; XIAO HuaSheng

    2009-01-01

    DNA microarray and next-generation DNA sequencing technologies are important tools for high-throughput genome research, in revealing both the structural and functional characteristics of genomes. In the past decade the DNA microarray technologies have been widely applied in the studies of functional genomics, systems biology and pharmacogenomics. The next-generation DNA sequenc-ing method was first introduced by the 454 Company in 2003, immediately followed by the establish-ment of the Solexa and Solid techniques by other biotech companies. Though it has not been long since the first emergence of this technology, with the fast and impressive improvement, the application of this technology has extended to almost all fields of genomics research, as a rival challenging the existing DNA microarray technology. This paper briefly reviews the working principles of these two technologies as well as their application and perspectives in genome research.

  9. Real-time DNA sequencing from single polymerase molecules.

    Science.gov (United States)

    Eid, John; Fehr, Adrian; Gray, Jeremy; Luong, Khai; Lyle, John; Otto, Geoff; Peluso, Paul; Rank, David; Baybayan, Primo; Bettman, Brad; Bibillo, Arkadiusz; Bjornson, Keith; Chaudhuri, Bidhan; Christians, Frederick; Cicero, Ronald; Clark, Sonya; Dalal, Ravindra; Dewinter, Alex; Dixon, John; Foquet, Mathieu; Gaertner, Alfred; Hardenbol, Paul; Heiner, Cheryl; Hester, Kevin; Holden, David; Kearns, Gregory; Kong, Xiangxu; Kuse, Ronald; Lacroix, Yves; Lin, Steven; Lundquist, Paul; Ma, Congcong; Marks, Patrick; Maxham, Mark; Murphy, Devon; Park, Insil; Pham, Thang; Phillips, Michael; Roy, Joy; Sebra, Robert; Shen, Gene; Sorenson, Jon; Tomaney, Austin; Travers, Kevin; Trulson, Mark; Vieceli, John; Wegener, Jeffrey; Wu, Dawn; Yang, Alicia; Zaccarin, Denis; Zhao, Peter; Zhong, Frank; Korlach, Jonas; Turner, Stephen

    2009-01-02

    We present single-molecule, real-time sequencing data obtained from a DNA polymerase performing uninterrupted template-directed synthesis using four distinguishable fluorescently labeled deoxyribonucleoside triphosphates (dNTPs). We detected the temporal order of their enzymatic incorporation into a growing DNA strand with zero-mode waveguide nanostructure arrays, which provide optical observation volume confinement and enable parallel, simultaneous detection of thousands of single-molecule sequencing reactions. Conjugation of fluorophores to the terminal phosphate moiety of the dNTPs allows continuous observation of DNA synthesis over thousands of bases without steric hindrance. The data report directly on polymerase dynamics, revealing distinct polymerization states and pause sites corresponding to DNA secondary structure. Sequence data were aligned with the known reference sequence to assay biophysical parameters of polymerization for each template position. Consensus sequences were generated from the single-molecule reads at 15-fold coverage, showing a median accuracy of 99.3%, with no systematic error beyond fluorophore-dependent error rates.

  10. Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

    Energy Technology Data Exchange (ETDEWEB)

    Torella, JP; Lienert, F; Boehm, CR; Chen, JH; Way, JC; Silver, PA

    2014-08-07

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies-for example, repeated terminator and insulator sequences-that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.

  11. Identification of Chinese Herbs Using a Sequencing-Free Nanostructured Electrochemical DNA Biosensor

    Directory of Open Access Journals (Sweden)

    Yan Lei

    2015-11-01

    Full Text Available Due to the nearly identical phenotypes and chemical constituents, it is often very challenging to accurately differentiate diverse species of a Chinese herbal genus. Although technologies including DNA barcoding have been introduced to help address this problem, they are generally time-consuming and require expensive sequencing. Herein, we present a simple sequencing-free electrochemical biosensor, which enables easy differentiation between two closely related Fritillaria species. To improve its differentiation capability using trace amounts of DNA sample available from herbal extracts, a stepwise electrochemical deposition of reduced graphene oxide (RGO and gold nanoparticles (AuNPs was adopted to engineer a synergistic nanostructured sensing interface. By using such a nanofeatured electrochemical DNA (E-DNA biosensor, two Chinese herbal species of Fritillaria (F. thunbergii and F. cirrhosa were successfully discriminated at the DNA level, because a fragment of 16-mer sequence at the spacer region of the 5S-rRNA only exists in F. thunbergii. This E-DNA sensor was capable of identifying the target sequence in the range from 100 fM to 10 nM, and a detection limit as low as 11.7 fM (S/N = 3 was obtained. Importantly, this sensor was applied to detect the unique fragment of the PCR products amplified from F. thunbergii and F. cirrhosa, respectively. We anticipate that such a direct, sequencing-free sensing mode will ultimately pave the way towards a new generation of herb-identification strategies.

  12. Identification of Bacterial Species in Kuwaiti Waters Through DNA Sequencing

    Science.gov (United States)

    Chen, K.

    2017-01-01

    With an objective of identifying the bacterial diversity associated with ecosystem of various Kuwaiti Seas, bacteria were cultured and isolated from 3 water samples. Due to the difficulties for cultured and isolated fecal coliforms on the selective agar plates, bacterial isolates from marine agar plates were selected for molecular identification. 16S rRNA genes were successfully amplified from the genome of the selected isolates using Universal Eubacterial 16S rRNA primers. The resulted amplification products were subjected to automated DNA sequencing. Partial 16S rDNA sequences obtained were compared directly with sequences in the NCBI database using BLAST as well as with the sequences available with Ribosomal Database Project (RDP).

  13. DNA qualification workflow for next generation sequencing of histopathological samples.

    Directory of Open Access Journals (Sweden)

    Michele Simbolo

    Full Text Available Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF tissues, 6 formalin-fixed paraffin-embedded (FFPE tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard

  14. DNA qualification workflow for next generation sequencing of histopathological samples.

    Science.gov (United States)

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  15. Pentaprobe: a comprehensive sequence for the one-step detection of DNA-binding activities.

    Science.gov (United States)

    Kwan, Ann H Y; Czolij, Robert; Mackay, Joel P; Crossley, Merlin

    2003-10-15

    The rapid increase in the number of novel proteins identified in genome projects necessitates simple and rapid methods for assigning function. We describe a strategy for determining whether novel proteins possess typical sequence-specific DNA-binding activity. Many proteins bind recognition sequences of 5 bp or less. Given that there are 4(5) possible 5 bp sites, one might expect the length of sequence required to cover all possibilities would be 4(5) x 5 or 5120 nt. But by allowing overlaps, utilising both strands and using a computer algorithm to generate the minimum sequence, we find the length required is only 516 base pairs. We generated this sequence as six overlapping double-stranded oligonucleotides, termed pentaprobe, and used it in gel retardation experiments to assess DNA binding by both known and putative DNA-binding proteins from several protein families. We have confirmed binding by the zinc finger proteins BKLF, Eos and Pegasus, the Ets domain protein PU.1 and the treble clef N- and C-terminal fingers of GATA-1. We also showed that the N-terminal zinc finger domain of FOG-1 does not behave as a typical DNA-binding domain. Our results suggest that pentaprobe, and related sequences such as hexaprobe, represent useful tools for probing protein function.

  16. Nonconsensus Protein Binding to Repetitive DNA Sequence Elements Significantly Affects Eukaryotic Genomes.

    Science.gov (United States)

    Afek, Ariel; Cohen, Hila; Barber-Zucker, Shiran; Gordân, Raluca; Lukatsky, David B

    2015-08-01

    Recent genome-wide experiments in different eukaryotic genomes provide an unprecedented view of transcription factor (TF) binding locations and of nucleosome occupancy. These experiments revealed that a large fraction of TF binding events occur in regions where only a small number of specific TF binding sites (TFBSs) have been detected. Furthermore, in vitro protein-DNA binding measurements performed for hundreds of TFs indicate that TFs are bound with wide range of affinities to different DNA sequences that lack known consensus motifs. These observations have thus challenged the classical picture of specific protein-DNA binding and strongly suggest the existence of additional recognition mechanisms that affect protein-DNA binding preferences. We have previously demonstrated that repetitive DNA sequence elements characterized by certain symmetries statistically affect protein-DNA binding preferences. We call this binding mechanism nonconsensus protein-DNA binding in order to emphasize the point that specific consensus TFBSs do not contribute to this effect. In this paper, using the simple statistical mechanics model developed previously, we calculate the nonconsensus protein-DNA binding free energy for the entire C. elegans and D. melanogaster genomes. Using the available chromatin immunoprecipitation followed by sequencing (ChIP-seq) results on TF-DNA binding preferences for ~100 TFs, we show that DNA sequences characterized by low predicted free energy of nonconsensus binding have statistically higher experimental TF occupancy and lower nucleosome occupancy than sequences characterized by high free energy of nonconsensus binding. This is in agreement with our previous analysis performed for the yeast genome. We suggest therefore that nonconsensus protein-DNA binding assists the formation of nucleosome-free regions, as TFs outcompete nucleosomes at genomic locations with enhanced nonconsensus binding. In addition, here we perform a new, large-scale analysis using

  17. Nonconsensus Protein Binding to Repetitive DNA Sequence Elements Significantly Affects Eukaryotic Genomes.

    Directory of Open Access Journals (Sweden)

    Ariel Afek

    2015-08-01

    Full Text Available Recent genome-wide experiments in different eukaryotic genomes provide an unprecedented view of transcription factor (TF binding locations and of nucleosome occupancy. These experiments revealed that a large fraction of TF binding events occur in regions where only a small number of specific TF binding sites (TFBSs have been detected. Furthermore, in vitro protein-DNA binding measurements performed for hundreds of TFs indicate that TFs are bound with wide range of affinities to different DNA sequences that lack known consensus motifs. These observations have thus challenged the classical picture of specific protein-DNA binding and strongly suggest the existence of additional recognition mechanisms that affect protein-DNA binding preferences. We have previously demonstrated that repetitive DNA sequence elements characterized by certain symmetries statistically affect protein-DNA binding preferences. We call this binding mechanism nonconsensus protein-DNA binding in order to emphasize the point that specific consensus TFBSs do not contribute to this effect. In this paper, using the simple statistical mechanics model developed previously, we calculate the nonconsensus protein-DNA binding free energy for the entire C. elegans and D. melanogaster genomes. Using the available chromatin immunoprecipitation followed by sequencing (ChIP-seq results on TF-DNA binding preferences for ~100 TFs, we show that DNA sequences characterized by low predicted free energy of nonconsensus binding have statistically higher experimental TF occupancy and lower nucleosome occupancy than sequences characterized by high free energy of nonconsensus binding. This is in agreement with our previous analysis performed for the yeast genome. We suggest therefore that nonconsensus protein-DNA binding assists the formation of nucleosome-free regions, as TFs outcompete nucleosomes at genomic locations with enhanced nonconsensus binding. In addition, here we perform a new, large

  18. High-throughput DNA sequencing: a genomic data manufacturing process.

    Science.gov (United States)

    Huang, G M

    1999-01-01

    The progress trends in automated DNA sequencing operation are reviewed. Technological development in sequencing instruments, enzymatic chemistry and robotic stations has resulted in ever-increasing capacity of sequence data production. This progress leads to a higher demand on laboratory information management and data quality assessment. High-throughput laboratories face the challenge of organizational management, as well as technology management. Engineering principles of process control should be adopted in this biological data manufacturing procedure. While various systems attempt to provide solutions to automate different parts of, or even the entire process, new technical advances will continue to change the paradigm and provide new challenges.

  19. Achieving high throughput sequencing of a cDNA library utilizing an alternative protocol for the bench top next-generation sequencing system.

    Science.gov (United States)

    Wan, Minxi; Faruq, Junaid; Rosenberg, Julian N; Xia, Jinlan; Oyler, George A; Betenbaugh, Michael J

    2013-02-15

    The development of next-generation sequencing (NGS) technologies has provided novel tools for genome analysis and expression profiling. A high throughput cDNA sequencing method using a bench top next-generation sequencing system, GS Junior, is now available. Here, we used an alternative protocol to the standard method for generating the cDNA library. This protocol can decrease the number of processing steps to manipulate RNA when constructing a cDNA library from an RNA sample, and does not require mRNA isolation from total RNA. Thus it can decrease the risk of RNA degradation and the cost for preparing a cDNA library. Also, the efficiency of sequencing data obtained with this approach is comparable to the standard method as verified by sequencing characteristics and expression levels of the reference gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH).

  20. RNA-DNA sequence differences spell genetic code ambiguities

    DEFF Research Database (Denmark)

    Bentin, Thomas; Nielsen, Michael L

    2013-01-01

    A recent paper in Science by Li et al. 2011(1) reports widespread sequence differences in the human transcriptome between RNAs and their encoding genes termed RNA-DNA differences (RDDs). The findings could add a new layer of complexity to gene expression but the study has been criticized. ...

  1. Functionalized nanopore-embedded electrodes for rapid DNA sequencing

    CERN Document Server

    He, Haiying; Pandey, Ravindra; Rocha, Alexandre Reily; Sanvito, Stefano; Grigoriev, Anton; Ahuja, Rajeev; Karna, Shashi P

    2007-01-01

    The determination of a patient's DNA sequence can, in principle, reveal an increased risk to fall ill with particular diseases [1,2] and help to design "personalized medicine" [3]. Moreover, statistical studies and comparison of genomes [4] of a large number of individuals are crucial for the analysis of mutations [5] and hereditary diseases, paving the way to preventive medicine [6]. DNA sequencing is, however, currently still a vastly time-consuming and very expensive task [4], consisting of pre-processing steps, the actual sequencing using the Sanger method, and post-processing in the form of data analysis [7]. Here we propose a new approach that relies on functionalized nanopore-embedded electrodes to achieve an unambiguous distinction of the four nucleic acid bases in the DNA sequencing process. This represents a significant improvement over previously studied designs [8,9] which cannot reliably distinguish all four bases of DNA. The transport properties of the setup investigated by us, employing state-o...

  2. DNA sequence handling programs in BASIC for home computers.

    OpenAIRE

    Biro, P A

    1984-01-01

    This paper describes a DNA sequence handling program written entirely in BASIC and designed to be run on an Atari home computer. Many of the features common to more sophisticated programs have been included. The advantage of this program are its convenience, its transportability and its potential for user modification. The disadvantages are lack of sophistication and speed.

  3. Decoding long nanopore sequencing reads of natural DNA.

    Science.gov (United States)

    Laszlo, Andrew H; Derrington, Ian M; Ross, Brian C; Brinkerhoff, Henry; Adey, Andrew; Nova, Ian C; Craig, Jonathan M; Langford, Kyle W; Samson, Jenny Mae; Daza, Riza; Doering, Kenji; Shendure, Jay; Gundlach, Jens H

    2014-08-01

    Nanopore sequencing of DNA is a single-molecule technique that may achieve long reads, low cost and high speed with minimal sample preparation and instrumentation. Here, we build on recent progress with respect to nanopore resolution and DNA control to interpret the procession of ion current levels observed during the translocation of DNA through the pore MspA. As approximately four nucleotides affect the ion current of each level, we measured the ion current corresponding to all 256 four-nucleotide combinations (quadromers). This quadromer map is highly predictive of ion current levels of previously unmeasured sequences derived from the bacteriophage phi X 174 genome. Furthermore, we show nanopore sequencing reads of phi X 174 up to 4,500 bases in length, which can be unambiguously aligned to the phi X 174 reference genome, and demonstrate proof-of-concept utility with respect to hybrid genome assembly and polymorphism detection. This work provides a foundation for nanopore sequencing of long, natural DNA strands.

  4. jMOTU and Taxonerator: turning DNA Barcode sequences into annotated operational taxonomic units.

    Directory of Open Access Journals (Sweden)

    Martin Jones

    Full Text Available BACKGROUND: DNA barcoding and other DNA sequence-based techniques for investigating and estimating biodiversity require explicit methods for associating individual sequences with taxa, as it is at the taxon level that biodiversity is assessed. For many projects, the bioinformatic analyses required pose problems for laboratories whose prime expertise is not in bioinformatics. User-friendly tools are required for both clustering sequences into molecular operational taxonomic units (MOTU and for associating these MOTU with known organismal taxonomies. RESULTS: Here we present jMOTU, a Java program for the analysis of DNA barcode datasets that uses an explicit, determinate algorithm to define MOTU. We demonstrate its usefulness for both individual specimen-based Sanger sequencing surveys and bulk-environment metagenetic surveys using long-read next-generation sequencing data. jMOTU is driven through a graphical user interface, and can analyse tens of thousands of sequences in a short time on a desktop computer. A companion program, Taxonerator, that adds traditional taxonomic annotation to MOTU, is also presented. Clustering and taxonomic annotation data are stored in a relational database, and are thus amenable to subsequent data mining and web presentation. CONCLUSIONS: jMOTU efficiently and robustly identifies the molecular taxa present in survey datasets, and Taxonerator decorates the MOTU with putative identifications. jMOTU and Taxonerator are freely available from http://www.nematodes.org/.

  5. A simple method encoding linear single strain DNA sequence with natural numbers

    Institute of Scientific and Technical Information of China (English)

    LI Jiye; XU Yuan; ZHANG Wang

    2008-01-01

    A simple method presenting linear single strain DNA (LssDNA) sequence with natural numbers is introduced in this paper. The method presents LssDNA correspondingly with the numerals 1, 2, 3 and 4. After calculation, the sequence can be coded in natural numbers which can also be decoded into the DNA sequence. Thus, an LssDNA sequence can be expressed in a natural number and a dot at coordinate axes. In the future, a new LssDNA sequences database termed "DotBank" would be realized in which each LssDNA sequence is determined as a dot.

  6. Sequence heterogeneity accelerates protein search for targets on DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shvets, Alexey A.; Kolomeisky, Anatoly B., E-mail: tolya@rice.edu [Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005 (United States)

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  7. Solid-State Nanopore-Based DNA Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Zewen Liu

    2016-01-01

    Full Text Available The solid-state nanopore-based DNA sequencing technology is becoming more and more attractive for its brand new future in gene detection field. The challenges that need to be addressed are diverse: the effective methods to detect base-specific signatures, the control of the nanopore’s size and surface properties, and the modulation of translocation velocity and behavior of the DNA molecules. Among these challenges, the realization of the high-quality nanopores with the help of modern micro/nanofabrication technologies is a crucial one. In this paper, typical technologies applied in the field of solid-state nanopore-based DNA sequencing have been reviewed.

  8. Electronic density of states in sequence dependent DNA molecules

    Science.gov (United States)

    de Oliveira, B. P. W.; Albuquerque, E. L.; Vasconcelos, M. S.

    2006-09-01

    We report in this work a numerical study of the electronic density of states (DOS) in π-stacked arrays of DNA single-strand segments made up from the nucleotides guanine G, adenine A, cytosine C and thymine T, forming a Rudin-Shapiro (RS) as well as a Fibonacci (FB) polyGC quasiperiodic sequences. Both structures are constructed starting from a G nucleotide as seed and following their respective inflation rules. Our theoretical method uses Dyson's equation together with a transfer-matrix treatment, within an electronic tight-binding Hamiltonian model, suitable to describe the DNA segments modelled by the quasiperiodic chains. We compared the DOS spectra found for the quasiperiodic structure to those using a sequence of natural DNA, as part of the human chromosome Ch22, with a remarkable concordance, as far as the RS structure is concerned. The electronic spectrum shows several peaks, corresponding to localized states, as well as a striking self-similar aspect.

  9. A complexity measure for symbolic sequences and applications to DNA

    CERN Document Server

    Majtey, A P; Lamberti, P W; Majtey, Ana P.; Roman-Roldan, Ramon; Lamberti, Pedro W.

    2006-01-01

    We introduce a complexity measure for symbolic sequences. Starting from a segmentation procedure of the sequence, we define its complexity as the entropy of the distribution of lengths of the domains of relatively uniform composition in which the sequence is decomposed. We show that this quantity verifies the properties usually required for a ``good'' complexity measure. In particular it satisfies the one hump property, is super-additive and has the important property of being dependent of the level of detail in which the sequence is analyzed. Finally we apply it to the evaluation of the complexity profile of some genetic sequences.

  10. A blind testing design for authenticating ancient DNA sequences.

    Science.gov (United States)

    Yang, H; Golenberg, E M; Shoshani, J

    1997-04-01

    Reproducibility is a serious concern among researchers of ancient DNA. We designed a blind testing procedure to evaluate laboratory accuracy and authenticity of ancient DNA obtained from closely related extant and extinct species. Soft tissue and bones of fossil and contemporary museum proboscideans were collected and identified based on morphology by one researcher, and other researchers carried out DNA testing on the samples, which were assigned anonymous numbers. DNA extracted using three principal isolation methods served as template in PCR amplifications of a segment of the cytochrome b gene (mitochondrial genome), and the PCR product was directly sequenced and analyzed. The results show that such a blind testing design performed in one laboratory, when coupled with phylogenetic analysis, can nonarbitrarily test the consistency and reliability of ancient DNA results. Such reproducible results obtained from the blind testing can increase confidence in the authenticity of ancient sequences obtained from postmortem specimens and avoid bias in phylogenetic analysis. A blind testing design may be applicable as an alternative to confirm ancient DNA results in one laboratory when independent testing by two laboratories is not available.

  11. DNA watermarks in non-coding regulatory sequences

    Directory of Open Access Journals (Sweden)

    Pyka Martin

    2009-07-01

    Full Text Available Abstract Background DNA watermarks can be applied to identify the unauthorized use of genetically modified organisms. It has been shown that coding regions can be used to encrypt information into living organisms by using the DNA-Crypt algorithm. Yet, if the sequence of interest presents a non-coding DNA sequence, either the function of a resulting functional RNA molecule or a regulatory sequence, such as a promoter, could be affected. For our studies we used the small cytoplasmic RNA 1 in yeast and the lac promoter region of Escherichia coli. Findings The lac promoter was deactivated by the integrated watermark. In addition, the RNA molecules displayed altered configurations after introducing a watermark, but surprisingly were functionally intact, which has been verified by analyzing the growth characteristics of both wild type and watermarked scR1 transformed yeast cells. In a third approach we introduced a second overlapping watermark into the lac promoter, which did not affect the promoter activity. Conclusion Even though the watermarked RNA and one of the watermarked promoters did not show any significant differences compared to the wild type RNA and wild type promoter region, respectively, it cannot be generalized that other RNA molecules or regulatory sequences behave accordingly. Therefore, we do not recommend integrating watermark sequences into regulatory regions.

  12. A Nano-Biosensor for DNA Sequence Detection Using Absorption Spectra of SWNT-DNA Composite

    Directory of Open Access Journals (Sweden)

    J. Bansal

    2011-01-01

    Full Text Available A biosensor based on Single Walled Carbon Nanotube (SWNT-Poly (GTn ssDNA hybrid has been developed for medical diagnostics. The absorption spectrum of this assay is determined with the help of a Shimadzu UV-VIS-NIR spectrophotometer. Two distinct bands each containing three peaks corresponding to first and second van Hove singularities in the density of states of the nanotubes were observed in the absorption spectrum. When a single-stranded DNA (ssDNA having a sequence complementary to probic DNA is added to the ssDNA-SWNT conjugates, hybridization takes place, which causes the red shift of absorption spectrum of nanotubes. On the other hand, when the DNA is noncomplementary, no shift in the absorption spectrum occurs since hybridization between the DNA and probe does not take place. The red shifting of the spectrum is considered to be due to change in the dielectric environment around nanotubes.

  13. Spectral sum rules and search for periodicities in DNA sequences

    Science.gov (United States)

    Chechetkin, V. R.

    2011-04-01

    Periodic patterns play the important regulatory and structural roles in genomic DNA sequences. Commonly, the underlying periodicities should be understood in a broad statistical sense, since the corresponding periodic patterns have been strongly distorted by the random point mutations and insertions/deletions during molecular evolution. The latent periodicities in DNA sequences can be efficiently displayed by Fourier transform. The criteria of significance for observed periodicities are obtained via the comparison versus the counterpart characteristics of the reference random sequences. We show that the restrictions imposed on the significance criteria by the rigorous spectral sum rules can be rationally described with De Finetti distribution. This distribution provides the convenient intermediate asymptotic form between Rayleigh distribution and exact combinatoric theory.

  14. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    Energy Technology Data Exchange (ETDEWEB)

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S. [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa); Arbuthnot, Patrick, E-mail: Patrick.Arbuthnot@wits.ac.za [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa)

    2009-11-20

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  15. ADN-Viewer: a 3D approach for bioinformatic analyses of large DNA sequences.

    Science.gov (United States)

    Hérisson, Joan; Ferey, Nicolas; Gros, Pierre-Emmanuel; Gherbi, Rachid

    2007-01-20

    Most of biologists work on textual DNA sequences that are limited to the linear representation of DNA. In this paper, we address the potential offered by Virtual Reality for 3D modeling and immersive visualization of large genomic sequences. The representation of the 3D structure of naked DNA allows biologists to observe and analyze genomes in an interactive way at different levels. We developed a powerful software platform that provides a new point of view for sequences analysis: ADNViewer. Nevertheless, a classical eukaryotic chromosome of 40 million base pairs requires about 6 Gbytes of 3D data. In order to manage these huge amounts of data in real-time, we designed various scene management algorithms and immersive human-computer interaction for user-friendly data exploration. In addition, one bioinformatics study scenario is proposed.

  16. VoSeq: a voucher and DNA sequence web application.

    Directory of Open Access Journals (Sweden)

    Carlos Peña

    Full Text Available There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit.

  17. Cloning, characterization, and properties of seven triplet repeat DNA sequences.

    Science.gov (United States)

    Ohshima, K; Kang, S; Larson, J E; Wells, R D

    1996-07-12

    Several neuromuscular and neurodegenerative diseases are caused by genetically unstable triplet repeat sequences (CTG.CAG, CGG.CCG, or AAG.CTT) in or near the responsible genes. We implemented novel cloning strategies with chemically synthesized oligonucleotides to clone seven of the triplet repeat sequences (GTA.TAC, GAT.ATC, GTT.AAC, CAC.GTG, AGG.CCT, TCG.CGA, and AAG.CTT), and the adjoining paper (Ohshima, K., Kang, S., Larson, J. E., and Wells, R. D.(1996) J. Biol. Chem. 271, 16784-16791) describes studies on TTA.TAA. This approach in conjunction with in vivo expansion studies in Escherichia coli enabled the preparation of at least 81 plasmids containing the repeat sequences with lengths of approximately 16 up to 158 triplets in both orientations with varying extents of polymorphisms. The inserts were characterized by DNA sequencing as well as DNA polymerase pausings, two-dimensional agarose gel electrophoresis, and chemical probe analyses to evaluate the capacity to adopt negative supercoil induced non-B DNA conformations. AAG.CTT and AGG.CCT form intramolecular triplexes, and the other five repeat sequences do not form any previously characterized non-B structures. However, long tracts of TCG.CGA showed strong inhibition of DNA synthesis at specific loci in the repeats as seen in the cases of CTG.CAG and CGG.CCG (Kang, S., Ohshima, K., Shimizu, M., Amirhaeri, S., and Wells, R. D.(1995) J. Biol. Chem. 270, 27014-27021). This work along with other studies (Wells, R. D.(1996) J. Biol. Chem. 271, 2875-2878) on CTG.CAG, CGG.CCG, and TTA.TAA makes available long inserts of all 10 triplet repeat sequences for a variety of physical, molecular biological, genetic, and medical investigations. A model to explain the reduction in mRNA abundance in Friedreich's ataxia based on intermolecular triplex formation is proposed.

  18. A Real-Time de novo DNA Sequencing Assembly Platform Based on an FPGA Implementation.

    Science.gov (United States)

    Hu, Yuanqi; Georgiou, Pantelis

    2016-01-01

    This paper presents an FPGA based DNA comparison platform which can be run concurrently with the sensing phase of DNA sequencing and shortens the overall time needed for de novo DNA assembly. A hybrid overlap searching algorithm is applied which is scalable and can deal with incremental detection of new bases. To handle the incomplete data set which gradually increases during sequencing time, all-against-all comparisons are broken down into successive window-against-window comparison phases and executed using a novel dynamic suffix comparison algorithm combined with a partitioned dynamic programming method. The complete system has been designed to facilitate parallel processing in hardware, which allows real-time comparison and full scalability as well as a decrease in the number of computations required. A base pair comparison rate of 51.2 G/s is achieved when implemented on an FPGA with successful DNA comparison when using data sets from real genomes.

  19. Perspectives of DNA microarray and next-generation DNA sequencing technologies

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    DNA microarray and next-generation DNA sequencing technologies are important tools for high-throughput genome research,in revealing both the structural and functional characteristics of genomes.In the past decade the DNA microarray technologies have been widely applied in the studies of functional genomics,systems biology and pharmacogenomics.The next-generation DNA sequencing method was first introduced by the 454 Company in 2003,immediately followed by the establishment of the Solexa and Solid techniques by other biotech companies.Though it has not been long since the first emergence of this technology,with the fast and impressive improvement,the application of this technology has extended to almost all fields of genomics research,as a rival challenging the existing DNA microarray technology.This paper briefly reviews the working principles of these two technologies as well as their application and perspectives in genome research.

  20. [Patentability of DNA sequences: the debate remains open].

    Science.gov (United States)

    Martín Uranga, Amelia

    2013-01-01

    The patentability of human genes was from the beginning of the discussion concerning the Directive on the legal protection of biotechnological inventions, an issue that provoked debates among politicians, scientists, lawyers and civil society itself. Although Directive 98/44 tried to settle the matter by stating that to support the patentability of human genes, it should know what role they fulfill, which protein they encode, all of this as an essential requirement to test its industrial application. However, following the judgment of 13 June 2013 (Supreme Court of the United States of America in the case of Association for Molecular Pathology et al. versus Myriad Genetics Inc.) the debate on this issue has been reopened. There are several issues to be considered, taking into account that the patents on DNA & Gene Sequences have played an important incentive to increase the interest in biotechnology applied to human health. On the other hand, this is a paradigm shift in the R & D of biopharmaceutical companies, and it has moved from an in house research model to a model of open innovation, a model of collaboration between large corporations with biotech SMEs and public and private research centers. This model of innovation, impacts on the issue of the industrial property, and therefore it will be necessary to clearly define what each party brings to the relationship and how they are expected to share the results. But all of this, with the ultimate goal that the patients have access to treatments and medications most innovative, safe and effective.

  1. Recent developments in sequence selective minor groove DNA effectors.

    Science.gov (United States)

    Reddy, B S; Sharma, S K; Lown, J W

    2001-04-01

    DNA is a well characterized intracellular target but its large size and sequential nature make it an elusive target for selective drug action. Binding of low molecular weight ligands to DNA causes a wide variety of potential biological responses. In this respect the main consideration is given to recent developments in DNA sequence selective binding agents bearing conjugated effectors because of their potential application in diagnosis and treatment of cancers as well as in molecular biology. Recent progress in the development of cross linked lexitropsin oligopeptides and hairpins, which bind selectively to the minor groove of duplex DNA, is discussed. Bis-distamycins and related lexitropsins show inhibitory activity against HIV-1 and HIV-2 integrases at low nanomolar concentrations. Benzoyl nitrogen mustard analogs of lexitropsins are active against a variety of tumor models. Certain of the bis-benzimidazoles show altered DNA sequence preference and bind to DNA at 5'CG and TG sequences rather than at the preferred AT sites of the parent drug. A comparison of bifunctional bizelesin with monoalkylating adozelesin shows that it appears to have an increased sequence selectivity such that monoalkylating compounds react at more than one site but bizelesin reacts only at sites where there are two suitably positioned alkylation sites. Adozelesin, bizelesin and carzelesin are far more potent as cytotoxic agents than cisplatin or doxorubicin. A new class of 1,2,9,9a-tetrahydrocyclo-propa[c]benz[e]indole-4-one (CBI) analogs i.e., CBI-lexitropsin conjugates arising from the latter leads are also discussed.A number of cyclopropylpyrroloindole (CPI) and CBI-lexitropsin conjugates related to CC-1065 alkylate at the N3 position of adenine in the minor groove of DNA in a sequence specific manner, and also show cytotoxicities in the femtomolar range. The cross linking efficiency of PBD dimers is much greater than that of other cross linkers including cisplatin, and melphalan. A new

  2. Mechanism of sequence-specific template binding by the DNA primase of bacteriophage T7

    KAUST Repository

    Lee, Seung-Joo

    2010-03-28

    DNA primases catalyze the synthesis of the oligoribonucleotides required for the initiation of lagging strand DNA synthesis. Biochemical studies have elucidated the mechanism for the sequence-specific synthesis of primers. However, the physical interactions of the primase with the DNA template to explain the basis of specificity have not been demonstrated. Using a combination of surface plasmon resonance and biochemical assays, we show that T7 DNA primase has only a slightly higher affinity for DNA containing the primase recognition sequence (5\\'-TGGTC-3\\') than for DNA lacking the recognition site. However, this binding is drastically enhanced by the presence of the cognate Nucleoside triphosphates (NTPs), Adenosine triphosphate (ATP) and Cytosine triphosphate (CTP) that are incorporated into the primer, pppACCA. Formation of the dimer, pppAC, the initial step of sequence-specific primer synthesis, is not sufficient for the stable binding. Preformed primers exhibit significantly less selective binding than that observed with ATP and CTP. Alterations in subdomains of the primase result in loss of selective DNA binding. We present a model in which conformational changes induced during primer synthesis facilitate contact between the zinc-binding domain and the polymerase domain. The Author(s) 2010. Published by Oxford University Press.

  3. [Characterization and modification of phage T7 DNA polymerase for use in DNA sequencing]: Progress report

    Energy Technology Data Exchange (ETDEWEB)

    1992-01-01

    This project focuses on the DNA polymerase and accessory proteins of phage T7 for use in DNA sequence analysis. T7 DNA polymerase (gene 5 protein) interacts with accessory proteins for the acquisition of properties such as processivity that are necessary for DNA replication. One goal is to understand these interactions in order to modify the proteins to increase their usefulness with DNA sequence analysis. Using a genetically modified gene 5 protein lacking 3' to 5' exonuclease activity we have found that in the presence of manganese there is no discrimination against dideoxynucleotides, a property that enables novel approaches to DNA sequencing using automated technology. Pyrophosphorolysis can create problems in DNA sequence determination, a problem that can be eliminated by the addition of pyrophosphatase. Crystals of the gene 5 protein/thioredoxin complex have now been obtained and X-ray diffraction analysis will be undertaken once their quality has been improved. Amino acid changes in gene 5 protein have been identified that alter its interaction with thioredoxin. Characterization of these proteins should help determine how thioredoxin confers processivity on polymerization. We have characterized the 17 DNA binding protein, the gene 2.5 protein, and shown that it interacts with gene 5 protein and gene 4 protein. The gene 2.5 protein mediates homologous base pairing and strand uptake. Gene 5.5 protein interacts with E. coli Hl protein and affects gene expression. Biochemical and genetic studies on the T7 56-kDa gene 4 protein, the helicase, are focused on its physical interaction with T7 DNA polymerase and the mechanism by which the hydrolysis of nucleoside triphosphates fuels its unidirectional translocation on DNA.

  4. [Characterization and modification of phage T7 DNA polymerase for use in DNA sequencing]: Progress report

    Energy Technology Data Exchange (ETDEWEB)

    1992-12-31

    This project focuses on the DNA polymerase and accessory proteins of phage T7 for use in DNA sequence analysis. T7 DNA polymerase (gene 5 protein) interacts with accessory proteins for the acquisition of properties such as processivity that are necessary for DNA replication. One goal is to understand these interactions in order to modify the proteins to increase their usefulness with DNA sequence analysis. Using a genetically modified gene 5 protein lacking 3` to 5` exonuclease activity we have found that in the presence of manganese there is no discrimination against dideoxynucleotides, a property that enables novel approaches to DNA sequencing using automated technology. Pyrophosphorolysis can create problems in DNA sequence determination, a problem that can be eliminated by the addition of pyrophosphatase. Crystals of the gene 5 protein/thioredoxin complex have now been obtained and X-ray diffraction analysis will be undertaken once their quality has been improved. Amino acid changes in gene 5 protein have been identified that alter its interaction with thioredoxin. Characterization of these proteins should help determine how thioredoxin confers processivity on polymerization. We have characterized the 17 DNA binding protein, the gene 2.5 protein, and shown that it interacts with gene 5 protein and gene 4 protein. The gene 2.5 protein mediates homologous base pairing and strand uptake. Gene 5.5 protein interacts with E. coli Hl protein and affects gene expression. Biochemical and genetic studies on the T7 56-kDa gene 4 protein, the helicase, are focused on its physical interaction with T7 DNA polymerase and the mechanism by which the hydrolysis of nucleoside triphosphates fuels its unidirectional translocation on DNA.

  5. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    Energy Technology Data Exchange (ETDEWEB)

    Hidajat, Rachmat; Nickols, Brian [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States); Forrester, Naomi [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States); Tretyakova, Irina [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States); Weaver, Scott [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States); Pushko, Peter, E-mail: ppushko@medigen-usa.com [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States)

    2016-03-15

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.

  6. Cloning and characterization of a repetitive DNA sequence specific for Trichomonas vaginalis.

    Science.gov (United States)

    Paces, J; Urbánková, V; Urbánek, P

    1992-09-01

    A family of 650-bp-long repeats from the Trichomonas vaginalis genome, designated the Tv-E650 family, was cloned and sequenced. The nucleotide sequence is A+T-rich (73.3% A+T in the consensus sequence) and highly conserved among the 8 molecular clones analyzed. The differences among the clones are single-nucleotide and 2-nucleotide substitutions and insertions or deletions. The sequence uniformity of the clones as well as the presence of identical mutations in different clones suggest that efficient sequence homogenization mechanisms, such as gene conversion or recurring unequal crossing-over, operate in T. vaginalis. The copy number of the Tv-E650 repeats was estimated to be about 10(2)-10(3) per genome. Based on the DNA hybridization results, the Tv-E650 repeat family is conserved in all T. vaginalis strains examined, regardless of their diverse geographical origin. No hybridization of the Tv-E650 probe was found with the DNA from Trichomonas tenax, Trichomonas gallinae and Pentatrichomonas hominis, indicating that the Tv-E650 repeated sequences are species-specific. A dot blot hybridization protocol was developed which does not require isolation of DNA. By using this protocol it was possible to detect the DNA released from approximately 10(3) T. vaginalis cells per dot. These observations suggest that the Tv-E650 probe is potentially applicable to the identification and detection of T. vaginalis.

  7. Delineating relative homogeneous G+C domains in DNA sequences.

    Science.gov (United States)

    Li, W

    2001-10-03

    The concept of homogeneity of G+C content is always relative and subjective. This point is emphasized and quantified in this paper using a simple example of one sequence segmented into two subsequences. Whether the sequence is homogeneous or not can be answered by whether the two-subsequence model describes the DNA sequence better than the one-sequence model. There are at least three equivalent ways of looking at the 1-to-2 segmentation: Jensen-Shannon divergence measure, log likelihood ratio test, and model selection using Bayesian information criterion. Once a criterion is chosen, a DNA sequence can be recursively segmented into multiple domains. We use one subjective criterion called segmentation strength based on the Bayesian information criterion. Whether or not a sequence is homogeneous and how many domains it has depend on this criterion. We compare six different genome sequences (yeast S. cerevisiae chromosome III and IV, bacterium M. pneumoniae, human major histocompatibility complex sequence, longest contigs in human chromosome 21 and 22) by recursive segmentations at different strength criteria. Results by recursive segmentation confirm that yeast chromosome IV is more homogeneous than yeast chromosome III, human chromosome 21 is more homogeneous than human chromosome 22, and bacterial genomes may not be homogeneous due to short segments with distinct base compositions. The recursive segmentation also provides a quantitative criterion for identifying isochores in human sequences. Some features of our recursive segmentation, such as the possibility of delineating domain borders accurately, are superior to those of the moving-window approach commonly used in such analyses.

  8. Automated parallel DNA sequencing on multiple channel microchips.

    Science.gov (United States)

    Liu, S; Ren, H; Gao, Q; Roach, D J; Loder, R T; Armstrong, T M; Mao, Q; Blaga, I; Barker, D L; Jovanovich, S B

    2000-05-09

    We report automated DNA sequencing in 16-channel microchips. A microchip prefilled with sieving matrix is aligned on a heating plate affixed to a movable platform. Samples are loaded into sample reservoirs by using an eight-tip pipetting device, and the chip is docked with an array of electrodes in the focal plane of a four-color scanning detection system. Under computer control, high voltage is applied to the appropriate reservoirs in a programmed sequence that injects and separates the DNA samples. An integrated four-color confocal fluorescent detector automatically scans all 16 channels. The system routinely yields more than 450 bases in 15 min in all 16 channels. In the best case using an automated base-calling program, 543 bases have been called at an accuracy of >99%. Separations, including automated chip loading and sample injection, normally are completed in less than 18 min. The advantages of DNA sequencing on capillary electrophoresis chips include uniform signal intensity and tolerance of high DNA template concentration. To understand the fundamentals of these unique features we developed a theoretical treatment of cross-channel chip injection that we call the differential concentration effect. We present experimental evidence consistent with the predictions of the theory.

  9. Impacts of degraded DNA on restriction enzyme associated DNA sequencing (RADSeq).

    Science.gov (United States)

    Graham, Carly F; Glenn, Travis C; McArthur, Andrew G; Boreham, Douglas R; Kieran, Troy; Lance, Stacey; Manzon, Richard G; Martino, Jessica A; Pierson, Todd; Rogers, Sean M; Wilson, Joanna Y; Somers, Christopher M

    2015-11-01

    Degraded DNA from suboptimal field sampling is common in molecular ecology. However, its impact on techniques that use restriction site associated next-generation DNA sequencing (RADSeq, GBS) is unknown. We experimentally examined the effects of in situDNA degradation on data generation for a modified double-digest RADSeq approach (3RAD). We generated libraries using genomic DNA serially extracted from the muscle tissue of 8 individual lake whitefish (Coregonus clupeaformis) following 0-, 12-, 48- and 96-h incubation at room temperature posteuthanasia. This treatment of the tissue resulted in input DNA that ranged in quality from nearly intact to highly sheared. All samples were sequenced as a multiplexed pool on an Illumina MiSeq. Libraries created from low to moderately degraded DNA (12-48 h) performed well. In contrast, the number of RADtags per individual, number of variable sites, and percentage of identical RADtags retained were all dramatically reduced when libraries were made using highly degraded DNA (96-h group). This reduction in performance was largely due to a significant and unexpected loss of raw reads as a result of poor quality scores. Our findings remained consistent after changes in restriction enzymes, modified fold coverage values (2- to 16-fold), and additional read-length trimming. We conclude that starting DNA quality is an important consideration for RADSeq; however, the approach remains robust until genomic DNA is extensively degraded.

  10. Assessing the fidelity of ancient DNA sequences amplified from nuclear genes

    DEFF Research Database (Denmark)

    Binladen, Jonas; Wiuf, Carsten Henrik; Gilbert, M. Thomas P.

    2006-01-01

    To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in ph...

  11. Sigma: multiple alignment of weakly-conserved non-coding DNA sequence

    Directory of Open Access Journals (Sweden)

    Siddharthan Rahul

    2006-03-01

    Full Text Available Abstract Background Existing tools for multiple-sequence alignment focus on aligning protein sequence or protein-coding DNA sequence, and are often based on extensions to Needleman-Wunsch-like pairwise alignment methods. We introduce a new tool, Sigma, with a new algorithm and scoring scheme designed specifically for non-coding DNA sequence. This problem acquires importance with the increasing number of published sequences of closely-related species. In particular, studies of gene regulation seek to take advantage of comparative genomics, and recent algorithms for finding regulatory sites in phylogenetically-related intergenic sequence require alignment as a preprocessing step. Much can also be learned about evolution from intergenic DNA, which tends to evolve faster than coding DNA. Sigma uses a strategy of seeking the best possible gapless local alignments (a strategy earlier used by DiAlign, at each step making the best possible alignment consistent with existing alignments, and scores the significance of the alignment based on the lengths of the aligned fragments and a background model which may be supplied or estimated from an auxiliary file of intergenic DNA. Results Comparative tests of sigma with five earlier algorithms on synthetic data generated to mimic real data show excellent performance, with Sigma balancing high "sensitivity" (more bases aligned with effective filtering of "incorrect" alignments. With real data, while "correctness" can't be directly quantified for the alignment, running the PhyloGibbs motif finder on pre-aligned sequence suggests that Sigma's alignments are superior. Conclusion By taking into account the peculiarities of non-coding DNA, Sigma fills a gap in the toolbox of bioinformatics.

  12. The influence of DNA sequence on epigenome-induced pathologies

    Directory of Open Access Journals (Sweden)

    Meagher Richard B

    2012-07-01

    Full Text Available Abstract Clear cause-and-effect relationships are commonly established between genotype and the inherited risk of acquiring human and plant diseases and aberrant phenotypes. By contrast, few such cause-and-effect relationships are established linking a chromatin structure (that is, the epitype with the transgenerational risk of acquiring a disease or abnormal phenotype. It is not entirely clear how epitypes are inherited from parent to offspring as populations evolve, even though epigenetics is proposed to be fundamental to evolution and the likelihood of acquiring many diseases. This article explores the hypothesis that, for transgenerationally inherited chromatin structures, “genotype predisposes epitype”, and that epitype functions as a modifier of gene expression within the classical central dogma of molecular biology. Evidence for the causal contribution of genotype to inherited epitypes and epigenetic risk comes primarily from two different kinds of studies discussed herein. The first and direct method of research proceeds by the examination of the transgenerational inheritance of epitype and the penetrance of phenotype among genetically related individuals. The second approach identifies epitypes that are duplicated (as DNA sequences are duplicated and evolutionarily conserved among repeated patterns in the DNA sequence. The body of this article summarizes particularly robust examples of these studies from humans, mice, Arabidopsis, and other organisms. The bulk of the data from both areas of research support the hypothesis that genotypes predispose the likelihood of displaying various epitypes, but for only a few classes of epitype. This analysis suggests that renewed efforts are needed in identifying polymorphic DNA sequences that determine variable nucleosome positioning and DNA methylation as the primary cause of inherited epigenome-induced pathologies. By contrast, there is very little evidence that DNA sequence directly

  13. Spectral sum rules and search for periodicities in DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Chechetkin, V.R., E-mail: chechet@biochip.r [Theoretical Department of Division for Perspective Investigations, Troitsk Institute of Innovation and Thermonuclear Investigations (TRINITI), Troitsk, 142190 Moscow Region (Russian Federation)

    2011-04-18

    Periodic patterns play the important regulatory and structural roles in genomic DNA sequences. Commonly, the underlying periodicities should be understood in a broad statistical sense, since the corresponding periodic patterns have been strongly distorted by the random point mutations and insertions/deletions during molecular evolution. The latent periodicities in DNA sequences can be efficiently displayed by Fourier transform. The criteria of significance for observed periodicities are obtained via the comparison versus the counterpart characteristics of the reference random sequences. We show that the restrictions imposed on the significance criteria by the rigorous spectral sum rules can be rationally described with De Finetti distribution. This distribution provides the convenient intermediate asymptotic form between Rayleigh distribution and exact combinatoric theory. - Highlights: We study the significance criteria for latent periodicities in DNA sequences. The constraints imposed by sum rules can be described with De Finetti distribution. It is intermediate between Rayleigh distribution and exact combinatoric theory. Theory is applicable to the study of correlations between different periodicities. The approach can be generalized to the arbitrary discrete Fourier transform.

  14. Early Lyme disease with spirochetemia - diagnosed by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Jones William

    2010-11-01

    Full Text Available Abstract Background A sensitive and analytically specific nucleic acid amplification test (NAAT is valuable in confirming the diagnosis of early Lyme disease at the stage of spirochetemia. Findings Venous blood drawn from patients with clinical presentations of Lyme disease was tested for the standard 2-tier screen and Western Blot serology assay for Lyme disease, and also by a nested polymerase chain reaction (PCR for B. burgdorferi sensu lato 16S ribosomal DNA. The PCR amplicon was sequenced for B. burgdorferi genomic DNA validation. A total of 130 patients visiting emergency room (ER or Walk-in clinic (WALKIN, and 333 patients referred through the private physicians' offices were studied. While 5.4% of the ER/WALKIN patients showed DNA evidence of spirochetemia, none (0% of the patients referred from private physicians' offices were DNA-positive. In contrast, while 8.4% of the patients referred from private physicians' offices were positive for the 2-tier Lyme serology assay, only 1.5% of the ER/WALKIN patients were positive for this antibody test. The 2-tier serology assay missed 85.7% of the cases of early Lyme disease with spirochetemia. The latter diagnosis was confirmed by DNA sequencing. Conclusion Nested PCR followed by automated DNA sequencing is a valuable supplement to the standard 2-tier antibody assay in the diagnosis of early Lyme disease with spirochetemia. The best time to test for Lyme spirochetemia is when the patients living in the Lyme disease endemic areas develop unexplained symptoms or clinical manifestations that are consistent with Lyme disease early in the course of their illness.

  15. MEME: discovering and analyzing DNA and protein sequence motifs.

    Science.gov (United States)

    Bailey, Timothy L; Williams, Nadya; Misleh, Chris; Li, Wilfred W

    2006-07-01

    MEME (Multiple EM for Motif Elicitation) is one of the most widely used tools for searching for novel 'signals' in sets of biological sequences. Applications include the discovery of new transcription factor binding sites and protein domains. MEME works by searching for repeated, ungapped sequence patterns that occur in the DNA or protein sequences provided by the user. Users can perform MEME searches via the web server hosted by the National Biomedical Computation Resource (http://meme.nbcr.net) and several mirror sites. Through the same web server, users can also access the Motif Alignment and Search Tool to search sequence databases for matches to motifs encoded in several popular formats. By clicking on buttons in the MEME output, users can compare the motifs discovered in their input sequences with databases of known motifs, search sequence databases for matches to the motifs and display the motifs in various formats. This article describes the freely accessible web server and its architecture, and discusses ways to use MEME effectively to find new sequence patterns in biological sequences and analyze their significance.

  16. A CLIQUE algorithm using DNA computing techniques based on closed-circle DNA sequences.

    Science.gov (United States)

    Zhang, Hongyan; Liu, Xiyu

    2011-07-01

    DNA computing has been applied in broad fields such as graph theory, finite state problems, and combinatorial problem. DNA computing approaches are more suitable used to solve many combinatorial problems because of the vast parallelism and high-density storage. The CLIQUE algorithm is one of the gird-based clustering techniques for spatial data. It is the combinatorial problem of the density cells. Therefore we utilize DNA computing using the closed-circle DNA sequences to execute the CLIQUE algorithm for the two-dimensional data. In our study, the process of clustering becomes a parallel bio-chemical reaction and the DNA sequences representing the marked cells can be combined to form a closed-circle DNA sequences. This strategy is a new application of DNA computing. Although the strategy is only for the two-dimensional data, it provides a new idea to consider the grids to be vertexes in a graph and transform the search problem into a combinatorial problem.

  17. Short sequence effect of ancient DNA on mammoth phylogenetic analyses

    Institute of Scientific and Technical Information of China (English)

    Guilian SHENG; Lianjuan WU; Xindong HOU; Junxia YUAN; Shenghong CHENG; Bojian ZHONG; Xulong LAI

    2009-01-01

    The evolution of Elephantidae has been intensively studied in the past few years, especially after 2006. The molecular approaches have made great contribution to the assumption that the extinct woolly mammoth has a close relationship with the Asian elephant instead of the African elephant. In this study, partial ancient DNA sequences of cytochrome b (cyt b) gene in mitochondrial genome were successfully retrieved from Late Pleistocene Mammuthus primigenius bones collected from Heilongjiang Province in Northeast China. Both the partial and complete homologous cyt b gene sequences and the whole mitochondrial genome sequences extracted from GenBank were aligned and used as datasets for phylogenetic analyses. All of the phylogenetic trees, based on either the partial or the complete cyt b gene, reject the relationship constructed by the whole mitochondrial genome, showing the occurrence of an effect of sequence length of cyt b gene on mammoth phylogenetic analyses.

  18. Accelerating DNA Sequence Analysis using Intel Xeon Phi

    OpenAIRE

    Memeti, Suejb; Pllana, Sabri

    2015-01-01

    Genetic information is increasing exponentially, doubling every 18 months. Analyzing this information within a reasonable amount of time requires parallel computing resources. While considerable research has addressed DNA analysis using GPUs, so far not much attention has been paid to the Intel Xeon Phi coprocessor. In this paper we present an algorithm for large-scale DNA analysis that exploits thread-level and the SIMD parallelism of the Intel Xeon Phi. We evaluate our approach for various ...

  19. An optimized chloroplast DNA extraction protocol for grasses (Poaceae proves suitable for whole plastid genome sequencing and SNP detection.

    Directory of Open Access Journals (Sweden)

    Kerstin Diekmann

    Full Text Available BACKGROUND: Obtaining chloroplast genome sequences is important to increase the knowledge about the fundamental biology of plastids, to understand evolutionary and ecological processes in the evolution of plants, to develop biotechnological applications (e.g. plastid engineering and to improve the efficiency of breeding schemes. Extraction of pure chloroplast DNA is required for efficient sequencing of chloroplast genomes. Unfortunately, most protocols for extracting chloroplast DNA were developed for eudicots and do not produce sufficiently pure yields for a shotgun sequencing approach of whole plastid genomes from the monocot grasses. METHODOLOGY/PRINCIPAL FINDINGS: We have developed a simple and inexpensive method to obtain chloroplast DNA from grass species by modifying and extending protocols optimized for the use in eudicots. Many protocols for extracting chloroplast DNA require an ultracentrifugation step to efficiently separate chloroplast DNA from nuclear DNA. The developed method uses two more centrifugation steps than previously reported protocols and does not require an ultracentrifuge. CONCLUSIONS/SIGNIFICANCE: The described method delivered chloroplast DNA of very high quality from two grass species belonging to highly different taxonomic subfamilies within the grass family (Lolium perenne, Pooideae; Miscanthus x giganteus, Panicoideae. The DNA from Lolium perenne was used for whole chloroplast genome sequencing and detection of SNPs. The sequence is publicly available on EMBL/GenBank.

  20. ParB spreading requires DNA bridging

    NARCIS (Netherlands)

    Graham, Thomas G. W.; Wang, Xindan; Song, Dan; Etson, Candice M.; van Oijen, Antoine M.; Rudner, David Z.; Loparo, Joseph J.

    2014-01-01

    The parABS system is a widely employed mechanism for plasmid partitioning and chromosome segregation in bacteria. ParB binds to parS sites on plasmids and chromosomes and associates with broad regions of adjacent DNA, a phenomenon known as spreading. Although essential for ParB function, the mechani

  1. Viral discovery and sequence recovery using DNA microarrays.

    Directory of Open Access Journals (Sweden)

    David Wang

    2003-11-01

    Full Text Available Because of the constant threat posed by emerging infectious diseases and the limitations of existing approaches used to identify new pathogens, there is a great demand for new technological methods for viral discovery. We describe herein a DNA microarray-based platform for novel virus identification and characterization. Central to this approach was a DNA microarray designed to detect a wide range of known viruses as well as novel members of existing viral families; this microarray contained the most highly conserved 70mer sequences from every fully sequenced reference viral genome in GenBank. During an outbreak of severe acute respiratory syndrome (SARS in March 2003, hybridization to this microarray revealed the presence of a previously uncharacterized coronavirus in a viral isolate cultivated from a SARS patient. To further characterize this new virus, approximately 1 kb of the unknown virus genome was cloned by physically recovering viral sequences hybridized to individual array elements. Sequencing of these fragments confirmed that the virus was indeed a new member of the coronavirus family. This combination of array hybridization followed by direct viral sequence recovery should prove to be a general strategy for the rapid identification and characterization of novel viruses and emerging infectious disease.

  2. Phylogenetic relationships of the Gomphales based on nuc-25S-rDNA, mit-12S-rDNA, and mit-atp6-DNA combined sequences

    Science.gov (United States)

    Admir J. Giachini; Kentaro Hosaka; Eduardo Nouhra; Joseph Spatafora; James M. Trappe

    2010-01-01

    Phylogenetic relationships among Geastrales, Gomphales, Hysterangiales, and Phallales were estimated via combined sequences: nuclear large subunit ribosomal DNA (nuc-25S-rDNA), mitochondrial small subunit ribosomal DNA (mit-12S-rDNA), and mitochondrial atp6 DNA (mit-atp6-DNA). Eighty-one taxa comprising 19 genera and 58 species...

  3. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences...... over all chromosomes of H. vulgare and the wild barley species H. bulbosum, H. marinum and H. murinum. Southern blot hybridization revealed different levels of polymorphism among barley species and the RFLP data were used to generate a phylogenetic tree for the genus Hordeum. Our data are in a good...

  4. The implementation of bit-parallelism for DNA sequence alignment

    Science.gov (United States)

    Setyorini; Kuspriyanto; Widyantoro, D. H.; Pancoro, A.

    2017-05-01

    Dynamic Programming (DP) remain the central algorithm of biological sequence alignment. Matching score computation is the most time-consuming process. Bit-parallelism is one of approximate string matching techniques that transform DP matrix cell unit processing into word unit (groups of cell). Bit-parallelism computate the scores column-wise. Adopting from word processing in computer system work, this technique promise reducing time in score computing process in DP matrix. In this paper, we implement bit-parallelism technique for DNA sequence alignment. Our bit-parallelism implementation have less time for score computational process but still need improvement for there construction process.

  5. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    Directory of Open Access Journals (Sweden)

    Kato Mikio

    2003-01-01

    Full Text Available Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA.

  6. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    Directory of Open Access Journals (Sweden)

    Momchilo Vuyisich

    2014-01-01

    Full Text Available Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg. There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the utility of NEBNext Ultra for resequencing and de novo assembly of four bacterial genomes and compared its performance with the TruSeq library preparation kit. The NEBNext Ultra reagents enable high quality resequencing and de novo assembly of a variety of bacterial genomes when using 100 ng of input genomic DNA. For the two most challenging genomes (Burkholderia spp., which have the highest GC content and are the longest, we also show that the quality of both resequencing and de novo assembly is not decreased when only 10 ng of input genomic DNA is used.

  7. Effect of dephasing on DNA sequencing via transverse electronic transport

    Energy Technology Data Exchange (ETDEWEB)

    Zwolak, Michael [Los Alamos National Laboratory; Krems, Matt [NON LANL; Pershin, Yuriy V [NON LANL; Di Ventra, Massimiliano [NON LANL

    2009-01-01

    We study theoretically the effects of dephasing on DNA sequencing in a nanopore via transverse electronic transport. To do this, we couple classical molecular dynamics simulations with transport calculations using scattering theory. Previous studies, which did not include dephasing, have shown that by measuring the transverse current of a particular base multiple times, one can get distributions of currents for each base that are distinguishable. We introduce a dephasing parameter into transport calculations to simulate the effects of the ions and other fluctuations. These effects lower the overall magnitude of the current, but have little effect on the current distributions themselves. The results of this work further implicate that distinguishing DNA bases via transverse electronic transport has potential as a sequencing tool.

  8. In the Staphylococcus aureus Two-Component System sae, the Response Regulator SaeR Binds to a Direct Repeat Sequence and DNA Binding Requires Phosphorylation by the Sensor Kinase SaeS ▿

    OpenAIRE

    Sun, Fei; Li, Chunling; Jeong, Dowon; Sohn, Changmo; He, Chuan; Bae, Taeok

    2010-01-01

    Staphylococcus aureus uses the SaeRS two-component system to control the expression of many virulence factors such as alpha-hemolysin and coagulase; however, the molecular mechanism of this signaling has not yet been elucidated. Here, using the P1 promoter of the sae operon as a model target DNA, we demonstrated that the unphosphorylated response regulator SaeR does not bind to the P1 promoter DNA, while its C-terminal DNA binding domain alone does. The DNA binding activity of full-length Sae...

  9. Measuring cation dependent DNA polymerase fidelity landscapes by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Bradley Michael Zamft

    Full Text Available High-throughput recording of signals embedded within inaccessible micro-environments is a technological challenge. The ideal recording device would be a nanoscale machine capable of quantitatively transducing a wide range of variables into a molecular recording medium suitable for long-term storage and facile readout in the form of digital data. We have recently proposed such a device, in which cation concentrations modulate the misincorporation rate of a DNA polymerase (DNAP on a known template, allowing DNA sequences to encode information about the local cation concentration. In this work we quantify the cation sensitivity of DNAP misincorporation rates, making possible the indirect readout of cation concentration by DNA sequencing. Using multiplexed deep sequencing, we quantify the misincorporation properties of two DNA polymerases--Dpo4 and Klenow exo(---obtaining the probability and base selectivity of misincorporation at all positions within the template. We find that Dpo4 acts as a DNA recording device for Mn(2+ with a misincorporation rate gain of ∼2%/mM. This modulation of misincorporation rate is selective to the template base: the probability of misincorporation on template T by Dpo4 increases >50-fold over the range tested, while the other template bases are affected less strongly. Furthermore, cation concentrations act as scaling factors for misincorporation: on a given template base, Mn(2+ and Mg(2+ change the overall misincorporation rate but do not alter the relative frequencies of incoming misincorporated nucleotides. Characterization of the ion dependence of DNAP misincorporation serves as the first step towards repurposing it as a molecular recording device.

  10. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Indian Academy of Sciences (India)

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-04-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  11. Color image encryption scheme using CML and DNA sequence operations.

    Science.gov (United States)

    Wang, Xing-Yuan; Zhang, Hui-Li; Bao, Xue-Mei

    2016-06-01

    In this paper, an encryption algorithm for color images using chaotic system and DNA (Deoxyribonucleic acid) sequence operations is proposed. Three components for the color plain image is employed to construct a matrix, then perform confusion operation on the pixels matrix generated by the spatiotemporal chaos system, i.e., CML (coupled map lattice). DNA encoding rules, and decoding rules are introduced in the permutation phase. The extended Hamming distance is proposed to generate new initial values for CML iteration combining color plain image. Permute the rows and columns of the DNA matrix and then get the color cipher image from this matrix. Theoretical analysis and experimental results prove the cryptosystem secure and practical, and it is suitable for encrypting color images of any size. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  12. Similarity Estimation Between DNA Sequences Based on Local Pattern Histograms of Binary Images

    Institute of Scientific and Technical Information of China (English)

    Yusei Kobori; Satoshi Mizuta

    2016-01-01

    Graphical representation of DNA sequences is one of the most popular techniques for alignment-free sequence comparison. Here, we propose a new method for the feature extraction of DNA sequences represented by binary images, by estimating the similarity between DNA sequences using the frequency histograms of local bitmap patterns of images. Our method shows linear time complexity for the length of DNA sequences, which is practical even when long sequences, such as whole genome sequences, are compared. We tested five distance measures for the estimation of sequence similarities, and found that the histogram intersection and Manhattan distance are the most appropriate ones for phylogenetic analyses.

  13. Rapid sequencing of DNA based on single-molecule detection

    Science.gov (United States)

    Soper, Steven A.; Davis, Lloyd M.; Fairfield, Frederick R.; Hammond, Mark L.; Harger, Carol A.; Jett, James H.; Keller, Richard A.; Marrone, Babetta L.; Martin, John C.; Nutter, Harvey L.; Shera, E. Brooks; Simpson, Daniel J.

    1991-07-01

    Sequencing the human genome is a major undertaking considering the large number of nucleotides present in the genome and the slow methods currently available to perform the task. The authors have recently reported on a scheme to sequence DNA rapidly using a non-gel based technique. The concept is based upon the incorporation of fluorescently labeled nucleotides into a strand of DNA, isolation and manipulation of a labeled DNA fragment and the detection of single nucleotides using ultra-sensitive laser-induced fluorescence detection following their cleavage from the fragment. Detection of individual fluorophores in the liquid phase was accomplished with time-gated detection following pulsed-laser excitation. The photon bursts from individual rhodamine 6G (R6G) molecules travelling through a laser beam have been observed, as have bursts from single fluorescently modified nucleotides. Using two different biotinylated nucleotides as a model system for fluorescently labeled nucleotides, the authors have observed synthesis of the complementary copy of M13 bacteriophage. Work with fluorescently labeled nucleotides is underway. Individual molecules of DNA attached to a microbead have been observed and manipulated with an epifluorescence microscope.

  14. Human mitochondrial DNA complete amplification and sequencing: a new validated primer set that prevents nuclear DNA sequences of mitochondrial origin co-amplification.

    Science.gov (United States)

    Ramos, Amanda; Santos, Cristina; Alvarez, Luis; Nogués, Ramon; Aluja, Maria Pilar

    2009-05-01

    To date, there are no published primers to amplify the entire mitochondrial DNA (mtDNA) that completely prevent the amplification of nuclear DNA (nDNA) sequences of mitochondrial origin. The main goal of this work was to design, validate and describe a set of primers, to specifically amplify and sequence the complete human mtDNA, allowing the correct interpretation of mtDNA heteroplasmy in healthy and pathological samples. Validation was performed using two different approaches: (i) Basic Local Alignment Search Tool and (ii) amplification using isolated nDNA obtained from sperm cells by differential lyses. During the validation process, two mtDNA regions, with high similarity with nDNA, represent the major problematic areas for primer design. One of these could represent a non-published nuclear DNA sequence of mitochondrial origin. For two of the initially designed fragments, the amplification results reveal PCR artifacts that can be attributed to the poor quality of the DNA. After the validation, nine overlapping primer pairs to perform mtDNA amplification and 22 additional internal primers for mtDNA sequencing were obtained. These primers could be a useful tool in future projects that deal with mtDNA complete sequencing and heteroplasmy detection, since they represent a set of primers that have been tested for the non-amplification of nDNA.

  15. A DNA sequence alignment algorithm using quality information and a fuzzy inference method

    Institute of Scientific and Technical Information of China (English)

    Kwangbaek Kim; Minhwan Kim; Youngwoon Woo

    2008-01-01

    DNA sequence alignment algorithms in computational molecular biology have been improved by diverse methods.In this paper.We propose a DNA sequence alignment that Uses quality information and a fuzzy inference method developed based on the characteristics of DNA fragments and a fuzzy logic system in order to improve conventional DNA sequence alignment methods that uses DNA sequence quality information.In conventional algorithms.DNA sequence alignment scores are calculated by the global sequence alignment algorithm proposed by Needleman-Wunsch,which is established by using quality information of each DNA fragment.However,there may be errors in the process of calculating DNA sequence alignment scores when the quality of DNA fragment tips is low.because only the overall DNA sequence quality information are used.In our proposed method.an exact DNA sequence alignment can be achieved in spite of the low quality of DNA fragment tips by improvement of conventional algorithms using quality information.Mapping score parameters used to calculate DNA sequence alignment scores are dynamically adjusted by the fuzzy logic system utilizing lengths of DNA fragments and frequencies of low quality DNA bases in the fragments.From the experiments by applying real genome data of National Center for Bioteclmology Information,we could see that the proposed method is more efficient than conventional algorithms.

  16. DNA Amplification and Nucleotide Sequence Determination of a Region of Mitochondrial DNA in the Sea Snake, Laticauda Semifasciata

    OpenAIRE

    Eguchi, Tomoko; Eguchi, Yukinori; Oshiro, Minoru; Asato, Tsuyoshi; Takei, Hiroshi; Nakashima, Yasutsugu

    1993-01-01

    We determined the nucleotide sequence of a region of the 12S ribosomal RNA (rRNA) gene in the mitochondrial DNA (mtDNA) of the sea snake, Laticauda semifasciata, using the polymerase chain reaction (PCR). We synthesized oligonucleotide primers according to the nucleotide sequence of human mt DNA 12S rRNA gene and found that the target sequence (386bp) of the sea snake mtDNA could be amplified with these primers. The nucleotide sequence of the amplified region of the sea snake mt DNA was deter...

  17. High-throughput DNA Stretching in Continuous Elongational Flow for Genome Sequence Scanning

    Science.gov (United States)

    Meltzer, Robert; Griffis, Joshua; Safranovitch, Mikhail; Malkin, Gene; Cameron, Douglas

    2014-03-01

    Genome Sequence Scanning (GSS) identifies and compares bacterial genomes by stretching long (60 - 300 kb) genomic DNA restriction fragments and scanning for site-selective fluorescent probes. Practical application of GSS requires: 1) high throughput data acquisition, 2) efficient DNA stretching, 3) reproducible DNA elasticity in the presence of intercalating fluorescent dyes. GSS utilizes a pseudo-two-dimensional micron-scale funnel with convergent sheathing flows to stretch one molecule at a time in continuous elongational flow and center the DNA stream over diffraction-limited confocal laser excitation spots. Funnel geometry has been optimized to maximize throughput of DNA within the desired length range (>10 million nucleobases per second). A constant-strain detection channel maximizes stretching efficiency by applying a constant parabolic tension profile to each molecule, minimizing relaxation and flow-induced tumbling. The effect of intercalator on DNA elasticity is experimentally controlled by reacting one molecule of DNA at a time in convergent sheathing flows of the dye. Derivations of accelerating flow and non-linear tension distribution permit alignment of detected fluorescence traces to theoretical templates derived from whole-genome sequence data.

  18. Model identification for DNA sequence-structure relationships.

    Science.gov (United States)

    Hawley, Stephen Dwyer; Chiu, Anita; Chizeck, Howard Jay

    2006-11-01

    We investigate the use of algebraic state-space models for the sequence dependent properties of DNA. By considering the DNA sequence as an input signal, rather than using an all atom physical model, computational efficiency is achieved. A challenge in deriving this type of model is obtaining its structure and estimating its parameters. Here we present two candidate model structures for the sequence dependent structural property Slide and a method of encoding the models so that a recursive least squares algorithm can be applied for parameter estimation. These models are based on the assumption that the value of Slide at a base-step is determined by the surrounding tetranucleotide sequence. The first model takes the four bases individually as inputs and has a median root mean square deviation of 0.90 A. The second model takes the four bases pairwise and has a median root mean square deviation of 0.88 A. These values indicate that the accuracy of these models is within the useful range for structure prediction. Performance is comparable to published predictions of a more physically derived model, at significantly less computational cost.

  19. DNA Sequencing via Quantum Mechanics and Machine Learning

    CERN Document Server

    Yuen, Henry; Zhang, Kevin J; Nomura, Ken-ichi; Kalia, Rajiv K; Nakano, Aiichiro; Vashishta, Priya

    2010-01-01

    Rapid sequencing of individual human genome is prerequisite to genomic medicine, where diseases will be prevented by preemptive cures. Quantum-mechanical tunneling through single-stranded DNA in a solid-state nanopore has been proposed for rapid DNA sequencing, but unfortunately the tunneling current alone cannot distinguish the four nucleotides due to large fluctuations in molecular conformation and solvent. Here, we propose a machine-learning approach applied to the tunneling current-voltage (I-V) characteristic for efficient discrimination between the four nucleotides. We first combine principal component analysis (PCA) and fuzzy c-means (FCM) clustering to learn the "fingerprints" of the electronic density-of-states (DOS) of the four nucleotides, which can be derived from the I-V data. We then apply the hidden Markov model and the Viterbi algorithm to sequence a time series of DOS data (i.e., to solve the sequencing problem). Numerical experiments show that the PCA-FCM approach can classify unlabeled DOS ...

  20. Artificial intelligence approach in analysis of DNA sequences.

    Science.gov (United States)

    Brézillon, P J; Zaraté, P; Saci, F

    1993-01-01

    We present an approach for designing a knowledge-based system, called Sequence Acquisition In Context (SAIC), that will be able to cooperate with a biologist in the analysis of DNA sequences. The main task of the system is the acquisition of the expert knowledge that the biologist uses for solving ambiguities from gel autoradiograms, with the aim of re-using it later for solving similar ambiguities. The various types of expert knowledge constitute what we call the contextual knowledge of the sequence analysis. Contextual knowledge deals with the unavoidable problems that are common in the study of the living material (eg noise on data, difficulties of observations). Indeed, the analysis of DNA sequences from autoradiograms belongs to an emerging and promising area of investigation, namely reasoning with images. The SAIC project is developed in a theoretical framework that is shared with other applications. Not all tasks have the same importance in each application. We use this observation for designing an intelligent assistant system with three applications. In the SAIC project, we focus on knowledge acquisition, human-computer interaction and explanation. The project will benefit research in the two other applications. We also discuss our SAIC project in the context of large international projects that aim to re-use and share knowledge in a repository.

  1. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  2. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Directory of Open Access Journals (Sweden)

    Chun-Tien Chang

    2012-01-01

    Full Text Available The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs, insertion-deletions (indels, short tandem repeats (STRs, and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR, which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS; (iii determine human papilloma virus (HPV genotypes by searching current viral databases in cases of double infections; (iv estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4 and its paralog HSPDP3.

  3. Mixed sequence reader: a program for analyzing DNA sequences with heterozygous base calling.

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  4. Preparation of high-quality next-generation sequencing libraries from picogram quantities of target DNA.

    Science.gov (United States)

    Parkinson, Nicholas J; Maslau, Siarhei; Ferneyhough, Ben; Zhang, Gang; Gregory, Lorna; Buck, David; Ragoussis, Jiannis; Ponting, Chris P; Fischer, Michael D

    2012-01-01

    New sequencing technologies can address diverse biomedical questions but are limited by a minimum required DNA input of typically 1 μg. We describe how sequencing libraries can be reproducibly created from 20 pg of input DNA using a modified transpososome-mediated fragmentation technique. Resulting libraries incorporate in-line bar-coding, which facilitates sample multiplexes that can be sequenced using Illumina platforms with the manufacturer's sequencing primer. We demonstrate this technique by providing deep coverage sequence of the Escherichia coli K-12 genome that shows equivalent target coverage to a 1-μg input library prepared using standard Illumina methods. Reducing template quantity does, however, increase the proportion of duplicate reads and enriches coverage in low-GC regions. This finding was confirmed with exhaustive resequencing of a mouse library constructed from 20 pg of gDNA input (about seven haploid genomes) resulting in ∼0.4-fold statistical coverage of uniquely mapped fragments. This implies that a near-complete coverage of the mouse genome is obtainable with this approach using 20 genomes as input. Application of this new method now allows genomic studies from low mass samples and routine preparation of sequencing libraries from enrichment procedures.

  5. Bisulfite sequencing of chromatin immunoprecipitated DNA (BisChIP-seq) directly informs methylation status of histone-modified DNA

    NARCIS (Netherlands)

    Statham, A.L.; Robinson, M.D.; Song, J.Z.; Coolen, M.W.; Stirzaker, C.; Clark, S. J.

    2012-01-01

    The complex relationship between DNA methylation, chromatin modification, and underlying DNA sequence is often difficult to unravel with existing technologies. Here, we describe a novel technique based on high-throughput sequencing of bisulfite-treated chromatin immunoprecipitated DNA (BisChIP-seq),

  6. From DNA sequence to transcriptional behaviour: a quantitative approach.

    Science.gov (United States)

    Segal, Eran; Widom, Jonathan

    2009-07-01

    Complex transcriptional behaviours are encoded in the DNA sequences of gene regulatory regions. Advances in our understanding of these behaviours have been recently gained through quantitative models that describe how molecules such as transcription factors and nucleosomes interact with genomic sequences. An emerging view is that every regulatory sequence is associated with a unique binding affinity landscape for each molecule and, consequently, with a unique set of molecule-binding configurations and transcriptional outputs. We present a quantitative framework based on existing methods that unifies these ideas. This framework explains many experimental observations regarding the binding patterns of factors and nucleosomes and the dynamics of transcriptional activation. It can also be used to model more complex phenomena such as transcriptional noise and the evolution of transcriptional regulation.

  7. On-Demand Indexing for Referential Compression of DNA Sequences.

    Directory of Open Access Journals (Sweden)

    Fernando Alves

    Full Text Available The decreasing costs of genome sequencing is creating a demand for scalable storage and processing tools and techniques to deal with the large amounts of generated data. Referential compression is one of these techniques, in which the similarity between the DNA of organisms of the same or an evolutionary close species is exploited to reduce the storage demands of genome sequences up to 700 times. The general idea is to store in the compressed file only the differences between the to-be-compressed and a well-known reference sequence. In this paper, we propose a method for improving the performance of referential compression by removing the most costly phase of the process, the complete reference indexing. Our approach, called On-Demand Indexing (ODI compresses human chromosomes five to ten times faster than other state-of-the-art tools (on average, while achieving similar compression ratios.

  8. Maternal Plasma DNA and RNA Sequencing for Prenatal Testing.

    Science.gov (United States)

    Tamminga, Saskia; van Maarle, Merel; Henneman, Lidewij; Oudejans, Cees B M; Cornel, Martina C; Sistermans, Erik A

    2016-01-01

    Cell-free DNA (cfDNA) testing has recently become indispensable in diagnostic testing and screening. In the prenatal setting, this type of testing is often called noninvasive prenatal testing (NIPT). With a number of techniques, using either next-generation sequencing or single nucleotide polymorphism-based approaches, fetal cfDNA in maternal plasma can be analyzed to screen for rhesus D genotype, common chromosomal aneuploidies, and increasingly for testing other conditions, including monogenic disorders. With regard to screening for common aneuploidies, challenges arise when implementing NIPT in current prenatal settings. Depending on the method used (targeted or nontargeted), chromosomal anomalies other than trisomy 21, 18, or 13 can be detected, either of fetal or maternal origin, also referred to as unsolicited or incidental findings. For various biological reasons, there is a small chance of having either a false-positive or false-negative NIPT result, or no result, also referred to as a "no-call." Both pre- and posttest counseling for NIPT should include discussing potential discrepancies. Since NIPT remains a screening test, a positive NIPT result should be confirmed by invasive diagnostic testing (either by chorionic villus biopsy or by amniocentesis). As the scope of NIPT is widening, professional guidelines need to discuss the ethics of what to offer and how to offer. In this review, we discuss the current biochemical, clinical, and ethical challenges of cfDNA testing in the prenatal setting and its future perspectives including novel applications that target RNA instead of DNA.

  9. Chimeric TALE recombinases with programmable DNA sequence specificity.

    Science.gov (United States)

    Mercer, Andrew C; Gaj, Thomas; Fuller, Roberta P; Barbas, Carlos F

    2012-11-01

    Site-specific recombinases are powerful tools for genome engineering. Hyperactivated variants of the resolvase/invertase family of serine recombinases function without accessory factors, and thus can be re-targeted to sequences of interest by replacing native DNA-binding domains (DBDs) with engineered zinc-finger proteins (ZFPs). However, imperfect modularity with particular domains, lack of high-affinity binding to all DNA triplets, and difficulty in construction has hindered the widespread adoption of ZFPs in unspecialized laboratories. The discovery of a novel type of DBD in transcription activator-like effector (TALE) proteins from Xanthomonas provides an alternative to ZFPs. Here we describe chimeric TALE recombinases (TALERs): engineered fusions between a hyperactivated catalytic domain from the DNA invertase Gin and an optimized TALE architecture. We use a library of incrementally truncated TALE variants to identify TALER fusions that modify DNA with efficiency and specificity comparable to zinc-finger recombinases in bacterial cells. We also show that TALERs recombine DNA in mammalian cells. The TALER architecture described herein provides a platform for insertion of customized TALE domains, thus significantly expanding the targeting capacity of engineered recombinases and their potential applications in biotechnology and medicine.

  10. Comparative analysis of Lactobacillus plantarum WCFS1 transcriptomes by using DNA microarray and next-generation sequencing technologies.

    NARCIS (Netherlands)

    Leimena, M.M.; Wels, M.W.; Bongers, R.S.; Smid, E.J.; Zoetendal, E.G.; Kleerebezem, M.

    2012-01-01

    RNA sequencing is starting to compete with the use of DNA microarrays for transcription analysis in eukaryotes as well as in prokaryotes. The application of RNA sequencing in prokaryotes requires additional steps in the RNA preparation procedure to increase the relative abundance of mRNA and cannot

  11. DNA methyltransferases are required to induce heterochromatic re-replication in Arabidopsis.

    Directory of Open Access Journals (Sweden)

    Hume Stroud

    2012-07-01

    Full Text Available The relationship between epigenetic marks on chromatin and the regulation of DNA replication is poorly understood. Mutations of the H3K27 methyltransferase genes, Arabidopsis trithorax-related protein5 (ATXR5 and ATXR6, result in re-replication (repeated origin firing within the same cell cycle. Here we show that mutations that reduce DNA methylation act to suppress the re-replication phenotype of atxr5 atxr6 mutants. This suggests that DNA methylation, a mark enriched at the same heterochromatic regions that re-replicate in atxr5/6 mutants, is required for aberrant re-replication. In contrast, RNA sequencing analyses suggest that ATXR5/6 and DNA methylation cooperatively transcriptionally silence transposable elements (TEs. Hence our results suggest a complex relationship between ATXR5/6 and DNA methylation in the regulation of DNA replication and transcription of TEs.

  12. SeqControl: process control for DNA sequencing.

    Science.gov (United States)

    Chong, Lauren C; Albuquerque, Marco A; Harding, Nicholas J; Caloian, Cristian; Chan-Seng-Yue, Michelle; de Borja, Richard; Fraser, Michael; Denroche, Robert E; Beck, Timothy A; van der Kwast, Theodorus; Bristow, Robert G; McPherson, John D; Boutros, Paul C

    2014-10-01

    As high-throughput sequencing continues to increase in speed and throughput, routine clinical and industrial application draws closer. These 'production' settings will require enhanced quality monitoring and quality control to optimize output and reduce costs. We developed SeqControl, a framework for predicting sequencing quality and coverage using a set of 15 metrics describing overall coverage, coverage distribution, basewise coverage and basewise quality. Using whole-genome sequences of 27 prostate cancers and 26 normal references, we derived multivariate models that predict sequencing quality and depth. SeqControl robustly predicted how much sequencing was required to reach a given coverage depth (area under the curve (AUC) = 0.993), accurately classified clinically relevant formalin-fixed, paraffin-embedded samples, and made predictions from as little as one-eighth of a sequencing lane (AUC = 0.967). These techniques can be immediately incorporated into existing sequencing pipelines to monitor data quality in real time. SeqControl is available at http://labs.oicr.on.ca/Boutros-lab/software/SeqControl/.

  13. Stability of capillary gels for automated sequencing of DNA.

    Science.gov (United States)

    Swerdlow, H; Dew-Jager, K E; Brady, K; Grey, R; Dovichi, N J; Gesteland, R

    1992-08-01

    Recent interest in capillary gel electrophoresis has been fueled by the Human Genome Project and other large-scale sequencing projects. Advances in gel polymerization techniques and detector design have enabled sequencing of DNA directly in capillaries. Efforts to exploit this technology have been hampered by problems with the reproducibility and stability of gels. Gel instability manifests itself during electrophoresis as a decrease in the current passing through the capillary under a constant voltage. Upon subsequent microscopic examination, bubbles are often visible at or near the injection (cathodic) end of the capillary gel. Gels have been prepared with the polyacrylamide matrix covalently attached to the silica walls of the capillary. These gels, although more stable, still suffer from problems with bubbles. The use of actual DNA sequencing samples also adversely affects gel stability. We examined the mechanisms underlying these disruptive processes by employing polyacrylamide gel-filled capillaries in which the gel was not attached to the capillary wall. Three sources of gel instability were identified. Bubbles occurring in the absence of sample introduction were attributed to electroosmotic force; replacing the denaturant urea with formamide was shown to reduce the frequency of these bubbles. The slow, steady decline in current through capillary sequencing gels interferes with the ability to detect other gel problems. This phenomenon was shown to be a result of ionic depletion at the gel-liquid interface. The decline was ameliorated by adding denaturant and acrylamide monomers to the buffer reservoirs. Sample-induced problems were shown to be due to the presence of template DNA; elimination of the template allowed sample loading to occur without complications.(ABSTRACT TRUNCATED AT 250 WORDS)

  14. Complete genome sequence of mitochondrial DNA (mtDNA) of Chlorella sorokiniana.

    Science.gov (United States)

    Orsini, Massimiliano; Costelli, Cristina; Malavasi, Veronica; Cusano, Roberto; Concas, Alessandro; Angius, Andrea; Cao, Giacomo

    2016-01-01

    The complete sequence of mitochondrial genome of the Chlorella sorokiniana strain (SAG 111-8 k) is presented in this work. Within the Chlorella genus, it represents the second species with a complete sequenced and annotated mitochondrial genome (GenBank accession no. KM241869). The genome consists of circular chromosomes of 52,528 bp and encodes a total of 31 protein coding genes, 3 rRNAs and 26 tRNAs. The overall AT contents of the C. sorokiniana mtDNA is 70.89%, while the coding sequence is of 97.4%.

  15. Sequencing of mitochondrial HV1 and HV2 DNA with length heteroplasmy

    DEFF Research Database (Denmark)

    Rasmussen, E. Michael; Eriksen, Birthe; Larsen, Hans Jakob

    2003-01-01

    This study presents a fast method for sequencing the poly C/G regions in HV1 and HV2 in the mitochondrial DNA (mtDNA)......This study presents a fast method for sequencing the poly C/G regions in HV1 and HV2 in the mitochondrial DNA (mtDNA)...

  16. Developmentally programmed excision of internal DNA sequences in Paramecium aurelia.

    Science.gov (United States)

    Gratias, A; Bétermier, M

    2001-01-01

    The development of a new somatic nucleus (macronucleus) during sexual reproduction of the ciliate Paramecium aurelia involves reproducible chromosomal rearrangements that affect the entire germline genome. Macronuclear development can be induced experimentally, which makes P. aurelia an attractive model for the study of the mechanism and the regulation of DNA rearrangements. Two major types of rearrangements have been identified: the fragmentation of the germline chromosomes, followed by the formation of the new macronuclear chromosome ends in association with imprecise DNA elimination, and the precise excision of internal eliminated sequences (IESs). All IESs identified so far are short, A/T rich and non-coding elements. They are flanked by a direct repeat of a 5'-TA-3' dinucleotide, a single copy of which remains at the macronuclear junction after excision. The number of these single-copy sequences has been estimated to be around 60,000 per haploid genome. This review focuses on the current knowledge about the genetic and epigenetic determinants of IES elimination in P. aurelia, the analysis of excision products, and the tightly regulated timing of excision throughout macronuclear development. Several models for the molecular mechanism of IES excision will be discussed in relation to those proposed for DNA elimination in other ciliates.

  17. Genetic variability of Taenia saginata inferred from mitochondrial DNA sequences.

    Science.gov (United States)

    Rostami, Sima; Salavati, Reza; Beech, Robin N; Babaei, Zahra; Sharbatkhori, Mitra; Harandi, Majid Fasihi

    2015-04-01

    Taenia saginata is an important tapeworm, infecting humans in many parts of the world. The present study was undertaken to identify inter- and intraspecific variation of T. saginata isolated from cattle in different parts of Iran using two mitochondrial CO1 and 12S rRNA genes. Up to 105 bovine specimens of T. saginata were collected from 20 slaughterhouses in three provinces of Iran. DNA were extracted from the metacestode Cysticercus bovis. After PCR amplification, sequencing of CO1 and 12S rRNA genes were carried out and two phylogenetic analyses of the sequence data were generated by Bayesian inference on CO1 and 12S rRNA sequences. Sequence analyses of CO1 and 12S rRNA genes showed 11 and 29 representative profiles respectively. The level of pairwise nucleotide variation between individual haplotypes of CO1 gene was 0.3-2.4% while the overall nucleotide variation among all 11 haplotypes was 4.6%. For 12S rRNA sequence data, level of pairwise nucleotide variation was 0.2-2.5% and the overall nucleotide variation was determined as 5.8% among 29 haplotypes of 12S rRNA gene. Considerable genetic diversity was found in both mitochondrial genes particularly in 12S rRNA gene.

  18. DNA sequencing technology, walking with modular primers. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Ulanovsky, L.

    1996-12-31

    The success of the Human Genome Project depends on the development of adequate technology for rapid and inexpensive DNA sequencing, which will also benefit biomedical research in general. The authors are working on DNA technologies that eliminate primer synthesis, the main bottleneck in sequencing by primer walking. They have developed modular primers that are assembled from three 5-mer, 6-mer or 7-mer modules selected from a presynthesized library of as few as 1,000 oligonucleotides ({double_bond}4, {double_bond}5, {double_bond}7). The three modules anneal contiguously at the selected template site and prime there uniquely, even though each is not unique for the most part when used alone. This technique is expected to speed up primer walking 30 to 50 fold, and reduce the sequencing cost by a factor of 5 to 15. Time and expensive will be saved on primer synthesis itself and even more so due to closed-loop automation of primer walking, made possible by the instant availability of primers. Apart from saving time and cost, closed-loop automation would also minimize the errors and complications associated with human intervention between the walks. The author has also developed two additional approaches to primer-library based sequencing. One involves a branched structure of modular primers which has a distinctly different mechanism of achieving priming specificity. The other introduces the concept of ``Differential Extension with Nucleotide Subsets`` as an approach increasing priming specificity, priming strength and allowing cycle sequencing. These approaches are expected to be more robust than the original version of the modular primer technique.

  19. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing

    Directory of Open Access Journals (Sweden)

    Zdepski Anna

    2011-05-01

    Full Text Available Abstract Background High throughput sequencing (HTS technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants that contain nuclear DNA, as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. Results We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR. We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Conclusions Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants.

  20. DNA Sequencing as a Tool to Monitor Marine Ecological Status

    Directory of Open Access Journals (Sweden)

    Kelly D. Goodwin

    2017-05-01

    Full Text Available Many ocean policies mandate integrated, ecosystem-based approaches to marine monitoring, driving a global need for efficient, low-cost bioindicators of marine ecological quality. Most traditional methods to assess biological quality rely on specialized expertise to provide visual identification of a limited set of specific taxonomic groups, a time-consuming process that can provide a narrow view of ecological status. In addition, microbial assemblages drive food webs but are not amenable to visual inspection and thus are largely excluded from detailed inventory. Molecular-based assessments of biodiversity and ecosystem function offer advantages over traditional methods and are increasingly being generated for a suite of taxa using a “microbes to mammals” or “barcodes to biomes” approach. Progress in these efforts coupled with continued improvements in high-throughput sequencing and bioinformatics pave the way for sequence data to be employed in formal integrated ecosystem evaluation, including food web assessments, as called for in the European Union Marine Strategy Framework Directive. DNA sequencing of bioindicators, both traditional (e.g., benthic macroinvertebrates, ichthyoplankton and emerging (e.g., microbial assemblages, fish via eDNA, promises to improve assessment of marine biological quality by increasing the breadth, depth, and throughput of information and by reducing costs and reliance on specialized taxonomic expertise.

  1. Mitochondrial DNA sequence analysis of patients with 'atypical psychosis'.

    Science.gov (United States)

    Kazuno, An-A; Munakata, Kae; Mori, Kanako; Tanaka, Masashi; Nanko, Shinichiro; Kunugi, Hiroshi; Umekage, Tadashi; Tochigi, Mamoru; Kohda, Kazuhisa; Sasaki, Tsukasa; Akiyama, Tsuyoshi; Washizuka, Shinsuke; Kato, Nobumasa; Kato, Tadafumi

    2005-08-01

    Although classical psychopathological studies have shown the presence of an independent diagnostic category, 'atypical psychosis', most psychotic patients are currently classified into two major diagnostic categories, schizophrenia and bipolar disorder, by the Diagnostic and Statistical Manual of Mental Disorders (4th edn; DSM-IV) criteria. 'Atypical psychosis' is characterized by acute confusion without systematic delusion, emotional instability, and psychomotor excitement or stupor. Such clinical features resemble those seen in organic mental syndrome, and differential diagnosis is often difficult. Because patients with mitochondrial myopathy, encephalopathy, lactic acidosis, and stroke-like episodes (MELAS) sometimes show organic mental disorder, 'atypical psychosis' may be caused by mutations of mitochondrial DNA (mtDNA) in some patients. In the present study whole mtDNA was sequenced for seven patients with various psychotic disorders, who could be categorized as 'atypical psychosis'. None of them had known mtDNA mutations pathogenic for mitochondrial encephalopathy. Two of seven patients belonged to a subhaplogroup F1b1a with low frequency. These results did not support the hypothesis that clinical presentation of some patients with 'atypical psychosis' is a reflection of subclinical mitochondrial encephalopathy. However, the subhaplogroup F1b1a may be a good target for association study of 'atypical psychosis'.

  2. Retroviral DNA Sequences as a Means for Determining Ancient Diets.

    Directory of Open Access Journals (Sweden)

    Jessica I Rivera-Perez

    Full Text Available For ages, specialists from varying fields have studied the diets of the primeval inhabitants of our planet, detecting diet remains in archaeological specimens using a range of morphological and biochemical methods. As of recent, metagenomic ancient DNA studies have allowed for the comparison of the fecal and gut microbiomes associated to archaeological specimens from various regions of the world; however the complex dynamics represented in those microbial communities still remain unclear. Theoretically, similar to eukaryote DNA the presence of genes from key microbes or enzymes, as well as the presence of DNA from viruses specific to key organisms, may suggest the ingestion of specific diet components. In this study we demonstrate that ancient virus DNA obtained from coprolites also provides information reconstructing the host's diet, as inferred from sequences obtained from pre-Columbian coprolites. This depicts a novel and reliable approach to determine new components as well as validate the previously suggested diets of extinct cultures and animals. Furthermore, to our knowledge this represents the first description of the eukaryotic viral diversity found in paleofaeces belonging to pre-Columbian cultures.

  3. Peptide Synthesis on a Next-Generation DNA Sequencing Platform.

    Science.gov (United States)

    Svensen, Nina; Peersen, Olve B; Jaffrey, Samie R

    2016-09-01

    Methods for displaying large numbers of peptides on solid surfaces are essential for high-throughput characterization of peptide function and binding properties. Here we describe a method for converting the >10(7) flow cell-bound clusters of identical DNA strands generated by the Illumina DNA sequencing technology into clusters of complementary RNA, and subsequently peptide clusters. We modified the flow-cell-bound primers with ribonucleotides thus enabling them to be used by poliovirus polymerase 3D(pol) . The primers hybridize to the clustered DNA thus leading to RNA clusters. The RNAs fold into functional protein- or small molecule-binding aptamers. We used the mRNA-display approach to synthesize flow-cell-tethered peptides from these RNA clusters. The peptides showed selective binding to cognate antibodies. The methods described here provide an approach for using DNA clusters to template peptide synthesis on an Illumina flow cell, thus providing new opportunities for massively parallel peptide-based assays.

  4. Entropy and long-range correlations in DNA sequences.

    Science.gov (United States)

    Melnik, S S; Usatenko, O V

    2014-12-01

    We analyze the structure of DNA molecules of different organisms by using the additive Markov chain approach. Transforming nucleotide sequences into binary strings, we perform statistical analysis of the corresponding "texts". We develop the theory of N-step additive binary stationary ergodic Markov chains and analyze their differential entropy. Supposing that the correlations are weak we express the conditional probability function of the chain by means of the pair correlation function and represent the entropy as a functional of the pair correlator. Since the model uses two point correlators instead of probability of block occurring, it makes possible to calculate the entropy of subsequences at much longer distances than with the use of the standard methods. We utilize the obtained analytical result for numerical evaluation of the entropy of coarse-grained DNA texts. We believe that the entropy study can be used for biological classification of living species.

  5. Field guide to next-generation DNA sequencers.

    Science.gov (United States)

    Glenn, Travis C

    2011-09-01

    The diversity of available 2(nd) and 3(rd) generation DNA sequencing platforms is increasing rapidly. Costs for these systems range from $10/Mb for 454 and some Ion Torrent chips). In terms of cost per nonmultiplexed sample and instrument run time, the Pacific Biosciences and Ion Torrent platforms excel, with the 454 GS Junior and Illumina MiSeq also notable in this regard. All platforms allow multiplexing of samples, but details of library preparation, experimental design and data analysis can constrain the options. The wide range of characteristics among available platforms provides opportunities both to conduct groundbreaking studies and to waste money on scales that were previously infeasible. Thus, careful thought about the desired characteristics of these systems is warranted before purchasing or using any of them. Updated information from this guide will be maintained at: http://dna.uga.edu/ and http://tomato.biol.trinity.edu/blog/. © 2011 Blackwell Publishing Ltd.

  6. Exponential megapriming PCR (EMP) cloning--seamless DNA insertion into any target plasmid without sequence constraints.

    Science.gov (United States)

    Ulrich, Alexander; Andersen, Kasper R; Schwartz, Thomas U

    2012-01-01

    We present a fast, reliable and inexpensive restriction-free cloning method for seamless DNA insertion into any plasmid without sequence limitation. Exponential megapriming PCR (EMP) cloning requires two consecutive PCR steps and can be carried out in one day. We show that EMP cloning has a higher efficiency than restriction-free (RF) cloning, especially for long inserts above 2.5 kb. EMP further enables simultaneous cloning of multiple inserts.

  7. Exponential megapriming PCR (EMP cloning--seamless DNA insertion into any target plasmid without sequence constraints.

    Directory of Open Access Journals (Sweden)

    Alexander Ulrich

    Full Text Available We present a fast, reliable and inexpensive restriction-free cloning method for seamless DNA insertion into any plasmid without sequence limitation. Exponential megapriming PCR (EMP cloning requires two consecutive PCR steps and can be carried out in one day. We show that EMP cloning has a higher efficiency than restriction-free (RF cloning, especially for long inserts above 2.5 kb. EMP further enables simultaneous cloning of multiple inserts.

  8. New scoring schema for finding motifs in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Nowzari-Dalini Abbas

    2009-03-01

    Full Text Available Abstract Background Pattern discovery in DNA sequences is one of the most fundamental problems in molecular biology with important applications in finding regulatory signals and transcription factor binding sites. An important task in this problem is to search (or predict known binding sites in a new DNA sequence. For this reason, all subsequences of the given DNA sequence are scored based on an scoring function and the prediction is done by selecting the best score. By assuming no dependency between binding site base positions, most of the available tools for known binding site prediction are designed. Recently Tomovic and Oakeley investigated the statistical basis for either a claim of dependence or independence, to determine whether such a claim is generally true, and they presented a scoring function for binding site prediction based on the dependency between binding site base positions. Our primary objective is to investigate the scoring functions which can be used in known binding site prediction based on the assumption of dependency or independency in binding site base positions. Results We propose a new scoring function based on the dependency between all positions in biding site base positions. This scoring function uses joint information content and mutual information as a measure of dependency between positions in transcription factor binding site. Our method for modeling dependencies is simply an extension of position independency methods. We evaluate our new scoring function on the real data sets extracted from JASPAR and TRANSFAC data bases, and compare the obtained results with two other well known scoring functions. Conclusion The results demonstrate that the new approach improves known binding site discovery and show that the joint information content and mutual information provide a better and more general criterion to investigate the relationships between positions in the TFBS. Our scoring function is formulated by simple

  9. CtIP-dependent DNA resection is required for DNA damage checkpoint maintenance but not initiation

    DEFF Research Database (Denmark)

    Kousholt, Arne Nedergaard; Fugger, Kasper; Hoffmann, Saskia

    2012-01-01

    by camptothecin and ionizing radiation. In contrast, we find that DNA end resection was critically required for sustained ATR-CHK1 checkpoint signaling and for maintaining both the intra-S- and G2-phase checkpoints. Consequently, resection-deficient cells entered mitosis with persistent DNA damage. In conclusion......To prevent accumulation of mutations, cells respond to DNA lesions by blocking cell cycle progression and initiating DNA repair. Homology-directed repair of DNA breaks requires CtIP-dependent resection of the DNA ends, which is thought to play a key role in activation of ATR (ataxia telangiectasia...... mutated and Rad3 related) and CHK1 kinases to induce the cell cycle checkpoint. In this paper, we show that CHK1 was rapidly and robustly activated before detectable end resection. Moreover, we show that the key resection factor CtIP was dispensable for initial ATR-CHK1 activation after DNA damage...

  10. Photocatalytic probing of DNA sequence by using TiO{sub 2}/dopamine-DNA triads

    Energy Technology Data Exchange (ETDEWEB)

    Liu Jianqin [Center for Nanoscale Materials, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States); Garza, Linda de la [Chemistry Division, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States); Zhang Ligang; Dimitrijevic, Nada M. [Center for Nanoscale Materials, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States); Zuo Xiaobing; Tiede, David M. [Chemistry Division, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States); Rajh, Tijana [Center for Nanoscale Materials, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States); Chemistry Division, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States)], E-mail: rajh@anl.gov

    2007-10-15

    A method to control charge transfer reaction in DNA using hybrid nanometer-sized TiO{sub 2} nanoparticles was developed. In this system extended charge separation reflects the sequence of DNA and was measured using metallic silver deposition or by photocurrent response. Light-induced extended charge separation in these systems was found to be dependent on the DNA-bridge length and sequence. The yield of photocatalytic deposition of silver was studied in systems having GG accepting sites imbedded in AT runs at varying distances from the TiO{sub 2} nanoparticle surface. Weak distance dependence of charge separation indicative of a hole hopping through mediating adenine (A) sites was found. The quantum yield of silver deposition in the system having a GG accepting site placed 8.5 A from the nanoparticle surface was found to be {phi} = 0.70 (70%) and {phi} = 0.56 (56%) for (A){sub n} and (AT){sub n/2} bridge, respectively. Hole injection to GG trapping sites as far as 70 A from a nanoparticle surface in the absence of G hopping sites was measured. Introduction of G hopping sites increased the efficiency of hole injection. The efficiency of photocatalytic deposition of metallic silver was found to be sensitive to the presence of a single nucleobase mismatch in the DNA sequence.

  11. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens.

    Science.gov (United States)

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-09-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high-target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16 lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughout and added information content.

  12. Quality standards for DNA sequence variation databases to improve clinical management under development in Australia

    Directory of Open Access Journals (Sweden)

    B. Bennetts

    2014-09-01

    Full Text Available Despite the routine nature of comparing sequence variations identified during clinical testing to database records, few databases meet quality requirements for clinical diagnostics. To address this issue, The Royal College of Pathologists of Australasia (RCPA in collaboration with the Human Genetics Society of Australasia (HGSA, and the Human Variome Project (HVP is developing standards for DNA sequence variation databases intended for use in the Australian clinical environment. The outputs of this project will be promoted to other health systems and accreditation bodies by the Human Variome Project to support the development of similar frameworks in other jurisdictions.

  13. Indirect readout of DNA sequence by p22 repressor: roles of DNA and protein functional groups in modulating DNA conformation.

    Science.gov (United States)

    Harris, Lydia-Ann; Watkins, Derrick; Williams, Loren Dean; Koudelka, Gerald B

    2013-01-09

    The repressor of bacteriophage P22 (P22R) discriminates between its various DNA binding sites by sensing the identity of non-contacted base pairs at the center of its binding site. The "indirect readout" of these non-contacted bases is apparently based on DNA's sequence-dependent conformational preferences. The structures of P22R-DNA complexes indicate that the non-contacted base pairs at the center of the binding site are in the B' state. This finding suggests that indirect readout and therefore binding site discrimination depend on P22R's ability to either sense and/or impose the B' state on the non-contacted bases of its binding sites. We show here that the affinity of binding sites for P22R depends on the tendency of the central bases to assume the B'-DNA state. Furthermore, we identify functional groups in the minor groove of the non-contacted bases as the essential modulators of indirect readout by P22R. In P22R-DNA complexes, the negatively charged E44 and E48 residues are provocatively positioned near the negatively charged DNA phosphates of the non-contacted nucleotides. The close proximity of the negatively charged groups on protein and DNA suggests that electrostatics may play a key role in the indirect readout process. Changing either of two negatively charged residues to uncharged residues eliminates the ability of P22R to impose structural changes on DNA and to recognize non-contacted base sequence. These findings suggest that these negatively charged amino acids function to force the P22R-bound DNA into the B' state and therefore play a key role in indirect readout by P22R.

  14. A DNA 'barcode blitz': rapid digitization and sequencing of a natural history collection.

    Directory of Open Access Journals (Sweden)

    Paul D N Hebert

    Full Text Available DNA barcoding protocols require the linkage of each sequence record to a voucher specimen that has, whenever possible, been authoritatively identified. Natural history collections would seem an ideal resource for barcode library construction, but they have never seen large-scale analysis because of concerns linked to DNA degradation. The present study examines the strength of this barrier, carrying out a comprehensive analysis of moth and butterfly (Lepidoptera species in the Australian National Insect Collection. Protocols were developed that enabled tissue samples, specimen data, and images to be assembled rapidly. Using these methods, a five-person team processed 41,650 specimens representing 12,699 species in 14 weeks. Subsequent molecular analysis took about six months, reflecting the need for multiple rounds of PCR as sequence recovery was impacted by age, body size, and collection protocols. Despite these variables and the fact that specimens averaged 30.4 years old, barcode records were obtained from 86% of the species. In fact, one or more barcode compliant sequences (>487 bp were recovered from virtually all species represented by five or more individuals, even when the youngest was 50 years old. By assembling specimen images, distributional data, and DNA barcode sequences on a web-accessible informatics platform, this study has greatly advanced accessibility to information on thousands of species. Moreover, much of the specimen data became publically accessible within days of its acquisition, while most sequence results saw release within three months. As such, this study reveals the speed with which DNA barcode workflows can mobilize biodiversity data, often providing the first web-accessible information for a species. These results further suggest that existing collections can enable the rapid development of a comprehensive DNA barcode library for the most diverse compartment of terrestrial biodiversity - insects.

  15. SAM: String-based sequence search algorithm for mitochondrial DNA database queries

    Science.gov (United States)

    Röck, Alexander; Irwin, Jodi; Dür, Arne; Parsons, Thomas; Parson, Walther

    2011-01-01

    The analysis of the haploid mitochondrial (mt) genome has numerous applications in forensic and population genetics, as well as in disease studies. Although mtDNA haplotypes are usually determined by sequencing, they are rarely reported as a nucleotide string. Traditionally they are presented in a difference-coded position-based format relative to the corrected version of the first sequenced mtDNA. This convention requires recommendations for standardized sequence alignment that is known to vary between scientific disciplines, even between laboratories. As a consequence, database searches that are vital for the interpretation of mtDNA data can suffer from biased results when query and database haplotypes are annotated differently. In the forensic context that would usually lead to underestimation of the absolute and relative frequencies. To address this issue we introduce SAM, a string-based search algorithm that converts query and database sequences to position-free nucleotide strings and thus eliminates the possibility that identical sequences will be missed in a database query. The mere application of a BLAST algorithm would not be a sufficient remedy as it uses a heuristic approach and does not address properties specific to mtDNA, such as phylogenetically stable but also rapidly evolving insertion and deletion events. The software presented here provides additional flexibility to incorporate phylogenetic data, site-specific mutation rates, and other biologically relevant information that would refine the interpretation of mitochondrial DNA data. The manuscript is accompanied by freeware and example data sets that can be used to evaluate the new software (http://stringvalidation.org). PMID:21056022

  16. A DNA 'barcode blitz': rapid digitization and sequencing of a natural history collection.

    Science.gov (United States)

    Hebert, Paul D N; Dewaard, Jeremy R; Zakharov, Evgeny V; Prosser, Sean W J; Sones, Jayme E; McKeown, Jaclyn T A; Mantle, Beth; La Salle, John

    2013-01-01

    DNA barcoding protocols require the linkage of each sequence record to a voucher specimen that has, whenever possible, been authoritatively identified. Natural history collections would seem an ideal resource for barcode library construction, but they have never seen large-scale analysis because of concerns linked to DNA degradation. The present study examines the strength of this barrier, carrying out a comprehensive analysis of moth and butterfly (Lepidoptera) species in the Australian National Insect Collection. Protocols were developed that enabled tissue samples, specimen data, and images to be assembled rapidly. Using these methods, a five-person team processed 41,650 specimens representing 12,699 species in 14 weeks. Subsequent molecular analysis took about six months, reflecting the need for multiple rounds of PCR as sequence recovery was impacted by age, body size, and collection protocols. Despite these variables and the fact that specimens averaged 30.4 years old, barcode records were obtained from 86% of the species. In fact, one or more barcode compliant sequences (>487 bp) were recovered from virtually all species represented by five or more individuals, even when the youngest was 50 years old. By assembling specimen images, distributional data, and DNA barcode sequences on a web-accessible informatics platform, this study has greatly advanced accessibility to information on thousands of species. Moreover, much of the specimen data became publically accessible within days of its acquisition, while most sequence results saw release within three months. As such, this study reveals the speed with which DNA barcode workflows can mobilize biodiversity data, often providing the first web-accessible information for a species. These results further suggest that existing collections can enable the rapid development of a comprehensive DNA barcode library for the most diverse compartment of terrestrial biodiversity - insects.

  17. Mitochondrial DNA sequence variation in the Anatolian Peninsula (Turkey)

    Indian Academy of Sciences (India)

    Hatice Mergen; Reyhan Öner; Cihan Öner

    2004-04-01

    Throughout human history, the region known today as the Anatolian peninsula (Turkey) has served as a junction connecting the Middle East, Europe and Central Asia, and, thus, has been subject to major population movements. The present study is undertaken to obtain information about the distribution of the existing mitochondrial D-loop sequence variations in the Turkish population of Anatolia. A few studies have previously reported mtDNA sequences in Turks. We attempted to extend these results by analysing a cohort that is not only larger, but also more representative of the Turkish population living in Anatolia. In order to obtain a descriptive picture for the phylogenetic distribution of the mitochondrial genome within Turkey, we analysed mitochondrial D-loop region sequence variations in 75 individuals from different parts of Anatolia by direct sequencing. Analysis of the two hypervariable segments within the noncoding region of the mitochondrial genome revealed the existence of 81 nucleotide mutations at 79 sites. The neighbour-joining tree of Kimura’s distance matrix has revealed the presence of six main clusters, of which H and U are the most common. The data obtained are also compared with several European and Turkic Central Asian populations.

  18. 40 CFR 86.230-11 - Test sequence: general requirements.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 18 2010-07-01 2010-07-01 false Test sequence: general requirements. 86.230-11 Section 86.230-11 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR... New Medium-Duty Passenger Vehicles; Cold Temperature Test Procedures § 86.230-11 Test...

  19. 40 CFR 86.230-94 - Test sequence: general requirements.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 18 2010-07-01 2010-07-01 false Test sequence: general requirements. 86.230-94 Section 86.230-94 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR... New Medium-Duty Passenger Vehicles; Cold Temperature Test Procedures § 86.230-94 Test...

  20. 40 CFR 86.1773-99 - Test sequence; general requirements.

    Science.gov (United States)

    2010-07-01

    ... required for fuel-flexible and dual-fuel vehicles when operating on gasoline. Natural gas, hybrid electric and diesel-fueled vehicles shall also be exempt from 50 °F testing. (3) The following schedule... vehicle shall be approximately level during all phases of the test sequence to prevent abnormal fuel...

  1. Bacterial repetitive extragenic palindromic sequences are DNA targets for Insertion Sequence elements

    Directory of Open Access Journals (Sweden)

    Pareja Eduardo

    2006-03-01

    Full Text Available Abstract Background Mobile elements are involved in genomic rearrangements and virulence acquisition, and hence, are important elements in bacterial genome evolution. The insertion of some specific Insertion Sequences had been associated with repetitive extragenic palindromic (REP elements. Considering that there are a sufficient number of available genomes with described REPs, and exploiting the advantage of the traceability of transposition events in genomes, we decided to exhaustively analyze the relationship between REP sequences and mobile elements. Results This global multigenome study highlights the importance of repetitive extragenic palindromic elements as target sequences for transposases. The study is based on the analysis of the DNA regions surrounding the 981 instances of Insertion Sequence elements with respect to the positioning of REP sequences in the 19 available annotated microbial genomes corresponding to species of bacteria with reported REP sequences. This analysis has allowed the detection of the specific insertion into REP sequences for ISPsy8 in Pseudomonas syringae DC3000, ISPa11 in P. aeruginosa PA01, ISPpu9 and ISPpu10 in P. putida KT2440, and ISRm22 and ISRm19 in Sinorhizobium meliloti 1021 genome. Preference for insertion in extragenic spaces with REP sequences has also been detected for ISPsy7 in P. syringae DC3000, ISRm5 in S. meliloti and ISNm1106 in Neisseria meningitidis MC58 and Z2491 genomes. Probably, the association with REP elements that we have detected analyzing genomes is only the tip of the iceberg, and this association could be even more frequent in natural isolates. Conclusion Our findings characterize REP elements as hot spots for transposition and reinforce the relationship between REP sequences and genomic plasticity mediated by mobile elements. In addition, this study defines a subset of REP-recognizer transposases with high target selectivity that can be useful in the development of new tools for

  2. Noncontinuously binding loop-out primers for avoiding problematic DNA sequences in PCR and sanger sequencing.

    Science.gov (United States)

    Sumner, Kelli; Swensen, Jeffrey J; Procter, Melinda; Jama, Mohamed; Wooderchak-Donahue, Whitney; Lewis, Tracey; Fong, Michael; Hubley, Lindsey; Schwarz, Monica; Ha, Youna; Paul, Eleri; Brulotte, Benjamin; Lyon, Elaine; Bayrak-Toydemir, Pinar; Mao, Rong; Pont-Kingdon, Genevieve; Best, D Hunter

    2014-09-01

    We present a method in which noncontinuously binding (loop-out) primers are used to exclude regions of DNA that typically interfere with PCR amplification and/or analysis by Sanger sequencing. Several scenarios were tested using this design principle, including M13-tagged PCR primers, non-M13-tagged PCR primers, and sequencing primers. With this technique, a single oligonucleotide is designed in two segments that flank, but do not include, a short region of problematic DNA sequence. During PCR amplification or sequencing, the problematic region is looped-out from the primer binding site, where it does not interfere with the reaction. Using this method, we successfully excluded regions of up to 46 nucleotides. Loop-out primers were longer than traditional primers (27 to 40 nucleotides) and had higher melting temperatures. This method allows the use of a standardized PCR protocol throughout an assay, keeps the number of PCRs to a minimum, reduces the chance for laboratory error, and, above all, does not interrupt the clinical laboratory workflow.

  3. Two Dimensional Yau-Hausdorff Distance with Applications on Comparison of DNA and Protein Sequences.

    Directory of Open Access Journals (Sweden)

    Kun Tian

    Full Text Available Comparing DNA or protein sequences plays an important role in the functional analysis of genomes. Despite many methods available for sequences comparison, few methods retain the information content of sequences. We propose a new approach, the Yau-Hausdorff method, which considers all translations and rotations when seeking the best match of graphical curves of DNA or protein sequences. The complexity of this method is lower than that of any other two dimensional minimum Hausdorff algorithm. The Yau-Hausdorff method can be used for measuring the similarity of DNA sequences based on two important tools: the Yau-Hausdorff distance and graphical representation of DNA sequences. The graphical representations of DNA sequences conserve all sequence information and the Yau-Hausdorff distance is mathematically proved as a true metric. Therefore, the proposed distance can preciously measure the similarity of DNA sequences. The phylogenetic analyses of DNA sequences by the Yau-Hausdorff distance show the accuracy and stability of our approach in similarity comparison of DNA or protein sequences. This study demonstrates that Yau-Hausdorff distance is a natural metric for DNA and protein sequences with high level of stability. The approach can be also applied to similarity analysis of protein sequences by graphic representations, as well as general two dimensional shape matching.

  4. Two Dimensional Yau-Hausdorff Distance with Applications on Comparison of DNA and Protein Sequences.

    Science.gov (United States)

    Tian, Kun; Yang, Xiaoqian; Kong, Qin; Yin, Changchuan; He, Rong L; Yau, Stephen S-T

    2015-01-01

    Comparing DNA or protein sequences plays an important role in the functional analysis of genomes. Despite many methods available for sequences comparison, few methods retain the information content of sequences. We propose a new approach, the Yau-Hausdorff method, which considers all translations and rotations when seeking the best match of graphical curves of DNA or protein sequences. The complexity of this method is lower than that of any other two dimensional minimum Hausdorff algorithm. The Yau-Hausdorff method can be used for measuring the similarity of DNA sequences based on two important tools: the Yau-Hausdorff distance and graphical representation of DNA sequences. The graphical representations of DNA sequences conserve all sequence information and the Yau-Hausdorff distance is mathematically proved as a true metric. Therefore, the proposed distance can preciously measure the similarity of DNA sequences. The phylogenetic analyses of DNA sequences by the Yau-Hausdorff distance show the accuracy and stability of our approach in similarity comparison of DNA or protein sequences. This study demonstrates that Yau-Hausdorff distance is a natural metric for DNA and protein sequences with high level of stability. The approach can be also applied to similarity analysis of protein sequences by graphic representations, as well as general two dimensional shape matching.

  5. A fast DNA sequence handling program for Apple II computer in BASIC and 6502 assembler.

    Science.gov (United States)

    Paolella, G

    1985-01-01

    A fast general purpose DNA handling program has been developed in BASIC and machine language. The program runs on the Apple II plus or on the Apple IIe microcomputer, without additional hardware except for disk drives and printer. The program allows file insertion and editing, translation into protein sequence, reverse translation, search for small strings and restriction enzyme sites. The homology may be shown either as a comparison of two sequences or through a matrix on screen. Two additional features are: (i) drawing restriction site maps on the printer; and (ii) simulating a gel electrophoresis of restriction fragments both on screen and on paper. All the operations are very fast. The more common tasks are carried out almost instantly; only more complex routines, like finding homology between large sequences or searching and sorting all the restriction sites in a long sequence require longer, but still quite acceptable, times (generally under 30 s).

  6. Nucleotide sequence of cloned cDNA for human pancreatic kallikrein.

    Science.gov (United States)

    Fukushima, D; Kitamura, N; Nakanishi, S

    1985-12-31

    Cloned cDNA sequences for human pancreatic kallikrein have been isolated and determined by molecular cloning and sequence analysis. The identity between human pancreatic and urinary kallikreins is indicated by the complete coincidence between the amino acid sequence deduced from the cloned cDNA sequence and that reported partially for urinary kallikrein. The active enzyme form of the human pancreatic kallikrein consists of 238 amino acids and is preceded by a signal peptide and a profragment of 24 amino acids. A sequence comparison of this with other mammalian kallikreins indicates that key amino acid residues required for both serine protease activity and kallikrein-like cleavage specificity are retained in the human sequence, and residues corresponding to some external loops of the kallikrein diverge from other kallikreins. Analyses by RNA blot hybridization, primer extension, and S1 nuclease mapping indicate that the pancreatic kallikrein mRNA is also expressed in the kidney and sublingual gland, suggesting the active synthesis of urinary kallikrein in these tissues. Furthermore, the tissue-specific regulation of the expression of the members of the human kallikrein gene family has been discussed.

  7. The chemical structure of DNA sequence signals for RNA transcription

    Science.gov (United States)

    George, D. G.; Dayhoff, M. O.

    1982-01-01

    The proposed recognition sites for RNA transcription for E. coli NRA polymerase, bacteriophage T7 RNA polymerase, and eukaryotic RNA polymerase Pol II are evaluated in the light of the requirements for efficient recognition. It is shown that although there is good experimental evidence that specific nucleic acid sequence patterns are involved in transcriptional regulation in bacteria and bacterial viruses, among the sequences now available, only in the case of the promoters recognized by bacteriophage T7 polymerase does it seem likely that the pattern is sufficient. It is concluded that the eukaryotic pattern that is investigated is not restrictive enough to serve as a recognition site.

  8. Complete genome sequence of chloroplast DNA (cpDNA) of Chlorella sorokiniana.

    Science.gov (United States)

    Orsini, Massimiliano; Cusano, Roberto; Costelli, Cristina; Malavasi, Veronica; Concas, Alessandro; Angius, Andrea; Cao, Giacomo

    2016-01-01

    The complete chloroplast genome sequence of Chlorella sorokiniana strain (SAG 111-8 k) is presented in this study. The genome consists of circular chromosomes of 109,811 bp, which encode a total of 109 genes, including 74 proteins, 3 rRNAs and 31 tRNAs. Moreover, introns are not detected and all genes are present in single copy. The overall AT contents of the C. sorokiniana cpDNA is 65.9%, the coding sequence is 59.1% and a large inverted repeat (IR) is not observed.

  9. Mapping vaccinia virus DNA replication origins at nucleotide level by deep sequencing.

    Science.gov (United States)

    Senkevich, Tatiana G; Bruno, Daniel; Martens, Craig; Porcella, Stephen F; Wolf, Yuri I; Moss, Bernard

    2015-09-01

    Poxviruses reproduce in the host cytoplasm and encode most or all of the enzymes and factors needed for expression and synthesis of their double-stranded DNA genomes. Nevertheless, the mode of poxvirus DNA replication and the nature and location of the replication origins remain unknown. A current but unsubstantiated model posits only leading strand synthesis starting at a nick near one covalently closed end of the genome and continuing around the other end to generate a concatemer that is subsequently resolved into unit genomes. The existence of specific origins has been questioned because any plasmid can replicate in cells infected by vaccinia virus (VACV), the prototype poxvirus. We applied directional deep sequencing of short single-stranded DNA fragments enriched for RNA-primed nascent strands isolated from the cytoplasm of VACV-infected cells to pinpoint replication origins. The origins were identified as the switching points of the fragment directions, which correspond to the transition from continuous to discontinuous DNA synthesis. Origins containing a prominent initiation point mapped to a sequence within the hairpin loop at one end of the VACV genome and to the same sequence within the concatemeric junction of replication intermediates. These findings support a model for poxvirus genome replication that involves leading and lagging strand synthesis and is consistent with the requirements for primase and ligase activities as well as earlier electron microscopic and biochemical studies implicating a replication origin at the end of the VACV genome.

  10. Distinguishing forest and savanna African elephants using short nuclear DNA sequences.

    Science.gov (United States)

    Ishida, Yasuko; Demeke, Yirmed; van Coeverden de Groot, Peter J; Georgiadis, Nicholas J; Leggett, Keith E A; Fox, Virginia E; Roca, Alfred L

    2011-01-01

    A more complete description of African elephant phylogeography would require a method that distinguishes forest and savanna elephants using DNA from low-quality samples. Although mitochondrial DNA is often the marker of choice for species identification, the unusual cytonuclear patterns in African elephants make nuclear markers more reliable. We therefore designed and utilized genetic markers for short nuclear DNA regions that contain fixed nucleotide differences between forest and savanna elephants. We used M13 forward and reverse sequences to increase the total length of PCR amplicons and to improve the quality of sequences for the target DNA. We successfully sequenced fragments of nuclear genes from dung samples of known savanna and forest elephants in the Democratic Republic of Congo, Ethiopia, and Namibia. Elephants at previously unexamined locations were found to have nucleotide character states consistent with their status as savanna or forest elephants. Using these and results from previous studies, we estimated that the short-amplicon nuclear markers could distinguish forest from savanna African elephants with more than 99% accuracy. Nuclear genotyping of museum, dung, or ivory samples will provide better-informed conservation management of Africa's elephants.

  11. Interaction between Escherichia coli DNA polymerase IV and single-stranded DNA-binding protein is required for DNA synthesis on SSB-coated DNA.

    Science.gov (United States)

    Furukohri, Asako; Nishikawa, Yoshito; Akiyama, Masahiro Tatsumi; Maki, Hisaji

    2012-07-01

    DNA polymerase IV (Pol IV) is one of three translesion polymerases in Escherichia coli. A mass spectrometry study revealed that single-stranded DNA-binding protein (SSB) in lysates prepared from exponentially-growing cells has a strong affinity for column-immobilized Pol IV. We found that purified SSB binds directly to Pol IV in a pull-down assay, whereas SSBΔC8, a mutant protein lacking the C-terminal tail, failed to interact with Pol IV. These results show that the interaction between Pol IV and SSB is mediated by the C-terminal tail of SSB. When polymerase activity was tested on an SSBΔC8-coated template, we observed a strong inhibition of Pol IV activity. Competition experiments using a synthetic peptide containing the amino acid sequence of SSB tail revealed that the chain-elongating capacity of Pol IV was greatly impaired when the interaction between Pol IV and SSB tail was inhibited. These results demonstrate that Pol IV requires the interaction with the C-terminal tail of SSB to replicate DNA efficiently when the template ssDNA is covered with SSB. We speculate that at the primer/template junction, Pol IV interacts with the tail of the nearest SSB tetramer on the template, and that this interaction allows the polymerase to travel along the template while disassembling SSB.

  12. Structural basis for sequence specific DNA binding and protein dimerization of HOXA13.

    Directory of Open Access Journals (Sweden)

    Yonghong Zhang

    Full Text Available The homeobox gene (HOXA13 codes for a transcription factor protein that binds to AT-rich DNA sequences and controls expression of genes during embryonic morphogenesis. Here we present the NMR structure of HOXA13 homeodomain (A13DBD bound to an 11-mer DNA duplex. A13DBD forms a dimer that binds to DNA with a dissociation constant of 7.5 nM. The A13DBD/DNA complex has a molar mass of 35 kDa consistent with two molecules of DNA bound at both ends of the A13DBD dimer. A13DBD contains an N-terminal arm (residues 324 - 329 that binds in the DNA minor groove, and a C-terminal helix (residues 362 - 382 that contacts the ATAA nucleotide sequence in the major groove. The N370 side-chain forms hydrogen bonds with the purine base of A5* (base paired with T5. Side-chain methyl groups of V373 form hydrophobic contacts with the pyrimidine methyl groups of T5, T6* and T7*, responsible for recognition of TAA in the DNA core. I366 makes similar methyl contacts with T3* and T4*. Mutants (I366A, N370A and V373G all have decreased DNA binding and transcriptional activity. Exposed protein residues (R337, K343, and F344 make intermolecular contacts at the protein dimer interface. The mutation F344A weakens protein dimerization and lowers transcriptional activity by 76%. We conclude that the non-conserved residue, V373 is critical for structurally recognizing TAA in the major groove, and that HOXA13 dimerization is required to activate transcription of target genes.

  13. Anti-DNA antibodies: Sequencing, cloning, and expression

    Energy Technology Data Exchange (ETDEWEB)

    Barry, M.M.

    1992-01-01

    To gain some insight into the mechanism of systemic lupus erythematosus, and the interactions involved in proteins binding to DNA four anti-DNA antibodies have been investigated. Two of the antibodies, Hed 10 and Jel 242, have previously been prepared from female NZB/NZW mice which develop an autoimmune disease resembling human SLE. The remaining two antibodies, Jel 72 and Jel 318, have previously been produced via immunization of C57BL/6 mice. The isotypes of the four antibodies investigated in this thesis were determined by an enzyme-linked-immunosorbent assay. All four antibodies contained [kappa] light chains and [gamma]2a heavy chains except Jel 318 which contains a [gamma]2b heavy chain. The complete variable regions of the heavy and light chains of these four antibodies were sequenced from their respective mRNAs. The gene segments and variable gene families expressed in each antibody were identified. Analysis of the genes used in the autoimmune anti-DNA antibodies and those produced by immunization indicated no obvious differences to account for their different origins. Examination of the amino acid residues present in the complementary-determining regions of these four antibodies indicates a preference for aromatic amino acids. Jel 72 and Jel 242 contain three arginine residues in the third complementary-determining region. A single-chain Fv and the variable region of the heavy chain of Hed 10 were expressed in Escherichia coli. Expression resulted in the production of a 26,000 M[sub r] protein and a 15,000 M[sub r] protein. An immunoblot indicated that the 26,000 M[sub r] protein was the Fv for Hed 10, while the 15,000 M[sub r] protein was shown to bind poly (dT). The contribution of the heavy chain to DNA binding was assessed.

  14. Nucleotide sequences related to the transforming gene of avian sarcoma virus are present in DNA of uninfected vertebrates.

    Science.gov (United States)

    Spector, D H; Varmus, H E; Bishop, J M

    1978-09-01

    We have detected nucleotide sequences related to the transforming gene of avian sarcoma vius (ASV) in the DNA of uninfected vertebrates. Purified radioactive DNA (cDNAsarc) complementary to most of all of the gene (src) required for transformation of fibroblasts by ASV was annealed with DNA from a variety of normal species. Under conditions that facilitate pairing of partially matched nucleotide sequences (1.5 M NaCl, 59 degrees), cDNAsarc formed duplexes with chicken, human, calf, mouse, and salmon DNA but not with DNA from sea urchin, Drosophila, or Escherichia coli. The kinetics of duplex formation indicated that cDNAsarc was reacting with nucleotide sequences present in a single copy or at most a few copies per cell. In contrast to the preceding findings, nucleotide sequences complementary to the remainder of the ASV genome were observed only in chicken DNA. Thermal denaturation studies of the duplexes formed with cDNAsarc indicated a high degree of conservation of the nucleotide sequences related to src in vertebrate DNAs; the reductions in melting temperature suggested about 3--4% mismatching of cDNAsarc with chicken DNA and 8--10% mismatching of cDNAsarc with the other vertebrate DNAs.

  15. Analysis of DNA sequences by an optical time-integrating correlator.

    Science.gov (United States)

    Brousseau, N; Brousseau, R; Salt, J W; Gutz, L; Tucker, M D

    1992-08-10

    The analysis of the molecular structure called DNA is of particular interest for the understanding of the basic processes governing life. Correlation techniques implemented on digital computers are currently used to do this analysis, but the process is so slow that the mapping and sequencing of the entire human genome requires a computational breakthrough. This paper presents a new method of performing the analysis of DNA sequences with an optical time-integrating correlator. The method is characterized by short processing times that make the analysis of the entire human genome a tractable enterprise. A processing strategy and the resultant processing times are presented. Experimental proofs of concept for the two types of analysis specified by the strategy are also included.

  16. High throughput DNA sequence variant detection by conformation sensitive capillary electrophoresis and automated peak comparison.

    Science.gov (United States)

    Davies, Helen; Dicks, Ed; Stephens, Philip; Cox, Charles; Teague, Jon; Greenman, Chris; Bignell, Graham; O'meara, Sarah; Edkins, Sarah; Parker, Adrian; Stevens, Claire; Menzies, Andrew; Blow, Matt; Bottomley, Bill; Dronsfield, Mark; Futreal, P Andrew; Stratton, Michael R; Wooster, Richard

    2006-03-01

    We report the development of a heteroduplex-based mutation detection method using multicapillary automated sequencers, known as conformation-sensitive capillary electrophoresis (CSCE). Our optimized CSCE protocol detected 93 of 95 known base substitution sequence variants. Since the optimization of the method, we have analyzed 215 Mb of DNA and identified 3397 unique variants. An analysis of this data set indicates that the sensitivity of CSCE is above 95% in the central 56% of the average PCR product. To fully exploit the mutation detection capacity of this method, we have developed software, canplot, which automatically compares normal and test results to prioritize samples that are most likely to contain variants. Using multiple fluorescent dyes, CSCE has the capacity to screen over 2.2 Mb on one ABI3730 each day. Therefore this technique is suitable for projects where a rapid and sensitive DNA mutation detection system is required.

  17. The DNA sequence of the human X chromosome.

    Science.gov (United States)

    Ross, Mark T; Grafham, Darren V; Coffey, Alison J; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R; Burrows, Christine; Bird, Christine P; Frankish, Adam; Lovell, Frances L; Howe, Kevin L; Ashurst, Jennifer L; Fulton, Robert S; Sudbrak, Ralf; Wen, Gaiping; Jones, Matthew C; Hurles, Matthew E; Andrews, T Daniel; Scott, Carol E; Searle, Stephen; Ramser, Juliane; Whittaker, Adam; Deadman, Rebecca; Carter, Nigel P; Hunt, Sarah E; Chen, Rui; Cree, Andrew; Gunaratne, Preethi; Havlak, Paul; Hodgson, Anne; Metzker, Michael L; Richards, Stephen; Scott, Graham; Steffen, David; Sodergren, Erica; Wheeler, David A; Worley, Kim C; Ainscough, Rachael; Ambrose, Kerrie D; Ansari-Lari, M Ali; Aradhya, Swaroop; Ashwell, Robert I S; Babbage, Anne K; Bagguley, Claire L; Ballabio, Andrea; Banerjee, Ruby; Barker, Gary E; Barlow, Karen F; Barrett, Ian P; Bates, Karen N; Beare, David M; Beasley, Helen; Beasley, Oliver; Beck, Alfred; Bethel, Graeme; Blechschmidt, Karin; Brady, Nicola; Bray-Allen, Sarah; Bridgeman, Anne M; Brown, Andrew J; Brown, Mary J; Bonnin, David; Bruford, Elspeth A; Buhay, Christian; Burch, Paula; Burford, Deborah; Burgess, Joanne; Burrill, Wayne; Burton, John; Bye, Jackie M; Carder, Carol; Carrel, Laura; Chako, Joseph; Chapman, Joanne C; Chavez, Dean; Chen, Ellson; Chen, Guan; Chen, Yuan; Chen, Zhijian; Chinault, Craig; Ciccodicola, Alfredo; Clark, Sue Y; Clarke, Graham; Clee, Chris M; Clegg, Sheila; Clerc-Blankenburg, Kerstin; Clifford, Karen; Cobley, Vicky; Cole, Charlotte G; Conquer, Jen S; Corby, Nicole; Connor, Richard E; David, Robert; Davies, Joy; Davis, Clay; Davis, John; Delgado, Oliver; Deshazo, Denise; Dhami, Pawandeep; Ding, Yan; Dinh, Huyen; Dodsworth, Steve; Draper, Heather; Dugan-Rocha, Shannon; Dunham, Andrew; Dunn, Matthew; Durbin, K James; Dutta, Ireena; Eades, Tamsin; Ellwood, Matthew; Emery-Cohen, Alexandra; Errington, Helen; Evans, Kathryn L; Faulkner, Louisa; Francis, Fiona; Frankland, John; Fraser, Audrey E; Galgoczy, Petra; Gilbert, James; Gill, Rachel; Glöckner, Gernot; Gregory, Simon G; Gribble, Susan; Griffiths, Coline; Grocock, Russell; Gu, Yanghong; Gwilliam, Rhian; Hamilton, Cerissa; Hart, Elizabeth A; Hawes, Alicia; Heath, Paul D; Heitmann, Katja; Hennig, Steffen; Hernandez, Judith; Hinzmann, Bernd; Ho, Sarah; Hoffs, Michael; Howden, Phillip J; Huckle, Elizabeth J; Hume, Jennifer; Hunt, Paul J; Hunt, Adrienne R; Isherwood, Judith; Jacob, Leni; Johnson, David; Jones, Sally; de Jong, Pieter J; Joseph, Shirin S; Keenan, Stephen; Kelly, Susan; Kershaw, Joanne K; Khan, Ziad; Kioschis, Petra; Klages, Sven; Knights, Andrew J; Kosiura, Anna; Kovar-Smith, Christie; Laird, Gavin K; Langford, Cordelia; Lawlor, Stephanie; Leversha, Margaret; Lewis, Lora; Liu, Wen; Lloyd, Christine; Lloyd, David M; Loulseged, Hermela; Loveland, Jane E; Lovell, Jamieson D; Lozado, Ryan; Lu, Jing; Lyne, Rachael; Ma, Jie; Maheshwari, Manjula; Matthews, Lucy H; McDowall, Jennifer; McLaren, Stuart; McMurray, Amanda; Meidl, Patrick; Meitinger, Thomas; Milne, Sarah; Miner, George; Mistry, Shailesh L; Morgan, Margaret; Morris, Sidney; Müller, Ines; Mullikin, James C; Nguyen, Ngoc; Nordsiek, Gabriele; Nyakatura, Gerald; O'Dell, Christopher N; Okwuonu, Geoffery; Palmer, Sophie; Pandian, Richard; Parker, David; Parrish, Julia; Pasternak, Shiran; Patel, Dina; Pearce, Alex V; Pearson, Danita M; Pelan, Sarah E; Perez, Lesette; Porter, Keith M; Ramsey, Yvonne; Reichwald, Kathrin; Rhodes, Susan; Ridler, Kerry A; Schlessinger, David; Schueler, Mary G; Sehra, Harminder K; Shaw-Smith, Charles; Shen, Hua; Sheridan, Elizabeth M; Shownkeen, Ratna; Skuce, Carl D; Smith, Michelle L; Sotheran, Elizabeth C; Steingruber, Helen E; Steward, Charles A; Storey, Roy; Swann, R Mark; Swarbreck, David; Tabor, Paul E; Taudien, Stefan; Taylor, Tineace; Teague, Brian; Thomas, Karen; Thorpe, Andrea; Timms, Kirsten; Tracey, Alan; Trevanion, Steve; Tromans, Anthony C; d'Urso, Michele; Verduzco, Daniel; Villasana, Donna; Waldron, Lenee; Wall, Melanie; Wang, Qiaoyan; Warren, James; Warry, Georgina L; Wei, Xuehong; West, Anthony; Whitehead, Siobhan L; Whiteley, Mathew N; Wilkinson, Jane E; Willey, David L; Williams, Gabrielle; Williams, Leanne; Williamson, Angela; Williamson, Helen; Wilming, Laurens; Woodmansey, Rebecca L; Wray, Paul W; Yen, Jennifer; Zhang, Jingkun; Zhou, Jianling; Zoghbi, Huda; Zorilla, Sara; Buck, David; Reinhardt, Richard; Poustka, Annemarie; Rosenthal, André; Lehrach, Hans; Meindl, Alfons; Minx, Patrick J; Hillier, Ladeana W; Willard, Huntington F; Wilson, Richard K; Waterston, Robert H; Rice, Catherine M; Vaudin, Mark; Coulson, Alan; Nelson, David L; Weinstock, George; Sulston, John E; Durbin, Richard; Hubbard, Tim; Gibbs, Richard A; Beck, Stephan; Rogers, Jane; Bentley, David R

    2005-03-17

    The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.

  18. Polymorphic DNA sequences and their application in paternity testing; Polimorficzne sekwencje DNA i ich zastosowanie w dochodzeniu spornego ojcostwa

    Energy Technology Data Exchange (ETDEWEB)

    Slomski, R. [Polska Akademia Nauk, Poznan (Poland). Zaklad Genetyki Czlowieka]|[Akademia Rolnicza, Poznan (Poland)]|[Laboratorium Genetyki Molekularnej, Poznan (Poland); Kwiatkowska, J.; Chlebowska, H. [Polska Akademia Nauk, Poznan (Poland). Zaklad Genetyki Czlowieka; Siemieniallo, B. [Akademia Rolnicza, Poznan (Poland); Slomska, M. [Laboratorium Genetyki Molekularnej, Poznan (Poland)

    1994-12-31

    Characteristics of polymorphic sequences of DNA, especially satellite, mini satellite and micro satellite sequences are presented. Own experience from the use of multi and single locus analysis of DNA in paternity testing has been compared with the results of research in other laboratories. Critical points of both types of analysis are discussed. (author). 53 refs, 4 figs, 2 tabs.

  19. Determination of cDNA and genomic DNA sequences of hevamine, a chitinase from the rubber tree Hevea brasiliensis

    NARCIS (Netherlands)

    Bokma, E; Spiering, M; Chow, KS; Mulder, PPMFA; Subroto, T; Beintema, JJ

    2001-01-01

    Hevamine is a chitinase from the rubber tree Hevea brasiliensis and belongs to the family 18 glycosyl hydrolases. This paper describes the cloning of hevamine DNA and cDNA sequences. Hevamine contains a signal peptide at the N-terminus and a putative vacuolar targeting sequence at the C-terminus whi

  20. Construction of a Sequencing Library from Circulating Cell-Free DNA.

    Science.gov (United States)

    Fang, Nan; Löffert, Dirk; Akinci-Tolun, Rumeysa; Heitz, Katja; Wolf, Alexander

    2016-04-01

    Circulating DNA is cell-free DNA (cfDNA) in serum or plasma that can be used for non-invasive prenatal testing, as well as cancer diagnosis, prognosis, and stratification. High-throughput sequence analysis of the cfDNA with next-generation sequencing technologies has proven to be a highly sensitive and specific method in detecting and characterizing mutations in cancer and other diseases, as well as aneuploidy during pregnancy. This unit describes detailed procedures to extract circulating cfDNA from human serum and plasma and generate sequencing libraries from a wide concentration range of circulating DNA.

  1. Phylogeny of Pelargonium (Geraniaceae) based on DNA sequences from three genomes

    NARCIS (Netherlands)

    Bakker, F.T.; Culham, A.; Hettiarachi, P.; Touloumendidou, T.; Gibby, M.

    2004-01-01

    Phylogenetic hypotheses for the largely South African genus Pelargonium L'Hér. (Geraniaceae) were derived based on DNA sequence data from nuclear, chloroplast and mitochondrial encoded regions. The datasets were unequally represented and comprised cpDNA trnL-F sequences for 152 taxa, nrDNA ITS seque

  2. Repetitive sequence analysis and karyotyping reveals centromere-associated DNA sequences in radish (Raphanus sativus L.).

    Science.gov (United States)

    He, Qunyan; Cai, Zexi; Hu, Tianhua; Liu, Huijun; Bao, Chonglai; Mao, Weihai; Jin, Weiwei

    2015-04-18

    Radish (Raphanus sativus L., 2n = 2x = 18) is a major root vegetable crop especially in eastern Asia. Radish root contains various nutritions which play an important role in strengthening immunity. Repetitive elements are primary components of the genomic sequence and the most important factors in genome size variations in higher eukaryotes. To date, studies about repetitive elements of radish are still limited. To better understand genome structure of radish, we undertook a study to evaluate the proportion of repetitive elements and their distribution in radish. We conducted genome-wide characterization of repetitive elements in radish with low coverage genome sequencing followed by similarity-based cluster analysis. Results showed that about 31% of the genome was composed of repetitive sequences. Satellite repeats were the most dominating elements of the genome. The distribution pattern of three satellite repeat sequences (CL1, CL25, and CL43) on radish chromosomes was characterized using fluorescence in situ hybridization (FISH). CL1 was predominantly located at the centromeric region of all chromosomes, CL25 located at the subtelomeric region, and CL43 was a telomeric satellite. FISH signals of two satellite repeats, CL1 and CL25, together with 5S rDNA and 45S rDNA, provide useful cytogenetic markers to identify each individual somatic metaphase chromosome. The centromere-specific histone H3 (CENH3) has been used as a marker to identify centromere DNA sequences. One putative CENH3 (RsCENH3) was characterized and cloned from radish. Its deduced amino acid sequence shares high similarities to those of the CENH3s in Brassica species. An antibody against B. rapa CENH3, specifically stained radish centromeres. Immunostaining and chromatin immunoprecipitation (ChIP) tests with anti-BrCENH3 antibody demonstrated that both the centromere-specific retrotransposon (CR-Radish) and satellite repeat (CL1) are directly associated with RsCENH3 in radish. Proportions

  3. Thermochemical and kinetic evidence for nucleotide-sequence-dependent RecA-DNA interactions.

    Science.gov (United States)

    Wittung, P; Ellouze, C; Maraboeuf, F; Takahashi, M; Nordèn, B

    1997-05-01

    RecA catalyses homologous recombination in Escherichia coli by promoting pairing of homologous DNA molecules after formation of a helical nucleoprotein filament with single-stranded DNA. The primary reaction of RecA with DNA is generally assumed to be unspecific. We show here, by direct measurement of the interaction enthalpy by means of isothermal titration calorimetry, that the polymerisation of RecA on single-stranded DNA depends on the DNA sequence, with a high exothermic preference for thymine bases. This enthalpic sequence preference of thymines by RecA correlates with faster binding kinetics of RecA to thymine DNA. Furthermore, the enthalpy of interaction between the RecA x DNA filament and a second DNA strand is large only when the added DNA is complementary to the bound DNA in RecA. This result suggests a possibility for a rapid search mechanism by RecA x DNA filaments for homologous DNA molecules.

  4. RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

    Science.gov (United States)

    Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

    2012-01-01

    RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate the DNA sequence or user can upload the sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various application for the sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides the tools for routinely performed sequence analysis by the biological scientists such as DNA replication, reverse compliment generation, transcription, translation, sequence specific information as total number of nucleotide bases, ATGC base contents along with their respective percentages and sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides user friendly environment for sequence analysis. It is freely available. http://www.cemb.edu.pk/sw.html RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.

  5. DNA sequence, products, and transcriptional pattern of the genes involved in production of the DNA replication inhibitor microcin B17.

    Science.gov (United States)

    Genilloud, O; Moreno, F; Kolter, R

    1989-02-01

    The 3.8-kilobase segment of plasmid DNA that contains the genes required for production of the DNA replication inhibitor microcin B17 was sequenced. The sequence contains four open reading frames which were shown to be translated in vivo by the construction of fusions to lacZ. The location of these open reading frames fits well with the location of the four microcin B17 production genes, mcbABCD, identified previously through genetic complementation. The products of the four genes have been identified, and the observed molecular weights of the proteins agree with those predicted from the nucleotide sequence. The transcription of these genes was studied by using fusions to lacZ and physical mapping of mRNA start sites. Three promoters were identified in this region. The major promoter for all the genes is a growth phase-regulated OmpR-dependent promoter located upstream of mcbA. A second promoter is located within mcbC and is responsible for a low-level basal expression of mcbD. A third promoter, located within mcbD, promotes transcription in the reverse direction starting within mcbD and extending through mcbC. The resulting mRNA appears to be an untranslated antisense transcript that could play a regulatory role in the expression of these genes.

  6. Genome-Wide Prediction of DNA Methylation Using DNA Composition and Sequence Complexity in Human

    Science.gov (United States)

    Wu, Chengchao; Yao, Shixin; Li, Xinghao; Chen, Chujia; Hu, Xuehai

    2017-01-01

    DNA methylation plays a significant role in transcriptional regulation by repressing activity. Change of the DNA methylation level is an important factor affecting the expression of target genes and downstream phenotypes. Because current experimental technologies can only assay a small proportion of CpG sites in the human genome, it is urgent to develop reliable computational models for predicting genome-wide DNA methylation. Here, we proposed a novel algorithm that accurately extracted sequence complexity features (seven features) and developed a support-vector-machine-based prediction model with integration of the reported DNA composition features (trinucleotide frequency and GC content, 65 features) by utilizing the methylation profiles of embryonic stem cells in human. The prediction results from 22 human chromosomes with size-varied windows showed that the 600-bp window achieved the best average accuracy of 94.7%. Moreover, comparisons with two existing methods further showed the superiority of our model, and cross-species predictions on mouse data also demonstrated that our model has certain generalization ability. Finally, a statistical test of the experimental data and the predicted data on functional regions annotated by ChromHMM found that six out of 10 regions were consistent, which implies reliable prediction of unassayed CpG sites. Accordingly, we believe that our novel model will be useful and reliable in predicting DNA methylation. PMID:28212312

  7. Beyond DNA Sequencing in Space: Current and Future Omics Capabilities of the Biomolecule Sequencer Payload

    Science.gov (United States)

    Wallace, Sarah

    2017-01-01

    Why do we need a DNA sequencer to support the human exploration of space? (A) Operational environmental monitoring; (1) Identification of contaminating microbes, (2) Infectious disease diagnosis, (3) Reduce down mass (sample return for environmental monitoring, crew health, etc.). (B) Research; (1) Human, (2) Animal, (3) Microbes/Cell lines, (4) Plant. (C) Med Ops; (1) Response to countermeasures, (2) Radiation, (3) Real-time analysis can influence medical intervention. (C) Support astrobiology science investigations; (1) Technology superiorly suited to in situ nucleic acid-based life detection, (2) Functional testing for integration into robotics for extraplanetary exploration mission.

  8. An improved chloroplast DNA extraction procedure for whole plastid genome sequencing.

    Science.gov (United States)

    Shi, Chao; Hu, Na; Huang, Hui; Gao, Ju; Zhao, You-Jie; Gao, Li-Zhi

    2012-01-01

    Chloroplast genomes supply valuable genetic information for evolutionary and functional studies in plants. The past five years have witnessed a dramatic increase in the number of completely sequenced chloroplast genomes with the application of second-generation sequencing technology in plastid genome sequencing projects. However, cost-effective high-throughput chloroplast DNA (cpDNA) extraction becomes a major bottleneck restricting the application, as conventional methods are difficult to make a balance between the quality and yield of cpDNAs. We first tested two traditional methods to isolate cpDNA from the three species, Oryza brachyantha, Leersia japonica and Prinsepia utihis. Both of them failed to obtain properly defined cpDNA bands. However, we developed a simple but efficient method based on sucrose gradients and found that the modified protocol worked efficiently to isolate the cpDNA from the same three plant species. We sequenced the isolated DNA samples with Illumina (Solexa) sequencing technology to test cpDNA purity according to aligning sequence reads to the reference chloroplast genomes, showing that the reference genome was properly covered. We show that 40-50% cpDNA purity is achieved with our method. Here we provide an improved method used to isolate cpDNA from angiosperms. The Illumina sequencing results suggest that the isolated cpDNA has reached enough yield and sufficient purity to perform subsequent genome assembly. The cpDNA isolation protocol thus will be widely applicable to the plant chloroplast genome sequencing projects.

  9. Quality evaluation of methyl binding domain based kits for enrichment DNA-methylation sequencing.

    Directory of Open Access Journals (Sweden)

    Tim De Meyer

    Full Text Available DNA-methylation is an important epigenetic feature in health and disease. Methylated sequence capturing by Methyl Binding Domain (MBD based enrichment followed by second-generation sequencing provides the best combination of sensitivity and cost-efficiency for genome-wide DNA-methylation profiling. However, existing implementations are numerous, and quality control and optimization require expensive external validation. Therefore, this study has two aims: 1 to identify a best performing kit for MBD-based enrichment using independent validation data, and 2 to evaluate whether quality evaluation can also be performed solely based on the characteristics of the generated sequences. Five commercially available kits for MBD enrichment were combined with Illumina GAIIx sequencing for three cell lines (HCT15, DU145, PC3. Reduced representation bisulfite sequencing data (all three cell lines and publicly available Illumina Infinium BeadChip data (DU145 and PC3 were used for benchmarking. Consistent large-scale differences in yield, sensitivity and specificity between the different kits could be identified, with Diagenode's MethylCap kit as overall best performing kit under the tested conditions. This kit could also be identified with the Fragment CpG-plot, which summarizes the CpG content of the captured fragments, implying that the latter can be used as a tool to monitor data quality. In conclusion, there are major quality differences between kits for MBD-based capturing of methylated DNA, with the MethylCap kit performing best under the used settings. The Fragment CpG-plot is able to monitor data quality based on inherent sequence data characteristics, and is therefore a cost-efficient tool for experimental optimization, but also to monitor quality throughout routine applications.

  10. Plasmid P1 replication: negative control by repeated DNA sequences.

    OpenAIRE

    Chattoraj, D; Cordes, K.; Abeles, A

    1984-01-01

    The incompatibility locus, incA, of the unit-copy plasmid P1 is contained within a fragment that is essentially a set of nine 19-base-pair repeats. One or more copies of the fragment destabilizes the plasmid when present in trans. Here we show that extra copies of incA interfere with plasmid DNA replication and that a deletion of most of incA increases plasmid copy number. Thus, incA is not essential for replication but is required for its control. When cloned in a high-copy-number vector, pi...

  11. Analysis of mitochondrial DNA sequences in patients with isolated or combined oxidative phosphorylation system deficiency.

    NARCIS (Netherlands)

    Hinttala, R.; Smeets, R.; Moilanen, J.S.; Ugalde, C.; Uusimaa, J.; Smeitink, J.A.M.; Majamaa, K.

    2006-01-01

    BACKGROUND: Enzyme deficiencies of the oxidative phosphorylation (OXPHOS) system may be caused by mutations in the mitochondrial DNA (mtDNA) or in the nuclear DNA. OBJECTIVE: To analyse the sequences of the mtDNA coding region in 25 patients with OXPHOS system deficiency to identify the underlying g

  12. A discriminative approach for unsupervised clustering of DNA sequence motifs.

    Directory of Open Access Journals (Sweden)

    Philip Stegmaier

    Full Text Available Algorithmic comparison of DNA sequence motifs is a problem in bioinformatics that has received increased attention during the last years. Its main applications concern characterization of potentially novel motifs and clustering of a motif collection in order to remove redundancy. Despite growing interest in motif clustering, the question which motif clusters to aim at has so far not been systematically addressed. Here we analyzed motif similarities in a comprehensive set of vertebrate transcription factor classes. For this we developed enhanced similarity scores by inclusion of the information coverage (IC criterion, which evaluates the fraction of information an alignment covers in aligned motifs. A network-based method enabled us to identify motif clusters with high correspondence to DNA-binding domain phylogenies and prior experimental findings. Based on this analysis we derived a set of motif families representing distinct binding specificities. These motif families were used to train a classifier which was further integrated into a novel algorithm for unsupervised motif clustering. Application of the new algorithm demonstrated its superiority to previously published methods and its ability to reproduce entrained motif families. As a result, our work proposes a probabilistic approach to decide whether two motifs represent common or distinct binding specificities.

  13. Highly parallel translation of DNA sequences into small molecules.

    Directory of Open Access Journals (Sweden)

    Rebecca M Weisinger

    Full Text Available A large body of in vitro evolution work establishes the utility of biopolymer libraries comprising 10(10 to 10(15 distinct molecules for the discovery of nanomolar-affinity ligands to proteins. Small-molecule libraries of comparable complexity will likely provide nanomolar-affinity small-molecule ligands. Unlike biopolymers, small molecules can offer the advantages of cell permeability, low immunogenicity, metabolic stability, rapid diffusion and inexpensive mass production. It is thought that such desirable in vivo behavior is correlated with the physical properties of small molecules, specifically a limited number of hydrogen bond donors and acceptors, a defined range of hydrophobicity, and most importantly, molecular weights less than 500 Daltons. Creating a collection of 10(10 to 10(15 small molecules that meet these criteria requires the use of hundreds to thousands of diversity elements per step in a combinatorial synthesis of three to five steps. With this goal in mind, we have reported a set of mesofluidic devices that enable DNA-programmed combinatorial chemistry in a highly parallel 384-well plate format. Here, we demonstrate that these devices can translate DNA genes encoding 384 diversity elements per coding position into corresponding small-molecule gene products. This robust and efficient procedure yields small molecule-DNA conjugates suitable for in vitro evolution experiments.

  14. Z-DNA-forming sequences generate large-scale deletions in mammalian cells

    OpenAIRE

    Wang, Guliang; Christensen, Laura A.; Vasquez, Karen M.

    2006-01-01

    Spontaneous chromosomal breakages frequently occur at genomic hot spots in the absence of DNA damage and can result in translocation-related human disease. Chromosomal breakpoints are often mapped near purine–pyrimidine Z-DNA-forming sequences in human tumors. However, it is not known whether Z-DNA plays a role in the generation of these chromosomal breakages. Here, we show that Z-DNA-forming sequences induce high levels of genetic instability in both bacterial and mammalian cells. In mammali...

  15. Mitochondrial DNA sequence analysis of four Alzheimer`s and Parkinson`s disease patients

    Energy Technology Data Exchange (ETDEWEB)

    Brown, M.D.; Shoffner, J.M.; Wallace, D.C. [Emory Univ. School of Medicine, Atlanta, GA (United States)] [and others

    1996-01-22

    The mitochondrial DNA (mtDNA) sequence was determined on 3 patients with Alzheimer`s disease (AD) exhibiting AD plus Parkinson`s disease (PD) neuropathologic changes and one patient with PD. Patient mtDNA sequences were compared to the standard Cambridge sequence to identify base changes. In the first AD + PD patient, 2 of the 15 nucleotide substitutions may contribute to the neuropathology, a nucleotide pair (np) 4336 transition in the tRNA{sup Gln} gene found 7.4 times more frequently in patients than in controls, and a unique np 721 transition in the 12S rRNA gene which was not found in 70 other patients or 905 controls. In the second AD + PD patient, 27 nucleotide substitutions were detected, including an np 3397 transition in the ND1 gene which converts a conserved methionine to a valine. In the third AD + PD patient, 2 polymorphic base substitutions frequently found at increased frequency in Leber`s hereditary optic neuropathy patients were observed, an np 4216 transition in ND1 and an np 13708 transition in the ND5 gene. For the PD patient, 2 novel variants were observed among 25 base substitutions, an np 1709 substitution in the 16S rRNA gene and an np 15851 missense mutation in the cytb gene. Further studies will be required to demonstrate a casual role for these base substitutions in neurodegenerative disease. 68 refs., 2 tabs.

  16. Sequence-Specific Biosensing of DNA Target through Relay PCR with Small-Molecule Fluorophore.

    Science.gov (United States)

    Yasmeen, Afshan; Du, Feng; Zhao, Yongyun; Dong, Juan; Chen, Haodong; Huang, Xin; Cui, Xin; Tang, Zhuo

    2016-07-15

    Polymerase chain reaction coupled with signal generation offers sensitive recognition of target DNA sequence; however, these procedures require fluorophore-labeled oligonucleotide probes and high-tech equipment to achieve high specificity. Therefore, intensive research has been conducted to develop reliable, convenient, and economical DNA detection methods. The relay PCR described here is the first sequence-specific detection method using a small-molecule fluorophore as a sensor and combines the classic 5'-3' exonuclease activity of Taq polymerase with an RNA mimic of GFP to build a label-free DNA detection platform. Primarily, Taq polymerase cleaves the 5' noncomplementary overhang of the target specific probe during extension of the leading primer to release a relay oligo to initiate tandem PCR of the reporting template, which encodes the sequence of RNA aptamer. Afterward, the PCR product is transcribed to mRNA, which could generate a fluorescent signal in the presence of corresponding fluorophore. In addition to high sensitivity and specificity, the flexibility of choosing different fluorescent reporting signals makes this method versatile in either single or multiple target detection.

  17. Multiplexed highly-accurate DNA sequencing of closely-related HIV-1 variants using continuous long reads from single molecule, real-time sequencing

    Science.gov (United States)

    Dilernia, Dario A.; Chien, Jung-Ting; Monaco, Daniela C.; Brown, Michael P.S.; Ende, Zachary; Deymier, Martin J.; Yue, Ling; Paxinos, Ellen E.; Allen, Susan; Tirado-Ramos, Alfredo; Hunter, Eric

    2015-01-01

    Single Molecule, Real-Time (SMRT®) Sequencing (Pacific Biosciences, Menlo Park, CA, USA) provides the longest continuous DNA sequencing reads currently available. However, the relatively high error rate in the raw read data requires novel analysis methods to deconvolute sequences derived from complex samples. Here, we present a workflow of novel computer algorithms able to reconstruct viral variant genomes present in mixtures with an accuracy of >QV50. This approach relies exclusively on Continuous Long Reads (CLR), which are the raw reads generated during SMRT Sequencing. We successfully implement this workflow for simultaneous sequencing of mixtures containing up to forty different >9 kb HIV-1 full genomes. This was achieved using a single SMRT Cell for each mixture and desktop computing power. This novel approach opens the possibility of solving complex sequencing tasks that currently lack a solution. PMID:26101252

  18. Multiplexed highly-accurate DNA sequencing of closely-related HIV-1 variants using continuous long reads from single molecule, real-time sequencing.

    Science.gov (United States)

    Dilernia, Dario A; Chien, Jung-Ting; Monaco, Daniela C; Brown, Michael P S; Ende, Zachary; Deymier, Martin J; Yue, Ling; Paxinos, Ellen E; Allen, Susan; Tirado-Ramos, Alfredo; Hunter, Eric

    2015-11-16

    Single Molecule, Real-Time (SMRT) Sequencing (Pacific Biosciences, Menlo Park, CA, USA) provides the longest continuous DNA sequencing reads currently available. However, the relatively high error rate in the raw read data requires novel analysis methods to deconvolute sequences derived from complex samples. Here, we present a workflow of novel computer algorithms able to reconstruct viral variant genomes present in mixtures with an accuracy of >QV50. This approach relies exclusively on Continuous Long Reads (CLR), which are the raw reads generated during SMRT Sequencing. We successfully implement this workflow for simultaneous sequencing of mixtures containing up to forty different >9 kb HIV-1 full genomes. This was achieved using a single SMRT Cell for each mixture and desktop computing power. This novel approach opens the possibility of solving complex sequencing tasks that currently lack a solution.

  19. Synthetic CpG islands reveal DNA sequence determinants of chromatin structure

    Science.gov (United States)

    Wachter, Elisabeth; Quante, Timo; Merusi, Cara; Arczewska, Aleksandra; Stewart, Francis; Webb, Shaun; Bird, Adrian

    2014-01-01

    The mammalian genome is punctuated by CpG islands (CGIs), which differ sharply from the bulk genome by being rich in G + C and the dinucleotide CpG. CGIs often include transcription initiation sites and display ‘active’ histone marks, notably histone H3 lysine 4 methylation. In embryonic stem cells (ESCs) some CGIs adopt a ‘bivalent’ chromatin state bearing simultaneous ‘active’ and ‘inactive’ chromatin marks. To determine whether CGI chromatin is developmentally programmed at specific genes or is imposed by shared features of CGI DNA, we integrated artificial CGI-like DNA sequences into the ESC genome. We found that bivalency is the default chromatin structure for CpG-rich, G + C-rich DNA. A high CpG density alone is not sufficient for this effect, as A + T-rich sequence settings invariably provoke de novo DNA methylation leading to loss of CGI signature chromatin. We conclude that both CpG-richness and G + C-richness are required for induction of signature chromatin structures at CGIs. DOI: http://dx.doi.org/10.7554/eLife.03397.001 PMID:25259796

  20. DNA Translocation through Nanopores at Physiological Ionic Strengths Requires Precise Nanoscale Engineering.

    Science.gov (United States)

    Franceschini, Lorenzo; Brouns, Tine; Willems, Kherim; Carlon, Enrico; Maglia, Giovanni

    2016-09-27

    Many important processes in biology involve the translocation of a biopolymer through a nanometer-scale pore. Moreover, the electrophoretic transport of DNA across nanopores is under intense investigation for single-molecule DNA sequencing and analysis. Here, we show that the precise patterning of the ClyA biological nanopore with positive charges is crucial to observe the electrophoretic translocation of DNA at physiological ionic strength. Surprisingly, the strongly electronegative 3.3 nm internal constriction of the nanopore did not require modifications. Further, DNA translocation could only be observed from the wide entry of the nanopore. Our results suggest that the engineered positive charges are important to align the DNA in order to overcome the entropic and electrostatic barriers for DNA translocation through the narrow constriction. Finally, the dependencies of nucleic acid translocations on the Debye length of the solution are consistent with a physical model where the capture of double-stranded DNA is diffusion-limited while the capture of single-stranded DNA is reaction-limited.

  1. Generating Exome Enriched Sequencing Libraries from Formalin-Fixed, Paraffin-Embedded Tissue DNA for Next-Generation Sequencing.

    Science.gov (United States)

    Marosy, Beth A; Craig, Brian D; Hetrick, Kurt N; Witmer, P Dane; Ling, Hua; Griffith, Sean M; Myers, Benjamin; Ostrander, Elaine A; Stanford, Janet L; Brody, Lawrence C; Doheny, Kimberly F

    2017-01-11

    This unit describes a technique for generating exome-enriched sequencing libraries using DNA extracted from formalin-fixed paraffin-embedded (FFPE) samples. Utilizing commercially available kits, we present a low-input FFPE workflow starting with 50 ng of DNA. This procedure includes a repair step to address damage caused by FFPE preservation that improves sequence quality. Subsequently, libraries undergo an in-solution-targeted selection for exons, followed by sequencing using the Illumina next-generation short-read sequencing platform. © 2017 by John Wiley & Sons, Inc.

  2. Sequence selective naked-eye detection of DNA harnessing extension of oligonucleotide-modified nucleotides.

    Science.gov (United States)

    Verga, Daniela; Welter, Moritz; Marx, Andreas

    2016-02-01

    DNA polymerases can efficiently and sequence selectively incorporate oligonucleotide (ODN)-modified nucleotides and the incorporated oligonucleotide strand can be employed as primer in rolling circle amplification (RCA). The effective amplification of the DNA primer by Φ29 DNA polymerase allows the sequence-selective hybridisation of the amplified strand with a G-quadruplex DNA sequence that has horse radish peroxidase-like activity. Based on these findings we develop a system that allows DNA detection with single-base resolution by naked eye.

  3. True single-molecule DNA sequencing of a pleistocene horse bone

    DEFF Research Database (Denmark)

    Orlando, Ludovic Antoine Alexandre; Ginolhac, Aurélien; Raghavan, Maanasa

    2011-01-01

    -preserved Pleistocene horse bone using the Helicos HeliScope and Illumina GAIIx platforms, respectively. We find that the percentage of endogenous DNA sequences derived from the horse is higher among the Helicos data than Illumina data. This result indicates that the molecular biology tools used to generate sequencing...... to the standard Helicos DNA template preparation protocol further increase the proportion of horse DNA for this sample by 3-fold. Comparison of Helicos-specific biases and sequence errors in modern DNA with those in ancient DNA also reveals extensive cytosine deamination damage at the 3' ends of ancient templates...

  4. A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences.

    Science.gov (United States)

    Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L

    2017-06-19

    An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5'-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5'-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. © 2017 by John Wiley & Sons, Inc. Copyright © 2017 John Wiley & Sons, Inc.

  5. Cloning and molecular genetics analyses of Deschampsia antarctica Desv. chloroplast and mitochondrial DNA sequence

    Directory of Open Access Journals (Sweden)

    O.P. Savchuk

    2012-03-01

    Full Text Available Chloroplast and mitochondrial DNA sequences of Deschampsia antarctica were studied. We had made comparison analysis with completely sequenced genomes of other temperateness plants to find homology.

  6. Pairwise selection assembly for sequence-independent construction of long-length DNA.

    Science.gov (United States)

    Blake, William J; Chapman, Brad A; Zindal, Anuradha; Lee, Michael E; Lippow, Shaun M; Baynes, Brian M

    2010-05-01

    The engineering of biological components has been facilitated by de novo synthesis of gene-length DNA. Biological engineering at the level of pathways and genomes, however, requires a scalable and cost-effective assembly of DNA molecules that are longer than approximately 10 kb, and this remains a challenge. Here we present the development of pairwise selection assembly (PSA), a process that involves hierarchical construction of long-length DNA through the use of a standard set of components and operations. In PSA, activation tags at the termini of assembly sub-fragments are reused throughout the assembly process to activate vector-encoded selectable markers. Marker activation enables stringent selection for a correctly assembled product in vivo, often obviating the need for clonal isolation. Importantly, construction via PSA is sequence-independent, and does not require primary sequence modification (e.g. the addition or removal of restriction sites). The utility of PSA is demonstrated in the construction of a completely synthetic 91-kb chromosome arm from Saccharomyces cerevisiae.

  7. Characterization of Expressed Sequence Tags From a Gallus gallus Pineal Gland cDNA Library

    OpenAIRE

    2005-01-01

    The pineal gland is the circadian oscillator in the chicken, regulating diverse functions ranging from egg laying to feeding. Here, we describe the isolation and characterization of expressed sequence tags (ESTs) isolated from a chicken pineal gland cDNA library. A total of 192 unique sequences were analysed and submitted to GenBank; 6% of the ESTs matched neither GenBank cDNA sequences nor the newly assembled chicken genomic DNA sequence, three ESTs aligned with sequences designated to be on...

  8. Sequences Characterization of Microsatellite DNA Sequences in Pacific Abalone (Haliotis discus hannat)

    Institute of Scientific and Technical Information of China (English)

    LI Qi; Kijima Akihiro

    2007-01-01

    The microsatellite-enriched library was constructed using magnetic bead hybridization selection method, and the microsatellite DNA sequences were analyzed in Pacific abalone Haliotis discus hannai. Three hundred and fifty white colonies were screened using PCR-based technique, and 84 clones were identified to potentially contain microsatellite repeat motif. The 84 clones were sequenced, and 42 microsatellites and 4 minisatellites with a minimum of five repeats were found (13.1% of white colonies screened). Besides the motif of CA contained in the oligoprobe, we also found other 16 types of microsatellite repeats including a dinucleotide repeat, two tetranucleotide repeats, twelve pentanucleotide repeats and a hexanucleotide repeat. According to Weber(1990), the microsatellite sequences obtained could be categorized structurally into perfect repeats (73.3%), imperfect repeats(13.3%), and compound repeats (13.4%). Among the microsatellite repeats, relatively short arrays (< 20 repeats) were most abundant,accounting for 75.0%. The largest length of microsatellites was 48 repeats, and the average number of repeats was 13.4. The data on the composition and length distribution of microsatellites obtained in the present study can be useful for choosing the repeat motifs for microsatetlite isolation in other abalone species.

  9. Unimodular sequence design under frequency hopping communication compatibility requirements

    Science.gov (United States)

    Ge, Peng; Cui, Guolong; Kong, Lingjiang; Yang, Jianyu

    2016-12-01

    The integrated design for both radar and anonymous communication has drawn more attention recently since wireless communication system appeals to enhance security and reliability. Given the frequency hopping (FH) communication system, an effective way to realize integrated design is to meet the spectrum compatibility between these two systems. The paper deals with a unimodular sequence design technique which considers optimizing both the spectrum compatibility and peak sidelobes levels (PSL) of auto-correlation function (ACF). The spectrum compatibility requirement realizes anonymous communication for the FH system and provides this system lower probability of intercept (LPI) since the spectrum of the FH system is hidden in that of the radar system. The proposed algorithm, named generalized fitting template (GFT) technique, converts the sequence optimization design problem to a iterative fitting process. In this process, the power spectrum density (PSD) and PSL behaviors of the generated sequences fit both PSD and PSL templates progressively. Two templates are established based on the spectrum compatibility requirement and the expected PSL. As noted, in order to ensure the communication security and reliability, spectrum compatibility requirement is given a higher priority to achieve in the GFT algorithm. This algorithm realizes this point by adjusting the weight adaptively between these two terms during the iteration process. The simulation results are analyzed in terms of bit error rate (BER), PSD, PSL, and signal-interference rate (SIR) for both the radar and FH systems. The performance of GFT is compared with SCAN, CAN, FRE, CYC, and MAT algorithms in the above aspects, which shows its good effectiveness.

  10. DUC-Curve, a highly compact 2D graphical representation of DNA sequences and its application in sequence alignment

    Science.gov (United States)

    Li, Yushuang; Liu, Qian; Zheng, Xiaoqi

    2016-08-01

    A highly compact and simple 2D graphical representation of DNA sequences, named DUC-Curve, is constructed through mapping four nucleotides to a unit circle with a cyclic order. DUC-Curve could directly detect nucleotide, di-nucleotide compositions and microsatellite structure from DNA sequences. Moreover, it also could be used for DNA sequence alignment. Taking geometric center vectors of DUC-Curves as sequence descriptor, we perform similarity analysis on the first exons of β-globin genes of 11 species, oncogene TP53 of 27 species and twenty-four Influenza A viruses, respectively. The obtained reasonable results illustrate that the proposed method is very effective in sequence comparison problems, and will at least play a complementary role in classification and clustering problems.

  11. Relaxase DNA Binding and Cleavage Are Two Distinguishable Steps in Conjugative DNA Processing That Involve Different Sequence Elements of the nic Site*

    Science.gov (United States)

    Lucas, María; González-Pérez, Blanca; Cabezas, Matilde; Moncalian, Gabriel; Rivas, Germán; de la Cruz, Fernando

    2010-01-01

    TrwC, the relaxase of plasmid R388, catalyzes a series of concerted DNA cleavage and strand transfer reactions on a specific site (nic) of its origin of transfer (oriT). nic contains the cleavage site and an adjacent inverted repeat (IR2). Mutation analysis in the nic region indicated that recognition of the IR2 proximal arm and the nucleotides located between IR2 and the cleavage site were essential for supercoiled DNA processing, as judged either by in vitro nic cleavage or by mobilization of a plasmid containing oriT. Formation of the IR2 cruciform and recognition of the distal IR2 arm and loop were not necessary for these reactions to take place. On the other hand, IR2 was not involved in TrwC single-stranded DNA processing in vitro. For single-stranded DNA nic cleavage, TrwC recognized a sequence embracing six nucleotides upstream of the cleavage site and two nucleotides downstream. This suggests that TrwC DNA binding and cleavage are two distinguishable steps in conjugative DNA processing and that different sequence elements are recognized by TrwC in each step. IR2-proximal arm recognition was crucial for the initial supercoiled DNA binding. Subsequent recognition of the adjacent single-stranded DNA binding site was required to position the cleavage site in the active center of the protein so that the nic cleavage reaction could take place. PMID:20061574

  12. Base excision of oxidative purine and pyrimidine DNA damage in Saccharomyces cerevisiae by a DNA glycosylase with sequence similarity to endonuclease III from Escherichia coli.

    Science.gov (United States)

    Eide, L; Bjørås, M; Pirovano, M; Alseth, I; Berdal, K G; Seeberg, E

    1996-10-01

    One gene locus on chromosome I in Saccharomyces cerevisiae encodes a protein (YAB5_YEAST; accession no. P31378) with local sequence similarity to the DNA repair glycosylase endonuclease III from Escherichia coli. We have analyzed the function of this gene, now assigned NTG1 (endonuclease three-like glycosylase 1), by cloning, mutant analysis, and gene expression in E. coli. Targeted gene disruption of NTG1 produces a mutant that is sensitive to H2O2 and menadione, indicating that NTG1 is required for repair of oxidative DNA damage in vivo. Northern blot analysis and expression studies of a NTG1-lacZ gene fusion showed that NTG1 is induced by cell exposure to different DNA damaging agents, particularly menadione, and hence belongs to the DNA damage-inducible regulon in S. cerevisiae. When expressed in E. coli, the NTG1 gene product cleaves plasmid DNA damaged by osmium tetroxide, thus, indicating specificity for thymine glycols in DNA similarly as is the case for EndoIII. However, NTG1 also releases formamidopyrimidines from DNA with high efficiency and, hence, represents a glycosylase with a novel range of substrate recognition. Sequences similar to NTG1 from other eukaryotes, including Caenorhabditis elegans, Schizosaccharomyces pombe, and mammals, have recently been entered in the GenBank suggesting the universal presence of NTG1-like genes in higher organisms. S. cerevisiae NTG1 does not have the [4Fe-4S] cluster DNA binding domain characteristic of the other members of this family.

  13. Relaxase DNA binding and cleavage are two distinguishable steps in conjugative DNA processing that involve different sequence elements of the nic site.

    Science.gov (United States)

    Lucas, María; González-Pérez, Blanca; Cabezas, Matilde; Moncalian, Gabriel; Rivas, Germán; de la Cruz, Fernando

    2010-03-19

    TrwC, the relaxase of plasmid R388, catalyzes a series of concerted DNA cleavage and strand transfer reactions on a specific site (nic) of its origin of transfer (oriT). nic contains the cleavage site and an adjacent inverted repeat (IR(2)). Mutation analysis in the nic region indicated that recognition of the IR(2) proximal arm and the nucleotides located between IR(2) and the cleavage site were essential for supercoiled DNA processing, as judged either by in vitro nic cleavage or by mobilization of a plasmid containing oriT. Formation of the IR(2) cruciform and recognition of the distal IR(2) arm and loop were not necessary for these reactions to take place. On the other hand, IR(2) was not involved in TrwC single-stranded DNA processing in vitro. For single-stranded DNA nic cleavage, TrwC recognized a sequence embracing six nucleotides upstream of the cleavage site and two nucleotides downstream. This suggests that TrwC DNA binding and cleavage are two distinguishable steps in conjugative DNA processing and that different sequence elements are recognized by TrwC in each step. IR(2)-proximal arm recognition was crucial for the initial supercoiled DNA binding. Subsequent recognition of the adjacent single-stranded DNA binding site was required to position the cleavage site in the active center of the protein so that the nic cleavage reaction could take place.

  14. A 28,000 years old Cro-Magnon mtDNA sequence differs from all potentially contaminating modern sequences.

    Directory of Open Access Journals (Sweden)

    David Caramelli

    Full Text Available BACKGROUND: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal and early modern (Cro-Magnoid Europeans. METHODOLOGY/PRINCIPAL FINDINGS: We typed the mitochondrial DNA (mtDNA hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23 and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. CONCLUSIONS/SIGNIFICANCE: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans.

  15. Sequencing of megabase plus DNA by hybridization: Method development ENT. Final technical progress report

    Energy Technology Data Exchange (ETDEWEB)

    Crkvenjakov, R.; Drmanac, R.

    1991-01-31

    Sequencing by hybridization (SBH) is the only sequencing method based on the experimental determination of the content of oligonucleotide sequences. The data acquisition relies on the natural process of base pairing. It is possible to determine the content of complementary oligosequences in the target DNA by the process of hybridization with oligonucleotide probes of known sequences.

  16. One-way sequencing of multiple amplicons from tandem repetitive mitochondrial DNA control region.

    Science.gov (United States)

    Xu, Jiawu; Fonseca, Dina M

    2011-10-01

    Repetitive DNA sequences not only exist abundantly in eukaryotic nuclear genomes, but also occur as tandem repeats in many animal mitochondrial DNA (mtDNA) control regions. Due to concerted evolution, these repetitive sequences are highly similar or even identical within a genome. When long repetitive regions are the targets of amplification for the purpose of sequencing, multiple amplicons may result if one primer has to be located inside the repeats. Here, we show that, without separating these amplicons by gel purification or cloning, directly sequencing the mitochondrial repeats with the primer outside repetitive region is feasible and efficient. We exemplify it by sequencing the mtDNA control region of the mosquito Aedes albopictus, which harbors typical large tandem DNA repeats. This one-way sequencing strategy is optimal for population surveys.

  17. DNA interactions with a Methylene Blue redox indicator depend on the DNA length and are sequence specific.

    Science.gov (United States)

    Farjami, Elaheh; Clima, Lilia; Gothelf, Kurt V; Ferapontova, Elena E

    2010-06-01

    A DNA molecular beacon approach was used for the analysis of interactions between DNA and Methylene Blue (MB) as a redox indicator of a hybridization event. DNA hairpin structures of different length and guanine (G) content were immobilized onto gold electrodes in their folded states through the alkanethiol linker at the 5'-end. Binding of MB to the folded hairpin DNA was electrochemically studied and compared with binding to the duplex structure formed by hybridization of the hairpin DNA to a complementary DNA strand. Variation of the electrochemical signal from the DNA-MB complex was shown to depend primarily on the DNA length and sequence used: the G-C base pairs were the preferential sites of MB binding in the duplex. For short 20 nts long DNA sequences, the increased electrochemical response from MB bound to the duplex structure was consistent with the increased amount of bound and electrochemically readable MB molecules (i.e. MB molecules that are available for the electron transfer (ET) reaction with the electrode). With longer DNA sequences, the balance between the amounts of the electrochemically readable MB molecules bound to the hairpin DNA and to the hybrid was opposite: a part of the MB molecules bound to the long-sequence DNA duplex seem to be electrochemically mute due to long ET distance. The increasing electrochemical response from MB bound to the short-length DNA hybrid contrasts with the decreasing signal from MB bound to the long-length DNA hybrid and allows an "off"-"on" genosensor development.

  18. Isolation of DNA for Sequence Analysis from Herbarium Material of Some Lichen Specimens

    OpenAIRE

    Aras, Sümer; CANSARAN, Demet

    2006-01-01

    An improved protocol for the isolation of DNA from herbarium material of some lichen specimens is described. The isolated DNA is suitable for PCR reactions for DNA sequence analysis. The hexadecyltrimethylammonium bromide (CTAB) based protocol defined in this study provides a number of advantages, mainly speed and reliability. In addition, different DNA extraction protocols were examined to determine the yield of DNA from the thallus of lichen specimens. The methods examined include a CTAB ba...

  19. Isolation of DNA for Sequence Analysis from Herbarium Material of Some Lichen Specimens

    OpenAIRE

    Aras, Sümer; CANSARAN, Demet

    2014-01-01

    An improved protocol for the isolation of DNA from herbarium material of some lichen specimens is described. The isolated DNA is suitable for PCR reactions for DNA sequence analysis. The hexadecyltrimethylammonium bromide (CTAB) based protocol defined in this study provides a number of advantages, mainly speed and reliability. In addition, different DNA extraction protocols were examined to determine the yield of DNA from the thallus of lichen specimens. The methods examined include a CTAB ba...

  20. RevTrans: multiple alignment of coding DNA from aligned amino acid sequences

    DEFF Research Database (Denmark)

    Wernersson, Rasmus; Pedersen, Anders Gorm

    2003-01-01

    The simple fact that proteins are built from 20 amino acids while DNA only contains four different bases, means that the 'signal-to-noise ratio' in protein sequence alignments is much better than in alignments of DNA. Besides this information-theoretical advantage, protein alignments also benefit...... proteins. It is therefore preferable to align coding DNA at the amino acid level and it is for this purpose we have constructed the program RevTrans. RevTrans constructs a multiple DNA alignment by: (i) translating the DNA; (ii) aligning the resulting peptide sequences; and (iii) building a multiple DNA...... alignment by 'reverse translation' of the aligned protein sequences. In the resulting DNA alignment, gaps occur in groups of three corresponding to entire codons, and analogous codon positions are therefore always lined up. These features are useful when constructing multiple DNA alignments for phylogenetic...

  1. A Compression & Encryption Algorithm on DNA Sequences Using Dynamic Look up Table and Modified Huffman Techniques

    Directory of Open Access Journals (Sweden)

    Syed Mahamud Hossein

    2013-09-01

    Full Text Available Storing, transmitting and security of DNA sequences are well known research challenge. The problem has got magnified with increasing discovery and availability of DNA sequences. We have represent DNA sequence compression algorithm based on Dynamic Look Up Table (DLUT and modified Huffman technique. DLUT consists of 43(64 bases that are 64 sub-stings, each sub-string is of 3 bases long. Each sub-string are individually coded by single ASCII code from 33(! to 96(` and vice versa. Encode depends on encryption key choose by user from four base pair {a,t.g and c}and decode also require decryption key provide by the encoded user. Decoding must require authenticate input for encode the data. The sub-strings are combined into a Dynamic Look up Table based pre-coding routine. This algorithm is tested on reverse; complement & reverse complement the DNA sequences and also test on artificial DNA sequences of equivalent length. Speed of encryption and security levels are two important measurements for evaluating any encryption system. Due to proliferate of ubiquitous computing system, where digital contents are accessible through resource constraint biological database security concern is very important issue. A lot of research has been made to find an encryption system which can be run effectively in those biological databases. Information security is the most challenging question to protect the data from unauthorized user. The proposed method may protect the data from hackers. It can provide the three tier security, in tier one is ASCII code, in tier two is nucleotide (a,t,g and c choice by user and tier three is change of label or change of node position in Huffman Tree. Compression of the genome sequences will help to increase the efficiency of their use. The greatest advantage of this algorithm is fast execution, small memory occupation and easy implementation. Since the program to implement the technique have been written originally in the C language

  2. SeeDNA: A Visualization Tool for K-string Content of Long DNA Sequences and Their Randomized Counterparts

    Institute of Scientific and Technical Information of China (English)

    Junjie Shen; Shuyu Zhang; Hoong-Chien Lee; Bailin Hao

    2004-01-01

    An interactive tool to visualize the K-string composition of long DNA sequences including bacterial complete genomes is described. It is especially useful for exploring short palindromic structures in the sequences. The SeeDNA program runs on Red Hat Linux with GTK+ support. It displays two-dimensional (2D) or one-dimensional (1D) histograms of the K-string distribution of a given sequence and/or its randomized counterpart. It is also capable of showing the difference of K-string distributions between two sequences. The C source code using the GTK+package is freely available.

  3. Analysis and location of a rice BAC clone containing telomeric DNA sequences

    Institute of Scientific and Technical Information of China (English)

    翟文学; 陈浩; 颜辉煌; 严长杰; 王国梁; 朱立煌

    1999-01-01

    BAC2, a rice BAC clone containing (TTTAGGG)n homologous sequences, was analyzed by Southern hybridization and DNA sequencing of its subclones. It was disclosed that there were many tandem repeated satellite DNA sequences, called TA352, as well as simple tandem repeats consisting of TTTAGGG or its variant within the BAC2 insert. A 0. 8 kb (TTTAGGG) n-containing fragment in BAC2 was mapped in the telomere regions of at least 5 pairs of rice chromosomes by using fluorescence in situ hybridization (FISH). By RFLP analysis of low copy sequences the BAC2 clone was localized in one terminal region of chromosome 6. All the results strongly suggest that the telomeric DNA sequences of rice are TTTAGGG or its variant, and the linked satellite DNA TA352 sequences belong to telomere-associated sequences.

  4. The exonuclease activity of DNA polymerase γ is required for ligation during mitochondrial DNA replication

    Science.gov (United States)

    Macao, Bertil; Uhler, Jay P.; Siibak, Triinu; Zhu, Xuefeng; Shi, Yonghong; Sheng, Wenwen; Olsson, Monica; Stewart, James B.; Gustafsson, Claes M.; Falkenberg, Maria

    2015-01-01

    Mitochondrial DNA (mtDNA) polymerase γ (POLγ) harbours a 3′–5′ exonuclease proofreading activity. Here we demonstrate that this activity is required for the creation of ligatable ends during mtDNA replication. Exonuclease-deficient POLγ fails to pause on reaching a downstream 5′-end. Instead, the enzyme continues to polymerize into double-stranded DNA, creating an unligatable 5′-flap. Disease-associated mutations can both increase and decrease exonuclease activity and consequently impair DNA ligation. In mice, inactivation of the exonuclease activity causes an increase in mtDNA mutations and premature ageing phenotypes. These mutator mice also contain high levels of truncated, linear fragments of mtDNA. We demonstrate that the formation of these fragments is due to impaired ligation, causing nicks near the origin of heavy-strand DNA replication. In the subsequent round of replication, the nicks lead to double-strand breaks and linear fragment formation. PMID:26095671

  5. Nucleosome positioning and kinetics near transcription-start-site barriers are controlled by interplay between active remodeling and DNA sequence.

    Science.gov (United States)

    Parmar, Jyotsana J; Marko, John F; Padinhateeri, Ranjith

    2014-01-01

    We investigate how DNA sequence, ATP-dependent chromatin remodeling and nucleosome-depleted 'barriers' co-operate to determine the kinetics of nucleosome organization, in a stochastic model of nucleosome positioning and dynamics. We find that 'statistical' positioning of nucleosomes against 'barriers', hypothesized to control chromatin structure near transcription start sites, requires active remodeling and therefore cannot be described using equilibrium statistical mechanics. We show that, unlike steady-state occupancy, DNA site exposure kinetics near a barrier is dominated by DNA sequence rather than by proximity to the barrier itself. The timescale for formation of positioning patterns near barriers is proportional to the timescale for active nucleosome eviction. We also show that there are strong gene-to-gene variations in nucleosome positioning near barriers, which are eliminated by averaging over many genes. Our results suggest that measurement of nucleosome kinetics can reveal information about sequence-dependent regulation that is not apparent in steady-state nucleosome occupancy.

  6. Complete sequence analysis of 18S rDNA based on genomic DNA extraction from individual Demodex mites (Acari: Demodicidae).

    Science.gov (United States)

    Zhao, Ya-E; Xu, Ji-Ru; Hu, Li; Wu, Li-Ping; Wang, Zheng-Hang

    2012-05-01

    The study for the first time attempted to accomplish 18S ribosomal DNA (rDNA) complete sequence amplification and analysis for three Demodex species (Demodex folliculorum, Demodex brevis and Demodex canis) based on gDNA extraction from individual mites. The mites were treated by DNA Release Additive and Hot Start II DNA Polymerase so as to promote mite disruption and increase PCR specificity. Determination of D. folliculorum gDNA showed that the gDNA yield reached the highest at 1 mite, tending to descend with the increase of mite number. The individual mite gDNA was successfully used for 18S rDNA fragment (about 900 bp) amplification examination. The alignments of 18S rDNA complete sequences of individual mite samples and those of pooled mite samples ( ≥ 1000mites/sample) showed over 97% identities for each species, indicating that the gDNA extracted from a single individual mite was as satisfactory as that from pooled mites for PCR amplification. Further pairwise sequence analyses showed that average divergence, genetic distance, transition/transversion or phylogenetic tree could not effectively identify the three Demodex species, largely due to the differentiation in the D. canis isolates. It can be concluded that the individual Demodex mite gDNA can satisfy the molecular study of Demodex. 18S rDNA complete sequence is suitable for interfamily identification in Cheyletoidea, but whether it is suitable for intrafamily identification cannot be confirmed until the ascertainment of the types of Demodex mites parasitizing in dogs.

  7. Synapsis of Recombination Signal Sequences Located in cis and DNA Underwinding in V(D)J Recombination

    OpenAIRE

    Ciubotaru, Mihai; Schatz, David G.

    2004-01-01

    V(D)J recombination requires binding and synapsis of a complementary (12/23) pair of recombination signal sequences (RSSs) by the RAG1 and RAG2 proteins, aided by a high-mobility group protein, HMG1 or HMG2. Double-strand DNA cleavage within this synaptic, or paired, complex is thought to involve DNA distortion or melting near the site of cleavage. Although V(D)J recombination normally occurs between RSSs located on the same DNA molecule (in cis), all previous studies that directly assessed R...

  8. The DNA sequence and biology of human chromosome 19.

    Science.gov (United States)

    Grimwood, Jane; Gordon, Laurie A; Olsen, Anne; Terry, Astrid; Schmutz, Jeremy; Lamerdin, Jane; Hellsten, Uffe; Goodstein, David; Couronne, Olivier; Tran-Gyamfi, Mary; Aerts, Andrea; Altherr, Michael; Ashworth, Linda; Bajorek, Eva; Black, Stacey; Branscomb, Elbert; Caenepeel, Sean; Carrano, Anthony; Caoile, Chenier; Chan, Yee Man; Christensen, Mari; Cleland, Catherine A; Copeland, Alex; Dalin, Eileen; Dehal, Paramvir; Denys, Mirian; Detter, John C; Escobar, Julio; Flowers, Dave; Fotopulos, Dea; Garcia, Carmen; Georgescu, Anca M; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Ho, Isaac; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Larionov, Vladimer; Leem, Sun-Hee; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Malfatti, Stephanie; Martinez, Diego; McCready, Paula; Medina, Catherine; Morgan, Jenna; Nelson, Kathryn; Nolan, Matt; Ovcharenko, Ivan; Pitluck, Sam; Pollard, Martin; Popkie, Anthony P; Predki, Paul; Quan, Glenda; Ramirez, Lucia; Rash, Sam; Retterer, James; Rodriguez, Alex; Rogers, Stephanine; Salamov, Asaf; Salazar, Angelica; She, Xinwei; Smith, Doug; Slezak, Tom; Solovyev, Victor; Thayer, Nina; Tice, Hope; Tsai, Ming; Ustaszewska, Anna; Vo, Nu; Wagner, Mark; Wheeler, Jeremy; Wu, Kevin; Xie, Gary; Yang, Joan; Dubchak, Inna; Furey, Terrence S; DeJong, Pieter; Dickson, Mark; Gordon, David; Eichler, Evan E; Pennacchio, Len A; Richardson, Paul; Stubbs, Lisa; Rokhsar, Daniel S; Myers, Richard M; Rubin, Edward M; Lucas, Susan M

    2004-04-01

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high G + C content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in mendelian disorders, including familial hypercholesterolaemia and insulin-resistant diabetes. Nearly one-quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  9. The DNA sequence and biology of human chromosome 19

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, J; Gordon, L A; Olsen, A; Terry, A; Schmutz, J; Lamerdin, J; Hellsten, U; Goodstein, D; Couronne, O; Tran-Gyamfi, M

    2004-04-06

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high GC content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in Mendelian disorders, including familial hypercholesterolemia and insulin-resistant diabetes. Nearly one quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  10. Silicon Nanopore Devices for DNA Translocation and Sequencing Studies

    Science.gov (United States)

    Ling, Sean

    2005-03-01

    In this talk, I will discuss the recent progress [1-3] in developing solid-state nanopore devices using silicon technology. We have demonstrated a novel technique for shaping nanopores in the range of 1-10 nm, using surface-tension-driven mass flow with single nanometer precision. This technique overcomes a major technical challenge in silicon technology. I will also discuss the current effort [3] in developing integrated nanopore silicon chips with electrically addressable nanopores. These devices are used for DNA translocation and sequencing studies. This work was done in collaboration with the group of Cees Dekker at TU-Delft with partial support from FOM and Guggenheim Foundation. The work at Brown was supported by NSF-NER and NSF-NIRT. [1] A.J. Storm, J.H. Chen, X.S. Ling, H. Zandbergen, and C. Dekker, ``Fabrication of Solid-State Nanopores with Single Nanometer Precision'', Nature Materials, 2, 537 (2003). [2] A.J. Storm, J.H. Chen, X.S. Ling, H. Zandbergen, and C. Dekker, ``Electron-Beam-Induced Deformations of SiO2 Nanostructures'', Journal of Applied Physics (submitted, 2004). [3] X.S. Ling, "Addressable nanopores and micropores" (patent pending).

  11. The DNA sequence and biology of human chromosome 19

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, Jane; Gordon, Laurie A.; Olsen, Anne; Terry, Astrid; Schmutz, Jeremy; Lamerdin, Jane; Hellsten, Uffe; Goodstein, David; Couronne, Olivier; Tran-Gyamfi, Mary; Aerts, Andrea; Altherr, Michael; Ashworth, Linda; Bajorek, Eva; Black, Stacey; Branscomb, Elbert; Caenepeel, Sean; Carrano, Anthony; Caoile, Chenier; Chan, Yee Man; Christensen, Mari; Cleland, Catherine A.; Copeland, Alex; Dalin, Eileen; Dehal, Paramvir; Denys, Mirian; Detter, John C.; Escobar, Julio; Flowers, Dave; Fotopulos, Dea; Garcia, Carmen; Georgescu, Anca M.; Glavina, Tijana; Gomez, Maria; Gonzales, Eldelyn; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Ho, Issac; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Larionov, Vladimer; Leem, Sun-Hee; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Malfatti, Stephanie; Martinez, Diego; McCready, Paula; Medina, Catherine; Morgan, Jenna; Nelson, Kathryn; Nolan, Matt; Ovcharenko, Ivan; Pitluck, Sam; Pollard, Martin; Popkie, Anthony P.; Predki, Paul; Quan, Glenda; Ramirez, Lucia; Rash, Sam; Retterer, James; Rodriguez, Alex; Rogers, Stephanine; Salamov, Asaf; Salazar, Angelica; She, Xinwei; Smith, Doug; Slezak, Tom; Solovyev, Victor; Thayer, Nina; Tice, Hope; Tsai, Ming; Ustaszewska, Anna; Vo, Nu; Wagner, Mark; Wheeler, Jeremy; Wu, Kevin; Xie, Gary; Yang, Joan; Dubchak, Inna; Furey, Terrence S.; DeJong, Pieter; Dickson, Mark; Gordon, David; Eichler, Evan E.; Pennacchio, Len A.; Richardson, Paul; Stubbs, Lisa; Rokhsar, Daniel S.; Myers, Richard M.; Rubin, Edward M.; Lucas, Susan M.

    2003-09-15

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high G1C content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9 percent of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in mendelian disorders, including familial hypercholesterolaemia and insulin-resistant diabetes. Nearly one-quarter of these genes belong to tandemly arranged families, encompassing more than 25 percent of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, a nd segments of coding and non-coding conservation with the distant fish species Takifugu.

  12. Homologous recombination is required for recovery from oxidative DNA damage.

    Science.gov (United States)

    Hayashi, Michio; Umezu, Keiko

    2017-04-03

    We have been studying the genetic events, including chromosome loss, chromosome rearrangements and intragenic point mutations, that are responsible for the deletion of a URA3 marker in a loss of heterozygosity (LOH) assay in the yeast Saccharomycess cerevisiae. With this assay, we previously showed that homologous recombination plays an important role in genome maintenance in response to DNA lesions that occur spontaneously in normally growing cells. Here, to investigate DNA lesions capable of triggering homologous recombination, we examined the effects of oxidative stress, a prominent cause of endogenous DNA damage, on LOH events. Treatment of log-phase cells with H2O2 first caused growth arrest and then, during the subsequent recovery, chromosome loss and various chromosome rearrangements were induced more than 10-fold. Further analysis of the rearrangements showed that gene conversion was strongly induced, approximately 100 times more frequently than in untreated cells. Consistent with these results, two diploid strains deficient for homologous recombination, rad52Δ/rad52Δ and rad51Δ/rad51Δ, were sensitive to H2O2 treatment. In addition, chromosome DNA breaks were detected in H2O2-treated cells using pulsed-field gel electrophoresis. Altogether, these results suggest that oxidative stress induced recombinogenic lesions on chromosomes, which then triggered homologous recombination leading to chromosome rearrangements, and that this response contributed to the survival of cells afflicted by oxidative DNA damage. We therefore conclude that homologous recombination is required for the recovery of cells from oxidative stress.

  13. Genomic libraries: II. Subcloning, sequencing, and assembling large-insert genomic DNA clones.

    Science.gov (United States)

    Quail, Mike A; Matthews, Lucy; Sims, Sarah; Lloyd, Christine; Beasley, Helen; Baxter, Simon W

    2011-01-01

    Sequencing large insert clones to completion is useful for characterizing specific genomic regions, identifying haplotypes, and closing gaps in whole genome sequencing projects. Despite being a standard technique in molecular laboratories, DNA sequencing using the Sanger method can be highly problematic when complex secondary structures or sequence repeats are encountered in genomic clones. Here, we describe methods to isolate DNA from a large insert clone (fosmid or BAC), subclone the sample, and sequence the region to the highest industry standard. Troubleshooting solutions for sequencing difficult templates are discussed.

  14. Defining cosQ, the site required for termination of bacteriophage lambda DNA packaging.

    Science.gov (United States)

    Wieczorek, D J; Feiss, M

    2001-06-01

    Bacteriophage lambda is a double-stranded DNA virus that processes concatemeric DNA into virion chromosomes by cutting at specific recognition sites termed cos. A cos is composed of three subsites: cosN, the nicking site; cosB, required for packaging initiation; and cosQ, required for termination of chromosome packaging. During packaging termination, nicking of the bottom strand of cosN depends on cosQ, suggesting that cosQ is needed to deliver terminase to the bottom strand of cosN to carry out nicking. In the present work, saturation mutagenesis showed that a 7-bp segment comprises cosQ. A proposal that cosQ function requires an optimal sequence match between cosQ and cosNR, the right cosN half-site, was tested by constructing double cosQ mutants; the behavior of the double mutants was inconsistent with the proposal. Substitutions in the 17-bp region between cosQ and cosN resulted in no major defects in chromosome packaging. Insertional mutagenesis indicated that proper spacing between cosQ and cosN is required. The lethality of integral helical insertions eliminated a model in which DNA looping enables cosQ to deliver a gpA protomer for nicking at cosN. The 7 bp of cosQ coincide exactly with the recognition sequence for the Escherichia coli restriction endonuclease, EcoO109I.

  15. PriC-mediated DNA replication restart requires PriC complex formation with the single-stranded DNA-binding protein.

    Science.gov (United States)

    Wessel, Sarah R; Marceau, Aimee H; Massoni, Shawn C; Zhou, Ruobo; Ha, Taekjip; Sandler, Steven J; Keck, James L

    2013-06-14

    Frequent collisions between cellular DNA replication complexes (replisomes) and obstacles such as damaged DNA or frozen protein complexes make DNA replication fork progression surprisingly sporadic. These collisions can lead to the ejection of replisomes prior to completion of replication, which, if left unrepaired, results in bacterial cell death. As such, bacteria have evolved DNA replication restart mechanisms that function to reload replisomes onto abandoned DNA replication forks. Here, we define a direct interaction between PriC, a key Escherichia coli DNA replication restart protein, and the single-stranded DNA-binding protein (SSB), a protein that is ubiquitously associated with DNA replication forks. PriC/SSB complex formation requires evolutionarily conserved residues from both proteins, including a pair of Arg residues from PriC and the C terminus of SSB. In vitro, disruption of the PriC/SSB interface by sequence changes in either protein blocks the first step of DNA replication restart, reloading of the replicative DnaB helicase onto an abandoned replication fork. Consistent with the critical role of PriC/SSB complex formation in DNA replication restart, PriC variants that cannot bind SSB are non-functional in vivo. Single-molecule experiments demonstrate that PriC binding to SSB alters SSB/DNA complexes, exposing single-stranded DNA and creating a platform for other proteins to bind. These data lead to a model in which PriC interaction with SSB remodels SSB/DNA structures at abandoned DNA replication forks to create a DNA structure that is competent for DnaB loading.

  16. Simple, multiplexed, PCR-based barcoding of DNA enables sensitive mutation detection in liquid biopsies using sequencing.

    Science.gov (United States)

    Ståhlberg, Anders; Krzyzanowski, Paul M; Jackson, Jennifer B; Egyud, Matthew; Stein, Lincoln; Godfrey, Tony E

    2016-06-20

    Detection of cell-free DNA in liquid biopsies offers great potential for use in non-invasive prenatal testing and as a cancer biomarker. Fetal and tumor DNA fractions however can be extremely low in these samples and ultra-sensitive methods are required for their detection. Here, we report an extremely simple and fast method for introduction of barcodes into DNA libraries made from 5 ng of DNA. Barcoded adapter primers are designed with an oligonucleotide hairpin structure to protect the molecular barcodes during the first rounds of polymerase chain reaction (PCR) and prevent them from participating in mis-priming events. Our approach enables high-level multiplexing and next-generation sequencing library construction with flexible library content. We show that uniform libraries of 1-, 5-, 13- and 31-plex can be generated. Utilizing the barcodes to generate consensus reads for each original DNA molecule reduces background sequencing noise and allows detection of variant alleles below 0.1% frequency in clonal cell line DNA and in cell-free plasma DNA. Thus, our approach bridges the gap between the highly sensitive but specific capabilities of digital PCR, which only allows a limited number of variants to be analyzed, with the broad target capability of next-generation sequencing which traditionally lacks the sensitivity to detect rare variants.

  17. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution

    NARCIS (Netherlands)

    Falconer, Ester; Hills, Mark; Naumann, Ulrike; Poon, Steven S. S.; Chavez, Elizabeth A.; Sanders, Ashley D.; Zhao, Yongjun; Hirst, Martin; Lansdorp, Peter M.

    2012-01-01

    DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it po

  18. Sequencing strategy of mitochondrial HV1 and HV2 DNA with length heteroplasmy

    DEFF Research Database (Denmark)

    Rasmussen, Erik Michael; Sørensen, E; Eriksen, Birthe

    2002-01-01

    We describe a method to obtain reliable mitochondrial DNA (mtDNA) sequences downstream of the homopolymeric stretches with length heteroplasmy in the sequencing direction. The method is based on the use of junction primers that bind to a part of the homopolymeric stretch and the first 2-4 bases...

  19. Molecular characterization and physical localization of highly repetitive DNA sequences from Brazilian Alstroemeria species

    NARCIS (Netherlands)

    Kuipers, A.G.J.; Kamstra, S.A.; Jeu, de M.J.; Jacobsen, E.

    2002-01-01

    Highly repetitive DNA sequences were isolated from genomic DNA libraries of Alstroemeria psittacina and A. inodora. Among the repetitive sequences that were isolated, tandem repeats as well as dispersed repeats could be discerned. The tandem repeats belonged to a family of interlinked Sau3A subfragm

  20. A pattern matching approach for the estimation of alignment between any two given DNA sequences.

    Science.gov (United States)

    Basu, K; Sriraam, N; Richard, R J A

    2007-08-01

    For a given DNA sequence, it is well known that pair wise alignment schemes are used to determine the similarity with the DNA sequences available in the databanks. The efficiency of the alignment decides the type of amino acids and its corresponding proteins. In order to evaluate the given DNA sequence for its proteomic identity, a pattern matching approach is proposed in this paper. A block based semi-global alignment scheme is introduced to determine the similarity between the DNA sequences (known and given). The two DNA sequences are divided into blocks of equal length and alignment is performed which minimizes the computational complexity. The efficiency of the alignment scheme is evaluated using the parameter, percentage of similarity (POS). Four essential DNA version of the amino acids that emphasize the importance of proteomic functionalities are chosen as patterns and matching is performed with the known and given DNA sequences to determine the similarity between them. The ratio of amino acid counts between the two sequences is estimated and the results are compared with that of the POS value. It is found from the experimental results that higher the POS value and the pattern matching higher are the similarity between the two DNA sequences. The optimal block is also identified based on the POS value and amino acids count.

  1. MSA-PAD: DNA multiple sequence alignment framework based on PFAM accessed domain information.

    Science.gov (United States)

    Balech, Bachir; Vicario, Saverio; Donvito, Giacinto; Monaco, Alfonso; Notarangelo, Pasquale; Pesole, Graziano

    2015-08-01

    Here we present the MSA-PAD application, a DNA multiple sequence alignment framework that uses PFAM protein domain information to align DNA sequences encoding either single or multiple protein domains. MSA-PAD has two alignment options: gene and genome mode.

  2. Therapeutic modulation of endogenous gene function by agents with designed DNA-sequence specificities

    NARCIS (Netherlands)

    Uil, T.G.; Haisma, H.J.; Rots, Marianne

    2003-01-01

    Designer molecules that can specifically target pre-determined DNA sequences provide a means to modulate endogenous gene function. Different classes of sequence-specific DNA-binding agents have been developed, including triplex-forming molecules, synthetic polyamides and designer zinc finger protein

  3. Methods for sequencing GC-rich and CCT repeat DNA templates

    Science.gov (United States)

    Robinson, Donna L.

    2007-02-20

    The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.

  4. Mapping and Use of a Sequence that Targets DNA Ligase I to Sites of DNA Replication In Vivo

    OpenAIRE

    Cardoso, M. Cristina; Joseph, Cuthbert; Rahn, Hans-Peter; Reusch, Regina; Nadal-Ginard, Bernardo; Leonhardt, Heinrich

    1997-01-01

    The mammalian nucleus is highly organized, and nuclear processes such as DNA replication occur in discrete nuclear foci, a phenomenon often termed “functional organization” of the nucleus. We describe the identification and characterization of a bipartite targeting sequence (amino acids 1–28 and 111–179) that is necessary and sufficient to direct DNA ligase I to nuclear replication foci during S phase. This targeting sequence is located within the regulatory, NH2-terminal domain of the protei...

  5. Proceedings of the relevance of mass spectrometry to DNA sequence determination: Research needs for the Human Genome Program

    Energy Technology Data Exchange (ETDEWEB)

    Edmonds, C.G.; Smith, R.D. (Pacific Northwest Lab., Richland, WA (USA)); Smith, L.M. (Wisconsin Univ., Madison, WI (USA))

    1990-11-01

    A workshop was sponsored for the US Department of Energy (DOE), Office of Health and Environmental Research by Pacific Northwest Laboratory, April 4--5, 1990, in Seattle, Washington, to examine the potential role of mass spectrometry in the joint DOE/National Institutes of Health (NIH) Human Genome Program. The workshop was occasioned by recent developments in mass spectrometry that are providing new levels for selectivity, sensitivity, and, in particular, new methods of ionization appropriate for large biopolymers such as DNA. During discussions, three general mass spectrometric approaches to the determination of DNA sequence were considered: (1) the mass spectrometric detection of isotopic labels from DNA sequencing mixtures separated using gel electrophoresis, (2) the direct mass spectrometric analysis from direct ionization of unfractionated sequencing mixtures where the measured mass of the constituents functions to identify and order the base sequence (replacing separation by gel electrophoresis), and (3) an approach in which a single highly charged molecular ion of a large DNA segment produced is rapidly sequenced in an ion cyclotron resonance ion trap. The consensus of the workshop was that, on the basis of the new developments, mass spectrometry has the potential to provide the substantial increases in sequencing speed required for the Human Genome Program. 66 refs., 3 tabs.

  6. Proceedings of the relevance of mass spectrometry to DNA sequence determination: Research needs for the Human Genome Program

    Energy Technology Data Exchange (ETDEWEB)

    Edmonds, C.G.; Smith, R.D. (Pacific Northwest Lab., Richland, WA (USA)); Smith, L.M. (Wisconsin Univ., Madison, WI (USA))

    1990-11-01

    A workshop was sponsored for the US Department of Energy (DOE), Office of Health and Environmental Research by Pacific Northwest Laboratory, April 4--5, 1990, in Seattle, Washington, to examine the potential role of mass spectrometry in the joint DOE/National Institutes of Health (NIH) Human Genome Program. The workshop was occasioned by recent developments in mass spectrometry that are providing new levels for selectivity, sensitivity, and, in particular, new methods of ionization appropriate for large biopolymers such as DNA. During discussions, three general mass spectrometric approaches to the determination of DNA sequence were considered: (1) the mass spectrometric detection of isotopic labels from DNA sequencing mixtures separated using gel electrophoresis, (2) the direct mass spectrometric analysis from direct ionization of unfractionated sequencing mixtures where the measured mass of the constituents functions to identify and order the base sequence (replacing separation by gel electrophoresis), and (3) an approach in which a single highly charged molecular ion of a large DNA segment produced is rapidly sequenced in an ion cyclotron resonance ion trap. The consensus of the workshop was that, on the basis of the new developments, mass spectrometry has the potential to provide the substantial increases in sequencing speed required for the Human Genome Program. 66 refs., 3 tabs.

  7. Structural biology of disease-associated repetitive DNA sequences and protein-DNA complexes involved in DNA damage and repair

    Energy Technology Data Exchange (ETDEWEB)

    Gupta, G.; Santhana Mariappan, S.V.; Chen, X.; Catasti, P.; Silks, L.A. III; Moyzis, R.K.; Bradbury, E.M.; Garcia, A.E.

    1997-07-01

    This project is aimed at formulating the sequence-structure-function correlations of various microsatellites in the human (and other eukaryotic) genomes. Here the authors have been able to develop and apply structure biology tools to understand the following: the molecular mechanism of length polymorphism microsatellites; the molecular mechanism by which the microsatellites in the noncoding regions alter the regulation of the associated gene; and finally, the molecular mechanism by which the expansion of these microsatellites impairs gene expression and causes the disease. Their multidisciplinary structural biology approach is quantitative and can be applied to all coding and noncoding DNA sequences associated with any gene. Both NIH and DOE are interested in developing quantitative tools for understanding the function of various human genes for prevention against diseases caused by genetic and environmental effects.

  8. Sequence-Dependent Fluorescence of Cy3- and Cy5-Labeled Double-Stranded DNA.

    Science.gov (United States)

    Kretschy, Nicole; Sack, Matej; Somoza, Mark M

    2016-03-16

    The fluorescent intensity of Cy3 and Cy5 dyes is strongly dependent on the nucleobase sequence of the labeled oligonucleotides. Sequence-dependent fluorescence may significantly influence the data obtained from many common experimental methods based on fluorescence detection of nucleic acids, such as sequencing, PCR, FRET, and FISH. To quantify sequence dependent fluorescence, we have measured the fluorescence intensity of Cy3 and Cy5 bound to the 5' end of all 1024 possible double-stranded DNA 5mers. The fluorescence intensity was also determined for these dyes bound to the 5' end of fixed-sequence double-stranded DNA with a variable sequence 3' overhang adjacent to the dye. The labeled DNA oligonucleotides were made using light-directed, in situ microarray synthesis. The results indicate that the fluorescence intensity of both dyes is sensitive to all five bases or base pairs, that the sequence dependence is stronger for double- (vs single-) stranded DNA, and that the dyes are sensitive to both the adjacent dsDNA sequence and the 3'-ssDNA overhang. Purine-rich sequences result in higher fluorescence. The results can be used to estimate measurement error in experiments with fluorescent-labeled DNA, as well as to optimize the fluorescent signal by considering the nucleobase environment of the labeling cyanine dye.

  9. The immunogenicity of viral haemorragic septicaemia rhabdovirus (VHSV) DNA vaccines can depend on plasmid regulatory sequences.

    Science.gov (United States)

    Chico, V; Ortega-Villaizan, M; Falco, A; Tafalla, C; Perez, L; Coll, J M; Estepa, A

    2009-03-18

    A plasmid DNA encoding the viral hemorrhagic septicaemia virus (VHSV)-G glycoprotein under the control of 5' sequences (enhancer/promoter sequence plus both non-coding 1st exon and 1st intron sequences) from carp beta-actin gene (pAE6-G(VHSV)) was compared to the vaccine plasmid usually described the gene expression is regulated by the human cytomegalovirus (CMV) immediate-early promoter (pMCV1.4-G(VHSV)). We observed that these two plasmids produced a markedly different profile in the level and time of expression of the encoded-antigen, and this may have a direct effect upon the intensity and suitability of the in vivo immune response. Thus, fish genetic immunisation assays were carried out to study the immune response of both plasmids. A significantly enhanced specific-antibody response against the viral glycoprotein was found in the fish immunised with pAE6-G(VHSV). However, the protective efficacy against VHSV challenge conferred by both plasmids was similar. Later analysis of the transcription profile of a set of representative immune-related genes in the DNA immunized fish suggested that depending on the plasmid-related regulatory sequences controlling its expression, the plasmid might activate distinct patterns of the immune system. All together, the results from this study mainly point out that the selection of a determinate encoded-antigen/vector combination for genetic immunisation is of extraordinary importance in designing optimised DNA vaccines that, when required for inducing protective immune response, could elicit responses biased to antigen-specific antibodies or cytotoxic T cells generation.

  10. Sequencing the hypervariable regions of human mitochondrial DNA using massively parallel sequencing: Enhanced data acquisition for DNA samples encountered in forensic testing.

    Science.gov (United States)

    Davis, Carey; Peters, Dixie; Warshauer, David; King, Jonathan; Budowle, Bruce

    2015-03-01

    Mitochondrial DNA testing is a useful tool in the analysis of forensic biological evidence. In cases where nuclear DNA is damaged or limited in quantity, the higher copy number of mitochondrial genomes available in a sample can provide information about the source of a sample. Currently, Sanger-type sequencing (STS) is the primary method to develop mitochondrial DNA profiles. This method is laborious and time consuming. Massively parallel sequencing (MPS) can increase the amount of information obtained from mitochondrial DNA samples while improving turnaround time by decreasing the numbers of manipulations and more so by exploiting high throughput analyses to obtain interpretable results. In this study 18 buccal swabs, three different tissue samples from five individuals, and four bones samples from casework were sequenced at hypervariable regions I and II using STS and MPS. Sample enrichment for STS and MPS was PCR-based. Library preparation for MPS was performed using Nextera® XT DNA Sample Preparation Kit and sequencing was performed on the MiSeq™ (Illumina, Inc.). MPS yielded full concordance of base calls with STS results, and the newer methodology was able to resolve length heteroplasmy in homopolymeric regions. This study demonstrates short amplicon MPS of mitochondrial DNA is feasible, can provide information not possible with STS, and lays the groundwork for development of a whole genome sequencing strategy for degraded samples.

  11. cljam: a library for handling DNA sequence alignment/map (SAM) with parallel processing.

    Science.gov (United States)

    Takeuchi, Toshiki; Yamada, Atsuo; Aoki, Takashi; Nishimura, Kunihiro

    2016-01-01

    Next-generation sequencing can determine DNA bases and the results of sequence alignments are generally stored in files in the Sequence Alignment/Map (SAM) format and the compressed binary version (BAM) of it. SAMtools is a typical tool for dealing with files in the SAM/BAM format. SAMtools has various functions, including detection of variants, visualization of alignments, indexing, extraction of parts of the data and loci, and conversion of file formats. It is written in C and can execute fast. However, SAMtools requires an additional implementation to be used in parallel with, for example, OpenMP (Open Multi-Processing) libraries. For the accumulation of next-generation sequencing data, a simple parallelization program, which can support cloud and PC cluster environments, is required. We have developed cljam using the Clojure programming language, which simplifies parallel programming, to handle SAM/BAM data. Cljam can run in a Java runtime environment (e.g., Windows, Linux, Mac OS X) with Clojure. Cljam can process and analyze SAM/BAM files in parallel and at high speed. The execution time with cljam is almost the same as with SAMtools. The cljam code is written in Clojure and has fewer lines than other similar tools.

  12. Phylogenetic relationships within Pelargonium section Peristera (Geraniaceae) inferred from nrDNA and cpDNA sequence comparisons.

    NARCIS (Netherlands)

    Bakker, F.T.; Helbrugge, D.; Culham, A.; Gibby, M.

    1998-01-01

    Phylogenetic analysis of nrDNA ITS and tmL (UAA) 5' exon-tmF (GAA) chloroplast DNA sequences from 17 species of Pelargonium sect. Peristera, together with nine putative outgroups, suggests paraphyly for the section and a close relationship between the highly disjurmt South African and Australian spe

  13. High-Quality Exome Sequencing of Whole-Genome Amplified Neonatal Dried Blood Spot DNA

    DEFF Research Database (Denmark)

    Poulsen, Jesper Buchhave; Lescai, Francesco; Grove, Jakob;

    2016-01-01

    and WB vs WB sample types yielded similar concordance rates, with values close to 100%. WgaDNA of neonatal DBS samples performs with great accuracy and efficiency in exome sequencing. The wgaDNA performed similarly to matched high-quality reference--whole-blood DNA--based on concordance rates calculated...

  14. Carrier molecules and extraction of circulating tumor DNA for next generation sequencing in colorectal cancer.

    Science.gov (United States)

    Beránek, Martin; Sirák, Igor; Vošmik, Milan; Petera, Jiří; Drastíková, Monika; Palička, Vladimír

    The aims of the study were: i) to compare circulating tumor DNA (ctDNA) yields obtained by different manual extraction procedures, ii) to evaluate the addition of various carrier molecules into the plasma to improve ctDNA extraction recovery, and iii) to use next generation sequencing (NGS) technology to analyze KRAS, BRAF, and NRAS somatic mutations in ctDNA from patients with metastatic colorectal cancer. Venous blood was obtained from patients who suffered from metastatic colorectal carcinoma. For plasma ctDNA extraction, the following carriers were tested: carrier RNA, polyadenylic acid, glycogen, linear acrylamide, yeast tRNA, salmon sperm DNA, and herring sperm DNA. Each extract was characterized by quantitative real-time PCR and next generation sequencing. The addition of polyadenylic acid had a significant positive effect on the amount of ctDNA eluted. The sequencing data revealed five cases of ctDNA mutated in KRAS and one patient with a BRAF mutation. An agreement of 86% was found between tumor tissues and ctDNA. Testing somatic mutations in ctDNA seems to be a promising tool to monitor dynamically changing genotypes of tumor cells circulating in the body. The optimized process of ctDNA extraction should help to obtain more reliable sequencing data in patients with metastatic colorectal cancer.

  15. Raman-based system for DNA sequencing-mapping and other separations

    Science.gov (United States)

    Vo-Dinh, Tuan

    1994-01-01

    DNA sequencing and mapping are performed by using a Raman spectrometer with a surface enhanced Raman scattering (SERS) substrate to enhance the Raman signal. A SERS label is attached to a DNA fragment and then analyzed with the Raman spectrometer to identify the DNA fragment according to characteristics of the Raman spectrum generated.

  16. Computational optimisation of targeted DNA sequencing for cancer detection

    DEFF Research Database (Denmark)

    Martinez, Pierre; McGranahan, Nicholas; Birkbak, Nicolai Juul

    2013-01-01

    circulating tumour DNA (ctDNA) might represent a non-invasive method to detect mutations in patients, facilitating early detection. In this article, we define reduced gene panels from publicly available datasets as a first step to assess and optimise the potential of targeted ctDNA scans for early tumour...

  17. High Interlaboratory Reprocucibility of DNA Sequence-based Typing of Bacteria in a Multicenter Study

    DEFF Research Database (Denmark)

    Sousa, MA de; Boye, Kit; Lencastre, H de

    2006-01-01

    Current DNA amplification-based typing methods for bacterial pathogens often lack interlaboratory reproducibility. In this international study, DNA sequence-based typing of the Staphylococcus aureus protein A gene (spa, 110 to 422 bp) showed 100% intra- and interlaboratory reproducibility without...... extensive harmonization of protocols for 30 blind-coded S. aureus DNA samples sent to 10 laboratories. Specialized software for automated sequence analysis ensured a common typing nomenclature....

  18. cDNA sequencing improves the detection of P53 missense mutations in colorectal cancer

    Directory of Open Access Journals (Sweden)

    Jesionek-Kupnicka Dorota

    2009-08-01

    Full Text Available Abstract Background Recently published data showed discrepancies beteween P53 cDNA and DNA sequencing in glioblastomas. We hypothesised that similar discrepancies may be observed in other human cancers. Methods To this end, we analyzed 23 colorectal cancers for P53 mutations and gene expression using both DNA and cDNA sequencing, real-time PCR and immunohistochemistry. Results We found P53 gene mutations in 16 cases (15 missense and 1 nonsense. Two of the 15 cases with missense mutations showed alterations based only on cDNA, and not DNA sequencing. Moreover, in 6 of the 15 cases with a cDNA mutation those mutations were difficult to detect in the DNA sequencing, so the results of DNA analysis alone could be misinterpreted if the cDNA sequencing results had not also been available. In all those 15 cases, we observed a higher ratio of the mutated to the wild type template by cDNA analysis, but not by the DNA analysis. Interestingly, a similar overexpression of P53 mRNA was present in samples with and without P53 mutations. Conclusion In terms of colorectal cancer, those discrepancies might be explained under three conditions: 1, overexpression of mutated P53 mRNA in cancer cells as compared with normal cells; 2, a higher content of cells without P53 mutation (normal cells and cells showing K-RAS and/or APC but not P53 mutation in samples presenting P53 mutation; 3, heterozygous or hemizygous mutations of P53 gene. Additionally, for heterozygous mutations unknown mechanism(s causing selective overproduction of mutated allele should also be considered. Our data offer new clues for studying discrepancy in P53 cDNA and DNA sequencing analysis.

  19. Influence of DNA extraction on oral microbial profiles obtained via 16S rRNA gene sequencing

    Directory of Open Access Journals (Sweden)

    Loreto Abusleme

    2014-04-01

    Full Text Available Background and objective: The advent of next-generation sequencing has significantly facilitated characterization of the oral microbiome. Despite great efforts in streamlining the processes of sequencing and data curation, upstream steps required for amplicon library generation could still influence 16S rRNA gene-based microbial profiles. Among upstream processes, DNA extraction is a critical step that could represent a great source of bias. Accounting for bias introduced by extraction procedures is important when comparing studies that use different methods. Identifying the method that best portrays communities is also desirable. Accordingly, the aim of this study was to evaluate bias introduced by different DNA extraction procedures on oral microbiome profiles. Design: Four DNA extraction methods were tested on mock communities consisting of seven representative oral bacteria. Additionally, supragingival plaque samples were collected from seven individuals and divided equally to test two commonly used DNA extraction procedures. Amplicon libraries of the 16S rRNA gene were generated and sequenced via 454-pyrosequencing. Results: Evaluation of mock communities revealed that DNA yield and bacterial species representation varied with DNA extraction methods. Despite producing the lowest yield of DNA, a method that included bead beating was the only protocol capable of detecting all seven species in the mock community. Comparison of the performance of two commonly used methods (crude lysis and a chemical/enzymatic lysis+column-based DNA isolation on plaque samples showed no effect of extraction protocols on taxa prevalence but global community structure and relative abundance of individual taxa were affected. At the phylum level, the latter method improved the recovery of Actinobacteria, Bacteroidetes, and Spirochaetes over crude lysis. Conclusion: DNA extraction distorts microbial profiles in simulated and clinical oral samples, reinforcing the

  20. Influence of DNA extraction on oral microbial profiles obtained via 16S rRNA gene sequencing.

    Science.gov (United States)

    Abusleme, Loreto; Hong, Bo-Young; Dupuy, Amanda K; Strausbaugh, Linda D; Diaz, Patricia I

    2014-01-01

    The advent of next-generation sequencing has significantly facilitated characterization of the oral microbiome. Despite great efforts in streamlining the processes of sequencing and data curation, upstream steps required for amplicon library generation could still influence 16S rRNA gene-based microbial profiles. Among upstream processes, DNA extraction is a critical step that could represent a great source of bias. Accounting for bias introduced by extraction procedures is important when comparing studies that use different methods. Identifying the method that best portrays communities is also desirable. Accordingly, the aim of this study was to evaluate bias introduced by different DNA extraction procedures on oral microbiome profiles. Four DNA extraction methods were tested on mock communities consisting of seven representative oral bacteria. Additionally, supragingival plaque samples were collected from seven individuals and divided equally to test two commonly used DNA extraction procedures. Amplicon libraries of the 16S rRNA gene were generated and sequenced via 454-pyrosequencing. Evaluation of mock communities revealed that DNA yield and bacterial species representation varied with DNA extraction methods. Despite producing the lowest yield of DNA, a method that included bead beating was the only protocol capable of detecting all seven species in the mock community. Comparison of the performance of two commonly used methods (crude lysis and a chemical/enzymatic lysis+column-based DNA isolation) on plaque samples showed no effect of extraction protocols on taxa prevalence but global community structure and relative abundance of individual taxa were affected. At the phylum level, the latter method improved the recovery of Actinobacteria, Bacteroidetes, and Spirochaetes over crude lysis. DNA extraction distorts microbial profiles in simulated and clinical oral samples, reinforcing the importance of careful selection of a DNA extraction protocol to improve

  1. Molecular characterization and phylogeny of whipworm nematodes inferred from DNA sequences of cox1 mtDNA and 18S rDNA.

    Science.gov (United States)

    Callejón, Rocío; Nadler, Steven; De Rojas, Manuel; Zurita, Antonio; Petrášová, Jana; Cutillas, Cristina

    2013-11-01

    A molecular phylogenetic hypothesis is presented for the genus Trichuris based on sequence data from the mitochondrial cytochrome c oxidase 1 (cox1) and ribosomal 18S genes. The taxa consisted of different described species and several host-associated isolates (undescribed taxa) of Trichuris collected from hosts from Spain. Sequence data from mitochondrial cox1 (partial gene) and nuclear 18S near-complete gene were analyzed by maximum likelihood and Bayesian inference methods, as separate and combined datasets, to evaluate phylogenetic relationships among taxa. Phylogenetic results based on 18S ribosomal DNA (rDNA) were robust for relationships among species; cox1 sequences delimited species and revealed phylogeographic variation, but most relationships among Trichuris species were poorly resolved by mitochondrial sequences. The phylogenetic hypotheses for both genes strongly supported monophyly of Trichuris, and distinct genetic lineages corresponding to described species or nematodes associated with certain hosts were recognized based on cox1 sequences. Phylogenetic reconstructions based on concatenated sequences of the two loci, cox1 (mitochondrial DNA (mtDNA)) and 18S rDNA, were congruent with the overall topology inferred from 18S and previously published results based on internal transcribed spacer sequences. Our results demonstrate that the 18S rDNA and cox1 mtDNA genes provide resolution at different levels, but together resolve relationships among geographic populations and species in the genus Trichuris.

  2. Stwl modifies chromatin compaction and is required to maintain DNA integrity in the presence of perturbed DNA replication

    NARCIS (Netherlands)

    Yi, X.; Vries, de H.I.; Siudeja, K.; Rana, A.; Lemstra, W.; Brunsting, J.F.; Kok, R.J.M.; Smulders, Y.M.; Schaefer, M.; Dijk, F.; Shang, Y.F.; Eggen, B.J.L.; Kampinga, H.H.; Sibon, O.C.M.

    2009-01-01

    Hydroxyurea, a well-known DNA replication inhibitor, induces cell cycle arrest and intact checkpoint functions are required to survive DNA replication stress induced by this genotoxic agent. Perturbed DNA synthesis also results in elevated levels of DNA damage. It is unclear how organisms prevent ac

  3. Stwl Modifies Chromatin Compaction and Is Required to Maintain DNA Integrity in the Presence of Perturbed DNA Replication

    NARCIS (Netherlands)

    Yi, Xia; Vries, Hilda I. de; Siudeja, Katarzyna; Rana, Anil; Lemstra, Willy; Brunsting, Jeanette F.; Kok, Rob M.; Smulders, Yvo M.; Schaefer, Matthias; Dijk, Freark; Shang, Yongfeng; Eggen, Bart J.L.; Kampinga, Harm H.; Sibon, Ody C.M.

    2009-01-01

    Hydroxyurea, a well-known DNA replication inhibitor, induces cell cycle arrest and intact checkpoint functions are required to survive DNA replication stress induced by this genotoxic agent. Perturbed DNA synthesis also results in elevated levels of DNA damage. It is unclear how organisms prevent ac

  4. An improved chloroplast DNA extraction procedure for whole plastid genome sequencing.

    Directory of Open Access Journals (Sweden)

    Chao Shi

    Full Text Available BACKGROUND: Chloroplast genomes supply valuable genetic information for evolutionary and functional studies in plants. The past five years have witnessed a dramatic increase in the number of completely sequenced chloroplast genomes with the application of second-generation sequencing technology in plastid genome sequencing projects. However, cost-effective high-throughput chloroplast DNA (cpDNA extraction becomes a major bottleneck restricting the application, as conventional methods are difficult to make a balance between the quality and yield of cpDNAs. METHODOLOGY/PRINCIPAL FINDINGS: We first tested two traditional methods to isolate cpDNA from the three species, Oryza brachyantha, Leersia japonica and Prinsepia utihis. Both of them failed to obtain properly defined cpDNA bands. However, we developed a simple but efficient method based on sucrose gradients and found that the modified protocol worked efficiently to isolate the cpDNA from the same three plant species. We sequenced the isolated DNA samples with Illumina (Solexa sequencing technology to test cpDNA purity according to aligning sequence reads to the reference chloroplast genomes, showing that the reference genome was properly covered. We show that 40-50% cpDNA purity is achieved with our method. CONCLUSION: Here we provide an improved method used to isolate cpDNA from angiosperms. The Illumina sequencing results suggest that the isolated cpDNA has reached enough yield and sufficient purity to perform subsequent genome assembly. The cpDNA isolation protocol thus will be widely applicable to the plant chloroplast genome sequencing projects.

  5. String Match Algorithms and Applications in DNA Sequence Alignment%字符串匹配算法在 DNA 序列比对中的应用

    Institute of Scientific and Technical Information of China (English)

    陈建平

    2015-01-01

    The advancement of high-throughput sequencing technologies has led bioinformatics research into the big data era.New technologies generate huge amounts of biological genetic data,which pose significant challenges to data analysis. DNA sequence alignment is one critical step of the bioinformatics analysis flow,providing mapping information for the following variants calling processes.Question B of 201 5 “Shenzhen Cup ”Summer Camp of Mathematical Modeling discusses about DNA sequence alignment problem,requiring students to provide the best solution for fast sequence alignment.Here,we give a brief review on the students’work,then we introduce algorithms implemented in common DNA sequence alignment programs.%高通量测序技术的飞速发展让生物信息领域迎来了大数据时代。新技术在提供海量生物遗传信息的同时,也给分析这些数据带来了新的挑战。DNA 序列比对是信息分析流程中的关键步骤,为后续的变异检测提供序列比对信息。2015“深圳杯”数学建模夏令营 B 题以 DNA 序列比对为研究课题,希望参赛学生给出序列快速比对的最佳方案。本文简要点评了各参赛队伍的解答情况,然后介绍了现有 DNA 序列比对软件中用到的算法和数据结构。

  6. SRA-domain proteins required for DRM2-mediated de novo DNA methylation.

    Directory of Open Access Journals (Sweden)

    Lianna M Johnson

    2008-11-01

    Full Text Available De novo DNA methylation and the maintenance of DNA methylation in asymmetrical sequence contexts is catalyzed by homologous proteins in plants (DRM2 and animals (DNMT3a/b. In plants, targeting of DRM2 depends on small interfering RNAs (siRNAs, although the molecular details are still unclear. Here, we show that two SRA-domain proteins (SUVH9 and SUVH2 are also essential for DRM2-mediated de novo and maintenance DNA methylation in Arabidopsis thaliana. At some loci, SUVH9 and SUVH2 act redundantly, while at other loci only SUVH2 is required, and this locus specificity correlates with the differing DNA-binding affinity of the SRA domains within SUVH9 and SUVH2. Specifically, SUVH9 preferentially binds methylated asymmetric sites, while SUVH2 preferentially binds methylated CG sites. The suvh9 and suvh2 mutations do not eliminate siRNAs, suggesting a role for SUVH9 and SUVH2 late in the RNA-directed DNA methylation pathway. With these new results, it is clear that SRA-domain proteins are involved in each of the three pathways leading to DNA methylation in Arabidopsis.

  7. A novel low energy electron microscope for DNA sequencing and surface analysis

    Energy Technology Data Exchange (ETDEWEB)

    Mankos, M., E-mail: marian@electronoptica.com [Electron Optica Inc., 1000 Elwell Court #110, Palo Alto, CA 94303 (United States); Shadman, K. [Electron Optica Inc., 1000 Elwell Court #110, Palo Alto, CA 94303 (United States); Persson, H.H.J. [Stanford Genome Technology Center, Stanford University School of Medicine, 855 California Avenue, Palo Alto, CA 94304 (United States); N’Diaye, A.T. [Electron Optica Inc., 1000 Elwell Court #110, Palo Alto, CA 94303 (United States); NCEM, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720 (United States); Schmid, A.K. [NCEM, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720 (United States); Davis, R.W. [Stanford Genome Technology Center, Stanford University School of Medicine, 855 California Avenue, Palo Alto, CA 94304 (United States)

    2014-10-15

    Monochromatic, aberration-corrected, dual-beam low energy electron microscopy (MAD-LEEM) is a novel technique that is directed towards imaging nanostructures and surfaces with sub-nanometer resolution. The technique combines a monochromator, a mirror aberration corrector, an energy filter, and dual beam illumination in a single instrument. The monochromator reduces the energy spread of the illuminating electron beam, which significantly improves spectroscopic and spatial resolution. Simulation results predict that the novel aberration corrector design will eliminate the second rank chromatic and third and fifth order spherical aberrations, thereby improving the resolution into the sub-nanometer regime at landing energies as low as one hundred electron-Volts. The energy filter produces a beam that can extract detailed information about the chemical composition and local electronic states of non-periodic objects such as nanoparticles, interfaces, defects, and macromolecules. The dual flood illumination eliminates charging effects that are generated when a conventional LEEM is used to image insulating specimens. A potential application for MAD-LEEM is in DNA sequencing, which requires high resolution to distinguish the individual bases and high speed to reduce the cost. The MAD-LEEM approach images the DNA with low electron impact energies, which provides nucleobase contrast mechanisms without organometallic labels. Furthermore, the micron-size field of view when combined with imaging on the fly provides long read lengths, thereby reducing the demand on assembling the sequence. Experimental results from bulk specimens with immobilized single-base oligonucleotides demonstrate that base specific contrast is available with reflected, photo-emitted, and Auger electrons. Image contrast simulations of model rectangular features mimicking the individual nucleotides in a DNA strand have been developed to translate measurements of contrast on bulk DNA to the detectability of

  8. Analysis of T-DNA/Host-Plant DNA Junction Sequences in Single-Copy Transgenic Barley Lines

    Directory of Open Access Journals (Sweden)

    Joanne G. Bartlett

    2014-01-01

    Full Text Available Sequencing across the junction between an integrated transfer DNA (T-DNA and a host plant genome provides two important pieces of information. The junctions themselves provide information regarding the proportion of T-DNA which has integrated into the host plant genome, whilst the transgene flanking sequences can be used to study the local genetic environment of the integrated transgene. In addition, this information is important in the safety assessment of GM crops and essential for GM traceability. In this study, a detailed analysis was carried out on the right-border T-DNA junction sequences of single-copy independent transgenic barley lines. T-DNA truncations at the right-border were found to be relatively common and affected 33.3% of the lines. In addition, 14.3% of lines had rearranged construct sequence after the right border break-point. An in depth analysis of the host-plant flanking sequences revealed that a significant proportion of the T-DNAs integrated into or close to known repetitive elements. However, this integration into repetitive DNA did not have a negative effect on transgene expression.

  9. eDNA Barcoding: Using Next-Generation Sequencing of Environmental DNA for Detection and Identification of Cetacean Species

    Science.gov (United States)

    2015-09-30

    present. The unique amplicon sequences (haplotypes) can be matched back to the reference database or searched against GenBank for species identification...PCR primers designed from a comprehensive reference database of sequences from cetacean species. Figure 2: Deployment of a hydrophone and a...ubiquitous DNA sequencing for surveys of biodiversity more efficient and affordable in the near future. RELATED PROJECTS None to date.

  10. The Paramecium germline genome provides a niche for intragenic parasitic DNA: evolutionary dynamics of internal eliminated sequences.

    Directory of Open Access Journals (Sweden)

    Olivier Arnaiz

    Full Text Available Insertions of parasitic DNA within coding sequences are usually deleterious and are generally counter-selected during evolution. Thanks to nuclear dimorphism, ciliates provide unique models to study the fate of such insertions. Their germline genome undergoes extensive rearrangements during development of a new somatic macronucleus from the germline micronucleus following sexual events. In Paramecium, these rearrangements include precise excision of unique-copy Internal Eliminated Sequences (IES from the somatic DNA, requiring the activity of a domesticated piggyBac transposase, PiggyMac. We have sequenced Paramecium tetraurelia germline DNA, establishing a genome-wide catalogue of -45,000 IESs, in order to gain insight into their evolutionary origin and excision mechanism. We obtained direct evidence that PiggyMac is required for excision of all IESs. Homology with known P. tetraurelia Tc1/mariner transposons, described here, indicates that at least a fraction of IESs derive from these elements. Most IES insertions occurred before a recent whole-genome duplication that preceded diversification of the P. aurelia species complex, but IES invasion of the Paramecium genome appears to be an ongoing process. Once inserted, IESs decay rapidly by accumulation of deletions and point substitutions. Over 90% of the IESs are shorter than 150 bp and present a remarkable size distribution with a -10 bp periodicity, corresponding to the helical repeat of double-stranded DNA and suggesting DNA loop formation during assembly of a transpososome-like excision complex. IESs are equally frequent within and between coding sequences; however, excision is not 100% efficient and there is selective pressure against IES insertions, in particular within highly expressed genes. We discuss the possibility that ancient domestication of a piggyBac transposase favored subsequent propagation of transposons throughout the germline by allowing insertions in coding sequences, a

  11. The Paramecium germline genome provides a niche for intragenic parasitic DNA: evolutionary dynamics of internal eliminated sequences.

    Science.gov (United States)

    Arnaiz, Olivier; Mathy, Nathalie; Baudry, Céline; Malinsky, Sophie; Aury, Jean-Marc; Denby Wilkes, Cyril; Garnier, Olivier; Labadie, Karine; Lauderdale, Benjamin E; Le Mouël, Anne; Marmignon, Antoine; Nowacki, Mariusz; Poulain, Julie; Prajer, Malgorzata; Wincker, Patrick; Meyer, Eric; Duharcourt, Sandra; Duret, Laurent; Bétermier, Mireille; Sperling, Linda

    2012-01-01

    Insertions of parasitic DNA within coding sequences are usually deleterious and are generally counter-selected during evolution. Thanks to nuclear dimorphism, ciliates provide unique models to study the fate of such insertions. Their germline genome undergoes extensive rearrangements during development of a new somatic macronucleus from the germline micronucleus following sexual events. In Paramecium, these rearrangements include precise excision of unique-copy Internal Eliminated Sequences (IES) from the somatic DNA, requiring the activity of a domesticated piggyBac transposase, PiggyMac. We have sequenced Paramecium tetraurelia germline DNA, establishing a genome-wide catalogue of -45,000 IESs, in order to gain insight into their evolutionary origin and excision mechanism. We obtained direct evidence that PiggyMac is required for excision of all IESs. Homology with known P. tetraurelia Tc1/mariner transposons, described here, indicates that at least a fraction of IESs derive from these elements. Most IES insertions occurred before a recent whole-genome duplication that preceded diversification of the P. aurelia species complex, but IES invasion of the Paramecium genome appears to be an ongoing process. Once inserted, IESs decay rapidly by accumulation of deletions and point substitutions. Over 90% of the IESs are shorter than 150 bp and present a remarkable size distribution with a -10 bp periodicity, corresponding to the helical repeat of double-stranded DNA and suggesting DNA loop formation during assembly of a transpososome-like excision complex. IESs are equally frequent within and between coding sequences; however, excision is not 100% efficient and there is selective pressure against IES insertions, in particular within highly expressed genes. We discuss the possibility that ancient domestication of a piggyBac transposase favored subsequent propagation of transposons throughout the germline by allowing insertions in coding sequences, a fraction of the

  12. SeqAnt: A web service to rapidly identify and annotate DNA sequence variations

    Directory of Open Access Journals (Sweden)

    Patel Viren

    2010-09-01

    Full Text Available Abstract Background The enormous throughput and low cost of second-generation sequencing platforms now allow research and clinical geneticists to routinely perform single experiments that identify tens of thousands to millions of variant sites. Existing methods to annotate variant sites using information from publicly available databases via web browsers are too slow to be useful for the large sequencing datasets being routinely generated by geneticists. Because sequence annotation of variant sites is required before functional characterization can proceed, the lack of a high-throughput pipeline to efficiently annotate variant sites can act as a significant bottleneck in genetics research. Results SeqAnt (Sequence Annotator is an open source web service and software package that rapidly annotates DNA sequence variants and identifies recessive or compound heterozygous loci in human, mouse, fly, and worm genome sequencing experiments. Variants are characterized with respect to their functional type, frequency, and evolutionary conservation. Annotated variants can be viewed on a web browser, downloaded in a tab-delimited text file, or directly uploaded in a BED format to the UCSC genome browser. To demonstrate the speed of SeqAnt, we annotated a series of publicly available datasets that ranged in size from 37 to 3,439,107 variant sites. The total time to completely annotate these data completely ranged from 0.17 seconds to 28 minutes 49.8 seconds. Conclusion SeqAnt is an open source web service and software package that overcomes a critical bottleneck facing research and clinical geneticists using second-generation sequencing platforms. SeqAnt will prove especially useful for those investigators who lack dedicated bioinformatics personnel or infrastructure in their laboratories.

  13. High-throughput sequencing of nematode communities from total soil DNA extractions

    DEFF Research Database (Denmark)

    Sapkota, Rumakanta; Nicolaisen, Mogens

    2015-01-01

    nematodes without the need for enrichment was developed. Using this strategy on DNA templates from a set of 22 agricultural soils, we obtained 64.4% sequences of nematode origin in total, whereas the remaining sequences were almost entirely from other metazoans. The nematode sequences were derived from...

  14. Numerical encoding of DNA sequences by chaos game representation with application in similarity comparison.

    Science.gov (United States)

    Hoang, Tung; Yin, Changchuan; Yau, Stephen S-T

    2016-10-01

    Numerical encoding plays an important role in DNA sequence analysis via computational methods, in which numerical values are associated with corresponding symbolic characters. After numerical representation, digital signal processing methods can be exploited to analyze DNA sequences. To reflect the biological properties of the original sequence, it is vital that the representation is one-to-one. Chaos Game Representation (CGR) is an iterative mapping technique that assigns each nucleotide in a DNA sequence to a respective position on the plane that allows the depiction of the DNA sequence in the form of image. Using CGR, a biological sequence can be transformed one-to-one to a numerical sequence that preserves the main features of the original sequence. In this research, we propose to encode DNA sequences by considering 2D CGR coordinates as complex numbers, and apply digital signal processing methods to analyze their evolutionary relationship. Computational experiments indicate that this approach gives comparable results to the state-of-the-art multiple sequence alignment method, Clustal Omega, and is significantly faster. The MATLAB code for our method can be accessed from: www.mathworks.com/matlabcentral/fileexchange/57152.

  15. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

    Science.gov (United States)

    Schloss, Patrick D; Jenior, Matthew L; Koumpouras, Charles C; Westcott, Sarah L; Highlander, Sarah K

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  16. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

    Directory of Open Access Journals (Sweden)

    Patrick D. Schloss

    2016-03-01

    Full Text Available Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina’s MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3–V5, V1–V3, V1–V5, V1–V6, and V1–V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1–V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina’s MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  17. Transgenerational inheritance: Models and mechanisms of non-DNA sequence-based inheritance.

    Science.gov (United States)

    Miska, Eric A; Ferguson-Smith, Anne C

    2016-10-07

    Heritability has traditionally been thought to be a characteristic feature of the genetic material of an organism-notably, its DNA. However, it is now clear that inheritance not based on DNA sequence exists in multiple organisms, with examples found in microbes, plants, and invertebrate and vertebrate animals. In mammals, the molecular mechanisms have been challenging to elucidate, in part due to difficulties in designing robust models and approaches. Here we review some of the evidence, concepts, and potential mechanisms of non-DNA sequence-based transgenerational inheritance. We highlight model systems and discuss whether phenotypes are replicated or reconstructed over successive generations, as well as whether mechanisms operate at transcriptional and/or posttranscriptional levels. Finally, we explore the short- and long-term implications of non-DNA sequence-based inheritance. Understanding the effects of non-DNA sequence-based mechanisms is key to a full appreciation of heritability in health and disease.

  18. CMSA: a heterogeneous CPU/GPU computing system for multiple similar RNA/DNA sequence alignment.

    Science.gov (United States)

    Chen, Xi; Wang, Chen; Tang, Shanjiang; Yu, Ce; Zou, Quan

    2017-06-24

    The multiple sequence alignment (MSA) is a classic and powerful technique for sequence analysis in bioinformatics. With the rapid growth of biological datasets, MSA parallelization becomes necessary to keep its running time in an acceptable level. Although there are a lot of work on MSA problems, their approaches are either insufficient or contain some implicit assumptions that limit the generality of usage. First, the information of users' sequences, including the sizes of datasets and the lengths of sequences, can be of arbitrary values and are generally unknown before submitted, which are unfortunately ignored by previous work. Second, the center star strategy is suited for aligning similar sequences. But its first stage, center sequence selection, is highly time-consuming and requires further optimization. Moreover, given the heterogeneous CPU/GPU platform, prior studies consider the MSA parallelization on GPU devices only, making the CPUs idle during the computation. Co-run computation, however, can maximize the utilization of the computing resources by enabling the workload computation on both CPU and GPU simultaneously. This paper presents CMSA, a robust and efficient MSA system for large-scale datasets on the heterogeneous CPU/GPU platform. It performs and optimizes multiple sequence alignment automatically for users' submitted sequences without any assumptions. CMSA adopts the co-run computation model so that both CPU and GPU devices are fully utilized. Moreover, CMSA proposes an improved center star strategy that reduces the time complexity of its center sequence selection process from O(mn (2)) to O(mn). The experimental results show that CMSA achieves an up to 11× speedup and outperforms the state-of-the-art software. CMSA focuses on the multiple similar RNA/DNA sequence alignment and proposes a novel bitmap based algorithm to improve the center star strategy. We can conclude that harvesting the high performance of modern GPU is a promising approach to

  19. HMGA1a recognition candidate DNA sequences in humans.

    Directory of Open Access Journals (Sweden)

    Takayuki Manabe

    Full Text Available High mobility group protein A1a (HMGA1a acts as an architectural transcription factor and influences a diverse array of normal biological processes. It binds AT-rich sequences, and previous reports have demonstrated HMGA1a binding to the authentic promoters of various genes. However, the precise sequences that HMGA1a binds to remain to be clarified. Therefore, in this study, we searched for the sequences with the highest affinity for human HMGA1a using an existing SELEX method, and then compared the identified sequences with known human promoter sequences. Based on our results, we propose the sequences "-(G/A-G-(A/T-(A/T-A-T-T-T-" as HMGA1a-binding candidate sequences. Furthermore, these candidate sequences bound native human HMGA1a from SK-N-SH cells. When candidate sequences were analyzed by performing FASTAs against all known human promoter sequences, 500-900 sequences were hit by each one. Some of the extracted genes have already been proven or suggested as HMGA1a-binding promoters. The candidate sequences presented here represent important information for research into the various roles of HMGA1a, including cell differentiation, death, growth, proliferation, and the pathogenesis of cancer.

  20. The DNA sequence, annotation and analysis of human chromosome 3

    DEFF Research Database (Denmark)

    Muzny, Donna M; Scherer, Steven E; Kaul, Rajinder

    2006-01-01

    After the completion of a draft human genome sequence, the International Human Genome Sequencing Consortium has proceeded to finish and annotate each of the 24 chromosomes comprising the human genome. Here we describe the sequencing and analysis of human chromosome 3, one of the largest human chr...

  1. Discovering approximate-associated sequence patterns for protein-DNA interactions

    KAUST Repository

    Chan, Tak Ming

    2010-12-30

    Motivation: The bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) are fundamental protein-DNA interactions in transcriptional regulation. Extensive efforts have been made to better understand the protein-DNA interactions. Recent mining on exact TF-TFBS-associated sequence patterns (rules) has shown great potentials and achieved very promising results. However, exact rules cannot handle variations in real data, resulting in limited informative rules. In this article, we generalize the exact rules to approximate ones for both TFs and TFBSs, which are essential for biological variations. Results: A progressive approach is proposed to address the approximation to alleviate the computational requirements. Firstly, similar TFBSs are grouped from the available TF-TFBS data (TRANSFAC database). Secondly, approximate and highly conserved binding cores are discovered from TF sequences corresponding to each TFBS group. A customized algorithm is developed for the specific objective. We discover the approximate TF-TFBS rules by associating the grouped TFBS consensuses and TF cores. The rules discovered are evaluated by matching (verifying with) the actual protein-DNA binding pairs from Protein Data Bank (PDB) 3D structures. The approximate results exhibit many more verified rules and up to 300% better verification ratios than the exact ones. The customized algorithm achieves over 73% better verification ratios than traditional methods. Approximate rules (64-79%) are shown statistically significant. Detailed variation analysis and conservation verification on NCBI records demonstrate that the approximate rules reveal both the flexible and specific protein-DNA interactions accurately. The approximate TF-TFBS rules discovered show great generalized capability of exploring more informative binding rules. © The Author 2010. Published by Oxford University Press. All rights reserved.

  2. Structural analysis of DNA sequence: evidence for lateral gene transfer in Thermotoga maritima

    DEFF Research Database (Denmark)

    Worning, Peder; Jensen, Lars Juhl; Nelson, K. E.

    2000-01-01

    The recently published complete DNA sequence of the bacterium Thermotoga maritima provides evidence, based on protein sequence conservation, for lateral gene transfer between Archaea and Bacteria. We introduce a new method of periodicity analysis of DNA sequences, based on structural parameters......, which brings independent evidence for the lateral gene transfer in the genome of T.maritima, The structural analysis relates the Archaea-like DNA sequences to the genome of Pyrococcus horikoshii. Analysis of 24 complete genomic DNA sequences shows different periodicity patterns for organisms...... of different origin, The typical genomic periodicity for Bacteria is 11 bp whilst it is 10 bp for Archaea, Eukaryotes have more complex spectra but the dominant period in the yeast Saccharomyces cerevisiae is 10.2 bp. These periodicities are most likely reflective of differences in chromatin structure....

  3. Sequence-specific activation of the DNA sensor cGAS by Y-form DNA structures as found in primary HIV-1 cDNA.

    Science.gov (United States)

    Herzner, Anna-Maria; Hagmann, Cristina Amparo; Goldeck, Marion; Wolter, Steven; Kübler, Kirsten; Wittmann, Sabine; Gramberg, Thomas; Andreeva, Liudmila; Hopfner, Karl-Peter; Mertens, Christina; Zillinger, Thomas; Jin, Tengchuan; Xiao, Tsan Sam; Bartok, Eva; Coch, Christoph; Ackermann, Damian; Hornung, Veit; Ludwig, Janos; Barchet, Winfried; Hartmann, Gunther; Schlee, Martin

    2015-10-01

    Cytosolic DNA that emerges during infection with a retrovirus or DNA virus triggers antiviral type I interferon responses. So far, only double-stranded DNA (dsDNA) over 40 base pairs (bp) in length has been considered immunostimulatory. Here we found that unpaired DNA nucleotides flanking short base-paired DNA stretches, as in stem-loop structures of single-stranded DNA (ssDNA) derived from human immunodeficiency virus type 1 (HIV-1), activated the type I interferon-inducing DNA sensor cGAS in a sequence-dependent manner. DNA structures containing unpaired guanosines flanking short (12- to 20-bp) dsDNA (Y-form DNA) were highly stimulatory and specifically enhanced the enzymatic activity of cGAS. Furthermore, we found that primary HIV-1 reverse transcripts represented the predominant viral cytosolic DNA species during early infection of macrophages and that these ssDNAs were highly immunostimulatory. Collectively, our study identifies unpaired guanosines in Y-form DNA as a highly active, minimal cGAS recognition motif that enables detection of HIV-1 ssDNA.

  4. Validated primer set that prevents nuclear DNA sequences of mitochondrial origin co-amplification: a revision based on the New Human Genome Reference Sequence (GRCh37).

    Science.gov (United States)

    Ramos, Amanda; Santos, Cristina; Barbena, Elena; Mateiu, Ligia; Alvarez, Luis; Nogués, Ramon; Aluja, Maria Pilar

    2011-03-01

    A new human genome reference sequence--GRCh37--was recently generated and made available by the Genome Reference Consortium. Since the prior disposable human reference sequence--hg18--was previously used for the mitochondrial DNA primer BLAST validation, a revision of those previously published primer pairs is required. Thus, the aim of this Short Communication is to perform an in silico BLAST test of the published disposable nine primer pairs using the new human reference sequence and to report the pertinent modifications. The new analysis showed that one of the tested primer pairs requires a revision. Therefore, a new validated primer pair, which specifically amplifies the mitochondrial region located between positions 6520 and 9184, is presented.

  5. Multiplexed metagenome mining using short DNA sequence tags facilitates targeted discovery of epoxyketone proteasome inhibitors.

    Science.gov (United States)

    Owen, Jeremy G; Charlop-Powers, Zachary; Smith, Alexandra G; Ternei, Melinda A; Calle, Paula Y; Reddy, Boojala Vijay B; Montiel, Daniel; Brady, Sean F

    2015-04-07

    In molecular evolutionary analyses, short DNA sequences are used to infer phylogenetic relationships among species. Here we apply this principle to the study of bacterial biosynthesis, enabling the targeted isolation of previously unidentified natural products directly from complex metagenomes. Our approach uses short natural product sequence tags derived from conserved biosynthetic motifs to profile biosynthetic diversity in the environment and then guide the recovery of gene clusters from metagenomic libraries. The methodology is conceptually simple, requires only a small investment in sequencing, and is not computationally demanding. To demonstrate the power of this approach to natural product discovery we conducted a computational search for epoxyketone proteasome inhibitors within 185 globally distributed soil metagenomes. This led to the identification of 99 unique epoxyketone sequence tags, falling into 6 phylogenetically distinct clades. Complete gene clusters associated with nine unique tags were recovered from four saturating soil metagenomic libraries. Using heterologous expression methodologies, seven potent epoxyketone proteasome inhibitors (clarepoxcins A-E and landepoxcins A and B) were produced from these pathways, including compounds with different warhead structures and a naturally occurring halohydrin prodrug. This study provides a template for the targeted expansion of bacterially derived natural products using the global metagenome.

  6. Detection of short repeated genomic sequences on metaphase chromosomes using padlock probes and target primed rolling circle DNA synthesis

    Directory of Open Access Journals (Sweden)

    Stougaard Magnus

    2007-11-01

    Full Text Available Abstract Background In situ detection of short sequence elements in genomic DNA requires short probes with high molecular resolution and powerful specific signal amplification. Padlock probes can differentiate single base variations. Ligated padlock probes can be amplified in situ by rolling circle DNA synthesis and detected by fluorescence microscopy, thus enhancing PRINS type reactions, where localized DNA synthesis reports on the position of hybridization targets, to potentially reveal the binding of single oligonucleotide-size probe molecules. Such a system has been presented for the detection of mitochondrial DNA in fixed cells, whereas attempts to apply rolling circle detection to metaphase chromosomes have previously failed, according to the literature. Methods Synchronized cultured cells were fixed with methanol/acetic acid to prepare chromosome spreads in teflon-coated diagnostic well-slides. Apart from the slide format and the chromosome spreading everything was done essentially according to standard protocols. Hybridization targets were detected in situ with padlock probes, which were ligated and amplified using target primed rolling circle DNA synthesis, and detected by fluorescence labeling. Results An optimized protocol for the spreading of condensed metaphase chromosomes in teflon-coated diagnostic well-slides was developed. Applying this protocol we generated specimens for target primed rolling circle DNA synthesis of padlock probes recognizing a 40 nucleotide sequence in the male specific repetitive satellite I sequence (DYZ1 on the Y-chromosome and a 32 nucleotide sequence in the repetitive kringle IV domain in the apolipoprotein(a gene positioned on the long arm of chromosome 6. These targets were detected with good efficiency, but the efficiency on other target sites was unsatisfactory. Conclusion Our aim was to test the applicability of the method used on mitochondrial DNA to the analysis of nuclear genomes, in particular as

  7. Analysis of human mitochondrial DNA sequences from fecally polluted environmental waters as a tool to study population diversity

    Directory of Open Access Journals (Sweden)

    Vikram Kapoor

    2017-05-01

    Full Text Available Mitochondrial signature sequences have frequently been used to study human population diversity around the world. Traditionally, this requires obtaining samples directly from individuals which is cumbersome, time consuming and limited to the number of individuals that participated in these types of surveys. Here, we used environmental DNA extracts to determine the presence and sequence variability of human mitochondrial sequences as a means to study the diversity of populations inhabiting in areas nearby a tropical watershed impacted with human fecal pollution. We used high-throughput sequencing (Illumina and barcoding to obtain thousands of sequences from the mitochondrial hypervariable region 2 (HVR2 and determined the different haplotypes present in 10 different water samples. Sequence analyses indicated a total of 19 distinct variants with frequency greater than 5%. The HVR2 sequences were associated with haplogroups of West Eurasian (57.6%, Sub-Saharan African (23.9%, and American Indian (11% ancestry. This was in relative accordance with population census data from the watershed sites. The results from this study demonstrates the potential value of mitochondrial sequence data retrieved from fecally impacted environmental waters to study the population diversity of local municipalities. This environmental DNA approach may also have other public health implications such as tracking background levels of human mitochondrial genes associated with diseases. It may be possible to expand this approach to other animal species inhabiting or using natural water systems.

  8. Repetitive sequences in plant nuclear DNA: types, distribution, evolution and function.

    Science.gov (United States)

    Mehrotra, Shweta; Goyal, Vinod

    2014-08-01

    Repetitive DNA sequences are a major component of eukaryotic genomes and may account for up to 90% of the genome size. They can be divided into minisatellite, microsatellite and satellite sequences. Satellite DNA sequences are considered to be a fast-evolving component of eukaryotic genomes, comprising tandemly-arrayed, highly-repetitive and highly-conserved monomer sequences. The monomer unit of satellite DNA is 150-400 base pairs (bp) in length. Repetitive sequences may be species- or genus-specific, and may be centromeric or subtelomeric in nature. They exhibit cohesive and concerted evolution caused by molecular drive, leading to high sequence homogeneity. Repetitive sequences accumulate variations in sequence and copy number during evolution, hence they are important tools for taxonomic and phylogenetic studies, and are known as "tuning knobs" in the evolution. Therefore, knowledge of repetitive sequences assists our understanding of the organization, evolution and behavior of eukaryotic genomes. Repetitive sequences have cytoplasmic, cellular and developmental effects and play a role in chromosomal recombination. In the post-genomics era, with the introduction of next-generation sequencing technology, it is possible to evaluate complex genomes for analyzing repetitive sequences and deciphering the yet unknown functional potential of repetitive sequences. Copyright © 2014 The Authors. Production and hosting by Elsevier Ltd.. All rights reserved.

  9. The Study of Correlation Structures of DNA Sequences A Critical Review

    CERN Document Server

    Li, W

    1997-01-01

    The study of correlation structure in the primary sequences of DNA is reviewed. The issues reviewed include: symmetries among 16 base-base correlation functions, accurate estimation of correlation measures, the relationship between $1/f$ and Lorentzian spectra, heterogeneity in DNA sequences, different modeling strategies of the correlation structure of DNA sequences, the difference of correlation structure between coding and non-coding regions (besides the period-3 pattern), and source of broad distribution of domain sizes. Although some of the results remain controversial, a body of work on this topic constitutes a good starting point for future studies.

  10. Sequence-specific Hydrolysis of Single-stranded DNA by PNA-Cerium (Ⅳ) Adduct

    Institute of Scientific and Technical Information of China (English)

    He Bai SHEN; Feng WANG; Yong Tao YANG

    2005-01-01

    A novel artificial site specific cleavage reagent, with peptide nucleic acid (PNA) as sequence-recognizing moiety and cerium (Ⅳ) ions as "scissors" for cleaving target DNA, was synthesized. Subsequently, it was employed in the cleavage of target 26-mer single-stranded DNA (ssDNA), which has 10-mer sequence complementary with PNA recognizer in the hybrids,under physiological conditions. Reversed-phase high-performance liquid chromatogram (RPHPLC) experiments indicated that the artificial site specific cleavage reagent could cleave the target DNA specifically.

  11. Hyperrecombination at a specific DNA sequence in pneumococcal transformation.

    Science.gov (United States)

    Lefèvre, J C; Gasc, A M; Burger, A C; Mostachfi, P; Sicard, A M

    1984-08-01

    In pneumococcal transformation, recombination frequency between point mutations is usually proportional to physical distances. We have identified an aberrant marker belonging to the amiA locus that appeared to markedly enhance recombination frequency when crossed with any other markers of this gene. This mutation results from the C-to-A transversion in the sequence A-T-T-C-A-T----A-T-T-A-A-T. This effect is especially apparent for short distances as small as 27 base pairs. The hyperrecombination does not require the wild-type function of the pneumococcal gene for an ATP-dependent DNase (which is homologous to the product of the Escherichia coli recBC genes) or of the hex genes, which correct certain mismatched bases in transformation. The hyperrecombination is affected by the presence of nearby mismatched bases that trigger an excision-repair system. It is proposed that the mutation that shows hyperrecombination is sometimes converted to the wild-type allele at the heteroduplex stage of transformation.

  12. Ecological niche modelling and nDNA sequencing support a new, morphologically cryptic beetle species unveiled by DNA barcoding.

    Directory of Open Access Journals (Sweden)

    Oliver Hawlitschek

    Full Text Available BACKGROUND: DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. METHODOLOGY/PRINCIPAL FINDINGS: The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct, but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp.n. CONCLUSION/SIGNIFICANCE: In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species.

  13. High-speed automated DNA sequencing utilizing from-the-side laser excitation

    Science.gov (United States)

    Westphall, Michael S.; Brumley, Robert L., Jr.; Buxton, Erin C.; Smith, Lloyd M.

    1995-04-01

    The Human Genome Initiative is an ambitious international effort to map and sequence the three billion bases of DNA encoded in the human genome. If successfully completed, the resultant sequence database will be a tool of unparalleled power for biomedical research. One of the major challenges of this project is in the area of DNA sequencing technology. At this time, virtually all DNA sequencing is based upon the separation of DNA fragments in high resolution polyacrylamide gels. This method, as generally practiced, is one to two orders of magnitude too slow and expensive for the successful completion of the Human Genome projection. One reasonable approach is improved sequencing of DNA fragments is to increase the performance of such gel-based sequencing methods. Decreased sequencing times may be obtained by increasing the magnitude of the electric field employed. This is not possible with conventional sequencing, due to the fact that the additional heat associated with the increased electric field cannot be adequately dissipated. Recent developments in the use of thin gels have addressed this problem. Performing electrophoresis in ultrathin (50 to 100 microns) gels greatly increases the heat transfer efficiency, thus allowing the benefits of larger electric fields to be obtained. An increase in separation speed of about an order of magnitude is readily achieved. Thin gels have successfully been used in capillary and slab formats. A detection system has been designed for use with a multiple fluorophore sequencing strategy in horizontal ultrathin slab gels. The system employs laser through-the-side excitation and a cooled CCD detector; this allows for the parallel detection of up to 24 sets of four fluorescently labeled DNA sequencing reactions during their electrophoretic separation in ultrathin (115 micrometers ) denaturing polyacrylamide gels. Four hundred bases of sequence information is obtained from 100 ng of M13 template DNA in an hour, corresponding to an

  14. Sequencing for complete rDNA sequences (18S, ITS1, 5.8S, ITS2, and 28S rDNA) of Demodex and phylogenetic analysis of Acari based on 18S and 28S rDNA.

    Science.gov (United States)

    Zhao, Ya-E; Wu, Li-Ping; Hu, Li; Xu, Yang; Wang, Zheng-Hang; Liu, Wen-Yan

    2012-11-01

    Due to the difficulty of DNA extraction for Demodex, few studies dealt with the identification and the phyletic evolution of Demodex at molecular level. In this study, we amplified, sequenced, and analyzed a complete (Demodex folliculorum) and an almost complete (D12 missing) (Demodex brevis) ribosomal DNA (rDNA) sequence and also analyzed the primary sequences of divergent domains in small-subunit ribosomal RNA (rRNA) of 51 species and in large-subunit rRNA of 43 species from four superfamilies in Acari (Cheyletoidea, Tetranychoidea, Analgoidea, and Ixodoidea). The results revealed that 18S rDNA sequence was relatively conserved in rDNA-coding regions and was not evolving as rapidly as 28S rDNA sequence. The evolutionary rates of transcribed spacer regions were much higher than those of the coding regions. The maximum parsimony trees of 18S and 28S rDNA appeared to be almost identical, consistent with their morphological classification. Based on the fact that the resolution capability of sequence length and the divergence of the 13 segments (D1-D6, D7a, D7b, and D8-D12) of 28S rDNA were stronger than that of the nine variable regions (V1-V9) of 18S rDNA, we were able to identify Demodex (Cheyletoidea) by the indels occurring in D2, D6, and D8.

  15. Bisulfite sequencing reveals that Aspergillus flavus holds a hollow in DNA methylation.

    Directory of Open Access Journals (Sweden)

    Si-Yang Liu

    Full Text Available Aspergillus flavus first gained scientific attention for its production of aflatoxin. The underlying regulation of aflatoxin biosynthesis has been serving as a theoretical model for biosynthesis of other microbial secondary metabolites. Nevertheless, for several decades, the DNA methylation status, one of the important epigenomic modifications involved in gene regulation, in A. flavus remains to be controversial. Here, we applied bisulfite sequencing in conjunction with a biological replicate strategy to investigate the DNA methylation profiling of A. flavus genome. Both the bisulfite sequencing data and the methylome comparisons with other fungi confirm that the DNA methylation level of this fungus is negligible. Further investigation into the DNA methyltransferase of Aspergillus uncovers its close relationship with RID-like enzymes as well as its divergence with the methyltransferase of species with validated DNA methylation. The lack of repeat contents of the A. flavus' genome and the high RIP-index of the small amount of remanent repeat potentially support our speculation that DNA methylation may be absent in A. flavus or that it may possess de novo DNA methylation which occurs very transiently during the obscure sexual stage of this fungal species. This work contributes to our understanding on the DNA methylation status of A. flavus, as well as reinforces our views on the DNA methylation in fungal species. In addition, our strategy of applying bisulfite sequencing to DNA methylation detection in species with low DNA methylation may serve as a reference for later scientific investigations in other hypomethylated species.

  16. Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood

    Science.gov (United States)

    Fan, H. Christina; Blumenfeld, Yair J.; Chitkara, Usha; Hudgins, Louanne; Quake, Stephen R.

    2008-01-01

    We directly sequenced cell-free DNA with high-throughput shotgun sequencing technology from plasma of pregnant women, obtaining, on average, 5 million sequence tags per patient sample. This enabled us to measure the over- and underrepresentation of chromosomes from an aneuploid fetus. The sequencing approach is polymorphism-independent and therefore universally applicable for the noninvasive detection of fetal aneuploidy. Using this method, we successfully identified all nine cases of trisomy 21 (Down syndrome), two cases of trisomy 18 (Edward syndrome), and one case of trisomy 13 (Patau syndrome) in a cohort of 18 normal and aneuploid pregnancies; trisomy was detected at gestational ages as early as the 14th week. Direct sequencing also allowed us to study the characteristics of cell-free plasma DNA, and we found evidence that this DNA is enriched for sequences from nucleosomes. PMID:18838674

  17. Capture and Amplification by Tailing and Switching (CATS). An ultrasensitive ligation-independent method for generation of DNA libraries for deep sequencing from picogram amounts of DNA and RNA.

    Science.gov (United States)

    Turchinovich, Andrey; Surowy, Harald; Serva, Andrius; Zapatka, Marc; Lichter, Peter; Burwinkel, Barbara

    2014-01-01

    Massive parallel sequencing (MPS) technologies have paved the way into new areas of research including individualized medicine. However, sequencing of trace amounts of DNA or RNA still remains a major challenge, especially for degraded nucleic acids like circulating DNA. This together with high cost and time requirements impedes many important applications of MPS in medicine and fundamental science. We have established a fast, cheap and highly efficient protocol called 'Capture and Amplification by Tailing and Switching' (CATS) to directly generate ready-to-sequence libraries for MPS from nanogram and picogram quantities of both DNA and RNA. Furthermore, those DNA libraries are strand-specific, can be prepared within 2-3 h and do not require preliminary sample amplification steps. To exemplify the capacity of the technique, we have generated and sequenced DNA libraries from hundred-picogram amounts of circulating nucleic acids isolated from human blood plasma, one nanogram of mRNA-enriched total RNA from cultured cells and few nanograms of bisulfite-converted DNA. The approach for DNA library preparation from minimal and fragmented input described here will find broad application in diverse research areas such as translational medicine including therapy monitoring, prediction, prognosis and early detection of various human disorders and will permit high-throughput DNA sequencing from previously inaccessible material such as minute forensic and archeological samples.

  18. A conserved sequence extending motif III of the motor domain in the Snf2-family DNA translocase Rad54 is critical for ATPase activity.

    Directory of Open Access Journals (Sweden)

    Xiao-Ping Zhang

    Full Text Available Rad54 is a dsDNA-dependent ATPase that translocates on duplex DNA. Its ATPase function is essential for homologous recombination, a pathway critical for meiotic chromosome segregation, repair of complex DNA damage, and recovery of stalled or broken replication forks. In recombination, Rad54 cooperates with Rad51 protein and is required to dissociate Rad51 from heteroduplex DNA to allow access by DNA polymerases for recombination-associated DNA synthesis. Sequence analysis revealed that Rad54 contains a perfect match to the consensus PIP box sequence, a widely spread PCNA interaction motif. Indeed, Rad54 interacts directly with PCNA, but this interaction is not mediated by the Rad54 PIP box-like sequence. This sequence is located as an extension of motif III of the Rad54 motor domain and is essential for full Rad54 ATPase activity. Mutations in this motif render Rad54 non-functional in vivo and severely compromise its activities in vitro. Further analysis demonstrated that such mutations affect dsDNA binding, consistent with the location of this sequence motif on the surface of the cleft formed by two RecA-like domains, which likely forms the dsDNA binding site of Rad54. Our study identified a novel sequence motif critical for Rad54 function and showed that even perfect matches to the PIP box consensus may not necessarily identify PCNA interaction sites.

  19. Z-DNA-forming sequences generate large-scale deletions in mammalian cells.

    Science.gov (United States)

    Wang, Guliang; Christensen, Laura A; Vasquez, Karen M

    2006-02-21

    Spontaneous chromosomal breakages frequently occur at genomic hot spots in the absence of DNA damage and can result in translocation-related human disease. Chromosomal breakpoints are often mapped near purine-pyrimidine Z-DNA-forming sequences in human tumors. However, it is not known whether Z-DNA plays a role in the generation of these chromosomal breakages. Here, we show that Z-DNA-forming sequences induce high levels of genetic instability in both bacterial and mammalian cells. In mammalian cells, the Z-DNA-forming sequences induce double-strand breaks nearby, resulting in large-scale deletions in 95% of the mutants. These Z-DNA-induced double-strand breaks in mammalian cells are not confined to a specific sequence but rather are dispersed over a 400-bp region, consistent with chromosomal breakpoints in human diseases. This observation is in contrast to the mutations generated in Escherichia coli that are predominantly small deletions within the repeats. We found that the frequency of small deletions is increased by replication in mammalian cell extracts. Surprisingly, the large-scale deletions generated in mammalian cells are, at least in part, replication-independent and are likely initiated by repair processing cleavages surrounding the Z-DNA-forming sequence. These results reveal that mammalian cells process Z-DNA-forming sequences in a strikingly different fashion from that used by bacteria. Our data suggest that Z-DNA-forming sequences may be causative factors for gene translocations found in leukemias and lymphomas and that certain cellular conditions such as active transcription may increase the risk of Z-DNA-related genetic instability.

  20. Recombinant human MDM2 oncoprotein shows sequence composition selectivity for binding to both RNA and DNA.

    Science.gov (United States)

    Challen, Christine; Anderson, John J; Chrzanowska-Lightowlers, Zofia M A; Lightowlers, Robert N; Lunec, John

    2012-03-01

    MDM2 is a 90 kDa nucleo-phosphoprotein that binds p53 and other proteins contributing to its oncogenic properties. Its structure includes an amino proximal p53 binding site, a central acidic domain and a carboxy region which incorporates Zinc and Ring Finger domains suggestive of nucleic acid binding or transcription factor function. It has previously been reported that a bacculovirus expressed MDM2 protein binds RNA in a sequence-specific manner through the Ring Finger domain, however, its ability to bind DNA has yet to be examined. We report here that a bacterially expressed human MDM2 protein binds both DNA as well as the previously defined RNA consensus sequence. DNA binding appears selective and involves the carboxy-terminal domain of the molecule. RNA binding is inhibited by an MDM2 specific antibody, which recognises an epitope within the carboxy region of the protein. Selection cloning and sequence analysis of MDM2 DNA binding sequences, unlike RNA binding sequences, revealed no obvious DNA binding consensus sequence, but preferential binding to oligopurine:pyrimidine-rich stretches. Our results suggest that the observed preferential DNA binding may occur through the Zinc Finger or in a charge-charge interaction through the Ring Finger, thereby implying potentially different mechanisms for DNA and RNA MDM2 binding.

  1. A human cellular sequence implicated in trk oncogene activation is DNA damage inducible

    Energy Technology Data Exchange (ETDEWEB)

    Ben-Ishai, R.; Scharf, R.; Sharon, R.; Kapten, I. (Technion-Israel Institute of Technology, Haifa (Israel))

    1990-08-01

    Xeroderma pigmentosum cells, which are deficient in the repair of UV light-induced DNA damage, have been used to clone DNA-damage-inducible transcripts in human cells. The cDNA clone designated pC-5 hybridizes on RNA gel blots to a 1-kilobase transcript, which is moderately abundant in nontreated cells and whose synthesis is enhanced in human cells following UV irradiation or treatment with several other DNA-damaging agents. UV-enhanced transcription of C-5 RNA is transient and occurs at lower fluences and to a greater extent in DNA-repair-deficient than in DNA-repair-proficient cells. Southern blot analysis indicates that the C-5 gene belongs to a multigene family. A cDNA clone containing the complete coding sequence of C-5 was isolated. Sequence analysis revealed that it is homologous to a human cellular sequence encoding the amino-terminal activating sequence of the trk-2h chimeric oncogene. The presence of DNA-damage-responsive sequences at the 5' end of a chimeric oncogene could result in enhanced expression of the oncogene in response to carcinogens.

  2. Multilocus sequence analysis supports the taxonomic position of Astragalus glycyphyllos symbionts based on DNA-DNA hybridization.

    Science.gov (United States)

    Gnat, Sebastian; Małek, Wanda; Oleńska, Ewa; Wdowiak-Wróbel, Sylwia; Kalita, Michał; Rogalski, Jerzy; Wójcik, Magdalena

    2016-04-01

    In this study, the phylogenetic relationship and taxonomic status of six strains, representing different phenons and genomic groups of Astragalus glycyphyllos symbionts, originating from Poland, were established by comparative analysis of five concatenated housekeeping gene sequences (atpD, dnaK, glnA, recA and rpoB), DNA-DNA hybridization and total DNA G+C content. Maximum-likelihood phylogenetic analysis of combined atpD, dnaK, glnA, recA and rpoB sequence data placed the studied bacteria into the clade comprising the genus Mesorhizobium. In the core gene phylograms, four A. glycyphyllos nodule isolates (AG1, AG7, AG15 and AG27) formed a cluster common with Mesorhizobium ciceri, whereas the two other A. glycyphyllos symbionts (AG17 and AG22) were grouped together with Mesorhizobium amorphae and M. septentrionale. The species position of the studied bacteria was clarified by DNA-DNA hybridization. The DNA-DNA relatedness between isolates AG1, AG7, AG15 and AG27 and reference strain M. ciceri USDA 3383T was 76.4-84.2%, and all these A. glycyphyllos nodulators were defined as members of the genomospecies M. ciceri. DNA-DNA relatedness for isolates AG17 and AG22 and the reference strain M. amorphae ICMP 15022T was 77.5 and 80.1%, respectively. We propose that the nodule isolates AG17 and AG22 belong to the genomic species M. amorphae. Additionally, it was found that the total DNA G+C content of the six test A. glycyphyllos symbionts was 59.4-62.1 mol%, within the range for species of the genus Mesorhizobium.

  3. Fast mitochondrial DNA isolation from mammalian cells for next-generation sequencing.

    Science.gov (United States)

    Quispe-Tintaya, Wilber; White, Ryan R; Popov, Vasily N; Vijg, Jan; Maslov, Alexander Y

    2013-09-01

    Standard methods for mitochondrial DNA (mtDNA) extraction do not provide the level of enrichment for mtDNA sufficient for direct sequencing and must be followed by long-range-PCR amplification, which can bias the sequencing results. Here, we describe a fast, cost-effective, and reliable method for preparation of mtDNA enriched samples from eukaryotic cells ready for direct sequencing. Our protocol utilizes a conventional miniprep kit, paramagnetic bead-based purification, and an optional, limited PCR amplification of mtDNA. The first two steps alone provide more than 2000-fold enrichment for mtDNA when compared with total cellular DNA (~200-fold in comparison with current commercially available kits) as demonstrated by real-time PCR. The percentage of sequencing reads aligned to mtDNA was about 22% for non-amplified samples and greater than 99% for samples subjected to 10 cycles of long-range-PCR with mtDNA specific primers.

  4. Mitochondrial DNA sequence and phylogenetic evaluation of geographically disparate Sus scrofa breeds.

    Science.gov (United States)

    Cannon, M V; Brandebourg, T D; Kohn, M C; Ðikić, D; Irwin, M H; Pinkert, C A

    2015-01-01

    Next generation sequencing of mitochondrial DNA (mtDNA) facilitates studies into the metabolic characteristics of production animals and their relation to production traits. Sequence analysis of mtDNA from pure-bred swine with highly disparate production characteristics (Mangalica Blonde, Mangalica Swallow-bellied, Meishan, Turopolje, and Yorkshire) was initiated to evaluate the influence of mtDNA polymorphisms on mitochondrial function. Herein, we report the complete mtDNA sequences of five Sus scrofa breeds and evaluate their position within the phylogeny of domestic swine. Phenotypic traits of Yorkshire, Mangalica Blonde, and Swallow-belly swine are presented to demonstrate their metabolic characteristics. Our data support the division of European and Asian breeds noted previously and confirm European ancestry of Mangalica and Turopolje breeds. Furthermore, mtDNA differences between breeds suggest function-altering changes in proteins involved in oxidative phosphorylation such as ATP synthase 6 (MT-ATP6), cytochrome oxidase I (MT-CO1), cytochrome oxidase III (MT-CO3), and cytochrome b (MT-CYB), supporting the hypothesis that mtDNA polymorphisms contribute to differences in metabolic traits between swine breeds. Our sequence data form the basis for future research into the roles of mtDNA in determining production traits in domestic animals. Additionally, such studies should provide insight into how mtDNA haplotype influences the extreme adiposity observed in Mangalica breeds.

  5. Application of Quaternion in improving the quality of global sequence alignment scores for an ambiguous sequence target in Streptococcus pneumoniae DNA

    Science.gov (United States)

    Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.

    2017-07-01

    DNA sequence can be defined as a succession of letters, representing the order of nucleotides within DNA, using a permutation of four DNA base codes including adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of the sequences is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and currently become highly developed, advanced and highly throughput sequencing technologies. So far, DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing could produce any ambiguous and not clear enough sequencing results that make them quite difficult to be determined whether these codes are A, T, G, or C. To solve these problems, in this study we can introduce other representation of DNA codes namely Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probability of A, T, G, C bases that could appear in Q and PA + PT + PG + PC = 1. Furthermore, using Quaternion representations we are able to construct the improved scoring matrix for global sequence alignment processes, by applying a dot product method. Moreover, this scoring matrix produces better and higher quality of the match and mismatch score between two DNA base codes. In implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave, to analyze our target sequence which contains some ambiguous sequence data. The subject sequences are the DNA sequences of Streptococcus pneumoniae families obtained from the Genebank, meanwhile the target DNA sequence are received from our collaborator database. As the results we found the Quaternion representations improve the quality of the sequence alignment score and we can conclude that DNA sequence target has maximum similarity with Streptococcus pneumoniae.

  6. Reducing diagnostic turnaround times of exome sequencing for families requiring timely diagnoses.

    Science.gov (United States)

    Bourchany, Aurélie; Thauvin-Robinet, Christel; Lehalle, Daphné; Bruel, Ange-Line; Masurel-Paulet, Alice; Jean, Nolwenn; Nambot, Sophie; Willems, Marjorie; Lambert, Laetitia; El Chehadeh-Djebbar, Salima; Schaefer, Elise; Jaquette, Aurélia; St-Onge, Judith; Poe, Charlotte; Jouan, Thibaud; Chevarin, Martin; Callier, Patrick; Mosca-Boidron, Anne-Laure; Laurent, Nicole; Lefebvre, Mathilde; Huet, Frédéric; Houcinat, Nada; Moutton, Sébastien; Philippe, Christophe; Tran-Mau-Them, Frédéric; Vitobello, Antonio; Kuentz, Paul; Duffourd, Yannis; Rivière, Jean-Baptiste; Thevenon, Julien; Faivre, Laurence

    2017-08-12

    Whole-exome sequencing (WES) has now entered medical practice with powerful applications in the diagnosis of rare Mendelian disorders. Although the usefulness and cost-effectiveness of WES have been widely demonstrated, it is essential to reduce the diagnostic turnaround time to make WES a first-line procedure. Since 2011, the automation of laboratory procedures and advances in sequencing chemistry have made it possible to carry out diagnostic whole genome sequencing from the blood sample to molecular diagnosis of suspected genetic disorders within 50 h. Taking advantage of these advances, the main objective of the study was to improve turnaround times for sequencing results. WES was proposed to 29 patients with severe undiagnosed disorders with developmental abnormalities and faced with medical situations requiring rapid diagnosis. Each family gave consent. The extracted DNA was sequenced on a NextSeq500 (Illumina) instrument. Data were analyzed following standard procedures. Variants were interpreted using in-house software. Each rare variant affecting protein sequences with clinical relevance was tested for familial segregation. The diagnostic rate was 45% (13/29), with a mean turnaround time of 40 days from reception of the specimen to delivery of results to the referring physician. Besides permitting genetic counseling, the rapid diagnosis for positive families led to two pre-natal diagnoses and two inclusions in clinical trials. This pilot study demonstrated the feasibility of rapid diagnostic WES in our primary genetics center. It reduced the diagnostic odyssey and helped provide support to families. Copyright © 2017. Published by Elsevier Masson SAS.

  7. An investigation of the structural requirements for ATP hydrolysis and DNA cleavage by the EcoKI Type I DNA restriction and modification enzyme.

    Science.gov (United States)

    Roberts, Gareth A; Cooper, Laurie P; White, John H; Su, Tsueu-Ju; Zipprich, Jakob T; Geary, Paul; Kennedy, Cowan; Dryden, David T F

    2011-09-01

    Type I DNA restriction/modification systems are oligomeric enzymes capable of switching between a methyltransferase function on hemimethylated host DNA and an endonuclease function on unmethylated foreign DNA. They have long been believed to not turnover as endonucleases with the enzyme becoming inactive after cleavage. Cleavage is preceded and followed by extensive ATP hydrolysis and DNA translocation. A role for dissociation of subunits to allow their reuse has been proposed for the EcoR124I enzyme. The EcoKI enzyme is a stable assembly in the absence of DNA, so recycling was thought impossible. Here, we demonstrate that EcoKI becomes unstable on long unmethylated DNA; reuse of the methyltransferase subunits is possible so that restriction proceeds until the restriction subunits have been depleted. We observed that RecBCD exonuclease halts restriction and does not assist recycling. We examined the DNA structure required to initiate ATP hydrolysis by EcoKI and find that a 21-bp duplex with single-stranded extensions of 12 bases on either side of the target sequence is sufficient to support hydrolysis. Lastly, we discuss whether turnover is an evolutionary requirement for restriction, show that the ATP hydrolysis is not deleterious to the host cell and discuss how foreign DNA occasionally becomes fully methylated by these systems.

  8. mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud

    Science.gov (United States)

    Weissensteiner, Hansi; Forer, Lukas; Fuchsberger, Christian; Schöpf, Bernd; Kloss-Brandstätter, Anita; Specht, Günther; Kronenberg, Florian; Schönherr, Sebastian

    2016-01-01

    Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at. PMID:27084948

  9. mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud.

    Science.gov (United States)

    Weissensteiner, Hansi; Forer, Lukas; Fuchsberger, Christian; Schöpf, Bernd; Kloss-Brandstätter, Anita; Specht, Günther; Kronenberg, Florian; Schönherr, Sebastian

    2016-07-08

    Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at.

  10. Discriminative prediction of mammalian enhancers from DNA sequence.

    Science.gov (United States)

    Lee, Dongwon; Karchin, Rachel; Beer, Michael A

    2011-12-01

    Accurately predicting regulatory sequences and enhancers in entire genomes is an important but difficult problem, especially in large vertebrate genomes. With the advent of ChIP-seq technology, experimental detection of genome-wide EP300/CREBBP bound regions provides a powerful platform to develop predictive tools for regulatory sequences and to study their sequence properties. Here, we develop a support vector machine (SVM) framework which can accurately identify EP300-bound enhancers using only genomic sequence and an unbiased set of general sequence features. Moreover, we find that the predictive sequence features identified by the SVM classifier reveal biologically relevant sequence elements enriched in the enhancers, but we also identify other features that are significantly depleted in enhancers. The predictive sequence features are evolutionarily conserved and spatially clustered, providing further support of their functional significance. Although our SVM is trained on experimental data, we also predict novel enhancers and show that these putative enhancers are significantly enriched in both ChIP-seq signal and DNase I hypersensitivity signal in the mouse brain and are located near relevant genes. Finally, we present results of comparisons between other EP300/CREBBP data sets using our SVM and uncover sequence elements enriched and/or depleted in the different classes of enhancers. Many of these sequence features play a role in specifying tissue-specific or developmental-stage-specific enhancer activity, but our results indicate that some features operate in a general or tissue-independent manner. In addition to providing a high confidence list of enhancer targets for subsequent experimental investigation, these results contribute to our understanding of the general sequence structure of vertebrate enhancers.

  11. A method for high-throughput production of sequence-verified DNA libraries and strain collections.

    Science.gov (United States)

    Smith, Justin D; Schlecht, Ulrich; Xu, Weihong; Suresh, Sundari; Horecka, Joe; Proctor, Michael J; Aiyar, Raeka S; Bennett, Richard A O; Chu, Angela; Li, Yong Fuga; Roy, Kevin; Davis, Ronald W; Steinmetz, Lars M; Hyman, Richard W; Levy, Sasha F; St Onge, Robert P

    2017-02-13

    The low costs of array-synthesized oligonucleotide libraries are empowering rapid advances in quantitative and synthetic biology. However, high synthesis error rates, uneven representation, and lack of access to individual oligonucleotides limit the true potential of these libraries. We have developed a cost-effective method called Recombinase Directed Indexing (REDI), which involves integration of a complex library into yeast, site-specific recombination to index library DNA, and next-generation sequencing to identify desired clones. We used REDI to generate a library of ~3,300 DNA probes that exhibited > 96% purity and remarkable uniformity (> 95% of probes within twofold of the median abundance). Additionally, we created a collection of ~9,000 individually accessible CRISPR interference yeast strains for > 99% of genes required for either fermentative or respiratory growth, demonstrating the utility of REDI for rapid and cost-effective creation of strain collections from oligonucleotide pools. Our approach is adaptable to any complex DNA library, and fundamentally changes how these libraries can be parsed, maintained, propagated, and characterized.

  12. What Advances Are Being Made in DNA Sequencing?

    Science.gov (United States)

    ... Help Me Understand Genetics Home Help Me Understand Genetics Genomic Research What advances are being made in DNA ... Research Institute (NHGRI). The American College of Medical Genetics and Genomics (ACMG) has laid out their policies regarding whole ...

  13. Modeling the early stage of DNA sequence recognition within RecA nucleoprotein filaments.

    Science.gov (United States)

    Saladin, Adrien; Amourda, Christopher; Poulain, Pierre; Férey, Nicolas; Baaden, Marc; Zacharias, Martin; Delalande, Olivier; Prévost, Chantal

    2010-10-01

    Homologous recombination is a fundamental process enabling the repair of double-strand breaks with a high degree of fidelity. In prokaryotes, it is carried out by RecA nucleofilaments formed on single-stranded DNA (ssDNA). These filaments incorporate genomic sequences that are homologous to the ssDNA and exchange the homologous strands. Due to the highly dynamic character of this process and its rapid propagation along the filament, the sequence recognition and strand exchange mechanism remains unknown at the structural level. The recently published structure of the RecA/DNA filament active for recombination (Chen et al., Mechanism of homologous recombination from the RecA-ssDNA/dsDNA structure, Nature 2008, 453, 489) provides a starting point for new exploration of the system. Here, we investigate the possible geometries of association of the early encounter complex between RecA/ssDNA filament and double-stranded DNA (dsDNA). Due to the huge size of the system and its dense packing, we use a reduced representation for protein and DNA together with state-of-the-art molecular modeling methods, including systematic docking and virtual reality simulations. The results indicate that it is possible for the double-stranded DNA to access the RecA-bound ssDNA while initially retaining its Watson-Crick pairing. They emphasize the importance of RecA L2 loop mobility for both recognition and strand exchange.

  14. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Directory of Open Access Journals (Sweden)

    Soichi Inagaki

    Full Text Available Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  15. Comparing the performance of three ancient DNA extraction methods for high-throughput sequencing

    DEFF Research Database (Denmark)

    Gamba, Cristina; Hanghøj, Kristian Ebbesen; Gaunitz, Charleen

    2016-01-01

    The DNA molecules that can be extracted from archaeological and palaeontological remains are often degraded and massively contaminated with environmental microbial material. This reduces the efficacy of shotgun approaches for sequencing ancient genomes, despite the decreasing sequencing costs...... of high-throughput sequencing (HTS). Improving the recovery of endogenous molecules from the DNA extraction and purification steps could, thus, help advance the characterization of ancient genomes. Here, we apply the three most commonly used DNA extraction methods to five ancient bone samples spanning...... a ~30 thousand year temporal range and originating from a diversity of environments, from South America to Alaska. We show that methods based on the purification of DNA fragments using silica columns are more advantageous than in solution methods and increase not only the total amount of DNA molecules...

  16. Efficiency of ITS sequences for DNA barcoding in Passiflora (Passifloraceae).

    Science.gov (United States)

    Giudicelli, Giovanna Câmara; Mäder, Geraldo; de Freitas, Loreta Brandão

    2015-04-01

    DNA barcoding is a technique for discriminating and identifying species using short, variable, and standardized DNA regions. Here, we tested for the first time the performance of plastid and nuclear regions as DNA barcodes in Passiflora. This genus is a largely variable, with more than 900 species of high ecological, commercial, and ornamental importance. We analyzed 1034 accessions of 222 species representing the four subgenera of Passiflora and evaluated the effectiveness of five plastid regions and three nuclear datasets currently employed as DNA barcodes in plants using barcoding gap, applied similarity-, and tree-based methods. The plastid regions were able to identify less than 45% of species, whereas the nuclear datasets were efficient for more than 50% using "best match" and "best close match" methods of TaxonDNA software. All subgenera presented higher interspecific pairwise distances and did not fully overlap with the intraspecific distance, and similarity-based methods showed better results than tree-based methods. The nuclear ribosomal internal transcribed spacer 1 (ITS1) region presented a higher discrimination power than the other datasets and also showed other desirable characteristics as a DNA barcode for this genus. Therefore, we suggest that this region should be used as a starting point to identify Passiflora species.

  17. Efficiency of ITS Sequences for DNA Barcoding in Passiflora (Passifloraceae

    Directory of Open Access Journals (Sweden)

    Giovanna Câmara Giudicelli

    2015-04-01

    Full Text Available DNA barcoding is a technique for discriminating and identifying species using short, variable, and standardized DNA regions. Here, we tested for the first time the performance of plastid and nuclear regions as DNA barcodes in Passiflora. This genus is a largely variable, with more than 900 species of high ecological, commercial, and ornamental importance. We analyzed 1034 accessions of 222 species representing the four subgenera of Passiflora and evaluated the effectiveness of five plastid regions and three nuclear datasets currently employed as DNA barcodes in plants using barcoding gap, applied similarity-, and tree-based methods. The plastid regions were able to identify less than 45% of species, whereas the nuclear datasets were efficient for more than 50% using “best match” and “best close match” methods of TaxonDNA software. All subgenera presented higher interspecific pairwise distances and did not fully overlap with the intraspecific distance, and similarity-based methods showed better results than tree-based methods. The nuclear ribosomal internal transcribed spacer 1 (ITS1 region presented a higher discrimination power than the other datasets and also showed other desirable characteristics as a DNA barcode for this genus. Therefore, we suggest that this region should be used as a starting point to identify Passiflora species.

  18. Applications of inter simple sequence repeat (ISSR) rDNA in ...

    African Journals Online (AJOL)

    Applications of inter simple sequence repeat (ISSR) rDNA in detecting ... and phylogenetic relationships between Lymnaea natalensis collected from Giza, ... in water samples of all tested governorates with different significant differences.

  19. Genomic Signal Processing Methods for Computation of Alignment-Free Distances from DNA Sequences

    Science.gov (United States)

    Borrayo, Ernesto; Mendizabal-Ruiz, E. Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P.; Morales, J. Alejandro

    2014-01-01

    Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments. PMID:25393409

  20. Genomic signal processing methods for computation of alignment-free distances from DNA sequences.

    Science.gov (United States)

    Borrayo, Ernesto; Mendizabal-Ruiz, E Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P; Morales, J Alejandro

    2014-01-01

    Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.

  1. Tsukamurella tyrosinosolvens intravascular catheter infection identified using 16S ribosomal DNA sequencing.

    Science.gov (United States)

    Sheridan, Elizabeth A S; Warwick, Simon; Chan, Anthony; Dall'Antonia, Martino; Koliou, Maria; Sefton, Armine

    2003-03-01

    Cultures of blood from a hemodialysis line repeatedly yielded a gram-positive rod. The organism was identified as Tsukamurella tyrosinosolvens by 16S ribosomal DNA sequencing, and the patient was treated successfully by removal of the line.

  2. Cytogenetic analysis of Populus trichocarpa--ribosomal DNA, telomere repeat sequence, and marker-selected BACs.

    Science.gov (United States)

    Islam-Faridi, M N; Nelson, C D; DiFazio, S P; Gunter, L E; Tuskan, G A

    2009-01-01

    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequence assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.

  3. Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

    Energy Technology Data Exchange (ETDEWEB)

    Tuskan, Gerald A [ORNL; Gunter, Lee E [ORNL; DiFazio, Stephen P [West Virginia University

    2009-01-01

    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis -type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequence assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.

  4. Sequence dependent charge transport through DNA molecules: The role of periodicity and long-range correlations

    Energy Technology Data Exchange (ETDEWEB)

    Guo Aimin [Department of Physics Science and Technology, Central South University, Changsha 410083 (China)]. E-mail: guoaimin1982@163.com; Xu Hui [Department of Physics Science and Technology, Central South University, Changsha 410083 (China)

    2007-04-01

    Understanding the intrinsic conduction properties in DNA plays a vital role in molecular electronic engineering. We use a tight-binding model to investigate the transmissivity, Lyapunov coefficient and localization length of DNA, including periodic poly(G)-poly(C), quasiperiodic Fibonacci polyGC and random sequences, as a function of sequence length and temperature. Our results show that the periodicity and long-range correlations in DNA can persist high transmittivity, and new transmission peaks can appear for whatever the sequence length or temperature increases, but the mean transmission coefficient decreases. Meanwhile, thermal enhancement of conductance is a generic feature in all these sequences and the asymptotic behaviors of localization length for random DNA are in good agreement with the mobility edge theory.

  5. Solving the Curriculum Sequencing Problem with DNA Computing Approach

    Science.gov (United States)

    Debbah, Amina; Ben Ali, Yamina Mohamed

    2014-01-01

    In the e-learning systems, a learning path is known as a sequence of learning materials linked to each others to help learners achieving their learning goals. As it is impossible to have the same learning path that suits different learners, the Curriculum Sequencing problem (CS) consists of the generation of a personalized learning path for each…

  6. Algorithms for mapping high-throughput DNA sequences

    DEFF Research Database (Denmark)

    Frellsen, Jes; Menzel, Peter; Krogh, Anders

    2014-01-01

    of data generation, new bioinformatics approaches have been developed to cope with the large amount of sequencing reads obtained in these experiments. In this chapter, we first introduce HTS technologies and their usage in molecular biology and discuss the problem of mapping sequencing reads...

  7. 5'-end sequences of budding yeast full-length cDNA clones and quality scores - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Budding yeast cDNA sequencing project 5'-end sequences of budding yeast full-length cDNA clones and quality score...s Data detail Data name 5'-end sequences of budding yeast full-length cDNA clones and quality score...or-capping method, the sequence quality score generated by the Phred software, and links to SGD, dbEST and U...es. FASTA format. Quality Phred's quality score About This Database Database Desc...g yeast full-length cDNA clones and quality scores - Budding yeast cDNA sequencing project | LSDB Archive ...

  8. The Second Subunit of DNA Polymerase Delta Is Required for Genomic Stability and Epigenetic Regulation1[OPEN

    Science.gov (United States)

    Cheng, Jinkui; Lai, Jinsheng; Gong, Zhizhong

    2016-01-01

    DNA polymerase δ plays crucial roles in DNA repair and replication as well as maintaining genomic stability. However, the function of POLD2, the second small subunit of DNA polymerase δ, has not been characterized yet in Arabidopsis (Arabidopsis thaliana). During a genetic screen for release of transcriptional gene silencing, we identified a mutation in POLD2. Whole-genome bisulfite sequencing indicated that POLD2 is not involved in the regulation of DNA methylation. POLD2 genetically interacts with Ataxia Telangiectasia-mutated and Rad3-related and DNA polymerase α. The pold2-1 mutant exhibits genomic instability with a high frequency of homologous recombination. It also exhibits hypersensitivity to DNA-damaging reagents and short telomere length. Whole-genome chromatin immunoprecipitation sequencing and RNA sequencing analyses suggest that pold2-1 changes H3K27me3 and H3K4me3 modifications, and these changes are correlated with the gene expression levels. Our study suggests that POLD2 is required for maintaining genome integrity and properly establishing the epigenetic markers during DNA replication to modulate gene expression. PMID:27208288

  9. Cooperative heteroassembly of the adenoviral L4-22K and IVa2 proteins onto the viral packaging sequence DNA.

    Science.gov (United States)

    Yang, Teng-Chieh; Maluf, Nasib Karl

    2012-02-21

    Human adenovirus (Ad) is an icosahedral, double-stranded DNA virus. Viral DNA packaging refers to the process whereby the viral genome becomes encapsulated by the viral particle. In Ad, activation of the DNA packaging reaction requires at least three viral components: the IVa2 and L4-22K proteins and a section of DNA within the viral genome, called the packaging sequence. Previous studies have shown that the IVa2 and L4-22K proteins specifically bind to conserved elements within the packaging sequence and that these interactions are absolutely required for the observation of DNA packaging. However, the equilibrium mechanism for assembly of IVa2 and L4-22K onto the packaging sequence has not been determined. Here we characterize the assembly of the IVa2 and L4-22K proteins onto truncated packaging sequence DNA by analytical sedimentation velocity and equilibrium methods. At limiting concentrations of L4-22K, we observe a species with two IVa2 monomers and one L4-22K monomer bound to the DNA. In this species, the L4-22K monomer is promoting positive cooperative interactions between the two bound IVa2 monomers. As L4-22K levels are increased, we observe a species with one IVa2 monomer and three L4-22K monomers bound to the DNA. To explain this result, we propose a model in which L4-22K self-assembly on the DNA competes with IVa2 for positive heterocooperative interactions, destabilizing binding of the second IVa2 monomer. Thus, we propose that L4-22K levels control the extent of cooperativity observed between adjacently bound IVa2 monomers. We have also determined the hydrodynamic properties of all observed stoichiometric species; we observe that species with three L4-22K monomers bound have more extended conformations than species with a single L4-22K bound. We suggest this might reflect a molecular switch that controls insertion of the viral DNA into the capsid.

  10. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Science.gov (United States)

    Bertolini, Francesca; Ghionda, Marco Ciro; D'Alessandro, Enrico; Geraci, Claudia; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different couples of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides of which 29,109,688 with Q20 (87.43%) in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation about the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low level of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.

  11. Supported PCR: an efficient procedure to amplify sequences flanking a known DNA segment

    OpenAIRE

    Rudenko, George N.; Rommens, Caius M.T.; Nijkamp, H. John J.; Hille, Jacques

    1993-01-01

    We describe a novel modification of the polymerase chain reaction for efficient in vitro amplification of genomic DNA sequences flanking short stretches of known sequence. The technique utilizes a target enrichment step, based on the selective isolation of biotinylated fragments from the bulk of genomic DNA on streptavidin-containing support. Subsequently, following ligation with a second universal linker primer, the selected fragments can be amplified to amounts suitable for further molecula...

  12. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different couples of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides of which 29,109,688 with Q20 (87.43% in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation about the number of reads per species between different libraries was high for mammalian species (0.97 and lower for avian species (0.70. PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low level of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.

  13. Effect of DNA Extraction Methods on the Apparent Structure of Yak Rumen Microbial Communities as Revealed by 16S rDNA Sequencing.

    Science.gov (United States)

    Chen, Ya-Bing; Lan, Dao-Liang; Tang, Cheng; Yang, Xiao-Nong; Li, Jian

    2015-01-01

    To more efficiently identify the microbial community of the yak rumen, the standardization of DNA extraction is key to ensure fidelity while studying environmental microbial communities. In this study, we systematically compared the efficiency of several extraction methods based on DNA yield, purity, and 16S rDNA sequencing to determine the optimal DNA extraction methods whose DNA products reflect complete bacterial communities. The results indicate that method 6 (hexadecyltrimethylammomium bromide-lysozyme-physical lysis by bead beating) is recommended for the DNA isolation of the rumen microbial community due to its high yield, operational taxonomic unit, bacterial diversity, and excellent cell-breaking capability. The results also indicate that the bead-beating step is necessary to effectively break down the cell walls of all of the microbes, especially Gram-positive bacteria. Another aim of this study was to preliminarily analyze the bacterial community via 16S rDNA sequencing. The microbial community spanned approximately 21 phyla, 35 classes, 75 families, and 112 genera. A comparative analysis showed some variations in the microbial community between yaks and cattle that may be attributed to diet and environmental differences. Interestingly, numerous uncultured or unclassified bacteria were found in yak rumen, suggesting that further research is required to determine the specific functional and ecological roles of these bacteria in yak rumen. In summary, the investigation of the optimal