WorldWideScience

Sample records for anonymous dna sequences

  1. Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

    Energy Technology Data Exchange (ETDEWEB)

    Fields, C.A.

    1996-06-01

    The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from human and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler), has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0, has now been completed and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project, including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.

  2. DNA Sequencing

    Science.gov (United States)

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  3. DNA sequences encoding erythropoietin

    Energy Technology Data Exchange (ETDEWEB)

    Lin, F.K.

    1987-10-27

    A purified and isolated DNA sequence is described consisting essentially of a DNA sequence encoding a polypeptide having an amino acid sequence sufficiently duplicative of that of erythropoietin to allow possession of the biological property of causing bone marrow cells to increase production of reticulocytes and red blood cells, and to increase hemoglobin synthesis or iron uptake.

  4. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States)]; Venter, J.C. [National Inst. of Neurological Disorders and Stroke, Bethesda, MD (United States)]; Gilbert, W. [Harvard Univ., Cambridge, MA (United States)]; Mulligan, J. [Stanford Univ., CA (United States)]; Mansfield, B.K. [Oak Ridge National Lab., TN (United States)]

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  5. Evolution of DNA Sequencing

    International Nuclear Information System (INIS)

    Sanger and coworkers introduced DNA sequencing for the first time in the 1970s. It principally relied on termination of the growing nucleotide chain when a dideoxythymidine triphosphate (ddTTP) was inserted in it. Terminated sequences were detected radiographically after Polyacrylamide Gel Electrophoresis (PAGE). Improvements to the original Sanger sequencing that have evolved over time include replacement of radiography with fluorescence, use of a separate fluorescent marker for each nucleotide, use of capillary electrophoresis instead of polyacrylamide gel electrophoresis, and then introduction of capillary array electrophoresis. However, this technique suffered from a few inherent limitations, such as decreased sensitivity for low-level mutant alleles, complexities in analyzing highly polymorphic regions like the Major Histocompatibility Complex (MHC), and the high DNA concentrations required. Several Next Generation Sequencing (NGS) technologies that tend to overcome the limitations of Sanger sequencing have been introduced by Roche, Illumina and other commercial manufacturers, and have been reviewed. Introduction of NGS in clinical research and medical diagnostics is expected to change the entire diagnostic approach. Applications include the study of cancer variants, detection of minimal residual disease, exome sequencing, detection of Single Nucleotide Polymorphisms (SNPs) and their disease association, epigenetic regulation of gene expression, and sequencing of microorganism genomes. (author)
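
The chain-termination idea in the abstract above can be illustrated with a toy model. This is a minimal sketch under loud assumptions: one idealized reaction per base (the abstract names only ddTTP), every possible termination position is observed exactly once, and fragment lengths are measured without error. The function names `termination_lengths` and `read_from_fragments` are hypothetical and chosen only for this illustration.

```python
def termination_lengths(synthesized_strand, base):
    """Lengths of fragments that end where `base` acted as a chain terminator."""
    return [i + 1 for i, b in enumerate(synthesized_strand) if b == base]


def read_from_fragments(lengths_by_base):
    """Order all fragments by length and read off the terminating base of each."""
    labelled = [
        (length, base)
        for base, lengths in lengths_by_base.items()
        for length in lengths
    ]
    return "".join(base for _, base in sorted(labelled))


# Toy strand (an assumption for illustration, not data from the record).
strand = "GATTACA"
lengths_by_base = {base: termination_lengths(strand, base) for base in "ACGT"}

print(lengths_by_base["T"])                  # [3, 4]
print(read_from_fragments(lengths_by_base))  # GATTACA
```

Sorting fragments by length plays the role of electrophoretic separation here; real data would additionally require handling noise and unequal band intensities.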

  6. Information Theory of DNA Sequencing

    CERN Document Server

    Motahari, Abolfazl; Tse, David

    2012-01-01

    DNA sequencing is the basic workhorse of modern day biology and medicine. Shotgun sequencing is the dominant technique used: many randomly located short fragments called reads are extracted from the DNA sequence, and these reads are assembled to reconstruct the original sequence. By drawing an analogy between the DNA sequencing problem and the classic communication problem, we define an information theoretic notion of sequencing capacity. This is the maximum number of DNA base pairs that can be resolved reliably per read, and provides a fundamental limit to the performance that can be achieved by any assembly algorithm. We compute the sequencing capacity explicitly for a simple statistical model of the DNA sequence and the read process. Using this framework, we also study the impact of noise in the read process on the sequencing capacity.
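
The shotgun read process described above can be sketched as a small simulation. This is only an illustration under stated assumptions: reads are noise-free, have a fixed length, and start at uniformly random positions on a linear sequence. It checks coverage, which is necessary but not sufficient for reconstruction, and it does not compute the sequencing capacity defined in the paper. The names `sample_reads` and `covered_fraction` are hypothetical.

```python
import random


def sample_reads(sequence, read_length, num_reads, rng):
    """Draw fixed-length reads from uniformly random start positions."""
    last_start = len(sequence) - read_length
    starts = [rng.randint(0, last_start) for _ in range(num_reads)]
    return [(s, sequence[s:s + read_length]) for s in starts]


def covered_fraction(sequence_length, reads):
    """Fraction of sequence positions covered by at least one read."""
    covered = [False] * sequence_length
    for start, read in reads:
        for i in range(start, start + len(read)):
            covered[i] = True
    return sum(covered) / sequence_length


# Assumed toy parameters: a random 1000-base sequence and 200 reads of length 50.
rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(1000))
reads = sample_reads(genome, read_length=50, num_reads=200, rng=rng)
print(covered_fraction(len(genome), reads))
```

Varying `num_reads` and `read_length` shows how quickly coverage gaps close, which gives some intuition for why a per-read limit on resolvable bases is a natural quantity to study.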

  7. Graphene nanodevices for DNA sequencing

    Science.gov (United States)

    Heerema, Stephanie J.; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology.

  8. Duplication in DNA Sequences

    Science.gov (United States)

    Ito, Masami; Kari, Lila; Kincaid, Zachary; Seki, Shinnosuke

    The duplication and repeat-deletion operations are the basis of a formal language theoretic model of errors that can occur during DNA replication. During DNA replication, subsequences of a strand of DNA may be copied several times (resulting in duplications) or skipped (resulting in repeat-deletions). As formal language operations, iterated duplication and repeat-deletion of words and languages have been well studied in the literature. However, little is known about single-step duplications and repeat-deletions. In this paper, we investigate several properties of these operations, including closure properties of language families in the Chomsky hierarchy and equations involving these operations. We also make progress toward a characterization of regular languages that are generated by duplicating a regular language.
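
The two single-step operations described above can be made concrete on individual words. The sketch below assumes the usual definitions: a duplication rewrites a word u x v as u x x v, and a repeat-deletion rewrites u x x v as u x v, with x non-empty. The function names are hypothetical, and the code enumerates results for one word only; it does not address the closure properties studied in the paper.

```python
def duplications(word):
    """All words u x x v obtainable from word = u x v with x non-empty."""
    result = set()
    n = len(word)
    for i in range(n):
        for j in range(i + 1, n + 1):
            x = word[i:j]
            # Insert a second copy of x directly after the first.
            result.add(word[:j] + x + word[j:])
    return result


def repeat_deletions(word):
    """All words u x v obtainable from word = u x x v with x non-empty."""
    result = set()
    n = len(word)
    for i in range(n):
        for length in range(1, (n - i) // 2 + 1):
            first = word[i:i + length]
            second = word[i + length:i + 2 * length]
            if first == second:
                # Drop the second copy of the repeated factor.
                result.add(word[:i + length] + word[i + 2 * length:])
    return result


print(sorted(duplications("ab")))       # ['aab', 'abab', 'abb']
print(sorted(repeat_deletions("aab")))  # ['ab']
```

Every word produced by `duplications(w)` has `w` among its repeat-deletions, which is the sense in which the two operations are inverse to each other.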

  9. Structural complexity of DNA sequence.

    Science.gov (United States)

    Liou, Cheng-Yuan; Tseng, Shen-Han; Cheng, Wei-Chen; Tsai, Huai-Ying

    2013-01-01

    In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from all those statistical methods. Furthermore, this approach is compared with a topological entropy-based method for consistency and difference of the complexity results. PMID:23662161

  10. Structural Complexity of DNA Sequence

    Directory of Open Access Journals (Sweden)

    Cheng-Yuan Liou

    2013-01-01

    In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from all those statistical methods. Furthermore, this approach is compared with a topological entropy-based method for consistency and difference of the complexity results.

  11. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance, and even biology. There has been increasing interest in unravelling the mysteries of DNA; for example, how to distinguish coding from noncoding sequences, and how to classify organisms and determine their evolutionary relationships, are key problems in bioinformatics. Although much research has taken into consideration the long-range correlations in DNA sequences, and the global fractal dimension has been used in these works by other authors, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view) to DNA sequence analysis. We have also used the fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

  12. Comparative population genetic analysis of bocaccio rockfish Sebastes paucispinis using anonymous and gene-associated simple sequence repeat loci.

    Science.gov (United States)

    Buonaccorsi, Vincent P; Kimbrell, Carol A; Lynn, Eric A; Hyde, John R

    2012-01-01

    Comparative population genetic analyses of traditional and emergent molecular markers aid in determining appropriate use of new technologies. The bocaccio rockfish Sebastes paucispinis is a high gene-flow marine species off the west coast of North America that experienced strong population decline over the past 3 decades. We used 18 anonymous and 13 gene-associated simple sequence repeat (SSR) loci (expressed sequence tag [EST]-SSRs) to characterize range-wide population structure with temporal replicates. No F(ST)-outliers were detected using the LOSITAN program, suggesting that neither balancing nor divergent selection affected the loci surveyed. Consistent hierarchical structuring of populations by geography or year class was not detected regardless of marker class. The EST-SSRs were less variable than the anonymous SSRs, but no correlation between F(ST) and variation or marker class was observed. General linear model analysis showed that low EST-SSR variation was attributable to low mean repeat number. Comparative genomic analysis with Gasterosteus aculeatus, Takifugu rubripes, and Oryzias latipes showed consistently lower repeat number in EST-SSRs than SSR loci that were not in ESTs. Purifying selection likely imposed functional constraints on EST-SSRs resulting in low repeat numbers that affected diversity estimates but did not affect the observed pattern of population structure.

  13. DNA Sequencing Using Capillary Electrophoresis

    Energy Technology Data Exchange (ETDEWEB)

    Dr. Barry Karger

    2011-05-09

    The overall goal of this program was to develop capillary electrophoresis as the tool to be used to sequence the human genome for the first time. Our program was part of the Human Genome Project. In this work, we were highly successful, and the replaceable polymer we developed, linear polyacrylamide, was used by the DOE sequencing lab in California to sequence a significant portion of the human genome using the MegaBase multiple capillary array electrophoresis instrument. In this final report, we summarize our efforts and success. We began our work by using capillary electrophoresis to separate double-stranded oligonucleotides with cross-linked polyacrylamide gels in fused silica capillaries. This work showed the potential of the methodology. However, preparation of such cross-linked gel capillaries was difficult, with poor reproducibility, and, even more important, the columns were not very stable. We improved stability by using non-cross-linked linear polyacrylamide. Here, the entangled linear chains could move when osmotic pressure (e.g. sample injection) was imposed on the polymer matrix. This relaxation of the polymer dissipated the stress in the column. Our next advance was to use significantly lower concentrations of the linear polyacrylamide, so that the polymer could be automatically blown out after each run and replaced with fresh linear polymer solution. In this way, a new column was available for each analytical run. Finally, while testing many linear polymers, we selected linear polyacrylamide as the best matrix, as it was the most hydrophilic polymer available. Under our DOE program, we demonstrated initially the success of the linear polyacrylamide in separating double-stranded DNA. We note that the method is used even today to assay the purity of double-stranded DNA fragments. Our focus, of course, was on the separation of single-stranded DNA for sequencing purposes. In one paper, we demonstrated the success of our approach in sequencing up to 500 bases. Other

  14. Nanopore DNA sequencing with MspA

    OpenAIRE

    Derrington, Ian M.; Butler, Tom Z.; Collins, Marcus D.; Manrao, Elizabeth; Pavlenok, Mikhail; Niederweis, Michael; Gundlach, Jens H.

    2010-01-01

    Nanopore sequencing has the potential to become a direct, fast, and inexpensive DNA sequencing technology. The simplest form of nanopore DNA sequencing utilizes the hypothesis that individual nucleotides of single-stranded DNA passing through a nanopore will uniquely modulate an ionic current flowing through the pore, allowing the record of the current to yield the DNA sequence. We demonstrate that the ionic current through the engineered Mycobacterium smegmatis porin A, MspA, has the ability...

  15. Nanopore DNA sequencing with MspA.

    Science.gov (United States)

    Derrington, Ian M; Butler, Tom Z; Collins, Marcus D; Manrao, Elizabeth; Pavlenok, Mikhail; Niederweis, Michael; Gundlach, Jens H

    2010-09-14

    Nanopore sequencing has the potential to become a direct, fast, and inexpensive DNA sequencing technology. The simplest form of nanopore DNA sequencing utilizes the hypothesis that individual nucleotides of single-stranded DNA passing through a nanopore will uniquely modulate an ionic current flowing through the pore, allowing the record of the current to yield the DNA sequence. We demonstrate that the ionic current through the engineered Mycobacterium smegmatis porin A, MspA, has the ability to distinguish all four DNA nucleotides and resolve single nucleotides in single-stranded DNA when double-stranded DNA temporarily holds the nucleotides in the pore constriction. Passing DNA with a series of double-stranded sections through MspA provides proof of principle of a simple DNA sequencing method using a nanopore. These findings highlight the importance of MspA in the future of nanopore sequencing. PMID:20798343

  16. Anonymous Gossiping

    CERN Document Server

    Datta, Anwitaman

    2010-01-01

    In this paper we introduce a novel gossiping primitive to support privacy preserving data analytics (PPDA). In contrast to existing computational PPDA primitives such as secure multiparty computation and data randomization based approaches, the proposed primitive 'anonymous gossiping' is a communication primitive for privacy preserving personalized information aggregation complementing such traditional computational analytics. We realize this novel primitive by composing existing gossiping mechanisms for peer sampling & information aggregation with the onion routing technique for establishing anonymous communication. This is more an 'ideas' paper than one providing concrete and quantified results.

  17. A genetic similarity algorithm for searching the Gene Ontology terms and annotating anonymous protein sequences.

    Science.gov (United States)

    Othman, Razib M; Deris, Safaai; Illias, Rosli M

    2008-02-01

    A genetic similarity algorithm is introduced in this study to find a group of semantically similar Gene Ontology terms. The genetic similarity algorithm combines a semantic similarity measure algorithm with a parallel genetic algorithm. The semantic similarity measure algorithm is used to compute the similitude strength between the Gene Ontology terms. Then, the parallel genetic algorithm is employed to perform batch retrieval and to accelerate the search in the large search space of the Gene Ontology graph. The genetic similarity algorithm is implemented in the Gene Ontology browser named basic UTMGO to overcome the weaknesses of the existing Gene Ontology browsers, which use a conventional approach based on keyword matching. To show the applicability of the basic UTMGO, we extend its structure to develop a Gene Ontology-based protein sequence annotation tool named extended UTMGO. The objective of developing the extended UTMGO is to provide a simple and practical tool that is capable of producing better results and requires a reasonable amount of running time with low computing cost, specifically for offline usage. The computational results and comparison with other related tools are presented to show the effectiveness of the proposed algorithm and tools.

  18. Suicidal nucleotide sequences for DNA polymerization.

    OpenAIRE

    Samadashwily, G M; Dayn, A; Mirkin, S M

    1993-01-01

    Studying the activity of T7 DNA polymerase (Sequenase) on open circular DNAs, we observed virtually complete termination within potential triplex-forming sequences. Mutations destroying the triplex potential of the sequences prevented termination, while compensatory mutations restoring triplex potential restored it. We hypothesize that strand displacement during DNA polymerization of double-helical templates brings three DNA strands (duplex DNA downstream of the polymerase plus a displaced ov...

  19. Fibonacci Sequence and Supramolecular Structure of DNA.

    Science.gov (United States)

    Shabalkin, I P; Grigor'eva, E Yu; Gudkova, M V; Shabalkin, P I

    2016-05-01

    We proposed a new model of supramolecular DNA structure. Similar to our previously developed model of primary DNA structure [11-15], the 3D structure of the DNA molecule is assembled in accordance with a mathematical rule known as the Fibonacci sequence. Unlike primary DNA structure, the supramolecular 3D structure is assembled from complex moieties, including a regular tetrahedron and a regular octahedron consisting of monomers, the elements of the primary DNA structure. The moieties of the supramolecular DNA structure, forming fragments of a regular spatial lattice, are bound via linker (joint) sequences of the DNA chain. The lattice perceives and transmits information signals over a considerable distance without acoustic aberrations. Linker sequences expand the conformational space between lattice segments, allowing them to slide relative to each other under the action of external forces. In this case, sliding is provided by stretching of the stacked linker sequences. PMID:27265133

  1. Against anonymity.

    Science.gov (United States)

    Baker, Robert

    2014-05-01

    In 'New Threats to Academic Freedom' Francesca Minerva argues that anonymity for the authors of controversial articles is a prerequisite for academic freedom in the Internet age. This argument draws its intellectual and emotional power from the author's account of the reaction to the on-line publication of 'After-birth abortion: why should the baby live?'--an article that provoked cascades of hostile postings and e-mails. Reflecting on these events, Minerva proposes that publishers should offer the authors of controversial articles the option of publishing their articles anonymously. This response reviews the history of anonymous publication and concludes that its reintroduction in the Internet era would recreate problems similar to those that led print journals to abandon the practice: corruption of scholarly discourse by invective and hate speech, masked conflicts of interest, and a diminution of editorial accountability. It also contends that Minerva misreads the intent of the hostile e-mails provoked by 'After-birth abortion,' and that ethicists who publish controversial articles should take responsibility by dialoguing with their critics--even those whose critiques are emotionally charged and hostile. PMID:24724540

  3. Sequence Affects the Cyclization of DNA Minicircles.

    Science.gov (United States)

    Wang, Qian; Pettitt, B Montgomery

    2016-03-17

    Understanding how the sequence of a DNA molecule affects its dynamic properties is a central problem affecting biochemistry and biotechnology. The process of cyclizing short DNA, as a critical step in molecular cloning, lacks a comprehensive picture of the kinetic process containing sequence information. We have elucidated this process by using coarse-grained simulations, enhanced sampling methods, and recent theoretical advances. We are able to identify the types and positions of structural defects during the looping process at a base-pair level. Correlations along a DNA molecule dictate critical sequence positions that can affect the looping rate. Structural defects change the bending elasticity of the DNA molecule from a harmonic to subharmonic potential with respect to bending angles. We explore the subelastic chain as a possible model in loop formation kinetics. A sequence-dependent model is developed to qualitatively predict the relative loop formation time as a function of DNA sequence. PMID:26938490

  4. Mitochondrial DNA sequence evolution in shorebird populations.

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons why mtDNA is the molecule of

  5. DNA extraction columns contaminated with murine sequences.

    Directory of Open Access Journals (Sweden)

    Otto Erlwein

    Sequences of the novel gammaretrovirus, xenotropic murine leukemia virus-related virus (XMRV), have been described in human prostate cancer tissue, although the amounts of DNA are low. Furthermore, XMRV sequences and polytropic (p) murine leukemia viruses (MLVs) have been reported in patients with chronic fatigue syndrome (CFS). In assessing the prevalence of XMRV in prostate cancer tissue samples we discovered that eluates from naïve DNA purification columns, when subjected to PCR with primers designed to detect genomic mouse DNA contamination, occasionally gave rise to amplification products. Further PCR analysis, using primers to detect XMRV, revealed sequences derived from XMRV and pMLVs from mouse and human DNA and DNA of unspecified origin. Thus, DNA purification columns can present problems when used to detect minute amounts of DNA targets by highly sensitive amplification techniques.

  6. Using DNA looping to measure sequence dependent DNA elasticity

    Science.gov (United States)

    Kandinov, Alan; Raghunathan, Krishnan; Meiners, Jens-Christian

    2012-10-01

    We are using tethered particle motion (TPM) microscopy to observe protein-mediated DNA looping in the lactose repressor system in DNA constructs with varying AT / CG content. We use these data to determine the persistence length of the DNA as a function of its sequence content and compare the data to direct micromechanical measurements with constant-force axial optical tweezers. The data from the TPM experiments show a much smaller sequence effect on the persistence length than the optical tweezers experiments.

  7. Mitochondrial DNA sequence evolution in shorebird populations.

    OpenAIRE

    Wenink, P W

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons why mtDNA is the molecule of choice to probe the recent evolutionary history of a species. Most importantly, mtDNA accumulates substitutions at a high average rate that permits the tracing of genealogies within the time frame ...

  8. DNA display I. Sequence-encoded routing of DNA populations.

    Directory of Open Access Journals (Sweden)

    David R Halpin

    2004-07-01

    Recently reported technologies for DNA-directed organic synthesis and for DNA computing rely on routing DNA populations through complex networks. The reduction of these ideas to practice has been limited by a lack of practical experimental tools. Here we describe a modular design for DNA routing genes, and routing machinery made from oligonucleotides and commercially available chromatography resins. The routing machinery partitions nanomole quantities of DNA into physically distinct subpools based on sequence. Partitioning steps can be iterated indefinitely, with worst-case yields of 85% per step. These techniques facilitate DNA-programmed chemical synthesis, and thus enable a materials biology that could revolutionize drug discovery.

  9. Long range correlations in DNA sequences

    CERN Document Server

    Mohanty, A K

    2002-01-01

    The so-called long range correlation properties of DNA sequences are studied using the variance analyses of the density distribution of a single or a group of nucleotides in a model independent way. This new method, which was suggested earlier, has been applied to extract slope parameters that characterize the correlation properties for several intron-containing and intronless DNA sequences. An important aspect of all the DNA sequences is the property of complementarity, by virtue of which any two complementary distributions (like GA is complementary to TC or G is complementary to ATC) have identical fluctuations at all scales although their distribution functions need not be identical. Due to this complementarity, the famous DNA walk representation, whose statistical interpretation is still unresolved, is shown to be a special case of the present formalism with a density distribution corresponding to a purine or a pyrimidine group. Another interesting aspect of most of the DNA sequences is that the factorial m...
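
The complementarity property mentioned above can be checked numerically on a toy sequence. The sketch below is an assumption-laden illustration, not the paper's variance analysis: it counts a nucleotide group in non-overlapping windows (a windowing scheme chosen here for simplicity) and compares the variance for the group GA with that of its complement TC. It also builds a purine/pyrimidine DNA walk in the common +1/-1 convention. The names `window_counts` and `dna_walk`, and the example sequence, are hypothetical.

```python
from statistics import pvariance


def window_counts(sequence, group, window):
    """Count letters from `group` in consecutive non-overlapping windows."""
    return [
        sum(1 for base in sequence[i:i + window] if base in group)
        for i in range(0, len(sequence) - window + 1, window)
    ]


def dna_walk(sequence, up="AG"):
    """Cumulative walk: +1 for a base in `up` (purines by default), -1 otherwise."""
    position, walk = 0, []
    for base in sequence:
        position += 1 if base in up else -1
        walk.append(position)
    return walk


seq = "ATGCGGATTACAGGCTTAAGCCGTATAGCA"  # toy sequence, not real data
window = 5

# In each full window the GA count and the TC count sum to the window size,
# so the two count series differ only by a sign and a constant shift.
var_purine = pvariance(window_counts(seq, "GA", window))
var_pyrimidine = pvariance(window_counts(seq, "TC", window))
print(var_purine, var_pyrimidine)

print(dna_walk("AGCT"))  # [1, 2, 1, 0]
```

Because the complementary counts are an affine image of each other, their fluctuations agree at every window size, while their means (and hence their distributions) generally differ.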

  10. Dynamics and Control of DNA Sequence Amplification

    CERN Document Server

    Marimuthu, Karthikeyan

    2014-01-01

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction (PCR) as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal tempe...
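
As a point of reference for the phrase "geometric amplification", the following sketch uses the common textbook idealization in which each cycle multiplies the copy number by (1 + efficiency). This is an assumption made here for illustration; it is not the sequence-dependent biophysical model or the control formulation of the paper, and the function name `amplify` is hypothetical.

```python
def amplify(initial_copies, efficiencies):
    """Copy number after each cycle; each cycle multiplies by (1 + efficiency)."""
    copies = float(initial_copies)
    history = []
    for efficiency in efficiencies:
        if not 0.0 <= efficiency <= 1.0:
            raise ValueError("efficiency must lie in [0, 1]")
        copies *= 1.0 + efficiency
        history.append(copies)
    return history


# Perfect doubling for three cycles, starting from a single template.
print(amplify(1, [1.0, 1.0, 1.0]))  # [2.0, 4.0, 8.0]
```

In this idealization the cycle-by-cycle efficiencies are the only inputs; a control-theoretic treatment such as the one described above would instead derive them from the temperature profile.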

  11. Group Anonymity

    CERN Document Server

    Chertov, Oleg (doi:10.1007/978-3-642-14058-7_61)

    2010-01-01

    In recent years the amount of digital data in the world has risen immensely. However, the more information exists, the greater the possibility of its unwanted disclosure. Thus, data privacy protection has become a pressing problem of the present time. The task of preserving individual privacy is being thoroughly studied nowadays. At the same time, the problem of statistical disclosure control for collective (or group) data is still open. In this paper we propose an effective and relatively simple (wavelet-based) way to provide group anonymity in collective data. We also provide a real-life example to illustrate the method.

  12. Visible periodicity of strong nucleosome DNA sequences.

    Science.gov (United States)

    Salih, Bilal; Tripathi, Vijay; Trifonov, Edward N

    2015-01-01

    Fifteen years ago, Lowary and Widom assembled nucleosomes on synthetic random sequence DNA molecules, selected the strongest nucleosomes and discovered that the TA dinucleotides in these strong nucleosome sequences often appear at 10-11 bases from one another or at distances which are multiples of this period. We repeated this experiment computationally, on large ensembles of natural genomic sequences, by selecting the strongest nucleosomes--i.e. those with such distances between like-named dinucleotides, multiples of 10.4 bases, the structural and sequence period of nucleosome DNA. The analysis confirmed the periodicity of TA dinucleotides in the strong nucleosomes, and revealed as well other periodic sequence elements, notably classical AA and TT dinucleotides. The matrices of DNA bendability and their simple linear forms--nucleosome positioning motifs--are calculated from the strong nucleosome DNA sequences. The motifs are in full accord with nucleosome positioning sequences derived earlier, thus confirming that the new technique, indeed, detects strong nucleosomes. Species- and isochore-specific variations of the matrices and of the positioning motifs are demonstrated. The strong nucleosome DNA sequences manifest the highest hitherto nucleosome positioning sequence signals, showing the dinucleotide periodicities in directly observable rather than in hidden form.
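
    The distance statistic described in this abstract (how often TA dinucleotides recur at a given spacing) can be illustrated with a minimal sketch. The function name, the `max_dist` cutoff, and the toy sequence are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the authors' selection of strong nucleosomes.

```python
from collections import Counter

def dinucleotide_distances(seq, dinuc="TA", max_dist=40):
    """Count distances between pairs of occurrences of `dinuc`, up to max_dist."""
    seq = seq.upper()
    # Start positions of every occurrence of the dinucleotide (already sorted).
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]
    counts = Counter()
    for a, p in enumerate(positions):
        for q in positions[a + 1:]:
            d = q - p
            if d > max_dist:
                break  # positions are sorted, so later ones are even farther
            counts[d] += 1
    return counts

# Hypothetical toy sequence with TA placed every 10 bases.
toy = ("TA" + "G" * 8) * 5
hist = dinucleotide_distances(toy)
for distance in sorted(hist):
    print(distance, hist[distance])
```

    On real genomic data, a period near 10.4 bases would be expected to show up as peaks at roughly 10, 21, 31, ... in such a histogram.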

  13. Extracting biological knowledge from DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    De La Vega, F.M. [CINVESTAV-IPN (Mexico); Thieffry, D. [Universite Libre de Bruxelles, Rhode-Saint-Genese (Belgium)]|[Universidad Nacional Autonoma de Mexico, Morelos (Mexico); Collado-Vides, J. [Universidad Nacional Autonoma de Mexico, Morelos (Mexico)

    1996-12-31

    This session describes the elucidation of information from DNA sequences and the challenges computational biologists face in their task of summarizing and deciphering the human genome. Techniques discussed include methods from statistics, information theory, artificial intelligence and linguistics. 1 ref.

  14. Chromatid interchanges at intrachromosomal telomeric DNA sequences

    International Nuclear Information System (INIS)

    Chinese hamster Don cells were exposed to X-rays, mitomycin C and teniposide (VM-26) to induce chromatid exchanges (quadriradials and triradials). After fluorescence in situ hybridization (FISH) of telomere sequences it was found that interstitial telomere-like DNA sequence arrays presented around five times more breakage-rearrangements than the genome overall. This high recombinogenic capacity was independent of the clastogen, suggesting that this susceptibility is not related to the initial mechanisms of DNA damage. (author)

  15. Nucleotide Capacitance Calculation for DNA Sequencing

    OpenAIRE

    Lu, Jun-Qiang; Zhang, X.-G.

    2008-01-01

    Using a first-principles linear response theory, the capacitance of the DNA nucleotides, adenine, cytosine, guanine, and thymine, are calculated. The difference in the capacitance between the nucleotides is studied with respect to conformational distortion. The result suggests that although an alternate current capacitance measurement of a single-stranded DNA chain threaded through a nanogap electrode may not be sufficient to be used as a standalone method for rapid DNA sequencing, the capaci...

  16. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable", resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  17. Osmylated DNA, a novel concept for sequencing DNA using nanopores

    Science.gov (United States)

    Kanavarioti, Anastassia

    2015-03-01

    Sanger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically comparable to each other, which leads to small differences in current obstruction. ‘Base calling’ becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable if there were only two bases, chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5-C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base and creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1) and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV-vis assay to measure extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complement of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA.

  18. Osmylated DNA, a novel concept for sequencing DNA using nanopores

    International Nuclear Information System (INIS)

    Sanger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically comparable to each other, which leads to small differences in current obstruction. ‘Base calling’ becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable if there were only two bases, chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5–C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base and creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1) and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV–vis assay to measure extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complement of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA. (paper)

  19. Physical approaches to DNA sequencing and detection

    CERN Document Server

    Zwolak, Michael

    2007-01-01

    With the continued improvement of sequencing technologies, the prospect of genome-based medicine is now at the forefront of scientific research. To realize this potential, however, we need a revolutionary sequencing method for the cost-effective and rapid interrogation of individual genomes. This capability is likely to be provided by a physical approach to probing DNA at the single nucleotide level. This is in sharp contrast to current techniques and instruments, which probe, through chemical elongation, electrophoresis, and optical detection, length differences and terminating bases of strands of DNA. In this Colloquium we review several physical approaches to DNA detection that have the potential to deliver fast and low-cost sequencing. Central to these approaches is the concept of nanochannels or nanopores, which allow for the spatial confinement of DNA molecules. In addition to their possible impact in medicine and biology, the methods offer ideal test beds to study open scientific issues and challenge...

  20. Cloned endogenous retroviral sequences from human DNA.

    OpenAIRE

    Bonner, T I; O'Connell, C; Cohen, M.

    1982-01-01

    We have screened a human DNA library using as probe a chimpanzee sequence that contains homology to the polymerase gene of the endogenous baboon virus. One set of overlapping clones spans about 20 kilobases and contains regions of DNA sequence homology to the gag p30, gag p15, and polymerase genes of Moloney murine leukemia virus. Furthermore, the spacings are the same as in Moloney virus between these sequences and a 480-nucleotide region that has the structural characteristics of a 3' copy ...

  1. Estimating the entropy of DNA sequences.

    Science.gov (United States)

    Schmitt, A O; Herzel, H

    1997-10-01

    The Shannon entropy is a standard measure for the order state of symbol sequences, such as, for example, DNA sequences. In order to incorporate correlations between symbols, the entropy of n-mers (consecutive strands of n symbols) has to be determined. Here, an assay is presented to estimate such higher order entropies (block entropies) for DNA sequences when the actual number of observations is small compared with the number of possible outcomes. The n-mer probability distribution underlying the dynamical process is reconstructed using elementary statistical principles: The theorem of asymptotic equi-distribution and the Maximum Entropy Principle. Constraints are set to force the constructed distributions to adopt features which are characteristic for the real probability distribution. From the many solutions compatible with these constraints the one with the highest entropy is the most likely one according to the Maximum Entropy Principle. An algorithm performing this procedure is expounded. It is tested by applying it to various DNA model sequences whose exact entropies are known. Finally, results for a real DNA sequence, the complete genome of the Epstein Barr virus, are presented and compared with those of other information carriers (texts, computer source code, music). It seems as if DNA sequences possess much more freedom in the combination of the symbols of their alphabet than written language or computer source codes. PMID:9344742
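
    As a point of reference for the block entropies discussed in this abstract, the following is a minimal sketch of the naive plug-in estimate of the n-mer entropy. It is not the maximum-entropy reconstruction the authors describe; it is the simple frequency-count estimator that becomes unreliable once the number of observed n-mers is small compared with the 4^n possible ones. The function name and example sequence are illustrative assumptions.

```python
import math
from collections import Counter

def block_entropy(seq, n):
    """Plug-in Shannon entropy (in bits) of the overlapping n-mers of seq."""
    seq = seq.upper()
    blocks = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    if not blocks:
        raise ValueError("sequence is shorter than the block length")
    counts = Counter(blocks)
    total = len(blocks)
    # H_n = -sum p log2 p, with p estimated by relative frequency.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical periodic toy sequence: every base is equally frequent.
example = "ACGT" * 25
h1 = block_entropy(example, 1)   # single-symbol entropy
h2 = block_entropy(example, 2)   # dinucleotide (2-mer) entropy
print(h1, h2)
```

    For this periodic toy sequence the 2-mer entropy is barely larger than the 1-mer entropy, which reflects the strong correlation between neighbouring symbols.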

  2. Dynamics and control of DNA sequence amplification

    International Nuclear Information System (INIS)

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions

  3. Dynamics and control of DNA sequence amplification

    Energy Technology Data Exchange (ETDEWEB)

    Marimuthu, Karthikeyan [Department of Chemical Engineering and Center for Advanced Process Decision-Making, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213 (United States); Chakrabarti, Raj, E-mail: raj@pmc-group.com, E-mail: rajc@andrew.cmu.edu [Department of Chemical Engineering and Center for Advanced Process Decision-Making, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213 (United States); Division of Fundamental Research, PMC Advanced Technology, Mount Laurel, New Jersey 08054 (United States)

    2014-10-28

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.

  4. Compressing DNA sequence databases with coil

    Directory of Open Access Journals (Sweden)

    Hendy Michael D

    2008-05-01

    Background: Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression, an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results: We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression; the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion: coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.

  5. cDNA sequence quality data - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Budding yeast cDNA sequencing project cDNA sequence quality data Data detail Data name cDNA sequence quality... data Description of data contents Phred's quality score. PHD format, one file to a single cDNA data, and co...

  6. DNA Sequencing in Cultural Heritage.

    Science.gov (United States)

    Vai, Stefania; Lari, Martina; Caramelli, David

    2016-02-01

    During the last three decades, DNA analysis on degraded samples revealed itself as an important research tool in anthropology, archaeozoology, molecular evolution, and population genetics. Application on topics such as determination of species origin of prehistoric and historic objects, individual identification of famous personalities, characterization of particular samples important for historical, archeological, or evolutionary reconstructions, confers to the paleogenetics an important role also for the enhancement of cultural heritage. A really fast improvement in methodologies in recent years led to a revolution that permitted recovering even complete genomes from highly degraded samples with the possibility to go back in time 400,000 years for samples from temperate regions and 700,000 years for permafrozen remains and to analyze even more recent material that has been subjected to hard biochemical treatments. Here we propose a review on the different methodological approaches used so far for the molecular analysis of degraded samples and their application on some case studies.

  7. Indexing for Large DNA Database Sequences

    Directory of Open Access Journals (Sweden)

    S. M. Wohoush & M.H. Saheb

    2011-10-01

    Bioinformatics data consists of a huge amount of information due to the large number of sequences, the very high sequence lengths and the daily new additions. This data needs to be efficiently accessed for many needs. What makes one DNA data item distinct from another is its DNA sequence. A DNA sequence consists of a combination of four characters, A, C, G and T, and sequences have different lengths. Using a suitable representation of DNA sequences, and a suitable index structure to hold this representation in main memory, leads to efficient processing by accessing the DNA sequences through indexing, and reduces the number of disk I/O accesses. I/O operations are needed at the end to avoid false hits; we reduce the number of candidate DNA sequences that need to be checked by pruning, so there is no need to search the whole database. We need a suitable index for searching DNA sequences efficiently, with suitable index size and search time. The selection of the relation fields on which the index is built has a big effect on index size and search time. Our experiments use the n-gram wavelet transformation upon one-field and multi-field index structures under the relational DBMS environment. Results show the need to consider index size and search time carefully when using indexing. Increasing window size decreases the amount of I/O reference. The use of single-field and multiple-field indexing is highly affected by the window size value. Increasing the window size value leads to better search time with a special type of index using single-field indexing, while the search time is almost the same, and good, for most index types when using multiple-field indexing. The storage space needed for the RDBMS indexing types is almost the same as or greater than the actual data.
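
    The prune-then-verify idea in this abstract can be sketched with a plain in-memory n-gram inverted index. This is only an illustration under stated assumptions: it does not implement the wavelet transformation or the relational DBMS index types the authors evaluate, and the function names, the default n = 3, and the tiny example database are hypothetical.

```python
from collections import defaultdict

def build_ngram_index(sequences, n=3):
    """Map each n-gram to the set of sequence ids that contain it."""
    index = defaultdict(set)
    for seq_id, seq in sequences.items():
        for i in range(len(seq) - n + 1):
            index[seq[i:i + n]].add(seq_id)
    return index

def search(query, sequences, index, n=3):
    """Return the sorted ids of sequences containing `query` as a substring."""
    if len(query) < n:
        # Query too short to prune with the index: every sequence is a candidate.
        candidates = set(sequences)
    else:
        grams = [query[i:i + n] for i in range(len(query) - n + 1)]
        # Pruning: a match must contain every n-gram of the query.
        candidates = set.intersection(*[index.get(g, set()) for g in grams])
    # Verification: check the surviving candidates to discard false hits.
    return sorted(s for s in candidates if query in sequences[s])

# Hypothetical three-record database.
db = {"s1": "ACGTACGT", "s2": "TTGACCA", "s3": "GGACGTT"}
idx = build_ngram_index(db)
print(search("ACGT", db, idx))
print(search("GAC", db, idx))
```

    Only the candidates that survive the n-gram intersection are compared against the full query, which mirrors the abstract's point that pruning avoids scanning the whole database.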

  8. Detecting seeded motifs in DNA sequences

    OpenAIRE

    Pizzi, Cinzia; Bortoluzzi, Stefania; Bisognin, Andrea; Coppe, Alessandro; Danieli, Gian Antonio

    2005-01-01

    The problem of detecting DNA motifs with functional relevance in real biological sequences is difficult due to a number of biological, statistical and computational issues and also because of the lack of knowledge about the structure of searched patterns. Many algorithms are implemented in fully automated processes, which are often based upon a guess of input parameters from the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, compo...

  9. Sequence-Specific Ultrasonic Cleavage of DNA

    OpenAIRE

    Grokhovsky, Sergei L.; Il'icheva, Irina A.; Nechipurenko, Dmitry Yu.; Golovkin, Michail V.; Panchenko, Larisa A.; Polozov, Robert V.; Nechipurenko, Yury D.

    2011-01-01

    We investigated the phenomenon of ultrasonic cleavage of DNA by analyzing a large set of cleavage patterns of DNA restriction fragments using polyacrylamide gel electrophoresis. The cleavage intensity of individual phosphodiester bonds was found to depend on the nucleotide sequence and the position of the bond with respect to the ends of the fragment. The relative intensities of cleavage of the central phosphodiester bond in 16 dinucleotides and 256 tetranucleotides were determined by multiva...

  10. DNA sequencing by synthesis with degenerate primers

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    A degenerate primer-based sequencing-by-synthesis method (DP-SBS) was developed for high-throughput DNA sequencing, in which a set of degenerate primers are hybridized on the arrayed DNA templates and extended by DNA polymerase on microarrays. In this method, a different set of degenerate primers containing a given number (n) of degenerate nucleotides at the 3'-ends were annealed to the sequenced templates that were immobilized on the solid surface. The nucleotides (n+1) on the template sequences were determined by detecting the incorporation of fluorescent labeled nucleotides. The fluorescent labeled nucleotide was incorporated into the primer in a base-specific manner after the enzymatic primer extension reactions, and nine-base lengths were read out accurately. The main advantage of DP-SBS is that the method uses only very conventional biochemical reagents and avoids the complicated special chemical reagents for removing the labeled nucleotides and reactivating the primer for further extension. From the present study, it is found that the DP-SBS method is reliable, simple, and cost-effective for laboratory sequencing of a large amount of short DNA fragments.

  11. The complete DNA sequence of vaccinia virus.

    Science.gov (United States)

    Goebel, S J; Johnson, G P; Perkus, M E; Davis, S W; Winslow, J P; Paoletti, E

    1990-11-01

    The complete DNA sequence of the genome of vaccinia virus has been determined. The genome consisted of 191,636 bp with a base composition of 66.6% A + T. We have identified 198 "major" protein-coding regions and 65 overlapping "minor" regions, for a total of 263 potential genes. Genes encoded by the virus were located by examination of DNA sequence characteristics and compared with existing vaccinia virus mapping analyses, sequence data, and transcription data. These genes were found to be compactly organized along the genome with relatively few regions of noncoding sequences. Whereas several similarities to proteins of known function were discerned, the function of the majority of proteins encoded by these open reading frames is as yet undetermined.

  12. Automated Template Quantification for DNA Sequencing Facilities

    Science.gov (United States)

    Ivanetich, Kathryn M.; Yan, Wilson; Wunderlich, Kathleen M.; Weston, Jennifer; Walkup, Ward G.; Simeon, Christian

    2005-01-01

    The quantification of plasmid DNA by the PicoGreen dye binding assay has been automated, and the effect of quantification of user-submitted templates on DNA sequence quality in a core laboratory has been assessed. The protocol pipets, mixes and reads standards, blanks and up to 88 unknowns, generates a standard curve, and calculates template concentrations. For pUC19 replicates at five concentrations, coefficients of variance were 0.1, and percent errors were from 1% to 7% (n = 198). Standard curves with pUC19 DNA were nonlinear over the 1 to 1733 ng/μL concentration range required to assay the majority (98.7%) of user-submitted templates. Over 35,000 templates have been quantified using the protocol. For 1350 user-submitted plasmids, 87% deviated by ≥ 20% from the requested concentration (500 ng/μL). Based on data from 418 sequencing reactions, quantification of user-submitted templates was shown to significantly improve DNA sequence quality. The protocol is applicable to all types of double-stranded DNA, is unaffected by primer (1 pmol/μL), and is user modifiable. The protocol takes 30 min, saves 1 h of technical time, and costs approximately $0.20 per unknown. PMID:16461949

  13. The first determination of DNA sequence of a specific gene.

    Science.gov (United States)

    Inouye, Masayori

    2016-05-10

    How and when was the first DNA sequence of a gene determined? In 1977, F. Sanger came up with an innovative technology to sequence DNA by using chain terminators, and determined the entire DNA sequence of the 5375-base genome of bacteriophage φX 174 (Sanger et al., 1977). While Sanger's achievement has been recognized as the first DNA sequencing of genes, we had determined the DNA sequence of a gene, albeit a partial sequence, 11 years before Sanger's DNA sequence (Okada et al., 1966).

  14. Modified Genetic Algorithm for DNA Sequence Assembly by Shotgun and Hybridization Sequencing Techniques

    OpenAIRE

    Prof.Narayan Kumar Sahu; Prof.Somesh Dewangan; Prof.Akash Wanjari

    2012-01-01

    Since the advent of rapid DNA sequencing methods in 1976, scientists have had the problem of inferring DNA sequences from sequenced fragments. Shotgun sequencing is a well-established biological and computational method used in practice. Many conventional algorithms for shotgun sequencing are based on the notion of pairwise fragment overlap. While shotgun sequencing infers a DNA sequence given the sequences of overlapping fragments, a recent and complementary method, called sequencing by hy...

  15. The DNA sequence of human chromosome 7.

    Science.gov (United States)

    Hillier, Ladeana W; Fulton, Robert S; Fulton, Lucinda A; Graves, Tina A; Pepin, Kymberlie H; Wagner-McPherson, Caryn; Layman, Dan; Maas, Jason; Jaeger, Sara; Walker, Rebecca; Wylie, Kristine; Sekhon, Mandeep; Becker, Michael C; O'Laughlin, Michelle D; Schaller, Mark E; Fewell, Ginger A; Delehaunty, Kimberly D; Miner, Tracie L; Nash, William E; Cordes, Matt; Du, Hui; Sun, Hui; Edwards, Jennifer; Bradshaw-Cordum, Holland; Ali, Johar; Andrews, Stephanie; Isak, Amber; Vanbrunt, Andrew; Nguyen, Christine; Du, Feiyu; Lamar, Betty; Courtney, Laura; Kalicki, Joelle; Ozersky, Philip; Bielicki, Lauren; Scott, Kelsi; Holmes, Andrea; Harkins, Richard; Harris, Anthony; Strong, Cynthia Madsen; Hou, Shunfang; Tomlinson, Chad; Dauphin-Kohlberg, Sara; Kozlowicz-Reilly, Amy; Leonard, Shawn; Rohlfing, Theresa; Rock, Susan M; Tin-Wollam, Aye-Mon; Abbott, Amanda; Minx, Patrick; Maupin, Rachel; Strowmatt, Catrina; Latreille, Phil; Miller, Nancy; Johnson, Doug; Murray, Jennifer; Woessner, Jeffrey P; Wendl, Michael C; Yang, Shiaw-Pyng; Schultz, Brian R; Wallis, John W; Spieth, John; Bieri, Tamberlyn A; Nelson, Joanne O; Berkowicz, Nicolas; Wohldmann, Patricia E; Cook, Lisa L; Hickenbotham, Matthew T; Eldred, James; Williams, Donald; Bedell, Joseph A; Mardis, Elaine R; Clifton, Sandra W; Chissoe, Stephanie L; Marra, Marco A; Raymond, Christopher; Haugen, Eric; Gillett, Will; Zhou, Yang; James, Rose; Phelps, Karen; Iadanoto, Shawn; Bubb, Kerry; Simms, Elizabeth; Levy, Ruth; Clendenning, James; Kaul, Rajinder; Kent, W James; Furey, Terrence S; Baertsch, Robert A; Brent, Michael R; Keibler, Evan; Flicek, Paul; Bork, Peer; Suyama, Mikita; Bailey, Jeffrey A; Portnoy, Matthew E; Torrents, David; Chinwalla, Asif T; Gish, Warren R; Eddy, Sean R; McPherson, John D; Olson, Maynard V; Eichler, Evan E; Green, Eric D; Waterston, Robert H; Wilson, Richard K

    2003-07-10

    Human chromosome 7 has historically received prominent attention in the human genetics community, primarily related to the search for the cystic fibrosis gene and the frequent cytogenetic changes associated with various forms of cancer. Here we present more than 153 million base pairs representing 99.4% of the euchromatic sequence of chromosome 7, the first metacentric chromosome completed so far. The sequence has excellent concordance with previously established physical and genetic maps, and it exhibits an unusual amount of segmentally duplicated sequence (8.2%), with marked differences between the two arms. Our initial analyses have identified 1,150 protein-coding genes, 605 of which have been confirmed by complementary DNA sequences, and an additional 941 pseudogenes. Of genes confirmed by transcript sequences, some are polymorphic for mutations that disrupt the reading frame. PMID:12853948

  16. DNA sequencing by nanopores: advances and challenges

    Science.gov (United States)

    Agah, Shaghayegh; Zheng, Ming; Pasquali, Matteo; Kolomeisky, Anatoly B.

    2016-10-01

    Developing inexpensive and simple DNA sequencing methods capable of detecting entire genomes in short periods of time could revolutionize the world of medicine and technology. It will also lead to major advances in our understanding of fundamental biological processes. It has been shown that nanopores have the ability of single-molecule sensing of various biological molecules rapidly and at a low cost. This has stimulated significant experimental efforts in developing DNA sequencing techniques by utilizing biological and artificial nanopores. In this review, we discuss recent progress in the nanopore sequencing field with a focus on the nature of nanopores and on sensing mechanisms during the translocation. Current challenges and alternative methods are also discussed.

  17. Mitochondrial DNA sequence variation in Greeks.

    Science.gov (United States)

    Kouvatsi, A; Karaiskou, N; Apostolidis, A; Kirmizidis, G

    2001-12-01

    Mitochondrial DNA (mtDNA) control region sequences were determined in 54 unrelated Greeks, coming from different regions in Greece, for both segments HVR-I and HVR-II. Fifty-two different mtDNA haplotypes were revealed, one of which was shared by three individuals. A very low heterogeneity was found among Greek regions. No one cluster of lineages was specific to individuals coming from a certain region. The average pairwise difference distribution showed a value of 7.599. The data were compared with that for other European or neighbor populations (British, French, Germans, Tuscans, Bulgarians, and Turks). The genetic trees that were constructed revealed homogeneity between Europeans. Median networks revealed that most of the Greek mtDNA haplotypes are clustered to the five known haplogroups and that a number of haplotypes are shared among Greeks and other European and Near Eastern populations.
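
    The average pairwise difference reported in this abstract is the mean number of differing sites over all pairs of aligned sequences. A minimal sketch of that statistic follows; the function name and the three short toy haplotypes are illustrative assumptions, and the sketch presumes the sequences are already aligned to equal length.

```python
from itertools import combinations

def mean_pairwise_difference(seqs):
    """Average number of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for each pair, then average over pairs.
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)

# Hypothetical toy haplotypes (not data from the study).
haplotypes = ["ACGT", "ACGA", "TCGA"]
mpd = mean_pairwise_difference(haplotypes)
print(mpd)
```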

  18. New stopping criteria for segmenting DNA sequences

    CERN Document Server

    Li, W

    2001-01-01

    We propose a solution on the stopping criterion in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on Bayesian Information Criterion (BIC) in the model selection framework. When this stopping criterion is applied to a left telomere sequence of yeast Saccharomyces cerevisiae and the complete genome sequence of bacterium Escherichia coli, borders of biologically meaningful units were identified (e.g. subtelomeric units, replication origin, and replication terminus), and a more reasonable number of domains was obtained. We also introduce a measure called segmentation strength which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.
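
    The BIC-based stopping rule described above can be illustrated with a minimal sketch. It is not the author's implementation: it assumes a simple multinomial base-composition model per segment, and the helper names (`log_likelihood`, `best_split`, `accept_split`) and the default of four extra parameters per split (three free base frequencies plus the cut position) are illustrative assumptions. A split is kept only when twice the log-likelihood gain exceeds the BIC penalty.

```python
import math
from collections import Counter


def log_likelihood(counts, n):
    """Maximum log-likelihood of n symbols under a multinomial composition model."""
    return sum(c * math.log(c / n) for c in counts.values() if c > 0)


def best_split(seq):
    """Return (position, gain) for the cut maximizing 2 * log-likelihood gain."""
    n = len(seq)
    whole = log_likelihood(Counter(seq), n)
    left, right = Counter(), Counter(seq)
    best_pos, best_gain = None, 0.0
    for i in range(1, n):
        base = seq[i - 1]
        left[base] += 1      # move one symbol from the right segment to the left
        right[base] -= 1
        gain = 2.0 * (log_likelihood(left, i) + log_likelihood(right, n - i) - whole)
        if gain > best_gain:
            best_pos, best_gain = i, gain
    return best_pos, best_gain


def accept_split(seq, extra_params=4):
    """Accept the best cut only if its gain beats the BIC penalty (assumed form)."""
    pos, gain = best_split(seq)
    penalty = extra_params * math.log(len(seq))
    return pos, gain > penalty
```

    Applied recursively to each accepted segment, a rule of this kind stops on its own once no further cut is justified by the data.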

  19. ASTRAL, a hyperspectral imaging DNA sequencer

    Science.gov (United States)

    O'Brien, Kevin M.; Wren, Jonathan; Davé, Varshal K.; Bai, Diane; Anderson, Richard D.; Rayner, Simon; Evans, Glen A.; Dabiri, Ali E.; Garner, Harold R.

    1998-05-01

    We are developing a prototype automatic DNA sequencer that uses polyacrylamide slab gels imaged through a novel optical detection system. The design of this prototype sequencer allows direct optical coupling over the entire read area of the gel, with hyperspectrographic separation and detection of the fluorescence emission. The machine has no moving parts. All the major components incorporated in this prototype are currently available "off the shelf," thus reducing equipment development time and decreasing costs. Software developed for data acquisition, analysis, and conversion to other standard formats facilitates compatibility.

  20. Inferring Coalescence Times from DNA Sequence Data

    OpenAIRE

    Tavare, S; Balding, D. J.; Griffiths, R. C.; Donnelly, P

    1997-01-01

    The paper is concerned with methods for the estimation of the coalescence time (time since the most recent common ancestor) of a sample of intraspecies DNA sequences. The methods take advantage of prior knowledge of population demography, in addition to the molecular data. While some theoretical results are presented, a central focus is on computational methods. These methods are easy to implement, and, since explicit formulae tend to be either unavailable or unilluminating, they are also mor...

  1. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    OpenAIRE

    Fei Chen; Yuan-Ting Zhang

    2003-01-01

    DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT) – the bionic wavelet transform (BWT) – for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the s...

  2. Vector sequences - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Budding yeast cDNA sequencing project Vector sequences Data detail Data name Vector sequences Description of data contents Vector seq...wnload License Update History of This Database Site Policy | Contact Us Vector sequences - Budding yeast cDNA sequencing project | LSDB Archive ... ...uences used for sequencing. Multi FASTA format. 7 entries. Data file File name: vec

  3. Modified Genetic Algorithm for DNA Sequence Assembly by Shotgun and Hybridization Sequencing Techniques

    Directory of Open Access Journals (Sweden)

    Prof.Narayan Kumar Sahu

    2012-09-01

    Since the advent of rapid DNA sequencing methods in 1976, scientists have had the problem of inferring DNA sequences from sequenced fragments. Shotgun sequencing is a well-established biological and computational method used in practice. Many conventional algorithms for shotgun sequencing are based on the notion of pairwise fragment overlap. While shotgun sequencing infers a DNA sequence given the sequences of overlapping fragments, a recent and complementary method, called sequencing by hybridization (SBH), infers a DNA sequence given the set of oligomers that represents all subwords of some fixed length, k. In this paper, we propose a new computer algorithm for DNA sequence assembly that combines in a novel way the techniques of both shotgun and SBH methods. Based on our preliminary investigations, the algorithm promises to be very fast and practical for DNA sequence assembly [1].

  4. Nucleosome DNA sequence structure of isochores

    Directory of Open Access Journals (Sweden)

    Trifonov Edward N

    2011-04-01

    Background: Significant differences in G+C content between different isochore types suggest that the nucleosome positioning patterns in DNA of the isochores should be different as well. Results: Extraction of the patterns from the isochore DNA sequences by Shannon N-gram extension reveals that while the general motif YRRRRRYYYYYR is characteristic for all isochore types, the dominant positioning patterns of the isochores vary between TAAAAATTTTTA and CGGGGGCCCCCG due to the large differences in G+C composition. This is observed in human, mouse and chicken isochores, demonstrating that the variations of the positioning patterns are largely G+C dependent rather than species-specific. The species-specificity of nucleosome positioning patterns is revealed by dinucleotide periodicity analyses in isochore sequences. While human sequences show CG periodicity, chicken isochores display AG (CT) periodicity. Mouse isochores show only very weak CG periodicity. Conclusions: The nucleosome positioning pattern as revealed by Shannon N-gram extension is strongly dependent on G+C content and differs between isochores. Species-specificity of the pattern is subtle. It is reflected in the choice of preferentially periodical dinucleotides.

  5. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    Energy Technology Data Exchange (ETDEWEB)

    Winston Chen, C.H.; Taranenko, N.I.; Zhu, Y.F.; Chung, C.N.; Allman, S.L.

    1997-03-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, the authors recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. The preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, the authors applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  6. Sequence dependent hole evolution in DNA.

    Science.gov (United States)

    Lakhno, V D

    2004-06-01

    The paper examines the dynamical behavior of a radical cation (G(+*)) generated in a double-stranded DNA for different oligonucleotide sequences. The resonance hole tunneling through an oligonucleotide sequence is studied by the method of numerical integration of self-consistent quantum-mechanical equations. The hole motion is considered quantum mechanically and nucleotide base oscillations are treated classically. The results obtained demonstrate a strong dependence of charge transfer on the type of nucleotide sequence. The rates of the hole transfer are calculated for different nucleotide sequences and compared with experimental data on the transfer from (G(+*)) to a GGG unit.

  7. Recent advances in DNA sequencing techniques

    Science.gov (United States)

    Singh, Rama Shankar

    2013-06-01

    Successful mapping of the draft human genome in 2001 and more recent mapping of the human microbiome genome in 2012 have relied heavily on the parallel processing of the second generation/Next Generation Sequencing (NGS) DNA machines at a cost of several million dollars and long computer processing times. These have been mainly biochemical approaches. Here a system analysis approach is used to review these techniques by identifying the requirements, specifications, test methods, error estimates, repeatability, reliability and trends in the cost reduction. The first generation, NGS and the Third Generation Single Molecule Real Time (SMART) detection sequencing methods are reviewed. Based on the National Human Genome Research Institute (NHGRI) data, the achieved cost reductions of 1.5 times per year from Sep. 2001 to July 2007, 7 times per year from Oct. 2007 to Apr. 2010, and 2.5 times per year from July 2010 to Jan. 2012 are discussed.

  8. Poincaré recurrences of DNA sequences

    Science.gov (United States)

    Frahm, K. M.; Shepelyansky, D. L.

    2012-01-01

    We analyze the statistical properties of Poincaré recurrences of Homo sapiens, mammalian, and other DNA sequences taken from the Ensembl Genome database with up to 15 billion base pairs. We show that the probability of Poincaré recurrences decays in an algebraic way with the Poincaré exponent β≈4 even if the oscillatory dependence is well pronounced. The correlations between recurrences decay with an exponent ν≈0.6 that leads to an anomalous superdiffusive walk. However, for Homo sapiens sequences, with the largest available statistics, the diffusion coefficient converges to a finite value on distances larger than one million base pairs. We argue that the approach based on Poincaré recurrences determines new proximity features between different species and sheds new light on their evolutionary history.

  9. Image correlation method for DNA sequence alignment.

    Directory of Open Access Journals (Sweden)

    Millaray Curilem Saldías

    The complexity of searches and the volume of genomic data make sequence alignment one of bioinformatics' most active research areas. New alignment approaches have incorporated digital signal processing techniques. Among these, correlation methods are highly sensitive. This paper proposes a novel sequence alignment method based on 2-dimensional images, where each nucleic acid base is represented as a fixed gray intensity pixel. Query and known database sequences are coded to their pixel representation and sequence alignment is handled as an object-recognition-in-a-scene problem. Query and database become object and scene, respectively. An image correlation process is carried out in order to search for the best match between them. Given that this procedure can be implemented in an optical correlator, the correlation could eventually be accomplished at light speed. This paper shows an initial research stage where results were "digitally" obtained by simulating an optical correlation of DNA sequences represented as images. A total of 303 queries (variable lengths from 50 to 4500 base pairs) and 100 scenes represented by 100 x 100 images each (in total, a one million base pair database) were considered for the image correlation analysis. The results showed that correlations reached very high sensitivity (99.01%) and specificity (98.99%) and outperformed BLAST when mutation numbers increased. However, digital correlation processes were a hundred times slower than BLAST. We are currently starting an initiative to evaluate the correlation speed of a real experimental optical correlator. By doing this, we expect to fully exploit the properties of optical correlation. As the optical correlator works jointly with the computer, digital algorithms should also be optimized. The results presented in this paper are encouraging and support the study of image correlation methods for sequence alignment.
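
    The idea of coding bases as gray intensities and locating a query by correlation can be sketched in one dimension. This is a simplified illustration, not the paper's 2-dimensional optical method: the intensity values in `GRAY` and the function names are assumptions made for the example, and a sliding normalized correlation stands in for the image correlator.

```python
import math

# Hypothetical gray levels; the paper only states that each base gets a fixed intensity.
GRAY = {"A": 0.0, "C": 85.0, "G": 170.0, "T": 255.0}


def encode(seq):
    """Map a DNA string to a list of gray intensities."""
    return [GRAY[base] for base in seq.upper()]


def normalized_correlation(x, y):
    """Pearson-style correlation of two equal-length intensity vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0.0:          # a constant window carries no pattern to correlate
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def best_match(query, database):
    """Slide the query over the database and return (offset, score) of the best window."""
    q, d = encode(query), encode(database)
    best_pos, best_score = -1, -2.0
    for i in range(len(d) - len(q) + 1):
        score = normalized_correlation(q, d[i:i + len(q)])
        if score > best_score:
            best_pos, best_score = i, score
    return best_pos, best_score
```

    Because the score degrades gradually as bases change, a sketch like this tolerates some mutations in the query, at the cost of scanning every offset.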

  10. Detecting seeded motifs in DNA sequences.

    Science.gov (United States)

    Pizzi, Cinzia; Bortoluzzi, Stefania; Bisognin, Andrea; Coppe, Alessandro; Danieli, Gian Antonio

    2005-01-01

    The problem of detecting DNA motifs with functional relevance in real biological sequences is difficult due to a number of biological, statistical and computational issues and also because of the lack of knowledge about the structure of searched patterns. Many algorithms are implemented in fully automated processes, which are often based upon a guess of input parameters from the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, composed of regions with different extents of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from input sequences and clustered to produce motif core regions, which are then extended and scored to generate seeded motifs. The combination of automated pattern discovery algorithms and different display tools for the evaluation and selection of results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different yeast and human real datasets proved the methodology to be a promising solution for finding seeded motifs. The MOST web tool is freely available at http://telethon.bio.unipd.it/bioinfo/MOST. PMID:16141193

  12. DNA Sequence Optimization Based on Continuous Particle Swarm Optimization for Reliable DNA Computing and DNA Nanotechnology

    Directory of Open Access Journals (Sweden)

    N. K. Khalid

    2008-01-01

    Problem statement: In DNA-based computation and DNA nanotechnology, the design of good DNA sequences has turned out to be an essential problem and one of the most practical and important research topics. Basically, the DNA sequence design problem is a multi-objective problem and it can be evaluated using four objective functions, namely, H-measure, similarity, continuity and hairpin. Approach: There are several ways to solve a multi-objective problem; however, in order to evaluate the correctness of the PSO algorithm in DNA sequence design, this problem is converted into a single-objective problem. Particle Swarm Optimization (PSO) is proposed to minimize the objective in the problem, subject to two constraints: melting temperature and GC content. A model is developed to present the DNA sequence design based on PSO computation. Results: Based on the experiments conducted, 20 particles are used in the implementation of the optimization process, where the average values and the standard deviation for 100 runs are shown along with a comparison to other existing methods. Conclusion: The results verified that PSO can suitably solve the DNA sequence design problem using the proposed method and model, comparatively better than other approaches.
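
    Two of the quantities named above, the GC content constraint and the continuity objective, can be sketched as follows. The abstract does not give their formulas, so the definitions here are assumptions: GC content as the fraction of G or C bases, and continuity as the sum of squared lengths of same-base runs at or above a threshold. The function names and the default threshold are hypothetical, and the PSO search itself is not shown.

```python
from itertools import groupby


def gc_content(seq):
    """Fraction of bases that are G or C (assumed definition of the constraint)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def continuity(seq, threshold=3):
    """Penalty for long runs of one base: sum of squared run lengths >= threshold.

    This form is an illustrative assumption, not taken from the abstract.
    """
    run_lengths = (len(list(group)) for _, group in groupby(seq.upper()))
    return sum(n * n for n in run_lengths if n >= threshold)
```

    A candidate sequence with `gc_content` near 0.5 and `continuity` of 0 would count as well behaved under these assumed definitions.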

  13. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT) – the bionic wavelet transform (BWT) – for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA sequences. Results show that BWT performs better than traditional WT in presenting greater energy distribution. This new BWT method should be useful for the detection of the latent structural features in future DNA sequence analysis.
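
    The separation of a symbolic sequence into four indicator channels, followed by a weighted symbol-to-number mapping, can be sketched as below. The function names are assumptions, and the weights are supplied by the caller; the paper's adaptive rule for choosing them, and the wavelet transform itself, are not reproduced here.

```python
def indicator_sequences(seq):
    """Split a DNA string into four binary channels, one per base."""
    seq = seq.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


def weighted_signal(seq, weights):
    """Combine the channels into one numeric signal using per-base weights.

    `weights` is a caller-supplied dict such as {"A": 1, "C": 2, "G": 3, "T": 4};
    these values are placeholders, not the adaptive weights from the paper.
    """
    channels = indicator_sequences(seq)
    return [
        sum(weights[base] * channels[base][i] for base in "ACGT")
        for i in range(len(seq))
    ]
```

    The resulting numeric signal is the kind of input a wavelet transform would then operate on.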

  14. Non-random DNA fragmentation in next-generation sequencing

    Science.gov (United States)

    Poptsova, Maria S.; Il'Icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-03-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments and sequencing them in a massively parallel fashion. The multiple overlapping segments, termed "reads", are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, we previously showed that for sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of hydrodynamic forces on DNA produce a similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel which factors influence the non-uniform coverage of various genomic regions, and to what extent.

  15. Counterintuitive DNA Sequence Dependence in Supercoiling-Induced DNA Melting

    NARCIS (Netherlands)

    Vlijm, R.; Torre, J.; Dekker, C.

    2015-01-01

    The metabolism of DNA in cells relies on the balance between hybridized double-stranded DNA (dsDNA) and local de-hybridized regions of ssDNA that provide access to binding proteins. Traditional melting experiments, in which short pieces of dsDNA are heated up until the point of melting into ssDNA, h

  16. Short-sequence DNA repeats in prokaryotic genomes

    NARCIS (Netherlands)

    A.F. van Belkum (Alex); S. Scherer; L. van Alphen (Loek); H.A. Verbrugh (Henri)

    1998-01-01

    Short-sequence DNA repeat (SSR) loci can be identified in all eukaryotic and many prokaryotic genomes. These loci harbor short or long stretches of repeated nucleotide sequence motifs. DNA sequence motifs in a single locus can be identical and/or heterogeneo

  17. 5'-end sequences of budding yeast full-length cDNA clones - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Budding yeast cDNA sequencing project 5'-end sequences of budding yeast full-length cDNA clones Data detail Data name 5'-end sequence...s of budding yeast full-length cDNA clones Description of data contents cDNA sequence...e Update History of This Database Site Policy | Contact Us 5'-end sequences of budding yeast full-length cDNA clones - Budding yeast cDNA sequencing project | LSDB Archive ...

  18. Anonymizing Unstructured Data

    CERN Document Server

    Motwani, Rajeev

    2008-01-01

    In this paper we consider the problem of anonymizing datasets in which each individual is associated with a set of items that constitute private information about the individual. Illustrative datasets include market-basket datasets and search engine query logs. We formalize the notion of k-anonymity for set-valued data as a variant of the k-anonymity model for traditional relational datasets. We define an optimization problem that arises from this definition of anonymity and provide a constant factor approximation algorithm for the same. We evaluate our algorithms on the America Online query log dataset.
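
    As a small illustration of the k-anonymity notion for set-valued data, the sketch below checks whether every item set in a dataset is shared by at least k individuals. The function names and the toy market-basket data are assumptions; the sketch only tests the property and does not attempt the paper's optimization problem or approximation algorithm.

```python
from collections import Counter


def is_k_anonymous(records, k):
    """True if every distinct item set occurs for at least k individuals."""
    groups = Counter(frozenset(items) for items in records)
    return all(count >= k for count in groups.values())


def violating_records(records, k):
    """Return the distinct item sets shared by fewer than k individuals."""
    groups = Counter(frozenset(items) for items in records)
    return [set(group) for group, count in groups.items() if count < k]
```

    For example, with baskets `[{"milk", "bread"}, {"bread", "milk"}, {"beer"}]` and k = 2, the lone `{"beer"}` basket is the record that breaks anonymity.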

  19. An Anonymity Revocation Technology for Anonymous Communication

    Science.gov (United States)

    Antoniou, Giannakis; Batten, Lynn; Parampalli, Udaya

    A number of privacy-enhancing technologies (PETs) have been proposed in the last three decades offering unconditional communication anonymity to their users. Unconditional anonymity can, however, be a security threat because it allows users to employ a PET in order to act maliciously while hiding their identity. In the last few years, several technologies which revoke the identity of users who use PETs have been proposed. These are known as anonymity revocation technologies (ARTs). However, the construction of ARTs has been developed in an ad hoc manner without a theoretical basis outlining the goals and underlying principles. In this chapter we present a set of fundamental principles and requirements for construction of an ART, identifying the necessary features. We then propose an abstract scheme for construction of an ART based on these features.

  20. Next-generation sequencing offers new insights into DNA degradation

    DEFF Research Database (Denmark)

    Overballe-Petersen, Søren; Orlando, Ludovic Antoine Alexandre; Willerslev, Eske

    2012-01-01

    The processes underlying DNA degradation are central to various disciplines, including cancer research, forensics and archaeology. The sequencing of ancient DNA molecules on next-generation sequencing platforms provides direct measurements of cytosine deamination, depurination and fragmentation...... rates that previously were obtained only from extrapolations of results from in vitro kinetic experiments performed over short timescales. For example, recent next-generation sequencing of ancient DNA reveals purine bases as one of the main targets of postmortem hydrolytic damage, through base...

  1. A motif-independent metric for DNA sequence specificity

    OpenAIRE

    Pinello Luca; Lo Bosco Giosuè; Hanlon Bret; Yuan Guo-Cheng

    2011-01-01

    Background: Genome-wide mapping of protein-DNA interactions has been widely used to investigate biological functions of the genome. An important question is to what extent such interactions are regulated at the DNA sequence level. However, current investigation is hampered by the lack of computational methods for systematically evaluating sequence specificity. Results: We present a simple, unbiased quantitative measure for DNA sequence specificity called the Motif Independent Measure (MIM)...

  2. Protection of DNA sequences by triplex-bridge formation.

    OpenAIRE

    Kiyama, R; Oishi, M

    1995-01-01

    We have demonstrated that the DNA sequence between two triplex-forming polypurine.polypyrimidine (Pu.Py) tracts was protected from DNA modifying enzymes upon formation of triplex DNA structures with an oligodeoxyribonucleotide in which two triplex-forming Pu or Py tracts were placed at the termini (triplex-bridge formation). In model experiments, when two triplex structures were formed between double-stranded DNA with the sequence (AG)17-(N)18-(T)34, and an oligodeoxyribonucleotide, (T)34-(N)...

  3. Effect of Noise on DNA Sequencing via Transverse Electronic Transport

    OpenAIRE

    Krems, Matt; Zwolak, Michael; Pershin, Yuriy V.; Di Ventra, Massimiliano

    2009-01-01

    Previous theoretical studies have shown that measuring the transverse current across DNA strands while they translocate through a nanopore or channel may provide a statistically distinguishable signature of the DNA bases, and may thus allow for rapid DNA sequencing. However, fluctuations of the environment, such as ionic and DNA motion, introduce important scattering processes that may affect the viability of this approach to sequencing. To understand this issue, we have analyzed a simple mod...

  4. SWORDS: A statistical tool for analysing large DNA sequences

    Indian Academy of Sciences (India)

    Probal Chaudhuri; Sandip Das

    2002-02-01

    In this article, we present some simple yet effective statistical techniques for analysing and comparing large DNA sequences. These techniques are based on frequency distributions of DNA words in a large sequence, and have been packaged into a software called SWORDS. Using sequences available in public domain databases housed in the Internet, we demonstrate how SWORDS can be conveniently used by molecular biologists and geneticists to unmask biologically important features hidden in large sequences and assess their statistical significance.
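
    A frequency distribution of DNA words, and a simple way to compare two such distributions, can be sketched as follows. This is not the SWORDS software: the function names and the use of an L1 distance between distributions are assumptions chosen for illustration.

```python
from collections import Counter


def word_frequencies(seq, k):
    """Relative frequency of each overlapping word of length k in the sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def frequency_distance(freq_a, freq_b):
    """L1 distance between two word-frequency distributions (assumed measure)."""
    words = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(w, 0.0) - freq_b.get(w, 0.0)) for w in words)
```

    Two sequences with similar word usage give a distance near 0, while sequences sharing no words of length k give the maximum value of 2.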

  5. Anonymization of Court Orders

    DEFF Research Database (Denmark)

    Povlsen, Claus; Jongejan, Bart; Hansen, Dorte Haltrup;

    We describe an anonymization tool that was commissioned by and specified together with Schultz, a publishing company specialized in Danish law related publications. Unavailability of training data and the need to guarantee compliance with pre-existing anonymization guidelines forced us to implement...

  6. A novel constraint for thermodynamically designing DNA sequences.

    Directory of Open Access Journals (Sweden)

    Qiang Zhang

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.

  7. A novel constraint for thermodynamically designing DNA sequences.

    Science.gov (United States)

    Zhang, Qiang; Wang, Bin; Wei, Xiaopeng; Zhou, Changjun

    2013-01-01

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap. PMID:24015217
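
    The Hamming distance constraint that the abstract contrasts with its thermodynamic constraint can be sketched in a few lines. The function names are assumptions, and the sketch covers only the plain pairwise check on equal-length sequences; the free energy gap and the genetic algorithm are not shown.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def satisfies_min_distance(sequences, d):
    """True if every pair in the set differs in at least d positions."""
    return all(hamming(a, b) >= d for a, b in combinations(sequences, 2))
```

    A check like this looks only at mismatched positions, which is why, as the abstract notes, it says nothing about how strongly two strands would actually hybridize.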

  8. DNA display I. Sequence-encoded routing of DNA populations.

    OpenAIRE

    Halpin, David R; Pehr B Harbury

    2004-01-01

    Recently reported technologies for DNA-directed organic synthesis and for DNA computing rely on routing DNA populations through complex networks. The reduction of these ideas to practice has been limited by a lack of practical experimental tools. Here we describe a modular design for DNA routing genes, and routing machinery made from oligonucleotides and commercially available chromatography resins. The routing machinery partitions nanomole quantities of DNA into physically distinct subpools ...

  9. New method to study DNA sequences: the languages of evolution.

    Science.gov (United States)

    Spinelli, Gino; Mayer-Foulkes, David

    2008-04-01

    Recently, several authors have reported statistical evidence for deterministic dynamics in the flux of genetic information, suggesting that evolution involves the emergence and maintenance of a fractal landscape in DNA chains. Here we examine the idea that motif repetition lies at the origin of these statistical properties of DNA. To analyse repetition patterns we apply a modification of the BDS statistic, devised to analyze complex economic dynamics and adapted here to DNA sequence analysis. This provides a new method to detect structured signals in genetic information. We compare naturally occurring DNA sequences along the evolutionary tree with randomly generated sequences and also with simulated sequences with repetition motifs. For easier understanding, we also define a new statistic for a DNA sequence that constitutes a specific fingerprint. The new methods are applied to exon and intron DNA sequences, finding specific statistical differences. Moreover, by analysing DNA sequences of different species from Bacteria to Man, we explore the evolution of these linguistic DNA features along the evolutionary tree. The results are consistent with the idea that all the flux of DNA information need not be random, but may be structured along the evolutionary tree. The implications for evolutionary theory are discussed.

  10. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    NARCIS (Netherlands)

    Buschmann, Tilo; Bystrykh, Leonid V.

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fraction of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called multi
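
    The Levenshtein (edit) distance named in the title counts substitutions, insertions and deletions, which is what lets a barcode set tolerate indel errors as well as mismatches. The sketch below shows the standard dynamic-programming distance and a naive nearest-barcode lookup; the function names and the lookup strategy are assumptions, not the construction or decoder described in the paper.

```python
def levenshtein(a, b):
    """Edit distance between two strings (substitution, insertion, deletion)."""
    prev = list(range(len(b) + 1))          # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,        # deletion from a
                            curr[j - 1] + 1,    # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def nearest_barcode(read, barcodes):
    """Assign a read to the barcode with the smallest edit distance (naive lookup)."""
    return min(barcodes, key=lambda barcode: levenshtein(read, barcode))
```

    In practice a decoder would also need a rule for ties and for reads that are too far from every barcode; those choices are left out of this sketch.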

  11. Affordable Hands-On DNA Sequencing and Genotyping: An Exercise for Teaching DNA Analysis to Undergraduates

    Science.gov (United States)

    Shah, Kushani; Thomas, Shelby; Stein, Arnold

    2013-01-01

    In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…

  12. DNA Polymerases Drive DNA Sequencing-by-Synthesis Technologies: Both Past and Present

    Directory of Open Access Journals (Sweden)

    Cheng-Yao Chen

    2014-06-01

    Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. E. coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger's dideoxy chain terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public database. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third-generation, real-time single-molecule sequencing platforms require slightly different enzyme properties. Enterobacterial phage φ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, φ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore- and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  13. Immunostimulatory DNA sequences influence the course of adjuvant arthritis

    NARCIS (Netherlands)

    Ronaghy, A; Prakken, BJ; Takabayashi, K; Firestein, GS; Boyle, D; Zvaifler, NJ; Roord, STA; Albani, S; Carson, DA; Raz, E

    2002-01-01

    Bacterial DNA is enriched in unmethylated CpG motifs that have been shown to activate the innate immune system. These immunostimulatory DNA sequences (ISS) induce inflammation when injected directly into joints. However, the role of bacterial DNA in systemic arthritis is not known. The purpose of th

  14. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd- and 4th-year undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  15. Cloning and sequencing of mouse GABA transporter complementary DNA

    Institute of Scientific and Technical Information of China (English)

    TAM ANTHONY C.W.; LI HEGUO; et al.

    1994-01-01

    A cDNA encoding the mouse GABA transporter has been isolated and sequenced. The results show that the mouse GABA transporter cDNA differs from that of the rat by 60 base pairs at the open reading frame region but the deduced amino acid sequences of the two cDNAs are identical and both composed of 599 amino acids. However, the amino acid sequence is different from the sequence deduced from a recently published mouse GABA transporter cDNA.

  16. Shotgun DNA sequencing using cloned DNase I-generated fragments.

    OpenAIRE

    Anderson, S

    1981-01-01

    A method for DNA sequencing has been developed that utilises libraries of cloned randomly-fragmented DNA. The DNA to be sequenced is first subjected to limited attack by a non-specific endonuclease (DNase I in the presence of Mn++), fractionated by size and cloned in a single-stranded phage vector. Clones are then picked at random and used to provide a template for sequencing by the dideoxynucleotide chain termination method. This technique was used to sequence completely a 4257 bp EcoRI fragme...

  17. Spatially localized generation of nucleotide sequence-specific DNA damage

    OpenAIRE

    Oh, Dennis H.; King, Brett A.; Boxer, Steven G.; Hanawalt, Philip C.

    2001-01-01

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen–DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for ...

  18. Effects of Sequence on Transmission Properties of DNA Molecules

    Institute of Scientific and Technical Information of China (English)

    DONG Rui-Xin; YAN Xun-Ling; YANG Bing

    2008-01-01

    A double helix model of charge transport in the DNA molecule is given and the transmission spectra of four DNA sequences are obtained. The calculated results show that the transmission characteristics of DNA are related not only to the longitudinal transport but also to the transverse transport of the molecule. The periodic sequence with the same composition has stronger conduction ability. With increasing base composition, the conductive ability decreases, but the weight of the θ direction rises in charge transfer.

  19. An iterative and regenerative method for DNA sequencing.

    Science.gov (United States)

    Jones, D H

    1997-05-01

    This paper presents, to our knowledge, the first iterative DNA sequencing method that regenerates the product of interest during each iterative cycle, allowing it to overcome the critical obstacles that impede alternative iterative approaches to DNA sequencing: loss of product and the accumulation of background signal due to incomplete reactions. It can sequence numerous double-stranded (ds) DNA segments in parallel without gel resolution of DNA fragments and can sequence DNA that is almost entirely double-stranded, preventing the secondary structures that impede sequencing by hybridization. This method uses ligation of an adaptor containing the recognition domain for a class-IIS restriction endonuclease and digestion with a class-IIS restriction endonuclease that recognizes the adaptor's recognition domain. This generates a set of DNA templates that are each composed of a short overhang positioned at a fixed interval with respect to one end of the original dsDNA fragment. Adaptor ligation also appends a unique sequence during each iterative cycle, so that the polymerase chain reaction can be used to regenerate the desired template-precursor before class-IIS restriction endonuclease digestion. Following class-IIS restriction endonuclease digestion, sequencing of a nucleotide in each overhang occurs by template-directed ligation during adaptor ligation or through a separate template-directed polymerization step with labeled ddNTPs. DNA sequencing occurs in strides determined by the number of nucleotides separating the recognition and cleavage domains for the class-IIS restriction endonuclease encoded in the ligated adaptor, maximizing the span of DNA sequenced for a given number of iterative cycles. This method allows the concurrent sequencing of numerous dsDNA segments in a microplate format, and in the future it can be adapted to biochip format. PMID:9149879

  20. Anonymity in science.

    Science.gov (United States)

    Neuroskeptic

    2013-05-01

    The history of science is replete with important works that were originally published without the author's legal name being revealed. Most modern scientists will have worked anonymously in their capacity as peer reviewers. But why is anonymity so popular? And is it a valid approach? I argue that pseudonymity and anonymity, although not appropriate for all forms of scientific communication, have a vital role to play in academic discourse. They can facilitate the free expression of interpretations and ideas, and can help to ensure that suggestions and criticisms are evaluated dispassionately, regardless of their source. PMID:23570959

  1. Multiplexed Sequence Encoding: A Framework for DNA Communication.

    Science.gov (United States)

    Zakeri, Bijan; Carr, Peter A; Lu, Timothy K

    2016-01-01

    Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication (data encoding, data transfer, and data extraction) and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system, Multiplexed Sequence Encoding (MuSE), that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA. PMID:27050646
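
    The abstract above does not specify how iKeys map characters to bases, so the following is only a rough sketch of the general idea of converting plaintext into DNA while avoiding homopolymers. It swaps in a different, well-known technique (a rotating base-3 code in which each digit selects one of the three bases that differ from the previous base). The function names and the `start` parameter are hypothetical and are not taken from the paper.

```python
BASES = "ACGT"

def bytes_to_trits(data: bytes) -> list[int]:
    """Write each byte as six base-3 digits (3**6 = 729 covers 0..255)."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))  # most significant digit first
    return trits

def encode(text: str, start: str = "A") -> str:
    """Map text to DNA so that no two adjacent bases are identical.

    Assumption: `start` is a virtual base preceding the message; it is
    not emitted, it only fixes the first choice of bases.
    """
    prev = start
    out = []
    for trit in bytes_to_trits(text.encode("utf-8")):
        choices = [b for b in BASES if b != prev]  # always three options
        prev = choices[trit]
        out.append(prev)
    return "".join(out)

def decode(dna: str, start: str = "A") -> str:
    """Invert encode(): recover the digits, then the bytes, then the text."""
    prev = start
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    data = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        data.append(value)
    return data.decode("utf-8")
```

    Because every emitted base differs from the one before it, the output cannot contain a run of two or more identical bases. The cost is six bases per byte, which is less dense than a plain two-bit code (four bases per byte).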

  2. DNA splice site sequences clustering method for conservativeness analysis

    Institute of Scientific and Technical Information of China (English)

    Quanwei Zhang; Qinke Peng; Tao Xu

    2009-01-01

    DNA sequences that are near to splice sites have remarkable conservativeness, and many researchers have contributed to the prediction of splice sites. In order to mine the underlying biological knowledge, we analyze the conservativeness of DNA splice site adjacent sequences by clustering. Firstly, we propose a kind of DNA splice site sequences clustering method which is based on DBSCAN, and use four kinds of dissimilarity calculating methods. Then, we analyze the conservative feature of the clustering results and the experimental data set.
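
    As a minimal sketch of density-based clustering of equal-length sequence windows, the code below implements plain DBSCAN with Hamming distance as the dissimilarity. The abstract does not name its four dissimilarity measures, so Hamming distance, the parameter values, and the example windows are all assumptions chosen for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def dbscan(seqs, eps, min_pts, dist=hamming):
    """Label each sequence with a cluster id (0, 1, ...) or -1 for noise."""
    NOISE = -1
    labels = [None] * len(seqs)

    def neighbours(i):
        # A point counts as its own neighbour, as in the usual definition.
        return [j for j in range(len(seqs)) if dist(seqs[i], seqs[j]) <= eps]

    cluster = 0
    for i in range(len(seqs)):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbours(j)
            if len(j_nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(j_nbrs)
        cluster += 1
    return labels

# Hypothetical 9-base windows, not data from the paper.
windows = ["CAGGTAAGT", "CAGGTGAGT", "AAGGTAAGT",
           "TTTCCCAGG", "TTTCTCAGG", "TCTCCCAGG",
           "ACGTACGTA"]
print(dbscan(windows, eps=2, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

    Passing a different function as `dist` is enough to try another dissimilarity measure, since the clustering loop only ever calls `dist` on pairs of sequences.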

  3. DNA sequence analysis with droplet-based microfluidics

    Science.gov (United States)

    Abate, Adam R.; Hung, Tony; Sperling, Ralph A.; Mary, Pascaline; Rotem, Assaf; Agresti, Jeremy J.; Weiner, Michael A.; Weitz, David A.

    2014-01-01

    Droplet-based microfluidic techniques can form and process micrometer scale droplets at thousands per second. Each droplet can house an individual biochemical reaction, allowing millions of reactions to be performed in minutes with small amounts of total reagent. This versatile approach has been used for engineering enzymes, quantifying concentrations of DNA in solution, and screening protein crystallization conditions. Here, we use it to read the sequences of DNA molecules with a FRET-based assay. Using probes of different sequences, we interrogate a target DNA molecule for polymorphisms. With a larger probe set, additional polymorphisms can be interrogated as well as targets of arbitrary sequence. PMID:24185402

  4. Thermodynamics of sequence-specific binding of PNA to DNA

    DEFF Research Database (Denmark)

    Ratilainen, T; Holmén, A; Tuite, E;

    2000-01-01

    For further characterization of the hybridization properties of peptide nucleic acids (PNAs), the thermodynamics of hybridization of mixed sequence PNA-DNA duplexes have been studied. We have characterized the binding of PNA to DNA in terms of binding affinity (perfectly matched duplexes) and seq...

  5. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    DEFF Research Database (Denmark)

    Nielsen, Peter E.

    2008-01-01

    Peptide nucleic acids (PNA) can be designed to target duplex DNA with very high sequence specificity and efficiency via various binding modes. We have designed three domain PNA clamps, that bind stably to predefined decameric homopurine targets in large dsDNA molecules and via a third PNA domain sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technology of protein dsDNA structures. (c) 2008 American Institute of Physics.

  6. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    Science.gov (United States)

    Nielsen, Peter E.

    2008-10-01

    Peptide nucleic acids (PNA) can be designed to target duplex DNA with very high sequence specificity and efficiency via various binding modes. We have designed three domain PNA clamps, that bind stably to predefined decameric homopurine targets in large dsDNA molecules and via a third PNA domain sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technology of protein dsDNA structures.

  7. Current-voltage characteristics of double-strand DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bezerril, L.M.; Moreira, D.A. [Departamento de Fisica, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Albuquerque, E.L., E-mail: eudenilson@dfte.ufrn.b [Departamento de Fisica, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Fulco, U.L. [Departamento de Biofisica e Farmacologia, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Oliveira, E.L. de; Sousa, J.S. de [Departamento de Fisica, Universidade Federal do Ceara, 60455-760, Fortaleza-CE (Brazil)

    2009-09-07

    We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotides distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro ones) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is presented here with the same first neighbors pair correlations of the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.

  8. Characteristics of alternating current hopping conductivity in DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Ma Song-Shan; Xu Hui; Wang Huan-You; Guo Rui

    2009-01-01

    This paper presents a model to describe alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system, and electrons transport via hopping between localized states. It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises, and it takes the form of $\sigma_{ac}(\omega) \sim \omega^{2} \ln^{2}(1/\omega)$. Also, AC conductivity of DNA sequences increases with the increase of temperature; this phenomenon presents characteristics of weak temperature-dependence. Meanwhile, the AC conductivity in an off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit at low temperatures, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences. For p < 0.5, the conductivity of DNA sequence decreases with the increase of p, while for p > 0.5, the conductivity increases with the increase of p.

  9. Nucleotide sequence analysis of regions of adenovirus 5 DNA containing the origins of DNA replication

    International Nuclear Information System (INIS)

    The purpose of the investigations described is the determination of nucleotide sequences at the molecular ends of the linear adenovirus type 5 DNA. Knowledge of the primary structure at the termini of this DNA molecule is of particular interest in the study of the mechanism of replication of adenovirus DNA. The initiation- and termination sites of adenovirus DNA replication are located at the ends of the DNA molecule. (Auth.)

  10. Spectroscopic investigation on the telomeric DNA base sequence repeat

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Telomeres are protein-DNA complexes at the terminals of linear chromosomes, which protect chromosomal integrity and maintain cellular replicative capacity. From single-cell organisms to advanced animals and plants, the structures and functions of telomeres are both highly conserved. In cells of humans and vertebrate animals, telomeric DNA base sequences are all (TTAGGG)n. In the present work, we have obtained absorption and fluorescence spectra measured from seven synthesized oligonucleotides to simulate the telomeric DNA system and calculated their relative fluorescence quantum yields on which not only telomeric DNA characteristics are predicted but also possibly the shortened telomeric sequences during cell division are im... relative fluorescence quantum yield and remarkable excitation energy internal conversion, which tallies with the telomeric sequence of (TTAGGG)n. This result shows that telomeric DNA has a strong non-radiative or internal-conversion capability.

  11. What Advances Are Being Made in DNA Sequencing?

    Science.gov (United States)

    ... of DNA sequencing , including that caused by the introduction of new technologies, is provided by the National ... Library of Medicine Lister Hill National Center for Biomedical Communications 8600 Rockville Pike, Bethesda, MD 20894, USA ...

  12. Pyrimidine-specific chemical reactions useful for DNA sequencing.

    OpenAIRE

    Rubin, C M; Schmid, C. W.

    1980-01-01

    Potassium permanganate reacts selectively with thymidine residues in DNA (1) while hydroxylamine hydrochloride at pH 6 specifically attacks cytosine (2). We have adopted these reactions for use with the chemical sequencing method developed by Maxam and Gilbert (3).

  13. ATRF Houses the Latest DNA Sequencing Technologies | Poster

    Science.gov (United States)

    By Ashley DeVine, Staff Writer By the end of October, the Advanced Technology Research Facility (ATRF) will be one of the few facilities in the world to house all of the latest DNA sequencing technologies.

  14. Inferring ethnicity from mitochondrial DNA sequence

    OpenAIRE

    Lee, Chih; Măndoiu, Ion I; Nelson, Craig E.

    2011-01-01

    Background The assignment of DNA samples to coarse population groups can be a useful but difficult task. One such example is the inference of coarse ethnic groupings for forensic applications. Ethnicity plays an important role in forensic investigation and can be inferred with the help of genetic markers. Being maternally inherited, of high copy number, and robust persistence in degraded samples, mitochondrial DNA may be useful for inferring coarse ethnicity. In this study, we compare the per...

  15. Statistical methods for detecting periodic fragments in DNA sequence data

    OpenAIRE

    Ying Hua; Epps Julien; Huttley Gavin A

    2011-01-01

    Background: Period 10 dinucleotides are structurally and functionally validated factors that influence the ability of DNA to form nucleosomes, histone core octamers. Robust identification of periodic signals in DNA sequences is therefore required to understand nucleosome organisation in genomes. While various techniques for identifying periodic components in genomic sequences have been proposed or adopted, the requirements for such techniques have not been considered in detail and con...

  16. Which Are More Random: Coding or Noncoding DNA Sequences?

    Institute of Scientific and Technical Information of China (English)

    WU Fang; ZHENG Wei-Mou

    2002-01-01

    Evidence seems to show that coding DNA is more random than noncoding DNA, but other conflicting evidence also exists. Based on the third-base degeneracy of codons, we regard the third position of codons as a 'noisy' position. By deleting one fixed position of non-overlapping triplets in a given sequence, three masked sequences may be deduced from the sequence. We have investigated the block-to-site mutual information functions of coding and noncoding sequences in yeast without and with the masking. Characteristics that distinguish coding from noncoding DNA have been found. It is observed that the strong correlations in the coding regions may be blocked by the third base of codons, and the proper masking can extract the correlations. Distribution of dimeric tandem repeats of unmasked sequences is also compared with that of masked sequences.
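
    The masking step described above (deleting one fixed position from every non-overlapping triplet) can be sketched as follows. The function name is hypothetical, and two details are assumptions the abstract does not state: the triplets are taken to start at index 0, and a trailing partial triplet is dropped.

```python
def masked_sequences(seq: str) -> list[str]:
    """Return three masked sequences, one per deleted triplet position.

    Element p of the result is `seq` with position p (0, 1 or 2) removed
    from every non-overlapping triplet.
    """
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial triplet
    triplets = [seq[i:i + 3] for i in range(0, usable, 3)]
    return ["".join(t[:p] + t[p + 1:] for t in triplets) for p in range(3)]

print(masked_sequences("ATGGCCTAA"))  # ['TGCCAA', 'AGGCTA', 'ATGCTA']
```

    For a coding sequence read in frame, the third element is the one with the codon third positions removed; for sequences of unknown frame, all three maskings would have to be compared.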

  17. Repetitive DNA Sequences in Wheat and Its Relatives

    Institute of Scientific and Technical Information of China (English)

    ZHANG Xue-yong; LI Da-yong

    2001-01-01

    Repetitive DNA sequences form a large portion of eukaryote genomes. Using wheat (Triticum) as a model, the classification, features and functions of repetitive DNA sequences in the Triticeae grass tribe are reviewed, as well as the role of these sequences in genome differentiation, control and regulation of homologous chromosome synapsis and pairing. Transposable elements, as an important portion of dispersed repetitive sequences, may play an essential role in gene mutation of the host. Dynamic models for change of copy number and sequences of the repetitive family are also presented after the models of Charlesworth et al. Application of repetitive DNA sequences in the study of evolution, chromosome fingerprinting and marker assisted gene transfer and breeding are described by taking wheat as an example.

  18. Discovering simple DNA sequences by the algorithmic significance method.

    Science.gov (United States)

    Milosavljević, A; Jurka, J

    1993-08-01

    A new method, 'algorithmic significance', is proposed as a tool for discovery of patterns in DNA sequences. The main idea is that patterns can be discovered by finding ways to encode the observed data concisely. In this sense, the method can be viewed as a formal version of the Occam's Razor principle. In this paper the method is applied to discover significantly simple DNA sequences. We define DNA sequences to be simple if they contain repeated occurrences of certain 'words' and thus can be encoded in a small number of bits. Such definition includes minisatellites and microsatellites. A standard dynamic programming algorithm for data compression is applied to compute the minimal encoding lengths of sequences in linear time. An electronic mail server for identification of simple sequences based on the proposed method has been installed at the Internet address pythia/anl.gov. PMID:8402207

  19. PREDICTION OF CHROMATIN STATES USING DNA SEQUENCE PROPERTIES

    KAUST Repository

    Bahabri, Rihab R.

    2013-06-01

    Activities of DNA are to a great extent controlled epigenetically through the internal structure of chromatin. This structure is dynamic and is influenced by different modifications of histone proteins. Various combinations of epigenetic modification of histones point to different functional regions of the DNA determining the so-called chromatin states. However, the characterization of chromatin states by the DNA sequence properties remains largely unknown. In this study we aim to explore whether DNA sequence patterns in the human genome can characterize different chromatin states. Using DNA sequence motifs we built binary classifiers for each chromatin state to evaluate whether a given genomic sequence is a good candidate for belonging to a particular chromatin state. Of four classification algorithms (C4.5, Naive Bayes, Random Forest, and SVM) used for this purpose, the decision tree based classifiers (C4.5 and Random Forest) yielded best results among those we evaluated. Our results suggest that in general these models lack sufficient predictive power, although for four chromatin states (insulators, heterochromatin, and two types of copy number variation) we found that presence of certain motifs in DNA sequences does imply an increased probability that such a sequence is one of these chromatin states.

  20. Protein sequence for clustering DNA based on Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Gamal F. Elhadi

    2012-01-01

    DNA is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms and some viruses. Clustering is a process that groups a set of objects into clusters so that the similarity among objects in the same cluster is high, while that among the objects in different clusters is low. In this paper, we propose an approach for clustering DNA sequences using the Self-Organizing Map (SOM) algorithm and protein sequence. The main objective is to analyze biological data and to group DNA into clusters more easily and efficiently. We use the proposed approach to analyze both large and small amounts of input DNA sequences. The results show that the similarity of the sequences does not depend on the amount of input sequences. Our approach depends on evaluating the degree of DNA sequence similarity using the hierarchical dendrogram representation. Representing a large amount of data using a hierarchical tree gives the ability to compare large sequences efficiently.

  1. Anonymous Mobile Payment Solution

    Directory of Open Access Journals (Sweden)

    Alhaj Ali Jalila

    2015-09-01

    The evolution and increasing popularity of mobile handheld devices has led to the development of payment applications. The global acceptance of mobile payments is hindered by security and privacy concerns. One of the main problems evoked is the anonymity related with banking transactions. In this paper I propose a new secured architecture for mobile banking. Anonymity and privacy protection are the measures to be enhanced in order to satisfy people’s current needs. The banking platform must provide the highest level of security for messages exchanged between bank and the customer.

  2. Fluorescent signatures for variable DNA sequences

    OpenAIRE

    Rice, John E; Reis, Arthur H.; Rice, Lisa M.; Carver-Brown, Rachel K.; Wangh, Lawrence J.

    2012-01-01

    Life abounds with genetic variations writ in sequences that are often only a few hundred nucleotides long. Rapid detection of these variations for identification of genetic diseases, pathogens and organisms has become the mainstay of molecular science and medicine. This report describes a new, highly informative closed-tube polymerase chain reaction (PCR) strategy for analysis of both known and unknown sequence variations. It combines efficient quantitative amplification of single-stranded DN...

  3. Algorithms for mapping high-throughput DNA sequences

    DEFF Research Database (Denmark)

    Frellsen, Jes; Menzel, Peter; Krogh, Anders

    2014-01-01

    High-throughput sequencing (HTS) technologies revolutionized the field of molecular biology by enabling large scale whole genome sequencing as well as a broad range of experiments for studying the cell's inner workings directly on DNA or RNA level. Given the dramatically increased rate o...

  4. An integer programming approach to DNA sequence assembly.

    Science.gov (United States)

    Chang, Youngjung; Sahinidis, Nikolaos V

    2011-08-10

    De novo sequence assembly is a ubiquitous combinatorial problem in all DNA sequencing technologies. In the presence of errors in the experimental data, the assembly problem is computationally challenging, and its solution may not lead to a unique reconstruct. The enumeration of all alternative solutions is important in drawing a reliable conclusion on the target sequence, and is often overlooked in the heuristic approaches that are currently available. In this paper, we develop an integer programming formulation and global optimization solution strategy to solve the sequence assembly problem with errors in the data. We also propose an efficient technique to identify all alternative reconstructs. When applied to examples of sequencing-by-hybridization, our approach dramatically increases the length of DNA sequences that can be handled with global optimality certificate to over 10,000, which is more than 10 times longer than previously reported. For some problem instances, alternative solutions exhibited a wide range of different ability in reproducing the target DNA sequence. Therefore, it is important to utilize the methodology proposed in this paper in order to obtain all alternative solutions to reliably infer the true reconstruct. These alternative solutions can be used to refine the obtained results and guide the design of further experiments to correctly reconstruct the target DNA sequence. PMID:21864794
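
    To illustrate why a sequencing-by-hybridization spectrum may admit several alternative reconstructs, the sketch below enumerates every sequence whose k-mer multiset matches a given error-free spectrum. It uses brute-force backtracking, which is a deliberately simple stand-in and not the integer programming formulation of the paper; it ignores experimental errors and is only practical for very short sequences. All function names are hypothetical.

```python
from collections import Counter

def spectrum(seq: str, k: int) -> Counter:
    """Multiset of all overlapping k-mers in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def reconstructions(spec: Counter, k: int) -> list[str]:
    """All sequences that use every k-mer in spec exactly as often as given.

    Assumes k >= 2 and an error-free spectrum.
    """
    total = sum(spec.values())
    results = set()

    def extend(path: str, remaining: Counter) -> None:
        if len(path) - k + 1 == total:      # every k-mer has been placed
            results.add(path)
            return
        suffix = path[-(k - 1):]
        for base in "ACGT":
            kmer = suffix + base
            if remaining[kmer] > 0:         # missing keys count as zero
                remaining[kmer] -= 1
                extend(path + base, remaining)
                remaining[kmer] += 1        # undo the choice (backtrack)

    for start in list(spec):                # try each k-mer as the first one
        remaining = Counter(spec)
        remaining[start] -= 1
        extend(start, remaining)
    return sorted(results)

target = "ACGCA"
candidates = reconstructions(spectrum(target, 2), 2)
print(target in candidates, len(candidates) > 1)  # True True
```

    In this toy case the spectrum {AC, CG, GC, CA} is also satisfied by "CGCAC", so the target cannot be identified from the spectrum alone. This is the kind of ambiguity that motivates listing all alternative solutions.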

  5. A motif-independent metric for DNA sequence specificity

    Directory of Open Access Journals (Sweden)

    Pinello Luca

    2011-10-01

    Background: Genome-wide mapping of protein-DNA interactions has been widely used to investigate biological functions of the genome. An important question is to what extent such interactions are regulated at the DNA sequence level. However, current investigation is hampered by the lack of computational methods for systematically evaluating sequence specificity. Results: We present a simple, unbiased quantitative measure for DNA sequence specificity called the Motif Independent Measure (MIM). By analyzing both simulated and real experimental data, we found that the MIM measure can be used to detect sequence specificity independent of the presence of transcription factor (TF) binding motifs. We also found that the level of specificity associated with H3K4me1 target sequences is highly cell-type specific and highest in embryonic stem (ES) cells. We predicted H3K4me1 target sequences by using the N-score model and found that the prediction accuracy is indeed high in ES cells. The software to compute the MIM is freely available at: https://github.com/lucapinello/mim. Conclusions: Our method provides a unified framework for quantifying DNA sequence specificity and serves as a guide for development of sequence-based prediction models.

  6. Mitochondrial DNA sequence variation in single cells from leukemia patients

    OpenAIRE

    Yao, Yong-Gang; Ogasawara, Yoji; Kajigaya, Sachiko; Molldrem, Jeffrey J.; Falcão, Roberto P; Pintão, Maria-Carolina; McCoy, J. Philip; Rizzatti, Edgar Gil; Young, Neal S

    2007-01-01

    A high frequency of mtDNA somatic mutation has been observed in many tumors as well as in aging tissues. In this study, we analyzed the mtDNA control region sequence variation in 3534 single normal cells and individual blasts from 18 patients with leukemia and 10 healthy donors, to address the mutation process in leukemic cells. We found significant differences in mtDNA sequence, as represented by the number of haplotypes and the mean number of cells with each nonaggregate haplotype in a popu...

  7. Selective binding of anti-DNA antibodies to native dsDNA fragments of differing sequence.

    Science.gov (United States)

    Uccellini, Melissa B; Busto, Patricia; Debatis, Michelle; Marshak-Rothstein, Ann; Viglianti, Gregory A

    2012-03-30

    Systemic autoimmune diseases are characterized by the development of autoantibodies directed against a limited subset of nuclear antigens, including DNA. DNA-specific B cells take up mammalian DNA through their B cell receptor, and this DNA is subsequently transported to an endosomal compartment where it can potentially engage TLR9. We have previously shown that ssDNA-specific B cells preferentially bind to particular DNA sequences, and antibody specificity for short synthetic oligodeoxynucleotides (ODNs). Since CpG-rich DNA, the ligand for TLR9, is found in low abundance in mammalian DNA, we sought to determine whether antibodies derived from DNA-reactive B cells showed binding preference for CpG-rich native dsDNA, and thereby select immunostimulatory DNA for delivery to TLR9. We examined a panel of anti-DNA antibodies for binding to CpG-rich and CpG-poor DNA fragments. We show that a number of anti-DNA antibodies do show preference for binding to certain native dsDNA fragments of differing sequence, but this does not correlate directly with the presence of CpG dinucleotides. An antibody with preference for binding to a fragment containing optimal CpG motifs was able to promote B cell proliferation to this fragment at 10-fold lower antibody concentrations than an antibody that did not selectively bind to this fragment, indicating that antibody binding preference can influence autoreactive B cell responses.

  8. Electronic Transport and Thermopower in Aperiodic DNA Sequences

    Science.gov (United States)

    Roche, Stephan; Maciá, Enrique

    A detailed study of charge transport properties of synthetic and genomic DNA sequences is reported. Genomic sequences of the Chromosome 22, λ-bacteriophage, and D1s80 genes of Human and Pygmy chimpanzee are considered in this work, and compared with both periodic and quasiperiodic (Fibonacci) sequences of nucleotides. Charge transfer efficiency is compared for all these different sequences, and large variations in charge transfer efficiency, stemming from sequence-dependent effects, are reported. In addition, basic characteristics of tunneling currents, including contact effects, are described. Finally, the thermoelectric power of nucleobases connected in between metallic contacts at different temperatures is presented.

  9. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian eWu

    2012-11-01

    Full Text Available Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the chloroplast genome.

  10. cDNA cloning and sequencing of ostrich Growth hormone

    Directory of Open Access Journals (Sweden)

    Doosti Abbas

    2012-01-01

    Full Text Available In recent years, industrial breeding of ostrich (Struthio camelus) has been widely developed in Iran. Growth hormone (GH) is a peptide hormone that stimulates growth and cell reproduction in different animals. The aim of this study was to clone and sequence the ostrich growth hormone gene in E. coli, done for the first time in Iran. The cDNA that encodes ostrich growth hormone was isolated from total mRNA of the pituitary gland and amplified by RT-PCR using GH-specific PCR primers. The GH cDNA was then cloned by the T/A cloning technique and the construct was transformed into E. coli. Finally, the GH cDNA sequence was submitted to GenBank (Accession number: JN559394). The results of the present study showed that GH cDNA was successfully cloned in E. coli. Sequencing confirmed that GH cDNA was cloned and that the length of ostrich GH cDNA was 672 bp; a BLAST search showed that the sequence of growth hormone cDNA of the ostrich from Iran has 100% homology with other records existing in GenBank.

  11. DNA sequence of the yeast transketolase gene.

    Science.gov (United States)

    Fletcher, T S; Kwee, I L; Nakada, T; Largman, C; Martin, B M

    1992-02-18

    Transketolase (EC 2.2.1.1) is the enzyme that, together with aldolase, forms a reversible link between the glycolytic and pentose phosphate pathways. We have cloned and sequenced the transketolase gene from yeast (Saccharomyces cerevisiae). This is the first transketolase gene of the pentose phosphate shunt to be sequenced from any source. The molecular mass of the proposed translated protein is 73,976 daltons, in good agreement with the observed molecular mass of about 75,000 daltons. The 5'-nontranslated region of the gene is similar to other yeast genes. There is no evidence of 5'-splice junctions or branch points in the sequence. The 3'-nontranslated region contains the polyadenylation signal (AATAAA), 80 base pairs downstream from the termination codon. A high degree of homology is found between yeast transketolase and dihydroxyacetone synthase (formaldehyde transketolase) from the yeast Hansenula polymorpha. The overall sequence identity between these two proteins is 37%, with four regions of much greater similarity. The regions from amino acid residues 98-131, 157-182, 410-433, and 474-489 have sequence identities of 74%, 66%, 83%, and 82%, respectively. One of these regions (157-182) includes a possible thiamin pyrophosphate (TPP) binding domain, and another (410-433) may contain the catalytic domain. PMID:1737042

  12. Apple II software for M13 shotgun DNA sequencing.

    OpenAIRE

    Larson, R; Messing, J

    1982-01-01

    A set of programs is presented for the reconstruction of a DNA sequence from data generated by the M13 shotgun sequencing technique. Once the sequence has been established and stored other programs are used for its analysis. The programs have been written for the Apple II microcomputer. A minimum investment is required for the hardware and the software is easily interchangeable between the growing number of interested researchers. Copies are available in ready to use form.

  13. Nanopore-based Fourth-generation DNA Sequencing Technology

    Institute of Scientific and Technical Information of China (English)

    Yanxiao Feng; Yuechuan Zhang; Cuifeng Ying; Deqiang Wang; Chunlei Du

    2015-01-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications.

  14. Characterization of a DNA sequence family in the Prader-Willi/Angelman syndrome chromosome region in 15q11-q13

    Energy Technology Data Exchange (ETDEWEB)

    Dittrich, B.; Knoblauch, H.; Buiting, K.; Horsthemke, B. (Universitaetsklinikum Essen (Germany))

    1993-04-01

    IR4-3R (D15S11) is an anonymous DNA sequence from human chromosome 15. Using YAC cloning and restriction enzyme analysis, the authors have found that IR4-3R detects five related DNA sequences, which are spread over 700 kb within the Prader-Willi/Angelman syndrome chromosome region in 15q11-q13. The RsaI and StyI polymorphisms, which were described previously, are associated with the most proximal copy of IR4-3R and are in strong linkage disequilibrium. IR4-3R represents the third DNA sequence family that has been identified in 15q11-q13. 14 refs., 2 figs., 1 tab.

  15. Mitochondrial DNA sequence analysis of two mouse hepatocarcinoma cell lines

    Institute of Scientific and Technical Information of China (English)

    Ji-Gang Dai; Xia Lei; Jia-Xin Min; Guo-Qiang Zhang; Hong Wei

    2005-01-01

    AIM: To study genetic difference of mitochondrial DNA (mtDNA) between two hepatocarcinoma cell lines (Hca-F and Hca-P) with diverse metastatic characteristics and the relationship between mtDNA changes in cancer cells and their oncogenic phenotype. METHODS: Mitochondrial DNA D-loop, tRNAMet+Glu+Ile and ND3 gene fragments from the hepatocarcinoma cell lines with 1100, 1126 and 534 bp in length respectively were analysed by PCR amplification and restriction fragment length polymorphism techniques. The D-loop 3' end sequence of the hepatocarcinoma cell lines was determined by sequencing. RESULTS: No amplification fragment length polymorphism and restriction fragment length polymorphism were observed in tRNAMet+Glu+Ile, ND3 and D-loop of mitochondrial DNA of the hepatocarcinoma cells. Sequence differences between Hca-F and Hca-P were found in mtDNA D-loop. CONCLUSION: Deletion mutations of mitochondrial DNA restriction fragment may not play a significant role in carcinogenesis. Genetic difference of mtDNA D-loop between Hca-F and Hca-P, which may reflect the environmental and genetic influences during tumor progression, could be linked to their tumorigenic phenotypes.

  16. Bayesian classification for promoter prediction in human DNA sequences

    Science.gov (United States)

    Bercher, J.-F.; Jardin, P.; Duriez, B.

    2006-11-01

    Many computational methods are already available for data retrieval and analysis of genomic sequences, but some functional sites are difficult to characterize. In this work, we examine the problem of promoter localization in human DNA sequences. Promoters are regulatory regions that govern the expression of genes, and their prediction is reputedly difficult, so this issue is still open. We present the Chaos Game Representation (CGR) of DNA sequences, which has many interesting properties, and the notion of `genomic signature', which has proved relevant in phylogeny applications. Based on this notion, we develop a (naïve) Bayesian classifier, evaluate its performance, and show that its adaptive implementation makes it possible to reveal or assess core-promoter positions along a DNA sequence.

  17. MetaGeneAnnotator: Detecting Species-Specific Patterns of Ribosomal Binding Site for Precise Gene Prediction in Anonymous Prokaryotic and Phage Genomes

    OpenAIRE

    Noguchi, Hideki; Taniguchi, Takeaki; Itoh, Takehiko

    2008-01-01

    Recent advances in DNA sequencers are accelerating genome sequencing, especially in microbes, and complete and draft genomes from various species have been sequenced in rapid succession. Here, we present a comprehensive gene prediction tool, the MetaGeneAnnotator (MGA), which precisely predicts all kinds of prokaryotic genes from a single or a set of anonymous genomic sequences having a variety of lengths. The MGA integrates statistical models of prophage genes, in addition to those of bacter...

  18. LONG-RANGE CORRELATIONS IN DNA SEQUENCES USING TWO-DIMENSIONAL DNA WALKS

    Institute of Scientific and Technical Information of China (English)

    Jin Chen; Lin-xi Zhang; De-lu Zhao

    2005-01-01

    The characterization of long-range correlations and fractal properties of DNA sequences has proved to be a difficult though rewarding task, mainly due to the mosaic character of DNA, which consists of many patches of various lengths with different nucleotide constitutions. In this paper we investigate statistical correlations among different positions in DNA sequences using the two-dimensional DNA walk. The root-mean-square fluctuation F(l) is described by a power law. The autocorrelation function C(l), which is used to measure the linear dependence and periodicity, follows a power law C(l) ~ l^(-μ). We also calculate the mean-square distance <R^2(l)> along the DNA chain, and it may be expressed as <R^2(l)> ~ l^γ with 2 > γ > 1. Our investigations can provide some insights into long-range correlations in DNA sequences.
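
    A minimal sketch of a two-dimensional DNA walk and of the mean-square distance over l steps, as discussed in the abstract above. The step assignment (A/T along x, C/G along y) and the function names are illustrative assumptions; the paper's exact mapping is not given in this record.

```python
# Assumed step rule: A and T move along x, C and G move along y.
# Other assignments are used in the literature; this one is only illustrative.
STEPS = {"A": (-1, 0), "T": (1, 0), "C": (0, -1), "G": (0, 1)}


def walk_2d(seq):
    """Return the list of (x, y) positions visited, starting at the origin."""
    x, y = 0, 0
    path = [(0, 0)]
    for base in seq.upper():
        dx, dy = STEPS[base]
        x += dx
        y += dy
        path.append((x, y))
    return path


def mean_square_distance(path, l):
    """Average squared displacement over all windows of l steps."""
    n_windows = len(path) - l
    if n_windows <= 0:
        raise ValueError("l must be smaller than the number of positions")
    total = 0
    for i in range(n_windows):
        dx = path[i + l][0] - path[i][0]
        dy = path[i + l][1] - path[i][1]
        total += dx * dx + dy * dy
    return total / n_windows
```

    For orientation, a perfectly straight walk gives a mean-square distance of l^2, while an uncorrelated random walk gives a value proportional to l; an exponent between 1 and 2 lies between these two limits.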

  19. Assurances of past donor anonymity are meaningless

    OpenAIRE

    Blyth, Eric

    2005-01-01

    The New Scientist recently recounted the story of an American teenager conceived through ostensibly anonymous donor insemination who had been able to identify his donor through DNA testing and an internet genetic database service (also see BioNews issue 333, at http://www.bionews.org.uk/new.lasso?storyid=2808). In fact, we have known since Barry Stevens' remarkable documentary, Offspring, released in 2001, that with some genetic background information, access to DNA testing and the intern...

  20. Probabilistic models for semisupervised discriminative motif discovery in DNA sequences.

    Science.gov (United States)

    Kim, Jong Kyoung; Choi, Seungjin

    2011-01-01

    Methods for discriminative motif discovery in DNA sequences identify transcription factor binding sites (TFBSs), searching only for patterns that differentiate two sets (positive and negative sets) of sequences. On one hand, discriminative methods increase the sensitivity and specificity of motif discovery, compared to generative models. On the other hand, generative models can easily exploit unlabeled sequences to better detect functional motifs when labeled training samples are limited. In this paper, we develop a hybrid generative/discriminative model which enables us to make use of unlabeled sequences in the framework of discriminative motif discovery, leading to semisupervised discriminative motif discovery. Numerical experiments on yeast ChIP-chip data for discovering DNA motifs demonstrate that the best performance is obtained between the purely generative and the purely discriminative extremes, and that semisupervised learning improves performance when labeled sequences are limited.

  1. Chaos game representation (CGR)-walk model for DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Gao Jie; Xu Zhen-Yuan

    2009-01-01

    Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to determine the coordinates of their positions in a continuous space. This distribution of positions has two features: it is unique, and the source sequence can be recovered from the coordinates, so that the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model is proposed based on CGR coordinates for the DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA (p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into the DNA sequence analysis. This model is applied to simulating real CGR-walk sequence data of ten genomic sequences. Remarkably long-range correlations are uncovered in the data, and the results from these models are reasonably fitted with those from the ARFIMA (p, d, q) model.
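
    The iterative mapping described above can be sketched as follows. This is a minimal illustration under assumptions: the nucleotides are placed at the corners of the unit square (the corner assignment below is one possible convention, not taken from the paper), the walk starts at the centre, and each step moves halfway toward the corner of the current nucleotide. The function names are hypothetical.

```python
# Assumed corner assignment for the unit square; conventions differ.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}
CORNER_TO_BASE = {corner: base for base, corner in CORNERS.items()}


def cgr_coordinates(seq, start=(0.5, 0.5)):
    """Map each nucleotide to a point by moving halfway to its corner."""
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_decode(point, length):
    """Recover the last `length` nucleotides from a single CGR point.

    The quadrant of the point identifies the most recent nucleotide;
    undoing the halving step exposes the one before it, and so on.
    Exact for short sequences; floating-point precision limits the length.
    """
    x, y = point
    bases = []
    for _ in range(length):
        cx = 1.0 if x > 0.5 else 0.0
        cy = 1.0 if y > 0.5 else 0.0
        bases.append(CORNER_TO_BASE[(cx, cy)])
        x, y = 2.0 * x - cx, 2.0 * y - cy
    return "".join(reversed(bases))
```

    Decoding the final point of a short sequence returns the sequence itself, which illustrates the recoverability property mentioned in the abstract.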

  2. Hiding message into DNA sequence through DNA coding and chaotic maps.

    Science.gov (United States)

    Liu, Guoyan; Liu, Hongjun; Kadir, Abdurahman

    2014-09-01

    The paper proposes an improved reversible substitution method to hide data in a deoxyribonucleic acid (DNA) sequence. Four measures are taken to enhance the robustness and enlarge the hiding capacity: encoding the secret message by DNA coding, encrypting it with a pseudo-random sequence, generating the relative hiding locations with a piecewise linear chaotic map, and embedding the encoded and encrypted message into a randomly selected DNA sequence using the complementary rule. The key space and the hiding capacity are analyzed. Experimental results indicate that the proposed method performs better than the competing methods with respect to robustness and capacity. PMID:25023893
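
    As a rough illustration of the DNA coding step only, the sketch below maps each pair of bits to one nucleotide and back. The particular rule (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for this example; the record does not state which coding rule the paper uses, and the encryption, chaotic-map and embedding steps are not shown.

```python
# Assumed 2-bit coding rule; several such rules exist.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_dna(message: bytes) -> str:
    """Encode bytes as nucleotides, four nucleotides per byte."""
    bits = "".join(f"{byte:08b}" for byte in message)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_dna(dna: str) -> bytes:
    """Invert encode_dna; the input length must be a multiple of four."""
    if len(dna) % 4 != 0:
        raise ValueError("DNA string length must be a multiple of 4")
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

    With this rule each byte becomes four nucleotides, so the coding itself is lossless and reversible.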

  3. How effective is graphene nanopore geometry on DNA sequencing?

    CERN Document Server

    Satarifard, Vahid; Ejtehadi, Mohammad Reza

    2015-01-01

    In this paper we investigate the effects of graphene nanopore geometry on homopolymer ssDNA pulling process through nanopore using steered molecular dynamic (SMD) simulations. Different graphene nanopores are examined including axially symmetric and asymmetric monolayer graphene nanopores as well as five layer graphene polyhedral crystals (GPC). The pulling force profile, moving fashion of ssDNA, work done in irreversible DNA pulling and orientations of DNA bases near the nanopore are assessed. Simulation results demonstrate the strong effect of the pore shape as well as geometrical symmetry on free energy barrier, orientations and dynamic of DNA translocation through graphene nanopore. Our study proposes that the symmetric circular geometry of monolayer graphene nanopore with high pulling velocity can be used for DNA sequencing.

  4. Mitochondrial DNA Sequence Analysis - Validation and Use for Forensic Casework.

    Science.gov (United States)

    Holland, M M; Parsons, T J

    1999-06-01

    With the discovery of the polymerase chain reaction (PCR) in the mid-1980's, the last in a series of critical molecular biology techniques (to include the isolation of DNA from human and non-human biological material, and primary sequence analysis of DNA) had been developed to rapidly analyze minute quantities of mitochondrial DNA (mtDNA). This was especially true for mtDNA isolated from challenged sources, such as ancient or aged skeletal material and hair shafts. One of the beneficiaries of this work has been the forensic community. Over the last decade, a significant amount of research has been conducted to develop PCR-based sequencing assays for the mtDNA control region (CR), which have subsequently been used to further characterize the CR. As a result, the reliability of these assays has been investigated, the limitations of the procedures have been determined, and critical aspects of the analysis process have been identified, so that careful control and monitoring will provide the basis for reliable testing. With the application of these assays to forensic identification casework, mtDNA sequence analysis has been properly validated, and is a reliable procedure for the examination of biological evidence encountered in forensic criminalistic cases. PMID:26255820

  5. Management of High-Throughput DNA Sequencing Projects: Alpheus.

    Science.gov (United States)

    Miller, Neil A; Kingsmore, Stephen F; Farmer, Andrew; Langley, Raymond J; Mudge, Joann; Crow, John A; Gonzalez, Alvaro J; Schilkey, Faye D; Kim, Ryan J; van Velkinburgh, Jennifer; May, Gregory D; Black, C Forrest; Myers, M Kathy; Utsey, John P; Frost, Nicholas S; Sugarbaker, David J; Bueno, Raphael; Gullans, Stephen R; Baxter, Susan M; Day, Steve W; Retzel, Ernest F

    2008-12-26

    High-throughput DNA sequencing has enabled systems biology to begin to address areas in health, agricultural and basic biological research. Concomitant with the opportunities is an absolute necessity to manage significant volumes of high-dimensional and inter-related data and analysis. Alpheus is an analysis pipeline, database and visualization software for use with massively parallel DNA sequencing technologies that feature multi-gigabase throughput characterized by relatively short reads, such as Illumina-Solexa (sequencing-by-synthesis), Roche-454 (pyrosequencing) and Applied Biosystem's SOLiD (sequencing-by-ligation). Alpheus enables alignment to reference sequence(s), detection of variants and enumeration of sequence abundance, including expression levels in transcriptome sequence. Alpheus is able to detect several types of variants, including non-synonymous and synonymous single nucleotide polymorphisms (SNPs), insertions/deletions (indels), premature stop codons, and splice isoforms. Variant detection is aided by the ability to filter variant calls based on consistency, expected allele frequency, sequence quality, coverage, and variant type in order to minimize false positives while maximizing the identification of true positives. Alpheus also enables comparisons of genes with variants between cases and controls or bulk segregant pools. Sequence-based differential expression comparisons can be developed, with data export to SAS JMP Genomics for statistical analysis. PMID:20151039

  6. Nonlinear Aspects of Coding and Noncoding DNA Sequences

    Science.gov (United States)

    Stanley, H. Eugene

    2001-03-01

    One of the most remarkable features of human DNA is that 97 percent is not coding for proteins. Studying this noncoding DNA is important both for practical reasons (to distinguish it from the coding DNA as the human genome is sequenced), and for scientific reasons (why is the noncoding DNA present at all, if it appears to have little if any purpose?). In this talk we discuss new methods of analyzing coding and noncoding DNA in parallel, with a view to uncovering different statistical properties of the two kinds of DNA. We also speculate on possible roles of noncoding DNA. The work reported here was carried out primarily by P. Bernaola-Galvan, S. V. Buldyrev, P. Carpena, N. Dokholyan, A. L. Goldberger, I. Grosse, S. Havlin, H. Herzel, J. L. Oliver, C.-K. Peng, M. Simons, H. E. Stanley, R. H. R. Stanley, and G. M. Viswanathan. [1] For a brief overview in language that physicists can understand, see H. E. Stanley, S. V. Buldyrev, A. L. Goldberger, S. Havlin, C.-K. Peng, and M. Simons, "Scaling Features of Noncoding DNA" [Proc. XII Max Born Symposium, Wroclaw], Physica A 273, 1-18 (1999). [2] I. Grosse, H. Herzel, S. V. Buldyrev, and H. E. Stanley, "Species Independence of Mutual Information in Coding and Noncoding DNA," Phys. Rev. E 61, 5624-5629 (2000). [3] P. Bernaola-Galvan, I. Grosse, P. Carpena, J. L. Oliver, and H. E. Stanley, "Identification of DNA Coding Regions Using an Entropic Segmentation Method," Phys. Rev. Lett. 84, 1342-1345 (2000). [4] N. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Distributions of Dimeric Tandem Repeats in Non-coding and Coding DNA Sequences," J. Theor. Biol. 202, 273-282 (2000). [5] R. H. R. Stanley, N. V. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Clumping of Identical Oligonucleotides in Coding and Noncoding DNA Sequences," J. Biomol. Structure and Design 17, 79-87 (1999). [6] N. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Distribution of Base Pair Repeats in Coding and Noncoding DNA

  7. Sequence tagged microsatellite profiling (STMP): improved isolation of DNA sequence flanking target SSRs.

    Science.gov (United States)

    Hayden, M J; Good, G; Sharp, P J

    2002-12-01

    Sequence tagged microsatellite profiling (STMP) enables the rapid development of large numbers of co-dominant DNA markers, known as sequence tagged microsatellites (STMs). Each STM is amplified by PCR using a single primer specific to the conserved DNA sequence flanking the microsatellite repeat in combination with a universal primer that anchors to the 5'-ends of the microsatellites. It is also possible to convert STMs into conventional microsatellite, or simple sequence repeat (SSR), markers that are amplified using a pair of primers flanking the repeat sequence. Here, we describe a modification of the STMP procedure to significantly improve the capacity to convert STMs into conventional SSRs and, therefore, facilitate the development of highly specific DNA markers for purposes such as marker-assisted breeding. The usefulness of this technique was demonstrated in bread wheat. PMID:12466561

  8. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Directory of Open Access Journals (Sweden)

    Baldwin Stephen A

    2011-03-01

    Full Text Available Abstract Background Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  9. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Science.gov (United States)

    2011-01-01

    Background Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/. PMID:21385349

  10. A novel chaotic image encryption scheme using DNA sequence operations

    Science.gov (United States)

    Wang, Xing-Yuan; Zhang, Ying-Qian; Bao, Xue-Mei

    2015-10-01

    In this paper, we propose a novel image encryption scheme based on DNA (Deoxyribonucleic acid) sequence operations and chaotic system. Firstly, we perform bitwise exclusive OR operation on the pixels of the plain image using the pseudorandom sequences produced by the spatiotemporal chaos system, i.e., CML (coupled map lattice). Secondly, a DNA matrix is obtained by encoding the confused image using a kind of DNA encoding rule. Then we generate the new initial conditions of the CML according to this DNA matrix and the previous initial conditions, which can make the encryption result closely depend on every pixel of the plain image. Thirdly, the rows and columns of the DNA matrix are permuted. Then, the permuted DNA matrix is confused once again. At last, after decoding the confused DNA matrix using a kind of DNA decoding rule, we obtain the ciphered image. Experimental results and theoretical analysis show that the scheme is able to resist various attacks, so it has extraordinarily high security.

  11. Mitochondrial DNA sequence of Onychostoma rara.

    Science.gov (United States)

    Zeng, Chun-Fang; Li, Xiao-Ling; Li, Chuan-Wu; Huang, Xiang-Rong; Wan, Yi-Wen

    2015-01-01

    The complete mitochondrial genome sequence of Onychostoma rara was determined to be 16,590 bp in length; it contains 13 protein-coding genes (PCGs), 22 tRNA genes, the large (rrnL) and small (rrnS) rRNA genes and the non-coding control region. Its total A + T content is 55.65%. We also analyzed the structure of the control region, in which 6 CSBs (CSB-1, CSB-2, CSB-3, CSB-D, CSB-E and CSB-F) and a 2 bp tandem repeat were detected.
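
    The A + T content quoted above is the percentage of A and T bases among all bases of the sequence. A minimal sketch of that arithmetic follows; the function name is hypothetical and ambiguous bases such as N are simply counted in the total length.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T bases in a DNA sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    return 100.0 * at / len(seq)
```

    For example, a six-base sequence with four A or T bases has an A + T content of about 66.7%.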

  12. Mitochondrial DNA sequence evolution in the Arctoidea.

    OpenAIRE

    Zhang, Y P; Ryder, O. A.

    1993-01-01

    Some taxa in the superfamily Arctoidea, such as the giant panda and the lesser panda, have presented puzzles to taxonomists. In the present study, approximately 397 bases of the cytochrome b gene, 364 bases of the 12S rRNA gene, and 74 bases of the tRNA(Thr) and tRNA(Pro) genes from the giant panda, lesser panda, kinkajou, raccoon, coatimundi, and all species of the Ursidae were sequenced. The high transition/transversion ratios in cytochrome b and RNA genes prior to saturation suggest that t...

  13. DNA binding of dinuclear iron(II) metallosupramolecular cylinders. DNA unwinding and sequence preference.

    Science.gov (United States)

    Malina, Jaroslav; Hannon, Michael J; Brabec, Viktor

    2008-06-01

    [Fe(2)L(3)](4+) (L = C(25)H(20)N(4)) is a synthetic tetracationic supramolecular cylinder (with a triple helical architecture) that targets the major groove of DNA and can bind to DNA Y-shaped junctions. To explore the DNA-binding mode of [Fe(2)L(3)](4+), we examine herein the interactions of pure enantiomers of this cylinder with DNA by biochemical and molecular biology methods. The results have revealed that, in addition to the previously reported bending of DNA, the enantiomers extensively unwind DNA, with the M enantiomer being the more efficient at unwinding, and exhibit preferential binding to regular alternating purine-pyrimidine sequences, with the M enantiomer showing a greater preference. Also, interestingly, the DNA binding of bulky cylinders [Fe(2)(L-CF(3))(3)](4+) and [Fe(2)(L-Ph)(3)](4+) results in no DNA unwinding and also no sequence preference of their DNA binding was observed. The observation of sequence-preference in the binding of these supramolecular cylinders suggests that a concept based on the use of metallosupramolecular cylinders might result in molecular designs that recognize the genetic code in a sequence-dependent manner with a potential ability to affect the processing of the genetic code. PMID:18467423

  14. Multiplexed DNA sequence capture of mitochondrial genomes using PCR products.

    Directory of Open Access Journals (Sweden)

    Tomislav Maricic

    Full Text Available BACKGROUND: To utilize the power of high-throughput sequencers, target enrichment methods have been developed. The majority of these require reagents and equipment that are only available from commercial vendors and are not suitable for the targets that are a few kilobases in length. METHODOLOGY/PRINCIPAL FINDINGS: We describe a novel and economical method in which custom made long-range PCR products are used to capture complete human mitochondrial genomes from complex DNA mixtures. We use the method to capture 46 complete mitochondrial genomes in parallel and we sequence them on a single lane of an Illumina GAII instrument. CONCLUSIONS/SIGNIFICANCE: This method is economical and simple and particularly suitable for targets that can be amplified by PCR and do not contain highly repetitive sequences such as mtDNA. It has applications in population genetics and forensics, as well as studies of ancient DNA.

  15. Facilitated diffusion on mobile DNA: configurational traps and sequence heterogeneity

    CERN Document Server

    Brackley, C A; Marenduzzo, D; 10.1103/PhysRevLett.109.168103

    2012-01-01

    We present Brownian dynamics simulations of the facilitated diffusion of a protein, modelled as a sphere with a binding site on its surface, along DNA, modelled as a semi-flexible polymer. We consider both the effect of DNA organisation in 3D, and of sequence heterogeneity. We find that in a network of DNA loops, as are thought to be present in bacterial DNA, the search process is very sensitive to the spatial location of the target within such loops. Therefore, specific genes might be repressed or promoted by changing the local topology of the genome. On the other hand, sequence heterogeneity creates traps which normally slow down facilitated diffusion. When suitably positioned, though, these traps can, surprisingly, render the search process much more efficient.

  16. Multiple Base Substitution Corrections in DNA Sequence Evolution

    Science.gov (United States)

    Kowalczuk, M.; Mackiewicz, P.; Szczepanik, D.; Nowicka, A.; Dudkiewicz, M.; Dudek, M. R.; Cebrat, S.

    We discuss the inability of Jukes and Cantor's one-parameter model and Kimura's two-parameter model to describe the evolution of asymmetric DNA molecules. The standard distance measure between two DNA sequences, which is the number of substitutions per site, should include the effect of multiple base substitutions separately for each type of base. Otherwise, the respective tables of substitutions cannot reconstruct the asymmetric DNA molecule with respect to its composition. Based on Kimura's neutral theory, we have derived a linear law for the correlation between the mean survival time of nucleotides under constant mutation pressure and their fraction in the genome. Following this law, we discuss the corrections to Kimura's theory needed to describe the evolution of genomes with asymmetric nucleotide composition. We consider the particular case of the strongly asymmetric Borrelia burgdorferi genome and discuss in detail the corrections that should be introduced into the distance measure between two DNA sequences to include multiple base substitutions.
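
    For orientation, the two standard corrections named in this abstract can be written down directly. The sketch below is not the authors' per-base correction; it only shows the textbook Jukes-Cantor and Kimura two-parameter distances that the abstract argues are insufficient for asymmetric genomes. The function names are illustrative assumptions.

```python
import math

PURINES = {"A", "G"}  # the remaining bases, C and T, are pyrimidines


def observed_differences(s1, s2):
    """Return (P, Q): proportions of transition and transversion sites."""
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(s1, s2):
        if x == y:
            continue
        # A change within purines or within pyrimidines is a transition.
        if (x in PURINES) == (y in PURINES):
            transitions += 1
        else:
            transversions += 1
    n = len(s1)
    return transitions / n, transversions / n


def jukes_cantor_distance(p):
    """Substitutions per site from the observed proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for this correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p_distance(P, Q):
    """Substitutions per site from transition (P) and transversion (Q) proportions."""
    a = 1 - 2 * P - Q
    b = 1 - 2 * Q
    if a <= 0 or b <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -0.5 * math.log(a * math.sqrt(b))
```

    Both formulas treat all four bases symmetrically, which is the limitation the abstract addresses: neither distinguishes, for example, how long an A survives compared with a C.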

  17. Anonymous Authentication for Smartcards

    Directory of Open Access Journals (Sweden)

    J. Hajny

    2010-06-01

    Full Text Available The paper presents an innovative solution in the field of RFID (Radio-Frequency IDentification) smartcard authentication. Currently the smartcards are used for many purposes - e.g. employee identification, library cards, student cards or even identity credentials. Personal identity is revealed to untrustworthy entities every time we use these cards. Such information could later be used without our knowledge and for harmful reasons like shopping pattern scanning or even movement tracking. We present a communication scheme for keeping one’s identity private in this paper. Although our system provides anonymity, it does not allow users to abuse this feature. The system is based on strong cryptographic primitives that provide features never available before. Besides theoretical design of the anonymous authentication scheme and its analysis we also provide implementation results.

  18. SNP discovery using Paired-End RAD-tag sequencing on pooled genomic DNA of Sisymbrium austriacum (Brassicaceae).

    Science.gov (United States)

    Vandepitte, K; Honnay, O; Mergeay, J; Breyne, P; Roldán-Ruiz, I; De Meyer, T

    2013-03-01

    Single nucleotide polymorphisms (SNPs) are rapidly replacing anonymous markers in population genomic studies, but their use in non-model organisms is hampered by the scarcity of cost-effective approaches to uncover genome-wide variation in a comprehensive subset of individuals. The screening of one or only a few individuals induces ascertainment bias. To discover SNPs for a population genomic study of the Pyrenean rocket (Sisymbrium austriacum subsp. chrysanthum), we undertook a pooled RAD-PE (Restriction site Associated DNA Paired-End sequencing) approach. RAD tags were generated from the PstI-digested pooled genomic DNA of 12 individuals sampled across the species distribution range and paired-end sequenced using Illumina technology to produce ~24.5 Mb of sequences, covering ~7% of the species' genome. Sequences were assembled into ~76 000 contigs with a mean length of 323 bp (N50 = 357 bp, sequencing depth = 24x). In all, >15 000 SNPs were called, of which 47% were annotated in putative genic regions based on homology with the Arabidopsis thaliana genome. Gene ontology (GO) slim categorization demonstrated that the identified SNPs covered extant genic variation well. The validation of 300 SNPs on a larger set of individuals using a KASPar assay underpinned the utility of pooled RAD-PE as an inexpensive genome-wide SNP discovery technique (success rate: 87%). In addition to SNPs, we discovered >600 putative SSR markers.

  19. Ancient mtDNA sequences from the First Australians revisited.

    Science.gov (United States)

    Heupink, Tim H; Subramanian, Sankar; Wright, Joanne L; Endicott, Phillip; Westaway, Michael Carrington; Huynen, Leon; Parson, Walther; Millar, Craig D; Willerslev, Eske; Lambert, David M

    2016-06-21

    The publication in 2001 by Adcock et al. [Adcock GJ, et al. (2001) Proc Natl Acad Sci USA 98(2):537-542] in PNAS reported the recovery of short mtDNA sequences from ancient Australians, including the 42,000-y-old Mungo Man [Willandra Lakes Hominid (WLH3)]. This landmark study in human ancient DNA suggested that an early modern human mitochondrial lineage emerged in Asia and that the theory of modern human origins could no longer be considered solely through the lens of the "Out of Africa" model. To evaluate these claims, we used second generation DNA sequencing and capture methods as well as PCR-based and single-primer extension (SPEX) approaches to reexamine the same four Willandra Lakes and Kow Swamp 8 (KS8) remains studied in the work by Adcock et al. Two of the remains sampled contained no identifiable human DNA (WLH15 and WLH55), whereas the Mungo Man (WLH3) sample contained no Aboriginal Australian DNA. KS8 reveals human mitochondrial sequences that differ from the previously inferred sequence. Instead, we recover a total of five modern European contaminants from Mungo Man (WLH3). We show that the remaining sample (WLH4) contains ∼1.4% human DNA, from which we assembled two complete mitochondrial genomes. One of these was a previously unidentified Aboriginal Australian haplotype belonging to haplogroup S2 that we sequenced to a high coverage. The other was a contaminating modern European mitochondrial haplotype. Although none of the sequences that we recovered matched those reported by Adcock et al., except a contaminant, these findings show the feasibility of obtaining important information from ancient Aboriginal Australian remains. PMID:27274055

  20. DNA sequence analysis with droplet-based microfluidics

    OpenAIRE

    Abate, Adam R.; Hung, Tony; Sperling, Ralph A.; Mary, Pascaline; Rotem, Assaf; Agresti, Jeremy J.; Weiner, Michael A.; Weitz, David A.

    2013-01-01

    Droplet-based microfluidic techniques can form and process micrometer scale droplets at thousands per second. Each droplet can house an individual biochemical reaction, allowing millions of reactions to be performed in minutes with small amounts of total reagent. This versatile approach has been used for engineering enzymes, quantifying concentrations of DNA in solution, and screening protein crystallization conditions. Here, we use it to read the sequences of DNA molecules with a FRET-based ...

  1. Perspectives of DNA microarray and next-generation DNA sequencing technologies

    Institute of Scientific and Technical Information of China (English)

    TENG XiaoKun; XIAO HuaSheng

    2009-01-01

    DNA microarray and next-generation DNA sequencing technologies are important tools for high-throughput genome research, in revealing both the structural and functional characteristics of genomes. In the past decade the DNA microarray technologies have been widely applied in the studies of functional genomics, systems biology and pharmacogenomics. The next-generation DNA sequencing method was first introduced by the 454 Company in 2003, immediately followed by the establishment of the Solexa and Solid techniques by other biotech companies. Though it has not been long since the first emergence of this technology, with the fast and impressive improvement, the application of this technology has extended to almost all fields of genomics research, as a rival challenging the existing DNA microarray technology. This paper briefly reviews the working principles of these two technologies as well as their application and perspectives in genome research.

  2. Sequence-selective DNA recognition with peptide-bisbenzamidine conjugates.

    Science.gov (United States)

    Sánchez, Mateo I; Vázquez, Olalla; Vázquez, M Eugenio; Mascareñas, José L

    2013-07-22

    Transcription factors (TFs) are specialized proteins that play a key role in the regulation of genetic expression. Their mechanism of action involves the interaction with specific DNA sequences, which usually takes place through specialized domains of the protein. However, achieving an efficient binding usually requires the presence of the full protein. This is the case for bZIP and zinc finger TF families, which cannot interact with their target sites when the DNA binding fragments are presented as isolated monomers. Herein it is demonstrated that the DNA binding of these monomeric peptides can be restored when conjugated to aza-bisbenzamidines, which are readily accessible molecules that interact with A/T-rich sites by insertion into their minor groove. Importantly, the fluorogenic properties of the aza-benzamidine unit provide details of the DNA interaction that are eluded in electrophoresis mobility shift assays (EMSA). The hybrids based on the GCN4 bZIP protein preferentially bind to composite sequences containing tandem bisbenzamidine-GCN4 binding sites (TCAT⋅AAATT). Fluorescence reverse titrations show an interesting multiphasic profile consistent with the formation of competitive nonspecific complexes at low DNA/peptide ratios. On the other hand, the conjugate with the DNA binding domain of the zinc finger protein GAGA binds with high affinity (KD≈12 nM) and specificity to a composite AATTT⋅GAGA sequence containing both the bisbenzamidine and the TF consensus binding sites.

  3. Fast comparison of DNA sequences by oligonucleotide profiling

    Directory of Open Access Journals (Sweden)

    Marín Ignacio

    2008-02-01

    Full Text Available Background: The comparison of DNA sequences is a traditional problem in genomics and bioinformatics. Many new opportunities emerge due to the improvement of personal computers, allowing the implementation of novel strategies of analysis. Findings: We describe a new program, called UVWORD, which determines the number of times that each DNA word present in a sequence (target) is found in a second sequence (source), a procedure that we have called oligonucleotide profiling. On a standard computer, the user may search for words of a size ranging from k = 1 to k = 14 nucleotides. Average counts for groups of contiguous words may also be established. The rate of analysis on standard computers ranges from 3.4 (k = 14) to 16 million words per second (1 ≤ k ≤ 8). This makes feasible the fast screening of even the longest known DNA molecules. Discussion: We show that the combination of the ability to analyze relatively long words, which occur very rarely by chance, and the speed of the program allows novel types of screening to be performed, complementary to those provided by standard programs such as BLAST. This method can be used to determine oligonucleotide content, to characterize the distribution of repetitive sequences in chromosomes, to determine the evolutionary conservation of sequences in different species, to establish regions of similar DNA among chromosomes or genomes, etc.
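
    The profiling procedure described in this abstract can be illustrated in a few lines. This is a minimal sketch of the idea only, not UVWORD itself; the function names are assumptions, and no attempt is made to reach the speeds reported above.

```python
from collections import Counter


def count_words(source, k):
    """Count every overlapping word of length k in the source sequence."""
    return Counter(source[i:i + k] for i in range(len(source) - k + 1))


def oligo_profile(target, source, k):
    """For each word of length k in target, report how often it occurs in source.

    The result has one entry per word start position in target, so it can be
    plotted along the target sequence.
    """
    counts = count_words(source, k)
    # A Counter returns 0 for words that never occur in the source.
    return [counts[target[i:i + k]] for i in range(len(target) - k + 1)]
```

    For example, `oligo_profile("ACGT", "ACGACG", 3)` looks up the target words `ACG` and `CGT` in the source and returns one count per word position.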

  4. An Uncompressed Image Encryption Algorithm Based on DNA Sequences

    Directory of Open Access Journals (Sweden)

    Shima Ramesh Maniyath

    2011-07-01

    Full Text Available The rapid growth of the Internet and digitized content made image and video distribution simpler. Hence the need for image and video data protection is on the rise. In this paper, we propose a secure and computationally feasible image and video encryption/decryption algorithm based on DNA sequences. The main purpose of this algorithm is to reduce the big image encryption time. This algorithm is implemented by using the natural DNA sequences as main keys. The first part is the process of pixel scrambling. The original image is confused in the light of the scrambling sequence which is generated by the DNA sequence. The second part is the process of pixel replacement. The pixel gray values of the new image and the one of the three encryption templates generated by the other DNA sequence are XORed bit-by-bit in turn. The main scope of this paper is to propose an extension of this algorithm to videos and making it secure using modern Biological technology. A security analysis for the proposed system is performed and presented.

  5. A Comparison of Computation Techniques for DNA Sequence Comparison

    Directory of Open Access Journals (Sweden)

    Harshita G. Patil

    2012-04-01

    Full Text Available This project presents a comparative survey of DNA sequence comparison techniques. The techniques implemented are sequential comparison, multithreading on a single computer, and multithreading using parallel processing. The project also examines the issues involved in implementing a dynamic programming algorithm for biological sequence comparison on a general-purpose parallel computing platform. Tiling is an important technique for the extraction of parallelism. Informally, tiling consists of partitioning the iteration space into several chunks of computation called tiles (blocks) such that sequential traversal of the tiles covers the entire iteration space. The idea behind tiling is to increase the granularity of computation and decrease the amount of communication incurred between processors. This makes tiling more suitable for distributed memory architectures, where communication startup costs are very high and hence frequent communication is undesirable. Our work to develop a sequence-comparison mechanism and software supports the identification of DNA sequences.
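
    The tiling idea can be sketched with a sequential traversal of tiles over an edit-distance table. This is an illustration under assumptions: the abstract does not say which dynamic programming recurrence or tile size the authors used, so unit-cost edit distance and a `tile` parameter are chosen here for simplicity.

```python
def edit_distance_tiled(a, b, tile=4):
    """Unit-cost edit distance computed tile by tile over the DP table."""
    n, m = len(a), len(b)
    # Boundary conditions: converting a prefix to or from the empty string.
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    # Visit tiles in row-major order. Each tile depends only on the tiles
    # above, to the left, and diagonally above-left, which are already done.
    for ti in range(1, n + 1, tile):
        for tj in range(1, m + 1, tile):
            for i in range(ti, min(ti + tile, n + 1)):
                for j in range(tj, min(tj + tile, m + 1)):
                    cost = 0 if a[i - 1] == b[j - 1] else 1
                    d[i][j] = min(d[i - 1][j] + 1,         # deletion
                                  d[i][j - 1] + 1,         # insertion
                                  d[i - 1][j - 1] + cost)  # match or substitution
    return d[n][m]
```

    The traversal here is sequential, but tiles lying on the same anti-diagonal do not depend on one another, so a parallel version could hand them to different processors and exchange only tile borders.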

  6. DNA qualification workflow for next generation sequencing of histopathological samples.

    Science.gov (United States)

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  7. DNA qualification workflow for next generation sequencing of histopathological samples.

    Directory of Open Access Journals (Sweden)

    Michele Simbolo

    Full Text Available Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard

  9. High-throughput DNA sequencing: a genomic data manufacturing process.

    Science.gov (United States)

    Huang, G M

    1999-01-01

    The progress trends in automated DNA sequencing operation are reviewed. Technological development in sequencing instruments, enzymatic chemistry and robotic stations has resulted in ever-increasing capacity of sequence data production. This progress leads to a higher demand on laboratory information management and data quality assessment. High-throughput laboratories face the challenge of organizational management, as well as technology management. Engineering principles of process control should be adopted in this biological data manufacturing procedure. While various systems attempt to provide solutions to automate different parts of, or even the entire process, new technical advances will continue to change the paradigm and provide new challenges.

  10. A simple method encoding linear single strain DNA sequence with natural numbers

    Institute of Scientific and Technical Information of China (English)

    LI Jiye; XU Yuan; ZHANG Wang

    2008-01-01

    A simple method presenting linear single strain DNA (LssDNA) sequence with natural numbers is introduced in this paper. The method presents LssDNA correspondingly with the numerals 1, 2, 3 and 4. After calculation, the sequence can be coded in natural numbers which can also be decoded into the DNA sequence. Thus, an LssDNA sequence can be expressed in a natural number and a dot at coordinate axes. In the future, a new LssDNA sequences database termed "DotBank" would be realized in which each LssDNA sequence is determined as a dot.
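
    One way to realize such an encoding is bijective base-4 numeration, sketched below. The abstract does not give the calculation or say which base receives which numeral, so the mapping A=1, C=2, G=3, T=4 and the function names are assumptions made for illustration.

```python
# Assumed mapping; the abstract only states that the numerals 1 to 4 are used.
DIGIT = {"A": 1, "C": 2, "G": 3, "T": 4}
BASE = {v: k for k, v in DIGIT.items()}


def encode(seq):
    """Map a DNA string to a natural number using digits 1..4 in base 4."""
    n = 0
    for base in seq:
        n = n * 4 + DIGIT[base]
    return n


def decode(n):
    """Invert encode(): recover the DNA string from its natural number."""
    bases = []
    while n > 0:
        r = n % 4
        if r == 0:      # a remainder of 0 stands for the digit 4
            r = 4
        bases.append(BASE[r])
        n = (n - r) // 4
    return "".join(reversed(bases))
```

    Using digits 1 to 4 rather than 0 to 3 means no sequence begins with a zero digit, so every sequence, including runs of leading A, maps to its own number and decodes back unchanged.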

  11. Privacy-Enhanced Methods for Comparing Compressed DNA Sequences

    CERN Document Server

    Eppstein, David; Baldi, Pierre

    2011-01-01

    In this paper, we study methods for improving the efficiency and privacy of compressed DNA sequence comparison computations, under various querying scenarios. For instance, one scenario involves a querier, Bob, who wants to test if his DNA string, Q, is close to a DNA string, Y, owned by a data owner, Alice, but Bob does not want to reveal Q to Alice and Alice is willing to reveal Y to Bob only if it is close to Q. We describe a privacy-enhanced method for comparing two compressed DNA sequences, which can be used to achieve the goals of such a scenario. Our method involves a reduction to set differencing, and we describe a privacy-enhanced protocol for set differencing that achieves absolute privacy for Bob (in the information theoretic sense), and a quantifiable degree of privacy protection for Alice. One of the important features of our protocols, which makes them ideally suited to privacy-enhanced DNA sequence comparison problems, is that the communication complexity of our solutions is pr...

  12. Functionalized nanopore-embedded electrodes for rapid DNA sequencing

    CERN Document Server

    He, Haiying; Pandey, Ravindra; Rocha, Alexandre Reily; Sanvito, Stefano; Grigoriev, Anton; Ahuja, Rajeev; Karna, Shashi P

    2007-01-01

    The determination of a patient's DNA sequence can, in principle, reveal an increased risk to fall ill with particular diseases [1,2] and help to design "personalized medicine" [3]. Moreover, statistical studies and comparison of genomes [4] of a large number of individuals are crucial for the analysis of mutations [5] and hereditary diseases, paving the way to preventive medicine [6]. DNA sequencing is, however, currently still a vastly time-consuming and very expensive task [4], consisting of pre-processing steps, the actual sequencing using the Sanger method, and post-processing in the form of data analysis [7]. Here we propose a new approach that relies on functionalized nanopore-embedded electrodes to achieve an unambiguous distinction of the four nucleic acid bases in the DNA sequencing process. This represents a significant improvement over previously studied designs [8,9] which cannot reliably distinguish all four bases of DNA. The transport properties of the setup investigated by us, employing state-o...

  13. Decoding long nanopore sequencing reads of natural DNA.

    Science.gov (United States)

    Laszlo, Andrew H; Derrington, Ian M; Ross, Brian C; Brinkerhoff, Henry; Adey, Andrew; Nova, Ian C; Craig, Jonathan M; Langford, Kyle W; Samson, Jenny Mae; Daza, Riza; Doering, Kenji; Shendure, Jay; Gundlach, Jens H

    2014-08-01

    Nanopore sequencing of DNA is a single-molecule technique that may achieve long reads, low cost and high speed with minimal sample preparation and instrumentation. Here, we build on recent progress with respect to nanopore resolution and DNA control to interpret the procession of ion current levels observed during the translocation of DNA through the pore MspA. As approximately four nucleotides affect the ion current of each level, we measured the ion current corresponding to all 256 four-nucleotide combinations (quadromers). This quadromer map is highly predictive of ion current levels of previously unmeasured sequences derived from the bacteriophage phi X 174 genome. Furthermore, we show nanopore sequencing reads of phi X 174 up to 4,500 bases in length, which can be unambiguously aligned to the phi X 174 reference genome, and demonstrate proof-of-concept utility with respect to hybrid genome assembly and polymorphism detection. This work provides a foundation for nanopore sequencing of long, natural DNA strands. PMID:24964173
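
    The way a quadromer map turns a sequence into predicted current levels can be sketched as a sliding-window lookup. This is an illustration only: the measured current values are not given in the abstract, so `quadromer_map` is a hypothetical dictionary that the caller must supply, and the function names are assumptions.

```python
from itertools import product


def all_quadromers():
    """Enumerate the 4**4 = 256 possible four-nucleotide combinations."""
    return ["".join(p) for p in product("ACGT", repeat=4)]


def predicted_levels(sequence, quadromer_map):
    """Look up one level per four-nucleotide window along the sequence.

    quadromer_map is a caller-supplied dict from quadromer to a measured
    ion-current value; no real measurements are included here.
    """
    return [quadromer_map[sequence[i:i + 4]]
            for i in range(len(sequence) - 3)]
```

    A sequence of length L yields L - 3 predicted levels, since each level reflects the four nucleotides occupying the pore at that step.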

  14. RNA-DNA sequence differences spell genetic code ambiguities

    DEFF Research Database (Denmark)

    Bentin, Thomas; Nielsen, Michael L

    2013-01-01

    A recent paper in Science by Li et al. 2011(1) reports widespread sequence differences in the human transcriptome between RNAs and their encoding genes termed RNA-DNA differences (RDDs). The findings could add a new layer of complexity to gene expression but the study has been criticized. ...

  15. Derivatized versions of ligase enzymes for constructing DNA sequences

    Science.gov (United States)

    Mariella, Jr., Raymond P.; Christian, Allen T.; Tucker, James D.; Dzenitis, John M.; Papavasiliou, Alexandros P.

    2006-08-15

    A method of making very long, double-stranded synthetic poly-nucleotides. A multiplicity of short oligonucleotides is provided. The short oligonucleotides are sequentially hybridized to each other. Enzymatic ligation of the oligonucleotides provides a contiguous piece of PCR-ready DNA of predetermined sequence.

  16. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P;

    2008-01-01

    We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data that ...

  17. POSA : Perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, JA; Jungerius, BJ; Groenen, MA

    2004-01-01

    Background: Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  18. A Nano-Biosensor for DNA Sequence Detection Using Absorption Spectra of SWNT-DNA Composite

    Directory of Open Access Journals (Sweden)

    J. Bansal

    2011-01-01

    Full Text Available A biosensor based on a Single Walled Carbon Nanotube (SWNT)-poly(GT)n ssDNA hybrid has been developed for medical diagnostics. The absorption spectrum of this assay is determined with the help of a Shimadzu UV-VIS-NIR spectrophotometer. Two distinct bands, each containing three peaks corresponding to the first and second van Hove singularities in the density of states of the nanotubes, were observed in the absorption spectrum. When a single-stranded DNA (ssDNA) having a sequence complementary to the probe DNA is added to the ssDNA-SWNT conjugates, hybridization takes place, which causes a red shift of the absorption spectrum of the nanotubes. On the other hand, when the DNA is noncomplementary, no shift in the absorption spectrum occurs since hybridization between the DNA and probe does not take place. The red shift of the spectrum is considered to be due to a change in the dielectric environment around the nanotubes.

  19. DNA Sequence Evolution with Neighbor-Dependent Mutation

    CERN Document Server

    Arndt, Peter F.; Burge, Christopher B.; Hwa, Terence

    2001-01-01

    We introduce a model of DNA sequence evolution which can account for biases in mutation rates that depend on the identity of the neighboring bases. An analytic solution for this class of non-equilibrium models is developed by adopting well-known methods of nonlinear dynamics. Results are presented for the CpG-methylation-deamination process which dominates point substitutions in vertebrates. The dinucleotide frequencies generated by the model (using empirically obtained mutation rates) match the overall pattern observed in non-coding DNA. A web-based tool has been constructed to compute single- and dinucleotide frequencies for arbitrary neighbor-dependent mutation rates. Also provided is the backward procedure to infer the mutation rates using maximum likelihood analysis given the observed single- and dinucleotide frequencies. Reasonable estimates of the mutation rates can be obtained very efficiently, using generic non-coding DNA sequences as input, after masking out long homonucleotide subsequences. Our metho...

  20. Solid-State Nanopore-Based DNA Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Zewen Liu

    2016-01-01

    Full Text Available Solid-state nanopore-based DNA sequencing technology is becoming increasingly attractive for its promising future in the field of gene detection. The challenges that need to be addressed are diverse: effective methods to detect base-specific signatures, control of the nanopore's size and surface properties, and modulation of the translocation velocity and behavior of the DNA molecules. Among these challenges, the realization of high-quality nanopores with the help of modern micro/nanofabrication technologies is a crucial one. In this paper, typical technologies applied in the field of solid-state nanopore-based DNA sequencing are reviewed.

  1. Sequence heterogeneity accelerates protein search for targets on DNA

    International Nuclear Information System (INIS)

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  2. Sequence heterogeneity accelerates protein search for targets on DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shvets, Alexey A.; Kolomeisky, Anatoly B., E-mail: tolya@rice.edu [Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005 (United States)

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  3. The Application of Next Generation Sequencing in DNA Methylation Analysis

    Directory of Open Access Journals (Sweden)

    Yingying Zhang

    2010-06-01

    Full Text Available DNA methylation is a major form of epigenetic modification and plays essential roles in physiology and disease processes. In the human genome, about 80% of cytosines in the 56 million CpG sites are methylated to 5-methylcytosines. The methylation pattern of DNA is highly variable among cell types and developmental stages and is influenced by disease processes and genetic factors, which brings considerable theoretical and technological challenges for its comprehensive mapping. Recently, various high-throughput approaches based on bisulfite conversion combined with next generation sequencing have been developed and applied for the genome-wide analysis of DNA methylation. These methods provide quantitative DNA methylation data at single base pair resolution with genome-wide coverage. We review these methods here and discuss some technical points of special interest, such as the sequence depth necessary to reach conclusions, the identification of clonal DNA amplification after bisulfite conversion, and the detection of non-CpG methylation. Future application of these methods will greatly facilitate the profiling of DNA methylation in the genomes of different species, individuals and cell types under healthy and disease states.
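
    The quantitative, single-base readout described in this record rests on a simple rule: after bisulfite conversion an unmethylated cytosine is read as T, while a methylated cytosine is still read as C. A minimal Python sketch of the resulting per-CpG methylation estimate follows. It assumes, purely for illustration, that reads are already aligned without gaps to the forward strand of the reference and are given as hypothetical `(start, sequence)` tuples; real pipelines also handle the reverse strand, base qualities and clonal duplicates.

```python
def cpg_methylation(reference, reads):
    """Estimate methylation at each CpG cytosine of the forward strand.

    reference: reference sequence (string).
    reads: iterable of (start, sequence) pairs, ungapped forward-strand
           alignments of bisulfite-converted reads (an assumed input format).
    Returns {position: (methylation_level_or_None, depth)}.
    """
    reference = reference.upper()
    cpg_sites = [i for i in range(len(reference) - 1)
                 if reference[i:i + 2] == "CG"]
    levels = {}
    for pos in cpg_sites:
        methylated = unmethylated = 0
        for start, read in reads:
            offset = pos - start
            if 0 <= offset < len(read):
                base = read[offset].upper()
                if base == "C":      # protected from conversion: methylated
                    methylated += 1
                elif base == "T":    # converted: unmethylated
                    unmethylated += 1
        depth = methylated + unmethylated
        levels[pos] = (methylated / depth if depth else None, depth)
    return levels


if __name__ == "__main__":
    ref = "ACGTTCGA"  # toy reference with CpG cytosines at positions 1 and 5
    toy_reads = [(0, "ACGTTTGA"), (0, "ATGTTTGA"), (0, "ACGTTCGA"), (4, "TCGA")]
    for position, (level, depth) in cpg_methylation(ref, toy_reads).items():
        print(position, level, depth)
```

    The depth returned with each level matters because, as the abstract notes, the sequence depth determines how confidently a methylation level can be called.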

  4. DNA watermarks in non-coding regulatory sequences

    Directory of Open Access Journals (Sweden)

    Pyka Martin

    2009-07-01

    Full Text Available Background: DNA watermarks can be applied to identify the unauthorized use of genetically modified organisms. It has been shown that coding regions can be used to encrypt information into living organisms by using the DNA-Crypt algorithm. Yet, if the sequence of interest presents a non-coding DNA sequence, either the function of a resulting functional RNA molecule or a regulatory sequence, such as a promoter, could be affected. For our studies we used the small cytoplasmic RNA 1 in yeast and the lac promoter region of Escherichia coli. Findings: The lac promoter was deactivated by the integrated watermark. In addition, the RNA molecules displayed altered configurations after introducing a watermark, but surprisingly were functionally intact, which has been verified by analyzing the growth characteristics of both wild type and watermarked scR1 transformed yeast cells. In a third approach we introduced a second overlapping watermark into the lac promoter, which did not affect the promoter activity. Conclusion: Even though the watermarked RNA and one of the watermarked promoters did not show any significant differences compared to the wild type RNA and wild type promoter region, respectively, it cannot be generalized that other RNA molecules or regulatory sequences behave accordingly. Therefore, we do not recommend integrating watermark sequences into regulatory regions.

  5. DNA sequence analysis of newly formed telomeres in yeast.

    Science.gov (United States)

    Wang, S S; Pluta, A F; Zakian, V A

    1989-01-01

    A plasmid can be maintained in linear form in baker's yeast if it bears telomeric sequences at each end. Linear plasmids bearing cloned telomeric C4A4 repeats at one end (test end) and a natural DNA terminus with approximately 300 bps of C4A2 repeats at the other or control end were introduced by transformation into yeast. Test-end termini of 28 to 112 bps supported telomere formation. During telomere formation, C4A2 repeats were often transferred to test-end termini. To determine in greater detail the fate of test-end sequences on these plasmids after propagation in yeast, test-end telomeres were subcloned into E. coli and sequenced. DNA sequencing established a number of points about the molecular events involved in telomere formation in yeast. The results suggest that there are at least two mechanisms for telomere formation in yeast. One is mediated by a recombination event that requires neither a long stretch of homology nor the RAD52 gene product. The other mechanism is by addition of C1-3A repeats to the termini of linear DNA molecules. The telomeric sequence required to support C1-3A addition need not be at the very end of a molecule for telomere formation.

  6. Terminal region sequence variations in variola virus DNA.

    Science.gov (United States)

    Massung, R F; Loparev, V N; Knight, J C; Totmenin, A V; Chizhikov, V E; Parsons, J M; Safronov, P F; Gutorov, V V; Shchelkunov, S N; Esposito, J J

    1996-07-15

    Genome DNA terminal region sequences were determined for a Brazilian alastrim variola minor virus strain Garcia-1966 that was associated with a 0.8% case-fatality rate and African smallpox strains Congo-1970 and Somalia-1977 associated with variola major (9.6%) and minor (0.4%) mortality rates, respectively. A base sequence identity of ≥ 98.8% was determined after aligning 30 kb of the left- or right-end region sequences with cognate sequences previously determined for Asian variola major strains India-1967 (31% death rate) and Bangladesh-1975 (18.5% death rate). The deduced amino acid sequences of putative proteins of ≥ 65 amino acids also showed relatively high identity, although the Asian and African viruses were clearly more related to each other than to alastrim virus. Alastrim virus contained only 10 of 70 proteins that were 100% identical to homologs in Asian strains, and 7 alastrim-specific proteins were noted. PMID:8661439

  7. The DNA sequence and comparative analysis of human chromosome 10.

    Science.gov (United States)

    Deloukas, P; Earthrowl, M E; Grafham, D V; Rubenfield, M; French, L; Steward, C A; Sims, S K; Jones, M C; Searle, S; Scott, C; Howe, K; Hunt, S E; Andrews, T D; Gilbert, J G R; Swarbreck, D; Ashurst, J L; Taylor, A; Battles, J; Bird, C P; Ainscough, R; Almeida, J P; Ashwell, R I S; Ambrose, K D; Babbage, A K; Bagguley, C L; Bailey, J; Banerjee, R; Bates, K; Beasley, H; Bray-Allen, S; Brown, A J; Brown, J Y; Burford, D C; Burrill, W; Burton, J; Cahill, P; Camire, D; Carter, N P; Chapman, J C; Clark, S Y; Clarke, G; Clee, C M; Clegg, S; Corby, N; Coulson, A; Dhami, P; Dutta, I; Dunn, M; Faulkner, L; Frankish, A; Frankland, J A; Garner, P; Garnett, J; Gribble, S; Griffiths, C; Grocock, R; Gustafson, E; Hammond, S; Harley, J L; Hart, E; Heath, P D; Ho, T P; Hopkins, B; Horne, J; Howden, P J; Huckle, E; Hynds, C; Johnson, C; Johnson, D; Kana, A; Kay, M; Kimberley, A M; Kershaw, J K; Kokkinaki, M; Laird, G K; Lawlor, S; Lee, H M; Leongamornlert, D A; Laird, G; Lloyd, C; Lloyd, D M; Loveland, J; Lovell, J; McLaren, S; McLay, K E; McMurray, A; Mashreghi-Mohammadi, M; Matthews, L; Milne, S; Nickerson, T; Nguyen, M; Overton-Larty, E; Palmer, S A; Pearce, A V; Peck, A I; Pelan, S; Phillimore, B; Porter, K; Rice, C M; Rogosin, A; Ross, M T; Sarafidou, T; Sehra, H K; Shownkeen, R; Skuce, C D; Smith, M; Standring, L; Sycamore, N; Tester, J; Thorpe, A; Torcasso, W; Tracey, A; Tromans, A; Tsolas, J; Wall, M; Walsh, J; Wang, H; Weinstock, K; West, A P; Willey, D L; Whitehead, S L; Wilming, L; Wray, P W; Young, L; Chen, Y; Lovering, R C; Moschonas, N K; Siebert, R; Fechtel, K; Bentley, D; Durbin, R; Hubbard, T; Doucette-Stamm, L; Beck, S; Smith, D R; Rogers, J

    2004-05-27

    The finished sequence of human chromosome 10 comprises a total of 131,666,441 base pairs. It represents 99.4% of the euchromatic DNA and includes one megabase of heterochromatic sequence within the pericentromeric region of the short and long arm of the chromosome. Sequence annotation revealed 1,357 genes, of which 816 are protein coding, and 430 are pseudogenes. We observed widespread occurrence of overlapping coding genes (either strand) and identified 67 antisense transcripts. Our analysis suggests that both inter- and intrachromosomal segmental duplications have impacted on the gene count on chromosome 10. Multispecies comparative analysis indicated that we can readily annotate the protein-coding genes with current resources. We estimate that over 95% of all coding exons were identified in this study. Assessment of single base changes between the human chromosome 10 and chimpanzee sequence revealed nonsense mutations in only 21 coding genes with respect to the human sequence. PMID:15164054

  8. Anonymity in Large Societies

    OpenAIRE

    Andrei Gomberg; Cesar Martinelli; Ricard Torres

    2002-01-01

    In a social choice model with an infinite number of agents, there may occur "equal size" coalitions that a preference aggregation rule should treat in the same manner. We introduce an axiom of equal treatment with respect to a measure of coalition size and explore its interaction with common axioms of social choice. We show that, provided the measure space is sufficiently rich in coalitions of the same measure, the new axiom is the natural extension of the concept of anonymity, and in particu...

  9. Nanopore-based Fourth-generation DNA Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Yanxiao Feng

    2015-02-01

    Full Text Available Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications.

  10. DNA sequence analysis using hierarchical ART-based classification networks

    Energy Technology Data Exchange (ETDEWEB)

    LeBlanc, C.; Hruska, S.I. [Florida State Univ., Tallahassee, FL (United States); Katholi, C.R.; Unnasch, T.R. [Univ. of Alabama, Birmingham, AL (United States)

    1994-12-31

    Adaptive resonance theory (ART) describes a class of artificial neural network architectures that act as classification tools which self-organize, work in real-time, and require no retraining to classify novel sequences. We have adapted ART networks to provide support to scientists attempting to categorize tandem repeat DNA fragments from Onchocerca volvulus. In this approach, sequences of DNA fragments are presented to multiple ART-based networks which are linked together into two (or more) tiers; the first provides coarse sequence classification while the subsequent tiers refine the classifications as needed. The overall rating of the resulting classification of fragments is measured using statistical techniques based on those introduced to validate results from traditional phylogenetic analysis. Tests of the Hierarchical ART-based Classification Network, or HABclass network, indicate its value as a fast, easy-to-use classification tool which adapts to new data without retraining on previously classified data.

  11. Perspectives of DNA microarray and next-generation DNA sequencing technologies

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    DNA microarray and next-generation DNA sequencing technologies are important tools for high-throughput genome research, in revealing both the structural and functional characteristics of genomes. In the past decade the DNA microarray technologies have been widely applied in the studies of functional genomics, systems biology and pharmacogenomics. The next-generation DNA sequencing method was first introduced by the 454 Company in 2003, immediately followed by the establishment of the Solexa and Solid techniques by other biotech companies. Though it has not been long since the first emergence of this technology, with the fast and impressive improvement, the application of this technology has extended to almost all fields of genomics research, as a rival challenging the existing DNA microarray technology. This paper briefly reviews the working principles of these two technologies as well as their application and perspectives in genome research.

  12. VoSeq: a voucher and DNA sequence web application.

    Directory of Open Access Journals (Sweden)

    Carlos Peña

    Full Text Available There is an ever growing number of molecular phylogenetic studies published, due in part to the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  13. Recent developments in sequence selective minor groove DNA effectors.

    Science.gov (United States)

    Reddy, B S; Sharma, S K; Lown, J W

    2001-04-01

    DNA is a well characterized intracellular target but its large size and sequential nature make it an elusive target for selective drug action. Binding of low molecular weight ligands to DNA causes a wide variety of potential biological responses. In this respect the main consideration is given to recent developments in DNA sequence selective binding agents bearing conjugated effectors because of their potential application in diagnosis and treatment of cancers as well as in molecular biology. Recent progress in the development of cross linked lexitropsin oligopeptides and hairpins, which bind selectively to the minor groove of duplex DNA, is discussed. Bis-distamycins and related lexitropsins show inhibitory activity against HIV-1 and HIV-2 integrases at low nanomolar concentrations. Benzoyl nitrogen mustard analogs of lexitropsins are active against a variety of tumor models. Certain of the bis-benzimidazoles show altered DNA sequence preference and bind to DNA at 5'CG and TG sequences rather than at the preferred AT sites of the parent drug. A comparison of bifunctional bizelesin with monoalkylating adozelesin shows that it appears to have an increased sequence selectivity such that monoalkylating compounds react at more than one site but bizelesin reacts only at sites where there are two suitably positioned alkylation sites. Adozelesin, bizelesin and carzelesin are far more potent as cytotoxic agents than cisplatin or doxorubicin. A new class of 1,2,9,9a-tetrahydrocyclopropa[c]benz[e]indole-4-one (CBI) analogs, i.e., CBI-lexitropsin conjugates arising from the latter leads, is also discussed. A number of cyclopropylpyrroloindole (CPI) and CBI-lexitropsin conjugates related to CC-1065 alkylate at the N3 position of adenine in the minor groove of DNA in a sequence specific manner, and also show cytotoxicities in the femtomolar range. The cross linking efficiency of PBD dimers is much greater than that of other cross linkers including cisplatin and melphalan.
A new

  14. Patterns of nucleotide misincorporations during enzymatic amplification and direct large-scale sequencing of ancient DNA

    OpenAIRE

    Stiller, M.; Green, R. E.; Ronan, M.; Simons, J F; Du, L; He, W.; Egholm, M; Rothberg, J. M.; Keates, S.G.; Ovodov, N. D.; Antipina, E. E.; Baryshnikov, G. F.; Kuzmin, Y.V.; Vasilevski, A. A.; Wuenschell, G. E.

    2006-01-01

    Whereas evolutionary inferences derived from present-day DNA sequences are by necessity indirect, ancient DNA sequences provide a direct view of past genetic variants. However, base lesions that accumulate in DNA over time may cause nucleotide misincorporations when ancient DNA sequences are replicated. By repeated amplifications of mitochondrial DNA sequences from a large number of ancient wolf remains, we show that C/G-to-T/A transitions are the predominant type of such misincorporations. U...
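
    The kind of tally behind a statement such as "C/G-to-T/A transitions are the predominant type" can be sketched in Python as a substitution spectrum over aligned sequence pairs. The sketch assumes each pair is a reference sequence and an equally long, already aligned read with no indels; the function names and the input format are hypothetical and do not reproduce the authors' analysis.

```python
from collections import Counter


def substitution_spectrum(aligned_pairs):
    """Count mismatches by type (e.g. 'C>T') over (reference, read) pairs."""
    counts = Counter()
    for ref, read in aligned_pairs:
        if len(ref) != len(read):
            raise ValueError("aligned sequences must have equal length")
        for r, q in zip(ref.upper(), read.upper()):
            # Skip gaps and ambiguity codes; count only true mismatches.
            if r in "ACGT" and q in "ACGT" and r != q:
                counts[f"{r}>{q}"] += 1
    return counts


def cg_to_ta_fraction(counts):
    """Fraction of all mismatches that are C>T or G>A (None if no mismatches)."""
    total = sum(counts.values())
    if total == 0:
        return None
    return (counts["C>T"] + counts["G>A"]) / total


if __name__ == "__main__":
    toy_pairs = [("ACGTC", "ATGTT"), ("GGCA", "AGCA"), ("ACGT", "ACGA")]
    spectrum = substitution_spectrum(toy_pairs)
    print(dict(spectrum))
    print(cg_to_ta_fraction(spectrum))
```

    C>T and G>A are pooled because a single underlying lesion appears as one or the other depending on which strand was sequenced.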

  15. The influence of DNA sequence on epigenome-induced pathologies

    Directory of Open Access Journals (Sweden)

    Meagher Richard B

    2012-07-01

    Full Text Available Clear cause-and-effect relationships are commonly established between genotype and the inherited risk of acquiring human and plant diseases and aberrant phenotypes. By contrast, few such cause-and-effect relationships are established linking a chromatin structure (that is, the epitype) with the transgenerational risk of acquiring a disease or abnormal phenotype. It is not entirely clear how epitypes are inherited from parent to offspring as populations evolve, even though epigenetics is proposed to be fundamental to evolution and the likelihood of acquiring many diseases. This article explores the hypothesis that, for transgenerationally inherited chromatin structures, “genotype predisposes epitype”, and that epitype functions as a modifier of gene expression within the classical central dogma of molecular biology. Evidence for the causal contribution of genotype to inherited epitypes and epigenetic risk comes primarily from two different kinds of studies discussed herein. The first and direct method of research proceeds by the examination of the transgenerational inheritance of epitype and the penetrance of phenotype among genetically related individuals. The second approach identifies epitypes that are duplicated (as DNA sequences are duplicated) and evolutionarily conserved among repeated patterns in the DNA sequence. The body of this article summarizes particularly robust examples of these studies from humans, mice, Arabidopsis, and other organisms. The bulk of the data from both areas of research support the hypothesis that genotypes predispose the likelihood of displaying various epitypes, but for only a few classes of epitype. This analysis suggests that renewed efforts are needed in identifying polymorphic DNA sequences that determine variable nucleosome positioning and DNA methylation as the primary cause of inherited epigenome-induced pathologies. By contrast, there is very little evidence that DNA sequence directly

  16. Early Lyme disease with spirochetemia - diagnosed by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Jones William

    2010-11-01

    Full Text Available Background: A sensitive and analytically specific nucleic acid amplification test (NAAT) is valuable in confirming the diagnosis of early Lyme disease at the stage of spirochetemia. Findings: Venous blood drawn from patients with clinical presentations of Lyme disease was tested by the standard 2-tier screen and Western Blot serology assay for Lyme disease, and also by a nested polymerase chain reaction (PCR) for B. burgdorferi sensu lato 16S ribosomal DNA. The PCR amplicon was sequenced for B. burgdorferi genomic DNA validation. A total of 130 patients visiting an emergency room (ER) or walk-in clinic (WALKIN), and 333 patients referred through private physicians' offices, were studied. While 5.4% of the ER/WALKIN patients showed DNA evidence of spirochetemia, none (0%) of the patients referred from private physicians' offices were DNA-positive. In contrast, while 8.4% of the patients referred from private physicians' offices were positive for the 2-tier Lyme serology assay, only 1.5% of the ER/WALKIN patients were positive for this antibody test. The 2-tier serology assay missed 85.7% of the cases of early Lyme disease with spirochetemia. The latter diagnosis was confirmed by DNA sequencing. Conclusion: Nested PCR followed by automated DNA sequencing is a valuable supplement to the standard 2-tier antibody assay in the diagnosis of early Lyme disease with spirochetemia. The best time to test for Lyme spirochetemia is when patients living in Lyme disease endemic areas develop unexplained symptoms or clinical manifestations that are consistent with Lyme disease early in the course of their illness.

  17. Prediction of fine-tuned promoter activity from DNA sequence.

    Science.gov (United States)

    Siwo, Geoffrey; Rider, Andrew; Tan, Asako; Pinapati, Richard; Emrich, Scott; Chawla, Nitesh; Ferdig, Michael

    2016-01-01

    The quantitative prediction of transcriptional activity of genes using promoter sequence is fundamental to the engineering of biological systems for industrial purposes and understanding the natural variation in gene expression. To catalyze the development of new algorithms for this purpose, the Dialogue on Reverse Engineering Assessment and Methods (DREAM) organized a community challenge seeking predictive models of promoter activity given normalized promoter activity data for 90 ribosomal protein promoters driving expression of a fluorescent reporter gene. By developing an unbiased modeling approach that performs an iterative search for predictive DNA sequence features using the frequencies of various k-mers, inferred DNA mechanical properties and spatial positions of promoter sequences, we achieved the best performer status in this challenge. The specific predictive features used in the model included the frequency of the nucleotide G, the length of polymeric tracts of T and TA, the frequencies of 6 distinct trinucleotides and 12 tetranucleotides, and the predicted protein deformability of the DNA sequence. Our method accurately predicted the activity of 20 natural variants of ribosomal protein promoters (Spearman correlation r = 0.73) as compared to 33 laboratory-mutated variants of the promoters (r = 0.57) in a test set that was hidden from participants. Notably, our model differed substantially from the rest in 2 main ways: i) it did not explicitly utilize transcription factor binding information implying that subtle DNA sequence features are highly associated with gene expression, and ii) it was entirely based on features extracted exclusively from the 100 bp region upstream from the translational start site demonstrating that this region encodes much of the overall promoter activity. 
The findings from this study have important implications for the engineering of predictable gene expression systems and the evolution of gene expression in naturally occurring
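
    Several of the sequence features named in this record (the frequency of G, the length of T tracts, tri- and tetranucleotide frequencies) are straightforward to extract. The Python sketch below shows one way to build such a feature dictionary for a promoter sequence. It is an illustration under assumptions: the function and feature names are hypothetical, the selection of specific k-mers, the DNA mechanical properties and the predictive model used by the authors are not reproduced, and TA-repeat tracts are omitted for brevity.

```python
from collections import Counter
from itertools import groupby


def kmer_frequencies(seq, k):
    """Frequencies of overlapping k-mers; empty dict if the sequence is too short."""
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    return {kmer: c / n_windows for kmer, c in counts.items()}


def longest_tract(seq, base):
    """Length of the longest uninterrupted run of a single base."""
    runs = (len(list(group)) for b, group in groupby(seq.upper()) if b == base)
    return max(runs, default=0)


def promoter_features(seq):
    """Assemble a flat feature dictionary for one promoter sequence."""
    if not seq:
        raise ValueError("empty sequence")
    features = {
        "freq_G": seq.upper().count("G") / len(seq),
        "longest_T_tract": longest_tract(seq, "T"),
    }
    for k in (3, 4):  # trinucleotides and tetranucleotides
        for kmer, freq in kmer_frequencies(seq, k).items():
            features[f"freq_{kmer}"] = freq
    return features


if __name__ == "__main__":
    print(promoter_features("GGTTTTAC"))  # toy sequence, not a real promoter
```

    A dictionary of this shape can be passed to any regression method; which features are retained would have to come from a search such as the iterative one the abstract describes.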

  18. Short sequence effect of ancient DNA on mammoth phylogenetic analyses

    Institute of Scientific and Technical Information of China (English)

    Guilian SHENG; Lianjuan WU; Xindong HOU; Junxia YUAN; Shenghong CHENG; Bojian ZHONG; Xulong LAI

    2009-01-01

    The evolution of Elephantidae has been intensively studied in the past few years, especially after 2006. Molecular approaches have contributed greatly to the view that the extinct woolly mammoth is more closely related to the Asian elephant than to the African elephant. In this study, partial ancient DNA sequences of the cytochrome b (cyt b) gene in the mitochondrial genome were successfully retrieved from Late Pleistocene Mammuthus primigenius bones collected from Heilongjiang Province in Northeast China. Both the partial and complete homologous cyt b gene sequences and the whole mitochondrial genome sequences extracted from GenBank were aligned and used as datasets for phylogenetic analyses. All of the phylogenetic trees based on either the partial or the complete cyt b gene reject the relationship constructed from the whole mitochondrial genome, showing that the sequence length of the cyt b gene affects mammoth phylogenetic analyses.

  19. DNA sequence representation by trianders and determinative degree of nucleotides

    Institute of Scientific and Technical Information of China (English)

    DUPLIJ Diana; DUPLIJ Steven

    2005-01-01

    A new version of DNA walks, in which nucleotides are regarded as unequal in their contribution to a walk, is introduced, which allows us to study thoroughly the "fine structure" of nucleotide sequences. The approach is based on the assumption that nucleotides have an inner abstract characteristic, the determinative degree, which reflects phenomenological properties of the genetic code and is adjusted to the physical properties of nucleotides. We consider each codon position independently, which gives three separate walks characterized by different angles and lengths; such an object is called a triander, which reflects the "strength" of a branch. A general method for identifying a DNA sequence "by triander", which can be treated as a unique "genogram" (or "gene passport"), is proposed. Two- and three-dimensional trianders are considered. The difference in sequence fine structure between genes and the intergenic space is shown. A clear triplet signal in coding sequences was found which is absent in the intergenic space and is independent of the sequence length. This paper presents the topological classification of trianders, which can allow us to work out detailed signatures of functionally different genomic regions.

  20. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.;

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...... in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions....

  1. Sequences sufficient for programming imprinted germline DNA methylation defined.

    Directory of Open Access Journals (Sweden)

    Yoon Jung Park

    Full Text Available Epigenetic marks are fundamental to normal development, but little is known about signals that dictate their placement. Insights have been provided by studies of imprinted loci in mammals, where monoallelic expression is epigenetically controlled. Imprinted expression is regulated by DNA methylation programmed during gametogenesis in a sex-specific manner and maintained after fertilization. At Rasgrf1 in mouse, paternal-specific DNA methylation on a differential methylation domain (DMD) requires downstream tandem repeats. The DMD and repeats constitute a binary switch regulating paternal-specific expression. Here, we define sequences sufficient for imprinted methylation using two transgenic mouse lines: one carries the entire Rasgrf1 cluster (RC); the second carries only the DMD and repeats (DR) from Rasgrf1. The RC transgene recapitulated all aspects of imprinting seen at the endogenous locus. DR underwent proper DNA methylation establishment in sperm and erasure in oocytes, indicating the DMD and repeats are sufficient to program imprinted DNA methylation in germlines. Both transgenes produce a DMD-spanning pit-RNA, previously shown to be necessary for imprinted DNA methylation at the endogenous locus. We show that when pit-RNA expression is controlled by the repeats, it regulates DNA methylation in cis only and not in trans. Interestingly, pedigree history dictated whether established DR methylation patterns were maintained after fertilization. When DR was paternally transmitted followed by maternal transmission, the unmethylated state that was properly established in the female germlines could not be maintained. This provides a model for transgenerational epigenetic inheritance in mice.

  2. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    Directory of Open Access Journals (Sweden)

    Kato Mikio

    2003-01-01

    Full Text Available Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA.

  3. Cladistic analysis of iridoviruses based on protein and DNA sequences.

    Science.gov (United States)

    Wang, J W; Deng, R Q; Wang, X Z; Huang, Y S; Xing, K; Feng, J H; He, J G; Long, Q X

    2003-11-01

    Cladograms of iridoviruses were inferred from bootstrap analysis of molecular data sets comprising all published protein and DNA sequences of the major capsid protein, ATPase and DNA polymerase genes of members of the Iridoviridae family Iridovirus. All data sets yielded cladograms supporting the separation of the Iridovirus, Ranavirus and Lymphocystivirus genera, and the cladogram based on data derived from major capsid proteins further divided both the Iridovirus and Ranavirus genera into two groups. Tests of alternative hypotheses of topological constraints were also performed to further investigate relationships between infectious spleen and kidney necrosis virus (ISKNV), an unclassified fish iridovirus for which the complete genome sequence data is available, and other iridoviruses. Cladograms inferred and results of Shimodaira-Hasegawa tests indicated that ISKNV is more closely related to the Ranavirus genus than it is to the other genera of the family.

  4. Effect of dephasing on DNA sequencing via transverse electronic transport

    Energy Technology Data Exchange (ETDEWEB)

    Zwolak, Michael [Los Alamos National Laboratory]; Krems, Matt [NON LANL]; Pershin, Yuriy V [NON LANL]; Di Ventra, Massimiliano [NON LANL]

    2009-01-01

    We study theoretically the effects of dephasing on DNA sequencing in a nanopore via transverse electronic transport. To do this, we couple classical molecular dynamics simulations with transport calculations using scattering theory. Previous studies, which did not include dephasing, have shown that by measuring the transverse current of a particular base multiple times, one can get distributions of currents for each base that are distinguishable. We introduce a dephasing parameter into transport calculations to simulate the effects of the ions and other fluctuations. These effects lower the overall magnitude of the current, but have little effect on the current distributions themselves. The results of this work further implicate that distinguishing DNA bases via transverse electronic transport has potential as a sequencing tool.

  5. Silicene as a new potential DNA sequencing device

    Science.gov (United States)

    Amorim, Rodrigo G.; Scheicher, Ralph H.

    2015-04-01

    Silicene, a hexagonal buckled 2D allotrope of silicon, shows potential as a platform for numerous new applications, and may allow for easier integration with existing silicon-based microelectronics than graphene. Here, we show that silicene could function as an electrical DNA sequencing device. We investigated the stability of this novel nano-bio system, its electronic properties and the pronounced effects on the transverse electronic transport, i.e., changes in the transmission and the conductance caused by adsorption of each nucleobase, explored by us through the non-equilibrium Green’s function method. Intriguingly, despite the relatively weak interaction between nucleobases and silicene, significant changes in the transmittance at zero bias are predicted by us, in particular for the two nucleobases cytosine and guanine. Our findings suggest that silicene could be utilized as an integrated-circuit biosensor as part of a lab-on-a-chip device for DNA sequencing.

  6. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Indian Academy of Sciences (India)

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-04-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  7. Multiplexed DNA Sequence Capture of Mitochondrial Genomes Using PCR Products

    OpenAIRE

    Tomislav Maricic; Mark Whitten; Svante Pääbo

    2010-01-01

    BACKGROUND: To utilize the power of high-throughput sequencers, target enrichment methods have been developed. The majority of these require reagents and equipment that are only available from commercial vendors and are not suitable for the targets that are a few kilobases in length. METHODOLOGY/PRINCIPAL FINDINGS: We describe a novel and economical method in which custom made long-range PCR products are used to capture complete human mitochondrial genomes from complex DNA mixtures. We use th...

  8. Revised phylogeny of whales suggested by mitochondrial ribosomal DNA sequences

    OpenAIRE

    Milinkovitch, M.C.; Orti, G.; Meyer, A.

    1993-01-01

    Living cetaceans are subdivided into two highly distinct suborders, Odontoceti (the echolocating toothed whales) and Mysticeti (the filter-feeding baleen whales), which are believed to have had a long independent history. Here we report the determination of DNA sequences from two mitochondrial ribosomal gene segments (930 base pairs per species) for 16 species of cetaceans, a perissodactyl and a sloth, and construct the first phylogeny for whales and dolphins based on explicit cladistic metho...

  9. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure tha...

  10. Computational optimisation of targeted DNA sequencing for cancer detection

    Science.gov (United States)

    Martinez, Pierre; McGranahan, Nicholas; Birkbak, Nicolai Juul; Gerlinger, Marco; Swanton, Charles

    2013-12-01

    Despite recent progress thanks to next-generation sequencing technologies, personalised cancer medicine is still hampered by intra-tumour heterogeneity and drug resistance. As most patients with advanced metastatic disease face poor survival, there is need to improve early diagnosis. Analysing circulating tumour DNA (ctDNA) might represent a non-invasive method to detect mutations in patients, facilitating early detection. In this article, we define reduced gene panels from publicly available datasets as a first step to assess and optimise the potential of targeted ctDNA scans for early tumour detection. Dividing 4,467 samples into one discovery and two independent validation cohorts, we show that up to 76% of 10 cancer types harbour at least one mutation in a panel of only 25 genes, with high sensitivity across most tumour types. Our analyses demonstrate that targeting "hotspot" regions would introduce biases towards in-frame mutations and would compromise the reproducibility of tumour detection.

  11. Application of synthetic DNA probes to the analysis of DNA sequence variants in man

    International Nuclear Information System (INIS)

    Oligonucleotide probes provide a tool to discriminate between any two alleles on the basis of hybridization. Random sampling of the genome with different oligonucleotide probes should reveal polymorphism in a certain percentage of the cases. In the hope of identifying polymorphic regions more efficiently, we chose to take advantage of the proposed hypermutability of repeated DNA sequences and the specificity of oligonucleotide hybridization. Since, under appropriate conditions, oligonucleotide probes require complete base pairing for hybridization to occur, they will only hybridize to a subset of the members of a repeat family when all members of the family are not identical. The results presented here suggest that oligonucleotide hybridization can be used to extend the genomic sequences that can be tested for the presence of RFLPs. This expands the tools available to human genetics. In addition, the results suggest that repeated DNA sequences are indeed more polymorphic than single-copy sequences. 28 references, 2 figures

  12. Human mitochondrial DNA complete amplification and sequencing: a new validated primer set that prevents nuclear DNA sequences of mitochondrial origin co-amplification.

    Science.gov (United States)

    Ramos, Amanda; Santos, Cristina; Alvarez, Luis; Nogués, Ramon; Aluja, Maria Pilar

    2009-05-01

    To date, there are no published primers to amplify the entire mitochondrial DNA (mtDNA) that completely prevent the amplification of nuclear DNA (nDNA) sequences of mitochondrial origin. The main goal of this work was to design, validate and describe a set of primers, to specifically amplify and sequence the complete human mtDNA, allowing the correct interpretation of mtDNA heteroplasmy in healthy and pathological samples. Validation was performed using two different approaches: (i) Basic Local Alignment Search Tool and (ii) amplification using isolated nDNA obtained from sperm cells by differential lyses. During the validation process, two mtDNA regions, with high similarity with nDNA, represent the major problematic areas for primer design. One of these could represent a non-published nuclear DNA sequence of mitochondrial origin. For two of the initially designed fragments, the amplification results reveal PCR artifacts that can be attributed to the poor quality of the DNA. After the validation, nine overlapping primer pairs to perform mtDNA amplification and 22 additional internal primers for mtDNA sequencing were obtained. These primers could be a useful tool in future projects that deal with mtDNA complete sequencing and heteroplasmy detection, since they represent a set of primers that have been tested for the non-amplification of nDNA.

  13. Rapid sequencing of DNA based on single-molecule detection

    Science.gov (United States)

    Soper, Steven A.; Davis, Lloyd M.; Fairfield, Frederick R.; Hammond, Mark L.; Harger, Carol A.; Jett, James H.; Keller, Richard A.; Marrone, Babetta L.; Martin, John C.; Nutter, Harvey L.; Shera, E. Brooks; Simpson, Daniel J.

    1991-07-01

    Sequencing the human genome is a major undertaking considering the large number of nucleotides present in the genome and the slow methods currently available to perform the task. The authors have recently reported on a scheme to sequence DNA rapidly using a non-gel based technique. The concept is based upon the incorporation of fluorescently labeled nucleotides into a strand of DNA, isolation and manipulation of a labeled DNA fragment and the detection of single nucleotides using ultra-sensitive laser-induced fluorescence detection following their cleavage from the fragment. Detection of individual fluorophores in the liquid phase was accomplished with time-gated detection following pulsed-laser excitation. The photon bursts from individual rhodamine 6G (R6G) molecules travelling through a laser beam have been observed, as have bursts from single fluorescently modified nucleotides. Using two different biotinylated nucleotides as a model system for fluorescently labeled nucleotides, the authors have observed synthesis of the complementary copy of M13 bacteriophage. Work with fluorescently labeled nucleotides is underway. Individual molecules of DNA attached to a microbead have been observed and manipulated with an epifluorescence microscope.

  14. A DNA sequence alignment algorithm using quality information and a fuzzy inference method

    Institute of Scientific and Technical Information of China (English)

    Kwangbaek Kim; Minhwan Kim; Youngwoon Woo

    2008-01-01

    DNA sequence alignment algorithms in computational molecular biology have been improved by diverse methods. In this paper, we propose a DNA sequence alignment method that uses quality information and a fuzzy inference method, developed based on the characteristics of DNA fragments and a fuzzy logic system, in order to improve conventional DNA sequence alignment methods that use DNA sequence quality information. In conventional algorithms, DNA sequence alignment scores are calculated by the global sequence alignment algorithm proposed by Needleman-Wunsch, which is established by using quality information of each DNA fragment. However, there may be errors in the process of calculating DNA sequence alignment scores when the quality of DNA fragment tips is low, because only the overall DNA sequence quality information is used. In our proposed method, an exact DNA sequence alignment can be achieved in spite of the low quality of DNA fragment tips by improving on conventional algorithms that use quality information. Mapping score parameters used to calculate DNA sequence alignment scores are dynamically adjusted by the fuzzy logic system utilizing lengths of DNA fragments and frequencies of low-quality DNA bases in the fragments. In experiments applying real genome data from the National Center for Biotechnology Information, the proposed method was more efficient than conventional algorithms.
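
    The idea of letting base quality influence a Needleman-Wunsch score can be illustrated with a minimal sketch. It is not the authors' fuzzy inference system: the function `quality_weight`, the Phred-like ceiling `q_max`, and the match, mismatch and gap values are all assumptions chosen for illustration. Substitution scores are simply scaled by the lower of the two base qualities, so unreliable fragment tips pull the score less.

```python
def quality_weight(q1, q2, q_max=40):
    """Hypothetical weight in [0, 1]: low-quality bases contribute less."""
    return min(q1, q2) / q_max


def nw_quality_score(seq_a, qual_a, seq_b, qual_b,
                     match=1.0, mismatch=-1.0, gap=-2.0):
    """Global (Needleman-Wunsch) alignment score with quality-scaled substitutions."""
    n, m = len(seq_a), len(seq_b)
    # score[i][j] = best score for aligning seq_a[:i] with seq_b[:j]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            w = quality_weight(qual_a[i - 1], qual_b[j - 1])
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + w * sub,  # (mis)match, scaled by quality
                score[i - 1][j] + gap,          # gap in seq_b
                score[i][j - 1] + gap,          # gap in seq_a
            )
    return score[n][m]
```

    With these assumed parameters, a mismatch at a low-quality tip costs far less than the same mismatch at a high-quality position, which is the behaviour the abstract motivates; the published method adjusts the scoring parameters dynamically with fuzzy rules instead.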

  15. Artificial intelligence approach in analysis of DNA sequences.

    Science.gov (United States)

    Brézillon, P J; Zaraté, P; Saci, F

    1993-01-01

    We present an approach for designing a knowledge-based system, called Sequence Acquisition In Context (SAIC), that will be able to cooperate with a biologist in the analysis of DNA sequences. The main task of the system is the acquisition of the expert knowledge that the biologist uses for solving ambiguities from gel autoradiograms, with the aim of re-using it later for solving similar ambiguities. The various types of expert knowledge constitute what we call the contextual knowledge of the sequence analysis. Contextual knowledge deals with the unavoidable problems that are common in the study of the living material (eg noise on data, difficulties of observations). Indeed, the analysis of DNA sequences from autoradiograms belongs to an emerging and promising area of investigation, namely reasoning with images. The SAIC project is developed in a theoretical framework that is shared with other applications. Not all tasks have the same importance in each application. We use this observation for designing an intelligent assistant system with three applications. In the SAIC project, we focus on knowledge acquisition, human-computer interaction and explanation. The project will benefit research in the two other applications. We also discuss our SAIC project in the context of large international projects that aim to re-use and share knowledge in a repository.

  16. DNA Sequencing via Quantum Mechanics and Machine Learning

    CERN Document Server

    Yuen, Henry; Zhang, Kevin J; Nomura, Ken-ichi; Kalia, Rajiv K; Nakano, Aiichiro; Vashishta, Priya

    2010-01-01

    Rapid sequencing of individual human genome is prerequisite to genomic medicine, where diseases will be prevented by preemptive cures. Quantum-mechanical tunneling through single-stranded DNA in a solid-state nanopore has been proposed for rapid DNA sequencing, but unfortunately the tunneling current alone cannot distinguish the four nucleotides due to large fluctuations in molecular conformation and solvent. Here, we propose a machine-learning approach applied to the tunneling current-voltage (I-V) characteristic for efficient discrimination between the four nucleotides. We first combine principal component analysis (PCA) and fuzzy c-means (FCM) clustering to learn the "fingerprints" of the electronic density-of-states (DOS) of the four nucleotides, which can be derived from the I-V data. We then apply the hidden Markov model and the Viterbi algorithm to sequence a time series of DOS data (i.e., to solve the sequencing problem). Numerical experiments show that the PCA-FCM approach can classify unlabeled DOS ...
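
    The decoding step mentioned above (a hidden Markov model with the Viterbi algorithm) can be sketched as follows. This is a generic discrete-HMM Viterbi decoder, not the authors' code: the hidden states would be the four nucleotides and the observations would be cluster labels obtained from the DOS data, but the function name, argument layout, and any probabilities passed in are assumptions for illustration. All probabilities must be non-zero and the observation list non-empty.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for a discrete HMM (log-space)."""
    # best[t][s] = log-probability of the best path that ends in state s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor that maximises the path score into s
            prev, lp = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = lp + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace back from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]
```

    Working in log-space avoids numerical underflow on long time series, which matters when the observation sequence covers many translocating nucleotides.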

  17. DNA sequence chromatogram browsing using JAVA and CORBA.

    Science.gov (United States)

    Parsons, J D; Buehler, E; Hillier, L

    1999-03-01

    DNA sequence chromatograms (traces) are the primary data source for all large-scale genomic and expressed sequence tags (ESTs) sequencing projects. Access to the sequencing trace assists many later analyses, for example contig assembly and polymorphism detection, but obtaining and using traces is problematic. Traces are not collected and published centrally, they are much larger than the base calls derived from them, and viewing them requires the interactivity of a local graphical client with local data. To provide efficient global access to DNA traces, we developed a client/server system based on flexible Java components integrated into other applications including an applet for use in a WWW browser and a stand-alone trace viewer. Client/server interaction is facilitated by CORBA middleware which provides a well-defined interface, a naming service, and location independence. [The software is packaged as a Jar file available from the following URL: http://www.ebi.ac.uk/jparsons. Links to working examples of the trace viewers can be found at http://corba.ebi.ac.uk/EST. All the Washington University mouse EST traces are available for browsing at the same URL.]

  18. The most frequent short sequences in non-coding DNA.

    Science.gov (United States)

    Subirana, Juan A; Messeguer, Xavier

    2010-03-01

    The purpose of this work is to determine the most frequent short sequences in non-coding DNA. They may play a role in maintaining the structure and function of eukaryotic chromosomes. We present a simple method for the detection and analysis of such sequences in several genomes, including Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens. We also study two chromosomes of man and mouse with a length similar to the whole genomes of the other species. We provide a list of the most common sequences of 9-14 bases in each genome. As expected, they are present in human Alu sequences. Our programs may also give a graph and a list of their position in the genome. Detection of clusters is also possible. In most cases, these sequences contain few alternating regions. Their intrinsic structure and their influence on nucleosome formation are not known. In particular, we have found new features of short sequences in C. elegans, which are distributed in heterogeneous clusters. They appear as punctuation marks in the chromosomes. Such clusters are not found in either A. thaliana or D. melanogaster. We discuss the possibility that they play a role in centromere function and homolog recognition in meiosis. PMID:19966278
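
    Finding the most frequent short sequences amounts to counting overlapping k-mers. The sketch below is a minimal illustration, not the authors' program: the function name `most_frequent_kmers` and its parameters are assumptions, it counts a single strand only, and it skips windows containing symbols other than A, C, G and T. The abstract reports lengths of 9-14 bases; any `k` can be passed.

```python
from collections import Counter


def most_frequent_kmers(sequence, k, top=5):
    """Count every overlapping k-mer and return the `top` most frequent ones."""
    seq = sequence.upper()
    valid = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid  # skip windows with N or other symbols
    )
    return counts.most_common(top)
```

    For whole chromosomes an in-memory counter over long k-mers can become large, so a real analysis would likely stream the sequence or use a dedicated k-mer counting tool.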

  19. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  20. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Directory of Open Access Journals (Sweden)

    Chun-Tien Chang

    2012-01-01

    Full Text Available The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  1. Mixed sequence reader: a program for analyzing DNA sequences with heterozygous base calling.

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  2. Electromechanical Signatures for DNA Sequencing through a Mechanosensitive Nanopore.

    Science.gov (United States)

    Farimani, A Barati; Heiranian, M; Aluru, N R

    2015-02-19

    Biological nanopores have been extensively used for DNA base detection since these pores are widely available and tunable through mutations. Distinguishing bases of nucleic acids by passing them through nanopores has so far primarily relied on electrical signals, specifically ionic currents through the nanopores. However, the low signal-to-noise ratio makes detection of ionic currents difficult. In this study, we show that the initially closed mechanosensitive channel of large conductance (MscL) protein pore opens for single-stranded DNA (ssDNA) translocation under an applied electric field. As each nucleotide translocates through the pore, a unique mechanical signal is observed: the tension in the membrane containing the MscL pore is different for each nucleotide. In addition to the membrane tension, we found that the ionic current is also different for the four nucleotide types. The initially closed MscL adapts its opening for nucleotide translocation due to the flexibility of the pore. This unique operation of MscL provides single nucleotide resolution in both electrical and mechanical signals. Finally, we also show that the speed of DNA translocation is roughly 1 order of magnitude slower in MscL compared to Mycobacterium smegmatis porin A (MspA), suggesting MscL to be an attractive protein pore for DNA sequencing. PMID:26262481

  3. Chimeric TALE recombinases with programmable DNA sequence specificity.

    Science.gov (United States)

    Mercer, Andrew C; Gaj, Thomas; Fuller, Roberta P; Barbas, Carlos F

    2012-11-01

    Site-specific recombinases are powerful tools for genome engineering. Hyperactivated variants of the resolvase/invertase family of serine recombinases function without accessory factors, and thus can be re-targeted to sequences of interest by replacing native DNA-binding domains (DBDs) with engineered zinc-finger proteins (ZFPs). However, imperfect modularity with particular domains, lack of high-affinity binding to all DNA triplets, and difficulty in construction has hindered the widespread adoption of ZFPs in unspecialized laboratories. The discovery of a novel type of DBD in transcription activator-like effector (TALE) proteins from Xanthomonas provides an alternative to ZFPs. Here we describe chimeric TALE recombinases (TALERs): engineered fusions between a hyperactivated catalytic domain from the DNA invertase Gin and an optimized TALE architecture. We use a library of incrementally truncated TALE variants to identify TALER fusions that modify DNA with efficiency and specificity comparable to zinc-finger recombinases in bacterial cells. We also show that TALERs recombine DNA in mammalian cells. The TALER architecture described herein provides a platform for insertion of customized TALE domains, thus significantly expanding the targeting capacity of engineered recombinases and their potential applications in biotechnology and medicine.

  4. A 28,000 Years Old Cro-Magnon mtDNA Sequence Differs from All Potentially Contaminating Modern Sequences

    OpenAIRE

    David Caramelli; Lucio Milani; Stefania Vai; Alessandra Modi; Elena Pecchioli; Matteo Girardi; Elena Pilli; Martina Lari; Barbara Lippi; Annamaria Ronchitelli; Francesco Mallegni; Antonella Casoli; Giorgio Bertorelle; Guido Barbujani

    2008-01-01

    Background: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28...

  5. Bacterial DNA Sequence Compression Models Using Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Armando J. Pinho

    2013-08-01

    Full Text Available It is widely accepted that the advances in DNA sequencing techniques have contributed to an unprecedented growth of genomic data. This fact has increased the interest in DNA compression, not only from the information theory and biology points of view, but also from a practical perspective, since such sequences require storage resources. Several compression methods exist, and particularly, those using finite-context models (FCMs) have received increasing attention, as they have been proven to effectively compress DNA sequences with low bits-per-base, as well as low encoding/decoding time-per-base. However, the amount of run-time memory required to store high-order finite-context models may become impractical, since a context-order as low as 16 requires a maximum of 17.2 x 10^9 memory entries. This paper presents a method to reduce such a memory requirement by using a novel application of artificial neural networks (ANN) to build such probabilistic models in a compact way and shows how to use them to estimate the probabilities. Such a system was implemented, and its performance compared against state-of-the-art compressors, such as XM-DNA (expert model) and FCM-Mx (mixture of finite-context models), as well as with general-purpose compressors. Using a combination of order-10 FCM and ANN, similar encoding results to those of FCM, up to order-16, are obtained using only 17 megabytes of memory, whereas the latter, even employing hash-tables, uses several hundreds of megabytes.
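
    The memory figure quoted above follows from the table size of a finite-context model: an order-k model over four bases needs one counter per (context, next symbol) pair, i.e. 4^(k+1) entries, which for k = 16 is 4^17, about 17.2 x 10^9. The sketch below illustrates this count and a plain adaptive FCM that reports the average code length in bits per base. It is a minimal illustration rather than the paper's implementation: the function names and the smoothing parameter `alpha` are assumptions, no neural network is involved, and the input is assumed to contain only A, C, G and T.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def fcm_table_entries(order):
    """Maximum number of counters: one per (context, next symbol) pair."""
    return len(ALPHABET) ** (order + 1)


def fcm_bits_per_base(sequence, order, alpha=1.0):
    """Adaptive finite-context model; returns the average code length in bits per base."""
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0))
    total_bits = 0.0
    coded = 0
    for i in range(order, len(sequence)):
        context, symbol = sequence[i - order:i], sequence[i]
        ctx = counts[context]
        # Probability estimate with additive smoothing (alpha is an assumed parameter)
        p = (ctx[symbol] + alpha) / (sum(ctx.values()) + alpha * len(ALPHABET))
        total_bits += -math.log2(p)
        ctx[symbol] += 1  # update after coding so a decoder could stay in sync
        coded += 1
    return total_bits / coded if coded else 0.0
```

    The dictionary here only stores contexts that actually occur, which is the hash-table strategy the abstract mentions; the paper's contribution is to replace that table with a compact neural model.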

  6. Complete genome sequence of mitochondrial DNA (mtDNA) of Chlorella sorokiniana.

    Science.gov (United States)

    Orsini, Massimiliano; Costelli, Cristina; Malavasi, Veronica; Cusano, Roberto; Concas, Alessandro; Angius, Andrea; Cao, Giacomo

    2016-01-01

    The complete sequence of the mitochondrial genome of the Chlorella sorokiniana strain (SAG 111-8 k) is presented in this work. Within the Chlorella genus, it represents the second species with a completely sequenced and annotated mitochondrial genome (GenBank accession no. KM241869). The genome consists of circular chromosomes of 52,528 bp and encodes a total of 31 protein coding genes, 3 rRNAs and 26 tRNAs. The overall AT content of the C. sorokiniana mtDNA is 70.89%, while the coding sequence is of 97.4%.

  7. Sequencing of mitochondrial HV1 and HV2 DNA with length heteroplasmy

    DEFF Research Database (Denmark)

    Rasmussen, E. Michael; Eriksen, Birthe; Larsen, Hans Jakob

    2003-01-01

    This study presents a fast method for sequencing the poly C/G regions in HV1 and HV2 in the mitochondrial DNA (mtDNA)...

  8. Choosing the best heuristic for seeded alignment of DNA sequences

    Directory of Open Access Journals (Sweden)

    Buhler Jeremy

    2006-03-01

    Full Text Available Abstract Background: Seeded alignment is an important component of algorithms for fast, large-scale DNA similarity search. A good seed matching heuristic can reduce the execution time of genomic-scale sequence comparison without degrading sensitivity. Recently, many types of seed have been proposed to improve on the performance of traditional contiguous seeds as used in, e.g., NCBI BLASTN. Choosing among these seed types, particularly those that use information besides the presence or absence of matching residue pairs, requires practical guidance based on a rigorous comparison, including assessment of sensitivity, specificity, and computational efficiency. This work performs such a comparison, focusing on alignments in DNA outside widely studied coding regions. Results: We compare seeds of several types, including those allowing transition mutations rather than matches at fixed positions, those allowing transitions at arbitrary positions ("BLASTZ" seeds), and those using a more general scoring matrix. For each seed type, we use an extended version of our Mandala seed design software to choose seeds with optimized sensitivity for various levels of specificity. Our results show that, on a test set biased toward alignments of noncoding DNA, transition information significantly improves seed performance, while finer distinctions between different types of mismatches do not. BLASTZ seeds perform especially well. These results depend on properties of our test set that are not shared by EST-based test sets with a strong bias toward coding DNA. Conclusion: Practical seed design requires careful attention to the properties of the alignments being sought. For noncoding DNA sequences, seeds that use transition information, especially BLASTZ-style seeds, are particularly useful. The Mandala seed design software can be found at http://www.cse.wustl.edu/~yanni/mandala/.
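
    To make the seed types above concrete, the Python sketch below tests whether two equal-length windows satisfy a spaced seed, optionally tolerating a limited number of transitions (A<->G, C<->T) at any required position, in the spirit of the "BLASTZ" seeds mentioned in the abstract. The pattern encoding ('1' = must match, '0' = ignored), the function names, and the naive quadratic scan are assumptions for illustration; this is not the Mandala software or the BLASTZ implementation.

```python
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def seed_hit(x: str, y: str, pattern: str, max_transitions: int = 0) -> bool:
    """Return True if windows x and y satisfy the spaced seed `pattern`.

    '1' positions must match, or be a transition while the transition budget lasts;
    '0' positions are ignored.
    """
    if not (len(x) == len(y) == len(pattern)):
        raise ValueError("windows and pattern must have equal length")
    used = 0
    for a, b, p in zip(x, y, pattern):
        if p == "0" or a == b:
            continue
        if (a, b) in TRANSITIONS and used < max_transitions:
            used += 1
            continue
        return False
    return True


def find_seed_hits(s: str, t: str, pattern: str, max_transitions: int = 0):
    """Naive O(len(s) * len(t)) scan; practical tools index seeds in hash tables."""
    w = len(pattern)
    return [
        (i, j)
        for i in range(len(s) - w + 1)
        for j in range(len(t) - w + 1)
        if seed_hit(s[i:i + w], t[j:j + w], pattern, max_transitions)
    ]
```

    With `max_transitions=0` and an all-ones pattern this reduces to a contiguous exact-match seed; raising the budget trades specificity for sensitivity, which is the trade-off the abstract evaluates.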

  9. Stability of capillary gels for automated sequencing of DNA.

    Science.gov (United States)

    Swerdlow, H; Dew-Jager, K E; Brady, K; Grey, R; Dovichi, N J; Gesteland, R

    1992-08-01

    Recent interest in capillary gel electrophoresis has been fueled by the Human Genome Project and other large-scale sequencing projects. Advances in gel polymerization techniques and detector design have enabled sequencing of DNA directly in capillaries. Efforts to exploit this technology have been hampered by problems with the reproducibility and stability of gels. Gel instability manifests itself during electrophoresis as a decrease in the current passing through the capillary under a constant voltage. Upon subsequent microscopic examination, bubbles are often visible at or near the injection (cathodic) end of the capillary gel. Gels have been prepared with the polyacrylamide matrix covalently attached to the silica walls of the capillary. These gels, although more stable, still suffer from problems with bubbles. The use of actual DNA sequencing samples also adversely affects gel stability. We examined the mechanisms underlying these disruptive processes by employing polyacrylamide gel-filled capillaries in which the gel was not attached to the capillary wall. Three sources of gel instability were identified. Bubbles occurring in the absence of sample introduction were attributed to electroosmotic force; replacing the denaturant urea with formamide was shown to reduce the frequency of these bubbles. The slow, steady decline in current through capillary sequencing gels interferes with the ability to detect other gel problems. This phenomenon was shown to be a result of ionic depletion at the gel-liquid interface. The decline was ameliorated by adding denaturant and acrylamide monomers to the buffer reservoirs. Sample-induced problems were shown to be due to the presence of template DNA; elimination of the template allowed sample loading to occur without complications.(ABSTRACT TRUNCATED AT 250 WORDS)

  10. Characterization of Expressed Sequence Tags From a Gallus gallus Pineal Gland cDNA Library

    OpenAIRE

    Stefanie Hartman; Greg Touchton; Jessica Wynn; Tuoyu Geng; Chong, Nelson W.; Ed Smith

    2005-01-01

    The pineal gland is the circadian oscillator in the chicken, regulating diverse functions ranging from egg laying to feeding. Here, we describe the isolation and characterization of expressed sequence tags (ESTs) isolated from a chicken pineal gland cDNA library. A total of 192 unique sequences were analysed and submitted to GenBank; 6% of the ESTs matched neither GenBank cDNA sequences nor the newly assembled chicken genomic DNA sequence, three ESTs aligned with sequences d...

  11. Using Synthetic Nanopores for Single-Molecule Analyses: Detecting SNPs, Trapping DNA Molecules, and the Prospects for Sequencing DNA

    Science.gov (United States)

    Dimitrov, Valentin V.

    2009-01-01

    This work focuses on studying properties of DNA molecules and DNA-protein interactions using synthetic nanopores, and it examines the prospects of sequencing DNA using synthetic nanopores. We have developed a method for discriminating between alleles that uses a synthetic nanopore to measure the binding of a restriction enzyme to DNA. There exists…

  12. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing

    Directory of Open Access Journals (Sweden)

    Zdepski Anna

    2011-05-01

    Full Text Available Abstract Background: High throughput sequencing (HTS) technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants that contain nuclear DNA, as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. Results: We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR). We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Conclusions: Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants.
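
    The abstract does not give the formula used to turn qPCR measurements into relative copy numbers. One common approach, assumed here, is the comparative threshold-cycle calculation: if the organellar target crosses threshold at cycle Ct_target and a nuclear reference gene at Ct_reference, the relative copy number is E^(Ct_reference - Ct_target), where E is the per-cycle amplification factor (2 for perfect doubling). The function name and the equal-efficiency assumption are illustrative only.

```python
def relative_copy_number(ct_target: float, ct_reference: float,
                         efficiency: float = 2.0) -> float:
    """Copies of the target per copy of the reference, from threshold cycles.

    Assumes both amplicons amplify with the same per-cycle factor `efficiency`
    (2.0 means perfect doubling). A lower Ct means more starting template.
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    return efficiency ** (ct_reference - ct_target)
```

    For example, a chloroplast amplicon that crosses threshold five cycles earlier than the nuclear reference corresponds, under these assumptions, to 2^5 = 32 chloroplast copies per nuclear copy.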

  13. Anonymous Broadcast Messages

    Directory of Open Access Journals (Sweden)

    Dragan Lazic

    2013-01-01

    Full Text Available The Dining Cryptographer network (or DC-net) is a privacy preserving communication protocol devised by David Chaum for anonymous message publication. A very attractive feature of DC-nets is the strength of its security, which is inherent in the protocol and is not dependent on other schemes, like encryption. Unfortunately the DC-net protocol has a level of complexity that causes it to suffer from exceptional communication overhead and implementation difficulty that precludes its use in many real-world use-cases. We have designed and created a DC-net implementation that uses a pure client-server model, which successfully avoids much of the complexity inherent in the DC-net protocol. We describe the theory of DC-nets and our pure client-server implementation, as well as the compromises that were made to reduce the protocol’s level of complexity. Discussion centers around the details of our implementation of DC-net.
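
    The core of a DC-net round can be sketched in a few lines of Python. Every pair of participants shares a random pad; each participant announces the XOR of all pads it holds, and the sender additionally XORs in the message. Because each pad appears in exactly two announcements, the XOR of all announcements is the message, while no single announcement reveals who sent it. This is the classic peer-to-peer construction simulated in one process, not the client-server implementation described in the abstract; the function names are hypothetical, and collisions between simultaneous senders are not handled.

```python
import secrets
from functools import reduce
from itertools import combinations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def dc_net_round(n_participants: int, sender: int, message: bytes) -> bytes:
    """Simulate one DC-net round and return the publicly recovered message."""
    size = len(message)
    zero = bytes(size)
    # Each unordered pair of participants shares one secret random pad.
    pads = {pair: secrets.token_bytes(size)
            for pair in combinations(range(n_participants), 2)}
    announcements = []
    for p in range(n_participants):
        out = message if p == sender else zero
        for pair, pad in pads.items():
            if p in pair:
                out = xor_bytes(out, pad)
        announcements.append(out)
    # Every pad was XORed in exactly twice, so all pads cancel.
    return reduce(xor_bytes, announcements, zero)
```

    The number of shared pads grows quadratically with the number of participants, which illustrates the communication overhead the abstract refers to.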

  14. Isolation of Human Genomic DNA Sequences with Expanded Nucleobase Selectivity.

    Science.gov (United States)

    Rathi, Preeti; Maurer, Sara; Kubik, Grzegorz; Summerer, Daniel

    2016-08-10

    We report the direct isolation of user-defined DNA sequences from the human genome with programmable selectivity for both canonical and epigenetic nucleobases. This is enabled by the use of engineered transcription-activator-like effectors (TALEs) as DNA major groove-binding probes in affinity enrichment. The approach provides the direct quantification of 5-methylcytosine (5mC) levels at single genomic nucleotide positions in a strand-specific manner. We demonstrate the simple, multiplexed typing of a variety of epigenetic cancer biomarker 5mC with custom TALE mixes. Compared to antibodies as the most widely used affinity probes for 5mC analysis, i.e., employed in the methylated DNA immunoprecipitation (MeDIP) protocol, TALEs provide superior sensitivity, resolution and technical ease. We engineer a range of size-reduced TALE repeats and establish full selectivity profiles for their binding to all five human cytosine nucleobases. These provide insights into their nucleobase recognition mechanisms and reveal the ability of TALEs to isolate genomic target sequences with selectivity for single 5-hydroxymethylcytosine and, in combination with sodium borohydride reduction, single 5-formylcytosine nucleobases. PMID:27429302

  15. Discovering motifs in ranked lists of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Eran Eden

    2007-03-01

    Full Text Available Computational methods for discovery of sequence elements that are enriched in a target set compared with a background set are fundamental in molecular biology research. One example is the discovery of transcription factor binding motifs that are inferred from ChIP-chip (chromatin immuno-precipitation on a microarray) measurements. Several major challenges in sequence motif discovery still require consideration: (i) the need for a principled approach to partitioning the data into target and background sets; (ii) the lack of rigorous models and of an exact p-value for measuring motif enrichment; (iii) the need for an appropriate framework for accounting for motif multiplicity; (iv) the tendency, in many of the existing methods, to report presumably significant motifs even when applied to randomly generated data. In this paper we present a statistical framework for discovering enriched sequence elements in ranked lists that resolves these four issues. We demonstrate the implementation of this framework in a software application, termed DRIM (discovery of rank imbalanced motifs), which identifies sequence motifs in lists of ranked DNA sequences. We applied DRIM to ChIP-chip and CpG methylation data and obtained the following results. (i) Identification of 50 novel putative transcription factor (TF) binding sites in yeast ChIP-chip data. The biological function of some of them was further investigated to gain new insights on transcription regulation networks in yeast. For example, our discoveries enable the elucidation of the network of the TF ARO80. Another finding concerns a systematic TF binding enhancement to sequences containing CA repeats. (ii) Discovery of novel motifs in human cancer CpG methylation data. Remarkably, most of these motifs are similar to DNA sequence elements bound by the Polycomb complex that promotes histone methylation. Our findings thus support a model in which histone methylation and CpG methylation are mechanistically linked
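
    The abstract does not spell out the enrichment statistic. One statistic suited to ranked lists, which avoids fixing the target/background split in advance, is the minimum hypergeometric (mHG) score: for every cutoff n, compute the hypergeometric tail probability of seeing at least the observed number of motif-bearing sequences among the top n, and take the minimum over n. The Python sketch below computes that score only; because it is minimized over cutoffs it is not itself a p-value, and the exact p-value mentioned in the abstract is not reproduced here. Function names are hypothetical.

```python
from math import comb
from typing import Sequence, Tuple


def hypergeom_tail(N: int, B: int, n: int, b: int) -> float:
    """P(X >= b), X = motif-bearing sequences among the top n of N, B bearing in total."""
    upper = min(n, B)
    total = comb(N, n)
    return sum(comb(B, i) * comb(N - B, n - i) for i in range(b, upper + 1)) / total


def mhg_score(occurrence: Sequence[int]) -> Tuple[float, int]:
    """Minimum hypergeometric score over all cutoffs of a ranked 0/1 vector.

    occurrence[r] is 1 if the sequence at rank r contains the motif, else 0.
    Returns (score, cutoff); this is a score, not a corrected p-value.
    """
    N, B = len(occurrence), sum(occurrence)
    best, best_n, b = 1.0, 0, 0
    for n in range(1, N):
        b += occurrence[n - 1]
        tail = hypergeom_tail(N, B, n, b)
        if tail < best:
            best, best_n = tail, n
    return best, best_n
```

    For a list of ten sequences in which only the top three carry the motif, the best cutoff is n = 3 and the score is 1/C(10,3) = 1/120.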

  16. Efficient Anonymizations with Enhanced Utility

    Directory of Open Access Journals (Sweden)

    Jacob Goldberger

    2010-08-01

    Full Text Available One of the most well studied models of privacy preservation is k-anonymity. Previous studies of k-anonymization used various utility measures that aim at enhancing the correlation between the original public data and the generalized public data. We, bearing in mind that a primary goal in releasing the anonymized database for data mining is to deduce methods of predicting the private data from the public data, propose a new information-theoretic measure that aims at enhancing the correlation between the generalized public data and the private data. Such a measure significantly enhances the utility of the released anonymized database for data mining. We then proceed to describe a new algorithm that is designed to achieve k-anonymity with high utility, independently of the underlying utility measure. That algorithm is based on a modified version of sequential clustering which is the method of choice in clustering. Experimental comparison with four well known algorithms of k-anonymity shows that the sequential clustering algorithm is an efficient algorithm that achieves the best utility results. We also describe a modification of the algorithm that outputs k-anonymizations which respect the additional security measure of l-diversity.
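
    The two privacy conditions named above can be stated as simple checks on a released table. The Python sketch below verifies k-anonymity (every combination of quasi-identifier values is shared by at least k records) and the simplest, "distinct" form of l-diversity (every such group contains at least l different sensitive values). It only checks a table; it is not the sequential clustering algorithm of the paper, and the function and column names are assumptions for illustration.

```python
from collections import Counter, defaultdict


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k rows."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """True if every quasi-identifier group holds at least l distinct sensitive values."""
    values = defaultdict(set)
    for row in rows:
        values[tuple(row[q] for q in quasi_identifiers)].add(row[sensitive])
    return all(len(v) >= l for v in values.values())


# Hypothetical generalized table: age ranges and zip prefixes are public, diagnosis is private.
TABLE = [
    {"age": "20-29", "zip": "130**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "130**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "148**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "148**", "diagnosis": "flu"},
]
```

    Here `TABLE` is 2-anonymous over `("age", "zip")` but not 2-diverse, because both records in the second group share the same diagnosis.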

  17. PCR master mixes harbour murine DNA sequences. Caveat emptor!

    Directory of Open Access Journals (Sweden)

    Philip W Tuke

    Full Text Available BACKGROUND: XMRV is the most recently described retrovirus to be found in Man, firstly in patients with prostate cancer (PC) and secondly in 67% of patients with chronic fatigue syndrome (CFS) and 3.7% of controls. Both disease associations remain contentious. Indeed, a recent publication has concluded that "XMRV is unlikely to be a human pathogen". Subsequently related but different polytropic MLV (pMLV) sequences were also reported from the blood of 86.5% of patients with CFS and 6.8% of controls. Consequently we decided to investigate blood donors for evidence of XMRV/pMLV. METHODOLOGY/PRINCIPAL FINDINGS: Testing of cDNA prepared from the whole blood of 80 random blood donors, generated gag PCR signals from two samples (7C and 9C). These had previously tested negative for XMRV by two other PCR based techniques. To test whether the PCR mix was the source of these sequences 88 replicates of water were amplified using Invitrogen Platinum Taq (IPT) and Applied Biosystems Taq Gold LD (ABTG). Four gag sequences (2D, 3F, 7H, 12C) were generated with the IPT, a further sequence (12D) by ABTG re-amplification of an IPT first round product. Sequence comparisons revealed remarkable similarities between these sequences, endogenous MLVs and the pMLV sequences reported in patients with CFS. CONCLUSIONS/SIGNIFICANCE: Methodologies for the detection of viruses highly homologous to endogenous murine viruses require special caution as the very reagents used in the detection process can be a source of contamination and at a level where it is not immediately apparent. It is suggested that such contamination is likely to explain the apparent presence of pMLV in CFS.

  18. Sequence analysis of four caprine mitochondria DNA lineages

    Directory of Open Access Journals (Sweden)

    Yue-Hui Ma

    2012-10-01

    Full Text Available The complete mitochondrial DNA (mtDNA) (16,640 bp in length) was sequenced from four Chinese goat lineages representing the four major mtDNA haplogroups in goats. A total of 124 single nucleotide polymorphisms (SNPs) were found in encoding regions, and the overall ratio of transitions:transversions was 40:1 revealing a heavy transition/transversion rate in domestic goats. Eighteen non-synonymous sites were found for the total number of SNPs; the sites did not affect the predicted functions of protein for these four goat mtDNA lineages. In the region for coding tRNA and rRNA, SNPs occurred in loops, unstructured single strand and stems that were conformed with the principle of G-U pairing. We came to the conclusion that these substitutions could not change secondary structure of RNAs, and there was no positive selection on goat mitochondrial coding region according to the result of dN/dS (0.0399-0.1529) by comparing the goat with other reported mitochondrial genomes.
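
    The transition:transversion ratio reported above comes from classifying each substitution between aligned sequences: a change within the purines (A<->G) or within the pyrimidines (C<->T) is a transition, and any purine<->pyrimidine change is a transversion. The Python sketch below counts both classes for one pair of aligned sequences; the function names are hypothetical, and sites containing gaps or ambiguity codes are simply skipped, which is an assumption rather than the authors' procedure.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_substitution(a: str, b: str) -> str:
    """Return 'transition' or 'transversion' for two different unambiguous bases."""
    same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def ti_tv_counts(seq1: str, seq2: str):
    """Count (transitions, transversions) between two aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    valid = PURINES | PYRIMIDINES
    ti = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in valid or b not in valid:
            continue  # identical site, gap, or ambiguity code
        if classify_substitution(a, b) == "transition":
            ti += 1
        else:
            tv += 1
    return ti, tv
```

    For example, `ti_tv_counts("AACCGT", "GACTGA")` finds two transitions (A->G, C->T) and one transversion (T->A).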

  19. Peptide Synthesis on a Next-Generation DNA Sequencing Platform.

    Science.gov (United States)

    Svensen, Nina; Peersen, Olve B; Jaffrey, Samie R

    2016-09-01

    Methods for displaying large numbers of peptides on solid surfaces are essential for high-throughput characterization of peptide function and binding properties. Here we describe a method for converting the >10^7 flow cell-bound clusters of identical DNA strands generated by the Illumina DNA sequencing technology into clusters of complementary RNA, and subsequently peptide clusters. We modified the flow-cell-bound primers with ribonucleotides thus enabling them to be used by poliovirus polymerase 3D(pol). The primers hybridize to the clustered DNA thus leading to RNA clusters. The RNAs fold into functional protein- or small molecule-binding aptamers. We used the mRNA-display approach to synthesize flow-cell-tethered peptides from these RNA clusters. The peptides showed selective binding to cognate antibodies. The methods described here provide an approach for using DNA clusters to template peptide synthesis on an Illumina flow cell, thus providing new opportunities for massively parallel peptide-based assays.

  20. Programmable in vivo selection of arbitrary DNA sequences.

    Directory of Open Access Journals (Sweden)

    Tuval Ben Yehezkel

    Full Text Available The extraordinary fidelity, sensory and regulatory capacity of natural intracellular machinery is generally confined to their endogenous environment. Nevertheless, synthetic bio-molecular components have been engineered to interface with the cellular transcription, splicing and translation machinery in vivo by embedding functional features such as promoters, introns and ribosome binding sites, respectively, into their design. Tapping and directing the power of intracellular molecular processing towards synthetic bio-molecular inputs is potentially a powerful approach, albeit limited by our ability to streamline the interface of synthetic components with the intracellular machinery in vivo. Here we show how a library of synthetic DNA devices, each bearing an input DNA sequence and a logical selection module, can be designed to direct its own probing and processing by interfacing with the bacterial DNA mismatch repair (MMR system in vivo and selecting for the most abundant variant, regardless of its function. The device provides proof of concept for programmable, function-independent DNA selection in vivo and provides a unique example of a logical-functional interface of an engineered synthetic component with a complex endogenous cellular system. Further research into the design, construction and operation of synthetic devices in vivo may lead to other functional devices that interface with other complex cellular processes for both research and applied purposes.

  1. Chromatin reconstitution on small DNA rings. IV. DNA supercoiling and nucleosome sequence preference.

    Science.gov (United States)

    Duband-Goulet, I; Carot, V; Ulyanov, A V; Douc-Rasy, S; Prunell, A

    1992-04-20

    Nucleosome formation on inverted repeats or on some alternations of purines and pyrimidines can be inhibited in vitro by DNA supercoiling through their supercoiling-induced structural transitions to cruciforms or Z-form DNA, respectively. We report here, as a result of study of single nucleosome reconstitutions on a DNA minicircle, that a physiological level of DNA supercoiling can also enhance nucleosome sequence preference. The 357 base-pair minicircle was composed of a promoter of phage SP6 RNA polymerase joined to a 256 base-pair fragment containing a sea urchin 5 S RNA gene. Nucleosome formation on the promoter was found to be enhanced on a topoisomer with in vivo superhelix density when compared to topoisomers of lower or higher superhelical densities, to the nicked circle, or to the linear DNA. In contrast, nucleosomes at other positions appeared to be insensitive to supercoiling. This observation relied on a novel procedure for the investigation of nucleosome positioning. The reconstituted circular chromatin was first linearized using a restriction endonuclease, and the linear chromatin so obtained was electrophoresed as nucleoprotein in a polyacrylamide gel. The gel showed well-fractionated bands whose mobilities were a V-like function of nucleosome positions, with the nucleosome near the middle migrating less. This behavior is similar to that previously observed for complexes of sequence-specific DNA-bending proteins with circularly permuted DNA fragments, and presumably reflects the change in the direction of the DNA axis between the entrance and the exit of the particle. Possible mechanisms for such supercoiling-induced modulation of nucleosome formation are discussed in the light of the supercoiling-dependent susceptibility to cleavage of the naked minicircle with S1 and Bal31 nucleases; and a comparison between DNase I cleavage patterns of the modulated nucleosome and of another, non-modulated, overlapping nucleosome. PMID:1314907

  2. New scoring schema for finding motifs in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Nowzari-Dalini Abbas

    2009-03-01

    Full Text Available Abstract Background: Pattern discovery in DNA sequences is one of the most fundamental problems in molecular biology with important applications in finding regulatory signals and transcription factor binding sites. An important task in this problem is to search (or predict) known binding sites in a new DNA sequence. For this reason, all subsequences of the given DNA sequence are scored based on a scoring function and the prediction is done by selecting the best score. Most of the available tools for known binding site prediction are designed on the assumption of no dependency between binding site base positions. Recently Tomovic and Oakeley investigated the statistical basis for either a claim of dependence or independence, to determine whether such a claim is generally true, and they presented a scoring function for binding site prediction based on the dependency between binding site base positions. Our primary objective is to investigate the scoring functions which can be used in known binding site prediction based on the assumption of dependency or independency in binding site base positions. Results: We propose a new scoring function based on the dependency between all positions in binding site base positions. This scoring function uses joint information content and mutual information as a measure of dependency between positions in transcription factor binding site. Our method for modeling dependencies is simply an extension of position independency methods. We evaluate our new scoring function on the real data sets extracted from JASPAR and TRANSFAC data bases, and compare the obtained results with two other well known scoring functions. Conclusion: The results demonstrate that the new approach improves known binding site discovery and show that the joint information content and mutual information provide a better and more general criterion to investigate the relationships between positions in the TFBS. Our scoring function is formulated by simple
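
    The dependency measure named in the abstract, mutual information between two positions of aligned binding sites, can be computed directly from column frequencies: I(i; j) = H(i) + H(j) - H(i, j), where H is Shannon entropy. A value of zero means the two positions vary independently in the sample. The Python sketch below computes only this quantity from a list of equal-length sites; it is not the scoring function proposed in the paper, the function names are hypothetical, and no small-sample correction is applied.

```python
from collections import Counter
from math import log2


def entropy(symbols) -> float:
    """Shannon entropy (bits) of the empirical distribution of `symbols`."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(sites, i: int, j: int) -> float:
    """Mutual information (bits) between positions i and j of aligned sites."""
    col_i = [s[i] for s in sites]
    col_j = [s[j] for s in sites]
    joint = list(zip(col_i, col_j))
    return entropy(col_i) + entropy(col_j) - entropy(joint)
```

    For the sites `["AA", "CC", "GG", "TT"]` the second base is fully determined by the first, giving 2 bits; for `["AA", "AC", "CA", "CC"]` the positions are independent and the result is 0.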

  3. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens.

    Science.gov (United States)

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-09-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high-target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16 lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content.
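
    The tagging step described above implies a demultiplexing step after sequencing: each read is assigned to a specimen by its 10-mer tag, and the tag is then trimmed off. The Python sketch below shows one minimal way to do this, assuming the tag sits at the very start of the read and must match exactly; real pipelines usually tolerate sequencing errors in the tag. The function name and the example tags are hypothetical.

```python
def demultiplex(reads, tag_to_specimen, tag_len=10):
    """Group reads by their leading tag.

    Returns (bins, unassigned): bins maps specimen -> list of reads with the tag
    removed; unassigned holds reads whose leading tag is not in tag_to_specimen.
    """
    bins = {}
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        specimen = tag_to_specimen.get(tag)
        if specimen is None:
            unassigned.append(read)
        else:
            bins.setdefault(specimen, []).append(insert)
    return bins, unassigned


# Hypothetical tags and reads for illustration only.
TAGS = {"ACGTACGTAC": "specimen_1", "TTGGCCAATT": "specimen_2"}
READS = ["ACGTACGTACGGGAAA", "TTGGCCAATTCCCTTT", "ACGTACGTACGGGAAT", "NNNNNNNNNNAAAA"]
```

    Using two lanes of a 16-lane run corresponds to 2/16 = 12.5% of the run's capacity, as stated in the abstract.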

  4. Narcotics Anonymous: Anonymity, admiration, and prestige in an egalitarian community

    OpenAIRE

    Snyder, Jeffrey K.; Fessler, Daniel M.T.

    2014-01-01

    Narcotics Anonymous (NA) supports long-term recovery for those addicted to drugs. Paralleling social dynamics in many small-scale societies, NA exhibits tension between egalitarianism and prestige-based hierarchy, a problem exacerbated by the addict’s personality as characterized by NA’s ethnopsychology. We explore how NA’s central principle of anonymity normatively translates into egalitarianism among group members. Turning to the lived reality of membership, building on Carr’s (2011) conc...

  5. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.;

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences over all chromosomes of H. vulgare and the wild barley species H. bulbosum, H. marinum and H. murinum. Southern blot hybridization revealed different levels of polymorphism among barley species and the RFLP data were used to generate a phylogenetic tree for the genus Hordeum. Our data are in good agreement with the classification system which suggests the division of the genus into four major groups, containing the genomes I, X, Y, and H. However, our investigation also supports previous molecular studies of barley species where the unique position of H. bulbosum has been pointed out. In our...

  6. Ribbon channel plate rotating drum DNA sequencing device.

    Science.gov (United States)

    Douthart, R J; Welt, M; Walling, L

    1996-01-01

    A new design DNA sequencing electrophoresis device is described. The device, called the ribbon channeled plate rotating drum (rprd), consists of two major components, the plate assembly and the drum assembly. The plate assembly contains a machined or etched plate of individual micro-channels called the ribbon channeled plate. The ribbon channeled plate and other components of the plate assembly combine the advantages of thin gels and capillary arrays in a single unit with few of the disadvantages. The other major component of rprd is the drum assembly, which facilitates direct blotting onto deposition membranes affixed to a large plastic drum. The drum with attached membrane and deposited electrophoretically resolved ladders is easily moved to special units facilitating downstream processing and detection. The drum unit, although versatile, is specifically designed to be used with multiplex sequencing. PMID:8907517

  7. PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context.

    Science.gov (United States)

    Zhou, Jiyun; Xu, Ruifeng; He, Yulan; Lu, Qin; Wang, Hongpeng; Kong, Bing

    2016-01-01

    Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machines (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context and the combination of them gives more performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web-server of our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely available to the biological research community. PMID:27282833

  8. Mitochondrial DNA sequence variation in the Anatolian Peninsula (Turkey)

    Indian Academy of Sciences (India)

    Hatice Mergen; Reyhan Öner; Cihan Öner

    2004-04-01

    Throughout human history, the region known today as the Anatolian peninsula (Turkey) has served as a junction connecting the Middle East, Europe and Central Asia, and, thus, has been subject to major population movements. The present study is undertaken to obtain information about the distribution of the existing mitochondrial D-loop sequence variations in the Turkish population of Anatolia. A few studies have previously reported mtDNA sequences in Turks. We attempted to extend these results by analysing a cohort that is not only larger, but also more representative of the Turkish population living in Anatolia. In order to obtain a descriptive picture for the phylogenetic distribution of the mitochondrial genome within Turkey, we analysed mitochondrial D-loop region sequence variations in 75 individuals from different parts of Anatolia by direct sequencing. Analysis of the two hypervariable segments within the noncoding region of the mitochondrial genome revealed the existence of 81 nucleotide mutations at 79 sites. The neighbour-joining tree of Kimura’s distance matrix has revealed the presence of six main clusters, of which H and U are the most common. The data obtained are also compared with several European and Turkic Central Asian populations.
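
    The abstract refers to "Kimura's distance matrix" without giving the formula. Assuming this means Kimura's two-parameter model, the distance between two aligned sequences is d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P is the proportion of sites differing by a transition and Q the proportion differing by a transversion. The Python sketch below computes this distance for one pair; a full matrix would apply it to every pair before neighbour-joining. The function name is hypothetical, and sites with gaps or ambiguity codes are skipped.

```python
import math

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    valid = PURINES | PYRIMIDINES
    sites = ti = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in valid or b not in valid:
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ti += 1
        else:
            tv += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p, q = ti / sites, tv / sites
    arg = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if arg <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(arg)
```

    For small divergences the corrected distance is close to the raw proportion of differing sites; one transition in ten sites gives about 0.112 rather than 0.1.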

  9. Analyzing large-scale DNA Sequences on Multi-core Architectures

    OpenAIRE

    Memeti, Suejb; Pllana, Sabri

    2015-01-01

    Rapid analysis of DNA sequences is important in preventing the evolution of different viruses and bacteria during an early phase, early diagnosis of genetic predispositions to certain diseases (cancer, cardiovascular diseases), and in DNA forensics. However, real-world DNA sequences may comprise several Gigabytes and the process of DNA analysis demands adequate computational resources to be completed within a reasonable time. In this paper we present a scalable approach for parallel DNA analy...

  10. Mitochondrial DNA sequences in single hairs from a southern African population.

    OpenAIRE

    Vigilant, L.; Pennington, R; Harpending, H; Kocher, T.D.; Wilson, A C

    1989-01-01

    Hypervariable parts of mitochondrial DNA (mtDNA) were amplified enzymatically and sequenced directly by using genomic DNA from single plucked human hairs. This method has been applied to study mtDNA sequence variation among 15 members of the !Kung population. A genealogical tree relating these aboriginal, Khoisan-speaking southern Africans to 68 other humans and to one chimpanzee has the deepest branches occurring amongst the !Kung, a result consistent with an African origin of human mtDNA. F...

  11. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    OpenAIRE

    Momchilo Vuyisich; Ayesha Arefin; Karen Davenport; Shihai Feng; Cheryl Gleasner; Kim McMurry; Beverly Parson-Quintana; Jennifer Price; Matthew Scholz; Patrick Chain

    2014-01-01

    Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the util...

  12. Complete genome sequence of chloroplast DNA (cpDNA) of Chlorella sorokiniana.

    Science.gov (United States)

    Orsini, Massimiliano; Cusano, Roberto; Costelli, Cristina; Malavasi, Veronica; Concas, Alessandro; Angius, Andrea; Cao, Giacomo

    2016-01-01

    The complete chloroplast genome sequence of Chlorella sorokiniana strain (SAG 111-8 k) is presented in this study. The genome consists of circular chromosomes of 109,811 bp, which encode a total of 109 genes, including 74 proteins, 3 rRNAs and 31 tRNAs. Moreover, introns are not detected and all genes are present in single copy. The overall AT content of the C. sorokiniana cpDNA is 65.9%, the coding sequence is 59.1% and a large inverted repeat (IR) is not observed.

  13. Statistical methods for detecting periodic fragments in DNA sequence data

    Directory of Open Access Journals (Sweden)

    Ying Hua

    2011-04-01

    Background: Period 10 dinucleotides are structurally and functionally validated factors that influence the ability of DNA to form nucleosomes, histone core octamers. Robust identification of periodic signals in DNA sequences is therefore required to understand nucleosome organisation in genomes. While various techniques for identifying periodic components in genomic sequences have been proposed or adopted, the requirements for such techniques have not been considered in detail and confirmatory testing for a priori specified periods has not been developed. Results: We compared the estimation accuracy and suitability for confirmatory testing of autocorrelation, discrete Fourier transform (DFT), integer period discrete Fourier transform (IPDFT) and a previously proposed Hybrid measure. A number of different statistical significance procedures were evaluated but a blockwise bootstrap proved superior. When applied to synthetic data whose period-10 signal had been eroded, or for which the signal was approximately period-10, the Hybrid technique exhibited superior properties during exploratory period estimation. In contrast, confirmatory testing using the blockwise bootstrap procedure identified IPDFT as having the greatest statistical power. These properties were validated on yeast sequences defined from a ChIP-chip study where the Hybrid metric confirmed the expected dominance of period-10 in nucleosome associated DNA but IPDFT identified more significant occurrences of period-10. Application to the whole genomes of yeast and mouse identified ~ 21% and ~ 19% respectively of these genomes as spanned by period-10 nucleosome positioning sequences (NPS). Conclusions: For estimating the dominant period, we find the Hybrid period estimation method empirically to be the most effective for both eroded and approximate periodicity. The blockwise bootstrap was found to be effective as a significance measure, performing particularly well in the problem of
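    As a rough illustration of what testing a sequence for an a priori integer period can look like, the sketch below scores the Fourier component of a dinucleotide indicator signal at a chosen period. It is not the authors' IPDFT, Hybrid measure or blockwise bootstrap; the function names and the choice of AA/TT/TA as the marked dinucleotides are assumptions made for the example.

```python
import cmath


def dinucleotide_indicator(seq, dinucs=("AA", "TT", "TA")):
    """Return 1 at each position where one of `dinucs` starts, else 0.

    The default dinucleotide set is an assumption made for this sketch.
    """
    seq = seq.upper()
    return [1 if seq[i:i + 2] in dinucs else 0 for i in range(len(seq) - 1)]


def period_power(signal, period):
    """Normalised squared magnitude of the Fourier component at `period`.

    The signal is mean-centred first so that a constant offset does not
    contribute to the score.
    """
    n = len(signal)
    mean = sum(signal) / n
    component = sum(
        (value - mean) * cmath.exp(-2j * cmath.pi * k / period)
        for k, value in enumerate(signal)
    )
    return abs(component) ** 2 / n


# Synthetic sequence with an AA dinucleotide every 10 bases.
synthetic = "AAGCGCGCGC" * 30
signal = dinucleotide_indicator(synthetic)
scores = {p: period_power(signal, p) for p in (7, 10, 11)}
```

    On this synthetic input the score at period 10 is far larger than at periods 7 or 11. A sharp impulse train also scores highly at the harmonics of its period (here 5 and 2), so a raw maximum over all candidate periods should be read with care, and judging significance would need a resampling step that this sketch leaves out.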

  14. Determination of cDNA and genomic DNA sequences of hevamine, a chitinase from the rubber tree Hevea brasiliensis

    NARCIS (Netherlands)

    Bokma, E; Spiering, M; Chow, KS; Mulder, PPMFA; Subroto, T; Beintema, JJ

    2001-01-01

    Hevamine is a chitinase from the rubber tree Hevea brasiliensis and belongs to the family 18 glycosyl hydrolases. This paper describes the cloning of hevamine DNA and cDNA sequences. Hevamine contains a signal peptide at the N-terminus and a putative vacuolar targeting sequence at the C-terminus whi

  15. Assessing the fidelity of ancient DNA sequences amplified from nuclear genes

    DEFF Research Database (Denmark)

    Binladen, Jonas; Wiuf, Carsten Henrik; Gilbert, M. Thomas P.;

    2006-01-01

    in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from......DNA and nuDNA despite great differences in cellular copy numbers. For both mtDNA and nuDNA, we find significant positive correlations between total sequence heterogeneity and the rates of type 1 transitions (adenine --> guanine and thymine --> cytosine) and type 2 transitions (cytosine --> thymine and guanine...

  16. Construction of a Sequencing Library from Circulating Cell-Free DNA.

    Science.gov (United States)

    Fang, Nan; Löffert, Dirk; Akinci-Tolun, Rumeysa; Heitz, Katja; Wolf, Alexander

    2016-01-01

    Circulating DNA is cell-free DNA (cfDNA) in serum or plasma that can be used for non-invasive prenatal testing, as well as cancer diagnosis, prognosis, and stratification. High-throughput sequence analysis of the cfDNA with next-generation sequencing technologies has proven to be a highly sensitive and specific method in detecting and characterizing mutations in cancer and other diseases, as well as aneuploidy during pregnancy. This unit describes detailed procedures to extract circulating cfDNA from human serum and plasma and generate sequencing libraries from a wide concentration range of circulating DNA. © 2016 by John Wiley & Sons, Inc. PMID:27038390

  17. Long-range correlations and charge transport properties of DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Liu Xiaoliang [College of Physical Science and Technology and College of Metallurgical Science and Engineering, Central South University, Changsha 410083 (China); Ren, Yi [College of Physical Science and Technology and College of Metallurgical Science and Engineering, Central South University, Changsha 410083 (China); Xie, Qiong-tao [Key Laboratory of Low Dimensional Quantum Structures and Quantum Control of Ministry of Education (Hunan Normal University), Changsha 410081 (China); Deng, Chao-sheng; Xu, Hui [College of Physical Science and Technology and College of Metallurgical Science and Engineering, Central South University, Changsha 410083 (China)

    2010-04-26

    By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on the above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that lambda-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5... sequence displays a transition from correlation behavior to anticorrelation behavior. The resonant peaks of the transmission coefficient in genomic sequences can survive in longer sequence length than in random sequences but in shorter sequence length than in quasiperiodic sequences. It is shown that the genomic sequences have long-range correlation properties to some extent but the correlations are not strong enough to maintain the scale invariance properties.
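    For readers unfamiliar with rescaled range analysis, a minimal sketch of the classical R/S estimate of a Hurst exponent follows. It is not taken from the paper: the purine/pyrimidine mapping to +1/-1, the window sizes and all function names are assumptions chosen for illustration.

```python
import math


def purine_walk(seq):
    """Map purines (A, G) to +1 and pyrimidines (C, T) to -1.

    Other symbols are skipped. The mapping is an assumption of this sketch.
    """
    steps = {"A": 1, "G": 1, "C": -1, "T": -1}
    return [steps[base] for base in seq.upper() if base in steps]


def rescaled_range(series, n):
    """Mean R/S over non-overlapping windows of length n."""
    ratios = []
    for start in range(0, len(series) - n + 1, n):
        window = series[start:start + n]
        mean = sum(window) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
        if std == 0:
            continue  # constant window: R/S is undefined, so skip it
        cum = lo = hi = 0.0
        for v in window:
            cum += v - mean  # cumulative deviation from the window mean
            lo, hi = min(lo, cum), max(hi, cum)
        ratios.append((hi - lo) / std)
    if not ratios:
        raise ValueError("no window with non-zero variance")
    return sum(ratios) / len(ratios)


def hurst_exponent(series, window_sizes=(8, 16, 32, 64)):
    """Least-squares slope of log(R/S) against log(window size)."""
    log_n = [math.log(n) for n in window_sizes]
    log_rs = [math.log(rescaled_range(series, n)) for n in window_sizes]
    mean_n = sum(log_n) / len(log_n)
    mean_rs = sum(log_rs) / len(log_rs)
    num = sum((a - mean_n) * (b - mean_rs) for a, b in zip(log_n, log_rs))
    den = sum((a - mean_n) ** 2 for a in log_n)
    return num / den
```

    Under this convention an exponent above 0.5 indicates persistent (correlated) behavior and one below 0.5 indicates anticorrelation. A strictly alternating purine/pyrimidine sequence such as "AC" repeated many times gives an exponent of 0 with this estimator, the extreme anticorrelated case.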

  18. Anti-DNA antibodies: Sequencing, cloning, and expression

    Energy Technology Data Exchange (ETDEWEB)

    Barry, M.M.

    1992-01-01

    To gain some insight into the mechanism of systemic lupus erythematosus, and the interactions involved in proteins binding to DNA, four anti-DNA antibodies have been investigated. Two of the antibodies, Hed 10 and Jel 242, have previously been prepared from female NZB/NZW mice which develop an autoimmune disease resembling human SLE. The remaining two antibodies, Jel 72 and Jel 318, have previously been produced via immunization of C57BL/6 mice. The isotypes of the four antibodies investigated in this thesis were determined by an enzyme-linked immunosorbent assay. All four antibodies contained κ light chains and γ2a heavy chains except Jel 318 which contains a γ2b heavy chain. The complete variable regions of the heavy and light chains of these four antibodies were sequenced from their respective mRNAs. The gene segments and variable gene families expressed in each antibody were identified. Analysis of the genes used in the autoimmune anti-DNA antibodies and those produced by immunization indicated no obvious differences to account for their different origins. Examination of the amino acid residues present in the complementarity-determining regions of these four antibodies indicates a preference for aromatic amino acids. Jel 72 and Jel 242 contain three arginine residues in the third complementarity-determining region. A single-chain Fv and the variable region of the heavy chain of Hed 10 were expressed in Escherichia coli. Expression resulted in the production of a 26,000 M_r protein and a 15,000 M_r protein. An immunoblot indicated that the 26,000 M_r protein was the Fv for Hed 10, while the 15,000 M_r protein was shown to bind poly (dT). The contribution of the heavy chain to DNA binding was assessed.

  19. Phylogeny of Pelargonium (Geraniaceae) based on DNA sequences from three genomes

    NARCIS (Netherlands)

    Bakker, F.T.; Culham, A.; Hettiarachi, P.; Touloumendidou, T.; Gibby, M.

    2004-01-01

    Phylogenetic hypotheses for the largely South African genus Pelargonium L'Hér. (Geraniaceae) were derived based on DNA sequence data from nuclear, chloroplast and mitochondrial encoded regions. The datasets were unequally represented and comprised cpDNA trnL-F sequences for 152 taxa, nrDNA ITS seque

  20. The DNA sequence of the human X chromosome.

    Science.gov (United States)

    Ross, Mark T; Grafham, Darren V; Coffey, Alison J; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R; Burrows, Christine; Bird, Christine P; Frankish, Adam; Lovell, Frances L; Howe, Kevin L; Ashurst, Jennifer L; Fulton, Robert S; Sudbrak, Ralf; Wen, Gaiping; Jones, Matthew C; Hurles, Matthew E; Andrews, T Daniel; Scott, Carol E; Searle, Stephen; Ramser, Juliane; Whittaker, Adam; Deadman, Rebecca; Carter, Nigel P; Hunt, Sarah E; Chen, Rui; Cree, Andrew; Gunaratne, Preethi; Havlak, Paul; Hodgson, Anne; Metzker, Michael L; Richards, Stephen; Scott, Graham; Steffen, David; Sodergren, Erica; Wheeler, David A; Worley, Kim C; Ainscough, Rachael; Ambrose, Kerrie D; Ansari-Lari, M Ali; Aradhya, Swaroop; Ashwell, Robert I S; Babbage, Anne K; Bagguley, Claire L; Ballabio, Andrea; Banerjee, Ruby; Barker, Gary E; Barlow, Karen F; Barrett, Ian P; Bates, Karen N; Beare, David M; Beasley, Helen; Beasley, Oliver; Beck, Alfred; Bethel, Graeme; Blechschmidt, Karin; Brady, Nicola; Bray-Allen, Sarah; Bridgeman, Anne M; Brown, Andrew J; Brown, Mary J; Bonnin, David; Bruford, Elspeth A; Buhay, Christian; Burch, Paula; Burford, Deborah; Burgess, Joanne; Burrill, Wayne; Burton, John; Bye, Jackie M; Carder, Carol; Carrel, Laura; Chako, Joseph; Chapman, Joanne C; Chavez, Dean; Chen, Ellson; Chen, Guan; Chen, Yuan; Chen, Zhijian; Chinault, Craig; Ciccodicola, Alfredo; Clark, Sue Y; Clarke, Graham; Clee, Chris M; Clegg, Sheila; Clerc-Blankenburg, Kerstin; Clifford, Karen; Cobley, Vicky; Cole, Charlotte G; Conquer, Jen S; Corby, Nicole; Connor, Richard E; David, Robert; Davies, Joy; Davis, Clay; Davis, John; Delgado, Oliver; Deshazo, Denise; Dhami, Pawandeep; Ding, Yan; Dinh, Huyen; Dodsworth, Steve; Draper, Heather; Dugan-Rocha, Shannon; Dunham, Andrew; Dunn, Matthew; Durbin, K James; Dutta, Ireena; Eades, Tamsin; Ellwood, Matthew; Emery-Cohen, Alexandra; Errington, Helen; Evans, Kathryn L; Faulkner, Louisa; Francis, Fiona; Frankland, John; Fraser, Audrey E; Galgoczy, Petra; Gilbert, James; Gill, Rachel; Glöckner, Gernot; Gregory, Simon G; Gribble, Susan; Griffiths, Coline; Grocock, Russell; Gu, Yanghong; Gwilliam, Rhian; Hamilton, Cerissa; Hart, Elizabeth A; Hawes, Alicia; Heath, Paul D; Heitmann, Katja; Hennig, Steffen; Hernandez, Judith; Hinzmann, Bernd; Ho, Sarah; Hoffs, Michael; Howden, Phillip J; Huckle, Elizabeth J; Hume, Jennifer; Hunt, Paul J; Hunt, Adrienne R; Isherwood, Judith; Jacob, Leni; Johnson, David; Jones, Sally; de Jong, Pieter J; Joseph, Shirin S; Keenan, Stephen; Kelly, Susan; Kershaw, Joanne K; Khan, Ziad; Kioschis, Petra; Klages, Sven; Knights, Andrew J; Kosiura, Anna; Kovar-Smith, Christie; Laird, Gavin K; Langford, Cordelia; Lawlor, Stephanie; Leversha, Margaret; Lewis, Lora; Liu, Wen; Lloyd, Christine; Lloyd, David M; Loulseged, Hermela; Loveland, Jane E; Lovell, Jamieson D; Lozado, Ryan; Lu, Jing; Lyne, Rachael; Ma, Jie; Maheshwari, Manjula; Matthews, Lucy H; McDowall, Jennifer; McLaren, Stuart; McMurray, Amanda; Meidl, Patrick; Meitinger, Thomas; Milne, Sarah; Miner, George; Mistry, Shailesh L; Morgan, Margaret; Morris, Sidney; Müller, Ines; Mullikin, James C; Nguyen, Ngoc; Nordsiek, Gabriele; Nyakatura, Gerald; O'Dell, Christopher N; Okwuonu, Geoffery; Palmer, Sophie; Pandian, Richard; Parker, David; Parrish, Julia; Pasternak, Shiran; Patel, Dina; Pearce, Alex V; Pearson, Danita M; Pelan, Sarah E; Perez, Lesette; Porter, Keith M; Ramsey, Yvonne; Reichwald, Kathrin; Rhodes, Susan; Ridler, Kerry A; Schlessinger, David; Schueler, Mary G; Sehra, Harminder K; Shaw-Smith, Charles; Shen, Hua; Sheridan, Elizabeth M; Shownkeen, Ratna; Skuce, Carl D; Smith, Michelle L; Sotheran, Elizabeth C; Steingruber, Helen E; Steward, Charles A; Storey, Roy; Swann, R Mark; Swarbreck, David; Tabor, Paul E; Taudien, Stefan; Taylor, Tineace; Teague, Brian; Thomas, Karen; Thorpe, Andrea; Timms, Kirsten; Tracey, Alan; Trevanion, Steve; Tromans, Anthony C; d'Urso, Michele; Verduzco, Daniel; Villasana, Donna; Waldron, Lenee; Wall, Melanie; Wang, Qiaoyan; Warren, James; Warry, Georgina L; Wei, Xuehong; West, Anthony; Whitehead, Siobhan L; Whiteley, Mathew N; Wilkinson, Jane E; Willey, David L; Williams, Gabrielle; Williams, Leanne; Williamson, Angela; Williamson, Helen; Wilming, Laurens; Woodmansey, Rebecca L; Wray, Paul W; Yen, Jennifer; Zhang, Jingkun; Zhou, Jianling; Zoghbi, Huda; Zorilla, Sara; Buck, David; Reinhardt, Richard; Poustka, Annemarie; Rosenthal, André; Lehrach, Hans; Meindl, Alfons; Minx, Patrick J; Hillier, Ladeana W; Willard, Huntington F; Wilson, Richard K; Waterston, Robert H; Rice, Catherine M; Vaudin, Mark; Coulson, Alan; Nelson, David L; Weinstock, George; Sulston, John E; Durbin, Richard; Hubbard, Tim; Gibbs, Richard A; Beck, Stephan; Rogers, Jane; Bentley, David R

    2005-03-17

    The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.

  1. The evolution processes of DNA sequences, languages and carols

    Science.gov (United States)

    Hauck, Jürgen; Henkel, Dorothea; Mika, Klaus

    2001-04-01

    The sequences of bases A, T, C and G of about 100 enolase, secA and cytochrome DNA were analyzed for attractive or repulsive interactions by the numbers T1, T2, T3 of nearest, next-nearest and third neighbor bases of the same kind and the concentration r = other bases/analyzed base. The area of possible T1, T2 values is limited by the linear borders T2 = 2T1 - 2, T2 = 0 or T1 = 0 for clustering, attractive or repulsive interactions and the border T2 = -2T1 + 2(2 - r) for a variation from repulsive to attractive interactions at r ≤ 2. Clustering is preferred by most bases in sequences of enolases and secA's. Major deviations with repulsive interactions of some bases are observed for archaea bacteria in secA and for highly developed animals and the human species in enolase sequences. The borders of the structure map for enthalpy stabilized structures with maximum interactions are approached in few cases. Most letters of the natural languages and some music notes are at the borders of the structure map.

  2. Complete Genome Sequence of Pelosinus sp. Strain UFO1 Assembled Using Single-Molecule Real-Time DNA Sequencing Technology

    OpenAIRE

    Brown, Steven D.; Utturkar, Sagar M.; Magnuson, Timothy S.; Ray, Allison E.; Poole, Farris L.; Lancaster, W Andrew; Thorgersen, Michael P.; Adams, Michael W. W.; Elias, Dwayne A.

    2014-01-01

    Pelosinus species can reduce metals such as Fe(III), U(VI), and Cr(VI) and have been isolated from diverse geographical regions. Five draft genome sequences have been published. We report the complete genome sequence for Pelosinus sp. strain UFO1 using only PacBio DNA sequence data and without manual finishing.

  3. Structural analysis of DNA sequence: evidence for lateral gene transfer in Thermotoga maritima

    DEFF Research Database (Denmark)

    Worning, Peder; Jensen, Lars Juhl; Nelson, K. E.;

    2000-01-01

    The recently published complete DNA sequence of the bacterium Thermotoga maritima provides evidence, based on protein sequence conservation, for lateral gene transfer between Archaea and Bacteria. We introduce a new method of periodicity analysis of DNA sequences, based on structural parameters, ...

  4. Challenges in DNA motion control and sequence readout using nanopore devices

    International Nuclear Information System (INIS)

    Nanopores are being hailed as a potential next-generation DNA sequencer that could provide cheap, high-throughput DNA analysis. In this review we present a detailed summary of the various sensing techniques being investigated for use in DNA sequencing and mapping applications. A crucial impasse to the success of nanopores as a reliable DNA analysis tool is the fast and stochastic nature of DNA translocation. We discuss the incorporation of biological motors to step DNA through a pore base-by-base, as well as the many experimental modifications attempted for the purpose of slowing and controlling DNA transport. (paper)

  5. Anonymity and Historical-Anonymity in Location-Based Services

    Science.gov (United States)

    Bettini, Claudio; Mascetti, Sergio; Wang, X. Sean; Freni, Dario; Jajodia, Sushil

    The problem of protecting user’s privacy in Location-Based Services (LBS) has been extensively studied recently and several defense techniques have been proposed. In this contribution, we first present a categorization of privacy attacks and related defenses. Then, we consider the class of defense techniques that aim at providing privacy through anonymity and in particular algorithms achieving “historical k-anonymity” in the case of the adversary obtaining a trace of requests recognized as being issued by the same (anonymous) user. Finally, we investigate the issues involved in the experimental evaluation of anonymity based defense techniques; we show that user movement simulations based on mostly random movements can lead to overestimating the privacy protection in some cases and to overprotective techniques in other cases. The above results are obtained by comparison to a more realistic simulation with an agent-based simulator, considering a specific deployment scenario.

  6. mapDamage: testing for damage patterns in ancient DNA sequences

    DEFF Research Database (Denmark)

    Ginolhac, Aurelien; Rasmussen, Morten; Gilbert, M Thomas P;

    2011-01-01

    Ancient DNA extracts consist of a mixture of contaminant DNA molecules, most often originating from environmental microbes, and endogenous fragments exhibiting substantial levels of DNA damage. The latter introduce specific nucleotide misincorporations and DNA fragmentation signatures in sequenci...... of the SAMtools suite and R environment and has been validated on both GNU/Linux and MacOSX operating systems....

  7. Analysis of mitochondrial DNA sequences in patients with isolated or combined oxidative phosphorylation system deficiency.

    NARCIS (Netherlands)

    Hinttala, R.; Smeets, R.; Moilanen, J.S.; Ugalde, C.; Uusimaa, J.; Smeitink, J.A.M.; Majamaa, K.

    2006-01-01

    BACKGROUND: Enzyme deficiencies of the oxidative phosphorylation (OXPHOS) system may be caused by mutations in the mitochondrial DNA (mtDNA) or in the nuclear DNA. OBJECTIVE: To analyse the sequences of the mtDNA coding region in 25 patients with OXPHOS system deficiency to identify the underlying g

  8. Anonymous online purchases with exhaustive operational security

    OpenAIRE

    Van Mieghem, Vincent; Pouwelse, Johan

    2015-01-01

    This paper describes the process of remaining anonymous online and its concurrent operational security that has to be performed. It focusses particularly on remaining anonymous while purchasing online goods, resulting in anonymously bought items. Different aspects of the operational security process as well as anonymously funding with cryptocurrencies are described. Eventually it is shown how to anonymously purchase items and services from the hidden web, as well as the delivery. It is shown ...

  9. Unusual conformational effect exerted by Z-DNA upon its neighboring sequences.

    OpenAIRE

    Kohwi-Shigematsu, T; Manes, T; Kohwi, Y

    1987-01-01

    Supercoiled plasmid DNA harboring an insert of (dG-dC)16, a sequence known to form Z-DNA upon negative supercoiling, was reacted with chloroacetaldehyde. Chloroacetaldehyde, like bromoacetaldehyde, was found to be a specific probe for detecting unpaired DNA bases in supercoiled plasmid DNA. Under torsional stress (at bacterial superhelical density), chloroacetaldehyde reacted at multiple discrete regions within the neighboring sequences of the (dG-dC)16 insert. When the plasmid population was...

  10. True single-molecule DNA sequencing of a pleistocene horse bone

    DEFF Research Database (Denmark)

    Orlando, Ludovic Antoine Alexandre; Ginolhac, Aurélien; Raghavan, Maanasa;

    2011-01-01

    -preserved Pleistocene horse bone using the Helicos HeliScope and Illumina GAIIx platforms, respectively. We find that the percentage of endogenous DNA sequences derived from the horse is higher among the Helicos data than Illumina data. This result indicates that the molecular biology tools used to generate sequencing...... to the standard Helicos DNA template preparation protocol further increase the proportion of horse DNA for this sample by 3-fold. Comparison of Helicos-specific biases and sequence errors in modern DNA with those in ancient DNA also reveals extensive cytosine deamination damage at the 3' ends of ancient templates...

  11. Sequence selective naked-eye detection of DNA harnessing extension of oligonucleotide-modified nucleotides.

    Science.gov (United States)

    Verga, Daniela; Welter, Moritz; Marx, Andreas

    2016-02-01

    DNA polymerases can efficiently and sequence selectively incorporate oligonucleotide (ODN)-modified nucleotides and the incorporated oligonucleotide strand can be employed as primer in rolling circle amplification (RCA). The effective amplification of the DNA primer by Φ29 DNA polymerase allows the sequence-selective hybridisation of the amplified strand with a G-quadruplex DNA sequence that has horseradish peroxidase-like activity. Based on these findings we develop a system that allows DNA detection with single-base resolution by naked eye.

  12. How to Bootstrap Anonymous Communication

    DEFF Research Database (Denmark)

    Jakobsen, Sune K.; Orlandi, Claudio

    2015-01-01

    formal study in this direction. To solve this problem, we introduce the concept of anonymous steganography: think of a leaker Lea who wants to leak a large document to Joe the journalist. Using anonymous steganography Lea can embed this document in innocent looking communication on some popular website...... (such as cat videos on YouTube or funny memes on 9GAG). Then Lea provides Joe with a short key k which, when applied to the entire website, recovers the document while hiding the identity of Lea among the large number of users of the website. Our contributions include: - Introducing and formally defining...

  13. How to Bootstrap Anonymous Communication

    DEFF Research Database (Denmark)

    Jakobsen, Sune K.; Orlandi, Claudio

    2015-01-01

    formal study in this direction. To solve this problem, we introduce the concept of anonymous steganography: think of a leaker Lea who wants to leak a large document to Joe the journalist. Using anonymous steganography Lea can embed this document in innocent looking communication on some popular website...... (such as cat videos on YouTube or funny memes on 9GAG). Then Lea provides Joe with a short key $k$ which, when applied to the entire website, recovers the document while hiding the identity of Lea among the large number of users of the website. Our contributions include: - Introducing and formally...

  14. Data Retention and Anonymity Services

    Science.gov (United States)

    Berthold, Stefan; Böhme, Rainer; Köpsell, Stefan

    The recently introduced legislation on data retention to aid prosecuting cyber-related crime in Europe also affects the achievable security of systems for anonymous communication on the Internet. We argue that data retention requires a review of existing security evaluations against a new class of realistic adversary models. In particular, we present theoretical results and first empirical evidence for intersection attacks by law enforcement authorities. The reference architecture for our study is the anonymity service AN.ON, from which we also collect empirical data. Our adversary model reflects an interpretation of the current implementation of the EC Directive on Data Retention in Germany.

  15. Comparisons of ape and human sequences that regulate mitochondrial DNA transcription and D-loop DNA synthesis.

    OpenAIRE

    Foran, D R; Hixson, J E; Brown, W. M.

    1988-01-01

    The mitochondrial DNA (mtDNA) control regions for common chimpanzee, pygmy chimpanzee and gorilla were sequenced and the lengths and termini of their D-loop DNAs characterized. In these and all other species for which there are data, 5' termini map to sequences that contain the trinucleotide YAY. 3' termini are 25-51 nucleotides downstream from a sequence that is moderately conserved among vertebrates. Substitutions were greater than 1.5 times more frequent in the control region than in regi...

  16. RevTrans: multiple alignment of coding DNA from aligned amino acid sequences

    DEFF Research Database (Denmark)

    Wernersson, Rasmus; Pedersen, Anders Gorm

    2003-01-01

    proteins. It is therefore preferable to align coding DNA at the amino acid level and it is for this purpose we have constructed the program RevTrans. RevTrans constructs a multiple DNA alignment by: (i) translating the DNA; (ii) aligning the resulting peptide sequences; and (iii) building a multiple DNA...... alignment by 'reverse translation' of the aligned protein sequences. In the resulting DNA alignment, gaps occur in groups of three corresponding to entire codons, and analogous codon positions are therefore always lined up. These features are useful when constructing multiple DNA alignments for phylogenetic...
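    The three steps named in this abstract can be sketched as follows. This is an illustration of the general idea rather than the RevTrans implementation: step (ii), the peptide alignment, is assumed to come from an external aligner, only the standard genetic code is handled, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Step (i): translate coding DNA, using 'X' for unrecognised codons."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def back_align(dna, aligned_peptide, gap="-"):
    """Step (iii): thread codons onto an aligned peptide.

    Each gap in the peptide alignment becomes three gap characters, so
    gaps in the DNA alignment always span whole codons.
    """
    if translate(dna) != aligned_peptide.replace(gap, ""):
        raise ValueError("aligned peptide does not match the translated DNA")
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(gap * 3 if aa == gap else next(codons) for aa in aligned_peptide)


# Step (ii) is assumed to have produced these aligned peptides.
aligned_dna = [
    back_align("ATGAAATTT", "MKF"),
    back_align("ATGTTC", "M-F"),
]
```

    Here `aligned_dna` holds "ATGAAATTT" and "ATG---TTC": the single gap in the second peptide has become one three-base gap, so analogous codon positions stay lined up.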

  17. Organization and Evolution of Primate Centromeric DNA from Whole-Genome Shotgun Sequence Data

    OpenAIRE

    Alkan, Can; Eichler, Evan E.; Ventura, Mario; Archidiacono, Nicoletta; Rocchi, Mariano; Sahinalp, S Cenk

    2007-01-01

    Centromeric DNA has been described as the last frontier of genomic sequencing; such regions are typically poorly assembled during the whole-genome shotgun sequence assembly process due to their repetitive complexity. This paper develops a computational algorithm to systematically extract data regarding primate centromeric DNA structure and organization from that ∼5% of sequence that is not included as part of standard genome sequence assemblies. Using this computational approac...

  18. Sequencing of megabase plus DNA by hybridization: Method development ENT. Final technical progress report

    Energy Technology Data Exchange (ETDEWEB)

    Crkvenjakov, R.; Drmanac, R.

    1991-01-31

    Sequencing by hybridization (SBH) is the only sequencing method based on the experimental determination of the content of oligonucleotide sequences. The data acquisition relies on the natural process of base pairing. It is possible to determine the content of complementary oligosequences in the target DNA by the process of hybridization with oligonucleotide probes of known sequences.
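    A small sketch may help make "oligonucleotide content" concrete. The code below lists the k-mers present in a target strand and picks out which probes of known sequence would find a perfectly complementary site on it. It models ideal base pairing only, with no mismatches or hybridization kinetics, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def oligo_content(target, k):
    """Set of all length-k oligonucleotides occurring in the target strand."""
    target = target.upper()
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def hybridizing_probes(target, probes):
    """Probes whose reverse complement occurs in the target strand.

    Idealised model: a probe counts as positive only on a perfect match.
    """
    target = target.upper()
    return [p for p in probes if reverse_complement(p) in target]


content = oligo_content("ATGCGT", 3)
positives = hybridizing_probes("ATGCGT", ["CAT", "AAA", "ACG"])
```

    For the target ATGCGT the 3-mer content is ATG, TGC, GCG and CGT, and the probes CAT and ACG are positive because their reverse complements (ATG and CGT) occur in the target. Reconstructing the full sequence from such a set is a separate assembly step that this sketch does not attempt.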

  19. [Sequencing of low-molecular-weight DNA in blood plasma of irradiated rats].

    Science.gov (United States)

    Vasilieva, I N; Bespalov, V G; Zinkin, V N; Podgornaya, O I

    2015-01-01

    Extracellular low-molecular-weight DNA in blood of irradiated rats was sequenced for the first time. The screening of sequences in the DDBJ database displayed homology of various parts of the rodent genome. Sequences of low-molecular-weight DNA in rat plasma are enriched with G/C pairs and long interspersed elements relative to the rat genome. DNA sequences in blood of rats irradiated at the doses of 8 and 100 Gy have marked distinctions. Sequencing data for extracellular DNA from healthy humans and from humans with pathology were analyzed. DNA sequences of irradiated rats differ from the human ones by a wealth of long interspersed elements. This new knowledge lays the foundation for development of minimally invasive technologies of diagnosing the probability of pathology and controlling the adaptive resources of people in extreme environments. PMID:25958466

  20. One-way sequencing of multiple amplicons from tandem repetitive mitochondrial DNA control region.

    Science.gov (United States)

    Xu, Jiawu; Fonseca, Dina M

    2011-10-01

    Repetitive DNA sequences not only exist abundantly in eukaryotic nuclear genomes, but also occur as tandem repeats in many animal mitochondrial DNA (mtDNA) control regions. Due to concerted evolution, these repetitive sequences are highly similar or even identical within a genome. When long repetitive regions are the targets of amplification for the purpose of sequencing, multiple amplicons may result if one primer has to be located inside the repeats. Here, we show that, without separating these amplicons by gel purification or cloning, directly sequencing the mitochondrial repeats with the primer outside the repetitive region is feasible and efficient. We exemplify it by sequencing the mtDNA control region of the mosquito Aedes albopictus, which harbors typical large tandem DNA repeats. This one-way sequencing strategy is optimal for population surveys.

  1. DUC-Curve, a highly compact 2D graphical representation of DNA sequences and its application in sequence alignment

    Science.gov (United States)

    Li, Yushuang; Liu, Qian; Zheng, Xiaoqi

    2016-08-01

    A highly compact and simple 2D graphical representation of DNA sequences, named DUC-Curve, is constructed through mapping four nucleotides to a unit circle with a cyclic order. DUC-Curve could directly detect nucleotide, di-nucleotide compositions and microsatellite structure from DNA sequences. Moreover, it also could be used for DNA sequence alignment. Taking geometric center vectors of DUC-Curves as sequence descriptor, we perform similarity analysis on the first exons of β-globin genes of 11 species, oncogene TP53 of 27 species and twenty-four Influenza A viruses, respectively. The obtained reasonable results illustrate that the proposed method is very effective in sequence comparison problems, and will at least play a complementary role in classification and clustering problems.

  2. A 28,000 years old Cro-Magnon mtDNA sequence differs from all potentially contaminating modern sequences.

    Directory of Open Access Journals (Sweden)

    David Caramelli

    BACKGROUND: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. METHODOLOGY/PRINCIPAL FINDINGS: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23), and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. CONCLUSIONS/SIGNIFICANCE: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans.

  3. Sequences Characterization of Microsatellite DNA Sequences in Pacific Abalone (Haliotis discus hannai)

    Institute of Scientific and Technical Information of China (English)

    LI Qi; Kijima Akihiro

    2007-01-01

    The microsatellite-enriched library was constructed using the magnetic bead hybridization selection method, and the microsatellite DNA sequences were analyzed in Pacific abalone Haliotis discus hannai. Three hundred and fifty white colonies were screened using a PCR-based technique, and 84 clones were identified as potentially containing a microsatellite repeat motif. The 84 clones were sequenced, and 42 microsatellites and 4 minisatellites with a minimum of five repeats were found (13.1% of white colonies screened). Besides the CA motif contained in the oligoprobe, we also found 16 other types of microsatellite repeats, including a dinucleotide repeat, two tetranucleotide repeats, twelve pentanucleotide repeats and a hexanucleotide repeat. According to Weber (1990), the microsatellite sequences obtained could be categorized structurally into perfect repeats (73.3%), imperfect repeats (13.3%), and compound repeats (13.4%). Among the microsatellite repeats, relatively short arrays (< 20 repeats) were most abundant, accounting for 75.0%. The longest microsatellite had 48 repeats, and the average number of repeats was 13.4. The data on the composition and length distribution of microsatellites obtained in the present study can be useful for choosing the repeat motifs for microsatellite isolation in other abalone species.
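
    The "minimum of five repeats" criterion for perfect repeats can be sketched in Python with a back-referencing regular expression. This is only an illustration under stated assumptions, not the screening procedure used in the study: the function name, the 2 to 6 base motif range and the example sequence are hypothetical, and imperfect or compound repeats are not handled.

```python
import re


def find_microsatellites(seq, min_repeats=5, min_unit=2, max_unit=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier prefers the shortest motif, so (CA)n is
    reported as 'CA' rather than 'CACA'. Caveat: a long mononucleotide
    run such as AAAAAAAAAA would be reported as a repeat of 'AA'.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits


# Hypothetical clone insert: seven CA repeats between short flanks.
example = "GGT" + "CA" * 7 + "TTG"
print(find_microsatellites(example))
```

    Raising `min_repeats` or narrowing the motif range tightens the definition; the thresholds shown are illustrative defaults only.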

  4. Water Mediates Recognition of DNA Sequence via Ionic Current Blockade in a Biological Nanopore.

    Science.gov (United States)

    Bhattacharya, Swati; Yoo, Jejoong; Aksimentiev, Aleksei

    2016-04-26

    Electric field-driven translocation of DNA strands through biological nanopores has been shown to produce blockades of the nanopore ionic current that depend on the nucleotide composition of the strands. Coupling a biological nanopore MspA to a DNA processing enzyme has made DNA sequencing via measurement of ionic current blockades possible. Nevertheless, the physical mechanism enabling the DNA sequence readout has remained undetermined. Here, we report the results of all-atom molecular dynamics simulations that elucidated the physical mechanism of ionic current blockades in the biological nanopore MspA. We find that the amount of water displaced from the nanopore by the DNA strand determines the nanopore ionic current, whereas the steric and base-stacking properties of the DNA nucleotides determine the amount of water displaced. Unexpectedly, we find the effective force on DNA in MspA to undergo large fluctuations, which may produce insertion errors in the DNA sequence readout. PMID:27054820

  5. Collection and Extraction of Saliva DNA for Next Generation Sequencing

    OpenAIRE

    Goode, Michael R.; Cheong, Soo Yeon; Li, Ning; Ray, William C.; Bartlett, Christopher W

    2014-01-01

    DNA extraction from saliva can provide a readily available source of high molecular weight DNA, with little to no degradation/fragmentation. This protocol provides optimized parameters for saliva collection/storage and DNA extraction to be of sufficient quality and quantity for downstream DNA assays with high quality requirements.

  6. Movement Data Anonymity through Generalization

    Directory of Open Access Journals (Sweden)

    Anna Monreale

    2010-08-01

    Wireless networks and mobile devices, such as mobile phones and GPS receivers, sense and track the movements of people and vehicles, producing society-wide mobility databases. This is a challenging scenario for data analysis and mining. On the one hand, exciting opportunities arise out of discovering new knowledge about human mobile behavior, and thus fuel intelligent info-mobility applications. On the other hand, new privacy concerns arise when mobility data are published. The risk is particularly high for GPS trajectories, which represent movement of a very high precision and spatio-temporal resolution: the de-identification of such trajectories (i.e., forgetting the ID of their associated owners) is only a weak protection, as generally it is possible to re-identify a person by observing her routine movements. In this paper we propose a method for achieving true anonymity in a dataset of published trajectories, by defining a transformation of the original GPS trajectories based on spatial generalization and k-anonymity. The proposed method offers a formal data protection safeguard, quantified as a theoretical upper bound to the probability of re-identification. We conduct a thorough study on a real-life GPS trajectory dataset, and provide strong empirical evidence that the proposed anonymity techniques achieve the conflicting goals of data utility and data privacy. In practice, the achieved anonymity protection is much stronger than the theoretical worst case, while the quality of the cluster analysis on the trajectory data is preserved.
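
    The general idea of combining spatial generalization with k-anonymity can be sketched as follows. This toy Python example is not the transformation proposed in the paper: it simply snaps points to a square grid and suppresses any generalized trajectory shared by fewer than k inputs. The function names, the grid cell size and the sample coordinates are assumptions made for illustration.

```python
from collections import Counter


def generalize(trajectory, cell_size):
    """Map (x, y) points to grid cells and drop consecutive repeats."""
    cells = []
    for x, y in trajectory:
        cell = (int(x // cell_size), int(y // cell_size))
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return tuple(cells)


def k_anonymize(trajectories, cell_size, k):
    """Keep only generalized trajectories shared by at least k inputs."""
    generalized = [generalize(t, cell_size) for t in trajectories]
    counts = Counter(generalized)
    return [g for g in generalized if counts[g] >= k]


# Hypothetical trajectories in arbitrary planar units.
trajectories = [
    [(0.2, 0.3), (0.8, 0.9), (1.4, 0.5)],
    [(0.5, 0.5), (1.9, 0.1)],
    [(3.1, 3.2), (4.5, 3.0)],   # unique after generalization, suppressed
]
print(k_anonymize(trajectories, cell_size=1.0, k=2))
```

    A coarser grid makes more trajectories coincide, so fewer are suppressed but spatial detail is lost, which is the utility versus privacy trade-off the abstract refers to.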

  7. DNA Qualification Workflow for Next Generation Sequencing of Histopathological Samples

    OpenAIRE

    Michele Simbolo; Marisa Gottardi; Vincenzo Corbo; Matteo Fassan; Andrea Mafficini; Giorgio Malpeli; Lawlor, Rita T; Aldo Scarpa

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughpu...

  8. Analysis and location of a rice BAC clone containing telomeric DNA sequences

    Institute of Scientific and Technical Information of China (English)

    翟文学; 陈浩; 颜辉煌; 严长杰; 王国梁; 朱立煌

    1999-01-01

    BAC2, a rice BAC clone containing (TTTAGGG)n homologous sequences, was analyzed by Southern hybridization and DNA sequencing of its subclones. It was disclosed that there were many tandem repeated satellite DNA sequences, called TA352, as well as simple tandem repeats consisting of TTTAGGG or its variant within the BAC2 insert. A 0.8 kb (TTTAGGG)n-containing fragment in BAC2 was mapped in the telomere regions of at least 5 pairs of rice chromosomes by using fluorescence in situ hybridization (FISH). By RFLP analysis of low copy sequences the BAC2 clone was localized in one terminal region of chromosome 6. All the results strongly suggest that the telomeric DNA sequences of rice are TTTAGGG or its variant, and the linked satellite DNA TA352 sequences belong to telomere-associated sequences.

  9. Synergy of Two Assembly Languages in DNA Nanostructures: Self-Assembly of Sequence-Defined Polymers on DNA Cages.

    Science.gov (United States)

    Chidchob, Pongphak; Edwardson, Thomas G W; Serpell, Christopher J; Sleiman, Hanadi F

    2016-04-01

    DNA base-pairing is the central interaction in DNA assembly. However, this simple four-letter (A-T and G-C) language makes it difficult to create complex structures without using a large number of DNA strands of different sequences. Inspired by protein folding, we introduce hydrophobic interactions to expand the assembly language of DNA nanotechnology. To achieve this, DNA cages of different geometries are combined with sequence-defined polymers containing long alkyl and oligoethylene glycol repeat units. Anisotropic decoration of hydrophobic polymers on one face of the cage leads to hydrophobically driven formation of quantized aggregates of DNA cages, where polymer length determines the cage aggregation number. Hydrophobic chains decorated on both faces of the cage can undergo an intrascaffold "handshake" to generate DNA-micelle cages, which have increased structural stability and assembly cooperativity, and can encapsulate small molecules. The polymer sequence order can control the interaction between hydrophobic blocks, leading to unprecedented "doughnut-shaped" DNA cage-ring structures. We thus demonstrate that new structural and functional modes in DNA nanostructures can emerge from the synergy of two interactions, providing an attractive approach to develop protein-inspired assembly modules in DNA nanotechnology. PMID:26998893

  10. DNA supercoiling enables the Type IIS restriction enzyme BspMI to recognise the relative orientation of two DNA sequences

    OpenAIRE

    Kingston, Isabel J.; Gormley, Niall A.; Halford, Stephen E.

    2003-01-01

    Many proteins can sense the relative orientations of two sequences at distant locations in DNA: some require sites in inverted (head-to-head) orientation, others in repeat (head-to-tail) orientation. Like many restriction enzymes, the BspMI endonuclease binds two copies of its target site before cleaving DNA. Its target is an asymmetric sequence so two sites in repeat orientation differ from sites in inverted orientation. When tested against supercoiled plasmids with two sites 700 bp apart in...

  11. DNA sequence-dependent mechanics and protein-assisted bending in repressor-mediated loop formation

    International Nuclear Information System (INIS)

    As the chief informational molecule of life, DNA is subject to extensive physical manipulations. The energy required to deform double-helical DNA depends on sequence, and this mechanical code of DNA influences gene regulation, such as through nucleosome positioning. Here we examine the sequence-dependent flexibility of DNA in bacterial transcription factor-mediated looping, a context for which the role of sequence remains poorly understood. Using a suite of synthetic constructs repressed by the Lac repressor and two well-known sequences that show large flexibility differences in vitro, we make precise statistical mechanical predictions as to how DNA sequence influences loop formation and test these predictions using in vivo transcription and in vitro single-molecule assays. Surprisingly, sequence-dependent flexibility does not affect in vivo gene regulation. By theoretically and experimentally quantifying the relative contributions of sequence and the DNA-bending protein HU to DNA mechanical properties, we reveal that bending by HU dominates DNA mechanics and masks intrinsic sequence-dependent flexibility. Such a quantitative understanding of how mechanical regulatory information is encoded in the genome will be a key step towards a predictive understanding of gene regulation at single-base pair resolution. (paper)

  12. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution

    NARCIS (Netherlands)

    Falconer, Ester; Hills, Mark; Naumann, Ulrike; Poon, Steven S. S.; Chavez, Elizabeth A.; Sanders, Ashley D.; Zhao, Yongjun; Hirst, Martin; Lansdorp, Peter M.

    2012-01-01

    DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it po

  13. Therapeutic modulation of endogenous gene function by agents with designed DNA-sequence specificities

    NARCIS (Netherlands)

    Uil, T.G.; Haisma, H.J.; Rots, Marianne

    2003-01-01

    Designer molecules that can specifically target pre-determined DNA sequences provide a means to modulate endogenous gene function. Different classes of sequence-specific DNA-binding agents have been developed, including triplex-forming molecules, synthetic polyamides and designer zinc finger protein

  14. Methods for sequencing GC-rich and CCT repeat DNA templates

    Science.gov (United States)

    Robinson, Donna L.

    2007-02-20

    The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.

  15. The DNA sequence and biology of human chromosome 19

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, J; Gordon, L A; Olsen, A; Terry, A; Schmutz, J; Lamerdin, J; Hellsten, U; Goodstein, D; Couronne, O; Tran-Gyamfi, M

    2004-04-06

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high GC content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in Mendelian disorders, including familial hypercholesterolemia and insulin-resistant diabetes. Nearly one quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  16. [Patentability of DNA sequences: the debate remains open].

    Science.gov (United States)

    Martín Uranga, Amelia

    2013-01-01

    From the beginning of the discussion concerning the Directive on the legal protection of biotechnological inventions, the patentability of human genes was an issue that provoked debate among politicians, scientists, lawyers and civil society itself. Directive 98/44 tried to settle the matter by stating that, to support the patentability of human genes, it must be known what role they fulfill and which protein they encode, as an essential requirement for demonstrating industrial application. However, following the judgment of 13 June 2013 (Supreme Court of the United States of America in the case of Association for Molecular Pathology et al. versus Myriad Genetics Inc.), the debate on this issue has been reopened. There are several issues to be considered, taking into account that patents on DNA and gene sequences have been an important incentive for increasing interest in biotechnology applied to human health. On the other hand, there has been a paradigm shift in the R&D of biopharmaceutical companies, which has moved from an in-house research model to a model of open innovation, that is, collaboration between large corporations, biotech SMEs, and public and private research centers. This model of innovation has an impact on industrial property, and it will therefore be necessary to define clearly what each party brings to the relationship and how the parties are expected to share the results. All of this has the ultimate goal of giving patients access to the most innovative, safe and effective treatments and medications.

  17. Cloning and sequencing of Octopus dofleini hemocyanin cDNA: derived sequences of functional units Ode and Odf.

    OpenAIRE

    Lang, W H; van Holde, K E

    1991-01-01

    A number of additional cDNA clones coding for portions of the very large polypeptide chain of Octopus dofleini hemocyanin were isolated and sequenced. These data reveal two very similar coding sequences, which we have denoted "A-type" and "G-type." We have obtained complete A-type sequences coding for functional units Ode and Odf; consequently a total of three such unit sequences are now known from a single subunit of one molluscan hemocyanin. This presents the opportunity to make sequence co...

  18. Sequence analysis of a cDNA coding for a pancreatic precursor to somatostatin.

    OpenAIRE

    Taylor, W.L.; Collier, K J; Deschenes, R J; Weith, H L; Dixon, J. E.

    1981-01-01

    A synthetic oligonucleotide having the sequence d(T-T-C-C-A-G-A-A-G-A-A) deduced from the amino acid sequence Phe-Phe-Trp-Lys of somatostatin-14 was used to prime the synthesis of a cDNA from channel catfish (Ictalurus punctatus) pancreatic poly(A)-RNA. The major product of this reaction was a cDNA fragment of 565 nucleotides. Chemical sequence analysis of the cDNA fragment revealed that it was complementary to a mRNA coding for somatostatin. The 565-nucleotide cDNA hybridizes strongly with a...
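
    The relationship between the primer and the peptide it was deduced from can be checked with a short Python sketch. It assumes the primer is written 5' to 3' and uses a deliberately partial codon table containing only the codons needed here; the variable and function names are hypothetical. The reverse complement of the primer is the mRNA-sense sequence (written with T in place of U) to which the primer anneals.

```python
# Partial standard genetic code: only the codons relevant to this example.
CODONS = {"TTT": "Phe", "TTC": "Phe", "TGG": "Trp", "AAA": "Lys", "AAG": "Lys"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


primer = "TTCCAGAAGAA"                  # d(T-T-C-C-A-G-A-A-G-A-A)
sense = reverse_complement(primer)      # strand the primer base-pairs with

# Split the sense strand into complete codons; 11 nt gives 3 codons + 2 nt.
complete = len(sense) - len(sense) % 3
residues = [CODONS[sense[i:i + 3]] for i in range(0, complete, 3)]
tail = sense[complete:]

print(sense)      # sense-strand sequence
print(residues)   # amino acids encoded by the complete codons
print(tail)       # leftover bases: the start of the fourth codon
```

    The three complete codons read Phe-Phe-Trp, and the two leftover bases (AA) are the first two positions of either lysine codon, which is consistent with the 11-nucleotide primer covering Phe-Phe-Trp-Lys without committing to the degenerate third base of Lys.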

  19. Phylogenetic relationships within Pelargonium section Peristera (Geraniaceae) inferred from nrDNA and cpDNA sequence comparisons.

    NARCIS (Netherlands)

    Bakker, F.T.; Helbrugge, D.; Culham, A.; Gibby, M.

    1998-01-01

    Phylogenetic analysis of nrDNA ITS and trnL (UAA) 5' exon-trnF (GAA) chloroplast DNA sequences from 17 species of Pelargonium sect. Peristera, together with nine putative outgroups, suggests paraphyly for the section and a close relationship between the highly disjunct South African and Australian spe

  20. A Microbiome DNA Enrichment Method for Next-Generation Sequencing Sample Preparation.

    Science.gov (United States)

    Yigit, Erbay; Feehery, George R; Langhorst, Bradley W; Stewart, Fiona J; Dimalanta, Eileen T; Pradhan, Sriharsa; Slatko, Barton; Gardner, Andrew F; McFarland, James; Sumner, Christine; Davis, Theodore B

    2016-01-01

    "Microbiome" is used to describe the communities of microorganisms and their genes in a particular environment, including communities in association with a eukaryotic host or part of a host. One challenge in microbiome analysis concerns the presence of host DNA in samples. Removal of host DNA before sequencing results in greater sequence depth of the intended microbiome target population. This unit describes a novel method of microbial DNA enrichment in which methylated host DNA such as human genomic DNA is selectively bound and separated from microbial DNA before next-generation sequencing (NGS) library construction. This microbiome enrichment technique yields a higher fraction of microbial sequencing reads and improved read quality resulting in a reduced cost of downstream data generation and analysis. © 2016 by John Wiley & Sons, Inc. PMID:27366894

  1. Optimized Protocol for Simple Extraction of High-Quality Genomic DNA from Clostridium difficile for Whole-Genome Sequencing

    OpenAIRE

    Sim, James Heng Chiak; Anikst, Victoria; Lohith, Akshar; Pourmand, Nader; Banaei, Niaz

    2015-01-01

    Successful sequencing of the Clostridium difficile genome requires high-quality genomic DNA (gDNA) as the starting material. gDNA extraction using conventional methods is laborious. We describe here an optimized method for the simple extraction of C. difficile gDNA using the QIAamp DNA minikit, which yielded high-quality sequence reads on the Illumina MiSeq platform.

  2. Sequencing the hypervariable regions of human mitochondrial DNA using massively parallel sequencing: Enhanced data acquisition for DNA samples encountered in forensic testing.

    Science.gov (United States)

    Davis, Carey; Peters, Dixie; Warshauer, David; King, Jonathan; Budowle, Bruce

    2015-03-01

    Mitochondrial DNA testing is a useful tool in the analysis of forensic biological evidence. In cases where nuclear DNA is damaged or limited in quantity, the higher copy number of mitochondrial genomes available in a sample can provide information about the source of a sample. Currently, Sanger-type sequencing (STS) is the primary method to develop mitochondrial DNA profiles. This method is laborious and time consuming. Massively parallel sequencing (MPS) can increase the amount of information obtained from mitochondrial DNA samples while improving turnaround time by decreasing the numbers of manipulations and more so by exploiting high throughput analyses to obtain interpretable results. In this study 18 buccal swabs, three different tissue samples from five individuals, and four bone samples from casework were sequenced at hypervariable regions I and II using STS and MPS. Sample enrichment for STS and MPS was PCR-based. Library preparation for MPS was performed using Nextera® XT DNA Sample Preparation Kit and sequencing was performed on the MiSeq™ (Illumina, Inc.). MPS yielded full concordance of base calls with STS results, and the newer methodology was able to resolve length heteroplasmy in homopolymeric regions. This study demonstrates short amplicon MPS of mitochondrial DNA is feasible, can provide information not possible with STS, and lays the groundwork for development of a whole genome sequencing strategy for degraded samples.

  3. Sequence analysis of three mitochondrial DNA molecules reveals interesting differences among Saccharomyces yeasts

    DEFF Research Database (Denmark)

    Langkjær, Rikke Breinhold; Casaregola, S.; Ussery, David;

    2003-01-01

    The complete sequences of mitochondrial DNA (mtDNA) from the two budding yeasts Saccharomyces castellii and Saccharomyces servazzii, consisting of 25 753 and 30 782 bp, respectively, were analysed and compared to Saccharomyces cerevisiae mtDNA. While some of the traits are very similar among...

  4. Carrier molecules and extraction of circulating tumor DNA for next generation sequencing in colorectal cancer.

    Science.gov (United States)

    Beránek, Martin; Sirák, Igor; Vošmik, Milan; Petera, Jiří; Drastíková, Monika; Palička, Vladimír

    2016-01-01

    The aims of the study were: i) to compare circulating tumor DNA (ctDNA) yields obtained by different manual extraction procedures, ii) to evaluate the addition of various carrier molecules into the plasma to improve ctDNA extraction recovery, and iii) to use next generation sequencing (NGS) technology to analyze KRAS, BRAF, and NRAS somatic mutations in ctDNA from patients with metastatic colorectal cancer. Venous blood was obtained from patients who suffered from metastatic colorectal carcinoma. For plasma ctDNA extraction, the following carriers were tested: carrier RNA, polyadenylic acid, glycogen, linear acrylamide, yeast tRNA, salmon sperm DNA, and herring sperm DNA. Each extract was characterized by quantitative real-time PCR and next generation sequencing. The addition of polyadenylic acid had a significant positive effect on the amount of ctDNA eluted. The sequencing data revealed five cases of ctDNA mutated in KRAS and one patient with a BRAF mutation. An agreement of 86% was found between tumor tissues and ctDNA. Testing somatic mutations in ctDNA seems to be a promising tool to monitor dynamically changing genotypes of tumor cells circulating in the body. The optimized process of ctDNA extraction should help to obtain more reliable sequencing data in patients with metastatic colorectal cancer. PMID:27526306

  5. Systematic sequencing of cDNA clones using the transposon Tn5

    OpenAIRE

    Shevchenko, Yuriy; Bouffard, Gerard G.; Butterfield, Yaron S.N.; Blakesley, Robert W.; Hartley, James L.; Young, Alice C.; Marco A. Marra; Jones, Steven J M; Touchman, Jeffrey W.; Green, Eric D.

    2002-01-01

    In parallel with the production of genomic sequence data, attention is being focused on the generation of comprehensive cDNA-sequence resources. Such efforts are increasingly emphasizing the production of high-accuracy sequence corresponding to the entire insert of cDNA clones, especially those presumed to reflect the full-length mRNA. The complete sequencing of cDNA clones on a large scale presents unique challenges because of the generally small, yet heterogeneous, sizes of the cloned inser...

  6. Identification of DNA Sequences Specific for Vibrio vulnificus Biotype 2 Strains by Suppression Subtractive Hybridization

    OpenAIRE

    Lee, Chung-Te; Amaro, Carmen; Sanjuán, Eva; Hor, Lien-I

    2005-01-01

    Vibrio vulnificus can be divided into three biotypes, and only biotype 2, which is further divided into serovars, contains eel-virulent strains. We compared the genomic DNA of a biotype 2 serovar E isolate (tester) with the genomic DNAs of three biotype 1 strains by suppression subtractive hybridization and then tested the distribution of the tester-specific DNA sequences in a wide collection of bacterial strains. In this way we identified three plasmid-borne DNA sequences that were specific ...

  7. How effective is graphene nanopore geometry on DNA sequencing?

    OpenAIRE

    Satarifard, Vahid; Foroutan, Masumeh; Ejtehadi, Mohammad Reza

    2015-01-01

    In this paper we investigate the effects of graphene nanopore geometry on homopolymer ssDNA pulling process through nanopore using steered molecular dynamic (SMD) simulations. Different graphene nanopores are examined including axially symmetric and asymmetric monolayer graphene nanopores as well as five layer graphene polyhedral crystals (GPC). The pulling force profile, moving fashion of ssDNA, work done in irreversible DNA pulling and orientations of DNA bases near the nanopore are assesse...

  8. Molecular characterization and phylogeny of whipworm nematodes inferred from DNA sequences of cox1 mtDNA and 18S rDNA.

    Science.gov (United States)

    Callejón, Rocío; Nadler, Steven; De Rojas, Manuel; Zurita, Antonio; Petrášová, Jana; Cutillas, Cristina

    2013-11-01

    A molecular phylogenetic hypothesis is presented for the genus Trichuris based on sequence data from the mitochondrial cytochrome c oxidase 1 (cox1) and ribosomal 18S genes. The taxa consisted of different described species and several host-associated isolates (undescribed taxa) of Trichuris collected from hosts from Spain. Sequence data from mitochondrial cox1 (partial gene) and nuclear 18S near-complete gene were analyzed by maximum likelihood and Bayesian inference methods, as separate and combined datasets, to evaluate phylogenetic relationships among taxa. Phylogenetic results based on 18S ribosomal DNA (rDNA) were robust for relationships among species; cox1 sequences delimited species and revealed phylogeographic variation, but most relationships among Trichuris species were poorly resolved by mitochondrial sequences. The phylogenetic hypotheses for both genes strongly supported monophyly of Trichuris, and distinct genetic lineages corresponding to described species or nematodes associated with certain hosts were recognized based on cox1 sequences. Phylogenetic reconstructions based on concatenated sequences of the two loci, cox1 (mitochondrial DNA (mtDNA)) and 18S rDNA, were congruent with the overall topology inferred from 18S and previously published results based on internal transcribed spacer sequences. Our results demonstrate that the 18S rDNA and cox1 mtDNA genes provide resolution at different levels, but together resolve relationships among geographic populations and species in the genus Trichuris.

  9. Dramatic reduction of sequence artefacts from DNA isolated from formalin-fixed cancer biopsies by treatment with uracil-DNA glycosylase.

    Science.gov (United States)

    Do, Hongdo; Dobrovic, Alexander

    2012-05-01

    Non-reproducible sequence artefacts are frequently detected in DNA from formalin-fixed and paraffin-embedded (FFPE) tissues. However, no rational strategy has been developed for reduction of sequence artefacts from FFPE DNA as the underlying causes of the artefacts are poorly understood. As cytosine deamination to uracil is a common form of DNA damage in ancient DNA, we set out to examine whether treatment of FFPE DNA with uracil-DNA glycosylase (UDG) would lead to the reduction of C>T (and G>A) sequence artefacts. Heteroduplex formation in high resolution melting (HRM)-based assays was used for the detection of sequence variants in FFPE DNA samples. A set of samples that gave false positive HRM results for screening for the E17K mutation in exon 4 of the AKT1 gene were chosen for analysis. Sequencing of these samples showed multiple non-reproducible C:G>T:A artefacts. Treatment of the FFPE DNA with UDG prior to PCR amplification led to a very marked reduction of the sequence artefacts as indicated by both HRM and sequencing analysis, indicating that uracil lesions are the major cause of sequence artefacts. Similar results were shown for the BRAF V600 region in the same sample set and EGFR exon 19 in another sample set. UDG treatment specifically suppressed the formation of artefacts in FFPE DNA as it did not affect the detection of true KRAS codon 12 and true EGFR exon 19 and 20 mutations. We conclude that uracil in FFPE DNA leads to a significant proportion of sequence artefacts. These can be minimised by a simple UDG pretreatment which can be readily carried out, in the same tube, as the PCR immediately prior to commencing thermal cycling. HRM is a convenient way of monitoring both the degree of damage and the effectiveness of the UDG treatment. These findings have immediate and important implications for cancer diagnostics where FFPE DNA is used as the primary genetic material for mutational studies guiding personalised medicine strategies and where simple
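
    Because cytosine deamination appears as C>T on one strand and G>A on the other, a simple way to gauge this kind of damage is to tally C:G>T:A changes separately from all other mismatches. The Python sketch below does this for a pair of already aligned, equal-length sequences. It is a hypothetical illustration, not the HRM or sequencing analysis used in the study; the function name and example sequences are assumptions.

```python
def classify_substitutions(reference, read):
    """Count C:G>T:A mismatches versus all other mismatches.

    Both inputs must be aligned, gap-free strings of equal length.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must have the same length")
    counts = {"C:G>T:A": 0, "other": 0}
    for ref_base, read_base in zip(reference, read):
        if ref_base == read_base:
            continue
        if (ref_base, read_base) in {("C", "T"), ("G", "A")}:
            counts["C:G>T:A"] += 1
        else:
            counts["other"] += 1
    return counts


# Hypothetical reference and read: one C>T and one G>A difference.
print(classify_substitutions("ACGTCCGA", "ATGTCCAA"))
```

    Comparing this tally before and after a treatment such as UDG would show whether the C:G>T:A class shrinks while other mismatches stay unchanged.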

  10. The organisation and evolution of a repeated DNA sequence family in related Allium species

    OpenAIRE

    Evans, Ian Jeffrey

    1983-01-01

    A large proportion of the genomes of species belonging to the genus Allium comprises repetitive sequence DNA, a component implicated as a cause of the large variation in C-values between even closely related species. The work presented here represents part of the first phase in the characterisation of some of these repetitive sequences in a number of Allium species. One repetitive DNA sequence family, BIOOO, isolated from the genome of A. sativum, has been characterised with respect to the...

  11. Compilation of human mtDNA control region sequences.

    OpenAIRE

    Handt, O.; Meyer, S.; von Haeseler, A

    1998-01-01

    This paper describes the organisation of a database for human mitochondrial control-region sequences. The data are divided into three ASCII files that contain aligned sequences from the hypervariable region I (HVRI), aligned sequences from the hypervariable region II (HVRII), and the available information about the individuals from whom the sequences stem. The current collection comprises 4079 HVRI and 969 HVRII sequences. Sequences of both HVRI and HVRII are available for 728 individuals. For easy access, t...

  12. Anonymous Boh on the open art landscape / Raivo Kelomees

    Index Scriptorium Estoniae

    Kelomees, Raivo, 1960-

    2010-01-01

    Anonymous Boh's exhibition at the Tartu Art House (Tartu Kunstimaja), open until 30 July 2010. Together with Non Grata, Anonymous Boh has staged performances in Europe, America and Asia. Anonymous Boh's answers to questions about the exhibition and the artist's work

  13. Quantum communications with an anonymous receiver

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    A new protocol for the anonymous communication of quantum information is proposed. The anonymity of the receiver and the privacy of the quantum information are perfectly protected except with exponentially small probability in this protocol. Furthermore, this protocol uses single photons to construct anonymous entanglement instead of multipartite entangled states, and therefore it reduces quantum resources compared with the pioneering work.

  14. Analysis of T-DNA/Host-Plant DNA Junction Sequences in Single-Copy Transgenic Barley Lines

    Directory of Open Access Journals (Sweden)

    Joanne G. Bartlett

    2014-01-01

    Sequencing across the junction between an integrated transfer DNA (T-DNA) and a host plant genome provides two important pieces of information. The junctions themselves provide information regarding the proportion of T-DNA which has integrated into the host plant genome, whilst the transgene flanking sequences can be used to study the local genetic environment of the integrated transgene. In addition, this information is important in the safety assessment of GM crops and essential for GM traceability. In this study, a detailed analysis was carried out on the right-border T-DNA junction sequences of single-copy independent transgenic barley lines. T-DNA truncations at the right border were found to be relatively common and affected 33.3% of the lines. In addition, 14.3% of lines had rearranged construct sequence after the right-border break-point. An in-depth analysis of the host-plant flanking sequences revealed that a significant proportion of the T-DNAs integrated into or close to known repetitive elements. However, this integration into repetitive DNA did not have a negative effect on transgene expression.

  15. An improved chloroplast DNA extraction procedure for whole plastid genome sequencing.

    Directory of Open Access Journals (Sweden)

    Chao Shi

    BACKGROUND: Chloroplast genomes supply valuable genetic information for evolutionary and functional studies in plants. The past five years have witnessed a dramatic increase in the number of completely sequenced chloroplast genomes with the application of second-generation sequencing technology in plastid genome sequencing projects. However, cost-effective high-throughput chloroplast DNA (cpDNA) extraction has become a major bottleneck restricting this application, as conventional methods struggle to balance the quality and yield of cpDNA. METHODOLOGY/PRINCIPAL FINDINGS: We first tested two traditional methods to isolate cpDNA from three species, Oryza brachyantha, Leersia japonica and Prinsepia utilis. Both failed to yield properly defined cpDNA bands. We then developed a simple but efficient method based on sucrose gradients and found that the modified protocol worked efficiently to isolate cpDNA from the same three plant species. We sequenced the isolated DNA samples with Illumina (Solexa) sequencing technology and tested cpDNA purity by aligning sequence reads to the reference chloroplast genomes, showing that the reference genome was properly covered. We show that 40-50% cpDNA purity is achieved with our method. CONCLUSION: Here we provide an improved method for isolating cpDNA from angiosperms. The Illumina sequencing results suggest that the isolated cpDNA has sufficient yield and purity for subsequent genome assembly. The cpDNA isolation protocol should thus be widely applicable to plant chloroplast genome sequencing projects.

  16. Replication of cloned DNA containing the Alu family sequence during cell extract-promoting simian virus 40 DNA synthesis.

    OpenAIRE

    Ariga, H

    1984-01-01

    The replicating activity of several cloned DNAs containing putative origin sequences was examined in a cell-free extract that absolutely depends on simian virus 40 (SV40) T antigen promoting initiation of SV40 DNA replication in vitro. Of the three DNAs containing the human Alu family sequence (BLUR8), the origin of Saccharomyces cerevisiae plasmid 2 micron DNA (pJD29), and the yeast autonomous replicating sequence (YRp7), only BLUR8 was active as a template. Replication in a reaction mixtur...

  17. SERS-melting: a new method for discriminating mutations in DNA sequences

    OpenAIRE

    Mahajan, Sumeet; Richardson, James; Brown, Tom; Bartlett, Philip N

    2008-01-01

    The reliable discrimination of mutations, single nucleotide polymorphisms (SNPs), and other differences in genomic sequence is an essential part of DNA diagnostics and forensics. It is commonly achieved using fluorescently labeled DNA probes and thermal gradients to distinguish between the matched and mismatched DNA. Here, we describe a novel method that uses surface enhanced (resonance) Raman spectroscopy (SER(R)S) to follow denaturation of dsDNA attached to a structured gold surface. T...

  18. Thermoelectric effect and its dependence on molecular length and sequence in single DNA molecules

    OpenAIRE

    Li, Yueqi; Xiang, Limin; Palma, Julio L.; ASAI, Yoshihiro; Tao, Nongjian

    2016-01-01

    Studying the thermoelectric effect in DNA is important for unravelling charge transport mechanisms and for developing relevant applications of DNA molecules. Here we report a study of the thermoelectric effect in single DNA molecules. By varying the molecular length and sequence, we tune the charge transport in DNA to either a hopping- or a tunnelling-dominated regime. The thermoelectric effect is small and insensitive to the molecular length in the hopping regime. In contrast, the thermoelect...

  19. Global matrilineal population structure in sperm whales as indicated by mitochondrial DNA sequences.

    OpenAIRE

    Lyrholm, T; Gyllensten, U

    1998-01-01

    The genetic variability and population structure of worldwide populations of the sperm whale was investigated by sequence analysis of the first 5'L 330 base pairs in the mitochondrial DNA (mtDNA) control region. The study included a total of 231 individuals from three major oceanic regions, the North Atlantic, the North Pacific and the Southern Hemisphere. Fifteen segregating nucleotide sites defined 16 mtDNA haplotypes (lineages). The most common mtDNA types were present in more than one oce...

  20. The complete nucleotide sequence of the mitochondrial DNA of the dogfish, Scyliorhinus canicula.

    OpenAIRE

    Delarbre, C; Spruyt, N; Delmarre, C; Gallut, C; Barriel, V.; Janvier, P.; Laudet, V; Gachelin, G

    1998-01-01

    We have determined the complete nucleotide sequence of the mitochondrial DNA (mtDNA) of the dogfish, Scyliorhinus canicula. The 16,697-bp-long mtDNA possesses a gene organization identical to that of the Osteichthyes, but different from that of the sea lamprey Petromyzon marinus. The main features of the mtDNA of osteichthyans were thus established in the common ancestor to chondrichthyans and osteichthyans. The phylogenetic analysis confirms that the Chondrichthyes are the sister group of th...

  1. High-throughput sequencing of nematode communities from total soil DNA extractions

    DEFF Research Database (Denmark)

    Sapkota, Rumakanta; Nicolaisen, Mogens

    2015-01-01

    nematodes without the need for enrichment was developed. Using this strategy on DNA templates from a set of 22 agricultural soils, we obtained 64.4% sequences of nematode origin in total, whereas the remaining sequences were almost entirely from other metazoans. The nematode sequences were derived from...

  2. Cell-free DNA next-generation sequencing in pancreatobiliary carcinomas

    Science.gov (United States)

    Zill, Oliver A.; Greene, Claire; Sebisanovic, Dragan; Siew, LaiMun; Leng, Jim; Vu, Mary; Hendifar, Andrew E.; Wang, Zhen; Atreya, Chloe E.; Kelley, Robin K.; Van Loon, Katherine; Ko, Andrew H.; Tempero, Margaret A.; Bivona, Trever G.; Munster, Pamela N.; Talasaz, AmirAli; Collisson, Eric A.

    2015-01-01

    Patients with pancreatic and biliary carcinomas lack personalized treatment options, in part because biopsies are often inadequate for molecular characterization. Cell-free DNA (cfDNA) sequencing may enable a precision oncology approach in this setting. We attempted to prospectively analyze 54 genes in tumor and cfDNA for 26 patients. Tumor sequencing failed in nine patients (35%). In the remaining 17, 90.3% (95% CI: 73.1–97.5%) of mutations detected in tumor biopsies were also detected in cfDNA. The diagnostic accuracy of cfDNA sequencing was 97.7%, with 92.3% average sensitivity and 100% specificity across five informative genes. Changes in cfDNA correlated well with tumor marker dynamics in serial sampling (r=0.93). We demonstrate that cfDNA sequencing is feasible, accurate, and sensitive in identifying tumor-derived mutations without prior knowledge of tumor genotype or the abundance of circulating tumor DNA. cfDNA sequencing should be considered in pancreatobiliary cancer trials where tissue sampling is unsafe, infeasible, or otherwise unsuccessful. PMID:26109333
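
    The sensitivity, specificity and accuracy figures quoted above are standard confusion-matrix ratios. The following is a minimal sketch of how such figures are computed; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: tumor mutations detected / missed in cfDNA
    tn/fp: mutation-free sites correctly / wrongly called in cfDNA
    """
    sensitivity = tp / (tp + fn)            # true-positive rate
    specificity = tn / (tn + fp)            # true-negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts chosen only for illustration.
sens, spec, acc = diagnostic_metrics(tp=9, fp=0, tn=40, fn=1)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```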

  3. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

    Science.gov (United States)

    Schloss, Patrick D; Jenior, Matthew L; Koumpouras, Charles C; Westcott, Sarah L; Highlander, Sarah K

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  4. The DNA sequence, annotation and analysis of human chromosome 3

    DEFF Research Database (Denmark)

    Muzny, Donna M; Scherer, Steven E; Kaul, Rajinder;

    2006-01-01

    After the completion of a draft human genome sequence, the International Human Genome Sequencing Consortium has proceeded to finish and annotate each of the 24 chromosomes comprising the human genome. Here we describe the sequencing and analysis of human chromosome 3, one of the largest human chr...

  5. Nanopore DNA sequencing and epigenetic detection with a MspA nanopore

    Science.gov (United States)

    Laszlo, Andrew H.

    DNA forms the molecular basis for all known life. Widespread DNA sequencing has the potential to revolutionize healthcare and our understanding of the life sciences. Sequencing has already had a profound effect on our understanding of the molecular basis of life and underpinnings of disease. Current DNA sequencing technologies require costly reagents, can sequence only short DNA strands, and take too long to complete entire genomes. Furthermore, the required DNA sample size limits the types of experiments that can be run. For instance sequencing single cells is extremely difficult. New technologies are key to making DNA sequencing as cheap and accessible as possible and for making new experiments possible. One such new technology is nanopore sequencing. In nanopore sequencing, a thin membrane is used to divide a salt solution into two wells: cis and trans. This membrane contains a single nanometer sized hole that forms the only electrical connection between the two wells. When a voltage is applied across the membrane, ion current flows through the nanopore. This ion current is the primary signal for nanopore sequencing. DNA is negatively charged and can be pulled into the pore. When DNA is pulled into the pore, it occludes the pore and reduces the ion current that can pass through the pore. Individual DNA nucleotides along the DNA strand block the pore to varying degrees. One can measure the degree to which the pore is blocked as DNA passes through the pore and use the ion current signal to read off the DNA sequence. This thesis chronicles recent advances in the Gundlach laboratory in which I have played a leading role. It describes our work testing the biological nanopore Mycobacterium smegmatis porin A (MspA) for nanopore sequencing. The thesis consists of five chapters and three appendices which contain supplemental information for Chapters 2, 3, and 4. Chapter 1 begins with some motivation and defines the current challenges in DNA sequencing. 
    I also introduce

  6. Ecological niche modelling and nDNA sequencing support a new, morphologically cryptic beetle species unveiled by DNA barcoding.

    Directory of Open Access Journals (Sweden)

    Oliver Hawlitschek

    BACKGROUND: DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. METHODOLOGY/PRINCIPAL FINDINGS: The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein-encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp.n. CONCLUSION/SIGNIFICANCE: In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species.

  7. Sequencing strategy of mitochondrial HV1 and HV2 DNA with length heteroplasmy

    DEFF Research Database (Denmark)

    Rasmussen, Erik Michael; Sørensen, E; Eriksen, Birthe;

    2002-01-01

    We describe a method to obtain reliable mitochondrial DNA (mtDNA) sequences downstream of the homopolymeric stretches with length heteroplasmy in the sequencing direction. The method is based on the use of junction primers that bind to a part of the homopolymeric stretch and the first 2-4 bases...... downstream of the homopolymeric region. This junction primer method gave clear and unambiguous results using samples from 21 individuals with length heteroplasmy in the hypervariable regions HV1, HV2 or both. The method is of special value for forensic casework, because sequencing of both strands of an mtDNA...

  8. Divergence between samples of chimpanzee and human DNA sequences is 5%, counting indels

    OpenAIRE

    Britten, Roy J.

    2002-01-01

    Five chimpanzee bacterial artificial chromosome (BAC) sequences (described in GenBank) have been compared with the best matching regions of the human genome sequence to assay the amount and kind of DNA divergence. The conclusion is that the old saw that we share 98.5% of our DNA sequence with chimpanzee is probably in error. For this sample, a better estimate would be that 95% of the base pairs are exactly shared between chimpanzee and human DNA. In this sample of 779 kb, the divergence due to bas...
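
    The gap between the two percentages comes from whether bases inside insertions and deletions count as differences. The sketch below illustrates this on a toy pairwise alignment. The function name, the sequences and the choice to count each gapped column as one differing base are assumptions made for illustration, not the author's exact procedure.

```python
def identity_stats(aligned_a, aligned_b):
    """Return (identity ignoring indels, identity counting indels).

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = substitutions = gap_columns = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue                      # column empty in both rows
        if x == "-" or y == "-":
            gap_columns += 1              # base present in only one sequence
        elif x == y:
            matches += 1
        else:
            substitutions += 1
    aligned_pairs = matches + substitutions
    ignoring_indels = matches / aligned_pairs
    counting_indels = matches / (aligned_pairs + gap_columns)
    return ignoring_indels, counting_indels


# Toy alignment: 7 matches, 1 substitution, 2 gapped columns.
print(identity_stats("ACGTACGTAC", "ACGTTCG--C"))
```

    On the toy input the identity falls from 7/8 to 7/10 once gapped columns are counted, which is the same direction of change the abstract describes.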

  9. Repetitive Sequences in Plant Nuclear DNA: Types, Distribution, Evolution and Function

    Institute of Scientific and Technical Information of China (English)

    Shweta Mehrotra; Vinod Goyal

    2014-01-01

    Repetitive DNA sequences are a major component of eukaryotic genomes and may account for up to 90% of the genome size. They can be divided into minisatellite, microsatellite and satellite sequences. Satellite DNA sequences are considered to be a fast-evolving component of eukaryotic genomes, comprising tandemly-arrayed, highly-repetitive and highly-conserved monomer sequences. The monomer unit of satellite DNA is 150-400 base pairs (bp) in length. Repetitive sequences may be species- or genus-specific, and may be centromeric or subtelomeric in nature. They exhibit cohesive and concerted evolution caused by molecular drive, leading to high sequence homogeneity. Repetitive sequences accumulate variations in sequence and copy number during evolution, hence they are important tools for taxonomic and phylogenetic studies, and are known as "tuning knobs" in evolution. Therefore, knowledge of repetitive sequences assists our understanding of the organization, evolution and behavior of eukaryotic genomes. Repetitive sequences have cytoplasmic, cellular and developmental effects and play a role in chromosomal recombination. In the post-genomics era, with the introduction of next-generation sequencing technology, it is possible to evaluate complex genomes for analyzing repetitive sequences and deciphering the yet unknown functional potential of repetitive sequences.

  10. cDNA sequence of a new chicken embryonic rho-globin.

    OpenAIRE

    Roninson, I B; Ingram, V M

    1981-01-01

    In order to use specific DNA probes for the study of developmentally regulated gene expression, we have prepared cDNA clones corresponding to chicken embryonic globins by inserting cDNA.mRNA hybrids into the Pst I site of the plasmid pBR322 by using poly(dG) and poly(dC) linkers. The nucleotide sequence of the insert of one clone, representing a nearly full-length copy of an embryonic beta-like globin cDNA, has been determined. The amino acid sequence of the globin encoded by this insert is i...

  11. Sequence-specific Hydrolysis of Single-stranded DNA by PNA-Cerium (Ⅳ) Adduct

    Institute of Scientific and Technical Information of China (English)

    He Bai SHEN; Feng WANG; Yong Tao YANG

    2005-01-01

    A novel artificial site-specific cleavage reagent, with peptide nucleic acid (PNA) as the sequence-recognizing moiety and cerium (Ⅳ) ions as "scissors" for cleaving target DNA, was synthesized. Subsequently, it was employed in the cleavage of a target 26-mer single-stranded DNA (ssDNA), which has a 10-mer sequence complementary to the PNA recognizer in the hybrids, under physiological conditions. Reversed-phase high-performance liquid chromatography (RP-HPLC) experiments indicated that the artificial site-specific cleavage reagent could cleave the target DNA specifically.

  12. Targeting DNA with triplex-forming oligonucleotides to modify gene sequence.

    Science.gov (United States)

    Simon, Philippe; Cannata, Fabio; Concordet, Jean-Paul; Giovannangeli, Carine

    2008-08-01

    Molecules that interact with DNA in a sequence-specific manner are attractive tools for manipulating gene sequence and expression. For example, triplex-forming oligonucleotides (TFOs), which bind to oligopyrimidine.oligopurine sequences via Hoogsteen hydrogen bonds, have been used to inhibit gene expression at the DNA level as well as to induce targeted mutagenesis in model systems. Recent advances in using oligonucleotides and analogs to target DNA in a sequence-specific manner will be discussed. In particular, chemical modification of TFOs has been used to improve binding to chromosomal target sequences in living cells. Various oligonucleotide analogs have also been found to expand the range of sequences amenable to manipulation, including so-called "Zorro" locked nucleic acids (LNAs) and pseudo-complementary peptide nucleic acids (pcPNAs). Finally, we will examine the potential of TFOs for directing targeted gene sequence modification and propose that synthetic nucleases, based on conjugation of sequence-specific DNA ligands to DNA damaging molecules, are a promising alternative to protein-based endonucleases for targeted gene sequence modification. PMID:18460344

  13. Bisulfite sequencing reveals that Aspergillus flavus holds a hollow in DNA methylation.

    Directory of Open Access Journals (Sweden)

    Si-Yang Liu

    Aspergillus flavus first gained scientific attention for its production of aflatoxin. The underlying regulation of aflatoxin biosynthesis has served as a theoretical model for the biosynthesis of other microbial secondary metabolites. Nevertheless, for several decades the status of DNA methylation, one of the important epigenomic modifications involved in gene regulation, has remained controversial in A. flavus. Here, we applied bisulfite sequencing in conjunction with a biological replicate strategy to investigate the DNA methylation profile of the A. flavus genome. Both the bisulfite sequencing data and the methylome comparisons with other fungi confirm that the DNA methylation level of this fungus is negligible. Further investigation into the DNA methyltransferase of Aspergillus uncovers its close relationship with RID-like enzymes as well as its divergence from the methyltransferases of species with validated DNA methylation. The lack of repeat content in the A. flavus genome and the high RIP-index of the small amount of remnant repeats potentially support our speculation that DNA methylation may be absent in A. flavus, or that it may possess de novo DNA methylation which occurs very transiently during the obscure sexual stage of this fungal species. This work contributes to our understanding of the DNA methylation status of A. flavus and reinforces our views on DNA methylation in fungal species. In addition, our strategy of applying bisulfite sequencing to DNA methylation detection in species with low DNA methylation may serve as a reference for later investigations in other hypomethylated species.

  14. Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood.

    Science.gov (United States)

    Fan, H Christina; Blumenfeld, Yair J; Chitkara, Usha; Hudgins, Louanne; Quake, Stephen R

    2008-10-21

    We directly sequenced cell-free DNA with high-throughput shotgun sequencing technology from plasma of pregnant women, obtaining, on average, 5 million sequence tags per patient sample. This enabled us to measure the over- and underrepresentation of chromosomes from an aneuploid fetus. The sequencing approach is polymorphism-independent and therefore universally applicable for the noninvasive detection of fetal aneuploidy. Using this method, we successfully identified all nine cases of trisomy 21 (Down syndrome), two cases of trisomy 18 (Edwards syndrome), and one case of trisomy 13 (Patau syndrome) in a cohort of 18 normal and aneuploid pregnancies; trisomy was detected at gestational ages as early as the 14th week. Direct sequencing also allowed us to study the characteristics of cell-free plasma DNA, and we found evidence that this DNA is enriched for sequences from nucleosomes. PMID:18838674
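
    One common way to quantify chromosome over-representation from tag counts is a z-score against euploid reference samples. The sketch below shows that idea only. It is a swapped-in illustration, not necessarily the statistic used by the authors, and all names and numbers in it are hypothetical.

```python
from statistics import mean, stdev


def chromosome_fraction(tag_counts, chrom):
    """Fraction of all mapped sequence tags that fall on one chromosome."""
    total = sum(tag_counts.values())
    return tag_counts[chrom] / total


def z_score(sample_fraction, reference_fractions):
    """Standard deviations separating the sample from the reference mean."""
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)    # sample standard deviation
    return (sample_fraction - mu) / sigma


# Hypothetical chr21 tag fractions from euploid reference pregnancies.
reference = [0.0130, 0.0131, 0.0129, 0.0130]

# Hypothetical test sample; all other chromosomes are lumped together.
sample_counts = {"chr21": 68_000, "other": 4_932_000}

frac = chromosome_fraction(sample_counts, "chr21")
z = z_score(frac, reference)
print(f"chr21 fraction={frac:.4f}, z={z:.2f}")
```

    A large positive z suggests over-representation of that chromosome. In practice the cutoff and any corrections, such as for GC content, would need to be chosen and validated separately.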

  15. Characterization of an Unusually Conserved AluI Highly Reiterated DNA Sequence Family from the Honeybee, Apis mellifera

    OpenAIRE

    Tares, S.; Cornuet, J. M.; Abad, P.

    1993-01-01

    An AluI family of highly reiterated nontranscribed sequences has been found in the genome of the honeybee Apis mellifera. This repeated sequence is shown to be present at approximately 23,000 copies per haploid genome constituting about 2% of the total genomic DNA. The nucleotide sequence of 10 monomers was determined. The consensus sequence is 176 nucleotides long and has an A + T content of 58%. There are clusters of both direct and inverted repeats. Internal subrepeating units ranging from...
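
    The copy number, monomer length and genome fraction quoted above can be checked against each other with simple arithmetic. The sketch below uses only those three figures. The implied genome size is a derived estimate, not a value stated in the abstract, and the variable names are illustrative.

```python
copies_per_haploid_genome = 23_000   # from the abstract (approximate)
monomer_length_bp = 176              # consensus length from the abstract
genome_fraction = 0.02               # "about 2%" from the abstract

# Total DNA occupied by the repeat family.
repeat_bp = copies_per_haploid_genome * monomer_length_bp

# Haploid genome size implied if that DNA is about 2% of the genome.
implied_genome_bp = repeat_bp / genome_fraction

print(f"repeat DNA: {repeat_bp / 1e6:.2f} Mb")
print(f"implied haploid genome: {implied_genome_bp / 1e6:.0f} Mb")
```

    The repeat family therefore spans roughly 4 Mb, which at about 2% of the genome corresponds to a haploid genome on the order of 200 Mb.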

  16. Molecular cloning of a family of retroviral sequences found in chimpanzee but not human DNA.

    OpenAIRE

    Bonner, T I; Birkenmeier, E. H.; Gonda, M A; Mark, G E; Searfoss, G H; Todaro, G J

    1982-01-01

    A number of retrovirus-like sequences have been cloned from chimpanzee DNA which constitute the chimpanzee homologs of the endogenous colobus type C virus CPC-1. One of the clones contains a nearly complete viral genome, but others have sustained deletions of 1 to 2 kilobases in the polymerase gene. The pattern of related sequences detected in other primate species is consistent with the genetic transmission of these sequences for millions of years. However, the appropriately related sequence...

  17. Reproducibility of Illumina platform deep sequencing errors allows accurate determination of DNA barcodes in cells.

    OpenAIRE

    Beltman, J.B.; Urbanus, J.; Velds, A.; de Rooij, R.; Rohr, J.C.; Naik, S.H.; Schumacher, T.N.

    2016-01-01

    BACKGROUND Next generation sequencing (NGS) of amplified DNA is a powerful tool to describe genetic heterogeneity within cell populations that can both be used to investigate the clonal structure of cell populations and to perform genetic lineage tracing. For applications in which both abundant and rare sequences are biologically relevant, the relatively high error rate of NGS techniques complicates data analysis, as it is difficult to distinguish rare true sequences from spurious sequences t...

  18. Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood

    OpenAIRE

    Fan, H. Christina; Blumenfeld, Yair J.; Chitkara, Usha; Hudgins, Louanne; Quake, Stephen R.

    2008-01-01

    We directly sequenced cell-free DNA with high-throughput shotgun sequencing technology from plasma of pregnant women, obtaining, on average, 5 million sequence tags per patient sample. This enabled us to measure the over- and underrepresentation of chromosomes from an aneuploid fetus. The sequencing approach is polymorphism-independent and therefore universally applicable for the noninvasive detection of fetal aneuploidy. Using this method, we successfully identified all nine cases of trisomy...

  19. DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability

    OpenAIRE

    Little, Damon P.

    2011-01-01

    For DNA barcoding to succeed as a scientific endeavor an accurate and expeditious query sequence identification method is needed. Although a global multiple-sequence alignment can be generated for some barcoding markers (e.g. COI, rbcL), not all barcoding markers are as structurally conserved (e.g. matK). Thus, algorithms that depend on global multiple-sequence alignments are not universally applicable. Some sequence identification methods that use local pairwise alignments (e.g. BLAST) are u...

  20. Characterising the atypical 5'-CG DNA sequence specificity of 9-aminoacridine carboxamide Pt complexes.

    Science.gov (United States)

    Kava, Hieronimus W; Galea, Anne M; Md Jamil, Farhana; Feng, Yue; Murray, Vincent

    2014-08-01

    In this study, the DNA sequence specificity of four DNA-targeted 9-aminoacridine carboxamide Pt complexes was compared with cisplatin, using two specially constructed plasmid templates. One plasmid contained 5'-CG and 5'-GA insert sequences while the other plasmid contained a G-rich transferrin receptor gene promoter insert sequence. The damage profiles of each compound on the different DNA templates were quantified via a polymerase stop assay with fluorescently labelled primers and capillary electrophoresis. With the plasmid that contained 5'-CG and 5'-GA dinucleotides, the four 9-aminoacridine carboxamide Pt complexes produced distinctly different damage profiles as compared with cisplatin. These 9-aminoacridine complexes had greatly increased levels of DNA damage at CG and GA dinucleotides as compared with cisplatin. It was shown that the presence of a CG or GA dinucleotide was sufficient to reveal the altered DNA sequence selectivity of the 9-aminoacridine carboxamide Pt analogues. The DNA sequence specificity of the Pt complexes was also found to be similarly altered utilising the transferrin receptor DNA sequence. PMID:24827388

  1. A human cellular sequence implicated in trk oncogene activation is DNA damage inducible

    Energy Technology Data Exchange (ETDEWEB)

    Ben-Ishai, R.; Scharf, R.; Sharon, R.; Kapten, I. (Technion-Israel Institute of Technology, Haifa (Israel))

    1990-08-01

    Xeroderma pigmentosum cells, which are deficient in the repair of UV light-induced DNA damage, have been used to clone DNA-damage-inducible transcripts in human cells. The cDNA clone designated pC-5 hybridizes on RNA gel blots to a 1-kilobase transcript, which is moderately abundant in nontreated cells and whose synthesis is enhanced in human cells following UV irradiation or treatment with several other DNA-damaging agents. UV-enhanced transcription of C-5 RNA is transient and occurs at lower fluences and to a greater extent in DNA-repair-deficient than in DNA-repair-proficient cells. Southern blot analysis indicates that the C-5 gene belongs to a multigene family. A cDNA clone containing the complete coding sequence of C-5 was isolated. Sequence analysis revealed that it is homologous to a human cellular sequence encoding the amino-terminal activating sequence of the trk-2h chimeric oncogene. The presence of DNA-damage-responsive sequences at the 5' end of a chimeric oncogene could result in enhanced expression of the oncogene in response to carcinogens.

  2. Fast mitochondrial DNA isolation from mammalian cells for next-generation sequencing.

    Science.gov (United States)

    Quispe-Tintaya, Wilber; White, Ryan R; Popov, Vasily N; Vijg, Jan; Maslov, Alexander Y

    2013-09-01

    Standard methods for mitochondrial DNA (mtDNA) extraction do not provide the level of enrichment for mtDNA sufficient for direct sequencing and must be followed by long-range-PCR amplification, which can bias the sequencing results. Here, we describe a fast, cost-effective, and reliable method for preparation of mtDNA enriched samples from eukaryotic cells ready for direct sequencing. Our protocol utilizes a conventional miniprep kit, paramagnetic bead-based purification, and an optional, limited PCR amplification of mtDNA. The first two steps alone provide more than 2000-fold enrichment for mtDNA when compared with total cellular DNA (~200-fold in comparison with current commercially available kits) as demonstrated by real-time PCR. The percentage of sequencing reads aligned to mtDNA was about 22% for non-amplified samples and greater than 99% for samples subjected to 10 cycles of long-range-PCR with mtDNA specific primers.

  3. mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud.

    Science.gov (United States)

    Weissensteiner, Hansi; Forer, Lukas; Fuchsberger, Christian; Schöpf, Bernd; Kloss-Brandstätter, Anita; Specht, Günther; Kronenberg, Florian; Schönherr, Sebastian

    2016-07-01

    Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at. PMID:27084948

  4. Computational optimisation of targeted DNA sequencing for cancer detection

    DEFF Research Database (Denmark)

    Martinez, Pierre; McGranahan, Nicholas; Birkbak, Nicolai Juul;

    2013-01-01

    circulating tumour DNA (ctDNA) might represent a non-invasive method to detect mutations in patients, facilitating early detection. In this article, we define reduced gene panels from publicly available datasets as a first step to assess and optimise the potential of targeted ctDNA scans for early tumour...... detection. Dividing 4,467 samples into one discovery and two independent validation cohorts, we show that up to 76% of 10 cancer types harbour at least one mutation in a panel of only 25 genes, with high sensitivity across most tumour types. Our analyses demonstrate that targeting "hotspot" regions would...

  5. Modeling the early stage of DNA sequence recognition within RecA nucleoprotein filaments.

    Science.gov (United States)

    Saladin, Adrien; Amourda, Christopher; Poulain, Pierre; Férey, Nicolas; Baaden, Marc; Zacharias, Martin; Delalande, Olivier; Prévost, Chantal

    2010-10-01

    Homologous recombination is a fundamental process enabling the repair of double-strand breaks with a high degree of fidelity. In prokaryotes, it is carried out by RecA nucleofilaments formed on single-stranded DNA (ssDNA). These filaments incorporate genomic sequences that are homologous to the ssDNA and exchange the homologous strands. Due to the highly dynamic character of this process and its rapid propagation along the filament, the sequence recognition and strand exchange mechanism remains unknown at the structural level. The recently published structure of the RecA/DNA filament active for recombination (Chen et al., Mechanism of homologous recombination from the RecA-ssDNA/dsDNA structure, Nature 2008, 453, 489) provides a starting point for new exploration of the system. Here, we investigate the possible geometries of association of the early encounter complex between RecA/ssDNA filament and double-stranded DNA (dsDNA). Due to the huge size of the system and its dense packing, we use a reduced representation for protein and DNA together with state-of-the-art molecular modeling methods, including systematic docking and virtual reality simulations. The results indicate that it is possible for the double-stranded DNA to access the RecA-bound ssDNA while initially retaining its Watson-Crick pairing. They emphasize the importance of RecA L2 loop mobility for both recognition and strand exchange.

  6. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Directory of Open Access Journals (Sweden)

    Soichi Inagaki

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  7. Comparing the performance of three ancient DNA extraction methods for high-throughput sequencing

    DEFF Research Database (Denmark)

    Gamba, Cristina; Hanghøj, Kristian Ebbesen; Gaunitz, Charleen;

    2016-01-01

    The DNA molecules that can be extracted from archaeological and palaeontological remains are often degraded and massively contaminated with environmental microbial material. This reduces the efficacy of shotgun approaches for sequencing ancient genomes, despite the decreasing sequencing costs...... of high-throughput sequencing (HTS). Improving the recovery of endogenous molecules from the DNA extraction and purification steps could, thus, help advance the characterization of ancient genomes. Here, we apply the three most commonly used DNA extraction methods to five ancient bone samples spanning...... a ~30 thousand year temporal range and originating from a diversity of environments, from South America to Alaska. We show that methods based on the purification of DNA fragments using silica columns are more advantageous than in solution methods and increase not only the total amount of DNA molecules...

  8. Interaction of berenil with the tyrT DNA sequence studied by footprinting and molecular modelling. Implications for the design of sequence-specific DNA recognition agents.

    Science.gov (United States)

    Laughton, C A; Jenkins, T C; Fox, K R; Neidle, S

    1990-08-11

    We have developed a technique of partially-restrained molecular mechanics enthalpy minimisation which enables the sequence-dependence of the DNA binding of a non-intercalating ligand to be studied for arbitrary sequences of considerable length (≥60 base-pairs). The technique has been applied to analyse the binding of berenil to the minor groove of a 60 base-pair sequence derived from the tyrT promoter; the results are compared with those obtained by DNase I and hydroxyl radical footprinting on the same sequence. The calculated and experimentally observed patterns of binding are in good agreement. Analysis of the modelling data highlights the importance of DNA flexibility in ligand binding. Further, the electrostatic component of the interaction tends to favour binding to AT-rich regions, whilst the van der Waals interaction energy term favours GC-rich ones. The results also suggest that an important contribution to the observed preference for binding in AT-rich regions arises from lower DNA perturbation energies and is not accompanied by reduced DNA structural perturbations in such sequences. It is therefore concluded that those modes of DNA distortion favourable to binding are probably more flexible in AT-rich regions. The structure of the modelled DNA sequence has also been analysed in terms of helical parameters. For the DNA energy-minimised in the absence of berenil, certain helical parameters show marked sequence-dependence. For example, purine-pyrimidine (R-Y) base pairs show a consistent positive buckle whereas this feature is consistently negative for Y-R pairs. Further, CG steps show lower than average values of slide while GC steps show lower than average values of rise. Similar analysis of the modelling data from the calculations including berenil highlights the importance of DNA flexibility in ligand binding. We observe that the binding of berenil induces characteristic responses in different helical parameters for the base-pairs around

  9. Designing universal primers for the isolation of DNA sequences encoding Proanthocyanidins biosynthetic enzymes in Crataegus aronia

    Directory of Open Access Journals (Sweden)

    Zuiter Afnan

    2012-08-01

    Background: Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs) and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Findings: Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis, were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. Conclusion: To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants.

  10. Sequence-specific RNA Photocleavage by Single-stranded DNA in Presence of Riboflavin

    Science.gov (United States)

    Zhao, Yongyun; Chen, Gangyi; Yuan, Yi; Li, Na; Dong, Juan; Huang, Xin; Cui, Xin; Tang, Zhuo

    2015-10-01

    Constant efforts have been made to develop new methods to realize sequence-specific RNA degradation, which could inhibit the expression of a targeted gene. Herein, by using an unmodified short DNA oligonucleotide for sequence recognition and an endogenous small molecule, vitamin B2 (riboflavin), as photosensitizer, we report a simple strategy to realize the sequence-specific photocleavage of targeted RNA. The DNA strand is complementary to the target sequence and forms a DNA/RNA duplex containing a G•U wobble in the middle. The cleavage reaction goes through an oxidative elimination mechanism at the nucleoside downstream of the U of the G•U wobble in the duplex to give an unnatural RNA terminus, and the whole process is under tight control by using light as a switch, which means the cleavage could be carried out according to specific spatial and temporal requirements. The biocompatibility of this method makes the DNA strand in combination with riboflavin a promising molecular tool for RNA manipulation.

  11. Genomic Signal Processing Methods for Computation of Alignment-Free Distances from DNA Sequences

    Science.gov (United States)

    Borrayo, Ernesto; Mendizabal-Ruiz, E. Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P.; Morales, J. Alejandro

    2014-01-01

    Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments. PMID:25393409
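
    The idea summarised in this record (map each sequence to a numeric signal using doublet values, then compare signals with a DSP distance) can be sketched as follows. This is only an illustrative sketch, not the GAFD method itself: the doublet-to-amplitude table, the choice of a normalised power spectrum, and the function names `to_signal`, `power_spectrum` and `spectral_distance` are all assumptions made here, since the abstract does not give those details.

```python
import math
from itertools import product

# Hypothetical doublet table: the 16 dinucleotides in lexicographic order
# are mapped to amplitudes 0..15 (the published mapping is not given here).
DOUBLET_VALUE = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=2))}


def to_signal(seq):
    """Map a DNA string to one amplitude per overlapping doublet."""
    seq = seq.upper()
    return [DOUBLET_VALUE[seq[i:i + 2]] for i in range(len(seq) - 1)]


def power_spectrum(signal, n_bins=8):
    """Normalised DFT power at the first n_bins non-zero frequencies."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    power = []
    for k in range(1, n_bins + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(centred))
        im = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(centred))
        power.append((re * re + im * im) / n)
    total = sum(power) or 1.0  # avoid dividing by zero for a flat signal
    return [p / total for p in power]


def spectral_distance(seq_a, seq_b, n_bins=8):
    """Euclidean distance between the two normalised spectra (no alignment)."""
    pa = power_spectrum(to_signal(seq_a), n_bins)
    pb = power_spectrum(to_signal(seq_b), n_bins)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))
```

    The sketch accepts only the letters A, C, G and T and compares frequency bins by index, so it is most meaningful for fragments of similar length; using doublets rather than single bases gives 16 possible amplitude values instead of 4.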

  12. Genomic signal processing methods for computation of alignment-free distances from DNA sequences.

    Science.gov (United States)

    Borrayo, Ernesto; Mendizabal-Ruiz, E Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P; Morales, J Alejandro

    2014-01-01

    Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.

  13. Aspergillus and Penicillium identification using DNA sequences: Barcode or MLST?

    Science.gov (United States)

    Current methods in DNA technology can detect single nucleotide polymorphisms with measurable accuracy using several different approaches appropriate for different uses. If there are even single nucleotide differences that are invariant markers of the species, we can accomplish identification through...

  14. Sequence-specific DNA purification by triplex affinity capture.

    OpenAIRE

    Ito, T.; Smith, C L; Cantor, C R

    1992-01-01

    A DNA isolation procedure was developed by using triple-helix formation and magnetic separation. In this procedure, target DNA is captured by a biotinylated oligonucleotide via intermolecular triplex formation, bound to streptavidin-coated magnetic beads, and recovered in double-stranded form by elution with a mild alkaline buffer that destabilizes the triple helix. The effectiveness of the procedure was demonstrated by a model experiment with an artificially reconstructed library and, also, ...

  15. Efficiency of ITS Sequences for DNA Barcoding in Passiflora (Passifloraceae)

    Directory of Open Access Journals (Sweden)

    Giovanna Câmara Giudicelli

    2015-04-01

    DNA barcoding is a technique for discriminating and identifying species using short, variable, and standardized DNA regions. Here, we tested for the first time the performance of plastid and nuclear regions as DNA barcodes in Passiflora. This genus is largely variable, with more than 900 species of high ecological, commercial, and ornamental importance. We analyzed 1034 accessions of 222 species representing the four subgenera of Passiflora and evaluated the effectiveness of five plastid regions and three nuclear datasets currently employed as DNA barcodes in plants using barcoding gap, applied similarity-, and tree-based methods. The plastid regions were able to identify less than 45% of species, whereas the nuclear datasets were efficient for more than 50% using “best match” and “best close match” methods of TaxonDNA software. All subgenera presented higher interspecific pairwise distances and did not fully overlap with the intraspecific distance, and similarity-based methods showed better results than tree-based methods. The nuclear ribosomal internal transcribed spacer 1 (ITS1) region presented a higher discrimination power than the other datasets and also showed other desirable characteristics as a DNA barcode for this genus. Therefore, we suggest that this region should be used as a starting point to identify Passiflora species.

  16. General Strategy for the Design of DNA Coding Sequences Applied to Nanoparticle Assembly.

    Science.gov (United States)

    Calais, Théo; Baijot, Vincent; Djafari Rouhani, Mehdi; Gauchard, David; Chabal, Yves J; Rossi, Carole; Estève, Alain

    2016-09-20

    The DNA-directed assembly of nano-objects has been the subject of many recent studies as a means to construct advanced nanomaterial architectures. Although much experimental in silico work has been presented and discussed, there has been no in-depth consideration of the proper design of single-strand sticky termination of DNA sequences, noted as ssST, which is important in avoiding self-folding within one DNA strand, unwanted strand-to-strand interaction, and mismatching. In this work, a new comprehensive and computationally efficient optimization algorithm is presented for the construction of all possible DNA sequences that specifically prevents these issues. This optimization procedure is also effective when a spacer section is used, typically repeated sequences of thymine or adenine placed between the ssST and the nano-object, to address the most conventional experimental protocols. We systematically discuss the fundamental statistics of DNA sequences considering complementarities limited to two (or three) adjacent pairs to avoid self-folding and hybridization of identical strands due to unwanted complements and mismatching. The optimized DNA sequences can reach maximum lengths of 9 to 34 bases depending on the level of applied constraints. The thermodynamic properties of the allowed sequences are used to develop a ranking for each design. For instance, we show that the maximum melting temperature saturates with 14 bases under typical solvation and concentration conditions. Thus, DNA ssST with optimized sequences are developed for segments ranging from 4 to 40 bases, providing a very useful guide for all technological protocols. An experimental test is presented and discussed using the aggregation of Al and CuO nanoparticles and is shown to validate and illustrate the importance of the proposed DNA coding sequence optimization. PMID:27578445

  17. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different couples of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides of which 29,109,688 with Q20 (87.43%) in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation about the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low level of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.

  18. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Science.gov (United States)

    Bertolini, Francesca; Ghionda, Marco Ciro; D'Alessandro, Enrico; Geraci, Claudia; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different couples of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides of which 29,109,688 with Q20 (87.43%) in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation about the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low level of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.
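
    The run statistics quoted in this abstract can be checked with a few lines of arithmetic. The snippet below uses only the three counts given above; the variable names are chosen here for illustration, and the mean read length is a derived figure that the abstract itself does not report.

```python
# Counts taken from the abstract.
total_bases = 33_294_511   # called nucleotides
q20_bases = 29_109_688     # nucleotides with quality >= Q20
reads = 215_944            # total reads

# Share of called bases at Q20 or better, in percent.
q20_percent = 100 * q20_bases / total_bases

# Derived value (not stated in the abstract): average bases per read.
mean_read_length = total_bases / reads

print(f"Q20 bases: {q20_percent:.2f}%")
print(f"Mean read length: {mean_read_length:.1f} nt")
```

    The Q20 share reproduces the 87.43% stated in the abstract, and the counts imply a mean read length of roughly 154 nucleotides.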

  19. Stochastic model of homogeneous coding and latent periodicity in DNA sequences.

    Science.gov (United States)

    Chaley, Maria; Kutyrkin, Vladimir

    2016-02-01

    The concept of latent triplet periodicity in coding DNA sequences, which has been extensively discussed earlier, is confirmed by the analysis of a number of eukaryotic genomes, in which latent periodicity of a new type, called profile periodicity, is recognized in the CDSs. An original model of Stochastic Homogeneous Organization of Coding (SHOC-model) in a textual string is proposed. This model explains the existence of latent profile periodicity and regularity in DNA sequences. PMID:26656186

  20. The use of permanganate as a sequencing reagent for identification of 5-methylcytosine residues in DNA.

    OpenAIRE

    Fritzsche, E; Hayatsu, H; Igloi, G L; Iida, S.; Kössel, H

    1987-01-01

    The use of permanganate as a reagent for DNA sequencing by chemical degradation has been studied with respect to its specificity for 5-methylcytosine residues. At weakly acidic pH and room temperature, 0.2 mM potassium permanganate reacts preferentially with thymine, 5-methylcytosine, and to a lesser extent with purine residues, while cytosine remains essentially intact. Permanganate oxidation is, therefore, a suitable DNA sequencing reaction for positive discrimination between 5-methylcytosi...

  1. Construction of Agropyrum intermedium 2Ai-2 Chromosome DNA Library and Cloning of Species-Specific DNA Sequences

    Institute of Scientific and Technical Information of China (English)

    HE Cong-fen; MA You-zhi; XIN Zhi-yong; XU Qiong-fang; LI Lian-cheng

    2004-01-01

    The univalent from the meiosis-metaphase spreads of F1 (Z2 × wheat variety Wan7107) was identified as the Agropyrum intermedium 2Ai-2 chromosome by GISH. The 2Ai-2 chromosomes were microisolated and collected. After two rounds of PCR amplification, the PCR products ranged from 150 to 3,000 bp, with predominant fragments at about 200 to 2,000 bp. Using Ag. intermedium genomic DNA as a probe, Southern blotting analysis confirmed that the products originated from the Ag. intermedium genome. The products were purified, ligated to pUC18 and then transformed into competent E. coli DH5α to produce a 2Ai-2 chromosome DNA library. The microcloning experiments produced approximately 5 × 10^5 clones; the size range of the cloned inserts was 200 to 1,500 bp, with an average of 580 bp. Using Ag. intermedium genomic DNA as a probe, dot blotting results showed that 56% of the clones in the library are unique/low-copy sequences and 44% are repetitive sequences. Four Ag. intermedium clones were screened from the library by RFLP: three clones (Mag065, Mag088, Mag139) are low/single-copy sequences and one clone (Mag104) is a repetitive sequence. GISH results indicated that Mag104 is an Ag. intermedium species-specific repetitive DNA sequence.

  2. Draft versus finished sequence data for DNA and protein diagnostic signature development

    Energy Technology Data Exchange (ETDEWEB)

    Gardner, S N; Lam, M W; Smith, J R; Torres, C L; Slezak, T R

    2004-10-29

    Sequencing pathogen genomes is costly, demanding careful allocation of limited sequencing resources. We built a computational Sequencing Analysis Pipeline (SAP) to guide decisions regarding the amount of genomic sequencing necessary to develop high-quality diagnostic DNA and protein signatures. SAP uses simulations to estimate the number of target genomes and close phylogenetic relatives (near neighbors, or NNs) to sequence. We use SAP to assess whether draft data is sufficient or finished sequencing is required using Marburg and variola virus sequences. Simulations indicate that intermediate to high quality draft with error rates of 10^-3 to 10^-5 (~8x coverage) of target organisms is suitable for DNA signature prediction. Low quality draft with error rates of ~1% (3x to 6x coverage) of target isolates is inadequate for DNA signature prediction, although low quality draft of NNs is sufficient, as long as the target genomes are of high quality. For protein signature prediction, sequencing errors in target genomes substantially reduce the detection of amino acid sequence conservation, even if the draft is of high quality. In summary, high quality draft of target and low quality draft of NNs appears to be a cost-effective investment for DNA signature prediction, but may lead to underestimation of predicted protein signatures.

  3. CLONING AND ANALYSIS OF THE GENOMIC DNA SEQUENCE OF AUGMENTER OF LIVER REGENERATION FROM RAT

    Institute of Scientific and Technical Information of China (English)

    董菁; 成军; 王勤环; 施双双; 王刚; 斯崇文

    2002-01-01

    Objective. To search for the genomic DNA sequence of the augmenter of liver regeneration (ALR) of rat. Methods. Polymerase chain reaction (PCR) with specific primers was used to amplify the sequence from the rat genome. Results. A piece of genomic DNA sequence and a piece of pseudogene of rat ALR were identified. The lengths of the gene and pseudogene are 1508 bp and 442 bp, respectively. The ALR gene of rat includes 3 exons and 2 introns. The 442 bp DNA sequence may represent a pseudogene or an ALR related peptide. Predicted amino acid sequence analysis showed that there were 14 different amino acid residues between the gene and pseudogene. The ALR related peptide is 84 amino acid residues in length and relates closely to ALR protein. Conclusion. There might be a multigene family of ALR in rat.

  4. Noninvasive Prenatal Paternity Testing (NIPAT) through Maternal Plasma DNA Sequencing: A Pilot Study.

    Science.gov (United States)

    Jiang, Haojun; Xie, Yifan; Li, Xuchao; Ge, Huijuan; Deng, Yongqiang; Mu, Haofang; Feng, Xiaoli; Yin, Lu; Du, Zhou; Chen, Fang; He, Nongyue

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have already been used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we developed a noninvasive prenatal paternity test (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influencing factors (minor allele frequency (MAF), the total number of SNPs, fetal fraction and effective sequencing depth) and designed three different selective SNP panels in order to verify the performance in clinical cases. Combining targeted deep sequencing of selective SNPs with an informative bioinformatics pipeline, we calculated the combined paternity index (CPI) of 17 cases to determine paternity. Sequencing-based NIPAT results fully agreed with the invasive prenatal paternity test using an STR multiplex system. Our study proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future.
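
    For readers unfamiliar with the combined paternity index mentioned above, the standard definition is the product of the per-locus paternity indices, assuming the loci are independent. The sketch below illustrates only that general definition; it is not the authors' pipeline, the function names are chosen here for illustration, and the per-locus values in the usage note are made up.

```python
import math


def combined_paternity_index(locus_pis):
    """Multiply per-locus paternity indices (assumes independent loci)."""
    return math.prod(locus_pis)


def probability_of_paternity(cpi, prior=0.5):
    """Posterior probability of paternity from the CPI and a prior probability."""
    return cpi * prior / (cpi * prior + (1 - prior))
```

    For example, three hypothetical loci with indices 2.0, 5.0 and 10.0 give a CPI of 100, which corresponds to a posterior probability of 100/101 under a prior of 0.5. With hundreds of SNPs it is usually safer to sum logarithms of the per-locus indices than to multiply them directly.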

  5. Real sequence effects on the search dynamics of transcription factors on DNA

    DEFF Research Database (Denmark)

    Bauer, Maximilian; Rasmussen, Emil S.; Lomholt, Michael A.;

    2015-01-01

    Recent experiments show that transcription factors (TFs) indeed use the facilitated diffusion mechanism to locate their target sequences on DNA in living bacteria cells: TFs alternate between sliding motion along DNA and relocation events through the cytoplasm. From simulations and theoretical...

  6. Nucleotide sequence determination of the region in adenovirus 5 DNA involved in cell transformation

    International Nuclear Information System (INIS)

    A description is given of investigations into the primary structure of the transforming region of adenovirus type 5 DNA. The phenomenon of cell transformation is discussed in general terms, and the principles of a number of fairly recent techniques, which have been in use for DNA sequence determination since 1975, are dealt with. A few of the author's own techniques are described which deal both with nucleotide sequence analysis and with the determination of DNA cleavage sites of restriction endonucleases. The results are given of the mapping of the HpaII, HaeIII, AluI, HinfI and TaqI cleavage sites in the HpaI-E fragment of adenovirus DNA and of the determination of the nucleotide sequence in the transforming region of adenovirus type 5 DNA. The results of the sequence determination of the Ad5 HindIII-G fragment are discussed in relation to the investigation on the transforming proteins isolated from in vitro and in vivo synthesizing systems. Labelling procedures of DNA are described including the exonuclease III/DNA polymerase I method and T4 polynucleotide kinase labelling of DNA fragments. (Auth.)

  7. Cloning, sequencing and expression of cDNA encoding growth hormone from Indian catfish (Heteropneustes fossilis)

    Indian Academy of Sciences (India)

    Vikas Anathy; Thayanithy Venugopal; Ramanathan Koteeswaran; Thavamani J Pandian; Sinnakaruppan Mathavan

    2001-09-01

    A tissue-specific cDNA library was constructed using polyA+ RNA from pituitary glands of the Indian catfish Heteropneustes fossilis (Bloch) and a cDNA clone encoding growth hormone (GH) was isolated. Using polymerase chain reaction (PCR) primers representing the conserved regions of fish GH sequences the 3′ region of catfish GH cDNA (540 bp) was cloned by random amplification of cDNA ends and the clone was used as a probe to isolate recombinant phages carrying the full-length cDNA sequence. The full-length cDNA clone is 1132 bp in length, coding for an open reading frame (ORF) of 603 bp; the reading frame encodes a putative polypeptide of 200 amino acids including the signal sequence of 22 amino acids. The 5′ and 3′ untranslated regions of the cDNA are 58 bp and 456 bp long, respectively. The predicted amino acid sequence of H. fossilis GH shared 98% homology with other catfishes. Mature GH protein was efficiently expressed in bacterial and zebrafish systems using appropriate expression vectors. The successful expression of the cloned GH cDNA of catfish confirms the functional viability of the clone.

  8. On the sequence selective bis-intercalation of a homodimeric thiazole orange dye in DNA

    DEFF Research Database (Denmark)

    Bunkenborg, Jakob; Stidsen, M M; Jacobsen, J P

    1998-01-01

    The thiazole orange dye 1,1'-(4,4,8,8-tetramethyl-4, 8-diazaundecamethylene)-bis-4-[(3-methyl-2,3-dihydro(benzo-1, 3-thiazolyl)-2-methylidene]quinolinium tetraiodide (TOTO) binds sequence selectively to double-stranded DNA (dsDNA) by bis-intercalation. Each chromophore is sandwiched between two...

  9. DNA sequence and structure recognition by Fe(II)·bleomycin

    Energy Technology Data Exchange (ETDEWEB)

    Kane, S.A.

    1993-01-01

    The bleomycins (BLMs) are a family of clinically-important antitumor antibiotics whose chemotherapeutic effects are believed to be expressed at the level of DNA degradation. Bleomycin-mediated DNA strand scission is sequence-selective, resulting in cleavage predominantly at 5′...

  10. Mitochondrial DNA sequence variation in Finnish patients with matrilineal diabetes mellitus

    Directory of Open Access Journals (Sweden)

    Soini Heidi K

    2012-07-01

    Background: The genetic background of type 2 diabetes is complex, involving contributions by both nuclear and mitochondrial genes. There is an excess of maternal inheritance in patients with type 2 diabetes and, furthermore, diabetes is a common symptom in patients with mutations in mitochondrial DNA (mtDNA). Polymorphisms in mtDNA have been reported to act as risk factors in several complex diseases. Findings: We examined the nucleotide variation in complete mtDNA sequences of 64 Finnish patients with matrilineal diabetes. We used conformation sensitive gel electrophoresis and sequencing to detect sequence variation. We analysed the pathogenic potential of nonsynonymous variants detected in the sequences and examined the role of the m.16189 T>C variant. Controls consisted of non-diabetic subjects ascertained in the same population. The frequency of mtDNA haplogroup V was 3-fold higher in patients with diabetes. Patients harboured many nonsynonymous mtDNA substitutions that were predicted to be possibly or probably damaging. Furthermore, a novel m.13762 T>G in MTND5 leading to p.Ser476Ala and several rare mtDNA variants were found. Haplogroup H1b harbouring m.16189 T>C and m.3010 G>A was found to be more frequent in patients with diabetes than in controls. Conclusions: Mildly deleterious nonsynonymous mtDNA variants and rare population-specific haplotypes constitute genetic risk factors for maternally inherited diabetes.

  11. Sequence-selective DNA binding with cell-permeable oligoguanidinium-peptide conjugates.

    Science.gov (United States)

    Mosquera, Jesús; Sánchez, Mateo I; Valero, Julián; de Mendoza, Javier; Vázquez, M Eugenio; Mascareñas, José L

    2015-03-21

    Conjugation of a short peptide fragment from a bZIP protein to an oligoguanidinium tail results in a DNA-binding miniprotein that selectively interacts with composite sequences containing the peptide-binding site next to an A/T-rich tract. In addition to stabilizing the complex with the target DNA, the oligoguanidinium unit also endows the conjugate with cell internalization properties.

  12. Cloning, sequencing and expression of cDNA encoding growth hormone from Indian catfish (Heteropneustes fossilis)

    Indian Academy of Sciences (India)

    Vikas Anathy; Thayanithy Venugopal; Ramanathan Koteeswaran; Thavamani J Pandian; Sinnakaruppan Mathavan

    2013-03-01

    A tissue-specific cDNA library was constructed using polyA+ RNA from pituitary glands of the Indian catfish Heteropneustes fossilis (Bloch) and a cDNA clone encoding growth hormone (GH) was isolated. Using polymerase chain reaction (PCR) primers representing the conserved regions of fish GH sequences the 3′ region of catfish GH cDNA (540 bp) was cloned by random amplification of cDNA ends and the clone was used as a probe to isolate recombinant phages carrying the full-length cDNA sequence. The full-length cDNA clone is 1132 bp in length, coding for an open reading frame (ORF) of 603 bp; the reading frame encodes a putative polypeptide of 200 amino acids including the signal sequence of 22 amino acids. The 5′ and 3′ untranslated regions of the cDNA are 58 bp and 456 bp long, respectively. The predicted amino acid sequence of H. fossilis GH shared 98% homology with other catfishes. Mature GH protein was efficiently expressed in bacterial and zebrafish systems using appropriate expression vectors. The successful expression of the cloned GH cDNA of catfish confirms the functional viability of the clone.

  13. DNA stretching and optimization of nucleobase recognition in enzymatic nanopore sequencing

    NARCIS (Netherlands)

    Stoddart, David; Franceschini, Lorenzo; Heron, Andrew; Bayley, Hagan; Maglia, Giovanni

    2015-01-01

    In nanopore sequencing, where single DNA strands are electrophoretically translocated through a nanopore and the resulting ionic signal is used to identify the four DNA bases, an enzyme has been used to ratchet the nucleic acid stepwise through the pore at a controlled speed. In this work, we invest

  14. High Interlaboratory Reproducibility of DNA Sequence-based Typing of Bacteria in a Multicenter Study

    DEFF Research Database (Denmark)

    Sousa, MA de; Boye, Kit; Lencastre, H de;

    2006-01-01

    Current DNA amplification-based typing methods for bacterial pathogens often lack interlaboratory reproducibility. In this international study, DNA sequence-based typing of the Staphylococcus aureus protein A gene (spa, 110 to 422 bp) showed 100% intra- and interlaboratory reproducibility without...

  15. Protocols for 16S rDNA Array Analyses of Microbial Communities by Sequence-Specific Labeling of DNA Probes

    Directory of Open Access Journals (Sweden)

    Knut Rudi

    2003-01-01

    Analyses of complex microbial communities are becoming increasingly important. Bottlenecks in these analyses, however, are the tools to actually describe the biodiversity. Novel protocols for DNA array-based analyses of microbial communities are presented. In these protocols, the specificity obtained by sequence-specific labeling of DNA probes is combined with the possibility of detecting several different probes simultaneously by DNA array hybridization. The gene encoding 16S ribosomal RNA was chosen as the target in these analyses. This gene contains both universally conserved regions and regions with relatively high variability. The universally conserved regions are used for PCR amplification primers, while the variable regions are used for the specific probes. Protocols are presented for DNA purification, probe construction, probe labeling, and DNA array hybridizations.

  16. Analysis of domestic dog mitochondrial DNA sequence variation for forensic investigations

    OpenAIRE

    Angleby, Helen

    2005-01-01

    The first method for DNA analysis in forensics was presented in 1985. Since then, the introduction of the polymerase chain reaction (PCR) has rendered possible the analysis of small amounts of DNA and automated sequencing and fragment analysis techniques have facilitated the analyses. In most cases short tandemly repeated regions (STRs) of nuclear DNA are analysed in forensic investigations, but all samples cannot be successfully analysed using this method. For samples containing minute amoun...

  17. Digital Droplet Multiple Displacement Amplification (ddMDA) for Whole Genome Sequencing of Limited DNA Samples

    OpenAIRE

    Minsoung Rhee; Yooli K Light; Meagher, Robert J.; Anup K. Singh

    2016-01-01

    Multiple displacement amplification (MDA) is a widely used technique for amplification of DNA from samples containing limited amounts of DNA (e.g., uncultivable microbes or clinical samples) before whole genome sequencing. Despite its advantages of high yield and fidelity, it suffers from high amplification bias and non-specific amplification when amplifying sub-nanogram of template DNA. Here, we present a microfluidic digital droplet MDA (ddMDA) technique where partitioning of the template D...

  18. Sequence analysis of the ribosomal DNA ITS2 region in two Trichogramma species (Hymenoptera: Trichogrammatidae)

    Directory of Open Access Journals (Sweden)

    Ercan Sumer Fahriye

    2011-01-01

    Two egg parasitoid wasps, Trichogramma euproctidis (Girault) and Trichogramma brassicae (Bezdenko) (Hymenoptera: Trichogrammatidae), were identified in the study. The taxonomy of these wasps is problematic because of their small size and lack of distinguishable morphological characters. The DNA sequence variation from the internal transcribed spacer 2 (ITS2) region of nuclear ribosomal DNA (rDNA) was analyzed from these two Trichogramma species. This technique provides quick, simple and reliable molecular identification of Trichogramma species.

  19. Sequence-specific nucleic acid mobility using a reversible block copolymer gel matrix and DNA amphiphiles (lipid-DNA) in capillary and microfluidic electrophoretic separations

    NARCIS (Netherlands)

    Wagler, Patrick; Minero, Gabriel Antonio S.; Tangen, Uwe; de Vries, Jan Willem; Prusty, Deepak; Kwak, Minseok; Herrmann, Andreas; McCaskill, John S.

    2015-01-01

    Reversible noncovalent but sequence-dependent attachment of DNA to gels is shown to allow programmable mobility processing of DNA populations. The covalent attachment of DNA oligomers to polyacrylamide gels using acrydite-modified oligonucleotides has enabled sequence-specific mobility assays for DN

  20. Ray Wu as Fifth Business: Deconstructing collective memory in the history of DNA sequencing.

    Science.gov (United States)

    Onaga, Lisa A

    2014-06-01

    The concept of 'Fifth Business' is used to analyze a minority standpoint and bring serious attention to the role of scientists who play a galvanizing role in a science but for multiple reasons appear less prominently in more common recounts of any particular development. Biochemist Ray Wu (1928-2008) published a DNA sequencing experiment in March 1970 using DNA polymerase catalysis and specific nucleotide labeling, both of which are foundational to general sequencing methods today. The scant mention of Wu's work from textbooks, research articles, and other accounts of DNA sequencing calls into question how scientific collective memory forms. This alternative history seeks to understand why a key figure in nucleic acid sequence analysis has remained less visibly connected or peripheral to solidifying narratives about the history of DNA sequencing. The study resists predictable dismissals of Wu's work in order to seriously examine the formation of his nucleic acid sequence analysis research program and how he shared his knowledge of sequencing during a period of rapid advancement in the field. An analysis of Wu's work on sequencing the cohesive ends of lambda bacteriophage in the 1960s and 1970s exemplifies how a variety of individuals and groups attempted to develop protocol for sequencing the order of nucleotide base pairs comprising DNA. This historical examination of the sociality of scientific research suggests a way to understand how Wu and others contributed to the very collective memory of DNA sequencing that Wu eventually tried to repair. The study of Wu, who was a Chinese immigrant to the United States, provides a foundation for further critical scholarship on the heterogeneous histories of Asian American bioscientists, the sociality of their scientific works, and how the resulting knowledge produced is preserved, if not evenly, in a scientific field's collective memory.

  1. Sequence-specific nucleic acid mobility using a reversible block copolymer gel matrix and DNA amphiphiles (lipid-DNA) in capillary and microfluidic electrophoretic separations.

    Science.gov (United States)

    Wagler, Patrick; Minero, Gabriel Antonio S; Tangen, Uwe; de Vries, Jan Willem; Prusty, Deepak; Kwak, Minseok; Herrmann, Andreas; McCaskill, John S

    2015-10-01

    Reversible noncovalent but sequence-dependent attachment of DNA to gels is shown to allow programmable mobility processing of DNA populations. The covalent attachment of DNA oligomers to polyacrylamide gels using acrydite-modified oligonucleotides has enabled sequence-specific mobility assays for DNA in gel electrophoresis: sequences binding to the immobilized DNA are delayed in their migration. Such a system has been used for example to construct complex DNA filters facilitating DNA computations. However, these gels are formed irreversibly and the choice of immobilized sequences is made once off during fabrication. In this work, we demonstrate the reversible self-assembly of gels combined with amphiphilic DNA molecules, which exhibit hydrophobic hydrocarbon chains attached to the nucleobase. This amphiphilic DNA, which we term lipid-DNA, is synthesized in advance and is blended into a block copolymer gel to induce sequence-dependent DNA retention during electrophoresis. Furthermore, we demonstrate and characterize the programmable mobility shift of matching DNA in such reversible gels both in thin films and microchannels using microelectrode arrays. Such sequence selective separation may be employed to select nucleic acid sequences of similar length from a mixture via local electronics, a basic functionality that can be employed in novel electronic chemical cell designs and other DNA information-processing systems. PMID:26095642

  2. Budding yeast cDNA sequencing project: S03052-76_F01 [Budding yeast cDNA sequencing project]

    Lifescience Database Archive (English)

    Full Text Available EST - Link to UCSC Genome Browser - Sequence >S03052-76_F01.phd NNNNNNNNNNNNNNNNNNNNNNNNNTNTAAAANNNNGANNNGANNNGTGGNTNTNTNTNT TNT...ANTTTNAANAAANAACNNNCCCTNNNNCNCNNNNNNNGAGNAAAAANNGGGTNTNNT NTTTTNNTNNTNTNTNNNNCNNN Qualit

  3. Octopus: A Secure and Anonymous DHT Lookup

    CERN Document Server

    Wang, Qiyan

    2012-01-01

    Distributed Hash Table (DHT) lookup is a core technique in structured peer-to-peer (P2P) networks. Its decentralized nature introduces security and privacy vulnerabilities for applications built on top of them; we thus set out to design a lookup mechanism achieving both security and anonymity, heretofore an open problem. We present Octopus, a novel DHT lookup which provides strong guarantees for both security and anonymity. Octopus uses attacker identification mechanisms to discover and remove malicious nodes, severely limiting an adversary's ability to carry out active attacks, and splits lookup queries over separate anonymous paths and introduces dummy queries to achieve high levels of anonymity. We analyze the security of Octopus by developing an event-based simulator to show that the attacker discovery mechanisms can rapidly identify malicious nodes with low error rate. We calculate the anonymity of Octopus using probabilistic modeling and show that Octopus can achieve near-optimal anonymity. We evaluate ...

  4. Sequence-specific DNA breaks produced by triplex-directed decay of iodine-125

    International Nuclear Information System (INIS)

    Triplex forming oligonucleotides (TFO) labeled with Auger emitters could be ideal vehicles to deliver radioactive-decay energy to specific DNA sequences, causing DNA breaks and, subsequently, inactivation of these sequences. To demonstrate this approach we labeled with 125I (two 125I per molecule on average) a purine-rich 38-mer which forms a stable triplex with a polypurine x polypyrimidine stretch in the human HPRT gene. Decay of 125I in the bound TFO was shown to cause sequence-specific double strand breaks (DSB) in the target HPRT sequence cloned into plasmid DNA. No sequence-specific breaks were observed if 125I-labeled TFO were not bound to the plasmid DNA. After 60 days of decay accumulation (one 125I half-life) approximately a quarter of all plasmid molecules contained sequence-specific DSB, corresponding to 0.3 site-specific DSB per decay. Sequencing gel analysis shows that the DNA breaks are distributed within a few bases of the maxima at those bases opposite to the positions of 125I in the TFO. (orig.)

  5. Sequence-specific DNA breaks produced by triplex-directed decay of iodine-125

    Energy Technology Data Exchange (ETDEWEB)

    Panyutin, I.G. [National Institutes of Health, Bethesda, MD (United States). Dept. of Nuclear Medicine; Neumann, R.D. [National Institutes of Health, Bethesda, MD (United States). Dept. of Nuclear Medicine

    1996-12-31

    Triplex forming oligonucleotides (TFO) labeled with Auger emitters could be ideal vehicles to deliver radioactive-decay energy to specific DNA sequences, causing DNA breaks and, subsequently, inactivation of these sequences. To demonstrate this approach we labeled with 125I (two 125I per molecule on average) a purine-rich 38-mer which forms a stable triplex with a polypurine x polypyrimidine stretch in the human HPRT gene. Decay of 125I in the bound TFO was shown to cause sequence-specific double strand breaks (DSB) in the target HPRT sequence cloned into plasmid DNA. No sequence-specific breaks were observed if 125I-labeled TFO were not bound to the plasmid DNA. After 60 days of decay accumulation (one 125I half-life) approximately a quarter of all plasmid molecules contained sequence-specific DSB, corresponding to 0.3 site-specific DSB per decay. Sequencing gel analysis shows that the DNA breaks are distributed within a few bases of the maxima at those bases opposite to the positions of 125I in the TFO. (orig.).
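    The figures quoted in this record can be checked against each other with a short calculation. The sketch below is a consistency check under stated assumptions, not the authors' analysis: it assumes exactly one bound oligonucleotide per plasmid and Poisson-distributed break counts, neither of which is stated in the abstract. The constants are taken from the abstract; the function names are hypothetical.

```python
import math

HALF_LIFE_DAYS = 60.0   # the abstract equates 60 days with one 125I half-life
ATOMS_PER_TFO = 2.0     # average 125I atoms per oligonucleotide (from the abstract)
DSB_PER_DECAY = 0.3     # site-specific DSB yield per decay (from the abstract)


def decayed_fraction(days, half_life=HALF_LIFE_DAYS):
    """Fraction of the initial 125I atoms that have decayed after `days`."""
    return 1.0 - 0.5 ** (days / half_life)


def fraction_with_dsb(days):
    """Fraction of plasmids carrying at least one site-specific DSB.

    Assumption (not from the abstract): one bound TFO per plasmid and
    Poisson-distributed numbers of breaks per plasmid.
    """
    mean_dsb = ATOMS_PER_TFO * decayed_fraction(days) * DSB_PER_DECAY
    return 1.0 - math.exp(-mean_dsb)


frac = fraction_with_dsb(60.0)
print(f"fraction of plasmids with a DSB after 60 days: {frac:.3f}")
```

    Under these assumptions one decay per plasmid is expected after one half-life, and the resulting fraction is close to the "approximately a quarter" reported in the abstract.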

  6. The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.

    Science.gov (United States)

    Murray, Vincent; Chen, Jon K; Tanaka, Mark M

    2016-07-01

    The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'-TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
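    The hexanucleotide consensus 5'-RTGT*AY reported in this record can be located in a sequence with a simple pattern search. The sketch below is only an illustration of scanning for that motif, not the four analysis methods used in the study; the function name and the test sequence are hypothetical, and only the given strand is scanned (the reverse complement is not considered).

```python
import re

# 5'-RTGT*AY: R = purine (A or G), Y = pyrimidine (C or T).
# The lookahead wrapper lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=([AG]TGTA[CT]))")


def find_motif_sites(seq):
    """Return (motif_start, starred_base_index, hexamer) tuples, 0-based.

    The starred base is the T marked with an asterisk in 5'-RTGT*AY,
    i.e. the fourth base of the hexamer.
    """
    seq = seq.upper()
    return [(m.start(), m.start() + 3, m.group(1)) for m in MOTIF.finditer(seq)]


# Hypothetical example sequence, not data from the study.
for start, starred, hexamer in find_motif_sites("ccATGTACggGTGTATtt"):
    print(start, starred, hexamer)
```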

  7. Nanopores in suspended WS2 membranes for DNA sequencing

    Science.gov (United States)

    Danda, Gopinath; Masih Das, Paul; Chou, Yung-Chien; Mlack, Jerome; Naylor, Carl; Perea-Lopez, Nestor; Lin, Zhong; Fulton, Laura Beth; Terrones, Mauricio; Johnson, A. T. Charlie; Drndic, Marija

    Recent advances in solid-state nanopore sensor systems for DNA detection and analysis have been supported by using increasingly thinner materials to the point of utilizing atomically thin two-dimensional materials such as graphene and MoS2. However, these materials still have issues with pore wettability and signal-to-noise ratios displayed in DNA translocation measurements. Recently, the fabrication and operation of nanopores in MoS2 have been demonstrated, but the wetting properties and signal-to-noise ratios of transition metal dichalcogenides are yet to be understood and further improved. Here we fabricate suspended WS2 nanopore devices with sub-10 nm pore diameters using a novel nanomaterial transfer method and TEM nanosculpting to study and better understand nanopore wetting properties and performance in DNA translocation measurements.

  8. Sequence-specific DNA purification by triplex affinity capture

    Energy Technology Data Exchange (ETDEWEB)

    Ito, Takashi; Smith, C.L.; Cantor, C.R. (Lawrence Berkeley Lab., CA (United States))

    1992-01-15

    A DNA isolation procedure was developed by using triple-helix formation and magnetic separation. In this procedure, target DNA is captured by a biotinylated oligonucleotide via intermolecular triplex formation, bound to streptavidin-coated magnetic beads, and recovered in double-stranded form by elution with a mild alkaline buffer that destabilizes the triple helix. The effectiveness of the procedure was demonstrated by a model experiment with an artificially reconstructed library and, also, by the isolation of (dT-dC)n·(dG-dA)n dinucleotide repeats from a human genomic library. This procedure provides a prototype for other triplex-mediated DNA isolation technologies.

  9. Privacy-Preserving Updates to Anonymous Databases

    OpenAIRE

    Sivasubramanian, R.; K.P. KALIYAMURTHIE

    2013-01-01

    Suppose a medical facility is connected with a research institution, and the researchers can use the medical details of a patient without knowing the personal details. The research database used by the researchers must therefore be anonymized (sanitized). We can consider another problem in the area of census. Individuals give their private information to a trusted party (Census Bureau), and the census bureau must publish an anonymized or sanitized version of the data. So anonymization is done for privacy. Our work...

  10. Distinguishing authentic mitochondrial and plastid DNAs from similar DNA sequences in the nucleus using the polymerase chain reaction.

    Science.gov (United States)

    Kumar, Rachana A; Bendich, Arnold J

    2011-08-01

    DNA sequences similar to those in the organellar genomes are also found in the nucleus. These non-coding sequences may be co-amplified by PCR with the authentic organellar DNA sequences, leading to erroneous conclusions. To avoid this problem, we describe an experimental procedure to prevent amplification of this "promiscuous" DNA when total tissue DNA is used with PCR. First, primers are designed for organelle-specific sequences using a bioinformatics method. These primers are then tested using methylation-sensitive PCR. The method is demonstrated for both end-point and real-time PCR with Zea mays, where most of the DNA sequences in the organellar genomes are also present in the nucleus. We use this procedure to quantify those nuclear DNA sequences that are near-perfect replicas of organellar DNA. This method should be useful for applications including phylogenetic analysis, organellar DNA quantification and clinical testing.

  11. DNA sequence and analysis of human chromosome 18.

    Science.gov (United States)

    Nusbaum, Chad; Zody, Michael C; Borowsky, Mark L; Kamal, Michael; Kodira, Chinnappa D; Taylor, Todd D; Whittaker, Charles A; Chang, Jean L; Cuomo, Christina A; Dewar, Ken; FitzGerald, Michael G; Yang, Xiaoping; Abouelleil, Amr; Allen, Nicole R; Anderson, Scott; Bloom, Toby; Bugalter, Boris; Butler, Jonathan; Cook, April; DeCaprio, David; Engels, Reinhard; Garber, Manuel; Gnirke, Andreas; Hafez, Nabil; Hall, Jennifer L; Norman, Catherine Hosage; Itoh, Takehiko; Jaffe, David B; Kuroki, Yoko; Lehoczky, Jessica; Lui, Annie; Macdonald, Pendexter; Mauceli, Evan; Mikkelsen, Tarjei S; Naylor, Jerome W; Nicol, Robert; Nguyen, Cindy; Noguchi, Hideki; O'Leary, Sinéad B; O'Neill, Keith; Piqani, Bruno; Smith, Cherylyn L; Talamas, Jessica A; Topham, Kerri; Totoki, Yasushi; Toyoda, Atsushi; Wain, Hester M; Young, Sarah K; Zeng, Qiandong; Zimmer, Andrew R; Fujiyama, Asao; Hattori, Masahira; Birren, Bruce W; Sakaki, Yoshiyuki; Lander, Eric S

    2005-09-22

    Chromosome 18 appears to have the lowest gene density of any human chromosome and is one of only three chromosomes for which trisomic individuals survive to term. There are also a number of genetic disorders stemming from chromosome 18 trisomy and aneuploidy. Here we report the finished sequence and gene annotation of human chromosome 18, which will allow a better understanding of the normal and disease biology of this chromosome. Despite the low density of protein-coding genes on chromosome 18, we find that the proportion of non-protein-coding sequences evolutionarily conserved among mammals is close to the genome-wide average. Extending this analysis to the entire human genome, we find that the density of conserved non-protein-coding sequences is largely uncorrelated with gene density. This has important implications for the nature and roles of non-protein-coding sequence elements. PMID:16177791

  12. DNA sequencing leads to genomics progress in China

    Institute of Scientific and Technical Information of China (English)

    WU JiaYan; XIAO JingFa; ZHANG RuoSi; YU Jun

    2011-01-01

    1 Science in the large-scale sequencing era Ten years ago, the first draft sequence assembly of the human genome was completed [1], bringing biomedical research one step closer toward the goal of revolutionizing diagnosis, prevention, and treatment of human diseases. Recently, journalists from the journal Nature surveyed more than 1000 life scientists regarding this laudable aim [2], obtaining substantially negative responses [3]. However, almost all of those surveyed had been influenced, in one way or another, by the availability of the human genome sequence, and they also agreed with the notion that the "sequence is the start." The complexity of genome biology and almost every aspect of human biology is far greater than previously thought [4].

  13. Complete mitochondrial DNA sequence of the Qianshao spotted pig.

    Science.gov (United States)

    Xu, Dong; Chai, Yu-Lan; Jiang, Juan; He, Chang-Qing; Ma, Hai-Ming

    2015-01-01

    The complete mitochondrial genome sequence of Qianshao spotted pig was first determined in this study. The mitogenome (16,700 bp) consists of 22 tRNA genes, 2 ribosomal RNA genes, 13 protein-coding genes and 1 control region (D-loop region). The complete mitochondrial genome sequence of the Qianshao spotted pig enriches data resource for further study in genetic mechanism.

  14. Algorithm of detecting structural variations in DNA sequences

    Science.gov (United States)

    Nałecz-Charkiewicz, Katarzyna; Nowak, Robert

    2014-11-01

    Whole genome sequencing makes it possible to use the longest common subsequence algorithm to detect genetic structural variations. We propose searching for the positions of short unique fragments (genetic markers) to achieve acceptable time and space complexity. The markers are generated by algorithms that search the genetic sequence or its Fourier transform. The presented methods are checked on structural variations generated in silico on bacterial genomes, giving comparable or better results than other solutions.
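    For reference, the longest common subsequence problem named in this record has a textbook dynamic-programming solution, sketched below. This is the generic algorithm on plain strings, not the authors' marker-based method; its quadratic time and memory use on raw sequences is the kind of cost that a reduction to a much shorter list of markers would be meant to avoid. The function name and inputs are hypothetical.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of strings a and b.

    Classic dynamic programming: O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    # table[i][j] = LCS length of a[:i] and b[:j]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Trace back from the bottom-right corner to recover the subsequence.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


# Hypothetical toy sequences differing by a single substitution.
print(longest_common_subsequence("ACGGTAC", "ACGTTAC"))
```

    Positions that fall outside the common subsequence are the candidates for insertions, deletions or rearrangements between the two inputs.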

  15. Oxidation by DNA Charge Transport Damages Conserved Sequence Block II, a Regulatory Element in Mitochondrial DNA

    OpenAIRE

    Merino, Edward J.; Barton, Jacqueline K.

    2007-01-01

    Sites of oxidative damage in mitochondrial DNA have been identified on the basis of DNA-mediated charge transport. Our goal is to understand which sites in mitochondrial DNA are prone to oxidation at long range and whether such oxidative damage correlates with cancerous transformation. Here we show that a primer extension reaction can be used to monitor directly oxidative damage to authentic mitochondrial DNA through photoreactions with a rhodium intercalator. The complex [Rh(phi)_2bpy]Cl_3 (...

  16. Genetic alterations of hepatocellular carcinoma by random amplified polymorphic DNA analysis and cloning sequencing of tumor differential DNA fragment

    Institute of Scientific and Technical Information of China (English)

    Zhi-Hong Xian; Wen-Ming Cong; Shu-Hui Zhang; Meng-Chao Wu

    2005-01-01

    AIM: To study the genetic alterations and their association with clinicopathological characteristics of hepatocellular carcinoma (HCC), and to find the tumor related DNA fragments. METHODS: DNA isolated from tumors and corresponding noncancerous liver tissues of 56 HCC patients was amplified by random amplified polymorphic DNA (RAPD) with 10 random 10-mer arbitrary primers. The RAPD bands showing obvious differences in tumor tissue DNA corresponding to that of normal tissue were separated, purified, cloned and sequenced. DNA sequences were analyzed and compared with GenBank data. RESULTS: A total of 56 cases of HCC were demonstrated to have genetic alterations, which were detected by at least one primer. The detectability of genetic alterations ranged from 20% to 70% in each case, and 17.9% to 50% in each primer. Serum HBV infection, tumor size, histological grade, tumor capsule, as well as tumor intrahepatic metastasis, might be correlated with genetic alterations on certain primers. A band with a higher intensity of 480 bp or so amplified fragments in tumor DNA relative to normal DNA could be seen in 27 of 56 tumor samples using primer 4. Sequence analysis of these fragments showed 91% homology with Homo sapiens double homeobox protein DUX10 gene. CONCLUSION: Genetic alterations are a frequent event in HCC, and tumor related DNA fragments have been found in this study, which may be associated with hepatocarcinogenesis. RAPD is an effective method for the identification and analysis of genetic alterations in HCC, and may provide new information for further evaluating the molecular mechanism of hepatocarcinogenesis.

  17. Noninvasive prenatal diagnosis of fetal trisomy 18 and trisomy 13 by maternal plasma DNA sequencing.

    NARCIS (Netherlands)

    Chen, E.Z.; Chiu, R.W.; Sun, H.; Akolekar, R.; Chan, K.C.; Leung, T.Y.; Jiang, P.; Zheng, Y.W.; Lun, F.M.; Chan, L.Y.; Jin, Y.; Go, A.T.; Lau, E.T; To, W.W.; Leung, W.C.; Tang, R.Y.; Au-Yeung, S.K.; Lam, H.; Kung, Y.Y.; Zhang, X.; Vugt, J.M.G. van; Minekawa, R.; Tang, M.H.; Wang, J.; Oudejans, C.B.; Lau, T.K.; Nicolaides, K.H.; Lo, Y.M.

    2011-01-01

    Massively parallel sequencing of DNA molecules in the plasma of pregnant women has been shown to allow accurate and noninvasive prenatal detection of fetal trisomy 21. However, whether the sequencing approach is as accurate for the noninvasive prenatal diagnosis of trisomy 13 and 18 is unclear due t

  18. Discovery and genotyping of existing and induced DNA sequence variation in potato

    NARCIS (Netherlands)

    Uitdewilligen, J.G.A.M.L.

    2012-01-01

    In this thesis natural and induced DNA sequence diversity in potato (Solanum tuberosum) for use in marker-trait analysis and potato breeding is assessed. The study addresses the challenges of reliable, high-throughput identification and genotyping of sequence variants in existing tetraploid potato c

  19. An intragenic distribution bias of DNA uptake sequences in Pasteurellaceae and Neisseriae

    NARCIS (Netherlands)

    Passel, van M.W.J.

    2008-01-01

    Most sequenced strains from Pasteurellaceae and Neisseriae contain hundreds to thousands of uptake sequence (US) motifs in their genome, which are associated with natural competence for DNA uptake. The mechanism of their recognition is still unclear, and I searched for intragenic location patterns o

  20. A likelihood ratio test for species membership based on DNA sequence data

    DEFF Research Database (Denmark)

    Matz, Mikhail V.; Nielsen, Rasmus

    2005-01-01

    sequence is a member of an a priori specified species. We investigate the performance of the test using coalescence simulations, as well as using the real data from butterflies and frogs representing two kinds of challenge for DNA barcoding: extremely low and extremely high levels of sequence variability....

  1. Supported PCR : an efficient procedure to amplify sequences flanking a known DNA segment

    NARCIS (Netherlands)

    Rudenko, George N.; Rommens, Caius M.T.; Nijkamp, H. John J.; Hille, Jacques

    1993-01-01

    We describe a novel modification of the polymerase chain reaction for efficient in vitro amplification of genomic DNA sequences flanking short stretches of known sequence. The technique utilizes a target enrichment step, based on the selective isolation of biotinylated fragments from the bulk of gen

  2. Practical anonymity hiding in plain sight online

    CERN Document Server

    Loshin, Peter

    2013-01-01

    For those with legitimate reason to use the Internet anonymously--diplomats, military and other government agencies, journalists, political activists, IT professionals, law enforcement personnel, political refugees and others--anonymous networking provides an invaluable tool, and there are many good reasons why anonymity can serve a very important purpose. Anonymous use of the Internet is made difficult by the many websites that know everything about us, by the cookies and ad networks, and by IP-logging ISPs; even nosy officials may get involved. It is no longer possible to turn off browser cookies to be l

  3. Cell-free DNA next-generation sequencing in pancreatobiliary carcinomas

    OpenAIRE

    Zill, Oliver A.; Greene, Claire; Sebisanovic, Dragan; Siew, LaiMun; Leng, Jim; Vu, Mary; Hendifar, Andrew E.; Wang, Zhen; Atreya, Chloe E.; Kelley, Robin K.; Van Loon, Katherine; Ko, Andrew H.; Tempero, Margaret A.; Bivona, Trever G; Munster, Pamela N.

    2015-01-01

    Patients with pancreatic and biliary carcinomas lack personalized treatment options, in part because biopsies are often inadequate for molecular characterization. Cell-free DNA (cfDNA) sequencing may enable a precision oncology approach in this setting. We attempted to prospectively analyze 54 genes in tumor and cfDNA for 26 patients. Tumor sequencing failed in nine patients (35%). In the remaining 17, 90.3% (95% CI: 73.1–97.5%) of mutations detected in tumor biopsies were also detected in cf...

  4. Strong physical constraints on sequence-specific target location by proteins on DNA molecules

    DEFF Research Database (Denmark)

    Flyvbjerg, H.; Keatch, S.A.; Dryden, D.T.F

    2006-01-01

    Sequence-specific binding to DNA in the presence of competing non-sequence-specific ligands is a problem faced by proteins in all organisms. It is akin to the problem of parking a truck at a loading bay by the side of a road in the presence of cars parked at random along the road. Cars even...... required for function rather than the more commonly measured physical footprint. Assaying the complex type I restriction enzyme, EcoKI, gives an activity footprint of ~66 bp for ATP hydrolysis and 300 bp for the DNA cleavage function which is intimately linked with translocation of DNA by Eco...

  5. Complete DNA sequence of the linear mitochondrial genome of the pathogenic yeast Candida parapsilosis

    DEFF Research Database (Denmark)

    Nosek, J.; Novotna, M.; Hlavatovicova, Z.;

    2004-01-01

    The complete sequence of the mitochondrial DNA of the opportunistic yeast pathogen Candida parapsilosis was determined. The mitochondrial genome is represented by linear DNA molecules terminating with tandem repeats of a 738-bp unit. The number of repeats varies, thus generating a population...... of linear DNA molecules that are heterogeneous in size. The length of the shortest molecules is 30,922 bp, whereas the longer molecules have expanded terminal tandem arrays (n x 738 bp). The mitochondrial genome is highly compact, with less than 8% of the sequence corresponding to non-coding intergenic...

  6. Recharacterization of ancient DNA miscoding lesions: insights in the era of sequencing-by-synthesis

    DEFF Research Database (Denmark)

    Gilbert, M Thomas P; Binladen, Jonas; Miller, Webb;

    2007-01-01

    Although ancient DNA (aDNA) miscoding lesions have been studied since the earliest days of the field, their nature remains a source of debate. A variety of conflicting hypotheses exist about which miscoding lesions constitute true aDNA damage as opposed to PCR polymerase amplification error...... strand of origin of observed damage events. With the advent of emulsion-based clonal amplification (emPCR) and the sequencing-by-synthesis technology this has changed. In this paper we demonstrate how data produced on the Roche GS20 genome sequencer can determine miscoding lesion strands of origin...

  7. Introduction of restriction enzyme sites in protein-coding DNA sequences by site-specific mutagenesis not affecting the amino acid sequence: a computer program.

    OpenAIRE

    Arentzen, R; Ripka, W. C.

    1984-01-01

    Structure/function relationship studies of proteins are greatly facilitated by recombinant DNA technology which allows specific amino acid mutations to be made at the DNA sequence level by site-specific mutagenesis employing synthetic oligonucleotides. This technique has been successfully used to alter one or two amino acids in a protein. Replacement of existing DNA sequence coding for several amino acids with new synthetic DNA fragments would be facilitated by the presence of unique restrict...
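
    The idea in this abstract (placing a restriction site in a coding sequence without changing the encoded protein) can be illustrated with a brute-force search. The sketch below is not the authors' program; the function names, the single-site search strategy and the example sequence are assumptions made for illustration, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def silent_site_positions(cds, site):
    """Hypothetical helper: list every offset where `site` can be written
    into `cds` without altering the translated protein.

    Returns tuples of (offset, mutated_cds, number_of_base_changes).
    """
    protein = translate(cds)
    hits = []
    for i in range(len(cds) - len(site) + 1):
        if cds[i:i + len(site)] == site:
            continue  # the site is already present at this offset
        mutated = cds[:i] + site + cds[i + len(site):]
        if translate(mutated) == protein:
            changes = sum(a != b for a, b in zip(cds, mutated))
            hits.append((i, mutated, changes))
    return hits


# EcoRI (GAATTC) can replace GAGTTT here because both encode Glu-Phe.
print(silent_site_positions("ATGGAGTTTAAA", "GAATTC"))
```

    A practical tool would also check that the new site is unique in the construct and would rank candidates by the number of base changes; neither step is shown here.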

  8. A New Revised DNA Cramp Tool Based Approach of Chopping DNA Repetitive and Non-Repetitive Genome Sequences

    Directory of Open Access Journals (Sweden)

    V.Hari Prasad

    2012-11-01

    Tremendous amounts of genetic sequence data from living organisms are generated every day and accumulate in databases, whose size is growing exponentially. Because so many DNA sequences are stored in public databases such as NCBI, EMBL and DDBJ, archival maintenance is a tedious task. Transmitting the information from one place to another in network management systems is also a critical task. Compression is therefore needed to improve efficiency and to reduce database overhead. Various techniques have been developed for this purpose, but the results achieved are not satisfactory. Many classical algorithms fail to compress genetic sequences because of the specificity of the text encoded in DNA, and only a few of the existing techniques have achieved positive results. DNA is both repetitive and non-repetitive in nature. Our proposed technique, DNACRAMP, is applicable to repetitive and non-repetitive DNA sequences and yields a better compression ratio in terms of bits per base. It was compared with existing techniques, and we observed that ours is the optimum technique and that its compression results are on par with existing techniques.
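
    The "bits per base" measure used in this abstract is easiest to read against the naive fixed-width encoding, in which each of the four bases takes two bits. The sketch below shows only that baseline, not DNACRAMP itself (the abstract does not describe its algorithm); the function names are assumptions, and sequences are assumed to contain only A, C, G and T.

```python
# Two-bit codes for the four bases (an arbitrary but fixed assignment).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def pack(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte.

    Returns (number_of_bases, packed_bytes); the length is kept so that
    padding in the final byte can be discarded when unpacking.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for b in chunk:
            byte = (byte << 2) | CODE[b]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return len(seq), bytes(out)


def unpack(n, data):
    """Inverse of pack(): recover the first n bases from the packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASE[(byte >> shift) & 3])
    return "".join(bases[:n])


def bits_per_base(n, data):
    """Storage cost of the packed payload in bits per base."""
    return 8 * len(data) / n


n, packed = pack("ACGTACGTAC")
print(unpack(n, packed), bits_per_base(n, packed))
```

    For long sequences this baseline approaches 2 bits per base, so a DNA-specific compressor is generally of interest only when it goes below that figure.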

  9. Complete genome sequence of the mitochondrial DNA of the river lamprey, Lethenteron japonicum.

    Science.gov (United States)

    Kawai, Yuri L; Yura, Kei; Shindo, Miyuki; Kusakabe, Rie; Hayashi, Keiko; Hata, Kenichiro; Nakabayashi, Kazuhiko; Okamura, Kohji

    2015-01-01

    Lampreys are eel-like jawless fishes evolutionarily positioned between invertebrates and vertebrates, and have been used as model organisms to explore vertebrate evolution. In this study we determined the complete genome sequence of the mitochondrial DNA of the Japanese river lamprey, Lethenteron japonicum, using next-generation sequencers. The sequence was 16,272 bp in length. The gene content and order were identical to those of the sea lamprey, Petromyzon marinus, which has been the reference among lamprey species. However, the sequence similarity was less than 90%, suggesting the need for the whole-genome sequencing of L. japonicum.

  10. Characterization of human chromosomal DNA sequences which replicate autonomously in Saccharomyces cerevisiae.

    Science.gov (United States)

    Montiel, J F; Norbury, C J; Tuite, M F; Dobson, M J; Mills, J S; Kingsman, A J; Kingsman, S M

    1984-01-01

    We have characterised two restriction fragments, isolated from a "shotgun" collection of human DNA, which function as autonomously replicating sequences (ARSs) in Saccharomyces cerevisiae. Functional domains of these fragments have been defined by subcloning and exonuclease (BAL 31) deletion analysis. Both fragments contain two spatially distinct domains. One is essential for high frequency transformation and is termed the Replication Sequence (RS) domain, the other, termed the Replication Enhancer (RE) domain, has no inherent replication competence but is essential for ensuring maximum function of the RS domain. The nucleotide sequence of these domains reveals several conserved sequences one of which is strikingly similar to the yeast ARS consensus sequence. PMID:6320114

  11. Mixed-Sequence Recognition of Double-Stranded DNA Using Enzymatically Stable Phosphorothioate Invader Probes

    Directory of Open Access Journals (Sweden)

    Brooke A. Anderson

    2015-07-01

    Development of probes that allow for sequence-unrestricted recognition of double-stranded DNA (dsDNA) continues to attract much attention due to the prospect for molecular tools that enable detection, regulation, and manipulation of genes. We have recently introduced so-called Invader probes as alternatives to more established approaches such as triplex-forming oligonucleotides, peptide nucleic acids and polyamides. These short DNA duplexes are activated for dsDNA recognition by installment of +1 interstrand zippers of intercalator-functionalized nucleotides such as 2′-N-(pyren-1-yl)methyl-2′-N-methyl-2′-aminouridine and 2′-O-(pyren-1-yl)methyluridine, which results in violation of the nearest neighbor exclusion principle and duplex destabilization. The individual probe strands have high affinity toward complementary DNA strands, which generates the driving force for recognition of mixed-sequence dsDNA regions. In the present article, we characterize Invader probes that are based on phosphorothioate backbones (PS-DNA Invaders). The change from the regular phosphodiester backbone furnishes Invader probes that are much more stable to nucleolytic degradation, while displaying acceptable dsDNA-recognition efficiency. PS-DNA Invader probes therefore present themselves as interesting probes for dsDNA-targeting applications in cellular environments and living organisms.

  12. Mixed-Sequence Recognition of Double-Stranded DNA Using Enzymatically Stable Phosphorothioate Invader Probes.

    Science.gov (United States)

    Anderson, Brooke A; Karmakar, Saswata; Hrdlicka, Patrick J

    2015-01-01

    Development of probes that allow for sequence-unrestricted recognition of double-stranded DNA (dsDNA) continues to attract much attention due to the prospect for molecular tools that enable detection, regulation, and manipulation of genes. We have recently introduced so-called Invader probes as alternatives to more established approaches such as triplex-forming oligonucleotides, peptide nucleic acids and polyamides. These short DNA duplexes are activated for dsDNA recognition by installment of +1 interstrand zippers of intercalator-functionalized nucleotides such as 2'-N-(pyren-1-yl)methyl-2'-N-methyl-2'-aminouridine and 2'-O-(pyren-1-yl)methyluridine, which results in violation of the nearest neighbor exclusion principle and duplex destabilization. The individual probe strands have high affinity toward complementary DNA strands, which generates the driving force for recognition of mixed-sequence dsDNA regions. In the present article, we characterize Invader probes that are based on phosphorothioate backbones (PS-DNA Invaders). The change from the regular phosphodiester backbone furnishes Invader probes that are much more stable to nucleolytic degradation, while displaying acceptable dsDNA-recognition efficiency. PS-DNA Invader probes therefore present themselves as interesting probes for dsDNA-targeting applications in cellular environments and living organisms. PMID:26230684

  13. DNA sequence conservation between the Bacillus anthracis pXO2 plasmid and genomic sequence from closely related bacteria

    Directory of Open Access Journals (Sweden)

    Sabin Robert

    2002-12-01

    Background: Complete sequencing and annotation of the 96.2 kb Bacillus anthracis plasmid, pXO2, predicted 85 open reading frames (ORFs). Bacillus cereus and Bacillus thuringiensis isolates that ranged in genomic similarity to B. anthracis, as determined by amplified fragment length polymorphism (AFLP) analysis, were examined by PCR for the presence of sequences similar to 47 pXO2 ORFs. Results: The two most distantly related isolates examined, B. thuringiensis 33679 and B. thuringiensis AWO6, produced the greatest number of ORF sequences similar to pXO2; 10 detected in 33679 and 16 in AWO6. No more than two of the pXO2 ORFs were detected in any one of the remaining isolates. Dot-blot DNA hybridizations between pXO2 ORF fragments and total genomic DNA from AWO6 were consistent with the PCR assay results for this isolate and also revealed nine additional ORFs shared between these two bacteria. Sequences similar to the B. anthracis cap genes or their regulator, acpA, were not detected among any of the examined isolates. Conclusions: The presence of pXO2 sequences in the other Bacillus isolates did not correlate with genomic relatedness established by AFLP analysis. The presence of pXO2 ORF sequences in other Bacillus species suggests the possibility that certain pXO2 plasmid gene functions may also be present in other closely related bacteria.

  14. Anonymous Web Browsing and Hosting

    Directory of Open Access Journals (Sweden)

    MANOJ KUMAR

    2013-02-01

    In today's high-tech environment, organizations and individual computer users rely on the Internet to access web data. Secure web solutions are required to maintain the confidentiality and security of that data. In this paper we describe dedicated anonymous web browsing solutions that make browsing faster and more secure. Web applications that play an important role in transferring secret information, such as email, raise growing security concerns. The paper also describes how to choose a safe web hosting solution and which main functions provide more security for server data. Alongside browser security, network security is also important; it can be implemented using cryptographic solutions, VPNs and firewalls on the network. Hackers always try to steal our identity and data; they track our activities using network application software and carry out harmful activities. We therefore also describe how such activity can be monitored for security purposes.

  15. Distributed anonymous discrete function computation

    CERN Document Server

    Hendrickx, Julien M; Tsitsiklis, John N

    2010-01-01

    We propose a model for deterministic distributed function computation by a network of identical and anonymous nodes. In this model, each node has bounded computation and storage capabilities that do not grow with the network size. Furthermore, each node only knows its neighbors, not the entire graph. Our goal is to characterize the class of functions that can be computed within this model. In our main result, we provide a necessary condition for computability which we show to be nearly sufficient, in the sense that every function that violates this condition can at least be approximated. The problem of computing suitably rounded averages in a distributed manner plays a central role in our development; we provide an algorithm that solves it in time that grows quadratically with the size of the network.
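
    Distributed averaging with purely local information, which this abstract names as a central subproblem, can be illustrated with a textbook linear consensus iteration. The sketch below is not the authors' algorithm: it keeps real-valued state at each node and therefore does not respect the bounded-storage model of the paper. It uses Metropolis weights, a standard substitute chosen here for illustration; the function name and the example graph are assumptions.

```python
def metropolis_step(values, neighbors):
    """One synchronous consensus round on an undirected graph.

    values[i] is node i's current estimate; neighbors[i] lists the nodes
    adjacent to i. Each node uses only its neighbors' values and degrees.
    """
    new_values = []
    for i, x in enumerate(values):
        delta = 0.0
        for j in neighbors[i]:
            # Metropolis weight: symmetric in i and j, so the sum of all
            # values (and hence the average) is preserved by every round.
            w = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
            delta += w * (values[j] - x)
        new_values.append(x + delta)
    return new_values


# Path graph 0 - 1 - 2 - 3 with initial values whose average is 6.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = [0.0, 4.0, 8.0, 12.0]
for _ in range(200):
    values = metropolis_step(values, neighbors)
print(values)
```

    On a connected graph every estimate converges to the initial average; the number of rounds needed depends on the graph, and this sketch makes no claim about matching the quadratic bound stated in the abstract.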

  16. Evaluation of Anonymized ONS Queries

    CERN Document Server

    Garcia-Alfaro, Joaquin; Kranakis, Evangelos

    2009-01-01

    Electronic Product Code (EPC) is the basis of a pervasive infrastructure for the automatic identification of objects on supply chain applications (e.g., pharmaceutical or military applications). This infrastructure relies on the use of the (1) Radio Frequency Identification (RFID) technology to tag objects in motion and (2) distributed services providing information about objects via the Internet. A lookup service, called the Object Name Service (ONS) and based on the use of the Domain Name System (DNS), can be publicly accessed by EPC applications looking for information associated with tagged objects. Privacy issues may affect corporate infrastructures based on EPC technologies if their lookup service is not properly protected. A possible solution to mitigate these issues is the use of online anonymity. We present an evaluation experiment that assesses the use of Tor (The second generation Onion Router) on a global ONS/DNS setup, with respect to benefits, limitations, and latency.

  17. Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

    Energy Technology Data Exchange (ETDEWEB)

    Torella, JP; Lienert, F; Boehm, CR; Chen, JH; Way, JC; Silver, PA

    2014-08-07

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.

  18. Cloning and sequencing of a DNA fragment encoding N37 apoptotic peptide derived from p53

    Institute of Scientific and Technical Information of China (English)

    Yan-xia Bai; Qing-yong Ma; Guang-xiao Yang

    2009-01-01

    Objective: It was reported that the p53 apoptotic peptide (N37) could inhibit the p73 gene by binding to iASPP, which could induce tumor cell apoptosis. To further explore the function of N37, we constructed the cloning plasmid of the DNA fragment encoding the p53 (N37) apoptotic peptide by using DNA synthesis and molecular biology methods. Methods: According to the human p53 sequence from the GenBank database, the primer of the p53 (N37) gene was designed using Primer V7.0 software. The DNA fragment encoding the p53 (N37) apoptotic peptide was amplified by using the self-complementation polymerase chain reaction (PCR) method and cloned into the pGEM-T Easy vector. The constructed plasmid was confirmed by endonuclease analysis and sequencing. Results: The insertion of the objective DNA fragment was confirmed by plasmid DNA enzyme spectrum analysis, and the p53 (N37) gene was successfully synthesized chemically in vitro. The sequencing result of the positive clone was completely identical to the human p53 (N37) sequence in GenBank using BLAST software (http://www.ncbi.nlm.nih.gov/cgi-bin/BLASTn). Conclusion: The cloning of the DNA fragment encoding the p53 (N37) apoptotic peptide was constructed by using DNA synthesis and pGEM-T Easy cloning methods. With the constructed plasmid, we could further investigate the function of the N37 peptide.

  19. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments.

    Science.gov (United States)

    Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias

    2013-09-24

    Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousands of years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp. PMID:24019490

  20. Phylogeny and genetic diversity of Bridgeoporus nobilissimus inferred using mitochondrial and nuclear rDNA sequences

    Science.gov (United States)

    Redberg, G.L.; Hibbett, D.S.; Ammirati, J.F.; Rodriguez, R.J.

    2003-01-01

    The genetic diversity and phylogeny of Bridgeoporus nobilissimus have been analyzed. DNA was extracted from spores collected from individual fruiting bodies representing six geographically distinct populations in Oregon and Washington. Spore samples collected contained low levels of bacteria, yeast and a filamentous fungal species. Using taxon-specific PCR primers, it was possible to discriminate among rDNA from bacteria, yeast, a filamentous associate and B. nobilissimus. Nuclear rDNA internal transcribed spacer (ITS) region sequences of B. nobilissimus were compared among individuals representing six populations and were found to have less than 2% variation. These sequences also were used to design dual and nested PCR primers for B. nobilissimus-specific amplification. Mitochondrial small-subunit rDNA sequences were used in a phylogenetic analysis that placed B. nobilissimus in the hymenochaetoid clade, where it was associated with Oxyporus and Schizopora.

  1. Statistical Algorithms for Long DNA Sequences: Oligonucleotide Distributions and Homogeneity Maps

    Directory of Open Access Journals (Sweden)

    P. Katsaloulis

    2005-01-01

    The statistical properties of oligonucleotide appearances within long DNA sequences often reveal useful characteristics of the corresponding DNA areas. Two algorithms to statistically analyze oligonucleotide appearances within long DNA sequences in genome banks are presented. The first algorithm determines statistical indices for arbitrary length oligonucleotides within arbitrary length DNA sequences. The critical exponent μ of the distance distribution between consecutive occurrences of the same oligonucleotide is calculated and its value is shown to characterize the functionality of the oligonucleotide. The second algorithm searches for areas with variable homogeneity, based on the density of oligonucleotides. The two algorithms have been applied to representative eucaryotes (the animal Mus musculus and the plant Arabidopsis thaliana) and interesting results were obtained, confirmed by biological observations. All programs are open source and publicly available on our web site.
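
    The raw quantity behind the first algorithm, the distribution of distances between consecutive occurrences of one oligonucleotide, can be computed directly. The sketch below is an illustration under stated assumptions rather than the authors' open-source program: the function names are hypothetical, overlapping occurrences are counted, and the fit that yields the exponent μ is omitted because the abstract does not describe it.

```python
from collections import Counter


def occurrence_positions(seq, oligo):
    """Start indices of all (possibly overlapping) occurrences of oligo."""
    positions = []
    start = seq.find(oligo)
    while start != -1:
        positions.append(start)
        start = seq.find(oligo, start + 1)
    return positions


def distance_distribution(seq, oligo):
    """Histogram of distances between consecutive occurrences of oligo."""
    pos = occurrence_positions(seq, oligo)
    return Counter(b - a for a, b in zip(pos, pos[1:]))


seq = "ACGTTACGAACGACG"
print(occurrence_positions(seq, "ACG"))
print(distance_distribution(seq, "ACG"))
```

    On a genome-scale sequence the resulting histogram is the input from which a distance-distribution exponent would be estimated.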

  2. A Model of Sequence Dependent RNA-Polymerase Diffusion Along DNA

    CERN Document Server

    Barbi, Maria; Place, Christophe; Popkov, Vladislav; Salerno, Mario

    2001-01-01

    We introduce a probabilistic model for the RNA-polymerase sliding motion along DNA during the promoter search. The model accounts for possible effects due to sequence-dependent interactions between the nonspecific DNA and the enzyme. We focus on T7 RNA-polymerase and exploit the available information about its interaction at the promoter site in order to investigate the influence of bacteriophage T7 DNA sequence on the dynamics of the sliding process. Hydrogen bonds in the major groove are used as the main sequence-dependent interaction between the RNA-polymerase and the DNA. The resulting dynamical properties and the possibility of an experimental validation are discussed in detail. We show that, while at large times the process reaches a pure diffusive regime, it initially displays a sub-diffusive behavior. The crossover from anomalous to normal diffusion may occur at times large enough to be of biological interest.

  3. Phylogenetic study on Shiraia bambusicola by rDNA sequence analyses.

    Science.gov (United States)

    Cheng, Tian-Fan; Jia, Xiao-Ming; Ma, Xiao-Hang; Lin, Hai-Ping; Zhao, Yu-Hua

    2004-01-01

    In this study, 18S rDNA and ITS-5.8S rDNA regions of four Shiraia bambusicola isolates collected from different species of bamboos were amplified by PCR with universal primer pairs NS1/NS8 and ITS5/ITS4, respectively, and sequenced. Phylogenetic analyses were conducted on three selected datasets of rDNA sequences. Maximum parsimony, distance and maximum likelihood criteria were used to infer trees. Morphological characteristics were also observed. The positioning of Shiraia in the order Pleosporales was well supported by bootstrap, which agreed with the placement by Amano (1980) according to their morphology. We did not find significant inter-hostal differences among these four isolates from different species of bamboos. From the results of analyses and comparison of their rDNA sequences, we conclude that Shiraia should be classified into Pleosporales as Amano (1980) proposed and suggest that it might be positioned in the family Phaeosphaeriaceae.

  4. Sequence-selective targeting of duplex DNA by peptide nucleic acids

    DEFF Research Database (Denmark)

    Nielsen, Peter E

    2010-01-01

    Sequence-selective gene targeting constitutes an attractive drug-discovery approach for genetic therapy, with the aim of reducing or enhancing the activity of specific genes at the transcriptional level, or as part of a methodology for targeted gene repair. The pseudopeptide DNA mimic peptide...... nucleic acid (PNA) can recognize duplex DNA with high sequence specificity and affinity in triplex, duplex and double-duplex invasive modes or non-invasive triplex modes. Novel PNA modification has improved the affinity for DNA recognition via duplex invasion, double-duplex invasion and triplex...... recognition considerably. Such modifications have also resulted in new approaches to targeted gene repair and sequence-selective double-strand cleavage of genomic DNA....

  5. Correcting sequencing errors in DNA coding regions using a dynamic programming approach.

    Science.gov (United States)

    Xu, Y; Mural, R J; Uberbacher, E C

    1995-04-01

    This paper presents an algorithm for detecting and 'correcting' sequencing errors that occur in DNA coding regions. The types of sequencing errors addressed are insertions and deletions (indels) of DNA bases. The goal is to provide a capability which makes single-pass or low-redundancy sequence data more informative, reducing the need for high-redundancy sequencing for gene identification and characterization purposes. This would permit improved sequencing efficiency and reduce genome sequencing costs. The algorithm detects sequencing errors by discovering changes in the statistically preferred reading frame within a putative coding region and then inserts a number of 'neutral' bases at a perceived reading frame transition point to make the putative exon candidate frame consistent. We have implemented the algorithm as a front-end subsystem of the GRAIL DNA sequence analysis system to construct a version which is very error tolerant, and we also intend to use this as a testbed for further development of sequencing error-correction technology. Preliminary test results have shown the usefulness of this algorithm and also exhibited some of its weaknesses, providing possible directions for further improvement. On a test set consisting of 68 human DNA sequences with 1% randomly generated indels in coding regions, the algorithm detected and corrected 76% of the indels. The average distance between the position of an indel and the predicted one was 9.4 bases. With this subsystem in place, GRAIL correctly predicted 89% of the coding messages with 10% false messages on the 'corrected' sequences, compared to 69% correctly predicted coding messages and 11% falsely predicted messages on the 'corrupted' sequences using the standard GRAIL II method (version 1.2).(ABSTRACT TRUNCATED AT 250 WORDS)

  6. Plant or fungal sequences? An alternative optimized PCR protocol to avoid ITS (nrDNA) misamplification

    OpenAIRE

    Vitor Fernandes Oliveira de Miranda; Vanderlei Geraldo Martins; Antonio Furlan; Maurício Bacci Jr.

    2010-01-01

    The nuclear ribosomal DNA internal transcribed spacers (ITS1 and ITS2) from leaves of Drosera (Droseraceae) were amplified using "universal" primers. The analysis of the products demonstrated most samples were a molecular mixture as a result of unsuccessful and non-specific amplifications. Among the obtained sequences, two were from Basidiomycota fungi. Homologous sequences of Basidiomycota were obtained from GenBank database and added to a data set with sequences from Drosera leaves. Parsimo...

  7. Neural network predicts sequence of TP53 gene based on DNA chip

    DEFF Research Database (Denmark)

    Spicker, J.S.; Wikman, F.; Lu, M.L.;

    2002-01-01

    We have trained an artificial neural network to predict the sequence of the human TP53 tumor suppressor gene based on a p53 GeneChip. The trained neural network uses as input the fluorescence intensities of DNA hybridized to oligonucleotides on the surface of the chip and makes between zero and four errors in the predicted 1300 bp sequence when tested on wild-type TP53 sequence....

  8. Chromatin Isolation and DNA Sequence Analysis in Large Undergraduate Laboratory Sections

    Science.gov (United States)

    Hagerman, Ann E.

    1999-10-01

    A pair of exercises that introduce undergraduate students to basic techniques and concepts of molecular biology and that are appropriate for classes with large enrollments are described. One exercise is a simple laboratory experiment in which chromatin is isolated from chicken liver and is resolved into histone proteins and DNA by ion-exchange chromatography. The other is a series of computer simulations that introduce DNA sequencing, mapping, and sequence analysis to the students. The final step of the simulation is submission of a sequence to a database on the World Wide Web for identification of the protein product of the gene.

  9. Human insulin genome sequence map, biochemical structure of insulin for recombinant DNA insulin.

    Science.gov (United States)

    Chakraborty, Chiranjib; Mungantiwar, Ashish A

    2003-08-01

    Insulin is an essential molecule for type I diabetes that is marketed by very few companies. It was the first molecule to be made by recombinant technology, but the commercialization process is very difficult. Knowledge of the biochemical structure of insulin and of the human insulin genome sequence map is pivotal to large-scale manufacturing of recombinant DNA insulin. This paper reviews the human insulin genome sequence map, the amino acid sequence of porcine insulin, the crystal structure of porcine insulin, the insulin monomer, aggregation surfaces of insulin, conformational variation in the insulin monomer, and insulin X-ray structures for recombinant DNA technology in the synthesis of human insulin in Escherichia coli. PMID:12769691

  10. Genomic DNA enrichment using sequence capture microarrays: a novel approach to discover sequence nucleotide polymorphisms (SNP) in Brassica napus L.

    Directory of Open Access Journals (Sweden)

    Wayne E Clarke

    Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci (QTL) analysis) with traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and, to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty-eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome, generating a novel data resource for research and breeding of this crop species.
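
    The transition/transversion split reported in this abstract rests on a simple base-class rule: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and anything else is a transversion. The sketch below shows that rule; the function names and the (ref, alt) tuple format are assumptions made for illustration and do not come from the study's pipeline.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Label a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref == alt or ref not in bases or alt not in bases:
        raise ValueError(f"not a simple SNP: {ref}>{alt}")
    # Same chemical class on both sides means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_fractions(snps):
    """Return the fraction of transitions and transversions in (ref, alt) pairs."""
    counts = Counter(classify_snp(ref, alt) for ref, alt in snps)
    total = sum(counts.values())
    return {kind: counts[kind] / total for kind in ("transition", "transversion")}


# Toy input, not data from the study.
print(ts_tv_fractions([("A", "G"), ("C", "T"), ("G", "A"), ("A", "T"), ("C", "G")]))
```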

  11. Repetitive sequences in Eurasian lynx (Lynx lynx L.) mitochondrial DNA control region.

    Science.gov (United States)

    Sindičić, Magda; Gomerčić, Tomislav; Galov, Ana; Polanc, Primož; Huber, Duro; Slavica, Alen

    2012-06-01

    Mitochondrial DNA (mtDNA) control region (CR) of numerous species is known to include up to five different repetitive sequences (RS1-RS5) that are found at various locations, involving motifs of different length and extensive length heteroplasmy. Two repetitive sequences (RS2 and RS3) on opposite sides of mtDNA central conserved region have been described in domestic cat (Felis catus) and some other felid species. However, the presence of repetitive sequence RS3 has not been detected in Eurasian lynx (Lynx lynx) yet. We analyzed mtDNA CR of 35 Eurasian lynx (L. lynx L.) samples to characterize repetitive sequences and to compare them with those found in other felid species. We confirmed the presence of 80 base pairs (bp) repetitive sequence (RS2) at the 5' end of the Eurasian lynx mtDNA CR L strand and for the first time we described RS3 repetitive sequence at its 3' end, consisting of an array of tandem repeats five to ten bp long. We found that felid species share similar RS3 repetitive pattern and fundamental repeat motif TACAC.

  12. 5'-end sequences of budding yeast full-length cDNA clones and quality scores - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Budding yeast cDNA sequencing project 5'-end sequences of budding yeast full-length cDNA clones and quality ...scores Data detail Data name 5'-end sequences of budding yeast full-length cDNA clones and quality scores De...from the budding yeast full-length cDNA library by the vector-capping method, the sequence quality score gen...s accession only. Sequence 5'-end sequence data of budding yeast full-length cDNA clones. FASTA format. Quality Phred's quality...

  13. [Statistical analysis of DNA sequences nearby splicing sites].

    Science.gov (United States)

    Korzinov, O M; Astakhova, T V; Vlasov, P K; Roĭtberg, M A

    2008-01-01

    Recognition of coding regions within eukaryotic genomes is one of the oldest and still unsolved problems of bioinformatics. New high-accuracy methods of splicing site recognition are needed to solve this problem. A question of current interest is how to identify specific features of nucleotide sequences near splicing sites and to recognize sites in their sequence context. We performed a statistical analysis of a database of human gene fragments and revealed some characteristics of nucleotide sequences in the neighborhood of splicing sites. Frequencies of all nucleotides and dinucleotides in the vicinity of splicing sites were computed, and nucleotides and dinucleotides with extremely high or low occurrences were identified. The statistical information obtained in this work can be used in further development of methods for splicing site annotation and exon-intron structure recognition.
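
    The frequency computation mentioned in this abstract can be illustrated with a short sketch. It assumes the sequence windows around each splicing site have already been extracted; the function name `kmer_frequencies` and the toy windows are hypothetical, and the original study's windowing and statistics are not described in enough detail here to reproduce.

```python
from collections import Counter


def kmer_frequencies(windows, k):
    """Relative frequency of every k-mer across a set of sequence windows.

    k=1 gives nucleotide frequencies, k=2 gives dinucleotide frequencies.
    Overlapping k-mers are counted, one per start position.
    """
    counts = Counter()
    for window in windows:
        window = window.upper()
        for i in range(len(window) - k + 1):
            counts[window[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Toy windows, not data from the study.
windows = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAGGA"]
dinucs = kmer_frequencies(windows, 2)
for kmer, freq in sorted(dinucs.items(), key=lambda item: -item[1])[:3]:
    print(kmer, round(freq, 3))
```

    Sorting the resulting dictionary by frequency is one simple way to surface the unusually common or rare k-mers that the abstract refers to.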

  14. Merging Two Strategies for Mixed-Sequence Recognition of Double-Stranded DNA: Pseudocomplementary Invader Probes.

    Science.gov (United States)

    Anderson, Brooke A; Hrdlicka, Patrick J

    2016-04-15

    The development of molecular strategies that enable recognition of specific double-stranded DNA (dsDNA) regions has been a longstanding goal, as evidenced by the emergence of triplex-forming oligonucleotides, peptide nucleic acids (PNAs), minor groove binding polyamides, and, more recently, engineered proteins such as CRISPR/Cas9. Despite this progress, an unmet need remains for simple hybridization-based probes that recognize specific mixed-sequence dsDNA regions under physiological conditions. Herein, we introduce pseudocomplementary Invader probes as a step in this direction. These double-stranded probes are chimeras between pseudocomplementary DNA (pcDNA) and Invader probes, which are activated for mixed-sequence dsDNA-recognition through the introduction of pseudocomplementary base pairs comprised of 2-thiothymine and 2,6-diaminopurine, and +1 interstrand zipper arrangements of intercalator-functionalized nucleotides, respectively. We demonstrate that certain pseudocomplementary Invader probe designs result in very efficient and specific recognition of model dsDNA targets in buffers of high ionic strength. These chimeric probes, therefore, present themselves as a promising strategy for mixed-sequence recognition of dsDNA targets for applications in molecular biology and nucleic acid diagnostics. PMID:26998918

  15. Assembly of long DNA sequences using a new synthetic Escherichia coli-yeast shuttle vector.

    Science.gov (United States)

    Hou, Zheng; Zhou, Zheng; Wang, Zonglin; Xiao, Gengfu

    2016-04-01

    Synthetic biology is a newly developed field of research focused on designing and rebuilding novel biomolecular components, circuits, and networks. Synthetic biology can also help understand biological principles and engineer complex artificial metabolic systems. DNA manipulation on a large genome-wide scale is an inevitable challenge, but a necessary tool for synthetic biology. To improve the methods used for the synthesis of long DNA fragments, here we constructed a novel shuttle vector named pGF (plasmid Genome Fast) for DNA assembly in vivo. The BAC plasmid pCC1BAC, which can accommodate large DNA molecules, was chosen as the backbone. The sequence of the yeast artificial chromosome (YAC) regulatory element CEN6-ARS4 was synthesized and inserted into the plasmid to enable it to replicate in yeast. The selection sequence HIS3, obtained by polymerase chain reaction (PCR) from the plasmid pBS313, was inserted for screening. This new synthetic shuttle vector can mediate the transformation-associated recombination (TAR) assembly of large DNA fragments in yeast, and the assembled products can be transformed into Escherichia coli for further amplification. We also conducted in vivo DNA assembly using pGF and yeast homologous recombination and constructed a 31-kb long DNA sequence from the cyanophage PP genome. Our findings show that this novel shuttle vector would be a useful tool for efficient genome-scale DNA reconstruction. PMID:27113243

  16. rDNA-ITS sequence analysis of pathogens of cucumber downy mildew and cucumber powdery mildew

    Institute of Scientific and Technical Information of China (English)

    Na WANG; Yajun MA; Cuiyun YANG; Guanghui DAI; Zhezhi WANG

    2008-01-01

    To identify the pathogens of cucumber downy mildew and cucumber powdery mildew by molecular markers, we amplified and sequenced the rDNA-ITS region of the pathogens of both diseases collected from the Shanghai region. Intra- and interspecific sequence differences were analyzed using the rDNA-ITS sequence. The results show that the lengths of rDNA-ITS1 and rDNA-ITS2 of the cucumber downy mildew pathogen were 141 bp and 406 bp, respectively, with GC contents of 41.13% in ITS1 and 46.8% (Minhang and Jinshan Districts, sm1 and sm2) or 46.55% (Pudong District, sm3) in ITS2. The rDNA-ITS sequence was conserved within species, and the interspecific differences were related to how closely the species are related. The pathogen of cucumber downy mildew was identified as Pseudoperonospora cubensis by molecular marker. The lengths of rDNA-ITS1 and rDNA-ITS2 of the cucumber powdery mildew pathogen were 136 bp and 89 bp, respectively, with GC contents of 59.56% and 66.29%, and the rDNA-ITS sequence was highly conserved in this study and the same as that of Sphaerotheca cucurbitae. However, the sequence difference between the Shanghai strains in this study and S. fuliginea, which was identified by morphology, was 4.5%. It is suggested that the pathogen of cucumber powdery mildew should be further clarified and determined.
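
    GC content, as quoted in this abstract, is the percentage of G and C bases in a sequence. The helper below is a minimal sketch; the name `gc_percent` is hypothetical. As a consistency check, a 141 bp sequence with 58 G or C bases gives 41.13%, which agrees with the ITS1 figure above; the count of 58 is back-calculated here for illustration and is not reported in the abstract.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Synthetic 141 bp sequence with 58 G/C bases (illustrative only).
synthetic_its1 = "G" * 30 + "C" * 28 + "A" * 43 + "T" * 40
print(round(gc_percent(synthetic_its1), 2))
```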

  17. Rapid detection and purification of sequence specific DNA binding proteins using magnetic separation

    Directory of Open Access Journals (Sweden)

    TIJANA SAVIC

    2006-02-01

    In this paper, a method for the rapid identification and purification of sequence-specific DNA binding proteins based on magnetic separation is presented. This method was applied to confirm the binding of the human recombinant USF1 protein to its putative binding site (E-box) within the human SOX3 promoter. It has been shown that biotinylated DNA attached to streptavidin magnetic particles specifically binds the USF1 protein in the presence of competitor DNA. It has also been demonstrated that the protein could be successfully eluted from the beads, in high yield and with restored DNA binding activity. The advantage of these procedures is that they could be applied for the identification and purification of any high-affinity sequence-specific DNA binding protein with only minor modifications.

  18. Sequence-specific protection of duplex DNA against restriction and methylation enzymes by pseudocomplementary PNAs

    DEFF Research Database (Denmark)

    Izvolsky, K I; Demidov, V V; Nielsen, P E;

    2000-01-01

    A new generation of PNAs, so-called pseudocomplementary PNAs (pcPNAs), which are able to target the designated sites on duplex DNA with mixed sequence of purines and pyrimidines via double-duplex invasion mode, has recently been introduced. It has been demonstrated that appropriate pairs of decameric pcPNAs block access of RNA polymerase to the corresponding promoter. Here, we show that this type of PNA protects selected DNA sites containing all four nucleobases from the action of restriction enzymes and DNA methyltransferases. We have found that pcPNAs as short as octamers form stable and sequence-specific complexes with duplex DNA in a very salt-dependent manner. In accord with a strand-invasion mode of complex formation, the pcPNA binding proceeds much faster with supercoiled than with linear plasmids. The double-duplex invasion complexes selectively shield specific DNA sites from Bcl...

  19. Thermoelectric effect and its dependence on molecular length and sequence in single DNA molecules.

    Science.gov (United States)

    Li, Yueqi; Xiang, Limin; Palma, Julio L; Asai, Yoshihiro; Tao, Nongjian

    2016-01-01

    Studying the thermoelectric effect in DNA is important for unravelling charge transport mechanisms and for developing relevant applications of DNA molecules. Here we report a study of the thermoelectric effect in single DNA molecules. By varying the molecular length and sequence, we tune the charge transport in DNA to either a hopping- or a tunnelling-dominated regime. The thermoelectric effect is small and insensitive to the molecular length in the hopping regime. In contrast, the thermoelectric effect is large and sensitive to the length in the tunnelling regime. These findings indicate that one may control the thermoelectric effect in DNA by varying its sequence and length. We describe the experimental results in terms of hopping and tunnelling charge transport models.

  20. DNA sequencing with capillary electrophoresis and single cell analysis with mass spectrometry

    Energy Technology Data Exchange (ETDEWEB)

    Fung, N.

    1998-03-27

    Since the first demonstration of the laser in the 1960s, lasers have found numerous applications in analytical chemistry. In this work, two different applications are described, namely, DNA sequencing with capillary gel electrophoresis and single cell analysis with mass spectrometry. Two projects are described in which high-speed DNA separations with capillary gel electrophoresis were demonstrated. In the third project, flow cytometry and mass spectrometry were coupled via a laser vaporization/ionization interface and individual mammalian cells were analyzed. First, DNA Sanger fragments were separated by capillary gel electrophoresis. A separation speed of 20 basepairs per minute was demonstrated with a mixed poly(ethylene oxide) (PEO) sieving solution. In addition, a new capillary wall treatment protocol was developed in which bare (or uncoated) capillaries can be used in DNA sequencing. Second, a temperature programming scheme was used to separate DNA Sanger fragments. Third, flow cytometry and mass spectrometry were coupled with a laser vaporization/ionization interface.

  1. Thermoelectric effect and its dependence on molecular length and sequence in single DNA molecules

    Science.gov (United States)

    Li, Yueqi; Xiang, Limin; Palma, Julio L.; Asai, Yoshihiro; Tao, Nongjian

    2016-04-01

    Studying the thermoelectric effect in DNA is important for unravelling charge transport mechanisms and for developing relevant applications of DNA molecules. Here we report a study of the thermoelectric effect in single DNA molecules. By varying the molecular length and sequence, we tune the charge transport in DNA to either a hopping- or a tunnelling-dominated regime. The thermoelectric effect is small and insensitive to the molecular length in the hopping regime. In contrast, the thermoelectric effect is large and sensitive to the length in the tunnelling regime. These findings indicate that one may control the thermoelectric effect in DNA by varying its sequence and length. We describe the experimental results in terms of hopping and tunnelling charge transport models.

  2. Base J glucosyltransferase does not regulate the sequence specificity of J synthesis in trypanosomatid telomeric DNA.

    Science.gov (United States)

    Bullard, Whitney; Cliffe, Laura; Wang, Pengcheng; Wang, Yinsheng; Sabatini, Robert

    2015-12-01

    Telomeric DNA of trypanosomatids possesses a modified thymine base, called base J, that is synthesized in a two-step process; the base is hydroxylated by a thymidine hydroxylase forming hydroxymethyluracil (hmU) and a glucose moiety is then attached by the J-associated glucosyltransferase (JGT). To examine the importance of JGT in modifying specific thymines in DNA, we used a Leishmania episome system to demonstrate that the telomeric repeat (GGGTTA) stimulates J synthesis in vivo while mutant telomeric sequences (GGGTTT, GGGATT, and GGGAAA) do not. Utilizing an in vitro GT assay, we find that JGT can glycosylate hmU within any sequence with no significant change in Km or kcat, even mutant telomeric sequences that are unable to be J-modified in vivo. The data suggest that JGT possesses no DNA sequence specificity in vitro, lending support to the hypothesis that the specificity of base J synthesis is not at the level of the JGT reaction.

  3. Base J glucosyltransferase does not regulate the sequence specificity of J synthesis in trypanosomatid telomeric DNA.

    Science.gov (United States)

    Bullard, Whitney; Cliffe, Laura; Wang, Pengcheng; Wang, Yinsheng; Sabatini, Robert

    2015-12-01

    Telomeric DNA of trypanosomatids possesses a modified thymine base, called base J, that is synthesized in a two-step process; the base is hydroxylated by a thymidine hydroxylase forming hydroxymethyluracil (hmU) and a glucose moiety is then attached by the J-associated glucosyltransferase (JGT). To examine the importance of JGT in modifying specific thymines in DNA, we used a Leishmania episome system to demonstrate that the telomeric repeat (GGGTTA) stimulates J synthesis in vivo while mutant telomeric sequences (GGGTTT, GGGATT, and GGGAAA) do not. Utilizing an in vitro GT assay, we find that JGT can glycosylate hmU within any sequence with no significant change in Km or kcat, even mutant telomeric sequences that are unable to be J-modified in vivo. The data suggest that JGT possesses no DNA sequence specificity in vitro, lending support to the hypothesis that the specificity of base J synthesis is not at the level of the JGT reaction. PMID:26815240

  4. [Application of rDNA-ITS sequence in entomology].

    Science.gov (United States)

    Liu, Yan-bin; Ji, Lan-zhu

    2007-05-01

    As an important complement to the information obtained from mtDNA, the internal transcribed spacer (ITS) of nuclear ribosomal DNA is being increasingly applied in entomological study. This paper introduces the structure and characteristics of ITS and summarizes its applications in identifying insect species and in studying their phylogenetic relationships, evolution and spread, and relations with the environment. ITS has mainly been applied to identify species whose morphological differences are subtle. Research on phylogenetic relationships has aimed to understand species origin and evolution, while the study of relations with the environment has mainly focused on social and parasitic insects. The problems in ITS application and their possible causes are discussed.

  5. Highly parallel translation of DNA sequences into small molecules.

    Directory of Open Access Journals (Sweden)

    Rebecca M Weisinger

    A large body of in vitro evolution work establishes the utility of biopolymer libraries comprising 10^10 to 10^15 distinct molecules for the discovery of nanomolar-affinity ligands to proteins. Small-molecule libraries of comparable complexity will likely provide nanomolar-affinity small-molecule ligands. Unlike biopolymers, small molecules can offer the advantages of cell permeability, low immunogenicity, metabolic stability, rapid diffusion and inexpensive mass production. It is thought that such desirable in vivo behavior is correlated with the physical properties of small molecules, specifically a limited number of hydrogen bond donors and acceptors, a defined range of hydrophobicity, and most importantly, molecular weights less than 500 Daltons. Creating a collection of 10^10 to 10^15 small molecules that meet these criteria requires the use of hundreds to thousands of diversity elements per step in a combinatorial synthesis of three to five steps. With this goal in mind, we have reported a set of mesofluidic devices that enable DNA-programmed combinatorial chemistry in a highly parallel 384-well plate format. Here, we demonstrate that these devices can translate DNA genes encoding 384 diversity elements per coding position into corresponding small-molecule gene products. This robust and efficient procedure yields small molecule-DNA conjugates suitable for in vitro evolution experiments.

  6. Sequence Searcher: A Java tool to perform regular expression and fuzzy searches of multiple DNA and protein sequences

    Directory of Open Access Journals (Sweden)

    Upton Chris

    2009-01-01

    Background: Many sequence-searching tools have limiting factors for their use. For example, they may be platform specific, enforce restrictive size limits on the sequences to be searched, or allow searches of only DNA or only protein. Findings: We present an easy-to-use, fast, platform-independent tool to search for amino acid or nucleotide patterns within one or many protein or nucleic acid sequences. The user can choose to search for regular expressions or perform a fuzzy search in which a particular number of errors is accepted during matching of a sequence. Positions of mismatches in fuzzy searches are displayed graphically to the user. Conclusion: SeqS provides an improved feature set and functions as a stand-alone tool or could be integrated into other bioinformatics platforms.
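
    The two search modes described in this abstract can be illustrated with a small Python sketch. It is not the SeqS code (SeqS is a Java tool whose internals are not given here); the function names are hypothetical, and the fuzzy search below assumes that an "error" means a substitution only, which may be narrower than what the tool accepts.

```python
import re


def regex_search(pattern, sequence):
    """Return (start, matched_text) for each non-overlapping regex match."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, sequence)]


def fuzzy_search(pattern, sequence, max_mismatches):
    """Return (start, window, mismatch_positions) for every window of the
    sequence that differs from the pattern at no more than max_mismatches
    positions. Only substitutions are considered (an assumption)."""
    hits = []
    width = len(pattern)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = [i for i, (p, s) in enumerate(zip(pattern, window)) if p != s]
        if len(mismatches) <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


sequence = "ACGAATCTTGATTCA"
print(regex_search("GA[AT]TC", sequence))
print(fuzzy_search("GAATC", sequence, max_mismatches=1))
```

    Returning the mismatch positions with each hit is what would let a front end highlight them, in the spirit of the graphical display the abstract mentions.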

  7. DNA sequence and analysis of human chromosome 8.

    Science.gov (United States)

    Nusbaum, Chad; Mikkelsen, Tarjei S; Zody, Michael C; Asakawa, Shuichi; Taudien, Stefan; Garber, Manuel; Kodira, Chinnappa D; Schueler, Mary G; Shimizu, Atsushi; Whittaker, Charles A; Chang, Jean L; Cuomo, Christina A; Dewar, Ken; FitzGerald, Michael G; Yang, Xiaoping; Allen, Nicole R; Anderson, Scott; Asakawa, Teruyo; Blechschmidt, Karin; Bloom, Toby; Borowsky, Mark L; Butler, Jonathan; Cook, April; Corum, Benjamin; DeArellano, Kurt; DeCaprio, David; Dooley, Kathleen T; Dorris, Lester; Engels, Reinhard; Glöckner, Gernot; Hafez, Nabil; Hagopian, Daniel S; Hall, Jennifer L; Ishikawa, Sabine K; Jaffe, David B; Kamat, Asha; Kudoh, Jun; Lehmann, Rüdiger; Lokitsang, Tashi; Macdonald, Pendexter; Major, John E; Matthews, Charles D; Mauceli, Evan; Menzel, Uwe; Mihalev, Atanas H; Minoshima, Shinsei; Murayama, Yuji; Naylor, Jerome W; Nicol, Robert; Nguyen, Cindy; O'Leary, Sinéad B; O'Neill, Keith; Parker, Stephen C J; Polley, Andreas; Raymond, Christina K; Reichwald, Kathrin; Rodriguez, Joseph; Sasaki, Takashi; Schilhabel, Markus; Siddiqui, Roman; Smith, Cherylyn L; Sneddon, Tam P; Talamas, Jessica A; Tenzin, Pema; Topham, Kerri; Venkataraman, Vijay; Wen, Gaiping; Yamazaki, Satoru; Young, Sarah K; Zeng, Qiandong; Zimmer, Andrew R; Rosenthal, Andre; Birren, Bruce W; Platzer, Matthias; Shimizu, Nobuyoshi; Lander, Eric S

    2006-01-19

    The International Human Genome Sequencing Consortium (IHGSC) recently completed a sequence of the human genome. As part of this project, we have focused on chromosome 8. Although some chromosomes exhibit extreme characteristics in terms of length, gene content, repeat content and fraction segmentally duplicated, chromosome 8 is distinctly typical in character, being very close to the genome median in each of these aspects. This work describes a finished sequence and gene catalogue for the chromosome, which represents just over 5% of the euchromatic human genome. A unique feature of the chromosome is a vast region of approximately 15 megabases on distal 8p that appears to have a strikingly high mutation rate, which has accelerated in the hominids relative to other sequenced mammals. This fast-evolving region contains a number of genes related to innate immunity and the nervous system, including loci that appear to be under positive selection; these include the major defensin (DEF) gene cluster and MCPH1, a gene that may have contributed to the evolution of expanded brain size in the great apes. The data from chromosome 8 should allow a better understanding of both normal and disease biology and genome evolution. PMID:16421571

  8. An editing environment for DNA sequence analysis and annotation

    Energy Technology Data Exchange (ETDEWEB)

    Uberbacher, E.C.; Xu, Y.; Shah, M.B.; Olman, V.; Parang, M.; Mural, R.

    1998-12-31

    This paper presents a computer system for analyzing and annotating large-scale genomic sequences. The core of the system is a multiple-gene structure identification program, which predicts the most probable gene structures based on the given evidence, including pattern recognition, EST and protein homology information. A graphics-based user interface provides an environment which allows the user to interactively control the evidence to be used in the gene identification process. To overcome the computational bottleneck in the database similarity search used in the gene identification process, the authors have developed an effective way to partition a database into a set of sub-databases of related sequences, and reduced the search problem on a large database to a signature identification problem and a search problem on a much smaller sub-database. This reduces the number of sequences to be searched from N to O(√N) on average, and hence greatly reduces the search time, where N is the number of sequences in the original database. The system provides the user with the ability to facilitate and modify the analysis and modeling in real time.
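
    One way to see where an O(√N) figure can come from is a simple cost count for a two-stage search. The sketch below assumes, purely for illustration, that the database is split into k equally sized sub-databases with one signature each, so that a query is compared against k signatures and then against one sub-database of about N/k sequences. The abstract does not state that the authors' partitioning is equal-sized, and the function name is hypothetical.

```python
import math


def two_stage_comparisons(n_sequences, n_partitions):
    """Comparisons for one query: every signature, then one sub-database.

    Assumes equally sized partitions and one signature per partition
    (an illustrative model, not the authors' stated scheme).
    """
    return n_partitions + math.ceil(n_sequences / n_partitions)


n = 1_000_000
for k in (10, 1_000, 100_000):
    print(k, two_stage_comparisons(n, k))
```

    Under this model the total k + N/k is smallest when k is close to √N, giving roughly 2√N comparisons instead of N.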

  9. Targeted enrichment of genomic DNA regions for next generation sequencing

    NARCIS (Netherlands)

    Mertens, F.; El-Sharawy, A.; Sauer, S.; Van Helvoort, J.; Van der Zaag, P.J.; Franke, A.; Nilsson, M.; Lehrach, H.; Brookes, A.

    2011-01-01

    In this review we discuss the latest targeted enrichment methods, and aspects of their utilization along with second generation sequencing for complex genome analysis. In doing so we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a pow

  10. Application of Subspace Clustering in DNA Sequence Analysis.

    Science.gov (United States)

    Wallace, Tim; Sekmen, Ali; Wang, Xiaofei

    2015-10-01

    Identification and clustering of orthologous genes plays an important role in developing evolutionary models such as validating convergent and divergent phylogeny and predicting functional proteins in newly sequenced species of unverified nucleotide protein mappings. Here, we introduce an application of subspace clustering as applied to orthologous gene sequences and discuss the initial results. The working hypothesis is based upon the concept that genetic changes between nucleotide sequences coding for proteins among selected species and groups may lie within a union of subspaces for clusters of the orthologous groups. Estimates for the subspace dimensions were computed for a small population sample. A series of experiments was performed to cluster randomly selected sequences. The experimental design allows for both false positives and false negatives, and estimates for the statistical significance are provided. The clustering results are consistent with the main hypothesis. A simple random mutation binary tree model is used to simulate speciation events that show the interdependence of the subspace rank versus time and mutation rates. The simple mutation model is found to be largely consistent with the observed subspace clustering singular value results. Our study indicates that the subspace clustering method may be applied in orthology analysis. PMID:26162018

  11. Ligation bias in Illumina next-generation DNA libraries: implications for sequencing ancient genomes.

    Directory of Open Access Journals (Sweden)

    Andaine Seguin-Orlando

    Ancient DNA extracts consist of a mixture of endogenous molecules and contaminant DNA templates, often originating from environmental microbes. These two populations of templates exhibit different chemical characteristics, with the former showing depurination and cytosine deamination by-products, resulting from post-mortem DNA damage. Such chemical modifications can interfere with the molecular tools used for building second-generation DNA libraries, and limit our ability to fully characterize the true complexity of ancient DNA extracts. In this study, we first use fresh DNA extracts to demonstrate that library preparation based on adapter ligation at AT-overhangs is biased against DNA templates starting with thymine residues, in contrast to blunt-end adapter ligation. We observe the same bias on fresh DNA extracts sheared on Bioruptor, Covaris and nebulizers. This contradicts previous reports suggesting that this bias could originate from the methods used for shearing DNA. This also suggests that AT-overhang adapter ligation efficiency is affected in a sequence-dependent manner and results in an uneven representation of different genomic contexts. We then show how this bias could affect the base composition of ancient DNA libraries prepared following AT-overhang ligation, mainly by limiting the ability to ligate DNA templates starting with thymines and therefore deaminated cytosines. This results in particular nucleotide misincorporation damage patterns, deviating from the signature generally expected for authenticating ancient sequence data. Consequently, we show that models adequate for estimating post-mortem DNA damage levels must be robust to the molecular tools used for building ancient DNA libraries.

  12. Ligation bias in Illumina next-generation DNA libraries: implications for sequencing ancient genomes.

    Science.gov (United States)

    Seguin-Orlando, Andaine; Schubert, Mikkel; Clary, Joel; Stagegaard, Julia; Alberdi, Maria T; Prado, José Luis; Prieto, Alfredo; Willerslev, Eske; Orlando, Ludovic

    2013-01-01

    Ancient DNA extracts consist of a mixture of endogenous molecules and contaminant DNA templates, often originating from environmental microbes. These two populations of templates exhibit different chemical characteristics, with the former showing depurination and cytosine deamination by-products, resulting from post-mortem DNA damage. Such chemical modifications can interfere with the molecular tools used for building second-generation DNA libraries, and limit our ability to fully characterize the true complexity of ancient DNA extracts. In this study, we first use fresh DNA extracts to demonstrate that library preparation based on adapter ligation at AT-overhangs is biased against DNA templates starting with thymine residues, in contrast to blunt-end adapter ligation. We observe the same bias on fresh DNA extracts sheared on Bioruptor, Covaris and nebulizers. This contradicts previous reports suggesting that this bias could originate from the methods used for shearing DNA. This also suggests that AT-overhang adapter ligation efficiency is affected in a sequence-dependent manner and results in an uneven representation of different genomic contexts. We then show how this bias could affect the base composition of ancient DNA libraries prepared following AT-overhang ligation, mainly by limiting the ability to ligate DNA templates starting with thymines and therefore deaminated cytosines. This results in particular nucleotide misincorporation damage patterns, deviating from the signature generally expected for authenticating ancient sequence data. Consequently, we show that models adequate for estimating post-mortem DNA damage levels must be robust to the molecular tools used for building ancient DNA libraries.

  13. Self-tallying quantum anonymous voting

    Science.gov (United States)

    Wang, Qingle; Yu, Chaohua; Gao, Fei; Qi, Haoyu; Wen, Qiaoyan

    2016-08-01

    Anonymous voting is a voting method of hiding the link between a vote and a voter, the context of which ranges from governmental elections to decision making in small groups like councils and companies. In this paper, we propose a quantum anonymous voting protocol assisted by two kinds of entangled quantum states. Particularly, we provide a mechanism of opening and permuting the ordered votes of all the voters in an anonymous manner; any party who is interested in the voting results can acquire a permutation copy and then obtains the voting result through a simple calculation. Unlike all previous quantum works on anonymous voting, our quantum anonymous protocol possesses the properties of privacy, self-tallying, nonreusability, verifiability, and fairness at the same time. In addition, we demonstrate that the entanglement of the quantum states used in our protocol makes an attack from an outside eavesdropper and inside dishonest voters impossible. We also generalize our protocol to execute the task of anonymous multiparty computation, such as anonymous broadcast and anonymous ranking.

  14. Anonymity in Classroom Voting and Debating

    Science.gov (United States)

    Ainsworth, Shaaron; Gelmini-Hornsby, Giulia; Threapleton, Kate; Crook, Charles; O'Malley, Claire; Buda, Marie

    2011-01-01

    The advent of networked environments into the classroom is changing classroom debates in many ways. This article addresses one key attribute of these environments, namely anonymity, to explore its consequences for co-present adolescents anonymous, by virtue of the computer system, to peers not to teachers. Three studies with 16-17 year-olds used a…

  15. Is it OK to be an Anonymous?

    NARCIS (Netherlands)

    Serracino Inglott, P.

    2013-01-01

    Do the deviant acts carried out by the collective known as Anonymous qualify as vigilante activity, and if so, can they be justified? Addressing this question helps expose the difficulties of morally evaluating technologically enabled deviance. Anonymous is a complex, fluid actor but not as mysterio

  16. [Sequence of the ITS region of nuclear ribosomal DNA(nrDNA) in Xinjiang wild Dianthus and its phylogenetic relationship].

    Science.gov (United States)

    Zhang, Lu; Cai, You-Ming; Zhuge, Qiang; Zou, Hui-Yu; Huang, Min-Ren

    2002-06-01

    Xinjiang is a center of distribution and differentiation of the genus Dianthus in China and is rich in species resources. The sequences of the ITS region (including ITS-1, 5.8S rDNA and ITS-2) of nuclear ribosomal DNA from 8 species of Dianthus growing wild in Xinjiang were determined by direct sequencing of PCR products. The ITS of Dianthus ranges from 617 to 621 bp in size, a length variation of only 4 bp. Sequence homology is very high between species (97.6%-99.8%) and about 80% between Dianthus and the outgroup, so the ITS sequences in the genus are relatively conserved. In general, there are more conversions than transitions among the variable sites in Dianthus; the conversion rates are relatively high, with conversion/transition ratios of 1.0-3.0. On the basis of phylogenetic analysis of the nucleotide sequences, the species of Dianthus in China can be divided into three sections. Sect. Barbulatum Williams is distantly related to both sect. Dianthus and sect. Fimbriatum Williams, whereas sect. Dianthus and sect. Fimbriatum Williams are closely related. The ITS phylogenetic tree indicates that sect. Dianthus originated earlier than sect. Fimbriatum Williams and sect. Barbulatum Williams.

  17. Sequencing cDNAs: An Introduction to DNA Sequence Analysis in the Undergraduate Molecular Genetics Course.

    Science.gov (United States)

    Galewsky, Samuel

    2000-01-01

    Introduces a series of molecular genetics laboratories where students pick a single colony from a Drosophila melanogaster embryo cDNA library and purify the plasmid, then analyze the insert through restriction digests and gel electrophoresis. (Author/YDS)

  18. Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

    Science.gov (United States)

    Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

    2016-08-01

    Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. PMID:27237775

  19. MAGIC-SPP: a database-driven DNA sequence processing package with associated management tools

    Directory of Open Access Journals (Sweden)

    Qu Junfeng

    2006-03-01

    Background: Processing raw DNA sequence data is an especially challenging task for relatively small laboratories and core facilities that produce as many as 5000 or more DNA sequences per week from multiple projects in widely differing species. To meet this challenge, we have developed the flexible, scalable, and automated sequence processing package described here. Results: MAGIC-SPP is a DNA sequence processing package consisting of an Oracle 9i relational database, a Perl pipeline, and user interfaces implemented either as JavaServer Pages (JSP) or as a Java graphical user interface (GUI). The database not only serves as a data repository, but also controls processing of trace files. MAGIC-SPP includes an administrative interface, a laboratory information management system, and interfaces for exploring sequences, monitoring quality control, and troubleshooting problems related to sequencing activities. In the sequence trimming algorithm it employs new features designed to improve performance with respect to concerns such as concatenated linkers, identification of the expected start position of a vector insert, and extending the useful length of trimmed sequences by bridging short regions of low quality when the following high quality segment is sufficiently long to justify doing so. Conclusion: MAGIC-SPP has been designed to minimize human error, while simultaneously being robust, versatile, flexible and automated. It offers a unique combination of features that permit administration by a biologist with little or no informatics background. It is well suited to both individual research programs and core facilities.
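
    The bridging idea mentioned in the trimming description can be illustrated with a minimal sketch. This is not MAGIC-SPP's algorithm (which is implemented in a Perl pipeline and is not specified in the abstract); the function names, the quality threshold, the maximum gap length and the minimum length of the following segment are all hypothetical parameters chosen for illustration.

```python
from typing import List, Tuple


def high_quality_runs(quals: List[int], threshold: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals of consecutive bases with quality >= threshold."""
    runs = []
    start = None
    for i, q in enumerate(quals):
        if q >= threshold:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(quals)))
    return runs


def trim_with_bridging(quals: List[int], threshold: int = 20,
                       max_gap: int = 5, min_next: int = 30) -> Tuple[int, int]:
    """Return the longest trimmed region as a half-open (start, end) interval.

    A low-quality gap is bridged only when it is at most max_gap bases long
    AND the high-quality run that follows it is at least min_next bases long.
    All thresholds are illustrative assumptions, not values from MAGIC-SPP.
    """
    runs = high_quality_runs(quals, threshold)
    if not runs:
        return (0, 0)
    best = (0, 0)
    cur_start, cur_end = runs[0]
    for nxt_start, nxt_end in runs[1:]:
        gap = nxt_start - cur_end
        if gap <= max_gap and (nxt_end - nxt_start) >= min_next:
            cur_end = nxt_end  # bridge the short low-quality region
        else:
            if cur_end - cur_start > best[1] - best[0]:
                best = (cur_start, cur_end)
            cur_start, cur_end = nxt_start, nxt_end
    if cur_end - cur_start > best[1] - best[0]:
        best = (cur_start, cur_end)
    return best
```

    With these defaults, a 3-base dip in quality followed by 35 good bases is bridged, whereas the same dip followed by only 10 good bases is not, so the trimmed read ends before the dip.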

  20. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    Energy Technology Data Exchange (ETDEWEB)

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S. [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa); Arbuthnot, Patrick, E-mail: Patrick.Arbuthnot@wits.ac.za [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa)

    2009-11-20

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  1. CLONING AND EXPRESSION OF A cDNA SEQUENCE FOR HUMAN THIOREDOXIN

    Institute of Scientific and Technical Information of China (English)

    Liu Qingyong(刘庆勇); Ruan Xiyun(阮喜云); Liu Xiaogong(刘效恭); Ji Zongzheng(纪宗正); Dang Jiangong; Nan Xunyi(南勋义); Wang Quanying(王全颖); Yang Guangxiao(杨广笑)

    2003-01-01

    Objective: To clone a cDNA segment for human thioredoxin (hTRX), determine its sequence, and express it. Methods: The thioredoxin cDNA segment was amplified by RT-PCR from 143 (TK-) human osteosarcoma cells. The amplified products were cloned into the pGEM-T Easy vector and sequenced. The expression vector pBV220-hTRX was then constructed and transformed into E. coli strain DH5α for hTRX expression. The hTRX was purified on a DEAE-Sephadex A-50 column, and the activity of the recombinant hTRX was determined by the insulin disulfide reduction assay. Results: Comparison of the cloned cDNA sequence with the reported hTRX sequence (GenBank J04026) revealed two differences, at bp 180 and bp 284, each of which altered the encoded amino acid, but the sequence motif was identical to that of the reported hTRX. The recombinant hTRX catalyzed insulin reduction by DTT. Conclusion: The successful cloning and expression of hTRX cDNA forms a basis for further study of the biological functions and utilization of hTRX.

  2. Sequence homology at the breakpoint and clinical phenotype of mitochondrial DNA deletion syndromes.

    Directory of Open Access Journals (Sweden)

    Bekim Sadikovic

    Mitochondrial DNA (mtDNA) deletions are a common cause of mitochondrial disorders. Large mtDNA deletions can lead to a broad spectrum of clinical features with different age of onset, ranging from mild mitochondrial myopathies (MM), progressive external ophthalmoplegia (PEO), and Kearns-Sayre syndrome (KSS), to severe Pearson syndrome. The aim of this study is to investigate the molecular signatures surrounding the deletion breakpoints and their association with the clinical phenotype and age at onset. MtDNA deletions in 67 patients were characterized using array comparative genomic hybridization (aCGH) followed by PCR-sequencing of the deletion junctions. Sequence homology including both perfect and imperfect short repeats flanking the deletion regions were analyzed and correlated with clinical features and patients' age group. In all age groups, there was a significant increase in sequence homology flanking the deletion compared to mtDNA background. The youngest patient group (<6 years old) showed a diffused pattern of deletion distribution in size and locations, with a significantly lower sequence homology flanking the deletion, and the highest percentage of deletion mutant heteroplasmy. The older age groups showed rather discrete pattern of deletions with 44% of all patients over 6 years old carrying the most common 5 kb mtDNA deletion, which was found mostly in muscle specimens (22/41). Only 15% (3/20) of the young patients (<6 years old) carry the 5 kb common deletion, which is usually present in blood rather than muscle. This group of patients predominantly (16 out of 17) exhibit multisystem disorder and/or Pearson syndrome, while older patients had predominantly neuromuscular manifestations including KSS, PEO, and MM. In conclusion, sequence homology at the deletion flanking regions is a consistent feature of mtDNA deletions. Decreased levels of sequence homology and increased levels of deletion mutant heteroplasmy appear to correlate with earlier

  3. Noninvasive detection of fetal subchromosomal abnormalities by semiconductor sequencing of maternal plasma DNA.

    Science.gov (United States)

    Yin, Ai-hua; Peng, Chun-fang; Zhao, Xin; Caughey, Bennett A; Yang, Jie-xia; Liu, Jian; Huang, Wei-wei; Liu, Chang; Luo, Dong-hong; Liu, Hai-liang; Chen, Yang-yi; Wu, Jing; Hou, Rui; Zhang, Mindy; Ai, Michael; Zheng, Lianghong; Xue, Rachel Q; Mai, Ming-qin; Guo, Fang-fang; Qi, Yi-ming; Wang, Dong-mei; Krawczyk, Michal; Zhang, Daniel; Wang, Yu-nan; Huang, Quan-fei; Karin, Michael; Zhang, Kang

    2015-11-24

    Noninvasive prenatal testing (NIPT) using sequencing of fetal cell-free DNA from maternal plasma has enabled accurate prenatal diagnosis of aneuploidy and become increasingly accepted in clinical practice. We investigated whether NIPT using semiconductor sequencing platform (SSP) could reliably detect subchromosomal deletions/duplications in women carrying high-risk fetuses. We first showed that increasing concentration of abnormal DNA and sequencing depth improved detection. Subsequently, we analyzed plasma from 1,456 pregnant women to develop a method for estimating fetal DNA concentration based on the size distribution of DNA fragments. Finally, we collected plasma from 1,476 pregnant women with fetal structural abnormalities detected on ultrasound who also underwent an invasive diagnostic procedure. We used SSP of maternal plasma DNA to detect subchromosomal abnormalities and validated our results with array comparative genomic hybridization (aCGH). With 3.5 million reads, SSP detected 56 of 78 (71.8%) subchromosomal abnormalities detected by aCGH. With increased sequencing depth up to 10 million reads and restriction of the size of abnormalities to more than 1 Mb, sensitivity improved to 69 of 73 (94.5%). Of 55 false-positive samples, 35 were caused by deletions/duplications present in maternal DNA, indicating the necessity of a validation test to exclude maternal karyotype abnormalities. This study shows that detection of fetal subchromosomal abnormalities is a viable extension of NIPT based on SSP. Although we focused on the application of cell-free DNA sequencing for NIPT, we believe that this method has broader applications for genetic diagnosis, such as analysis of circulating tumor DNA for detection of cancer.
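
    The sensitivity figures quoted above follow directly from the reported counts, taking aCGH as the reference standard. The short sketch below only reproduces that arithmetic; the function name and labels are illustrative and not from the study.

```python
def sensitivity(detected: int, total_true: int) -> float:
    """Fraction of reference-standard (aCGH) abnormalities that were detected."""
    if total_true <= 0:
        raise ValueError("total_true must be positive")
    return detected / total_true


# Counts taken from the abstract: (label, detected by SSP, detected by aCGH).
reported = [
    ("3.5 million reads, all abnormalities", 56, 78),
    ("up to 10 million reads, abnormalities > 1 Mb", 69, 73),
]

for label, detected, total in reported:
    print(f"{label}: {detected}/{total} = {sensitivity(detected, total):.1%}")
# 3.5 million reads, all abnormalities: 56/78 = 71.8%
# up to 10 million reads, abnormalities > 1 Mb: 69/73 = 94.5%
```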

  4. The finished DNA sequence of human chromosome 12.

    Science.gov (United States)

    Scherer, Steven E; Muzny, Donna M; Buhay, Christian J; Chen, Rui; Cree, Andrew; Ding, Yan; Dugan-Rocha, Shannon; Gill, Rachel; Gunaratne, Preethi; Harris, R Alan; Hawes, Alicia C; Hernandez, Judith; Hodgson, Anne V; Hume, Jennifer; Jackson, Andrew; Khan, Ziad Mohid; Kovar-Smith, Christie; Lewis, Lora R; Lozado, Ryan J; Metzker, Michael L; Milosavljevic, Aleksandar; Miner, George R; Montgomery, Kate T; Morgan, Margaret B; Nazareth, Lynne V; Scott, Graham; Sodergren, Erica; Song, Xing-Zhi; Steffen, David; Lovering, Ruth C; Wheeler, David A; Worley, Kim C; Yuan, Yi; Zhang, Zhengdong; Adams, Charles Q; Ansari-Lari, M Ali; Ayele, Mulu; Brown, Mary J; Chen, Guan; Chen, Zhijian; Clerc-Blankenburg, Kerstin P; Davis, Clay; Delgado, Oliver; Dinh, Huyen H; Draper, Heather; Gonzalez-Garay, Manuel L; Havlak, Paul; Jackson, Laronda R; Jacob, Leni S; Kelly, Susan H; Li, Li; Li, Zhangwan; Liu, Jing; Liu, Wen; Lu, Jing; Maheshwari, Manjula; Nguyen, Bao-Viet; Okwuonu, Geoffrey O; Pasternak, Shiran; Perez, Lesette M; Plopper, Farah J H; Santibanez, Jireh; Shen, Hua; Tabor, Paul E; Verduzco, Daniel; Waldron, Lenee; Wang, Qiaoyan; Williams, Gabrielle A; Zhang, Jingkun; Zhou, Jianling; Allen, Carlana C; Amin, Anita G; Anyalebechi, Vivian; Bailey, Michael; Barbaria, Joseph A; Bimage, Kesha E; Bryant, Nathaniel P; Burch, Paula E; Burkett, Carrie E; Burrell, Kevin L; Calderon, Eliana; Cardenas, Veronica; Carter, Kelvin; Casias, Kristal; Cavazos, Iracema; Cavazos, Sandra R; Ceasar, Heather; Chacko, Joseph; Chan, Sheryl N; Chavez, Dean; Christopoulos, Constantine; Chu, Joseph; Cockrell, Raynard; Cox, Caroline D; Dang, Michelle; Dathorne, Stephanie R; David, Robert; Davis, Candi Mon'Et; Davy-Carroll, Latarsha; Deshazo, Denise R; Donlin, Jeremy E; D'Souza, Lisa; Eaves, Kristy A; Egan, Amy; Emery-Cohen, Alexandra J; Escotto, Michael; Flagg, Nicole; Forbes, Lisa D; Gabisi, Abdul M; Garza, Melissa; Hamilton, Cerissa; Henderson, Nicholas; Hernandez, Omar; Hines, Sandra; Hogues, Marilyn E; 
Huang, Mei; Idlebird, DeVincent G; Johnson, Rudy; Jolivet, Angela; Jones, Sally; Kagan, Ryan; King, Laquisha M; Leal, Belita; Lebow, Heather; Lee, Sandra; LeVan, Jaclyn M; Lewis, Lakeshia C; London, Pamela; Lorensuhewa, Lorna M; Loulseged, Hermela; Lovett, Demetria A; Lucier, Alice; Lucier, Raymond L; Ma, Jie; Madu, Renita C; Mapua, Patricia; Martindale, Ashley D; Martinez, Evangelina; Massey, Elizabeth; Mawhiney, Samantha; Meador, Michael G; Mendez, Sylvia; Mercado, Christian; Mercado, Iracema C; Merritt, Christina E; Miner, Zachary L; Minja, Emmanuel; Mitchell, Teresa; Mohabbat, Farida; Mohabbat, Khatera; Montgomery, Baize; Moore, Niki; Morris, Sidney; Munidasa, Mala; Ngo, Robin N; Nguyen, Ngoc B; Nickerson, Elizabeth; Nwaokelemeh, Ogechi O; Nwokenkwo, Stanley; Obregon, Melissa; Oguh, Maryann; Oragunye, Njideka; Oviedo, Rodolfo J; Parish, Bridgette J; Parker, David N; Parrish, Julia; Parks, Kenya L; Paul, Heidie A; Payton, Brett A; Perez, Agapito; Perrin, William; Pickens, Adam; Primus, Eltrick L; Pu, Ling-Ling; Puazo, Maria; Quiles, Miyo M; Quiroz, Juana B; Rabata, Dina; Reeves, Kacy; Ruiz, San Juana; Shao, Hongmei; Sisson, Ida; Sonaike, Titilola; Sorelle, Richard P; Sutton, Angelica E; Svatek, Amanda F; Svetz, Leah Anne; Tamerisa, Kavitha S; Taylor, Tineace R; Teague, Brian; Thomas, Nicole; Thorn, Rachel D; Trejos, Zulma Y; Trevino, Brenda K; Ukegbu, Ogechi N; Urban, Jeremy B; Vasquez, Lydia I; Vera, Virginia A; Villasana, Donna M; Wang, Ling; Ward-Moore, Stephanie; Warren, James T; Wei, Xuehong; White, Flower; Williamson, Angela L; Wleczyk, Regina; Wooden, Hailey S; Wooden, Steven H; Yen, Jennifer; Yoon, Lillienne; Yoon, Vivienne; Zorrilla, Sara E; Nelson, David; Kucherlapati, Raju; Weinstock, George; Gibbs, Richard A

    2006-03-16

    Human chromosome 12 contains more than 1,400 coding genes and 487 loci that have been directly implicated in human disease. The q arm of chromosome 12 contains one of the largest blocks of linkage disequilibrium found in the human genome. Here we present the finished sequence of human chromosome 12, which has been finished to high quality and spans approximately 132 megabases, representing approximately 4.5% of the human genome. Alignment of the human chromosome 12 sequence across vertebrates reveals the origin of individual segments in chicken, and a unique history of rearrangement through rodent and primate lineages. The rate of base substitutions in recent evolutionary history shows an overall slowing in hominids compared with primates and rodents.

  5. Towards a Theory of Anonymous Networking

    CERN Document Server

    Ghaderi, J

    2009-01-01

    The problem of anonymous networking when an eavesdropper observes packet timings in a communication network is considered. The goal is to hide the identities of source-destination nodes, and paths of information flow in the network. One way to achieve such an anonymity is to use mixers. Mixers are nodes that receive packets from multiple sources and change the timing of packets, by mixing packets at the output links, to prevent the eavesdropper from finding sources of outgoing packets. In this paper, we consider two simple but fundamental scenarios: double input-single output mixer and double input-double output mixer. For the first case, we use the information-theoretic definition of the anonymity, based on average entropy per packet, and find an optimal mixing strategy under a strict latency constraint. For the second case, perfect anonymity is considered, and a maximal throughput strategy with perfect anonymity is found that minimizes the average delay.

  6. High penetrance of sequencing errors and interpretative shortcomings in mtDNA sequence analysis of LHON patients.

    Science.gov (United States)

    Bandelt, Hans-Jürgen; Yao, Yong-Gang; Salas, Antonio; Kivisild, Toomas; Bravi, Claudio M

    2007-01-12

    For identifying mutation(s) that are potentially pathogenic it is essential to determine the entire mitochondrial DNA (mtDNA) sequences from patients suffering from a particular mitochondrial disease, such as Leber hereditary optic neuropathy (LHON). However, such sequencing efforts can, in the worst case, be riddled with errors by imposing phantom mutations or misreporting variant nucleotides, and moreover, by inadvertently regarding some mutations as novel and pathogenic, which are actually known to define minor haplogroups. Under such circumstances it remains unclear whether the disease-associated mutations would have been determined adequately. Here, we re-analyse four problematic LHON studies and propose guidelines by which some of the pitfalls could be avoided.

  7. Structural basis for sequence-specific recognition of DNA by TAL effectors

    KAUST Repository

    Deng, Dong

    2012-01-05

    TAL (transcription activator-like) effectors, secreted by phytopathogenic bacteria, recognize host DNA sequences through a central domain of tandem repeats. Each repeat comprises 33 to 35 conserved amino acids and targets a specific base pair by using two hypervariable residues [known as repeat variable diresidues (RVDs)] at positions 12 and 13. Here, we report the crystal structures of an 11.5-repeat TAL effector in both DNA-free and DNA-bound states. Each TAL repeat comprises two helices connected by a short RVD-containing loop. The 11.5 repeats form a right-handed, superhelical structure that tracks along the sense strand of DNA duplex, with RVDs contacting the major groove. The 12th residue stabilizes the RVD loop, whereas the 13th residue makes a base-specific contact. Understanding DNA recognition by TAL effectors may facilitate rational design of DNA-binding proteins with biotechnological applications.
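
    Because each repeat targets one base through its RVD, a repeat array can be read off as a predicted target sequence. The sketch below assumes the commonly cited RVD code (NI for A, HD for C, NG for T, NN for G), which is not given in this abstract; NN is also reported to bind A, so the single-letter mapping is a simplification, and the names used here are hypothetical.

```python
# Assumed, simplified RVD-to-base code (not stated in the abstract above).
RVD_TO_BASE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "G",  # simplification: NN is also reported to recognize A
}


def predict_target(rvds):
    """Map an ordered list of RVDs (residues 12 and 13 of each repeat) to a DNA string.

    RVDs missing from the table are reported as 'N' (unknown base).
    """
    return "".join(RVD_TO_BASE.get(rvd.upper(), "N") for rvd in rvds)


print(predict_target(["NI", "HD", "NG", "NN"]))  # ACTG
```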

  8. Electrochemical direct immobilization of DNA sequences for label-free herpes virus detection

    International Nuclear Information System (INIS)

    DNA sequences/bio-macromolecules of herpes virus (5'-AT CAC CGA CCC GGA GAG GGA C-3') were directly immobilized into a polypyrrole (PPy) matrix by cyclic voltammetry and grafted onto arrays of interdigitated platinum microelectrodes. The surface morphology of the resulting PPy/DNA composite films was investigated with a Hitachi S-4800 FESEM. Fourier transform infrared spectroscopy (FTIR) was used to characterize the PPy/DNA film and to study the specific interactions that may exist between the DNA biomacromolecules and the PPy chains. Attempts to use these PPy/DNA composite films for label-free herpes virus detection revealed a response time of 60 s in solutions containing as little as 2 nM DNA, and a shelf life of six months when immersed in double-distilled water and kept refrigerated.

  9. Mechanism of sequence-specific template binding by the DNA primase of bacteriophage T7

    KAUST Repository

    Lee, Seung-Joo

    2010-03-28

    DNA primases catalyze the synthesis of the oligoribonucleotides required for the initiation of lagging strand DNA synthesis. Biochemical studies have elucidated the mechanism for the sequence-specific synthesis of primers. However, the physical interactions of the primase with the DNA template to explain the basis of specificity have not been demonstrated. Using a combination of surface plasmon resonance and biochemical assays, we show that T7 DNA primase has only a slightly higher affinity for DNA containing the primase recognition sequence (5'-TGGTC-3') than for DNA lacking the recognition site. However, this binding is drastically enhanced by the presence of the cognate nucleoside triphosphates (NTPs), adenosine triphosphate (ATP) and cytidine triphosphate (CTP), that are incorporated into the primer, pppACCA. Formation of the dimer, pppAC, the initial step of sequence-specific primer synthesis, is not sufficient for the stable binding. Preformed primers exhibit significantly less selective binding than that observed with ATP and CTP. Alterations in subdomains of the primase result in loss of selective DNA binding. We present a model in which conformational changes induced during primer synthesis facilitate contact between the zinc-binding domain and the polymerase domain. The Author(s) 2010. Published by Oxford University Press.
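
    As a small illustration of what "DNA containing the recognition sequence" means in practice, the sketch below scans a template for 5'-TGGTC-3' on both strands. It is a plain string search under the assumption of an unambiguous A/C/G/T alphabet; the function names are hypothetical and this is not part of the study's methods.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq: str, site: str = "TGGTC"):
    """Return sorted (position, strand) pairs for every occurrence of site.

    Positions are 0-based on the given (top) strand; '-' hits are where the
    reverse complement of the site appears on the top strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", site.upper()), ("-", reverse_complement(site))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)


print(find_sites("AATGGTCAAGACCATT"))  # [(2, '+'), (9, '-')]
```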

  10. Sequence-specific interactions of drugs interfering with the topoisomerase-DNA cleavage complex.

    Science.gov (United States)

    Palumbo, Manlio; Gatto, Barbara; Moro, Stefano; Sissi, Claudia; Zagotto, Giuseppe

    2002-07-18

    DNA-processing enzymes, such as the topoisomerases (tops), represent major targets for potent anticancer (and antibacterial) agents. The drugs kill cells by poisoning the enzymes' catalytic cycle. Understanding the molecular details of top poisoning is a fundamental requisite for the rational development of novel, more effective antineoplastic drugs. In this connection, sequence-specific recognition of the top-DNA complex is a key step to preferentially direct the action of the drugs onto selected genomic sequences. In fact, the (reversible) interference of drugs with the top-DNA complex exhibits well-defined preferences for DNA bases in the proximity of the cleavage site, each drug showing peculiarities connected to its structural features. A second level of selectivity can be observed when chemically reactive groups are present in the structure of the top-directed drug. In this case, the enzyme recognizes or generates a unique site for covalent drug-DNA binding. This will further subtly modulate the drug's efficiency in stimulating DNA damage at selected sites. Finally, drugs can discriminate not only among different types of tops, but also among different isoenzymes, providing an additional level of specific selection. Once the molecular basis for DNA sequence-dependent recognition has been established, the above-mentioned modes to generate selectivity in drug poisoning can be rationally exploited, alone or in combination, to develop tailor-made drugs targeted at defined loci in cancer cells. PMID:12084456

  11. Cloning and sequencing of Octopus dofleini hemocyanin cDNA: derived sequences of functional units Ode and Odf.

    Science.gov (United States)

    Lang, W H; van Holde, K E

    1991-01-01

    A number of additional cDNA clones coding for portions of the very large polypeptide chain of Octopus dofleini hemocyanin were isolated and sequenced. These data reveal two very similar coding sequences, which we have denoted "A-type" and "G-type." We have obtained complete A-type sequences coding for functional units Ode and Odf; consequently a total of three such unit sequences are now known from a single subunit of one molluscan hemocyanin. This presents the opportunity to make sequence comparisons within one hemocyanin subunit. Domains within one subunit show on the average 42% identity in amino acid residues; corresponding functional units from hemocyanins of different species show degrees of identity of 53-75%. Therefore, molluscan hemocyanins already existed before the individual molluscan classes diverged in the early Cambrian. Sequence comparisons of molluscan hemocyanins with arthropodan hemocyanins and tyrosinases allow us to identify the ligands of the "Copper B" site with high probability. Possible ligands for the "Copper A" site are proposed, based on sequence comparisons between molluscan hemocyanins and tyrosinases. Besides two histidine side chains, a methionine side chain might be involved in binding of Copper A, a result not in conflict with spectroscopic studies. PMID:1898774

  12. Tomato protoplast DNA transformation : physical linkage and recombination of exogenous DNA sequences

    NARCIS (Netherlands)

    Jongsma, Maarten; Koornneef, Maarten; Zabel, Pim; Hille, Jacques

    1987-01-01

    Tomato protoplasts have been transformed with plasmid DNAs containing a chimeric kanamycin resistance gene and putative tomato origins of replication. A calcium phosphate-DNA mediated transformation procedure was employed in combination with either polyethylene glycol or polyvinyl alcohol. There w

  13. DNA Sequencing by Capillary Electrophoresis Using Quasi-interpenetrating Network Formed by Polyacrylamide and Poly(N-hydroxymethylacrylamide)

    Institute of Scientific and Technical Information of China (English)

    Wen Long ZHANG; Yan Mei WANG

    2006-01-01

    A quasi-interpenetrating network (quasi-IPN) formed by polyacrylamide and poly(N-hydroxymethylacrylamide) was designed, synthesized, and tested for DNA sequencing by capillary electrophoresis. The performance of the quasi-IPN in DNA sequencing was determined by the acrylamide to N-hydroxymethylacrylamide molar ratio and the sequencing temperature.

  14. Whole genome bisulfite sequencing of cell-free DNA and its cellular contributors uncovers placenta hypomethylated domains

    OpenAIRE

    Jensen, Taylor J.; Kim, Sung K; Zhu, Zhanyang; Chin, Christine; Gebhard, Claudia; Lu, Tim; Deciu, Cosmin; Van den Boom, Dirk; Ehrich, Mathias

    2015-01-01

    Background Circulating cell-free fetal DNA has enabled non-invasive prenatal fetal aneuploidy testing without direct discrimination of the maternal and fetal DNA. Testing may be improved by specifically enriching the sample material for fetal DNA. DNA methylation may allow for such a separation of DNA; however, this depends on knowledge of the methylomes of circulating cell-free DNA and its cellular contributors. Results We perform whole genome bisulfite sequencing on a set of unmatched sampl...

  15. The Grouping of DNA Sequences Model

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    In this paper, a method for classifying DNA sequences is proposed. Statistical and optimization methods are used to build the model. The given data are analysed thoroughly to extract "critical words", key strings that represent the characteristics of each of the two groups, and a quantitative standard for grouping is derived from them. First, all strings that appear repeatedly in the known sample sequences (called words) are found by breadth-first search, and the standardized frequency and dispersion of each word are calculated. Second, the priority function for the words of each group is fitted by the least squares method, and its coefficients are optimized stepwise until they are stable. Third, the key words are selected according to the priority function and their weights are determined. Finally, the undetermined data are classified using the analytic hierarchy process, and a significance level is set. In testing, the method classified the undetermined samples (No. 21 to No. 40) fairly well and also gave good results for the remaining 182 sequences.
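    The first step described above (scanning for repeated words and computing a frequency and a dispersion for each) can be sketched roughly as follows. This is only an illustration, not the authors' model: the function name `word_stats`, the fixed word length `k`, and the two definitions used here (frequency as occurrences per scanned window, dispersion as the fraction of sequences containing the word) are assumptions, since the abstract does not define them.

```python
from collections import Counter

def word_stats(sequences, k):
    """Return {word: (frequency, dispersion)} for k-letter words seen more than once.

    frequency  = occurrences / total number of k-letter windows (assumed definition)
    dispersion = fraction of sequences containing the word      (assumed definition)
    """
    total = Counter()    # occurrences of each word across all sequences
    present = Counter()  # number of sequences in which each word occurs
    windows = 0
    for seq in sequences:
        words = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        windows += len(words)
        total.update(words)
        present.update(set(words))
    return {
        word: (count / windows, present[word] / len(sequences))
        for word, count in total.items()
        if count > 1  # keep only words that repeat
    }

# Toy input, not data from the paper.
stats = word_stats(["ACGTACGT", "TTACGTTT"], k=4)
```

    In this toy input only "ACGT" and "TACG" repeat, so only those two words are kept; a real application would scan several word lengths and feed the statistics into the later fitting steps.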

  16. Temporal stability of epigenetic markers: sequence characteristics and predictors of short-term DNA methylation variations.

    Directory of Open Access Journals (Sweden)

    Hyang-Min Byun

    BACKGROUND: DNA methylation is an epigenetic mechanism that has been increasingly investigated in observational human studies, particularly on blood leukocyte DNA. Characterizing the degree and determinants of DNA methylation stability can provide critical information for the design and conduct of human epigenetic studies. METHODS: We measured DNA methylation in 12 gene-promoter regions (APC, p16, p53, RASSF1A, CDH13, eNOS, ET-1, IFNγ, IL-6, TNFα, iNOS, and hTERT) and 2 non-long terminal repeat elements, i.e., L1 and Alu, in blood samples obtained from 63 healthy individuals at baseline (Day 1) and after three days (Day 4). DNA methylation was measured by bisulfite-PCR-Pyrosequencing. We calculated intraclass correlation coefficients (ICCs) to measure the within-individual stability of DNA methylation between Day 1 and 4, after subtracting pyrosequencing error and adjusting for multiple covariates. RESULTS: Methylation markers showed different temporal behaviors ranging from high (IL-6, ICC = 0.89) to low stability (APC, ICC = 0.08) between Day 1 and 4. Multiple sequence and marker characteristics were associated with the degree of variation. Density of CpG dinucleotides near the sequence analyzed (measured as CpG(o/e) or G+C content within ±200 bp) was positively associated with DNA methylation stability. The 3' proximity to repeat elements and range of DNA methylation on Day 1 were also positively associated with methylation stability. An inverted U-shaped correlation was observed between mean DNA methylation on Day 1 and stability. CONCLUSIONS: The degree of short-term DNA methylation stability is marker-dependent and associated with sequence characteristics and methylation levels.
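    The two sequence measures named in the results, CpG(o/e) and G+C content, can be computed as in the sketch below. It assumes the commonly used observed/expected definition, (number of CpG × sequence length) / (number of C × number of G); the abstract does not say which formula the authors applied, and the function names and the example sequence are illustrative only.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G) (assumed definition)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # ratio is undefined without both bases; report 0 by convention here
    return seq.count("CG") * len(seq) / (c * g)

def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

# Toy 8-bp window, not data from the study.
window = "CGCGAATT"
ratio = cpg_obs_exp(window)   # 2 CpG * 8 / (2 C * 2 G) = 4.0
gc = gc_content(window)       # 4 / 8 = 0.5
```

    In the study's setting the window would be the ±200 bp flanking the assayed sequence.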

  17. cDNA sequence, mRNA expression and genomic DNA of trypsinogen from the indianmeal moth, Plodia interpunctella.

    Science.gov (United States)

    Zhu, Y C; Oppert, B; Kramer, K J; McGaughey, W H; Dowdy, A K

    2000-02-01

    Trypsin-like enzymes are major insect gut enzymes that digest dietary proteins and proteolytically activate insecticidal proteins produced by the bacterium Bacillus thuringiensis (Bt). Resistance to Bt in a strain of the Indianmeal moth, Plodia interpunctella, was linked to the absence of a major trypsin-like proteinase (Oppert et al., 1997). In this study, trypsin-like proteinases, cDNA sequences, mRNA expression levels and genomic DNAs from Bt-susceptible and -resistant strains of the Indianmeal moth were compared. Proteinase activity blots of gut extracts indicated that the susceptible strain had two major trypsin-like proteinases, whereas the resistant strain had only one. Several trypsinogen-like cDNA clones were isolated and sequenced from cDNA libraries of both strains using a probe deduced from a conserved sequence for a serine proteinase active site. cDNAs of 852 nucleotides from the susceptible strain and 848 nucleotides from the resistant strain contained an open reading frame of 783 nucleotides which encoded a 261-amino acid trypsinogen-like protein. There was a single silent nucleotide difference between the two cDNAs in the open reading frame and the predicted amino acid sequence from the cDNA clones was most similar to sequences of trypsin-like proteinases from the spruce budworm, Choristoneura fumiferana, and the tobacco hornworm, Manduca sexta. The encoded protein included amino acid sequence motifs of serine proteinase active sites, conserved cysteine residues, and both zymogen activation and signal peptides. Northern blotting analysis showed no major difference between the two strains in mRNA expression in fourth-instar larvae, indicating that transcription was similar in the strains. Southern blotting analysis revealed that the restriction sites for the trypsinogen genes from the susceptible and resistant strains were different. Based on an enzyme size comparison, the cDNA isolated in this study corresponded to the gene for the smaller of two

  18. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana).

    Science.gov (United States)

    Figueirêdo, L C; Faria-Campos, A C; Astolfi-Filho, S; Azevedo, J L

    2011-01-01

    The current intense production of biological data, generated by sequencing techniques, has created an ever-growing volume of unanalyzed data. We reevaluated data produced by the guarana (Paullinia cupana) transcriptome sequencing project to identify cDNA clones with complete coding sequences (full-length clones) and complete sequences of genes of biotechnological interest, contributing to the knowledge of biological characteristics of this organism. We analyzed 15,490 ESTs of guarana in search of clones with complete coding regions. A total of 12,402 sequences were analyzed using BLAST, and 4697 full-length clones were identified, responsible for the production of 2297 different proteins. Eighty-four clones were identified as full-length for N-methyltransferase and 18 were sequenced in both directions to obtain the complete genome sequence, and confirm the search made in silico for full-length clones. Phylogenetic analyses were made with the complete genome sequences of three clones, which showed only 0.017% dissimilarity; these are phylogenetically close to the caffeine synthase of Theobroma cacao. The search for full-length clones allowed the identification of numerous clones that had the complete coding region, demonstrating this to be an efficient and useful tool in the process of biological data mining. The sequencing of the complete coding region of identified full-length clones corroborated the data from the in silico search, strengthening its efficiency and utility. PMID:21732283

  19. Electrochemical molecular beacon biosensor for sequence-specific recognition of double-stranded DNA.

    Science.gov (United States)

    Miao, Xiangmin; Guo, Xiaoting; Xiao, Zhiyou; Ling, Liansheng

    2014-09-15

    Direct recognition of double-stranded DNA (dsDNA) is crucial to disease diagnosis and gene therapy, because DNA in its natural state is double stranded. Here, a novel sensor for the sequence-specific recognition of dsDNA was developed based on the structure change of a molecular beacon (MB) modified with a ferrocene (Fc) redox probe. To construct the sensor, gold nanoparticles (AuNPs) were first electrochemically deposited onto a glassy carbon electrode (GCE) surface to immobilize thiolated MB in its folded state through Au-S bonds. Hybridization of the MB with target dsDNA induced the formation of parallel triplex DNA and opened its stem-loop structure, which moved the redox probe (Fc) away from the electrode and decreased the current signal. Under optimal conditions, dsDNA could be detected in the range from 350 pM to 25 nM, with a detection limit of 275 pM. Moreover, the proposed method shows good sequence specificity for target dsDNA compared with sequences containing one or two base pair mismatches.

  20. Determining physical constraints in transcriptional initiation complexes using DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Shultzaberger, Ryan K.; Chiang, Derek Y.; Moses, Alan M.; Eisen, Michael B.

    2007-07-01

    Eukaryotic gene expression is often under the control of cooperatively acting transcription factors whose binding is limited by structural constraints. By determining these structural constraints, we can understand the "rules" that define functional cooperativity. Conversely, by understanding the rules of binding, we can infer structural characteristics. We have developed an information theory based method for approximating the physical limitations of cooperative interactions by comparing sequence analysis to microarray expression data. When applied to the coordinated binding of the sulfur amino acid regulatory protein Met4 by Cbf1 and Met31, we were able to create a combinatorial model that can correctly identify Met4 regulated genes.

  1. Amplification of a transcriptionally active DNA sequence in the human brain

    International Nuclear Information System (INIS)

    The authors present their findings of tissue-specific amplification of a DNA fragment actively transcribed in the human brain. This genome fragment was found in a cDNA library of the human brain and evidently belongs to a new class of moderately repetitive DNA with an unstable copy number in the human genome. The authors isolated total cell RNA from various human tissues (brain, placenta) and rat tissues (brain, liver) by hot phenol extraction with guanidine thiocyanate. The poly(A+) RNA fraction was isolated by chromatography. cDNA was synthesized on a template of human brain poly(A+) RNA. The cDNA obtained was cloned in plasmid pBR322 at the PstI site using (dC/dG) sequences synthesized on the 3' ends of the vector molecule and cDNA respectively. In cloning 75 ng cDNA, the authors obtained approximately 10^5 recombinants. This library was analyzed by the hybridization method on columns with two radioactive (32P) probes: the total cDNA preparation and the total nuclear DNA from the human brain. The number of copies of the cloned DNA fragment in the genome was determined by dot hybridization. Restriction fragments of human and rat genomic DNA homologous to the cloned cDNA were identified on autoradiographs. In each case, 10 micrograms of EcoRI DNA hydrolyzate was fractionated in 1% agarose gel. The probe was also hybridized with RNA samples fractionated in agarose gel with formaldehyde and transferred to a nitrocellulose filter under weak vacuum. The filter was hybridized with 0.1 micrograms DNA pAG 02, labeled with (32P) to a specific activity of 0.5-1 x 10^9 counts/min x microgram. The autoradiograph was exposed with amplifying screens at -70 °C for 2 days.

  2. Sequence Effect on the Topology of 3 + 1 Interlocked Bimolecular DNA G-Quadruplexes.

    Science.gov (United States)

    Gao, Shang; Cao, Yanwei; Yan, Yuting; Guo, Xinhua

    2016-05-17

    Electrospray ionization mass spectrometry (ESI-MS) combined with fluorescence, circular dichroism, UV spectrophotometry, and native polyacrylamide gel electrophoresis is used to study structural features of interlocked dimers formed by DNA sequence 93del (GGGGTGGGAGGAGGGT) and its derivatives. Herein, we demonstrate that the interlocked dimers can be distinguished from stacked dimers formed by sequences T30923 (GGGTGGGTGGGTGGGT) and T30177 (GTGGTGGGTGGGTGGGT). In addition, loop length, the base at the 5'-end, and the isolation of T and TT to the first 4G tract do significantly influence the formation and topologies of interlocked dimers. Furthermore, our results suggest that the 4G tract and the 2G tract in various locations in the 93del derivative sequence can form interlocked structure. This work not only provides new insight into the assembly of 3 + 1 interlocked DNA conformations but also demonstrates that ESI-MS combined with other analytical methods is rapid and useful for DNA structural studies. PMID:27027538
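    To make the "4G tract" and "2G tract" terminology concrete, the short sketch below lists the lengths of consecutive G runs in the three sequences quoted in the abstract. The helper name `g_tracts` is illustrative; the code only inspects the primary sequence and says nothing about which topology actually forms.

```python
import re

def g_tracts(seq):
    """Return the lengths of consecutive runs of G, in 5'-to-3' order."""
    return [len(run) for run in re.findall(r"G+", seq.upper())]

# Sequences as given in the abstract.
SEQUENCES = {
    "93del": "GGGGTGGGAGGAGGGT",
    "T30923": "GGGTGGGTGGGTGGGT",
    "T30177": "GTGGTGGGTGGGTGGGT",
}

tracts = {name: g_tracts(seq) for name, seq in SEQUENCES.items()}
# 93del  -> [4, 3, 2, 3]   (one 4G tract and one 2G tract)
# T30923 -> [3, 3, 3, 3]
# T30177 -> [1, 2, 3, 3, 3]
```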

  3. More of an Art than a Science: Using Microbial DNA Sequences to Compose Music

    Directory of Open Access Journals (Sweden)

    Peter E. Larsen

    2015-12-01

    Bacteria are everywhere. Microbial ecology is emerging as a critical field for understanding the relationships between these ubiquitous bacterial communities, the environment, and human health. Next generation DNA sequencing technology provides us a powerful tool to indirectly observe the communities by sequencing and analyzing all of the bacterial DNA present in an environment. The results of the DNA sequencing experiments can generate gigabytes to terabytes of information, however, making it difficult for the citizen scientist to grasp and the educator to convey this data. Here, we present a method for interpreting massive amounts of microbial ecology data as musical performances, easily generated on any computer and using only commonly available or freely available software and the ‘Microbial Bebop’ algorithm. Using this approach, citizen scientists and biology educators can sonify complex data in a fun and interactive format, making it easier to communicate both the importance and the excitement of exploring the planet earth’s largest ecosystem.

  4. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    International Nuclear Information System (INIS)

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  5. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    Energy Technology Data Exchange (ETDEWEB)

    Arneodo, Alain, E-mail: alain.arneodo@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Vaillant, Cedric, E-mail: cedric.vaillant@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Audit, Benjamin, E-mail: benjamin.audit@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Argoul, Francoise, E-mail: francoise.argoul@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); D' Aubenton-Carafa, Yves, E-mail: daubenton@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France); Thermes, Claude, E-mail: claude.thermes@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France)

    2011-02-15

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  6. Population genetic structure of Indian shad, Tenualosa ilisha inferred from variation in mitochondrial DNA sequences.

    Science.gov (United States)

    Behera, B K; Singh, N S; Paria, P; Sahoo, A K; Panda, D; Meena, D K; Das, P; Pakrashi, S; Biswas, D K; Sharma, A P

    2015-09-01

    Indian shad, Tenualosa ilisha, is a commercially important anadromous fish representing a major catch in the Indo-Pacific region. The present study evaluated the partial Cytochrome b (Cyt b) gene sequence of mtDNA in T. ilisha to determine genetic variation between Bay of Bengal and Arabian Sea origins. Genomic DNA extracted from T. ilisha samples representing two distant rivers in the Indian subcontinent, the Bhagirathi (lower stretch of the Ganges) and the Tapi, was analyzed. Sequencing of a 307 bp mtDNA Cytochrome b gene fragment revealed the presence of 5 haplotypes, with high haplotype diversity (Hd) of 0.9048 with variance 0.103 and low nucleotide diversity (π) of 0.14301. Three population-specific haplotypes were observed in the river Ganga and two haplotypes in the river Tapi. A neighbour-joining tree based on Cytochrome b gene sequences of T. ilisha showed that populations from Bay of Bengal and Arabian Sea origins belonged to two distinct clusters. PMID:26521565
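    For readers unfamiliar with the haplotype diversity statistic quoted above, the sketch below uses the standard sample estimator Hd = n/(n-1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i among n sampled sequences. The abstract does not state which estimator or software was used, so this choice is an assumption. The counts are toy values, not the study's data (the abstract gives no per-haplotype counts); they were picked so that five haplotypes yield a value close to the reported 0.9048.

```python
def haplotype_diversity(counts):
    """Sample haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1 - sum_sq)

# Hypothetical sample: 7 individuals carrying 5 haplotypes.
toy_counts = [2, 2, 1, 1, 1]
hd = haplotype_diversity(toy_counts)  # 38/42, about 0.9048
```

    A value near 1 means that two randomly drawn individuals almost always carry different haplotypes.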

  7. More of an Art than a Science: Using Microbial DNA Sequences to Compose Music

    Science.gov (United States)

    Larsen, Peter E.

    2016-01-01

    Bacteria are everywhere. Microbial ecology is emerging as a critical field for understanding the relationships between these ubiquitous bacterial communities, the environment, and human health. Next generation DNA sequencing technology provides us a powerful tool to indirectly observe the communities by sequencing and analyzing all of the bacterial DNA present in an environment. The results of the DNA sequencing experiments can generate gigabytes to terabytes of information, however, making it difficult for the citizen scientist to grasp and the educator to convey this data. Here, we present a method for interpreting massive amounts of microbial ecology data as musical performances, easily generated on any computer and using only commonly available or freely available software and the ‘Microbial Bebop’ algorithm. Using this approach, citizen scientists and biology educators can sonify complex data in a fun and interactive format, making it easier to communicate both the importance and the excitement of exploring the planet earth’s largest ecosystem. PMID:27047609

  8. Gold electrode modified by self-assembled monolayers of thiols to determine DNA sequences hybridization

    Indian Academy of Sciences (India)

    Mízia M S Silva; Igor T Cavalcanti; M Fátima Barroso; M Goreti F Sales; Rosa Fireman Dutra

    2010-11-01

    The immobilization of biological molecules is one of the most important steps in the construction of a biosensor. In the case of DNA, the way it exposes its bases can bring electrochemical signals to acceptable levels. Using a self-assembled monolayer (SAM) that connects to the gold through a thiol group and binds DNA through an aldehydic ligand made it possible to determine DNA hybridization. Single-stranded DNA (ssDNA) from calf thymus was immobilized on a pre-formed alkanethiol film, which was formed by incubating a solution of 2-aminoethanethiol (Cys) followed by glutaraldehyde (Glu). Cyclic voltammetry (CV) was used to characterize the self-assembled monolayer on the gold electrode and also to study the immobilization of the ssDNA probe and its hybridization with the complementary sequence (target ssDNA). The ssDNA probe presents a well-defined oxidation peak at +0.158 V. When hybridization occurs, this peak disappears, which confirms the efficacy of the annealing and the formation of the DNA double helix without the presence of electroactive indicators. The use of the SAM resulted in a stable immobilization of the ssDNA probe, enabling hybridization detection without labels. This study represents a promising approach for molecular biosensors with sensitive and reproducible results.

  9. Recognition of DNA sequencing through binding of nucleobases to graphene

    Science.gov (United States)

    Zaffino, Valentina

    Graphene is one of the most promising materials in nanotechnology. Its large surface to volume ratio, high conductivity and electron mobility at room temperature are outstanding properties for use in DNA sensors. For this study, we used Density Functional Theory (DFT), with and without the inclusion of van der Waals (vdW) interactions, to investigate the adsorption of nucleobases (cytosine, guanine, adenine, thymine, and uracil) on pristine graphene and graphene with defects (Divacancy and Stone-Wales). We investigated the performance of two types of vdW-DF functional (optB86b-vdW and rPW86-vdW), as well as the PBE functional, and their description of the adsorption geometry and electronic structure of the nucleobase-graphene systems. The inclusion of defects results in an increase in binding energy, closer adsorption of the molecule to graphene and greater buckling in both the graphene structure and nucleobase.

  10. Obesity risk gene TMEM18 encodes a sequence-specific DNA-binding protein.

    Directory of Open Access Journals (Sweden)

    Jaana M Jurvansuu

    Transmembrane protein 18 (TMEM18) has previously been connected to cell migration and obesity. However, the molecular function of the protein has not yet been described. Here we show that TMEM18 localises to the nuclear membrane and binds to DNA in a sequence-specific manner. The protein binds DNA with its positively charged C-terminus, which also contains a nuclear localisation signal. An increase in the amount of TMEM18 in cells suppresses expression from a reporter vector with the TMEM18 target sequence. TMEM18 is a small protein of 140 residues and is predicted to be mostly alpha-helical with three transmembrane parts. As a consequence, DNA binding by TMEM18 would bring the chromatin very near to the nuclear membrane. We speculate that this close perinuclear localisation of TMEM18-bound DNA might repress transcription from it.

  11. Bisulfite sequencing reveals that Aspergillus flavus holds a hollow in DNA methylation

    DEFF Research Database (Denmark)

    Liu, Si-Yang; Lin, Jian-Qing; Wu, Hong-Long;

    2012-01-01

    Aspergillus flavus first gained scientific attention for its production of aflatoxin. The underlying regulation of aflatoxin biosynthesis has been serving as a theoretical model for biosynthesis of other microbial secondary metabolites. Nevertheless, for several decades, the DNA methylation status, one of the important epigenomic modifications involved in gene regulation, in A. flavus remains to be controversial. Here, we applied bisulfite sequencing in conjunction with a biological replicate strategy to investigate the DNA methylation profiling of A. flavus genome. Both the bisulfite sequencing data and the methylome comparisons with other fungi confirm that the DNA methylation level of this fungus is negligible. Further investigation into the DNA methyltransferase of Aspergillus uncovers its close relationship with RID-like enzymes as well as its divergence with the methyltransferase...

  12. cDNA-derived amino acid sequences of myoglobins from nine species of whales and dolphins.

    Science.gov (United States)

    Iwanami, Kentaro; Mita, Hajime; Yamamoto, Yasuhiko; Fujise, Yoshihiro; Yamada, Tadasu; Suzuki, Tomohiko

    2006-10-01

    We determined the myoglobin (Mb) cDNA sequences of nine cetaceans, of which six are the first reports of Mb sequences: sei whale (Balaenoptera borealis), Bryde's whale (Balaenoptera edeni), pygmy sperm whale (Kogia breviceps), Stejneger's beaked whale (Mesoplodon stejnegeri), Longman's beaked whale (Indopacetus pacificus), and melon-headed whale (Peponocephala electra), and three confirm the previously determined chemical amino acid sequences: sperm whale (Physeter macrocephalus), common minke whale (Balaenoptera acutorostrata) and pantropical spotted dolphin (Stenella attenuata). We found two types of Mb in the skeletal muscle of pantropical spotted dolphin: Mb I with the same amino acid sequence as that deposited in the protein database, and Mb II, which differs at two amino acid residues compared with Mb I. Using an alignment of the amino acid or cDNA sequences of cetacean Mb, we constructed a phylogenetic tree by the NJ method. Clustering of cetacean Mb amino acid and cDNA sequences essentially follows the classical taxonomy of cetaceans, suggesting that Mb sequence data is valid for classification of cetaceans at least to the family level. PMID:16962803

  13. An Effective Identification of Species from DNA Sequence: A Classification Technique by Integrating DM and ANN

    Directory of Open Access Journals (Sweden)

    Sathish Kumar S

    2012-08-01

    Species classification from DNA sequences remains an open challenge in bioinformatics, which deals with the collection, processing and analysis of DNA and proteomic sequences. Though incorporating data mining can help the process perform well, the poor definition and heterogeneous nature of gene sequences remain a barrier. In this paper, an effective classification technique to identify an organism from its gene sequence is proposed. The proposed integrated technique is mainly based on pattern mining and neural network-based classification. In pattern mining, the technique mines nucleotide patterns and their support from the selected DNA sequence. The high dimension of the mined dataset is reduced using Multilinear Principal Component Analysis (MPCA). In classification, a well-trained neural network classifies the selected gene sequence, so the organism is identified even from a part of the sequence. The proposed technique is evaluated by 10-fold cross validation, a statistical validation measure, and the obtained results support the efficacy of the technique.
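    The 10-fold cross validation mentioned above can be sketched in plain Python as follows. This shows only the splitting scheme, under the assumption of a simple shuffled split; the function name `k_fold_indices`, the seed, and the sample count are illustrative, and the abstract does not describe how the authors partitioned their data.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample is in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test

# Hypothetical dataset of 25 sequences.
splits = list(k_fold_indices(25, k=10))
```

    A classifier would be trained on each `train` list and scored on the matching `test` list, and the ten scores averaged.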

  14. DNA sequence-based analysis of the Pseudomonas species.

    Science.gov (United States)

    Mulet, Magdalena; Lalucat, Jorge; García-Valdés, Elena

    2010-06-01

    Partial sequences of four core 'housekeeping' genes (16S rRNA, gyrB, rpoB and rpoD) of the type strains of 107 Pseudomonas species were analysed in order to obtain a comprehensive view regarding the phylogenetic relationships within the Pseudomonas genus. Gene trees allowed the discrimination of two lineages or intrageneric groups (IG), called IG P. aeruginosa and IG P. fluorescens. The first, IG P. aeruginosa, was divided into three main groups, represented by the species P. aeruginosa, P. stutzeri and P. oleovorans. The second IG was divided into six groups, represented by the species P. fluorescens, P. syringae, P. lutea, P. putida, P. anguilliseptica and P. straminea. The P. fluorescens group was the most complex and included nine subgroups, represented by the species P. fluorescens, P. gessardi, P. fragi, P. mandelii, P. jesseni, P. koreensis, P. corrugata, P. chlororaphis and P. asplenii. Pseudomonas rhizospherae was affiliated with the P. fluorescens IG in the phylogenetic analysis but was independent of any group. Some species were located on phylogenetic branches that were distant from defined clusters, such as those represented by the P. oryzihabitans group and the type strains P. pachastrellae, P. pertucinogena and P. luteola. Additionally, 17 strains of P. aeruginosa, 'P. entomophila', P. fluorescens, P. putida, P. syringae and P. stutzeri, for which genome sequences have been determined, have been included to compare the results obtained in the analysis of four housekeeping genes with those obtained from whole genome analyses.

  15. Transcriptional Regulation in Mammalian Cells by Sequence-Specific DNA Binding Proteins

    Science.gov (United States)

    Mitchell, Pamela J.; Tjian, Robert

    1989-07-01

    The cloning of genes encoding mammalian DNA binding transcription factors for RNA polymerase II has provided the opportunity to analyze the structure and function of these proteins. This review summarizes recent studies that define structural domains for DNA binding and transcriptional activation functions in sequence-specific transcription factors. The mechanisms by which these factors may activate transcriptional initiation and by which they may be regulated to achieve differential gene expression are also discussed.

  16. Multiple gene sequence analysis using genes of the bacterial DNA repair pathway

    OpenAIRE

    Miguel Rotelok Neto; Carolina Weigert Galvão; Leonardo Magalhães Cruz; Dieval Guizelini; Leilane Caline Silva; Jarem Raul Garcia; Rafael Mazer Etto

    2015-01-01

    The ability to recognize and repair abnormal DNA structures is common to all forms of life. Physiological studies and genomic sequencing of a variety of bacterial species have identified an incredible diversity of DNA repair pathways. Despite the amount of available genes in public database, the usual method to place genomes in a taxonomic context is based mainly on the 16S rRNA or housekeeping genes. Thus, the relationships among genomes remain poorly understood. In this work, an approach of...

  17. Detection of specific DNA sequences by fluorescence amplification: a color complementation assay.

    OpenAIRE

    Chehab, F. F.; Kan, Y. W.

    1989-01-01

    We have developed a color complementation assay that allows rapid screening of specific genomic DNA sequences. It is based on the simultaneous amplification of two or more DNA segments with fluorescent oligonucleotide primers such that the generation of a color, or combination of colors, can be visualized and used for diagnosis. The color complementation assay obviates the need for gel electrophoresis and has been applied to the detection of a large and a small gene deletion, a chromosomal transloc...

  18. Whole genome nucleosome sequencing identifies novel types of forensic markers in degraded DNA samples

    OpenAIRE

    Chun-nan Dong; Ya-dong Yang; Shu-jin Li; Ya-ran Yang; Xiao-jing Zhang; Xiang-dong Fang; Jiang-wei Yan; Bin Cong

    2016-01-01

    In mass disasters, missing-person cases and forensic casework, highly degraded biological samples are often encountered. It can be a challenge to analyze and interpret the DNA profiles from these samples. Here we provide a new strategy to solve the problem by taking advantage of the intrinsic structural properties of DNA. We have assessed the in vivo positions of more than 35 million putative nucleosome cores in human leukocytes using high-throughput whole genome sequencing, and ident...

  19. Length Variation, Heteroplasmy and Sequence Divergence in the Mitochondrial DNA of Four Species of Sturgeon (Acipenser)

    OpenAIRE

    Brown, J. R.; Beckenbach, K.; Beckenbach, A. T.; Smith, M. J.

    1996-01-01

    The extent of mtDNA length variation and heteroplasmy as well as DNA sequences of the control region and two tRNA genes were determined for four North American sturgeon species: Acipenser transmontanus, A. medirostris, A. fulvescens and A. oxyrhynchus. Across the Continental Divide, a division in the occurrence of length variation and heteroplasmy was observed that was concordant with species biogeography as well as with phylogenies inferred from restriction fragment length polymorphisms (RFL...

  20. Biases during DNA extraction of activated sludge samples revealed by high throughput sequencing

    OpenAIRE

    Guo, Feng; Zhang, Tong

    2012-01-01

    Standardization of DNA extraction is a fundamental issue of fidelity and comparability in investigations of environmental microbial communities. Commercial kits for soil or feces are often adopted for studies of activated sludge because of a lack of specific kits, but they have never been evaluated regarding their effectiveness and potential biases based on high throughput sequencing. In this study, seven common DNA extraction kits were evaluated, based on not only yield/purity but also seque...