WorldWideScience

Sample records for dna sequence analyses

  1. Phylogenetic study on Shiraia bambusicola by rDNA sequence analyses.

    Science.gov (United States)

    Cheng, Tian-Fan; Jia, Xiao-Ming; Ma, Xiao-Hang; Lin, Hai-Ping; Zhao, Yu-Hua

    2004-01-01

    In this study, 18S rDNA and ITS-5.8S rDNA regions of four Shiraia bambusicola isolates collected from different species of bamboos were amplified by PCR with universal primer pairs NS1/NS8 and ITS5/ITS4, respectively, and sequenced. Phylogenetic analyses were conducted on three selected datasets of rDNA sequences. Maximum parsimony, distance and maximum likelihood criteria were used to infer trees. Morphological characteristics were also observed. The positioning of Shiraia in the order Pleosporales was well supported by bootstrap, which agreed with the placement by Amano (1980) according to their morphology. We did not find significant inter-hostal differences among these four isolates from different species of bamboos. From the results of analyses and comparison of their rDNA sequences, we conclude that Shiraia should be classified into Pleosporales as Amano (1980) proposed and suggest that it might be positioned in the family Phaeosphaeriaceae. Copyright 2004 WILEY-VCH Verlag GmbH & Co.

  2. Protocols for 16S rDNA Array Analyses of Microbial Communities by Sequence-Specific Labeling of DNA Probes

    Directory of Open Access Journals (Sweden)

    Knut Rudi

    2003-01-01

    Analyses of complex microbial communities are becoming increasingly important. Bottlenecks in these analyses, however, are the tools to actually describe the biodiversity. Novel protocols for DNA array-based analyses of microbial communities are presented. In these protocols, the specificity obtained by sequence-specific labeling of DNA probes is combined with the possibility of detecting several different probes simultaneously by DNA array hybridization. The gene encoding 16S ribosomal RNA was chosen as the target in these analyses. This gene contains both universally conserved regions and regions with relatively high variability. The universally conserved regions are used for PCR amplification primers, while the variable regions are used for the specific probes. Protocols are presented for DNA purification, probe construction, probe labeling, and DNA array hybridizations.

  3. SWORDS: A statistical tool for analysing large DNA sequences

    Indian Academy of Sciences (India)

    Unknown

    These techniques are based on frequency distributions of DNA words in a large sequence, and have been packaged into a software called SWORDS. Using sequences available in ... tions with the cellular processes like recombination, replication .... in DNA sequences using certain specific probability laws. (Pevzner et al ...
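
    The idea of a frequency distribution of DNA words can be illustrated with a minimal sketch. It counts overlapping words of length k and is not the SWORDS implementation; the function name, the handling of ambiguous symbols and the example sequence are assumptions made for illustration.

```python
from collections import Counter


def word_frequencies(sequence: str, k: int) -> dict[str, float]:
    """Relative frequency of every overlapping DNA word of length k."""
    seq = sequence.upper()
    words = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Assumption: words containing ambiguous symbols (e.g. N) are skipped.
    words = [w for w in words if set(w) <= set("ACGT")]
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


# Hypothetical toy sequence; real analyses would use whole chromosomes.
freqs = word_frequencies("ATGATGCATG", 3)
print(sorted(freqs.items(), key=lambda item: -item[1]))
```

    For large sequences the same counts could be accumulated in a streaming fashion rather than by building the full word list first.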

  4. Mitogenomic analyses from ancient DNA

    DEFF Research Database (Denmark)

    Paijmans, Johanna L. A.; Gilbert, Tom; Hofreiter, Michael

    2013-01-01

    The analysis of ancient DNA is playing an increasingly important role in conservation genetic, phylogenetic and population genetic analyses, as it allows incorporating extinct species into DNA sequence trees and adds time depth to population genetics studies. For many years, these types of DNA...... analyses (whether using modern or ancient DNA) were largely restricted to the analysis of short fragments of the mitochondrial genome. However, due to many technological advances during the past decade, a growing number of studies have explored the power of complete mitochondrial genome sequences...... yielded major progress with regard to both the phylogenetic positions of extinct species, as well as resolving population genetics questions in both extinct and extant species....

  5. Detecting differential DNA methylation from sequencing of bisulfite converted DNA of diverse species.

    Science.gov (United States)

    Huh, Iksoo; Wu, Xin; Park, Taesung; Yi, Soojin V

    2017-07-21

    DNA methylation is one of the most extensively studied epigenetic modifications of genomic DNA. In recent years, sequencing of bisulfite-converted DNA, particularly via next-generation sequencing technologies, has become a widely popular method to study DNA methylation. This method can be readily applied to a variety of species, dramatically expanding the scope of DNA methylation studies beyond the traditionally studied human and mouse systems. In parallel to the increasing wealth of genomic methylation profiles, many statistical tools have been developed to detect differentially methylated loci (DMLs) or differentially methylated regions (DMRs) between biological conditions. We discuss and summarize several key properties of currently available tools to detect DMLs and DMRs from sequencing of bisulfite-converted DNA. However, the majority of the statistical tools developed for DML/DMR analyses have been validated using only mammalian data sets, and less priority has been placed on the analyses of invertebrate or plant DNA methylation data. We demonstrate that genomic methylation profiles of non-mammalian species are often highly distinct from those of mammalian species using examples of honey bees and humans. We then discuss how such differences in data properties may affect statistical analyses. Based on these differences, we provide three specific recommendations to improve the power and accuracy of DML and DMR analyses of invertebrate data when using currently available statistical tools. These considerations should facilitate systematic and robust analyses of DNA methylation from diverse species, thus advancing our understanding of DNA methylation. © The Author 2017. Published by Oxford University Press.

  6. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT), the bionic wavelet transform (BWT), for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA sequences. Results show that BWT performs better than traditional WT in presenting greater energy distribution. This new BWT method should be useful for the detection of the latent structural features in future DNA sequence analysis.
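
    The separation of a symbolic sequence into four indicator channels can be sketched as follows. The weights in the second function are fixed, hypothetical values chosen for illustration; the adaptive mapping and the wavelet transform described in the abstract are not reproduced here.

```python
def indicator_sequences(sequence: str) -> dict[str, list[int]]:
    """One binary channel per base: 1 where the base occurs, else 0."""
    seq = sequence.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


def weighted_signal(sequence: str, weights: dict[str, float]) -> list[float]:
    """Combine the four channels into one numeric signal.

    Assumption: `weights` is a fixed per-base weight; the paper instead
    adapts the weights from structural features of the sequence.
    """
    channels = indicator_sequences(sequence)
    return [
        sum(weights[b] * channels[b][i] for b in "ACGT")
        for i in range(len(sequence))
    ]


channels = indicator_sequences("ACGT")
signal = weighted_signal("AAC", {"A": 1.0, "C": 2.0, "G": 0.0, "T": 0.0})
```

    A numeric signal of this kind is what a wavelet or Fourier transform would then take as input.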

  7. Ancient DNA analyses of museum specimens from selected Presbytis (primate: Colobinae) based on partial Cyt b sequences

    Science.gov (United States)

    Aifat, N. R.; Yaakop, S.; Md-Zain, B. M.

    2016-11-01

    The IUCN Red List of Threatened Species categorizes Malaysian primates in classes ranging from data deficient to critically endangered. Thus, ancient DNA analyses hold great potential for understanding the phylogeny, phylogeography and population history of extinct and extant species. Museum samples are an alternative source of biological material for a large proportion of ancient DNA studies. In this study, DNA was extracted from a total of six museum skin samples of Presbytis hosei (4 samples) and Presbytis frontata (2 samples), aged between 43 and 124 years. Extraction was done using the QIAGEN QIAamp DNA Investigator Kit, and the ability of this kit to extract DNA from museum skin samples was tested by amplifying a partial Cyt b sequence with specifically designed species-specific primers. Two primer pairs were designed, one for P. hosei and one for P. frontata. These primer pairs proved efficient in amplifying 200 bp from the targeted species under the optimized PCR conditions. The sequences were then used to determine genetic distances within the genus Presbytis in Malaysia. From the analyses, P. hosei is closely related to P. chrysomelas and P. frontata, with genetic distance values of 0.095 and 0.106, respectively. Cyt b gave clear data for determining relationships among Bornean species. Thus, under the optimized conditions, museum specimens can be used for molecular systematic studies of the Malaysian primates.
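
    The abstract does not say which distance model produced the reported values. As a rough illustration only, the simplest pairwise measure, the uncorrected p-distance (proportion of differing sites between two aligned sequences), could be computed as below; the function name, the gap handling and the toy sequences are assumptions.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: sites with gaps or ambiguous symbols are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical 10 bp fragments, not data from the study.
print(p_distance("ACGTACGTAC", "ACGTACGTAA"))
```

    Model-based distances (for example Kimura two-parameter) additionally correct for multiple substitutions at the same site, so they are generally somewhat larger than the p-distance.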

  8. New environmental metabarcodes for analysing soil DNA

    DEFF Research Database (Denmark)

    Epp, Laura S.; Boessenkool, Sanne; Bellemain, Eva P.

    2012-01-01

    Metabarcoding approaches use total and typically degraded DNA from environmental samples to analyse biotic assemblages and can potentially be carried out for any kinds of organisms in an ecosystem. These analyses rely on specific markers, here called metabarcodes, which should be optimized for taxonomic resolution, minimal bias in amplification of the target organism group and short sequence length. Using bioinformatic tools, we developed metabarcodes for several groups of organisms: fungi, bryophytes, enchytraeids, beetles and birds. The ability of these metabarcodes to amplify the target groups was systematically evaluated by (i) in silico PCRs using all standard sequences in the EMBL public database as templates, (ii) in vitro PCRs of DNA extracts from surface soil samples from a site in Varanger, northern Norway and (iii) in vitro PCRs of DNA extracts from permanently frozen sediment samples of late...

  9. DNA sequence analyses of blended herbal products including synthetic cannabinoids as designer drugs.

    Science.gov (United States)

    Ogata, Jun; Uchiyama, Nahoko; Kikura-Hanajiri, Ruri; Goda, Yukihiro

    2013-04-10

    In recent years, various herbal products adulterated with synthetic cannabinoids have been distributed worldwide via the Internet. These herbal products are mostly sold as incense, and advertised as not for human consumption. Although their labels indicate that they contain mixtures of several potentially psychoactive plants, and numerous studies have reported that they contain a variety of synthetic cannabinoids, their exact botanical contents are not always clear. In this study, we investigated the origins of botanical materials in 62 Spice-like herbal products distributed on the illegal drug market in Japan, by DNA sequence analyses and BLAST searches. The nucleotide sequences of four regions were analyzed to identify the origins of each plant species in the herbal mixtures. The sequences of "Damiana" (Turnera diffusa) and Lamiaceae herbs (Melissa, Mentha and Thymus) were frequently detected in a number of products. However, the sequences of other plant species indicated on the packaging labels were not detected. In a few products, DNA fragments of potent psychotropic plants were found, including marijuana (Cannabis sativa), "Diviner's Sage" (Salvia divinorum) and "Kratom" (Mitragyna speciosa). Their active constituents were also confirmed using gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS), although these plant names were never indicated on the labels. Most plant species identified in the products were different from the plants indicated on the labels. The plant materials would be used mainly as diluents for the psychoactive synthetic compounds, because no reliable psychoactive effects have been reported for most of the identified plants, with the exception of the psychotropic plants named above. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  10. ADN-Viewer: a 3D approach for bioinformatic analyses of large DNA sequences.

    Science.gov (United States)

    Hérisson, Joan; Ferey, Nicolas; Gros, Pierre-Emmanuel; Gherbi, Rachid

    2007-01-20

    Most biologists work on textual DNA sequences, which are limited to the linear representation of DNA. In this paper, we address the potential offered by Virtual Reality for 3D modeling and immersive visualization of large genomic sequences. The representation of the 3D structure of naked DNA allows biologists to observe and analyze genomes in an interactive way at different levels. We developed a powerful software platform that provides a new point of view for sequence analysis: ADN-Viewer. Nevertheless, a classical eukaryotic chromosome of 40 million base pairs requires about 6 Gbytes of 3D data. To manage these huge amounts of data in real time, we designed various scene management algorithms and immersive human-computer interaction for user-friendly data exploration. In addition, one bioinformatics study scenario is proposed.

  11. Using Synthetic Nanopores for Single-Molecule Analyses: Detecting SNPs, Trapping DNA Molecules, and the Prospects for Sequencing DNA

    Science.gov (United States)

    Dimitrov, Valentin V.

    2009-01-01

    This work focuses on studying properties of DNA molecules and DNA-protein interactions using synthetic nanopores, and it examines the prospects of sequencing DNA using synthetic nanopores. We have developed a method for discriminating between alleles that uses a synthetic nanopore to measure the binding of a restriction enzyme to DNA. There exists…

  12. DMINDA: an integrated web server for DNA motif identification and analyses.

    Science.gov (United States)

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-07-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. The occurrence of Toxocara malaysiensis in cats in China, confirmed by sequence-based analyses of ribosomal DNA.

    Science.gov (United States)

    Li, Ming-Wei; Zhu, Xing-Quan; Gasser, Robin B; Lin, Rui-Qing; Sani, Rehana A; Lun, Zhao-Rong; Jacobs, Dennis E

    2006-10-01

    Non-isotopic polymerase chain reaction (PCR)-based single-strand conformation polymorphism and sequence analyses of the second internal transcribed spacer (ITS-2) of nuclear ribosomal DNA (rDNA) were utilized to genetically characterise ascaridoids from dogs and cats from China by comparison with those from other countries. The study showed that Toxocara canis, Toxocara cati, and Toxascaris leonina from China were genetically the same as those from other geographical origins. Specimens from cats from Guangzhou, China, which were morphologically consistent with Toxocara malaysiensis, were the same genetically as those from Malaysia, with the exception of a polymorphism in the ITS-2 but no unequivocal sequence difference. This is the first report of T. malaysiensis in cats outside of Malaysia (from where it was originally described), supporting the proposal that this species has a broader geographical distribution. The molecular approach employed provides a powerful tool for elucidating the biology, epidemiology, and zoonotic significance of T. malaysiensis.

  14. Examination of species boundaries in the Acropora cervicornis group (Scleractinia, cnidaria) using nuclear DNA sequence analyses.

    Science.gov (United States)

    Oppen, M J; Willis, B L; Vugt, H W; Miller, D J

    2000-09-01

    Although Acropora is the most species-rich genus of the scleractinian (stony) corals, only three species occur in the Caribbean: A. cervicornis, A. palmata and A. prolifera. Based on overall coral morphology, abundance and distribution patterns, it has been suggested that A. prolifera may be a hybrid between A. cervicornis and A. palmata. The species boundaries among these three morphospecies were examined using DNA sequence analyses of the nuclear Pax-C 46/47 intron and the ribosomal DNA Internal Transcribed Spacer (ITS1 and ITS2) and 5.8S regions. Moderate levels of sequence variability were observed in the ITS and 5.8S sequences (up to 5.2% overall sequence difference), but variability within species was as large as between species and all three species carried similar sequences. Since this is unlikely to represent a shared ancestral polymorphism, the data suggest that introgressive hybridization occurs among the three species. For the Pax-C intron, A. cervicornis and A. palmata had very distinct allele frequencies and A. cervicornis carried a unique allele at a frequency of 0.769 (although sequence differences between alleles were small). All A. prolifera colonies examined were heterozygous for the Pax-C intron, whereas heterozygosity was only 0.286 and 0.333 for A. cervicornis and A. palmata, respectively. These data support the hypothesis that A. prolifera is the product of hybridization between two species that have a different allelic composition for the Pax-C intron, i.e. A. cervicornis and A. palmata. We therefore suggest that A. prolifera is a hybrid between A. cervicornis and A. palmata, which backcrosses with the parental species at low frequency.
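
    The two summary statistics quoted above, allele frequency and observed heterozygosity, can be computed from diploid genotypes as in the sketch below. The genotype list is a hypothetical toy example, not the Pax-C data from the study, and the function names are illustrative.

```python
from collections import Counter

Genotype = tuple[str, str]


def allele_frequencies(genotypes: list[Genotype]) -> dict[str, float]:
    """Frequency of each allele across all diploid individuals."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def observed_heterozygosity(genotypes: list[Genotype]) -> float:
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


# Hypothetical genotypes for four colonies (allele labels are made up).
colonies = [("A1", "A1"), ("A1", "A2"), ("A2", "A2"), ("A1", "A1")]
print(allele_frequencies(colonies))
print(observed_heterozygosity(colonies))
```

    A population in which every individual is heterozygous, as reported for A. prolifera, would give an observed heterozygosity of 1.0 under this definition.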

  15. Repeated DNA sequences in fungi

    Energy Technology Data Exchange (ETDEWEB)

    Dutta, S K

    1974-11-01

    Several fungal species, representatives of all broad groups like basidiomycetes, ascomycetes and phycomycetes, were examined for the nature of repeated DNA sequences by DNA:DNA reassociation studies using hydroxyapatite chromatography. All of the fungal species tested contained 10 to 20 percent repeated DNA sequences. There are approximately 100 to 110 copies of repeated DNA sequences, each of approximately 4 x 10^7 daltons piece size. Repeated DNA sequence homoduplexes showed on average a 5 °C difference in T_e50 (temperature at which 50 percent of duplexes dissociate) values from the corresponding homoduplexes of unfractionated whole DNA. It is suggested that a part of the repetitive sequences in fungi constitutes mitochondrial DNA and a part of it constitutes nuclear DNA. (auth)

  16. msgbsR: An R package for analysing methylation-sensitive restriction enzyme sequencing data.

    Science.gov (United States)

    Mayne, Benjamin T; Leemaqz, Shalem Y; Buckberry, Sam; Rodriguez Lopez, Carlos M; Roberts, Claire T; Bianco-Miotto, Tina; Breen, James

    2018-02-01

    Genotyping-by-sequencing (GBS) or restriction-site associated DNA marker sequencing (RAD-seq) is a practical and cost-effective method for analysing large genomes from high diversity species. This method of sequencing, coupled with methylation-sensitive enzymes (often referred to as methylation-sensitive restriction enzyme sequencing or MRE-seq), is an effective tool to study DNA methylation in parts of the genome that are inaccessible in other sequencing techniques or are not annotated in microarray technologies. Current software tools do not fulfil all methylation-sensitive restriction sequencing assays for determining differences in DNA methylation between samples. To fill this computational need, we present msgbsR, an R package that contains tools for the analysis of methylation-sensitive restriction enzyme sequencing experiments. msgbsR can be used to identify and quantify read counts at methylated sites directly from alignment files (BAM files) and enables verification of restriction enzyme cut sites with the correct recognition sequence of the individual enzyme. In addition, msgbsR assesses DNA methylation based on read coverage, similar to RNA sequencing experiments, rather than methylation proportion and is a useful tool in analysing differential methylation on large populations. The package is fully documented and available freely online as a Bioconductor package ( https://bioconductor.org/packages/release/bioc/html/msgbsR.html ).

  17. RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

    Science.gov (United States)

    Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

    2012-01-01

    RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information (total number of nucleotide bases, ATGC base contents along with their respective percentages) and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation, and it provides a user-friendly environment for sequence analysis. It is freely available at http://www.cemb.edu.pk/sw.html. Abbreviations: RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
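
    For orientation, the textbook form of the Nussinov algorithm maximises the number of nested base pairs by dynamic programming. The sketch below is that textbook recurrence applied to Watson-Crick DNA pairs, not RDNAnalyzer's extended version; the `min_loop` parameter (minimum number of unpaired bases enclosed by a pair) and its default are assumptions.

```python
def nussinov_max_pairs(sequence: str, min_loop: int = 3) -> int:
    """Maximum number of nested A-T / G-C pairs in a single DNA strand."""
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        return 0
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


# A short hairpin-like toy sequence: three G-C pairs around an AAA loop.
print(nussinov_max_pairs("GGGAAACCC"))
```

    Recovering the actual pairing, rather than only its size, would require a traceback through the same table.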

  18. Dialects of the DNA Uptake Sequence in Neisseriaceae

    Science.gov (United States)

    Frye, Stephan A.; Nilsen, Mariann; Tønjum, Tone; Ambur, Ole Herman

    2013-01-01

    In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS–dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5′-CTG-3′ is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS–dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic transformation

  19. Dialects of the DNA uptake sequence in Neisseriaceae.

    Directory of Open Access Journals (Sweden)

    Stephan A Frye

    2013-04-01

    In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS-dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5'-CTG-3' is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS-dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic transformation.

  20. The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.

    Science.gov (United States)

    Khoe, Clairine V; Chung, Long H; Murray, Vincent

    2018-06-01

    The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
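
    Locating occurrences of a degenerate consensus such as 5'-YTCY or 5'-YTTC in a sequence can be sketched with a regular expression. The IUPAC table below is deliberately partial, and the function name and test sequence are hypothetical; the asterisk marking the damaged nucleotide in the abstract's notation is omitted from the pattern.

```python
import re

# Partial IUPAC nucleotide code table (only the symbols needed here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]",  # pyrimidine
    "R": "[AG]",  # purine
    "N": "[ACGT]",
}


def find_consensus(sequence: str, consensus: str) -> list[int]:
    """0-based start positions of all (possibly overlapping) matches."""
    pattern = "".join(IUPAC[symbol] for symbol in consensus.upper())
    # A lookahead lets overlapping occurrences be reported as well.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]


# Hypothetical 10-nucleotide test sequence.
print(find_consensus("GCTCACTTCC", "YTCY"))
print(find_consensus("GCTCACTTCC", "YTTC"))
```

    Only the given strand is scanned; checking the complementary strand would mean repeating the search on the reverse complement.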

  1. A review of bioinformatic methods for forensic DNA analyses.

    Science.gov (United States)

    Liu, Yao-Yuan; Harbison, SallyAnn

    2018-03-01

    Short tandem repeats, single nucleotide polymorphisms, and whole mitochondrial analyses are three classes of markers which will play an important role in the future of forensic DNA typing. The arrival of massively parallel sequencing platforms in forensic science reveals new information such as insights into the complexity and variability of the markers that were previously unseen, along with amounts of data too immense for analyses by manual means. Along with the sequencing chemistries employed, bioinformatic methods are required to process and interpret this new and extensive data. As more is learnt about the use of these new technologies for forensic applications, development and standardization of efficient, favourable tools for each stage of data processing is being carried out, and faster, more accurate methods that improve on the original approaches have been developed. As forensic laboratories search for the optimal pipeline of tools, sequencer manufacturers have incorporated pipelines into sequencer software to make analyses convenient. This review explores the current state of bioinformatic methods and tools used for the analyses of forensic markers sequenced on the massively parallel sequencing (MPS) platforms currently most widely used. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Biosensors for DNA sequence detection

    Science.gov (United States)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  3. "First generation" automated DNA sequencing technology.

    Science.gov (United States)

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.

  4. Computational analyses of ancient pathogen DNA from herbarium samples: challenges and prospects.

    Science.gov (United States)

    Yoshida, Kentaro; Sasaki, Eriko; Kamoun, Sophien

    2015-01-01

    The application of DNA sequencing technology to the study of ancient DNA has enabled the reconstruction of past epidemics from genomes of historically important plant-associated microbes. Recently, the genome sequences of the potato late blight pathogen Phytophthora infestans were analyzed from 19th century herbarium specimens. These herbarium samples originated from infected potatoes collected during and after the Irish potato famine. Herbaria have therefore great potential to help elucidate past epidemics of crops, date the emergence of pathogens, and inform about past pathogen population dynamics. DNA preservation in herbarium samples was unexpectedly good, raising the possibility of a whole new research area in plant and microbial genomics. However, the recovered DNA can be extremely fragmented resulting in specific challenges in reconstructing genome sequences. Here we review some of the challenges in computational analyses of ancient DNA from herbarium samples. We also applied the recently developed linkage method to haplotype reconstruction of diploid or polyploid genomes from fragmented ancient DNA.

  5. Genome-wide DNA polymorphism analyses using VariScan

    Directory of Open Access Journals (Sweden)

    Vilella Albert J

    2006-09-01

    Full Text Available Abstract Background DNA sequence polymorphism analysis can provide valuable information on the evolutionary forces shaping nucleotide variation, and provides insight into the functional significance of genomic regions. The ongoing genome projects will radically improve our capabilities to detect specific genomic regions shaped by natural selection. Currently available methods and software, however, are unsatisfactory for such genome-wide analysis. Results We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on coalescent-simulated and actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. Additionally, we have incorporated a graphical user interface. The main features of this software are: (i) exhaustive population-genetic analyses including those based on the coalescent theory; (ii) analysis adapted to the shallow data generated by the high-throughput genome projects; (iii) use of genome annotations to conduct comprehensive analyses separately for different functional regions; (iv) identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; (v) visualization of the results integrated with current genome annotations in commonly available genome browsers. Conclusion VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for an exhaustive exploratory analysis of genome-wide DNA polymorphism data.
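    The sliding-window approach named in this abstract can be illustrated with a small sketch that computes nucleotide diversity (the average number of pairwise differences per site) in successive windows of an alignment. This is not VariScan's implementation. The function names, the window and step parameters, and the choice of nucleotide diversity as the statistic are assumptions for illustration, and the sequences are assumed to be aligned, equal in length, and free of gaps.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site for aligned sequences."""
    n = len(seqs)
    length = len(seqs[0]) if seqs else 0
    if n < 2 or length == 0:
        return 0.0
    total_diffs = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        total_diffs += sum(x != y for x, y in zip(a, b))
        n_pairs += 1
    return total_diffs / (n_pairs * length)


def sliding_window_diversity(seqs, window, step):
    """Return (window_start, diversity) tuples along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, nucleotide_diversity(chunk)))
    return results
```

    Windows with unusually low or high values would be the candidates for closer inspection. A real analysis would also need to handle missing data and test significance, which this sketch omits.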

  6. cDNA sequence quality data - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available Budding yeast cDNA sequencing project. Data name: cDNA sequence quality data. DOI: 10.18908/lsdba.nbdc00838-003. Description of data contents: Phred's quality score. P...

  7. DNA Polymerases Drive DNA Sequencing-by-Synthesis Technologies: Both Past and Present

    Directory of Open Access Journals (Sweden)

    Cheng-Yao Chen

    2014-06-01

    Full Text Available Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. E. coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger's dideoxy chain terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific genome sequence information accessible via public databases. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides, respectively. Third-generation, real-time single-molecule sequencing platforms require slightly different enzyme properties. Enterobacterial phage φ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, φ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore- and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  8. Phylogenetic characterization of a biogas plant microbial community integrating clone library 16S-rDNA sequences and metagenome sequence data obtained by 454-pyrosequencing.

    Science.gov (United States)

    Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas

    2009-06-01

    The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene product encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.

  9. Rapid Multiplex Small DNA Sequencing on the MinION Nanopore Sequencing Platform

    Directory of Open Access Journals (Sweden)

    Shan Wei

    2018-05-01

    Full Text Available Real-time sequencing of short DNA reads has a wide variety of clinical and research applications including screening for mutations, target sequences and aneuploidy. We recently demonstrated that MinION, a nanopore-based DNA sequencing device the size of a USB drive, could be used for short-read DNA sequencing. In this study, an ultra-rapid multiplex library preparation and sequencing method for the MinION is presented and applied to accurately test normal diploid and aneuploidy samples’ genomic DNA in under three hours, including library preparation and sequencing. This novel method shows great promise as a clinical diagnostic test for applications requiring rapid short-read DNA sequencing.

  10. Complete nuclear ribosomal DNA sequence amplification and molecular analyses of Bangia (Bangiales, Rhodophyta) from China

    Science.gov (United States)

    Xu, Jiajie; Jiang, Bo; Chai, Sanming; He, Yuan; Zhu, Jianyi; Shen, Zonggen; Shen, Songdong

    2016-09-01

    Filamentous Bangia, which are distributed extensively throughout the world, have simple and similar morphological characteristics. Scientists can classify these organisms using molecular markers in combination with morphology. We successfully sequenced the complete nuclear ribosomal DNA, approximately 13 kb in length, from a marine Bangia population. We further analyzed the small subunit ribosomal DNA gene (nrSSU) and the internal transcribed spacer (ITS) sequence regions along with nine other marine and two freshwater Bangia samples from China. Pairwise distances of the nrSSU and 5.8S ribosomal DNA gene sequences show the marine samples grouping together with low divergences (0-0.003; 0-0.006, respectively) from each other, but high divergences (0.123-0.126; 0.198, respectively) from freshwater samples. An exception is the marine sample collected from Weihai, which shows high divergence from both other marine samples (0.063-0.065; 0.129, respectively) and the freshwater samples (0.097; 0.120, respectively). A maximum likelihood phylogenetic tree based on a combined SSU-ITS dataset shows the samples divided into three clades, with the two marine sample clades containing Bangia spp. from North America, Europe, Asia, and Australia; and one freshwater clade, containing Bangia atropurpurea from North America and China.
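    Pairwise divergences like those quoted above are computed from aligned sequences. The abstract does not state which distance measure was used, so the sketch below assumes the simplest one, the uncorrected p-distance (the proportion of compared sites that differ). The function name and the rule of skipping gaps and ambiguous bases are illustrative assumptions.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped; this handling is an assumption of the sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared
```

    Published studies often apply a substitution-model correction (for example Kimura's two-parameter model) on top of this raw proportion, so values from this sketch are not directly comparable to the figures in the abstract.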

  11. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    Science.gov (United States)

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  12. Low-Energy Electron-Induced Strand Breaks in Telomere-Derived DNA Sequences-Influence of DNA Sequence and Topology.

    Science.gov (United States)

    Rackwitz, Jenny; Bald, Ilko

    2018-03-26

    During cancer radiation therapy high-energy radiation is used to reduce tumour tissue. The irradiation produces a shower of secondary low-energy electrons, which can damage DNA very efficiently by dissociative electron attachment. Recently, it was suggested that low-energy electron-induced DNA strand breaks strongly depend on the specific DNA sequence with a high sensitivity of G-rich sequences. Here, we use DNA origami platforms to expose G-rich telomere sequences to low-energy (8.8 eV) electrons to determine absolute cross sections for strand breakage and to study the influence of sequence modifications and topology of telomeric DNA on the strand breakage. We find that the telomeric DNA 5'-(TTA GGG)2 is more sensitive to low-energy electrons than an intermixed sequence 5'-(TGT GTG A)2, confirming the unique electronic properties resulting from G-stacking. With increasing length of the oligonucleotide (i.e., going from 5'-(GGG ATT)2 to 5'-(GGG ATT)4), both the variety of topology and the electron-induced strand break cross sections increase. Addition of K+ ions decreases the strand break cross section for all sequences that are able to fold G-quadruplexes or G-intermediates, whereas the strand break cross section for the intermixed sequence remains unchanged. These results indicate that telomeric DNA is rather sensitive towards low-energy electron-induced strand breakage, suggesting significant telomere shortening that can also occur during cancer radiation therapy. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. A novel constraint for thermodynamically designing DNA sequences.

    Directory of Open Access Journals (Sweden)

    Qiang Zhang

    Full Text Available Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.
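    The Hamming-distance constraints that this abstract describes as the existing baseline can be sketched as follows. This illustrates only that baseline, not the thermodynamic constraint or the genetic algorithm proposed in the paper. The function names, the reverse-complement check, and the greedy selection strategy are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def satisfies_constraints(candidate, accepted, d):
    """Check a candidate against every accepted sequence.

    Requires Hamming distance >= d to each accepted sequence and to its
    reverse complement (a commonly used pair of constraints; assumed here).
    """
    for s in accepted:
        if hamming(candidate, s) < d:
            return False
        if hamming(candidate, reverse_complement(s)) < d:
            return False
    return True


def greedy_sequence_set(candidates, d):
    """Greedily keep candidates that satisfy the constraints (hypothetical helper)."""
    accepted = []
    for c in candidates:
        if satisfies_constraints(c, accepted, d):
            accepted.append(c)
    return accepted
```

    Because Hamming distance only counts mismatched positions, two sets with the same distance bound can still differ widely in hybridization free energy, which is the gap the abstract's thermodynamic constraint is meant to address.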

  14. Studies of base pair sequence effects on DNA solvation based on all

    Indian Academy of Sciences (India)

    Detailed analyses of the sequence-dependent solvation and ion atmosphere of DNA are presented based on molecular dynamics (MD) simulations on all the 136 unique tetranucleotide steps obtained by the ABC consortium using the AMBER suite of programs. Significant sequence effects on solvation and ion localization ...

  15. Mitochondrial DNA sequence variation in Finnish patients with matrilineal diabetes mellitus

    Directory of Open Access Journals (Sweden)

    Soini Heidi K

    2012-07-01

    Full Text Available Abstract Background The genetic background of type 2 diabetes is complex involving contribution by both nuclear and mitochondrial genes. There is an excess of maternal inheritance in patients with type 2 diabetes and, furthermore, diabetes is a common symptom in patients with mutations in mitochondrial DNA (mtDNA). Polymorphisms in mtDNA have been reported to act as risk factors in several complex diseases. Findings We examined the nucleotide variation in complete mtDNA sequences of 64 Finnish patients with matrilineal diabetes. We used conformation sensitive gel electrophoresis and sequencing to detect sequence variation. We analysed the pathogenic potential of nonsynonymous variants detected in the sequences and examined the role of the m.16189 T>C variant. Controls consisted of non-diabetic subjects ascertained in the same population. The frequency of mtDNA haplogroup V was 3-fold higher in patients with diabetes. Patients harboured many nonsynonymous mtDNA substitutions that were predicted to be possibly or probably damaging. Furthermore, a novel m.13762 T>G in MTND5 leading to p.Ser476Ala and several rare mtDNA variants were found. Haplogroup H1b harbouring m.16189 T>C and m.3010 G>A was found to be more frequent in patients with diabetes than in controls. Conclusions Mildly deleterious nonsynonymous mtDNA variants and rare population-specific haplotypes constitute genetic risk factors for maternally inherited diabetes.

  16. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance, and even in biology. There has been an increasing interest in unravelling the mysteries of DNA; for example, how to distinguish coding from noncoding sequences and how to classify organisms and infer their evolutionary relationships are key problems in bioinformatics. Although much research has taken into consideration the long-range correlations in DNA sequences, and other authors have used the global fractal dimension in these works, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view) to DNA sequence analysis. We have also used fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

  17. Dog Y chromosomal DNA sequence: identification, sequencing and SNP discovery

    Directory of Open Access Journals (Sweden)

    Kirkness Ewen

    2006-10-01

    Full Text Available Abstract Background Population genetic studies of dogs have so far mainly been based on analysis of mitochondrial DNA, describing only the history of female dogs. To get a picture of the male history, as well as a second independent marker, there is a need for studies of biallelic Y-chromosome polymorphisms. However, there are no biallelic polymorphisms reported, and only 3200 bp of non-repetitive dog Y-chromosome sequence deposited in GenBank, necessitating the identification of dog Y chromosome sequence and the search for polymorphisms therein. The genome has been only partially sequenced for one male dog, disallowing mapping of the sequence into specific chromosomes. However, by comparing the male genome sequence to the complete female dog genome sequence, candidate Y-chromosome sequence may be identified by exclusion. Results The male dog genome sequence was analysed by Blast search against the human genome to identify sequences with a best match to the human Y chromosome and to the female dog genome to identify those absent in the female genome. Candidate sequences were then tested for male specificity by PCR of five male and five female dogs. 32 sequences from the male genome, with a total length of 24 kbp, were identified as male specific, based on a match to the human Y chromosome, absence in the female dog genome and male specific PCR results. 14437 bp were then sequenced for 10 male dogs originating from Europe, Southwest Asia, Siberia, East Asia, Africa and America. Nine haplotypes were found, which were defined by 14 substitutions. The genetic distance between the haplotypes indicates that they originate from at least five wolf haplotypes. There was no obvious trend in the geographic distribution of the haplotypes. Conclusion We have identified 24159 bp of dog Y-chromosome sequence to be used for population genetic studies. We sequenced 14437 bp in a worldwide collection of dogs, identifying 14 SNPs for future SNP analyses, and

  18. Fast and secure retrieval of DNA sequences

    NARCIS (Netherlands)

    2014-01-01

    Sequence models are retrieved from a sequences index. The sequence models model DNA or RNA sequences stored in a database, and each comprises a finite memory tree source model and parameters for the finite memory tree source model. One or more DNA or RNA sequences stored in the database are

  19. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  20. Nucleotide sequence preservation of human mitochondrial DNA

    International Nuclear Information System (INIS)

    Monnat, R.J. Jr.; Loeb, L.A.

    1985-01-01

    Recombinant DNA techniques have been used to quantitate the amount of nucleotide sequence divergence in the mitochondrial DNA population of individual normal humans. Mitochondrial DNA was isolated from the peripheral blood lymphocytes of five normal humans and cloned in M13 mp11; 49 kilobases of nucleotide sequence information was obtained from 248 independently isolated clones from the five normal donors. Both between- and within-individual differences were identified. Between-individual differences were identified in approximately 1/200 nucleotides. In contrast, only one within-individual difference was identified in 49 kilobases of nucleotide sequence information. This high degree of mitochondrial nucleotide sequence homogeneity in human somatic cells is in marked contrast to the rapid evolutionary divergence of human mitochondrial DNA and suggests the existence of mechanisms for the concerted preservation of mammalian mitochondrial DNA sequences in single organisms.

  1. Complete sequence analysis of 18S rDNA based on genomic DNA extraction from individual Demodex mites (Acari: Demodicidae).

    Science.gov (United States)

    Zhao, Ya-E; Xu, Ji-Ru; Hu, Li; Wu, Li-Ping; Wang, Zheng-Hang

    2012-05-01

    The study for the first time attempted to accomplish 18S ribosomal DNA (rDNA) complete sequence amplification and analysis for three Demodex species (Demodex folliculorum, Demodex brevis and Demodex canis) based on gDNA extraction from individual mites. The mites were treated by DNA Release Additive and Hot Start II DNA Polymerase so as to promote mite disruption and increase PCR specificity. Determination of D. folliculorum gDNA showed that the gDNA yield reached the highest at 1 mite, tending to descend with the increase of mite number. The individual mite gDNA was successfully used for 18S rDNA fragment (about 900 bp) amplification examination. The alignments of 18S rDNA complete sequences of individual mite samples and those of pooled mite samples (≥1000 mites/sample) showed over 97% identities for each species, indicating that the gDNA extracted from a single individual mite was as satisfactory as that from pooled mites for PCR amplification. Further pairwise sequence analyses showed that average divergence, genetic distance, transition/transversion or phylogenetic tree could not effectively identify the three Demodex species, largely due to the differentiation in the D. canis isolates. It can be concluded that the individual Demodex mite gDNA can satisfy the molecular study of Demodex. 18S rDNA complete sequence is suitable for interfamily identification in Cheyletoidea, but whether it is suitable for intrafamily identification cannot be confirmed until the ascertainment of the types of Demodex mites parasitizing in dogs. Copyright © 2012 Elsevier Inc. All rights reserved.

  2. Fidelity and mutational spectrum of Pfu DNA polymerase on a human mitochondrial DNA sequence.

    Science.gov (United States)

    André, P; Kim, A; Khrapko, K; Thilly, W G

    1997-08-01

    The study of rare genetic changes in human tissues requires specialized techniques. Point mutations at fractions at or below 10^-6 must be observed to discover even the most prominent features of the point mutational spectrum. PCR permits the increase in number of mutant copies but does so at the expense of creating many additional mutations or "PCR noise". Thus, each DNA sequence studied must be characterized with regard to the DNA polymerase and conditions used to avoid interpreting a PCR-generated mutation as one arising in human tissue. The thermostable DNA polymerase derived from Pyrococcus furiosus designated Pfu has the highest fidelity of any thermostable DNA polymerase studied to date, and this property recommends it for analyses of tissue mutational spectra. Here, we apply constant denaturant capillary electrophoresis (CDCE) to separate and isolate the products of DNA amplification. This new strategy permitted direct enumeration and identification of point mutations created by Pfu DNA polymerase in a 96-bp low melting domain of a human mitochondrial sequence despite the very low mutant fractions generated in the PCR process. This sequence, containing part of the tRNA glycine and NADH dehydrogenase subunit 3 genes, is the target of our studies of mitochondrial mutagenesis in human cells and tissues. Incorrectly synthesized sequences were separated from the wild type as mutant/wild-type heteroduplexes by sequential enrichment on CDCE. An artificially constructed mutant was used as an internal standard to permit calculation of the mutant fraction. Our study found that the average error rate (mutations per base pair duplication) of Pfu was 6.5 × 10^-7, and five of its more frequent mutations (hot spots) consisted of three transversions (GC-->TA, AT-->TA, and AT-->CG), one transition (AT-->GC), and one 1-bp deletion (in an AAAAAA sequence). To achieve an even higher sensitivity, the amount of Pfu-induced mutants must be reduced.

  3. Alu repeats as markers for forensic DNA analyses

    Energy Technology Data Exchange (ETDEWEB)

    Batzer, M.A.; Alegria-Hartman, M. [Lawrence Livermore National Lab., CA (United States); Kass, D.H. [Louisiana State Univ., New Orleans, LA (United States)] [and others

    1994-01-01

    The Human-Specific (HS) subfamily of Alu sequences is comprised of a group of 500 nearly identical members which are almost exclusively restricted to the human genome. Individual subfamily members share an average of 98.9% nucleotide identity with the HS subfamily consensus sequence, and have an average age of 2.8 million years. We have developed a Polymerase Chain Reaction (PCR) based assay using primers complementary to the 5' and 3' unique flanking DNA sequences from each HS Alu that allow the locus to be assayed for the presence or absence of the Alu repeat. The dimorphic HS Alu sequences probably inserted in the human genome after the radiation of modern humans (within the last 200,000-one million years) and represent a unique source of information for human population genetics and forensic DNA analyses. These sites can be developed into Dimorphic Alu Sequence Tagged Sites (DASTS) for the Human Genome Project. HS Alu family member insertions differ from other types of polymorphism (e.g. Variable Number of Tandem Repeat [VNTR] or Restriction Fragment Length Polymorphism [RFLP]) in that polymorphisms due to Alu insertions arise as a result of a unique event which has occurred only one time in the human population and spread through the population from that point. Therefore, individuals that share HS Alu repeats inherited these elements from a common ancestor. Most VNTR and RFLP polymorphisms may arise multiple times in parallel within a population.

  4. EGNAS: an exhaustive DNA sequence design algorithm

    Directory of Open Access Journals (Sweden)

    Kick Alfred

    2012-06-01

    Full Text Available Abstract Background The molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA) is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that generates sets containing a maximum number of sequences with defined properties. EGNAS (Exhaustive Generation of Nucleic Acid Sequences) offers the possibility of controlling both interstrand and intrastrand properties. The guanine-cytosine content can be adjusted. Sequences can be forced to start and end with guanine or cytosine. This option reduces the risk of “fraying” of DNA strands. It is possible to limit cross hybridizations of a defined length, and to adjust the uniqueness of sequences. Self-complementarity and hairpin structures of certain length can be avoided. Sequences and subsequences can optionally be forbidden. Furthermore, sequences can be designed to have minimum interactions with predefined strands and neighboring sequences. Results The algorithm is realized in a C++ program. TAG sequences can be generated and combined with primers for single-base extension reactions, which were described for multiplexed genotyping of single nucleotide polymorphisms. Thereby, possible foldback through intrastrand interaction of TAG-primer pairs can be limited. The design of sequences for specific attachment of molecular constructs to DNA origami is presented. Conclusions We developed a new software tool called EGNAS for the design of unique nucleic acid sequences. The presented exhaustive algorithm generates larger sets of sequences than previous software under equal constraints. EGNAS is freely available for noncommercial use at http://www.chm.tu-dresden.de/pc6/EGNAS.
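    Several of the per-sequence properties listed in this abstract (GC content, G/C at both ends, limited self-complementarity) can be expressed as simple filters. The sketch below is written in Python rather than the C++ of EGNAS and is not derived from its source. All function names, parameters, and the particular definition of self-complementarity are assumptions for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_gc_ends(seq):
    """True if the sequence starts and ends with G or C."""
    return seq[0] in "GC" and seq[-1] in "GC"


def longest_self_complementary_stretch(seq):
    """Length of the longest substring whose reverse complement also occurs in seq.

    This is one possible proxy for self-dimer or hairpin potential (assumed here).
    """
    n = len(seq)
    best = 0
    for k in range(1, n + 1):
        found = any(
            reverse_complement(seq[i:i + k]) in seq for i in range(n - k + 1)
        )
        if not found:
            break  # no stretch of length k, so none longer can exist
        best = k
    return best


def passes_filters(seq, gc_min, gc_max, max_self):
    """Apply the three illustrative filters to one candidate sequence."""
    return (
        gc_min <= gc_content(seq) <= gc_max
        and has_gc_ends(seq)
        and longest_self_complementary_stretch(seq) <= max_self
    )
```

    An exhaustive generator would enumerate candidate sequences and keep those that pass such filters together with the inter-sequence constraints. The enumeration strategy itself is not shown here.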

  5. Sequence periodicity in nucleosomal DNA and intrinsic curvature.

    Science.gov (United States)

    Nair, T Murlidharan

    2010-05-17

    Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy, the more rigid the DNA helix; thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understanding their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequences from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility that context-dependent curvature of varying degrees is associated with nucleosomal DNA.

  6. DNA sequence modeling based on context trees

    NARCIS (Netherlands)

    Kusters, C.J.; Ignatenko, T.; Roland, J.; Horlin, F.

    2015-01-01

    Genomic sequences contain instructions for protein and cell production. Therefore understanding and identification of biologically and functionally meaningful patterns in DNA sequences is of paramount importance. Modeling of DNA sequences in its turn can help to better understand and identify such

  7. Insights into N-calls of mitochondrial DNA sequencing using MitoChip v2.0

    Directory of Open Access Journals (Sweden)

    Blakely Emma L

    2011-10-01

    Full Text Available Abstract Background Developments in DNA resequencing microarrays include mitochondrial DNA (mtDNA) sequencing and mutation detection. Failure by the microarray to identify a base, compared to the reference sequence, is designated an 'N-call.' This study re-examined the N-call distribution of mtDNA samples sequenced by the Affymetrix MitoChip v2.0, based on the hypothesis that N-calls may represent insertions or deletions (indels) in mtDNA. Findings We analysed 16 patient mtDNA samples using MitoChip. N-calls by the proprietary GSEQ software were significantly reduced when either of the freeware on-line algorithms ResqMi or sPROFILER was utilized. With sPROFILER, this decrease in N-calls had no effect on the homoplasmic or heteroplasmic mutation levels compared to GSEQ software, but ResqMi produced a significant change in mutation load, as well as producing longer N-call stretches. For these reasons, further analysis using ResqMi was not attempted. Conventional DNA sequencing of the longer N-call stretches from sPROFILER revealed 7 insertions and 12 point mutations. Moreover, analysis of single-base N-calls of one mtDNA sample found 3 other point mutations. Conclusions Our study is the first to analyse N-calls produced from GSEQ software for the MitoChip v2.0. By narrowing the focus to longer stretches of N-calls revealed by sPROFILER, conventional sequencing was able to identify unique insertions and point mutations. Shorter N-calls also harboured point mutations, but the absence of deletions among N-calls suggests that probe confirmation affects binding and thus N-calling. This study supports the contention that the GSEQ is more capable of assigning bases when used in conjunction with sPROFILER.

  8. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian eWu

    2012-11-01

    Full Text Available Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of chloroplast genome.

  9. Sequence of human protamine 2 cDNA

    Energy Technology Data Exchange (ETDEWEB)

    Domenjoud, L; Fronia, C; Uhde, F; Engel, W [Universitaet Goettingen (West Germany)]

    1988-08-11

    The authors report the cloning and sequencing of a cDNA clone for human protamine 2 (hp2), isolated from a human testis cDNA library cloned in the vector λgt11. A 66mer oligonucleotide that corresponds to an amino acid sequence which is highly conserved between hp2 and mouse protamine 2 (mp2) served as hybridization probe. The homology between the amino acid sequence deduced from our cDNA and the published amino acid sequence for hp2 is 100%.

  10. Bacterial identification and subtyping using DNA microarray and DNA sequencing.

    Science.gov (United States)

    Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

    2012-01-01

    The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies not only will identify, isotype, and serotype pathogenic bacteria, but also it will aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple to use technique that can produce accurate and quantitative analysis of DNA sequences with a great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.

  11. High-Throughput Block Optical DNA Sequence Identification.

    Science.gov (United States)

    Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant

    2018-01-01

    Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm⁻¹. Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
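
    The idea of describing a DNA block by its A, T, G, and C content rather than by its letter order can be sketched in a few lines of Python. This is only an illustration of the "lossy" block representation mentioned in the abstract, not the authors' spectroscopic method; the function name `block_content` and the use of non-overlapping blocks are assumptions made for the example.

```python
from collections import Counter

BASES = "ATGC"  # fixed reporting order for the content tuple


def block_content(sequence, block_size):
    """Split a sequence into consecutive non-overlapping blocks (k-mers)
    and return, for each block, a tuple of (A, T, G, C) counts.

    Hypothetical helper: the order of bases inside a block is discarded,
    which is what makes the representation lossy.
    """
    sequence = sequence.upper()
    contents = []
    # Trailing bases that do not fill a whole block are ignored.
    for start in range(0, len(sequence) - block_size + 1, block_size):
        counts = Counter(sequence[start:start + block_size])
        contents.append(tuple(counts.get(base, 0) for base in BASES))
    return contents


if __name__ == "__main__":
    print(block_content("AATGCCGT", 4))
    # Two different 4-mers with identical content are indistinguishable:
    print(block_content("AATG", 4) == block_content("GATA", 4))
```

    Because blocks with the same composition collapse to the same tuple, such a representation can only narrow down candidate sequences; it cannot recover the exact base order on its own.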

  12. Plant DNA sequences from feces: potential means for assessing diets of wild primates.

    Science.gov (United States)

    Bradley, Brenda J; Stiller, Mathias; Doran-Sheehy, Diane M; Harris, Tara; Chapman, Colin A; Vigilant, Linda; Poinar, Hendrik

    2007-06-01

    Analysis of plant DNA in feces provides a promising, yet largely unexplored, means of documenting the diets of elusive primates. Here we demonstrate the promise and pitfalls of this approach using DNA extracted from fecal samples of wild western gorillas (Gorilla gorilla) and black and white colobus monkeys (Colobus guereza). From these DNA extracts we amplified, cloned, and sequenced small segments of chloroplast DNA (part of the rbcL gene) and plant nuclear DNA (ITS-2). The obtained sequences were compared to sequences generated from known plant samples and to those in GenBank to identify plant taxa in the feces. With further optimization, this method could provide a basic evaluation of minimum primate dietary diversity even when knowledge of local flora is limited. This approach may find application in studies characterizing the diets of poorly-known, unhabituated primate species or assaying consumer-resource relationships in an ecosystem. (c) 2007 Wiley-Liss, Inc.

  13. Genome-wide profiling of DNA-binding proteins using barcode-based multiplex Solexa sequencing.

    Science.gov (United States)

    Raghav, Sunil Kumar; Deplancke, Bart

    2012-01-01

    Chromatin immunoprecipitation (ChIP) is a commonly used technique to detect the in vivo binding of proteins to DNA. ChIP is now routinely paired to microarray analysis (ChIP-chip) or next-generation sequencing (ChIP-Seq) to profile the DNA occupancy of proteins of interest on a genome-wide level. Because ChIP-chip introduces several biases, most notably due to the use of a fixed number of probes, ChIP-Seq has quickly become the method of choice as, depending on the sequencing depth, it is more sensitive, quantitative, and provides a greater binding site location resolution. With the ever increasing number of reads that can be generated per sequencing run, it has now become possible to analyze several samples simultaneously while maintaining sufficient sequence coverage, thus significantly reducing the cost per ChIP-Seq experiment. In this chapter, we provide a step-by-step guide on how to perform multiplexed ChIP-Seq analyses. As a proof-of-concept, we focus on the genome-wide profiling of RNA Polymerase II as measuring its DNA occupancy at different stages of any biological process can provide insights into the gene regulatory mechanisms involved. However, the protocol can also be used to perform multiplexed ChIP-Seq analyses of other DNA-binding proteins such as chromatin modifiers and transcription factors.

  14. Satellite DNA Sequences in Canidae and Their Chromosome Distribution in Dog and Red Fox.

    Science.gov (United States)

    Vozdova, Miluse; Kubickova, Svatava; Cernohorska, Halina; Fröhlich, Jan; Rubes, Jiri

    2016-01-01

    Satellite DNA is a characteristic component of mammalian centromeric heterochromatin, and a comparative analysis of its evolutionary dynamics can be used for phylogenetic studies. We analysed satellite and satellite-like DNA sequences available in NCBI for 4 species of the family Canidae (red fox, Vulpes vulpes, VVU; domestic dog, Canis familiaris, CFA; arctic fox, Vulpes lagopus, VLA; raccoon dog, Nyctereutes procyonoides procyonoides, NPR) by comparative sequence analysis, which revealed 86-90% intraspecies and 76-79% interspecies similarity. Comparative fluorescence in situ hybridisation in the red fox and dog showed signals of the red fox satellite probe in canine and vulpine autosomal centromeres, on VVUY, B chromosomes, and in the distal parts of VVU9q and VVU10p which were shown to contain nucleolus organiser regions. The CFA satellite probe stained autosomal centromeres only in the dog. The CFA satellite-like DNA did not show any significant sequence similarity with the satellite DNA of any species analysed and was localised to the centromeres of 9 canine chromosome pairs. No significant heterochromatin block was detected on the B chromosomes of the red fox. Our results show extensive heterogeneity of satellite sequences among Canidae and prove close evolutionary relationships between the red and arctic fox. © 2017 S. Karger AG, Basel.

  15. Compressing DNA sequence databases with coil

    Directory of Open Access Journals (Sweden)

    Hendy Michael D

    2008-05-01

    Full Text Available Abstract Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.
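
    The gzip baseline that the abstract compares against can be measured with Python's standard library. The sketch below does not implement coil or edit-tree coding; it only shows how a compression ratio for a flat sequence file might be computed. The synthetic random sequence and the helper name `compression_ratio` are assumptions for illustration, and real database files, which contain repeated and related sequences, will behave differently.

```python
import gzip
import random


def compression_ratio(data: bytes) -> float:
    """Return original size divided by gzip-compressed size."""
    compressed = gzip.compress(data, compresslevel=9)
    # Sanity check: decompression must restore the input exactly.
    assert gzip.decompress(compressed) == data
    return len(data) / len(compressed)


def synthetic_dna(length: int, seed: int = 0) -> bytes:
    """Hypothetical test input: uniformly random bases, one byte each."""
    rng = random.Random(seed)
    return "".join(rng.choices("ACGT", k=length)).encode("ascii")


if __name__ == "__main__":
    data = synthetic_dna(100_000)
    print(f"gzip ratio on random bases: {compression_ratio(data):.2f}")
```

    For uniformly random bases each symbol carries two bits of information, so no lossless method can do better than 4:1 on this input; gains beyond that on real data come from exploiting redundancy between sequences.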

  16. Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing.

    Directory of Open Access Journals (Sweden)

    Sarah M Hykin

    Full Text Available For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens-particularly for use in phylogenetic analyses-has been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens

  17. Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing.

    Science.gov (United States)

    Hykin, Sarah M; Bi, Ke; McGuire, Jimmy A

    2015-01-01

    For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens-particularly for use in phylogenetic analyses-has been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for

  18. DNA Sequencing by Capillary Electrophoresis

    Science.gov (United States)

    Karger, Barry L.; Guttman, Andras

    2009-01-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA sequencing methods have evolved from the labor intensive slab gel electrophoresis, through automated multicapillary electrophoresis systems using fluorophore labeling with multispectral imaging, to the “next generation” technologies of cyclic array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes was only possible by the advent of modern sequencing technologies that was a result of step by step advances with a contribution of academics, medical personnel and instrument companies. While next generation sequencing is moving ahead at break-neck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this prospective, we wish to overview the role of capillary electrophoresis in DNA sequencing based in part on several of our articles in this journal. PMID:19517496

  19. Characterizing genetic diversity of contemporary pacific chickens using mitochondrial DNA analyses.

    Directory of Open Access Journals (Sweden)

    Kelsey Needham Dancause

    Full Text Available BACKGROUND: Mitochondrial DNA (mtDNA) hypervariable region (HVR) sequences of prehistoric Polynesian chicken samples reflect dispersal of two haplogroups--D and E--by the settlers of the Pacific. The distribution of these chicken haplogroups has been used as an indicator of human movement. Recent analyses suggested similarities between prehistoric Pacific and South American chicken samples, perhaps reflecting prehistoric Polynesian introduction of the chicken into South America. These analyses have been heavily debated. The current distribution of the D and E lineages among contemporary chicken populations in the Western Pacific is unclear, but might ultimately help to inform debates about the movements of humans that carried them. OBJECTIVES: We sought to characterize contemporary mtDNA diversity among chickens in two of the earliest settled archipelagos of Remote Oceania, the Marianas and Vanuatu. METHODS: We generated HVR sequences for 43 chickens from four islands in Vanuatu, and for 5 chickens from Guam in the Marianas. RESULTS: Forty samples from Vanuatu and three from Guam were assigned to haplogroup D, supporting this as a Pacific chicken haplogroup that persists in the Western Pacific. Two haplogroup E lineages were observed in Guam and two in Vanuatu. Of the E lineages in Vanuatu, one was identical to prehistoric Vanuatu and Polynesian samples and the other differed by one polymorphism. Contrary to our expectations, we observed few globally distributed domesticate lineages not associated with Pacific chicken dispersal. This might suggest less European introgression of chickens into Vanuatu than expected. If so, the E lineages might represent lineages maintained from ancient Pacific chicken introductions. The Vanuatu sample might thus provide an opportunity to distinguish between maintained ancestral Pacific chicken lineages and replacement by global domesticates through genomic analyses, which could resolve questions of contemporary

  20. On site DNA barcoding by nanopore sequencing.

    Directory of Open Access Journals (Sweden)

    Michele Menegon

    Full Text Available Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet's biological heritage. The use of genetic markers, i.e. DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in a mimicking tropical environment, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error-rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time-on-site DNA sequencing thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities.

  1. GENETIC POLYMORPHISM IN GYMNODINIUM GALATHEANUM CHLOROPLAST DNA SEQUENCES AND DEVELOPMENT OF A MOLECULAR DETECTION ASSAY. (R827084)

    Science.gov (United States)

    Nuclear and chloroplast-encoded small subunit ribosomal DNA sequences were obtained from several strains of the toxic dinoflagellate Gymnodinium galatheanum. Phylogenetic analyses and comparison of sequences indicate that the chloroplast sequences show a higher degree of se...

  2. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Directory of Open Access Journals (Sweden)

    Jason D Thompson

    Full Text Available Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  3. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Science.gov (United States)

    Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

    2012-01-01

    Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  4. Highly multiplexed targeted DNA sequencing from single nuclei.

    Science.gov (United States)

    Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E

    2016-02-01

    Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.

  5. Three genetic stocks of frigate tuna Auxis thazard thazard (Lacepede, 1800) along the Indian coast revealed from sequence analyses of mitochondrial DNA D-loop region

    Digital Repository Service at National Institute of Oceanography (India)

    GirishKumar; Kunal, S.P.; Menezes, M.R.; Meena, R.M.

    revealed from sequence analyses of mitochondrial DNA D-loop region Name of authors: 1. Girish Kumar* Biological Oceanography Division (BOD) National Institute of Oceanography (NIO) Dona Paula, Goa 403004, India. Email: girishkumar....nio@gmail.com Tel: +919766548060 2. Swaraj Priyaranjan Kunal Biological Oceanography Division (BOD) National Institute of Oceanography (NIO) Dona Paula, Goa 403004, India. Email: swar.mbt@gmail.com 3. Maria Rosalia Menezes Biological Oceanography...

  6. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Directory of Open Access Journals (Sweden)

    Baldwin Stephen A

    2011-03-01

    Full Text Available Abstract Background Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  7. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities.

    Science.gov (United States)

    Troshin, Peter V; Postis, Vincent Lg; Ashworth, Denise; Baldwin, Stephen A; McPherson, Michael J; Barton, Geoffrey J

    2011-03-07

    Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  8. Entropic fluctuations in DNA sequences

    Science.gov (United States)

    Thanos, Dimitrios; Li, Wentian; Provata, Astero

    2018-03-01

    The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that despite this reduction of information, LSE allows the extraction of meaningful information related to the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
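
    A block-wise Shannon entropy of the kind described in the abstract can be sketched as follows. The abstract does not give the exact definition used by the authors, so this example assumes the simplest variant: the entropy of single-base frequencies inside each non-overlapping block, in bits. The function name `local_shannon_entropy` is likewise an assumption.

```python
import math
from collections import Counter


def local_shannon_entropy(sequence, block_size):
    """Return one entropy value (in bits) per non-overlapping block.

    Assumed definition: H = -sum(p_b * log2(p_b)) over the bases b that
    occur in the block, where p_b is the base's relative frequency.
    """
    entropies = []
    # Trailing bases that do not fill a whole block are ignored.
    for start in range(0, len(sequence) - block_size + 1, block_size):
        counts = Counter(sequence[start:start + block_size])
        entropy = -sum(
            (n / block_size) * math.log2(n / block_size)
            for n in counts.values()
        )
        entropies.append(entropy)
    return entropies


if __name__ == "__main__":
    # A homopolymer block has zero entropy; a block with all four bases
    # in equal proportion reaches the maximum of 2 bits.
    print(local_shannon_entropy("AAAAACGT", 4))
```

    Under this assumed definition, a long run of identical tandem repeat units yields nearly constant entropy from block to block, which is consistent with the abstract's remark that such regions show low LSE fluctuations.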

  9. Cytophotometric and biochemical analyses of DNA in pentaploid and diploid Agave species.

    Science.gov (United States)

    Cavallini, A; Natali, L; Cionini, G; Castorena-Sanchez, I

    1996-04-01

    Nuclear DNA content, chromatin structure, and DNA composition were investigated in four Agave species: two diploid, Agave tequilana Weber and Agave angustifolia Haworth var. marginata Hort., and two pentaploid, Agave fourcroydes Lemaire and Agave sisalana Perrine. It was determined that the genome size of pentaploid species is nearly 2.5 times that of diploid ones. Cytophotometric analyses of chromatin structure were performed following Feulgen or DAPI staining to determine optical density profiles of interphase nuclei. Pentaploid species showed higher frequencies of condensed chromatin (heterochromatin) than diploid species. On the other hand, a lower frequency of A-T rich (DAPI stained) heterochromatin was found in pentaploid species than in diploid ones, indicating that heterochromatin in pentaploid species is made up of sequences with base compositions different from those of diploid species. Since thermal denaturation profiles of extracted DNA showed minor variations in the base composition of the genomes of the four species, it is supposed that, in pentaploid species, the large heterochromatin content is not due to an overrepresentation of G-C repetitive sequences but rather to the condensation of nonrepetitive sequences, such as, for example, redundant gene copies switched off in the polyploid complement. It is suggested that speciation in the genus Agave occurs through point mutations and minor DNA rearrangements, as is also indicated by the relative stability of the karyotype of this genus. Key words : Agave, DNA cytophotometry, DNA melting profiles, chromatin structure, genome size.

  10. Molecular systematics of Indian Alysicarpus (Fabaceae) based on analyses of nuclear ribosomal DNA sequences.

    Science.gov (United States)

    Gholami, Akram; Subramaniam, Shweta; Geeta, R; Pandey, Arun K

    2017-06-01

    Alysicarpus Necker ex Desvaux (Fabaceae, Desmodieae) consists of ~30 species that are distributed in tropical and subtropical regions of the world. In India, the genus is represented by ca. 18 species, of which seven are endemic. Sequences of the nuclear internal transcribed spacer (ITS) from 38 accessions representing 16 Indian species were subjected to phylogenetic analyses. The ITS sequence data strongly support the monophyly of the genus Alysicarpus. Analyses revealed four major well-supported clades within Alysicarpus. Ancestral state reconstructions were done for two morphological characters, namely calyx length in relation to pod (macrocalyx and microcalyx) and pod surface ornamentation (transversely rugose and nonrugose). The present study is the first report on molecular systematics of Indian Alysicarpus.

  11. DNA Replication Profiling Using Deep Sequencing.

    Science.gov (United States)

    Saayman, Xanita; Ramos-Pérez, Cristina; Brown, Grant W

    2018-01-01

    Profiling of DNA replication during progression through S phase allows a quantitative snap-shot of replication origin usage and DNA replication fork progression. We present a method for using deep sequencing data to profile DNA replication in S. cerevisiae.

  12. Effects of sequence on DNA wrapping around histones

    Science.gov (United States)

    Ortiz, Vanessa

    2011-03-01

    A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).

  13. High-Throughput DNA sequencing of ancient wood.

    Science.gov (United States)

    Wagner, Stefanie; Lagane, Frédéric; Seguin-Orlando, Andaine; Schubert, Mikkel; Leroy, Thibault; Guichoux, Erwan; Chancerel, Emilie; Bech-Hebelstrup, Inger; Bernard, Vincent; Billard, Cyrille; Billaud, Yves; Bolliger, Matthias; Croutsch, Christophe; Čufar, Katarina; Eynaud, Frédérique; Heussner, Karl Uwe; Köninger, Joachim; Langenegger, Fabien; Leroy, Frédéric; Lima, Christine; Martinelli, Nicoletta; Momber, Garry; Billamboz, André; Nelle, Oliver; Palomo, Antoni; Piqué, Raquel; Ramstein, Marianne; Schweichel, Roswitha; Stäuble, Harald; Tegel, Willy; Terradas, Xavier; Verdin, Florence; Plomion, Christophe; Kremer, Antoine; Orlando, Ludovic

    2018-03-01

    Reconstructing the colonization and demographic dynamics that gave rise to extant forests is essential to forecasts of forest responses to environmental changes. Classical approaches to map how population of trees changed through space and time largely rely on pollen distribution patterns, with only a limited number of studies exploiting DNA molecules preserved in wooden tree archaeological and subfossil remains. Here, we advance such analyses by applying high-throughput (HTS) DNA sequencing to wood archaeological and subfossil material for the first time, using a comprehensive sample of 167 European white oak waterlogged remains spanning a large temporal (from 550 to 9,800 years) and geographical range across Europe. The successful characterization of the endogenous DNA and exogenous microbial DNA of 140 (~83%) samples helped the identification of environmental conditions favouring long-term DNA preservation in wood remains, and started to unveil the first trends in the DNA decay process in wood material. Additionally, the maternally inherited chloroplast haplotypes of 21 samples from three periods of forest human-induced use (Neolithic, Bronze Age and Middle Ages) were found to be consistent with those of modern populations growing in the same geographic areas. Our work paves the way for further studies aiming at using ancient DNA preserved in wood to reconstruct the micro-evolutionary response of trees to climate change and human forest management. © 2018 John Wiley & Sons Ltd.

  14. Molecular design of sequence specific DNA alkylating agents.

    Science.gov (United States)

    Minoshima, Masafumi; Bando, Toshikazu; Shinohara, Ken-ichi; Sugiyama, Hiroshi

    2009-01-01

    Sequence-specific DNA alkylating agents are of great interest as a novel approach to cancer chemotherapy. We designed conjugates between pyrrole (Py)-imidazole (Im) polyamides and the DNA-alkylating chlorambucil moiety attached at different positions. Sequence-specific DNA alkylation by the conjugates was investigated using high-resolution denaturing polyacrylamide gel electrophoresis (PAGE). The results showed that polyamide-chlorambucil conjugates alkylate DNA at flanking adenines in the recognition sequences of Py-Im polyamides; however, the reactivities and alkylation sites were influenced by the position of conjugation. In addition, we synthesized a conjugate between a Py-Im polyamide and another alkylating agent, 1-(chloromethyl)-5-hydroxy-1,2-dihydro-3H-benz[e]indole (seco-CBI). The DNA alkylation reactivities of both alkylating polyamides were almost comparable. In contrast, cytotoxicities against cell lines differed greatly. These comparative studies should promote the development of appropriate sequence-specific DNA alkylating polyamides against specific cancer cells.

  15. Sequence analysis of Leukemia DNA

    Science.gov (United States)

    Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa

    2018-03-01

    Cancer is a very deadly disease; one form is leukemia, better known as blood cancer. Cancer cells can be detected by examining DNA in a laboratory test. This study focuses on local alignment of leukemia and non-leukemia DNA sequences obtained from NCBI, using the Smith-Waterman algorithm, which was introduced by T. F. Smith and M. S. Waterman in 1981. The algorithm tries to find the greatest possible similarity between a pair of sequences by assigning a negative value to unequal base pairs (mismatches) and a positive value to equal base pairs (matches). The maximum positive value then marks the end of the alignment, and the minimum value marks its start. This study will use sequences of leukemia and 3 sequences of non leukemia.
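
The scoring scheme described above can be illustrated with a short sketch of Smith-Waterman local alignment. This is a generic textbook version, not the code used in the study: the scores (`match=2`, `mismatch=-1`, `gap=-2`) and the linear gap penalty are assumptions chosen for illustration. The traceback starts at the highest-scoring cell (the end of the alignment) and stops at the first cell whose score is zero (its start).

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    # Score matrix; the first row and column stay at zero.
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are floored at zero so an alignment can restart anywhere.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the maximum until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Toy sequences (not real leukemia data) sharing the motif ACGT.
    print(smith_waterman("AAACGTAAA", "CCCACGTCCC"))
```

With the assumed scores, the two toy sequences share only the four-base motif, so the best local alignment has score 8.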

  16. Multiple tag labeling method for DNA sequencing

    Science.gov (United States)

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  17. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.

    Directory of Open Access Journals (Sweden)

    Can Alkan

    2007-09-01

    Full Text Available The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%-5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then experimentally validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.

  18. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.

    Science.gov (United States)

    Alkan, Can; Ventura, Mario; Archidiacono, Nicoletta; Rocchi, Mariano; Sahinalp, S Cenk; Eichler, Evan E

    2007-09-01

    The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%-5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then experimentally validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.

  19. Human Chromosome 7: DNA Sequence and Biology

    OpenAIRE

    Scherer, Stephen W.; Cheung, Joseph; MacDonald, Jeffrey R.; Osborne, Lucy R.; Nakabayashi, Kazuhiko; Herbrick, Jo-Anne; Carson, Andrew R.; Parker-Katiraee, Layla; Skaug, Jennifer; Khaja, Razi; Zhang, Junjun; Hudek, Alexander K.; Li, Martin; Haddad, May; Duggan, Gavin E.

    2003-01-01

    DNA sequence and annotation of the entire human chromosome 7, encompassing nearly 158 million nucleotides of DNA and 1917 gene structures, are presented. To generate a higher order description, additional structural features such as imprinted genes, fragile sites, and segmental duplications were integrated at the level of the DNA sequence with medical genetic data, including 440 chromosome rearrangement breakpoints associated with disease. This approach enabled the discovery of candidate gene...

  20. PREDICTION OF CHROMATIN STATES USING DNA SEQUENCE PROPERTIES

    KAUST Repository

    Bahabri, Rihab R.

    2013-06-01

    Activities of DNA are to a great extent controlled epigenetically through the internal structure of chromatin. This structure is dynamic and is influenced by different modifications of histone proteins. Various combinations of epigenetic modifications of histones point to different functional regions of the DNA, determining the so-called chromatin states. However, the characterization of chromatin states by DNA sequence properties remains largely unknown. In this study we aim to explore whether DNA sequence patterns in the human genome can characterize different chromatin states. Using DNA sequence motifs, we built binary classifiers for each chromatin state to evaluate whether a given genomic sequence is a good candidate for belonging to a particular chromatin state. Of the four classification algorithms (C4.5, Naive Bayes, Random Forest, and SVM) used for this purpose, the decision-tree-based classifiers (C4.5 and Random Forest) yielded the best results. Our results suggest that in general these models lack sufficient predictive power, although for four chromatin states (insulators, heterochromatin, and two types of copy number variation) we found that the presence of certain motifs in DNA sequences does imply an increased probability that such a sequence is one of these chromatin states.

  1. Sequencing, Characterization, and Comparative Analyses of the Plastome of Caragana rosea var. rosea

    Directory of Open Access Journals (Sweden)

    Mei Jiang

    2018-05-01

    Full Text Available To exploit the drought-resistant Caragana species, we performed a comparative study of the plastomes from four species: Caragana rosea, C. microphylla, C. kozlowii, and C. korshinskii. The complete plastome sequence of C. rosea was obtained using next-generation DNA sequencing technology. The genome is a circular structure of 133,122 bases and lacks an inverted repeat. It contains 111 unique genes, including 76 protein-coding, 30 tRNA, and four rRNA genes. Repeat analyses identified 239, 244, 258, and 246 simple sequence repeats in C. rosea, C. microphylla, C. kozlowii, and C. korshinskii, respectively. Analyses of sequence divergence found two intergenic regions, trnI-CAU-ycf2 and trnN-GUU-ycf1, exhibiting a high degree of variation. Phylogenetic analyses showed that the four Caragana species belong to a monophyletic clade. Analyses of Ka/Ks ratios revealed that five genes (rpl16, rpl20, rps11, rps7, and ycf1) and several sites have undergone strong positive selection in the Caragana branch. The results lay the foundation for the development of molecular markers and the understanding of the evolutionary process for drought-resistant characteristics.

  2. Pegasys: software for executing and integrating analyses of biological sequences

    Directory of Open Access Journals (Sweden)

    Lett Drew

    2004-04-01

    Full Text Available Abstract Background We present Pegasys – a flexible, modular and customizable software system that facilitates the execution and data integration from heterogeneous biological sequence analysis tools. Results The Pegasys system includes numerous tools for pair-wise and multiple sequence alignment, ab initio gene prediction, RNA gene detection, masking repetitive sequences in genomic DNA as well as filters for database formatting and processing raw output from various analysis tools. We introduce a novel data structure for creating workflows of sequence analyses and a unified data model to store its results. The software allows users to dynamically create analysis workflows at run-time by manipulating a graphical user interface. All non-serial dependent analyses are executed in parallel on a compute cluster for efficiency of data generation. The uniform data model and backend relational database management system of Pegasys allow for results of heterogeneous programs included in the workflow to be integrated and exported into General Feature Format for further analyses in GFF-dependent tools, or GAME XML for import into the Apollo genome editor. The modularity of the design allows for new tools to be added to the system with little programmer overhead. The database application programming interface allows programmatic access to the data stored in the backend through SQL queries. Conclusions The Pegasys system enables biologists and bioinformaticians to create and manage sequence analysis workflows. The software is released under the Open Source GNU General Public License. All source code and documentation is available for download at http://bioinformatics.ubc.ca/pegasys/.

  3. Googling DNA sequences on the World Wide Web.

    Science.gov (United States)

    Hajibabaei, Mehrdad; Singer, Gregory A C

    2009-11-10

    New web-based technologies provide an excellent opportunity for sharing and accessing information and using web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment independent character based algorithm based on dividing a sequence library (DNA barcodes) and query sequence to words. The actual search is conducted by conventional search tools such as freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows a high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
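
The word-based, alignment-independent search described above can be sketched roughly as follows. This is an illustration only, not the authors' packages and not Google Desktop Search: the word length `k`, the use of overlapping words, the in-memory inverted index, and the ranking by number of shared words are all assumptions, and every function name here is hypothetical.

```python
from collections import defaultdict


def to_words(seq: str, k: int = 8) -> list[str]:
    """Split a sequence into overlapping words of length k (assumed scheme)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(library: dict[str, str], k: int = 8) -> dict[str, set[str]]:
    """Inverted index: word -> names of library sequences containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, seq in library.items():
        for word in to_words(seq, k):
            index[word].add(name)
    return index


def search(query: str, index: dict[str, set[str]], k: int = 8) -> list[tuple[str, int]]:
    """Rank library entries by how many distinct query words they contain."""
    hits: dict[str, int] = defaultdict(int)
    for word in set(to_words(query, k)):
        for name in index.get(word, ()):
            hits[name] += 1
    # Highest count first; ties broken alphabetically for stable output.
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Toy "barcode library" with made-up sequences.
    library = {"sp1": "ACGTACGTTTGA", "sp2": "GGGCCCAAATTT"}
    index = build_index(library, k=4)
    print(search("CGTTTG", index, k=4))
```

Treating each word as a search term is what lets a general-purpose text index stand in for a sequence aligner; the trade-off is that short or low-complexity words match many unrelated entries.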

  4. Graphene nanodevices for DNA sequencing

    NARCIS (Netherlands)

    Heerema, S.J.; Dekker, C.

    2016-01-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with

  5. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  6. Characteristics of alternating current hopping conductivity in DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Ma Song-Shan; Xu Hui; Wang Huan-You; Guo Rui

    2009-01-01

    This paper presents a model to describe alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system, and electrons transport via hopping between localized states. It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises, and it takes the form of $\sigma_{ac}(\omega) \sim \omega^{2} \ln^{2}(1/\omega)$. Also, AC conductivity of DNA sequences increases with the increase of temperature; this phenomenon presents characteristics of weak temperature-dependence. Meanwhile, the AC conductivity in an off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit in low temperatures, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences. For p < 0.5, the conductivity of DNA sequence decreases with the increase of p, while for p > 0.5, the conductivity increases with the increase of p.
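
As a quick numerical check of the quoted frequency dependence, the sketch below evaluates only the shape ω² ln²(1/ω). It assumes a dimensionless frequency between 0 and 1 and omits the model-dependent prefactor, neither of which is specified in this record; the function name `sigma_ac_shape` is hypothetical.

```python
import math


def sigma_ac_shape(omega: float) -> float:
    """Shape of the AC conductivity, omega^2 * ln^2(1/omega), prefactor omitted.

    omega is treated as a dimensionless frequency with 0 < omega < 1
    (an assumption; the physical frequency scale is not given here).
    """
    if not 0.0 < omega < 1.0:
        raise ValueError("omega must lie strictly between 0 and 1")
    return omega ** 2 * math.log(1.0 / omega) ** 2


if __name__ == "__main__":
    for omega in (0.01, 0.1, 0.3):
        print(f"omega = {omega:<5} shape = {sigma_ac_shape(omega):.5f}")
```

Differentiating the expression gives 2ω ln(1/ω)(ln(1/ω) − 1), which is positive for ω < 1/e, so in these units the stated rise with frequency holds throughout the low-frequency regime.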

  7. Identification of DNA-binding protein target sequences by physical effective energy functions: free energy analysis of lambda repressor-DNA complexes.

    Directory of Open Access Journals (Sweden)

    Caselle Michele

    2007-09-01

    Full Text Available Abstract Background Specific binding of proteins to DNA is one of the most common ways gene expression is controlled. Although general rules for DNA-protein recognition can be derived, the ambiguous and complex nature of this mechanism precludes a simple recognition code; therefore, the prediction of DNA target sequences is not straightforward. DNA-protein interactions can be studied using computational methods, which can complement the current experimental methods and offer some advantages. In the present work we use physical effective potentials to evaluate the DNA-protein binding affinities for the λ repressor-DNA complex, for which structural and thermodynamic experimental data are available. Results The binding free energy of two molecules can be expressed as the sum of an intermolecular energy (evaluated using a molecular mechanics forcefield), a solvation free energy term and an entropic term. Different solvation models are used, including distance-dependent dielectric constants, solvent accessible surface tension models and the Generalized Born model. The effect of conformational sampling by Molecular Dynamics simulations on the computed binding energy is assessed; results show that this effect is in general negative and the reproducibility of the experimental values decreases with the increase of simulation time considered. The free energy of binding for non-specific complexes, estimated using the best energetic model, agrees with earlier theoretical suggestions. As a result of these analyses, we propose a protocol for the prediction of DNA-binding target sequences. The possibility of searching regulatory elements within the bacteriophage λ genome using this protocol is explored. Our analysis shows good prediction capabilities, even in absence of any thermodynamic data and information on the naturally recognized sequence. Conclusion This study supports the conclusion that physics-based methods can offer a completely complementary

  8. DNA Nucleotide Sequence Restricted by the RI Endonuclease

    Science.gov (United States)

    Hedgpeth, Joe; Goodman, Howard M.; Boyer, Herbert W.

    1972-01-01

    The sequence of DNA base pairs adjacent to the phosphodiester bonds cleaved by the RI restriction endonuclease in unmodified DNA from coliphage λ has been determined. The 5′-terminal nucleotide labeled with 32P and oligonucleotides up to the heptamer were analyzed from a pancreatic DNase digest. The following sequence of nucleotides adjacent to the RI break made in λ DNA was deduced from these data and from the 3′-dinucleotide sequence and nearest-neighbor analysis obtained from repair synthesis with the DNA polymerase of Rous sarcoma virus [Formula: see text] The RI endonuclease cleavage of the phosphodiester bonds (indicated by arrows) generates 5′-phosphoryls and short cohesive termini of four nucleotides, pApApTpT. The most striking feature of the sequence is its symmetry. PMID:4343974

  9. An extended sequence specificity for UV-induced DNA damage.

    Science.gov (United States)

    Chung, Long H; Murray, Vincent

    2018-01-01

    The sequence specificity of UV-induced DNA damage was determined with a higher precision and accuracy than previously reported. UV light induces two major damage adducts: cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). Employing capillary electrophoresis with laser-induced fluorescence and taking advantage of the distinct properties of the CPDs and 6-4PPs, we studied the sequence specificity of UV-induced DNA damage in a purified DNA sequence using two approaches: end-labelling and a polymerase stop/linear amplification assay. A mitochondrial DNA sequence that contained a random nucleotide composition was employed as the target DNA sequence. With previous methodology, the UV sequence specificity was determined at a dinucleotide or trinucleotide level; however, in this paper, we have extended the UV sequence specificity to a hexanucleotide level. With the end-labelling technique (for 6-4PPs), the consensus sequence was found to be 5'-GCTC*AC (where C* is the breakage site); while with the linear amplification procedure, it was 5'-TCTT*AC. With end-labelling, the dinucleotide frequency of occurrence was highest for 5'-TC*, 5'-TT* and 5'-CC*; whereas it was 5'-TT* for linear amplification. The influence of neighbouring nucleotides on the degree of UV-induced DNA damage was also examined. The core sequences consisted of pyrimidine nucleotides 5'-CTC* and 5'-CTT* while an A at position "1" and C at position "2" enhanced UV-induced DNA damage. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.

  10. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sec...

  11. Morphology of the larvae, male genitalia and DNA sequences of Anopheles (Kerteszia) pholidotus (Diptera: Culicidae) from Colombia

    Directory of Open Access Journals (Sweden)

    Jesús Eduardo Escovar

    2014-07-01

    Full Text Available Since 1984, Anopheles (Kerteszia) lepidotus has been considered a mosquito species that is involved in the transmission of malaria in Colombia, after having been incriminated as such with epidemiological evidence from a malaria outbreak in Cunday-Villarrica, Tolima. Subsequent morphological analyses of females captured in the same place and at the time of the outbreak showed that the species responsible for the transmission was not An. lepidotus, but rather Anopheles pholidotus. However, the associated morphological stages and DNA sequences of An. pholidotus from the foci of Cunday-Villarrica had not been analysed. Using samples that were caught recently from the outbreak region, the purpose of this study was to provide updated and additional information by analysing the morphology of female mosquitoes, the genitalia of male mosquitoes and fourth instar larvae of An. pholidotus, which was confirmed with DNA sequences of cytochrome oxidase I and rDNA internal transcribed spacer. A total of 1,596 adult females were collected in addition to 37 larval collections in bromeliads. Furthermore, 141 adult females, which were captured from the same area in the years 1981-1982, were analysed morphologically. Ninety-five DNA sequences were analysed for this study. Morphological and molecular analyses showed that the species present in this region corresponds to An. pholidotus. Given the absence of An. lepidotus, even in recent years, we consider that the species of mosquitoes that was previously incriminated as the malaria vector during the outbreak was indeed An. pholidotus, thus ending the controversy.

  12. Characteristics of alternating current hopping conductivity in DNA sequences

    International Nuclear Information System (INIS)

    Song-Shan, Ma; Hui, Xu; Huan-You, Wang; Rui, Guo

    2009-01-01

    This paper presents a model to describe alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system, and electrons transport via hopping between localized states. It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises, and it takes the form of $\sigma_{ac}(\omega) \sim \omega^{2} \ln^{2}(1/\omega)$. Also, AC conductivity of DNA sequences increases with the increase of temperature; this phenomenon presents characteristics of weak temperature-dependence. Meanwhile, the AC conductivity in an off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit in low temperatures, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences. For p < 0.5, the conductivity of DNA sequence decreases with the increase of p, while for p ≥ 0.5, the conductivity increases with the increase of p. (cross-disciplinary physics and related areas of science and technology)

  13. Sequence-dependent DNA deformability studied using molecular dynamics simulations.

    Science.gov (United States)

    Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori

    2007-01-01

    Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained by the simulations is consistent with that by the crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can highly depend on the flanking base pairs. The conformations of xATx tetramers show the most rigidity and are not affected by the flanking base pairs and the xYRx show by contrast the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.
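
The figure of 136 possible tetrameric sequences follows from counting double-stranded tetramers, where a sequence and its reverse complement describe the same duplex. The sketch below reproduces that count by enumeration; the function names are hypothetical and the code is only a counting illustration, not part of the simulation protocol.

```python
from itertools import product

# Translation table for complementing a DNA strand.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def unique_duplex_kmers(k: int) -> set[str]:
    """All k-mers, with each k-mer and its reverse complement merged.

    The lexicographically smaller strand is kept as the representative.
    """
    return {
        min(kmer, reverse_complement(kmer))
        for kmer in map("".join, product("ACGT", repeat=k))
    }


if __name__ == "__main__":
    print(len(unique_duplex_kmers(4)))  # unique tetramers
    print(len(unique_duplex_kmers(2)))  # unique dimer steps
```

Of the 4^4 = 256 single-strand tetramers, 16 are their own reverse complement, so the number of distinct duplexes is (256 + 16) / 2 = 136. The same argument gives (16 + 4) / 2 = 10 unique dimer steps.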

  14. Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing.

    Science.gov (United States)

    Hargreaves, Adam D; Mulley, John F

    2015-01-01

    Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0-2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5' and 3' UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species.

  15. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    Energy Technology Data Exchange (ETDEWEB)

    Winston Chen, C.H.; Taranenko, N.I.; Zhu, Y.F.; Chung, C.N.; Allman, S.L.

    1997-03-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, the authors recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced by Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic applications. The preliminary results indicate that laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step toward reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, the authors applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  16. ITS all right mama: investigating the formation of chimeric sequences in the ITS2 region by DNA metabarcoding analyses of fungal mock communities of different complexities.

    Science.gov (United States)

    Bjørnsgaard Aas, Anders; Davey, Marie Louise; Kauserud, Håvard

    2017-07-01

    The formation of chimeric sequences can create significant methodological bias in PCR-based DNA metabarcoding analyses. During mixed-template amplification of barcoding regions, chimera formation is frequent and well documented. However, profiling of fungal communities typically uses the more variable rDNA region ITS. Due to a larger research community, tools for chimera detection have been developed mainly for the 16S/18S markers. However, these tools are widely applied to the ITS region without verification of their performance. We examined the rate of chimera formation during amplification and 454 sequencing of the ITS2 region from fungal mock communities of different complexities. We evaluated the chimera detecting ability of two common chimera-checking algorithms: perseus and uchime. Large proportions of the chimeras reported were false positives. No false negatives were found in the data set. Verified chimeras accounted for only 0.2% of the total ITS2 reads, which is considerably less than what is typically reported in 16S and 18S metabarcoding analyses. Verified chimeric 'parent sequences' had significantly higher per cent identity to one another than to random members of the mock communities. Community complexity increased the rate of chimera formation. GC content was higher around the verified chimeric break points, potentially facilitating chimera formation through base pair mismatching in the neighbouring regions of high similarity in the chimeric region. We conclude that the hypervariable nature of the ITS region seems to buffer the rate of chimera formation in comparison with other, less variable barcoding regions, due to shorter regions of high sequence similarity. © 2016 John Wiley & Sons Ltd.

  17. Molecular identification and phylogenetic analysis of important medicinal plant species in genus Paeonia based on rDNA-ITS, matK, and rbcL DNA barcode sequences.

    Science.gov (United States)

    Kim, W J; Ji, Y; Choi, G; Kang, Y M; Yang, S; Moon, B C

    2016-08-05

    This study was performed to identify and analyze the phylogenetic relationship among four herbaceous species of the genus Paeonia, P. lactiflora, P. japonica, P. veitchii, and P. suffruticosa, using DNA barcodes. These four species, which are commonly used in traditional medicine as Paeoniae Radix and Moutan Radicis Cortex, are pharmaceutically defined in different ways in the national pharmacopoeias in Korea, Japan, and China. To authenticate the different species used in these medicines, we evaluated rDNA-internal transcribed spacers (ITS), matK and rbcL regions, which provide information capable of effectively distinguishing each species from one another. Seventeen samples were collected from different geographic regions in Korea and China, and DNA barcode regions were amplified using universal primers. Comparative analyses of these DNA barcode sequences revealed species-specific nucleotide sequences capable of discriminating the four Paeonia species. Among the entire sequences of three barcodes, marker nucleotides were identified at three positions in P. lactiflora, eleven in P. japonica, five in P. veitchii, and 25 in P. suffruticosa. Phylogenetic analyses also revealed four distinct clusters showing homogeneous clades with high resolution at the species level. The results demonstrate that the analysis of these three DNA barcode sequences is a reliable method for identifying the four Paeonia species and can be used to authenticate Paeoniae Radix and Moutan Radicis Cortex at the species level. Furthermore, based on the assessment of amplicon sizes, inter/intra-specific distances, marker nucleotides, and phylogenetic analysis, rDNA-ITS was the most suitable DNA barcode for identification of these species.
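
    The notion of a species-specific marker nucleotide can be illustrated on an alignment. The sketch below assumes that a marker is an alignment column where every sample of one species carries the same base and no sample of any other species carries it; this definition, the function name, and the toy sequences are assumptions for illustration, not the authors' procedure or data.

```python
def marker_positions(alignment: dict, target: str) -> list:
    """Return 0-based alignment columns where `target` carries a fixed
    nucleotide that no sequence from any other species shares."""
    target_seqs = alignment[target]
    other_seqs = [
        seq
        for name, seqs in alignment.items()
        if name != target
        for seq in seqs
    ]
    markers = []
    for col in range(len(target_seqs[0])):
        target_bases = {seq[col] for seq in target_seqs}
        other_bases = {seq[col] for seq in other_seqs}
        # Fixed within the target species and absent from all others.
        if len(target_bases) == 1 and not (target_bases & other_bases):
            markers.append(col)
    return markers

# Toy alignment with invented sequences (equal length, already aligned).
toy_alignment = {
    "species_A": ["ACGTAC", "ACGTAC"],
    "species_B": ["ACGTTC", "ACGTTC"],
    "species_C": ["GCGTTC", "GCGTTC"],
}
for species in toy_alignment:
    print(species, marker_positions(toy_alignment, species))
```

    In this toy case species_B has no marker column, because each of its bases is shared with at least one other species.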

  18. Sequence-specific DNA alkylation by tandem Py-Im polyamide conjugates.

    Science.gov (United States)

    Taylor, Rhys Dylan; Kawamoto, Yusuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2014-09-01

    Tandem N-methylpyrrole-N-methylimidazole (Py-Im) polyamides with good sequence-specific DNA-alkylating activities have been designed and synthesized. Three alkylating tandem Py-Im polyamides with different linkers, which each contained the same moiety for the recognition of a 10 bp DNA sequence, were evaluated for their reactivity and selectivity by DNA alkylation, using high-resolution denaturing gel electrophoresis. All three conjugates displayed high reactivities for the target sequence. In particular, polyamide 1, which contained a β-alanine linker, displayed the most-selective sequence-specific alkylation towards the target 10 bp DNA sequence. The tandem Py-Im polyamide conjugates displayed greater sequence-specific DNA alkylation than conventional hairpin Py-Im polyamide conjugates (4 and 5). For further research, the design of tandem Py-Im polyamide conjugates could play an important role in targeting specific gene sequences. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Biomolecule Sequencer: Next-Generation DNA Sequencing Technology for In-Flight Environmental Monitoring, Research, and Beyond

    Science.gov (United States)

    Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.

    2016-01-01

    On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially available next-generation DNA sequencer in the microgravity environment of the ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. We will present our latest results from the ISS flight experiment, the first time DNA has ever been sequenced in space, and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human

  20. The Large Subunit rDNA Sequence of Plasmodiophora brassicae Does not Contain Intra-species Polymorphism.

    Science.gov (United States)

    Schwelm, Arne; Berney, Cédric; Dixelius, Christina; Bass, David; Neuhauser, Sigrid

    2016-12-01

    Clubroot disease caused by Plasmodiophora brassicae is one of the most important diseases of cultivated brassicas. P. brassicae occurs in pathotypes which differ in their aggressiveness towards their Brassica host plants. To date no DNA-based method to distinguish these pathotypes has been described. In 2011, polymorphism within the 28S rDNA of P. brassicae was reported which could potentially allow pathotypes to be distinguished without the need for time-consuming bioassays. However, isolates of P. brassicae from around the world analysed in this study do not show polymorphism in their LSU rDNA sequences. The previously described polymorphism was most likely derived from soil-inhabiting Cercozoa, more specifically Neoheteromita-like glissomonads. Here we correct the LSU rDNA sequence of P. brassicae. By using FISH we demonstrate that our newly generated sequence belongs to the causal agent of clubroot disease. Copyright © 2016 The Authors. Published by Elsevier GmbH. All rights reserved.

  1. Identification of Meconopsis species by a DNA barcode sequence ...

    African Journals Online (AJOL)

    Deoxyribonucleic acid (DNA) barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Species identification is necessary for the authentication of traditional plant based medicines. Although a consensus has not been agreed regarding which DNA sequences can be used as ...

  2. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    NARCIS (Netherlands)

    Buschmann, Tilo; Bystrykh, Leonid V.

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fraction of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called
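
    The barcodes in the title are named after the Levenshtein (edit) distance, which counts the substitutions, insertions, and deletions needed to turn one string into another. The following is a minimal sketch of the classic distance and of the minimum pairwise distance of a barcode set; the function names and the toy barcodes are illustrative assumptions, and the code is not the construction used in the paper.

```python
from itertools import combinations

def levenshtein(a: str, b: str) -> int:
    """Classic edit distance, computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]

def min_pairwise_distance(barcodes: list) -> int:
    """Smallest edit distance between any two barcodes in the set."""
    return min(levenshtein(x, y) for x, y in combinations(barcodes, 2))

# Toy barcode set (invented for illustration).
toy_barcodes = ["AAAA", "TTTT", "CCCC"]
print(min_pairwise_distance(toy_barcodes))  # 4
```

    In the idealized case where barcode boundaries are known, a set with minimum distance d allows up to (d − 1) // 2 edits to be corrected by nearest-neighbour decoding; whether that idealization holds for real reads is outside the scope of this sketch.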

  3. Mitochondrial DNA sequence data reveals association of haplogroup U with psychosis in bipolar disorder.

    Science.gov (United States)

    Frye, Mark A; Ryu, Euijung; Nassan, Malik; Jenkins, Gregory D; Andreazza, Ana C; Evans, Jared M; McElroy, Susan L; Oglesbee, Devin; Highsmith, W Edward; Biernacka, Joanna M

    2017-01-01

    Converging genetic, postmortem gene-expression, cellular, and neuroimaging data implicate mitochondrial dysfunction in bipolar disorder. This study was conducted to investigate whether mitochondrial DNA (mtDNA) haplogroups and single nucleotide variants (SNVs) are associated with sub-phenotypes of bipolar disorder. MtDNA from 224 patients with Bipolar I disorder (BPI) was sequenced, and association of sequence variations with 3 sub-phenotypes (psychosis, rapid cycling, and adolescent illness onset) was evaluated. Gene-level tests were performed to evaluate overall burden of minor alleles for each phenotype. The haplogroup U was associated with a higher risk of psychosis. Secondary analyses of SNVs provided nominal evidence for association of psychosis with variants in the tRNA, ND4 and ND5 genes. The association of psychosis with ND4 (gene that encodes NADH dehydrogenase 4) was further supported by gene-level analysis. Preliminary analysis of mtDNA sequence data suggests a higher risk of psychosis with the U haplogroup and variation in the ND4 gene implicated in electron transport chain energy regulation. Further investigation of the functional consequences of this mtDNA variation is encouraged. Copyright © 2016. Published by Elsevier Ltd.

  4. Analysis of T-DNA/Host-Plant DNA Junction Sequences in Single-Copy Transgenic Barley Lines

    Directory of Open Access Journals (Sweden)

    Joanne G. Bartlett

    2014-01-01

    Full Text Available Sequencing across the junction between an integrated transfer DNA (T-DNA) and a host plant genome provides two important pieces of information. The junctions themselves provide information regarding the proportion of T-DNA which has integrated into the host plant genome, whilst the transgene flanking sequences can be used to study the local genetic environment of the integrated transgene. In addition, this information is important in the safety assessment of GM crops and essential for GM traceability. In this study, a detailed analysis was carried out on the right-border T-DNA junction sequences of single-copy independent transgenic barley lines. T-DNA truncations at the right border were found to be relatively common and affected 33.3% of the lines. In addition, 14.3% of lines had rearranged construct sequence after the right-border break-point. An in-depth analysis of the host-plant flanking sequences revealed that a significant proportion of the T-DNAs integrated into or close to known repetitive elements. However, this integration into repetitive DNA did not have a negative effect on transgene expression.

  5. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    Science.gov (United States)

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  6. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    Science.gov (United States)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective single-molecule sequencing method. Here, we present the first demonstration of a unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic states of the nucleobases shift depending on the pH, with the most distinct states identified at acidic pH. We also demonstrate identification of single-nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of the beta-lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  7. PISMA: A Visual Representation of Motif Distribution in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Rogelio Alcántara-Silva

    2017-03-01

    Full Text Available Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code–like, as a gene-map–like, and as a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf .
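
    Locating and counting a motif, and drawing a crude bar-code-like view of its distribution, can be sketched in a few lines. PISMA itself is a Java program; the Python functions below are hypothetical illustrations of the idea and do not reproduce its implementation or interface.

```python
def motif_positions(sequence: str, motif: str) -> list:
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    sequence, motif = sequence.upper(), motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping occurrences are also found.
        start = sequence.find(motif, start + 1)
    return positions

def barcode_line(sequence: str, motif: str) -> str:
    """Render a one-line 'bar code': '|' at motif starts, '.' elsewhere."""
    hits = set(motif_positions(sequence, motif))
    return "".join("|" if i in hits else "." for i in range(len(sequence)))

toy_sequence = "ACGCGTTCGA"  # invented example
print(motif_positions(toy_sequence, "CG"))  # [1, 3, 7]
print(barcode_line(toy_sequence, "CG"))     # .|.|...|..
```

    Searching for "CG" in this way gives the CpG site distribution along one strand; a gene-map-like or transcript view would additionally need annotation coordinates.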

  8. Torque measurements reveal sequence-specific cooperative transitions in supercoiled DNA

    Science.gov (United States)

    Oberstrass, Florian C.; Fernandes, Louis E.; Bryant, Zev

    2012-01-01

    B-DNA becomes unstable under superhelical stress and is able to adopt a wide range of alternative conformations including strand-separated DNA and Z-DNA. Localized sequence-dependent structural transitions are important for the regulation of biological processes such as DNA replication and transcription. To directly probe the effect of sequence on structural transitions driven by torque, we have measured the torsional response of a panel of DNA sequences using single molecule assays that employ nanosphere rotational probes to achieve high torque resolution. The responses of Z-forming d(pGpC)n sequences match our predictions based on a theoretical treatment of cooperative transitions in helical polymers. “Bubble” templates containing 50–100 bp mismatch regions show cooperative structural transitions similar to B-DNA, although less torque is required to disrupt strand–strand interactions. Our mechanical measurements, including direct characterization of the torsional rigidity of strand-separated DNA, establish a framework for quantitative predictions of the complex torsional response of arbitrary sequences in their biological context. PMID:22474350

  9. Directed PCR-free engineering of highly repetitive DNA sequences

    Directory of Open Access Journals (Sweden)

    Preissler Steffen

    2011-09-01

    Full Text Available Background: Highly repetitive nucleotide sequences are commonly found in nature, e.g. in telomeres, microsatellite DNA, and polyadenine (poly(A)) tails of eukaryotic messenger RNA, as well as in several inherited human disorders linked to trinucleotide repeat expansions in the genome. Therefore, studying repetitive sequences is of biological, biotechnological and medical relevance. However, cloning of such repetitive DNA sequences is challenging because specific PCR-based amplification is hampered by the lack of unique primer binding sites, resulting in unspecific products. Results: For the PCR-free generation of repetitive DNA sequences we used antiparallel oligonucleotides flanked by restriction sites of Type IIS endonucleases. The arrangement of recognition sites allowed for stepwise and seamless elongation of repetitive sequences. This facilitated the assembly of repetitive DNA segments and open reading frames encoding polypeptides with periodic amino acid sequences of any desired length. By this strategy we cloned a series of polyglutamine-encoding sequences as well as highly repetitive polyadenine tracts. Such repetitive sequences can be used for diverse biotechnological applications. As an example, the polyglutamine sequences were expressed as His6-SUMO fusion proteins in Escherichia coli cells to study their aggregation behavior in vitro. The His6-SUMO moiety enabled affinity purification of the polyglutamine proteins, increased their solubility, and allowed controlled induction of the aggregation process. We successfully purified the fusion proteins and provide an example of their applicability in filter retardation assays. Conclusion: Our seamless cloning strategy is PCR-free and allows the directed and efficient generation of highly repetitive DNA sequences of defined lengths by simple standard cloning procedures.

  10. DNA analyses of the remains of the Prince Branciforte Barresi family.

    Science.gov (United States)

    Rickards, O; Martínez-Labarga, C; Favaro, M; Frezza, D; Mallegni, F

    2001-01-01

    The five skeletons found buried in the church of Militello di Catania, Sicily, were tentatively identified by morphological analysis and historical reports as the remains of Prince Branciforte Barresi, two of his children, his brother and another juvenile member of the family (sixteenth and seventeenth centuries). In order to attempt to clarify the degree of relationships of the five skeletons, sex testing and mitochondrial DNA (mtDNA) sequence analysis of the hypervariable segments I and II (HV1 and HV2) of control region were performed. Moreover, the 9 bp-deletion marker of region V (COII/tRNAlys) was examined. Molecular genetic analyses were consistent with historical expectations, although they did not directly demonstrate that these are in fact the remains of the Prince and his relatives, due to the impossibility of obtaining DNA from living maternal relatives of the Prince.

  11. Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing

    Directory of Open Access Journals (Sweden)

    Adam D. Hargreaves

    2015-11-01

    Full Text Available Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0–2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5′ and 3′ UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species.

  12. Automated methods for single-stranded DNA isolation and dideoxynucleotide DNA sequencing reactions on a robotic workstation

    International Nuclear Information System (INIS)

    Mardis, E.R.; Roe, B.A.

    1989-01-01

    Automated procedures have been developed both for the simultaneous isolation of 96 single-stranded M13 chimeric template DNAs in less than two hours and for simultaneously pipetting 24 dideoxynucleotide sequencing reactions on a commercially available laboratory workstation. The DNA sequencing results obtained by either radiolabeled or fluorescent methods are consistent with the premise that automation of these portions of DNA sequencing projects will improve the reproducibility of the DNA isolation, and the procedures for these normally labor-intensive steps provide an approach for rapid acquisition of large amounts of high-quality, reproducible DNA sequence data.

  13. Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

    Science.gov (United States)

    Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

    2010-05-07

    Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.

  14. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Full Text Available Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  15. Sequencing Intractable DNA to Close Microbial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Hurt, Jr., Richard Ashley [ORNL; Brown, Steven D [ORNL; Podar, Mircea [ORNL; Palumbo, Anthony Vito [ORNL; Elias, Dwayne A [ORNL

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  16. Sequence Dependent Interactions Between DNA and Single-Walled Carbon Nanotubes

    Science.gov (United States)

    Roxbury, Daniel

    It is known that single-stranded DNA adopts a helical wrap around a single-walled carbon nanotube (SWCNT), forming a water-dispersible hybrid molecule. The ability to sort mixtures of SWCNTs based on chirality (electronic species) has recently been demonstrated using special short DNA sequences that recognize certain matching SWCNTs of specific chirality. This thesis investigates the intricacies of DNA-SWCNT sequence-specific interactions through both experimental and molecular simulation studies. The DNA-SWCNT binding strengths were experimentally quantified by studying the kinetics of DNA replacement by a surfactant on the surface of particular SWCNTs. Recognition ability was found to correlate strongly with measured binding strength, e.g. DNA sequence (TAT)4 was found to bind 20 times stronger to the (6,5)-SWCNT than sequence (TAT)4T. Next, using replica exchange molecular dynamics (REMD) simulations, equilibrium structures formed by (a) single-strands and (b) multiple-strands of 12-mer oligonucleotides adsorbed on various SWCNTs were explored. A number of structural motifs were discovered in which the DNA strand wraps around the SWCNT and 'stitches' to itself via hydrogen bonding. Great variability among equilibrium structures was observed and shown to be directly influenced by DNA sequence and SWCNT type. For example, the (6,5)-SWCNT DNA recognition sequence, (TAT)4, was found to wrap in a tight single-stranded right-handed helical conformation. In contrast, DNA sequence T12 forms a beta-barrel left-handed structure on the same SWCNT. These are the first theoretical indications that DNA-based SWCNT selectivity can arise on a molecular level. In a biomedical collaboration with the Mayo Clinic, pathways for DNA-SWCNT internalization into healthy human endothelial cells were explored. Through absorbance spectroscopy, TEM imaging, and confocal fluorescence microscopy, we showed that intracellular concentrations of SWCNTs far exceeded those of the incubation

  17. Evaluation of a transposase protocol for rapid generation of shotgun high-throughput sequencing libraries from nanogram quantities of DNA.

    Science.gov (United States)

    Marine, Rachel; Polson, Shawn W; Ravel, Jacques; Hatfull, Graham; Russell, Daniel; Sullivan, Matthew; Syed, Fraz; Dumas, Michael; Wommack, K Eric

    2011-11-01

    Construction of DNA fragment libraries for next-generation sequencing can prove challenging, especially for samples with low DNA yield. Protocols devised to circumvent the problems associated with low starting quantities of DNA can result in amplification biases that skew the distribution of genomes in metagenomic data. Moreover, sample throughput can be slow, as current library construction techniques are time-consuming. This study evaluated Nextera, a new transposon-based method that is designed for quick production of DNA fragment libraries from a small quantity of DNA. The sequence read distribution across nine phage genomes in a mock viral assemblage met predictions for six of the least-abundant phages; however, the rank order of the most abundant phages differed slightly from predictions. De novo genome assemblies from Nextera libraries provided long contigs spanning over half of the phage genome; in four cases where full-length genome sequences were available for comparison, consensus sequences were found to match over 99% of the genome with near-perfect identity. Analysis of areas of low and high sequence coverage within phage genomes indicated that GC content may influence coverage of sequences from Nextera libraries. Comparisons of phage genomes prepared using both Nextera and a standard 454 FLX Titanium library preparation protocol suggested that the coverage biases according to GC content observed within the Nextera libraries were largely attributable to bias in the Nextera protocol rather than to the 454 sequencing technology. Nevertheless, given suitable sequence coverage, the Nextera protocol produced high-quality data for genomic studies. For metagenomics analyses, effects of GC amplification bias would need to be considered; however, the library preparation standardization that Nextera provides should benefit comparative metagenomic analyses.
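
    Relating coverage to GC content starts with computing GC content along the genome in windows. A minimal sketch follows; the window size, step, function names, and toy sequence are illustrative assumptions rather than parameters from the study.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def windowed_gc(seq: str, window: int, step: int) -> list:
    """Return (start, GC fraction) for each full window along the sequence."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

toy_genome = "GGCCAATTGGCC"  # invented example
for start, gc in windowed_gc(toy_genome, window=4, step=4):
    print(start, gc)
```

    Pairing each window's GC fraction with the mean read depth over the same window (not computed here) would allow a coverage-versus-GC comparison of the kind the abstract describes.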

  18. Minding the gap: Frequency of indels in mtDNA control region sequence data and influence on population genetic analyses

    Science.gov (United States)

    Pearce, J.M.

    2006-01-01

    Insertions and deletions (indels) result in sequences of various lengths when homologous gene regions are compared among individuals or species. Although indels are typically phylogenetically informative, occurrence and incorporation of these characters as gaps in intraspecific population genetic data sets are rarely discussed. Moreover, the impact of gaps on estimates of fixation indices, such as FST, has not been reviewed. Here, I summarize the occurrence and population genetic signal of indels among 60 published studies that involved alignments of multiple sequences from the mitochondrial DNA (mtDNA) control region of vertebrate taxa. Among 30 studies observing indels, an average of 12% of both variable and parsimony-informative sites were composed of these sites. There was no consistent trend between levels of population differentiation and the number of gap characters in a data block. Across all studies, the average influence on estimates of ΦST was small, explaining only an additional 1.8% of among population variance (range 0.0-8.0%). Studies most likely to observe an increase in ΦST with the inclusion of gap characters were those with [...] control region DNA appears small, dependent upon total number of variable sites in the data block, and related to species-specific characteristics and the spatial distribution of mtDNA lineages that contain indels. © 2006 Blackwell Publishing Ltd.

  19. Recurrence plot analysis of DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Wu Zuobing [State Key Laboratory of Nonlinear Mechanics, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100080 (China)]. E-mail: wuzb@lnm.imech.ac.cn

    2004-11-15

    A recurrence plot technique for DNA sequences is established on a metric representation and employed to analyze the correlation structure of nucleotide strings. It is found that, in the transference of nucleotide strings, a human DNA fragment has one major correlation distance, whereas the correlation distance of a yeast chromosome increases constantly.

  20. Chromatid interchanges at intrachromosomal telomeric DNA sequences

    International Nuclear Information System (INIS)

    Fernandez, J.L.; Vazquez-Gundin, F.; Bilbao, A.; Gosalvez, J.; Goyanes, V.

    1997-01-01

    Chinese hamster Don cells were exposed to X-rays, mitomycin C and teniposide (VM-26) to induce chromatid exchanges (quadriradials and triradials). After fluorescence in situ hybridization (FISH) of telomere sequences it was found that interstitial telomere-like DNA sequence arrays presented around five times more breakage-rearrangements than the genome overall. This high recombinogenic capacity was independent of the clastogen, suggesting that this susceptibility is not related to the initial mechanisms of DNA damage. (author)

  1. Development of a defined-sequence DNA system for use in DNA misrepair studies

    International Nuclear Information System (INIS)

    Sutton, S.; Tobias, C.A.

    1984-01-01

    The authors have developed a system that allows them to study cellular DNA repair processes at the molecular level. In particular, the authors are using this system to examine the consequences of a misrepair of radiation-induced DNA damage, as a function of dose. The cells being used are specially engineered haploid yeast cells. Maintained in the cells, at one copy per cell, is a CEN plasmid, a plasmid that behaves like a functional chromosome. This plasmid carries a small defined sequence of DNA from the E. coli lacZ gene. It is this lacZ region (called the alpha region) that serves as the target for radiation damage. Two copies of the complementary portion of the lacZ gene are integrated into the yeast genome. Irradiated cells are screened for possible mutation in the alpha region by testing the cells' ability to hydrolyze X-gal, a lactose analog substrate. The DNA of interest is then extracted from the cells and sequenced, and the sequence is compared to that of the control. Unlike the usual defined-sequence DNA systems, theirs is an in vivo system. A disadvantage is the relatively high background mutation rate. Results achieved with this system, as well as future applications, are discussed.

  2. Adenoviral DNA replication: DNA sequences and enzymes required for initiation in vitro

    International Nuclear Information System (INIS)

    Stillman, B.W.; Tamanoi, F.

    1983-01-01

    In this paper evidence is provided that the 140,000-dalton DNA polymerase is encoded by the adenoviral genome and is required for the initiation of DNA replication in vitro. The DNA sequences in the template DNA that are required for the initiation of replication have also been identified, using both plasmid DNAs and synthetic oligodeoxyribonucleotides. 48 references, 7 figures, 1 table

  3. RANDNA: a random DNA sequence generator.

    Science.gov (United States)

    Piva, Francesco; Principato, Giovanni

    2006-01-01

    Monte Carlo simulations are useful for verifying the significance of data. Genomic regularities, such as nucleotide correlations or the non-uniform distribution of motifs throughout genomic or mature mRNA sequences, exist, and their significance can be checked by means of a Monte Carlo test. The test needs good-quality random sequences in order to work; moreover, they should have the same nucleotide distribution as the sequences in which the regularities were found. Random DNA sequences are also useful for estimating the background score of an alignment, that is, a threshold below which the resulting score is merely due to chance. We have developed RANDNA, free software that produces random DNA or RNA sequences with a user-specified length and percentage nucleotide composition. Sequences having the same nucleotide distribution as exonic, intronic or intergenic sequences can be generated. Its graphical interface makes it easy to set the parameters that characterize the sequences being produced, which are saved in a text-format file. The pseudo-random number generator function of Borland Delphi 6 is used, since it guarantees good randomness, a long cycle length and high speed. We have checked the quality of the sequences generated by the software by means of well-known tests, both by themselves and against genuine random sequences, and we show that the generated sequences are of good quality. The software, complete with examples and documentation, is freely available to users from: http://www.introni.it/en/software.
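
    The core idea of the record above (random sequences with a chosen length and nucleotide composition) can be illustrated with a minimal Python sketch. This is not RANDNA's code: the function name `random_sequence`, its parameters, and the use of Python's built-in generator in place of the Borland Delphi 6 generator are all assumptions made for illustration, and composition is given as fractions rather than percentages.

```python
import random


def random_sequence(length, composition, alphabet="ACGT", seed=None):
    """Draw a random sequence whose symbols follow the given composition.

    `composition` holds one probability per symbol of `alphabet`
    (hypothetical interface; fractions that must sum to 1).
    """
    if len(composition) != len(alphabet):
        raise ValueError("composition needs one entry per alphabet symbol")
    if abs(sum(composition) - 1.0) > 1e-9:
        raise ValueError("composition must sum to 1")
    rng = random.Random(seed)  # Python's generator, not the Delphi one
    return "".join(rng.choices(alphabet, weights=composition, k=length))


if __name__ == "__main__":
    # A 60-base sequence with an AT-rich background (30% A, 20% C, 20% G, 30% T).
    print(random_sequence(60, [0.3, 0.2, 0.2, 0.3], seed=1))
```

    Passing `alphabet="ACGU"` would give RNA sequences under the same assumptions. For a Monte Carlo test, many such sequences would be generated with the composition of the real data and scored in the same way as the real sequence.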

  4. Phylogeny of the Serrasalmidae (Characiformes) based on mitochondrial DNA sequences

    Directory of Open Access Journals (Sweden)

    Guillermo Ortí

    2008-01-01

    Full Text Available Previous studies based on DNA sequences of mitochondrial (mt) rRNA genes showed three main groups within the subfamily Serrasalminae: (1) a "pacu" clade of herbivores (Colossoma, Mylossoma, Piaractus); (2) the "Myleus" clade (Myleus, Mylesinus, Tometes, Ossubtus); and (3) the "piranha" clade (Serrasalmus, Pygocentrus, Pygopristis, Pristobrycon, Catoprion, Metynnis). The genus Acnodon was placed as the sister taxon of clade (2+3). However, poor resolution within each clade was obtained due to low levels of variation among rRNA gene sequences. Complete sequences of the hypervariable mtDNA control region for a total of 45 taxa, and additional sequences of 12S and 16S rRNA from a total of 74 taxa representing all genera in the family are now presented to address intragroup relationships. Control region sequences of several serrasalmid species exhibit tandem repeats of short motifs (12 to 33 bp) in the 3' end of this region, accounting for substantial length variation. Bayesian inference and maximum parsimony analyses of these sequences identify the same groupings as before and provide further evidence to support the following observations: (a) Serrasalmus gouldingi and species of Pristobrycon (non-striolatus) form a monophyletic group that is the sister group to other species of Serrasalmus and Pygocentrus; (b) Catoprion, Pygopristis, and Pristobrycon striolatus form a well supported clade, sister to the group described above; (c) some taxa assigned to the genus Myloplus (M. asterias, M. tiete, M. ternetzi, and M. rubripinnis) form a well supported group whereas other Myloplus species remain with uncertain affinities; (d) Mylesinus, Tometes and Myleus setiger form a monophyletic group.

  5. Alignment of Escherichia coli K12 DNA sequences to a genomic restriction map.

    Science.gov (United States)

    Rudd, K E; Miller, W; Ostell, J; Benson, D A

    1990-01-25

    We use the extensive published information describing the genome of Escherichia coli and new restriction map alignment software to align DNA sequence, genetic, and physical maps. Restriction map alignment software is used which considers restriction maps as strings analogous to DNA or protein sequences except that two values, enzyme name and DNA base address, are associated with each position on the string. The resulting alignments reveal a nearly linear relationship between the physical and genetic maps of the E. coli chromosome. Physical map comparisons with the 1976, 1980, and 1983 genetic maps demonstrate a better fit with the more recent maps. The results of these alignments are genomic kilobase coordinates, orientation and rank of the alignment that best fits the genetic data. A statistical measure based on extreme value distribution is applied to the alignments. Additional computer analyses allow us to estimate the accuracy of the published E. coli genomic restriction map, simulate rearrangements of the bacterial chromosome, and search for repetitive DNA. The procedures we used are general enough to be applicable to other genome mapping projects.

  6. Google matrix analysis of DNA sequences.

    Science.gov (United States)

    Kandiah, Vivek; Shepelyansky, Dima L

    2013-01-01

    For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.
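
    To make the construction in this abstract concrete, the following pure-Python sketch builds a Google matrix from word-to-word transitions in a DNA string and computes its PageRank vector by power iteration. Several choices are assumptions not stated in the abstract: the sequence is cut into consecutive non-overlapping words, the matrix is column-stochastic, columns with no observed transitions are replaced by a uniform column, and the damping factor is 0.85. The function names are illustrative only.

```python
from itertools import product


def google_matrix(seq, word_len=2, alpha=0.85):
    """Return (words, G) where G[i][j] is the probability of moving from word j to word i."""
    words = ["".join(p) for p in product("ACGT", repeat=word_len)]
    index = {w: i for i, w in enumerate(words)}
    n = len(words)
    # Cut the sequence into consecutive, non-overlapping words (assumption).
    chunks = [seq[k:k + word_len] for k in range(0, len(seq) - word_len + 1, word_len)]
    counts = [[0.0] * n for _ in range(n)]
    for src, dst in zip(chunks, chunks[1:]):
        if src in index and dst in index:  # skip words with non-ACGT symbols
            counts[index[dst]][index[src]] += 1.0
    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        col_sum = sum(counts[i][j] for i in range(n))
        for i in range(n):
            # Dangling columns (no outgoing transitions) become uniform.
            s = counts[i][j] / col_sum if col_sum > 0 else 1.0 / n
            G[i][j] = alpha * s + (1.0 - alpha) / n
    return words, G


def pagerank(G, iterations=200):
    """Power iteration for the leading eigenvector of a column-stochastic matrix."""
    n = len(G)
    p = [1.0 / n] * n
    for _ in range(iterations):
        p = [sum(G[i][j] * p[j] for j in range(n)) for i in range(n)]
        total = sum(p)
        p = [x / total for x in p]
    return p


if __name__ == "__main__":
    words, G = google_matrix("ACGTACGTTTGACCA" * 20, word_len=2)
    ranks = pagerank(G)
    for w, r in sorted(zip(words, ranks), key=lambda t: -t[1])[:5]:
        print(w, round(r, 4))
```

    With words of length m the matrix has 4^m rows and columns, so this list-of-lists version is only practical for small m; a sparse representation would be needed for the longer words used on real genomes.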

  7. Google matrix analysis of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Vivek Kandiah

    Full Text Available For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.

  8. Chaos game representation (CGR)-walk model for DNA sequences

    International Nuclear Information System (INIS)

    Jie, Gao; Zhen-Yuan, Xu

    2009-01-01

    Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to determine the coordinates of their positions in a continuous space. This distribution of positions has two features: it is unique, and the source sequence can be recovered from the coordinates, so that the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model is proposed based on CGR coordinates for the DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA(p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into the DNA sequence analysis. This model is applied to simulating real CGR-walk sequence data of ten genomic sequences. Remarkably long-range correlations are uncovered in the data, and the results from these models are reasonably fitted with those from the ARFIMA(p, d, q) model. (cross-disciplinary physics and related areas of science and technology)
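
    The two CGR properties mentioned above (uniqueness of the coordinates and recoverability of the source sequence) can be seen in a short Python sketch. The corner assignment A=(0,0), C=(0,1), G=(1,1), T=(1,0) and the starting point (1/2, 1/2) are a common convention assumed here, not taken from the abstract; exact fractions are used so that recovery works for any length. The CGR-walk and ARFIMA steps of the paper are not reproduced.

```python
from fractions import Fraction

# Assumed corner convention for the unit square.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_points(seq):
    """Return the CGR coordinates visited while reading `seq` left to right."""
    x = y = Fraction(1, 2)
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        # Each step moves halfway from the current point to the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def recover_sequence(point, length):
    """Invert the map: read the last `length` bases back from one CGR point."""
    corner_to_base = {corner: base for base, corner in CORNERS.items()}
    half = Fraction(1, 2)
    x, y = point
    bases = []
    for _ in range(length):
        # The half of the square the point lies in identifies the last base.
        cx, cy = int(x > half), int(y > half)
        bases.append(corner_to_base[(cx, cy)])
        x, y = 2 * x - cx, 2 * y - cy  # undo one halving step
    return "".join(reversed(bases))


if __name__ == "__main__":
    seq = "GATTACA"
    last = cgr_points(seq)[-1]
    print(last, recover_sequence(last, len(seq)))
```

    With floating-point coordinates instead of `Fraction`, the same inversion only works for a few dozen bases before rounding destroys the information, which is one reason the sketch uses exact arithmetic.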

  9. Sequence-Dependent Diastereospecific and Diastereodivergent Crosslinking of DNA by Decarbamoylmitomycin C.

    Science.gov (United States)

    Aguilar, William; Paz, Manuel M; Vargas, Anayatzinc; Clement, Cristina C; Cheng, Shu-Yuan; Champeil, Elise

    2018-04-20

    Mitomycin C (MC), a potent antitumor drug, and decarbamoylmitomycin C (DMC), a derivative lacking the carbamoyl group, form highly cytotoxic DNA interstrand crosslinks. The major interstrand crosslink formed by DMC is the C1'' epimer of the major crosslink formed by MC. The molecular basis for the stereochemical configuration exhibited by DMC was investigated using biomimetic synthesis. The formation of DNA-DNA crosslinks by DMC is diastereospecific and diastereodivergent: Only the 1''S-diastereomer of the initially formed monoadduct can form crosslinks at GpC sequences, and only the 1''R-diastereomer of the monoadduct can form crosslinks at CpG sequences. We also show that CpG and GpC sequences react with divergent diastereoselectivity in the first alkylation step: 1"S stereochemistry is favored at GpC sequences and 1''R stereochemistry is favored at CpG sequences. Therefore, the first alkylation step results, at each sequence, in the selective formation of the diastereomer able to generate an interstrand DNA-DNA crosslink after the "second arm" alkylation. Examination of the known DNA adduct pattern obtained after treatment of cancer cell cultures with DMC indicates that the GpC sequence is the major target for the formation of DNA-DNA crosslinks in vivo by this drug. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Assessing the fidelity of ancient DNA sequences amplified from nuclear genes

    DEFF Research Database (Denmark)

    Binladen, Jonas; Wiuf, Carsten Henrik; Gilbert, M. Thomas P.

    2006-01-01

    To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved...... in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from...... adenine), respectively. Type 2 transitions are by far the most dominant and increase relative to those of type 1 with damage load. The results suggest that the deamination of cytosine (and 5-methyl cytosine) to uracil (and thymine) is the main cause of miscoding lesions in both ancient mtDNA and nu...

  11. Thermodynamics of sequence-specific binding of PNA to DNA

    DEFF Research Database (Denmark)

    Ratilainen, T; Holmén, A; Tuite, E

    2000-01-01

    For further characterization of the hybridization properties of peptide nucleic acids (PNAs), the thermodynamics of hybridization of mixed sequence PNA-DNA duplexes have been studied. We have characterized the binding of PNA to DNA in terms of binding affinity (perfectly matched duplexes) and seq......

  12. Next-generation sequencing offers new insights into DNA degradation

    DEFF Research Database (Denmark)

    Overballe-Petersen, Søren; Orlando, Ludovic Antoine Alexandre; Willerslev, Eske

    2012-01-01

    The processes underlying DNA degradation are central to various disciplines, including cancer research, forensics and archaeology. The sequencing of ancient DNA molecules on next-generation sequencing platforms provides direct measurements of cytosine deamination, depurination and fragmentation...... rates that previously were obtained only from extrapolations of results from in vitro kinetic experiments performed over short timescales. For example, recent next-generation sequencing of ancient DNA reveals purine bases as one of the main targets of postmortem hydrolytic damage, through base...... elimination and strand breakage. It also shows substantially increased rates of DNA base-loss at guanosine. In this review, we argue that the latter results from an electron resonance structure unique to guanosine rather than adenosine having an extra resonance structure over guanosine as previously suggested....

  13. Analysis of genetic diversity of Sclerotinia sclerotiorum from eggplant by mycelial compatibility, random amplification of polymorphic DNA (RAPD) and simple sequence repeat (SSR) analyses

    Directory of Open Access Journals (Sweden)

    Fatih Mehmet Tok

    2016-09-01

    Full Text Available The genetic diversity and pathogenicity/virulence among 60 eggplant Sclerotinia sclerotiorum isolates collected from six different geographic regions of Turkey were analysed using mycelial compatibility groupings (MCGs), random amplified polymorphic DNA (RAPD) and simple sequence repeat (SSR) polymorphism. By MCG tests, the isolates were classified into 22 groups. Out of 22 MCGs, 36% were each represented by a single isolate. The isolates showed great variability for virulence regardless of MCG and geographic origin. Based on the results of RAPD and SSR analyses, 60 S. sclerotiorum isolates representing 22 MCGs were grouped in 2 and 3 distinct clusters, respectively. Analyses using RAPD and SSR markers illustrated that cluster groupings or genetic distance of S. sclerotiorum populations from eggplant were not distinctly related to the MCG, geographical origin and virulence diversity. The patterns obtained revealed a high heterogeneity of genetic composition and suggested the occurrence of clonal and sexual reproduction of S. sclerotiorum on eggplant in the areas surveyed.

  14. Enhanced throughput for infrared automated DNA sequencing

    Science.gov (United States)

    Middendorf, Lyle R.; Gartside, Bill O.; Humphrey, Pat G.; Roemer, Stephen C.; Sorensen, David R.; Steffens, David L.; Sutter, Scott L.

    1995-04-01

    Several enhancements have been developed and applied to infrared automated DNA sequencing, resulting in significantly higher throughput. A 41 cm sequencing gel (31 cm well-to-read distance) combines high resolution of DNA sequencing fragments with optimized run times, yielding two runs per day of 500 bases per sample. A 66 cm sequencing gel (56 cm well-to-read distance) produces sequence read lengths of up to 1000 bases for ds and ss templates using either T7 polymerase or cycle-sequencing protocols. Using a multichannel syringe to load 64 lanes allows 16 samples (compatible with 96-well format) to be visualized for each run. The 41 cm gel configuration allows 16,000 bases per day (16 samples × 500 bases/sample × 2 ten-hour runs/day) to be sequenced with the advantages of infrared technology. Enhancements to internal labeling techniques using an infrared-labeled dATP molecule (Boehringer Mannheim GmbH, Penzberg, Germany; Sequenase (U.S. Biochemical)) have also been made. The inclusion of glycerol in the sequencing reactions yields greatly improved results for some primer and template combinations. The inclusion of α-thio-dNTPs in the labeling reaction increases signal intensity two- to three-fold.
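
    The throughput figure quoted in this abstract follows directly from the numbers it gives. Written out as a worked calculation (the reading of 64 lanes as four lanes per sample is an inference from 64/16, presumably one lane per base-specific reaction, and is not stated in the abstract):

```latex
\begin{align*}
\text{samples per run} &= \frac{64 \text{ lanes}}{4 \text{ lanes/sample}} = 16 \\
\text{bases per day}   &= 16 \text{ samples} \times 500 \,\tfrac{\text{bases}}{\text{sample}} \times 2 \,\tfrac{\text{runs}}{\text{day}} = 16{,}000
\end{align*}
```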

  15. Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.

    Science.gov (United States)

    Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N

    1984-03-26

    The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic DNA methylation between sperm and oocyte DNA. The methylation levels of the minor satellite sequences did not change during spermiogenesis, and were not associated with the onset of meiosis or a specific stage in sperm development.

  16. iDNA at Sea: Recovery of Whale Shark (Rhincodon typus) Mitochondrial DNA Sequences from the Whale Shark Copepod (Pandarus rhincodonicus) Confirms Global Population Structure

    Directory of Open Access Journals (Sweden)

    Mark Meekan

    2017-12-01

    Full Text Available The whale shark (Rhincodon typus) is an iconic and endangered species with a broad distribution spanning warm-temperate and tropical oceans. Effective conservation management of the species requires an understanding of the degree of genetic connectivity among populations, which is hampered by the need for sampling that involves invasive techniques. Here, the feasibility of minimally-invasive sampling was explored by isolating and sequencing whale shark DNA from a commensal or possibly parasitic copepod, Pandarus rhincodonicus, that occurs on the skin of the host. We successfully recovered mitochondrial control region DNA sequences (~1,000 bp) of the host via DNA extraction and polymerase chain reaction from whole copepod specimens. DNA sequences obtained from multiple copepods collected from the same shark exhibited 100% sequence similarity, suggesting a persistent association of copepods with individual hosts. Newly-generated mitochondrial haplotypes of whale shark hosts derived from the copepods were included in an analysis of the genetic structure of the global population of whale sharks (644 sequences; 136 haplotypes). Our results supported those of previous studies and suggested limited genetic structuring across most of the species range, but the presence of a genetically unique and potentially isolated population in the Atlantic Ocean. Furthermore, we recovered the mitogenome and nuclear ribosomal genes of a whale shark using a shotgun sequencing approach on copepod tissue. The recovered mitogenome is the third mitogenome reported for the species and the first from the Mozambique population. Our invertebrate DNA (iDNA) approach could be used to better understand the population structure of whale sharks, particularly in the Atlantic Ocean, and also for genetic analyses of other elasmobranchs parasitized by pandarid copepods.

  17. Molecular phylogeography of the brown bear (Ursus arctos) in Northeastern Asia based on analyses of complete mitochondrial DNA sequences.

    Science.gov (United States)

    Hirata, Daisuke; Mano, Tsutomu; Abramov, Alexei V; Baryshnikov, Gennady F; Kosintsev, Pavel A; Vorobiev, Alexandr A; Raichev, Evgeny G; Tsunoda, Hiroshi; Kaneko, Yayoi; Murata, Koichi; Fukui, Daisuke; Masuda, Ryuichi

    2013-07-01

    To further elucidate the migration history of the brown bears (Ursus arctos) on Hokkaido Island, Japan, we analyzed the complete mitochondrial DNA (mtDNA) sequences of 35 brown bears from Hokkaido, the southern Kuril Islands (Etorofu and Kunashiri), Sakhalin Island, and the Eurasian Continent (continental Russia, Bulgaria, and Tibet), and those of four polar bears. Based on these sequences, we reconstructed the maternal phylogeny of the brown bear and estimated divergence times to investigate the timing of brown bear migrations, especially in northeastern Eurasia. Our gene tree showed the mtDNA haplotypes of all 73 brown and polar bears to be divided into eight divergent lineages. The brown bear on Hokkaido was divided into three lineages (central, eastern, and southern). The Sakhalin brown bear grouped with eastern European and western Alaskan brown bears. Etorofu and Kunashiri brown bears were closely related to eastern Hokkaido brown bears and could have diverged from the eastern Hokkaido lineage after formation of the channel between Hokkaido and the southern Kuril Islands. Tibetan brown bears diverged early in the eastern lineage. Southern Hokkaido brown bears were closely related to North American brown bears.

  18. DNA-PK dependent targeting of DNA-ends to a protein complex assembled on matrix attachment region DNA sequences

    International Nuclear Information System (INIS)

    Mauldin, S.K.; Getts, R.C.; Perez, M.L.; DiRienzo, S.; Stamato, T.D.

    2003-01-01

    Full text: We find that nuclear protein extracts from mammalian cells contain an activity that allows DNA ends to associate with circular pUC18 plasmid DNA. This activity requires the catalytic subunit of DNA-PK (DNA-PKcs) and Ku since it was not observed in mutants lacking Ku or DNA-PKcs but was observed when purified Ku/DNA-PKcs was added to these mutant extracts. Competition experiments between pUC18 and pUC18 plasmids containing various nuclear matrix attachment region (MAR) sequences suggest that DNA ends preferentially associate with plasmids containing MAR DNA sequences. At a 1:5 mass ratio of MAR to pUC18, approximately equal amounts of DNA end binding to the two plasmids were observed, while at a 1:1 ratio no pUC18 end-binding was observed. Calculation of relative binding activities indicates that DNA-end binding activities to MAR sequences was 7 to 21 fold higher than pUC18. Western analysis of proteins bound to pUC18 and MAR plasmids indicates that XRCC4, DNA ligase IV, scaffold attachment factor A, topoisomerase II, and poly(ADP-ribose) polymerase preferentially associate with the MAR plasmid in the absence or presence of DNA ends. In contrast, Ku and DNA-PKcs were found on the MAR plasmid only in the presence of DNA ends. After electroporation of a 32P-labeled DNA probe into human cells and cell fractionation, 87% of the total intercellular radioactivity remained in nuclei after a 0.5M NaCl extraction suggesting the probe was strongly bound in the nucleus. The above observations raise the possibility that DNA-PK targets DNA-ends to a repair and/or DNA damage signaling complex which is assembled on MAR sites in the nucleus

  19. Next Generation DNA Sequencing and the Future of Genomic Medicine

    OpenAIRE

    Anderson, Matthew W.; Schrijver, Iris

    2010-01-01

    In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpreta...

  20. Advantages and Limitations of Ribosomal RNA PCR and DNA Sequencing for Identification of Bacteria in Cardiac Valves of Danish Patients

    DEFF Research Database (Denmark)

    Kemp, Michael; Bangsborg, Jette; Kjerulf, Anne

    2013-01-01

    of direct molecular identification should also address weaknesses, their relevance in the given setting, and possible improvements. In this study cardiac valves from 56 Danish patients referred for surgery for infective endocarditis were analysed by microscopy and culture as well as by PCR targeting part...... of the bacterial 16S rRNA gene followed by DNA sequencing of the PCR product. PCR and DNA sequencing identified significant bacteria in 49 samples from 43 patients, including five out of 13 culture-negative cases. No rare, exotic, or intracellular bacteria were identified. There was a general agreement between...... bacterial identity obtained by ribosomal PCR and DNA sequencing from the valves and bacterial isolates from blood culture. However, DNA sequencing of the 16S rRNA gene did not discriminate well among non-haemolytic streptococci, especially within the Streptococcus mitis group. Ribosomal PCR with subsequent...

  1. repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects.

    Science.gov (United States)

    Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen

    2015-04-15

    In order to develop powerful computational predictors for identifying the biological features or attributes of DNAs, one of the most challenging problems is to find a suitable approach to effectively represent the DNA sequences. To facilitate the studies of DNAs and nucleotides, we developed a Python package called representations of DNAs (repDNA) for generating the widely used features reflecting the physicochemical properties and sequence-order effects of DNAs and nucleotides. There are three feature groups composed of 15 features. The first group calculates three nucleic acid composition features describing the local sequence information by means of kmers; the second group calculates six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties; the third group calculates six pseudo nucleotide composition features, which can be used to represent a DNA sequence with a discrete model or vector yet still keep considerable sequence-order information via the physicochemical properties of its constituent oligonucleotides. In addition, these features can be easily calculated based on both the built-in and user-defined properties via using repDNA. The repDNA Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repDNA/. Contact: bliu@insun.hit.edu.cn or kcchou@gordonlifescience.org. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
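
    As an illustration of the simplest feature group described above (nucleic acid composition by k-mers), the sketch below computes a k-mer frequency vector in plain Python. It deliberately does not call repDNA: the function name `kmer_composition` and its parameters are hypothetical and do not reflect the package's actual API, and the autocorrelation and pseudo nucleotide composition features are not covered.

```python
from itertools import product


def kmer_composition(seq, k=2, normalize=True):
    """Return the k-mer count (or frequency) vector of `seq`.

    The vector has 4**k entries, ordered lexicographically over "ACGT".
    Windows containing other symbols (e.g. N) are ignored.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]  # overlapping windows, step of one base
        if window in counts:
            counts[window] += 1
    total = sum(counts.values())
    if normalize and total > 0:
        return [counts[w] / total for w in kmers]
    return [counts[w] for w in kmers]


if __name__ == "__main__":
    vec = kmer_composition("GACTGAACTGCACTTTGGTTTCATATTATTTGCTC", k=2)
    print(len(vec), [round(v, 3) for v in vec[:4]])
```

    A fixed-length vector of this kind can be fed to a standard classifier regardless of sequence length, which is the motivation for such representations given in the abstract.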

  2. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    Directory of Open Access Journals (Sweden)

    Michael Knapp

    2010-07-01

    Full Text Available The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions.

  3. Application of Quaternion in improving the quality of global sequence alignment scores for an ambiguous sequence target in Streptococcus pneumoniae DNA

    Science.gov (United States)

    Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.

    2017-07-01

    A DNA sequence can be defined as a succession of letters representing the order of nucleotides within DNA, written with the four base codes adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of a sequence is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and have now matured into advanced, high-throughput sequencing technologies. DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing produces ambiguous results, making it difficult to determine whether a given code is A, T, G, or C. To address this problem, in this study we introduce another representation of DNA codes, namely the Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probabilities that the bases A, T, G, C appear in Q, and PA + PT + PG + PC = 1. Using Quaternion representations, we construct an improved scoring matrix for global sequence alignment by applying a dot product method. This scoring matrix produces better-quality match and mismatch scores between two DNA base codes. In the implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave to analyze our target sequence, which contains some ambiguous sequence data. The subject sequences are DNA sequences of Streptococcus pneumoniae families obtained from GenBank, while the target DNA sequence was received from our collaborator's database. We found that the Quaternion representations improve the quality of the sequence alignment score, and we conclude that the target DNA sequence has maximum similarity with Streptococcus pneumoniae.
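
    The idea of scoring probability vectors with a dot product can be sketched as follows. This is written in Python rather than the Octave used by the authors, and it is not their scoring matrix: the linear mapping from the dot product to a score, the match, mismatch and gap values, and all function names are assumptions made for illustration.

```python
# Each position is a vector (PA, PT, PG, PC) whose entries sum to 1.
A = (1.0, 0.0, 0.0, 0.0)
T = (0.0, 1.0, 0.0, 0.0)
G = (0.0, 0.0, 1.0, 0.0)
C = (0.0, 0.0, 0.0, 1.0)
N = (0.25, 0.25, 0.25, 0.25)  # fully ambiguous base


def dot_score(q1, q2, match=1.0, mismatch=-1.0):
    """Score two positions from the dot product of their probability vectors.

    The dot product is the probability that both positions carry the same
    base; interpolating between mismatch and match is an assumed convention.
    """
    p_same = sum(a * b for a, b in zip(q1, q2))
    return match * p_same + mismatch * (1.0 - p_same)


def needleman_wunsch_score(s, t, gap=-1.0):
    """Global alignment score with a linear gap penalty (score only)."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap] + [0.0] * len(t)
        for j in range(1, len(t) + 1):
            cur[j] = max(
                prev[j - 1] + dot_score(s[i - 1], t[j - 1]),  # align the pair
                prev[j] + gap,                                # gap in t
                cur[j - 1] + gap,                             # gap in s
            )
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score([A, N, G], [A, C, G]))
```

    With unambiguous bases the score reduces to the usual match and mismatch values, so the ambiguous case is a smooth extension of the standard scheme.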

  4. DNA watermarks in non-coding regulatory sequences

    Directory of Open Access Journals (Sweden)

    Pyka Martin

    2009-07-01

    Full Text Available Abstract Background DNA watermarks can be applied to identify the unauthorized use of genetically modified organisms. It has been shown that coding regions can be used to encrypt information into living organisms by using the DNA-Crypt algorithm. Yet, if the sequence of interest presents a non-coding DNA sequence, either the function of a resulting functional RNA molecule or a regulatory sequence, such as a promoter, could be affected. For our studies we used the small cytoplasmic RNA 1 in yeast and the lac promoter region of Escherichia coli. Findings The lac promoter was deactivated by the integrated watermark. In addition, the RNA molecules displayed altered configurations after introducing a watermark, but surprisingly were functionally intact, which has been verified by analyzing the growth characteristics of both wild type and watermarked scR1 transformed yeast cells. In a third approach we introduced a second overlapping watermark into the lac promoter, which did not affect the promoter activity. Conclusion Even though the watermarked RNA and one of the watermarked promoters did not show any significant differences compared to the wild type RNA and wild type promoter region, respectively, it cannot be generalized that other RNA molecules or regulatory sequences behave accordingly. Therefore, we do not recommend integrating watermark sequences into regulatory regions.

  5. Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

    Science.gov (United States)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, and (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
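
    The two tests named in this abstract, a rank-ordered n-tuple (Zipf) table and an n-gram entropy, can be sketched in a few lines. The function names are hypothetical, and the redundancy formula used here, R = 1 - H(n) / (n log2 4), is an assumed normalization rather than the exact definition used by the authors.

```python
import math
from collections import Counter


def ngram_counts(seq, n):
    """Count overlapping n-grams ("words") in a sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def ngram_entropy(seq, n):
    """Shannon entropy, in bits, of the observed n-gram distribution."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def ngram_redundancy(seq, n, alphabet_size=4):
    """Assumed normalization: 1 minus entropy over its maximum n*log2(4)."""
    return 1.0 - ngram_entropy(seq, n) / (n * math.log2(alphabet_size))


def zipf_ranking(seq, n):
    """Return (rank, word, count) tuples sorted by decreasing count."""
    ranked = ngram_counts(seq, n).most_common()
    return [(rank, word, count) for rank, (word, count) in enumerate(ranked, 1)]


if __name__ == "__main__":
    seq = "ACGT" * 10
    print(ngram_entropy(seq, 1), ngram_redundancy(seq, 1))
    print(zipf_ranking("AAAC", 1))
```

    Plotting log(count) against log(rank) from `zipf_ranking` is the usual way to look for the power-law regime described above.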

  6. Mitochondrial DNA sequence evolution in shorebird populations

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons

  7. Profiling nematode communities in unmanaged flowerbed and agricultural field soils in Japan by DNA barcode sequencing.

    Directory of Open Access Journals (Sweden)

    Hisashi Morise

    Full Text Available Soil nematodes play crucial roles in the soil food web and are a suitable indicator for assessing soil environments and ecosystems. Previous nematode community analyses based on nematode morphology classification have been shown to be useful for assessing various soil environments. Here we have conducted DNA barcode analysis for soil nematode community analyses in Japanese soils. We isolated nematodes from two different environmental soils of an unmanaged flowerbed and an agricultural field using the improved flotation-sieving method. Small subunit (SSU) rDNA fragments were directly amplified from each of 68 (flowerbed samples) and 48 (field samples) isolated nematodes to determine the nucleotide sequence. Sixteen and thirteen operational taxonomic units (OTUs) were obtained by multiple sequence alignment from the flowerbed and agricultural field nematodes, respectively. All 29 SSU rDNA-derived OTUs (rOTUs) were further mapped onto a phylogenetic tree with 107 known nematode species. Interestingly, the two nematode communities examined were clearly distinct from each other in terms of trophic groups: Animal predators and plant feeders were markedly abundant in the flowerbed soils, in contrast, bacterial feeders were dominantly observed in the agricultural field soils. The data from the flowerbed nematodes suggests a possible food web among two different trophic nematode groups and plants (weeds) in the closed soil environment. Finally, DNA sequences derived from the mitochondrial cytochrome oxidase c subunit 1 (COI) gene were determined as a DNA barcode from 43 agricultural field soil nematodes. These nematodes were assigned to 13 rDNA-derived OTUs, but in the COI gene analysis were assigned to 23 COI gene-derived OTUs (cOTUs), indicating that COI gene-based barcoding may provide higher taxonomic resolution than conventional SSU rDNA-barcoding in soil nematode community analysis.

  8. Profiling Nematode Communities in Unmanaged Flowerbed and Agricultural Field Soils in Japan by DNA Barcode Sequencing

    Science.gov (United States)

    Morise, Hisashi; Miyazaki, Erika; Yoshimitsu, Shoko; Eki, Toshihiko

    2012-01-01

    Soil nematodes play crucial roles in the soil food web and are a suitable indicator for assessing soil environments and ecosystems. Previous nematode community analyses based on nematode morphology classification have been shown to be useful for assessing various soil environments. Here we have conducted DNA barcode analysis for soil nematode community analyses in Japanese soils. We isolated nematodes from two different environmental soils of an unmanaged flowerbed and an agricultural field using the improved flotation-sieving method. Small subunit (SSU) rDNA fragments were directly amplified from each of 68 (flowerbed samples) and 48 (field samples) isolated nematodes to determine the nucleotide sequence. Sixteen and thirteen operational taxonomic units (OTUs) were obtained by multiple sequence alignment from the flowerbed and agricultural field nematodes, respectively. All 29 SSU rDNA-derived OTUs (rOTUs) were further mapped onto a phylogenetic tree with 107 known nematode species. Interestingly, the two nematode communities examined were clearly distinct from each other in terms of trophic groups: Animal predators and plant feeders were markedly abundant in the flowerbed soils, in contrast, bacterial feeders were dominantly observed in the agricultural field soils. The data from the flowerbed nematodes suggests a possible food web among two different trophic nematode groups and plants (weeds) in the closed soil environment. Finally, DNA sequences derived from the mitochondrial cytochrome oxidase c subunit 1 (COI) gene were determined as a DNA barcode from 43 agricultural field soil nematodes. These nematodes were assigned to 13 rDNA-derived OTUs, but in the COI gene analysis were assigned to 23 COI gene-derived OTUs (cOTUs), indicating that COI gene-based barcoding may provide higher taxonomic resolution than conventional SSU rDNA-barcoding in soil nematode community analysis. PMID:23284767

  9. Anaplasma phagocytophilum in Danish sheep: confirmation by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Thamsborg Stig M

    2009-12-01

    Full Text Available Abstract Background The presence of Anaplasma phagocytophilum, an Ixodes ricinus transmitted bacterium, was investigated in two flocks of Danish grazing lambs. Direct PCR detection was performed on DNA extracted from blood and serum with subsequent confirmation by DNA sequencing. Methods 31 samples obtained from clinically normal lambs in 2000 from Fussingø, Jutland and 12 samples from ten lambs and two ewes from a clinical outbreak at Feddet, Zealand in 2006 were included in the study. Some of the animals from Feddet had shown clinical signs of polyarthritis and general unthriftiness prior to sampling. DNA extraction was optimized from blood and serum and detection achieved by a 16S rRNA targeted PCR with verification of the product by DNA sequencing. Results Five DNA extracts were found positive by PCR, including two samples from 2000 and three from 2006. For both series of samples the product was verified as A. phagocytophilum by DNA sequencing. Conclusions A. phagocytophilum was detected by molecular methods for the first time in Danish grazing lambs during the two seasons investigated (2000 and 2006).

  10. Isolation of a sex-linked DNA sequence in cranes.

    Science.gov (United States)

    Duan, W; Fuerst, P A

    2001-01-01

    A female-specific DNA fragment (CSL-W; crane sex-linked DNA on W chromosome) was cloned from female whooping cranes (Grus americana). From the nucleotide sequence of CSL-W, a set of polymerase chain reaction (PCR) primers was identified which amplify a 227-230 bp female-specific fragment from all existing crane species and some other noncrane species. A duplicated version of the DNA segment, which is found to have a larger size (231-235 bp) than CSL-W in both sexes, was also identified, and was designated CSL-NW (crane sex-linked DNA on non-W chromosome). The nucleotide similarity between the sequences of CSL-W and CSL-NW from whooping cranes was 86.3%. The CSL primers do not amplify any sequence from mammalian DNA, limiting the potential for contamination from human sources. Using the CSL primers in combination with a quick DNA extraction method allows the noninvasive identification of crane gender in less than 10 h. A test of the methodology was carried out on fully developed body feathers from 18 captive cranes and resulted in 100% successful identification.

  11. Spreadsheet-based program for alignment of overlapping DNA sequences.

    Science.gov (United States)

    Anbazhagan, R; Gabrielson, E

    1999-06-01

    Molecular biology laboratories frequently face the challenge of aligning small overlapping DNA sequences derived from a long DNA segment. Here, we present a short program that can be used to adapt Excel spreadsheets as a tool for aligning DNA sequences, regardless of their orientation. The program runs on any Windows or Macintosh operating system computer with Excel 97 or Excel 98. The program is available for use as an Excel file, which can be downloaded from the BioTechniques Web site. Upon execution, the program opens a specially designed customized workbook and is capable of identifying overlapping regions between two sequence fragments and displaying the sequence alignment. It also performs a number of specialized functions such as recognition of restriction enzyme cutting sites and CpG island mapping without costly specialized software.
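
    Finding the overlap between two fragments regardless of their orientation can be sketched as below. This is a Python illustration under simple assumptions (exact matches only, a minimum overlap of 4 bases), not the Excel program described in the abstract; all function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def suffix_prefix_overlap(left, right, min_len=4):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def best_overlap(a, b, min_len=4):
    """Try b in both orientations and both orders.

    Returns (overlap_length, merged_sequence, orientation_of_b); the merged
    sequence is None when no overlap of at least min_len is found.
    """
    best = (0, None, None)
    for label, cand in (("forward", b),
                        ("reverse-complement", reverse_complement(b))):
        for left, right in ((a, cand), (cand, a)):
            size = suffix_prefix_overlap(left, right, min_len)
            if size > best[0]:
                best = (size, left + right[size:], label)
    return best


if __name__ == "__main__":
    print(best_overlap("ATGCGTACGTTAG", "GTTAGCCATGA"))
```

    Real sequencing reads contain errors, so a practical tool would tolerate mismatches in the overlap; the exact-match test here keeps the sketch short.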

  12. Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.

    OpenAIRE

    Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N

    1984-01-01

    The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic...

  13. A 28,000 Years Old Cro-Magnon mtDNA Sequence Differs from All Potentially Contaminating Modern Sequences

    Science.gov (United States)

    Caramelli, David; Milani, Lucio; Vai, Stefania; Modi, Alessandra; Pecchioli, Elena; Girardi, Matteo; Pilli, Elena; Lari, Martina; Lippi, Barbara; Ronchitelli, Annamaria; Mallegni, Francesco; Casoli, Antonella; Bertorelle, Giorgio; Barbujani, Guido

    2008-01-01

    Background DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. Conclusions/Significance: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans. PMID:18628960

  14. A 28,000 years old Cro-Magnon mtDNA sequence differs from all potentially contaminating modern sequences.

    Directory of Open Access Journals (Sweden)

    David Caramelli

    Full Text Available BACKGROUND: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. METHODOLOGY/PRINCIPAL FINDINGS: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. CONCLUSIONS/SIGNIFICANCE: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans.

  15. Ribosomal DNA sequence heterogeneity reflects intraspecies phylogenies and predicts genome structure in two contrasting yeast species.

    Science.gov (United States)

    West, Claire; James, Stephen A; Davey, Robert P; Dicks, Jo; Roberts, Ian N

    2014-07-01

    The ribosomal RNA encapsulates a wealth of evolutionary information, including genetic variation that can be used to discriminate between organisms at a wide range of taxonomic levels. For example, the prokaryotic 16S rDNA sequence is very widely used both in phylogenetic studies and as a marker in metagenomic surveys and the internal transcribed spacer region, frequently used in plant phylogenetics, is now recognized as a fungal DNA barcode. However, this widespread use does not escape criticism, principally due to issues such as difficulties in classification of paralogous versus orthologous rDNA units and intragenomic variation, both of which may be significant barriers to accurate phylogenetic inference. We recently analyzed data sets from the Saccharomyces Genome Resequencing Project, characterizing rDNA sequence variation within multiple strains of the baker's yeast Saccharomyces cerevisiae and its nearest wild relative Saccharomyces paradoxus in unprecedented detail. Notably, both species possess single locus rDNA systems. Here, we use these new variation datasets to assess whether a more detailed characterization of the rDNA locus can alleviate the second of these phylogenetic issues, sequence heterogeneity, while controlling for the first. We demonstrate that a strong phylogenetic signal exists within both datasets and illustrate how they can be used, with existing methodology, to estimate intraspecies phylogenies of yeast strains consistent with those derived from whole-genome approaches. We also describe the use of partial Single Nucleotide Polymorphisms, a type of sequence variation found only in repetitive genomic regions, in identifying key evolutionary features such as genome hybridization events and show their consistency with whole-genome Structure analyses. We conclude that our approach can transform rDNA sequence heterogeneity from a problem to a useful source of evolutionary information, enabling the estimation of highly accurate phylogenies of

  16. Toward a Better Compression for DNA Sequences Using Huffman Encoding.

    Science.gov (United States)

    Al-Okaily, Anas; Almarri, Badar; Al Yami, Sultan; Huang, Chun-Hsi

    2017-04-01

    Due to the significant amount of DNA data generated by next-generation sequencing machines for genomes of lengths ranging from megabases to gigabases, there is an increasing need to compress such data so that they occupy less space and can be transmitted faster. Different implementations of Huffman encoding that incorporate the characteristics of DNA sequences prove to compress DNA data better. These implementations center on the concepts of selecting frequent repeats so as to force a skewed Huffman tree, as well as the construction of multiple Huffman trees when encoding. The implementations demonstrate improvements on the compression ratios for five genomes with lengths ranging from 5 to 50 Mbp, compared with the standard Huffman tree algorithm. The research hence suggests an improvement on all such DNA sequence compression algorithms that use the conventional Huffman encoding. Accompanying software is publicly available (AL-Okaily, 2016).
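
    For orientation, the baseline that this abstract compares against, a conventional Huffman code over single bases, can be sketched as follows. This is the standard algorithm only, not the authors' repeat-based or multi-tree variants, and the function names are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, partial code table).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, code1 = heapq.heappop(heap)
        c2, _, code2 = heapq.heappop(heap)
        # Merge the two rarest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in code1.items()}
        merged.update({s: "1" + c for s, c in code2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(seq, code):
    return "".join(code[s] for s in seq)


def decode(bits, code):
    inverse = {v: k for k, v in code.items()}
    out, cur = [], ""
    for bit in bits:
        cur += bit
        if cur in inverse:  # prefix-free, so the first hit is the symbol
            out.append(inverse[cur])
            cur = ""
    return "".join(out)


if __name__ == "__main__":
    seq = "AAAAAAAACCCCGGT"
    code = huffman_code(seq)
    print(code, len(encode(seq, code)), "bits versus", 2 * len(seq))
```

    When the four bases are roughly equally frequent, a per-base Huffman code gives 2 bits per base, the same as a fixed-length code, which is why skewing the symbol distribution (for example by treating frequent repeats as symbols) matters.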

  17. High-Quality Exome Sequencing of Whole-Genome Amplified Neonatal Dried Blood Spot DNA

    DEFF Research Database (Denmark)

    Poulsen, Jesper Buchhave; Lescai, Francesco; Grove, Jakob

    2016-01-01

    Stored neonatal dried blood spot (DBS) samples from neonatal screening programmes are a valuable diagnostic and research resource. Combined with information from national health registries they can be used in population-based studies of genetic diseases. DNA extracted from neonatal DBSs can...... be amplified to obtain micrograms of an otherwise limited resource, referred to as whole-genome amplified DNA (wgaDNA). Here we investigate the robustness of exome sequencing of wgaDNA of neonatal DBS samples. We conducted three pilot studies of seven, eight and seven subjects, respectively. For each subject...... we analysed a neonatal DBS sample and corresponding adult whole-blood (WB) reference sample. Different DNA sample types were prepared for each of the subjects. Pilot 1: wgaDNA of 2x3.2mm neonatal DBSs (DBS_2x3.2) and raw DNA extract of the WB reference sample (WB_ref). Pilot 2: DBS_2x3.2, WB...

  18. Mouse tetranectin: cDNA sequence, tissue-specific expression, and chromosomal mapping

    DEFF Research Database (Denmark)

    Ibaraki, K; Kozak, C A; Wewer, U M

    1995-01-01

    regulation, mouse tetranectin cDNA was cloned from a 16-day-old mouse embryo library. Sequence analysis revealed a 992-bp cDNA with an open reading frame of 606 bp, which is identical in length to the human tetranectin cDNA. The deduced amino acid sequence showed high homology to the human cDNA with 76......(s) of tetranectin. The sequence analysis revealed a difference in both sequence and size of the noncoding regions between mouse and human cDNAs. Northern analysis of the various tissues from mouse, rat, and cow showed the major transcript(s) to be approximately 1 kb, which is similar in size to that observed...

  19. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing

    Directory of Open Access Journals (Sweden)

    Zdepski Anna

    2011-05-01

    Full Text Available Abstract Background High throughput sequencing (HTS) technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants that contain nuclear DNA, as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. Results We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR). We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Conclusions Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants.

  20. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.

    2008-01-01

    We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data...... that the method outperforms Blast searches as a measure of confidence and can help eliminate 80% of all false assignment based on best Blast hit. However, the most important advance of the method is that it provides statistically meaningful measures of confidence. We apply the method to a re......-analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are comprised of a combination of both Neanderthal and modern human DNA....

  1. Sequence of a cDNA encoding turtle high mobility group 1 protein.

    Science.gov (United States)

    Zheng, Jifang; Hu, Bi; Wu, Duansheng

    2005-07-01

    To obtain sequence information about the turtle HMG1 gene, a cDNA encoding the HMG1 protein of the Chinese soft-shell turtle (Pelodiscus sinensis) was amplified by RT-PCR from kidney total RNA, and was cloned, sequenced and analyzed. The results revealed that the open reading frame (ORF) of turtle HMG1 cDNA is 606 bp long. The ORF encodes 202 amino acid residues, from which two DNA-binding domains and one polyacidic region are derived. The DNA-binding domains share higher amino acid identity with the homologous sequences of chicken (96.5%) and mammals (74%) than with the homologous sequence of rainbow trout (67%). The polyacidic region shows 84.6% amino acid homology with the equivalent region of chicken HMG1 cDNA. Turtle HMG1 protein contains 3 Cys residues located at completely conserved positions. Conservation in sequence and structure suggests that the functions of turtle HMG1 cDNA may be highly conserved during evolution. To our knowledge, this is the first report of an HMG1 cDNA sequence in any reptile.

  2. Enrichment of megabase-sized DNA molecules for single-molecule optical mapping and next-generation sequencing

    DEFF Research Database (Denmark)

    Łopacińska-Jørgensen, Joanna M; Pedersen, Jonas Nyvold; Bak, Mads

    2017-01-01

    Next-generation sequencing (NGS) has caused a revolution, yet left a gap: long-range genetic information from native, non-amplified DNA fragments is unavailable. It might be obtained by optical mapping of megabase-sized DNA molecules. Frequently only a specific genomic region is of interest, so......-megabase- to megabase-sized DNA molecules were recovered from the gel and analysed by denaturation-renaturation optical mapping. Size-selected molecules from the same gel were sequenced by NGS. The optically mapped molecules and the NGS reads showed enrichment from regions defined by NotI restriction sites. We...... demonstrate that the unannotated genome can be characterized in a locus-specific manner via molecules partially overlapping with the annotated genome. The method is a promising tool for investigation of structural variants in enriched human genomic regions for both research and diagnostic purposes. Our...

  3. Cloning, sequencing, and expression of cDNA for human β-glucuronidase

    International Nuclear Information System (INIS)

    Oshima, A.; Kyle, J.W.; Miller, R.D.

    1987-01-01

    The authors report here the cDNA sequence for human placental β-glucuronidase (β-D-glucuronoside glucuronosohydrolase, EC 3.2.1.31) and demonstrate expression of the human enzyme in transfected COS cells. They also sequenced a partial cDNA clone from human fibroblasts that contained a 153-base-pair deletion within the coding sequence and found a second type of cDNA clone from placenta that contained the same deletion. Nuclease S1 mapping studies demonstrated two types of mRNAs in human placenta that corresponded to the two types of cDNA clones isolated. The NH2-terminal amino acid sequence determined for human spleen β-glucuronidase agreed with that inferred from the DNA sequence of the two placental clones, beginning at amino acid 23, suggesting a cleaved signal sequence of 22 amino acids. When transfected into COS cells, plasmids containing either placental clone expressed an immunoprecipitable protein that contained N-linked oligosaccharides as evidenced by sensitivity to endoglycosidase F. However, only transfection with the clone containing the 153-base-pair segment led to expression of human β-glucuronidase activity. These studies provide the sequence for the full-length cDNA for human β-glucuronidase, demonstrate the existence of two populations of mRNA for β-glucuronidase in human placenta, only one of which specifies a catalytically active enzyme, and illustrate the importance of expression studies in verifying that a cDNA is functionally full-length.

  4. Recurrence time statistics: versatile tools for genomic DNA sequence analysis.

    Science.gov (United States)

    Cao, Yinhe; Tung, Wen-Wen; Gao, J B

    2004-01-01

    With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.
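
    The notion of a recurrence time can be illustrated with a small sketch. It assumes one simple definition, the distance between successive exact occurrences of the same word of length w, which may differ from the statistic used by the authors; the function names are hypothetical.

```python
from collections import Counter


def recurrence_times(seq, w):
    """Distances between successive occurrences of each word of length w."""
    last_seen = {}
    times = []
    for i in range(len(seq) - w + 1):
        word = seq[i:i + w]
        if word in last_seen:
            times.append(i - last_seen[word])
        last_seen[word] = i
    return times


def recurrence_histogram(seq, w):
    """Histogram of recurrence times; peaks suggest periodic or repeated structure."""
    return Counter(recurrence_times(seq, w))


if __name__ == "__main__":
    print(recurrence_histogram("ATGATGATGATG", 3))
```

    A single pass with a dictionary of last positions keeps the cost linear in the sequence length, which is in the spirit of the efficiency claim above.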

  5. Order and correlations in genomic DNA sequences. The spectral approach

    International Nuclear Information System (INIS)

    Lobzin, Vasilii V; Chechetkin, Vladimir R

    2000-01-01

    The structural analysis of genomic DNA sequences is discussed in the framework of the spectral approach, which is sufficiently universal due to the reciprocal correspondence and mutual complementarity of Fourier transform length scales. The spectral characteristics of random sequences of the same nucleotide composition possess the property of self-averaging for relatively short sequences of length M≥100-300. Comparison with the characteristics of random sequences determines the statistical significance of the structural features observed. Apart from traditional applications to the search for hidden periodicities, spectral methods are also efficient in studying mutual correlations in DNA sequences. By combining spectra for structure factors and correlation functions, not only integral correlations can be estimated but also their origin identified. Using the structural spectral entropy approach, the regularity of a sequence can be quantitatively assessed. A brief introduction to the problem is also presented and other major methods of DNA sequence analysis described. (reviews of topical problems)
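
    A minimal sketch of the spectral idea in the abstract above, assuming the usual construction: each nucleotide type is turned into a 0/1 indicator sequence, and the squared modulus of its discrete Fourier transform is inspected for peaks. The 1/M normalization, the function name and the toy sequence are assumptions made for illustration, not the definitions used in the review.

```python
import cmath


def structure_factor(seq, base, q):
    """Squared modulus of the DFT of the indicator sequence of `base`
    at harmonic number q, divided by the sequence length M."""
    m = len(seq)
    total = sum(
        cmath.exp(-2j * cmath.pi * q * pos / m)
        for pos, nt in enumerate(seq)
        if nt == base
    )
    return abs(total) ** 2 / m


# Toy sequence with an exact period of 3 (M = 30, so period 3 is harmonic q = 10).
toy = "ATG" * 10
print(structure_factor(toy, "A", 10))  # large: hidden period-3 signal
print(structure_factor(toy, "A", 7))   # close to zero: no signal at this harmonic
```

    In practice the peak heights would be compared with those expected for random sequences of the same nucleotide composition, which is the significance test the abstract describes.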

  6. Network clustering coefficient approach to DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gerhardt, Guenther J.L. [Universidade Federal do Rio Grande do Sul-Hospital de Clinicas de Porto Alegre, Rua Ramiro Barcelos 2350/sala 2040/90035-003 Porto Alegre (Brazil); Departamento de Fisica e Quimica da Universidade de Caxias do Sul, Rua Francisco Getulio Vargas 1130, 95001-970 Caxias do Sul (Brazil); Lemke, Ney [Programa Interdisciplinar em Computacao Aplicada, Unisinos, Av. Unisinos, 950, 93022-000 Sao Leopoldo, RS (Brazil); Corso, Gilberto [Departamento de Biofisica e Farmacologia, Centro de Biociencias, Universidade Federal do Rio Grande do Norte, Campus Universitario, 59072 970 Natal, RN (Brazil)]. E-mail: corso@dfte.ufrn.br

    2006-05-15

    In this work we propose an alternative DNA sequence analysis tool based on graph theoretical concepts. The methodology investigates the path topology of an organism's genome through a triplet network. In this network, triplets in the DNA sequence are vertices, and two vertices are connected if they occur juxtaposed on the genome. We characterize this network topology by measuring the clustering coefficient. We test our methodology against two main biases: the guanine-cytosine (GC) content and the 3-bp (base pair) periodicity of the DNA sequence. We perform the test by constructing random networks with variable GC content and imposed 3-bp periodicity. A test group of some organisms is constructed and we investigate the methodology in the light of the constructed random networks. We conclude that the clustering coefficient is a valuable tool since it gives information that is not trivially contained in either the 3-bp periodicity or the variable GC content.
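
    The triplet network described above can be sketched as follows. Several choices here are assumptions, since the abstract does not specify them: triplets are read as consecutive non-overlapping words in one reading frame, edges are undirected, self-loops are ignored, and the network is summarized by the average local clustering coefficient. The function names are hypothetical.

```python
from itertools import combinations


def triplet_network(seq, frame=0):
    """Adjacency sets: triplets are vertices, linked when adjacent in the sequence."""
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    neighbours = {c: set() for c in codons}
    for a, b in zip(codons, codons[1:]):
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def average_clustering(neighbours):
    """Mean over vertices of (links among neighbours) / (possible links)."""
    coeffs = []
    for nbrs in neighbours.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbours[u])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs) if coeffs else 0.0


# Toy sequences: the first closes a triangle AAA-CCC-GGG, the second is a path.
print(average_clustering(triplet_network("AAACCCGGGAAA")))
print(average_clustering(triplet_network("AAACCCGGG")))
```

    A comparison against randomized sequences with matched GC content and 3-bp periodicity, as the abstract describes, would reuse the same two functions on shuffled input.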

  7. Mapping Base Modifications in DNA by Transverse-Current Sequencing

    Science.gov (United States)

    Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.

    2018-02-01

    Sequencing DNA modifications and lesions, such as methylation of cytosine and oxidation of guanine, is even more important and challenging than sequencing the genome itself. The traditional methods for detecting DNA modifications are either insensitive to these modifications or require additional processing steps to identify a particular type of modification. Transverse-current sequencing in nanopores can potentially identify the canonical bases and base modifications in the same run. In this work, we demonstrate that the most common DNA epigenetic modifications and lesions can be detected with any predefined accuracy based on their tunneling current signature. Our results are based on simulations of the nanopore tunneling current through DNA molecules, calculated using nonequilibrium electron-transport methodology within an effective multiorbital model derived from first-principles calculations, followed by a base-calling algorithm accounting for neighbor current-current correlations. This methodology can be integrated with existing experimental techniques to improve base-calling fidelity.

  8. DNA interactions with a Methylene Blue redox indicator depend on the DNA length and are sequence specific.

    Science.gov (United States)

    Farjami, Elaheh; Clima, Lilia; Gothelf, Kurt V; Ferapontova, Elena E

    2010-06-01

    A DNA molecular beacon approach was used for the analysis of interactions between DNA and Methylene Blue (MB) as a redox indicator of a hybridization event. DNA hairpin structures of different length and guanine (G) content were immobilized onto gold electrodes in their folded states through the alkanethiol linker at the 5'-end. Binding of MB to the folded hairpin DNA was electrochemically studied and compared with binding to the duplex structure formed by hybridization of the hairpin DNA to a complementary DNA strand. Variation of the electrochemical signal from the DNA-MB complex was shown to depend primarily on the DNA length and sequence used: the G-C base pairs were the preferential sites of MB binding in the duplex. For short 20 nts long DNA sequences, the increased electrochemical response from MB bound to the duplex structure was consistent with the increased amount of bound and electrochemically readable MB molecules (i.e. MB molecules that are available for the electron transfer (ET) reaction with the electrode). With longer DNA sequences, the balance between the amounts of the electrochemically readable MB molecules bound to the hairpin DNA and to the hybrid was opposite: a part of the MB molecules bound to the long-sequence DNA duplex seem to be electrochemically mute due to long ET distance. The increasing electrochemical response from MB bound to the short-length DNA hybrid contrasts with the decreasing signal from MB bound to the long-length DNA hybrid and allows an "off"-"on" genosensor development.

  9. SSR_pipeline--computer software for the identification of microsatellite sequences from paired-end Illumina high-throughput DNA sequence data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (SSRs; for example, microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains three analysis modules along with a fourth control module that can be used to automate analyses of large volumes of data. The modules are used to (1) identify the subset of paired-end sequences that pass quality standards, (2) align paired-end reads into a single composite DNA sequence, and (3) identify sequences that possess microsatellites conforming to user specified parameters. Each of the three separate analysis modules also can be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc). All modules are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, Windows). The program suite relies on a compiled Python extension module to perform paired-end alignments. Instructions for compiling the extension from source code are provided in the documentation. Users who do not have Python installed on their computers or who do not have the ability to compile software also may choose to download packaged executable files. These files include all Python scripts, a copy of the compiled extension module, and a minimal installation of Python in a single binary executable. See program documentation for more information.
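
    The third step above, finding sequences that contain microsatellites matching user-specified parameters, can be illustrated with a short regular-expression sketch. This is not SSR_pipeline's code or interface; the function name, parameter names and default values are hypothetical.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, repeat_unit, repeat_count) for each tandem repeat found.

    The lazy quantifier prefers the shortest repeat unit, so 'CACACACA'
    is reported as (CA)4 rather than (CACA)2.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq)
    ]


# Toy read (an assumption for illustration): a (CA)6 repeat with short flanks.
print(find_microsatellites("GGCT" + "CA" * 6 + "TTGA"))
```

    With these defaults a homopolymer run such as AAAAAAAA would also be reported, with unit "AA"; a real pipeline would likely filter such cases and check flanking sequence length for primer design.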

  10. Ancestral sequence reconstruction in primate mitochondrial DNA: compositional bias and effect on functional inference.

    Science.gov (United States)

    Krishnan, Neeraja M; Seligmann, Hervé; Stewart, Caro-Beth; De Koning, A P Jason; Pollock, David D

    2004-10-01

    Reconstruction of ancestral DNA and amino acid sequences is an important means of inferring information about past evolutionary events. Such reconstructions suggest changes in molecular function and evolutionary processes over the course of evolution and are used to infer adaptation and convergence. Maximum likelihood (ML) is generally thought to provide relatively accurate reconstructed sequences compared to parsimony, but both methods lead to the inference of multiple directional changes in nucleotide frequencies in primate mitochondrial DNA (mtDNA). To better understand this surprising result, as well as to better understand how parsimony and ML differ, we constructed a series of computationally simple "conditional pathway" methods that differed in the number of substitutions allowed per site along each branch, and we also evaluated the entire Bayesian posterior frequency distribution of reconstructed ancestral states. We analyzed primate mitochondrial cytochrome b (Cyt-b) and cytochrome oxidase subunit I (COI) genes and found that ML reconstructs ancestral frequencies that are often more different from tip sequences than are parsimony reconstructions. In contrast, frequency reconstructions based on the posterior ensemble more closely resemble extant nucleotide frequencies. Simulations indicate that these differences in ancestral sequence inference are probably due to deterministic bias caused by high uncertainty in the optimization-based ancestral reconstruction methods (parsimony, ML, Bayesian maximum a posteriori). In contrast, ancestral nucleotide frequencies based on an average of the Bayesian set of credible ancestral sequences are much less biased. The methods involving simpler conditional pathway calculations have slightly reduced likelihood values compared to full likelihood calculations, but they can provide fairly unbiased nucleotide reconstructions and may be useful in more complex phylogenetic analyses than considered here due to their speed and

  11. Fidelity and Mutational Spectrum of Pfu DNA Polymerase on a Human Mitochondrial DNA Sequence

    Science.gov (United States)

    André, Paulo; Kim, Andrea; Khrapko, Konstantin; Thilly, William G.

    1997-01-01

    The study of rare genetic changes in human tissues requires specialized techniques. Point mutations at fractions at or below 10⁻⁶ must be observed to discover even the most prominent features of the point mutational spectrum. PCR permits the increase in number of mutant copies but does so at the expense of creating many additional mutations or “PCR noise”. Thus, each DNA sequence studied must be characterized with regard to the DNA polymerase and conditions used to avoid interpreting a PCR-generated mutation as one arising in human tissue. The thermostable DNA polymerase derived from Pyrococcus furiosus designated Pfu has the highest fidelity of any DNA thermostable polymerase studied to date, and this property recommends it for analyses of tissue mutational spectra. Here, we apply constant denaturant capillary electrophoresis (CDCE) to separate and isolate the products of DNA amplification. This new strategy permitted direct enumeration and identification of point mutations created by Pfu DNA polymerase in a 96-bp low melting domain of a human mitochondrial sequence despite the very low mutant fractions generated in the PCR process. This sequence, containing part of the tRNA glycine and NADH dehydrogenase subunit 3 genes, is the target of our studies of mitochondrial mutagenesis in human cells and tissues. Incorrectly synthesized sequences were separated from the wild type as mutant/wild-type heteroduplexes by sequential enrichment on CDCE. An artificially constructed mutant was used as an internal standard to permit calculation of the mutant fraction. Our study found that the average error rate (mutations per base pair duplication) of Pfu was 6.5 × 10⁻⁷, and five of its more frequent mutations (hot spots) consisted of three transversions (GC → TA, AT → TA, and AT → CG), one transition (AT → GC), and one 1-bp deletion (in an AAAAAA sequence). To achieve an even higher sensitivity, the amount of Pfu-induced mutants must be

  12. DNA sequences from the quagga, an extinct member of the horse family.

    Science.gov (United States)

    Higuchi, R; Bowman, B; Freiberger, M; Ryder, O A; Wilson, A C

    To determine whether DNA survives and can be recovered from the remains of extinct creatures, we have examined dried muscle from a museum specimen of the quagga, a zebra-like species (Equus quagga) that became extinct in 1883 (ref. 1). We report that DNA was extracted from this tissue in amounts approaching 1% of that expected from fresh muscle, and that the DNA was of relatively low molecular weight. Among the many clones obtained from the quagga DNA, two containing pieces of mitochondrial DNA (mtDNA) were sequenced. These sequences, comprising 229 nucleotide pairs, differ by 12 base substitutions from the corresponding sequences of mtDNA from a mountain zebra, an extant member of the genus Equus. The number, nature and locations of the substitutions imply that there has been little or no postmortem modification of the quagga DNA sequences, and that the two species had a common ancestor 3-4 Myr ago, consistent with fossil evidence concerning the age of the genus Equus.
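
    The divergence reported above (12 substitutions in 229 nucleotide pairs) corresponds to an uncorrected pairwise difference of about 5.2%. A minimal sketch of that arithmetic follows; the helper name and the toy aligned sequences are hypothetical, and only the counts 12 and 229 come from the abstract.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


# Counts taken from the abstract: 12 substitutions over 229 nucleotide pairs.
print(round(12 / 229, 4))

# Toy aligned pair (not real quagga or zebra data): 1 difference in 8 sites.
print(p_distance("ACGTACGT", "ACGTACGA"))
```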

  13. Substrate sequence selectivity of APOBEC3A implicates intra-DNA interactions.

    Science.gov (United States)

    Silvas, Tania V; Hou, Shurong; Myint, Wazo; Nalivaika, Ellen; Somasundaran, Mohan; Kelch, Brian A; Matsuo, Hiroshi; Kurt Yilmaz, Nese; Schiffer, Celia A

    2018-05-14

    The APOBEC3 (A3) family of human cytidine deaminases is renowned for providing a first line of defense against many exogenous and endogenous retroviruses. However, the ability of these proteins to deaminate deoxycytidines in ssDNA makes A3s a double-edged sword. When overexpressed, A3s can mutate endogenous genomic DNA, resulting in a variety of cancers. Although the sequence context for mutating DNA varies among A3s, the mechanism for substrate sequence specificity is not well understood. To characterize substrate specificity of A3A, a systematic approach was used to quantify the affinity for substrate as a function of sequence context, length, secondary structure, and solution pH. We identified the A3A ssDNA binding motif as (T/C)TC(A/G), which correlated with enzymatic activity. We also validated that A3A binds RNA in a sequence-specific manner. A3A bound more tightly to the substrate binding motif within a hairpin loop than to a linear oligonucleotide, suggesting A3A affinity is modulated by substrate structure. Based on these findings and previously published A3A-ssDNA co-crystal structures, we propose a new model with intra-DNA interactions for the molecular mechanism underlying A3A sequence preference. Overall, the sequence and structural preferences identified for A3A lead to a new paradigm for identifying A3A's involvement in mutation of endogenous or exogenous DNA.
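
    The binding motif (T/C)TC(A/G) reported above translates directly into the character-class pattern [TC]TC[AG]. The sketch below scans a single strand for it; the function name and the toy sequence are hypothetical, and a lookahead is used so that overlapping occurrences are not missed.

```python
import re

# (T/C)TC(A/G) written as a regular expression; the lookahead allows overlaps.
A3A_MOTIF = re.compile(r"(?=([TC]TC[AG]))")


def find_motif_sites(ssdna):
    """Return (start_index, matched_tetramer) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in A3A_MOTIF.finditer(ssdna.upper())]


# Toy single-stranded sequence (an assumption for illustration).
print(find_motif_sites("AATTCAGGCTCGTT"))
```

    Only the given strand is scanned, since the abstract concerns ssDNA; secondary structure such as hairpin loops, which the study found to matter, is not modelled here.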

  14. DNA cross-linking by dehydromonocrotaline lacks apparent base sequence preference.

    Science.gov (United States)

    Rieben, W Kurt; Coulombe, Roger A

    2004-12-01

    Pyrrolizidine alkaloids (PAs) are ubiquitous plant toxins, many of which, upon oxidation by hepatic mixed-function oxidases, become reactive bifunctional pyrrolic electrophiles that form DNA-DNA and DNA-protein cross-links. The anti-mitotic, toxic, and carcinogenic action of PAs is thought to be caused, at least in part, by these cross-links. We wished to determine whether the activated PA pyrrole dehydromonocrotaline (DHMO) exhibits base sequence preferences when cross-linked to a set of model duplex poly A-T 14-mer oligonucleotides with varying internal and/or end 5'-d(CG), 5'-d(GC), 5'-d(TA), 5'-d(CGCG), or 5'-d(GCGC) sequences. DHMO-DNA cross-links were assessed by electrophoretic mobility shift assay (EMSA) of 32P end-labeled oligonucleotides and by HPLC analysis of cross-linked DNAs enzymatically digested to their constituent deoxynucleosides. The degree of DNA cross-linking depended upon the concentration of the pyrrole, but not on the base sequence of the oligonucleotide target. Likewise, HPLC chromatograms of cross-linked and digested DNAs showed no discernible sequence preference for any nucleotide. Added glutathione, tyrosine, cysteine, and aspartic acid, but not phenylalanine, threonine, serine, lysine, or methionine, competed with DNA as alternate nucleophiles for cross-linking by DHMO. From these data it appears that DHMO exhibits no strong base preference when forming cross-links with DNA, and that some cellular nucleophiles can inhibit DNA cross-link formation.

  15. Rapid identification and classification of bacteria by 16S rDNA restriction fragment melting curve analyses (RFMCA).

    Science.gov (United States)

    Rudi, Knut; Kleiberg, Gro H; Heiberg, Ragnhild; Rosnes, Jan T

    2007-08-01

    The aim of this work was to evaluate restriction fragment melting curve analyses (RFMCA) as a novel approach for rapid classification of bacteria during food production. RFMCA was evaluated for bacteria isolated from sous vide food products, and raw materials used for sous vide production. We identified four major bacterial groups in the material analysed (cluster I-Streptococcus, cluster II-Carnobacterium/Bacillus, cluster III-Staphylococcus and cluster IV-Actinomycetales). The accuracy of RFMCA was evaluated by comparison with 16S rDNA sequencing. The strains satisfying the RFMCA quality filtering criteria (73%, n=57), with both 16S rDNA sequence information and RFMCA data (n=45) gave identical group assignments with the two methods. RFMCA enabled rapid and accurate classification of bacteria that is database compatible. Potential application of RFMCA in the food or pharmaceutical industry will include development of classification models for the bacteria expected in a given product, and then to build an RFMCA database as a part of the product quality control.

  16. [Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].

    Science.gov (United States)

    Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y

    2017-08-01

    This study aimed to analyze and detect the whole genome sequence of human mitochondrial DNA (mtDNA) using the Ion Torrent PGM™ platform and to study the differences in mtDNA sequence between different tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costal cartilage, nail, skeletal muscle and oral epithelium. Amplification of the whole genome sequence of mtDNA was performed with 4 pairs of primers. Libraries were constructed with the Ion Shear™ Plus Reagents kit and the Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using the Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions in the HVⅠ region. The whole genome sequences of mtDNA from all samples were amplified successfully. The six unrelated individuals belonged to 6 different haplotypes. Different tissues within one individual showed heteroplasmy differences. The heteroplasmy positions and the mutation positions in the HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, it was found that the mtDNA sequencing results were highly consistent across different tissues. The testing method used in the present study for sequencing the whole genome of human mtDNA can detect heteroplasmy differences between different tissues, and its results show good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine

  17. Studies of base pair sequence effects on DNA solvation based on all-atom molecular dynamics simulations.

    Science.gov (United States)

    Dixit, Surjit B; Mezei, Mihaly; Beveridge, David L

    2012-07-01

    Detailed analyses of the sequence-dependent solvation and ion atmosphere of DNA are presented based on molecular dynamics (MD) simulations on all the 136 unique tetranucleotide steps obtained by the ABC consortium using the AMBER suite of programs. Significant sequence effects on solvation and ion localization were observed in these simulations. The results were compared to essentially all known experimental data on the subject. Proximity analysis was employed to highlight the sequence dependent differences in solvation and ion localization properties in the grooves of DNA. Comparison of the MD-calculated DNA structure with canonical A- and B-forms supports the idea that the G/C-rich sequences are closer to canonical A- than B-form structures, while the reverse is true for the poly A sequences, with the exception of the alternating ATAT sequence. Analysis of hydration density maps reveals that the flexibility of the solute molecule has a significant effect on the nature of observed hydration. Energetic analysis of solute-solvent interactions based on proximity analysis of solvent reveals that the GC or CG base pairs interact more strongly with water molecules in the minor groove of DNA than the AT or TA base pairs, while the interactions of the AT or TA pairs in the major groove are stronger than those of the GC or CG pairs. Computation of solvent-accessible surface area of the nucleotide units in the simulated trajectories reveals that the similarity with results derived from analysis of a database of crystallographic structures is excellent. The MD trajectories tend to follow Manning's counterion condensation theory, presenting a region of condensed counterions within a radius of about 17 Å from the DNA surface independent of sequence. The GC and CG pairs tend to associate with cations in the major groove of the DNA structure to a greater extent than the AT and TA pairs. Cation association is more frequent in the minor groove of AT than the GC pairs. In general, the
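
    The figure of 136 unique tetranucleotide steps quoted in the abstract above follows from treating a sequence and its reverse complement as the same duplex: of the 4^4 = 256 tetranucleotides, 16 are their own reverse complement, giving (256 - 16) / 2 + 16 = 136. The short check below enumerates them; the function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Reverse complement of a DNA word written 5' to 3'."""
    return word.translate(COMPLEMENT)[::-1]


def unique_steps(k):
    """Distinct k-mers when a word and its reverse complement are identified."""
    words = ("".join(p) for p in product("ACGT", repeat=k))
    return {min(w, reverse_complement(w)) for w in words}


print(len(unique_steps(4)))  # tetranucleotide steps
print(len(unique_steps(2)))  # dinucleotide steps
```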

  18. Sequence context effects on 8-methoxypsoralen photobinding to defined DNA fragments

    International Nuclear Information System (INIS)

    Sage, E.; Moustacchi, E.

    1987-01-01

    The photoreaction of 8-methoxypsoralen (8-MOP) with DNA fragments of defined sequence was studied. The authors took advantage of the blockage by bulky adducts of the 3'-5'-exonuclease activity associated with the T4 DNA polymerase. The action of the exonuclease is stopped by biadducts as well as by monoadducts. The termination products were analyzed on sequencing gels. A strong sequence specificity was observed in the DNA photobinding of 8-MOP. The exonuclease terminates its digestion near thymine residues, mainly at potentially cross-linkable sites. There is an increasing reactivity of thymine residues in the order T < TT << TTT in a GC environment. For thymine residues in cross-linkable sites, the reactivity follows the order AT << TA ∼ TAT << ATA < ATAT < ATATAA. Repeated A-T sequences are hot spots for the photochemical reaction of 8-MOP with DNA. Both monoadducts and interstrand cross-links are formed preferentially in 5'-TpA sites. The results highlight the role of the sequence, and consequently of the conformation around a potential site, in the photobinding of 8-MOP to DNA.

  19. Management of High-Throughput DNA Sequencing Projects: Alpheus.

    Science.gov (United States)

    Miller, Neil A; Kingsmore, Stephen F; Farmer, Andrew; Langley, Raymond J; Mudge, Joann; Crow, John A; Gonzalez, Alvaro J; Schilkey, Faye D; Kim, Ryan J; van Velkinburgh, Jennifer; May, Gregory D; Black, C Forrest; Myers, M Kathy; Utsey, John P; Frost, Nicholas S; Sugarbaker, David J; Bueno, Raphael; Gullans, Stephen R; Baxter, Susan M; Day, Steve W; Retzel, Ernest F

    2008-12-26

    High-throughput DNA sequencing has enabled systems biology to begin to address areas in health, agricultural and basic biological research. Concomitant with the opportunities is an absolute necessity to manage significant volumes of high-dimensional and inter-related data and analysis. Alpheus is an analysis pipeline, database and visualization software for use with massively parallel DNA sequencing technologies that feature multi-gigabase throughput characterized by relatively short reads, such as Illumina-Solexa (sequencing-by-synthesis), Roche-454 (pyrosequencing) and Applied Biosystems' SOLiD (sequencing-by-ligation). Alpheus enables alignment to reference sequence(s), detection of variants and enumeration of sequence abundance, including expression levels in transcriptome sequence. Alpheus is able to detect several types of variants, including non-synonymous and synonymous single nucleotide polymorphisms (SNPs), insertions/deletions (indels), premature stop codons, and splice isoforms. Variant detection is aided by the ability to filter variant calls based on consistency, expected allele frequency, sequence quality, coverage, and variant type in order to minimize false positives while maximizing the identification of true positives. Alpheus also enables comparisons of genes with variants between cases and controls or bulk segregant pools. Sequence-based differential expression comparisons can be developed, with data export to SAS JMP Genomics for statistical analysis.

  20. Genetic diversity of mtDNA D-loop sequences in four native Chinese chicken breeds.

    Science.gov (United States)

    Guo, H W; Li, C; Wang, X N; Li, Z J; Sun, G R; Li, G X; Liu, X J; Kang, X T; Han, R L

    2017-10-01

    1. To explore the genetic diversity of Chinese indigenous chicken breeds, a 585 bp fragment of the mitochondrial DNA (mtDNA) region was sequenced in 102 birds from the Xichuan black-bone chicken, Yunyang black-bone chicken and Lushi chicken. In addition, 30 mtDNA D-loop sequences of Silkie fowls were downloaded from NCBI. The mtDNA D-loop sequence polymorphism and maternal origin of the 4 chicken breeds were analysed in this study. 2. The results showed that a total of 33 mutation sites and 28 haplotypes were detected in the 4 chicken breeds. The haplotype diversity and nucleotide diversity of these 4 native breeds were 0.916 ± 0.014 and 0.012 ± 0.002, respectively. Three clusters were formed among the 4 Chinese native chicken breeds and 12 reference breeds. Both the Xichuan black-bone chicken and Yunyang black-bone chicken were grouped into one cluster. Four haplogroups (A, B, C and E) emerged in the median-joining network in these breeds. 3. It was concluded that these 4 Chinese chicken breeds had high genetic diversity. The phylogenetic tree and median network profiles showed that the native chickens of China and its neighbouring countries had at least two maternal origins, one from Yunnan, China and another from Southeast Asia or its surrounding area.
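
    Haplotype diversity, as reported in the abstract above, is commonly estimated with Nei's formula Hd = n / (n - 1) × (1 - Σ pᵢ²), where n is the number of sequences and pᵢ the frequency of haplotype i. The sketch below assumes that estimator; the abstract does not state which formula or software was used, and the haplotype labels in the example are toy data, not the study's.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy sample (not data from the study): four birds, two haplotypes.
print(haplotype_diversity(["H1", "H1", "H2", "H2"]))
```

    The value equals the probability that two sequences drawn at random without replacement carry different haplotypes, so it approaches 1 when nearly every bird has its own haplotype.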

  1. DNA sequence responsible for the amplification of adjacent genes.

    Science.gov (United States)

    Pasion, S G; Hartigan, J A; Kumar, V; Biswas, D K

    1987-10-01

    A 10.3-kb DNA fragment in the 5'-flanking region of the rat prolactin (rPRL) gene was isolated from F1BGH(1)2C1, a strain of rat pituitary tumor cells (GH cells) that produces prolactin in response to 5-bromodeoxyuridine (BrdU). Following transfection and integration into genomic DNA of recipient mouse L cells, this DNA induced amplification of the adjacent thymidine kinase gene from Herpes simplex virus type 1 (HSV1TK). We confirmed the ability of this "Amplicon" sequence to induce amplification of other linked or unlinked genes in DNA-mediated gene transfer studies. When transferred into the mouse L cells with the 10.3-5'rPRL gene sequence of BrdU-responsive cells, both the human growth hormone and the HSV1TK genes are amplified in response to 5-bromodeoxyuridine. This observation is substantiated by BrdU-induced amplification of the cotransferred bacterial Neo gene. Cotransfection studies reveal that the BrdU-induced amplification capability is associated with a 4-kb DNA sequence in the 5'-flanking region of the rPRL gene of BrdU-responsive cells. These results demonstrate that genes of heterologous origin, linked or unlinked, and selected or unselected, can be coamplified when located within the amplification boundary of the Amplicon sequence.

  2. PCR-Free Enrichment of Mitochondrial DNA from Human Blood and Cell Lines for High Quality Next-Generation DNA Sequencing.

    Directory of Open Access Journals (Sweden)

    Meetha P Gould

    Full Text Available Recent advances in sequencing technology allow for accurate detection of mitochondrial sequence variants, even those in low abundance at heteroplasmic sites. Considerable sequencing cost savings can be achieved by enriching samples for mitochondrial (relative to nuclear) DNA. Reduction in nuclear DNA (nDNA) content can also help to avoid false positive variants resulting from nuclear mitochondrial sequences (numts). We isolate intact mitochondrial organelles from both human cell lines and blood components using two separate methods: a magnetic bead binding protocol and differential centrifugation. DNA is extracted and further enriched for mitochondrial DNA (mtDNA) by an enzyme digest. Only 1 ng of the purified DNA is necessary for library preparation and next generation sequence (NGS) analysis. Enrichment methods are assessed and compared using mtDNA (versus nDNA) content as a metric, measured by using real-time quantitative PCR and NGS read analysis. Among the various strategies examined, the optimal is differential centrifugation isolation followed by exonuclease digest. This strategy yields >35% mtDNA reads in blood and cell lines, which corresponds to hundreds-fold enrichment over baseline. The strategy also avoids false variant calls that, as we show, can be induced by the long-range PCR approaches that are the current standard in enrichment procedures. This optimization procedure allows mtDNA enrichment for efficient and accurate massively parallel sequencing, enabling NGS from samples with small amounts of starting material. This will decrease costs by increasing the number of samples that may be multiplexed, ultimately facilitating efforts to better understand mitochondria-related diseases.

  3. Polymorphism of Paramecium pentaurelia (Ciliophora, Oligohymenophorea) strains revealed by rDNA and mtDNA sequences.

    Science.gov (United States)

    Przyboś, Ewa; Tarcz, Sebastian; Greczek-Stachura, Magdalena; Surmacz, Marta

    2011-05-01

    Paramecium pentaurelia is one of 15 known sibling species of the Paramecium aurelia complex. It is recognized as a species showing no intra-specific differentiation on the basis of molecular fingerprint analyses, whereas the majority of other species are polymorphic. This study aimed at assessing genetic polymorphism within P. pentaurelia including new strains recently found in Poland (originating from two water bodies, different years, seasons, and clones of one strain) as well as strains collected from distant habitats (USA, Europe, Asia), and strains representing other species of the complex. We compared two DNA fragments: partial sequences (349 bp) of the LSU rDNA and partial sequences (618 bp) of cytochrome B gene. A correlation between the geographical origin of the strains and the genetic characteristics of their genotypes was not observed. Different genotypes were found in Kraków in two types of water bodies (Opatkowice-natural pond; Jordan's Park-artificial pond). Haplotype diversity within a single water body was not recorded. Likewise, seasonal haplotype differences between the strains within the artificial water body, as well as differences between clones originating from one strain, were not detected. The clustering of some strains belonging to different species was observed in the phylogenies. Copyright © 2010 Elsevier GmbH. All rights reserved.

  4. Genetic polymorphism in Gymnodinium galatheanum chloroplast DNA sequences and development of a molecular detection assay.

    Science.gov (United States)

    Tengs, T; Bowers, H A; Ziman, A P; Stoecker, D K; Oldach, D W

    2001-02-01

    Nuclear and chloroplast-encoded small subunit ribosomal DNA sequences were obtained from several strains of the toxic dinoflagellate Gymnodinium galatheanum. Phylogenetic analyses and comparison of sequences indicate that the chloroplast sequences show a higher degree of sequence divergence than the nuclear homologue. The chloroplast sequences were chosen as targets for the development of a 5'--3' exonuclease assay for detection of the organism. The assay has a very high degree of specificity and has been used to screen environmental water samples from a fish farm where the presence of this dinoflagellate species has previously been associated with fish kills. Various hypotheses for the derived nature of the chloroplast sequences are discussed, as well as what is known about the toxicity of the species.

  5. Characterization of primary biogenic aerosol particles in urban, rural, and high-alpine air by DNA sequence and restriction fragment analysis of ribosomal RNA genes

    Directory of Open Access Journals (Sweden)

    V. R. Després

    2007-12-01

    This study explores the applicability of DNA analyses for the characterization of primary biogenic aerosol (PBA) particles in the atmosphere. Samples of fine particulate matter (PM2.5) and total suspended particulates (TSP) have been collected on different types of filter materials at urban, rural, and high-alpine locations along an altitude transect in the south of Germany (Munich, Hohenpeissenberg, Mt. Zugspitze).

    From filter segments loaded with about one milligram of air particulate matter, DNA could be extracted and DNA sequences could be determined for bacteria, fungi, plants and animals. Sequence analyses were used to determine the identity of biological organisms, and terminal restriction fragment length polymorphism analyses (T-RFLP) were applied to estimate diversities and relative abundances of bacteria. Investigations of blank and background samples showed that filter materials have to be decontaminated prior to use, and that the sampling and handling procedures have to be carefully controlled to avoid artifacts in the analyses.

    Mass fractions of DNA in PM2.5 were found to be around 0.05% in urban, rural, and high-alpine aerosols. The average concentration of DNA determined for urban air was on the order of ~7 ng m⁻³, indicating that human adults may inhale about one microgram of DNA per day (corresponding to ~10⁸ haploid bacterial genomes or ~10⁵ haploid human genomes, respectively).

    Most of the bacterial sequences found in PM2.5 were from Proteobacteria (42) and some from Actinobacteria (10) and Firmicutes (1). The fungal sequences were characteristic for Ascomycota (3) and Basidiomycota (1), which are known to actively discharge spores into the atmosphere. The plant sequences could be attributed to green plants (2) and moss spores (2), while animal DNA was found only for one unicellular eukaryote (protist).
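
    The genome-equivalent estimate in the abstract above can be checked with a back-of-the-envelope calculation. The following is a minimal sketch, not the authors' calculation: the average mass of 650 g/mol per base pair, the 5 Mbp bacterial genome and the 3.1 Gbp haploid human genome are assumed round values, and the function name is hypothetical.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one double-stranded base pair


def genome_copies(dna_mass_g, genome_size_bp):
    """Return how many genome copies a given mass of DNA corresponds to."""
    mass_per_genome_g = genome_size_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return dna_mass_g / mass_per_genome_g


inhaled_dna_g = 1e-6  # about one microgram per day, as stated in the abstract

bacterial = genome_copies(inhaled_dna_g, 5e6)    # assumed 5 Mbp bacterial genome
human = genome_copies(inhaled_dna_g, 3.1e9)      # assumed 3.1 Gbp haploid human genome

print(f"bacterial genome equivalents: {bacterial:.1e}")
print(f"human genome equivalents:     {human:.1e}")
```

    Under these assumptions the results are of the order of 10⁸ bacterial and 10⁵ human genome equivalents, in line with the abstract's estimate.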

  6. An automated annotation tool for genomic DNA sequences using

    Indian Academy of Sciences (India)

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated ...

  7. Novel DNA sequence detection method based on fluorescence energy transfer

    International Nuclear Information System (INIS)

    Kobayashi, S.; Tamiya, E.; Karube, I.

    1987-01-01

    Recently, the detection of specific DNA sequences (DNA analysis) has become increasingly important for the diagnosis of viral genomes causing infectious disease and of human sequences related to inherited disorders. These methods typically involve electrophoresis, immobilization of DNA on a solid support, hybridization to a complementary probe, and detection using probes labeled with ³²P or nonisotopically with a biotin-avidin-enzyme system. These techniques are highly effective, but they are very time-consuming and expensive. The principle of fluorescence energy transfer is that light energy from an excited donor fluorophore is transferred to an acceptor fluorophore if the acceptor is in the vicinity of the donor and the emission spectrum of the donor overlaps the excitation spectrum of the acceptor. In this study, fluorescence energy transfer was applied to the detection of a specific DNA sequence using the hybridization method. The analyte, single-stranded DNA labeled with the donor fluorophore, is hybridized to a probe DNA labeled with the acceptor. Because of the formation of the complementary DNA duplex, the two fluorophores are brought close to each other, and fluorescence energy transfer occurs.
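
    The distance dependence that makes energy transfer a useful hybridization signal can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6). The sketch below is only an illustration: the abstract gives no distances, so the Förster radius of 5 nm and the sample separations are assumed values, and the function name is hypothetical.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Förster relation: E = 1 / (1 + (r / R0)**6)."""
    ratio = distance_nm / forster_radius_nm
    return 1.0 / (1.0 + ratio ** 6)


R0_NM = 5.0  # assumed Förster radius, for illustration only

# Efficiency drops steeply once the donor-acceptor distance exceeds R0.
for r_nm in (2.5, 5.0, 10.0):
    print(f"r = {r_nm:4.1f} nm -> E = {fret_efficiency(r_nm, R0_NM):.3f}")
```

    By definition the efficiency is 0.5 when the separation equals R0; the sixth-power dependence is why transfer is appreciable only when duplex formation brings the two labels close together.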

  8. mtDNA sequence diversity of Hazara ethnic group from Pakistan.

    Science.gov (United States)

    Rakha, Allah; Fatima; Peng, Min-Sheng; Adan, Atif; Bi, Rui; Yasmin, Memona; Yao, Yong-Gang

    2017-09-01

    The present study was undertaken to investigate mitochondrial DNA (mtDNA) control region sequences of Hazaras from Pakistan, so as to generate an mtDNA reference database for forensic casework in Pakistan and to analyze the phylogenetic relationship of this particular ethnic group with geographically proximal populations. Complete mtDNA control region (nt 16024-576) sequences were generated through Sanger sequencing for 319 Hazara individuals from Quetta, Baluchistan. The population sample set showed a total of 189 distinct haplotypes, belonging mainly to West Eurasian (51.72%), East & Southeast Asian (29.78%) and South Asian (18.50%) haplogroups. Compared with other populations from Pakistan, the Hazara population had a relatively high haplotype diversity (0.9945) and a lower random match probability (0.0085). The dataset has been incorporated into the EMPOP database under accession number EMP00680. The data herein comprise the largest, and likely most thoroughly examined, control region mtDNA dataset from Hazaras of Pakistan. Copyright © 2017 Elsevier B.V. All rights reserved.
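
    Haplotype diversity and random match probability are closely related summary statistics. The sketch below uses the definitions commonly applied to forensic mtDNA data (random match probability as the sum of squared haplotype frequencies, and Nei's unbiased diversity); the abstract does not state which formulas the authors used, so this is an assumption, and the toy sample and function names are hypothetical.

```python
from collections import Counter


def random_match_probability(haplotypes):
    """Sum of squared haplotype frequencies in the sample."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sum((count / n) ** 2 for count in counts.values())


def haplotype_diversity(haplotypes):
    """Nei's unbiased estimate: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    return n / (n - 1) * (1.0 - random_match_probability(haplotypes))


# Toy sample of five individuals carrying four distinct haplotypes.
sample = ["H1", "H1", "H2", "H3", "H4"]
print(random_match_probability(sample))
print(haplotype_diversity(sample))

# Consistency check against the reported values (n = 319, RMP = 0.0085).
implied_diversity = 319 / 318 * (1.0 - 0.0085)
print(round(implied_diversity, 4))
```

    With these definitions, the reported random match probability implies a diversity within rounding distance of the reported 0.9945.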

  9. VoSeq: a voucher and DNA sequence web application.

    Directory of Open Access Journals (Sweden)

    Carlos Peña

    There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  10. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    International Nuclear Information System (INIS)

    Chechetkin, V.R.; Lobzin, V.V.

    2004-01-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions.
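
    As a rough illustration of the idea, a spectral entropy can be computed as the Shannon entropy of a normalized Fourier power spectrum built from per-nucleotide indicator sequences. The sketch below is one simple variant and an assumption on our part; the authors' exact structure factors and normalization may differ, and the function names are hypothetical.

```python
import cmath
import math


def power_spectrum(sequence, alphabet="ACGT"):
    """Sum over nucleotides of |DFT of the indicator sequence|^2 / M, for n = 1..M/2."""
    m = len(sequence)
    spectrum = []
    for n in range(1, m // 2 + 1):
        total = 0.0
        for symbol in alphabet:
            amplitude = sum(
                cmath.exp(-2j * math.pi * n * pos / m)
                for pos, base in enumerate(sequence)
                if base == symbol
            )
            total += abs(amplitude) ** 2 / m
        spectrum.append(total)
    return spectrum


def spectral_entropy(sequence, alphabet="ACGT"):
    """Shannon entropy (in nats) of the normalized power spectrum."""
    spectrum = power_spectrum(sequence, alphabet)
    norm = sum(spectrum)
    if norm < 1e-12:  # e.g. a homopolymer has no power at nonzero frequencies
        return 0.0
    probabilities = [value / norm for value in spectrum]
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


ordered = "ACGT" * 16  # strictly periodic: power concentrates in very few harmonics
irregular = "ACGGTCATTGCAAGTCCGATAGGCTTACGATCGGATCCTAAGTCGAGTTCACGGATATCGCAAT"
print(spectral_entropy(ordered), spectral_entropy(irregular))
```

    A strictly periodic sequence concentrates its power in a few harmonics and so has low entropy, whereas an irregular sequence spreads power over many frequencies; segment boundaries could then be sought where this measure changes.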

  11. Functional role of a highly repetitive DNA sequence in anchorage of the mouse genome.

    Science.gov (United States)

    Neuer-Nitsche, B; Lu, X N; Werner, D

    1988-09-12

    The major portion of the eukaryotic genome consists of various categories of repetitive DNA sequences which have been studied with respect to their base compositions, organizations, copy numbers, transcription and species specificities; their biological roles, however, are still unclear. A novel quality of a highly repetitive mouse DNA sequence is described which points to a functional role: All copies (approximately 50,000 per haploid genome) of this DNA sequence reside on genomic Alu I DNA fragments each associated with nuclear polypeptides that are not released from DNA by proteinase K, SDS and phenol extraction. By this quality the repetitive DNA sequence is classified as a member of the sub-set of DNA sequences involved in tight DNA-polypeptide complexes which have been previously shown to be components of the subnuclear structure termed 'nuclear matrix'. From these results it has to be concluded that the repetitive DNA sequence characterized in this report represents or comprises a signal for a large number of site specific attachment points of the mouse genome in the nuclear matrix.

  12. Capillary gel electrophoresis for rapid, high resolution DNA sequencing.

    OpenAIRE

    Swerdlow, H; Gesteland, R

    1990-01-01

    Capillary gel electrophoresis has been demonstrated for the separation and detection of DNA sequencing samples. Enzymatic dideoxy nucleotide chain termination was employed, using fluorescently tagged oligonucleotide primers and laser based on-column detection (limit of detection is 6,000 molecules per peak). Capillary gel separations were shown to be three times faster, with better resolution (2.4 x), and higher separation efficiency (5.4 x) than a conventional automated slab gel DNA sequenci...

  13. Noninvasive prenatal paternity testing (NIPAT) through maternal plasma DNA sequencing

    DEFF Research Database (Denmark)

    Jiang, Haojun; Xie, Yifan; Li, Xuchao

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have been already used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we developed a noninvasive prenatal paternity testing (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influence factors (minor allele frequency (MAF), the number of total SNP, fetal fraction and effective sequencing depth) and designed three different selective SNP panels...... paternity test using STR multiplex system. Our study here proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future....

  14. Method for priming and DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Mugasimangalam, R.C.; Ulanovsky, L.E.

    1997-12-01

    A method is presented for improving the priming specificity of an oligonucleotide primer that is non-unique in a nucleic acid template. The method includes selecting a continuous stretch of several nucleotides in the template DNA where one of the four bases does not occur in the stretch. This also includes bringing the template DNA in contact with a non-unique primer partially or fully complementary to the sequence immediately upstream of the selected sequence stretch. This results in polymerase-mediated differential extension of the primer in the presence of a subset of deoxyribonucleotide triphosphates that does not contain the base complementary to the base absent in the selected sequence stretch. These reactions occur at a temperature sufficiently low for allowing the extension of the non-unique primer. The method causes polymerase-mediated extension reactions in the presence of all four natural deoxyribonucleotide triphosphates or modifications. At this high temperature discrimination occurs against priming sites of the non-unique primer where the differential extension has not made the primer sufficiently stable to prime. However, the primer extended at the selected stretch is sufficiently stable to prime.
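
    The stretch-selection step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the patented procedure: it simply finds the longest run of template bases lacking one of the four nucleotides and lists the dNTPs that would be supplied for the differential extension. The function names and the example template are hypothetical, and the template is assumed to be an uppercase A/C/G/T string.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def longest_stretch_missing_base(template):
    """Return (start, end, missing_base) for the longest run that lacks one base."""
    best = (0, 0, None)
    for base in "ACGT":
        # Each match is a maximal run of template positions not containing `base`.
        for match in re.finditer(f"[^{base}]+", template):
            if match.end() - match.start() > best[1] - best[0]:
                best = (match.start(), match.end(), base)
    return best


def dntp_subset(missing_base):
    """dNTPs to supply: all except the complement of the base absent from the stretch."""
    omitted = COMPLEMENT[missing_base]
    return sorted(f"d{base}TP" for base in "ACGT" if base != omitted)


template = "GATTACATTAAGC"
start, end, missing = longest_stretch_missing_base(template)
print(template[start:end], "lacks", missing, "-> supply", dntp_subset(missing))
```

    In this toy template the selected stretch lacks G, so dCTP is withheld and extension halts at the first template G beyond the stretch.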

  15. Genotyping of Giardia lamblia isolates from humans in China and Korea using ribosomal DNA Sequences.

    Science.gov (United States)

    Yong, T S; Park, S J; Hwang, U W; Yang, H W; Lee, K W; Min, D Y; Rim, H J; Wang, Y; Zheng, F

    2000-08-01

    Genetic characterization of a total of 15 Giardia lamblia isolates, 8 from Anhui Province, China (all from purified cysts) and 7 from Seoul, Korea (2 from axenic cultures and 5 from purified cysts), was performed by polymerase chain reaction amplification and sequencing of a 295-bp region near the 5' end of the small subunit ribosomal DNA (eukaryotic 16S rDNA). Phylogenetic analyses were subsequently conducted using sequence data obtained in this study, as well as sequences published from other Giardia isolates. The maximum parsimony method revealed that G. lamblia isolates from humans in China and Korea are divided into 2 major lineages, assemblages A and B. All 7 Korean isolates were grouped into assemblage A, whereas 4 Chinese isolates were grouped into assemblage A and 4 into assemblage B. Two Giardia microti isolates and 2 dog-derived Giardia isolates also grouped into assemblage B, whereas Giardia ardeae and Giardia muris were unique.

  16. OPTSDNA: Performance evaluation of an efficient distributed bioinformatics system for DNA sequence analysis.

    Science.gov (United States)

    Khan, Mohammad Ibrahim; Sheel, Chotan

    2013-01-01

    Storage of sequence data is a major concern because the amount of data generated at multiple locations is growing exponentially. There is therefore a need for techniques that store data using compression algorithms. Here we describe an optimal storage algorithm (OPTSDNA) for storing large numbers of DNA sequences of varying length. This paper provides a performance analysis of OPTSDNA within a distributed bioinformatics computing system for the analysis of DNA sequences. The OPTSDNA algorithm was used to store DNA sequences of different lengths, ranging from very small to very large, in a database. The algorithm calculates the storage size, and response time is also measured in this work. The efficiency and performance of the algorithm are high (in terms of percentage storage size) when compared with other known sequential approaches.
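
    The abstract does not describe how OPTSDNA encodes sequences. As a generic baseline for the kind of size comparison it mentions, the sketch below packs four nucleotides into each byte (2 bits per base). This is explicitly not the OPTSDNA algorithm; the function names are hypothetical, and the input is assumed to contain only A, C, G and T.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack_2bit(sequence):
    """Pack a DNA string into bytes, four bases per byte; also return its length."""
    packed = bytearray()
    for i in range(0, len(sequence), 4):
        chunk = sequence[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        packed.append(byte)
    return bytes(packed), len(sequence)


def unpack_2bit(packed, length):
    """Invert pack_2bit, using the stored length to drop padding."""
    bases = []
    for byte in packed:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


sequence = "GATTACAGATTACA"
packed, length = pack_2bit(sequence)
saving = 100.0 * (1.0 - len(packed) / len(sequence))
print(f"{len(sequence)} bytes as text -> {len(packed)} bytes packed ({saving:.0f}% smaller)")
```

    Storing the original length alongside the packed bytes is what allows the padding in the last byte to be discarded on decoding.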

  17. The cDNA sequence of a neutral horseradish peroxidase.

    Science.gov (United States)

    Bartonek-Roxå, E; Eriksson, H; Mattiasson, B

    1991-02-16

    A cDNA clone encoding a horseradish (Armoracia rusticana) peroxidase has been isolated and characterized. The cDNA contains 1378 nucleotides excluding the poly(A) tail and the deduced protein contains 327 amino acids, which includes a 28 amino acid leader sequence. The predicted amino acid sequence is nine amino acids shorter than the major isoenzyme belonging to the horseradish peroxidase C group (HRP-C) and the sequence shows 53.7% identity with this isoenzyme. The described clone encodes nine cysteines, of which eight correspond well with the cysteines found in HRP-C. Five potential N-glycosylation sites with the general sequence Asn-X-Thr/Ser are present in the deduced sequence. Compared to the earlier described HRP-C, this is three fewer glycosylation sites. The shorter sequence and fewer N-glycosylation sites give the native isoenzyme a molecular weight several thousand less than that of the horseradish peroxidase C isoenzymes. Comparison with the net charge value of HRP-C indicates that the described cDNA clone encodes a peroxidase which has either the same or a slightly less basic pI value, depending on whether the encoded protein is N-terminally blocked or not. This excludes the possibility that HRP-n could belong to either the HRP-A, -D or -E groups. The low sequence identity (53.7%) with HRP-C indicates that the described clone does not belong to the HRP-C isoenzyme group and comparison of the total amino acid composition with the HRP-B group does not place the described clone within this isoenzyme group. Our conclusion is that the described cDNA clone encodes a neutral horseradish peroxidase which belongs to a new, not earlier described, horseradish peroxidase group.
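
    Potential N-glycosylation sites of the Asn-X-Thr/Ser form can be located in a deduced protein sequence with a simple pattern search. The sketch below is an illustration only: the function name and the toy peptide are hypothetical, and excluding proline at position X is a commonly used refinement that the abstract itself does not mention, so it is exposed as an optional switch.

```python
import re


def find_sequons(protein, exclude_proline=True):
    """Return 0-based positions of Asn residues in Asn-X-Thr/Ser motifs."""
    middle = "[^P]" if exclude_proline else "."
    # The lookahead keeps matches zero-width after N, so overlapping sites are found.
    pattern = re.compile(f"N(?={middle}[ST])")
    return [match.start() for match in pattern.finditer(protein)]


peptide = "MNATNPSANGSQ"  # hypothetical toy sequence in one-letter code
print(find_sequons(peptide))                         # proline at X excluded
print(find_sequons(peptide, exclude_proline=False))  # literal Asn-X-Thr/Ser
```

    Counting the returned positions for two deduced sequences gives the kind of site-number comparison reported in the abstract.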

  18. Real sequence effects on the search dynamics of transcription factors on DNA

    DEFF Research Database (Denmark)

    Bauer, Maximilian; Rasmussen, Emil S.; Lomholt, Michael A.

    2015-01-01

    Recent experiments show that transcription factors (TFs) indeed use the facilitated diffusion mechanism to locate their target sequences on DNA in living bacteria cells: TFs alternate between sliding motion along DNA and relocation events through the cytoplasm. From simulations and theoretical analysis we study the TF-sliding motion for a large section of the DNA-sequence of a common E. coli strain, based on the two-state TF-model with a fast-sliding search state and a recognition state enabling target detection. For the probability to detect the target before dissociating from DNA the TF...... on the underlying nucleotide sequence is varied. A moderate dependence maximises the capability to distinguish between the main operator and similar sequences. Moreover, these auxiliary operators serve as starting points for DNA looping with the main operator, yielding a spectrum of target detection times spanning...

  19. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Science.gov (United States)

    Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca

    2015-01-01

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  20. Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

    Science.gov (United States)

    Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

    2014-11-01

    As many structures of protein-DNA complexes have become known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. The other SVM model, which uses both DNA and protein sequences, achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data.
To the best of
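
    The three performance measures quoted in the abstract above (accuracy, F-measure and MCC) all derive from a binary confusion matrix. The sketch below shows the standard formulas; the counts are made-up illustrative numbers, not data from the study, and the function name is hypothetical.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, F-measure, MCC) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return accuracy, f_measure, mcc


# Hypothetical counts: binding nucleotides are the positive class.
accuracy, f_measure, mcc = binary_metrics(tp=50, fp=10, fn=5, tn=35)
print(f"accuracy={accuracy:.3f}  F-measure={f_measure:.3f}  MCC={mcc:.3f}")
```

    Unlike accuracy, the MCC uses all four cells of the matrix and stays near zero for a classifier that performs no better than chance, which is why it is often reported alongside the other two.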

  1. Phylogenomics of Phrynosomatid Lizards: Conflicting Signals from Sequence Capture versus Restriction Site Associated DNA Sequencing

    Science.gov (United States)

    Leaché, Adam D.; Chavez, Andreas S.; Jones, Leonard N.; Grummer, Jared A.; Gottscho, Andrew D.; Linkem, Charles W.

    2015-01-01

    Sequence capture and restriction site associated DNA sequencing (RADseq) are popular methods for obtaining large numbers of loci for phylogenetic analysis. These methods are typically used to collect data at different evolutionary timescales; sequence capture is primarily used for obtaining conserved loci, whereas RADseq is designed for discovering single nucleotide polymorphisms (SNPs) suitable for population genetic or phylogeographic analyses. Phylogenetic questions that span both “recent” and “deep” timescales could benefit from either type of data, but studies that directly compare the two approaches are lacking. We compared phylogenies estimated from sequence capture and double digest RADseq (ddRADseq) data for North American phrynosomatid lizards, a species-rich and diverse group containing nine genera that began diversifying approximately 55 Ma. Sequence capture resulted in 584 loci that provided a consistent and strong phylogeny using concatenation and species tree inference. However, the phylogeny estimated from the ddRADseq data was sensitive to the bioinformatics steps used for determining homology, detecting paralogs, and filtering missing data. The topological conflicts among the SNP trees were not restricted to any particular timescale, but instead were associated with short internal branches. Species tree analysis of the largest SNP assembly, which also included the most missing data, supported a topology that matched the sequence capture tree. This preferred phylogeny provides strong support for the paraphyly of the earless lizard genera Holbrookia and Cophosaurus, suggesting that the earless morphology either evolved twice or evolved once and was subsequently lost in Callisaurus. PMID:25663487

  2. cDNA sequences of two inducible T-cell genes

    Energy Technology Data Exchange (ETDEWEB)

    Kwon, B.S. (Indiana Univ. School of Medicine, Indianapolis (USA) Guthrie Research Institute, Sayre, PA (USA)); Weissman, S.M. (Yale Univ., New Haven, CT (USA))

    1989-03-01

    The authors have previously described a set of human T-lymphocyte-specific cDNA clones isolated by a modified differential screening procedure. Apparent full-length cDNAs containing the sequences of 14 of the 16 initial isolates were sequenced and were found to represent five different species of mRNA; three of the five species were identical to previously reported cDNA sequences of preproenkephalin, T-cell-replacing factor, and a serine esterase, respectively. The other two species, 4-1BB and L2G25B, were inducible sequences found in mRNA from both a cytolytic T-lymphocyte and a helper T-lymphocyte clone and were not previously described in T-cell mRNA; these mRNA sequences encode peptides of 256 and 92 amino acids, respectively. Both peptides contain putative leader sequences. The protein encoded by 4-1BB also has a potential membrane anchor segment and other features also seen in known receptor proteins.

  3. Sequence of a cloned cDNA encoding human ribosomal protein S11

    Energy Technology Data Exchange (ETDEWEB)

    Lott, J B; Mackie, G A

    1988-02-11

    The authors have isolated a cloned cDNA that encodes human ribosomal protein (rp) S11 by screening a human fibroblast cDNA library with a labelled 204 bp DNA fragment encompassing residues 212-416 of pRS11, a rat rp S11 cDNA clone. The human rp S11 cloned cDNA consists of 15 residues of the 5' leader, the entire coding sequence and all 51 residues of the 3' untranslated region. The predicted amino acid sequence of 158 residues is identical to rat rp S11. The nucleotide sequence in the coding region differs, however, from that in rat in the first position in two codons and in the third position in 44 codons.

  4. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
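
    To give a feel for the third step, the sketch below finds simple (non-compound) microsatellites with a back-referencing regular expression. It is not the SSR_pipeline search algorithm, whose internals the abstract does not describe; the function name, parameter names and default thresholds are assumptions chosen for illustration.

```python
import re


def find_microsatellites(sequence, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each simple tandem repeat found."""
    # The lazy quantifier prefers the shortest motif; \1 demands further exact copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


read = "TTG" + "AC" * 5 + "GGT"
print(find_microsatellites(read))
```

    Compound repeats and user-defined flanking requirements, both mentioned in the abstract, would need additional logic on top of this pattern.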

  5. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

    Science.gov (United States)

    Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  6. Nucleotide sequence analysis of regions of adenovirus 5 DNA containing the origins of DNA replication

    International Nuclear Information System (INIS)

    Steenbergh, P.H.

    1979-01-01

    The purpose of the investigations described is the determination of nucleotide sequences at the molecular ends of the linear adenovirus type 5 DNA. Knowledge of the primary structure at the termini of this DNA molecule is of particular interest in the study of the mechanism of replication of adenovirus DNA. The initiation and termination sites of adenovirus DNA replication are located at the ends of the DNA molecule. (Auth.)

  7. A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data

    Science.gov (United States)

    Lea, Amanda J.

    2015-01-01

    Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and because of the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data, as well as genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html. PMID:26599596

  8. Targeting and tracing of specific DNA sequences with dTALEs in living cells

    Science.gov (United States)

    Thanisch, Katharina; Schneider, Katrin; Morbitzer, Robert; Solovei, Irina; Lahaye, Thomas; Bultmann, Sebastian; Leonhardt, Heinrich

    2014-01-01

    Epigenetic regulation of gene expression involves, besides DNA and histone modifications, the relative positioning of DNA sequences within the nucleus. To trace specific DNA sequences in living cells, we used programmable sequence-specific DNA binding of designer transcription activator-like effectors (dTALEs). We designed a recombinant dTALE (msTALE) with variable repeat domains to specifically bind a 19-bp target sequence of major satellite DNA. The msTALE was fused with green fluorescent protein (GFP) and stably expressed in mouse embryonic stem cells. Hybridization with a major satellite probe (3D-fluorescent in situ hybridization) and co-staining for known cellular structures confirmed in vivo binding of the GFP-msTALE to major satellite DNA present at nuclear chromocenters. Dual tracing of major satellite DNA and the replication machinery throughout S-phase showed co-localization during mid to late S-phase, directly demonstrating the late replication timing of major satellite DNA. Fluorescence bleaching experiments indicated a relatively stable but still dynamic binding, with mean residence times in the range of minutes. Fluorescently labeled dTALEs open new perspectives to target and trace DNA sequences and to monitor dynamic changes in subnuclear positioning as well as interactions with functional nuclear structures during cell cycle progression and cellular differentiation. PMID:24371265

  9. Targeting and tracing of specific DNA sequences with dTALEs in living cells.

    Science.gov (United States)

    Thanisch, Katharina; Schneider, Katrin; Morbitzer, Robert; Solovei, Irina; Lahaye, Thomas; Bultmann, Sebastian; Leonhardt, Heinrich

    2014-04-01

    Epigenetic regulation of gene expression involves, besides DNA and histone modifications, the relative positioning of DNA sequences within the nucleus. To trace specific DNA sequences in living cells, we used programmable sequence-specific DNA binding of designer transcription activator-like effectors (dTALEs). We designed a recombinant dTALE (msTALE) with variable repeat domains to specifically bind a 19-bp target sequence of major satellite DNA. The msTALE was fused with green fluorescent protein (GFP) and stably expressed in mouse embryonic stem cells. Hybridization with a major satellite probe (3D-fluorescent in situ hybridization) and co-staining for known cellular structures confirmed in vivo binding of the GFP-msTALE to major satellite DNA present at nuclear chromocenters. Dual tracing of major satellite DNA and the replication machinery throughout S-phase showed co-localization during mid to late S-phase, directly demonstrating the late replication timing of major satellite DNA. Fluorescence bleaching experiments indicated a relatively stable but still dynamic binding, with mean residence times in the range of minutes. Fluorescently labeled dTALEs open new perspectives to target and trace DNA sequences and to monitor dynamic changes in subnuclear positioning as well as interactions with functional nuclear structures during cell cycle progression and cellular differentiation.

  10. Phylogenetic relationships in Peniocereus (Cactaceae) inferred from plastid DNA sequence data.

    Science.gov (United States)

    Arias, Salvador; Terrazas, Teresa; Arreola-Nava, Hilda J; Vázquez-Sánchez, Monserrat; Cameron, Kenneth M

    2005-10-01

    The phylogenetic relationships of Peniocereus (Cactaceae) species were studied using parsimony analyses of DNA sequence data. The plastid rpl16 and trnL-F regions were sequenced for 98 taxa including 17 species of Peniocereus, representatives from all genera of tribe Pachycereeae, four genera of tribe Hylocereeae, as well as from three additional outgroup genera of tribes Calymmantheae, Notocacteae, and Trichocereeae. Phylogenetic analyses support neither the monophyly of Peniocereus as currently circumscribed, nor the monophyly of tribe Pachycereeae since species of Peniocereus subgenus Pseudoacanthocereus are embedded within tribe Hylocereeae. Furthermore, these results show that the eight species of Peniocereus subgenus Peniocereus (Peniocereus sensu stricto) form a well-supported clade within subtribe Pachycereinae; P. serpentinus is also a member of this subtribe, but is sister to Bergerocactus. Moreover, Nyctocereus should be resurrected as a monotypic genus. Species of Peniocereus subgenus Pseudoacanthocereus are positioned among species of Acanthocereus within tribe Hylocereeae, indicating that they may be better classified within that genus. A number of morphological and anatomical characters, especially related to the presence or absence of dimorphic branches, are discussed to support these relationships.

  11. Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

    Directory of Open Access Journals (Sweden)

    Moore JE

    2006-01-01

    Background: At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results: Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. On cloning, sequencing and comparing the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France, nucleotide sequence differences were demonstrated at six loci in the 1,469 nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion: High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted.

  12. Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

    Science.gov (United States)

    Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC

    2006-01-01

    Background: At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results: Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. On cloning, sequencing and comparing the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France, nucleotide sequence differences were demonstrated at six loci in the 1,469 nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion: High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted. PMID:16398935

  13. Rapid discrimination and classification of the Lactobacillus plantarum group based on a partial dnaK sequence and DNA fingerprinting techniques.

    Science.gov (United States)

    Huang, Chien-Hsun; Lee, Fwu-Ling; Liou, Jong-Shian

    2010-03-01

    The Lactobacillus plantarum group comprises five very closely related species. Some species of this group are considered to be probiotic and widely applied in the food industry. In this study, we compared the use of two different molecular markers, the 16S rRNA and dnaK gene, for discriminating phylogenetic relationships amongst L. plantarum strains using sequencing and DNA fingerprinting. The average sequence similarity for the dnaK gene (89.2%) among five type strains was significantly less than that for the 16S rRNA (99.4%). This result demonstrates that the dnaK gene sequence provided higher resolution than the 16S rRNA and suggests that the dnaK could be used as an additional phylogenetic marker for L. plantarum. Species-specific profiles of the Lactobacillus strains were obtained with RAPD and RFLP methods. Our data indicate that phylogenetic relationships between these strains are easily resolved using sequencing of the dnaK gene or DNA fingerprinting assays.
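
    An average similarity among five type strains, such as the 89.2% and 99.4% figures above, is a mean over all ten strain pairs. The sketch below shows one way to compute it. It assumes the sequences are already aligned to equal length and counts gap columns as mismatches; the abstract does not state the authors' exact procedure, and the toy sequences are invented.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of alignment columns where both sequences carry the same base.
    Columns containing a gap ('-') are counted as mismatches."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mean_pairwise_identity(seqs: list[str]) -> float:
    """Average percent identity over every unordered pair of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


# Invented, pre-aligned toy sequences (not real dnaK or 16S rRNA data).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
toy_mean = mean_pairwise_identity(toy)

# Five type strains give C(5, 2) = 10 pairwise comparisons.
n_pairs_five_strains = len(list(combinations(range(5), 2)))
```

    A lower mean identity for one marker than another, over the same strains, is what the abstract interprets as higher discriminating resolution.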

  14. The nucleotide sequence of human transition protein 1 cDNA

    Energy Technology Data Exchange (ETDEWEB)

    Luerssen, H; Hoyer-Fender, S; Engel, W [Universitaet Goettingen (West Germany)]

    1988-08-11

    The authors have screened a human testis cDNA library with an 81-mer oligonucleotide prepared according to a part of the published nucleotide sequence of the rat transition protein TP 1. They have isolated a cDNA clone of 441 bp containing the 162 bp coding region for human transition protein 1. There is about 84% homology in the coding region of the sequence compared to rat. The human cDNA clone encodes a polypeptide of 54 amino acids, of which 7 differ from those of rat.
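
    The figures in this abstract are mutually consistent: a 162 bp coding region corresponds to 54 codons, and 7 differing residues out of 54 imply roughly 87% amino acid identity with rat. The short check below makes that arithmetic explicit. The identity value is derived here and is not stated in the abstract, and the sketch assumes the 162 bp count excludes the stop codon, which the abstract does not specify.

```python
def codon_count(coding_bp: int) -> int:
    """Number of codons in a coding region whose length is a multiple of three."""
    if coding_bp % 3 != 0:
        raise ValueError("coding length is not a whole number of codons")
    return coding_bp // 3


def percent_identical(total_residues: int, differing: int) -> float:
    """Percent of residues that are identical between two sequences."""
    return 100.0 * (total_residues - differing) / total_residues


# Values taken from the abstract.
coding_bp = 162        # coding region length in base pairs
amino_acids = 54       # length of the encoded polypeptide
differing = 7          # residues that differ from rat

codons = codon_count(coding_bp)                               # one codon per amino acid
aa_identity = percent_identical(amino_acids, differing)       # derived, not from the abstract
```

    The derived amino acid identity is close to the roughly 84% nucleotide homology the abstract reports for the coding region.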

  15. DNA sequence analyses reveal abundant diversity, endemism and evidence for Asian origin of the porcini mushrooms.

    Directory of Open Access Journals (Sweden)

    Bang Feng

    The wild gourmet mushroom Boletus edulis and its close allies are of significant ecological and economic importance. They are found throughout the Northern Hemisphere, but despite their ubiquity there are still many unresolved issues with regard to the taxonomy, systematics and biogeography of this group of mushrooms. Most phylogenetic studies of Boletus so far have characterized samples from North America and Europe and little information is available on samples from other areas, including the ecologically and geographically diverse regions of China. Here we analyzed DNA sequence variation in three gene markers from samples of these mushrooms from across China and compared our findings with those from other representative regions. Our results revealed fifteen novel phylogenetic species (about one-third of the known species) and a newly identified lineage represented by Boletus sp. HKAS71346 from tropical Asia. The phylogenetic analyses support eastern Asia as the center of diversity for the porcini sensu stricto clade. Within this clade, B. edulis is the only known holarctic species. The majority of the other phylogenetic species are geographically restricted in their distributions. Furthermore, molecular dating and geological evidence suggest that this group of mushrooms originated during the Eocene in eastern Asia, followed by dispersal to and subsequent speciation in other parts of Asia, Europe, and the Americas from the middle Miocene through the early Pliocene. In contrast to the ancient dispersal of porcini in the strict sense in the Northern Hemisphere, the occurrence of B. reticulatus and B. edulis sensu lato in the Southern Hemisphere was probably due to recent human-mediated introductions.

  16. DNA Sequence Analyses Reveal Abundant Diversity, Endemism and Evidence for Asian Origin of the Porcini Mushrooms

    Science.gov (United States)

    Feng, Bang; Xu, Jianping; Wu, Gang; Zeng, Nian-Kai; Li, Yan-Chun; Tolgor, Bau; Kost, Gerhard W.; Yang, Zhu L.

    2012-01-01

    The wild gourmet mushroom Boletus edulis and its close allies are of significant ecological and economic importance. They are found throughout the Northern Hemisphere, but despite their ubiquity there are still many unresolved issues with regard to the taxonomy, systematics and biogeography of this group of mushrooms. Most phylogenetic studies of Boletus so far have characterized samples from North America and Europe and little information is available on samples from other areas, including the ecologically and geographically diverse regions of China. Here we analyzed DNA sequence variation in three gene markers from samples of these mushrooms from across China and compared our findings with those from other representative regions. Our results revealed fifteen novel phylogenetic species (about one-third of the known species) and a newly identified lineage represented by Boletus sp. HKAS71346 from tropical Asia. The phylogenetic analyses support eastern Asia as the center of diversity for the porcini sensu stricto clade. Within this clade, B. edulis is the only known holarctic species. The majority of the other phylogenetic species are geographically restricted in their distributions. Furthermore, molecular dating and geological evidence suggest that this group of mushrooms originated during the Eocene in eastern Asia, followed by dispersal to and subsequent speciation in other parts of Asia, Europe, and the Americas from the middle Miocene through the early Pliocene. In contrast to the ancient dispersal of porcini in the strict sense in the Northern Hemisphere, the occurrence of B. reticulatus and B. edulis sensu lato in the Southern Hemisphere was probably due to recent human-mediated introductions. PMID:22629418

  17. Analysis of host preference and geographical distribution of Anastrepha suspensa (Diptera: Tephritidae) using phylogenetic analyses of mitochondrial cytochrome oxidase I DNA sequence data.

    Science.gov (United States)

    Boykin, L M; Shatters, R G; Hall, D G; Burns, R E; Franqui, R A

    2006-10-01

    Anastrepha suspensa (Loew) is an economically important pest, restricted to the Greater Antilles and southern Florida. It infests a wide variety of hosts and is of quarantine importance in citrus, a multi-million dollar industry in Florida. The observed recent increase in citrus infested with A. suspensa in Florida has raised questions regarding host-specificity of certain populations and genetic diversity of the pest throughout its geographical distribution. Cytochrome oxidase I (COI) DNA sequence data were used to characterize the genetic diversity of A. suspensa from Florida and Caribbean populations reared from different host plants. Maximum likelihood and Bayesian phylogenetic methods were used to analyse COI data. Sequence variation among mitochondrial COI genes from 107 A. suspensa samples collected throughout Florida and the Caribbean ranged between 0 and 10% and placed all A. suspensa in a monophyletic clade sister to a Central American group of the A. fraterculus paraphyletic species complex. The most likely tree of the COI locus indicated that COI sequence variation was too low to provide resolution at the subspecies level; therefore, monophyletic groups based on host-plant use, geography (Florida, Jamaica, Cayman Islands, Puerto Rico or Dominican Republic) or population sampled are not supported. This result indicates that no population segregation has occurred based on these biological or geographical distinctions and that this is a generalist, polyphagous invasive genotype. Alternatively, if populations are distinct, the segregation event was more recent than can be distinguished based on COI sequence variation.

  18. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors.

    Science.gov (United States)

    Adalsteinsson, Viktor A; Ha, Gavin; Freeman, Samuel S; Choudhury, Atish D; Stover, Daniel G; Parsons, Heather A; Gydush, Gregory; Reed, Sarah C; Rotem, Denisse; Rhoades, Justin; Loginov, Denis; Livitz, Dimitri; Rosebrock, Daniel; Leshchiner, Ignaty; Kim, Jaegil; Stewart, Chip; Rosenberg, Mara; Francis, Joshua M; Zhang, Cheng-Zhong; Cohen, Ofir; Oh, Coyin; Ding, Huiming; Polak, Paz; Lloyd, Max; Mahmud, Sairah; Helvie, Karla; Merrill, Margaret S; Santiago, Rebecca A; O'Connor, Edward P; Jeong, Seong H; Leeson, Rachel; Barry, Rachel M; Kramkowski, Joseph F; Zhang, Zhenwei; Polacek, Laura; Lohr, Jens G; Schleicher, Molly; Lipscomb, Emily; Saltzman, Andrea; Oliver, Nelly M; Marini, Lori; Waks, Adrienne G; Harshman, Lauren C; Tolaney, Sara M; Van Allen, Eliezer M; Winer, Eric P; Lin, Nancy U; Nakabayashi, Mari; Taplin, Mary-Ellen; Johannessen, Cory M; Garraway, Levi A; Golub, Todd R; Boehm, Jesse S; Wagle, Nikhil; Getz, Gad; Love, J Christopher; Meyerson, Matthew

    2017-11-06

    Whole-exome sequencing of cell-free DNA (cfDNA) could enable comprehensive profiling of tumors from blood but the genome-wide concordance between cfDNA and tumor biopsies is uncertain. Here we report ichorCNA, software that quantifies tumor content in cfDNA from 0.1× coverage whole-genome sequencing data without prior knowledge of tumor mutations. We apply ichorCNA to 1439 blood samples from 520 patients with metastatic prostate or breast cancers. In the earliest tested sample for each patient, 34% of patients have ≥10% tumor-derived cfDNA, sufficient for standard coverage whole-exome sequencing. Using whole-exome sequencing, we validate the concordance of clonal somatic mutations (88%), copy number alterations (80%), mutational signatures, and neoantigens between cfDNA and matched tumor biopsies from 41 patients with ≥10% cfDNA tumor content. In summary, we provide methods to identify patients eligible for comprehensive cfDNA profiling, revealing its applicability to many patients, and demonstrate high concordance of cfDNA and metastatic tumor whole-exome sequencing.

  19. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Directory of Open Access Journals (Sweden)

    Soichi Inagaki

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
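
    The idea of finding read pairs that span a genome/T-DNA junction can be sketched as follows: a pair is a candidate when one mate aligns to the plant genome and the other to a T-DNA reference. This is an illustration of the general principle only and not the authors' tool, whose internals the abstract does not describe. The `Mate` record, the reference name `pTDNA_A` and the coordinates are hypothetical; real input would be parsed from alignment files.

```python
from typing import NamedTuple, Optional


class Mate(NamedTuple):
    ref: Optional[str]  # reference the mate aligned to; None if unmapped
    pos: int            # 1-based leftmost alignment position


def spans_junction(m1: Mate, m2: Mate, tdna_refs: set[str]) -> bool:
    """True when both mates are mapped and exactly one lies on a T-DNA reference."""
    if m1.ref is None or m2.ref is None:
        return False
    return (m1.ref in tdna_refs) != (m2.ref in tdna_refs)


def junction_candidates(pairs, tdna_refs: set[str]) -> list[tuple[str, int]]:
    """Return (chromosome, position) of the genomic mate for each spanning pair."""
    hits = []
    for m1, m2 in pairs:
        if spans_junction(m1, m2, tdna_refs):
            genomic = m2 if m1.ref in tdna_refs else m1
            hits.append((genomic.ref, genomic.pos))
    return hits


# Hypothetical read pairs.
pairs = [
    (Mate("Chr1", 1200), Mate("Chr1", 1450)),     # both genomic: not a junction
    (Mate("Chr3", 88000), Mate("pTDNA_A", 35)),   # spans a junction
    (Mate("pTDNA_A", 4100), Mate("Chr5", 502)),   # spans a junction
    (Mate(None, 0), Mate("pTDNA_A", 10)),         # one mate unmapped: skipped
]
hits = junction_candidates(pairs, {"pTDNA_A"})
```

    Clustering the genomic positions returned this way would give approximate insertion sites; resolving the exact junction structure needs reads that cross the breakpoint itself.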

  20. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed third- and fourth-year undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  1. Sequencing historical specimens: successful preparation of small specimens with low amounts of degraded DNA.

    Science.gov (United States)

    Sproul, John S; Maddison, David R

    2017-11-01

    Despite advances that allow DNA sequencing of old museum specimens, sequencing small-bodied, historical specimens can be challenging and unreliable as many contain only small amounts of fragmented DNA. Dependable methods to sequence such specimens are especially critical if the specimens are unique. We attempt to sequence small-bodied (3-6 mm) historical specimens (including nomenclatural types) of beetles that have been housed, dried, in museums for 58-159 years, and for which few or no suitable replacement specimens exist. To better understand ideal approaches of sample preparation and produce preparation guidelines, we compared different library preparation protocols using low amounts of input DNA (1-10 ng). We also explored low-cost optimizations designed to improve library preparation efficiency and sequencing success of historical specimens with minimal DNA, such as enzymatic repair of DNA. We report successful sample preparation and sequencing for all historical specimens despite our low-input DNA approach. We provide a list of guidelines related to DNA repair, bead handling, reducing adapter dimers and library amplification. We present these guidelines to facilitate more economical use of valuable DNA and enable more consistent results in projects that aim to sequence challenging, irreplaceable historical specimens. © 2017 John Wiley & Sons Ltd.

  2. Spliced DNA Sequences in the Paramecium Germline: Their Properties and Evolutionary Potential

    Science.gov (United States)

    Catania, Francesco; McGrath, Casey L.; Doak, Thomas G.; Lynch, Michael

    2013-01-01

    Despite playing a crucial role in germline-soma differentiation, the evolutionary significance of developmentally regulated genome rearrangements (DRGRs) has received scant attention. An example of DRGR is DNA splicing, a process that removes segments of DNA interrupting genic and/or intergenic sequences. Perhaps best known for shaping immune-system genes in vertebrates, DNA splicing plays a central role in the life of ciliated protozoa, where thousands of germline DNA segments are eliminated after sexual reproduction to regenerate a functional somatic genome. Here, we identify and chronicle the properties of 5,286 sequences that putatively undergo DNA splicing (i.e., internal eliminated sequences [IESs]) across the genomes of three closely related species of the ciliate Paramecium (P. tetraurelia, P. biaurelia, and P. sexaurelia). The study reveals that these putative IESs share several physical characteristics. Although our results are consistent with excision events being largely conserved between species, episodes of differential IES retention/excision occur, may have a recent origin, and frequently involve coding regions. Our findings indicate interconversion between somatic (often coding) DNA sequences and noncoding IESs, and provide insights into the role of DNA splicing in creating potentially functional genetic innovation. PMID:23737328

  3. Ribosomal DNA intergenic spacer sequence in foxtail millet, Setaria italica (L.) P. Beauv. and its characterization and application to typing of foxtail millet landraces.

    Science.gov (United States)

    Fukunaga, Kenji; Ichitani, Katsuyuki; Taura, Satoru; Sato, Muneharu; Kawase, Makoto

    2005-02-01

    We determined the sequence of ribosomal DNA (rDNA) intergenic spacer (IGS) of foxtail millet isolated in our previous study, and identified subrepeats in the polymorphic region. We also developed a PCR-based method for identifying rDNA types based on sequence information and assessed 153 accessions of foxtail millet. Results were congruent with our previous works. This study provides new findings regarding the geographical distribution of rDNA variants. This new method facilitates analyses of numerous foxtail millet accessions. It is helpful for typing of foxtail millet germplasms and elucidating the evolution of this millet.

  4. Systematics of Cladophora spp. (Chlorophyta) from North Carolina, USA, based upon morphology and DNA sequence data with a description of Cladophora subtilissima sp. nov.

    Science.gov (United States)

    Taylor, Robin L; Bailey, Jeffrey Craig; Freshwater, David Wilson

    2017-06-01

    Identification of Cladophora species is challenging due to conservation of gross morphology, few discrete autapomorphies, and environmental influences on morphology. Twelve species of marine Cladophora were reported from North Carolina waters. Cladophora specimens were collected from inshore and offshore marine waters for DNA sequence and morphological analyses. The nuclear-encoded rRNA internal transcribed spacer regions (ITS) were sequenced for 105 specimens and used in molecular assisted identification. The ITS1 and ITS2 regions were highly variable, and sequences were sorted into ITS Sets of Alignable Sequences (SASs). Sequencing of short hyper-variable ITS1 sections from Cladophora type specimens was used to positively identify species represented by SASs when the types were made available. Secondary structures for the ITS1 locus were also predicted for each specimen and compared to predicted structures from Cladophora sequences available in GenBank. Nine ITS SASs were identified and representative specimens chosen for phylogenetic analyses of 18S and 28S rRNA gene sequences to reveal relationships with other Cladophora species. Phylogenetic analyses indicated that marine Cladophorales were polyphyletic and separated into two clades, the Cladophora clade and the "Siphonocladales" clade. Morphological analyses were performed to assess the consistency of character states within species, and complement the DNA sequence analyses. These analyses revealed intra- and interspecific character state variation, and that combined molecular and morphological analyses were required for the identification of species. One new report, Cladophora dotyana, and one new species, Cladophora subtilissima sp. nov., were revealed, and increased the biodiversity of North Carolina marine Cladophora to 14 species. © 2017 Phycological Society of America.

  5. Utility of 16S rDNA Sequencing for Identification of Rare Pathogenic Bacteria.

    Science.gov (United States)

    Loong, Shih Keng; Khor, Chee Sieng; Jafar, Faizatul Lela; AbuBakar, Sazaly

    2016-11-01

    Phenotypic identification systems are established methods for laboratory identification of bacteria causing human infections. Here, the utility of phenotypic identification systems was compared against the 16S rDNA identification method on clinical isolates obtained during a 5-year study period, with special emphasis on isolates that gave unsatisfactory identification. One hundred and eighty-seven clinical bacterial isolates were tested with commercial phenotypic identification systems and 16S rDNA sequencing. Isolate identities determined using phenotypic identification systems and 16S rDNA sequencing were compared for similarity at genus and species level, with 16S rDNA sequencing as the reference method. Phenotypic identification systems identified ~46% (86/187) of the isolates with identity similar to that identified using 16S rDNA sequencing. Approximately 39% (73/187) of the isolates showed a different genus identity, and ~15% (28/187) could not be identified using the phenotypic identification systems. Both methods succeeded in determining the species identities of 55 isolates; however, only ~69% (38/55) of the isolates matched at species level. 16S rDNA sequencing could not determine the species of ~20% (37/187) of the isolates. 16S rDNA sequencing is more useful than the phenotypic identification systems for the identification of rare and difficult-to-identify bacterial species. The 16S rDNA sequencing method, however, has limitations for species-level identification of some bacteria, highlighting the need for better bacterial pathogen identification tools. © 2016 Wiley Periodicals, Inc.
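
    The rounded percentages in this abstract can be reproduced from the raw counts it reports, and the three genus-level outcomes account for all 187 isolates. The check below uses only numbers given in the abstract; the dictionary labels are shorthand introduced here.

```python
TOTAL_ISOLATES = 187

# Outcome of phenotypic identification relative to 16S rDNA sequencing.
genus_level = {
    "similar identity": 86,
    "different genus": 73,
    "not identified": 28,
}


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# The three outcomes partition the full set of isolates.
genus_total = sum(genus_level.values())

genus_pct = {label: percent(n, TOTAL_ISOLATES) for label, n in genus_level.items()}

# Species-level agreement among the 55 isolates both methods resolved to species.
species_match_pct = percent(38, 55)

# Isolates for which 16S rDNA sequencing could not determine the species.
unresolved_16s_pct = percent(37, TOTAL_ISOLATES)
```

    Note that the ~69% species-level figure uses 55 isolates as its denominator, not 187, so it is not directly comparable with the genus-level percentages.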

  6. ABI Base Recall: Automatic Correction and Ends Trimming of DNA Sequences.

    Science.gov (United States)

    Elyazghi, Zakaria; Yazouli, Loubna El; Sadki, Khalid; Radouani, Fouzia

    2017-12-01

    Automated DNA sequencers produce chromatogram files in ABI format. When viewing chromatograms, some ambiguities are shown at various sites along the DNA sequences, because the program implemented in the sequencing machine and used to call bases cannot always precisely determine the right nucleotide, especially when it is represented by either a broad peak or a set of overlaying peaks. In such cases, a letter other than A, C, G, or T is recorded, most commonly N. Thus, DNA sequencing chromatograms need manual examination: checking for mis-calls and truncating the sequence when errors become too frequent. The purpose of this paper is to develop a program allowing the automatic correction of these ambiguities. This application is a Web-based program powered by Shiny and runs on the R platform for ease of use. As part of the interface, we added an automatic end-clipping option, alignment against reference sequences, and BLAST. To develop and test our tool, we collected several bacterial DNA sequences from different laboratories within Institut Pasteur du Maroc and performed both manual and automatic correction, then compared the two methods. As a result, we note that our program, ABI base recall, achieves good correction with high accuracy. Indeed, it increases the rate of identity and coverage and minimizes the number of mismatches and gaps; hence it provides a solution to sequencing ambiguities and saves biologists' time and labor.
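
    End clipping of the kind described above can be sketched with a simple rule: cut each end back to the first run of consecutive unambiguous bases. This is not the ABI Base Recall algorithm, which is an R/Shiny application that works from chromatogram data. The rule, the function name and the `min_run` parameter are assumptions made for illustration, and the example read is invented.

```python
UNAMBIGUOUS = frozenset("ACGT")


def _first_clean_run(seq: str, min_run: int):
    """Index where the first run of `min_run` consecutive A/C/G/T bases starts,
    or None if the sequence contains no such run."""
    run = 0
    for i, base in enumerate(seq):
        run = run + 1 if base in UNAMBIGUOUS else 0
        if run == min_run:
            return i - min_run + 1
    return None


def trim_ends(seq: str, min_run: int = 5) -> str:
    """Clip both ends of a base-called read back to the nearest run of
    `min_run` unambiguous bases. Ambiguity codes inside the kept region
    are left untouched."""
    if min_run < 1:
        raise ValueError("min_run must be at least 1")
    seq = seq.upper()
    start = _first_clean_run(seq, min_run)
    if start is None:
        return ""  # no clean stretch anywhere in the read
    # Scanning the reversed read gives the number of bases to clip from the 3' end.
    clip_right = _first_clean_run(seq[::-1], min_run)
    return seq[start:len(seq) - clip_right]


# Invented read with noisy ends and one internal ambiguity.
trimmed = trim_ends("NNANACGTACGTNACGTANN", min_run=4)
```

    The internal N survives trimming; resolving it would require the peak data from the chromatogram, which is the correction step the abstract describes.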

  7. Probing DNA in nanopores via tunneling: from sequencing to "quantum" analogies

    Science.gov (United States)

    di Ventra, Massimiliano

    2012-02-01

    Fast and low-cost DNA sequencing methods would revolutionize medicine: a person could have his/her full genome sequenced so that drugs could be tailored to his/her specific illnesses; doctors could know in advance patients' likelihood to develop a given ailment; cures to major diseases could be found faster [1]. However, this goal of "personalized medicine" is hampered today by the high cost and slow speed of DNA sequencing methods. In this talk, I will discuss the sequencing protocol we suggest which requires the measurement of the distributions of transverse currents during the translocation of single-stranded DNA into nanopores [2-5]. I will support our conclusions with a combination of molecular dynamics simulations coupled to quantum mechanical calculations of electrical current in experimentally realizable systems [2-5]. I will also discuss recent experiments that support these theoretical predictions. In addition, I will show how this relatively unexplored area of research at the interface between solids, liquids, and biomolecules at the nanometer length scale is a fertile ground to study quantum phenomena that have a classical counterpart, such as ionic quasi-particles, ionic "quantized" conductance [6,7] and Coulomb blockade [8]. Work supported in part by NIH. [1] M. Zwolak, M. Di Ventra, Physical Approaches to DNA Sequencing and Detection, Rev. Mod. Phys. 80, 141 (2008). [2] M. Zwolak and M. Di Ventra, Electronic signature of DNA nucleotides via transverse transport, Nano Lett. 5, 421 (2005). [3] J. Lagerqvist, M. Zwolak, and M. Di Ventra, Fast DNA sequencing via transverse electronic transport, Nano Lett. 6, 779 (2006). [4] J. Lagerqvist, M. Zwolak, and M. Di Ventra, Influence of the environment and probes on rapid DNA sequencing via transverse electronic transport, Biophys. J. 93, 2384 (2007). [5] M. Krems, M. Zwolak, Y.V. Pershin, and M. Di Ventra, Effect of noise on DNA sequencing via transverse electronic transport

  8. Molecular cloning of chicken metallothionein. Deduction of the complete amino acid sequence and analysis of expression using cloned cDNA

    Energy Technology Data Exchange (ETDEWEB)

    Wei, D; Andrews, G K

    1988-01-25

    A cDNA library was constructed using RNA isolated from the livers of chickens which had been treated with zinc. This library was screened with an RNA probe complementary to mouse metallothionein-I (MT), and eight chicken MT cDNA clones were obtained. All of the cDNA clones contained nucleotide sequences homologous to regions of the longest (375 bp) cDNA clone. The latter contained an open reading frame of 189 bp, and the deduced amino acid sequence indicates a protein of 63 amino acids of which 20 are cysteine residues. Amino acid composition and partial amino acid sequence analyses of purified chicken MT protein agreed with the amino acid composition and sequence deduced from the cloned cDNA. Amino acid sequence comparisons establish that chicken MT shares extensive homology with mammalian MTs. Southern blot analysis of chicken DNA indicates that the chicken MT gene is not a part of a large family of related sequences, but rather is likely to be a unique gene sequence. In the chicken liver, levels of chicken MT mRNA were rapidly induced by metals (Cd²⁺, Zn²⁺, Cu²⁺), glucocorticoids and lipopolysaccharide. MT mRNA was present in low levels in embryonic liver and increased to high levels during the first week after hatching before decreasing again to the basal levels found in adult liver. The results of this study establish that MT is highly conserved between birds and mammals and is regulated in the chicken by agents which also regulate expression of mammalian MT genes. However, in contrast to the mammals, the results suggest the existence of a single isoform of MT in the chicken.

  9. A sequence-dependent rigid-base model of DNA

    Science.gov (United States)

    Gonzalez, O.; Petkevičiutė, D.; Maddocks, J. H.

    2013-02-01

    A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. 
    For example, due to the presence of frustration, the model can
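
    For reference, the Kullback-Leibler divergence mentioned in this abstract has a closed form when both densities are Gaussian. The expression below is the standard textbook result, not a formula quoted from the paper; the symbols (means μ₁, μ₂, covariances Σ₁, Σ₂, dimension n) are notation assumed here for illustration.

```latex
D_{\mathrm{KL}}(\rho_1 \,\|\, \rho_2)
  = \int \rho_1(w)\,\ln\frac{\rho_1(w)}{\rho_2(w)}\,dw
  = \tfrac12\Big[\operatorname{tr}\!\big(\Sigma_2^{-1}\Sigma_1\big)
    + (\mu_2-\mu_1)^{\mathsf T}\,\Sigma_2^{-1}\,(\mu_2-\mu_1)
    - n + \ln\frac{\det\Sigma_2}{\det\Sigma_1}\Big]
```

    The first equality is the general definition; the second holds only for n-dimensional Gaussians. The divergence is zero exactly when the two densities coincide, and it is not symmetric in its arguments.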

  10. A sequence-dependent rigid-base model of DNA.

    Science.gov (United States)

    Gonzalez, O; Petkevičiūtė, D; Maddocks, J H

    2013-02-07

    A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. 
    For example, due to the presence of frustration, the model can

  11. RevTrans: multiple alignment of coding DNA from aligned amino acid sequences

    DEFF Research Database (Denmark)

    Wernersson, Rasmus; Pedersen, Anders Gorm

    2003-01-01

    The simple fact that proteins are built from 20 amino acids while DNA only contains four different bases, means that the 'signal-to-noise ratio' in protein sequence alignments is much better than in alignments of DNA. Besides this information-theoretical advantage, protein alignments also benefit...... proteins. It is therefore preferable to align coding DNA at the amino acid level and it is for this purpose we have constructed the program RevTrans. RevTrans constructs a multiple DNA alignment by: (i) translating the DNA; (ii) aligning the resulting peptide sequences; and (iii) building a multiple DNA...
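
    The three steps listed in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the RevTrans source: the function names `translate` and `back_thread` are hypothetical, the standard genetic code is assumed, and step (ii) is not implemented (the aligned peptides are supplied by hand where a real protein aligner would be used).

```python
# Standard genetic code, codons enumerated in TCAG order (assumed here;
# other translation tables would need a different amino-acid string).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Step (i): translate coding DNA codon by codon (trailing partial codon ignored)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def back_thread(aligned_peptide, dna):
    """Step (iii): rebuild the DNA alignment row from an aligned peptide.

    Every gap in the peptide becomes a three-base gap; every residue
    consumes the next codon of the original, unaligned DNA.
    """
    out, pos = [], 0
    for residue in aligned_peptide:
        if residue == "-":
            out.append("---")
        else:
            out.append(dna[pos:pos + 3])
            pos += 3
    return "".join(out)


dna_seqs = ["ATGGCTAAA", "ATGAAA"]
peptides = [translate(s) for s in dna_seqs]      # ["MAK", "MK"]
# Step (ii) would align the peptides; this alignment is written by hand.
aligned_peptides = ["MAK", "M-K"]
for pep, dna in zip(aligned_peptides, dna_seqs):
    print(back_thread(pep, dna))
# prints ATGGCTAAA then ATG---AAA
```

    Because gaps are inserted only in whole-codon units, the resulting DNA alignment keeps the reading frame intact.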

  12. Pseudogenes and DNA-based diet analyses: A cautionary tale from a relatively well sampled predator-prey system

    DEFF Research Database (Denmark)

    Dunshea, G.; Barros, N. B.; Wells, R. S.

    2008-01-01

    Mitochondrial ribosomal DNA is commonly used in DNA-based dietary analyses. In such studies, these sequences are generally assumed to be the only version present in DNA of the organism of interest. However, nuclear pseudogenes that display variable similarity to the mitochondrial versions...... are common in many taxa. The presence of nuclear pseudogenes that co-amplify with their mitochondrial paralogues can lead to several possible confounding interpretations when applied to estimating animal diet. Here, we investigate the occurrence of nuclear pseudogenes in fecal samples taken from bottlenose...... dolphins (Tursiops truncatus) that were assayed for prey DNA with a universal primer technique. We found pseudogenes in 13 of 15 samples and 1-5 pseudogene haplotypes per sample representing 5-100% of all amplicons produced. The proportion of amplicons that were pseudogenes and the diversity of prey DNA...

  13. Sequence-selective single-molecule alkylation with a pyrrole-imidazole polyamide visualized in a DNA nanoscaffold.

    Science.gov (United States)

    Yoshidome, Tomofumi; Endo, Masayuki; Kashiwazaki, Gengo; Hidaka, Kumi; Bando, Toshikazu; Sugiyama, Hiroshi

    2012-03-14

    We demonstrate a novel strategy for visualizing sequence-selective alkylation of target double-stranded DNA (dsDNA) using a synthetic pyrrole-imidazole (PI) polyamide in a designed DNA origami scaffold. A doubly functionalized PI polyamide was designed by introduction of an alkylating agent 1-(chloromethyl)-5-hydroxy-1,2-dihydro-3H-benz[e]indole (seco-CBI) and biotin for sequence-selective alkylation at the target sequence and subsequent streptavidin labeling, respectively. Selective alkylation of the target site in the substrate DNA was observed by analysis using sequencing gel electrophoresis. For the single-molecule observation of the alkylation by functionalized PI polyamide using atomic force microscopy (AFM), the target position in the dsDNA (∼200 base pairs) was alkylated and then visualized by labeling with streptavidin. A newly designed DNA origami scaffold, named the "five-well DNA frame" and carrying five different dsDNA sequences in its cavities, was used for the detailed analysis of the sequence-selectivity and alkylation. The 64-mer dsDNAs were introduced to five individual wells, in which target sequence AGTXCCA/TGGYACT (XY = AT, TA, GC, CG) was employed as fully matched (X = G) and one-base mismatched (X = A, T, C) sequences. The fully matched sequence was alkylated with 88% selectivity over other mismatched sequences. In addition, the PI polyamide failed to attach to the target sequence lacking the alkylation site after washing and streptavidin treatment. Therefore, the PI polyamide discriminated the one mismatched nucleotide at the single-molecule level, and alkylation anchored the PI polyamide to the target dsDNA.

  14. Polyfluorophore Labels on DNA: Dramatic Sequence Dependence of Quenching

    Science.gov (United States)

    Teo, Yin Nah; Wilson, James N.

    2010-01-01

    We describe studies carried out in the DNA context to test how a common fluorescence quencher, dabcyl, interacts with oligodeoxynucleoside fluorophores (ODFs), a system of stacked, electronically interacting fluorophores built on a DNA scaffold. We tested twenty different tetrameric ODF sequences containing varied combinations and orderings of pyrene (Y), benzopyrene (B), perylene (E), dimethylaminostilbene (D), and spacer (S) monomers conjugated to the 3′ end of a DNA oligomer. Hybridization of this probe sequence to a dabcyl-labeled complementary strand resulted in strong quenching of fluorescence in 85% of the twenty ODF sequences. The high efficiency of quenching was also established by their large Stern–Volmer constants (KSV) of between 2.1 × 10⁴ and 4.3 × 10⁵ M⁻¹, measured with a free dabcyl quencher. Interestingly, quenching of ODFs displayed strong sequence dependence. This was particularly evident in anagrams of ODF sequences; for example, the sequence BYDS had a KSV that was approximately two orders of magnitude greater than that of BSDY, which has the same dye composition. Other anagrams, for example EDSY and ESYD, also displayed different responses upon quenching by dabcyl. Analysis of spectra showed that apparent excimer and exciplex emission bands were quenched with much greater efficiency compared to monomer emission bands by at least an order of magnitude. This suggests an important role played by delocalized excited states of the π stack of fluorophores in the amplified quenching of fluorescence. PMID:19780115

  15. Templated Chemistry for Sequence-Specific Fluorogenic Detection of Duplex DNA

    Science.gov (United States)

    Li, Hao; Franzini, Raphael M.; Bruner, Christopher; Kool, Eric T.

    2015-01-01

    We describe the development of templated fluorogenic chemistry for detection of specific sequences of duplex DNA in solution. In this approach, two modified homopyrimidine oligodeoxynucleotide probes are designed to bind by triple helix formation at adjacent positions on a specific purine-rich target sequence of duplex DNA. One fluorescein-labeled probe contains an α-azidoether linker to a fluorescence quencher; the second (trigger) probe carries a triarylphosphine, designed to reduce the azide and cleave the linker. The data showed that at pH 5.6 these probes yielded a strong fluorescence signal within minutes on addition to a complementary homopurine duplex DNA target. The signal increased by a factor of ca. 60, and was completely dependent on the presence of the target DNA. Replacement of cytosine in the probes with pseudoisocytosine allowed the templated chemistry to proceed readily at pH 7. Single nucleotide mismatches in the target oligonucleotide slowed the templated reaction considerably, demonstrating high sequence selectivity. The use of templated fluorogenic chemistry for detection of duplex DNAs has not been previously reported and may allow detection of double stranded DNA, at least for homopurine-homopyrimidine target sites, under native, non-disturbing conditions. PMID:20859985

  16. DNA Extraction Protocols for Whole-Genome Sequencing in Marine Organisms.

    Science.gov (United States)

    Panova, Marina; Aronsson, Henrik; Cameron, R Andrew; Dahl, Peter; Godhe, Anna; Lind, Ulrika; Ortega-Martinez, Olga; Pereyra, Ricardo; Tesson, Sylvie V M; Wrange, Anna-Lisa; Blomberg, Anders; Johannesson, Kerstin

    2016-01-01

    The marine environment harbors a large proportion of the total biodiversity on this planet, including the majority of the Earth's different phyla and classes. Studying the genomes of marine organisms can bring interesting insights into genome evolution. Today, almost all marine organismal groups are understudied with respect to their genomes. One potential reason is that extraction of high-quality DNA in sufficient amounts is challenging for many marine species. This is due to high polysaccharide content, polyphenols and other secondary metabolites that will inhibit downstream DNA library preparations. Consequently, protocols developed for vertebrates and plants do not always perform well for invertebrates and algae. In addition, many marine species have large population sizes and, as a consequence, highly variable genomes. Thus, to facilitate the sequence read assembly process during genome sequencing, it is desirable to obtain enough DNA from a single individual, which is a challenge in many species of invertebrates and algae. Here, we present DNA extraction protocols for seven marine species (four invertebrates, two algae, and a marine yeast), optimized to provide sufficient DNA quality and yield for de novo genome sequencing projects.

  17. An Efficient Approach to Mining Maximal Contiguous Frequent Patterns from Large DNA Sequence Databases

    Directory of Open Access Journals (Sweden)

    Md. Rezaul Karim

    2012-03-01

    Mining interesting patterns from DNA sequences is one of the most challenging tasks in bioinformatics and computational biology. Maximal contiguous frequent patterns are preferable for expressing the function and structure of DNA sequences and hence can capture the common data characteristics among related sequences. Biologists are interested in finding frequent orderly arrangements of motifs that are responsible for similar expression of a group of genes. In order to reduce mining time and complexity, however, most existing sequence mining algorithms either focus on finding short DNA sequences or require explicit specification of sequence lengths in advance. The challenge is to find longer sequences without specifying sequence lengths in advance. In this paper, we propose an efficient approach to mining maximal contiguous frequent patterns from large DNA sequence datasets. The experimental results show that our proposed approach is memory-efficient and mines maximal contiguous frequent patterns within a reasonable time.
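
    To make the notion of a maximal contiguous frequent pattern concrete, here is a small Python sketch. It is a naive level-wise search written for illustration only, not the algorithm proposed in the paper (which the abstract does not describe). The function names and the definition of support (the number of sequences containing the pattern) are assumptions.

```python
from collections import defaultdict


def frequent_contiguous(seqs, min_support):
    """Return {pattern: support} for every substring found in >= min_support sequences.

    Patterns are grown one base at a time; no maximum length is specified
    in advance. A k-mer is counted only if its (k-1)-base prefix was frequent.
    """
    frequent = {}
    k = 1
    while True:
        seen_in = defaultdict(set)
        for idx, seq in enumerate(seqs):
            for i in range(len(seq) - k + 1):
                sub = seq[i:i + k]
                if k == 1 or sub[:-1] in frequent:
                    seen_in[sub].add(idx)
        level = {p: len(ids) for p, ids in seen_in.items()
                 if len(ids) >= min_support}
        if not level:          # nothing frequent at this length: stop growing
            return frequent
        frequent.update(level)
        k += 1


def maximal(frequent):
    """Keep only patterns that are not contained in a longer frequent pattern."""
    return {p: n for p, n in frequent.items()
            if not any(p != q and p in q for q in frequent)}


seqs = ["ACGTAC", "TACGTT", "GGACGT"]
print(maximal(frequent_contiguous(seqs, min_support=3)))
# prints {'ACGT': 3}
```

    The quadratic containment check in `maximal` is acceptable for a toy input; a large database would need an index-based approach.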

  18. DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.

    Science.gov (United States)

    Ma, Wenxiu; Yang, Lin; Rohs, Remo; Noble, William Stafford

    2017-10-01

    Transcription factors (TFs) bind to specific DNA sequence motifs. Several lines of evidence suggest that TF-DNA binding is mediated in part by properties of the local DNA shape: the width of the minor groove, the relative orientations of adjacent base pairs, etc. Several methods have been developed to jointly account for DNA sequence and shape properties in predicting TF binding affinity. However, a limitation of these methods is that they typically require a training set of aligned TF binding sites. We describe a sequence + shape kernel that leverages DNA sequence and shape information to better understand protein-DNA binding preference and affinity. This kernel extends an existing class of k-mer based sequence kernels, based on the recently described di-mismatch kernel. Using three in vitro benchmark datasets, derived from universal protein binding microarrays (uPBMs), genomic context PBMs (gcPBMs) and SELEX-seq data, we demonstrate that incorporating DNA shape information improves our ability to predict protein-DNA binding affinity. In particular, we observe that (i) the k-spectrum + shape model performs better than the classical k-spectrum kernel, particularly for small k values; (ii) the di-mismatch kernel performs better than the k-mer kernel, for larger k; and (iii) the di-mismatch + shape kernel performs better than the di-mismatch kernel for intermediate k values. The software is available at https://bitbucket.org/wenxiu/sequence-shape.git. Contact: rohs@usc.edu or william-noble@uw.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
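
    The classical k-spectrum kernel that this abstract uses as its baseline is simple enough to sketch directly. The Python below shows only that baseline, under the usual definition (inner product of k-mer count vectors); it includes neither the shape features nor the di-mismatch kernel, and the function names are hypothetical rather than taken from the authors' software.

```python
from collections import Counter


def spectrum(seq, k):
    """Count every overlapping k-mer in seq (the k-spectrum feature map)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def k_spectrum_kernel(x, y, k):
    """Inner product of the two k-mer count vectors; no alignment is needed."""
    sx, sy = spectrum(x, k), spectrum(y, k)
    return sum(count * sy[kmer] for kmer, count in sx.items())


# "ACGT" has 2-mers AC, CG, GT; "CGTA" has CG, GT, TA; they share CG and GT.
print(k_spectrum_kernel("ACGT", "CGTA", 2))
# prints 2
```

    Because the kernel depends only on k-mer counts, the sequences being compared do not have to be aligned or of equal length.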

  19. [Phylogenetic relationships among the genera of Taxodiaceae and Cupressaceae from 28S rDNA sequences].

    Science.gov (United States)

    Li, Chun-Xiang; Yang, Qun

    2003-03-01

    DNA sequences from 28S rDNA were used to assess relationships between and within traditional Taxodiaceae and Cupressaceae s.s. The MP tree and NJ tree are generally similar to one another. The results show that Taxodiaceae and Cupressaceae s.s. form a monophyletic conifer lineage excluding Sciadopitys. In the Taxodiaceae-Cupressaceae s.s. monophyletic group, the Taxodiaceae is paraphyletic. Taxodium, Glyptostrobus and Cryptomeria form a clade (Taxodioideae), in which Glyptostrobus and Taxodium are closely related and sister to Cryptomeria; Sequoia, Sequoiadendron and Metasequoia are closely related to each other, forming another clade (Sequoioideae), in which Sequoia and Sequoiadendron are closely related and sister to Metasequoia; the seven genera of Cupressaceae s.s. are closely related and form a monophyletic lineage (Cupressoideae). These results are broadly similar to analyses from chloroplast gene data. However, the relationships among Taiwania, Sequoioideae, Taxodioideae, and Cupressoideae remain unclear because of the slow evolution rate of 28S rDNA; this question might best be answered by sequencing more rapidly evolving nuclear genes.

  20. Isolation and characterization of 5S rDNA sequences in catfishes genome (Heptapteridae and Pseudopimelodidae): perspectives for rDNA studies in fish by C0t method.

    Science.gov (United States)

    Gouveia, Juceli Gonzalez; Wolf, Ivan Rodrigo; de Moraes-Manécolo, Vivian Patrícia Oliveira; Bardella, Vanessa Belline; Ferracin, Lara Munique; Giuliano-Caetano, Lucia; da Rosa, Renata; Dias, Ana Lúcia

    2016-12-01

    Sequences of 5S ribosomal RNA (rRNA) are extensively used in fish cytogenomic studies, since they have a flexible organization at the chromosomal level, showing inter- and intra-specific variation in number and position in karyotypes. Sequences from the genome of Imparfinis schubarti (Heptapteridae) were isolated, aiming to understand the organization of 5S rDNA families in the fish genome. The isolation of 5S rDNA from the genome of I. schubarti was carried out by reassociation kinetics (C0t) and PCR amplification. The obtained sequences were cloned for the construction of a micro-library. The obtained clones were sequenced and hybridized in I. schubarti and Microglanis cottoides (Pseudopimelodidae) for chromosome mapping. An analysis of the sequence alignments with other fish groups was performed. Both methods were effective when using 5S rDNA for hybridization in the I. schubarti genome. However, the C0t method enabled the use of a complete 5S rRNA gene, which was also successful in the hybridization of M. cottoides, whereas this gene was obtained only partially by PCR. The hybridization results and sequence analyses showed that intact 5S regions are more appropriate for use as probes, due to conserved structure and motifs. This study contributes to a better understanding of the organization of multigene families in catfish genomes.

  1. cgDNA: a software package for the prediction of sequence-dependent coarse-grain free energies of B-form DNA.

    Science.gov (United States)

    Petkevičiūtė, D; Pasi, M; Gonzalez, O; Maddocks, J H

    2014-11-10

    cgDNA is a package for the prediction of sequence-dependent configuration-space free energies for B-form DNA at the coarse-grain level of rigid bases. For a fragment of any given length and sequence, cgDNA calculates the configuration of the associated free energy minimizer, i.e. the relative positions and orientations of each base, along with a stiffness matrix, which together govern differences in free energies. The model predicts non-local (i.e. beyond base-pair step) sequence dependence of the free energy minimizer. Configurations can be input or output in either the Curves+ definition of the usual helical DNA structural variables, or as a PDB file of coordinates of base atoms. We illustrate the cgDNA package by comparing predictions of free energy minimizers from (a) the cgDNA model, (b) time-averaged atomistic molecular dynamics (or MD) simulations, and (c) NMR or X-ray experimental observation, for (i) the Dickerson-Drew dodecamer and (ii) three oligomers containing A-tracts. The cgDNA predictions are rather close to those of the MD simulations, but many orders of magnitude faster to compute. Both the cgDNA and MD predictions are in reasonable agreement with the available experimental data. Our conclusion is that cgDNA can serve as a highly efficient tool for studying structural variations in B-form DNA over a wide range of sequences. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
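
    The role of the free energy minimizer and the stiffness matrix can be written out explicitly for a Gaussian model of this kind. The notation below is assumed for illustration (w for the vector of rigid-base coordinates, ŵ(S) and K(S) for the minimizer and stiffness matrix predicted for sequence S, energy measured in units of k_BT) and is not quoted from the abstract.

```latex
U(w) = \tfrac12\,\big(w-\hat w(S)\big)^{\mathsf T}\,K(S)\,\big(w-\hat w(S)\big),
\qquad
\rho(w) \propto e^{-U(w)}
```

    Under these assumptions, the free energy difference between two configurations w₁ and w₂ of the same fragment is U(w₁) − U(w₂), which is why the pair (ŵ, K) is all the model needs to output.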

  2. Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

    Science.gov (United States)

    Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

    2016-08-01

    Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. Copyright © 2016 Elsevier B.V. All rights reserved.

  3. Ecological niche modelling and nDNA sequencing support a new, morphologically cryptic beetle species unveiled by DNA barcoding.

    Science.gov (United States)

    Hawlitschek, Oliver; Porch, Nick; Hendrich, Lars; Balke, Michael

    2011-02-09

    DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct, but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp.n. In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species.

  4. Statistical properties and fractals of nucleotide clusters in DNA sequences

    International Nuclear Information System (INIS)

    Sun Tingting; Zhang Linxi; Chen Jin; Jiang Zhouting

    2004-01-01

    Statistical properties of nucleotide clusters in DNA sequences and their fractals are investigated in this paper. The average size of nucleotide clusters in non-coding sequence is larger than that in coding sequence. We investigate the cluster-size distribution P(S) for human chromosomes 21 and 22, and the results are different from previous works. The cluster-size distribution P(S1+S2), where S1+S2 is the total size of a sequential Pu-cluster and Py-cluster, is studied. We observe that P(S1+S2) follows an exponential decay both in coding and non-coding sequences. However, we get different results for human chromosomes 21 and 22. The probability distribution P(S1,S2) of nucleotide clusters with sequential Pu-cluster and Py-cluster sizes S1 and S2, respectively, is also examined. In the meantime, some of the linear correlations are obtained in the double logarithmic plots of the fluctuation F(l) versus nucleotide cluster distance l along the DNA chain. The power spectra of nucleotide clusters are also discussed, and it is concluded that the curves are flat and hardly changed and that the 1/3 frequency is observed in neither coding nor non-coding sequence. These investigations can provide some insights into the nucleotide clusters of DNA sequences.
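
    As an illustration of the quantities discussed in this abstract, the following Python sketch computes cluster sizes and an empirical P(S). It assumes that a "cluster" means a maximal run of consecutive purines (A, G) or pyrimidines (C, T), which the abstract implies but does not define; the function names are hypothetical, and any character other than A or G is treated as a pyrimidine.

```python
from collections import Counter
from itertools import groupby

PURINES = set("AG")


def cluster_sizes(seq):
    """Split seq into maximal purine (Pu) and pyrimidine (Py) runs.

    Returns a list of (kind, size) tuples in sequence order.
    """
    return [("Pu" if is_purine else "Py", sum(1 for _ in run))
            for is_purine, run in groupby(seq.upper(),
                                          key=lambda base: base in PURINES)]


def size_distribution(clusters):
    """Empirical P(S): fraction of clusters having each size S."""
    counts = Counter(size for _, size in clusters)
    total = sum(counts.values())
    return {size: n / total for size, n in counts.items()}


clusters = cluster_sizes("AAGCTTAG")
print(clusters)
# prints [('Pu', 3), ('Py', 3), ('Pu', 2)]
print(size_distribution(clusters))
```

    Summing the sizes of each adjacent Pu/Py pair in the returned list would give the S1+S2 values whose distribution the abstract also examines.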

  5. Fast phylogenetic DNA barcoding

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Willerslev, Eske

    2008-01-01

    We present a heuristic approach to the DNA assignment problem based on phylogenetic inferences using constrained neighbour joining and non-parametric bootstrapping. We show that this method performs as well as the more computationally intensive full Bayesian approach in an analysis of 500 insect...... DNA sequences obtained from GenBank. We also analyse a previously published dataset of environmental DNA sequences from soil from New Zealand and Siberia, and use these data to illustrate the fact that statistical approaches to the DNA assignment problem allow for more appropriate criteria...... for determining the taxonomic level at which a particular DNA sequence can be assigned....

  6. Close sequence identity between ribosomal DNA episomes of the ...

    Indian Academy of Sciences (India)

    Unknown

    The restriction map of the E. dispar rDNA circle showed close similarity to EhR1 .... for 30 cycles in a DNA Thermal cycler (MJ Research, USA). 3. .... by asterisk. The gaps show the variation between E. dispar and E. histolytica sequences.

  7. DNA interaction with platinum-based cytostatics revealed by DNA sequencing.

    Science.gov (United States)

    Smerkova, Kristyna; Vaculovic, Tomas; Vaculovicova, Marketa; Kynicky, Jindrich; Brtnicky, Martin; Eckschlager, Tomas; Stiborova, Marie; Hubalek, Jaromir; Adam, Vojtech

    2017-12-15

    The main mechanism of action of platinum-based cytostatic drugs - cisplatin, oxaliplatin and carboplatin - is the formation of DNA cross-links, which restricts transcription because the cross-linked DNA is unable to enter the active site of the polymerase. The polymerase chain reaction (PCR) was employed as a simplified model of the amplification process in the cell nucleus. PCR with fluorescently labelled dideoxynucleotides, commonly employed for DNA sequencing, was used to monitor the effect of platinum-based cytostatics on DNA in terms of the decrease in labelling efficiency dependent on the presence of the DNA-drug cross-link. It was found that significantly different amounts of the drugs - cisplatin (0.21 μg/mL), oxaliplatin (5.23 μg/mL), and carboplatin (71.11 μg/mL) - were required to cause the same quenching effect (50%) on the fluorescent labelling of 50 μg/mL of DNA. Moreover, it was found that even though the amounts of the drugs applied to the reaction mixture differed by several orders of magnitude, the amount of incorporated platinum, quantified by inductively coupled plasma mass spectrometry, was in all cases at the level of tenths of μg per 5 μg of DNA. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Determination of cDNA and genomic DNA sequences of hevamine, a chitinase from the rubber tree Hevea brasiliensis

    NARCIS (Netherlands)

    Bokma, E; Spiering, M; Chow, KS; Mulder, PPMFA; Subroto, T; Beintema, JJ

    Hevamine is a chitinase from the rubber tree Hevea brasiliensis and belongs to the family 18 glycosyl hydrolases. This paper describes the cloning of hevamine DNA and cDNA sequences. Hevamine contains a signal peptide at the N-terminus and a putative vacuolar targeting sequence at the C-terminus

  9. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure...... that the data produced is optimal. Although much of the procedure can be followed directly from the manufacturer's protocols, the key differences lie in the library preparation steps. This chapter presents an optimized protocol for the sequencing of fossil remains and museum specimens, commonly referred...

  10. Ultrasensitive DNA sequence detection using nanoscale ZnO sensor arrays

    Energy Technology Data Exchange (ETDEWEB)

    Kumar, Nitin; Dorfman, Adam; Hahm, Jong-in [Department of Chemical Engineering, Pennsylvania State University, 160 Fenske Laboratory, University Park, PA 16802 (United States)

    2006-06-28

    We report that engineered nanoscale zinc oxide structures can be effectively used for the identification of the biothreat agent, Bacillus anthracis by successfully discriminating its DNA sequence from other genetically related species. We explore both covalent and non-covalent linking schemes in order to couple probe DNA strands to the zinc oxide nanostructures. Hybridization reactions are performed with various concentrations of target DNA strands whose sequence is unique to Bacillus anthracis. The use of zinc oxide nanomaterials greatly enhances the fluorescence signal collected after carrying out duplex formation reaction. Specifically, the covalent strategy allows detection of the target species at sample concentrations at a level as low as a few femtomolar as compared to the detection sensitivity in the tens of nanomolar range when using the non-covalent scheme. The presence of the underlying zinc oxide nanomaterials is critical in achieving increased fluorescence detection of hybridized DNA and, therefore, accomplishing rapid and extremely sensitive identification of the biothreat agent. We also demonstrate the easy integration potential of nanoscale zinc oxide into high density arrays by using various types of zinc oxide sensor prototypes in the DNA sequence detection. When combined with conventional automatic sample handling apparatus and computerized fluorescence detection equipment, our approach can greatly promote the use of zinc oxide nanomaterials as signal enhancing platforms for rapid, multiplexed, high-throughput, highly sensitive, DNA sensor arrays.

  11. Ultrasensitive DNA sequence detection using nanoscale ZnO sensor arrays

    International Nuclear Information System (INIS)

    Kumar, Nitin; Dorfman, Adam; Hahm, Jong-in

    2006-01-01

    We report that engineered nanoscale zinc oxide structures can be effectively used for the identification of the biothreat agent, Bacillus anthracis by successfully discriminating its DNA sequence from other genetically related species. We explore both covalent and non-covalent linking schemes in order to couple probe DNA strands to the zinc oxide nanostructures. Hybridization reactions are performed with various concentrations of target DNA strands whose sequence is unique to Bacillus anthracis. The use of zinc oxide nanomaterials greatly enhances the fluorescence signal collected after carrying out duplex formation reaction. Specifically, the covalent strategy allows detection of the target species at sample concentrations at a level as low as a few femtomolar as compared to the detection sensitivity in the tens of nanomolar range when using the non-covalent scheme. The presence of the underlying zinc oxide nanomaterials is critical in achieving increased fluorescence detection of hybridized DNA and, therefore, accomplishing rapid and extremely sensitive identification of the biothreat agent. We also demonstrate the easy integration potential of nanoscale zinc oxide into high density arrays by using various types of zinc oxide sensor prototypes in the DNA sequence detection. When combined with conventional automatic sample handling apparatus and computerized fluorescence detection equipment, our approach can greatly promote the use of zinc oxide nanomaterials as signal enhancing platforms for rapid, multiplexed, high-throughput, highly sensitive, DNA sensor arrays

  12. cDNA sequencing improves the detection of P53 missense mutations in colorectal cancer

    International Nuclear Information System (INIS)

    Szybka, Malgorzata; Kordek, Radzislaw; Zakrzewska, Magdalena; Rieske, Piotr; Pasz-Walczak, Grazyna; Kulczycka-Wojdala, Dominika; Zawlik, Izabela; Stawski, Robert; Jesionek-Kupnicka, Dorota; Liberski, Pawel P

    2009-01-01

    Recently published data showed discrepancies between P53 cDNA and DNA sequencing in glioblastomas. We hypothesised that similar discrepancies may be observed in other human cancers. To this end, we analyzed 23 colorectal cancers for P53 mutations and gene expression using both DNA and cDNA sequencing, real-time PCR and immunohistochemistry. We found P53 gene mutations in 16 cases (15 missense and 1 nonsense). Two of the 15 cases with missense mutations showed alterations based only on cDNA, and not DNA sequencing. Moreover, in 6 of the 15 cases with a cDNA mutation those mutations were difficult to detect in the DNA sequencing, so the results of DNA analysis alone could be misinterpreted if the cDNA sequencing results had not also been available. In all those 15 cases, we observed a higher ratio of the mutated to the wild type template by cDNA analysis, but not by the DNA analysis. Interestingly, a similar overexpression of P53 mRNA was present in samples with and without P53 mutations. In terms of colorectal cancer, those discrepancies might be explained under three conditions: 1, overexpression of mutated P53 mRNA in cancer cells as compared with normal cells; 2, a higher content of cells without P53 mutation (normal cells and cells showing K-RAS and/or APC but not P53 mutation) in samples presenting P53 mutation; 3, heterozygous or hemizygous mutations of the P53 gene. Additionally, for heterozygous mutations, unknown mechanism(s) causing selective overproduction of the mutated allele should also be considered. Our data offer new clues for studying discrepancy in P53 cDNA and DNA sequencing analysis

  13. Application of synthetic DNA probes to the analysis of DNA sequence variants in man

    International Nuclear Information System (INIS)

    Wallace, R.B.; Petz, L.D.; Yam, P.Y.

    1986-01-01

    Oligonucleotide probes provide a tool to discriminate between any two alleles on the basis of hybridization. Random sampling of the genome with different oligonucleotide probes should reveal polymorphism in a certain percentage of the cases. In the hope of identifying polymorphic regions more efficiently, we chose to take advantage of the proposed hypermutability of repeated DNA sequences and the specificity of oligonucleotide hybridization. Since, under appropriate conditions, oligonucleotide probes require complete base pairing for hybridization to occur, they will only hybridize to a subset of the members of a repeat family when all members of the family are not identical. The results presented here suggest that oligonucleotide hybridization can be used to extend the genomic sequences that can be tested for the presence of RFLPs. This expands the tools available to human genetics. In addition, the results suggest that repeated DNA sequences are indeed more polymorphic than single-copy sequences. 28 references, 2 figures

  14. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    DEFF Research Database (Denmark)

    Nielsen, Peter E.

    2008-01-01

    sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technology of protein dsDNA structures. (c) 2008 American Institute of Physics.

  15. Sequence and transcription analysis of the human cytomegalovirus DNA polymerase gene

    International Nuclear Information System (INIS)

    Kouzarides, T.; Bankier, A.T.; Satchwell, S.C.; Weston, K.; Tomlinson, P.; Barrell, B.G.

    1987-01-01

    DNA sequence analysis has revealed that the gene coding for the human cytomegalovirus (HCMV) DNA polymerase is present within the long unique region of the virus genome. Identification is based on extensive amino acid homology between the predicted HCMV open reading frame HFLF2 and the DNA polymerase of herpes simplex virus type 1. The authors present here a 5280 base-pair DNA sequence containing the HCMV pol gene, along with the analysis of transcripts encoded within this region. Since HCMV pol also shows homology to the predicted Epstein-Barr virus pol, they were able to analyze the extent of homology between the DNA polymerases of three distantly related herpes viruses, HCMV, Epstein-Barr virus, and herpes simplex virus. The comparison shows that these DNA polymerases exhibit considerable amino acid homology and highlights a number of highly conserved regions; two such regions show homology to sequences within the adenovirus type 2 DNA polymerase. The HCMV pol gene is flanked by open reading frames with homology to those of other herpes viruses; upstream, there is a reading frame homologous to the glycoprotein B gene of herpes simplex virus type I and Epstein-Barr virus, and downstream there is a reading frame homologous to BFLF2 of Epstein-Barr virus

  16. Special Issue: Next Generation DNA Sequencing

    Directory of Open Access Journals (Sweden)

    Paul Richardson

    2010-10-01

    Full Text Available Next Generation Sequencing (NGS) refers to technologies that do not rely on traditional dideoxy-nucleotide (Sanger) sequencing where labeled DNA fragments are physically resolved by electrophoresis. These new technologies rely on different strategies, but essentially all of them make use of real-time data collection of a base level incorporation event across a massive number of reactions (on the order of millions versus 96 for capillary electrophoresis, for instance). The major commercial NGS platforms available to researchers are the 454 Genome Sequencer (Roche), Illumina (formerly Solexa) Genome Analyzer, the SOLiD system (Applied Biosystems/Life Technologies) and the Heliscope (Helicos Corporation). The techniques and different strategies utilized by these platforms are reviewed in a number of the papers in this special issue. These technologies are enabling new applications that take advantage of the massive data produced by this next generation of sequencing instruments. [...

  17. Spectral sum rules and search for periodicities in DNA sequences

    International Nuclear Information System (INIS)

    Chechetkin, V.R.

    2011-01-01

    Periodic patterns play the important regulatory and structural roles in genomic DNA sequences. Commonly, the underlying periodicities should be understood in a broad statistical sense, since the corresponding periodic patterns have been strongly distorted by the random point mutations and insertions/deletions during molecular evolution. The latent periodicities in DNA sequences can be efficiently displayed by Fourier transform. The criteria of significance for observed periodicities are obtained via the comparison versus the counterpart characteristics of the reference random sequences. We show that the restrictions imposed on the significance criteria by the rigorous spectral sum rules can be rationally described with De Finetti distribution. This distribution provides the convenient intermediate asymptotic form between Rayleigh distribution and exact combinatoric theory. - Highlights: → We study the significance criteria for latent periodicities in DNA sequences. → The constraints imposed by sum rules can be described with De Finetti distribution. → It is intermediate between Rayleigh distribution and exact combinatoric theory. → Theory is applicable to the study of correlations between different periodicities. → The approach can be generalized to the arbitrary discrete Fourier transform.
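
    The Fourier approach described in this record can be illustrated with a minimal Python sketch. It is not the author's implementation: the function name, the 1/sqrt(M) normalization and the toy sequence are assumptions made for illustration. The code computes the squared DFT magnitudes of one nucleotide's indicator sequence, locates the strongest harmonic, and checks the sum rule that follows from Parseval's theorem: for a sequence of length M containing N copies of the nucleotide, the spectrum summed over the non-zero harmonics equals N(M - N)/M.

```python
import cmath

def structure_factors(seq, base):
    """Squared DFT magnitudes F(q_n), n = 0..M-1, of the indicator sequence of `base`."""
    m_len = len(seq)
    indicator = [1.0 if ch == base else 0.0 for ch in seq]
    spectrum = []
    for n in range(m_len):
        q = 2.0 * cmath.pi * n / m_len
        # Normalized Fourier harmonic rho(q_n) = M^(-1/2) * sum_m u(m) exp(-i q_n m)
        rho = sum(u * cmath.exp(-1j * q * m) for m, u in enumerate(indicator)) / m_len ** 0.5
        spectrum.append(abs(rho) ** 2)
    return spectrum

# Toy sequence (invented for illustration) with an exact period of 3.
seq = "ATGATGATGATGATGATGATGATG"
m_len = len(seq)
n_a = seq.count("A")

spec_a = structure_factors(seq, "A")

# Strongest harmonic in the first half of the spectrum (the second half mirrors it).
peak = max(range(1, m_len // 2 + 1), key=lambda n: spec_a[n])
period = m_len / peak

# Sum rule: the non-zero harmonics add up to N(M - N)/M.
sum_rule_lhs = sum(spec_a[1:])
sum_rule_rhs = n_a * (m_len - n_a) / m_len

print(period, sum_rule_lhs, sum_rule_rhs)
```

    Because the harmonics are tied together by this sum rule, a peak height cannot be judged in isolation, which is why the significance criteria discussed in the abstract have to account for the constraint. For real, mutated sequences the peaks are broadened and must be compared against a random-sequence reference.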

  18. Genomic signal processing methods for computation of alignment-free distances from DNA sequences.

    Science.gov (United States)

    Borrayo, Ernesto; Mendizabal-Ruiz, E Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P; Morales, J Alejandro

    2014-01-01

    Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.
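
    A doublet-based sequence-to-signal mapping of the kind this record mentions might look like the following Python sketch. The abstract does not give the actual amplitude values or the three DSP distance metrics used by GAFD, so the integer table `DOUBLET_VALUE` and the Euclidean distance below are placeholder assumptions, not the published method. The sketch only shows why doublets help: a single-base mapping allows 4 amplitude levels, whereas overlapping doublets allow 16.

```python
import math

BASES = "ACGT"

# Hypothetical amplitude table: each of the 16 doublets gets a distinct integer 0..15.
DOUBLET_VALUE = {a + b: 4 * i + j for i, a in enumerate(BASES) for j, b in enumerate(BASES)}

def doublet_signal(seq):
    """Map a DNA string to a signal with one amplitude per overlapping doublet."""
    seq = seq.upper()
    return [DOUBLET_VALUE[seq[k:k + 2]] for k in range(len(seq) - 1)]

def euclidean_distance(sig_a, sig_b):
    """Euclidean distance over the shared prefix of two signals (a simplification)."""
    n = min(len(sig_a), len(sig_b))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sig_a[:n], sig_b[:n])))

# Two invented sequences that differ at a single base.
s1 = doublet_signal("ACGTAC")
s2 = doublet_signal("ACGTTC")
distance = euclidean_distance(s1, s2)
print(s1, s2, distance)
```

    A single substitution changes two adjacent doublets, so it perturbs two samples of the signal rather than one. A practical alignment-free distance would compare spectral or statistical descriptors instead of raw samples, since raw samples are sensitive to shifts.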

  19. Complete sequences of the mitochondrial DNA of the wild Gracilariopsis lemaneiformis and two mutagenic cultivated breeds (Gracilariaceae, Rhodophyta.

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    Full Text Available The complete mitochondrial DNA (mtDNA) of Gracilariopsis lemaneiformis was sequenced (25883 bp) and mapped to a circular model. The A+T composition was 72.5%. Forty six genes and two potentially functional open reading frames were identified. They include 24 protein-coding genes, 2 rRNA genes, 20 tRNA genes and 2 ORFs (orf60, orf142). There is considerable sequence synteny across the five red algal mtDNAs falling into Florideophyceae including Gr. lemaneiformis in this study and previously sequenced species. A long stem-loop and a hairpin structure were identified in intergenic regions of mt genome of Gr. lemaneiformis, which are believed to be involved with transcription and replication. In addition, the mtDNAs of two mutagenic cultivated breeds ("981" and "07-2") were also sequenced. Compared with the mtDNA of wild Gr. lemaneiformis, the genome size and gene length and order of three strains were completely identical except nine base mutations including eight in the protein-coding genes and one in the tRNA gene. None of the base mutations caused frameshift or a premature stop codon in the mtDNA genes. Phylogenetic analyses based on mitochondrial protein-coding genes and rRNA genes demonstrated Gracilariopsis andersonii had closer phylogenetic relationship with its parasite Gracilariophila oryzoides than Gracilariopsis lemaneiformis which was from the same genus of Gracilariopsis.

  20. Complete sequences of the mitochondrial DNA of the wild Gracilariopsis lemaneiformis and two mutagenic cultivated breeds (Gracilariaceae, Rhodophyta).

    Science.gov (United States)

    Zhang, Lei; Wang, Xumin; Qian, Hao; Chi, Shan; Liu, Cui; Liu, Tao

    2012-01-01

    The complete mitochondrial DNA (mtDNA) of Gracilariopsis lemaneiformis was sequenced (25883 bp) and mapped to a circular model. The A+T composition was 72.5%. Forty six genes and two potentially functional open reading frames were identified. They include 24 protein-coding genes, 2 rRNA genes, 20 tRNA genes and 2 ORFs (orf60, orf142). There is considerable sequence synteny across the five red algal mtDNAs falling into Florideophyceae including Gr. lemaneiformis in this study and previously sequenced species. A long stem-loop and a hairpin structure were identified in intergenic regions of mt genome of Gr. lemaneiformis, which are believed to be involved with transcription and replication. In addition, the mtDNAs of two mutagenic cultivated breeds ("981" and "07-2") were also sequenced. Compared with the mtDNA of wild Gr. lemaneiformis, the genome size and gene length and order of three strains were completely identical except nine base mutations including eight in the protein-coding genes and one in the tRNA gene. None of the base mutations caused frameshift or a premature stop codon in the mtDNA genes. Phylogenetic analyses based on mitochondrial protein-coding genes and rRNA genes demonstrated Gracilariopsis andersonii had closer phylogenetic relationship with its parasite Gracilariophila oryzoides than Gracilariopsis lemaneiformis which was from the same genus of Gracilariopsis.

  1. Parallel Sequencing of Expressed Sequence Tags from Two Complementary DNA Libraries for High and Low Phosphorus Adaptation in Common Beans

    Directory of Open Access Journals (Sweden)

    Matthew W. Blair

    2011-11-01

    Full Text Available Expressed sequence tags (ESTs) have proven useful for gene discovery in many crops. In this work, our objective was to construct complementary DNA (cDNA) libraries from root tissues of common beans (Phaseolus vulgaris L.) grown under low and high P hydroponic conditions and to conduct EST sequencing and comparative analyses of the libraries. Expressed sequence tag analysis of 3648 clones identified 2372 unigenes, of which 1591 were annotated as known genes while a total of 465 unigenes were not associated with any known gene. Unigenes with hits were categorized according to biological processes, molecular function, and cellular compartmentalization. Given the young tissue used to make the root libraries, genes for catalytic activity and binding were highly expressed. Comparisons with previous root EST sequencing and between the two libraries made here resulted in a set of genes to study further for differential gene expression and adaptation to low P, such as a 14 kDa proline-rich protein, a metallopeptidase, tonoplast intrinsic protein, adenosine triphosphate (ATP) citrate synthase, and cell proliferation genes expressed in the low P treated plants. Given that common beans are often grown on acid soils of the tropics and subtropics that are usually low in P, these genes and the two parallel libraries will be useful for selection for better uptake of this essential macronutrient. The importance of EST generation for common bean root tissues under low P and other abiotic soil stresses is also discussed.

  2. High-resolution characterization of sequence signatures due to non-random cleavage of cell-free DNA.

    Science.gov (United States)

    Chandrananda, Dineika; Thorne, Natalie P; Bahlo, Melanie

    2015-06-17

    High-throughput sequencing of cell-free DNA fragments found in human plasma has been used to non-invasively detect fetal aneuploidy, monitor organ transplants and investigate tumor DNA. However, many biological properties of this extracellular genetic material remain unknown. Research that further characterizes circulating DNA could substantially increase its diagnostic value by allowing the application of more sophisticated bioinformatics tools that lead to an improved signal to noise ratio in the sequencing data. In this study, we investigate various features of cell-free DNA in plasma using deep-sequencing data from two pregnant women (>70X, >50X) and compare them with matched cellular DNA. We utilize a descriptive approach to examine how the biological cleavage of cell-free DNA affects different sequence signatures such as fragment lengths, sequence motifs at fragment ends and the distribution of cleavage sites along the genome. We show that the size distributions of these cell-free DNA molecules are dependent on their autosomal and mitochondrial origin as well as the genomic location within chromosomes. DNA mapping to particular microsatellites and alpha repeat elements display unique size signatures. We show how cell-free fragments occur in clusters along the genome, localizing to nucleosomal arrays and are preferentially cleaved at linker regions by correlating the mapping locations of these fragments with ENCODE annotation of chromatin organization. Our work further demonstrates that cell-free autosomal DNA cleavage is sequence dependent. The region spanning up to 10 positions on either side of the DNA cleavage site show a consistent pattern of preference for specific nucleotides. This sequence motif is present in cleavage sites localized to nucleosomal cores and linker regions but is absent in nucleosome-free mitochondrial DNA. These background signals in cell-free DNA sequencing data stem from the non-random biological cleavage of these fragments. This
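
    The end-motif analysis described in this record (nucleotide preferences up to 10 positions on either side of a cleavage site) can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the authors' pipeline: the function name, the coordinate convention and the toy reference are all hypothetical, and real input would come from aligned read coordinates.

```python
from collections import Counter

def cleavage_site_profile(reference, cut_positions, flank=10):
    """Count nucleotides at each offset in [-flank, flank) around each cut position.

    A cut position p means the cut falls between reference[p - 1] and reference[p].
    Sites closer than `flank` bases to either end of the reference are skipped.
    """
    profile = {off: Counter() for off in range(-flank, flank)}
    for p in cut_positions:
        if p - flank < 0 or p + flank > len(reference):
            continue
        for off in range(-flank, flank):
            profile[off][reference[p + off]] += 1
    return profile

# Toy reference and cut sites, invented for illustration; flank=2 keeps the output small.
ref = "AAACCGTTTACCGAAA"
prof = cleavage_site_profile(ref, [5, 12], flank=2)
for off in sorted(prof):
    print(off, dict(prof[off]))
```

    Dividing each counter by the number of sites gives a position-specific frequency table. Comparing that table with the background base composition of the genome is what reveals a sequence preference at the cleavage site.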

  3. AU2EU : Privacy-preserving matching of DNA sequences

    NARCIS (Netherlands)

    Ignatenko, T.; Petkovic, M.; Naccache, D.; Sauveron, D.

    2014-01-01

    Advances in DNA sequencing create new opportunities for the use of DNA data in healthcare for diagnostic and treatment purposes, but also in many other health and well-being services. This brings new challenges with regard to the protection and use of this sensitive data. Thus, special technical

  4. cDNA sequences of two apolipoproteins from lamprey

    International Nuclear Information System (INIS)

    Pontes, M.; Xu, X.; Graham, D.; Riley, M.; Doolittle, R.F.

    1987-01-01

    The messages for two small but abundant apolipoproteins found in lamprey blood plasma were cloned with the aid of oligonucleotide probes based on amino-terminal sequences. In both cases, numerous clones were identified in a lamprey liver cDNA library, consistent with the great abundance of these proteins in lamprey blood. One of the cDNAs (LAL1) has a coding region of 105 amino acids that corresponds to a 21-residue signal peptide, a putative 8-residue propeptide, and the 76-residue mature protein found in blood. The other cDNA (LAL2) codes for a total of 191 residues, the first 23 of which constitute a signal peptide. The two proteins, which occur in the high-density lipoprotein fraction of ultracentrifuged plasma, have amino acid compositions similar to those of apolipoproteins found in mammalian blood; computer analysis indicates that the sequences are largely helix-permissive. When the sequences were searched against an amino acid sequence data base, rat apolipoprotein IV was the best matching candidate in both cases. Although a reasonable alignment can be made with that sequence and LAL1, definitive assignment of the two lamprey proteins to typical mammalian classes cannot be made at this point

  5. Early Lyme disease with spirochetemia - diagnosed by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Jones William

    2010-11-01

    Full Text Available Background: A sensitive and analytically specific nucleic acid amplification test (NAAT) is valuable in confirming the diagnosis of early Lyme disease at the stage of spirochetemia. Findings: Venous blood drawn from patients with clinical presentations of Lyme disease was tested with the standard 2-tier screen and Western Blot serology assay for Lyme disease, and also by a nested polymerase chain reaction (PCR) for B. burgdorferi sensu lato 16S ribosomal DNA. The PCR amplicon was sequenced for B. burgdorferi genomic DNA validation. A total of 130 patients visiting the emergency room (ER) or Walk-in clinic (WALKIN), and 333 patients referred through the private physicians' offices were studied. While 5.4% of the ER/WALKIN patients showed DNA evidence of spirochetemia, none (0%) of the patients referred from private physicians' offices were DNA-positive. In contrast, while 8.4% of the patients referred from private physicians' offices were positive for the 2-tier Lyme serology assay, only 1.5% of the ER/WALKIN patients were positive for this antibody test. The 2-tier serology assay missed 85.7% of the cases of early Lyme disease with spirochetemia. The latter diagnosis was confirmed by DNA sequencing. Conclusion: Nested PCR followed by automated DNA sequencing is a valuable supplement to the standard 2-tier antibody assay in the diagnosis of early Lyme disease with spirochetemia. The best time to test for Lyme spirochetemia is when the patients living in the Lyme disease endemic areas develop unexplained symptoms or clinical manifestations that are consistent with Lyme disease early in the course of their illness.
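
    The percentages in this record can be turned back into approximate patient counts with a few lines of Python. The abstract reports only percentages and cohort sizes, so every count below is back-calculated and should be read as an assumption; in particular, the figure of 6 missed cases is inferred from the reported 85.7%, not stated in the text.

```python
# Cohort sizes as stated in the abstract.
er_total = 130       # emergency room / walk-in clinic patients
office_total = 333   # patients referred from private physicians' offices

# Back-calculated counts (assumption: rounded from the reported percentages).
er_dna_positive = round(0.054 * er_total)            # 5.4% DNA-positive
er_sero_positive = round(0.015 * er_total)           # 1.5% positive by 2-tier serology
office_sero_positive = round(0.084 * office_total)   # 8.4% positive by 2-tier serology

# Assumption: 6 of the DNA-positive cases were seronegative, which reproduces
# the reported miss rate of 85.7%.
missed = 6
missed_pct = round(100 * missed / er_dna_positive, 1)

print(er_dna_positive, er_sero_positive, office_sero_positive, missed_pct)
# prints: 7 2 28 85.7
```

    On these inferred numbers, the 85.7% miss rate rests on roughly seven DNA-confirmed cases, which is worth keeping in mind when weighing the comparison between the two assays.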

  6. Characterization of four species of Trichuris (Nematoda: Enoplida) by their second internal transcribed spacer ribosomal DNA sequence.

    Science.gov (United States)

    Oliveros, R; Cutillas, C; De Rojas, M; Arias, P

    2000-12-01

    Adult worms of Trichuris ovis and T. globulosa were collected from Ovis aries (sheep) and Capra hircus (goats). T. suis was isolated from Sus scrofa domestica (swine) and T. leporis was isolated from Lepus europaeus (rabbits) in Spain. Genomic DNA was isolated and a ribosomal internal transcribed spacer (ITS2) was amplified and sequenced using polymerase-chain-reaction (PCR) techniques. The ITS2 of T. ovis and T. globulosa was 407 nucleotides in length and had a GC content of about 62%. Furthermore, the ITS2 of T. suis and T. leporis was 534 and 418 nucleotides in length and had a GC content of about 64.8% and 62.4%, respectively. There was evidence of slight variation in the sequence within individuals of all species analyzed, indicating intraindividual variation in the sequence of different copies of the ribosomal DNA. Furthermore, low-level intraspecific variation was detected. Sequence analyses of ITS2 products of T. ovis and T. globulosa demonstrated no sequence difference between them. Nevertheless, differences were detected between the ITS2 sequences of T. suis, T. leporis, and T. ovis, indicating that Trichuris species can reliably be differentiated by their ITS2 sequences and PCR-linked restriction-fragment-length polymorphism (RFLP).
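
    Two quantities used in this record, GC content and restriction fragment lengths for PCR-RFLP, are simple to compute, as the following Python sketch shows. The helper names and the toy amplicon are hypothetical and are not taken from the paper; the only outside fact used is that EcoRI recognizes GAATTC and cuts after the first G. Only one strand is searched, which is sufficient for a palindromic site such as this one.

```python
def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def digest_fragment_lengths(seq, site, cut_offset):
    """Fragment lengths after cutting a linear sequence at every occurrence of `site`.

    `cut_offset` is the number of bases into the site at which the cut falls.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Toy amplicon invented for illustration; it contains two EcoRI sites (G^AATTC).
amplicon = "AAGAATTCGGGGAATTCTT"
gc = gc_content(amplicon)
fragments = digest_fragment_lengths(amplicon, "GAATTC", 1)
print(round(gc, 1), fragments)
```

    Two species whose ITS2 sequences differ at a recognition site yield different fragment-length lists, which is the basis of the PCR-RFLP differentiation mentioned in the abstract.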

  7. Identification of tissue-embedded ascarid larvae by ribosomal DNA sequencing.

    Science.gov (United States)

    Ishiwata, Kenji; Shinohara, Akio; Yagi, Kinpei; Horii, Yoichiro; Tsuchiya, Kimiyuki; Nawa, Yukifumi

    2004-01-01

    Polymerase chain reaction (PCR) was applied to identify tissue-embedded ascarid nematode larvae. Two sequences of the internal transcribed spacer (ITS) regions of ribosomal DNA (rDNA), ITS1 and ITS2, of the ascarid parasites were amplified and compared with those of ascarid-nematodes registered in a DNA database (GenBank). The ITS sequences of the PCR products obtained from the ascarid parasite specimen in our laboratory were compatible with those of registered adult Ascaris and Toxocara parasites. PCR amplification of the ITS regions was sensitive enough to detect a single larva of Ascaris suum mixed with porcine liver tissue. Using this method, ascarid larvae embedded in the liver of a naturally infected turkey were identified as Toxocara canis. These results suggest that even a single larva embedded in tissues from patients with larva migrans could be identified by sequencing the ITS regions.

  8. Micropatterning stretched and aligned DNA for sequence-specific nanolithography

    Science.gov (United States)

    Petit, Cecilia Anna Paulette

    Techniques for fabricating nanostructured materials can be categorized as either "top-down" or "bottom-up". Top-down techniques use lithography and contact printing to create patterned surfaces and microfluidic channels that can corral and organize nanoscale structures, such as molecules and nanorods; in contrast, bottom-up techniques use self-assembly or molecular recognition to direct the organization of materials. A central goal in nanotechnology is the integration of bottom-up and top-down assembly strategies for materials development, device design, and process integration. With this goal in mind, we have developed strategies that will allow this integration by using DNA as a template for nanofabrication; two top-down approaches allow the placement of these templates, while the bottom-up technique uses the specific sequence of bases to pattern materials along each strand of DNA. Our first top-down approach, termed combing of molecules in microchannels (COMMIC), produces microscopic patterns of stretched and aligned molecules of DNA on surfaces. This process consists of passing an air-water interface over end-adsorbed molecules inside microfabricated channels. The geometry of the microchannel directs the placement of the DNA molecules, while the geometry of the air-water interface directs the local orientation and curvature of the molecules. We developed another top-down strategy for creating micropatterns of stretched and aligned DNA using surface chemistry. Because DNA stretching occurs on hydrophobic surfaces, this technique uses photolithography to pattern vinyl-terminated silanes on glass. When these surfaces are immersed in DNA solution, molecules adhere preferentially to the silanized areas. This approach has also proven useful in patterning protein for cell adhesion studies. Finally, we describe the use of these stretched and aligned molecules of DNA as templates for the subsequent bottom-up construction of hetero-structures through hybridization

  9. Extra-binomial variation approach for analysis of pooled DNA sequencing data

    Science.gov (United States)

    Wallace, Chris

    2012-01-01

    Motivation: The invention of next-generation sequencing technology has made it possible to study the rare variants that are more likely to pinpoint causal disease genes. To make such experiments financially viable, DNA samples from several subjects are often pooled before sequencing. This induces large between-pool variation which, together with other sources of experimental error, creates over-dispersed data. Statistical analysis of pooled sequencing data needs to appropriately model this additional variance to avoid inflating the false-positive rate. Results: We propose a new statistical method based on an extra-binomial model to address the over-dispersion and apply it to pooled case-control data. We demonstrate that our model provides a better fit to the data than either a standard binomial model or a traditional extra-binomial model proposed by Williams and can analyse both rare and common variants with lower or more variable pool depths compared to the other methods. Availability: Package ‘extraBinomial’ is on http://cran.r-project.org/ Contact: chris.wallace@cimr.cam.ac.uk Supplementary information: Supplementary data are available at Bioinformatics Online. PMID:22976083
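
    The over-dispersion that motivates this record can be checked with a standard diagnostic, sketched below in Python. This is a generic Pearson dispersion statistic, not the extra-binomial model proposed in the paper or the code in its R package, and the pool counts are invented for illustration. Under a plain binomial model with a common allele frequency the statistic should be close to 1, and values well above 1 indicate extra-binomial variation between pools.

```python
def pearson_dispersion(alt_counts, depths):
    """Pearson chi-square divided by its degrees of freedom for pooled allele counts.

    alt_counts[i] is the number of reads carrying the variant in pool i,
    depths[i] is the total read depth of pool i at that site.
    """
    if len(alt_counts) != len(depths) or len(depths) < 2:
        raise ValueError("need matching lists with at least two pools")
    p_hat = sum(alt_counts) / sum(depths)
    if p_hat in (0.0, 1.0):
        raise ValueError("variant is absent or fixed; dispersion is undefined")
    chi2 = sum(
        (y - n * p_hat) ** 2 / (n * p_hat * (1.0 - p_hat))
        for y, n in zip(alt_counts, depths)
    )
    return chi2 / (len(depths) - 1)

# Invented example: four pools sequenced to depth 100 with uneven variant counts.
alt = [2, 10, 0, 8]
depth = [100, 100, 100, 100]
dispersion = pearson_dispersion(alt, depth)
print(round(dispersion, 3))
```

    In this toy case the statistic is well above 1, so binomial standard errors would be too small and tests based on them would tend to produce false positives, which is the problem the abstract describes.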

  10. Simultaneous identification of DNA and RNA viruses present in pig faeces using process-controlled deep sequencing.

    Directory of Open Access Journals (Sweden)

    Jana Sachsenröder

    Full Text Available BACKGROUND: Animal faeces comprise a community of many different microorganisms including bacteria and viruses. Only scarce information is available about the diversity of viruses present in the faeces of pigs. Here we describe a protocol, which was optimized for the purification of the total fraction of viral particles from pig faeces. The genomes of the purified DNA and RNA viruses were simultaneously amplified by PCR and subjected to deep sequencing followed by bioinformatic analyses. The efficiency of the method was monitored using a process control consisting of three bacteriophages (T4, M13 and MS2) with different morphology and genome types. Defined amounts of the bacteriophages were added to the sample and their abundance was assessed by quantitative PCR during the preparation procedure. RESULTS: The procedure was applied to a pooled faecal sample of five pigs. From this sample, 69,613 sequence reads were generated. All of the added bacteriophages were identified by sequence analysis of the reads. In total, 7.7% of the reads showed significant sequence identities with published viral sequences. They mainly originated from bacteriophages (73.9%) and mammalian viruses (23.9%); 0.8% of the sequences showed identities to plant viruses. The most abundant detected porcine viruses were kobuvirus, rotavirus C, astrovirus, enterovirus B, sapovirus and picobirnavirus. In addition, sequences with identities to the chimpanzee stool-associated circular ssDNA virus were identified. Whole genome analysis indicates that this virus, tentatively designated as pig stool-associated circular ssDNA virus (PigSCV), represents a novel pig virus. CONCLUSION: The established protocol enables the simultaneous detection of DNA and RNA viruses in pig faeces including the identification of so far unknown viruses. It may be applied in studies investigating aetiology, epidemiology and ecology of diseases. The implemented process control serves as quality control, ensures

  11. Sequence-specific RNA Photocleavage by Single-stranded DNA in Presence of Riboflavin

    Science.gov (United States)

    Zhao, Yongyun; Chen, Gangyi; Yuan, Yi; Li, Na; Dong, Juan; Huang, Xin; Cui, Xin; Tang, Zhuo

    2015-10-01

    Constant efforts have been made to develop new methods for sequence-specific RNA degradation, which could inhibit the expression of a targeted gene. Herein, by using an unmodified short DNA oligonucleotide for sequence recognition and an endogenous small molecule, vitamin B2 (riboflavin), as photosensitizer, we report a simple strategy to realize the sequence-specific photocleavage of targeted RNA. The DNA strand is complementary to the target sequence and forms a DNA/RNA duplex containing a G•U wobble in the middle. The cleavage reaction proceeds through an oxidative elimination mechanism at the nucleoside downstream of the U of the G•U wobble in the duplex to give an unnatural RNA terminus, and the whole process is under tight control by using light as a switch, which means the cleavage can be carried out according to specific spatial and temporal requirements. The biocompatibility of this method makes the DNA strand in combination with riboflavin a promising molecular tool for RNA manipulation.

  12. DNA Sequences of RAPD Fragments in the Egyptian cotton ...

    African Journals Online (AJOL)

    Random Amplified Polymorphic DNAs (RAPDs) is a DNA polymorphism assay based on the amplification of random DNA segments with single primers of arbitrary nucleotide sequence. Despite the fact that the RAPD technique has become a very powerful tool and has found use in numerous applications, yet, the nature of ...

  13. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    Energy Technology Data Exchange (ETDEWEB)

    Hidajat, Rachmat; Nickols, Brian [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States); Forrester, Naomi [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States); Tretyakova, Irina [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States); Weaver, Scott [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States); Pushko, Peter, E-mail: ppushko@medigen-usa.com [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States)

    2016-03-15

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.

  14. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    International Nuclear Information System (INIS)

    Hidajat, Rachmat; Nickols, Brian; Forrester, Naomi; Tretyakova, Irina; Weaver, Scott; Pushko, Peter

    2016-01-01

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.

  15. A microfluidic DNA library preparation platform for next-generation sequencing.

    Science.gov (United States)

    Kim, Hanyoup; Jebrail, Mais J; Sinha, Anupama; Bent, Zachary W; Solberg, Owen D; Williams, Kelly P; Langevin, Stanley A; Renzi, Ronald F; Van De Vreugde, James L; Meagher, Robert J; Schoeniger, Joseph S; Lane, Todd W; Branda, Steven S; Bartsch, Michael S; Patel, Kamlesh D

    2013-01-01

    Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories.

  16. A microfluidic DNA library preparation platform for next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Hanyoup Kim

    Full Text Available Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories.

  17. Phylogenetic relationships of the Gomphales based on nuc-25S-rDNA, mit-12S-rDNA, and mit-atp6-DNA combined sequences

    Science.gov (United States)

    Admir J. Giachini; Kentaro Hosaka; Eduardo Nouhra; Joseph Spatafora; James M. Trappe

    2010-01-01

    Phylogenetic relationships among Geastrales, Gomphales, Hysterangiales, and Phallales were estimated via combined sequences: nuclear large subunit ribosomal DNA (nuc-25S-rDNA), mitochondrial small subunit ribosomal DNA (mit-12S-rDNA), and mitochondrial atp6 DNA (mit-atp6-DNA). Eighty-one taxa comprising 19 genera and 58 species...

  18. From Sequence to Morphology - Long-Range Correlations in Complete Sequenced Genomes

    NARCIS (Netherlands)

    T.A. Knoch (Tobias)

    2004-01-01

    The largely unresolved sequential organization, i.e. the relations within DNA sequences, and its connection to the three-dimensional organization of genomes was investigated by correlation analyses of completely sequenced chromosomes from Viroids, Archaea, Bacteria, Arabidopsis

  19. Sequence Dependencies of DNA Deformability and Hydration in the Minor Groove

    Science.gov (United States)

    Yonetani, Yoshiteru; Kono, Hidetoshi

    2009-01-01

    DNA deformability and hydration are both sequence-dependent and are essential in specific DNA sequence recognition by proteins. However, the relationship between the two is not well understood. Here, systematic molecular dynamics simulations of 136 DNA sequences that differ from each other in their central tetramer revealed that sequence dependence of hydration is clearly correlated with that of deformability. We show that this correlation can be illustrated by four typical cases. Most rigid basepair steps are highly likely to form an ordered hydration pattern composed of one water molecule forming a bridge between the bases of distinct strands, but a few exceptions favor another ordered hydration composed of two water molecules forming such a bridge. Steps with medium deformability can display both of these hydration patterns with frequent transition. Highly flexible steps do not have any stable hydration pattern. A detailed picture of this correlation demonstrates that motions of hydration water molecules and DNA bases are tightly coupled with each other at the atomic level. These results contribute to our understanding of the entropic contribution from water molecules in protein or drug binding and could be applied for the purpose of predicting binding sites. PMID:19686662

  20. Pericentric satellite DNA sequences in Pipistrellus pipistrellus (Vespertilionidae; Chiroptera).

    Science.gov (United States)

    Barragán, M J L; Martínez, S; Marchal, J A; Fernández, R; Bullejos, M; Díaz de la Guardia, R; Sánchez, A

    2003-09-01

    This paper reports the molecular and cytogenetic characterization of a HindIII family of satellite DNA in the bat species Pipistrellus pipistrellus. This satellite is organized in tandem repeats of 418 bp monomer units, and represents approximately 3% of the whole genome. The consensus sequence from five cloned monomer units has an A-T content of 62.20%. We have found differences in the ladder pattern of bands between two populations of the same species. These differences are probably because of the absence of the target sites for the HindIII enzyme in most monomer units of one population, but not in the other. Fluorescent in situ hybridization (FISH) localized the satellite DNA in the pericentromeric regions of all autosomes and the X chromosome, but it was absent from the Y chromosome. Digestion of genomic DNAs with HpaII and its isoschizomer MspI demonstrated that these repetitive DNA sequences are not methylated. Other bat species were tested for the presence of this repetitive DNA. It was absent in five Vespertilionidae and one Rhinolophidae species, indicating that it could be a species/genus specific, repetitive DNA family.

  1. Comparative d2/d3 LSU–rDNA sequence study of some Iranian ...

    African Journals Online (AJOL)

    SERVER

    2007-11-05

    Nov 5, 2007 ... segments yielded one fragment at over all sequenced isolates as 787 bp in size. The DNA sequences were aligned .... expansion segments of the 28S rDNA subunit (D2/D3. LSU-rDNA) are the ... isolated from different geographical location from tea shrubs infested roots of Guilan province, Iran (Table 1).

  2. Integration of hepatitis B virus DNA in chromosome-specific satellite sequences

    International Nuclear Information System (INIS)

    Shaul, Y.; Garcia, P.D.; Schonberg, S.; Rutter, W.J.

    1986-01-01

    The authors previously reported the cloning and detailed analysis of the integrated hepatitis B virus sequences in a human hepatoma cell line. They report here the integration of at least one of these hepatitis B virus sequences at human satellite DNA sequences. The majority of the cellular sequences identified by this satellite were organized as a multimeric composition of a 0.6-kilobase EcoRI fragment. This clone hybridized in situ almost exclusively to the centromeric heterochromatin of chromosomes 1 and 16 and to a lower extent to chromosome 2 and to the heterochromatic region of the Y chromosome. The immediate flanking host sequence appeared as a hierarchy of repeating units which were almost identical to a previously reported human satellite III DNA sequence.

  3. Autonomous replication of plasmids bearing monkey DNA origin-enriched sequences

    International Nuclear Information System (INIS)

    Frappier, L.; Zannis-Hadjopoulos, M.

    1987-01-01

    Twelve clones of origin-enriched sequences (ORS) isolated from early replicating monkey (CV-1) DNA were examined for transient episomal replication in transfected CV-1, COS-7, and HeLa cells. Plasmid DNA was isolated at time intervals after transfection and screened by the Dpn I resistance assay or by the bromodeoxyuridine substitution assay to differentiate between input and replicated DNA. The authors have identified four monkey ORS (ORS3, -8, -9, and -12) that can support plasmid replication in mammalian cells. This replication is carried out in a controlled and semiconservative manner characteristic of mammalian replicons. ORS replication was most efficient in HeLa cells. Electron microscopy showed ORS8 and ORS12 plasmids of the correct size with replication bubbles. Using a unique restriction site in ORS12, we have mapped the replication bubble within the monkey DNA sequence

  4. Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

    Science.gov (United States)

    Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

    2018-01-10

    Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and enzyme methyltransferase DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies depicted that, upon binding of DNMT1 methylates to this consensus DNA motif (C residues of CpG islands), mutation was likely to occur. 
Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing

  5. Profiling soil microbial communities with next-generation sequencing: the influence of DNA kit selection and technician technical expertise.

    Science.gov (United States)

    Soliman, Taha; Yang, Sung-Yin; Yamazaki, Tomoko; Jenke-Kodama, Holger

    2017-01-01

    Structure and diversity of microbial communities are an important research topic in biology, since microbes play essential roles in the ecology of various environments. Different DNA isolation protocols can lead to data bias and can affect results of next-generation sequencing. To evaluate the impact of protocols for DNA isolation from soil samples and also the influence of individual handling of samples, we compared results obtained by two researchers (R and T) using two different DNA extraction kits: (1) MO BIO PowerSoil® DNA Isolation kit (MO_R and MO_T) and (2) NucleoSpin® Soil kit (MN_R and MN_T). Samples were collected from six different sites on Okinawa Island, Japan. For all sites, differences in the results of microbial composition analyses (bacteria, archaea, fungi, and other eukaryotes), obtained by the two researchers using the two kits, were analyzed. For both researchers, the MN kit gave significantly higher yields of genomic DNA at all sites compared to the MO kit (ANOVA; P ... technicians for thorough microbial analyses and to obtain accurate estimates of microbial diversity.

  6. Research on Image Encryption Based on DNA Sequence and Chaos Theory

    Science.gov (United States)

    Tian Zhang, Tian; Yan, Shan Jun; Gu, Cheng Yan; Ren, Ran; Liao, Kai Xin

    2018-04-01

    Encryption is a common technique for protecting image data from unauthorized access. In recent years, many researchers have proposed encryption algorithms based on DNA sequences, offering new ideas for the design of image encryption algorithms. This paper therefore proposes a new image encryption method based on DNA computing, in which the original image is encrypted by DNA coding and a 1-D logistic chaotic map. First, the algorithm uses two modules as the encryption key: the first uses a real DNA sequence, and the second is generated by a one-dimensional logistic chaotic map. Second, the algorithm encodes the original image using DNA complementary rules, then uses the key and DNA computing to transform each pixel value of the original image, thereby encrypting the whole image. Simulation results show that the algorithm has good encryption effect and security.
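
    The abstract does not give the exact coding rule or the DNA operation, so the following is only a minimal sketch under stated assumptions: each pixel byte is written as four bases with the rule A=00, C=01, G=10, T=11 (one of the rules in which A/T and C/G are complementary), the logistic map x -> r*x*(1-x) supplies one key stream, a DNA string read four bases at a time supplies the second, and the three are combined with a base-wise XOR. The function names, the parameters x0 and r, and the sample key string are hypothetical.

```python
from itertools import cycle

BASES = "ACGT"  # assumed coding rule: A=00, C=01, G=10, T=11


def byte_to_dna(value):
    """Encode one 8-bit value as four bases, most significant bit pair first."""
    return "".join(BASES[(value >> shift) & 0b11] for shift in (6, 4, 2, 0))


def dna_to_byte(bases):
    """Decode four bases back into one 8-bit value."""
    value = 0
    for base in bases:
        value = (value << 2) | BASES.index(base)
    return value


def dna_xor(a, b):
    """Base-wise XOR of two equal-length DNA strings (assumed DNA operation)."""
    return "".join(BASES[BASES.index(x) ^ BASES.index(y)] for x, y in zip(a, b))


def logistic_bytes(x0, r, count, burn_in=100):
    """Key module 2: bytes derived from the 1-D logistic map x -> r*x*(1-x)."""
    x = x0
    for _ in range(burn_in):          # discard the transient
        x = r * x * (1 - x)
    out = []
    for _ in range(count):
        x = r * x * (1 - x)
        out.append(int(x * 256) % 256)
    return out


def dna_key_bytes(dna_key, count):
    """Key module 1: read a DNA string four bases at a time, cyclically."""
    stream = cycle(dna_key)
    return [dna_to_byte([next(stream) for _ in range(4)]) for _ in range(count)]


def crypt(pixels, dna_key, x0=0.3141, r=3.99):
    """Encrypt or decrypt a flat list of pixel bytes (XOR makes it symmetric)."""
    chaos = logistic_bytes(x0, r, len(pixels))
    key = dna_key_bytes(dna_key, len(pixels))
    result = []
    for p, c, k in zip(pixels, chaos, key):
        mixed = dna_xor(dna_xor(byte_to_dna(p), byte_to_dna(c)), byte_to_dna(k))
        result.append(dna_to_byte(mixed))
    return result
```

    Because XOR is its own inverse, calling crypt on the ciphertext with the same DNA key, x0 and r recovers the original pixels. Under this coding rule, XOR with T maps every base to its complement.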

  7. Sequence heterogeneity accelerates protein search for targets on DNA

    International Nuclear Information System (INIS)

    Shvets, Alexey A.; Kolomeisky, Anatoly B.

    2015-01-01

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  8. Sequence heterogeneity accelerates protein search for targets on DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shvets, Alexey A.; Kolomeisky, Anatoly B., E-mail: tolya@rice.edu [Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005 (United States)

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  9. Using TESS to predict transcription factor binding sites in DNA sequence.

    Science.gov (United States)

    Schug, Jonathan

    2008-03-01

    This unit describes how to use the Transcription Element Search System (TESS). This Web site predicts transcription factor binding sites (TFBS) in DNA sequence using two different kinds of models of sites, strings and positional weight matrices. The binding of transcription factors to DNA is a major part of the control of gene expression. Transcription factors exhibit sequence-specific binding; they form stronger bonds to some DNA sequences than to others. Identification of a good binding site in the promoter for a gene suggests the possibility that the corresponding factor may play a role in the regulation of that gene. However, the sequences transcription factors recognize are typically short and allow for some amount of mismatch. Because of this, binding sites for a factor can typically be found at random every few hundred to a thousand base pairs. TESS has features to help sort through and evaluate the significance of predicted sites.
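
    The two kinds of site model mentioned above can be illustrated with a short sketch. This is not TESS's own scoring scheme, which the abstract does not describe; it assumes a log-odds positional weight matrix built from aligned example sites with a pseudocount and a uniform background, and it estimates how often a string model matches uniform random DNA by chance. The function names and the example sites are hypothetical.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds positional weight matrix from aligned, equal-length sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


def expected_spacing(length, mismatches_allowed=0):
    """Mean distance in bp between chance matches of a string model
    in uniform random DNA, allowing up to `mismatches_allowed` mismatches."""
    p = sum(
        math.comb(length, k) * 0.75 ** k * 0.25 ** (length - k)
        for k in range(mismatches_allowed + 1)
    )
    return 1.0 / p


# Hypothetical aligned sites, for illustration only.
example_sites = ["TATAAA", "TATAAA", "TATATA", "TATAAA"]
example_pwm = build_pwm(example_sites)
example_hits = scan("GGCTATAAAGGC", example_pwm, threshold=5.0)
```

    Under the uniform-background assumption, an exact 5-bp string is expected once per 1024 bp, and a 6-bp string with one mismatch allowed about once per 216 bp, which is consistent with the abstract's remark that sites turn up at random every few hundred to a thousand base pairs.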

  10. Rhipicephalus microplus dataset of nonredundant raw sequence reads from 454 GS FLX sequencing of Cot-selected (Cot = 660) genomic DNA

    Science.gov (United States)

    A reassociation kinetics-based approach was used to reduce the complexity of genomic DNA from the Deutsch laboratory strain of the cattle tick, Rhipicephalus microplus, to facilitate genome sequencing. Selected genomic DNA (Cot value = 660) was sequenced using 454 GS FLX technology, resulting in 356...

  11. Improved detection of CXCR4-using HIV by V3 genotyping: application of population-based and "deep" sequencing to plasma RNA and proviral DNA.

    Science.gov (United States)

    Swenson, Luke C; Moores, Andrew; Low, Andrew J; Thielen, Alexander; Dong, Winnie; Woods, Conan; Jensen, Mark A; Wynhoven, Brian; Chan, Dennison; Glascock, Christopher; Harrigan, P Richard

    2010-08-01

    Tropism testing should rule out CXCR4-using HIV before treatment with CCR5 antagonists. Currently, the recombinant phenotypic Trofile assay (Monogram) is most widely utilized; however, genotypic tests may represent alternative methods. Independent triplicate amplifications of the HIV gp120 V3 region were made from either plasma HIV RNA or proviral DNA. These underwent standard, population-based sequencing with an ABI3730 (RNA n = 63; DNA n = 40), or "deep" sequencing with a Roche/454 Genome Sequencer-FLX (RNA n = 12; DNA n = 12). Position-specific scoring matrices (PSSMX4/R5) (-6.96 cutoff) and geno2pheno[coreceptor] (5% false-positive rate) inferred tropism from V3 sequence. These methods were then independently validated with a separate, blinded dataset (n = 278) of screening samples from the maraviroc MOTIVATE trials. Standard sequencing of HIV RNA with PSSM yielded 69% sensitivity and 91% specificity, relative to Trofile. The validation dataset gave 75% sensitivity and 83% specificity. Proviral DNA plus PSSM gave 77% sensitivity and 71% specificity. "Deep" sequencing of HIV RNA detected >2% inferred-CXCR4-using virus in 8/8 samples called non-R5 by Trofile, and <2% in 4/4 samples called R5. Triplicate analyses of V3 standard sequence data detect greater proportions of CXCR4-using samples than previously achieved. Sequencing proviral DNA and "deep" V3 sequencing may also be useful tools for assessing tropism.
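
    As a rough illustration of how such figures are obtained, the sketch below calls a deep-sequenced sample non-R5 when more than 2% of its reads are inferred to be CXCR4-using, then scores the calls against a reference assay. The -6.96 cutoff and the 2% level come from the abstract; the direction of the cutoff (scores at or above it counted as CXCR4-using), the function names, and the synthetic read scores are assumptions.

```python
PSSM_CUTOFF = -6.96         # cutoff quoted in the abstract
MINORITY_THRESHOLD = 0.02   # 2% level quoted for "deep" sequencing


def call_sample(read_scores, cutoff=PSSM_CUTOFF, threshold=MINORITY_THRESHOLD):
    """Return True (non-R5) when more than `threshold` of the reads score
    at or above `cutoff`. The direction of the cutoff is an assumption."""
    x4_reads = sum(1 for score in read_scores if score >= cutoff)
    return x4_reads / len(read_scores) > threshold


def sensitivity_specificity(predicted, reference):
    """Compare boolean non-R5 calls with a reference assay such as Trofile."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    fn = sum(1 for p, r in pairs if not p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    return tp / (tp + fn), tn / (tn + fp)


# Synthetic reads: 3% versus 1% of reads above the cutoff.
mixed_sample = [-10.0] * 97 + [-3.0] * 3
r5_sample = [-10.0] * 99 + [-3.0]

# The deep-sequencing result in the abstract: 8/8 non-R5 and 4/4 R5 agree.
reference_calls = [True] * 8 + [False] * 4
deep_calls = [True] * 8 + [False] * 4
deep_sens, deep_spec = sensitivity_specificity(deep_calls, reference_calls)
```

    With full agreement on the 12 deep-sequenced samples, both values are 1.0; the population-sequencing figures in the abstract (for example 69% and 91%) would come from the same calculation applied to counts the abstract does not list.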

  12. Isolation and sequence of complementary DNA encoding human extracellular superoxide dismutase

    International Nuclear Information System (INIS)

    Hjalmarsson, K.; Marklund, S.L.; Engstroem, A.; Edlund, T.

    1987-01-01

    A complementary DNA (cDNA) clone from a human placenta cDNA library encoding extracellular superoxide dismutase has been isolated and the nucleotide sequence determined. The cDNA has a very high G + C content. EC-SOD is synthesized with a putative 18-amino acid signal peptide, preceding the 222 amino acids in the mature enzyme, indicating that the enzyme is a secretory protein. The first 95 amino acids of the mature enzyme show no sequence homology with other sequenced proteins and there is one possible N-glycosylation site (Asn-89). The amino acid sequence from residues 96-193 shows strong homology (∼ 50%) with the final two-thirds of the sequences of all known eukaryotic CuZn SODs, whereas the homology with the P. leiognathi CuZn SOD is clearly lower. The ligands to Cu and Zn, the cysteines forming the intrasubunit disulfide bridge in the CuZn SODs, and the arginine found in all CuZn SODs in the entrance to the active site can all be identified in EC-SOD. A comparison with bovine CuZn SOD, the three-dimensional structure of which is known, reveals that the homologies occur in the active site and the divergencies are in the part constituting the subunit contact area in CuZn SOD. Amino acid sequence 194-222 in the carboxyl-terminal end of EC-SOD is strongly hydrophilic and contains nine amino acids with a positive charge. This sequence probably confers the affinity of EC-SOD for heparin and heparan sulfate. An analysis of the amino acid sequence homologies with CuZn SODs from various species indicates that the EC-SODs may have evolved from the CuZn SODs before the evolution of fungi and plants.

  13. Comparison of DNA Quantification Methods for Next Generation Sequencing.

    Science.gov (United States)

    Robin, Jérôme D; Ludlow, Andrew T; LaRanger, Ryan; Wright, Woodring E; Shay, Jerry W

    2016-04-06

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library's heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality.

  14. Genomic signal processing for DNA sequence clustering.

    Science.gov (United States)

    Mendizabal-Ruiz, Gerardo; Román-Godínez, Israel; Torres-Ramos, Sulema; Salido-Ruiz, Ricardo A; Vélez-Pérez, Hugo; Morales, J Alejandro

    2018-01-01

    Genomic signal processing (GSP) methods which convert DNA data to numerical values have recently been proposed, which would offer the opportunity of employing existing digital signal processing methods for genomic data. One of the most used methods for exploring data is cluster analysis which refers to the unsupervised classification of patterns in data. In this paper, we propose a novel approach for performing cluster analysis of DNA sequences that is based on the use of GSP methods and the K-means algorithm. We also propose a visualization method that facilitates the easy inspection and analysis of the results and possible hidden behaviors. Our results support the feasibility of employing the proposed method to find and easily visualize interesting features of sets of DNA data.
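
    The abstract does not say which numerical mapping or features the authors use, so the following is only a sketch of the general idea under stated assumptions: each sequence is converted to four binary indicator signals (one per base), summarized by base composition plus the spectral power at period 3, and the resulting vectors are grouped with a plain K-means (Lloyd's algorithm) using a deterministic farthest-first initialization. All function names and the toy sequences are hypothetical.

```python
import cmath


def indicator_signals(sequence):
    """Map a DNA string to four binary signals, one per base."""
    return {base: [1.0 if s == base else 0.0 for s in sequence] for base in "ACGT"}


def features(sequence):
    """Base composition (4 values) plus normalized spectral power at period 3."""
    n = len(sequence)
    signals = indicator_signals(sequence.upper())
    composition = [sum(signals[b]) / n for b in "ACGT"]
    period3 = sum(
        abs(sum(v * cmath.exp(-2j * cmath.pi * i / 3)
                for i, v in enumerate(signals[b]))) ** 2
        for b in "ACGT"
    ) / n ** 2
    return composition + [period3]


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with farthest-first initialization (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy input: two AT-rich and two GC-rich sequences.
toy_sequences = ["ATATATTAAT", "TTATAATATA", "GCGCGGCCGC", "CGGCCGCGGC"]
toy_labels = kmeans([features(s) for s in toy_sequences], k=2)
```

    On the toy input the two AT-rich sequences share one label and the two GC-rich sequences share the other. Real data would need richer features and a check of how sensitive the clusters are to the chosen mapping.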

  15. Transcription blockage by homopurine DNA sequences: role of sequence composition and single-strand breaks

    Science.gov (United States)

    Belotserkovskii, Boris P.; Neil, Alexander J.; Saleh, Syed Shayon; Shin, Jane Hae Soo; Mirkin, Sergei M.; Hanawalt, Philip C.

    2013-01-01

    The ability of DNA to adopt non-canonical structures can affect transcription and has broad implications for genome functioning. We have recently reported that guanine-rich (G-rich) homopurine-homopyrimidine sequences cause significant blockage of transcription in vitro in a strictly orientation-dependent manner: when the G-rich strand serves as the non-template strand [Belotserkovskii et al. (2010) Mechanisms and implications of transcription blockage by guanine-rich DNA sequences. Proc. Natl Acad. Sci. USA, 107, 12816–12821]. We have now systematically studied the effect of the sequence composition and single-stranded breaks on this blockage. Although substitution of guanine by any other base reduced the blockage, cytosine and thymine reduced the blockage more significantly than adenine substitutions, affirming the importance of both G-richness and the homopurine-homopyrimidine character of the sequence for this effect. A single-strand break in the non-template strand adjacent to the G-rich stretch dramatically increased the blockage. Breaks in the non-template strand result in much weaker blockage signals extending downstream from the break even in the absence of the G-rich stretch. Our combined data support the notion that transcription blockage at homopurine-homopyrimidine sequences is caused by R-loop formation. PMID:23275544

  16. Inspecting Targeted Deep Sequencing of Whole Genome Amplified DNA Versus Fresh DNA for Somatic Mutation Detection: A Genetic Study in Myelodysplastic Syndrome Patients.

    Science.gov (United States)

    Palomo, Laura; Fuster-Tormo, Francisco; Alvira, Daniel; Ademà, Vera; Armengol, María Pilar; Gómez-Marzo, Paula; de Haro, Nuri; Mallo, Mar; Xicoy, Blanca; Zamora, Lurdes; Solé, Francesc

    2017-08-01

    Whole genome amplification (WGA) has become an invaluable method for preserving limited samples of precious stock material and has been used during the past years as an alternative tool to increase the amount of DNA before library preparation for next-generation sequencing. Myelodysplastic syndromes (MDS) are a group of clonal hematopoietic stem cell disorders characterized by presenting somatic mutations in several myeloid-related genes. In this work, targeted deep sequencing has been performed on four paired fresh DNA and WGA DNA samples from bone marrow of MDS patients, to assess the feasibility of using WGA DNA for detecting somatic mutations. The results of this study highlighted that, in general, the sequencing and alignment statistics of fresh DNA and WGA DNA samples were similar. However, after variant calling and when considering variants detected at all frequencies, there was a high level of discordance between fresh DNA and WGA DNA (overall, a higher number of variants was detected in WGA DNA). After proper filtering, a total of three somatic mutations were detected in the cohort. All somatic mutations detected in fresh DNA were also identified in WGA DNA and validated by whole exome sequencing.

  17. Nucleotide sequence determination of the region in adenovirus 5 DNA involved in cell transformation

    International Nuclear Information System (INIS)

    Maat, J.

    1978-01-01

    A description is given of investigations into the primary structure of the transforming region of adenovirus type 5 DNA. The phenomenon of cell transformation is discussed in general terms, and the principles of a number of fairly recent techniques, which have been in use for DNA sequence determination since 1975, are dealt with. A few of the author's own techniques are described which deal both with nucleotide sequence analysis and with the determination of DNA cleavage sites of restriction endonucleases. The results are given of the mapping of the cleavage sites of HpaII, HaeIII, AluI, HinfI and TaqI in the HpaI-E fragment of adenovirus DNA and of the determination of the nucleotide sequence in the transforming region of adenovirus type 5 DNA. The results of the sequence determination of the Ad5 HindIII-G fragment are discussed in relation to the investigation of the transforming proteins isolated from in vitro and in vivo synthesizing systems. Labelling procedures of DNA are described, including the exonuclease III/DNA polymerase I method and T4 polynucleotide kinase labelling of DNA fragments. (Auth.)

  18. Insights into the phylogeny of Northern Hemisphere Armillaria: Neighbor-net and Bayesian analyses of translation elongation factor 1-α gene sequences

    Science.gov (United States)

    Ned B. Klopfenstein; Jane E. Stewart; Yuko Ota; John W. Hanna; Bryce A. Richardson; Amy L. Ross-Davis; Ruben D. Elias-Roman; Kari Korhonen; Nenad Keca; Eugenia Iturritxa; Dionicio Alvarado-Rosales; Halvor Solheim; Nicholas J. Brazee; Piotr Lakomy; Michelle R. Cleary; Eri Hasegawa; Taisei Kikuchi; Fortunato Garza-Ocanas; Panaghiotis Tsopelas; Daniel Rigling; Simone Prospero; Tetyana Tsykun; Jean A. Berube; Franck O. P. Stefani; Saeideh Jafarpour; Vladimir Antonin; Michal Tomsovsky; Geral I. McDonald; Stephen Woodward; Mee-Sook Kim

    2017-01-01

    Armillaria possesses several intriguing characteristics that have inspired wide interest in understanding phylogenetic relationships within and among species of this genus. Nuclear ribosomal DNA sequence–based analyses of Armillaria provide only limited information for phylogenetic studies among widely divergent taxa. More recent studies have shown that translation...

  19. [Identification of a repetitive sequence element for DNA fingerprinting in Phytophthora sojae].

    Science.gov (United States)

    Yin, Lihua; Wang, Qinhu; Ning, Feng; Zhu, Xiaoying; Zuo, Yuhu; Shan, Weixing

    2010-04-01

    Establishment of DNA fingerprinting in Phytophthora sojae and an analysis of genetic relationship of Heilongjiang and Xinjiang populations. Bioinformatics tools were used to search repetitive sequences in P. sojae and Southern blot analysis was employed for DNA fingerprinting analysis of P. sojae populations from Heilongjiang and Xinjiang using the identified repetitive sequence. A moderately repetitive sequence was identified and designated as PS1227. Southern blot analysis indicated 34 distinct bands ranging in size from 1.5 kb-23 kb, of which 21 were polymorphic among 49 isolates examined. Analysis of single-zoospore progenies showed that the PS1227 fingerprint pattern was mitotically stable. DNA fingerprinting showed that the P. sojae isolates HP4002, SY6 and GJ0105 of Heilongjiang are genetically identical to DW303, 71228 and 71222 of Xinjiang, respectively. A moderately repetitive sequence designated PS1227 which will be useful for epidemiology and population biology studies of P. sojae was obtained, and a PS1227-based DNA fingerprinting analysis provided molecular evidence that P. sojae in Xinjiang was likely introduced from Heilongjiang.

  20. Frequency of Epstein-Barr virus DNA sequences in human gliomas

    Directory of Open Access Journals (Sweden)

    Renata Fragelli Fonseca

    Full Text Available CONTEXT AND OBJECTIVE: The Epstein-Barr virus (EBV) is the most common cause of infectious mononucleosis and is also associated with several human tumors, including Burkitt's lymphoma, Hodgkin's lymphoma, some cases of gastric carcinoma and nasopharyngeal carcinoma, among other neoplasms. The aim of this study was to screen 75 primary gliomas for the presence of specific EBV DNA sequences by means of the polymerase chain reaction (PCR), with confirmation by direct sequencing. DESIGN AND SETTING: Prevalence study on EBV molecular genetics at a molecular pathology laboratory in a university hospital and at an applied genetics laboratory in a national institution. METHODS: A total of 75 primary glioma biopsies and 6 others from other tumors from the central nervous system were obtained. The tissues were immediately frozen for subsequent DNA extraction by means of traditional methods using proteinase K digestion and extraction with a phenol-chloroform-isoamyl alcohol mixture. DNA was precipitated with ethanol, resuspended in buffer and stored. The PCRs were carried out using primers for amplification of the EBV BamM region. Positive and negative controls were added to each reaction. The PCR products were used for direct sequencing for confirmation. RESULTS: The viral sequences were positive in 11/75 (14.7%) of our samples. CONCLUSION: The prevalence of EBV DNA was 11/75 (14.7%) in our glioma collection. Further molecular and epidemiological studies are needed to establish the possible role played by EBV in the tumorigenesis of gliomas.

  1. A survey of the sequence-specific interaction of damaging agents with DNA: emphasis on antitumor agents.

    Science.gov (United States)

    Murray, V

    1999-01-01

    This article reviews the literature concerning the sequence specificity of DNA-damaging agents. DNA-damaging agents are widely used in cancer chemotherapy. It is important to understand fully the determinants of DNA sequence specificity so that more effective DNA-damaging agents can be developed as antitumor drugs. There are five main methods of DNA sequence specificity analysis: cleavage of end-labeled fragments, linear amplification with Taq DNA polymerase, ligation-mediated polymerase chain reaction (PCR), single-strand ligation PCR, and footprinting. The DNA sequence specificity in purified DNA and in intact mammalian cells is reviewed for several classes of DNA-damaging agent. These include agents that form covalent adducts with DNA, free radical generators, topoisomerase inhibitors, intercalators and minor groove binders, enzymes, and electromagnetic radiation. The main sites of adduct formation are at the N-7 of guanine in the major groove of DNA and the N-3 of adenine in the minor groove, whereas free radical generators abstract hydrogen from the deoxyribose sugar and topoisomerase inhibitors cause enzyme-DNA cross-links to form. Several issues involved in the determination of the DNA sequence specificity are discussed. The future directions of the field, with respect to cancer chemotherapy, are also examined.

  2. Real-time DNA barcoding in a rainforest using nanopore sequencing: opportunities for rapid biodiversity assessments and local capacity building.

    Science.gov (United States)

    Pomerantz, Aaron; Peñafiel, Nicolás; Arteaga, Alejandro; Bustamante, Lucas; Pichardo, Frank; Coloma, Luis A; Barrio-Amorós, César L; Salazar-Valenzuela, David; Prost, Stefan

    2018-04-01

    Advancements in portable scientific instruments provide promising avenues to expedite field work in order to understand the diverse array of organisms that inhabit our planet. Here, we tested the feasibility for in situ molecular analyses of endemic fauna using a portable laboratory fitting within a single backpack in one of the world's most imperiled biodiversity hotspots, the Ecuadorian Chocó rainforest. We used portable equipment, including the MinION nanopore sequencer (Oxford Nanopore Technologies) and the miniPCR (miniPCR), to perform DNA extraction, polymerase chain reaction amplification, and real-time DNA barcoding of reptile specimens in the field. We demonstrate that nanopore sequencing can be implemented in a remote tropical forest to quickly and accurately identify species using DNA barcoding, as we generated consensus sequences for species resolution with an accuracy of >99% in less than 24 hours after collecting specimens. The flexibility of our mobile laboratory further allowed us to generate sequence information at the Universidad Tecnológica Indoamérica in Quito for rare, endangered, and undescribed species. This includes the recently rediscovered Jambato toad, which was thought to be extinct for 28 years. Sequences generated on the MinION required as few as 30 reads to achieve high accuracy relative to Sanger sequencing, and with further multiplexing of samples, nanopore sequencing can become a cost-effective approach for rapid and portable DNA barcoding. Overall, we establish how mobile laboratories and nanopore sequencing can help to accelerate species identification in remote areas to aid in conservation efforts and be applied to research facilities in developing countries. This opens up possibilities for biodiversity studies by promoting local research capacity building, teaching nonspecialists and students about the environment, tackling wildlife crime, and promoting conservation via research-focused ecotourism.

  3. The influence of DNA sequence on epigenome-induced pathologies

    Directory of Open Access Journals (Sweden)

    Meagher Richard B

    2012-07-01

    Full Text Available Clear cause-and-effect relationships are commonly established between genotype and the inherited risk of acquiring human and plant diseases and aberrant phenotypes. By contrast, few such cause-and-effect relationships are established linking a chromatin structure (that is, the epitype) with the transgenerational risk of acquiring a disease or abnormal phenotype. It is not entirely clear how epitypes are inherited from parent to offspring as populations evolve, even though epigenetics is proposed to be fundamental to evolution and the likelihood of acquiring many diseases. This article explores the hypothesis that, for transgenerationally inherited chromatin structures, “genotype predisposes epitype”, and that epitype functions as a modifier of gene expression within the classical central dogma of molecular biology. Evidence for the causal contribution of genotype to inherited epitypes and epigenetic risk comes primarily from two different kinds of studies discussed herein. The first and direct method of research proceeds by the examination of the transgenerational inheritance of epitype and the penetrance of phenotype among genetically related individuals. The second approach identifies epitypes that are duplicated (as DNA sequences are duplicated) and evolutionarily conserved among repeated patterns in the DNA sequence. The body of this article summarizes particularly robust examples of these studies from humans, mice, Arabidopsis, and other organisms. The bulk of the data from both areas of research support the hypothesis that genotypes predispose the likelihood of displaying various epitypes, but for only a few classes of epitype. This analysis suggests that renewed efforts are needed in identifying polymorphic DNA sequences that determine variable nucleosome positioning and DNA methylation as the primary cause of inherited epigenome-induced pathologies. By contrast, there is very little evidence that DNA sequence directly

  4. High Throughput Sample Preparation and Analysis for DNA Sequencing, PCR and Combinatorial Screening of Catalysis Based on Capillary Array Technique

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Yonghua [Iowa State Univ., Ames, IA (United States)

    2000-01-01

    Sample preparation has been one of the major bottlenecks for many high-throughput analyses. The purpose of this research was to develop new sample preparation and integration approaches for DNA sequencing, PCR-based DNA analysis and combinatorial screening of homogeneous catalysis based on multiplexed capillary electrophoresis with laser-induced fluorescence or imaging UV absorption detection. The author first introduced a method to integrate the front-end tasks to DNA capillary-array sequencers. Protocols for directly sequencing the plasmids from a single bacterial colony in fused-silica capillaries were developed. After the colony was picked, lysis was accomplished in situ in the plastic sample tube using either a thermocycler or heating block. Upon heating, the plasmids were released while chromosomal DNA and membrane proteins were denatured and precipitated to the bottom of the tube. After adding enzyme and Sanger reagents, the resulting solution was aspirated into the reaction capillaries by a syringe pump, and cycle sequencing was initiated. No deleterious effect upon the reaction efficiency, the on-line purification system, or the capillary electrophoresis separation was observed, even though the crude lysate was used as the template. Multiplexed on-line DNA sequencing data from 8 parallel channels allowed base calling up to 620 bp with an accuracy of 98%. The entire system can be automatically regenerated for repeated operation. For PCR-based DNA analysis, they demonstrated that capillary electrophoresis with UV detection can be used for DNA analysis starting from clinical samples without purification. After PCR using cheek cell, blood or HIV-1 gag DNA, the reaction mixtures were injected into the capillary either on-line or off-line by base stacking. The protocol was also applied to capillary array electrophoresis. The use of cheaper detection, and the elimination of purification of DNA sample before or after PCR reaction, will make this approach an

  5. Modeling genetic imprinting effects of DNA sequences with multilocus polymorphism data

    Directory of Open Access Journals (Sweden)

    Staud Roland

    2009-08-01

    Full Text Available Single nucleotide polymorphisms (SNPs) represent the most widespread type of DNA sequence variation in the human genome and they have recently emerged as valuable genetic markers for revealing the genetic architecture of complex traits in terms of nucleotide combination and sequence. Here, we extend an algorithmic model for the haplotype analysis of SNPs to estimate the effects of genetic imprinting expressed at the DNA sequence level. The model provides a general procedure for identifying the number and types of optimal DNA sequence variants that are expressed differently due to their parental origin. The model is used to analyze a genetic data set collected from a pain genetics project. We find that DNA haplotype GAC from three SNPs, OPRKG36T (with two alleles G and T), OPRKA843G (with alleles A and G), and OPRKC846T (with alleles C and T), at the kappa-opioid receptor, triggers a significant effect on pain sensitivity, but with expression significantly depending on the parent from which it is inherited (p = 0.008). With a tremendous advance in SNP identification and automated screening, the model founded on haplotype discovery and statistical inference may provide a useful tool for genetic analysis of any quantitative trait with complex inheritance.
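
    As an illustration of the haplotype space this abstract refers to, the following minimal sketch enumerates the haplotypes that three biallelic SNPs can form and the parent-of-origin (ordered) haplotype pairs an imprinting model would have to distinguish. The allele lists come from the abstract; the function name and the treatment of a diplotype as an ordered (maternal, paternal) pair are assumptions made for illustration, not the authors' statistical model.

```python
from itertools import product

# Alleles per SNP, in the order listed in the abstract.
SNP_ALLELES = {
    "OPRKG36T": ("G", "T"),
    "OPRKA843G": ("A", "G"),
    "OPRKC846T": ("C", "T"),
}


def enumerate_haplotypes(snp_alleles):
    """Return every allele combination across the SNPs as a string, e.g. 'GAC'."""
    return ["".join(combo) for combo in product(*snp_alleles.values())]


haplotypes = enumerate_haplotypes(SNP_ALLELES)

# Hypothetical representation: when parental origin matters, the pair
# (maternal, paternal) is ordered, so (GAC, TGT) differs from (TGT, GAC).
ordered_diplotypes = list(product(haplotypes, repeat=2))

print(len(haplotypes), "haplotypes;", len(ordered_diplotypes), "ordered pairs")
print("GAC" in haplotypes)
```

    Three biallelic SNPs give 2^3 = 8 possible haplotypes, of which GAC is one; keeping parental origin doubles the bookkeeping for heterozygous pairs, which is why the ordered representation is used here.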

  6. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution

    NARCIS (Netherlands)

    Falconer, Ester; Hills, Mark; Naumann, Ulrike; Poon, Steven S. S.; Chavez, Elizabeth A.; Sanders, Ashley D.; Zhao, Yongjun; Hirst, Martin; Lansdorp, Peter M.

    DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it

  7. Underwound DNA under Tension: Structure, Elasticity, and Sequence-Dependent Behaviors

    Science.gov (United States)

    Sheinin, Maxim Y.; Forth, Scott; Marko, John F.; Wang, Michelle D.

    2011-09-01

    DNA melting under torsion plays an important role in a wide variety of cellular processes. In the present Letter, we have investigated DNA melting at the single-molecule level using an angular optical trap. By directly measuring force, extension, torque, and angle of DNA, we determined the structural and elastic parameters of torsionally melted DNA. Our data reveal that under moderate forces, the melted DNA assumes a left-handed structure as opposed to an open bubble conformation and is highly torsionally compliant. We have also discovered that at low forces melted DNA properties are highly dependent on DNA sequence. These results provide a more comprehensive picture of the global DNA force-torque phase diagram.

  8. Molecular dynamics simulations shed light on the enthalpic and entropic driving forces that govern the sequence specific recognition between netropsin and DNA.

    Science.gov (United States)

    Dolenc, Jozica; Gerster, Sarah; van Gunsteren, Wilfred F

    2010-09-02

    To gain a better understanding of the various driving forces that govern sequence-specific DNA minor groove binding, we performed a thermodynamic analysis of netropsin binding to an AT-containing and to a set of six mixed AT/GC-containing binding sequences in the DNA minor groove. The relative binding free energies obtained using molecular dynamics simulations and free energy calculations show significant variations with the binding sequence. While the introduction of a GC base pair in the middle or close to the middle of the binding site is unfavorable for netropsin binding, a GC base pair at the end of the binding site appears to have no negative influence on the binding. The results of the structural and energetic analyses of the netropsin-DNA complexes reveal that the differences in the calculated binding affinities cannot be explained solely in terms of netropsin-DNA hydrogen-bonding or interaction energies. In addition, solvation effects and entropic contributions to the relative binding free energy provide a more complete picture of the various factors determining binding. Analysis of the relative binding entropy indicates that its magnitude is highly sequence-dependent, with the ratio |TΔΔS|/|ΔΔH| ranging from 0.07 for the AAAGA binding sequence to 1.7 for the AAGAG binding sequence.

  9. Molecular phylogeny and radiation time of erysiphales inferred from the nuclear ribosomal DNA sequences

    International Nuclear Information System (INIS)

    Mori, Y.; Sato, Y.; Takamatsu, S.

    2000-01-01

    Phylogenetic relationships of Erysiphales within Ascomycota were inferred from the newly determined sequences of the 18S rDNA and partial sequences of the 28S rDNA including the D1 and D2 regions of 10 Erysiphales taxa. Phylogenetic analyses revealed that the Erysiphales form a distinct clade among ascomycetous fungi, suggesting that the Erysiphales diverged from a single ancestral taxon. The Myxotrichaceae of the Onygenales was distantly related to the other onygenalean families and was the sister group to the Erysiphales clade, with which it combined to form a clade. The Erysiphales/Myxotrichaceae clade was also closely related to some discomycetous fungi (Leotiales, Cyttariales and Thelebolaceae) including taxa that form cleistothecial ascomata. The present molecular analyses as well as previously reported morphological observations suggest the possible existence of a novel evolutionary pathway from cleistothecial discomycetous fungi to Erysiphales and Myxotrichaceae. However, since most of these fungi, except for the Erysiphales, are saprophytic on dung and/or plant materials, the questions of how and why an obligate biotroph like the Erysiphales radiated from the saprophytic fungi remain to be addressed. We also estimated the radiation time of the Erysiphales using the 18S rDNA sequences and the two molecular clocks that have been previously reported. The calculation showed that the Erysiphales split from the Myxotrichaceae 190–127 myr ago. Since the radiation time of the Erysiphales does not exceed 230 myr ago, even when allowance is made for the uncertainty of the molecular clocks, it is possible to consider that the Erysiphales evolved after the radiation of angiosperms. The results of our calculation also showed that the first radiation within the Erysiphales (138–92 myr ago) coincided with the date of a major diversification of angiosperms (130–90 myr ago). These results may support our early assumption that the radiation of the Erysiphales

  10. Bacterial DNA Sequence Compression Models Using Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Armando J. Pinho

    2013-08-01

    Full Text Available It is widely accepted that the advances in DNA sequencing techniques have contributed to an unprecedented growth of genomic data. This fact has increased the interest in DNA compression, not only from the information theory and biology points of view, but also from a practical perspective, since such sequences require storage resources. Several compression methods exist, and particularly, those using finite-context models (FCMs) have received increasing attention, as they have been proven to effectively compress DNA sequences with low bits-per-base, as well as low encoding/decoding time-per-base. However, the amount of run-time memory required to store high-order finite-context models may become impractical, since a context-order as low as 16 requires a maximum of 17.2 × 10^9 memory entries. This paper presents a method to reduce such a memory requirement by using a novel application of artificial neural networks (ANN) to build such probabilistic models in a compact way and shows how to use them to estimate the probabilities. Such a system was implemented, and its performance compared against state-of-the-art compressors, such as XM-DNA (expert model) and FCM-Mx (mixture of finite-context models), as well as with general-purpose compressors. Using a combination of order-10 FCM and ANN, similar encoding results to those of FCM, up to order-16, are obtained using only 17 megabytes of memory, whereas the latter, even employing hash-tables, uses several hundreds of megabytes.
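
    The memory figure quoted for order 16 (roughly 17.2 billion entries) follows from counting one counter per (context, next symbol) pair over a four-letter alphabet, that is 4^(16+1). The sketch below checks that arithmetic and shows a toy count-based finite-context model. It is a minimal illustration, not the paper's compressor: the class name, the sparse dictionary storage, and the additive smoothing constant alpha are all assumptions introduced here.

```python
from collections import defaultdict

ALPHABET = "ACGT"


def fcm_table_entries(order, alphabet_size=4):
    """Counters in a dense order-k table: one per (context, next symbol) pair."""
    return alphabet_size ** (order + 1)


class FiniteContextModel:
    """Toy order-k model that counts which symbol follows each k-mer context."""

    def __init__(self, order, alpha=1.0):
        self.order = order
        self.alpha = alpha  # additive smoothing constant (an assumption)
        self.counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0))

    def update(self, sequence):
        """Accumulate next-symbol counts for every context in the sequence."""
        k = self.order
        for i in range(k, len(sequence)):
            self.counts[sequence[i - k:i]][sequence[i]] += 1

    def probability(self, context, symbol):
        """Smoothed estimate of P(symbol | context); unseen contexts are uniform."""
        row = self.counts.get(context) or dict.fromkeys(ALPHABET, 0)
        total = sum(row.values())
        return (row[symbol] + self.alpha) / (total + self.alpha * len(ALPHABET))


print(fcm_table_entries(16))  # 4**17 counters for an order-16 model

model = FiniteContextModel(order=2)
model.update("ACGTACGTACGA")
print(model.probability("AC", "G"))
```

    A dictionary only stores contexts that actually occur, which is the same motivation behind the hash tables mentioned in the abstract; a dense table would allocate all 4^(k+1) counters up front.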

  11. Mesoscopic modeling of DNA denaturation rates: Sequence dependence and experimental comparison

    Energy Technology Data Exchange (ETDEWEB)

    Dahlen, Oda, E-mail: oda.dahlen@ntnu.no; Erp, Titus S. van, E-mail: titus.van.erp@ntnu.no [Department of Chemistry, Norwegian University of Science and Technology (NTNU), Høgskoleringen 5, Realfagbygget D3-117 7491 Trondheim (Norway)

    2015-06-21

    Using rare event simulation techniques, we calculated DNA denaturation rate constants for a range of sequences and temperatures for the Peyrard-Bishop-Dauxois (PBD) model with two different parameter sets. We studied a larger variety of sequences compared to previous studies that only consider DNA homopolymers and DNA sequences containing an equal amount of weak AT- and strong GC-base pairs. Our results show that, contrary to previous findings, an even distribution of the strong GC-base pairs does not always result in the fastest possible denaturation. In addition, we applied an adaptation of the PBD model to study hairpin denaturation for which experimental data are available. This is the first quantitative study in which dynamical results from the mesoscopic PBD model have been compared with experiments. Our results show that present parameterized models, although giving good results regarding thermodynamic properties, overestimate denaturation rates by orders of magnitude. We believe that our dynamical approach is, therefore, an important tool for verifying DNA models and for developing next generation models that have higher predictive power than present ones.
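
    For readers unfamiliar with the Peyrard-Bishop-Dauxois model, the sketch below evaluates its potential energy in the commonly cited form: a Morse potential per base pair plus an anharmonic stacking term between neighbours. The numerical parameter values are illustrative placeholders chosen for this example, not either of the two parameter sets used in the study, and the function names are hypothetical.

```python
import math

# Illustrative placeholder parameters (assumptions, not the study's values).
# Morse depth D in eV and inverse width a in 1/angstrom, per base-pair type.
MORSE = {"AT": (0.05, 4.2), "GC": (0.075, 6.9)}
K = 0.025      # stacking force constant, eV/angstrom^2
RHO = 2.0      # strength of the anharmonic stacking correction
ALPHA = 0.35   # decay of the anharmonic correction, 1/angstrom


def morse(y, pair):
    """On-site term V(y) = D * (exp(-a*y) - 1)^2 for base-pair stretch y."""
    depth, width = MORSE[pair]
    return depth * (math.exp(-width * y) - 1.0) ** 2


def stacking(y, y_prev):
    """Neighbour term W = (K/2) * (1 + RHO*exp(-ALPHA*(y + y_prev))) * (y - y_prev)^2."""
    return 0.5 * K * (1.0 + RHO * math.exp(-ALPHA * (y + y_prev))) * (y - y_prev) ** 2


def pbd_potential_energy(stretches, pairs):
    """Total potential energy of a chain; pairs[i] is 'AT' or 'GC'."""
    energy = sum(morse(y, p) for y, p in zip(stretches, pairs))
    energy += sum(stacking(stretches[i], stretches[i - 1])
                  for i in range(1, len(stretches)))
    return energy


# Opening a single base pair by 1 angstrom costs more for GC than for AT.
closed = [0.0, 0.0, 0.0]
opened = [0.0, 1.0, 0.0]
print(pbd_potential_energy(opened, ["AT", "AT", "AT"]))
print(pbd_potential_energy(opened, ["AT", "GC", "AT"]))
```

    With these placeholder values the deeper GC Morse well makes the GC opening more expensive, which is the sense in which the abstract calls GC pairs strong and AT pairs weak.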

  12. Molecular genotyping of Colletotrichum species based on arbitrarily primed PCR, A + T-Rich DNA, and nuclear DNA analyses

    Science.gov (United States)

    Freeman, S.; Pham, M.; Rodriguez, R.J.

    1993-01-01

    Molecular genotyping of Colletotrichum species based on arbitrarily primed PCR, A + T-rich DNA, and nuclear DNA analyses. Experimental Mycology 17, 309-322. Isolates of Colletotrichum were grouped into 10 separate species based on arbitrarily primed PCR (ap-PCR), A + T-rich DNA (AT-DNA) and nuclear DNA banding patterns. In general, the grouping of Colletotrichum isolates by these molecular approaches corresponded to that done by classical taxonomic identification; however, some exceptions were observed. PCR amplification of genomic DNA using four different primers allowed for reliable differentiation between isolates of the 10 species. HaeIII digestion patterns of AT-DNA also distinguished between species of Colletotrichum by generating species-specific band patterns. In addition, hybridization of the repetitive DNA element (GcpR1) to genomic DNA identified a unique set of PstI-digested nuclear DNA fragments in each of the 10 species of Colletotrichum tested. Multiple isolates of C. acutatum, C. coccodes, C. fragariae, C. lindemuthianum, C. magna, C. orbiculare, C. graminicola from maize, and C. graminicola from sorghum showed 86-100% intraspecies similarity based on ap-PCR and AT-DNA analyses. Interspecies similarity determined by ap-PCR and AT-DNA analyses varied between 0 and 33%. Three distinct banding patterns were detected in isolates of C. gloeosporioides from strawberry. Similarly, three different banding patterns were observed among isolates of C. musae from diseased banana.

  13. Fascioliasis transmission by Lymnaea neotropica confirmed by nuclear rDNA and mtDNA sequencing in Argentina.

    Science.gov (United States)

    Mera y Sierra, Roberto; Artigas, Patricio; Cuervo, Pablo; Deis, Erika; Sidoti, Laura; Mas-Coma, Santiago; Bargues, Maria Dolores

    2009-12-03

    Fascioliasis is widespread in livestock in Argentina. Among activities included in a long-term initiative to ascertain which are the fascioliasis areas of most concern, studies were performed in a recreational farm, including liver fluke infection in different domestic animal species, classification of the lymnaeid vector and verification of natural transmission of fascioliasis by identification of the intramolluscan trematode larval stages found in naturally infected snails. The high prevalences in the domestic animals appeared related to only one lymnaeid species present. Lymnaeid and trematode classification was verified by means of nuclear ribosomal DNA and mitochondrial DNA marker sequencing. Complete sequences of 18S rRNA gene and rDNA ITS-2 and ITS-1, and a fragment of the mtDNA cox1 gene demonstrate that the Argentinian lymnaeid belongs to the species Lymnaea neotropica. Redial larval stages found in a L. neotropica specimen were ascribed to Fasciola hepatica after analysis of the complete ITS-1 sequence. The finding of L. neotropica is the first of this lymnaeid species not only in Argentina but also in Southern Cone countries. The total absence of nucleotide differences between the sequences of specimens from Argentina and the specimens from the Peruvian type locality at the levels of rDNA 18S, ITS-2 and ITS-1, and the only one mutation at the mtDNA cox1 gene suggest a very recent spread. The ecological characteristics of this lymnaeid, living in small, superficial water collections frequented by livestock, suggest that it may be carried from one place to another by remaining in dried mud stuck to the feet of transported animals. The presence of L. neotropica adds pronounced complexity to the transmission and epidemiology of fascioliasis in Argentina, due to the great difficulties in distinguishing, by traditional malacological methods, between the three similar lymnaeid species of the controversial Galba/Fossaria group present in this country: L. viatrix

  14. Bisulfite sequencing reveals that Aspergillus flavus holds a hollow in DNA methylation.

    Directory of Open Access Journals (Sweden)

    Si-Yang Liu

    Full Text Available Aspergillus flavus first gained scientific attention for its production of aflatoxin. The underlying regulation of aflatoxin biosynthesis has served as a theoretical model for the biosynthesis of other microbial secondary metabolites. Nevertheless, for several decades, the DNA methylation status of A. flavus, one of the important epigenomic modifications involved in gene regulation, has remained controversial. Here, we applied bisulfite sequencing in conjunction with a biological replicate strategy to investigate the DNA methylation profile of the A. flavus genome. Both the bisulfite sequencing data and the methylome comparisons with other fungi confirm that the DNA methylation level of this fungus is negligible. Further investigation into the DNA methyltransferase of Aspergillus uncovers its close relationship with RID-like enzymes as well as its divergence from the methyltransferases of species with validated DNA methylation. The lack of repeat content in the A. flavus genome and the high RIP-index of the small amount of remnant repeats potentially support our speculation that DNA methylation may be absent in A. flavus or that it may possess de novo DNA methylation which occurs very transiently during the obscure sexual stage of this fungal species. This work contributes to our understanding of the DNA methylation status of A. flavus and reinforces our views on DNA methylation in fungal species. In addition, our strategy of applying bisulfite sequencing to DNA methylation detection in species with low DNA methylation may serve as a reference for later scientific investigations in other hypomethylated species.

  15. Ultra-deep sequencing of mouse mitochondrial DNA: mutational patterns and their origins.

    Directory of Open Access Journals (Sweden)

    Adam Ameur

    2011-03-01

    Full Text Available Somatic mutations of mtDNA are implicated in the aging process, but there is no universally accepted method for their accurate quantification. We have used ultra-deep sequencing to study genome-wide mtDNA mutation load in the liver of normally- and prematurely-aging mice. Mice that are homozygous for an allele expressing a proof-reading-deficient mtDNA polymerase (mtDNA mutator mice) have 10-times-higher point mutation loads than their wildtype siblings. In addition, the mtDNA mutator mice have increased levels of a truncated linear mtDNA molecule, resulting in decreased sequence coverage in the deleted region. In contrast, circular mtDNA molecules with large deletions occur at extremely low frequencies in mtDNA mutator mice and can therefore not drive the premature aging phenotype. Sequence analysis shows that the main proportion of the mutation load in heterozygous mtDNA mutator mice and their wildtype siblings is inherited from their heterozygous mothers, consistent with germline transmission. We found no increase in levels of point mutations or deletions in wildtype C57Bl/6N mice with increasing age, thus questioning the causative role of these changes in aging. In addition, there was no increased frequency of transversion mutations with time in any of the studied genotypes, arguing against oxidative damage as a major cause of mtDNA mutations. Our results from studies of mice thus indicate that most somatic mtDNA mutations occur as replication errors during development and do not result from damage accumulation in adult life.

  16. Ray Wu as Fifth Business: Deconstructing collective memory in the history of DNA sequencing.

    Science.gov (United States)

    Onaga, Lisa A

    2014-06-01

    The concept of 'Fifth Business' is used to analyze a minority standpoint and bring serious attention to the role of scientists who play a galvanizing role in a science but for multiple reasons appear less prominently in more common recounts of any particular development. Biochemist Ray Wu (1928-2008) published a DNA sequencing experiment in March 1970 using DNA polymerase catalysis and specific nucleotide labeling, both of which are foundational to general sequencing methods today. The scant mention of Wu's work in textbooks, research articles, and other accounts of DNA sequencing calls into question how scientific collective memory forms. This alternative history seeks to understand why a key figure in nucleic acid sequence analysis has remained less visibly connected or peripheral to solidifying narratives about the history of DNA sequencing. The study resists predictable dismissals of Wu's work in order to seriously examine the formation of his nucleic acid sequence analysis research program and how he shared his knowledge of sequencing during a period of rapid advancement in the field. An analysis of Wu's work on sequencing the cohesive ends of lambda bacteriophage in the 1960s and 1970s exemplifies how a variety of individuals and groups attempted to develop protocols for sequencing the order of nucleotide base pairs comprising DNA. This historical examination of the sociality of scientific research suggests a way to understand how Wu and others contributed to the very collective memory of DNA sequencing that Wu eventually tried to repair. The study of Wu, who was a Chinese immigrant to the United States, provides a foundation for further critical scholarship on the heterogeneous histories of Asian American bioscientists, the sociality of their scientific works, and how the resulting knowledge produced is preserved, if not evenly, in a scientific field's collective memory. Copyright © 2014 Elsevier Ltd. All rights reserved.

  17. An analysis of expressed sequence tags of developing castor endosperm using a full-length cDNA library

    Directory of Open Access Journals (Sweden)

    Wallis James G

    2007-07-01

    Full Text Available Abstract Background: Castor seeds are a major source for ricinoleate, an important industrial raw material. Genomics studies of castor plant will provide critical information for understanding seed metabolism, for effectively engineering ricinoleate production in transgenic oilseeds, or for genetically improving castor plants by eliminating toxic and allergic proteins in seeds. Results: Full-length cDNAs are useful resources in annotating genes and in providing functional analysis of genes and their products. We constructed a full-length cDNA library from developing castor endosperm, and obtained 4,720 ESTs from 5'-ends of the cDNA clones representing 1,908 unique sequences. The most abundant transcripts are genes encoding storage proteins, ricin, agglutinin and oleosins. Several other sequences are also very numerous, including two acidic triacylglycerol lipases, and the oleate hydroxylase (FAH12) gene that is responsible for ricinoleate biosynthesis. The role(s) of the lipases in developing castor seeds are not clear, and co-expression of a lipase and FAH12 did not result in significant changes in hydroxy fatty acid accumulation in transgenic Arabidopsis seeds. Only one oleate desaturase (FAD2) gene was identified in our cDNA sequences. Sequence and functional analyses of the castor FAD2 were carried out since it had not been characterized previously. Overexpression of castor FAD2 in a FAH12-expressing Arabidopsis line resulted in decreased accumulation of hydroxy fatty acids in transgenic seeds. Conclusion: Our results suggest that transcriptional regulation of FAD2 and FAH12 genes may be one of the mechanisms that contribute to a high level of ricinoleate accumulation in castor endosperm. The full-length cDNA library will be used to search for additional genes that affect ricinoleate accumulation in seed oils. Our EST sequences will also be useful to annotate the castor genome, which whole sequence is being generated by shotgun sequencing at

  18. High-throughput sequencing of three Lemnoideae (duckweeds) chloroplast genomes from total DNA.

    Directory of Open Access Journals (Sweden)

    Wenqin Wang

    Full Text Available BACKGROUND: Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaptation of aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. METHODS: We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana, by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. CONCLUSIONS: This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-fold coverage of the chloroplast genome from total DNA was reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.

  19. Roles of genes and Alu repeats in nonlinear correlations of HUMHBB DNA sequence

    International Nuclear Information System (INIS)

    Xiao Yi; Huang Yanzhao

    2004-01-01

    DNA sequences of different species and different portions of the DNA of the same species may have completely different correlation properties, but the origin of these correlations is still not very clear and is currently being investigated, especially in different particular cases. We report here a study of the DNA sequence of human beta globin region (HUMHBB) which has strong linear and nonlinear correlations. We studied the roles of two of the typical elements of DNA sequence, genes and Alu repeats, in the nonlinear correlations of HUMHBB. We find that there exist strong nonlinear correlations between the exons or introns in different genes and between the Alu repeats. They may be one of the major sources of the nonlinear correlations in HUMHBB.

  20. Sequence specificity and biological consequences of drugs that bind covalently in the minor groove of DNA

    International Nuclear Information System (INIS)

    Hurley, L.H.; Needham-VanDevanter, D.R.

    1986-01-01

    DNA ligands which bind within the minor groove of DNA exhibit varying degrees of sequence selectivity. Factors which contribute to nucleotide sequence recognition by minor groove ligands have been extensively investigated. Electrostatic interactions, ligand and DNA dehydration energies, hydrophobic interactions and steric factors all play significant roles in sequence selectivity in the minor groove. Interestingly, ligand recognition of nucleotide sequence in the minor groove does not involve significant hydrogen bonding. This is in sharp contrast to cellular enzyme and protein recognition of nucleotide sequence, which is achieved in the major groove via specific hydrogen bond formation between individual bases and the ligand. The ability to read nucleotide sequence via hydrogen bonding allows precise binding of proteins to specific DNA sequences. Minor groove ligands examined to date exhibit a much lower sequence specificity, generally binding to a subset of possible sequences, rather than a single sequence. 19 refs., 7 figs

  1. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments.

    Science.gov (United States)

    Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias

    2013-09-24

    Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousands of years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp.

  2. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences...

  3. Engineering of a DNA Polymerase for Direct m6A Sequencing.

    Science.gov (United States)

    Aschenbrenner, Joos; Werner, Stephan; Marchand, Virginie; Adam, Martina; Motorin, Yuri; Helm, Mark; Marx, Andreas

    2018-01-08

    Methods for the detection of RNA modifications are of fundamental importance for advancing epitranscriptomics. N6-methyladenosine (m6A) is the most abundant RNA modification in mammalian mRNA and is involved in the regulation of gene expression. Current detection techniques are laborious and rely on antibody-based enrichment of m6A-containing RNA prior to sequencing, since m6A modifications are generally "erased" during reverse transcription (RT). To overcome the drawbacks associated with indirect detection, we aimed to generate novel DNA polymerase variants for direct m6A sequencing. Therefore, we developed a screen to evolve an RT-active KlenTaq DNA polymerase variant that sets a mark for N6-methylation. We identified a mutant that exhibits increased misincorporation opposite m6A compared to unmodified A. Application of the generated DNA polymerase in next-generation sequencing allowed the identification of m6A sites directly from the sequencing data of untreated RNA samples. © 2017 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.

  4. Complete cDNA sequence coding for human docking protein

    Energy Technology Data Exchange (ETDEWEB)

    Hortsch, M; Labeit, S; Meyer, D I

    1988-01-11

    Docking protein (DP, or SRP receptor) is a rough endoplasmic reticulum (ER)-associated protein essential for the targeting and translocation of nascent polypeptides across this membrane. It specifically interacts with a cytoplasmic ribonucleoprotein complex, the signal recognition particle (SRP). The nucleotide sequence of cDNA encoding the entire human DP and its deduced amino acid sequence are given.

  5. Sequence homology at the breakpoint and clinical phenotype of mitochondrial DNA deletion syndromes.

    Science.gov (United States)

    Sadikovic, Bekim; Wang, Jing; El-Hattab, Ayman W; Landsverk, Megan; Douglas, Ganka; Brundage, Ellen K; Craigen, William J; Schmitt, Eric S; Wong, Lee-Jun C

    2010-12-20

    Mitochondrial DNA (mtDNA) deletions are a common cause of mitochondrial disorders. Large mtDNA deletions can lead to a broad spectrum of clinical features with different age of onset, ranging from mild mitochondrial myopathies (MM), progressive external ophthalmoplegia (PEO), and Kearns-Sayre syndrome (KSS), to severe Pearson syndrome. The aim of this study is to investigate the molecular signatures surrounding the deletion breakpoints and their association with the clinical phenotype and age at onset. MtDNA deletions in 67 patients were characterized using array comparative genomic hybridization (aCGH) followed by PCR-sequencing of the deletion junctions. Sequence homology including both perfect and imperfect short repeats flanking the deletion regions was analyzed and correlated with clinical features and patients' age group. In all age groups, there was a significant increase in sequence homology flanking the deletion compared to mtDNA background. The youngest patient group (<6 years old) showed a diffuse pattern of deletion distribution in size and locations, with a significantly lower sequence homology flanking the deletion, and the highest percentage of deletion mutant heteroplasmy. The older age groups showed a rather discrete pattern of deletions, with 44% of all patients over 6 years old carrying the most common 5 kb mtDNA deletion, which was found mostly in muscle specimens (22/41). Only 15% (3/20) of the young patients (<6 years old) carried the 5 kb common deletion, which is usually present in blood rather than muscle. This group of patients predominantly (16 out of 17) exhibited multisystem disorder and/or Pearson syndrome, while older patients had predominantly neuromuscular manifestations including KSS, PEO, and MM. In conclusion, sequence homology at the deletion flanking regions is a consistent feature of mtDNA deletions. Decreased levels of sequence homology and increased levels of deletion mutant heteroplasmy appear to correlate with earlier onset and more severe disease with multisystem involvement.

  6. Sequence-specific DNA binding by MYC/MAX to low-affinity non-E-box motifs.

    Directory of Open Access Journals (Sweden)

    Michael Allevato

    Full Text Available The MYC oncoprotein regulates transcription of a large fraction of the genome as an obligatory heterodimer with the transcription factor MAX. The MYC:MAX heterodimer and MAX:MAX homodimer (hereafter MYC/MAX) bind Enhancer box (E-box) DNA elements (CANNTG) and have the greatest affinity for the canonical MYC E-box (CME) CACGTG. However, MYC:MAX also recognizes E-box variants and was reported to bind DNA in a "non-specific" fashion in vitro and in vivo. Here, in order to identify potential additional non-canonical binding sites for MYC/MAX, we employed high throughput in vitro protein-binding microarrays, along with electrophoretic mobility-shift assays and bioinformatic analyses of MYC-bound genomic loci in vivo. We identified all hexameric motifs preferentially bound by MYC/MAX in vitro, which include the low-affinity non-E-box sequence AACGTT, and found that the vast majority (87%) of MYC-bound genomic sites in a human B cell line contain at least one of the top 21 motifs bound by MYC:MAX in vitro. We further show that high MYC/MAX concentrations are needed for specific binding to the low-affinity sequence AACGTT in vitro and that elevated MYC levels in vivo more markedly increase the occupancy of AACGTT sites relative to CME sites, especially at distal intergenic and intragenic loci. Hence, MYC binds diverse DNA motifs with a broad range of affinities in a sequence-specific and dose-dependent manner, suggesting that MYC overexpression has more selective effects on the tumor transcriptome than previously thought.
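
    The motif classes named in this abstract can be located in a sequence with a simple hexamer scan. The sketch below is illustrative only and is not the authors' pipeline: the function name `find_motifs`, the labels, and the example sequence are assumptions. Because CANNTG and AACGTT are their own reverse complements, scanning one strand is enough.

```python
# Hypothetical helper, not from the cited study: label hexamers as the
# canonical MYC E-box (CACGTG), another E-box variant (CANNTG), or the
# low-affinity non-E-box motif AACGTT mentioned in the abstract.
CANONICAL = "CACGTG"
NON_EBOX = "AACGTT"


def find_motifs(seq):
    """Return a list of (position, hexamer, label) for each motif hit."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer == CANONICAL:
            hits.append((i, hexamer, "CME"))
        elif (hexamer.startswith("CA") and hexamer.endswith("TG")
              and set(hexamer[2:4]) <= set("ACGT")):
            hits.append((i, hexamer, "E-box variant"))
        elif hexamer == NON_EBOX:
            hits.append((i, hexamer, "non-E-box"))
    return hits


if __name__ == "__main__":
    for hit in find_motifs("TTCACGTGAACAGCTGTTAACGTTCC"):
        print(hit)
```

    A scan of this kind only reports where motifs occur; it says nothing about binding affinity, which the study measured experimentally.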

  7. Characteristics of palindromic sequences in DNA of the sea urchin Strongylocentrotus intermedius

    International Nuclear Information System (INIS)

    Brykov, V.A.; Kukhlevskii, A.D.

    1986-01-01

    The fraction of palindromic sequences in the nuclear DNA of the sea urchin S. intermedius was characterized. Using chromatography on hydroxyapatite and treatment with S1 nuclease, it was shown that the fraction of palindromic sequences more than doubles when the sodium concentration in solution is increased or the temperature of reassociation is lowered. The increase is due to the involvement of inverted repeats in reassociation, which are characterized by a substantial nonhomologous character and/or the presence of an extended intervening DNA sequence. It was found by the method of reassociation of a nicked palindrome fraction with an excess of total homologous DNA that most of the inverted repeats in the sea urchin genome are unique sequences. The complexity of the palindrome fraction was estimated at 8.2 × 10^7 nucleotide pairs, and the number of palindromes per haploid genome at ∼500,000.
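
    For readers unfamiliar with the terminology, an inverted repeat is an arm followed, possibly after an intervening spacer, by its own reverse complement. The sketch below shows one way to find such sites in a sequence string. It is a computational illustration only and is unrelated to the hydroxyapatite and S1 nuclease assays used in the study; `inverted_repeats`, `arm`, and `max_spacer` are hypothetical names and parameters, and only perfect arms are detected.

```python
# Illustrative only: exact inverted repeats with a bounded spacer.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeats(seq, arm=4, max_spacer=3):
    """Return (start, arm_sequence, spacer_length) for each inverted repeat.

    A hit means seq[start:start+arm] is followed, after spacer_length
    bases, by its reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer
            # A slice running past the end is shorter than `target`
            # and therefore never compares equal.
            if seq[j:j + arm] == target:
                hits.append((i, left, spacer))
    return hits


if __name__ == "__main__":
    print(inverted_repeats("TTGAATTCTT", arm=3))  # EcoRI site GAATTC
    print(inverted_repeats("GAACCTTC", arm=3))    # arms separated by a spacer
```

    Allowing mismatches in the arms, which the abstract's "substantial nonhomologous character" would require, is deliberately left out of this sketch.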

  8. The function analysis of full-length cDNA sequence from IRM-2 mouse cDNA library

    International Nuclear Information System (INIS)

    Wang Qin; Liu Xiaoqiu; Xu Chang; Du Liqing; Sun Zhijuan; Wang Yan; Liu Qiang; Song Li; Li Jin; Fan Feiyue

    2013-01-01

    Objective: To identify the function of full-length cDNA sequences from the IRM-2 mouse cDNA library. Methods: Full-length cDNA products were amplified by PCR from the IRM-2 mouse cDNA library according to twenty-one expressed sequence tags. The expression of full-length cDNAs was detected after mouse embryonic fibroblasts were exposed to 6.5 Gy γ-ray radiation, and the effect on the growth of radiosensitive AT5B1VA cells transfected with full-length cDNAs was investigated. Results: The expression of No.4, 5 and 2 full-length cDNAs from IRM-2 mouse was higher than that of parental ICR and 615 mouse after mouse embryonic fibroblasts were irradiated with γ-rays, and the survival rate of AT5B1VA cells transfected with No.4, 5 and 2 full-length cDNAs was high. Conclusion: No.4, 5 and 2 full-length cDNAs of IRM-2 mouse are of high radioresistance. (authors)

  9. Sequence specificity of DNA cleavage by Micrococcus luteus γ endonuclease

    International Nuclear Information System (INIS)

    Hentosh, P.; Henner, W.D.; Reynolds, R.J.

    1985-01-01

    DNA fragments of defined sequence have been used to determine the sites of cleavage by γ-endonuclease activity in extracts prepared from Micrococcus luteus. End-labeled DNA restriction fragments of pBR322 DNA that had been irradiated under nitrogen in the presence of potassium iodide or t-butanol were treated with M. luteus γ endonuclease and analyzed on irradiated DNA preferentially at the positions of cytosines and thymines. DNA cleavage occurred immediately to the 3' side of pyrimidines in irradiated DNA and resulted in fragments that terminate in a 5'-phosphoryl group. These studies indicate that both altered cytosines and thymines may be important DNA lesions requiring repair after exposure to γ radiation

  10. Comparative sequence analyses on the 16S rRNA (rDNA) of Bacillus acidocaldarius, Bacillus acidoterrestris, and Bacillus cycloheptanicus and proposal for creation of a new genus, Alicyclobacillus gen. nov

    Science.gov (United States)

    Wisotzkey, J. D.; Jurtshuk, P. Jr; Fox, G. E.; Deinhard, G.; Poralla, K.

    1992-01-01

    Comparative 16S rRNA (rDNA) sequence analyses performed on the thermophilic Bacillus species Bacillus acidocaldarius, Bacillus acidoterrestris, and Bacillus cycloheptanicus revealed that these organisms are sufficiently different from the traditional Bacillus species to warrant reclassification in a new genus, Alicyclobacillus gen. nov. An analysis of 16S rRNA sequences established that these three thermoacidophiles cluster in a group that differs markedly from both the obligately thermophilic organisms Bacillus stearothermophilus and the facultatively thermophilic organism Bacillus coagulans, as well as many other common mesophilic and thermophilic Bacillus species. The thermoacidophilic Bacillus species B. acidocaldarius, B. acidoterrestris, and B. cycloheptanicus also are unique in that they possess omega-alicyclic fatty acid as the major natural membranous lipid component, which is a rare phenotype that has not been found in any other Bacillus species characterized to date. This phenotype, along with the 16S rRNA sequence data, suggests that these thermoacidophiles are biochemically and genetically unique and supports the proposal that they should be reclassified in the new genus Alicyclobacillus.

  11. High Performance Systolic Array Core Architecture Design for DNA Sequencer

    Directory of Open Access Journals (Sweden)

    Saiful Nurdin Dayana

    2018-01-01

    Full Text Available This paper presents a high performance systolic array (SA) core architecture design for a Deoxyribonucleic Acid (DNA) sequencer. The core implements the affine gap penalty score Smith-Waterman (SW) algorithm. This time-consuming local alignment algorithm guarantees optimal alignment between DNA sequences, but it requires quadratic computation time when performed on standard desktop computers. The use of a linear SA decreases the time complexity from quadratic to linear. In addition, with the exponential growth of DNA databases, the SA architecture is used to overcome the timing issue. In this work, the SW algorithm has been captured using Verilog Hardware Description Language (HDL) and simulated using the Xilinx ISIM simulator. The proposed design has been implemented in a Xilinx Virtex-6 Field Programmable Gate Array (FPGA) and improved in the core area by 90% reduction.
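
    As a software point of reference for the algorithm this core implements, the sketch below computes the Smith-Waterman local alignment score with an affine gap penalty using the standard three-matrix (Gotoh) recurrences. It is a minimal illustration, not the paper's Verilog design: the scoring values and the function name `smith_waterman_affine` are assumptions, and a gap of length k is charged `gap_open + (k - 1) * gap_extend`. This version runs in quadratic time, which is the cost the abstract refers to. Cells on the same anti-diagonal do not depend on one another, which is the property a systolic array can exploit.

```python
def smith_waterman_affine(a, b, match=2, mismatch=-1,
                          gap_open=-3, gap_extend=-1):
    """Best local alignment score of a and b under an affine gap penalty."""
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j)
    # E: best score ending with a gap that consumes b[j-1] only
    # F: best score ending with a gap that consumes a[i-1] only
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Extend an existing gap or open a new one.
            E[i][j] = max(E[i][j - 1] + gap_extend, H[i][j - 1] + gap_open)
            F[i][j] = max(F[i - 1][j] + gap_extend, H[i - 1][j] + gap_open)
            score = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets a local alignment restart anywhere.
            H[i][j] = max(0, H[i - 1][j - 1] + score, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_affine("ACGTTTACGT", "ACGTACGT"))
```

    Only the score is returned; recovering the alignment itself would need a traceback over the three matrices, which is omitted here.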

  12. Effect of DNA sequence, ionic strength, and cationic DNA affinity binders on the methylation of DNA by N-methyl-N-nitrosourea

    International Nuclear Information System (INIS)

    Wurdeman, R.L.; Gold, B.

    1988-01-01

    DNA alkylation by N-alkyl-N-nitrosoureas is generally accepted to be responsible for their mutagenic, carcinogenic, and antineoplastic activities. The exact nature of the ultimate alkylating intermediate is still controversial, with a variety of species having been nominated. The sequence specificity for DNA alkylation by simple N-alkyl-N-nitrosoureas has not been reported, although such information is basic in understanding the specific point mutations induced by these compounds in oncogene targets. These two points are addressed by using N-methyl-N-nitrosourea methylation of a 576 base-pair 32P-end-labeled DNA restriction fragment and high-resolution polyacrylamide sequencing gels. This method provides information on the formation of N7-methylguanine, by the generation of single-strand breaks upon exposure to piperidine.

  13. Sequence specific electronic conduction through polyion-stabilized double-stranded DNA in nanoscale break junctions

    International Nuclear Information System (INIS)

    Mahapatro, Ajit K; Jeong, Kyung J; Lee, Gil U; Janes, David B

    2007-01-01

    This paper presents a study of sequence specific electronic conduction through short (15-base-pair) double-stranded (ds) DNA molecules, measured by immobilizing 3'-thiol-derivatized DNAs in nanometre scale gaps between gold electrodes. The polycation spermidine was used to stabilize the ds-DNA structure, allowing electrical measurements to be performed in a dry state. For specific sequences, the conductivity was observed to scale with the surface density of immobilized DNA, which can be controlled by the buffer concentration. A series of 15-base DNA oligonucleotide pairs, in which the centre sequence of five base pairs was changed from G:C to A:T pairs, has been studied. The conductivity per molecule is observed to decrease exponentially with the number of adjacent A:T pairs replacing G:C pairs, consistent with a barrier at the A:T sites. Conductance-based devices for short DNA sequences could provide sensing approaches with direct electrical readout, as well as label-free detection.

  14. DNA migration mechanism analyses for applications in capillary and microchip electrophoresis

    Science.gov (United States)

    Forster, Ryan E.; Hert, Daniel G.; Chiesl, Thomas N.; Fredlake, Christopher P.; Barron, Annelise E.

    2009-01-01

    In 2009, electrophoretically driven DNA separations in slab gels and capillaries have the sepia tones of an old-fashioned technology in the eyes of many, even while they remain ubiquitously used, fill a unique niche, and arguably have yet to reach their full potential. For comic relief, what is old becomes new again: agarose slab gel separations are used to prepare DNA samples for “next-gen” sequencing platforms (e.g., the Illumina and 454 machines); dsDNA molecules within a certain size range are “cut out” of a gel and recovered for subsequent “massively parallel” pyrosequencing. In this review, we give a Barron lab perspective on how our comprehension of DNA migration mechanisms in electrophoresis has evolved, since the first reports of DNA separations by CE (∼1989) until now, 20 years later. Fused silica capillaries, and borosilicate glass and plastic microchips, quietly offer increasing capacities for fast (and even “ultra-fast”), efficient DNA separations. While the channel-by-channel scaling of both old and new electrophoresis platforms provides key flexibility, it requires each unique DNA sample to be prepared in its own micro- or nanovolume. This Achilles' heel of electrophoresis technologies left an opening through which pooled-sample, next-gen DNA sequencing technologies rushed. We shall see, over time, whether sharpening understanding of transitions in DNA migration modes in crosslinked gels, nanogel solutions, and uncrosslinked polymer solutions will allow electrophoretic DNA analysis technologies to flower again. Microchannel electrophoresis, after a quiet period of metamorphosis, may emerge sleeker and more powerful, to claim its own important niche applications. PMID:19582705

  15. Targeted DNA Methylation Analysis by High Throughput Sequencing in Porcine Peri-attachment Embryos

    OpenAIRE

    MORRILL, Benson H.; COX, Lindsay; WARD, Anika; HEYWOOD, Sierra; PRATHER, Randall S.; ISOM, S. Clay

    2013-01-01

    Abstract The purpose of this experiment was to implement and evaluate the effectiveness of a next-generation sequencing-based method for DNA methylation analysis in porcine embryonic samples. Fourteen discrete genomic regions were amplified by PCR using bisulfite-converted genomic DNA derived from day 14 in vivo-derived (IVV) and parthenogenetic (PA) porcine embryos as template DNA. Resulting PCR products were subjected to high-throughput sequencing using the Illumina Genome Analyzer IIx plat...

  16. DNA repair-related genes in sugarcane expressed sequence tags (ESTs)

    Directory of Open Access Journals (Sweden)

    R.M.A. Costa

    2001-12-01

    Full Text Available There is much interest in the identification and characterization of genes involved in DNA repair because of their importance in the maintenance of the genome integrity. The high level of conservation of DNA repair genes means that these genetic elements may be used in phylogenetic studies as a source of information on the genetic origin and evolution of species. The mechanisms by which damaged DNA is repaired are well understood in bacteria, yeast and mammals, but much remains to be learned as regards plants. We identified genes involved in DNA repair mechanisms in sugarcane using a similarity search of the Brazilian Sugarcane Expressed Sequence Tag (SUCEST) database against known sequences deposited in other public databases (the National Center of Biotechnology Information (NCBI) database and the Munich Information Center for Protein Sequences (MIPS) Arabidopsis thaliana database). This search revealed that most of the various proteins involved in DNA repair in sugarcane are similar to those found in other eukaryotes. However, we also identified certain intriguing features found only in plants, probably due to the independent evolution of this kingdom. The DNA repair mechanisms investigated include photoreactivation, base excision repair, nucleotide excision repair, mismatch repair, non-homologous end joining, homologous recombination repair and DNA lesion tolerance. We report the main differences found in the DNA repair machinery in plant cells as compared to other organisms. These differences point to potentially different strategies plants employ to deal with DNA damage, which deserve further investigation.

  17. Applications of statistical physics and information theory to the analysis of DNA sequences

    Science.gov (United States)

    Grosse, Ivo

    2000-10-01

    DNA carries the genetic information of most living organisms, and the goal of genome projects is to uncover that genetic information. One basic task in the analysis of DNA sequences is the recognition of protein coding genes. Powerful computer programs for gene recognition have been developed, but most of them are based on statistical patterns that vary from species to species. In this thesis I address the question of whether there exist universal statistical patterns that are different in coding and noncoding DNA of all living species, regardless of their phylogenetic origin. In search of such species-independent patterns I study the mutual information function of genomic DNA sequences, and find that it shows persistent period-three oscillations. To understand the biological origin of the observed period-three oscillations, I compare the mutual information function of genomic DNA sequences to the mutual information function of stochastic model sequences. I find that the pseudo-exon model is able to reproduce the mutual information function of genomic DNA sequences. Moreover, I find that a generalization of the pseudo-exon model can connect the existence and the functional form of long-range correlations to the presence and the length distributions of coding and noncoding regions. Based on these theoretical studies I am able to find an information-theoretical quantity, the average mutual information (AMI), whose probability distributions are significantly different in coding and noncoding DNA, while they are almost identical in all studied species. These findings show that there exist universal statistical patterns that are different in coding and noncoding DNA of all studied species, and they suggest that the AMI may be used to identify genes in different living species, irrespective of their taxonomic origin.
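
    The mutual information function mentioned above can be estimated directly from a sequence. The sketch below uses the textbook definition, I(k) = Σ p_ab(k) log2[p_ab(k) / (p_a p_b)], where p_ab(k) is the observed frequency of base a followed k positions later by base b and the marginals are taken from the same pairs. This is a generic estimator under that assumption; the thesis's exact estimator and its definition of the AMI are not given in the abstract, and the function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(seq, k):
    """Mutual information in bits between bases k positions apart."""
    # Count every ordered pair (seq[i], seq[i + k]).
    pairs = Counter(zip(seq, seq[k:]))
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    # Marginal counts of the first and second member of each pair.
    first, second = Counter(), Counter()
    for (a, b), count in pairs.items():
        first[a] += count
        second[b] += count
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / total
        p_a = first[a] / total
        p_b = second[b] / total
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    seq = "ACGT" * 50
    for k in range(1, 7):
        print(k, round(mutual_information(seq, k), 3))
```

    On real genomic sequence the raw estimate is biased upward for finite samples, so a comparison between coding and noncoding regions would normally use windows of equal length or a bias correction.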

  18. DNA copy number, including telomeres and mitochondria, assayed using next-generation sequencing

    Directory of Open Access Journals (Sweden)

    Jackson Stuart

    2010-04-01

    Full Text Available Abstract Background: DNA copy number variations occur within populations and aberrations can cause disease. We sought to develop an improved lab-automatable, cost-efficient, accurate platform to profile DNA copy number. Results: We developed a sequencing-based assay of nuclear, mitochondrial, and telomeric DNA copy number that draws on the unbiased nature of next-generation sequencing and incorporates techniques developed for RNA expression profiling. To demonstrate this platform, we assayed UMC-11 cells using 5 million 33 nt reads and found tremendous copy number variation, including regions of single and homogeneous deletions and amplifications to 29 copies; 5 times more mitochondria and 4 times less telomeric sequence than a pool of non-diseased, blood-derived DNA; and that UMC-11 was derived from a male individual. Conclusion: The described assay outputs absolute copy number, outputs an error estimate (p-value), and is more accurate than array-based platforms at high copy number. The platform enables profiling of mitochondrial levels and telomeric length. The assay is lab-automatable and has a genomic resolution and cost that are tunable based on the number of sequence reads.

  19. Sequence Dependent Electrophoretic Separations of DNA in Pluronic F127 Gels

    Science.gov (United States)

    You, Seungyong; van Winkle, David H.

    2010-03-01

    Two-dimensional (2-D) electrophoresis has successfully been used to visualize the separation of DNA fragments of the same length. We electrophorese a double-stranded DNA ladder in an Agarose gel for the first dimension and in gels of Pluronic F127 for the second dimension at room temperature. The 1000 bp band that travels together as a single band in an Agarose gel is split into two bands in Pluronic gels. The slower band follows the exponential decay trend that the other ladder constituents do. After sequencing the DNA fragments, the faster band has an apparently random sequence, while the slower band and the others have two A-tracts in each 250 bp segment. The A-tracts consist of a series of at least five adenine bases pairing with thymine bases. This result leads to the conclusion that the migration of the DNA molecules bent with A-tracts is more retarded in Pluronic gels than the wild-type of DNA molecules.

  20. Parasitic infections and resource economy of Danish Iron Age settlement through ancient DNA sequencing.

    Science.gov (United States)

    Tams, Katrine Wegener; Jensen Søe, Martin; Merkyte, Inga; Valeur Seersholm, Frederik; Henriksen, Peter Steen; Klingenberg, Susanne; Willerslev, Eske; Kjær, Kurt H; Hansen, Anders Johannes; Kapel, Christian Moliin Outzen

    2018-01-01

    In this study, we screen archaeological soil samples by microscopy and analyse the samples by next generation sequencing to obtain results with parasites at species level and untargeted findings of plant and animal DNA. Three separate sediment layers of an ancient man-made pond in Hoby, Denmark, ranging from 100 BC to 200 AD, were analysed by microscopy for presence of intestinal worm eggs, and DNA analyses were performed to identify intestinal worms and dietary components. Ancient DNA of parasites, domestic animals and edible plants revealed a change in use of the pond over time reflecting the household practice in the adjacent Iron Age settlement. The most abundant parasite found belonged to the Ascaris genus, which was not possible to type at species level. For all sediment layers the presence of eggs of the human whipworm Trichuris trichiura and the beef tapeworm Taenia saginata suggests continuous disposal of human faeces in the pond. Moreover, the continuous findings of T. saginata further imply beef consumption and may suggest that cattle were living in the immediate surrounding of the site throughout the period. Findings of additional host-specific parasites suggest fluctuating presence of other domestic animals over time: Trichuris suis (pig), Parascaris univalens (horse), Taenia hydatigena (dog and sheep). Likewise, alternating occurrence of aDNA of edible plants may suggest changes in agricultural practices. Moreover, the composition of aDNA of parasites, plants and vertebrates suggests a significant change in the use of the ancient pond over a period of three centuries.

  1. Molecular cloning and nucleotide sequence of cDNA for human liver arginase

    International Nuclear Information System (INIS)

    Haraguchi, Y.; Takiguchi, M.; Amaya, Y.; Kawamoto, S.; Matsuda, I.; Mori, M.

    1987-01-01

    Arginase (EC 3.5.3.1) catalyzes the last step of the urea cycle in the liver of ureotelic animals. Inherited deficiency of the enzyme results in argininemia, an autosomal recessive disorder characterized by hyperammonemia. To facilitate investigation of the enzyme and gene structures and to elucidate the nature of the mutation in argininemia, the authors isolated cDNA clones for human liver arginase. Oligo(dT)-primed and random primer human liver cDNA libraries in λ gt11 were screened using isolated rat arginase cDNA as a probe. Two of the positive clones, designated λ hARG6 and λ hARG109, contained an overlapping cDNA sequence with an open reading frame encoding a polypeptide of 322 amino acid residues (predicted M_r, 34,732), a 5'-untranslated sequence of 56 base pairs, a 3'-untranslated sequence of 423 base pairs, and a poly(A) segment. Arginase activity was detected in Escherichia coli cells transformed with the plasmid carrying λ hARG6 cDNA insert. RNA gel blot analysis of human liver RNA showed a single mRNA of 1.6 kilobases. The predicted amino acid sequence of human liver arginase is 87% and 41% identical with those of the rat liver and yeast enzymes, respectively. There are several highly conserved segments among the human, rat, and yeast enzymes.

  2. Plastome Sequencing of Ten Nonmodel Crop Species Uncovers a Large Insertion of Mitochondrial DNA in Cashew.

    Science.gov (United States)

    Rabah, Samar O; Lee, Chaehee; Hajrah, Nahid H; Makki, Rania M; Alharby, Hesham F; Alhebshi, Alawiah M; Sabir, Jamal S M; Jansen, Robert K; Ruhlman, Tracey A

    2017-11-01

    In plant evolution, intracellular gene transfer (IGT) is a prevalent, ongoing process. While nuclear and mitochondrial genomes are known to integrate foreign DNA via IGT and horizontal gene transfer (HGT), plastid genomes (plastomes) have resisted foreign DNA incorporation and only recently has IGT been uncovered in the plastomes of a few land plants. In this study, we completed plastome sequences for 10 crop species and describe a number of structural features including variation in gene and intron content, inversions, and expansion and contraction of the inverted repeat (IR). We identified a putative in cinnamon ( J. Presl) and other sequenced Lauraceae and an apparent functional transfer of to the nucleus of quinoa ( Willd.). In the orchard tree cashew ( L.), we report the insertion of an ∼6.7-kb fragment of mitochondrial DNA into the plastome IR. BLASTn analyses returned high identity hits to mitogenome sequences including an intact open reading frame. Using three plastome markers for five species of , we generated a phylogeny to investigate the distribution and timing of the insertion. Four species share the insertion, suggesting that this event occurred <20 million yr ago in a single clade in the genus. Our study extends the observation of mitochondrial to plastome IGT to include long-lived tree species. While previous studies have suggested possible mechanisms facilitating IGT to the plastome, more examples of this phenomenon, along with more complete mitogenome sequences, will be required before a common, or variable, mechanism can be elucidated. Copyright © 2017 Crop Science Society of America.

  3. Simulating efficiently the evolution of DNA sequences.

    Science.gov (United States)

    Schöniger, M; von Haeseler, A

    1995-02-01

    Two menu-driven FORTRAN programs are described that simulate the evolution of DNA sequences in accordance with a user-specified model. This general stochastic model allows for an arbitrary stationary nucleotide composition and any transition-transversion bias during the process of base substitution. In addition, the user may define any hypothetical model tree according to which a family of sequences evolves. The programs suggest the computationally most inexpensive approach to generate nucleotide substitutions. Either reproducible or non-repeatable simulations, depending on the method of initializing the pseudo-random number generator, can be performed. The corresponding options are offered by the interface menu.
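
    The kind of simulation this record describes can be illustrated with a minimal Python sketch. It is not the authors' FORTRAN code: the discrete-step substitution scheme, the nested-tuple tree format, and the names `step_base`, `evolve`, `simulate`, `mu` and `kappa` are all assumptions made for illustration. Each step moves a base i to a different base j with probability mu * freqs[j] * (kappa for a transition, 1 for a transversion), which satisfies detailed balance with respect to `freqs`. Passing a seeded `random.Random` gives a reproducible run; an unseeded one gives a non-repeatable run.

```python
import random

PURINES = {"A", "G"}

def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (a in PURINES) == (b in PURINES)

def step_base(base, freqs, kappa, mu, rng):
    """One discrete step for one site (assumes mu * kappa <= 1)."""
    u = rng.random()
    cum = 0.0
    for b in "ACGT":
        if b == base:
            continue
        cum += mu * freqs[b] * (kappa if is_transition(base, b) else 1.0)
        if u < cum:
            return b
    return base  # no substitution in this step

def evolve(seq, freqs, kappa, mu, steps, rng):
    """Evolve a sequence along one branch of `steps` discrete steps."""
    bases = list(seq)
    for _ in range(steps):
        bases = [step_base(b, freqs, kappa, mu, rng) for b in bases]
    return "".join(bases)

def simulate(tree, seq, freqs, kappa, mu, rng, out=None):
    """tree = (name, branch_steps, [children]); returns {leaf name: sequence}."""
    out = {} if out is None else out
    name, steps, children = tree
    seq = evolve(seq, freqs, kappa, mu, steps, rng)
    if not children:
        out[name] = seq
    for child in children:
        simulate(child, seq, freqs, kappa, mu, rng, out)
    return out

if __name__ == "__main__":
    freqs = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
    tree = ("root", 0, [("left", 50, []), ("right", 50, [])])
    print(simulate(tree, "ACGT" * 10, freqs, 2.0, 0.05, random.Random(7)))
```

    Here `kappa` plays the role of the transition-transversion bias and `freqs` the stationary nucleotide composition; the model tree is supplied by the caller, in the spirit of the user-defined tree mentioned in the abstract.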

  4. Profiling the nucleobase and structure selectivity of anticancer drugs and other DNA alkylating agents by RNA sequencing.

    Science.gov (United States)

    Gillingham, Dennis; Sauter, Basilius

    2018-05-06

    Drugs that covalently modify DNA are components of most chemotherapy regimens, often serving as first-line treatments. Classically the chemical reactivity of DNA alkylators has been determined in vitro with short oligonucleotides. Here we use next generation RNA sequencing to report on the chemoselectivity of alkylating agents. We develop the method with the well-known clinically used DNA modifying drugs streptozotocin and temozolomide, and then apply the technique to profile RNA modification with uncharacterized alkylation reactions such as with powerful electrophiles like trimethylsilyldiazomethane. The multiplexed and massively parallel format of NGS allows analyses of chemical reactivity in nucleic acids to be accomplished in less time with greater statistical power. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. A phylogenetic hypothesis for passerine birds: taxonomic and biogeographic implications of an analysis of nuclear DNA sequence data.

    Science.gov (United States)

    Barker, F Keith; Barrowclough, George F; Groth, Jeff G

    2002-02-07

    Passerine birds comprise over half of avian diversity, but have proved difficult to classify. Despite a long history of work on this group, no comprehensive hypothesis of passerine family-level relationships was available until recent analyses of DNA-DNA hybridization data. Unfortunately, given the value of such a hypothesis in comparative studies of passerine ecology and behaviour, the DNA-hybridization results have not been well tested using independent data and analytical approaches. Therefore, we analysed nucleotide sequence variation at the nuclear RAG-1 and c-mos genes from 69 passerine taxa, including representatives of most currently recognized families. In contradiction to previous DNA-hybridization studies, our analyses suggest paraphyly of suboscine passerines because the suboscine New Zealand wren Acanthisitta was found to be sister to all other passerines. Additionally, we reconstructed the parvorder Corvida as a basal paraphyletic grade within the oscine passerines. Finally, we found strong evidence that several family-level taxa are misplaced in the hybridization results, including the Alaudidae, Irenidae, and Melanocharitidae. The hypothesis of relationships we present here suggests that the oscine passerines arose on the Australian continental plate while it was isolated by oceanic barriers and that a major northern radiation of oscines (i.e. the parvorder Passerida) originated subsequent to dispersal from the south.

  6. DNA Qualification Workflow for Next Generation Sequencing of Histopathological Samples

    Science.gov (United States)

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T.; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  7. DNA qualification workflow for next generation sequencing of histopathological samples.

    Directory of Open Access Journals (Sweden)

    Michele Simbolo

    Full Text Available Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard

  8. DNA fingerprinting of Mycobacterium tuberculosis: from phage typing to whole-genome sequencing.

    Science.gov (United States)

    Schürch, Anita C; van Soolingen, Dick

    2012-06-01

    Current typing methods for Mycobacterium tuberculosis complex evolved from simple phenotypic approaches like phage typing and drug susceptibility profiling to DNA-based strain typing methods, such as IS6110-restriction fragment length polymorphisms (RFLP) and variable number of tandem repeats (VNTR) typing. Examples of the usefulness of molecular typing are source case finding and epidemiological linkage of tuberculosis (TB) cases, international transmission of MDR/XDR-TB, the discrimination between endogenous reactivation and exogenous re-infection as a cause of relapses after curative treatment of tuberculosis, the evidence of multiple M. tuberculosis infections, and the disclosure of laboratory cross-contaminations. Simultaneously, phylogenetic analyses were developed based on single nucleotide polymorphisms (SNPs), genomic deletions usually referred to as regions of difference (RDs) and spoligotyping which served both strain typing and phylogenetic analysis. National and international initiatives that rely on the application of these typing methods have brought significant insight into the molecular epidemiology of tuberculosis. However, current DNA fingerprinting methods have important limitations. They can often not distinguish between genetically closely related strains and the turn-over of these markers is variable. Moreover, the suitability of most DNA typing methods for phylogenetic reconstruction is limited as they show a high propensity of convergent evolution or misinfer genetic distances. In order to fully explore the possibilities of genotyping in the molecular epidemiology of tuberculosis and to study the phylogeny of the causative bacteria reliably, the application of whole-genome sequencing (WGS) analysis for all M. tuberculosis isolates is the optimal, although currently still a costly solution. In the last years WGS for typing of pathogens has been explored and yielded important additional information on strain diversity in comparison to the

  9. True single-molecule DNA sequencing of a pleistocene horse bone

    DEFF Research Database (Denmark)

    Orlando, Ludovic Antoine Alexandre; Ginolhac, Aurélien; Raghavan, Maanasa

    2011-01-01

    -preserved Pleistocene horse bone using the Helicos HeliScope and Illumina GAIIx platforms, respectively. We find that the percentage of endogenous DNA sequences derived from the horse is higher among the Helicos data than Illumina data. This result indicates that the molecular biology tools used to generate sequencing...

  10. Genome-wide DNA methylation maps in follicular lymphoma cells determined by methylation-enriched bisulfite sequencing.

    Directory of Open Access Journals (Sweden)

    Jeong-Hyeon Choi

    Full Text Available BACKGROUND: Follicular lymphoma (FL) is a form of non-Hodgkin's lymphoma (NHL) that arises from germinal center (GC) B-cells. Despite the significant advances in immunotherapy, FL is still not curable. Beyond transcriptional profiling and genomics datasets, there currently is no epigenome-scale dataset or integrative biology approach that can adequately model this disease and therefore identify novel mechanisms and targets for successful prevention and treatment of FL. METHODOLOGY/PRINCIPAL FINDINGS: We performed methylation-enriched genome-wide bisulfite sequencing of FL cells and normal CD19(+) B-cells using 454 sequencing technology. The methylated DNA fragments were enriched with methyl-binding proteins, treated with bisulfite, and sequenced using the Roche-454 GS FLX sequencer. The total number of bases covered in the human genome was 18.2 and 49.3 million including 726,003 and 1.3 million CpGs in FL and CD19(+) B-cells, respectively. 11,971 and 7,882 methylated regions of interest (MRIs) were identified respectively. The genome-wide distribution of these MRIs displayed significant differences between FL and normal B-cells. A reverse trend in the distribution of MRIs between the promoter and the gene body was observed in FL and CD19(+) B-cells. The MRIs identified in FL cells also correlated well with transcriptomic data and ChIP-on-Chip analyses of genome-wide histone modifications such as tri-methyl-H3K27, and tri-methyl-H3K4, indicating a concerted epigenetic alteration in FL cells. CONCLUSIONS/SIGNIFICANCE: This study is the first to provide a large scale and comprehensive analysis of the DNA methylation sequence composition and distribution in the FL epigenome. These integrated approaches have led to the discovery of novel and frequent targets of aberrant epigenetic alterations. The genome-wide bisulfite sequencing approach developed here can be a useful tool for profiling DNA methylation in clinical samples.

  11. A viral metagenomic approach on a non-metagenomic experiment: Mining next generation sequencing datasets from pig DNA identified several porcine parvoviruses for a retrospective evaluation of viral infections.

    Directory of Open Access Journals (Sweden)

    Samuele Bovo

    Full Text Available Shot-gun next generation sequencing (NGS) on whole DNA extracted from specimens collected from mammals often produces reads that are not mapped (i.e. unmapped reads) on the host reference genome and that are usually discarded as by-products of the experiments. In this study, we mined Ion Torrent reads obtained by sequencing DNA isolated from archived blood samples collected from 100 performance tested Italian Large White pigs. Two reduced representation libraries were prepared from two DNA pools constructed each from 50 equimolar DNA samples. Bioinformatic analyses were carried out to mine unmapped reads on the reference pig genome that were obtained from the two NGS datasets. In silico analyses included read mapping and sequence assembly approaches for a viral metagenomic analysis using the NCBI Viral Genome Resource. Our approach identified sequences matching several viruses of the Parvoviridae family: porcine parvovirus 2 (PPV2), PPV4, PPV5 and PPV6 and porcine bocavirus 1-H18 isolate (PBoV1-H18). The presence of these viruses was confirmed by PCR and Sanger sequencing of individual DNA samples. PPV2, PPV4, PPV5, PPV6 and PBoV1-H18 were all identified in samples collected in 1998-2007, 1998-2000, 1997-2000, 1998-2004 and 2003, respectively. For most of these viruses (PPV4, PPV5, PPV6 and PBoV1-H18) previous studies reported their first occurrence much later (from 5 to more than 10 years) than our identification period and in different geographic areas. Our study provided a retrospective evaluation of apparently asymptomatic parvovirus infected pigs providing information that could be important to define occurrence and prevalence of different parvoviruses in South Europe. This study demonstrated the potential of mining NGS datasets not originally derived from metagenomics experiments for viral metagenomics analyses in a livestock species.

  12. Cloning, sequencing, and expression of dnaK-operon proteins from the thermophilic bacterium Thermus thermophilus.

    Science.gov (United States)

    Osipiuk, J; Joachimiak, A

    1997-09-12

    We propose that the dnaK operon of Thermus thermophilus HB8 is composed of three functionally linked genes: dnaK, grpE, and dnaJ. The dnaK and dnaJ gene products are most closely related to their cyanobacterial homologs. The DnaK protein sequence places T. thermophilus in the plastid Hsp70 subfamily. In contrast, the grpE translated sequence is most similar to GrpE from Clostridium acetobutylicum, a Gram-positive anaerobic bacterium. A single promoter region, with homology to the Escherichia coli consensus promoter sequences recognized by the sigma70 and sigma32 transcription factors, precedes the postulated operon. This promoter is heat-shock inducible. The dnaK mRNA level increased more than 30 times upon 10 min of heat shock (from 70 degrees C to 85 degrees C). A strong transcription terminating sequence was found between the dnaK and grpE genes. The individual genes were cloned into pET expression vectors and the thermophilic proteins were overproduced at high levels in E. coli and purified to homogeneity. The recombinant T. thermophilus DnaK protein was shown to have a weak ATP-hydrolytic activity, with an optimum at 90 degrees C. The ATPase was stimulated by the presence of GrpE and DnaJ. Another open reading frame, coding for ClpB heat-shock protein, was found downstream of the dnaK operon.

  13. A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.

    Science.gov (United States)

    Álvarez-Martos, Isabel; Ferapontova, Elena E

    2017-08-05

    A unique specificity of the aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57-nucleotide-long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases with their DNA analogues is not capable of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence does, and, thus, is not an aptamer and cannot be used for either in vivo or in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. An integrated multiple capillary array electrophoresis system for high-throughput DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Lu, X.

    1998-03-27

    A capillary array electrophoresis system was chosen to perform DNA sequencing because of several advantages such as rapid heat dissipation, multiplexing capabilities, gel matrix filling simplicity, and the mature nature of the associated manufacturing technologies. There are two major concerns for the multiple capillary systems. One concern is inter-capillary cross-talk, and the other concern is excitation and detection efficiency. Cross-talk is eliminated through proper optical coupling, good focusing and immersing the capillary array in index matching fluid. A side-entry excitation scheme with orthogonal detection was established for large capillary arrays. Two 100-capillary array formats were used for DNA sequencing. One format is a cylindrical capillary with 150 µm o.d. and 75 µm i.d., and the other format is a square capillary with a 300 µm outer edge and a 75 µm inner edge. This project is focused on the development of excitation and detection of DNA as well as performing DNA sequencing. The DNA injection schemes are discussed for the cases of single and bundled capillaries. An individual sampling device was designed. The base-calling was performed for a capillary from the capillary array with an accuracy of 98%.

  15. High Interlaboratory Reproducibility of DNA Sequence-based Typing of Bacteria in a Multicenter Study

    DEFF Research Database (Denmark)

    Sousa, MA de; Boye, Kit; Lencastre, H de

    2006-01-01

    Current DNA amplification-based typing methods for bacterial pathogens often lack interlaboratory reproducibility. In this international study, DNA sequence-based typing of the Staphylococcus aureus protein A gene (spa, 110 to 422 bp) showed 100% intra- and interlaboratory reproducibility without...... extensive harmonization of protocols for 30 blind-coded S. aureus DNA samples sent to 10 laboratories. Specialized software for automated sequence analysis ensured a common typing nomenclature....

  16. A MapReduce Framework for DNA Sequencing Data Processing

    Directory of Open Access Journals (Sweden)

    Samy Ghoneimy

    2016-12-01

    Full Text Available Genomics and Next Generation Sequencers (NGS) like Illumina HiSeq produce data in the order of 200 billion base pairs in a single one-week run for a 60x human genome coverage, which requires modern high-throughput experimental technologies that can only be tackled with high performance computing (HPC) and specialized software algorithms called “short read aligners”. This paper focuses on the implementation of the DNA sequencing as a set of MapReduce programs that will accept a DNA data set as a FASTQ file and finally generate a VCF (variant call format) file, which has variants for a given DNA data set. In this paper MapReduce/Hadoop along with Burrows-Wheeler Aligner (BWA) and Sequence Alignment/Map (SAM) tools are fully utilized to provide various utilities for manipulating alignments, including sorting, merging, indexing, and generating alignments. The Map-Sort-Reduce process is designed to be suited for a Hadoop framework in which each cluster is a traditional N-node Hadoop cluster to utilize all of the Hadoop features like HDFS, program management and fault tolerance. The Map step performs multiple instances of the short read alignment algorithm (Bowtie) that run in parallel in Hadoop. The ordered list of the sequence reads is used as input tuples and the output tuples are the alignments of the short reads. In the Reduce step many parallel instances of the Short Oligonucleotide Analysis Package for SNP (SOAPsnp) algorithm run in the cluster. Input tuples are sorted alignments for a partition and the output tuples are SNP calls. Results are stored via HDFS, and then archived in SOAPsnp format. The proposed framework enables extremely fast discovery of somatic mutations, inference of population genetic parameters, and association tests performed directly on sequencing data without explicit genotyping or linkage-based imputation. It also demonstrates that this method achieves comparable
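
    The Map-Sort-Reduce flow in this record can be sketched in a few lines of plain Python. This is a toy illustration only: it does not use Hadoop, BWA, Bowtie or SOAPsnp, and the ungapped lowest-mismatch aligner, the majority-vote SNP caller, and the names `map_align`, `reduce_call_snps`, `run` and `min_depth` are all hypothetical stand-ins chosen to show the shape of the input and output tuples.

```python
from collections import Counter, defaultdict

def map_align(read, reference):
    """Map step: emit (position, read) for the lowest-mismatch ungapped placement."""
    best_pos, best_mm = None, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mm = sum(a != b for a, b in zip(read, window))
        if mm < best_mm:  # ties keep the leftmost placement
            best_pos, best_mm = pos, mm
    return best_pos, read

def reduce_call_snps(alignments, reference, min_depth=2):
    """Reduce step: pile up aligned bases and report sites whose
    most common base differs from the reference."""
    pileup = defaultdict(Counter)
    for pos, read in alignments:
        for offset, base in enumerate(read):
            pileup[pos + offset][base] += 1
    calls = []
    for site in sorted(pileup):
        base, count = pileup[site].most_common(1)[0]
        if count >= min_depth and base != reference[site]:
            calls.append((site, reference[site], base))
    return calls

def run(reads, reference):
    mapped = [map_align(r, reference) for r in reads]  # Map
    mapped.sort(key=lambda t: t[0])                    # Sort by position
    return reduce_call_snps(mapped, reference)         # Reduce

if __name__ == "__main__":
    reference = "ACGTACGGTCATTGCA"
    # Three reads that all carry a G->A change at reference position 6.
    reads = ["TACAGTCA", "CAGTCATT", "ACGTACAG"]
    print(run(reads, reference))
```

    In a real deployment each `map_align` call would be an independent mapper task and each genomic partition would get its own reducer; here everything runs in one process so that the data flow is easy to follow.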

  17. Effect of ionic strength and cationic DNA affinity binders on the DNA sequence selective alkylation of guanine N7-positions by nitrogen mustards

    International Nuclear Information System (INIS)

    Hartley, J.A.; Forrow, S.M.; Souhami, R.L.

    1990-01-01

    Large variations in alkylation intensities exist among guanines in a DNA sequence following treatment with chemotherapeutic alkylating agents such as nitrogen mustards, and the substituent attached to the reactive group can impose a distinct sequence preference for reaction. In order to understand further the structural and electrostatic factors which determine the sequence selectivity of alkylation reactions, the effect of increased ionic strength, the intercalator ethidium bromide, AT-specific minor groove binders distamycin A and netropsin, and the polyamine spermine on guanine N7-alkylation by L-phenylalanine mustard (L-Pam), uracil mustard (UM), and quinacrine mustard (QM) was investigated with a modification of the guanine-specific chemical cleavage technique for DNA sequencing. The result differed with both the nitrogen mustard and the cationic agent used. The effect, which resulted in both enhancement and suppression of alkylation sites, was most striking in the case of netropsin and distamycin A, which differed from each other. DNA footprinting indicated that selective binding to AT sequences in the minor groove of DNA can have long-range effects on the alkylation pattern of DNA in the major groove.

  18. Cloning and sequence analysis of cDNA coding for rat nucleolar protein C23

    International Nuclear Information System (INIS)

    Ghaffari, S.H.; Olson, M.O.J.

    1986-01-01

    Using synthetic oligonucleotides as primers and probes, the authors have isolated and sequenced cDNA clones encoding protein C23, a putative nucleolus organizer protein. Poly(A+) RNA was isolated from rat Novikoff hepatoma cells and enriched in C23 mRNA by sucrose density gradient ultracentrifugation. Two deoxyoligonucleotides, a 48- and a 27-mer, were synthesized on the basis of amino acid sequence from the C-terminal half of protein C23 and cDNA sequence data from CHO cell protein. The 48-mer was used as a primer for synthesis of cDNA, which was then inserted into plasmid pUC9. Transformed bacterial colonies were screened by hybridization with 32P-labeled 27-mer. Two clones among 5000 gave a strong positive signal. Plasmid DNAs from these clones were purified and characterized by blotting and nucleotide sequence analysis. The length of C23 mRNA was estimated to be 3200 bases in a northern blot analysis. The sequence of a 267 b.p. insert shows high homology with the CHO cDNA with only 9 nucleotide differences and an identical amino acid sequence. These studies indicate that this region of the protein is highly conserved.

  19. Reconstruction of DNA sequences using genetic algorithms and cellular automata: towards mutation prediction?

    Science.gov (United States)

    Mizas, Ch; Sirakoulis, G Ch; Mardiris, V; Karafyllidis, I; Glykos, N; Sandaltzopoulos, R

    2008-04-01

    Change of DNA sequence that fuels evolution is, to a certain extent, a deterministic process because mutagenesis does not occur in an absolutely random manner. So far, it has not been possible to decipher the rules that govern DNA sequence evolution due to the extreme complexity of the entire process. In our attempt to approach this issue we focus solely on the mechanisms of mutagenesis and deliberately disregard the role of natural selection. Hence, in this analysis, evolution refers to the accumulation of genetic alterations that originate from mutations and are transmitted through generations without being subjected to natural selection. We have developed a software tool that allows modelling of a DNA sequence as a one-dimensional cellular automaton (CA) with four states per cell which correspond to the four DNA bases, i.e. A, C, T and G. The four states are represented by numbers of the quaternary number system. Moreover, we have developed genetic algorithms (GAs) in order to determine the rules of CA evolution that simulate the DNA evolution process. Linear evolution rules were considered and square matrices were used to represent them. If DNA sequences of different evolution steps are available, our approach allows the determination of the underlying evolution rule(s). Conversely, once the evolution rules are deciphered, our tool may reconstruct the DNA sequence in any previous evolution step for which the exact sequence information was unknown. The developed tool may be used to test various parameters that could influence evolution. We describe a paradigm relying on the assumption that mutagenesis is governed by a near-neighbour-dependent mechanism. Based on the satisfactory performance of our system in the deliberately simplified example, we propose that our approach could offer a starting point for future attempts to understand the mechanisms that govern evolution. The developed software is open-source and has a user-friendly graphical input interface.
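
    The cellular-automaton representation in this record can be made concrete with a short Python sketch. It is not the authors' software: the base-to-digit mapping (taken from the order "A, C, T and G" in the abstract), arithmetic modulo 4, the periodic boundary, and the names `rule_matrix`, `step` and `evolve` are assumptions made for illustration. A linear near-neighbour rule is written as a square matrix M, and one evolution step is the matrix-vector product M·s reduced modulo 4.

```python
BASES = "ACTG"  # assumed quaternary digits 0..3 for A, C, T, G

def encode(seq):
    """DNA string -> list of quaternary states."""
    return [BASES.index(b) for b in seq]

def decode(states):
    """List of quaternary states -> DNA string."""
    return "".join(BASES[s] for s in states)

def rule_matrix(n, left, center, right):
    """n x n matrix of a linear near-neighbour rule with periodic boundary:
    new[i] = left*s[i-1] + center*s[i] + right*s[i+1]  (mod 4)."""
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        m[i][(i - 1) % n] += left
        m[i][i] += center
        m[i][(i + 1) % n] += right
    return m

def step(m, states):
    """One CA step: matrix-vector product reduced modulo 4."""
    return [sum(a * s for a, s in zip(row, states)) % 4 for row in m]

def evolve(seq, m, steps):
    """Apply the rule matrix `steps` times to a DNA string."""
    states = encode(seq)
    for _ in range(steps):
        states = step(m, states)
    return decode(states)

if __name__ == "__main__":
    m = rule_matrix(4, 1, 1, 0)  # each cell adds its left neighbour to itself
    print(evolve("ACTG", m, 1))  # GCGC
```

    Searching for the rule, as the genetic algorithms in the abstract do, would then amount to searching over candidate matrices (here just the three coefficients) until `evolve` reproduces a known later sequence from a known earlier one.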

  20. Transcriptional blockages in a cell-free system by sequence-selective DNA alkylating agents.

    Science.gov (United States)

    Ferguson, L R; Liu, A P; Denny, W A; Cullinane, C; Talarico, T; Phillips, D R

    2000-04-14

    There is considerable interest in DNA sequence-selective DNA-binding drugs as potential inhibitors of gene expression. Five compounds with distinctly different base pair specificities were compared in their effects on the formation and elongation of the transcription complex from the lac UV5 promoter in a cell-free system. All were tested at drug levels which killed 90% of cells in a clonogenic survival assay. Cisplatin, a selective alkylator at purine residues, inhibited transcription, decreasing the full-length transcript, and causing blockage at a number of GG or AG sequences, making it probable that intrastrand crosslinks are the blocking lesions. A cyclopropylindoline known to be an A-specific alkylator also inhibited transcription, with blocks at adenines. The aniline mustard chlorambucil, which targets primarily G but also A sequences, was also effective in blocking the formation of full-length transcripts. It produced transcription blocks either at, or one base prior to, AA or GG sequences, suggesting that intrastrand crosslinks could again be involved. The non-alkylating DNA minor groove binder Hoechst 33342 (a bisbenzimidazole) blocked formation of the full-length transcript, but without creating specific blockage sites. A bisbenzimidazole-linked aniline mustard analogue was a more effective transcription inhibitor than either chlorambucil or Hoechst 33342, with different blockage sites occurring immediately as compared with 2 h after incubation. The blockages were either immediately prior to AA or GG residues, or four to five base pairs prior to such sites, a pattern not predicted from in vitro DNA-binding studies. Minor groove DNA-binding ligands are of particular interest as inhibitors of gene expression, since they have the potential ability to bind selectively to long sequences of DNA. The results suggest that the bisbenzimidazole-linked mustard does cause alkylation and transcription blockage at novel DNA sites, in addition to sites characteristic of ...

  1. Reduced Representation Libraries from DNA Pools Analysed with Next Generation Semiconductor Based-Sequencing to Identify SNPs in Extreme and Divergent Pigs for Back Fat Thickness

    Directory of Open Access Journals (Sweden)

    Samuele Bovo

    2015-01-01

    The aim of this study was to identify single nucleotide polymorphisms (SNPs) that could be associated with back fat thickness (BFT) in pigs. To achieve this goal, we evaluated the potential and limits of an experimental design that combined several methodologies. DNA samples from two groups of Italian Large White pigs with divergent estimated breeding values (EBVs) for BFT were separately pooled and sequenced, after preparation of reduced representation libraries (RRLs), on the Ion Torrent technology. Using SNAPE for SNP calling in the sequenced DNA pools, 39,165 SNPs were identified; one quarter of them were novel variants not reported in dbSNP. Combining sequencing data with Illumina PorcineSNP60 BeadChip genotyping results on the same animals, 661 genomic positions overlapped with a good approximation of minor allele frequency estimation. A total of 54 SNPs showing enriched alleles in one or the other RRL might be potential markers associated with BFT. Some of these SNPs were close to genes involved in obesity-related phenotypes.

  2. Interspecies hybridization on DNA resequencing microarrays: efficiency of sequence recovery and accuracy of SNP detection in human, ape, and codfish mitochondrial DNA genomes sequenced on a human-specific MitoChip

    Directory of Open Access Journals (Sweden)

    Carr Steven M

    2007-09-01

    Background: Iterative DNA "resequencing" on oligonucleotide microarrays offers a high-throughput method to measure intraspecific biodiversity, one that is especially suited to SNP-dense gene regions such as vertebrate mitochondrial (mtDNA) genomes. However, costs of single-species design and microarray fabrication are prohibitive. A cost-effective, multi-species strategy is to hybridize experimental DNAs from diverse species to a common microarray that is tiled with oligonucleotide sets from multiple, homologous reference genomes. Such a strategy requires that cross-hybridization between the experimental DNAs and reference oligos from the different species not interfere with the accurate recovery of species-specific data. To determine the pattern and limits of such interspecific hybridization, we compared the efficiency of sequence recovery and accuracy of SNP identification by a 15,452-base human-specific microarray challenged with human, chimpanzee, gorilla, and codfish mtDNA genomes. Results: In the human genome, 99.67% of the sequence was recovered with 100.0% accuracy. Accuracy of SNP identification declines log-linearly with sequence divergence from the reference, from 0.067 to 0.247 errors per SNP in the chimpanzee and gorilla genomes, respectively. Efficiency of sequence recovery declines as the number of interspecific SNPs in the 25b interval tiled by the reference oligonucleotides increases. In the gorilla genome, which differs from the human reference by 10%, and in which 46% of these 25b regions contain 3 or more SNP differences from the reference, only 88% of the sequence is recoverable. In the codfish genome, which differs from the reference by >30%, less than 4% of the sequence is recoverable, in short islands ≥12b that are conserved between primates and fish. Conclusion: Experimental DNAs bind inefficiently to homologous reference oligonucleotide sets on a re-sequencing microarray when their sequences differ by

  3. High-fidelity target sequencing of individual molecules identified using barcode sequences: de novo detection and absolute quantitation of mutations in plasma cell-free DNA from cancer patients.

    Science.gov (United States)

    Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya

    2015-08-01

    Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations has been absolutely quantitated. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  4. G-quadruplex and G-rich sequence stimulate Pif1p-catalyzed downstream duplex DNA unwinding through reducing waiting time at ss/dsDNA junction

    Science.gov (United States)

    Zhang, Bo; Wu, Wen-Qiang; Liu, Na-Nv; Duan, Xiao-Lei; Li, Ming; Dou, Shuo-Xing; Hou, Xi-Miao; Xi, Xu-Guang

    2016-01-01

    Alternative DNA structures that deviate from B-form double-stranded DNA such as G-quadruplex (G4) DNA can be formed by G-rich sequences that are widely distributed throughout the human genome. We have previously shown that Pif1p not only unfolds G4, but also unwinds the downstream duplex DNA in a G4-stimulated manner. In the present study, we further characterized the G4-stimulated duplex DNA unwinding phenomenon by means of single-molecule fluorescence resonance energy transfer. It was found that Pif1p did not unwind the partial duplex DNA immediately after unfolding the upstream G4 structure, but rather, it would dwell at the ss/dsDNA junction with a ‘waiting time’. Further studies revealed that the waiting time was in fact related to a protein dimerization process that was sensitive to ssDNA sequence and would become rapid if the sequence is G-rich. Furthermore, we identified that the G-rich sequence, as the G4 structure, equally stimulates duplex DNA unwinding. The present work sheds new light on the molecular mechanism by which G4-unwinding helicase Pif1p resolves physiological G4/duplex DNA structures in cells. PMID:27471032

  5. Development of taxon-specific sequence characterized amplified region (SCAR) markers based on actin sequences and DNA amplification fingerprinting (DAF): a case study in the Phoma exigua species complex.

    Science.gov (United States)

    Aveskamp, Maikel M; Woudenberg, Joyce H C; de Gruyter, Johannes; Turco, Elena; Groenewald, Johannes Z; Crous, Pedro W

    2009-05-01

    Phoma exigua is considered to be an assemblage of at least nine varieties that are mainly distinguished on the basis of host specificity and pathogenicity. However, these varieties are also reported to be weak pathogens and secondary invaders on non-host tissue. In practice, it is difficult to distinguish P. exigua from its close relatives and to correctly identify isolates up to the variety level, because of their low genetic variation and high morphological similarity. Because of quarantine issues and phytosanitary measures, a robust DNA-based tool is required for accurate and rapid identification of the separate taxa in this species complex. The present study therefore aims to develop such a tool based on unique nucleotide sequence identifiers. More than 60 strains of P. exigua and related species were compared in terms of partial actin gene sequences, or analysed using DNA amplification fingerprinting (DAF) with short, arbitrary, mini-hairpin primers. Fragments in the fingerprint unique to a single taxon were identified, purified and sequenced. Alignment of the sequence data and subsequent primer trials led to the identification of taxon-specific sequence characterized amplified regions (SCARs), and to a set of specific oligonucleotide combinations that can be used to identify these organisms in plant quarantine inspections.

  6. Evaluation of DNA bending models in their capacity to predict electrophoretic migration anomalies of satellite DNA sequences.

    Science.gov (United States)

    Matyášek, Roman; Fulneček, Jaroslav; Kovařík, Aleš

    2013-09-01

    DNA containing a sequence that generates a local curvature exhibits a pronounced retardation in electrophoretic mobility. Various theoretical models have been proposed to explain the relationship between DNA structural features and migration anomaly. Here, we studied the capacity of 15 static wedge-bending models to predict the electrophoretic behavior of 69 satellite monomers derived from four divergent families. All monomers exhibited retarded mobility in PAGE, corresponding to retardation factors ranging from 1.02 to 1.54. The curvature varied both within and across the groups and correlated with the number, position, and lengths of A-tracts. Two dinucleotide models provided strong correlation between gel mobility and curvature prediction; two trinucleotide models were satisfactory, while the remaining dinucleotide models provided intermediate results with reliable prediction for subsets of sequences only. In some cases, similarly shaped molecules exhibited relatively large differences in mobility and vice versa. Generally less accurate predictions were obtained in groups containing less homogeneous sequences possessing distinct structural features. In conclusion, relatively universal theoretical models were identified that are suitable for the analysis of natural sequences known to harbor relatively moderate curvature. These models could potentially be applied to genome-wide studies. However, in silico predictions should be viewed in the context of experimental measurement of intrinsic DNA curvature. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Poincaré recurrences of DNA sequences

    Science.gov (United States)

    Frahm, K. M.; Shepelyansky, D. L.

    2012-01-01

    We analyze the statistical properties of Poincaré recurrences of Homo sapiens, mammalian, and other DNA sequences taken from the Ensembl Genome database with up to 15 billion base pairs. We show that the probability of Poincaré recurrences decays in an algebraic way with the Poincaré exponent β≈4 even if the oscillatory dependence is well pronounced. The correlations between recurrences decay with an exponent ν≈0.6 that leads to an anomalous superdiffusive walk. However, for Homo sapiens sequences, with the largest available statistics, the diffusion coefficient converges to a finite value on distances larger than one million base pairs. We argue that the approach based on Poincaré recurrences determines new proximity features between different species and sheds new light on their evolutionary history.

  8. Brownian dynamics simulations of sequence-dependent duplex denaturation in dynamically superhelical DNA

    Science.gov (United States)

    Mielke, Steven P.; Grønbech-Jensen, Niels; Krishnan, V. V.; Fink, William H.; Benham, Craig J.

    2005-09-01

    The topological state of DNA in vivo is dynamically regulated by a number of processes that involve interactions with bound proteins. In one such process, the tracking of RNA polymerase along the double helix during transcription, restriction of rotational motion of the polymerase and associated structures, generates waves of overtwist downstream and undertwist upstream from the site of transcription. The resulting superhelical stress is often sufficient to drive double-stranded DNA into a denatured state at locations such as promoters and origins of replication, where sequence-specific duplex opening is a prerequisite for biological function. In this way, transcription and other events that actively supercoil the DNA provide a mechanism for dynamically coupling genetic activity with regulatory and other cellular processes. Although computer modeling has provided insight into the equilibrium dynamics of DNA supercoiling, to date no model has appeared for simulating sequence-dependent DNA strand separation under the nonequilibrium conditions imposed by the dynamic introduction of torsional stress. Here, we introduce such a model and present results from an initial set of computer simulations in which the sequences of dynamically superhelical, 147 base pair DNA circles were systematically altered in order to probe the accuracy with which the model can predict location, extent, and time of stress-induced duplex denaturation. The results agree both with well-tested statistical mechanical calculations and with available experimental information. Additionally, we find that sites susceptible to denaturation show a propensity for localizing to supercoil apices, suggesting that base sequence determines locations of strand separation not only through the energetics of interstrand interactions, but also by influencing the geometry of supercoiling.

  9. Genetic alterations of hepatocellular carcinoma by random amplified polymorphic DNA analysis and cloning sequencing of tumor differential DNA fragment

    Science.gov (United States)

    Xian, Zhi-Hong; Cong, Wen-Ming; Zhang, Shu-Hui; Wu, Meng-Chao

    2005-01-01

    AIM: To study the genetic alterations and their association with clinicopathological characteristics of hepatocellular carcinoma (HCC), and to find the tumor related DNA fragments. METHODS: DNA isolated from tumors and corresponding noncancerous liver tissues of 56 HCC patients was amplified by random amplified polymorphic DNA (RAPD) with 10 random 10-mer arbitrary primers. The RAPD bands showing obvious differences in tumor tissue DNA relative to that of normal tissue were separated, purified, cloned and sequenced. DNA sequences were analyzed and compared with GenBank data. RESULTS: A total of 56 cases of HCC were demonstrated to have genetic alterations, which were detected by at least one primer. The detectability of genetic alterations ranged from 20% to 70% in each case, and 17.9% to 50% in each primer. Serum HBV infection, tumor size, histological grade, tumor capsule, as well as tumor intrahepatic metastasis, might be correlated with genetic alterations on certain primers. A band of amplified fragments of approximately 480 bp, with higher intensity in tumor DNA relative to normal DNA, could be seen in 27 of 56 tumor samples using primer 4. Sequence analysis of these fragments showed 91% homology with Homo sapiens double homeobox protein DUX10 gene. CONCLUSION: Genetic alterations are a frequent event in HCC, and tumor related DNA fragments have been found in this study, which may be associated with hepatocarcinogenesis. RAPD is an effective method for the identification and analysis of genetic alterations in HCC, and may provide new information for further evaluating the molecular mechanism of hepatocarcinogenesis. PMID:15996039

  10. A statistical model for investigating binding probabilities of DNA nucleotide sequences using microarrays.

    Science.gov (United States)

    Lee, Mei-Ling Ting; Bulyk, Martha L; Whitmore, G A; Church, George M

    2002-12-01

    There is considerable scientific interest in knowing the probability that a site-specific transcription factor will bind to a given DNA sequence. Microarray methods provide an effective means for assessing the binding affinities of a large number of DNA sequences as demonstrated by Bulyk et al. (2001, Proceedings of the National Academy of Sciences, USA 98, 7158-7163) in their study of the DNA-binding specificities of Zif268 zinc fingers using microarray technology. In a follow-up investigation, Bulyk, Johnson, and Church (2002, Nucleic Acids Research 30, 1255-1261) studied the interdependence of nucleotides on the binding affinities of transcription proteins. Our article is motivated by this pair of studies. We present a general statistical methodology for analyzing microarray intensity measurements reflecting DNA-protein interactions. The log probability of a protein binding to a DNA sequence on an array is modeled using a linear ANOVA model. This model is convenient because it employs familiar statistical concepts and procedures and also because it is effective for investigating the probability structure of the binding mechanism.

  11. Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine

    Science.gov (United States)

    Craig S Echt; Surya Saha; Dennis L Deemer; C Dana Nelson

    2011-01-01

    Genomic DNA sequence databases are a potential and growing resource for simple sequence repeat (SSR) marker development in loblolly pine (Pinus taeda L.). Loblolly pine also has many expressed sequence tags (ESTs) available for microsatellite (SSR) marker development. We compared loblolly pine SSR densities in genome survey sequences (GSSs) to those in non-redundant...
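    As an illustration of what an SSR density comparison involves, the sketch below locates perfect microsatellite repeats in a sequence with a regular expression. It is a minimal example under stated assumptions, not the pipeline used in the study: the motif length range (2 to 6 bases), the minimum of four repeats, and the function name `find_ssrs` are all hypothetical choices.

```python
import re


def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Thresholds are illustrative assumptions, not values from the study.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-base motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. ATAT, AA).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GGC" + "AT" * 5 + "TT" + "AGC" * 4 + "TT"  # made-up test sequence
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

    Dividing the number of hits by the total sequence length gives a simple density figure that could be compared between two sequence collections.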

  12. Phylogenetic relationships and timing of diversification in gonorynchiform fishes inferred using nuclear gene DNA sequences (Teleostei: Ostariophysi).

    Science.gov (United States)

    Near, Thomas J; Dornburg, Alex; Friedman, Matt

    2014-11-01

    The Gonorynchiformes are the sister lineage of the species-rich Otophysi and provide important insights into the diversification of ostariophysan fishes. Phylogenies of gonorynchiforms inferred using morphological characters and mtDNA gene sequences provide differing resolutions with regard to the sister lineage of all other gonorynchiforms (Chanos vs. Gonorynchus) and support for monophyly of the two miniaturized lineages Cromeria and Grasseichthys. In this study the phylogeny and divergence times of gonorynchiforms are investigated with DNA sequences sampled from nine nuclear genes and a published morphological character matrix. Bayesian phylogenetic analyses reveal substantial congruence among individual gene trees with inferences from eight genes placing Gonorynchus as the sister lineage to all other gonorynchiforms. Seven gene trees resolve Cromeria and Grasseichthys as a clade, supporting previous inferences using morphological characters. Phylogenies resulting from either concatenating the nuclear genes, performing a multispecies coalescent species tree analysis, or combining the morphological and nuclear gene DNA sequences resolve Gonorynchus as the living sister lineage of all other gonorynchiforms, strongly support the monophyly of Cromeria and Grasseichthys, and resolve a clade containing Parakneria, Cromeria, and Grasseichthys. The morphological dataset, which includes 13 gonorynchiform fossil taxa that range in age from Early Cretaceous to Eocene, was analyzed in combination with DNA sequences from the nine nuclear genes and a relaxed molecular clock to estimate times of evolutionary divergence. This "tip dating" strategy accommodates uncertainty in the phylogenetic resolution of fossil taxa that provide calibration information in the relaxed molecular clock analysis. The estimated age of the most recent common ancestor (MRCA) of living gonorynchiforms is slightly older than estimates from previous node dating efforts, but the molecular tip dating

  13. Genome dynamics of short oligonucleotides: the example of bacterial DNA uptake enhancing sequences.

    Directory of Open Access Journals (Sweden)

    Mohammed Bakkali

    Among the many bacteria naturally competent for transformation by DNA uptake (a phenomenon with significant clinical and financial implications), Pasteurellaceae and Neisseriaceae species preferentially take up DNA containing specific short sequences. The genomic overrepresentation of these DNA uptake enhancing sequences (DUES) causes preferential uptake of conspecific DNA, but the function(s) behind this overrepresentation and its evolution are still a matter for discovery. Here I analyze DUES genome dynamics and evolution and test the validity of the results for other selectively constrained oligonucleotides. I use statistical methods and computer simulations to examine DUES accumulation in Haemophilus influenzae and Neisseria gonorrhoeae genomes. I analyze DUES sequence and nucleotide frequencies, as well as those of all their mismatched forms, and prove the dependence of DUES genomic overrepresentation on their preferential uptake by quantifying and correlating both characteristics. I then argue that mutation, uptake bias, and weak selection against DUESs in less constrained parts of the genome are, combined, sufficient to cause DUES accumulation in susceptible parts of the genome with no need for other DUES function. The distribution of overrepresentation values across sequences with different mismatch loads compared to the DUES suggests a gradual yet not linear molecular drive of DNA sequences depending on their similarity to the DUES. Other genomically overrepresented sequences, both pro- and eukaryotic, show a similar distribution of frequencies, suggesting that the molecular drive reported above applies to other frequent oligonucleotides. Rare oligonucleotides, however, seem to be gradually drawn to genomic underrepresentation, thus suggesting a molecular drag. To my knowledge this work provides the first clear evidence of the gradual evolution of selectively constrained oligonucleotides, including repeated, palindromic and protein

  14. A DNA sequence element that advances replication origin activation time in Saccharomyces cerevisiae.

    Science.gov (United States)

    Pohl, Thomas J; Kolor, Katherine; Fangman, Walton L; Brewer, Bonita J; Raghuraman, M K

    2013-11-06

    Eukaryotic origins of DNA replication undergo activation at various times in S-phase, allowing the genome to be duplicated in a temporally staggered fashion. In the budding yeast Saccharomyces cerevisiae, the activation times of individual origins are not intrinsic to those origins but are instead governed by surrounding sequences. Currently, there are two examples of DNA sequences that are known to advance origin activation time, centromeres and forkhead transcription factor binding sites. By combining deletion and linker scanning mutational analysis with two-dimensional gel electrophoresis to measure fork direction in the context of a two-origin plasmid, we have identified and characterized a 19- to 23-bp and a larger 584-bp DNA sequence that are capable of advancing origin activation time.

  15. RNA-DNA sequence differences spell genetic code ambiguities

    DEFF Research Database (Denmark)

    Bentin, Thomas; Nielsen, Michael L

    2013-01-01

    A recent paper in Science by Li et al. 2011(1) reports widespread sequence differences in the human transcriptome between RNAs and their encoding genes termed RNA-DNA differences (RDDs). The findings could add a new layer of complexity to gene expression but the study has been criticized. ...

  16. Phylogeny and biogeography of Alyssum (Brassicaceae) based on nuclear ribosomal ITS DNA sequences

    Indian Academy of Sciences (India)

    Yan Li; Yan Kong; Zhe Zhang; Yanqiang Yin; Bin Liu; Guanghui Lv; Xiyong Wang

    Journal of Genetics, Volume 93, Issue 2, August 2014, pp 313-323. Research Article ...

  17. Cloning, sequencing and expression of cDNA encoding growth ...

    Indian Academy of Sciences (India)

    Unknown

    of medicine, animal husbandry, fish farming and animal ..... northern pike (Esox lucius) growth hormone; Mol. Mar. Biol. ... prolactin 1-luciferase fusion gene in African catfish and ... 1988 Cloning and sequencing of cDNA that encodes goat.

  18. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  19. Human tissue factor: cDNA sequence and chromosome localization of the gene

    International Nuclear Information System (INIS)

    Scarpati, E.M.; Wen, D.; Broze, G.J. Jr.; Miletich, J.P.; Flandermeyer, R.R.; Siegel, N.R.; Sadler, J.E.

    1987-01-01

    A human placenta cDNA library in λgt11 was screened for the expression of tissue factor antigens with rabbit polyclonal anti-human tissue factor immunoglobulin G. Among 4 million recombinant clones screened, one positive, λHTF8, expressed a protein that shared epitopes with authentic human brain tissue factor. The 1.1-kilobase cDNA insert of λHTF8 encoded a peptide that contained the amino-terminal protein sequence of human brain tissue factor. Northern blotting identified a major mRNA species of 2.2 kilobases and a minor species of ∼3.2 kilobases in poly(A)+ RNA of placenta. Only 2.2-kilobase mRNA was detected in human brain and in the human monocytic U937 cell line. In U937 cells, the quantity of tissue factor mRNA was increased several fold by exposure of the cells to phorbol 12-myristate 13-acetate. Additional cDNA clones were selected by hybridization with the cDNA insert of λHTF8. These overlapping isolates span 2177 base pairs of the tissue factor cDNA sequence that includes a 5'-noncoding region of 75 base pairs, an open reading frame of 885 base pairs, a stop codon, a 3'-noncoding region of 1141 base pairs, and a poly(A) tail. The open reading frame encodes a 33-kilodalton protein of 295 amino acids. The predicted sequence includes a signal peptide of 32 or 34 amino acids, a probable extracellular factor VII binding domain of 217 or 219 amino acids, a transmembrane segment of 23 amino acids, and a cytoplasmic tail of 21 amino acids. There are three potential glycosylation sites with the sequence Asn-X-Thr/Ser. The 3'-noncoding region contains an inverted Alu family repetitive sequence. The tissue factor gene was localized to chromosome 1 by hybridization of the cDNA insert of λHTF8 to flow-sorted human chromosomes.

  20. Next Generation Sequencing-Based Analysis of Repetitive DNA in the Model Dioecious Plant Silene latifolia

    Czech Academy of Sciences Publication Activity Database

    Macas, Jiří; Kejnovský, Eduard; Neumann, Pavel; Novák, Petr; Koblížková, Andrea; Vyskot, Boris

    2011-01-01

    Vol. 6, No. 11 (2011), e27335. E-ISSN 1932-6203. R&D Projects: GA MŠk(CZ) OC10037; GA MŠk(CZ) LC06004; GA MŠk(CZ) LH11058; GA ČR(CZ) GAP501/10/0102; GA ČR(CZ) GAP305/10/0930. Institutional research plan: CEZ:AV0Z50510513; CEZ:AV0Z50040702. Keywords: Plant genome; Sequencing-Based Analyses; Repetitive DNA; Silene latifolia. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 4.092 (2011).

  1. HLA class I sequence-based typing using DNA recovered from frozen plasma.

    Science.gov (United States)

    Cotton, Laura A; Abdur Rahman, Manal; Ng, Carmond; Le, Anh Q; Milloy, M-J; Mo, Theresa; Brumme, Zabrina L

    2012-08-31

    We describe a rapid, reliable and cost-effective method for intermediate-to-high-resolution sequence-based HLA class I typing using frozen plasma as a source of genomic DNA. The plasma samples investigated had a median age of 8.5 years. Total nucleic acids were isolated from matched frozen PBMC (~2.5 million) and plasma (500 μl) samples from a panel of 25 individuals using commercial silica-based kits. Extractions yielded median [IQR] nucleic acid concentrations of 85.7 [47.0-130.0]ng/μl and 2.2 [1.7-2.6]ng/μl from PBMC and plasma, respectively. Following extraction, ~1000 base pair regions spanning exons 2 and 3 of HLA-A, -B and -C were amplified independently via nested PCR using universal, locus-specific primers and sequenced directly. Chromatogram analysis was performed using commercial DNA sequence analysis software and allele interpretation was performed using a free web-based tool. HLA-A, -B and -C amplification rates were 100% and chromatograms were of uniformly high quality with clearly distinguishable mixed bases regardless of DNA source. Concordance between PBMC and plasma-derived HLA types was 100% at the allele and protein levels. At the nucleotide level, a single partially discordant base (resulting from a failure to call both peaks in a mixed base) was observed out of >46,975 bases sequenced (>99.9% concordance). This protocol has previously been used to perform HLA class I typing from a variety of genomic DNA sources including PBMC, whole blood, granulocyte pellets and serum, from specimens up to 30 years old. This method provides comparable specificity to conventional sequence-based approaches and could be applied in situations where cell samples are unavailable or DNA quantities are limiting. Copyright © 2012 Elsevier B.V. All rights reserved.

  2. Compilation and analysis of Escherichia coli promoter DNA sequences.

    OpenAIRE

    Hawley, D K; McClure, W R

    1983-01-01

    The DNA sequences of 168 promoter regions (-50 to +10) for Escherichia coli RNA polymerase were compiled. The complete listing was divided into two groups depending upon whether or not the promoter had been defined by genetic (promoter mutations) or biochemical (5' end determination) criteria. A consensus promoter sequence based on homologies among 112 well-defined promoters was determined that was in substantial agreement with previous compilations. In addition, we have tabulated 98 promoter ...

  3. Rational design of DNA sequences for nanotechnology, microarrays and molecular computers using Eulerian graphs.

    Science.gov (United States)

    Pancoska, Petr; Moravek, Zdenek; Moll, Ute M

    2004-01-01

    Nucleic acids are molecules of choice for both established and emerging nanoscale technologies. These technologies benefit from large functional densities of 'DNA processing elements' that can be readily manufactured. To achieve the desired functionality, polynucleotide sequences are currently designed by a process that involves tedious and laborious filtering of potential candidates against a series of requirements and parameters. Here, we present a complete novel methodology for the rapid rational design of large sets of DNA sequences. This method allows for the direct implementation of very complex and detailed requirements for the generated sequences, thus avoiding 'brute force' filtering. At the same time, these sequences have narrow distributions of melting temperatures. The molecular part of the design process can be done without computer assistance, using an efficient 'human engineering' approach by drawing a single blueprint graph that represents all generated sequences. Moreover, the method eliminates the necessity for extensive thermodynamic calculations. Melting temperature can be calculated only once (or not at all). In addition, the isostability of the sequences is independent of the selection of a particular set of thermodynamic parameters. Applications are presented for DNA sequence designs for microarrays, universal microarray zip sequences and electron transfer experiments.

  4. Rapid detection and purification of sequence specific DNA binding proteins using magnetic separation

    Directory of Open Access Journals (Sweden)

    TIJANA SAVIC

    2006-02-01

    In this paper, a method for the rapid identification and purification of sequence-specific DNA binding proteins based on magnetic separation is presented. This method was applied to confirm the binding of the human recombinant USF1 protein to its putative binding site (E-box) within the human SOX3 promoter. It has been shown that biotinylated DNA attached to streptavidin magnetic particles specifically binds the USF1 protein in the presence of competitor DNA. It has also been demonstrated that the protein could be successfully eluted from the beads, in high yield and with restored DNA binding activity. The advantage of these procedures is that they could be applied for the identification and purification of any high-affinity sequence-specific DNA binding protein with only minor modifications.

  5. Quantum Point Contact Single-Nucleotide Conductance for DNA and RNA Sequence Identification.

    Science.gov (United States)

    Afsari, Sepideh; Korshoj, Lee E; Abel, Gary R; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant

    2017-11-28

    Several nanoscale electronic methods have been proposed for high-throughput single-molecule nucleic acid sequence identification. While many studies display a large ensemble of measurements as "electronic fingerprints" with some promise for distinguishing the DNA and RNA nucleobases (adenine, guanine, cytosine, thymine, and uracil), important metrics such as accuracy and confidence of base calling fall well below the current genomic methods. Issues such as unreliable metal-molecule junction formation, variation of nucleotide conformations, insufficient differences between the molecular orbitals responsible for single-nucleotide conduction, and lack of rigorous base calling algorithms lead to overlapping nanoelectronic measurements and poor nucleotide discrimination, especially at low coverage on single molecules. Here, we demonstrate a technique for reproducible conductance measurements on conformation-constrained single nucleotides and an advanced algorithmic approach for distinguishing the nucleobases. Our quantum point contact single-nucleotide conductance sequencing (QPICS) method uses combed and electrostatically bound single DNA and RNA nucleotides on a self-assembled monolayer of cysteamine molecules. We demonstrate that by varying the applied bias and pH conditions, molecular conductance can be switched ON and OFF, leading to reversible nucleotide perturbation for electronic recognition (NPER). We utilize NPER as a method to achieve >99.7% accuracy for DNA and RNA base calling at low molecular coverage (∼12×) using unbiased single measurements on DNA/RNA nucleotides, which represents a significant advance compared to existing sequencing methods. These results demonstrate the potential for utilizing simple surface modifications and existing biochemical moieties in individual nucleobases for a reliable, direct, single-molecule, nanoelectronic DNA and RNA nucleotide identification method for sequencing.

  6. The DNA sequence and biology of human chromosome 19

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, J; Gordon, L A; Olsen, A; Terry, A; Schmutz, J; Lamerdin, J; Hellsten, U; Goodstein, D; Couronne, O; Tran-Gyamfi, M

    2004-04-06

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high GC content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in Mendelian disorders, including familial hypercholesterolemia and insulin-resistant diabetes. Nearly one quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  7. cDNA cloning, sequence analysis, and chromosomal localization of the gene for human carnitine palmitoyltransferase

    International Nuclear Information System (INIS)

    Finocchiaro, G.; Taroni, F.; Martin, A.L.; Colombo, I.; Tarelli, G.T.; DiDonato, S.; Rocchi, M.

    1991-01-01

    The authors have cloned and sequenced a cDNA encoding human liver carnitine palmitoyltransferase, an inner mitochondrial membrane enzyme that plays a major role in the fatty acid oxidation pathway. Mixed oligonucleotide primers whose sequences were deduced from one tryptic peptide obtained from purified CPTase were used in a polymerase chain reaction, allowing the amplification of a 0.12-kilobase fragment of human genomic DNA encoding such a peptide. A 60-base-pair (bp) oligonucleotide synthesized on the basis of the sequence from this fragment was used for the screening of a cDNA library from human liver and hybridized to a cDNA insert of 2255 bp. This cDNA contains an open reading frame of 1974 bp that encodes a protein of 658 amino acid residues including 25 residues of an NH2-terminal leader peptide. The assignment of this open reading frame to human liver CPTase is confirmed by matches to seven different amino acid sequences of tryptic peptides derived from pure human CPTase and by the 82.2% homology with the amino acid sequence of rat CPTase. The NH2-terminal region of CPTase contains a leucine-proline motif that is shared by carnitine acetyl- and octanoyltransferases and by choline acetyltransferase. The gene encoding CPTase was assigned to human chromosome 1, region 1q12-1pter, by hybridization of CPTase cDNA with a DNA panel of 19 human-hamster somatic cell hybrids.

  8. Comparative molecular analysis of Herbaspirillum strains by RAPD, RFLP, and 16S rDNA sequencing

    Directory of Open Access Journals (Sweden)

    Soares-Ramos Juliana R.L.

    2003-01-01

    Herbaspirillum spp. are endophytic diazotrophic bacteria associated with important agricultural crops. In this work, we analyzed six strains of H. seropedicae (Z78, M2, ZA69, ZA95, Z152, and Z67) and one strain of H. rubrisubalbicans (M4) by restriction fragment length polymorphism (RFLP) using HindIII or DraI restriction endonucleases, random amplified polymorphic DNA (RAPD), and partial sequencing of 16S rDNA. The results of these analyses ascribed the strains studied to three distinct groups: group I, consisting of M2 and M4; group II, of ZA69; and group III, of ZA95, Z78, Z67, and Z152. RAPD fingerprinting showed a higher variability than the other methods, and each strain had a unique electrophoretic pattern with five of the six primers used. Interestingly, H. seropedicae M2 was found by all analyses to be genetically very close to H. rubrisubalbicans M4. Our results show that RAPD can distinguish between all Herbaspirillum strains tested.

  9. Suitability of DNA extracted from archival specimens of fruit-eating bats of the genus Artibeus (Chiroptera, Phyllostomidae) for polymerase chain reaction and sequencing analysis

    Directory of Open Access Journals (Sweden)

    Mário Pinzan Scatena

    2008-01-01

    To establish a technique which minimized the effects of fixation on the extraction of DNA from formalin-fixed tissues preserved in scientific collections, we extracted DNA samples from fixed tissues using different methods and evaluated the effect of the different procedures on PCR and sequencing analysis. We investigated muscle and liver tissues from museum specimens of five species of fruit-eating (frugivorous) bats of the Neotropical genus Artibeus (Chiroptera, Phyllostomidae): A. fimbriatus, A. lituratus, A. jamaicensis, A. obscurus, and A. planirostris. The results indicated that treatment of tissues in buffered solutions at neutral pH and about 37 °C for at least four days improves the quality and quantity of extracted DNA and the quality of the amplification and sequencing products. However, the comparison between the performance of DNA obtained from fixed and fresh tissues showed that, in spite of the fact that both types of tissue generate reliable sequences for use in phylogenetic analyses, DNA samples from fixed tissues presented a larger rate of errors in the different stages of the study. These results suggest that DNA extracted from formalin-fixed tissue can be used in molecular studies of Neotropical Artibeus bats and that our methodology may be applicable to other animal groups.

  10. Open source tools to exploit DNA sequence data from livestock species

    Science.gov (United States)

    Next-Generation Sequencing (NGS) is a recent technological development that allows researchers to rapidly determine the DNA sequence of an individual. The decrease in cost of NGS has brought the technology into the realm of practical applications in livestock genomics, where it can be used to genera...

  11. Molecular characterization of the rDNA-ITS sequence and a PCR diagnostic technique for Pileolaria terebinthi, the cause of pistachio rust

    Directory of Open Access Journals (Sweden)

    Hossein ALAEI

    2013-01-01

    Eleven samples of the most important pistachio rust, caused by Pileolaria terebinthi (DC.) Cast., which causes disease on Beneh (Pistacia atlantica Desf. subsp. mutica (Fisch. & Mey.) Rech. f.) and Kasoor (Pistacia khinjuk Stocks), were collected from herbarium specimens and pistachio fields at the Pistachio Research Institute in Rafsanjan, Iran. The complete sequences of ribosomal DNA internal transcribed spacers ITS1 and ITS2 (rDNA ITS) from the samples were determined and analysed. In general, very little sequence variation was observed between the rDNA ITS sequences of P. terebinthi samples. The length of the PCR fragments was 621 bp (for ITS1F-ITS4) and 1177 bp (for ITS1F-rust1), and consisted of 67 bp at the 3′ end of 18S rDNA, 93 bp of the ITS1 region, 154 bp of 5.8S rDNA, 246 bp of the ITS2 region, and 57 bp (for ITS1F-ITS4) or 613 bp (for ITS1F-rust1) at the 5′ end of the 28S rDNA. Restriction fragment length polymorphisms (RFLPs) of the rDNA-ITS region were used to identify Pileolaria terebinthi. Three strong bands of 105, 134 and 381 bp and five bands of 105, 134, 200, 301 and 437 bp were observed for the "ITS1F-ITS4" and "ITS1F-rust1" fragments, respectively. A PCR-RFLP diagnostic technique provided effective identification of the species by a unique pattern with the specific restriction enzyme XapI (ApoI).

  12. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  13. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 12 figs.

  14. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, Natasha V. (Okemos, MI); Broekaert, Willem F. (Dilbeek, BE); Chua, Nam-Hai (Scarsdale, NY); Kush, Anil (New York, NY)

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  15. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 11 figures.

  16. Default cycle phases determined after modifying discrete DNA sequences in plant cells

    International Nuclear Information System (INIS)

    Sans, J.; Leyton, C.

    1997-01-01

    After bromosubstituting DNA sequences replicated in the first, second, or third part of the S phase, in Allium cepa L. meristematic cells, radiation at 313 nm wavelength under anoxia allowed ascription of different sequences to both the positive and negative regulation of some cycle phase transitions. The present report shows that the radiation forced cells in late G1 phase to advance into S, while those in G2 remained in G2 and cells in prophase returned to G2 when both sets of sequences involved in the positive and negative controls were bromosubstituted and later irradiated. In this way, not only G2 but also the S phase behaved as cycle phases where cells accumulated by default when signals of different sign functionally cancelled out. The treatment did not halt the rates of replication or transcription of plant bromosubstituted DNA. The irradiation under hypoxia apparently prevents the binding of regulatory proteins to Br-DNA. (author)

  17. A Novel Computational Method for Detecting DNA Methylation Sites with DNA Sequence Information and Physicochemical Properties.

    Science.gov (United States)

    Pan, Gaofeng; Jiang, Limin; Tang, Jijun; Guo, Fei

    2018-02-08

    DNA methylation is an important biochemical process, and it has a close connection with many types of cancer. Research about DNA methylation can help us to understand the regulation mechanism and epigenetic reprogramming. Therefore, it becomes very important to recognize the methylation sites in the DNA sequence. In the past several decades, many computational methods, especially machine learning methods, have been developed since high-throughput sequencing technology became widely used in research and industry. In order to accurately identify whether or not a nucleotide residue is methylated under the specific DNA sequence context, we propose a novel method that overcomes the shortcomings of previous methods for predicting methylation sites. We use k-gram, multivariate mutual information, discrete wavelet transform, and pseudo amino acid composition to extract features, and train a sparse Bayesian learning model to do DNA methylation prediction. Five criteria, namely area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), accuracy (ACC), sensitivity (SN), and specificity, are used to evaluate the prediction results of our method. On the benchmark dataset, we could reach 0.8632 on AUC, 0.8017 on ACC, 0.5558 on MCC, and 0.7268 on SN. Additionally, the best results on two scBS-seq profiled mouse embryonic stem cells datasets were 0.8896 and 0.9511 by AUC, respectively. When compared with other outstanding methods, our method surpassed them on the accuracy of prediction. The improvement of AUC by our method compared to other methods was at least 0.0399. For the convenience of other researchers, our code has been uploaded to a file hosting service, and can be downloaded from: https://figshare.com/s/0697b692d802861282d3.
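
    The five evaluation criteria named in the abstract above have standard definitions, which can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 0/1 label encoding, and the toy data are assumptions made here, and AUC is computed by the pairwise-ranking definition (ties count as one half).

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Return ACC, SN (sensitivity), SP (specificity) and MCC."""
    total = tp + tn + fp + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "ACC": (tp + tn) / total,
        "SN": tp / (tp + fn) if (tp + fn) else 0.0,
        "SP": tn / (tn + fp) if (tn + fp) else 0.0,
        # MCC is conventionally set to 0 when any marginal count is zero.
        "MCC": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = methylated site, 0 = unmethylated site.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = classification_metrics(*confusion_counts(y_true, y_pred))
metrics["AUC"] = auc(y_true, scores)
print(metrics)
```

    Unlike ACC, SN and SP, the AUC does not depend on the chosen decision threshold, which is one reason it is commonly reported alongside them.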

  18. A Novel Computational Method for Detecting DNA Methylation Sites with DNA Sequence Information and Physicochemical Properties

    Directory of Open Access Journals (Sweden)

    Gaofeng Pan

    2018-02-01

    DNA methylation is an important biochemical process, and it has a close connection with many types of cancer. Research about DNA methylation can help us to understand the regulation mechanism and epigenetic reprogramming. Therefore, it becomes very important to recognize the methylation sites in the DNA sequence. In the past several decades, many computational methods, especially machine learning methods, have been developed since high-throughput sequencing technology became widely used in research and industry. In order to accurately identify whether or not a nucleotide residue is methylated under the specific DNA sequence context, we propose a novel method that overcomes the shortcomings of previous methods for predicting methylation sites. We use k-gram, multivariate mutual information, discrete wavelet transform, and pseudo amino acid composition to extract features, and train a sparse Bayesian learning model to do DNA methylation prediction. Five criteria, namely area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), accuracy (ACC), sensitivity (SN), and specificity, are used to evaluate the prediction results of our method. On the benchmark dataset, we could reach 0.8632 on AUC, 0.8017 on ACC, 0.5558 on MCC, and 0.7268 on SN. Additionally, the best results on two scBS-seq profiled mouse embryonic stem cells datasets were 0.8896 and 0.9511 by AUC, respectively. When compared with other outstanding methods, our method surpassed them on the accuracy of prediction. The improvement of AUC by our method compared to other methods was at least 0.0399. For the convenience of other researchers, our code has been uploaded to a file hosting service, and can be downloaded from: https://figshare.com/s/0697b692d802861282d3.

  19. Human β satellite DNA: Genomic organization and sequence definition of a class of highly repetitive tandem DNA

    International Nuclear Information System (INIS)

    Waye, J.S.; Willard, H.F.

    1989-01-01

    The authors describe a class of human repetitive DNA, called β satellite, that, at a most fundamental level, exists as tandem arrays of diverged ∼68-base-pair monomer repeat units. The monomer units are organized as distinct subsets, each characterized by a multimeric higher-order repeat unit that is tandemly reiterated and represents a recent unit of amplification. They have cloned, characterized, and determined the sequence of two β satellite higher-order repeat units: one located on chromosome 9, the other on the acrocentric chromosomes (13, 14, 15, 21, and 22) and perhaps other sites in the genome. Analysis by pulsed-field gel electrophoresis reveals that these tandem arrays are localized in large domains that are marked by restriction fragment length polymorphisms. In total, β-satellite sequences comprise several million base pairs of DNA in the human genome. Analysis of this DNA family should permit insights into the nature of chromosome-specific and nonspecific modes of satellite DNA evolution and provide useful tools for probing the molecular organization and concerted evolution of the acrocentric chromosomes

  20. Efficient DNA fingerprinting based on the targeted sequencing of active retrotransposon insertion sites using a bench-top high-throughput sequencing platform.

    Science.gov (United States)

    Monden, Yuki; Yamamoto, Ayaka; Shindo, Akiko; Tahara, Makoto

    2014-10-01

    In many crop species, DNA fingerprinting is required for the precise identification of cultivars to protect the rights of breeders. Many families of retrotransposons have multiple copies throughout the eukaryotic genome and their integrated copies are inherited genetically. Thus, their insertion polymorphisms among cultivars are useful for DNA fingerprinting. In this study, we conducted DNA fingerprinting based on the insertion polymorphisms of active retrotransposon families (Rtsp-1 and LIb) in sweet potato. Using 38 cultivars, we identified 2,024 insertion sites in the two families with an Illumina MiSeq sequencing platform. Of these insertion sites, 91.4% appeared to be polymorphic among the cultivars and 376 cultivar-specific insertion sites were identified, which were converted directly into cultivar-specific sequence-characterized amplified region (SCAR) markers. A phylogenetic tree was constructed using these insertion sites, which corresponded well with known pedigree information, thereby indicating their suitability for genetic diversity studies. Thus, the genome-wide comparative analysis of active retrotransposon insertion sites using the bench-top MiSeq sequencing platform is highly effective for DNA fingerprinting without any requirement for whole genome sequence information. This approach may facilitate the development of a practical polymerase chain reaction-based cultivar diagnostic system and could also be applied to the determination of genetic relationships. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
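
    The distinction drawn above between polymorphic and cultivar-specific insertion sites can be illustrated with a small sketch. The data layout (a mapping from site identifier to the set of cultivars carrying the insertion), the function name, and the toy values are all assumptions for illustration; the abstract does not describe the authors' actual pipeline.

```python
def summarize_sites(presence, cultivars):
    """Classify insertion sites from a presence/absence table.

    presence:  dict mapping a site id to the set of cultivars carrying
               an insertion at that site (hypothetical layout).
    cultivars: set of all cultivars that were genotyped.
    """
    # A site is polymorphic if some, but not all, cultivars carry it.
    polymorphic = sorted(
        site for site, carriers in presence.items()
        if 0 < len(carriers) < len(cultivars)
    )
    # A site is cultivar-specific if exactly one cultivar carries it;
    # such sites are natural candidates for cultivar-specific markers.
    specific = {
        site: next(iter(carriers))
        for site, carriers in presence.items()
        if len(carriers) == 1
    }
    return polymorphic, specific


# Toy example with three hypothetical cultivars and four sites.
cultivars = {"A", "B", "C"}
presence = {
    "site1": {"A", "B", "C"},  # fixed in all cultivars, not informative
    "site2": {"A"},            # specific to cultivar A
    "site3": {"B", "C"},       # polymorphic but shared
    "site4": {"C"},            # specific to cultivar C
}

polymorphic, specific = summarize_sites(presence, cultivars)
fraction_polymorphic = len(polymorphic) / len(presence)
print(polymorphic, specific, fraction_polymorphic)
```

    With real data, the polymorphic fraction computed this way corresponds to the kind of figure reported in the abstract (91.4% of 2,024 sites).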

  1. Distribution and sequence homogeneity of an abundant satellite DNA in the beetle, Tenebrio molitor.

    Science.gov (United States)

    Davis, C A; Wyatt, G R

    1989-01-01

    The mealworm beetle, Tenebrio molitor, contains an unusually abundant and homogeneous satellite DNA which constitutes up to 60% of its genome. The satellite DNA is shown to be present in all of the chromosomes by in situ hybridization. 18 dimers of the repeat unit were cloned and sequenced. The consensus sequence is 142 nt long and lacks any internal repeat structure. Monomers of the sequence are very similar, showing on average a 2% divergence from the calculated consensus. Variant nucleotides are scattered randomly throughout the sequence although some variants are more common than others. Neighboring repeat units are no more alike than randomly chosen ones. The results suggest that some mechanism, perhaps gene conversion, is acting to maintain the homogeneity of the satellite DNA despite its abundance and distribution on all of the chromosomes. PMID: 2762148
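
    The "divergence from the calculated consensus" mentioned above can be made concrete with a short sketch. It assumes the monomers are already aligned and of equal length, takes the most frequent base in each column as the consensus, and uses made-up 10-nt sequences rather than the 142-nt Tenebrio repeat; the function names are illustrative only.

```python
from collections import Counter


def consensus(seqs):
    """Majority-rule consensus of equal-length, pre-aligned sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*seqs) walks the alignment column by column.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def divergence(seq, ref):
    """Fraction of positions at which seq differs from ref."""
    return sum(a != b for a, b in zip(seq, ref)) / len(ref)


def mean_divergence(seqs):
    """Average divergence of each sequence from the consensus of the set."""
    ref = consensus(seqs)
    return sum(divergence(s, ref) for s in seqs) / len(seqs)


# Toy monomers (hypothetical); two of the four carry one variant base each.
monomers = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTAC",
    "ACGTACGTAT",
]

cons = consensus(monomers)
mean_div = mean_divergence(monomers)
print(cons, mean_div)
```

    Comparing divergence between adjacent and randomly chosen monomer pairs with the same `divergence` function is one way to test the abstract's observation that neighboring repeat units are no more alike than random ones.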

  2. The Dunaliella salina organelle genomes: large sequences, inflated with intronic and intergenic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Smith, David R.; Lee, Robert W.; Cushman, John C.; Magnuson, Jon K.; Tran, Duc; Polle, Juergen E.

    2010-05-07

    Background: Dunaliella salina Teodoresco, a unicellular, halophilic green alga belonging to the Chlorophyceae, is among the most industrially important microalgae. This is because D. salina can produce massive amounts of β-carotene, which can be collected for commercial purposes, and because of its potential as a feedstock for biofuels production. Although the biochemistry and physiology of D. salina have been studied in great detail, virtually nothing is known about the genomes it carries, especially those within its mitochondrion and plastid. This study presents the complete mitochondrial and plastid genome sequences of D. salina and compares them with those of the model green algae Chlamydomonas reinhardtii and Volvox carteri. Results: The D. salina organelle genomes are large, circular-mapping molecules with ~60% noncoding DNA, placing them among the most inflated organelle DNAs sampled from the Chlorophyta. In fact, the D. salina plastid genome, at 269 kb, is the largest complete plastid DNA (ptDNA) sequence currently deposited in GenBank, and both the mitochondrial and plastid genomes have unprecedentedly high intron densities for organelle DNA: ~1.5 and ~0.4 introns per gene, respectively. Moreover, what appear to be the relics of genes, introns, and intronic open reading frames are found scattered throughout the intergenic ptDNA regions, a trait without parallel in other characterized organelle genomes and one that gives insight into the mechanisms and modes of expansion of the D. salina ptDNA. Conclusions: These findings confirm the notion that chlamydomonadalean algae have some of the most extreme organelle genomes of all eukaryotes. They also suggest that the events giving rise to the expanded ptDNA architecture of D. salina and other Chlamydomonadales may have occurred early in the evolution of this lineage. Although interesting from a genome evolution standpoint, the D. salina organelle DNA sequences will aid in the development of a viable

  3. The Dunaliella salina organelle genomes: large sequences, inflated with intronic and intergenic DNA

    Directory of Open Access Journals (Sweden)

    Tran Duc

    2010-05-01

    Background: Dunaliella salina Teodoresco, a unicellular, halophilic green alga belonging to the Chlorophyceae, is among the most industrially important microalgae. This is because D. salina can produce massive amounts of β-carotene, which can be collected for commercial purposes, and because of its potential as a feedstock for biofuels production. Although the biochemistry and physiology of D. salina have been studied in great detail, virtually nothing is known about the genomes it carries, especially those within its mitochondrion and plastid. This study presents the complete mitochondrial and plastid genome sequences of D. salina and compares them with those of the model green algae Chlamydomonas reinhardtii and Volvox carteri. Results: The D. salina organelle genomes are large, circular-mapping molecules with ~60% noncoding DNA, placing them among the most inflated organelle DNAs sampled from the Chlorophyta. In fact, the D. salina plastid genome, at 269 kb, is the largest complete plastid DNA (ptDNA) sequence currently deposited in GenBank, and both the mitochondrial and plastid genomes have unprecedentedly high intron densities for organelle DNA: ~1.5 and ~0.4 introns per gene, respectively. Moreover, what appear to be the relics of genes, introns, and intronic open reading frames are found scattered throughout the intergenic ptDNA regions, a trait without parallel in other characterized organelle genomes and one that gives insight into the mechanisms and modes of expansion of the D. salina ptDNA. Conclusions: These findings confirm the notion that chlamydomonadalean algae have some of the most extreme organelle genomes of all eukaryotes. They also suggest that the events giving rise to the expanded ptDNA architecture of D. salina and other Chlamydomonadales may have occurred early in the evolution of this lineage. Although interesting from a genome evolution standpoint, the D. salina organelle DNA sequences will aid in the

  4. Sequence analysis of the canine mitochondrial DNA control region from shed hair samples in criminal investigations.

    Science.gov (United States)

    Berger, C; Berger, B; Parson, W

    2012-01-01

    In recent years, evidence from domestic dogs has increasingly been analyzed by forensic DNA testing. Canine hairs in particular have proved most suitable and practical due to the high rate of hair transfer occurring between dogs and humans. Starting with the description of a contamination-free sample handling procedure, we give a detailed workflow for sequencing hypervariable segments (HVS) of the mtDNA control region from canine evidence. After the hair material is lysed and the DNA extracted by Phenol/Chloroform, the amplification and sequencing strategy comprises the HVS I and II of the canine control region and is optimized for DNA of medium-to-low quality and quantity. The sequencing procedure is based on the Sanger Big-dye deoxy-terminator method and the separation of the sequencing reaction products is performed on a conventional multicolor fluorescence detection capillary electrophoresis platform. Finally, software-aided base calling and sequence interpretation are illustrated by example.

  5. Comparative analyses of two Geraniaceae transcriptomes using next-generation sequencing.

    Science.gov (United States)

    Zhang, Jin; Ruhlman, Tracey A; Mower, Jeffrey P; Jansen, Robert K

    2013-12-29

    Organelle genomes of Geraniaceae exhibit several unusual evolutionary phenomena compared to other angiosperm families including accelerated nucleotide substitution rates, widespread gene loss, reduced RNA editing, and extensive genomic rearrangements. Since most organelle-encoded proteins function in multi-subunit complexes that also contain nuclear-encoded proteins, it is likely that the atypical organellar phenomena affect the evolution of nuclear genes encoding organellar proteins. To begin to unravel the complex co-evolutionary interplay between organellar and nuclear genomes in this family, we sequenced nuclear transcriptomes of two species, Geranium maderense and Pelargonium x hortorum. Normalized cDNA libraries of G. maderense and P. x hortorum were used for transcriptome sequencing. Five assemblers (MIRA, Newbler, SOAPdenovo, SOAPdenovo-trans [SOAPtrans], Trinity) and two next-generation technologies (454 and Illumina) were compared to determine the optimal transcriptome sequencing approach. Trinity provided the highest quality assembly of Illumina data with the deepest transcriptome coverage. An analysis to determine the amount of sequencing needed for de novo assembly revealed diminishing returns of coverage and quality with data sets larger than sixty million Illumina paired end reads for both species. The G. maderense and P. x hortorum transcriptomes contained fewer transcripts encoding the PLS subclass of PPR proteins relative to other angiosperms, consistent with reduced mitochondrial RNA editing activity in Geraniaceae. In addition, transcripts for all six plastid targeted sigma factors were identified in both transcriptomes, suggesting that one of the highly divergent rpoA-like ORFs in the P. x hortorum plastid genome is functional. The findings support the use of the Illumina platform and assemblers optimized for transcriptome assembly, such as Trinity or SOAPtrans, to generate high-quality de novo transcriptomes with broad coverage. In addition

  6. Cloning and sequencing of cDNA encoding human DNA topoisomerase II and localization of the gene to chromosome region 17q21-22

    International Nuclear Information System (INIS)

    Tsai-Pflugfelder, M.; Liu, L.F.; Liu, A.A.; Tewey, K.M.; Whang-Peng, J.; Knutsen, T.; Huebner, K.; Croce, C.M.; Wang, J.C.

    1988-01-01

    Two overlapping cDNA clones encoding human DNA topoisomerase II were identified by two independent methods. In one, a human cDNA library in phage λ was screened by hybridization with a mixed oligonucleotide probe encoding a stretch of seven amino acids found in yeast and Drosophila DNA topoisomerase II; in the other, a different human cDNA library in a λgt11 expression vector was screened for the expression of antigenic determinants that are recognized by rabbit antibodies specific to human DNA topoisomerase II. The entire coding sequences of the human DNA topoisomerase II gene were determined from these and several additional clones, identified through the use of the cloned human TOP2 gene sequences as probes. Hybridization between the cloned sequences and mRNA and genomic DNA indicates that the human enzyme is encoded by a single-copy gene. The location of the gene was mapped to chromosome 17q21-22 by in situ hybridization of a cloned fragment to metaphase chromosomes and by hybridization analysis with a panel of mouse-human hybrid cell lines, each retaining a subset of human chromosomes

  7. Profiling soil microbial communities with next-generation sequencing: the influence of DNA kit selection and technician technical expertise

    Directory of Open Access Journals (Sweden)

    Taha Soliman

    2017-12-01

    Full Text Available Structure and diversity of microbial communities are an important research topic in biology, since microbes play essential roles in the ecology of various environments. Different DNA isolation protocols can lead to data bias and can affect results of next-generation sequencing. To evaluate the impact of protocols for DNA isolation from soil samples and also the influence of individual handling of samples, we compared results obtained by two researchers (R and T) using two different DNA extraction kits: (1) MO BIO PowerSoil® DNA Isolation kit (MO_R and MO_T) and (2) NucleoSpin® Soil kit (MN_R and MN_T). Samples were collected from six different sites on Okinawa Island, Japan. For all sites, differences in the results of microbial composition analyses (bacteria, archaea, fungi, and other eukaryotes), obtained by the two researchers using the two kits, were analyzed. For both researchers, the MN kit gave significantly higher yields of genomic DNA at all sites compared to the MO kit (ANOVA; P < 0.006). In addition, operational taxonomic units for some phyla and classes were missed in some cases: Micrarchaea were detected only in the MN_T and MO_R analyses; the bacterial phylum Armatimonadetes was detected only in MO_R and MO_T; and WIM5 of the phylum Amoebozoa of eukaryotes was found only in the MO_T analysis. Our results suggest the possibility of handling bias; therefore, it is crucial that replicated DNA extraction be performed by at least two technicians for thorough microbial analyses and to obtain accurate estimates of microbial diversity.

  8. DNA sequence and prokaryotic expression analysis of vitellogenin ...

    African Journals Online (AJOL)

    In this study, the DNA sequence of vitellogenin from Antheraea pernyi (Ap-Vg) was identified and its functional domain (30-740 aa, Ap-Vg-1) was expressed in Escherichia coli BL21 (DE3) cells. The recombinant Ap-Vg-1 proteins were purified and used for antibody preparation. The results showed that the intact DNA ...

  9. 'Mitominis': multiplex PCR analysis of reduced size amplicons for compound sequence analysis of the entire mtDNA control region in highly degraded samples.

    Science.gov (United States)

    Eichmann, Cordula; Parson, Walther

    2008-09-01

    The traditional protocol for forensic mitochondrial DNA (mtDNA) analyses involves the amplification and sequencing of the two hypervariable segments HVS-I and HVS-II of the mtDNA control region. The primers usually span fragment sizes of 300-400 bp each region, which may result in weak or failed amplification in highly degraded samples. Here we introduce an improved and more stable approach using shortened amplicons in the fragment range between 144 and 237 bp. Ten such amplicons were required to produce overlapping fragments that cover the entire human mtDNA control region. These were co-amplified in two multiplex polymerase chain reactions and sequenced with the individual amplification primers. The primers were carefully selected to minimize binding on homoplasic and haplogroup-specific sites that would otherwise result in loss of amplification due to mis-priming. The multiplexes have successfully been applied to ancient and forensic samples such as bones and teeth that showed a high degree of degradation.
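
    The tiling requirement described above (overlapping amplicons of 144 to 237 bp that together cover a target region) can be checked mechanically. The following is a minimal sketch, not the authors' design procedure: the function names and the toy coordinates are hypothetical, and the coordinates are not real mtDNA positions.

```python
from typing import List, Tuple

# (start, end) in 1-based inclusive coordinates; all values used below are hypothetical.
Amplicon = Tuple[int, int]


def sizes_in_range(amplicons: List[Amplicon], min_bp: int = 144, max_bp: int = 237) -> bool:
    """Return True if every amplicon length lies within [min_bp, max_bp]."""
    return all(min_bp <= end - start + 1 <= max_bp for start, end in amplicons)


def tiles_region(amplicons: List[Amplicon], region_start: int, region_end: int) -> bool:
    """Return True if the amplicons cover the region and consecutive ones share bases."""
    ordered = sorted(amplicons)
    if not ordered or ordered[0][0] > region_start:
        return False
    covered_to = ordered[0][1]
    for start, end in ordered[1:]:
        if start > covered_to:  # gap or mere adjacency: no shared base with what is covered
            return False
        covered_to = max(covered_to, end)
    return covered_to >= region_end


# Toy example on an invented 500 bp region.
toy = [(1, 200), (150, 340), (300, 500)]
print(sizes_in_range(toy), tiles_region(toy, 1, 500))
```

    Requiring at least one shared base between neighbours is the weakest possible overlap criterion; a real design would presumably demand a longer overlap and would also account for primer binding sites.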

  10. Analysis of Litopenaeus vannamei transcriptome using the next-generation DNA sequencing technique.

    Directory of Open Access Journals (Sweden)

    Chaozheng Li

    Full Text Available BACKGROUND: Pacific white shrimp (Litopenaeus vannamei), the major species of farmed shrimps in the world, has been attracting extensive studies, which require more and more genome background knowledge. The now available transcriptome data of L. vannamei are insufficient for research requirements, and have not been adequately assembled and annotated. METHODOLOGY/PRINCIPAL FINDINGS: This is the first study that used a next-generation high-throughput DNA sequencing technique, the Solexa/Illumina GA II method, to analyze the transcriptome from whole bodies of L. vannamei larvae. More than 2.4 Gb of raw data were generated, and 109,169 unigenes with a mean length of 396 bp were assembled using the SOAP denovo software. 73,505 unigenes (>200 bp) with good quality sequences were selected and subjected to annotation analysis, among which 37.80% can be matched in NCBI Nr database, 37.3% matched in Swissprot, and 44.1% matched in TrEMBL. Using BLAST and BLAST2Go software, 11,153 unigenes were classified into 25 Clusters of Orthologous Groups of proteins (COG) categories, 8171 unigenes were assigned into 51 Gene ontology (GO) functional groups, and 18,154 unigenes were divided into 220 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. To preliminarily verify part of the results of assembly and annotations, 12 assembled unigenes that are homologous to many embryo development-related genes were chosen and subjected to RT-PCR for electrophoresis and Sanger sequencing analyses, and to real-time PCR for expression profile analyses during embryo development. CONCLUSIONS/SIGNIFICANCE: The L. vannamei transcriptome analyzed using the next-generation sequencing technique enriches the information of L. vannamei genes, which will facilitate our understanding of the genome background of crustaceans, and promote the studies on L. vannamei.

  11. Sequence-specific activation of the DNA sensor cGAS by Y-form DNA structures as found in primary HIV-1 cDNA.

    Science.gov (United States)

    Herzner, Anna-Maria; Hagmann, Cristina Amparo; Goldeck, Marion; Wolter, Steven; Kübler, Kirsten; Wittmann, Sabine; Gramberg, Thomas; Andreeva, Liudmila; Hopfner, Karl-Peter; Mertens, Christina; Zillinger, Thomas; Jin, Tengchuan; Xiao, Tsan Sam; Bartok, Eva; Coch, Christoph; Ackermann, Damian; Hornung, Veit; Ludwig, Janos; Barchet, Winfried; Hartmann, Gunther; Schlee, Martin

    2015-10-01

    Cytosolic DNA that emerges during infection with a retrovirus or DNA virus triggers antiviral type I interferon responses. So far, only double-stranded DNA (dsDNA) over 40 base pairs (bp) in length has been considered immunostimulatory. Here we found that unpaired DNA nucleotides flanking short base-paired DNA stretches, as in stem-loop structures of single-stranded DNA (ssDNA) derived from human immunodeficiency virus type 1 (HIV-1), activated the type I interferon-inducing DNA sensor cGAS in a sequence-dependent manner. DNA structures containing unpaired guanosines flanking short (12- to 20-bp) dsDNA (Y-form DNA) were highly stimulatory and specifically enhanced the enzymatic activity of cGAS. Furthermore, we found that primary HIV-1 reverse transcripts represented the predominant viral cytosolic DNA species during early infection of macrophages and that these ssDNAs were highly immunostimulatory. Collectively, our study identifies unpaired guanosines in Y-form DNA as a highly active, minimal cGAS recognition motif that enables detection of HIV-1 ssDNA.

  12. DNA extraction from sea anemone (Cnidaria: Actiniaria) tissues for molecular analyses

    Directory of Open Access Journals (Sweden)

    Pinto S.M.

    2000-01-01

    Full Text Available A specific DNA extraction method for sea anemones is described in which extraction of total DNA from eight species of sea anemones and one species of corallimorpharian was achieved by changing the standard extraction protocols. DNA extraction from sea anemone tissue is made more difficult both by the tissue consistency and the presence of symbiotic zooxanthellae. The technique described here is an efficient way to avoid problems of DNA contamination and obtain large amounts of purified and integral DNA which can be used in different kinds of molecular analyses.

  13. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different pairs of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides of which 29,109,688 with Q20 (87.43%) in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation in the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low level of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.
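
    The run statistics quoted above can be re-derived from the three counts given in the abstract. The sketch below only reproduces that arithmetic. The variable names are illustrative, and the mean read length is a derived figure that the abstract itself does not report.

```python
# Counts taken from the abstract above.
total_bases = 33_294_511   # called nucleotides
q20_bases = 29_109_688     # nucleotides with quality Q20
reads = 215_944            # total reads

# Share of called bases at Q20.
q20_pct = 100.0 * q20_bases / total_bases

# Derived figure (not stated in the abstract): mean called bases per read.
mean_read_len = total_bases / reads

# On the Phred scale, a quality score Q corresponds to an error probability of 10^(-Q/10).
q20_error_prob = 10 ** (-20 / 10)

print(f"Q20 bases: {q20_pct:.2f}%")                  # prints "Q20 bases: 87.43%"
print(f"Mean read length: {mean_read_len:.1f} bp")
print(f"Error probability at Q20: {q20_error_prob}")
```

    The computed share agrees with the 87.43% given in the abstract.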

  14. Chimeric TALE recombinases with programmable DNA sequence specificity.

    Science.gov (United States)

    Mercer, Andrew C; Gaj, Thomas; Fuller, Roberta P; Barbas, Carlos F

    2012-11-01

    Site-specific recombinases are powerful tools for genome engineering. Hyperactivated variants of the resolvase/invertase family of serine recombinases function without accessory factors, and thus can be re-targeted to sequences of interest by replacing native DNA-binding domains (DBDs) with engineered zinc-finger proteins (ZFPs). However, imperfect modularity with particular domains, lack of high-affinity binding to all DNA triplets, and difficulty in construction has hindered the widespread adoption of ZFPs in unspecialized laboratories. The discovery of a novel type of DBD in transcription activator-like effector (TALE) proteins from Xanthomonas provides an alternative to ZFPs. Here we describe chimeric TALE recombinases (TALERs): engineered fusions between a hyperactivated catalytic domain from the DNA invertase Gin and an optimized TALE architecture. We use a library of incrementally truncated TALE variants to identify TALER fusions that modify DNA with efficiency and specificity comparable to zinc-finger recombinases in bacterial cells. We also show that TALERs recombine DNA in mammalian cells. The TALER architecture described herein provides a platform for insertion of customized TALE domains, thus significantly expanding the targeting capacity of engineered recombinases and their potential applications in biotechnology and medicine.

  15. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    Energy Technology Data Exchange (ETDEWEB)

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S. [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa); Arbuthnot, Patrick, E-mail: Patrick.Arbuthnot@wits.ac.za [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa)

    2009-11-20

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  16. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    International Nuclear Information System (INIS)

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S.; Arbuthnot, Patrick

    2009-01-01

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  17. Nucleotide sequence of a cDNA coding for the amino-terminal region of human prepro α1(III) collagen

    Energy Technology Data Exchange (ETDEWEB)

    Toman, P D; Ricca, G A [Rorer Biotechnology, Inc., Springfield, VA (USA); de Crombrugghe, B [National Institutes of Health, Bethesda, MD (USA)

    1988-07-25

    Type III collagen is synthesized in a variety of tissues as a precursor macromolecule containing a leader sequence, an N-propeptide, an N-telopeptide, the triple helical region, a C-telopeptide, and a C-propeptide. To further characterize the human type III collagen precursor, a human placental cDNA library was constructed in λgt11 using an oligonucleotide derived from a partial cDNA sequence corresponding to the carboxy-terminal part of the α1(III) collagen. A cDNA was identified which contains the leader sequence, the N-propeptide and N-telopeptide regions. The DNA sequences of these regions are presented here. The triple helical, C-telopeptide and C-propeptide amino acid sequence for human type III collagen has been determined previously. A comparison of the human amino acid sequence with mouse, chicken, and calf sequence shows 81%, 81%, and 92% similarity, respectively. At the DNA level, the sequence similarity between human and mouse or chicken type III collagen sequences in this area is 82% and 77%, respectively.

  18. A Universal Method for Species Identification of Mammals Utilizing Next Generation Sequencing for the Analysis of DNA Mixtures

    Science.gov (United States)

    Tillmar, Andreas O.; Dell'Amico, Barbara; Welander, Jenny; Holmlund, Gunilla

    2013-01-01

    Species identification can be of interest in a wide range of areas, for example, in forensic applications, food monitoring and archeology. The vast majority of existing DNA typing methods developed for species determination focus mainly on a single species source. There are, however, many instances where all species from mixed sources need to be determined, even when the species in minority constitutes less than 1 % of the sample. The introduction of next generation sequencing opens new possibilities for such challenging samples. In this study we present a universal deep sequencing method using 454 GS Junior sequencing of a target on the mitochondrial gene 16S rRNA. The method was designed through phylogenetic analyses of DNA reference sequences from more than 300 mammal species. Experiments were performed on artificial species-species mixture samples in order to verify the method’s robustness and its ability to detect all species within a mixture. The method was also tested on samples from authentic forensic casework. The results were promising: the method discriminated over 99.9 % of mammal species and was able to detect multiple donors within a mixture, including minor components as low as 1 % of a mixed sample. PMID:24358309

  19. Mechanism of sequence-specific template binding by the DNA primase of bacteriophage T7

    KAUST Repository

    Lee, Seung-Joo

    2010-03-28

    DNA primases catalyze the synthesis of the oligoribonucleotides required for the initiation of lagging strand DNA synthesis. Biochemical studies have elucidated the mechanism for the sequence-specific synthesis of primers. However, the physical interactions of the primase with the DNA template to explain the basis of specificity have not been demonstrated. Using a combination of surface plasmon resonance and biochemical assays, we show that T7 DNA primase has only a slightly higher affinity for DNA containing the primase recognition sequence (5'-TGGTC-3') than for DNA lacking the recognition site. However, this binding is drastically enhanced by the presence of the cognate nucleoside triphosphates (NTPs), adenosine triphosphate (ATP) and cytidine triphosphate (CTP), that are incorporated into the primer, pppACCA. Formation of the dimer, pppAC, the initial step of sequence-specific primer synthesis, is not sufficient for the stable binding. Preformed primers exhibit significantly less selective binding than that observed with ATP and CTP. Alterations in subdomains of the primase result in loss of selective DNA binding. We present a model in which conformational changes induced during primer synthesis facilitate contact between the zinc-binding domain and the polymerase domain. The Author(s) 2010. Published by Oxford University Press.
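
    As an illustration of the sequence relationship mentioned above, the recognition sequence 5'-TGGTC-3' can be located in a template string with a simple scan. Under standard Watson-Crick pairing, the tetramer ACCA is the reverse complement of the first four bases (TGGT) of that site. This is a minimal sketch with hypothetical function names and an invented template; it is unrelated to the binding assays used in the study.

```python
from typing import List

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(template: str, motif: str = "TGGTC") -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) motif occurrence."""
    template = template.upper()
    positions = []
    i = template.find(motif)
    while i != -1:
        positions.append(i)
        i = template.find(motif, i + 1)
    return positions


# Invented template containing two recognition sites.
template = "AATGGTCAATGGTC"
print(find_sites(template))          # prints [2, 9]
print(reverse_complement("TGGT"))    # prints ACCA
```

    The scan reads only the strand it is given; to search both strands, the same function could be applied to `reverse_complement(template)`.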

  20. Purification of High Molecular Weight Genomic DNA from Powdery Mildew for Long-Read Sequencing.

    Science.gov (United States)

    Feehan, Joanna M; Scheibel, Katherine E; Bourras, Salim; Underwood, William; Keller, Beat; Somerville, Shauna C

    2017-03-31

    The powdery mildew fungi are a group of economically important fungal plant pathogens. Relatively little is known about the molecular biology and genetics of these pathogens, in part due to a lack of well-developed genetic and genomic resources. These organisms have large, repetitive genomes, which have made genome sequencing and assembly prohibitively difficult. Here, we describe methods for the collection, extraction, purification and quality control assessment of high molecular weight genomic DNA from one powdery mildew species, Golovinomyces cichoracearum. The protocol described includes mechanical disruption of spores followed by an optimized phenol/chloroform genomic DNA extraction. A typical yield was 7 µg DNA per 150 mg conidia. The genomic DNA that is isolated using this procedure is suitable for long-read sequencing (i.e., > 48.5 kbp). Quality control measures to ensure the size, yield, and purity of the genomic DNA are also described in this method. Sequencing of the genomic DNA of the quality described here will allow for the assembly and comparison of multiple powdery mildew genomes, which in turn will lead to a better understanding and improved control of this agricultural pathogen.

  1. GenEST, a powerful bidirectional link between cDNA sequence data and gene expression profiles generated by cDNA-AFLP

    NARCIS (Netherlands)

    Qin Ling; Prins, P.; Jones, J.T.; Popeijus, H.; Smant, G.; Bakker, J.; Helder, J.

    2001-01-01

    The release of vast quantities of DNA sequence data by large-scale genome and expressed sequence tag (EST) projects underlines the necessity for the development of efficient and inexpensive ways to link sequence databases with temporal and spatial expression profiles. Here we demonstrate the power

  2. Screening the sequence selectivity of DNA-binding molecules using a gold nanoparticle-based colorimetric approach.

    Science.gov (United States)

    Hurst, Sarah J; Han, Min Su; Lytton-Jean, Abigail K R; Mirkin, Chad A

    2007-09-15

    We have developed a novel competition assay that uses a gold nanoparticle (Au NP)-based, high-throughput colorimetric approach to screen the sequence selectivity of DNA-binding molecules. This assay hinges on the observation that the melting behavior of DNA-functionalized Au NP aggregates is sensitive to the concentration of the DNA-binding molecule in solution. When short, oligomeric hairpin DNA sequences are added to a reaction solution consisting of DNA-functionalized Au NP aggregates and DNA-binding molecules, these molecules may bind either to the Au NP aggregate interconnects or to the hairpin stems, based on their relative affinity for each. This relative affinity can be measured as a change in the melting temperature (Tm) of the DNA-modified Au NP aggregates in solution. As a proof of concept, we evaluated the selectivity of 4',6-diamidino-2-phenylindole (an AT-specific binder), ethidium bromide (a nonspecific binder), and chromomycin A (a GC-specific binder) for six sequences of hairpin DNA having different numbers of AT pairs in a five-base pair variable stem region. Our assay accurately and easily confirmed the known trends in selectivity for the DNA binders in question without the use of complicated instrumentation. This novel assay will be useful in assessing large libraries of potential drug candidates that work by binding DNA to form a drug/DNA complex.

  3. Extraction of High Molecular Weight DNA from Fungal Rust Spores for Long Read Sequencing.

    Science.gov (United States)

    Schwessinger, Benjamin; Rathjen, John P

    2017-01-01

    Wheat rust fungi are complex organisms with a complete life cycle that involves two different host plants and five different spore types. During the asexual infection cycle on wheat, rusts produce massive amounts of dikaryotic urediniospores. These spores are dikaryotic (two nuclei) with each nucleus containing one haploid genome. This dikaryotic state is likely to contribute to their evolutionary success, making them some of the major wheat pathogens globally. Despite this, most published wheat rust genomes are highly fragmented and contain very little haplotype-specific sequence information. Current long-read sequencing technologies hold great promise to provide more contiguous and haplotype-phased genome assemblies. Long reads are able to span repetitive regions and phase structural differences between the haplomes. This increased genome resolution enables the identification of complex loci and the study of genome evolution beyond simple nucleotide polymorphisms. Long-read technologies require pure high molecular weight DNA as an input for sequencing. Here, we describe a DNA extraction protocol for rust spores that yields pure double-stranded DNA molecules with molecular weight of >50 kilo-base pairs (kbp). The isolated DNA is of sufficient purity for PacBio long-read sequencing, but may require additional purification for other sequencing technologies such as Nanopore and 10× Genomics.

  4. Applicability of Ion Torrent Colon and Lung sequencing panel on circulating cell-free DNA

    DEFF Research Database (Denmark)

    Demuth, Christina; Tranberg Madsen, Anne; Larsen, Anne Winther

    of targeted sequencing have been optimised for clinical use on FFPE, e.g. the Ion Torrent Colon and Lung panel. The size of DNA extracted from FFPE tissue is comparable with that from cfDNA. We therefore investigated the performance of the clinically relevant Ion Torrent Colon and Lung panel on cfDNA. Methods...... a baseline for the panel. Lastly, the panel was tested on 52 patient samples. Patient plasma samples are from a previously collected cohort of EGFR wild-type non-small cell lung cancer patients (ClinicalTrial.gov: NCT02043002) All samples were sequenced using the Ion Torrent Oncomine Solid Tumor DNA kit...... (Colon and Lung panel) from Thermo Fisher. Sample preparation was performed using the Ion Torrent Chef and sequencing was performed on the Personal Genome Machine (PGM) system. Data was analyzed using the Torrent Suite software, and variants called by Ion Reporter. Results: No somatic mutations were...

  5. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution

    OpenAIRE

    Falconer, Ester; Hills, Mark; Naumann, Ulrike; Poon, Steven S. S.; Chavez, Elizabeth A.; Sanders, Ashley D.; Zhao, Yongjun; Hirst, Martin; Lansdorp, Peter M.

    2012-01-01

    DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it possible to map SCEs at orders-of-magnitude greater resolution than was previously possible. On average, murine embryonic stem (mES) cells exhibit eight SCEs, which are detected at a resolution of up...

  6. Comparative performance of the BGISEQ-500 versus Illumina HiSeq2500 sequencing platforms for palaeogenomic sequencing

    DEFF Research Database (Denmark)

    Mak, Sarah Siu Tze Mak; Gopalakrishnan, Shyam Sunder; Carøe, Christian

    2017-01-01

    on degraded DNA, then directly compared the sequencing performance and data quality of the BGISEQ-500 to the Illumina HiSeq2500 platform, on DNA extracted from eight historic and ancient dog and wolf samples. Results: The data generated was largely comparable between sequencing platforms...... difference was also observed in the mitochondrial DNA percentages recovered (p = 0.018), although we believe this is likely a stochastic effect relating to the extremely low levels of mitochondria that were sequenced from three of the samples with overall very low levels of endogenous DNA. Conclusions......: Although we acknowledge our analyses were limited to animal material, our observations suggest that the BGISEQ-500 holds the potential to represent valid and potentially valuable alternative platform for palaeogenomic data generation, that is worthy of future exploration by those interested...

  7. metaBIT, an integrative and automated metagenomic pipeline for analysing microbial profiles from high-throughput sequencing shotgun data

    DEFF Research Database (Denmark)

    Louvel, Guillaume; Der Sarkissian, Clio; Hanghøj, Kristian Ebbesen

    2016-01-01

    -throughput DNA sequencing (HTS). Here, we develop metaBIT, an open-source computational pipeline automatizing routine microbial profiling of shotgun HTS data. Customizable by the user at different stringency levels, it performs robust taxonomy-based assignment and relative abundance calculation of microbial taxa......, as well as cross-sample statistical analyses of microbial diversity distributions. We demonstrate the versatility of metaBIT within a range of published HTS data sets sampled from the environment (soil and seawater) and the human body (skin and gut), but also from archaeological specimens. We present......-friendly profiling of the microbial DNA present in HTS shotgun data sets. The applications of metaBIT are vast, from monitoring of laboratory errors and contaminations, to the reconstruction of past and present microbiota, and the detection of candidate species, including pathogens....

  8. Cloning, sequencing and expression of a novel xylanase cDNA from ...

    African Journals Online (AJOL)

    A strain SH 2016, capable of producing xylanase, was isolated and identified as Aspergillus awamori, based on its physiological and biochemical characteristics as well as its ITS rDNA gene sequence analysis. A xylanase gene of 591 bp was cloned from this newly isolated A. awamori and the ORF sequence predicted a ...

  9. A likelihood ratio test for species membership based on DNA sequence data

    DEFF Research Database (Denmark)

    Matz, Mikhail V.; Nielsen, Rasmus

    2005-01-01

    DNA barcoding as an approach for species identification is rapidly increasing in popularity. However, it remains unclear which statistical procedures should accompany the technique to provide a measure of uncertainty. Here we describe a likelihood ratio test which can be used to test if a sampled...... sequence is a member of an a priori specified species. We investigate the performance of the test using coalescence simulations, as well as using the real data from butterflies and frogs representing two kinds of challenge for DNA barcoding: extremely low and extremely high levels of sequence variability....

  10. [Replication of Streptomyces plasmids: the DNA nucleotide sequence of plasmid pSB 24.2].

    Science.gov (United States)

    Bolotin, A P; Sorokin, A V; Aleksandrov, N N; Danilenko, V N; Kozlov, Iu I

    1985-11-01

    The nucleotide sequence of plasmid pSB 24.2, a natural deletion derivative of plasmid pSB 24.1 isolated from S. cyanogenus, was studied. The plasmid is 3706 nucleotide pairs in size, with a G-C content of 73 per cent. Analysis of the DNA structure of plasmid pSB 24.2 revealed a protein-encoding sequence of more than 1300 nucleotide pairs, the continuity of which was important for replication of the plasmid. The analysis also revealed two A-T-rich areas of DNA, with a G-C content of less than 55 per cent, and a DNA area with a branched hairpin structure. The results may be of value in investigating plasmid replication in actinomycetes and in experimental cloning of DNA with this plasmid as a vector.
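    The G-C figures quoted in this record (73 per cent overall, under 55 per cent in the A-T-rich areas) are base-composition ratios. A minimal Python sketch of how such values can be computed, assuming the sequence is a plain string of A/C/G/T; the function names, the window size and the toy sequence are illustrative assumptions and are not taken from the paper:

```python
def gc_percent(seq: str) -> float:
    """Return the G+C content of a DNA sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def low_gc_windows(seq: str, window: int, threshold: float):
    """Yield (start, gc) for every sliding window whose G+C % is below threshold."""
    for start in range(len(seq) - window + 1):
        gc = gc_percent(seq[start:start + window])
        if gc < threshold:
            yield start, gc


# Hypothetical toy sequence, NOT the pSB 24.2 sequence.
toy = "GGGGATATGGGG"
print(gc_percent(toy))
print([start for start, _ in low_gc_windows(toy, window=4, threshold=55.0)])
```

    On real data the window would be far larger than four bases; the 55 per cent threshold simply mirrors the figure quoted in the abstract.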

  11. DNA sequence analysis of X-ray induced Adh null mutations in Drosophila melanogaster

    International Nuclear Information System (INIS)

    Mahmoud, J.; Fossett, N.G.; Arbour-Reily, P.; McDaniel, M.; Tucker, A.; Chang, S.H.; Lee, W.R.

    1991-01-01

    The mutational spectrum for 28 X-ray induced mutations and 2 spontaneous mutations, previously determined by genetic and cytogenetic methods, consisted of 20 multilocus deficiencies (19 induced and 1 spontaneous) and 10 intragenic mutations (9 induced and 1 spontaneous). One of the X-ray induced intragenic mutations was lost, and another was determined to be a recombinant with the allele used in the recovery scheme. The DNA sequence of two X-ray induced intragenic mutations has been published. This paper reports the results of DNA sequence analysis of the remaining intragenic mutations and a summary of the X-ray induced mutational spectrum. The combination of DNA sequence analysis with genetic complementation analysis shows a continuous distribution in size of deletions rather than two different types of mutations consisting of deletions and 'point mutations'. Sequencing is shown to be essential for detecting intragenic deletions. Of particular importance for future studies is the observation that all of the intragenic deletions consist of a direct repeat adjacent to the breakpoint with one of the repeats deleted.

  12. Analysing Microbial Community Composition through Amplicon Sequencing: From Sampling to Hypothesis Testing

    Directory of Open Access Journals (Sweden)

    Luisa W. Hugerth

    2017-09-01

    Full Text Available Microbial ecology as a scientific field is fundamentally driven by technological advance. The past decade's revolution in DNA sequencing cost and throughput has made it possible for most research groups to map microbial community composition in environments of interest. However, the computational and statistical methodology required to analyse this kind of data is often not part of a biologist's training. In this review, we give a historical perspective on the use of sequencing data in microbial ecology and restate the current need for this method, but also highlight the major caveats with standard practices for handling these data, from sample collection and library preparation to statistical analysis. Further, we outline the main new analytical tools that have been developed in the past few years to bypass these caveats, as well as highlight the major requirements of common statistical practices and the extent to which they are applicable to microbial data. Besides delving into the meaning of select alpha- and beta-diversity measures, we give special consideration to techniques for finding the main drivers of community dissimilarity and for interaction network construction. While every project design has specific needs, this review should serve as a starting point for considering what options are available.

  13. DNA sequence analysis of the photosynthesis region of Rhodobacter sphaeroides 2.4.1T

    OpenAIRE

    Choudhary, M.; Kaplan, Samuel

    2000-01-01

    This paper describes the DNA sequence of the photosynthesis region of Rhodobacter sphaeroides 2.4.1T. The photosynthesis gene cluster is located within a ~73 kb AseI genomic DNA fragment containing the puf, puhA, cycA and puc operons. A total of 65 open reading frames (ORFs) have been identified, of which 61 showed significant similarity to genes/proteins of other organisms while only four did not reveal any significant sequence similarity to any gene/protein sequences in the database. The da...

  14. DNA Sequence-Mediated, Evolutionarily Rapid Redistribution of Meiotic Recombination Hotspots

    Science.gov (United States)

    Wahls, Wayne P.; Davidson, Mari K.

    2011-01-01

    Hotspots regulate the position and frequency of Spo11 (Rec12)-initiated meiotic recombination, but paradoxically they are suicidal and are somehow resurrected elsewhere in the genome. After the DNA sequence-dependent activation of hotspots was discovered in fission yeast, nearly two decades elapsed before the key realizations that (A) DNA site-dependent regulation is broadly conserved and (B) individual eukaryotes have multiple different DNA sequence motifs that activate hotspots. From our perspective, such findings provide a conceptually straightforward solution to the hotspot paradox and can explain other, seemingly complex features of meiotic recombination. We describe how a small number of single-base-pair substitutions can generate hotspots de novo and dramatically alter their distribution in the genome. This model also shows how equilibrium rate kinetics could maintain the presence of hotspots over evolutionary timescales, without strong selective pressures invoked previously, and explains why hotspots localize preferentially to intergenic regions and introns. The model is robust enough to account for all hotspots of humans and chimpanzees repositioned since their divergence from the latest common ancestor. PMID:22084420

  15. Puzzling sequences: studying microbial genomes from 'Ötzi'

    International Nuclear Information System (INIS)

    Rattei, T.

    2012-01-01

    Ancient remains, and mummies in particular, are of central value for archaeological research. The Tyrolean iceman “Ötzi” was conserved in a glacier of the Ötztal Alps about 5000 years ago. Aside from morphological and phenotypical classification, the determination of DNA sequences and the subsequent genome analyses were first applied to mitochondrial DNA and then extended to genomic DNA. Ancient microbial DNA is typically sequenced as well. These sequences allow the identification of pathogens as well as the study of the evolution of microorganisms. The talk will explain the metagenomic aspects of the “Ötzi” genome project and discuss the first results. (author)

  16. Genomic DNA sequence and cytosine methylation changes of adult rice leaves after seeds space flight

    Science.gov (United States)

    Shi, Jinming

    In this study, cytosine methylation at CCGG sites and genomic DNA sequence changes in adult rice leaves after seed space flight were detected by the methylation-sensitive amplification polymorphism (MSAP) and amplified fragment length polymorphism (AFLP) techniques, respectively. Rice seeds were planted in the trial field after a 4-day space flight on the Shenzhou-6 spaceship of China. Adult leaves of space-treated rice, including 8 plants chosen randomly and 2 plants with phenotypic mutation, were used for AFLP and MSAP analysis. Polymorphism of both DNA sequence and cytosine methylation was detected. For MSAP analysis, the average polymorphic frequencies of the on-ground controls, space-treated plants and mutants were 1.3%, 3.1% and 11%, respectively. For AFLP analysis, the average polymorphic frequencies were 1.4%, 2.9% and 8%, respectively. In total, 27 and 22 polymorphic fragments were cloned and sequenced from the MSAP and AFLP analyses, respectively. Nine of the 27 fragments from MSAP analysis show homology to coding sequence. Of the 22 polymorphic fragments from AFLP analysis, none shows homology to mRNA sequence and eight show homology to repeat regions or retrotransposon sequences. These results suggest that although both genomic DNA sequence and cytosine methylation status can be affected by space flight, the genomic regions homologous to the fragments from the genomic DNA and cytosine methylation analyses were different.

  17. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, Natasha V. (Okemos, MI); Broekaert, Willem F. (Dilbeek, BE); Chua, Nam-Hai (Scarsdale, NY); Kush, Anil (New York, NY)

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu... GOVERNMENT RIGHTS: This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.

  18. An integrated PCR colony hybridization approach to screen cDNA libraries for full-length coding sequences.

    Science.gov (United States)

    Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain

    2011-01-01

    cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which are a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.

  19. Yeast identification by sequencing, biochemical kits, MALDI-TOF MS and rep-PCR DNA fingerprinting.

    Science.gov (United States)

    Zhao, Ying; Tsang, Chi-Ching; Xiao, Meng; Chan, Jasper F W; Lau, Susanna K P; Kong, Fanrong; Xu, Yingchun; Woo, Patrick C Y

    2017-12-08

    No study has comprehensively evaluated the performance of 28S nrDNA and ITS sequencing, commercial biochemical test kits, MALDI-TOF MS platforms, and the emerging rep-PCR DNA fingerprinting technology using a cohort of yeast strains collected from a clinical microbiology laboratory. In this study, using 71 clinically important yeast isolates (excluding Candida albicans) collected from a single centre, we determined the concordance of 28S nrDNA and ITS sequencing and evaluated the performance of two commercial test kits, two MALDI-TOF MS platforms, and rep-PCR DNA fingerprinting. 28S nrDNA and ITS sequencing showed complete agreement on the identities of the 71 isolates. Using sequencing results as the standard, 78.9% and 71.8% of isolates were correctly identified using the API 20C AUX and Vitek 2 YST ID Card systems, respectively; and 90.1% and 80.3% of isolates were correctly identified using the Bruker and Vitek MALDI-TOF MS platforms, respectively. Of the 18 strains belonging to the Candida parapsilosis species complex tested by DiversiLab automated rep-PCR DNA fingerprinting, all were identified only as Candida parapsilosis with similarities ≥93.2%, indicating the misidentification of Candida metapsilosis and Candida orthopsilosis. However, hierarchical cluster analysis of the rep-PCR DNA fingerprints of the three species within this species complex formed three discrete clusters, indicating that this technology can potentially differentiate the three species. To achieve higher accuracies of identification, the databases of commercial biochemical test kits, MALDI-TOF MS platforms, and DiversiLab automated rep-PCR DNA fingerprinting need further enrichment, particularly for uncommonly encountered yeast species. © The Author 2017. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  20. Two new computational methods for universal DNA barcoding: a benchmark using barcode sequences of bacteria, archaea, animals, fungi, and land plants.

    Science.gov (United States)

    Tanabe, Akifumi S; Toju, Hirokazu

    2013-01-01

    Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archaeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used "1-nearest-neighbor" (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. Therefore, we need to accelerate
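    The 1-NN rule described in this record can be sketched in a few lines of Python. This is a simplified illustration only: it replaces the BLAST top hit with plain per-position identity on pre-aligned, equal-length sequences, and the function names, species labels and sequences are hypothetical. It does not implement the QCauto method.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def assign_1nn(query: str, references: dict) -> tuple:
    """Return (taxon, identity) for the reference sequence most similar to the query."""
    if not references:
        raise ValueError("reference database is empty")
    taxon = max(references, key=lambda name: identity(query, references[name]))
    return taxon, identity(query, references[taxon])


# Hypothetical reference database: taxon label -> aligned barcode sequence.
refs = {"Species A": "ACGTACGT", "Species B": "ACGTTTTT"}

print(assign_1nn("ACGTACGA", refs))  # closest reference is "Species A"
print(assign_1nn("TTTTTTTT", refs))  # still returns a label, even at low identity
```

    The second call shows why 1-NN misidentifies queries whose species is missing from the database: the rule always returns the nearest label, however distant that neighbour is.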

  1. Assessing Mitochondrial DNA Variation and Copy Number in Lymphocytes of ~2,000 Sardinians Using Tailored Sequencing Analysis Tools.

    Directory of Open Access Journals (Sweden)

    Jun Ding

    2015-07-01

    Full Text Available DNA sequencing identifies common and rare genetic variants for association studies, but studies typically focus on variants in nuclear DNA and ignore the mitochondrial genome. In fact, analyzing variants in mitochondrial DNA (mtDNA) sequences presents special problems, which we resolve here with a general solution for the analysis of mtDNA in next-generation sequencing studies. The new program package comprises (1) an algorithm designed to identify mtDNA variants (i.e., homoplasmies and heteroplasmies), incorporating sequencing error rates at each base in a likelihood calculation and allowing allele fractions at a variant site to differ across individuals; and (2) an estimation of mtDNA copy number in a cell directly from whole-genome sequencing data. We also apply the methods to DNA sequence from lymphocytes of ~2,000 SardiNIA Project participants. As expected, mothers and offspring share all homoplasmies but a lesser proportion of heteroplasmies. Both homoplasmies and heteroplasmies show 5-fold higher transition/transversion ratios than variants in nuclear DNA. Also, heteroplasmy increases with age, though on average only ~1 heteroplasmy reaches the 4% level between ages 20 and 90. In addition, we find that mtDNA copy number averages ~110 copies/lymphocyte and is ~54% heritable, implying substantial genetic regulation of the level of mtDNA. Copy numbers also decrease modestly but significantly with age, and females on average have significantly more copies than males. The mtDNA copy numbers are significantly associated with waist circumference (p-value = 0.0031) and waist-hip ratio (p-value = 2.4×10⁻⁵), but not with body mass index, indicating an association with central fat distribution. To our knowledge, this is the largest population analysis to date of mtDNA dynamics, revealing the age-imposed increase in heteroplasmy, the relatively high heritability of copy number, and the association of copy number with metabolic traits.
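    The transition/transversion ratio mentioned in this record is a simple count ratio: transitions are purine-to-purine (A/G) or pyrimidine-to-pyrimidine (C/T) changes, and every other single-base change is a transversion. A minimal Python sketch, assuming variants are given as (reference base, alternate base) pairs; the function names and the example calls are illustrative and are not part of the program package described above:

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def is_transition(ref: str, alt: str) -> bool:
    """True for A<->G or C<->T substitutions, False for transversions."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in "ACGT" or alt not in "ACGT" or len(ref) != 1 or len(alt) != 1:
        raise ValueError("ref and alt must each be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def titv_ratio(variants) -> float:
    """Number of transitions divided by number of transversions."""
    variants = list(variants)
    ti = sum(1 for ref, alt in variants if is_transition(ref, alt))
    tv = len(variants) - ti
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ti / tv


# Hypothetical variant calls: three transitions and one transversion.
calls = [("A", "G"), ("C", "T"), ("T", "C"), ("A", "C")]
print(titv_ratio(calls))
```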

  2. Highly accurate fluorogenic DNA sequencing with information theory-based error correction.

    Science.gov (United States)

    Chen, Zitian; Zhou, Wenxiong; Qiao, Shuo; Kang, Li; Duan, Haifeng; Xie, X Sunney; Huang, Yanyi

    2017-12-01

    Eliminating errors in next-generation DNA sequencing has proved challenging. Here we present error-correction code (ECC) sequencing, a method to greatly improve sequencing accuracy by combining fluorogenic sequencing-by-synthesis (SBS) with an information theory-based error-correction algorithm. ECC embeds redundancy in sequencing reads by creating three orthogonal degenerate sequences, generated by alternate dual-base reactions. This is similar to encoding and decoding strategies that have proved effective in detecting and correcting errors in information communication and storage. We show that, when combined with a fluorogenic SBS chemistry with raw accuracy of 98.1%, ECC sequencing provides single-end, error-free sequences up to 200 bp. ECC approaches should enable accurate identification of extremely rare genomic variations in various applications in biology and medicine.

  3. A DNA 'barcode blitz': rapid digitization and sequencing of a natural history collection.

    Science.gov (United States)

    Hebert, Paul D N; Dewaard, Jeremy R; Zakharov, Evgeny V; Prosser, Sean W J; Sones, Jayme E; McKeown, Jaclyn T A; Mantle, Beth; La Salle, John

    2013-01-01

    DNA barcoding protocols require the linkage of each sequence record to a voucher specimen that has, whenever possible, been authoritatively identified. Natural history collections would seem an ideal resource for barcode library construction, but they have never seen large-scale analysis because of concerns linked to DNA degradation. The present study examines the strength of this barrier, carrying out a comprehensive analysis of moth and butterfly (Lepidoptera) species in the Australian National Insect Collection. Protocols were developed that enabled tissue samples, specimen data, and images to be assembled rapidly. Using these methods, a five-person team processed 41,650 specimens representing 12,699 species in 14 weeks. Subsequent molecular analysis took about six months, reflecting the need for multiple rounds of PCR as sequence recovery was impacted by age, body size, and collection protocols. Despite these variables and the fact that specimens averaged 30.4 years old, barcode records were obtained from 86% of the species. In fact, one or more barcode compliant sequences (>487 bp) were recovered from virtually all species represented by five or more individuals, even when the youngest was 50 years old. By assembling specimen images, distributional data, and DNA barcode sequences on a web-accessible informatics platform, this study has greatly advanced accessibility to information on thousands of species. Moreover, much of the specimen data became publicly accessible within days of its acquisition, while most sequence results saw release within three months. As such, this study reveals the speed with which DNA barcode workflows can mobilize biodiversity data, often providing the first web-accessible information for a species. These results further suggest that existing collections can enable the rapid development of a comprehensive DNA barcode library for the most diverse compartment of terrestrial biodiversity - insects.

  4. RAPD and Internal Transcribed Spacer Sequence Analyses Reveal Zea nicaraguensis as a Section Luxuriantes Species Close to Zea luxurians

    Science.gov (United States)

    Wang, Pei; Lu, Yanli; Zheng, Mingmin; Rong, Tingzhao; Tang, Qilin

    2011-01-01

    Genetic relationship of a newly discovered teosinte from Nicaragua, Zea nicaraguensis with waterlogging tolerance, was determined based on randomly amplified polymorphic DNA (RAPD) markers and the internal transcribed spacer (ITS) sequences of nuclear ribosomal DNA using 14 accessions from Zea species. RAPD analysis showed that a total of 5,303 fragments were produced by 136 random decamer primers, of which 84.86% bands were polymorphic. RAPD-based UPGMA analysis demonstrated that the genus Zea can be divided into section Luxuriantes including Zea diploperennis, Zea luxurians, Zea perennis and Zea nicaraguensis, and section Zea including Zea mays ssp. mexicana, Zea mays ssp. parviglumis, Zea mays ssp. huehuetenangensis and Zea mays ssp. mays. ITS sequence analysis showed the lengths of the entire ITS region of the 14 taxa in Zea varied from 597 to 605 bp. The average GC content was 67.8%. In addition to the insertion/deletions, 78 variable sites were recorded in the total ITS region with 47 in ITS1, 5 in 5.8S, and 26 in ITS2. Sequences of these taxa were analyzed with neighbor-joining (NJ) and maximum parsimony (MP) methods to construct the phylogenetic trees, selecting Tripsacum dactyloides L. as the outgroup. The phylogenetic relationships of Zea species inferred from the ITS sequences are highly concordant with the RAPD evidence that resolved two major subgenus clades. Both RAPD and ITS sequence analyses indicate that Zea nicaraguensis is more closely related to Zea luxurians than the other teosintes and cultivated maize, which should be regarded as a section Luxuriantes species. PMID:21525982

  5. Phylogenetic Analysis of Shewanella Strains by DNA Relatedness Derived from Whole Genome Microarray DNA-DNA Hybridization and Comparisons with Other Methods

    International Nuclear Information System (INIS)

    Wu, Liyou; Yi, T.Y.; Van Nostrand, Joy; Zhou, Jizhong

    2010-01-01

    Phylogenetic analyses were done for the Shewanella strains isolated from Baltic Sea (38 strains), US DOE Hanford Uranium bioremediation site (Hanford Reach of the Columbia River (HRCR), 11 strains), Pacific Ocean and Hawaiian sediments (8 strains), and strains from other resources (16 strains) with three outgroup strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of 16S rRNA gene and gyrB gene, and sequence similarities of 6 loci of Shewanella genome selected from a shared gene list of the Shewanella strains with whole genome sequenced based on the average nucleotide identity of them (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences, and DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at species and strain level within the Shewanella genus. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WMGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an ideal alternative to conventional DNA-DNA hybridization methods and that it is superior to phylogenetic methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.

  6. Phylogenetic Analysis of Shewanella Strains by DNA Relatedness Derived from Whole Genome Microarray DNA-DNA Hybridization and Comparison with Other Methods

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy; Zhou, Jizhong

    2010-05-17

    Phylogenetic analyses were done for the Shewanella strains isolated from Baltic Sea (38 strains), US DOE Hanford Uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and strains from other resources (16 strains) with three outgroup strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of 16S rRNA gene and gyrB gene, and sequence similarities of 6 loci of Shewanella genome selected from a shared gene list of the Shewanella strains with whole genome sequenced based on the average nucleotide identity of them (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences, and DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at species and strain level within the Shewanella genus. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WMGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an ideal alternative to conventional DNA-DNA hybridization methods and that it is superior to phylogenetic methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.

  7. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations.

    Science.gov (United States)

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-08-24

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on the Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (Pearson r = 0.82, isoforms with length more than 700 bp) and between Illumina HiSeq 2500 and ONT MinION (Pearson r = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules.
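    The platform agreement reported in this record is expressed as a Pearson correlation between per-transcript abundance estimates. A minimal Python sketch of that statistic, assuming two equal-length lists of abundances for the same transcripts; the function name and the numbers are made up for illustration and are not data from the study:

```python
import math


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical abundance estimates for five transcripts on two platforms.
platform_a = [10.0, 25.0, 3.0, 40.0, 18.0]
platform_b = [12.0, 22.0, 5.0, 35.0, 20.0]
print(round(pearson_r(platform_a, platform_b), 3))
```

    Abundance values are often log-transformed before correlating them; whether the study did so is not stated in the abstract.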

  8. Phylogenetic and ecological analyses of soil and sporocarp DNA sequences reveal high diversity and strong habitat partitioning in the boreal ectomycorrhizal genus Russula (Russulales; Basidiomycota)

    Science.gov (United States)

    József Geml; Gary A. Laursen; Ian C. Herriott; Jack M. McFarland; Michael G. Booth; Niall Lennon; H. Chad Nusbaum; D. Lee Taylor

    2010-01-01

    Although critical for the functioning of ecosystems, fungi are poorly known in high-latitude regions. Here, we provide the first genetic diversity assessment of one of the most diverse and abundant ectomycorrhizal genera in Alaska: Russula. We analyzed internal transcribed spacer rDNA sequences from sporocarps and soil samples using phylogenetic...

  9. Reticulate evolution: frequent introgressive hybridization among Chinese hares (genus Lepus) revealed by analyses of multiple mitochondrial and nuclear DNA loci

    Directory of Open Access Journals (Sweden)

    Wu Shi-Fang

    2011-07-01

    Full Text Available Abstract Background: Interspecific hybridization may lead to the introgression of genes and genomes across species barriers and contribute to a reticulate evolutionary pattern and thus taxonomic uncertainties. Because several previous studies have demonstrated that introgressive hybridization has occurred among some species within Lepus, it is possible that introgressive hybridization events also occur among Chinese Lepus species and contribute to the current taxonomic confusion. Results: Data from four mtDNA genes, from 116 individuals, and one nuclear gene, from 119 individuals, provide the first evidence of frequent introgression events via historical and recent interspecific hybridizations among six Chinese Lepus species. Remarkably, the mtDNA of L. mandshuricus was completely replaced by mtDNA from L. timidus and L. sinensis. Analysis of the nuclear DNA sequence revealed a high proportion of heterozygous genotypes containing alleles from two divergent clades and that several haplotypes were shared among species, suggesting repeated and recent introgression. Furthermore, results from the present analyses suggest that Chinese hares belong to eight species. Conclusion: This study provides a framework for understanding the patterns of speciation and the taxonomy of this clade. The existence of morphological intermediates and atypical mitochondrial gene genealogies resulting from frequent hybridization events likely contributes to the current taxonomic confusion of Chinese hares. The present study also demonstrated that nuclear gene sequences can offer a powerful data set complementary to mtDNA in tracing a complete evolutionary history of recently diverged species.

  10. Full-length cDNA sequences from Rhesus monkey placenta tissue: analysis and utility for comparative mapping

    Directory of Open Access Journals (Sweden)

    Lee Sang-Rae

    2010-07-01

    Full Text Available Abstract Background: Rhesus monkeys (Macaca mulatta) are widely used as experimental animals in biomedical research and are closely related to other laboratory macaques, such as cynomolgus monkeys (Macaca fascicularis), and to humans, sharing a last common ancestor from about 25 million years ago. Although rhesus monkeys have been studied extensively under field and laboratory conditions, research has been limited by the lack of genetic resources. The present study generated placenta full-length cDNA libraries, characterized the resulting expressed sequence tags, and described their utility for comparative mapping with human RefSeq mRNA transcripts. Results: From rhesus monkey placenta full-length cDNA libraries, 2000 full-length cDNA sequences were determined and 1835 rhesus placenta cDNA sequences longer than 100 bp were collected. These sequences were annotated based on homology to human genes. Homology search against human RefSeq mRNAs revealed that our collection included the sequences of 1462 putative rhesus monkey genes. Moreover, we identified 207 genes containing exon alterations in the coding region and the untranslated region of rhesus monkey transcripts, despite the highly conserved structure of the coding regions. Approximately 10% (187) of all full-length cDNA sequences did not represent any public human RefSeq mRNAs. Intriguingly, two rhesus monkey-specific exons derived from the transposable elements AluYRa2 (SINE family) and MER11B (LTR family) were also identified. Conclusion: The 1835 rhesus monkey placenta full-length cDNA sequences described here could expand genomic resources and information of rhesus monkeys. This increased genomic information will greatly contribute to the development of evolutionary biology and biomedical research.

  11. Is photocleavage of DNA by YOYO-1 using a synchrotron radiation light source sequence dependent?

    DEFF Research Database (Denmark)

    Gilroy, Emma L.; Hoffmann, Søren Vrønning; Jones, Nykola C.

    2011-01-01

    ) throughout the irradiation period. The dependence of LD signals on DNA sequences and on time in the intense light beam was explored and quantified for single-stranded poly(dA), poly[(dA-dT)2], calf thymus DNA (ctDNA) and Micrococcus luteus DNA (mlDNA). The DNA and ligand regions of the spectrum showed...

  12. Structural properties of replication origins in yeast DNA sequences

    International Nuclear Information System (INIS)

    Cao Xiaoqin; Zeng Jia; Yan Hong

    2008-01-01

    Sequence-dependent DNA flexibility is an important structural property originating from the DNA 3D structure. In this paper, we investigate the DNA flexibility of the budding yeast (S. cerevisiae) replication origins on a genome-wide scale using flexibility parameters from two different models, the trinucleotide and the tetranucleotide models. Based on analyzing average flexibility profiles of 270 replication origins, we find that yeast replication origins are significantly rigid compared with their surrounding genomic regions. To further understand the highly distinctive property of replication origins, we compare the flexibility patterns between yeast replication origins and promoters, and find that they both contain significantly rigid DNAs. Our results suggest that DNA flexibility is an important factor that helps proteins recognize and bind the target sites in order to initiate DNA replication. Inspired by the role of the rigid region in promoters, we speculate that the rigid replication origins may facilitate binding of proteins, including the origin recognition complex (ORC), Cdc6, Cdt1 and the MCM2-7 complex.
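
    One common way to build a flexibility profile of the kind described above is to score each overlapping k-mer from a lookup table and then smooth the scores with a sliding window. The sketch below illustrates that idea only: the function name `flexibility_profile`, the window size, and the toy parameter values are all assumptions, and the values are placeholders, not the published trinucleotide or tetranucleotide parameters. The abstract does not describe the authors' exact averaging procedure.

```python
def flexibility_profile(seq, params, k=3, window=3):
    """Sliding-window mean of k-mer flexibility scores along a DNA sequence.

    params maps each k-mer to a flexibility score; a KeyError is raised if the
    sequence contains a k-mer that is missing from the table.
    """
    seq = seq.upper()
    # One score per overlapping k-mer, read left to right.
    scores = [params[seq[i:i + k]] for i in range(len(seq) - k + 1)]
    if len(scores) < window:
        raise ValueError("sequence is too short for the requested window")
    # Mean score in each window of consecutive k-mers.
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Toy placeholder scores (NOT real model parameters), covering only the
# trinucleotides that occur in the example sequence.
toy_params = {"AAA": 1.0, "AAT": 2.0, "ATT": 2.0, "TTT": 1.0}
profile = flexibility_profile("AAAATTTT", toy_params, k=3, window=3)
print(profile)
```

    Setting k=4 with a tetranucleotide table would give the analogous profile for a tetranucleotide model; comparing the mean profile inside a region with that of its flanks is then a matter of averaging the relevant windows.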

  13. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    International Nuclear Information System (INIS)

    Arneodo, Alain; Vaillant, Cedric; Audit, Benjamin; Argoul, Francoise; D'Aubenton-Carafa, Yves; Thermes, Claude

    2011-01-01

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  14. Phylogeny and genetic diversity of Bridgeoporus nobilissimus inferred using mitochondrial and nuclear rDNA sequences

    Science.gov (United States)

    Redberg, G.L.; Hibbett, D.S.; Ammirati, J.F.; Rodriguez, R.J.

    2003-01-01

    The genetic diversity and phylogeny of Bridgeoporus nobilissimus have been analyzed. DNA was extracted from spores collected from individual fruiting bodies representing six geographically distinct populations in Oregon and Washington. Spore samples collected contained low levels of bacteria, yeast and a filamentous fungal species. Using taxon-specific PCR primers, it was possible to discriminate among rDNA from bacteria, yeast, a filamentous associate and B. nobilissimus. Nuclear rDNA internal transcribed spacer (ITS) region sequences of B. nobilissimus were compared among individuals representing six populations and were found to have less than 2% variation. These sequences also were used to design dual and nested PCR primers for B. nobilissimus-specific amplification. Mitochondrial small-subunit rDNA sequences were used in a phylogenetic analysis that placed B. nobilissimus in the hymenochaetoid clade, where it was associated with Oxyporus and Schizopora.

  15. Cloning and cDNA sequence of the dihydrolipoamide dehydrogenase component of human α-ketoacid dehydrogenase complexes

    International Nuclear Information System (INIS)

    Pons, G.; Raefsky-Estrin, C.; Carothers, D.J.; Pepin, R.A.; Javed, A.A.; Jesse, B.W.; Ganapathi, M.K.; Samols, D.; Patel, M.S.

    1988-01-01

    cDNA clones comprising the entire coding region for human dihydrolipoamide dehydrogenase have been isolated from a human liver cDNA library. The cDNA sequence of the largest clone consisted of 2082 base pairs and contained a 1527-base open reading frame that encodes a precursor dihydrolipoamide dehydrogenase of 509 amino acid residues. The first 35 amino acid residues of the open reading frame probably correspond to a typical mitochondrial import leader sequence. The predicted amino acid sequence of the mature protein, starting at residue number 36 of the open reading frame, is almost identical (>98% homology) with the known partial amino acid sequence of the pig heart dihydrolipoamide dehydrogenase. The cDNA clone also contains a 3' untranslated region of 505 bases with an unusual polyadenylylation signal (TATAAA) and a short poly(A) track. By blot-hybridization analysis with the cDNA as probe, two mRNAs, 2.2 and 2.4 kilobases in size, have been detected in human tissues and fibroblasts, whereas only one mRNA (2.4 kilobases) was detected in rat tissues.
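
    The lengths quoted in this abstract can be cross-checked with simple arithmetic. The short sketch below uses only the numbers stated above; the variable names are illustrative, and how the stop codon is counted is not stated in the abstract, so the leftover bases are reported without assigning them to a specific region.

```python
# Figures quoted in the abstract.
CDNA_LENGTH = 2082        # bases in the largest clone
ORF_LENGTH = 1527         # bases in the open reading frame
PRECURSOR_RESIDUES = 509  # amino acids encoded by the open reading frame
LEADER_RESIDUES = 35      # putative mitochondrial import leader
UTR3_LENGTH = 505         # bases in the 3' untranslated region

# Three bases per codon: 1527 bases should give exactly 509 codons.
codons, remainder = divmod(ORF_LENGTH, 3)

# The mature protein starts at residue 36, so the first 35 residues are removed.
mature_residues = PRECURSOR_RESIDUES - LEADER_RESIDUES

# Bases not covered by the open reading frame or the 3' untranslated region.
unaccounted = CDNA_LENGTH - ORF_LENGTH - UTR3_LENGTH

print(codons, remainder, mature_residues, unaccounted)  # prints: 509 0 474 50
```

    The open reading frame length is therefore consistent with the stated 509 residues, the mature protein is 474 residues long, and 50 bases of the clone lie outside the two regions whose lengths are given.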

  16. A Nuclear Ribosomal DNA Phylogeny of Acer Inferred with Maximum Likelihood, Splits Graphs, and Motif Analysis of 606 Sequences

    Directory of Open Access Journals (Sweden)

    Guido W. Grimm

    2006-01-01

    Full Text Available The multi-copy internal transcribed spacer (ITS) region of nuclear ribosomal DNA is widely used to infer phylogenetic relationships among closely related taxa. Here we use maximum likelihood (ML) and splits graph analyses to extract phylogenetic information from ~ 600 mostly cloned ITS sequences, representing 81 species and subspecies of Acer, and both species of its sister Dipteronia. Additional analyses compared sequence motifs in Acer and several hundred Anacardiaceae, Burseraceae, Meliaceae, Rutaceae, and Sapindaceae ITS sequences in GenBank. We also assessed the effects of using smaller data sets of consensus sequences with ambiguity coding (accounting for within-species variation) instead of the full (partly redundant) original sequences. Neighbor-nets and bipartition networks were used to visualize conflict among character state patterns. Species clusters observed in the trees and networks largely agree with morphology-based classifications; of de Jong’s (1994) 16 sections, nine are supported in neighbor-net and bipartition networks, and ten by sequence motifs and the ML tree; of his 19 series, 14 are supported in networks, motifs, and the ML tree. Most nodes had higher bootstrap support with matrices of 105 or 40 consensus sequences than with the original matrix. Within-taxon ITS divergence did not differ between diploid and polyploid Acer, and there was little evidence of differentiated parental ITS haplotypes, suggesting that concerted evolution in Acer acts rapidly.

  17. A Nuclear Ribosomal DNA Phylogeny of Acer Inferred with Maximum Likelihood, Splits Graphs, and Motif Analysis of 606 Sequences

    Science.gov (United States)

    Grimm, Guido W.; Renner, Susanne S.; Stamatakis, Alexandros; Hemleben, Vera

    2007-01-01

    The multi-copy internal transcribed spacer (ITS) region of nuclear ribosomal DNA is widely used to infer phylogenetic relationships among closely related taxa. Here we use maximum likelihood (ML) and splits graph analyses to extract phylogenetic information from ~ 600 mostly cloned ITS sequences, representing 81 species and subspecies of Acer, and both species of its sister Dipteronia. Additional analyses compared sequence motifs in Acer and several hundred Anacardiaceae, Burseraceae, Meliaceae, Rutaceae, and Sapindaceae ITS sequences in GenBank. We also assessed the effects of using smaller data sets of consensus sequences with ambiguity coding (accounting for within-species variation) instead of the full (partly redundant) original sequences. Neighbor-nets and bipartition networks were used to visualize conflict among character state patterns. Species clusters observed in the trees and networks largely agree with morphology-based classifications; of de Jong’s (1994) 16 sections, nine are supported in neighbor-net and bipartition networks, and ten by sequence motifs and the ML tree; of his 19 series, 14 are supported in networks, motifs, and the ML tree. Most nodes had higher bootstrap support with matrices of 1