WorldWideScience

Sample records for identifying dna sequence

  1. Potential of DNA sequences to identify zoanthids (Cnidaria: Zoantharia).

    Science.gov (United States)

    Sinniger, Frederic; Reimer, James D; Pawlowski, Jan

    2008-12-01

    The order Zoantharia is known for its chaotic taxonomy and difficult morphological identification. One method that potentially could help for examining such troublesome taxa is DNA barcoding, which identifies species using standard molecular markers. The mitochondrial cytochrome oxidase subunit I (COI) has been utilized to great success in groups such as birds and insects; however, its applicability in many other groups is controversial. Recently, some studies have suggested that barcoding is not applicable to anthozoans. Here, we examine the use of COI and mitochondrial 16S ribosomal DNA for zoanthid identification. Despite the absence of a clear barcoding gap, our results show that for most of 54 zoanthid samples, both markers could separate samples to the species, or species group, level, particularly when easily accessible ecological or distributional data were included. Additionally, we have used the short V5 region of mt 16S rDNA to identify eight old (13 to 50 years old) museum samples. We discuss advantages and disadvantages of COI and mt 16S rDNA as barcodes for Zoantharia, and recommend that either one or both of these markers be considered for zoanthid identification in the future.

  2. An Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

    Science.gov (United States)

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...

  3. A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data

    Science.gov (United States)

    Lea, Amanda J.

    2015-01-01

    Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and because of the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data, as well as genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html. PMID:26599596

  4. SeqAnt: A web service to rapidly identify and annotate DNA sequence variations

    Directory of Open Access Journals (Sweden)

    Patel Viren

    2010-09-01

    Full Text Available Abstract Background The enormous throughput and low cost of second-generation sequencing platforms now allow research and clinical geneticists to routinely perform single experiments that identify tens of thousands to millions of variant sites. Existing methods to annotate variant sites using information from publicly available databases via web browsers are too slow to be useful for the large sequencing datasets being routinely generated by geneticists. Because sequence annotation of variant sites is required before functional characterization can proceed, the lack of a high-throughput pipeline to efficiently annotate variant sites can act as a significant bottleneck in genetics research. Results SeqAnt (Sequence Annotator is an open source web service and software package that rapidly annotates DNA sequence variants and identifies recessive or compound heterozygous loci in human, mouse, fly, and worm genome sequencing experiments. Variants are characterized with respect to their functional type, frequency, and evolutionary conservation. Annotated variants can be viewed on a web browser, downloaded in a tab-delimited text file, or directly uploaded in a BED format to the UCSC genome browser. To demonstrate the speed of SeqAnt, we annotated a series of publicly available datasets that ranged in size from 37 to 3,439,107 variant sites. The total time to completely annotate these data completely ranged from 0.17 seconds to 28 minutes 49.8 seconds. Conclusion SeqAnt is an open source web service and software package that overcomes a critical bottleneck facing research and clinical geneticists using second-generation sequencing platforms. SeqAnt will prove especially useful for those investigators who lack dedicated bioinformatics personnel or infrastructure in their laboratories.

  5. Transcriptome sequencing in pediatric acute lymphoblastic leukemia identifies fusion genes associated with distinct DNA methylation profiles

    Directory of Open Access Journals (Sweden)

    Yanara Marincevic-Zuniga

    2017-08-01

    Full Text Available Abstract Background Structural chromosomal rearrangements that lead to expressed fusion genes are a hallmark of acute lymphoblastic leukemia (ALL. In this study, we performed transcriptome sequencing of 134 primary ALL patient samples to comprehensively detect fusion transcripts. Methods We combined fusion gene detection with genome-wide DNA methylation analysis, gene expression profiling, and targeted sequencing to determine molecular signatures of emerging ALL subtypes. Results We identified 64 unique fusion events distributed among 80 individual patients, of which over 50% have not previously been reported in ALL. Although the majority of the fusion genes were found only in a single patient, we identified several recurrent fusion gene families defined by promiscuous fusion gene partners, such as ETV6, RUNX1, PAX5, and ZNF384, or recurrent fusion genes, such as DUX4-IGH. Our data show that patients harboring these fusion genes displayed characteristic genome-wide DNA methylation and gene expression signatures in addition to distinct patterns in single nucleotide variants and recurrent copy number alterations. Conclusion Our study delineates the fusion gene landscape in pediatric ALL, including both known and novel fusion genes, and highlights fusion gene families with shared molecular etiologies, which may provide additional information for prognosis and therapeutic options in the future.

  6. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  7. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

    Science.gov (United States)

    Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  8. The Comparison of Biochemical and Sequencing 16S rDNA Gene Methods to Identify Nontuberculous Mycobacteria

    Directory of Open Access Journals (Sweden)

    Shafipour1, M.

    2014-11-01

    Full Text Available The identification of Mycobacteria in the species level has great medical importance. Biochemical tests are laborious and time-consuming, so new techniques could be used to identify the species. This research aimed to the comparison of biochemical and sequencing 16S rDNA gene methods to identify nontuberculous Mycobacteria in patients suspected to tuberculosis in Golestan province which is the most prevalent region of tuberculosis in Iran. Among 3336 patients suspected to tuberculosis referred to hospitals and health care centres in Golestan province during 2010-2011, 319 (9.56% culture positive cases were collected. Identification of species by using biochemical tests was done. On the samples recognized as nontuberculous Mycobacteria, after DNA extraction by boiling, 16S rDNA PCR was done and their sequencing were identified by NCBI BLAST. Of the 319 positive samples in Golestan Province, 300 cases were M.tuberculosis and 19 cases (5.01% were identified as nontuberculous Mycobacteria by biochemical tests. 15 out of 19 nontuberculous Mycobacteria were identified by PCR and sequencing method as similar by biochemical methods (similarity rate: 78.9%. But after PCR, 1 case known as M.simiae by biochemical test was identified as M. lentiflavum and 3 other cases were identified as Nocardia. Biochemical methods corresponded to the 16S rDNA PCR and sequencing in 78.9% of cases. However, in identification of M. lentiflavum and Nocaria sp. the molecular method is better than biochemical methods.

  9. Multiplexed resequencing analysis to identify rare variants in pooled DNA with barcode indexing using next-generation sequencer.

    Science.gov (United States)

    Mitsui, Jun; Fukuda, Yoko; Azuma, Kyo; Tozaki, Hirokazu; Ishiura, Hiroyuki; Takahashi, Yuji; Goto, Jun; Tsuji, Shoji

    2010-07-01

    We have recently found that multiple rare variants of the glucocerebrosidase gene (GBA) confer a robust risk for Parkinson disease, supporting the 'common disease-multiple rare variants' hypothesis. To develop an efficient method of identifying rare variants in a large number of samples, we applied multiplexed resequencing using a next-generation sequencer to identification of rare variants of GBA. Sixteen sets of pooled DNAs from six pooled DNA samples were prepared. Each set of pooled DNAs was subjected to polymerase chain reaction to amplify the target gene (GBA) covering 6.5 kb, pooled into one tube with barcode indexing, and then subjected to extensive sequence analysis using the SOLiD System. Individual samples were also subjected to direct nucleotide sequence analysis. With the optimization of data processing, we were able to extract all the variants from 96 samples with acceptable rates of false-positive single-nucleotide variants.

  10. Exome sequencing identifies rare deleterious mutations in DNA repair genes FANCC and BLM as potential breast cancer susceptibility alleles.

    Directory of Open Access Journals (Sweden)

    Ella R Thompson

    2012-09-01

    Full Text Available Despite intensive efforts using linkage and candidate gene approaches, the genetic etiology for the majority of families with a multi-generational breast cancer predisposition is unknown. In this study, we used whole-exome sequencing of thirty-three individuals from 15 breast cancer families to identify potential predisposing genes. Our analysis identified families with heterozygous, deleterious mutations in the DNA repair genes FANCC and BLM, which are responsible for the autosomal recessive disorders Fanconi Anemia and Bloom syndrome. In total, screening of all exons in these genes in 438 breast cancer families identified three with truncating mutations in FANCC and two with truncating mutations in BLM. Additional screening of FANCC mutation hotspot exons identified one pathogenic mutation among an additional 957 breast cancer families. Importantly, none of the deleterious mutations were identified among 464 healthy controls and are not reported in the 1,000 Genomes data. Given the rarity of Fanconi Anemia and Bloom syndrome disorders among Caucasian populations, the finding of multiple deleterious mutations in these critical DNA repair genes among high-risk breast cancer families is intriguing and suggestive of a predisposing role. Our data demonstrate the utility of intra-family exome-sequencing approaches to uncover cancer predisposition genes, but highlight the major challenge of definitively validating candidates where the incidence of sporadic disease is high, germline mutations are not fully penetrant, and individual predisposition genes may only account for a tiny proportion of breast cancer families.

  11. Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

    Science.gov (United States)

    Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

    2014-01-01

    High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.

  12. Pooled-DNA sequencing identifies genomic regions of selection in Nigerian isolates of Plasmodium falciparum.

    Science.gov (United States)

    Oyebola, Kolapo M; Idowu, Emmanuel T; Olukosi, Yetunde A; Awolola, Taiwo S; Amambua-Ngwa, Alfred

    2017-06-29

    The burden of falciparum malaria is especially high in sub-Saharan Africa. Differences in pressure from host immunity and antimalarial drugs lead to adaptive changes responsible for high level of genetic variations within and between the parasite populations. Population-specific genetic studies to survey for genes under positive or balancing selection resulting from drug pressure or host immunity will allow for refinement of interventions. We performed a pooled sequencing (pool-seq) of the genomes of 100 Plasmodium falciparum isolates from Nigeria. We explored allele-frequency based neutrality test (Tajima's D) and integrated haplotype score (iHS) to identify genes under selection. Fourteen shared iHS regions that had at least 2 SNPs with a score > 2.5 were identified. These regions code for genes that were likely to have been under strong directional selection. Two of these genes were the chloroquine resistance transporter (CRT) on chromosome 7 and the multidrug resistance 1 (MDR1) on chromosome 5. There was a weak signature of selection in the dihydrofolate reductase (DHFR) gene on chromosome 4 and MDR5 genes on chromosome 13, with only 2 and 3 SNPs respectively identified within the iHS window. We observed strong selection pressure attributable to continued chloroquine and sulfadoxine-pyrimethamine use despite their official proscription for the treatment of uncomplicated malaria. There was also a major selective sweep on chromosome 6 which had 32 SNPs within the shared iHS region. Tajima's D of circumsporozoite protein (CSP), erythrocyte-binding antigen (EBA-175), merozoite surface proteins - MSP3 and MSP7, merozoite surface protein duffy binding-like (MSPDBL2) and serine repeat antigen (SERA-5) were 1.38, 1.29, 0.73, 0.84 and 0.21, respectively. We have demonstrated the use of pool-seq to understand genomic patterns of selection and variability in P. falciparum from Nigeria, which bears the highest burden of infections. This investigation identified known

  13. diffloop: a computational framework for identifying and analyzing differential DNA loops from sequencing data.

    Science.gov (United States)

    Lareau, Caleb A; Aryee, Martin J; Berger, Bonnie

    2018-02-15

    The 3D architecture of DNA within the nucleus is a key determinant of interactions between genes, regulatory elements, and transcriptional machinery. As a result, differences in DNA looping structure are associated with variation in gene expression and cell state. To systematically assess changes in DNA looping architecture between samples, we introduce diffloop, an R/Bioconductor package that provides a suite of functions for the quality control, statistical testing, annotation, and visualization of DNA loops. We demonstrate this functionality by detecting differences between ENCODE ChIA-PET samples and relate looping to variability in epigenetic state. Diffloop is implemented as an R/Bioconductor package available at https://bioconductor.org/packages/release/bioc/html/diffloop.html. aryee.martin@mgh.harvard.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  14. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  15. Use of DNA sequences to identify forensically important fly species and their distribution in the coastal region of Central California.

    Science.gov (United States)

    Nakano, Angie; Honda, Jeff

    2015-08-01

    Forensic entomology has gained prominence in recent years, as improvements in DNA technology and molecular methods have allowed insect and other arthropod evidence to become increasingly useful in criminal and civil investigations. However, comprehensive faunal inventories are still needed, including cataloging local DNA sequences for forensically significant Diptera. This multi-year fly-trapping study was built upon and expanded a previous survey of these flies in Santa Clara County, including the addition of genetic barcoding data from collected species of flies. Flies from the families Calliphoridae, Sarcophagidae, and Muscidae were trapped in meat-baited traps set in a variety of locations throughout the county. Flies were identified using morphological features and confirmed by molecular analysis. A total of 16 calliphorid species, 11 sarcophagid species, and four muscid species were collected and differentiated. This study found more species of flies than previous area surveys and established new county records for two calliphorid species: Cynomya cadaverina and Chrysomya rufifacies. Differences were found in fly fauna in different areas of the county, indicating the importance of microclimates in the distribution of these flies. Molecular analysis supported the use of DNA barcoding as an effective method of identifying cryptic fly species. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  16. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  17. High-fidelity target sequencing of individual molecules identified using barcode sequences: de novo detection and absolute quantitation of mutations in plasma cell-free DNA from cancer patients.

    Science.gov (United States)

    Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya

    2015-08-01

    Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations have been absolute quantitated. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  18. DNA sequence modeling based on context trees

    NARCIS (Netherlands)

    Kusters, C.J.; Ignatenko, T.; Roland, J.; Horlin, F.

    2015-01-01

    Genomic sequences contain instructions for protein and cell production. Therefore understanding and identification of biologically and functionally meaningful patterns in DNA sequences is of paramount importance. Modeling of DNA sequences in its turn can help to better understand and identify such

  19. Repeated DNA sequences in fungi

    Energy Technology Data Exchange (ETDEWEB)

    Dutta, S K

    1974-11-01

    Several fungal species, representatives of all broad groups like basidiomycetes, ascomycetes and phycomycetes, were examined for the nature of repeated DNA sequences by DNA:DNA reassociation studies using hydroxyapatite chromatography. All of the fungal species tested contained 10 to 20 percent repeated DNA sequences. There are approximately 100 to 110 copies of repeated DNA sequences of approximately 4 x 10/sup 7/ daltons piece size of each. Repeated DNA sequence homoduplexes showed on average 5/sup 0/C difference of T/sub e/50 (temperature at which 50 percent duplexes dissociate) values from the corresponding homoduplexes of unfractionated whole DNA. It is suggested that a part of repetitive sequences in fungi constitutes mitochondrial DNA and a part of it constitutes nuclear DNA. (auth)

  20. Nucleotide sequence preservation of human mitochondrial DNA

    International Nuclear Information System (INIS)

    Monnat, R.J. Jr.; Loeb, L.A.

    1985-01-01

    Recombinant DNA techniques have been used to quantitate the amount of nucleotide sequence divergence in the mitochondrial DNA population of individual normal humans. Mitochondrial DNA was isolated from the peripheral blood lymphocytes of five normal humans and cloned in M13 mp11; 49 kilobases of nucleotide sequence information was obtained from 248 independently isolated clones from the five normal donors. Both between- and within-individual differences were identified. Between-individual differences were identified in approximately = to 1/200 nucleotides. In contrast, only one within-individual difference was identified in 49 kilobases of nucleotide sequence information. This high degree of mitochondrial nucleotide sequence homogeneity in human somatic cells is in marked contrast to the rapid evolutionary divergence of human mitochondrial DNA and suggests the existence of mechanisms for the concerted preservation of mammalian mitochondrial DNA sequences in single organisms

  1. Biosensors for DNA sequence detection

    Science.gov (United States)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  2. Reduced Representation Libraries from DNA Pools Analysed with Next Generation Semiconductor Based-Sequencing to Identify SNPs in Extreme and Divergent Pigs for Back Fat Thickness

    Directory of Open Access Journals (Sweden)

    Samuele Bovo

    2015-01-01

    Full Text Available The aim of this study was to identify single nucleotide polymorphisms (SNPs that could be associated with back fat thickness (BFT in pigs. To achieve this goal, we evaluated the potential and limits of an experimental design that combined several methodologies. DNA samples from two groups of Italian Large White pigs with divergent estimating breeding value (EBV for BFT were separately pooled and sequenced, after preparation of reduced representation libraries (RRLs, on the Ion Torrent technology. Taking advantage from SNAPE for SNPs calling in sequenced DNA pools, 39,165 SNPs were identified; 1/4 of them were novel variants not reported in dbSNP. Combining sequencing data with Illumina PorcineSNP60 BeadChip genotyping results on the same animals, 661 genomic positions overlapped with a good approximation of minor allele frequency estimation. A total of 54 SNPs showing enriched alleles in one or in the other RRLs might be potential markers associated with BFT. Some of these SNPs were close to genes involved in obesity related phenotypes.

  3. Quantitation of heteroplasmy of mtDNA sequence variants identified in a population of AD patients and controls by array-based resequencing.

    Science.gov (United States)

    Coon, Keith D; Valla, Jon; Szelinger, Szabolics; Schneider, Lonnie E; Niedzielko, Tracy L; Brown, Kevin M; Pearson, John V; Halperin, Rebecca; Dunckley, Travis; Papassotiropoulos, Andreas; Caselli, Richard J; Reiman, Eric M; Stephan, Dietrich A

    2006-08-01

    The role of mitochondrial dysfunction in the pathogenesis of Alzheimer's disease (AD) has been well documented. Though evidence for the role of mitochondria in AD seems incontrovertible, the impact of mitochondrial DNA (mtDNA) mutations in AD etiology remains controversial. Though mutations in mitochondrially encoded genes have repeatedly been implicated in the pathogenesis of AD, many of these studies have been plagued by lack of replication as well as potential contamination of nuclear-encoded mitochondrial pseudogenes. To assess the role of mtDNA mutations in the pathogenesis of AD, while avoiding the pitfalls of nuclear-encoded mitochondrial pseudogenes encountered in previous investigations and showcasing the benefits of a novel resequencing technology, we sequenced the entire coding region (15,452 bp) of mtDNA from 19 extremely well-characterized AD patients and 18 age-matched, unaffected controls utilizing a new, reliable, high-throughput array-based resequencing technique, the Human MitoChip. High-throughput, array-based DNA resequencing of the entire mtDNA coding region from platelets of 37 subjects revealed the presence of 208 loci displaying a total of 917 sequence variants. There were no statistically significant differences in overall mutational burden between cases and controls, however, 265 independent sites of statistically significant change between cases and controls were identified. Changed sites were found in genes associated with complexes I (30.2%), III (3.0%), IV (33.2%), and V (9.1%) as well as tRNA (10.6%) and rRNA (14.0%). Despite their statistical significance, the subtle nature of the observed changes makes it difficult to determine whether they represent true functional variants involved in AD etiology or merely naturally occurring dissimilarity. Regardless, this study demonstrates the tremendous value of this novel mtDNA resequencing platform, which avoids the pitfalls of erroneously amplifying nuclear-encoded mtDNA pseudogenes, and

  4. Whole-exome sequencing of muscle-invasive bladder cancer identifies recurrent mutations of UNC5C and prognostic importance of DNA repair gene mutations on survival.

    Science.gov (United States)

    Yap, Kai Lee; Kiyotani, Kazuma; Tamura, Kenji; Antic, Tatjana; Jang, Miran; Montoya, Magdeline; Campanile, Alexa; Yew, Poh Yin; Ganshert, Cory; Fujioka, Tomoaki; Steinberg, Gary D; O'Donnell, Peter H; Nakamura, Yusuke

    2014-12-15

    Because of suboptimal outcomes in muscle-invasive bladder cancer even with multimodality therapy, determination of potential genetic drivers offers the possibility of improving therapeutic approaches and discovering novel prognostic indicators. Using pTN staging, we case-matched 81 patients with resected ≥pT2 bladder cancers for whom perioperative chemotherapy use and disease recurrence status were known. Whole-exome sequencing was conducted in 43 cases to identify recurrent somatic mutations and targeted sequencing of 10 genes selected from the initial screening in an additional 38 cases was completed. Mutational profiles along with clinicopathologic information were correlated with recurrence-free survival (RFS) in the patients. We identified recurrent novel somatic mutations in the gene UNC5C (9.9%), in addition to TP53 (40.7%), KDM6A (21.0%), and TSC1 (12.3%). Patients who were carriers of somatic mutations in DNA repair genes (one or more of ATM, ERCC2, FANCD2, PALB2, BRCA1, or BRCA2) had a higher overall number of somatic mutations (P = 0.011). Importantly, after a median follow-up of 40.4 months, carriers of somatic mutations (n = 25) in any of these six DNA repair genes had significantly enhanced RFS compared with noncarriers [median, 32.4 vs. 14.8 months; hazard ratio of 0.46, 95% confidence interval (CI), 0.22-0.98; P = 0.0435], after adjustment for pathologic pTN staging and independent of adjuvant chemotherapy usage. Better prognostic outcomes of individuals carrying somatic mutations in DNA repair genes suggest these mutations as favorable prognostic events in muscle-invasive bladder cancer. Additional mechanistic investigation into the previously undiscovered role of UNC5C in bladder cancer is warranted. ©2014 American Association for Cancer Research.

  5. Duplication in DNA Sequences

    Science.gov (United States)

    Ito, Masami; Kari, Lila; Kincaid, Zachary; Seki, Shinnosuke

    The duplication and repeat-deletion operations are the basis of a formal language theoretic model of errors that can occur during DNA replication. During DNA replication, subsequences of a strand of DNA may be copied several times (resulting in duplications) or skipped (resulting in repeat-deletions). As formal language operations, iterated duplication and repeat-deletion of words and languages have been well studied in the literature. However, little is known about single-step duplications and repeat-deletions. In this paper, we investigate several properties of these operations, including closure properties of language families in the Chomsky hierarchy and equations involving these operations. We also make progress toward a characterization of regular languages that are generated by duplicating a regular language.

  6. Graphene nanodevices for DNA sequencing

    NARCIS (Netherlands)

    Heerema, S.J.; Dekker, C.

    2016-01-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with

  7. On site DNA barcoding by nanopore sequencing.

    Directory of Open Access Journals (Sweden)

    Michele Menegon

    Full Text Available Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet's biological heritage. The use of genetic markers i.e. DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in a mimicking tropical environment, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error-rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time-on-site DNA sequencing thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities.

  8. Sequence analysis of Leukemia DNA

    Science.gov (United States)

    Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa

    2018-03-01

    Cancer is a very deadly disease, one of which is leukemia disease or better known as blood cancer. The cancer cell can be detected by taking DNA in laboratory test. This study focused on local alignment of leukemia and non leukemia data resulting from NCBI in the form of DNA sequences by using Smith-Waterman algorithm. SmithWaterman algorithm was invented by TF Smith and MS Waterman in 1981. These algorithms try to find as much as possible similarity of a pair of sequences, by giving a negative value to the unequal base pair (mismatch), and positive values on the same base pair (match). So that will obtain the maximum positive value as the end of the alignment, and the minimum value as the initial alignment. This study will use sequences of leukemia and 3 sequences of non leukemia.

  9. DNA Sequencing by Capillary Electrophoresis

    Science.gov (United States)

    Karger, Barry L.; Guttman, Andras

    2009-01-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA sequencing methods have evolved from the labor intensive slab gel electrophoresis, through automated multicapillary electrophoresis systems using fluorophore labeling with multispectral imaging, to the “next generation” technologies of cyclic array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes was only possible by the advent of modern sequencing technologies that was a result of step by step advances with a contribution of academics, medical personnel and instrument companies. While next generation sequencing is moving ahead at break-neck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this prospective, we wish to overview the role of capillary electrophoresis in DNA sequencing based in part of several of our articles in this journal. PMID:19517496

  10. Bacterial identification and subtyping using DNA microarray and DNA sequencing.

    Science.gov (United States)

    Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

    2012-01-01

    The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies not only will identify, isotype, and serotype pathogenic bacteria, but also it will aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple to use technique that can produce accurate and quantitative analysis of DNA sequences with a great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.

  11. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance,and even in biology. There has been an increasing interest in unravelling the mysteries of DNA; for example, how can we distinguish coding and noncoding sequences, and the problems of classification and evolution relationship of organisms are key problems in bioinformatics. Although much research has been carried out by taking into consideration the long-range correlations in DNA sequences, and the global fractal dimension has been used in these works by other people, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view)to DNA sequence analysis. We have also used fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

  12. Comparison of the accuracy of two conventional phenotypic methods and two MALDI-TOF MS systems with that of DNA sequencing analysis for correctly identifying clinically encountered yeasts.

    Science.gov (United States)

    Chao, Qiao-Ting; Lee, Tai-Fen; Teng, Shih-Hua; Peng, Li-Yun; Chen, Ping-Hung; Teng, Lee-Jene; Hsueh, Po-Ren

    2014-01-01

    We assessed the accuracy of species-level identification of two commercially available matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) systems (Bruker Biotyper and Vitek MS) and two conventional phenotypic methods (Phoenix 100 YBC and Vitek 2 Yeast ID) with that of rDNA gene sequencing analysis among 200 clinical isolates of commonly encountered yeasts. The correct identification rates of the 200 yeast isolates to species or complex (Candida parapsilosis complex, C. guilliermondii complex and C. rugosa complex) levels by the Bruker Biotyper, Vitek MS (using in vitro devices [IVD] database), Phoenix 100 YBC and Vitek 2 Yeast ID (Sabouraud's dextrose agar) systems were 92.5%, 79.5%, 89%, and 74%, respectively. An additional 72 isolates of C. parapsilosis complex and 18 from the above 200 isolates (30 in each of C. parapsilosis, C. metapsilosis, and C. orthopsilosis) were also evaluated separately. Bruker Biotyper system could accurately identify all C. parapsilosis complex to species level. Using Vitek 2 MS (IVD) system, all C. parapsilosis but none of C. metapsilosis, or C. orthopsilosis could be accurately identified. Among the 89 yeasts misidentified by the Vitek 2 MS (IVD) system, 39 (43.8%), including 27 C. orthopsilosis isolates, could be correctly identified Using the Vitek MS Plus SARAMIS database for research use only. This resulted in an increase in the rate of correct identification of all yeast isolates (87.5%) by Vitek 2 MS. The two species in C. guilliermondii complex (C. guilliermondii and C. fermentati) isolates were correctly identified by cluster analysis of spectra generated by the Bruker Biotyper system. Based on the results obtained in the current study, MALDI-TOF MS systems present a promising alternative for the routine identification of yeast species, including clinically commonly and rarely encountered yeast species and several species belonging to C. parapsilosis complex, C. guilliermondii complex

  13. Comparison of the accuracy of two conventional phenotypic methods and two MALDI-TOF MS systems with that of DNA sequencing analysis for correctly identifying clinically encountered yeasts.

    Directory of Open Access Journals (Sweden)

    Qiao-Ting Chao

    Full Text Available We assessed the accuracy of species-level identification of two commercially available matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS systems (Bruker Biotyper and Vitek MS and two conventional phenotypic methods (Phoenix 100 YBC and Vitek 2 Yeast ID with that of rDNA gene sequencing analysis among 200 clinical isolates of commonly encountered yeasts. The correct identification rates of the 200 yeast isolates to species or complex (Candida parapsilosis complex, C. guilliermondii complex and C. rugosa complex levels by the Bruker Biotyper, Vitek MS (using in vitro devices [IVD] database, Phoenix 100 YBC and Vitek 2 Yeast ID (Sabouraud's dextrose agar systems were 92.5%, 79.5%, 89%, and 74%, respectively. An additional 72 isolates of C. parapsilosis complex and 18 from the above 200 isolates (30 in each of C. parapsilosis, C. metapsilosis, and C. orthopsilosis were also evaluated separately. Bruker Biotyper system could accurately identify all C. parapsilosis complex to species level. Using Vitek 2 MS (IVD system, all C. parapsilosis but none of C. metapsilosis, or C. orthopsilosis could be accurately identified. Among the 89 yeasts misidentified by the Vitek 2 MS (IVD system, 39 (43.8%, including 27 C. orthopsilosis isolates, could be correctly identified Using the Vitek MS Plus SARAMIS database for research use only. This resulted in an increase in the rate of correct identification of all yeast isolates (87.5% by Vitek 2 MS. The two species in C. guilliermondii complex (C. guilliermondii and C. fermentati isolates were correctly identified by cluster analysis of spectra generated by the Bruker Biotyper system. Based on the results obtained in the current study, MALDI-TOF MS systems present a promising alternative for the routine identification of yeast species, including clinically commonly and rarely encountered yeast species and several species belonging to C. parapsilosis complex, C. guilliermondii

  14. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    Science.gov (United States)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of unique ``electronic fingerprint'' of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shift depending on the pH, with most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  15. Fast and secure retrieval of DNA sequences

    NARCIS (Netherlands)

    2014-01-01

    Sequence models are retrieved from a sequences index. The sequence models model DNA or RNA sequences stored in a database, and each comprises a finite memory tree source model and parameters for the finite memory tree source model. One or more DNA or RNA sequences stored in the database are

  16. Entropic fluctuations in DNA sequences

    Science.gov (United States)

    Thanos, Dimitrios; Li, Wentian; Provata, Astero

    2018-03-01

    The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that despite this reduction of information, LSE allows to extract meaningful information related to the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.

  17. POU4F3 mutation screening in Japanese hearing loss patients: Massively parallel DNA sequencing-based analysis identified novel variants associated with autosomal dominant hearing loss.

    Directory of Open Access Journals (Sweden)

    Tomohiro Kitano

    Full Text Available A variant in a transcription factor gene, POU4F3, is responsible for autosomal dominant nonsyndromic hereditary hearing loss, DFNA15. To date, 14 variants, including a whole deletion of POU4F3, have been reported to cause HL in various ethnic groups. In the present study, genetic screening for POU4F3 variants was carried out for a large series of Japanese hearing loss (HL patients to clarify the prevalence and clinical characteristics of DFNA15 in the Japanese population. Massively parallel DNA sequencing of 68 target candidate genes was utilized in 2,549 unrelated Japanese HL patients (probands to identify genomic variations responsible for HL. The detailed clinical features in patients with POU4F3 variants were collected from medical charts and analyzed. Novel 12 POU4F3 likely pathogenic variants (six missense variants, three frameshift variants, and three nonsense variants were successfully identified in 15 probands (2.5% among 602 families exhibiting autosomal dominant HL, whereas no variants were detected in the other 1,947 probands with autosomal recessive or inheritance pattern unknown HL. To obtain the audiovestibular configuration of the patients harboring POU4F3 variants, we collected audiograms and vestibular symptoms of the probands and their affected family members. Audiovestibular phenotypes in a total of 24 individuals from the 15 families possessing variants were characterized by progressive HL, with a large variation in the onset age and severity with or without vestibular symptoms observed. Pure-tone audiograms indicated the most prevalent configuration as mid-frequency HL type followed by high-frequency HL type, with asymmetry observed in approximately 20% of affected individuals. Analysis of the relationship between age and pure-tone average suggested that individuals with truncating variants showed earlier onset and slower progression of HL than did those with non-truncating variants. The present study showed that variants

  18. Identifying driver mutations in sequenced cancer genomes

    DEFF Research Database (Denmark)

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, nois...... patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer....

  19. DNA Replication Profiling Using Deep Sequencing.

    Science.gov (United States)

    Saayman, Xanita; Ramos-Pérez, Cristina; Brown, Grant W

    2018-01-01

    Profiling of DNA replication during progression through S phase allows a quantitative snap-shot of replication origin usage and DNA replication fork progression. We present a method for using deep sequencing data to profile DNA replication in S. cerevisiae.

  20. Dog Y chromosomal DNA sequence: identification, sequencing and SNP discovery

    Directory of Open Access Journals (Sweden)

    Kirkness Ewen

    2006-10-01

    Full Text Available Abstract Background Population genetic studies of dogs have so far mainly been based on analysis of mitochondrial DNA, describing only the history of female dogs. To get a picture of the male history, as well as a second independent marker, there is a need for studies of biallelic Y-chromosome polymorphisms. However, there are no biallelic polymorphisms reported, and only 3200 bp of non-repetitive dog Y-chromosome sequence deposited in GenBank, necessitating the identification of dog Y chromosome sequence and the search for polymorphisms therein. The genome has been only partially sequenced for one male dog, disallowing mapping of the sequence into specific chromosomes. However, by comparing the male genome sequence to the complete female dog genome sequence, candidate Y-chromosome sequence may be identified by exclusion. Results The male dog genome sequence was analysed by Blast search against the human genome to identify sequences with a best match to the human Y chromosome and to the female dog genome to identify those absent in the female genome. Candidate sequences were then tested for male specificity by PCR of five male and five female dogs. 32 sequences from the male genome, with a total length of 24 kbp, were identified as male specific, based on a match to the human Y chromosome, absence in the female dog genome and male specific PCR results. 14437 bp were then sequenced for 10 male dogs originating from Europe, Southwest Asia, Siberia, East Asia, Africa and America. Nine haplotypes were found, which were defined by 14 substitutions. The genetic distance between the haplotypes indicates that they originate from at least five wolf haplotypes. There was no obvious trend in the geographic distribution of the haplotypes. Conclusion We have identified 24159 bp of dog Y-chromosome sequence to be used for population genetic studies. We sequenced 14437 bp in a worldwide collection of dogs, identifying 14 SNPs for future SNP analyses, and

  1. An automated annotation tool for genomic DNA sequences using

    Indian Academy of Sciences (India)

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated ...

  2. Adenoviral DNA replication: DNA sequences and enzymes required for initiation in vitro

    International Nuclear Information System (INIS)

    Stillman, B.W.; Tamanoi, F.

    1983-01-01

    In this paper evidence is provided that the 140,000-dalton DNA polymerase is encoded by the adenoviral genome and is required for the initiation of DNA replication in vitro. The DNA sequences in the template DNA that are required for the initiation of replication have also been identified, using both plasmid DNAs and synthetic oligodeoxyribonucleotides. 48 references, 7 figures, 1 table

  3. Use of DNA barcodes to identify flowering plants

    OpenAIRE

    Kress, W. John; Wurdack, Kenneth J.; Zimmer, Elizabeth A.; Weigt, Lee A.; Janzen, Daniel H.

    2005-01-01

    Methods for identifying species by using short orthologous DNA sequences, known as “DNA barcodes,” have been proposed and initiated to facilitate biodiversity studies, identify juveniles, associate sexes, and enhance forensic analyses. The cytochrome c oxidase 1 sequence, which has been found to be widely applicable in animal barcoding, is not appropriate for most species of plants because of a much slower rate of cytochrome c oxidase 1 gene evolution in higher plants than in animals. We ther...

  4. A viral metagenomic approach on a non-metagenomic experiment: Mining next generation sequencing datasets from pig DNA identified several porcine parvoviruses for a retrospective evaluation of viral infections.

    Directory of Open Access Journals (Sweden)

    Samuele Bovo

    Full Text Available Shot-gun next generation sequencing (NGS on whole DNA extracted from specimens collected from mammals often produces reads that are not mapped (i.e. unmapped reads on the host reference genome and that are usually discarded as by-products of the experiments. In this study, we mined Ion Torrent reads obtained by sequencing DNA isolated from archived blood samples collected from 100 performance tested Italian Large White pigs. Two reduced representation libraries were prepared from two DNA pools constructed each from 50 equimolar DNA samples. Bioinformatic analyses were carried out to mine unmapped reads on the reference pig genome that were obtained from the two NGS datasets. In silico analyses included read mapping and sequence assembly approaches for a viral metagenomic analysis using the NCBI Viral Genome Resource. Our approach identified sequences matching several viruses of the Parvoviridae family: porcine parvovirus 2 (PPV2, PPV4, PPV5 and PPV6 and porcine bocavirus 1-H18 isolate (PBoV1-H18. The presence of these viruses was confirmed by PCR and Sanger sequencing of individual DNA samples. PPV2, PPV4, PPV5, PPV6 and PBoV1-H18 were all identified in samples collected in 1998-2007, 1998-2000, 1997-2000, 1998-2004 and 2003, respectively. For most of these viruses (PPV4, PPV5, PPV6 and PBoV1-H18 previous studies reported their first occurrence much later (from 5 to more than 10 years than our identification period and in different geographic areas. Our study provided a retrospective evaluation of apparently asymptomatic parvovirus infected pigs providing information that could be important to define occurrence and prevalence of different parvoviruses in South Europe. This study demonstrated the potential of mining NGS datasets non-originally derived by metagenomics experiments for viral metagenomics analyses in a livestock species.

  5. Identifying Fishes through DNA Barcodes and Microarrays.

    Directory of Open Access Journals (Sweden)

    Marc Kochzius

    2010-09-01

    Full Text Available International fish trade reached an import value of 62.8 billion Euro in 2006, of which 44.6% are covered by the European Union. Species identification is a key problem throughout the life cycle of fishes: from eggs and larvae to adults in fisheries research and control, as well as processed fish products in consumer protection.This study aims to evaluate the applicability of the three mitochondrial genes 16S rRNA (16S, cytochrome b (cyt b, and cytochrome oxidase subunit I (COI for the identification of 50 European marine fish species by combining techniques of "DNA barcoding" and microarrays. In a DNA barcoding approach, neighbour Joining (NJ phylogenetic trees of 369 16S, 212 cyt b, and 447 COI sequences indicated that cyt b and COI are suitable for unambiguous identification, whereas 16S failed to discriminate closely related flatfish and gurnard species. In course of probe design for DNA microarray development, each of the markers yielded a high number of potentially species-specific probes in silico, although many of them were rejected based on microarray hybridisation experiments. None of the markers provided probes to discriminate the sibling flatfish and gurnard species. However, since 16S-probes were less negatively influenced by the "position of label" effect and showed the lowest rejection rate and the highest mean signal intensity, 16S is more suitable for DNA microarray probe design than cty b and COI. The large portion of rejected COI-probes after hybridisation experiments (>90% renders the DNA barcoding marker as rather unsuitable for this high-throughput technology.Based on these data, a DNA microarray containing 64 functional oligonucleotide probes for the identification of 30 out of the 50 fish species investigated was developed. It represents the next step towards an automated and easy-to-handle method to identify fish, ichthyoplankton, and fish products.

  6. "First generation" automated DNA sequencing technology.

    Science.gov (United States)

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.

  7. Understanding human DNA sequence variation.

    Science.gov (United States)

    Kidd, K K; Pakstis, A J; Speed, W C; Kidd, J R

    2004-01-01

    Over the past century researchers have identified normal genetic variation and studied that variation in diverse human populations to determine the amounts and distributions of that variation. That information is being used to develop an understanding of the demographic histories of the different populations and the species as a whole, among other studies. With the advent of DNA-based markers in the last quarter century, these studies have accelerated. One of the challenges for the next century is to understand that variation. One component of that understanding will be population genetics. We present here examples of many of the ways these new data can be analyzed from a population perspective using results from our laboratory on multiple individual DNA-based polymorphisms, many clustered in haplotypes, studied in multiple populations representing all major geographic regions of the world. These data support an "out of Africa" hypothesis for human dispersal around the world and begin to refine the understanding of population structures and genetic relationships. We are also developing baseline information against which we can compare findings at different loci to aid in the identification of loci subject, now and in the past, to selection (directional or balancing). We do not yet have a comprehensive understanding of the extensive variation in the human genome, but some of that understanding is coming from population genetics.

  8. Human Chromosome 7: DNA Sequence and Biology

    OpenAIRE

    Scherer, Stephen W.; Cheung, Joseph; MacDonald, Jeffrey R.; Osborne, Lucy R.; Nakabayashi, Kazuhiko; Herbrick, Jo-Anne; Carson, Andrew R.; Parker-Katiraee, Layla; Skaug, Jennifer; Khaja, Razi; Zhang, Junjun; Hudek, Alexander K.; Li, Martin; Haddad, May; Duggan, Gavin E.

    2003-01-01

    DNA sequence and annotation of the entire human chromosome 7, encompassing nearly 158 million nucleotides of DNA and 1917 gene structures, are presented. To generate a higher order description, additional structural features such as imprinted genes, fragile sites, and segmental duplications were integrated at the level of the DNA sequence with medical genetic data, including 440 chromosome rearrangement breakpoints associated with disease. This approach enabled the discovery of candidate gene...

  9. Multiple tag labeling method for DNA sequencing

    Science.gov (United States)

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  10. Aspects of coverage in medical DNA sequencing

    Directory of Open Access Journals (Sweden)

    Wilson Richard K

    2008-05-01

    Full Text Available Abstract Background DNA sequencing is now emerging as an important component in biomedical studies of diseases like cancer. Short-read, highly parallel sequencing instruments are expected to be used heavily for such projects, but many design specifications have yet to be conclusively established. Perhaps the most fundamental of these is the redundancy required to detect sequence variations, which bears directly upon genomic coverage and the consequent resolving power for discerning somatic mutations. Results We address the medical sequencing coverage problem via an extension of the standard mathematical theory of haploid coverage. The expected diploid multi-fold coverage, as well as its generalization for aneuploidy are derived and these expressions can be readily evaluated for any project. The resulting theory is used as a scaling law to calibrate performance to that of standard BAC sequencing at 8× to 10× redundancy, i.e. for expected coverages that exceed 99% of the unique sequence. A differential strategy is formalized for tumor/normal studies wherein tumor samples are sequenced more deeply than normal ones. In particular, both tumor alleles should be detected at least twice, while both normal alleles are detected at least once. Our theory predicts these requirements can be met for tumor and normal redundancies of approximately 26× and 21×, respectively. We explain why these values do not differ by a factor of 2, as might intuitively be expected. Future technology developments should prompt even deeper sequencing of tumors, but the 21× value for normal samples is essentially a constant. Conclusion Given the assumptions of standard coverage theory, our model gives pragmatic estimates for required redundancy. The differential strategy should be an efficient means of identifying potential somatic mutations for further study.

  11. High-Throughput Block Optical DNA Sequence Identification.

    Science.gov (United States)

    Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant

    2018-01-01

    Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm -1 . Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. Order and correlations in genomic DNA sequences. The spectral approach

    International Nuclear Information System (INIS)

    Lobzin, Vasilii V; Chechetkin, Vladimir R

    2000-01-01

    The structural analysis of genomic DNA sequences is discussed in the framework of the spectral approach, which is sufficiently universal due to the reciprocal correspondence and mutual complementarity of Fourier transform length scales. The spectral characteristics of random sequences of the same nucleotide composition possess the property of self-averaging for relatively short sequences of length M≥100-300. Comparison with the characteristics of random sequences determines the statistical significance of the structural features observed. Apart from traditional applications to the search for hidden periodicities, spectral methods are also efficient in studying mutual correlations in DNA sequences. By combining spectra for structure factors and correlation functions, not only integral correlations can be estimated but also their origin identified. Using the structural spectral entropy approach, the regularity of a sequence can be quantitatively assessed. A brief introduction to the problem is also presented and other major methods of DNA sequence analysis described. (reviews of topical problems)

  13. Novel genomes and genome constitutions identified by GISH and 5S rDNA and knotted1 genomic sequences in the genus Setaria.

    Science.gov (United States)

    Zhao, Meicheng; Zhi, Hui; Doust, Andrew N; Li, Wei; Wang, Yongfang; Li, Haiquan; Jia, Guanqing; Wang, Yongqiang; Zhang, Ning; Diao, Xianmin

    2013-04-11

    The Setaria genus is increasingly of interest to researchers, as its two species, S. viridis and S. italica, are being developed as models for understanding C4 photosynthesis and plant functional genomics. The genome constitution of Setaria species has been studied in the diploid species S. viridis, S. adhaerans and S. grisebachii, where three genomes A, B and C were identified respectively. Two allotetraploid species, S. verticillata and S. faberi, were found to have AABB genomes, and one autotetraploid species, S. queenslandica, with an AAAA genome, has also been identified. The genomes and genome constitutions of most other species remain unknown, even though it was thought there are approximately 125 species in the genus distributed world-wide. GISH was performed to detect the genome constitutions of Eurasia species of S. glauca, S. plicata, and S. arenaria, with the known A, B and C genomes as probes. No or very poor hybridization signal was detected indicating that their genomes are different from those already described. GISH was also performed reciprocally between S. glauca, S. plicata, and S. arenaria genomes, but no hybridization signals between each other were found. The two sets of chromosomes of S. lachnea both hybridized strong signals with only the known C genome of S. grisebachii. Chromosomes of Qing 9, an accession formerly considered as S. viridis, hybridized strong signal only to B genome of S. adherans. Phylogenetic trees constructed with 5S rDNA and knotted1 markers, clearly classify the samples in this study into six clusters, matching the GISH results, and suggesting that the F genome of S. arenaria is basal in the genus. Three novel genomes in the Setaria genus were identified and designated as genome D (S. glauca), E (S. plicata) and F (S. arenaria) respectively. The genome constitution of tetraploid S. lachnea is putatively CCC'C'. Qing 9 is a B genome species indigenous to China and is hypothesized to be a newly identified species. The

  14. EGNAS: an exhaustive DNA sequence design algorithm

    Directory of Open Access Journals (Sweden)

    Kick Alfred

    2012-06-01

    Full Text Available Abstract Background The molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that allows to generate sets containing a maximum number of sequences with defined properties. EGNAS (Exhaustive Generation of Nucleic Acid Sequences offers the possibility of controlling both interstrand and intrastrand properties. The guanine-cytosine content can be adjusted. Sequences can be forced to start and end with guanine or cytosine. This option reduces the risk of “fraying” of DNA strands. It is possible to limit cross hybridizations of a defined length, and to adjust the uniqueness of sequences. Self-complementarity and hairpin structures of certain length can be avoided. Sequences and subsequences can optionally be forbidden. Furthermore, sequences can be designed to have minimum interactions with predefined strands and neighboring sequences. Results The algorithm is realized in a C++ program. TAG sequences can be generated and combined with primers for single-base extension reactions, which were described for multiplexed genotyping of single nucleotide polymorphisms. Thereby, possible foldback through intrastrand interaction of TAG-primer pairs can be limited. The design of sequences for specific attachment of molecular constructs to DNA origami is presented. Conclusions We developed a new software tool called EGNAS for the design of unique nucleic acid sequences. The presented exhaustive algorithm allows to generate greater sets of sequences than with previous software and equal constraints. EGNAS is freely available for noncommercial use at http://www.chm.tu-dresden.de/pc6/EGNAS.

  15. Chromatid interchanges at intrachromosomal telomeric DNA sequences

    International Nuclear Information System (INIS)

    Fernandez, J.L.; Vazquez-Gundin, F.; Bilbao, A.; Gosalvez, J.; Goyanes, V.

    1997-01-01

    Chinese hamster Don cells were exposed to X-rays, mitomycin C and teniposide (VM-26) to induce chromatid exchanges (quadriradials and triradials). After fluorescence in situ hybridization (FISH) of telomere sequences it was found that interstitial telomere-like DNA sequence arrays presented around five times more breakage-rearrangements than the genome overall. This high recombinogenic capacity was independent of the clastogen, suggesting that this susceptibility is not related to the initial mechanisms of DNA damage. (author)

  16. Mitochondrial DNA sequence evolution in shorebird populations

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons

  17. Recurrence plot analysis of DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Wu Zuobing [State Key Laboratory of Nonlinear Mechanics, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100080 (China)]. E-mail: wuzb@lnm.imech.ac.cn

    2004-11-15

    Recurrence plot technique of DNA sequences is established on metric representation and employed to analyze correlation structure of nucleotide strings. It is found that, in the transference of nucleotide strings, a human DNA fragment has a major correlation distance, but a yeast chromosome's correlation distance has a constant increasing.

  18. Recurrence time statistics: versatile tools for genomic DNA sequence analysis.

    Science.gov (United States)

    Cao, Yinhe; Tung, Wen-Wen; Gao, J B

    2004-01-01

    With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cervisivae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.

  19. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Full Text Available Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps and the Desulfovibrio africanus genome (1 intractable gap. The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  20. Sequencing Intractable DNA to Close Microbial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Hurt, Jr., Richard Ashley [ORNL; Brown, Steven D [ORNL; Podar, Mircea [ORNL; Palumbo, Anthony Vito [ORNL; Elias, Dwayne A [ORNL

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  1. Mapping Base Modifications in DNA by Transverse-Current Sequencing

    Science.gov (United States)

    Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.

    2018-02-01

    Sequencing DNA modifications and lesions, such as methylation of cytosine and oxidation of guanine, is even more important and challenging than sequencing the genome itself. The traditional methods for detecting DNA modifications are either insensitive to these modifications or require additional processing steps to identify a particular type of modification. Transverse-current sequencing in nanopores can potentially identify the canonical bases and base modifications in the same run. In this work, we demonstrate that the most common DNA epigenetic modifications and lesions can be detected with any predefined accuracy based on their tunneling current signature. Our results are based on simulations of the nanopore tunneling current through DNA molecules, calculated using nonequilibrium electron-transport methodology within an effective multiorbital model derived from first-principles calculations, followed by a base-calling algorithm accounting for neighbor current-current correlations. This methodology can be integrated with existing experimental techniques to improve base-calling fidelity.

  2. RANDNA: a random DNA sequence generator.

    Science.gov (United States)

    Piva, Francesco; Principato, Giovanni

    2006-01-01

    Monte Carlo simulations are useful to verify the significance of data. Genomic regularities, such as the nucleotide correlations or the not uniform distribution of the motifs throughout genomic or mature mRNA sequences, exist and their significance can be checked by means of the Monte Carlo test. The test needs good quality random sequences in order to work, moreover they should have the same nucleotide distribution as the sequences in which the regularities have been found. Random DNA sequences are also useful to estimate the background score of an alignment, that is a threshold below which the resulting score is merely due to chance. We have developed RANDNA, a free software which allows to produce random DNA or RNA sequences setting both their length and the percentage of nucleotide composition. Sequences having the same nucleotide distribution of exonic, intronic or intergenic sequences can be generated. Its graphic interface makes it possible to easily set the parameters that characterize the sequences being produced and saved in a text format file. The pseudo-random number generator function of Borland Delphi 6 is used, since it guarantees a good randomness, a long cycle length and a high speed. We have checked the quality of sequences generated by the software, by means of well-known tests, both by themselves and versus genuine random sequences. We show the good quality of the generated sequences. The software, complete with examples and documentation, is freely available to users from: http://www.introni.it/en/software.

  3. Compressing DNA sequence databases with coil

    Directory of Open Access Journals (Sweden)

    Hendy Michael D

    2008-05-01

    Full Text Available Abstract Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.

  4. DNA Sequencing in Cultural Heritage.

    Science.gov (United States)

    Vai, Stefania; Lari, Martina; Caramelli, David

    2016-02-01

    During the last three decades, DNA analysis on degraded samples revealed itself as an important research tool in anthropology, archaeozoology, molecular evolution, and population genetics. Application on topics such as determination of species origin of prehistoric and historic objects, individual identification of famous personalities, characterization of particular samples important for historical, archeological, or evolutionary reconstructions, confers to the paleogenetics an important role also for the enhancement of cultural heritage. A really fast improvement in methodologies in recent years led to a revolution that permitted recovering even complete genomes from highly degraded samples with the possibility to go back in time 400,000 years for samples from temperate regions and 700,000 years for permafrozen remains and to analyze even more recent material that has been subjected to hard biochemical treatments. Here we propose a review on the different methodological approaches used so far for the molecular analysis of degraded samples and their application on some case studies.

  5. Spreadsheet-based program for alignment of overlapping DNA sequences.

    Science.gov (United States)

    Anbazhagan, R; Gabrielson, E

    1999-06-01

    Molecular biology laboratories frequently face the challenge of aligning small overlapping DNA sequences derived from a long DNA segment. Here, we present a short program that can be used to adapt Excel spreadsheets as a tool for aligning DNA sequences, regardless of their orientation. The program runs on any Windows or Macintosh operating system computer with Excel 97 or Excel 98. The program is available for use as an Excel file, which can be downloaded from the BioTechniques Web site. Upon execution, the program opens a specially designed customized workbook and is capable of identifying overlapping regions between two sequence fragments and displaying the sequence alignment. It also performs a number of specialized functions such as recognition of restriction enzyme cutting sites and CpG island mapping without costly specialized software.

  6. Enhanced throughput for infrared automated DNA sequencing

    Science.gov (United States)

    Middendorf, Lyle R.; Gartside, Bill O.; Humphrey, Pat G.; Roemer, Stephen C.; Sorensen, David R.; Steffens, David L.; Sutter, Scott L.

    1995-04-01

    Several enhancements have been developed and applied to infrared automated DNA sequencing resulting in significantly higher throughput. A 41 cm sequencing gel (31 cm well- to-read distance) combines high resolution of DNA sequencing fragments with optimized run times yielding two runs per day of 500 bases per sample. A 66 cm sequencing gel (56 cm well-to-read distance) produces sequence read lengths of up to 1000 bases for ds and ss templates using either T7 polymerase or cycle-sequencing protocols. Using a multichannel syringe to load 64 lanes allows 16 samples (compatible with 96-well format) to be visualized for each run. The 41 cm gel configuration allows 16,000 bases per day (16 samples X 500 bases/sample X 2 ten hour runs/day) to be sequenced with the advantages of infrared technology. Enhancements to internal labeling techniques using an infrared-labeled dATP molecule (Boehringer Mannheim GmbH, Penzberg, Germany; Sequenase (U.S. Biochemical) have also been made. The inclusion of glycerol in the sequencing reactions yields greatly improved results for some primer and template combinations. The inclusion of (alpha) -Thio-dNTP's in the labeling reaction increases signal intensity two- to three-fold.

  7. Fungal Peritonitis Due to Fusarium solani Species Complex Sequential Isolates Identified with DNA Sequencing in a Kidney Transplant Recipient in Brazil.

    Science.gov (United States)

    da Silva-Rocha, Walicyranison Plinio; Zuza-Alves, Diana Luzia; Melo, Analy Salles de Azevedo; Chaves, Guilherme Maranhão

    2015-12-01

    Fungal peritonitis is a rare serious complication most commonly observed in immunocompromised patients under peritoneal dialysis. Nevertheless, this clinical condition is more difficult to treat than bacterial peritonitis. Bacterial peritonitis followed by the use of antibiotics is the main risk factor for developing fungal peritonitis. Candida spp. are more frequently isolated, and the isolation of filamentous fungi is only occasional. Here we describe a case of Fusarium solani species complex peritonitis associated with bacterial peritonitis in a female kidney transplant recipient with previous history of nephrotic syndrome. The patient has had Enterobacter sp. endocarditis and was hypertensive and diabetic. Two sequential isolates of F. solani were recovered from cultures and identified with different molecular techniques. She was successfully treated with 50 mg daily amphotericin B for 4 weeks.

  8. Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

    Science.gov (United States)

    Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

    2018-01-10

    Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from a cosmic database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and enzyme methyltransferase DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies depicted that, upon binding of DNMT1 methylates to this consensus DNA motif (C residues of CpG islands), mutation was likely to occur. Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing

  9. DNA sequence and prokaryotic expression analysis of vitellogenin ...

    African Journals Online (AJOL)

    In this study, the DNA sequence of vitellogenin from Antheraea pernyi (Ap-Vg) was identified and its functional domain (30-740 aa, Ap-Vg-1) was expressed in Escherichia coli BL21 (DE3) cells. The recombinant Ap-Vg-1 proteins were purified and used for antibody preparation. The results showed that the intact DNA ...

  10. Isolation of a sex-linked DNA sequence in cranes.

    Science.gov (United States)

    Duan, W; Fuerst, P A

    2001-01-01

    A female-specific DNA fragment (CSL-W; crane sex-linked DNA on W chromosome) was cloned from female whooping cranes (Grus americana). From the nucleotide sequence of CSL-W, a set of polymerase chain reaction (PCR) primers was identified which amplify a 227-230 bp female-specific fragment from all existing crane species and some other noncrane species. A duplicated versions of the DNA segment, which is found to have a larger size (231-235 bp) than CSL-W in both sexes, was also identified, and was designated CSL-NW (crane sex-linked DNA on non-W chromosome). The nucleotide similarity between the sequences of CSL-W and CSL-NW from whooping cranes was 86.3%. The CSL primers do not amplify any sequence from mammalian DNA, limiting the potential for contamination from human sources. Using the CSL primers in combination with a quick DNA extraction method allows the noninvasive identification of crane gender in less than 10 h. A test of the methodology was carried out on fully developed body feathers from 18 captive cranes and resulted in 100% successful identification.

  11. The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.

    Science.gov (United States)

    Khoe, Clairine V; Chung, Long H; Murray, Vincent

    2018-06-01

    The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.

  12. Special Issue: Next Generation DNA Sequencing

    Directory of Open Access Journals (Sweden)

    Paul Richardson

    2010-10-01

    Full Text Available Next Generation Sequencing (NGS refers to technologies that do not rely on traditional dideoxy-nucleotide (Sanger sequencing where labeled DNA fragments are physically resolved by electrophoresis. These new technologies rely on different strategies, but essentially all of them make use of real-time data collection of a base level incorporation event across a massive number of reactions (on the order of millions versus 96 for capillary electrophoresis for instance. The major commercial NGS platforms available to researchers are the 454 Genome Sequencer (Roche, Illumina (formerly Solexa Genome analyzer, the SOLiD system (Applied Biosystems/Life Technologies and the Heliscope (Helicos Corporation. The techniques and different strategies utilized by these platforms are reviewed in a number of the papers in this special issue. These technologies are enabling new applications that take advantage of the massive data produced by this next generation of sequencing instruments. [...

  13. Simulating efficiently the evolution of DNA sequences.

    Science.gov (United States)

    Schöniger, M; von Haeseler, A

    1995-02-01

    Two menu-driven FORTRAN programs are described that simulate the evolution of DNA sequences in accordance with a user-specified model. This general stochastic model allows for an arbitrary stationary nucleotide composition and any transition-transversion bias during the process of base substitution. In addition, the user may define any hypothetical model tree according to which a family of sequences evolves. The programs suggest the computationally most inexpensive approach to generate nucleotide substitutions. Either reproducible or non-repeatable simulations, depending on the method of initializing the pseudo-random number generator, can be performed. The corresponding options are offered by the interface menu.

  14. Identifying structural variants using linked-read sequencing data.

    Science.gov (United States)

    Elyanow, Rebecca; Wu, Hsin-Ta; Raphael, Benjamin J

    2017-11-03

    Structural variation, including large deletions, duplications, inversions, translocations, and other rearrangements, is common in human and cancer genomes. A number of methods have been developed to identify structural variants from Illumina short-read sequencing data. However, reliable identification of structural variants remains challenging because many variants have breakpoints in repetitive regions of the genome and thus are difficult to identify with short reads. The recently developed linked-read sequencing technology from 10X Genomics combines a novel barcoding strategy with Illumina sequencing. This technology labels all reads that originate from a small number (~5-10) DNA molecules ~50Kbp in length with the same molecular barcode. These barcoded reads contain long-range sequence information that is advantageous for identification of structural variants. We present Novel Adjacency Identification with Barcoded Reads (NAIBR), an algorithm to identify structural variants in linked-read sequencing data. NAIBR predicts novel adjacencies in a individual genome resulting from structural variants using a probabilistic model that combines multiple signals in barcoded reads. We show that NAIBR outperforms several existing methods for structural variant identification - including two recent methods that also analyze linked-reads - on simulated sequencing data and 10X whole-genome sequencing data from the NA12878 human genome and the HCC1954 breast cancer cell line. Several of the novel somatic structural variants identified in HCC1954 overlap known cancer genes. Software is available at compbio.cs.brown.edu/software. braphael@princeton.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  15. Genomic signal processing for DNA sequence clustering.

    Science.gov (United States)

    Mendizabal-Ruiz, Gerardo; Román-Godínez, Israel; Torres-Ramos, Sulema; Salido-Ruiz, Ricardo A; Vélez-Pérez, Hugo; Morales, J Alejandro

    2018-01-01

    Genomic signal processing (GSP) methods which convert DNA data to numerical values have recently been proposed, which would offer the opportunity of employing existing digital signal processing methods for genomic data. One of the most used methods for exploring data is cluster analysis which refers to the unsupervised classification of patterns in data. In this paper, we propose a novel approach for performing cluster analysis of DNA sequences that is based on the use of GSP methods and the K-means algorithm. We also propose a visualization method that facilitates the easy inspection and analysis of the results and possible hidden behaviors. Our results support the feasibility of employing the proposed method to find and easily visualize interesting features of sets of DNA data.

  16. Google matrix analysis of DNA sequences.

    Science.gov (United States)

    Kandiah, Vivek; Shepelyansky, Dima L

    2013-01-01

    For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.

  17. Google matrix analysis of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Vivek Kandiah

    Full Text Available For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW. At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.

  18. DNA barcoding as a means for identifying medicinal plants of Pakistan

    International Nuclear Information System (INIS)

    Schori, M.; Showalter, A.M.

    2011-01-01

    DNA barcoding involves the generation of DNA sequencing data from particular genetic regions in an organism and the use of these sequence data to identify or 'barcode' that organism and distinguish it from other species. Here, DNA barcoding is being used to identify several medicinal plants found in Pakistan and distinguished them from other similar species. Several challenges to the successful implementation of plant DNA barcoding are presented and discussed. Despite these challenges, DNA barcoding has the potential to uniquely identify medicinal plants and provide quality control and standardization of the plant material supplied to the pharmaceutical industry. (author)

  19. Measuring the Electronic Properties of DNA-Specific Schottky Diodes Towards Detecting and Identifying Basidiomycetes DNA

    Science.gov (United States)

    Periasamy, Vengadesh; Rizan, Nastaran; Al-Ta’ii, Hassan Maktuff Jaber; Tan, Yee Shin; Tajuddin, Hairul Annuar; Iwamoto, Mitsumasa

    2016-01-01

    The discovery of semiconducting behavior of deoxyribonucleic acid (DNA) has resulted in a large number of literatures in the study of DNA electronics. Sequence-specific electronic response provides a platform towards understanding charge transfer mechanism and therefore the electronic properties of DNA. It is possible to utilize these characteristic properties to identify/detect DNA. In this current work, we demonstrate a novel method of DNA-based identification of basidiomycetes using current-voltage (I-V) profiles obtained from DNA-specific Schottky barrier diodes. Electronic properties such as ideality factor, barrier height, shunt resistance, series resistance, turn-on voltage, knee-voltage, breakdown voltage and breakdown current were calculated and used to quantify the identification process as compared to morphological and molecular characterization techniques. The use of these techniques is necessary in order to study biodiversity, but sometimes it can be misleading and unreliable and is not sufficiently useful for the identification of fungi genera. Many of these methods have failed when it comes to identification of closely related species of certain genus like Pleurotus. Our electronics profiles, both in the negative and positive bias regions were however found to be highly characteristic according to the base-pair sequences. We believe that this simple, low-cost and practical method could be useful towards identifying and detecting DNA in biotechnology and pathology. PMID:27435636

  20. cDNA sequence quality data - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Budding yeast cDNA sequencing project cDNA sequence quality data Data detail Data name cDNA sequence quality... data DOI 10.18908/lsdba.nbdc00838-003 Description of data contents Phred's quality score. P...tion Download License Update History of This Database Site Policy | Contact Us cDNA sequence quality

  1. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    Energy Technology Data Exchange (ETDEWEB)

    Winston Chen, C.H.; Taranenko, N.I.; Zhu, Y.F.; Chung, C.N.; Allman, S.L.

    1997-03-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, the authors recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Snager`s enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. The preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic disease can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, the authors applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  2. Method for priming and DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Mugasimangalam, R.C.; Ulanovsky, L.E.

    1997-12-01

    A method is presented for improving the priming specificity of an oligonucleotide primer that is non-unique in a nucleic acid template which includes selecting a continuous stretch of several nucleotides in the template DNA where one of the four bases does not occur in the stretch. This also includes bringing the template DNA in contract with a non-unique primer partially or fully complimentary to the sequence immediately upstream of the selected sequence stretch. This results in polymerase-mediated differential extension of the primer in the presence of a subset of deoxyribonucleotide triphosphates that does not contain the base complementary to the base absent in the selected sequence stretch. These reactions occur at a temperature sufficiently low for allowing the extension of the non-unique primer. The method causes polymerase-mediated extension reactions in the presence of all four natural deoxyribonucleotide triphosphates or modifications. At this high temperature discrimination occurs against priming sites of the non-unique primer where the differential extension has not made the primer sufficiently stable to prime. However, the primer extended at the selected stretch is sufficiently stable to prime.

  3. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    Science.gov (United States)

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  4. Poincaré recurrences of DNA sequences

    Science.gov (United States)

    Frahm, K. M.; Shepelyansky, D. L.

    2012-01-01

    We analyze the statistical properties of Poincaré recurrences of Homo sapiens, mammalian, and other DNA sequences taken from the Ensembl Genome data base with up to 15 billion base pairs. We show that the probability of Poincaré recurrences decays in an algebraic way with the Poincaré exponent β≈4 even if the oscillatory dependence is well pronounced. The correlations between recurrences decay with an exponent ν≈0.6 that leads to an anomalous superdiffusive walk. However, for Homo sapiens sequences, with the largest available statistics, the diffusion coefficient converges to a finite value on distances larger than one million base pairs. We argue that the approach based on Poncaré recurrences determines new proximity features between different species and sheds a new light on their evolution history.

  5. DNA watermarks in non-coding regulatory sequences

    Directory of Open Access Journals (Sweden)

    Pyka Martin

    2009-07-01

    Full Text Available Abstract Background DNA watermarks can be applied to identify the unauthorized use of genetically modified organisms. It has been shown that coding regions can be used to encrypt information into living organisms by using the DNA-Crypt algorithm. Yet, if the sequence of interest presents a non-coding DNA sequence, either the function of a resulting functional RNA molecule or a regulatory sequence, such as a promoter, could be affected. For our studies we used the small cytoplasmic RNA 1 in yeast and the lac promoter region of Escherichia coli. Findings The lac promoter was deactivated by the integrated watermark. In addition, the RNA molecules displayed altered configurations after introducing a watermark, but surprisingly were functionally intact, which has been verified by analyzing the growth characteristics of both wild type and watermarked scR1 transformed yeast cells. In a third approach we introduced a second overlapping watermark into the lac promoter, which did not affect the promoter activity. Conclusion Even though the watermarked RNA and one of the watermarked promoters did not show any significant differences compared to the wild type RNA and wild type promoter region, respectively, it cannot be generalized that other RNA molecules or regulatory sequences behave accordingly. Therefore, we do not recommend integrating watermark sequences into regulatory regions.

  6. Image correlation method for DNA sequence alignment.

    Science.gov (United States)

    Curilem Saldías, Millaray; Villarroel Sassarini, Felipe; Muñoz Poblete, Carlos; Vargas Vásquez, Asticio; Maureira Butler, Iván

    2012-01-01

    The complexity of searches and the volume of genomic data make sequence alignment one of bioinformatics most active research areas. New alignment approaches have incorporated digital signal processing techniques. Among these, correlation methods are highly sensitive. This paper proposes a novel sequence alignment method based on 2-dimensional images, where each nucleic acid base is represented as a fixed gray intensity pixel. Query and known database sequences are coded to their pixel representation and sequence alignment is handled as object recognition in a scene problem. Query and database become object and scene, respectively. An image correlation process is carried out in order to search for the best match between them. Given that this procedure can be implemented in an optical correlator, the correlation could eventually be accomplished at light speed. This paper shows an initial research stage where results were "digitally" obtained by simulating an optical correlation of DNA sequences represented as images. A total of 303 queries (variable lengths from 50 to 4500 base pairs) and 100 scenes represented by 100 x 100 images each (in total, one million base pair database) were considered for the image correlation analysis. The results showed that correlations reached very high sensitivity (99.01%), specificity (98.99%) and outperformed BLAST when mutation numbers increased. However, digital correlation processes were hundred times slower than BLAST. We are currently starting an initiative to evaluate the correlation speed process of a real experimental optical correlator. By doing this, we expect to fully exploit optical correlation light properties. As the optical correlator works jointly with the computer, digital algorithms should also be optimized. The results presented in this paper are encouraging and support the study of image correlation methods on sequence alignment.

  7. The influence of DNA sequence on epigenome-induced pathologies

    Directory of Open Access Journals (Sweden)

    Meagher Richard B

    2012-07-01

    Full Text Available Abstract Clear cause-and-effect relationships are commonly established between genotype and the inherited risk of acquiring human and plant diseases and aberrant phenotypes. By contrast, few such cause-and-effect relationships are established linking a chromatin structure (that is, the epitype with the transgenerational risk of acquiring a disease or abnormal phenotype. It is not entirely clear how epitypes are inherited from parent to offspring as populations evolve, even though epigenetics is proposed to be fundamental to evolution and the likelihood of acquiring many diseases. This article explores the hypothesis that, for transgenerationally inherited chromatin structures, “genotype predisposes epitype”, and that epitype functions as a modifier of gene expression within the classical central dogma of molecular biology. Evidence for the causal contribution of genotype to inherited epitypes and epigenetic risk comes primarily from two different kinds of studies discussed herein. The first and direct method of research proceeds by the examination of the transgenerational inheritance of epitype and the penetrance of phenotype among genetically related individuals. The second approach identifies epitypes that are duplicated (as DNA sequences are duplicated and evolutionarily conserved among repeated patterns in the DNA sequence. The body of this article summarizes particularly robust examples of these studies from humans, mice, Arabidopsis, and other organisms. The bulk of the data from both areas of research support the hypothesis that genotypes predispose the likelihood of displaying various epitypes, but for only a few classes of epitype. This analysis suggests that renewed efforts are needed in identifying polymorphic DNA sequences that determine variable nucleosome positioning and DNA methylation as the primary cause of inherited epigenome-induced pathologies. By contrast, there is very little evidence that DNA sequence directly

  8. Application of synthetic DNA probes to the analysis of DNA sequence variants in man

    International Nuclear Information System (INIS)

    Wallace, R.B.; Petz, L.D.; Yam, P.Y.

    1986-01-01

    Oligonucleotide probes provide a tool to discriminate between any two alleles on the basis of hybridization. Random sampling of the genome with different oligonucleotide probes should reveal polymorphism in a certain percentage of the cases. In the hope of identifying polymorphic regions more efficiently, we chose to take advantage of the proposed hypermutability of repeated DNA sequences and the specificity of oligonucleotide hybridization. Since, under appropriate conditions, oligonucleotide probes require complete base pairing for hybridization to occur, they will only hybridize to a subset of the members of a repeat family when all members of the family are not identical. The results presented here suggest that oligonucleotide hybridization can be used to extend the genomic sequences that can be tested for the presence of RFLPs. This expands the tools available to human genetics. In addition, the results suggest that repeated DNA sequences are indeed more polymorphic than single-copy sequences. 28 references, 2 figures

  9. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...... in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions....

  10. Identifying the bacterial community on the surface of Intralox belting in a meat boning room by culture-dependent and culture-independent 16S rDNA sequence analysis.

    Science.gov (United States)

    Brightwell, Gale; Boerema, Jackie; Mills, John; Mowat, Eilidh; Pulford, David

    2006-05-25

    We examined the bacterial community present on an Intralox conveyor belt system in an operating lamb boning room by sequencing the 16S ribosomal DNA (rDNA) of bacteria extracted in the presence or absence of cultivation. RFLP patterns for 16S rDNA clone library and cultures were generated using HaeIII and MspI restriction endonucleases. 16S rDNA amplicons produced 8 distinct RFLP pattern groups. RFLP groups I-IV were represented in the clone library and RFLP groups I and V-VIII were represented amongst the cultured isolates. Partial DNA sequences from each RFLP group revealed that all group I, II and VIII representatives were Pseudomonas spp., group III were Sphingomonas spp., group IV clones were most similar to an uncultured alpha proteobacterium, group V was similar to a Serratia spp., group VI with an Alcaligenes spp., and group VII with Microbacterium spp. Sphingomonads were numerically dominant in the culture-independent clone library and along with the group IV alpha proteobacterium were not represented amongst the cultured isolates. Serratia, Alcaligenes and Microbacterium spp. were only represented with cultured isolates. Pseudomonads were detected by both culture-dependent (84% of isolates) and culture-independent (12.5% of clones) methods and their presence at high frequency does pose the risk of product spoilage if transferred onto meat stored under aerobic conditions. The detection of sphingomonads in large numbers by the culture-independent method demands further analysis because sphingomonads may represent a new source of meat spoilage that has not been previously recognised in the meat processing environment. The 16S rDNA collections generated by both methods were important at representing the diversity of the bacterial population associated with an Intralox conveyor belt system.

  11. A microfabricated hybrid device for DNA sequencing.

    Science.gov (United States)

    Liu, Shaorong

    2003-11-01

    We have created a hybrid device of a microfabricated round-channel twin-T injector incorporated with a separation capillary in order to extend the straight separation distance for high speed and long readlength DNA sequencing. Semicircular grooves on glass wafers are obtained using a photomask with a narrow line-width and a standard isotropic photolithographic etching process. Round channels are made when two etched wafers are face-to-face aligned and bonded. A two-mask fabrication process has been developed to make channels of two different diameters. The twin-T injector is formed by the smaller channels whose diameter matches the bore of the separation capillary, and the "usual" separation channel, now called the connection channel, is formed by the larger ones whose diameter matches the outer diameter of the separation capillary. The separation capillary is inserted through the connection channel all the way to the twin-T injector to allow the capillary bore flush with the twin-T injector channels. The total dead-volume of the connection is estimated to be approximately 5 pL. To demonstrate the efficiency of this hybrid device, we have performed four-color DNA sequencing on it. Using a 200 microm twin-T injector coupled with a separation capillary of 20 cm effective separation distance, we have obtained readlengths of 800 plus bases at an accuracy of 98.5% in 56 min, compared to about 650 bases in 100 min on a conventional 40 cm long capillary sequencing machine under similar conditions. At an increased separation field strength and using a diluted sieving matrix, the separation time has been reduced to 20 min with a readlength of 700 bases at 98.5% base-calling accuracy.

  12. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    Energy Technology Data Exchange (ETDEWEB)

    Hidajat, Rachmat; Nickols, Brian [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States); Forrester, Naomi [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States); Tretyakova, Irina [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States); Weaver, Scott [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States); Pushko, Peter, E-mail: ppushko@medigen-usa.com [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States)

    2016-03-15

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.

  13. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    International Nuclear Information System (INIS)

    Hidajat, Rachmat; Nickols, Brian; Forrester, Naomi; Tretyakova, Irina; Weaver, Scott; Pushko, Peter

    2016-01-01

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.

  14. Identification of Meconopsis species by a DNA barcode sequence ...

    African Journals Online (AJOL)

    Deoxyribonucleic acid (DNA) barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Species identification is necessary for the authentication of traditional plant based medicines. Although a consensus has not been agreed regarding which DNA sequences can be used as ...

  15. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    Full Text Available DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT – the bionic wavelet transform (BWT – for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA sequences. Results show that BWT performs better than traditional WT in presenting greater energy distribution. This new BWT method should be useful for the detection of the latent structural features in future DNA sequence analysis.

  16. Next Generation DNA Sequencing and the Future of Genomic Medicine

    OpenAIRE

    Anderson, Matthew W.; Schrijver, Iris

    2010-01-01

    In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpreta...

  17. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  18. SWORDS: A statistical tool for analysing large DNA sequences

    Indian Academy of Sciences (India)

    Unknown

    These techniques are based on frequency distributions of DNA words in a large sequence, and have been packaged into a software called SWORDS. Using sequences available in ... tions with the cellular processes like recombination, replication .... in DNA sequences using certain specific probability laws. (Pevzner et al ...

  19. cDNA sequences of two apolipoproteins from lamprey

    International Nuclear Information System (INIS)

    Pontes, M.; Xu, X.; Graham, D.; Riley, M.; Doolittle, R.F.

    1987-01-01

    The messages for two small but abundant apolipoproteins found in lamprey blood plasma were cloned with the aid of oligonucleotide probes based on amino-terminal sequences. In both cases, numerous clones were identified in a lamprey liver cDNA library, consistent with the great abundance of these proteins in lamprey blood. One of the cDNAs (LAL1) has a coding region of 105 amino acids that corresponds to a 21-residue signal peptide, a putative 8-residue propeptide, and the 76-residue mature protein found in blood. The other cDNA (LAL2) codes for a total of 191 residues, the first 23 of which constitute a signal peptide. The two proteins, which occur in the high-density lipoprotein fraction of ultracentrifuged plasma, have amino acid compositions similar to those of apolipoproteins found in mammalian blood; computer analysis indicates that the sequences are largely helix-permissive. When the sequences were searched against an amino acid sequence data base, rat apolipoprotein IV was the best matching candidate in both cases. Although a reasonable alignment can be made with that sequence and LAL1, definitive assignment of the two lamprey proteins to typical mammalian classes cannot be made at this point

  20. Comparison of DNA Quantification Methods for Next Generation Sequencing.

    Science.gov (United States)

    Robin, Jérôme D; Ludlow, Andrew T; LaRanger, Ryan; Wright, Woodring E; Shay, Jerry W

    2016-04-06

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library's heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality.

  1. A novel constraint for thermodynamically designing DNA sequences.

    Directory of Open Access Journals (Sweden)

    Qiang Zhang

    Full Text Available Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.

  2. Chimeric TALE recombinases with programmable DNA sequence specificity.

    Science.gov (United States)

    Mercer, Andrew C; Gaj, Thomas; Fuller, Roberta P; Barbas, Carlos F

    2012-11-01

    Site-specific recombinases are powerful tools for genome engineering. Hyperactivated variants of the resolvase/invertase family of serine recombinases function without accessory factors, and thus can be re-targeted to sequences of interest by replacing native DNA-binding domains (DBDs) with engineered zinc-finger proteins (ZFPs). However, imperfect modularity with particular domains, lack of high-affinity binding to all DNA triplets, and difficulty in construction has hindered the widespread adoption of ZFPs in unspecialized laboratories. The discovery of a novel type of DBD in transcription activator-like effector (TALE) proteins from Xanthomonas provides an alternative to ZFPs. Here we describe chimeric TALE recombinases (TALERs): engineered fusions between a hyperactivated catalytic domain from the DNA invertase Gin and an optimized TALE architecture. We use a library of incrementally truncated TALE variants to identify TALER fusions that modify DNA with efficiency and specificity comparable to zinc-finger recombinases in bacterial cells. We also show that TALERs recombine DNA in mammalian cells. The TALER architecture described herein provides a platform for insertion of customized TALE domains, thus significantly expanding the targeting capacity of engineered recombinases and their potential applications in biotechnology and medicine.

  3. Biomolecule Sequencer: Next-Generation DNA Sequencing Technology for In-Flight Environmental Monitoring, Research, and Beyond

    Science.gov (United States)

    Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.

    2016-01-01

    On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA, will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. We will present our latest results from the ISS flight experiment the first time DNA has ever been sequenced in space and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human

  4. Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

    Science.gov (United States)

    M.N. lslam-Faridi; C.D. Nelson; S.P. DiFazio; L.E. Gunter; G.A. Tuskan

    2009-01-01

    The 185-285 rDNA and 55 rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 185-285 rDNA sites and one 55 rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones...

  5. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    NARCIS (Netherlands)

    Buschmann, Tilo; Bystrykh, Leonid V.

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fraction of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called

  6. repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects.

    Science.gov (United States)

    Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen

    2015-04-15

    In order to develop powerful computational predictors for identifying the biological features or attributes of DNAs, one of the most challenging problems is to find a suitable approach to effectively represent the DNA sequences. To facilitate the studies of DNAs and nucleotides, we developed a Python package called representations of DNAs (repDNA) for generating the widely used features reflecting the physicochemical properties and sequence-order effects of DNAs and nucleotides. There are three feature groups composed of 15 features. The first group calculates three nucleic acid composition features describing the local sequence information by means of kmers; the second group calculates six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties; the third group calculates six pseudo nucleotide composition features, which can be used to represent a DNA sequence with a discrete model or vector yet still keep considerable sequence-order information via the physicochemical properties of its constituent oligonucleotides. In addition, these features can be easily calculated based on both the built-in and user-defined properties via using repDNA. The repDNA Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repDNA/. bliu@insun.hit.edu.cn or kcchou@gordonlifescience.org Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  7. DNA barcoding identifies Argentine fishes from marine and brackish waters.

    Directory of Open Access Journals (Sweden)

    Ezequiel Mabragaña

    Full Text Available BACKGROUND: DNA barcoding has been advanced as a promising tool to aid species identification and discovery through the use of short, standardized gene targets. Despite extensive taxonomic studies, for a variety of reasons the identification of fishes can be problematic, even for experts. DNA barcoding is proving to be a useful tool in this context. However, its broad application is impeded by the need to construct a comprehensive reference sequence library for all fish species. Here, we make a regional contribution to this grand challenge by calibrating the species discrimination efficiency of barcoding among 125 Argentine fish species, representing nearly one third of the known fauna, and examine the utility of these data to address several key taxonomic uncertainties pertaining to species in this region. METHODOLOGY/PRINCIPAL FINDINGS: Specimens were collected and morphologically identified during crusies conducted between 2005 and 2008. The standard BARCODE fragment of COI was amplified and bi-directionally sequenced from 577 specimens (mean of 5 specimens/species, and all specimens and sequence data were archived and interrogated using analytical tools available on the Barcode of Life Data System (BOLD; www.barcodinglife.org. Nearly all species exhibited discrete clusters of closely related haplogroups which permitted the discrimination of 95% of the species (i.e. 119/125 examined while cases of shared haplotypes were detected among just three species-pairs. Notably, barcoding aided the identification of a new species of skate, Dipturus argentinensis, permitted the recognition of Genypterus brasiliensis as a valid species and questions the generic assignment of Paralichthys isosceles. CONCLUSIONS/SIGNIFICANCE: This study constitutes a significant contribution to the global barcode reference sequence library for fishes and demonstrates the utility of barcoding for regional species identification. As an independent assessment of alpha

  8. Molecular design of sequence specific DNA alkylating agents.

    Science.gov (United States)

    Minoshima, Masafumi; Bando, Toshikazu; Shinohara, Ken-ichi; Sugiyama, Hiroshi

    2009-01-01

    Sequence-specific DNA alkylating agents have great interest for novel approach to cancer chemotherapy. We designed the conjugates between pyrrole (Py)-imidazole (Im) polyamides and DNA alkylating chlorambucil moiety possessing at different positions. The sequence-specific DNA alkylation by conjugates was investigated by using high-resolution denaturing polyacrylamide gel electrophoresis (PAGE). The results showed that polyamide chlorambucil conjugates alkylate DNA at flanking adenines in recognition sequences of Py-Im polyamides, however, the reactivities and alkylation sites were influenced by the positions of conjugation. In addition, we synthesized conjugate between Py-Im polyamide and another alkylating agent, 1-(chloromethyl)-5-hydroxy-1,2-dihydro-3H-benz[e]indole (seco-CBI). DNA alkylation reactivies by both alkylating polyamides were almost comparable. In contrast, cytotoxicities against cell lines differed greatly. These comparative studies would promote development of appropriate sequence-specific DNA alkylating polyamides against specific cancer cells.

  9. Phylogeny of the Serrasalmidae (Characiformes based on mitochondrial DNA sequences

    Directory of Open Access Journals (Sweden)

    Guillermo Ortí

    2008-01-01

    Full Text Available Previous studies based on DNA sequences of mitochondrial (mt rRNA genes showed three main groups within the subfamily Serrasalminae: (1 a "pacu" clade of herbivores (Colossoma, Mylossoma, Piaractus; (2 the "Myleus" clade (Myleus, Mylesinus, Tometes, Ossubtus; and (3 the "piranha" clade (Serrasalmus, Pygocentrus, Pygopristis, Pristobrycon, Catoprion, Metynnis. The genus Acnodon was placed as the sister taxon of clade (2+3. However, poor resolution within each clade was obtained due to low levels of variation among rRNA gene sequences. Complete sequences of the hypervariable mtDNA control region for a total of 45 taxa, and additional sequences of 12S and 16S rRNA from a total of 74 taxa representing all genera in the family are now presented to address intragroup relationships. Control region sequences of several serrasalmid species exhibit tandem repeats of short motifs (12 to 33 bp in the 3' end of this region, accounting for substantial length variation. Bayesian inference and maximum parsimony analyses of these sequences identify the same groupings as before and provide further evidence to support the following observations: (a Serrasalmus gouldingi and species of Pristobrycon (non-striolatus form a monophyletic group that is the sister group to other species of Serrasalmus and Pygocentrus; (b Catoprion, Pygopristis, and Pristobrycon striolatus form a well supported clade, sister to the group described above; (c some taxa assigned to the genus Myloplus (M. asterias, M tiete, M ternetzi, and M rubripinnis form a well supported group whereas other Myloplus species remain with uncertain affinities (d Mylesinus, Tometes and Myleus setiger form a monophyletic group.

  10. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd and 4th y undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  11. DNA Polymerases Drive DNA Sequencing-by-Synthesis Technologies: Both Past and Present

    Directory of Open Access Journals (Sweden)

    Cheng-Yao eChen

    2014-06-01

    Full Text Available Next-generation sequencing (NGS technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. E. coli DNA polymerase I proteolytic (Klenow fragment was originally utilized in Sanger's dideoxy chain terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public database. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third generation, real-time single molecule sequencing platform requires slightly different enzyme properties. Enterobacterial phage ⱷ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, ⱷ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore-, and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  12. Sequence of human protamine 2 cDNA

    Energy Technology Data Exchange (ETDEWEB)

    Domenjoud, L; Fronia, C; Uhde, F; Engel, W [Universitaet Goettingen (West Germany)

    1988-08-11

    The authors report the cloning and sequencing of a cDNA clone for human protamine 2 (hp2), isolated from a human testis cDNA library cloned in the vector {lambda}-gt11. A 66mer oligonucleotide, that corresponds to an amino acid sequence which is highly conserved between hp2 and mouse protamine 2 (mp2) served as hybridization probe. The homology between the amino acid sequence deduced from our cDNA and the published amino acid sequence for hp2 is 100%.

  13. Sequence periodicity in nucleosomal DNA and intrinsic curvature.

    Science.gov (United States)

    Nair, T Murlidharan

    2010-05-17

    Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C.elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results points to the possibility of context dependent curvature of varying degrees to be associated with nucleosomal DNA.

  14. Rapid Multiplex Small DNA Sequencing on the MinION Nanopore Sequencing Platform

    Directory of Open Access Journals (Sweden)

    Shan Wei

    2018-05-01

    Full Text Available Real-time sequencing of short DNA reads has a wide variety of clinical and research applications including screening for mutations, target sequences and aneuploidy. We recently demonstrated that MinION, a nanopore-based DNA sequencing device the size of a USB drive, could be used for short-read DNA sequencing. In this study, an ultra-rapid multiplex library preparation and sequencing method for the MinION is presented and applied to accurately test normal diploid and aneuploidy samples’ genomic DNA in under three hours, including library preparation and sequencing. This novel method shows great promise as a clinical diagnostic test for applications requiring rapid short-read DNA sequencing.

  15. PISMA: A Visual Representation of Motif Distribution in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Rogelio Alcántara-Silva

    2017-03-01

    Full Text Available Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code–like, as a gene-map–like, and as a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf .

  16. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    International Nuclear Information System (INIS)

    Chechetkin, V.R.; Lobzin, V.V.

    2004-01-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions

  17. Toward a Better Compression for DNA Sequences Using Huffman Encoding.

    Science.gov (United States)

    Al-Okaily, Anas; Almarri, Badar; Al Yami, Sultan; Huang, Chun-Hsi

    2017-04-01

    Due to the significant amount of DNA data that are being generated by next-generation sequencing machines for genomes of lengths ranging from megabases to gigabases, there is an increasing need to compress such data to a less space and a faster transmission. Different implementations of Huffman encoding incorporating the characteristics of DNA sequences prove to better compress DNA data. These implementations center on the concepts of selecting frequent repeats so as to force a skewed Huffman tree, as well as the construction of multiple Huffman trees when encoding. The implementations demonstrate improvements on the compression ratios for five genomes with lengths ranging from 5 to 50 Mbp, compared with the standard Huffman tree algorithm. The research hence suggests an improvement on all such DNA sequence compression algorithms that use the conventional Huffman encoding. The research suggests an improvement on all DNA sequence compression algorithms that use the conventional Huffman encoding. Accompanying software is publicly available (AL-Okaily, 2016 ).

  18. Gene Unprediction with Spurio: A tool to identify spurious protein sequences.

    Science.gov (United States)

    Höps, Wolfram; Jeffryes, Matt; Bateman, Alex

    2018-01-01

    We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation.  Despite the increasing accuracy of gene prediction tools there likely exists a large number of spurious protein predictions in the sequence databases.  We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes.  Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code is available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.

  19. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    Directory of Open Access Journals (Sweden)

    Michael Knapp

    2010-07-01

    Full Text Available The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA  research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions.

  20. Substrate sequence selectivity of APOBEC3A implicates intra-DNA interactions.

    Science.gov (United States)

    Silvas, Tania V; Hou, Shurong; Myint, Wazo; Nalivaika, Ellen; Somasundaran, Mohan; Kelch, Brian A; Matsuo, Hiroshi; Kurt Yilmaz, Nese; Schiffer, Celia A

    2018-05-14

    The APOBEC3 (A3) family of human cytidine deaminases is renowned for providing a first line of defense against many exogenous and endogenous retroviruses. However, the ability of these proteins to deaminate deoxycytidines in ssDNA makes A3s a double-edged sword. When overexpressed, A3s can mutate endogenous genomic DNA resulting in a variety of cancers. Although the sequence context for mutating DNA varies among A3s, the mechanism for substrate sequence specificity is not well understood. To characterize substrate specificity of A3A, a systematic approach was used to quantify the affinity for substrate as a function of sequence context, length, secondary structure, and solution pH. We identified the A3A ssDNA binding motif as (T/C)TC(A/G), which correlated with enzymatic activity. We also validated that A3A binds RNA in a sequence specific manner. A3A bound tighter to substrate binding motif within a hairpin loop compared to linear oligonucleotide, suggesting A3A affinity is modulated by substrate structure. Based on these findings and previously published A3A-ssDNA co-crystal structures, we propose a new model with intra-DNA interactions for the molecular mechanism underlying A3A sequence preference. Overall, the sequence and structural preferences identified for A3A leads to a new paradigm for identifying A3A's involvement in mutation of endogenous or exogenous DNA.

  1. Complete sequence analysis of 18S rDNA based on genomic DNA extraction from individual Demodex mites (Acari: Demodicidae).

    Science.gov (United States)

    Zhao, Ya-E; Xu, Ji-Ru; Hu, Li; Wu, Li-Ping; Wang, Zheng-Hang

    2012-05-01

    The study for the first time attempted to accomplish 18S ribosomal DNA (rDNA) complete sequence amplification and analysis for three Demodex species (Demodex folliculorum, Demodex brevis and Demodex canis) based on gDNA extraction from individual mites. The mites were treated by DNA Release Additive and Hot Start II DNA Polymerase so as to promote mite disruption and increase PCR specificity. Determination of D. folliculorum gDNA showed that the gDNA yield reached the highest at 1 mite, tending to descend with the increase of mite number. The individual mite gDNA was successfully used for 18S rDNA fragment (about 900 bp) amplification examination. The alignments of 18S rDNA complete sequences of individual mite samples and those of pooled mite samples ( ≥ 1000mites/sample) showed over 97% identities for each species, indicating that the gDNA extracted from a single individual mite was as satisfactory as that from pooled mites for PCR amplification. Further pairwise sequence analyses showed that average divergence, genetic distance, transition/transversion or phylogenetic tree could not effectively identify the three Demodex species, largely due to the differentiation in the D. canis isolates. It can be concluded that the individual Demodex mite gDNA can satisfy the molecular study of Demodex. 18S rDNA complete sequence is suitable for interfamily identification in Cheyletoidea, but whether it is suitable for intrafamily identification cannot be confirmed until the ascertainment of the types of Demodex mites parasitizing in dogs. Copyright © 2012 Elsevier Inc. All rights reserved.

  2. Thermodynamics of sequence-specific binding of PNA to DNA

    DEFF Research Database (Denmark)

    Ratilainen, T; Holmén, A; Tuite, E

    2000-01-01

    For further characterization of the hybridization properties of peptide nucleic acids (PNAs), the thermodynamics of hybridization of mixed sequence PNA-DNA duplexes have been studied. We have characterized the binding of PNA to DNA in terms of binding affinity (perfectly matched duplexes) and seq......For further characterization of the hybridization properties of peptide nucleic acids (PNAs), the thermodynamics of hybridization of mixed sequence PNA-DNA duplexes have been studied. We have characterized the binding of PNA to DNA in terms of binding affinity (perfectly matched duplexes...

  3. A newly identified DNA ligase of Saccharomyces cerevisiae involved in RAD52-independent repair of DNA double-strand breaks

    Science.gov (United States)

    Schär, Primo; Herrmann, Gernot; Daly, Graham; Lindahl, Tomas

    1997-01-01

    Eukaryotic DNA ligases are ATP-dependent DNA strand-joining enzymes that participate in DNA replication, repair, and recombination. Whereas mammalian cells contain several different DNA ligases, encoded by at least three distinct genes, only one DNA ligase has been detected previously in either budding yeast or fission yeast. Here, we describe a newly identified nonessential Saccharomyces cerevisiae gene that encodes a DNA ligase distinct from the CDC9 gene product. This DNA ligase shares significant amino acid sequence homology with human DNA ligase IV; accordingly, we designate the yeast gene LIG4. Recombinant LIG4 protein forms a covalent enzyme-AMP complex and can join a DNA single-strand break in a DNA/RNA hybrid duplex, the preferred substrate in vitro. Disruption of the LIG4 gene causes only marginally increased cellular sensitivity to several DNA damaging agents, and does not further sensitize cdc9 or rad52 mutant cells. In contrast, lig4 mutant cells have a 1000-fold reduced capacity for correct recircularization of linearized plasmids by illegitimate end-joining after transformation. Moreover, homozygous lig4 mutant diploids sporulate less efficiently than isogenic wild-type cells, and show retarded progression through meiotic prophase I. Spore viability is normal, but lig4 mutants appear to produce a higher proportion of tetrads with only three viable spores. The mutant phenotypes are consistent with functions of LIG4 in an illegitimate DNA end-joining pathway and ensuring efficient meiosis. PMID:9271115

  4. A Novel Computational Method for Detecting DNA Methylation Sites with DNA Sequence Information and Physicochemical Properties.

    Science.gov (United States)

    Pan, Gaofeng; Jiang, Limin; Tang, Jijun; Guo, Fei

    2018-02-08

    DNA methylation is an important biochemical process, and it has a close connection with many types of cancer. Research about DNA methylation can help us to understand the regulation mechanism and epigenetic reprogramming. Therefore, it becomes very important to recognize the methylation sites in the DNA sequence. In the past several decades, many computational methods-especially machine learning methods-have been developed since the high-throughout sequencing technology became widely used in research and industry. In order to accurately identify whether or not a nucleotide residue is methylated under the specific DNA sequence context, we propose a novel method that overcomes the shortcomings of previous methods for predicting methylation sites. We use k -gram, multivariate mutual information, discrete wavelet transform, and pseudo amino acid composition to extract features, and train a sparse Bayesian learning model to do DNA methylation prediction. Five criteria-area under the receiver operating characteristic curve (AUC), Matthew's correlation coefficient (MCC), accuracy (ACC), sensitivity (SN), and specificity-are used to evaluate the prediction results of our method. On the benchmark dataset, we could reach 0.8632 on AUC, 0.8017 on ACC, 0.5558 on MCC, and 0.7268 on SN. Additionally, the best results on two scBS-seq profiled mouse embryonic stem cells datasets were 0.8896 and 0.9511 by AUC, respectively. When compared with other outstanding methods, our method surpassed them on the accuracy of prediction. The improvement of AUC by our method compared to other methods was at least 0.0399 . For the convenience of other researchers, our code has been uploaded to a file hosting service, and can be downloaded from: https://figshare.com/s/0697b692d802861282d3.

  5. A Novel Computational Method for Detecting DNA Methylation Sites with DNA Sequence Information and Physicochemical Properties

    Directory of Open Access Journals (Sweden)

    Gaofeng Pan

    2018-02-01

    Full Text Available DNA methylation is an important biochemical process, and it has a close connection with many types of cancer. Research about DNA methylation can help us to understand the regulation mechanism and epigenetic reprogramming. Therefore, it becomes very important to recognize the methylation sites in the DNA sequence. In the past several decades, many computational methods—especially machine learning methods—have been developed since the high-throughout sequencing technology became widely used in research and industry. In order to accurately identify whether or not a nucleotide residue is methylated under the specific DNA sequence context, we propose a novel method that overcomes the shortcomings of previous methods for predicting methylation sites. We use k-gram, multivariate mutual information, discrete wavelet transform, and pseudo amino acid composition to extract features, and train a sparse Bayesian learning model to do DNA methylation prediction. Five criteria—area under the receiver operating characteristic curve (AUC, Matthew’s correlation coefficient (MCC, accuracy (ACC, sensitivity (SN, and specificity—are used to evaluate the prediction results of our method. On the benchmark dataset, we could reach 0.8632 on AUC, 0.8017 on ACC, 0.5558 on MCC, and 0.7268 on SN. Additionally, the best results on two scBS-seq profiled mouse embryonic stem cells datasets were 0.8896 and 0.9511 by AUC, respectively. When compared with other outstanding methods, our method surpassed them on the accuracy of prediction. The improvement of AUC by our method compared to other methods was at least 0.0399 . For the convenience of other researchers, our code has been uploaded to a file hosting service, and can be downloaded from: https://figshare.com/s/0697b692d802861282d3.

  6. Characteristics of alternating current hopping conductivity in DNA sequences

    International Nuclear Information System (INIS)

    Song-Shan, Ma; Hui, Xu; Huan-You, Wang; Rui, Guo

    2009-01-01

    This paper presents a model to describe alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system, and electrons transport via hopping between localized states. It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises, and it takes the form of ø ac (ω) ∼ ω 2 ln 2 (1/ω). Also AC conductivity of DNA sequences increases with the increase of temperature, this phenomenon presents characteristics of weak temperature-dependence. Meanwhile, the AC conductivity in an off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit in low temperatures, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences. For p < 0.5, the conductivity of DNA sequence decreases with the increase of p, while for p ≥ 0.5, the conductivity increases with the increase of p. (cross-disciplinary physics and related areas of science and technology)

  7. Sequence-dependent DNA deformability studied using molecular dynamics simulations.

    Science.gov (United States)

    Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori

    2007-01-01

    Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained by the simulations is consistent with that by the crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can highly depend on the flanking base pairs. The conformations of xATx tetramers show the most rigidity and are not affected by the flanking base pairs and the xYRx show by contrast the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.

  8. Characteristics of alternating current hopping conductivity in DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Ma Song-Shan; Xu Hui; Wang Huan-You; Guo Rui

    2009-01-01

    This paper presents a model to describe alternating current (AC) conductivity of DNA sequences,in which DNA is considered as a one-dimensional (1D) disordered system,and electrons transport via hopping between localized states.It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises,and it takes the form of σac(ω)~ω2 ln2(1/ω).Also AC conductivity of DNA sequences increases with the increase of temperature,this phenomenon presents characteristics of weak temperature-dependence.Meanwhile,the AC conductivity in an off diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit in low temperatures,which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity,while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition,the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences.For p<0.5,the conductivity of DNA sequence decreases with the increase of p,while for p > 0.5,the conductivity increases with the increase of p.

  9. Nucleotide sequence analysis of regions of adenovirus 5 DNA containing the origins of DNA replication

    International Nuclear Information System (INIS)

    Steenbergh, P.H.

    1979-01-01

    The purpose of the investigations described is the determination of nucleotide sequences at the molecular ends of the linear adenovirus type 5 DNA. Knowledge of the primary structure at the termini of this DNA molecule is of particular interest in the study of the mechanism of replication of adenovirus DNA. The initiation- and termination sites of adenovirus DNA replication are located at the ends of the DNA molecule. (Auth.)

  10. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors.

    Science.gov (United States)

    Adalsteinsson, Viktor A; Ha, Gavin; Freeman, Samuel S; Choudhury, Atish D; Stover, Daniel G; Parsons, Heather A; Gydush, Gregory; Reed, Sarah C; Rotem, Denisse; Rhoades, Justin; Loginov, Denis; Livitz, Dimitri; Rosebrock, Daniel; Leshchiner, Ignaty; Kim, Jaegil; Stewart, Chip; Rosenberg, Mara; Francis, Joshua M; Zhang, Cheng-Zhong; Cohen, Ofir; Oh, Coyin; Ding, Huiming; Polak, Paz; Lloyd, Max; Mahmud, Sairah; Helvie, Karla; Merrill, Margaret S; Santiago, Rebecca A; O'Connor, Edward P; Jeong, Seong H; Leeson, Rachel; Barry, Rachel M; Kramkowski, Joseph F; Zhang, Zhenwei; Polacek, Laura; Lohr, Jens G; Schleicher, Molly; Lipscomb, Emily; Saltzman, Andrea; Oliver, Nelly M; Marini, Lori; Waks, Adrienne G; Harshman, Lauren C; Tolaney, Sara M; Van Allen, Eliezer M; Winer, Eric P; Lin, Nancy U; Nakabayashi, Mari; Taplin, Mary-Ellen; Johannessen, Cory M; Garraway, Levi A; Golub, Todd R; Boehm, Jesse S; Wagle, Nikhil; Getz, Gad; Love, J Christopher; Meyerson, Matthew

    2017-11-06

    Whole-exome sequencing of cell-free DNA (cfDNA) could enable comprehensive profiling of tumors from blood but the genome-wide concordance between cfDNA and tumor biopsies is uncertain. Here we report ichorCNA, software that quantifies tumor content in cfDNA from 0.1× coverage whole-genome sequencing data without prior knowledge of tumor mutations. We apply ichorCNA to 1439 blood samples from 520 patients with metastatic prostate or breast cancers. In the earliest tested sample for each patient, 34% of patients have ≥10% tumor-derived cfDNA, sufficient for standard coverage whole-exome sequencing. Using whole-exome sequencing, we validate the concordance of clonal somatic mutations (88%), copy number alterations (80%), mutational signatures, and neoantigens between cfDNA and matched tumor biopsies from 41 patients with ≥10% cfDNA tumor content. In summary, we provide methods to identify patients eligible for comprehensive cfDNA profiling, revealing its applicability to many patients, and demonstrate high concordance of cfDNA and metastatic tumor whole-exome sequencing.

  11. Cloning, sequencing and expression of cDNA encoding growth ...

    Indian Academy of Sciences (India)

    Unknown

    of medicine, animal husbandry, fish farming and animal ..... northern pike (Esox lucius) growth hormone; Mol. Mar. Biol. ... prolactin 1-luciferase fusion gene in African catfish and ... 1988 Cloning and sequencing of cDNA that encodes goat.

  12. DNA Nucleotide Sequence Restricted by the RI Endonuclease

    Science.gov (United States)

    Hedgpeth, Joe; Goodman, Howard M.; Boyer, Herbert W.

    1972-01-01

    The sequence of DNA base pairs adjacent to the phosphodiester bonds cleaved by the RI restriction endonuclease in unmodified DNA from coliphage λ has been determined. The 5′-terminal nucleotide labeled with 32P and oligonucleotides up to the heptamer were analyzed from a pancreatic DNase digest. The following sequence of nucleotides adjacent to the RI break made in λ DNA was deduced from these data and from the 3′-dinucleotide sequence and nearest-neighbor analysis obtained from repair synthesis with the DNA polymerase of Rous sarcoma virus [Formula: see text] The RI endonuclease cleavage of the phosphodiester bonds (indicated by arrows) generates 5′-phosphoryls and short cohesive termini of four nucleotides, pApApTpT. The most striking feature of the sequence is its symmetry. PMID:4343974

  13. Capillary gel electrophoresis for rapid, high resolution DNA sequencing.

    OpenAIRE

    Swerdlow, H; Gesteland, R

    1990-01-01

    Capillary gel electrophoresis has been demonstrated for the separation and detection of DNA sequencing samples. Enzymatic dideoxy nucleotide chain termination was employed, using fluorescently tagged oligonucleotide primers and laser based on-column detection (limit of detection is 6,000 molecules per peak). Capillary gel separations were shown to be three times faster, with better resolution (2.4 x), and higher separation efficiency (5.4 x) than a conventional automated slab gel DNA sequenci...

  14. Sequence-specific activation of the DNA sensor cGAS by Y-form DNA structures as found in primary HIV-1 cDNA.

    Science.gov (United States)

    Herzner, Anna-Maria; Hagmann, Cristina Amparo; Goldeck, Marion; Wolter, Steven; Kübler, Kirsten; Wittmann, Sabine; Gramberg, Thomas; Andreeva, Liudmila; Hopfner, Karl-Peter; Mertens, Christina; Zillinger, Thomas; Jin, Tengchuan; Xiao, Tsan Sam; Bartok, Eva; Coch, Christoph; Ackermann, Damian; Hornung, Veit; Ludwig, Janos; Barchet, Winfried; Hartmann, Gunther; Schlee, Martin

    2015-10-01

    Cytosolic DNA that emerges during infection with a retrovirus or DNA virus triggers antiviral type I interferon responses. So far, only double-stranded DNA (dsDNA) over 40 base pairs (bp) in length has been considered immunostimulatory. Here we found that unpaired DNA nucleotides flanking short base-paired DNA stretches, as in stem-loop structures of single-stranded DNA (ssDNA) derived from human immunodeficiency virus type 1 (HIV-1), activated the type I interferon-inducing DNA sensor cGAS in a sequence-dependent manner. DNA structures containing unpaired guanosines flanking short (12- to 20-bp) dsDNA (Y-form DNA) were highly stimulatory and specifically enhanced the enzymatic activity of cGAS. Furthermore, we found that primary HIV-1 reverse transcripts represented the predominant viral cytosolic DNA species during early infection of macrophages and that these ssDNAs were highly immunostimulatory. Collectively, our study identifies unpaired guanosines in Y-form DNA as a highly active, minimal cGAS recognition motif that enables detection of HIV-1 ssDNA.

  15. Cloning, sequencing and expression of a novel xylanase cDNA from ...

    African Journals Online (AJOL)

    A strain SH 2016, capable of producing xylanase, was isolated and identified as Aspergillus awamori, based on its physiological and biochemical characteristics as well as its ITS rDNA gene sequence analysis. A xylanase gene of 591 bp was cloned from this newly isolated A. awamori and the ORF sequence predicted a ...

  16. An extended sequence specificity for UV-induced DNA damage.

    Science.gov (United States)

    Chung, Long H; Murray, Vincent

    2018-01-01

    The sequence specificity of UV-induced DNA damage was determined with a higher precision and accuracy than previously reported. UV light induces two major damage adducts: cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). Employing capillary electrophoresis with laser-induced fluorescence and taking advantages of the distinct properties of the CPDs and 6-4PPs, we studied the sequence specificity of UV-induced DNA damage in a purified DNA sequence using two approaches: end-labelling and a polymerase stop/linear amplification assay. A mitochondrial DNA sequence that contained a random nucleotide composition was employed as the target DNA sequence. With previous methodology, the UV sequence specificity was determined at a dinucleotide or trinucleotide level; however, in this paper, we have extended the UV sequence specificity to a hexanucleotide level. With the end-labelling technique (for 6-4PPs), the consensus sequence was found to be 5'-GCTC*AC (where C* is the breakage site); while with the linear amplification procedure, it was 5'-TCTT*AC. With end-labelling, the dinucleotide frequency of occurrence was highest for 5'-TC*, 5'-TT* and 5'-CC*; whereas it was 5'-TT* for linear amplification. The influence of neighbouring nucleotides on the degree of UV-induced DNA damage was also examined. The core sequences consisted of pyrimidine nucleotides 5'-CTC* and 5'-CTT* while an A at position "1" and C at position "2" enhanced UV-induced DNA damage. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.

  17. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sec...

  18. PREDICTION OF CHROMATIN STATES USING DNA SEQUENCE PROPERTIES

    KAUST Repository

    Bahabri, Rihab R.

    2013-06-01

    Activities of DNA are to a great extent controlled epigenetically through the internal struc- ture of chromatin. This structure is dynamic and is influenced by different modifications of histone proteins. Various combinations of epigenetic modification of histones pinpoint to different functional regions of the DNA determining the so-called chromatin states. How- ever, the characterization of chromatin states by the DNA sequence properties remains largely unknown. In this study we aim to explore whether DNA sequence patterns in the human genome can characterize different chromatin states. Using DNA sequence motifs we built binary classifiers for each chromatic state to eval- uate whether a given genomic sequence is a good candidate for belonging to a particular chromatin state. Of four classification algorithms (C4.5, Naive Bayes, Random Forest, and SVM) used for this purpose, the decision tree based classifiers (C4.5 and Random Forest) yielded best results among those we evaluated. Our results suggest that in general these models lack sufficient predictive power, although for four chromatin states (insulators, het- erochromatin, and two types of copy number variation) we found that presence of certain motifs in DNA sequences does imply an increased probability that such a sequence is one of these chromatin states.

  19. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    DEFF Research Database (Denmark)

    Nielsen, Peter E.

    2008-01-01

    sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technol. of protein dsDNA structures. (c) 2008 American Institute of Physics. [on SciFinder (R)] Udgivelsesdato...

  20. AU2EU : Privacy-preserving matching of DNA sequences

    NARCIS (Netherlands)

    Ignatenko, T.; Petkovic, M.; Naccache, D.; Sauveron, D.

    2014-01-01

    Advances in DNA sequencing create new opportunities for the use of DNA data in healthcare for diagnostic and treatment purposes, but also in many other health and well-being services. This brings new challenges with regard to the protection and use of this sensitive data. Thus, special technical

  1. Close sequence identity between ribosomal DNA episomes of the ...

    Indian Academy of Sciences (India)

    Unknown

    The restriction map of the E. dispar rDNA circle showed close simi- larity to EhR1 .... for 30 cycles in a DNA Thermal cycler (MJ Research,. USA). 3. .... by asterisk. The gaps show the variation between E. dispar and E. histolytica sequences.

  2. DNA Sequences of RAPD Fragments in the Egyptian cotton ...

    African Journals Online (AJOL)

    Random Amplified Polymorphic DNAs (RAPDs) is a DNA polymorphism assay based on the amplification of random DNA segments with single primers of arbitrary nucleotide sequence. Despite the fact that the RAPD technique has become a very powerful tool and has found use in numerous applications, yet, the nature of ...

  3. Effects of sequence on DNA wrapping around histones

    Science.gov (United States)

    Ortiz, Vanessa

    2011-03-01

    A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).

  4. cDNA sequence of human transforming gene hst and identification of the coding sequence required for transforming activity

    International Nuclear Information System (INIS)

    Taira, M.; Yoshida, T.; Miyagawa, K.; Sakamoto, H.; Terada, M.; Sugimura, T.

    1987-01-01

    The hst gene was originally identified as a transforming gene in DNAs from human stomach cancers and from a noncancerous portion of stomach mucosa by DNA-mediated transfection assay using NIH3T3 cells. cDNA clones of hst were isolated from the cDNA library constructed from poly(A) + RNA of a secondary transformant induced by the DNA from a stomach cancer. The sequence analysis of the hst cDNA revealed the presence of two open reading frames. When this cDNA was inserted into an expression vector containing the simian virus 40 promoter, it efficiently induced the transformation of NIH3T3 cells upon transfection. It was found that one of the reading frames, which coded for 206 amino acids, was responsible for the transforming activity

  5. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    Science.gov (United States)

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  6. Highly multiplexed targeted DNA sequencing from single nuclei.

    Science.gov (United States)

    Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E

    2016-02-01

    Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.

  7. Googling DNA sequences on the World Wide Web.

    Science.gov (United States)

    Hajibabaei, Mehrdad; Singer, Gregory A C

    2009-11-10

    New web-based technologies provide an excellent opportunity for sharing and accessing information and using web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment independent character based algorithm based on dividing a sequence library (DNA barcodes) and query sequence to words. The actual search is conducted by conventional search tools such as freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows a high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.

  8. Utility of 16S rDNA Sequencing for Identification of Rare Pathogenic Bacteria.

    Science.gov (United States)

    Loong, Shih Keng; Khor, Chee Sieng; Jafar, Faizatul Lela; AbuBakar, Sazaly

    2016-11-01

    Phenotypic identification systems are established methods for laboratory identification of bacteria causing human infections. Here, the utility of phenotypic identification systems was compared against 16S rDNA identification method on clinical isolates obtained during a 5-year study period, with special emphasis on isolates that gave unsatisfactory identification. One hundred and eighty-seven clinical bacteria isolates were tested with commercial phenotypic identification systems and 16S rDNA sequencing. Isolate identities determined using phenotypic identification systems and 16S rDNA sequencing were compared for similarity at genus and species level, with 16S rDNA sequencing as the reference method. Phenotypic identification systems identified ~46% (86/187) of the isolates with identity similar to that identified using 16S rDNA sequencing. Approximately 39% (73/187) and ~15% (28/187) of the isolates showed different genus identity and could not be identified using the phenotypic identification systems, respectively. Both methods succeeded in determining the species identities of 55 isolates; however, only ~69% (38/55) of the isolates matched at species level. 16S rDNA sequencing could not determine the species of ~20% (37/187) of the isolates. The 16S rDNA sequencing is a useful method over the phenotypic identification systems for the identification of rare and difficult to identify bacteria species. The 16S rDNA sequencing method, however, does have limitation for species-level identification of some bacteria highlighting the need for better bacterial pathogen identification tools. © 2016 Wiley Periodicals, Inc.

  9. Simple sequence repeat (SSR) markers are effective for identifying ...

    African Journals Online (AJOL)

    DNA was extracted from newly formed leaves and amplified using 21 simple sequence repeat (SSR) markers (NH001c, NH002b, NH005b, NH007b, NH008b, NH009b, NH011b, NH013b, NH012a, NH014a, NH015a, NH017a, KA4b, KA5, KA14, KA16, KB16, KU10, BGA35, BGT23b and HGA8b). The data was analyzed by ...

  10. Plant DNA sequences from feces: potential means for assessing diets of wild primates.

    Science.gov (United States)

    Bradley, Brenda J; Stiller, Mathias; Doran-Sheehy, Diane M; Harris, Tara; Chapman, Colin A; Vigilant, Linda; Poinar, Hendrik

    2007-06-01

    Analyses of plant DNA in feces provides a promising, yet largely unexplored, means of documenting the diets of elusive primates. Here we demonstrate the promise and pitfalls of this approach using DNA extracted from fecal samples of wild western gorillas (Gorilla gorilla) and black and white colobus monkeys (Colobus guereza). From these DNA extracts we amplified, cloned, and sequenced small segments of chloroplast DNA (part of the rbcL gene) and plant nuclear DNA (ITS-2). The obtained sequences were compared to sequences generated from known plant samples and to those in GenBank to identify plant taxa in the feces. With further optimization, this method could provide a basic evaluation of minimum primate dietary diversity even when knowledge of local flora is limited. This approach may find application in studies characterizing the diets of poorly-known, unhabituated primate species or assaying consumer-resource relationships in an ecosystem. (c) 2007 Wiley-Liss, Inc.

  11. Complete cDNA sequence coding for human docking protein

    Energy Technology Data Exchange (ETDEWEB)

    Hortsch, M; Labeit, S; Meyer, D I

    1988-01-11

    Docking protein (DP, or SRP receptor) is a rough endoplasmic reticulum (ER)-associated protein essential for the targeting and translocation of nascent polypeptides across this membrane. It specifically interacts with a cytoplasmic ribonucleoprotein complex, the signal recognition particle (SRP). The nucleotide sequence of cDNA encoding the entire human DP and its deduced amino acid sequence are given.

  12. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences...

  13. Sequence specificity of DNA cleavage by Micrococcus luteus γ endonuclease

    International Nuclear Information System (INIS)

    Hentosh, P.; Henner, W.D.; Reynolds, R.J.

    1985-01-01

    DNA fragments of defined sequence have been used to determine the sites of cleavage by γ-endonuclease activity in extracts prepared from Micrococcus luteus. End-labeled DNA restriction fragments of pBR322 DNA that had been irradiated under nitrogen in the presence of potassium iodide or t-butanol were treated with M. luteus γ endonuclease and analyzed on irradiated DNA preferentially at the positions of cytosines and thymines. DNA cleavage occurred immediately to the 3' side of pyrimidines in irradiated DNA and resulted in fragments that terminate in a 5'-phosphoryl group. These studies indicate that both altered cytosines and thymines may be important DNA lesions requiring repair after exposure to γ radiation

  14. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian eWu

    2012-11-01

    Full Text Available Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362Mb, 361 Mb sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of chloroplast genome.

  15. Next-generation sequencing offers new insights into DNA degradation

    DEFF Research Database (Denmark)

    Overballe-Petersen, Søren; Orlando, Ludovic Antoine Alexandre; Willerslev, Eske

    2012-01-01

    The processes underlying DNA degradation are central to various disciplines, including cancer research, forensics and archaeology. The sequencing of ancient DNA molecules on next-generation sequencing platforms provides direct measurements of cytosine deamination, depurination and fragmentation...... rates that previously were obtained only from extrapolations of results from in vitro kinetic experiments performed over short timescales. For example, recent next-generation sequencing of ancient DNA reveals purine bases as one of the main targets of postmortem hydrolytic damage, through base...... elimination and strand breakage. It also shows substantially increased rates of DNA base-loss at guanosine. In this review, we argue that the latter results from an electron resonance structure unique to guanosine rather than adenosine having an extra resonance structure over guanosine as previously suggested....

  16. Anti-DNA antibodies: Sequencing, cloning, and expression

    Energy Technology Data Exchange (ETDEWEB)

    Barry, M.M.

    1992-01-01

    To gain some insight into the mechanism of systemic lupus erythematosus, and the interactions involved in proteins binding to DNA four anti-DNA antibodies have been investigated. Two of the antibodies, Hed 10 and Jel 242, have previously been prepared from female NZB/NZW mice which develop an autoimmune disease resembling human SLE. The remaining two antibodies, Jel 72 and Jel 318, have previously been produced via immunization of C57BL/6 mice. The isotypes of the four antibodies investigated in this thesis were determined by an enzyme-linked-immunosorbent assay. All four antibodies contained [kappa] light chains and [gamma]2a heavy chains except Jel 318 which contains a [gamma]2b heavy chain. The complete variable regions of the heavy and light chains of these four antibodies were sequenced from their respective mRNAs. The gene segments and variable gene families expressed in each antibody were identified. Analysis of the genes used in the autoimmune anti-DNA antibodies and those produced by immunization indicated no obvious differences to account for their different origins. Examination of the amino acid residues present in the complementary-determining regions of these four antibodies indicates a preference for aromatic amino acids. Jel 72 and Jel 242 contain three arginine residues in the third complementary-determining region. A single-chain Fv and the variable region of the heavy chain of Hed 10 were expressed in Escherichia coli. Expression resulted in the production of a 26,000 M[sub r] protein and a 15,000 M[sub r] protein. An immunoblot indicated that the 26,000 M[sub r] protein was the Fv for Hed 10, while the 15,000 M[sub r] protein was shown to bind poly (dT). The contribution of the heavy chain to DNA binding was assessed.

  17. Directed PCR-free engineering of highly repetitive DNA sequences

    Directory of Open Access Journals (Sweden)

    Preissler Steffen

    2011-09-01

    Full Text Available Abstract Background Highly repetitive nucleotide sequences are commonly found in nature e.g. in telomeres, microsatellite DNA, polyadenine (poly(A tails of eukaryotic messenger RNA as well as in several inherited human disorders linked to trinucleotide repeat expansions in the genome. Therefore, studying repetitive sequences is of biological, biotechnological and medical relevance. However, cloning of such repetitive DNA sequences is challenging because specific PCR-based amplification is hampered by the lack of unique primer binding sites resulting in unspecific products. Results For the PCR-free generation of repetitive DNA sequences we used antiparallel oligonucleotides flanked by restriction sites of Type IIS endonucleases. The arrangement of recognition sites allowed for stepwise and seamless elongation of repetitive sequences. This facilitated the assembly of repetitive DNA segments and open reading frames encoding polypeptides with periodic amino acid sequences of any desired length. By this strategy we cloned a series of polyglutamine encoding sequences as well as highly repetitive polyadenine tracts. Such repetitive sequences can be used for diverse biotechnological applications. As an example, the polyglutamine sequences were expressed as His6-SUMO fusion proteins in Escherichia coli cells to study their aggregation behavior in vitro. The His6-SUMO moiety enabled affinity purification of the polyglutamine proteins, increased their solubility, and allowed controlled induction of the aggregation process. We successfully purified the fusions proteins and provide an example for their applicability in filter retardation assays. Conclusion Our seamless cloning strategy is PCR-free and allows the directed and efficient generation of highly repetitive DNA sequences of defined lengths by simple standard cloning procedures.

  18. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    Science.gov (United States)

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147

  19. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Directory of Open Access Journals (Sweden)

    Soichi Inagaki

    Full Text Available Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  20. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Science.gov (United States)

    Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca

    2015-01-01

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  1. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Directory of Open Access Journals (Sweden)

    Jason D Thompson

    Full Text Available Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  2. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Science.gov (United States)

    Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

    2012-01-01

    Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  3. Chaos game representation (CGR)-walk model for DNA sequences

    International Nuclear Information System (INIS)

    Jie, Gao; Zhen-Yuan, Xu

    2009-01-01

    Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to determine the coordinates of their positions in a continuous space. This distribution of positions has two features: one is unique, and the other is source sequence that can be recovered from the coordinates so that the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model is proposed based on CGR coordinates for the DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA (p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into the DNA sequence analysis. This model is applied to simulating real CGR-walk sequence data of ten genomic sequences. Remarkably long-range correlations are uncovered in the data, and the results from these models are reasonably fitted with those from the ARFIMA (p, d, q) model. (cross-disciplinary physics and related areas of science and technology)

  4. Mitochondrial DNA sequence-based phylogenetic relationship ...

    Indian Academy of Sciences (India)

    cophaga ranges from 0.037–0.106 and 0.049–0.207 for COI and ND5 genes, respectively (tables 2 and 3). Analysis of genetic distance on the basis of sequence difference for both the mitochondrial genes shows very little genetic difference. The discrepancy in the phylogenetic trees based on individ- ual genes may be due ...

  5. DNA sequence analysis of the photosynthesis region of Rhodobacter sphaeroides 2.4.1T

    OpenAIRE

    Choudhary, M.; Kaplan, Samuel

    2000-01-01

    This paper describes the DNA sequence of the photosynthesis region of Rhodobacter sphaeroides 2.4.1T. The photosynthesis gene cluster is located within a ~73 kb AseI genomic DNA fragment containing the puf, puhA, cycA and puc operons. A total of 65 open reading frames (ORFs) have been identified, of which 61 showed significant similarity to genes/proteins of other organisms while only four did not reveal any significant sequence similarity to any gene/protein sequences in the database. The da...

  6. Novel DNA sequence detection method based on fluorescence energy transfer

    International Nuclear Information System (INIS)

    Kobayashi, S.; Tamiya, E.; Karube, I.

    1987-01-01

    Recently the detection of specific DNA sequence, DNA analysis, has been becoming more important for diagnosis of viral genomes causing infections disease and human sequences related to inherited disorders. These methods typically involve electrophoresis, the immobilization of DNA on a solid support, hybridization to a complementary probe, the detection using labeled with /sup 32/P or nonisotopically with a biotin-avidin-enzyme system, and so on. These techniques are highly effective, but they are very time-consuming and expensive. A principle of fluorescene energy transfer is that the light energy from an excited donor (fluorophore) is transferred to an acceptor (fluorophore), if the acceptor exists in the vicinity of the donor and the excitation spectrum of donor overlaps the emission spectrum of acceptor. In this study, the fluorescence energy transfer was applied to the detection of specific DNA sequence using the hybridization method. The analyte, single-stranded DNA labeled with the donor fluorophore is hybridized to a probe DNA labeled with the acceptor. Because of the complementary DNA duplex formation, two fluorophores became to be closed to each other, and the fluorescence energy transfer was occurred

  7. Management of High-Throughput DNA Sequencing Projects: Alpheus.

    Science.gov (United States)

    Miller, Neil A; Kingsmore, Stephen F; Farmer, Andrew; Langley, Raymond J; Mudge, Joann; Crow, John A; Gonzalez, Alvaro J; Schilkey, Faye D; Kim, Ryan J; van Velkinburgh, Jennifer; May, Gregory D; Black, C Forrest; Myers, M Kathy; Utsey, John P; Frost, Nicholas S; Sugarbaker, David J; Bueno, Raphael; Gullans, Stephen R; Baxter, Susan M; Day, Steve W; Retzel, Ernest F

    2008-12-26

    High-throughput DNA sequencing has enabled systems biology to begin to address areas in health, agricultural and basic biological research. Concomitant with the opportunities is an absolute necessity to manage significant volumes of high-dimensional and inter-related data and analysis. Alpheus is an analysis pipeline, database and visualization software for use with massively parallel DNA sequencing technologies that feature multi-gigabase throughput characterized by relatively short reads, such as Illumina-Solexa (sequencing-by-synthesis), Roche-454 (pyrosequencing) and Applied Biosystem's SOLiD (sequencing-by-ligation). Alpheus enables alignment to reference sequence(s), detection of variants and enumeration of sequence abundance, including expression levels in transcriptome sequence. Alpheus is able to detect several types of variants, including non-synonymous and synonymous single nucleotide polymorphisms (SNPs), insertions/deletions (indels), premature stop codons, and splice isoforms. Variant detection is aided by the ability to filter variant calls based on consistency, expected allele frequency, sequence quality, coverage, and variant type in order to minimize false positives while maximizing the identification of true positives. Alpheus also enables comparisons of genes with variants between cases and controls or bulk segregant pools. Sequence-based differential expression comparisons can be developed, with data export to SAS JMP Genomics for statistical analysis.

  8. DNA-PK dependent targeting of DNA-ends to a protein complex assembled on matrix attachment region DNA sequences

    International Nuclear Information System (INIS)

    Mauldin, S.K.; Getts, R.C.; Perez, M.L.; DiRienzo, S.; Stamato, T.D.

    2003-01-01

    Full text: We find that nuclear protein extracts from mammalian cells contain an activity that allows DNA ends to associate with circular pUC18 plasmid DNA. This activity requires the catalytic subunit of DNA-PK (DNA-PKcs) and Ku since it was not observed in mutants lacking Ku or DNA-PKcs but was observed when purified Ku/DNA-PKcs was added to these mutant extracts. Competition experiments between pUC18 and pUC18 plasmids containing various nuclear matrix attachment region (MAR) sequences suggest that DNA ends preferentially associate with plasmids containing MAR DNA sequences. At a 1:5 mass ratio of MAR to pUC18, approximately equal amounts of DNA end binding to the two plasmids were observed, while at a 1:1 ratio no pUC18 end-binding was observed. Calculation of relative binding activities indicates that DNA-end binding activities to MAR sequences was 7 to 21 fold higher than pUC18. Western analysis of proteins bound to pUC18 and MAR plasmids indicates that XRCC4, DNA ligase IV, scaffold attachment factor A, topoisomerase II, and poly(ADP-ribose) polymerase preferentially associate with the MAR plasmid in the absence or presence of DNA ends. In contrast, Ku and DNA-PKcs were found on the MAR plasmid only in the presence of DNA ends. After electroporation of a 32P-labeled DNA probe into human cells and cell fractionation, 87% of the total intercellular radioactivity remained in nuclei after a 0.5M NaCl extraction suggesting the probe was strongly bound in the nucleus. The above observations raise the possibility that DNA-PK targets DNA-ends to a repair and/or DNA damage signaling complex which is assembled on MAR sites in the nucleus

  9. Identification of tissue-embedded ascarid larvae by ribosomal DNA sequencing.

    Science.gov (United States)

    Ishiwata, Kenji; Shinohara, Akio; Yagi, Kinpei; Horii, Yoichiro; Tsuchiya, Kimiyuki; Nawa, Yukifumi

    2004-01-01

    Polymerase chain reaction (PCR) was applied to identify tissue-embedded ascarid nematode larvae. Two sequences of the internal transcribed spacer (ITS) regions of ribosomal DNA (rDNA), ITS1 and ITS2, of the ascarid parasites were amplified and compared with those of ascarid-nematodes registered in a DNA database (GenBank). The ITS sequences of the PCR products obtained from the ascarid parasite specimen in our laboratory were compatible with those of registered adult Ascaris and Toxocara parasites. PCR amplification of the ITS regions was sensitive enough to detect a single larva of Ascaris suum mixed with porcine liver tissue. Using this method, ascarid larvae embedded in the liver of a naturally infected turkey were identified as Toxocara canis. These results suggest that even a single larva embedded in tissues from patients with larva migrans could be identified by sequencing the ITS regions.

  10. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Directory of Open Access Journals (Sweden)

    Baldwin Stephen A

    2011-03-01

    Full Text Available Abstract Background Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  11. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities.

    Science.gov (United States)

    Troshin, Peter V; Postis, Vincent Lg; Ashworth, Denise; Baldwin, Stephen A; McPherson, Michael J; Barton, Geoffrey J

    2011-03-07

    Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  12. Dialects of the DNA uptake sequence in Neisseriaceae.

    Directory of Open Access Journals (Sweden)

    Stephan A Frye

    2013-04-01

    Full Text Available In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS, which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS-dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5'-CTG-3' is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS-dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic

  13. Dialects of the DNA Uptake Sequence in Neisseriaceae

    Science.gov (United States)

    Frye, Stephan A.; Nilsen, Mariann; Tønjum, Tone; Ambur, Ole Herman

    2013-01-01

    In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS–dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5′-CTG-3′ is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS–dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic transformation

  14. Mitochondrial DNA sequence evolution in the Arctoidea.

    OpenAIRE

    Zhang, Y P; Ryder, O A

    1993-01-01

    Some taxa in the superfamily Arctoidea, such as the giant panda and the lesser panda, have presented puzzles to taxonomists. In the present study, approximately 397 bases of the cytochrome b gene, 364 bases of the 12S rRNA gene, and 74 bases of the tRNA(Thr) and tRNA(Pro) genes from the giant panda, lesser panda, kinkajou, raccoon, coatimundi, and all species of the Ursidae were sequenced. The high transition/transversion ratios in cytochrome b and RNA genes prior to saturation suggest that t...

  15. Noninvasive prenatal paternity testing (NIPAT) through maternal plasma DNA sequencing

    DEFF Research Database (Denmark)

    Jiang, Haojun; Xie, Yifan; Li, Xuchao

    2016-01-01

    developed a noninvasive prenatal paternity testing (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influence factors (minor allele frequency (MAF), the number of total SNP, fetal fraction and effective sequencing depth) and designed three different selective SNP panels......Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have been already used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we...... paternity test using STR multiplex system. Our study here proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future....

  16. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.

    2008-01-01

    We provide a new automated statistical method for DNA barcoding based on a Bayesian phylogenetic analysis. The method is based on automated database sequence retrieval, alignment, and phylogenetic analysis using a custom-built program for Bayesian phylogenetic analysis. We show on real data...... that the method outperforms Blast searches as a measure of confidence and can help eliminate 80% of all false assignment based on best Blast hit. However, the most important advance of the method is that it provides statistically meaningful measures of confidence. We apply the method to a re......-analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are comprised of a combination of both Neanderthal and modern human DNA....

  17. Network clustering coefficient approach to DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gerhardt, Guenther J.L. [Universidade Federal do Rio Grande do Sul-Hospital de Clinicas de Porto Alegre, Rua Ramiro Barcelos 2350/sala 2040/90035-003 Porto Alegre (Brazil); Departamento de Fisica e Quimica da Universidade de Caxias do Sul, Rua Francisco Getulio Vargas 1130, 95001-970 Caxias do Sul (Brazil); Lemke, Ney [Programa Interdisciplinar em Computacao Aplicada, Unisinos, Av. Unisinos, 950, 93022-000 Sao Leopoldo, RS (Brazil); Corso, Gilberto [Departamento de Biofisica e Farmacologia, Centro de Biociencias, Universidade Federal do Rio Grande do Norte, Campus Universitario, 59072 970 Natal, RN (Brazil)]. E-mail: corso@dfte.ufrn.br

    2006-05-15

    In this work we propose an alternative DNA sequence analysis tool based on graph theoretical concepts. The methodology investigates the path topology of an organism genome through a triplet network. In this network, triplets in DNA sequence are vertices and two vertices are connected if they occur juxtaposed on the genome. We characterize this network topology by measuring the clustering coefficient. We test our methodology against two main bias: the guanine-cytosine (GC) content and 3-bp (base pairs) periodicity of DNA sequence. We perform the test constructing random networks with variable GC content and imposed 3-bp periodicity. A test group of some organisms is constructed and we investigate the methodology in the light of the constructed random networks. We conclude that the clustering coefficient is a valuable tool since it gives information that is not trivially contained in 3-bp periodicity neither in the variable GC content.

  18. Statistical properties and fractals of nucleotide clusters in DNA sequences

    International Nuclear Information System (INIS)

    Sun Tingting; Zhang Linxi; Chen Jin; Jiang Zhouting

    2004-01-01

    Statistical properties of nucleotide clusters in DNA sequences and their fractals are investigated in this paper. The average size of nucleotide clusters in non-coding sequence is larger than that in coding sequence. We investigate the cluster-size distribution P(S) for human chromosomes 21 and 22, and the results are different from previous works. The cluster-size distribution P(S 1 +S 2 ) with the total size of sequential Pu-cluster and Py-cluster S 1 +S 2 is studied. We observe that P(S 1 +S 2 ) follows an exponential decay both in coding and non-coding sequences. However, we get different results for human chromosomes 21 and 22. The probability distribution P(S 1 ,S 2 ) of nucleotide clusters with the size of sequential Pu-cluster and Py-cluster S 1 and S 2 respectively, is also examined. In the meantime, some of the linear correlations are obtained in the double logarithmic plots of the fluctuation F(l) versus nucleotide cluster distance l along the DNA chain. The power spectrums of nucleotide clusters are also discussed, and it is concluded that the curves are flat and hardly changed and the 1/3 frequency is neither observed in coding sequence nor in non-coding sequence. These investigations can provide some insights into the nucleotide clusters of DNA sequences

  19. DNA sequence responsible for the amplification of adjacent genes.

    Science.gov (United States)

    Pasion, S G; Hartigan, J A; Kumar, V; Biswas, D K

    1987-10-01

    A 10.3-kb DNA fragment in the 5'-flanking region of the rat prolactin (rPRL) gene was isolated from F1BGH(1)2C1, a strain of rat pituitary tumor cells (GH cells) that produces prolactin in response to 5-bromodeoxyuridine (BrdU). Following transfection and integration into genomic DNA of recipient mouse L cells, this DNA induced amplification of the adjacent thymidine kinase gene from Herpes simplex virus type 1 (HSV1TK). We confirmed the ability of this "Amplicon" sequence to induce amplification of other linked or unlinked genes in DNA-mediated gene transfer studies. When transferred into the mouse L cells with the 10.3-5'rPRL gene sequence of BrdU-responsive cells, both the human growth hormone and the HSV1TK genes are amplified in response to 5-bromodeoxyuridine. This observation is substantiated by BrdU-induced amplification of the cotransferred bacterial Neo gene. Cotransfection studies reveal that the BrdU-induced amplification capability is associated with a 4-kb DNA sequence in the 5'-flanking region of the rPRL gene of BrdU-responsive cells. These results demonstrate that genes of heterologous origin, linked or unlinked, and selected or unselected, can be coamplified when located within the amplification boundary of the Amplicon sequence.

  20. Anaplasma phagocytophilum in Danish sheep: confirmation by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Thamsborg Stig M

    2009-12-01

    Full Text Available Abstract Background The presence of Anaplasma phagocytophilum, an Ixodes ricinus transmitted bacterium, was investigated in two flocks of Danish grazing lambs. Direct PCR detection was performed on DNA extracted from blood and serum with subsequent confirmation by DNA sequencing. Methods 31 samples obtained from clinically normal lambs in 2000 from Fussingø, Jutland and 12 samples from ten lambs and two ewes from a clinical outbreak at Feddet, Zealand in 2006 were included in the study. Some of the animals from Feddet had shown clinical signs of polyarthritis and general unthriftiness prior to sampling. DNA extraction was optimized from blood and serum and detection achieved by a 16S rRNA targeted PCR with verification of the product by DNA sequencing. Results Five DNA extracts were found positive by PCR, including two samples from 2000 and three from 2006. For both series of samples the product was verified as A. phagocytophilum by DNA sequencing. Conclusions A. phagocytophilum was detected by molecular methods for the first time in Danish grazing lambs during the two seasons investigated (2000 and 2006.

  1. DNA Qualification Workflow for Next Generation Sequencing of Histopathological Samples

    Science.gov (United States)

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T.; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  2. DNA qualification workflow for next generation sequencing of histopathological samples.

    Directory of Open Access Journals (Sweden)

    Michele Simbolo

    Full Text Available Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF tissues, 6 formalin-fixed paraffin-embedded (FFPE tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard

  3. Compilation and analysis of Escherichia coli promoter DNA sequences.

    OpenAIRE

    Hawley, D K; McClure, W R

    1983-01-01

    The DNA sequence of 168 promoter regions (-50 to +10) for Escherichia coli RNA polymerase were compiled. The complete listing was divided into two groups depending upon whether or not the promoter had been defined by genetic (promoter mutations) or biochemical (5' end determination) criteria. A consensus promoter sequence based on homologies among 112 well-defined promoters was determined that was in substantial agreement with previous compilations. In addition, we have tabulated 98 promoter ...

  4. Natural history bycatch: a pipeline for identifying metagenomic sequences in RADseq data

    Directory of Open Access Journals (Sweden)

    Iris Holmes

    2018-04-01

    Full Text Available Background Reduced representation genomic datasets are increasingly becoming available from a variety of organisms. These datasets do not target specific genes, and so may contain sequences from parasites and other organisms present in the target tissue sample. In this paper, we demonstrate that (1 RADseq datasets can be used for exploratory analysis of tissue-specific metagenomes, and (2 tissue collections house complete metagenomic communities, which can be investigated and quantified by a variety of techniques. Methods We present an exploratory method for mining metagenomic “bycatch” sequences from a range of host tissue types. We use a combination of the pyRAD assembly pipeline, NCBI’s blastn software, and custom R scripts to isolate metagenomic sequences from RADseq type datasets. Results When we focus on sequences that align with existing references in NCBI’s GenBank, we find that between three and five percent of identifiable double-digest restriction site associated DNA (ddRAD sequences from host tissue samples are from phyla to contain known blood parasites. In addition to tissue samples, we examine ddRAD sequences from metagenomic DNA extracted snake and lizard hind-gut samples. We find that the sequences recovered from these samples match with expected bacterial and eukaryotic gut microbiome phyla. Discussion Our results suggest that (1 museum tissue banks originally collected for host DNA archiving are also preserving valuable parasite and microbiome communities, (2 that publicly available RADseq datasets may include metagenomic sequences that could be explored, and (3 that restriction site approaches are a useful exploratory technique to identify microbiome lineages that could be missed by primer-based approaches.

  5. Polyfluorophore Labels on DNA: Dramatic Sequence Dependence of Quenching

    Science.gov (United States)

    Teo, Yin Nah; Wilson, James N.

    2010-01-01

    We describe studies carried out in the DNA context to test how a common fluorescence quencher, dabcyl, interacts with oligodeoxynu-cleoside fluorophores (ODFs)—a system of stacked, electronically interacting fluorophores built on a DNA scaffold. We tested twenty different tetrameric ODF sequences containing varied combinations and orderings of pyrene (Y), benzopyrene (B), perylene (E), dimethylaminostilbene (D), and spacer (S) monomers conjugated to the 3′ end of a DNA oligomer. Hybridization of this probe sequence to a dabcyl-labeled complementary strand resulted in strong quenching of fluorescence in 85% of the twenty ODF sequences. The high efficiency of quenching was also established by their large Stern–Volmer constants (KSV) of between 2.1 × 104 and 4.3 × 105M−1, measured with a free dabcyl quencher. Interestingly, quenching of ODFs displayed strong sequence dependence. This was particularly evident in anagrams of ODF sequences; for example, the sequence BYDS had a KSV that was approximately two orders of magnitude greater than that of BSDY, which has the same dye composition. Other anagrams, for example EDSY and ESYD, also displayed different responses upon quenching by dabcyl. Analysis of spectra showed that apparent excimer and exciplex emission bands were quenched with much greater efficiency compared to monomer emission bands by at least an order of magnitude. This suggests an important role played by delocalized excited states of the π stack of fluorophores in the amplified quenching of fluorescence. PMID:19780115

  6. Spliced DNA Sequences in the Paramecium Germline: Their Properties and Evolutionary Potential

    Science.gov (United States)

    Catania, Francesco; McGrath, Casey L.; Doak, Thomas G.; Lynch, Michael

    2013-01-01

    Despite playing a crucial role in germline-soma differentiation, the evolutionary significance of developmentally regulated genome rearrangements (DRGRs) has received scant attention. An example of DRGR is DNA splicing, a process that removes segments of DNA interrupting genic and/or intergenic sequences. Perhaps, best known for shaping immune-system genes in vertebrates, DNA splicing plays a central role in the life of ciliated protozoa, where thousands of germline DNA segments are eliminated after sexual reproduction to regenerate a functional somatic genome. Here, we identify and chronicle the properties of 5,286 sequences that putatively undergo DNA splicing (i.e., internal eliminated sequences [IESs]) across the genomes of three closely related species of the ciliate Paramecium (P. tetraurelia, P. biaurelia, and P. sexaurelia). The study reveals that these putative IESs share several physical characteristics. Although our results are consistent with excision events being largely conserved between species, episodes of differential IES retention/excision occur, may have a recent origin, and frequently involve coding regions. Our findings indicate interconversion between somatic—often coding—DNA sequences and noncoding IESs, and provide insights into the role of DNA splicing in creating potentially functional genetic innovation. PMID:23737328

  7. Detecting differential DNA methylation from sequencing of bisulfite converted DNA of diverse species.

    Science.gov (United States)

    Huh, Iksoo; Wu, Xin; Park, Taesung; Yi, Soojin V

    2017-07-21

    DNA methylation is one of the most extensively studied epigenetic modifications of genomic DNA. In recent years, sequencing of bisulfite-converted DNA, particularly via next-generation sequencing technologies, has become a widely popular method to study DNA methylation. This method can be readily applied to a variety of species, dramatically expanding the scope of DNA methylation studies beyond the traditionally studied human and mouse systems. In parallel to the increasing wealth of genomic methylation profiles, many statistical tools have been developed to detect differentially methylated loci (DMLs) or differentially methylated regions (DMRs) between biological conditions. We discuss and summarize several key properties of currently available tools to detect DMLs and DMRs from sequencing of bisulfite-converted DNA. However, the majority of the statistical tools developed for DML/DMR analyses have been validated using only mammalian data sets, and less priority has been placed on the analyses of invertebrate or plant DNA methylation data. We demonstrate that genomic methylation profiles of non-mammalian species are often highly distinct from those of mammalian species using examples of honey bees and humans. We then discuss how such differences in data properties may affect statistical analyses. Based on these differences, we provide three specific recommendations to improve the power and accuracy of DML and DMR analyses of invertebrate data when using currently available statistical tools. These considerations should facilitate systematic and robust analyses of DNA methylation from diverse species, thus advancing our understanding of DNA methylation. © The Author 2017. Published by Oxford University Press.

  8. The cDNA sequence of a neutral horseradish peroxidase.

    Science.gov (United States)

    Bartonek-Roxå, E; Eriksson, H; Mattiasson, B

    1991-02-16

    A cDNA clone encoding a horseradish (Armoracia rusticana) peroxidase has been isolated and characterized. The cDNA contains 1378 nucleotides excluding the poly(A) tail and the deduced protein contains 327 amino acids which includes a 28 amino acid leader sequence. The predicted amino acid sequence is nine amino acids shorter than the major isoenzyme belonging to the horseradish peroxidase C group (HRP-C) and the sequence shows 53.7% identity with this isoenzyme. The described clone encodes nine cysteines of which eight correspond well with the cysteines found in HRP-C. Five potential N-glycosylation sites with the general sequence Asn-X-Thr/Ser are present in the deduced sequence. Compared to the earlier described HRP-C this is three glycosylation sites less. The shorter sequence and fewer N-glycosylation sites give the native isoenzyme a molecular weight of several thousands less than the horseradish peroxidase C isoenzymes. Comparison with the net charge value of HRP-C indicates that the described cDNA clone encodes a peroxidase which has either the same or a slightly less basic pI value, depending on whether the encoded protein is N-terminally blocked or not. This excludes the possibility that HRP-n could belong to either the HRP-A, -D or -E groups. The low sequence identity (53.7%) with HRP-C indicates that the described clone does not belong to the HRP-C isoenzyme group and comparison of the total amino acid composition with the HRP-B group does not place the described clone within this isoenzyme group. Our conclusion is that the described cDNA clone encodes a neutral horseradish peroxidase which belongs to a new, not earlier described, horseradish peroxidase group.

  9. RNA-DNA sequence differences spell genetic code ambiguities

    DEFF Research Database (Denmark)

    Bentin, Thomas; Nielsen, Michael L

    2013-01-01

    A recent paper in Science by Li et al. 2011(1) reports widespread sequence differences in the human transcriptome between RNAs and their encoding genes termed RNA-DNA differences (RDDs). The findings could add a new layer of complexity to gene expression but the study has been criticized. ...

  10. cDNA, genomic sequence cloning and overexpression of ribosomal ...

    African Journals Online (AJOL)

    RPS16 of eukaryote is a component of the 40S small ribosomal subunit encoded by RPS16 gene and is also a homolog of prokaryotic RPS9. The cDNA and genomic sequence of RPS16 was cloned successfully for the first time from the Giant Panda (Ailuropoda melanoleuca) using reverse transcription-polymerase chain ...

  11. (Brassicaceae) based on nuclear ribosomal ITS DNA sequences

    Indian Academy of Sciences (India)

    Home; Journals; Journal of Genetics; Volume 93; Issue 2. Phylogeny and biogeography of Alyssum (Brassicaceae) based on nuclear ribosomal ITS DNA sequences. Yan Li Yan Kong Zhe Zhang Yanqiang Yin Bin Liu Guanghui Lv Xiyong Wang. Research Article Volume 93 Issue 2 August 2014 pp 313-323 ...

  12. Integration of hepatitis B virus DNA in chromosome-specific satellite sequences

    International Nuclear Information System (INIS)

    Shaul, Y.; Garcia, P.D.; Schonberg, S.; Rutter, W.J.

    1986-01-01

    The authors previously reported the cloning and detailed analysis of the integrated hepatitis B virus sequences in a human hepatoma cell line. They report here the integration of at least one of hepatitis B virus at human satellite DNA sequences. The majority of the cellular sequences identified by this satellite were organized as a multimeric composition of a 0.6-kilobase EcoRI fragment. This clone hybridized in situ almost exclusively to the centromeric heterochromatin of chromosomes 1 and 16 and to a lower extent to chromosome 2 and to the heterochromatic region of the Y chromosome. The immediate flanking host sequence appeared as a hierarchy of repeating units which were almost identical to a previously reported human satellite III DNA sequence

  13. High Performance Systolic Array Core Architecture Design for DNA Sequencer

    Directory of Open Access Journals (Sweden)

    Saiful Nurdin Dayana

    2018-01-01

    Full Text Available This paper presents a high performance systolic array (SA core architecture design for Deoxyribonucleic Acid (DNA sequencer. The core implements the affine gap penalty score Smith-Waterman (SW algorithm. This time-consuming local alignment algorithm guarantees optimal alignment between DNA sequences, but it requires quadratic computation time when performed on standard desktop computers. The use of linear SA decreases the time complexity from quadratic to linear. In addition, with the exponential growth of DNA databases, the SA architecture is used to overcome the timing issue. In this work, the SW algorithm has been captured using Verilog Hardware Description Language (HDL and simulated using Xilinx ISIM simulator. The proposed design has been implemented in Xilinx Virtex -6 Field Programmable Gate Array (FPGA and improved in the core area by 90% reduction.

  14. Sequence heterogeneity accelerates protein search for targets on DNA

    International Nuclear Information System (INIS)

    Shvets, Alexey A.; Kolomeisky, Anatoly B.

    2015-01-01

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome

  15. Sequence heterogeneity accelerates protein search for targets on DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shvets, Alexey A.; Kolomeisky, Anatoly B., E-mail: tolya@rice.edu [Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005 (United States)

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  16. Exome sequencing identifies ZNF644 mutations in high myopia.

    Directory of Open Access Journals (Sweden)

    Yi Shi

    2011-06-01

    Full Text Available Myopia is the most common ocular disorder worldwide, and high myopia in particular is one of the leading causes of blindness. Genetic factors play a critical role in the development of myopia, especially high myopia. Recently, the exome sequencing approach has been successfully used for the disease gene identification of Mendelian disorders. Here we show a successful application of exome sequencing to identify a gene for an autosomal dominant disorder, and we have identified a gene potentially responsible for high myopia in a monogenic form. We captured exomes of two affected individuals from a Han Chinese family with high myopia and performed sequencing analysis by a second-generation sequencer with a mean coverage of 30× and sufficient depth to call variants at ∼97% of each targeted exome. The shared genetic variants of these two affected individuals in the family being studied were filtered against the 1000 Genomes Project and the dbSNP131 database. A mutation A672G in zinc finger protein 644 isoform 1 (ZNF644 was identified as being related to the phenotype of this family. After we performed sequencing analysis of the exons in the ZNF644 gene in 300 sporadic cases of high myopia, we identified an additional five mutations (I587V, R680G, C699Y, 3'UTR+12 C>G, and 3'UTR+592 G>A in 11 different patients. All these mutations were absent in 600 normal controls. The ZNF644 gene was expressed in human retinal and retinal pigment epithelium (RPE. Given that ZNF644 is predicted to be a transcription factor that may regulate genes involved in eye development, mutation may cause the axial elongation of eyeball found in high myopia patients. Our results suggest that ZNF644 might be a causal gene for high myopia in a monogenic form.

  17. DNA enrichment approaches to identify unauthorized genetically modified organisms (GMOs).

    Science.gov (United States)

    Arulandhu, Alfred J; van Dijk, Jeroen P; Dobnik, David; Holst-Jensen, Arne; Shi, Jianxin; Zel, Jana; Kok, Esther J

    2016-07-01

    With the increased global production of different genetically modified (GM) plant varieties, chances increase that unauthorized GM organisms (UGMOs) may enter the food chain. At the same time, the detection of UGMOs is a challenging task because of the limited sequence information that will generally be available. PCR-based methods are available to detect and quantify known UGMOs in specific cases. If this approach is not feasible, DNA enrichment of the unknown adjacent sequences of known GMO elements is one way to detect the presence of UGMOs in a food or feed product. These enrichment approaches are also known as chromosome walking or gene walking (GW). In recent years, enrichment approaches have been coupled with next generation sequencing (NGS) analysis and implemented in, amongst others, the medical and microbiological fields. The present review will provide an overview of these approaches and an evaluation of their applicability in the identification of UGMOs in complex food or feed samples.

  18. Yeast identification by sequencing, biochemical kits, MALDI-TOF MS and rep-PCR DNA fingerprinting.

    Science.gov (United States)

    Zhao, Ying; Tsang, Chi-Ching; Xiao, Meng; Chan, Jasper F W; Lau, Susanna K P; Kong, Fanrong; Xu, Yingchun; Woo, Patrick C Y

    2017-12-08

    No study has comprehensively evaluated the performance of 28S nrDNA and ITS sequencing, commercial biochemical test kits, MALDI-TOF MS platforms, and the emerging rep-PCR DNA fingerprinting technology using a cohort of yeast strains collected from a clinical microbiology laboratory. In this study, using 71 clinically important yeast isolates (excluding Candida albicans) collected from a single centre, we determined the concordance of 28S nrDNA and ITS sequencing and evaluated the performance of two commercial test kits, two MALDI-TOF MS platforms, and rep-PCR DNA fingerprinting. 28S nrDNA and ITS sequencing showed complete agreement on the identities of the 71 isolates. Using sequencing results as the standard, 78.9% and 71.8% isolates were correctly identified using the API 20C AUX and Vitek 2 YST ID Card systems, respectively; and 90.1% and 80.3% isolates were correctly identified using the Bruker and Vitek MALDI-TOF MS platforms, respectively. Of the 18 strains belonging to the Candida parapsilosis species complex tested by DiversiLab automated rep-PCR DNA fingerprinting, all were identified only as Candida parapsilosis with similarities ≥93.2%, indicating the misidentification of Candida metapsilosis and Candida orthopsilosis. However, hierarchical cluster analysis of the rep-PCR DNA fingerprints of these three species within this species complex formed three different discrete clusters, indicating that this technology can potentially differentiate the three species. To achieve higher accuracies of identification, the databases of commercial biochemical test kits, MALDI-TOF MS platforms, and DiversiLab automated rep-PCR DNA fingerprinting needs further enrichment, particularly for uncommonly encountered yeast species. © The Author 2017. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  19. VoSeq: a voucher and DNA sequence web application.

    Directory of Open Access Journals (Sweden)

    Carlos Peña

    Full Text Available There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit.

  20. Ribosomal PCR and DNA sequencing for detection and identification of bacteria

    DEFF Research Database (Denmark)

    Jensen, Kristine Helander; Dargis, Rimtas; Christensen, Jens Jørgen

    2014-01-01

    -haemolytic streptococci, especially within the mitis group. The data show that ribosomal PCR with subsequent DNA sequencing of the PCR product is a most valuable supplement to culture for identifying bacterial agents of both acute and prolonged infections. However, some bacteria, including non-haemolytic streptococci...

  1. High-Throughput DNA sequencing of ancient wood.

    Science.gov (United States)

    Wagner, Stefanie; Lagane, Frédéric; Seguin-Orlando, Andaine; Schubert, Mikkel; Leroy, Thibault; Guichoux, Erwan; Chancerel, Emilie; Bech-Hebelstrup, Inger; Bernard, Vincent; Billard, Cyrille; Billaud, Yves; Bolliger, Matthias; Croutsch, Christophe; Čufar, Katarina; Eynaud, Frédérique; Heussner, Karl Uwe; Köninger, Joachim; Langenegger, Fabien; Leroy, Frédéric; Lima, Christine; Martinelli, Nicoletta; Momber, Garry; Billamboz, André; Nelle, Oliver; Palomo, Antoni; Piqué, Raquel; Ramstein, Marianne; Schweichel, Roswitha; Stäuble, Harald; Tegel, Willy; Terradas, Xavier; Verdin, Florence; Plomion, Christophe; Kremer, Antoine; Orlando, Ludovic

    2018-03-01

    Reconstructing the colonization and demographic dynamics that gave rise to extant forests is essential to forecasts of forest responses to environmental changes. Classical approaches to map how population of trees changed through space and time largely rely on pollen distribution patterns, with only a limited number of studies exploiting DNA molecules preserved in wooden tree archaeological and subfossil remains. Here, we advance such analyses by applying high-throughput (HTS) DNA sequencing to wood archaeological and subfossil material for the first time, using a comprehensive sample of 167 European white oak waterlogged remains spanning a large temporal (from 550 to 9,800 years) and geographical range across Europe. The successful characterization of the endogenous DNA and exogenous microbial DNA of 140 (~83%) samples helped the identification of environmental conditions favouring long-term DNA preservation in wood remains, and started to unveil the first trends in the DNA decay process in wood material. Additionally, the maternally inherited chloroplast haplotypes of 21 samples from three periods of forest human-induced use (Neolithic, Bronze Age and Middle Ages) were found to be consistent with those of modern populations growing in the same geographic areas. Our work paves the way for further studies aiming at using ancient DNA preserved in wood to reconstruct the micro-evolutionary response of trees to climate change and human forest management. © 2018 John Wiley & Sons Ltd.

  2. Micropatterning stretched and aligned DNA for sequence-specific nanolithography

    Science.gov (United States)

    Petit, Cecilia Anna Paulette

    Techniques for fabricating nanostructured materials can be categorized as either "top-down" or "bottom-up". Top-down techniques use lithography and contact printing to create patterned surfaces and microfluidic channels that can corral and organize nanoscale structures, such as molecules and nanorods in contrast; bottom-up techniques use self-assembly or molecular recognition to direct the organization of materials. A central goal in nanotechnology is the integration of bottom-up and top-down assembly strategies for materials development, device design; and process integration. With this goal in mind, we have developed strategies that will allow this integration by using DNA as a template for nanofabrication; two top-down approaches allow the placement of these templates, while the bottom-up technique uses the specific sequence of bases to pattern materials along each strand of DNA. Our first top-down approach, termed combing of molecules in microchannels (COMMIC), produces microscopic patterns of stretched and aligned molecules of DNA on surfaces. This process consists of passing an air-water interface over end adsorbed molecules inside microfabricated channels. The geometry of the microchannel directs the placement of the DNA molecules, while the geometry of the airwater interface directs the local orientation and curvature of the molecules. We developed another top-down strategy for creating micropatterns of stretched and aligned DNA using surface chemistry. Because DNA stretching occurs on hydrophobic surfaces, this technique uses photolithography to pattern vinyl-terminated silanes on glass When these surface-, are immersed in DNA solution, molecules adhere preferentially to the silanized areas. This approach has also proven useful in patterning protein for cell adhesion studies. Finally, we describe the use of these stretched and aligned molecules of DNA as templates for the subsequent bottom-up construction of hetero-structures through hybridization

  3. A DNA sequence element that advances replication origin activation time in Saccharomyces cerevisiae.

    Science.gov (United States)

    Pohl, Thomas J; Kolor, Katherine; Fangman, Walton L; Brewer, Bonita J; Raghuraman, M K

    2013-11-06

    Eukaryotic origins of DNA replication undergo activation at various times in S-phase, allowing the genome to be duplicated in a temporally staggered fashion. In the budding yeast Saccharomyces cerevisiae, the activation times of individual origins are not intrinsic to those origins but are instead governed by surrounding sequences. Currently, there are two examples of DNA sequences that are known to advance origin activation time, centromeres and forkhead transcription factor binding sites. By combining deletion and linker scanning mutational analysis with two-dimensional gel electrophoresis to measure fork direction in the context of a two-origin plasmid, we have identified and characterized a 19- to 23-bp and a larger 584-bp DNA sequence that are capable of advancing origin activation time.

  4. Autonomous replication of plasmids bearing monkey DNA origin-enriched sequences

    International Nuclear Information System (INIS)

    Frappier, L.; Zannis-Hadjopoulos, M.

    1987-01-01

    Twelve clones of origin-enriched sequences (ORS) isolated from early replicating monkey (CV-1) DNA were examined for transient episomal replication in transfected CV-1, COS-7, and HeLa cells. Plasmid DNA was isolated at time intervals after transfection and screened by the Dpn I resistance assay or by the bromodeoxyuridine substitution assay to differentiate between input and replicated DNA. The authors have identified four monkey ORS (ORS3, -8, -9, and -12) that can support plasmid replication in mammalian cells. This replication is carried out in a controlled and semiconservative manner characteristic of mammalian replicons. ORS replication was most efficient in HeLa cells. Electron microscopy showed ORS8 and ORS12 plasmids of the correct size with replication bubbles. Using a unique restriction site in ORS12, we have mapped the replication bubble within the monkey DNA sequence

  5. Inspecting Targeted Deep Sequencing of Whole Genome Amplified DNA Versus Fresh DNA for Somatic Mutation Detection: A Genetic Study in Myelodysplastic Syndrome Patients.

    Science.gov (United States)

    Palomo, Laura; Fuster-Tormo, Francisco; Alvira, Daniel; Ademà, Vera; Armengol, María Pilar; Gómez-Marzo, Paula; de Haro, Nuri; Mallo, Mar; Xicoy, Blanca; Zamora, Lurdes; Solé, Francesc

    2017-08-01

    Whole genome amplification (WGA) has become an invaluable method for preserving limited samples of precious stock material and has been used during the past years as an alternative tool to increase the amount of DNA before library preparation for next-generation sequencing. Myelodysplastic syndromes (MDS) are a group of clonal hematopoietic stem cell disorders characterized by presenting somatic mutations in several myeloid-related genes. In this work, targeted deep sequencing has been performed on four paired fresh DNA and WGA DNA samples from bone marrow of MDS patients, to assess the feasibility of using WGA DNA for detecting somatic mutations. The results of this study highlighted that, in general, the sequencing and alignment statistics of fresh DNA and WGA DNA samples were similar. However, after variant calling and when considering variants detected at all frequencies, there was a high level of discordance between fresh DNA and WGA DNA (overall, a higher number of variants was detected in WGA DNA). After proper filtering, a total of three somatic mutations were detected in the cohort. All somatic mutations detected in fresh DNA were also identified in WGA DNA and validated by whole exome sequencing.

  6. Pericentric satellite DNA sequences in Pipistrellus pipistrellus (Vespertilionidae; Chiroptera).

    Science.gov (United States)

    Barragán, M J L; Martínez, S; Marchal, J A; Fernández, R; Bullejos, M; Díaz de la Guardia, R; Sánchez, A

    2003-09-01

    This paper reports the molecular and cytogenetic characterization of a HindIII family of satellite DNA in the bat species Pipistrellus pipistrellus. This satellite is organized in tandem repeats of 418 bp monomer units, and represents approximately 3% of the whole genome. The consensus sequence from five cloned monomer units has an A-T content of 62.20%. We have found differences in the ladder pattern of bands between two populations of the same species. These differences are probably because of the absence of the target sites for the HindIII enzyme in most monomer units of one population, but not in the other. Fluorescent in situ hybridization (FISH) localized the satellite DNA in the pericentromeric regions of all autosomes and the X chromosome, but it was absent from the Y chromosome. Digestion of genomic DNAs with HpaII and its isoschizomer MspI demonstrated that these repetitive DNA sequences are not methylated. Other bat species were tested for the presence of this repetitive DNA. It was absent in five Vespertilionidae and one Rhinolophidae species, indicating that it could be a species/genus specific, repetitive DNA family.

  7. Early Lyme disease with spirochetemia - diagnosed by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Jones William

    2010-11-01

    Full Text Available Abstract Background A sensitive and analytically specific nucleic acid amplification test (NAAT is valuable in confirming the diagnosis of early Lyme disease at the stage of spirochetemia. Findings Venous blood drawn from patients with clinical presentations of Lyme disease was tested for the standard 2-tier screen and Western Blot serology assay for Lyme disease, and also by a nested polymerase chain reaction (PCR for B. burgdorferi sensu lato 16S ribosomal DNA. The PCR amplicon was sequenced for B. burgdorferi genomic DNA validation. A total of 130 patients visiting emergency room (ER or Walk-in clinic (WALKIN, and 333 patients referred through the private physicians' offices were studied. While 5.4% of the ER/WALKIN patients showed DNA evidence of spirochetemia, none (0% of the patients referred from private physicians' offices were DNA-positive. In contrast, while 8.4% of the patients referred from private physicians' offices were positive for the 2-tier Lyme serology assay, only 1.5% of the ER/WALKIN patients were positive for this antibody test. The 2-tier serology assay missed 85.7% of the cases of early Lyme disease with spirochetemia. The latter diagnosis was confirmed by DNA sequencing. Conclusion Nested PCR followed by automated DNA sequencing is a valuable supplement to the standard 2-tier antibody assay in the diagnosis of early Lyme disease with spirochetemia. The best time to test for Lyme spirochetemia is when the patients living in the Lyme disease endemic areas develop unexplained symptoms or clinical manifestations that are consistent with Lyme disease early in the course of their illness.

  8. Spectral sum rules and search for periodicities in DNA sequences

    International Nuclear Information System (INIS)

    Chechetkin, V.R.

    2011-01-01

    Periodic patterns play the important regulatory and structural roles in genomic DNA sequences. Commonly, the underlying periodicities should be understood in a broad statistical sense, since the corresponding periodic patterns have been strongly distorted by the random point mutations and insertions/deletions during molecular evolution. The latent periodicities in DNA sequences can be efficiently displayed by Fourier transform. The criteria of significance for observed periodicities are obtained via the comparison versus the counterpart characteristics of the reference random sequences. We show that the restrictions imposed on the significance criteria by the rigorous spectral sum rules can be rationally described with De Finetti distribution. This distribution provides the convenient intermediate asymptotic form between Rayleigh distribution and exact combinatoric theory. - Highlights: → We study the significance criteria for latent periodicities in DNA sequences. → The constraints imposed by sum rules can be described with De Finetti distribution. → It is intermediate between Rayleigh distribution and exact combinatoric theory. → Theory is applicable to the study of correlations between different periodicities. → The approach can be generalized to the arbitrary discrete Fourier transform.

  9. Applications of statistical physics and information theory to the analysis of DNA sequences

    Science.gov (United States)

    Grosse, Ivo

    2000-10-01

    DNA carries the genetic information of most living organisms, and the of genome projects is to uncover that genetic information. One basic task in the analysis of DNA sequences is the recognition of protein coding genes. Powerful computer programs for gene recognition have been developed, but most of them are based on statistical patterns that vary from species to species. In this thesis I address the question if there exist universal statistical patterns that are different in coding and noncoding DNA of all living species, regardless of their phylogenetic origin. In search for such species-independent patterns I study the mutual information function of genomic DNA sequences, and find that it shows persistent period-three oscillations. To understand the biological origin of the observed period-three oscillations, I compare the mutual information function of genomic DNA sequences to the mutual information function of stochastic model sequences. I find that the pseudo-exon model is able to reproduce the mutual information function of genomic DNA sequences. Moreover, I find that a generalization of the pseudo-exon model can connect the existence and the functional form of long-range correlations to the presence and the length distributions of coding and noncoding regions. Based on these theoretical studies I am able to find an information-theoretical quantity, the average mutual information (AMI), whose probability distributions are significantly different in coding and noncoding DNA, while they are almost identical in all studied species. These findings show that there exist universal statistical patterns that are different in coding and noncoding DNA of all studied species, and they suggest that the AMI may be used to identify genes in different living species, irrespective of their taxonomic origin.

  10. Phylogenetic relationships of the Gomphales based on nuc-25S-rDNA, mit-12S-rDNA, and mit-atp6-DNA combined sequences

    Science.gov (United States)

    Admir J. Giachini; Kentaro Hosaka; Eduardo Nouhra; Joseph Spatafora; James M. Trappe

    2010-01-01

    Phylogenetic relationships among Geastrales, Gomphales, Hysterangiales, and Phallales were estimated via combined sequences: nuclear large subunit ribosomal DNA (nuc-25S-rDNA), mitochondrial small subunit ribosomal DNA (mit-12S-rDNA), and mitochondrial atp6 DNA (mit-atp6-DNA). Eighty-one taxa comprising 19 genera and 58 species...

  11. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, Natasha V. (Okemos, MI); Broekaert, Willem F. (Dilbeek, BE); Chua, Nam-Hai (Scarsdale, NY); Kush, Anil (New York, NY)

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu GOVERNMENT RIGHTS This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.

  12. Genomic DNA Enrichment Using Sequence Capture Microarrays: a Novel Approach to Discover Sequence Nucleotide Polymorphisms (SNP) in Brassica napus L

    Science.gov (United States)

    Clarke, Wayne E.; Parkin, Isobel A.; Gajardo, Humberto A.; Gerhardt, Daniel J.; Higgins, Erin; Sidebottom, Christine; Sharpe, Andrew G.; Snowdon, Rod J.; Federico, Maria L.; Iniguez-Luy, Federico L.

    2013-01-01

    Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci –QTL– analysis) to traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome generating a novel data resource for research and breeding this crop species. PMID:24312619

  13. Close Sequence Comparisons are Sufficient to Identify Humancis-Regulatory Elements

    Energy Technology Data Exchange (ETDEWEB)

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.

  14. Development of a defined-sequence DNA system for use in DNA misrepair studies

    International Nuclear Information System (INIS)

    Sutton, S.; Tobias, C.A.

    1984-01-01

    The authors have developed a system that allows them to study cellular DNA repair processes at the molecular level. In particular, the authors are using this system to examine the consequences of a misrepair of radiation-induced DNA damage, as a function of dose. The cells being used are specially engineered haploid yeast cells. Maintained in the cells, at one copy per cell, is a cen plasmid, a plasmid that behaves like a functional chromosome. This plasmid carries a small defined sequence of DNA from the E. coli lac z gene. It is this lac z region (called the alpha region) that serves as the target for radiation damage. Two copies of the complimentary portion of the lac z gene are integrated into the yeast genome. Irradiated cells are screened for possible mutation in the alpha region by testing the cells' ability to hydrolyze xgal, a lactose substrate. The DNA of interest is then extracted from the cells, sequenced, and the sequence is compared to that of the control. Unlike the usual defined-sequence DNA systems, theirs is an in vivo system. A disadvantage is the relatively high background mutation rate. Results achieved with this system, as well as future applications, are discussed

  15. Amplification of a transcriptionally active DNA sequence in the human brain

    International Nuclear Information System (INIS)

    Yakovlev, A.G.; Sazonov, A.E.; Spunde, A.Ya.; Gindilis, V.M.

    1986-01-01

    The authors present their findings of tissue-specific amplification of a DNA fragment actively transcribed in the human brain. This genome fragment was found in the library complement of cDNA of the human brain and evidently belongs to a new class of moderate repetitions of DNA with an unstable copying capacity in the human genome. The authors isolated total cell RNA from various human tissues (brain, placenta), and rat tissues (brain, liver), by the method of hot phenol extraction with guanidine thiocynate. The poly(A + ) RNA fraction was isolated by chromatography. Synthesis of cDNA was done on a matrix of poly(A + ) RNA of human brain. The cDNA obtained was cloned in plasmid pBR322 for the PstI site using (dC/dG) sequences synthesized on the 3' ends of the vector molecule and cDNA respectively. In cloning 75 ng cDNA, the authors obtained approximately 10 5 recombinant. This library was analyzed by the hybridization method on columns with two radioactive ( 32 P) probes: the total cDNA preparation and the total nuclear DNA from the human brain. The number of copies of the cloned DNA fragment in the genome was determined by dot hybridization. Restricting fragments of human and rat DNA genomes homologous to the cloned cDNA were identified on radio-autographs. In each case, 10 micrograms of EcoRI DNA hydrolyzate was fractionated in 1% agarose gel. The probe was also readied with RNA samples fractionated in agarose gel with formaldehyde and transferred to a nitrocellulose filter under weak vacuum. The filter was hybridized with 0.1 micrograms DNA pAG 02, labeled with ( 32 P) to a specific activity of 0.5-1 x 10 9 counts/min x microgram. The autograph was exposed with amplifying screens at -70 0 C for 2 days

  16. Rapid DNA sequencing by horizontal ultrathin gel electrophoresis.

    OpenAIRE

    Brumley, R L; Smith, L M

    1991-01-01

    A horizontal polyacrylamide gel electrophoresis apparatus has been developed that decreases the time required to separate the DNA fragments produced in enzymatic sequencing reactions. The configuration of this apparatus and the use of circulating coolant directly under the glass plates result in heat exchange that is approximately nine times more efficient than passive thermal transfer methods commonly used. Bubble-free gels as thin as 25 microns can be routinely cast on this device. The appl...

  17. Assessing Mitochondrial DNA Variation and Copy Number in Lymphocytes of ~2,000 Sardinians Using Tailored Sequencing Analysis Tools.

    Directory of Open Access Journals (Sweden)

    Jun Ding

    2015-07-01

    Full Text Available DNA sequencing identifies common and rare genetic variants for association studies, but studies typically focus on variants in nuclear DNA and ignore the mitochondrial genome. In fact, analyzing variants in mitochondrial DNA (mtDNA sequences presents special problems, which we resolve here with a general solution for the analysis of mtDNA in next-generation sequencing studies. The new program package comprises 1 an algorithm designed to identify mtDNA variants (i.e., homoplasmies and heteroplasmies, incorporating sequencing error rates at each base in a likelihood calculation and allowing allele fractions at a variant site to differ across individuals; and 2 an estimation of mtDNA copy number in a cell directly from whole-genome sequencing data. We also apply the methods to DNA sequence from lymphocytes of ~2,000 SardiNIA Project participants. As expected, mothers and offspring share all homoplasmies but a lesser proportion of heteroplasmies. Both homoplasmies and heteroplasmies show 5-fold higher transition/transversion ratios than variants in nuclear DNA. Also, heteroplasmy increases with age, though on average only ~1 heteroplasmy reaches the 4% level between ages 20 and 90. In addition, we find that mtDNA copy number averages ~110 copies/lymphocyte and is ~54% heritable, implying substantial genetic regulation of the level of mtDNA. Copy numbers also decrease modestly but significantly with age, and females on average have significantly more copies than males. The mtDNA copy numbers are significantly associated with waist circumference (p-value = 0.0031 and waist-hip ratio (p-value = 2.4×10-5, but not with body mass index, indicating an association with central fat distribution. To our knowledge, this is the largest population analysis to date of mtDNA dynamics, revealing the age-imposed increase in heteroplasmy, the relatively high heritability of copy number, and the association of copy number with metabolic traits.

  18. G-quadruplex and G-rich sequence stimulate Pif1p-catalyzed downstream duplex DNA unwinding through reducing waiting time at ss/dsDNA junction

    Science.gov (United States)

    Zhang, Bo; Wu, Wen-Qiang; Liu, Na-Nv; Duan, Xiao-Lei; Li, Ming; Dou, Shuo-Xing; Hou, Xi-Miao; Xi, Xu-Guang

    2016-01-01

    Alternative DNA structures that deviate from B-form double-stranded DNA such as G-quadruplex (G4) DNA can be formed by G-rich sequences that are widely distributed throughout the human genome. We have previously shown that Pif1p not only unfolds G4, but also unwinds the downstream duplex DNA in a G4-stimulated manner. In the present study, we further characterized the G4-stimulated duplex DNA unwinding phenomenon by means of single-molecule fluorescence resonance energy transfer. It was found that Pif1p did not unwind the partial duplex DNA immediately after unfolding the upstream G4 structure, but rather, it would dwell at the ss/dsDNA junction with a ‘waiting time’. Further studies revealed that the waiting time was in fact related to a protein dimerization process that was sensitive to ssDNA sequence and would become rapid if the sequence is G-rich. Furthermore, we identified that the G-rich sequence, as the G4 structure, equally stimulates duplex DNA unwinding. The present work sheds new light on the molecular mechanism by which G4-unwinding helicase Pif1p resolves physiological G4/duplex DNA structures in cells. PMID:27471032

  19. Efficiency of ITS Sequences for DNA Barcoding in Passiflora (Passifloraceae

    Directory of Open Access Journals (Sweden)

    Giovanna Câmara Giudicelli

    2015-04-01

    Full Text Available DNA barcoding is a technique for discriminating and identifying species using short, variable, and standardized DNA regions. Here, we tested for the first time the performance of plastid and nuclear regions as DNA barcodes in Passiflora. This genus is a largely variable, with more than 900 species of high ecological, commercial, and ornamental importance. We analyzed 1034 accessions of 222 species representing the four subgenera of Passiflora and evaluated the effectiveness of five plastid regions and three nuclear datasets currently employed as DNA barcodes in plants using barcoding gap, applied similarity-, and tree-based methods. The plastid regions were able to identify less than 45% of species, whereas the nuclear datasets were efficient for more than 50% using “best match” and “best close match” methods of TaxonDNA software. All subgenera presented higher interspecific pairwise distances and did not fully overlap with the intraspecific distance, and similarity-based methods showed better results than tree-based methods. The nuclear ribosomal internal transcribed spacer 1 (ITS1 region presented a higher discrimination power than the other datasets and also showed other desirable characteristics as a DNA barcode for this genus. Therefore, we suggest that this region should be used as a starting point to identify Passiflora species.

  20. Is photocleavage of DNA by YOYO-1 using a synchrotron radiation light source sequence dependent?

    DEFF Research Database (Denmark)

    Gilroy, Emma L.; Hoffmann, Søren Vrønning; Jones, Nykola C.

    2011-01-01

    ) throughout the irradiation period. The dependence of LD signals on DNA sequences and on time in the intense light beam was explored and quantified for single-stranded poly(dA), poly[(dA-dT)2], calf thymus DNA (ctDNA) and Micrococcus luteus DNA (mlDNA). The DNA and ligand regions of the spectrum showed...

  1. Using sequence-specific chemical and structural properties of DNA to predict transcription factor binding sites.

    Directory of Open Access Journals (Sweden)

    Amy L Bauer

    2010-11-01

    Full Text Available An important step in understanding gene regulation is to identify the DNA binding sites recognized by each transcription factor (TF. Conventional approaches to prediction of TF binding sites involve the definition of consensus sequences or position-specific weight matrices and rely on statistical analysis of DNA sequences of known binding sites. Here, we present a method called SiteSleuth in which DNA structure prediction, computational chemistry, and machine learning are applied to develop models for TF binding sites. In this approach, binary classifiers are trained to discriminate between true and false binding sites based on the sequence-specific chemical and structural features of DNA. These features are determined via molecular dynamics calculations in which we consider each base in different local neighborhoods. For each of 54 TFs in Escherichia coli, for which at least five DNA binding sites are documented in RegulonDB, the TF binding sites and portions of the non-coding genome sequence are mapped to feature vectors and used in training. According to cross-validation analysis and a comparison of computational predictions against ChIP-chip data available for the TF Fis, SiteSleuth outperforms three conventional approaches: Match, MATRIX SEARCH, and the method of Berg and von Hippel. SiteSleuth also outperforms QPMEME, a method similar to SiteSleuth in that it involves a learning algorithm. The main advantage of SiteSleuth is a lower false positive rate.

  2. Complete cDNA sequence and amino acid analysis of a bovine ribonuclease K6 gene.

    Science.gov (United States)

    Pietrowski, D; Förster, M

    2000-01-01

    The complete cDNA sequence of a ribonuclease k6 gene of Bos Taurus has been determined. It codes for a protein with 154 amino acids and contains the invariant cysteine, histidine and lysine residues as well as the characteristic motifs specific to ribonuclease active sites. The deduced protein sequence is 27 residues longer than other known ribonucleases k6 and shows amino acids exchanges which could reflect a strain specificity or polymorphism within the bovine genome. Based on sequence similarity we have termed the identified gene bovine ribonuclease k6 b (brk6b).

  3. DNA Barcoding: Amplification and sequence analysis of rbcl and matK genome regions in three divergent plant species

    Directory of Open Access Journals (Sweden)

    Javed Iqbal Wattoo

    2016-11-01

    Full Text Available Background: DNA barcoding is a novel method of species identification based on nucleotide diversity of conserved sequences. The establishment and refining of plant DNA barcoding systems is more challenging due to high genetic diversity among different species. Therefore, targeting the conserved nuclear transcribed regions would be more reliable for plant scientists to reveal genetic diversity, species discrimination and phylogeny. Methods: In this study, we amplified and sequenced the chloroplast DNA regions (matk+rbcl of Solanum nigrum, Euphorbia helioscopia and Dalbergia sissoo to study the functional annotation, homology modeling and sequence analysis to allow a more efficient utilization of these sequences among different plant species. These three species represent three families; Solanaceae, Euphorbiaceae and Fabaceae respectively. Biological sequence homology and divergence of amplified sequences was studied using Basic Local Alignment Tool (BLAST. Results: Both primers (matk+rbcl showed good amplification in three species. The sequenced regions reveled conserved genome information for future identification of different medicinal plants belonging to these species. The amplified conserved barcodes revealed different levels of biological homology after sequence analysis. The results clearly showed that the use of these conserved DNA sequences as barcode primers would be an accurate way for species identification and discrimination. Conclusion: The amplification and sequencing of conserved genome regions identified a novel sequence of matK in native species of Solanum nigrum. The findings of the study would be applicable in medicinal industry to establish DNA based identification of different medicinal plant species to monitor adulteration.

  4. Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.

    Science.gov (United States)

    Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N

    1984-03-26

    The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic DNA methylation between sperm and oocyte DNA. The methylation levels of the minor satellite sequences did not change during spermiogenesis, and were not associated with the onset of meiosis or a specific stage in sperm development.

  5. [Identification of a repetitive sequence element for DNA fingerprinting in Phytophthora sojae].

    Science.gov (United States)

    Yin, Lihua; Wang, Qinhu; Ning, Feng; Zhu, Xiaoying; Zuo, Yuhu; Shan, Weixing

    2010-04-01

    Establishment of DNA fingerprinting in Phytophthora sojae and an analysis of genetic relationship of Heilongjiang and Xinjiang populations. Bioinformatics tools were used to search repetitive sequences in P. sojae and Southern blot analysis was employed for DNA fingerprinting analysis of P. sojae populations from Heilongjiang and Xinjiang using the identified repetitive sequence. A moderately repetitive sequence was identified and designated as PS1227. Southern blot analysis indicated 34 distinct bands ranging in size from 1.5 kb-23 kb, of which 21 were polymorphic among 49 isolates examined. Analysis of single-zoospore progenies showed that the PS1227 fingerprint pattern was mitotically stable. DNA fingerprinting showed that the P. sojae isolates HP4002, SY6 and GJ0105 of Heilongjiang are genetically identical to DW303, 71228 and 71222 of Xinjiang, respectively. A moderately repetitive sequence designated PS1227 which will be useful for epidemiology and population biology studies of P. sojae was obtained, and a PS1227-based DNA fingerprinting analysis provided molecular evidence that P. sojae in Xinjiang was likely introduced from Heilongjiang.

  6. Fidelity and mutational spectrum of Pfu DNA polymerase on a human mitochondrial DNA sequence.

    Science.gov (United States)

    André, P; Kim, A; Khrapko, K; Thilly, W G

    1997-08-01

    The study of rare genetic changes in human tissues requires specialized techniques. Point mutations at fractions at or below 10(-6) must be observed to discover even the most prominent features of the point mutational spectrum. PCR permits the increase in number of mutant copies but does so at the expense of creating many additional mutations or "PCR noise". Thus, each DNA sequence studied must be characterized with regard to the DNA polymerase and conditions used to avoid interpreting a PCR-generated mutation as one arising in human tissue. The thermostable DNA polymerase derived from Pyrococcus furiosus designated Pfu has the highest fidelity of any DNA thermostable polymerase studied to date, and this property recommends it for analyses of tissue mutational spectra. Here, we apply constant denaturant capillary electrophoresis (CDCE) to separate and isolate the products of DNA amplification. This new strategy permitted direct enumeration and identification of point mutations created by Pfu DNA polymerase in a 96-bp low melting domain of a human mitochondrial sequence despite the very low mutant fractions generated in the PCR process. This sequence, containing part of the tRNA glycine and NADH dehydrogenase subunit 3 genes, is the target of our studies of mitochondrial mutagenesis in human cells and tissues. Incorrectly synthesized sequences were separated from the wild type as mutant/wild-type heteroduplexes by sequential enrichment on CDCE. An artificially constructed mutant was used as an internal standard to permit calculation of the mutant fraction. Our study found that the average error rate (mutations per base pair duplication) of Pfu was 6.5 x 10(-7), and five of its more frequent mutations (hot spots) consisted of three transversions (GC-->TA, AT-->TA, and AT-->CG), one transition (AT-->GC), and one 1-bp deletion (in an AAAAAA sequence). To achieve an even higher sensitivity, the amount of Pfu-induced mutants must be reduced.

  7. Accurate quantification of mouse mitochondrial DNA without co-amplification of nuclear mitochondrial insertion sequences.

    Science.gov (United States)

    Malik, Afshan N; Czajka, Anna; Cunningham, Phil

    2016-07-01

    Mitochondria contain an extra-nuclear genome in the form of mitochondrial DNA (MtDNA), damage to which can lead to inflammation and bioenergetic deficit. Changes in MtDNA levels are increasingly used as a biomarker of mitochondrial dysfunction. We previously reported that in humans, fragments in the nuclear genome known as nuclear mitochondrial insertion sequences (NumtS) affect accurate quantification of MtDNA. In the current paper our aim was to determine whether mouse NumtS affect the quantification of MtDNA and to establish a method designed to avoid this. The existence of NumtS in the mouse genome was confirmed using blast N, unique MtDNA regions were identified using FASTA, and MtDNA primers which do not co-amplify NumtS were designed and tested. MtDNA copy numbers were determined in a range of mouse tissues as the ratio of the mitochondrial and nuclear genome using real time qPCR and absolute quantification. Approximately 95% of mouse MtDNA was duplicated in the nuclear genome as NumtS which were located in 15 out of 21 chromosomes. A unique region was identified and primers flanking this region were used. MtDNA levels differed significantly in mouse tissues being the highest in the heart, with levels in descending order (highest to lowest) in kidney, liver, blood, brain, islets and lung. The presence of NumtS in the nuclear genome of mouse could lead to erroneous data when studying MtDNA content or mutation. The unique primers described here will allow accurate quantification of MtDNA content in mouse models without co-amplification of NumtS. Copyright © 2016 Elsevier B.V. and Mitochondria Research Society. All rights reserved.

  8. Evaluation of DNA bending models in their capacity to predict electrophoretic migration anomalies of satellite DNA sequences.

    Science.gov (United States)

    Matyášek, Roman; Fulneček, Jaroslav; Kovařík, Aleš

    2013-09-01

    DNA containing a sequence that generates a local curvature exhibits a pronounced retardation in electrophoretic mobility. Various theoretical models have been proposed to explain relationship between DNA structural features and migration anomaly. Here, we studied the capacity of 15 static wedge-bending models to predict electrophoretic behavior of 69 satellite monomers derived from four divergent families. All monomers exhibited retarded mobility in PAGE corresponding to retardation factors ranging 1.02-1.54. The curvature varied both within and across the groups and correlated with the number, position, and lengths of A-tracts. Two dinucleotide models provided strong correlation between gel mobility and curvature prediction; two trinucleotide models were satisfactory while remaining dinucleotide models provided intermediate results with reliable prediction for subsets of sequences only. In some cases, similarly shaped molecules exhibited relatively large differences in mobility and vice versa. Generally less accurate predictions were obtained in groups containing less homogeneous sequences possessing distinct structural features. In conclusion, relatively universal theoretical models were identified suitable for the analysis of natural sequences known to harbor relatively moderate curvature. These models could be potentially applied to genome wide studies. However, in silico predictions should be viewed in context of experimental measurement of intrinsic DNA curvature. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. A sequence-dependent rigid-base model of DNA

    Science.gov (United States)

    Gonzalez, O.; Petkevičiutė, D.; Maddocks, J. H.

    2013-02-01

    A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. For example, due to the presence of frustration, the model can

  10. A sequence-dependent rigid-base model of DNA.

    Science.gov (United States)

    Gonzalez, O; Petkevičiūtė, D; Maddocks, J H

    2013-02-07

    A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. For example, due to the presence of frustration, the model can

  11. Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.

    OpenAIRE

    Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N

    1984-01-01

    The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic...

  12. Structural properties of replication origins in yeast DNA sequences

    International Nuclear Information System (INIS)

    Cao Xiaoqin; Zeng Jia; Yan Hong

    2008-01-01

    Sequence-dependent DNA flexibility is an important structural property originating from the DNA 3D structure. In this paper, we investigate the DNA flexibility of the budding yeast (S. Cerevisiae) replication origins on a genome-wide scale using flexibility parameters from two different models, the trinucleotide and the tetranucleotide models. Based on analyzing average flexibility profiles of 270 replication origins, we find that yeast replication origins are significantly rigid compared with their surrounding genomic regions. To further understand the highly distinctive property of replication origins, we compare the flexibility patterns between yeast replication origins and promoters, and find that they both contain significantly rigid DNAs. Our results suggest that DNA flexibility is an important factor that helps proteins recognize and bind the target sites in order to initiate DNA replication. Inspired by the role of the rigid region in promoters, we speculate that the rigid replication origins may facilitate binding of proteins, including the origin recognition complex (ORC), Cdc6, Cdt1 and the MCM2-7 complex

  13. Advantages and Limitations of Ribosomal RNA PCR and DNA Sequencing for Identification of Bacteria in Cardiac Valves of Danish Patients

    DEFF Research Database (Denmark)

    Kemp, Michael; Bangsborg, Jette; Kjerulf, Anne

    2013-01-01

    of direct molecular identification should also address weaknesses, their relevance in the given setting, and possible improvements. In this study cardiac valves from 56 Danish patients referred for surgery for infective endocarditis were analysed by microscopy and culture as well as by PCR targeting part...... of the bacterial 16S rRNA gene followed by DNA sequencing of the PCR product. PCR and DNA sequencing identified significant bacteria in 49 samples from 43 patients, including five out of 13 culture-negative cases. No rare, exotic, or intracellular bacteria were identified. There was a general agreement between...... bacterial identity obtained by ribosomal PCR and DNA sequencing from the valves and bacterial isolates from blood culture. However, DNA sequencing of the 16S rRNA gene did not discriminate well among non-haemolytic streptococci, especially within the Streptococcus mitis group. Ribosomal PCR with subsequent...

  14. Characterization of full-length sequenced cDNA inserts (FLIcs from Atlantic salmon (Salmo salar

    Directory of Open Access Journals (Sweden)

    Lunner Sigbjørn

    2009-10-01

    Full Text Available Abstract Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to be helpful in order to distinguish between duplicate genome regions and in determining correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP, the number of sequences where the full-length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91% of the transcripts were annotated using Gene Ontology (GO terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS. The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS. This

  15. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    Science.gov (United States)

    Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn

    2009-01-01

    Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to be helpful in order to distinguish between duplicate genome regions and in determining correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full-length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This suggests that the remaining cDNA

  16. Identifying Statistical Dependence in Genomic Sequences via Mutual Information Estimates

    Directory of Open Access Journals (Sweden)

    Wojciech Szpankowski

    2007-12-01

    Full Text Available Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. First, they are used for the identification of correlations between different parts of the maize zmSRp32 gene. There, we find significant dependencies between the 5′ untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's combined DNA index system (CODIS, we demonstrate that our approach is particularly well suited for the problem of discovering short tandem repeats—an application of importance in genetic profiling.

  17. Bacterial DNA Sequence Compression Models Using Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Armando J. Pinho

    2013-08-01

    Full Text Available It is widely accepted that the advances in DNA sequencing techniques have contributed to an unprecedented growth of genomic data. This fact has increased the interest in DNA compression, not only from the information theory and biology points of view, but also from a practical perspective, since such sequences require storage resources. Several compression methods exist, and particularly, those using finite-context models (FCMs have received increasing attention, as they have been proven to effectively compress DNA sequences with low bits-per-base, as well as low encoding/decoding time-per-base. However, the amount of run-time memory required to store high-order finite-context models may become impractical, since a context-order as low as 16 requires a maximum of 17.2 x 109 memory entries. This paper presents a method to reduce such a memory requirement by using a novel application of artificial neural networks (ANN to build such probabilistic models in a compact way and shows how to use them to estimate the probabilities. Such a system was implemented, and its performance compared against state-of-the art compressors, such as XM-DNA (expert model and FCM-Mx (mixture of finite-context models , as well as with general-purpose compressors. Using a combination of order-10 FCM and ANN, similar encoding results to those of FCM, up to order-16, are obtained using only 17 megabytes of memory, whereas the latter, even employing hash-tables, uses several hundreds of megabytes.

  18. Genomic localization, sequence analysis, and transcription of the putative human cytomegalovirus DNA polymerase gene

    International Nuclear Information System (INIS)

    Heilbronn, T.; Jahn, G.; Buerkle, A.; Freese, U.K.; Fleckenstein, B.; Zur Hausen, H.

    1987-01-01

    The human cytomegalovirus (HCMV)-induced DNA polymerase has been well characterized biochemically and functionally, but its genomic location has not yet been assigned. To identify the coding sequence, cross-hybridization with the herpes simplex virus type 1 (HSV-1) polymerase gene was used, as suggested by the close similarity of the herpes group virus-induced DNA polymerases to the HCMV DNA polymerase. A cosmid and plasmid library of the entire HCMV genome was screened with the BamHI Q fragment of HSF-1 at different stringency conditions. One PstI-HincII restriction fragment of 850 base pairs mapping within the EcoRI M fragment of HCMV cross-hybridized at T/sub m/ - 25/degrees/C. Sequence analysis revealed one open reading frame spanning the entire sequence. The amino acid sequence showed a highly conserved domain of 133 amino acids shared with the HSV and putative Esptein-Barr virus polymerase sequences. This domain maps within the C-terminal part of the HSV polymerase gene, which has been suggested to contain part of the catalytic center of the enzyme. Transcription analysis revealed one 5.4-kilobase early transcript in the sense orientation with respect to the open reading frame identified. This transcript appears to code for the 140-kilodalton HCMV polymerase protein

  19. Exome sequencing identifies SUCO mutations in mesial temporal lobe epilepsy.

    Science.gov (United States)

    Sha, Zhiqiang; Sha, Longze; Li, Wenting; Dou, Wanchen; Shen, Yan; Wu, Liwen; Xu, Qi

    2015-03-30

    Mesial temporal lobe epilepsy (mTLE) is the main type and most common medically intractable form of epilepsy. Severity of disease-based stratified samples may help identify new disease-associated mutant genes. We analyzed mRNA expression profiles from patient hippocampal tissue. Three of the seven patients had severe mTLE with generalized-onset convulsions and consciousness loss that occurred over many years. We found that compared with other groups, patients with severe mTLE were classified into a distinct group. Whole-exome sequencing and Sanger sequencing validation in all seven patients identified three novel SUN domain-containing ossification factor (SUCO) mutations in severely affected patients. Furthermore, SUCO knock down significantly reduced dendritic length in vitro. Our results indicate that mTLE defects may affect neuronal development, and suggest that neurons have abnormal development due to lack of SUCO, which may be a generalized-onset epilepsy-related gene. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  20. Using Synthetic Nanopores for Single-Molecule Analyses: Detecting SNPs, Trapping DNA Molecules, and the Prospects for Sequencing DNA

    Science.gov (United States)

    Dimitrov, Valentin V.

    2009-01-01

    This work focuses on studying properties of DNA molecules and DNA-protein interactions using synthetic nanopores, and it examines the prospects of sequencing DNA using synthetic nanopores. We have developed a method for discriminating between alleles that uses a synthetic nanopore to measure the binding of a restriction enzyme to DNA. There exists…

  1. Isolation and sequence of complementary DNA encoding human extracellular superoxide dismutase

    International Nuclear Information System (INIS)

    Hjalmarsson, K.; Marklund, S.L.; Engstroem, A.; Edlund, T.

    1987-01-01

    A complementary DNA (cDNA) clone from a human placenta cDNA library encoding extracellular superoxide dismutase has been isolated and the nucleotide sequence determined. The cDNA has a very high G + C content. EC-SOD is synthesized with a putative 18-amino acid signal peptide, preceding the 222 amino acids in the mature enzyme, indicating that the enzyme is a secretory protein. The first 95 amino acids of the mature enzyme show no sequence homology with other sequenced proteins and there is one possible N-glycosylation site (Asn-89). The amino acid sequence from residues 96-193 shows strong homology (∼ 50%) with the final two-thirds of the sequences of all know eukaryotic CuZn SODs, whereas the homology with the P. leiognathi CuZn SOD is clearly lower. The ligands to Cu and Zn, the cysteines forming the intrasubunit disulfide bridge in the CuZn SODs, and the arginine found in all CuZn SODs in the entrance to the active site can all be identified in EC-SOD. A comparison with bovine CuZn SOD, the three-dimensional structure of which is known, reveals that the homologies occur in the active site and the divergencies are in the part constituting the subunit contact area in CuZn SOD. Amino acid sequence 194-222 in the carboxyl-terminal end of EC-SOD is strongly hydrophilic and contains nine amino acids with a positive charge. This sequence probably confers the affinity of EC-SOD for heparin and heparan sulfate. An analysis of the amino acid sequence homologies with CuZn SODs from various species indicates that the EC-SODs may have evolved form the CuZn SODs before the evolution of fungi and plants

  2. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing

    Directory of Open Access Journals (Sweden)

    Zdepski Anna

    2011-05-01

    Full Text Available Abstract Background High throughput sequencing (HTS technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants that contain nuclear DNA, as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. Results We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR. We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Conclusions Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants.

  3. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74--79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  4. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74--79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 12 figs.

  5. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, Natasha V. (Okemos, MI); Broekaert, Willem F. (Dilbeek, BE); Chua, Nam-Hai (Scarsdale, NY); Kush, Anil (New York, NY)

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  6. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74--79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 11 figures.

  7. DNA fingerprinting based on simple sequence repeat (SSR ...

    African Journals Online (AJOL)

    New varieties of sugarcane are protected using morphological descriptors, which have limitations in identifying morphologically similar cultivars. Development of a reliable DNA fingerprint system for identification of new varieties would contribute greatly to the breeding of these species. Microsatellite markers are tools with ...

  8. DNA Sequencing as a Tool to Monitor Marine Ecological Status

    Directory of Open Access Journals (Sweden)

    Kelly D. Goodwin

    2017-05-01

    Full Text Available Many ocean policies mandate integrated, ecosystem-based approaches to marine monitoring, driving a global need for efficient, low-cost bioindicators of marine ecological quality. Most traditional methods to assess biological quality rely on specialized expertise to provide visual identification of a limited set of specific taxonomic groups, a time-consuming process that can provide a narrow view of ecological status. In addition, microbial assemblages drive food webs but are not amenable to visual inspection and thus are largely excluded from detailed inventory. Molecular-based assessments of biodiversity and ecosystem function offer advantages over traditional methods and are increasingly being generated for a suite of taxa using a “microbes to mammals” or “barcodes to biomes” approach. Progress in these efforts coupled with continued improvements in high-throughput sequencing and bioinformatics pave the way for sequence data to be employed in formal integrated ecosystem evaluation, including food web assessments, as called for in the European Union Marine Strategy Framework Directive. DNA sequencing of bioindicators, both traditional (e.g., benthic macroinvertebrates, ichthyoplankton and emerging (e.g., microbial assemblages, fish via eDNA, promises to improve assessment of marine biological quality by increasing the breadth, depth, and throughput of information and by reducing costs and reliance on specialized taxonomic expertise.

  9. SoftSearch: integration of multiple sequence features to identify breakpoints of structural variations.

    Directory of Open Access Journals (Sweden)

    Steven N Hart

    Full Text Available BACKGROUND: Structural variation (SV represents a significant, yet poorly understood contribution to an individual's genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV including split-read, discordant read-pair, and unmated pairs. Co-localized split-reads and discordant read pairs are used to refine the breakpoints. RESULTS: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch's key features are 1 not requiring secondary (or exhaustive primary alignment, 2 portability into established sequencing workflows, and 3 is applicable to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.. SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read-pairs which on their own would not be sufficient to make an SV call. CONCLUSIONS: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance.

  10. Incorporation of unique molecular identifiers in TruSeq adapters improves the accuracy of quantitative sequencing.

    Science.gov (United States)

    Hong, Jungeui; Gresham, David

    2017-11-01

    Quantitative analysis of next-generation sequencing (NGS) data requires discriminating duplicate reads generated by PCR from identical molecules that are of unique origin. Typically, PCR duplicates are identified as sequence reads that align to the same genomic coordinates using reference-based alignment. However, identical molecules can be independently generated during library preparation. Misidentification of these molecules as PCR duplicates can introduce unforeseen biases during analyses. Here, we developed a cost-effective sequencing adapter design by modifying Illumina TruSeq adapters to incorporate a unique molecular identifier (UMI) while maintaining the capacity to undertake multiplexed, single-index sequencing. Incorporation of UMIs into TruSeq adapters (TrUMIseq adapters) enables identification of bona fide PCR duplicates as identically mapped reads with identical UMIs. Using TrUMIseq adapters, we show that accurate removal of PCR duplicates results in improved accuracy of both allele frequency (AF) estimation in heterogeneous populations using DNA sequencing and gene expression quantification using RNA-Seq.

  11. A MapReduce Framework for DNA Sequencing Data Processing

    Directory of Open Access Journals (Sweden)

    Samy Ghoneimy

    2016-12-01

    Full Text Available Genomics and Next Generation Sequencers (NGS like Illumina Hiseq produce data in the order of ‎‎200 billion base pairs in a single one-week run for a 60x human genome coverage, which ‎requires modern high-throughput experimental technologies that can ‎only be tackled with high performance computing (HPC and specialized software algorithms called ‎‎“short read aligners”. This paper focuses on the implementation of the DNA sequencing as a set of MapReduce programs that will accept a DNA data set as a FASTQ file and finally generate a VCF (variant call format file, which has variants for a given DNA data set. In this paper MapReduce/Hadoop along with Burrows-Wheeler Aligner (BWA, Sequence Alignment/Map (SAM ‎tools, are fully utilized to provide various utilities for manipulating alignments, including sorting, merging, indexing, ‎and generating alignments. The Map-Sort-Reduce process is designed to be suited for a Hadoop framework in ‎which each cluster is a traditional N-node Hadoop cluster to utilize all of the Hadoop features like HDFS, program ‎management and fault tolerance. The Map step performs multiple instances of the short read alignment algorithm ‎‎(BoWTie that run in parallel in Hadoop. The ordered list of the sequence reads are used as input tuples and the ‎output tuples are the alignments of the short reads. In the Reduce step many parallel instances of the Short ‎Oligonucleotide Analysis Package for SNP (SOAPsnp algorithm run in the cluster. Input tuples are sorted ‎alignments for a partition and the output tuples are SNP calls. Results are stored via HDFS, and then archived in ‎SOAPsnp format. ‎ The proposed framework enables extremely fast discovering somatic mutations, inferring population genetical ‎parameters, and performing association tests directly based on sequencing data without explicit genotyping or ‎linkage-based imputation. It also demonstrate that this method achieves comparable

  12. Retroviral DNA Sequences as a Means for Determining Ancient Diets.

    Directory of Open Access Journals (Sweden)

    Jessica I Rivera-Perez

    Full Text Available For ages, specialists from varying fields have studied the diets of the primeval inhabitants of our planet, detecting diet remains in archaeological specimens using a range of morphological and biochemical methods. As of recent, metagenomic ancient DNA studies have allowed for the comparison of the fecal and gut microbiomes associated to archaeological specimens from various regions of the world; however the complex dynamics represented in those microbial communities still remain unclear. Theoretically, similar to eukaryote DNA the presence of genes from key microbes or enzymes, as well as the presence of DNA from viruses specific to key organisms, may suggest the ingestion of specific diet components. In this study we demonstrate that ancient virus DNA obtained from coprolites also provides information reconstructing the host's diet, as inferred from sequences obtained from pre-Columbian coprolites. This depicts a novel and reliable approach to determine new components as well as validate the previously suggested diets of extinct cultures and animals. Furthermore, to our knowledge this represents the first description of the eukaryotic viral diversity found in paleofaeces belonging to pre-Columbian cultures.

  13. Mitochondrial DNA sequencing of cat hair: an informative forensic tool.

    Science.gov (United States)

    Tarditi, Christy R; Grahn, Robert A; Evans, Jeffrey J; Kurushima, Jennifer D; Lyons, Leslie A

    2011-01-01

    Approximately 81.7 million cats are in 37.5 million U.S. households. Shed fur can be criminal evidence because of transfer to victims, suspects, and/or their belongings. To improve cat hairs as forensic evidence, the mtDNA control region from single hairs, with and without root tags, was sequenced. A dataset of a 402-bp control region segment from 174 random-bred cats representing four U.S. geographic areas was generated to determine the informativeness of the mtDNA region. Thirty-two mtDNA mitotypes were observed ranging in frequencies from 0.6-27%. Four common types occurred in all populations. Low heteroplasmy, 1.7%, was determined. Unique mitotypes were found in 18 individuals, 10.3% of the population studied. The calculated discrimination power implied that 8.3 of 10 randomly selected individuals can be excluded by this region. The genetic characteristics of the region and the generated dataset support the use of this cat mtDNA region in forensic applications. 2010 American Academy of Forensic Sciences. Published 2010. This article is a U.S. Government work and is in the public domain in the U.S.A.

  14. Ribosomal DNA sequence heterogeneity reflects intraspecies phylogenies and predicts genome structure in two contrasting yeast species.

    Science.gov (United States)

    West, Claire; James, Stephen A; Davey, Robert P; Dicks, Jo; Roberts, Ian N

    2014-07-01

    The ribosomal RNA encapsulates a wealth of evolutionary information, including genetic variation that can be used to discriminate between organisms at a wide range of taxonomic levels. For example, the prokaryotic 16S rDNA sequence is very widely used both in phylogenetic studies and as a marker in metagenomic surveys and the internal transcribed spacer region, frequently used in plant phylogenetics, is now recognized as a fungal DNA barcode. However, this widespread use does not escape criticism, principally due to issues such as difficulties in classification of paralogous versus orthologous rDNA units and intragenomic variation, both of which may be significant barriers to accurate phylogenetic inference. We recently analyzed data sets from the Saccharomyces Genome Resequencing Project, characterizing rDNA sequence variation within multiple strains of the baker's yeast Saccharomyces cerevisiae and its nearest wild relative Saccharomyces paradoxus in unprecedented detail. Notably, both species possess single locus rDNA systems. Here, we use these new variation datasets to assess whether a more detailed characterization of the rDNA locus can alleviate the second of these phylogenetic issues, sequence heterogeneity, while controlling for the first. We demonstrate that a strong phylogenetic signal exists within both datasets and illustrate how they can be used, with existing methodology, to estimate intraspecies phylogenies of yeast strains consistent with those derived from whole-genome approaches. We also describe the use of partial Single Nucleotide Polymorphisms, a type of sequence variation found only in repetitive genomic regions, in identifying key evolutionary features such as genome hybridization events and show their consistency with whole-genome Structure analyses. We conclude that our approach can transform rDNA sequence heterogeneity from a problem to a useful source of evolutionary information, enabling the estimation of highly accurate phylogenies of

  15. Transcription blockage by homopurine DNA sequences: role of sequence composition and single-strand breaks

    Science.gov (United States)

    Belotserkovskii, Boris P.; Neil, Alexander J.; Saleh, Syed Shayon; Shin, Jane Hae Soo; Mirkin, Sergei M.; Hanawalt, Philip C.

    2013-01-01

    The ability of DNA to adopt non-canonical structures can affect transcription and has broad implications for genome functioning. We have recently reported that guanine-rich (G-rich) homopurine-homopyrimidine sequences cause significant blockage of transcription in vitro in a strictly orientation-dependent manner: when the G-rich strand serves as the non-template strand [Belotserkovskii et al. (2010) Mechanisms and implications of transcription blockage by guanine-rich DNA sequences., Proc. Natl Acad. Sci. USA, 107, 12816–12821]. We have now systematically studied the effect of the sequence composition and single-stranded breaks on this blockage. Although substitution of guanine by any other base reduced the blockage, cytosine and thymine reduced the blockage more significantly than adenine substitutions, affirming the importance of both G-richness and the homopurine-homopyrimidine character of the sequence for this effect. A single-strand break in the non-template strand adjacent to the G-rich stretch dramatically increased the blockage. Breaks in the non-template strand result in much weaker blockage signals extending downstream from the break even in the absence of the G-rich stretch. Our combined data support the notion that transcription blockage at homopurine-homopyrimidine sequences is caused by R-loop formation. PMID:23275544

  16. Modeling genetic imprinting effects of DNA sequences with multilocus polymorphism data

    Directory of Open Access Journals (Sweden)

    Staud Roland

    2009-08-01

    Full Text Available Abstract Single nucleotide polymorphisms (SNPs represent the most widespread type of DNA sequence variation in the human genome and they have recently emerged as valuable genetic markers for revealing the genetic architecture of complex traits in terms of nucleotide combination and sequence. Here, we extend an algorithmic model for the haplotype analysis of SNPs to estimate the effects of genetic imprinting expressed at the DNA sequence level. The model provides a general procedure for identifying the number and types of optimal DNA sequence variants that are expressed differently due to their parental origin. The model is used to analyze a genetic data set collected from a pain genetics project. We find that DNA haplotype GAC from three SNPs, OPRKG36T (with two alleles G and T, OPRKA843G (with alleles A and G, and OPRKC846T (with alleles C and T, at the kappa-opioid receptor, triggers a significant effect on pain sensitivity, but with expression significantly depending on the parent from which it is inherited (p = 0.008. With a tremendous advance in SNP identification and automated screening, the model founded on haplotype discovery and statistical inference may provide a useful tool for genetic analysis of any quantitative trait with complex inheritance.

  17. Automated methods for single-stranded DNA isolation and dideoxynucleotide DNA sequencing reactions on a robotic workstation

    International Nuclear Information System (INIS)

    Mardis, E.R.; Roe, B.A.

    1989-01-01

    Automated procedures have been developed for both the simultaneous isolation of 96 single-stranded M13 chimeric template DNAs in less than two hours, and for simultaneously pipetting 24 dideoxynucleotide sequencing reactions on a commercially available laboratory workstation. The DNA sequencing results obtained by either radiolabeled or fluorescent methods are consistent with the premise that automation of these portions of DNA sequencing projects will improve the reproducibility of the DNA isolation and the procedures for these normally labor-intensive steps provides an approach for rapid acquisition of large amounts of high quality, reproducible DNA sequence data

  18. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure...... that the data produced is optimal. Although much of the procedure can be followed directly from the manufacturer's protocols, the key differences lie in the library preparation steps. This chapter presents an optimized protocol for the sequencing of fossil remains and museum specimens, commonly referred...

  19. Parallel Sequencing of Expressed Sequence Tags from Two Complementary DNA Libraries for High and Low Phosphorus Adaptation in Common Beans

    Directory of Open Access Journals (Sweden)

    Matthew W. Blair

    2011-11-01

    Full Text Available Expressed sequence tags (ESTs have proven useful for gene discovery in many crops. In this work, our objective was to construct complementary DNA (cDNA libraries from root tissues of common beans ( L. grown under low and high P hydroponic conditions and to conduct EST sequencing and comparative analyses of the libraries. Expressed sequence tag analysis of 3648 clones identified 2372 unigenes, of which 1591 were annotated as known genes while a total of 465 unigenes were not associated with any known gene. Unigenes with hits were categorized according to biological processes, molecular function, and cellular compartmentalization. Given the young tissue used to make the root libraries, genes for catalytic activity and binding were highly expressed. Comparisons with previous root EST sequencing and between the two libraries made here resulted in a set of genes to study further for differential gene expression and adaptation to low P, such as a 14 kDa praline-rich protein, a metallopeptidase, tonoplast intrinsic protein, adenosine triphosphate (ATP citrate synthase, and cell proliferation genes expressed in the low P treated plants. Given that common beans are often grown on acid soils of the tropics and subtropics that are usually low in P these genes and the two parallel libraries will be useful for selection for better uptake of this essential macronutrient. The importance of EST generation for common bean root tissues under low P and other abiotic soil stresses is also discussed.

  20. DNA interaction with platinum-based cytostatics revealed by DNA sequencing.

    Science.gov (United States)

    Smerkova, Kristyna; Vaculovic, Tomas; Vaculovicova, Marketa; Kynicky, Jindrich; Brtnicky, Martin; Eckschlager, Tomas; Stiborova, Marie; Hubalek, Jaromir; Adam, Vojtech

    2017-12-15

    The main mechanism of action of platinum-based cytostatic drugs - cisplatin, oxaliplatin and carboplatin - is the formation of DNA cross-links, which restricts the transcription due to the disability of DNA to enter the active site of the polymerase. The polymerase chain reaction (PCR) was employed as a simplified model of the amplification process in the cell nucleus. PCR with fluorescently labelled dideoxynucleotides commonly employed for DNA sequencing was used to monitor the effect of platinum-based cytostatics on DNA in terms of decrease in labeling efficiency dependent on a presence of the DNA-drug cross-link. It was found that significantly different amounts of the drugs - cisplatin (0.21 μg/mL), oxaliplatin (5.23 μg/mL), and carboplatin (71.11 μg/mL) - were required to cause the same quenching effect (50%) on the fluorescent labelling of 50 μg/mL of DNA. Moreover, it was found that even though the amounts of the drugs was applied to the reaction mixture differing by several orders of magnitude, the amount of incorporated platinum, quantified by inductively coupled plasma mass spectrometry, was in all cases at the level of tenths of μg per 5 μg of DNA. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Massively parallel DNA sequencing facilitates diagnosis of patients with Usher syndrome type 1.

    Directory of Open Access Journals (Sweden)

    Hidekane Yoshimura

    Full Text Available Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using the massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in the 16 of them (94.1% who carried at least one mutation. Seven patients had the MYO7A mutation (41.2%, which is the most common type in Japanese. Most of the mutations were detected by only the massively parallel DNA sequencing. We report here four patients, who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and the frequency of Usher syndrome type 1 genes in Japanese. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance.

  2. Massively parallel DNA sequencing facilitates diagnosis of patients with Usher syndrome type 1.

    Science.gov (United States)

    Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-Ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-Ichi

    2014-01-01

    Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using the massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in the 16 of them (94.1%) who carried at least one mutation. Seven patients had the MYO7A mutation (41.2%), which is the most common type in Japanese. Most of the mutations were detected by only the massively parallel DNA sequencing. We report here four patients, who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and the frequency of Usher syndrome type 1 genes in Japanese. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance.

  3. New scoring schema for finding motifs in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Nowzari-Dalini Abbas

    2009-03-01

    Full Text Available Abstract Background Pattern discovery in DNA sequences is one of the most fundamental problems in molecular biology with important applications in finding regulatory signals and transcription factor binding sites. An important task in this problem is to search (or predict known binding sites in a new DNA sequence. For this reason, all subsequences of the given DNA sequence are scored based on an scoring function and the prediction is done by selecting the best score. By assuming no dependency between binding site base positions, most of the available tools for known binding site prediction are designed. Recently Tomovic and Oakeley investigated the statistical basis for either a claim of dependence or independence, to determine whether such a claim is generally true, and they presented a scoring function for binding site prediction based on the dependency between binding site base positions. Our primary objective is to investigate the scoring functions which can be used in known binding site prediction based on the assumption of dependency or independency in binding site base positions. Results We propose a new scoring function based on the dependency between all positions in biding site base positions. This scoring function uses joint information content and mutual information as a measure of dependency between positions in transcription factor binding site. Our method for modeling dependencies is simply an extension of position independency methods. We evaluate our new scoring function on the real data sets extracted from JASPAR and TRANSFAC data bases, and compare the obtained results with two other well known scoring functions. Conclusion The results demonstrate that the new approach improves known binding site discovery and show that the joint information content and mutual information provide a better and more general criterion to investigate the relationships between positions in the TFBS. Our scoring function is formulated by simple

  4. MetaPhinder-Identifying Bacteriophage Sequences in Metagenomic Data Sets

    DEFF Research Database (Denmark)

    Jurtz, Vanessa Isabell; Villarroel, Julia; Lund, Ole

    2016-01-01

    genome structure of many bacteriophages. The method is demonstrated to outperform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology https://cge.cbs.dtu.dk/services/MetaPhinder/, while the source...... and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e. contigs) of phage origin in metage-nomic data sets. The method is based on a comparison to a database of whole genome bacteriophage sequences, integrating hits to multiple genomes to accomodate for the mosaic...... code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder....

  5. DNA typing of ancient parasite eggs from environmental samples identifies human and animal worm infections in viking-age settlement

    DEFF Research Database (Denmark)

    Søe, Martin Jensen; Nejsum, Peter; Fredensborg, Brian Lund

    2015-01-01

    Ancient parasite eggs were recovered from environmental samples collected at a Viking-age settlement in Viborg, Denmark, dated 1018-1030 A.D. Morphological examination identified Ascaris sp., Trichuris sp., and Fasciola sp. eggs, but size and shape did not allow species identification. By carefully...... selecting genetic markers, PCR amplification and sequencing of ancient DNA (aDNA) isolates resulted in identification of: the human whipworm, Trichuris trichiura, using SSUrRNA sequence homology; Ascaris sp. with 100% homology to cox1 haplotype 07; and Fasciola hepatica using ITS1 sequence homology...

  6. PCR amplification and DNA sequencing of Demodex injai from otic secretions of a dog.

    Science.gov (United States)

    Milosevic, Milivoj A; Frank, Linda A; Brahmbhatt, Rupal A; Kania, Stephen A

    2013-04-01

    The identification of Demodex mites from dogs is usually based on morphology and location. Mites with uncharacteristic features or from unusual locations, hosts or disease manifestations could represent new species not previously described; however, this is difficult to determine based on morphology alone. The goal of this study was to identify and confirm Demodex injai in association with otitis externa in a dog using PCR amplification and DNA sequencing. Otic samples were obtained from a beagle in which a long-bodied Demodex mite was identified. For comparison, Demodex mite samples were collected from a swab and scraping of the dorsal skin of a wire-haired fox terrier and an otic sample from a dog with generalized and otic demodicosis. To identify the Demodex mite, DNA was extracted, and 16S rRNA was amplified by PCR, sequenced and compared with Demodex sequences available in public databases and from separate samples morphologically diagnosed as D. injai and Demodex canis. PCR amplification of the long-bodied mite rRNA DNA obtained from otic samples was approximately 330 bp and was identical to that from the mite morphologically identified as D. injai obtained from the dorsal skin of a dog. Furthermore, the examined mite did not have any significant homology to any of the reported genes from Demodex spp. These results confirmed that the demodex mites in this case were D. injai. © 2013 The Authors. Veterinary Dermatology © 2013 ESVD and ACVD.

  7. The function analysis of full-length cDNA sequence from IRM-2 mouse cDNA library

    International Nuclear Information System (INIS)

    Wang Qin; Liu Xiaoqiu; Xu Chang; Du Liqing; Sun Zhijuan; Wang Yan; Liu Qiang; Song Li; Li Jin; Fan Feiyue

    2013-01-01

    Objective: To identify the function of full-length cDNA sequence from IRM-2 mouse cDNA library. Methods: Full-length cDNA products were amplified by PCR from IRM-2 mouse cDNA library according to twenty-one pieces of expressed sequence tag. The expression of full-length cDNAs were detected after mouse embryonic fibroblasts were exposed to 6.5 Gy γ-ray radiation. And the effect on the growth of radiosensitivity cells AT5B1VA transfected with full-length cDNAs was investigated. Results: The expression of No.4, 5 and 2 full-length cDNAs from IRM-2 mouse were higher than that of parental ICR and 615 mouse after mouse embryonic fibroblasts irradiated with γ-ray radiation. And the survival rate of AT5B1VA cells transfected with No.4, 5 and 2 full-length cDNAs was high. Conclusion: No.4, 5 and 2 full-length cDNAs of IRM-2 mouse are of high radioresistance. (authors)

  8. DNA-based approaches to identify forest fungi in Pacific Islands: A pilot study

    Science.gov (United States)

    Anna E. Case; Sara M. Ashiglar; Phil G. Cannon; Ernesto P. Militante; Edwin R. Tadiosa; Mutya Quintos-Manalo; Nelson M. Pampolina; John W. Hanna; Fred E. Brooks; Amy L. Ross-Davis; Mee-Sook Kim; Ned B. Klopfenstein

    2013-01-01

    DNA-based diagnostics have been successfully used to characterize diverse forest fungi (e.g., Hoff et al. 2004, Kim et al. 2006, Glaeser & Lindner 2011). DNA sequencing of the internal transcribed spacer (ITS) and large subunit (LSU) regions of nuclear ribosomal DNA (rDNA) has proved especially useful (Sonnenberg et al. 2007, Seifert 2009, Schoch et al. 2012) for...

  9. Identifications of Captive and Wild Tilapia Species Existing in Hawaii by Mitochondrial DNA Control Region Sequence

    Science.gov (United States)

    Wu, Liang; Yang, Jinzeng

    2012-01-01

    Background The tilapia family of the Cichlidae includes many fish species, which live in freshwater and saltwater environments. Several species, such as O. niloticus, O. aureus, and O. mossambicus, are excellent for aquaculture because these fish are easily reproduced and readily adapt to diverse environments. Historically, tilapia species, including O. mossambicus, S. melanotheron, and O. aureus, were introduced to Hawaii many decades ago, and the state of Hawaii uses the import permit policy to prevent O. niloticus from coming into the islands. However, hybrids produced from O. niloticus may already be present in the freshwater and marine environments of the islands. The purpose of this study was to identify tilapia species that exist in Hawaii using mitochondrial DNA analysis. Methodology/Principal Findings In this study, we analyzed 382 samples collected from 13 farm (captive) and wild tilapia populations in Oahu and the Hawaii Islands. Comparison of intraspecies variation between the mitochondrial DNA control region (mtDNA CR) and cytochrome c oxidase I (COI) gene from five populations indicated that mtDNA CR had higher nucleotide diversity than COI. A phylogenetic tree of all sampled tilapia was generated using mtDNA CR sequences. The neighbor-joining tree analysis identified seven distinctive tilapia species: O. aureus, O. mossambicus, O. niloticus, S. melanotheron, O. urolepies, T. redalli, and a hybrid of O. massambicus and O. niloticus. Of all the populations examined, 10 populations consisting of O. aureus, O. mossambicus, O. urolepis, and O. niloticus from the farmed sites were relatively pure, whereas three wild populations showed some degree of introgression and hybridization. Conclusions/Significance This DNA-based tilapia species identification is the first report that confirmed tilapia species identities in the wild and captive populations in Hawaii. The DNA sequence comparisons of mtDNA CR appear to be a valid method for tilapia species

  10. Norgal: extraction and de novo assembly of mitochondrial DNA from whole-genome sequencing data.

    Science.gov (United States)

    Al-Nakeeb, Kosai; Petersen, Thomas Nordahl; Sicheritz-Pontén, Thomas

    2017-11-21

    Whole-genome sequencing (WGS) projects provide short read nucleotide sequences from nuclear and possibly organelle DNA depending on the source of origin. Mitochondrial DNA is present in animals and fungi, while plants contain DNA from both mitochondria and chloroplasts. Current techniques for separating organelle reads from nuclear reads in WGS data require full reference or partial seed sequences for assembling. Norgal (de Novo ORGAneLle extractor) avoids this requirement by identifying a high frequency subset of k-mers that are predominantly of mitochondrial origin and performing a de novo assembly on a subset of reads that contains these k-mers. The method was applied to WGS data from a panda, brown algae seaweed, butterfly and filamentous fungus. We were able to extract full circular mitochondrial genomes and obtained sequence identities to the reference sequences in the range from 98.5 to 99.5%. We also assembled the chloroplasts of grape vines and cucumbers using Norgal together with seed-based de novo assemblers. Norgal is a pipeline that can extract and assemble full or partial mitochondrial and chloroplast genomes from WGS short reads without prior knowledge. The program is available at: https://bitbucket.org/kosaidtu/norgal .

  11. Identifying the ichthyoplankton of a coral reef using DNA barcodes.

    Science.gov (United States)

    Hubert, Nicolas; Espiau, Benoit; Meyer, Christopher; Planes, Serge

    2015-01-01

    Marine fishes exhibit spectacular phenotypic changes during their ontogeny, and the identification of their early stages is challenging due to the paucity of diagnostic morphological characters at the species level. Meanwhile, the importance of early life stages in dispersal and connectivity has recently experienced an increasing interest in conservation programmes for coral reef fishes. This study aims at assessing the effectiveness of DNA barcoding for the automated identification of coral reef fish larvae through large-scale ecosystemic sampling. Fish larvae were mainly collected using bongo nets and light traps around Moorea between September 2008 and August 2010 in 10 sites distributed in open waters. Fish larvae ranged from 2 to 100 mm of total length, with the most abundant individuals being <5 mm. Among the 505 individuals DNA barcoded, 373 larvae (i.e. 75%) were identified to the species level. A total of 106 species were detected, among which 11 corresponded to pelagic and bathypelagic species, while 95 corresponded to species observed at the adult stage on neighbouring reefs. This study highlights the benefits and pitfalls of using standardized molecular systems for species identification and illustrates the new possibilities enabled by DNA barcoding for future work on coral reef fish larval ecology. © 2014 John Wiley & Sons Ltd.

  12. Cloning and sequencing of cDNA encoding human DNA topoisomerase II and localization of the gene to chromosome region 17q21-22

    International Nuclear Information System (INIS)

    Tsai-Pflugfelder, M.; Liu, L.F.; Liu, A.A.; Tewey, K.M.; Whang-Peng, J.; Knutsen, T.; Huebner, K.; Croce, C.M.; Wang, J.C.

    1988-01-01

    Two overlapping cDNA clones encoding human DNA topoisomerase II were identified by two independent methods. In one, a human cDNA library in phage λ was screened by hybridization with a mixed oligonucleotide probe encoding a stretch of seven amino acids found in yeast and Drosophila DNA topoisomerase II; in the other, a different human cDNA library in a λgt11 expression vector was screened for the expression of antigenic determinants that are recognized by rabbit antibodies specific to human DNA topoisomerase II. The entire coding sequences of the human DNA topoisomerase II gene were determined from these and several additional clones, identified through the use of the cloned human TOP2 gene sequences as probes. Hybridization between the cloned sequences and mRNA and genomic DNA indicates that the human enzyme is encoded by a single-copy gene. The location of the gene was mapped to chromosome 17q21-22 by in situ hybridization of a cloned fragment to metaphase chromosomes and by hybridization analysis with a panel of mouse-human hybrid cell lines, each retaining a subset of human chromosomes

  13. Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma

    Energy Technology Data Exchange (ETDEWEB)

    Krauthammer, Michael; Kong, Yong; Ha, Byung Hak; Evans, Perry; Bacchiocchi, Antonella; McCusker, James P.; Cheng, Elaine; Davis, Matthew J.; Goh, Gerald; Choi, Murim; Ariyan, Stephan; Narayan, Deepak; Dutton-Regester, Ken; Capatana, Ana; Holman, Edna C.; Bosenberg, Marcus; Sznol, Mario; Kluger, Harriet M.; Brash, Douglas E.; Stern, David F.; Materin, Miguel A.; Lo, Roger S.; Mane, Shrikant; Ma, Shuangge; Kidd, Kenneth K.; Hayward, Nicholas K.; Lifton, Richard P.; Schlessinger, Joseph; Boggon, Titus J.; Halaban, Ruth (Yale-MED); (UCLA); (Queens)

    2012-10-11

    We characterized the mutational landscape of melanoma, the form of skin cancer with the highest mortality rate, by sequencing the exomes of 147 melanomas. Sun-exposed melanomas had markedly more ultraviolet (UV)-like C>T somatic mutations compared to sun-shielded acral, mucosal and uveal melanomas. Among the newly identified cancer genes was PPP6C, encoding a serine/threonine phosphatase, which harbored mutations that clustered in the active site in 12% of sun-exposed melanomas, exclusively in tumors with mutations in BRAF or NRAS. Notably, we identified a recurrent UV-signature, an activating mutation in RAC1 in 9.2% of sun-exposed melanomas. This activating mutation, the third most frequent in our cohort of sun-exposed melanoma after those of BRAF and NRAS, changes Pro29 to serine (RAC1{sup P29S}) in the highly conserved switch I domain. Crystal structures, and biochemical and functional studies of RAC1{sup P29S} showed that the alteration releases the conformational restraint conferred by the conserved proline, causes an increased binding of the protein to downstream effectors, and promotes melanocyte proliferation and migration. These findings raise the possibility that pharmacological inhibition of downstream effectors of RAC1 signaling could be of therapeutic benefit.

  14. Somatic mutations in histiocytic sarcoma identified by next generation sequencing.

    Science.gov (United States)

    Liu, Qingqing; Tomaszewicz, Keith; Hutchinson, Lloyd; Hornick, Jason L; Woda, Bruce; Yu, Hongbo

    2016-08-01

    Histiocytic sarcoma is a rare malignant neoplasm of presumed hematopoietic origin showing morphologic and immunophenotypic evidence of histiocytic differentiation. Somatic mutation importance in the pathogenesis or disease progression of histiocytic sarcoma was largely unknown. To identify somatic mutations in histiocytic sarcoma, we studied 5 histiocytic sarcomas [3 female and 2 male patients; mean age 54.8 (20-72), anatomic sites include lymph node, uterus, and pleura] and matched normal tissues from each patient as germ line controls. Somatic mutations in 50 "Hotspot" oncogenes and tumor suppressor genes were examined using next generation sequencing. Three (out of five) histiocytic sarcoma cases carried somatic mutations in BRAF. Among them, G464V [variant frequency (VF) of 43.6 %] and G466R (VF of 29.6 %) located at the P loop potentially interfere with the hydrophobic interaction between P and activating loops and ultimately activation of BRAF. Also detected was BRAF somatic mutation N581S (VF of 7.4 %), which was located at the catalytic loop of BRAF kinase domain: its role in modifying kinase activity was unclear. A similar mutational analysis was also performed on nine acute monocytic/monoblastic leukemia cases, which did not identify any BRAF somatic mutations. Our study detected several BRAF mutations in histiocytic sarcomas, which may be important in understanding the tumorigenesis of this rare neoplasm and providing mechanisms for potential therapeutical opportunities.

  15. A DNA 'barcode blitz': rapid digitization and sequencing of a natural history collection.

    Science.gov (United States)

    Hebert, Paul D N; Dewaard, Jeremy R; Zakharov, Evgeny V; Prosser, Sean W J; Sones, Jayme E; McKeown, Jaclyn T A; Mantle, Beth; La Salle, John

    2013-01-01

    DNA barcoding protocols require the linkage of each sequence record to a voucher specimen that has, whenever possible, been authoritatively identified. Natural history collections would seem an ideal resource for barcode library construction, but they have never seen large-scale analysis because of concerns linked to DNA degradation. The present study examines the strength of this barrier, carrying out a comprehensive analysis of moth and butterfly (Lepidoptera) species in the Australian National Insect Collection. Protocols were developed that enabled tissue samples, specimen data, and images to be assembled rapidly. Using these methods, a five-person team processed 41,650 specimens representing 12,699 species in 14 weeks. Subsequent molecular analysis took about six months, reflecting the need for multiple rounds of PCR as sequence recovery was impacted by age, body size, and collection protocols. Despite these variables and the fact that specimens averaged 30.4 years old, barcode records were obtained from 86% of the species. In fact, one or more barcode compliant sequences (>487 bp) were recovered from virtually all species represented by five or more individuals, even when the youngest was 50 years old. By assembling specimen images, distributional data, and DNA barcode sequences on a web-accessible informatics platform, this study has greatly advanced accessibility to information on thousands of species. Moreover, much of the specimen data became publically accessible within days of its acquisition, while most sequence results saw release within three months. As such, this study reveals the speed with which DNA barcode workflows can mobilize biodiversity data, often providing the first web-accessible information for a species. These results further suggest that existing collections can enable the rapid development of a comprehensive DNA barcode library for the most diverse compartment of terrestrial biodiversity - insects.

  16. Hypervariable minisatellite DNA sequences in the Indian peafowl Pavo cristatus.

    Science.gov (United States)

    Hanotte, O; Burke, T; Armour, J A; Jeffreys, A J

    1991-04-01

    We report here for the first time the large-scale isolation of hypervariable minisatellite DNA sequences from a non-human species, the Indian peafowl (Pavo cristatus). A size-selected genomic DNA fraction, rich in hypervariable minisatellites, was cloned into Charomid 9-36. This library was screened using two multilocus hypervariable probes, 33.6 and 33.15 and also, in a "probe-walking" approach, with five of the peafowl minisatellites initially isolated. Forty-eight positively hybridizing clones were characterized and found to originate from 30 different loci, 18 of which were polymorphic. Five of these variable minisatellite loci were studied further. They all showed Mendelian inheritance. The heterozygosities of these loci were relatively low (range 22-78%) in comparison with those of previously cloned human loci, as expected in view of inbreeding in our semicaptive study population. No new length allele mutations were observed in families and the mean mutation rate per locus is low (less than 0.004, 95% confidence maximum). These loci were also investigated by cross-species hybridization in related taxa. The ability of the probes to detect hypervariable sequences in other species within the same avian family was found to vary, from those probes that are species-specific to those that are apparently general to the family. We also illustrate the potential usefulness of these probes for paternity analysis in a study of sexual selection, and discuss the general application of specific hypervariable probes in behavioral and evolutionary studies.

  17. A pneumatic device for rapid loading of DNA sequencing gels.

    Science.gov (United States)

    Panussis, D A; Cook, M W; Rifkin, L L; Snider, J E; Strong, J T; McGrane, R M; Wilson, R K; Mardis, E R

    1998-05-01

    This work describes the design and construction of a device that facilitates the loading of DNA samples onto polyacrylamide gels for detection in the Perkin Elmer/Applied Biosystems (PE/ABI) 373 and 377 DNA sequencing instruments. The device is mounted onto the existing gel cassettes and makes the process of loading high-density gels less cumbersome while the associated time and errors are reduced. The principle of operation includes the simultaneous transfer of the entire batch of samples, in which a spring-loaded air cylinder generates positive pressure and flexible silica capillaries transfer the samples. A retractable capillary array carrier allows the delivery ends of the capillaries to be held up clear of the gel during loader attachment on the gel plates, while enabling their insertion in the gel wells once the device is securely mounted. Gel-loading devices capable of simultaneously transferring 72 samples onto the PE/ABI 373 and 377 are currently being used in our production sequencing groups while a 96-sample transfer prototype undergoes testing.

  18. Phylogenomics of Phrynosomatid Lizards: Conflicting Signals from Sequence Capture versus Restriction Site Associated DNA Sequencing

    Science.gov (United States)

    Leaché, Adam D.; Chavez, Andreas S.; Jones, Leonard N.; Grummer, Jared A.; Gottscho, Andrew D.; Linkem, Charles W.

    2015-01-01

    Sequence capture and restriction site associated DNA sequencing (RADseq) are popular methods for obtaining large numbers of loci for phylogenetic analysis. These methods are typically used to collect data at different evolutionary timescales; sequence capture is primarily used for obtaining conserved loci, whereas RADseq is designed for discovering single nucleotide polymorphisms (SNPs) suitable for population genetic or phylogeographic analyses. Phylogenetic questions that span both “recent” and “deep” timescales could benefit from either type of data, but studies that directly compare the two approaches are lacking. We compared phylogenies estimated from sequence capture and double digest RADseq (ddRADseq) data for North American phrynosomatid lizards, a species-rich and diverse group containing nine genera that began diversifying approximately 55 Ma. Sequence capture resulted in 584 loci that provided a consistent and strong phylogeny using concatenation and species tree inference. However, the phylogeny estimated from the ddRADseq data was sensitive to the bioinformatics steps used for determining homology, detecting paralogs, and filtering missing data. The topological conflicts among the SNP trees were not restricted to any particular timescale, but instead were associated with short internal branches. Species tree analysis of the largest SNP assembly, which also included the most missing data, supported a topology that matched the sequence capture tree. This preferred phylogeny provides strong support for the paraphyly of the earless lizard genera Holbrookia and Cophosaurus, suggesting that the earless morphology either evolved twice or evolved once and was subsequently lost in Callisaurus. PMID:25663487

  19. AQME: A forensic mitochondrial DNA analysis tool for next-generation sequencing data.

    Science.gov (United States)

    Sturk-Andreaggi, Kimberly; Peck, Michelle A; Boysen, Cecilie; Dekker, Patrick; McMahon, Timothy P; Marshall, Charla K

    2017-11-01

    The feasibility of generating mitochondrial DNA (mtDNA) data has expanded considerably with the advent of next-generation sequencing (NGS), specifically in the generation of entire mtDNA genome (mitogenome) sequences. However, the analysis of these data has emerged as the greatest challenge to implementation in forensics. To address this need, a custom toolkit for use in the CLC Genomics Workbench (QIAGEN, Hilden, Germany) was developed through a collaborative effort between the Armed Forces Medical Examiner System - Armed Forces DNA Identification Laboratory (AFMES-AFDIL) and QIAGEN Bioinformatics. The AFDIL-QIAGEN mtDNA Expert, or AQME, generates an editable mtDNA profile that employs forensic conventions and includes the interpretation range required for mtDNA data reporting. AQME also integrates an mtDNA haplogroup estimate into the analysis workflow, which provides the analyst with phylogenetic nomenclature guidance and a profile quality check without the use of an external tool. Supplemental AQME outputs such as nucleotide-per-position metrics, configurable export files, and an audit trail are produced to assist the analyst during review. AQME is applied to standard CLC outputs and thus can be incorporated into any mtDNA bioinformatics pipeline within CLC regardless of sample type, library preparation or NGS platform. An evaluation of AQME was performed to demonstrate its functionality and reliability for the analysis of mitogenome NGS data. The study analyzed Illumina mitogenome data from 21 samples (including associated controls) of varying quality and sample preparations with the AQME toolkit. A total of 211 tool edits were automatically applied to 130 of the 698 total variants reported in an effort to adhere to forensic nomenclature. Although additional manual edits were required for three samples, supplemental tools such as mtDNA haplogroup estimation assisted in identifying and guiding these necessary modifications to the AQME-generated profile. Along

  20. New insights into Trypanosoma cruzi evolution, genotyping and molecular diagnostics from satellite DNA sequence analysis.

    Directory of Open Access Journals (Sweden)

    Juan C Ramírez

    2017-12-01

    Full Text Available Trypanosoma cruzi has been subdivided into seven Discrete Typing Units (DTUs, TcI-TcVI and Tcbat. Two major evolutionary models have been proposed to explain the origin of hybrid lineages, but while it is widely accepted that TcV and TcVI are the result of genetic exchange between TcII and TcIII strains, the origin of TcIII and TcIV is still a matter of debate. T. cruzi satellite DNA (SatDNA, comprised of 195 bp units organized in tandem repeats, from both TcV and TcVI stocks were found to have SatDNA copies type TcI and TcII; whereas contradictory results were observed for TcIII stocks and no TcIV sequence has been analyzed yet. Herein, we have gone deeper into this matter analyzing 335 distinct SatDNA sequences from 19 T. cruzi stocks representative of DTUs TcI-TcVI for phylogenetic inference. Bayesian phylogenetic tree showed that all sequences were grouped in three major clusters, which corresponded to sequences from DTUs TcI/III, TcII and TcIV; whereas TcV and TcVI stocks had two sets of sequences distributed into TcI/III and TcII clusters. As expected, the lowest genetic distances were found between TcI and TcIII, and between TcV and TcVI sequences; whereas the highest ones were observed between TcII and TcI/III, and among TcIV sequences and those from the remaining DTUs. In addition, signature patterns associated to specific T. cruzi lineages were identified and new primers that improved SatDNA-based qPCR sensitivity were designed. Our findings support the theory that TcIII is not the result of a hybridization event between TcI and TcII, and that TcIV had an independent origin from the other DTUs, contributing to clarifying the evolutionary history of T. cruzi lineages. Moreover, this work opens the possibility of typing samples from Chagas disease patients with low parasitic loads and improving molecular diagnostic methods of T. cruzi infection based on SatDNA sequence amplification.

  1. New insights into Trypanosoma cruzi evolution, genotyping and molecular diagnostics from satellite DNA sequence analysis.

    Science.gov (United States)

    Ramírez, Juan C; Torres, Carolina; Curto, María de Los A; Schijman, Alejandro G

    2017-12-01

    Trypanosoma cruzi has been subdivided into seven Discrete Typing Units (DTUs), TcI-TcVI and Tcbat. Two major evolutionary models have been proposed to explain the origin of hybrid lineages, but while it is widely accepted that TcV and TcVI are the result of genetic exchange between TcII and TcIII strains, the origin of TcIII and TcIV is still a matter of debate. T. cruzi satellite DNA (SatDNA), comprised of 195 bp units organized in tandem repeats, from both TcV and TcVI stocks were found to have SatDNA copies type TcI and TcII; whereas contradictory results were observed for TcIII stocks and no TcIV sequence has been analyzed yet. Herein, we have gone deeper into this matter analyzing 335 distinct SatDNA sequences from 19 T. cruzi stocks representative of DTUs TcI-TcVI for phylogenetic inference. Bayesian phylogenetic tree showed that all sequences were grouped in three major clusters, which corresponded to sequences from DTUs TcI/III, TcII and TcIV; whereas TcV and TcVI stocks had two sets of sequences distributed into TcI/III and TcII clusters. As expected, the lowest genetic distances were found between TcI and TcIII, and between TcV and TcVI sequences; whereas the highest ones were observed between TcII and TcI/III, and among TcIV sequences and those from the remaining DTUs. In addition, signature patterns associated to specific T. cruzi lineages were identified and new primers that improved SatDNA-based qPCR sensitivity were designed. Our findings support the theory that TcIII is not the result of a hybridization event between TcI and TcII, and that TcIV had an independent origin from the other DTUs, contributing to clarifying the evolutionary history of T. cruzi lineages. Moreover, this work opens the possibility of typing samples from Chagas disease patients with low parasitic loads and improving molecular diagnostic methods of T. cruzi infection based on SatDNA sequence amplification.

  2. DNA hybridization kinetics: zippering, internal displacement and sequence dependence.

    Science.gov (United States)

    Ouldridge, Thomas E; Sulc, Petr; Romano, Flavio; Doye, Jonathan P K; Louis, Ard A

    2013-10-01

    Although the thermodynamics of DNA hybridization is generally well established, the kinetics of this classic transition is less well understood. Providing such understanding has new urgency because DNA nanotechnology often depends critically on binding rates. Here, we explore DNA oligomer hybridization kinetics using a coarse-grained model. Strand association proceeds through a complex set of intermediate states, with successful binding events initiated by a few metastable base-pairing interactions, followed by zippering of the remaining bonds. But despite reasonably strong interstrand interactions, initial contacts frequently dissociate because typical configurations in which they form differ from typical states of similar enthalpy in the double-stranded equilibrium ensemble. Initial contacts must be stabilized by two or three base pairs before full zippering is likely, resulting in negative effective activation enthalpies. Non-Arrhenius behavior arises because the number of base pairs required for nucleation increases with temperature. In addition, we observe two alternative pathways-pseudoknot and inchworm internal displacement-through which misaligned duplexes can rearrange to form duplexes. These pathways accelerate hybridization. Our results explain why experimentally observed association rates of GC-rich oligomers are higher than rates of AT- rich equivalents, and more generally demonstrate how association rates can be modulated by sequence choice.

  3. Aptaligner: automated software for aligning pseudorandom DNA X-aptamers from next-generation sequencing data.

    Science.gov (United States)

    Lu, Emily; Elizondo-Riojas, Miguel-Angel; Chang, Jeffrey T; Volk, David E

    2014-06-10

    Next-generation sequencing results from bead-based aptamer libraries have demonstrated that traditional DNA/RNA alignment software is insufficient. This is particularly true for X-aptamers containing specialty bases (W, X, Y, Z, ...) that are identified by special encoding. Thus, we sought an automated program that uses the inherent design scheme of bead-based X-aptamers to create a hypothetical reference library and Markov modeling techniques to provide improved alignments. Aptaligner provides this feature as well as length error and noise level cutoff features, is parallelized to run on multiple central processing units (cores), and sorts sequences from a single chip into projects and subprojects.

  4. Quality standards for DNA sequence variation databases to improve clinical management under development in Australia

    Directory of Open Access Journals (Sweden)

    B. Bennetts

    2014-09-01

    Full Text Available Despite the routine nature of comparing sequence variations identified during clinical testing to database records, few databases meet quality requirements for clinical diagnostics. To address this issue, The Royal College of Pathologists of Australasia (RCPA in collaboration with the Human Genetics Society of Australasia (HGSA, and the Human Variome Project (HVP is developing standards for DNA sequence variation databases intended for use in the Australian clinical environment. The outputs of this project will be promoted to other health systems and accreditation bodies by the Human Variome Project to support the development of similar frameworks in other jurisdictions.

  5. Molecular characterization of Fasciola gigantica from Mauritania based on mitochondrial and nuclear ribosomal DNA sequences.

    Science.gov (United States)

    Amor, Nabil; Farjallah, Sarra; Salem, Mohamed; Lamine, Dia Mamadou; Merella, Paolo; Said, Khaled; Ben Slimane, Badreddine

    2011-10-01

    Fasciolosis caused by Fasciola hepatica and Fasciola gigantica (Platyhelminthes: Trematoda: Digenea) is considered the most important helminth infection of ruminants in tropical countries, causing considerable socioeconomic problems. From Africa, F. gigantica has been previously characterized from Burkina Faso, Senegal, Kenya, Zambia and Mali, while F. hepatica has been reported from Morocco and Tunisia, and both species have been observed from Ethiopia and Egypt on the basis of morphometric differences, while the use of molecular markers is necessary to distinguish exactly between species. Samples identified morphologically as F. gigantica (n=60) from sheep and cattle from different geographical localities of Mauritania were genetically characterized by sequences of the first (ITS-1), the 5.8S, and second (ITS-2) Internal Transcribed Spacers (ITS) of nuclear ribosomal DNA (rDNA) genes and the mitochondrial Cytochrome c Oxidase I (COI) gene. Comparison of the sequences of the Mauritanian samples with sequences of Fasciola spp. from GenBank confirmed that all samples belong to the species F. gigantica. The nucleotide sequencing of ITS rDNA of F. gigantica showed no nucleotide variation in the ITS-1, 5.8S, and ITS-2 rDNA sequences among all samples examined and those from Burkina Faso, Kenya, Egypt and Iran. The phylogenetic trees based on the ITS-1 and ITS-2 sequences showed a close relationship of the Mauritanian samples with isolates of F. gigantica from different localities of Africa and Asia. The COI genotypes of the Mauritanian specimens of F. gigantica had a high level of diversity, and they belonged to the F. gigantica phylogenically distinguishable clade. The present study is the first molecular characterization of F. gigantica in sheep and cattle from Mauritania, allowing a reliable approach for the genetic differentiation of Fasciola spp. and providing basis for further studies on liver flukes in the African countries. Copyright © 2011 Elsevier Inc. All

  6. ‘‘Blind'' mapping of genic DNA sequence polymorphisms in Lolium perenne L. by high resolution melting curve analysis

    DEFF Research Database (Denmark)

    Studer, Bruno; Jensen, Louise Bach; Fiil, Alice

    2009-01-01

    High resolution melting curve analysis (HRM) measures dissociation of double stranded DNA of a PCR product amplified in the presence of a saturating fluorescence dye. Recently, HRM proved successful to genotype DNA sequence polymorphisms such as SSRs and SNPs based on the shape of the melting...... curves. In this study, HRM was used for simultaneous screening and genotyping of genic DNA sequence polymorphisms identified in the Lolium perenne F2 mapping population VrnA. Melting profiles of PCR products amplified from previously published gene loci and from a novel gene putatively involved...

  7. PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.

    Science.gov (United States)

    Wimmer, Katharina; Wernstedt, Annekatrin

    2014-01-01

    The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method has also the advantage that it allows to effectively identify deletions, splice mutations, and de novo retrotransposon insertions that escape the detection of most DNA-based mutation analysis protocols.

  8. Detection of genomic variation by selection of a 9 mb DNA region and high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Sergey I Nikolaev

    Full Text Available Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb and 7 (1.1 Mb from an individual from the International HapMap Project (NA12872. We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage > or = 4-fold, and 97.9% concordant in regions with coverage > or = 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.

  9. Generation of sequence signatures from DNA amplification fingerprints with mini-hairpin and microsatellite primers.

    Science.gov (United States)

    Caetano-Anollés, G; Gresshoff, P M

    1996-06-01

    DNA amplification fingerprinting (DAF) with mini-hairpins harboring arbitrary "core" sequences at their 3' termini were used to fingerprint a variety of templates, including PCR products and whole genomes, to establish genetic relationships between plant tax at the interspecific and intraspecific level, and to identify closely related fungal isolates and plant accessions. No correlation was observed between the sequence of the arbitrary core, the stability of the mini-hairpin structure and DAF efficiency. Mini-hairpin primers with short arbitrary cores and primers complementary to simple sequence repeats present in microsatellites were also used to generate arbitrary signatures from amplification profiles (ASAP). The ASAP strategy is a dual-step amplification procedure that uses at least one primer in each fingerprinting stage. ASAP was able to reproducibly amplify DAF products (representing about 10-15 kb of sequence) following careful optimization of amplification parameters such as primer and template concentration. Avoidance of primer sequences partially complementary to DAF product termini was necessary in order to produce distinct fingerprints. This allowed the combinatorial use of oligomers in nucleic acid screening, with numerous ASAP fingerprinting reactions based on a limited number of primer sequences. Mini-hairpin primers and ASAP analysis significantly increased detection of polymorphic DNA, separating closely related bermudagrass (Cynodon) cultivars and detecting putatively linked markers in bulked segregant analysis of the soybean (Glycine max) supernodulation (nitrate-tolerant symbiosis) locus.

  10. Use of a mitochondrial COI sequence to identify species of the subtribe Aphidina (Hemiptera, Aphididae

    Directory of Open Access Journals (Sweden)

    Jianfeng WANG

    2011-08-01

    Full Text Available Aphids of the subtribe Aphidina are found mainly in the North Temperate Zone. The relative lack of diagnostic morphological characteristics has obscured the identification of species in this group. However, DNA-based taxonomic methods can clarify species relationships within this group. Sequence variation in a partial segment of the mitochondrial COI gene was highly effective for resolving species relationships within Aphidina. Forty-five species were correctly identified in a neighbor-joining tree. Mean intraspecific sequence divergence was 0.17%, with a range of 0.00% to 1.54%. Mean interspecific divergence within previously recognized genera or morphologically similar species groups was 4.54%, with variation mainly in the range of 3.50% to 8.00%. Possible reasons for anomalous levels of mean nucleotide divergence within or between some taxa are discussed.

  11. Determination of cDNA and genomic DNA sequences of hevamine, a chitinase from the rubber tree Hevea brasiliensis

    NARCIS (Netherlands)

    Bokma, E; Spiering, M; Chow, KS; Mulder, PPMFA; Subroto, T; Beintema, JJ

    Hevamine is a chitinase from the rubber tree Hevea brasiliensis and belongs to the family 18 glycosyl hydrolases. This paper describes the cloning of hevamine DNA and cDNA sequences. Hevamine contains a signal peptide at the N-terminus and a putative vacuolar targeting sequence at the C-terminus

  12. Mapping vaccinia virus DNA replication origins at nucleotide level by deep sequencing.

    Science.gov (United States)

    Senkevich, Tatiana G; Bruno, Daniel; Martens, Craig; Porcella, Stephen F; Wolf, Yuri I; Moss, Bernard

    2015-09-01

    Poxviruses reproduce in the host cytoplasm and encode most or all of the enzymes and factors needed for expression and synthesis of their double-stranded DNA genomes. Nevertheless, the mode of poxvirus DNA replication and the nature and location of the replication origins remain unknown. A current but unsubstantiated model posits only leading strand synthesis starting at a nick near one covalently closed end of the genome and continuing around the other end to generate a concatemer that is subsequently resolved into unit genomes. The existence of specific origins has been questioned because any plasmid can replicate in cells infected by vaccinia virus (VACV), the prototype poxvirus. We applied directional deep sequencing of short single-stranded DNA fragments enriched for RNA-primed nascent strands isolated from the cytoplasm of VACV-infected cells to pinpoint replication origins. The origins were identified as the switching points of the fragment directions, which correspond to the transition from continuous to discontinuous DNA synthesis. Origins containing a prominent initiation point mapped to a sequence within the hairpin loop at one end of the VACV genome and to the same sequence within the concatemeric junction of replication intermediates. These findings support a model for poxvirus genome replication that involves leading and lagging strand synthesis and is consistent with the requirements for primase and ligase activities as well as earlier electron microscopic and biochemical studies implicating a replication origin at the end of the VACV genome.

  13. Comparative d2/d3 LSU–rDNA sequence study of some Iranian ...

    African Journals Online (AJOL)

    SERVER

    2007-11-05

    Nov 5, 2007 ... segments yielded one fragment at over all sequenced isolates as 787 bp in size. The DNA sequences were aligned .... expansion segments of the 28S rDNA subunit (D2/D3. LSU-rDNA) are the ... isolated from different geographical location from tea shrubs infested roots of Guilan province, Iran (Table 1).

  14. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes.

    Science.gov (United States)

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-02-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild type-copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information.

  15. Sequence specificity and biological consequences of drugs that bind covalently in the minor groove of DNA

    International Nuclear Information System (INIS)

    Hurley, L.H.; Needham-VanDevanter, D.R.

    1986-01-01

    DNA ligands which bind within the minor groove of DNA exhibit varying degrees of sequence selectivity. Factors which contribute to nucleotide sequence recognition by minor groove ligands have been extensively investigated. Electrostatic interactions, ligand and DNA dehydration energies, hydrophobic interactions and steric factors all play significant roles in sequence selectivity in the minor groove. Interestingly, ligand recognition of nucleotide sequence in the minor groove does not involve significant hydrogen bonding. This is in sharp contrast to cellular enzyme and protein recognition of nucleotide sequence, which is achieved in the major groove via specific hydrogen bond formation between individual bases and the ligand. The ability to read nucleotide sequence via hydrogen bonding allows precise binding of proteins to specific DNA sequences. Minor groove ligands examined to date exhibit a much lower sequence specificity, generally binding to a subset of possible sequences, rather than a single sequence. 19 refs., 7 figs

  16. The DNA sequence of the human X chromosome

    Science.gov (United States)

    Ross, Mark T.; Grafham, Darren V.; Coffey, Alison J.; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R.; Burrows, Christine; Bird, Christine P.; Frankish, Adam; Lovell, Frances L.; Howe, Kevin L.; Ashurst, Jennifer L.; Fulton, Robert S.; Sudbrak, Ralf; Wen, Gaiping; Jones, Matthew C.; Hurles, Matthew E.; Andrews, T. Daniel; Scott, Carol E.; Searle, Stephen; Ramser, Juliane; Whittaker, Adam; Deadman, Rebecca; Carter, Nigel P.; Hunt, Sarah E.; Chen, Rui; Cree, Andrew; Gunaratne, Preethi; Havlak, Paul; Hodgson, Anne; Metzker, Michael L.; Richards, Stephen; Scott, Graham; Steffen, David; Sodergren, Erica; Wheeler, David A.; Worley, Kim C.; Ainscough, Rachael; Ambrose, Kerrie D.; Ansari-Lari, M. Ali; Aradhya, Swaroop; Ashwell, Robert I. S.; Babbage, Anne K.; Bagguley, Claire L.; Ballabio, Andrea; Banerjee, Ruby; Barker, Gary E.; Barlow, Karen F.; Barrett, Ian P.; Bates, Karen N.; Beare, David M.; Beasley, Helen; Beasley, Oliver; Beck, Alfred; Bethel, Graeme; Blechschmidt, Karin; Brady, Nicola; Bray-Allen, Sarah; Bridgeman, Anne M.; Brown, Andrew J.; Brown, Mary J.; Bonnin, David; Bruford, Elspeth A.; Buhay, Christian; Burch, Paula; Burford, Deborah; Burgess, Joanne; Burrill, Wayne; Burton, John; Bye, Jackie M.; Carder, Carol; Carrel, Laura; Chako, Joseph; Chapman, Joanne C.; Chavez, Dean; Chen, Ellson; Chen, Guan; Chen, Yuan; Chen, Zhijian; Chinault, Craig; Ciccodicola, Alfredo; Clark, Sue Y.; Clarke, Graham; Clee, Chris M.; Clegg, Sheila; Clerc-Blankenburg, Kerstin; Clifford, Karen; Cobley, Vicky; Cole, Charlotte G.; Conquer, Jen S.; Corby, Nicole; Connor, Richard E.; David, Robert; Davies, Joy; Davis, Clay; Davis, John; Delgado, Oliver; DeShazo, Denise; Dhami, Pawandeep; Ding, Yan; Dinh, Huyen; Dodsworth, Steve; Draper, Heather; Dugan-Rocha, Shannon; Dunham, Andrew; Dunn, Matthew; Durbin, K. James; Dutta, Ireena; Eades, Tamsin; Ellwood, Matthew; Emery-Cohen, Alexandra; Errington, Helen; Evans, Kathryn L.; Faulkner, Louisa; Francis, Fiona; Frankland, John; Fraser, Audrey E.; Galgoczy, Petra; Gilbert, James; Gill, Rachel; Glöckner, Gernot; Gregory, Simon G.; Gribble, Susan; Griffiths, Coline; Grocock, Russell; Gu, Yanghong; Gwilliam, Rhian; Hamilton, Cerissa; Hart, Elizabeth A.; Hawes, Alicia; Heath, Paul D.; Heitmann, Katja; Hennig, Steffen; Hernandez, Judith; Hinzmann, Bernd; Ho, Sarah; Hoffs, Michael; Howden, Phillip J.; Huckle, Elizabeth J.; Hume, Jennifer; Hunt, Paul J.; Hunt, Adrienne R.; Isherwood, Judith; Jacob, Leni; Johnson, David; Jones, Sally; de Jong, Pieter J.; Joseph, Shirin S.; Keenan, Stephen; Kelly, Susan; Kershaw, Joanne K.; Khan, Ziad; Kioschis, Petra; Klages, Sven; Knights, Andrew J.; Kosiura, Anna; Kovar-Smith, Christie; Laird, Gavin K.; Langford, Cordelia; Lawlor, Stephanie; Leversha, Margaret; Lewis, Lora; Liu, Wen; Lloyd, Christine; Lloyd, David M.; Loulseged, Hermela; Loveland, Jane E.; Lovell, Jamieson D.; Lozado, Ryan; Lu, Jing; Lyne, Rachael; Ma, Jie; Maheshwari, Manjula; Matthews, Lucy H.; McDowall, Jennifer; McLaren, Stuart; McMurray, Amanda; Meidl, Patrick; Meitinger, Thomas; Milne, Sarah; Miner, George; Mistry, Shailesh L.; Morgan, Margaret; Morris, Sidney; Müller, Ines; Mullikin, James C.; Nguyen, Ngoc; Nordsiek, Gabriele; Nyakatura, Gerald; O’Dell, Christopher N.; Okwuonu, Geoffery; Palmer, Sophie; Pandian, Richard; Parker, David; Parrish, Julia; Pasternak, Shiran; Patel, Dina; Pearce, Alex V.; Pearson, Danita M.; Pelan, Sarah E.; Perez, Lesette; Porter, Keith M.; Ramsey, Yvonne; Reichwald, Kathrin; Rhodes, Susan; Ridler, Kerry A.; Schlessinger, David; Schueler, Mary G.; Sehra, Harminder K.; Shaw-Smith, Charles; Shen, Hua; Sheridan, Elizabeth M.; Shownkeen, Ratna; Skuce, Carl D.; Smith, Michelle L.; Sotheran, Elizabeth C.; Steingruber, Helen E.; Steward, Charles A.; Storey, Roy; Swann, R. Mark; Swarbreck, David; Tabor, Paul E.; Taudien, Stefan; Taylor, Tineace; Teague, Brian; Thomas, Karen; Thorpe, Andrea; Timms, Kirsten; Tracey, Alan; Trevanion, Steve; Tromans, Anthony C.; d’Urso, Michele; Verduzco, Daniel; Villasana, Donna; Waldron, Lenee; Wall, Melanie; Wang, Qiaoyan; Warren, James; Warry, Georgina L.; Wei, Xuehong; West, Anthony; Whitehead, Siobhan L.; Whiteley, Mathew N.; Wilkinson, Jane E.; Willey, David L.; Williams, Gabrielle; Williams, Leanne; Williamson, Angela; Williamson, Helen; Wilming, Laurens; Woodmansey, Rebecca L.; Wray, Paul W.; Yen, Jennifer; Zhang, Jingkun; Zhou, Jianling; Zoghbi, Huda; Zorilla, Sara; Buck, David; Reinhardt, Richard; Poustka, Annemarie; Rosenthal, André; Lehrach, Hans; Meindl, Alfons; Minx, Patrick J.; Hillier, LaDeana W.; Willard, Huntington F.; Wilson, Richard K.; Waterston, Robert H.; Rice, Catherine M.; Vaudin, Mark; Coulson, Alan; Nelson, David L.; Weinstock, George; Sulston, John E.; Durbin, Richard; Hubbard, Tim; Gibbs, Richard A.; Beck, Stephan; Rogers, Jane; Bentley, David R.

    2009-01-01

    The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence. PMID:15772651

  17. Engineering of a DNA Polymerase for Direct m6 A Sequencing.

    Science.gov (United States)

    Aschenbrenner, Joos; Werner, Stephan; Marchand, Virginie; Adam, Martina; Motorin, Yuri; Helm, Mark; Marx, Andreas

    2018-01-08

    Methods for the detection of RNA modifications are of fundamental importance for advancing epitranscriptomics. N 6 -methyladenosine (m 6 A) is the most abundant RNA modification in mammalian mRNA and is involved in the regulation of gene expression. Current detection techniques are laborious and rely on antibody-based enrichment of m 6 A-containing RNA prior to sequencing, since m 6 A modifications are generally "erased" during reverse transcription (RT). To overcome the drawbacks associated with indirect detection, we aimed to generate novel DNA polymerase variants for direct m 6 A sequencing. Therefore, we developed a screen to evolve an RT-active KlenTaq DNA polymerase variant that sets a mark for N 6 -methylation. We identified a mutant that exhibits increased misincorporation opposite m 6 A compared to unmodified A. Application of the generated DNA polymerase in next-generation sequencing allowed the identification of m 6 A sites directly from the sequencing data of untreated RNA samples. © 2017 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.

  18. Plastome Sequencing of Ten Nonmodel Crop Species Uncovers a Large Insertion of Mitochondrial DNA in Cashew.

    Science.gov (United States)

    Rabah, Samar O; Lee, Chaehee; Hajrah, Nahid H; Makki, Rania M; Alharby, Hesham F; Alhebshi, Alawiah M; Sabir, Jamal S M; Jansen, Robert K; Ruhlman, Tracey A

    2017-11-01

    In plant evolution, intracellular gene transfer (IGT) is a prevalent, ongoing process. While nuclear and mitochondrial genomes are known to integrate foreign DNA via IGT and horizontal gene transfer (HGT), plastid genomes (plastomes) have resisted foreign DNA incorporation and only recently has IGT been uncovered in the plastomes of a few land plants. In this study, we completed plastome sequences for l0 crop species and describe a number of structural features including variation in gene and intron content, inversions, and expansion and contraction of the inverted repeat (IR). We identified a putative in cinnamon ( J. Presl) and other sequenced Lauraceae and an apparent functional transfer of to the nucleus of quinoa ( Willd.). In the orchard tree cashew ( L.), we report the insertion of an ∼6.7-kb fragment of mitochondrial DNA into the plastome IR. BLASTn analyses returned high identity hits to mitogenome sequences including an intact open reading frame. Using three plastome markers for five species of , we generated a phylogeny to investigate the distribution and timing of the insertion. Four species share the insertion, suggesting that this event occurred <20 million yr ago in a single clade in the genus. Our study extends the observation of mitochondrial to plastome IGT to include long-lived tree species. While previous studies have suggested possible mechanisms facilitating IGT to the plastome, more examples of this phenomenon, along with more complete mitogenome sequences, will be required before a common, or variable, mechanism can be elucidated. Copyright © 2017 Crop Science Society of America.

  19. RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

    Science.gov (United States)

    Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

    2012-01-01

    RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate the DNA sequence or user can upload the sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various application for the sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides the tools for routinely performed sequence analysis by the biological scientists such as DNA replication, reverse compliment generation, transcription, translation, sequence specific information as total number of nucleotide bases, ATGC base contents along with their respective percentages and sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides user friendly environment for sequence analysis. It is freely available. http://www.cemb.edu.pk/sw.html RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.

  20. Genome-Wide Prediction of DNA Methylation Using DNA Composition and Sequence Complexity in Human.

    Science.gov (United States)

    Wu, Chengchao; Yao, Shixin; Li, Xinghao; Chen, Chujia; Hu, Xuehai

    2017-02-16

    DNA methylation plays a significant role in transcriptional regulation by repressing activity. Change of the DNA methylation level is an important factor affecting the expression of target genes and downstream phenotypes. Because current experimental technologies can only assay a small proportion of CpG sites in the human genome, it is urgent to develop reliable computational models for predicting genome-wide DNA methylation. Here, we proposed a novel algorithm that accurately extracted sequence complexity features (seven features) and developed a support-vector-machine-based prediction model with integration of the reported DNA composition features (trinucleotide frequency and GC content, 65 features) by utilizing the methylation profiles of embryonic stem cells in human. The prediction results from 22 human chromosomes with size-varied windows showed that the 600-bp window achieved the best average accuracy of 94.7%. Moreover, comparisons with two existing methods further showed the superiority of our model, and cross-species predictions on mouse data also demonstrated that our model has certain generalization ability. Finally, a statistical test of the experimental data and the predicted data on functional regions annotated by ChromHMM found that six out of 10 regions were consistent, which implies reliable prediction of unassayed CpG sites. Accordingly, we believe that our novel model will be useful and reliable in predicting DNA methylation.

  1. DNA barcodes identify Central Asian Colias butterflies (Lepidoptera, Pieridae).

    Science.gov (United States)

    Laiho, Juha; Ståhls, Gunilla

    2013-12-30

    A majority of the known Colias species (Lepidoptera: Pieridae, Coliadinae) occur in the mountainous regions of Central-Asia, vast areas that are hard to access, rendering the knowledge of many species limited due to the lack of extensive sampling. Two gene regions, the mitochondrial COI 'barcode' region and the nuclear ribosomal protein RpS2 gene region were used for exploring the utility of these DNA markers for species identification. A comprehensive sampling of COI barcodes for Central Asian Colias butterflies showed that the barcodes facilitated identification of most of the included species. Phylogenetic reconstruction based on parsimony and Neighbour-Joining recovered most species as monophyletic entities. For the RpS2 gene region species-specific sequences were registered for some of the included Colias spp. Nevertheless, this gene region was not deemed useful as additional molecular 'barcode'. A parsimony analysis of the combined COI and RpS2 data did not support the current subgeneric classification based on morphological characteristics.

  2. Identifying transposon insertions and their effects from RNA-sequencing data.

    Science.gov (United States)

    de Ruiter, Julian R; Kas, Sjors M; Schut, Eva; Adams, David J; Koudijs, Marco J; Wessels, Lodewyk F A; Jonkers, Jos

    2017-07-07

    Insertional mutagenesis using engineered transposons is a potent forward genetic screening technique used to identify cancer genes in mouse model systems. In the analysis of these screens, transposon insertion sites are typically identified by targeted DNA-sequencing and subsequently assigned to predicted target genes using heuristics. As such, these approaches provide no direct evidence that insertions actually affect their predicted targets or how transcripts of these genes are affected. To address this, we developed IM-Fusion, an approach that identifies insertion sites from gene-transposon fusions in standard single- and paired-end RNA-sequencing data. We demonstrate IM-Fusion on two separate transposon screens of 123 mammary tumors and 20 B-cell acute lymphoblastic leukemias, respectively. We show that IM-Fusion accurately identifies transposon insertions and their true target genes. Furthermore, by combining the identified insertion sites with expression quantification, we show that we can determine the effect of a transposon insertion on its target gene(s) and prioritize insertions that have a significant effect on expression. We expect that IM-Fusion will significantly enhance the accuracy of cancer gene discovery in forward genetic screens and provide initial insight into the biological effects of insertions on candidate cancer genes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. cDNA cloning and nucleotide sequence comparison of Chinese hamster metallothionein I and II mRNAs

    Energy Technology Data Exchange (ETDEWEB)

    Griffith, B B; Walters, R A; Enger, M D; Hildebrand, C E; Griffith, J K

    1983-01-01

    Polyadenylated RNA was extracted from a cadmium resistant Chinese hamster (CHO) cell line, enriched for metal-induced, abundant RNA sequences and cloned as double-stranded cDNA in the plasmid pBR322. Two cDNA clones, pCHMT1 and pCHMT2, encoding two Chinese hamster isometallothioneins were identified, and the nucleotide sequence of each insert was determined. The two Chinese hamster metallothioneins show nucleotide sequence homologies of 80% in the protein coding region and approximately 35% in both the 5' and 3' untranslated regions. Interestingly, an 8 nucleotide sequence (TGTAAATA) has been conserved in sequence and position in the 3' untranslated regions of each metallothionein mRNA sequenced thus far. Estimated nucleotide substitution rates derived from interspecies comparisons were used to calculate a metallothionein gene duplication time of 45 to 120 million years ago. 39 references, 1 figure, 1 table.

  4. Using DNA Barcodes to Identify Road-Killed Animals in Two Atlantic Forest Nature Reserves, Brazil.

    Directory of Open Access Journals (Sweden)

    Angélica H Klippel

    Full Text Available Road mortality is the leading source of biodiversity loss in the world, especially due to fragmentation of natural habitats and loss of wildlife. The survey of the main species victims of roadkill is of fundamental importance for the better understanding of the problem, being necessary, for this, the correct species identification. The aim of this study was to verify if DNA barcodes can be applied to identify road-killed samples that often cannot be determined morphologically. For this purpose, 222 vertebrate samples were collected in a stretch of the BR-101 highway that crosses two Discovery Coast Atlantic Forest Natural Reserves, the Sooretama Biological Reserve and the Vale Natural Reserve, in Espírito Santo, Brazil. The mitochondrial COI gene was amplified, sequenced and confronted with the BOLD database. It was possible to identify 62.16% of samples, totaling 62 different species, including Pyrrhura cruentata, Chaetomys subspinosus, Puma yagouaroundi and Leopardus wiedii considered Vulnerable in the National Official List of Species of Endangered Wildlife. The most commonly identified animals were a bat (Molossus molossus, an opossum (Didelphis aurita and a frog (Trachycephalus mesophaeus species. Only one reptile was identified using the technique, probably due to lack of reference sequences in BOLD. These data may contribute to a better understanding of the impact of roads on species biodiversity loss and to introduce the DNA barcode technique to road ecology scenarios.

  5. An Efficient Approach to Mining Maximal Contiguous Frequent Patterns from Large DNA Sequence Databases

    Directory of Open Access Journals (Sweden)

    Md. Rezaul Karim

    2012-03-01

    Full Text Available Mining interesting patterns from DNA sequences is one of the most challenging tasks in bioinformatics and computational biology. Maximal contiguous frequent patterns are preferable for expressing the function and structure of DNA sequences and hence can capture the common data characteristics among related sequences. Biologists are interested in finding frequent orderly arrangements of motifs that are responsible for similar expression of a group of genes. In order to reduce mining time and complexity, however, most existing sequence mining algorithms either focus on finding short DNA sequences or require explicit specification of sequence lengths in advance. The challenge is to find longer sequences without specifying sequence lengths in advance. In this paper, we propose an efficient approach to mining maximal contiguous frequent patterns from large DNA sequence datasets. The experimental results show that our proposed approach is memory-efficient and mines maximal contiguous frequent patterns within a reasonable time.

  6. Beyond DNA Sequencing in Space: Current and Future Omics Capabilities of the Biomolecule Sequencer Payload

    Science.gov (United States)

    Wallace, Sarah

    2017-01-01

    Why do we need a DNA sequencer to support the human exploration of space? (A) Operational environmental monitoring; (1) Identification of contaminating microbes, (2) Infectious disease diagnosis, (3) Reduce down mass (sample return for environmental monitoring, crew health, etc.). (B) Research; (1) Human, (2) Animal, (3) Microbes/Cell lines, (4) Plant. (C) Med Ops; (1) Response to countermeasures, (2) Radiation, (3) Real-time analysis can influence medical intervention. (C) Support astrobiology science investigations; (1) Technology superiorly suited to in situ nucleic acid-based life detection, (2) Functional testing for integration into robotics for extraplanetary exploration mission.

  7. Mechanism of sequence-specific template binding by the DNA primase of bacteriophage T7

    KAUST Repository

    Lee, Seung-Joo; Zhu, Bin; Hamdan, Samir; Richardson, Charles C.

    2010-01-01

    DNA primases catalyze the synthesis of the oligoribonucleotides required for the initiation of lagging strand DNA synthesis. Biochemical studies have elucidated the mechanism for the sequence-specific synthesis of primers. However, the physical

  8. Assessing the fidelity of ancient DNA sequences amplified from nuclear genes

    DEFF Research Database (Denmark)

    Binladen, Jonas; Wiuf, Carsten Henrik; Gilbert, M. Thomas P.

    2006-01-01

    To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved...... in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from...... adenine), respectively. Type 2 transitions are by far the most dominant and increase relative to those of type 1 with damage load. The results suggest that the deamination of cytosine (and 5-methyl cytosine) to uracil (and thymine) is the main cause of miscoding lesions in both ancient mtDNA and nu...

  9. Efficient DNA fingerprinting based on the targeted sequencing of active retrotransposon insertion sites using a bench-top high-throughput sequencing platform.

    Science.gov (United States)

    Monden, Yuki; Yamamoto, Ayaka; Shindo, Akiko; Tahara, Makoto

    2014-10-01

    In many crop species, DNA fingerprinting is required for the precise identification of cultivars to protect the rights of breeders. Many families of retrotransposons have multiple copies throughout the eukaryotic genome and their integrated copies are inherited genetically. Thus, their insertion polymorphisms among cultivars are useful for DNA fingerprinting. In this study, we conducted a DNA fingerprinting based on the insertion polymorphisms of active retrotransposon families (Rtsp-1 and LIb) in sweet potato. Using 38 cultivars, we identified 2,024 insertion sites in the two families with an Illumina MiSeq sequencing platform. Of these insertion sites, 91.4% appeared to be polymorphic among the cultivars and 376 cultivar-specific insertion sites were identified, which were converted directly into cultivar-specific sequence-characterized amplified region (SCAR) markers. A phylogenetic tree was constructed using these insertion sites, which corresponded well with known pedigree information, thereby indicating their suitability for genetic diversity studies. Thus, the genome-wide comparative analysis of active retrotransposon insertion sites using the bench-top MiSeq sequencing platform is highly effective for DNA fingerprinting without any requirement for whole genome sequence information. This approach may facilitate the development of practical polymerase chain reaction-based cultivar diagnostic system and could also be applied to the determination of genetic relationships. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  10. Long span DNA paired-end-tag (DNA-PET sequencing strategy for the interrogation of genomic structural mutations and fusion-point-guided reconstruction of amplicons.

    Directory of Open Access Journals (Sweden)

    Fei Yao

    Full Text Available Structural variations (SVs contribute significantly to the variability of the human genome and extensive genomic rearrangements are a hallmark of cancer. While genomic DNA paired-end-tag (DNA-PET sequencing is an attractive approach to identify genomic SVs, the current application of PET sequencing with short insert size DNA can be insufficient for the comprehensive mapping of SVs in low complexity and repeat-rich genomic regions. We employed a recently developed procedure to generate PET sequencing data using large DNA inserts of 10-20 kb and compared their characteristics with short insert (1 kb libraries for their ability to identify SVs. Our results suggest that although short insert libraries bear an advantage in identifying small deletions, they do not provide significantly better breakpoint resolution. In contrast, large inserts are superior to short inserts in providing higher physical genome coverage for the same sequencing cost and achieve greater sensitivity, in practice, for the identification of several classes of SVs, such as copy number neutral and complex events. Furthermore, our results confirm that large insert libraries allow for the identification of SVs within repetitive sequences, which cannot be spanned by short inserts. This provides a key advantage in studying rearrangements in cancer, and we show how it can be used in a fusion-point-guided-concatenation algorithm to study focally amplified regions in cancer.

  11. Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

    Directory of Open Access Journals (Sweden)

    Moore JE

    2006-01-01

    Full Text Available Abstract Background At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. When cloning, sequencing and comparison of the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France, nucleotide sequence differences were demonstrated at the six loci in the 1,469 nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion High sequence similarity (99.5% or more was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted.

  12. Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

    Science.gov (United States)

    Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC

    2006-01-01

    Background At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. When cloning, sequencing and comparison of the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France, nucleotide sequence differences were demonstrated at the six loci in the 1,469 nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted. PMID:16398935

  13. Sequence-specific capture of protein-DNA complexes for mass spectrometric protein identification.

    Directory of Open Access Journals (Sweden)

    Cheng-Hsien Wu

    Full Text Available The regulation of gene transcription is fundamental to the existence of complex multicellular organisms such as humans. Although it is widely recognized that much of gene regulation is controlled by gene-specific protein-DNA interactions, there presently exists little in the way of tools to identify proteins that interact with the genome at locations of interest. We have developed a novel strategy to address this problem, which we refer to as GENECAPP, for Global ExoNuclease-based Enrichment of Chromatin-Associated Proteins for Proteomics. In this approach, formaldehyde cross-linking is employed to covalently link DNA to its associated proteins; subsequent fragmentation of the DNA, followed by exonuclease digestion, produces a single-stranded region of the DNA that enables sequence-specific hybridization capture of the protein-DNA complex on a solid support. Mass spectrometric (MS analysis of the captured proteins is then used for their identification and/or quantification. We show here the development and optimization of GENECAPP for an in vitro model system, comprised of the murine insulin-like growth factor-binding protein 1 (IGFBP1 promoter region and FoxO1, a member of the forkhead rhabdomyosarcoma (FoxO subfamily of transcription factors, which binds specifically to the IGFBP1 promoter. This novel strategy provides a powerful tool for studies of protein-DNA and protein-protein interactions.

  14. Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

    Directory of Open Access Journals (Sweden)

    Devier Benjamin

    2007-08-01

    Full Text Available Abstract Background The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. Results A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% a weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. Conclusion This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics.

  15. Sequence Capture versus Restriction Site Associated DNA Sequencing for Shallow Systematics.

    Science.gov (United States)

    Harvey, Michael G; Smith, Brian Tilston; Glenn, Travis C; Faircloth, Brant C; Brumfield, Robb T

    2016-09-01

    Sequence capture and restriction site associated DNA sequencing (RAD-Seq) are two genomic enrichment strategies for applying next-generation sequencing technologies to systematics studies. At shallow timescales, such as within species, RAD-Seq has been widely adopted among researchers, although there has been little discussion of the potential limitations and benefits of RAD-Seq and sequence capture. We discuss a series of issues that may impact the utility of sequence capture and RAD-Seq data for shallow systematics in non-model species. We review prior studies that used both methods, and investigate differences between the methods by re-analyzing existing RAD-Seq and sequence capture data sets from a Neotropical bird (Xenops minutus). We suggest that the strengths of RAD-Seq data sets for shallow systematics are the wide dispersion of markers across the genome, the relative ease and cost of laboratory work, the deep coverage and read overlap at recovered loci, and the high overall information that results. Sequence capture's benefits include flexibility and repeatability in the genomic regions targeted, success using low-quality samples, more straightforward read orthology assessment, and higher per-locus information content. The utility of a method in systematics, however, rests not only on its performance within a study, but on the comparability of data sets and inferences with those of prior work. In RAD-Seq data sets, comparability is compromised by low overlap of orthologous markers across species and the sensitivity of genetic diversity in a data set to an interaction between the level of natural heterozygosity in the samples examined and the parameters used for orthology assessment. In contrast, sequence capture of conserved genomic regions permits interrogation of the same loci across divergent species, which is preferable for maintaining comparability among data sets and studies for the purpose of drawing general conclusions about the impact of

  16. Contig Maps and Genomic Sequencing Identify Candidate Genes in the Usher 1C Locus

    Science.gov (United States)

    Higgins, Michael J.; Day, Colleen D.; Smilinich, Nancy J.; Ni, L.; Cooper, Paul R.; Nowak, Norma J.; Davies, Chris; de Jong, Pieter J.; Hejtmancik, Fielding; Evans, Glen A.; Smith, Richard J.H.; Shows, Thomas B.

    1998-01-01

    Usher syndrome 1C (USH1C) is a congenital condition manifesting profound hearing loss, the absence of vestibular function, and eventual retinal degeneration. The USH1C locus has been mapped genetically to a 2- to 3-cM interval in 11p14–15.1 between D11S899 and D11S861. In an effort to identify the USH1C disease gene we have isolated the region between these markers in yeast artificial chromosomes (YACs) using a combination of STS content mapping and Alu–PCR hybridization. The YAC contig is ∼3.5 Mb and has located several other loci within this interval, resulting in the order CEN-LDHA-SAA1-TPH-D11S1310-(D11S1888/KCNC1)-MYOD1-D11S902D11S921-D11S1890-TEL. Subsequent haplotyping and homozygosity analysis refined the location of the disease gene to a 400-kb interval between D11S902 and D11S1890 with all affected individuals being homozygous for the internal marker D11S921. To facilitate gene identification, the critical region has been converted into P1 artificial chromosome (PAC) clones using sequence-tagged sites (STSs) mapped to the YAC contig, Alu–PCR products generated from the YACs, and PAC end probes. A contig of >50 PAC clones has been assembled between D11S1310 and D11S1890, confirming the order of markers used in haplotyping. Three PAC clones representing nearly two-thirds of the USH1C critical region have been sequenced. PowerBLAST analysis identified six clusters of expressed sequence tags (ESTs), two known genes (BIR,SUR1) mapped previously to this region, and a previously characterized but unmapped gene NEFA (DNA binding/EF hand/acidic amino-acid-rich). GRAIL analysis identified 11 CpG islands and 73 exons of excellent quality. These data allowed the construction of a transcription map for the USH1C critical region, consisting of three known genes and six or more novel transcripts. Based on their map location, these loci represent candidate disease loci for USH1C. The NEFA gene was assessed as the USH1C locus by the sequencing of an amplified NEFA

  17. Genome-wide identification and characterisation of human DNA replication origins by initiation site sequencing (ini-seq).

    Science.gov (United States)

    Langley, Alexander R; Gräf, Stefan; Smith, James C; Krude, Torsten

    2016-12-01

    Next-generation sequencing has enabled the genome-wide identification of human DNA replication origins. However, different approaches to mapping replication origins, namely (i) sequencing isolated small nascent DNA strands (SNS-seq); (ii) sequencing replication bubbles (bubble-seq) and (iii) sequencing Okazaki fragments (OK-seq), show only limited concordance. To address this controversy, we describe here an independent high-resolution origin mapping technique that we call initiation site sequencing (ini-seq). In this approach, newly replicated DNA is directly labelled with digoxigenin-dUTP near the sites of its initiation in a cell-free system. The labelled DNA is then immunoprecipitated and genomic locations are determined by DNA sequencing. Using this technique we identify >25,000 discrete origin sites at sub-kilobase resolution on the human genome, with high concordance between biological replicates. Most activated origins identified by ini-seq are found at transcriptional start sites and contain G-quadruplex (G4) motifs. They tend to cluster in early-replicating domains, providing a correlation between early replication timing and local density of activated origins. Origins identified by ini-seq show highest concordance with sites identified by SNS-seq, followed by OK-seq and bubble-seq. Furthermore, germline origins identified by positive nucleotide distribution skew jumps overlap with origins identified by ini-seq and OK-seq more frequently and more specifically than do sites identified by either SNS-seq or bubble-seq. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. [Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].

    Science.gov (United States)

    Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y

    2017-08-01

    To analyze and detect the whole genome sequence of human mitochondrial DNA (mtDNA) by Ion Torrent PGM™ platform and to study the differences of mtDNA sequence in different tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costicartilage, nail, skeletal muscle and oral epithelium. Amplification of whole genome sequence of mtDNA was performed by 4 pairs of primer. Libraries were constructed with Ion Shear™ Plus Reagents kit and Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions on HVⅠ region. The whole genome sequence of mtDNA from all samples were amplified successfully. Six unrelated individuals belonged to 6 different haplotypes. Different tissues in one individual had heteroplasmy difference. The heteroplasmy positions and the mutation positions on HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, it was found that the results of mtDNA sequence had a high consistency in different tissues. The testing method used in present study for sequencing the whole genome sequence of human mtDNA can detect the heteroplasmy difference in different tissues, which have good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine

  19. Differential representation of sunflower ESTs in enriched organ-specific cDNA libraries in a small scale sequencing project

    Directory of Open Access Journals (Sweden)

    Heinz Ruth A

    2003-09-01

    Full Text Available Abstract Background Subtractive hybridization methods are valuable tools for identifying differentially regulated genes in a given tissue avoiding redundant sequencing of clones representing the same expressed genes, maximizing detection of low abundant transcripts and thus, affecting the efficiency and cost effectiveness of small scale cDNA sequencing projects aimed to the specific identification of useful genes for breeding purposes. The objective of this work is to evaluate alternative strategies to high-throughput sequencing projects for the identification of novel genes differentially expressed in sunflower as a source of organ-specific genetic markers that can be functionally associated to important traits. Results Differential organ-specific ESTs were generated from leaf, stem, root and flower bud at two developmental stages (R1 and R4. The use of different sources of RNA as tester and driver cDNA for the construction of differential libraries was evaluated as a tool for detection of rare or low abundant transcripts. Organ-specificity ranged from 75 to 100% of non-redundant sequences in the different cDNA libraries. Sequence redundancy varied according to the target and driver cDNA used in each case. The R4 flower cDNA library was the less redundant library with 62% of unique sequences. Out of a total of 919 sequences that were edited and annotated, 318 were non-redundant sequences. Comparison against sequences in public databases showed that 60% of non-redundant sequences showed significant similarity to known sequences. The number of predicted novel genes varied among the different cDNA libraries, ranging from 56% in the R4 flower to 16 % in the R1 flower bud library. Comparison with sunflower ESTs on public databases showed that 197 of non-redundant sequences (60% did not exhibit significant similarity to previously reported sunflower ESTs. This approach helped to successfully isolate a significant number of new reported sequences

  20. Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species.

    Science.gov (United States)

    Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y

    2014-04-08

    The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.

  1. RevTrans: multiple alignment of coding DNA from aligned amino acid sequences

    DEFF Research Database (Denmark)

    Wernersson, Rasmus; Pedersen, Anders Gorm

    2003-01-01

    The simple fact that proteins are built from 20 amino acids while DNA only contains four different bases, means that the 'signal-to-noise ratio' in protein sequence alignments is much better than in alignments of DNA. Besides this information-theoretical advantage, protein alignments also benefit...... proteins. It is therefore preferable to align coding DNA at the amino acid level and it is for this purpose we have constructed the program RevTrans. RevTrans constructs a multiple DNA alignment by: (i) translating the DNA; (ii) aligning the resulting peptide sequences; and (iii) building a multiple DNA...

  2. DNA interactions with a Methylene Blue redox indicator depend on the DNA length and are sequence specific.

    Science.gov (United States)

    Farjami, Elaheh; Clima, Lilia; Gothelf, Kurt V; Ferapontova, Elena E

    2010-06-01

    A DNA molecular beacon approach was used for the analysis of interactions between DNA and Methylene Blue (MB) as a redox indicator of a hybridization event. DNA hairpin structures of different length and guanine (G) content were immobilized onto gold electrodes in their folded states through the alkanethiol linker at the 5'-end. Binding of MB to the folded hairpin DNA was electrochemically studied and compared with binding to the duplex structure formed by hybridization of the hairpin DNA to a complementary DNA strand. Variation of the electrochemical signal from the DNA-MB complex was shown to depend primarily on the DNA length and sequence used: the G-C base pairs were the preferential sites of MB binding in the duplex. For short 20 nts long DNA sequences, the increased electrochemical response from MB bound to the duplex structure was consistent with the increased amount of bound and electrochemically readable MB molecules (i.e. MB molecules that are available for the electron transfer (ET) reaction with the electrode). With longer DNA sequences, the balance between the amounts of the electrochemically readable MB molecules bound to the hairpin DNA and to the hybrid was opposite: a part of the MB molecules bound to the long-sequence DNA duplex seem to be electrochemically mute due to long ET distance. The increasing electrochemical response from MB bound to the short-length DNA hybrid contrasts with the decreasing signal from MB bound to the long-length DNA hybrid and allows an "off"-"on" genosensor development.

  3. OPTSDNA: Performance evaluation of an efficient distributed bioinformatics system for DNA sequence analysis.

    Science.gov (United States)

    Khan, Mohammad Ibrahim; Sheel, Chotan

    2013-01-01

    Storage of sequence data is a big concern as the amount of data generated is exponential in nature at several locations. Therefore, there is a need to develop techniques to store data using compression algorithm. Here we describe optimal storage algorithm (OPTSDNA) for storing large amount of DNA sequences of varying length. This paper provides performance analysis of optimal storage algorithm (OPTSDNA) of a distributed bioinformatics computing system for analysis of DNA sequences. OPTSDNA algorithm is used for storing various sizes of DNA sequences into database. DNA sequences of different lengths were stored by using this algorithm. These input DNA sequences are varied in size from very small to very large. Storage size is calculated by this algorithm. Response time is also calculated in this work. The efficiency and performance of the algorithm is high (in size calculation with percentage) when compared with other known with sequential approach.

  4. Sequencing of megabase plus DNA by hybridization: Method development ENT. Final technical progress report

    Energy Technology Data Exchange (ETDEWEB)

    Crkvenjakov, R.; Drmanac, R.

    1991-01-31

    Sequencing by hybridization (SBH) is the only sequencing method based on the experimental determination of the content of oligonucleotide sequences. The data acquisition relies on the natural process of base pairing. It is possible to determine the content of complementary oligosequences in the target DNA by the process of hybridization with oligonucleotide probes of known sequences.

  5. Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine

    Science.gov (United States)

    Craig S Echt; Surya Saha; Dennis L Deemer; C Dana Nelson

    2011-01-01

    Genomic DNA sequence databases are a potential and growing resource for simple sequence repeat (SSR) marker development in loblolly pine (Pinus taeda L.). Loblolly pine also has many expressed sequence tags (ESTs) available for microsatellite (SSR) marker development. We compared loblolly pine SSR densities in genome survey sequences (GSSs) to those in non-redundant...

  6. Human tissue factor: cDNA sequence and chromosome localization of the gene

    International Nuclear Information System (INIS)

    Scarpati, E.M.; Wen, D.; Broze, G.J. Jr.; Miletich, J.P.; Flandermeyer, R.R.; Siegel, N.R.; Sadler, J.E.

    1987-01-01

    A human placenta cDNA library in λgt11 was screened for the expression of tissue factor antigens with rabbit polyclonal anti-human tissue factor immunoglobulin G. Among 4 million recombinant clones screened, one positive, λHTF8, expressed a protein that shared epitopes with authentic human brain tissue factor. The 1.1-kilobase cDNA insert of λHTF8 encoded a peptide that contained the amino-terminal protein sequence of human brain tissue factor. Northern blotting identified a major mRNA species of 2.2 kilobases and a minor species of ∼ 3.2 kilobases in poly(A) + RNA of placenta. Only 2.2-kilobase mRNA was detected in human brain and in the human monocytic U937 cell line. In U937 cells, the quantity of tissue factor mRNA was increased several fold by exposure of the cells to phorbol 12-myristate 13-acetate. Additional cDNA clones were selected by hybridization with the cDNA insert of λHTF8. These overlapping isolates span 2177 base pairs of the tissue factor cDNA sequence that includes a 5'-noncoding region of 75 base pairs, an open reading frame of 885 base pairs, a stop codon, a 3'-noncoding region of 1141 base pairs, and a poly(a) tail. The open reading frame encodes a 33-kilodalton protein of 295 amino acids. The predicted sequence includes a signal peptide of 32 or 34 amino acids, a probable extracellular factor VII binding domain of 217 or 219 amino acids, a transmembrane segment of 23 acids, and a cytoplasmic tail of 21 amino acids. There are three potential glycosylation sites with the sequence Asn-X-Thr/Ser. The 3'-noncoding region contains an inverted Alu family repetitive sequence. The tissue factor gene was localized to chromosome 1 by hybridization of the cDNA insert of λHTF8 to flow-sorted human chromosomes

  7. Complete chloroplast DNA sequence from a Korean endemic genus, Megaleranthis saniculifolia, and its evolutionary implications.

    Science.gov (United States)

    Kim, Young-Kyu; Park, Chong-wook; Kim, Ki-Joong

    2009-03-31

    The chloroplast DNA sequences of Megaleranthis saniculifolia, an endemic and monotypic endangered plant species, were completed in this study (GenBank FJ597983). The genome is 159,924 bp in length. It harbors a pair of IR regions consisting of 26,608 bp each. The lengths of the LSC and SSC regions are 88,326 bp and 18,382 bp, respectively. The structural organizations, gene and intron contents, gene orders, AT contents, codon usages, and transcription units of the Megaleranthis chloroplast genome are similar to those of typical land plant cp DNAs. However, the detailed features of Megaleranthis chloroplast genomes are substantially different from that of Ranunculus, which belongs to the same family, the Ranunculaceae. First, the Megaleranthis cp DNA was 4,797 bp longer than that of Ranunculus due to an expanded IR region into the SSC region and duplicated sequence elements in several spacer regions of the Megaleranthis cp genome. Second, the chloroplast genomes of Megaleranthis and Ranunculus evidence 5.6% sequence divergence in the coding regions, 8.9% sequence divergence in the intron regions, and 18.7% sequence divergence in the intergenic spacer regions, respectively. In both the coding and noncoding regions, average nucleotide substitution rates differed markedly, depending on the genome position. Our data strongly implicate the positional effects of the evolutionary modes of chloroplast genes. The genes evidencing higher levels of base substitutions also have higher incidences of indel mutations and low Ka/Ks ratios. A total of 54 simple sequence repeat loci were identified from the Megaleranthis cp genome. The existence of rich cp SSR loci in the Megaleranthis cp genome provides a rare opportunity to study the population genetic structures of this endangered species. Our phylogenetic trees based on the two independent markers, the nuclear ITS and chloroplast matK sequences, strongly support the inclusion of the Megaleranthis to the Trollius. Therefore, our

  8. A 28,000 Years Old Cro-Magnon mtDNA Sequence Differs from All Potentially Contaminating Modern Sequences

    Science.gov (United States)

    Caramelli, David; Milani, Lucio; Vai, Stefania; Modi, Alessandra; Pecchioli, Elena; Girardi, Matteo; Pilli, Elena; Lari, Martina; Lippi, Barbara; Ronchitelli, Annamaria; Mallegni, Francesco; Casoli, Antonella; Bertorelle, Giorgio; Barbujani, Guido

    2008-01-01

    Background DNA sequences from ancient speciments may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. Conclusions/Significance: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans. PMID:18628960

  9. A 28,000 years old Cro-Magnon mtDNA sequence differs from all potentially contaminating modern sequences.

    Directory of Open Access Journals (Sweden)

    David Caramelli

    Full Text Available BACKGROUND: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal and early modern (Cro-Magnoid Europeans. METHODOLOGY/PRINCIPAL FINDINGS: We typed the mitochondrial DNA (mtDNA hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23 and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. CONCLUSIONS/SIGNIFICANCE: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans.

  10. Sequence analysis of mitochondrial DNA hypervariable region III of ...

    African Journals Online (AJOL)

    Aghomotsegin

    2015-07-01

    Jul 1, 2015 ... population genetics research, studies based on mitochondrial DNA (mtDNA) and Y-chromosome DNA are an excellent way of illustrating population structure .... avoid landing investigators into serious situations of medical genetic privacy and ethnics, especially for. mtDNA coding area whose mutation often ...

  11. Protein and DNA sequence determinants of thermophilic adaptation.

    Directory of Open Access Journals (Sweden)

    Konstantin B Zeldovich

    2007-01-01

    Full Text Available There have been considerable attempts in the past to relate phenotypic trait--habitat temperature of organisms--to their genotypes, most importantly compositions of their genomes and proteomes. However, despite accumulation of anecdotal evidence, an exact and conclusive relationship between the former and the latter has been elusive. We present an exhaustive study of the relationship between amino acid composition of proteomes, nucleotide composition of DNA, and optimal growth temperature (OGT of prokaryotes. Based on 204 complete proteomes of archaea and bacteria spanning the temperature range from -10 degrees C to 110 degrees C, we performed an exhaustive enumeration of all possible sets of amino acids and found a set of amino acids whose total fraction in a proteome is correlated, to a remarkable extent, with the OGT. The universal set is Ile, Val, Tyr, Trp, Arg, Glu, Leu (IVYWREL, and the correlation coefficient is as high as 0.93. We also found that the G + C content in 204 complete genomes does not exhibit a significant correlation with OGT (R = -0.10. On the other hand, the fraction of A + G in coding DNA is correlated with temperature, to a considerable extent, due to codon patterns of IVYWREL amino acids. Further, we found strong and independent correlation between OGT and the frequency with which pairs of A and G nucleotides appear as nearest neighbors in genome sequences. This adaptation is achieved via codon bias. These findings present a direct link between principles of proteins structure and stability and evolutionary mechanisms of thermophylic adaptation. On the nucleotide level, the analysis provides an example of how nature utilizes codon bias for evolutionary adaptation to extreme conditions. Together these results provide a complete picture of how compositions of proteomes and genomes in prokaryotes adjust to the extreme conditions of the environment.

  12. Molecular Profiling of Microbial Communities from Contaminated Sources: Use of Subtractive Cloning Methods and rDNA Spacer Sequences; FINAL

    International Nuclear Information System (INIS)

    Robb, Frank T.

    2001-01-01

    The major objective of this research was to provide appropriate sequences and assemble a DNA array of oligonucleotides to be used for rapid profiling of microbial populations from polluted areas and other areas of interest. The sequences to be assigned to the DNA array were chosen from cloned genomic DNA taken from groundwater sites having well characterized pollutant histories at Hanford Nuclear Plant and Lawrence Livermore Site 300. Glass-slide arrays were made and tested; and a new multiplexed, bead-based method was developed that uses nucleic acid hybridization on the surface of microscopic polystyrene spheres to identify specific sequences in heterogeneous mixtures of DNA sequences. The test data revealed considerable strain variation between sample sites showing a striking distribution of sequences. It also suggests that diversity varies greatly with bioremediation, and that there are many bacterial intergenic spacer region sequences that can indicate its effects. The bead method exhibited superior sequence discrimination and has features for easier and more accurate measurement

  13. Insights into N-calls of mitochondrial DNA sequencing using MitoChip v2.0

    Directory of Open Access Journals (Sweden)

    Blakely Emma L

    2011-10-01

    Full Text Available Abstract Background Developments in DNA resequencing microarrays include mitochondrial DNA (mtDNA sequencing and mutation detection. Failure by the microarray to identify a base, compared to the reference sequence, is designated an 'N-call.' This study re-examined the N-call distribution of mtDNA samples sequenced by the Affymetrix MitoChip v.2.0, based on the hypothesis that N-calls may represent insertions or deletions (indels in mtDNA. Findings We analysed 16 patient mtDNA samples using MitoChip. N-calls by the proprietary GSEQ software were significantly reduced when either of the freeware on-line algorithms ResqMi or sPROFILER was utilized. With sPROFILER, this decrease in N-calls had no effect on the homoplasmic or heteroplasmic mutation levels compared to GSEQ software, but ResqMi produced a significant change in mutation load, as well as producing longer N-cell stretches. For these reasons, further analysis using ResqMi was not attempted. Conventional DNA sequencing of the longer N-calls stretches from sPROFILER revealed 7 insertions and 12 point mutations. Moreover, analysis of single-base N-calls of one mtDNA sample found 3 other point mutations. Conclusions Our study is the first to analyse N-calls produced from GSEQ software for the MitoChipv2.0. By narrowing the focus to longer stretches of N-calls revealed by sPROFILER, conventional sequencing was able to identify unique insertions and point mutations. Shorter N-calls also harboured point mutations, but the absence of deletions among N-calls suggests that probe confirmation affects binding and thus N-calling. This study supports the contention that the GSEQ is more capable of assigning bases when used in conjunction with sPROFILER.

  14. [Sequencing of mitochondrial DNA cytochrome oxidase subunit I gene in sarcosaphagous flies from 14 provinces in China].

    Science.gov (United States)

    Yang, Li; Cai, Jifeng; Wen, Jifang; Guo, Yadong

    2010-08-01

    To detect the 278 bp region of gene of the cytochrome oxidase subunit I (COI) in mitochondral DNA (mtDNA) of sarcosaphagous flies, identify the species of sarcosaphagous flies, and provide reference for forensic application. Samples were collected in Baotou and Chifeng of Inner Mongolia, Tianjin, Nanning, Fuzhou, Linyi of Shandong, Shijiazhuang, Yinchuan, Lanzhou, Huairou of Beijing, Xinxiang and Nanyang of Henan, Datong of Shanxi, Wuhu of Anhui, Quzhou of Zhejiang, Changsha, Zhuzhou and Yongzhou of Hunan. A total of 38 flies were randomly collected from rabbits, dogs and pigs which were set outdoors, then the flies' mitochondrial DNA (mtDNA) were extracted by the improved small insects DNA homogenate method. Amplification was conducted by Perkin-Elmer 9600 thermal cycler, then vertical non-denaturing 7% polyacrylamide gelectrophoresis. PCR products were purified using the nucleic acid purification kit. Sequences of both strands were obtained by direct sequence of the double-stranded PCR product using one of the PCR primers and the ABI PRISM big dye terminator cycle sequencing dit. Sequence reactions were electrophorsed on ABI Model 3730 DNA Sequencers. A UPGMA tree was contrasted using the maximum composite likelihood method in MEGA4. The 38 sarcosaphagous flies belonged to 3 families(Muscidae, Calliphoridae, and Sarcophagidae), 10 genuses (Musca Linnaeus, Hydrotaea Robineau-Desvoidy, Aldrichina Townsend, Hemipyrellia Townsend, Achoetandrus Bezzi, Protophormia Townsend, Chrysomya Robineau-Desvoidy, Lucilia Robineau-Desvoidy, Helicophagella Enderlein, and Boettcherisca Rohdendorf), and 12 species [Musca domestica (Linnaeus), Hydrotaea (Ophyra) capensis (Wiedemann), Lucilia caesar (Linnaeus), Lucilia illustris (Meigen), Aldrichina graham (Aldrich), Hemipyrellia ligurriens, Achoetandrus (Chrysomya) rufifacies (Macquary), Protophormia terraenovae (Robineau-Desvoidy), Chrysomya megacephala (Fabricius), Lucilia sericata (Meigen), Helicophagella melanura (Meigen), and

  15. Determining physical constraints in transcriptional initiationcomplexes using DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Shultzaberger, Ryan K.; Chiang, Derek Y.; Moses, Alan M.; Eisen,Michael B.

    2007-07-01

    Eukaryotic gene expression is often under the control ofcooperatively acting transcription factors whose binding is limited bystructural constraints. By determining these structural constraints, wecan understand the "rules" that define functional cooperativity.Conversely, by understanding the rules of binding, we can inferstructural characteristics. We have developed an information theory basedmethod for approximating the physical limitations of cooperativeinteractions by comparing sequence analysis to microarray expressiondata. When applied to the coordinated binding of the sulfur amino acidregulatory protein Met4 by Cbf1 and Met31, we were able to create acombinatorial model that can correctly identify Met4 regulatedgenes.

  16. Sequence-Based Introgression Mapping Identifies Candidate White Mold Tolerance Genes in Common Bean

    Directory of Open Access Journals (Sweden)

    Sujan Mamidi

    2016-07-01

    Full Text Available White mold, caused by the necrotrophic fungus (Lib. de Bary, is a major disease of common bean ( L.. WM7.1 and WM8.3 are two quantitative trait loci (QTL with major effects on tolerance to the pathogen. Advanced backcross populations segregating individually for either of the two QTL, and a recombinant inbred (RI population segregating for both QTL were used to fine map and confirm the genetic location of the QTL. The QTL intervals were physically mapped using the reference common bean genome sequence, and the physical intervals for each QTL were further confirmed by sequence-based introgression mapping. Using whole-genome sequence data from susceptible and tolerant DNA pools, introgressed regions were identified as those with significantly higher numbers of single-nucleotide polymorphisms (SNPs relative to the whole genome. By combining the QTL and SNP data, WM7.1 was located to a 660-kb region that contained 41 gene models on the proximal end of chromosome Pv07, while the WM8.3 introgression was narrowed to a 1.36-Mb region containing 70 gene models. The most polymorphic candidate gene in the WM7.1 region encodes a BEACH-domain protein associated with apoptosis. Within the WM8.3 interval, a receptor-like protein with the potential to recognize pathogen effectors was the most polymorphic gene. The use of gene and sequence-based mapping identified two candidate genes whose putative functions are consistent with the current model of pathogenicity.

  17. Full-length cDNA sequences from Rhesus monkey placenta tissue: analysis and utility for comparative mapping

    Directory of Open Access Journals (Sweden)

    Lee Sang-Rae

    2010-07-01

    Full Text Available Abstract Background Rhesus monkeys (Macaca mulatta are widely-used as experimental animals in biomedical research and are closely related to other laboratory macaques, such as cynomolgus monkeys (Macaca fascicularis, and to humans, sharing a last common ancestor from about 25 million years ago. Although rhesus monkeys have been studied extensively under field and laboratory conditions, research has been limited by the lack of genetic resources. The present study generated placenta full-length cDNA libraries, characterized the resulting expressed sequence tags, and described their utility for comparative mapping with human RefSeq mRNA transcripts. Results From rhesus monkey placenta full-length cDNA libraries, 2000 full-length cDNA sequences were determined and 1835 rhesus placenta cDNA sequences longer than 100 bp were collected. These sequences were annotated based on homology to human genes. Homology search against human RefSeq mRNAs revealed that our collection included the sequences of 1462 putative rhesus monkey genes. Moreover, we identified 207 genes containing exon alterations in the coding region and the untranslated region of rhesus monkey transcripts, despite the highly conserved structure of the coding regions. Approximately 10% (187 of all full-length cDNA sequences did not represent any public human RefSeq mRNAs. Intriguingly, two rhesus monkey specific exons derived from the transposable elements of AluYRa2 (SINE family and MER11B (LTR family were also identified. Conclusion The 1835 rhesus monkey placenta full-length cDNA sequences described here could expand genomic resources and information of rhesus monkeys. This increased genomic information will greatly contribute to the development of evolutionary biology and biomedical research.

  18. Parasitic infections and resource economy of Danish Iron Age settlement through ancient DNA sequencing.

    Science.gov (United States)

    Tams, Katrine Wegener; Jensen Søe, Martin; Merkyte, Inga; Valeur Seersholm, Frederik; Henriksen, Peter Steen; Klingenberg, Susanne; Willerslev, Eske; Kjær, Kurt H; Hansen, Anders Johannes; Kapel, Christian Moliin Outzen

    2018-01-01

    In this study, we screen archaeological soil samples by microscopy and analyse the samples by next generation sequencing to obtain results with parasites at species level and untargeted findings of plant and animal DNA. Three separate sediment layers of an ancient man-made pond in Hoby, Denmark, ranging from 100 BC to 200 AD, were analysed by microscopy for presence of intestinal worm eggs and DNA analysis were performed to identify intestinal worms and dietary components. Ancient DNA of parasites, domestic animals and edible plants revealed a change in use of the pond over time reflecting the household practice in the adjacent Iron Age settlement. The most abundant parasite found belonged to the Ascaris genus, which was not possible to type at species level. For all sediment layers the presence of eggs of the human whipworm Trichuris trichiura and the beef tapeworm Taenia saginata suggests continuous disposal of human faeces in the pond. Moreover, the continuous findings of T. saginata further imply beef consumption and may suggest that cattle were living in the immediate surrounding of the site throughout the period. Findings of additional host-specific parasites suggest fluctuating presence of other domestic animals over time: Trichuris suis (pig), Parascaris univalens (horse), Taenia hydatigena (dog and sheep). Likewise, alternating occurrence of aDNA of edible plants may suggest changes in agricultural practices. Moreover, the composition of aDNA of parasites, plants and vertebrates suggests a significant change in the use of the ancient pond over a period of three centuries.

  19. Machine Learned Replacement of N-Labels for Basecalled Sequences in DNA Barcoding.

    Science.gov (United States)

    Ma, Eddie Y T; Ratnasingham, Sujeevan; Kremer, Stefan C

    2018-01-01

    This study presents a machine learning method that increases the number of identified bases in Sanger Sequencing. The system post-processes a KB basecalled chromatogram. It selects a recoverable subset of N-labels in the KB-called chromatogram to replace with basecalls (A,C,G,T). An N-label correction is defined given an additional read of the same sequence, and a human finished sequence. Corrections are added to the dataset when an alignment determines the additional read and human agree on the identity of the N-label. KB must also rate the replacement with quality value of in the additional read. Corrections are only available during system training. Developing the system, nearly 850,000 N-labels are obtained from Barcode of Life Datasystems, the premier database of genetic markers called DNA Barcodes. Increasing the number of correct bases improves reference sequence reliability, increases sequence identification accuracy, and assures analysis correctness. Keeping with barcoding standards, our system maintains an error rate of percent. Our system only applies corrections when it estimates low rate of error. Tested on this data, our automation selects and recovers: 79 percent of N-labels from COI (animal barcode); 80 percent from matK and rbcL (plant barcodes); and 58 percent from non-protein-coding sequences (across eukaryotes).

  20. Sequence homology and expression profile of genes associated with dna repair pathways in Mycobacterium leprae

    Directory of Open Access Journals (Sweden)

    Mukul Sharma

    2017-01-01

    Full Text Available Background: Survival of Mycobacterium leprae, the causative bacteria for leprosy, in the human host is dependent to an extent on the ways in which its genome integrity is retained. DNA repair mechanisms protect bacterial DNA from damage induced by various stress factors. The current study is aimed at understanding the sequence and functional annotation of DNA repair genes in M. leprae. Methods: T he genome of M. leprae was annotated using sequence alignment tools to identify DNA repair genes that have homologs in Mycobacterium tuberculosis and Escherichia coli. A set of 96 genes known to be involved in DNA repair mechanisms in E. coli and Mycobacteriaceae were chosen as a reference. Among these, 61 were identified in M. leprae based on sequence similarity and domain architecture. The 61 were classified into 36 characterized gene products (59%, 11 hypothetical proteins (18%, and 14 pseudogenes (23%. All these genes have homologs in M. tuberculosis and 49 (80.32% in E. coli. A set of 12 genes which are absent in E. coli were present in M. leprae and in Mycobacteriaceae. These 61 genes were further investigated for their expression profiles in the whole transcriptome microarray data of M. leprae which was obtained from the signal intensities of 60bp probes, tiling the entire genome with 10bp overlaps. Results: It was noted that transcripts corresponding to all the 61 genes were identified in the transcriptome data with varying expression levels ranging from 0.18 to 2.47 fold (normalized with 16SrRNA. The mRNA expression levels of a representative set of seven genes ( four annotated and three hypothetical protein coding genes were analyzed using quantitative Polymerase Chain Reaction (qPCR assays with RNA extracted from skin biopsies of 10 newly diagnosed, untreated leprosy cases. It was noted that RNA expression levels were higher for genes involved in homologous recombination whereas the genes with a low level of expression are involved in the

  1. Sequence homology and expression profile of genes associated with DNA repair pathways in Mycobacterium leprae.

    Science.gov (United States)

    Sharma, Mukul; Vedithi, Sundeep Chaitanya; Das, Madhusmita; Roy, Anindya; Ebenezer, Mannam

    2017-01-01

    Survival of Mycobacterium leprae, the causative bacteria for leprosy, in the human host is dependent to an extent on the ways in which its genome integrity is retained. DNA repair mechanisms protect bacterial DNA from damage induced by various stress factors. The current study is aimed at understanding the sequence and functional annotation of DNA repair genes in M. leprae. T he genome of M. leprae was annotated using sequence alignment tools to identify DNA repair genes that have homologs in Mycobacterium tuberculosis and Escherichia coli. A set of 96 genes known to be involved in DNA repair mechanisms in E. coli and Mycobacteriaceae were chosen as a reference. Among these, 61 were identified in M. leprae based on sequence similarity and domain architecture. The 61 were classified into 36 characterized gene products (59%), 11 hypothetical proteins (18%), and 14 pseudogenes (23%). All these genes have homologs in M. tuberculosis and 49 (80.32%) in E. coli. A set of 12 genes which are absent in E. coli were present in M. leprae and in Mycobacteriaceae. These 61 genes were further investigated for their expression profiles in the whole transcriptome microarray data of M. leprae which was obtained from the signal intensities of 60bp probes, tiling the entire genome with 10bp overlaps. It was noted that transcripts corresponding to all the 61 genes were identified in the transcriptome data with varying expression levels ranging from 0.18 to 2.47 fold (normalized with 16SrRNA). The mRNA expression levels of a representative set of seven genes ( four annotated and three hypothetical protein coding genes) were analyzed using quantitative Polymerase Chain Reaction (qPCR) assays with RNA extracted from skin biopsies of 10 newly diagnosed, untreated leprosy cases. It was noted that RNA expression levels were higher for genes involved in homologous recombination whereas the genes with a low level of expression are involved in the direct repair pathway. This study provided

  2. Ribosomal DNA intergenic spacer sequence in foxtail millet, Setaria italica (L.) P. Beauv. and its characterization and application to typing of foxtail millet landraces.

    Science.gov (United States)

    Fukunaga, Kenji; Ichitani, Katsuyuki; Taura, Satoru; Sato, Muneharu; Kawase, Makoto

    2005-02-01

    We determined the sequence of ribosomal DNA (rDNA) intergenic spacer (IGS) of foxtail millet isolated in our previous study, and identified subrepeats in the polymorphic region. We also developed a PCR-based method for identifying rDNA types based on sequence information and assessed 153 accessions of foxtail millet. Results were congruent with our previous works. This study provides new findings regarding the geographical distribution of rDNA variants. This new method facilitates analyses of numerous foxtail millet accessions. It is helpful for typing of foxtail millet germplasms and elucidating the evolution of this millet.

  3. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution

    NARCIS (Netherlands)

    Falconer, Ester; Hills, Mark; Naumann, Ulrike; Poon, Steven S. S.; Chavez, Elizabeth A.; Sanders, Ashley D.; Zhao, Yongjun; Hirst, Martin; Lansdorp, Peter M.

    DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it

  4. Germline Variants in Targeted Tumor Sequencing Using Matched Normal DNA.

    Science.gov (United States)

    Schrader, Kasmintan A; Cheng, Donavan T; Joseph, Vijai; Prasad, Meera; Walsh, Michael; Zehir, Ahmet; Ni, Ai; Thomas, Tinu; Benayed, Ryma; Ashraf, Asad; Lincoln, Annie; Arcila, Maria; Stadler, Zsofia; Solit, David; Hyman, David M; Hyman, David; Zhang, Liying; Klimstra, David; Ladanyi, Marc; Offit, Kenneth; Berger, Michael; Robson, Mark

    2016-01-01

    Tumor genetic sequencing identifies potentially targetable genetic alterations with therapeutic implications. Analysis has concentrated on detecting tumor-specific variants, but recognition of germline variants may prove valuable as well. To estimate the burden of germline variants identified through routine clinical tumor sequencing. Patients with advanced cancer diagnoses eligible for studies of targeted agents at Memorial Sloan Kettering Cancer Center are offered tumor-normal sequencing with MSK-IMPACT, a 341-gene panel. We surveyed the germline variants seen in 187 overlapping genes with Mendelian disease associations in 1566 patients who had undergone tumor profiling between March and October 2014. The number of presumed pathogenic germline variants (PPGVs) and variants of uncertain significance per person in 187 genes associated with single-gene disorders and the proportions of individuals with PPGVs in clinically relevant gene subsets, in genes consistent with known tumor phenotypes, and in genes with evidence of second somatic hits in their tumors. The mean age of the 1566 patients was 58 years, and 54% were women. Presumed pathogenic germline variants in known Mendelian disease-associated genes were identified in 246 of 1566 patients (15.7%; 95% CI, 14.0%-17.6%), including 198 individuals with mutations in genes associated with cancer susceptibility. Germline findings in cancer susceptibility genes were concordant with the individual's cancer type in only 81 of 198 cases (40.9%; 95% CI, 34.3%-47.9%). In individuals with PPGVs retained in the tumor, somatic alteration of the other allele was seen in 39 of 182 cases (21.4%; 95% CI, 16.1%-28.0%), of which 13 cases did not show a known correlation of the germline mutation and a known syndrome. Mutations in non-cancer-related Mendelian disease genes were seen in 55 of 1566 cases (3.5%; 95% CI, 27.1%-45.4%). Almost every individual had more than 1 variant of uncertain significance (1565 of 1566 patients; 99

  5. The DNA sequence and biology of human chromosome 19

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, J; Gordon, L A; Olsen, A; Terry, A; Schmutz, J; Lamerdin, J; Hellsten, U; Goodstein, D; Couronne, O; Tran-Gyamfi, M

    2004-04-06

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high GC content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in Mendelian disorders, including familial hypercholesterolemia and insulin-resistant diabetes. Nearly one quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  6. Frequent mutations in EGFR, KRAS and TP53 genes in human lung cancer tumors detected by ion torrent DNA sequencing.

    Directory of Open Access Journals (Sweden)

    Xin Cai

    Full Text Available Lung cancer is the most common malignancy and the leading cause of cancer deaths worldwide. While smoking is by far the leading cause of lung cancer, other environmental and genetic factors influence the development and progression of the cancer. Since unique mutations patterns have been observed in individual cancer samples, identification and characterization of the distinctive lung cancer molecular profile is essential for developing more effective, tailored therapies. Until recently, personalized DNA sequencing to identify genetic mutations in cancer was impractical and expensive. The recent technological advancements in next-generation DNA sequencing, such as the semiconductor-based Ion Torrent sequencing platform, has made DNA sequencing cost and time effective with more reliable results. Using the Ion Torrent Ampliseq Cancer Panel, we sequenced 737 loci from 45 cancer-related genes to identify genetic mutations in 76 human lung cancer samples. The sequencing analysis revealed missense mutations in KRAS, EGFR, and TP53 genes in the breast cancer samples of various histologic types. Thus, this study demonstrates the necessity of sequencing individual human cancers in order to develop personalized drugs or combination therapies to effectively target individual, breast cancer-specific mutations.

  7. Identification and positional distribution analysis of transcription factor binding sites for genes from the wheat fl-cDNA sequences.

    Science.gov (United States)

    Chen, Zhen-Yong; Guo, Xiao-Jiang; Chen, Zhong-Xu; Chen, Wei-Ying; Wang, Ji-Rui

    2017-06-01

    The binding sites of transcription factors (TFs) in upstream DNA regions are called transcription factor binding sites (TFBSs). TFBSs are important elements for regulating gene expression. To date, there have been few studies on the profiles of TFBSs in plants. In total, 4,873 sequences with 5' upstream regions from 8530 wheat fl-cDNA sequences were used to predict TFBSs. We found 4572 TFBSs for the MADS TF family, which was twice as many as for bHLH (1951), B3 (1951), HB superfamily (1914), ERF (1820), and AP2/ERF (1725) TFs, and was approximately four times higher than the remaining TFBS types. The percentage of TFBSs and TF members showed a distinct distribution in different tissues. Overall, the distribution of TFBSs in the upstream regions of wheat fl-cDNA sequences had significant difference. Meanwhile, high frequencies of some types of TFBSs were found in specific regions in the upstream sequences. Both TFs and fl-cDNA with TFBSs predicted in the same tissues exhibited specific distribution preferences for regulating gene expression. The tissue-specific analysis of TFs and fl-cDNA with TFBSs provides useful information for functional research, and can be used to identify relationships between tissue-specific TFs and fl-cDNA with TFBSs. Moreover, the positional distribution of TFBSs indicates that some types of wheat TFBS have different positional distribution preferences in the upstream regions of genes.

  8. SSR_pipeline--computer software for the identification of microsatellite sequences from paired-end Illumina high-throughput DNA sequence data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (SSRs; for example, microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains three analysis modules along with a fourth control module that can be used to automate analyses of large volumes of data. The modules are used to (1) identify the subset of paired-end sequences that pass quality standards, (2) align paired-end reads into a single composite DNA sequence, and (3) identify sequences that possess microsatellites conforming to user specified parameters. Each of the three separate analysis modules also can be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc). All modules are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, Windows). The program suite relies on a compiled Python extension module to perform paired-end alignments. Instructions for compiling the extension from source code are provided in the documentation. Users who do not have Python installed on their computers or who do not have the ability to compile software also may choose to download packaged executable files. These files include all Python scripts, a copy of the compiled extension module, and a minimal installation of Python in a single binary executable. See program documentation for more information.

  9. Comparison of Flow Injection MS, NMR, and DNA Sequencing: Methods for Identification and Authentication of Black Cohosh (Actaea racemosa)

    Science.gov (United States)

    Flow injection mass spectrometry (FIMS) and proton nuclear magnetic resonance spectrometry (1H-NMR), two metabolic fingerprinting methods, and DNA sequencing were used to identify and authenticate Actaea species. Initially, samples of Actaea racemosa L. from a single source were distinguished from ...

  10. Whole-genome sequencing identifies recurrent somatic NOTCH2 mutations in splenic marginal zone lymphoma.

    Science.gov (United States)

    Kiel, Mark J; Velusamy, Thirunavukkarasu; Betz, Bryan L; Zhao, Lili; Weigelin, Helmut G; Chiang, Mark Y; Huebner-Chan, David R; Bailey, Nathanael G; Yang, David T; Bhagat, Govind; Miranda, Roberto N; Bahler, David W; Medeiros, L Jeffrey; Lim, Megan S; Elenitoba-Johnson, Kojo S J

    2012-08-27

    Splenic marginal zone lymphoma (SMZL), the most common primary lymphoma of spleen, is poorly understood at the genetic level. In this study, using whole-genome DNA sequencing (WGS) and confirmation by Sanger sequencing, we observed mutations identified in several genes not previously known to be recurrently altered in SMZL. In particular, we identified recurrent somatic gain-of-function mutations in NOTCH2, a gene encoding a protein required for marginal zone B cell development, in 25 of 99 (∼25%) cases of SMZL and in 1 of 19 (∼5%) cases of nonsplenic MZLs. These mutations clustered near the C-terminal proline/glutamate/serine/threonine (PEST)-rich domain, resulting in protein truncation or, rarely, were nonsynonymous substitutions affecting the extracellular heterodimerization domain (HD). NOTCH2 mutations were not present in other B cell lymphomas and leukemias, such as chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL; n = 15), mantle cell lymphoma (MCL; n = 15), low-grade follicular lymphoma (FL; n = 44), hairy cell leukemia (HCL; n = 15), and reactive lymphoid hyperplasia (n = 14). NOTCH2 mutations were associated with adverse clinical outcomes (relapse, histological transformation, and/or death) among SMZL patients (P = 0.002). These results suggest that NOTCH2 mutations play a role in the pathogenesis and progression of SMZL and are associated with a poor prognosis.

  11. Detection of Ribosomal DNA Sequence Polymorphisms in the Protist Plasmodiophora brassicae for the Identification of Geographical Isolates

    Directory of Open Access Journals (Sweden)

    Rawnak Laila

    2017-01-01

    Full Text Available Clubroot is a soil-borne disease caused by the protist Plasmodiophora brassicae (P. brassicae. It is one of the most economically important diseases of Brassica rapa and other cruciferous crops as it can cause remarkable yield reductions. Understanding P. brassicae genetics, and developing efficient molecular markers, is essential for effective detection of harmful races of this pathogen. Samples from 11 Korean field populations of P. brassicae (geographic isolates, collected from nine different locations in South Korea, were used in this study. Genomic DNA was extracted from the clubroot-infected samples to sequence the ribosomal DNA. Primers and probes for P. brassicae were designed using a ribosomal DNA gene sequence from a Japanese strain available in GenBank (accession number AB526843; isolate NGY. The nuclear ribosomal DNA (rDNA sequence of P. brassicae, comprising 6932 base pairs (bp, was cloned and sequenced and found to include the small subunits (SSUs and a large subunit (LSU, internal transcribed spacers (ITS1 and ITS2, and a 5.8s. Sequence variation was observed in both the SSU and LSU. Four markers showed useful differences in high-resolution melting analysis to identify nucleotide polymorphisms including single- nucleotide polymorphisms (SNPs, oligonucleotide polymorphisms, and insertions/deletions (InDels. A combination of three markers was able to distinguish the geographical isolates into two groups.

  12. A novel mini-DNA barcoding assay to identify processed fins from internationally protected shark species.

    Science.gov (United States)

    Fields, Andrew T; Abercrombie, Debra L; Eng, Rowena; Feldheim, Kevin; Chapman, Demian D

    2015-01-01

    There is a growing need to identify shark products in trade, in part due to the recent listing of five commercially important species on the Appendices of the Convention on International Trade in Endangered Species (CITES; porbeagle, Lamna nasus, oceanic whitetip, Carcharhinus longimanus scalloped hammerhead, Sphyrna lewini, smooth hammerhead, S. zygaena and great hammerhead S. mokarran) in addition to three species listed in the early part of this century (whale, Rhincodon typus, basking, Cetorhinus maximus, and white, Carcharodon carcharias). Shark fins are traded internationally to supply the Asian dried seafood market, in which they are used to make the luxury dish shark fin soup. Shark fins usually enter international trade with their skin still intact and can be identified using morphological characters or standard DNA-barcoding approaches. Once they reach Asia and are traded in this region the skin is removed and they are treated with chemicals that eliminate many key diagnostic characters and degrade their DNA ("processed fins"). Here, we present a validated mini-barcode assay based on partial sequences of the cytochrome oxidase I gene that can reliably identify the processed fins of seven of the eight CITES listed shark species. We also demonstrate that the assay can even frequently identify the species or genus of origin of shark fin soup (31 out of 50 samples).

  13. A novel mini-DNA barcoding assay to identify processed fins from internationally protected shark species.

    Directory of Open Access Journals (Sweden)

    Andrew T Fields

    Full Text Available There is a growing need to identify shark products in trade, in part due to the recent listing of five commercially important species on the Appendices of the Convention on International Trade in Endangered Species (CITES; porbeagle, Lamna nasus, oceanic whitetip, Carcharhinus longimanus scalloped hammerhead, Sphyrna lewini, smooth hammerhead, S. zygaena and great hammerhead S. mokarran in addition to three species listed in the early part of this century (whale, Rhincodon typus, basking, Cetorhinus maximus, and white, Carcharodon carcharias. Shark fins are traded internationally to supply the Asian dried seafood market, in which they are used to make the luxury dish shark fin soup. Shark fins usually enter international trade with their skin still intact and can be identified using morphological characters or standard DNA-barcoding approaches. Once they reach Asia and are traded in this region the skin is removed and they are treated with chemicals that eliminate many key diagnostic characters and degrade their DNA ("processed fins". Here, we present a validated mini-barcode assay based on partial sequences of the cytochrome oxidase I gene that can reliably identify the processed fins of seven of the eight CITES listed shark species. We also demonstrate that the assay can even frequently identify the species or genus of origin of shark fin soup (31 out of 50 samples.

  14. High Interlaboratory Reprocucibility of DNA Sequence-based Typing of Bacteria in a Multicenter Study

    DEFF Research Database (Denmark)

    Sousa, MA de; Boye, Kit; Lencastre, H de

    2006-01-01

    Current DNA amplification-based typing methods for bacterial pathogens often lack interlaboratory reproducibility. In this international study, DNA sequence-based typing of the Staphylococcus aureus protein A gene (spa, 110 to 422 bp) showed 100% intra- and interlaboratory reproducibility without...... extensive harmonization of protocols for 30 blind-coded S. aureus DNA samples sent to 10 laboratories. Specialized software for automated sequence analysis ensured a common typing nomenclature....

  15. Targeted DNA Methylation Analysis by High Throughput Sequencing in Porcine Peri-attachment Embryos

    OpenAIRE

    MORRILL, Benson H.; COX, Lindsay; WARD, Anika; HEYWOOD, Sierra; PRATHER, Randall S.; ISOM, S. Clay

    2013-01-01

    Abstract The purpose of this experiment was to implement and evaluate the effectiveness of a next-generation sequencing-based method for DNA methylation analysis in porcine embryonic samples. Fourteen discrete genomic regions were amplified by PCR using bisulfite-converted genomic DNA derived from day 14 in vivo-derived (IVV) and parthenogenetic (PA) porcine embryos as template DNA. Resulting PCR products were subjected to high-throughput sequencing using the Illumina Genome Analyzer IIx plat...

  16. Sequence analysis of mitochondrial DNA hypervariable region III of ...

    African Journals Online (AJOL)

    The aims of this research were to study mitochondrial DNA hypervariable region III and establish the degree of variation characteristic of a fragment. The mitochondrial DNA (mtDNA) is a small circular genome located within the mitochondria in the cytoplasm of the cell and a smaller 1.2 kb pair fragment, called the control ...

  17. Proceedings of the relevance of mass spectrometry to DNA sequence determination: Research needs for the Human Genome Program

    Energy Technology Data Exchange (ETDEWEB)

    Edmonds, C.G.; Smith, R.D. (Pacific Northwest Lab., Richland, WA (USA)); Smith, L.M. (Wisconsin Univ., Madison, WI (USA))

    1990-11-01

    A workshop was sponsored for the US Department of Energy (DOE), Office of Health and Environmental Research by Pacific Northwest Laboratory, April 4--5, 1990, in Seattle, Washington, to examine the potential role of mass spectrometry in the joint DOE/National Institutes of Health (NIH) Human Genome Program. The workshop was occasioned by recent developments in mass spectrometry that are providing new levels for selectivity, sensitivity, and, in particular, new methods of ionization appropriate for large biopolymers such as DNA. During discussions, three general mass spectrometric approaches to the determination of DNA sequence were considered: (1) the mass spectrometric detection of isotopic labels from DNA sequencing mixtures separated using gel electrophoresis, (2) the direct mass spectrometric analysis from direct ionization of unfractionated sequencing mixtures where the measured mass of the constituents functions to identify and order the base sequence (replacing separation by gel electrophoresis), and (3) an approach in which a single highly charged molecular ion of a large DNA segment produced is rapidly sequenced in an ion cyclotron resonance ion trap. The consensus of the workshop was that, on the basis of the new developments, mass spectrometry has the potential to provide the substantial increases in sequencing speed required for the Human Genome Program. 66 refs., 3 tabs.

  18. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

    Science.gov (United States)

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-11-16

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Low-Energy Electron-Induced Strand Breaks in Telomere-Derived DNA Sequences-Influence of DNA Sequence and Topology.

    Science.gov (United States)

    Rackwitz, Jenny; Bald, Ilko

    2018-03-26

    During cancer radiation therapy high-energy radiation is used to reduce tumour tissue. The irradiation produces a shower of secondary low-energy (DNA very efficiently by dissociative electron attachment. Recently, it was suggested that low-energy electron-induced DNA strand breaks strongly depend on the specific DNA sequence with a high sensitivity of G-rich sequences. Here, we use DNA origami platforms to expose G-rich telomere sequences to low-energy (8.8 eV) electrons to determine absolute cross sections for strand breakage and to study the influence of sequence modifications and topology of telomeric DNA on the strand breakage. We find that the telomeric DNA 5'-(TTA GGG) 2 is more sensitive to low-energy electrons than an intermixed sequence 5'-(TGT GTG A) 2 confirming the unique electronic properties resulting from G-stacking. With increasing length of the oligonucleotide (i.e., going from 5'-(GGG ATT) 2 to 5'-(GGG ATT) 4 ), both the variety of topology and the electron-induced strand break cross sections increase. Addition of K + ions decreases the strand break cross section for all sequences that are able to fold G-quadruplexes or G-intermediates, whereas the strand break cross section for the intermixed sequence remains unchanged. These results indicate that telomeric DNA is rather sensitive towards low-energy electron-induced strand breakage suggesting significant telomere shortening that can also occur during cancer radiation therapy. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. DNA sequence analyses of blended herbal products including synthetic cannabinoids as designer drugs.

    Science.gov (United States)

    Ogata, Jun; Uchiyama, Nahoko; Kikura-Hanajiri, Ruri; Goda, Yukihiro

    2013-04-10

    In recent years, various herbal products adulterated with synthetic cannabinoids have been distributed worldwide via the Internet. These herbal products are mostly sold as incense, and advertised as not for human consumption. Although their labels indicate that they contain mixtures of several potentially psychoactive plants, and numerous studies have reported that they contain a variety of synthetic cannabinoids, their exact botanical contents are not always clear. In this study, we investigated the origins of botanical materials in 62 Spice-like herbal products distributed on the illegal drug market in Japan, by DNA sequence analyses and BLAST searches. The nucleotide sequences of four regions were analyzed to identify the origins of each plant species in the herbal mixtures. The sequences of "Damiana" (Turnera diffusa) and Lamiaceae herbs (Mellissa, Mentha and Thymus) were frequently detected in a number of products. However, the sequences of other plant species indicated on the packaging labels were not detected. In a few products, DNA fragments of potent psychotropic plants were found, including marijuana (Cannabis sativa), "Diviner's Sage" (Salvia divinorum) and "Kratom" (Mitragyna speciosa). Their active constituents were also confirmed using gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS), although these plant names were never indicated on the labels. Most plant species identified in the products were different from the plants indicated on the labels. The plant materials would be used mainly as diluents for the psychoactive synthetic compounds, because no reliable psychoactive effects have been reported for most of the identified plants, with the exception of the psychotropic plants named above. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  1. First-principles study of high-conductance DNA sequencing with carbon nanotube electrodes

    KAUST Repository

    Chen, X.

    2012-03-26

    Rapid and cost-effective DNA sequencing at the single nucleotide level might be achieved by measuring a transverse electronic current as single-stranded DNA is pulled through a nanometer-sized pore. In order to enhance the electronic coupling between the nucleotides and the electrodes and hence the current signals, we employ a pair of single-walled close-ended (6,6) carbon nanotubes (CNTs) as electrodes. We then investigate the electron transport properties of nucleotides sandwiched between such electrodes by using first-principles quantum transport theory. In particular, we consider the extreme case where the separation between the electrodes is the smallest possible that still allows the DNA translocation. The benzene-like ring at the end cap of the CNT can strongly couple with the nucleobases and therefore it can both reduce conformational fluctuations and significantly improve the conductance. As such, when the electrodes are closely spaced, the nucleobases can pass through only with their base plane parallel to the plane of CNT end caps. The optimal molecular configurations, at which the nucleotides strongly couple to the CNTs, and which yield the largest transmission, are first identified. These correspond approximately to the lowest energy configurations. Then the electronic structures and the electron transport of these optimal configurations are analyzed. The typical tunneling currents are of the order of 50 nA for voltages up to 1 V. At higher bias, where resonant transport through the molecular states is possible, the current is of the order of several μA. Below 1 V, the currents associated to the different nucleotides are consistently distinguishable, with adenine having the largest current, guanine the second largest, cytosine the third and, finally, thymine the smallest. We further calculate the transmission coefficient profiles as the nucleotides are dragged along the DNA translocation path and investigate the effects of configurational variations

  2. cDNA sequencing improves the detection of P53 missense mutations in colorectal cancer

    International Nuclear Information System (INIS)

    Szybka, Malgorzata; Kordek, Radzislaw; Zakrzewska, Magdalena; Rieske, Piotr; Pasz-Walczak, Grazyna; Kulczycka-Wojdala, Dominika; Zawlik, Izabela; Stawski, Robert; Jesionek-Kupnicka, Dorota; Liberski, Pawel P

    2009-01-01

    Recently published data showed discrepancies beteween P53 cDNA and DNA sequencing in glioblastomas. We hypothesised that similar discrepancies may be observed in other human cancers. To this end, we analyzed 23 colorectal cancers for P53 mutations and gene expression using both DNA and cDNA sequencing, real-time PCR and immunohistochemistry. We found P53 gene mutations in 16 cases (15 missense and 1 nonsense). Two of the 15 cases with missense mutations showed alterations based only on cDNA, and not DNA sequencing. Moreover, in 6 of the 15 cases with a cDNA mutation those mutations were difficult to detect in the DNA sequencing, so the results of DNA analysis alone could be misinterpreted if the cDNA sequencing results had not also been available. In all those 15 cases, we observed a higher ratio of the mutated to the wild type template by cDNA analysis, but not by the DNA analysis. Interestingly, a similar overexpression of P53 mRNA was present in samples with and without P53 mutations. In terms of colorectal cancer, those discrepancies might be explained under three conditions: 1, overexpression of mutated P53 mRNA in cancer cells as compared with normal cells; 2, a higher content of cells without P53 mutation (normal cells and cells showing K-RAS and/or APC but not P53 mutation) in samples presenting P53 mutation; 3, heterozygous or hemizygous mutations of P53 gene. Additionally, for heterozygous mutations unknown mechanism(s) causing selective overproduction of mutated allele should also be considered. Our data offer new clues for studying discrepancy in P53 cDNA and DNA sequencing analysis

  3. Transcriptome sequencing in prostate cancer identifies inter-tumor heterogeneity

    Directory of Open Access Journals (Sweden)

    Janet Mendonca

    2015-06-01

    Full Text Available Given the dearth of gene mutations in prostate cancer, [1] ,[2] it is likely that genomic rearrangements play a significant role in the evolution of prostate cancer. However, in the search for recurrent genomic alterations, "private alterations" have received less attention. Such alterations may provide insights into the evolution, behavior, and clinical outcome of an individual tumor. In a recent report in "Genome Biology" Wyatt et al. [3] defines unique alterations in a cohort of high-risk prostate cancer patient with a lethal phenotype. Utilizing a transcriptome sequencing approach they observe high inter-tumor heterogeneity; however, the genes altered distill into three distinct cancer-relevant pathways. Their analysis reveals the presence of several non-ETS fusions, which may contribute to the phenotype of individual tumors, and have significance for disease progression.

  4. Functional role of a highly repetitive DNA sequence in anchorage of the mouse genome.

    Science.gov (United States)

    Neuer-Nitsche, B; Lu, X N; Werner, D

    1988-09-12

    The major portion of the eukaryotic genome consists of various categories of repetitive DNA sequences which have been studied with respect to their base compositions, organizations, copy numbers, transcription and species specificities; their biological roles, however, are still unclear. A novel quality of a highly repetitive mouse DNA sequence is described which points to a functional role: All copies (approximately 50,000 per haploid genome) of this DNA sequence reside on genomic Alu I DNA fragments each associated with nuclear polypeptides that are not released from DNA by proteinase K, SDS and phenol extraction. By this quality the repetitive DNA sequence is classified as a member of the sub-set of DNA sequences involved in tight DNA-polypeptide complexes which have been previously shown to be components of the subnuclear structure termed 'nuclear matrix'. From these results it has to be concluded that the repetitive DNA sequence characterized in this report represents or comprises a signal for a large number of site specific attachment points of the mouse genome in the nuclear matrix.

  5. Sequencing historical specimens: successful preparation of small specimens with low amounts of degraded DNA.

    Science.gov (United States)

    Sproul, John S; Maddison, David R

    2017-11-01

    Despite advances that allow DNA sequencing of old museum specimens, sequencing small-bodied, historical specimens can be challenging and unreliable as many contain only small amounts of fragmented DNA. Dependable methods to sequence such specimens are especially critical if the specimens are unique. We attempt to sequence small-bodied (3-6 mm) historical specimens (including nomenclatural types) of beetles that have been housed, dried, in museums for 58-159 years, and for which few or no suitable replacement specimens exist. To better understand ideal approaches of sample preparation and produce preparation guidelines, we compared different library preparation protocols using low amounts of input DNA (1-10 ng). We also explored low-cost optimizations designed to improve library preparation efficiency and sequencing success of historical specimens with minimal DNA, such as enzymatic repair of DNA. We report successful sample preparation and sequencing for all historical specimens despite our low-input DNA approach. We provide a list of guidelines related to DNA repair, bead handling, reducing adapter dimers and library amplification. We present these guidelines to facilitate more economical use of valuable DNA and enable more consistent results in projects that aim to sequence challenging, irreplaceable historical specimens. © 2017 John Wiley & Sons Ltd.

  6. Sequence-Dependent Diastereospecific and Diastereodivergent Crosslinking of DNA by Decarbamoylmitomycin C.

    Science.gov (United States)

    Aguilar, William; Paz, Manuel M; Vargas, Anayatzinc; Clement, Cristina C; Cheng, Shu-Yuan; Champeil, Elise

    2018-04-20

    Mitomycin C (MC), a potent antitumor drug, and decarbamoylmitomycin C (DMC), a derivative lacking the carbamoyl group, form highly cytotoxic DNA interstrand crosslinks. The major interstrand crosslink formed by DMC is the C1'' epimer of the major crosslink formed by MC. The molecular basis for the stereochemical configuration exhibited by DMC was investigated using biomimetic synthesis. The formation of DNA-DNA crosslinks by DMC is diastereospecific and diastereodivergent: Only the 1''S-diastereomer of the initially formed monoadduct can form crosslinks at GpC sequences, and only the 1''R-diastereomer of the monoadduct can form crosslinks at CpG sequences. We also show that CpG and GpC sequences react with divergent diastereoselectivity in the first alkylation step: 1"S stereochemistry is favored at GpC sequences and 1''R stereochemistry is favored at CpG sequences. Therefore, the first alkylation step results, at each sequence, in the selective formation of the diastereomer able to generate an interstrand DNA-DNA crosslink after the "second arm" alkylation. Examination of the known DNA adduct pattern obtained after treatment of cancer cell cultures with DMC indicates that the GpC sequence is the major target for the formation of DNA-DNA crosslinks in vivo by this drug. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Nanobiosensor for Detection and Quantification of DNA Sequences in Degraded Mixed Meats

    Directory of Open Access Journals (Sweden)

    M. E. Ali

    2011-01-01

    Full Text Available A novel class of nanobiosensor was developed by integrating a 27-nucleotide AluI fragment of swine cytochrome b (cytb gene to a 3-nm diameter citrate-tannate coated gold nanoparticle (GNP. The biosensor detected 0.5% and 1% pork in raw and 2.5-h autoclaved pork-beef binary admixtures in a single step without any separation or washing. The hybridization kinetics of the hybrid sensor was studied with synthetic and AluI digested real pork targets from moderate to extreme target concentrations and a sigmoidal relationship was found. Using the kinetic curve, a convenient method for quantifying and counting target DNA copy number was developed. The accuracy of the method was over 90% and 80% for raw and autoclaved pork-beef binary admixtures in the range of 5–100% pork adulteration. The biosensor probe identified a target DNA sequence that was several-folds shorter than a typical PCR-template. This offered the detection and quantitation of potential targets in highly processed or degraded samples where PCR amplification was not possible due to template crisis. The assay was a viable alternative approach of qPCR for detecting, quantifying and counting copy number of shorter size DNA sequences to address a wide ranging biological problem in food industry, diagnostic laboratories and forensic medicine.

  8. Deep sequencing reveals distinct patterns of DNA methylation in prostate cancer.

    Science.gov (United States)

    Kim, Jung H; Dhanasekaran, Saravana M; Prensner, John R; Cao, Xuhong; Robinson, Daniel; Kalyana-Sundaram, Shanker; Huang, Christina; Shankar, Sunita; Jing, Xiaojun; Iyer, Matthew; Hu, Ming; Sam, Lee; Grasso, Catherine; Maher, Christopher A; Palanisamy, Nallasivam; Mehra, Rohit; Kominsky, Hal D; Siddiqui, Javed; Yu, Jindan; Qin, Zhaohui S; Chinnaiyan, Arul M

    2011-07-01

    Beginning with precursor lesions, aberrant DNA methylation marks the entire spectrum of prostate cancer progression. We mapped the global DNA methylation patterns in select prostate tissues and cell lines using MethylPlex-next-generation sequencing (M-NGS). Hidden Markov model-based next-generation sequence analysis identified ∼68,000 methylated regions per sample. While global CpG island (CGI) methylation was not differential between benign adjacent and cancer samples, overall promoter CGI methylation significantly increased from ~12.6% in benign samples to 19.3% and 21.8% in localized and metastatic cancer tissues, respectively (P-value prostate tissues, 2481 differentially methylated regions (DMRs) are cancer-specific, including numerous novel DMRs. A novel cancer-specific DMR in the WFDC2 promoter showed frequent methylation in cancer (17/22 tissues, 6/6 cell lines), but not in the benign tissues (0/10) and normal PrEC cells. Integration of LNCaP DNA methylation and H3K4me3 data suggested an epigenetic mechanism for alternate transcription start site utilization, and these modifications segregated into distinct regions when present on the same promoter. Finally, we observed differences in repeat element methylation, particularly LINE-1, between ERG gene fusion-positive and -negative cancers, and we confirmed this observation using pyrosequencing on a tissue panel. This comprehensive methylome map will further our understanding of epigenetic regulation in prostate cancer progression.

  9. Dragon polya spotter: Predictor of poly(A) motifs within human genomic DNA sequences

    KAUST Repository

    Kalkatawi, Manal M.

    2011-11-15

    Motivation: Recognition of poly(A) signals in mRNA is relatively straightforward due to the presence of easily recognizable polyadenylic acid tail. However, the task of identifying poly(A) motifs in the primary genomic DNA sequence that correspond to poly(A) signals in mRNA is a far more challenging problem. Recognition of poly(A) signals is important for better gene annotation and understanding of the gene regulation mechanisms. In this work, we present one such poly(A) motif prediction method based on properties of human genomic DNA sequence surrounding a poly(A) motif. These properties include thermodynamic, physico-chemical and statistical characteristics. For predictions, we developed Artificial Neural Network and Random Forest models. These models are trained to recognize 12 most common poly(A) motifs in human DNA. Our predictors are available as a free web-based tool accessible at http://cbrc.kaust.edu.sa/dps. Compared with other reported predictors, our models achieve higher sensitivity and specificity and furthermore provide a consistent level of accuracy for 12 poly(A) motif variants. The Author(s) 2011. Published by Oxford University Press. All rights reserved.

  10. More on contamination: the use of asymmetric molecular behavior to identify authentic ancient human DNA

    DEFF Research Database (Denmark)

    Malmström, Helena; Svensson, Emma M; Gilbert, M Thomas P

    2007-01-01

    concerning the authenticity of such data. Although several methods have been developed to the purpose of authenticating ancient DNA (aDNA) results, while they are useful in faunal research, most of the methods have proven complicated to apply to ancient human DNA. Here, we investigate in detail...... the reliability of one of the proposed criteria, that of appropriate molecular behavior. Using real-time polymerase chain reaction (PCR) and pyrosequencing, we have quantified the relative levels of authentic aDNA and contaminant human DNA sequences recovered from archaeological dog and cattle remains. In doing...

  11. COMPETITIVE METAGENOMIC DNA HYBRIDIZATION IDENTIFIES HOST-SPECIFIC GENETIC MARKERS IN HUMAN FECAL MICROBIAL COMMUNITIES

    Science.gov (United States)

    Although recent technological advances in DNA sequencing and computational biology now allow scientists to compare entire microbial genomes, the use of these approaches to discern key genomic differences between natural microbial communities remains prohibitively expensive for mo...

  12. Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples

    Directory of Open Access Journals (Sweden)

    Mullen Michael P

    2012-01-01

    Full Text Available Abstract Background The central role of the somatotrophic axis in animal post-natal growth, development and fertility is well established. Therefore, the identification of genetic variants affecting quantitative traits within this axis is an attractive goal. However, large sample numbers are a pre-requisite for the identification of genetic variants underlying complex traits and although technologies are improving rapidly, high-throughput sequencing of large numbers of complete individual genomes remains prohibitively expensive. Therefore using a pooled DNA approach coupled with target enrichment and high-throughput sequencing, the aim of this study was to identify polymorphisms and estimate allele frequency differences across 83 candidate genes of the somatotrophic axis, in 150 Holstein-Friesian dairy bulls divided into two groups divergent for genetic merit for fertility. Results In total, 4,135 SNPs and 893 indels were identified during the resequencing of the 83 candidate genes. Nineteen percent (n = 952 of variants were located within 5' and 3' UTRs. Seventy-two percent (n = 3,612 were intronic and 9% (n = 464 were exonic, including 65 indels and 236 SNPs resulting in non-synonymous substitutions (NSS. Significant (P ® MassARRAY. No significant differences (P > 0.1 were observed between the two methods for any of the 43 SNPs across both pools (i.e., 86 tests in total. Conclusions The results of the current study support previous findings of the use of DNA sample pooling and high-throughput sequencing as a viable strategy for polymorphism discovery and allele frequency estimation. Using this approach we have characterised the genetic variation within genes of the somatotrophic axis and related pathways, central to mammalian post-natal growth and development and subsequent lactogenesis and fertility. We have identified a large number of variants segregating at significantly different frequencies between cattle groups divergent for calving

  13. Clinical identification of bacteria in human chronic wound infections: culturing vs. 16S ribosomal DNA sequencing

    Directory of Open Access Journals (Sweden)

    Rhoads Daniel D

    2012-11-01

    Full Text Available Abstract Background Chronic wounds affect millions of people and cost billions of dollars in the United States each year. These wounds harbor polymicrobial biofilm communities, which can be difficult to elucidate using culturing methods. Clinical molecular microbiological methods are increasingly being employed to investigate the microbiota of chronic infections, including wounds, as part of standard patient care. However, molecular testing is more sensitive than culturing, which results in markedly different results being reported to clinicians. This study compares the results of aerobic culturing and molecular testing (culture-free 16S ribosomal DNA sequencing, and it examines the relative abundance score that is generated by the molecular test and the usefulness of the relative abundance score in predicting the likelihood that the same organism would be detected by culture. Methods Parallel samples from 51 chronic wounds were studied using aerobic culturing and 16S DNA sequencing for the identification of bacteria. Results One hundred forty-five (145 unique genera were identified using molecular methods, and 68 of these genera were aerotolerant. Fourteen (14 unique genera were identified using aerobic culture methods. One-third (31/92 of the cultures were determined to be Staphylococcus aureus, Pseudomonas aeruginosa, and Enterococcus faecalis with higher relative abundance scores were more likely to be detected by culture as demonstrated with regression modeling. Conclusion Discordance between molecular and culture testing is often observed. However, culture-free 16S ribosomal DNA sequencing and its relative abundance score can provide clinicians with insight into which bacteria are most abundant in a sample and which are most likely to be detected by culture.

  14. Cloning, sequencing, and expression of dnaK-operon proteins from the thermophilic bacterium Thermus thermophilus.

    Science.gov (United States)

    Osipiuk, J; Joachimiak, A

    1997-09-12

    We propose that the dnaK operon of Thermus thermophilus HB8 is composed of three functionally linked genes: dnaK, grpE, and dnaJ. The dnaK and dnaJ gene products are most closely related to their cyanobacterial homologs. The DnaK protein sequence places T. thermophilus in the plastid Hsp70 subfamily. In contrast, the grpE translated sequence is most similar to GrpE from Clostridium acetobutylicum, a Gram-positive anaerobic bacterium. A single promoter region, with homology to the Escherichia coli consensus promoter sequences recognized by the sigma70 and sigma32 transcription factors, precedes the postulated operon. This promoter is heat-shock inducible. The dnaK mRNA level increased more than 30 times upon 10 min of heat shock (from 70 degrees C to 85 degrees C). A strong transcription terminating sequence was found between the dnaK and grpE genes. The individual genes were cloned into pET expression vectors and the thermophilic proteins were overproduced at high levels in E. coli and purified to homogeneity. The recombinant T. thermophilus DnaK protein was shown to have a weak ATP-hydrolytic activity, with an optimum at 90 degrees C. The ATPase was stimulated by the presence of GrpE and DnaJ. Another open reading frame, coding for ClpB heat-shock protein, was found downstream of the dnaK operon.

  15. Association of Amine-Receptor DNA Sequence Variants with Associative Learning in the Honeybee.

    Science.gov (United States)

    Lagisz, Malgorzata; Mercer, Alison R; de Mouzon, Charlotte; Santos, Luana L S; Nakagawa, Shinichi

    2016-03-01

    Octopamine- and dopamine-based neuromodulatory systems play a critical role in learning and learning-related behaviour in insects. To further our understanding of these systems and resulting phenotypes, we quantified DNA sequence variations at six loci coding octopamine-and dopamine-receptors and their association with aversive and appetitive learning traits in a population of honeybees. We identified 79 polymorphic sequence markers (mostly SNPs and a few insertions/deletions) located within or close to six candidate genes. Intriguingly, we found that levels of sequence variation in the protein-coding regions studied were low, indicating that sequence variation in the coding regions of receptor genes critical to learning and memory is strongly selected against. Non-coding and upstream regions of the same genes, however, were less conserved and sequence variations in these regions were weakly associated with between-individual differences in learning-related traits. While these associations do not directly imply a specific molecular mechanism, they suggest that the cross-talk between dopamine and octopamine signalling pathways may influence olfactory learning and memory in the honeybee.

  16. Characterization of Satellite DNA Sequences from the Commercially Important Marine Rotifers Brachionus rotundiformis and Brachionus plicatilis.

    Science.gov (United States)

    Boehm; Gibson; Lubzens

    2000-01-01

    This study was initiated to search for species-specific and strain-specific satellite DNA sequences for which oligonucleotide primers could be designed to differentiate between various commercially important strains of the marine monogonont rotifers Brachionus rotundiformis and Brachionus plicatilis. Two unrelated, highly reiterated satellite sequences were cloned and characterized. The eight sequenced monomers from B. rotundiformis and six from B. plicatilis had low intrarepeat variability and were similar in their overall lengths, A + T compositions, and high degrees of repeated motif substructure. However, hybridizations to 19 representative strains, sequence characterizations, and GenBank searches indicated that these two satellites are morphotype-specific and population-specific, respectively, and share little homology to each other or to other characterized sequences in the database. Primer pairs designed for the B. rotundiformis satellite confirmed hybridization specificities on polymerase chain reaction and could serve as a useful molecular diagnostic tool to identify strains belonging to the SS morphotype, which are gaining widespread usage as first feeds for marine fish in commercial production.

  17. Analysis of T-DNA/Host-Plant DNA Junction Sequences in Single-Copy Transgenic Barley Lines

    Directory of Open Access Journals (Sweden)

    Joanne G. Bartlett

    2014-01-01

    Full Text Available Sequencing across the junction between an integrated transfer DNA (T-DNA and a host plant genome provides two important pieces of information. The junctions themselves provide information regarding the proportion of T-DNA which has integrated into the host plant genome, whilst the transgene flanking sequences can be used to study the local genetic environment of the integrated transgene. In addition, this information is important in the safety assessment of GM crops and essential for GM traceability. In this study, a detailed analysis was carried out on the right-border T-DNA junction sequences of single-copy independent transgenic barley lines. T-DNA truncations at the right-border were found to be relatively common and affected 33.3% of the lines. In addition, 14.3% of lines had rearranged construct sequence after the right border break-point. An in depth analysis of the host-plant flanking sequences revealed that a significant proportion of the T-DNAs integrated into or close to known repetitive elements. However, this integration into repetitive DNA did not have a negative effect on transgene expression.

  18. Two dimensional molecular electronics spectroscopy for molecular fingerprinting, DNA sequencing, and cancerous DNA recognition.

    Science.gov (United States)

    Rajan, Arunkumar Chitteth; Rezapour, Mohammad Reza; Yun, Jeonghun; Cho, Yeonchoo; Cho, Woo Jong; Min, Seung Kyu; Lee, Geunsik; Kim, Kwang S

    2014-02-25

    Laser-driven molecular spectroscopy of low spatial resolution is widely used, while electronic current-driven molecular spectroscopy of atomic scale resolution has been limited because currents provide only minimal information. However, electron transmission of a graphene nanoribbon on which a molecule is adsorbed shows molecular fingerprints of Fano resonances, i.e., characteristic features of frontier orbitals and conformations of physisorbed molecules. Utilizing these resonance profiles, here we demonstrate two-dimensional molecular electronics spectroscopy (2D MES). The differential conductance with respect to bias and gate voltages not only distinguishes different types of nucleobases for DNA sequencing but also recognizes methylated nucleobases which could be related to cancerous cell growth. This 2D MES could open an exciting field to recognize single molecule signatures at atomic resolution. The advantages of the 2D MES over the one-dimensional (1D) current analysis can be comparable to those of 2D NMR over 1D NMR analysis.

  19. Frequency of Usher syndrome type 1 in deaf children by massively parallel DNA sequencing.

    Science.gov (United States)

    Yoshimura, Hidekane; Miyagawa, Maiko; Kumakawa, Kozo; Nishio, Shin-Ya; Usami, Shin-Ichi

    2016-05-01

    Usher syndrome type 1 (USH1) is the most severe of the three USH subtypes due to its profound hearing loss, absent vestibular response and retinitis pigmentosa appearing at a prepubescent age. Six causative genes have been identified for USH1, making early diagnosis and therapy possible through DNA testing. Targeted exon sequencing of selected genes using massively parallel DNA sequencing (MPS) technology enables clinicians to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using MPS along with direct sequence analysis, we screened 227 unrelated non-syndromic deaf children and detected recessive mutations in USH1 causative genes in five patients (2.2%): three patients harbored MYO7A mutations and one each carried CDH23 or PCDH15 mutations. As indicated by an earlier genotype-phenotype correlation study of the CDH23 and PCDH15 genes, we considered the latter two patients to have USH1. Based on clinical findings, it was also highly likely that one patient with MYO7A mutations possessed USH1 due to a late onset age of walking. This first report describing the frequency (1.3-2.2%) of USH1 among non-syndromic deaf children highlights the importance of comprehensive genetic testing for early disease diagnosis.

  20. Alu polymerase chain reaction: A method for rapid isolation of human-specific sequences from complex DNA sources

    International Nuclear Information System (INIS)

    Nelson, D.L.; Ledbetter, S.A.; Corbo, L.; Victoria, M.F.; Ramirez-Solis, R.; Webster, T.D.; Ledbetter, D.H.; Caskey, C.T.

    1989-01-01

    Current efforts to map the human genome are focused on individual chromosomes or smaller regions and frequently rely on the use of somatic cell hybrids. The authors report the application of the polymerase chain reaction to direct amplification of human DNA from hybrid cells containing regions of the human genome in rodent cell backgrounds using primers directed to the human Alu repeat element. They demonstrate Alu-directed amplification of a fragment of the human HPRT gene from both hybrid cell and cloned DNA and identify through sequence analysis the Alu repeats involved in this amplification. They also demonstrate the application of this technique to identify the chromosomal locations of large fragments of the human X chromosome cloned in a yeast artificial chromosome and the general applicability of the method to the preparation of DNA probes from cloned human sequences. The technique allows rapid gene mapping and provides a simple method for the isolation and analysis of specific chromosomal regions

  1. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Science.gov (United States)

    Greub, Gilbert; Kebbi-Beghdadi, Carole; Bertelli, Claire; Collyn, François; Riederer, Beat M; Yersin, Camille; Croxatto, Antony; Raoult, Didier

    2009-12-23

    With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when containing highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during gap closure stage is clearly medically counterproductive. We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics to rapidly identify new immunogenic proteins useful to develop in the future specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  2. Opsin cDNA sequences of a UV and green rhodopsin of the satyrine butterfly Bicyclus anynana.

    Science.gov (United States)

    Vanhoutte, K J A; Eggen, B J L; Janssen, J J M; Stavenga, D G

    2002-11-01

    The cDNAs of an ultraviolet (UV) and long-wavelength (LW) (green) absorbing rhodopsin of the bush brown Bicyclus anynana were partially identified. The UV sequence, encoding 377 amino acids, is 76-79% identical to the UV sequences of the papilionids Papilio glaucus and Papilio xuthus and the moth Manduca sexta. A dendrogram derived from aligning the amino acid sequences reveals an equidistant position of Bicyclus between Papilio and Manduca. The sequence of the green opsin cDNA fragment, which encodes 242 amino acids, represents six of the seven transmembrane regions. At the amino acid level, this fragment is more than 80% identical to the corresponding LW opsin sequences of Dryas, Heliconius, Papilio (rhodopsin 2) and Manduca. Whereas three LW absorbing rhodopsins were identified in the papilionid butterflies, only one green opsin was found in B. anynana.

  3. Real sequence effects on the search dynamics of transcription factors on DNA

    DEFF Research Database (Denmark)

    Bauer, Maximilian; Rasmussen, Emil S.; Lomholt, Michael A.

    2015-01-01

    Recent experiments show that transcription factors (TFs) indeed use the facilitated diffusion mechanism to locate their target sequences on DNA in living bacteria cells: TFs alternate between sliding motion along DNA and relocation events through the cytoplasm. From simulations and theoretical...... analysis we study the TF-sliding motion for a large section of the DNA-sequence of a common E. coli strain, based on the two-state TF-model with a fast-sliding search state and a recognition state enabling target detection. For the probability to detect the target before dissociating from DNA the TF...... on the underlying nucleotide sequence is varied. A moderate dependence maximises the capability to distinguish between the main operator and similar sequences. Moreover, these auxiliary operators serve as starting points for DNA looping with the main operator, yielding a spectrum of target detection times spanning...

  4. Targeting and tracing of specific DNA sequences with dTALEs in living cells

    Science.gov (United States)

    Thanisch, Katharina; Schneider, Katrin; Morbitzer, Robert; Solovei, Irina; Lahaye, Thomas; Bultmann, Sebastian; Leonhardt, Heinrich

    2014-01-01

    Epigenetic regulation of gene expression involves, besides DNA and histone modifications, the relative positioning of DNA sequences within the nucleus. To trace specific DNA sequences in living cells, we used programmable sequence-specific DNA binding of designer transcription activator-like effectors (dTALEs). We designed a recombinant dTALE (msTALE) with variable repeat domains to specifically bind a 19-bp target sequence of major satellite DNA. The msTALE was fused with green fluorescent protein (GFP) and stably expressed in mouse embryonic stem cells. Hybridization with a major satellite probe (3D-fluorescent in situ hybridization) and co-staining for known cellular structures confirmed in vivo binding of the GFP-msTALE to major satellite DNA present at nuclear chromocenters. Dual tracing of major satellite DNA and the replication machinery throughout S-phase showed co-localization during mid to late S-phase, directly demonstrating the late replication timing of major satellite DNA. Fluorescence bleaching experiments indicated a relatively stable but still dynamic binding, with mean residence times in the range of minutes. Fluorescently labeled dTALEs open new perspectives to target and trace DNA sequences and to monitor dynamic changes in subnuclear positioning as well as interactions with functional nuclear structures during cell cycle progression and cellular differentiation. PMID:24371265

  5. Targeting and tracing of specific DNA sequences with dTALEs in living cells.

    Science.gov (United States)

    Thanisch, Katharina; Schneider, Katrin; Morbitzer, Robert; Solovei, Irina; Lahaye, Thomas; Bultmann, Sebastian; Leonhardt, Heinrich

    2014-04-01

    Epigenetic regulation of gene expression involves, besides DNA and histone modifications, the relative positioning of DNA sequences within the nucleus. To trace specific DNA sequences in living cells, we used programmable sequence-specific DNA binding of designer transcription activator-like effectors (dTALEs). We designed a recombinant dTALE (msTALE) with variable repeat domains to specifically bind a 19-bp target sequence of major satellite DNA. The msTALE was fused with green fluorescent protein (GFP) and stably expressed in mouse embryonic stem cells. Hybridization with a major satellite probe (3D-fluorescent in situ hybridization) and co-staining for known cellular structures confirmed in vivo binding of the GFP-msTALE to major satellite DNA present at nuclear chromocenters. Dual tracing of major satellite DNA and the replication machinery throughout S-phase showed co-localization during mid to late S-phase, directly demonstrating the late replication timing of major satellite DNA. Fluorescence bleaching experiments indicated a relatively stable but still dynamic binding, with mean residence times in the range of minutes. Fluorescently labeled dTALEs open new perspectives to target and trace DNA sequences and to monitor dynamic changes in subnuclear positioning as well as interactions with functional nuclear structures during cell cycle progression and cellular differentiation.

  6. DNA repair-related genes in sugarcane expressed sequence tags (ESTs

    Directory of Open Access Journals (Sweden)

    R.M.A. Costa

    2001-12-01

    Full Text Available There is much interest in the identification and characterization of genes involved in DNA repair because of their importance in the maintenance of the genome integrity. The high level of conservation of DNA repair genes means that these genetic elements may be used in phylogenetic studies as a source of information on the genetic origin and evolution of species. The mechanisms by which damaged DNA is repaired are well understood in bacteria, yeast and mammals, but much remains to be learned as regards plants. We identified genes involved in DNA repair mechanisms in sugarcane using a similarity search of the Brazilian Sugarcane Expressed Sequence Tag (SUCEST database against known sequences deposited in other public databases (National Center of Biotechnology Information (NCBI database and the Munich Information Center for Protein Sequences (MIPS Arabidopsis thaliana database. This search revealed that most of the various proteins involved in DNA repair in sugarcane are similar to those found in other eukaryotes. However, we also identified certain intriguing features found only in plants, probably due to the independent evolution of this kingdom. The DNA repair mechanisms investigated include photoreactivation, base excision repair, nucleotide excision repair, mismatch repair, non-homologous end joining, homologous recombination repair and DNA lesion tolerance. We report the main differences found in the DNA repair machinery in plant cells as compared to other organisms. These differences point to potentially different strategies plants employ to deal with DNA damage, that deserve further investigation.A identificação e caracterização de genes envolvidos com reparo de DNA são de grande interesse, dada a sua importância na manutenção da integridade genômica. Além disso, a alta conservação dos genes de reparo de DNA faz com que possam ser utilizados como fonte de informação no que diz respeito à origem e evolução das esp

  7. MotifMark: Finding Regulatory Motifs in DNA Sequences

    OpenAIRE

    Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L.; Wang, May D.

    2017-01-01

    The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity be...

  8. Sequence analysis of the canine mitochondrial DNA control region from shed hair samples in criminal investigations.

    Science.gov (United States)

    Berger, C; Berger, B; Parson, W

    2012-01-01

    In recent years, evidence from domestic dogs has increasingly been analyzed by forensic DNA testing. Especially, canine hairs have proved most suitable and practical due to the high rate of hair transfer occurring between dogs and humans. Starting with the description of a contamination-free sample handling procedure, we give a detailed workflow for sequencing hypervariable segments (HVS) of the mtDNA control region from canine evidence. After the hair material is lysed and the DNA extracted by Phenol/Chloroform, the amplification and sequencing strategy comprises the HVS I and II of the canine control region and is optimized for DNA of medium-to-low quality and quantity. The sequencing procedure is based on the Sanger Big-dye deoxy-terminator method and the separation of the sequencing reaction products is performed on a conventional multicolor fluorescence detection capillary electrophoresis platform. Finally, software-aided base calling and sequence interpretation are addressed exemplarily.

  9. Molecular cloning of a human glycophorin B cDNA: nucleotide sequence and genomic relationship to glycophorin A

    International Nuclear Information System (INIS)

    Siebert, P.D.; Fukuda, M.

    1987-01-01

    The authors describe the isolation and nucleotide sequence of a human glycophorin B cDNA. The cDNA was identified by differential hybridization of synthetic oligonucleotide probes to a human erythroleukemic cell line (K562) cDNA library constructed in phage vector λgt10. The nucleotide sequence of the glycophorin B cDNA was compared with that of a previously cloned glycophorin A cDNA. The nucleotide sequences encoding the NH 2 -terminal leader peptide and first 26 amino acids of the two proteins are nearly identical. This homologous region is followed by areas specific to either glycophorin A or B and a number of small regions of homology, which in turn are followed by a very homologous region encoding the presumed membrane-spanning portion of the proteins. They used RNA blot hybridization with both cDNA and synthetic oligonucleotide probes to prove our previous hypothesis that glycophorin B is encoded by a single 0.5- to 0.6-kb mRNA and to show that glycophorins A and B are negatively and coordinately regulated by a tumor-promoting phorbol ester, phorbol 12-myristate 13-acetate. They established the intron/exon structure of the glycophorin A and B genes by oligonucleotide mapping; the results suggest a complex evolution of the glycophorin genes

  10. A duplex DNA-gold nanoparticle probe composed as a colorimetric biosensor for sequence-specific DNA-binding proteins.

    Science.gov (United States)

    Ahn, Junho; Choi, Yeonweon; Lee, Ae-Ree; Lee, Joon-Hwa; Jung, Jong Hwa

    2016-03-21

    Using duplex DNA-AuNP aggregates, a sequence-specific DNA-binding protein, SQUAMOSA Promoter-binding-Like protein 12 (SPL-12), was directly determined by SPL-12-duplex DNA interaction-based colorimetric actions of DNA-Au assemblies. In order to prepare duplex DNA-Au aggregates, thiol-modified DNA 1 and DNA 2 were attached onto the surface of AuNPs, respectively, by the salt-aging method and then the DNA-attached AuNPs were mixed. Duplex-DNA-Au aggregates having the average size of 160 nm diameter and the maximum absorption at 529 nm were able to recognize SPL-12 and reached the equivalent state by the addition of ∼30 equivalents of SPL-12 accompanying a color change from red to blue with a red shift of the maximum absorption at 570 nm. As a result, the aggregation size grew to about 247 nm. Also, at higher temperatures of the mixture of duplex-DNA-Au aggregate solution and SPL-12, the equivalent state was reached rapidly. On the contrary, in the control experiment using Bovine Serum Albumin (BSA), no absorption band shift of duplex-DNA-Au aggregates was observed.

  11. Complete chloroplast genome and 45S nrDNA sequences of the medicinal plant species Glycyrrhiza glabra and Glycyrrhiza uralensis.

    Science.gov (United States)

    Kang, Sang-Ho; Lee, Jeong-Hoon; Lee, Hyun Oh; Ahn, Byoung Ohg; Won, So Youn; Sohn, Seong-Han; Kim, Jung Sun

    2017-10-06

    Glycyrrhiza uralensis and G. glabra, members of the Fabaceae, are medicinally important species that are native to Asia and Europe. Extracts from these plants are widely used as natural sweeteners because of their much greater sweetness than sucrose. In this study, the three complete chloroplast genomes and five 45S nuclear ribosomal (nr)DNA sequences of these two licorice species and an interspecific hybrid are presented. The chloroplast genomes of G. glabra, G. uralensis and G. glabra × G. uralensis were 127,895 bp, 127,716 bp and 127,939 bp, respectively. The three chloroplast genomes harbored 110 annotated genes, including 76 protein-coding genes, 30 tRNA genes and 4 rRNA genes. The 45S nrDNA sequences were either 5,947 or 5,948 bp in length. Glycyrrhiza glabra and G. glabra × G. uralensis showed two types of nrDNA, while G. uralensis contained a single type. The complete 45S nrDNA sequence unit contains 18S rRNA, ITS1, 5.8S rRNA, ITS2 and 26S rRNA. We identified simple sequence repeat and tandem repeat sequences. We also developed four reliable markers for analysis of Glycyrrhiza diversity authentication.

  12. Ecological niche modelling and nDNA sequencing support a new, morphologically cryptic beetle species unveiled by DNA barcoding.

    Science.gov (United States)

    Hawlitschek, Oliver; Porch, Nick; Hendrich, Lars; Balke, Michael

    2011-02-09

    DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct, but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp.n. In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species.

  13. Complete sequences of the mitochondrial DNA of the wild Gracilariopsis lemaneiformis and two mutagenic cultivated breeds (Gracilariaceae, Rhodophyta.

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    Full Text Available The complete mitochondrial DNA (mtDNA of Gracilariopsis lemaneiformis was sequenced (25883 bp and mapped to a circular model. The A+T composition was 72.5%. Forty six genes and two potentially functional open reading frames were identified. They include 24 protein-coding genes, 2 rRNA genes, 20 tRNA genes and 2 ORFs (orf60, orf142. There is considerable sequence synteny across the five red algal mtDNAs falling into Florideophyceae including Gr. lemaneiformis in this study and previously sequenced species. A long stem-loop and a hairpin structure were identified in intergenic regions of mt genome of Gr. lemaneiformis, which are believed to be involved with transcription and replication. In addition, the mtDNAs of two mutagenic cultivated breeds ("981" and "07-2" were also sequenced. Compared with the mtDNA of wild Gr. lemaneiformis, the genome size and gene length and order of three strains were completely identical except nine base mutations including eight in the protein-coding genes and one in the tRNA gene. None of the base mutations caused frameshift or a premature stop codon in the mtDNA genes. Phylogenetic analyses based on mitochondrial protein-coding genes and rRNA genes demonstrated Gracilariopsis andersonii had closer phylogenetic relationship with its parasite Gracilariophila oryzoides than Gracilariopsis lemaneiformis which was from the same genus of Gracilariopsis.

  14. Complete sequences of the mitochondrial DNA of the wild Gracilariopsis lemaneiformis and two mutagenic cultivated breeds (Gracilariaceae, Rhodophyta).

    Science.gov (United States)

    Zhang, Lei; Wang, Xumin; Qian, Hao; Chi, Shan; Liu, Cui; Liu, Tao

    2012-01-01

    The complete mitochondrial DNA (mtDNA) of Gracilariopsis lemaneiformis was sequenced (25883 bp) and mapped to a circular model. The A+T composition was 72.5%. Forty six genes and two potentially functional open reading frames were identified. They include 24 protein-coding genes, 2 rRNA genes, 20 tRNA genes and 2 ORFs (orf60, orf142). There is considerable sequence synteny across the five red algal mtDNAs falling into Florideophyceae including Gr. lemaneiformis in this study and previously sequenced species. A long stem-loop and a hairpin structure were identified in intergenic regions of mt genome of Gr. lemaneiformis, which are believed to be involved with transcription and replication. In addition, the mtDNAs of two mutagenic cultivated breeds ("981" and "07-2") were also sequenced. Compared with the mtDNA of wild Gr. lemaneiformis, the genome size and gene length and order of three strains were completely identical except nine base mutations including eight in the protein-coding genes and one in the tRNA gene. None of the base mutations caused frameshift or a premature stop codon in the mtDNA genes. Phylogenetic analyses based on mitochondrial protein-coding genes and rRNA genes demonstrated Gracilariopsis andersonii had closer phylogenetic relationship with its parasite Gracilariophila oryzoides than Gracilariopsis lemaneiformis which was from the same genus of Gracilariopsis.

  15. cDNA encoding a polypeptide including a hevein sequence

    Energy Technology Data Exchange (ETDEWEB)

    Raikhel, N.V.; Broekaert, W.F.; Namhai Chua; Kush, A.

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids.

  16. Bisulfite sequencing reveals that Aspergillus flavus holds a hollow in DNA methylation.

    Directory of Open Access Journals (Sweden)

    Si-Yang Liu

    Full Text Available Aspergillus flavus first gained scientific attention for its production of aflatoxin. The underlying regulation of aflatoxin biosynthesis has been serving as a theoretical model for biosynthesis of other microbial secondary metabolites. Nevertheless, for several decades, the DNA methylation status, one of the important epigenomic modifications involved in gene regulation, in A. flavus remains to be controversial. Here, we applied bisulfite sequencing in conjunction with a biological replicate strategy to investigate the DNA methylation profiling of A. flavus genome. Both the bisulfite sequencing data and the methylome comparisons with other fungi confirm that the DNA methylation level of this fungus is negligible. Further investigation into the DNA methyltransferase of Aspergillus uncovers its close relationship with RID-like enzymes as well as its divergence with the methyltransferase of species with validated DNA methylation. The lack of repeat contents of the A. flavus' genome and the high RIP-index of the small amount of remanent repeat potentially support our speculation that DNA methylation may be absent in A. flavus or that it may possess de novo DNA methylation which occurs very transiently during the obscure sexual stage of this fungal species. This work contributes to our understanding on the DNA methylation status of A. flavus, as well as reinforces our views on the DNA methylation in fungal species. In addition, our strategy of applying bisulfite sequencing to DNA methylation detection in species with low DNA methylation may serve as a reference for later scientific investigations in other hypomethylated species.

  17. Implementing reverse genetics in Rosaceae: analysis of T-DNA flanking sequences of insertional mutant lines in the diploid strawberry, Fragaria vesca.

    Science.gov (United States)

    Oosumi, Teruko; Ruiz-Rojas, Juan Jairo; Veilleux, Richard E; Dickerman, Allan; Shulaev, Vladimir

    2010-09-01

    Reverse genetics is used for functional genomics research in model plants. To establish a model system for the systematic reverse genetics research in the Rosaceae family, we analyzed genomic DNA flanking the T-DNA insertions in 191 transgenic plants of the diploid strawberry, Fragaria vesca. One hundred and seventy-six T-DNA flanking sequences were amplified from the right border (RB) and 37 from the left border (LB) by thermal asymmetric interlaced PCR. Analysis of the T-DNA nick positions revealed that T-DNA was most frequently nicked at the cleavage sites. Analysis of 11 T-DNA integration sites indicated that T-DNA was integrated into the F. vesca genome by illegitimate recombination, as reported in other model plants: Arabidopsis, rice and tobacco. First, deletion of DNA was found at T-DNA integration target sites in all transgenic plants tested. Second, microsimilarities of a few base pairs between the left and/or right ends of the T-DNA and genomic sites were found in all transgenic plants tested. Finally, filler DNA was identified in four break-points. Out of 191 transgenic plants, T-DNA flanking sequences of 79 plants (41%) showed significant similarity to genes, elements or proteins of other plant species and 67 (35%) of the sequences are still unknown strawberry gene fragments. T-DNA flanking sequences of 126 plants (66%) showed homology to plant ESTs. This is the first report of T-DNA integration in a sizeable population of a rosaceous species. We have shown in this paper that T-DNA integration in strawberry is not random but directed by sequence microsimilarities in the host genome.

  18. Microaspiration of esophageal gland cells and cDNA library construction for identifying parasitism genes of plant-parasitic nematodes.

    Science.gov (United States)

    Hussey, Richard S; Huang, Guozhong; Allen, Rex

    2011-01-01

    Identifying parasitism genes encoding proteins secreted from a plant-parasitic nematode's esophageal gland cells and injected through its stylet into plant tissue is the key to understanding the molecular basis of nematode parasitism of plants. Parasitism genes have been cloned by directly microaspirating the cytoplasm from the esophageal gland cells of different parasitic stages of cyst or root-knot nematodes to provide mRNA to create a gland cell-specific cDNA library by long-distance reverse-transcriptase polymerase chain reaction. cDNA clones are sequenced and deduced protein sequences with a signal peptide for secretion are identified for high-throughput in situ hybridization to confirm gland-specific expression.

  19. Sequence of a cDNA encoding turtle high mobility group 1 protein.

    Science.gov (United States)

    Zheng, Jifang; Hu, Bi; Wu, Duansheng

    2005-07-01

    In order to understand sequence information about turtle HMG1 gene, a cDNA encoding HMG1 protein of the Chinese soft-shell turtle (Pelodiscus sinensis) was amplified by RT-PCR from kidney total RNA, and was cloned, sequenced and analyzed. The results revealed that the open reading frame (ORF) of turtle HMG1 cDNA is 606 bp long. The ORF codifies 202 amino acid residues, from which two DNA-binding domains and one polyacidic region are derived. The DNA-binding domains share higher amino acid identity with homologues sequences of chicken (96.5%) and mammalian (74%) than homologues sequence of rainbow trout (67%). The polyacidic region shows 84.6% amino acid homology with the equivalent region of chicken HMG1 cDNA. Turtle HMG1 protein contains 3 Cys residues located at completely conserved positions. Conservation in sequence and structure suggests that the functions of turtle HMG1 cDNA may be highly conserved during evolution. To our knowledge, this is the first report of HMG1 cDNA sequence in any reptilian.

  20. HindIII identifies a two allele DNA polymorphism of the human cannabinoid receptor gene (CNR)

    Energy Technology Data Exchange (ETDEWEB)

    Caenazzo, L.; Hoehe, M.R.; Hsieh, W.T.; Berrettini, W.H.; Bonner, T.I.; Gershon, E.S. (National Inst. of Health, Bethesda, MD (United States))

    1991-09-11

    HCNR p5, a 0.9 kb BamHI/EcoRI fragment from the human cannabinoid receptor gene inserted into pUC19, was used as probe. The fragment is located in an intron approximately 14 kb 5{prime} of the initiation codon. This fragment is a clean single copy sequence by genomic blotting. Hybridization of human genomic DNA digested with HindIII identified a two allele RFLP with bands at 5.5 (A1) and 3.3 kb (A2). The human cannabinoid receptor gene has been genetically mapped in CEPH reference pedigrees to the centromeric/q region of chromosome 6. In situ hybridization localizes it to 6q14-q15. Codominant segregation has been observed in 26 informative two- and three-generation CEPH pedigrees and in 14 medium-sized disease families.

  1. Human pro. cap alpha. 1(III) collagen: cDNA sequence for the 3' end

    Energy Technology Data Exchange (ETDEWEB)

    Mankoo, B S; Dalgleish, R

    1988-03-25

    The authors have previously isolated two overlapping cDNA clones, pIII-21 and pIII-33, which encode the C-terminal end of human type III procollagen. They now present the sequence of 2520 bases encoded in these cDNAs which overlaps other previously published sequences for the same gene. The sequence presented differs from previously published sequences at five positions.

  2. Simultaneous identification of DNA and RNA viruses present in pig faeces using process-controlled deep sequencing.

    Directory of Open Access Journals (Sweden)

    Jana Sachsenröder

    Full Text Available BACKGROUND: Animal faeces comprise a community of many different microorganisms including bacteria and viruses. Only scarce information is available about the diversity of viruses present in the faeces of pigs. Here we describe a protocol, which was optimized for the purification of the total fraction of viral particles from pig faeces. The genomes of the purified DNA and RNA viruses were simultaneously amplified by PCR and subjected to deep sequencing followed by bioinformatic analyses. The efficiency of the method was monitored using a process control consisting of three bacteriophages (T4, M13 and MS2 with different morphology and genome types. Defined amounts of the bacteriophages were added to the sample and their abundance was assessed by quantitative PCR during the preparation procedure. RESULTS: The procedure was applied to a pooled faecal sample of five pigs. From this sample, 69,613 sequence reads were generated. All of the added bacteriophages were identified by sequence analysis of the reads. In total, 7.7% of the reads showed significant sequence identities with published viral sequences. They mainly originated from bacteriophages (73.9% and mammalian viruses (23.9%; 0.8% of the sequences showed identities to plant viruses. The most abundant detected porcine viruses were kobuvirus, rotavirus C, astrovirus, enterovirus B, sapovirus and picobirnavirus. In addition, sequences with identities to the chimpanzee stool-associated circular ssDNA virus were identified. Whole genome analysis indicates that this virus, tentatively designated as pig stool-associated circular ssDNA virus (PigSCV, represents a novel pig virus. CONCLUSION: The established protocol enables the simultaneous detection of DNA and RNA viruses in pig faeces including the identification of so far unknown viruses. It may be applied in studies investigating aetiology, epidemiology and ecology of diseases. The implemented process control serves as quality control, ensures

  3. Large-scale Identification of Expressed Sequence Tags (ESTs from Nicotianatabacum by Normalized cDNA Library Sequencing

    Directory of Open Access Journals (Sweden)

    Alvarez S Perez

    2014-12-01

    Full Text Available An expressed sequence tags (EST resource for tobacco plants (Nicotianatabacum was established using high-throughput sequencing of randomly selected clones from one cDNA library representing a range of plant organs (leaf, stem, root and root base. Over 5000 ESTs were generated from the 3’ ends of 8000 clones, analyzed by BLAST searches and categorized functionally. All annotated ESTs were classified into 18 functional categories, unique transcripts involved in energy were the largest group accounting for 831 (32.32% of the annotated ESTs. After excluding 2450 non-significant tentative unique transcripts (TUTs, 100 unique sequences (1.67% of total TUTs were identified from the N. tabacum database. In the array result two genes strongly related to the tobacco mosaic virus (TMV were obtained, one basic form of pathogenesis-related protein 1 precursor (TBT012G08 and ubiquitin (TBT087G01. Both of them were found in the variety Hongda, some other important genes were classified into two groups, one of these implicated in plant development like those genes related to a photosynthetic process (chlorophyll a-b binding protein, photosystem I, ferredoxin I and III, ATP synthase and a further group including genes related to plant stress response (ubiquitin, ubiquitin-like protein SMT3, glycine-rich RNA binding protein, histones and methallothionein. The interesting finding in this study is that two of these genes have never been reported before in N. tabacum (ubiquitin-like protein SMT3 and methallothionein. The array results were confirmed using quantitative PCR.

  4. Application of Quaternion in improving the quality of global sequence alignment scores for an ambiguous sequence target in Streptococcus pneumoniae DNA

    Science.gov (United States)

    Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.

    2017-07-01

    DNA sequence can be defined as a succession of letters, representing the order of nucleotides within DNA, using a permutation of four DNA base codes including adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of the sequences is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and currently become highly developed, advanced and highly throughput sequencing technologies. So far, DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing could produce any ambiguous and not clear enough sequencing results that make them quite difficult to be determined whether these codes are A, T, G, or C. To solve these problems, in this study we can introduce other representation of DNA codes namely Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probability of A, T, G, C bases that could appear in Q and PA + PT + PG + PC = 1. Furthermore, using Quaternion representations we are able to construct the improved scoring matrix for global sequence alignment processes, by applying a dot product method. Moreover, this scoring matrix produces better and higher quality of the match and mismatch score between two DNA base codes. In implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave, to analyze our target sequence which contains some ambiguous sequence data. The subject sequences are the DNA sequences of Streptococcus pneumoniae families obtained from the Genebank, meanwhile the target DNA sequence are received from our collaborator database. As the results we found the Quaternion representations improve the quality of the sequence alignment score and we can conclude that DNA sequence target has maximum similarity with Streptococcus pneumoniae.

  5. Real-time DNA barcoding in a rainforest using nanopore sequencing: opportunities for rapid biodiversity assessments and local capacity building.

    Science.gov (United States)

    Pomerantz, Aaron; Peñafiel, Nicolás; Arteaga, Alejandro; Bustamante, Lucas; Pichardo, Frank; Coloma, Luis A; Barrio-Amorós, César L; Salazar-Valenzuela, David; Prost, Stefan

    2018-04-01

    Advancements in portable scientific instruments provide promising avenues to expedite field work in order to understand the diverse array of organisms that inhabit our planet. Here, we tested the feasibility for in situ molecular analyses of endemic fauna using a portable laboratory fitting within a single backpack in one of the world's most imperiled biodiversity hotspots, the Ecuadorian Chocó rainforest. We used portable equipment, including the MinION nanopore sequencer (Oxford Nanopore Technologies) and the miniPCR (miniPCR), to perform DNA extraction, polymerase chain reaction amplification, and real-time DNA barcoding of reptile specimens in the field. We demonstrate that nanopore sequencing can be implemented in a remote tropical forest to quickly and accurately identify species using DNA barcoding, as we generated consensus sequences for species resolution with an accuracy of >99% in less than 24 hours after collecting specimens. The flexibility of our mobile laboratory further allowed us to generate sequence information at the Universidad Tecnológica Indoamérica in Quito for rare, endangered, and undescribed species. This includes the recently rediscovered Jambato toad, which was thought to be extinct for 28 years. Sequences generated on the MinION required as few as 30 reads to achieve high accuracy relative to Sanger sequencing, and with further multiplexing of samples, nanopore sequencing can become a cost-effective approach for rapid and portable DNA barcoding. Overall, we establish how mobile laboratories and nanopore sequencing can help to accelerate species identification in remote areas to aid in conservation efforts and be applied to research facilities in developing countries. This opens up possibilities for biodiversity studies by promoting local research capacity building, teaching nonspecialists and students about the environment, tackling wildlife crime, and promoting conservation via research-focused ecotourism.

  6. Whole-genome sequencing and comprehensive molecular profiling identify new driver mutations in gastric cancer

    NARCIS (Netherlands)

    Wang, Kai; Yuen, Siu Tsan; Xu, Jiangchun; Lee, Siu Po; Yan, Helen H N; Shi, Stephanie T; Siu, Hoi Cheong; Deng, Shibing; Chu, Kent Man; Law, Simon; Chan, Kok Hoe; Chan, Annie S Y; Tsui, Wai Yin; Ho, Siu Lun; Chan, Anthony K W; Man, Jonathan L K; Foglizzo, Valentina; Ng, Man Kin; Chan, April S; Ching, Yick Pang; Cheng, Grace H W; Xie, Tao; Fernandez, Julio; Li, Vivian S W; Clevers, Hans; Rejto, Paul A; Mao, Mao; Leung, Suet Yi

    Gastric cancer is a heterogeneous disease with diverse molecular and histological subtypes. We performed whole-genome sequencing in 100 tumor-normal pairs, along with DNA copy number, gene expression and methylation profiling, for integrative genomic analysis. We found subtype-specific genetic and

  7. Sequence of a cloned cDNA encoding human ribosomal protein S11

    Energy Technology Data Exchange (ETDEWEB)

    Lott, J B; Mackie, G A

    1988-02-11

    The authors have isolated a cloned cDNA that encodes human ribosomal protein (rp) S11 by screening a human fibroblast cDNA library with a labelled 204 bp DNA fragment encompassing residues 212-416 of pRS11, a rat rp Sll cDNA clone. The human rp S11 cloned cDNA consists of 15 residues of the 5' leader, the entire coding sequence and all 51 residues of the 3' untranslated region. The predicted amino acid sequence of 158 residues is identical to rat rpS11. The nucleotide sequence in the coding region differs, however, from that in rat in the first position in two codons and in the third position in 44 codons.

  8. Decoding the regulatory landscape of medulloblastoma using DNA methylation sequencing

    NARCIS (Netherlands)

    Hovestadt, Volker; Jones, David T. W.; Picelli, Simone; Wang, Wei; Kool, Marcel; Northcott, Paul A.; Sultan, Marc; Stachurski, Katharina; Ryzhova, Marina; Warnatz, Hans-Jörg; Ralser, Meryem; Brun, Sonja; Bunt, Jens; Jäger, Natalie; Kleinheinz, Kortine; Erkek, Serap; Weber, Ursula D.; Bartholomae, Cynthia C.; von Kalle, Christof; Lawerenz, Chris; Eils, Jürgen; Koster, Jan; Versteeg, Rogier; Milde, Till; Witt, Olaf; Schmidt, Sabine; Wolf, Stephan; Pietsch, Torsten; Rutkowski, Stefan; Scheurlen, Wolfram; Taylor, Michael D.; Brors, Benedikt; Felsberg, Jörg; Reifenberger, Guido; Borkhardt, Arndt; Lehrach, Hans; Wechsler-Reya, Robert J.; Eils, Roland; Yaspo, Marie-Laure; Landgraf, Pablo; Korshunov, Andrey; Zapatka, Marc; Radlwimmer, Bernhard; Pfister, Stefan M.; Lichter, Peter

    2014-01-01

    Epigenetic alterations, that is, disruption of DNA methylation and chromatin architecture, are now acknowledged as a universal feature of tumorigenesis. Medulloblastoma, a clinically challenging, malignant childhood brain tumour, is no exception. Despite much progress from recent genomics studies,

  9. Mouse tetranectin: cDNA sequence, tissue-specific expression, and chromosomal mapping

    DEFF Research Database (Denmark)

    Ibaraki, K; Kozak, C A; Wewer, U M

    1995-01-01

    regulation, mouse tetranectin cDNA was cloned from a 16-day-old mouse embryo library. Sequence analysis revealed a 992-bp cDNA with an open reading frame of 606 bp, which is identical in length to the human tetranectin cDNA. The deduced amino acid sequence showed high homology to the human cDNA with 76......(s) of tetranectin. The sequence analysis revealed a difference in both sequence and size of the noncoding regions between mouse and human cDNAs. Northern analysis of the various tissues from mouse, rat, and cow showed the major transcript(s) to be approximately 1 kb, which is similar in size to that observed...

  10. Designing universal primers for the isolation of DNA sequences encoding Proanthocyanidins biosynthetic enzymes in Crataegus aronia

    Directory of Open Access Journals (Sweden)

    Zuiter Afnan

    2012-08-01

    Full Text Available Abstract Background Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Findings Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. Conclusion To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants.

  11. Genomic signal processing methods for computation of alignment-free distances from DNA sequences.

    Science.gov (United States)

    Borrayo, Ernesto; Mendizabal-Ruiz, E Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P; Morales, J Alejandro

    2014-01-01

    Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.

  12. Sequence-specific RNA Photocleavage by Single-stranded DNA in Presence of Riboflavin

    Science.gov (United States)

    Zhao, Yongyun; Chen, Gangyi; Yuan, Yi; Li, Na; Dong, Juan; Huang, Xin; Cui, Xin; Tang, Zhuo

    2015-10-01

    Constant efforts have been made to develop new method to realize sequence-specific RNA degradation, which could cause inhibition of the expression of targeted gene. Herein, by using an unmodified short DNA oligonucleotide for sequence recognition and endogenic small molecue, vitamin B2 (riboflavin) as photosensitizer, we report a simple strategy to realize the sequence-specific photocleavage of targeted RNA. The DNA strand is complimentary to the target sequence to form DNA/RNA duplex containing a G•U wobble in the middle. The cleavage reaction goes through oxidative elimination mechanism at the nucleoside downstream of U of the G•U wobble in duplex to obtain unnatural RNA terminal, and the whole process is under tight control by using light as switch, which means the cleavage could be carried out according to specific spatial and temporal requirements. The biocompatibility of this method makes the DNA strand in combination with riboflavin a promising molecular tool for RNA manipulation.

  13. Molecular-Sized DNA or RNA Sequencing Machine | NCI Technology Transfer Center | TTC

    Science.gov (United States)

    The National Cancer Institute's Gene Regulation and Chromosome Biology Laboratory is seeking statements of capability or interest from parties interested in collaborative research to co-develop a molecular-sized DNA or RNA sequencing machine.

  14. Structural basis for sequence-specific recognition of DNA by TAL effectors

    KAUST Repository

    Deng, Dong; Yan, Chuangye; Pan, Xiaojing; Mahfouz, Magdy M.; Wang, Jiawei; Zhu, Jiankang; Shi, Yi Gong; Yan, Nieng

    2012-01-01

    TAL (transcription activator-like) effectors, secreted by phytopathogenic bacteria, recognize host DNA sequences through a central domain of tandem repeats. Each repeat comprises 33 to 35 conserved amino acids and targets a specific base pair

  15. Cloning, DNA sequence, and expression of the Rhodobacter sphaeroides cytochrome c/sub 2/ gene

    Energy Technology Data Exchange (ETDEWEB)

    Donohue, T.J.; McEwan, A.G.; Kaplan, S.

    1986-11-01

    The Rhodobacter sphaeroides cytochrome c/sub 2/ functions as a mobile electron carrier in both aerobic and photosynthetic electron transport chains. Synthetic deoxyoligonucleotide probes, based on the known amino acid sequence of this protein (M/sub r/ 14,000), were used to identify and clone the cytochrome c/sub 2/ structural gene (cycA). DNA sequence analysis of the cycA gene indicated the presence of a typical procaryotic 21-residue signal sequence, suggesting that this periplasmic protein is synthesized in vivo as a precursor. Synthesis of an immunoreactive cytochrome c/sub 2/ precursor protein (M/sub r/ 15,500) was observed in vitro when plasmids containing the cycA gene were used as templates in an R. sphaeroides coupled transcription-translation system. Approximately 500 base pairs of DNA upstream of the cycA gene was sufficient to allow expression of this gene product in vitro. Northern blot analysis with an internal cycA-specific probe identified at least two possibly monocistronic transcripts present in both different cellular levels and relative stoichiometries in steady-state cells grown under different physiological conditions. The ratio of the small (740-mucleotide) and large (920-nucleotide) cycA-specific mRNA species was dependent on cultural conditions but was not affected by light intensity under photosynthetic conditions. These results suggest that the increase in the cellular level of the cytochrome c/sub 2/ protein found in photosynthetic cells was due, in part, to increased transcription of the single-copy cyc operon.

  16. Identifying aggressive prostate cancer foci using a DNA methylation classifier.

    Science.gov (United States)

    Mundbjerg, Kamilla; Chopra, Sameer; Alemozaffar, Mehrdad; Duymich, Christopher; Lakshminarasimhan, Ranjani; Nichols, Peter W; Aron, Manju; Siegmund, Kimberly D; Ukimura, Osamu; Aron, Monish; Stern, Mariana; Gill, Parkash; Carpten, John D; Ørntoft, Torben F; Sørensen, Karina D; Weisenberger, Daniel J; Jones, Peter A; Duddalwar, Vinay; Gill, Inderbir; Liang, Gangning

    2017-01-12

    Slow-growing prostate cancer (PC) can be aggressive in a subset of cases. Therefore, prognostic tools to guide clinical decision-making and avoid overtreatment of indolent PC and undertreatment of aggressive disease are urgently needed. PC has a propensity to be multifocal with several different cancerous foci per gland. Here, we have taken advantage of the multifocal propensity of PC and categorized aggressiveness of individual PC foci based on DNA methylation patterns in primary PC foci and matched lymph node metastases. In a set of 14 patients, we demonstrate that over half of the cases have multiple epigenetically distinct subclones and determine the primary subclone from which the metastatic lesion(s) originated. Furthermore, we develop an aggressiveness classifier consisting of 25 DNA methylation probes to determine aggressive and non-aggressive subclones. Upon validation of the classifier in an independent cohort, the predicted aggressive tumors are significantly associated with the presence of lymph node metastases and invasive tumor stages. Overall, this study provides molecular-based support for determining PC aggressiveness with the potential to impact clinical decision-making, such as targeted biopsy approaches for early diagnosis and active surveillance, in addition to focal therapy.

  17. Flavonoid Biosynthesis Genes Putatively Identified in the Aromatic Plant Polygonum minus via Expressed Sequences Tag (EST Analysis

    Directory of Open Access Journals (Sweden)

    Zamri Zainal

    2012-02-01

    Full Text Available P. minus is an aromatic plant, the leaf of which is widely used as a food additive and in the perfume industry. The leaf also accumulates secondary metabolites that act as active ingredients such as flavonoid. Due to limited genomic and transcriptomic data, the biosynthetic pathway of flavonoids is currently unclear. Identification of candidate genes involved in the flavonoid biosynthetic pathway will significantly contribute to understanding the biosynthesis of active compounds. We have constructed a standard cDNA library from P. minus leaves, and two normalized full-length enriched cDNA libraries were constructed from stem and root organs in order to create a gene resource for the biosynthesis of secondary metabolites, especially flavonoid biosynthesis. Thus, large‑scale sequencing of P. minus cDNA libraries identified 4196 expressed sequences tags (ESTs which were deposited in dbEST in the National Center of Biotechnology Information (NCBI. From the three constructed cDNA libraries, 11 ESTs encoding seven genes were mapped to the flavonoid biosynthetic pathway. Finally, three flavonoid biosynthetic pathway-related ESTs chalcone synthase, CHS (JG745304, flavonol synthase, FLS (JG705819 and leucoanthocyanidin dioxygenase, LDOX (JG745247 were selected for further examination by quantitative RT-PCR (qRT-PCR in different P. minus organs. Expression was detected in leaf, stem and root. Gene expression studies have been initiated in order to better understand the underlying physiological processes.

  18. Underwound DNA under Tension: Structure, Elasticity, and Sequence-Dependent Behaviors

    Science.gov (United States)

    Sheinin, Maxim Y.; Forth, Scott; Marko, John F.; Wang, Michelle D.

    2011-09-01

    DNA melting under torsion plays an important role in a wide variety of cellular processes. In the present Letter, we have investigated DNA melting at the single-molecule level using an angular optical trap. By directly measuring force, extension, torque, and angle of DNA, we determined the structural and elastic parameters of torsionally melted DNA. Our data reveal that under moderate forces, the melted DNA assumes a left-handed structure as opposed to an open bubble conformation and is highly torsionally compliant. We have also discovered that at low forces melted DNA properties are highly dependent on DNA sequence. These results provide a more comprehensive picture of the global DNA force-torque phase diagram.

  19. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different couples of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides of which 29,109,688 with Q20 (87.43% in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation about the number of reads per species between different libraries was high for mammalian species (0.97 and lower for avian species (0.70. PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low level of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.

  20. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution

    OpenAIRE

    Falconer, Ester; Hills, Mark; Naumann, Ulrike; Poon, Steven S. S.; Chavez, Elizabeth A.; Sanders, Ashley D.; Zhao, Yongjun; Hirst, Martin; Lansdorp, Peter M.

    2012-01-01

    DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it possible to map SCEs at orders-of-magnitude greater resolution than was previously possible. On average, murine embryonic stem (mES) cells exhibit eight SCEs, which are detected at a resolution of up...

  1. The use of permanganate as a sequencing reagent for identification of 5-methylcytosine residues in DNA.

    OpenAIRE

    Fritzsche, E; Hayatsu, H; Igloi, G L; Iida, S; Kössel, H

    1987-01-01

    The use of permanganate as a reagent for DNA sequencing by chemical degradation has been studied with respect to its specificity for 5-methylcytosine residues. At weakly acidic pH and room temperature, 0.2 mM potassium permanganate reacts preferentially with thymine, 5-methylcytosine, and to a lesser extent with purine residues, while cytosine remains essentially intact. Permanganate oxidation is, therefore, a suitable DNA sequencing reaction for positive discrimination between 5-methylcytosi...

  2. Solving the Curriculum Sequencing Problem with DNA Computing Approach

    Science.gov (United States)

    Debbah, Amina; Ben Ali, Yamina Mohamed

    2014-01-01

    In the e-learning systems, a learning path is known as a sequence of learning materials linked to each others to help learners achieving their learning goals. As it is impossible to have the same learning path that suits different learners, the Curriculum Sequencing problem (CS) consists of the generation of a personalized learning path for each…

  3. Research on Image Encryption Based on DNA Sequence and Chaos Theory

    Science.gov (United States)

    Tian Zhang, Tian; Yan, Shan Jun; Gu, Cheng Yan; Ren, Ran; Liao, Kai Xin

    2018-04-01

    Nowadays encryption is a common technique to protect image data from unauthorized access. In recent years, many scientists have proposed various encryption algorithms based on DNA sequence to provide a new idea for the design of image encryption algorithm. Therefore, a new method of image encryption based on DNA computing technology is proposed in this paper, whose original image is encrypted by DNA coding and 1-D logistic chaotic mapping. First, the algorithm uses two modules as the encryption key. The first module uses the real DNA sequence, and the second module is made by one-dimensional logistic chaos mapping. Secondly, the algorithm uses DNA complementary rules to encode original image, and uses the key and DNA computing technology to compute each pixel value of the original image, so as to realize the encryption of the whole image. Simulation results show that the algorithm has good encryption effect and security.

  4. Detection of herpes simplex virus-specific DNA sequences in latently infected mice and in humans.

    Science.gov (United States)

    Efstathiou, S; Minson, A C; Field, H J; Anderson, J R; Wildy, P

    1986-02-01

    Herpes simplex virus-specific DNA sequences have been detected by Southern hybridization analysis in both central and peripheral nervous system tissues of latently infected mice. We have detected virus-specific sequences corresponding to the junction fragment but not the genomic termini, an observation first made by Rock and Fraser (Nature [London] 302:523-525, 1983). This "endless" herpes simplex virus DNA is both qualitatively and quantitatively stable in mouse neural tissue analyzed over a 4-month period. In addition, examination of DNA extracted from human trigeminal ganglia has shown herpes simplex virus DNA to be present in an "endless" form similar to that found in the mouse model system. Further restriction enzyme analysis of latently infected mouse brainstem and human trigeminal DNA has shown that this "endless" herpes simplex virus DNA is present in all four isomeric configurations.

  5. [Replication of Streptomyces plasmids: the DNA nucleotide sequence of plasmid pSB 24.2].

    Science.gov (United States)

    Bolotin, A P; Sorokin, A V; Aleksandrov, N N; Danilenko, V N; Kozlov, Iu I

    1985-11-01

    The nucleotide sequence of DNA in plasmid pSB 24.2, a natural deletion derivative of plasmid pSB 24.1 isolated from S. cyanogenus was studied. The plasmid amounted by its size to 3706 nucleotide pairs. The G-C composition was equal to 73 per cent. The analysis of the DNA structure in plasmid pSB 24.2 revealed the protein-encoding sequence of DNA, the continuity of which was significant for replication of the plasmid containing more than 1300 nucleotide pairs. The analysis also revealed two A-T-rich areas of DNA, the G-C composition of which was less than 55 per cent and a DNA area with a branched pin structure. The results may be of value in investigation of plasmid replication in actinomycetes and experimental cloning of DNA with this plasmid as a vector.

  6. Comparison of base composition analysis and Sanger sequencing of mitochondrial DNA for four U.S. population groups.

    Science.gov (United States)

    Kiesler, Kevin M; Coble, Michael D; Hall, Thomas A; Vallone, Peter M

    2014-01-01

    A set of 711 samples from four U.S. population groups was analyzed using a novel mass spectrometry based method for mitochondrial DNA (mtDNA) base composition profiling. Comparison of the mass spectrometry results with Sanger sequencing derived data yielded a concordance rate of 99.97%. Length heteroplasmy was identified in 46% of samples and point heteroplasmy was observed in 6.6% of samples in the combined mass spectral and Sanger data set. Using discrimination capacity as a metric, Sanger sequencing of the full control region had the highest discriminatory power, followed by the mass spectrometry base composition method, which was more discriminating than Sanger sequencing of just the hypervariable regions. This trend is in agreement with the number of nucleotides covered by each of the three assays. Published by Elsevier Ireland Ltd.

  7. Human liver phosphatase 2A: cDNA and amino acid sequence of two catalytic subunit isotypes

    International Nuclear Information System (INIS)

    Arino, J.; Woon, Chee Wai; Brautigan, D.L.; Miller, T.B. Jr.; Johnson, G.L.

    1988-01-01

    Two cDNA clones were isolated from a human liver library that encode two phosphatase 2A catalytic subunits. The two cDNAs differed in eight amino acids (97% identity) with three nonconservative substitutions. All of the amino acid substitutions were clustered in the amino-terminal domain of the protein. Amino acid sequence of one human liver clone (HL-14) was identical to the rabbit skeletal muscle phosphatase 2A cDNA (with 97% nucleotide identity). The second human liver clone (HL-1) is encoded by a separate gene, and RNA gel blot analysis indicates that both mRNAs are expressed similarly in several human clonal cell lines. Sequence comparison with phosphatase 1 and 2A indicates highly divergent amino acid sequences at the amino and carboxyl termini of the proteins and identifies six highly conserved regions between the two proteins that are predicted to be important for phosphatase enzymatic activity

  8. cDNA cloning, sequence analysis, and chromosomal localization of the gene for human carnitine palmitoyltransferase

    International Nuclear Information System (INIS)

    Finocchiaro, G.; Taroni, F.; Martin, A.L.; Colombo, I.; Tarelli, G.T.; DiDonato, S.; Rocchi, M.

    1991-01-01

    The authors have cloned and sequenced a cDNA encoding human liver carnitine palmitoyltransferase an inner mitochondrial membrane enzyme that plays a major role in the fatty acid oxidation pathway. Mixed oligonucleotide primers whose sequences were deduced from one tryptic peptide obtained from purified CPTase were used in a polymerase chain reaction, allowing the amplification of a 0.12-kilobase fragment of human genomic DNA encoding such a peptide. A 60-base-pair (bp) oligonucleotide synthesized on the basis of the sequence from this fragment was used for the screening of a cDNA library from human liver and hybridized to a cDNA insert of 2255 bp. This cDNA contains an open reading frame of 1974 bp that encodes a protein of 658 amino acid residues including 25 residues of an NH 2 -terminal leader peptide. The assignment of this open reading frame to human liver CPTase is confirmed by matches to seven different amino acid sequences of tryptic peptides derived from pure human CPTase and by the 82.2% homology with the amino acid sequence of rat CPTase. The NH 2 -terminal region of CPTase contains a leucine-proline motif that is shared by carnitine acetyl- and octanoyltransferases and by choline acetyltransferase. The gene encoding CPTase was assigned to human chromosome 1, region 1q12-1pter, by hybridization of CPTase cDNA with a DNA panel of 19 human-hanster somatic cell hybrids

  9. Torque measurements reveal sequence-specific cooperative transitions in supercoiled DNA

    Science.gov (United States)

    Oberstrass, Florian C.; Fernandes, Louis E.; Bryant, Zev

    2012-01-01

    B-DNA becomes unstable under superhelical stress and is able to adopt a wide range of alternative conformations including strand-separated DNA and Z-DNA. Localized sequence-dependent structural transitions are important for the regulation of biological processes such as DNA replication and transcription. To directly probe the effect of sequence on structural transitions driven by torque, we have measured the torsional response of a panel of DNA sequences using single molecule assays that employ nanosphere rotational probes to achieve high torque resolution. The responses of Z-forming d(pGpC)n sequences match our predictions based on a theoretical treatment of cooperative transitions in helical polymers. “Bubble” templates containing 50–100 bp mismatch regions show cooperative structural transitions similar to B-DNA, although less torque is required to disrupt strand–strand interactions. Our mechanical measurements, including direct characterization of the torsional rigidity of strand-separated DNA, establish a framework for quantitative predictions of the complex torsional response of arbitrary sequences in their biological context. PMID:22474350

  10. Alternative splicing enriched cDNA libraries identify breast cancer-associated transcripts

    Science.gov (United States)

    2010-01-01

    Background Alternative splicing (AS) is a central mechanism in the generation of genomic complexity and is a major contributor to transcriptome and proteome diversity. Alterations of the splicing process can lead to deregulation of crucial cellular processes and have been associated with a large spectrum of human diseases. Cancer-associated transcripts are potential molecular markers and may contribute to the development of more accurate diagnostic and prognostic methods and also serve as therapeutic targets. Alternative splicing-enriched cDNA libraries have been used to explore the variability generated by alternative splicing. In this study, by combining the use of trapping heteroduplexes and RNA amplification, we developed a powerful approach that enables transcriptome-wide exploration of the AS repertoire for identifying AS variants associated with breast tumor cells modulated by ERBB2 (HER-2/neu) oncogene expression. Results The human breast cell line (C5.2) and a pool of 5 ERBB2 over-expressing breast tumor samples were used independently for the construction of two AS-enriched libraries. In total, 2,048 partial cDNA sequences were obtained, revealing 214 alternative splicing sequence-enriched tags (ASSETs). A subset with 79 multiple exon ASSETs was compared to public databases and reported 138 different AS events. A high success rate of RT-PCR validation (94.5%) was obtained, and 2 novel AS events were identified. The influence of ERBB2-mediated expression on AS regulation was evaluated by capillary electrophoresis and probe-ligation approaches in two mammary cell lines (Hb4a and C5.2) expressing different levels of ERBB2. The relative expression balance between AS variants from 3 genes was differentially modulated by ERBB2 in this model system. Conclusions In this study, we presented a method for exploring AS from any RNA source in a transcriptome-wide format, which can be directly easily adapted to next generation sequencers. We identified AS transcripts

  11. Importance of the Sequence-Directed DNA Shape for Specific Binding Site Recognition by the Estrogen-Related Receptor

    Directory of Open Access Journals (Sweden)

    Kareem Mohideen-Abdul

    2017-06-01

    Full Text Available Most nuclear receptors (NRs bind DNA as dimers, either as hetero- or as homodimers on DNA sequences organized as two half-sites with specific orientation and spacing. The dimerization of NRs on their cognate response elements (REs involves specific protein–DNA and protein–protein interactions. The estrogen-related receptor (ERR belongs to the steroid hormone nuclear receptor (SHR family and shares strong similarity in its DNA-binding domain (DBD with that of the estrogen receptor (ER. In vitro, ERR binds with high affinity inverted repeat REs with a 3-bps spacing (IR3, but in vivo, it preferentially binds to single half-site REs extended at the 5′-end by 3 bp [estrogen-related response element (ERREs], thus explaining why ERR was often inferred as a purely monomeric receptor. Since its C-terminal ligand-binding domain is known to homodimerize with a strong dimer interface, we investigated the binding behavior of the isolated DBDs to different REs using electrophoretic migration, multi-angle static laser light scattering (MALLS, non-denaturing mass spectrometry, and nuclear magnetic resonance. In contrast to ER DBD, ERR DBD binds as a monomer to EREs (IR3, such as the tff1 ERE-IR3, but we identified a DNA sequence composed of an extended half-site embedded within an IR3 element (embedded ERRE/IR3, where stable dimer binding is observed. Using a series of chimera and mutant DNA sequences of ERREs and IR3 REs, we have found the key determinants for the binding of ERR DBD as a dimer. Our results suggest that the sequence-directed DNA shape is more important than the exact nucleotide sequence for the binding of ERR DBD to DNA as a dimer. Our work underlines the importance of the shape-driven DNA readout mechanisms based on minor groove recognition and electrostatic potential. These conclusions may apply not only to ERR but also to other members of the SHR family, such as androgen or glucocorticoid, for which a strong well-conserved half

  12. DNA cross-linking by dehydromonocrotaline lacks apparent base sequence preference.

    Science.gov (United States)

    Rieben, W Kurt; Coulombe, Roger A

    2004-12-01

    Pyrrolizidine alkaloids (PAs) are ubiquitous plant toxins, many of which, upon oxidation by hepatic mixed-function oxidases, become reactive bifunctional pyrrolic electrophiles that form DNA-DNA and DNA-protein cross-links. The anti-mitotic, toxic, and carcinogenic action of PAs is thought to be caused, at least in part, by these cross-links. We wished to determine whether the activated PA pyrrole dehydromonocrotaline (DHMO) exhibits base sequence preferences when cross-linked to a set of model duplex poly A-T 14-mer oligonucleotides with varying internal and/or end 5'-d(CG), 5'-d(GC), 5'-d(TA), 5'-d(CGCG), or 5'-d(GCGC) sequences. DHMO-DNA cross-links were assessed by electrophoretic mobility shift assay (EMSA) of 32P endlabeled oligonucleotides and by HPLC analysis of cross-linked DNAs enzymatically digested to their constituent deoxynucleosides. The degree of DNA cross-links depended upon the concentration of the pyrrole, but not on the base sequence of the oligonucleotide target. Likewise, HPLC chromatograms of cross-linked and digested DNAs showed no discernible sequence preference for any nucleotide. Added glutathione, tyrosine, cysteine, and aspartic acid, but not phenylalanine, threonine, serine, lysine, or methionine competed with DNA as alternate nucleophiles for cross-linking by DHMO. From these data it appears that DHMO exhibits no strong base preference when forming cross-links with DNA, and that some cellular nucleophiles can inhibit DNA cross-link formation.

  13. Mitochondrial DNA sequence variation in Finnish patients with matrilineal diabetes mellitus

    Directory of Open Access Journals (Sweden)

    Soini Heidi K

    2012-07-01

    Full Text Available Abstract Background The genetic background of type 2 diabetes is complex involving contribution by both nuclear and mitochondrial genes. There is an excess of maternal inheritance in patients with type 2 diabetes and, furthermore, diabetes is a common symptom in patients with mutations in mitochondrial DNA (mtDNA. Polymorphisms in mtDNA have been reported to act as risk factors in several complex diseases. Findings We examined the nucleotide variation in complete mtDNA sequences of 64 Finnish patients with matrilineal diabetes. We used conformation sensitive gel electrophoresis and sequencing to detect sequence variation. We analysed the pathogenic potential of nonsynonymous variants detected in the sequences and examined the role of the m.16189 T>C variant. Controls consisted of non-diabetic subjects ascertained in the same population. The frequency of mtDNA haplogroup V was 3-fold higher in patients with diabetes. Patients harboured many nonsynonymous mtDNA substitutions that were predicted to be possibly or probably damaging. Furthermore, a novel m.13762 T>G in MTND5 leading to p.Ser476Ala and several rare mtDNA variants were found. Haplogroup H1b harbouring m.16189 T > C and m.3010 G > A was found to be more frequent in patients with diabetes than in controls. Conclusions Mildly deleterious nonsynonymous mtDNA variants and rare population-specific haplotypes constitute genetic risk factors for maternally inherited diabetes.

  14. DNA sequences from the quagga, an extinct member of the horse family.

    Science.gov (United States)

    Higuchi, R; Bowman, B; Freiberger, M; Ryder, O A; Wilson, A C

    To determine whether DNA survives and can be recovered from the remains of extinct creatures, we have examined dried muscle from a museum specimen of the quagga, a zebra-like species (Equus quagga) that became extinct in 1883 (ref. 1). We report that DNA was extracted from this tissue in amounts approaching 1% of that expected from fresh muscle, and that the DNA was of relatively low molecular weight. Among the many clones obtained from the quagga DNA, two containing pieces of mitochondrial DNA (mtDNA) were sequenced. These sequences, comprising 229 nucleotide pairs, differ by 12 base substitutions from the corresponding sequences of mtDNA from a mountain zebra, an extant member of the genus Equus. The number, nature and locations of the substitutions imply that there has been little or no postmortem modification of the quagga DNA sequences, and that the two species had a common ancestor 3-4 Myr ago, consistent with fossil evidence concerning the age of the genus Equus.

  15. Complete DNA sequence of the linear mitochondrial genome of the pathogenic yeast Candida parapsilosis

    DEFF Research Database (Denmark)

    Nosek, J.; Novotna, M.; Hlavatovicova, Z.

    2004-01-01

    The complete sequence of the mitochondrial DNA of the opportunistic yeast pathogen Candida parapsilosis was determined. The mitochondrial genome is represented by linear DNA molecules terminating with tandem repeats of a 738-bp unit. The number of repeats varies, thus generating a population...

  16. Nucleotide sequence determination of the region in adenovirus 5 DNA involved in cell transformation

    International Nuclear Information System (INIS)

    Maat, J.

    1978-01-01

    A description is given of investigations into the primary structure of the transforming region of adenovirus type 5 DNA. The phenomenon of cell transformation is discussed in general terms and the principles of a number of fairly recent techniques, which have been in use for DNA sequence determination since 1975 are dealt with. A few of the author's own techniques are described which deal both with nucleotide sequence analysis and with the determination of DNA cleavage sites of restriction endonucleases. The results are given of the mapping of cleavage sites in the HpaI-E fragment of adenovirus DNA of HpaII, HaeIII, AluI, HinfI and TaqI and of the determination of the nucleotide sequence in the transforming region of adenovirus type 5 DNA. The results of the sequence determination of the Ad5 HindIII-G fragment are discussed in relation with the investigation on the transforming proteins isolated from in vitro and in vivo synthesizing systems. Labelling procedures of DNA are described including the exonuclease III/DNA polymerase 1 method and TA polynucleotide kinase labelling of DNA fragments. (Auth.)

  17. Protocols for 16S rDNA Array Analyses of Microbial Communities by Sequence-Specific Labeling of DNA Probes

    Directory of Open Access Journals (Sweden)

    Knut Rudi

    2003-01-01

    Full Text Available Analyses of complex microbial communities are becoming increasingly important. Bottlenecks in these analyses, however, are the tools to actually describe the biodiversity. Novel protocols for DNA array-based analyses of microbial communities are presented. In these protocols, the specificity obtained by sequence-specific labeling of DNA probes is combined with the possibility of detecting several different probes simultaneously by DNA array hybridization. The gene encoding 16S ribosomal RNA was chosen as the target in these analyses. This gene contains both universally conserved regions and regions with relatively high variability. The universally conserved regions are used for PCR amplification primers, while the variable regions are used for the specific probes. Protocols are presented for DNA purification, probe construction, probe labeling, and DNA array hybridizations.

  18. DNA typing of ancient parasite eggs from environmental samples identifies human and animal worm infections in Viking-age settlement.

    Science.gov (United States)

    Søe, Martin Jensen; Nejsum, Peter; Fredensborg, Brian Lund; Kapel, Christian Moliin Outzen

    2015-02-01

    Ancient parasite eggs were recovered from environmental samples collected at a Viking-age settlement in Viborg, Denmark, dated 1018-1030 A.D. Morphological examination identified Ascaris sp., Trichuris sp., and Fasciola sp. eggs, but size and shape did not allow species identification. By carefully selecting genetic markers, PCR amplification and sequencing of ancient DNA (aDNA) isolates resulted in identification of: the human whipworm, Trichuris trichiura , using SSUrRNA sequence homology; Ascaris sp. with 100% homology to cox1 haplotype 07; and Fasciola hepatica using ITS1 sequence homology. The identification of T. trichiura eggs indicates that human fecal material is present and, hence, that the Ascaris sp. haplotype 07 was most likely a human variant in Viking-age Denmark. The location of the F. hepatica finding suggests that sheep or cattle are the most likely hosts. Further, we sequenced the Ascaris sp. 18S rRNA gene in recent isolates from humans and pigs of global distribution and show that this is not a suited marker for species-specific identification. Finally, we discuss ancient parasitism in Denmark and the implementation of aDNA analysis methods in paleoparasitological studies. We argue that when employing species-specific identification, soil samples offer excellent opportunities for studies of human parasite infections and of human and animal interactions of the past.

  19. Ray Wu as Fifth Business: Deconstructing collective memory in the history of DNA sequencing.

    Science.gov (United States)

    Onaga, Lisa A

    2014-06-01

    The concept of 'Fifth Business' is used to analyze a minority standpoint and bring serious attention to the role of scientists who play a galvanizing role in a science but for multiple reasons appear less prominently in more common recounts of any particular development. Biochemist Ray Wu (1928-2008) published a DNA sequencing experiment in March 1970 using DNA polymerase catalysis and specific nucleotide labeling, both of which are foundational to general sequencing methods today. The scant mention of Wu's work from textbooks, research articles, and other accounts of DNA sequencing calls into question how scientific collective memory forms. This alternative history seeks to understand why a key figure in nucleic acid sequence analysis has remained less visibly connected or peripheral to solidifying narratives about the history of DNA sequencing. The study resists predictable dismissals of Wu's work in order to seriously examine the formation of his nucleic acid sequence analysis research program and how he shared his knowledge of sequencing during a period of rapid advancement in the field. An analysis of Wu's work on sequencing the cohesive ends of lambda bacteriophage in the 1960s and 1970s exemplifies how a variety of individuals and groups attempted to develop protocol for sequencing the order of nucleotide base pairs comprising DNA. This historical examination of the sociality of scientific research suggests a way to understand how Wu and others contributed to the very collective memory of DNA sequencing that Wu eventually tried to repair. The study of Wu, who was a Chinese immigrant to the United States, provides a foundation for further critical scholarship on the heterogeneous histories of Asian American bioscientists, the sociality of their scientific works, and how the resulting knowledge produced is preserved, if not evenly, in a scientific field's collective memory. Copyright © 2014 Elsevier Ltd. All rights reserved.

  20. Soil Fungal Community Associated with Peat in Sarawak Identified Using 18S rDNA Marker

    International Nuclear Information System (INIS)

    Siti Ramlah Ahmad Ali; Sakinah Safari; Mohd Shawal Thakib; Shamsilawani Ahamed Bakeri; Nur Aziemah Ab Ghani

    2016-01-01

    Fungi are principal decomposing microorganisms in acidic environment of peat lands. A useful tool for molecular screening of soil fungal communities using the 18S ribosomal DNA primer has been proven capable of identifying a broad range of fungi species within Ascomycota, Basidiomycota, Zygomycota and Chytridiomycota. Currently, very little information is available on fungal communities in deep peat of Sarawak, Malaysia. In this study, we have isolated the fungi from soil samples taken in deep peat forests and oil palm cultivated areas. The fungal identity was undertaken using 18S ribosomal DNA primer which is EF4-F/ fung5-R. The microscopic structures were conducted to confirm the identity of the isolates. Based on this study, the fungal division most commonly found in deep peat is the Ascomycota. Aspergillus fumigatus was the most common species and more dominant in oil palm cultivated areas and logged-over forest than in primary forest. In the primary forest, the dominant species was the A. flavus, while Hypocrea atroviridis was commonly associated with oil palm cultivated areas and logged-over forest. Other species of fungi isolated in peat primary forests were Penicillium chrysogenum, Trichoderma sp., Phanerochaete sp., Mortierella chlamydospora, A. niger, A. alliaceus, etc. The in-depth difference in the fungal communities for the different sites will be further investigated using the next generation sequencing technology. (author)

  1. GENETIC POLYMORPHISM IN GYMNODINIUM GALATHEANUM CHLOROPLAST DNA SEQUENCES AND DEVELOPMENT OF A MOLECULAR DETECTION ASSAY. (R827084)

    Science.gov (United States)

    Nuclear and chloroplast-encoded small subunit ribosomal DNA sequences were obtainedfrom several strains of the toxic dinoflagellate Gymnodinium galatheanum. Phylogenetic analyses andcomparison of sequences indicate that the chloroplast sequences show a higher degree of se...

  2. Sequence-specific DNA alkylation by tandem Py-Im polyamide conjugates.

    Science.gov (United States)

    Taylor, Rhys Dylan; Kawamoto, Yusuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2014-09-01

    Tandem N-methylpyrrole-N-methylimidazole (Py-Im) polyamides with good sequence-specific DNA-alkylating activities have been designed and synthesized. Three alkylating tandem Py-Im polyamides with different linkers, which each contained the same moiety for the recognition of a 10 bp DNA sequence, were evaluated for their reactivity and selectivity by DNA alkylation, using high-resolution denaturing gel electrophoresis. All three conjugates displayed high reactivities for the target sequence. In particular, polyamide 1, which contained a β-alanine linker, displayed the most-selective sequence-specific alkylation towards the target 10 bp DNA sequence. The tandem Py-Im polyamide conjugates displayed greater sequence-specific DNA alkylation than conventional hairpin Py-Im polyamide conjugates (4 and 5). For further research, the design of tandem Py-Im polyamide conjugates could play an important role in targeting specific gene sequences. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Budding yeast cDNA sequencing project: S03036-05_I15 [Budding yeast cDNA sequencing project

    Lifescience Database Archive (English)

    Full Text Available EST - Link to UCSC Genome Browser - Sequence >S03036-05_I15.phd NNNTNNTNNNNCNCTCACATANAAGACGGANNAGNNNGCTGGGC...CAATGCGTTCCATATGCG AAAATTCTTGGNCAATGTATTCTCTAGCAATCTNTNCTTTTGTACANTCGGAGGNTTNTC ATGNTCCTTTCATANATTATANAAANNG

  4. Application of high-throughput DNA sequencing in phytopathology.

    Science.gov (United States)

    Studholme, David J; Glover, Rachel H; Boonham, Neil

    2011-01-01

    The new sequencing technologies are already making a big impact in academic research on medically important microbes and may soon revolutionize diagnostics, epidemiology, and infection control. Plant pathology also stands to gain from exploiting these opportunities. This manuscript reviews some applications of these high-throughput sequencing methods that are relevant to phytopathology, with emphasis on the associated computational and bioinformatics challenges and their solutions. Second-generation sequencing technologies have recently been exploited in genomics of both prokaryotic and eukaryotic plant pathogens. They are also proving to be useful in diagnostics, especially with respect to viruses. Copyright © 2011 by Annual Reviews. All rights reserved.

  5. An analysis of expressed sequence tags of developing castor endosperm using a full-length cDNA library

    Directory of Open Access Journals (Sweden)

    Wallis James G

    2007-07-01

    Full Text Available Abstract Background Castor seeds are a major source for ricinoleate, an important industrial raw material. Genomics studies of castor plant will provide critical information for understanding seed metabolism, for effectively engineering ricinoleate production in transgenic oilseeds, or for genetically improving castor plants by eliminating toxic and allergic proteins in seeds. Results Full-length cDNAs are useful resources in annotating genes and in providing functional analysis of genes and their products. We constructed a full-length cDNA library from developing castor endosperm, and obtained 4,720 ESTs from 5'-ends of the cDNA clones representing 1,908 unique sequences. The most abundant transcripts are genes encoding storage proteins, ricin, agglutinin and oleosins. Several other sequences are also very numerous, including two acidic triacylglycerol lipases, and the oleate hydroxylase (FAH12 gene that is responsible for ricinoleate biosynthesis. The role(s of the lipases in developing castor seeds are not clear, and co-expressing of a lipase and the FAH12 did not result in significant changes in hydroxy fatty acid accumulation in transgenic Arabidopsis seeds. Only one oleate desaturase (FAD2 gene was identified in our cDNA sequences. Sequence and functional analyses of the castor FAD2 were carried out since it had not been characterized previously. Overexpression of castor FAD2 in a FAH12-expressing Arabidopsis line resulted in decreased accumulation of hydroxy fatty acids in transgenic seeds. Conclusion Our results suggest that transcriptional regulation of FAD2 and FAH12 genes maybe one of the mechanisms that contribute to a high level of ricinoleate accumulation in castor endosperm. The full-length cDNA library will be used to search for additional genes that affect ricinoleate accumulation in seed oils. Our EST sequences will also be useful to annotate the castor genome, which whole sequence is being generated by shotgun sequencing at

  6. Nucleotide sequence of a cDNA coding for the amino-terminal region of human prepro. alpha. 1(III) collagen

    Energy Technology Data Exchange (ETDEWEB)

    Toman, P D; Ricca, G A [Rorer Biotechnology, Inc., Springfield, VA (USA); de Crombrugghe, B [National Institutes of Health, Bethesda, MD (USA)

    1988-07-25

    Type III Collagen is synthesized in a variety of tissues as a precursor macromolecule containing a leader sequence, a N-propeptide, a N-telopeptide, the triple helical region, a C-telopeptide, and C-propeptide. To further characterize the human type III collagen precursor, a human placental cDNA library was constructed in gt11 using an oligonucleotide derived from a partial cDNA sequence corresponding to the carboxy-terminal part of the 1(III) collagen. A cDNA was identified which contains the leader sequence, the N-propeptide and N-telopeptide regions. The DNA sequence of these regions are presented here. The triple helical, C-telopeptide and C-propeptide amino acid sequence for human type III collagen has been determined previously. A comparison of the human amino acid sequence with mouse, chicken, and calf sequence shows 81%, 81%, and 92% similarity, respectively. At the DNA level, the sequence similarity between human and mouse or chicken type III collagen sequences in this area is 82% and 77%, respectively.

  7. Tomato protoplast DNA transformation : physical linkage and recombination of exogenous DNA sequences

    NARCIS (Netherlands)

    Jongsma, Maarten; Koornneef, Maarten; Zabel, Pim; Hille, Jacques

    1987-01-01

    Tomato protoplasts have been transformed with plasmid DNA's, containing a chimeric kanamycin resistance gene and putative tomato origins of replication. A calcium phosphate-DNA mediated transformation procedure was employed in combination with either polyethylene glycol or polyvinyl alcohol. There

  8. Ancient DNA analysis identifies marine mollusc shells as new metagenomic archives of the past.

    Science.gov (United States)

    Der Sarkissian, Clio; Pichereau, Vianney; Dupont, Catherine; Ilsøe, Peter C; Perrigault, Mickael; Butler, Paul; Chauvaud, Laurent; Eiríksson, Jón; Scourse, James; Paillard, Christine; Orlando, Ludovic

    2017-09-01

    Marine mollusc shells enclose a wealth of information on coastal organisms and their environment. Their life history traits as well as (palaeo-) environmental conditions, including temperature, food availability, salinity and pollution, can be traced through the analysis of their shell (micro-) structure and biogeochemical composition. Adding to this list, the DNA entrapped in shell carbonate biominerals potentially offers a novel and complementary proxy both for reconstructing palaeoenvironments and tracking mollusc evolutionary trajectories. Here, we assess this potential by applying DNA extraction, high-throughput shotgun DNA sequencing and metagenomic analyses to marine mollusc shells spanning the last ~7,000 years. We report successful DNA extraction from shells, including a variety of ancient specimens, and find that DNA recovery is highly dependent on their biomineral structure, carbonate layer preservation and disease state. We demonstrate positive taxonomic identification of mollusc species using a combination of mitochondrial DNA genomes, barcodes, genome-scale data and metagenomic approaches. We also find shell biominerals to contain a diversity of microbial DNA from the marine environment. Finally, we reconstruct genomic sequences of organisms closely related to the Vibrio tapetis bacteria from Manila clam shells previously diagnosed with Brown Ring Disease. Our results reveal marine mollusc shells as novel genetic archives of the past, which opens new perspectives in ancient DNA research, with the potential to reconstruct the evolutionary history of molluscs, microbial communities and pathogens in the face of environmental changes. Other future applications include conservation of endangered mollusc species and aquaculture management. © 2017 John Wiley & Sons Ltd.

  9. Genetic alterations of hepatocellular carcinoma by random amplified polymorphic DNA analysis and cloning sequencing of tumor differential DNA fragment

    Science.gov (United States)

    Xian, Zhi-Hong; Cong, Wen-Ming; Zhang, Shu-Hui; Wu, Meng-Chao

    2005-01-01

    AIM: To study the genetic alterations and their association with clinicopathological characteristics of hepatocellular carcinoma (HCC), and to find the tumor related DNA fragments. METHODS: DNA isolated from tumors and corresponding noncancerous liver tissues of 56 HCC patients was amplified by random amplified polymorphic DNA (RAPD) with 10 random 10-mer arbitrary primers. The RAPD bands showing obvious differences in tumor tissue DNA corresponding to that of normal tissue were separated, purified, cloned and sequenced. DNA sequences were analyzed and compared with GenBank data. RESULTS: A total of 56 cases of HCC were demonstrated to have genetic alterations, which were detected by at least one primer. The detestability of genetic alterations ranged from 20% to 70% in each case, and 17.9% to 50% in each primer. Serum HBV infection, tumor size, histological grade, tumor capsule, as well as tumor intrahepatic metastasis, might be correlated with genetic alterations on certain primers. A band with a higher intensity of 480 bp or so amplified fragments in tumor DNA relative to normal DNA could be seen in 27 of 56 tumor samples using primer 4. Sequence analysis of these fragments showed 91% homology with Homo sapiens double homeobox protein DUX10 gene. CONCLUSION: Genetic alterations are a frequent event in HCC, and tumor related DNA fragments have been found in this study, which may be associated with hepatocarcin-ogenesis. RAPD is an effective method for the identification and analysis of genetic alterations in HCC, and may provide new information for further evaluating the molecular mechanism of hepatocarcinogenesis. PMID:15996039

  10. Molecular profiling of appendiceal epithelial tumors using massively parallel sequencing to identify somatic mutations.

    Science.gov (United States)

    Liu, Xiaoying; Mody, Kabir; de Abreu, Francine B; Pipas, J Marc; Peterson, Jason D; Gallagher, Torrey L; Suriawinata, Arief A; Ripple, Gregory H; Hourdequin, Kathryn C; Smith, Kerrington D; Barth, Richard J; Colacchio, Thomas A; Tsapakos, Michael J; Zaki, Bassem I; Gardner, Timothy B; Gordon, Stuart R; Amos, Christopher I; Wells, Wendy A; Tsongalis, Gregory J

    2014-07-01

    Some epithelial neoplasms of the appendix, including low-grade appendiceal mucinous neoplasm and adenocarcinoma, can result in pseudomyxoma peritonei (PMP). Little is known about the mutational spectra of these tumor types and whether mutations may be of clinical significance with respect to therapeutic selection. In this study, we identified somatic mutations using the Ion Torrent AmpliSeq Cancer Hotspot Panel v2. Specimens consisted of 3 nonneoplastic retention cysts/mucocele, 15 low-grade mucinous neoplasms (LAMNs), 8 low-grade/well-differentiated mucinous adenocarcinomas with pseudomyxoma peritonei, and 12 adenocarcinomas with/without goblet cell/signet ring cell features. Barcoded libraries were prepared from up to 10 ng of extracted DNA and multiplexed on single 318 chips for sequencing. Data analysis was performed using Golden Helix SVS. Variants that remained after the analysis pipeline were individually interrogated using the Integrative Genomics Viewer. A single Janus kinase 3 (JAK3) mutation was detected in the mucocele group. Eight mutations were identified in the V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) and GNAS complex locus (GNAS) genes among LAMN samples. Additional gene mutations were identified in the AKT1 (v-akt murine thymoma viral oncogene homolog 1), APC (adenomatous polyposis coli), JAK3, MET (met proto-oncogene), phosphatidylinositol-4,5-bisphosphate 3-kinase (PIK3CA), RB1 (retinoblastoma 1), STK11 (serine/threonine kinase 11), and tumor protein p53 (TP53) genes. Among the PMPs, 6 mutations were detected in the KRAS gene and also in the GNAS, TP53, and RB1 genes. Appendiceal cancers showed mutations in the APC, ATM (ataxia telangiectasia mutated), KRAS, IDH1 [isocitrate dehydrogenase 1 (NADP+)], NRAS [neuroblastoma RAS viral (v-ras) oncogene homolog], PIK3CA, SMAD4 (SMAD family member 4), and TP53 genes. Our results suggest molecular heterogeneity among epithelial tumors of the appendix. Next generation sequencing efforts

  11. DNA sequence evolution in fast evolving mitochondrial DNA nad1 exons in Geraniaceae and Plantaginaceae

    NARCIS (Netherlands)

    Bakker, F.T.; Breman, F.; Merckx, V.

    2006-01-01

    Previously, nucleotide substitution rates in mitochondrial DNA of Geraniaceae and Plantaginaceae have been shown to be exceptionally high compared with other angiosperm mtDNA lineages. It has also been shown that mtDNA introns were lost in Geraniaceae and Plantaginaceae. In this study we compile 127

  12. Next-generation sequencing-based detection of circulating tumour DNA After allogeneic stem cell transplantation for lymphoma.

    Science.gov (United States)

    Herrera, Alex F; Kim, Haesook T; Kong, Katherine A; Faham, Malek; Sun, Heather; Sohani, Aliyah R; Alyea, Edwin P; Carlton, Victoria E; Chen, Yi-Bin; Cutler, Corey S; Ho, Vincent T; Koreth, John; Kotwaliwale, Chitra; Nikiforow, Sarah; Ritz, Jerome; Rodig, Scott J; Soiffer, Robert J; Antin, Joseph H; Armand, Philippe

    2016-12-01

    Next-generation sequencing (NGS)-based circulating tumour DNA (ctDNA) detection is a promising monitoring tool for lymphoid malignancies. We evaluated whether the presence of ctDNA was associated with outcome after allogeneic haematopoietic stem cell transplantation (HSCT) in lymphoma patients. We studied 88 patients drawn from a phase 3 clinical trial of reduced-intensity conditioning HSCT in lymphoma. Conventional restaging and collection of peripheral blood samples occurred at pre-specified time points before and after HSCT and were assayed for ctDNA by sequencing of the immunoglobulin or T-cell receptor genes. Tumour clonotypes were identified in 87% of patients with adequate tumour samples. Sixteen of 19 (84%) patients with disease progression after HSCT had detectable ctDNA prior to progression at a median of 3·7 months prior to relapse/progression. Patients with detectable ctDNA 3 months after HSCT had inferior progression-free survival (PFS) (2-year PFS 58% vs. 84% in ctDNA-negative patients, P = 0·033). In multivariate models, detectable ctDNA was associated with increased risk of progression/death (Hazard ratio 3·9, P = 0·003) and increased risk of relapse/progression (Hazard ratio 10·8, P = 0·0006). Detectable ctDNA is associated with an increased risk of relapse/progression, but further validation studies are necessary to confirm these findings and determine the clinical utility of NGS-based minimal residual disease monitoring in lymphoma patients after HSCT. © 2016 John Wiley & Sons Ltd.

  13. Alternative splicing of human elastin mRNA indicated by sequence analysis of cloned genomic and complementary DNA

    International Nuclear Information System (INIS)

    Indik, Z.; Yeh, H.; Ornstein-goldstein, N.; Sheppard, P.; Anderson, N.; Rosenbloom, J.C.; Peltonen, L.; Rosenbloom, J.

    1987-01-01

    Poly(A) + RNA, isolated from a single 7-mo fetal human aorta, was used to synthesize cDNA by the RNase H method, and the cDNA was inserted into λgt10. Recombinant phage containing elastin sequences were identified by hybridization with cloned, exon-containing fragments of the human elastin gene. Three clones containing inserts of 3.3, 2.7, and 2.3 kilobases were selected for further analysis. Three overlapping clones containing 17.8 kilobases of the human elastin gene were also isolated from genomic libraries. Complete sequence analysis of the six clones demonstrated that: (i) the cDNA encompassed the entire translated portion of the mRNA encoding 786 amino acids, including several unusual hydrophilic amino acid sequences not previously identified in porcine tropoelastin, (ii) exons encoding either hydrophobic or crosslinking domains in the protein alternated in the gene, and (iii) a great abundance of Alu repetitive sequences occurred throughout the introns. The data also indicated substantial alternative splicing of the mRNA. These results suggest the potential for significant variation in the precise molecular structure of the elastic fiber in the human population

  14. 2D-dynamic representation of DNA sequences as a graphical tool in bioinformatics

    Science.gov (United States)

    Bielińska-Wa̧Ż, D.; Wa̧Ż, P.

    2016-10-01

    2D-dynamic representation of DNA sequences is briefly reviewed. Some new examples of 2D-dynamic graphs which are the graphical tool of the method are shown. Using the examples of the complete genome sequences of the Zika virus it is shown that the present method can be applied for the study of the evolution of viral genomes.

  15. Cloning and sequencing of Indian Water buffalo (Bubalus bubalis) interleukin-3 cDNA

    KAUST Repository

    Sugumar, Thennarasu; Harishankar, M.; Dhinakar Raj, G.

    2011-01-01

    Full-length cDNA (435 bp) of the interleukin-3(IL-3) gene of the Indian water buffalo was amplified by reverse transcriptase-polymerase chain reaction and sequenced. This sequence had 96% nucleotide identity and 92% amino acid identity with bovine

  16. Sequence-specific protection of duplex DNA against restriction and methylation enzymes by pseudocomplementary PNAs

    DEFF Research Database (Denmark)

    Izvolsky, K I; Demidov, V V; Nielsen, P E

    2000-01-01

    I restriction endonuclease and dam methylase. The pcPNA-assisted protection against enzymatic methylation is more efficient when the PNA-binding site embodies the methylase-recognition site rather than overlaps it. We conclude that pcPNAs may provide the robust tools allowing to sequence-specifically manipulate...... DNA duplexes in a virtually sequence-unrestricted manner....

  17. ABI Base Recall: Automatic Correction and Ends Trimming of DNA Sequences.

    Science.gov (United States)

    Elyazghi, Zakaria; Yazouli, Loubna El; Sadki, Khalid; Radouani, Fouzia

    2017-12-01

    Automated DNA sequencers produce chromatogram files in ABI format. When viewing chromatograms, some ambiguities are shown at various sites along the DNA sequences, because the program implemented in the sequencing machine and used to call bases cannot always precisely determine the right nucleotide, especially when it is represented by either a broad peak or a set of overlaying peaks. In such cases, a letter other than A, C, G, or T is recorded, most commonly N. Thus, DNA sequencing chromatograms need manual examination: checking for mis-calls and truncating the sequence when errors become too frequent. The purpose of this paper is to develop a program allowing the automatic correction of these ambiguities. This application is a Web-based program powered by Shiny and runs under R platform for an easy exploitation. As a part of the interface, we added the automatic ends clipping option, alignment against reference sequences, and BLAST. To develop and test our tool, we collected several bacterial DNA sequences from different laboratories within Institut Pasteur du Maroc and performed both manual and automatic correction. The comparison between the two methods was carried out. As a result, we note that our program, ABI base recall, accomplishes good correction with a high accuracy. Indeed, it increases the rate of identity and coverage and minimizes the number of mismatches and gaps, hence it provides solution to sequencing ambiguities and saves biologists' time and labor.

  18. Studies of base pair sequence effects on DNA solvation based on all

    Indian Academy of Sciences (India)

    Detailed analyses of the sequence-dependent solvation and ion atmosphere of DNA are presented based on molecular dynamics (MD) simulations on all the 136 unique tetranucleotide steps obtained by the ABC consortium using the AMBER suite of programs. Significant sequence effects on solvation and ion localization ...

  19. Tandemly repeated sequence in 5'end of mtDNA control region of ...

    African Journals Online (AJOL)

    Extensive length variability was observed in 5' end sequence of the mitochondrial DNA control region of the Japanese Spanish mackerel (Scomberomorus niphonius). This length variability was due to the presence of varying numbers of a 56-bp tandemly repeated sequence and a 46-bp insertion/deletion (indel).

  20. Open source tools to exploit DNA sequence data from livestock species

    Science.gov (United States)

    Next-Generation Sequencing (NGS) is a recent technological development that allows researchers to rapidly determine the DNA sequence of an individual. The decrease in cost of NGS has brought the technology into the realm of practical applications in livestock genomics, where it can be used to genera...

  1. Noninvasive prenatal diagnosis of fetal trisomy 18 and trisomy 13 by maternal plasma DNA sequencing.

    NARCIS (Netherlands)

    Chen, E.Z.; Chiu, R.W.; Sun, H; Akolekar, R.; Chan, K.C.; Leung, T.Y.; Jiang, P.; Zheng, Y.W.; Lun, F.M.; Chan, L.Y.; Jin, Y.; Go, A.T.; Lau, E.T; To, W.W.; Leung, W.C.; Tang, R.Y.; Au-Yeung, S.K.; Lam, H.; Kung, Y.Y.; Zhang, X.; Vugt, J.M.G. van; Minekawa, R.; Tang, M.H.; Wang, J.; Oudejans, C.B.; Lau, T.K.; Nicolaides, K.H.; Lo, Y.M.

    2011-01-01

    Massively parallel sequencing of DNA molecules in the plasma of pregnant women has been shown to allow accurate and noninvasive prenatal detection of fetal trisomy 21. However, whether the sequencing approach is as accurate for the noninvasive prenatal diagnosis of trisomy 13 and 18 is unclear due

  2. True single-molecule DNA sequencing of a pleistocene horse bone

    DEFF Research Database (Denmark)

    Orlando, Ludovic Antoine Alexandre; Ginolhac, Aurélien; Raghavan, Maanasa

    2011-01-01

    -preserved Pleistocene horse bone using the Helicos HeliScope and Illumina GAIIx platforms, respectively. We find that the percentage of endogenous DNA sequences derived from the horse is higher among the Helicos data than Illumina data. This result indicates that the molecular biology tools used to generate sequencing...

  3. The nucleotide sequence of human transition protein 1 cDNA

    Energy Technology Data Exchange (ETDEWEB)

    Luerssen, H; Hoyer-Fender, S; Engel, W [Universitaet Goettingen (West Germany)

    1988-08-11

    The authors have screened a human testis cDNA library with an oligonucleotide of 81 mer prepared according to a part of the published nucleotide sequence of the rat transition protein TP 1. They have isolated a cDNA clone with the length of 441 bp containing the coding region of 162 bp for human transition protein 1. There is about 84% homology in the coding region of the sequence compared to rat. The human cDNA-clone encodes a polypeptide of 54 amino acids of which 7 are different to that of rat.

  4. Identifying the main mosquito species in China based on DNA barcoding.

    Directory of Open Access Journals (Sweden)

    Gang Wang

    Full Text Available Mosquitoes are insects of the Diptera, Nematocera, and Culicidae families, some species of which are important disease vectors. Identifying mosquito species based on morphological characteristics is difficult, particularly the identification of specimens collected in the field as part of disease surveillance programs. Because of this difficulty, we constructed DNA barcodes of the cytochrome c oxidase subunit 1, the COI gene, for the more common mosquito species in China, including the major disease vectors. A total of 404 mosquito specimens were collected and assigned to 15 genera and 122 species and subspecies on the basis of morphological characteristics. Individuals of the same species grouped closely together in a Neighborhood-Joining tree based on COI sequence similarity, regardless of collection site. COI gene sequence divergence was approximately 30 times higher for species in the same genus than for members of the same species. Divergence in over 98% of congeneric species ranged from 2.3% to 21.8%, whereas divergence in conspecific individuals ranged from 0% to 1.67%. Cryptic species may be common and a few pseudogenes were detected.

  5. Sequence Dependent Interactions Between DNA and Single-Walled Carbon Nanotubes

    Science.gov (United States)

    Roxbury, Daniel

    It is known that single-stranded DNA adopts a helical wrap around a single-walled carbon nanotube (SWCNT), forming a water-dispersible hybrid molecule. The ability to sort mixtures of SWCNTs based on chirality (electronic species) has recently been demonstrated using special short DNA sequences that recognize certain matching SWCNTs of specific chirality. This thesis investigates the intricacies of DNA-SWCNT sequence-specific interactions through both experimental and molecular simulation studies. The DNA-SWCNT binding strengths were experimentally quantified by studying the kinetics of DNA replacement by a surfactant on the surface of particular SWCNTs. Recognition ability was found to correlate strongly with measured binding strength, e.g. DNA sequence (TAT)4 was found to bind 20 times stronger to the (6,5)-SWCNT than sequence (TAT)4T. Next, using replica exchange molecular dynamics (REMD) simulations, equilibrium structures formed by (a) single-strands and (b) multiple-strands of 12-mer oligonucleotides adsorbed on various SWCNTs were explored. A number of structural motifs were discovered in which the DNA strand wraps around the SWCNT and 'stitches' to itself via hydrogen bonding. Great variability among equilibrium structures was observed and shown to be directly influenced by DNA sequence and SWCNT type. For example, the (6,5)-SWCNT DNA recognition sequence, (TAT)4, was found to wrap in a tight single-stranded right-handed helical conformation. In contrast, DNA sequence T12 forms a beta-barrel left-handed structure on the same SWCNT. These are the first theoretical indications that DNA-based SWCNT selectivity can arise on a molecular level. In a biomedical collaboration with the Mayo Clinic, pathways for DNA-SWCNT internalization into healthy human endothelial cells were explored. Through absorbance spectroscopy, TEM imaging, and confocal fluorescence microscopy, we showed that intracellular concentrations of SWCNTs far exceeded those of the incubation

  6. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    Science.gov (United States)

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspertate as a potential new therapeutic target in bladder cancer. PMID:24469795

  7. cDNA, genomic sequence cloning and overexpression of ribosomal ...

    African Journals Online (AJOL)

    PRECIOUS

    2009-11-02

    Nov 2, 2009 ... basic machinery of protein synthesis and regulation, but also in various ... The genomic DNA was isolated from Giant Panda muscle tissue according to the ... for 45 s, 72°C for 2 min in the first cycle and the anneal temperature deceased 0.2°C ..... edition, Cold Spring Harbor aboratory Press. Cold Spring ...

  8. MotifMark: Finding regulatory motifs in DNA sequences.

    Science.gov (United States)

    Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L; Wang, May D

    2017-07-01

    The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short in precise characterization of motifs, and as a result, require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning, that can find binding sites on candidate probes and rank their specificity in regard to the underlying transcription factor. We developed a pipeline to analyze experimental data derived from compact universal protein binding microarrays and benchmarked it against two of the most accurate motif search methods. Our results indicate that MotifMark can be a viable alternative technique for prediction of motif from protein binding microarrays and possibly other related high-throughput techniques.

  9. Identifying DNA Methylation Features that Underlie Prostate Cancer Disparities

    Science.gov (United States)

    2017-10-01

    15.3%) NA 6 (6%) 6 (5.4%) Prostate - specific Antigen (PSA) ng/mL 76.7 (42.9) 78.2 (40.7) pTNM Stage T2 68 (67.3%) 48 (43.2%) T3 29 (28.7%) 58...Profiles Primary Aim #1: Determine if methylation profiles differ by race/ancestry Primary Aim #2: Identify ethnicity- specific markers of prostate ...by ethnicity and to identify ethnicity- specific methylation features of prostate cancer that could contribute the racial disparities that exist in

  10. High-Quality Exome Sequencing of Whole-Genome Amplified Neonatal Dried Blood Spot DNA

    DEFF Research Database (Denmark)

    Poulsen, Jesper Buchhave; Lescai, Francesco; Grove, Jakob

    2016-01-01

    Stored neonatal dried blood spot (DBS) samples from neonatal screening programmes are a valuable diagnostic and research resource. Combined with information from national health registries they can be used in population-based studies of genetic diseases. DNA extracted from neonatal DBSs can...... be amplified to obtain micrograms of an otherwise limited resource, referred to as whole-genome amplified DNA (wgaDNA). Here we investigate the robustness of exome sequencing of wgaDNA of neonatal DBS samples. We conducted three pilot studies of seven, eight and seven subjects, respectively. For each subject...... we analysed a neonatal DBS sample and corresponding adult whole-blood (WB) reference sample. Different DNA sample types were prepared for each of the subjects. Pilot 1: wgaDNA of 2x3.2mm neonatal DBSs (DBS_2x3.2) and raw DNA extract of the WB reference sample (WB_ref). Pilot 2: DBS_2x3.2, WB...

  11. Using DNA-Stable Isotope Probing to Identify MTBE- and TBA-Degrading Microorganisms in Contaminated Groundwater.

    Science.gov (United States)

    Key, Katherine C; Sublette, Kerry L; Duncan, Kathleen; Mackay, Douglas M; Scow, Kate M; Ogles, Dora

    2013-01-01

    Although the anaerobic biodegradation of methyl tert -butyl ether (MTBE) and tert -butyl alcohol (TBA) has been documented in the laboratory and the field, knowledge of the microorganisms and mechanisms involved is still lacking. In this study, DNA-stable isotope probing (SIP) was used to identify microorganisms involved in anaerobic fuel oxygenate biodegradation in a sulfate-reducing MTBE and TBA plume. Microorganisms were collected in the field using Bio-Sep® beads amended with 13 C 5 -MTBE, 13 C 1 -MTBE (only methoxy carbon labeled), or 13 C 4 -TBA. 13 C-DNA and 12 C-DNA extracted from the Bio-Sep beads were cloned and 16S rRNA gene sequences were used to identify the indigenous microorganisms involved in degrading the methoxy group of MTBE and the tert -butyl group of MTBE and TBA. Results indicated that microorganisms were actively degrading 13 C-labeled MTBE and TBA in situ and the 13 C was incorporated into their DNA. Several sequences related to known MTBE- and TBA-degraders in the Burkholderiales and the Sphingomonadales orders were detected in all three 13 C clone libraries and were likely to be primary degraders at the site. Sequences related to sulfate-reducing bacteria and iron-reducers, such as Geobacter and Geothrix , were only detected in the clone libraries where MTBE and TBA were fully labeled with 13 C, suggesting that they were involved in processing carbon from the tert -butyl group. Sequences similar to the Pseudomonas genus predominated in the clone library where only the methoxy carbon of MTBE was labeled with 13 C. It is likely that members of this genus were secondary degraders cross-feeding on 13 C-labeled metabolites such as acetate.

  12. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments.

    Science.gov (United States)

    Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias

    2013-09-24

    Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retr