WorldWideScience

Sample records for repeat dna sequences

  1. Cloning, characterization, and properties of seven triplet repeat DNA sequences.

    Science.gov (United States)

    Ohshima, K; Kang, S; Larson, J E; Wells, R D

    1996-07-12

    Several neuromuscular and neurodegenerative diseases are caused by genetically unstable triplet repeat sequences (CTG.CAG, CGG.CCG, or AAG.CTT) in or near the responsible genes. We implemented novel cloning strategies with chemically synthesized oligonucleotides to clone seven of the triplet repeat sequences (GTA.TAC, GAT.ATC, GTT.AAC, CAC.GTG, AGG.CCT, TCG.CGA, and AAG.CTT), and the adjoining paper (Ohshima, K., Kang, S., Larson, J. E., and Wells, R. D.(1996) J. Biol. Chem. 271, 16784-16791) describes studies on TTA.TAA. This approach in conjunction with in vivo expansion studies in Escherichia coli enabled the preparation of at least 81 plasmids containing the repeat sequences with lengths of approximately 16 up to 158 triplets in both orientations with varying extents of polymorphisms. The inserts were characterized by DNA sequencing as well as DNA polymerase pausings, two-dimensional agarose gel electrophoresis, and chemical probe analyses to evaluate the capacity to adopt negative supercoil induced non-B DNA conformations. AAG.CTT and AGG.CCT form intramolecular triplexes, and the other five repeat sequences do not form any previously characterized non-B structures. However, long tracts of TCG.CGA showed strong inhibition of DNA synthesis at specific loci in the repeats as seen in the cases of CTG.CAG and CGG.CCG (Kang, S., Ohshima, K., Shimizu, M., Amirhaeri, S., and Wells, R. D.(1995) J. Biol. Chem. 270, 27014-27021). This work along with other studies (Wells, R. D.(1996) J. Biol. Chem. 271, 2875-2878) on CTG.CAG, CGG.CCG, and TTA.TAA makes available long inserts of all 10 triplet repeat sequences for a variety of physical, molecular biological, genetic, and medical investigations. A model to explain the reduction in mRNA abundance in Friedreich's ataxia based on intermolecular triplex formation is proposed.
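
The count of "all 10 triplet repeat sequences" in the abstract above can be checked with a short sketch. It assumes that two triplet motifs describe the same duplex repeat when one is a rotation of the other or of its reverse complement (which is how a notation such as CTG.CAG pairs the two strands), and that mononucleotide runs such as AAA are excluded. The function names are illustrative and do not come from the paper.

```python
from itertools import product

# Watson-Crick complement table for upper-case DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def rotations(motif: str) -> set:
    """All cyclic shifts of a motif (CAG -> CAG, AGC, GCA)."""
    return {motif[i:] + motif[:i] for i in range(len(motif))}


def canonical(motif: str) -> str:
    """Smallest string among rotations of the motif and of its reverse complement."""
    return min(rotations(motif) | rotations(reverse_complement(motif)))


def triplet_repeat_classes() -> list:
    """Distinct duplex triplet repeats, excluding mononucleotide runs."""
    motifs = ("".join(p) for p in product("ACGT", repeat=3))
    return sorted({canonical(m) for m in motifs if len(set(m)) > 1})


if __name__ == "__main__":
    classes = triplet_repeat_classes()
    print(len(classes), classes)
```

Under these assumptions the 60 non-homopolymer triplets collapse into 10 classes, so CTG and CAG, for example, map to the same canonical motif.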

  2. Spectroscopic investigation on the telomeric DNA base sequence repeat

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Telomeres are protein-DNA complexes at the ends of linear chromosomes, which protect chromosomal integrity and maintain cellular replicative capacity. From single-cell organisms to advanced animals and plants, the structures and functions of telomeres are highly conserved. In human and other vertebrate cells, the telomeric DNA base sequence is (TTAGGG)n. In the present work, we obtained absorption and fluorescence spectra from seven synthesized oligonucleotides that simulate the telomeric DNA system and calculated their relative fluorescence quantum yields, from which not only telomeric DNA characteristics are predicted but also possibly the shortened telomeric sequences during cell division are im[...] relative fluorescence quantum yield and remarkable excitation energy internal conversion, which tallies with the telomeric sequence of (TTAGGG)n. This result shows that telomeric DNA has a strong non-radiative or internal-conversion capability.

  3. Methods for sequencing GC-rich and CCT repeat DNA templates

    Science.gov (United States)

    Robinson, Donna L.

    2007-02-20

    The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.

  4. Significance of satellite DNA revealed by conservation of a widespread repeat DNA sequence among angiosperms.

    Science.gov (United States)

    Mehrotra, Shweta; Goel, Shailendra; Raina, Soom Nath; Rajpal, Vijay Rani

    2014-08-01

    The analysis of plant genome structure and evolution requires comprehensive characterization of repetitive sequences that make up the majority of plant nuclear DNA. In the present study, we analyzed the nature of pCtKpnI-I and pCtKpnI-II tandem repeated sequences, reported earlier in Carthamus tinctorius. Interestingly, a homolog of the pCtKpnI-I repeat sequence was also found to be present in widely divergent families of angiosperms. pCtKpnI-I showed high sequence similarity but low copy number among various taxa of different families of angiosperms analyzed. In comparison, pCtKpnI-II was specific to the genus Carthamus and was not present in any other taxa analyzed. The molecular structure of pCtKpnI-I was analyzed in various unrelated taxa of angiosperms to decipher the evolutionarily conserved nature of the sequence and its possible functional role.

  5. Plasmid P1 replication: negative control by repeated DNA sequences.

    OpenAIRE

    Chattoraj, D; Cordes, K.; Abeles, A

    1984-01-01

    The incompatibility locus, incA, of the unit-copy plasmid P1 is contained within a fragment that is essentially a set of nine 19-base-pair repeats. One or more copies of the fragment destabilizes the plasmid when present in trans. Here we show that extra copies of incA interfere with plasmid DNA replication and that a deletion of most of incA increases plasmid copy number. Thus, incA is not essential for replication but is required for its control. When cloned in a high-copy-number vector, pi...

  6. Applications of inter simple sequence repeat (ISSR) rDNA in ...

    African Journals Online (AJOL)

    Applications of inter simple sequence repeat (ISSR) rDNA in detecting ... and phylogenetic relationships between Lymnaea natalensis collected from Giza, ... in water samples of all tested governorates with different significant differences.

  7. Exact Tandem Repeats Analyzer (E-TRA): A new program for DNA sequence mining

    Indian Academy of Sciences (India)

    Mehmet Karaca; Mehmet Bilgen; A. Naci Onus; Ayse Gul Ince; Safinaz Y. Elmasulu

    2005-04-01

    Exact Tandem Repeats Analyzer 1.0 (E-TRA) combines sequence motif searches with keywords such as ‘organs’, ‘tissues’, ‘cell lines’ and ‘development stages’ for finding simple exact tandem repeats as well as non-simple repeats. E-TRA has several advanced repeat search parameters/options compared to other repeat finder programs as it not only accepts GenBank, FASTA and expressed sequence tags (EST) sequence files, but also does analysis of multiple files with multiple sequences. The minimum and maximum tandem repeat motif lengths that E-TRA finds vary from one to one thousand. Advanced user defined parameters/options let the researchers use different minimum motif repeats search criteria for varying motif lengths simultaneously. One of the most interesting features of genomes is the presence of relatively short tandem repeats (TRs). These repeated DNA sequences are found in both prokaryotes and eukaryotes, distributed almost at random throughout the genome. Some of the tandem repeats play important roles in the regulation of gene expression whereas others do not have any known biological function as yet. Nevertheless, they have proven to be very beneficial in DNA profiling and genetic linkage analysis studies. To demonstrate the use of E-TRA, we used 5,465,605 human EST sequences derived from 18,814,550 GenBank EST sequences. Our results indicated that 12.44% (679,800) of the human EST sequences contained simple and non-simple repeat string patterns varying from one to 126 nucleotides in length. The results also revealed that human organs, tissues, cell lines and different developmental stages differed in number of repeats as well as repeat composition, indicating that the distribution of expressed tandem repeats among tissues or organs are not random, thus differing from the un-transcribed repeats found in genomes.
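
As a rough illustration of what an exact tandem repeat search with per-motif-length thresholds involves, here is a minimal sketch. It is not E-TRA's implementation: the function name, the `TandemRepeat` record and the threshold values are hypothetical, and the scan is a simple greedy left-to-right pass over one sequence.

```python
from typing import Dict, List, NamedTuple


class TandemRepeat(NamedTuple):
    start: int   # 0-based start position in the sequence
    motif: str   # repeated unit
    copies: int  # number of consecutive exact copies


def find_exact_tandem_repeats(seq: str, min_copies: Dict[int, int]) -> List[TandemRepeat]:
    """Greedy scan for exact tandem repeats.

    `min_copies` maps a motif length to the minimum number of copies
    required for that length (hypothetical thresholds, chosen by the caller).
    """
    seq = seq.upper()
    hits = []
    i = 0
    while i < len(seq):
        best = None
        for k in sorted(min_copies):
            motif = seq[i:i + k]
            if len(motif) < k:
                break  # not enough sequence left for this motif length
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == motif:
                copies += 1
            span = copies * k
            # Keep the longest tract; on ties the shorter motif (seen first) wins.
            if copies >= min_copies[k] and (best is None or span > best.copies * len(best.motif)):
                best = TandemRepeat(i, motif, copies)
        if best is not None:
            hits.append(best)
            i += best.copies * len(best.motif)  # skip past the reported tract
        else:
            i += 1
    return hits


if __name__ == "__main__":
    example = "TTGACAGCAGCAGCAGTTAAAAAAAAGG"  # made-up sequence
    for hit in find_exact_tandem_repeats(example, {1: 6, 3: 4}):
        print(hit)
```

Passing a different minimum copy number for each motif length mirrors the abstract's point about using different search criteria for different motif lengths at the same time; a real tool would also need file parsing and handling of non-simple repeats.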

  8. Characterization of a highly repeated DNA sequence family in five species of the genus Eulemur.

    Science.gov (United States)

    Ventura, M; Boniotto, M; Cardone, M F; Fulizio, L; Archidiacono, N; Rocchi, M; Crovella, S

    2001-09-19

    The karyotypes of Eulemur species exhibit a high degree of variation, as a consequence of the Robertsonian fusion and/or centromere fission. Centromeric and pericentromeric heterochromatin of eulemurs is constituted by highly repeated DNA sequences (including some telomeric TTAGGG repeats) which have so far been investigated and used for the study of the systematic relationships of the different species of the genus Eulemur. In our study, we have cloned a set of repetitive pericentromeric sequences of five Eulemur species: E. fulvus fulvus (EFU), E. mongoz (EMO), E. macaco (EMA), E. rubriventer (ERU), and E. coronatus (ECO). We have characterized these clones by sequence comparison and by comparative fluorescence in situ hybridization analysis in EMA and EFU. Our results showed a high degree of sequence similarity among Eulemur species, indicating a strong conservation, within the five species, of these pericentromeric highly repeated DNA sequences.

  9. Cytogenetic analysis of Populus trichocarpa--ribosomal DNA, telomere repeat sequence, and marker-selected BACs.

    Science.gov (United States)

    Islam-Faridi, M N; Nelson, C D; DiFazio, S P; Gunter, L E; Tuskan, G A

    2009-01-01

    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequence assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.

  10. Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

    Energy Technology Data Exchange (ETDEWEB)

    Tuskan, Gerald A [ORNL]; Gunter, Lee E [ORNL]; DiFazio, Stephen P [West Virginia University]

    2009-01-01

    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequence assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.

  11. A Novel Signal Processing Measure to Identify Exact and Inexact Tandem Repeat Patterns in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Ravi Gupta

    2007-03-01

    The identification and analysis of repetitive patterns are active areas of biological and computational research. Tandem repeats in telomeres play a role in cancer, and hypervariable trinucleotide tandem repeats are linked to over a dozen major neurodegenerative genetic disorders. In this paper, we present an algorithm to identify exact and inexact repeat patterns in DNA sequences based on the orthogonal exactly periodic subspace decomposition technique. Using the new measure, our algorithm resolves problems such as whether the repeat pattern is of period P or a multiple of it (i.e., 2P, 3P, etc.), and several other problems that were present in previous signal-processing-based algorithms. We present an efficient algorithm of O(N Lw log Lw), where N is the length of the DNA sequence and Lw is the window length, for identifying repeats. The algorithm operates in two stages. In the first stage, each nucleotide is analyzed separately for periodicity, and in the second stage, the periodic information of each nucleotide is combined to identify the tandem repeats. Datasets having exact and inexact repeats were used for the experiments. The experimental results show the effectiveness of the approach.
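
The two-stage idea described above (score each nucleotide separately, then combine the scores) can be sketched as follows. This does not implement the exactly periodic subspace decomposition used in the paper; it swaps in a much simpler lag-matching score, and all names are illustrative.

```python
BASES = "ACGT"


def per_base_matches(window: str, base: str, period: int) -> int:
    """Stage 1: count positions where `base` recurs exactly `period` positions later."""
    return sum(
        1
        for i in range(len(window) - period)
        if window[i] == base and window[i + period] == base
    )


def combined_score(window: str, period: int) -> float:
    """Stage 2: pool the four per-base counts into one score in [0, 1]."""
    total = sum(per_base_matches(window, b, period) for b in BASES)
    return total / (len(window) - period)


def estimate_period(window: str, max_period: int, tol: float = 1e-9):
    """Return (period, scores); ties between P and 2P, 3P, ... go to the smallest."""
    limit = min(max_period, len(window) - 1)
    scores = {p: combined_score(window, p) for p in range(1, limit + 1)}
    best = max(scores.values())
    period = min(p for p, s in scores.items() if s >= best - tol)
    return period, scores


if __name__ == "__main__":
    print(estimate_period("ACGACGACGACGACG", max_period=7)[0])  # exact repeat
    print(estimate_period("ACGACGATGACGACG", max_period=6)[0])  # one substitution
```

Choosing the smallest period among tied scores is only a naive stand-in for the P-versus-multiple ambiguity the abstract mentions. In short windows, long candidate periods compare few positions and can score misleadingly high, so `max_period` should stay well below the window length.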

  12. Comparison of highly repeated DNA sequences in some Lemuridae and taxonomic implications.

    Science.gov (United States)

    Montagnon, D; Crovella, S; Rumpler, Y

    1993-01-01

    Highly repeated DNA sequences of Eulemur fulvus mayottensis, E. coronatus, Lemur catta, and Hapalemur griseus griseus have been identified and compared. Sequence analysis of highly repeated DNA fragments isolated from L. catta and Hapalemur showed a high percentage of similarity (nearly 95%), as did fragments isolated from the two very close Eulemur species, whereas comparison of the DNA fragments isolated from the two Eulemur species and the L. catta/Hapalemur group showed a very low percentage (approximately 40%) of identity, as might be expected for distant species. These results confirm our previous data, obtained by Southern blot hybridization techniques on the same species, and strongly support the existence of a common trunk between L. catta and Hapalemur, different from the one leading to the Eulemur species.

  13. Recombination frequency in plasmid DNA containing direct repeats--predictive correlation with repeat and intervening sequence length.

    Science.gov (United States)

    Oliveira, Pedro H; Lemos, Francisco; Monteiro, Gabriel A; Prazeres, Duarte M F

    2008-09-01

    In this study, a simple non-linear mathematical function is proposed to accurately predict recombination frequencies in bacterial plasmid DNA harbouring directly repeated sequences. The mathematical function, which was developed on the basis of published data on deletion-formation in multicopy plasmids containing direct-repeats (14-856 bp) and intervening sequences (0-3872 bp), also accounts for the strain genotype in terms of its recA function. A bootstrap resampling technique was used to estimate confidence intervals for the correlation parameters. More than 92% of the predicted values were found to be within a pre-established +/-5-fold interval of deviation from experimental data. The correlation does not only provide a way to predict, with good accuracy, the recombination frequency, but also opens the way to improve insight into these processes.
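
The "+/-5-fold interval" criterion above can be made concrete with a small sketch. The abstract does not give the predictive function itself, so this only shows how predictions might be checked against measurements; the function names and the frequency values in the example are hypothetical, not data from the study.

```python
def within_fold(predicted: float, observed: float, fold: float = 5.0) -> bool:
    """True if predicted lies between observed/fold and observed*fold."""
    ratio = predicted / observed
    return 1.0 / fold <= ratio <= fold


def fraction_within_fold(pairs, fold: float = 5.0) -> float:
    """Share of (predicted, observed) pairs that agree within the fold interval."""
    pairs = list(pairs)
    hits = sum(1 for predicted, observed in pairs if within_fold(predicted, observed, fold))
    return hits / len(pairs)


if __name__ == "__main__":
    # Made-up (predicted, observed) recombination frequencies for illustration only.
    toy_pairs = [(1e-4, 3e-4), (2e-3, 1e-3), (1e-5, 1e-4), (4e-6, 1e-6)]
    print(fraction_within_fold(toy_pairs))
```

Working with the ratio rather than the difference reflects that recombination frequencies span several orders of magnitude, so agreement is naturally judged on a multiplicative scale.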

  14. Genomic and polyploid evolution in genus Avena as revealed by RFLPs of repeated DNA sequences.

    Science.gov (United States)

    Morikawa, Toshinobu; Nishihara, Miho

    2009-06-01

    Phylogenetic relationships and genome affinities were investigated by utilizing all the biological Avena species consisting of 11 diploid species (15 accessions), 8 tetraploid species (9 accessions) and 4 hexaploid species (5 accessions). Genomic DNA regions of As120a, avenin, and globulin were amplified by PCR. A total of 130 polymorphic fragments were detected out of 156 fragments generated by digesting the PCR-amplified fragments with 11 restriction enzymes. The number of fragments generated by PCR-amplification followed by digestion with restriction enzymes was almost the same as those among the three repeated DNA sequences. A high level of genetic distance was detected between A. damascena (Ad) and A. canariensis (Ac) genomes, which reflected their different morphology and reproductive isolation. The A. longiglumis (Al) and A. prostrata (Ap) genomes were closely related to the As genome group. The AB genome species formed a cluster with the AsAs genome artificial autotetraploid and the As genome diploids indicating near-autotetraploid origin. A. macrostachya is an outbreeding autotetraploid closely related to the C genome diploid and the AC genome tetraploid species. The differences of genetic distances estimated from the repeated DNA sequence divergence among the Avena species were consistent with genome divergences and it was possible to compare the genetic intra- and inter-ploidy relationships produced by RFLPs. These results suggested that the PCR-mediated analysis of repeated DNA polymorphism can be used as a tool to examine genomic relationships of polyploid species.

  15. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

    Science.gov (United States)

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-11-16

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats.

  16. Effective DNA fragmentation technique for simple sequence repeat detection with a microsatellite-enriched library and high-throughput sequencing.

    Science.gov (United States)

    Tanaka, Keisuke; Ohtake, Rumi; Yoshida, Saki; Shinohara, Takashi

    2017-04-01

    Two different techniques for genomic DNA fragmentation before microsatellite-enriched library construction, restriction enzyme (NlaIII and MseI) digestion and sonication, were compared to examine their effects on simple sequence repeat (SSR) detection using high-throughput sequencing. Tens of thousands of SSR regions from 5 species of the plant family Myrtaceae were detected when the output of individual samples was >1 million paired-end reads. Comparison of the two DNA fragmentation techniques showed that restriction enzyme digestion was superior to sonication for identification of heterozygous genotypes, whereas sonication was superior for detection of various SSR flanking regions with both species-specific and common characteristics. Therefore, choosing the most suitable DNA fragmentation method depends on the type of analysis that is planned.

  17. Localization of a new highly repeated DNA sequence of Lemur catta (Lemuridae, Strepsirhini).

    Science.gov (United States)

    Boniotto, Michele; Ventura, Mario; Cardone, Maria Francesca; Boaretto, Francesca; Archidiacono, Nicoletta; Rocchi, Mariano; Crovella, Sergio

    2002-10-01

    We have isolated and cloned an 800-bp highly repeated DNA (HRDNA) sequence from Lemur catta (LCA) and described its localization on LCA chromosomes. Lemur catta HRDNA sequences were localized by performing FISH experiments on standard and elongated metaphasic chromosomes using an LCA HRDNA probe (LCASAT). A complex hybridization pattern was detected. A strong pericentromeric hybridization signal was observed on most LCA chromosomes. Chromosomes 7 and 13 were lit in pericentromeric regions, as well as in the interspersed heterochromatin. Chromosomes 1, 3, 4, 17, 19, X, and microchromosomes (20, 25, 26, and 27) showed no signals in the pericentromeric region, but chromosomes 3 and 4 showed a positive hybridization in heterochromatic regions. The 800-bp L. catta HRDNA was species specific. We performed FISH experiments with the LCASAT probe on Eulemur macaco macaco (EMA) and Eulemur fulvus fulvus (EFU) metaphases and no positive signal of hybridization was detected. These findings were also confirmed by Southern blot analysis and PCR.

  18. Nucleotide sequence, DNA damage location and protein stoichiometry influence base excision repair outcome at CAG/CTG repeats

    Science.gov (United States)

    Goula, Agathi-Vasiliki; Pearson, Christopher E.; Della Maria, Julie; Trottier, Yvon; Tomkinson, Alan E.; Wilson, David M.; Merienne, Karine

    2012-01-01

    Expansion of CAG/CTG repeats is the underlying cause of more than fourteen genetic disorders, including Huntington’s disease (HD) and myotonic dystrophy. The mutational process is ongoing, with increases in repeat size enhancing the toxicity of the expansion in specific tissues. In many repeat diseases the repeats exhibit high instability in the striatum, whereas instability is minimal in the cerebellum. We provide molecular insights as to how base excision repair (BER) protein stoichiometry may contribute to the tissue-selective instability of CAG/CTG repeats by using specific repair assays. Oligonucleotide substrates with an abasic site were mixed with either reconstituted BER protein stoichiometries mimicking the levels present in HD mouse striatum or cerebellum, or with protein extracts prepared from HD mouse striatum or cerebellum. In both cases, repair efficiency at CAG/CTG repeats and at control DNA sequences was markedly reduced under the striatal conditions, likely due to the lower level of APE1, FEN1 and LIG1. Damage located towards the 5’ end of the repeat tract was poorly repaired, accumulating incompletely processed intermediates, as compared to an AP lesion in the centre or at the 3’ end of the repeats or within a control sequence. Moreover, repair of lesions at the 5’ end of CAG or CTG repeats involved multinucleotide synthesis, particularly under the cerebellar stoichiometry, suggesting that long-patch BER processes lesions at sequences susceptible to hairpin formation. Our results show that BER stoichiometry, nucleotide sequence and DNA damage position modulate repair outcome, and suggest that a suboptimal LP-BER activity promotes CAG/CTG repeat instability. PMID:22497302

  19. Differential distribution and association of repeat DNA sequences in the lateral element of the synaptonemal complex in rat spermatocytes.

    Science.gov (United States)

    Hernández-Hernández, Abrahan; Rincón-Arano, Héctor; Recillas-Targa, Félix; Ortiz, Rosario; Valdes-Quezada, Christian; Echeverría, Olga M; Benavente, Ricardo; Vázquez-Nin, Gerardo H

    2008-02-01

    The synaptonemal complex (SC) is an evolutionarily conserved structure that mediates synapsis of homologous chromosomes during meiotic prophase I. Previous studies have established that the chromatin of homologous chromosomes is organized in loops that are attached to the lateral elements (LEs) of the SC. The characterization of the genomic sequences associated with LEs of the SC represents an important step toward understanding meiotic chromosome organization and function. To isolate these genomic sequences, we performed chromatin immunoprecipitation assays in rat spermatocytes using an antibody against SYCP3, a major structural component of the LEs of the SC. Our results demonstrated the reproducible and exclusive isolation of repeat deoxyribonucleic acid (DNA) sequences, in particular long interspersed elements, short interspersed elements, long terminal direct repeats, satellite, and simple repeats. The association of these repeat sequences to the LEs of the SC was confirmed by in situ hybridization of meiotic nuclei shown by both light and electron microscopy. Signals were also detected over the chromatin surrounding SCs and in small loops protruding from the lateral elements into the SC central region. We propose that genomic repeat DNA sequences play a key role in anchoring the chromosome to the protein scaffold of the SC.

  20. Comparative molecular cytogenetic analyses of a major tandemly repeated DNA family and retrotransposon sequences in cultivated jute Corchorus species (Malvaceae).

    Science.gov (United States)

    Begum, Rabeya; Zakrzewski, Falk; Menzel, Gerhard; Weber, Beatrice; Alam, Sheikh Shamimul; Schmidt, Thomas

    2013-07-01

    The cultivated jute species Corchorus olitorius and Corchorus capsularis are important fibre crops. The analysis of repetitive DNA sequences, comprising a major part of plant genomes, has not been carried out in jute but is useful to investigate the long-range organization of chromosomes. The aim of this study was the identification of repetitive DNA sequences to facilitate comparative molecular and cytogenetic studies of two jute cultivars and to develop a fluorescent in situ hybridization (FISH) karyotype for chromosome identification. A plasmid library was generated from C. olitorius and C. capsularis with genomic restriction fragments of 100-500 bp, which was complemented by targeted cloning of satellite DNA by PCR. The diversity of the repetitive DNA families was analysed comparatively. The genomic abundance and chromosomal localization of different repeat classes were investigated by Southern analysis and FISH, respectively. The cytosine methylation of satellite arrays was studied by immunolabelling. Major satellite repeats and retrotransposons have been identified from C. olitorius and C. capsularis. The satellite family CoSat I forms two undermethylated species-specific subfamilies, while the long terminal repeat (LTR) retrotransposons CoRetro I and CoRetro II show similarity to the Metaviridae of plant retroelements. FISH karyotypes were developed by multicolour FISH using these repetitive DNA sequences in combination with 5S and 18S-5·8S-25S rRNA genes, which enable unequivocal chromosome discrimination in both jute species. The analysis of the structure and diversity of the repeated DNA is crucial for genome sequence annotation. The reference karyotypes will be useful for breeding of jute and provide the basis for karyotyping homeologous chromosomes of wild jute species to reveal the genetic and evolutionary relationship between cultivated and wild Corchorus species.

  1. Chromosomal localization of a tandemly repeated DNA sequence in Trifolium repens L.

    Institute of Scientific and Technical Information of China (English)

    ZHU JM; NW ELLISON; et al.

    1996-01-01

    A karyotype of Trifolium repens constructed from mitotic cells revealed 13 pairs of metacentric and 3 pairs of submetacentric chromosomes including a pair of satellites located at the end of the short arm of chromosome 16. C-bands were identified around the centromeric regions of 8 pairs of chromosomes. A 350 bp tandemly repeated DNA sequence from T. repens labelled with digoxigenin hybridized to the proximal centromeric regions of 12 chromosome pairs. Some correlation between the distribution of the repeat sequence and the distribution of C-banding was demonstrated.

  2. Tracking of intercalary DNA sequences integrated into tandem repeat arrays in rye Secale vavilovii

    Directory of Open Access Journals (Sweden)

    Magdalena Achrem

    2017-06-01

    The structure of repetitive sequences of the JNK block present in the pericentromeric region of the 2RL chromosome was studied in Secale vavilovii. Amplification of sequences present between the JNK sequences led to the identification of seven abnormal DNA fragments. Two of these fragments showed high similarity to the glutamate 5-kinase gene and putative alcohol dehydrogenase gene of trypanosomatids from the genus Leishmania, whose presence can be explained by horizontal gene transfer (HGT). Other fragments were similar to the mitochondrial gene for ribosomal protein S4 in plants and to the glycoprotein (G) gene of the IHNV virus. Presumably, they are pseudogenes inserted into the JNK heterochromatin region. Within this region, fragments similar to the rye repetitive sequence and chromosome 3B in wheat were also found. There is no known mechanism that would explain how foreign sequences were inserted into the block region of tandem repetitive sequences of the JNK family.

  3. Repeat Finding Techniques, Data Structures and Algorithms in DNA sequences: A Survey

    Directory of Open Access Journals (Sweden)

    Freeson Kaniwa

    2015-09-01

    DNA sequencing technologies keep getting faster and cheaper, leading to massive availability of entire human genomes. This massive availability calls for better analysis tools with the potential to realize a shift from reactive to predictive medicine. The challenge remains, since analysis of entire human genomes needs more space and processing power than can be offered by a standard desktop PC. A background of key concepts surrounding the area of DNA analysis is given, along with a review of selected prominent algorithms used in this area. The significance of this paper is to survey the concepts surrounding DNA analysis so as to provide a deep-rooted understanding and knowledge transfer regarding existing approaches for DNA analysis using the Burrows-Wheeler transform and the wavelet tree, and their respective strengths and weaknesses. Following this survey, the paper attempts to provide some directions for future research.
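
For readers unfamiliar with the Burrows-Wheeler transform mentioned above, here is a minimal textbook sketch built from sorted rotations. It is not taken from the survey and is far too slow and memory-hungry for genome-scale input, where suffix-array-based constructions are used instead; the `$` sentinel and the function names are assumptions of this sketch.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via sorted rotations (quadratic memory, demo only)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(last_column[i] + table[i] for i in range(len(last_column)))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("GATTACAGATTACA")
    print(transformed)
    print(inverse_bwt(transformed))
```

Repetitive input tends to produce runs of identical characters in the transformed string, which is one reason the transform underlies compressed indexes for repeat-rich DNA.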

  4. DNA Sequencing

    Science.gov (United States)

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  5. Structure and organization of the mitochondrial DNA control region with tandemly repeated sequence in the Amazon ornamental fish.

    Science.gov (United States)

    Terencio, Maria Leandra; Schneider, Carlos Henrique; Gross, Maria Claudia; Feldberg, Eliana; Porto, Jorge Ivan Rebelo

    2013-02-01

    Tandemly repeated sequences are a common feature of vertebrate mitochondrial DNA control regions. However, questions still remain about their mode of evolution and function. To better understand patterns of variation in length and to explore the existence of previously described domains, we have characterized the control region structure of the Amazonian ornamental fish Nannostomus eques and Nannostomus unifasciatus. The control region ranged from 1121 to 1142 bp in length and could be separated into three domains: the domain associated with the extended terminal associated sequences, the central conserved domain, and the conserved sequence blocks domain. In the first domain, we encountered a sequence repeated 10 times in tandem (variable number tandem repeat (VNTR)) that could adopt an "inverted repetitions" type structural conformation. The results suggest that the VNTR pattern encountered in both N. eques and N. unifasciatus is consistent with the prerequisites of the illegitimate elongation model in which the unequal pairing of the chains near the 5'-end of the control region favors the formation of repetitions.

  6. DNA polymorphism among Fusarium oxysporum f.sp. elaeidis populations from oil palm, using a repeated and dispersed sequence "Palm".

    Science.gov (United States)

    Mouyna, I; Renard, J L; Brygoo, Y

    1996-07-31

    A worldwide collection of 76 F. oxysporum f.sp. elaeidis (Foe) isolates and 21 F. oxysporum isolates from the soil of several palm groves was analysed by RFLP. As a probe, we used a random DNA fragment (probe 46) from a genomic library of a Foe isolate. This probe contains two different types of sequence, one ("Palm") being repeated and dispersed in the genome, the other being a single-copy sequence. All F. oxysporum isolates from the palm-grove soils were non-pathogenic to oil palm. They all had a simple restriction pattern with one band homologous to the single-copy sequence of probe 46. All Foe isolates were pathogenic to oil palm and they all had complex patterns due to hybridization with "Palm". This repetitive sequence reveals that Foe isolates are distinct from the other F. oxysporum palm-grove soil isolates. The sequence can reliably discriminate pathogenic from non-pathogenic oil palm isolates. Based on DNA fingerprint similarities, Foe populations were divided into ten groups consisting of isolates with the same geographic origin. Isolates from Brazil and Ecuador were an exception to that rule as they had the same restriction pattern as a few isolates from the Ivory Coast, suggesting they may have originated from Africa.

  7. Regulation of the nucleosome repeat length in vivo by the DNA sequence, protein concentrations and long-range interactions.

    Directory of Open Access Journals (Sweden)

    Daria A Beshnova

    2014-07-01

    Full Text Available The nucleosome repeat length (NRL) is an integral chromatin property important for its biological functions. Recent experiments revealed several conflicting trends of the NRL dependence on the concentrations of histones and other architectural chromatin proteins, both in vitro and in vivo, but a systematic theoretical description of NRL as a function of DNA sequence and epigenetic determinants is currently lacking. To address this problem, we have performed an integrative biophysical and bioinformatics analysis in species ranging from yeast to frog to mouse where NRL was studied as a function of various parameters. We show that in simple eukaryotes such as yeast, a lower limit for the NRL value exists, determined by internucleosome interactions and remodeler action. For higher eukaryotes, an upper limit also exists, since NRL is an increasing but saturating function of the linker histone concentration. Counterintuitively, smaller H1 variants or non-histone architectural proteins can initiate larger effects on the NRL due to entropic reasons. Furthermore, we demonstrate that different regimes of the NRL dependence on histone concentrations exist depending on whether DNA sequence-specific effects dominate over boundary effects or vice versa. We consider several classes of genomic regions with apparently different regimes of the NRL variation. As one extreme, our analysis reveals that the period of oscillations of the nucleosome density around bound RNA polymerase coincides with the period of oscillations of positioning sites of the corresponding DNA sequence. At another extreme, we show that although mouse major satellite repeats intrinsically encode well-defined nucleosome preferences, they have no unique nucleosome arrangement and can undergo a switch between two distinct types of nucleosome positioning.

  8. PEGylation enhances tumor targeting of plasmid DNA by an artificial cationized protein with repeated RGD sequences, Pronectin.

    Science.gov (United States)

    Hosseinkhani, Hossein; Tabata, Yasuhiko

    2004-05-31

    The objective of this study is to investigate the feasibility of a non-viral gene carrier with repeated RGD sequences (Pronectin F+) in tumor targeting for gene expression. The Pronectin F+ was cationized by introducing spermine (Sm) to the hydroxyl groups to allow polyionic complexation with plasmid DNA. The cationized Pronectin F+ prepared was additionally modified with poly(ethylene glycol) (PEG) molecules which have active ester and methoxy groups at the terminal, to form various PEG-introduced cationized Pronectin F+. The cationized Pronectin F+ with or without PEGylation at different extents was mixed with a plasmid DNA of LacZ to form respective cationized Pronectin F+-plasmid DNA complexes. The plasmid DNA was electrophoretically complexed with cationized Pronectin F+ and PEG-introduced cationized Pronectin F+, irrespective of the PEGylation extent, although a higher N/P ratio of complexes was needed for complexation with the latter Pronectin F+. The molecular size and zeta potential measurements revealed that the plasmid DNA was reduced in size to about 250 nm and the charge was changed to be positive by the complexation with cationized Pronectin F+. For the complexation with PEG-introduced cationized Pronectin F+, the charge of the complex became nearly neutral (almost 0 mV) with increasing PEGylation extents, while the molecular size was similar to that of cationized Pronectin F+. When cationized Pronectin F+-plasmid DNA complexes with or without PEGylation were intravenously injected to mice carrying a subcutaneous Meth-AR-1 fibrosarcoma mass, the PEG-introduced cationized Pronectin F+-plasmid DNA complex specifically enhanced the level of gene expression in the tumor, to a significantly high extent compared with the cationized Pronectin F+-plasmid DNA complexes and free plasmid DNA. The enhanced level of gene expression depended on the percentage of PEG introduced, the N/P ratio, and the plasmid DNA dose. A fluorescent microscopic study revealed that the

  9. Short Tandem Repeat DNA Internet Database

    Science.gov (United States)

    SRD 130 Short Tandem Repeat DNA Internet Database (Web, free access)   Short Tandem Repeat DNA Internet Database is intended to benefit research and application of short tandem repeat DNA markers for human identity testing. Facts and sequence information on each STR system, population data, commonly used multiplex STR systems, PCR primers and conditions, and a review of various technologies for analysis of STR alleles have been included.

  10. Expressed Sequence Tags Analysis and Design of Simple Sequence Repeats Markers from a Full-Length cDNA Library in Perilla frutescens (L.

    Directory of Open Access Journals (Sweden)

    Eun Soo Seong

    2015-01-01

    Full Text Available Perilla frutescens is valuable as a medicinal plant as well as a natural medicine and functional food. However, comparative genomics analyses of P. frutescens are limited due to a lack of gene annotations and characterization. A full-length cDNA library from P. frutescens leaves was constructed to identify functional gene clusters and probable EST-SSR markers via analysis of 1,056 expressed sequence tags. Unigene assembly was performed using basic local alignment search tool (BLAST) homology searches and annotated with Gene Ontology (GO). A total of 18 simple sequence repeats (SSRs) were designed as primer pairs. This study is the first to report comparative genomics and EST-SSR markers from P. frutescens, which will help gene discovery and provide an important source for functional genomics and molecular genetic research in this interesting medicinal plant.
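
    As an illustration of how SSR candidates might be located in EST sequences, the following is a minimal sketch and not the pipeline used in the study. The function name `find_ssrs`, the motif-length range, and the minimum copy number are all assumptions chosen for the example; it detects only perfect tandem repeats.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, copies) for perfect tandem repeats in seq.

    Hypothetical helper for illustration; thresholds are assumptions.
    """
    seq = seq.upper()
    # Group 1 is the whole repeat tract; group 2 is the repeat unit.
    # The lazy quantifier makes the regex try the shortest unit first.
    pattern = re.compile(
        r"(([ACGT]{%d,%d}?)\2{%d,})" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(2)
        # Skip homopolymer runs such as AAAAAAAA reported as (AA)n.
        if len(set(motif)) == 1:
            continue
        hits.append((m.start(), motif, len(m.group(1)) // len(motif)))
    return hits

example = "TT" + "GA" * 5 + "CC" + "ATG" * 4 + "TT"
print(find_ssrs(example))  # [(2, 'GA', 5), (14, 'ATG', 4)]
```

    Primer design around each reported tract would be a separate step that this sketch does not attempt.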

  11. Complete nucleotide sequences of the domestic cat (Felis catus) mitochondrial genome and a transposed mtDNA tandem repeat (Numt) in the nuclear genome

    Energy Technology Data Exchange (ETDEWEB)

    Lopez, J.V.; Cevario, S.; O'Brien, S.J. [National Cancer Institute, Frederick, MD (United States)]

    1996-04-15

    The complete 17,009-bp mitochondrial genome of the domestic cat, Felis catus, has been sequenced and conforms largely to the typical organization of previously characterized mammalian mtDNAs. Codon usage and base composition also followed canonical vertebrate patterns, except for an unusual ATC (non-AUG) codon initiating the NADH dehydrogenase subunit 2 (ND2) gene. Two distinct repetitive motifs at opposite ends of the control region contribute to the relatively large size (1559 bp) of this carnivore mtDNA. Alignment of the feline mtDNA genome to a homologous 7946-bp nuclear mtDNA tandem repeat DNA sequence in the cat, Numt, indicates simple repeat motifs associated with insertion/deletion mutations. Overall DNA sequence divergence between Numt and cytoplasmic mtDNA sequence was only 5.1%. Substitutions predominate at the third codon position of homologous feline protein genes. Phylogenetic analysis of mitochondrial gene sequences confirms the recent transfer of the cytoplasmic mtDNA sequences to the domestic cat nucleus and recapitulates evolutionary relationships between mammal species. 86 refs., 4 figs., 3 tabs.

  12. Tandem repeat sequence variation and length heteroplasmy in the mitochondrial DNA D-loop of the threatened Gulf of Mexico sturgeon, Acipenser oxyrhynchus desotoi.

    Science.gov (United States)

    Miracle, A L; Campton, D E

    1995-01-01

    Genetic variability within the Suwannee River, Florida, population of Gulf of Mexico sturgeon, Acipenser oxyrhynchus desotoi, was assessed by examining sequence and length variation within the control region, or D-loop, of the mitochondrial genome. Although once abundant throughout the Gulf of Mexico, Gulf sturgeon are now listed as a threatened species by the U.S. Fish and Wildlife Service. Mitochondrial DNA was analyzed for length variation from 168 individual Gulf sturgeon by PCR amplification and visualization of PCR products using ethidium bromide-stained agarose gels. Of the 168 individual Gulf sturgeon, 31 (18.5%) were heteroplasmic for one to four copies of an 81-base pair, tandemly repeated sequence in the D-loop region. However, no individuals homoplasmic for multiple copies of the repeat sequence were observed. The existence and nature of these tandem repeats in heteroplasmic individuals was confirmed by direct sequencing of the PCR products for a subset of 22 individuals. The results are consistent with the apparent nature and mechanism of heteroplasmy observed in a congeneric species, A. transmontanus. In addition, sequences for 187 base pairs outside of the tandem repeats were identical among all 16 individuals assayed for this region. Lack of variable sequences is concordant with earlier studies involving mtDNA restriction fragment length profiles of Gulf sturgeon found in the Suwannee River. The absence of sequence variation exclusive of the tandem repeats is consistent with the hypothesis that the subspecies has undergone a population or evolutionary bottleneck.

  13. Organellar genome, nuclear ribosomal DNA repeat unit, and microsatellites isolated from a small-scale of 454 GS FLX sequencing on two mosses.

    Science.gov (United States)

    Liu, Yang; Forrest, Laura L; Bainard, Jillian D; Budke, Jessica M; Goffinet, Bernard

    2013-03-01

    Recent innovations in high-throughput DNA sequencing methodology (next generation sequencing technologies [NGS]) allow for the generation of large amounts of high quality data that may be particularly critical for resolving ambiguous relationships such as those resulting from rapid radiations. Application of NGS technology to bryology is limited to assembling entire nuclear or organellar genomes of selected exemplars of major lineages (e.g., classes). Here we outline how organellar genomes and the entire nuclear ribosomal DNA repeat can be obtained from minimal amounts of moss tissue via small-scale 454 GS FLX sequencing. We sampled two Funariaceae species, Funaria hygrometrica and Entosthodon obtusus, and assembled nearly complete organellar genomes and the whole nuclear ribosomal DNA repeat unit (18S-ITS1-5.8S-ITS2-26S-IGS1-5S-IGS2) for both taxa. Sequence data from these species were compared to sequences from another Funariaceae species, Physcomitrella patens, revealing low overall degrees of divergence of the organellar genomes and nrDNA genes with substitutions spread rather evenly across their length, and high divergence within the external spacers of the nrDNA repeat. Furthermore, we detected numerous microsatellites among the 454 assemblies. This study demonstrates that NGS methodology can be applied to mosses to target large genomic regions and identify microsatellites.

  14. Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

    Directory of Open Access Journals (Sweden)

    Purves Joanne

    2012-09-01

    Full Text Available Background: Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results: Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species, however higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions: The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis.

  15. Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution.

    Science.gov (United States)

    Purves, Joanne; Blades, Matthew; Arafat, Yasrab; Malik, Salman A; Bayliss, Christopher D; Morrissey, Julie A

    2012-09-28

    Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species, however higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis.

  16. Triplet repeat sequences in human DNA can be detected by hybridization to a synthetic (5'-CGG-3')17 oligodeoxyribonucleotide

    DEFF Research Database (Denmark)

    Behn-Krappa, A; Mollenhauer, J; Doerfler, W

    1993-01-01

    The seemingly autonomous amplification of naturally occurring triplet repeat sequences in the human genome has been implicated in the causation of human genetic disease, such as the fragile X (Martin-Bell) syndrome, myotonic dystrophy (Curshmann-Steinert), spinal and bulbar muscular atrophy...

  17. Recombination-Independent Recognition of DNA Homology for Repeat-Induced Point Mutation (RIP) Is Modulated by the Underlying Nucleotide Sequence.

    Directory of Open Access Journals (Sweden)

    Eugene Gladyshev

    2016-05-01

    Full Text Available Haploid germline nuclei of many filamentous fungi have the capacity to detect homologous nucleotide sequences present on the same or different chromosomes. Once recognized, such sequences can undergo cytosine methylation or cytosine-to-thymine mutation specifically over the extent of shared homology. In Neurospora crassa this process is known as Repeat-Induced Point mutation (RIP). Previously, we showed that RIP did not require MEI-3, the only RecA homolog in Neurospora, and that it could detect homologous trinucleotides interspersed with a matching periodicity of 11 or 12 base-pairs along participating chromosomal segments. This pattern was consistent with a mechanism of homology recognition that involved direct interactions between co-aligned double-stranded (ds) DNA molecules, where sequence-specific dsDNA/dsDNA contacts could be established using no more than one triplet per turn. In the present study we have further explored the DNA sequence requirements for RIP. In our previous work, interspersed homologies were always examined in the context of a relatively long adjoining region of perfect homology. Using a new repeat system lacking this strong interaction, we now show that interspersed homologies with overall sequence identity of only 36% can be efficiently detected by RIP in the absence of any perfect homology. Furthermore, in this new system, where the total amount of homology is near the critical threshold required for RIP, the nucleotide composition of participating DNA molecules is identified as an important factor. Our results specifically pinpoint the triplet 5'-GAC-3' as a particularly efficient unit of homology recognition. Finally, we present experimental evidence that the process of homology sensing can be uncoupled from the downstream mutation. Taken together, our results advance the notion that sequence information can be compared directly between double-stranded DNA molecules during RIP and, potentially, in other processes

  18. Recombination-Independent Recognition of DNA Homology for Repeat-Induced Point Mutation (RIP) Is Modulated by the Underlying Nucleotide Sequence.

    Science.gov (United States)

    Gladyshev, Eugene; Kleckner, Nancy

    2016-05-01

    Haploid germline nuclei of many filamentous fungi have the capacity to detect homologous nucleotide sequences present on the same or different chromosomes. Once recognized, such sequences can undergo cytosine methylation or cytosine-to-thymine mutation specifically over the extent of shared homology. In Neurospora crassa this process is known as Repeat-Induced Point mutation (RIP). Previously, we showed that RIP did not require MEI-3, the only RecA homolog in Neurospora, and that it could detect homologous trinucleotides interspersed with a matching periodicity of 11 or 12 base-pairs along participating chromosomal segments. This pattern was consistent with a mechanism of homology recognition that involved direct interactions between co-aligned double-stranded (ds) DNA molecules, where sequence-specific dsDNA/dsDNA contacts could be established using no more than one triplet per turn. In the present study we have further explored the DNA sequence requirements for RIP. In our previous work, interspersed homologies were always examined in the context of a relatively long adjoining region of perfect homology. Using a new repeat system lacking this strong interaction, we now show that interspersed homologies with overall sequence identity of only 36% can be efficiently detected by RIP in the absence of any perfect homology. Furthermore, in this new system, where the total amount of homology is near the critical threshold required for RIP, the nucleotide composition of participating DNA molecules is identified as an important factor. Our results specifically pinpoint the triplet 5'-GAC-3' as a particularly efficient unit of homology recognition. Finally, we present experimental evidence that the process of homology sensing can be uncoupled from the downstream mutation. Taken together, our results advance the notion that sequence information can be compared directly between double-stranded DNA molecules during RIP and, potentially, in other processes where homologous

  19. Detection of short repeated genomic sequences on metaphase chromosomes using padlock probes and target primed rolling circle DNA synthesis

    Directory of Open Access Journals (Sweden)

    Stougaard Magnus

    2007-11-01

    Full Text Available Background: In situ detection of short sequence elements in genomic DNA requires short probes with high molecular resolution and powerful specific signal amplification. Padlock probes can differentiate single base variations. Ligated padlock probes can be amplified in situ by rolling circle DNA synthesis and detected by fluorescence microscopy, thus enhancing PRINS type reactions, where localized DNA synthesis reports on the position of hybridization targets, to potentially reveal the binding of single oligonucleotide-size probe molecules. Such a system has been presented for the detection of mitochondrial DNA in fixed cells, whereas attempts to apply rolling circle detection to metaphase chromosomes have previously failed, according to the literature. Methods: Synchronized cultured cells were fixed with methanol/acetic acid to prepare chromosome spreads in teflon-coated diagnostic well-slides. Apart from the slide format and the chromosome spreading everything was done essentially according to standard protocols. Hybridization targets were detected in situ with padlock probes, which were ligated and amplified using target primed rolling circle DNA synthesis, and detected by fluorescence labeling. Results: An optimized protocol for the spreading of condensed metaphase chromosomes in teflon-coated diagnostic well-slides was developed. Applying this protocol we generated specimens for target primed rolling circle DNA synthesis of padlock probes recognizing a 40 nucleotide sequence in the male specific repetitive satellite I sequence (DYZ1) on the Y-chromosome and a 32 nucleotide sequence in the repetitive kringle IV domain in the apolipoprotein(a) gene positioned on the long arm of chromosome 6. These targets were detected with good efficiency, but the efficiency on other target sites was unsatisfactory. Conclusion: Our aim was to test the applicability of the method used on mitochondrial DNA to the analysis of nuclear genomes, in particular as

  20. A novel approach to propagate flavivirus infectious cDNA clones in bacteria by introducing tandem repeat sequences upstream of virus genome.

    Science.gov (United States)

    Pu, Szu-Yuan; Wu, Ren-Huang; Tsai, Ming-Han; Yang, Chi-Chen; Chang, Chung-Ming; Yueh, Andrew

    2014-07-01

    Despite tremendous efforts to improve the methodology for constructing flavivirus infectious cDNAs, the manipulation of flavivirus cDNAs remains a difficult task in bacteria. Here, we successfully propagated DNA-launched type 2 dengue virus (DENV2) and Japanese encephalitis virus (JEV) infectious cDNAs by introducing seven repeats of the tetracycline-response element (7×TRE) and a minimal cytomegalovirus (CMVmin) promoter upstream of the viral genome. Insertion of the 7×TRE-CMVmin sequence upstream of the DENV2 or JEV genome decreased the cryptic E. coli promoter (ECP) activity of the viral genome in bacteria, as measured using fusion constructs containing DENV2 or JEV segments and the reporter gene Renilla luciferase in an empty vector. The growth kinetics of recombinant viruses derived from DNA-launched DENV2 and JEV infectious cDNAs were similar to those of parental viruses. Similarly, RNA-launched DENV2 infectious cDNAs were generated by inserting 7×TRE-CMVmin, five repeats of the GAL4 upstream activating sequence, or five repeats of BamHI linkers upstream of the DENV2 genome. All three tandem repeat sequences decreased the ECP activity of the DENV2 genome in bacteria. Notably, 7×TRE-CMVmin stabilized RNA-launched JEV infectious cDNAs and reduced the ECP activity of the JEV genome in bacteria. The growth kinetics of recombinant viruses derived from RNA-launched DENV2 and JEV infectious cDNAs displayed patterns similar to those of the parental viruses. These results support a novel methodology for constructing flavivirus infectious cDNAs, which will facilitate research in virology, viral pathogenesis and vaccine development of flaviviruses and other RNA viruses. © 2014 The Authors.

  1. Kearns-Sayre syndrome case presenting a mitochondrial DNA deletion with unusual direct repeats and a rudimentary RNAse mitochondria ribonucleotide processing target sequence

    Energy Technology Data Exchange (ETDEWEB)

    Remes, A.M.; Hassinen, I.E. (Univ. of Oulu (Finland)); Peuhkurinen, K.J.; Herva, R.; Majamaa, K. (Oulu Univ. Central Hospital (Finland))

    1993-04-01

    A mitochondrial DNA deletion in a case of Kearns-Sayre syndrome is described. The deletion is bracketed by direct repeats that were unusual in that one of them was located 11-13 nucleotides from the deletion seam and both were conserved, which should not occur in slip replication or illegitimate elongation. The deleted region was demarcated on the deletion side by sequences that could be predicted to form hairpin structures. The 5'-side of the deletion was flanked by a sequence homologous to a 9-nucleotide piece of the conserved sequence block II of the D-loop. This arrangement around the deletion in Kearns-Sayre syndrome bears some resemblance to the arrangement in the Pearson marrow-pancreas syndrome described by A. Rotig et al. (1991, Genomics 10: 502-504). 10 refs., 1 fig.

  2. Sequence-specific DNA alkylation and transcriptional inhibition by long-chain hairpin pyrrole-imidazole polyamide-chlorambucil conjugates targeting CAG/CTG trinucleotide repeats.

    Science.gov (United States)

    Asamitsu, Sefan; Kawamoto, Yusuke; Hashiya, Fumitaka; Hashiya, Kaori; Yamamoto, Makoto; Kizaki, Seiichiro; Bando, Toshikazu; Sugiyama, Hiroshi

    2014-09-01

    Introducing novel building blocks to solid-phase peptide synthesis, we readily synthesized long-chain hairpin pyrrole-imidazole (PI) polyamide-chlorambucil conjugates 3 and 4 via the introduction of an amino group into a GABA (γ-turn) contained in 3, to target CAG/CTG repeat sequences, which are associated with various hereditary disorders. A high-resolution denaturing polyacrylamide sequencing gel revealed sequence-specific alkylation of both strands at the N3 of adenines or guanines in CAG/CTG repeats by conjugates 3 and 4, with 11-bp recognition. In vitro transcription assays using conjugate 4 revealed that specific alkylation inhibited the progression of RNA polymerase at the alkylating sites. Chiral substitution of the γ-turn with an amino group resulted in higher binding affinity observed in SPR assays. These assays suggest that conjugate 4, with 11-bp recognition, has the potential to cause specific DNA damage and transcriptional inhibition at the alkylating sites. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Collateral damage: Spread of repeat-induced point mutation from a duplicated DNA sequence into an adjoining single-copy gene in Neurospora crassa

    Indian Academy of Sciences (India)

    Meenal Vyas; Durgadas P Kasbekar

    2005-02-01

    Repeat-induced point mutation (RIP) is an unusual genome defense mechanism that was discovered in Neurospora crassa. RIP occurs during a sexual cross and induces numerous G : C to A : T mutations in duplicated DNA sequences and also methylates many of the remaining cytosine residues. We measured the susceptibility of the erg-3 gene, present in single copy, to the spread of RIP from duplications of adjoining sequences. Genomic segments of defined length (1, 1.5 or 2 kb) and located at defined distances (0, 0.5, 1 or 2 kb) upstream or downstream of the erg-3 open reading frame (ORF) were amplified by polymerase chain reaction (PCR), and the duplications were created by transformation of the amplified DNA. Crosses were made with the duplication strains and the frequency of erg-3 mutant progeny provided a measure of the spread of RIP from the duplicated segments into the erg-3 gene. Our results suggest that ordinarily RIP-spread does not occur. However, occasionally the mechanism that confines RIP to the duplicated segment seems to fail (frequency 0.1–0.8%) and then RIP can spread across as much as 1 kb of unduplicated DNA. Additionally, the bacterial hph gene appeared to be very susceptible to the spread of RIP-associated cytosine methylation.
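
    The G:C to A:T signature described above appears on a single strand as either C to T or G to A. A minimal sketch of how such changes could be tallied between a reference and a progeny sequence follows; it assumes the two sequences are already aligned to equal length without gaps, and the function name `count_rip_transitions` is hypothetical, not taken from the study.

```python
def count_rip_transitions(reference, mutated):
    """Tally C>T and G>A changes between two aligned, equal-length sequences.

    Illustrative helper only; any other substitution is counted as "other".
    """
    if len(reference) != len(mutated):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"C>T": 0, "G>A": 0, "other": 0}
    for ref_base, mut_base in zip(reference.upper(), mutated.upper()):
        if ref_base == mut_base:
            continue  # unchanged position
        change = ref_base + ">" + mut_base
        counts[change if change in counts else "other"] += 1
    return counts

print(count_rip_transitions("GATCCGTACG", "AATTCGTATG"))
# {'C>T': 2, 'G>A': 1, 'other': 0}
```

    A high share of C>T and G>A changes relative to "other" would be consistent with RIP, though methylation, which the abstract also mentions, is not visible in plain sequence comparisons like this one.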

  4. REPdenovo: Inferring De Novo Repeat Motifs from Short Sequence Reads.

    Directory of Open Access Journals (Sweden)

    Chong Chu

    Full Text Available Repeat elements are important components of eukaryotic genomes. One limitation in our understanding of repeat elements is that most analyses rely on reference genomes that are incomplete and often contain missing data in highly repetitive regions that are difficult to assemble. To overcome this problem we develop a new method, REPdenovo, which assembles repeat sequences directly from raw shotgun sequencing data. REPdenovo can construct various types of repeats that are highly repetitive and have low sequence divergence within copies. We show that REPdenovo is substantially better than existing methods both in terms of the number and the completeness of the repeat sequences that it recovers. The key advantage of REPdenovo is that it can reconstruct long repeats from sequence reads. We apply the method to human data and discover a number of potentially new repeat sequences that have been missed by previous repeat annotations. Many of these sequences are incorporated into various parasite genomes, possibly because the filtering process for host DNA involved in the sequencing of the parasite genomes failed to exclude the host derived repeat sequences. REPdenovo is a new powerful computational tool for annotating genomes and for addressing questions regarding the evolution of repeat families. The software tool, REPdenovo, is available for download at https://github.com/Reedwarbler/REPdenovo.
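
    The abstract does not describe REPdenovo's algorithm, so the following is only a toy illustration of one general idea behind reference-free repeat discovery: sequence from high-copy repeats tends to be over-represented among the k-mers of raw reads. The function name `frequent_kmers` and the thresholds are assumptions made for this sketch and are not part of the REPdenovo software.

```python
from collections import Counter

def frequent_kmers(reads, k, min_count):
    """Return k-mers seen at least min_count times across all reads.

    Toy example; real tools also handle reverse complements, sequencing
    errors, and coverage normalization, none of which is done here.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return {kmer: n for kmer, n in counts.items() if n >= min_count}

reads = ["ACGTACGT", "TTACGTAA", "GGGACGTC"]
print(frequent_kmers(reads, k=4, min_count=3))  # {'ACGT': 4}
```

    Over-represented k-mers found this way would still need to be assembled into longer sequences before they resemble repeat elements.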

  5. DNA sequences encoding erythropoietin

    Energy Technology Data Exchange (ETDEWEB)

    Lin, F.K.

    1987-10-27

    A purified and isolated DNA sequence is described consisting essentially of a DNA sequence encoding a polypeptide having an amino acid sequence sufficiently duplicative of that of erythropoietin to allow possession of the biological property of causing bone marrow cells to increase production of reticulocytes and red blood cells, and to increase hemoglobin synthesis or iron uptake.

  6. Local repeat sequence organization of an intergenic spacer in the chloroplast genome of Chlamydomonas reinhardtii leads to DNA expansion and sequence scrambling: a complex mode of “copy-choice replication”?

    Indian Academy of Sciences (India)

    Mahendra D Wagle; Subhojit Sen; Basuthkar J Rao

    2001-12-01

    Parent-specific, randomly amplified polymorphic DNA (RAPD) markers were obtained from total genomic DNA of Chlamydomonas reinhardtii. Such parent-specific RAPD bands (genomic fingerprints) segregated uniparentally (through mt+) in a cross between a pair of polymorphic interfertile strains of Chlamydomonas (C. reinhardtii and C. minnesotti), suggesting that they originated from the chloroplast genome. Southern analysis mapped the RAPD-markers to the chloroplast genome. One of the RAPD-markers, “P2” (1.6 kb) was cloned, sequenced and was fine mapped to the 3 kb region encompassing 3′ end of 23S, full 5S and intergenic region between 5S and psbA. This region seems divergent enough between the two parents, such that a specific PCR designed for a parental specific chloroplast sequence within this region, amplified a marker in that parent only and not in the other, indicating the utility of RAPD-scan for locating the genomic regions of sequence divergence. Remarkably, the RAPD-product, “P2” seems to have originated from a PCR-amplification of a much smaller (about 600 bp), but highly repeat-rich (direct and inverted) domain of the 3 kb region in a manner that yielded no linear sequence alignment with its own template sequence. The amplification yielded the same uniquely “sequence-scrambled” product, whether the template used for PCR was total cellular DNA, chloroplast DNA or a plasmid clone DNA corresponding to that region. The PCR product, a “unique” new sequence, had lost the repetitive organization of the template genome where it had originated from and perhaps represented a “complex path” of copy-choice replication.

  7. Interactions between meso-tetrakis(4-(N-methylpyridiumyl))porphyrin TMPyP4 and DNA G-quadruplex of telomeric repeated sequence TTAGGG

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The binding properties between meso-tetrakis(4-(N-methylpyridiumyl))porphyrin (TMPyP4) and the parallel DNA G-quadruplex (G4) of telomeric repeated sequence 5′-TTAGGG-3′ have been characterized by means of circular dichroism, steady-state absorption, steady-state fluorescence and picosecond time-resolved fluorescence spectroscopies. The binding constant and the saturated binding number were determined as 1.29×10⁶ (mol/L)⁻¹ and 3, respectively, according to steady-state absorption spectroscopy. Based on the findings from time-resolved fluorescence spectroscopy, it is deduced that TMPyP4 binds to a DNA G-quadruplex in both the thread-intercalating and end-stacking modes and that, at the saturated binding state, one TMPyP4 molecule intercalates into the intervals of G-tetrads while the other two stack to the ends of the DNA G-quadruplex.

  8. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  9. Bov-B long interspersed repeated DNA (LINE) sequences are present in Vipera ammodytes phospholipase A2 genes and in genomes of Viperidae snakes.

    Science.gov (United States)

    Kordis, D; Gubensek, F

    1997-06-15

    Ammodytin L is a myotoxic Ser49 phospholipase A2 (PLA2) homologue, which is tissue-specifically expressed in the venom glands of Vipera ammodytes. The complete DNA sequence of the gene and its 5' and 3' flanking regions has been determined. The gene consists of five exons separated by four introns. Comparative analysis of the ammodytin L and ammodytoxin C genes shows that all intron and flanking sequences are considerably more conserved (93-97%) than the mature protein-coding exons. The pattern of nucleotide substitutions in protein-coding exons is not random but occurs preferentially on the first and the second positions of codons, which suggests positive Darwinian evolution for a new function. A Ruminantia-specific ART-2 retroposon, recently recognised as a 5'-truncated Bov-B long interspersed repeated DNA (LINE) sequence, was identified in the fourth intron of both genes. This result suggests that ammodytin L and ammodytoxin C genes are derived by duplication of a common ancestral gene. The phylogenetic distribution of Bov-B LINE among vertebrate classes shows that, besides the Ruminantia, it is limited to Viperidae snakes (Vipera ammodytes, Vipera palaestinae, Echis coloratus, Bothrops alternatus, Trimeresurus flavoviridis and Trimeresurus gramineus). The copy number of the 3' end of Bov-B LINE in the Vipera ammodytes genome is between 62,000 and 75,000. The absence of Bov-B LINE at orthologous positions in other snake PLA2 genes indicates that its retrotransposition in the V. ammodytes PLA2 gene locus has occurred quite recently, about 5 My ago. The amplification of Bov-B LINEs in snakes may have occurred before the divergence of the Viperinae and Crotalinae subfamilies. Due to its wide distribution in Viperidae snakes it may be a valuable phylogenetic marker. The neighbor-joining phylogenetic tree shows two clusters of truncated Bov-B LINE, a Bovidae and a snake cluster, indicating an early horizontal transfer of this transposable element.

  10. Simple sequence repeats in mycobacterial genomes

    Indian Academy of Sciences (India)

    Vattipally B Sreenu; Pankaj Kumar; Javaregowda Nagaraju; Hampapathalu A Nagarajaram

    2007-01-01

    Simple sequence repeats (SSRs) or microsatellites are repetitive nucleotide sequences built from motifs of length 1–6 bp. They are scattered throughout the genomes of all known organisms, ranging from viruses to eukaryotes. Microsatellites undergo mutations in the form of insertions and deletions (INDELS) of their repeat units, with some bias towards insertions that lead to microsatellite tract expansion. Although prokaryotic genomes derive some plasticity from microsatellite mutations, they have in-built mechanisms to arrest undue expansions of microsatellites, and one such mechanism is constituted by the post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these enzymes, and as a null hypothesis one could expect these genomes to harbour many long tracts. It is therefore interesting to analyse the mycobacterial genomes for the distribution and abundance of microsatellite tracts and to look for potentially polymorphic microsatellites. Available mycobacterial genomes, Mycobacterium avium, M. leprae, M. bovis and the two strains of M. tuberculosis (CDC1551 and H37Rv), were analysed for frequencies and abundance of SSRs. Our analysis revealed that the SSRs are distributed throughout the mycobacterial genomes at an average of 220–230 SSR tracts per kb. All the mycobacterial genomes contain few regions that are conspicuously denser or poorer in microsatellites compared to their expected genome averages. The genomes distinctly show scarcity of long microsatellites despite the absence of a post-replicative DNA repair system. Such severe scarcity of long microsatellites could arise as a result of strong selection pressures operating against long and unstable sequences, although the influence of GC-content and the role of point mutations in arresting microsatellite expansions cannot be ruled out. Nonetheless, the long tracts occasionally found in coding as well as non-coding regions may account for limited genome plasticity in these genomes.
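
    As an illustration of the SSR definition above (tandem repeats of a 1–6 bp motif), the following is a minimal sketch of how perfect tracts could be located in a sequence. It is not the authors' pipeline: the function name `find_ssrs` and the `min_repeats` threshold are assumptions made for this example, and only perfect (uninterrupted) repeats on the given strand are reported.

```python
import re
from typing import List, Tuple


def find_ssrs(seq: str, max_motif: int = 6, min_repeats: int = 3) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    `min_repeats` is an illustrative threshold, not a value from the abstract.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "CACA"),
            # so each tract is reported under its smallest motif only.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    for start, motif, count in find_ssrs("TTCACACACAGG"):
        print(start, motif, count)
```

    Dividing the number of tracts found by the sequence length would give a density comparable in spirit to the per-kb figures quoted in the abstract, although the counts depend strongly on the chosen threshold.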

  11. Duplication in DNA Sequences

    Science.gov (United States)

    Ito, Masami; Kari, Lila; Kincaid, Zachary; Seki, Shinnosuke

    The duplication and repeat-deletion operations are the basis of a formal language theoretic model of errors that can occur during DNA replication. During DNA replication, subsequences of a strand of DNA may be copied several times (resulting in duplications) or skipped (resulting in repeat-deletions). As formal language operations, iterated duplication and repeat-deletion of words and languages have been well studied in the literature. However, little is known about single-step duplications and repeat-deletions. In this paper, we investigate several properties of these operations, including closure properties of language families in the Chomsky hierarchy and equations involving these operations. We also make progress toward a characterization of regular languages that are generated by duplicating a regular language.
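
    The two single-step operations described above can be sketched on plain strings. This is a minimal illustration under the usual reading of the definitions (duplication maps a word xyz to xyyz for a non-empty factor y; repeat-deletion maps xyyz back to xyz); the function names are hypothetical and are not taken from the paper.

```python
from typing import Set


def duplications(word: str) -> Set[str]:
    """All words obtained by copying one non-empty factor of `word` in place."""
    n = len(word)
    return {
        word[:j] + word[i:j] + word[j:]  # x y y z, with y = word[i:j]
        for i in range(n)
        for j in range(i + 1, n + 1)
    }


def repeat_deletions(word: str) -> Set[str]:
    """All words obtained by deleting one copy of an adjacent repeat y y."""
    n = len(word)
    results = set()
    for i in range(n):
        for length in range(1, (n - i) // 2 + 1):
            if word[i:i + length] == word[i + length:i + 2 * length]:
                results.add(word[:i + length] + word[i + 2 * length:])
    return results


if __name__ == "__main__":
    print(sorted(duplications("ab")))
    print(sorted(repeat_deletions("aab")))
```

    Applying `repeat_deletions` to any word produced by `duplications(w)` always yields a set containing `w`, which is the sense in which the two operations undo each other in a single step.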

  12. Automated DNA Sequencing System

    Energy Technology Data Exchange (ETDEWEB)

    Armstrong, G.A.; Ekkebus, C.P.; Hauser, L.J.; Kress, R.L.; Mural, R.J.

    1999-04-25

    Oak Ridge National Laboratory (ORNL) is developing a core DNA sequencing facility to support biological research endeavors at ORNL and to conduct basic sequencing automation research. This facility is novel because its development is based on existing standard biology laboratory equipment; thus, the development process is of interest to the many small laboratories trying to use automation to control costs and increase throughput. Before automation, biology laboratory personnel purified DNA, completed cycle sequencing, and prepared 96-well sample plates with commercially available hardware designed specifically for each step in the process. Following purification and thermal cycling, an automated sequencing machine was used for the sequencing. A technician handled all movement of the 96-well sample plates between machines. To automate the process, ORNL is adding a CRS Robotics A-465 arm, ABI 377 sequencing machine, automated centrifuge, automated refrigerator, and possibly an automated SpeedVac. The entire system will be integrated with one central controller that will direct each machine and the robot. The goal of this system is to completely automate the sequencing procedure from bacterial cell samples through ready-to-be-sequenced DNA and ultimately to completed sequence. The system will be flexible and will accommodate different chemistries than existing automated sequencing lines. The system will be expanded in the future to include colony picking and/or actual sequencing. This discrete event, DNA sequencing system will demonstrate that smaller sequencing labs can achieve cost-effective automation as the laboratory grows.

  13. Comparative genomics and molecular dynamics of DNA repeats in eukaryotes.

    Science.gov (United States)

    Richard, Guy-Franck; Kerrest, Alix; Dujon, Bernard

    2008-12-01

    Repeated elements can be widely abundant in eukaryotic genomes, composing more than 50% of the human genome, for example. It is possible to classify repeated sequences into two large families, "tandem repeats" and "dispersed repeats." Each of these two families can itself be divided into subfamilies. Dispersed repeats contain transposons, tRNA genes, and gene paralogues, whereas tandem repeats contain gene tandems, ribosomal DNA repeat arrays, and satellite DNA, itself subdivided into satellites, minisatellites, and microsatellites. Remarkably, the molecular mechanisms that create and propagate dispersed and tandem repeats are specific to each class and usually do not overlap. In the present review, we have chosen in the first section to describe the nature and distribution of dispersed and tandem repeats in eukaryotic genomes in the light of complete (or nearly complete) available genome sequences. In the second part, we focus on the molecular mechanisms responsible for the fast evolution of two specific classes of tandem repeats: minisatellites and microsatellites. Given that a growing number of human neurological disorders involve the expansion of a particular class of microsatellites, called trinucleotide repeats, a large part of the recent experimental work on microsatellites has focused on these particular repeats, and thus we also review the current knowledge in this area. Finally, we propose a unified definition for mini- and microsatellites that takes into account their biological properties and try to point out new directions that should be explored in the near future on our road to understanding the genetics of repeated sequences.

  14. Structure, organization, and sequence of alpha satellite DNA from human chromosome 17: evidence for evolution by unequal crossing-over and an ancestral pentamer repeat shared with the human X chromosome.

    Science.gov (United States)

    Waye, J S; Willard, H F

    1986-09-01

    The centromeric regions of all human chromosomes are characterized by distinct subsets of a diverse tandemly repeated DNA family, alpha satellite. On human chromosome 17, the predominant form of alpha satellite is a 2.7-kilobase-pair higher-order repeat unit consisting of 16 alphoid monomers. We present the complete nucleotide sequence of the 16-monomer repeat, which is present in 500 to 1,000 copies per chromosome 17, as well as that of a less abundant 15-monomer repeat, also from chromosome 17. These repeat units were approximately 98% identical in sequence, differing by the exclusion of precisely 1 monomer from the 15-monomer repeat. Homologous unequal crossing-over is suggested as a probable mechanism by which the different repeat lengths on chromosome 17 were generated, and the putative site of such a recombination event is identified. The monomer organization of the chromosome 17 higher-order repeat unit is based, in part, on tandemly repeated pentamers. A similar pentameric suborganization has been previously demonstrated for alpha satellite of the human X chromosome. Despite the organizational similarities, substantial sequence divergence distinguishes these subsets. Hybridization experiments indicate that the chromosome 17 and X subsets are more similar to each other than to the subsets found on several other human chromosomes. We suggest that the chromosome 17 and X alpha satellite subsets may be related components of a larger alphoid subfamily which have evolved from a common ancestral repeat into the contemporary chromosome-specific subsets.

  15. Evolution of DNA sequencing

    National Research Council Canada - National Science Library

    Tipu, Hamid Nawaz; Shabbir, Ambreen

    2015-01-01

    Sanger and coworkers first introduced DNA sequencing in the 1970s. It principally relied on termination of the growing nucleotide chain when a dideoxythymidine triphosphate (ddTTP) was inserted...

  16. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  17. The cDNA sequence for the protein-tyrosine kinase substrate p36 (calpactin I heavy chain) reveals a multidomain protein with internal repeats

    DEFF Research Database (Denmark)

    Sarin, C T; Tack, B F; Kristensen, Torsten;

    1986-01-01

    We have isolated and sequenced a full-length cDNA clone for the protein-tyrosine kinase substrate p36 (calpactin I heavy chain). This sequence predicts a 339 amino acid (Mr 38,493) protein containing an N-terminal region of 20 amino acids, known to interact with a 10 kd protein (light chain), and...

  18. Repeated extraction of DNA from FTA cards

    DEFF Research Database (Denmark)

    Stangegaard, Michael; Ferrero, Laura; Børsting, Claus

    2011-01-01

    Extraction of DNA using magnetic bead based techniques on automated DNA extraction instruments provides a fast, reliable and reproducible method for DNA extraction from various matrices. However, the yield of extracted DNA from FTA-cards is typically low. Here, we demonstrate that it is possible to repeatedly extract DNA from the processed FTA-disk. The method increases the yield from the nanogram range to the microgram range.

  19. Repeated extraction of DNA from FTA cards

    OpenAIRE

    Stangegaard, Michael; Ferrero, Laura; Børsting, Claus; Frank-Hansen, Rune; Hansen, Anders Johannes; Morling, Niels

    2011-01-01

    Extraction of DNA using magnetic bead based techniques on automated DNA extraction instruments provides a fast, reliable and reproducible method for DNA extraction from various matrices. However, the yield of extracted DNA from FTA-cards is typically low. Here, we demonstrate that it is possible to repeatedly extract DNA from the processed FTA-disk. The method increases the yield from the nanogram range to the microgram range.

  1. Construction and identification of DNA libraries enriched for microsatellite repeat sequences of Glyptosternum maculatum

    Institute of Scientific and Technical Information of China (English)

    郭宝英; 谢从新; 祁鹏志; 吴常文; 邓一兵

    2011-01-01

    Using a magnetic-bead enrichment method, CA/GT microsatellite loci were screened from 400-1000 bp MboI-digested fragments of Glyptosternum maculatum genomic DNA with a biotin-labeled (CA)12 oligonucleotide probe. The eluted hybridized fragments were cloned into the pMD18-T vector to construct a microsatellite-enriched genomic library. PCR screening detected 720 positive clones, accounting for 89.2% of all clones. Of the positive clones, 139 were randomly selected for sequencing; sequence analysis showed that 124 clones contained sequences with more than 7 repeats, of which 80 (64.5%) were perfect, 40 (32.3%) imperfect and 15 (3.2%) compound. The number of repeats ranged from 7 to 165, with an average of 52. Primers could be designed for 59 of the 124 sequences. Microsatellite markers (SSRs) have been widely used in population genetics and genetic map construction. In order to determine the genetic diversity of G. maculatum, this study was undertaken to first develop and characterize microsatellite sequences as a basis for developing microsatellite markers. Genomic DNA was extracted from muscle tissue using a traditional proteinase K digestion and phenol-chloroform extraction procedure, with RNA removed by RNase. Approximately 2 μg of total genomic DNA was digested with MboI, then ligated to the adapters (Linker A and Linker B). The treated DNA sample was then pooled and fragments were separated on a 1.5% agarose gel prior to size selection. The resulting fragments (400-1000 bp) were extracted from the gel matrix using a column and amplified for 20 cycles with Linker B primers. The amplified DNA was hybridized with 5 μL of 5'-biotinylated (CA)12 repeat oligos in a total volume of 100 μL of 6× SSC and 0.1% SDS. The mixture was incubated at 95℃ for 5 min, followed by annealing at 65℃ for 60 min, and cooled to room temperature. During this hybridization, 100 μL (per treatment) of streptavidin-coated beads was resuspended in 300 μL of 1× hybridization buffer (6× SSC + 0.1% SDS) and washed three times. The hybridization mixture was added to the washed beads and incubated for 30 min at room temperature. The beads were

  2. In the Staphylococcus aureus two-component system sae, the response regulator SaeR binds to a direct repeat sequence and DNA binding requires phosphorylation by the sensor kinase SaeS.

    Science.gov (United States)

    Sun, Fei; Li, Chunling; Jeong, Dowon; Sohn, Changmo; He, Chuan; Bae, Taeok

    2010-04-01

    Staphylococcus aureus uses the SaeRS two-component system to control the expression of many virulence factors such as alpha-hemolysin and coagulase; however, the molecular mechanism of this signaling has not yet been elucidated. Here, using the P1 promoter of the sae operon as a model target DNA, we demonstrated that the unphosphorylated response regulator SaeR does not bind to the P1 promoter DNA, while its C-terminal DNA binding domain alone does. The DNA binding activity of full-length SaeR could be restored by sensor kinase SaeS-induced phosphorylation. Phosphorylated SaeR is more resistant to digestion by trypsin, suggesting conformational changes. DNase I footprinting assays revealed that the SaeR protection region in the P1 promoter contains a direct repeat sequence (GTTAAN(6)GTTAA [where N is any nucleotide]). This sequence is critical to the binding of phosphorylated SaeR. Mutational changes in the repeat sequence greatly reduced both the in vitro binding of SaeR and the in vivo function of the P1 promoter. From these results, we concluded that SaeR recognizes the direct repeat sequence as a binding site and that binding requires phosphorylation by SaeS.
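
    The direct repeat reported above, GTTAAN(6)GTTAA, maps naturally onto a regular expression. The sketch below scans both strands of a sequence for that pattern; it is only an illustration of the motif notation, not the authors' analysis, and the function names are hypothetical. Start positions are 0-based on whichever strand was scanned.

```python
import re
from typing import List, Tuple

# GTTAA, any six nucleotides, GTTAA; the lookahead lets overlapping sites be reported.
DIRECT_REPEAT = re.compile(r"(?=(GTTAA[ACGT]{6}GTTAA))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_direct_repeats(seq: str) -> List[Tuple[int, str, str]]:
    """Return (start, strand, site) for every GTTAAN(6)GTTAA match on either strand."""
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in DIRECT_REPEAT.finditer(s):
            hits.append((m.start(), strand, m.group(1)))
    return hits


if __name__ == "__main__":
    example = "CCGTTAATTTTTTGTTAACC"  # made-up sequence, not the sae P1 promoter
    print(find_direct_repeats(example))
```

    A sequence match of this kind only flags candidate sites; the abstract's conclusion about SaeR binding rests on the footprinting and mutational data, not on pattern matching.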

  3. DNA sequencing by CE.

    Science.gov (United States)

    Karger, Barry L; Guttman, András

    2009-06-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA-sequencing methods have evolved from the labor-intensive slab gel electrophoresis, through automated multiCE systems using fluorophore labeling with multispectral imaging, to the "next-generation" technologies of cyclic-array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes were only possible with the advent of modern sequencing technologies that were a result of step-by-step advances with contributions from academics, medical personnel and instrument companies. While next-generation sequencing is moving ahead at breakneck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this perspective, we give an overview of the role of CE in DNA sequencing, based in part on several of our articles in this journal.

  4. Evolutionary dynamics of satellite DNA repeats from Phaseolus beans.

    Science.gov (United States)

    Ribeiro, Tiago; Dos Santos, Karla G B; Richard, Manon M S; Sévignac, Mireille; Thareau, Vincent; Geffroy, Valérie; Pedrosa-Harand, Andrea

    2017-03-01

    Common bean (Phaseolus vulgaris) subtelomeres are highly enriched for khipu, the main satellite DNA identified so far in this genome. Here, we comparatively investigate khipu genomic organization in Phaseolus species from different clades. Additionally, we identified and characterized another satellite repeat, named jumper, associated to khipu. A mixture of P. vulgaris khipu clones hybridized in situ confirmed the presence of khipu-like sequences on subterminal chromosome regions in all Phaseolus species, with differences in the number and intensity of signals between species and when species-specific clones were used. Khipu is present as multimers of ∼500 bp and sequence analyses of cloned fragments revealed close relationship among khipu repeats. The new repeat, named jumper, is a 170-bp satellite sequence present in all Phaseolus species and inserted into the nontranscribed spacer (NTS) of the 5S rDNA in the P. vulgaris genome. Nevertheless, jumper was found as a high-copy repeat at subtelomeres and/or pericentromeres in the Phaseolus microcarpus lineage only. Our data argue for khipu as an important subtelomeric satellite DNA in the genus and for a complex satellite repeat composition of P. microcarpus subtelomeres, which also contain jumper. Furthermore, the differential amplification of these repeats in subtelomeres or pericentromeres reinforces the presence of a dynamic satellite DNA library in Phaseolus.

  5. Multineuronal Spike Sequences Repeat with Millisecond Precision

    Directory of Open Access Journals (Sweden)

    Koki Matsumoto

    2013-06-01

    Cortical microcircuits are nonrandomly wired by neurons. As a natural consequence, spikes emitted by microcircuits are also nonrandomly patterned in time and space. One of the prominent spike organizations is a repetition of fixed patterns of spike series across multiple neurons. However, several questions remain unsolved, including how precisely spike sequences repeat, how the sequences are spatially organized, how many neurons participate in sequences, and how different sequences are functionally linked. To address these questions, we monitored spontaneous spikes of hippocampal CA3 neurons ex vivo using a high-speed functional multineuron calcium imaging technique that allowed us to monitor spikes with millisecond resolution and to record the location of spiking and nonspiking neurons. Multineuronal spike sequences were overrepresented in spontaneous activity compared to the statistical chance level. Approximately 75% of neurons participated in at least one sequence during our observation period. The participants were sparsely dispersed and did not show specific spatial organization. The number of sequences relative to the chance level decreased when larger time frames were used to detect sequences. Thus, sequences were precise at the millisecond level. Sequences often shared common spikes with other sequences; parts of sequences were subsequently relayed by following sequences, generating complex chains of multiple sequences.

  6. Triplet repeat DNA structures and human genetic disease: dynamic mutations from dynamic DNA

    Indian Academy of Sciences (India)

    Richard R Sinden; Vladimir N Potaman; Elena A Oussatcheva; Christopher E Pearson; Yuri L Lyubchenko; Luda S Shlyakhtenko

    2002-02-01

    Fourteen genetic neurodegenerative diseases and three fragile sites have been associated with the expansion of (CTG)n•(CAG)n, (CGG)n•(CCG)n, or (GAA)n•(TTC)n repeat tracts. Different models have been proposed for the expansion of triplet repeats, most of which presume the formation of alternative DNA structures in repeat tracts. One of the most likely structures, slipped strand DNA, may stably and reproducibly form within triplet repeat sequences. The propensity to form slipped strand DNA is proportional to the length and homogeneity of the repeat tract. The remarkable stability of slipped strand DNA may, in part, be due to loop-loop interactions facilitated by the sequence complementarity of the loops and the dynamic structure of three-way junctions formed at the loop-outs.

  7. DNA Sequencing Using capillary Electrophoresis

    Energy Technology Data Exchange (ETDEWEB)

    Dr. Barry Karger

    2011-05-09

    application papers of sequencing up to this level were also published in the mid-1990s. A major interest of the sequencing community has always been read length. The longer the sequence read per run, the more efficient the process and the better the ability to read repeat sequences. We therefore devoted a great deal of time to studying the factors influencing read length in capillary electrophoresis, including polymer type and molecular weight, capillary column temperature, applied electric field, etc. In our initial optimization, we were able to demonstrate, for the first time, the sequencing of over 1000 bases with 90% accuracy. The run required 80 minutes for separation. Sequencing of 1000 bases per column was next demonstrated on a multiple capillary instrument. Our studies revealed that linear polyacrylamide produced the longest read lengths because the hydrophilic single strand DNA had minimal interaction with the very hydrophilic linear polyacrylamide. Any interaction of the DNA with the polymer would lead to broader peaks and lower read length. Another important parameter was the molecular weight of the linear chains. High molecular weight (> 1 MDa) was important to allow the long single strand DNA to reptate through the entangled polymer matrix. In an important paper, we showed an inverse emulsion method to reproducibly prepare linear polyacrylamide polymer with an average molecular weight of 9 MDa. This approach was used in the polymer for sequencing the human genome. Another critical factor in the successful use of capillary electrophoresis for sequencing was the sample preparation method. In the Sanger sequencing reaction, high concentrations of salts and dideoxynucleotides remained. Since the sample was introduced to the capillary column by electrokinetic injection, these salt ions would be favorably injected into the column over the sequencing fragments, thus reducing the signal for longer fragments and hence reducing read length.
In two papers, we examined the role of

  8. The 28S-18S rDNA intergenic spacer from Crithidia fasciculata: repeated sequences, length heterogeneity, putative processing sites and potential interactions between U3 small nucleolar RNA and the ribosomal RNA precursor.

    Science.gov (United States)

    Schnare, M N; Collings, J C; Spencer, D F; Gray, M W

    2000-09-15

    In Crithidia fasciculata, the ribosomal RNA (rRNA) gene repeats range in size from approximately 11 to 12 kb. This length heterogeneity is localized to a region of the intergenic spacer (IGS) that contains tandemly repeated copies of a 19mer sequence. The IGS also contains four copies of an approximately 55 nt repeat that has an internal inverted repeat and is also present in the IGS of Leishmania species. We have mapped the C. fasciculata transcription initiation site as well as two other reverse transcriptase stop sites that may be analogous to the A0 and A' pre-rRNA processing sites within the 5' external transcribed spacer (ETS) of other eukaryotes. Features that could influence processing at these sites include two stretches of conserved primary sequence and three secondary structure elements present in the 5' ETS. We also characterized the C. fasciculata U3 snoRNA, which has the potential for base-pairing with pre-rRNA sequences. Finally, we demonstrate that biosynthesis of large subunit rRNA in both C. fasciculata and Trypanosoma brucei involves 3'-terminal addition of three A residues that are not present in the corresponding DNA sequences.

  9. Genus-specific protein binding to the large clusters of DNA repeats (short regularly spaced repeats) present in Sulfolobus genomes

    DEFF Research Database (Denmark)

    Peng, Xu; Brügger, Kim; Shen, Biao

    2003-01-01

    Short regularly spaced repeats (SRSRs) occur in multiple large clusters in archaeal chromosomes and as smaller clusters in some archaeal conjugative plasmids and bacterial chromosomes. The sequence, size, and spacing of the repeats are generally constant within a cluster but vary between clusters...... that are identical in sequence to one of the repeat variants in the S. solfataricus chromosome. Repeats from the pNOB8 cluster were amplified and tested for protein binding with cell extracts from S. solfataricus. A 17.5-kDa SRSR-binding protein was purified from the cell extracts and sequenced. The protein is N...... terminally modified and corresponds to SSO454, an open reading frame of previously unassigned function. It binds specifically to DNA fragments carrying double and single repeat sequences, binding on one side of the repeat structure, and producing an opening of the opposite side of the DNA structure. It also...

  10. DNA profiling of extended tracts of primitive DNA repeats: Direct identification of unstable simple repeat loci in complex genome

    Energy Technology Data Exchange (ETDEWEB)

    Rogaeva, E.A.; Korovaitseva, G.; St. George-Hyslop, P. [Univ. of Toronto (Canada)] [and others]

    1994-09-01

    The simplest DNA repetitive elements, with repetitive monomer units of only 1-10 bp in tandem tracts, are an abundant component of the human genome. The expansion of at least one type of these repeats ((CCG)n and (CTG)n) has been detected in several neurological diseases with anticipation in successive generations. We propose here a simple method for the identification of particularly expanded repeats and for the recovery of flanking sequences. We generated DNA probes using PCR to create long concatamers (n>100) by amplification of the di-, tri-, tetra-, penta- and hexa-nucleotide repeat oligonucleotide primer pairs. To reduce the complexity of the background band pattern, the genomic DNA was restricted with a mixture of at least five different endonucleases, thereby reducing the size of restriction fragments containing short simple repeat arrays while leaving intact the large fragments containing the longer simple repeat arrays. Direct blot hybridization has shown different "DNA fingerprint" patterns with all arbitrarily selected di- to hexa-nucleotide repeat probes. Direct hybridization of the (CTG)n and (CCG)n probes revealed simple or multiple band patterns depending upon stringency conditions. We were able to detect the presence of expanded unstable tri-nucleotide alleles by the (CCG)n probe for some FRAXA subjects and by the (CTG)n probe for some myotonic dystrophy subjects, which were not present in the parental DNA patterns. The cloning of the unstable alleles for simple repeats can be performed by direct recovery from agarose gels of the aberrant unstable bands detected above. The recovered flanking regions can be cloned, sequenced and used for PCR detection of expanded alleles or can be used to screen cDNA. This method may be used for testing of small families with diseases thought to display clinical evidence of anticipation.

  11. Recent Advances of Repeat-induced Point Mutation (RIP) of DNA Sequence in Fungi

    Institute of Scientific and Technical Information of China (English)

    冯凤鹃; 曲志才; 田李; 王转斌

    2014-01-01

    Repeat-induced point mutation (RIP) was discovered in Neurospora crassa in 1987 by Selker. RIP searches for sequence duplications in haploid nuclei of premeiotic tissue and then litters them with numerous C to T mutations, producing T+A-rich fragments, so that the G-C pairs in the duplications are mutated to A-T. In addition, sequences affected by RIP, which are concentrated in centromeric regions and are predominantly relics of transposons, are left methylated. Mobile transposable elements are among the primary drivers of the evolution of eukaryotic genomes. In fungi, RIP silencing minimizes the deleterious effects of transposons by mutating multicopy DNA during meiosis. Exploring the impact of RIP-mutated transposons helps generate evolutionary inferences for phylogenetic and population genetic analyses. This paper reviews the mechanism of RIP and the progress of research on RIP in fungi.

  12. Telomere and ribosomal DNA repeats are chromosomal targets of the bloom syndrome DNA helicase

    Directory of Open Access Journals (Sweden)

    Paric Enesa

    2003-10-01

    Background: Bloom syndrome is one of the most cancer-predisposing disorders and is characterized by genomic instability and a high frequency of sister chromatid exchange. The disorder is caused by loss of function of a 3' to 5' RecQ DNA helicase, BLM. The exact role of BLM in maintaining genomic integrity is not known but the helicase has been found to associate with several DNA repair complexes and some DNA replication foci. Results: Chromatin immunoprecipitation of BLM complexes recovered telomere and ribosomal DNA repeats. The N-terminus of BLM, required for NB localization, is the same as the telomere association domain of BLM. The C-terminus is required for ribosomal DNA localization. BLM localizes primarily to the non-transcribed spacer region of the ribosomal DNA repeat where replication forks initiate. Bloom syndrome cells expressing the deletion alleles lacking the ribosomal DNA and telomere association domains have altered cell cycle populations with increased S or G2/M cells relative to normal. Conclusion: These results identify telomere and ribosomal DNA repeated sequence elements as chromosomal targets for the BLM DNA helicase during the S/G2 phase of the cell cycle. BLM is localized in nuclear bodies when it associates with telomeric repeats in both telomerase positive and negative cells. The BLM DNA helicase participates in genomic stability at ribosomal DNA repeats and telomeres.

  13. Telomere and ribosomal DNA repeats are chromosomal targets of the bloom syndrome DNA helicase.

    Science.gov (United States)

    Schawalder, James; Paric, Enesa; Neff, Norma F

    2003-10-27

    Bloom syndrome is one of the most cancer-predisposing disorders and is characterized by genomic instability and a high frequency of sister chromatid exchange. The disorder is caused by loss of function of a 3' to 5' RecQ DNA helicase, BLM. The exact role of BLM in maintaining genomic integrity is not known but the helicase has been found to associate with several DNA repair complexes and some DNA replication foci. Chromatin immunoprecipitation of BLM complexes recovered telomere and ribosomal DNA repeats. The N-terminus of BLM, required for NB localization, is the same as the telomere association domain of BLM. The C-terminus is required for ribosomal DNA localization. BLM localizes primarily to the non-transcribed spacer region of the ribosomal DNA repeat where replication forks initiate. Bloom syndrome cells expressing the deletion alleles lacking the ribosomal DNA and telomere association domains have altered cell cycle populations with increased S or G2/M cells relative to normal. These results identify telomere and ribosomal DNA repeated sequence elements as chromosomal targets for the BLM DNA helicase during the S/G2 phase of the cell cycle. BLM is localized in nuclear bodies when it associates with telomeric repeats in both telomerase positive and negative cells. The BLM DNA helicase participates in genomic stability at ribosomal DNA repeats and telomeres.

  14. Mining of simple sequence repeats in the Genome of Gentianaceae

    Directory of Open Access Journals (Sweden)

    R Sathishkumar

    2011-01-01

    Simple sequence repeats (SSRs), or short tandem repeats, are short repeat motifs that show a high level of length polymorphism due to insertion or deletion mutations of one or more repeat units. Here, we present the detection and abundance of microsatellites, or SSRs, in nucleotide sequences of the Gentianaceae family. A total of 545 SSRs were mined in 4698 nucleotide sequences downloaded from the National Center for Biotechnology Information (NCBI). Among the SSR sequences, the frequency of repeat types was 429 mononucleotide repeats, 99 dinucleotide repeats, 15 trinucleotide repeats, and 2 hexanucleotide repeats. Mononucleotide repeats were the most abundant repeat type (about 78%), followed by dinucleotide repeats (18.16%). An attempt was made to design primer pairs for the 545 identified SSRs, but primers were found for only 169 sequences.
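    The mining step summarized above can be illustrated with a minimal Python sketch. It is not the tool used in the study: the function names `find_ssrs` and `class_shares`, the minimum-copy thresholds, and the toy sequence are all assumptions made for illustration. Only the four class counts (429, 99, 15, 2) come from the abstract.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, copies) for perfect tandem repeats in seq."""
    if min_repeats is None:
        # Hypothetical thresholds (motif length -> minimum copies); not from the abstract.
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" when k = 2).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits

def class_shares(counts):
    """Percentage of each repeat class among all SSRs."""
    total = sum(counts.values())
    return {name: 100.0 * c / total for name, c in counts.items()}

# Toy sequence (invented) with one mono-, one di- and one trinucleotide repeat.
toy = "GG" + "A" * 12 + "CT" + "AG" * 7 + "TTC" + "CAG" * 5 + "T"
ssrs = find_ssrs(toy)

# Class counts as reported in the abstract above.
shares = class_shares({"mono": 429, "di": 99, "tri": 15, "hexa": 2})
```

    With the reported counts, the four classes sum to the stated total of 545 SSRs, and the mononucleotide share works out to roughly 78.7%, in line with the "about 78%" given in the abstract.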

  15. Information Theory of DNA Sequencing

    CERN Document Server

    Motahari, Abolfazl; Tse, David

    2012-01-01

    DNA sequencing is the basic workhorse of modern day biology and medicine. Shotgun sequencing is the dominant technique used: many randomly located short fragments called reads are extracted from the DNA sequence, and these reads are assembled to reconstruct the original sequence. By drawing an analogy between the DNA sequencing problem and the classic communication problem, we define an information theoretic notion of sequencing capacity. This is the maximum number of DNA base pairs that can be resolved reliably per read, and provides a fundamental limit to the performance that can be achieved by any assembly algorithm. We compute the sequencing capacity explicitly for a simple statistical model of the DNA sequence and the read process. Using this framework, we also study the impact of noise in the read process on the sequencing capacity.

  16. In vitro nucleosome positioning features of DNA repeat sequences associated with human genetic disease

    Institute of Scientific and Technical Information of China (English)

    柴荣; 赵宏宇; 蔡禄

    2013-01-01

    Objective: To investigate in vitro nucleosome positioning on DNA repeat sequences that can cause human genetic disease. Methods: Recombinant plasmids containing (GAA)42, (ATTCT)43, (GCCT)18 and the 601 sequence were constructed. Chromatin was assembled in vitro from the plasmids and histone octamers by salt dialysis, digested with micrococcal nuclease, and then analyzed by agarose gel electrophoresis. Results: The plasmid containing the ATTCT repeat sequence formed nucleosomes more readily in vitro than the plasmid containing the GAA repeat sequence. Conclusions: The inserted repeat fragments differ in their ability to form nucleosomes, which changes the ability of the recombinant plasmids to form chromatin structure.

  17. DNA Sequencing Using capillary Electrophoresis

    Energy Technology Data Exchange (ETDEWEB)

    Dr. Barry Karger

    2011-05-09

    application papers of sequencing up to this level were also published in the mid 1990's. A major interest of the sequencing community has always been read length. The longer the sequence read per run, the more efficient the process and the better the ability to read repeat sequences. We therefore devoted a great deal of time to studying the factors influencing read length in capillary electrophoresis, including polymer type and molecular weight, capillary column temperature, applied electric field, etc. In our initial optimization, we were able to demonstrate, for the first time, the sequencing of over 1000 bases with 90% accuracy. The run required 80 minutes for separation. Sequencing of 1000 bases per column was next demonstrated on a multiple capillary instrument. Our studies revealed that linear polyacrylamide produced the longest read lengths because the hydrophilic single strand DNA had minimal interaction with the very hydrophilic linear polyacrylamide. Any interaction of the DNA with the polymer would lead to broader peaks and lower read length. Another important parameter was the molecular weight of the linear chains. High molecular weight (> 1 MDa) was important to allow the long single strand DNA to reptate through the entangled polymer matrix. In an important paper, we showed an inverse emulsion method to reproducibly prepare linear polyacrylamide polymer with an average molecular weight of 9 MDa. This approach was used in the polymer for sequencing the human genome. Another critical factor in the successful use of capillary electrophoresis for sequencing was the sample preparation method. In the Sanger sequencing reaction, high concentrations of salts and dideoxynucleotides remained. Since the sample was introduced to the capillary column by electrokinetic injection, these salt ions would be favorably injected into the column over the sequencing fragments, thus reducing the signal for longer fragments and hence reducing read length. In two papers, we examined the role of

  18. Visible periodicity of strong nucleosome DNA sequences.

    Science.gov (United States)

    Salih, Bilal; Tripathi, Vijay; Trifonov, Edward N

    2015-01-01

    Fifteen years ago, Lowary and Widom assembled nucleosomes on synthetic random sequence DNA molecules, selected the strongest nucleosomes and discovered that the TA dinucleotides in these strong nucleosome sequences often appear at 10-11 bases from one another or at distances which are multiples of this period. We repeated this experiment computationally, on large ensembles of natural genomic sequences, by selecting the strongest nucleosomes--i.e. those with such distances between like-named dinucleotides, multiples of 10.4 bases, the structural and sequence period of nucleosome DNA. The analysis confirmed the periodicity of TA dinucleotides in the strong nucleosomes, and revealed as well other periodic sequence elements, notably classical AA and TT dinucleotides. The matrices of DNA bendability and their simple linear forms--nucleosome positioning motifs--are calculated from the strong nucleosome DNA sequences. The motifs are in full accord with nucleosome positioning sequences derived earlier, thus confirming that the new technique, indeed, detects strong nucleosomes. Species- and isochore-specific variations of the matrices and of the positioning motifs are demonstrated. The strong nucleosome DNA sequences manifest the highest hitherto nucleosome positioning sequence signals, showing the dinucleotide periodicities in directly observable rather than in hidden form.
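    The kind of periodicity described above can be made visible with a simple distance histogram between occurrences of the same dinucleotide. The Python sketch below is only an illustration under stated assumptions: the function name `dinucleotide_distances`, the `max_dist` cutoff, and the toy sequence (with an exact 10-base spacing instead of the 10.4-base period of real nucleosome DNA) are invented here and do not reproduce the authors' selection procedure.

```python
from collections import Counter

def dinucleotide_distances(seq, dinuc="TA", max_dist=40):
    """Histogram of distances between occurrences of the same dinucleotide."""
    seq = seq.upper()
    # Start positions of every occurrence of the dinucleotide.
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]
    hist = Counter()
    for idx, a in enumerate(positions):
        for b in positions[idx + 1:]:
            d = b - a
            if d > max_dist:
                break  # positions are sorted, so later ones are even farther away
            hist[d] += 1
    return hist

# Toy sequence (invented): a TA placed every 10 bases.
toy = ("TA" + "G" * 8) * 5
hist = dinucleotide_distances(toy)
```

    In a periodic sequence the histogram peaks at the period and its multiples; for real data the counts would need to be compared with a shuffled-sequence background before drawing conclusions.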

  19. Biosensors for DNA sequence detection

    Science.gov (United States)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  20. A DNA-binding protein from Candida albicans that binds to the RPG box of Saccharomyces cerevisiae and the telomeric repeat sequence of C. albicans.

    Science.gov (United States)

    Ishii, N; Yamamoto, M; Lahm, H W; Iizumi, S; Yoshihara, F; Nakayama, H; Arisawa, M; Aoki, Y

    1997-02-01

    Electromobility shift assays with a DNA probe containing the Saccharomyces cerevisiae ENO1 RPG box identified a specific DNA-binding protein in total protein extracts of Candida albicans. The protein, named Rbf1p (RPG-box-binding protein 1), bound to other S. cerevisiae RPG boxes, although the nucleotide recognition profile was not completely the same as that of S. cerevisiae Rap1p (repressor-activator protein 1), an RPG-box-binding protein. The repetitive sequence of the C. albicans chromosomal telomere also competed with RPG-box binding to Rbf1p. For further analysis, we purified Rbf1p 57,600-fold from C. albicans total protein extracts, raised mAbs against the purified protein and immunologically cloned the gene, whose ORF specified a protein of 527 aa. The bacterially expressed protein showed RPG-box-binding activity with the same profile as that of the purified one. Rbf1p, which contains two glutamine-rich regions of the kind found in many transcription factors, showed transcriptional activation capability in S. cerevisiae and was predominantly observed in nuclei. These results suggest that Rbf1p is a transcription factor with telomere-binding activity in C. albicans.

  1. Alu repeats as markers for forensic DNA analyses

    Energy Technology Data Exchange (ETDEWEB)

    Batzer, M.A.; Alegria-Hartman, M. [Lawrence Livermore National Lab., CA (United States); Kass, D.H. [Louisiana State Univ., New Orleans, LA (United States)] [and others

    1994-01-01

    The Human-Specific (HS) subfamily of Alu sequences comprises a group of 500 nearly identical members which are almost exclusively restricted to the human genome. Individual subfamily members share an average of 98.9% nucleotide identity with the HS subfamily consensus sequence, and have an average age of 2.8 million years. We have developed a Polymerase Chain Reaction (PCR) based assay using primers complementary to the 5' and 3' unique flanking DNA sequences from each HS Alu that allow the locus to be assayed for the presence or absence of the Alu repeat. The dimorphic HS Alu sequences probably inserted in the human genome after the radiation of modern humans (within the last 200,000 to one million years) and represent a unique source of information for human population genetics and forensic DNA analyses. These sites can be developed into Dimorphic Alu Sequence Tagged Sites (DASTS) for the Human Genome Project. HS Alu family member insertions differ from other types of polymorphism (e.g. Variable Number of Tandem Repeat [VNTR] or Restriction Fragment Length Polymorphism [RFLP]) in that polymorphisms due to Alu insertions arise as a result of a unique event which has occurred only one time in the human population and spread through the population from that point. Therefore, individuals that share HS Alu repeats inherited these elements from a common ancestor. Most VNTR and RFLP polymorphisms may arise multiple times in parallel within a population.

  2. Graphene nanodevices for DNA sequencing

    Science.gov (United States)

    Heerema, Stephanie J.; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology.

  3. Chromosome-specific DNA Repeat Probes

    Energy Technology Data Exchange (ETDEWEB)

    Baumgartner, Adolf; Weier, Jingly Fung; Weier, Heinz-Ulrich G.

    2006-03-16

    In research as well as in clinical applications, fluorescence in situ hybridization (FISH) has gained increasing popularity as a highly sensitive technique to study cytogenetic changes. Today, hundreds of commercially available DNA probes serve the basic needs of the biomedical research community. Widespread applications, however, are often limited by the lack of appropriately labeled, specific nucleic acid probes. We describe two approaches for an expeditious preparation of chromosome-specific DNAs and the subsequent probe labeling with reporter molecules of choice. The described techniques allow the preparation of highly specific DNA repeat probes suitable for enumeration of chromosomes in interphase cell nuclei or tissue sections. In addition, there is no need for chromosome enrichment by flow cytometry and sorting or molecular cloning. Our PCR-based method uses either bacterial artificial chromosomes or human genomic DNA as templates with α-satellite-specific primers. Here we demonstrate the production of fluorochrome-labeled DNA repeat probes specific for human chromosomes 17 and 18 in just a few days without the need for highly specialized equipment and without the limitation to only a few fluorochrome labels.

  4. Simple sequence repeat map of the sunflower genome.

    Science.gov (United States)

    Tang, S.; Yu, J.-K.; Slabaugh, B.; Shintani, K.; Knapp, J.

    2002-12-01

    Several independent molecular genetic linkage maps of varying density and completeness have been constructed for cultivated sunflower (Helianthus annuus L.). Because of the dearth of sequence and probe-specific DNA markers in the public domain, the various genetic maps of sunflower have not been integrated and a single reference map has not emerged. Moreover, comparisons between maps have been confounded by multiple linkage group nomenclatures and the lack of common DNA markers. The goal of the present research was to construct a dense molecular genetic linkage map for sunflower using simple sequence repeat (SSR) markers. First, 879 SSR markers were developed by identifying 1,093 unique SSR sequences in the DNA sequences of 2,033 clones isolated from genomic DNA libraries enriched for (AC)n or (AG)n and screening 1,000 SSR primer pairs; 579 of the newly developed SSR markers (65.9% of the total) were polymorphic among four elite inbred lines (RHA280, RHA801, PHA and PHB). The genetic map was constructed using 94 RHA280 x RHA801 F7 recombinant inbred lines (RILs) and 408 polymorphic SSR markers (462 SSR marker loci segregated in the mapping population). Of the latter, 459 coalesced into 17 linkage groups presumably corresponding to the 17 chromosomes in the haploid sunflower genome (x = 17). The map was 1,368.3 cM long and had a mean density of 3.1 cM per locus. The SSR markers described herein supply a critical mass of DNA markers for constructing genetic maps of sunflower and create the basis for unifying and cross-referencing the multitude of genetic maps developed for wild and cultivated sunflowers.

  5. A Model of DNA Repeat-Assembled Mitotic Chromosomal Skeleton

    OpenAIRE

    Shao-Jun Tang

    2011-01-01

    Despite intensive investigation for decades, the principle of higher-order organization of mitotic chromosomes is unclear. Here, I describe a novel model that emphasizes a critical role of interactions of homologous DNA repeats (repetitive elements; repetitive sequences) in mitotic chromosome architecture. According to the model, DNA repeats are assembled, via repeat interactions (pairing), into compact core structures that govern the arrangement of chromatins in mitotic chromosomes. Tandem r...

  6. Role of DNA Polymerases in Repeat-Mediated Genome Instability

    Directory of Open Access Journals (Sweden)

    Kartik A. Shah

    2012-11-01

    Expansions of simple DNA repeats cause numerous hereditary diseases in humans. We analyzed the role of DNA polymerases in the instability of Friedreich’s ataxia (GAA)n repeats in a yeast experimental system. The elementary step of expansion corresponded to ∼160 bp in the wild-type strain, matching the size of Okazaki fragments in yeast. This step increased when DNA polymerase α was mutated, suggesting a link between the scale of expansions and Okazaki fragment size. Expandable repeats strongly elevated the rate of mutations at substantial distances around them, a phenomenon we call repeat-induced mutagenesis (RIM). Notably, defects in the replicative DNA polymerases δ and ε strongly increased rates for both repeat expansions and RIM. The increases in repeat-mediated instability observed in DNA polymerase δ mutants depended on translesion DNA polymerases. We conclude that repeat expansions and RIM are two sides of the same replicative mechanism.

  7. Direct detection of expanded trinucleotide repeats using DNA hybridization techniques

    Energy Technology Data Exchange (ETDEWEB)

    Petronis, A.; Tatuch, Y.; Kennedy, J.L. [Univ. of Toronto (Canada)] [and others

    1994-09-01

    Recently, unstable trinucleotide repeats have been shown to be the etiologic factor in several neuropsychiatric diseases, and they may play a similar role in other disorders. To our knowledge, a method that detects expanded trinucleotide sequences with the opportunity for direct localization and cloning has not been achieved. We have developed a set of hybridization-based methods for direct detection of unstable DNA expansion. Our analysis of myotonic dystrophy patients that possess different degrees of (CTG)n expansion, versus unaffected controls, has demonstrated the identification of the trinucleotide instability site without any prior information regarding genetic map location. High stringency modified Southern blot hybridization with a PCR-generated trinucleotide repeat probe allowed us to detect the DNA fragment containing the expansion in myotonic dystrophy patients. The same probe was used for fluorescent in situ hybridization and several regions of (CTG)n/(CAG)n repeats in the human genome were detected, including the myotonic dystrophy locus on chromosome 19q. These strategies can be applied to directly clone genes involved in disorders caused by unstable DNA.

  8. Evolution of ribosomal DNA-derived satellite repeat in tomato genome

    Directory of Open Access Journals (Sweden)

    Hur Cheol-Goo

    2009-04-01

    Background: Tandemly repeated DNA, also called satellite DNA, is a common feature of eukaryotic genomes. Satellite repeats can expand and contract dramatically, which may cause genome size variation among genetically related species. However, their origin and expansion mechanism are not yet clear and need to be elucidated. Results: FISH analysis revealed that a satellite repeat showing homology with the intergenic spacer (IGS) of rDNA is present in the tomato genome. By comparing the sequences representing distinct stages in the divergence of the rDNA repeat with those of canonical rDNA arrays, the molecular mechanism of the evolution of the satellite repeat is described. Comprehensive sequence analysis and phylogenetic analysis demonstrated that a long terminal repeat retrotransposon was interrupted into each copy of the 18S rDNA and polymerized by recombination rather than transposition via an RNA intermediate. The repeat was expanded through doubling the number of IGS into the 25S rRNA gene, and also by greatly increasing the copy number of the type I subrepeat in the IGS of 25-18S rDNA by segmental duplication. Homogenization to a single type of subrepeat in the satellite repeat was achieved by amplifying the copy number of the type I subrepeat while eliminating neighboring sequences, including the type II subrepeat and rRNA coding sequence, from the array. FISH analysis revealed that the satellite repeats are commonly present in closely related Solanum species, but vary in their distribution and abundance among species. Conclusion: These results indicate that the dynamic satellite repeats originated from the intergenic spacer of the rDNA unit in the tomato genome. This result could serve as an example for understanding the initiation and expansion of satellite repeats in complex eukaryotic genomes.

  9. Always look on both sides: phylogenetic information conveyed by simple sequence repeat allele sequences.

    Directory of Open Access Journals (Sweden)

    Stéphanie Barthe

    Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily, mutations in the target sequences follow the stepwise mutation model (SMM). Generally speaking, PCR amplicon sizes are used as direct indicators of the number of SSR repeats composing an allele, with the data analysis either ignoring the extent of allele size differences or assuming that there is a direct correlation between differences in amplicon size and evolutionary distance. However, without precisely knowing the kind and distribution of polymorphism within an allele (SSR) and the associated flanking region (FR) sequences, it is hard to say what kind of evolutionary message is conveyed by such a synthetic descriptor of polymorphism as DNA amplicon size. In this study, we sequenced several SSR alleles in multiple populations of three divergent tree genera and disentangled the types of polymorphisms contained in each portion of the DNA amplicon containing an SSR. The patterns of diversity provided by amplicon size variation, SSR variation itself, insertions/deletions (indels), and single nucleotide polymorphisms (SNPs) observed in the FRs were compared. Amplicon size variation largely reflected SSR repeat number. The amount of variation was as large in FRs as in the SSR itself. The former contributed significantly to the phylogenetic information and sometimes was the main source of differentiation among individuals and populations contained by FR and SSR regions of SSR markers. The presence of mutations occurring at different rates within a marker's sequence offers the opportunity to analyse evolutionary events occurring on various timescales, but at the same time calls for caution in the interpretation of SSR marker data when the distribution of within

  10. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group...
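    A search for the most frequent 9-10mers, as mentioned above, amounts to counting overlapping k-mers. The Python sketch below is a hedged illustration rather than the authors' pipeline: the function name `most_common_kmers` is hypothetical, the 10-base motif in the toy sequence is invented (it is not a real DNA uptake sequence), and a genome-scale analysis would also have to consider both strands and restrict the count to coding regions.

```python
from collections import Counter

def most_common_kmers(seq, k, top=3):
    """Count overlapping k-mers in seq and return the `top` most frequent."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return counts.most_common(top)

# Toy sequence: an invented 10-mer placed three times between short spacers.
motif = "ACGTTGCAAT"  # hypothetical motif, for illustration only
toy = motif + "TTTT" + motif + "CCCC" + motif
best = most_common_kmers(toy, k=10, top=1)
```

    Here the planted motif is the only 10-mer that occurs more than once, so it is returned as the single most frequent k-mer.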

  11. Tandem repeat DNA localizing on the proximal DAPI bands of chromosomes in Larix, Pinaceae.

    Science.gov (United States)

    Hizume, Masahiro; Shibata, Fukashi; Matsumoto, Ayako; Maruyama, Yukie; Hayashi, Eiji; Kondo, Teiji; Kondo, Katsuhiko; Zhang, Shozo; Hong, Deyuan

    2002-08-01

    Repetitive DNA was cloned from HindIII-digested genomic DNA of Larix leptolepis. The repetitive DNA was about 170 bp long, had an AT content of 67%, and was organized tandemly in the genome. Using fluorescence in situ hybridization and subsequent DAPI banding, the repetitive DNA was localized in DAPI bands at the proximal region of one arm of chromosomes in L. leptolepis and Larix chinensis. Southern blot hybridization to genomic DNA of seven species and five varieties probed with cloned repetitive DNA showed that the repetitive DNA family was present in a tandem organization in genomes of all Larix taxa examined. In addition to the 170-bp sequence, a 220-bp sequence belonging to the same DNA family was also present in 10 taxa. The 220-bp repeat unit was a partial duplication of the 170-bp repeat unit. The 220-bp repeat unit was more abundant in L. chinensis and Larix potaninii var. macrocarpa than in other taxa. The repetitive DNA composed 2.0-3.4% of the genome in most taxa and 0.3 and 0.5% of the genome in L. chinensis and L. potaninii var. macrocarpa, respectively. The unique distribution of the 220-bp repeat unit in Larix indicates the close relationship of these two species. In the family Pinaceae, the LPD (Larix proximal DAPI band specific repeat sequence) family is widely distributed, but its amount is very small except in the genus Larix. The LPD family probably became abundant in Larix after the speciation of the genus.

  12. DNA Sequencing Sensors: An Overview

    Directory of Open Access Journals (Sweden)

    Jose Antonio Garrido-Cardenas

    2017-03-01

    The first sequencing of a complete genome was published forty years ago by the double Nobel Prize in Chemistry winner Frederick Sanger. That corresponded to the small-sized genome of a bacteriophage, but since then the DNA of many complex organisms has been sequenced. This was possible thanks to continuous advances in the fields of biochemistry and molecular genetics, but also in other areas such as nanotechnology and computing. Nowadays, sequencing sensors based on genetic material have little to do with those used by Sanger. The emergence of mass sequencing sensors, or new generation sequencing (NGS), meant a quantitative leap both in the volume of genetic material that could be sequenced in each trial and in the time per run and its cost. One can envisage that incoming technologies, already known as fourth generation sequencing, will continue to cheapen the trials by increasing DNA reading lengths in each run. All of this would be impossible without sensors and detection systems becoming smaller and more precise. This article provides a comprehensive overview of sensors for DNA sequencing developed within the last 40 years.

  13. Clusters of nucleotide substitutions and insertion/deletion mutations are associated with repeat sequences.

    Directory of Open Access Journals (Sweden)

    Michael J McDonald

    2011-06-01

    The genome-sequencing gold rush has facilitated the use of comparative genomics to uncover patterns of genome evolution, although their causal mechanisms remain elusive. One such trend, ubiquitous to prokarya and eukarya, is the association of insertion/deletion mutations (indels) with increases in the nucleotide substitution rate extending over hundreds of base pairs. The prevailing hypothesis is that indels are themselves mutagenic agents. Here, we employ population genomics data from Escherichia coli, Saccharomyces paradoxus, and Drosophila to provide evidence suggesting that it is not the indels per se but the sequence in which indels occur that causes the accumulation of nucleotide substitutions. We found that about two-thirds of indels are closely associated with repeat sequences and that repeat sequence abundance could be used to identify regions of elevated sequence diversity, independently of indels. Moreover, the mutational signature of indel-proximal nucleotide substitutions matches that of error-prone DNA polymerases. We propose that repeat sequences promote an increased probability of replication fork arrest, causing the persistent recruitment of error-prone DNA polymerases to specific sequence regions over evolutionary time scales. Experimental measures of the mutation rates of engineered DNA sequences and analyses of experimentally obtained collections of spontaneous mutations provide molecular evidence supporting our hypothesis. This study uncovers a new role for repeat sequences in genome evolution and provides an explanation of how fine-scale sequence contextual effects influence mutation rates and thereby evolution.

  14. Clusters of nucleotide substitutions and insertion/deletion mutations are associated with repeat sequences.

    Science.gov (United States)

    McDonald, Michael J; Wang, Wei-Chi; Huang, Hsien-Da; Leu, Jun-Yi

    2011-06-01

    The genome-sequencing gold rush has facilitated the use of comparative genomics to uncover patterns of genome evolution, although their causal mechanisms remain elusive. One such trend, ubiquitous to prokarya and eukarya, is the association of insertion/deletion mutations (indels) with increases in the nucleotide substitution rate extending over hundreds of base pairs. The prevailing hypothesis is that indels are themselves mutagenic agents. Here, we employ population genomics data from Escherichia coli, Saccharomyces paradoxus, and Drosophila to provide evidence suggesting that it is not the indels per se but the sequence in which indels occur that causes the accumulation of nucleotide substitutions. We found that about two-thirds of indels are closely associated with repeat sequences and that repeat sequence abundance could be used to identify regions of elevated sequence diversity, independently of indels. Moreover, the mutational signature of indel-proximal nucleotide substitutions matches that of error-prone DNA polymerases. We propose that repeat sequences promote an increased probability of replication fork arrest, causing the persistent recruitment of error-prone DNA polymerases to specific sequence regions over evolutionary time scales. Experimental measures of the mutation rates of engineered DNA sequences and analyses of experimentally obtained collections of spontaneous mutations provide molecular evidence supporting our hypothesis. This study uncovers a new role for repeat sequences in genome evolution and provides an explanation of how fine-scale sequence contextual effects influence mutation rates and thereby evolution.

  15. Molecular characterization and physical localization of highly repetitive DNA sequences from Brazilian Alstroemeria species

    NARCIS (Netherlands)

    Kuipers, A.G.J.; Kamstra, S.A.; Jeu, de M.J.; Jacobsen, E.

    2002-01-01

    Highly repetitive DNA sequences were isolated from genomic DNA libraries of Alstroemeria psittacina and A. inodora. Among the repetitive sequences that were isolated, tandem repeats as well as dispersed repeats could be discerned. The tandem repeats belonged to a family of interlinked Sau3A subfragm

  16. Structural Complexity of DNA Sequence

    Directory of Open Access Journals (Sweden)

    Cheng-Yuan Liou

    2013-01-01

    In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from statistical methods. Furthermore, it is compared with a topological entropy-based method to assess where the complexity results agree and where they differ.

  17. Construction and identification of DNA libraries enriched for microsatellite repeat sequences of Chinese hamster

    Institute of Scientific and Technical Information of China (English)

    宋国华; 耿佳宁; 贾若愚; 岳文斌; 刘田福; 胡松年

    2011-01-01

    Objective: To screen microsatellite loci in the Chinese hamster genome, laying a foundation for genetic studies of Chinese hamster germplasm resources such as classification and evolution. Methods: Genomic DNA from Chinese hamster was fragmented by ultrasonication. Fragments of 500 to 1000 bp were recovered by 2% agarose gel electrophoresis and ligated to SNX linkers with T4 DNA ligase, then denatured and hybridized to 14 biotinylated microsatellite oligonucleotide probes. The biotinylated hybrids were captured on streptavidin-coupled magnetic beads through the strong affinity between biotin and streptavidin, then washed and eluted. The eluted products were amplified by PCR, cloned into the pGEM-T plasmid vector, and transformed into Escherichia coli DH10B to construct DNA libraries enriched for microsatellite repeat sequences of Chinese hamster. Results: Sequencing showed that 70.3% of the clones contained microsatellite DNA sequences, indicating a high degree of microsatellite enrichment. Conclusions: The microsatellite library and the microsatellites screened from it will provide a large number of markers for Chinese hamster genetic linkage mapping, molecular evolution, and phylogenetic studies.

  18. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance, and even biology. There has been increasing interest in unravelling the mysteries of DNA: for example, how to distinguish coding from noncoding sequences, and how to classify organisms and determine their evolutionary relationships, are key problems in bioinformatics. Although much research has considered the long-range correlations in DNA sequences, and other authors have used the global fractal dimension in such work, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view) to DNA sequence analysis. We have also used the fractal dimension, the correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

  19. Repeated extraction of DNA from FTA cards

    DEFF Research Database (Denmark)

    Stangegaard, Michael; Ferrero, Laura; Børsting, Claus;

    2011-01-01

    Extraction of DNA using magnetic bead based techniques on automated DNA extraction instruments provides a fast, reliable and reproducible method for DNA extraction from various matrices. However, the yield of extracted DNA from FTA-cards is typically low. Here, we demonstrate that it is possible ...

  20. Sequencing analysis of the spinal bulbar muscular atrophy CAG expansion reveals absence of repeat interruptions.

    Science.gov (United States)

    Fratta, Pietro; Collins, Toby; Pemble, Sally; Nethisinghe, Suran; Devoy, Anny; Giunti, Paola; Sweeney, Mary G; Hanna, Michael G; Fisher, Elizabeth M C

    2014-02-01

    Trinucleotide repeat disorders are a heterogeneous group of diseases caused by the expansion, beyond a pathogenic threshold, of unstable DNA tracts in different genes. Sequence interruptions in the repeats have been described in the majority of these disorders and may influence disease phenotype and heritability. Spinal bulbar muscular atrophy (SBMA) is a motor neuron disease caused by a CAG trinucleotide expansion in the androgen receptor (AR) gene. Diagnostic testing and previous research have relied on fragment analysis polymerase chain reaction to determine the AR CAG repeat size, and have therefore not been able to assess the presence of interruptions. We here report a sequencing study of the AR CAG repeat in a cohort of SBMA patients and control subjects in the United Kingdom. We found no repeat interruptions to be present, and we describe differences between sequencing and traditional sizing methods.

  1. Sequences Characterization of Microsatellite DNA Sequences in Pacific Abalone (Haliotis discus hannai)

    Institute of Scientific and Technical Information of China (English)

    LI Qi; Kijima Akihiro

    2007-01-01

    A microsatellite-enriched library was constructed using the magnetic bead hybridization selection method, and the microsatellite DNA sequences were analyzed in Pacific abalone Haliotis discus hannai. Three hundred and fifty white colonies were screened using a PCR-based technique, and 84 clones were identified as potentially containing a microsatellite repeat motif. The 84 clones were sequenced, and 42 microsatellites and 4 minisatellites with a minimum of five repeats were found (13.1% of white colonies screened). Besides the CA motif contained in the oligoprobe, we also found 16 other types of microsatellite repeats, including a dinucleotide repeat, two tetranucleotide repeats, twelve pentanucleotide repeats and a hexanucleotide repeat. According to Weber (1990), the microsatellite sequences obtained could be categorized structurally into perfect repeats (73.3%), imperfect repeats (13.3%), and compound repeats (13.4%). Among the microsatellite repeats, relatively short arrays (< 20 repeats) were most abundant, accounting for 75.0%. The longest microsatellite had 48 repeats, and the average number of repeats was 13.4. The data on the composition and length distribution of microsatellites obtained in the present study can be useful for choosing the repeat motifs for microsatellite isolation in other abalone species.

  2. Novel Y-chromosome Short Tandem Repeat Variants Detected Through the Use of Massively Parallel Sequencing

    Directory of Open Access Journals (Sweden)

    David H. Warshauer

    2015-08-01

    Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically-relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently-observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles.

  3. Novel Y-chromosome Short Tandem Repeat Variants Detected Through the Use of Massively Parallel Sequencing

    Institute of Scientific and Technical Information of China (English)

    David H Warshauer; Jennifer D Churchill; Nicole Novroski; Jonathan L King; Bruce Budowle

    2015-01-01

    Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically-relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently-observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles.

  4. Highly Informative Simple Sequence Repeat (SSR) Markers for Fingerprinting Hazelnut

    Science.gov (United States)

    Simple sequence repeat (SSR) or microsatellite markers have many applications in breeding and genetic studies of plants, including fingerprinting of cultivars and investigations of genetic diversity, and therefore provide information for better management of germplasm collections. They are repeatab...

  5. simple sequence repeat (SSR) markers in genetic analysis of

    African Journals Online (AJOL)

    Yomi

    2012-08-28

    Aug 28, 2012 ... In the present study, 78 mapped simple sequence repeat (SSR) markers representing 11 ... mean (UPGMA) with each cluster representing a particular Vigna species. ..... were reported to be more frequent than the compound.

  6. Study of simple sequence repeat (SSR) polymorphism for biotic ...

    African Journals Online (AJOL)

    home

    2013-10-02

    Oct 2, 2013 ... back cross breeding; SSRs, simple sequence repeats; PIC, polymorphism ..... PIC values were reported in barley wheat and rice (Gu et ... doubled-haploid rice population. Theor. ... Grover A, Aishwarya V, Sharma PC (2007).

  7. [Heterogeneity and homologies of the repeating and unique DNA of dragonflies (Odonata, Insecta)].

    Science.gov (United States)

    Petrov, N B; Aleshin, V V

    1983-01-01

    The relative content of unique and reiterated nucleotide sequences in the DNA of eleven dragonfly species was estimated. The degree of intra- and intergenomic divergence of these DNA sequences was determined by means of DNA-DNA hybridization. Species from different genera share 40-45% of the repetitive sequences, and those from different families only 11 to 20%. Data on the thermostability of homo- and heteroduplexes suggest that new families of repetitive sequences have arisen repeatedly during dragonfly evolution. The quantity of homologous unique sequences in the DNA compared (20-97%) correlates with the taxonomic relationships of the species. Phylogenesis of some dragonfly families is discussed in view of the results obtained.

  8. An Optimal Seed Based Compression Algorithm for DNA Sequences

    Directory of Open Access Journals (Sweden)

    Pamela Vinitha Eric

    2016-01-01

    This paper proposes a seed-based lossless compression algorithm for DNA sequences that uses a substitution method similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures inherent in DNA sequences by creating an offline dictionary that contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is on par with or better than existing lossless DNA sequence compression algorithms.
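
The substitution idea behind Lempel-Ziv-style schemes can be illustrated with a short sketch. This is not the seed-based algorithm of the paper: it is a plain LZ77-style coder that replaces exact repeats only, with no mismatch handling and no offline dictionary. The function names and the `min_match` and `window` parameters are assumptions chosen for illustration.

```python
def lz_encode(seq, min_match=4, window=4096):
    """Replace exact repeats with (offset, length) back-references.

    Illustrative LZ77-style sketch; min_match and window are hypothetical
    parameters, not values taken from the paper.
    """
    tokens = []
    i, n = 0, len(seq)
    while i < n:
        best_len, best_off = 0, 0
        # Brute-force search of the preceding window for the longest match.
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < n and seq[j + length] == seq[i + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            tokens.append((best_off, best_len))  # back-reference to a repeat
            i += best_len
        else:
            tokens.append(seq[i])  # literal base
            i += 1
    return tokens


def lz_decode(tokens):
    """Invert lz_encode; copies base by base so overlapping repeats work."""
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            offset, length = token
            for _ in range(length):
                out.append(out[-offset])
        else:
            out.append(token)
    return "".join(out)
```

For a tandem repeat such as `"ACGT" * 3`, the encoder emits four literal bases followed by a single back-reference covering the remaining eight, and `lz_decode(lz_encode(seq))` returns the original sequence.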

  9. Molecular cytogenetic analysis and genomic organization of major DNA repeats in castor bean (Ricinus communis L.).

    Science.gov (United States)

    Alexandrov, O S; Karlov, G I

    2016-04-01

    This article addresses the bioinformatic, molecular genetic, and cytogenetic study of castor bean (Ricinus communis, 2n = 20), which belongs to the monotypic Ricinus genus within the Euphorbiaceae family. Because castor bean chromosomes are small, karyotypic studies are difficult. However, the use of DNA repeats has yielded new prospects for karyotypic research and genome characterization. In the present study, major DNA repeat sequences were identified, characterized and localized on mitotic metaphase and meiotic pachytene chromosomes. Analyses of the nucleotide composition, curvature models, and FISH localization of the rcsat39 repeat suggest that this repeat plays a key role in building heterochromatic arrays in castor bean. Additionally, the rcsat390 sequences were determined to be chromosome-specific repeats located in the pericentromeric region of mitotic chromosome A (pachytene chromosome 1). The localization of rcsat39, rcsat390, 45S and 5S rDNA genes allowed for the development of cytogenetic landmarks for chromosome identification. General questions linked to heterochromatin formation, DNA repeat distribution, and the evolutionary emergence of the genome are discussed. The article may be of interest to biologists studying small genome organization and short monomer DNA repeats.

  10. Human mitochondrial mTERF wraps around DNA through a left-handed superhelical tandem repeat.

    Science.gov (United States)

    Jiménez-Menéndez, Nereida; Fernández-Millán, Pablo; Rubio-Cosials, Anna; Arnan, Carme; Montoya, Julio; Jacobs, Howard T; Bernadó, Pau; Coll, Miquel; Usón, Isabel; Solà, Maria

    2010-07-01

    The regulation of mitochondrial DNA (mtDNA) processes is slowly being characterized at a structural level. We present here crystal structures of human mitochondrial regulator mTERF, a transcription termination factor also implicated in replication pausing, in complex with double-stranded DNA oligonucleotides containing the tRNA(Leu)(UUR) gene sequence. mTERF comprises nine left-handed helical tandem repeats that form a left-handed superhelix, the Zurdo domain.

  11. The human TTAGGG repeat factors 1 and 2 bind to a subset of interstitial telomeric sequences and satellite repeats

    Institute of Scientific and Technical Information of China (English)

    Thomas Simonet; Elena Giulotto; Frederique Magdinier; Béatrice Horard; Pascal Barbry; Rainer Waldmann; Eric Gison; Laure-Emmanuelle Zaragosi; Claude Philippe; Kevin Lebrigand; Clémentine Schouteden; Adeline Augereau; Serge Bauwens; Jing Ye; Marco Santagostino

    2011-01-01

    The study of the proteins that bind to telomeric DNA in mammals has provided a deep understanding of the mechanisms involved in chromosome-end protection. However, very little is known about the binding of these proteins to nontelomeric DNA sequences. The TTAGGG DNA repeat proteins 1 and 2 (TRF1 and TRF2) bind to mammalian telomeres as part of the shelterin complex and are essential for maintaining chromosome end stability. In this study, we combined chromatin immunoprecipitation with high-throughput sequencing to map at high sensitivity and resolution the human chromosomal sites to which TRF1 and TRF2 bind. While most of the identified sequences correspond to telomeric regions, we showed that these two proteins also bind to extratelomeric sites. The vast majority of these extratelomeric sites contain interstitial telomeric sequences (ITSs). However, we also identified non-ITS sites, which correspond to centromeric and pericentromeric satellite DNA. Interestingly, the TRF-binding sites are often located in the proximity of genes or within introns. We propose that TRF1 and TRF2 couple the functional state of telomeres to the long-range organization of chromosomes and gene regulation networks by binding to extratelomeric sequences.

  12. Repeat Sequences and Base Correlations in Human Y Chromosome Palindromes

    Institute of Scientific and Technical Information of China (English)

    Neng-zhi Jin; Zi-xian Liu; Yan-jiao Qi; Wen-yuan Qiu

    2009-01-01

    On the basis of information theory and statistical methods, we use mutual information, n-tuple entropy and conditional entropy, combined with biological characteristics, to analyze the long-range and short-range correlations in human Y chromosome palindromes. The magnitude of the long-range correlation, as reflected by the mutual information, follows the order P5>P5a>P5b, where P5a and P5b are the sequences obtained from human Y chromosome palindrome 5 (P5) by replacing only the Alu repeats and all interspersed repeats, respectively, with random uncorrelated sequences. The magnitude of the short-range correlation, as reflected by the n-tuple entropy and the conditional entropy, follows the order P5>P5a>P5b>random uncorrelated sequence. In other words, when the Alu repeats and then all interspersed repeats are replaced with random uncorrelated sequence, the long-range and short-range correlations decrease gradually, while the random uncorrelated sequence itself has no correlation. This research indicates that more repeat sequences result in stronger correlation between bases in the human Y chromosome. The analyses may help in understanding the special structures of human Y chromosome palindromes.
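
The mutual information used above as a measure of base-base correlation can be sketched in a few lines. This is a generic estimator from observed pair frequencies, not the authors' code; the function name and the distance parameter `k` are assumptions for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(seq, k):
    """Mutual information (in bits) between bases k positions apart.

    Probabilities are estimated from the observed pairs (seq[i], seq[i + k]).
    A value near 0 means the two positions look statistically independent.
    """
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)    # marginal counts of the first base
    right = Counter(b for _, b in pairs)   # marginal counts of the second base
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) / (p(a) * p(b)) simplifies to count * n / (left * right)
        mi += (count / n) * log2(count * n / (left[a] * right[b]))
    return mi
```

Evaluating this function over a range of `k` for a sequence and for versions of it with repeats replaced by random bases would reproduce the kind of comparison described in the abstract; a perfectly periodic sequence such as `"ACGT" * 50` gives the maximum of 2 bits at `k = 4`.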

  13. Chromatin structure of repeating CTG/CAG and CGG/CCG sequences in human disease.

    Science.gov (United States)

    Wang, Yuh-Hwa

    2007-05-01

    In eukaryotic cells, chromatin structure organizes genomic DNA in a dynamic fashion, and results in regulation of many DNA metabolic processes. The CTG/CAG and CGG/CCG repeating sequences involved in several neuromuscular degenerative diseases display differential abilities for the binding of histone octamers. The effect of the repeating DNA on nucleosome assembly could be amplified as the number of repeats increases. Also, CpG methylation, and sequence interruptions within the triplet repeats exert an impact on the formation of nucleosomes along these repeating DNAs. The two most common triplet expansion human diseases, myotonic dystrophy 1 and fragile X syndrome, are caused by the expanded CTG/CAG and CGG/CCG repeats, respectively. In addition to the expanded repeats and CpG methylation, histone modifications, chromatin remodeling factors, and noncoding RNA have been shown to coordinate the chromatin structure at both myotonic dystrophy 1 and fragile X loci. Alterations in chromatin structure at these two loci can affect transcription of these disease-causing genes, leading to disease symptoms. These observations have brought a new appreciation that a full understanding of disease gene expression requires a knowledge of the structure of the chromatin domain within which the gene resides.

  14. Survey of simple sequence repeats in woodland strawberry (Fragaria vesca).

    Science.gov (United States)

    Guan, L; Huang, J F; Feng, G Q; Wang, X W; Wang, Y; Chen, B Y; Qiao, Y S

    2013-07-30

    The use of simple sequence repeats (SSRs), or microsatellites, as genetic markers has become popular due to their abundance and variation in length among individuals. In this study, we investigated linkage groups (LGs) in the woodland strawberry (Fragaria vesca) and demonstrated variation in the abundances, densities, and relative densities of mononucleotide, dinucleotide, and trinucleotide repeats. Mononucleotide, dinucleotide, and trinucleotide repeats were more common than longer repeats in all LGs examined. Perfect SSRs were the predominant SSR type found and their abundance was extremely stable among LGs and chloroplasts. Abundances of mononucleotide, dinucleotide, and trinucleotide repeats were positively correlated with LG size, whereas those of tetranucleotide and hexanucleotide SSRs were not. Generally, in each LG, the abundance, relative abundance, relative density, and the proportion of each unique SSR all declined rapidly as the repeated unit increased. Furthermore, the lengths and frequencies of SSRs varied among different LGs.
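
A survey of perfect SSRs like the one described above can be approximated with regular expressions. The sketch below is not the pipeline used in the study; the minimum repeat counts in `MIN_REPEATS` and the function names are hypothetical, and real surveys use thresholds chosen per study.

```python
import re

# Hypothetical minimum repeat counts for mono- to hexanucleotide units.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter motif."""
    n = len(unit)
    return not any(
        n % d == 0 and unit == unit[:d] * (n // d) for d in range(1, n)
    )


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, unit, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, m in sorted(min_repeats.items()):
        # A k-base unit followed by at least m - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, m - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units such as "AA" or "ATAT", which belong to a shorter class.
            if is_primitive(unit):
                hits.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(hits)
```

Dividing the number of hits, or their total length, by the length of each linkage group would give abundance and density figures of the kind compared in the abstract.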

  15. Tandem repeats and G-rich sequences are enriched at human CNV breakpoints.

    Directory of Open Access Journals (Sweden)

    Promita Bose

    Chromosome breakage in germline and somatic genomes gives rise to copy number variation (CNV) responsible for genomic disorders and tumorigenesis. DNA sequence is known to play an important role in breakage at chromosome fragile sites; however, the sequences susceptible to double-strand breaks (DSBs) underlying CNV formation are largely unknown. Here we analyze 140 germline CNV breakpoints from 116 individuals to identify DNA sequences enriched at breakpoint loci compared to 2800 simulated control regions. We find that, overall, CNV breakpoints are enriched in tandem repeats and sequences predicted to form G-quadruplexes. G-rich repeats are overrepresented at terminal deletion breakpoints, which may be important for the addition of a new telomere. Interstitial deletions and duplication breakpoints are enriched in Alu repeats that in some cases mediate non-allelic homologous recombination (NAHR) between the two sides of the rearrangement. CNV breakpoints are enriched in certain classes of repeats that may play a role in DNA secondary structure, DSB susceptibility and/or DNA replication errors.

  16. Polymorphisms in the CAG repeat--a source of error in Huntington disease DNA testing.

    Science.gov (United States)

    Yu, S; Fimmel, A; Fung, D; Trent, R J

    2000-12-01

    Five of 400 patients (1.3%), referred for Huntington disease DNA testing, demonstrated a single allele on CAG alone, but two alleles when the CAG + CCG repeats were measured. The PCR assay failed to detect one allele in the CAG alone assay because of single-base silent polymorphisms in the penultimate or the last CAG repeat. The region around and within the CAG repeat sequence in the Huntington disease gene is a hot-spot for DNA polymorphisms, which can occur in up to 1% of subjects tested for Huntington disease. These polymorphisms may interfere with amplification by PCR, and so have the potential to produce a diagnostic error.

  17. Copy number of tandem direct repeats within the inverted repeats of Marek's disease virus DNA.

    Science.gov (United States)

    Kanamori, A; Nakajima, K; Ikuta, K; Ueda, S; Kato, S; Hirai, K

    1986-12-01

    We previously reported that DNA of the oncogenic strain BC-1 of Marek's disease virus serotype 1 (MDV1) contains three units of tandem direct repeats with 132 base pair (bp) repeats within the inverted repeats of the long regions of the MDV1 genome, whereas the attenuated, nononcogenic viral DNA contains multiple units of tandem direct repeats (Maotani et al., 1986). In the present study, the difference in the copy numbers of 132 bp repeats of oncogenic and nononcogenic MDV1 DNAs in other strains of MDV1 was investigated by Southern blot hybridization. The main copy numbers in different oncogenic MDV1 strains differed: those of BC-1, JM and highly oncogenic Md5 were 3, 5 to 12 and 2, respectively. The viral DNA population with two units of repeats was small, but detectable, in cells infected with either the oncogenic BC-1 or JM strain. The MDV1 DNA in various MD cell lines contained either two units or both two and three units of repeats. The significance of the copy number of repeats in oncogenicity of MDV1 is discussed.

  18. Information Analysis of DNA Sequences

    CERN Document Server

    Mohammed, Riyazuddin

    2010-01-01

    The problem of differentiating the informational content of coding (exons) and non-coding (introns) regions of a DNA sequence is one of the central problems of genomics. The introns are estimated to be nearly 95% of the DNA and since they do not seem to participate in the process of transcription of amino-acids, they have been termed "junk DNA." Although it is believed that the non-coding regions in genomes have no role in cell growth and evolution, demonstration that these regions carry useful information would tend to falsify this belief. In this paper, we consider entropy as a measure of information by modifying the entropy expression to take into account the varying length of these sequences. Exons are usually much shorter in length than introns; therefore the comparison of the entropy values needs to be normalized. A length correction strategy was employed using randomly generated nucleonic base strings built out of the alphabet of the same size as the exons under question. Our analysis shows that intron...

  19. Imperfect DNA mirror repeats in E. coli TnsA and other protein-coding DNA.

    Science.gov (United States)

    Lang, Dorothy M

    2005-09-01

    DNA imperfect mirror repeats (DNA-IMRs) are ubiquitous in protein-coding DNA. However, they overlap and often have different centers of symmetry, making it difficult to evaluate their relationship to each other and to specific DNA and protein motifs and structures. This paper describes a systematic method of determining a hierarchy for DNA-IMRs and evaluates their relationship to protein structural elements (PSEs)--helices, turns and beta-sheets. DNA-IMRs are identified by two different methods--DNA-IMRs terminated by reverse dinucleotides (rd-IMRs) and DNA-IMRs terminated by a single (mono) matching nucleotide (m-IMRs). Both rd-IMRs and m-IMRs are evaluated in 17 proteins, and illustrated in detail for TnsA. For each of the proteins, Fisher's exact test (FET) is used to measure the coincidence between the terminal dinucleotides of rd-IMRs and the terminal amino acids of individual PSEs. A significant correlation over a span of about 3 nt was found for each protein. The correlation is robust and for most genes, all rd-IMRs16 nt contain approximately 88% of the potential functional motifs. The protein translation of the longest rd- and m-IMRs span sequences important to the protein's structure and function. In all 17 proteins studied, the population of rd-IMRs is substantially less than the expected number and the population of m-IMRs greater than the expected number, indicating strong selective pressures. The association of rd-IMRs with PSEs restricts their spatial distribution, and therefore, their number. The greater than predicted number of m-IMRs indicates that DNA symmetry exists throughout the entire protein-coding region and may stabilize the sequence.

  1. EVOLUTION AND RECOMBINATION OF BOVINE DNA REPEATS

    NARCIS (Netherlands)

    JOBSE, C; BUNTJER, JB; HAAGSMA, N; BREUKELMAN, HJ; BEINTEMA, JJ; LENSTRA, JA

    1995-01-01

    The history of the abundant repeat elements in the bovine genome has been studied by comparative hybridization and PCR. The Bov-A and Bov-B SINE elements both emerged just after the divergence of the Camelidae and the true ruminants. A 31-bp subrepeat motif in satellites of the Bovidae species cattl

  2. Analysis of Simple Sequence Repeats in Genomes of Rhizobia

    Institute of Scientific and Technical Information of China (English)

    GAO Ya-mei; HAN Yi-qiang; TANG Hui; SUN Dong-mei; WANG Yan-jie; WANG Wei-dong

    2008-01-01

    Simple sequence repeats (SSRs), or microsatellites, are genetic markers that are ubiquitous in the genomes of various organisms. The analysis of SSRs in rhizobial genomes provides useful information for a variety of applications in the population genetics of rhizobia. We analyzed the occurrence, relative abundance, and relative density of SSRs in the Bradyrhizobium japonicum, Mesorhizobium loti, and Sinorhizobium meliloti genomes sequenced in the microorganisms tandem repeats database, and compared the SSRs of the three species with each other. The results showed that there were 1,410, 859, and 638 SSRs in the B. japonicum, M. loti, and S. meliloti genomes, respectively. In the genomes of B. japonicum, M. loti, and S. meliloti, tetranucleotide, pentanucleotide, and hexanucleotide repeats were more abundant and indicated higher mutation rates in these species. Mononucleotide repeats were the least abundant. The SSR types and distribution were similar among these species.

  3. Coevolution between simple sequence repeats (SSRs) and virus genome size

    Directory of Open Access Journals (Sweden)

    Zhao Xiangyan

    2012-08-01

    Full Text Available Abstract. Background: The relationship between the level of repetitiveness in genomic sequence and genome size has been investigated using complete prokaryotic and eukaryotic genomes, but such studies have rarely been carried out on virus genomes. Results: In this study, a total of 257 viruses were examined, covering 90% of genera. The results showed that the occurrence of simple sequence repeats (SSRs) is strongly, positively and significantly correlated with genome size. Each repeat class is distributed within a certain range of genome sequence length. Mono-, di- and tri- repeats are widely distributed in all virus genomes; tetra- SSRs are a common component of genomes larger than 100 kb; in the range of genome  Conclusions: We conducted this research at the level of viruses as a whole. We concluded that genome size is an important factor affecting the occurrence of SSRs; hosts are also responsible, to a certain degree, for the variance in SSR content.

  4. Unstable microsatellite repeats facilitate rapid evolution of coding and regulatory sequences.

    Science.gov (United States)

    Jansen, A; Gemayel, R; Verstrepen, K J

    2012-01-01

    Tandem repeats are intrinsically highly variable sequences since repeat units are often lost or gained during replication or following unequal recombination events. Because of their low complexity and their instability, these repeats, which are also called satellite repeats, are often considered to be useless 'junk' DNA. However, recent findings show that tandem repeats are frequently found within promoters of stress-induced genes and within the coding regions of genes encoding cell-surface and regulatory proteins. Interestingly, frequent changes in these repeats often confer phenotypic variability. Examples include variation in the microbial cell surface, rapid tuning of internal molecular clocks in flies, and enhanced morphological plasticity in mammals. This suggests that instead of being useless junk DNA, some variable tandem repeats are useful functional elements that confer 'evolvability', facilitating swift evolution and rapid adaptation to changing environments. Since changes in repeats are frequent and reversible, repeats provide a unique type of mutation that bridges the gap between rare genetic mutations, such as single nucleotide polymorphisms, and highly unstable but reversible epigenetic inheritance.

  5. A blackberry (Rubus L.) expressed sequence tag library for the development of simple sequence repeat markers

    Directory of Open Access Journals (Sweden)

    Main Dorrie S

    2008-06-01

    Full Text Available Abstract. Background: The recent development of novel repeat-fruiting types of blackberry (Rubus L.) cultivars, combined with a long history of morphological marker-assisted selection for thornlessness by blackberry breeders, has given rise to increased interest in using molecular markers to facilitate blackberry breeding. Yet no genetic maps, molecular markers, or even sequences exist specifically for cultivated blackberry. The purpose of this study is to begin development of these tools by generating and annotating the first blackberry expressed sequence tag (EST) library, designing primers from the ESTs to amplify regions containing simple sequence repeats (SSRs), and testing the usefulness of a subset of the EST-SSRs with two blackberry cultivars. Results: A cDNA library of 18,432 clones was generated from expanding leaf tissue of the cultivar Merton Thornless, a progenitor of many thornless commercial cultivars. Among the most abundantly expressed of the 3,000 genes annotated were those involved with energy, cell structure, and defense. From individual sequences containing SSRs, 673 primer pairs were designed. Of a randomly chosen set of 33 primer pairs tested with two blackberry cultivars, 10 detected an average of 1.9 polymorphic PCR products. Conclusion: This rate predicts that this library may yield as many as 940 SSR primer pairs detecting 1,786 polymorphisms. This may be sufficient to generate a genetic map that can be used to associate molecular markers with phenotypic traits, making possible molecular marker-assisted breeding to complement existing morphological marker-assisted breeding in blackberry.

  6. Sequence characterization of hypervariable regions in the soybean genome: leucine-rich repeats and simple sequence repeats

    Directory of Open Access Journals (Sweden)

    Everaldo G. de Barros

    2000-06-01

    Full Text Available The genetic basis of cultivated soybean is rather narrow. This observation has been confirmed by analysis of agronomic traits among different genotypes, and more recently by the use of molecular markers. During the construction of an RFLP soybean map (Glycine soja x Glycine max), the two progenitors were analyzed with over 2,000 probes, of which 25% were polymorphic. Among the probes that revealed polymorphisms, a small proportion, about 0.5%, hybridized to regions that were highly polymorphic. Here we report the sequencing and analysis of five of these probes. Three of the five contain segments that encode leucine-rich repeat (LRR) sequences homologous to known disease resistance genes in plants. Two other probes are relatively AT-rich and contain segments of (A)n/(T)n. DNA segments corresponding to one of the probes (A45-10) were amplified from nine soybean genotypes. Partial sequencing of these amplicons suggests that deletions and/or insertions are responsible for the extensive polymorphism observed. We propose that genes encoding LRR proteins and simple sequence repeat regions prone to slippage are some of the most hypervariable regions of the soybean genome.

  7. A novel tandem repeat sequence located on human chromosome 4p: isolation and characterization.

    Science.gov (United States)

    Kogi, M; Fukushige, S; Lefevre, C; Hadano, S; Ikeda, J E

    1997-06-01

    In an effort to analyze the genomic region of the distal half of human chromosome 4p, to where Huntington disease and other diseases have been mapped, we have isolated the cosmid clone (CRS447) that was likely to contain a region with specific repeat sequences. Clone CRS447 was subjected to detailed analysis, including chromosome mapping, restriction mapping, and DNA sequencing. Chromosome mapping by both a human-CHO hybrid cell panel and FISH revealed that CRS447 was predominantly located in the 4p15.1-15.3 region. CRS447 was shown to consist of tandem repeats of 4.7-kb units present on chromosome 4p. A single EcoRI unit was subcloned (pRS447), and the complete sequence was determined as 4752 nucleotides. When pRS447 was used as a probe, the number of copies of this repeat per haploid genome was estimated to be 50-70. Sequence analysis revealed that it contained two internal CA repeats and one putative ORF. Database search established that this sequence was unreported. However, two homologous STS markers were found in the database. We concluded that CRS447/pRS447 is a novel tandem repeat sequence that is mainly specific to human chromosome 4p.

  8. A Model of DNA Repeat-Assembled Mitotic Chromosomal Skeleton

    Directory of Open Access Journals (Sweden)

    Shao-Jun Tang

    2011-09-01

    Full Text Available Despite intensive investigation for decades, the principle of higher-order organization of mitotic chromosomes is unclear. Here, I describe a novel model that emphasizes a critical role of interactions of homologous DNA repeats (repetitive elements; repetitive sequences) in mitotic chromosome architecture. According to the model, DNA repeats are assembled, via repeat interactions (pairing), into compact core structures that govern the arrangement of chromatins in mitotic chromosomes. Tandem repeat assemblies form a chromosomal axis to coordinate chromatins in the longitudinal dimension, while dispersed repeat assemblies form chromosomal nodes around the axis to organize chromatins in the halo. The chromosomal axis and nodes constitute a firm skeleton on which non-skeletal chromatins can be anchored, folded, and supercoiled.

  9. A model of DNA repeat-assembled mitotic chromosomal skeleton.

    Science.gov (United States)

    Tang, Shao-Jun

    2011-01-01

    Despite intensive investigation for decades, the principle of higher-order organization of mitotic chromosomes is unclear. Here, I describe a novel model that emphasizes a critical role of interactions of homologous DNA repeats (repetitive elements; repetitive sequences) in mitotic chromosome architecture. According to the model, DNA repeats are assembled, via repeat interactions (pairing), into compact core structures that govern the arrangement of chromatins in mitotic chromosomes. Tandem repeat assemblies form a chromosomal axis to coordinate chromatins in the longitudinal dimension, while dispersed repeat assemblies form chromosomal nodes around the axis to organize chromatins in the halo. The chromosomal axis and nodes constitute a firm skeleton on which non-skeletal chromatins can be anchored, folded, and supercoiled.

  10. One-way sequencing of multiple amplicons from tandem repetitive mitochondrial DNA control region.

    Science.gov (United States)

    Xu, Jiawu; Fonseca, Dina M

    2011-10-01

    Repetitive DNA sequences not only exist abundantly in eukaryotic nuclear genomes, but also occur as tandem repeats in many animal mitochondrial DNA (mtDNA) control regions. Due to concerted evolution, these repetitive sequences are highly similar or even identical within a genome. When long repetitive regions are the targets of amplification for the purpose of sequencing, multiple amplicons may result if one primer has to be located inside the repeats. Here, we show that, without separating these amplicons by gel purification or cloning, directly sequencing the mitochondrial repeats with the primer outside repetitive region is feasible and efficient. We exemplify it by sequencing the mtDNA control region of the mosquito Aedes albopictus, which harbors typical large tandem DNA repeats. This one-way sequencing strategy is optimal for population surveys.

  11. Local chromatin structure of heterochromatin regulates repeated DNA stability, nucleolus structure, and genome integrity

    Energy Technology Data Exchange (ETDEWEB)

    Peng, Jamy C. [Univ. of California, Berkeley, CA (United States)]

    2007-01-01

    Heterochromatin constitutes a significant portion of the genome in higher eukaryotes; approximately 30% in Drosophila and human. Heterochromatin contains a high repeat DNA content and a low density of protein-encoding genes. In contrast, euchromatin is composed mostly of unique sequences and contains the majority of single-copy genes. Genetic and cytological studies demonstrated that heterochromatin exhibits regulatory roles in chromosome organization, centromere function and telomere protection. As an epigenetically regulated structure, heterochromatin formation is not defined by any DNA sequence consensus. Heterochromatin is characterized by its association with nucleosomes containing methylated-lysine 9 of histone H3 (H3K9me), heterochromatin protein 1 (HP1) that binds H3K9me, and Su(var)3-9, which methylates H3K9 and binds HP1. Heterochromatin formation and functions are influenced by HP1, Su(var)3-9, and the RNA interference (RNAi) pathway. My thesis project investigates how heterochromatin formation and function impact nuclear architecture, repeated DNA organization, and genome stability in Drosophila melanogaster. H3K9me-based chromatin reduces extrachromosomal DNA formation; most likely by restricting the access of repair machineries to repeated DNAs. Reducing extrachromosomal ribosomal DNA stabilizes rDNA repeats and the nucleolus structure. H3K9me-based chromatin also inhibits DNA damage in heterochromatin. Cells with compromised heterochromatin structure, due to Su(var)3-9 or dcr-2 (a component of the RNAi pathway) mutations, display severe DNA damage in heterochromatin compared to wild type. In these mutant cells, accumulated DNA damage leads to chromosomal defects such as translocations, defective DNA repair response, and activation of the G2-M DNA repair and mitotic checkpoints that ensure cellular and animal viability. My thesis research suggests that DNA replication, repair, and recombination mechanisms in heterochromatin differ from those in

  12. DNA sequence analysis of newly formed telomeres in yeast.

    Science.gov (United States)

    Wang, S S; Pluta, A F; Zakian, V A

    1989-01-01

    A plasmid can be maintained in linear form in baker's yeast if it bears telomeric sequences at each end. Linear plasmids bearing cloned telomeric C4A4 repeats at one end (test end) and a natural DNA terminus with approximately 300 bps of C4A2 repeats at the other or control end were introduced by transformation into yeast. Test-end termini of 28 to 112 bps supported telomere formation. During telomere formation, C4A2 repeats were often transferred to test-end termini. To determine in greater detail the fate of test-end sequences on these plasmids after propagation in yeast, test-end telomeres were subcloned into E. coli and sequenced. DNA sequencing established a number of points about the molecular events involved in telomere formation in yeast. The results suggest that there are at least two mechanisms for telomere formation in yeast. One is mediated by a recombination event that requires neither a long stretch of homology nor the RAD52 gene product. The other mechanism is by addition of C1-3A repeats to the termini of linear DNA molecules. The telomeric sequence required to support C1-3A addition need not be at the very end of a molecule for telomere formation.

  13. Which Are More Random: Coding or Noncoding DNA Sequences?

    Institute of Scientific and Technical Information of China (English)

    WU Fang; ZHENG Wei-Mou

    2002-01-01

    Evidence seems to show that coding DNA is more random than noncoding DNA, but other conflicting evidence also exists. Based on the third-base degeneracy of codons, we regard the third position of codons as a 'noisy' position. By deleting one fixed position of non-overlapping triplets in a given sequence, three masked sequences may be deduced from the sequence. We have investigated the block-to-site mutual information functions of coding and noncoding sequences in yeast without and with the masking. Characteristics that distinguish coding from noncoding DNA have been found. It is observed that the strong correlations in the coding regions may be blocked by the third base of codons, and the proper masking can extract the correlations. Distribution of dimeric tandem repeats of unmasked sequences is also compared with that of masked sequences.
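
    The masking step described in this record, deleting one fixed position from every non-overlapping triplet to obtain three masked sequences, can be sketched as below. The function names are hypothetical, and dropping an incomplete trailing triplet is an assumption the abstract does not address.

```python
def mask_triplet_position(seq, pos):
    """Delete position `pos` (0, 1 or 2) from every non-overlapping triplet of `seq`."""
    if pos not in (0, 1, 2):
        raise ValueError("pos must be 0, 1 or 2")
    # Assumption: an incomplete trailing triplet is ignored.
    usable = len(seq) - len(seq) % 3
    kept = []
    for i in range(0, usable, 3):
        triplet = seq[i:i + 3]
        kept.append(triplet[:pos] + triplet[pos + 1:])
    return "".join(kept)

def masked_sequences(seq):
    """Return the three masked sequences, one per deleted triplet position."""
    return [mask_triplet_position(seq, p) for p in range(3)]
```

    For an in-frame coding sequence, `mask_triplet_position(seq, 2)` removes the third codon position, the one the record treats as 'noisy'.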

  14. Molecular cloning and expression of a novel human cDNA containing CAG repeats.

    Science.gov (United States)

    Takeuchi, T; Chen, B K; Qiu, Y; Sonobe, H; Ohtsuki, Y

    1997-12-19

    A novel human cDNA containing CAG repeats, designated B120, was cloned by PCR amplification. An approximately 300-bp 3' untranslated region in this cDNA was followed by a 3426-bp coding region containing the CAG repeats. A computer search failed to find any significant homology between this cDNA and previously reported genes. The number of CAG trinucleotide repeats appeared to vary from seven to 12 in analyses of genomic DNA from healthy volunteers. An approximately 8-kb band was detected in brain, skeletal muscle and thymus by Northern blot analysis. The deduced amino-acid sequence had a polyglutamine chain encoded by CAG repeats as well as glutamine- and tyrosine-rich repeats, which has also been reported for several RNA binding proteins. We immunized mice with recombinant gene product and established a monoclonal antibody to it. On Western immunoblotting, this antibody detected an approximately 120-kDa protein in human brain tissue. In addition, immunohistochemical staining showed that the cytoplasm of neural cells was stained with this antibody. These findings indicated that B120 is a novel cDNA with a CAG repeat length polymorphism and that its gene product is a cytoplasmic protein with a molecular mass of 120 kDa.

  15. Sequences sufficient for programming imprinted germline DNA methylation defined.

    Directory of Open Access Journals (Sweden)

    Yoon Jung Park

    Full Text Available Epigenetic marks are fundamental to normal development, but little is known about signals that dictate their placement. Insights have been provided by studies of imprinted loci in mammals, where monoallelic expression is epigenetically controlled. Imprinted expression is regulated by DNA methylation programmed during gametogenesis in a sex-specific manner and maintained after fertilization. At Rasgrf1 in mouse, paternal-specific DNA methylation on a differential methylation domain (DMD) requires downstream tandem repeats. The DMD and repeats constitute a binary switch regulating paternal-specific expression. Here, we define sequences sufficient for imprinted methylation using two transgenic mouse lines: one carries the entire Rasgrf1 cluster (RC); the second carries only the DMD and repeats (DR) from Rasgrf1. The RC transgene recapitulated all aspects of imprinting seen at the endogenous locus. DR underwent proper DNA methylation establishment in sperm and erasure in oocytes, indicating the DMD and repeats are sufficient to program imprinted DNA methylation in germlines. Both transgenes produce a DMD-spanning pit-RNA, previously shown to be necessary for imprinted DNA methylation at the endogenous locus. We show that when pit-RNA expression is controlled by the repeats, it regulates DNA methylation in cis only and not in trans. Interestingly, pedigree history dictated whether established DR methylation patterns were maintained after fertilization. When DR was paternally transmitted followed by maternal transmission, the unmethylated state that was properly established in the female germlines could not be maintained. This provides a model for transgenerational epigenetic inheritance in mice.

  16. Simple Sequence Repeat Polymorphisms (SSRPs) for Evaluation of Molecular Diversity and Germplasm Classification of Minor Crops

    Directory of Open Access Journals (Sweden)

    Nam-Soo Kim

    2009-11-01

    Full Text Available Evaluation of the genetic diversity among populations is an essential prerequisite for the preservation of endangered species. Thousands of new accessions are introduced into germplasm institutes each year, thereby necessitating assessment of their molecular diversity before elimination of the redundant genotypes. Of the protocols that facilitate the assessment of molecular diversity, SSRPs (simple sequence repeat polymorphisms), or microsatellite variation, is the preferred system since it detects a large number of DNA polymorphisms with relatively little technical complexity. The paucity of information on DNA sequences has limited their widespread utilization in the assessment of genetic diversity of minor or neglected crop species. However, recent advancements in DNA sequencing and PCR technologies in conjunction with sophisticated computer software have facilitated the development of SSRP markers in minor crops. This review examines the development and molecular nature of SSR markers, and their utilization in many aspects of plant genetics and ecology.

  17. Autoantigenic proteins that bind recombinogenic sequences in Epstein-Barr virus and cellular DNA.

    OpenAIRE

    1994-01-01

    We have identified conserved autoantigenic cellular proteins that bind to G-rich sequence motifs in recombinogenic regions of Epstein-Barr virus (EBV) DNA. This binding activity, called TRBP, recognizes the EBV terminal repeats, a locus responsible for interconversion of linear and circular EBV DNA. We found that TRBP also binds to EBV DNA sequences involved in deletion of EBNA2, a gene product required for immortalization. We show that TRBP binds sequences present in repetitive cellular DNA,...

  18. [DNA sequencing technology and automatization of it].

    Science.gov (United States)

    Kraev, A S

    1991-01-01

    Precise manipulations of genetic material, typical of modern experiments in molecular biology and in new biotechnology, require the capability to determine DNA base sequence. This capability now makes it possible to exploit specific genetic knowledge for the dissection of complex cell processes and for modulation of cell metabolism in transgenic organisms. The review focuses on those DNA sequencing technologies that are widespread in general laboratory practice. With the availability of commercial reagents, they can safely be called industrial techniques. Modern DNA sequencing requires recurrent breakdown of large genomic DNA into smaller pieces, which are then amplified and sequenced, and the initial long stretch is reconstructed via overlap of the small pieces. The DNA sequencing process has several steps: a DNA fragment is obtained in sufficient quantity and purity, it is converted to a form suitable for a particular sequencing method, a sequencing reaction is performed and its products fractionated; and finally the resultant data are interpreted (i.e. an autoradiograph is read into a computer memory) and a long sequence is reconstructed via overlap of short stretches. These steps are considered in separate parts, with emphasis on sequencing strategies with respect to their biological task. In the last part, possibilities for automation of the sequencing experiment are considered, followed by a discussion of domestic problems in DNA sequencing.

  19. Simple sequence repeats in watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai).

    Science.gov (United States)

    Jarret, R L; Merrick, L C; Holms, T; Evans, J; Aradhya, M K

    1997-08-01

    Simple sequence repeat length polymorphisms were utilized to examine genetic relatedness among accessions of watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai). A size-fractionated TaqI genomic library was screened for the occurrence of dimer and trimer simple sequence repeats (SSRs). A total of 96 (0.53%) SSR-bearing clones were identified and the inserts from 50 of these were sequenced. The dinucleotide repeats (CT)n and (GA)n accounted for 82% of the SSRs sequenced. PCR primer pairs flanking seven SSR loci were used to amplify SSRs from 32 morphologically variable watermelon genotypes from Africa, Europe, Asia, and Mexico and a single accession of Citrullus colocynthis from Chad. Cluster analysis of SSR length polymorphisms delineated 4 groups at the 25% level of genetic similarity. The largest group contained C. lanatus var. lanatus accessions. The second largest group contained only wild and cultivated "citron"-type or C. lanatus var. citroides accessions. The third group contained an accession tentatively identified as C. lanatus var. lanatus but which perhaps is a hybrid between C. lanatus var. lanatus and C. lanatus var. citroides. The fourth group consisted of a single accession identified as C. colocynthis. "Egusi"-type watermelons from Nigeria grouped with C. lanatus var. lanatus. The use of SSRs for watermelon germplasm characterization and genetic diversity studies is discussed.

  20. Fibonacci Sequence and Supramolecular Structure of DNA.

    Science.gov (United States)

    Shabalkin, I P; Grigor'eva, E Yu; Gudkova, M V; Shabalkin, P I

    2016-05-01

    We proposed a new model of supramolecular DNA structure. Similar to the previously developed by us model of primary DNA structure [11-15], 3D structure of DNA molecule is assembled in accordance to a mathematic rule known as Fibonacci sequence. Unlike primary DNA structure, supramolecular 3D structure is assembled from complex moieties including a regular tetrahedron and a regular octahedron consisting of monomers, elements of the primary DNA structure. The moieties of the supramolecular DNA structure forming fragments of regular spatial lattice are bound via linker (joint) sequences of the DNA chain. The lattice perceives and transmits information signals over a considerable distance without acoustic aberrations. Linker sequences expand conformational space between lattice segments allowing their sliding relative to each other under the action of external forces. In this case, sliding is provided by stretching of the stacked linker sequences.

  1. Evolutionary conservation of sequence and secondary structures in CRISPR repeats

    Energy Technology Data Exchange (ETDEWEB)

    Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

    2006-09-01

    Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in ~40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.

  2. Applications of recursive segmentation to the analysis of DNA sequences.

    Science.gov (United States)

    Li, Wentian; Bernaola-Galván, Pedro; Haghighi, Fatameh; Grosse, Ivo

    2002-07-01

    Recursive segmentation is a procedure that partitions a DNA sequence into domains with a homogeneous composition of the four nucleotides A, C, G and T. This procedure can also be applied to any sequence converted from a DNA sequence, such as to a binary strong(G + C)/weak(A + T) sequence, to a binary sequence indicating the presence or absence of the dinucleotide CpG, or to a sequence indicating both the base and the codon position information. We apply various conversion schemes in order to address the following five DNA sequence analysis problems: isochore mapping, CpG island detection, locating the origin and terminus of replication in bacterial genomes, finding complex repeats in telomere sequences, and delineating coding and noncoding regions. We find that the recursive segmentation procedure can successfully detect isochore borders, CpG islands, and the origin and terminus of replication, but it needs improvement for detecting complex repeats as well as borders between coding and noncoding regions.
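
    The idea of recursive segmentation can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: compositional difference between the two sides of a cut is measured here by the Jensen-Shannon divergence, and the fixed `threshold` and `min_len` values are hypothetical stand-ins for a stopping criterion that the abstract does not specify.

```python
import math
from collections import Counter

def entropy(counts, n):
    """Shannon entropy (bits) of symbol counts summing to n."""
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c)

def best_split(seq, min_len):
    """Return (cut, divergence) maximizing the Jensen-Shannon divergence, or (None, 0.0)."""
    n = len(seq)
    total = Counter(seq)
    h_all = entropy(total, n)
    left = Counter()
    best_i, best_d = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        if i < min_len or n - i < min_len:
            continue  # keep both sides at least min_len long
        right = total - left
        d = h_all - (i / n) * entropy(left, i) - ((n - i) / n) * entropy(right, n - i)
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d

def segment(seq, threshold=0.3, min_len=4, offset=0):
    """Recursively split seq; return a list of (start, end) domains."""
    cut, d = best_split(seq, min_len)
    if cut is None or d < threshold:
        return [(offset, offset + len(seq))]
    return (segment(seq[:cut], threshold, min_len, offset)
            + segment(seq[cut:], threshold, min_len, offset + cut))

domains = segment("A" * 20 + "G" * 20)
```

    The same functions work on any converted symbol sequence mentioned in the record, for example a binary strong/weak string, because only symbol counts are used.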

  3. Sequences spanning the leader-repeat junction mediate CRISPR adaptation to phage in Streptococcus thermophilus.

    Science.gov (United States)

    Wei, Yunzhou; Chesne, Megan T; Terns, Rebecca M; Terns, Michael P

    2015-02-18

    CRISPR-Cas systems are RNA-based immune systems that protect prokaryotes from invaders such as phages and plasmids. In adaptation, the initial phase of the immune response, short foreign DNA fragments are captured and integrated into host CRISPR loci to provide heritable defense against encountered foreign nucleic acids. Each CRISPR contains a ∼100-500 bp leader element that typically includes a transcription promoter, followed by an array of captured ∼35 bp sequences (spacers) sandwiched between copies of an identical ∼35 bp direct repeat sequence. New spacers are added immediately downstream of the leader. Here, we have analyzed adaptation to phage infection in Streptococcus thermophilus at the CRISPR1 locus to identify cis-acting elements essential for the process. We show that the leader and a single repeat of the CRISPR locus are sufficient for adaptation in this system. Moreover, we identified a leader sequence element capable of stimulating adaptation at a dormant repeat. We found that sequences within 10 bp of the site of integration, in both the leader and repeat of the CRISPR, are required for the process. Our results indicate that information at the CRISPR leader-repeat junction is critical for adaptation in this Type II-A system and likely other CRISPR-Cas systems.

  4. In silico analysis of Simple Sequence Repeats from chloroplast genomes of Solanaceae species

    Directory of Open Access Journals (Sweden)

    Evandro Vagner Tambarussi

    2009-01-01

    Full Text Available The availability of chloroplast genome (cpDNA) sequences of Atropa belladonna, Nicotiana sylvestris, N. tabacum, N. tomentosiformis, Solanum bulbocastanum, S. lycopersicum and S. tuberosum, which are Solanaceae species, allowed us to analyze the organization of cpSSRs in their genic and intergenic regions. In general, the number of cpSSRs in cpDNA ranged from 161 in S. tuberosum to 226 in N. tabacum, and the number of intergenic cpSSRs was higher than that of genic cpSSRs. Mononucleotide repeats were the most frequent in the studied species, but we also identified di-, tri-, tetra-, penta- and hexanucleotide repeats. Multiple alignments of all cpSSR sequences from Solanaceae species made the identification of nucleotide variability possible, and the phylogeny was estimated by maximum parsimony. Our study showed that the plastome database can be exploited for phylogenetic analysis and biotechnological approaches.

  5. Species-genomic relationships among the tribasic diploid and polyploid Carthamus taxa based on physical mapping of active and inactive 18S-5.8S-26S and 5S ribosomal RNA gene families, and the two tandemly repeated DNA sequences.

    Science.gov (United States)

    Agrawal, Renuka; Tsujimoto, Hisashi; Tandon, Rajesh; Rao, Satyawada Rama; Raina, Soom Nath

    2013-05-25

    In the genus Carthamus (2n=20, 22, 24, 44, 64; x=10, 11, 12), most of the homologues within and between the chromosome complements are difficult to identify. In the present work, we used fluorescent in situ hybridisation (FISH) to determine the chromosome distribution of the two rRNA gene families, and the two isolated repeated DNA sequences in the 14 Carthamus taxa. The distinctive variability in the distribution, number and signal intensity of hybridisation sites for 18S-26S and 5S rDNA loci could generally distinguish the 14 Carthamus taxa. Active 18S-26S rDNA sites were generally associated with NOR loci on the nucleolar chromosomes. The two A genome taxa, C. glaucus ssp. anatolicus and C. boissieri with 2n=20, and the two botanical varieties of B genome C. tinctorius (2n=24) had diagnostic FISH patterns. The present results support the origin of C. tinctorius from C. palaestinus. FISH patterns of C. arborescens vis-à-vis the other taxa indicate a clear division of Carthamus taxa into two distinct lineages. Comparative distribution and intensity pattern of 18S-26S rDNA sites could distinguish each of the tetraploid and hexaploid taxa. The present results indicate that C. boissieri (2n=20) is one of the genome donors for C. lanatus and C. lanatus ssp. lanatus (2n=44), and C. lanatus is one of the progenitors for the hexaploid (2n=64) taxa. The association of pCtKpnI-2 repeated sequence with rRNA gene cluster (orphon) in 2-10 nucleolar and non-nucleolar chromosomes and the consistent occurrence of pCtKpnI-1 repeated sequence at the subtelomeric region in all the taxa analysed indicate some functional role of these sequences.

  6. Genome-wide characterization of simple sequence repeats in cucumber (Cucumis sativus L.)

    Directory of Open Access Journals (Sweden)

    Simon Philipp W

    2010-10-01

    Background: Cucumber, Cucumis sativus L., is an important vegetable crop worldwide. Until very recently, cucumber genetic and genomic resources, especially molecular markers, have been very limited, impeding progress of cucumber breeding efforts. Microsatellites are short tandemly repeated DNA sequences, which are frequently favored as genetic markers due to their high level of polymorphism and codominant inheritance. Data from previously characterized genomes have shown that these repeats vary in frequency, motif sequence, and genomic location across taxa. During the last year, the genomes of two cucumber genotypes were sequenced, including the Chinese fresh market type inbred line '9930' and the North American pickling type inbred line 'Gy14'. These sequences provide a powerful tool for developing markers on a large scale. In this study, we surveyed and characterized the distribution and frequency of perfect microsatellites in 203 Mbp assembled Gy14 DNA sequences, representing 55% of its nuclear genome, and in cucumber EST sequences. Similar analyses were performed in genomic and EST data from seven other plant species, and the results were compared with those of cucumber. Results: A total of 112,073 perfect repeats were detected in the Gy14 cucumber genome sequence, accounting for 0.9% of the assembled Gy14 genome, with an overall density of 551.9 SSRs/Mbp. While tetranucleotides were the most frequent microsatellites in genomic DNA sequence, dinucleotide repeats, which had more repeat units than any other SSR type, had the highest cumulative sequence length. Coding regions (ESTs) of the cucumber genome had fewer microsatellites compared to its genomic sequence, with trinucleotides predominating in EST sequences. AAG was the most frequent repeat in cucumber ESTs. Overall, AT-rich motifs prevailed in both genomic and EST data. Compared to the other species examined, cucumber genomic sequence had the highest density of SSRs (although
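
    The survey of perfect microsatellites described in this record can be illustrated with a minimal Python sketch. It is not the tool used in the study: the function names and the minimum repeat counts in MIN_REPEATS are hypothetical choices made only for illustration, and the sketch scans a single strand with a regular-expression backreference.

```python
import re

# Hypothetical thresholds: minimum number of tandem copies per motif length.
# These values are illustrative and are not taken from the study.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_compound(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. "AGAG")."""
    n = len(motif)
    return any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_compound(motif):
                continue  # already reported under its shorter unit, if long enough
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


def ssr_density_per_mbp(n_ssrs, n_bases):
    """SSR density expressed as loci per million base pairs."""
    return n_ssrs / (n_bases / 1e6)
```

    Applied to the rounded figures quoted in the abstract (112,073 repeats in about 203 Mbp), ssr_density_per_mbp gives roughly 552 SSRs/Mbp, close to the reported 551.9; the small gap presumably comes from rounding of the assembly size.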

  7. Mitochondrial DNA sequence evolution in shorebird populations.

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons why mtDNA is the molecule of

  8. Complete DNA sequence of the linear mitochondrial genome of the pathogenic yeast Candida parapsilosis

    DEFF Research Database (Denmark)

    Nosek, J.; Novotna, M.; Hlavatovicova, Z.

    2004-01-01

    The complete sequence of the mitochondrial DNA of the opportunistic yeast pathogen Candida parapsilosis was determined. The mitochondrial genome is represented by linear DNA molecules terminating with tandem repeats of a 738-bp unit. The number of repeats varies, thus generating a population...

  10. Repeat-based Sequence Typing of Carnobacterium maltaromaticum.

    Science.gov (United States)

    Rahman, Abdur; El Kheir, Sara M; Back, Alexandre; Mangavel, Cécile; Revol-Junelles, Anne-Marie; Borges, Frédéric

    2016-06-01

    Carnobacterium maltaromaticum is a Lactic Acid Bacterium (LAB) of technological interest for the food industry, especially the dairy sector, where it serves as bioprotection and ripening flora. The industrial use of this LAB requires accurate, high-resolution typing tools. A new typing method for C. maltaromaticum, inspired by MLVA and called Repeat-based Sequence Typing (RST), is described. Rather than electrophoresis analysis, our RST method is based on sequence analysis of multiple loci containing Variable-Number Tandem-Repeats (VNTRs). The method described here for C. maltaromaticum relies on the analysis of three VNTR loci, and was applied to a collection of 24 strains. For each strain, a PCR product corresponding to the amplification of each VNTR locus was sequenced. Sequence analysis allowed delineating 11, 11, and 12 alleles for loci VNTR-A, VNTR-B, and VNTR-C, respectively. Considering the allele combination exhibited by each strain allowed defining 15 genotypes, yielding a discriminatory index of 0.94. Comparison with MLST revealed that both methods were complementary for strain typing in C. maltaromaticum.
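
    The abstract does not say how the discriminatory index was computed. A common choice for typing methods is the Hunter-Gaston (Simpson-type) index, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)), where n_j is the number of strains of genotype j and N is the total number of strains. The sketch below assumes that definition; the function name and the genotype labels are hypothetical.

```python
from collections import Counter


def discriminatory_index(genotypes):
    """Hunter-Gaston index: probability that two strains drawn at random
    (without replacement) belong to different genotypes."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(genotypes)
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_pairs / (n * (n - 1))


# Purely hypothetical distribution of 24 strains over 15 genotypes;
# the real per-genotype counts are not given in the abstract.
example = ["g1"] * 5 + ["g2"] * 3 + ["g3"] * 2 + ["g4"] * 2 + ["g5"] * 2 \
    + ["g%d" % i for i in range(6, 16)]
```

    With this made-up distribution the index rounds to 0.94, which only shows that a value of that size is plausible for 15 genotypes among 24 strains; it does not reconstruct the study's data.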

  11. A blackberry (Rubus L.) expressed sequence tag library for the development of simple sequence repeat markers

    Science.gov (United States)

    A blackberry (Rubus L.) expressed sequence tag (EST) library was produced for developing simple sequence repeat (SSR) markers from the tetraploid blackberry cultivar, Merton Thornless, the source of the thornless trait in commercial cultivars. RNA was extracted from young expanding leaves and used f...

  12. Mining and validation of pyrosequenced simple sequence repeats (SSRs) from American cranberry (Vaccinium macrocarpon Ait.).

    Science.gov (United States)

    Zhu, H; Senalik, D; McCown, B H; Zeldin, E L; Speers, J; Hyman, J; Bassil, N; Hummer, K; Simon, P W; Zalapa, J E

    2012-01-01

    The American cranberry (Vaccinium macrocarpon Ait.) is a major commercial fruit crop in North America, but limited genetic resources have been developed for the species. Furthermore, the paucity of codominant DNA markers has hampered the advance of genetic research in cranberry and the Ericaceae family in general. Therefore, we used Roche 454 sequencing technology to perform low-coverage whole genome shotgun sequencing of the cranberry cultivar 'HyRed'. After de novo assembly, the obtained sequence covered 266.3 Mb of the estimated 540-590 Mb in the cranberry genome. A total of 107,244 SSR loci were detected with an overall density across the genome of 403 SSR/Mb. The AG repeat was the most frequent motif in cranberry, accounting for 35% of all SSRs, and together with AAG and AAAT accounted for 46% of all loci discovered. To validate the SSR loci, we designed 96 primer pairs using contig sequence data containing perfect SSR repeats, and studied the genetic diversity of 25 cranberry genotypes. We identified 48 polymorphic SSR loci with 2-15 alleles per locus for a total of 323 alleles in the 25 cranberry genotypes. Genetic clustering by principal coordinates and genetic structure analyses confirmed the heterogeneous nature of cranberries. The parentage composition of several hybrid cultivars was evident from the structure analyses. Whole genome shotgun 454 sequencing was a cost-effective and efficient way to identify numerous SSR repeats in the cranberry sequence for marker development.

  13. Steganalytic method based on short and repeated sequence distance statistics

    Institute of Scientific and Technical Information of China (English)

    WANG GuoXin; PING XiJian; XU ManKun; ZHANG Tao; BAO XiRui

    2008-01-01

    According to the distribution characteristics of short and repeated sequences (SRS), a steganalytic method based on the correlation of image bit planes is proposed. First, we introduce the concept of SRS distance statistics and derive its statistical distribution. Because the SRS distance statistics effectively reflect the correlation of the sequence, SRS has statistical features when the image bit plane sequence equals the image width. Using this characteristic, the steganalytic method is implemented by testing against the Poisson distribution. Experimental results show good performance in detecting LSB matching steganography in still images. Moreover, the proposed method is not designed for specific steganographic algorithms and has good generality.

  14. Automated discovery of single nucleotide polymorphism and simple sequence repeat molecular genetic markers.

    Science.gov (United States)

    Batley, Jacqueline; Jewell, Erica; Edwards, David

    2007-01-01

    Molecular genetic markers represent one of the most powerful tools for the analysis of genomes. Molecular marker technology has developed rapidly over the last decade, and two forms of sequence-based markers, simple sequence repeats (SSRs), also known as microsatellites, and single nucleotide polymorphisms (SNPs), now predominate applications in modern genetic analysis. The availability of large sequence data sets permits mining for SSRs and SNPs, which may then be applied to genetic trait mapping and marker-assisted selection. Here, we describe Web-based automated methods for the discovery of these SSRs and SNPs from sequence data. SSRPrimer enables the real-time discovery of SSRs within submitted DNA sequences, with the concomitant design of PCR primers for SSR amplification. Alternatively, users may browse the SSR Taxonomy Tree to identify predetermined SSR amplification primers for any species represented within the GenBank database. SNPServer uses a redundancy-based approach to identify SNPs within DNA sequence data. Following submission of a sequence of interest, SNPServer uses BLAST to identify similar sequences, CAP3 to cluster and assemble these sequences, and then the SNP discovery software autoSNP to detect SNPs and insertion/deletion (indel) polymorphisms.

  15. Structural basis for sequence-specific recognition of DNA by TAL effectors

    KAUST Repository

    Deng, Dong

    2012-01-05

    TAL (transcription activator-like) effectors, secreted by phytopathogenic bacteria, recognize host DNA sequences through a central domain of tandem repeats. Each repeat comprises 33 to 35 conserved amino acids and targets a specific base pair by using two hypervariable residues [known as repeat variable diresidues (RVDs)] at positions 12 and 13. Here, we report the crystal structures of an 11.5-repeat TAL effector in both DNA-free and DNA-bound states. Each TAL repeat comprises two helices connected by a short RVD-containing loop. The 11.5 repeats form a right-handed, superhelical structure that tracks along the sense strand of DNA duplex, with RVDs contacting the major groove. The 12th residue stabilizes the RVD loop, whereas the 13th residue makes a base-specific contact. Understanding DNA recognition by TAL effectors may facilitate rational design of DNA-binding proteins with biotechnological applications.
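
    The one-repeat-to-one-base reading described in this record can be sketched in a few lines of Python. The RVD-to-base table below is the commonly cited preference code (NI for A, HD for C, NG for T, NN for G, although NN is also reported to accept A); it is not stated in the abstract and is included here as an assumption. The function names and the placeholder repeat string are hypothetical.

```python
# Commonly cited RVD preferences; an assumption, not taken from the abstract.
RVD_CODE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def rvd_of(repeat):
    """Return the two hypervariable residues at positions 12 and 13 (1-based)."""
    if len(repeat) < 13:
        raise ValueError("repeat is too short to contain positions 12 and 13")
    return repeat[11:13]


def predict_target(rvds):
    """Translate an ordered list of RVDs into the predicted DNA target."""
    return "".join(RVD_CODE[rvd] for rvd in rvds)


# Placeholder 34-residue repeat: "X" marks residues that do not matter here.
placeholder_repeat = "X" * 11 + "HD" + "X" * 21
```

    For example, predict_target(["NI", "HD", "NG", "NN"]) returns "ACTG". Real specificity is context dependent, so a lookup table of this kind is only a first approximation.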

  16. Long range correlations in DNA sequences

    CERN Document Server

    Mohanty, A K

    2002-01-01

    The so-called long range correlation properties of DNA sequences are studied using the variance analyses of the density distribution of a single or a group of nucleotides in a model independent way. This new method, which was suggested earlier, has been applied to extract slope parameters that characterize the correlation properties for several intron-containing and intronless DNA sequences. An important aspect of all the DNA sequences is the property of complementarity, by virtue of which any two complementary distributions (like GA is complementary to TC, or G is complementary to ATC) have identical fluctuations at all scales although their distribution functions need not be identical. Due to this complementarity, the famous DNA walk representation, whose statistical interpretation is still unresolved, is shown to be a special case of the present formalism with a density distribution corresponding to a purine or a pyrimidine group. Another interesting aspect of most of the DNA sequences is that the factorial m...
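
    The statement that complementary nucleotide groups fluctuate identically follows from simple counting: in a window of w bases, the count of one group is w minus the count of its complement, so the two counts have the same variance. The sketch below checks this numerically. It uses non-overlapping windows and a population variance, which are assumptions about the analysis rather than the paper's exact procedure, and all names are hypothetical.

```python
import random
from statistics import pvariance


def window_counts(seq, group, window):
    """Count bases belonging to `group` in consecutive non-overlapping windows."""
    return [
        sum(base in group for base in seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


def fluctuation(seq, group, window):
    """Variance of the windowed counts for one nucleotide group."""
    return pvariance(window_counts(seq, group, window))


def random_sequence(length, seed=0):
    """Uniform random test sequence; real DNA would be substituted here."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))
```

    Comparing fluctuation(seq, "GA", w) with fluctuation(seq, "TC", w), or fluctuation(seq, "G", w) with fluctuation(seq, "ATC", w), gives equal values for every window size w, even though the mean counts differ.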

  17. Dynamics and Control of DNA Sequence Amplification

    CERN Document Server

    Marimuthu, Karthikeyan

    2014-01-01

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction (PCR) as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal tempe...
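
    As a point of reference for the term "geometric amplification", the idealized textbook picture is that each cycle multiplies the copy number by (1 + efficiency), where the efficiency lies between 0 and 1. The sketch below implements only that bookkeeping; it is not the biophysical or control-theoretic model of the paper, and the function name and efficiency values are hypothetical.

```python
def amplify(initial_copies, efficiencies):
    """Copy number after a series of cycles with the given per-cycle efficiencies.

    An efficiency of 1.0 means perfect doubling in that cycle; 0.0 means no gain.
    """
    copies = float(initial_copies)
    for eta in efficiencies:
        if not 0.0 <= eta <= 1.0:
            raise ValueError("per-cycle efficiency must lie in [0, 1]")
        copies *= 1.0 + eta
    return copies
```

    Ten perfect cycles starting from one template give amplify(1, [1.0] * 10) == 1024.0. In this simple picture, anything that lowers the per-cycle efficiency compounds over the run, which is why the choice of cycling conditions matters.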

  18. Noninvasive prenatal paternity testing (NIPAT) through maternal plasma DNA sequencing

    DEFF Research Database (Denmark)

    Jiang, Haojun; Xie, Yifan; Li, Xuchao

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have been already used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we...... developed a noninvasive prenatal paternity testing (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influence factors (minor allele frequency (MAF), the number of total SNP, fetal fraction and effective sequencing depth) and designed three different selective SNP panels...... paternity test using STR multiplex system. Our study here proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future....

  19. DNA display I. Sequence-encoded routing of DNA populations.

    Directory of Open Access Journals (Sweden)

    David R Halpin

    2004-07-01

    Recently reported technologies for DNA-directed organic synthesis and for DNA computing rely on routing DNA populations through complex networks. The reduction of these ideas to practice has been limited by a lack of practical experimental tools. Here we describe a modular design for DNA routing genes, and routing machinery made from oligonucleotides and commercially available chromatography resins. The routing machinery partitions nanomole quantities of DNA into physically distinct subpools based on sequence. Partitioning steps can be iterated indefinitely, with worst-case yields of 85% per step. These techniques facilitate DNA-programmed chemical synthesis, and thus enable a materials biology that could revolutionize drug discovery.
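
    The quoted worst-case yield of 85% per step compounds multiplicatively when partitioning is iterated. The short sketch below makes that arithmetic explicit, assuming every step loses material independently at the worst-case rate; the function names and the 50% recovery floor are hypothetical.

```python
def cumulative_yield(per_step_yield, steps):
    """Fraction of material remaining after `steps` consecutive partitioning steps."""
    return per_step_yield ** steps


def max_steps(per_step_yield, floor):
    """Largest number of steps that keeps the cumulative yield at or above `floor`."""
    if not 0.0 < per_step_yield < 1.0 or floor <= 0.0:
        raise ValueError("need 0 < per_step_yield < 1 and floor > 0")
    steps, remaining = 0, 1.0
    while remaining * per_step_yield >= floor:
        remaining *= per_step_yield
        steps += 1
    return steps
```

    At 85% per step, four steps retain about 52% of the starting material and a fifth drops it to about 44%, so max_steps(0.85, 0.5) is 4.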

  20. Sequence analysis of trinucleotide repeat microsatellites from an enrichment library of the equine genome.

    Science.gov (United States)

    Tozaki, T; Inoue, S; Mashima, S; Ohta, M; Miura, N; Tomita, M

    2000-04-01

    Microsatellites are useful tools for the construction of a linkage map and parentage testing of equines, but only a limited number of equine microsatellites have been elucidated. Thus, we constructed an equine genomic library enriched for DNA fragments containing (CAG)n repeats. The enrichment method includes hybridization-capture of repeat regions using biotin-conjugated oligonucleotides, nucleotide substrate-biased polymerase reaction with the oligonucleotides and subsequent PCR amplification, because these procedures are useful for the cloning of less abundant trinucleotide microsatellites. Microsatellites containing (CAG)n repeats were obtained at the ratio of one per 3-4 clones, indicating an enrichment of about 10^4-fold, resulting in less time consumption and less cost for cloning. In this study, 66 different microsatellites, (CAG)n repeats, were identified. The number of complete simple CAG repeats in our clones ranged from 4 to 33, with an average repeat length of 8.8 units. The microsatellites were useful as sequence-tagged site (STS) markers. In addition, some clones containing (CAG)n repeats showed homology to human (CAG)n-containing genes, which have been previously mapped. These results indicate that the clones might be a useful tool for chromosome comparison between equines and humans.

  1. DNA tandem repeat instability in the Escherichia coli chromosome is stimulated by mismatch repair at an adjacent CAG·CTG trinucleotide repeat

    Science.gov (United States)

    Blackwood, John K.; Okely, Ewa A.; Zahra, Rabaab; Eykelenboom, John K.; Leach, David R. F.

    2010-01-01

    Approximately half the human genome is composed of repetitive DNA sequences classified into microsatellites, minisatellites, tandem repeats, and dispersed repeats. These repetitive sequences have coevolved within the genome but little is known about their potential interactions. Trinucleotide repeats (TNRs) are a subclass of microsatellites that are implicated in human disease. Expansion of CAG·CTG TNRs is responsible for Huntington disease, myotonic dystrophy, and a number of spinocerebellar ataxias. In yeast DNA double-strand break (DSB) formation has been proposed to be associated with instability and chromosome fragility at these sites and replication fork reversal (RFR) to be involved either in promoting or in preventing instability. However, the molecular basis for chromosome fragility of repetitive DNA remains poorly understood. Here we show that a CAG·CTG TNR array stimulates instability at a 275-bp tandem repeat located 6.3 kb away on the Escherichia coli chromosome. Remarkably, this stimulation is independent of both DNA double-strand break repair (DSBR) and RFR but is dependent on a functional mismatch repair (MMR) system. Our results provide a demonstration, in a simple model system, that MMR at one type of repetitive DNA has the potential to influence the stability of another. Furthermore, the mechanism of this stimulation places a limit on the universality of DSBR or RFR models of instability and chromosome fragility at CAG·CTG TNR sequences. Instead, our data suggest that explanations of chromosome fragility should encompass the possibility of chromosome gaps formed during MMR. PMID:21149728

  2. Characterization of simple sequence repeats (SSRs) from Phlebotomus papatasi (Diptera: Psychodidae) expressed sequence tags (ESTs)

    Directory of Open Access Journals (Sweden)

    Hamarsheh Omar

    2011-09-01

    Background: Phlebotomus papatasi is a natural vector of Leishmania major, which causes cutaneous leishmaniasis in many countries. Simple sequence repeats (SSRs), or microsatellites, are common in eukaryotic genomes and are short, repeated nucleotide sequence elements arrayed in tandem and flanked by non-repetitive regions. The enrichment methods used previously for finding new microsatellite loci in sand flies remain laborious and time consuming; in silico mining, which includes retrieval and screening of microsatellites from large amounts of sequence data in sequence databases using microsatellite search tools, can yield many new candidate markers. Results: Simple sequence repeats (SSRs) were characterized in P. papatasi expressed sequence tags (ESTs) derived from a public database, the National Center for Biotechnology Information (NCBI). A total of 42,784 sequences were mined, and 1,499 SSRs were identified with a frequency of 3.5% and an average density of 15.55 kb per SSR. Dinucleotide motifs were the most common SSRs, accounting for 67%, followed by tri-, tetra-, and penta-nucleotide repeats, accounting for 31.1%, 1.5%, and 0.1%, respectively. The length of microsatellites varied from 5 to 16 repeats. The dinucleotide types AG and CT have the highest frequency. Dinucleotide SSR-ESTs are relatively biased toward an excess of (AX)n repeats and a low GC base content. Forty primer pairs were designed based on motif lengths for further experimental validation. Conclusion: The first large-scale survey of SSRs derived from P. papatasi is presented; dinucleotide SSRs identified are more frequent than other types. EST data mining is an effective strategy to identify functional microsatellites in P. papatasi.

  3. A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

    Directory of Open Access Journals (Sweden)

    Glass John I

    2010-07-01

    Background: Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results: Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the

  4. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A. (Oak Ridge National Lab., TN (United States)); Arlinghaus, H.F. (Atom Sciences, Inc., Oak Ridge, TN (United States))

    1993-01-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  5. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A. [Oak Ridge National Lab., TN (United States); Arlinghaus, H.F. [Atom Sciences, Inc., Oak Ridge, TN (United States)

    1993-06-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  6. EGNAS: an exhaustive DNA sequence design algorithm

    Directory of Open Access Journals (Sweden)

    Kick Alfred

    2012-06-01

    Background: The molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA) is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that generates sets containing a maximum number of sequences with defined properties. EGNAS (Exhaustive Generation of Nucleic Acid Sequences) offers the possibility of controlling both interstrand and intrastrand properties. The guanine-cytosine content can be adjusted. Sequences can be forced to start and end with guanine or cytosine. This option reduces the risk of “fraying” of DNA strands. It is possible to limit cross hybridizations of a defined length, and to adjust the uniqueness of sequences. Self-complementarity and hairpin structures of certain length can be avoided. Sequences and subsequences can optionally be forbidden. Furthermore, sequences can be designed to have minimum interactions with predefined strands and neighboring sequences. Results: The algorithm is realized in a C++ program. TAG sequences can be generated and combined with primers for single-base extension reactions, which were described for multiplexed genotyping of single nucleotide polymorphisms. Thereby, possible foldback through intrastrand interaction of TAG-primer pairs can be limited. The design of sequences for specific attachment of molecular constructs to DNA origami is presented. Conclusions: We developed a new software tool called EGNAS for the design of unique nucleic acid sequences. The presented exhaustive algorithm generates larger sets of sequences than previous software under equal constraints. EGNAS is freely available for noncommercial use at http://www.chm.tu-dresden.de/pc6/EGNAS.
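
    Several of the constraints listed in this record (GC content, G/C at both ends, limited shared subsequences, limited self-complementarity) can be illustrated with a small Python sketch. This is not the EGNAS algorithm, which is a C++ program whose internals are not described here: the sketch simply enumerates all sequences of a given length and accepts them greedily, so it does not guarantee a maximum-size set. All function and parameter names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def design(length, gc_min, gc_max, max_shared, gc_ends=True):
    """Greedily collect sequences that satisfy simple design constraints.

    No two accepted sequences (or a sequence and any reverse complement,
    including its own) share a substring longer than `max_shared` bases.
    """
    k = max_shared + 1      # shortest substring length that is forbidden to recur
    used = set()            # k-mers of accepted sequences and their reverse complements
    accepted = []
    for bases in product("ACGT", repeat=length):
        seq = "".join(bases)
        gc = sum(b in "GC" for b in seq) / length
        if not gc_min <= gc <= gc_max:
            continue
        if gc_ends and (seq[0] not in "GC" or seq[-1] not in "GC"):
            continue
        own = kmers(seq, k)
        own_rc = kmers(reverse_complement(seq), k)
        if own & own_rc:    # self-complementary stretch of length k
            continue
        if own & used:      # clashes with an accepted sequence or its complement
            continue
        accepted.append(seq)
        used |= own | own_rc
    return accepted
```

    A call such as design(6, 0.5, 0.5, 3) scans all 4,096 hexamers in well under a second. Because the scan is exhaustive, its cost grows as 4 to the power of the length, so a practical tool needs far more careful pruning than this sketch.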

  7. Tsukamurella tyrosinosolvens intravascular catheter infection identified using 16S ribosomal DNA sequencing.

    Science.gov (United States)

    Sheridan, Elizabeth A S; Warwick, Simon; Chan, Anthony; Dall'Antonia, Martino; Koliou, Maria; Sefton, Armine

    2003-03-01

    Cultures of blood from a hemodialysis line repeatedly yielded a gram-positive rod. The organism was identified as Tsukamurella tyrosinosolvens by 16S ribosomal DNA sequencing, and the patient was treated successfully by removal of the line.

  8. Directed PCR-free engineering of highly repetitive DNA sequences

    Directory of Open Access Journals (Sweden)

    Preissler Steffen

    2011-09-01

    Background: Highly repetitive nucleotide sequences are commonly found in nature, e.g. in telomeres, microsatellite DNA, polyadenine (poly(A)) tails of eukaryotic messenger RNA, as well as in several inherited human disorders linked to trinucleotide repeat expansions in the genome. Therefore, studying repetitive sequences is of biological, biotechnological and medical relevance. However, cloning of such repetitive DNA sequences is challenging because specific PCR-based amplification is hampered by the lack of unique primer binding sites, resulting in unspecific products. Results: For the PCR-free generation of repetitive DNA sequences we used antiparallel oligonucleotides flanked by restriction sites of Type IIS endonucleases. The arrangement of recognition sites allowed for stepwise and seamless elongation of repetitive sequences. This facilitated the assembly of repetitive DNA segments and open reading frames encoding polypeptides with periodic amino acid sequences of any desired length. By this strategy we cloned a series of polyglutamine encoding sequences as well as highly repetitive polyadenine tracts. Such repetitive sequences can be used for diverse biotechnological applications. As an example, the polyglutamine sequences were expressed as His6-SUMO fusion proteins in Escherichia coli cells to study their aggregation behavior in vitro. The His6-SUMO moiety enabled affinity purification of the polyglutamine proteins, increased their solubility, and allowed controlled induction of the aggregation process. We successfully purified the fusion proteins and provide an example for their applicability in filter retardation assays. Conclusion: Our seamless cloning strategy is PCR-free and allows the directed and efficient generation of highly repetitive DNA sequences of defined lengths by simple standard cloning procedures.

  9. Target genes of microsatellite sequences in head and neck squamous cell carcinoma: mononucleotide repeats are not detected.

    Science.gov (United States)

    Wang, Yimin; Liu, Xuejuan; Li, Yulin

    2012-09-10

    Microsatellite instability (MSI) is detected in a wide variety of tumors. It is thought that mismatch repair gene mutation or inactivation is the major cause of MSI. Microsatellite sequences are predominantly distributed in intergenic or intronic DNA. However, MSI is found in the exonic sequences of some genes, causing their inactivation. In this report, we searched GenBank for candidate genes containing potential MSI sequences in exonic regions. Twenty seven target genes were selected for MSI analysis. Instability was found in 70% of these genes (14/20) with head and neck squamous cell carcinoma (HNSCC). Interestingly, no instability was detected in mononucleotide repeats in genes or in intergenic sequences. We conclude that instability of mononucleotide repeats is a rare event in HNSCC. High MSI phenotype in young HNSCC patients is limited to noncoding regions only. MSI percentage in HNSCC tumor is closely related to the repeat type, repeat location and patient's age.

  10. Nonlinear Aspects of Coding and Noncoding DNA Sequences

    Science.gov (United States)

    Stanley, H. Eugene

    2001-03-01

    One of the most remarkable features of human DNA is that 97 percent is not coding for proteins. Studying this noncoding DNA is important both for practical reasons (to distinguish it from the coding DNA as the human genome is sequenced), and for scientific reasons (why is the noncoding DNA present at all, if it appears to have little if any purpose?). In this talk we discuss new methods of analyzing coding and noncoding DNA in parallel, with a view to uncovering different statistical properties of the two kinds of DNA. We also speculate on possible roles of noncoding DNA. The work reported here was carried out primarily by P. Bernaola-Galvan, S. V. Buldyrev, P. Carpena, N. Dokholyan, A. L. Goldberger, I. Grosse, S. Havlin, H. Herzel, J. L. Oliver, C.-K. Peng, M. Simons, H. E. Stanley, R. H. R. Stanley, and G. M. Viswanathan. [1] For a brief overview in language that physicists can understand, see H. E. Stanley, S. V. Buldyrev, A. L. Goldberger, S. Havlin, C.-K. Peng, and M. Simons, "Scaling Features of Noncoding DNA" [Proc. XII Max Born Symposium, Wroclaw], Physica A 273, 1-18 (1999). [2] I. Grosse, H. Herzel, S. V. Buldyrev, and H. E. Stanley, "Species Independence of Mutual Information in Coding and Noncoding DNA," Phys. Rev. E 61, 5624-5629 (2000). [3] P. Bernaola-Galvan, I. Grosse, P. Carpena, J. L. Oliver, and H. E. Stanley, "Identification of DNA Coding Regions Using an Entropic Segmentation Method," Phys. Rev. Lett. 84, 1342-1345 (2000). [4] N. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Distributions of Dimeric Tandem Repeats in Non-coding and Coding DNA Sequences," J. Theor. Biol. 202, 273-282 (2000). [5] R. H. R. Stanley, N. V. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Clumping of Identical Oligonucleotides in Coding and Noncoding DNA Sequences," J. Biomol. Structure and Design 17, 79-87 (1999). [6] N. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Distribution of Base Pair Repeats in Coding and Noncoding DNA

  11. Nanopore DNA sequencing using kinetic proofreading

    Science.gov (United States)

    Ling, Xinsheng

    We propose a method of DNA sequencing that combines nanopore electrical measurements with Southern's sequencing-by-hybridization. The key new ingredient, essential to both lowering cost and increasing precision, is an asymmetric nanopore sandwich device capable of measuring the DNA hybridization probe twice, with the two measurements separated by a designed waiting time. Incorrect probes, which appear only once in the nanopore ionic current traces, are discriminated from correct ones, which appear twice. This method of discrimination is similar to the principle of kinetic proofreading proposed by Hopfield and Ninio for gene transcription and translation. An error analysis of this nanopore kinetic proofreading (nKP) technique for DNA sequencing is carried out in comparison with the most precise 3' dideoxy termination method developed by Sanger.
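    The two-pass discrimination described in this abstract can be illustrated with a minimal sketch. It is not the authors' error analysis: the function names, the event representation (a list of probe identifiers seen in the current trace), and the assumption that the two measurements err independently are all hypothetical choices made for illustration.

```python
from collections import Counter


def proofread(events):
    """Keep only probes detected at least twice in the trace.

    `events` is a hypothetical list of probe identifiers, one entry per
    detection event in the ionic current trace.
    """
    counts = Counter(events)
    return {probe for probe, n in counts.items() if n >= 2}


def two_pass_error(p_single):
    """False-acceptance probability after two checks.

    Assumes (for illustration only) that an incorrect probe passes each
    check independently with probability `p_single`.
    """
    return p_single ** 2


accepted = proofread(["p1", "p2", "p1", "p3"])
print(accepted)             # only the probe seen twice is kept
print(two_pass_error(0.1))  # squared single-pass error under independence
```

    Under the independence assumption, a second check squares the single-pass error rate, which is the basic benefit that kinetic proofreading schemes aim for.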

  12. Extracting biological knowledge from DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    De La Vega, F.M. [CINVESTAV-IPN (Mexico); Thieffry, D. [Universite Libre de Bruxelles, Rhode-Saint-Genese (Belgium)]|[Universidad Nacional Autonoma de Mexico, Morelos (Mexico); Collado-Vides, J. [Universidad Nacional Autonoma de Mexico, Morelos (Mexico)

    1996-12-31

    This session describes the elucidation of information from DNA sequences and the challenges computational biologists face in summarizing and deciphering the human genome. Techniques discussed include methods from statistics, information theory, artificial intelligence and linguistics. 1 ref.

  13. An approach to sequence DNA without tagging

    Science.gov (United States)

    Niu, Sanjun; Saraf, Ravi F.

    2002-10-01

    Microarray technology is playing an increasingly important role in biology and medicine and its application to genomics for gene expression analysis has already reached the market with a variety of commercially available instruments. In these combinatorial analysis methods, known probe single-strand DNA (ssDNA) 'primers' are attached in clusters of typically 100 µm × 100 µm pixels. Each pixel of the array has a slightly different sequence. On exposure to 'unknown' target ssDNA, the pixels with the right complementary probe ssDNA sequence convert to double-stranded DNA (dsDNA) by a hybridization reaction. To transduce the conversion of the pixel to dsDNA, the target ssDNA is labelled with a photoluminescent tag during the polymerase chain reaction (PCR) amplification process. Due to the statistical distribution of the tags in the target ssDNA, it becomes significantly difficult to implement these methods as a diagnostic tool in a pathology laboratory. A method to sequence DNA without tagging the molecule is developed. The fabrication process is compatible with current microelectronics and (emerging) soft-material fabrication technologies, allowing the method to be integrated with micro-electromechanical systems (MEMS) and lab-on-a-chip devices. An estimated sensitivity of 10⁻¹² g on a 1 cm² device area is obtained.

  14. Random rapid amplification of cDNA ends (RRACE) allows for cloning of multiple novel human cDNA fragments containing (CAG)n repeats.

    Science.gov (United States)

    Carney, J P; McKnight, C; VanEpps, S; Kelley, M R

    1995-04-03

    We describe a new technique for isolating cDNA fragments in which (i) either a partial sequence of the cDNA is known or (ii) a repeat sequence is utilized. We have used this technique, termed random rapid amplification of cDNA ends (random RACE), to isolate a number of trinucleotide repeat (CAG)n-containing genes. Using the random RACE (RRACE) technique, we have isolated over a hundred (CAG)n-containing genes. The results of our initial analysis of ten clones indicate that three are identical to previously cloned (CAG)n-containing genes. Three of our clones matched with expressed sequence tags, one of which contained a CA repeat. The remaining four clones did not match with any sequence in GenBank. These results indicate that this approach provides a rapid and efficient method for isolating trinucleotide repeat-containing cDNA fragments. Finally, this technique may be used for purposes other than cloning repeat-containing cDNA fragments. If only a partial sequence of a gene is known, our system, described here, provides a rapid and efficient method for isolating a fragment of the gene of interest.

  15. gargammel: a sequence simulator for ancient DNA.

    Science.gov (United States)

    Renaud, Gabriel; Hanghøj, Kristian; Willerslev, Eske; Orlando, Ludovic

    2016-10-29

    Ancient DNA has emerged as a remarkable tool to infer the history of extinct species and past populations. However, many of its characteristics, such as extensive fragmentation, damage and contamination, can influence downstream analyses. To help investigators measure how these could impact their analyses in silico, we have developed gargammel, a package that simulates ancient DNA fragments given a set of known reference genomes. Our package simulates the entire molecular process from post-mortem DNA fragmentation and DNA damage to experimental sequencing errors, and reproduces the most common biases observed in ancient DNA datasets.

  16. Genotyping of simple sequence repeats--factors implicated in shadow band generation revisited.

    Science.gov (United States)

    Olejniczak, Marta; Krzyzosiak, Wlodzimierz J

    2006-10-01

    PCR amplification of microsatellite sequences generates, besides the main product corresponding to allele size, additional undesired products that are usually shorter by multiples of the repeated unit. These extra products, known as shadow bands or stutter products, may complicate genotyping. The mechanism by which these artifacts are formed is not well understood, and so no effective remedy has been found to cope with these spurious products. In this study, using DNA templates containing CAG/CTG repeats flanked by gene-specific sequences and universal priming sites, we analyzed the effects of many PCR variables on shadow band generation. The most important result was that at a decreased temperature of the denaturation step during PCR cycling the shadow bands were either not formed or were strongly suppressed. Several possible sources of this effect are discussed.

  17. Inconsistencies in Neanderthal genomic DNA sequences.

    Directory of Open Access Journals (Sweden)

    Jeffrey D Wall

    2007-10-01

    Two recently published papers describe nuclear DNA sequences that were obtained from the same Neanderthal fossil. Our reanalyses of the data from these studies show that they are not consistent with each other and point to serious problems with the data quality in one of the studies, possibly due to modern human DNA contaminants and/or a high rate of sequencing errors.

  18. CTCF regulates the local epigenetic state of ribosomal DNA repeats

    Directory of Open Access Journals (Sweden)

    van de Nobelen Suzanne

    2010-11-01

    Background CCCTC binding factor (CTCF) is a highly conserved zinc finger protein, which is involved in chromatin organization, local histone modifications, and RNA polymerase II-mediated gene transcription. CTCF may act by binding tightly to DNA and recruiting other proteins to mediate its various functions in the nucleus. To further explore the role of this essential factor, we used a mass spectrometry-based approach to screen for novel CTCF-interacting partners. Results Using biotinylated CTCF as bait, we identified upstream binding factor (UBF) and multiple other components of the RNA polymerase I complex as potential CTCF-interacting partners. Interestingly, CTCFL, the testis-specific paralog of CTCF, also binds UBF. The interaction between CTCF(L) and UBF is direct, and requires the zinc finger domain of CTCF(L) and the high mobility group (HMG)-box 1 and dimerization domain of UBF. Because UBF is involved in RNA polymerase I-mediated ribosomal (r)RNA transcription, we analyzed CTCF binding to the rDNA repeat. We found that CTCF bound to a site upstream of the rDNA spacer promoter and preferred non-methylated over methylated rDNA. DNA binding by CTCF in turn stimulated binding of UBF. Absence of CTCF in cultured cells resulted in decreased association of UBF with rDNA and in nucleolar fusion. Furthermore, lack of CTCF led to reduced binding of RNA polymerase I and variant histone H2A.Z near the rDNA spacer promoter, a loss of specific histone modifications, and diminished transcription of non-coding RNA from the spacer promoter. Conclusions UBF is the first common interaction partner of CTCF and CTCFL, suggesting a role for these proteins in chromatin organization of the rDNA repeats. We propose that CTCF affects RNA polymerase I-mediated events globally by controlling nucleolar number, and locally by regulating chromatin at the rDNA spacer promoter, similar to RNA polymerase II promoters. CTCF may load UBF onto rDNA, thereby forming

  19. Mitochondrial DNA sequence evolution in shorebird populations

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons

  20. Nanogrid rolling circle DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Church, George M.; Porreca, Gregory J.; Shendure, Jay; Rosenbaum, Abraham Meir

    2017-04-18

    The present invention relates to methods for sequencing a polynucleotide immobilized on an array having a plurality of specific regions each having a defined diameter size, including synthesizing a concatemer of a polynucleotide by rolling circle amplification, wherein the concatemer has a cross-sectional diameter greater than the diameter of a specific region, immobilizing the concatemer to the specific region to make an immobilized concatemer, and sequencing the immobilized concatemer.

  1. Application of inter simple sequence repeat (ISSR) markers to plant genetics.

    Science.gov (United States)

    Godwin, I D; Aitken, E A; Smith, L W

    1997-08-01

    Microsatellites or simple sequence repeats (SSRs) are ubiquitous in eukaryotic genomes. Single-locus SSR markers have been developed for a number of species, although there is a major bottleneck in developing SSR markers whereby flanking sequences must be known to design 5'-anchors for polymerase chain reaction (PCR) primers. Inter SSR (ISSR) fingerprinting was developed such that no sequence knowledge was required. Primers based on a repeat sequence, such as (CA)n, can be made with a degenerate 3'-anchor, such as (CA)8RG or (AGC)6TY. The resultant PCR reaction amplifies the sequence between two SSRs, yielding a multilocus marker system useful for fingerprinting, diversity analysis and genome mapping. PCR products are radiolabelled with 32P or 33P via end-labelling or PCR incorporation, and separated on a polyacrylamide sequencing gel prior to autoradiographic visualisation. A typical reaction yields 20-100 bands per lane depending on the species and primer. We have used ISSR fingerprinting in a number of plant species, and report here some results on two important tropical species, sorghum and banana. Previous investigators have demonstrated that ISSR analysis usually detects a higher level of polymorphism than that detected with restriction fragment length polymorphism (RFLP) or random amplified polymorphic DNA (RAPD) analyses. Our data indicate that this is not a result of greater polymorphism genetically, but rather technical reasons related to the detection methodology used for ISSR analysis.
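    The anchored primers named in this abstract, (CA)8RG and (AGC)6TY, use IUPAC degenerate codes (R = A or G, Y = C or T). A minimal sketch of how such a primer expands into its concrete sequences follows; the function name and its signature are illustrative assumptions, not taken from the paper.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes needed for the primers above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
}


def expand_primer(repeat_unit, copies, anchor):
    """Return every concrete primer for (repeat_unit)copies + degenerate anchor."""
    core = repeat_unit * copies
    choices = [IUPAC[base] for base in anchor]
    return [core + "".join(combo) for combo in product(*choices)]


for primer in expand_primer("CA", 8, "RG"):
    print(primer)
for primer in expand_primer("AGC", 6, "TY"):
    print(primer)
```

    Each of the two example primers contains a single degenerate position, so each expands to two concrete oligonucleotides.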

  2. Analysis and location of a rice BAC clone containing telomeric DNA sequences

    Institute of Scientific and Technical Information of China (English)

    翟文学; 陈浩; 颜辉煌; 严长杰; 王国梁; 朱立煌

    1999-01-01

    BAC2, a rice BAC clone containing (TTTAGGG)n homologous sequences, was analyzed by Southern hybridization and DNA sequencing of its subclones. The analysis showed that the BAC2 insert contains many tandemly repeated satellite DNA sequences, called TA352, as well as simple tandem repeats consisting of TTTAGGG or its variants. A 0.8 kb (TTTAGGG)n-containing fragment in BAC2 was mapped to the telomere regions of at least 5 pairs of rice chromosomes by fluorescence in situ hybridization (FISH). By RFLP analysis of low-copy sequences, the BAC2 clone was localized to one terminal region of chromosome 6. All the results strongly suggest that the telomeric DNA sequences of rice are TTTAGGG or its variants, and that the linked satellite DNA TA352 sequences belong to telomere-associated sequences.

  3. Sequencing intractable DNA to close microbial genomes.

    Science.gov (United States)

    Hurt, Richard A; Brown, Steven D; Podar, Mircea; Palumbo, Anthony V; Elias, Dwayne A

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  4. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  5. Nanopore-CMOS Interfaces for DNA Sequencing.

    Science.gov (United States)

    Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

    2016-08-06

    DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces.

  6. Osmylated DNA, a novel concept for sequencing DNA using nanopores

    Science.gov (United States)

    Kanavarioti, Anastassia

    2015-03-01

    Sanger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically comparable to each other, which leads to small differences in current obstruction. ‘Base calling’ becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable if there were only two bases, chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5-C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base and creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1) and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV-vis assay to measure the extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complement of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA.
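    How four binary reads could define the base sequence can be sketched as follows. This is an idealized, error-free illustration: it assumes Protocol A marks only T, Protocol B marks T and C, and each strand is read 5' to 3'. All function names are hypothetical and do not come from the paper.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def protocol_a(strand):
    """Idealized Protocol A: only T is osmylated (1)."""
    return "".join("1" if b == "T" else "0" for b in strand)


def protocol_b(strand):
    """Idealized Protocol B: both pyrimidines (C, T) are osmylated (1)."""
    return "".join("1" if b in "CT" else "0" for b in strand)


def decode(a_target, b_target, a_comp, b_comp):
    """Recover the target bases from the four binary reads."""
    # Complement-strand reads run in the opposite direction; reverse to align.
    a_c, b_c = a_comp[::-1], b_comp[::-1]
    bases = []
    for t, py, a, pu in zip(a_target, b_target, a_c, b_c):
        if py == pu:
            # A pyrimidine on one strand must pair with a purine on the other.
            raise ValueError("inconsistent Protocol B reads")
        if t == "1":
            bases.append("T")   # T marked directly on the target
        elif py == "1":
            bases.append("C")   # pyrimidine that is not T
        elif a == "1":
            bases.append("A")   # T on the complement pairs with A
        else:
            bases.append("G")
    return "".join(bases)


target = "GATTACA"
comp = reverse_complement(target)
reads = (protocol_a(target), protocol_b(target),
         protocol_a(comp), protocol_b(comp))
print(decode(*reads))
```

    In this idealized setting the fourth read (Protocol B on the complement) is redundant, so the sketch uses it only as a consistency check.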

  7. Development of expressed sequence tag and expressed sequence tag–simple sequence repeat marker resources for Musa acuminata

    Science.gov (United States)

    Passos, Marco A. N.; de Oliveira Cruz, Viviane; Emediato, Flavia L.; de Camargo Teixeira, Cristiane; Souza, Manoel T.; Matsumoto, Takashi; Rennó Azevedo, Vânia C.; Ferreira, Claudia F.; Amorim, Edson P.; de Alencar Figueiredo, Lucio Flavio; Martins, Natalia F.; de Jesus Barbosa Cavalcante, Maria; Baurens, Franc-Christophe; da Silva, Orzenil Bonfim; Pappas, Georgios J.; Pignolet, Luc; Abadie, Catherine; Ciampi, Ana Y.; Piffanelli, Pietro; Miller, Robert N. G.

    2012-01-01

    Background and aims Banana (Musa acuminata) is a crop contributing to global food security. Many varieties lack resistance to biotic stresses, due to sterility and narrow genetic background. The objective of this study was to develop an expressed sequence tag (EST) database of transcripts expressed during compatible and incompatible banana–Mycosphaerella fijiensis (Mf) interactions. Black leaf streak disease (BLSD), caused by Mf, is a destructive disease of banana. Microsatellite markers were developed as a resource for crop improvement. Methodology cDNA libraries were constructed from in vitro-infected leaves from BLSD-resistant M. acuminata ssp. burmaniccoides Calcutta 4 (MAC4) and susceptible M. acuminata cv. Cavendish Grande Naine (MACV). Clones were 5′-end Sanger sequenced, ESTs assembled with TGICL and unigenes annotated using BLAST, Blast2GO and InterProScan. Mreps was used to screen for simple sequence repeats (SSRs), with markers evaluated for polymorphism using 20 diploid (AA) M. acuminata accessions contrasting in resistance to Mycosphaerella leaf spot diseases. Principal results A total of 9333 high-quality ESTs were obtained for MAC4 and 3964 for MACV, which assembled into 3995 unigenes. Of these, 2592 displayed homology to genes encoding proteins with known or putative function, and 266 to genes encoding proteins with unknown function. Gene ontology (GO) classification identified 543 GO terms, 2300 unigenes were assigned to EuKaryotic orthologous group categories and 312 mapped to Kyoto Encyclopedia of Genes and Genomes pathways. A total of 624 SSR loci were identified, with trinucleotide repeat motifs the most abundant in MAC4 (54.1 %) and MACV (57.6 %). Polymorphism across M. acuminata accessions was observed with 75 markers. Alleles per polymorphic locus ranged from 2 to 8, totalling 289. The polymorphism information content ranged from 0.08 to 0.81. Conclusions This EST collection offers a resource for studying functional genes, including

  8. Electrochemical measurement for analysis of DNA sequence

    Energy Technology Data Exchange (ETDEWEB)

    Cho, S.B.; Hong, J.S.; Pak, J.H. [Korea University, Seoul (Korea); Kim, Y.M. [National Institute of Health, Seoul (Korea)

    2002-02-01

    One of the important roles of a DNA chip is detecting genetic diseases and mutations by analyzing DNA sequence. For successful electrochemical genotyping, several aspects should be considered, including the chemical treatment of the electrode surface, DNA immobilization on the electrode, hybridization, the choice of an intercalator that binds selectively to double-stranded DNA, and equipment for detecting and analyzing the output signal. Au was used as the electrode material, 2-mercaptoethanol was used for linking DNA to the Au electrode, and methylene blue was used as an indicator that binds selectively to double-stranded DNA. From the analysis of the reductive current of this indicator bound to double-stranded DNA on an electrode, normal double-stranded DNA could be distinguished from single-stranded DNA in just a few seconds. Also, it was found that the peak reduction current of the indicator is proportional to the concentration of target DNA hybridized with the probe DNA. Therefore, it is possible to realize a simple and cheap DNA sensor using electrochemical measurement for genotyping. (author). 20 refs., 8 figs., 1 tab.

  9. Dynamics and control of DNA sequence amplification

    Energy Technology Data Exchange (ETDEWEB)

    Marimuthu, Karthikeyan [Department of Chemical Engineering and Center for Advanced Process Decision-Making, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213 (United States); Chakrabarti, Raj, E-mail: raj@pmc-group.com, E-mail: rajc@andrew.cmu.edu [Department of Chemical Engineering and Center for Advanced Process Decision-Making, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213 (United States); Division of Fundamental Research, PMC Advanced Technology, Mount Laurel, New Jersey 08054 (United States)

    2014-10-28

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.

  10. Female-specific DNA sequences in geese.

    Science.gov (United States)

    Huang, M C; Lin, W C; Horng, Y M; Rouvier, R; Huang, C W

    2003-07-01

    1. The OPAE random primers (Operon Technologies, Inc., CA) were used for random amplified polymorphic DNA (RAPD) fingerprinting in Chinese, White Roman and Landaise geese. One of these primers, OPAE-06, produced a 938-bp sex-specific fragment in all females and in no males of Chinese geese only. 2. A novel female-specific DNA sequence in Chinese goose was cloned and sequenced. Two primers, CGSex-F and CGSex-R, were designed in order to amplify a 912-bp sex-specific polymerase chain reaction (PCR) fragment on genomic DNA from female geese. 3. It was shown that a simple and effective PCR-based sexing technique could be used in the three goose breeds studied. 4. Nucleotide sequencing of the sex-specific fragments in White Roman and Landaise geese was performed and sequence differences were observed among these three breeds.

  11. Insertion sequence inversions mediated by ectopic recombination between terminal inverted repeats.

    Science.gov (United States)

    Ling, Alison; Cordaux, Richard

    2010-12-20

    Transposable elements are widely distributed and diverse in both eukaryotes and prokaryotes, as exemplified by DNA transposons. As a result, they represent a considerable source of genomic variation, for example through ectopic (i.e. non-allelic homologous) recombination events between transposable element copies, resulting in genomic rearrangements. Ectopic recombination may also take place between homologous sequences located within transposable element sequences. DNA transposons are typically bounded by terminal inverted repeats (TIRs). Ectopic recombination between TIRs is expected to result in DNA transposon inversions. However, such inversions have barely been documented. In this study, we report natural inversions of the most common prokaryotic DNA transposons: insertion sequences (IS). We identified natural TIR-TIR recombination-mediated inversions in 9% of IS insertion loci investigated in Wolbachia bacteria, which suggests that recombination between IS TIRs may be a quite common, albeit largely overlooked, source of genomic diversity in bacteria. We suggest that inversions may impede IS survival and proliferation in the host genome by altering transpositional activity. They may also alter genomic instability by modulating the outcome of ectopic recombination events between IS copies in various orientations. This study represents the first report of TIR-TIR recombination within bacterial IS elements and it thereby uncovers a novel mechanism of structural variation for this class of prokaryotic transposable elements.

  12. Compressing DNA sequence databases with coil

    Directory of Open Access Journals (Sweden)

    Hendy Michael D

    2008-05-01

    Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.

  13. Analysis of simple sequence repeats markers derived from Phytophthora sojae expressed sequence tags

    Institute of Scientific and Technical Information of China (English)

    ZHU Zhendong; HUO Yunlong; WANG Xiaoming; HUANG Junbin; WU Xiaofei

    2004-01-01

    Five thousand eight hundred publicly available expressed sequence tags (ESTs) of Phytophthora sojae were electronically searched and 415 simple sequence repeats (SSRs) were identified in 369 ESTs. The average density of SSRs was one SSR per 8.9 kb of EST sequence screened. The most frequent repeats were trinucleotide repeats (50.1%) and the least frequent were tetranucleotide repeats (8.2%). Forty primer pairs were designed and tested on 5 strains of P. sojae. Thirty-three primer pairs gave successful PCR amplification. Of the 33 functional primer pairs, 28 produced characteristic SSR bands of the expected size, and 15 (45.5%) detected polymorphism among the 5 tested strains of P. sojae. Based on the polymorphisms detected with 20 EST-SSR markers, the 5 tested strains of P. sojae were clustered into 3 groups. In this study, SSR markers for P. sojae were developed for the first time. These markers could be useful for identification, genetic variation studies, and molecular mapping of P. sojae and related species.
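    An electronic SSR search of the kind described here can be sketched with a regular expression. The abstract does not state the search criteria, so the thresholds below (motifs of 2 to 6 bp, at least 4 tandem copies) and the function name are assumptions chosen purely for illustration.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, copies) for each perfect tandem repeat found.

    The lazy quantifier makes the regex prefer the shortest motif that
    satisfies the copy-number threshold at a given position.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Toy sequence: a (CAG)5 tract and an (AT)5 tract separated by spacer bases.
example = "TTGA" + "CAG" * 5 + "TTCG" + "AT" * 5 + "GG"
for start, motif, copies in find_ssrs(example):
    print(start, motif, copies)
```

    Note that with `min_unit=2` a long homopolymer run would be reported as a dinucleotide repeat; a production pipeline would typically handle mononucleotide tracts and imperfect repeats separately.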

  14. PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes

    OpenAIRE

    Kumar, Pankaj; Chaitanya, Pasumarthy S.; Nagarajaram, Hampapathalu A

    2010-01-01

    PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are the tandem repeats of nucleotide motifs of the sizes 1–6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in s...

  15. Evolutionary Origin of Higher-Order Repeat Structure in Alpha-Satellite DNA of Primate Centromeres

    Science.gov (United States)

    Koga, Akihiko; Hirai, Yuriko; Terada, Shoko; Jahan, Israt; Baicharoen, Sudarath; Arsaithamkul, Visit; Hirai, Hirohisa

    2014-01-01

    Alpha-satellite DNA (AS) is a main DNA component of primate centromeres, consisting of tandemly repeated units of ∼170 bp. The AS of humans contains sequences organized into higher-order repeat (HOR) structures, in which a block of multiple repeat units forms a larger repeat unit and the larger units are repeated tandemly. The presence of HOR in AS is widely thought to be unique to hominids (family Hominidae; humans and great apes). Recently, we have identified an HOR-containing AS in the siamang, which is a small ape species belonging to the genus Symphalangus in the family Hylobatidae. This result supports the view that HOR in AS is an attribute of hominoids (superfamily Hominoidea) rather than hominids. A single example is, however, not sufficient for discussion of the evolutionary origin of HOR-containing AS. In the present study, we developed an efficient method for detecting signs of large-scale HOR and demonstrated HOR of AS in all the three other genera. Thus, AS organized into HOR occurs widely in hominoids. Our results indicate that (i) HOR-containing AS was present in the last common ancestor of hominoids or (ii) HOR-containing AS emerged independently in most or all basal branches of hominoids. We have also confirmed HOR occurrence in centromeric AS in the Hylobatidae family, which remained unclear in our previous study because of the existence of AS in subtelomeric regions, in addition to centromeres, of siamang chromosomes. PMID:24585002

  16. DNA Sequencing in Cultural Heritage.

    Science.gov (United States)

    Vai, Stefania; Lari, Martina; Caramelli, David

    2016-02-01

    Over the last three decades, DNA analysis of degraded samples has become an important research tool in anthropology, archaeozoology, molecular evolution, and population genetics. Applications such as determining the species origin of prehistoric and historic objects, identifying famous individuals, and characterizing samples important for historical, archaeological, or evolutionary reconstructions also give paleogenetics an important role in the enhancement of cultural heritage. Rapid methodological improvements in recent years have made it possible to recover even complete genomes from highly degraded samples, reaching back 400,000 years for samples from temperate regions and 700,000 years for permafrozen remains, and to analyze more recent material that has been subjected to harsh biochemical treatments. Here we review the methodological approaches used so far for the molecular analysis of degraded samples and their application in some case studies.

  17. cDNA sequence quality data - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Budding yeast cDNA sequencing project: cDNA sequence quality data. Description of data contents: Phred's quality score. PHD format, one file to a single cDNA data, and co...

  18. A Conserved DNA Repeat Promotes Selection of a Diverse Repertoire of Trypanosoma brucei Surface Antigens from the Genomic Archive.

    Directory of Open Access Journals (Sweden)

    Galadriel Hovel-Miner

    2016-05-01

    African trypanosomes are mammalian pathogens that must regularly change their protein coat to survive in the host bloodstream. Chronic trypanosome infections are potentiated by their ability to access a deep genomic repertoire of Variant Surface Glycoprotein (VSG) genes and switch from the expression of one VSG to another. Switching VSG expression is largely based on DNA recombination events that result in chromosome translocations between an acceptor site, which houses the actively transcribed VSG, and a donor gene, drawn from an archive of more than 2,000 silent VSGs. One element implicated in these duplicative gene conversion events is a DNA repeat of approximately 70 bp that is found in long regions within each BES and short iterations proximal to VSGs within the silent archive. Early observations showing that 70-bp repeats can be recombination boundaries during VSG switching led to the prediction that VSG-proximal 70-bp repeats provide recombinatorial homology. Yet this long-held assumption had not been tested, and no specific function for the conserved 70-bp repeats had been demonstrated. In the present study, the 70-bp repeats were genetically manipulated under conditions that induce gene conversion. In this manner, we demonstrated that 70-bp repeats promote access to archival VSGs. Synthetic repeat DNA sequences were then employed to identify the length, sequence, and directionality of repeat regions required for this activity. In addition, manipulation of the 70-bp repeats allowed us to observe a link between VSG switching and the cell cycle that had not been appreciated. Together these data provide definitive support for the long-standing hypothesis that 70-bp repeats provide recombinatorial homology during switching. Yet, the fact that silent archival VSGs are selected under these conditions suggests the 70-bp repeats also direct DNA pairing and recombination machinery away from the closest homologs (silent BESs) and toward the rest of

  19. DNA sequencing by synthesis with degenerate primers

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Degenerate primer-based sequencing by synthesis (DP-SBS) was developed for high-throughput DNA sequencing. In this method, sets of degenerate primers are hybridized to arrayed DNA templates and extended by DNA polymerase on microarrays. Different sets of degenerate primers, each containing a given number (n) of degenerate nucleotides at the 3'-end, were annealed to the templates to be sequenced, which were immobilized on the solid surface. The nucleotide at position n+1 on the template sequence was determined by detecting the incorporation of fluorescently labeled nucleotides. The fluorescently labeled nucleotide was incorporated into the primer in a base-specific manner during the enzymatic primer extension reaction, and nine bases were read out accurately. The main advantage of DP-SBS is that it uses only conventional biochemical reagents and avoids the specialized chemical reagents otherwise needed to remove the labeled nucleotides and reactivate the primer for further extension. The present study finds the DP-SBS method reliable, simple, and cost-effective for laboratory sequencing of large numbers of short DNA fragments.

  20. Expressed Sequence Tag-Simple Sequence Repeat (EST-SSR) Marker Resources for Diversity Analysis of Mango (Mangifera indica L.)

    Directory of Open Access Journals (Sweden)

    Natalie L. Dillon

    2014-01-01

    In this study, a collection of 24,840 expressed sequence tags (ESTs) generated from five mango (Mangifera indica L.) cDNA libraries was mined for EST-based simple sequence repeat (SSR) markers. Over 1,000 ESTs with SSR motifs were detected from more than 24,000 EST sequences, with di- and tri-nucleotide repeat motifs the most abundant. Of these, 25 EST-SSRs in genes involved in plant development, stress response, and fruit color and flavor development pathways were selected, developed into PCR markers and characterized in a population of 32 mango selections including M. indica varieties and related Mangifera species. Twenty-four of the 25 EST-SSR markers exhibited polymorphisms, identifying a total of 86 alleles with an average of 5.38 alleles per locus, and distinguished between all Mangifera selections. Private alleles were identified for Mangifera species. These newly developed EST-SSR markers enhance the current 11 SSR mango genetic identity panel utilized by the Australian Mango Breeding Program. The current panel has been used to identify progeny and parents for selection, and the application of this extended panel will further improve and help to design mango hybridization strategies for increased breeding efficiency.
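    As an illustration of what mining sequences for SSR motifs can look like, the sketch below scans a sequence for perfect di- and tri-nucleotide tandem repeats with a regular expression. It is not the pipeline used in the study above: the function name `find_ssrs` and the minimum repeat counts (6 for di-, 5 for tri-nucleotides) are assumptions chosen for the example.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    min_repeats maps motif length -> minimum number of copies.
    The defaults below are hypothetical thresholds, not values from the study.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{m,} demands at least m further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # ignore homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits

example = "GGC" + "AG" * 7 + "TGCA" + "CTT" * 5 + "A"
print(find_ssrs(example))
```

    Positions are 0-based. Imperfect or compound repeats are deliberately out of scope for this sketch.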

  1. Glycome mapping on DNA sequencing equipment.

    Science.gov (United States)

    Laroy, Wouter; Contreras, Roland; Callewaert, Nico

    2006-01-01

    Here we provide a detailed protocol for the analysis of protein-linked glycans on DNA sequencing equipment. This protocol satisfies the glyco-analytical needs of many projects and can form the basis of 'glycomics' studies, in which robustness, high throughput, high sensitivity and reliable quantification are of paramount importance. The protocol routinely resolves isobaric glycan stereoisomers, which is much more difficult by mass spectrometry (MS). Earlier methods made use of polyacrylamide gel-based sequencers, but we have now adapted the technique to multicapillary DNA sequencers, which represent the state of the art today. In addition, we have integrated an option for HPLC-based fractionation of highly anionic 8-amino-1,3,6-pyrenetrisulfonic acid (APTS)-labeled glycans before rapid capillary electrophoretic profiling. This option facilitates either two-dimensional profiling of complex glycan mixtures and exoglycosidase sequencing, or MS analysis of particular compounds of interest rather than of the total pool of glycans in a sample.

  2. The complete DNA sequence of vaccinia virus.

    Science.gov (United States)

    Goebel, S J; Johnson, G P; Perkus, M E; Davis, S W; Winslow, J P; Paoletti, E

    1990-11-01

    The complete DNA sequence of the genome of vaccinia virus has been determined. The genome consisted of 191,636 bp with a base composition of 66.6% A + T. We have identified 198 "major" protein-coding regions and 65 overlapping "minor" regions, for a total of 263 potential genes. Genes encoded by the virus were located by examination of DNA sequence characteristics and compared with existing vaccinia virus mapping analyses, sequence data, and transcription data. These genes were found to be compactly organized along the genome with relatively few regions of noncoding sequences. Whereas several similarities to proteins of known function were discerned, the function of the majority of proteins encoded by these open reading frames is as yet undetermined.

  3. Cytogenetic diversity of simple sequences repeats in morphotypes of Brassica rapa ssp. chinensis

    Directory of Open Access Journals (Sweden)

    Jinshuang Zheng

    2016-07-01

    A significant fraction of the nuclear DNA of all eukaryotes is occupied by simple sequence repeats (SSRs). Although these sequences have sparked great interest as a means of studying genetic variation, linkage mapping and evolution, little attention has been paid to their chromosomal distribution and cytogenetic diversity. This paper reports the long-range organization of all possible classes of mono-, di- and tri-nucleotide SSRs in Brassica rapa. Fluorescence in situ hybridization (FISH) was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa, with trinucleotide SSRs more prevalent in the genome of B. rapa ssp. chinensis. The chromosomal characterizations of mono-, di- and tri-nucleotide repeats have been acquired. The data revealed a non-random and motif-dependent chromosomal distribution of SSRs in the different morphotypes, with relative variability in SSR amount and a similar chromosomal distribution in centromeric/peri-centromeric heterochromatin. The differences in SSR abundance and distribution point to SSRs as a driving force in the evolution of B. rapa species. The results provide a comprehensive view of SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis.

  4. Formation of Extrachromosomal Circular DNA from Long Terminal Repeats of Retrotransposons in Saccharomyces cerevisiae

    Directory of Open Access Journals (Sweden)

    Henrik D. Møller

    2016-02-01

    Extrachromosomal circular DNA (eccDNA) derived from chromosomal Ty retrotransposons in yeast can be generated in multiple ways. Ty eccDNA can arise from the circularization of extrachromosomal linear DNA during the transpositional life cycle of retrotransposons, or from circularization of genomic Ty DNA. Circularization may happen through nonhomologous end-joining (NHEJ) of long terminal repeats (LTRs) flanking Ty elements, by Ty autointegration, or by LTR–LTR recombination. By performing an in-depth investigation of sequence reads stemming from Ty eccDNAs obtained from populations of Saccharomyces cerevisiae S288c, we find that eccDNAs predominantly correspond to full-length Ty1 elements. Analyses of sequence junctions reveal no signs of NHEJ or autointegration events. We detect recombination junctions that are consistent with yeast Ty eccDNAs being generated through recombination events within the genome. This opens the possibility that retrotransposable elements could move around in the genome without an RNA intermediate directly through DNA circularization.

  5. Chromosomal organizations of major repeat families on potato (Solanum tuberosum) and further exploring in its sequenced genome.

    Science.gov (United States)

    Tang, Xiaomin; Datema, Erwin; Guzman, Myriam Olortegui; de Boer, Jan M; van Eck, Herman J; Bachem, Christian W B; Visser, Richard G F; de Jong, Hans

    2014-12-01

    One of the most powerful technologies in unraveling the organization of a eukaryotic plant genome is high-resolution Fluorescent in situ hybridization of repeats and single copy DNA sequences on pachytene chromosomes. This technology allows the integration of physical mapping information with chromosomal positions, including centromeres, telomeres, nucleolar-organizing region, and euchromatin and heterochromatin. In this report, we established chromosomal positions of different repeat fractions of the potato genomic DNA (Cot100, Cot500 and Cot1000) on the chromosomes. We also analysed various repeat elements that are unique to potato including the moderately repetitive P5 and REP2 elements, where the REP2 is part of a larger Gypsy-type LTR retrotransposon and cover most chromosome regions, with some brighter fluorescing spots in the heterochromatin. The most abundant tandem repeat is the potato genomic repeat 1 that covers subtelomeric regions of most chromosome arms. Extensive multiple alignments of these repetitive sequences in the assembled RH89-039-16 potato BACs and the draft assembly of the DM1-3 516 R44 genome shed light on the conservation of these repeats within the potato genome. The consensus sequences thus obtained revealed the native complete transposable elements from which they were derived.

  6. Genomic libraries: II. Subcloning, sequencing, and assembling large-insert genomic DNA clones.

    Science.gov (United States)

    Quail, Mike A; Matthews, Lucy; Sims, Sarah; Lloyd, Christine; Beasley, Helen; Baxter, Simon W

    2011-01-01

    Sequencing large insert clones to completion is useful for characterizing specific genomic regions, identifying haplotypes, and closing gaps in whole genome sequencing projects. Despite being a standard technique in molecular laboratories, DNA sequencing using the Sanger method can be highly problematic when complex secondary structures or sequence repeats are encountered in genomic clones. Here, we describe methods to isolate DNA from a large insert clone (fosmid or BAC), subclone the sample, and sequence the region to the highest industry standard. Troubleshooting solutions for sequencing difficult templates are discussed.

  7. DNA dynamics is likely to be a factor in the genomic nucleotide repeats expansions related to diseases.

    Directory of Open Access Journals (Sweden)

    Boian S Alexandrov

    Trinucleotide repeat sequences (TRS) represent a common type of genomic DNA motif whose expansion is associated with a large number of human diseases. The molecular mechanisms driving the ongoing dynamic expansion of TRS across generations and within tissues, and its influence on genomic DNA functions, are not well understood. Here we report results for a novel and notable collective breathing behavior of genomic DNA of tandem TRS, leading to a propensity for large local transient DNA openings at physiological temperature. Our Langevin molecular dynamics (LMD) and Markov Chain Monte Carlo (MCMC) simulations demonstrate that the patterns of openings of various TRSs depend specifically on their length. The collective propensity for DNA strand separation of repeated sequences serves as a precursor for outsized intermediate bubble states independently of the G/C-content. We report that repeats have the potential to interfere with the binding of transcription factors to their consensus sequence by altered DNA breathing dynamics in proximity of the binding sites. These observations might influence ongoing attempts to use LMD and MCMC simulations for TRS-related modeling of genomic DNA functionality in elucidating the common denominators of the dynamic TRS expansion mutation with potential therapeutic applications.

  8. The DNA sequence specificity of bleomycin cleavage in a systematically altered DNA sequence.

    Science.gov (United States)

    Gautam, Shweta D; Chen, Jon K; Murray, Vincent

    2017-08-01

    Bleomycin is an anti-tumour agent that is clinically used to treat several types of cancers. Bleomycin cleaves DNA at specific DNA sequences and recent genome-wide DNA sequencing specificity data indicated that the sequence 5'-RTGT*AY (where T* is the site of bleomycin cleavage, R is G/A and Y is T/C) is preferentially cleaved by bleomycin in human cells. Based on this DNA sequence, we constructed a plasmid clone to explore this bleomycin cleavage preference. By systematic variation of single nucleotides in the 5'-RTGT*AY sequence, we were able to investigate the effect of nucleotide changes on bleomycin cleavage efficiency. We observed that the preferred consensus DNA sequence for bleomycin cleavage in the plasmid clone was 5'-YYGT*AW (where W is A/T). The most highly cleaved sequence was 5'-TCGT*AT and, in fact, the seven most highly cleaved sequences conformed to the consensus sequence 5'-YYGT*AW. A comparison with genome-wide results was also performed and while the core sequence was similar in both environments, the surrounding nucleotides were different.
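    The consensus notation in this abstract uses IUPAC degenerate codes (R is G/A, Y is T/C, W is A/T, as defined above). A minimal sketch of how such a consensus can be checked against a concrete site follows; the helper names are hypothetical, and the asterisk marking the cleavage position is dropped before matching.

```python
import re
from itertools import product

# Only the codes used in the abstract, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
    "W": "AT",  # weak (A or T)
}

def consensus_to_regex(consensus):
    """Translate a degenerate consensus such as 'YYGTAW' into a compiled regex."""
    return re.compile("".join("[%s]" % IUPAC[code] for code in consensus.upper()))

def matches_consensus(site, consensus):
    """True if the whole site is one of the sequences the consensus allows."""
    return consensus_to_regex(consensus).fullmatch(site.upper()) is not None

def expand_consensus(consensus):
    """List every concrete sequence covered by the consensus."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in consensus.upper()))]

print(matches_consensus("TCGTAT", "YYGTAW"))  # the most highly cleaved plasmid site
print(matches_consensus("TCGTAT", "RTGTAY"))  # the genome-wide consensus
print(len(expand_consensus("YYGTAW")))
```

    Under these definitions the most highly cleaved plasmid sequence, 5'-TCGT*AT, satisfies the plasmid consensus but not the genome-wide one, because its first base is a pyrimidine.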

  9. Controlled growth of DNA structures from repeating units using the vernier mechanism.

    Science.gov (United States)

    Greschner, Andrea A; Bujold, Katherine E; Sleiman, Hanadi F

    2014-08-11

    In this report, we demonstrate the assembly of length-programmed DNA nanostructures using a single 16 base sequence and its complement as building blocks. To achieve this, we applied the Vernier mechanism to DNA assembly, which uses a mismatch in length between two monomers to dictate the final length of the product. Specifically, this approach relies on the interaction of two DNA strands containing a different number (n, m) of complementary binding sites: these two strands will keep binding to each other until they come into register, thus generating a larger assembly whose length (n × m) is encoded by the number of binding sites in each strand. While the Vernier mechanism has been applied to other areas of supramolecular chemistry, here we present an application of its principles to DNA nanostructures. Using a single 16 base repeat and its complement, and varying the number of repeats on a given DNA strand, we show the consistent construction of duplexes up to 228 base pairs (bp) in length. Employing specific annealing protocols, strand capping, and intercalator chaperones allows us to further grow the duplex to 392 base pairs. We demonstrate that the Vernier method is not only strand-efficient, but also produces a cleaner, higher-yielding product than conventional designs.
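    The register arithmetic behind the Vernier idea can be sketched in a few lines. Two strands with n and m binding sites come back into register after lcm(n, m) sites, which equals the n × m quoted above whenever n and m share no common factor. The function below is an idealized model, not the authors' design procedure: its name and the assumption that every binding site is exactly one 16-base repeat are illustrative only.

```python
from math import gcd

def vernier_register(n, m, site_length=16):
    """Idealized Vernier assembly of strands with n and m binding sites.

    Assumes each binding site is one repeat of site_length bases
    (16 is taken from the abstract; everything else is a simplification).
    """
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive")
    sites = n * m // gcd(n, m)  # least common multiple: first point of register
    return {
        "sites": sites,
        "strands_with_n_sites": sites // n,
        "strands_with_m_sites": sites // m,
        "length_bases": sites * site_length,
    }

print(vernier_register(2, 3))
print(vernier_register(4, 6))
```

    The second call shows why the choice of n and m matters: with a shared factor the strands register early, at 12 sites rather than 24.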

  10. The Cipher Code of Simple Sequence Repeats in "Vampire Pathogens".

    Science.gov (United States)

    Zou, Geng; Bello-Orti, Bernardo; Aragon, Virginia; Tucker, Alexander W; Luo, Rui; Ren, Pinxing; Bi, Dingren; Zhou, Rui; Jin, Hui

    2015-07-28

    Blood inside mammals is a forbidden area for the majority of prokaryotic microbes; however, red blood cells tropism microbes, like "vampire pathogens" (VP), succeed in matching scarce nutrients and surviving strong immunity reactions. Here, we found VP of Mycoplasma, Rhizobiales, and Rickettsiales showed significantly higher counts of (AG)n dimeric simple sequence repeats (Di-SSRs) in the genomes, coding and non-coding regions than non Vampire Pathogens (N_VP). Regression analysis indicated a significant correlation between GC content and the span of (AG)n-Di-SSR variation. Gene Ontology (GO) terms with abundance of (AG)3-Di-SSRs shared by the VP strains were associated with purine nucleotide metabolism (FDR < 0.01), indicating an adaptation to the limited availability of purine and nucleotide precursors in blood. Di-amino acids coded by (AG)n-Di-SSRs included all three six-fold code amino acids (Arg, Leu and Ser) and significantly higher counts of Di-amino acids coded by (AG)3, (GA)3, and (TC)3 in VP than N_VP. Furthermore, significant differences (P < 0.001) on the numbers of triplexes formed from (AG)n-Di-SSRs between VP and N_VP in Mycoplasma suggested the potential role of (AG)n-Di-SSRs in gene regulation.

  11. DNA Sequence Alignment during Homologous Recombination.

    Science.gov (United States)

    Greene, Eric C

    2016-05-27

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination.

  12. Automated Template Quantification for DNA Sequencing Facilities

    Science.gov (United States)

    Ivanetich, Kathryn M.; Yan, Wilson; Wunderlich, Kathleen M.; Weston, Jennifer; Walkup, Ward G.; Simeon, Christian

    2005-01-01

    The quantification of plasmid DNA by the PicoGreen dye binding assay has been automated, and the effect of quantification of user-submitted templates on DNA sequence quality in a core laboratory has been assessed. The protocol pipets, mixes and reads standards, blanks and up to 88 unknowns, generates a standard curve, and calculates template concentrations. For pUC19 replicates at five concentrations, coefficients of variance were 0.1, and percent errors were from 1% to 7% (n = 198). Standard curves with pUC19 DNA were nonlinear over the 1 to 1733 ng/μL concentration range required to assay the majority (98.7%) of user-submitted templates. Over 35,000 templates have been quantified using the protocol. For 1350 user-submitted plasmids, 87% deviated by ≥ 20% from the requested concentration (500 ng/μL). Based on data from 418 sequencing reactions, quantification of user-submitted templates was shown to significantly improve DNA sequence quality. The protocol is applicable to all types of double-stranded DNA, is unaffected by primer (1 pmol/μL), and is user modifiable. The protocol takes 30 min, saves 1 h of technical time, and costs approximately $0.20 per unknown. PMID:16461949
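    The calculation steps described above (build a standard curve, read unknowns off it, compare with the requested concentration) can be sketched as follows. Because the abstract reports nonlinear standard curves, this example interpolates piecewise-linearly between standards instead of fitting one straight line; that choice, the function names, and all fluorescence values are assumptions made for illustration, not the published protocol.

```python
from bisect import bisect_right

def concentration_from_signal(signal, standards):
    """Interpolate a concentration from (signal, concentration) standards.

    standards must be sorted by signal; values outside the curve are rejected.
    """
    signals = [s for s, _ in standards]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal outside the range of the standards")
    # Index of the upper bracketing standard, clamped for signal == last standard.
    i = min(bisect_right(signals, signal), len(standards) - 1)
    (s0, c0), (s1, c1) = standards[i - 1], standards[i]
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)

def percent_deviation(measured, requested=500.0):
    """Signed deviation from the requested concentration (ng/uL), in percent."""
    return 100.0 * (measured - requested) / requested

# Hypothetical standards: (fluorescence units, ng/uL).
standards = [(0.0, 0.0), (100.0, 50.0), (300.0, 500.0), (600.0, 1733.0)]
conc = concentration_from_signal(200.0, standards)
deviation = percent_deviation(conc)
print(conc, deviation, abs(deviation) >= 20.0)
```

    The final comparison applies the 20% threshold mentioned in the abstract to flag templates that are far from the requested 500 ng/μL.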

  13. DNA Sequence Alignment during Homologous Recombination*

    Science.gov (United States)

    Greene, Eric C.

    2016-01-01

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination. PMID:27129270

  14. In the Staphylococcus aureus Two-Component System sae, the Response Regulator SaeR Binds to a Direct Repeat Sequence and DNA Binding Requires Phosphorylation by the Sensor Kinase SaeS

    OpenAIRE

    Sun, Fei; Li, Chunling; Jeong, Dowon; Sohn, Changmo; He, Chuan; Bae, Taeok

    2010-01-01

    Staphylococcus aureus uses the SaeRS two-component system to control the expression of many virulence factors such as alpha-hemolysin and coagulase; however, the molecular mechanism of this signaling has not yet been elucidated. Here, using the P1 promoter of the sae operon as a model target DNA, we demonstrated that the unphosphorylated response regulator SaeR does not bind to the P1 promoter DNA, while its C-terminal DNA binding domain alone does. The DNA binding activity of full-length Sae...

  15. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    Science.gov (United States)

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.

  16. Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain.

    Science.gov (United States)

    de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

    2014-06-01

    The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution.

  17. The first determination of DNA sequence of a specific gene.

    Science.gov (United States)

    Inouye, Masayori

    2016-05-10

    How and when was the first DNA sequence of a gene determined? In 1977, F. Sanger came up with an innovative technology to sequence DNA by using chain terminators, and determined the entire DNA sequence of the 5375-base genome of bacteriophage φX 174 (Sanger et al., 1977). While Sanger's achievement has been recognized as the first DNA sequencing of genes, we had determined the DNA sequence of a gene, albeit a partial sequence, 11 years before Sanger's DNA sequence (Okada et al., 1966).

  18. DNA sequencing by nanopores: advances and challenges

    Science.gov (United States)

    Agah, Shaghayegh; Zheng, Ming; Pasquali, Matteo; Kolomeisky, Anatoly B.

    2016-10-01

    Developing inexpensive and simple DNA sequencing methods capable of detecting entire genomes in short periods of time could revolutionize the world of medicine and technology. It will also lead to major advances in our understanding of fundamental biological processes. It has been shown that nanopores have the ability of single-molecule sensing of various biological molecules rapidly and at a low cost. This has stimulated significant experimental efforts in developing DNA sequencing techniques by utilizing biological and artificial nanopores. In this review, we discuss recent progress in the nanopore sequencing field with a focus on the nature of nanopores and on sensing mechanisms during the translocation. Current challenges and alternative methods are also discussed.

  19. Complete genome sequence of chloroplast DNA (cpDNA) of Chlorella sorokiniana.

    Science.gov (United States)

    Orsini, Massimiliano; Cusano, Roberto; Costelli, Cristina; Malavasi, Veronica; Concas, Alessandro; Angius, Andrea; Cao, Giacomo

    2016-01-01

    The complete chloroplast genome sequence of Chlorella sorokiniana strain (SAG 111-8 k) is presented in this study. The genome consists of circular chromosomes of 109,811 bp, which encode a total of 109 genes, including 74 proteins, 3 rRNAs and 31 tRNAs. Moreover, introns are not detected and all genes are present in single copy. The overall AT content of the C. sorokiniana cpDNA is 65.9%, the coding sequence is 59.1% and a large inverted repeat (IR) is not observed.
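    A base-composition figure such as the overall AT content reported above is a simple count over the sequence. The sketch below shows one way to compute it; the function name and the decision to ignore ambiguous symbols such as N are assumptions for illustration, and the abstract does not say how its percentages were calculated.

```python
def at_content(seq):
    """Percentage of A and T among unambiguous bases (A, C, G, T) in seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # drop N and other symbols
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for b in bases if b in "AT")
    return 100.0 * at_count / len(bases)

print(at_content("ATGCATAT"))
print(at_content("atNNgc"))
```

    For a whole genome the same function would be applied to the concatenated sequence read from a FASTA file.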

  20. Molecular mechanisms for maintenance of G-rich short tandem repeats capable of adopting G4 DNA structures

    Energy Technology Data Exchange (ETDEWEB)

    Nakagama, Hitoshi (E-mail: hnakagam@gan2.res.ncc.go.jp); Higuchi, Kumiko; Tanaka, Etsuko; Tsuchiya, Naoto; Nakashima, Katsuhiko; Katahira, Masato; Fukuda, Hirokazu [all: Biochemistry Division, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045 (Japan)]

    2006-06-25

    Mammalian genomes contain several types of repetitive sequences. Some of these sequences are implicated in various specific cellular events, including meiotic recombination, chromosomal breaks and transcriptional regulation, and also in several human disorders. In this review, we document the formation of DNA secondary structures by the G-rich repetitive sequences that have been found in several minisatellites, telomeres and in various triplet repeats, and report their effects on in vitro DNA synthesis. d(GGCAG) repeats in the mouse minisatellite Pc-1 were demonstrated to form an intra-molecular folded-back quadruplex structure (also called a G4' structure) by NMR and CD spectrum analyses. d(TTAGGG) telomere repeats and d(CGG) triplet repeats were also shown to form G4' and other unspecified higher order structures, respectively. In vitro DNA synthesis was substantially arrested within the repeats, and this could be responsible for the preferential mutability of the G-rich repetitive sequences. Electrophoretic mobility shift assays using NIH3T3 cell extracts revealed heterogeneous nuclear ribonucleoprotein (hnRNP) A1 and A3, which were tightly and specifically bound to d(GGCAG) and d(TTAGGG) repeats with Kd values in the order of nM. HnRNP A1 unfolded the G4' structure formed in the d(GGCAG)n and d(TTAGGG)n repeat regions, and also resolved the higher order structure formed by d(CGG) triplet repeats. Furthermore, DNA synthesis arrest at the secondary structures of d(GGCAG) repeats, telomeres and d(CGG) triplet repeats was efficiently repressed by the addition of hnRNP A1. High expression of hnRNPs may contribute to the maintenance of G-rich repetitive sequences, including telomere repeats, and may also participate in ensuring the stability of the genome in cells with enhanced proliferation. Transcriptional regulation of genes, such as c-myc and insulin, by G4 sequences found in the promoter regions could be an intriguing field of

  1. Mitochondrial DNA sequence variation in Greeks.

    Science.gov (United States)

    Kouvatsi, A; Karaiskou, N; Apostolidis, A; Kirmizidis, G

    2001-12-01

    Mitochondrial DNA (mtDNA) control region sequences were determined in 54 unrelated Greeks, coming from different regions in Greece, for both segments HVR-I and HVR-II. Fifty-two different mtDNA haplotypes were revealed, one of which was shared by three individuals. A very low heterogeneity was found among Greek regions. No one cluster of lineages was specific to individuals coming from a certain region. The average pairwise difference distribution showed a value of 7.599. The data were compared with that for other European or neighbor populations (British, French, Germans, Tuscans, Bulgarians, and Turks). The genetic trees that were constructed revealed homogeneity between Europeans. Median networks revealed that most of the Greek mtDNA haplotypes are clustered to the five known haplogroups and that a number of haplotypes are shared among Greeks and other European and Near Eastern populations.

  2. Analysis of expressed sequence tags from Prunus mume flower and fruit and development of simple sequence repeat markers

    Directory of Open Access Journals (Sweden)

    Gao Zhihong

    2010-07-01

    Background: Expressed Sequence Tag (EST) has been a cost-effective tool in molecular biology and represents an abundant valuable resource for genome annotation, gene expression, and comparative genomics in plants. Results: In this study, we constructed a cDNA library of Prunus mume flower and fruit, sequenced 10,123 clones of the library, and obtained 8,656 expressed sequence tag (EST) sequences with high quality. The ESTs were assembled into 4,473 unigenes composed of 1,492 contigs and 2,981 singletons, which have been deposited in NCBI (accession IDs: GW868575 - GW873047), among which 1,294 unique ESTs had known or putative functions. Furthermore, we found 1,233 putative simple sequence repeats (SSRs) in the P. mume unigene dataset. We randomly tested 42 pairs of PCR primers flanking potential SSRs, and 14 pairs were identified as true-to-type SSR loci and could amplify polymorphic bands from 20 individual plants of P. mume. We further used the 14 EST-SSR primer pairs to test the transferability on peach and plum. The result showed that nearly 89% of the primer pairs produced target PCR bands in the two species. A high level of marker polymorphism was observed in the plum species (65%) and a low level in the peach (46%), and the clustering analysis of the three species indicated that these SSR markers were useful in the evaluation of genetic relationships and diversity between and within the Prunus species. Conclusions: We have constructed the first cDNA library of P. mume flower and fruit, and our data provide sets of molecular biology resources for P. mume and other Prunus species. These resources will be useful for further study such as genome annotation, new gene discovery, gene functional analysis, molecular breeding, evolution and comparative genomics between Prunus species.
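
    As a rough illustration of how putative SSRs can be located in a set of unigene sequences, the sketch below scans a sequence for perfect tandem repeats with a regular expression. The function name find_ssrs, the motif lengths considered, and the minimum copy numbers are all illustrative assumptions; the abstract does not state which search criteria or software the study used.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    min_repeats maps motif length -> minimum number of copies. The default
    thresholds are hypothetical values chosen only for illustration.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ACAC"), so each locus is reported once.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy sequence with a (CA)7 and an (AAG)5 repeat.
example = "GGT" + "CA" * 7 + "TTC" + "AAG" * 5 + "CC"
print(find_ssrs(example))
```

    Primer design around each hit would then use the flanking sequence on either side of the reported start position; that step is outside the scope of this sketch.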

  3. MEME: discovering and analyzing DNA and protein sequence motifs.

    Science.gov (United States)

    Bailey, Timothy L; Williams, Nadya; Misleh, Chris; Li, Wilfred W

    2006-07-01

    MEME (Multiple EM for Motif Elicitation) is one of the most widely used tools for searching for novel 'signals' in sets of biological sequences. Applications include the discovery of new transcription factor binding sites and protein domains. MEME works by searching for repeated, ungapped sequence patterns that occur in the DNA or protein sequences provided by the user. Users can perform MEME searches via the web server hosted by the National Biomedical Computation Resource (http://meme.nbcr.net) and several mirror sites. Through the same web server, users can also access the Motif Alignment and Search Tool to search sequence databases for matches to motifs encoded in several popular formats. By clicking on buttons in the MEME output, users can compare the motifs discovered in their input sequences with databases of known motifs, search sequence databases for matches to the motifs and display the motifs in various formats. This article describes the freely accessible web server and its architecture, and discusses ways to use MEME effectively to find new sequence patterns in biological sequences and analyze their significance.

  4. Analysis of the trinucleotide CAG repeat from the human mitochondrial DNA polymerase gene in healthy and diseased individuals.

    Science.gov (United States)

    Rovio, A; Tiranti, V; Bednarz, A L; Suomalainen, A; Spelbrink, J N; Lecrenier, N; Melberg, A; Zeviani, M; Poulton, J; Foury, F; Jacobs, H T

    1999-01-01

    The human nuclear gene (POLG) for the catalytic subunit of mitochondrial DNA polymerase (DNA polymerase gamma) contains a trinucleotide CAG microsatellite repeat within the coding sequence. We have investigated the frequency of different repeat-length alleles in populations of diseased and healthy individuals. The predominant allele of 10 CAG repeats was found at a very similar frequency (approximately 88%) in both Finnish and ethnically mixed population samples, with homozygosity close to the equilibrium prediction. Other alleles of between 5 and 13 repeat units were detected, but no larger, expanded alleles were found. A series of 51 British myotonic dystrophy patients showed no significant variation from controls, indicating an absence of generalised CAG repeat instability. Patients with a variety of molecular lesions in mtDNA, including sporadic, clonal deletions, maternally inherited point mutations, autosomally transmitted mtDNA depletion and autosomal dominant multiple deletions showed no differences in POLG trinucleotide repeat-length distribution from controls. These findings rule out POLG repeat expansion as a common pathogenic mechanism in disorders characterised by mitochondrial genome instability.
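
    The "equilibrium prediction" for homozygosity mentioned above can be illustrated with a short Hardy-Weinberg calculation: under random mating, the expected homozygosity is the sum of squared allele frequencies. In the sketch below only the 0.88 frequency of the 10-repeat allele comes from the abstract; the frequencies assigned to the remaining alleles are hypothetical placeholders, since the abstract does not report them.

```python
def expected_homozygosity(freqs):
    """Hardy-Weinberg expected homozygosity: sum of squared allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return sum(p * p for p in freqs)

# 0.88 is the reported frequency of the 10-repeat allele; the split of the
# remaining 0.12 across other alleles is a made-up example.
freqs = [0.88, 0.08, 0.04]
print(round(expected_homozygosity(freqs), 4))

# The predominant allele alone gives a lower bound of 0.88**2 on the
# expected homozygosity, whatever the other frequencies are.
print(round(0.88 ** 2, 4))
```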

  5. Repetitive sequence analysis and karyotyping reveals centromere-associated DNA sequences in radish (Raphanus sativus L.).

    Science.gov (United States)

    He, Qunyan; Cai, Zexi; Hu, Tianhua; Liu, Huijun; Bao, Chonglai; Mao, Weihai; Jin, Weiwei

    2015-04-18

    Radish (Raphanus sativus L., 2n = 2x = 18) is a major root vegetable crop especially in eastern Asia. Radish root contains various nutrients which play an important role in strengthening immunity. Repetitive elements are primary components of the genomic sequence and the most important factors in genome size variations in higher eukaryotes. To date, studies about repetitive elements of radish are still limited. To better understand genome structure of radish, we undertook a study to evaluate the proportion of repetitive elements and their distribution in radish. We conducted genome-wide characterization of repetitive elements in radish with low coverage genome sequencing followed by similarity-based cluster analysis. Results showed that about 31% of the genome was composed of repetitive sequences. Satellite repeats were the most dominant elements of the genome. The distribution pattern of three satellite repeat sequences (CL1, CL25, and CL43) on radish chromosomes was characterized using fluorescence in situ hybridization (FISH). CL1 was predominantly located at the centromeric region of all chromosomes, CL25 was located at the subtelomeric region, and CL43 was a telomeric satellite. FISH signals of two satellite repeats, CL1 and CL25, together with 5S rDNA and 45S rDNA, provide useful cytogenetic markers to identify each individual somatic metaphase chromosome. The centromere-specific histone H3 (CENH3) has been used as a marker to identify centromere DNA sequences. One putative CENH3 (RsCENH3) was characterized and cloned from radish. Its deduced amino acid sequence shares high similarities to those of the CENH3s in Brassica species. An antibody against B. rapa CENH3 specifically stained radish centromeres. Immunostaining and chromatin immunoprecipitation (ChIP) tests with anti-BrCENH3 antibody demonstrated that both the centromere-specific retrotransposon (CR-Radish) and satellite repeat (CL1) are directly associated with RsCENH3 in radish. Proportions

  6. Assembly of Repeat Content Using Next Generation Sequencing Data

    Energy Technology Data Exchange (ETDEWEB)

    LaButti, Kurt; Kuo, Alan; Grigoriev, Igor; Copeland, Alex

    2014-03-17

    Organisms with repetitive genomes pose a challenge for short read assembly, and typically only unique regions and repeat regions shorter than the read length can be accurately assembled. Recently, we have been investigating the use of Pacific Biosciences reads for de novo fungal assembly. We will present an assessment of the quality and degree of repeat reconstruction possible in a fungal genome using long read technology. We will also compare differences in assembly of repeat content using short read and long read technology.

  7. Local Renyi entropic profiles of DNA sequences

    Directory of Open Access Journals (Sweden)

    Vinga Susana

    2007-10-01

    Background: In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results: The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. Conclusion: The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.
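
    To make the two ingredients named above more concrete, the following sketch computes Chaos Game Representation coordinates for a DNA string and then a Gaussian Parzen-window estimate of the quadratic (order-2) Rényi entropy of those points. It is only an illustration under stated assumptions: the corner assignment, the kernel width sigma, and the function names cgr_points and renyi_quadratic_entropy are choices made here, and the code does not reproduce the fractal kernel or the entropic profiles defined in the report (whose published implementation is in Matlab).

```python
import math

# One common corner assignment on the unit square; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(seq, start=(0.5, 0.5)):
    """Map each base to a point by moving halfway toward its corner.

    Raises KeyError for symbols other than A, C, G, T.
    """
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points

def renyi_quadratic_entropy(points, sigma=0.1):
    """Parzen-window estimate of the order-2 Renyi entropy in two dimensions.

    Each point carries an isotropic Gaussian kernel of variance sigma**2, so
    the integral of the squared density reduces to a pairwise sum of
    Gaussians of variance 2 * sigma**2.
    """
    n = len(points)
    norm = 1.0 / (4.0 * math.pi * sigma ** 2)
    total = 0.0
    for x1, y1 in points:
        for x2, y2 in points:
            d2 = (x1 - x2) ** 2 + (y1 - y2) ** 2
            total += norm * math.exp(-d2 / (4.0 * sigma ** 2))
    return -math.log(total / (n * n))

print(cgr_points("ACGT"))
# A homopolymer collapses toward one corner, so its estimated entropy is
# lower than that of a sequence visiting all four corners.
print(renyi_quadratic_entropy(cgr_points("A" * 40)))
print(renyi_quadratic_entropy(cgr_points("ACGT" * 10)))
```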

  8. Sequence-specific recognition of DNA nanostructures.

    Science.gov (United States)

    Rusling, David A; Fox, Keith R

    2014-05-15

    DNA is the most exploited biopolymer for the programmed self-assembly of objects and devices that exhibit nanoscale-sized features. One of the most useful properties of DNA nanostructures is their ability to be functionalized with additional non-nucleic acid components. The introduction of such a component is often achieved by attaching it to an oligonucleotide that is part of the nanostructure, or hybridizing it to single-stranded overhangs that extend beyond or above the nanostructure surface. However, restrictions in nanostructure design and/or the self-assembly process can limit the suitability of these procedures. An alternative strategy is to couple the component to a DNA recognition agent that is capable of binding to duplex sequences within the nanostructure. This offers the advantage that it requires little, if any, alteration to the nanostructure and can be achieved after structure assembly. In addition, since the molecular recognition of DNA can be controlled by varying pH and ionic conditions, such systems offer tunable properties that are distinct from simple Watson-Crick hybridization. Here, we describe methodology that has been used to exploit and characterize the sequence-specific recognition of DNA nanostructures, with the aim of generating functional assemblies for bionanotechnology and synthetic biology applications.

  9. Linker histone variant H1T targets rDNA repeats.

    Science.gov (United States)

    Tani, Ruiko; Hayakawa, Koji; Tanaka, Satoshi; Shiota, Kunio

    2016-04-02

    H1T is a linker histone H1 variant that is highly expressed at the primary spermatocyte stage through to the early spermatid stage of spermatogenesis. While the functions of the somatic types of H1 have been extensively investigated, the intracellular role of H1T is unclear. H1 variants specifically expressed in germ cells show low amino acid sequence homology to somatic H1s, which suggests that the functions or target loci of germ cell-specific H1T differ from those of somatic H1s. Here, we describe the target loci and function of H1T. H1T was expressed not only in the testis but also in tumor cell lines, mouse embryonic stem cells (mESCs), and some normal somatic cells. To elucidate the intracellular localization and target loci of H1T, fluorescent immunostaining and ChIP-seq were performed in tumor cells and mESCs. We found that H1T accumulated in nucleoli and predominantly targeted rDNA repeats, which differ from somatic H1 targets. Furthermore, by nuclease sensitivity assay and RT-qPCR, we showed that H1T repressed rDNA transcription by condensing chromatin structure. Imaging analysis indicated that H1T expression affected nucleolar formation. We concluded that H1T plays a role in rDNA transcription, by distinctively targeting rDNA repeats.

  10. New stopping criteria for segmenting DNA sequences

    CERN Document Server

    Li, W

    2001-01-01

    We propose a solution on the stopping criterion in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on Bayesian Information Criterion (BIC) in the model selection framework. When this stopping criterion is applied to a left telomere sequence of yeast Saccharomyces cerevisiae and the complete genome sequence of bacterium Escherichia coli, borders of biologically meaningful units were identified (e.g. subtelomeric units, replication origin, and replication terminus), and a more reasonable number of domains was obtained. We also introduce a measure called segmentation strength which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.
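
    A BIC-based stopping rule of this kind can be sketched as follows, assuming the segmentation itself is done by recursively cutting at the point of maximum Jensen-Shannon divergence between base compositions (the abstract does not name the divergence measure). A cut is accepted only when twice the log-likelihood gain, 2N·D_JS, exceeds the BIC penalty for the extra parameters. The penalty of 4 extra parameters (three free base frequencies plus the cut position) and the form of the segmentation strength used here are assumptions for illustration, as are all function names.

```python
import math
from collections import Counter

def entropy(counts, n):
    """Shannon entropy (in nats) of symbol counts over n symbols."""
    return -sum(c / n * math.log(c / n) for c in counts.values() if c)

def best_split(seq):
    """Return (index, D_JS) of the cut maximising Jensen-Shannon divergence."""
    n = len(seq)
    total = Counter(seq)
    h_all = entropy(total, n)
    left = Counter()
    best_i, best_d = None, -1.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        right = total - left  # Counter subtraction drops zero counts
        d = h_all - (i / n) * entropy(left, i) - ((n - i) / n) * entropy(right, n - i)
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d

def segmentation_strength(seq, extra_params=4):
    """Relative excess of 2*N*D_JS over the BIC penalty (assumed form)."""
    n = len(seq)
    i, d = best_split(seq)
    penalty = extra_params * math.log(n)
    return i, (2 * n * d - penalty) / penalty

def segment(seq, threshold=0.0, offset=0):
    """Recursively cut seq; stop when the strength is not above threshold."""
    if len(seq) < 2:
        return []
    i, s = segmentation_strength(seq)
    if s <= threshold:
        return []
    return (segment(seq[:i], threshold, offset)
            + [offset + i]
            + segment(seq[i:], threshold, offset + i))

print(segment("A" * 200 + "G" * 200))
```

    Raising threshold above zero keeps only the stronger boundaries, which is one way to read the abstract's remark that segmentation strength can control the delineation of large domains.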

  11. Recombination-independent recognition of DNA homology for repeat-induced point mutation.

    Science.gov (United States)

    Gladyshev, Eugene; Kleckner, Nancy

    2017-06-01

    Numerous cytogenetic observations have shown that homologous chromosomes (or individual chromosomal loci) can engage in specific pairing interactions in the apparent absence of DNA breakage and recombination, suggesting that canonical recombination-mediated mechanisms may not be the only option for sensing DNA/DNA homology. One proposed mechanism for such recombination-independent homology recognition involves direct contacts between intact double-stranded DNA molecules. The strongest in vivo evidence for the existence of such a mechanism is provided by the phenomena of homology-directed DNA modifications in fungi, known as repeat-induced point mutation (RIP, discovered in Neurospora crassa) and methylation-induced premeiotically (MIP, discovered in Ascobolus immersus). In principle, Neurospora RIP can detect the presence of gene-sized DNA duplications irrespective of their origin, underlying nucleotide sequence, coding capacity, or relative and absolute positions in the genome. Once detected, both sequence copies are altered by numerous cytosine-to-thymine (C-to-T) mutations that extend specifically over the duplicated region. We have recently shown that Neurospora RIP does not require MEI-3, the only RecA/Rad51 protein in this organism, consistent with a recombination-independent mechanism. Using an ultra-sensitive assay for RIP mutation, we have defined additional features of this process. We have shown that RIP can detect short islands of homology of only three base-pairs as long as many such islands are arrayed with a periodicity of 11 or 12 base-pairs along a pair of DNA molecules. While the presence of perfect homology is advantageous, it is not required: chromosomal segments with overall sequence identity of only 35-36% can still be recognized by RIP. Importantly, in order for this process to work efficiently, participating DNA molecules must be able to co-align along their lengths. Based on these findings, we have proposed a model, in which

  12. The influence of DNA sequence on epigenome-induced pathologies

    Directory of Open Access Journals (Sweden)

    Meagher Richard B

    2012-07-01

    Clear cause-and-effect relationships are commonly established between genotype and the inherited risk of acquiring human and plant diseases and aberrant phenotypes. By contrast, few such cause-and-effect relationships are established linking a chromatin structure (that is, the epitype) with the transgenerational risk of acquiring a disease or abnormal phenotype. It is not entirely clear how epitypes are inherited from parent to offspring as populations evolve, even though epigenetics is proposed to be fundamental to evolution and the likelihood of acquiring many diseases. This article explores the hypothesis that, for transgenerationally inherited chromatin structures, “genotype predisposes epitype”, and that epitype functions as a modifier of gene expression within the classical central dogma of molecular biology. Evidence for the causal contribution of genotype to inherited epitypes and epigenetic risk comes primarily from two different kinds of studies discussed herein. The first and direct method of research proceeds by the examination of the transgenerational inheritance of epitype and the penetrance of phenotype among genetically related individuals. The second approach identifies epitypes that are duplicated (as DNA sequences are duplicated) and evolutionarily conserved among repeated patterns in the DNA sequence. The body of this article summarizes particularly robust examples of these studies from humans, mice, Arabidopsis, and other organisms. The bulk of the data from both areas of research support the hypothesis that genotypes predispose the likelihood of displaying various epitypes, but for only a few classes of epitype. This analysis suggests that renewed efforts are needed in identifying polymorphic DNA sequences that determine variable nucleosome positioning and DNA methylation as the primary cause of inherited epigenome-induced pathologies. By contrast, there is very little evidence that DNA sequence directly

  13. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  14. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Directory of Open Access Journals (Sweden)

    Chun-Tien Chang

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  15. Mixed sequence reader: a program for analyzing DNA sequences with heterozygous base calling.

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  16. Evidence for integration of retroviral vectors in a novel human repeat sequence

    Energy Technology Data Exchange (ETDEWEB)

    Kurdi-Haidar, B.; Friedmann, T. [UCSD School of Medicine, La Jolla, CA (United States)

    1994-09-01

    Retroviruses have become attractive vehicles for the introduction of foreign genes into mammalian cells not only for gene therapy but also to serve as anchor points for long-range mapping purposes. The information relating to retroviral integration in mammalian cells is derived mostly from studies of rodent genomes. The absence of information regarding integration sites of murine-based retroviral vectors in human cells has prompted us to investigate the characteristics of integration sites in the human genome. We have constructed a Moloney murine leukemia virus-based retroviral vector that carries the pUC8 origin of replication and the chloramphenicol resistance gene to allow the rescue of the flanking genomic sequences in plasmid form. We have infected human primary fibroblasts and myoblasts with this retroviral vector and isolated independently transduced clones. Genomic DNA was obtained from independent clones and the genomic fragment carrying the provirus-host sequence boundary was isolated after digestion of the genomic DNA, circularization, and transformation by electroporation of E. coli C cells to chloramphenicol resistance. Restriction map and nucleotide sequence analysis of the rescued plasmids showed that a number of the clones shared the same integration site within the human genome. We have used the nucleotide sequence information about the human DNA adjacent to the 3′ LTR to design a PCR-based assay diagnostic for this common integration site. Analysis revealed the presence of the same integration site in four out of twelve human primary fibroblast clones infected with this specific retroviral vector, and in one out of twelve human primary myoblast clones infected with a second retroviral vector. Further analysis revealed the common integration site to be a previously unreported primate repeat present in monkey and human genomes and absent from rodent, bovine and avian genomes.

  17. Tandemly repeated DNA is a target for the partial replacement of thymine by beta-D-glucosyl-hydroxymethyluracil in Trypanosoma brucei.

    Science.gov (United States)

    van Leeuwen, F; Kieft, R; Cross, M; Borst, P

    2000-07-01

    In the DNA of African trypanosomes a small fraction of thymine is replaced by the modified base beta-D-glucosyl-hydroxymethyluracil (J). The function of this large base is unknown. The presence of J in the silent variant surface glycoprotein gene expression sites and the lack of J in the transcribed expression site indicates that DNA modification might play a role in control of gene repression. However, the abundance of J in the long telomeric repeat tracts and in subtelomeric arrays of simple repeats suggests that J may also have specific functions in repetitive DNA. We have now analyzed chromosome-internal repetitive sequences in the genome of Trypanosoma brucei and found J in the minichromosomal 177-bp repeats, in the long arrays of 5S RNA gene repeats, and in the spliced-leader RNA gene repeats. No J was found in the rDNA locus or in dispersed repetitive transposon-like elements. Remarkably, the rDNA of T. brucei is not organized in long arrays of tandem repeats, as in many other eukaryotes. T. brucei contains only approximately 15-20 rDNA repeat units that are divided over six to seven chromosomes. Our results show that J is present in many tandemly repeated sequences, either at a telomere or chromosome internal. The presence of J might help to stabilize the long arrays of repeats in the genome.

  18. Expressed sequence tags (ESTs) and simple sequence repeat (SSR) markers from octoploid strawberry (Fragaria × ananassa)

    Directory of Open Access Journals (Sweden)

    Bies Dawn H

    2005-06-01

    Background: Cultivated strawberry (Fragaria × ananassa) represents one of the most valued fruit crops in the United States. Despite its economic importance, the octoploid genome presents a formidable barrier to efficient study of genome structure and molecular mechanisms that underlie agriculturally-relevant traits. Many potentially fruitful research avenues, especially large-scale gene expression surveys and development of molecular genetic markers, have been limited by a lack of sequence information in public databases. As a first step to remedy this discrepancy a cDNA library has been developed from salicylate-treated, whole-plant tissues and over 1800 expressed sequence tags (ESTs) have been sequenced and analyzed. Results: A putative unigene set of 1304 sequences (133 contigs and 1171 singlets) has been developed, and the transcripts have been functionally annotated. Homology searches indicate that 89.5% of sequences share significant similarity to known/putative proteins or Rosaceae ESTs. The ESTs have been functionally characterized and genes relevant to specific physiological processes of economic importance have been identified. A set of tools useful for SSR development and mapping is presented. Conclusion: Sequences derived from this effort may be used to speed gene discovery efforts in Fragaria and the Rosaceae in general and also open avenues of comparative mapping. This report represents a first step in expanding molecular-genetic analyses in strawberry and demonstrates how computational tools can be used to optimally mine a large body of useful information from a relatively small data set.

  19. Nuclear Receptor HNF4α Binding Sequences are Widespread in Alu Repeats

    Directory of Open Access Journals (Sweden)

    Bolotin Eugene

    2011-11-01

    Background: Alu repeats, which account for ~10% of the human genome, were originally considered to be junk DNA. Recent studies, however, suggest that they may contain transcription factor binding sites and hence possibly play a role in regulating gene expression. Results: Here, we show that binding sites for a highly conserved member of the nuclear receptor superfamily of ligand-dependent transcription factors, hepatocyte nuclear factor 4alpha (HNF4α, NR2A1), are highly prevalent in Alu repeats. We employ high throughput protein binding microarrays (PBMs) to show that HNF4α binds > 66 unique sequences in Alu repeats that are present in ~1.2 million locations in the human genome. We use chromatin immunoprecipitation (ChIP) to demonstrate that HNF4α binds Alu elements in the promoters of target genes (ABCC3, APOA4, APOM, ATPIF1, CANX, FEMT1A, GSTM4, IL32, IP6K2, PRLR, PRODH2, SOCS2, TTR) and luciferase assays to show that at least some of those Alu elements can modulate HNF4α-mediated transactivation in vivo (APOM, PRODH2, TTR, APOA4). HNF4α-Alu elements are enriched in promoters of genes involved in RNA processing and a sizeable fraction are in regions of accessible chromatin. Comparative genomics analysis suggests that there may have been a gain in HNF4α binding sites in Alu elements during evolution and that non Alu repeats, such as Tiggers, also contain HNF4α sites. Conclusions: Our findings suggest that HNF4α, in addition to regulating gene expression via high affinity binding sites, may also modulate transcription via low affinity sites in Alu repeats.

  20. Simple sequence repeat markers useful for sorghum downy mildew (Peronosclerospora sorghi) and related species

    Directory of Open Access Journals (Sweden)

    Odvody Gary N

    2008-11-01

    Background: A recent outbreak of sorghum downy mildew in Texas has led to the discovery of both metalaxyl resistance and a new pathotype in the causal organism, Peronosclerospora sorghi. These observations and the difficulty in resolving among phylogenetically related downy mildew pathogens dramatically point out the need for simply scored markers in order to differentiate among isolates and species, and to study the population structure within these obligate oomycetes. Here we present the initial results from the use of a biotin capture method to discover, clone and develop PCR primers that permit the use of simple sequence repeats (microsatellites) to detect differences at the DNA level. Results: Among the 55 primer pairs designed from clones from pathotype 3 of P. sorghi, 36 flanked microsatellite loci containing simple repeats, including 28 (55%) with dinucleotide repeats and 6 (11%) with trinucleotide repeats. Microsatellites with CA/AC or GT/TG repeats were the most abundant (22 in total, 40%), and GA/AG or CT/TC types contributed 15% of our collection. When used to amplify DNA from 19 isolates from P. sorghi, as well as from 5 related species that cause downy mildew on other hosts, the number of different bands detected for each SSR primer pair using a LI-COR DNA Analyzer ranged from two to eight. Successful cross-amplification for 12 primer pairs studied in detail using DNA from downy mildews that attack maize (P. maydis & P. philippinensis), sugar cane (P. sacchari), pearl millet (Sclerospora graminicola) and rose (Peronospora sparsa) indicates that the flanking regions are conserved in all these species. A total of 15 SSR amplicons unique to P. philippinensis (one of the potential threats to US maize production) were detected, and these have potential for development of diagnostic tests. A total of 260 alleles were obtained using 54 microsatellite primer combinations, with an average of 4.8 polymorphic markers per SSR across 34

  1. Disruption of Higher Order DNA Structures in Friedreich's Ataxia (GAA)(n) Repeats by PNA or LNA Targeting

    DEFF Research Database (Denmark)

    Bergquist, Helen; Rocha, Cristina S. J.; Alvarez-Asencio, Ruben

    2016-01-01

    Expansion of (GAA)n repeats in the first intron of the Frataxin gene is associated with reduced mRNA and protein levels and the development of Friedreich’s ataxia. (GAA)n expansions form non-canonical structures, including intramolecular triplex (H-DNA), and R-loops and are associated with epigenetic modifications. With the aim of interfering with higher order H-DNA (like) DNA structures within pathological (GAA)n expansions, we examined sequence-specific interaction of peptide nucleic acid (PNA) with (GAA)n repeats of different lengths (short: n=9, medium: n=75 or long: n=115) by chemical probing of triple helical and single stranded regions. We found that a triplex structure (H-DNA) forms at GAA repeats of different lengths; however, single stranded regions were not detected within the medium size pathological repeat, suggesting the presence of a more complex structure. Furthermore, (GAA...

  2. Next-generation sequencing offers new insights into DNA degradation

    DEFF Research Database (Denmark)

    Overballe-Petersen, Søren; Orlando, Ludovic Antoine Alexandre; Willerslev, Eske

    2012-01-01

    The processes underlying DNA degradation are central to various disciplines, including cancer research, forensics and archaeology. The sequencing of ancient DNA molecules on next-generation sequencing platforms provides direct measurements of cytosine deamination, depurination and fragmentation r...

  3. Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences

    Science.gov (United States)

    2008-07-01

    ... coding bound on the rate of DNA codes is proved. To obtain the bound, we use some ensembles of DNA sequences which are generalizations of the Fibonacci sequences. Subject terms: DNA Codes, Fibonacci Ensembles, DNA Computing, Code Optimization. Dates covered: 6 Jul 08 – 11 Jul 08.

  4. Mitochondrial DNA sequence of Onychostoma rara.

    Science.gov (United States)

    Zeng, Chun-Fang; Li, Xiao-Ling; Li, Chuan-Wu; Huang, Xiang-Rong; Wan, Yi-Wen

    2015-01-01

    The complete mitochondrial genome sequence of Onychostoma rara was determined to be 16,590 bp in length and contains 13 protein-coding genes (PCGs), 22 tRNA genes, large (rrnL) and small (rrnS) rRNA genes and the non-coding control region. Its total A + T content is 55.65%. We also analyzed the structure of the control region, in which 6 CSBs (CSB-1, CSB-2, CSB-3, CSB-D, CSB-E and CSB-F) and a 2 bp tandem repeat were detected.

  5. An oligonucleotide hybridization approach to DNA sequencing.

    Science.gov (United States)

    Khrapko, K R; Lysov YuP; Khorlyn, A A; Shick, V V; Florentiev, V L; Mirzabekov, A D

    1989-10-09

    We have proposed a DNA sequencing method based on hybridization of a DNA fragment to be sequenced with the complete set of fixed-length oligonucleotides (e.g., 4^8 = 65,536 possible 8-mers) immobilized individually as dots of a 2-D matrix [(1989) Dokl. Akad. Nauk SSSR 303, 1508-1511]. It was shown that the list of hybridizing octanucleotides is sufficient for the computer-assisted reconstruction of the structures for 80% of random-sequence fragments up to 200 bases long, based on the analysis of the octanucleotide overlapping. Here a refinement of the method and some experimental data are presented. We have performed hybridizations with oligonucleotides immobilized on a glass plate, and obtained their dissociation curves down to heptanucleotides. Other approaches, e.g., an additional hybridization of short oligonucleotides which continuously extend duplexes formed between the fragment and immobilized oligonucleotides, should considerably increase either the probability of unambiguous reconstruction, or the length of reconstructed sequences, or decrease the size of immobilized oligonucleotides.
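    The overlap-based reconstruction described in this abstract can be illustrated with a minimal sketch. It is not the authors' program: it assumes an ideal, error-free k-mer list, a known starting k-mer, and it simply gives up when more than one extension is possible. The function names `spectrum` and `reconstruct` are hypothetical.

```python
from typing import Optional, Set


def spectrum(seq: str, k: int) -> Set[str]:
    """Set of all k-mers in seq, i.e. what an ideal hybridization array would report."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(kmers: Set[str], start: str) -> Optional[str]:
    """Extend from `start` by (k-1)-base overlaps.

    Returns the sequence if every step has exactly one possible extension
    and all k-mers are used, otherwise None (ambiguous or incomplete).
    """
    k = len(start)
    remaining = set(kmers) - {start}
    seq = start
    while remaining:
        suffix = seq[-(k - 1):]
        candidates = [m for m in remaining if m.startswith(suffix)]
        if len(candidates) != 1:
            return None  # branch point or dead end: no unambiguous reconstruction
        seq += candidates[0][-1]
        remaining.remove(candidates[0])
    return seq


if __name__ == "__main__":
    print(4 ** 8)  # size of the complete 8-mer set
    target = "ATGGCGTGCA"
    print(reconstruct(spectrum(target, 4), "ATGG"))  # unambiguous with 4-mers
    print(reconstruct(spectrum(target, 3), "ATG"))   # ambiguous with 3-mers
```

    In this toy case the 3-mer list branches after "TG" (both TGG and TGC fit), which mirrors the abstract's point that longer probes or additional hybridization steps raise the chance of an unambiguous reconstruction.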

  6. Simple sequence repeat marker development and genetic mapping in quinoa (Chenopodium quinoa Willd.)

    Indian Academy of Sciences (India)

    D. E. Jarvis; O. R. Kopp; E. N. Jellen; M. A. Mallory; J. Pattee; A. Bonifacio; C. E. Coleman; M. R. Stevens; D. J. Fairbanks; P. J. Maughan

    2008-04-01

    Quinoa is a regionally important grain crop in the Andean region of South America. Recently quinoa has gained international attention for its high nutritional value and tolerance of extreme abiotic stresses. DNA markers and linkage maps are important tools for germplasm conservation and crop improvement programmes. Here we report the development of 216 new polymorphic SSR (simple sequence repeat) markers from libraries enriched for GA, CAA and AAT repeats, as well as 6 SSR markers developed from bacterial artificial chromosome-end sequences (BES-SSRs). Heterozygosity (H) values of the SSR markers range from 0.12 to 0.90, with an average value of 0.57. A linkage map was constructed for a newly developed recombinant inbred line (RIL) population using these SSR markers. Additional markers, including amplified fragment length polymorphisms (AFLPs), two 11S seed storage protein loci, and the nucleolar organizing region (NOR), were also placed on the linkage map. The linkage map presented here is the first SSR-based map in quinoa and contains 275 markers, including 200 SSRs. The map consists of 38 linkage groups (LGs) covering 913 cM. Segregation distortion was observed in the mapping population for several marker loci, indicating possible chromosomal regions associated with selection or gametophytic lethality. As this map is based primarily on simple and easily transferable SSR markers, it will be particularly valuable for research in laboratories in Andean regions of South America.

  7. Simple sequence repeat marker development and genetic mapping in quinoa (Chenopodium quinoa Willd.).

    Science.gov (United States)

    Jarvis, D E; Kopp, O R; Jellen, E N; Mallory, M A; Pattee, J; Bonifacio, A; Coleman, C E; Stevens, M R; Fairbanks, D J; Maughan, P J

    2008-04-01

    Quinoa is a regionally important grain crop in the Andean region of South America. Recently quinoa has gained international attention for its high nutritional value and tolerance of extreme abiotic stresses. DNA markers and linkage maps are important tools for germplasm conservation and crop improvement programmes. Here we report the development of 216 new polymorphic SSR (simple sequence repeat) markers from libraries enriched for GA, CAA and AAT repeats, as well as 6 SSR markers developed from bacterial artificial chromosome-end sequences (BES-SSRs). Heterozygosity (H) values of the SSR markers range from 0.12 to 0.90, with an average value of 0.57. A linkage map was constructed for a newly developed recombinant inbred line (RIL) population using these SSR markers. Additional markers, including amplified fragment length polymorphisms (AFLPs), two 11S seed storage protein loci, and the nucleolar organizing region (NOR), were also placed on the linkage map. The linkage map presented here is the first SSR-based map in quinoa and contains 275 markers, including 200 SSRs. The map consists of 38 linkage groups (LGs) covering 913 cM. Segregation distortion was observed in the mapping population for several marker loci, indicating possible chromosomal regions associated with selection or gametophytic lethality. As this map is based primarily on simple and easily transferable SSR markers, it will be particularly valuable for research in laboratories in Andean regions of South America.

  8. DNA-labelled cytidine assay for the quantification of CAG repeats.

    Science.gov (United States)

    Pérez-Bello, Dannelys; Xu, Z H; Higginson-Clarke, David; Rojas, Ana María Riverón; Le, Weidong; Rodríguez-Tanty, Chryslaine

    2008-03-30

    The sequencing procedure has been used to determine the size of the CAG repeat expansion for the diagnosis of genetic disorders. Likewise, standard polymerase chain reaction (PCR) and gel electrophoresis techniques are applied for screening large numbers of patients. Amplification of the trinucleotide repeat (TNR) region by PCR was initially performed using 32P end-labelled primers and is currently carried out with fluorescently end-labelled primers. Obtaining reliable TNR quantification assays, at low cost and with short assay times, represents a challenge for molecular diagnosis aimed at massive screening of affected populations. In the current work, we obtained preliminary results of a new methodology for the detection and size estimation of CAG expanded alleles. The assay was based on an indirect enzyme linked immunosorbent assay (ELISA) for quantifying the amount of labelled cytidines in DNA molecules. The label, 6-(p-bromobenzamido)caproyl radical, was introduced by transamination and acylation reactions. A group of model sequences containing different numbers of CAG repeats, as well as the ATXN3 (ataxin 3) gene (from subjects suffering from type 3 spinocerebellar ataxia, SCA3), were used for assay standardization. The assay is simple, inexpensive, and easy to perform and differentiates distinct degrees of CAG expansions.

  9. Mutagenic roles of DNA "repair" proteins in antibody diversity and disease-associated trinucleotide repeat instability.

    Science.gov (United States)

    Slean, Meghan M; Panigrahi, Gagan B; Ranum, Laura P; Pearson, Christopher E

    2008-07-01

    While DNA repair proteins are generally thought to maintain the integrity of the whole genome by correctly repairing mutagenic DNA intermediates, there are cases where DNA "repair" proteins are involved in causing mutations instead. For instance, somatic hypermutation (SHM) and class switch recombination (CSR) require the contribution of various DNA repair proteins, including UNG, MSH2 and MSH6 to mutate certain regions of immunoglobulin genes in order to generate antibodies of increased antigen affinity and altered effector functions. Another instance where "repair" proteins drive mutations is the instability of gene-specific trinucleotide repeats (TNR), the causative mutations of numerous diseases including Fragile X mental retardation syndrome (FRAXA), Huntington's disease (HD), myotonic dystrophy (DM1) and several spinocerebellar ataxias (SCAs) all of which arise via various modes of pathogenesis. These healthy and deleterious mutations that are induced by repair proteins are distinct from the genome-wide mutations that arise in the absence of repair proteins: they occur at specific loci, are sensitive to cis-elements (sequence context and/or epigenetic marks) and transcription, occur in specific tissues during distinct developmental windows, and are age-dependent. Here we review and compare the mutagenic role of DNA "repair" proteins in the processes of SHM, CSR and TNR instability.

  10. Cloning and characterization of a repetitive DNA sequence specific for Trichomonas vaginalis.

    Science.gov (United States)

    Paces, J; Urbánková, V; Urbánek, P

    1992-09-01

    A family of 650-bp-long repeats from the Trichomonas vaginalis genome, designated the Tv-E650 family, was cloned and sequenced. The nucleotide sequence is A+T-rich (73.3% A+T in the consensus sequence) and highly conserved among the 8 molecular clones analyzed. The differences among the clones are single-nucleotide and 2-nucleotide substitutions and insertions or deletions. The sequence uniformity of the clones as well as the presence of identical mutations in different clones suggest that efficient sequence homogenization mechanisms, such as gene conversion or recurring unequal crossing-over, operate in T. vaginalis. The copy number of the Tv-E650 repeats was estimated to be about 10^2-10^3 per genome. Based on the DNA hybridization results, the Tv-E650 repeat family is conserved in all T. vaginalis strains examined, regardless of their diverse geographical origin. No hybridization of the Tv-E650 probe was found with the DNA from Trichomonas tenax, Trichomonas gallinae and Pentatrichomonas hominis, indicating that the Tv-E650 repeated sequences are species-specific. A dot blot hybridization protocol was developed which does not require isolation of DNA. By using this protocol it was possible to detect the DNA released from approximately 10^3 T. vaginalis cells per dot. These observations suggest that the Tv-E650 probe is potentially applicable to the identification and detection of T. vaginalis.

  11. Bisulfite sequencing reveals that Aspergillus flavus holds a hollow in DNA methylation.

    Directory of Open Access Journals (Sweden)

    Si-Yang Liu

    Aspergillus flavus first gained scientific attention for its production of aflatoxin. The underlying regulation of aflatoxin biosynthesis has served as a theoretical model for the biosynthesis of other microbial secondary metabolites. Nevertheless, for several decades the DNA methylation status of A. flavus, one of the important epigenomic modifications involved in gene regulation, has remained controversial. Here, we applied bisulfite sequencing in conjunction with a biological replicate strategy to investigate the DNA methylation profile of the A. flavus genome. Both the bisulfite sequencing data and the methylome comparisons with other fungi confirm that the DNA methylation level of this fungus is negligible. Further investigation into the DNA methyltransferase of Aspergillus uncovers its close relationship with RID-like enzymes as well as its divergence from the methyltransferases of species with validated DNA methylation. The lack of repeat content in the A. flavus genome and the high RIP-index of the small amount of remnant repeats potentially support our speculation that DNA methylation may be absent in A. flavus or that it may possess de novo DNA methylation which occurs very transiently during the obscure sexual stage of this fungal species. This work contributes to our understanding of the DNA methylation status of A. flavus and reinforces our views on DNA methylation in fungal species. In addition, our strategy of applying bisulfite sequencing to DNA methylation detection in species with low DNA methylation may serve as a reference for later scientific investigations in other hypomethylated species.

  12. Construction of libraries enriched for sequence repeats and jumping clones, and hybridization selection for region-specific markers

    Energy Technology Data Exchange (ETDEWEB)

    Kandpal, R.P.; Kandpal, G.; Weissman, S.M. (Yale Univ. School of Medicine, New Haven, CT (United States))

    1994-01-04

    The authors describe a simple and rapid method for constructing small-insert genomic libraries highly enriched for dimeric, trimeric, and tetrameric nucleotide repeat motifs. The approach involves use of DNA inserts recovered by PCR amplification of a small-insert sonicated genomic phage library or by a single-primer PCR amplification of Mbo I-digested and adaptor-ligated genomic DNA. The genomic DNA inserts are heat denatured and hybridized to a biotinylated oligonucleotide. The biotinylated hybrids are retained on a Vectrex-avidin matrix and eluted specifically. The eluate is PCR amplified and cloned. More than 90% of the clones in a library enriched for (CA)n microsatellites with this approach had inserts containing CA repeats. They have also used this protocol for enrichment of (CAG)n and (AGAT)n sequence repeats and for Not I jumping clones. They have used the enriched libraries with an adaptation of the cDNA selection method to enrich for repeat motifs encoded in yeast artificial chromosomes.

  13. Genome-wide identification and validation of simple sequence repeats (SSRs) from Asparagus officinalis.

    Science.gov (United States)

    Li, Shufen; Zhang, Guojun; Li, Xu; Wang, Lianjun; Yuan, Jinhong; Deng, Chuanliang; Gao, Wujun

    2016-06-01

    Garden asparagus (Asparagus officinalis), an important vegetable cultivated worldwide, can also serve as a model dioecious plant species in the study of sex determination and sex chromosome evolution. However, limited DNA marker resources have been developed and used for this species. To expand these resources, we examined the DNA sequences for simple sequence repeats (SSRs) in 163,406 scaffolds representing approximately 400 Mbp of the A. officinalis genome. A total of 87,576 SSRs were identified in 59,565 scaffolds. The most abundant SSR repeats were trinucleotide and tetranucleotide, accounting for 29.2 and 29.1% of the total SSRs, respectively, followed by di-, penta-, hexa-, hepta-, and octanucleotides. The AG motif was most common among dinucleotides and was also the most frequent motif in the entire A. officinalis genome, representing 14.7% of all SSRs. A total of 41,917 SSR primer pairs were designed to amplify SSRs. Twenty-two genomic SSR markers were tested in 39 asparagus accessions belonging to ten cultivars and one accession of Asparagus setaceus for determination of genetic diversity. The intra-species polymorphism information content (PIC) values of the 22 genomic SSR markers were intermediate, with an average of 0.41. The genetic diversity between the ten A. officinalis cultivars was low, and the UPGMA dendrogram was largely unrelated to cultivars. It is suggested here that the sex of individuals is an important factor influencing the clustering results. The information reported here provides new insight into the organization of the microsatellites in the A. officinalis genome and lays a foundation for further genetic studies and breeding applications of A. officinalis and related species.
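    As a rough illustration of the PIC statistic reported in this abstract, the sketch below computes PIC and the related expected heterozygosity from the allele frequencies at one locus. The abstract does not say which formula was used, so the widely cited definition of Botstein et al. (1980) is assumed here; the function names and the example frequencies are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def expected_heterozygosity(freqs: Sequence[float]) -> float:
    """H = 1 - sum(p_i^2) for allele frequencies p_i at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs: Sequence[float]) -> float:
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 (assumed Botstein et al. definition)."""
    homozygous = sum(p * p for p in freqs)
    uninformative = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - uninformative


if __name__ == "__main__":
    # Hypothetical locus with two equally frequent alleles.
    freqs = [0.5, 0.5]
    print(expected_heterozygosity(freqs))  # 0.5
    print(pic(freqs))                      # 0.375
```

    PIC is never larger than expected heterozygosity for the same locus, because it additionally discounts matings in which offspring genotypes are uninformative.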

  14. Agarose gel electrophoresis and polyacrylamide gel electrophoresis for visualization of simple sequence repeats.

    Science.gov (United States)

    Anderson, James; Wright, Drew; Meksem, Khalid

    2013-01-01

    In the modern age of genetic research there is a constant search for ways to improve the efficiency of plant selection. The most recent technology that can result in a highly efficient means of selection and still be done at a low cost is through plant selection directed by simple sequence repeats (SSRs or microsatellites). The molecular markers are used to select for certain desirable plant traits without relying on ambiguous phenotypic data. The best way to detect these is the use of gel electrophoresis. Gel electrophoresis is a common technique in laboratory settings which is used to separate deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) by size. Loading DNA and RNA onto gels allows for visualization of the size of fragments through the separation of DNA and RNA fragments. This is achieved through the use of the charge in the particles. As the fragments separate, they form into distinct bands at set sizes. We describe the ability to visualize SSRs on slab gels of agarose and polyacrylamide gel electrophoresis.

  15. Toxoplasma gondii: a bradyzoite-specific DnaK-tetratricopeptide repeat (DnaK-TPR) protein interacts with p23 co-chaperone protein.

    Science.gov (United States)

    Ueno, Akio; Dautu, George; Haga, Kaori; Munyaka, Biscah; Carmen, Gabriella; Kobayashi, Yoshiyasu; Igarashi, Makoto

    2011-04-01

    The DnaK-tetratricopeptide repeat (DnaK-TPR) gene (ToxoDB ID, TGME49_002020) is expressed predominantly at the bradyzoite stage. DnaK-TPR protein has heat shock protein (DnaK) and tetratricopeptide repeat (TPR) domains with amino acid sequence similarity to the counterparts of other organisms (40.2-43.7% to the DnaK domain and 41.1-66.0% to the TPR domain). These findings allowed us to infer that DnaK-TPR protein is important in tachyzoite-to-bradyzoite development or maintenance of cyst structure, although the function of this gene is still unknown. An immunofluorescence assay (IFA) revealed that DnaK-TPR protein was expressed in Toxoplasma gondii-encysted and in vitro-induced bradyzoites and distributed throughout the parasite cells. We conducted yeast two-hybrid screening to identify proteins interacting with DnaK-TPR protein, and demonstrated that DnaK-TPR protein interacts with p23 co-chaperone protein (Tgp23). It was expected that DnaK-TPR protein would function as a molecular chaperone in bradyzoite cells in association with Tgp23. Possible mechanisms for this gene are discussed.

  16. Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

    Science.gov (United States)

    Rehm, Charlotte; Wurmthaler, Lena A; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S

    2015-01-01

    In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1-5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6-9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4 forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However, a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria.
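    A scan for tandem GGGNATC units of the kind counted in this abstract can be sketched with a regular expression. This is only an illustration, not the authors' pipeline: it searches a single strand, requires perfect units, and reports runs of two or more units. The name `tandem_runs` and the test sequence are hypothetical.

```python
import re
from typing import List, Tuple

UNIT = r"GGG[ACGT]ATC"                       # one heptameric unit, N = any base
TANDEM = re.compile(rf"(?:{UNIT}){{2,}}")    # two or more consecutive units


def tandem_runs(seq: str) -> List[Tuple[int, int]]:
    """Return (start, n_units) for each run of >= 2 perfect GGGNATC units on this strand."""
    return [
        (m.start(), (m.end() - m.start()) // 7)
        for m in TANDEM.finditer(seq.upper())
    ]


if __name__ == "__main__":
    # Hypothetical sequence: a four-unit run, then a lone unit that is not reported.
    seq = "TT" + "GGGAATC" + "GGGTATC" + "GGGCATC" + "GGGGATC" + "AA" + "GGGAATC"
    print(tandem_runs(seq))  # [(2, 4)]
```

    A four-unit run such as the one above supplies the four consecutive G-tracts that the abstract links to G-quadruplex formation; covering the opposite strand would require scanning the reverse complement as well.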

  17. Modified Genetic Algorithm for DNA Sequence Assembly by Shotgun and Hybridization Sequencing Techniques

    Directory of Open Access Journals (Sweden)

    Prof. Narayan Kumar Sahu

    2012-09-01

    Since the advent of rapid DNA sequencing methods in 1976, scientists have had the problem of inferring DNA sequences from sequenced fragments. Shotgun sequencing is a well-established biological and computational method used in practice. Many conventional algorithms for shotgun sequencing are based on the notion of pairwise fragment overlap. While shotgun sequencing infers a DNA sequence given the sequences of overlapping fragments, a recent and complementary method, called sequencing by hybridization (SBH), infers a DNA sequence given the set of oligomers that represents all subwords of some fixed length, k. In this paper, we propose a new computer algorithm for DNA sequence assembly that combines in a novel way the techniques of both shotgun and SBH methods. Based on our preliminary investigations, the algorithm promises to be very fast and practical for DNA sequence assembly [1].

  18. Large cryptic internal sequence repeats in protein structures from Homo sapiens

    Indian Academy of Sciences (India)

    R Sarani; N A Udayaprakash; R Subashini; P Mridula; T Yamane; K Sekar

    2009-03-01

    Amino acid sequences are known to constantly mutate and diverge unless there is a limiting condition that makes such a change deleterious. However, closer examination of the sequence and structure reveals that a few large, cryptic repeats are nevertheless sequentially conserved. This leads to the question of why only certain repeats are conserved at the sequence level. It would be interesting to find out if these sequences maintain their conservation at the three-dimensional structure level. They can play an active role in protein and nucleotide stability, thus not only ensuring proper functioning but also potentiating malfunction and disease. Therefore, insights into any aspect of the repeats – be it structure, function or evolution – would prove to be of some importance. This study aims to address the relationship between protein sequence and its three-dimensional structure, by examining if large cryptic sequence repeats have the same structure.

  19. Nucleosome DNA sequence structure of isochores

    Directory of Open Access Journals (Sweden)

    Trifonov Edward N

    2011-04-01

    Background: Significant differences in G+C content between different isochore types suggest that the nucleosome positioning patterns in DNA of the isochores should be different as well. Results: Extraction of the patterns from the isochore DNA sequences by Shannon N-gram extension reveals that while the general motif YRRRRRYYYYYR is characteristic for all isochore types, the dominant positioning patterns of the isochores vary between TAAAAATTTTTA and CGGGGGCCCCCG due to the large differences in G+C composition. This is observed in human, mouse and chicken isochores, demonstrating that the variations of the positioning patterns are largely G+C dependent rather than species-specific. The species-specificity of nucleosome positioning patterns is revealed by dinucleotide periodicity analyses in isochore sequences. While human sequences show CG periodicity, chicken isochores display AG (CT) periodicity. Mouse isochores show only very weak CG periodicity. Conclusions: The nucleosome positioning pattern as revealed by Shannon N-gram extension is strongly dependent on G+C content and differs between isochores. Species-specificity of the pattern is subtle. It is reflected in the choice of preferentially periodical dinucleotides.

  20. Intermittency as a universal characteristic of the complete chromosome DNA sequences of eukaryotes: From protozoa to human genomes

    Science.gov (United States)

    Rybalko, S.; Larionov, S.; Poptsova, M.; Loskutov, A.

    2011-10-01

    Large-scale dynamical properties of complete chromosome DNA sequences of eukaryotes are considered. Using the proposed deterministic models with intermittency and symbolic dynamics we describe a wide spectrum of large-scale patterns inherent in these sequences, such as segmental duplications, tandem repeats, and other complex sequence structures. It is shown that the recently discovered gene number balance on the strands is not of a random nature, and certain subsystems of a complete chromosome DNA sequence exhibit the properties of deterministic chaos.

  1. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    Energy Technology Data Exchange (ETDEWEB)

    Winston Chen, C.H.; Taranenko, N.I.; Zhu, Y.F.; Chung, C.N.; Allman, S.L.

    1997-03-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, the authors recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. The preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step toward reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, the authors applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  2. Developmentally programmed excision of internal DNA sequences in Paramecium aurelia.

    Science.gov (United States)

    Gratias, A; Bétermier, M

    2001-01-01

    The development of a new somatic nucleus (macronucleus) during sexual reproduction of the ciliate Paramecium aurelia involves reproducible chromosomal rearrangements that affect the entire germline genome. Macronuclear development can be induced experimentally, which makes P. aurelia an attractive model for the study of the mechanism and the regulation of DNA rearrangements. Two major types of rearrangements have been identified: the fragmentation of the germline chromosomes, followed by the formation of the new macronuclear chromosome ends in association with imprecise DNA elimination, and the precise excision of internal eliminated sequences (IESs). All IESs identified so far are short, A/T rich and non-coding elements. They are flanked by a direct repeat of a 5'-TA-3' dinucleotide, a single copy of which remains at the macronuclear junction after excision. The number of these single-copy sequences has been estimated to be around 60,000 per haploid genome. This review focuses on the current knowledge about the genetic and epigenetic determinants of IES elimination in P. aurelia, the analysis of excision products, and the tightly regulated timing of excision throughout macronuclear development. Several models for the molecular mechanism of IES excision will be discussed in relation to those proposed for DNA elimination in other ciliates.

  3. [Mutation in microsatellite repeats of DNA and embryonal death in humans].

    Science.gov (United States)

    Nikitina, T V; Nazarenko, S A

    2000-07-01

    In the analysis of tetranucleotide DNA repeats inheritance carried out in 55 families with a history of spontaneous miscarriages and normal karyotypes in respect to 21 loci located on seven autosomes, 8 embryos (14.5%) demonstrating 12 cases of the presence of alleles absent in both parents were described. The study of chromosome segregation using other DNA markers permitted highly probable exclusion of false paternity as well as uniparental disomy as the reasons for parent/child allele mismatches. The high probability of paternity together with the presence of a "new" allele at any offspring locus points to the mutation having occurred during gametogenesis in one of the parents. Examination of mutation in spontaneous abortuses revealed an increased number of tandem repeat units at microsatellite loci in three cases and a decreased number of these repeats in six cases. In two abortuses, a third allele absent in both parents, which resulted from a somatic mutation that occurred during embryonic development, was observed. The prevalence of the male germline mutations, revealed during investigation of the mutation origin, was probably associated with an increased number of DNA replication cycles in sperm compared to the oocytes. In spontaneous abortuses, the mean mutation rate of the tetranucleotide repeat complexes analyzed was 9.8 × 10^-3 per locus per gamete per generation. This was about five times higher than the spontaneous mutation rate of these STR loci. It can be suggested that genome instability detected at the level of repeated DNA sequences can involve not only genetically neutral loci but also active genomic regions crucial for embryonic viability. This results in cell death and termination of embryonic development. Our findings indicate that the death of embryos with normal karyotypes in most cases is associated with an increased frequency of germline and somatic microsatellite mutations. The data of the present study also provide a practical tool for

  4. Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey

    Directory of Open Access Journals (Sweden)

    Varala Kranthi

    2007-05-01

    Full Text Available Abstract Background: Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional sequencing techniques are powerful, particularly for detecting sequences present in many copies per genome. However, these methods are time-consuming and have potential drawbacks. High throughput 454 sequencing provides an alternative method by which much information can be gained quickly and cheaply from high-coverage surveys of genomic DNA. Results: We sequenced 78 million base-pairs of randomly sheared soybean DNA which passed our quality criteria. Computational analysis of the survey sequences provided global information on the abundant repetitive sequences in soybean. The sequence was used to determine the copy number across regions of large genomic clones or contigs and discover higher-order structures within satellite repeats. We have created an annotated, online database of sequences present in multiple copies in the soybean genome. The low bias of pyrosequencing against repeat sequences is demonstrated by the overall composition of the survey data, which matches well with past estimates of repetitive DNA content obtained by DNA re-association kinetics (Cot analysis). Conclusion: This approach provides a potential aid to conventional or shotgun genome assembly, by allowing rapid assessment of copy number in any clone or clone-end sequence. In addition, we show that partial sequencing can provide access to partial protein-coding sequences.
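
    The copy-number idea in this record can be illustrated with a small calculation. The sketch below is not the authors' pipeline: it assumes copy number is estimated as the observed read depth over a region divided by the depth expected for a single-copy sequence, and all function names and the genome size used in the usage note are hypothetical.

```python
def expected_depth(total_survey_bases: float, genome_size: float) -> float:
    """Average depth a single-copy region would receive from a random survey."""
    return total_survey_bases / genome_size


def estimate_copy_number(
    bases_aligned_to_region: float,
    region_length: float,
    total_survey_bases: float,
    genome_size: float,
) -> float:
    """Observed depth over the region divided by the single-copy depth.

    Assumption: reads are sampled uniformly at random from the genome,
    so depth scales linearly with the number of genomic copies.
    """
    observed_depth = bases_aligned_to_region / region_length
    return observed_depth / expected_depth(total_survey_bases, genome_size)
```

    For example, with the 78 million surveyed bases reported above and an assumed genome size of 1.1e9 bp (a placeholder, not a figure from the abstract), `estimate_copy_number(7800, 1000, 78e6, 1.1e9)` gives the estimated number of copies of a 1 kb region that attracted 7,800 aligned bases.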

  5. Ginger DNA transposons in eukaryotes and their evolutionary relationships with long terminal repeat retrotransposons

    Directory of Open Access Journals (Sweden)

    Bao Weidong

    2010-01-01

    Full Text Available Abstract Background: In eukaryotes, long terminal repeat (LTR) retrotransposons such as Copia, BEL and Gypsy integrate their DNA copies into the host genome using a particular type of DDE transposase called integrase (INT). The Gypsy INT-like transposase is also conserved in the Polinton/Maverick self-synthesizing DNA transposons and in the 'cut and paste' DNA transposons known as TDD-4 and TDD-5. Moreover, it is known that INT is similar to bacterial transposases that belong to the IS3, IS481, IS30 and IS630 families. It has been suggested that LTR retrotransposons evolved from a non-LTR retrotransposon fused with a DNA transposon in early eukaryotes. In this paper we analyze a diverse superfamily of eukaryotic cut and paste DNA transposons coding for INT-like transposase and discuss their evolutionary relationship to LTR retrotransposons. Results: A new diverse eukaryotic superfamily of DNA transposons, named Ginger (for 'Gypsy INteGrasE Related') DNA transposons, is defined and analyzed. Analogously to the IS3 and IS481 bacterial transposons, the Ginger termini resemble those of the Gypsy LTR retrotransposons. Currently, Ginger transposons can be divided into two distinct groups named Ginger1 and Ginger2/Tdd. Elements from the Ginger1 group are characterized by approximately 40 to 270 base pair (bp) terminal inverted repeats (TIRs), and are flanked by CCGG-specific or CCGT-specific target site duplication (TSD) sequences. The Ginger1-encoded transposases contain an approximate 400 amino acid N-terminal portion sharing high amino acid identity to the entire Gypsy-encoded integrases, including the YPYY motif, zinc finger, DDE domain, and, importantly, the GPY/F motif, a hallmark of Gypsy and endogenous retrovirus (ERV) integrases. Ginger1 transposases also contain additional C-terminal domains: ovarian tumor (OTU)-like protease domain or Ulp1 protease domain. 
In vertebrate genomes, at least two host genes, which were previously thought to be derived from

  6. Sequence dependent hole evolution in DNA.

    Science.gov (United States)

    Lakhno, V D

    2004-06-01

    The paper examines the dynamical behavior of a radical cation (G(+*)) generated in a double stranded DNA for different oligonucleotide sequences. The resonance hole tunneling through an oligonucleotide sequence is studied by the method of numerical integration of self-consistent quantum-mechanical equations. The hole motion is considered quantum mechanically and nucleotide base oscillations are treated classically. The results obtained demonstrate a strong dependence of charge transfer on the type of nucleotide sequence. The rates of the hole transfer are calculated for different nucleotide sequences and compared with experimental data on the transfer from (G(+*)) to a GGG unit.

  7. Molecular characterization and physical localization of highly repetitive DNA sequences from Brazilian Alstroemeria species.

    Science.gov (United States)

    Kuipers, A G J; Kamstra, S A; de Jeu, M J; Visser, R G F

    2002-01-01

    Highly repetitive DNA sequences were isolated from genomic DNA libraries of Alstroemeria psittacina and A. inodora. Among the repetitive sequences that were isolated, tandem repeats as well as dispersed repeats could be discerned. The tandem repeats belonged to a family of interlinked Sau3A subfragments with sizes varying from 68-127 bp, and constituted a larger HinfI repeat of approximately 400 bp. Southern hybridization showed a similar molecular organization of the tandem repeats in each of the Brazilian Alstroemeria species tested. None of the repeats hybridized with DNA from Chilean Alstroemeria species, which indicates that they are specific for the Brazilian species. In-situ localization studies revealed the tandem repeats to be localized in clusters on the chromosomes of A. inodora and A. psittacina: distal hybridization sites were found on chromosome arms 2PS, 6PL, 7PS, 7PL and 8PL, interstitial sites on chromosome arms 2PL, 3PL, 4PL and 5PL. The applicability of the tandem repeats for cytogenetic analysis of interspecific hybrids and their role in heterochromatin organization are discussed.

  8. Transverse Electronic Signature of DNA for Electronic Sequencing

    Science.gov (United States)

    Xu, Mingsheng; Endres, Robert G.; Arakawa, Yasuhiko

    In recent years, the proliferation of large-scale DNA sequencing projects for applications in clinical medicine and health care has driven the search for new methods that could reduce the time and cost. The commonly used Sanger sequencing method relies on the chemistry to read the bases in DNA and is far too slow and expensive for reading personal genetic codes. There were earlier attempts to sequence DNA by directly visualizing the nucleotide composition of the DNA molecules by scanning tunneling microscopy (STM). However, sequencing DNA based on directly imaging DNA's atomic structure has not yet been successful. In Chap. 9, Xu, Endres, and Arakawa report a potential physical alternative by detecting unique transverse electronic signatures of DNA bases using ultrahigh vacuum STM. Supported by the principles, calculations and statistical analyses, these authors argue that it would be possible to directly sequence DNA by the STM-based technology without any modification of the DNA.

  9. A new DNA sequence assembly program.

    Science.gov (United States)

    Bonfield, J K; Smith, K F; Staden, R

    1995-01-01

    We describe the Genome Assembly Program (GAP), a new program for DNA sequence assembly. The program is suitable for large and small projects, a variety of strategies and can handle data from a range of sequencing instruments. It retains the useful components of our previous work, but includes many novel ideas and methods. Many of these methods have been made possible by the program's completely new, and highly interactive, graphical user interface. The program provides many visual clues to the current state of a sequencing project and allows users to interact in intuitive and graphical ways with their data. The program has tools to display and manipulate the various types of data that help to solve and check difficult assemblies, particularly those in repetitive genomes. We have introduced the following new displays: the Contig Selector, the Contig Comparator, the Template Display, the Restriction Enzyme Map and the Stop Codon Map. We have also made it possible to have any number of Contig Editors and Contig Joining Editors running simultaneously even on the same contig. The program also includes a new 'Directed Assembly' algorithm and routines for automatically detecting unfinished segments of sequence, to which it suggests experimental solutions. PMID:8559656

  10. Understanding Long-Range Correlations in DNA sequences

    CERN Document Server

    Li, W; Kaneko, K; Wentian Li; Thomas G Marr; Kunihiko Kaneko

    1994-01-01

    Abstract: In this paper, we review the literature on statistical long-range correlation in DNA sequences. We examine the current evidence for these correlations, and conclude that a mixture of many length scales (including some relatively long ones) in DNA sequences is responsible for the observed 1/f-like spectral component. We note the complexity of the correlation structure in DNA sequences. The observed complexity often makes it hard, or impossible, to decompose the sequence into a few statistically stationary regions. We suggest that, based on the complexity of DNA sequences, a fruitful approach to understand long-range correlation is to model duplication, and other rearrangement processes, in DNA sequences. One model, called "expansion-modification system", contains only point duplication and point mutation. Though simplistic, this model is able to generate sequences with 1/f spectra. We emphasize the importance of DNA duplication in its contribution to the observed long-range correlation in DNA sequen...
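
    A model built only from point duplication and point mutation is easy to simulate. The sketch below is one plausible reading of such a system, not a reproduction of the authors' definition: it assumes a binary alphabet, and that in each generation every symbol is either flipped (with probability p) or duplicated (otherwise). The function name and parameters are hypothetical.

```python
import random


def expansion_modification(seed: str, p: float, generations: int,
                           rng: random.Random) -> str:
    """Iterate a binary expansion-modification process.

    Assumed rules (illustrative only): each symbol mutates to the other
    symbol with probability p, and is duplicated otherwise.
    """
    flip = {"0": "1", "1": "0"}
    seq = seed
    for _ in range(generations):
        next_symbols = []
        for symbol in seq:
            if rng.random() < p:
                next_symbols.append(flip[symbol])   # point mutation
            else:
                next_symbols.append(symbol * 2)     # point duplication
        seq = "".join(next_symbols)
    return seq
```

    Calling `expansion_modification("0", 0.1, 12, random.Random(1))` yields a long binary string whose power spectrum could then be inspected; a small p keeps duplication dominant, which is the regime where long-range structure would be expected to build up.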

  11. Improved taboo search algorithm for designing DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Kai Zhang; Jin Xu; Xiutang Geng; Jianhua Xiao; Linqiang Pan

    2008-01-01

    The design of DNA sequences is one of the most practical and important research topics in DNA computing. We adopt the taboo search algorithm and improve the method for the systematic design of equal-length DNA sequences that satisfy certain combinatorial and thermodynamic constraints. Using the taboo search algorithm, our method can avoid becoming trapped in local optima and can find a set of good DNA sequences satisfying the required constraints.

  12. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    Science.gov (United States)

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  13. Stable DNA methylation boundaries and expanded trinucleotide repeats: role of DNA insertions.

    Science.gov (United States)

    Naumann, Anja; Kraus, Cornelia; Hoogeveen, André; Ramirez, Christina M; Doerfler, Walter

    2014-07-15

    The human genome segment upstream of the FMR1 (fragile X mental retardation 1) gene (Xq27.3) contains several genetic signals, among them is a DNA methylation boundary that is located 65-70 CpGs upstream of the CGG repeat. In fragile X syndrome (FXS), the boundary is lost, and the promoter is inactivated by methylation spreading. Here we document boundary stability in spite of critical expansions of the CGG trinucleotide repeat in male or female premutation carriers and in high functioning males (HFMs). HFMs carry a full CGG repeat expansion but exhibit an unmethylated promoter and lack the FXS phenotype. The boundary is also stable in Turner (45, X) females. A CTCF-binding site is located slightly upstream of the methylation boundary and carries a unique G-to-A polymorphism (single nucleotide polymorphism), which occurs 3.6 times more frequently in genomes with CGG expansions. The increased frequency of this single nucleotide polymorphism might have functional significance. In CGG expansions, the CTCF region does not harbor additional mutations. In FXS individuals and often in cells transgenomic for EBV (Epstein Barr Virus) DNA or for the telomerase gene, the large number of normally methylated CpGs in the far-upstream region of the boundary is decreased about 4-fold. A methylation boundary is also present in the human genome segment upstream of the HTT (huntingtin) promoter (4p16.3) and is stable both in normal and Huntington disease chromosomes. Hence, the vicinity of an expanded repeat does not per se compromise methylation boundaries. Methylation boundaries exert an important function as promoter safeguards.

  14. Minimum length of direct repeat sequences required for efficient homologous recombination induced by zinc finger nuclease in yeast.

    Science.gov (United States)

    Ren, ChongHua; Yan, Qiang; Zhang, ZhiYing

    2014-10-01

    Zinc finger nuclease (ZFN) technology is a powerful molecular tool for targeted genome modifications and genetic engineering. However, screening for specific ZFs and validation of ZFN activity are labor intensive and time consuming. We previously designed a yeast-based ZFN screening and validation system by inserting a ZFN binding site flanked by a 164 bp direct repeat sequence into the middle of a Gal4 transcription factor, disrupting the open reading frame of the yeast Gal4 gene. Expression of the ZFN causes a double stranded break at its binding site, which prompts the cellular DNA repair system to restore expression of a functional Gal4 transcription factor via homologous recombination. Expression of Gal4 transcription factor leads to activation of three reporter genes in an AH109 yeast two-hybrid strain. However, the 164 bp direct repeat appears to generate spontaneous homologous recombination frequently, resulting in many false positive ZFNs. To overcome this, a series of DNA fragments of various lengths, from 10 to 150 bp in 10 bp increments as well as 164 bp, serving as direct repeats flanking the ZFN binding site, were designed and constructed. The results demonstrated that the minimum length required for ZFN-induced homologous recombination was 30 bp, which almost eliminated spontaneous recombination. Using the 30 bp direct repeat sequence, ZFN could efficiently induce homologous recombination, while false positive ZFNs resulting from spontaneous homologous recombination were minimized. Thus, this study provided a simple, fast and sensitive ZFN screening and activity validation system in yeast.

  15. ProteDNA: a sequence-based predictor of sequence-specific DNA-binding residues in transcription factors

    OpenAIRE

    2009-01-01

    This article presents the design of a sequence-based predictor named ProteDNA for identifying the sequence-specific binding residues in a transcription factor (TF). Concerning protein–DNA interactions, there are two types of binding mechanisms involved, namely sequence-specific binding and nonspecific binding. Sequence-specific bindings occur between protein sidechains and nucleotide bases and correspond to sequence-specific recognition of genes. Therefore, sequence-specific bindings are esse...

  16. Role of direct repeat and stem-loop motifs in mtDNA deletions: cause or coincidence?

    Directory of Open Access Journals (Sweden)

    Lakshmi Narayanan Lakshmanan

    Full Text Available Deletion mutations within mitochondrial DNA (mtDNA) have been implicated in degenerative and aging related conditions, such as sarcopenia and neuro-degeneration. While the precise molecular mechanism of deletion formation in mtDNA is still not completely understood, genome motifs such as direct repeat (DR) and stem-loop (SL) have been observed in the neighborhood of deletion breakpoints and thus have been postulated to take part in mutagenesis. In this study, we have analyzed the mitochondrial genomes from four different mammals: human, rhesus monkey, mouse and rat, and compared them to randomly generated sequences to further elucidate the role of direct repeat and stem-loop motifs in aging associated mtDNA deletions. Our analysis revealed that in the four species, DR and SL structures are abundant and that their distributions in mtDNA are not statistically different from randomized sequences. However, the average distance between the reported age associated mtDNA breakpoints and their respective nearest DR motifs is significantly shorter than what is expected of random chance in human (p10 bp tend to decrease with increasing lifespan among the four mammals studied here, further suggesting an evolutionary selection against stable mtDNA misalignments associated with long DRs in long-living animals. In contrast to the results on DR, the probability of finding SL motifs near a deletion breakpoint does not differ from random in any of the four mtDNA sequences considered. Taken together, the findings in this study give support for the importance of stable mtDNA misalignments, aided by long DRs, as a major mechanism of deletion formation in long-living, but not in short-living mammals.
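
    The distance-to-nearest-repeat measurement described in this record can be sketched in a few lines. This is an illustrative approximation, not the authors' method: it assumes a direct repeat is any k-mer occurring at two or more positions on the same strand (overlapping occurrences included), and both function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def find_direct_repeats(seq: str, k: int) -> Dict[str, List[int]]:
    """Map each k-mer that occurs more than once to its start positions."""
    positions: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def nearest_repeat_distance(breakpoint: int,
                            repeats: Dict[str, List[int]],
                            k: int) -> Optional[int]:
    """Distance in bases from a breakpoint to the closest repeat copy.

    Returns 0 if the breakpoint falls inside a copy, None if there are
    no repeats at all.
    """
    best: Optional[int] = None
    for starts in repeats.values():
        for start in starts:
            end = start + k - 1
            if start <= breakpoint <= end:
                distance = 0
            else:
                distance = min(abs(breakpoint - start), abs(breakpoint - end))
            if best is None or distance < best:
                best = distance
    return best
```

    Running the same two functions on shuffled versions of the sequence would give a null distribution of distances against which observed breakpoints could be compared, in the spirit of the randomized-sequence comparison described above.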

  17. DNA Sequence Optimization Based on Continuous Particle Swarm Optimization for Reliable DNA Computing and DNA Nanotechnology

    Directory of Open Access Journals (Sweden)

    N. K. Khalid

    2008-01-01

    Full Text Available Problem statement: In DNA based computation and DNA nanotechnology, the design of good DNA sequences has turned out to be an essential problem and one of the most practical and important research topics. Basically, the DNA sequence design problem is a multi-objective problem and it can be evaluated using four objective functions, namely, H-measure, similarity, continuity and hairpin. Approach: There are several ways to solve a multi-objective problem; however, in order to evaluate the correctness of the PSO algorithm in DNA sequence design, this problem is converted into a single-objective problem. Particle Swarm Optimization (PSO) is proposed to minimize the objective in the problem, subject to two constraints: melting temperature and GC content. A model is developed to present the DNA sequence design based on PSO computation. Results: Based on the experiments performed, 20 particles are used in the implementation of the optimization process, where the average values and the standard deviation for 100 runs are shown along with a comparison to other existing methods. Conclusion: The results achieved verify that PSO can suitably solve the DNA sequence design problem using the proposed method and model, comparatively better than other approaches.
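
    The two constraints named in this record, GC content and melting temperature, can be checked with very little code. The abstract does not give the formulas used, so the sketch below makes its own assumptions: melting temperature is approximated with the Wallace rule (2 °C per A/T and 4 °C per G/C, a rough estimate for short oligonucleotides), and the longest single-base run is offered as a simple stand-in for a continuity-style measure. All names and default ranges are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def longest_run(seq: str) -> int:
    """Length of the longest stretch of one repeated base."""
    best = run = 0
    previous = ""
    for base in seq:
        run = run + 1 if base == previous else 1
        previous = base
        best = max(best, run)
    return best


def satisfies_constraints(seq: str,
                          gc_range=(0.4, 0.6),
                          tm_range=(50, 70)) -> bool:
    """Check the illustrative GC-content and Tm windows (placeholder values)."""
    return (gc_range[0] <= gc_content(seq) <= gc_range[1]
            and tm_range[0] <= wallace_tm(seq) <= tm_range[1])
```

    An optimizer such as PSO or taboo search would call checks like these on each candidate and either reject violating sequences or add a penalty to the objective; which of the two the authors used is not stated in the abstract.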

  18. Obesity-induced sperm DNA methylation changes at satellite repeats are reprogrammed in rat offspring

    Directory of Open Access Journals (Sweden)

    Neil A Youngson

    2016-01-01

    Full Text Available There is now strong evidence that the paternal contribution to offspring phenotype at fertilisation is more than just DNA. However, the identity and mechanisms of this nongenetic inheritance are poorly understood. One of the more important questions in this research area is: do changes in sperm DNA methylation have phenotypic consequences for offspring? We have previously reported that offspring of obese male rats have altered glucose metabolism compared with controls and that this effect was inherited through nongenetic means. Here, we describe investigations into sperm DNA methylation in a new cohort using the same protocol. Male rats on a high-fat diet were 30% heavier than control-fed males at the time of mating (16-19 weeks old, n = 14/14). A small (0.25%) increase in total 5-methyl-2′-deoxycytidine was detected in obese rat spermatozoa by liquid chromatography tandem mass spectrometry. Examination of the repetitive fraction of the genome with methyl-CpG binding domain protein-enriched genome sequencing (MBD-Seq) and pyrosequencing revealed that retrotransposon DNA methylation states in spermatozoa were not affected by obesity, but methylation at satellite repeats throughout the genome was increased. However, examination of muscle, liver, and spermatozoa from male 27-week-old offspring from obese and control fathers (both groups from n = 8 fathers) revealed that normal DNA methylation levels were restored during offspring development. Furthermore, no changes were found in three genomic imprints in obese rat spermatozoa. Our findings have implications for transgenerational epigenetic reprogramming. They suggest that postfertilization mechanisms exist for normalising some environmentally-induced DNA methylation changes in sperm cells.

  19. Unusual structures are present in DNA fragments containing super-long Huntingtin CAG repeats.

    Directory of Open Access Journals (Sweden)

    Daniel Duzdevich

    Full Text Available BACKGROUND: In the R6/2 mouse model of Huntington's disease (HD), expansion of the CAG trinucleotide repeat length beyond about 300 repeats induces a novel phenotype associated with a reduction in transcription of the transgene. METHODOLOGY/PRINCIPAL FINDINGS: We analysed the structure of polymerase chain reaction (PCR)-generated DNA containing up to 585 CAG repeats using atomic force microscopy (AFM). As the number of CAG repeats increased, an increasing proportion of the DNA molecules exhibited unusual structural features, including convolutions and multiple protrusions. At least some of these features are hairpin loops, as judged by cross-sectional analysis and sensitivity to cleavage by mung bean nuclease. Single-molecule force measurements showed that the convoluted DNA was very resistant to untangling. In vitro replication by PCR was markedly reduced, and TseI restriction enzyme digestion was also hindered by the abnormal DNA structures. However, significantly, the DNA gained sensitivity to cleavage by the Type III restriction-modification enzyme, EcoP15I. CONCLUSIONS/SIGNIFICANCE: "Super-long" CAG repeats are found in a number of neurological diseases and may also appear through CAG repeat instability. We suggest that unusual DNA structures associated with super-long CAG repeats decrease transcriptional efficiency in vitro. We also raise the possibility that if these structures occur in vivo, they may play a role in the aetiology of CAG repeat diseases such as HD.

  20. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    Full Text Available DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT), the bionic wavelet transform (BWT), for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA sequences. Results show that BWT performs better than traditional WT in presenting greater energy distribution. This new BWT method should be useful for the detection of the latent structural features in future DNA sequence analysis.
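
    The four-channel decomposition mentioned in this record is a standard way to turn a symbolic sequence into numbers. The sketch below shows only that first step plus a weighted recombination; the adaptive choice of weights and the wavelet transform itself are not reproduced here, and the function names and example weights are hypothetical.

```python
from typing import Dict, List


def indicator_sequences(seq: str) -> Dict[str, List[int]]:
    """Split a DNA string into four binary channels, one per base."""
    seq = seq.upper()
    return {base: [1 if symbol == base else 0 for symbol in seq]
            for base in "ACGT"}


def weighted_signal(seq: str, weights: Dict[str, float]) -> List[float]:
    """Combine the four channels into one numeric signal.

    The weights play the role of a symbol-to-number mapping; here they
    are supplied by the caller rather than adapted to the sequence.
    """
    channels = indicator_sequences(seq)
    return [sum(weights[base] * channels[base][i] for base in "ACGT")
            for i in range(len(seq))]
```

    The output of `weighted_signal` is an ordinary numeric series, so it could be passed to any wavelet or Fourier routine; changing the weights changes how much each base contributes to the transformed signal.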

  1. RePS: a sequence assembler that masks exact repeats identified from the shotgun data

    DEFF Research Database (Denmark)

    Wang, Jun; Wong, Gane Ka-Shu; Ni, Peixiang;

    2002-01-01

    We describe a sequence assembler, RePS (repeat-masked Phrap with scaffolding), that explicitly identifies exact 20mer repeats from the shotgun data and removes them prior to the assembly. The established software is used to compute meaningful error probabilities for each base. Clone-end-pairing i...
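
    The repeat-masking step described in this record can be sketched as a k-mer count followed by masking. This is a simplified illustration, not the RePS implementation: it assumes a k-mer is treated as a repeat once its count across all reads reaches a caller-chosen threshold, it ignores reverse complements, and the function names are hypothetical. RePS uses k = 20; the code leaves k as a parameter.

```python
from collections import Counter
from typing import Iterable


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count every k-mer across all reads (forward strand only)."""
    counts: Counter = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def mask_repeats(read: str, counts: Counter, k: int, threshold: int) -> str:
    """Replace bases covered by any over-represented k-mer with 'N'."""
    masked = list(read)
    for i in range(len(read) - k + 1):
        if counts[read[i:i + k]] >= threshold:
            for j in range(i, i + k):
                masked[j] = "N"
    return "".join(masked)
```

    A sensible threshold depends on sequencing depth, since a unique k-mer is expected to appear roughly as many times as the coverage; the masked reads would then be handed to the downstream assembler.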

  2. DNA methylation and triplet repeat stability: New proposals addressing actual questions on the CGG repeat of fragile X syndrome

    Energy Technology Data Exchange (ETDEWEB)

    Woehrle, D.; Schwemmle, S.; Steinbach, P. [Univ. of Ulm (Germany)

    1996-08-09

    Methylation of expanded CGG repeats in the FMR1 gene may well have different consequences. One is that methylation, extending into upstream regulatory elements, could lead to gene inactivation. Another effect of methylation, which we have obtained evidence for, could be stabilization of the repeat sequence and even prevention of premutations from expansion to full mutation. The full mutation of the fragile X syndrome probably occurs in an early transitional stage of embryonic development. The substrate is a maternally inherited premutation. The product usually is a mosaic pattern of full mutations detectable in early fetal life. These full mutation patterns are mitotically stable as, for instance, different somatic tissues of full mutation fetuses show identical mutation patterns. This raised the following questions: What triggers repeat expansion in that particular stage of development and what causes subsequent mitotic stability of expanded repeats? 21 refs., 1 fig.

  3. Noninvasive Prenatal Paternity Testing (NIPAT) through Maternal Plasma DNA Sequencing: A Pilot Study.

    Science.gov (United States)

    Jiang, Haojun; Xie, Yifan; Li, Xuchao; Ge, Huijuan; Deng, Yongqiang; Mu, Haofang; Feng, Xiaoli; Yin, Lu; Du, Zhou; Chen, Fang; He, Nongyue

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have already been used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we developed a noninvasive prenatal paternity test (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influencing factors (minor allele frequency (MAF), the total number of SNPs, fetal fraction and effective sequencing depth) and designed three different selective SNP panels in order to verify the performance in clinical cases. Combining targeted deep sequencing of selected SNPs with an informative bioinformatics pipeline, we calculated the combined paternity index (CPI) of 17 cases to determine paternity. Sequencing-based NIPAT results fully agreed with the invasive prenatal paternity test using an STR multiplex system. Our study proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future.

  4. Solid-Phase Purification of Synthetic DNA Sequences.

    Science.gov (United States)

    Grajkowski, Andrzej; Cieslak, Jacek; Beaucage, Serge L

    2016-08-05

    Although high-throughput methods for solid-phase synthesis of DNA sequences are currently available for synthetic biology applications and technologies for large-scale production of nucleic acid-based drugs have been exploited for various therapeutic indications, little has been done to develop high-throughput procedures for the purification of synthetic nucleic acid sequences. An efficient process for purification of phosphorothioate and native DNA sequences is described herein. This process consists of functionalizing commercial aminopropylated silica gel with aminooxyalkyl functions to enable capture of DNA sequences carrying a 5'-siloxyl ether linker with a "keto" function through an oximation reaction. Deoxyribonucleoside phosphoramidites functionalized with the 5'-siloxyl ether linker were prepared in yields of 75-83% and incorporated last into the solid-phase assembly of DNA sequences. Capture of nucleobase- and phosphate-deprotected DNA sequences released from the synthesis support is demonstrated to proceed near quantitatively. After shorter than full-length DNA sequences were washed from the capture support, the purified DNA sequences were released from this support upon treatment with tetra-n-butylammonium fluoride in dry DMSO. The purity of released DNA sequences exceeds 98%. The scalability and high-throughput features of the purification process are demonstrated without sacrificing purity of the DNA sequences.

  5. Discovering motifs in ranked lists of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Eran Eden

    2007-03-01

    Full Text Available Computational methods for discovery of sequence elements that are enriched in a target set compared with a background set are fundamental in molecular biology research. One example is the discovery of transcription factor binding motifs that are inferred from ChIP-chip (chromatin immuno-precipitation on a microarray) measurements. Several major challenges in sequence motif discovery still require consideration: (i) the need for a principled approach to partitioning the data into target and background sets; (ii) the lack of rigorous models and of an exact p-value for measuring motif enrichment; (iii) the need for an appropriate framework for accounting for motif multiplicity; (iv) the tendency, in many of the existing methods, to report presumably significant motifs even when applied to randomly generated data. In this paper we present a statistical framework for discovering enriched sequence elements in ranked lists that resolves these four issues. We demonstrate the implementation of this framework in a software application, termed DRIM (discovery of rank imbalanced motifs), which identifies sequence motifs in lists of ranked DNA sequences. We applied DRIM to ChIP-chip and CpG methylation data and obtained the following results. (i) Identification of 50 novel putative transcription factor (TF) binding sites in yeast ChIP-chip data. The biological function of some of them was further investigated to gain new insights on transcription regulation networks in yeast. For example, our discoveries enable the elucidation of the network of the TF ARO80. Another finding concerns a systematic TF binding enhancement to sequences containing CA repeats. (ii) Discovery of novel motifs in human cancer CpG methylation data. Remarkably, most of these motifs are similar to DNA sequence elements bound by the Polycomb complex that promotes histone methylation. Our findings thus support a model in which histone methylation and CpG methylation are mechanistically linked
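
    Enrichment at the top of a ranked list, without fixing a target/background cutoff in advance, can be scored by scanning every cutoff and keeping the most significant hypergeometric tail. The abstract does not name the statistic it uses, so the sketch below should be read as one possible approach under that assumption rather than as DRIM itself; the names are hypothetical, and the returned minimum is a score that would still need correction for the multiple cutoffs before being read as a p-value.

```python
from math import comb
from typing import List, Tuple


def hypergeom_tail(N: int, B: int, n: int, b: int) -> float:
    """P(at least b motif-containing items among the top n of N, B in total)."""
    total = comb(N, B)
    favourable = sum(comb(n, i) * comb(N - n, B - i)
                     for i in range(b, min(n, B) + 1))
    return favourable / total


def min_hypergeom_score(labels: List[int]) -> Tuple[float, int]:
    """Scan all cutoffs of a ranked 0/1 list.

    labels[i] is 1 if the i-th ranked sequence contains the motif.
    Returns the smallest tail probability and the cutoff achieving it.
    """
    N, B = len(labels), sum(labels)
    best_score, best_cutoff, hits = 1.0, 0, 0
    for n, label in enumerate(labels, start=1):
        hits += label
        score = hypergeom_tail(N, B, n, hits)
        if score < best_score:
            best_score, best_cutoff = score, n
    return best_score, best_cutoff
```

    The cutoff returned alongside the score indicates where the ranked list would be split into target and background sets for that motif, which is how a data-driven partition of the kind described in point (i) could arise.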

  6. Repetitive sequences in Eurasian lynx (Lynx lynx L.) mitochondrial DNA control region.

    Science.gov (United States)

    Sindičić, Magda; Gomerčić, Tomislav; Galov, Ana; Polanc, Primož; Huber, Duro; Slavica, Alen

    2012-06-01

    Mitochondrial DNA (mtDNA) control region (CR) of numerous species is known to include up to five different repetitive sequences (RS1-RS5) that are found at various locations, involving motifs of different length and extensive length heteroplasmy. Two repetitive sequences (RS2 and RS3) on opposite sides of mtDNA central conserved region have been described in domestic cat (Felis catus) and some other felid species. However, the presence of repetitive sequence RS3 has not been detected in Eurasian lynx (Lynx lynx) yet. We analyzed mtDNA CR of 35 Eurasian lynx (L. lynx L.) samples to characterize repetitive sequences and to compare them with those found in other felid species. We confirmed the presence of 80 base pairs (bp) repetitive sequence (RS2) at the 5' end of the Eurasian lynx mtDNA CR L strand and for the first time we described RS3 repetitive sequence at its 3' end, consisting of an array of tandem repeats five to ten bp long. We found that felid species share similar RS3 repetitive pattern and fundamental repeat motif TACAC.
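
    Arrays of a known repeat motif, such as the TACAC motif reported above, can be located with a regular expression. This is a minimal sketch that matches only perfect, uninterrupted copies of a motif supplied by the caller; it is not the procedure used in the study, and the function name is hypothetical.

```python
import re
from typing import List, Tuple


def find_tandem_arrays(seq: str, motif: str,
                       min_copies: int = 2) -> List[Tuple[int, int]]:
    """Return (start position, copy count) for each perfect tandem array."""
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(motif), min_copies))
    return [(match.start(), len(match.group()) // len(motif))
            for match in pattern.finditer(seq)]
```

    Because only exact copies are matched, a single substitution splits one array into two shorter hits; tolerating imperfect repeats would need an alignment-based method instead.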

  7. Analysis of simple sequence repeats in rice bean (Vigna umbellata) using an SSR-enriched library

    Institute of Scientific and Technical Information of China (English)

    Lixia Wang; Kyung Do Kim; Dongying Gao; Honglin Chen; Suhua Wang; SukHa Lee; Scott A. Jackson; Xuzhen Cheng

    2016-01-01

    Rice bean (Vigna umbellata Thunb.), a warm-season annual legume, is grown in Asia mainly for dried grain or fodder and plays an important role in human and animal nutrition because the grains are rich in protein and some essential fatty acids and minerals. With the aim of expediting the genetic improvement of rice bean, we initiated a project to develop genomic resources and tools for molecular breeding in this little-known but important crop. Here we report the construction of an SSR-enriched genomic library from DNA extracted from pooled young leaf tissues of 22 rice bean genotypes and the development of SSR markers. In 433,562 reads generated by a Roche 454 GS-FLX sequencer, we identified 261,458 SSRs, of which 48.8% were of compound form. Dinucleotide repeats were predominant with an absolute proportion of 81.6%, followed by trinucleotides (17.8%). Other types together accounted for 0.6%. The motif AC/GT accounted for 77.7% of the total, followed by AAG/CTT (14.3%), and all others accounted for 12.0%. Among the flanking sequences, 2928 matched putative genes or gene models in the protein database of Arabidopsis thaliana, corresponding with 608 non-redundant Gene Ontology terms. Of these sequences, 11.2% were involved in cellular components, 24.2% were involved in molecular functions, and 64.6% were associated with biological processes. Based on homolog analysis, 1595 flanking sequences were similar to mung bean and 500 to common bean genomic sequences. Comparative mapping was conducted using 350 sequences homologous to both mung bean and common bean sequences. Finally, a set of primer pairs was designed, and a validation test showed that 58 of 220 new primers can be used in rice bean and 53 can be transferred to mung bean. However, only 11 were polymorphic when tested on 32 rice bean varieties. We propose that this study lays the groundwork for developing novel SSR markers and will enhance the mapping of qualitative and quantitative traits and marker-assisted selection in
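
    Identifying di- and trinucleotide SSRs in reads, as described above, can be sketched with a regular expression. This is only an illustration: the abstract does not say which SSR-detection tool or thresholds were used, so the function name, the minimum of four repeat units and the unit-length range are assumptions.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=3, min_repeats=4):
    """Return (start, unit, repeat_count) for perfect tandem repeats in seq.

    Thresholds are illustrative assumptions, not the study's settings.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least min_repeats-1 copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif (e.g. AA, ACAC).
            if any(unit == unit[:d] * (unit_len // d)
                   for d in range(1, unit_len) if unit_len % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)
```

    The reported unit is whichever rotation of the motif the leftmost match starts on, and only one strand is scanned, so AC and GT repeats would need to be pooled afterwards to reproduce classes such as AC/GT.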

  8. Analysis of simple sequence repeats in rice bean (Vigna umbellata) using an SSR-enriched library

    Directory of Open Access Journals (Sweden)

    Lixia Wang

    2016-02-01

    Rice bean (Vigna umbellata Thunb.), a warm-season annual legume, is grown in Asia mainly for dried grain or fodder and plays an important role in human and animal nutrition because the grains are rich in protein and some essential fatty acids and minerals. With the aim of expediting the genetic improvement of rice bean, we initiated a project to develop genomic resources and tools for molecular breeding in this little-known but important crop. Here we report the construction of an SSR-enriched genomic library from DNA extracted from pooled young leaf tissues of 22 rice bean genotypes and the development of SSR markers. In 433,562 reads generated by a Roche 454 GS-FLX sequencer, we identified 261,458 SSRs, of which 48.8% were of compound form. Dinucleotide repeats were predominant with an absolute proportion of 81.6%, followed by trinucleotides (17.8%). Other types together accounted for 0.6%. The motif AC/GT accounted for 77.7% of the total, followed by AAG/CTT (14.3%), and all others accounted for 12.0%. Among the flanking sequences, 2928 matched putative genes or gene models in the protein database of Arabidopsis thaliana, corresponding with 608 non-redundant Gene Ontology terms. Of these sequences, 11.2% were involved in cellular components, 24.2% were involved in molecular functions, and 64.6% were associated with biological processes. Based on homolog analysis, 1595 flanking sequences were similar to mung bean and 500 to common bean genomic sequences. Comparative mapping was conducted using 350 sequences homologous to both mung bean and common bean sequences. Finally, a set of primer pairs was designed, and a validation test showed that 58 of 220 new primers can be used in rice bean and 53 can be transferred to mung bean. However, only 11 were polymorphic when tested on 32 rice bean varieties. We propose that this study lays the groundwork for developing novel SSR markers and will enhance the mapping of qualitative and quantitative traits and marker

  9. Repeat Associated Non-AUG Translation (RAN Translation) Dependent on Sequence Downstream of the ATXN2 CAG Repeat.

    Directory of Open Access Journals (Sweden)

    Daniel R Scoles

    Spinocerebellar ataxia type 2 (SCA2) is a progressive autosomal dominant disorder caused by the expansion of a CAG tract in the ATXN2 gene. The SCA2 disease phenotype is characterized by cerebellar atrophy, gait ataxia, and slow saccades. ATXN2 mutation causes gains of toxic and normal functions of the ATXN2 gene product, ataxin-2, and abnormally slow Purkinje cell firing frequency. Previously we investigated features of ATXN2 controlling expression and noted expression differences for ATXN2 constructs with varying CAG lengths, suggestive of repeat associated non-AUG translation (RAN translation). To determine whether RAN translation occurs for ATXN2 we assembled various ATXN2 constructs with ATXN2 tagged by luciferase, HA or FLAG tags, driven by the CMV promoter or the ATXN2 promoter. Luciferase expression from ATXN2-luciferase constructs lacking the ATXN2 start codon was weak vs AUG translation, regardless of promoter type, and did not increase with longer CAG repeat lengths. RAN translation was detected on western blots by the anti-polyglutamine antibody 1C2 for constructs driven by the CMV promoter but not the ATXN2 promoter, and was weaker than AUG translation. Strong RAN translation was also observed when driving the ATXN2 sequence with the CMV promoter with ATXN2 sequence downstream of the CAG repeat truncated to 18 bp in the polyglutamine frame but not in the polyserine or polyalanine frames. Our data demonstrate that ATXN2 RAN translation is weak compared to AUG translation and is dependent on ATXN2 sequences flanking the CAG repeat.

  10. The SIDER2 elements, interspersed repeated sequences that populate the Leishmania genomes, constitute subfamilies showing chromosomal proximity relationship

    Directory of Open Access Journals (Sweden)

    Thomas M Carmen

    2008-06-01

    Background: Protozoan parasites of the genus Leishmania are causative agents of a diverse spectrum of human diseases collectively known as leishmaniasis. These eukaryotic pathogens that diverged early from the main eukaryotic lineage possess a number of unusual genomic, molecular and biochemical features. The completion of the genome projects for three Leishmania species has generated invaluable information enabling a direct analysis of genome structure and organization. Results: By using DNA macroarrays, made with Leishmania infantum genomic clones and hybridized with total DNA from the parasite, we identified a clone containing a repeated sequence. An analysis of the recently completed genome sequence of L. infantum, using this repeated sequence as bait, led to the identification of a new class of repeated elements that are interspersed along the different L. infantum chromosomes. These elements turned out to be homologues of SIDER2 sequences, which were recently identified in the Leishmania major genome; thus, we adopted this nomenclature for the Leishmania elements described herein. Since SIDER2 elements are very heterogeneous in sequence, their precise identification is rather laborious. We have characterized 54 LiSIDER2 elements in chromosome 32 and 27 in chromosome 20. The mean size for these elements is 550 bp and their sequence is G+C rich (mean value of 66.5%). On the basis of sequence similarity, these elements can be grouped in subfamilies that show a remarkable relationship of proximity, i.e. SIDER2s of a given subfamily locate close in a chromosomal region without intercalating elements. For comparative purposes, we have identified the SIDER2 elements existing in L. major and Leishmania braziliensis chromosomes 32. While SIDER2 elements are highly conserved both in number and location between L. infantum and L. major, no such conservation exists when comparing with SIDER2s in L. braziliensis chromosome 32. Conclusion:

  11. Affinity purification of sequence-specific DNA binding proteins.

    OpenAIRE

    1986-01-01

    We describe a method for affinity purification of sequence-specific DNA binding proteins that is fast and effective. Complementary chemically synthesized oligodeoxynucleotides that contain a recognition site for a sequence-specific DNA binding protein are annealed and ligated to give oligomers. This DNA is then covalently coupled to Sepharose CL-2B with cyanogen bromide to yield the affinity resin. A partially purified protein fraction is combined with competitor DNA and subsequently passed t...

  12. Evolutionarily different alphoid repeat DNA on homologous chromosomes in human and chimpanzee.

    OpenAIRE

    Jørgensen, A L; Laursen, H B; Jones, C; Bak, A L

    1992-01-01

    Centromeric alphoid DNA in primates represents a class of evolving repeat DNA. In humans, chromosomes 13 and 21 share one subfamily of alphoid DNA while chromosomes 14 and 22 share another subfamily. We show that similar pairwise homogenizations occur in the chimpanzee (Pan troglodytes), where chromosomes 14 and 22, homologous to human chromosomes 13 and 21, share one partially homogenized alphoid DNA subfamily and chromosomes 15 and 23, homologous to human chromosomes 14 and 22, share anothe...

  13. The DNA sequence of the human X chromosome.

    Science.gov (United States)

    Ross, Mark T; Grafham, Darren V; Coffey, Alison J; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R; Burrows, Christine; Bird, Christine P; Frankish, Adam; Lovell, Frances L; Howe, Kevin L; Ashurst, Jennifer L; Fulton, Robert S; Sudbrak, Ralf; Wen, Gaiping; Jones, Matthew C; Hurles, Matthew E; Andrews, T Daniel; Scott, Carol E; Searle, Stephen; Ramser, Juliane; Whittaker, Adam; Deadman, Rebecca; Carter, Nigel P; Hunt, Sarah E; Chen, Rui; Cree, Andrew; Gunaratne, Preethi; Havlak, Paul; Hodgson, Anne; Metzker, Michael L; Richards, Stephen; Scott, Graham; Steffen, David; Sodergren, Erica; Wheeler, David A; Worley, Kim C; Ainscough, Rachael; Ambrose, Kerrie D; Ansari-Lari, M Ali; Aradhya, Swaroop; Ashwell, Robert I S; Babbage, Anne K; Bagguley, Claire L; Ballabio, Andrea; Banerjee, Ruby; Barker, Gary E; Barlow, Karen F; Barrett, Ian P; Bates, Karen N; Beare, David M; Beasley, Helen; Beasley, Oliver; Beck, Alfred; Bethel, Graeme; Blechschmidt, Karin; Brady, Nicola; Bray-Allen, Sarah; Bridgeman, Anne M; Brown, Andrew J; Brown, Mary J; Bonnin, David; Bruford, Elspeth A; Buhay, Christian; Burch, Paula; Burford, Deborah; Burgess, Joanne; Burrill, Wayne; Burton, John; Bye, Jackie M; Carder, Carol; Carrel, Laura; Chako, Joseph; Chapman, Joanne C; Chavez, Dean; Chen, Ellson; Chen, Guan; Chen, Yuan; Chen, Zhijian; Chinault, Craig; Ciccodicola, Alfredo; Clark, Sue Y; Clarke, Graham; Clee, Chris M; Clegg, Sheila; Clerc-Blankenburg, Kerstin; Clifford, Karen; Cobley, Vicky; Cole, Charlotte G; Conquer, Jen S; Corby, Nicole; Connor, Richard E; David, Robert; Davies, Joy; Davis, Clay; Davis, John; Delgado, Oliver; Deshazo, Denise; Dhami, Pawandeep; Ding, Yan; Dinh, Huyen; Dodsworth, Steve; Draper, Heather; Dugan-Rocha, Shannon; Dunham, Andrew; Dunn, Matthew; Durbin, K James; Dutta, Ireena; Eades, Tamsin; Ellwood, Matthew; Emery-Cohen, Alexandra; Errington, Helen; Evans, Kathryn L; Faulkner, Louisa; Francis, Fiona; Frankland, John; Fraser, Audrey 
E; Galgoczy, Petra; Gilbert, James; Gill, Rachel; Glöckner, Gernot; Gregory, Simon G; Gribble, Susan; Griffiths, Coline; Grocock, Russell; Gu, Yanghong; Gwilliam, Rhian; Hamilton, Cerissa; Hart, Elizabeth A; Hawes, Alicia; Heath, Paul D; Heitmann, Katja; Hennig, Steffen; Hernandez, Judith; Hinzmann, Bernd; Ho, Sarah; Hoffs, Michael; Howden, Phillip J; Huckle, Elizabeth J; Hume, Jennifer; Hunt, Paul J; Hunt, Adrienne R; Isherwood, Judith; Jacob, Leni; Johnson, David; Jones, Sally; de Jong, Pieter J; Joseph, Shirin S; Keenan, Stephen; Kelly, Susan; Kershaw, Joanne K; Khan, Ziad; Kioschis, Petra; Klages, Sven; Knights, Andrew J; Kosiura, Anna; Kovar-Smith, Christie; Laird, Gavin K; Langford, Cordelia; Lawlor, Stephanie; Leversha, Margaret; Lewis, Lora; Liu, Wen; Lloyd, Christine; Lloyd, David M; Loulseged, Hermela; Loveland, Jane E; Lovell, Jamieson D; Lozado, Ryan; Lu, Jing; Lyne, Rachael; Ma, Jie; Maheshwari, Manjula; Matthews, Lucy H; McDowall, Jennifer; McLaren, Stuart; McMurray, Amanda; Meidl, Patrick; Meitinger, Thomas; Milne, Sarah; Miner, George; Mistry, Shailesh L; Morgan, Margaret; Morris, Sidney; Müller, Ines; Mullikin, James C; Nguyen, Ngoc; Nordsiek, Gabriele; Nyakatura, Gerald; O'Dell, Christopher N; Okwuonu, Geoffery; Palmer, Sophie; Pandian, Richard; Parker, David; Parrish, Julia; Pasternak, Shiran; Patel, Dina; Pearce, Alex V; Pearson, Danita M; Pelan, Sarah E; Perez, Lesette; Porter, Keith M; Ramsey, Yvonne; Reichwald, Kathrin; Rhodes, Susan; Ridler, Kerry A; Schlessinger, David; Schueler, Mary G; Sehra, Harminder K; Shaw-Smith, Charles; Shen, Hua; Sheridan, Elizabeth M; Shownkeen, Ratna; Skuce, Carl D; Smith, Michelle L; Sotheran, Elizabeth C; Steingruber, Helen E; Steward, Charles A; Storey, Roy; Swann, R Mark; Swarbreck, David; Tabor, Paul E; Taudien, Stefan; Taylor, Tineace; Teague, Brian; Thomas, Karen; Thorpe, Andrea; Timms, Kirsten; Tracey, Alan; Trevanion, Steve; Tromans, Anthony C; d'Urso, Michele; Verduzco, Daniel; Villasana, Donna; 
Waldron, Lenee; Wall, Melanie; Wang, Qiaoyan; Warren, James; Warry, Georgina L; Wei, Xuehong; West, Anthony; Whitehead, Siobhan L; Whiteley, Mathew N; Wilkinson, Jane E; Willey, David L; Williams, Gabrielle; Williams, Leanne; Williamson, Angela; Williamson, Helen; Wilming, Laurens; Woodmansey, Rebecca L; Wray, Paul W; Yen, Jennifer; Zhang, Jingkun; Zhou, Jianling; Zoghbi, Huda; Zorilla, Sara; Buck, David; Reinhardt, Richard; Poustka, Annemarie; Rosenthal, André; Lehrach, Hans; Meindl, Alfons; Minx, Patrick J; Hillier, Ladeana W; Willard, Huntington F; Wilson, Richard K; Waterston, Robert H; Rice, Catherine M; Vaudin, Mark; Coulson, Alan; Nelson, David L; Weinstock, George; Sulston, John E; Durbin, Richard; Hubbard, Tim; Gibbs, Richard A; Beck, Stephan; Rogers, Jane; Bentley, David R

    2005-03-17

    The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.

  14. Solution properties of the archaeal CRISPR DNA repeat-binding homeodomain protein Cbp2

    DEFF Research Database (Denmark)

    Kenchappa, Chandra; Heiðarsson, Pétur Orri; Kragelund, Birthe;

    2013-01-01

    Clustered regularly interspaced short palindromic repeats (CRISPR) form the basis of diverse adaptive immune systems directed primarily against invading genetic elements of archaea and bacteria. Cbp1 of the crenarchaeal thermoacidophilic order Sulfolobales, carrying three imperfect repeats, binds...... specifically to CRISPR DNA repeats and has been implicated in facilitating production of long transcripts from CRISPR loci. Here, a second related class of CRISPR DNA repeat-binding protein, denoted Cbp2, is characterized that contains two imperfect repeats and is found amongst members of the crenarchaeal...... in facilitating high affinity DNA binding of Cbp2 by tethering the two domains. Structural studies on mutant proteins provide support for Cys(7) and Cys(28) enhancing high thermal stability of Cbp2(Hb) through disulphide bridge formation. Consistent with their proposed CRISPR transcriptional regulatory role, Cbp2...

  15. Solution properties of the archaeal CRISPR DNA repeat-binding homeodomain protein Cbp2

    DEFF Research Database (Denmark)

    Kenchappa, Chandra; Heiðarsson, Pétur Orri; Kragelund, Birthe

    2013-01-01

    Clustered regularly interspaced short palindromic repeats (CRISPR) form the basis of diverse adaptive immune systems directed primarily against invading genetic elements of archaea and bacteria. Cbp1 of the crenarchaeal thermoacidophilic order Sulfolobales, carrying three imperfect repeats, binds...... specifically to CRISPR DNA repeats and has been implicated in facilitating production of long transcripts from CRISPR loci. Here, a second related class of CRISPR DNA repeat-binding protein, denoted Cbp2, is characterized that contains two imperfect repeats and is found amongst members of the crenarchaeal...... in facilitating high affinity DNA binding of Cbp2 by tethering the two domains. Structural studies on mutant proteins provide support for Cys(7) and Cys(28) enhancing high thermal stability of Cbp2(Hb) through disulphide bridge formation. Consistent with their proposed CRISPR transcriptional regulatory role, Cbp2...

  16. A family of DNA repeats in Aspergillus nidulans has assimilated degenerated retrotransposons

    DEFF Research Database (Denmark)

    Nielsen, M.L.; Hermansen, T.D.; Aleksenko, Alexei Y.

    2001-01-01

    In the course of a chromosomal walk towards the centromere of chromosome IV of Aspergillus nidulans, several cross-hybridizing genomic cosmid clones were isolated. Restriction mapping of two such clones revealed that their restriction patterns were similar in a region of at least 15 kb, indicati...... the presence of a large repeat. The nature of the repeat was further investigated by sequencing and Southern analysis. The study revealed a family of long dispersed repeats with a high degree of sequence similarity. The number and location of the repeats vary between wild isolates. Two copies of the repeat......) phenomenon, first described in Neurospora crassa, may have operated in A. nidulans. The data indicate that this family of repeats has assimilated mobile elements that subsequently degenerated but then underwent further duplications as a part of the host repeats.

  17. Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

    Directory of Open Access Journals (Sweden)

    Charlotte Rehm

    In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1-5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6-9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4 forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria.
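
    Locating tandem arrays of the GGGNATC heptamer and counting their units, as the abstract describes, can be sketched as follows. The regular expression and function name are assumptions made for illustration; the sketch scans one strand only and ignores imperfect units, which the study's in silico analysis may have handled differently.

```python
import re

# Two or more back-to-back GGGNATC units, where N is any base.
ARRAY = re.compile(r"(?:GGG[ACGT]ATC){2,}")

def repeat_arrays(seq):
    """Return (start, unit_count) for each tandem GGGNATC array in seq."""
    return [(m.start(), len(m.group(0)) // 7)
            for m in ARRAY.finditer(seq.upper())]
```

    A histogram of the unit counts returned over a whole genome would be the natural way to look for the reported preference for four units.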

  18. Development and characterization of simple sequence repeats for Bipolaris sorokiniana and cross transferability to related species

    Science.gov (United States)

    Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n=384) harbored various SSR motifs. After eliminating the redundant seq...

  19. An ancient repeat sequence in the ATP synthase beta-subunit gene of forcipulate sea stars.

    Science.gov (United States)

    Foltz, David W

    2007-11-01

    A novel repeat sequence with a conserved secondary structure is described from two nonadjacent introns of the ATP synthase beta-subunit gene in sea stars of the order Forcipulatida (Echinodermata: Asteroidea). The repeat is present in both introns of all forcipulate sea stars examined, which suggests that it is an ancient feature of this gene (with an approximate age of 200 Mya). Both stem and loop regions show high levels of sequence constraint when compared to flanking nonrepetitive intronic regions. The repeat was also detected in (1) the family Pterasteridae, order Velatida and (2) the family Korethrasteridae, order Velatida. The repeat was not detected in (1) the family Echinasteridae, order Spinulosida, (2) the family Astropectinidae, order Paxillosida, (3) the family Solasteridae, order Velatida, or (4) the family Goniasteridae, order Valvatida. The repeat lacks similarity to published sequences in unrestricted GenBank searches, and there are no significant open reading frames in the repeat or in the flanking intron sequences. Comparison via parametric bootstrapping to a published phylogeny based on 4.2 kb of nuclear and mitochondrial sequence for a subset of these species allowed the null hypothesis of a congruent phylogeny to be rejected for each repeat, when compared separately to the published phylogeny. In contrast, the flanking nonrepetitive sequences in each intron yielded separate phylogenies that were each congruent with the published phylogeny. In four species, the repeat in one or both introns has apparently experienced gene conversion. The two introns also show a correlated pattern of nucleotide substitutions, even after excluding the putative cases of gene conversion.

  20. Nucleic acid (cDNA) and amino acid sequences of alpha-type gliadins from wheat (Triticum aestivum).

    Science.gov (United States)

    Kasarda, D D; Okita, T W; Bernardin, J E; Baecker, P A; Nimmo, C C; Lew, E J; Dietler, M D; Greene, F C

    1984-01-01

    The complete amino acid sequence for an alpha-type gliadin protein of wheat (Triticum aestivum Linnaeus) endosperm has been derived from a cloned cDNA sequence. An additional cDNA clone that corresponds to about 75% of a similar alpha-type gliadin has been sequenced and shows some important differences. About 97% of the composite sequence of A-gliadin (an alpha-type gliadin fraction) has also been obtained by direct amino acid sequencing. This sequence shows a high degree of similarity with amino acid sequences derived from both cDNA clones and is virtually identical to one of them. On the basis of sequence information, after loss of the signal sequence, the mature alpha-type gliadins may be divided into five different domains, two of which may have evolved from an ancestral gliadin gene, whereas the remaining three contain repeating sequences that may have developed independently. PMID: 6589619

  1. DNA Replication Dynamics of the GGGGCC Repeat of the C9orf72 Gene.

    Science.gov (United States)

    Thys, Ryan Griffin; Wang, Yuh-Hwa

    2015-11-27

    DNA has the ability to form a variety of secondary structures in addition to the normal B-form DNA, including hairpins and quadruplexes. These structures are implicated in a number of neurological diseases and cancer. Expansion of a GGGGCC repeat located at C9orf72 is associated with familial amyotrophic lateral sclerosis and frontotemporal dementia. This repeat expands from two to 24 copies in normal individuals to several hundreds or thousands of repeats in individuals with the disease. Biochemical studies have demonstrated that as few as four repeats have the ability to form a stable DNA secondary structure known as a G-quadruplex. Quadruplex structures have the ability to disrupt normal DNA processes such as DNA replication and transcription. Here we examine the role of GGGGCC repeat length and orientation on DNA replication using an SV40 replication system in human cells. Replication through GGGGCC repeats leads to a decrease in overall replication efficiency and an increase in instability in a length-dependent manner. Both repeat expansions and contractions are observed, and replication orientation is found to influence the propensity for expansions or contractions. The presence of replication stress, such as low-dose aphidicolin, diminishes replication efficiency but has no effect on instability. Two-dimensional gel electrophoresis analysis demonstrates a replication stall with as few as 20 GGGGCC repeats. These results suggest that replication of the GGGGCC repeat at C9orf72 is perturbed by the presence of expanded repeats, which has the potential to result in further expansion, leading to disease.

  2. Genome-wide analysis of tandem repeats in Tribolium castaneum genome reveals abundant and highly dynamic tandem repeat families with satellite DNA features in euchromatic chromosomal arms.

    Science.gov (United States)

    Pavlek, Martina; Gelfand, Yevgeniy; Plohl, Miroslav; Meštrović, Nevenka

    2015-12-01

    Although satellite DNAs are well-explored components of heterochromatin and centromeres, little is known about emergence, dispersal and possible impact of comparably structured tandem repeats (TRs) on the genome-wide scale. Our bioinformatics analysis of assembled Tribolium castaneum genome disclosed significant contribution of TRs in euchromatic chromosomal arms and clear predominance of satellite DNA-typical 170 bp monomers in arrays of ≥5 repeats. By applying different experimental approaches, we revealed that the nine most prominent TR families Cast1-Cast9 extracted from the assembly comprise ∼4.3% of the entire genome and reside almost exclusively in euchromatic regions. Among them, seven families that build ∼3.9% of the genome are based on ∼170 and ∼340 bp long monomers. Results of phylogenetic analyses of 2500 monomers originating from these families show high-sequence dynamics, evident by extensive exchanges between arrays on non-homologous chromosomes. In addition, our analysis shows that concerted evolution acts more efficiently on longer than on shorter arrays. Efficient genome-wide distribution of nine TR families implies the role of transposition only in expansion of the most dispersed family, and involvement of other mechanisms is anticipated. Despite similarities in sequence features, FISH experiments indicate high-level compartmentalization of centromeric and euchromatic tandem repeats.

  3. SWORDS: A statistical tool for analysing large DNA sequences

    Indian Academy of Sciences (India)

    Probal Chaudhuri; Sandip Das

    2002-02-01

    In this article, we present some simple yet effective statistical techniques for analysing and comparing large DNA sequences. These techniques are based on frequency distributions of DNA words in a large sequence, and have been packaged into a software package called SWORDS. Using sequences available in public domain databases on the Internet, we demonstrate how SWORDS can be conveniently used by molecular biologists and geneticists to unmask biologically important features hidden in large sequences and assess their statistical significance.
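
    The notion of a frequency distribution of DNA words can be made concrete with a short sketch. It is not the SWORDS implementation, whose statistics the abstract does not describe; the function names and the choice of half the summed absolute difference as a distance between two distributions are assumptions.

```python
from collections import Counter

def word_frequencies(seq, k):
    """Relative frequency of every overlapping k-letter word in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def frequency_distance(f1, f2):
    """Half the summed absolute difference between two word distributions.

    0.0 means identical distributions, 1.0 means no word in common.
    This distance is an illustrative choice only.
    """
    words = set(f1) | set(f2)
    return 0.5 * sum(abs(f1.get(w, 0.0) - f2.get(w, 0.0)) for w in words)
```

    Comparing two sequences then amounts to calling `word_frequencies` on each with the same word length and passing the results to `frequency_distance`.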

  4. Quality Control of Isothermal Amplified DNA Based on Short Tandem Repeat Analysis.

    Science.gov (United States)

    Kroneis, Thomas; El-Heliebi, Amin

    2015-01-01

    This protocol describes the use of a 16plex PCR for the purpose of assessing DNA quality after isothermal whole genome amplification (WGA). In short, DNA products generated by multiple displacement amplification are forwarded to a PCR targeting 15 short tandem repeats (STR) as well as amelogenin, generating up to 32 different PCR products. After amplification, the PCR products are separated via capillary electrophoresis and analyzed based on the obtained DNA profiles. Isothermal WGA products of good DNA quality will result in DNA profiles with efficiencies of >90% of the full DNA profile.
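
    The efficiency figure mentioned above can be read as the fraction of expected alleles that are recovered after amplification. The sketch below assumes that reading; the dictionary layout, function name and example genotype are hypothetical and are not taken from the protocol.

```python
def profile_efficiency(reference, observed):
    """Fraction of reference alleles that also appear in the observed profile.

    reference, observed: dicts mapping a locus name to a collection of alleles.
    Assumption: efficiency = detected reference alleles / expected alleles.
    """
    expected = sum(len(set(alleles)) for alleles in reference.values())
    detected = sum(len(set(alleles) & set(observed.get(locus, ())))
                   for locus, alleles in reference.items())
    return detected / expected

# Illustrative three-locus example (a real 16plex profile has 16 loci).
reference = {"D3S1358": {15, 16}, "TH01": {6, 9.3}, "AMEL": {"X", "Y"}}
observed = {"D3S1358": {15, 16}, "TH01": {6}, "AMEL": {"X", "Y"}}
efficiency = profile_efficiency(reference, observed)
passes_qc = efficiency > 0.90   # threshold quoted in the abstract
```

    In this made-up example one TH01 allele has dropped out, so five of six alleles are recovered and the sample would fall below the 90% mark.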

  5. Yeast DNA sequences initiating gene expression in Escherichia coli.

    Science.gov (United States)

    Lewin, Astrid; Tran, Thi Tuyen; Jacob, Daniela; Mayer, Martin; Freytag, Barbara; Appel, Bernd

    2004-01-01

    DNA transfer between pro- and eukaryotes occurs either during natural horizontal gene transfer or as a result of the employment of gene technology. We analysed the capacity of DNA sequences from a eukaryotic donor organism (Saccharomyces cerevisiae) to serve as promoter region in a prokaryotic recipient (Escherichia coli) by creating fusions between promoterless luxAB genes from Vibrio harveyi and random DNA sequences from S. cerevisiae and measuring the luminescence of transformed E. coli. Fifty-four out of 100 randomly analysed S. cerevisiae DNA sequences caused considerable gene expression in E. coli. Determination of transcription start sites within six selected yeast sequences in E. coli confirmed the existence of bacterial -10 and -35 consensus sequences at appropriate distances upstream from transcription initiation sites. Our results demonstrate that the probability of transcription of transferred eukaryotic DNA in bacteria is extremely high and does not require the insertion of the transferred DNA behind a promoter of the recipient genome.

  6. Unusual structure of ribosomal DNA in the copepod Tigriopus californicus: intergenic spacer sequences lack internal subrepeats.

    Science.gov (United States)

    Burton, R S; Metz, E C; Flowers, J M; Willett, C S

    2005-01-03

    Eukaryotic nuclear ribosomal DNA (rDNA) is typically arranged as a series of tandem repeats coding for 18S, 5.8S, and 28S ribosomal RNAs. Transcription of rDNA repeats is initiated in the intergenic spacer (IGS) region upstream of the 18S gene. The IGS region itself typically consists of a set of subrepeats that function as transcriptional enhancers. Two important evolutionary forces have been proposed to act on the IGS region: first, selection may favor changes in the number of subrepeats that adaptively adjust rates of rDNA transcription, and second, coevolution of IGS sequence with RNA polymerase I transcription factors may lead to species specificity of the rDNA transcription machinery. To investigate the potential role of these forces on population differentiation and hybrid breakdown in the intertidal copepod Tigriopus californicus, we have characterized the rDNA of five T. californicus populations from the Pacific Coast of North America and one sample of T. brevicornicus from Scotland. Major findings are as follows: (1) the structural genes for 18S and 28S are highly conserved across T. californicus populations, in contrast to other nuclear and mitochondrial DNA (mtDNA) genes previously studied in these populations. (2) There is extensive differentiation among populations in the IGS region; in the extreme, no homology is observed across the IGS sequences (>2 kb) from the two Tigriopus species. (3) None of the Tigriopus IGS sequences have the subrepeat structure common to other eukaryotic IGS regions. (4) Segregation of rDNA in laboratory crosses indicates that rDNA is located on at least two separate chromosomes in T. californicus. These data suggest that although IGS length polymorphism does not appear to play the adaptive role hypothesized in some other eukaryotic systems, sequence divergence in the rDNA promoter region within the IGS could lead to population specificity of transcription in hybrids.

  7. DNA Sequence Determination by Hybridization: A Strategy for Efficient Large-Scale Sequencing

    Science.gov (United States)

    Drmanac, R.; Drmanac, S.; Strezoska, Z.; Paunesku, T.; Labat, I.; Zeremski, M.; Snoddy, J.; Funkhouser, W. K.; Koop, B.; Hood, L.; Crkvenjakov, R.

    1993-06-01

    The concept of sequencing by hybridization (SBH) makes use of an array of all possible n-nucleotide oligomers (n-mers) to identify n-mers present in an unknown DNA sequence. Computational approaches can then be used to assemble the complete sequence. As a validation of this concept, the sequences of three DNA fragments, 343 base pairs in length, were determined with octamer oligonucleotides. Possible applications of SBH include physical mapping (ordering) of overlapping DNA clones, sequence checking, DNA fingerprinting comparisons of normal and disease-causing genes, and the identification of DNA fragments with particular sequence motifs in complementary DNA and genomic libraries. The SBH techniques may accelerate the mapping and sequencing phases of the human genome project.
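
    The assembly step can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names `kmer_spectrum` and `reconstruct` and the 12-base toy target are assumptions made for illustration, and the greedy extension only succeeds when every (n-1)-base overlap in the target is unique. The abstract used octamers; the toy example uses n = 4 to stay readable.

```python
def kmer_spectrum(seq, k):
    """Return the set of all k-mers present in seq (what an ideal array would report)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(spectrum, start):
    """Greedily extend from a known starting k-mer using (k-1)-base overlaps.

    Stops as soon as the next k-mer is missing or ambiguous, so the result
    may be shorter than the true sequence when overlaps repeat.
    """
    k = len(start)
    remaining = set(spectrum) - {start}
    seq = start
    while True:
        suffix = seq[len(seq) - (k - 1):]
        candidates = [m for m in remaining if m.startswith(suffix)]
        if len(candidates) != 1:
            break  # dead end or branch point: no unique extension
        seq += candidates[0][-1]
        remaining.remove(candidates[0])
    return seq


if __name__ == "__main__":
    target = "ATGCGTACCTGA"  # hypothetical sequence with unique 3-base overlaps
    spectrum = kmer_spectrum(target, 4)
    print(reconstruct(spectrum, target[:4]))
```

    Repeated overlaps create branch points, which is why real SBH assembly needs more than this greedy walk; the sketch simply stops at the first ambiguity.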

  8. Novel multiplex format of an extended multilocus variable-number-tandem-repeat analysis of Clostridium difficile correlates with tandem repeat sequence typing.

    Science.gov (United States)

    Jensen, Mie Birgitte Frid; Engberg, Jørgen; Larsson, Jonas T; Olsen, Katharina E P; Torpdahl, Mia

    2015-03-01

    Subtyping of Clostridium difficile is crucial for outbreak investigations. An extended multilocus variable-number tandem-repeat analysis (eMLVA) of 14 variable number tandem repeat (VNTR) loci was validated in multiplex format compatible with a routine typing laboratory and showed excellent concordance with tandem repeat sequence typing (TRST) and high discriminatory power.

  9. A novel constraint for thermodynamically designing DNA sequences.

    Directory of Open Access Journals (Sweden)

    Qiang Zhang

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.

  10. A novel constraint for thermodynamically designing DNA sequences.

    Science.gov (United States)

    Zhang, Qiang; Wang, Bin; Wei, Xiaopeng; Zhou, Changjun

    2013-01-01

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.
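
    For orientation, the Hamming-distance constraint that the abstract contrasts with its thermodynamic constraint can be sketched in a few lines of Python. The function names and the example sequences are assumptions for illustration; the sketch does not attempt the free energy gap or the genetic algorithm described above.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def satisfies_min_distance(seqs, d):
    """True if every pair of sequences in the set differs in at least d positions."""
    return all(hamming(a, b) >= d for a, b in combinations(seqs, 2))


if __name__ == "__main__":
    candidate_set = ["ACGTAC", "TGCATG", "AATTCC"]  # hypothetical design set
    print(satisfies_min_distance(candidate_set, 4))
```

    A purely combinatorial check like this ignores base-pairing energies, which is the limitation the abstract raises.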

  11. Direct and inverted repeats elicit genetic instability by both exploiting and eluding DNA double-strand break repair systems in mycobacteria.

    Directory of Open Access Journals (Sweden)

    Ewelina A Wojcik

    Repetitive DNA sequences with the potential to form alternative DNA conformations, such as slipped structures and cruciforms, can induce genetic instability by promoting replication errors and by serving as a substrate for DNA repair proteins, which may lead to DNA double-strand breaks (DSBs). However, the contribution of each of the DSB repair pathways, homologous recombination (HR), non-homologous end-joining (NHEJ) and single-strand annealing (SSA), to this sort of genetic instability is not fully understood. Herein, we assessed the genome-wide distribution of repetitive DNA sequences in the Mycobacterium smegmatis, Mycobacterium tuberculosis and Escherichia coli genomes, and determined the types and frequencies of genetic instability induced by direct and inverted repeats, both in the presence and in the absence of HR, NHEJ, and SSA. All three genomes are strongly enriched in direct repeats and modestly enriched in inverted repeats. When using chromosomally integrated constructs in M. smegmatis, direct repeats induced the perfect deletion of their intervening sequences ~1,000-fold above background. Absence of HR further enhanced these perfect deletions, whereas absence of NHEJ or SSA had no influence, suggesting compromised replication fidelity. In contrast, inverted repeats induced perfect deletions only in the absence of SSA. Both direct and inverted repeats stimulated excision of the constructs from the attB integration sites independently of HR, NHEJ, or SSA. With episomal constructs, direct and inverted repeats triggered DNA instability by activating nucleolytic activity, and absence of the DSB repair pathways (in the order NHEJ>HR>SSA) exacerbated this instability. Thus, direct and inverted repeats may elicit genetic instability in mycobacteria by (1) directly interfering with replication fidelity, (2) stimulating the three main DSB repair pathways, and (3) enticing L5 site-specific recombination.
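
    The distinction between direct and inverted repeats can be made concrete with a small Python sketch. This is an illustrative exact-match k-mer scan, not the genome-wide survey method used in the study; all function names and the example sequences are assumptions.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_positions(seq, k):
    """Map each k-mer to the list of its start positions in seq."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def direct_repeats(seq, k):
    """k-mers that occur more than once on the same strand."""
    return {m: p for m, p in kmer_positions(seq, k).items() if len(p) > 1}


def inverted_repeats(seq, k):
    """(left_start, right_start, left_arm) for k-mers whose reverse
    complement occurs downstream without overlapping the left arm."""
    positions = kmer_positions(seq, k)
    hits = []
    for kmer, starts in positions.items():
        rc = reverse_complement(kmer)
        for i in starts:
            for j in positions.get(rc, []):
                if j >= i + k:
                    hits.append((i, j, kmer))
    return sorted(hits)


if __name__ == "__main__":
    print(direct_repeats("ACGTTTACGT", 4))
    print(inverted_repeats("GATTACAAAAGTAATC", 6))
```

    Inverted arms separated by a short spacer, as in the second example, are the kind of sequence that can fold into a cruciform or hairpin.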

  12. Isolation of human minisatellite loci detected by synthetic tandem repeat probes: direct comparison with cloned DNA fingerprinting probes.

    Science.gov (United States)

    Armour, J A; Vergnaud, G; Crosier, M; Jeffreys, A J

    1992-08-01

    As a direct comparison with cloned 'DNA fingerprinting' probes, we present the results of screening an ordered array Charomid library for hypervariable human loci using synthetic tandem repeat (STR) probes. By recording the coordinates of positive hybridization signals, the subset of clones within the library detected by each STR probe can be defined, and directly compared with the set of clones detected by naturally occurring (cloned) DNA fingerprinting probes. The STR probes vary in the efficiency of detection of polymorphic minisatellite loci; among the more efficient probes, there is a strong overlap with the sets of clones detected by the DNA fingerprinting probes. Four new polymorphic loci were detected by one or more of the STR probes but not by any of the naturally occurring repeats. Sequence comparisons with the probe(s) used to detect the locus suggest that a relatively poor match, for example 10 out of 14 bases in a limited region of each repeat, is sufficient for the positive detection of tandem repeats in a clone in this type of library screening by hybridization. These results not only provide a detailed evaluation of the usefulness of STR probes in the isolation of highly variable loci, but also suggest strategies for the use of these multi-locus probes in screening libraries for clones from hypervariable loci.

  13. Spatio-temporal Variations of Characteristic Repeating Earthquake Sequences along the Middle America Trench in Mexico

    Science.gov (United States)

    Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.

    2015-12-01

    Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus provide nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms satisfy cc≥95% and coh≥95%. Spatial clustering along the trench shows large variations in repeating earthquake activity. In particular, the rupture zone of the M8.1 1985 earthquake shows a near absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major earthquakes show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is a key factor in pinpointing areas where large megathrust earthquakes may nucleate and, consequently, in improving seismic hazard assessment.
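
    The waveform-similarity criterion can be sketched as follows. This is a minimal zero-lag Pearson correlation in plain Python, assuming two already aligned, equal-length traces; the function names and sample values are hypothetical, and the spectral coherency criterion and any time-shift search used in the study are not reproduced here.

```python
import math


def correlation_coefficient(x, y):
    """Zero-lag Pearson correlation between two equal-length waveforms."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("waveforms must have equal length of at least 2 samples")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant waveform")
    return sxy / math.sqrt(sxx * syy)


def is_repeating_pair(x, y, threshold=0.95):
    """Apply the cc >= 0.95 selection rule to one pair of traces."""
    return correlation_coefficient(x, y) >= threshold


if __name__ == "__main__":
    event_a = [0.0, 1.0, 0.0, -1.0, 0.0]   # hypothetical vertical-component samples
    event_b = [0.0, 2.0, 0.0, -2.0, 0.0]   # same shape, larger amplitude
    print(correlation_coefficient(event_a, event_b), is_repeating_pair(event_a, event_b))
```

    Because the coefficient is normalized, two events with the same waveform shape but different amplitudes still score 1.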

  14. Base J glucosyltransferase does not regulate the sequence specificity of J synthesis in trypanosomatid telomeric DNA.

    Science.gov (United States)

    Bullard, Whitney; Cliffe, Laura; Wang, Pengcheng; Wang, Yinsheng; Sabatini, Robert

    2015-12-01

    Telomeric DNA of trypanosomatids possesses a modified thymine base, called base J, that is synthesized in a two-step process; the base is hydroxylated by a thymidine hydroxylase forming hydroxymethyluracil (hmU) and a glucose moiety is then attached by the J-associated glucosyltransferase (JGT). To examine the importance of JGT in modifying specific thymines in DNA, we used a Leishmania episome system to demonstrate that the telomeric repeat (GGGTTA) stimulates J synthesis in vivo while mutant telomeric sequences (GGGTTT, GGGATT, and GGGAAA) do not. Utilizing an in vitro GT assay we find that JGT can glycosylate hmU within any sequence with no significant change in Km or kcat, even mutant telomeric sequences that are unable to be J-modified in vivo. The data suggest that JGT possesses no DNA sequence specificity in vitro, lending support to the hypothesis that the specificity of base J synthesis is not at the level of the JGT reaction.

  15. Z-DNA-forming sequences generate large-scale deletions in mammalian cells.

    Science.gov (United States)

    Wang, Guliang; Christensen, Laura A; Vasquez, Karen M

    2006-02-21

    Spontaneous chromosomal breakages frequently occur at genomic hot spots in the absence of DNA damage and can result in translocation-related human disease. Chromosomal breakpoints are often mapped near purine-pyrimidine Z-DNA-forming sequences in human tumors. However, it is not known whether Z-DNA plays a role in the generation of these chromosomal breakages. Here, we show that Z-DNA-forming sequences induce high levels of genetic instability in both bacterial and mammalian cells. In mammalian cells, the Z-DNA-forming sequences induce double-strand breaks nearby, resulting in large-scale deletions in 95% of the mutants. These Z-DNA-induced double-strand breaks in mammalian cells are not confined to a specific sequence but rather are dispersed over a 400-bp region, consistent with chromosomal breakpoints in human diseases. This observation is in contrast to the mutations generated in Escherichia coli that are predominantly small deletions within the repeats. We found that the frequency of small deletions is increased by replication in mammalian cell extracts. Surprisingly, the large-scale deletions generated in mammalian cells are, at least in part, replication-independent and are likely initiated by repair processing cleavages surrounding the Z-DNA-forming sequence. These results reveal that mammalian cells process Z-DNA-forming sequences in a strikingly different fashion from that used by bacteria. Our data suggest that Z-DNA-forming sequences may be causative factors for gene translocations found in leukemias and lymphomas and that certain cellular conditions such as active transcription may increase the risk of Z-DNA-related genetic instability.

  16. Correlation of inter-locus polyglutamine toxicity with CAG•CTG triplet repeat expandability and flanking genomic DNA GC content.

    Directory of Open Access Journals (Sweden)

    Colm E Nestor

    Dynamic expansions of toxic polyglutamine (polyQ)-encoding CAG repeats in ubiquitously expressed, but otherwise unrelated, genes cause a number of late-onset progressive neurodegenerative disorders, including Huntington disease and the spinocerebellar ataxias. As polyQ toxicity in these disorders increases with repeat length, the intergenerational expansion of unstable CAG repeats leads to anticipation, an earlier age-at-onset in successive generations. Crucially, disease associated alleles are also somatically unstable and continue to expand throughout the lifetime of the individual. Interestingly, the inherited polyQ length mediating a specific age-at-onset of symptoms varies markedly between disorders. It is widely assumed that these inter-locus differences in polyQ toxicity are mediated by protein context effects. Previously, we demonstrated that the tendency of expanded CAG•CTG repeats to undergo further intergenerational expansion (their 'expandability') also differs between disorders and these effects are strongly correlated with the GC content of the genomic flanking DNA. Here we show that the inter-locus toxicity of the expanded polyQ tracts of these disorders also correlates with both the expandability of the underlying CAG repeat and the GC content of the genomic DNA flanking sequences. Inter-locus polyQ toxicity does not correlate with properties of the mRNA or protein sequences, with polyQ location within the gene or protein, or steady state transcript levels in the brain. These data suggest that the observed inter-locus differences in polyQ toxicity are not mediated solely by protein context effects, but that genomic context is also important, an effect that may be mediated by modifying the rate at which somatic expansion of the DNA delivers proteins to their cytotoxic state.
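
    Flanking GC content, the genomic variable in this abstract, is simple to compute. The sketch below is an assumption-laden illustration: the function names, the flank width and the toy locus are hypothetical, and the study's actual window sizes are not given in the abstract.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA string."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def flanking_gc(seq, start, end, flank):
    """GC content of up to `flank` bases on each side of seq[start:end]."""
    left = seq[max(0, start - flank):start]
    right = seq[end:end + flank]
    return gc_content(left + right)


if __name__ == "__main__":
    # Hypothetical locus: 4-base flanks around a (CAG)3 tract at positions 4..13
    locus = "GGCC" + "CAGCAGCAG" + "ATAT"
    print(flanking_gc(locus, 4, 13, 4))
```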

  17. Preparing DNA libraries for multiplexed paired-end deep sequencing for Illumina GA sequencers.

    Science.gov (United States)

    Son, Mike S; Taylor, Ronald K

    2011-02-01

    Whole-genome sequencing, also known as deep sequencing, is becoming a more affordable and efficient way to identify SNP mutations, deletions, and insertions in DNA sequences across several different strains. Two major obstacles preventing the widespread use of deep sequencers are the costs involved in services used to prepare DNA libraries for sequencing and the overall accuracy of the sequencing data. This unit describes the preparation of DNA libraries for multiplexed paired-end sequencing using the Illumina GA series sequencer. Self-preparation of DNA libraries can help reduce overall expenses, especially if optimization is required for the different samples, and use of the Illumina GA Sequencer can improve the quality of the data.

  18. New method to study DNA sequences: the languages of evolution.

    Science.gov (United States)

    Spinelli, Gino; Mayer-Foulkes, David

    2008-04-01

    Recently, several authors have reported statistical evidence for deterministic dynamics in the flux of genetic information, suggesting that evolution involves the emergence and maintenance of a fractal landscape in DNA chains. Here we examine the idea that motif repetition lies at the origin of these statistical properties of DNA. To analyse repetition patterns we apply a modification of the BDS statistic, devised to analyze complex economic dynamics and adapted here to DNA sequence analysis. This provides a new method to detect structured signals in genetic information. We compare naturally occurring DNA sequences along the evolutionary tree with randomly generated sequences and also with simulated sequences with repetition motifs. For easier understanding, we also define a new statistic for a DNA sequence that constitutes a specific fingerprint. The new methods are applied to exon and intron DNA sequences, finding specific statistical differences. Moreover, by analysing DNA sequences of different species from Bacteria to Man, we explore the evolution of these linguistic DNA features along the evolutionary tree. The results are consistent with the idea that all the flux of DNA information need not be random, but may be structured along the evolutionary tree. The implications for evolutionary theory are discussed.

  19. Properties of CENP-B and its target sequence in alpha satellite DNA

    Energy Technology Data Exchange (ETDEWEB)

    Masumoto, H.; Yoda, K.; Ikeno, M.; Kitagawa, K.; Muro, Y.; Okazaki, T. [Nagoya Univ. (Japan)]

    1993-12-31

    The centromere plays an essential role in the proper segregation of eukaryotic chromosomes at mitosis and meiosis. The centromere is the multifunctional domain of the chromosome responsible for sister chromatid association at the inner site and for microtubule attachment at the outer surface. It also acts as a mechanochemical motor for chromosome movement. These multiple centromere functions must, in some way, be directed by a cis-acting DNA sequence located in the centromere region. Indeed, specific centromere DNA sequences (CEN-DNA) were identified in two yeast species. In Saccharomyces cerevisiae, CEN-DNA consists of a roughly 125 bp sequence composed of three conserved elements. In contrast, the centromere sequence of S. pombe is quite different from that of S. cerevisiae in length and sequence organization. The molecular bases for understanding the structure and function of the centromere/kinetochore domain have not been elucidated in higher eukaryotes. In mammalian cells, satellite DNAs are localized in the centromeric heterochromatin or heterochromatic arm. In all human chromosomes, the alpha satellite or alphoid DNA family, a highly repetitive DNA composed of about 170 bp fundamental monomer repeating units, is found at the primary constriction. Its function, however, has not been established.

  20. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.;

    2008-01-01

    -analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are comprised of a combination of both Neanderthal and modern human DNA....

  1. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    NARCIS (Netherlands)

    Buschmann, Tilo; Bystrykh, Leonid V.

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fraction of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called multi
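
    The Levenshtein (edit) distance named in the title counts substitutions, insertions and deletions, which is why it suits barcodes read by sequencers that can drop or add bases. A minimal Python sketch of the standard dynamic-programming computation follows; the function names and example barcodes are assumptions, and the sketch does not implement the barcode-generation method of the paper.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between strings a and b (substitution, insertion, deletion)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = cur
    return prev[-1]


def min_pairwise_distance(barcodes):
    """Smallest edit distance between any two barcodes in the set."""
    return min(levenshtein(a, b) for a, b in combinations(barcodes, 2))


if __name__ == "__main__":
    barcodes = ["AAAA", "CCCC", "GGGG"]  # hypothetical barcode set
    print(min_pairwise_distance(barcodes))
```

    The minimum pairwise distance of a barcode set is the usual figure of merit: the larger it is, the more read errors can be tolerated before two barcodes become confusable.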

  2. Genetic Diversity Assessment and Identification of New Sour Cherry Genotypes Using Intersimple Sequence Repeat Markers

    Directory of Open Access Journals (Sweden)

    Roghayeh Najafzadeh

    2014-01-01

    Iran is one of the chief origins of subgenus Cerasus germplasm. In this study, the genetic variation of new Iranian sour cherries (which had such superior growth characteristics and fruit quality as to be considered for the introduction of new cultivars) was investigated and identified using 23 intersimple sequence repeat (ISSR) markers. Results indicated a high level of polymorphism of the genotypes based on these markers. According to these results, primers tested in this study, especially ISSR-4, ISSR-6, ISSR-13, ISSR-14, ISSR-16, and ISSR-19, produced good and various levels of amplifications which can be effectively used in genetic studies of the sour cherry. The genetic similarity among genotypes showed a high diversity among the genotypes. Cluster analysis separated improved cultivars from promising Iranian genotypes, and the PCoA supported the cluster analysis results. Since the Iranian genotypes were superior to the improved cultivars and were separated from them in most groups, these genotypes can be considered as distinct genotypes for further evaluations in the framework of breeding programs and new cultivar identification in cherries. Results also confirmed that ISSR is a reliable DNA marker that can be used for exact genetic studies and in sour cherry breeding programs.

  3. Genetic characterization of autochthonous grapevine cultivars from Eastern Turkey by simple sequence repeats (SSRs)

    Directory of Open Access Journals (Sweden)

    Sadiye Peral Eyduran

    2016-01-01

    In this research, two well-recognized standard grape cultivars, Cabernet Sauvignon and Merlot, together with eight historical autochthonous grapevine cultivars from Eastern Anatolia in Turkey, were genetically characterized by using 12 pairs of simple sequence repeat (SSR) primers in order to evaluate their genetic diversity and relatedness. All of the SSR primers used produced successful amplifications and revealed DNA polymorphisms, which were subsequently utilized to evaluate the genetic relatedness of the grapevine cultivars. Allele richness was implied by the identification of 69 alleles in 8 autochthonous cultivars with a mean value of 5.75 alleles per locus. The average expected heterozygosity and observed heterozygosity were found to be 0.749 and 0.739, respectively. Taking into account the generated alleles, the highest number was recorded in VVC2C3 and VVS2 loci (nine and eight alleles per locus, respectively), whereas the lowest number was recorded in VrZAG83 (three alleles per locus). Two main clusters were produced by using the unweighted pair-group method with arithmetic mean dendrogram constructed on the basis of the SSR data. Only Cabernet Sauvignon and Merlot cultivars were included in the first cluster. The second cluster involved the rest of the autochthonous cultivars. The results obtained during the study illustrated clearly that SSR markers have proven to be an effective tool for fingerprinting grapevine cultivars and carrying out grapevine biodiversity studies. The obtained data are also meaningful references for grapevine domestication.
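
    The summary statistics quoted above follow standard definitions, sketched here in Python. The function names and the toy genotype data are assumptions; only the 69 alleles and 12 loci come from the abstract, and expected heterozygosity is computed with the common formula 1 minus the sum of squared allele frequencies, without any sample-size correction the authors may have applied.

```python
def expected_heterozygosity(allele_freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at one locus."""
    if not genotypes:
        raise ValueError("no genotypes given")
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def mean_alleles_per_locus(total_alleles, n_loci):
    """Average number of alleles detected per locus."""
    return total_alleles / n_loci


if __name__ == "__main__":
    print(mean_alleles_per_locus(69, 12))  # figures from the abstract
    # Hypothetical genotypes at a single SSR locus (allele sizes in bp)
    print(observed_heterozygosity([(133, 143), (133, 133), (143, 151), (151, 151)]))
```

    Dividing the 69 alleles by the 12 SSR loci reproduces the reported mean of 5.75 alleles per locus.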

  4. CTCF regulates the local epigenetic state of ribosomal DNA repeats

    NARCIS (Netherlands)

    S. van de Nobelen (Suzanne); M. Rosa-Garrido (Manuel); J. Leers (Joerg); H. Heath (Helen); W.S.W. Soochit (Widia); L. Joosen (Linda); I. Jonkers (Iris); J.A.A. Demmers (Jeroen); M. van der Reijden (Michael); V. Torrano (Verónica); F.G. Grosveld (Frank); M.D. Delgado (Dolores); R. Renkawitz (Rainer); N.J. Galjart (Niels); F. Sleutels (Frank)

    2010-01-01

    Background: CCCTC binding factor (CTCF) is a highly conserved zinc finger protein, which is involved in chromatin organization, local histone modifications, and RNA polymerase II-mediated gene transcription. CTCF may act by binding tightly to DNA and recruiting other proteins to mediate

  5. Distribution of repetitive DNA sequences in chromosomes of five opisthorchid species (Trematoda, Opisthorchiidae).

    Science.gov (United States)

    Zadesenets, Kira S; Karamysheva, Tatyana V; Katokhin, Alexei V; Mordvinov, Viatcheslav A; Rubtsov, Nikolay B

    2012-03-01

    Genomes of opisthorchid species are characterized by small size, suggesting a reduced amount of repetitive DNA in their genomes. Distribution of repetitive DNA sequences in the chromosomes of five species of the family Opisthorchiidae (Opisthorchis felineus 2n = 14 (Rivolta, 1884), Opisthorchis viverrini 2n = 12 (Poirier, 1886), Metorchis xanthosomus 2n = 14 (Creplin, 1846), Metorchis bilis 2n = 14 (Braun, 1890), Clonorchis sinensis 2n = 14 (Cobbold, 1875)) was studied with C- and AgNOR-banding, generation of microdissected DNA probes from individual chromosomes and fluorescent in situ hybridization on mitotic and meiotic chromosomes. Small-sized C-bands were discovered in pericentric regions of chromosomes. Ag-NOR staining of opisthorchid chromosomes and FISH with a ribosomal DNA probe showed that the karyotypes of all studied species were characterized by a single nucleolus organizer region located on one of the small chromosomes. The generation of DNA probes from chromosomes 1 and 2 of O. felineus and M. xanthosomus was performed with chromosome microdissection followed by DOP-PCR. FISH of the microdissected DNA probes on chromosomes of these species revealed chromosome-specific DNA repeats in pericentric C-bands. It was also shown that microdissected DNA probes generated from chromosomes could be used as Whole Chromosome Painting Probes without suppression of repetitive DNA hybridization. Chromosome painting using microdissected chromosome-specific DNA probes showed the overall repeat distribution in opisthorchid chromosomes.

  6. Transformation-associated recombination between diverged and homologous DNA repeats is induced by strand breaks

    Energy Technology Data Exchange (ETDEWEB)

    Larionov, V.; Kouprina, N. [National Inst. of Environmental Health Sciences, Research Triangle Park, NC (United States)]|[Institute of Cytology, St. Petersburg, (Russian Federation); Edlarov, M. [National Inst. of Environmental Health Sciences, Research Triangle Park, NC (United States)]|[Center of Bioengineering, Moscow, (Russian Federation); Perkins, E.; Porter, G.; Resnick, M.A. [National Inst. of Environmental Health Sciences, Research Triangle Park, NC (United States)

    1993-12-31

    Rearrangement and deletion within plasmid DNA is commonly observed during transformation. We have examined the mechanisms of transformation-associated recombination in the yeast Saccharomyces cerevisiae using a plasmid system which allowed the effects of physical state and/or extent of homology on recombination to be studied. The plasmid contains homologous or diverged (19%) DNA repeats separated by a genetically detectable color marker. Recombination during transformation for covalently closed circular plasmids was over 100-fold more frequent than during mitotic growth. The frequency of recombination is partly dependent on the method of transformation in that procedures involving lithium acetate or spheroplasting yield higher frequencies than electroporation. When present in the repeats, unique single-strand breaks that are ligatable, as well as double-strand breaks, lead to high levels of recombination between diverged and identical repeats. The transformation-associated recombination between repeat DNAs is under the influence of the RAD52, RAD1 and the RNC1 genes.

  7. Association of condensin with chromosomes depends on DNA binding by its HEAT-repeat subunits.

    Science.gov (United States)

    Piazza, Ilaria; Rutkowska, Anna; Ori, Alessandro; Walczak, Marta; Metz, Jutta; Pelechano, Vicent; Beck, Martin; Haering, Christian H

    2014-06-01

    Condensin complexes have central roles in the three-dimensional organization of chromosomes during cell divisions, but how they interact with chromatin to promote chromosome segregation is largely unknown. Previous work has suggested that condensin, in addition to encircling chromatin fibers topologically within the ring-shaped structure formed by its SMC and kleisin subunits, contacts DNA directly. Here we describe the discovery of a binding domain for double-stranded DNA formed by the two HEAT-repeat subunits of the Saccharomyces cerevisiae condensin complex. From detailed mapping data of the interfaces between the HEAT-repeat and kleisin subunits, we generated condensin complexes that lack one of the HEAT-repeat subunits and consequently fail to associate with chromosomes in yeast and human cells. The finding that DNA binding by condensin's HEAT-repeat subunits stimulates the SMC ATPase activity suggests a multistep mechanism for the loading of condensin onto chromosomes.

  8. Oxidized dNTPs and the OGG1 and MUTYH DNA glycosylases combine to induce CAG/CTG repeat instability

    Science.gov (United States)

    Cilli, Piera; Ventura, Ilenia; Minoprio, Anna; Meccia, Ettore; Martire, Alberto; Wilson, Samuel H.; Bignami, Margherita; Mazzei, Filomena

    2016-01-01

    DNA trinucleotide repeat (TNR) expansion underlies several neurodegenerative disorders including Huntington's disease (HD). Accumulation of oxidized DNA bases and their inefficient processing by base excision repair (BER) are among the factors suggested to contribute to TNR expansion. In this study, we have examined whether oxidation of the purine dNTPs in the dNTP pool provides a source of DNA damage that promotes TNR expansion. We demonstrate that during BER of 8-oxoguanine (8-oxodG) in TNR sequences, DNA polymerase β (POL β) can incorporate 8-oxodGMP with the formation of 8-oxodG:C and 8-oxodG:A mispairs. Their processing by the OGG1 and MUTYH DNA glycosylases generates closely spaced incisions on opposite DNA strands that are permissive for TNR expansion. Evidence in HD model R6/2 mice indicates that these DNA glycosylases are present in brain areas affected by neurodegeneration. Consistent with prevailing oxidative stress, the same brain areas contained increased DNA 8-oxodG levels and expression of the p53-inducible ribonucleotide reductase. Our in vitro and in vivo data support a model where an oxidized dNTPs pool together with aberrant BER processing contribute to TNR expansion in non-replicating cells. PMID:26980281

  9. Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.)

    Science.gov (United States)

    Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...

  10. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd and 4th y undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  12. Mesoscopic Model for Free Energy Landscape Analysis of DNA sequences

    CERN Document Server

    Tapia-Rojo, R; Mazo, J J; Falo, F; 10.1103/PhysRevE.86.021908

    2012-01-01

    A mesoscopic model which allows us to identify and quantify the strength of binding sites in DNA sequences is proposed. The model is based on the Peyrard-Bishop-Dauxois model for the DNA chain coupled to a Brownian particle which explores the sequence interacting more importantly with open base pairs of the DNA chain. We apply the model to promoter sequences of different organisms. The free energy landscape obtained for these promoters shows a complex structure that is strongly connected to their biological behavior. The analysis method used is able to quantify free energy differences of sites within genome sequences.

  13. Cloning and sequencing of mouse GABA transporter complementary DNA

    Institute of Scientific and Technical Information of China (English)

    TAM ANTHONY C.W.; LI HEGUO; et al.

    1994-01-01

    A cDNA encoding the mouse GABA transporter has been isolated and sequenced. The results show that the mouse GABA transporter cDNA differs from that of the rat by 60 base pairs in the open reading frame region, but the deduced amino acid sequences of the two cDNAs are identical and both are composed of 599 amino acids. However, the amino acid sequence is different from the sequence deduced from a recently published mouse GABA transporter cDNA.

  14. Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

    Energy Technology Data Exchange (ETDEWEB)

    Fields, C.A.

    1996-06-01

    The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler), has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0, has now been completed, and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.

  15. Affordable Hands-On DNA Sequencing and Genotyping: An Exercise for Teaching DNA Analysis to Undergraduates

    Science.gov (United States)

    Shah, Kushani; Thomas, Shelby; Stein, Arnold

    2013-01-01

    In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…

  16. Preferential Nucleosome Assembly at DNA Triplet Repeats from the Myotonic Dystrophy Gene

    Science.gov (United States)

    Wang, Yuh-Hwa; Amirhaeri, Sorour; Kang, Seongman; Wells, Robert D.; Griffith, Jack D.

    1994-07-01

    The expansion of CTG repeats in DNA occurs in or near genes involved in several human diseases, including myotonic dystrophy and Huntington's disease. Nucleosomes, the basic structural element of chromosomes, consist of 146 base pairs of DNA coiled about an octamer of histone proteins and mediate general transcriptional repression. Electron microscopy was used to examine in vitro the nucleosome assembly of DNA containing repeating CTG triplets. The efficiency of nucleosome formation increased with expanded triplet blocks, suggesting that such blocks may repress transcription through the creation of stable nucleosomes.

  17. Effects of Sequence on Transmission Properties of DNA Molecules

    Institute of Scientific and Technical Information of China (English)

    DONG Rui-Xin; YAN Xun-Ling; YANG Bing

    2008-01-01

    A double-helix model of charge transport in the DNA molecule is given, and the transmission spectra of four DNA sequences are obtained. The calculated results show that the transmission characteristics of DNA are related not only to the longitudinal transport but also to the transverse transport of the molecule. The periodic sequence with the same composition has stronger conduction ability. As the base composition increases, the conductive ability decreases, but the weight of the θ direction in charge transfer rises.

  18. DNA Polymerases Drive DNA Sequencing-by-Synthesis Technologies: Both Past and Present

    Directory of Open Access Journals (Sweden)

    Cheng-Yao Chen

    2014-06-01

    Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. E. coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger's dideoxy chain terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific genome sequence information accessible via public databases. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides, respectively. Third-generation, real-time single-molecule sequencing platforms require slightly different enzyme properties. Enterobacterial phage φ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, the φ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore- and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  19. Identification and characterization of simple sequence repeats (SSRs) for population studies of Puccinia novopanici.

    Science.gov (United States)

    Orquera-Tornakian, Gabriela K; Garrido, Patricia; Kronmiller, Brent; Hunger, Robert; Tyler, Brett M; Garzon, Carla D; Marek, Stephen M

    2017-08-01

    Switchgrass (Panicum virgatum L.) can be severely affected by rust disease. Recently switchgrass rust caused by P. emaculata (now confirmed to be Puccinia novopanici) has received most of the attention of the research community because this pathogen is responsible for reducing the biomass production and biofuel feedstock quality of switchgrass. Microsatellite markers found in the literature were either not informative (no allele frequency) or showed few polymorphisms in the target populations; therefore, additional markers are needed for future studies of the genetic variation and population structure of P. novopanici. This study reports the development and characterization of novel simple sequence repeat (SSR) markers from a Puccinia emaculata s.l. microsatellite-enriched library and expressed sequence tags (ESTs). Microsatellites were evaluated for polymorphisms on P. emaculata s.l. urediniospores collected in Iowa (IA), Mississippi (MS), Oklahoma (OK), South Dakota (SD) and Virginia (VA). Puccinia novopanici single-spore whole genome amplifications were used as templates to validate the SSR reaction protocol and to obtain preliminary population genetics statistics for the pathogen. Eighteen microsatellite markers were polymorphic (average PIC=0.72) on individual urediniospores, with an average of 8.3 alleles per locus (range 3 to 17). Of the 49 SSR loci initially identified in P. emaculata s.l., 18 were transferable to P. striiformis f. sp. tritici, 23 to P. triticina, 20 to P. sorghi and 31 to P. andropogonis. Thus, these markers could be useful for DNA fingerprinting and population structure analysis for population genetics, epidemiology and ecological studies of P. novopanici and potentially other related Puccinia species. Copyright © 2017 Elsevier B.V. All rights reserved.
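
    The abstract above reports an average polymorphism information content (PIC) per marker but does not say how it was computed. As a minimal sketch, assuming the commonly used Botstein-style definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, the value for one locus can be derived from observed allele calls as follows. The function name `pic` and the example allele labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def pic(alleles):
    """Polymorphism information content for one locus.

    `alleles` is a list of observed allele labels (hypothetical input format).
    Assumes the Botstein-style formula:
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    freqs = [c / total for c in counts.values()]

    # Expected homozygosity term.
    sum_sq = sum(p * p for p in freqs)

    # Correction term over all unordered pairs of distinct alleles.
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2 * freqs[i] ** 2 * freqs[j] ** 2

    return 1.0 - sum_sq - cross


if __name__ == "__main__":
    # Made-up allele sizes (bp) for a single SSR locus, for illustration only.
    print(round(pic([180, 180, 184, 184]), 3))
```

    A locus with a single allele gives PIC = 0, and the value approaches 1 as the number of equally frequent alleles grows.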

  20. An examination of the origin and evolution of additional tandem repeats in the mitochondrial DNA control region of Japanese sika deer (Cervus nippon).

    Science.gov (United States)

    Ba, Hengxing; Wu, Lang; Liu, Zongyue; Li, Chunyi

    2016-01-01

    Tandem repeat units are only detected in the left domain of the mitochondrial DNA control region in sika deer. Previous studies showed that Japanese sika deer have more tandem repeat units than its cousins from the Asian continent and Taiwan, which often have only three repeat units. To determine the origin and evolution of these additional repeat units in Japanese sika deer, we obtained the sequence of repeat units from an expanded dataset of the control region from all sika deer lineages. The functional constraint is inferred to act on the first repeat unit because this repeat has the least sequence divergence in comparison to the other units. Based on slipped-strand mispairing mechanisms, the illegitimate elongation model could account for the addition or deletion of these additional repeat units in the Japanese sika deer population. We also report that these additional repeat units could be occurring in the internal positions of tandem repeat regions, possibly via coupling with a homogenization mechanism within and among these lineages. Moreover, the increased number of repeat units in the Japanese sika deer population could reflect a balance between mutation and selection, as well as genetic drift.

  1. Characterisation data of simple sequence repeats of phages closely related to T7M

    Directory of Open Access Journals (Sweden)

    Tiao-Yin Lin

    2016-09-01

    Coliphages T7M and T3, Yersinia phage ϕYeO3-12, and Salmonella phage ϕSG-JL2 share high homology in genomic sequences. Simple sequence repeats (SSRs) are found in their genomes and variations of SSRs among these phages are observed. Analyses on regions of sequences in T7M and T3 genomes that are likely derived from phage recombination, as well as the counterparts in ϕYeO3-12 and ϕSG-JL2, have been discussed by Lin in “Simple sequence repeat variations expedite phage divergence: mechanisms of indels and gene mutations” [1]. These regions are referred to as recombinant regions. The focus here is on SSRs in the whole genome and regions of sequences outside the recombinant regions, referred to as non-recombinant regions. This article provides SSR counts, relative abundance, relative density, and GC contents in the complete genome and non-recombinant regions of these phages. SSR period sizes and motifs in the non-recombinant regions of phage genomes are plotted. Genomic sequence changes between T7M and T3 due to insertions, deletions, and substitutions are also illustrated. SSRs and nearby sequences of T7M in the non-recombinant regions are compared to the sequences of ϕYeO3-12 and ϕSG-JL2 in the corresponding positions. The sequence variations of SSRs due to vertical evolution are classified into four categories and tabulated: (1) insertion/deletion of SSR units, (2) expansion/contraction of SSRs without alteration of genome length, (3) changes of repeat motifs, and (4) generation/loss of repeats.

  2. Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

    Energy Technology Data Exchange (ETDEWEB)

    Torella, JP; Lienert, F; Boehm, CR; Chen, JH; Way, JC; Silver, PA

    2014-08-07

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.

  3. Temporal stability of epigenetic markers: sequence characteristics and predictors of short-term DNA methylation variations.

    Directory of Open Access Journals (Sweden)

    Hyang-Min Byun

    BACKGROUND: DNA methylation is an epigenetic mechanism that has been increasingly investigated in observational human studies, particularly on blood leukocyte DNA. Characterizing the degree and determinants of DNA methylation stability can provide critical information for the design and conduct of human epigenetic studies. METHODS: We measured DNA methylation in 12 gene-promoter regions (APC, p16, p53, RASSF1A, CDH13, eNOS, ET-1, IFNγ, IL-6, TNFα, iNOS, and hTERT) and 2 non-long terminal repeat elements (i.e., L1 and Alu) in blood samples obtained from 63 healthy individuals at baseline (Day 1) and after three days (Day 4). DNA methylation was measured by bisulfite-PCR-Pyrosequencing. We calculated intraclass correlation coefficients (ICCs) to measure the within-individual stability of DNA methylation between Day 1 and 4, after subtracting pyrosequencing error and adjusting for multiple covariates. RESULTS: Methylation markers showed different temporal behaviors ranging from high (IL-6, ICC = 0.89) to low stability (APC, ICC = 0.08) between Day 1 and 4. Multiple sequence and marker characteristics were associated with the degree of variation. Density of CpG dinucleotides near the sequence analyzed (measured as CpG(o/e) or G+C content within ±200 bp) was positively associated with DNA methylation stability. The 3' proximity to repeat elements and range of DNA methylation on Day 1 were also positively associated with methylation stability. An inverted U-shaped correlation was observed between mean DNA methylation on Day 1 and stability. CONCLUSIONS: The degree of short-term DNA methylation stability is marker-dependent and associated with sequence characteristics and methylation levels.

  4. Direct detection of expanded trinucleotide repeats using PCR and DNA hybridization techniques

    Energy Technology Data Exchange (ETDEWEB)

    Petronis, A.; Tatuch, Y.; Klempan, T.A.; Kennedy, J.L. [Hospital for Sick Children, Toronto (Canada)] [and others

    1996-02-16

    Recently, unstable trinucleotide repeats have been shown to be the etiologic factor in seven neuropsychiatric diseases, and they may play a similar role in other genetic disorders which exhibit genetic anticipation. We have tested one polymerase chain reaction (PCR)-based and two hybridization-based methods for direct detection of unstable DNA expansion in genomic DNA. This technique employs a single primer (asymmetric) PCR using total genomic DNA as a template to efficiently screen for the presence of large trinucleotide repeat expansions. High-stringency Southern blot hybridization with a PCR-generated trinucleotide repeat probe allowed detection of the DNA fragment containing the expansion. Analysis of myotonic dystrophy patients containing different degrees of (CTG)n expansion demonstrated the identification of the site of trinucleotide instability in some affected individuals without any prior information regarding genetic map location. The same probe was used for fluorescent in situ hybridization and several regions of (CTG)n/(CAG)n repeats in the human genome were detected, including the myotonic dystrophy locus on chromosome 19q. Although limited at present to large trinucleotide repeat expansions, these strategies can be applied to directly clone genes involved in disorders caused by large expansions of unstable DNA. 33 refs., 4 figs.

  5. Code domains in tandem repetitive DNA sequence structures.

    Science.gov (United States)

    Vogt, P

    1992-10-01

    Traditionally, many people doing research in molecular biology attribute coding properties to a given DNA sequence if this sequence contains an open reading frame for translation into a sequence of amino acids. This protein coding capability of DNA was detected about 30 years ago. The underlying genetic code is highly conserved and present in every biological species studied so far. Today, it is obvious that DNA has a much larger coding potential for other important tasks. Apart from coding for specific RNA molecules such as rRNA, snRNA and tRNA molecules, specific structural and sequence patterns of the DNA chain itself express distinct codes for the regulation and expression of its genetic activity. A chromatin code has been defined for phasing of the histone-octamer protein complex in the nucleosome. A translation frame code has been shown to exist that determines correct triplet counting at the ribosome during protein synthesis. A loop code seems to organize the single stranded interaction of the nascent RNA chain with proteins during the splicing process, and a splicing code phases successive 5' and 3' splicing sites. Most of these DNA codes are not exclusively based on the primary DNA sequence itself, but also seem to include specific features of the corresponding higher order structures. Based on the view that these various DNA codes are genetically instructive for specific molecular interactions or processes, important in the nucleus during interphase and during cell division, the coding capability of tandem repetitive DNA sequences has recently been reconsidered.

  6. ProteDNA: a sequence-based predictor of sequence-specific DNA-binding residues in transcription factors.

    Science.gov (United States)

    Chu, Wen-Yi; Huang, Yu-Feng; Huang, Chun-Chin; Cheng, Yi-Sheng; Huang, Chien-Kang; Oyang, Yen-Jen

    2009-07-01

    This article presents the design of a sequence-based predictor named ProteDNA for identifying the sequence-specific binding residues in a transcription factor (TF). Concerning protein-DNA interactions, there are two types of binding mechanisms involved, namely sequence-specific binding and nonspecific binding. Sequence-specific bindings occur between protein sidechains and nucleotide bases and correspond to sequence-specific recognition of genes. Therefore, sequence-specific bindings are essential for correct gene regulation. In this respect, ProteDNA is distinctive since it has been designed to identify sequence-specific binding residues. In order to accommodate users with different application needs, ProteDNA has been designed to operate under two modes, namely, the high-precision mode and the balanced mode. According to the experiments reported in this article, under the high-precision mode, ProteDNA has been able to deliver precision of 82.3%, specificity of 99.3%, sensitivity of 49.8% and accuracy of 96.5%. Meanwhile, under the balanced mode, ProteDNA has been able to deliver precision of 60.8%, specificity of 97.6%, sensitivity of 60.7% and accuracy of 95.4%. ProteDNA is available at the following websites: http://protedna.csbb.ntu.edu.tw/, http://protedna.csie.ntu.edu.tw/, http://bio222.esoe.ntu.edu.tw/ProteDNA/.
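
    The four figures quoted for each ProteDNA operating mode are standard binary-classification metrics. As a minimal sketch of how they relate to one another, the helper below computes them from confusion-matrix counts. The function name `binary_metrics` and the counts in the example are hypothetical illustrations and are not ProteDNA's data or code.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return precision, sensitivity, specificity and accuracy.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives (here: residues predicted as
    binding or non-binding). A metric is None when its denominator is zero.
    """
    def ratio(num, den):
        return num / den if den else None

    return {
        "precision": ratio(tp, tp + fp),      # predicted binders that are real
        "sensitivity": ratio(tp, tp + fn),    # real binders that are found
        "specificity": ratio(tn, tn + fp),    # non-binders correctly rejected
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Made-up counts chosen only to illustrate the trade-off.
    for name, value in binary_metrics(tp=50, fp=10, tn=900, fn=40).items():
        print(f"{name}: {value:.3f}")
```

    Because non-binding residues greatly outnumber binding residues in such data, specificity and accuracy can stay high even when sensitivity is modest, which matches the pattern of the values reported for the high-precision mode.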

  7. DNA Shape Dominates Sequence Affinity in Nucleosome Formation

    Science.gov (United States)

    Freeman, Gordon S.; Lequieu, Joshua P.; Hinckley, Daniel M.; Whitmer, Jonathan K.; de Pablo, Juan J.

    2014-10-01

    Nucleosomes provide the basic unit of compaction in eukaryotic genomes, and the mechanisms that dictate their position at specific locations along a DNA sequence are of central importance to genetics. In this Letter, we employ molecular models of DNA and proteins to elucidate various aspects of nucleosome positioning. In particular, we show how DNA's histone affinity is encoded in its sequence-dependent shape, including subtle deviations from the ideal straight B-DNA form and local variations of minor groove width. By relying on high-precision simulations of the free energy of nucleosome complexes, we also demonstrate that, depending on DNA's intrinsic curvature, histone binding can be dominated by bending interactions or electrostatic interactions. More generally, the results presented here explain how sequence, manifested as the shape of the DNA molecule, dominates molecular recognition in the problem of nucleosome positioning.

  8. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    Directory of Open Access Journals (Sweden)

    Michael Knapp

    2010-07-01

    The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions.

  9. Efficient and specific internal cleavage of a retroviral palindromic DNA sequence by tetrameric HIV-1 integrase.

    Directory of Open Access Journals (Sweden)

    Olivier Delelis

    BACKGROUND: HIV-1 integrase (IN) catalyses the retroviral integration process, removing two nucleotides from each long terminal repeat and inserting the processed viral DNA into the target DNA. It is widely assumed that the strand transfer step has no sequence specificity. However, recently, it has been reported by several groups that integration sites display a preference for palindromic sequences, suggesting that a symmetry in the target DNA may stabilise the tetrameric organisation of IN in the synaptic complex. METHODOLOGY/PRINCIPAL FINDINGS: We assessed the ability of several palindrome-containing sequences to organise tetrameric IN and investigated the ability of IN to catalyse DNA cleavage at internal positions. Only one palindromic sequence was successfully cleaved by IN. Interestingly, this symmetrical sequence corresponded to the 2-LTR junction of retroviral DNA circles, a palindrome similar but not identical to the consensus sequence found at integration sites. This reaction depended strictly on the cognate retroviral sequence of IN and required a full-length wild-type IN. Furthermore, the oligomeric state of IN responsible for this cleavage differed from that involved in the 3'-processing reaction. Palindromic cleavage strictly required the tetrameric form, whereas 3'-processing was efficiently catalysed by a dimer. CONCLUSIONS/SIGNIFICANCE: Our findings suggest that the restriction-like cleavage of palindromic sequences may be a general physiological activity of retroviral INs and that IN tetramerisation is strongly favoured by DNA symmetry, either at the target site for the concerted integration or when the DNA contains the 2-LTR junction in the case of the palindromic internal cleavage.

  10. DNA splice site sequences clustering method for conservativeness analysis

    Institute of Scientific and Technical Information of China (English)

    Quanwei Zhang; Qinke Peng; Tao Xu

    2009-01-01

    DNA sequences near splice sites are remarkably conserved, and many researchers have contributed to the prediction of splice sites. In order to mine the underlying biological knowledge, we analyze the conservativeness of sequences adjacent to DNA splice sites by clustering. First, we propose a clustering method for DNA splice site sequences that is based on DBSCAN and uses four kinds of dissimilarity calculation methods. Then, we analyze the conservative features of the clustering results and the experimental data set.

  11. DNA methylation and transcription in HERV (K, W, E) and LINE sequences remain unchanged upon foreign DNA insertions.

    Science.gov (United States)

    Weber, Stefanie; Jung, Susan; Doerfler, Walter

    2016-02-01

    DNA methylation and transcriptional profiles were determined in the regulatory sequences of the human endogenous retroviral (HERV-K, -W, -E) and LINE-1.2 elements and were compared between non-transgenomic and plasmid-transgenomic cells. DNA methylation profiles in the HERV (K, W, E) and LINE sequences were determined by bisulfite genomic sequencing. The transcription of these genome segments was assessed by quantitative real-time PCR. In HERV-K, HERV-W and LINE-1.2 the levels of DNA methylation ranged between 75 and 98%, while in HERV-E they were around 60%. Nevertheless, the HERV and LINE-1.2 sequences were actively transcribed. No differences were found in comparisons of HERV and LINE-1.2 CpG methylation and transcription patterns between non-transgenomic and plasmid-transgenomic HCT116 cells. The insertion of a 5.6 kbp plasmid into the HCT116 genome had no effect on the HERV and LINE-1.2 methylation and transcription profiles, although other parts of the HCT116 genome had shown marked changes. These repetitive sequences are transcribed, probably because the large number of HERV and LINE-1.2 elements harbor copies with non- or hypo-methylated long terminal repeat sequences.

  12. Thermodynamics of sequence-specific binding of PNA to DNA

    DEFF Research Database (Denmark)

    Ratilainen, T; Holmén, A; Tuite, E

    2000-01-01

    For further characterization of the hybridization properties of peptide nucleic acids (PNAs), the thermodynamics of hybridization of mixed sequence PNA-DNA duplexes have been studied. We have characterized the binding of PNA to DNA in terms of binding affinity (perfectly matched duplexes) and seq...

  13. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    Science.gov (United States)

    Nielsen, Peter E.

    2008-10-01

    Peptide nucleic acids (PNA) can be designed to target duplex DNA with very high sequence specificity and efficiency via various binding modes. We have designed three domain PNA clamps, that bind stably to predefined decameric homopurine targets in large dsDNA molecules and via a third PNA domain sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technology of protein dsDNA structures.

  14. PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

    Science.gov (United States)

    Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

    2011-01-01

    PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs of 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
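
    The definition of an SSR given above (tandem repeats of a 1-6 bp motif) can be illustrated with a short regular-expression scan. This is only a sketch under stated assumptions: the function name `find_ssrs`, the `min_repeats` threshold of 3, and the example sequence are hypothetical, and this is not the extraction pipeline used to build PSSRdb.

```python
import re


def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Find perfect tandem repeats of motifs 1..max_motif bp long.

    Returns a list of (start, motif, copies) tuples, with 0-based start.
    The lazy quantifier makes the regex try the shortest motif first, so
    'AAAAAA' is reported as motif 'A' x 6 rather than 'AA' x 3.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    # Made-up sequence containing one (CAG)4 tract.
    print(find_ssrs("TTCAGCAGCAGCAGTT"))
```

    Running the same scan on two strains of one species and comparing the copy numbers at matching loci is one simple way to see the kind of length polymorphism the database catalogues; the reported phase of a motif (for example GCA versus CAG) depends on where the scan first finds a match.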

  15. Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies

    KAUST Repository

    Cahill, Matt J.

    2010-07-12

    Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150 nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length.
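
    The abstract does not spell out the authors' repeat-assembly model. As a crude stand-in that conveys the underlying idea, the sketch below counts substrings of a given read length that occur more than once in a sequence: any exact repeat at least as long as a read cannot be told apart by that read alone. The function name `repeated_kmers` and the toy sequence are hypothetical, and this proxy ignores the reverse strand, paired reads, and sequencing errors.

```python
from collections import Counter


def repeated_kmers(genome, read_length):
    """Return {kmer: count} for every substring of `read_length`
    that appears at more than one position in `genome`.

    A non-empty result means some exact repeat is at least as long
    as a read, so reads of that length cannot resolve it unaided.
    """
    k = read_length
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


if __name__ == "__main__":
    toy = "ACGTACGTTT"  # made-up sequence with a 4-bp exact repeat
    for length in (4, 5):
        print(length, repeated_kmers(toy, length))
```

    In the toy sequence the repeat disappears from the result once the read length exceeds the repeat length, which is the qualitative behaviour the abstract describes for longer reads.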

  16. Read length and repeat resolution: exploring prokaryote genomes using next-generation sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Matt J Cahill

    BACKGROUND: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. METHODOLOGY/PRINCIPAL FINDINGS: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150 nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. CONCLUSIONS: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length.

  17. Efficient multiplex simple sequence repeat genotyping of the oomycete plant pathogen Phytophthora infestans

    NARCIS (Netherlands)

    Li, Y.; Cooke, D.E.L.; Jacobsen, E.; Lee, van der T.A.J.

    2013-01-01

    Genotyping is fundamental to population analysis. To accommodate fast, accurate and cost-effective genotyping, a one-step multiplex PCR method employing twelve simple sequence repeat (SSR) markers was developed for high-throughput screening of Phytophthora infestans populations worldwide. The SSR

  18. Current-voltage characteristics of double-strand DNA sequences

    Science.gov (United States)

    Bezerril, L. M.; Moreira, D. A.; Albuquerque, E. L.; Fulco, U. L.; de Oliveira, E. L.; de Sousa, J. S.

    2009-09-01

    We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotides distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro one) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is presented here with the same first neighbors pair correlations of the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.

  19. Current-voltage characteristics of double-strand DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bezerril, L.M.; Moreira, D.A. [Departamento de Fisica, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Albuquerque, E.L., E-mail: eudenilson@dfte.ufrn.b [Departamento de Fisica, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Fulco, U.L. [Departamento de Biofisica e Farmacologia, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Oliveira, E.L. de; Sousa, J.S. de [Departamento de Fisica, Universidade Federal do Ceara, 60455-760, Fortaleza-CE (Brazil)

    2009-09-07

    We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotides distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro one) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is presented here with the same first neighbors pair correlations of the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.

  20. Enhancing Gibbs sampling method for motif finding in DNA with initial graph representation of sequences.

    Science.gov (United States)

    Stepančič, Ziva

    2014-10-01

    Finding short patterns with residue variation in a set of sequences is still an open problem in genetics, since motif-finding techniques on DNA and protein sequences are inconclusive on real data sets and their performance varies across species. Hence, finding new algorithms and evolving established methods are vital to further understanding of genome properties and the mechanisms of protein development. In this work, we present an approach to finding functional motifs in DNA sequences based on the Gibbs sampling method. Starting points in the search space are partly determined via a graphical representation of the input sequences, as opposed to the completely random initial points of standard Gibbs sampling. Our algorithm is evaluated on synthetic as well as real data sets using several statistics, such as sensitivity, positive predictive value, specificity, performance, and correlation coefficient. Additionally, a comparison between our algorithm and the standard Gibbs sampling algorithm is made to show improvement in accuracy, repeatability, and performance.
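    For readers unfamiliar with the baseline this abstract refers to, the following is a minimal sketch of a standard Gibbs sampler for motif finding. It is not the author's algorithm: the graph-based choice of starting points is not reproduced, and the optional `init` argument is only a hypothetical hook where such informed start positions could be supplied. The function name, parameters, and example sequences are all assumptions, and input is assumed to contain only the letters A, C, G, and T.

```python
import random


def gibbs_motif_search(seqs, k, iters=200, seed=0, init=None):
    """Return one motif start position per sequence (standard Gibbs sampling).

    init: optional list of start positions (hypothetical hook for an
    informed initialisation); if None, starts are drawn uniformly at random.
    """
    rng = random.Random(seed)
    if init is not None:
        pos = list(init)
    else:
        pos = [rng.randrange(len(s) - k + 1) for s in seqs]

    for _ in range(iters):
        for i, s in enumerate(seqs):
            # Position-specific counts from all sequences except i,
            # with a pseudocount of 1 per base to avoid zero probabilities.
            counts = [{b: 1.0 for b in "ACGT"} for _ in range(k)]
            for j, t in enumerate(seqs):
                if j != i:
                    for c, b in enumerate(t[pos[j]:pos[j] + k]):
                        counts[c][b] += 1.0
            total = (len(seqs) - 1) + 4  # observations plus pseudocounts

            # Probability of each window of sequence i under the profile.
            weights = []
            for start in range(len(s) - k + 1):
                w = 1.0
                for c, b in enumerate(s[start:start + k]):
                    w *= counts[c][b] / total
                weights.append(w)

            # Resample the start position in proportion to those weights.
            pos[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return pos


# Hypothetical example: three short sequences sharing the planted word "TATA".
example = ["GGCTATAGC", "CTATACCGG", "GCGGTATAC"]
print(gibbs_motif_search(example, k=4))
```

    Because the sampler is stochastic, different seeds or start positions can converge to different motifs, which is the sensitivity to initialisation that the abstract's approach aims to reduce.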

  1. Characteristics of alternating current hopping conductivity in DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Ma Song-Shan; Xu Hui; Wang Huan-You; Guo Rui

    2009-01-01

    This paper presents a model to describe the alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system and electrons are transported via hopping between localized states. It finds that the AC conductivity of DNA sequences increases as the frequency of the external electric field rises, taking the form σ_ac(ω) ~ ω² ln²(1/ω). The AC conductivity also increases with temperature, but with only a weak temperature dependence. Meanwhile, at low temperatures the AC conductivity in the off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit, which indicates that off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport in DNA sequences. For p < 0.5, the conductivity of a DNA sequence decreases with increasing p, while for p > 0.5, the conductivity increases with increasing p.
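    The frequency dependence quoted in this abstract can be checked numerically. The sketch below evaluates only the scaling form ω² ln²(1/ω); the prefactor, the units, and the use of a dimensionless frequency 0 < ω < 1 are assumptions made for illustration, and the function name `sigma_ac` is hypothetical.

```python
import math


def sigma_ac(omega: float) -> float:
    """Scaling form omega^2 * ln^2(1/omega), prefactor omitted.

    omega is treated as a dimensionless frequency in (0, 1); this
    normalisation is an assumption, not taken from the paper.
    """
    if not 0.0 < omega < 1.0:
        raise ValueError("omega must lie strictly between 0 and 1")
    return omega ** 2 * math.log(1.0 / omega) ** 2


# The form rises with frequency in the low-frequency regime.
for omega in (1e-4, 1e-3, 1e-2, 1e-1):
    print(f"omega={omega:g}  sigma_ac={sigma_ac(omega):.3e}")
```

    Differentiating the expression gives 2ω ln(1/ω) (ln(1/ω) − 1), which is positive for ω < 1/e, so under this normalisation the conductivity increases with frequency throughout the low-frequency range.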

  2. Development of simple sequence repeats (SSR) markers of ramie and comparison of SSR and inter-SSR marker systems

    Institute of Scientific and Technical Information of China (English)

    ZHOU Jianlin; JIE Yucheng; JIANG Yanbo; ZHONG Yingli; LIU Yunhai; ZHANG Jian

    2005-01-01

    Ramie (Boehmeria nivea L.) is an important bast fiber crop. To study the genetic background of this species, we isolated and characterized microsatellite markers of ramie. A genomic library containing inserts of random amplified polymorphic DNA (RAPD) fragments was constructed and screened by PCR amplification using anchored simple sequence repeats as primers. A total of 26 clones were identified as positives, and 13 microsatellite loci were found after sequencing. The polymorphism of these 13 microsatellite loci was examined, and the utility of the simple sequence repeat (SSR) and inter-SSR (ISSR) marker systems for genetic characterization was compared using 19 selected ramie cultivars. Both approaches successfully discriminated the 19 cultivars, but they differed in the amount of polymorphism detected. The level of polymorphism detected by SSR was 95.0%, higher than that by ISSR (72.3%), but the average polymorphism information content (PIC) of ISSR (0.651) was higher than that of SSR (0.441). The higher PIC value of ISSR suggests that ISSR is more efficient for fingerprinting ramie cultivars than SSR markers. However, because the SSR loci are codominant, they are more suitable for determining the homozygosity levels of ramie, constructing linkage maps, quantitative trait loci studies of complex traits, and marker-assisted selection.
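    The polymorphism information content (PIC) values compared in this abstract are computed from allele frequencies. The abstract does not say which formula the authors used, so the sketch below assumes the widely used definition of Botstein et al. (1980) for a codominant locus, PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². The function name `pic` and the example frequencies are hypothetical.

```python
def pic(freqs: list[float]) -> float:
    """Polymorphism information content of one locus (Botstein et al. form).

    freqs: allele frequencies at the locus; they must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    n = len(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1.0 - homozygosity - cross_term


# Two equally frequent alleles: 1 - 0.5 - 0.125
print(pic([0.5, 0.5]))  # 0.375
```

    A locus with a single allele has a PIC of 0, and the value rises as more alleles occur at more even frequencies, which is why marker systems are often ranked by their average PIC.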

  3. Analysis of unstable DNA sequence in FMR1 gene in Polish families with fragile X syndrome

    Energy Technology Data Exchange (ETDEWEB)

    Milewski, Michal; Bal, Jerzy; Obersztyn, Ewa; Bocian, Ewa; Mazurczak, Tadeusz [Instytut Matki i Dziecka, Warsaw (Poland); Zygulska, Marta; Horst, Juergen [Institute of Human Genetics, Muenster (Germany); Deelen, Wout H.; Halley, Dicky J.J. [Erasmus Univ., Rotterdam (Netherlands)

    1996-12-31

    The unstable DNA sequence in the FMR1 gene was analyzed in 85 individuals from Polish families with fragile X syndrome in order to characterize mutations responsible for the disease in Poland. In all affected individuals classified on the basis of clinical features and expression of the fragile site at X(q27.3), a large expansion of the unstable sequence (full mutation) was detected. About 5% (2 of 43) of individuals with the full mutation did not express the fragile site. Among normal alleles, ranging in size from 20 to 41 CGG repeats, the allele with 29 repeats was the most frequent (37%). Transmission of premutated and fully mutated alleles to the offspring was always associated with size increase. No change in repeat number was found when normal alleles were transmitted. (author). 19 refs., 4 figs, 1 tab.

  4. A hybrid swarm population of Pinus densiflora x P. sylvestris hybrids inferred from sequence analysis of chloroplast DNA and morphological characters

    Science.gov (United States)

    To confirm a hybrid swarm population of Pinus densiflora × P. sylvestris in Jilin, China and to study whether shoot apex morphology of 4-year old seedlings can be correlated with the sequence of a chloroplast DNA simple sequence repeat marker (cpDNA SSR), needles and seeds from P. densiflora, P. syl...

  5. A novel class of small repetitive DNA sequences in Enterococcus faecalis.

    Science.gov (United States)

    Venditti, Rossella; De Gregorio, Eliana; Silvestro, Giustina; Bertocco, Tullia; Salza, Maria Francesca; Zarrilli, Raffaele; Di Nocera, Pier Paolo

    2007-06-01

    We describe the structural organization of Enterococcus faecalis repeats (EFAR), palindromic DNA sequences identified in the genome of the Enterococcus faecalis V583 strain by in silico analyses. EFAR are a novel type of miniature insertion sequences, which vary in size from 42 to 650 bp. Length heterogeneity results from the variable assembly of 16 different sequence types. Most elements measure 170 bp, and can fold into peculiar L-shaped structures resulting from the folding of two independent stem-loop structures (SLSs). Homologous chromosomal regions lacking or containing EFAR sequences were identified by PCR among 20 E. faecalis clinical isolates of different genotypes. Sequencing of a representative set of 'empty' sites revealed that 24-37 bp-long sequences, unrelated to each other but all able to fold into SLSs, functioned as targets for the integration of EFAR. In the process, most of the SLS had been deleted, but part of the targeted stems had been retained at EFAR termini.

  6. Genome-wide stochastic adaptive DNA amplification at direct and inverted DNA repeats in the parasite Leishmania.

    Directory of Open Access Journals (Sweden)

    Jean-Michel Ubeda

    2014-05-01

    Full Text Available Gene amplification of specific loci has been described in all kingdoms of life. In the protozoan parasite Leishmania, the product of amplification is usually part of extrachromosomal circular or linear amplicons that are formed at the level of direct or inverted repeated sequences. A bioinformatics screen revealed that repeated sequences are widely distributed in the Leishmania genome and the repeats are chromosome-specific, conserved among species, and generally present in low copy number. Using sensitive PCR assays, we provide evidence that the Leishmania genome is continuously being rearranged at the level of these repeated sequences, which serve as a functional platform for constitutive and stochastic amplification (and deletion) of genomic segments in the population. This process is adaptive as the copy number of advantageous extrachromosomal circular or linear elements increases upon selective pressure and is reversible when selection is removed. We also provide mechanistic insights on the formation of circular and linear amplicons through RAD51 recombinase-dependent and -independent mechanisms, respectively. The whole genome of Leishmania is thus stochastically rearranged at the level of repeated sequences, and the selection of parasite subpopulations with changes in the copy number of specific loci is used as a strategy to respond to a changing environment.

  7. Genome-wide stochastic adaptive DNA amplification at direct and inverted DNA repeats in the parasite Leishmania.

    Science.gov (United States)

    Ubeda, Jean-Michel; Raymond, Frédéric; Mukherjee, Angana; Plourde, Marie; Gingras, Hélène; Roy, Gaétan; Lapointe, Andréanne; Leprohon, Philippe; Papadopoulou, Barbara; Corbeil, Jacques; Ouellette, Marc

    2014-05-01

    Gene amplification of specific loci has been described in all kingdoms of life. In the protozoan parasite Leishmania, the product of amplification is usually part of extrachromosomal circular or linear amplicons that are formed at the level of direct or inverted repeated sequences. A bioinformatics screen revealed that repeated sequences are widely distributed in the Leishmania genome and the repeats are chromosome-specific, conserved among species, and generally present in low copy number. Using sensitive PCR assays, we provide evidence that the Leishmania genome is continuously being rearranged at the level of these repeated sequences, which serve as a functional platform for constitutive and stochastic amplification (and deletion) of genomic segments in the population. This process is adaptive as the copy number of advantageous extrachromosomal circular or linear elements increases upon selective pressure and is reversible when selection is removed. We also provide mechanistic insights on the formation of circular and linear amplicons through RAD51 recombinase-dependent and -independent mechanisms, respectively. The whole genome of Leishmania is thus stochastically rearranged at the level of repeated sequences, and the selection of parasite subpopulations with changes in the copy number of specific loci is used as a strategy to respond to a changing environment.

  8. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    Energy Technology Data Exchange (ETDEWEB)

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S. [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa); Arbuthnot, Patrick, E-mail: Patrick.Arbuthnot@wits.ac.za [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa)

    2009-11-20

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  9. ATRF Houses the Latest DNA Sequencing Technologies | Poster

    Science.gov (United States)

    By Ashley DeVine, Staff Writer. By the end of October, the Advanced Technology Research Facility (ATRF) will be one of the few facilities in the world to house all of the latest DNA sequencing technologies.

  10. Effect of repeated sequential ejaculation on sperm DNA integrity in subfertile males with asthenozoospermia.

    Science.gov (United States)

    Hussein, T M; Elariny, A F; Elabd, M M; Elgarem, Y F; Elsawy, M M

    2008-10-01

    The aim of this work was to study the possible beneficial effect of repeated sequential ejaculation on sperm DNA integrity in subfertile males and its possible implementation in assisted reproduction. The study included 20 infertile males with idiopathic asthenozoospermia or oligoasthenozoospermia. They underwent detailed history taking, complete clinical assessment and hormonal assessment. Patients were asked to bring two semen samples (taken within 1-3 h). Two consecutive samples were assessed with regard to semen volume, sperm count, motility grading, and morphology and sperm DNA integrity using the comet assay. There was a significant improvement in the sperm motility pattern and DNA integrity in the second sample in comparison with the first sample. Therefore, it is concluded that due to its positive impact on sperm motility and DNA integrity, repeated sequential ejaculation is recommended in subfertile males with idiopathic asthenozoospermia who pursue assisted reproduction.

  11. The repeat domain of the type III effector protein PthA shows a TPR-like structure and undergoes conformational changes upon DNA interaction.

    Science.gov (United States)

    Murakami, Mário Tyago; Sforça, Mauricio Luis; Neves, Jorge Luiz; Paiva, Joice Helena; Domingues, Mariane Noronha; Pereira, André Luiz Araujo; Zeri, Ana Carolina de Mattos; Benedetti, Celso Eduardo

    2010-12-01

    Many plant pathogenic bacteria rely on effector proteins to suppress defense and manipulate host cell mechanisms to cause disease. The effector protein PthA modulates the host transcriptome to promote citrus canker. PthA possesses unusual protein architecture with an internal region encompassing variable numbers of near-identical tandem repeats of 34 amino acids termed the repeat domain. This domain mediates protein-protein and protein-DNA interactions, and two polymorphic residues in each repeat unit determine DNA specificity. To gain insights into how the repeat domain promotes protein-protein and protein-DNA contacts, we have solved the structure of a peptide corresponding to 1.5 units of the PthA repeat domain by nuclear magnetic resonance (NMR) and carried out small-angle X-ray scattering (SAXS) and spectroscopic studies on the entire 15.5-repeat domain of PthA2 (RD2). Consistent with secondary structure predictions and circular dichroism data, the NMR structure of the 1.5-repeat peptide reveals three α-helices connected by two turns that fold into a tetratricopeptide repeat (TPR)-like domain. The NMR structure corroborates the theoretical TPR superhelix predicted for RD2, which is also in agreement with the elongated shape of RD2 determined by SAXS. Furthermore, RD2 undergoes conformational changes in a pH-dependent manner and upon DNA interaction, and shows sequence similarities to pentatricopeptide repeat (PPR), a nucleic acid-binding motif structurally related to TPR. The results point to a model in which the RD2 structure changes its compactness as it embraces the DNA with the polymorphic diresidues facing the interior of the superhelix oriented toward the nucleotide bases.

  12. Repetitive DNA Sequences in Wheat and Its Relatives

    Institute of Scientific and Technical Information of China (English)

    ZHANG Xue-yong; LI Da-yong

    2001-01-01

    Repetitive DNA sequences form a large portion of eukaryote genomes. Using wheat (Triticum) as a model, the classification, features and functions of repetitive DNA sequences in the Triticeae grass tribe are reviewed, as well as the role of these sequences in genome differentiation and in the control and regulation of homologous chromosome synapsis and pairing. Transposable elements, as an important portion of dispersed repetitive sequences, may play an essential role in gene mutation of the host. Dynamic models for change of copy number and sequences of the repetitive family are also presented after the models of Charlesworth et al. Applications of repetitive DNA sequences in the study of evolution, chromosome fingerprinting, and marker-assisted gene transfer and breeding are described, taking wheat as an example.

  13. PREDICTION OF CHROMATIN STATES USING DNA SEQUENCE PROPERTIES

    KAUST Repository

    Bahabri, Rihab R.

    2013-06-01

    Activities of DNA are to a great extent controlled epigenetically through the internal structure of chromatin. This structure is dynamic and is influenced by different modifications of histone proteins. Various combinations of epigenetic modifications of histones point to different functional regions of the DNA, determining the so-called chromatin states. However, the characterization of chromatin states by DNA sequence properties remains largely unknown. In this study we aim to explore whether DNA sequence patterns in the human genome can characterize different chromatin states. Using DNA sequence motifs, we built binary classifiers for each chromatin state to evaluate whether a given genomic sequence is a good candidate for belonging to a particular chromatin state. Of four classification algorithms (C4.5, Naive Bayes, Random Forest, and SVM) used for this purpose, the decision tree based classifiers (C4.5 and Random Forest) yielded the best results among those we evaluated. Our results suggest that in general these models lack sufficient predictive power, although for four chromatin states (insulators, heterochromatin, and two types of copy number variation) we found that the presence of certain motifs in DNA sequences does imply an increased probability that such a sequence is one of these chromatin states.

  14. Effects of sequence on DNA wrapping around histones

    Science.gov (United States)

    Ortiz, Vanessa

    2011-03-01

    A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).

  15. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    DEFF Research Database (Denmark)

    Nielsen, Peter E.

    2008-01-01

    sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technology of protein dsDNA structures. (c) 2008 American Institute of Physics.

  17. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross...

  18. Biometric Authentication Using ElGamal Cryptosystem And DNA Sequence

    Directory of Open Access Journals (Sweden)

    V.SAMUEL SUSAN

    2010-06-01

    Full Text Available Biometrics are automated methods of identifying a person or verifying the identity of a person based on a physiological or behavioral characteristic. Physiological characteristics include hand or finger images, facial characteristics and iris recognition. Behavioral characteristics include dynamic signature verification, speaker verification and keystroke dynamics. DNA is a unique feature among individuals. DNA provides a high security level, long-term stability and user acceptance, and is intrusive. Combining the ElGamal cryptosystem and DNA sequence, a novel biometric authentication scheme is proposed.

  19. Protein sequence for clustering DNA based on Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Gamal. F. Elhadi

    2012-01-01

    Full Text Available DNA is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms and some viruses. Clustering is a process that groups a set of objects into clusters so that the similarity among objects in the same cluster is high, while that among the objects in different clusters is low. In this paper, we propose an approach for clustering DNA sequences using the Self-Organizing Map (SOM) algorithm and protein sequence. The main objective is to analyze biological data and to group DNA into clusters more easily and efficiently. We use the proposed approach to analyze both large and small numbers of input DNA sequences. The results show that the similarity of the sequences does not depend on the number of input sequences. Our approach depends on evaluating the degree of DNA sequence similarity using a hierarchical dendrogram representation. Representing large amounts of data using a hierarchical tree gives the ability to compare large sequences efficiently.

  20. A highly conserved repeated chromosomal sequence in the radioresistant bacterium Deinococcus radiodurans SARK.

    Science.gov (United States)

    Lennon, E; Gutman, P D; Yao, H L; Minton, K W

    1991-03-01

    A DNA fragment containing a portion of a DNA damage-inducible gene from Deinococcus radiodurans SARK hybridized to numerous fragments of SARK genomic DNA because of a highly conserved repetitive chromosomal element. The element is of variable length, ranging from 150 to 192 bp, depending on the absence or presence of one or two 21-bp sequences located internally. A putative translational start site of the damage-inducible gene is within the reiterated element. The element contains dyad symmetries that suggest modes of transcriptional and/or translational control.

  1. Sequencing and Analysis of Neanderthal Genomic DNA

    OpenAIRE

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo, Svante; Pritchard, Jonathan K; Rubin, Edward M.

    2006-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library a...

  2. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    Science.gov (United States)

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  3. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie;

    2014-01-01

    sections for electron induced single strand breaks in specific 13 mer oligonucleotides we used atomic force microscopy analysis of DNA origami based DNA nanoarrays. We investigated the DNA sequences 5'-TT(XYX)3TT with X = A, G, C and Y = T, BrU 5-bromouracil and found absolute strand break cross sections...

  4. Algorithms for mapping high-throughput DNA sequences

    DEFF Research Database (Denmark)

    Frellsen, Jes; Menzel, Peter; Krogh, Anders

    2014-01-01

    High-throughput sequencing (HTS) technologies revolutionized the field of molecular biology by enabling large scale whole genome sequencing as well as a broad range of experiments for studying the cell's inner workings directly on DNA or RNA level. Given the dramatically increased rate...

  5. Ancient DNA sequence revealed by error-correcting codes.

    Science.gov (United States)

    Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

    2015-07-10

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.

  6. Structure and chromosomal localization of DNA sequences related to ribosomal subrepeats in Vicia faba.

    Science.gov (United States)

    Maggini, F; Cremonini, R; Zolfino, C; Tucci, G F; D'Ovidio, R; Delre, V; DePace, C; Scarascia Mugnozza, G T; Cionini, P G

    1991-05-01

    Subrepeating sequences of 325 bp found in the ribosomal intergenic spacer (IGS) of Vicia faba and responsible for variations in the length of the polycistronic units for rRNA were isolated and used as probes for in situ hybridization. Hybridization occurs at many regions of the metaphase chromosomes besides those bearing rRNA genes, namely chromosome ends and all the heterochromatic regions revealed by enhanced fluorescence after quinacrine staining. The DNA homologous to the 325 bp repeats that does not reside in the IGS was isolated, cloned and sequenced. It is composed of tandemly arranged 336 bp elements, each comprising two highly related 168 bp sequences. This structure is very similar to that of the IGS repeats and ca. 75% nucleotide sequence identity can be observed between these and the 168 bp doublets. The most obvious difference lies in the deletion, in the former, of a 14 bp segment from one of the two related sequences. It is hypothesized that the IGS repeats are derived from the 336 bp elements and have been transposed to ribosomal cistrons from other genome fractions. The possible relations between these sequences and others with similar structural features found in other species are discussed.

  7. Subnuclear relocalization and silencing of a chromosomal region by an ectopic ribosomal DNA repeat

    DEFF Research Database (Denmark)

    Jakociunas, Tadas; Domange Jordö, Marie Elise; Mebarek, Mazhoura Aït;

    2013-01-01

    dimerization, providing a mechanism for the observed relocalization. Replacing the full rDNA repeat with Reb1-binding sites, and using mutants lacking the histone H3K9 methyltransferase Clr4, indicated that the relocalized region was silenced redundantly by heterochromatin and another mechanism, plausibly...

  8. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian eWu

    2012-11-01

    Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the chloroplast genome.

  9. The HumD21S11 system of short tandem repeat DNA polymorphisms in Japanese and Chinese.

    Science.gov (United States)

    Zhou, H G; Sato, K; Nishimaki, Y; Fang, L; Hasekura, H

    1997-04-18

    HumD21S11 is a short tandem repeat DNA polymorphic system with a complex basic structure of (TCTA)4-6 (TCTG)5-6 (TCTA)3 TA (TCTA)3 TCA (TCTA)2 TCCA TA (TCTA)n. Using the allelic ladder prepared by us, the distribution of alleles among Japanese and Chinese was investigated, and four new alleles 28.2, 34, 35.2, and 36.2, were discovered. DNA sequencing was performed on the newly found alleles as well as on family samples and led to the discovery of different gene structures within alleles 28 and 32. Forensic materials, including hairs and seminal stains, were tested in parallel with blood samples from the same individual and were successfully typed for D21S11.

  10. Zinc finger recombinases with adaptable DNA sequence specificity.

    Directory of Open Access Journals (Sweden)

    Chris Proudfoot

    Site-specific recombinases have become essential tools in genetics and molecular biology for the precise excision or integration of DNA sequences. However, their utility is currently limited to circumstances where the sites recognized by the recombinase enzyme have been introduced into the DNA being manipulated, or natural 'pseudosites' are already present. Many new applications would become feasible if recombinase activity could be targeted to chosen sequences in natural genomic DNA. Here we demonstrate efficient site-specific recombination at several sequences taken from a 1.9 kilobasepair locus of biotechnological interest (in the bovine β-casein gene), mediated by zinc finger recombinases (ZFRs), chimaeric enzymes with linked zinc finger (DNA recognition) and recombinase (catalytic) domains. In the "Z-sites" tested here, 22 bp casein gene sequences are flanked by 9 bp motifs recognized by zinc finger domains. Asymmetric Z-sites were recombined by the concomitant action of two ZFRs with different zinc finger DNA-binding specificities, and could be recombined with a heterologous site in the presence of a third recombinase. Our results show that engineered ZFRs may be designed to promote site-specific recombination at many natural DNA sequences.

  11. cDNA cloning and sequencing of ostrich Growth hormone

    Directory of Open Access Journals (Sweden)

    Doosti Abbas

    2012-01-01

    In recent years, industrial breeding of ostrich (Struthio camelus) has been widely developed in Iran. Growth hormone (GH) is a peptide hormone that stimulates growth and cell reproduction in different animals. The aim of this study was to clone and sequence the ostrich growth hormone gene in E. coli, done for the first time in Iran. The cDNA that encodes ostrich growth hormone was isolated from total mRNA of the pituitary gland and amplified by RT-PCR using GH-specific PCR primers. The GH cDNA was then cloned by the T/A cloning technique and the construct was transformed into E. coli. Finally, the GH cDNA sequence was submitted to GenBank (Accession number: JN559394). The results of the present study showed that GH cDNA was successfully cloned in E. coli. Sequencing confirmed that GH cDNA was cloned and that the length of ostrich GH cDNA was 672 bp; a BLAST search showed that the sequence of growth hormone cDNA of the ostrich from Iran has 100% homology with other records existing in GenBank.

  12. Polyamide platinum anticancer complexes designed to target specific DNA sequences.

    Science.gov (United States)

    Jaramillo, David; Wheate, Nial J; Ralph, Stephen F; Howard, Warren A; Tor, Yitzhak; Aldrich-Wright, Janice R

    2006-07-24

    Two new platinum complexes, trans-chlorodiammine[N-(2-aminoethyl)-4-[4-(N-methylimidazole-2-carboxamido)-N-methylpyrrole-2-carboxamido]-N-methylpyrrole-2-carboxamide]platinum(II) chloride (DJ1953-2) and trans-chlorodiammine[N-(6-aminohexyl)-4-[4-(N-methylimidazole-2-carboxamido)-N-methylpyrrole-2-carboxamido]-N-methylpyrrole-2-carboxamide]platinum(II) chloride (DJ1953-6) have been synthesized as proof-of-concept molecules in the design of agents that can specifically target genes in DNA. Coordinate covalent binding to DNA was demonstrated with electrospray ionization mass spectrometry. Using circular dichroism, these complexes were found to show greater DNA binding affinity to the target sequence, d(CATTGTCAGAC)(2), than toward either d(GTCTGTCAATG)(2), which contains different flanking sequences, or d(CATTGAGAGAC)(2), which contains a double base pair mismatch sequence. DJ1953-2 unwinds the DNA helix by around 13 degrees, but neither metal complex significantly affects the DNA melting temperature. Unlike simple DNA minor groove binders, DJ1953-2 is able to inhibit, in vitro, RNA synthesis. The cytotoxicity of both metal complexes in the L1210 murine leukaemia cell line was also determined, with DJ1953-6 (34 microM) more active than DJ1953-2 (>50 microM). These results demonstrate the potential of polyamide platinum complexes and provide the structural basis for designer agents that are able to recognize biologically relevant sequences and prevent DNA transcription and replication.

  13. Selective binding of anti-DNA antibodies to native dsDNA fragments of differing sequence.

    Science.gov (United States)

    Uccellini, Melissa B; Busto, Patricia; Debatis, Michelle; Marshak-Rothstein, Ann; Viglianti, Gregory A

    2012-03-30

    Systemic autoimmune diseases are characterized by the development of autoantibodies directed against a limited subset of nuclear antigens, including DNA. DNA-specific B cells take up mammalian DNA through their B cell receptor, and this DNA is subsequently transported to an endosomal compartment where it can potentially engage TLR9. We have previously shown that ssDNA-specific B cells preferentially bind to particular DNA sequences, and antibody specificity for short synthetic oligodeoxynucleotides (ODNs). Since CpG-rich DNA, the ligand for TLR9, is found in low abundance in mammalian DNA, we sought to determine whether antibodies derived from DNA-reactive B cells showed binding preference for CpG-rich native dsDNA, and thereby select immunostimulatory DNA for delivery to TLR9. We examined a panel of anti-DNA antibodies for binding to CpG-rich and CpG-poor DNA fragments. We show that a number of anti-DNA antibodies do show preference for binding to certain native dsDNA fragments of differing sequence, but this does not correlate directly with the presence of CpG dinucleotides. An antibody with preference for binding to a fragment containing optimal CpG motifs was able to promote B cell proliferation to this fragment at 10-fold lower antibody concentrations than an antibody that did not selectively bind to this fragment, indicating that antibody binding preference can influence autoreactive B cell responses.

  14. DNA sequence analysis of X-ray induced Adh null mutations in Drosophila melanogaster

    Energy Technology Data Exchange (ETDEWEB)

    Mahmoud, J.; Fossett, N.G.; Arbour-Reily, P.; McDaniel, M.; Tucker, A.; Chang, S.H.; Lee, W.R. (Louisiana State Univ., Baton Rouge (United States))

    1991-01-01

    The mutational spectrum for 28 X-ray induced mutations and 2 spontaneous mutations, previously determined by genetic and cytogenetic methods, consisted of 20 multilocus deficiencies (19 induced and 1 spontaneous) and 10 intragenic mutations (9 induced and 1 spontaneous). One of the X-ray induced intragenic mutations was lost, and another was determined to be a recombinant with the allele used in the recovery scheme. The DNA sequence of two X-ray induced intragenic mutations has been published. This paper reports the results of DNA sequence analysis of the remaining intragenic mutations and a summary of the X-ray induced mutational spectrum. The combination of DNA sequence analysis with genetic complementation analysis shows a continuous distribution in size of deletions rather than two different types of mutations consisting of deletions and point mutations. Sequencing is shown to be essential for detecting intragenic deletions. Of particular importance for future studies is the observation that all of the intragenic deletions consist of a direct repeat adjacent to the breakpoint with one of the repeats deleted.

  15. Nanopore-based Fourth-generation DNA Sequencing Technology

    Institute of Scientific and Technical Information of China (English)

    Yanxiao Feng; Yuechuan Zhang; Cuifeng Ying; Deqiang Wang; Chunlei Du

    2015-01-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications.

  16. High Sequence Variations in Mitochondrial DNA Control Region among Worldwide Populations of Flathead Mullet Mugil cephalus

    Directory of Open Access Journals (Sweden)

    Brian Wade Jamandre

    2014-01-01

    The sequence and structure of the complete mtDNA control region (CR) of M. cephalus from African, Pacific, and Atlantic populations are presented in this study to assess its usefulness in phylogeographic studies of this species. The mtDNA CR sequence variations among M. cephalus populations largely exceeded intraspecific polymorphisms that are generally observed in other vertebrates. The length of the CR sequence varied among M. cephalus populations due to the presence of indels and a variable number of tandem repeats at the 3′ hypervariable domain. The high evolutionary rate of the CR in this species probably originated from these mutations. However, no excessive homoplasic mutations were noticed. Finally, the star-shaped tree inferred from the CR polymorphism points to a rapid worldwide radiation in this species. The CR still appears to be a good marker for phylogeographic investigations, and additional worldwide samples are warranted to further investigate the genetic structure and evolution of M. cephalus.

  17. Transcriptome analysis reveals ginsenosides biosynthetic genes, microRNAs and simple sequence repeats in Panax ginseng C. A. Meyer

    Science.gov (United States)

    2013-01-01

    Background: Panax ginseng C. A. Meyer is one of the most widely used medicinal plants. Complete genome information for this species remains unavailable due to its large genome size. At present, analysis of expressed sequence tags is still the most powerful tool for large-scale gene discovery. The global expressed sequence tags from P. ginseng tissues, especially those isolated from stems, leaves and flowers, are still limited, hindering in-depth study of P. ginseng. Results: Two 454 pyrosequencing runs generated a total of 2,423,076 reads from P. ginseng roots, stems, leaves and flowers. The high-quality reads from each of the tissues were independently assembled into separate and shared contigs. In the separately assembled database, 45,849, 6,172, 4,041 and 3,273 unigenes were only found in the roots, stems, leaves and flowers database, respectively. In the jointly assembled database, 178,145 unigenes were observed, including 86,609 contigs and 91,536 singletons. Among the 178,145 unigenes, 105,522 were identified for the first time, of which 65.6% were identified in the stem, leaf or flower cDNA libraries of P. ginseng. After annotation, we discovered 223 unigenes involved in ginsenoside backbone biosynthesis. Additionally, a total of 326 potential cytochrome P450 and 129 potential UDP-glycosyltransferase sequences were predicted based on the annotation results, some of which may encode enzymes responsible for ginsenoside backbone modification. A BLAST search of the obtained high-quality reads identified 14 potential microRNAs in P. ginseng, which were estimated to target 100 protein-coding genes, including transcription factors, transporters and DNA binding proteins, among others. In addition, a total of 13,044 simple sequence repeats were identified from the 178,145 unigenes. Conclusions: This study provides global expressed sequence tags for P. ginseng, which will contribute significantly to further genome-wide research and analyses in this species. The novel

  18. Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species

    Science.gov (United States)

    Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha

    2011-01-01

    Computational genomics is one of the important tools to understand the distribution of closely related genomes including simple sequence repeats (SSRs) in an organism, which gives valuable information regarding genetic variations. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species which are involved in a range of pathological disorders. Computational analysis of the SSRs in the Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetranucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that over expressed tri-nucleotide SSRs in genomic and coding regions might be responsible in the generation of functional variation of proteins expressed which in turn may lead to different pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309
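
    The abstract above does not say which tool or thresholds were used to screen for SSRs. As a rough illustration only, a perfect-repeat scan can be sketched as follows; the function names and the minimum copy numbers in MIN_COPIES are hypothetical choices, not values taken from the study.

```python
import re

# Hypothetical thresholds: minimum number of tandem copies per motif length.
MIN_COPIES = {2: 5, 3: 4, 4: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq, min_copies=MIN_COPIES):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, copies in sorted(min_copies.items()):
        # One motif of length k followed by at least (copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "GG" + "AT" * 6 + "CC" + "CAG" * 4 + "T"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

    Counting hits per motif length across coding and non-coding intervals would then give the kind of di-, tri- and tetranucleotide tallies the abstract compares against random expectation.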

  19. Mitochondrial DNA sequence analysis of two mouse hepatocarcinoma cell lines

    Institute of Scientific and Technical Information of China (English)

    Ji-Gang Dai; Xia Lei; Jia-Xin Min; Guo-Qiang Zhang; Hong Wei

    2005-01-01

    AIM: To study genetic difference of mitochondrial DNA (mtDNA) between two hepatocarcinoma cell lines (Hca-F and Hca-P) with diverse metastatic characteristics and the relationship between mtDNA changes in cancer cells and their oncogenic phenotype. METHODS: Mitochondrial DNA D-loop, tRNAMet+Glu+Ile and ND3 gene fragments from the hepatocarcinoma cell lines with 1100, 1126 and 534 bp in length respectively were analysed by PCR amplification and restriction fragment length polymorphism techniques. The D-loop 3' end sequence of the hepatocarcinoma cell lines was determined by sequencing. RESULTS: No amplification fragment length polymorphism and restriction fragment length polymorphism were observed in tRNAMet+Glu+Ile, ND3 and D-loop of mitochondrial DNA of the hepatocarcinoma cells. Sequence differences between Hca-F and Hca-P were found in mtDNA D-loop. CONCLUSION: Deletion mutations of mitochondrial DNA restriction fragment may not play a significant role in carcinogenesis. Genetic difference of mtDNA D-loop between Hca-F and Hca-P, which may reflect the environmental and genetic influences during tumor progression, could be linked to their tumorigenic phenotypes.

  20. Palindromic sequence artifacts generated during next generation sequencing library preparation from historic and ancient DNA.

    Directory of Open Access Journals (Sweden)

    Bastiaan Star

    Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5' and 3'-ends of sequencing reads. The palindromic sequences themselves have specific properties: the bases at the 5'-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3'-end. The terminal 3' bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3'-end of DNA strands, with the 5'-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias.
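
    The artifact described above is a read whose 3'-end is the reverse complement of its own 5'-end, with unrelated bases in between. A minimal sketch of a per-read check is given below; the function names and the min_len cutoff are assumptions made for illustration and are not the authors' screening procedure.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a DNA string (A/C/G/T/N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def terminal_palindrome_length(read, min_len=4):
    """Longest k >= min_len such that the last k bases of the read are the
    reverse complement of its first k bases; 0 if there is no such k.

    min_len is a hypothetical cutoff to ignore short chance matches.
    """
    read = read.upper()
    best = 0
    for k in range(min_len, len(read) // 2 + 1):
        if read[-k:] == revcomp(read[:k]):
            best = k
    return best


if __name__ == "__main__":
    head = "GATTACAG"
    artifact = head + "TTTTT" + revcomp(head)   # interrupted palindrome
    normal = head + "TTTTT" + head
    print(terminal_palindrome_length(artifact))
    print(terminal_palindrome_length(normal))
```

    Exact matching is used here for simplicity; real reads would likely need some tolerance for sequencing errors, which this sketch does not attempt.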

  1. Genetic Diversity Assessment and Identification of New Sour Cherry Genotypes Using Intersimple Sequence Repeat Markers

    OpenAIRE

    Roghayeh Najafzadeh; Kazem Arzani; Naser Bouzari; Ali Saei

    2014-01-01

    Iran is one of the chief origins of subgenus Cerasus germplasm. In this study, the genetic variation of new Iranian sour cherries (which had such superior growth characteristics and fruit quality as to be considered for the introduction of new cultivars) was investigated and identified using 23 intersimple sequence repeat (ISSR) markers. Results indicated a high level of polymorphism of the genotypes based on these markers. According to these results, primers tested in this study specially IS...

  2. Inter simple sequence repeat fingerprints for assess genetic diversity of tunisian garlic populations

    OpenAIRE

    Jabbes, Naouel; Geoffriau, Emmanuel; Le Clerc, Valérie; Dridi, Boutheina; Hannechi, Chérif

    2011-01-01

    Garlic (Allium sativum L.) that is cultivated in Tunisia is heterogeneous and unclassified with no registered local cultivars. At present, the level of genetic diversity in Tunisian garlic is almost unknown. Inter Simple Sequence Repeats (ISSR) genetic markers were therefore used to assess the genetic diversity and its distribution in 31 Tunisian garlic accessions with 4 French classified clones used as control. It was the first time that ISSR markers were used to detect diversity in garlic. ...

  3. PCR primers for metazoan mitochondrial 12S ribosomal DNA sequences.

    Directory of Open Access Journals (Sweden)

    Ryuji J Machida

    BACKGROUND: Assessment of the biodiversity of communities of small organisms is most readily done using PCR-based analysis of environmental samples consisting of mixtures of individuals. Known as metagenetics, this approach has transformed understanding of microbial communities and is beginning to be applied to metazoans as well. Unlike microbial studies, where analysis of the 16S ribosomal DNA sequence is standard, the best gene for metazoan metagenetics is less clear. In this study we designed a set of PCR primers for the mitochondrial 12S ribosomal DNA sequence based on 64 complete mitochondrial genomes and then tested their efficacy. METHODOLOGY/PRINCIPAL FINDINGS: A total of 64 complete mitochondrial genome sequences representing all metazoan classes available in GenBank were downloaded using the NCBI Taxonomy Browser. Alignment of sequences was performed for the excised mitochondrial 12S ribosomal DNA sequences, and conserved regions were identified for all 64 mitochondrial genomes. These regions were used to design a primer pair that flanks a more variable region in the gene. Then all of the complete metazoan mitochondrial genomes available in NCBI's Organelle Genome Resources database were used to determine the percentage of taxa that would likely be amplified using these primers. Results suggest that these primers will amplify target sequences for many metazoans. CONCLUSIONS/SIGNIFICANCE: Newly designed 12S ribosomal DNA primers have considerable potential for metazoan metagenetic analysis because of their ability to amplify sequences from many metazoans.
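
    The primer design step above depends on locating conserved regions in an alignment. The abstract does not describe how this was done; the sketch below shows one simple possibility, reporting runs of alignment columns that are identical and gap-free in every sequence. The function name and the min_len parameter are hypothetical.

```python
def conserved_runs(alignment, min_len=4):
    """Return half-open (start, end) column ranges in which every aligned
    sequence carries the same non-gap character.

    `alignment` is a list of equal-length strings; '-' marks a gap.
    """
    if not alignment:
        return []
    ncols = len(alignment[0])
    if any(len(row) != ncols for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    runs, start = [], None
    # Iterate one column past the end so a trailing run is closed as well.
    for col in range(ncols + 1):
        conserved = (
            col < ncols
            and alignment[0][col] != "-"
            and len({row[col] for row in alignment}) == 1
        )
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_len:
                runs.append((start, col))
            start = None
    return runs


if __name__ == "__main__":
    toy = ["ACGTAAGGCT", "ACGTCAGGCT", "ACGTTAGGCT"]
    print(conserved_runs(toy))
```

    Two conserved runs that flank a variable stretch, as in the toy alignment, are the kind of layout a primer pair would be placed on; practical primer design also weighs melting temperature and degeneracy, which are outside this sketch.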

  4. Recognizing a Single Base in an Individual DNA Strand: A Step Toward Nanopore DNA Sequencing**

    Science.gov (United States)

    Ashkenasy, N.; Sánchez-Quesada, J.; Ghadiri, M. R.; Bayley, H.

    2007-01-01

    Functional supramolecular chemistry at the single-molecule level. Single strands of DNA can be captured inside α-hemolysin transmembrane pore protein to form single-species α-HL·DNA pseudorotaxanes. This process can be used to identify a single adenine nucleotide at a specific location on a strand of DNA by the characteristic reductions in the α-HL ion conductance. This study suggests that α-HL-mediated single-molecule DNA sequencing might be fundamentally feasible. PMID:15666419

  5. Analysis of sequence variation in Gnathostoma spinigerum mitochondrial DNA by single-strand conformation polymorphism analysis and DNA sequence.

    Science.gov (United States)

    Ngarmamonpirat, Charinthon; Waikagul, Jitra; Petmitr, Songsak; Dekumyoy, Paron; Rojekittikhun, Wichit; Anantapruti, Malinee T

    2005-03-01

    Morphological variations were observed in the advance third stage larvae of Gnathostoma spinigerum collected from swamp eel (Fluta alba), the second intermediate host. Larvae with typical and three atypical types were chosen for partial cytochrome c oxidase subunit I (COI) gene sequence analysis. A 450 bp polymerase chain reaction product of the COI gene was amplified from mitochondrial DNA. The variations were analyzed by single-strand conformation polymorphism and DNA sequencing. The nucleotide variations of the COI gene in the four types of larvae indicated the presence of an intra-specific variation of mitochondrial DNA in the G. spinigerum population.

  6. Development and characterization of simple sequence repeats for Bipolaris sorokiniana and cross transferability to related species.

    Science.gov (United States)

    Fajolu, Oluseyi L; Wadl, Phillip A; Vu, Andrea L; Gwinn, Kimberly D; Scheffler, Brian E; Trigiano, Robert N; Ownley, Bonnie H

    2013-01-01

    Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n = 384) harbored SSR motifs. After eliminating redundant sequences, 196 SSR loci were identified, of which 84.7% were dinucleotide repeats and 9.7% and 5.6% were tri- and tetra-nucleotide repeats, respectively. Primer pairs were designed for 105 loci and 85 successfully amplified loci. Sixteen polymorphic loci were characterized with 15 B. sorokiniana isolates obtained from infected switchgrass plant materials collected from five states in USA. These loci successfully cross-amplified isolates from at least one related species, including Bipolaris oryzae, Bipolaris spicifera and Bipolaris victoriae, that causes leaf spot on switchgrass. Haploid gene diversity per locus across all isolates studied varied 0.633-0.861. Principal component analysis of SSR data clustered isolates according to their respective species. These SSR markers will be a valuable tool for genetic variability and population studies of B. sorokiniana and related species that are pathogenic on switchgrass and other host plants. In addition, these markers are potential diagnostic tools for species in the genus Bipolaris.

  7. Detection of possible restriction sites for type II restriction enzymes in DNA sequences.

    Science.gov (United States)

    Gagniuc, P; Cimponeriu, D; Ionescu-Tîrgovişte, C; Mihai, Andrada; Stavarachi, Monica; Mihai, T; Gavrilă, L

    2011-01-01

    In order to make a step forward in the knowledge of the mechanism operating in complex polygenic disorders such as diabetes and obesity, this paper proposes a new algorithm (PRSD, possible restriction site detection) and its implementation in Applied Genetics software. This software can be used for in silico detection of potential (hidden) recognition sites for endonucleases and for identification of nucleotide repeats. The recognition sites for endonucleases may result from hidden sequences through deletion or insertion of a specific number of nucleotides. Tests were conducted on DNA sequences downloaded from NCBI servers using specific recognition sites for common type II restriction enzymes introduced in the software database (n = 126). Each possible recognition site indicated by the PRSD algorithm implemented in Applied Genetics was checked and confirmed by NEBcutter V2.0 and Webcutter 2.0 software. In the sequence NG_008724.1 (which includes 63632 nucleotides) we found a high number of potential restriction sites for EcoRI that may be produced by deletion (n = 43 sites) or insertion (n = 591 sites) of one nucleotide. The second module of Applied Genetics has been designed to find simple repeat sizes with a real future in understanding the role of SNPs (Single Nucleotide Polymorphisms) in the pathogenesis of complex metabolic disorders. We have tested the presence of simple repetitive sequences in five DNA sequences. The software indicated the exact position of each repeat detected in the tested sequences. Future development of Applied Genetics can provide an alternative to powerful tools used to search for restriction sites or repetitive sequences, or to improve genotyping methods.
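
    The PRSD algorithm itself is not specified in the abstract. The sketch below only illustrates the underlying idea for a single enzyme, using the EcoRI recognition sequence GAATTC: it reports windows that would become a site after deleting or inserting one internal nucleotide. The function names, and the choice to consider internal edits only, are assumptions of this sketch.

```python
ECORI = "GAATTC"  # EcoRI recognition sequence


def one_base_short(site):
    """All strings obtained by removing one internal base from the site."""
    return {site[:j] + site[j + 1:] for j in range(1, len(site) - 1)}


def hidden_sites_by_deletion(seq, site=ECORI):
    """Start positions of (len(site)+1)-mers that turn into `site` when one
    internal base is deleted. Windows already containing the site are skipped."""
    seq, n = seq.upper(), len(site)
    hits = []
    for i in range(len(seq) - n):
        window = seq[i:i + n + 1]
        if site in window:
            continue
        if any(window[:j] + window[j + 1:] == site for j in range(1, n)):
            hits.append(i)
    return hits


def hidden_sites_by_insertion(seq, site=ECORI):
    """Start positions of (len(site)-1)-mers that turn into `site` when one
    base is inserted at an internal position."""
    seq, n = seq.upper(), len(site)
    targets = one_base_short(site)
    return [i for i in range(len(seq) - n + 2) if seq[i:i + n - 1] in targets]


if __name__ == "__main__":
    print(hidden_sites_by_deletion("TTGAAATTCTT"))   # GAAATTC has one extra A
    print(hidden_sites_by_insertion("CCGATTCGG"))    # GATTC lacks one A
```

    Counts produced this way depend on how overlapping windows and edits at the site boundaries are treated, so they should not be expected to reproduce the figures reported for NG_008724.1.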

  8. Exploiting BAC-end sequences for the mining, characterization and utility of new short sequences repeat (SSR) markers in Citrus.

    Science.gov (United States)

    Biswas, Manosh Kumar; Chai, Lijun; Mayer, Christoph; Xu, Qiang; Guo, Wenwu; Deng, Xiuxin

    2012-05-01

    The aim of this study was to develop a large set of microsatellite markers based on publicly available BAC-end sequences (BESs), and to evaluate their transferability, discriminating capacity of genotypes and mapping ability in Citrus. A set of 1,281 simple sequence repeat (SSR) markers were developed from the 46,339 Citrus clementina BAC-end sequences (BES), of them 20.67% contained SSR longer than 20 bp, corresponding to roughly one perfect SSR per 2.04 kb. The most abundant motifs were di-nucleotide (16.82%) repeats. Among all repeat motifs (TA/AT)n is the most abundant (8.38%), followed by (AG/CT)n (4.51%). Most of the BES-SSR are located in the non-coding region, but 1.3% of BES-SSRs were found to be associated with transposable element (TE). A total of 400 novel SSR primer pairs were synthesized and their transferability and polymorphism tested on a set of 16 Citrus and Citrus relative's species. Among these 333 (83.25%) were successfully amplified and 260 (65.00%) showed cross-species transferability with Poncirus trifoliata and Fortunella sp. These cross-species transferable markers could be useful for cultivar identification, for genomic study of Citrus, Poncirus and Fortunella sp. Utility of the developed SSR marker was demonstrated by identifying a set of 118 markers each for construction of linkage map of Citrus reticulata and Poncirus trifoliata. Genetic diversity and phylogenetic relationship among 40 Citrus and its related species were conducted with the aid of 25 randomly selected SSR primer pairs and results revealed that citrus genomic SSRs are superior to genic SSR for genetic diversity and germplasm characterization of Citrus spp.

  9. Probabilistic models for semisupervised discriminative motif discovery in DNA sequences.

    Science.gov (United States)

    Kim, Jong Kyoung; Choi, Seungjin

    2011-01-01

    Methods for discriminative motif discovery in DNA sequences identify transcription factor binding sites (TFBSs), searching only for patterns that differentiate two sets (positive and negative sets) of sequences. On one hand, discriminative methods increase the sensitivity and specificity of motif discovery, compared to generative models. On the other hand, generative models can easily exploit unlabeled sequences to better detect functional motifs when labeled training samples are limited. In this paper, we develop a hybrid generative/discriminative model which enables us to make use of unlabeled sequences in the framework of discriminative motif discovery, leading to semisupervised discriminative motif discovery. Numerical experiments on yeast ChIP-chip data for discovering DNA motifs demonstrate that the best performance is obtained between the purely-generative and the purely-discriminative and the semisupervised learning improves the performance when labeled sequences are limited.

  10. Repeated-Sprint Sequences During Female Soccer Matches Using Fixed and Individual Speed Thresholds.

    Science.gov (United States)

    Nakamura, Fábio Y; Pereira, Lucas A; Loturco, Irineu; Rosseti, Marcelo; Moura, Felipe A; Bradley, Paul S

    2017-07-01

    Nakamura, FY, Pereira, LA, Loturco, I, Rosseti, M, Moura, FA, and Bradley, PS. Repeated-sprint sequences during female soccer matches using fixed and individual speed thresholds. J Strength Cond Res 31(7): 1802-1810, 2017. The main objective of this study was to characterize the occurrence of single sprint and repeated-sprint sequences (RSS) during elite female soccer matches, using fixed (20 km·h⁻¹) and individually based speed thresholds (>90% of the mean speed from a 20-m sprint test). Eleven elite female soccer players from the same team participated in the study. All players performed a 20-m linear sprint test, and were assessed in up to 10 official matches using Global Positioning System technology. Magnitude-based inferences were used to test for meaningful differences. Results revealed that irrespective of adopting fixed or individual speed thresholds, female players produced only a few RSS during matches (2.3 ± 2.4 sequences using the fixed threshold and 3.3 ± 3.0 sequences using the individually based threshold), with most sequences consisting of just 2 sprints. Additionally, central defenders performed fewer sprints (10.2 ± 4.1) than other positions (fullbacks: 28.1 ± 5.5; midfielders: 21.9 ± 10.5; forwards: 31.9 ± 11.1; with the differences being likely to almost certainly associated with effect sizes ranging from 1.65 to 2.72), and sprinting ability declined in the second half. The data do not support the notion that RSS occurs frequently during soccer matches in female players, irrespective of using fixed or individual speed thresholds to define sprint occurrence. However, repeated-sprint ability development cannot be ruled out from soccer training programs because of its association with match-related performance.

  11. Chaos game representation (CGR)-walk model for DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Gao Jie; Xu Zhen-Yuan

    2009-01-01

    Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to determine the coordinates of their positions in a continuous space. This distribution of positions has two features: it is unique, and the source sequence can be recovered from the coordinates, so that the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model is proposed based on CGR coordinates for the DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA (p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into the DNA sequence analysis. This model is applied to simulating real CGR-walk sequence data of ten genomic sequences. Remarkable long-range correlations are uncovered in the data, and the results from these models are reasonably fitted with those from the ARFIMA (p, d, q) model.
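
    The iterative mapping described in this abstract can be illustrated with a minimal sketch. The corner assignment (A, C, G, T on the unit square), the starting point at the centre, and the function names are assumptions made for illustration; the abstract does not specify them, and the CGR-walk and ARFIMA steps are not reproduced here. Exact fractions are used so that the source sequence can be recovered from the final coordinate without rounding error.

```python
from fractions import Fraction

# Assumed corner layout of the unit square (not taken from the abstract).
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_points(sequence):
    """Return one CGR point per nucleotide, starting from the centre."""
    x = y = Fraction(1, 2)
    points = []
    for base in sequence.upper():
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def recover_sequence(final_point, length):
    """Invert the mapping: read bases back from the last CGR point."""
    x, y = final_point
    bases = []
    for _ in range(length):
        # The quadrant containing the point identifies the last base.
        cx = 1 if x > Fraction(1, 2) else 0
        cy = 1 if y > Fraction(1, 2) else 0
        bases.append(next(b for b, c in CORNERS.items() if c == (cx, cy)))
        # Undo one halving step to obtain the previous point.
        x, y = 2 * x - cx, 2 * y - cy
    return "".join(reversed(bases))


points = cgr_points("GATTACA")
print(recover_sequence(points[-1], 7))
```

    With floating-point coordinates the recovery would only be reliable for short sequences, which is why exact rational arithmetic is used in this sketch.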

  12. Curcusone C induces telomeric DNA-damage response in cancer cells through inhibition of telomeric repeat factor 2.

    Science.gov (United States)

    Wang, Mingxue; Cao, Jiaojiao; Zhu, Jian-Yong; Qiu, Jun; Zhang, Yan; Shu, Bing; Ou, Tian-Miao; Tan, Jia-Heng; Gu, Lian-Quan; Huang, Zhi-Shu; Yin, Sheng; Li, Ding

    2017-11-01

    Telomeric repeat factor 2 (known as TRF2 or TERF2) is a key component of the telomere protection protein complex named Shelterin. TRF2 helps the telomere fold into the T-loop structure and suppresses activation of the ATM-dependent DNA damage response. TRF2 has been recognized as a potential new therapeutic target for cancer treatment. In our routine screening of small molecule libraries, we found that Curcusone C had a significant effect in disrupting the binding between TRF2 and telomeric DNA, with potent antitumor activity against cancer cells. Our results showed that Curcusone C could bind TRF2 without binding TRF1 (telomeric repeat factor 1), although these two proteins share high sequence homology, indicating that their binding conformations and biological functions in the telomere could be different. Our mechanistic studies showed that Curcusone C bound TRF2 possibly through its DNA binding site, blocking its interaction with telomeric DNA. Further cellular studies indicated that the interaction of TRF2 with Curcusone C could activate the DNA-damage response, inhibit tumor cell proliferation, and cause cell cycle arrest, resulting in tumor cell apoptosis. Our studies showed that Curcusone C could become a promising lead compound for further development for cancer treatment. Here, TRF2 was identified for the first time as a target of Curcusone C. It is likely that the anti-cancer activity of some other terpenes and terpenoids is related to their possible effect on telomere protection proteins. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Improved Algorithm for Analysis of DNA Sequences Using Multiresolution Transformation

    Directory of Open Access Journals (Sweden)

    T. M. Inbamalar

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solution with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP) representation. Then discrete wavelet transformation is taken. Absolute value of the energy is found followed by proper threshold. The test is conducted using the data bases available in the National Centre for Biotechnology Information (NCBI) site. The comparative analysis is done and it ensures the efficiency of the proposed system.

  14. Improved algorithm for analysis of DNA sequences using multiresolution transformation.

    Science.gov (United States)

    Inbamalar, T M; Sivakumar, R

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solution with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP) representation. Then discrete wavelet transformation is taken. Absolute value of the energy is found followed by proper threshold. The test is conducted using the data bases available in the National Centre for Biotechnology Information (NCBI) site. The comparative analysis is done and it ensures the efficiency of the proposed system.
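
    The pipeline outlined in this abstract (EIIP mapping, discrete wavelet transform, energy, threshold) can be sketched as follows. The EIIP values are the commonly tabulated ones for the four nucleotides; the choice of a single-level Haar wavelet, the threshold value, and all function names are assumptions for illustration, since the abstract does not state which wavelet, decomposition level, or threshold the authors used.

```python
# Commonly tabulated electron-ion interaction potential values.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_signal(sequence):
    """Map a DNA string to its numeric EIIP sequence."""
    return [EIIP[base] for base in sequence.upper()]


def haar_dwt(signal):
    """One level of the Haar wavelet transform (assumed wavelet choice).

    Returns (approximation, detail) coefficient lists.
    """
    values = list(signal)
    if len(values) % 2:
        values.append(values[-1])  # pad odd-length input by repetition
    root2 = 2 ** 0.5
    approx = [(values[i] + values[i + 1]) / root2
              for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / root2
              for i in range(0, len(values), 2)]
    return approx, detail


def thresholded_energy(coefficients, threshold):
    """Squared magnitude of each coefficient; values below threshold are zeroed."""
    return [c * c if c * c >= threshold else 0.0 for c in coefficients]


signal = eiip_signal("ATGGCGTACGCTTAG")
approx, detail = haar_dwt(signal)
# The threshold below is an arbitrary illustrative value.
print(thresholded_energy(detail, threshold=1e-4))
```

    A production analysis would more likely rely on an established wavelet library and a sliding window over a long genomic sequence; this sketch only shows the order of the steps.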

  15. Transformation-associated recombination between diverged and homologous DNA repeats is induced by strand breaks

    Energy Technology Data Exchange (ETDEWEB)

    Larionov, V.; Kouprina, N. [National Institute of Environmental Health Sciences (NIH), Research Triangle Park, NC (United States); Institute of Cytology, St. Petersburg (Russian Federation)]; Eldarov, M. [National Institute of Environmental Health Sciences (NIH), Research Triangle Park, NC (United States); Center for Bioengineering, Moscow (Russian Federation)]; Perkins, E.; Porter, G.; Resnick, M.A. [National Institute of Environmental Health Sciences (NIH), Research Triangle Park, NC (United States)]

    1994-10-01

    Rearrangement and deletion within plasmid DNA are commonly observed during transformation. We have examined the mechanisms of transformation-associated recombination in the yeast Saccharomyces cerevisiae using a plasmid system which allowed the effects of physical state and/or extent of homology on recombination to be studied. The plasmid contains homologous or diverged (19%) DNA repeats separated by a genetically detectable color marker. Recombination during transformation for covalently closed circular plasmids was over 100-fold more frequent than during mitotic growth. The frequency of recombination is partly dependent on the method of transformation, in that procedures involving lithium acetate or spheroplasting yield higher frequencies than electroporation. When present in the repeats, unique single-strand breaks that are ligatable, as well as double-strand breaks, lead to high levels of recombination between diverged and identical repeats. The transformation-associated recombination between repeat DNAs is under the influence of the RAD52, RAD1 and the RNC1 genes.

  16. How effective is graphene nanopore geometry on DNA sequencing?

    CERN Document Server

    Satarifard, Vahid; Ejtehadi, Mohammad Reza

    2015-01-01

    In this paper we investigate the effects of graphene nanopore geometry on the pulling of homopolymer ssDNA through a nanopore using steered molecular dynamics (SMD) simulations. Different graphene nanopores are examined, including axially symmetric and asymmetric monolayer graphene nanopores as well as five-layer graphene polyhedral crystals (GPC). The pulling force profile, the manner of ssDNA movement, the work done in irreversible DNA pulling and the orientations of DNA bases near the nanopore are assessed. Simulation results demonstrate the strong effect of the pore shape as well as geometrical symmetry on the free energy barrier, orientations and dynamics of DNA translocation through the graphene nanopore. Our study proposes that the symmetric circular geometry of a monolayer graphene nanopore with high pulling velocity can be used for DNA sequencing.

  17. Qualitatively predicting acetylation and methylation areas in DNA sequences.

    Science.gov (United States)

    Pham, Tho Hoan; Tran, Dang Hung; Ho, Tu Bao; Satou, Kenji; Valiente, Gabriel

    2005-01-01

    Eukaryotic genomes are packaged by the wrapping of DNA around histone octamers to form nucleosomes. Nucleosome occupancy, acetylation, and methylation, which have a major impact on all nuclear processes involving DNA, have been recently mapped across the yeast genome using chromatin immunoprecipitation and DNA microarrays. However, this experimental protocol is laborious and expensive. Moreover, experimental methods often produce noisy results. In this paper, we introduce a computational approach to the qualitative prediction of nucleosome occupancy, acetylation, and methylation areas in DNA sequences. Our method uses support vector machines to discriminate between DNA areas with high and low relative occupancy, acetylation, or methylation, and rank k-gram features based on their support for these DNA modifications. Experimental results on the yeast genome reveal genetic area preferences of nucleosome occupancy, acetylation, and methylation that are consistent with previous studies. Supplementary files are available from http://www.jaist.ac.jp/~tran/nucleosome/.
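
    The k-gram features mentioned in this abstract can be illustrated with a short sketch that turns a DNA area into a fixed-length vector of k-gram frequencies, the kind of input a support vector machine could be trained on. The normalisation by window count and the function name are assumptions for illustration; the abstract does not describe the authors' exact feature construction, and the classifier itself is not shown.

```python
from collections import Counter
from itertools import product


def kgram_features(sequence, k):
    """Relative frequency of every possible k-gram over A, C, G, T.

    The vector has 4**k entries in lexicographic k-gram order, so
    vectors from different sequences are directly comparable.
    """
    seq = sequence.upper()
    windows = max(len(seq) - k + 1, 0)
    counts = Counter(seq[i:i + k] for i in range(windows))
    total = max(windows, 1)  # avoid division by zero for short input
    return [counts["".join(p)] / total for p in product("ACGT", repeat=k)]


# Two hypothetical DNA areas; a real study would label such vectors
# (for example high versus low occupancy) before training a classifier.
print(kgram_features("ACGTACGT", 2)[:4])
print(len(kgram_features("TTTTAAAA", 3)))
```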

  18. Ribosomal DNA copy number loss and sequence variation in cancer.

    Science.gov (United States)

    Xu, Baoshan; Li, Hua; Perry, John M; Singh, Vijay Pratap; Unruh, Jay; Yu, Zulin; Zakari, Musinu; McDowell, William; Li, Linheng; Gerton, Jennifer L

    2017-06-01

    Ribosomal DNA is one of the most variable regions in the human genome with respect to copy number. Despite the importance of rDNA for cellular function, we know virtually nothing about what governs its copy number, stability, and sequence in the mammalian genome due to challenges associated with mapping and analysis. We applied computational and droplet digital PCR approaches to measure rDNA copy number in normal and cancer states in human and mouse genomes. We find that copy number and sequence can change in cancer genomes. Counterintuitively, human cancer genomes show a loss of copies, accompanied by global copy number co-variation. The sequence can also be more variable in the cancer genome. Cancer genomes with lower copies have mutational evidence of mTOR hyperactivity. The PTEN phosphatase is a tumor suppressor that is critical for genome stability and a negative regulator of the mTOR kinase pathway. Surprisingly, but consistent with the human cancer genomes, hematopoietic cancer stem cells from a Pten-/- mouse model for leukemia have lower rDNA copy number than normal tissue, despite increased proliferation, rRNA production, and protein synthesis. Loss of copies occurs early and is associated with hypersensitivity to DNA damage. Therefore, copy loss is a recurrent feature in cancers associated with mTOR activation. Ribosomal DNA copy number may be a simple and useful indicator of whether a cancer will be sensitive to DNA damaging treatments.

  19. Inter-Simple Sequence Repeat (ISSR) Markers to Study Genetic Diversity Among Cotton Cultivars in Associated with Salt Tolerance

    Directory of Open Access Journals (Sweden)

    Ali Akbar ABDI

    2012-11-01

    Developing salt-tolerant crops is very important as a significant proportion of cultivated land is salt-affected. Screening and selection of salt tolerant genotypes of cotton using DNA molecular markers not only introduce tolerant cultivars useful for hybridization and breeding programs but also detect DNA regions involved in the mechanism of salinity tolerance. To study this, 28 cotton cultivars, including 8 Iranian cotton varieties, were grown in pots under greenhouse conditions and three salt treatments were imposed with salt solutions (0, 70 and 140 mM NaCl). Eight agronomic traits including root length, root fresh weight, root dry weight, chlorophyll and fluorescence index, K+ and Na+ contents in shoot (above ground biomass), and K+/Na+ ratio were measured. Cluster analysis of cultivars based on measured agronomic traits showed 'Cindose' and 'Ciacra' as the most tolerant cultivars, and 'B-557' and '43347' as the cultivars most sensitive to salt damage. A total of 65 polymorphic DNA fragments were generated at 14 inter-simple sequence repeat (ISSR) loci. Plants of 28 cultivars of cotton grouped into three clusters based on ISSR markers. Regression analysis of markers in relation to trait data showed that 23, 33 and 30 markers were associated with the measured traits in the three salt treatments, respectively. These markers might help breeders in any marker assisted selection program to improve cotton cultivars against salt stress.

  20. Remodelers organize cellular chromatin by counteracting intrinsic histone-DNA sequence preferences in a class-specific manner

    NARCIS (Netherlands)

    Y.M. Moshkin (Yuri); G.E. Chalkley (Gillian); T.W. Kan (Tsung Wai); B.A. Reddy (Ashok); Z. Özgür (Zeliha); W.F.J. van IJcken (Wilfred); D.H. Dekkers (Dick); J.A.A. Demmers (Jeroen); A.A. Travers (Andrew); C.P. Verrijzer (Peter)

    2012-01-01

    The nucleosome is the fundamental repeating unit of eukaryotic chromatin. Here, we assessed the interplay between DNA sequence and ATP-dependent chromatin-remodeling factors (remodelers) in the nucleosomal organization of a eukaryotic genome. We compared the genome-wide distribution of D

  1. Development of polymorphic genic-SSR markers by cDNA library sequencing in boxwood, Buxus spp. (Buxaceae)

    Science.gov (United States)

    Genic microsatellites or simple sequence repeat (genic-SSR) markers were developed in boxwood (Buxus taxa) for genetic diversity analysis, identification of taxa, and to facilitate breeding. cDNA libraries were developed from mRNA extracted from leaves of Buxus sempervirens ‘Vardar Valley’ and seque...

  2. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Science.gov (United States)

    2011-01-01

    Background: Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results: The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions: PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/. PMID: 21385349

  3. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Directory of Open Access Journals (Sweden)

    Baldwin Stephen A

    2011-03-01

    Background: Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results: The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions: PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  4. Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains : a web-based resource

    Directory of Open Access Journals (Sweden)

    Vergnaud Gilles

    2004-01-01

    Background: Polymorphic tandem repeat typing is a new generic technology which has been proved to be very efficient for bacterial pathogens such as B. anthracis, M. tuberculosis, P. aeruginosa, L. pneumophila, Y. pestis. The previously developed tandem repeats database takes advantage of the release of genome sequence data for a growing number of bacteria to facilitate the identification of tandem repeats. The development of an assay then requires the evaluation of tandem repeat polymorphism on well-selected sets of isolates. In the case of major human pathogens, such as S. aureus, more than one strain is being sequenced, so that tandem repeats most likely to be polymorphic can now be selected in silico based on genome sequence comparison. Results: In addition to the previously described general Tandem Repeats Database, we have developed a tool to automatically identify tandem repeats of a different length in the genome sequence of two (or more) closely related bacterial strains. Genome comparisons are pre-computed. The results of the comparisons are parsed in a database, which can be conveniently queried over the internet according to criteria of practical value, including repeat unit length, predicted size difference, etc. Comparisons are available for 16 bacterial species, and the orthopox viruses, including the variola virus and three of its close neighbors. Conclusions: We are presenting an internet-based resource to help develop and perform tandem repeats based bacterial strain typing. The tools accessible at http://minisatellites.u-psud.fr now comprise four parts. The Tandem Repeats Database enables the identification of tandem repeats across entire genomes. The Strain Comparison Page identifies tandem repeats differing between different genome sequences from the same species. The "Blast in the Tandem Repeats Database" facilitates the search for a known tandem repeat and the prediction of amplification product sizes. The "Bacterial

  5. Long CAG repeat sequence and protein expression of androgen receptor considered as prognostic indicators in male breast carcinoma.

    Directory of Open Access Journals (Sweden)

    Yan-Ni Song

    BACKGROUND: The androgen receptor (AR) expression and the CAG repeat length within the AR gene appear to be involved in the carcinogenesis of male breast carcinoma (MBC). Although phenotypic differences have been observed between MBC and normal control group in AR gene, there is lack of correlation analysis between AR expression and CAG repeat length in MBC. The purpose of the study was to investigate the prognostic value of CAG repeat lengths and AR protein expression. METHODS: 81 tumor tissues were used for immunostaining for AR expression and CAG repeat length determination and 80 normal controls were analyzed with CAG repeat length in AR gene. The CAG repeat length and AR expression were analyzed in relation to clinicopathological factors and prognostic indicators. RESULTS: AR gene in many MBCs has long CAG repeat sequence compared with that in control group (P = 0.001) and controls are more likely to exhibit short CAG repeat sequence than MBCs. There was statistically significant difference in long CAG repeat sequence between AR status for MBC patients (P = 0.004). The presence of long CAG repeat sequence and AR-positive expression were associated with shorter survival of MBC patients (CAG repeat: P = 0.050 for 5y-OS; P = 0.035 for 5y-DFS; AR status: P = 0.048 for 5y-OS; P = 0.029 for 5y-DFS, respectively). CONCLUSION: The CAG repeat length within the AR gene might be one useful molecular biomarker to identify males at increased risk of breast cancer development. The presence of long CAG repeat sequence and AR protein expression were in relation to survival of MBC patients. The CAG repeat length and AR expression were two independent prognostic indicators in MBC patients.

  6. VoSeq: a voucher and DNA sequence web application.

    Science.gov (United States)

    Peña, Carlos; Malm, Tobias

    2012-01-01

    There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  7. Label-free DNA sequencing using Millikan detection.

    Science.gov (United States)

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-10-15

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucleotides attached to the bead. The velocity of beads tethered via a polymer to a microfluidic channel and subjected to an oscillating electric field was measured using dark-field microscopy and used to determine how many nucleotides were incorporated during each sequencing-by-synthesis cycle. Increases in bead velocity of approximately 1% were reliably detected during DNA polymerization, allowing for sequencing of short DNA templates. The method could lead to a low-cost, high-throughput sequencing platform that could enable routine sequencing in medical applications.

  8. Hiding message into DNA sequence through DNA coding and chaotic maps.

    Science.gov (United States)

    Liu, Guoyan; Liu, Hongjun; Kadir, Abdurahman

    2014-09-01

    The paper proposes an improved reversible substitution method to hide data in a deoxyribonucleic acid (DNA) sequence. Four measures are taken to enhance the robustness and enlarge the hiding capacity: encoding the secret message by DNA coding, encrypting it with a pseudo-random sequence, generating the relative hiding locations with a piecewise linear chaotic map, and embedding the encoded and encrypted message into a randomly selected DNA sequence using the complementary rule. The key space and the hiding capacity are analyzed. Experimental results indicate that the proposed method has a better performance compared with the competing methods with respect to robustness and capacity.
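
    The first of the measures listed in this abstract, DNA coding of the secret message, can be illustrated with a minimal sketch that maps every two bits to one nucleotide. The particular bit-to-base table (00=A, 01=C, 10=G, 11=T) and the function names are assumptions for illustration; the abstract does not state the paper's coding table, and the encryption, chaotic-map and embedding steps are deliberately omitted.

```python
# Assumed two-bits-per-base coding table (hypothetical, for illustration).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_message(message: bytes) -> str:
    """Encode bytes as a DNA string, four nucleotides per byte."""
    bits = "".join(f"{byte:08b}" for byte in message)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_message(dna: str) -> bytes:
    """Invert encode_message; the input length must be a multiple of 4."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


encoded = encode_message(b"Hi")
print(encoded)                  # CAGACGGC
print(decode_message(encoded))  # b'Hi'
```

    Because the table is a bijection on bit pairs, the coding is reversible, which is the property a reversible hiding scheme needs from this step.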

  9. Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

    OpenAIRE

    Chunsheng Gao; Pengfei Xin; Chaohua Cheng; Qing Tang; Ping Chen; Changbiao Wang; Gonggu Zang; Lining Zhao

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SS...

  10. Dialects of the DNA uptake sequence in Neisseriaceae.

    Directory of Open Access Journals (Sweden)

    Stephan A Frye

    2013-04-01

    In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS-dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5'-CTG-3' is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS-dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic

  11. DNA sequence alignment by microhomology sampling during homologous recombination.

    Science.gov (United States)

    Qi, Zhi; Redding, Sy; Lee, Ja Yil; Gibb, Bryan; Kwon, YoungHo; Niu, Hengyao; Gaines, William A; Sung, Patrick; Greene, Eric C

    2015-02-26

    Homologous recombination (HR) mediates the exchange of genetic information between sister or homologous chromatids. During HR, members of the RecA/Rad51 family of recombinases must somehow search through vast quantities of DNA sequence to align and pair single-strand DNA (ssDNA) with a homologous double-strand DNA (dsDNA) template. Here, we use single-molecule imaging to visualize Rad51 as it aligns and pairs homologous DNA sequences in real time. We show that Rad51 uses a length-based recognition mechanism while interrogating dsDNA, enabling robust kinetic selection of 8-nucleotide (nt) tracts of microhomology, which kinetically confines the search to sites with a high probability of being a homologous target. Successful pairing with a ninth nucleotide coincides with an additional reduction in binding free energy, and subsequent strand exchange occurs in precise 3-nt steps, reflecting the base triplet organization of the presynaptic complex. These findings provide crucial new insights into the physical and evolutionary underpinnings of DNA recombination. Copyright © 2015 Elsevier Inc. All rights reserved.

  12. Rapid DNA sequencing by horizontal ultrathin gel electrophoresis.

    Science.gov (United States)

    Brumley, R L; Smith, L M

    1991-01-01

    A horizontal polyacrylamide gel electrophoresis apparatus has been developed that decreases the time required to separate the DNA fragments produced in enzymatic sequencing reactions. The configuration of this apparatus and the use of circulating coolant directly under the glass plates result in heat exchange that is approximately nine times more efficient than passive thermal transfer methods commonly used. Bubble-free gels as thin as 25 microns can be routinely cast on this device. The application to these ultrathin gels of electric fields up to 250 volts/cm permits the rapid separation of multiple DNA sequencing reactions in parallel. When used in conjunction with 32P-based autoradiography, the DNA bands appear substantially sharper than those obtained in conventional electrophoresis. This increased sharpness permits shorter autoradiographic exposure times and longer sequence reads. PMID: 1870968

  13. Accelerating Computation of DNA Sequence Alignment in Distributed Environment

    Science.gov (United States)

    Guo, Tao; Li, Guiyang; Deaton, Russel

    Sequence similarity search and alignment are among the most important operations in computational biology. However, analyzing large sets of DNA sequences is impractical on a regular PC. Using multiple threads with the JavaParty mechanism, this project extended the capabilities of regular Java to a distributed environment for the simulation of DNA computation. With the aid of JavaParty and a multithreaded design, the results of this study demonstrate that the modified Java program can perform parallel computing without using RMI or socket communication. In this paper, an efficient method for modeling and comparing DNA sequences with dynamic programming and JavaParty is proposed, and its results in a distributed environment are discussed.
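
    The abstract names dynamic programming but does not say which recurrence the authors used. As a minimal single-machine sketch, assuming a Needleman-Wunsch style global alignment with illustrative scoring parameters (match, mismatch and gap are hypothetical values, and nothing here reproduces the JavaParty distribution layer), the core computation could look like this:

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score by dynamic programming (Needleman-Wunsch style).

    The scoring values are illustrative assumptions, not taken from the paper.
    Only two rows of the score matrix are kept in memory at a time.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b` costs gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of `a` against the empty prefix of `b`
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in `b`
            left = curr[j - 1] + gap  # gap in `a`
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

    Each pairwise comparison is independent of the others, which is presumably what makes the workload easy to spread across threads or machines.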

  14. Facilitated diffusion on mobile DNA: configurational traps and sequence heterogeneity

    CERN Document Server

    Brackley, C A; Marenduzzo, D; doi:10.1103/PhysRevLett.109.168103

    2012-01-01

    We present Brownian dynamics simulations of the facilitated diffusion of a protein, modelled as a sphere with a binding site on its surface, along DNA, modelled as a semi-flexible polymer. We consider both the effect of DNA organisation in 3D, and of sequence heterogeneity. We find that in a network of DNA loops, as are thought to be present in bacterial DNA, the search process is very sensitive to the spatial location of the target within such loops. Therefore, specific genes might be repressed or promoted by changing the local topology of the genome. On the other hand, sequence heterogeneity creates traps which normally slow down facilitated diffusion. When suitably positioned, though, these traps can, surprisingly, render the search process much more efficient.

  15. Vector sequences - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available Vector sequences - Budding yeast cDNA sequencing project | LSDB Archive. Number of data entries: 7 entries.

  16. Comment on "Linguistic features of noncoding DNA sequences"

    CERN Document Server

    Israeloff, N E; Kagalenko, M; Chan, K

    1995-01-01

    In a recent Physical Review Letter, Mantegna et al. report that certain statistical signatures of natural language can be found in non-coding DNA sequences. In this comment we show that random noise with power-law correlations, similar to 1/f noise, exhibits the same "linguistic" signatures as those found in non-coding DNA. We conclude that these signatures cannot distinguish languages from noise.
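
    One statistical signature of this kind is a Zipf-style rank-frequency curve over fixed-length "words". The sketch below only illustrates how such a curve might be computed; the use of overlapping windows, the word length k and the function names are assumptions of this example, not details given in the comment.

```python
import math
from collections import Counter

def rank_frequency(seq, k):
    """Count overlapping k-letter words and return them sorted by falling count."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return counts.most_common()

def zipf_slope(ranked):
    """Least-squares slope of log(count) against log(rank).

    `ranked` is the output of rank_frequency; at least two words are needed.
    """
    if len(ranked) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for _, count in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

    Running the same two functions on a DNA sequence and on a symbol stream derived from correlated noise is the kind of side-by-side comparison the comment describes.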

  17. Isolation, characterization and amplification of simple sequence repeat loci in coffee

    Directory of Open Access Journals (Sweden)

    Marco-Aurelio Cristancho

    2008-01-01

    Full Text Available Simple sequence repeat (microsatellite) loci in coffee were identified in clones isolated from enriched and random genomic libraries. It was shown that coffee is a plant species with low microsatellite frequency. However, the average distance between two loci, estimated at 127 kb for poly(AG), is one of the shortest of all plant genomes. In contrast, the distance between two poly(AC) loci, estimated at 769 kb, is one of the largest in plant genomes. Coffee (AC)n microsatellites are frequently associated with other microsatellites, mainly (AT)n motifs, while (AG)n microsatellites are not normally associated with other microsatellites and have a higher number of perfect motifs. Dinucleotide repeats (AG) and (AC) were found in AT-rich regions in coffee. Sequence analysis of (AC)n microsatellites identified in coffee revealed the possible association of these repeated elements with miniature inverted-repeat transposable elements (MITEs). In addition, some of the evaluated SSR markers produced transposon-like amplification patterns in tetraploid genotypes. Of 12 SSR markers developed, nine were polymorphic in diploid genotypes while five were polymorphic in tetraploid genotypes, confirming a greater genetic diversity in diploid species.
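
    For readers unfamiliar with how perfect microsatellites such as (AG)n or (AC)n are located in sequence data, the following sketch uses a regular-expression backreference. It is an illustration only: the function name and the default thresholds (dinucleotide motifs, at least five repeats) are assumptions, and it does not reproduce the library screening used in the study.

```python
import re

def find_ssrs(seq, motif_len=2, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Thresholds are illustrative assumptions. Runs of a single base are skipped;
    for motif_len > 2 a motif that is itself a shorter repeat is not filtered out.
    """
    # ([ACGT]{n}) captures one motif; \1{m,} requires at least m further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer run, not a true di-/tri-nucleotide repeat
        hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits
```

    A call such as find_ssrs(clone_sequence) would list candidate loci; compound or interrupted repeats, which the abstract also discusses, would need a more permissive pattern.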

  18. Sequence dependence of transcription factor-mediated DNA looping.

    Science.gov (United States)

    Johnson, Stephanie; Lindén, Martin; Phillips, Rob

    2012-09-01

    DNA is subject to large deformations in a wide range of biological processes. Two key examples illustrate how such deformations influence the readout of the genetic information: the sequestering of eukaryotic genes by nucleosomes and DNA looping in transcriptional regulation in both prokaryotes and eukaryotes. These kinds of regulatory problems are now becoming amenable to systematic quantitative dissection with a powerful dialogue between theory and experiment. Here, we use a single-molecule experiment in conjunction with a statistical mechanical model to test quantitative predictions for the behavior of DNA looping at short length scales and to determine how DNA sequence affects looping at these lengths. We calculate and measure how such looping depends upon four key biological parameters: the strength of the transcription factor binding sites, the concentration of the transcription factor, and the length and sequence of the DNA loop. Our studies lead to the surprising insight that sequences that are thought to be especially favorable for nucleosome formation because of high flexibility lead to no systematically detectable effect of sequence on looping, and begin to provide a picture of the distinctions between the short length scale mechanics of nucleosome formation and looping.
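
    The abstract does not spell out the statistical mechanical model. As a rough sketch, a generic two-site equilibrium model assigns a weight to each state of the DNA (empty, one site occupied, both sites occupied by separate proteins, looped) and takes the looped fraction. The specific weights below, including the factor of 1/2 on the looped state and the use of a looping J-factor to absorb loop length and sequence, are assumptions of this example and may differ from the authors' formulation.

```python
def looping_probability(conc, k1, k2, j_loop):
    """Looped-state probability in a generic two-site equilibrium model.

    conc   : transcription factor concentration
    k1, k2 : dissociation constants of the two binding sites (same units as conc)
    j_loop : looping J-factor, an effective concentration that carries the
             dependence on loop length and sequence (same units as conc)
    All statistical weights are assumptions of this sketch.
    """
    w_single = conc / k1 + conc / k2          # one site occupied
    w_double = conc ** 2 / (k1 * k2)          # both sites occupied, no loop
    w_loop = 0.5 * conc * j_loop / (k1 * k2)  # one protein bridging both sites
    return w_loop / (1.0 + w_single + w_double + w_loop)
```

    In this form the looping probability first rises and then falls with concentration, because at high concentration each site tends to be occupied by its own protein.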

  19. Label-Free DNA Sequencing Using Millikan Detection

    OpenAIRE

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-01-01

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucl...

  20. Anaplasma phagocytophilum in Danish sheep: confirmation by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Thamsborg Stig M

    2009-12-01

    Full Text Available Background: The presence of Anaplasma phagocytophilum, an Ixodes ricinus transmitted bacterium, was investigated in two flocks of Danish grazing lambs. Direct PCR detection was performed on DNA extracted from blood and serum with subsequent confirmation by DNA sequencing. Methods: 31 samples obtained from clinically normal lambs in 2000 from Fussingø, Jutland and 12 samples from ten lambs and two ewes from a clinical outbreak at Feddet, Zealand in 2006 were included in the study. Some of the animals from Feddet had shown clinical signs of polyarthritis and general unthriftiness prior to sampling. DNA extraction was optimized from blood and serum and detection achieved by a 16S rRNA targeted PCR with verification of the product by DNA sequencing. Results: Five DNA extracts were found positive by PCR, including two samples from 2000 and three from 2006. For both series of samples the product was verified as A. phagocytophilum by DNA sequencing. Conclusions: A. phagocytophilum was detected by molecular methods for the first time in Danish grazing lambs during the two seasons investigated (2000 and 2006).

  1. Sequence-selective DNA recognition with peptide-bisbenzamidine conjugates.

    Science.gov (United States)

    Sánchez, Mateo I; Vázquez, Olalla; Vázquez, M Eugenio; Mascareñas, José L

    2013-07-22

    Transcription factors (TFs) are specialized proteins that play a key role in the regulation of genetic expression. Their mechanism of action involves the interaction with specific DNA sequences, which usually takes place through specialized domains of the protein. However, achieving an efficient binding usually requires the presence of the full protein. This is the case for bZIP and zinc finger TF families, which cannot interact with their target sites when the DNA binding fragments are presented as isolated monomers. Herein it is demonstrated that the DNA binding of these monomeric peptides can be restored when conjugated to aza-bisbenzamidines, which are readily accessible molecules that interact with A/T-rich sites by insertion into their minor groove. Importantly, the fluorogenic properties of the aza-benzamidine unit provide details of the DNA interaction that are eluded in electrophoresis mobility shift assays (EMSA). The hybrids based on the GCN4 bZIP protein preferentially bind to composite sequences containing tandem bisbenzamidine-GCN4 binding sites (TCAT⋅AAATT). Fluorescence reverse titrations show an interesting multiphasic profile consistent with the formation of competitive nonspecific complexes at low DNA/peptide ratios. On the other hand, the conjugate with the DNA binding domain of the zinc finger protein GAGA binds with high affinity (KD≈12 nM) and specificity to a composite AATTT⋅GAGA sequence containing both the bisbenzamidine and the TF consensus binding sites.

  2. Preparation of next-generation sequencing libraries from damaged DNA.

    Science.gov (United States)

    Briggs, Adrian W; Heyn, Patricia

    2012-01-01

    Next-generation sequencing (NGS) has revolutionized ancient DNA research, especially when combined with high-throughput target enrichment methods. However, attaining high sequencing depth and accuracy from samples often remains problematic due to the damaged state of ancient DNA, in particular the extremely low copy number of ancient DNA and the abundance of uracil residues derived from cytosine deamination that lead to miscoding errors. It is therefore critical to use a highly efficient procedure for conversion of a raw DNA extract into an adaptor-ligated sequencing library, and equally important to reduce errors from uracil residues. We present a protocol for NGS library preparation that allows highly efficient conversion of DNA fragments into an adaptor-ligated form. The protocol incorporates an option to remove the vast majority of uracil miscoding lesions as part of the library preparation process. The procedure requires only two spin column purification steps and no gel purification or bead handling. Starting from an aliquot of DNA extract, a finished, highly amplified library can be generated in 5 h, or under 3 h if uracil removal is not required.

  3. Perspectives of DNA microarray and next-generation DNA sequencing technologies

    Institute of Scientific and Technical Information of China (English)

    TENG XiaoKun; XIAO HuaSheng

    2009-01-01

    DNA microarray and next-generation DNA sequencing technologies are important tools for high-throughput genome research, in revealing both the structural and functional characteristics of genomes. In the past decade the DNA microarray technologies have been widely applied in the studies of functional genomics, systems biology and pharmacogenomics. The next-generation DNA sequencing method was first introduced by the 454 Company in 2003, immediately followed by the establishment of the Solexa and SOLiD techniques by other biotech companies. Though it has not been long since the first emergence of this technology, with the fast and impressive improvement, the application of this technology has extended to almost all fields of genomics research, as a rival challenging the existing DNA microarray technology. This paper briefly reviews the working principles of these two technologies as well as their application and perspectives in genome research.

  4. Real-time DNA sequencing from single polymerase molecules.

    Science.gov (United States)

    Eid, John; Fehr, Adrian; Gray, Jeremy; Luong, Khai; Lyle, John; Otto, Geoff; Peluso, Paul; Rank, David; Baybayan, Primo; Bettman, Brad; Bibillo, Arkadiusz; Bjornson, Keith; Chaudhuri, Bidhan; Christians, Frederick; Cicero, Ronald; Clark, Sonya; Dalal, Ravindra; Dewinter, Alex; Dixon, John; Foquet, Mathieu; Gaertner, Alfred; Hardenbol, Paul; Heiner, Cheryl; Hester, Kevin; Holden, David; Kearns, Gregory; Kong, Xiangxu; Kuse, Ronald; Lacroix, Yves; Lin, Steven; Lundquist, Paul; Ma, Congcong; Marks, Patrick; Maxham, Mark; Murphy, Devon; Park, Insil; Pham, Thang; Phillips, Michael; Roy, Joy; Sebra, Robert; Shen, Gene; Sorenson, Jon; Tomaney, Austin; Travers, Kevin; Trulson, Mark; Vieceli, John; Wegener, Jeffrey; Wu, Dawn; Yang, Alicia; Zaccarin, Denis; Zhao, Peter; Zhong, Frank; Korlach, Jonas; Turner, Stephen

    2009-01-02

    We present single-molecule, real-time sequencing data obtained from a DNA polymerase performing uninterrupted template-directed synthesis using four distinguishable fluorescently labeled deoxyribonucleoside triphosphates (dNTPs). We detected the temporal order of their enzymatic incorporation into a growing DNA strand with zero-mode waveguide nanostructure arrays, which provide optical observation volume confinement and enable parallel, simultaneous detection of thousands of single-molecule sequencing reactions. Conjugation of fluorophores to the terminal phosphate moiety of the dNTPs allows continuous observation of DNA synthesis over thousands of bases without steric hindrance. The data report directly on polymerase dynamics, revealing distinct polymerization states and pause sites corresponding to DNA secondary structure. Sequence data were aligned with the known reference sequence to assay biophysical parameters of polymerization for each template position. Consensus sequences were generated from the single-molecule reads at 15-fold coverage, showing a median accuracy of 99.3%, with no systematic error beyond fluorophore-dependent error rates.
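
    The abstract reports consensus accuracy at 15-fold coverage without describing the consensus caller. A plain column-wise majority vote, shown below as a sketch, conveys why stacking several noisy reads raises accuracy. It assumes the reads are already aligned to common coordinates and that "-" marks a position a read does not cover; both are simplifications of this example rather than the authors' procedure.

```python
from collections import Counter

def consensus(aligned_reads):
    """Majority vote per column over reads placed on the same coordinates.

    "-" is treated as "no call at this position" (an assumption of this sketch);
    columns with no calls are reported as "N". Expects at least one read.
    """
    length = max(len(read) for read in aligned_reads)
    called = []
    for i in range(length):
        column = [r[i] for r in aligned_reads if i < len(r) and r[i] != "-"]
        called.append(Counter(column).most_common(1)[0][0] if column else "N")
    return "".join(called)

def accuracy(called, reference):
    """Fraction of reference positions at which the called base matches."""
    matches = sum(c == r for c, r in zip(called, reference))
    return matches / len(reference)
```

    With independent errors, a wrong base wins the vote only if most reads err the same way at the same column, which becomes rarer as coverage grows.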

  5. Identification of Bacterial Species in Kuwaiti Waters Through DNA Sequencing

    Science.gov (United States)

    Chen, K.

    2017-01-01

    With the objective of identifying the bacterial diversity associated with the ecosystems of various Kuwaiti seas, bacteria were cultured and isolated from 3 water samples. Because fecal coliforms proved difficult to culture and isolate on the selective agar plates, bacterial isolates from marine agar plates were selected for molecular identification. 16S rRNA genes were successfully amplified from the genomes of the selected isolates using universal eubacterial 16S rRNA primers. The resulting amplification products were subjected to automated DNA sequencing. The partial 16S rDNA sequences obtained were compared directly with sequences in the NCBI database using BLAST as well as with the sequences available from the Ribosomal Database Project (RDP).

  6. DNA qualification workflow for next generation sequencing of histopathological samples.

    Directory of Open Access Journals (Sweden)

    Michele Simbolo

    Full Text Available Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard

  7. DNA qualification workflow for next generation sequencing of histopathological samples.

    Science.gov (United States)

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  8. Noninvasive Prenatal Paternity Testing (NIPAT) through Maternal Plasma DNA Sequencing: A Pilot Study

    Science.gov (United States)

    Ge, Huijuan; Deng, Yongqiang; Mu, Haofang; Feng, Xiaoli; Yin, Lu; Du, Zhou; Chen, Fang; He, Nongyue

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have already been used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The technologies most frequently used were PCR followed by capillary electrophoresis and SNP typing arrays, respectively. Here, we developed a noninvasive prenatal paternity test (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influencing factors (minor allele frequency (MAF), the total number of SNPs, fetal fraction and effective sequencing depth) and designed three different selective SNP panels in order to verify the performance in clinical cases. Combining targeted deep sequencing of selected SNPs with an informative bioinformatics pipeline, we calculated the combined paternity index (CPI) of 17 cases to determine paternity. Sequencing-based NIPAT results fully agreed with invasive prenatal paternity tests using an STR multiplex system. Our study proved that maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic applications in the future. PMID:27631491
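
    The combined paternity index mentioned here is conventionally the product of per-locus paternity indices, and it can be turned into a posterior probability of paternity with a stated prior. The sketch below shows only that combining step, assuming independent loci and a prior of 0.5; how each per-locus index is derived from plasma sequencing reads is not described in the abstract and is not attempted here.

```python
import math

def combined_paternity_index(locus_indices):
    """Product of per-locus paternity indices, assuming independent loci."""
    return math.prod(locus_indices)

def probability_of_paternity(cpi, prior=0.5):
    """Posterior probability of paternity from the CPI and a prior.

    The default prior of 0.5 is a common convention, used here as an assumption.
    """
    return cpi * prior / (cpi * prior + (1.0 - prior))
```

    With the default prior the posterior reduces to CPI / (CPI + 1), so a CPI of 99 corresponds to a probability of 0.99.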

  9. Stem-loop structures of the repetitive DNA sequences located at human centromeres

    Energy Technology Data Exchange (ETDEWEB)

    Gupta, G.; Garcia, A.E.; Ratliff, R.; Moyzis, R.K. [Los Alamos National Lab., NM (United States)]; Catasti, P.; Hong, Lin; Yau, P. [California Univ., Davis, CA (United States). Dept. of Biological Chemistry]; Bradbury, E.M. [Los Alamos National Lab., NM (United States); California Univ., Davis, CA (United States). Dept. of Biological Chemistry]

    1993-09-01

    The presence of the highly conserved repetitive DNA sequences in the human centromeres argues for a special role of these sequences in their biological functions, most likely achieved by the formation of unusual structures. This prompted us to carry out quantitative one- and two-dimensional nuclear magnetic resonance (1D/2D NMR) spectroscopy to determine the structural properties of the human centromeric repeats, d(AATGG)n·d(CCATT)n. The studies on centromeric DNAs reveal that the complementary sequence, d(AATGG)n·d(CCATT)n, adopts the usual Watson-Crick B-DNA duplex and the pyrimidine-rich d(CCATT)n strand is essentially a random coil. However, the purine-rich d(AATGG)n strand is shown to adopt unusual stem-loop structures for repeat lengths n = 2, 3, 4, and 6. In addition to normal Watson-Crick A·T pairs, the stem-loop structures are stabilized by mismatch A·G and G·G pairs in the stem and G-G-A stacking in the loop. Stem-loop structures of d(AATGG)n are independently verified by gel electrophoresis and nuclease digestion studies. Thermal melting studies show that the DNA repeats, d(AATGG)n, are as stable as the corresponding Watson-Crick duplex d(AATGG)n·d(CCATT)n. Therefore, the sequence d(AATGG)n can, indeed, nucleate a stem-loop structure at little free-energy cost and if, during mitosis, they are located on the chromosome surface they can provide specific recognition sites for kinetochore function.

  10. Genome dynamics of short oligonucleotides: the example of bacterial DNA uptake enhancing sequences.

    Directory of Open Access Journals (Sweden)

    Mohammed Bakkali

    Full Text Available Among the many bacteria naturally competent for transformation by DNA uptake (a phenomenon with significant clinical and financial implications), Pasteurellaceae and Neisseriaceae species preferentially take up DNA containing specific short sequences. The genomic overrepresentation of these DNA uptake enhancing sequences (DUES) causes preferential uptake of conspecific DNA, but the function(s) behind this overrepresentation and its evolution are still a matter for discovery. Here I analyze DUES genome dynamics and evolution and test the validity of the results for other selectively constrained oligonucleotides. I use statistical methods and computer simulations to examine DUES accumulation in Haemophilus influenzae and Neisseria gonorrhoeae genomes. I analyze DUES sequence and nucleotide frequencies, as well as those of all their mismatched forms, and prove the dependence of DUES genomic overrepresentation on their preferential uptake by quantifying and correlating both characteristics. I then argue that mutation, uptake bias, and weak selection against DUESs in less constrained parts of the genome are together sufficient to cause DUES accumulation in susceptible parts of the genome with no need for other DUES function. The distribution of overrepresentation values across sequences with different mismatch loads compared to the DUES suggests a gradual yet not linear molecular drive of DNA sequences depending on their similarity to the DUES. Other genomically overrepresented sequences, both pro- and eukaryotic, show a similar distribution of frequencies, suggesting that the molecular drive reported above applies to other frequent oligonucleotides. Rare oligonucleotides, however, seem to be gradually drawn to genomic underrepresentation, thus suggesting a molecular drag. To my knowledge this work provides the first clear evidence of the gradual evolution of selectively constrained oligonucleotides, including repeated, palindromic and protein
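
    Overrepresentation of a short oligonucleotide is usually expressed as an observed count divided by the count expected under some null model. The sketch below assumes the simplest null, independent bases at the genome's own single-nucleotide frequencies, and scans one strand only; the statistical methods actually used in the study are not specified in the abstract, so the function is purely illustrative.

```python
from collections import Counter

def overrepresentation(genome, motif):
    """Return (observed, expected, ratio) for one oligonucleotide on one strand.

    The expected count assumes independent bases drawn at the genome's own
    base frequencies, which is an assumption of this sketch.
    """
    genome, motif = genome.upper(), motif.upper()
    n_windows = len(genome) - len(motif) + 1
    if n_windows <= 0:
        raise ValueError("motif is longer than the sequence")
    # Count every (possibly overlapping) window that matches the motif exactly.
    observed = sum(genome.startswith(motif, i) for i in range(n_windows))
    base_counts = Counter(genome)
    expected = float(n_windows)
    for base in motif:
        expected *= base_counts[base] / len(genome)
    ratio = observed / expected if expected > 0 else float("inf")
    return observed, expected, ratio
```

    Repeating the calculation for every one-mismatch and two-mismatch variant of a motif would give the kind of distribution across mismatch loads that the abstract refers to.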

  11. Development of simple sequence repeat markers and diversity analysis in alfalfa (Medicago sativa L.).

    Science.gov (United States)

    Wang, Zan; Yan, Hongwei; Fu, Xinnian; Li, Xuehui; Gao, Hongwen

    2013-04-01

    Efficient and robust molecular markers are essential for molecular breeding in plants. Compared to dominant and bi-allelic markers, multi-allelic simple sequence repeat (SSR) markers are particularly informative and superior for genetic linkage mapping and QTL mapping in autotetraploid species like alfalfa. The objective of this study was to develop SSR markers directly from alfalfa expressed sequence tags (ESTs). A total of 12,371 alfalfa ESTs were retrieved from the National Center for Biotechnology Information. A total of 774 SSRs were identified in 716 ESTs. On average, one SSR was found per 7.7 kb of EST sequence. Tri-nucleotide repeats (48.8%) were the most abundant motif type, followed by di- (26.1%), tetra- (11.5%), penta- (9.7%), and hexanucleotide (3.9%) repeats. One hundred EST-SSR primer pairs were successfully designed and 29 exhibited polymorphism among 28 alfalfa accessions. The allele number per marker ranged from two to 21 with an average of 6.8. The PIC values ranged from 0.195 to 0.896 with an average of 0.608, indicating a high level of polymorphism of the EST-SSR markers. Based on the 29 EST-SSR markers, an assessment of genetic diversity was conducted, which found that Medicago sativa ssp. sativa was clearly different from the other subspecies. High transferability of these EST-SSR markers to related species was also found.
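
    The PIC (polymorphism information content) values quoted above are typically computed from allele frequencies with the formula of Botstein et al., PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The sketch below implements that standard formula; whether the authors used exactly this estimator, and how allele frequencies were scored in an autotetraploid, is not stated in the abstract.

```python
def pic(allele_freqs):
    """Polymorphism information content from allele frequencies or counts.

    Inputs are normalised to sum to 1, so raw allele counts also work.
    """
    total = sum(allele_freqs)
    p = [f / total for f in allele_freqs]
    homozygosity = sum(x * x for x in p)
    # Correction for matings that are uninformative despite heterozygosity.
    cross = sum(2 * p[i] ** 2 * p[j] ** 2
                for i in range(len(p)) for j in range(i + 1, len(p)))
    return 1.0 - homozygosity - cross
```

    A marker with two equally frequent alleles gives 0.375, while four equally frequent alleles give about 0.703, which shows why multi-allelic SSRs tend to score higher than bi-allelic markers.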

  12. Expansion of CAG triplet repeats by human DNA polymerases λ and β in vitro, is regulated by flap endonuclease 1 and DNA ligase 1.

    Science.gov (United States)

    Crespan, Emmanuele; Hübscher, Ulrich; Maga, Giovanni

    2015-05-01

    Huntington's disease (HD) is a neurological genetic disorder caused by the expansion of the CAG trinucleotide repeats (TNR) in the N-terminal region of the coding sequence of the huntingtin (HTT) gene. This results in the addition of a poly-glutamine tract within the huntingtin protein, resulting in its pathological form. The mechanism by which TNR expansion takes place is not yet fully understood. We have recently shown that DNA polymerase (Pol) β can promote the microhomology-mediated end joining and triplet expansion of a substrate mimicking a double strand break in the TNR region of the HTT gene. Here we show that TNR expansion is dependent on the structure of the DNA substrate, as well as on the two essential Pol β co-factors: flap endonuclease 1 (Fen1) and DNA ligase 1 (Lig1). We found that Fen1 significantly stimulated TNR expansion by Pol β, but not by the related enzyme Pol λ, and subsequent ligation of the DNA products by Lig1. Interestingly, the deletion of N-terminal domains of Pol λ resulted in an enzyme which displayed properties more similar to Pol β, suggesting a possible evolutionary mechanism. These results may suggest a novel mechanism for somatic TNR expansion in HD.

  13. Molecular cloning and long terminal repeat sequences of human endogenous retrovirus genes related to types A and B retrovirus genes

    Energy Technology Data Exchange (ETDEWEB)

    Ono, M.

    1986-06-01

    By using a DNA fragment primarily encoding the reverse transcriptase (pol) region of the Syrian hamster intracisternal A particle (IAP; type A retrovirus) gene as a probe, human endogenous retrovirus genes, tentatively termed HERV-K genes, were cloned from a fetal human liver gene library. Typical HERV-K genes were 9.1 or 9.4 kilobases in length, having long terminal repeats (LTRs) of ca. 970 base pairs. Many structural features commonly observed on the retrovirus LTRs, such as the TATAA box, polyadenylation signal, and terminal inverted repeats, were present on each LTR, and a lysine (K) tRNA having a CUU anticodon was identified as a presumed primer tRNA. The HERV-K LTR, however, had little sequence homology to either the IAP LTR or other typical oncovirus LTRs. By filter hybridization, the number of HERV-K genes was estimated to be ca. 50 copies per haploid human genome. The cloned mouse mammary tumor virus (type B) gene was found to hybridize with both the HERV-K and IAP genes to essentially the same extent.

  14. BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations.

    Science.gov (United States)

    Bahr, A; Thompson, J D; Thierry, J C; Poch, O

    2001-01-01

    BAliBASE is specifically designed to serve as an evaluation resource to address all the problems encountered when aligning complete sequences. The database contains high quality, manually constructed multiple sequence alignments together with detailed annotations. The alignments are all based on three-dimensional structural superpositions, with the exception of the transmembrane sequences. The first release provided sets of reference alignments dealing with the problems of high variability, unequal repartition and large N/C-terminal extensions and internal insertions. Here we describe version 2.0 of the database, which incorporates three new reference sets of alignments containing structural repeats, transmembrane sequences and circular permutations to evaluate the accuracy of detection/prediction and alignment of these complex sequences. BAliBASE can be viewed at the web site http://www-igbmc.u-strasbg.fr/BioInfo/BAliBASE2/index.html or can be downloaded from ftp://ftp-igbmc.u-strasbg.fr/pub/BAliBASE2/.
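
    A reference alignment is typically used by scoring how many of its aligned residue pairs a test alignment reproduces. The abstract does not describe a scoring scheme, so the sum-of-pairs style measure below is an assumption of this sketch, with hypothetical function names; alignments are given as equal-length gapped strings in the same sequence order.

```python
from itertools import combinations

def aligned_pairs(alignment):
    """Set of residue pairs placed in the same column of an alignment.

    Each residue is identified as (sequence index, residue index), so the
    result does not depend on where the gap characters ("-") fall.
    """
    next_residue = [0] * len(alignment)
    pairs = set()
    for col in range(len(alignment[0])):
        present = []
        for s, row in enumerate(alignment):
            if row[col] != "-":
                present.append((s, next_residue[s]))
                next_residue[s] += 1
        pairs.update(combinations(present, 2))
    return pairs

def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs that the test alignment recovers."""
    ref_pairs = aligned_pairs(reference)
    return len(aligned_pairs(test) & ref_pairs) / len(ref_pairs)
```

    A score of 1.0 means every residue pair aligned in the reference is also aligned in the test alignment; misplaced gaps lower the score in proportion to the pairs they break.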

  15. Inter-simple sequence repeat (ISSR) loci mapping in the genome of perennial ryegrass

    DEFF Research Database (Denmark)

    Pivorienė, O; Pašakinskienė, I; Brazauskas, G;

    2008-01-01

    The aim of this study was to identify and characterize new ISSR markers and their loci in the genome of perennial ryegrass. A subsample of the VrnA F2 mapping family of perennial ryegrass comprising 92 individuals was used to develop a linkage map including inter-simple sequence repeat markers...... demonstrated a 70% similarity to the Hordeum vulgare germin gene GerA. Inter-SSR mapping will provide useful information for gene targeting, quantitative trait loci mapping and marker-assisted selection in perennial ryegrass....

  16. High-throughput DNA sequencing: a genomic data manufacturing process.

    Science.gov (United States)

    Huang, G M

    1999-01-01

    The progress trends in automated DNA sequencing operation are reviewed. Technological development in sequencing instruments, enzymatic chemistry and robotic stations has resulted in ever-increasing capacity of sequence data production. This progress leads to a higher demand on laboratory information management and data quality assessment. High-throughput laboratories face the challenge of organizational management, as well as technology management. Engineering principles of process control should be adopted in this biological data manufacturing procedure. While various systems attempt to provide solutions to automate different parts of, or even the entire process, new technical advances will continue to change the paradigm and provide new challenges.

  17. Molecular cloning and sequence analysis of hamster CENP-A cDNA

    Science.gov (United States)

    Figueroa, Javier; Pendón, Carlos; Valdivia, Manuel M

    2002-01-01

    Background: The centromere is a specialized locus that mediates chromosome movement during mitosis and meiosis. This chromosomal domain comprises a uniquely packaged form of heterochromatin that acts as a nucleus for the assembly of the kinetochore, a trilaminar proteinaceous structure on the surface of each chromatid at the primary constriction. Kinetochores mediate interactions with the spindle fibers of the mitotic apparatus. Centromere protein A (CENP-A) is a histone H3-like protein specifically located to the inner plate of the kinetochore at active centromeres. CENP-A works as a component of specialized nucleosomes at centromeres bound to arrays of repeat satellite DNA. Results: We have cloned the hamster homologue of human and mouse CENP-A. The cDNA isolated was found to contain an open reading frame encoding a polypeptide consisting of 129 amino acid residues with a C-terminal histone fold domain highly homologous to those of CENP-A and H3 sequences previously released. However, significant sequence divergence was found at the N-terminal region of hamster CENP-A, which is five and eleven residues shorter than those of mouse and human, respectively. Further, a human serine 7 residue, a target site for Aurora B kinase phosphorylation involved in the mechanism of cytokinesis, was not found in the hamster protein. A human autoepitope at the N-terminal region of CENP-A described in autoimmune diseases is not conserved in the hamster protein. Conclusions: We have cloned the hamster cDNA for the centromeric protein CENP-A. Significant differences in protein sequence were found at the N-terminal tail of hamster CENP-A in comparison with that of human and mouse. Our results show a high degree of evolutionary divergence of kinetochore CENP-A proteins in mammals. This is related to the highly diverse nucleotide repeat sequences found at the centromere DNA among species and supports a current centromere model for kinetochore function and structural plasticity. PMID:12019018

  18. RNA-DNA sequence differences spell genetic code ambiguities

    DEFF Research Database (Denmark)

    Bentin, Thomas; Nielsen, Michael L

    2013-01-01

    A recent paper in Science by Li et al. 2011(1) reports widespread sequence differences in the human transcriptome between RNAs and their encoding genes termed RNA-DNA differences (RDDs). The findings could add a new layer of complexity to gene expression but the study has been criticized. ...

  19. Functionalized nanopore-embedded electrodes for rapid DNA sequencing

    CERN Document Server

    He, Haiying; Pandey, Ravindra; Rocha, Alexandre Reily; Sanvito, Stefano; Grigoriev, Anton; Ahuja, Rajeev; Karna, Shashi P

    2007-01-01

    The determination of a patient's DNA sequence can, in principle, reveal an increased risk to fall ill with particular diseases [1,2] and help to design "personalized medicine" [3]. Moreover, statistical studies and comparison of genomes [4] of a large number of individuals are crucial for the analysis of mutations [5] and hereditary diseases, paving the way to preventive medicine [6]. DNA sequencing is, however, currently still a vastly time-consuming and very expensive task [4], consisting of pre-processing steps, the actual sequencing using the Sanger method, and post-processing in the form of data analysis [7]. Here we propose a new approach that relies on functionalized nanopore-embedded electrodes to achieve an unambiguous distinction of the four nucleic acid bases in the DNA sequencing process. This represents a significant improvement over previously studied designs [8,9] which cannot reliably distinguish all four bases of DNA. The transport properties of the setup investigated by us, employing state-o...

  20. POSA : Perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, JA; Jungerius, BJ; Groenen, MA

    2004-01-01

    Background: Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  1. POSA: perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, J.A.; Jungerius, B.J.; Groenen, M.A.M.

    2004-01-01

    Background - Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  2. DNA sequence handling programs in BASIC for home computers.

    OpenAIRE

    Biro, P A

    1984-01-01

    This paper describes a DNA sequence handling program written entirely in BASIC and designed to be run on an Atari home computer. Many of the features common to more sophisticated programs have been included. The advantages of this program are its convenience, its transportability and its potential for user modification. The disadvantages are lack of sophistication and speed.

  3. Decoding long nanopore sequencing reads of natural DNA.

    Science.gov (United States)

    Laszlo, Andrew H; Derrington, Ian M; Ross, Brian C; Brinkerhoff, Henry; Adey, Andrew; Nova, Ian C; Craig, Jonathan M; Langford, Kyle W; Samson, Jenny Mae; Daza, Riza; Doering, Kenji; Shendure, Jay; Gundlach, Jens H

    2014-08-01

    Nanopore sequencing of DNA is a single-molecule technique that may achieve long reads, low cost and high speed with minimal sample preparation and instrumentation. Here, we build on recent progress with respect to nanopore resolution and DNA control to interpret the procession of ion current levels observed during the translocation of DNA through the pore MspA. As approximately four nucleotides affect the ion current of each level, we measured the ion current corresponding to all 256 four-nucleotide combinations (quadromers). This quadromer map is highly predictive of ion current levels of previously unmeasured sequences derived from the bacteriophage phi X 174 genome. Furthermore, we show nanopore sequencing reads of phi X 174 up to 4,500 bases in length, which can be unambiguously aligned to the phi X 174 reference genome, and demonstrate proof-of-concept utility with respect to hybrid genome assembly and polymorphism detection. This work provides a foundation for nanopore sequencing of long, natural DNA strands.
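
    The quadromer idea above can be sketched in a few lines of Python: enumerate all 4^4 = 256 four-nucleotide combinations, then slide a four-base window along a sequence and look up one level per window. This is only an illustration. The function names are hypothetical, the one-window-per-base stepping is an assumption, and the lookup table is filled with placeholder integers because the measured ion current values are not given in the abstract.

```python
from itertools import product


def all_quadromers():
    """Return every four-nucleotide combination over A, C, G, T (4**4 = 256)."""
    return ["".join(p) for p in product("ACGT", repeat=4)]


def quadromer_windows(seq):
    """Slide a 4-base window along seq, one base at a time (assumed stepping)."""
    return [seq[i:i + 4] for i in range(len(seq) - 3)]


def predict_levels(seq, quadromer_map):
    """Look up one level per window; quadromer_map is a caller-supplied table."""
    return [quadromer_map[q] for q in quadromer_windows(seq)]


# Placeholder table: index numbers stand in for measured currents (NOT real data).
toy_map = {q: i for i, q in enumerate(all_quadromers())}
```

    With a real table of measured levels in place of toy_map, predict_levels would yield the expected level sequence for a candidate reference, which could then be compared against an observed read.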

  4. POSA : Perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, JA; Jungerius, BJ; Groenen, MA

    2004-01-01

    Background: Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  5. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
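
    As a rough illustration of the third step (finding sequences that contain microsatellites within user-specified limits), a perfect tandem repeat search can be written with a back-referencing regular expression. This is a minimal sketch and not SSR_pipeline's own algorithm, which the abstract does not describe. The function name, the parameter names, and the default of three copies are assumptions; only the 2 to 25 bp motif range comes from the text.

```python
import re


def find_ssrs(seq, min_motif=2, max_motif=25, min_repeats=3):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # ([ACGT]{k}) captures a candidate motif; \1{n,} requires n more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. ACAC, AAA).
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

    For example, find_ssrs("TTGACACACACACGGATCATCATCATCTT") reports an (AC)5 repeat starting at position 3 and an (ATC)4 repeat starting at position 15. Compound and imperfect repeats would need additional logic.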

  6. Detection of genomic variation by selection of a 9 mb DNA region and high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Sergey I Nikolaev

    Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872). We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage ≥ 4-fold, and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.

  7. A simple method encoding linear single strain DNA sequence with natural numbers

    Institute of Scientific and Technical Information of China (English)

    LI Jiye; XU Yuan; ZHANG Wang

    2008-01-01

    A simple method for representing a linear single strain DNA (LssDNA) sequence with natural numbers is introduced in this paper. The method represents the bases of an LssDNA with the numerals 1, 2, 3 and 4. After calculation, the sequence can be coded as a natural number, which can also be decoded back into the DNA sequence. Thus, an LssDNA sequence can be expressed as a natural number and as a dot on coordinate axes. In the future, a new LssDNA sequence database termed "DotBank" could be realized in which each LssDNA sequence is represented as a dot.
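
    One way such a reversible coding could work is bijective base-4 numeration, in which every non-empty string over the digits 1 to 4 corresponds to exactly one positive integer. The sketch below is an assumption for illustration: the abstract does not state the actual calculation or which base receives which numeral, so the A, C, G, T to 1, 2, 3, 4 assignment and the function names are hypothetical.

```python
# Assumed digit assignment; the paper's actual mapping is not given in the abstract.
DIGIT = {"A": 1, "C": 2, "G": 3, "T": 4}
BASE = {value: key for key, value in DIGIT.items()}


def encode(seq):
    """Map a DNA string to a natural number using bijective base-4 digits 1..4."""
    n = 0
    for base in seq.upper():
        n = n * 4 + DIGIT[base]
    return n


def decode(n):
    """Invert encode(): recover the DNA string from its natural number."""
    bases = []
    while n > 0:
        n, r = divmod(n - 1, 4)  # shift by one because digits run 1..4, not 0..3
        bases.append(BASE[r + 1])
    return "".join(reversed(bases))
```

    Because no digit is zero, leading bases are never lost: encode("A") is 1 and encode("AA") is 5, so sequences of different lengths never collide, and decode(encode(s)) returns s for any sequence of A, C, G and T.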

  8. Mitochondrial Inverted Repeats Strongly Correlate with Lifespan: mtDNA Inversions and Aging

    Science.gov (United States)

    Yang, Jiang-Nan; Seluanov, Andrei; Gorbunova, Vera

    2013-01-01

    Mitochondrial defects are implicated in aging and in a multitude of age-related diseases, such as cancer, heart failure, Parkinson’s disease, and Huntington’s disease. However, it is still unclear how mitochondrial defects arise under normal physiological conditions. Mitochondrial DNA (mtDNA) deletions caused by direct repeats (DRs) are implicated in the formation of mitochondrial defects; however, mitochondrial DRs show relatively weak (Pearson’s r = −0.22, p<0.002; Spearman’s ρ = −0.12, p = 0.1) correlation with maximum lifespan (MLS). Here we report a stronger correlation (Pearson’s r = −0.55, p<10^−16; Spearman’s ρ = −0.52, p<10^−14) between mitochondrial inverted repeats (IRs) and lifespan across 202 species of mammals. We show that, in wild type mice under normal conditions, IRs cause inversions, which arise by a replication-dependent mechanism. The inversions accumulate with age in the brain and heart. Our data suggest that IR-mediated inversions are more mutagenic than DR-mediated deletions in mtDNA, and impose stronger constraint on lifespan. Our study identifies IR-induced mitochondrial genome instability during mtDNA replication as a potential cause for mitochondrial defects. PMID:24069185
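
    For readers unfamiliar with the term, an inverted repeat is a stretch of sequence whose reverse complement occurs elsewhere on the same strand. The sketch below finds exact inverted repeats of a fixed length k. It is an illustration only and not the counting procedure used in the study, which the abstract does not describe. The function names and the choice of exact, fixed-length matching are assumptions.

```python
def revcomp(s):
    """Reverse complement of an uppercase DNA string."""
    return s.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def inverted_repeats(seq, k):
    """Return sorted (i, j, kmer): kmer at i, its reverse complement at j >= i + k."""
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    hits = []
    for kmer, starts in positions.items():
        for j in positions.get(revcomp(kmer), []):
            for i in starts:
                if j >= i + k:  # downstream and non-overlapping
                    hits.append((i, j, kmer))
    return sorted(hits)
```

    For example, inverted_repeats("ACGGTAAAAACCGT", 5) returns [(0, 9, "ACGGT")], because ACCGT at position 9 is the reverse complement of ACGGT at position 0.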

  9. Mitochondrial inverted repeats strongly correlate with lifespan: mtDNA inversions and aging.

    Directory of Open Access Journals (Sweden)

    Jiang-Nan Yang

    Mitochondrial defects are implicated in aging and in a multitude of age-related diseases, such as cancer, heart failure, Parkinson's disease, and Huntington's disease. However, it is still unclear how mitochondrial defects arise under normal physiological conditions. Mitochondrial DNA (mtDNA) deletions caused by direct repeats (DRs) are implicated in the formation of mitochondrial defects, however, mitochondrial DRs show relatively weak (Pearson's r = -0.22, p<0.002; Spearman's ρ = -0.12, p = 0.1) correlation with maximum lifespan (MLS). Here we report a stronger correlation (Pearson's r = -0.55, p<10^-16; Spearman's ρ = -0.52, p<10^-14) between mitochondrial inverted repeats (IRs) and lifespan across 202 species of mammals. We show that, in wild type mice under normal conditions, IRs cause inversions, which arise by replication-dependent mechanism. The inversions accumulate with age in the brain and heart. Our data suggest that IR-mediated inversions are more mutagenic than DR-mediated deletions in mtDNA, and impose stronger constraint on lifespan. Our study identifies IR-induced mitochondrial genome instability during mtDNA replication as a potential cause for mitochondrial defects.

  10. Sequence heterogeneity accelerates protein search for targets on DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shvets, Alexey A.; Kolomeisky, Anatoly B., E-mail: tolya@rice.edu [Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005 (United States)

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  11. Solid-State Nanopore-Based DNA Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Zewen Liu

    2016-01-01

    Solid-state nanopore-based DNA sequencing technology is becoming increasingly attractive for its promise in the field of gene detection. The challenges that need to be addressed are diverse: effective methods to detect base-specific signatures, control of the nanopore’s size and surface properties, and modulation of the translocation velocity and behavior of the DNA molecules. Among these challenges, the realization of high-quality nanopores with the help of modern micro/nanofabrication technologies is a crucial one. In this paper, typical technologies applied in the field of solid-state nanopore-based DNA sequencing are reviewed.

  12. Electronic density of states in sequence dependent DNA molecules

    Science.gov (United States)

    de Oliveira, B. P. W.; Albuquerque, E. L.; Vasconcelos, M. S.

    2006-09-01

    We report in this work a numerical study of the electronic density of states (DOS) in π-stacked arrays of DNA single-strand segments made up from the nucleotides guanine G, adenine A, cytosine C and thymine T, forming Rudin-Shapiro (RS) as well as Fibonacci (FB) polyGC quasiperiodic sequences. Both structures are constructed starting from a G nucleotide as seed and following their respective inflation rules. Our theoretical method uses Dyson's equation together with a transfer-matrix treatment, within an electronic tight-binding Hamiltonian model, suitable to describe the DNA segments modelled by the quasiperiodic chains. We compared the DOS spectra found for the quasiperiodic structures to those using a sequence of natural DNA, as part of the human chromosome Ch22, with a remarkable concordance, as far as the RS structure is concerned. The electronic spectrum shows several peaks, corresponding to localized states, as well as a striking self-similar aspect.
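
    Building a quasiperiodic chain from a seed by repeated inflation can be sketched as follows. The abstract does not list the inflation rules, so the Fibonacci rule used here (G to GC, C to G) is the standard two-letter Fibonacci substitution and is an assumption about the paper's exact choice. The function and variable names are hypothetical. A Rudin-Shapiro chain would use the same function with a four-letter rule table.

```python
def inflate(seed, rules, generations):
    """Apply a substitution (inflation) rule to every symbol, repeatedly."""
    seq = seed
    for _ in range(generations):
        seq = "".join(rules[symbol] for symbol in seq)
    return seq


# Assumed Fibonacci polyGC rule: G -> GC, C -> G (standard Fibonacci substitution).
FIBONACCI_GC = {"G": "GC", "C": "G"}
```

    Starting from the seed G, successive generations are G, GC, GCG, GCGGC, GCGGCGCG, with lengths following the Fibonacci numbers 1, 2, 3, 5, 8.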

  13. A blind testing design for authenticating ancient DNA sequences.

    Science.gov (United States)

    Yang, H; Golenberg, E M; Shoshani, J

    1997-04-01

    Reproducibility is a serious concern among researchers of ancient DNA. We designed a blind testing procedure to evaluate laboratory accuracy and authenticity of ancient DNA obtained from closely related extant and extinct species. Soft tissue and bones of fossil and contemporary museum proboscideans were collected and identified based on morphology by one researcher, and other researchers carried out DNA testing on the samples, which were assigned anonymous numbers. DNA extracted using three principal isolation methods served as template in PCR amplifications of a segment of the cytochrome b gene (mitochondrial genome), and the PCR product was directly sequenced and analyzed. The results show that such a blind testing design performed in one laboratory, when coupled with phylogenetic analysis, can nonarbitrarily test the consistency and reliability of ancient DNA results. Such reproducible results obtained from the blind testing can increase confidence in the authenticity of ancient sequences obtained from postmortem specimens and avoid bias in phylogenetic analysis. A blind testing design may be applicable as an alternative to confirm ancient DNA results in one laboratory when independent testing by two laboratories is not available.

  14. POSA: Perl Objects for DNA Sequencing Data Analysis

    Directory of Open Access Journals (Sweden)

    Jungerius Bart J

    2004-08-01

    Background: Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide modules that need advanced informatics skills to allow implementation in pipelines. Results: Here we present POSA, a pair of new perl objects that describe DNA sequence traces and Phrap contig assemblies in detail. Methods in POSA include basecalling with quality scores (by Phred), contig assembly (by Phrap), generation of primer3 input and automated SNP annotation (by PolyPhred). Although easily implemented by users with only limited programming experience, these objects considerably reduce hands-on analysis time compared to using the Staden package for extracting sequence information from raw sequencing files and for SNP discovery. Conclusions: The POSA objects allow a flexible and easy design, implementation and usage of perl-based pipelines to handle and analyze DNA sequencing data, while requiring only minor programming skills.

  15. DNA watermarks in non-coding regulatory sequences

    Directory of Open Access Journals (Sweden)

    Pyka Martin

    2009-07-01

    Background: DNA watermarks can be applied to identify the unauthorized use of genetically modified organisms. It has been shown that coding regions can be used to encrypt information into living organisms by using the DNA-Crypt algorithm. Yet, if the sequence of interest is a non-coding DNA sequence, either the function of a resulting functional RNA molecule or a regulatory sequence, such as a promoter, could be affected. For our studies we used the small cytoplasmic RNA 1 in yeast and the lac promoter region of Escherichia coli. Findings: The lac promoter was deactivated by the integrated watermark. In addition, the RNA molecules displayed altered configurations after introducing a watermark, but surprisingly were functionally intact, which has been verified by analyzing the growth characteristics of both wild type and watermarked scR1 transformed yeast cells. In a third approach we introduced a second overlapping watermark into the lac promoter, which did not affect the promoter activity. Conclusion: Even though the watermarked RNA and one of the watermarked promoters did not show any significant differences compared to the wild type RNA and wild type promoter region, respectively, it cannot be generalized that other RNA molecules or regulatory sequences behave accordingly. Therefore, we do not recommend integrating watermark sequences into regulatory regions.

  16. Discrimination of Shark species by simple PCR of 5S rDNA repeats

    Directory of Open Access Journals (Sweden)

    Danillo Pinhal

    2008-01-01

    Sharks are suffering from intensive exploitation by worldwide fisheries, leading to a severe decline in several populations in recent decades. The lack of biological data on a species-specific basis, together with a K-strategist life history, makes it difficult to correctly manage and conserve these animals. The aim of the present study was to develop a DNA-based procedure to discriminate shark species by means of a rapid, low-cost and easily applicable PCR analysis based on amplification of 5S rDNA repeat units, in order to contribute to the conservation management of these animals. The agarose electrophoresis band patterns generated allowed eight shark species to be distinguished unequivocally. The data showed for the first time that a simple PCR is able to discriminate elasmobranch species. The described 5S rDNA PCR approach generated species-specific genetic markers that should find broad application in fishery management and trade of sharks and their subproducts.

  17. Relative Telomere Repeat Mass in Buccal and Leukocyte-Derived DNA

    Science.gov (United States)

    Finnicum, Casey T.; Dolan, Conor V.; Willemsen, Gonneke; Weber, Zachary M.; Petersen, Jason L.; Beck, Jeffrey J.; Codd, Veryan; Boomsma, Dorret I.; Davies, Gareth E.; Ehli, Erik A.

    2017-01-01

    Telomere length has garnered interest due to the potential role it may play as a biomarker for the cellular aging process. Telomere measurements obtained from blood-derived DNA are often used in epidemiological studies. However, the invasive nature of blood draws severely limits sample collection, particularly with children. Buccal cells are commonly sampled for DNA isolation and thus may present a non-invasive alternative for telomere measurement. Buccal and leukocyte derived DNA obtained from samples collected at the same time period were analyzed for telomere repeat mass (TRM). TRM was measured in buccal-derived DNA samples from individuals for whom previous TRM data from blood samples existed. TRM measurement was performed by qPCR and was normalized to the single copy 36B4 gene relative to a reference DNA sample (K562). Correlations between TRM from blood and buccal DNA were obtained and also between the same blood DNA samples measured in separate laboratories. Using the classical twin design, TRM heritability was estimated (N = 1892, MZ = 1044, DZ = 775). Buccal samples measured for TRM showed a significant correlation with the blood-1 (R = 0.39, p < 0.01) and blood-2 (R = 0.36, p < 0.01) samples. Sex and age effects were observed within the buccal samples as is the norm within blood-derived DNA. The buccal, blood-1, and blood-2 measurements generated heritability estimates of 23.3%, 47.6% and 22.2%, respectively. Buccal derived DNA provides a valid source for the determination of TRM, paving the way for non-invasive projects, such as longitudinal studies in children. PMID:28125671
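
    Normalizing a qPCR telomere signal to a single-copy gene and then to a reference DNA is commonly done with the comparative Ct (2^-ΔΔCt) calculation. The sketch below shows that arithmetic only. It assumes 100% amplification efficiency for both reactions, and it is not necessarily the computation used in the study, which the abstract does not spell out. The function and parameter names are hypothetical.

```python
def relative_trm(ct_telo, ct_scg, ref_ct_telo, ref_ct_scg):
    """Relative telomere signal by the comparative Ct method (assumed approach).

    ct_telo / ct_scg: sample Ct values for the telomere and single-copy gene
    (e.g. 36B4) reactions; ref_*: the same for the reference DNA (e.g. K562).
    Assumes both reactions double the product each cycle (100% efficiency).
    """
    delta_sample = ct_telo - ct_scg
    delta_ref = ref_ct_telo - ref_ct_scg
    return 2.0 ** -(delta_sample - delta_ref)
```

    A sample whose telomere reaction crosses threshold one cycle earlier than the reference, with identical single-copy gene Ct values, gives a relative value of 2.0. A sample identical to the reference gives 1.0.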

  18. A Nano-Biosensor for DNA Sequence Detection Using Absorption Spectra of SWNT-DNA Composite

    Directory of Open Access Journals (Sweden)

    J. Bansal

    2011-01-01

    A biosensor based on a Single Walled Carbon Nanotube (SWNT)-poly(GT)n ssDNA hybrid has been developed for medical diagnostics. The absorption spectrum of this assay is determined with the help of a Shimadzu UV-VIS-NIR spectrophotometer. Two distinct bands, each containing three peaks corresponding to the first and second van Hove singularities in the density of states of the nanotubes, were observed in the absorption spectrum. When a single-stranded DNA (ssDNA) having a sequence complementary to the probe DNA is added to the ssDNA-SWNT conjugates, hybridization takes place, which causes a red shift of the absorption spectrum of the nanotubes. On the other hand, when the DNA is noncomplementary, no shift in the absorption spectrum occurs since hybridization between the DNA and probe does not take place. The red shift of the spectrum is considered to be due to a change in the dielectric environment around the nanotubes.

  19. Behavior of Repeating Earthquake Sequences in Central California and the Implications for Subsurface Fault Creep

    Energy Technology Data Exchange (ETDEWEB)

    Templeton, D C; Nadeau, R; Burgmann, R

    2007-07-09

    Repeating earthquakes (REs) are sequences of events that have nearly identical waveforms and are interpreted to represent fault asperities driven to failure by loading from aseismic creep on the surrounding fault surface at depth. We investigate the occurrence of these REs along faults in central California to determine which faults exhibit creep and the spatio-temporal distribution of this creep. At the juncture of the San Andreas and southern Calaveras-Paicines faults, both faults as well as a smaller secondary fault, the Quien Sabe fault, are observed to produce REs over the observation period of March 1984-May 2005. REs in this area reflect a heterogeneous creep distribution along the fault plane with significant variations in time. Cumulative slip over the observation period at individual sequence locations is determined to range from 5.5-58.2 cm on the San Andreas fault, 4.8-14.1 cm on the southern Calaveras-Paicines fault, and 4.9-24.8 cm on the Quien Sabe fault. Creep at depth appears to mimic the behavior of creep seen at the surface, in that evidence of steady slip, triggered slip, and episodic slip phenomena is also observed in the RE sequences. For comparison, we investigate the occurrence of REs west of the San Andreas fault within the southern Coast Range. Events within these RE sequences only occurred minutes to weeks apart from each other and then did not repeat again over the observation period, suggesting that REs in this area are not produced by steady aseismic creep of the surrounding fault surface.

  20. Evaluating sequence-derived mtDNA length heteroplasmy by amplicon size analysis

    Science.gov (United States)

    Berger, C.; Hatzer-Grubwieser, P.; Hohoff, C.; Parson, W.

    2011-01-01

    Length heteroplasmy (LH) in mitochondrial (mt)DNA is usually observed in homopolymeric tracts and manifests as a mixture of various length variants. The generally used difference-coded annotation to report mtDNA haplotypes does not express the degree of LH variation present in a sample; moreover, it is sometimes difficult to establish which length variants are present and clearly distinguishable from background noise. It has therefore become routine practice for some researchers to call the dominant type, the “major molecule”, which represents the LH variant that is most abundant in a DNA extract. In the majority of cases a clear single dominant variant can be identified. However, in some samples this interpretation is difficult, i.e. when (almost) equally quantitative LH variants are present or when multiple sequencing primers result in the presentation of different dominant types. To better understand those cases we designed amplicon sizing assays for the five most relevant LH regions in the mtDNA control region (around ntps 16,189, 310, 460, 573, and the AC-repeat between 514 and 524) to determine the ratio of the LH variants by fluorescence-based amplicon sizing. For difficult LH constellations derived by Sanger sequencing (with Big Dye terminators) these assays mostly gave clear and unambiguous results. In the vast majority of cases we found agreement between the results of the sequence and amplicon analyses, and we propose this alternative method for difficult cases. PMID:21067985

  1. Environmental DNA sequencing primers for eutardigrades and bdelloid rotifers

    Directory of Open Access Journals (Sweden)

    Martin Andrew P

    2009-12-01

    Background: The time it takes to isolate individuals from environmental samples and then extract DNA from each individual is one of the problems with generating molecular data from meiofauna such as eutardigrades and bdelloid rotifers. The lack of consistent morphological information and the extreme abundance of these classes make morphological identification of rare, or even common cryptic taxa a large and unwieldy task. This limits the ability to perform la