Sample records for alu repeat sequences

  1. Alu repeats as markers for forensic DNA analyses

    Energy Technology Data Exchange (ETDEWEB)

    Batzer, M.A.; Alegria-Hartman, M. [Lawrence Livermore National Lab., CA (United States)]; Kass, D.H. [Louisiana State Univ., New Orleans, LA (United States)] [and others]


    The Human-Specific (HS) subfamily of Alu sequences comprises a group of 500 nearly identical members which are almost exclusively restricted to the human genome. Individual subfamily members share an average of 98.9% nucleotide identity with the HS subfamily consensus sequence, and have an average age of 2.8 million years. We have developed a Polymerase Chain Reaction (PCR) based assay using primers complementary to the 5' and 3' unique flanking DNA sequences from each HS Alu that allows the locus to be assayed for the presence or absence of the Alu repeat. The dimorphic HS Alu sequences probably inserted in the human genome after the radiation of modern humans (within the last 200,000 to one million years) and represent a unique source of information for human population genetics and forensic DNA analyses. These sites can be developed into Dimorphic Alu Sequence Tagged Sites (DASTS) for the Human Genome Project. HS Alu family member insertions differ from other types of polymorphism (e.g. Variable Number of Tandem Repeat [VNTR] or Restriction Fragment Length Polymorphism [RFLP]) in that polymorphisms due to Alu insertions arise as a result of a unique event which has occurred only one time in the human population and spread through the population from that point. Therefore, individuals that share HS Alu repeats inherited these elements from a common ancestor. Most VNTR and RFLP polymorphisms may arise multiple times in parallel within a population.

  2. Roles of genes and Alu repeats in nonlinear correlations of HUMHBB DNA sequence

    International Nuclear Information System (INIS)

    Xiao Yi; Huang Yanzhao


    DNA sequences of different species and different portions of the DNA of the same species may have completely different correlation properties, but the origin of these correlations is still not very clear and is currently being investigated, especially in particular cases. We report here a study of the DNA sequence of the human beta globin region (HUMHBB), which has strong linear and nonlinear correlations. We studied the roles of two of the typical elements of DNA sequence, genes and Alu repeats, in the nonlinear correlations of HUMHBB. We find that there exist strong nonlinear correlations between the exons or introns in different genes and between the Alu repeats. They may be one of the major sources of the nonlinear correlations in HUMHBB.

  3. Alu repeats as markers for human population genetics

    Energy Technology Data Exchange (ETDEWEB)

    Batzer, M.A.; Alegria-Hartman, M. [Lawrence Livermore National Lab., CA (United States)]; Bazan, H. [Louisiana State Univ., New Orleans, LA (United States). Medical Center] [and others]


    The Human-Specific (HS) subfamily of Alu sequences comprises a group of 500 nearly identical members which are almost exclusively restricted to the human genome. Individual subfamily members share an average of 97.9% nucleotide identity with each other and an average of 98.9% nucleotide identity with the HS subfamily consensus sequence. HS Alu family members are thought to be derived from a single source "master" gene, and have an average age of 2.8 million years. We have developed a Polymerase Chain Reaction (PCR) based assay using primers complementary to the 5' and 3' unique flanking DNA sequences from each HS Alu that allows the locus to be assayed for the presence or absence of an Alu repeat. Individual HS Alu sequences were found to be either monomorphic or dimorphic for the presence or absence of each repeat. The monomorphic HS Alu family members inserted in the human genome after the human/great ape divergence (which is thought to have occurred 4-6 million years ago), but before the radiation of modern man. The dimorphic HS Alu sequences inserted in the human genome after the radiation of modern man (within the last 200,000 to one million years) and represent a unique source of information for human population genetics and forensic DNA analyses. These sites can be developed into Dimorphic Alu Sequence Tagged Sites (DASTS) for the Human Genome Project as well. HS Alu family member insertion dimorphism differs from other types of polymorphism (e.g. Variable Number of Tandem Repeat [VNTR] or Restriction Fragment Length Polymorphism [RFLP]) because individuals share HS Alu family member insertions based upon identity by descent from a common ancestor as a result of a single event which occurred one time within the human population. The VNTR and RFLP polymorphisms may arise multiple times within a population and are identical by state only.
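    The presence/absence scoring described in this record lends itself to a small worked example. The sketch below is only an illustration, not the authors' procedure: it assumes each individual has already been genotyped at one locus and encoded with the hypothetical labels `+/+`, `+/-` and `-/-`, then computes the insertion allele frequency and labels the locus monomorphic or dimorphic.

```python
from collections import Counter

VALID_GENOTYPES = ("+/+", "+/-", "-/-")  # hypothetical encoding of PCR results


def insertion_allele_frequency(genotypes):
    """Return the frequency of the Alu-present allele at one locus."""
    counts = Counter(genotypes)
    unknown = set(counts) - set(VALID_GENOTYPES)
    if unknown:
        raise ValueError(f"unrecognised genotype labels: {sorted(unknown)}")
    n_individuals = sum(counts.values())
    if n_individuals == 0:
        raise ValueError("no genotypes supplied")
    # Each homozygote carries two insertion alleles, each heterozygote one.
    present_alleles = 2 * counts["+/+"] + counts["+/-"]
    return present_alleles / (2 * n_individuals)


def classify_locus(genotypes):
    """Label a locus by whether both alleles are observed in the sample."""
    freq = insertion_allele_frequency(genotypes)
    if freq == 1.0:
        return "monomorphic (insertion fixed)"
    if freq == 0.0:
        return "monomorphic (insertion absent)"
    return "dimorphic"


if __name__ == "__main__":
    sample = ["+/+", "+/-", "-/-", "+/-"]
    print(insertion_allele_frequency(sample), classify_locus(sample))
```

    A locus called monomorphic in a small sample may still be dimorphic in a larger one, so the label describes the sample rather than the population.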

  4. Impact of Alu repeats on the evolution of human p53 binding sites

    Directory of Open Access Journals (Sweden)

    Sirotin Michael V


    Background: The p53 tumor suppressor protein is involved in a complicated regulatory network, mediating expression of ~1000 human genes. Recent studies have shown that many p53 in vivo binding sites (BSs) reside in transposable repeats. The relationship between these BSs and functional p53 response elements (REs) remains unknown, however. We sought to understand whether the p53 REs also reside in transposable elements and particularly in the most-abundant Alu repeats. Results: We have analyzed ~160 functional p53 REs identified so far and found that 24 of them occur in repeats. More than half of these repeat-associated REs reside in Alu elements. In addition, using a position weight matrix approach, we found ~400,000 potential p53 BSs in Alu elements genome-wide. Importantly, these putative BSs are located in the same regions of Alu repeats as the functional p53 REs - namely, in the vicinity of Boxes A/A' and B of the internal RNA polymerase III promoter. Earlier nucleosome-mapping experiments showed that the Boxes A/A' and B have a different chromatin environment, which is critical for the binding of p53 to DNA. Here, we compare the Alu-residing p53 sites with the corresponding Alu consensus sequences and conclude that the p53 sites likely evolved through two different mechanisms - the sites overlapping with the Boxes A/A' were generated by CG → TG mutations; the other sites apparently pre-existed in the progenitors of several Alu subfamilies, such as AluJo and AluSq. The binding affinity of p53 to the Alu-residing sites generally correlates with the age of Alu subfamilies, so that the strongest sites are embedded in the 'relatively young' Alu repeats. Conclusions: The primate-specific Alu repeats play an important role in shaping the p53 regulatory network in the context of chromatin. One of the selective factors responsible for the frequent occurrence of Alu repeats in introns may be related to the p53-mediated regulation of Alu
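    The record above mentions a position weight matrix (PWM) approach without giving details. As a rough illustration of how such a scan generally works, the sketch below builds a log-odds matrix from a handful of aligned sites and slides it along a sequence. The four-base training sites, the uniform background, the pseudocount and the threshold are all made-up assumptions for the example; they are not the p53 matrix or the parameters used in the study.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from equal-length aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(width):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    toy_sites = ["CATG", "CATG", "CTTG", "CAAG"]  # hypothetical training sites
    toy_pwm = build_pwm(toy_sites)
    for hit in scan("GGCATGCC", toy_pwm, threshold=3.0):
        print(hit)
```

    A genome-wide scan would also need to score the reverse complement strand and choose the threshold from a background score distribution; both are omitted here for brevity.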

  5. Structural organization of glycophorin A and B genes: Glycophorin B gene evolved by homologous recombination at Alu repeat sequences

    International Nuclear Information System (INIS)

    Kudo, Shinichi; Fukuda, Minoru


    Glycophorins A (GPA) and B (GPB) are two major sialoglycoproteins of the human erythrocyte membrane. Here the authors present a comparison of the genomic structures of GPA and GPB developed by analyzing DNA clones isolated from a K562 genomic library. Nucleotide sequences of exon-intron junctions and 5' and 3' flanking sequences revealed that the GPA and GPB genes consist of 7 and 5 exons, respectively, and both genes have >95% identical sequence from the 5' flanking region to the region ∼1 kilobase downstream from the exon encoding the transmembrane regions. In this homologous part of the genes, GPB lacks one exon due to a point mutation at the 5' splicing site of the third intron, which inactivates the 5' cleavage event of splicing and leads to ligation of the second to the fourth exon. Following these very homologous sequences, the genomic sequences for GPA and GPB diverge significantly and no homology can be detected in their 3' end sequences. The analysis of the Alu sequences and their flanking direct repeat sequences suggests that an ancestral genomic structure has been maintained in the GPA gene, whereas the GPB gene has arisen from the acquisition of 3' sequences different from those of the GPA gene by homologous recombination at the Alu repeats during or after gene duplication.

  6. Genome-wide tracking of unmethylated DNA Alu repeats in normal and cancer cells

    DEFF Research Database (Denmark)

    Rodriguez, Jairo; Vives, Laura; Jordà, Mireia


    Methylation of cytosine is the most frequent epigenetic modification of DNA in mammalian cells. In humans, most of the methylated cytosines are found in CpG-rich sequences within tandem and interspersed repeats that make up as much as 45% of the human genome, with Alu repeats being the most common family....

  7. Alu polymerase chain reaction: A method for rapid isolation of human-specific sequences from complex DNA sources

    International Nuclear Information System (INIS)

    Nelson, D.L.; Ledbetter, S.A.; Corbo, L.; Victoria, M.F.; Ramirez-Solis, R.; Webster, T.D.; Ledbetter, D.H.; Caskey, C.T.


    Current efforts to map the human genome are focused on individual chromosomes or smaller regions and frequently rely on the use of somatic cell hybrids. The authors report the application of the polymerase chain reaction to direct amplification of human DNA from hybrid cells containing regions of the human genome in rodent cell backgrounds using primers directed to the human Alu repeat element. They demonstrate Alu-directed amplification of a fragment of the human HPRT gene from both hybrid cell and cloned DNA and identify through sequence analysis the Alu repeats involved in this amplification. They also demonstrate the application of this technique to identify the chromosomal locations of large fragments of the human X chromosome cloned in a yeast artificial chromosome and the general applicability of the method to the preparation of DNA probes from cloned human sequences. The technique allows rapid gene mapping and provides a simple method for the isolation and analysis of specific chromosomal regions

  8. Sequence Analysis and Characterization of Active Human Alu Subfamilies Based on the 1000 Genomes Pilot Project. (United States)

    Konkel, Miriam K; Walker, Jerilyn A; Hotard, Ashley B; Ranck, Megan C; Fontenot, Catherine C; Storer, Jessica; Stewart, Chip; Marth, Gabor T; Batzer, Mark A


    The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic "young" Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
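    One of the per-locus features listed in this record, the A-rich tail length, can be approximated very simply. The helper below is a minimal sketch under a strong assumption: it counts only the uninterrupted run of A residues at the 3' end of an insert given on the element's sense strand, whereas real A-rich tails can contain interspersed non-A bases. The function name is hypothetical and not taken from the study.

```python
def a_tail_length(insert_sequence):
    """Length of the uninterrupted 3'-terminal run of A residues."""
    seq = insert_sequence.upper()
    # rstrip removes only trailing A characters, so the difference in
    # length is exactly the size of the terminal poly(A) run.
    return len(seq) - len(seq.rstrip("A"))


if __name__ == "__main__":
    example = "GGCC" + "A" * 6
    print(a_tail_length(example))
```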

  9. Tracking Alu evolution in New World primates

    Directory of Open Access Journals (Sweden)

    Batzer Mark A


    Background: Alu elements are Short INterspersed Elements (SINEs) in primate genomes that have proven useful as markers for studying genome evolution, population biology and phylogenetics. Most of these applications, however, have been limited to humans and their nearest relatives, chimpanzees. In an effort to expand our understanding of Alu sequence evolution and to increase the applicability of these markers to non-human primate biology, we have analyzed available Alu sequences for loci specific to platyrrhine (New World) primates. Results: Branching patterns along an Alu sequence phylogeny indicate three major classes of platyrrhine-specific Alu sequences. Sequence comparisons further reveal at least three New World monkey-specific subfamilies: AluTa7, AluTa10, and AluTa15. Two of these subfamilies appear to be derived from a gene conversion event that has produced a recently active fusion of AluSc- and AluSp-type elements. This is a novel mode of origin for new Alu subfamilies. Conclusion: The use of Alu elements as genetic markers in studies of genome evolution, phylogenetics, and population biology has been very productive when applied to humans. The characterization of these three new Alu subfamilies not only increases our understanding of Alu sequence evolution in primates, but also opens the door to the application of these genetic markers outside the hominid lineage.

  10. The contribution of Alu elements to mutagenic DNA double-strand break repair. (United States)

    Morales, Maria E; White, Travis B; Streva, Vincent A; DeFreece, Cecily B; Hedges, Dale J; Deininger, Prescott L


    Alu elements make up the largest family of human mobile elements, numbering 1.1 million copies and comprising 11% of the human genome. As a consequence of evolution and genetic drift, Alu elements of various sequence divergence exist throughout the human genome. Alu/Alu recombination has been shown to cause approximately 0.5% of new human genetic diseases and contribute to extensive genomic structural variation. To begin understanding the molecular mechanisms leading to these rearrangements in mammalian cells, we constructed Alu/Alu recombination reporter cell lines containing Alu elements ranging in sequence divergence from 0%-30% that allow detection of both Alu/Alu recombination and large non-homologous end joining (NHEJ) deletions that range from 1.0 to 1.9 kb in size. Introduction of as little as 0.7% sequence divergence between Alu elements resulted in a significant reduction in recombination, which indicates even small degrees of sequence divergence reduce the efficiency of homology-directed DNA double-strand break (DSB) repair. Further reduction in recombination was observed in a sequence divergence-dependent manner for diverged Alu/Alu recombination constructs with up to 10% sequence divergence. With greater levels of sequence divergence (15%-30%), we observed a significant increase in DSB repair due to a shift from Alu/Alu recombination to variable-length NHEJ which removes sequence between the two Alu elements. This increase in NHEJ deletions depends on the presence of Alu sequence homeology (similar but not identical sequences). Analysis of recombination products revealed that Alu/Alu recombination junctions occur more frequently in the first 100 bp of the Alu element within our reporter assay, just as they do in genomic Alu/Alu recombination events. This is the first extensive study characterizing the influence of Alu element sequence divergence on DNA repair, which will inform predictions regarding the effect of Alu element sequence divergence on both
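    Because this record is framed entirely in terms of percent sequence divergence between two Alu elements, a short example of that quantity may help. The sketch below assumes the two elements have already been aligned to equal length with `-` as the gap character, and it ignores gapped columns; both choices are assumptions made for illustration, and the abstract does not state how divergence was computed.

```python
def percent_divergence(aligned_a, aligned_b, gap="-"):
    """Percent mismatches over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared


if __name__ == "__main__":
    print(percent_divergence("ACGTACGTAC", "ACGTACGTAA"))
```

    For a roughly 300 bp Alu element, the 0.7% divergence mentioned in the abstract corresponds to about two mismatched positions.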

  11. Alu Elements as Novel Regulators of Gene Expression in Type 1 Diabetes Susceptibility Genes? (United States)

    Kaur, Simranjeet; Pociot, Flemming


    Despite numerous studies implicating Alu repeat elements in various diseases, there is sparse information available with respect to the potential functional and biological roles of the repeat elements in Type 1 diabetes (T1D). Therefore, we performed a genome-wide sequence analysis of T1D candidate genes to identify embedded Alu elements within these genes. We observed significant enrichment of Alu elements within the T1D genes (p-value genes harboring Alus revealed significant enrichment for immune-mediated processes (p-value genes harboring inverted Alus (IRAlus) within their 3' untranslated regions (UTRs) that are known to regulate the expression of host mRNAs by generating double stranded RNA duplexes. Our in silico analysis predicted the formation of duplex structures by IRAlus within the 3'UTRs of T1D genes. We propose that IRAlus might be involved in regulating the expression levels of the host T1D genes.

  12. Alu Mobile Elements: From Junk DNA to Genomic Gems

    Directory of Open Access Journals (Sweden)

    Sami Dridi


    Alus, the short interspersed repeated sequences (SINEs), are retrotransposons that litter the human genome and have long been considered junk DNA. However, recent findings that these mobile elements are transcribed, both as distinct RNA polymerase III transcripts and as a part of RNA polymerase II transcripts, suggest biological functions and refute the notion that Alus are biologically unimportant. Indeed, Alu RNAs have been shown to control mRNA processing at several levels, to have complex regulatory functions such as transcriptional repression and modulating alternative splicing, and to cause a host of human genetic diseases. Alu RNAs embedded in Pol II transcripts can promote evolution and proteome diversity, which further indicates that these mobile retroelements are in fact genomic gems rather than genomic junk.

  13. Alu Sb2 subfamily is present in all higher primates but was most successfully amplified in humans

    Energy Technology Data Exchange (ETDEWEB)

    Richer, C.; Zietkiewicz, E.; Labuda, D. [Universite de Montreal, Que (Canada)]


    Alu repeats can be classified into subfamilies which amplified in primate genomes at different evolutionary time periods. A young Alu subfamily, Sb2, with a characteristic 7-nucleotide duplication at position 256, has been described in seven human loci. An Sb2 insertion found near the HD gene was unique to two HD families, indicating that Sb2 was still retropositionally active. Here, we have shown that the Sb2 insertion in the CHOL locus was similarly rare, being absent in 120 individuals of Caucasian, Oriental and Black origin. In contrast, Sb2 inserts in five other loci were found fixed (non-polymorphic), based on measurements in the same population sample, but absent from orthologous positions in higher apes. This suggests that Sb2 repeats spread relatively early in the human lineage following divergence from other primates and that these elements may be human-specific. By quantitative PCR, we investigated the presence of Sb2 sequences in different primate DNA, using one PCR primer anchored at the 5' Alu-end and the other complementary to the duplicated Sb2-specific segment. With an Sb2-containing plasmid as a standard, we estimated the number of Sb2 repeats at 1500-1800 copies per human haploid equivalent; corresponding numbers in chimpanzee and gorilla were almost two orders of magnitude lower, while the signal observed in orangutan and gibbon DNAs was consistent with the presence of a single copy. The analysis of 22 human, 11 chimpanzee and 10 gorilla sequences indicates that the Alu Sb2 dispersed independently in these three primate lineages; the gorilla consensus differs from the human Sb2 sequence by one position, while all chimpanzee repeats have their linker expanded by up to eight A-residues. Should they thus be considered as separate subfamilies? It is possible that sequence modifications with respect to the human consensus are responsible for poor retroposition of Sb2 in apes.

  14. Dynamic Alu Methylation during Normal Development, Aging, and Tumorigenesis

    Directory of Open Access Journals (Sweden)

    Yanting Luo


    DNA methylation primarily occurs on CpG dinucleotides and plays an important role in transcriptional regulation during tissue development and cell differentiation. Over 25% of CpG dinucleotides in the human genome reside within Alu elements, the most abundant human repeats. The methylation of Alu elements is an important mechanism to suppress Alu transcription and subsequent retrotransposition. Decades of studies revealed that Alu methylation is highly dynamic during early development and aging. Recently, many environmental factors were shown to have a great impact on Alu methylation. In addition, aberrant Alu methylation has been documented to be an early event in many tumors and Alu methylation levels have been associated with tumor aggressiveness. The assessment of Alu methylation has become an important approach for early diagnosis and/or prognosis of cancer. This review focuses on the dynamic Alu methylation during development, aging, and tumorigenesis. The cause and consequence of Alu methylation changes will be discussed.

  15. Methylation status of individual CpG sites within Alu elements in the human genome and Alu hypomethylation in gastric carcinomas

    International Nuclear Information System (INIS)

    Xiang, Shengyan; Liu, Zhaojun; Zhang, Baozhen; Zhou, Jing; Zhu, Bu-Dong; Ji, Jiafu; Deng, Dajun


    Alu methylation is correlated with the overall level of DNA methylation and recombination activity of the genome. However, the maintenance and methylation status of each CpG site within Alu elements (Alu) have not been well characterized. This information is useful for understanding the natural status of Alu in the genome and helpful for developing an optimal assay to quantify Alu hypomethylation. Bisulfite clone sequencing was carried out in 14 human gastric samples initially. A Cac8I COBRA-DHPLC assay was developed to detect the methylated-Alu proportion in cell lines and 48 paired gastric carcinomas and 55 gastritis samples. DHPLC data were statistically interpreted using SPSS version 16.0. From the results of 427 Alu bisulfite clone sequences, we found that only 27.2% of CpG sites within Alu elements were preserved (4.6 of 17 analyzed CpGs, A ~ Q) and that 86.6% of remaining CpGs were methylated. Deamination was the main reason for low preservation of methylation targets. A high correlation coefficient of methylation was observed between Alu clones and CpG site J (0.963), A (0.950), H (0.946), D (0.945). Comethylation of the sites H and J was used as an indicator of the proportion of methylated Alu in a Cac8I COBRA-DHPLC assay. Validation studies showed that hypermethylation or hypomethylation of Alu elements in human cell lines could be detected sensitively by the assay after treatment with 5-aza-dC and M.SssI, respectively. The proportion of methylated-Alu copies in gastric carcinomas (3.01%) was significantly lower than that in the corresponding normal samples (3.19%) and gastritis biopsies (3.23%). Most Alu CpG sites are deaminated in the genome. 27% of Alu CpG sites are represented in our amplification products. 87% of the remaining CpG sites are methylated. Alu hypomethylation in primary gastric carcinomas could be detected quantitatively with the Cac8I COBRA-DHPLC assay.
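    The two headline percentages in this record (CpG sites preserved, and methylation among the preserved sites) are simple ratios over per-clone site calls. The sketch below shows that arithmetic under an assumed, hypothetical encoding in which each clone is a string with one character per analysed CpG position: `M` for methylated, `U` for unmethylated and `L` for a site lost to deamination. How the study actually made those calls from bisulfite reads is not described in the abstract.

```python
def summarize_cpg_calls(clones):
    """Return (fraction of sites preserved, fraction of preserved sites methylated)."""
    total = preserved = methylated = 0
    for clone in clones:
        for call in clone:
            if call not in ("M", "U", "L"):
                raise ValueError(f"unexpected call {call!r}")
            total += 1
            if call != "L":
                preserved += 1
                if call == "M":
                    methylated += 1
    if total == 0:
        raise ValueError("no CpG calls supplied")
    preserved_fraction = preserved / total
    # The methylated fraction is undefined when every site has been lost.
    methylated_fraction = methylated / preserved if preserved else None
    return preserved_fraction, methylated_fraction


if __name__ == "__main__":
    toy_clones = ["MLLU", "LLMM"]  # two made-up clones, four CpG positions each
    print(summarize_cpg_calls(toy_clones))
```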

  16. Discovery of rare, diagnostic AluYb8/9 elements in diverse human populations. (United States)

    Feusier, Julie; Witherspoon, David J; Scott Watkins, W; Goubert, Clément; Sasani, Thomas A; Jorde, Lynn B


    Polymorphic human Alu elements are excellent tools for assessing population structure, and new retrotransposition events can contribute to disease. Next-generation sequencing has greatly increased the potential to discover Alu elements in human populations, and various sequencing and bioinformatics methods have been designed to tackle the problem of detecting these highly repetitive elements. However, current techniques for Alu discovery may miss rare, polymorphic Alu elements. Combining multiple discovery approaches may provide a better profile of the polymorphic Alu mobilome. Alu Yb8/9 elements have been a focus of our recent studies as they are young subfamilies (~2.3 million years old) that contribute ~30% of recent polymorphic Alu retrotransposition events. Here, we update our ME-Scan methods for detecting Alu elements and apply these methods to discover new insertions in a large set of individuals with diverse ancestral backgrounds. We identified 5,288 putative Alu insertion events, including several hundred novel Alu Yb8/9 elements from 213 individuals from 18 diverse human populations. Hundreds of these loci were specific to continental populations, and 23 non-reference population-specific loci were validated by PCR. We provide high-quality sequence information for 68 rare Alu Yb8/9 elements, of which 11 have hallmarks of an active source element. Our subfamily distribution of rare Alu Yb8/9 elements is consistent with previous datasets, and may be representative of rare loci. We also find that while ME-Scan and low-coverage, whole-genome sequencing (WGS) detect different Alu elements in 41 1000 Genomes individuals, the two methods yield similar population structure results. Current in-silico methods for Alu discovery may miss rare, polymorphic Alu elements. Therefore, using multiple techniques can provide a more accurate profile of Alu elements in individuals and populations. We improved our false-negative rate as an indicator of sample quality for future

  17. Genome-wide analysis of the human Alu Yb-lineage

    Directory of Open Access Journals (Sweden)

    Carter Anthony B


    The Alu Yb-lineage is a 'young' primarily human-specific group of short interspersed element (SINE) subfamilies that have integrated throughout the human genome. In this study, we have computationally screened the draft sequence of the human genome for Alu Yb-lineage subfamily members present on autosomal chromosomes. A total of 1,733 Yb Alu subfamily members have integrated into human autosomes. The average ages of Yb-lineage subfamilies, Yb7, Yb8 and Yb9, are estimated as 4.81, 2.39 and 2.32 million years, respectively. In order to determine the contribution of the Alu Yb-lineage to human genomic diversity, 1,202 loci were analysed using polymerase chain reaction (PCR)-based assays, which amplify the genomic regions containing individual Yb-lineage subfamily members. Approximately 20 per cent of the Yb-lineage Alu elements are polymorphic for insertion presence/absence in the human genome. Fewer than 0.5 per cent of the Yb loci also demonstrate insertions at orthologous positions in non-human primate genomes. Genomic sequencing of these unusual loci demonstrates that each of the orthologous loci from non-human primate genomes contains older Y, Sg and Sx Alu family members that have been altered, through various mechanisms, into Yb8 sequences. These data suggest that Alu Yb-lineage subfamily members are largely restricted to the human genome. The high copy number, level of insertion polymorphism and estimated age indicate that members of the Alu Yb elements will be useful in a wide range of genetic analyses.

  18. Rescuing Alu: recovery of new inserts shows LINE-1 preserves Alu activity through A-tail expansion.

    Directory of Open Access Journals (Sweden)

    Bradley J Wagstaff

    Alu elements are trans-mobilized by the autonomous non-LTR retroelement, LINE-1 (L1). Alu-induced insertion mutagenesis contributes to about 0.1% of human genetic disease and is responsible for the majority of the documented instances of human retroelement insertion-induced disease. Here we introduce a SINE recovery method that provides a complementary approach for comprehensive analysis of the impact and biological mechanisms of Alu retrotransposition. Using this approach, we recovered 226 de novo tagged Alu inserts in HeLa cells. Our analysis reveals that in human cells marked Alu inserts driven by either exogenously supplied full length L1 or ORF2 protein are indistinguishable. Four percent of de novo Alu inserts were associated with genomic deletions and rearrangements and lacked the hallmarks of retrotransposition. In contrast to L1 inserts, 5' truncations of Alu inserts are rare, as most of the recovered inserts (96.5%) are full length. De novo Alus show a random pattern of insertion across chromosomes, but further characterization revealed that an Alu insertion bias exists favoring insertion near other SINEs and highly conserved elements, with almost 60% landing within genes. De novo Alu inserts show no evidence of RNA editing. Priming for reverse transcription rarely occurred within the first 20 bp (most 5') of the A-tail. The A-tails of recovered inserts show significant expansion, with many at least doubling in length. Sequence manipulation of the construct led to the demonstration that the A-tail expansion likely occurs during insertion due to slippage by the L1 ORF2 protein. We postulate that the A-tail expansion directly impacts Alu evolution by reintroducing new active source elements to counteract the natural loss of active Alus and minimizing Alu extinction.

  19. Modeling the amplification dynamics of human Alu retrotransposons.

    Directory of Open Access Journals (Sweden)

    Dale J Hedges


    Retrotransposons have had a considerable impact on the overall architecture of the human genome. Currently, there are three lineages of retrotransposons (Alu, L1, and SVA) that are believed to be actively replicating in humans. While estimates of their copy number, sequence diversity, and levels of insertion polymorphism can readily be obtained from existing genomic sequence data and population sampling, a detailed understanding of the temporal pattern of retrotransposon amplification remains elusive. Here we pose the question of whether, using genomic sequence and population frequency data from extant taxa, one can adequately reconstruct historical amplification patterns. To this end, we developed a computer simulation that incorporates several known aspects of primate Alu retrotransposon biology and accommodates sampling effects resulting from the methods by which mobile elements are typically discovered and characterized. By modeling a number of amplification scenarios and comparing simulation-generated expectations to empirical data gathered from existing Alu subfamilies, we were able to statistically reject a number of amplification scenarios for individual subfamilies, including that of a rapid expansion or explosion of Alu amplification at the time of human-chimpanzee divergence.
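    To make the idea of comparing amplification scenarios concrete, the toy sketch below draws insertion ages under two made-up scenarios (a constant rate over the whole period versus a burst confined to a window) and converts ages to expected divergence with a linear clock. It is not the simulation described in this record: the uniform sampling, the parameter names and the rate value are all assumptions for illustration, and it ignores population frequency and sampling effects entirely.

```python
import random
from statistics import mean


def simulate_insertion_ages(n_copies, total_time_myr, burst_window=None, seed=0):
    """Draw insertion ages (in Myr before present) for one subfamily.

    With burst_window=None, ages are uniform over (0, total_time_myr),
    i.e. a constant amplification rate. Otherwise burst_window is a
    (youngest, oldest) pair and all insertions fall inside it.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    low, high = (0.0, total_time_myr) if burst_window is None else burst_window
    if not 0.0 <= low <= high <= total_time_myr:
        raise ValueError("burst window must lie within the simulated period")
    return sorted(rng.uniform(low, high) for _ in range(n_copies))


def expected_divergence(ages_myr, rate_per_site_per_myr):
    """Expected fraction of diverged sites per copy under a linear clock."""
    return [age * rate_per_site_per_myr for age in ages_myr]


if __name__ == "__main__":
    constant = simulate_insertion_ages(100, 6.0, seed=1)
    burst = simulate_insertion_ages(100, 6.0, burst_window=(5.0, 6.0), seed=1)
    hypothetical_rate = 0.0015  # illustrative value only
    print(mean(expected_divergence(constant, hypothetical_rate)))
    print(mean(expected_divergence(burst, hypothetical_rate)))
```

    In a fuller treatment, the simulated divergence distributions would be compared against the observed distribution for a subfamily with a statistical test, which is the step that allows a scenario to be rejected.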

  1. Cloning the human lysozyme cDNA: Inverted Alu repeat in the mRNA and in situ hybridization for macrophages and Paneth cells

    International Nuclear Information System (INIS)

    Chung, L.P.; Keshav, S.; Gordon, S.


    Lysozyme is a major secretory product of human and rodent macrophages and a useful marker for myelomonocytic cells. Based on the known human lysozyme amino acid sequence, oligonucleotides were synthesized and used as probes to screen a phorbol 12-myristate 13-acetate-treated U937 cDNA library. A full-length human lysozyme cDNA clone, pHL-2, was obtained and characterized. Sequence analysis shows that human lysozyme, like chicken lysozyme, has an 18-amino-acid-long signal peptide, but unlike the chicken lysozyme cDNA, the human lysozyme cDNA has a >1-kilobase-long 3' nontranslated sequence. Interestingly, within this 3' region, an inverted repeat of the Alu family of repetitive sequences was discovered. In RNA blot analyses, DNA probes prepared from pHL-2 can be used to detect lysozyme mRNA not only from human but also from mouse and rat. Moreover, by in situ hybridization, complementary RNA transcripts have been used as probes to detect lysozyme mRNA in mouse macrophages and Paneth cells. This human lysozyme cDNA clone is therefore likely to be a useful molecular probe for studying macrophage distribution and gene expression.

  2. Orangutan Alu quiescence reveals possible source element: support for ancient backseat drivers

    Directory of Open Access Journals (Sweden)

    Walker Jerilyn A


    Background: Sequence analysis of the orangutan genome revealed that recent proliferative activity of Alu elements has been uncharacteristically quiescent in the Pongo (orangutan) lineage, compared with all previously studied primate genomes. With relatively few young polymorphic insertions, the genomic landscape of the orangutan seemed like the ideal place to search for a driver, or source element, of Alu retrotransposition. Results: Here we report the identification of a nearly pristine insertion possessing all the known putative hallmarks of a retrotranspositionally competent Alu element. It is located in an intronic sequence of the DGKB gene on chromosome 7 and is highly conserved in Hominidae (the great apes), but absent from Hylobatidae (gibbon and siamang). We provide evidence for the evolution of a lineage-specific subfamily of this shared Alu insertion in orangutans and possibly the lineage leading to humans. In the orangutan genome, this insertion contains three orangutan-specific diagnostic mutations which are characteristic of the youngest polymorphic Alu subfamily, AluYe5b5_Pongo. In the Homininae lineage (human, chimpanzee and gorilla), this insertion has acquired three different mutations which are also found in a single human-specific Alu insertion. Conclusions: This seemingly stealth-like amplification, ongoing at a very low rate over millions of years of evolution, suggests that this shared insertion may represent an ancient backseat driver of Alu element expansion.

  3. The mobile genetic element Alu in the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Novick, G.E. [Florida International Univ., Miami, FL (United States)]; Batzer, M.A.; Deininger, P.L. [Louisiana State Univ. Medical Center, New Orleans, LA (United States)] [and others]


    Genetic material has traditionally been envisioned as relatively static, with the exception of occasional, often deleterious mutations. The sequence DNA-to-RNA-to-protein represented for many years the central dogma relating gene structure and function. Recently, the field of molecular genetics has provided revolutionary information on the dynamic role of repetitive elements in the function of the genetic material and the evolution of humans and other organisms. Alu sequences represent the largest family of short interspersed repetitive elements (SINEs) in humans, being present in excess of 500,000 copies per haploid genome. Alu elements, as well as the other repetitive elements, were once considered to be useless. Today, the biology of Alu transposable elements is being widely examined in order to determine the molecular basis of a growing number of identified diseases and to provide new directions in genome mapping and biomedical research. 66 refs., 5 figs.

  4. In situ hybridization of bat chromosomes with human (TTAGGG)n probe, after previous digestion with Alu I

    Directory of Open Access Journals (Sweden)

    Karina de Cassia Faria


    The purpose of this work was to verify the ability of the enzyme Alu I to cleave and/or remove satellite DNA sequences from heterochromatic regions in chromosomes of bats, by identifying the occurrence of modifications in the pattern of fluorescence in situ hybridization with telomeric DNA. The localization and fluorescence intensity of the telomeric DNA sites of the Alu-digested and undigested chromosomes of species Eumops glaucinus, Carollia perspicillata, and Platyrrhinus lineatus were analyzed. Telomeric sequences were detected at the termini of chromosomes of all three species, although, in C. perspicillata, the signals were very faint or absent in most chromosomes. This finding was interpreted as being due to a reduced number of copies of the telomeric repeat, resulting from extensive telomeric association and/or rearrangements undergone by the chromosomes of Carollia. Fluorescent signals were also observed in centromeric and pericentromeric regions in several two-arm chromosomes of E. glaucinus and C. perspicillata. In E. glaucinus and P. lineatus, some interstitial and terminal telomeric sites were observed to be in association with regions of constitutive heterochromatin and ribosomal DNA (NORs). After digestion, these telomeric sites showed a significant decrease in signal intensity, indicating that enzyme Alu I cleaves and/or removes part of the satellite DNA present in these regions. These results suggest that the telomeric sequence is a component of the heterochromatin, and that the C-band-positive regions of bat chromosomes have a different DNA composition.

  5. A SINE in the genome of the cephalochordate amphioxus is an Alu element (United States)

    Holland, Linda Z.


    Transposable elements of about 300 bp, termed “short interspersed nucleotide elements” or SINEs, are common in eukaryotes. However, Alu elements, SINEs containing restriction sites for the AluI enzyme, have been known only from primates. Here I report the first SINE found in the genome of the cephalochordate, amphioxus. It is an Alu element of 375 bp that does not share substantial identity with any genomic sequences in vertebrates. It was identified because it was located in the FoxD regulatory region in a cosmid derived from one individual, but absent from the two FoxD alleles of BACs from a second individual. However, searches of sequences of BACs and genomic traces from this second individual gave an estimate of 50-100 copies in the amphioxus genome. The finding of an Alu element in amphioxus raises the question of whether Alu elements in amphioxus and primates arose by convergent evolution or by inheritance from a common ancestor. Genome-wide analyses of transposable elements in amphioxus and other chordates such as tunicates, agnathans and cartilaginous fishes could well provide the answer. PMID:16733535

  6. The polydeoxyadenylate tract of Alu repetitive elements is polymorphic in the human genome

    International Nuclear Information System (INIS)

    Economou, E.P.; Bergen, A.W.; Warren, A.C.; Antonarakis, S.E.


    To identify DNA polymorphisms that are abundant in the human genome and are detectable by polymerase chain reaction amplification of genomic DNA, the authors hypothesize that the polydeoxyadenylate tract of the Alu family of repetitive elements is polymorphic among human chromosomes. Analysis of the 3' ends of three specific Alu sequences showed that two of them, one in the adenosine deaminase gene and the other in the β-globin pseudogene, were polymorphic. This novel class of polymorphism, termed AluVpA [Alu variable poly(A)], may represent one of the most useful and informative groups of DNA markers in the human genome.

  7. DHX9 suppresses RNA processing defects originating from the Alu invasion of the human genome. (United States)

    Aktaş, Tuğçe; Avşar Ilık, İbrahim; Maticzka, Daniel; Bhardwaj, Vivek; Pessoa Rodrigues, Cecilia; Mittler, Gerhard; Manke, Thomas; Backofen, Rolf; Akhtar, Asifa


    Transposable elements are viewed as 'selfish genetic elements', yet they contribute to gene regulation and genome evolution in diverse ways. More than half of the human genome consists of transposable elements. Alu elements belong to the short interspersed nuclear element (SINE) family of repetitive elements, and with over 1 million insertions they make up more than 10% of the human genome. Despite their abundance and the potential evolutionary advantages they confer, Alu elements can be mutagenic to the host as they can act as splice acceptors, inhibit translation of mRNAs and cause genomic instability. Alu elements are the main targets of the RNA-editing enzyme ADAR and the formation of Alu exons is suppressed by the nuclear ribonucleoprotein HNRNPC, but the broad effect of massive secondary structures formed by inverted-repeat Alu elements on RNA processing in the nucleus remains unknown. Here we show that DHX9, an abundant nuclear RNA helicase, binds specifically to inverted-repeat Alu elements that are transcribed as parts of genes. Loss of DHX9 leads to an increase in the number of circular-RNA-producing genes and amount of circular RNAs, translational repression of reporters containing inverted-repeat Alu elements, and transcriptional rewiring (the creation of mostly nonsensical novel connections between exons) of susceptible loci. Biochemical purifications of DHX9 identify the interferon-inducible isoform of ADAR (p150), but not the constitutively expressed ADAR isoform (p110), as an RNA-independent interaction partner. Co-depletion of ADAR and DHX9 augments the double-stranded RNA accumulation defects, leading to increased circular RNA production, revealing a functional link between these two enzymes. Our work uncovers an evolutionarily conserved function of DHX9. We propose that it acts as a nuclear RNA resolvase that neutralizes the immediate threat posed by transposon insertions and allows these elements to evolve as tools for the post

  8. Roles of repetitive sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bell, G.I.


    The DNA of higher eukaryotes contains many repetitive sequences. The study of repetitive sequences is important, not only because many have important biological functions, but also because they provide information on genome organization, evolution and dynamics. In this paper, I will first discuss some generic effects that repetitive sequences will have upon genome dynamics and evolution. In particular, it will be shown that repetitive sequences foster recombination among, and turnover of, the elements of a genome. I will then consider some examples of repetitive sequences, notably minisatellite sequences and telomere sequences as examples of tandem repeats (without and with known function, respectively), and Alu sequences as an example of interspersed repeats. Some other examples will also be considered in less detail.

  9. Higher Alu methylation levels in catch-up growth in twenty-year-old offsprings.

    Directory of Open Access Journals (Sweden)

    Kittipan Rerkasem

    Alu elements and long interspersed element-1 (LINE-1 or L1) are two major human interspersed repetitive sequences. Lower Alu methylation, but not LINE-1 methylation, has been observed in blood cells of people in old age, and in menopausal women having lower bone mass and osteoporosis. Nevertheless, Alu methylation levels also vary among young individuals. Here, we explored phenotypes at birth that are associated with Alu methylation levels in young people. In 2010, 249 twenty-year-old volunteers whose mothers had participated in a 1990 study of the association between birth weight (BW) and nutrition during pregnancy were invited to take part in our present study. In this study, the LINE-1 and Alu methylation levels and patterns were measured in peripheral mononuclear cells and correlated with various nutritional parameters during the intrauterine and postnatal periods of the offspring. These included the amount of maternal intake during pregnancy, the mother's weight gain during pregnancy, birth weight, birth length, and the rate of weight gain in the first year of life. Catch-up growth (CUG) was defined when weight during the first year was >0.67 of the standard score, according to WHO data. No association with LINE-1 methylation was identified. The mean level of Alu methylation in the CUG group was significantly higher than that in the non-CUG group (39.61% and 33.66%, respectively; P < 0.0001). The positive correlation between a history of CUG in the first year and higher Alu methylation indicates a role for Alu methylation not only in aging cells, but also in the human growth process. Moreover, this is the first study to demonstrate an association between a phenotype during the newborn period and methylation of interspersed repetitive sequences during young adulthood.

  10. The Role of the Y-Chromosome in the Establishment of Murine Hybrid Dysgenesis and in the Analysis of the Nucleotide Sequence Organization, Genetic Transmission and Evolution of Repeated Sequences. (United States)

    Nallaseth, Ferez Soli

    The Y-chromosome presents a unique cytogenetic framework for the evolution of nucleotide sequences. Alignment of nine Y-chromosomal fragments in their increasing Y-specific/non Y-specific (male/female) sequence divergence ratios was directly and inversely related to their interspersion on these two respective genomic fractions. Sequence analysis confirmed a direct relationship between divergence ratios and the Alu, LINE-1, Satellite and their derivative oligonucleotide contents. Thus their relocation on the Y-chromosome is followed by sequence divergence rather than the well documented concerted evolution of these non-coding progenitor repeated sequences. Five of the nine Y-chromosomal fragments are non-pseudoautosomal and transcribed into heterogeneous PolyA+ RNA and thus can be retrotransposed. Evolutionary and computer analysis identified homologous oligonucleotide tracts in several human loci suggesting common and random mechanistic origins. Dysgenic genomes represent the accelerated evolution driving sequence divergence (McClintock, 1984). Sex reversal and sterility characterizing dysgenesis occurs in C57BL/6JY^Pos but not in 129/SvY^Pos derivative strains. High frequency, random, multi-locus deletion products of the feral Y^Pos-chromosome are generated in the germlines of F1(C57BL/6J X 129/SvY^Pos)(male) and C57BL/6JY^Pos(male) but not in 129/SvY^Pos(male). Equal, 10^-1, 10^-2, and 0 copies (relative to males) of Y^Pos-specific deletion products respectively characterize C57BL/6JY^Pos (HC), (LC), (T) and (F) females. The testes determining loci of inactive Y^Pos-chromosomes in C57BL/6JY^Pos HC females are the preferentially deleted/rearranged Y^Pos-sequences. Disruption of regulation of plasma testosterone and hepatic MUP-A mRNA levels, TRD of a 4.7 Kbp EcoR1 fragment suggest disruption of autosomal/X-chromosomal sequences. These data and the highly repeated progenitor (Alu, GATA, LINE-1

  11. Novel expressed sequence tag- simple sequence repeats (EST ...

    African Journals Online (AJOL)

    Using different bioinformatic criteria, the SUCEST database was used to mine for simple sequence repeat (SSR) markers. Among 42,189 clusters, 1,425 expressed sequence tag-simple sequence repeats (EST-SSRs) were identified in silico. Trinucleotide repeats were the most abundant SSRs detected. Of 212 primer pairs ...

  12. Germline Chromothripsis Driven by L1-Mediated Retrotransposition and Alu/Alu Homologous Recombination

    DEFF Research Database (Denmark)

    Nazaryan-Petersen, Lusine; Bertelsen, Birgitte; Bak, Mads


    Chromothripsis (CTH) is a phenomenon where multiple localized double-stranded DNA breaks result in complex genomic rearrangements. Although the DNA-repair mechanisms involved in CTH have been described, the mechanisms driving the localized "shattering" process remain unclear. High-throughput sequence analysis of a familial germline CTH revealed an inserted SVAE retrotransposon associated with a 110-kb deletion displaying hallmarks of L1-mediated retrotransposition. Our analysis suggests that the SVAE insertion did not occur prior to or after, but concurrent with the CTH event. We also observed L1-endonuclease potential target sites in other breakpoints. In addition, we found four Alu elements flanking the 110-kb deletion and associated with an inversion. We suggest that chromatin looping mediated by homologous Alu elements may have brought distal DNA regions into close proximity...

  13. An alu-based phylogeny of lemurs (infraorder: Lemuriformes).

    Directory of Open Access Journals (Sweden)

    Adam T McLain

    Lemurs (infraorder: Lemuriformes) are a radiation of strepsirrhine primates endemic to the island of Madagascar. As of 2012, 101 lemur species, divided among five families, have been described. Genetic and morphological evidence indicates all species are descended from a common ancestor that arrived in Madagascar ∼55-60 million years ago (mya). Phylogenetic relationships in this species-rich infraorder have been the subject of debate. Here we use Alu elements, a family of primate-specific Short INterspersed Elements (SINEs), to construct a phylogeny of infraorder Lemuriformes. Alu elements are particularly useful SINEs for the purpose of phylogeny reconstruction because they are identical by descent and confounding events between loci are easily resolved by sequencing. The genome of the grey mouse lemur (Microcebus murinus) was computationally assayed for synapomorphic Alu elements. Those that were identified as Lemuriformes-specific were analyzed against other available primate genomes for orthologous sequence in which to design primers for PCR (polymerase chain reaction) verification. A primate phylogenetic panel of 24 species, including 22 lemur species from all five families, was examined for the presence/absence of 138 Alu elements via PCR to establish relationships among species. Of these, 111 were phylogenetically informative. A phylogenetic tree was generated based on the results of this analysis. We demonstrate strong support for the monophyly of Lemuriformes to the exclusion of other primates, with Daubentoniidae, the aye-aye, as the basal lineage within the infraorder. Our results also suggest Lepilemuridae as a sister lineage to Cheirogaleidae, and Indriidae as sister to Lemuridae. Among the Cheirogaleidae, we show strong support for Microcebus and Mirza as sister genera, with Cheirogaleus the sister lineage to both. Our results also support the monophyly of the Lemuridae. Within Lemuridae we place Lemur and Hapalemur together to the

  14. Repeated DNA sequences in fungi

    Energy Technology Data Exchange (ETDEWEB)

    Dutta, S K


    Several fungal species, representatives of all broad groups like basidiomycetes, ascomycetes and phycomycetes, were examined for the nature of repeated DNA sequences by DNA:DNA reassociation studies using hydroxyapatite chromatography. All of the fungal species tested contained 10 to 20 percent repeated DNA sequences. There are approximately 100 to 110 copies of repeated DNA sequences, each with a piece size of approximately 4 x 10^7 daltons. Repeated DNA sequence homoduplexes showed on average a 5 °C difference in T_e50 (temperature at which 50 percent of duplexes dissociate) values from the corresponding homoduplexes of unfractionated whole DNA. It is suggested that a part of the repetitive sequences in fungi constitutes mitochondrial DNA and a part constitutes nuclear DNA. (auth)

  15. Insertion and deletion polymorphisms of the ancient AluS family in the human genome. (United States)

    Kryatova, Maria S; Steranka, Jared P; Burns, Kathleen H; Payer, Lindsay M


    Polymorphic Alu elements account for 17% of structural variants in the human genome. The majority of these belong to the youngest AluY subfamilies, and most structural variant discovery efforts have focused on identifying Alu polymorphisms from these currently retrotranspositionally active subfamilies. In this report we analyze polymorphisms from the evolutionarily older AluS subfamily, whose peak activity was tens of millions of years ago. We annotate the AluS polymorphisms, assess their likely mechanism of origin, and evaluate their contribution to structural variation in the human genome. Of 52 previously reported polymorphic AluS elements ascertained for this study, 48 were confirmed to belong to the AluS subfamily using high stringency subfamily classification criteria. Of these, the majority (77%, 37/48) appear to be deletion polymorphisms. Two polymorphic AluS elements (4%) have features of non-classical Alu insertions and one polymorphic AluS element (2%) likely inserted by a mechanism involving internal priming. Seven AluS polymorphisms (15%) appear to have arisen by the classical target-primed reverse transcription (TPRT) retrotransposition mechanism. These seven TPRT products are 3' intact with 3' poly-A tails, and are flanked by target site duplications; L1 ORF2p endonuclease cleavage sites were also observed, providing additional evidence that these are L1 ORF2p endonuclease-mediated TPRT insertions. Further sequence analysis showed strong conservation of both the RNA polymerase III promoter and SRP9/14 binding sites, important for mediating transcription and interaction with retrotransposition machinery, respectively. This conservation of functional features implies that some of these are fairly recent insertions since they have not diverged significantly from their respective retrotranspositionally competent source elements. Of the polymorphic AluS elements evaluated in this report, 15% (7/48) have features consistent with TPRT-mediated insertion
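
    The proportions quoted in the abstract above can be rechecked with a few lines of Python. This is only a reader's sketch, not part of the original report: the category labels and the function name `summarize` are hypothetical shorthand, and the counts are copied from the text.

```python
# Counts quoted in the abstract; the labels are shorthand chosen for this sketch.
COUNTS = {
    "deletion": 37,
    "non-classical insertion": 2,
    "internal priming": 1,
    "TPRT insertion": 7,
}
CONFIRMED_ALUS = 48  # elements confirmed as AluS in the abstract


def summarize(counts, total):
    """Return ({category: rounded percent}, number of elements not categorized)."""
    percentages = {name: round(100 * n / total) for name, n in counts.items()}
    unassigned = total - sum(counts.values())
    return percentages, unassigned


if __name__ == "__main__":
    pct, unassigned = summarize(COUNTS, CONFIRMED_ALUS)
    for name, value in pct.items():
        print(f"{name}: {value}%")
    print("not categorized in the quoted text:", unassigned)
```

    The rounded values agree with the 77%, 4%, 2% and 15% given in the abstract. The four quoted categories add up to 47 of the 48 confirmed elements, so one element is not placed in a category by the text shown here.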

  16. ALUminating the Path of Atherosclerosis Progression: Chaos Theory Suggests a Role for Alu Repeats in the Development of Atherosclerotic Vascular Disease. (United States)

    Hueso, Miguel; Cruzado, Josep M; Torras, Joan; Navarro, Estanislao


    Atherosclerosis (ATH) and coronary artery disease (CAD) are chronic inflammatory diseases with an important genetic background; they derive from the cumulative effect of multiple common risk alleles, most of which are located in genomic noncoding regions. These complex diseases behave as nonlinear dynamical systems that show a high dependence on their initial conditions; thus, long-term predictions of disease progression are unreliable. One likely possibility is that the nonlinear nature of ATH could be dependent on nonlinear correlations in the structure of the human genome. In this review, we show how chaos theory analysis has highlighted genomic regions that have shared specific structural constraints, which could have a role in ATH progression. These regions were shown to be enriched with repetitive sequences of the Alu family, genomic parasites that have colonized the human genome, which show a particular secondary structure and are involved in the regulation of gene expression. Here, we show the impact of Alu elements on the mechanisms that regulate gene expression, especially highlighting the molecular mechanisms via which the Alu elements alter the inflammatory response. We devote special attention to their relationship with the long noncoding RNA (lncRNA) antisense noncoding RNA in the INK4 locus (ANRIL), a risk factor for ATH; their role as microRNA (miRNA) sponges; and their ability to interfere with the regulatory circuitry of the nuclear factor kappa B (NF-κB) response. We aim to characterize ATH as a nonlinear dynamic system, in which small initial alterations in the expression of a number of repetitive elements are somehow amplified to reach phenotypic significance.

  17. Improving the ALUeS diagnostic system for determining the coolant leak place from the WWER-440 primary circuit

    International Nuclear Information System (INIS)

    Markosyan, G.R.; Petrosyan, V.G.; Shakhverdyan, S.V.; Aslanyan, M.A.


    A new algorithm for localizing leaks from the WWER-440 primary circuit, intended for operation in the Siemens ALUeS system, is proposed. The results of implementing the algorithm in the leakage control system (a copy of the ALUeS system) installed at power unit 2 of the Armenian NPP are presented. The proposed leak localization algorithm was also tested in other experiments. In the majority of cases the leak position is determined exactly. Small deviations (up to 5 m), caused by incorrect transducer readings, were observed [ru]

  18. Alu element insertion in PKLR gene as a novel cause of pyruvate kinase deficiency in Middle Eastern patients. (United States)

    Lesmana, Harry; Dyer, Lisa; Li, Xia; Denton, James; Griffiths, Jenna; Chonat, Satheesh; Seu, Katie G; Heeney, Matthew M; Zhang, Kejian; Hopkin, Robert J; Kalfa, Theodosia A


    Pyruvate kinase deficiency (PKD) is the most frequent red blood cell enzyme abnormality of the glycolytic pathway and the most common cause of hereditary nonspherocytic hemolytic anemia. Over 250 PKLR-gene mutations have been described, including missense/nonsense, splicing and regulatory mutations, small insertions, and small and gross deletions, causing PKD and hemolytic anemia of variable severity. Alu retrotransposons are the most abundant mobile DNA sequences in the human genome, contributing to almost 11% of its mass. Alu insertions have been associated with a number of human diseases by disrupting either a coding region or a splice signal. Here, we report on two unrelated Middle Eastern patients, both born from consanguineous parents, with transfusion-dependent hemolytic anemia, where sequence analysis revealed a homozygous insertion of AluYb9 within exon 6 of the PKLR gene, causing a precipitous decrease of PKLR RNA levels. This Alu element insertion constitutes a previously unrecognized mechanism underlying the pathogenesis of PKD. © 2017 Wiley Periodicals, Inc.

  19. Development of an RSFQ 4-bit ALU

    International Nuclear Information System (INIS)

    Kim, J. Y.; Baek, S. H.; Kim, S. H.; Kang, K. R.; Jung, K. R.; Lim, H. Y.; Park, J. H.; Han, T. S.


    We have developed and tested an RSFQ 4-bit Arithmetic Logic Unit (ALU) based on half adder cells and dc switches. The ALU is a core element of a computer processor that performs arithmetic and logic operations on the operands in computer instruction words. The designed ALU had a limited set of operations: OR, AND, XOR, and ADD. It had a pipeline structure. We simulated the circuit using Josephson circuit simulation tools in order to reduce timing problems, and confirmed the correct operation of the designed ALU. The simulation tools used were XIC™, WRspice™, and Julia. The fabricated 4-bit ALU circuit had a size of 3000 µm × 1500 µm, and the chip size was 5 mm × 5 mm. The test speeds were 1000 kHz and 5 GHz. For the high-speed test, we used an eye-diagram technique. Our 4-bit ALU operated correctly up to a 5 GHz clock frequency. The chip was tested at liquid-helium temperature.
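
    The four operations named in the abstract above (OR, AND, XOR, ADD) can be illustrated in software. The Python sketch below is only a behavioural model under stated assumptions, not the authors' RSFQ circuit: the names `half_adder`, `full_adder` and `alu4` are hypothetical, and pipelining and timing are ignored entirely. It builds the 4-bit ADD from half-adder steps, in the spirit of the half adder cells the abstract mentions.

```python
def half_adder(a, b):
    """Add two single bits; return (sum_bit, carry_bit)."""
    return a ^ b, a & b


def full_adder(a, b, carry_in):
    """Add two bits plus a carry using two half adders and an OR."""
    partial, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial, carry_in)
    return total, carry1 | carry2


def alu4(op, a, b):
    """Behavioural 4-bit ALU; returns (4-bit result, carry_out)."""
    if not (0 <= a <= 0xF and 0 <= b <= 0xF):
        raise ValueError("operands must fit in 4 bits")
    if op == "OR":
        return a | b, 0
    if op == "AND":
        return a & b, 0
    if op == "XOR":
        return a ^ b, 0
    if op == "ADD":
        result, carry = 0, 0
        for i in range(4):  # ripple the carry from bit 0 up to bit 3
            bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
            result |= bit << i
        return result, carry
    raise ValueError("unsupported operation: " + op)


if __name__ == "__main__":
    print(alu4("ADD", 9, 8))            # 9 + 8 = 17, which overflows 4 bits
    print(alu4("XOR", 0b1100, 0b1010))
```

    A ripple-carry adder is used here purely because it is the simplest way to compose half adders; the abstract does not say how the carry is propagated in the actual pipelined design.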

  20. A genomewide screen for suppressors of Alu-mediated rearrangements reveals a role for PIF1.

    Directory of Open Access Journals (Sweden)

    Karen M Chisholm

    Alu-mediated rearrangement of tumor suppressor genes occurs frequently during carcinogenesis. In breast cancer, this mechanism contributes to loss of the wild-type BRCA1 allele in inherited disease and to loss of heterozygosity in sporadic cancer. To identify genes required for suppression of Alu-mediated recombination we performed a genomewide screen of a collection of 4672 yeast gene deletion mutants using a direct repeat recombination assay. The primary screen and subsequent analysis identified 12 candidate genes including TSA, ELG1, and RRM3, which are known to play a significant role in maintaining genomic stability. Genetic analysis of the corresponding human homologs was performed in sporadic breast tumors and in inherited BRCA1-associated carcinomas. Sequencing of these genes in high risk breast cancer families revealed a potential role for the helicase PIF1 in cancer predisposition. PIF1 variant L319P was identified in three breast cancer families; importantly, this variant, which is predicted to be functionally damaging, was not identified in a large series of controls nor has it been reported in either dbSNP or the 1000 Genomes Project. In Schizosaccharomyces pombe, Pfh1 is required to maintain both mitochondrial and nuclear genomic integrity. Functional studies in yeast of human PIF1 L319P revealed that this variant cannot complement the essential functions of Pfh1 in either the nucleus or mitochondria. Our results provide a global view of nonessential genes involved in suppressing Alu-mediated recombination and implicate variation in PIF1 in breast cancer predisposition.

  1. Aberrant methylation and associated transcriptional mobilization of Alu elements contributes to genomic instability in hypoxia. (United States)

    Pal, Arnab; Srivastava, Tapasya; Sharma, Manish K; Mehndiratta, Mohit; Das, Prerna; Sinha, Subrata; Chattopadhyay, Parthaprasad


    Hypoxia is an integral part of tumorigenesis and contributes extensively to the neoplastic phenotype including drug resistance and genomic instability. It has also been reported that hypoxia results in global demethylation. Because a majority of the cytosine-phosphate-guanine (CpG) islands are found within the repeat elements of DNA, and are usually methylated under normoxic conditions, we suggested that retrotransposable Alu or short interspersed nuclear elements (SINEs), which show altered methylation and associated changes of gene expression during hypoxia, could be associated with genomic instability. U87MG glioblastoma cells were cultured in 0.1% O₂ for 6 weeks and compared with cells cultured in 21% O₂ for the same duration. Real-time PCR analysis showed a significant increase in SINE and reverse transcriptase coding long interspersed nuclear element (LINE) transcripts during hypoxia. Sequencing of bisulphite treated DNA as well as the Combined Bisulfite Restriction Analysis (COBRA) assay showed that the SINE loci studied underwent significant hypomethylation, though there was patchy hypermethylation at a few sites. The inter-Alu PCR profiles of DNA from cells cultured under 6-week hypoxia, a subsequent 4-week reversion to normoxia, and 6-week normoxia showed several changes in the band pattern, indicating increased Alu-mediated genomic alteration. Our results show that aberrant methylation leading to increased transcription of SINE and reverse transcriptase associated LINE elements could lead to increased genomic instability in hypoxia. This might be a cause of genetic heterogeneity in tumours, especially in a variegated hypoxic environment, and lead to the development of foci of more aggressive tumour cells. © 2009 The Authors Journal compilation © 2010 Foundation for Cellular and Molecular Medicine/Blackwell Publishing Ltd.

  2. DNA Methylation Status of the Interspersed Repetitive Sequences for LINE-1, Alu, HERV-E, and HERV-K in Trabeculectomy Specimens from Glaucoma Eyes

    Directory of Open Access Journals (Sweden)

    Sunee Chansangpetch


    Full Text Available Background/Aims. Epigenetic mechanisms via DNA methylation may be related to glaucoma pathogenesis. This study aimed to determine the global DNA methylation level of the trabeculectomy specimens among patients with different types of glaucoma and normal subjects. Methods. Trabeculectomy sections from 16 primary open-angle glaucoma (POAG), 12 primary angle-closure glaucoma (PACG), 16 secondary glaucoma patients, and 10 normal controls were assessed for DNA methylation using combined-bisulfite restriction analysis. The percentages of global methylation of the interspersed repetitive sequences for LINE-1, Alu, HERV-E, and HERV-K were compared between the 4 groups. Results. There were no significant differences in the methylation for LINE-1 and HERV-E between patients and normal controls. For the Alu marker, the methylation was significantly lower in all types of glaucoma patients compared to controls (POAG 52.19% versus control 52.83%, p=0.021; PACG 51.50% versus control, p=0.005; secondary glaucoma 51.95% versus control, p=0.014), whereas the methylation level of HERV-K was statistically higher in POAG patients compared to controls (POAG 49.22% versus control 48.09%, p=0.017). Conclusions. The trabeculectomy sections had relative DNA hypomethylation of Alu in all glaucoma subtypes and relative DNA hypermethylation of HERV-K in POAG patients. These methylation changes may lead to the fibrotic phenotype in the trabecular meshwork.

  3. Optimization of sequence alignment for simple sequence repeat regions

    Directory of Open Access Journals (Sweden)

    Ogbonnaya Francis C


    Full Text Available Abstract Background Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSRs have been used as molecular markers because they are easy to detect and serve a range of applications, including genetic diversity, genome mapping, and marker assisted selection. They are also very mutable because of slippage of the DNA polymerase during DNA replication. This unique mutation increases the insertion/deletion (INDEL) mutation frequency to a high ratio, more than in other types of molecular markers such as single nucleotide polymorphisms (SNPs). SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units, which result in false evolutionary relationships. Findings To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using PERL script with a Tk graphical interface. This program is based on aligning sequences after determining the repeated units first, and the last SSR nucleotide positions. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenetic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages was observed after applying the new algorithm. Conclusions The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic
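
    The first step this abstract describes, locating the repeated unit and the first and last nucleotide positions of each SSR, can be illustrated with a short Python sketch. It is not the authors' PERL program: the function name `find_ssrs`, the regular expression, and the default thresholds (units of 1 to 6 bases, at least 3 tandem copies) are assumptions chosen only for illustration.

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Locate tandem repeats in a DNA string.

    Returns a list of (start, end, unit, copies) tuples, with 0-based
    start and exclusive end. Thresholds are illustrative assumptions.
    """
    # Lazy unit match so the shortest repeating unit is preferred,
    # followed by at least (min_repeats - 1) further copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = (m.end() - m.start()) // len(unit)
        hits.append((m.start(), m.end(), unit, copies))
    return hits


if __name__ == "__main__":
    # A dinucleotide (AC) repeat embedded in unique flanking sequence.
    print(find_ssrs("TTGACACACACAGG"))
```

    Knowing the unit and the boundaries of each repeat run is what would let an aligner shift whole repeat units rather than single bases; that alignment step itself is not shown here.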

  4. The Alu neurodegeneration hypothesis: A primate-specific mechanism for neuronal transcription noise, mitochondrial dysfunction, and manifestation of neurodegenerative disease. (United States)

    Larsen, Peter A; Lutz, Michael W; Hunnicutt, Kelsie E; Mihovilovic, Mirta; Saunders, Ann M; Yoder, Anne D; Roses, Allen D


    It is hypothesized that retrotransposons have played a fundamental role in primate evolution and that enhanced neurologic retrotransposon activity in humans may underlie the origin of higher cognitive function. As a potential consequence of this enhanced activity, it is likely that neurons are susceptible to deleterious retrotransposon pathways that can disrupt mitochondrial function. An example is observed in the TOMM40 gene, encoding a β-barrel protein critical for mitochondrial preprotein transport. Primate-specific Alu retrotransposons have repeatedly inserted into TOMM40 introns, and at least one variant associated with late-onset Alzheimer's disease originated from an Alu insertion event. We provide evidence of enriched Alu content in mitochondrial genes and postulate that Alus can disrupt mitochondrial populations in neurons, thereby setting the stage for progressive neurologic dysfunction. This Alu neurodegeneration hypothesis is compatible with decades of research and offers a plausible mechanism for the disruption of neuronal mitochondrial homeostasis, ultimately cascading into neurodegenerative disease. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  5. Quantitative analysis of plasma cell-free DNA and its DNA integrity in patients with metastatic prostate cancer using ALU sequence

    International Nuclear Information System (INIS)

    Fawzy, A.; Sweify, K.M.; Nofal, N.; El-Fayoumy, H.M.


    Background: Prostate cancer (PC) is the most common cancer affecting men; it accounts for 29% of all male cancers and 11% of all male cancer-related deaths. DNA is normally released from an apoptotic source, which generates small fragments of cell-free DNA, whereas cancer patients have cell-free circulating DNA that originated from necrosis, autophagy, or mitotic catastrophe, which produce large fragments. Aim of work: To compare cell-free DNA (cfDNA) levels and integrity between prostate cancer patients and a control group composed of benign prostate hyperplasia (BPH) patients and healthy persons. Methodology: cfDNA levels were quantified by real-time PCR amplification in prostate cancer patients (n = 50), BPH patients (n = 25) and healthy controls (n = 30) using two ALU primer sets (product sizes of 115 bp and 247 bp), and integrity was calculated as the ratio of the qPCR results of the 247 bp ALU over the 115 bp ALU. Results: cfDNA levels and integrity differed highly significantly between PC patients and BPH patients. Twenty-eight (56%) patients with prostate cancer had bone metastasis. ALU115 qPCR was superior to the other markers in discriminating metastatic patients, with a sensitivity of 96.4%, a specificity of 86.4% and an AUC of 0.981. Conclusion: ALU115 qPCR could be used as a valuable biomarker helping to identify high-risk patients, indicating early spread of tumor cells as a potential seed for future metastases
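
    The integrity ratio and the diagnostic metrics quoted in this abstract reduce to simple arithmetic, sketched below in Python. The function names are hypothetical, and the patient counts in the usage example (27 of 28 metastatic, 19 of 22 non-metastatic) are assumptions picked only because they reproduce the reported percentages; the abstract itself does not give the underlying counts.

```python
def dna_integrity(alu247, alu115):
    """cfDNA integrity index: ratio of long (247 bp) to short (115 bp) ALU qPCR signal."""
    if alu115 <= 0:
        raise ValueError("ALU115 quantity must be positive")
    return alu247 / alu115


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical qPCR quantities (same units for both amplicons).
    print(dna_integrity(alu247=12.0, alu115=40.0))  # ratio of long to short fragments

    # Assumed counts, chosen to be consistent with the reported 96.4% / 86.4%.
    sens, spec = sensitivity_specificity(tp=27, fn=1, tn=19, fp=3)
    print(round(sens * 100, 1), round(spec * 100, 1))
```

    Because the 115 bp amplicon lies within what both short and long fragments can template, a ratio near 1 suggests mostly long fragments, while a ratio near 0 suggests mostly short, apoptosis-derived fragments.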

  6. Duplex Alu Screening for Degraded DNA of Skeletal Human Remains

    Directory of Open Access Journals (Sweden)

    Fabian Haß


    Full Text Available The human-specific Alu elements, belonging to the class of Short INterspersed Elements (SINEs), have been shown to be a powerful tool for population genetic studies. An earlier study in this department showed that it was possible to analyze Alu presence/absence in 3000-year-old skeletal human remains from the Bronze Age Lichtenstein cave in Lower Saxony, Germany. We developed duplex Alu screening PCRs with flanking primers for two Alu elements, each combined with a single internal Alu primer. By adding an internal primer, the approximately 400–500 bp presence signals of Alu elements can be detected within a range of less than 200 bp. Thus, our PCR approach is suited for highly fragmented ancient DNA samples, whereas NGS analyses frequently are unable to handle repetitive elements. With this analysis system, we examined remains of 12 individuals from the Lichtenstein cave with different degrees of DNA degradation. The duplex PCRs showed fully informative amplification results for all of the chosen Alu loci in eight of the 12 samples. Our analysis system showed that Alu presence/absence analysis is possible in samples with different degrees of DNA degradation and it reduces the amount of valuable skeletal material needed by a factor of four, as compared with a singleplex approach.

  7. RNA-Mediated Gene Duplication and Retroposons: Retrogenes, LINEs, SINEs, and Sequence Specificity (United States)


    A substantial number of “retrogenes” that are derived from the mRNA of various intron-containing genes have been reported. A class of mammalian retroposons, long interspersed element-1 (LINE1, L1), has been shown to be involved in the reverse transcription of retrogenes (or processed pseudogenes) and non-autonomous short interspersed elements (SINEs). The 3′-end sequences of various SINEs originated from a corresponding LINE. As the 3′-untranslated regions of several LINEs are essential for retroposition, these LINEs presumably require “stringent” recognition of the 3′-end sequence of the RNA template. However, the 3′-ends of mammalian L1s do not exhibit any similarity to SINEs, except for the presence of 3′-poly(A) repeats. Since the 3′-poly(A) repeats of L1 and Alu SINE are critical for their retroposition, L1 probably recognizes the poly(A) repeats, thereby mobilizing not only Alu SINE but also cytosolic mRNA. Many flowering plants only harbor L1-clade LINEs and a significant number of SINEs with poly(A) repeats, but no homology to the LINEs. Moreover, processed pseudogenes have also been found in flowering plants. I propose that the ancestral L1-clade LINE in the common ancestor of green plants may have recognized a specific RNA template, with stringent recognition then becoming relaxed during the course of plant evolution. PMID:23984183

  8. simple sequence repeat (SSR)

    African Journals Online (AJOL)

    In the present study, 78 mapped simple sequence repeat (SSR) markers representing 11 linkage groups of adzuki bean were evaluated for transferability to mungbean and related Vigna spp. 41 markers amplified characteristic bands in at least one Vigna species. The transferability percentage across the genotypes ranged ...

  9. Alu-mediated large deletion of the CDSN gene as a cause of peeling skin disease. (United States)

    Wada, T; Matsuda, Y; Muraoka, M; Toma, T; Takehara, K; Fujimoto, M; Yachie, A


    Peeling skin disease (PSD) is an autosomal recessive skin disorder caused by mutations in CDSN and is characterized by superficial peeling of the upper epidermis. Corneodesmosin (CDSN) is a major component of corneodesmosomes that plays an important role in maintaining epidermis integrity. Herein, we report a patient with PSD caused by a novel homozygous large deletion in the 6p21.3 region encompassing the CDSN gene, which abrogates CDSN expression. Several genes including C6orf15, PSORS1C1, PSORS1C2, CCHCR1, and TCF19 were also deleted, however, the patient showed only clinical features typical of PSD. The deletion size was 59.1 kb. Analysis of the sequence surrounding the breakpoint showed that both telomeric and centromeric breakpoints existed within Alu-S sequences that were oriented in opposite directions. These results suggest an Alu-mediated recombination event as the mechanism underlying the deletion in our patient. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  10. Two tandemly repeated telomere-associated sequences in Nicotiana plumbaginifolia. (United States)

    Chen, C M; Wang, C T; Wang, C J; Ho, C H; Kao, Y Y; Chen, C C


    Two tandemly repeated telomere-associated sequences, NP3R and NP4R, have been isolated from Nicotiana plumbaginifolia. The length of a repeating unit for NP3R and NP4R is 165 and 180 nucleotides respectively. The abundance of NP3R, NP4R and telomeric repeats is, respectively, 8.4 × 10^4, 6 × 10^3 and 1.5 × 10^6 copies per haploid genome of N. plumbaginifolia. Fluorescence in situ hybridization revealed that NP3R is located at the ends and/or in interstitial regions of all 10 chromosomes and NP4R on the terminal regions of three chromosomes in the haploid genome of N. plumbaginifolia. Sequence homology search revealed that not only are NP3R and NP4R homologous to HRS60 and GRS, respectively, two tandem repeats isolated from N. tabacum, but that NP3R and NP4R are also related to each other, suggesting that they originated from a common ancestral sequence. The role of these repeated sequences in chromosome healing is discussed based on the observation that two to three copies of a telomere-similar sequence were present in each repeating unit of NP3R and NP4R.

  11. A specific family of interspersed repeats (SINEs) facilitates meiotic synapsis in mammals

    Directory of Open Access Journals (Sweden)

    Johnson Matthew E


    Full Text Available Abstract Background Errors during meiosis that affect synapsis and recombination between homologous chromosomes contribute to aneuploidy and infertility in humans. Despite the clinical relevance of these defects, we know very little about the mechanisms by which homologous chromosomes interact with one another during mammalian meiotic prophase. Further, we remain ignorant of the way in which chromosomal DNA complexes with the meiosis-specific structure that tethers homologs, the synaptonemal complex (SC), and whether specific DNA elements are necessary for this interaction. Results In the present study we utilized chromatin immunoprecipitation (ChIP) and DNA sequencing to demonstrate that the axial elements of the mammalian SC are markedly enriched for a specific family of interspersed repeats, short interspersed elements (SINEs). Further, we refine the role of the repeats to specific sub-families of SINEs, B1 in mouse and AluY in old world monkey (Macaca mulatta). Conclusions Because B1 and AluY elements are the most actively retrotransposing SINEs in mice and rhesus monkeys, respectively, our observations imply that they may serve a dual function in axial element binding; i.e., as the anchoring point for the SC but possibly also as a suppressor/regulator of retrotransposition.

  12. A comparison of 100 human genes using an alu element-based instability model. (United States)

    Cook, George W; Konkel, Miriam K; Walker, Jerilyn A; Bourgeois, Matthew G; Fullerton, Mitchell L; Fussell, John T; Herbold, Heath D; Batzer, Mark A


    The human retrotransposon with the highest copy number is the Alu element. The human genome contains over one million Alu elements that collectively account for over ten percent of our DNA. Full-length Alu elements are randomly distributed throughout the genome in both forward and reverse orientations. However, full-length widely spaced Alu pairs having two Alus in the same (direct) orientation are statistically more prevalent than Alu pairs having two Alus in the opposite (inverted) orientation. The cause of this phenomenon is unknown. It has been hypothesized that this imbalance is the consequence of anomalous inverted Alu pair interactions. One proposed mechanism suggests that inverted Alu pairs can ectopically interact, exposing both ends of each Alu element making up the pair to a potential double-strand break, or "hit". This hypothesized "two-hit" (two double-strand breaks) potential per Alu element was used to develop a model for comparing the relative instabilities of human genes. The model incorporates both 1) the two-hit double-strand break potential of Alu elements and 2) the probability of exon-damaging deletions extending from these double-strand breaks. This model was used to compare the relative instabilities of 50 deletion-prone cancer genes and 50 randomly selected genes from the human genome. The output of the Alu element-based genomic instability model developed here is shown to coincide with the observed instability of deletion-prone cancer genes. The 50 cancer genes are collectively estimated to be 58% more unstable than the randomly chosen genes using this model. Seven of the deletion-prone cancer genes, ATM, BRCA1, FANCA, FANCD2, MSH2, NCOR1 and PBRM1, were among the most unstable 10% of the 100 genes analyzed. This algorithm may lay the foundation for comparing genetic risks posed by structural variations that are unique to specific individuals, families and people groups.
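
    The direct-versus-inverted pair comparison that motivates this model can be illustrated by a small counting routine. The Python sketch below is only a tally of orientation classes; it is not the authors' instability model, and the function name, the `(position, strand)` input format and the optional `max_spacing` cutoff are assumptions made for illustration.

```python
from itertools import combinations


def count_alu_pairs(elements, max_spacing=None):
    """Count direct (same strand) and inverted (opposite strand) Alu pairs.

    elements: iterable of (position, strand) tuples, strand being '+' or '-'.
    max_spacing: if given, ignore pairs separated by more than this many bp.
    Returns (direct, inverted).
    """
    direct = inverted = 0
    for (pos1, strand1), (pos2, strand2) in combinations(sorted(elements), 2):
        if max_spacing is not None and abs(pos2 - pos1) > max_spacing:
            continue
        if strand1 == strand2:
            direct += 1
        else:
            inverted += 1
    return direct, inverted


if __name__ == "__main__":
    # Hypothetical Alu annotations for one gene region.
    alus = [(100, "+"), (500, "-"), (900, "+"), (5000, "+")]
    print(count_alu_pairs(alus))                    # all pairs
    print(count_alu_pairs(alus, max_spacing=1000))  # only nearby pairs
```

    A full model along the lines described would additionally weight each inverted pair by its double-strand break potential and by the probability that a resulting deletion reaches an exon; those weights are not specified in the abstract and are not modeled here.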

  13. A comparison of 100 human genes using an alu element-based instability model.

    Directory of Open Access Journals (Sweden)

    George W Cook

    Full Text Available The human retrotransposon with the highest copy number is the Alu element. The human genome contains over one million Alu elements that collectively account for over ten percent of our DNA. Full-length Alu elements are randomly distributed throughout the genome in both forward and reverse orientations. However, full-length widely spaced Alu pairs having two Alus in the same (direct) orientation are statistically more prevalent than Alu pairs having two Alus in the opposite (inverted) orientation. The cause of this phenomenon is unknown. It has been hypothesized that this imbalance is the consequence of anomalous inverted Alu pair interactions. One proposed mechanism suggests that inverted Alu pairs can ectopically interact, exposing both ends of each Alu element making up the pair to a potential double-strand break, or "hit". This hypothesized "two-hit" (two double-strand breaks) potential per Alu element was used to develop a model for comparing the relative instabilities of human genes. The model incorporates both 1) the two-hit double-strand break potential of Alu elements and 2) the probability of exon-damaging deletions extending from these double-strand breaks. This model was used to compare the relative instabilities of 50 deletion-prone cancer genes and 50 randomly selected genes from the human genome. The output of the Alu element-based genomic instability model developed here is shown to coincide with the observed instability of deletion-prone cancer genes. The 50 cancer genes are collectively estimated to be 58% more unstable than the randomly chosen genes using this model. Seven of the deletion-prone cancer genes, ATM, BRCA1, FANCA, FANCD2, MSH2, NCOR1 and PBRM1, were among the most unstable 10% of the 100 genes analyzed. This algorithm may lay the foundation for comparing genetic risks posed by structural variations that are unique to specific individuals, families and people groups.

  14. Simple sequence repeat marker development and genetic mapping ...

    Indian Academy of Sciences (India)

    polymorphic SSR (simple sequence repeats) markers from libraries enriched for GA, CAA and AAT repeats, as well as 6 ... ers for quinoa was the development of a genetic linkage map ...... Weber J. L. 1990 Informativeness of human (dC-dA)n.

  15. Thermodynamic modeling of the Al-U and Co-U systems

    International Nuclear Information System (INIS)

    Wang, J.; Liu, X.J.; Wang, C.P.


    The thermodynamic assessments of the Al-U and Co-U systems have been carried out by using the CALPHAD (Calculation of Phase Diagrams) method on the basis of the experimental data including thermodynamic properties and phase equilibria. Gibbs free energies of the solution phases were described by the subregular solution models with the Redlich-Kister equation, and those of the intermetallic compounds described by the sublattice models. A consistent set of thermodynamic parameters has been derived for describing the Gibbs free energies of each solution phase and intermetallic compounds in the Al-U and Co-U binary systems. The calculated phase diagrams and thermodynamic properties in the Al-U and Co-U systems are in good agreement with experimental data

  16. Identification, variation and transcription of pneumococcal repeat sequences (United States)


    Background Small interspersed repeats are commonly found in many bacterial chromosomes. Two families of repeats (BOX and RUP) have previously been identified in the genome of Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen of humans. However, little is known about the role they play in pneumococcal genetics. Results Analysis of the genome of S. pneumoniae ATCC 700669 revealed the presence of a third repeat family, which we have named SPRITE. All three repeats are present at a reduced density in the genome of the closely related species S. mitis. However, they are almost entirely absent from all other streptococci, although a set of elements related to the pneumococcal BOX repeat was identified in the zoonotic pathogen S. suis. In conjunction with information regarding their distribution within the pneumococcal chromosome, this suggests that it is unlikely that these repeats are specialised sequences performing a particular role for the host, but rather that they constitute parasitic elements. However, comparing insertion sites between pneumococcal sequences indicates that they appear to transpose at a much lower rate than IS elements. Some large BOX elements in S. pneumoniae were found to encode open reading frames on both strands of the genome, whilst another was found to form a composite RNA structure with two T box riboswitches. In multiple cases, such BOX elements were demonstrated as being expressed using directional RNA-seq and RT-PCR. Conclusions BOX, RUP and SPRITE repeats appear to have proliferated extensively throughout the pneumococcal chromosome during the species' past, but novel insertions are currently occurring at a relatively slow rate. Through their extensive secondary structures, they seem likely to affect the expression of genes with which they are co-transcribed. Software for annotation of these repeats is freely available from PMID:21333003

  17. Gene conversion as a secondary mechanism of short interspersed element (SINE) evolution

    Energy Technology Data Exchange (ETDEWEB)

    Kass, D.H. [Louisiana State Univ. Medical Center, New Orleans, LA (United States). Dept. of Biochemistry and Molecular Biology; Batzer, M.A. [Lawrence Livermore National Lab., CA (United States); Deininger, P.L. [Louisiana State Univ. Medical Center, New Orleans, LA (United States). Dept. of Biochemistry and Molecular Biology]|[Alton Ochsner Medical Foundation, New Orleans, LA (United States). Lab. of Molecular Genetics


    The Alu repetitive family of short interspersed elements (SINEs) in primates can be subdivided into distinct subfamilies by specific diagnostic nucleotide changes. The older subfamilies are generally very abundant, while the younger subfamilies have fewer copies. Some of the youngest Alu elements are absent in the orthologous loci of nonhuman primates, indicative of recent retroposition events, the primary mode of SINE evolution. PCR analysis of one young Alu subfamily (Sb2) member found in the low-density lipoprotein receptor gene apparently revealed the presence of this element in the green monkey, orangutan, gorilla, and chimpanzee genomes, as well as the human genome. However, sequence analysis of these genomes revealed that a highly mutated, older, primate-specific Alu element was present at this position in the nonhuman primates. Comparison of the flanking DNA sequences upstream of this Alu insertion corresponded to evolution expected for standard primate phylogeny, but comparison of the Alu repeat sequences revealed that the human element departed from this phylogeny. The change in the human sequence apparently occurred by a gene conversion event only within the Alu element itself, converting it from one of the oldest to one of the youngest Alu subfamilies. Although gene conversions of Alu elements are clearly very rare, this finding shows that such events can occur and contribute to specific cases of SINE subfamily evolution.

  18. Always look on both sides: phylogenetic information conveyed by simple sequence repeat allele sequences.

    Directory of Open Access Journals (Sweden)

    Stéphanie Barthe

    Full Text Available Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily, mutations in the target sequences follow the stepwise mutation model (SMM). Generally speaking, PCR amplicon sizes are used as direct indicators of the number of SSR repeats composing an allele with the data analysis either ignoring the extent of allele size differences or assuming that there is a direct correlation between differences in amplicon size and evolutionary distance. However, without precisely knowing the kind and distribution of polymorphism within an allele (SSR and the associated flanking region (FR) sequences), it is hard to say what kind of evolutionary message is conveyed by such a synthetic descriptor of polymorphism as DNA amplicon size. In this study, we sequenced several SSR alleles in multiple populations of three divergent tree genera and disentangled the types of polymorphisms contained in each portion of the DNA amplicon containing an SSR. The patterns of diversity provided by amplicon size variation, SSR variation itself, insertions/deletions (indels), and single nucleotide polymorphisms (SNPs) observed in the FRs were compared. Amplicon size variation largely reflected SSR repeat number. The amount of variation was as large in FRs as in the SSR itself. The former contributed significantly to the phylogenetic information and sometimes was the main source of differentiation among individuals and populations contained by FR and SSR regions of SSR markers. The presence of mutations occurring at different rates within a marker's sequence offers the opportunity to analyse evolutionary events occurring on various timescales, but at the same time calls for caution in the interpretation of SSR marker data when the distribution of within

  19. Age-Associated ALU Element Instability in White Blood Cells Is Linked to Lower Survival in Elderly Adults: A Preliminary Cohort Study.

    Directory of Open Access Journals (Sweden)

    R Garrett Morgan

    Full Text Available ALU element instability could contribute to gene function variance in aging, and may partly explain variation in human lifespan. To assess the role of ALU element instability in human aging and the potential efficacy of ALU element content as a marker of biological aging and survival. Preliminary cohort study. We measured two high frequency ALU element subfamilies, ALU-J and ALU-Sx, by a single qPCR assay and compared ALU-J/Sx content in white blood cell (WBC) and skeletal muscle cell (SMC) biopsies from twenty-three elderly adults with sixteen healthy sex-balanced young adults; all-cause survival rates of elderly adults predicted by ALU-J/Sx content in both tissues; and cardiovascular disease (CVD)- and cancer-specific survival rates of elderly adults predicted by ALU-J/Sx content in both tissues, as planned subgroup analyses. We found greater ALU-J/Sx content variance in WBCs from elderly adults than young adults (P < 0.001) with no difference in SMCs (P = 0.94). Elderly adults with low WBC ALU-J/Sx content had worse four-year all-cause and CVD-associated survival than those with high ALU-J/Sx content (both P = 0.03 and hazard ratios (HR) ≥ 3.40), while WBC ALU-J/Sx content had no influence on cancer-associated survival (P = 0.42 and HR = 0.74). SMC ALU-J/Sx content had no influence on all-cause, CVD- or cancer-associated survival (all P ≥ 0.26; HR ≤ 2.07). These initial findings demonstrate that ALU element instability occurs with advanced age in WBCs, but not SMCs, and imparts greater risk of all-cause mortality that is likely driven by an increased risk for CVD and not cancer.

  20. Multineuronal Spike Sequences Repeat with Millisecond Precision

    Directory of Open Access Journals (Sweden)

    Koki Matsumoto


    Full Text Available Cortical microcircuits are nonrandomly wired by neurons. As a natural consequence, spikes emitted by microcircuits are also nonrandomly patterned in time and space. One of the prominent spike organizations is a repetition of fixed patterns of spike series across multiple neurons. However, several questions remain unsolved, including how precisely spike sequences repeat, how the sequences are spatially organized, how many neurons participate in sequences, and how different sequences are functionally linked. To address these questions, we monitored spontaneous spikes of hippocampal CA3 neurons ex vivo using a high-speed functional multineuron calcium imaging technique that allowed us to monitor spikes with millisecond resolution and to record the location of spiking and nonspiking neurons. Multineuronal spike sequences were overrepresented in spontaneous activity compared to the statistical chance level. Approximately 75% of neurons participated in at least one sequence during our observation period. The participants were sparsely dispersed and did not show specific spatial organization. The number of sequences relative to the chance level decreased when larger time frames were used to detect sequences. Thus, sequences were precise at the millisecond level. Sequences often shared common spikes with other sequences; parts of sequences were subsequently relayed by following sequences, generating complex chains of multiple sequences.

  1. Detection of a new submicroscopic Norrie disease deletion interval with a novel DNA probe isolated by differential Alu PCR fingerprint cloning

    NARCIS (Netherlands)

    Bergen, A. A.; Wapenaar, M. C.; Schuurman, E. J.; Diergaarde, P. J.; Lerach, H.; Monaco, A. P.; Bakker, E.; Bleeker-Wagemakers, E. M.; van Ommen, G. J.


    Differential Alu PCR fingerprint cloning was used to isolate a DNA probe from the Xp11.4→p11.21 region of the human X chromosome. This novel sequence, cpXr318 (DXS742), detects a new submicroscopic deletion interval at the Norrie disease locus (NDP). Combining our data with the consensus genetic

  2. Alu Insertions and Genetic Diversity: A Preliminary Investigation by an Undergraduate Bioinformatics Class (United States)

    Elwess, Nancy L.; Duprey, Stephen L.; Harney, Lindesay A.; Langman, Jessie E.; Marino, Tara C.; Martinez, Carolina; McKeon, Lauren L.; Moss, Chantel I. E.; Myrie, Sasha S.; Taylor, Luke Ryan


    "Alu"-insertion polymorphisms were used by an undergraduate Bioinformatics class to study how these insertion sites could be the basis for an investigation in human population genetics. Based on the students' investigation, both allele and genotype "Alu" frequencies were determined for African-American and Japanese populations as well as a…

  3. Development of simple sequence repeat (SSR) markers that are ...

    African Journals Online (AJOL)

    Simple sequence repeats (SSRs) markers were developed through data mining of 3,803 expressed sequence tags (ESTs) previously published. A total of 144 di- to penta-type SSRs were identified and they were screened for polymorphism between two turnip cultivars, 'Tsuda' and 'Yurugi Akamaru'. Out of 90 EST-SSRs for ...

  4. Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

    Directory of Open Access Journals (Sweden)

    Charlotte Rehm

    Full Text Available In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1-5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6-9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes show putative G4 forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria.

  5. Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp. (United States)

    Rehm, Charlotte; Wurmthaler, Lena A; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S


    In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1-5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6-9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes show putative G4 forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria.
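
    The repeat type named in this abstract, GGGNATC, can be located with a simple pattern search. The following is a minimal sketch, not the authors' pipeline: the function name `find_g_rich_tracts` is hypothetical, only the given strand is scanned, and the minimum of two units is taken from the range reported in the abstract.

```python
import re

# One repeat unit of the type GGGNATC, where N is any nucleotide.
UNIT = r"GGG[ACGT]ATC"
UNIT_LENGTH = 7

# A tract is two or more directly adjacent units (the abstract reports 2 to 26).
TRACT = re.compile(r"(?:%s){2,}" % UNIT)


def find_g_rich_tracts(sequence):
    """Return (start, unit_count) for each tandem GGGNATC tract.

    Start positions are 0-based; only the strand passed in is searched.
    """
    sequence = sequence.upper()
    return [
        (match.start(), len(match.group()) // UNIT_LENGTH)
        for match in TRACT.finditer(sequence)
    ]


if __name__ == "__main__":
    example = "TT" + "GGGAATC" * 4 + "CC" + "GGGTATC"
    # The four-unit tract is reported; the isolated single unit is not.
    print(find_g_rich_tracts(example))
```

    Scanning the reverse complement as well would be needed for a strand-aware count such as the one described above.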

  6. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats. (United States)

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas


    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure. (United States)

    Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K


    There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra-and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.

  8. Simple sequence repeat marker loci discovery using SSR primer. (United States)

    Robinson, Andrew J; Love, Christopher G; Batley, Jacqueline; Barker, Gary; Edwards, David


    Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. With the increase in the availability of DNA sequence information, an automated process to identify and design PCR primers for amplification of SSR loci would be a useful tool in plant breeding programs. We report an application that integrates SPUTNIK, an SSR repeat finder, with Primer3, a PCR primer design program, into one pipeline tool, SSR Primer. On submission of multiple FASTA formatted sequences, the script screens each sequence for SSRs using SPUTNIK. The results are parsed to Primer3 for locus-specific primer design. The script makes use of a Web-based interface, enabling remote use. This program has been written in PERL and is freely available for non-commercial users by request from the authors. The Web-based version may be accessed at

  9. Low-pass shotgun sequencing of the barley genome facilitates rapid identification of genes, conserved non-coding sequences and novel repeats

    Directory of Open Access Journals (Sweden)

    Graner Andreas


    Full Text Available Abstract Background: Barley has one of the largest and most complex genomes of all economically important food crops. The rise of new short read sequencing technologies such as Illumina/Solexa permits such large genomes to be effectively sampled at relatively low cost. Based on the corresponding sequence reads a Mathematically Defined Repeat (MDR) index can be generated to map repetitive regions in genomic sequences. Results: We have generated 574 Mbp of Illumina/Solexa sequences from barley total genomic DNA, representing about 10% of a genome equivalent. From these sequences we generated an MDR index which was then used to identify and mark repetitive regions in the barley genome. Comparison of the MDR plots with expert repeat annotation drawing on the information already available for known repetitive elements revealed a significant correspondence between the two methods. MDR-based annotation allowed for the identification of dozens of novel repeat sequences, though, which were not recognised by hand-annotation. The MDR data was also used to identify gene-containing regions by masking of repetitive sequences in eight de-novo sequenced bacterial artificial chromosome (BAC) clones. For half of the identified candidate gene islands indeed gene sequences could be identified. MDR data were only of limited use, when mapped on genomic sequences from the closely related species Triticum monococcum as only a fraction of the repetitive sequences was recognised. Conclusion: An MDR index for barley, which was obtained by whole-genome Illumina/Solexa sequencing, proved as efficient in repeat identification as manual expert annotation. Circumventing the labour-intensive step of producing a specific repeat library for expert annotation, an MDR index provides an elegant and efficient resource for the identification of repetitive and low-copy (i.e. potentially gene-containing sequences) regions in uncharacterised genomic sequences.
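
    The idea of marking repetitive regions from read-derived frequencies can be illustrated with a k-mer count table. This is only a sketch under assumptions: the abstract does not say how the MDR index is built, so the k-mer counting used here, the word length `k`, and the function names `build_kmer_index` and `repeat_profile` are all hypothetical.

```python
from collections import Counter


def build_kmer_index(reads, k):
    """Count how often each k-mer occurs across a collection of reads."""
    index = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            index[read[i:i + k]] += 1
    return index


def repeat_profile(sequence, index, k):
    """For each position of `sequence`, look up the read-derived k-mer count.

    High values suggest repetitive regions; zeros or low values suggest
    low-copy (potentially gene-containing) regions.
    """
    return [index[sequence[i:i + k]] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    reads = ["ACGTAC", "GTACGT"]
    index = build_kmer_index(reads, k=3)
    print(repeat_profile("ACGTTT", index, k=3))
```

    Masking positions whose count exceeds a chosen threshold would then leave candidate low-copy islands; the threshold itself would have to be calibrated against the sampling depth.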

  10. Quantification of integrated HIV DNA by repetitive-sampling Alu-HIV PCR on the basis of poisson statistics. (United States)

    De Spiegelaere, Ward; Malatinkova, Eva; Lynch, Lindsay; Van Nieuwerburgh, Filip; Messiaen, Peter; O'Doherty, Una; Vandekerckhove, Linos


    Quantification of integrated proviral HIV DNA by repetitive-sampling Alu-HIV PCR is a candidate virological tool to monitor the HIV reservoir in patients. However, the experimental procedures and data analysis of the assay are complex and hinder its widespread use. Here, we provide an improved and simplified data analysis method by adopting binomial and Poisson statistics. A modified analysis method on the basis of Poisson statistics was used to analyze the binomial data of positive and negative reactions from a 42-replicate Alu-HIV PCR by use of dilutions of an integration standard and on samples of 57 HIV-infected patients. Results were compared with the quantitative output of the previously described Alu-HIV PCR method. Poisson-based quantification of the Alu-HIV PCR was linearly correlated with the standard dilution series, indicating that absolute quantification with the Poisson method is a valid alternative for data analysis of repetitive-sampling Alu-HIV PCR data. Quantitative outputs of patient samples assessed by the Poisson method correlated with the previously described Alu-HIV PCR analysis, indicating that this method is a valid alternative for quantifying integrated HIV DNA. Poisson-based analysis of the Alu-HIV PCR data enables absolute quantification without the need of a standard dilution curve. Implementation of the CI estimation permits improved qualitative analysis of the data and provides a statistical basis for the required minimal number of technical replicates. © 2014 The American Association for Clinical Chemistry.
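
    The Poisson reasoning behind absolute quantification from positive and negative replicate reactions can be sketched as follows. This is a textbook Poisson estimate, not the authors' published procedure: it assumes target copies are distributed randomly across replicates, so the fraction of negative reactions estimates exp(-λ), where λ is the mean number of detectable copies per reaction. The function name `mean_copies_per_reaction` is hypothetical, and no confidence interval is computed here.

```python
import math


def mean_copies_per_reaction(n_positive, n_total):
    """Estimate mean target copies per reaction from replicate outcomes.

    Under a Poisson model, P(negative) = exp(-lam), hence
    lam = -ln(n_negative / n_total).
    """
    if n_total <= 0 or not 0 <= n_positive <= n_total:
        raise ValueError("need 0 <= n_positive <= n_total and n_total > 0")
    n_negative = n_total - n_positive
    if n_negative == 0:
        # With no negative reactions the estimate is unbounded (saturated assay).
        raise ValueError("all replicates positive: dilute the sample and repeat")
    return -math.log(n_negative / n_total)


if __name__ == "__main__":
    # Example: 21 positive reactions out of 42 replicates.
    print(round(mean_copies_per_reaction(21, 42), 4))
```

    With half of 42 replicates positive the estimate is ln 2, about 0.69 copies per reaction; multiplying by the dilution factor and dividing by the input cell number would convert this to copies per cell.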

  11. SeqEntropy: genome-wide assessment of repeats for short read sequencing.

    Directory of Open Access Journals (Sweden)

    Hsueh-Ting Chu

    Full Text Available BACKGROUND: Recent studies on genome assembly from short-read sequencing data reported the limitation of this technology to reconstruct the entire genome even at very high depth coverage. We investigated the limitation from the perspective of information theory to evaluate the effect of repeats on short-read genome assembly using idealized (error-free) reads at different lengths. METHODOLOGY/PRINCIPAL FINDINGS: We define a metric H(k) to be the entropy of sequencing reads at a read length k and use the relative loss of entropy ΔH(k) to measure the impact of repeats for the reconstruction of whole-genome from sequences of length k. In our experiments, we found that entropy loss correlates well with de-novo assembly coverage of a genome, and a score of ΔH(k) > 1% indicates a severe loss in genome reconstruction fidelity. The minimal read lengths to achieve ΔH(k) < 1% are different for various organisms and are independent of the genome size. For example, in order to meet the threshold of ΔH(k) < 1%, a read length of 60 bp is needed for the sequencing of human genome (3.2×10^9 bp) and 320 bp for the sequencing of fruit fly (1.8×10^8 bp). We also calculated the ΔH(k) scores for 2725 prokaryotic chromosomes and plasmids at several read lengths. Our results indicate that the levels of repeats in different genomes are diverse and the entropy of sequencing reads provides a measurement for the repeat structures. CONCLUSIONS/SIGNIFICANCE: The proposed entropy-based measurement, which can be calculated in seconds to minutes in most cases, provides a rapid quantitative evaluation on the limitation of idealized short-read genome sequencing. Moreover, the calculation can be parallelized to scale up to large eukaryotic genomes. This approach may be useful to tune the sequencing parameters to achieve better genome assemblies when a closely related genome is already available.
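
    An entropy-based repeat measure of this kind can be sketched in a few lines. The abstract does not give the formula, so the following rests on assumptions: the entropy is taken as the Shannon entropy of all error-free reads of length k (one per start position), and the relative loss is measured against the maximum entropy reached when every read is unique. The names `read_entropy` and `relative_entropy_loss` are hypothetical and this is not the SeqEntropy implementation.

```python
import math
from collections import Counter


def read_entropy(genome, k):
    """Shannon entropy (bits) of the idealized reads of length k."""
    reads = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    total = len(reads)
    counts = Counter(reads)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def relative_entropy_loss(genome, k):
    """Fraction of the maximum entropy lost because some reads are repeated.

    Returns 0.0 when all reads are unique and approaches 1.0 for a
    sequence made of a single repeated read. Requires at least two reads.
    """
    n_reads = len(genome) - k + 1
    if n_reads < 2:
        raise ValueError("genome must yield at least two reads of length k")
    h_max = math.log2(n_reads)
    return (h_max - read_entropy(genome, k)) / h_max


if __name__ == "__main__":
    genome = "ACGTACGT"
    for k in (2, 4, 6):
        print(k, round(relative_entropy_loss(genome, k), 3))
```

    Longer reads span more of the repeated content, so the loss shrinks as k grows, which is the behaviour the 1% threshold in the abstract relies on.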

  12. simple sequence repeat (SSR) markers in genetic analysis of

    African Journals Online (AJOL)



    1998). Cross-species amplification of soybean (Glycine max) simple sequence repeats (SSRs) within the genus and other legume genera: implications for the transferability of SSRs in plants. Mol. Biol. Evol. 15:1275-1287.

  13. Alu-miRNA interactions modulate transcript isoform diversity in stress response and reveal signatures of positive selection (United States)

    Pandey, Rajesh; Bhattacharya, Aniket; Bhardwaj, Vivek; Jha, Vineet; Mandal, Amit K.; Mukerji, Mitali


    Primate-specific Alus harbor different regulatory features, including miRNA targets. In this study, we provide evidence for miRNA-mediated modulation of transcript isoform levels during heat-shock response through exaptation of Alu-miRNA sites in mature mRNA. We performed genome-wide expression profiling coupled with functional validation of miRNA target sites within exonized Alus, and analyzed conservation of these targets across primates. We observed that two miRNAs (miR-15a-3p and miR-302d-3p) elevated in stress response, target RAD1, GTSE1, NR2C1, FKBP9 and UBE2I exclusively within Alu. These genes map onto the p53 regulatory network. Ectopic overexpression of miR-15a-3p downregulates GTSE1 and RAD1 at the protein level and enhances cell survival. This Alu-mediated fine-tuning seems to be unique to humans as evident from the absence of orthologous sites in other primate lineages. We further analyzed signatures of selection on Alu-miRNA targets in the genome, using 1000 Genomes Phase-I data. We found that 198 out of 3177 Alu-exonized genes exhibit signatures of selection within Alu-miRNA sites, with 60 of them containing SNPs supported by multiple evidences (global-FST > 0.3, pair-wise-FST > 0.5, Fay-Wu’s H  2.0, high ΔDAF) and implicated in p53 network. We propose that by affecting multiple genes, Alu-miRNA interactions have the potential to facilitate population-level adaptations in response to environmental challenges.

  14. Comparative effectiveness of inter-simple sequence repeat and ...

    African Journals Online (AJOL)

    A study to compare the effectiveness of inter-simple sequence repeats (ISSR) and randomly amplified polymorphic DNA (RAPD) profiling was carried out with a total of 65 DNA samples using 12 species of Indian Garcinia. ISSR and RAPD profiling were performed with 19 and 12 primers, respectively. ISSR markers ...

  15. SSRscanner: a program for reporting distribution and exact location of simple sequence repeats. (United States)

    Anwar, Tamanna; Khan, Asad U


    Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found as longer tracts (> 6 repeating units). Most computer programs that find SSRs do not report their exact positions. The computer program SSRscanner was written to determine the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search for repeats of any length and reports their exact positions on the chromosome and their frequency of occurrence in the sequence. This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail:
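
    Reporting the exact location of each SSR, as described above, can be illustrated with a backreference pattern. This sketch is written in Python rather than the PERL of the original tool and is not SSRscanner itself: the function name `find_ssrs`, the default unit sizes (mono- to trinucleotide) and the default minimum of four repeats are assumptions chosen for the example.

```python
import re


def find_ssrs(sequence, max_unit=3, min_repeats=4):
    """Return (start, unit, repeat_count) for each SSR in `sequence`.

    Start positions are 0-based. The lazy quantifier makes the shortest
    repeating unit win, so AAAAAA is reported as unit 'A', not 'AA'.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group()) // len(unit)))
    return hits


if __name__ == "__main__":
    print(find_ssrs("GGACACACACTT"))
```

    Counting the hits per unit then gives the frequency of occurrence; the regular expression reports only perfect repeats, so interrupted tracts would be split into separate entries.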

  16. Simple sequence repeat (SSR)-based genetic variability among ...

    African Journals Online (AJOL)

    The objective of this study was to compare if simple sequence repeat (SSR) markers could correctly identify peanut genotypes with difference in specific leaf weight (SLW) and relative water content (RWC). Four peanut genotypes and two water regimes (FC and 1/3 available water; 1/3 AW) were arranged in factorial ...

  17. Exploring the Feasibility of a DNA Computer: Design of an ALU Using Sticker-Based DNA Model. (United States)

    Sarkar, Mayukh; Ghosal, Prasun; Mohanty, Saraju P


    Since its inception, DNA computing has advanced to offer an extremely powerful, energy-efficient emerging technology for solving hard computational problems with its inherent massive parallelism and extremely high data density. It would be much more powerful and general purpose if combined, through a suitable ALU, with the well-known algorithmic solutions that already exist for conventional computing architectures. Thus, a specifically designed DNA Arithmetic and Logic Unit (ALU) that can address operations suitable for both domains can bridge the gap between the two. An ALU must be able to perform all possible logic operations, including NOT, OR, AND, XOR, NOR, NAND, and XNOR; compare, shift etc.; and integer and floating point arithmetic operations (addition, subtraction, multiplication, and division). In this paper, the design of an ALU is proposed using the sticker-based DNA model, with an experimental feasibility analysis. The novelties of this paper are manifold. First, the integer arithmetic operations performed here are 2s complement arithmetic, and the floating point operations follow the IEEE 754 floating point format, closely resembling a conventional ALU. Also, the output of each operation can be reused for any subsequent operation, so any algorithm or program logic that users can think of can be implemented directly on the DNA computer without modification. Second, once the basic operations of the sticker model can be automated, the implementations proposed in this paper become highly suitable for designing a fully automated ALU. Third, the proposed approaches are easy to implement. Finally, these approaches can work on sufficiently large binary numbers.

  18. Potentials and limitations of histone repeat sequences for phylogenetic reconstruction of Sophophora. (United States)

    Baldo, A M; Les, D H; Strausbaugh, L D


    Simplified DNA sequence acquisition has provided many new data sets that are useful for phylogenetic reconstruction, including single- and multiple-copy nuclear and organellar genes. Although transcribed regions receive much attention, nontranscribed regions have recently been added to the repertoire of sequences suitable for phylogenetic studies, especially for closely related taxa. We evaluated the efficacy of a small portion of the histone repeat for phylogenetic reconstruction among Drosophila species. Histone repeats in invertebrates offer distinct advantages similar to those of widely used ribosomal repeats. First, the units are tandemly repeated and undergo concerted evolution. Second, histone repeats include both highly conserved coding and variable intergenic regions. This composition facilitates application of "universal" primers spanning potentially informative sites. We examined a small region of the histone repeat, including the intergenic spacer segments of coding regions from the divergently transcribed H2A and H2B histone genes. The spacer (about 230 bp) exists as a mosaic with highly conserved functional motifs interspersed with rapidly diverging regions; the former aid in alignment of the spacer. There are no ambiguities in alignment of coding regions. Coding and noncoding regions were analyzed together and separately for phylogenetic information. Parsimony, distance, and maximum-likelihood methods successfully retrieve the corroborated phylogeny for the taxa examined. This study demonstrates the resolving power of a small histone region which may now be added to the growing collection of phylogenetically useful DNA sequences.

  19. PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes. (United States)

    Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A


    PSSRdb (Polymorphic Simple Sequence Repeats database) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are the tandem repeats of nucleotide motifs of the sizes 1-6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variations have been well-studied in some bacteria, many other species of prokaryotes have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our on-going studies on SSR polymorphism in prokaryotes we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.

  20. Morquio A syndrome: Cloning, sequence, and structure of the human N-acetylgalactosamine 6-sulfatase (GALNS) gene

    Energy Technology Data Exchange (ETDEWEB)

    Morris, C.P.; Guo, Xiao-Hui; Apostolou, S. [Adelaide Children's Hospital, North Adelaide (Australia)] [and others


    Deficiency of the lysosomal enzyme, N-acetylgalactosamine 6-sulfatase (GALNS;EC, results in the storage of the glycosaminoglycans, keratan sulfate and chondroitin 6-sulfate, which leads to the lysosomal storage disorder Morquio A syndrome. Four overlapping genomic clones derived from a chromosome 16-specific gridded cosmid library containing the entire GALNS gene were isolated. The structure of the gene and the sequence of the exon/intron boundaries and the 5′ promoter region were determined. The GALNS gene is split into 14 exons spanning approximately 40 kb. The potential promoter for GALNS lacks a TATA box but contains GC box consensus sequences, consistent with its role as a housekeeping gene. The GALNS gene contains an Alu repeat in intron 5 and a VNTR-like sequence in intron 6. 12 refs., 3 figs., 1 tab.

  1. APE1 incision activity at abasic sites in tandem repeat sequences. (United States)

    Li, Mengxia; Völker, Jens; Breslauer, Kenneth J; Wilson, David M


    Repetitive DNA sequences, such as those present in microsatellites and minisatellites, telomeres, and trinucleotide repeats (linked to fragile X syndrome, Huntington disease, etc.), account for nearly 30% of the human genome. These domains exhibit enhanced susceptibility to oxidative attack to yield base modifications, strand breaks, and abasic sites; have a propensity to adopt non-canonical DNA forms modulated by the positions of the lesions; and, when not properly processed, can contribute to genome instability that underlies aging and disease development. Knowledge on the repair efficiencies of DNA damage within such repetitive sequences is therefore crucial for understanding the impact of such domains on genomic integrity. In the present study, using strategically designed oligonucleotide substrates, we determined the ability of human apurinic/apyrimidinic endonuclease 1 (APE1) to cleave at apurinic/apyrimidinic (AP) sites in a collection of tandem DNA repeat landscapes involving telomeric and CAG/CTG repeat sequences. Our studies reveal the differential influence of domain sequence, conformation, and AP site location/relative positioning on the efficiency of APE1 binding and strand incision. Intriguingly, our data demonstrate that APE1 endonuclease efficiency correlates with the thermodynamic stability of the DNA substrate. We discuss how these results have both predictive and mechanistic consequences for understanding the success and failure of repair protein activity associated with such oxidatively sensitive, conformationally plastic/dynamic repetitive DNA domains. Published by Elsevier Ltd.

  2. D20S16 is a complex interspersed repeated sequence: Genetic and physical analysis of the locus

    Energy Technology Data Exchange (ETDEWEB)

    Bowden, D.W.; Krawchuk, M.D.; Howard, T.D. [Wake Forest Univ., Winston-Salem, NC (United States)] [and others


    The genomic structure of the D20S16 locus has been evaluated using genetic and physical methods. D20S16, originally detected with the probe CRI-L1214, is a highly informative, complex restriction fragment length polymorphism consisting of two separate allelic systems. The allelic systems have the characteristics of conventional VNTR polymorphisms and are separated by recombination (θ = 0.02, Z_max = 74.82), as demonstrated in family studies. Most of these recombination events are meiotic crossovers and are maternal in origin, but two, including deletion of the locus in a cell line from a CEPH family member, occur without evidence for exchange of flanking markers. DNA sequence analysis suggests that the basis of the polymorphism is variable numbers of a 98-bp sequence tandemly repeated with 87 to 90% sequence similarity between repeats. The 98-bp repeat is a dimer of 49 bp sequence with 45 to 98% identity between the elements. In addition, nonpolymorphic genomic sequences adjacent to the polymorphic 98-bp repeat tracts are also repeated but are not polymorphic, i.e., show no individual to individual variation. Restriction enzyme mapping of cosmids containing the CRI-L1214 sequence suggests that there are multiple interspersed repeats of the CRI-L1214 sequence on chromosome 20. The results of dual-color fluorescence in situ hybridization experiments with interphase nuclei are also consistent with multiple repeats of an interspersed sequence on chromosome 20. 23 refs., 6 figs.

  3. Accurate typing of short tandem repeats from genome-wide sequencing data and its applications. (United States)

    Fungtammasan, Arkarachai; Ananda, Guruprasad; Hile, Suzanne E; Su, Marcia Shu-Wei; Sun, Chen; Harris, Robert; Medvedev, Paul; Eckert, Kristin; Makova, Kateryna D


    Short tandem repeats (STRs) are implicated in dozens of human genetic diseases and contribute significantly to genome variation and instability. Yet profiling STRs from short-read sequencing data is challenging because of their high sequencing error rates. Here, we developed STR-FM, short tandem repeat profiling using flank-based mapping, a computational pipeline that can detect the full spectrum of STR alleles from short-read data, can adapt to emerging read-mapping algorithms, and can be applied to heterogeneous genetic samples (e.g., tumors, viruses, and genomes of organelles). We used STR-FM to study STR error rates and patterns in publicly available human and in-house generated ultradeep plasmid sequencing data sets. We discovered that STRs sequenced with a PCR-free protocol have up to ninefold fewer errors than those sequenced with a PCR-containing protocol. We constructed an error correction model for genotyping STRs that can distinguish heterozygous alleles containing STRs with consecutive repeat numbers. Applying our model and pipeline to Illumina sequencing data with 100-bp reads, we could confidently genotype several disease-related long trinucleotide STRs. Utilizing this pipeline, for the first time we determined the genome-wide STR germline mutation rate from a deeply sequenced human pedigree. Additionally, we built a tool that recommends minimal sequencing depth for accurate STR genotyping, depending on repeat length and sequencing read length. The required read depth increases with STR length and is lower for a PCR-free protocol. This suite of tools addresses the pressing challenges surrounding STR genotyping, and thus is of wide interest to researchers investigating disease-related STRs and STR evolution. © 2015 Fungtammasan et al.; Published by Cold Spring Harbor Laboratory Press.

  4. A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

    Directory of Open Access Journals (Sweden)

    Glass John I


    Full Text Available Abstract Background: Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results: Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the

  5. Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies

    KAUST Repository

    Cahill, Matt J.


    Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length. 2010 Cahill et al.
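
    The dependence of repeat resolution on read length can be illustrated with a deliberately crude count. This is not the authors' model of repeat assembly, which the abstract does not spell out: it simply assumes that any sequence window as long as a read that occurs more than once in the genome cannot be placed uniquely by a single unpaired read. The function name `repeated_windows` is hypothetical.

```python
from collections import Counter


def repeated_windows(genome, read_length):
    """Count distinct windows of `read_length` that occur more than once.

    A non-zero result means some reads of this length are ambiguous;
    the value is a rough proxy, not a prediction of assembly gaps.
    """
    counts = Counter(
        genome[i:i + read_length]
        for i in range(len(genome) - read_length + 1)
    )
    return sum(1 for c in counts.values() if c > 1)


if __name__ == "__main__":
    genome = "ACGTTTACGTAA"
    for read_length in (4, 5):
        print(read_length, repeated_windows(genome, read_length))
```

    In this toy sequence the repeated ACGT is ambiguous at a read length of 4 but spanned at a read length of 5, mirroring the abstract's point that the required read length depends on the repeat content of the organism under study.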

  6. Read length and repeat resolution: exploring prokaryote genomes using next-generation sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Matt J Cahill

    Full Text Available BACKGROUND: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. METHODOLOGY/PRINCIPAL FINDINGS: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. CONCLUSIONS: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length.

  7. Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies

    KAUST Repository

    Cahill, Matt J.; Köser, Claudio U.; Ross, Nicholas E.; Archer, John A.C.


    Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length. 2010 Cahill et al.
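
    The idea of estimating unresolvable repeats for a given read length can be illustrated with a short sketch. It is not the authors' model: it assumes, purely for illustration, that a repeat is unresolvable when an exact substring as long as the read occurs more than once in the genome. The function name `repeated_kmers` and the toy genome are hypothetical.

```python
from collections import Counter


def repeated_kmers(genome: str, read_length: int) -> int:
    """Count distinct substrings of length `read_length` that occur more than once.

    Assumption (not from the paper): a read cannot bridge a repeat that is at
    least as long as the read itself, so such substrings count as unresolvable.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    counts = Counter(
        genome[i:i + read_length]
        for i in range(len(genome) - read_length + 1)
    )
    return sum(1 for n in counts.values() if n > 1)


# Toy example: longer reads leave fewer repeated substrings.
toy_genome = "ACGTACGTTTGACGTACGAA"
for length in (4, 7, 8):
    print(length, repeated_kmers(toy_genome, length))
```

    Under this simplification, the count falls as the read length grows, which mirrors the abstract's point that some genomes assemble well with very short reads while others need much longer ones.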

  8. Alu-mediated deletion of SOX10 regulatory elements in Waardenburg syndrome type 4. (United States)

    Bondurand, Nadége; Fouquet, Virginie; Baral, Viviane; Lecerf, Laure; Loundon, Natalie; Goossens, Michel; Duriez, Benedicte; Labrune, Philippe; Pingault, Veronique


    Waardenburg syndrome type 4 (WS4) is a rare neural crest disorder defined by the combination of Waardenburg syndrome (sensorineural hearing loss and pigmentation defects) and Hirschsprung disease (intestinal aganglionosis). Three genes are known to be involved in this syndrome, that is, EDN3 (endothelin-3), EDNRB (endothelin receptor type B), and SOX10. However, 15-35% of WS4 remains unexplained at the molecular level, suggesting that other genes could be involved and/or that mutations within known genes may have escaped previous screenings. Here, we searched for deletions within recently identified SOX10 regulatory sequences and describe the first characterization of a WS4 patient presenting with a large deletion encompassing three of these enhancers. Analysis of the breakpoint region suggests a complex rearrangement involving three Alu sequences that could be mediated by a FoSTeS/MMBIR replication mechanism. Taken together with recent reports, our results demonstrate that the disruption of highly conserved non-coding elements located within or at a long distance from the coding sequences of key genes can result in several neurocristopathies. This opens up new routes to the molecular dissection of neural crest disorders.

  9. Tandemly repeated sequence in 5'end of mtDNA control region of ...

    African Journals Online (AJOL)

    Extensive length variability was observed in 5' end sequence of the mitochondrial DNA control region of the Japanese Spanish mackerel (Scomberomorus niphonius). This length variability was due to the presence of varying numbers of a 56-bp tandemly repeated sequence and a 46-bp insertion/deletion (indel).

  10. Marketing mix of the company ALU KOLA CB [Marketingový mix firmy ALU KOLA CB]


    URBAN, Karel


    This bachelor thesis focuses on the practical application of the marketing mix in my own company, ALU KOLA CB. The company sells alloy wheels and tyres for passenger cars. The literature review introduces and explains the terms marketing and marketing mix, together with its parts: product, price, place and promotion. The practical part applies these terms to the company and closes with results and suggestions for improvement.

  11. Repeat Sequence Proteins as Matrices for Nanocomposites

    Energy Technology Data Exchange (ETDEWEB)

    Drummy, L.; Koerner, H; Phillips, D; McAuliffe, J; Kumar, M; Farmer, B; Vaia, R; Naik, R


    Recombinant protein-inorganic nanocomposites comprised of exfoliated Na+ montmorillonite (MMT) in a recombinant protein matrix based on silk-like and elastin-like amino acid motifs (silk elastin-like protein (SELP)) were formed via a solution blending process. Charged residues along the protein backbone are shown to dominate long-range interactions, whereas the SELP repeat sequence leads to local protein/MMT compatibility. Up to a 50% increase in room temperature modulus and a comparable decrease in high temperature coefficient of thermal expansion occur for cast films containing 2-10 wt.% MMT.

  12. Highly sensitive polymerase chain reaction-free quantum dot-based quantification of forensic genomic DNA

    International Nuclear Information System (INIS)

    Tak, Yu Kyung; Kim, Won Young; Kim, Min Jung; Han, Eunyoung; Han, Myun Soo; Kim, Jong Jin; Kim, Wook; Lee, Jong Eun; Song, Joon Myong


    Highlights: ► Genomic DNA quantification was performed using a quantum dot-labeled Alu sequence. ► This probe provided PCR-free determination of human genomic DNA. ► Qdot-labeled Alu probe-hybridized genomic DNAs had a 2.5-femtogram detection limit. ► Qdot-labeled Alu sequence was used to assess DNA samples for human identification. - Abstract: Forensic DNA samples can degrade easily due to exposure to light and moisture at the crime scene. In addition, the amount of DNA acquired at a crime scene is inherently limited. This limited amount of human DNA has to be quantified accurately after the process of DNA extraction. The accurately quantified extracted genomic DNA is then used as a DNA template in polymerase chain reaction (PCR) amplification for short tandem repeat (STR) human identification. Accordingly, highly sensitive and human-specific quantification of forensic DNA samples is an essential issue in forensic study. In this work, a quantum dot (Qdot)-labeled Alu sequence was developed as a probe to simultaneously satisfy both the high sensitivity and human genome selectivity for quantification of forensic DNA samples. This probe provided PCR-free determination of human genomic DNA and had a 2.5-femtogram detection limit due to the strong emission and photostability of the Qdot. The Qdot-labeled Alu sequence has been used successfully to assess 18 different forensic DNA samples for STR human identification.

  13. Partial protoporphyrinogen oxidase (PPOX) gene deletions, due to different Alu-mediated mechanisms, identified by MLPA analysis in patients with variegate porphyria

    Directory of Open Access Journals (Sweden)

    Barbaro Michela


    Full Text Available Abstract Variegate porphyria (VP) is an autosomal dominantly inherited hepatic porphyria. The genetic defect in the PPOX gene leads to a partial defect of protoporphyrinogen oxidase, the penultimate enzyme of heme biosynthesis. Affected individuals can develop cutaneous symptoms in sun-exposed areas of the skin and/or neuropsychiatric acute attacks. The identification of the genetic defect in VP families is of crucial importance to detect the carrier status which allows counseling to prevent potentially life threatening neurovisceral attacks, usually triggered by factors such as certain drugs, alcohol or fasting. In a total of 31 Swedish VP families sequence analysis had identified a genetic defect in 26. In the remaining five families an extended genetic investigation was necessary. After the development of a synthetic probe set, MLPA analysis to screen for single exon deletions/duplications was performed. We describe here, for the first time, two partial deletions within the PPOX gene detected by MLPA analysis. One deletion affects exons 5 and 6 (c.339-197_616+320del1099) and has been identified in four families, most probably after a founder effect. The other extends from exon 5 to exon 9 (c.339-350_987+229del2609) and was found in one family. We show that both deletions are mediated by Alu repeats. Our findings emphasize the usefulness of MLPA analysis as a complement to PPOX gene sequencing analysis for comprehensive genetic diagnostics in patients with VP.

  14. Genome wide analysis of acute myeloid leukemia reveal leukemia specific methylome and subtype specific hypomethylation of repeats.

    Directory of Open Access Journals (Sweden)

    Marwa H Saied

    Full Text Available Methylated DNA immunoprecipitation followed by high-throughput sequencing (MeDIP-seq) has the potential to identify changes in DNA methylation important in cancer development. In order to understand the role of epigenetic modulation in the development of acute myeloid leukemia (AML) we have applied MeDIP-seq to the DNA of 12 AML patients and 4 normal bone marrows. This analysis revealed leukemia-associated differentially methylated regions that included gene promoters, gene bodies, CpG islands and CpG island shores. Two genes (SPHKAP and DPP6) with significantly methylated promoters were of interest and further analysis of their expression showed them to be repressed in AML. We also demonstrated considerable cytogenetic subtype specificity in the methylomes affecting different genomic features. Significantly distinct patterns of hypomethylation of certain interspersed repeat elements were associated with cytogenetic subtypes. The methylation patterns of members of the SINE family tightly clustered all leukemic patients with an enrichment of Alu repeats with a high CpG density (P<0.0001). We were able to demonstrate significant inverse correlation between intragenic interspersed repeat sequence methylation and gene expression with SINEs showing the strongest inverse correlation (R² = 0.7). We conclude that the alterations in DNA methylation that accompany the development of AML affect not only the promoters, but also the non-promoter genomic features, with significant demethylation of certain interspersed repeat DNA elements being associated with AML cytogenetic subtypes. MeDIP-seq data were validated using bisulfite pyrosequencing and the Infinium array.

  15. Rangku Alu - A Traditional East Nusa Tenggara Game in Android Platform (United States)

    Rahmat, R. F.; Ramadhan, R.; Arisandi, D.; Syahputra, M. F.; Sheta, O.


    Rangku Alu is a traditional Indonesian game originating from Manggarai, East Nusa Tenggara. It is played with two pairs of bamboo poles or sticks kept in motion until the opponent's foot is caught between them. The game is rarely played nowadays; with the rapid development of technology, however, it can be played individually by anyone as an online game on devices such as mobile phones or PCs. In Rangku Alu, the moves of a dancer or player vary in each dance. In this research, the Fisher-Yates shuffle algorithm was used as a randomization method to determine the next moves and to prevent tap areas from appearing at the same place more than once in a row. The results show that the tap areas never appeared at the same place twice or more in succession.
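
    A minimal sketch of how a Fisher-Yates shuffle could drive tap-area selection without back-to-back repeats is given here. The abstract does not describe the game's actual implementation, so the reshuffle-per-round scheme, the boundary swap, and the names `fisher_yates_shuffle` and `tap_sequence` are all assumptions made for illustration.

```python
import random


def fisher_yates_shuffle(items, rng=random):
    """Return a shuffled copy of `items` using the Fisher-Yates algorithm."""
    shuffled = list(items)
    for i in range(len(shuffled) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled


def tap_sequence(areas, rounds, rng=random):
    """Build a tap sequence in which no area appears twice in a row.

    Hypothetical scheme: reshuffle the distinct areas every round; if a round
    would start with the area that ended the previous round, swap the first
    and last elements of the new round.
    """
    areas = list(areas)
    if len(areas) < 2 or len(set(areas)) != len(areas):
        raise ValueError("need at least two distinct tap areas")
    sequence = []
    previous = None
    for _ in range(rounds):
        order = fisher_yates_shuffle(areas, rng)
        if order[0] == previous:
            order[0], order[-1] = order[-1], order[0]
        sequence.extend(order)
        previous = order[-1]
    return sequence


print(tap_sequence(["left", "right", "front", "back"], rounds=3))
```

    Because every round is a permutation of distinct areas, repeats can only occur at round boundaries, and the swap removes exactly those cases.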

  16. SINE Retrotransposition: Evaluation of Alu Activity and Recovery of De Novo Inserts. (United States)

    Ade, Catherine; Roy-Engel, Astrid M


    Mobile element activity is of great interest due to its impact on genomes. However, the types of mobile elements that inhabit any given genome are remarkably varied. Among the different varieties of mobile elements, the Short Interspersed Elements (SINEs) populate many genomes, including many mammalian species. Although SINEs are parasites of Long Interspersed Elements (LINEs), SINEs have been highly successful in both the primate and rodent genomes. When comparing copy numbers in mammals, SINEs have been vastly more successful than other nonautonomous elements, such as the retropseudogenes and SVA. Interestingly, in the human genome the copy number of Alu (a primate SINE) outnumbers LINE-1 (L1) copies 2 to 1. Estimates suggest that the retrotransposition rate for Alu is tenfold higher than LINE-1 with about 1 insert in every twenty births. Furthermore, Alu-induced mutagenesis is responsible for the majority of the documented instances of human retroelement insertion-induced disease. However, little is known on what contributes to these observed differences between SINEs and LINEs. The development of an assay to monitor SINE retrotransposition in culture has become an important tool for the elucidation of some of these differences. In this chapter, we present details of the SINE retrotransposition assay and the recovery of de novo inserts. We also focus on the nuances that are unique to the SINE assay.

  17. Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis. (United States)

    Zheng, Jin-Shuang; Sun, Cheng-Zhen; Zhang, Shu-Ning; Hou, Xi-Lin; Bonnema, Guusje


    A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention had been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes B. rapa ssp. chinensis.

  18. Contribution of Large Genomic Rearrangements in Italian Lynch Syndrome Patients: Characterization of a Novel Alu-Mediated Deletion

    Directory of Open Access Journals (Sweden)

    Francesca Duraturo


    Full Text Available Lynch syndrome is associated with germ-line mutations in the DNA mismatch repair (MMR) genes, mainly MLH1 and MSH2. Most of the mutations reported in these genes to date are point mutations, small deletions, and insertions. Large genomic rearrangements in the MMR genes predisposing to Lynch syndrome also occur, but their frequency varies with the population studied, on average from 5 to 20%. The aim of this study was to examine the contribution of large rearrangements in the MLH1 and MSH2 genes in a well-characterised series of 63 unrelated Southern Italian Lynch syndrome patients who were negative for pathogenic point mutations in the MLH1, MSH2, and MSH6 genes. We identified a large novel deletion in the MSH2 gene, including exon 6, in one of the patients analysed (1.6% frequency). This deletion was confirmed and localised by long-range PCR. The breakpoints of this rearrangement were characterised by sequencing. Further analysis of the breakpoints revealed that this rearrangement was a product of Alu-mediated recombination. Our findings identified a novel Alu-mediated rearrangement within the MSH2 gene and showed that large deletions or duplications in MLH1 and MSH2 genes are low-frequency mutational events in Southern Italian patients with an inherited predisposition to colon cancer.

  19. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers. (United States)

    Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining


    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.

  20. Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae). (United States)

    Wang, Q Z; Huang, M; Downie, S R; Chen, Z X


    Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family.
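
    The reported gene flow and differentiation values can be related with a small worked example. The abstract does not state which estimator was used; the sketch assumes the commonly used relation Nm = 0.5 (1 - GST) / GST, and the function name `gene_flow_from_gst` is hypothetical.

```python
def gene_flow_from_gst(gst: float) -> float:
    """Estimate gene flow (Nm) from GST.

    Assumed estimator (not confirmed by the abstract):
        Nm = 0.5 * (1 - GST) / GST
    """
    if not 0.0 < gst <= 1.0:
        raise ValueError("GST must be in the interval (0, 1]")
    return 0.5 * (1.0 - gst) / gst


# GST = 0.2500 as reported in the abstract.
print(gene_flow_from_gst(0.2500))
```

    With GST = 0.2500 this relation gives Nm = 1.5, which is close to the reported Nm = 1.4996; the small difference would be consistent with rounding of GST.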

  1. AluY-mediated germline deletion, duplication and somatic stem cell reversion in UBE2T defines a new subtype of Fanconi anemia. (United States)

    Virts, Elizabeth L; Jankowska, Anna; Mackay, Craig; Glaas, Marcel F; Wiek, Constanze; Kelich, Stephanie L; Lottmann, Nadine; Kennedy, Felicia M; Marchal, Christophe; Lehnert, Erik; Scharf, Rüdiger E; Dufour, Carlo; Lanciotti, Marina; Farruggia, Piero; Santoro, Alessandra; Savasan, Süreyya; Scheckenbach, Kathrin; Schipper, Jörg; Wagenmann, Martin; Lewis, Todd; Leffak, Michael; Farlow, Janice L; Foroud, Tatiana M; Honisch, Ellen; Niederacher, Dieter; Chakraborty, Sujata C; Vance, Gail H; Pruss, Dmitry; Timms, Kirsten M; Lanchbury, Jerry S; Alpi, Arno F; Hanenberg, Helmut


    Fanconi anemia (FA) is a rare inherited disorder clinically characterized by congenital malformations, progressive bone marrow failure and cancer susceptibility. At the cellular level, FA is associated with hypersensitivity to DNA-crosslinking genotoxins. Eight of 17 known FA genes assemble the FA E3 ligase complex, which catalyzes monoubiquitination of FANCD2 and is essential for replicative DNA crosslink repair. Here, we identify the first FA patient with biallelic germline mutations in the ubiquitin E2 conjugase UBE2T. Both mutations were AluY-mediated: a paternal deletion and maternal duplication of exons 2-6. These loss-of-function mutations in UBE2T induced a cellular phenotype similar to biallelic defects in early FA genes with the absence of FANCD2 monoubiquitination. The maternal duplication produced a mutant mRNA that could encode a functional protein but was degraded by nonsense-mediated mRNA decay. In the patient's hematopoietic stem cells, the maternal allele with the duplication of exons 2-6 spontaneously reverted to a wild-type allele by monoallelic recombination at the duplicated AluY repeat, thereby preventing bone marrow failure. Analysis of germline DNA of 814 normal individuals and 850 breast cancer patients for deletion or duplication of UBE2T exons 2-6 identified the deletion in only two controls, suggesting AluY-mediated recombinations within the UBE2T locus are rare and not associated with an increased breast cancer risk. Finally, a loss-of-function germline mutation in UBE2T was detected in a high-risk breast cancer patient with wild-type BRCA1/2. Cumulatively, we identified UBE2T as a bona fide FA gene (FANCT) that also may be a rare cancer susceptibility gene. © The Author 2015. Published by Oxford University Press.

  2. Tandemly repeated sequence in 5'end of mtDNA control region of ...

    African Journals Online (AJOL)



    Dec 17, 2008 ... chain reaction (PCR). Japanese Spanish ... mainly covered general ecology and fishery biology. No study concerning the ... Conserved sequence blocks and the repeat units are indicated by boxes. performed using the exact ...

  3. Inverted repeats in the promoter as an autoregulatory sequence for TcrX in Mycobacterium tuberculosis

    International Nuclear Information System (INIS)

    Bhattacharya, Monolekha; Das, Amit Kumar


    Highlights: ► The regulatory sequences recognized by TcrX have been identified. ► The regulatory region comprises inverted repeats separated by a 30 bp region. ► The mode of binding of TcrX with the regulatory sequence is unique. ► In silico TcrX–DNA docked model binds one of the inverted repeats. ► Both phosphorylated and unphosphorylated TcrX bind the regulatory sequence in vitro. -- Abstract: TcrY, a histidine kinase, and TcrX, a response regulator, constitute a two-component system in Mycobacterium tuberculosis. tcrX, which is expressed during iron scarcity, is instrumental in the survival of iron-dependent M. tuberculosis. However, the regulator of tcrX/Y has not been fully characterized. Crosslinking studies of TcrX reveal that it can form oligomers in vitro. Electrophoretic mobility shift assays (EMSAs) show that TcrX recognizes two regions in the promoter that are comprised of inverted repeats separated by ∼30 bp. The dimeric in silico model of TcrX predicts binding to one of these inverted repeat regions. Site-directed mutagenesis and radioactive phosphorylation indicate that D54 of TcrX is phosphorylated by H256 of TcrY. However, phosphorylated and unphosphorylated TcrX bind the regulatory sequence with equal efficiency, which was shown with an EMSA using the D54A TcrX mutant.

  4. TRDistiller: a rapid filter for enrichment of sequence datasets with proteins containing tandem repeats. (United States)

    Richard, François D; Kajava, Andrey V


    The dramatic growth of sequencing data evokes an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so called tandem repeats or TRs). These proteins frequently fold into elongated fibrous structures carrying different fundamental functions. Algorithms specific to the analysis of these regions are urgently required since the conventional approaches developed for globular domains have had limited success when applied to the TR regions. The protein TRs are frequently not perfect, containing a number of mutations, and some of them cannot be easily identified. To detect such "hidden" repeats several algorithms have been developed. However, the most sensitive among them are time-consuming and, therefore, inappropriate for large scale proteome analysis. To speed up the TR detection we developed a rapid filter that is based on the comparison of composition and order of short strings in the adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins which are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset enriching it with TR-containing proteins which allows a faster subsequent TR detection by other methods. The program is available upon request. Copyright © 2014 Elsevier Inc. All rights reserved.
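
    The general idea of a rapid pre-filter based on short strings shared between adjacent sequence segments can be sketched as follows. This is not the TRDistiller algorithm, whose details are not given in the abstract; it is a simplified stand-in that flags a sequence when many short strings recur at one fixed offset. The function name `has_candidate_repeat` and all parameter defaults are hypothetical.

```python
def has_candidate_repeat(seq: str, k: int = 3, min_hits: int = 4,
                         max_period: int = 50) -> bool:
    """Flag a sequence that may contain tandem repeats.

    For each candidate period d, count positions i where the k-mer starting
    at i equals the k-mer starting at i + d. Many matches at a single offset
    suggest adjacent, similar sequence motifs. Illustrative heuristic only.
    """
    n = len(seq)
    for d in range(1, min(max_period, n - k) + 1):
        hits = sum(
            1
            for i in range(n - k - d + 1)
            if seq[i:i + k] == seq[i + d:i + d + k]
        )
        if hits >= min_hits:
            return True
    return False


# A collagen-like periodic toy sequence versus a non-repetitive one.
print(has_candidate_repeat("GPPGPPGPPGPP"))
print(has_candidate_repeat("ACDEFGHIKLMNPQRSTVWY"))
```

    A cheap test of this kind would run before a slower, more sensitive detector, so that only flagged sequences incur the expensive analysis.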

  5. Mapping a mathematical expression onto a Montium ALU using GNU Bison

    NARCIS (Netherlands)

    Rosien, M.A.J.; Smit, Gerardus Johannes Maria


    The Montium processing tile [1], [4] contains a number of complex ALUs which can perform many different operations in many different ways. In the Chameleon tool flow [2], it is necessary to automatically determine whether a certain mathematical expression can be mapped onto an ALU and to

  6. Interference by clustered regularly interspaced short palindromic repeat (CRISPR) RNA is governed by a seed sequence

    NARCIS (Netherlands)

    Semenova, E.V.; Jore, M.M.; Westra, E.R.; Oost, van der J.; Brouns, S.J.J.


    Prokaryotic clustered regularly interspaced short palindromic repeat (CRISPR)/Cas (CRISPR-associated sequences) systems provide adaptive immunity against viruses when a spacer sequence of small CRISPR RNA (crRNA) matches a protospacer sequence in the viral genome. Viruses that escape CRISPR/Cas

  7. Polymorphic Alu Insertion/Deletion in Different Caste and Tribal Populations from South India. (United States)

    Chinniah, Rathika; Vijayan, Murali; Thirunavukkarasu, Manikandan; Mani, Dhivakar; Raju, Kamaraj; Ravi, Padma Malini; Sivanadham, Ramgopal; C, Kandeepan; N, Mahalakshmi; Karuppiah, Balakrishnan


    Seven human-specific Alu markers were studied in 574 unrelated individuals from 10 endogamous groups and 2 hill tribes of Tamil Nadu and Kerala states. DNA was isolated, amplified by PCR-SSP, and subjected to agarose gel electrophoresis, and genotypes were assigned for various Alu loci. Average heterozygosity among caste populations was in the range of 0.292-0.468. Among tribes, the average heterozygosity was higher for Paliyan (0.3759) than for Kani (0.2915). Frequency differences were prominent in all loci studied except Alu CD4. For Alu CD4, the frequency was 0.0363 in Yadavas, a traditional pastoral and herd maintaining population, and 0.2439 in Narikuravars, a nomadic gypsy population. The overall genetic difference (Gst) of 12 populations (castes and tribes) studied was 3.6%, which corresponds to the Gst values of 3.6% recorded earlier for Western Asian populations. Thus, our study confirms the genetic similarities between West Asian populations and South Indian castes and tribes and supported the large scale coastal migrations from Africa into India through West Asia. However, the average genetic difference (Gst) of Kani and Paliyan tribes with other South Indian tribes studied earlier was 8.3%. The average Gst of combined South and North Indian Tribes (CSNIT) was 9.5%. Neighbor joining tree constructed showed close proximity of Kani and Paliyan tribal groups to the other two South Indian tribes, Toda and Irula of Nilgiri hills studied earlier. Further, the analysis revealed the affinities among populations and confirmed the presence of North and South India specific lineages. Our findings have documented the highly diverse (micro differentiated) nature of South Indian tribes, predominantly due to isolation, than the endogamous population groups of South India. Thus, our study firmly established the genetic relationship of South Indian castes and tribes and supported the proposed large scale ancestral migrations from Africa, particularly into South India
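
    For a dimorphic Alu locus (insertion present or absent), expected heterozygosity follows directly from the allele frequency. The sketch assumes Hardy-Weinberg proportions and the simple biallelic formula H = 2p(1 - p); the abstract does not say which estimator the authors used, and the function name `expected_heterozygosity` is hypothetical.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity at a biallelic locus with allele frequency p.

    Assumes Hardy-Weinberg proportions: H = 2 * p * (1 - p).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 2.0 * p * (1.0 - p)


# Alu CD4 frequencies quoted in the abstract for two populations.
for population, freq in (("Yadavas", 0.0363), ("Narikuravars", 0.2439)):
    print(population, round(expected_heterozygosity(freq), 4))
```

    The formula is symmetric in p and 1 - p, so it gives the same value whether the quoted frequency refers to the insertion or to the empty allele.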

  8. MSDB: A Comprehensive Database of Simple Sequence Repeats. (United States)

    Avvaru, Akshay Kumar; Saxena, Saketh; Sowpati, Divya Tej; Mishra, Rakesh Kumar


    Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1-6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
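
    The definition of SSRs as tandem repeats of 1-6 nt motifs lends itself to a short illustration. The sketch finds perfect repeats only, using a regular expression with a backreference; it is not how MSDB was built, and the function name `find_ssrs` and the minimum repeat count are assumptions.

```python
import re


def find_ssrs(sequence: str, min_repeats: int = 3, max_motif: int = 6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier tries the shortest motif first, so "ACACACAC" is
    reported as motif "AC" rather than "ACAC".
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


print(find_ssrs("TTACACACACGG"))
print(find_ssrs("AAGAAGAAG"))
```

    Real SSR pipelines usually apply motif-specific minimum lengths and may tolerate imperfect repeats, neither of which this sketch attempts.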

  9. Simple sequence repeat (SSR) markers are effective for identifying ...

    African Journals Online (AJOL)

    DNA was extracted from newly formed leaves and amplified using 21 simple sequence repeat (SSR) markers (NH001c, NH002b, NH005b, NH007b, NH008b, NH009b, NH011b, NH013b, NH012a, NH014a, NH015a, NH017a, KA4b, KA5, KA14, KA16, KB16, KU10, BGA35, BGT23b and HGA8b). The data was analyzed by ...

  10. Innovation in analog flow controller design | Alu | Nigerian Journal of ...

    African Journals Online (AJOL)

    N Alu, MG Zebaze Kana, AA Oberafo, D Obi. Abstract. No Abstract. Nigerian Journal of Physics Vol. 20 (1) 2008: pp.69-75.

  11. Repeated-Sprint Sequences During Female Soccer Matches Using Fixed and Individual Speed Thresholds. (United States)

    Nakamura, Fábio Y; Pereira, Lucas A; Loturco, Irineu; Rosseti, Marcelo; Moura, Felipe A; Bradley, Paul S


    Nakamura, FY, Pereira, LA, Loturco, I, Rosseti, M, Moura, FA, and Bradley, PS. Repeated-sprint sequences during female soccer matches using fixed and individual speed thresholds. J Strength Cond Res 31(7): 1802-1810, 2017. The main objective of this study was to characterize the occurrence of single sprint and repeated-sprint sequences (RSS) during elite female soccer matches, using fixed (20 km·h⁻¹) and individually based speed thresholds (>90% of the mean speed from a 20-m sprint test). Eleven elite female soccer players from the same team participated in the study. All players performed a 20-m linear sprint test, and were assessed in up to 10 official matches using Global Positioning System technology. Magnitude-based inferences were used to test for meaningful differences. Results revealed that irrespective of adopting fixed or individual speed thresholds, female players produced only a few RSS during matches (2.3 ± 2.4 sequences using the fixed threshold and 3.3 ± 3.0 sequences using the individually based threshold), with most sequences comprising just 2 sprints. Additionally, central defenders performed fewer sprints (10.2 ± 4.1) than other positions (fullbacks: 28.1 ± 5.5; midfielders: 21.9 ± 10.5; forwards: 31.9 ± 11.1; with the differences being likely to almost certainly associated with effect sizes ranging from 1.65 to 2.72), and sprinting ability declined in the second half. The data do not support the notion that RSS occurs frequently during soccer matches in female players, irrespective of using fixed or individual speed thresholds to define sprint occurrence. However, repeated-sprint ability development cannot be ruled out from soccer training programs because of its association with match-related performance.

  12. Expressed Sequence Tag-Simple Sequence Repeat (EST-SSR) Marker Resources for Diversity Analysis of Mango (Mangifera indica L.)

    Directory of Open Access Journals (Sweden)

    Natalie L. Dillon


    In this study, a collection of 24,840 expressed sequence tags (ESTs) generated from five mango (Mangifera indica L.) cDNA libraries was mined for EST-based simple sequence repeat (SSR) markers. Over 1,000 ESTs with SSR motifs were detected from more than 24,000 EST sequences, with di- and tri-nucleotide repeat motifs the most abundant. Of these, 25 EST-SSRs in genes involved in plant development, stress response, and fruit color and flavor development pathways were selected, developed into PCR markers and characterized in a population of 32 mango selections including M. indica varieties and related Mangifera species. Twenty-four of the 25 EST-SSR markers exhibited polymorphisms, identifying a total of 86 alleles with an average of 5.38 alleles per locus, and distinguished between all Mangifera selections. Private alleles were identified for Mangifera species. These newly developed EST-SSR markers enhance the current 11 SSR mango genetic identity panel utilized by the Australian Mango Breeding Program. The current panel has been used to identify progeny and parents for selection, and the application of this extended panel will further improve and help to design mango hybridization strategies for increased breeding efficiency.

  13. Genomic organization and developmental fate of adjacent repeated sequences in a foldback DNA clone of Tetrahymena thermophila

    International Nuclear Information System (INIS)

    Tschunko, A.H.; Loechel, R.H.; McLaren, N.C.; Allen, S.L.


    DNA sequence elimination and rearrangement occurs during the development of somatic cell lineages of eukaryotes and was first discovered over a century ago. However, the significance and mechanism of chromatin elimination are not understood. DNA elimination also occurs during the development of the somatic macronucleus from the germinal micronucleus in unicellular ciliated protozoa such as Tetrahymena thermophila. In this study foldback DNA from the micronucleus was used as a probe to isolate ten clones. All of those tested (4/4) contained sequences that were repetitive in the micronucleus and rearranged in the macronucleus. Inverted repeated sequences were present in one clone. This clone, pTtFBl, was subjected to a detailed analysis of its developmental fate. Subregions were subcloned and used as probes against Southern blots of micronuclear and macronuclear DNA. DNA was labeled with [33P]-labeled dATP. The authors found that all subregions defined repeated sequence families in the micronuclear genome. A minimum of four different families was defined, two of which are retained in the macronucleus and two of which are completely eliminated. The inverted repeat family is retained with little rearrangement. Two of the families, defined by subregions that do not contain parts of the inverted repeat, are totally eliminated during macronuclear development and contain open reading frames. The significance of retained inverted repeats to the process of elimination is discussed.

  14. RePS: a sequence assembler that masks exact repeats identified from the shotgun data

    DEFF Research Database (Denmark)

    Wang, Jun; Wong, Gane Ka-Shu; Ni, Peixiang


    We describe a sequence assembler, RePS (repeat-masked Phrap with scaffolding), that explicitly identifies exact 20mer repeats from the shotgun data and removes them prior to the assembly. The established software is used to compute meaningful error probabilities for each base. Clone-end-pairing information is used to construct scaffolds that order and orient the contigs. We show with real data for human and rice that reasonable assemblies are possible even at coverages of only 4x to 6x, despite having up to 42.2% in exact repeats. Publication date: May 2002.
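
    The idea of identifying exact k-mer repeats in shotgun reads and masking them before assembly can be sketched as follows. This is not the RePS implementation: the function names, the `min_count` threshold and the use of `N` as the mask character are all assumptions made for illustration. With real shotgun data the threshold would have to sit well above the sequencing coverage, since every unique k-mer is expected to appear in several overlapping reads.

```python
from collections import Counter


def repeated_kmers(reads, k=20, min_count=2):
    """Return the set of k-mers seen at least min_count times across all reads."""
    counts = Counter(
        read[i:i + k]
        for read in reads
        for i in range(len(read) - k + 1)
    )
    return {kmer for kmer, n in counts.items() if n >= min_count}


def mask_read(read, repeats, k=20, mask_char="N"):
    """Replace every base covered by a repeated k-mer with mask_char."""
    masked = list(read)
    for i in range(len(read) - k + 1):
        if read[i:i + k] in repeats:
            masked[i:i + k] = mask_char * k
    return "".join(masked)


if __name__ == "__main__":
    # Toy example with k=4 so the behaviour is visible on short strings
    reads = ["ACGTTTGA", "CCGTTTCC"]
    repeats = repeated_kmers(reads, k=4, min_count=2)
    print([mask_read(r, repeats, k=4) for r in reads])
```

    Masking with a placeholder keeps read coordinates intact, which is one simple way to let a downstream assembler ignore the repeated stretches.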

  15. Gene flow and genetic structure in the Galician population (NW Spain) according to Alu insertions

    Directory of Open Access Journals (Sweden)

    Diéguez Lois


    Background: The most recent Alu insertions reveal different degrees of polymorphism in human populations, and a series of characteristics that make them particularly suitable genetic markers for Human Biology studies. This has led these polymorphisms to be used to analyse the origin and phylogenetic relationships between contemporary human groups. This study analyses twelve Alu sequences in a sample of 216 individuals from the autochthonous population of Galicia (NW Spain), with the aim of studying their genetic structure and phylogenetic position with respect to the populations of Western and Central Europe and North Africa, research that is of special interest in revealing European population dynamics, given the peculiarities of the Galician population due to its geographical situation in western Europe, and its historical vicissitudes. Results: The insertion frequencies of eleven of the Alu elements analysed were within the variability range of European populations, while Yb8NBC125 proved to be the lowest recorded to date in Europe. Taking the twelve polymorphisms into account, the GD value for the Galician population was 0.268. The comparative analyses carried out using the MDS, NJ and AMOVA methods reveal the existence of spatial heterogeneity, and identify three population groups that correspond to the geographic areas of Western-Central Europe, Eastern Mediterranean Europe and North Africa. Galicia is shown to be included in the Western-Central European cluster, together with other Spanish populations. When only considering populations from Mediterranean Europe, the Galician population revealed a degree of genetic flow similar to that of the majority of the populations from this geographic area. Conclusion: The results of this study reveal that the Galician population, despite its geographic situation in the western edge of the European continent, occupies an intermediate position in relation to other European populations in

  16. Tools for analyzing genetic variants from sequencing data Case study: short tandem repeats


    Gymrek, Melissa


    This was presented as a BitesizeBio Webinar entitled "Tools for analyzing genetic variants from sequencing data Case study: short tandem repeats". Accompanying scripts can be accessed on GitHub: 

  17. In silico analysis of Simple Sequence Repeats from chloroplast genomes of Solanaceae species

    Directory of Open Access Journals (Sweden)

    Evandro Vagner Tambarussi


    The availability of chloroplast genome (cpDNA) sequences of Atropa belladonna, Nicotiana sylvestris, N. tabacum, N. tomentosiformis, Solanum bulbocastanum, S. lycopersicum and S. tuberosum, which are Solanaceae species, allowed us to analyze the organization of cpSSRs in their genic and intergenic regions. In general, the number of cpSSRs in cpDNA ranged from 161 in S. tuberosum to 226 in N. tabacum, and the number of intergenic cpSSRs was higher than genic cpSSRs. The mononucleotide repeats were the most frequent in studied species, but we also identified di-, tri-, tetra-, penta- and hexanucleotide repeats. Multiple alignments of all cpSSRs sequences from Solanaceae species made the identification of nucleotide variability possible and the phylogeny was estimated by maximum parsimony. Our study showed that the plastome database can be exploited for phylogenetic analysis and biotechnological approaches.

  18. simple sequence repeats (EST-SSR)

    African Journals Online (AJOL)



    Jan 19, 2012 ... 212 primer pairs selected, based on repeat patterns of n≥8 for di-, tri-, tetra- and penta-nucleotide repeat ... Cluster analysis revealed a high genetic similarity among the sugarcane (Saccharum spp.) breeding lines which could reduce the genetic gain in ..... The multiple allele characteristic of SSR com-.

  19. Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.) (United States)

    Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...

  20. Exploiting BAC-end sequences for the mining, characterization and utility of new short sequences repeat (SSR) markers in Citrus. (United States)

    Biswas, Manosh Kumar; Chai, Lijun; Mayer, Christoph; Xu, Qiang; Guo, Wenwu; Deng, Xiuxin


    The aim of this study was to develop a large set of microsatellite markers based on publicly available BAC-end sequences (BESs), and to evaluate their transferability, discriminating capacity of genotypes and mapping ability in Citrus. A set of 1,281 simple sequence repeat (SSR) markers was developed from the 46,339 Citrus clementina BAC-end sequences (BES); of these, 20.67% contained SSRs longer than 20 bp, corresponding to roughly one perfect SSR per 2.04 kb. The most abundant motifs were di-nucleotide (16.82%) repeats. Among all repeat motifs, (TA/AT)n is the most abundant (8.38%), followed by (AG/CT)n (4.51%). Most of the BES-SSRs are located in the non-coding region, but 1.3% of BES-SSRs were found to be associated with transposable elements (TEs). A total of 400 novel SSR primer pairs were synthesized and their transferability and polymorphism tested on a set of 16 Citrus and Citrus-relative species. Among these, 333 (83.25%) were successfully amplified and 260 (65.00%) showed cross-species transferability with Poncirus trifoliata and Fortunella sp. These cross-species transferable markers could be useful for cultivar identification and for genomic study of Citrus, Poncirus and Fortunella sp. The utility of the developed SSR markers was demonstrated by identifying a set of 118 markers each for construction of linkage maps of Citrus reticulata and Poncirus trifoliata. Genetic diversity and phylogenetic relationships among 40 Citrus and related species were assessed with the aid of 25 randomly selected SSR primer pairs, and the results revealed that citrus genomic SSRs are superior to genic SSRs for genetic diversity and germplasm characterization of Citrus spp.

  1. ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants. (United States)

    Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh


    Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1-6 nucleotides. These repeats are ubiquitously present and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for the detection of SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, which is a relational database developed using SQL Server 2008 and accessed through ASP.NET. It provides information on all three types (perfect, imperfect and compound) of SSRs. At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Out of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, in each species statistical analyses were performed to calculate relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: © The Author(s) 2014. Published by Oxford University Press.

  2. Survival of Saccharomyces cerevisiae after treatment with the restriction endonuclease Alu I

    International Nuclear Information System (INIS)

    Winckler, K.; Bach, B.; Obe, G.


    Treatment of yeast cells proficient in the repair of radiation damage (Saccharomyces cerevisiae) with the restriction endonuclease Alu I leads to a positive dose-effect relationship between inactivation level and enzyme concentration. The data suggest an uptake of the active restriction enzyme into the cells and a relationship between induction of DNA double-strand breaks and cell killing. (author)

  3. Alu insertion polymorphisms in the African Sahel and the origin of Fulani pastoralists

    Czech Academy of Sciences Publication Activity Database

    Čížková, M.; Hofmanová, Z.; Mokhtar, M. G.; Janoušek, V.; Diallo, I.; Munclinger, P.; Černý, Viktor


    Vol. 44, No. 6 (2017), pp. 537-545 ISSN 0301-4460 R&D Projects: GA ČR GA13-37998S Institutional support: RVO:67985912 Keywords: Alu insertions * Fulani nomads * Western African pastoralism * African Sahel Subject RIV: AC - Archeology, Anthropology, Ethnology OECD field: Archaeology Impact factor: 1.240, year: 2016

  4. Cis-acting regulatory sequences promote high-frequency gene conversion between repeated sequences in mammalian cells. (United States)

    Raynard, Steven J; Baker, Mark D


    In mammalian cells, little is known about the nature of recombination-prone regions of the genome. Previously, we reported that the immunoglobulin heavy chain (IgH) mu locus behaved as a hotspot for mitotic, intrachromosomal gene conversion (GC) between repeated mu constant (Cmu) regions in mouse hybridoma cells. To investigate whether elements within the mu gene regulatory region were required for hotspot activity, gene targeting was used to delete a 9.1 kb segment encompassing the mu gene promoter (Pmu), enhancer (Emu) and switch region (Smu) from the locus. In these cell lines, GC between the Cmu repeats was significantly reduced, indicating that this 'recombination-enhancing sequence' (RES) is necessary for GC hotspot activity at the IgH locus. Importantly, the RES fragment stimulated GC when appended to the same Cmu repeats integrated at ectopic genomic sites. We also show that deletion of Emu and flanking matrix attachment regions (MARs) from the RES abolishes GC hotspot activity at the IgH locus. However, no stimulation of ectopic GC was observed with the Emu/MARs fragment alone. Finally, we provide evidence that no correlation exists between the level of transcription and GC promoted by the RES. We suggest a model whereby Emu/MARs enhances mitotic GC at the endogenous IgH mu locus by effecting chromatin modifications in adjacent DNA.

  5. Genetic admixture estimates by Alu elements in Afro-Colombian and Mestizo populations from Antioquia, Colombia. (United States)

    Gómez-Pérez, Luis; Alfonso-Sánchez, Miguel A; Pérez-Miranda, Ana M; García-Obregón, Susana; Builes, Juan J; Bravo, Maria L; De Pancorbo, Marian M; Peña, José A


    This work was intended to gain insights into the admixture processes occurring in Latin American populations by examining the genetic profiles of two ethnic groups from Antioquia (Colombia). To analyse the genetic variability, eight Alu insertions were typed in 64 Afro-Colombians and a reference group of 34 Hispanics (Mestizos). Admixture proportions were estimated using the Weighted Least Squares and the Gene Identity methods. The usefulness of the Alu elements as Ancestry Informative Markers (AIMs) was evaluated through differences in weighted allelic frequencies (delta values) and by hierarchical analysis of the molecular variance (AMOVA). The Afro-Colombian gene pool was largely determined by the African component (88.5-88.8%), but the most prominent feature was the null contribution of European genes. Mestizos were characterized by a major European component (60.0-63.8%) and a comparatively low proportion of Amerindian (19.2-20.7%) and African (17.0-19.3%) genes. Five of the Alu loci examined (ACE, APO, FXIIIB, PV92 and TPA25) showed an adequate resolving power to differentiate between continental groups, as indicated by delta values and AMOVA results. The peculiarity of the Afro-Colombian gene pool seems to be associated with intense genetic drift episodes that occurred in isolated communities founded by small groups of runaway slaves. ACE, APO, FXIIIB, PV92 and TPA25 could be efficiently utilized in studies dealing with demographic history and biogeographical ancestry in human populations.

  6. [Utility of chromosome banding with ALU I enzyme for identifying methylated areas in breast cancer]. (United States)

    Rojas-Atencio, Alicia; Yamarte, Leonard; Urdaneta, Karelis; Soto-Alvarez, Marisol; Alvarez Nava, Francisco; Cañizalez, Jenny; Quintero, Maribel; Atencio, Raquel; González, Richard


    Cancer is a group of disorders characterized by uncontrolled cell growth which is produced by two successive events: increased cell proliferation (tumor or neoplasia) and the invasive capacity of these cells (metastasis). DNA methylation is an epigenetic process which has been implicated as an important pathogenic factor in cancer. DNA methylation participates in the regulation of gene expression, directly, by preventing the binding of transcription factors, and indirectly, by promoting the "closed" structure of chromatin. The objectives of this study were to identify hypermethylated chromosomal regions through the use of the restriction endonuclease Alu I, and to relate these regions cytogenetically to tumor suppressor gene loci. Sixty peripheral blood samples of females with breast cancer were analyzed. Cell cultures were performed and cytogenetic spreads, previously digested with Alu I enzyme, were stained with Giemsa. Centromeric and non-centromeric chromosomal regions were stained in 37% of cases. About 96% of stained hypermethylated chromosomal regions (1q, 2q, 6q) were linked with methylated genes associated with breast cancer. In addition, centromeric regions in chromosomes 3, 4, 8, 13, 14, 15 and 17, usually unstained, were found positive to digestion with Alu I enzyme and Giemsa staining. We suggest the importance of this technique for the global visualization of the genome, which can find methylated genes related to breast cancer, and thus lead to a specific therapy, and therefore a better therapeutic response.

  7. C-terminal low-complexity sequence repeats of Mycobacterium smegmatis Ku modulate DNA binding. (United States)

    Kushwaha, Ambuj K; Grove, Anne


    Ku protein is an integral component of the NHEJ (non-homologous end-joining) pathway of DSB (double-strand break) repair. Both eukaryotic and prokaryotic Ku homologues have been characterized and shown to bind DNA ends. A unique feature of Mycobacterium smegmatis Ku is its basic C-terminal tail that contains several lysine-rich low-complexity PAKKA repeats that are absent from homologues encoded by obligate parasitic mycobacteria. Such PAKKA repeats are also characteristic of mycobacterial Hlp (histone-like protein) for which they have been shown to confer the ability to appose DNA ends. Unexpectedly, removal of the lysine-rich extension enhances DNA-binding affinity, but an interaction between DNA and the PAKKA repeats is indicated by the observation that only full-length Ku forms multiple complexes with a short stem-loop-containing DNA previously designed to accommodate only one Ku dimer. The C-terminal extension promotes DNA end-joining by T4 DNA ligase, suggesting that the PAKKA repeats also contribute to efficient end-joining. We suggest that low-complexity lysine-rich sequences have evolved repeatedly to modulate the function of unrelated DNA-binding proteins.

  8. Iron Toxicity in the Retina Requires Alu RNA and the NLRP3 Inflammasome

    Directory of Open Access Journals (Sweden)

    Bradley D. Gelfand


    Excess iron induces tissue damage and is implicated in age-related macular degeneration (AMD). Iron toxicity is widely attributed to hydroxyl radical formation through Fenton's reaction. We report that excess iron, but not other Fenton catalytic metals, induces activation of the NLRP3 inflammasome, a pathway also implicated in AMD. Additionally, iron-induced degeneration of the retinal pigmented epithelium (RPE) is suppressed in mice lacking inflammasome components caspase-1/11 or Nlrp3 or by inhibition of caspase-1. Iron overload increases abundance of RNAs transcribed from short interspersed nuclear elements (SINEs): Alu RNAs and the rodent equivalent B1 and B2 RNAs, which are inflammasome agonists. Targeting Alu or B2 RNA prevents iron-induced inflammasome activation and RPE degeneration. Iron-induced SINE RNA accumulation is due to suppression of DICER1 via sequestration of the co-factor poly(C)-binding protein 2 (PCBP2). These findings reveal an unexpected mechanism of iron toxicity, with implications for AMD and neurodegenerative diseases associated with excess iron.

  9. Genome-Wide Characterization of Simple Sequence Repeat (SSR) Loci in Chinese Jujube and Jujube SSR Primer Transferability (United States)

    Xiao, Jing; Zhao, Jin; Liu, Mengjun; Liu, Ping; Dai, Li; Zhao, Zhihui


    Chinese jujube (Ziziphus jujuba), an economically important species in the Rhamnaceae family, is a popular fruit tree in Asia. Here, we surveyed and characterized simple sequence repeats (SSRs) in the jujube genome. A total of 436,676 SSR loci were identified, with an average distance of 0.93 Kb between the loci. A large proportion of the SSRs included mononucleotide, dinucleotide and trinucleotide repeat motifs, which accounted for 64.87%, 24.40%, and 8.74% of all repeats, respectively. Among the mononucleotide repeats, A/T was the most common, whereas AT/TA was the most common dinucleotide repeat. A total of 30,565 primer pairs were successfully designed and screened using a series of criteria. Moreover, 725 of 1,000 randomly selected primer pairs were effective among 6 cultivars, and 511 of these primer pairs were polymorphic. Sequencing the amplicons of two SSRs across three jujube cultivars revealed variations in the repeats. The transferability test of jujube SSR primers showed that 35/64 SSRs could be transferred across the family boundary. Using jujube SSR primers, clustering analysis results from 15 species were highly consistent with the Angiosperm Phylogeny Group (APGIII) System. The genome-wide characterization of SSRs in Chinese jujube is very valuable for whole-genome characterization and marker-assisted selection in jujube breeding. In addition, the transferability of jujube SSR primers could provide a solid foundation for their further utilization. PMID:26000739

  10. Effects of loading sequences and size of repeated stress block of loads on fatigue life calculated using fatigue functions

    International Nuclear Information System (INIS)

    Schott, G.


    It is well known that collective form, stress intensity and loading sequence of individual stresses, as well as the size of repeated stress blocks, can significantly influence fatigue life. The basic variant of the consecutive Woehler curve concept permits these effects to be included in fatigue life computation. The paper demonstrates that fatigue life computations using fatigue functions reflect the loading sequence effect with multilevel loading precisely and provide reliable fatigue life data. Effects of size of repeated stress block and loading sequence on fatigue life as observed with block program tests can be reproduced using the new computation method. (orig.)

  11. Analysis of the 9p21.3 sequence associated with coronary artery disease reveals a tendency for duplication in a CAD patient (United States)

    Kouprina, Natalay; Noskov, Vladimir N.; Waterfall, Joshua J.; Walker, Robert L.; Meltzer, Paul S.; Topol, Eric J.; Larionov, Vladimir


    Tandem segmental duplications (SDs) greater than 10 kb are widespread in complex genomes. They provide material for gene divergence and evolutionary adaptation, while formation of specific de novo SDs is a hallmark of cancer and some human diseases. Most SDs map to distinct genomic regions termed 'duplication blocks'. SD organization within these blocks is often poorly characterized as they are mosaics of ancestral duplicons juxtaposed with younger duplicons arising from more recent duplication events. Structural and functional analysis of SDs is further hampered as long repetitive DNA structures are underrepresented in existing BAC and YAC libraries. We applied Transformation-Associated Recombination (TAR) cloning, a versatile technique for large DNA manipulation, to selectively isolate the coronary artery disease (CAD) interval sequence within the 9p21.3 chromosome locus from a patient with coronary artery disease and normal individuals. Four tandem head-to-tail duplicons, each ∼50 kb long, were recovered in the patient but not in normal individuals. Sequence analysis revealed that the repeats differed from each other by 10-15 SNPs and from the human genome sequence (version hg19) by 82 SNPs. SNP polymorphism within the junctions between repeats allowed two junction types to be distinguished, Type 1 and Type 2, which were found at a 2:1 ratio. The junction sequences contained an Alu element, a sequence previously shown to play a role in duplication. Knowledge of structural variation in the CAD interval from more patients could help link this locus to cardiovascular disease susceptibility, and may be relevant to other cases of regional amplification, including cancer. PMID:29632643

  12. Identification of apple cultivars on the basis of simple sequence repeat markers. (United States)

    Liu, G S; Zhang, Y G; Tao, R; Fang, J G; Dai, H Y


    DNA markers are useful tools that play an important role in plant cultivar identification. They are usually based on polymerase chain reaction (PCR) and include simple sequence repeats (SSRs), inter-simple sequence repeats, and random amplified polymorphic DNA. However, DNA markers were not used effectively in the complete identification of plant cultivars because of the lack of known DNA fingerprints. Recently, a novel approach called the cultivar identification diagram (CID) strategy was developed to facilitate the use of DNA markers to separate plant individuals. The CID was designed so that a polymorphic marker generated from each PCR directly allowed for cultivar sample separation at each step. Therefore, it could be used to identify cultivars and varieties easily with fewer primers. In this study, 60 apple cultivars, including a few main cultivars in fields and varieties from descendants (Fuji x Telamon), were examined. Of the 20 pairs of SSR primers screened, 8 pairs gave reproducible, polymorphic DNA amplification patterns. The banding patterns obtained from these 8 primers were used to construct a CID map. Each cultivar or variety in this study was distinguished from the others completely, indicating that this method can be used for efficient cultivar identification. The result contributed to studies on germplasm resources and the seedling industry in fruit trees.

  13. Distribution and evolution of repeated sequences in genomes of Triatominae (Hemiptera-Reduviidae) inferred from genomic in situ hybridization.

    Directory of Open Access Journals (Sweden)

    Sebastian Pita

    The subfamily Triatominae, vectors of Chagas disease, comprises 140 species characterized by a highly homogeneous chromosome number. We analyzed the chromosomal distribution and evolution of repeated sequences in Triatominae genomes by Genomic in situ Hybridization (GISH) using Triatoma delpontei and Triatoma infestans genomic DNAs as probes. Hybridizations were performed on their own chromosomes and on nine species included in six genera from the two main tribes: Triatomini and Rhodniini. Genomic probes clearly generate two different hybridization patterns, dispersed or accumulated in specific regions or chromosomes. The three used probes generate the same hybridization pattern in each species. However, these patterns are species-specific. In closely related species, the probes strongly hybridized in the autosomal heterochromatic regions, resembling C-banding and DAPI patterns. However, in more distant species these co-localizations are not observed. The heterochromatic Y chromosome is constituted by highly repeated sequences, which are conserved among 10 species of the Triatomini tribe, suggesting that this is an ancestral character for this group. However, the Y chromosome in the Rhodniini tribe is markedly different, supporting the early evolutionary dichotomy between both tribes. In some species, sex chromosomes and autosomes shared repeated sequences, suggesting meiotic chromatin exchanges among these heterologous chromosomes. Our GISH analyses enabled us to acquire not only reliable information about autosomal repeated sequences distribution but also an insight into sex chromosome evolution in Triatominae. Furthermore, the differentiation obtained by GISH might be a valuable marker to establish phylogenetic relationships and to test the controversial origin of the Triatominae subfamily.

  14. Simple sequence repeats in Neurospora crassa: distribution, polymorphism and evolutionary inference

    Directory of Open Access Journals (Sweden)

    Park Jongsun


    Background: Simple sequence repeats (SSRs) have been successfully used for various genetic and evolutionary studies in eukaryotic systems. The eukaryotic model organism Neurospora crassa is an excellent system to study evolution and biological function of SSRs. Results: We identified and characterized 2749 SSRs of 963 SSR types in the genome of N. crassa. The distribution of tri-nucleotide (nt) SSRs, the most common SSRs in N. crassa, was significantly biased in exons. We further characterized the distribution of 19 abundant SSR types (AST), which account for 71% of total SSRs in the N. crassa genome, using a Poisson log-linear model. We also characterized the size variation of SSRs among natural accessions using Polymorphic Index Content (PIC) and ANOVA analyses and found that there are genome-wide, chromosome-dependent and local-specific variations. Using polymorphic SSRs, we have built linkage maps from three line-cross populations. Conclusion: Taking our computational, statistical and experimental data together, we conclude that (1) the distributions of the SSRs in the sequenced N. crassa genome differ systematically between chromosomes as well as between SSR types, (2) the size variation of tri-nt SSRs in exons might be an important mechanism in generating functional variation of proteins in N. crassa, (3) there are different levels of evolutionary forces in variation of amino acid repeats, and (4) SSRs are stable molecular markers for genetic studies in N. crassa.
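
    As an illustration of how perfect SSRs such as those counted above can be located in a sequence, here is a minimal Python sketch. It is not the method used in the study: the function name `find_ssrs` and the minimum repeat counts per motif length are assumptions chosen for the example, and the sketch reports only perfect (uninterrupted) repeats of 1-6 nt motifs.

```python
import re

# Hypothetical thresholds (assumption, not taken from the study):
# motif length -> minimum number of tandem copies to call an SSR.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-nt motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT");
            # those tracts belong to the shorter motif class.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "TTT" + "CAG" * 6 + "GGG" + "AT" * 7 + "C"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

    Because the regular expression reports the leftmost match, the motif is given in the reading frame where the tract starts (for example "GCA" rather than "CAG" if the tract is preceded by a G); grouping such cyclic variants into one SSR type would need an extra normalization step.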

  15. Sequence variations in C9orf72 downstream of the hexanucleotide repeat region and its effect on repeat-primed PCR interpretation

    DEFF Research Database (Denmark)

    Nordin, Angelica; Akimoto, Chizuru; Wuolikainen, Anna


    A large GGGGCC-repeat expansion mutation (HREM) in C9orf72 is the most common known cause of ALS and FTD in European populations. Sequence variations immediately downstream of the HREM region have previously been observed and have been suggested to be one reason for difficulties in interpreting R...

  16. Length and repeat-sequence variation in 58 STRs and 94 SNPs in two Spanish populations. (United States)

    Casals, Ferran; Anglada, Roger; Bonet, Núria; Rasal, Raquel; van der Gaag, Kristiaan J; Hoogenboom, Jerry; Solé-Morata, Neus; Comas, David; Calafell, Francesc


    We have genotyped the 58 STRs (27 autosomal, 24 Y-STRs and 7 X-STRs) and 94 autosomal SNPs in Illumina ForenSeq™ Primer Mix A in 88 Spanish Roma (Gypsy) samples and 143 Catalans. Since this platform is based on massively parallel sequencing, we have used simple R scripts to uncover the sequence variation in the repeat region. Thus, we have found, across 58 STRs, 541 length-based alleles, which, after considering repeat-sequence variation, became 804 different alleles. All loci in both populations were in Hardy-Weinberg equilibrium. F_ST between both populations was 0.0178 for autosomal SNPs, 0.0146 for autosomal STRs, 0.0101 for X-STRs and 0.1866 for Y-STRs. Combined a priori statistics were quite large; for instance, pooling all the autosomal loci, the a priori probabilities of discriminating a suspect become 1-(2.3×10^-70) and 1-(5.9×10^-73), for Roma and Catalans respectively, and the chances of excluding a false father in a trio are 1-(2.6×10^-20) and 1-(2.0×10^-21). Copyright © 2017 Elsevier B.V. All rights reserved.
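
    The difference between length-based and sequence-based allele counts can be illustrated with a short Python sketch. The authors report using R scripts; the function below, its name `count_alleles`, and the toy repeat-region sequences are all hypothetical and only show the counting idea: two reads of the same length but different internal sequence are one allele by length and two alleles by sequence.

```python
def count_alleles(observed):
    """Count alleles across loci by length and by full sequence.

    observed: dict mapping a locus name to an iterable of repeat-region
    sequences seen at that locus (hypothetical input format).
    Returns (length_based_total, sequence_based_total).
    """
    by_length = 0
    by_sequence = 0
    for seqs in observed.values():
        seqs = list(seqs)
        by_length += len({len(s) for s in seqs})   # distinct lengths
        by_sequence += len(set(seqs))              # distinct sequences
    return by_length, by_sequence


if __name__ == "__main__":
    toy = {
        # Two 40-nt alleles that differ by one internal variant, plus a 44-nt allele.
        "locusA": ["TCTA" * 10, "TCTA" * 4 + "TCTG" + "TCTA" * 5, "TCTA" * 11],
        "locusB": ["GATA" * 8, "GATA" * 8],
    }
    print(count_alleles(toy))
```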

  17. Differential effects of simple repeating DNA sequences on gene expression from the SV40 early promoter. (United States)

    Amirhaeri, S; Wohlrab, F; Wells, R D


    The influence of simple repeat sequences, cloned into different positions relative to the SV40 early promoter/enhancer, on the transient expression of the chloramphenicol acetyltransferase (CAT) gene was investigated. Insertion of (G)29.(C)29 in either orientation into the 5'-untranslated region of the CAT gene reduced expression in CV-1 cells 50-100 fold when compared with controls with random sequence inserts. Analysis of CAT-specific mRNA levels demonstrated that the effect was due to a reduction of CAT mRNA production rather than to posttranscriptional events. In contrast, insertion of the same insert in either orientation upstream of the promoter-enhancer or downstream of the gene stimulated gene expression 2-3-fold. These effects could be reversed by cotransfection of a competitor plasmid carrying (G)25.(C)25 sequences. The results suggest that a G.C-binding transcription factor modulates gene expression in this system and that promoter strength can be regulated by providing protein-binding sites in trans. Although constructs containing longer tracts of alternating (C-G), (T-G), or (A-T) sequences inhibited CAT expression when inserted in the 5'-untranslated region of the CAT gene, the amount of CAT mRNA was unaffected. Hence, these inhibitions must be due to posttranscriptional events, presumably at the level of translation. These effects of microsatellite sequences on gene expression are discussed with respect to recent data on related simple repeat sequences which cause several human genetic diseases.

  18. Repetitive elements may comprise over two-thirds of the human genome.

    Directory of Open Access Journals (Sweden)

    A P Jason de Koning


    Transposable elements (TEs) are conventionally identified in eukaryotic genomes by alignment to consensus element sequences. Using this approach, about half of the human genome has been previously identified as TEs and low-complexity repeats. We recently developed a highly sensitive alternative de novo strategy, P-clouds, that instead searches for clusters of high-abundance oligonucleotides that are related in sequence space (oligo "clouds"). We show here that P-clouds predicts >840 Mbp of additional repetitive sequences in the human genome, thus suggesting that 66%-69% of the human genome is repetitive or repeat-derived. To investigate this remarkable difference, we conducted detailed analyses of the ability of both P-clouds and a commonly used conventional approach, RepeatMasker (RM), to detect different sized fragments of the highly abundant human Alu and MIR SINEs. RM can have surprisingly low sensitivity for even moderately long fragments, in contrast to P-clouds, which has good sensitivity down to small fragment sizes (∼25 bp). Although short fragments have a high intrinsic probability of being false positives, we performed a probabilistic annotation that reflects this fact. We further developed "element-specific" P-clouds (ESPs) to identify novel Alu and MIR SINE elements, and using it we identified ∼100 Mb of previously unannotated human elements. ESP estimates of new MIR sequences are in good agreement with RM-based predictions of the amount that RM missed. These results highlight the need for combined, probabilistic genome annotation approaches and suggest that the human genome consists of substantially more repetitive sequence than previously believed.

  19. Characterization of the relationship between APOBEC3B deletion and ACE Alu insertion.

    Directory of Open Access Journals (Sweden)

    Kang Wang

    The insertion/deletion (I/D) polymorphism of the angiotensin converting enzyme (ACE), commonly associated with many diseases, is believed to have affected human adaptation to environmental changes during the out-of-Africa expansion. APOBEC3B (A3B), a member of the cytidine deaminase family APOBEC3s, also exhibits a variable gene insertion/deletion polymorphism across world populations. Using data available from published reports, we examined the global geographic distribution of ACE and A3B genotypes. In tracking the modern human dispersal routes of these two genes, we found that the variation trends of the two I/D polymorphisms were directly correlated. We observed that the frequencies of ACE insertion and A3B deletion rose in parallel along the expansion route. To investigate the presence of a correlation between the two polymorphisms and the effect of their interaction on human health, we analyzed 1199 unrelated Chinese adults to determine their genotypes and other important clinical characteristics. We discovered a significant difference between the ACE genotype/allele distribution in the A3B DD and A3B II/ID groups (P = 0.045 and 0.015, respectively), indicating that the ACE Alu I allele frequency in the former group was higher than in the latter group. No specific clinical phenotype could be associated with the interaction between the ACE and A3B I/D polymorphisms. A3B has been identified as a powerful inhibitor of Alu retrotransposition, and primate A3 genes have undergone strong positive selection (and expansion) for restricting the mobility of endogenous retrotransposons during evolution. Based on these findings, we suggest that the ACE Alu insertion was enabled (facilitated) by the A3B deletion and that functional loss of A3B provided an opportunity for enhanced human adaptability and survival in response to the environmental and climate challenges arising during the migration from Africa.

  20. Development of simple sequence repeat markers and diversity analysis in alfalfa (Medicago sativa L.). (United States)

    Wang, Zan; Yan, Hongwei; Fu, Xinnian; Li, Xuehui; Gao, Hongwen


    Efficient and robust molecular markers are essential for molecular breeding in plants. Compared to dominant and bi-allelic markers, multiple alleles of simple sequence repeat (SSR) markers are particularly informative and superior in genetic linkage map and QTL mapping in autotetraploid species like alfalfa. The objective of this study was to enrich SSR markers directly from alfalfa expressed sequence tags (ESTs). A total of 12,371 alfalfa ESTs were retrieved from the National Center for Biotechnology Information. A total of 774 SSRs were identified in 716 ESTs. On average, one SSR was found per 7.7 kb of EST sequences. Tri-nucleotide repeats (48.8 %) were the most abundant motif type, followed by di- (26.1 %), tetra- (11.5 %), penta- (9.7 %), and hexanucleotide (3.9 %) repeats. One hundred EST-SSR primer pairs were successfully designed and 29 exhibited polymorphism among 28 alfalfa accessions. The allele number per marker ranged from two to 21 with an average of 6.8. The PIC values ranged from 0.195 to 0.896 with an average of 0.608, indicating a high level of polymorphism of the EST-SSR markers. Based on the 29 EST-SSR markers, an assessment of genetic diversity was conducted, which found that Medicago sativa ssp. sativa was clearly different from the other subspecies. High transferability of those EST-SSR markers to related species was also found.
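
    For readers unfamiliar with the PIC statistic quoted above, the following Python sketch computes it from allele frequencies using the widely cited formula PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The abstract does not state which formula or software the authors used, so this is an assumption, and the function name `pic` is hypothetical. It also ignores any tetraploid-specific dosage issues.

```python
def pic(allele_counts):
    """Polymorphism information content for one marker.

    allele_counts: allele counts or frequencies (any positive scale);
    they are normalized to frequencies internally.
    """
    total = float(sum(allele_counts))
    p = [c / total for c in allele_counts]
    # Probability that two random alleles are identical.
    homozygosity = sum(x * x for x in p)
    # Correction term for uninformative matings between identical heterozygotes.
    cross = sum(2 * p[i] ** 2 * p[j] ** 2
                for i in range(len(p))
                for j in range(i + 1, len(p)))
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    print(pic([0.5, 0.5]))          # two equally frequent alleles
    print(pic([10, 10, 10, 10]))    # four equally frequent alleles, given as counts
```

    With two equally frequent alleles the value is 0.375; more alleles at even frequencies push it toward 1, which is why multi-allelic SSR markers tend to score higher than bi-allelic ones.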

  1. Transcription arrest by a G quadruplex forming-trinucleotide repeat sequence from the human c-myb gene. (United States)

    Broxson, Christopher; Beckett, Joshua; Tornaletti, Silvia


    Non canonical DNA structures correspond to genomic regions particularly susceptible to genetic instability. The transcription process facilitates formation of these structures and plays a major role in generating the instability associated with these genomic sites. However, little is known about how non canonical structures are processed when encountered by an elongating RNA polymerase. Here we have studied the behavior of T7 RNA polymerase (T7RNAP) when encountering a G quadruplex forming-(GGA)(4) repeat located in the human c-myb proto-oncogene. To make direct correlations between formation of the structure and effects on transcription, we have taken advantage of the ability of the T7 polymerase to transcribe single-stranded substrates and of G4 DNA to form in single-stranded G-rich sequences in the presence of potassium ions. Under physiological KCl concentrations, we found that T7 RNAP transcription was arrested at two sites that mapped to the c-myb (GGA)(4) repeat sequence. The extent of arrest did not change with time, indicating that the c-myb repeat represented an absolute block and not a transient pause to T7 RNAP. Consistent with G4 DNA formation, arrest was not observed in the absence of KCl or in the presence of LiCl. Furthermore, mutations in the c-myb (GGA)(4) repeat, expected to prevent transition to G4, also eliminated the transcription block. We show T7 RNAP arrest at the c-myb repeat in double-stranded DNA under conditions mimicking the cellular concentration of biomolecules and potassium ions, suggesting that the G4 structure formed in the c-myb repeat may represent a transcription roadblock in vivo. Our results support a mechanism of transcription-coupled DNA repair initiated by arrest of transcription at G4 structures.

  2. Use of short tandem repeat sequences to study Mycobacterium leprae in leprosy patients in Malawi and India.

    Directory of Open Access Journals (Sweden)

    Saroj K Young


    Inadequate understanding of the transmission of Mycobacterium leprae makes it difficult to predict the impact of leprosy control interventions. Genotypic tests that allow tracking of individual bacterial strains would strengthen epidemiological studies and contribute to our understanding of the disease. Genotyping assays based on variation in the copy number of short tandem repeat sequences were applied to biopsies collected in population-based epidemiological studies of leprosy in northern Malawi, and from members of multi-case households in Hyderabad, India. In the Malawi series, considerable genotypic variability was observed between patients, and also within patients, when isolates were collected at different times or from different tissues. Less within-patient variability was observed when isolates were collected from similar tissues at the same time. Less genotypic variability was noted amongst the closely related Indian patients than in the Malawi series. Lineages of M. leprae undergo changes in their pattern of short tandem repeat sequences over time. Genetic divergence is particularly likely between bacilli inhabiting different (e.g., skin and nerve) tissues. Such variability makes short tandem repeat sequences unsuitable as a general tool for population-based strain typing of M. leprae, or for distinguishing relapse from reinfection. Careful use of these markers may provide insights into the development of disease within individuals and for tracking of short transmission chains.

  3. SSTL Based Low Power Thermal Efficient WLAN Specific 32bit ALU Design on 28nm FPGA

    DEFF Research Database (Denmark)

    Kalia, Kartik; Pandey, Bishwajeet; Das, Teerath


    at minimum and maximum temperature as compared to all other considered I/O standards. This design has application where 32bit ALU design is considered for designing an electronic device such as WLAN. The design can be implemented on different nano chips for better efficiency depending upon the design...... with consideration of airflow toward heat sink and different frequency on which ALU operate in network processor or any WLAN devices. We have done total power analysis of WLAN operating on different frequencies. We have considered a set of frequencies, which are based on IEEE 802.11 standards. First we did...... efficient IO standard. While analyzing we found out that when WLAN device shift from 343.15K to 283.15K, there is maximum thermal power reduction in SSTL135_R as compared to all considered I/O standards. When we compared same I/Os for different frequencies we observed maximum thermal efficiency in SSTL15...

  4. Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species. (United States)

    Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y


    The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.

  5. Repeated extragenic sequences in prokaryotic genomes: a proposal for the origin and dynamics of the RUP element in Streptococcus pneumoniae. (United States)

    Oggioni, M R; Claverys, J P


    A survey of all Streptococcus pneumoniae GenBank/EMBL DNA sequence entries and of the public domain sequence (representing more than 90% of the genome) of an S. pneumoniae type 4 strain allowed identification of 108 copies of a 107-bp-long highly repeated intergenic element called RUP (for repeat unit of pneumococcus). Several features of the element, revealed in this study, led to the proposal that RUP is an insertion sequence (IS)-derivative that could still be mobile. Among these features are: (1) a highly significant homology between the terminal inverted repeats (IRs) of RUPs and of IS630-Spn1, a new putative IS of S. pneumoniae; and (2) insertion at a TA dinucleotide, a characteristic target of several members of the IS630 family. Trans-mobilization of RUP is therefore proposed to be mediated by the transposase of IS630-Spn1. To account for the observation that RUPs are distributed among four subtypes which exhibit different degrees of sequence homogeneity, a scenario is invoked based on successive stages of RUP mobility and non-mobility, depending on whether an active transposase is present or absent. In the latter situation, an active transposase could be reintroduced into the species through natural transformation. Examination of sequences flanking RUP revealed a preferential association with ISs. It also provided evidence that RUPs promote sequence rearrangements, thereby contributing to genome flexibility. The possibility that RUP preferentially targets transforming DNA of foreign origin and subsequently favours disruption/rearrangement of exogenous sequences is discussed.

  6. Simple sequence repeat marker development from bacterial artificial chromosome end sequences and expressed sequence tags of flax (Linum usitatissimum L.). (United States)

    Cloutier, Sylvie; Miranda, Evelyn; Ward, Kerry; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Datla, Raju; Rowland, Gordon; Duguid, Scott; Ragupathy, Raja


    Flax is an important oilseed crop in North America and is mostly grown as a fibre crop in Europe. As a self-pollinated diploid with a small estimated genome size of ~370 Mb, flax is well suited for fast progress in genomics. In the last few years, important genetic resources have been developed for this crop. Here, we describe the assessment and comparative analyses of 1,506 putative simple sequence repeats (SSRs) of which, 1,164 were derived from BAC-end sequences (BESs) and 342 from expressed sequence tags (ESTs). The SSRs were assessed on a panel of 16 flax accessions with 673 (58 %) and 145 (42 %) primer pairs being polymorphic in the BESs and ESTs, respectively. With 818 novel polymorphic SSR primer pairs reported in this study, the repertoire of available SSRs in flax has more than doubled from the combined total of 508 of all previous reports. Among nucleotide motifs, trinucleotides were the most abundant irrespective of the class, but dinucleotides were the most polymorphic. SSR length was also positively correlated with polymorphism. Two dinucleotide (AT/TA and AG/GA) and two trinucleotide (AAT/ATA/TAA and GAA/AGA/AAG) motifs and their iterations, different from those reported in many other crops, accounted for more than half of all the SSRs and were also more polymorphic (63.4 %) than the rest of the markers (42.7 %). This improved resource promises to be useful in genetic, quantitative trait loci (QTL) and association mapping as well as for anchoring the physical/genetic map with the whole genome shotgun reference sequence of flax.

  7. Molecular reconstruction of extinct LINE-1 elements and their interaction with nonautonomous elements. (United States)

    Wagstaff, Bradley J; Kroutter, Emily N; Derbes, Rebecca S; Belancio, Victoria P; Roy-Engel, Astrid M


    Non-long terminal repeat retroelements continue to impact the human genome through cis-activity of long interspersed element-1 (LINE-1 or L1) and trans-mobilization of Alu. Current activity is dominated by modern subfamilies of these elements, leaving behind an evolutionary graveyard of extinct Alu and L1 subfamilies. Because Alu is a nonautonomous element that relies on L1 to retrotranspose, there is the possibility that competition between these elements has driven selection and antagonistic coevolution between Alu and L1. Through analysis of synonymous versus nonsynonymous codon evolution across L1 subfamilies, we find that the C-terminal ORF2 cys domain experienced a dramatic increase in amino acid substitution rate in the transition from L1PA5 to L1PA4 subfamilies. This observation coincides with the previously reported rapid evolution of ORF1 during the same transition period. Ancestral Alu sequences have been previously reconstructed, as their short size and ubiquity have made it relatively easy to retrieve consensus sequences from the human genome. In contrast, creating constructs of extinct L1 copies is a more laborious task. Here, we report our efforts to recreate and evaluate the retrotransposition capabilities of two ancestral L1 elements, L1PA4 and L1PA8 that were active ~18 and ~40 Ma, respectively. Relative to the modern L1PA1 subfamily, we find that both elements are similarly active in a cell culture retrotransposition assay in HeLa, and both are able to efficiently trans-mobilize Alu elements from several subfamilies. Although we observe some variation in Alu subfamily retrotransposition efficiency, any coevolution that may have occurred between LINEs and SINEs is not evident from these data. Population dynamics and stochastic variation in the number of active source elements likely play an important role in individual LINE or SINE subfamily amplification. If coevolution also contributes to changing retrotransposition rates and the progression of...

  8. Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs (United States)

    M.N. Islam-Faridi; C.D. Nelson; S.P. DiFazio; L.E. Gunter; G.A. Tuskan


    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones...

  9. Genotyping and Molecular Identification of Date Palm Cultivars Using Inter-Simple Sequence Repeat (ISSR) Markers. (United States)

    Ayesh, Basim M


    Molecular markers are credible for the discrimination of genotypes and estimation of the extent of genetic diversity and relatedness in a set of genotypes. Inter-simple sequence repeat (ISSR) markers rapidly reveal highly polymorphic fingerprints and have been used frequently to determine the genetic diversity among date palm cultivars. This chapter describes the application of ISSR markers for genotyping of date palm cultivars. The application involves extraction of genomic DNA of reliable quality and quantity from the target cultivars. Subsequently, the extracted DNA serves as a template for amplification of genomic regions flanked by inverted simple sequence repeats using a single primer. The similarity of each pair of samples is measured by calculating the number of mono- and polymorphic bands revealed by gel electrophoresis. Matrices constructed for similarity and genetic distance are used to build a phylogenetic tree and perform cluster analysis, to determine the molecular relatedness of cultivars. In the protocol described, 3 of the 9 tested primers consistently amplified 31 loci in 6 date palm cultivars, with 28 polymorphic loci.
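
    The pairwise similarity step described above can be sketched in Python as follows. The abstract does not name the similarity coefficient, so the Jaccard coefficient used here is an assumption (it is one common choice for dominant band data), and the cultivar names, band profiles, and function names are hypothetical. Each profile is a list of 1/0 values marking band presence or absence at the same set of gel positions.

```python
def jaccard(a, b):
    """Jaccard similarity of two equal-length presence/absence band profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same band positions")
    shared = sum(1 for x, y in zip(a, b) if x and y)   # bands present in both
    either = sum(1 for x, y in zip(a, b) if x or y)    # bands present in at least one
    # Two profiles with no bands at all are treated as having zero similarity.
    return shared / either if either else 0.0


def similarity_matrix(profiles):
    """Return {(name1, name2): similarity} for every ordered pair of samples."""
    names = list(profiles)
    return {(m, n): jaccard(profiles[m], profiles[n])
            for m in names for n in names}


if __name__ == "__main__":
    bands = {
        "cultivar1": [1, 1, 0, 1],
        "cultivar2": [1, 0, 0, 1],
        "cultivar3": [0, 0, 1, 0],
    }
    sim = similarity_matrix(bands)
    for pair, value in sorted(sim.items()):
        # Genetic distance taken here simply as 1 - similarity.
        print(pair, round(value, 3), round(1 - value, 3))
```

    The resulting distance values could then be passed to a clustering routine (for example UPGMA) to draw the tree; that step is left out of the sketch.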

  10. The complete chloroplast genome sequence of Taxus chinensis var. mairei (Taxaceae): loss of an inverted repeat region and comparative analysis with related species. (United States)

    Zhang, Yanzhen; Ma, Ji; Yang, Bingxian; Li, Ruyi; Zhu, Wei; Sun, Lianli; Tian, Jingkui; Zhang, Lin


    Taxus chinensis var. mairei (Taxaceae) is a yew variety native to China. This plant is one of the sources of paclitaxel, which has been a promising antineoplastic chemotherapy drug during the last decade. We have sequenced the complete nucleotide sequence of the chloroplast (cp) genome of T. chinensis var. mairei. The T. chinensis var. mairei cp genome is 129,513 bp in length, with 113 single copy genes and two duplicated genes (trnI-CAU, trnQ-UUG). Among the 113 single copy genes, 9 are intron-containing. Compared to other land plant cp genomes, the T. chinensis var. mairei cp genome has lost one of the large inverted repeats (IRs) found in angiosperms, ferns, liverworts, and gymnosperms such as Cycas revoluta and Ginkgo biloba L. Compared to related species, the gene order of T. chinensis var. mairei has a large inversion of ~110 kb including 91 genes (from rps18 to accD) with gene contents unarranged. Repeat analysis identified 48 direct and 2 inverted repeats 30 bp long or longer with a sequence identity greater than 90%. Repeated short segments were found in genes rps18, rps19 and clpP. Analysis also revealed 22 simple sequence repeat (SSR) loci, almost all of which are composed of A or T. Copyright © 2014 Elsevier B.V. All rights reserved.

  11. New polymorphic variants of human blood clotting factor IX

    Energy Technology Data Exchange (ETDEWEB)

    Surin, V.L.; Luk'yanenko, A.V.; Tagiev, A.F.; Smirnova, O.V. [Hematological Research Center, Moscow (Russian Federation)]; Plutalov, O.V.; Berlin, Yu.A. [Shemyakin Institute of Bioorganic Chemistry, Moscow (Russian Federation)]


    The polymorphism of Alu repeats located in the introns of the human factor IX gene (copies 1-3) was studied. To identify polymorphic variants, direct sequencing of PCR products that contained appropriate repeats was used. In each case, 20 unrelated X chromosomes were studied. A polymorphic Dra I site was found near the 3'-end of Alu copy 3 within the region of the polyA tract. A PCR-based testing system with internal control of restriction hydrolysis was suggested. Testing 81 unrelated X chromosomes revealed that the frequency of the polymorphic Dra I site is 0.23. Taq I polymorphism, which was revealed in Alu copy 4 of the factor IX gene in our previous work, was found to be closely linked to Dra I polymorphism. Studies of linkage between different types of polymorphisms of the factor IX gene revealed the presence of a rare polymorphism in intron a that was located within the same minisatellite region as the known polymorphic insertion 50 bp/Dde I. However, the size of the insertion in our case was 26 bp. Only one polymorphic variant was found among over 150 unrelated X chromosomes derived from humans from Moscow and its vicinity. 10 refs., 4 figs., 1 tab.

  12. Inter-simple sequence repeat (ISSR) loci mapping in the genome of perennial ryegrass

    DEFF Research Database (Denmark)

    Pivorienė, O; Pašakinskienė, I; Brazauskas, G


    The aim of this study was to identify and characterize new ISSR markers and their loci in the genome of perennial ryegrass. A subsample of the VrnA F2 mapping family of perennial ryegrass comprising 92 individuals was used to develop a linkage map including inter-simple sequence repeat markers...... demonstrated a 70% similarity to the Hordeum vulgare germin gene GerA. Inter-SSR mapping will provide useful information for gene targeting, quantitative trait loci mapping and marker-assisted selection in perennial ryegrass....

  13. Effects of GABA(A) Modulators on the Repeated Acquisition of Response Sequences in Squirrel Monkeys (United States)

    Campbell, Una C.; Winsauer, Peter J.; Stevenson, Michael W.; Moerschbaecher, Joseph M.


    The present study investigated the effects of positive and negative GABA(A) modulators under three different baselines of repeated acquisition in squirrel monkeys in which the monkeys acquired a three-response sequence on three keys under a second-order fixed-ratio (FR) schedule of food reinforcement. In two of these baselines, the…

  14. Development and Characterization of Simple Sequence Repeat (SSR) Markers Based on RNA-Sequencing of Medicago sativa and In silico Mapping onto the M. truncatula Genome (United States)

    Wang, Zan; Yu, Guohui; Shi, Binbin; Wang, Xuemin; Qiang, Haiping; Gao, Hongwen


    Sufficient codominant genetic markers are needed for various genetic investigations in alfalfa since the species is an outcrossing autotetraploid. With the newly developed next generation sequencing technology, a large amount of transcribed sequences of alfalfa have been generated and are available for identifying SSR markers by data mining. A total of 54,278 alfalfa non-redundant unigenes were assembled through the Illumina HiSeq™ 2000 sequencing technology. Based on 3,903 unigene sequences, 4,493 SSRs were identified. Tri-nucleotide repeats (56.71%) were the most abundant motif class while AG/CT (21.7%), AGG/CCT (19.8%), AAC/GTT (10.3%), ATC/ATG (8.8%), and ACC/GGT (6.3%) were the subsequent top five nucleotide repeat motifs. Eight hundred and thirty-seven EST-SSR primer pairs were successfully designed. Of these, 527 (63%) primer pairs yielded clear and scorable PCR products and 372 (70.6%) exhibited polymorphisms. High transferability was observed: 99.2% (523) for ssp. falcata and 71.7% (378) for M. truncatula. In addition, 313 of 527 SSR marker sequences were in silico mapped onto the eight M. truncatula chromosomes. Thirty-six polymorphic SSR primer pairs were used in the genetic relatedness analysis of 30 Chinese alfalfa cultivated accessions, generating a total of 199 scored alleles. The mean observed heterozygosity and polymorphic information content were 0.767 and 0.635, respectively. The codominant markers not only enriched the current resources of molecular markers in alfalfa, but also would facilitate targeted investigations in marker-trait association, QTL mapping, and genetic diversity analysis in alfalfa. PMID:24642969

  15. Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey

    Directory of Open Access Journals (Sweden)

    Varala Kranthi


    Background: Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional sequencing techniques are powerful, particularly for detecting sequences present in many copies per genome. However, these methods are time-consuming and have potential drawbacks. High throughput 454 sequencing provides an alternative method by which much information can be gained quickly and cheaply from high-coverage surveys of genomic DNA. Results: We sequenced 78 million base-pairs of randomly sheared soybean DNA which passed our quality criteria. Computational analysis of the survey sequences provided global information on the abundant repetitive sequences in soybean. The sequence was used to determine the copy number across regions of large genomic clones or contigs and discover higher-order structures within satellite repeats. We have created an annotated, online database of sequences present in multiple copies in the soybean genome. The low bias of pyrosequencing against repeat sequences is demonstrated by the overall composition of the survey data, which matches well with past estimates of repetitive DNA content obtained by DNA re-association kinetics (Cot analysis). Conclusion: This approach provides a potential aid to conventional or shotgun genome assembly, by allowing rapid assessment of copy number in any clone or clone-end sequence. In addition, we show that partial sequencing can provide access to partial protein-coding sequences.

  16. Alu polymorphic insertions reveal genetic structure of north Indian populations. (United States)

    Tripathi, Manorama; Tripathi, Piyush; Chauhan, Ugam Kumari; Herrera, Rene J; Agrawal, Suraksha


    The Indian subcontinent is characterized by the ancestral and cultural diversity of its people. Genetic input from several unique source populations and from the unique social architecture provided by the caste system has shaped the current genetic landscape of India. In the present study 200 individuals each from three upper-caste and four middle-caste Hindu groups and from two Muslim populations in North India were examined for 10 polymorphic Alu insertions (PAIs). The investigated PAIs exhibit high levels of polymorphism and average heterozygosity. Limited interpopulation variance and genetic flow in the present study suggest admixture. The results of this study demonstrate that, contrary to common belief, the caste system has not provided an impermeable barrier to genetic exchange among Indian groups.

  17. The leucine-rich repeat structure. (United States)

    Bella, J; Hindle, K L; McEwan, P A; Lovell, S C


    The leucine-rich repeat is a widespread structural motif of 20-30 amino acids with a characteristic repetitive sequence pattern rich in leucines. Leucine-rich repeat domains are built from tandems of two or more repeats and form curved solenoid structures that are particularly suitable for protein-protein interactions. Thousands of protein sequences containing leucine-rich repeats have been identified by automatic annotation methods. Three-dimensional structures of leucine-rich repeat domains determined to date reveal a degree of structural variability that translates into the considerable functional versatility of this protein superfamily. As the essential structural principles become well established, the leucine-rich repeat architecture is emerging as an attractive framework for structural prediction and protein engineering. This review presents an update of the current understanding of leucine-rich repeat structure at the primary, secondary, tertiary and quaternary levels and discusses specific examples from recently determined three-dimensional structures.

  18. Comparison of the degree of homology of DNA and quantity of repeated sequences in an intact plant and cell structure

    International Nuclear Information System (INIS)

    Solov'yan, V.T.; Kunaleh, V.A.; Shumnyl, V.K.; Vershinin, A.V.


    This paper attempts to assess the quantity of repeated sequences and degree of homology of DNA in the intact plant and two lines of callus tissue of Rauwolfia serpentina Benth maintained for 20 years, which differ among themselves in the level of biosynthesis of the pharmacologically valuable alkaloid ajmaline. The tritium-labeled repeats of plants and calli were used in direct and reverse hybridization on nitrocellulose filters. Hybridization of ³H-labeled repeats with phage 17 DNA was used as control. The radioactivity of filters after washing was measured in a liquid scintillation counter.

  19. Survey and analysis of simple sequence repeats in the Laccaria bicolor genome, with development of microsatellite markers

    Energy Technology Data Exchange (ETDEWEB)

    Labbe, Jessy L [ORNL; Murat, Claude [INRA, Nancy, France; Morin, Emmanuelle [INRA, Nancy, France; Le Tacon, F [UMR, France; Martin, Francis [INRA, Nancy, France


    It is becoming clear that simple sequence repeats (SSRs) play a significant role in fungal genome organization, and they are a large source of genetic markers for population genetics and meiotic maps. We identified SSRs in the Laccaria bicolor genome by in silico survey and analyzed their distribution in the different genomic regions. We also compared the abundance and distribution of SSRs in L. bicolor with those of the following fungal genomes: Phanerochaete chrysosporium, Coprinopsis cinerea, Ustilago maydis, Cryptococcus neoformans, Aspergillus nidulans, Magnaporthe grisea, Neurospora crassa and Saccharomyces cerevisiae. Using the MISA computer program, we detected 277,062 SSRs in the L. bicolor genome representing 8% of the assembled genomic sequence. Among the analyzed basidiomycetes, L. bicolor exhibited the highest SSR density although no correlation between relative abundance and the genome sizes was observed. In most genomes the short motifs (mono- to trinucleotides) were more abundant than the longer repeated SSRs. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit increased. Furthermore, each organism had its own common and longest SSRs. In the L. bicolor genome, most of the SSRs were located in intergenic regions (73.3%) and the highest SSR density was observed in transposable elements (TEs; 6,706 SSRs/Mb). However, 81% of the protein-coding genes contained SSRs in their exons, suggesting that SSR polymorphism may alter gene phenotypes. Within a L. bicolor offspring, sequence polymorphism of 78 SSRs was mainly detected in non-TE intergenic regions. Unlike previously developed microsatellite markers, these new ones are spread throughout the genome; these markers could have immediate applications in population genetics.
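
    The in silico SSR survey described above can be illustrated with a short regular-expression scan for perfect repeats. This is a minimal sketch only: it is not the MISA program, the minimum repeat counts are hypothetical thresholds chosen for illustration, and the function name `find_ssrs` is an assumption.

```python
import re

# Hypothetical minimum repeat counts per motif length (1 = mono ... 6 = hexa).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # because the shorter motif length already reports that run.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence: an (AT)7 run and an (AAG)5 run separated by unique bases.
example = "GGC" + "AT" * 7 + "CCGT" + "AAG" * 5 + "TTC"
ssrs = find_ssrs(example)
print(ssrs)
```

    Dividing the number of hits by the sequence length in Mb gives a density in SSRs/Mb, the unit quoted in the abstract; imperfect and compound repeats are outside the scope of this sketch.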

  20. Analysis of expressed sequence tags from Prunus mume flower and fruit and development of simple sequence repeat markers

    Directory of Open Access Journals (Sweden)

    Gao Zhihong


    Background The Expressed Sequence Tag (EST) has been a cost-effective tool in molecular biology and represents an abundant valuable resource for genome annotation, gene expression, and comparative genomics in plants. Results In this study, we constructed a cDNA library of Prunus mume flower and fruit, sequenced 10,123 clones of the library, and obtained 8,656 expressed sequence tag (EST) sequences with high quality. The ESTs were assembled into 4,473 unigenes composed of 1,492 contigs and 2,981 singletons, which have been deposited in NCBI (accession IDs: GW868575 - GW873047), among which 1,294 unique ESTs had known or putative functions. Furthermore, we found 1,233 putative simple sequence repeats (SSRs) in the P. mume unigene dataset. We randomly tested 42 pairs of PCR primers flanking potential SSRs, and 14 pairs were identified as true-to-type SSR loci and could amplify polymorphic bands from 20 individual plants of P. mume. We further used the 14 EST-SSR primer pairs to test the transferability on peach and plum. The result showed that nearly 89% of the primer pairs produced target PCR bands in the two species. A high level of marker polymorphism was observed in the plum species (65%) and a low level in the peach (46%), and the clustering analysis of the three species indicated that these SSR markers were useful in the evaluation of genetic relationships and diversity between and within the Prunus species. Conclusions We have constructed the first cDNA library of P. mume flower and fruit, and our data provide sets of molecular biology resources for P. mume and other Prunus species. These resources will be useful for further study such as genome annotation, new gene discovery, gene functional analysis, molecular breeding, evolution and comparative genomics between Prunus species.

  1. Site directed recombination (United States)

    Jurka, Jerzy W.


    Enhanced homologous recombination is obtained by employing a consensus sequence which has been found to be associated with integration of repeat sequences, such as Alu and ID. The consensus sequence or sequence having a single transition mutation determines one site of a double break which allows for high efficiency of integration at the site. By introducing single or double stranded DNA having the consensus sequence flanking region joined to a sequence of interest, one can reproducibly direct integration of the sequence of interest at one or a limited number of sites. In this way, specific sites can be identified and homologous recombination achieved at the site by employing a second flanking sequence associated with a sequence proximal to the 3'-nick.

  2. Phylogenetic analysis of Gossypium L. using restriction fragment length polymorphism of repeated sequences. (United States)

    Zhang, Meiping; Rong, Ying; Lee, Mi-Kyung; Zhang, Yang; Stelly, David M; Zhang, Hong-Bin


    Cotton is the world's leading textile fiber crop and is also grown as a bioenergy and food crop. Knowledge of the phylogeny of closely related species and the genome origin and evolution of polyploid species is significant for advanced genomics research and breeding. We have reconstructed the phylogeny of the cotton genus, Gossypium L., and deciphered the genome origin and evolution of its five polyploid species by restriction fragment analysis of repeated sequences. Nuclear DNA of 84 accessions representing 35 species and all eight genomes of the genus were analyzed. The phylogenetic tree of the genus was reconstructed using the parsimony method on 1033 polymorphic repeated sequence restriction fragments. The genome origin of its polyploids was determined by calculating the diploid-polyploid restriction fragment correspondence (RFC). The tree is consistent with the morphological classification, genome designation and geographic distribution of the species at subgenus, section and subsection levels. Gossypium lobatum (D7) was unambiguously shown to have the highest RFC with the D-subgenomes of all five polyploids of the genus, while the common ancestor of Gossypium herbaceum (A1) and Gossypium arboreum (A2) likely contributed to the A-subgenomes of the polyploids. These results provide a comprehensive phylogenetic tree of the cotton genus and new insights into the genome origin and evolution of its polyploid species. The results also further demonstrate a simple, rapid and inexpensive method suitable for phylogenetic analysis of closely related species, especially congeneric species, and the inference of genome origin of polyploids that constitute over 70 % of flowering plants.

  3. Linkage of congenital isolated adrenocorticotropic hormone deficiency to the corticotropin releasing hormone locus using simple sequence repeat polymorphisms

    Energy Technology Data Exchange (ETDEWEB)

    Kyllo, J.H.; Collins, M.M.; Vetter, K.L. [Univ. of Iowa College of Medicine, Iowa City, IA (United States)] [and others


    Genetic screening techniques using simple sequence repeat polymorphisms were applied to investigate the molecular nature of congenital isolated adrenocorticotropic hormone (ACTH) deficiency. We hypothesize that this rare cause of hypocortisolism shared by a brother and sister with two unaffected sibs and unaffected parents is inherited as an autosomal recessive single gene mutation. Genes involved in the hypothalamic-pituitary axis controlling cortisol sufficiency were investigated for a causal role in this disorder. Southern blotting showed no detectable mutations of the gene encoding pro-opiomelanocortin (POMC), the ACTH precursor. Other candidate genes subsequently considered were those encoding neuroendocrine convertase-1, and neuroendocrine convertase-2 (NEC-1, NEC-2), and corticotropin releasing hormone (CRH). Tests for linkage were performed using polymorphic di- and tetranucleotide simple sequence repeat markers flanking the reported map locations for POMC, NEC-1, NEC-2, and CRH. The chromosomal haplotypes determined by the markers flanking the loci for POMC, NEC-1, and NEC-2 were not compatible with linkage. However, 22 individual markers defining the chromosomal haplotypes flanking CRH were compatible with linkage of the disorder to the immediate area of this gene of chromosome 8. Based on these data, we hypothesize that the ACTH deficiency in this family is due to an abnormality of CRH gene structure or expression. These results illustrate the useful application of high density genetic maps constructed with simple sequence repeat markers for inclusion/exclusion studies of candidate genes in even very small nuclear families segregating for unusual phenotypes. 25 refs., 5 figs., 2 tabs.

  4. Identification, characterization, and utilization of genome-wide simple sequence repeats to identify a QTL for acidity in apple (United States)


    Background Apple is an economically important fruit crop worldwide. Developing a genetic linkage map is a critical step towards mapping and cloning of genes responsible for important horticultural traits in apple. To facilitate linkage map construction, we surveyed and characterized the distribution and frequency of perfect microsatellites in assembled contig sequences of the apple genome. Results A total of 28,538 SSRs have been identified in the apple genome, with an overall density of 40.8 SSRs per Mb. Di-nucleotide repeats are the most frequent microsatellites in the apple genome, accounting for 71.9% of all microsatellites. AT/TA repeats are the most frequent in genomic regions, accounting for 38.3% of all the G-SSRs, while AG/GA dimers prevail in transcribed sequences, and account for 59.4% of all EST-SSRs. A total set of 310 SSRs is selected to amplify eight apple genotypes. Of these, 245 (79.0%) are found to be polymorphic among cultivars and wild species tested. AG/GA motifs in genomic regions have detected more alleles and higher PIC values than AT/TA or AC/CA motifs. Moreover, AG/GA repeats are more variable than any other dimers in apple, and should be preferentially selected for studies, such as genetic diversity and linkage map construction. A total of 54 newly developed apple SSRs have been genetically mapped. Interestingly, clustering of markers with distorted segregation is observed on linkage groups 1, 2, 10, 15, and 16. A QTL responsible for malic acid content of apple fruits is detected on linkage group 8, and accounts for ~13.5% of the observed phenotypic variation. Conclusions This study demonstrates that di-nucleotide repeats are prevalent in the apple genome and that AT/TA and AG/GA repeats are the most frequent in genomic and transcribed sequences of apple, respectively. All SSR motifs identified in this study as well as those newly mapped SSRs will serve as valuable resources for pursuing apple genetic studies, aiding the apple breeding

  5. Creation and structure determination of an artificial protein with three complete sequence repeats

    Energy Technology Data Exchange (ETDEWEB)

    Adachi, Motoyasu; Shimizu, Rumi; Kuroki, Ryota [Japan Atomic Energy Agency, Shirakatashirane 2-4, Nakagun Tokaimura, Ibaraki 319-1195 (Japan); Blaber, Michael [Japan Atomic Energy Agency, Shirakatashirane 2-4, Nakagun Tokaimura, Ibaraki 319-1195 (Japan); Florida State University, Tallahassee, FL 32306-4300 (United States)


    An artificial protein with three complete sequence repeats was created and the structure was determined by X-ray crystallography. The structure showed threefold symmetry even though there are amino and carboxy termini. The artificial protein with threefold symmetry may be useful as a scaffold to capture small materials with C3 symmetry. Symfoil-4P is a de novo protein exhibiting the threefold symmetrical β-trefoil fold designed based on the human acidic fibroblast growth factor. The first three asparagine–glycine sequences of Symfoil-4P are replaced with glutamine–glycine (Symfoil-QG) or serine–glycine (Symfoil-SG) sequences to protect against deamidation, and His-Symfoil-II was prepared by introducing a protease digestion site into Symfoil-QG so that Symfoil-II has three complete repeats after removal of the N-terminal histidine tag. The Symfoil-QG and SG and His-Symfoil-II proteins were expressed in Escherichia coli as soluble protein, and purified by nickel affinity chromatography. Symfoil-II was further purified by anion-exchange chromatography after removing the HisTag by proteolysis. Both Symfoil-QG and Symfoil-II were crystallized in 0.1 M Tris-HCl buffer (pH 7.0) containing 1.8 M ammonium sulfate as precipitant at 293 K; several crystal forms were observed for Symfoil-QG and II. The maximum diffraction resolutions of Symfoil-QG and II crystals were 1.5 and 1.1 Å, respectively. The Symfoil-II without histidine tag diffracted better than Symfoil-QG with N-terminal histidine tag. Although the crystal packing of Symfoil-II is slightly different from Symfoil-QG and other crystals of Symfoil derivatives having the N-terminal histidine tag, the refined crystal structure of Symfoil-II showed pseudo-threefold symmetry as expected from other Symfoils. Since the removal of the unstructured N-terminal histidine tag did not affect the threefold structure of Symfoil, the improvement of diffraction quality of Symfoil-II may be caused by molecular characteristics of

  6. O homem, de Aluísio Azevedo: medicina e doenças no Rio de Janeiro fin-de-siecle

    Directory of Open Access Journals (Sweden)

    Raquel Lima SILVA


    In this article, we aim to observe how Aluísio Azevedo, adhering to some extent to the procedures recommended by Zola in Le Roman Experimental, brings scientific procedures closer to the field of fiction in order to compose a case of human psychopathology.

  7. Nanobiosensor for Detection and Quantification of DNA Sequences in Degraded Mixed Meats

    Directory of Open Access Journals (Sweden)

    M. E. Ali


    A novel class of nanobiosensor was developed by integrating a 27-nucleotide AluI fragment of the swine cytochrome b (cytb) gene to a 3-nm diameter citrate-tannate coated gold nanoparticle (GNP). The biosensor detected 0.5% and 1% pork in raw and 2.5-h autoclaved pork-beef binary admixtures in a single step without any separation or washing. The hybridization kinetics of the hybrid sensor was studied with synthetic and AluI digested real pork targets from moderate to extreme target concentrations and a sigmoidal relationship was found. Using the kinetic curve, a convenient method for quantifying and counting target DNA copy number was developed. The accuracy of the method was over 90% and 80% for raw and autoclaved pork-beef binary admixtures in the range of 5–100% pork adulteration. The biosensor probe identified a target DNA sequence that was several-fold shorter than a typical PCR template. This offered the detection and quantitation of potential targets in highly processed or degraded samples where PCR amplification was not possible due to template crisis. The assay was a viable alternative to qPCR for detecting, quantifying and counting the copy number of shorter DNA sequences to address wide-ranging biological problems in the food industry, diagnostic laboratories and forensic medicine.

  8. Triplet repeat sequences in human DNA can be detected by hybridization to a synthetic (5'-CGG-3')17 oligodeoxyribonucleotide

    DEFF Research Database (Denmark)

    Behn-Krappa, A; Mollenhauer, J; Doerfler, W


    The seemingly autonomous amplification of naturally occurring triplet repeat sequences in the human genome has been implicated in the causation of human genetic disease, such as the fragile X (Martin-Bell) syndrome, myotonic dystrophy (Curschmann-Steinert), spinal and bulbar muscular atrophy...

  9. The Pentapeptide Repeat Proteins


    Vetting, Matthew W.; Hegde, Subray S.; Fajardo, J. Eduardo; Fiser, Andras; Roderick, Steven L.; Takiff, Howard E.; Blanchard, John S.


    The Pentapeptide Repeat Protein (PRP) family has over 500 members in the prokaryotic and eukaryotic kingdoms. These proteins are composed of, or contain domains composed of, tandemly repeated amino acid sequences with a consensus sequence of [S,T,A,V][D,N][L,F]-[S,T,R][G]. The biochemical function of the vast majority of PRP family members is unknown. The three-dimensional structure of the first member of the PRP family was determined for the fluoroquinolone resistance protein (MfpA) from Myc...
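
    The consensus quoted above can be turned into a simple pattern scan. The sketch below assumes the consensus is read as five positions, [S,T,A,V][D,N][L,F][S,T,R][G], with the hyphen in the abstract treated as a separator only; the function name and the toy sequence are hypothetical. Real pentapeptide repeat proteins are degenerate, so an exact-consensus match of this kind is an illustration and would miss many genuine repeats.

```python
import re

# One pentapeptide unit, reading the quoted consensus as five positions.
PENTAPEPTIDE = r"[STAV][DN][LF][STR]G"
# Two or more consecutive units count as a tandem array in this sketch.
TANDEM = re.compile(r"(?:%s){2,}" % PENTAPEPTIDE)


def find_tandem_pentapeptides(protein):
    """Return (start, unit_count) for each run of >= 2 consecutive units."""
    return [
        (m.start(), len(m.group(0)) // 5)
        for m in TANDEM.finditer(protein.upper())
    ]


# Toy sequence: three tandem units, then an isolated unit that is not reported.
toy = "MK" + "ADLSG" + "TNFRG" + "SDLTG" + "PQ" + "ADLSG"
runs = find_tandem_pentapeptides(toy)
print(runs)
```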

  10. Research of the origin of a particular Tunisian group using a physical marker and Alu insertion polymorphisms

    Directory of Open Access Journals (Sweden)

    Wifak El Moncer


    The aim of this study was to show how, in some particular circumstances, a physical marker can be used along with molecular markers in the research of an ancient people movement. A set of five Alu insertions was analysed in 42 subjects from a particular Tunisian group (El Hamma) that has, unlike most of the Tunisian population, a very dark skin, similar to that of sub-Saharans, and in 114 Tunisian subjects (Gabes sample) from the same governorate, but outside the group. Our results showed that the El Hamma group is genetically midway between sub-Saharan populations and North Africans, whereas the Gabes sample is clustered among North Africans. In addition, the A25 Alu insertion, considered characteristic of sub-Saharan Africans, was present in the El Hamma group at a relatively high frequency. This frequency was similar to that found in sub-Saharans from Nigeria, but significantly different from those found in the Gabes sample and in other North African populations. Our molecular results, consistent with the skin color status, suggest a sub-Saharan origin of this particular Tunisian group.

  11. Isolation and sequence analysis of a cDNA clone encoding the fifth complement component

    DEFF Research Database (Denmark)

    Lundwall, Åke B; Wetsel, Rick A; Kristensen, Torsten


    DNA clone of 1.85 kilobase pairs was isolated. Hybridization of the mixed-sequence probe to the complementary strand of the plasmid insert and sequence analysis by the dideoxy method predicted the expected protein sequence of C5a (positions 1-12), amino-terminal to the anticipated priming site. The sequence......, subcloned into M13 mp8, and sequenced at random by the dideoxy technique, thereby generating a contiguous sequence of 1703 base pairs. This clone contained coding sequence for the C-terminal 262 amino acid residues of the beta-chain, the entire C5a fragment, and the N-terminal 98 residues of the alpha......'-chain. The 3' end of the clone had a polyadenylated tail preceded by a polyadenylation recognition site, a 3'-untranslated region, and base pairs homologous to the human Alu consensus sequence. Comparison of the derived partial human C5 protein sequence with that previously determined for murine C3 and human...

  12. Genome-Wide Analysis of Simple Sequence Repeats in Bitter Gourd (Momordica charantia)

    Directory of Open Access Journals (Sweden)

    Junjie Cui


    Bitter gourd (Momordica charantia) is widely cultivated as a vegetable and medicinal herb in many Asian and African countries. After the sequencing of the cucumber (Cucumis sativus), watermelon (Citrullus lanatus), and melon (Cucumis melo) genomes, bitter gourd became the fourth cucurbit species whose whole genome was sequenced. However, a comprehensive analysis of simple sequence repeats (SSRs) in bitter gourd, including a comparison with the three aforementioned cucurbit species, has not yet been published. Here, we identified a total of 188,091 and 167,160 SSR motifs in the genomes of the bitter gourd lines ‘Dali-11’ and ‘OHB3-1,’ respectively. Subsequently, the SSR content, motif lengths, and classified motif types were characterized for the bitter gourd genomes and compared among all the cucurbit genomes. Lastly, a large set of 138,727 unique in silico SSR primer pairs were designed for bitter gourd. Among these, 71 primers were selected, all of which successfully amplified SSRs from the two bitter gourd lines ‘Dali-11’ and ‘K44’. To further examine the utilization of unique SSR primers, 21 SSR markers were used to genotype a collection of 211 bitter gourd lines from all over the world. A model-based clustering method and phylogenetic analysis indicated a clear separation among the geographic groups. The genomic SSR markers developed in this study have considerable potential value in advancing bitter gourd research.

  13. Field Assessment and Groundwater Modeling of Pesticide Distribution in the Faga`alu Watershed in Tutuila, American Samoa (United States)

    Welch, E.; Dulai, H.; El-Kadi, A. I.; Shuler, C. K.


    To examine contaminant transport paths, groundwater and surface water interactions were investigated as a vector of pesticide migration on the island Tutuila in American Samoa. During a field campaign in summer 2016, water from wells, springs, and streams was collected across the island to analyze for selected pesticides. In addition, a detailed watershed-study, involving sampling along the mountain to ocean gradient was conducted in Faga`alu, a U.S. Coral Reef Task Force priority watershed that drains into the Pago Pago Harbor. Samples were screened at the University of Hawai`i for multiple agricultural chemicals using the ELISA method. The pesticides analyzed include glyphosate, azoxystrobin, imidacloprid and DDT/DDE. Field data was integrated into a MODFLOW-based groundwater model of the Faga`alu watershed to reconstruct flow paths, solute concentrations, and dispersion of the analytes. In combination with land-use maps, these tools were used to identify potential pesticide sources and their contaminant contributions. Across the island, pesticide concentrations were well below EPA regulated limits and azoxystrobin was absent. Glyphosate had detectable amounts in 56% of collected groundwater and 62% of collected stream samples. Respectively, 72% and 36% had imidacloprid detected and 98% and 97% had DDT/DDE detected. The highest observed concentration of glyphosate was 0.3 ppb, of imidacloprid was 0.17 ppb, and of DDT was 3.7 ppb. The persistence and ubiquity of DDT/DDE in surface and groundwater since its last island-wide application decades ago is notable. Groundwater flow paths modeled by MODFLOW imply that glyphosate sources match documented agricultural land-use areas. Groundwater-derived pesticide fluxes to the reef in Faga`alu are 977 mg/d of glyphosate and 1642 mg/d of DDT/DDE. Our study shows that pesticides are transported not only via surface runoff, but also via groundwater through the stream's base flow and are exiting the aquifer via submarine

  14. Genus-specific protein binding to the large clusters of DNA repeats (short regularly spaced repeats) present in Sulfolobus genomes

    DEFF Research Database (Denmark)

    Peng, Xu; Brügger, Kim; Shen, Biao


    terminally modified and corresponds to SSO454, an open reading frame of previously unassigned function. It binds specifically to DNA fragments carrying double and single repeat sequences, binding on one side of the repeat structure, and producing an opening of the opposite side of the DNA structure. It also...... recognizes both main families of repeat sequences in S. solfataricus. The recombinant protein, expressed in Escherichia coli, showed the same binding properties to the SRSR repeat as the native one. The SSO454 protein exhibits a tripartite internal repeat structure which yields a good sequence match...... with a helix-turn-helix DNA-binding motif. Although this putative motif is shared by other archaeal proteins, orthologs of SSO454 were only detected in species within the Sulfolobus genus and in the closely related Acidianus genus. We infer that the genus-specific protein induces an opening of the structure...

  15. De novo Transcriptome Sequencing Reveals a Considerable Bias in the Incidence of Simple Sequence Repeats towards the Downstream of ‘Pre-miRNAs’ of Black Pepper (United States)

    Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan


    Next generation sequencing has an advantage on transformational development of species with limited available sequence data as it helps to decode the genome and transcriptome. We carried out de novo sequencing using Illumina HiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to lack of sufficient sequence information and cytological complexity of its genome. The 55 million raw reads obtained, when assembled using the Trinity program, generated 2,23,386 contigs and 1,28,157 unigenes. Reports suggest that the repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated for the in silico prediction and detection of ‘43 pre-miRNA candidates bearing different types of SSR motifs’. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of the transcriptome. About 0.033% of the transcriptome constituted ‘pre-miRNA candidates bearing SSRs’. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors showed a significant bias in the position of SSRs towards the downstream of predicted ‘pre-miRNA candidates’. The catalogue of transcripts identified, together with the demonstration of reliable existence of SSRs in the miRNA precursors, permits future opportunities for understanding the genetic mechanism of black pepper and likely functions of ‘tandem repeats’ in miRNAs. PMID:23469176

  16. Assembly of Repeat Content Using Next Generation Sequencing Data

    Energy Technology Data Exchange (ETDEWEB)

    LaButti, Kurt; Kuo, Alan; Grigoriev, Igor; Copeland, Alex


    Repetitive organisms pose a challenge for short read assembly, and typically only unique regions and repeat regions shorter than the read length, can be accurately assembled. Recently, we have been investigating the use of Pacific Biosciences reads for de novo fungal assembly. We will present an assessment of the quality and degree of repeat reconstruction possible in a fungal genome using long read technology. We will also compare differences in assembly of repeat content using short read and long read technology.

  17. Simple sequence repeat markers useful for sorghum downy mildew (Peronosclerospora sorghi) and related species

    Directory of Open Access Journals (Sweden)

    Odvody Gary N


    Background A recent outbreak of sorghum downy mildew in Texas has led to the discovery of both metalaxyl resistance and a new pathotype in the causal organism, Peronosclerospora sorghi. These observations and the difficulty in resolving among phylogenetically related downy mildew pathogens dramatically point out the need for simply scored markers in order to differentiate among isolates and species, and to study the population structure within these obligate oomycetes. Here we present the initial results from the use of a biotin capture method to discover, clone and develop PCR primers that permit the use of simple sequence repeats (microsatellites) to detect differences at the DNA level. Results Among the 55 primer pairs designed from clones from pathotype 3 of P. sorghi, 36 flanked microsatellite loci containing simple repeats, including 28 (55%) with dinucleotide repeats and 6 (11%) with trinucleotide repeats. A total of 22 microsatellites with CA/AC or GT/TG repeats were the most abundant (40%) and GA/AG or CT/TC types contribute 15% in our collection. When used to amplify DNA from 19 isolates from P. sorghi, as well as from 5 related species that cause downy mildew on other hosts, the number of different bands detected for each SSR primer pair using a LI-COR DNA Analyzer ranged from two to eight. Successful cross-amplification for 12 primer pairs studied in detail using DNA from downy mildews that attack maize (P. maydis & P. philippinensis), sugar cane (P. sacchari), pearl millet (Sclerospora graminicola) and rose (Peronospora sparsa) indicate that the flanking regions are conserved in all these species. A total of 15 SSR amplicons unique to P. philippinensis (one of the potential threats to US maize production) were detected, and these have potential for development of diagnostic tests. A total of 260 alleles were obtained using 54 microsatellite primer combinations, with an average of 4.8 polymorphic markers per SSR across 34

  18. Simple sequence repeat markers useful for sorghum downy mildew (Peronosclerospora sorghi) and related species. (United States)

    Perumal, Ramasamy; Nimmakayala, Padmavathi; Erattaimuthu, Saradha R; No, Eun-Gyu; Reddy, Umesh K; Prom, Louis K; Odvody, Gary N; Luster, Douglas G; Magill, Clint W


    A recent outbreak of sorghum downy mildew in Texas has led to the discovery of both metalaxyl resistance and a new pathotype in the causal organism, Peronosclerospora sorghi. These observations and the difficulty in resolving among phylogenetically related downy mildew pathogens dramatically point out the need for simply scored markers in order to differentiate among isolates and species, and to study the population structure within these obligate oomycetes. Here we present the initial results from the use of a biotin capture method to discover, clone and develop PCR primers that permit the use of simple sequence repeats (microsatellites) to detect differences at the DNA level. Among the 55 primer pairs designed from clones from pathotype 3 of P. sorghi, 36 flanked microsatellite loci containing simple repeats, including 28 (55%) with dinucleotide repeats and 6 (11%) with trinucleotide repeats. A total of 22 microsatellites with CA/AC or GT/TG repeats were the most abundant (40%), and GA/AG or CT/TC types contribute 15% in our collection. When used to amplify DNA from 19 isolates from P. sorghi, as well as from 5 related species that cause downy mildew on other hosts, the number of different bands detected for each SSR primer pair using a LI-COR DNA Analyzer ranged from two to eight. Successful cross-amplification for 12 primer pairs studied in detail using DNA from downy mildews that attack maize (P. maydis & P. philippinensis), sugar cane (P. sacchari), pearl millet (Sclerospora graminicola) and rose (Peronospora sparsa) indicates that the flanking regions are conserved in all these species. A total of 15 SSR amplicons unique to P. philippinensis (one of the potential threats to US maize production) were detected, and these have potential for development of diagnostic tests. A total of 260 alleles were obtained using 54 microsatellite primer combinations, with an average of 4.8 polymorphic markers per SSR across 34 Peronosclerospora, Peronospora and Sclerospora

  19. Direct repeat sequences are essential for function of the cis-acting locus of transfer (clt) of Streptomyces phaeochromogenes plasmid pJV1. (United States)

    Franco, Bernardo; González-Cerón, Gabriela; Servín-González, Luis


    The functionality of direct and inverted repeat sequences inside the cis acting locus of transfer (clt) of the Streptomyces plasmid pJV1 was determined by testing the effect of different deletions on plasmid transfer. The results show that the single most important element for pJV1 clt function is a series of evenly spaced 9 bp long direct repeats which match the consensus CCGCACA(C/G)(C/G), since their deletion caused a dramatic reduction in plasmid transfer. The presence of these repeats in the absence of any other clt sequences allowed plasmid transfer to occur at a frequency that was at least two orders of magnitude higher than that obtained in the complete absence of clt. A database search revealed regions with a similar organization, and in the same position, in Streptomyces plasmids pSN22 and pSLS, which have transfer proteins homologous to those of pJV1.
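    The consensus quoted in the abstract above, CCGCACA(C/G)(C/G), can be expressed directly as a pattern for scanning a sequence. The following is a minimal sketch, not the authors' procedure: the function names and the toy sequence are hypothetical, and the toy sequence is not the real pJV1 clt sequence. It reports each match and the spacing between successive matches, which is one way to check whether repeats are "evenly spaced".

```python
import re

# Consensus from the abstract: CCGCACA(C/G)(C/G), a 9 bp direct repeat.
# A lookahead is used so that overlapping matches would also be reported.
CLT_REPEAT = re.compile(r"(?=(CCGCACA[CG][CG]))")


def find_direct_repeats(seq):
    """Return (0-based start, matched 9-mer) for every consensus match."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in CLT_REPEAT.finditer(seq)]


def spacings(hits):
    """Distances (bp) between the start positions of successive repeats."""
    starts = [start for start, _ in hits]
    return [b - a for a, b in zip(starts, starts[1:])]


# Hypothetical toy sequence for illustration only.
toy = "AAACCGCACACGTTTTTCCGCACAGCTTTTTCCGCACACC"
hits = find_direct_repeats(toy)
print(hits)
print(spacings(hits))
```

    Equal values in the spacing list indicate evenly spaced repeats; the sketch searches one strand only.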

  20. Generating markers based on biotic stress of protein systemin and tandem repeats sequence for Aquilaria sp

    International Nuclear Information System (INIS)

    Azhar Mohamad; Muhammad Hanif Azhari N; Siti Norhayati Ismail


    Aquilaria sp. belongs to the Thymelaeaceae family and is widely distributed across Asia. The species has multiple uses from root to shoot and is an economically important crop, which has generated wide interest in understanding its genetic diversity. Knowledge of DNA-based markers has become a prerequisite for more effective application of molecular marker techniques in breeding and mapping programs. In this work, both targeted genes and tandem repeat sequences were used for DNA fingerprinting in Aquilaria sp. A total of 100 ISSR (inter simple sequence repeat) primers and 50 combination pairs of specific primers derived from a conserved region of a specific protein known as systemin were optimized. Thirty-eight ISSR primers, drawn from both specific and degenerate ISSR primers, proved suitable for polymorphism evaluation. One optimal combination of systemin primers showed significant results in distinguishing Aquilaria sp. In conclusion, polymorphism derived from ISSR profiling and from the targeted stress gene for the protein systemin proved to be a powerful approach for identification and molecular classification of Aquilaria sp., which will be useful for identifying any mutant lines derived from nature. (author)

  1. Genome-wide identification and validation of simple sequence repeats (SSRs) from Asparagus officinalis. (United States)

    Li, Shufen; Zhang, Guojun; Li, Xu; Wang, Lianjun; Yuan, Jinhong; Deng, Chuanliang; Gao, Wujun


    Garden asparagus (Asparagus officinalis), an important vegetable cultivated worldwide, can also serve as a model dioecious plant species in the study of sex determination and sex chromosome evolution. However, limited DNA marker resources have been developed and used for this species. To expand these resources, we examined the DNA sequences for simple sequence repeats (SSRs) in 163,406 scaffolds representing approximately 400 Mbp of the A. officinalis genome. A total of 87,576 SSRs were identified in 59,565 scaffolds. The most abundant SSR repeats were trinucleotide and tetranucleotide, accounting for 29.2 and 29.1% of the total SSRs, respectively, followed by di-, penta-, hexa-, hepta-, and octanucleotides. The AG motif was most common among dinucleotides and was also the most frequent motif in the entire A. officinalis genome, representing 14.7% of all SSRs. A total of 41,917 SSR primers pairs were designed to amplify SSRs. Twenty-two genomic SSR markers were tested in 39 asparagus accessions belonging to ten cultivars and one accession of Asparagus setaceus for determination of genetic diversity. The intra-species polymorphism information content (PIC) values of the 22 genomic SSR markers were intermediate, with an average of 0.41. The genetic diversity between the ten A. officinalis cultivars was low, and the UPGMA dendrogram was largely unrelated to cultivars. It is here suggested that the sex of individuals is an important factor influencing the clustering results. The information reported here provides new information about the organization of the microsatellites in A. officinalis genome and lays a foundation for further genetic studies and breeding applications of A. officinalis and related species. Copyright © 2016 Elsevier Ltd. All rights reserved.
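    Genome-wide SSR identification of the kind described above is typically a scan for short motifs repeated in tandem. The sketch below is an illustration under stated assumptions, not the pipeline used in the study: the function name, the minimum repeat counts per motif length, and the toy sequence are all hypothetical, and only perfect (uninterrupted) repeats with motifs of 2 to 4 bp are reported by default.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Find perfect SSRs; return sorted (0-based start, motif, copy number).

    min_repeats maps motif length -> minimum number of tandem copies.
    The defaults are illustrative assumptions, not published thresholds.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # One motif of length k followed by at least n-1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), which belong to a shorter motif class.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical toy sequence: (AG)7 and (CAT)5 embedded in flanking bases.
toy = "GGC" + "AG" * 7 + "TTC" + "CAT" * 5 + "GA"
print(find_ssrs(toy))
```

    Counting the returned hits by motif length gives the kind of di-/tri-/tetranucleotide breakdown reported in the abstract; longer motif classes can be added through `min_repeats`.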

  2. Outline of a genome navigation system based on the properties of GA-sequences and their flanks.

    Directory of Open Access Journals (Sweden)

    Guenter Albrecht-Buehler

    Full Text Available Introducing a new method to visualize large stretches of genomic DNA (see Appendix S1), the article reports that most GA-sequences [1] shared chains of tetra-GA-motifs and contained upstream poly(A)-segments. Although not integral parts of them, Alu-elements were found immediately upstream of all human and chimpanzee GA-sequences with an upstream poly(A)-segment. The article hypothesizes that genome navigation uses these properties of GA-sequences in the following way. (1) Poly(A) binding proteins interact with the upstream poly(A)-segments and arrange adjacent GA-sequences side-by-side ('GA-ribbon'), while folding the intervening DNA sequences between them into loops ('associated DNA-loops'). (2) Genome navigation uses the GA-ribbon as a search path for specific target genes that is up to 730-fold shorter than the full-length chromosome. (3) As to the specificity of the search, each molecule of a target protein is assumed to catalyze the formation of specific oligomers from a set of transcription factors that recognize tetra-GA-motifs. Their specific combinations of tetra-GA motifs are assumed to be present in the particular GA-sequence whose associated loop contains the gene for the target protein. As long as the target protein is abundant in the cell it produces sufficient numbers of such oligomers which bind to their specific GA-sequences and, thereby, inhibit locally the transcription of the target protein in the associated loop. However, if the amount of target protein drops below a certain threshold, the resultant reduction of specific oligomers leaves the corresponding GA-sequence 'denuded'. In response, the associated DNA-loop releases its nucleosomes and allows transcription of the target protein to proceed. (4) The Alu-transcripts may help control the general background of protein synthesis proportional to the number of transcriptionally active associated loops, especially in stressed cells. (5) The model offers a new mechanism of co-regulation of


    Directory of Open Access Journals (Sweden)

    Maribel Quintero


    Full Text Available Acute leukemias are malignant proliferations of immature hematopoietic cells of the blastic type, whose progressive accumulation is accompanied by a decrease in the production of normal myeloid elements. Transcriptional inactivation of tumor suppressor genes by hypermethylation of CpG islands in promoter regions has been a focus of researchers as a causal factor in hematological malignancies. The purpose of this study was to determine hypermethylated regions in chromosome spreads using Alu I and to relate these regions to the sites of tumor suppressor genes associated with acute leukemia. Of 30 bone marrow samples analyzed, 18 were diagnosed with Acute Myeloid Leukemia and Acute Lymphoid Leukemia, and 12 underwent cell culture. Chromosome spreads were stained with Giemsa after digestion with the enzyme Alu I. In patients with acute myeloid leukemia and acute lymphoid leukemia, 16/18 (88%) and 12/12 (100%) had abnormally stained regions, with four and three single methylated regions observed in acute myeloid leukemia and acute lymphoid leukemia, respectively. No association with methylated genes was found in the literature for these regions, which was highly significant (p < 0.01) in both conditions. This shows the usefulness of this technique for the identification of methylated areas, since they have provided the foundation and the molecular basis for a better targeted therapeutic approach with demethylating agents, both in acute leukemias and myelodysplastic syndromes.

  4. Transposable Elements: No More 'Junk DNA'

    Directory of Open Access Journals (Sweden)

    Yun-Ji Kim


    Full Text Available Since the advent of whole-genome sequencing, transposable elements (TEs), once thought to be just 'junk' DNA, have attracted attention because of their numerous copies in various eukaryotic genomes. Many studies of TEs have been conducted to discover their functions in their host genomes. Based on the results of those studies, it has been generally accepted that they cause genomic and genetic variation. However, their many functions are not fully elucidated. Through various mechanisms, including de novo TE insertions, TE insertion-mediated deletions, and recombination events, they manipulate their host genomes. In this review, we focus on Alu, L1, human endogenous retrovirus, and short interspersed element/variable number of tandem repeats/Alu (SVA) elements and discuss how they have affected primate genomes, especially the human and chimpanzee genomes, since their divergence.

  5. Autosomal and X chromosome Alu insertions in Bolivian Aymaras and Quechuas: two languages and one genetic pool. (United States)

    Gayà-Vidal, Magdalena; Dugoujon, Jean-Michel; Esteban, Esther; Athanasiadis, Georgios; Rodríguez, Armando; Villena, Mercedes; Vasquez, René; Moral, Pedro


    Thirty-two polymorphic Alu insertions (18 autosomal and 14 from the X chromosome) were studied in 192 individuals from two Amerindian populations of the Bolivian Altiplano (Aymara and Quechua speakers: the two main Andean linguistic groups), to provide relevant information about their genetic relationships and demographic processes. The main objective was to determine from genetic data whether the expansion of the Quechua language into Bolivia could be associated with demographic (Inca migration of Quechua-speakers from Peru into Bolivia) or cultural (language imposition by the Inca Empire) processes. Allele frequencies were used to assess the genetic relationships between these two linguistic groups. Our results indicated that the two Bolivian samples showed a high genetic similarity for both sets of markers and were clearly differentiated from the two Peruvian Quechua samples available in the literature. Additionally, our data were compared with the available literature to determine the genetic and linguistic structure, and East-West differentiation in South America. The close genetic relationship between the two Bolivian samples and their differentiation from the Quechua-speakers from Peru suggests that the Quechua language expansion in Bolivia took place without any important demographic contribution. Moreover, no clear geographical or linguistic structure was found for the Alu variation among South Amerindians. (c) 2009 Wiley-Liss, Inc.

  6. A theory that may explain the Hayflick limit--a means to delete one copy of a repeating sequence during each cell cycle in certain human cells such as fibroblasts. (United States)

    Naveilhan, P; Baudet, C; Jabbour, W; Wion, D


    A model that may explain the limited division potential of certain cells such as human fibroblasts in culture is presented. The central postulate of this theory is that there exists, prior to certain key exons that code for materials needed for cell division, a unique sequence of specific repeating segments of DNA. One copy of such repeating segments is deleted during each cell cycle in cells that are not protected from such deletion through methylation of their cytosine residues. According to this theory, the means through which such repeated sequences are removed, one per cycle, is through the sequential action of enzymes that act much as bacterial restriction enzymes do--namely to produce scissions in both strands of DNA in areas that correspond to the DNA base sequence recognition specificities of such enzymes. After the first scission early in a replicative cycle, that enzyme becomes inhibited, but the cleavage of the first site exposes the closest site in the repetitive element to the action of a second restriction enzyme after which that enzyme also becomes inhibited. Then repair occurs, regenerating the original first site. Through this sequential activation and inhibition of two different restriction enzymes, only one copy of the repeating sequence is deleted during each cell cycle. In effect, the repeating sequence operates as a precise counter of the numbers of cell doubling that have occurred since the cells involved differentiated during development.

  7. Identification and Mapping of Simple Sequence Repeat Markers from Common Bean (Phaseolus vulgaris L.) Bacterial Artificial Chromosome End Sequences for Genome Characterization and Genetic–Physical Map Integration

    Directory of Open Access Journals (Sweden)

    Juana M. Córdoba


    Full Text Available Microsatellite markers or simple sequence repeat (SSR) loci are useful for diversity characterization and genetic–physical mapping. Different in silico microsatellite search methods have been developed for mining bacterial artificial chromosome (BAC) end sequences for SSRs. The overall goal of this study was genome characterization based on SSRs in 89,017 BAC end sequences (BESs) from the G19833 common bean (Phaseolus vulgaris L.) library. Another objective was to identify new SSRs taking into account three tandem motif identification programs (Automated Microsatellite Marker Development [AMMD], Tandem Repeats Finder [TRF], and SSRLocator [SSRL]). Among the microsatellite search engines, SSRL identified the highest number of SSRs; however, when primer design was attempted, the number dropped due to poor primer design regions. Automated Microsatellite Marker Development software identified many SSRs with valuable AT/TA or AG/TC motifs, while TRF found fewer SSRs and produced no primers. A subgroup of 323 AT-rich, di-, and trinucleotide SSRs were selected from the AMMD results and used in a parental survey with DOR364 and G19833, of which 75 could be mapped in the corresponding population; these represented 4052 BAC clones. Together with 92 previously mapped BES- and 114 non-BES-derived markers, a total of 280 SSRs were included in the polymerase chain reaction (PCR)-based map, integrating a total of 8232 BAC clones in 162 contigs from the physical map.

  8. Structure, organization, and sequence of alpha satellite DNA from human chromosome 17: evidence for evolution by unequal crossing-over and an ancestral pentamer repeat shared with the human X chromosome. (United States)

    Waye, J S; Willard, H F


    The centromeric regions of all human chromosomes are characterized by distinct subsets of a diverse tandemly repeated DNA family, alpha satellite. On human chromosome 17, the predominant form of alpha satellite is a 2.7-kilobase-pair higher-order repeat unit consisting of 16 alphoid monomers. We present the complete nucleotide sequence of the 16-monomer repeat, which is present in 500 to 1,000 copies per chromosome 17, as well as that of a less abundant 15-monomer repeat, also from chromosome 17. These repeat units were approximately 98% identical in sequence, differing by the exclusion of precisely 1 monomer from the 15-monomer repeat. Homologous unequal crossing-over is suggested as a probable mechanism by which the different repeat lengths on chromosome 17 were generated, and the putative site of such a recombination event is identified. The monomer organization of the chromosome 17 higher-order repeat unit is based, in part, on tandemly repeated pentamers. A similar pentameric suborganization has been previously demonstrated for alpha satellite of the human X chromosome. Despite the organizational similarities, substantial sequence divergence distinguishes these subsets. Hybridization experiments indicate that the chromosome 17 and X subsets are more similar to each other than to the subsets found on several other human chromosomes. We suggest that the chromosome 17 and X alpha satellite subsets may be related components of a larger alphoid subfamily which have evolved from a common ancestral repeat into the contemporary chromosome-specific subsets.

  9. t2prhd: a tool to study the patterns of repeat evolution

    Directory of Open Access Journals (Sweden)

    Pénzes Zsolt


    Full Text Available Abstract. Background: The models developed to characterize the evolution of multigene families (such as the birth-and-death and the concerted models) have also been applied on the level of sequence repeats inside a gene/protein. Phylogenetic reconstruction is the method of choice to study the evolution of gene families and also sequence repeats in the light of these models. The characterization of the gene family evolution in view of the evolutionary models is done by the evaluation of the clustering of the sequences with the originating loci in mind. As the locus represents positional information, it is straightforward that in the case of the repeats the exact position in the sequence should be used, as the simple numbering according to repeat order can be misleading. Results: We have developed a novel rapid visual approach to study repeat evolution that takes into account the exact repeat position in a sequence. The "pairwise repeat homology diagram" visualizes sequence repeats detected by a profile HMM in a pair of sequences and highlights their homology relations inferred by a phylogenetic tree. The method is implemented in a Perl script (t2prhd) that is available for download and is also accessible as an online tool. The power of the method is demonstrated on the EGF-like and fibronectin-III-like (Fn-III) domain repeats of three selected mammalian Tenascin sequences. Conclusion: Although pairwise repeat homology diagrams do not carry all the information provided by the phylogenetic tree, they allow a rapid and intuitive assessment of repeat evolution. We believe that t2prhd is a helpful tool with which to study the pattern of repeat evolution. This method can be particularly useful in cases of large datasets (such as large gene families), as the command line interface makes it possible to automate the generation of pairwise repeat homology diagrams with the aid of scripts.

  10. Sequence determinants of human microsatellite variability

    Directory of Open Access Journals (Sweden)

    Jakobsson Mattias


    Full Text Available Abstract. Background: Microsatellite loci are frequently used in genomic studies of DNA sequence repeats and in population studies of genetic variability. To investigate the effect of sequence properties of microsatellites on their level of variability we have analyzed genotypes at 627 microsatellite loci in 1,048 worldwide individuals from the HGDP-CEPH cell line panel together with the DNA sequences of these microsatellites in the human RefSeq database. Results: Calibrating PCR fragment lengths in individual genotypes by using the RefSeq sequence enabled us to infer repeat number in the HGDP-CEPH dataset and to calculate the mean number of repeats (as opposed to the mean PCR fragment length), under the assumption that differences in PCR fragment length reflect differences in the numbers of repeats in the embedded repeat sequences. We find the mean and maximum numbers of repeats across individuals to be positively correlated with heterozygosity. The size and composition of the repeat unit of a microsatellite are also important factors in predicting heterozygosity, with tetra-nucleotide repeat units high in G/C content leading to higher heterozygosity. Finally, we find that microsatellites containing more separate sets of repeated motifs generally have higher heterozygosity. Conclusions: These results suggest that sequence properties of microsatellites have a significant impact in determining the features of human microsatellite variability.
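    The calibration step described above rests on one stated assumption: differences in PCR fragment length reflect differences in the number of repeats. Under that assumption, a repeat number can be inferred from a fragment length and a reference sequence as sketched below. This is a minimal illustration, not the authors' code; the function name, parameter names and the numbers in the example are hypothetical.

```python
def infer_repeat_number(fragment_len, ref_fragment_len, ref_repeats, unit_len):
    """Infer the repeat count of an allele from its PCR fragment length.

    ref_fragment_len: fragment length (bp) predicted from the reference sequence
    ref_repeats:      number of repeat units counted in the reference sequence
    unit_len:         length (bp) of one repeat unit
    Returns None when the length difference is not a whole number of units,
    i.e. when the assumption of pure repeat-number variation is violated.
    """
    diff = fragment_len - ref_fragment_len
    if diff % unit_len != 0:
        return None
    return ref_repeats + diff // unit_len


# Hypothetical tetranucleotide locus: the reference fragment is 200 bp
# and contains 12 repeat units.
print(infer_repeat_number(208, 200, 12, 4))
print(infer_repeat_number(192, 200, 12, 4))
print(infer_repeat_number(198, 200, 12, 4))
```

    Averaging the inferred values over individuals gives a mean number of repeats rather than a mean fragment length, which is the quantity the abstract correlates with heterozygosity.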

  11. Microlunatus cavernae sp. nov., a novel actinobacterium isolated from Alu ancient cave, Yunnan, South-West China. (United States)

    Cheng, Juan; Chen, Wei; Huo-Zhang, Bing; Nimaichand, Salam; Zhou, En-Min; Lu, Xin-Hua; Klenk, Hans-Peter; Li, Wen-Jun


    A Gram-positive, coccoid, non-endospore-forming actinobacterium, designated YIM C01117(T), was isolated from a soil sample collected from Alu ancient cave, Yunnan province, south-west China. Based on the 16S rRNA gene sequence analysis, strain YIM C01117(T) was shown to belong to the genus Microlunatus, with highest sequence similarity of 97.4 % to Microlunatus soli DSM 21800(T). The whole genomic DNA relatedness as shown by the DNA-DNA hybridization study between YIM C01117(T) and M. soli DSM 21800(T) had a low value (47 ± 2 %). Strain YIM C01117(T) was determined to contain LL-diaminopimelic acid with Gly, Glu and Ala amino acids (A3γ' type) in the cell wall. Whole-cell hydrolysates were found to contain glucose, galactose, mannose and ribose. The major polar lipids were determined to be phosphatidylglycerol and diphosphatidylglycerol. The predominant menaquinone system present is MK-9(H4), while the major fatty acids were identified to be anteiso-C15:0 (24.1 %), iso-C16:0 (22.3 %) and iso-C15:0 (11.4 %). The G+C content of the genomic DNA was determined to be 65.9 mol%. The chemotaxonomic and genotypic data support the affiliation of the strain YIM C01117(T) to the genus Microlunatus. The results of physiological and biochemical tests allow strain YIM C01117(T) to be differentiated phenotypically from recognized Microlunatus species. Strain YIM C01117(T) is therefore considered to represent a novel species of the genus Microlunatus, for which the name Microlunatus cavernae sp. nov. is proposed. The type strain is YIM C01117(T) (= DSM 26248(T) = JCM 18536(T)).

  12. Development of Simple Sequence Repeats (SSR) markers in Setaria italica (Poaceae) and cross-amplification in related species. (United States)

    Lin, Heng-Sheng; Chiang, Chih-Yun; Chang, Song-Bin; Kuoh, Chang-Sheng


    Foxtail millet is one of the world's oldest cultivated crops. It has been adopted as a model organism for providing a deeper understanding of plant biology. In this study, 45 simple sequence repeat (SSR) markers of Setaria italica were developed. The markers showing polymorphism were screened in 223 samples from 12 foxtail millet populations around Taiwan. The most common dinucleotide and trinucleotide repeat motifs are AC/TG (84.21%) and CAT (46.15%), respectively. The average number of alleles (N(a)) and the average observed (H(o)) and expected (H(e)) heterozygosities are 3.73, 0.714 and 0.587, respectively. In addition, 24 SSR markers showed transferability to six related Poaceae species. These new markers provide tools for examining genetic relatedness among foxtail millet populations and other related species. They are suitable for germplasm management and protection in Poaceae.
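    Marker statistics such as expected heterozygosity (H(e)) and polymorphism information content (PIC), which several of the SSR records in this listing report, are simple functions of allele frequencies. The sketch below uses the textbook definitions, H(e) = 1 - sum(p_i^2) and PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. It is an assumption that these uncorrected forms match what any given study used (some apply a small-sample correction), and the function names and example frequencies are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """H(e) = 1 - sum of squared allele frequencies (no sample-size correction)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content from allele frequencies."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical marker with two equally frequent alleles.
freqs = [0.5, 0.5]
print(expected_heterozygosity(freqs))
print(pic(freqs))
```

    For a given marker PIC is never larger than H(e), because the cross term is non-negative.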

  13. Detection, characterization and evolution of internal repeats in Chitinases of known 3-D structure.

    Directory of Open Access Journals (Sweden)

    Manigandan Sivaji

    Full Text Available Chitinase proteins have evolved and diversified almost in all organisms ranging from prokaryotes to eukaryotes. During evolution, internal repeats may appear in amino acid sequences of proteins which alter the structural and functional features. Here we deciphered the internal repeats from Chitinase and characterized the structural similarities between them. Out of 24 diverse Chitinase sequences selected, six sequences (2CJL, 2DSK, 2XVP, 2Z37, 3EBV and 3HBE did not contain any internal repeats of amino acid sequences. Ten sequences contained repeats of length <50, and the remaining 8 sequences contained repeat length between 50 and 100 residues. Two Chitinase sequences, 1ITX and 3SIM, were found to be structurally similar when analyzed using secondary structure of Chitinase from secondary and 3-Dimensional structure database of Protein Data Bank. Internal repeats of 3N17 and 1O6I were also involved in the ligand-binding site of those Chitinase proteins, respectively. Our analyses enhance our understanding towards the identification of structural characteristics of internal repeats in Chitinase proteins.

  14. A comprehensive characterization of simple sequence repeats in pepper genomes provides valuable resources for marker development in Capsicum. (United States)

    Cheng, Jiaowen; Zhao, Zicheng; Li, Bo; Qin, Cheng; Wu, Zhiming; Trejo-Saavedra, Diana L; Luo, Xirong; Cui, Junjie; Rivera-Bustamante, Rafael F; Li, Shuaicheng; Hu, Kailin


    The sequences of the full set of pepper genomes, including nuclear, mitochondrial and chloroplast, are now available for use. However, the overall distribution of simple sequence repeats (SSRs) in these genomes and their practical implications for molecular marker development in Capsicum have not yet been described. Here, an average of 868,047.50, 45.50 and 30.00 SSR loci were identified in the nuclear, mitochondrial and chloroplast genomes of pepper, respectively. Subsequently, systematic comparisons of various species, genome types, motif lengths, repeat numbers and classified types were executed and discussed. In addition, a local database composed of 113,500 in silico unique SSR primer pairs was built using a homemade bioinformatics workflow. As a pilot study, 65 polymorphic markers were validated among a wide collection of 21 Capsicum genotypes, with allele number and polymorphic information content value per marker ranging from 2 to 6 and 0.05 to 0.64, respectively. Finally, a comparison of the clustering results with those of a previous study indicated the usability of the newly developed SSR markers. In summary, this first report on the comprehensive characterization of SSR motifs in pepper genomes and the very large set of SSR primer pairs will benefit various genetic studies in Capsicum.

  15. Analysis of simple sequence repeats in rice bean (Vigna umbellata) using an SSR-enriched library

    Directory of Open Access Journals (Sweden)

    Lixia Wang


    Full Text Available Rice bean (Vigna umbellata Thunb.), a warm-season annual legume, is grown in Asia mainly for dried grain or fodder and plays an important role in human and animal nutrition because the grains are rich in protein and some essential fatty acids and minerals. With the aim of expediting the genetic improvement of rice bean, we initiated a project to develop genomic resources and tools for molecular breeding in this little-known but important crop. Here we report the construction of an SSR-enriched genomic library from DNA extracted from pooled young leaf tissues of 22 rice bean genotypes and the development of SSR markers. In 433,562 reads generated by a Roche 454 GS-FLX sequencer, we identified 261,458 SSRs, of which 48.8% were of compound form. Dinucleotide repeats were predominant with an absolute proportion of 81.6%, followed by trinucleotides (17.8%). Other types together accounted for 0.6%. The motif AC/GT accounted for 77.7% of the total, followed by AAG/CTT (14.3%), and all others accounted for 12.0%. Among the flanking sequences, 2928 matched putative genes or gene models in the protein database of Arabidopsis thaliana, corresponding with 608 non-redundant Gene Ontology terms. Of these sequences, 11.2% were involved in cellular components, 24.2% were involved in molecular functions, and 64.6% were associated with biological processes. Based on homolog analysis, 1595 flanking sequences were similar to mung bean and 500 to common bean genomic sequences. Comparative mapping was conducted using 350 sequences homologous to both mung bean and common bean sequences. Finally, a set of primer pairs was designed, and a validation test showed that 58 of 220 new primers can be used in rice bean and 53 can be transferred to mung bean. However, only 11 were polymorphic when tested on 32 rice bean varieties. We propose that this study lays the groundwork for developing novel SSR markers and will enhance the mapping of qualitative and quantitative traits and marker

  16. Single Strand Annealing Plays a Major Role in RecA-Independent Recombination between Repeated Sequences in the Radioresistant Deinococcus radiodurans Bacterium.

    Directory of Open Access Journals (Sweden)

    Solenne Ithurbide


    Full Text Available The bacterium Deinococcus radiodurans is one of the most radioresistant organisms known. It is able to reconstruct a functional genome from hundreds of radiation-induced chromosomal fragments. Our work aims to highlight the genes involved in recombination between 438 bp direct repeats separated by intervening sequences of various lengths ranging from 1,479 bp to 10,500 bp to restore a functional tetA gene in the presence or absence of radiation-induced DNA double strand breaks. The frequency of spontaneous deletion events between the chromosomal direct repeats was the same in recA+ and in ΔrecA, ΔrecF, and ΔrecO bacteria, whereas recombination between chromosomal and plasmid DNA was shown to be strictly dependent on the RecA and RecF proteins. The presence of mutations in one of the repeated sequences reduced, in a MutS-dependent manner, the frequency of the deletion events. The distance between the repeats did not influence the frequencies of deletion events in recA+ as well as in ΔrecA bacteria. The absence of the UvrD protein stimulated the recombination between the direct repeats, whereas the absence of the DdrB protein, previously shown to be involved in DNA double strand break repair through a single strand annealing (SSA) pathway, strongly reduced the frequency of RecA- (and RecO-) independent deletion events. The absence of the DdrB protein also increased the lethal sectoring of cells devoid of RecA or RecO protein. γ-irradiation of recA+ cells increased about 10-fold the frequencies of the deletion events, but to a lesser extent in cells devoid of the DdrB protein. Altogether, our results suggest a major role of single strand annealing in DNA repeat deletion events in bacteria devoid of the RecA protein, and also in recA+ bacteria exposed to ionizing radiation.

  17. Functional role of a highly repetitive DNA sequence in anchorage of the mouse genome. (United States)

    Neuer-Nitsche, B; Lu, X N; Werner, D


    The major portion of the eukaryotic genome consists of various categories of repetitive DNA sequences which have been studied with respect to their base compositions, organizations, copy numbers, transcription and species specificities; their biological roles, however, are still unclear. A novel quality of a highly repetitive mouse DNA sequence is described which points to a functional role: All copies (approximately 50,000 per haploid genome) of this DNA sequence reside on genomic Alu I DNA fragments each associated with nuclear polypeptides that are not released from DNA by proteinase K, SDS and phenol extraction. By this quality the repetitive DNA sequence is classified as a member of the sub-set of DNA sequences involved in tight DNA-polypeptide complexes which have been previously shown to be components of the subnuclear structure termed 'nuclear matrix'. From these results it has to be concluded that the repetitive DNA sequence characterized in this report represents or comprises a signal for a large number of site specific attachment points of the mouse genome in the nuclear matrix.

  18. Automated genotyping of dinucleotide repeat markers

    Energy Technology Data Exchange (ETDEWEB)

    Perlin, M.W.; Hoffman, E.P. [Carnegie Mellon Univ., Pittsburgh, PA (United States)]|[Univ. of Pittsburgh, PA (United States)


    The dinucleotide repeats (i.e., microsatellites) such as CA-repeats are a highly polymorphic, highly abundant class of PCR-amplifiable markers that have greatly streamlined genetic mapping experimentation. It is expected that over 30,000 such markers (including tri- and tetranucleotide repeats) will be characterized for routine use in the next few years. Since only size determination, and not sequencing, is required to determine alleles, in principle, dinucleotide repeat genotyping is easily performed on electrophoretic gels, and can be automated using DNA sequencers. Unfortunately, PCR stuttering with these markers generates not one band for each allele, but a pattern of bands. Since closely spaced alleles must be disambiguated by human scoring, this poses a key obstacle to full automation. We have developed methods that overcome this obstacle. Our model is that the observed data is generated by arithmetic superposition (i.e., convolution) of multiple allele patterns. By quantitatively measuring the size of each component band, and exploiting the unique stutter pattern associated with each marker, closely spaced alleles can be deconvolved; this unambiguously reconstructs the "true" allele bands, with stutter artifact removed. We used this approach in a system for automated diagnosis of (X-linked) Duchenne muscular dystrophy; four multiplexed CA-repeats within the dystrophin gene were assayed on a DNA sequencer. Our method accurately detected small variations in gel migration that shifted the allele size estimate. In 167 nonmutated alleles, 89% (149/167) showed no size variation, 9% (15/167) showed 1 bp variation, and 2% (3/167) showed 2 bp variation. We are currently developing a library of dinucleotide repeat patterns; together with our deconvolution methods, this library will enable fully automated genotyping of dinucleotide repeats from sizing data.
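
    The convolution model described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that band intensities are binned at fixed size steps, that the stutter pattern for the marker is already known, and that stutter only produces fragments shorter than the true allele. The function names and the example pattern are hypothetical.

```python
def convolve_stutter(alleles, stutter):
    """Forward model: an allele peak at index i adds stutter[k] times its
    height to index i - k (stutter products are assumed to be shorter)."""
    observed = [0.0] * len(alleles)
    for i, height in enumerate(alleles):
        for k, weight in enumerate(stutter):
            if i - k >= 0:
                observed[i - k] += height * weight
    return observed


def deconvolve_stutter(observed, stutter):
    """Invert the forward model by peeling from the longest fragment down."""
    n = len(observed)
    alleles = [0.0] * n
    for i in range(n - 1, -1, -1):
        # Remove stutter contributed by longer alleles that are already resolved.
        spill = sum(alleles[i + k] * stutter[k]
                    for k in range(1, len(stutter)) if i + k < n)
        # Clamp at zero so that noise cannot produce negative band heights.
        alleles[i] = max(0.0, (observed[i] - spill) / stutter[0])
    return alleles


# Hypothetical example: two alleles one repeat unit apart.
true_alleles = [0.0, 0.0, 1.0, 0.8, 0.0]
pattern = [1.0, 0.3, 0.1]  # main band, then first and second stutter bands
bands = convolve_stutter(true_alleles, pattern)
recovered = deconvolve_stutter(bands, pattern)
```

    Peeling from the longest fragment works only because of the one-sided stutter assumption; with noisy data or stutter in both directions, a least-squares fit would be the more natural choice.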

  19. Development of Simple Sequence Repeats (SSR) Markers in Setaria italica (Poaceae) and Cross-Amplification in Related Species

    Directory of Open Access Journals (Sweden)

    Chih-Yun Chiang


    Foxtail millet is one of the world's oldest cultivated crops. It has been adopted as a model organism for providing a deeper understanding of plant biology. In this study, 45 simple sequence repeat (SSR) markers of Setaria italica were developed. These markers showing polymorphism were screened in 223 samples from 12 foxtail millet populations around Taiwan. The most common dinucleotide and trinucleotide repeat motifs are AC/TG (84.21%) and CAT (46.15%). The average number of alleles (Na) and the average observed (Ho) and expected (He) heterozygosities are 3.73, 0.714, and 0.587, respectively. In addition, 24 SSR markers showed transferability to six related Poaceae species. These new markers provide tools for examining genetic relatedness among foxtail millet populations and other related species. They are suitable for germplasm management and protection in Poaceae.
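
    The summary statistics quoted above (Na, Ho, He) can be computed per locus as in the following sketch. It assumes diploid genotypes and uses the simple estimator He = 1 - sum of squared allele frequencies; the abstract does not say which estimator or software the authors used, and the function name and sample genotypes are hypothetical.

```python
from collections import Counter


def locus_diversity(genotypes):
    """genotypes: list of (allele_a, allele_b) pairs for one diploid locus.
    Returns (number of alleles Na, observed Ho, expected He)."""
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies pooled over both chromosome copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return len(counts), ho, he


# Hypothetical genotypes scored as fragment sizes in bp.
na, ho, he = locus_diversity([(150, 150), (150, 152), (152, 152), (150, 152)])
```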

  20. The complete chloroplast genome sequence of Mahonia bealei (Berberidaceae) reveals a significant expansion of the inverted repeat and phylogenetic relationship with other angiosperms. (United States)

    Ma, Ji; Yang, Bingxian; Zhu, Wei; Sun, Lianli; Tian, Jingkui; Wang, Xumin


    Mahonia bealei (Berberidaceae) is a frequently-used traditional Chinese medicinal plant with efficient anti-inflammatory ability. This plant is one of the sources of berberine, a new cholesterol-lowering drug with anti-diabetic activity. We have sequenced the complete nucleotide sequence of the chloroplast (cp) genome of M. bealei. The complete cp genome of M. bealei is 164,792 bp in length, and has a typical structure with large (LSC 73,052 bp) and small (SSC 18,591 bp) single-copy regions separated by a pair of inverted repeats (IRs 36,501 bp) of large size. The Mahonia cp genome contains 111 unique genes and 39 genes are duplicated in the IR regions. The gene order and content of M. bealei are almost unrearranged, which is consistent with the hypothesis that large IRs stabilize the cp genome and reduce gene loss-and-gain probabilities during the evolutionary process. A large IR expansion of over 12 kb has occurred in M. bealei; 15 genes (rps19, rpl22, rps3, rpl16, rpl14, rps8, infA, rpl36, rps11, petD, petB, psbH, psbN, psbT and psbB) have expanded to have an additional copy in the IRs. The IR expansion rearrangement occurred via a double-strand DNA break and subsequent repair, which is different from the ordinary gene conversion mechanism. Repeat analysis identified 39 direct/inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Analysis also revealed 75 simple sequence repeat (SSR) loci and almost all are composed of A or T, contributing to a distinct bias in base composition. Comparison of protein-coding sequences with ESTs reveals 9 putative RNA edits and 5 of them resulted in non-synonymous modifications in rpoC1, rps2, rps19 and ycf1. Phylogenetic analysis using maximum parsimony (MP) and maximum likelihood (ML) was performed on a dataset composed of 65 protein-coding genes from 25 taxa, which yielded a tree topology identical to that of previous plastid-based trees, and provided strong support for the sister relationship between Ranunculaceae and Berberidaceae.

  1. Nonlinear analysis of sequence repeats of multi-domain proteins

    Energy Technology Data Exchange (ETDEWEB)

    Huang Yanzhao [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China); Li Mingfeng [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China); Xiao Yi [Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei (China)]


    Many multi-domain proteins have repetitive three-dimensional structures but nearly-random amino acid sequences. In the present paper, by using a modified recurrence plot that we proposed previously, we show that these amino acid sequences in fact have hidden repetitions. These results indicate that the repetitive domain structures are encoded by the repetitive sequences. This also gives a method to detect the repetitive domain structures directly from amino acid sequences.

  2. Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats. (United States)

    Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang


    Genomic information about Muscovy duck parvovirus is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITR) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strains FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based on the protein coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion in the ITR may contribute to the genome evolution of MDPV under immunization pressure.

  3. O cortiço e a prisão - vigilância e controle: Aluísio Azevedo e Michel Foucault

    Directory of Open Access Journals (Sweden)

    Marcio Luiz Carreri


    The current article intends to analyze the existing relations between historiographic metafiction and the ideology presented in the naturalistic novel O Cortiço (1890), by Aluísio Azevedo, by means of a reading based on Foucault's method. Establishing dialogues between apparently different areas of discourse, it offers a reflection on control and discipline through panopticism, using as its research source the book Vigiar e punir, by Michel Foucault (1975).

  4. Characterization of expressed sequence tag-derived simple sequence repeat markers for Aspergillus flavus: emphasis on variability of isolates from the southern United States. (United States)

    Wang, Xinwang; Wadl, Phillip A; Wood-Jones, Alicia; Windham, Gary; Trigiano, Robert N; Scruggs, Mary; Pilgrim, Candace; Baird, Richard


    Simple sequence repeat (SSR) markers were developed from the Aspergillus flavus expressed sequence tag (EST) database to conduct an analysis of genetic relationships of Aspergillus isolates from numerous host species and geographical regions, but primarily from the United States. Twenty-nine primers were designed from 362 tri-nucleotide EST-SSR sequences. Eighteen polymorphic loci were used to genotype 96 Aspergillus species isolates. The number of alleles detected per locus ranged from 2 to 24 with a mean of 8.2 alleles. Haploid diversity ranged from 0.28 to 0.91. A genetic distance matrix was used to perform principal coordinates analysis (PCA) and to generate dendrograms using the unweighted pair group method with arithmetic mean (UPGMA). Two principal coordinates explained more than 75 % of the total variation among the isolates. One clade was identified for A. flavus isolates (n = 87) with the other Aspergillus species (n = 7) using PCA, but five distinct clusters were present when the other taxa were excluded from the analysis. Six groups were noted when the EST-SSR data were compared using UPGMA. However, the latter PCA or UPGMA comparison resulted in no direct associations with host species, geographical region or aflatoxin production. Furthermore, there was no direct correlation to visible morphological features such as sclerotial types. The isolates from the Mississippi Delta region, which contained the largest percentage of isolates, did not show any unusual clustering except for isolates K32, K55, and 199. Further studies of these three isolates are warranted to evaluate their pathogenicity, aflatoxin production potential, additional gene sequences (e.g., RPB2), and morphological comparisons.

  5. Local repeat sequence organization of an intergenic spacer

    Indian Academy of Sciences (India)

    The amplification yielded the same uniquely "sequence-scrambled" product, whether the template used for PCR was total cellular DNA, chloroplast DNA or a plasmid clone DNA corresponding to that region. The PCR product, a "unique" new sequence, had lost the repetitive organization of the template genome where it ...

  6. The soybean-Phytophthora resistance locus Rps1-k encompasses coiled coil-nucleotide binding-leucine rich repeat-like genes and repetitive sequences

    Directory of Open Access Journals (Sweden)

    Bhattacharyya Madan K


    Background: A series of Rps (resistance to Phytophthora sojae) genes have been protecting soybean from the root and stem rot disease caused by the Oomycete pathogen, Phytophthora sojae. Five Rps genes were mapped to the Rps1 locus located near the 28 cM map position on molecular linkage group N of the composite genetic soybean map. Among these five genes, Rps1-k was introgressed from the cultivar, Kingwa. Rps1-k has been providing stable and broad-spectrum Phytophthora resistance in the major soybean-producing regions of the United States. Rps1-k has been mapped and isolated. More than one functional Rps1-k gene was identified from the Rps1-k locus. The clustering feature at the Rps1-k locus might have facilitated the expansion of Rps1-k gene numbers and the generation of new recognition specificities. The Rps1-k region was sequenced to understand the possible evolutionary steps that shaped the generation of Phytophthora resistance genes in soybean. Results: Here the analyses of sequences of three overlapping BAC clones containing the 184,111 bp Rps1-k region are reported. A shotgun sequencing strategy was applied in sequencing the BAC contig. Sequence analysis predicted a few full-length genes including two Rps1-k genes, Rps1-k-1 and Rps1-k-2. Previously reported Rps1-k-3 from this genomic region 1 was evolved through intramolecular recombination between Rps1-k-1 and Rps1-k-2 in Escherichia coli. The majority of the predicted genes are truncated and therefore most likely they are nonfunctional. A member of a highly abundant retroelement, SIRE1, was identified from the Rps1-k region. The Rps1-k region is primarily composed of repetitive sequences. Sixteen simple repeat and 63 tandem repeat sequences were identified from the locus. Conclusion: These data indicate that the Rps1 locus is located in a gene-poor region. The abundance of repetitive sequences in the Rps1-k region suggested that the location of this locus is in or near a

  7. Fingerprinting for discriminating tea germplasm using inter-simple sequence repeat (ISSR) markers

    International Nuclear Information System (INIS)

    Liu, B.Y.; Li, Y.Y.; Wang, P.S.; Wang, L.Y.; Wang, P.S.


    For the discrimination of tea germplasm at the inter-specific level, 134 tea varieties preserved in the China National Germplasm Tea Repositories (CNGTR) were analyzed using inter simple sequence repeat (ISSR) markers. Eighteen primers were chosen from 60 screened for ISSR amplification, generating 99.4% polymorphic bands. The mean Nei's gene diversity (H) and the overall mean Shannon's Information index (I) were 0.396 and 0.578, respectively, indicating a wide gene pool. Using the presence, or sometimes the absence, of unique ISSR markers, it was possible to discriminate 32 of the genotypes tested. No single primer could discriminate all 134 genotypes. However, UBC811 provided rich band patterns and could discriminate 35 genotypes. The combinations of two and three primers could discriminate 99 and 121 genotypes, respectively. Furthermore, the combination of band patterns or the DNA fingerprinting based on specific ISSR markers generated by UBC811, UBC835, ISSR2 and ISSR3 could discriminate all 134 genotypes tested. ISSR markers also provide a powerful tool to discriminate tea germplasm at the inter-specific level. (author)

  8. The RNA polymerase dictates ORF1 requirement and timing of LINE and SINE retrotransposition.

    Directory of Open Access Journals (Sweden)

    Emily N Kroutter


    Mobile elements comprise close to one half of the mass of the human genome. Only LINE-1 (L1), an autonomous non-Long Terminal Repeat (LTR) retrotransposon, and its non-autonomous partners, such as the retropseudogenes, SVA, and the SINE Alu, are currently active human retroelements. Experimental evidence shows that Alu retrotransposition depends on L1 ORF2 protein, which has led to the presumption that LINEs and SINEs share the same basic insertional mechanism. Our data demonstrate clear differences in the time required to generate insertions between marked Alu and L1 elements. In our tissue culture system, the process of L1 insertion requires close to 48 hours. In contrast to the RNA pol II-driven L1, we find that pol III transcribed elements (Alu, the rodent SINE B2, and the 7SL, U6 and hY sequences) can generate inserts within 24 hours or less. Our analyses demonstrate that the observed retrotransposition timing does not dictate insertion rate and is independent of the type of reporter cassette utilized. The additional time requirement by L1 cannot be directly attributed to differences in transcription, transcript length, splicing processes, ORF2 protein production, or the ability of functional ORF2p to reach the nucleus. However, the insertion rate of a marked Alu transcript drastically drops when driven by an RNA pol II promoter (CMV) and the retrotransposition timing parallels that of L1. Furthermore, the "pol II Alu transcript" behaves like the processed pseudogenes in our retrotransposition assay, requiring supplementation with L1 ORF1p in addition to ORF2p. We postulate that the observed differences in retrotransposition kinetics of these elements are dictated by the type of RNA polymerase generating the transcript. We present a model that highlights the critical differences of LINE and SINE transcripts that likely define their retrotransposition timing.

  9. Repeat-aware modeling and correction of short read errors. (United States)

    Yang, Xiao; Aluru, Srinivas; Dorman, Karin S


    High-throughput short read sequencing is revolutionizing genomics and systems biology research by enabling cost-effective deep coverage sequencing of genomes and transcriptomes. Error detection and correction are crucial to many short read sequencing applications including de novo genome sequencing, genome resequencing, and digital gene expression analysis. Short read error detection is typically carried out by counting the observed frequencies of kmers in reads and validating those with frequencies exceeding a threshold. In case of genomes with high repeat content, an erroneous kmer may be frequently observed if it has few nucleotide differences with valid kmers with multiple occurrences in the genome. Error detection and correction were mostly applied to genomes with low repeat content and this remains a challenging problem for genomes with high repeat content. We develop a statistical model and a computational method for error detection and correction in the presence of genomic repeats. We propose a method to infer genomic frequencies of kmers from their observed frequencies by analyzing the misread relationships among observed kmers. We also propose a method to estimate the threshold useful for validating kmers whose estimated genomic frequency exceeds the threshold. We demonstrate that superior error detection is achieved using these methods. Furthermore, we break away from the common assumption of uniformly distributed errors within a read, and provide a framework to model position-dependent error occurrence frequencies common to many short read platforms. Lastly, we achieve better error correction in genomes with high repeat content. The software is implemented in C++ and is freely available under GNU GPL3 license and Boost Software V1.0 license at " = redeem". We introduce a statistical framework to model sequencing errors in next-generation reads, which led to promising results in detecting and correcting errors
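
    The baseline approach that this abstract starts from, counting kmers and trusting those whose observed frequency exceeds a threshold, can be sketched as follows. This is only the conventional counting step, not the repeat-aware statistical model the authors propose; the function names, the reads, and the threshold value are hypothetical.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every overlapping k-mer across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def trusted_kmers(reads, k, threshold):
    """Return the k-mers whose observed count exceeds the threshold."""
    counts = kmer_counts(reads, k)
    return {kmer for kmer, count in counts.items() if count > threshold}


# Hypothetical reads: the last one carries a single substitution (T -> A).
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGAAC"]
solid = trusted_kmers(reads, k=4, threshold=1)
```

    In a repeat-rich genome this simple rule breaks down in the way the abstract describes: an erroneous kmer that differs by one base from a highly repeated kmer can itself be observed many times and pass the threshold.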

  10. Genetic diversity studies in pea (Pisum sativum L.) using simple sequence repeat markers. (United States)

    Kumari, P; Basal, N; Singh, A K; Rai, V P; Srivastava, C P; Singh, P K


    The genetic diversity among 28 pea (Pisum sativum L.) genotypes was analyzed using 32 simple sequence repeat markers. A total of 44 polymorphic bands, with an average of 2.1 bands per primer, were obtained. The polymorphism information content ranged from 0.657 to 0.309 with an average of 0.493. The variation in genetic diversity among these cultivars ranged from 0.11 to 0.73. Cluster analysis based on Jaccard's similarity coefficient using the unweighted pair-group method with arithmetic mean (UPGMA) revealed 2 distinct clusters, I and II, comprising 6 and 22 genotypes, respectively. Cluster II was further differentiated into 2 subclusters, IIA and IIB, with 12 and 10 genotypes, respectively. Principal component (PC) analysis revealed results similar to those of UPGMA. The first, second, and third PCs contributed 21.6, 16.1, and 14.0% of the variation, respectively; cumulative variation of the first 3 PCs was 51.7%.
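
    Jaccard's similarity coefficient, which the cluster analysis above is based on, compares two genotypes by the bands they share. A minimal sketch, assuming each genotype is scored as a 0/1 presence vector over the same set of bands; the profiles shown are hypothetical and are not data from the study.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard coefficient for two equal-length 0/1 band profiles:
    bands present in both, divided by bands present in at least one."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return shared / either if either else 0.0


# Hypothetical band profiles for two genotypes over five bands.
similarity = jaccard_similarity([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])
```

    Bands absent from both genotypes are ignored, which is the usual reason for preferring this coefficient with marker data in which shared absence is uninformative.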

  11. Initial study of stability and repeatability of measuring R2' and oxygen extraction fraction values in the healthy brain with gradient-echo sampling of spin-echo sequence

    International Nuclear Information System (INIS)

    Hui Lihong; Zhang Xiaodong; He Chao; Xie Sheng; Xiao Jiangxi; Zhang jue; Wang Xiaoying; Jiang Xuexiang


    Objective: To evaluate the stability and repeatability of the gradient-echo sampling of spin-echo (GESSE) sequence in measuring the R2' value in volunteers, by comparison with traditional GRE sequences (T2* map and T2 map). Methods: Eight normal healthy volunteers were enrolled in this study and written informed consent was obtained from all subjects. MR scanning including the GESSE, T2 map and T2* map sequences was performed in these subjects at resting status. The same protocol was repeated one day later. Raw data from the GESSE sequence were transferred to a PC for postprocessing with software built in house, from which the R2' map and OEF map were obtained. To obtain quantitative R2' and OEF values in the brain parenchyma, six ROIs were equally placed in the anterior, middle and posterior parts of the bilateral hemispheres. Both the mean and standard deviation of R2' and OEF were recorded. All images from the T2* map and T2 map were transferred to the workstation for postprocessing. The ROIs were placed in the same areas as those for the GESSE sequence. R2' is defined as R2' = R2* - R2, with R2* = 1/T2*. The R2' values of the GESSE sequence were compared with those of the GRE sequence. Results: The mean R2' values of GESSE at the first and second scans and those of GRE were (4.21±0.92), (4.45±0.94) Hz and (7.37±1.47), (6.42±2.33) Hz, respectively. The mean OEF values of GESSE at the first and second scans were 0.327±0.036 and 0.336±0.035, respectively. The R2' and OEF values obtained from GESSE were not significantly different between the first and second scans (t=-0.83, -1.48, P>0.05). The R2' value of the first GRE imaging differed significantly from that of the second GRE imaging (t=1.80, P<0.05). The R2' value of the GESSE sequence was less than that of the GRE sequence, and the difference between them was statistically significant (t=1.71, P<0.05). Conclusion: The GESSE sequence has good stability and repeatability with promising clinical practicability.
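
    The GRE-based definition used in this abstract, R2' = R2* - R2 with R2* = 1/T2*, amounts to the short calculation below. The sketch additionally assumes R2 = 1/T2 and relaxation times given in milliseconds; the function name and the input values are hypothetical and are not measurements from the study.

```python
def r2_prime_hz(t2_star_ms, t2_ms):
    """Return R2' in Hz from T2* and T2 given in milliseconds."""
    r2_star = 1000.0 / t2_star_ms  # R2* = 1/T2*, converted from 1/ms to Hz
    r2 = 1000.0 / t2_ms            # R2 = 1/T2 (assumed standard definition)
    return r2_star - r2


# Hypothetical ROI values: T2* = 50 ms and T2 = 80 ms.
value = r2_prime_hz(50.0, 80.0)
```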

  12. Identification of the centromeric repeat in the threespine stickleback fish (Gasterosteus aculeatus). (United States)

    Cech, Jennifer N; Peichel, Catherine L


    Centromere sequences exist as gaps in many genome assemblies due to their repetitive nature. Here we take an unbiased approach utilizing centromere protein A (CENP-A) chromatin immunoprecipitation followed by high-throughput sequencing to identify the centromeric repeat sequence in the threespine stickleback fish (Gasterosteus aculeatus). A 186-bp, AT-rich repeat was validated as centromeric using both fluorescence in situ hybridization (FISH) and immunofluorescence combined with FISH (IF-FISH) on interphase nuclei and metaphase spreads. This repeat hybridizes strongly to the centromere on all chromosomes, with the exception of weak hybridization to the Y chromosome. Together, our work provides the first validated sequence information for the threespine stickleback centromere.

  13. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Directory of Open Access Journals (Sweden)

    Huaiyong Luo

    The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers was established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.

  14. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus. (United States)

    Luo, Huaiyong; Wang, Xiaojie; Zhan, Gangming; Wei, Guorong; Zhou, Xinli; Zhao, Jing; Huang, Lili; Kang, Zhensheng


    The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers was established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.
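
    Genome-wide identification of di- and tri-nucleotide SSR loci, as described in this record, can be illustrated with a regular-expression scan. This is a sketch only: the abstract does not state which tool or which minimum repeat counts were used, so the function name and the thresholds below (at least 6 dinucleotide or 5 trinucleotide units) are assumptions.

```python
import re


def find_ssrs(sequence, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect SSRs in a DNA string.
    min_repeats maps motif length to the minimum number of tandem units."""
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}  # assumed thresholds, not from the study
    sequence = sequence.upper()
    hits = []
    for unit_len, min_count in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_count - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


# Hypothetical sequence with one (AT)7 and one (CAG)5 locus.
loci = find_ssrs("GGCC" + "AT" * 7 + "GCTT" + "CAG" * 5 + "TT")
```

    Dividing the genome length by the number of loci found gives a density figure of the kind quoted in the abstract (one SSR per 22.95 kb).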

  15. International distribution and age estimation of the Portuguese BRCA2 c.156_157insAlu founder mutation

    DEFF Research Database (Denmark)

    Peixoto, Ana; Santos, Catarina; Pinheiro, Manuela


    The c.156_157insAlu BRCA2 mutation has so far only been reported in hereditary breast/ovarian cancer (HBOC) families of Portuguese origin. Since this mutation is not detectable using the commonly used screening methodologies and must be specifically sought, we screened for this rearrangement...... individuals requesting predictive testing living in France and in the USA, all being Portuguese immigrants. After performing an extensive haplotype study in carrier families, we estimate that this founder mutation occurred 558 +/- 215 years ago. We further demonstrate significant quantitative differences...... HBOC families from Portugal or with Portuguese ancestry are specifically tested for this rearrangement....

  16. Chromosome breakage in Prader-Willi and Angelman syndrome deletions may involve recombination between a repeat at the proximal and distal breakpoints

    Energy Technology Data Exchange (ETDEWEB)

    Amos-Landgraf J.; Nicholls, R.D. [Case Western Reserve Univ., Cleveland, OH (United States); Gottlieb, W. [Univ. of Florida, Gainesville, FL (United States)] [and others


    Prader-Willi (PWS) and Angelman (AS) syndromes most commonly arise from large deletions of 15q11-q13. Deletions in PWS are paternal in origin, while those in AS are maternal in origin, clearly demonstrating genomic imprinting in these clinically distinct neurobehavioural disorders. In at least 90% of PWS and AS deletion patients, the same 4 Mb region within 15q11-q13 is deleted with breakpoints clustering in single YAC clones at the proximal and distal ends. To study the mechanism of chromosome breakage in PWS and AS, we have previously isolated 25 independent clones from these three YACs using Alu-vector PCR. Four clones were selected that appear to detect a low copy repeat that is located in the proximal and distal breakpoint regions of chromosome 15q11-q13. Three clones detect the same 4 HindIII bands in genomic DNA, all from 15q11-q13, with differing intensities for the probes located at the proximal or distal breakpoints region, respectively. This suggests that these probes detect related members of a low-copy repeat at either location. Moreover, the 254RL2 probe detects a novel HindIII band in two unrelated PWS deletion patients, suggesting that this may represent a breakpoint fragment, with recombination occurring within a similar interval in both patients. A fourth clone, 318RL3 detects 5 bands in HindIII-digested genomic DNA, all from 15q11-q13. This YAC endclone itself is not deleted in PWS and AS deletion patients, as seen by an invariant strong band. Two other strong bands are variably intact or deleted in different PWS or AS deletion patients, suggesting a relationship of this sequence to the breakpoints. Moreover, PCR using 318RL3 primers from the distal 93C9 YAC led to the isolation of a related clone with 96% identity, demonstrating the existence of a low-copy repeat with members close to the proximal and distal breakpoints. Taken together, our data suggest a complex, low-copy repeat with members at both the proximal and distal boundaries.

  17. [Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella]. (United States)

    Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin


    This study aimed to explore the features of clustered regularly interspaced short palindromic repeats (CRISPR) structures in Shigella by using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in the four groups of Shigella, and the flanking sequences upstream of the CRISPRs could be classified into the same groups as those downstream. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same groups as their corresponding flanking sequences, and could be classified into two different types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.

  18. A TALE-inspired computational screen for proteins that contain approximate tandem repeats. (United States)

    Perycz, Malgorzata; Krwawicz, Joanna; Bochtler, Matthias


    TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen.
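    The abstract does not describe the scoring used in the Perl screen, so the following Python fragment is only a naive sketch of the general idea: it looks for an approximate tandem repeat by comparing a sequence with itself at every offset between 30 and 43 residues. The function name `best_tandem_period` and the example protein (an illustrative 34-residue TALE-like unit with two variable positions) are assumptions made here, not taken from the study.

```python
def best_tandem_period(seq, min_period=30, max_period=43):
    """Return (period, identity) for the offset with the highest self-match.

    identity is the fraction of positions i where seq[i] == seq[i + period].
    This is a simplistic stand-in for a real repeat detector: it ignores
    insertions and deletions between repeat copies.
    """
    best_period, best_identity = None, 0.0
    for period in range(min_period, max_period + 1):
        n = len(seq) - period
        if n <= 0:
            continue
        matches = sum(1 for i in range(n) if seq[i] == seq[i + period])
        identity = matches / n
        if identity > best_identity:
            best_period, best_identity = period, identity
    return best_period, best_identity


# Illustrative input (hypothetical): four copies of a 34-residue unit that
# differ only at two "variable" positions, mimicking approximate repeats.
unit = "LTPEQVVAIAS{}GGKQALETVQRLLPVLCQAHG"
protein = "".join(unit.format(pair) for pair in ("HD", "NI", "NG", "NN"))
period, identity = best_tandem_period(protein)
```

    A high identity at one offset, combined with a few positions that vary between copies, is the kind of signal such a screen would then pass on to further scoring.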

  19. Utilization of a cloned alphoid repeating sequence of human DNA in the study of polymorphism of chromosomal heterochromatin regions

    International Nuclear Information System (INIS)

    Kruminya, A.R.; Kroshkina, V.G.; Yurov, Yu.B.; Aleksandrov, I.A.; Mitkevich, S.P.; Gindilis, V.M.


    The chromosomal distribution of the cloned PHS05 fragment of human alphoid DNA was studied by in situ hybridization in 38 individuals. It was shown that this DNA fraction is primarily localized in the pericentric regions of practically all chromosomes of the set. Significant interchromosomal differences and a weakly expressed interindividual polymorphism were discovered in the copying ability of this class of repeating DNA sequences; associations were not found between the results of hybridization and the pattern of Q-polymorphism

  20. In situ detection of tandem DNA repeat length

    Energy Technology Data Exchange (ETDEWEB)

    Yaar, R.; Szafranski, P.; Cantor, C.R.; Smith, C.L. [Boston Univ., MA (United States)


    A simple method for scoring short tandem DNA repeats is presented. An oligonucleotide target, containing tandem repeats embedded in a unique sequence, was hybridized to a set of complementary probes, containing tandem repeats of known lengths. Single-stranded loop structures formed on duplexes containing a mismatched (different) number of tandem repeats. No loop structure formed on duplexes containing a matched (identical) number of tandem repeats. The matched and mismatched loop structures were enzymatically distinguished and differentially labeled by treatment with S1 nuclease and the Klenow fragment of DNA polymerase. 7 refs., 4 figs.

  1. Brazil in the Mirror of Amaterasu: The Japan of Aluísio Azevedo

    Directory of Open Access Journals (Sweden)

    Marcel Vejmelka


    From 1897 to 1899, Aluísio Azevedo served as vice-consul in Yokohama. During those years he conceived and outlined a book on Japanese culture and society, past and present, of which he wrote only the first part, devoted to the history of Japan. This fragment, published in 1984 by Luiz Dantas, makes it possible to analyze Azevedo's view of the Japanese nation and culture, to be understood within the historical context of the late nineteenth century and in relation to the internal conflicts of the Brazilian nation and culture as Azevedo treated them in his naturalist novels.

  2. In silico reversal of repeat-induced point mutation (RIP) identifies the origins of repeat families and uncovers obscured duplicated genes

    Directory of Open Access Journals (Sweden)

    Hane James K


    Background: Repeat-induced point mutation (RIP) is a fungal genome defence mechanism guarding against transposon invasion. RIP mutates the sequence of repeated DNA and over time renders the affected regions unrecognisable by similarity search tools such as BLAST. Results: DeRIP is a new software tool developed to predict the original sequence of a RIP-mutated region prior to the occurrence of RIP. In this study, we apply deRIP to the genome of the wheat pathogen Stagonospora nodorum SN15 and predict the origin of several previously uncharacterised classes of repetitive DNA. Conclusions: Five new classes of transposon repeats and four classes of endogenous gene repeats were identified after deRIP. The deRIP process is a new tool for fungal genomics that facilitates the identification and understanding of the role and origin of fungal repetitive DNA. DeRIP is open-source and is available as part of the RIPCAL suite at

  3. Genetic structure in contemporary south Tyrolean isolated populations revealed by analysis of Y-chromosome, mtDNA, and Alu polymorphisms. (United States)

    Pichler, Irene; Mueller, Jakob C; Stefanov, Stefan A; De Grandi, Alessandro; Volpato, Claudia Beu; Pinggera, Gerd K; Mayr, Agnes; Ogriseg, Martin; Ploner, Franz; Meitinger, Thomas; Pramstaller, Peter P


    Most of the inhabitants of South Tyrol in the eastern Italian Alps can be considered isolated populations because of their physical separation by mountain barriers and their sociocultural heritage. We analyzed the genetic structure of South Tyrolean populations using three types of genetic markers: Y-chromosome, mitochondrial DNA (mtDNA), and autosomal Alu markers. Using random samples taken from the populations of Val Venosta, Val Pusteria, Val Isarco, Val Badia, and Val Gardena, we calculated genetic diversity within and among the populations. Microsatellite diversity and unique event polymorphism diversity (on the Y chromosome) were substantially lower in the Ladin-speaking population of Val Badia compared to the neighboring German-speaking populations. In contrast, the genetic diversity of mtDNA haplotypes was lowest for the upper Val Venosta and Val Pusteria. These data suggest a low effective population size, or little admixture, for the gene pool of the Ladin-speaking population from Val Badia. Interestingly, this is more pronounced for Ladin males than for Ladin females. For the pattern of genetic Alu variation, both Ladin samples (Val Gardena and Val Badia) are among the samples with the lowest diversity. An admixture analysis of one German-speaking valley (Val Venosta) indicates a relatively high genetic contribution of Ladin origin. The reduced genetic diversity and a high genetic differentiation in the Rhaetoroman- and German-speaking South Tyrolean populations may constitute an important basis for future medical genetic research and gene mapping studies in South Tyrol.


    We are attempting to identify specific root fragments from soil cores with individual trees. We successfully used Inter Simple Sequence Repeats (ISSR) to distinguish neighboring old-growth Douglas-fir trees from one another, while maintaining identity among each tree's parts. W...

  5. Alu polymorphisms in the Waorani tribe from the Ecuadorian Amazon reflect the effects of isolation and genetic drift. (United States)

    Gómez-Pérez, Luis; Alfonso-Sánchez, Miguel A; Sánchez, Dora; García-Obregón, Susana; Espinosa, Ibone; Martínez-Jarreta, Begoña; De Pancorbo, Marian M; Peña, José A


    The Amazon basin is inhabited by some of the most isolated human groups worldwide. Among them, the Waorani tribe is one of the most interesting Native American populations from the anthropological perspective. This study reports a genetic characterization of the Waorani based on autosomal genetic loci. We analyzed 12 polymorphic Alu insertions in 36 Waorani individuals from different communal longhouses settled in the Yasuní National Park. The most notable finding was the strikingly reduced genetic diversity detected in the Waorani, corroborated by the existence of four monomorphic loci (ACE, APO, FXIIIB, and HS4.65), and of four other Alu markers that were very close to fixation for the presence (PV92 and D1) or the absence (A25 and HS4.32) of the insertion. Furthermore, results of the centroid analysis supported the notion of the Waorani being one of the Amerindian groups less impacted by gene flow processes. The prolonged isolation of the Waorani community, in conjunction with a historically low effective population size and high inbreeding levels, has resulted in the drastic reduction of their genetic diversity, because of the effects of severe genetic drift. Recurrent population bottlenecks most likely determined by certain deep-rooted sociocultural practices of the Waorani (characterized by violence, internal quarrels, and revenge killings until recent times) are likely responsible for this pattern of diversity. The findings of this study illustrate how sociocultural factors can shape the gene pool of human populations. Copyright © 2011 Wiley-Liss, Inc.

  6. Rate-determining Step of Flap Endonuclease 1 (FEN1) Reflects a Kinetic Bias against Long Flaps and Trinucleotide Repeat Sequences. (United States)

    Tarantino, Mary E; Bilotti, Katharina; Huang, Ji; Delaney, Sarah


    Flap endonuclease 1 (FEN1) is a structure-specific nuclease responsible for removing 5'-flaps formed during Okazaki fragment maturation and long patch base excision repair. In this work, we use rapid quench flow techniques to examine the rates of 5'-flap removal on DNA substrates of varying length and sequence. Of particular interest are flaps containing trinucleotide repeats (TNR), which have been proposed to affect FEN1 activity and cause genetic instability. We report that FEN1 processes substrates containing flaps of 30 nucleotides or fewer at comparable single-turnover rates. However, for flaps longer than 30 nucleotides, FEN1 kinetically discriminates substrates based on flap length and flap sequence. In particular, FEN1 removes flaps containing TNR sequences at a rate slower than mixed sequence flaps of the same length. Furthermore, multiple-turnover kinetic analysis reveals that the rate-determining step of FEN1 switches as a function of flap length from product release to chemistry (or a step prior to chemistry). These results provide a kinetic perspective on the role of FEN1 in DNA replication and repair and contribute to our understanding of FEN1 in mediating genetic instability of TNR sequences. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  7. International distribution and age estimation of the Portuguese BRCA2 c.156_157insAlu founder mutation

    DEFF Research Database (Denmark)

    Peixoto, Ana; Santos, Catarina; Pinheiro, Manuela


    The c.156_157insAlu BRCA2 mutation has so far only been reported in hereditary breast/ovarian cancer (HBOC) families of Portuguese origin. Since this mutation is not detectable using the commonly used screening methodologies and must be specifically sought, we screened for this rearrangement...... individuals requesting predictive testing living in France and in the USA, all being Portuguese immigrants. After performing an extensive haplotype study in carrier families, we estimate that this founder mutation occurred 558 ± 215 years ago. We further demonstrate significant quantitative differences...... HBOC families from Portugal or with Portuguese ancestry are specifically tested for this rearrangement....

  8. Assessment of Cultivar Distinctness in Alfalfa: A Comparison of Genotyping-by-Sequencing, Simple-Sequence Repeat Marker, and Morphophysiological Observations

    Directory of Open Access Journals (Sweden)

    Paolo Annicchiarico


    Cultivar registration agencies typically require morphophysiological trait-based distinctness of candidate cultivars. This requirement is difficult to achieve for cultivars of major perennial forages because of their genetic structure and ever-increasing number of registered material, leading to possible rejection of agronomically valuable cultivars. This study aimed to explore the value of molecular markers applied to replicated bulked plants (three bulks of 100 independent plants each per cultivar) to assess alfalfa (Medicago sativa L. subsp. sativa) cultivar distinctness. We compared genotyping-by-sequencing information based on 2902 polymorphic single-nucleotide polymorphism (SNP) markers (>30 reads per DNA sample) with morphophysiological information based on 11 traits and with simple-sequence repeat (SSR) marker information from 41 polymorphic markers for their ability to distinguish 11 alfalfa landraces representative of the germplasm from northern Italy. Three molecular criteria, one based on cultivar differences for individual SSR bands and two based on overall SNP marker variation assessed either by statistically significant cultivar differences on principal component axes or discriminant analysis, distinctly outperformed the morphophysiological criterion. Combining the morphophysiological criterion with either molecular marker method increased discrimination among cultivars, since morphophysiological diversity was unrelated to SSR marker-based diversity (r = 0.04) and poorly related to SNP marker-based diversity (r = 0.23, P < 0.15). The criterion based on statistically significant SNP allele frequency differences was less discriminating than morphophysiological variation. Marker-based distinctness, which can be assessed at low cost and without interactions with testing conditions, could validly substitute for (or complement) morphophysiological distinctness in alfalfa cultivar registration schemes. It also has interest in sui generis registration systems aimed at

  9. Agarose gel electrophoresis and polyacrylamide gel electrophoresis for visualization of simple sequence repeats. (United States)

    Anderson, James; Wright, Drew; Meksem, Khalid


    In the modern age of genetic research there is a constant search for ways to improve the efficiency of plant selection. The most recent technology that can result in a highly efficient means of selection and still be done at a low cost is through plant selection directed by simple sequence repeats (SSRs or microsatellites). The molecular markers are used to select for certain desirable plant traits without relying on ambiguous phenotypic data. The best way to detect these is the use of gel electrophoresis. Gel electrophoresis is a common technique in laboratory settings which is used to separate deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) by size. Loading DNA and RNA onto gels allows for visualization of the size of fragments through the separation of DNA and RNA fragments. This is achieved through the use of the charge in the particles. As the fragments separate, they form into distinct bands at set sizes. We describe the ability to visualize SSRs on slab gels of agarose and polyacrylamide gel electrophoresis.

  10. Molecular characterization of long direct repeat (LDR) sequences expressing a stable mRNA encoding for a 35-amino-acid cell-killing peptide and a cis-encoded small antisense RNA in Escherichia coli. (United States)

    Kawano, Mitsuoki; Oshima, Taku; Kasai, Hiroaki; Mori, Hirotada


    Genome sequence analyses of Escherichia coli K-12 revealed four copies of long repetitive elements. These sequences are designated as long direct repeat (LDR) sequences. Three of the repeats (LDR-A, -B, -C), each approximately 500 bp in length, are located as tandem repeats at 27.4 min on the genetic map. Another copy (LDR-D), 450 bp in length and nearly identical to LDR-A, -B and -C, is located at 79.7 min, a position that is directly opposite the position of LDR-A, -B and -C. In this study, we demonstrate that LDR-D encodes a 35-amino-acid peptide, LdrD, the overexpression of which causes rapid cell killing and nucleoid condensation of the host cell. Northern blot and primer extension analysis showed constitutive transcription of a stable mRNA (approximately 370 nucleotides) encoding LdrD and an unstable cis-encoded antisense RNA (approximately 60 nucleotides), which functions as a trans-acting regulator of ldrD translation. We propose that LDR encodes a toxin-antitoxin module. LDR-homologous sequences are not present on any known plasmids but are conserved in Salmonella and other enterobacterial species.

  11. Evaluation of Mammalian Interspersed Repeats to investigate the goat genome

    Directory of Open Access Journals (Sweden)

    P. Mariani


    Among the repeated sequences present in most eukaryotic genomes, SINEs (Short Interspersed Nuclear Elements) are widely used to investigate evolution in the mammalian order (Buchanan et al., 1999). One family of these repetitive sequences, the MIR (Mammalian Interspersed Repeats; Jurka et al., 1995), is ubiquitous in all mammals. MIR elements are tRNA-derived SINEs and are identifiable by a conserved core region of about 70 nucleotides.

  12. Regulation of HFE expression by poly(ADP-ribose) polymerase-1 (PARP1) through an inverted repeat DNA sequence in the distal promoter. (United States)

    Pelham, Christopher; Jimenez, Tamara; Rodova, Marianna; Rudolph, Angela; Chipps, Elizabeth; Islam, M Rafiq


    Hereditary hemochromatosis (HH) is a common autosomal recessive disorder of iron overload among Caucasians of northern European descent. Over 85% of all cases of HH are due to mutations in the hemochromatosis protein (HFE) involved in iron metabolism. Although its importance in iron homeostasis is well recognized, the mechanism of sensing and regulating iron absorption by HFE, especially in the absence of an iron response element in its gene, is not fully understood. In this report, we have identified an inverted repeat sequence (ATGGTcttACCTA) within 1700bp (-1675/+35) of the HFE promoter capable of forming a cruciform structure that binds PARP1 and strongly represses the HFE promoter. Knockdown of PARP1 increases HFE mRNA and protein. Similarly, hemin or FeCl3 treatments resulted in an increase in HFE expression by reducing the nuclear PARP1 pool via its apoptosis-induced cleavage, leading to upregulation of the iron regulatory hormone hepcidin mRNA. Thus, PARP1 binding to the inverted repeat sequence on the HFE promoter may serve as a novel iron sensing mechanism, as increased iron level can trigger PARP1 cleavage and relief of HFE transcriptional repression. © 2013.

  13. Rhoptry-associated protein (rap-1) genes in the sheep pathogen Babesia sp. Xinjiang: Multiple transcribed copies differing by 3' end repeated sequences. (United States)

    Niu, Qingli; Marchand, Jordan; Yang, Congshan; Bonsergent, Claire; Guan, Guiquan; Yin, Hong; Malandrin, Laurence


    Sheep babesiosis occurs mainly in tropical and subtropical areas. The sheep parasite Babesia sp. Xinjiang is widespread in China, and our goal is to characterize rap-1 (rhoptry-associated protein 1) gene diversity and expression as a first step of a long term goal aiming at developing a recombinant subunit vaccine. Seven different rap-1a genes were amplified in Babesia sp. Xinjiang, using degenerate primers designed from conserved motifs. Rap-1b and rap-1c gene types could not be identified. In all seven rap-1a genes, the 5' regions exhibited identical sequences over 936 nt, and the 3' regions differed at 28 positions over 147 nt, defining two types of genes designated α and β. The remaining 3' part varied from 72 to 360 nt in length, depending on the gene. This region consists of a succession of two to ten 36 nt repeats, which explains the size differences. Even if the nucleotide sequences varied, 6 repeats encoded the same stretch of amino acids. Transcription of at least four α and two β genes was demonstrated by standard RT-PCR. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. R-loops: targets for nuclease cleavage and repeat instability. (United States)

    Freudenreich, Catherine H


    R-loops form when transcribed RNA remains bound to its DNA template to form a stable RNA:DNA hybrid. Stable R-loops form when the RNA is purine-rich, and are further stabilized by DNA secondary structures on the non-template strand. Interestingly, many expandable and disease-causing repeat sequences form stable R-loops, and R-loops can contribute to repeat instability. Repeat expansions are responsible for multiple neurodegenerative diseases, including Huntington's disease, myotonic dystrophy, and several types of ataxias. Recently, it was found that R-loops at an expanded CAG/CTG repeat tract cause DNA breaks as well as repeat instability (Su and Freudenreich, Proc Natl Acad Sci USA 114, E8392-E8401, 2017). Two factors were identified as causing R-loop-dependent breaks at CAG/CTG tracts: deamination of cytosines and the MutLγ (Mlh1-Mlh3) endonuclease, defining two new mechanisms for how R-loops can generate DNA breaks (Su and Freudenreich, Proc Natl Acad Sci USA 114, E8392-E8401, 2017). Following R-loop-dependent nicking, base excision repair resulted in repeat instability. These results have implications for human repeat expansion diseases and provide a paradigm for how RNA:DNA hybrids can cause genome instability at structure-forming DNA sequences. This perspective summarizes mechanisms of R-loop-induced fragility at G-rich repeats and new links between DNA breaks and repeat instability.

  15. Germ-line CAG repeat instability causes extreme CAG repeat expansion with infantile-onset spinocerebellar ataxia type 2

    DEFF Research Database (Denmark)

    Vinther-Jensen, Tua; Ek, Jakob; Duno, Morten


    The spinocerebellar ataxias (SCA) are a genetically and clinically heterogeneous group of diseases, characterized by dominant inheritance, progressive cerebellar ataxia and diverse extracerebellar symptoms. A subgroup of the ataxias is caused by unstable CAG-repeat expansions in their respective ...... of paternal germ-line repeat sequence instability of the expanded SCA2 locus.European Journal of Human Genetics advance online publication, 10 October 2012; doi:10.1038/ejhg.2012.231....

  16. A family of DNA repeats in Aspergillus nidulans has assimilated degenerated retrotransposons

    DEFF Research Database (Denmark)

    Nielsen, M.L.; Hermansen, T.D.; Aleksenko, Alexei Y.


    In the course of a chromosomal walk towards the centromere of chromosome IV of Aspergillus nidulans, several cross- hybridizing genomic cosmid clones were isolated. Restriction mapping of two such clones revealed that their restriction patterns were similar in a region of at least 15 kb, indicati......) phenomenon, first described in Neurospora crassa, may have operated in A. nidulans. The data indicate that this family of repeats has assimilated mobile elements that subsequently degenerated but then underwent further duplications as a part of the host repeats....... the presence of a large repeat. The nature of the repeat was further investigated by sequencing and Southern analysis. The study revealed a family of long dispersed repeats with a high degree of sequence similarity. The number and location of the repeats vary between wild isolates. Two copies of the repeat...

  17. Development and validation of InnoQuant™, a sensitive human DNA quantitation and degradation assessment method for forensic samples using high copy number mobile elements Alu and SVA. (United States)

    Pineda, Gina M; Montgomery, Anne H; Thompson, Robyn; Indest, Brooke; Carroll, Marion; Sinha, Sudhir K


    There is a constant need in forensic casework laboratories for an improved way to increase the first-pass success rate of forensic samples. The recent advances in mini STR analysis, SNP, and Alu marker systems have now made it possible to analyze highly compromised samples, yet few tools are available that can simultaneously provide an assessment of quantity, inhibition, and degradation in a sample prior to genotyping. Currently there are several different approaches used for fluorescence-based quantification assays which provide a measure of quantity and inhibition. However, a system which can also assess the extent of degradation in a forensic sample will be a useful tool for DNA analysts. Possessing this information prior to genotyping will allow an analyst to more informatively make downstream decisions for the successful typing of a forensic sample without unnecessarily consuming DNA extract. Real-time PCR provides a reliable method for determining the amount and quality of amplifiable DNA in a biological sample. Alu are Short Interspersed Elements (SINE), approximately 300bp insertions which are distributed throughout the human genome in large copy number. The use of an internal primer to amplify a segment of an Alu element allows for human specificity as well as high sensitivity when compared to a single copy target. The advantage of an Alu system is the presence of a large number (>1000) of fixed insertions in every human genome, which minimizes the individual specific variation possible when using a multi-copy target quantification system. This study utilizes two independent retrotransposon genomic targets to obtain quantification of an 80bp "short" DNA fragment and a 207bp "long" DNA fragment in a degraded DNA sample in the multiplex system InnoQuant™. The ratio of the two quantitation values provides a "Degradation Index", or a qualitative measure of a sample's extent of degradation. The Degradation Index was found to be predictive of the observed loss
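    The ratio described in this record can be illustrated with a short Python sketch. The abstract states only that the index is the ratio of the two quantitation values; taking it as short-target quantity divided by long-target quantity (so that larger values mean more degradation) is an assumption made here, as are the function name, the units and the example numbers.

```python
def degradation_index(short_qty, long_qty):
    """Ratio of the short-target (80 bp) to long-target (207 bp) quantity.

    Assumption: index = short / long, both in the same units (e.g. ng/uL).
    An intact sample would give a value near 1; a degraded sample, in which
    long fragments amplify poorly, would give a larger value.
    """
    if short_qty < 0 or long_qty < 0:
        raise ValueError("quantities must be non-negative")
    if long_qty == 0:
        # No amplifiable long target at all: treat the index as unbounded.
        return float("inf")
    return short_qty / long_qty


# Hypothetical readings, not data from the study.
intact = degradation_index(2.0, 2.0)
degraded = degradation_index(1.0, 0.25)
```

    Any threshold for acting on the index would have to come from a laboratory's own validation; none is given in the abstract.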

  18. Study of simple sequence repeat (SSR) polymorphism for biotic ...

    African Journals Online (AJOL)



    Oct 2, 2013 ... G. Siva Kumar1, K. Aruna Kumari1*, Ch. V. Durga Rani1, R. M. Sundaram2, S. Vanisree3, Md. ..... review by Jena and Mackill (2008) provided the list of .... repeat protein and is a member of a resistance gene cluster on rice.

  19. Down-regulation of 21A Alu RNA as a tool to boost proliferation maintaining the tissue regeneration potential of progenitor cells


    Gigoni, Arianna; Costa, Delfina; Gaetani, Massimiliano; Tasso, Roberta; Villa, Federico; Florio, Tullio; Pagano, Aldo


    21A is an Alu non-coding (nc) RNA transcribed by RNA polymerase (pol) III. While investigating the biological role of 21A ncRNA we documented an inverse correlation between its expression level and the rate of cell proliferation. The downregulation of this ncRNA not only caused a boost in cell proliferation, but was also associated to a transient cell dedifferentiation, suggesting a possible involvement of this RNA in cell dedifferentiation/reprogramming. In this study, we explored the possib...

  20. Mononucleotide repeats are asymmetrically distributed in fungal genes

    NARCIS (Netherlands)

    Passel, van M.W.J.; Graaff, de L.H.


    ABSTRACT: BACKGROUND: Systematic analyses of sequence features have resulted in a better characterisation of the organisation of the genome. A previous study in prokaryotes on the distribution of sequence repeats, which are notoriously variable and can disrupt the reading frame in genes, showed that

  1. Development of novel simple sequence repeat markers in bitter gourd (Momordica charantia L.) through enriched genomic libraries and their utilization in analysis of genetic diversity and cross-species transferability. (United States)

    Saxena, Swati; Singh, Archana; Archak, Sunil; Behera, Tushar K; John, Joseph K; Meshram, Sudhir U; Gaikwad, Ambika B


    Microsatellite or simple sequence repeat (SSR) markers are the preferred markers for genetic analyses of crop plants. The availability of a limited number of such markers in bitter gourd (Momordica charantia L.) necessitates the development and characterization of more SSR markers. These were developed from genomic libraries enriched for three dinucleotide, five trinucleotide, and two tetranucleotide core repeat motifs. Employing the strategy of polymerase chain reaction-based screening, the number of clones to be sequenced was reduced by 81 % and 93.7 % of the sequenced clones contained microsatellite repeats. Unique primer-pairs were designed for 160 microsatellite loci, and amplicons of expected length were obtained for 151 loci (94.4 %). Evaluation of diversity in 54 bitter gourd accessions at 51 loci indicated that 20 % of the loci were polymorphic with the polymorphic information content values ranging from 0.13 to 0.77. Fifteen Indian varieties were clearly distinguished, indicative of the usefulness of the developed markers. Markers at 40 loci (78.4 %) were transferable to six species, viz. Momordica cymbalaria, Momordica subangulata subsp. renigera, Momordica balsamina, Momordica dioca, Momordica cochinchinesis, and Momordica sahyadrica. The microsatellite markers reported will be useful in various genetic and molecular genetic studies in bitter gourd, a cucurbit of immense nutritive, medicinal, and economic importance.
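    The polymorphic information content (PIC) values quoted in this record are conventionally computed from allele frequencies at each locus. The Python sketch below uses the standard formula PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2; whether the authors used exactly this form is not stated in the abstract, and the function name and example frequencies are illustrative assumptions.

```python
from itertools import combinations


def polymorphic_information_content(allele_freqs):
    """PIC for one locus from its allele frequencies (which must sum to 1)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two random alleles are identical.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings that are uninformative despite heterozygosity.
    cross_term = sum(2 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical loci: one with two equally common alleles, one monomorphic.
pic_two_alleles = polymorphic_information_content([0.5, 0.5])
pic_monomorphic = polymorphic_information_content([1.0])
```

    A monomorphic locus scores 0, and the score rises as alleles become more numerous and more evenly distributed.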

  2. The First Molecular Identification of an Olive Collection Applying Standard Simple Sequence Repeats and Novel Expressed Sequence Tag Markers. (United States)

    Mousavi, Soraya; Mariotti, Roberto; Regni, Luca; Nasini, Luigi; Bufacchi, Marina; Pandolfi, Saverio; Baldoni, Luciana; Proietti, Primo


    Germplasm collections of tree crop species represent fundamental tools for conservation of diversity and key steps for its characterization and evaluation. For the olive tree, several collections were created all over the world, but only a few of them have been fully characterized and molecularly identified. The olive collection of Perugia University (UNIPG), established in the 1960s, represents one of the first attempts to gather and safeguard olive diversity, keeping together cultivars from different countries. In the present study, a set of 370 previously uncharacterized olive trees was screened with 10 standard simple sequence repeats (SSRs) and nine new EST-SSR markers, to correctly and thoroughly identify all genotypes, verify their representativeness of the entire cultivated olive variation, and validate the effectiveness of new markers in comparison to standard genotyping tools. The SSR analysis revealed the presence of 59 genotypes, corresponding to 72 well known cultivars, 13 of them present exclusively in this collection. The new EST-SSRs have shown values of diversity parameters quite similar to those of the best standard SSRs. When compared to hundreds of Mediterranean cultivars, the UNIPG olive accessions were split into the three main populations (East, Center and West Mediterranean), confirming that the collection is well representative of the entire olive variability. Furthermore, Bayesian analysis, performed on the 59 genotypes of the collection by the use of both sets of markers, has demonstrated their splitting into four clusters, with a well balanced membership obtained by EST-SSRs with respect to standard SSRs. The new OLEST (Olea expressed sequence tags) SSR markers proved as effective as the best standard markers. The information obtained from this study represents a highly valuable tool for ex situ conservation and management of olive genetic resources, useful to build a common database from worldwide olive cultivar collections

  3. C-terminal sequences of hsp70 and hsp90 as non-specific anchors for tetratricopeptide repeat (TPR) proteins. (United States)

    Ramsey, Andrew J; Russell, Lance C; Chinkers, Michael


    Steroid-hormone-receptor maturation is a multi-step process that involves several TPR (tetratricopeptide repeat) proteins that bind to the maturation complex via the C-termini of hsp70 (heat-shock protein 70) and hsp90 (heat-shock protein 90). We produced a random T7 peptide library to investigate the roles played by the C-termini of the two heat-shock proteins in the TPR-hsp interactions. Surprisingly, phages with the MEEVD sequence, found at the C-terminus of hsp90, were not recovered from our biopanning experiments. However, two groups of phages were isolated that bound relatively tightly to HsPP5 (Homo sapiens protein phosphatase 5) TPR. Multiple copies of phages with a C-terminal sequence of LFG were isolated. These phages bound specifically to the TPR domain of HsPP5, although mutation studies produced no evidence that they bound to the domain's hsp90-binding groove. However, the most abundant family obtained in the initial screen had an aspartate residue at the C-terminus. Two members of this family with a C-terminal sequence of VD appeared to bind with approximately the same affinity as the hsp90 C-12 control. A second generation pseudo-random phage library produced a large number of phages with an LD C-terminus. These sequences acted as hsp70 analogues and had relatively low affinities for hsp90-specific TPR domains. Unfortunately, we failed to identify residues near hsp90's C-terminus that impart binding specificity to individual hsp90-TPR interactions. The results suggest that the C-terminal sequences of hsp70 and hsp90 act primarily as non-specific anchors for TPR proteins.

  4. Genome-wide cloning and sequence analysis of leucine-rich repeat receptor-like protein kinase genes in Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Yuan Tong


    Full Text Available Abstract Background Transmembrane receptor kinases play critical roles in both animal and plant signaling pathways regulating growth, development, differentiation, cell death, and pathogenic defense responses. In Arabidopsis thaliana, there are at least 223 Leucine-rich repeat receptor-like kinases (LRR-RLKs), representing one of the largest protein families. Although functional roles for a handful of LRR-RLKs have been revealed, the functions of the majority of members in this protein family have not been elucidated. Results As a resource for the in-depth analysis of this important protein family, the complementary DNA sequences (cDNAs) of 194 LRR-RLKs were cloned into the Gateway® donor vector pDONR/ZeoR and analyzed by DNA sequencing. Among them, 157 clones showed sequences identical to the predictions in the Arabidopsis sequence resource, TAIR8. The other 37 cDNAs showed gene structures distinct from the predictions of TAIR8, which was mainly caused by alternative splicing of pre-mRNA. Most of the genes have been further cloned into Gateway® destination vectors with GFP or FLAG epitope tags and have been transformed into Arabidopsis for in planta functional analysis. All clones from this study have been submitted to the Arabidopsis Biological Resource Center (ABRC) at Ohio State University for full accessibility by the Arabidopsis research community. Conclusions Most of the Arabidopsis LRR-RLK genes have been isolated and the sequence analysis showed a number of alternatively spliced variants. The generated resources, including cDNA entry clones, expression constructs and transgenic plants, will facilitate further functional analysis of the members of this important gene family.

  5. Do DNA double-strand breaks induced by Alu I lead to development of novel aberrations in the second and third post-treatment mitoses?

    International Nuclear Information System (INIS)

    Wojcik, A.; Bonk, K.; Mueller, M.U.; Streffer, C.; Obe, G.


    Several authors have reported that ionizing radiation can give rise to novel aberrations several mitotic divisions after the exposure. At our institute this phenomenon has been observed in mouse preimplantation embryos. This cell system is uniquely well suited for such investigations because the first three cell divisions show a high degree of synchrony. Thus the expression of chromosomal aberrations at the first, second and third mitosis after irradiation can be scored unambiguously. To investigate whether DNA double-strand breaks may be the lesions responsible for the delayed expression of chromosomal aberrations, we have studied the frequencies of aberrations in the first, second and third mitosis after treatment of one-cell mouse embryos with the restriction enzyme Alu I. Embryos were permeabilized with Streptolysin-O. The results indicate that the induction of double-strand breaks does not lead to novel aberrations in the third post-treatment mitosis. Several embryos scored at the second mitosis showed very high numbers of aberrations, indicating that Alu I may remain active in the cells for a period of one cell cycle. After treatment with Streptolysin-O alone, enhanced aberration frequencies were observed in the third post-treatment mitosis, suggesting that membrane damage has a delayed effect on the cellular integrity. 44 refs., 3 figs., 3 tabs

  6. The DUB/USP17 deubiquitinating enzymes: A gene family within a tandemly repeated sequence, is also embedded within the copy number variable Beta-defensin cluster

    Directory of Open Access Journals (Sweden)

    Scott Christopher J


    Full Text Available Abstract Background The DUB/USP17 subfamily of deubiquitinating enzymes were originally identified as immediate early genes induced in response to cytokine stimulation in mice (DUB-1, DUB-1A, DUB-2, DUB-2A). Subsequently we have identified a number of human family members and shown that one of these (DUB-3) is also cytokine inducible. We originally showed that constitutive expression of DUB-3 can block cell proliferation and more recently we have demonstrated that this is due to its regulation of the ubiquitination and activity of the 'CAAX' box protease RCE1. Results Here we demonstrate that the human DUB/USP17 family members are found on both chromosome 4p16.1, within a block of tandem repeats, and on chromosome 8p23.1, embedded within the copy number variable beta-defensin cluster. In addition, we show that the multiple genes observed in humans and other distantly related mammals have arisen due to the independent expansion of an ancestral sequence within each species. However, it is also apparent when sequences from humans and the more closely related chimpanzee are compared, that duplication events have taken place prior to these species separating. Conclusions The observation that the DUB/USP17 genes, which can influence cell growth and survival, have evolved from an unstable ancestral sequence which has undergone multiple and varied duplications in the species examined marks this as a unique family. In addition, their presence within the beta-defensin repeat raises the question whether they may contribute to the influence of this repeat on immune related conditions.

  7. Karyological characterization and identification of four repetitive element groups (the 18S – 28S rRNA gene, telomeric sequences, microsatellite repeat motifs, Rex retroelements) of the Asian swamp eel (Monopterus albus) (United States)

    Suntronpong, Aorarat; Thapana, Watcharaporn; Twilprawat, Panupon; Prakhongcheep, Ornjira; Somyong, Suthasinee; Muangmai, Narongrit; Surin Peyachoknagul; Srikulnath, Kornsorn


    Abstract Among teleost fishes, Asian swamp eel (Monopterus albus Zuiew, 1793) possesses the lowest chromosome number, 2n = 24. To characterize the chromosome constitution and investigate the genome organization of repetitive sequences in M. albus, karyotyping and chromosome mapping were performed with the 18S – 28S rRNA gene, telomeric repeats, microsatellite repeat motifs, and Rex retroelements. The 18S – 28S rRNA genes were localized to the pericentromeric region of chromosome 4 at the same position as large propidium iodide and C-positive bands, suggesting that the molecular structure of the pericentromeric regions of chromosome 4 has evolved in a concerted manner with amplification of the 18S – 28S rRNA genes. (TTAGGG)n sequences were found at the telomeric ends of all chromosomes. Eight of 19 microsatellite repeat motifs were dispersedly mapped on different chromosomes suggesting the independent amplification of microsatellite repeat motifs in M. albus. Monopterus albus Rex1 (MALRex1) was observed at interstitial sites of all chromosomes and in the pericentromeric regions of most chromosomes whereas MALRex3 was scattered and localized to all chromosomes and MALRex6 to several chromosomes. This suggests that these retroelements were independently amplified or lost in M. albus. Among MALRexs (MALRex1, MALRex3, and MALRex6), MALRex6 showed higher interspecific sequence divergences from other teleost species in comparison. This suggests that the divergence of Rex6 sequences of M. albus might have occurred a relatively long time ago. PMID:29093797

  8. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area. (United States)

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi


    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.

  9. Revisiting the TALE repeat. (United States)

    Deng, Dong; Yan, Chuangye; Wu, Jianping; Pan, Xiaojing; Yan, Nieng


    Transcription activator-like (TAL) effectors specifically bind to double stranded (ds) DNA through a central domain of tandem repeats. Each TAL effector (TALE) repeat comprises 33-35 amino acids and recognizes one specific DNA base through a highly variable residue at a fixed position in the repeat. Structural studies have revealed the molecular basis of DNA recognition by TALE repeats. Examination of the overall structure reveals that the basic building block of TALE protein, namely a helical hairpin, is one-helix shifted from the previously defined TALE motif. Here we wish to suggest a structure-based re-demarcation of the TALE repeat which starts with the residues that bind to the DNA backbone phosphate and concludes with the base-recognition hyper-variable residue. This new numbering system is consistent with the α-solenoid superfamily to which TALE belongs, and reflects the structural integrity of TAL effectors. In addition, it confers integral number of TALE repeats that matches the number of bound DNA bases. We then present fifteen crystal structures of engineered dHax3 variants in complex with target DNA molecules, which elucidate the structural basis for the recognition of bases adenine (A) and guanine (G) by reported or uncharacterized TALE codes. Finally, we analyzed the sequence-structure correlation of the amino acid residues within a TALE repeat. The structural analyses reported here may advance the mechanistic understanding of TALE proteins and facilitate the design of TALEN with improved affinity and specificity.

  10. Long Terminal Repeat Retrotransposon Content in Eight Diploid Sunflower Species Inferred from Next-Generation Sequence Data (United States)

    Tetreault, Hannah M.; Ungerer, Mark C.


    The most abundant transposable elements (TEs) in plant genomes are Class I long terminal repeat (LTR) retrotransposons represented by superfamilies gypsy and copia. Amplification of these superfamilies directly impacts genome structure and contributes to differential patterns of genome size evolution among plant lineages. Utilizing short-read Illumina data and sequence information from a panel of Helianthus annuus (sunflower) full-length gypsy and copia elements, we explore the contribution of these sequences to genome size variation among eight diploid Helianthus species and an outgroup taxon, Phoebanthus tenuifolius. We also explore transcriptional dynamics of these elements in both leaf and bud tissue via RT-PCR. We demonstrate that most LTR retrotransposon sublineages (i.e., families) display patterns of similar genomic abundance across species. A small number of LTR retrotransposon sublineages exhibit lineage-specific amplification, particularly in the genomes of species with larger estimated nuclear DNA content. RT-PCR assays reveal that some LTR retrotransposon sublineages are transcriptionally active across all species and tissue types, whereas others display species-specific and tissue-specific expression. The species with the largest estimated genome size, H. agrestis, has experienced amplification of LTR retrotransposon sublineages, some of which have proliferated independently in other lineages in the Helianthus phylogeny. PMID:27233667

  11. Nucleotide sequence determination of the region in adenovirus 5 DNA involved in cell transformation

    International Nuclear Information System (INIS)

    Maat, J.


    A description is given of investigations into the primary structure of the transforming region of adenovirus type 5 DNA. The phenomenon of cell transformation is discussed in general terms and the principles of a number of fairly recent techniques, which have been in use for DNA sequence determination since 1975, are dealt with. A few of the author's own techniques are described which deal both with nucleotide sequence analysis and with the determination of DNA cleavage sites of restriction endonucleases. The results are given of the mapping of cleavage sites in the HpaI-E fragment of adenovirus DNA of HpaII, HaeIII, AluI, HinfI and TaqI and of the determination of the nucleotide sequence in the transforming region of adenovirus type 5 DNA. The results of the sequence determination of the Ad5 HindIII-G fragment are discussed in relation to the investigation on the transforming proteins isolated from in vitro and in vivo synthesizing systems. Labelling procedures of DNA are described including the exonuclease III/DNA polymerase I method and T4 polynucleotide kinase labelling of DNA fragments. (Auth.)

  12. Discovery and mapping of a new expressed sequence tag-single nucleotide polymorphism and simple sequence repeat panel for large-scale genetic studies and breeding of Theobroma cacao L. (United States)

    Allegre, Mathilde; Argout, Xavier; Boccara, Michel; Fouet, Olivier; Roguet, Yolande; Bérard, Aurélie; Thévenin, Jean Marc; Chauveau, Aurélie; Rivallan, Ronan; Clement, Didier; Courtois, Brigitte; Gramacho, Karina; Boland-Augé, Anne; Tahi, Mathias; Umaharan, Pathmanathan; Brunel, Dominique; Lanaud, Claire


    Theobroma cacao is an economically important tree of several tropical countries. Its genetic improvement is essential to provide protection against major diseases and improve chocolate quality. We discovered and mapped new expressed sequence tag-single nucleotide polymorphism (EST-SNP) and simple sequence repeat (SSR) markers and constructed a high-density genetic map. By screening 149 650 ESTs, 5246 SNPs were detected in silico, of which 1536 corresponded to genes with a putative function, while 851 had a clear polymorphic pattern across a collection of genetic resources. In addition, 409 new SSR markers were detected on the Criollo genome. Lastly, 681 new EST-SNPs and 163 new SSRs were added to the pre-existing 418 co-dominant markers to construct a large consensus genetic map. This high-density map and the set of new genetic markers identified in this study are a milestone in cocoa genomics and for marker-assisted breeding. The data are available at PMID:22210604

  13. [Comparative analysis of clustered regularly interspaced short palindromic repeats (CRISPRs) loci in the genomes of halophilic archaea]. (United States)

    Zhang, Fan; Zhang, Bing; Xiang, Hua; Hu, Songnian


    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a widespread system that provides acquired resistance against phages in bacteria and archaea. Here we aim to analyze, on a genome-wide scale, the CRISPRs in extremely halophilic archaea whose whole genome sequences are currently available. We used bioinformatics methods including alignment, conservation analysis, GC content and RNA structure prediction to analyze the CRISPR structures of 7 haloarchaeal genomes. We identified the CRISPR structures in 5 halophilic archaea and revealed a conserved palindromic motif in the flanking regions of these CRISPR structures. In addition, we found that the repeat sequences of large CRISPR structures in halophilic archaea were greatly conserved, and two types of predicted RNA secondary structures derived from the repeat sequences were likely determined by the fourth base of the repeat sequence. Our results support the proposal that the leader sequence may function as a recognition site by having palindromic structures in flanking regions, and the stem-loop secondary structure formed by repeat sequences may function in mediating the interaction between foreign genetic elements and CAS-encoded proteins.
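
    Two of the measures named in this abstract, GC content and palindromic structure, are simple enough to illustrate. The following is a minimal sketch only, not the authors' pipeline: the function names are hypothetical, and "palindromic" is taken here in the usual DNA sense of a sequence that equals its own reverse complement.

```python
# Complement table for unambiguous DNA bases (assumption: input uses only A, C, G, T).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if seq reads the same as its reverse complement."""
    return seq.upper() == reverse_complement(seq)
```

    For instance, under these definitions the EcoRI site GAATTC is palindromic, while GATTACA is not.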

  14. Inter- and intra-strain variability of tandem repeats in Mycoplasma pneumoniae based on next-generation sequencing data. (United States)

    Zhang, Jing; Song, Xiaohong; Ma, Marella J; Xiao, Li; Kenri, Tsuyoshi; Sun, Hongmei; Ptacek, Travis; Li, Shaoli; Waites, Ken B; Atkinson, T Prescott; Shibayama, Keigo; Dybvig, Kevin; Feng, Yanmei


    To characterize inter- and intra-strain variability of variable-number tandem repeats (VNTRs) in Mycoplasma pneumoniae to determine the optimal multilocus VNTR analysis scheme for improved strain typing. Whole genome assemblies and next-generation sequencing data from diverse M. pneumoniae isolates were used to characterize VNTRs and their variability, and to compare the strain discriminability of new VNTR and existing markers. We identified 13 VNTRs including five reported previously. These VNTRs displayed different levels of inter- and intra-strain copy number variations. All new markers showed similar or higher discriminability compared with existing VNTR markers and the P1 typing system. Our study provides novel insights into VNTR variations and potential new multilocus VNTR analysis schemes for improved genotyping of M. pneumoniae.

  15. Myotonin protein-kinase [AGC]n trinucleotide repeat in seven nonhuman primates

    Energy Technology Data Exchange (ETDEWEB)

    Novelli, G.; Sineo, L.; Pontieri, E. [Catholic Univ. of Rome (Italy)]|[Univ. of Milan (Italy)]|[Univ. Florence (Italy)] [and others


    Myotonic dystrophy (DM) is due to a genomic instability of a trinucleotide [AGC]n motif, located at the 3′ UTR region of a protein-kinase gene (myotonin protein kinase, MT-PK). The [AGC] repeat is meiotically and mitotically unstable, and it is directly related to the manifestations of the disorder. Although a gene dosage effect of the MT-PK has been demonstrated in DM muscle, the mechanism(s) by which the intragenic repeat expansion leads to disease is largely unknown. This non-standard mutational event could reflect an evolutionary mechanism widespread among animal genomes. We have isolated and sequenced the complete 3′ UTR region of the MT-PK gene in seven primates (macaque, orangutan, gorilla, chimpanzee, gibbon, owl monkey, saimiri), and examined by comparative sequence nucleotide analysis the [AGC]n intragenic repeat and the surrounding nucleotides. The genomic organization, including the [AGC]n repeat structure, was conserved in all examined species, excluding the gibbon (Hylobates agilis), in which the [AGC]n upstream sequence (GGAA) is replaced by a GA dinucleotide. The number of [AGC]n in the examined species ranged between 7 (gorilla) and 13 repeats (owl monkeys), with a polymorphism informative content (PIC) similar to that observed in humans. These results indicate that the 3′ UTR [AGC] repeat within the MT-PK gene is evolutionarily conserved, supporting that this region has important regulatory functions.

  16. Ex vivo response to histone deacetylase (HDAC) inhibitors of the HIV long terminal repeat (LTR) derived from HIV-infected patients on antiretroviral therapy.

    Directory of Open Access Journals (Sweden)

    Hao K Lu

    Full Text Available Histone deacetylase inhibitors (HDACi) can induce human immunodeficiency virus (HIV) transcription from the HIV long terminal repeat (LTR). However, ex vivo and in vivo responses to HDACi are variable and the activity of HDACi in cells other than T-cells has not been well characterised. Here, we developed a novel assay to determine the activity of HDACi on patient-derived HIV LTRs in different cell types. HIV LTRs from integrated virus were amplified using triple-nested Alu-PCR from total memory CD4+ T-cells (CD45RO+) isolated from HIV-infected patients prior to and following suppressive antiretroviral therapy. NL4-3 or patient-derived HIV LTRs were cloned into the chromatin forming episomal vector pCEP4, and the effect of HDACi investigated in the astrocyte and epithelial cell lines SVG and HeLa, respectively. There were no significant differences in the sequence of the HIV LTRs isolated from CD4+ T-cells prior to and after 18 months of combination antiretroviral therapy (cART). We found that in both cell lines, the HDACi panobinostat, trichostatin A, vorinostat and entinostat activated patient-derived HIV LTRs to levels similar to those seen with NL4-3, and all patient derived isolates had similar sensitivity to maximum HDACi stimulation. We observed a marked difference in the maximum fold induction of luciferase by HDACi in HeLa and SVG, suggesting that the effect of HDACi may be influenced by the cellular environment. Finally, we observed significant synergy in activation of the LTR with vorinostat and the viral protein Tat. Together, our results suggest that the LTR sequence of integrated virus is not a major determinant of a functional response to an HDACi.

  17. The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats

    Directory of Open Access Journals (Sweden)

    Vergnaud Gilles


    Full Text Available Abstract Background In Archaea and Bacteria, the repeated elements called CRISPRs for "clustered regularly interspaced short palindromic repeats" are believed to participate in the defence against viruses. Short sequences called spacers are stored in-between repeated elements. In the current model, motifs comprising spacers and repeats may target an invading DNA and lead to its degradation through a proposed mechanism similar to RNA interference. Analysis of intra-species polymorphism shows that new motifs (one spacer and one repeated element) are added in a polarised fashion. Although their principal characteristics have been described, a lot remains to be discovered on the way CRISPRs are created and evolve. As new genome sequences become available it appears necessary to develop automated scanning tools to make available CRISPRs related information and to facilitate additional investigations. Description We have produced a program, CRISPRFinder, which identifies CRISPRs and extracts the repeated and unique sequences. Using this software, a database is constructed which is automatically updated monthly from newly released genome sequences. Additional tools were created to allow the alignment of flanking sequences in search for similarities between different loci and to build dictionaries of unique sequences. To date, almost six hundred CRISPRs have been identified in 475 published genomes. Two Archaea out of thirty-seven and about half of Bacteria do not possess a CRISPR. Fine analysis of repeated sequences strongly supports the current view that new motifs are added at one end of the CRISPR adjacent to the putative promoter. Conclusion It is hoped that availability of a public database, regularly updated and which can be queried on the web will help in further dissecting and understanding CRISPR structure and flanking sequences evolution. Subsequent analyses of the intra-species CRISPR polymorphism will be facilitated by CRISPRFinder and the

  18. The chloroplast genome sequence of the green alga Leptosira terrestris: multiple losses of the inverted repeat and extensive genome rearrangements within the Trebouxiophyceae

    Directory of Open Access Journals (Sweden)

    Turmel Monique


    Full Text Available Abstract Background In the Chlorophyta – the green algal phylum comprising the classes Prasinophyceae, Ulvophyceae, Trebouxiophyceae and Chlorophyceae – the chloroplast genome displays a highly variable architecture. While chlorophycean chloroplast DNAs (cpDNAs) deviate considerably from the ancestral pattern described for the prasinophyte Nephroselmis olivacea, the degree of remodelling sustained by the two ulvophyte cpDNAs completely sequenced to date is intermediate relative to those observed for chlorophycean and trebouxiophyte cpDNAs. Chlorella vulgaris (Chlorellales) is currently the only photosynthetic trebouxiophyte whose complete cpDNA sequence has been reported. To gain insights into the evolutionary trends of the chloroplast genome in the Trebouxiophyceae, we sequenced cpDNA from the filamentous alga Leptosira terrestris (Ctenocladales). Results The 195,081-bp Leptosira chloroplast genome resembles the 150,613-bp Chlorella genome in lacking a large inverted repeat (IR) but differs greatly in gene order. Six of the conserved genes present in Chlorella cpDNA are missing from the Leptosira gene repertoire. The 106 conserved genes, four introns and 11 free standing open reading frames (ORFs) account for 48.3% of the genome sequence. This is the lowest gene density yet observed among chlorophyte cpDNAs. Contrary to the situation in Chlorella but similar to that in the chlorophycean Scenedesmus obliquus, the gene distribution is highly biased over the two DNA strands in Leptosira. Nine genes, compared to only three in Chlorella, have significantly expanded coding regions relative to their homologues in ancestral-type green algal cpDNAs. As observed in chlorophycean genomes, the rpoB gene is fragmented into two ORFs. Short repeats account for 5.1% of the Leptosira genome sequence and are present mainly in intergenic regions. Conclusion Our results highlight the great plasticity of the chloroplast genome in the Trebouxiophyceae and indicate

  19. Characterization and expression of the maize β-carbonic anhydrase gene repeat regions. (United States)

    Tems, Ursula; Burnell, James N


    In maize, carbonic anhydrase (CA; EC catalyzes the first reaction of the C(4) photosynthetic pathway; it catalyzes the hydration of CO(2) to bicarbonate and provides an inorganic carbon source for the primary carboxylation reaction catalyzed by phosphoenolpyruvate (PEP) carboxylase. The β-CA isozymes from maize, as well as other agronomically important NADP-malic enzyme (NADP-ME) type C(4) crops, have remained relatively uncharacterized but differ significantly from the β-CAs of other C(4) monocot species primarily due to transcript length and the presence of repeat sequences. This research confirmed earlier findings of repeat sequences in maize CA transcripts, and demonstrated that the gene encoding these transcripts is also composed of repeat sequences. One of the maize CA genes was sequenced and found to encode two domains, with distinct groups of exons corresponding to the repeat regions of the transcript. We have also shown that expression of a single repeat region of the CA transcript produced active enzyme that associated as a dimer and was composed primarily of α-helices, consistent with that observed for other plant CAs. As the presence of repeat regions in the CA gene is unique to NADP-ME type C(4) monocot species, the implications of these findings in the context of the evolution of the location and function of this C(4) pathway enzyme are strongly suggestive of CA gene duplication resulting in an evolutionary advantage and a higher photosynthetic efficiency. Copyright © 2010 Elsevier Masson SAS. All rights reserved.

  20. Expansion of protein domain repeats.

    Directory of Open Access Journals (Sweden)

    Asa K Björklund


    Full Text Available Many proteins, especially in eukaryotes, contain tandem repeats of several domains from the same family. These repeats have a variety of binding properties and are involved in protein-protein interactions as well as binding to other ligands such as DNA and RNA. The rapid expansion of protein domain repeats is assumed to have evolved through internal tandem duplications. However, the exact mechanisms behind these tandem duplications are not well-understood. Here, we have studied the evolution, function, protein structure, gene structure, and phylogenetic distribution of domain repeats. For this purpose we have assigned Pfam-A domain families to 24 proteomes with more sensitive domain assignments in the repeat regions. These assignments confirmed previous findings that eukaryotes, and in particular vertebrates, contain a much higher fraction of proteins with repeats compared with prokaryotes. The internal sequence similarity in each protein revealed that the domain repeats are often expanded through duplications of several domains at a time, while the duplication of one domain is less common. Many of the repeats appear to have been duplicated in the middle of the repeat region. This is in strong contrast to the evolution of other proteins that mainly works through additions of single domains at either terminus. Further, we found that some domain families show distinct duplication patterns, e.g., nebulin domains have mainly been expanded with a unit of seven domains at a time, while duplications of other domain families involve varying numbers of domains. Finally, no common mechanism for the expansion of all repeats could be detected. We found that the duplication patterns show no dependence on the size of the domains. Further, repeat expansion in some families can possibly be explained by shuffling of exons. However, exon shuffling could not have created all repeats.

  1. Loss and recovery of Arabidopsis-type telomere repeat sequences 5'-(TTTAGGG)(n)-3' in the evolution of a major radiation of flowering plants.


    Adams, S. P.; Hartman, T. P.; Lim, K. Y.; Chase, M. W.; Bennett, M. D.; Leitch, I. J.; Leitch, A. R.


    Fluorescent in situ hybridization and Southern blotting were used for showing the predominant absence of the Arabidopsis-type telomere repeat sequence (TRS) 5'-(TTTAGGG)(n)-3' (the 'typical' telomere) in a monocot clade which comprises up to 6300 species within Asparagales. Initially, two apparently disparate genera that lacked the typical telomere were identified. Here, we used the new angiosperm phylogenetic classification for predicting in which other related families such telomeres might ...

  2. ASAP: Amplification, sequencing & annotation of plastomes

    Directory of Open Access Journals (Sweden)

    Folta Kevin M


    Full Text Available Abstract Background Availability of DNA sequence information is vital for pursuing structural, functional and comparative genomics studies in plastids. Traditionally, the first step in mining the valuable information within a chloroplast genome requires sequencing a chloroplast plasmid library or BAC clones. These activities involve complicated preparatory procedures like chloroplast DNA isolation or identification of the appropriate BAC clones to be sequenced. Rolling circle amplification (RCA) is being used currently to amplify the chloroplast genome from purified chloroplast DNA and the resulting products are sheared and cloned prior to sequencing. Herein we present a universal high-throughput, rapid PCR-based technique to amplify, sequence and assemble plastid genome sequence from diverse species in a short time and at reasonable cost from total plant DNA, using the large inverted repeat region from strawberry and peach as proof of concept. The method exploits the highly conserved coding regions or intergenic regions of plastid genes. Using an informatics approach, chloroplast DNA sequence information from 5 available eudicot plastomes was aligned to identify the most conserved regions. Cognate primer pairs were then designed to generate ~1 – 1.2 kb overlapping amplicons from the inverted repeat region in 14 diverse genera. Results 100% coverage of the inverted repeat region was obtained from Arabidopsis, tobacco, orange, strawberry, peach, lettuce, tomato and Amaranthus. Over 80% coverage was obtained from distant species, including Ginkgo, loblolly pine and Equisetum. Sequence from the inverted repeat region of strawberry and peach plastome was obtained, annotated and analyzed. Additionally, a polymorphic region identified from gel electrophoresis was sequenced from tomato and Amaranthus. Sequence analysis revealed large deletions in these species relative to tobacco plastome thus exhibiting the utility of this method for structural and

  3. Exact Tandem Repeats Analyzer (E-TRA): A new program for DNA ...

    Indian Academy of Sciences (India)


    Advanced user defined parameters/options let the researchers use different minimum motif repeats ... E-TRA, we used 5,465,605 human EST sequences derived from 18,814,550 ..... repeat rates of T-cells, embryo and testis were higher.

  4. Molecular Characterization of Cultivated Bromeliad Accessions with Inter-Simple Sequence Repeat (ISSR) Markers

    Directory of Open Access Journals (Sweden)

    Yongming Yu


    Full Text Available Bromeliads are of great economic importance in flower production; however, little information is available so far on the genetic characterization of cultivated bromeliads. In the present study, a selection of cultivated bromeliads was characterized via inter-simple sequence repeat (ISSR) markers with an emphasis on genetic diversity and population structure. Twelve ISSR primers produced 342 bands, of which 287 (~84%) were polymorphic, with polymorphic bands per primer ranging from 17 to 34. The Jaccard's similarity ranged from 0.08 to 0.89 and averaged ~0.30 for the investigated bromeliads. The Bayesian-based approach, together with the un-weighted paired group method with arithmetic average (UPGMA)-based clustering and the principal coordinate analysis (PCoA), distinctly grouped the bromeliads from Neoregelia, Guzmania, and Vriesea into three separate clusters, corresponding well with their botanical classifications, whereas the bromeliads of Aechmea other than the recently selected hybrids were not well assigned to a cluster. Additionally, ISSR markers proved efficient for the identification of hybrids and bud sports of cultivated bromeliads. The findings will further our knowledge of the genetic variability within cultivated bromeliads and therefore facilitate future breeding of new varieties.
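
    The Jaccard similarity reported above is computed from presence/absence scores of bands. The following is a minimal sketch of that calculation, not the authors' pipeline; the accession names and band scores are hypothetical and chosen only for illustration.

```python
def jaccard_similarity(a, b):
    """Jaccard similarity between two 0/1 band profiles of equal length.

    Shared bands are divided by bands present in at least one profile;
    positions where both profiles lack the band are ignored.
    """
    if len(a) != len(b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 0.0


# Hypothetical band scores (1 = band present) for three accessions
profiles = {
    "acc_A": [1, 1, 0, 1, 0, 0],
    "acc_B": [1, 0, 0, 1, 1, 0],
    "acc_C": [0, 0, 1, 0, 1, 1],
}

if __name__ == "__main__":
    names = sorted(profiles)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            s = jaccard_similarity(profiles[first], profiles[second])
            print(f"{first} vs {second}: {s:.2f}")
```

    A pairwise matrix of such values is the usual input to UPGMA clustering or PCoA, although the abstract does not state which software was used for those steps.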

  5. The proviral genome of radiation leukemia virus: Molecular cloning, nucleotide sequence of its long terminal repeat and integration in lymphoma cell DNA

    International Nuclear Information System (INIS)

    Janowski, M.; Merregaert, J.; Boniver, J.; Maisin, J.R.


    The proviral genome of a thymotropic and leukemogenic C57BL/Ka mouse retrovirus, RadLV/VL3(T+L+), was cloned as a biologically active PstI insert in the bacterial plasmid pBR322. Its restriction map was compared to those, already known, of two nonthymotropic and nonleukemogenic viruses of the same mouse strain, the ecotropic BL/Ka(B) and the xenotropic constituent of the radiation leukemia virus complex (RadLV). Differences were observed in the pol gene and in the env gene. Moreover, the nucleotide sequence of the RadLV/VL3(T+L+) long terminal repeat revealed the existence of two copies of a 42 bp long sequence, separated by 11 nucleotides and of which BL/Ka(B) possesses only one copy.

  6. Genetic Diversity and Sequence Variations at Growth Hormone Loci among Composite and Hereford Populations of Beef Cattle

    Directory of Open Access Journals (Sweden)



    Full Text Available A total of 194 Hereford and 235 composite breed cattle from Wokalup Research Station were used in this study. The aims of the study were to investigate polymorphisms in the growth hormone gene in the composite and purebred Hereford herds from the Wokalup selection experiment, to compare genetic diversity in the growth hormone gene between the breeds, and to sequence the growth hormone loci of the composite and purebred Hereford herds and compare them with the published sequence from GenBank. Genomic DNA was extracted using the Wizard genomic DNA purification system from Promega. Two fragments of the growth hormone gene were amplified by PCR and then analyzed by RFLP. Each genotype at both loci was sequenced. PCR products of each genotype were cloned into PCR II and transformed; colony selection and plasmid DNA extraction were followed by cycle sequencing. Polymorphisms were found in both breeds of cattle at both loci, GH-L1 and GH-L2, of the growth hormone gene by PCR-RFLP analysis. Sequencing analysis confirmed the RFLP data: the polymorphism detected using AluI at GH-L1 is due to a leucine/valine substitution at position 127, while the polymorphism at the MspI restriction site was caused by a C-to-T transition at position +837.

  7. Analysis of simple sequence repeats in the Gaeumannomyces graminis var. tritici genome and the development of microsatellite markers. (United States)

    Li, Wei; Feng, Yanxia; Sun, Haiyan; Deng, Yuanyu; Yu, Hanshou; Chen, Huaigu


    Understanding the genetic structure of Gaeumannomyces graminis var. tritici is essential for the establishment of efficient disease control strategies. It is becoming clear that microsatellites, or simple sequence repeats (SSRs), play an important role in genome organization and phenotypic diversity, and are a large source of genetic markers for population genetics and meiotic maps. In this study, we examined the G. graminis var. tritici genome (1) to analyze its pattern of SSRs, (2) to compare it with other plant pathogenic filamentous fungi, such as Magnaporthe oryzae and M. poae, and (3) to identify new polymorphic SSR markers for genetic diversity. The G. graminis var. tritici genome was rich in SSRs; a total of 13,650 SSRs were identified, with mononucleotides being the most common motifs. In coding regions, the densities of tri- and hexanucleotides were significantly higher than in noncoding regions. The di-, tri-, tetra-, penta-, and hexanucleotide repeats in the G. graminis var. tritici genome were more abundant than the same repeats in M. oryzae and M. poae. From 115 devised primers, 39 SSRs were polymorphic among G. graminis var. tritici isolates, and 8 primers were randomly selected to analyze 116 isolates from China. The number of alleles varied from 2 to 7 and the expected heterozygosity (He) from 0.499 to 0.837. In conclusion, SSRs developed in this study were highly polymorphic, and our analysis indicated that G. graminis var. tritici is a species with high genetic diversity. The results provide a pioneering report for several applications, such as the assessment of population structure and genetic diversity of G. graminis var. tritici.
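
    A genome-wide SSR survey of the kind described above starts by scanning the sequence for perfect repeats of 1 to 6 bp motifs. The sketch below illustrates one simple way to do this with regular expressions. It is not the tool used in the study, and the minimum repeat counts in `MIN_REPEATS` are assumed values, since the abstract does not give the thresholds.

```python
import re

# Minimum repeat counts per motif length (assumed values, not from the study)
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # One k-mer followed by at least n - 1 further copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. "AA" repeats, which are really mononucleotide runs
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical test sequence containing three SSRs
    demo = "GG" + "A" * 12 + "CC" + "AG" * 7 + "T" + "CAG" * 5 + "TT"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

    Counting the returned hits by motif length gives the mono- to hexanucleotide breakdown that such surveys report. Compound and imperfect repeats would need additional handling.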

  8. Simple sequence repeats and compositional bias in the bipartite Ralstonia solanacearum GMI1000 genome

    Directory of Open Access Journals (Sweden)

    Vandamme Peter


    Full Text Available Abstract Background Ralstonia solanacearum is an important plant pathogen. The genome of R. solanacearum GMI1000 is organised into two replicons (a 3.7-Mb chromosome and a 2.1-Mb megaplasmid) and this bipartite genome structure is characteristic for most R. solanacearum strains. To determine whether the megaplasmid was acquired via recent horizontal gene transfer or is part of an ancestral single chromosome, we compared the abundance, distribution and composition of simple sequence repeats (SSRs) between both replicons and also compared the respective compositional biases. Results Our data show that both replicons are very similar with respect to distribution and composition of SSRs and presence of compositional biases. Minor variations in SSR and compositional biases observed may be attributable to minor differences in gene expression and regulation of gene expression or can be attributed to the small sample numbers observed. Conclusions The observed similarities indicate that both replicons have shared a similar evolutionary history and thus suggest that the megaplasmid was not recently acquired from other organisms by lateral gene transfer but is a part of an ancestral R. solanacearum chromosome.

  9. ACCA phosphopeptide recognition by the BRCT repeats of BRCA1. (United States)

    Ray, Hind; Moreau, Karen; Dizin, Eva; Callebaut, Isabelle; Venezia, Nicole Dalla


    The tumour suppressor gene BRCA1 encodes a 220 kDa protein that participates in multiple cellular processes. The BRCA1 protein contains a tandem of two BRCT repeats at its carboxy-terminal region. The majority of disease-associated BRCA1 mutations affect this region and give the BRCT repeats a central role in the BRCA1 tumour suppressor function. The BRCT repeats have been shown to mediate phospho-dependent protein-protein interactions. They recognize phosphorylated peptides using a recognition groove that spans both BRCT repeats. We previously identified an interaction between the tandem of BRCA1 BRCT repeats and ACCA, which was disrupted by germ line BRCA1 mutations that affect the BRCT repeats. We recently showed that BRCA1 modulates ACCA activity through its phospho-dependent binding to ACCA. To delineate the region of ACCA that is crucial for the regulation of its activity by BRCA1, we searched for potential phosphorylation sites in the ACCA sequence that might be recognized by the BRCA1 BRCT repeats. Using sequence analysis and structure modelling, we proposed the Ser1263 residue as the most favourable candidate among six residues for recognition by the BRCA1 BRCT repeats. Using experimental approaches, such as GST pull-down assay with Bosc cells, we clearly showed that phosphorylation of only Ser1263 was essential for the interaction of ACCA with the BRCT repeats. We finally demonstrated by immunoprecipitation of ACCA in cells that the whole BRCA1 protein interacts with ACCA when phosphorylated on Ser1263.

  10. Highly sensitive detection of individual HEAT and ARM repeats with HHpred and COACH. (United States)

    Kippert, Fred; Gerloff, Dietlind L


    HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. It provides high

  11. FRB 121102: A Starquake-induced Repeater? (United States)

    Wang, Weiyang; Luo, Rui; Yue, Han; Chen, Xuelei; Lee, Kejia; Xu, Renxin


    Since its initial discovery, the fast radio burst (FRB) FRB 121102 has been found to be repeating with millisecond-duration pulses. Very recently, 14 new bursts were detected by the Green Bank Telescope during its continuous monitoring observations. In this paper, we show that the burst energy distribution has a power-law form which is very similar to the Gutenberg–Richter law of earthquakes. In addition, the distribution of burst waiting time can be described as a Poissonian or Gaussian distribution, which is consistent with earthquakes, while the aftershock sequence exhibits some local correlations. These findings suggest that the repeating FRB pulses may originate from the starquakes of a pulsar. Noting that the soft gamma-ray repeaters (SGRs) also exhibit such distributions, the FRB could be powered by some starquake mechanisms associated with the SGRs, including the crustal activity of a magnetar or solidification-induced stress of a newborn strangeon star. These conjectures could be tested with more repeating samples.

  12. Alternative splicing of human elastin mRNA indicated by sequence analysis of cloned genomic and complementary DNA

    International Nuclear Information System (INIS)

    Indik, Z.; Yeh, H.; Ornstein-Goldstein, N.; Sheppard, P.; Anderson, N.; Rosenbloom, J.C.; Peltonen, L.; Rosenbloom, J.


    Poly(A)+ RNA, isolated from a single 7-mo fetal human aorta, was used to synthesize cDNA by the RNase H method, and the cDNA was inserted into λgt10. Recombinant phage containing elastin sequences were identified by hybridization with cloned, exon-containing fragments of the human elastin gene. Three clones containing inserts of 3.3, 2.7, and 2.3 kilobases were selected for further analysis. Three overlapping clones containing 17.8 kilobases of the human elastin gene were also isolated from genomic libraries. Complete sequence analysis of the six clones demonstrated that: (i) the cDNA encompassed the entire translated portion of the mRNA encoding 786 amino acids, including several unusual hydrophilic amino acid sequences not previously identified in porcine tropoelastin, (ii) exons encoding either hydrophobic or crosslinking domains in the protein alternated in the gene, and (iii) a great abundance of Alu repetitive sequences occurred throughout the introns. The data also indicated substantial alternative splicing of the mRNA. These results suggest the potential for significant variation in the precise molecular structure of the elastic fiber in the human population.

  13. Population data of six Alu insertions in indigenous groups from Sabah, Malaysia. (United States)

    Kee, B P; Chua, K H; Lee, P C; Lian, L H


    The present study is the first to report the genetic relatedness of indigenous populations of Sabah, Malaysia, using a set of Indel markers (HS4.32, TPA25, APO, PV92, B65 and HS3.23). The primary aim was to assess the genetic relationships among these populations and with populations from other parts of the world by examining the distribution of these markers. A total of 504 volunteers from the three largest indigenous groups, i.e. Kadazan-Dusun, Bajau and Rungus, were recruited for the study. Six Alu insertions were typed by PCR with specific primer sets. All insertions were found to be present at different frequencies, ranging from 0.170 to 0.970. The heterozygosity of most of the markers was high (>0.4), with the exception of HS3.23 and APO. A genetic differentiation study revealed that these populations are closely related to each other (G(ST) = 0.006). A principal component plot showed that these populations have higher affinity to Mainland South East Asia/East Asia populations than to Island Southeast Asia (ISEA) populations. In summary, these indigenous groups were closely associated in terms of their genetic composition. This finding also supports the colonization model of ISEA, which suggests that the inhabitants of this region were mostly descendants from Southern China.
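
    For a biallelic insertion/deletion marker such as an Alu insertion, the insertion frequency and the expected heterozygosity follow directly from genotype counts. The sketch below assumes Hardy-Weinberg proportions and uses hypothetical counts; the abstract does not say which heterozygosity estimator the authors applied.

```python
def insertion_frequency(n_ins_ins, n_ins_del, n_del_del):
    """Frequency of the insertion allele from the three genotype counts."""
    n_individuals = n_ins_ins + n_ins_del + n_del_del
    if n_individuals == 0:
        raise ValueError("no genotypes supplied")
    # Each individual carries two alleles; heterozygotes carry one insertion
    return (2 * n_ins_ins + n_ins_del) / (2 * n_individuals)


def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) for a biallelic locus."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * p * (1.0 - p)


if __name__ == "__main__":
    # Hypothetical genotype counts: insertion homozygotes, heterozygotes,
    # and homozygotes lacking the insertion
    p = insertion_frequency(50, 40, 10)
    print(f"insertion frequency = {p:.3f}")
    print(f"expected heterozygosity = {expected_heterozygosity(p):.3f}")
```

    Because 2p(1 - p) peaks at 0.5 when p = 0.5, markers whose insertion frequency lies near either end of the reported range necessarily show low heterozygosity.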

  14. Identification of Variable-Number Tandem-Repeat (VNTR) Sequences in Acinetobacter baumannii and Interlaboratory Validation of an Optimized Multiple-Locus VNTR Analysis Typing Scheme (United States)

    Pourcel, Christine; Minandri, Fabrizia; Hauck, Yolande; D'Arezzo, Silvia; Imperi, Francesco; Vergnaud, Gilles; Visca, Paolo


    Acinetobacter baumannii is an important opportunistic pathogen responsible for nosocomial outbreaks, mostly occurring in intensive care units. Due to the multiplicity of infection sources, reliable molecular fingerprinting techniques are needed to establish epidemiological correlations among A. baumannii isolates. Multiple-locus variable-number tandem-repeat analysis (MLVA) has proven to be a fast, reliable, and cost-effective typing method for several bacterial species. In this study, an MLVA assay compatible with simple PCR- and agarose gel-based electrophoresis steps as well as with high-throughput automated methods was developed for A. baumannii typing. Preliminarily, 10 potential polymorphic variable-number tandem repeats (VNTRs) were identified upon bioinformatic screening of six annotated genome sequences of A. baumannii. A collection of 7 reference strains plus 18 well-characterized isolates, including unique types and representatives of the three international A. baumannii lineages, was then evaluated in a two-center study aimed at validating the MLVA assay and comparing it with other genotyping assays, namely, macrorestriction analysis with pulsed-field gel electrophoresis (PFGE) and PCR-based sequence group (SG) profiling. The results showed that MLVA can discriminate between isolates with identical PFGE types and SG profiles. A panel of eight VNTR markers was selected, all showing the ability to be amplified and good amounts of polymorphism in the majority of strains. Independently generated MLVA profiles, composed of an ordered string of allele numbers corresponding to the number of repeats at each VNTR locus, were concordant between centers. Typeability, reproducibility, stability, discriminatory power, and epidemiological concordance were excellent. A database containing information and MLVA profiles for several A. baumannii strains is available from PMID:21147956
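
    The abstract describes an MLVA profile as an ordered string of allele numbers, one per VNTR locus, and reports discriminatory power. The sketch below builds such profile strings and computes the Hunter-Gaston (Simpson's) index of diversity, a measure commonly used for typing schemes. Whether the authors used this exact index is an assumption, and the repeat counts are hypothetical.

```python
from collections import Counter


def mlva_profile(repeat_counts):
    """Join per-locus repeat counts, in a fixed locus order, into one string."""
    return "-".join(str(n) for n in repeat_counts)


def discriminatory_index(profiles):
    """Hunter-Gaston index: probability that two isolates drawn at random
    without replacement have different profiles."""
    n = len(profiles)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(profiles)
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (n * (n - 1))


if __name__ == "__main__":
    # Hypothetical repeat counts at eight VNTR loci for five isolates
    isolates = [
        [7, 3, 12, 5, 9, 4, 6, 2],
        [7, 3, 12, 5, 9, 4, 6, 2],  # identical to the first isolate
        [7, 3, 11, 5, 9, 4, 6, 2],
        [6, 4, 12, 5, 8, 4, 6, 3],
        [9, 3, 10, 6, 9, 5, 7, 2],
    ]
    profiles = [mlva_profile(i) for i in isolates]
    print(profiles[0])
    print(f"D = {discriminatory_index(profiles):.2f}")
```

    Comparing profile strings produced independently by two laboratories for the same isolates is one simple way to check the concordance between centers that the abstract mentions.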

  15. Survey of clustered regularly interspaced short palindromic repeats and their associated Cas proteins (CRISPR/Cas) systems in multiple sequenced strains of Klebsiella pneumoniae. (United States)

    Ostria-Hernández, Martha Lorena; Sánchez-Vallejo, Carlos Javier; Ibarra, J Antonio; Castro-Escarpulli, Graciela


    In recent years the emergence of multidrug resistant Klebsiella pneumoniae strains has been an increasingly common event. This opportunistic species is one of the five main bacterial pathogens that cause hospital infections worldwide and multidrug resistance has been associated with the presence of high molecular weight plasmids. Plasmids are generally acquired through horizontal transfer and therefore it is possible that systems that prevent the entry of foreign genetic material are inactive or absent. One of these systems is CRISPR/Cas. However, little is known regarding the clustered regularly interspaced short palindromic repeats and their associated Cas proteins (CRISPR/Cas) system in K. pneumoniae. The adaptive immune system CRISPR/Cas has been shown to limit the entry of foreign genetic elements into bacterial organisms and in some bacteria it has been shown to be involved in regulation of virulence genes. Thus in this work we used bioinformatics tools to determine the presence or absence of CRISPR/Cas systems in available K. pneumoniae genomes. The complete CRISPR/Cas system was identified in two out of the eight complete K. pneumoniae genome sequences and in four out of the 44 available draft genome sequences. These strains carry eight cas genes similar to those found in Escherichia coli, suggesting they belong to the type I-E group, although their arrangement is slightly different. As for the CRISPR sequences, the average lengths of the direct repeats and spacers were 29 and 33 bp, respectively. BLAST searches demonstrated that 38 of the 116 spacer sequences (33%) are significantly similar to either plasmid, phage or genome sequences, while the remaining 78 sequences (67%) showed no significant similarity to other sequences. The region where the CRISPR/Cas systems were located is the same in all the Klebsiella genomes containing it; it has a syntenic architecture and is located among genes encoding proteins likely involved in

  16. Genetic Diversity of Pinus nigra Arn. Populations in Southern Spain and Northern Morocco Revealed By Inter-Simple Sequence Repeat Profiles

    Directory of Open Access Journals (Sweden)

    Oussama Ahrazem


    Full Text Available Eight Pinus nigra Arn. populations from Southern Spain and Northern Morocco were examined using inter-simple sequence repeat markers to characterize the genetic variability amongst populations. Pair-wise population genetic distance ranged from 0.031 to 0.283, with a mean of 0.150 between populations. The highest inter-population average distance was between PaCU from Cuenca and YeCA from Cazorla, while the lowest distance was between TaMO from Morocco and MA Sierra Mágina populations. Analysis of molecular variance (AMOVA) and Nei's genetic diversity analyses revealed higher genetic variation within the same population than among different populations. Genetic differentiation (Gst) was 0.233. Cuenca showed the highest Nei's genetic diversity followed by the Moroccan region, Sierra Mágina, and Cazorla region. However, clustering of populations was not in accordance with their geographical locations. Principal component analysis showed the presence of two major groups (Group 1 contained all populations from Cuenca while Group 2 contained populations from Cazorla, Sierra Mágina and Morocco), while Bayesian analysis revealed the presence of three clusters. The low genetic diversity observed in PaCU and YeCA is probably a consequence of inappropriate management since no estimation of genetic variability was performed before the silvicultural treatments. The data indicate that the inter-simple sequence repeat (ISSR) method is sufficiently informative and powerful to assess genetic variability among populations of P. nigra.

  17. Repair of DNA treated with γ-irradiation and chemical carcinogens. Final report, June 1, 1981-May 31, 1984

    International Nuclear Information System (INIS)

    Goldthwait, D.A.


    Work done in the past three years has been on DNA repair, on genetic transposition and on the effect of carcinogens on Alu sequence transcription. DNA repair work was completed on β-propiolactone DNA adducts, on procaryotic and eucaryotic enzymes capable of removal of 3-methyladenine from DNA, and on in vitro repair of nucleosomal core particle DNA and chromatin DNA. Attempts were made to isolate a human transposable element through the isolation of double stranded RNA and probing of a human library. Experiments were also done to determine whether carcinogens altered the expression of Alu sequences in human DNA.

  18. Cell type-specific termination of transcription by transposable element sequences. (United States)

    Conley, Andrew B; Jordan, I King


    Transposable elements (TEs) encode sequences necessary for their own transposition, including signals required for the termination of transcription. TE sequences within the introns of human genes show an antisense orientation bias, which has been proposed to reflect selection against TE sequences in the sense orientation owing to their ability to terminate the transcription of host gene transcripts. While there is evidence in support of this model for some elements, the extent to which TE sequences actually terminate transcription of human genes across the genome remains an open question. Using high-throughput sequencing data, we have characterized over 9,000 distinct TE-derived sequences that provide transcription termination sites for 5,747 human genes across eight different cell types. Rarefaction curve analysis suggests that there may be twice as many TE-derived termination sites (TE-TTS) genome-wide among all human cell types. The local chromatin environment for these TE-TTS is similar to that seen for 3' UTR canonical TTS and distinct from the chromatin environment of other intragenic TE sequences. However, those TE-TTS located within the introns of human genes were found to be far more cell type-specific than the canonical TTS. TE-TTS were much more likely to be found in the sense orientation than other intragenic TE sequences of the same TE family and TE-TTS in the sense orientation terminate transcription more efficiently than those found in the antisense orientation. Alu sequences were found to provide a large number of relatively weak TTS, whereas LTR elements provided a smaller number of much stronger TTS. TE sequences provide numerous termination sites to human genes, and TE-derived TTS are particularly cell type-specific. Thus, TE sequences provide a powerful mechanism for the diversification of transcriptional profiles between cell types and among evolutionary lineages, since most TE-TTS are evolutionarily young. The extent of transcription

  19. Unique CCT repeats mediate transcription of the TWIST1 gene in mesenchymal cell lines

    International Nuclear Information System (INIS)

    Ohkuma, Mizue; Funato, Noriko; Higashihori, Norihisa; Murakami, Masanori; Ohyama, Kimie; Nakamura, Masataka


    TWIST1, a basic helix-loop-helix transcription factor, plays critical roles in embryo development, cancer metastasis and mesenchymal progenitor differentiation. Little is known about transcriptional regulation of TWIST1 expression. Here we identified DNA sequences responsible for TWIST1 expression in mesenchymal lineage cell lines. Reporter assays with TWIST1 promoter mutants defined the -102 to -74 sequences that are essential for TWIST1 expression in human and mouse mesenchymal cell lines. Tandem repeats of CCT, but not putative CREB and NF-κB sites in the sequences, substantially supported activity of the TWIST1 promoter. Electrophoretic mobility shift assay demonstrated that the DNA sequences with the CCT repeats formed complexes with nuclear factors containing at least Sp1 and Sp3. These results suggest a critical role for the CCT repeats, in association with Sp1 and Sp3 factors, in sustaining expression of the TWIST1 gene in mesenchymal cells.

  20. Comparing Whole-Genome Sequencing with Sanger Sequencing for spa Typing of Methicillin-Resistant Staphylococcus aureus

    DEFF Research Database (Denmark)

    Bartels, Mette Damkjaer; Petersen, Andreas; Worning, Peder


    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and ...

  1. Interstitial telomere-like repeats in the Arabidopsis thaliana genome. (United States)

    Uchida, Wakana; Matsunaga, Sachihiro; Sugiyama, Ryuji; Kawano, Shigeyuki


    Eukaryotic chromosomal ends are protected by telomeres, which are thought to play an important role in ensuring the complete replication of chromosomes. On the other hand, non-functional telomere-like repeats in the interchromosomal regions (interstitial telomeric repeats; ITRs) have been reported in several eukaryotes. In this study, we identified eight ITRs in the Arabidopsis thaliana genome, each consisting of complete and degenerate 300- to 1200-bp sequences. The ITRs were grouped into three classes (class IA-B, class II, and class IIIA-E) based on the degeneracy of the telomeric repeats in ITRs. The telomeric repeats of the two ITRs in class I were conserved for the most part, whereas the single ITR in class II, and the five ITRs in class III were relatively degenerated. In addition, degenerate ITRs were surrounded by common sequences that shared 70-100% homology to each other; these are named ITR-adjacent sequences (IAS). Although the genomic regions around ITRs in class I lacked IAS, those around ITRs in class II contained IAS (IASa), and those around five ITRs in class III had nine types of IAS (IASb, c, d, e, f, g, h, i, and j). Ten IAS types in classes II and III showed no significant homology to each other. The chromosomal locations of ITRs and IAS were not category-related, but most of them were adjacent to, or part of, a centromere. These results show that the A. thaliana genome has undergone chromosomal rearrangements, such as end-fusions and segmental duplications.

  2. From the slave quarters to the tenement: history and literature in Aluísio Azevedo and João Ubaldo Ribeiro

    Directory of Open Access Journals (Sweden)

    Dalcastagnè Regina


    Full Text Available The article analyzes two Brazilian novels from different periods: O cortiço, by Aluísio Azevedo, published in 1890, and Viva o povo brasileiro, by João Ubaldo Ribeiro, published in 1984. Despite the many differences that separate them, both narrate the process by which the Brazilian elites were formed, revealing the violence involved in it. Azevedo's naturalism and Ribeiro's parodic tone each establish, in their own way, a thought-provoking dialogue with Brazilian history.

  3. Carrier frequency of a nonsense mutation in the adenosine deaminase (ADA) gene implies a high incidence of ADA-deficient severe combined immunodeficiency (SCID) in Somalia and a single, common haplotype indicates common ancestry

    DEFF Research Database (Denmark)

    Sanchez Sanchez, Juan Jose; Monaghan, Gemma; Børsting, Claus


    Inherited adenosine deaminase (ADA) deficiency is a rare metabolic disorder that causes immunodeficiency, varying from severe combined immunodeficiency (SCID) in the majority of cases to a less severe form in a small minority of patients. Five patients of Somali origin from four unrelated families, with severe ADA-SCID, were registered in the Greater London area. Patients and their parents were investigated for the nonsense mutation Q3X (ADA c7C>T), two missense mutations K80R (ADA c239A>G) and R142Q (ADA c425G>A), and a TAAA repeat located at the 3' end of an Alu element (AluVpA) positioned 1.1 kb upstream of the ADA transcription start site. All patients were homozygous for the haplotype ADA-7T/ADA-239G/ADA-425G/AluVpA7. Among 207 Somali immigrants to Denmark, the frequency of ADA c7C>T and the maximum likelihood estimate of the frequency of the haplotype ADA-7T/ADA-239G/ADA-425G/AluVpA7 were both...

  4. Acquiring a cognitive skill with a new repeating version of the Tower of London task. (United States)

    Ouellet, Marie-Christine; Beauchamp, Miriam H; Owen, Adrian M; Doyon, Julien


    A computerized version of the Tower of London task was used to investigate cognitive skill learning. Thirty-six healthy volunteers were assigned to either a random condition (nonrecurring problems), or to a sequence condition in which, unbeknownst to the subjects, a repeating sequence of three problems was presented. Indices of execution, planning, and total time, as well as number of moves performed, were used to measure behavioural change. Subjects' performance improved in both conditions across blocks of practice. A distinct learning effect related to the repeating sequence was also observed. This suggests that a specific skill that reflects procedural learning of the strategies, rules, and procedures pertaining to repeating problems can develop over and above a more general skill at solving cognitive planning problems with practice.

  5. Gene conversion homogenizes the CMT1A paralogous repeats

    Directory of Open Access Journals (Sweden)

    Hurles Matthew E


    Full Text Available Abstract Background Non-allelic homologous recombination between paralogous repeats is increasingly being recognized as a major mechanism causing both pathogenic microdeletions and duplications, and structural polymorphism in the human genome. It has recently been shown empirically that gene conversion can homogenize such repeats, resulting in longer stretches of absolute identity that may increase the rate of non-allelic homologous recombination. Results Here, a statistical test to detect gene conversion between pairs of non-coding sequences is presented. It is shown that the 24 kb Charcot-Marie-Tooth type 1A paralogous repeats (CMT1A-REPs) exhibit the imprint of gene conversion processes whilst control orthologous sequences do not. In addition, Monte Carlo simulations of the evolutionary divergence of the CMT1A-REPs, incorporating two alternative models for gene conversion, generate repeats that are statistically indistinguishable from the observed repeats. Bounds are placed on the rate of these conversion processes, with central values of 1.3 × 10^-4 and 5.1 × 10^-5 per generation for the alternative models. Conclusions The evidence presented here suggests that gene conversion may have played an important role in the evolution of the CMT1A-REP paralogous repeats. The rates of these processes are such that it is probable that homogenized CMT1A-REPs are polymorphic within modern populations. Gene conversion processes are similarly likely to play an important role in the evolution of other segmental duplications and may influence the rate of non-allelic homologous recombination between them.

  6. Epigenetic and Transcriptional Modifications in Repetitive Elements in Petrol Station Workers Exposed to Benzene and MTBE

    Directory of Open Access Journals (Sweden)

    Federica Rota


    Full Text Available Benzene, a known human carcinogen, and methyl tert-butyl ether (MTBE), not classifiable as to its carcinogenicity, are fuel-related pollutants. This study investigated the effect of these chemicals on epigenetic and transcriptional alterations in DNA repetitive elements. In 89 petrol station workers and 90 non-occupationally exposed subjects the transcriptional activity of retrotransposons (LINE-1, Alu), the methylation on repeated-element DNA, and of H3K9 histone, were investigated in peripheral blood lymphocytes. Median work shift exposure to benzene and MTBE was 59 and 408 µg/m³ in petrol station workers, and 4 and 3.5 µg/m³ in controls. Urinary benzene (BEN-U), S-phenylmercapturic acid, and MTBE were significantly higher in workers than in controls, while trans,trans-muconic acid (tt-MA) was comparable between the two groups. Increased BEN-U was associated with increased Alu-Y and Alu-J expression; moreover, increased tt-MA was associated with increased Alu-Y and Alu-J and LINE-1 (L1-5′UTR) expression. Among repetitive element methylation, only L1-Pa5 was hypomethylated in petrol station workers compared to controls. While L1-Ta and Alu-YD6 methylation was not associated with benzene exposure, a negative association with urinary MTBE was observed. The methylation status of histone H3K9 was not associated with either benzene or MTBE exposure. Overall, these findings only partially support previous observations linking benzene exposure with global DNA hypomethylation.

  7. High-throughput sequencing of core STR loci for forensic genetic investigations using the Roche Genome Sequencer FLX platform

    DEFF Research Database (Denmark)

    Fordyce, Sarah Louise; Avila Arcos, Maria del Carmen; Rockenbauer, Eszter


    repeat units. These methods do not allow for the full resolution of STR base composition that sequencing approaches could provide. Here we present an STR profiling method based on the use of the Roche Genome Sequencer (GS) FLX to simultaneously sequence multiple core STR loci. Using this method...

  8. Comprehensive analysis of pathogenic deletion variants in Fanconi anemia genes. (United States)

    Flynn, Elizabeth K; Kamat, Aparna; Lach, Francis P; Donovan, Frank X; Kimble, Danielle C; Narisu, Narisu; Sanborn, Erica; Boulad, Farid; Davies, Stella M; Gillio, Alfred P; Harris, Richard E; MacMillan, Margaret L; Wagner, John E; Smogorzewska, Agata; Auerbach, Arleen D; Ostrander, Elaine A; Chandrasekharappa, Settara C


    Fanconi anemia (FA) is a rare recessive disease resulting from mutations in one of at least 16 different genes. Mutation types and phenotypic manifestations of FA are highly heterogeneous and influence the clinical management of the disease. We analyzed 202 FA families for large deletions, using high-resolution comparative genome hybridization arrays, single-nucleotide polymorphism arrays, and DNA sequencing. We found pathogenic deletions in 88 FANCA, seven FANCC, two FANCD2, and one FANCB families. We find 35% of FA families carry large deletions, accounting for 18% of all FA pathogenic variants. Cloning and sequencing across the deletion breakpoints revealed that 52 FANCA deletion ends, and one FANCC deletion end extended beyond the gene boundaries, potentially affecting neighboring genes with phenotypic consequences. Seventy-five percent of the FANCA deletions are Alu-Alu mediated, predominantly by AluY elements, and appear to be caused by nonallelic homologous recombination. Individual Alu hotspots were identified. Defining the haplotypes of four FANCA deletions shared by multiple families revealed that three share a common ancestry. Knowing the exact molecular changes that lead to the disease may be critical for a better understanding of the FA phenotype, and to gain insight into the mechanisms driving these pathogenic deletion variants. © 2014 WILEY PERIODICALS, INC.

  9. Complete chloroplast genome sequence of a major economic species, Ziziphus jujuba (Rhamnaceae). (United States)

    Ma, Qiuyue; Li, Shuxian; Bi, Changwei; Hao, Zhaodong; Sun, Congrui; Ye, Ning


    Ziziphus jujuba is an important woody plant with high economic and medicinal value. Here, we analyzed and characterized the complete chloroplast (cp) genome of Z. jujuba, the first member of the Rhamnaceae family for which the chloroplast genome sequence has been reported. We also built a web browser for navigating the cp genome of Z. jujuba. Sequence analysis showed that this cp genome is 161,466 bp long and has a typical quadripartite structure of large (LSC, 89,120 bp) and small (SSC, 19,348 bp) single-copy regions separated by a pair of inverted repeats (IRs, 26,499 bp). The sequence contained 112 unique genes, including 78 protein-coding genes, 30 transfer RNAs, and four ribosomal RNAs. The genome structure, gene order, GC content, and codon usage are similar to other typical angiosperm cp genomes. A total of 38 tandem repeats, two forward repeats, and three palindromic repeats were detected in the Z. jujuba cp genome. Simple sequence repeat (SSR) analysis revealed that most SSRs were AT-rich. The homopolymer regions in the cp genome of Z. jujuba were verified and manually corrected by Sanger sequencing. One-third of mononucleotide repeats were found to be erroneously sequenced by 454 pyrosequencing, which produced sequences 1-4 bases shorter than those obtained by Sanger sequencing. Analyzing the cp genome of Z. jujuba revealed that the IR contraction and expansion events resulted in ycf1 and rps19 pseudogenes. A phylogenetic analysis based on 64 protein-coding genes showed that Z. jujuba was closely related to members of the Elaeagnaceae family, which will be helpful for phylogenetic studies of other Rosales species. The complete cp genome sequence of Z. jujuba will facilitate population, phylogenetic, and cp genetic engineering studies of this economic plant.
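
    The region lengths reported in this abstract can be checked against the stated genome size. The short Python sketch below assumes the usual quadripartite layout (one LSC, one SSC, and two copies of the inverted repeat); the function name is illustrative and not taken from the paper.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 89_120    # large single-copy region
SSC = 19_348    # small single-copy region
IR = 26_499     # one inverted-repeat copy
TOTAL = 161_466 # reported genome length


def quadripartite_length(lsc, ssc, ir):
    """Length of a circular genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


assembled = quadripartite_length(LSC, SSC, IR)
print(assembled, assembled == TOTAL)
```

    Under that assumption the four regions sum to the reported 161,466 bp.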

  10. Discovery of Escherichia coli CRISPR sequences in an undergraduate laboratory. (United States)

    Militello, Kevin T; Lazatin, Justine C


    Clustered regularly interspaced short palindromic repeats (CRISPRs) represent a novel type of adaptive immune system found in eubacteria and archaebacteria. CRISPRs have recently generated a lot of attention due to their unique ability to catalog foreign nucleic acids, their ability to destroy foreign nucleic acids in a mechanism that shares some similarity to RNA interference, and the ability to utilize reconstituted CRISPR systems for genome editing in numerous organisms. In order to introduce CRISPR biology into an undergraduate upper-level laboratory, a five-week set of exercises was designed to allow students to examine the CRISPR status of uncharacterized Escherichia coli strains and to allow the discovery of new repeats and spacers. Students started the project by isolating genomic DNA from E. coli and amplifying the iap CRISPR locus using the polymerase chain reaction (PCR). The PCR products were analyzed by Sanger DNA sequencing, and the sequences were examined for the presence of CRISPR repeat sequences. The regions between the repeats, the spacers, were extracted and analyzed with BLASTN searches. Overall, CRISPR loci were sequenced from several previously uncharacterized E. coli strains and one E. coli K-12 strain. Sanger DNA sequencing resulted in the discovery of 36 spacer sequences and their corresponding surrounding repeat sequences. Five of the spacers were homologous to foreign (non-E. coli) DNA. Assessment of the laboratory indicates that improvements were made in the ability of students to answer questions relating to the structure and function of CRISPRs. Future directions of the laboratory are presented and discussed. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(3):262-269, 2017.
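
    The spacer-extraction step described in this abstract (taking the regions between consecutive repeats) can be sketched in a few lines of Python. This is a minimal illustration, not the course's actual procedure: it assumes exact repeat copies, whereas real CRISPR repeats can vary slightly, and both the repeat and the locus below are made-up toy sequences.

```python
def extract_spacers(sequence, repeat):
    """Return the stretches lying between consecutive exact copies of `repeat`.

    The text before the first repeat (leader) and after the last repeat
    (trailer) is discarded, since it is not flanked by repeats on both sides.
    """
    parts = sequence.split(repeat)
    return [part for part in parts[1:-1] if part]


# Hypothetical toy data: a 5-nt "repeat" and two invented spacers.
toy_repeat = "GTTCC"
toy_locus = "AAA" + toy_repeat + "ACGTACGT" + toy_repeat + "TTGGCCAA" + toy_repeat + "CCC"

for spacer in extract_spacers(toy_locus, toy_repeat):
    print(spacer)
```

    Each extracted spacer could then be submitted to a similarity search such as BLASTN, as the abstract describes.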

  11. Genetic differentiation and origin of the Jordanian population: an analysis of Alu insertion polymorphisms. (United States)

    Bahri, Raoudha; El Moncer, Wifak; Al-Batayneh, Khalid; Sadiq, May; Esteban, Esther; Moral, Pedro; Chaabani, Hassen


    Although much of Jordan is covered by desert, its north-western region forms part of the Fertile Crescent, which has given Jordanians a rich past. This past, scarcely described by historians, is not yet clarified by sufficient genetic data. Thus in this paper we aim to determine the genetic differentiation of the Jordanian population and to discuss its origin. A total of 150 unrelated healthy Jordanians were investigated for ten Alu insertion polymorphisms. Genetic relationships among populations were estimated by a principal component (PC) plot based on analyses with the R-matrix software. Statistical analysis showed that the Jordanian population is not significantly different from the United Arab Emirates population or the North Africans. This observation, well represented in the PC plot, suggests a common origin of these populations belonging respectively to ancient Mesopotamia, Arabia, and North Africa. Our results are compatible with ancient peoples' movements from Arabia to ancient Mesopotamia and North Africa as proposed by historians and supported by previous genetic results. The original genetic profile of the Jordanian population, very likely Arabian Semitic, has not been subject to significant change despite the succession of several civilizations.

  12. Genome survey sequencing and genetic background characterization of Gracilariopsis lemaneiformis (Rhodophyta) based on next-generation sequencing. (United States)

    Zhou, Wei; Hu, Yiyi; Sui, Zhenghong; Fu, Feng; Wang, Jinguo; Chang, Lianpeng; Guo, Weihua; Li, Binbin


    Gracilariopsis lemaneiformis has a high economic value and is one of the most important aquaculture species in China. Despite its economic importance, it has remained largely unstudied at the genomic level. In this study, we conducted a genome survey of Gp. lemaneiformis using next-generation sequencing (NGS) technologies. In total, 18.70 Gb of high-quality sequence data with an estimated genome size of 97 Mb were obtained by HiSeq 2000 sequencing for Gp. lemaneiformis. These reads were assembled into 160,390 contigs with an N50 length of 3.64 kb, which were further assembled into 125,685 scaffolds with a total length of 81.17 Mb. Genome analysis predicted 3490 genes and a GC content of 48%. The identified genes have an average transcript length of 1,429 bp, an average coding sequence size of 1,369 bp, 1.36 exons per gene, exon length of 1,008 bp, and intron length of 191 bp. From the initial assembled scaffold, transposable elements constituted 54.64% (44.35 Mb) of the genome, and 7737 simple sequence repeats (SSRs) were identified. Among these SSRs, the trinucleotide repeat type was the most abundant (up to 73.20% of total SSRs), followed by the di- (17.41%), tetra- (5.49%), hexa- (2.90%), and penta- (1.00%) nucleotide repeat type. These characteristics suggest that Gp. lemaneiformis is a model organism for genetic study. This is the first report of genome-wide characterization within this taxon.
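
    The N50 statistic quoted for the contigs can be illustrated with a short Python sketch. It uses the common definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly size); the abstract does not say which tool or exact convention the authors used, and the contig lengths below are toy values, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the cumulative sum first reaches half the total length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one contig length")


# Toy contig lengths (arbitrary units), chosen only for illustration.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

    A higher N50 means that more of the assembly sits in long contigs, which is why it is reported alongside the contig count.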

  13. Genome Survey Sequencing and Genetic Background Characterization of Gracilariopsis lemaneiformis (Rhodophyta) Based on Next-Generation Sequencing (United States)

    Sui, Zhenghong; Fu, Feng; Wang, Jinguo; Chang, Lianpeng; Guo, Weihua; Li, Binbin


    Gracilariopsis lemaneiformis has a high economic value and is one of the most important aquaculture species in China. Despite its economic importance, it has remained largely unstudied at the genomic level. In this study, we conducted a genome survey of Gp. lemaneiformis using next-generation sequencing (NGS) technologies. In total, 18.70 Gb of high-quality sequence data with an estimated genome size of 97 Mb were obtained by HiSeq 2000 sequencing for Gp. lemaneiformis. These reads were assembled into 160,390 contigs with an N50 length of 3.64 kb, which were further assembled into 125,685 scaffolds with a total length of 81.17 Mb. Genome analysis predicted 3490 genes and a GC content of 48%. The identified genes have an average transcript length of 1,429 bp, an average coding sequence size of 1,369 bp, 1.36 exons per gene, exon length of 1,008 bp, and intron length of 191 bp. From the initial assembled scaffold, transposable elements constituted 54.64% (44.35 Mb) of the genome, and 7737 simple sequence repeats (SSRs) were identified. Among these SSRs, the trinucleotide repeat type was the most abundant (up to 73.20% of total SSRs), followed by the di- (17.41%), tetra- (5.49%), hexa- (2.90%), and penta- (1.00%) nucleotide repeat type. These characteristics suggest that Gp. lemaneiformis is a model organism for genetic study. This is the first report of genome-wide characterization within this taxon. PMID:23875008

  14. Location analysis for the estrogen receptor-α reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements (United States)

    Mason, Christopher E.; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M.; Kallen, Roland G.; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B.


    Location analysis for estrogen receptor-α (ERα)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERα-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: ERE sequence. We demonstrate that ∼50% of all ERα-bound loci do not have a discernable ERE and show that most ERα-bound EREs are not perfect consensus EREs. Approximately one-third of all ERα-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERα-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERα binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers. PMID:20047966

  15. Location analysis for the estrogen receptor-alpha reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements. (United States)

    Mason, Christopher E; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M; Kallen, Roland G; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B


    Location analysis for estrogen receptor-alpha (ERalpha)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERalpha-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: ERE sequence. We demonstrate that approximately 50% of all ERalpha-bound loci do not have a discernable ERE and show that most ERalpha-bound EREs are not perfect consensus EREs. Approximately one-third of all ERalpha-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERalpha-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERalpha binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers.

  16. Large scale analysis of small repeats via mining of the human genome

    NARCIS (Netherlands)

    van den Berg, I.; Bosnacki, D.; Hilbers, P.A.J.


    Small repetitive sequences, called tandem repeats, are abundant throughout the human genome, both in coding and in non-coding regions. Their role is still mostly unknown, but at least 20 of those repetitive sequences have been related to neurodegenerative disorders. The mutational process that is

  17. Genetic diversity among Puccinia melanocephala isolates from Brazil assessed using simple sequence repeat markers. (United States)

    Peixoto-Junior, R F; Creste, S; Landell, M G A; Nunes, D S; Sanguino, A; Campos, M F; Vencovsky, R; Tambarussi, E V; Figueira, A


    Brown rust (causal agent Puccinia melanocephala) is an important sugarcane disease that is responsible for large losses in yield worldwide. Despite its importance, little is known regarding the genetic diversity of this pathogen in the main Brazilian sugarcane cultivation areas. In this study, we characterized the genetic diversity of 34 P. melanocephala isolates from 4 Brazilian states using loci identified from an enriched simple sequence repeat (SSR) library. The aggressiveness of 3 isolates from major sugarcane cultivation areas was evaluated by inoculating an intermediately resistant and a susceptible cultivar. From the enriched library, 16 SSR-specific primers were developed, which produced scorable alleles. Of these, 4 loci were polymorphic and 12 were monomorphic for all isolates evaluated. The molecular characterization of the 34 isolates of P. melanocephala conducted using 16 SSR loci revealed the existence of low genetic variability among the isolates. The average estimated genetic distance was 0.12. Phenetic analysis based on Nei's genetic distance clustered the isolates into 2 major groups. Groups I and II included 18 and 14 isolates, respectively, and both groups contained isolates from all 4 geographic regions studied. Two isolates did not cluster with these groups. It was not possible to obtain clusters according to location or state of origin. Analysis of disease severity data revealed that the isolates did not show significant differences in aggressiveness between regions.

  18. Massively parallel sequencing of forensic STRs

    DEFF Research Database (Denmark)

    Parson, Walther; Ballard, David; Budowle, Bruce


    The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that...

  19. Characterisation of peacock (Pavo cristatus) mitochondrial 12S rRNA sequence and its use in differentiation from closely related poultry species. (United States)

    Saini, M; Das, D K; Dhara, A; Swarup, D; Yadav, M P; Gupta, P K


    1. Poaching of peacocks, the national bird of India, is illegal. People kill this beautiful pheasant bird for tail feathers and mix the meat with chicken or turkey. Differentiation of the meat of these species is essential in order to address the ambiguity about the origin of the sample. 2. The present study was carried out to investigate the use of polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) of mitochondrial 12S rRNA gene for identification of these species. 3. Peacock mitochondrial 12S rRNA partial gene was amplified using universal primers, cloned and characterised. It was found to be 446 nucleotides long. 4. Sequence analysis revealed 86.8 and 84.1% similarity with reported turkey and chicken sequences, respectively. Sequence and phylogenetic analysis showed that the peacock is much closer to the turkey than the chicken. 5. PCR-RFLP of 446 bp amplicon using commonly available restriction enzymes AluI and Sau3AI produced a differential pattern for identifying these poultry species unambiguously.
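
    The PCR-RFLP idea in point 5 can be sketched as an in-silico digest in Python. The recognition sites used are the standard ones for these enzymes (AluI cuts AG^CT, Sau3AI cuts ^GATC); the amplicon below is a made-up 20-nt toy sequence, not the 446 bp peacock fragment, and the function is an illustration rather than the authors' workflow.

```python
# Recognition site and cut offset (from the start of the site, top strand).
ENZYMES = {
    "AluI": ("AGCT", 2),    # AG^CT, blunt cut
    "Sau3AI": ("GATC", 0),  # ^GATC, cut before the site
}


def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Skip zero-length pieces (a cut exactly at either end of the sequence).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical toy amplicon containing one AluI and one Sau3AI site.
toy_amplicon = "TTTTAGCTCCCCGATCAAAA"

for name, (site, offset) in ENZYMES.items():
    print(name, digest(toy_amplicon, site, offset))
```

    Species whose amplicons differ in the position or number of these sites give different fragment-length patterns, which is the basis of the differentiation described in the abstract.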

  20. Analysis of CR1 Repeats in the Zebra Finch Genome

    Directory of Open Access Journals (Sweden)

    George E. Liu


    Full Text Available Most bird species have smaller genomes and fewer repeats than mammals. Chicken Repeat 1 (CR1) repeat is one of the most abundant families of repeats, ranging from ~133,000 to ~187,000 copies, accounting for ~50 to ~80% of the interspersed repeats in the zebra finch and chicken genomes, respectively. CR1 repeats are believed to have arisen from the retrotransposition of a small number of master elements, which gave rise to multiple CR1 subfamilies in the chicken. In this study, we performed a global assessment of the divergence distributions, phylogenies, and consensus sequences of CR1 repeats in the zebra finch genome. We identified and validated 34 CR1 subfamilies and further analyzed the correlation between these subfamilies. We also discovered 4 novel lineage-specific CR1 subfamilies in the zebra finch when compared to the chicken genome. We built various evolutionary trees of these subfamilies and concluded that CR1 repeats may play an important role in reshaping the structure of bird genomes.

  1. Variable number of tandem repeat markers in the genome sequence of Mycosphaerella fijiensis, the causal agent of black leaf streak disease of banana (Musa spp). (United States)

    Garcia, S A L; Van der Lee, T A J; Ferreira, C F; Te Lintel Hekkert, B; Zapater, M-F; Goodwin, S B; Guzmán, M; Kema, G H J; Souza, M T


    We searched the genome of Mycosphaerella fijiensis for molecular markers that would allow population genetics analysis of this plant pathogen. M. fijiensis, the causal agent of banana leaf streak disease, also known as black Sigatoka, is the most devastating pathogen attacking bananas (Musa spp). Recently, the entire genome sequence of M. fijiensis became available. We screened this database for VNTR markers. Forty-two primer pairs were selected for validation, based on repeat type and length and the number of repeat units. Five VNTR markers showing multiple alleles were validated with a reference set of isolates from different parts of the world and a population from a banana plantation in Costa Rica. Polymorphism information content values varied from 0.6414 to 0.7544 for the reference set and from 0.0400 to 0.7373 for the population set. Eighty percent of the polymorphism information content values were above 0.60, indicating that the markers are highly informative. These markers allowed robust scoring of agarose gels and proved to be useful for variability and population genetics studies. In conclusion, the strategy we developed to identify and validate VNTR markers is an efficient means to incorporate markers that can be used for fungicide resistance management and to develop breeding strategies to control banana black leaf streak disease. This is the first report of VNTR-minisatellites from the M. fijiensis genome sequence.
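
    For readers unfamiliar with polymorphism information content (PIC), the Python sketch below computes it from allele frequencies using the widely used Botstein et al. (1980) formula, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2. Whether the authors used exactly this formula is an assumption, and the allele frequencies are toy values rather than data from the study.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    `allele_freqs` is a sequence of allele frequencies that sum to 1.
    """
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(
        2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - cross_term


# Toy examples: a monomorphic marker and a marker with two equally common alleles.
print(pic([1.0]))
print(pic([0.5, 0.5]))
```

    PIC rises with the number of alleles and with how evenly they are distributed, so markers with several alleles at similar frequencies score highest.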

  2. Characterization of the past and current duplication activities in the human 22q11.2 region

    Directory of Open Access Journals (Sweden)

    Morrow Bernice


    Full Text Available Abstract. Background: Segmental duplications (SDs) on 22q11.2 (LCR22) serve as substrates for meiotic non-allelic homologous recombination (NAHR) events resulting in several clinically significant genomic disorders. Results: To understand the duplication activity leading to the complicated SD structure of this region, we have applied the A-Bruijn graph algorithm to decompose the 22q11.2 SDs to 523 fundamental duplication sequences, termed subunits. Cross-species syntenic analysis of primate genomes demonstrates that many of these LCR22 subunits emerged very recently, especially those implicated in human genomic disorders. Some subunits have expanded more actively than others, and young Alu SINEs are associated much more frequently with duplicated sequences that have undergone active expansion, confirming their role in mediating recombination events. Many copy number variations (CNVs) exist on 22q11.2, some flanked by SDs. Interestingly, two chromosome breakpoints for 13 CNVs (mean length 65 kb) are located in paralogous subunits, providing direct evidence that SD subunits could contribute to CNV formation. Sequence analysis of PACs or BACs identified extra CNVs, specifically, 10 insertions and 18 deletions within 22q11.2; four were more than 10 kb in size and most contained young AluYs at their breakpoints. Conclusions: Our study indicates that AluYs are implicated in the past and current duplication events, and moreover suggests that DNA rearrangements in 22q11.2 genomic disorders perhaps do not occur randomly but involve both actively expanded duplication subunits and Alu elements.

  3. Simple Sequence Repeat Analysis of Selected NSIC-registered Coffee Varieties in the Philippines

    Directory of Open Access Journals (Sweden)

    Daisy May C. Santos


    Full Text Available Coffee (Coffea sp.) is an important commercial crop worldwide. Three species of coffee are used as beverage, namely Coffea arabica, C. canephora, and C. liberica. Coffea arabica L. is the most cultivated among the three coffee species due to its taste quality, rich aroma, and low caffeine content. Despite its inferior taste and aroma, C. canephora Pierre ex A. Froehner, which has the highest caffeine content, is the second most widely cultivated because of its resistance to coffee diseases. On the other hand, C. liberica W.Bull ex Hiern is characterized by its very strong taste and flavor. The Philippines used to be a leading exporter of coffee until coffee rust destroyed the farms in Batangas, home of the famous Kapeng Barako. The country has been attempting to revive the coffee industry by focusing on the production of specialty coffee with varieties registered with the National Seed Industry Council (NSIC). Correct identification and isolation of pure coffee beans are the main factors that determine coffee's market value. Local farms usually misidentify and mix coffee beans of different varieties, leading to the depreciation of their value. This study used simple sequence repeat (SSR) markers to evaluate and distinguish Philippine NSIC-registered coffee species and varieties. The neighbor-joining tree generated using PAUP showed high bootstrap support, separating C. arabica, C. canephora, and C. liberica from each other. Among the twenty primer pairs used, seven were able to distinguish C. arabica, nine C. liberica, and one C. canephora.

  4. In situ optical sequencing and structure analysis of a trinucleotide repeat genome region by localization microscopy after specific COMBO-FISH nano-probing (United States)

    Stuhlmüller, M.; Schwarz-Finsterle, J.; Fey, E.; Lux, J.; Bach, M.; Cremer, C.; Hinderhofer, K.; Hausmann, M.; Hildenbrand, G.


    Trinucleotide repeat expansions (like (CGG)n) of chromatin in the genome of cell nuclei can cause neurological disorders such as Fragile-X syndrome. Until now the mechanisms by which these expansions develop during cell proliferation have not been clearly understood. Therefore in situ investigations of chromatin structures on the nanoscale are required to better understand supra-molecular mechanisms on the single cell level. By super-resolution localization microscopy (Spectral Position Determination Microscopy; SPDM) in combination with nano-probing using COMBO-FISH (COMBinatorial Oligonucleotide FISH), novel insights into the nano-architecture of the genome will become possible. The native spatial structure of trinucleotide repeat expansion genome regions was analysed and optical sequencing of repetitive units was performed within 3D-conserved nuclei using SPDM after COMBO-FISH. We analysed a (CGG)n-expansion region inside the 5' untranslated region of the FMR1 gene. The number of CGG repeats for a full mutation causing the Fragile-X syndrome was found and also verified by Southern blot. The FMR1 promoter region was condensed similarly to a centromeric region, whereas the arrangement of the probes labelling the expansion region seemed to indicate a loop-like nano-structure. These results for the first time demonstrate that in situ chromatin structure measurements on the nanoscale are feasible. With further methodological progress it will become possible to estimate the state of trinucleotide repeat mutations in detail and to determine the associated chromatin strand structural changes on the single cell level. In general, the application of the described approach to any genome region will lead to new insights into genome nano-architecture and open new avenues for understanding mechanisms and their relevance in the development of hereditary diseases.

  5. Genetic Diversity Assessment and Identification of New Sour Cherry Genotypes Using Intersimple Sequence Repeat Markers

    Directory of Open Access Journals (Sweden)

    Roghayeh Najafzadeh


    Full Text Available Iran is one of the chief origins of subgenus Cerasus germplasm. In this study, the genetic variation of new Iranian sour cherries (which had such superior growth characteristics and fruit quality as to be considered for the introduction of new cultivars) was investigated and identified using 23 intersimple sequence repeat (ISSR) markers. Results indicated a high level of polymorphism of the genotypes based on these markers. According to these results, the primers tested in this study, especially ISSR-4, ISSR-6, ISSR-13, ISSR-14, ISSR-16, and ISSR-19, produced good and varied levels of amplification and can be effectively used in genetic studies of the sour cherry. The genetic similarity analysis showed a high diversity among the genotypes. Cluster analysis separated improved cultivars from promising Iranian genotypes, and the PCoA supported the cluster analysis results. Since the Iranian genotypes were superior to the improved cultivars and were separated from them in most groups, these genotypes can be considered as distinct genotypes for further evaluations in the framework of breeding programs and new cultivar identification in cherries. Results also confirmed that ISSR is a reliable DNA marker that can be used for exact genetic studies and in sour cherry breeding programs.

  6. Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV). (United States)

    Martin, Andrew C R


    The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and 'dotifying' repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from

  7. Comparison of the carboxy-terminal DP-repeat region in the co-chaperones Hop and Hip. (United States)

    Nelson, Gregory M; Huffman, Holly; Smith, David F


    Functional steroid receptor complexes are assembled and maintained by an ordered pathway of interactions involving multiple components of the cellular chaperone machinery. Two of these components, Hop and Hip, serve as co-chaperones to the major heat shock proteins (Hsps), Hsp70 and Hsp90, and participate in intermediate stages of receptor assembly. In an effort to better understand the functions of Hop and Hip in the assembly process, we focused on a region of similarity located near the C-terminus of each co-chaperone. Contained within this region is a repeated sequence motif we have termed the DP repeat. Earlier mutagenesis studies implicated the DP repeat of either Hop or Hip in Hsp70 binding and in normal assembly of the co-chaperones with progesterone receptor (PR) complexes. We report here that the DP repeat lies within a protease-resistant domain that extends to or is near the C-terminus of both co-chaperones. Point mutations in the DP repeats render the C-terminal regions hypersensitive to proteolysis. In addition, a Hop DP mutant displays altered proteolytic digestion patterns, which suggest that the DP-repeat region influences the folding of other Hop domains. Although the respective DP regions of Hop and Hip share sequence and structural similarities, they are not functionally interchangeable. Moreover, a double-point mutation within the second DP-repeat unit of Hop that converts this to the sequence found in Hip disrupts Hop function; however, the corresponding mutation in Hip does not alter its function. We conclude that the DP repeats are important structural elements within a C-terminal domain, which is important for Hop and Hip function.

  8. Cell type-specific termination of transcription by transposable element sequences

    Directory of Open Access Journals (Sweden)

    Conley Andrew B


    Full Text Available Abstract Background Transposable elements (TEs) encode sequences necessary for their own transposition, including signals required for the termination of transcription. TE sequences within the introns of human genes show an antisense orientation bias, which has been proposed to reflect selection against TE sequences in the sense orientation owing to their ability to terminate the transcription of host gene transcripts. While there is evidence in support of this model for some elements, the extent to which TE sequences actually terminate transcription of human genes across the genome remains an open question. Results Using high-throughput sequencing data, we have characterized over 9,000 distinct TE-derived sequences that provide transcription termination sites for 5,747 human genes across eight different cell types. Rarefaction curve analysis suggests that there may be twice as many TE-derived termination sites (TE-TTS) genome-wide among all human cell types. The local chromatin environment for these TE-TTS is similar to that seen for 3′ UTR canonical TTS and distinct from the chromatin environment of other intragenic TE sequences. However, those TE-TTS located within the introns of human genes were found to be far more cell type-specific than the canonical TTS. TE-TTS were much more likely to be found in the sense orientation than other intragenic TE sequences of the same TE family, and TE-TTS in the sense orientation terminate transcription more efficiently than those found in the antisense orientation. Alu sequences were found to provide a large number of relatively weak TTS, whereas LTR elements provided a smaller number of much stronger TTS. Conclusions TE sequences provide numerous termination sites to human genes, and TE-derived TTS are particularly cell type-specific. Thus, TE sequences provide a powerful mechanism for the diversification of transcriptional profiles between cell types and among evolutionary lineages, since most TE-TTS are

  9. ST proteins, a new family of plant tandem repeat proteins with a DUF2775 domain mainly found in Fabaceae and Asteraceae. (United States)

    Albornos, Lucía; Martín, Ignacio; Iglesias, Rebeca; Jiménez, Teresa; Labrador, Emilia; Dopico, Berta


    Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: (I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. (II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. (III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences group into clades according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants. They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40

  10. Capillary electrophoresis fragment analysis and clone sequencing in detection of dynamic mutations of spinocerebellar ataxia

    Directory of Open Access Journals (Sweden)

    Yuan-yuan CHEN


    Full Text Available Objective To estimate the accuracy and stability of capillary electrophoresis fragment analysis and clone sequencing in detecting dynamic mutations of spinocerebellar ataxia (SCA). Methods Capillary electrophoresis fragment analysis and clone sequencing were used to detect trinucleotide repeat sequences in 14 SCA patients (3 cases of SCA2, 2 cases of SCA7, 7 cases of SCA8 and 2 cases of SCA17). Results Capillary electrophoresis fragment analysis of 3 SCA2 cases showed the expanded cytosine-adenine-guanine (CAG) repeats were 31, 30 and 32, and the copy numbers from clone sequencing of 3 colonies in each case were 37/40/40, 37/38/39 and 38/39/40 respectively. Capillary electrophoresis fragment analysis of 2 SCA7 cases showed the expanded CAG repeats were 57 and 34, and the copy numbers of repeats were 69, 74, 75 in 3 colonies of one case, and 45 in the other case. For the 7 SCA8 cases with the expanded cytosine-thymine-adenine (CTA)/cytosine-thymine-guanine (CTG) repeats of 99, 111, 104, 92, 89, 104 and 75, the results of clone sequencing were 97, 116, 104, 90, 90, 102 and 76 respectively. For 2 SCA17 cases with the short/expanded CAG repeats of 37/50 and 36/45, the results of clone sequencing were 51/50/52 and 45/44 for 3 and 2 colonies. Conclusions Although the higher mobility of polymerase chain reaction (PCR) products containing dynamic mutation in the capillary electrophoresis fragment analysis might cause deviation in the analysis of copy numbers, the deviation was predictable and the results were repeatable. The clone sequencing results showed obvious instability, especially for SCA2 and SCA7 genes, which might be owing to their simple CAG repeats. Consequently, clone sequencing is not suited for detection of dynamic mutation, not to mention the quantitative criteria of dynamic mutation sequencing. DOI: 10.3969/j.issn.1672-6731.2018.03.008

  11. Evaluation of genetic diversity amongst Descurainia sophia L. genotypes by inter-simple sequence repeat (ISSR) marker. (United States)

    Saki, Sahar; Bagheri, Hedayat; Deljou, Ali; Zeinalabedini, Mehrshad


    Descurainia sophia is a valuable medicinal plant in the family Brassicaceae. To determine the range of diversity amongst D. sophia in Iran, 32 naturally distributed plants belonging to six natural populations of the Iranian plateau were investigated by inter-simple sequence repeat (ISSR) markers. The average percentage of polymorphism produced by 12 ISSR primers was 86 %. The PIC values for primers ranged from 0.22 to 0.40 and Rp values ranged between 6.5 and 19.9. The relative genetic diversity of the populations was not high (Gst = 0.32). However, the value of gene flow revealed by the ISSR marker was high (Nm = 1.03). The UPGMA clustering method based on the Jaccard similarity coefficient grouped the genotypes into two major clusters. Graph results from a Neighbor-Net network generated after a 1000-replicate bootstrap test using the Jaccard coefficient, and STRUCTURE analysis, confirmed the UPGMA clustering. The first three PCAs represented 57.31 % of the total variation. High levels of genetic diversity were observed within populations, which is useful in breeding and conservation programs. ISSR was found to be a suitable marker to study the genetic diversity of D. sophia.
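
    The Jaccard similarity coefficient mentioned in this record can be illustrated with a minimal Python sketch. It assumes each genotype is scored as a presence/absence (1/0) vector of ISSR bands; the function name and the two band profiles are hypothetical and are not data from the study.

```python
def jaccard_similarity(a, b):
    """Jaccard coefficient between two 0/1 band profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same band positions")
    # Bands present in both genotypes.
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    # Bands present in at least one genotype (joint absences are ignored).
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0


# Hypothetical band profiles (1 = band present, 0 = band absent).
genotype_a = [1, 1, 0, 1, 0, 1]
genotype_b = [1, 0, 0, 1, 1, 1]
similarity = jaccard_similarity(genotype_a, genotype_b)
```

    A matrix of such pairwise values (or the distances 1 - similarity) is the usual input to a clustering method such as UPGMA.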

  12. A novel family of sequence-specific endoribonucleases associated with the clustered regularly interspaced short palindromic repeats. (United States)

    Beloglazova, Natalia; Brown, Greg; Zimmerman, Matthew D; Proudfoot, Michael; Makarova, Kira S; Kudritska, Marina; Kochinyan, Samvel; Wang, Shuren; Chruszcz, Maksymilian; Minor, Wladek; Koonin, Eugene V; Edwards, Aled M; Savchenko, Alexei; Yakunin, Alexander F


    Clustered regularly interspaced short palindromic repeats (CRISPRs) together with the associated CAS proteins protect microbial cells from invasion by foreign genetic elements using presently unknown molecular mechanisms. All CRISPR systems contain proteins of the CAS2 family, suggesting that these uncharacterized proteins play a central role in this process. Here we show that the CAS2 proteins represent a novel family of endoribonucleases. Six purified CAS2 proteins from diverse organisms cleaved single-stranded RNAs preferentially within U-rich regions. A representative CAS2 enzyme, SSO1404 from Sulfolobus solfataricus, cleaved the phosphodiester linkage on the 3'-side and generated 5'-phosphate- and 3'-hydroxyl-terminated oligonucleotides. The crystal structure of SSO1404 was solved at 1.6A resolution revealing the first ribonuclease with a ferredoxin-like fold. Mutagenesis of SSO1404 identified six residues (Tyr-9, Asp-10, Arg-17, Arg-19, Arg-31, and Phe-37) that are important for enzymatic activity and suggested that Asp-10 might be the principal catalytic residue. Thus, CAS2 proteins are sequence-specific endoribonucleases, and we propose that their role in the CRISPR-mediated anti-phage defense might involve degradation of phage or cellular mRNAs.

  13. TRStalker: an efficient heuristic for finding fuzzy tandem repeats. (United States)

    Pellegrini, Marco; Renda, M Elena; Vecchio, Alessio


    Genomes in higher eukaryotic organisms contain a substantial amount of repeated sequences. Tandem Repeats (TRs) constitute a large class of repetitive sequences that are originated via phenomena such as replication slippage and are characterized by close spatial contiguity. They play an important role in several molecular regulatory mechanisms, and also in several diseases (e.g. in the group of trinucleotide repeat disorders). While for TRs with a low or medium level of divergence the current methods are rather effective, the problem of detecting TRs with higher divergence (fuzzy TRs) is still open. The detection of fuzzy TRs is propaedeutic to enriching our view of their role in regulatory mechanisms and diseases. Fuzzy TRs are also important as tools to shed light on the evolutionary history of the genome, where higher divergence correlates with more remote duplication events. We have developed an algorithm (christened TRStalker) with the aim of detecting efficiently TRs that are hard to detect because of their inherent fuzziness, due to high levels of base substitutions, insertions and deletions. To attain this goal, we developed heuristics to solve a Steiner version of the problem for which the fuzziness is measured with respect to a motif string not necessarily present in the input string. This problem is akin to the 'generalized median string' that is known to be an NP-hard problem. Experiments with both synthetic and biological sequences demonstrate that our method performs better than current state of the art for fuzzy TRs and that the fuzzy TRs of the type we detect are indeed present in important biological sequences. TRStalker will be integrated in the web-based TRs Discovery Service (TReaDS) at Supplementary data are available at Bioinformatics online.
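
    For contrast with the fuzzy case described in this record, the following Python sketch finds only exact tandem repeats by brute force. It is not the TRStalker algorithm, which targets divergent repeats; the function name, parameters and test string are assumptions made for illustration.

```python
def find_exact_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for runs of exactly repeated units."""
    hits = []
    n = len(seq)
    i = 0
    while i < n:
        best = None
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # not enough sequence left for a unit of this size
            copies = 1
            # Count how many times the unit repeats back to back.
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            span = copies * k
            if copies >= min_copies and (best is None or span > best[2] * len(best[1])):
                best = (i, unit, copies)
        if best:
            hits.append(best)
            i += len(best[1]) * best[2]  # skip past the reported run
        else:
            i += 1
    return hits


# A CAG run flanked by non-repetitive bases (hypothetical input).
repeats = find_exact_tandem_repeats("TTCAGCAGCAGCAGTT", min_unit=2)
```

    A single substitution inside the run would split or hide the repeat for this exact matcher, which is the limitation that heuristics for fuzzy repeats are designed to address.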

  14. Sequence diversities of serine-aspartate repeat genes among Staphylococcus aureus isolates from different hosts presumably by horizontal gene transfer.

    Directory of Open Access Journals (Sweden)

    Huping Xue

    Full Text Available BACKGROUND: Horizontal gene transfer (HGT) is recognized as one of the major forces for bacterial genome evolution. Many clinically important bacteria may acquire virulence factors and antibiotic resistance through HGT. The comparative genomic analysis has become an important tool for identifying HGT in emerging pathogens. In this study, the Serine-Aspartate Repeat (Sdr) family has been compared among different sources of Staphylococcus aureus (S. aureus) to discover sequence diversities within their genomes. METHODOLOGY/PRINCIPAL FINDINGS: Four sdr genes were analyzed for 21 different S. aureus strains and 218 mastitis-associated S. aureus isolates from Canada. Comparative genomic analyses revealed that S. aureus strains from bovine mastitis (RF122 and mastitis isolates in this study), ovine mastitis (ED133), pig (ST398), chicken (ED98), and human methicillin-resistant S. aureus (MRSA) (TCH130, MRSA252, Mu3, Mu50, N315, 04-02981, JH1 and JH9) were highly associated with one another, presumably due to HGT. In addition, several types of insertion and deletion were found in sdr genes of many isolates. A new insertion sequence was found in mastitis isolates, which was presumably responsible for the HGT of sdrC gene among different strains. Moreover, the sdr genes could be used to type S. aureus. Regional difference of sdr genes distribution was also indicated among the tested S. aureus isolates. Finally, certain associations were found between sdr genes and subclinical or clinical mastitis isolates. CONCLUSIONS: Certain sdr gene sequences were shared in S. aureus strains and isolates from different species presumably due to HGT. Our results also suggest that the distributional assay of virulence factors should detect the full sequences or full functional regions of these factors. The traditional assay using short conserved regions may not be accurate or credible. These findings have important implications with regard to animal husbandry practices that may

  15. The mitochondrial genome of the legume Vigna radiata and the analysis of recombination across short mitochondrial repeats.

    Directory of Open Access Journals (Sweden)

    Andrew J Alverson


    Full Text Available The mitochondrial genomes of seed plants are exceptionally fluid in size, structure, and sequence content, with the accumulation and activity of repetitive sequences underlying much of this variation. We report the first fully sequenced mitochondrial genome of a legume, Vigna radiata (mung bean), and show that despite its unexceptional size (401,262 nt), the genome is unusually depauperate in repetitive DNA and "promiscuous" sequences from the chloroplast and nuclear genomes. Although Vigna lacks the large, recombinationally active repeats typical of most other seed plants, a PCR survey of its modest repertoire of short (38-297 nt) repeats nevertheless revealed evidence for recombination across all of them. A set of novel control assays showed, however, that these results could instead reflect, in part or entirely, artifacts of PCR-mediated recombination. Consequently, we recommend that other methods, especially high-depth genome sequencing, be used instead of PCR to infer patterns of plant mitochondrial recombination. The average-sized but repeat- and feature-poor mitochondrial genome of Vigna makes it ever more difficult to generalize about the factors shaping the size and sequence content of plant mitochondrial genomes.

  16. Transferability of simple sequence repeat (SSR) markers developed in guava (Psidium guajava L.) to four Myrtaceae species. (United States)

    Rai, Manoj K; Phulwaria, Mahendra; Shekhawat, N S


    The present study demonstrated the cross-genera transferability of 23 simple sequence repeat (SSR) primer pairs developed for guava (Psidium guajava L.) to four new targets, two species of eucalypts (Eucalyptus citriodora, Eucalyptus camaldulensis), bottlebrush (Callistemon lanceolatus) and clove (Syzygium aromaticum), belonging to the family Myrtaceae and subfamily Myrtoideae. Of the 23 SSR loci assayed, 18 (78.2%) gave cross-amplification in E. citriodora, 14 (60.8%) in E. camaldulensis and 17 each (73.9%) in C. lanceolatus and S. aromaticum. Eight primer pairs were found to be transferable to all four species. The number of alleles detected at each locus ranged from one to nine, with an average of 4.8, 2.6, 4.5 and 4.6 alleles in E. citriodora, E. camaldulensis, C. lanceolatus and S. aromaticum, respectively. The high levels of cross-genera transferability of guava SSRs may be applicable for the analysis of intra- and inter-specific genetic diversity of target species, especially in E. citriodora, C. lanceolatus and S. aromaticum, for which to date no information about EST-derived or genomic SSRs is available.

  17. Plasmid P1 replication: negative control by repeated DNA sequences.


    Chattoraj, D; Cordes, K; Abeles, A


    The incompatibility locus, incA, of the unit-copy plasmid P1 is contained within a fragment that is essentially a set of nine 19-base-pair repeats. One or more copies of the fragment destabilizes the plasmid when present in trans. Here we show that extra copies of incA interfere with plasmid DNA replication and that a deletion of most of incA increases plasmid copy number. Thus, incA is not essential for replication but is required for its control. When cloned in a high-copy-number vector, pi...

  18. Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain. (United States)

    de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas


    The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Characterization of sequence diversity in Plasmodium falciparum SERA5 from Indian isolates

    Directory of Open Access Journals (Sweden)

    Rahul C.N


    Full Text Available Objective: To characterize the sequence diversity of blood-stage Plasmodium falciparum serine repeat antigen-5 (PfSERA5), which is lacking in a malaria-endemic country like India. Methods: In this study, parasitic DNA was obtained from field isolates collected from various geographic regions. Subsequently, the PfSERA5 gene sequence was PCR amplified and DNA sequenced. Results: We reported the existence of unique repeat polymorphisms and novel haplotypes for both the octamer repeat (OR) and serine repeat (SR) regions of the N-terminal fragment of PfSERA5 from Indian isolates. Several isolates from India were identical to low-frequency African haplotypes. A unique finding of our study was an Indian isolate showing a deletion in a perfectly conserved 14-mer sequence within the octamer repeat. Indian haplotypes reported in this study were found to be distributed into the three earlier classified allelic clusters of FCR3, K1 and Honduras, showcasing broad diversity as compared to worldwide haplotypes. Conclusions: This study is the first report on genetic diversity of PfSERA5 antigen from India. Further evaluation of these haplotypes by serotyping would provide useful information for investigating variant-specific immunity and aid in malaria vaccine research.

  20. Genetic variation and DNA fingerprinting of durian types in Malaysia using simple sequence repeat (SSR) markers. (United States)

    Siew, Ging Yang; Ng, Wei Lun; Tan, Sheau Wei; Alitheen, Noorjahan Banu; Tan, Soon Guan; Yeap, Swee Keong


    Durian (Durio zibethinus) is one of the most popular tropical fruits in Asia. To date, 126 durian types have been registered with the Department of Agriculture in Malaysia based on phenotypic characteristics. Classification based on morphology is convenient, easy, and fast but it suffers from phenotypic plasticity as a direct result of environmental factors and age. To overcome the limitation of morphological classification, there is a need to carry out genetic characterization of the various durian types. Such data is important for the evaluation and management of durian genetic resources in producing countries. In this study, simple sequence repeat (SSR) markers were used to study the genetic variation in 27 durian types from the germplasm collection of Universiti Putra Malaysia. Based on DNA sequences deposited in GenBank, seven pairs of primers were successfully designed to amplify SSR regions in the durian DNA samples. High levels of variation among the 27 durian types were observed (expected heterozygosity, H_E = 0.35). The DNA fingerprinting power of SSR markers revealed by the combined probability of identity (PI) of all loci was 2.3×10^-3. Unique DNA fingerprints were generated for 21 out of 27 durian types using five polymorphic SSR markers (the other two SSR markers were monomorphic). We further tested the utility of these markers by evaluating the clonal status of shared durian types from different germplasm collection sites, and found that some were not clones. The findings in this preliminary study not only show the feasibility of using SSR markers for DNA fingerprinting of durian types, but also challenge the current classification of durian types, e.g., on whether the different types should be called "clones", "varieties", or "cultivars". Such matters have a direct impact on the regulation and management of durian genetic resources in the region.
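
    The two summary statistics quoted in this record can be sketched in Python from allele frequencies. The sketch assumes the common textbook forms, expected heterozygosity as 1 minus the sum of squared allele frequencies and a per-locus probability of identity under random mating multiplied across loci; the study may have used sample-size-corrected estimators, and the allele frequencies below are hypothetical.

```python
from itertools import combinations
from math import prod


def expected_heterozygosity(freqs):
    """H_E = 1 - sum(p_i^2) for one locus."""
    return 1.0 - sum(p * p for p in freqs)


def probability_of_identity(freqs):
    """Chance that two random individuals share a genotype at one locus."""
    homozygous = sum(p ** 4 for p in freqs)
    heterozygous = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygous + heterozygous


# Hypothetical allele frequencies for two loci (each list sums to 1).
loci = [[0.5, 0.5], [0.6, 0.3, 0.1]]
he_values = [expected_heterozygosity(f) for f in loci]
# Combined PI assumes the loci are independent.
combined_pi = prod(probability_of_identity(f) for f in loci)
```

    A monomorphic locus has a probability of identity of 1 and therefore adds no discriminating power to the combined value.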

  1. Two new miniature inverted-repeat transposable elements in the genome of the clam Donax trunculus. (United States)

    Šatović, Eva; Plohl, Miroslav


    Repetitive sequences are important components of eukaryotic genomes that drive their evolution. Among them are different types of mobile elements that share the ability to spread throughout the genome and form interspersed repeats. To broaden the generally scarce knowledge on bivalves at the genome level, in the clam Donax trunculus we described two new non-autonomous DNA transposons, miniature inverted-repeat transposable elements (MITEs), named DTC M1 and DTC M2. Like other MITEs, they are characterized by their small size, their A + T richness, and the presence of terminal inverted repeats (TIRs). DTC M1 and DTC M2 are 261 and 286 bp long, respectively, and in addition to TIRs, both of them contain a long imperfect palindrome sequence in their central parts. These elements are present in complete and truncated versions within the genome of the clam D. trunculus. The two new MITEs share only structural similarity, but lack any nucleotide sequence similarity to each other. In a search for related elements in databases, blast search revealed within the Crassostrea gigas genome a larger element sharing sequence similarity only to DTC M1 in its TIR sequences. The lack of sequence similarity with any previously published mobile elements indicates that DTC M1 and DTC M2 elements may be unique to D. trunculus.
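
    The terminal inverted repeats that characterize MITEs can be checked with a short Python sketch: the start of the element should equal the reverse complement of its end. The helper names and the example element are hypothetical, and only perfect TIRs are measured here, whereas real elements often carry mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string containing A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tir_length(element, max_len=30):
    """Length of the longest perfect terminal inverted repeat (0 if none)."""
    element = element.upper()
    limit = min(max_len, len(element) // 2)
    best = 0
    for k in range(1, limit + 1):
        # The first k bases must pair with the last k bases read backwards.
        if element[:k] == reverse_complement(element[-k:]):
            best = k
        else:
            break
    return best


# Hypothetical element: 7 bp TIRs around a short internal region.
element = "GGCCAGT" + "CTATCAAA" + "ACTGGCC"
length = tir_length(element)
```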

  2. Genetic characterization of autochthonous grapevine cultivars from Eastern Turkey by simple sequence repeats (SSRs)

    Directory of Open Access Journals (Sweden)

    Sadiye Peral Eyduran


    Full Text Available In this research, two well-recognized standard grape cultivars, Cabernet Sauvignon and Merlot, together with eight historical autochthonous grapevine cultivars from Eastern Anatolia in Turkey, were genetically characterized by using 12 pairs of simple sequence repeat (SSR) primers in order to evaluate their genetic diversity and relatedness. All of the SSR primers used produced successful amplifications and revealed DNA polymorphisms, which were subsequently utilized to evaluate the genetic relatedness of the grapevine cultivars. Allele richness was implied by the identification of 69 alleles in 8 autochthonous cultivars with a mean value of 5.75 alleles per locus. The average expected heterozygosity and observed heterozygosity were found to be 0.749 and 0.739, respectively. Taking into account the generated alleles, the highest number was recorded in VVC2C3 and VVS2 loci (nine and eight alleles per locus, respectively), whereas the lowest number was recorded in VrZAG83 (three alleles per locus). Two main clusters were produced by using the unweighted pair-group method with arithmetic mean dendrogram constructed on the basis of the SSR data. Only Cabernet Sauvignon and Merlot cultivars were included in the first cluster. The second cluster involved the rest of the autochthonous cultivars. The results obtained during the study illustrated clearly that SSR markers have proved to be an effective tool for fingerprinting grapevine cultivars and carrying out grapevine biodiversity studies. The obtained data are also meaningful references for grapevine domestication.

  3. On balanced minimal repeated measurements designs

    Directory of Open Access Journals (Sweden)

    Shakeel Ahmad Mir


    Full Text Available Repeated measurements designs are concerned with scientific experiments in which each experimental unit is assigned more than once to a treatment, either different or identical. This class of designs has the property that the unbiased estimators for elementary contrasts among direct and residual effects are obtainable. Afsarinejad (1983) provided a method of constructing balanced minimal repeated measurements designs with p < t when t is an odd or prime power, but one or more treatments may occur more than once in some sequences and the designs so constructed no longer remain uniform in periods. In this paper an attempt has been made to provide a new method to overcome this drawback. Specifically, two cases have been considered: RM[t, n = t(t-1)/(p-1), p], λ2 = 1 for balanced minimal repeated measurements designs and RM[t, n = 2t(t-1)/(p-1), p], λ2 = 2 for balanced repeated measurements designs. In addition, a method has been provided for constructing extra-balanced minimal designs for the special case RM[t, n = t^2/(p-1), p], λ2 = 1.
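
    The design sizes in this record can be tabulated with a small Python sketch. It assumes the usual counting identity for balanced repeated measurements designs, n(p-1) = λ2·t(t-1), with t^2 in place of t(t-1) in the extra-balanced case where a treatment may also follow itself. That reading of the record's notation is an assumption, and the function name and parameter values are hypothetical.

```python
def units_required(t, p, lam=1, extra_balanced=False):
    """Number of experimental units n for t treatments over p periods.

    Assumes n * (p - 1) = lam * t * (t - 1) for balanced designs and
    n * (p - 1) = lam * t * t for extra-balanced designs.
    """
    if p < 2:
        raise ValueError("at least two periods are needed")
    ordered_pairs = t * t if extra_balanced else t * (t - 1)
    numerator = lam * ordered_pairs
    if numerator % (p - 1) != 0:
        raise ValueError("no design of this size: n would not be an integer")
    return numerator // (p - 1)


n_minimal = units_required(t=7, p=4)                      # lam = 1
n_balanced = units_required(t=7, p=3, lam=2)              # lam = 2
n_extra = units_required(t=6, p=4, extra_balanced=True)   # lam = 1
```

    The integrality check only shows which parameter combinations are arithmetically possible; it does not construct the design or prove that one exists.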

  4. Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing. (United States)

    Hribová, Eva; Neumann, Pavel; Matsumoto, Takashi; Roux, Nicolas; Macas, Jirí; Dolezel, Jaroslav


    Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally and this makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers breeding improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of nuclear genome and the development of molecular tools to facilitate banana improvement. In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones. A low-depth 454 sequencing of banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project. It also provides sequence data for isolation of DNA markers to be used in genetic

  5. Using inter simple sequence repeat (ISSR) markers to study genetic ...

    African Journals Online (AJOL)



    Apr 10, 2012 ... Genetic relationships among the cultivars was assessed by using six inter simple sequence ... polymorphism breeders of this species in order to find the ..... well as the high level of heterozygosity due to the cross- pollinating ...

  6. Molecular detection and PCR-RFLP analysis using Pst1 and Alu1 of multidrug resistant Klebsiella pneumoniae causing urinary tract infection in women in the eastern part of Bangladesh

    Directory of Open Access Journals (Sweden)

    Golam Mahmudunnabi


    Full Text Available Klebsiella pneumoniae is the second leading causative agent of UTI. In this study, a rapid combined polymerase chain reaction and restriction fragment length polymorphism analysis was developed to identify K. pneumoniae in women with urinary tract infection in the city of Sylhet, Bangladesh. Eleven isolates from women aged 20–55 from three different hospitals were first analyzed by amplification with K. pneumoniae specific ITS primers. All of the 11 collected isolates were amplified in PCR and showed the expected 136 bp products. Then, restriction fragment length polymorphism analysis of the 11 isolates was conducted after PCR amplification with 16S rRNA universal primers, followed by digestion and incubation with two restriction enzymes, Pst1 and Alu1. Seven out of 11 isolates were digested by Pst1, six isolates were digested by Alu1, and the others were negative for both enzymes. The results show that isolates from women aged between 25 and 50 were digested by both enzymes. The isolate from a woman aged over 50 was negative, while the isolate from a woman below 20 was digested only by Pst1. The results could pave the way for further research in the detection of K. pneumoniae from UTI infected women. Keywords: Klebsiella pneumoniae, ITS-primer, MDR isolates, PCR-RFLP analysis
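
    The RFLP step in this record can be illustrated with a minimal in-silico digest in Python. It assumes the enzymes are the standard PstI (CTGCA^G) and AluI (AG^CT), written as Pst1 and Alu1 in the record, and reports top-strand fragment lengths only. The amplicon below is a made-up sequence, not the 16S rRNA product from the study.

```python
# Recognition site and cut offset within the site (top strand).
SITES = {
    "AluI": ("AGCT", 2),    # AG^CT, blunt ends
    "PstI": ("CTGCAG", 5),  # CTGCA^G, 3' overhang
}


def digest(seq, enzyme):
    """Return fragment lengths after cutting a linear sequence."""
    site, cut_offset = SITES[enzyme]
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 40 bp amplicon with one AluI site and one PstI site.
amplicon = "A" * 10 + "AGCT" + "C" * 10 + "CTGCAG" + "T" * 10
alu_fragments = digest(amplicon, "AluI")
pst_fragments = digest(amplicon, "PstI")
```

    An amplicon with no recognition site comes back as a single fragment of full length, which corresponds to an isolate scored as undigested on a gel.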

  7. Characterization of Erwinia amylovora strains from different host plants using repetitive-sequences PCR analysis, and restriction fragment length polymorphism and short-sequence DNA repeats of plasmid pEA29. (United States)

    Barionovi, D; Giorgi, S; Stoeger, A R; Ruppitsch, W; Scortichini, M


    The three main aims of the study were the assessment of the genetic relationship between a deviating Erwinia amylovora strain isolated from Amelanchier sp. (Maloideae) grown in Canada and other strains from Maloideae and Rosoideae, the investigation of the variability of the PstI fragment of the pEA29 plasmid using restriction fragment length polymorphism (RFLP) analysis and the determination of the number of short-sequence DNA repeats (SSR) by DNA sequence analysis in representative strains. Ninety-three strains obtained from 12 plant genera and different geographical locations were examined by repetitive-sequences PCR using Enterobacterial Repetitive Intergenic Consensus, BOX and Repetitive Extragenic Palindromic primer sets. Following the unweighted pair group method with arithmetic mean analysis, a deviating strain from Amelanchier sp. was analysed using amplified ribosomal DNA restriction analysis (ARDRA) and sequencing of the 16S rDNA gene. This strain showed 99% similarity to other E. amylovora strains in the 16S gene and the same banding pattern with ARDRA. The RFLP analysis of pEA29 plasmid using MspI and Sau3A restriction enzymes showed a higher variability than that previously observed and no clear-cut grouping of the strains was possible. The SSR units were reiterated two to 12 times. The strains obtained from pear orchards showing symptoms of fire blight for the first time had a low number of SSR units. The strains from Maloideae exhibit a wider genetic variability than previously thought. The RFLP analysis of a fragment of the pEA29 plasmid would not seem a reliable method for typing E. amylovora strains. A low number of SSR units was observed with first epidemics of fire blight. The current detection techniques are mainly based on the genetic similarities observed within the strains from the cultivated tree-fruit crops. For a more reliable detection of the fire blight pathogen also in wild and ornamental rosaceous plants the genetic

  8. Local repeat sequence organization of an intergenic spacer in the ...

    Indian Academy of Sciences (India)


    chloroplast genome of Chlamydomonas reinhardtii leads to DNA expansion and sequence ... The discovery of uniparentally inherited streptomycin resistant mutants ... resembles yeast, mitochondrial and phage recombination in that it is typically ...... Sager R and Lane D 1972 Molecular basis of maternal inheritance; Proc.

  9. Chaotic generation of PN sequences : a VLSI implementation

    NARCIS (Netherlands)

    Dornbusch, A.; Pineda de Gyvez, J.


    Generation of repeatable pseudo-random sequences with chaotic analog electronics is not feasible using standard circuit topologies. Component variation caused by imperfect fabrication causes the same divergence of output sequences as does varying initial conditions. By quantizing the output of a

  10. Aberrant splicing in transgenes containing introns, exons, and V5 epitopes: lessons from developing an FSHD mouse model expressing a D4Z4 repeat with flanking genomic sequences.

    Directory of Open Access Journals (Sweden)

    Eugénie Ansseau

    The DUX4 gene, encoded within D4Z4 repeats on human chromosome 4q35, has recently emerged as a key factor in the pathogenic mechanisms underlying Facioscapulohumeral muscular dystrophy (FSHD). This recognition prompted development of animal models expressing the DUX4 open reading frame (ORF) alone or embedded within D4Z4 repeats. In the first published model, we used adeno-associated viral vectors (AAV) and strong viral control elements (CMV promoter, SV40 poly A) to demonstrate that the DUX4 cDNA caused dose-dependent toxicity in mouse muscles. As a follow-up, we designed a second generation of DUX4-expressing AAV vectors to more faithfully genocopy the FSHD-permissive D4Z4 repeat region located at 4q35. This new vector (called AAV.D4Z4.V5.pLAM) contained the D4Z4/DUX4 promoter region, a V5 epitope-tagged DUX4 ORF, and the natural 3' untranslated region (pLAM) harboring two small introns, DUX4 exons 2 and 3, and the non-canonical poly A signal required for stabilizing DUX4 mRNA in FSHD. AAV.D4Z4.V5.pLAM failed to recapitulate the robust pathology of our first generation vectors following delivery to mouse muscle. We found that the DUX4.V5 junction sequence created an unexpected splice donor in the pre-mRNA that was preferentially utilized to remove the V5 coding sequence and DUX4 stop codon, yielding non-functional DUX4 protein with 55 additional residues on its carboxyl-terminus. Importantly, we further found that aberrant splicing could occur in any expression construct containing a functional splice acceptor and sequences resembling minimal splice donors. Our findings represent an interesting case study with respect to AAV.D4Z4.V5.pLAM, but more broadly serve as a note of caution for designing constructs containing V5 epitope tags and/or transgenes with downstream introns and exons.

  11. A novel rat genomic simple repeat DNA with RNA-homology shows triplex (H-DNA)-like structure and tissue-specific RNA expression

    International Nuclear Information System (INIS)

    Dey, Indranil; Rath, Pramod C.


    The mammalian genome contains a wide variety of repetitive DNA sequences of relatively unknown function. We report a novel 227 bp simple repeat DNA (3.3 DNA) with a d[(GA)7A(AG)7] dinucleotide mirror repeat from the rat (Rattus norvegicus) genome. 3.3 DNA showed 75-85% homology with several eukaryotic mRNAs due to (GA/CU)n dinucleotide repeats by nBlast search and a dispersed distribution in the rat genome by Southern blot hybridization with [32P]3.3 DNA. The d[(GA)7A(AG)7] mirror repeat formed a triplex (H-DNA)-like structure in vitro. Two large RNAs of 9.1 and 7.5 kb were detected by [32P]3.3 DNA in rat brain by Northern blot hybridization, indicating expression of such simple sequence repeats at the RNA level in vivo. Further, several cDNAs were isolated from a rat cDNA library by the [32P]3.3 DNA probe. Three such cDNAs showed tissue-specific RNA expression in rat. pRT 4.1 cDNA showed strong expression of a 2.39 kb RNA in brain and spleen, pRT 5.5 cDNA showed strong expression of a 2.8 kb RNA in brain and a 3.9 kb RNA in lungs, and pRT 11.4 cDNA showed weak expression of a 2.4 kb RNA in lungs. Thus, genomic simple sequence repeats containing d(GA/CT)n dinucleotides are transcriptionally expressed and regulated in rat tissues. Such d(GA/CT)n dinucleotide repeats may form structural elements (e.g., triplex) which may be sites for functional regulation of genomic coding sequences as well as RNAs. This may be a general function of such transcriptionally active simple sequence repeats widely dispersed in the mammalian genome.
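
    The mirror-repeat property described above can be checked mechanically: a mirror repeat reads the same forwards and backwards on a single strand, which is different from a palindrome in the DNA sense (a sequence equal to its own reverse complement). The following is a minimal Python sketch; the function names are illustrative and do not come from the paper.

```python
def build_mirror_repeat(unit: str = "GA", n: int = 7, center: str = "A") -> str:
    """Build (unit)n + center + (reversed unit)n, e.g. (GA)7 A (AG)7."""
    return unit * n + center + unit[::-1] * n


def is_mirror_repeat(seq: str) -> bool:
    """A mirror repeat equals its own reversal on the same strand."""
    return seq == seq[::-1]


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


mirror = build_mirror_repeat()
print(len(mirror), is_mirror_repeat(mirror))
# A purine-only strand cannot equal its pyrimidine-only reverse complement.
print(mirror == reverse_complement(mirror))
```

    The distinction matters here because homopurine-homopyrimidine mirror repeats, not reverse-complement palindromes, are the motifs the abstract associates with triplex (H-DNA)-like structures.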

  12. Effective automated feature construction and selection for classification of biological sequences.

    Directory of Open Access Journals (Sweden)

    Uday Kamath

    Many open problems in bioinformatics involve elucidating underlying functional signals in biological sequences. DNA sequences, in particular, are characterized by rich architectures in which functional signals are increasingly found to combine local and distal interactions at the nucleotide level. Problems of interest include detection of regulatory regions, splice sites, exons, hypersensitive sites, and more. These problems naturally lend themselves to formulation as classification problems in machine learning. When classification is based on features extracted from the sequences under investigation, success is critically dependent on the chosen set of features. We present an algorithmic framework (EFFECT) for automated detection of functional signals in biological sequences. We focus here on classification problems involving DNA sequences which state-of-the-art work in machine learning shows to be challenging and to involve complex combinations of local and distal features. EFFECT uses a two-stage process to first construct a set of candidate sequence-based features and then select a most effective subset for the classification task at hand. Both stages make heavy use of evolutionary algorithms to efficiently guide the search towards informative features capable of discriminating between sequences that contain a particular functional signal and those that do not. To demonstrate its generality, EFFECT is applied to three separate problems of importance in DNA research: the recognition of hypersensitive sites, splice sites, and ALU sites. Comparisons with state-of-the-art algorithms show that the framework is both general and powerful. In addition, a detailed analysis of the constructed features shows that they contain valuable biological information about DNA architecture, allowing biologists and other researchers to directly inspect the features and potentially use the insights obtained to assist wet-laboratory studies on retainment or modification

  13. Organelle Simple Sequence Repeat Markers Help to Distinguish Carpelloid Stamen and Normal Cytoplasmic Male Sterile Sources in Broccoli (United States)

    Shu, Jinshuai; Liu, Yumei; Li, Zhansheng; Zhang, Lili; Fang, Zhiyuan; Yang, Limei; Zhuang, Mu; Zhang, Yangyong; Lv, Honghao


    We previously discovered carpelloid stamens when breeding cytoplasmic male sterile lines in broccoli (Brassica oleracea var. italica). In this study, hybrids and multiple backcrosses were produced from different cytoplasmic male sterile carpelloid stamen sources and maintainer lines. Carpelloid stamens caused dysplasia of the flower structure and led to hooked or coiled siliques with poor seed setting, which were inherited in a maternal fashion. Using four distinct carpelloid stamens and twelve distinct normal stamens from cytoplasmic male sterile sources and one maintainer, we used 21 mitochondrial simple sequence repeat (mtSSR) primers and 32 chloroplast SSR primers to identify a mitochondrial marker, mtSSR2, that can differentiate between the cytoplasm of carpelloid and normal stamens. Thereafter, mtSSR2 was used to identify another 34 broccoli accessions, with an accuracy rate of 100%. Analysis of the polymorphic sequences revealed that the mtSSR2 open reading frame of carpelloid stamen sterile sources had a deletion of 51 bases (encoding 18 amino acids) compared with normal stamen materials. The open reading frame is located in the coding region of orf125 and orf108 of the mitochondrial genomes in Brassica crops and had the highest similarity with Raphanus sativus and Brassica carinata. The current study has not only identified a useful molecular marker to detect the cytoplasm of carpelloid stamens during broccoli breeding, but it also provides evidence that the mitochondrial genome is maternally inherited and provides a basis for studying the effect of the cytoplasm on flower organ development in plants. PMID:26407159

  14. Non-radioactive detection of trinucleotide repeat size variability. (United States)

    Tomé, Stéphanie; Nicole, Annie; Gomes-Pereira, Mario; Gourdon, Genevieve


    Many human diseases are associated with the abnormal expansion of unstable trinucleotide repeat sequences. The mechanisms of trinucleotide repeat size mutation have not been fully dissected, and their understanding must be grounded on the detailed analysis of repeat size distributions in human tissues and animal models. Small-pool PCR (SP-PCR) is a robust, highly sensitive and efficient PCR-based approach to assess the levels of repeat size variation, providing both quantitative and qualitative data. The method relies on the amplification of a very low number of DNA molecules, through successive dilution of a stock genomic DNA solution. Radioactive Southern blot hybridization is sensitive enough to detect SP-PCR products derived from single template molecules, separated by agarose gel electrophoresis and transferred onto DNA membranes. We describe a variation of the detection method that uses digoxigenin-labelled locked nucleic acid probes. This protocol keeps the sensitivity of the original method, while eliminating the health risks associated with the manipulation of radiolabelled probes, and the burden associated with their regulation, manipulation and waste disposal.

  15. Full-length cDNA sequences from Rhesus monkey placenta tissue: analysis and utility for comparative mapping

    Directory of Open Access Journals (Sweden)

    Lee Sang-Rae


    Background: Rhesus monkeys (Macaca mulatta) are widely used as experimental animals in biomedical research and are closely related to other laboratory macaques, such as cynomolgus monkeys (Macaca fascicularis), and to humans, sharing a last common ancestor from about 25 million years ago. Although rhesus monkeys have been studied extensively under field and laboratory conditions, research has been limited by the lack of genetic resources. The present study generated placenta full-length cDNA libraries, characterized the resulting expressed sequence tags, and described their utility for comparative mapping with human RefSeq mRNA transcripts. Results: From rhesus monkey placenta full-length cDNA libraries, 2000 full-length cDNA sequences were determined and 1835 rhesus placenta cDNA sequences longer than 100 bp were collected. These sequences were annotated based on homology to human genes. Homology search against human RefSeq mRNAs revealed that our collection included the sequences of 1462 putative rhesus monkey genes. Moreover, we identified 207 genes containing exon alterations in the coding region and the untranslated region of rhesus monkey transcripts, despite the highly conserved structure of the coding regions. Approximately 10% (187) of all full-length cDNA sequences did not represent any public human RefSeq mRNAs. Intriguingly, two rhesus monkey specific exons derived from the transposable elements of AluYRa2 (SINE family) and MER11B (LTR family) were also identified. Conclusion: The 1835 rhesus monkey placenta full-length cDNA sequences described here could expand genomic resources and information of rhesus monkeys. This increased genomic information will greatly contribute to the development of evolutionary biology and biomedical research.

  16. One in Four Individuals of African-American Ancestry Harbors a 5.5kb Deletion at chromosome 11q13.1 (United States)

    Zainabadi, Kayvan; Jain, Anuja V.; Donovan, Frank X.; Elashoff, David; Rao, Nagesh P.; Murty, Vundavalli V.; Chandrasekharappa, Settara C.; Srivatsan, Eri S.


    Cloning and sequencing of a 5.5kb deletion at chromosome 11q13.1 from HeLa cells, tumorigenic hybrids and two fibroblast cell lines has revealed homologous recombination between AluSx and AluY resulting in the deletion of intervening sequences. Long-range PCR of the 5.5kb sequence in 494 normal lymphocyte samples showed heterozygous deletion in 28.3% of African-American ancestry samples but only in 4.8% of Caucasian samples (pdeletion occurs in 27% of YRI (Yoruba – West African) population but none in non-African populations. The HapMap analysis further identified strong linkage disequilibrium between 5 single nucleotide polymorphisms and the 5.5kb deletion in the people of African ancestry. Computational analysis of 175kb sequence surrounding the deletion site revealed enhanced flexibility, low thermodynamic stability, high repetitiveness, and stable stem-loop/hairpin secondary structures that are hallmarks of common fragile sites. PMID:24412158

  17. Analysis of genetic relationships and identification of lily cultivars based on inter-simple sequence repeat markers. (United States)

    Cui, G F; Wu, L F; Wang, X N; Jia, W J; Duan, Q; Ma, L L; Jiang, Y L; Wang, J H


    Inter-simple sequence repeat (ISSR) markers were used to discriminate 62 lily cultivars of 5 hybrid series. Eight ISSR primers generated 104 bands in total, which all showed 100% polymorphism, and an average of 13 bands were amplified by each primer. Two software packages, POPGENE 1.32 and NTSYSpc 2.1, were used to analyze the data matrix. Our results showed that the observed number of alleles (NA), effective number of alleles (NE), Nei's genetic diversity (H), and Shannon's information index (I) were 1.9630, 1.4179, 0.2606, and 0.4080, respectively. The highest genetic similarity (0.9601) was observed between the Oriental x Trumpet and Oriental lilies, which indicated that the two hybrids had a close genetic relationship. An unweighted pair-group method with arithmetic means dendrogram showed that the 62 lily cultivars clustered into two discrete groups. The first group included the Oriental and OT cultivars, while the Asiatic, LA, and Longiflorum lilies were placed in the second cluster. The distribution of individuals in the principal component analysis was consistent with the clustering of the dendrogram. Fingerprints of all lily cultivars built from 8 primers could be separated completely. This study confirmed the effect and efficiency of ISSR identification in lily cultivars.
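
    The four summary statistics reported above (NA, NE, H, and I) can be illustrated for a single dominant marker band. The sketch below is an assumption about how such values are commonly derived for dominant markers, not a reproduction of POPGENE: it assumes Hardy-Weinberg equilibrium, so the null-allele frequency q is the square root of the band-absence frequency, and it uses the natural logarithm for Shannon's index. The function name and the example scores are hypothetical.

```python
import math


def locus_stats(bands):
    """Summary statistics for one band scored 1 (present) / 0 (absent).

    Assumes a dominant two-allele locus in Hardy-Weinberg equilibrium.
    Returns (NA, NE, H, I).
    """
    absent = bands.count(0) / len(bands)
    q = math.sqrt(absent)            # null (band-absent) allele frequency
    p = 1.0 - q                      # band-present allele frequency
    homozygosity = p * p + q * q
    na = 2 if 0.0 < absent < 1.0 else 1          # observed number of alleles
    ne = 1.0 / homozygosity                      # effective number of alleles
    h = 1.0 - homozygosity                       # Nei's gene diversity
    i = -sum(x * math.log(x) for x in (p, q) if x > 0.0)  # Shannon's index
    return na, ne, h, i


# Hypothetical scores: band absent in 1 of 4 individuals, so q = 0.5.
print(locus_stats([1, 1, 1, 0]))
```

    Population-level values such as those in the abstract would then be averages of these per-band quantities over all scored bands.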

  18. Isolation, sequencing and expression of RED, a novel human gene encoding an acidic-basic dipeptide repeat. (United States)

    Assier, E; Bouzinba-Segard, H; Stolzenberg, M C; Stephens, R; Bardos, J; Freemont, P; Charron, D; Trowsdale, J; Rich, T


    A novel human gene RED, and the murine homologue, MuRED, were cloned. These genes were named after the extensive stretch of alternating arginine (R) and glutamic acid (E) or aspartic acid (D) residues that they contain. We term this the 'RED' repeat. The genes of both species were expressed in a wide range of tissues and we have mapped the human gene to chromosome 5q22-24. MuRED and RED shared 98% sequence identity at the amino acid level. The open reading frame of both genes encodes a 557 amino acid protein. RED fused to a fluorescent tag was expressed in nuclei of transfected cells and localised to nuclear dots. Co-localisation studies showed that these nuclear dots did not contain either PML or Coilin, which are commonly found in the POD or coiled body nuclear compartments. Deletion of the amino terminal 265 amino acids resulted in a failure to sort efficiently to the nucleus, though nuclear dots were formed. Deletion of a further 50 amino acids from the amino terminus generates a protein that can sort to the nucleus but is unable to generate nuclear dots. Neither construct localised to the nucleolus. The characteristics of RED and its nuclear localisation implicate it as a regulatory protein, possibly involved in transcription.

  19. Modulation of LINE-1 and Alu/SVA Retrotransposition by Aicardi-Goutières Syndrome-Related SAMHD1

    Directory of Open Access Journals (Sweden)

    Ke Zhao


    Long interspersed elements 1 (LINE-1) occupy at least 17% of the human genome and are its only active autonomous retrotransposons. However, the host factors that regulate LINE-1 retrotransposition are not fully understood. Here, we demonstrate that the Aicardi-Goutières syndrome gene product SAMHD1, recently revealed to be an inhibitor of HIV/simian immunodeficiency virus (SIV) infectivity and neutralized by the viral Vpx protein, is also a potent regulator of LINE-1 and LINE-1-mediated Alu/SVA retrotransposition. We also found that mutant SAMHD1s of Aicardi-Goutières syndrome patients are defective in LINE-1 inhibition. Several domains of SAMHD1 are critical for LINE-1 regulation. SAMHD1 inhibits LINE-1 retrotransposition in dividing cells. An enzymatic active site mutant SAMHD1 maintained substantial anti-LINE-1 activity. SAMHD1 inhibits ORF2p-mediated LINE-1 reverse transcription in isolated LINE-1 ribonucleoproteins by reducing ORF2p level. Thus, SAMHD1 may be a cellular regulator of LINE-1 activity that is conserved in mammals.

  20. Correlation between fibroin amino acid sequence and physical silk properties. (United States)

    Fedic, Robert; Zurovec, Michal; Sehnal, Frantisek


    The fiber properties of lepidopteran silk depend on the amino acid repeats that interact during H-fibroin polymerization. The aim of our research was to relate repeat composition to insect biology and fiber strength. Representative regions of the H-fibroin genes were sequenced and analyzed in three pyralid species: wax moth (Galleria mellonella), European flour moth (Ephestia kuehniella), and Indian meal moth (Plodia interpunctella). The amino acid repeats are species-specific, evidently a diversification of an ancestral region of 43 residues, and include three types of regularly dispersed motifs: modifications of GSSAASAA sequence, stretches of tripeptides GXZ where X and Z represent bulky residues, and sequences similar to PVIVIEE. No concatenations of GX dipeptide or alanine, which are typical for Bombyx silkworms and Antheraea silk moths, respectively, were found. Despite different repeat structure, the silks of G. mellonella and E. kuehniella exhibit similar tensile strength as the Bombyx and Antheraea silks. We suggest that in these latter two species, variations in the repeat length obstruct repeat alignment, but sufficiently long stretches of iterated residues get superposed to interact. In the pyralid H-fibroins, interactions of the widely separated and diverse motifs depend on the precision of repeat matching; silk is strong in G. mellonella and E. kuehniella, with 2-3 types of long homogeneous repeats, and nearly 10 times weaker in P. interpunctella, with seven types of shorter erratic repeats. The high proportion of large amino acids in the H-fibroin of pyralids has probably evolved in connection with the spinning habit of caterpillars that live in protective silk tubes and spin continuously, enlarging the tubes on one end and partly devouring the other one. The silk serves as a depot of energetically rich and essential amino acids that may be scarce in the diet.

  1. Subtyping Salmonella enterica serovar enteritidis isolates from different sources by using sequence typing based on virulence genes and clustered regularly interspaced short palindromic repeats (CRISPRs). (United States)

    Liu, Fenyun; Kariyawasam, Subhashinie; Jayarao, Bhushan M; Barrangou, Rodolphe; Gerner-Smidt, Peter; Ribot, Efrain M; Knabel, Stephen J; Dudley, Edward G


    Salmonella enterica subsp. enterica serovar Enteritidis is a major cause of food-borne salmonellosis in the United States. Two major food vehicles for S. Enteritidis are contaminated eggs and chicken meat. Improved subtyping methods are needed to accurately track specific strains of S. Enteritidis related to human salmonellosis throughout the chicken and egg food system. A sequence typing scheme based on virulence genes (fimH and sseL) and clustered regularly interspaced short palindromic repeats (CRISPRs), designated CRISPR-including multi-virulence-locus sequence typing (CRISPR-MVLST), was used to characterize 35 human clinical isolates, 46 chicken isolates, 24 egg isolates, and 63 hen house environment isolates of S. Enteritidis. A total of 27 sequence types (STs) were identified among the 167 isolates. CRISPR-MVLST identified three persistent and predominant STs circulating among U.S. human clinical isolates and chicken, egg, and hen house environmental isolates in Pennsylvania, and an ST that was found only in eggs and humans. It also identified a potential environment-specific sequence type. Moreover, cluster analysis based on fimH and sseL identified a number of clusters, of which several were found in more than one outbreak, as well as 11 singletons. Further research is needed to determine if CRISPR-MVLST might help identify the ecological origins of S. Enteritidis strains that contaminate chickens and eggs.

  2. Instability of (CTG)n•(CAG)n trinucleotide repeats and DNA synthesis

    Directory of Open Access Journals (Sweden)

    Liu Guoqi


    Expansion of (CTG)n•(CAG)n trinucleotide repeat (TNR) microsatellite sequences is the cause of more than a dozen human neurodegenerative diseases. (CTG)n and (CAG)n repeats form imperfectly base paired hairpins that tend to expand in vivo in a length-dependent manner. Yeast, mouse and human models confirm that (CTG)n•(CAG)n instability increases with repeat number, and implicate both DNA replication and DNA damage response mechanisms in (CTG)n•(CAG)n TNR expansion and contraction. Mutation and knockdown models that abrogate the expression of individual genes might also mask more subtle, cumulative effects of multiple additional pathways on (CTG)n•(CAG)n instability in whole animals. The identification of second site genetic modifiers may help to explain the variability of (CTG)n•(CAG)n TNR instability patterns between tissues and individuals, and offer opportunities for prognosis and treatment.
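
    Because instability is described as depending on repeat number, the quantity n is the first thing one would extract from a sequence. A minimal Python sketch of that step follows; it counts only uninterrupted, exact copies of the unit, and the function name and example sequence are hypothetical.

```python
import re


def longest_repeat_run(seq: str, unit: str = "CTG") -> int:
    """Largest n such that (unit)n occurs without interruption in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical fragment: a (CTG)5 tract, an interruption, then a (CTG)12 tract.
fragment = "AAT" + "CTG" * 5 + "CCG" + "CTG" * 12 + "TT"
print(longest_repeat_run(fragment))
print(longest_repeat_run(fragment, unit="CAG"))
```

    The same call with unit="CAG" applies to the complementary strand; interrupted tracts are deliberately reported as separate runs rather than merged.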

  3. A Sequence-Specific Interaction between the Saccharomyces cerevisiae rRNA Gene Repeats and a Locus Encoding an RNA Polymerase I Subunit Affects Ribosomal DNA Stability (United States)

    Cahyani, Inswasti; Cridge, Andrew G.; Engelke, David R.; Ganley, Austen R. D.


    The spatial organization of eukaryotic genomes is linked to their functions. However, how individual features of the global spatial structure contribute to nuclear function remains largely unknown. We previously identified a high-frequency interchromosomal interaction within the Saccharomyces cerevisiae genome that occurs between the intergenic spacer of the ribosomal DNA (rDNA) repeats and the intergenic sequence between the locus encoding the second largest RNA polymerase I subunit and a lysine tRNA gene [i.e., RPA135-tK(CUU)P]. Here, we used quantitative chromosome conformation capture in combination with replacement mapping to identify a 75-bp sequence within the RPA135-tK(CUU)P intergenic region that is involved in the interaction. We demonstrate that the RPA135-IGS1 interaction is dependent on the rDNA copy number and the Msn2 protein. Surprisingly, we found that the interaction does not govern RPA135 transcription. Instead, replacement of a 605-bp region within the RPA135-tK(CUU)P intergenic region results in a reduction in the RPA135-IGS1 interaction level and fluctuations in rDNA copy number. We conclude that the chromosomal interaction that occurs between the RPA135-tK(CUU)P and rDNA IGS1 loci stabilizes rDNA repeat number and contributes to the maintenance of nucleolar stability. Our results provide evidence that the DNA loci involved in chromosomal interactions are composite elements, sections of which function in stabilizing the interaction or mediating a functional outcome. PMID:25421713

  4. Molecular identification and characterization of clustered regularly interspaced short palindromic repeats (CRISPRs) in a urease-positive thermophilic Campylobacter sp. (UPTC). (United States)

    Tasaki, E; Hirayama, J; Tazumi, A; Hayashi, K; Hara, Y; Ueno, H; Moore, J E; Millar, B C; Matsuda, M


    A novel clustered regularly interspaced short palindromic repeats (CRISPRs) locus [7,500 base pairs (bp) in length] was found in the urease-positive thermophilic Campylobacter (UPTC) Japanese isolate, CF89-12. The 7,500 bp locus consisted of the 5'-methylaminomethyl-2-thiouridylate methyltransferase gene, putative (P) CRISPR associated (p-Cas), putative open reading frames, Cas1 and Cas2, a leader sequence region (146 bp), 12 CRISPRs consensus sequence repeats (each 36 bp) separated by non-repetitive unique spacer regions of similar length (26-31 bp), and the phosphatidyl glycerophosphatase A gene. When the CRISPRs loci in UPTC CF89-12 and five C. jejuni isolates were compared with one another, all six isolates contained p-Cas, Cas1 and Cas2 within the loci. Four to 12 CRISPRs consensus sequence repeats separated by non-repetitive unique spacer regions occurred in the six isolates, and the nucleotide sequences of those repeats showed approximately 92-100% similarity with each other. However, no sequence similarity occurred in the unique spacer regions among these isolates. The putative σ(70) transcriptional promoter and the hypothetical ρ-independent terminator structures for the CRISPRs and Cas were detected. No in vivo transcription of p-Cas, Cas1 and Cas2 was confirmed in the UPTC cells.
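
    The array layout described above (a leader, then direct repeats of fixed length alternating with unique spacers) can be sketched in a few lines of Python. Everything in this example is toy data built to match only the reported sizes: the repeat and spacer sequences are invented, and exact matching to a known repeat is a simplifying assumption, since the abstract reports 92-100% similarity among real repeats.

```python
def split_crispr_array(array: str, repeat: str) -> list[str]:
    """Return the spacers lying between consecutive exact copies of repeat."""
    parts = array.split(repeat)
    # parts[0] is the sequence before the first repeat (e.g. the leader) and
    # parts[-1] is whatever follows the last repeat; neither is a spacer.
    return parts[1:-1]


# Toy data matching only the reported sizes (all sequences are invented).
toy_leader = "G" * 146
toy_repeat = "ACGT" * 9                      # 36 bp
toy_spacers = [
    "T" * (i + 1) + "C" * (26 + i % 6 - (i + 1))   # 26-31 bp, all distinct
    for i in range(11)
]
toy_array = toy_leader + toy_repeat + "".join(s + toy_repeat for s in toy_spacers)

found = split_crispr_array(toy_array, toy_repeat)
print(len(found), sorted({len(s) for s in found}))
```

    Twelve repeats bound eleven spacers, which is why the toy array is built from eleven spacer sequences.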

  5. Inter-simple sequence repeat (ISSR) markers in the evaluation of ...

    African Journals Online (AJOL)



    Feb 13, 2013 ... 666 Afr. J. Biotechnol. Table 1. Number and types of the ISSR bands as well as the total polymorphism percentages generated in six Capsicum hybrids.

    Primer code   Sequence   Monomorphic band   Polymorphic band (Unique / Shared)   Total band   Polymorphism (%)
    HB 1          (CAA)5     4                  0 / 1                                5            20
    HB 2          (CAG) ...

  6. Dispersed repetitive sequences in eukaryotic genomes and their possible biological significance

    International Nuclear Information System (INIS)

    Georgiev, G.P.; Kramerov, D.A.; Ryskov, A.P.; Skryabin, K.G.; Lukanidin, E.M.


    This paper describes the properties of a novel mouse mdg-like element, the A2 sequence, which is the most abundant repetitive sequence. We also characterized a ubiquitous B2 sequence that represents, after B1, the dominant family among the short interspersed repeats of the mouse genome. The existence of some putative transposition intermediates was shown for repeats of both A and B types of the mouse genome. These are closed circular DNA of the A type and small polyadenylated B+ RNAs. The fundamental question that arises is whether these sequences are simply selfish DNA capable of transposition or whether they fulfill some useful biological functions within the genome. 66 references, 11 figures, 1 table

  7. Clustered regularly interspaced short palindromic repeats (CRISPRs): the hallmark of an ingenious antiviral defense mechanism in prokaryotes

    NARCIS (Netherlands)

    Al-Attar, S.; Westra, E.R.; Oost, van der J.; Brouns, S.J.J.


    Many prokaryotes contain the recently discovered defense system against mobile genetic elements. This defense system contains a unique type of repetitive DNA stretches, termed Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs). CRISPRs consist of identical repeated DNA sequences

  8. Amino acid sequence analysis of the annexin super-gene family of proteins. (United States)

    Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J


    The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit, and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeat 4 and Glu in repeat 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried salt bridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. The structure confirms most of

  9. The polymorphic integumentary mucin B.1 from Xenopus laevis contains the short consensus repeat. (United States)

    Probst, J C; Hauser, F; Joba, W; Hoffmann, W


    The frog integumentary mucin B.1 (FIM-B.1), discovered by molecular cloning, contains a cysteine-rich C-terminal domain which is homologous with von Willebrand factor. With the help of the polymerase chain reaction, we now characterize a contiguous region 5' to the von Willebrand factor domain containing the short consensus repeat typical of many proteins from the complement system. Multiple transcripts have been cloned, which originate from a single animal and differ by a variable number of tandem repeats (rep-33 sequences). These different transcripts probably originate solely from two genes and are presumably generated by alternative splicing of a huge array of functional cassettes. This model is supported by analysis of genomic FIM-B.1 sequences from Xenopus laevis. Here, rep-33 sequences are arranged in an interrupted array of individual units. Additionally, results of Southern analysis revealed genetic polymorphism between different animals which is predicted to be within the tandem repeats. A first investigation of the predicted mucins with the help of a specific antibody against a synthetic peptide determined the molecular mass of FIM-B.1 to be greater than 200 kDa. Here again, genetic polymorphism between different animals is detected.

  10. Estimation of genetic structure of a Mycosphaerella musicola population using inter-simple sequence repeat markers. (United States)

    Peixouto, Y S; Dórea Bragança, C A; Andrade, W B; Ferreira, C F; Haddad, F; Oliveira, S A S; Darosci Brito, F S; Miller, R N G; Amorim, E P


    Among the diseases affecting banana (Musa sp), yellow Sigatoka, caused by the fungal pathogen Mycosphaerella musicola Leach, is considered one of the most important in Brazil, causing losses throughout the year. Understanding the genetic structure of pathogen populations will provide insight into the life history of pathogens, including the evolutionary processes occurring in agrosystems. Tools for estimating the possible emergence of pathogen variants with altered pathogenicity, virulence, or aggressiveness, as well as resistance to systemic fungicides, can also be developed from such data. The objective of this study was to analyze the genetic diversity and population genetics of M. musicola in the main banana-producing regions in Brazil. A total of 83 isolates collected from different banana cultivars in the Brazilian states of Bahia, Rio Grande do Norte, and Minas Gerais were evaluated using inter-simple sequence repeat markers. High variability was detected between the isolates, and 85.5% of the haplotypes were singletons in the populations. The highest source of genetic diversity (97.22%) was attributed to variations within populations. Bayesian cluster analysis revealed the presence of 2 probable ancestral groups, which, however, showed no relationship to population structure in terms of collection site, state of origin, or cultivar. Similarly, we detected no evidence of genetic recombination between individuals within different states, indicating that asexual cycles play a major role in M. musicola reproduction and that long-distance dispersal of the pathogen is the main factor contributing to the lack of population structure in the fungus.

  11. Inferring repeat-protein energetics from evolutionary information.

    Directory of Open Access Journals (Sweden)

    Rocío Espada


    Full Text Available Natural protein sequences contain a record of their history. A common constraint in a given protein family is the ability to fold to specific structures, and it has been shown possible to infer the main native ensemble by analyzing covariations in extant sequences. Still, many natural proteins that fold into the same structural topology show different stabilization energies, and these are often related to their physiological behavior. We propose a description for the energetic variation given by sequence modifications in repeat proteins, systems for which the overall problem is simplified by their inherent symmetry. We explicitly account for single amino acid and pair-wise interactions and treat higher order correlations with a single term. We show that the resulting evolutionary field can be interpreted with structural detail. We trace the variations in the energetic scores of natural proteins and relate them to their experimental characterization. The resulting energetic evolutionary field allows the prediction of the folding free energy change for several mutants, and can be used to generate synthetic sequences that are statistically indistinguishable from the natural counterparts.

  12. Stress-induced rearrangement of Fusarium retrotransposon sequences. (United States)

    Anaya, N; Roncero, M I


    Rearrangement of the Fusarium oxysporum retrotransposon skippy was induced by growth in the presence of potassium chlorate. Three fungal strains, one sensitive to chlorate (Co60) and two resistant to chlorate and deficient for nitrate reductase (Co65 and Co94), were studied by Southern analysis of their genomic DNA. Polymorphism was detected in their hybridization banding pattern, relative to the wild type grown in the absence of chlorate, using various enzymes with or without restriction sites within the retrotransposon. Results were consistent with the assumption that three different events had occurred in strain Co60: genomic amplification of skippy yielding tandem arrays of the element, generation of new skippy sequences, and deletion of skippy sequences. Amplification of Co60 genomic DNA using the polymerase chain reaction and divergent primers derived from the retrotransposon generated a new band, corresponding to one long terminal repeat plus flanking sequences, that was not present in the wild-type strain. Molecular analysis of nitrate reductase-deficient mutants showed that generation and deletion of skippy sequences, but not genomic amplification in tandem repeats, had occurred in their genomes.

  13. Characterization of the variable-number tandem repeats in vrrA from different Bacillus anthracis isolates

    Energy Technology Data Exchange (ETDEWEB)

    Jackson, P.J.; Walthers, E.A.; Richmond, K.L. [Los Alamos National Lab., NM (United States)] [and others


    PCR analysis of 198 Bacillus anthracis isolates revealed a variable region of DNA sequence differing in length among the isolates. Five polymorphisms differed by the presence of two to six copies of the 12-bp tandem repeat 5'-CAATATCAACAA-3'. This variable-number tandem repeat (VNTR) region is located within a larger sequence containing one complete open reading frame that encodes a putative 30-kDa protein. Length variation did not change the reading frame of the encoded protein and only changed the copy number of a 4-amino-acid sequence (QYQQ) from 2 to 6. The structure of the VNTR region suggests that these multiple repeats are generated by recombination or polymerase slippage. Protein structures predicted from the reverse-translated DNA sequence suggest that any structural changes in the encoded protein are confined to the region encoded by the VNTR sequence. Copy number differences in the VNTR region were used to define five different B. anthracis alleles. Characterization of 198 isolates revealed allele frequencies of 6.1, 17.7, 59.6, 5.6, and 11.1% sequentially from shorter to longer alleles. The high degree of polymorphism in the VNTR region provides a criterion for assigning isolates to five allelic categories. There is a correlation between categories and geographic distribution. Such molecular markers can be used to monitor the epidemiology of anthrax outbreaks in domestic and native herbivore populations. 22 refs., 4 figs., 3 tabs.
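
    The relationship between the 12-bp repeat and the 4-amino-acid unit described in this record can be checked with a short sketch. It assumes perfect, in-frame copies of the repeat and uses a codon table restricted to the two codons that occur in the unit (CAA for Q, TAT for Y, per the standard genetic code). The function names are illustrative and do not come from the cited work.

```python
# Minimal codon table: only the codons present in the repeat unit
# (standard genetic code: CAA -> Gln/Q, TAT -> Tyr/Y).
CODON_TABLE = {"CAA": "Q", "TAT": "Y"}

REPEAT_UNIT = "CAATATCAACAA"  # 12-bp tandem repeat quoted in the abstract


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the minimal table above."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def vntr_region(copies: int) -> str:
    """Build an idealized VNTR region made of `copies` perfect repeats."""
    return REPEAT_UNIT * copies


# Two to six copies, as reported for the five alleles.
for copies in range(2, 7):
    region = vntr_region(copies)
    print(copies, len(region), translate(region))
```

    Because the unit is 12 bp long, every added copy extends the region by a whole number of codons, which is consistent with the statement that length variation leaves the reading frame unchanged.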

  14. Structural basis for sequence-specific recognition of DNA by TAL effectors

    KAUST Repository

    Deng, Dong; Yan, Chuangye; Pan, Xiaojing; Mahfouz, Magdy M.; Wang, Jiawei; Zhu, Jiankang; Shi, Yi Gong; Yan, Nieng


    TAL (transcription activator-like) effectors, secreted by phytopathogenic bacteria, recognize host DNA sequences through a central domain of tandem repeats. Each repeat comprises 33 to 35 conserved amino acids and targets a specific base pair

  15. Genome wide characterization of simple sequence repeats in watermelon genome and their application in comparative mapping and genetic diversity analysis. (United States)

    Zhu, Huayu; Song, Pengyao; Koo, Dal-Hoe; Guo, Luqin; Li, Yanman; Sun, Shouru; Weng, Yiqun; Yang, Luming


    Microsatellite markers are one of the most informative and versatile DNA-based markers used in plant genetic research, but their development has traditionally been difficult and costly. Whole genome sequencing with next-generation sequencing (NGS) technologies provides large amounts of sequence data to develop numerous microsatellite markers at whole genome scale. SSR markers have great advantage in cross-species comparisons and allow investigation of karyotype and genome evolution through highly efficient computation approaches such as in silico PCR. Here we described genome wide development and characterization of SSR markers in the watermelon (Citrullus lanatus) genome, which were then used in comparative analysis with two other important crop species in the Cucurbitaceae family: cucumber (Cucumis sativus L.) and melon (Cucumis melo L.). We further applied these markers in evaluating the genetic diversity and population structure in watermelon germplasm collections. A total of 39,523 microsatellite loci were identified from the watermelon draft genome with an overall density of 111 SSRs/Mbp, and 32,869 SSR primers were designed with suitable flanking sequences. The dinucleotide SSRs were the most common type representing 34.09 % of the total SSR loci and the AT-rich motifs were the most abundant in all nucleotide repeat types. In silico PCR analysis identified 832 and 925 SSR markers with each having a single amplicon in the cucumber and melon draft genome, respectively. Comparative analysis with these cross-species SSR markers revealed complicated mosaic patterns of syntenic blocks among the genomes of three species. In addition, genetic diversity analysis of 134 watermelon accessions with 32 highly informative SSR loci placed these lines into two groups with all accessions of C. lanatus var. citroides and three accessions of C. colocynthis clustered in one group and all accessions of C. lanatus var. lanatus and the remaining accessions of C. colocynthis
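
    As an illustration of how perfect microsatellites can be located and how a density in SSRs/Mbp is derived, the sketch below scans a sequence with regular expressions. The minimum repeat counts are hypothetical thresholds chosen for the example, not the criteria used in the study, and the 356 Mbp figure is simply what 39,523 loci at 111 SSRs/Mbp would imply.

```python
import re

# Hypothetical thresholds: minimum number of repeat units per motif length.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq: str):
    """Return sorted (start, motif, copies) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. "ATAT" reported as a 4-mer
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def density_per_mbp(n_loci: int, genome_bp: int) -> float:
    """SSR density expressed as loci per million base pairs."""
    return n_loci / (genome_bp / 1_000_000)


toy = "GGC" + "AT" * 8 + "CCT" + "AAG" * 5 + "C"
print(find_ssrs(toy))
print(round(density_per_mbp(39_523, 356_000_000)))
```

    Dedicated SSR-mining tools also handle imperfect and compound repeats, which this sketch deliberately ignores.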

  16. Organization and Evolution of Subtelomeric Satellite Repeats in the Potato Genome

    Czech Academy of Sciences Publication Activity Database

    Torres, A.T.; Gong, Z.; Iovene, M.; Hirsch, C.D.; Buell, C.R.; Bryan, G.J.; Novák, Petr; Macas, Jiří; Jiang, J.


    Vol. 1, July 2011 (2011), pp. 85-92 ISSN 2160-1836 R&D Projects: GA MŠk(CZ) LH11058 Institutional research plan: CEZ:AV0Z50510513 Keywords : Satellite sequences * Potato genome * Repeats Subject RIV: EB - Genetics ; Molecular Biology

  17. Gene mining a marama bean expressed sequence tags (ESTs ...

    African Journals Online (AJOL)

    The authors reported the identification of genes associated with embryonic development and microsatellite sequences. The future direction will entail characterization of these genes using gene over-expression and mutant assays. Key words: Namibia, simple sequence repeats (SSR), data mining, homology searches, ...

  18. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.


    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...... in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions....

  19. Distribution and sequence homogeneity of an abundant satellite DNA in the beetle, Tenebrio molitor. (United States)

    Davis, C A; Wyatt, G R


    The mealworm beetle, Tenebrio molitor, contains an unusually abundant and homogeneous satellite DNA which constitutes up to 60% of its genome. The satellite DNA is shown to be present in all of the chromosomes by in situ hybridization. 18 dimers of the repeat unit were cloned and sequenced. The consensus sequence is 142 nt long and lacks any internal repeat structure. Monomers of the sequence are very similar, showing on average a 2% divergence from the calculated consensus. Variant nucleotides are scattered randomly throughout the sequence although some variants are more common than others. Neighboring repeat units are no more alike than randomly chosen ones. The results suggest that some mechanism, perhaps gene conversion, is acting to maintain the homogeneity of the satellite DNA despite its abundance and distribution on all of the chromosomes. PMID:2762148
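
    The "divergence from the calculated consensus" mentioned in this record can be illustrated with a small sketch. It assumes the monomers are already aligned and of equal length, takes the consensus as the most frequent base in each column, and uses short toy sequences rather than the actual 142-nt satellite monomers.

```python
from collections import Counter


def consensus(monomers):
    """Majority-rule consensus of equal-length, pre-aligned monomers."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*monomers)
    )


def mean_divergence(monomers):
    """Average fraction of positions at which monomers differ from consensus."""
    cons = consensus(monomers)
    per_monomer = [
        sum(a != b for a, b in zip(m, cons)) / len(cons) for m in monomers
    ]
    return sum(per_monomer) / len(per_monomer)


# Toy monomers (hypothetical), two of them carrying one substitution each.
toy_monomers = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTAC",
    "ACGTACGTCC",
]
print(consensus(toy_monomers))
print(mean_divergence(toy_monomers))
```

    Comparing the divergence between adjacent monomers with that between randomly paired monomers, using the same per-position mismatch count, would be one way to test the observation that neighboring units are no more alike than randomly chosen ones.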

  20. Development of Highly Informative Genome-Wide Single Sequence Repeat Markers for Breeding Applications in Sesame and Construction of a Web Resource: SisatBase

    Directory of Open Access Journals (Sweden)

    Komivi Dossa


    The sequencing of the full nuclear genome of sesame (Sesamum indicum L.) provides the platform for functional analyses of genome components and their application in breeding programs. Although the importance of microsatellite markers or simple sequence repeats (SSRs) in crop genotyping, genetics, and breeding applications is well established, only a little information exists concerning SSRs at the whole genome level in sesame. In addition, SSRs represent a suitable marker type for sesame molecular breeding in developing countries where it is mainly grown. In this study, we identified 138,194 genome-wide SSRs of which 76.5% were physically mapped onto the 13 pseudo-chromosomes. Among these SSRs, up to three primer pairs were supplied for 101,930 SSRs and used to in silico amplify the reference genome together with two newly sequenced sesame accessions. A total of 79,957 SSRs (78%) were polymorphic between the three genomes, thereby suggesting their promising use in different genomics-assisted breeding applications. From these polymorphic SSRs, 23 were selected and validated to have high polymorphic potential in 48 sesame accessions from different growing areas of Africa. Furthermore, we have developed an online user-friendly database, SisatBase, which provides free access to SSR data as well as an integrated platform for functional analyses. Altogether, the reference SSR and SisatBase would serve as useful resources for genetic assessment, genomic studies, and breeding advancement in sesame, especially in developing countries.

  1. Complete plastid genome sequencing of Trochodendraceae reveals a significant expansion of the inverted repeat and suggests a Paleogene divergence between the two extant species.

    Directory of Open Access Journals (Sweden)

    Yan-xia Sun

    The early-diverging eudicot order Trochodendrales contains only two monospecific genera, Tetracentron and Trochodendron. Although an extensive fossil record indicates that the clade is perhaps 100 million years old and was widespread throughout the Northern Hemisphere during the Paleogene and Neogene, the two extant genera are both narrowly distributed in eastern Asia. Recent phylogenetic analyses strongly support a clade of Trochodendrales, Buxales, and Gunneridae (core eudicots), but complete plastome analyses do not resolve the relationships among these groups with strong support. However, plastid phylogenomic analyses have not included data for Tetracentron. To better resolve basal eudicot relationships and to clarify when the two extant genera of Trochodendrales diverged, we sequenced the complete plastid genome of Tetracentron sinense using Illumina technology. The Tetracentron and Trochodendron plastomes possess the typical gene content and arrangement that characterize most angiosperm plastid genomes, but both genomes have the same unusual ∼4 kb expansion of the inverted repeat region to include five genes (rpl22, rps3, rpl16, rpl14, and rps8) that are normally found in the large single-copy region. Maximum likelihood analyses of an 83-gene, 88-taxon angiosperm data set yield the same tree topology as previous plastid-based trees, and moderately support the sister relationship between Buxaceae and Gunneridae. Molecular dating analyses suggest that Tetracentron and Trochodendron diverged between 44 and 30 million years ago, which is congruent with the fossil record of Trochodendrales and with previous estimates of the divergence time of these two taxa. We also characterize 154 simple sequence repeat loci from the Tetracentron sinense and Trochodendron aralioides plastomes that will be useful in future studies of population genetic structure for these relict species, both of which are of conservation concern.

  2. Comparative molecular cytogenetics of major repetitive sequence families of three Dendrobium species (Orchidaceae) from Bangladesh (United States)

    Begum, Rabeya; Alam, Sheikh Shamimul; Menzel, Gerhard; Schmidt, Thomas


    Background and Aims Dendrobium species show tremendous morphological diversity and have broad geographical distribution. As repetitive sequence analysis is a useful tool to investigate the evolution of chromosomes and genomes, the aim of the present study was the characterization of repetitive sequences from Dendrobium moschatum for comparative molecular and cytogenetic studies in the related species Dendrobium aphyllum, Dendrobium aggregatum and representatives from other orchid genera. Methods In order to isolate highly repetitive sequences, a c0t-1 DNA plasmid library was established. Repeats were sequenced and used as probes for Southern hybridization. Sequence divergence was analysed using bioinformatic tools. Repetitive sequences were localized along orchid chromosomes by fluorescence in situ hybridization (FISH). Key Results Characterization of the c0t-1 library resulted in the detection of repetitive sequences including the (GA)n dinucleotide DmoO11, numerous Arabidopsis-like telomeric repeats and the highly amplified dispersed repeat DmoF14. The DmoF14 repeat is conserved in six Dendrobium species but diversified in representative species of three other orchid genera. FISH analyses showed the genome-wide distribution of DmoF14 in D. moschatum, D. aphyllum and D. aggregatum. Hybridization with the telomeric repeats demonstrated Arabidopsis-like telomeres at the chromosome ends of Dendrobium species. However, FISH using the telomeric probe revealed two pairs of chromosomes with strong intercalary signals in D. aphyllum. FISH showed the terminal position of 5S and 18S–5·8S–25S rRNA genes and a characteristic number of rDNA sites in the three Dendrobium species. Conclusions The repeated sequences isolated from D. moschatum c0t-1 DNA constitute major DNA families of the D. moschatum, D. aphyllum and D. aggregatum genomes with DmoF14 representing an ancient component of orchid genomes. Large intercalary telomere-like arrays suggest chromosomal

  3. Mechanical processes with repeated attenuated impacts

    CERN Document Server

    Nagaev, R F


    This book is devoted to considering in the general case - using typical concrete examples - the motion of machines and mechanisms of impact and vibro-impact action accompanied by a peculiar phenomenon called "impact collapse". This phenomenon is that after the initial collision, a sequence of repeated gradually quickening collisions of decreasing-to-zero intensity occurs, with the final establishment of protracted contact between the interacting bodies. The initiation conditions of the impact collapse are determined and calculation techniques for the quantitative characteristics of the corresp

  4. Genetic Diversity of Arabica Coffee (Coffea arabica L.) in Nicaragua as Estimated by Simple Sequence Repeat Markers

    Directory of Open Access Journals (Sweden)

    Mulatu Geleta


    Coffea arabica L. (arabica coffee), the only tetraploid species in the genus Coffea, represents the majority of the world’s coffee production and has a significant contribution to Nicaragua’s economy. The present study was conducted to determine the genetic diversity of arabica coffee in Nicaragua for its conservation and breeding values. Twenty-six populations that represent eight varieties in Nicaragua were investigated using simple sequence repeat (SSR) markers. A total of 24 alleles were obtained from the 12 loci investigated across 260 individual plants. The total Nei’s gene diversity (HT) and the within-population gene diversity (HS) were 0.35 and 0.29, respectively, which is comparable with that previously reported from other countries and regions. Among the varieties, the highest diversity was recorded in the variety Catimor. Analysis of molecular variance (AMOVA) revealed that about 87% of the total genetic variation was found within populations and the remaining 13% differentiate the populations (FST=0.13; P<0.001). The variation among the varieties was also significant. The genetic variation in Nicaraguan coffee is significant enough to be used in the breeding programs, and most of this variation can be conserved through ex situ conservation of a low number of populations from each variety.
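
    The quantities HT and HS reported in this record follow from allele frequencies. The sketch below is a simplified single-locus version that assumes equal weighting of populations and ignores sample-size corrections; the toy frequencies are invented for illustration. It computes gene diversity as one minus the sum of squared allele frequencies, and the coefficient GST = (HT - HS) / HT.

```python
def gene_diversity(freqs):
    """Nei's gene diversity for one locus: 1 - sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def nei_partition(pop_freqs):
    """Return (HT, HS, GST) for one locus from per-population frequencies.

    `pop_freqs` holds one list of allele frequencies per population, with
    alleles in the same order; populations are weighted equally.
    """
    n_pops = len(pop_freqs)
    hs = sum(gene_diversity(f) for f in pop_freqs) / n_pops
    mean_freqs = [sum(column) / n_pops for column in zip(*pop_freqs)]
    ht = gene_diversity(mean_freqs)
    gst = (ht - hs) / ht if ht > 0 else 0.0
    return ht, hs, gst


# Toy example with two populations and two alleles (hypothetical values).
ht, hs, gst = nei_partition([[0.8, 0.2], [0.4, 0.6]])
print(round(ht, 3), round(hs, 3), round(gst, 3))

# Plugging in the reported totals directly: (0.35 - 0.29) / 0.35
print(round((0.35 - 0.29) / 0.35, 2))
```

    With the reported HT and HS the ratio comes to about 0.17. This is a different estimator from the AMOVA-based FST of 0.13 quoted in the abstract, so the two values need not coincide.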

  5. Long-read sequencing and de novo assembly of a Chinese genome (United States)

    Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arr...

  6. Genetic variability in Brazilian populations of Biomphalaria straminea complex detected by simple sequence repeat anchored polymerase chain reaction amplification

    Directory of Open Access Journals (Sweden)

    Caldeira Roberta L


    Biomphalaria glabrata, B. tenagophila and B. straminea are intermediate hosts of Schistosoma mansoni in Brazil. The latter is of epidemiological importance in the northwest of Brazil and, due to morphological similarities, has been grouped with B. intermedia and B. kuhniana in a complex named B. straminea. In the current work, we have standardized the simple sequence repeat anchored polymerase chain reaction (SSR-PCR) technique, using the primers (CA)8RY and K7, to study the genetic variability of these species. The similarity level was calculated using the Dice coefficient and genetic distance using the Nei and Li coefficient. The trees were obtained by the UPGMA and neighbor-joining methods. We have observed that the most related individuals belong to the same species and locality and that individuals from different localities, but of the same species, present clear heterogeneity. The trees generated using both methods showed similar topologies. The SSR-PCR technique was shown to be very efficient in intrapopulational and intraspecific studies of the B. straminea complex snails.
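
    The Dice similarity used for banding patterns such as these can be sketched as follows. Each individual is represented by the set of bands scored as present; the band sizes are invented for illustration, and the distance is taken as one minus the similarity, which is an assumption about how the coefficients were converted rather than a detail stated in the abstract.

```python
def dice_similarity(bands_a, bands_b):
    """Dice coefficient: 2 * shared bands / (bands in A + bands in B)."""
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        return 1.0  # two empty profiles are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def band_distance(bands_a, bands_b):
    """Distance derived from band sharing (assumed here to be 1 - similarity)."""
    return 1.0 - dice_similarity(bands_a, bands_b)


# Hypothetical band sizes (bp) scored as present in two snails.
snail_1 = {300, 450, 600, 800}
snail_2 = {300, 450, 700}
print(dice_similarity(snail_1, snail_2))
print(band_distance(snail_1, snail_2))
```

    A matrix of such pairwise distances is the usual input to clustering methods like UPGMA and neighbor-joining, the two tree-building methods named in the record.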

  7. Sequencing of BAC pools by different next generation sequencing platforms and strategies

    Directory of Open Access Journals (Sweden)

    Scholz Uwe


    Background: Next generation sequencing of BACs is a viable option for deciphering the sequence of even large and highly repetitive genomes. In order to optimize this strategy, we examined the influence of read length on the quality of Roche/454 sequence assemblies, to what extent Illumina/Solexa mate pairs (MPs) improve the assemblies by scaffolding and whether barcoding of BACs is dispensable. Results: Sequencing four BACs with both FLX and Titanium technologies revealed similar sequencing accuracy, but showed that the longer Titanium reads produce considerably fewer misassemblies and gaps. The 454 assemblies of 96 barcoded BACs were improved by scaffolding 79% of the total contig length with MPs from a non-barcoded library. Assembly of the unmasked 454 sequences without separation by barcodes revealed chimeric contig formation to be a major problem, encompassing 47% of the total contig length. Masking the sequences reduced this fraction to 24%. Conclusion: Optimal BAC pool sequencing should be based on the longest available reads, with barcoding essential for a comprehensive assessment of both repetitive and non-repetitive sequence information. When interest is restricted to non-repetitive regions and repeats are masked prior to assembly, barcoding is non-essential. In any case, the assemblies can be improved considerably by scaffolding with non-barcoded BAC pool MPs.

  8. Electricity sequence control

    International Nuclear Information System (INIS)

    Shin, Heung Ryeol


    The book covers: an introduction to control systems, including their classification and control signals; an introduction to electrical switches, such as push-button switches, detection switches, and inductive and capacitive sensors; control machinery; solenoid valves; the representation of sequences and the types of electrical circuits, using diagrams, time charts, markings, and terms; logic circuits such as Yes, No, And, Or, and equivalence logic; basic electrical circuits; electrical sequence control; added conditions; special program control, including program selection and jumps; motor control; additional circuits such as repeat circuits and pause circuits in a conveyor; and safety regulations and rules covering the classification of electrical accidents and protective devices for insulation.

  9. Molecular characterization of three novel Fanconi anemia mutations in Israeli Arabs. (United States)

    Tamary, Hannah; Dgany, Orly; Toledano, Helen; Shalev, Zvi; Krasnov, Tatyana; Shalmon, Lea; Schechter, Tali; Bercovich, Dani; Attias, Dina; Laor, Ruth; Koren, Ariel; Yaniv, Isaac


    In a previous study, we investigated the molecular basis of Fanconi anemia (FA) in 13 unrelated Israeli Jewish FA patients and identified four ethnicity specific mutations. In the present study we extended our study to Israeli Arab patients. We studied three consanguineous families with nine FA patients and an additional unrelated patient. DNA single-strand conformation polymorphism of each exon of the FANCA and FANCG genes was followed by sequence analysis of the aberrantly migrating fragments and by reverse transcriptase-polymerase chain reaction (RT-PCR) analysis of the splice-site mutations identified. Three unique disease-causing mutations were identified: (i) FANCA gross deletion of exons 6-31; (ii) FANCA splice-site mutation IVS 42-2A>C; (iii) FANCG splice-site mutation IVS4+3A>G. Sequence analysis of the FANCA gross deletion revealed recombination between two highly homologous Alu elements. cDNA analysis of the two splice mutations suggested intron 42 retention in FANCA IVS 42-2A>C and exon 4 skipping in FANCG IVS4+3A>G. The clinical condition of eight patients with FANCA mutations was severe. Two unique FANCA mutations and one FANCG mutation were identified in Israeli Arab FA patients. Deletion of FANCA exon 6-31 as in previously described gross deletions was within introns rich in Alu repeats. To the best of our knowledge, the FANCA IVS 42-2A>C mutation is the first in this gene to result in intron retention. Further analysis of FA mutations will enable prenatal diagnosis and a rational therapeutic approach including frequent monitoring and early bone marrow transplantation. Copyright Blackwell Munksgaard 2004.

  10. Heterogeneity of the Epstein-Barr Virus (EBV) Major Internal Repeat Reveals Evolutionary Mechanisms of EBV and a Functional Defect in the Prototype EBV Strain B95-8. (United States)

    Ba Abdullah, Mohammed M; Palermo, Richard D; Palser, Anne L; Grayson, Nicholas E; Kellam, Paul; Correia, Samantha; Szymula, Agnieszka; White, Robert E


    Epstein-Barr virus (EBV) is a ubiquitous pathogen of humans that can cause several types of lymphoma and carcinoma. Like other herpesviruses, EBV has diversified through both coevolution with its host and genetic exchange between virus strains. Sequence analysis of the EBV genome is unusually challenging because of the large number and lengths of repeat regions within the virus. Here we describe the sequence assembly and analysis of the large internal repeat 1 of EBV (IR1; also known as the BamW repeats) for more than 70 strains. The diversity of the latency protein EBV nuclear antigen leader protein (EBNA-LP) resides predominantly within the exons downstream of IR1. The integrity of the putative BWRF1 open reading frame (ORF) is retained in over 80% of strains, and deletions truncating IR1 always spare BWRF1. Conserved regions include the IR1 latency promoter (Wp) and one zone upstream of and two within BWRF1. IR1 is heterogeneous in 70% of strains, and this heterogeneity arises from sequence exchange between strains as well as from spontaneous mutation, with interstrain recombination being more common in tumor-derived viruses. This genetic exchange often incorporates regions of Epstein-Barr virus (EBV) infects the majority of the world population but causes illness in only a small minority of people. Nevertheless, over 1% of cancers worldwide are attributable to EBV. Recent sequencing projects investigating virus diversity to see if different strains have different disease impacts have excluded regions of repeating sequence, as they are more technically challenging. Here we analyze the sequence of the largest repeat in EBV (IR1). We first characterized the variations in protein sequences encoded across IR1. In studying variations within the repeat of each strain, we identified a mutation in the main laboratory strain of EBV that impairs virus function, and we suggest that tumor-associated viruses may be more likely to contain DNA mixed from two strains. The

  11. Unusually effective microRNA targeting within repeat-rich coding regions of mammalian mRNAs (United States)

    Schnall-Levin, Michael; Rissland, Olivia S.; Johnston, Wendy K.; Perrimon, Norbert; Bartel, David P.; Berger, Bonnie


    MicroRNAs (miRNAs) regulate numerous biological processes by base-pairing with target messenger RNAs (mRNAs), primarily through sites in 3′ untranslated regions (UTRs), to direct the repression of these targets. Although miRNAs have sometimes been observed to target genes through sites in open reading frames (ORFs), large-scale studies have shown such targeting to be generally less effective than 3′ UTR targeting. Here, we show that several miRNAs each target significant groups of genes through multiple sites within their coding regions. This ORF targeting, which mediates both predictable and effective repression, arises from highly repeated sequences containing miRNA target sites. We show that such sequence repeats largely arise through evolutionary duplications and occur particularly frequently within families of paralogous C2H2 zinc-finger genes, suggesting the potential for their coordinated regulation. Examples of ORFs targeted by miR-181 include both the well-known tumor suppressor RB1 and RBAK, encoding a C2H2 zinc-finger protein and transcriptional binding partner of RB1. Our results indicate a function for repeat-rich coding sequences in mediating post-transcriptional regulation and reveal circumstances in which miRNA-mediated repression through ORF sites can be reliably predicted. PMID:21685129

  12. In Silico Genome Comparison and Distribution Analysis of Simple Sequences Repeats in Cassava

    Directory of Open Access Journals (Sweden)

    Andrea Vásquez


    We conducted an SSR density analysis in different cassava genomic regions. The information obtained was useful to establish comparisons between cassava’s SSR genomic distribution and those of poplar, flax, and Jatropha. In general, cassava has a low SSR density (~50 SSRs/Mbp) and has a high proportion of pentanucleotides (24.2 SSRs/Mbp). It was found that coding sequences have 15.5 SSRs/Mbp, introns have 82.3 SSRs/Mbp, 5′ UTRs have 196.1 SSRs/Mbp, and 3′ UTRs have 50.5 SSRs/Mbp. Through motif analysis of cassava’s genome SSRs, the most abundant motif was AT/AT while in intron sequences and UTR regions it was AG/CT. In addition, in coding sequences the motif AAG/CTT was also found to occur most frequently; in fact, it is the third most used codon in cassava. Sequences containing SSRs were classified according to their functional annotation of Gene Ontology categories. The SSRs identified here may be a valuable addition for genetic mapping and future studies in phylogenetic analyses and genomic evolution.

  13. The decorin sequence SYIRIADTNIT binds collagen type I

    DEFF Research Database (Denmark)

    Kalamajski, Sebastian; Aspberg, Anders; Oldberg, Ake


    Decorin belongs to the small leucine-rich repeat proteoglycan family, interacts with fibrillar collagens, and regulates the assembly, structure, and biomechanical properties of connective tissues. The decorin-collagen type I-binding region is located in leucine-rich repeats 5-6. Site-directed mutagenesis of this 54-residue-long collagen-binding sequence identifies Arg-207 and Asp-210 in leucine-rich repeat 6 as crucial for the binding to collagen. The synthetic peptide SYIRIADTNIT, which includes Arg-207 and Asp-210, inhibits the binding of full-length recombinant decorin to collagen in vitro. These collagen-binding amino acids are exposed on the exterior of the beta-sheet-loop structure of the leucine-rich repeat. This resembles the location of interacting residues in other leucine-rich repeat proteins.

  14. Distribution and Evolution of Yersinia Leucine-Rich Repeat Proteins (United States)

    Hu, Yueming; Huang, He; Hui, Xinjie; Cheng, Xi; White, Aaron P.


    Leucine-rich repeat (LRR) proteins are widely distributed in bacteria, playing important roles in various protein-protein interaction processes. In Yersinia, the well-characterized type III secreted effector YopM also belongs to the LRR protein family and is encoded by virulence plasmids. However, little has been known about other LRR members encoded by Yersinia genomes or their evolution. In this study, the Yersinia LRR proteins were comprehensively screened, categorized, and compared. The LRR proteins encoded by chromosomes (LRR1 proteins) appeared to be more similar to each other and different from those encoded by plasmids (LRR2 proteins) with regard to repeat-unit length, amino acid composition profile, and gene expression regulation circuits. LRR1 proteins were also different from LRR2 proteins in that the LRR1 proteins contained an E3 ligase domain (NEL domain) in the C-terminal region or an NEL domain-encoding nucleotide relic in flanking genomic sequences. The LRR1 protein-encoding genes (LRR1 genes) varied dramatically and were categorized into 4 subgroups (a to d), with the LRR1a to -c genes evolving from the same ancestor and LRR1d genes evolving from another ancestor. The consensus and ancestor repeat-unit sequences were inferred for different LRR1 protein subgroups by use of a maximum parsimony modeling strategy. Structural modeling disclosed very similar repeat-unit structures between LRR1 and LRR2 proteins despite the different unit lengths and amino acid compositions. Structural constraints may serve as the driving force to explain the observed mutations in the LRR regions. This study suggests that there may be functional variation and lays the foundation for future experiments investigating the functions of the chromosomally encoded LRR proteins of Yersinia. PMID:27217422

  15. New polymorphisms within the variable number tandem repeat (VNTR) 7 locus of Mycobacterium avium subsp. paratuberculosis. (United States)

    Fawzy, Ahmad; Zschöck, Michael; Ewers, Christa; Eisenberg, Tobias


    Variable number tandem repeat (VNTR) is a frequently employed typing method of Mycobacterium avium paratuberculosis (MAP) isolates. Based on whole genome sequencing in a previous study, allelic diversity at some VNTR loci seems to over- or under-estimate the actual phylogenetic variance among isolates. Interestingly, two closely related isolates on one farm showed polymorphism at the VNTR 7 locus, raising concerns about the misleading role that it might play in genotyping. We aimed to investigate the underlying basis of VNTR 7-polymorphism by analyzing sequence data for published genomes and field isolates of MAP and other M. avium complex (MAC) members. In contrast to MAP strains from cattle, strains from sheep displayed an "imperfect" repeat within VNTR 7, which was identical to respective allele types in other MAC genomes. Subspecies- and strain-specific single nucleotide polymorphisms (SNPs) and two novel (16 and 56 bp) repeats were detected. Given the combination of the three existing repeats, there are at least five different patterns for VNTR 7. The present findings highlight a higher polymorphism and probable instability of VNTR 7 locus that needs to be considered and challenged in future studies. Until then, sequencing of this locus in future studies is important to correctly assign the underlying allele types. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Sequence-Based Analysis of Structural Organization and Composition of the Cultivated Sunflower (Helianthus annuus L.) Genome (United States)

    Gill, Navdeep; Buti, Matteo; Kane, Nolan; Bellec, Arnaud; Helmstetter, Nicolas; Berges, Hélène; Rieseberg, Loren H.


    Sunflower is an important oilseed crop, as well as a model system for evolutionary studies, but its 3.6 gigabase genome has proven difficult to assemble, in part because of the high repeat content of its genome. Here we report on the sequencing, assembly, and analyses of 96 randomly chosen BACs from sunflower to provide additional information on the repeat content of the sunflower genome, assess how repetitive elements in the sunflower genome are organized relative to genes, and compare the genomic distribution of these repeats to that found in other food crops and model species. We also examine the expression of transposable element-related transcripts in EST databases for sunflower to determine the representation of repeats in the transcriptome and to measure their transcriptional activity. Our data confirm previous reports in suggesting that the sunflower genome is >78% repetitive. Sunflower repeats share very little similarity to other plant repeats such as those of Arabidopsis, rice, maize and wheat; overall 28% of repeats are “novel” to sunflower. The repetitive sequences appear to be randomly distributed within the sequenced BACs. Assuming the 96 BACs are representative of the genome as a whole, then approximately 5.2% of the sunflower genome comprises non TE-related genic sequence, with an average gene density of 18kbp/gene. Expression levels of these transposable elements indicate tissue specificity and differential expression in vegetative and reproductive tissues, suggesting that expressed TEs might contribute to sunflower development. The assembled BACs will also be useful for assessing the quality of several different draft assemblies of the sunflower genome and for annotating the reference sequence. PMID:24833511

  17. Sequence-Based Analysis of Structural Organization and Composition of the Cultivated Sunflower (Helianthus annuus L.) Genome

    Directory of Open Access Journals (Sweden)

    Navdeep Gill


    Full Text Available Sunflower is an important oilseed crop, as well as a model system for evolutionary studies, but its 3.6 gigabase genome has proven difficult to assemble, in part because of the high repeat content of its genome. Here we report on the sequencing, assembly, and analyses of 96 randomly chosen BACs from sunflower to provide additional information on the repeat content of the sunflower genome, assess how repetitive elements in the sunflower genome are organized relative to genes, and compare the genomic distribution of these repeats to that found in other food crops and model species. We also examine the expression of transposable element-related transcripts in EST databases for sunflower to determine the representation of repeats in the transcriptome and to measure their transcriptional activity. Our data confirm previous reports in suggesting that the sunflower genome is >78% repetitive. Sunflower repeats share very little similarity to other plant repeats such as those of Arabidopsis, rice, maize and wheat; overall 28% of repeats are “novel” to sunflower. The repetitive sequences appear to be randomly distributed within the sequenced BACs. Assuming the 96 BACs are representative of the genome as a whole, then approximately 5.2% of the sunflower genome comprises non TE-related genic sequence, with an average gene density of 18kbp/gene. Expression levels of these transposable elements indicate tissue specificity and differential expression in vegetative and reproductive tissues, suggesting that expressed TEs might contribute to sunflower development. The assembled BACs will also be useful for assessing the quality of several different draft assemblies of the sunflower genome and for annotating the reference sequence.

  18. Nucleotide sequence of soybean chloroplast DNA regions which contain the psb A and trn H genes and cover the ends of the large single copy region and one end of the inverted repeats. (United States)

    Spielmann, A; Stutz, E


    The soybean chloroplast psb A gene (photosystem II thylakoid membrane protein of Mr 32 000, lysine-free) and the trn H gene (tRNAHisGUG), which both map in the large single copy region adjacent to one of the inverted repeat structures (IR1), have been sequenced including flanking regions. The psb A gene shows in its structural part 92% sequence homology with the corresponding genes of spinach and N. debneyi and also contains an open reading frame for 353 amino acids. The amino acid sequence of a potential primary translation product (calculated Mr, 38 904, no lysine) diverges from that of spinach and N. debneyi in only two positions in the C-terminal part. The trn H gene has the same polarity as the psb A gene and the coding region is located at the very end of the large single copy region. The deduced sequence of the soybean chloroplast tRNAHisGUG is identical with that of Zea mays chloroplasts. Both ends of the large single copy region were sequenced including a small segment of the adjacent IR1 and IR2.

  19. Comparative genomics and repetitive sequence divergence in the species of diploid Nicotiana section Alatae. (United States)

    Lim, K Yoong; Kovarik, Ales; Matyasek, Roman; Chase, Mark W; Knapp, Sandra; McCarthy, Elizabeth; Clarkson, James J; Leitch, Andrew R


    Combining phylogenetic reconstructions of species relationships with comparative genomic approaches is a powerful way to decipher evolutionary events associated with genome divergence. Here, we reconstruct the history of karyotype and tandem repeat evolution in species of diploid Nicotiana section Alatae. By analysis of plastid DNA, we resolved two clades with high bootstrap support, one containing N. alata, N. langsdorffii, N. forgetiana and N. bonariensis (called the n = 9 group) and another containing N. plumbaginifolia and N. longiflora (called the n = 10 group). Despite little plastid DNA sequence divergence, we observed, via fluorescent in situ hybridization, substantial chromosomal repatterning, including altered chromosome numbers, structure and distribution of repeats. Effort was focussed on 35S and 5S nuclear ribosomal DNA (rDNA) and the HRS60 satellite family of tandem repeats comprising the elements HRS60, NP3R and NP4R. We compared divergence of these repeats in diploids and polyploids of Nicotiana. There are dramatic shifts in the distribution of the satellite repeats and complete replacement of intergenic spacers (IGSs) of 35S rDNA associated with divergence of the species in section Alatae. We suggest that sequence homogenization has replaced HRS60 family repeats at sub-telomeric regions, but that this process may not occur, or occurs more slowly, when the repeats are found at intercalary locations. Sequence homogenization acts more rapidly (at least two orders of magnitude) on 35S rDNA than 5S rDNA and sub-telomeric satellite sequences. This rapid rate of divergence is analogous to that found in polyploid species, and is therefore, in plants, not only associated with polyploidy.

  20. Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine (United States)

    Craig S Echt; Surya Saha; Dennis L Deemer; C Dana Nelson


    Genomic DNA sequence databases are a potential and growing resource for simple sequence repeat (SSR) marker development in loblolly pine (Pinus taeda L.). Loblolly pine also has many expressed sequence tags (ESTs) available for microsatellite (SSR) marker development. We compared loblolly pine SSR densities in genome survey sequences (GSSs) to those in non-redundant...

  1. Identifying uniformly mutated segments within repeats. (United States)

    Sahinalp, S Cenk; Eichler, Evan; Goldberg, Paul; Berenbrink, Petra; Friedetzky, Tom; Ergun, Funda


    Given a long string of characters from a constant size alphabet we present an algorithm to determine whether its characters have been generated by a single i.i.d. random source. More specifically, consider all possible n-coin models for generating a binary string S, where each bit of S is generated via an independent toss of one of the n coins in the model. The choice of which coin to toss is decided by a random walk on the set of coins where the probability of a coin change is much lower than the probability of using the same coin repeatedly. We present a procedure to evaluate the likelihood of an n-coin model for given S, subject to a uniform prior distribution over the parameters of the model (that represent mutation rates and probabilities of copying events). In the absence of detailed prior knowledge of these parameters, the algorithm can be used to determine whether the a posteriori probability for n=1 is higher than for any other n>1. Our algorithm runs in time O(l^4 log l), where l is the length of S, through a dynamic programming approach which exploits the assumed convexity of the a posteriori probability for n. Our test can be used in the analysis of long alignments between pairs of genomic sequences in a number of ways. For example, functional regions in genome sequences exhibit much lower mutation rates than non-functional regions. Because our test provides means for determining variations in the mutation rate, it may be used to distinguish functional regions from non-functional ones. Another application is in determining whether two highly similar, thus evolutionarily related, genome segments are the result of a single copy event or of a complex series of copy events. This is particularly an issue in evolutionary studies of genome regions rich with repeat segments (especially tandemly repeated segments).
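
    To make the n=1 case concrete, the following is a minimal Python sketch, not the authors' dynamic programming algorithm. It assumes a uniform prior on a single coin's bias, under which the marginal likelihood of a binary string with k ones out of l bits is k!(l-k)!/(l+1)!. The function names and the fixed-cut comparison are hypothetical and ignore the coin-change probabilities of the full model.

```python
from math import exp, lgamma


def log_marginal_one_coin(bits: str) -> float:
    """Log marginal likelihood of a 0/1 string under one coin, uniform prior on its bias."""
    l = len(bits)
    k = bits.count("1")
    # Integral of p^k (1-p)^(l-k) over [0, 1] equals k! (l-k)! / (l+1)!
    return lgamma(k + 1) + lgamma(l - k + 1) - lgamma(l + 2)


def log_marginal_segments(bits: str, cuts: list[int]) -> float:
    """Same quantity when each segment between the given cut points has its own coin.

    Illustrative assumption: segments are independent and the cost of
    switching coins is ignored.
    """
    bounds = [0, *cuts, len(bits)]
    return sum(log_marginal_one_coin(bits[a:b]) for a, b in zip(bounds, bounds[1:]))


s = "0000011111"
print(exp(log_marginal_one_coin(s)))       # one coin for the whole string
print(exp(log_marginal_segments(s, [5])))  # separate coins for each half
```

    For this toy string the two-segment score is higher, which is the kind of signal a test for non-uniform mutation would look for.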

  2. Giardia telomeric sequence d(TAGGG)4 forms two intramolecular G-quadruplexes in K+ solution: effect of loop length and sequence on the folding topology. (United States)

    Hu, Lanying; Lim, Kah Wai; Bouaziz, Serge; Phan, Anh Tuân


    Recently, it has been shown that in K(+) solution the human telomeric sequence d[TAGGG(TTAGGG)(3)] forms a (3 + 1) intramolecular G-quadruplex, while the Bombyx mori telomeric sequence d[TAGG(TTAGG)(3)], which differs from the human counterpart only by one G deletion in each repeat, forms a chair-type intramolecular G-quadruplex, indicating an effect of G-tract length on the folding topology of G-quadruplexes. To explore the effect of loop length and sequence on the folding topology of G-quadruplexes, here we examine the structure of the four-repeat Giardia telomeric sequence d[TAGGG(TAGGG)(3)], which differs from the human counterpart only by one T deletion within the non-G linker in each repeat. We show by NMR that this sequence forms two different intramolecular G-quadruplexes in K(+) solution. The first one is a novel basket-type antiparallel-stranded G-quadruplex containing two G-tetrads, a G x (A-G) triad, and two A x T base pairs; the three loops are consecutively edgewise-diagonal-edgewise. The second one is a propeller-type parallel-stranded G-quadruplex involving three G-tetrads; the three loops are all double-chain-reversal. Recurrence of several structural elements in the observed structures suggests a "cut and paste" principle for the design and prediction of G-quadruplex topologies, for which different elements could be extracted from one G-quadruplex and inserted into another.

  3. Repeat-containing protein effectors of plant-associated organisms

    Directory of Open Access Journals (Sweden)

    Carl H. Mesarich


    Full Text Available Many plant-associated organisms, including microbes, nematodes, and insects, deliver effector proteins into the apoplast, vascular tissue, or cell cytoplasm of their prospective hosts. These effectors function to promote colonization, typically by altering host physiology or by modulating host immune responses. The same effectors, however, can also trigger host immunity in the presence of cognate host immune receptor proteins, and thus prevent colonization. To circumvent effector-triggered immunity, or to further enhance host colonization, plant-associated organisms often rely on adaptive effector evolution. In recent years, it has become increasingly apparent that several effectors of plant-associated organisms are repeat-containing proteins (RCPs) that carry tandem or non-tandem arrays of an amino acid sequence or structural motif. In this review, we highlight the diverse roles that these repeat domains play in RCP effector function. We also draw attention to the potential role of these repeat domains in adaptive evolution with regards to RCP effector function and the evasion of effector-triggered immunity. The aim of this review is to increase the profile of RCP effectors from plant-associated organisms.

  4. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien


    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  5. Analysis of genetic diversity and population structure of oil palm (Elaeis guineensis) from China and Malaysia based on species-specific simple sequence repeat markers. (United States)

    Zhou, L X; Xiao, Y; Xia, W; Yang, Y D


    Genetic diversity and patterns of population structure of the 94 oil palm lines were investigated using species-specific simple sequence repeat (SSR) markers. We designed primers for 63 SSR loci based on their flanking sequences and conducted amplification in 94 oil palm DNA samples. The amplification result showed that a relatively high level of genetic diversity was observed between oil palm individuals according to a set of 21 polymorphic microsatellite loci. The observed heterozygosity (Ho) was 0.3683 and 0.4035, with an average of 0.3859. The Ho value was a reliable determinant of the discriminatory power of the SSR primer combinations. The principal component analysis and unweighted pair-group method with arithmetic averaging cluster analysis showed the 94 oil palm lines were grouped into one cluster. These results demonstrated that the oil palm in Hainan Province of China and the germplasm introduced from Malaysia may be from the same source. The SSR protocol was effective and reliable for assessing the genetic diversity of oil palm. Knowledge of the genetic diversity and population structure will be crucial for establishing appropriate management stocks for this species.

  6. Sequence-influenced interactions of oligoacridines with DNA detected by retarded gel electrophoretic migrations

    International Nuclear Information System (INIS)

    Nielsen, P.E.; Zhen, W.; Henriksen, U.; Buchardt, O.


    The authors have found that di-, tri-, tetra-, and hexa-9-acridinylamines are so efficiently associated with DNA during electrophoresis in polyacrylamide or agarose gels that they retard its migration. The retardation is roughly proportional to the reagent to base pair ratio, and the magnitude of the retardation indicates that a combined charge neutralization/helix extension mechanism is mainly responsible for the effect. Furthermore, DNA sequence dependent differences are observed. Thus, the pUC 19 restriction fragments (HaeIII or AluI), which in the native state comigrate upon gel electrophoretic analysis, could be separated in the presence of a diacridine, and specific DNA fragments responded differently to different diacridines. These results suggest that the effect also is due to a contribution from the DNA conformation and that the DNA conformation dynamics are influenced differently upon binding of different diacridines. They foresee three applications of this observation: (1) in analytical gel electrophoretic separation of otherwise comigrating DNA molecules, (2) in studies of polyintercalator-DNA interaction, and (3) in measurements of polyintercalator-induced DNA unwinding

  7. Determination of allele frequencies in nine short tandem repeat loci ...

    African Journals Online (AJOL)



    Apr 17, 2008 ... out the human genome. These loci are a rich source of highly polymorphic markers that may be detected using the polymerase chain reaction (PCR). PCR is a mimic of the normal cellular process of replication of DNA molecules. Each STR is distinguished by the number of times a sequence is repeated, ...
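
    As an illustration of how an STR allele is characterised by its repeat number, here is a minimal Python sketch. The function name, the example motif and the example sequence are hypothetical and are not taken from the study. It scans every start position so that self-overlapping motifs are also counted correctly.

```python
def max_tandem_copies(seq: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif found in seq."""
    if not motif:
        raise ValueError("motif must be non-empty")
    m = len(motif)
    best = 0
    for start in range(len(seq)):
        copies = 0
        # Count consecutive copies of the motif beginning at this position.
        while seq.startswith(motif, start + copies * m):
            copies += 1
        best = max(best, copies)
    return best


# Hypothetical allele: flanking bases around three GATA units.
print(max_tandem_copies("AA" + "GATA" * 3 + "CC", "GATA"))
```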

  8. Complete chloroplast genome of Trachelium caeruleum: extensive rearrangements are associated with repeats and tRNAs

    Energy Technology Data Exchange (ETDEWEB)

    Haberle, Rosemarie C.; Fourcade, Matthew L.; Boore, Jeffrey L.; Jansen, Robert K.


    Chloroplast genome structure, gene order and content are highly conserved in land plants. We sequenced the complete chloroplast genome sequence of Trachelium caeruleum (Campanulaceae) a member of an angiosperm family known for highly rearranged chloroplast genomes. The total genome size is 162,321 bp with an IR of 27,273 bp, LSC of 100,113 bp and SSC of 7,661 bp. The genome encodes 115 unique genes, with 19 duplicated in the IR, a tRNA (trnI-CAU) duplicated once in the LSC and a protein coding gene (psbJ) duplicated twice, for a total of 137 genes. Four genes (ycf15, rpl23, infA and accD) are truncated and likely nonfunctional; three others (clpP, ycf1 and ycf2) are so highly diverged that they may now be pseudogenes. The most conspicuous feature of the Trachelium genome is the presence of eighteen internally unrearranged blocks of genes that have been inverted or relocated within the genome, relative to the typical gene order of most angiosperm chloroplast genomes. Recombination between repeats or tRNAs has been suggested as two means of chloroplast genome rearrangements. We compared the relative number of repeats in Trachelium to eight other angiosperm chloroplast genomes, and evaluated the location of repeats and tRNAs in relation to rearrangements. Trachelium has the highest number and largest repeats, which are concentrated near inversion endpoints or other rearrangements. tRNAs occur at many but not all inversion endpoints. There is likely no single mechanism responsible for the remarkable number of alterations in this genome, but both repeats and tRNAs are clearly associated with these rearrangements. Land plant chloroplast genomes are highly conserved in structure, gene order and content. The chloroplast genomes of ferns, the gymnosperm Ginkgo, and most angiosperms are nearly collinear, reflecting the gene order in lineages that diverged from lycopsids and the ancestral chloroplast gene order over 350 million years ago (Raubeson and Jansen, 1992). Although earlier mapping studies

  9. Application of synthetic DNA probes to the analysis of DNA sequence variants in man

    International Nuclear Information System (INIS)

    Wallace, R.B.; Petz, L.D.; Yam, P.Y.


    Oligonucleotide probes provide a tool to discriminate between any two alleles on the basis of hybridization. Random sampling of the genome with different oligonucleotide probes should reveal polymorphism in a certain percentage of the cases. In the hope of identifying polymorphic regions more efficiently, we chose to take advantage of the proposed hypermutability of repeated DNA sequences and the specificity of oligonucleotide hybridization. Since, under appropriate conditions, oligonucleotide probes require complete base pairing for hybridization to occur, they will only hybridize to a subset of the members of a repeat family when all members of the family are not identical. The results presented here suggest that oligonucleotide hybridization can be used to extend the genomic sequences that can be tested for the presence of RFLPs. This expands the tools available to human genetics. In addition, the results suggest that repeated DNA sequences are indeed more polymorphic than single-copy sequences. 28 references, 2 figures

  10. Comparing whole-genome sequencing with Sanger sequencing for spa typing of methicillin-resistant Staphylococcus aureus. (United States)

    Bartels, Mette Damkjær; Petersen, Andreas; Worning, Peder; Nielsen, Jesper Boye; Larner-Svensson, Hanna; Johansen, Helle Krogh; Andersen, Leif Percival; Jarløv, Jens Otto; Boye, Kit; Larsen, Anders Rhod; Westh, Henrik


    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and an in-house analysis pipeline determines the spa types. Due to national surveillance, all MRSA isolates are sent to Statens Serum Institut, where the spa type is determined by PCR and Sanger sequencing. The purpose of this study was to evaluate the reliability of the spa types obtained by 150-bp paired-end Illumina WGS. MRSA isolates from new MRSA patients in 2013 (n = 699) in the capital region of Denmark were included. We found a 97% agreement between spa types obtained by the two methods. All isolates achieved a spa type by both methods. Nineteen isolates differed in spa types by the two methods, in most cases due to the lack of 24-bp repeats in the whole-genome-sequenced isolates. These related but incorrect spa types should have no consequence in outbreak investigations, since all epidemiologically linked isolates, regardless of spa type, will be included in the single nucleotide polymorphism (SNP) analysis. This will reveal the close relatedness of the spa types. In conclusion, our data show that WGS is a reliable method to determine the spa type of MRSA. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

  11. Outlier Loci and Selection Signatures of Simple Sequence Repeats (SSRs) in Flax (Linum usitatissimum L.). (United States)

    Soto-Cerda, Braulio J; Cloutier, Sylvie


    Genomic microsatellites (gSSRs) and expressed sequence tag-derived SSRs (EST-SSRs) have gained wide application for elucidating genetic diversity and population structure in plants. Both marker systems are assumed to be selectively neutral when making demographic inferences, but this assumption is rarely tested. In this study, three neutrality tests were assessed for identifying outlier loci among 150 SSRs (85 gSSRs and 65 EST-SSRs) that likely influence estimates of population structure in three differentiated flax sub-populations (FST = 0.19). Moreover, the utility of gSSRs, EST-SSRs, and the combined sets of SSRs was also evaluated in assessing genetic diversity and population structure in flax. Six outlier loci were identified by at least two neutrality tests showing footprints of balancing selection. After removing the outlier loci, the STRUCTURE analysis and the dendrogram topology of EST-SSRs improved. Conversely, gSSRs and combined SSRs results did not change significantly, possibly as a consequence of the higher number of neutral loci assessed. Taken together, the genetic structure analyses established the superiority of gSSRs to determine the genetic relationships among flax accessions, although the combined SSRs produced the best results. Genetic diversity parameters did not differ statistically (P > 0.05) between gSSRs and EST-SSRs, an observation partially explained by the similar number of repeat motifs. Our study provides new insights into the ability of gSSRs and EST-SSRs to measure genetic diversity and structure in flax and confirms the importance of testing for the occurrence of outlier loci to properly assess natural and breeding populations, particularly in studies considering only few loci.

  12. Recurrence time statistics: versatile tools for genomic DNA sequence analysis. (United States)

    Cao, Yinhe; Tung, Wen-Wen; Gao, J B


    With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.
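
    One simple reading of a recurrence time is the distance between successive occurrences of the same word of length k. The Python sketch below tallies those distances. It is an assumption about the general idea, and the paper's exact definition and its codon index may differ. The function name and the toy sequence are hypothetical.

```python
from collections import Counter


def recurrence_times(seq: str, k: int) -> Counter:
    """Histogram of gaps between successive occurrences of each k-mer in seq."""
    last_seen: dict[str, int] = {}
    gaps: Counter = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in last_seen:
            # Distance back to the previous occurrence of the same word.
            gaps[i - last_seen[word]] += 1
        last_seen[word] = i
    return gaps


# A perfect period-3 repeat puts all recurrences at gap 3.
print(recurrence_times("ACGACGACG", 3))
```

    Peaks in such a histogram point at periodicities, for example a tandem repeat unit length, which is the kind of repeat-related feature the abstract describes.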

  13. Estimating Genetic Conformism of Korean Mulberry Cultivars Using Random Amplified Polymorphic DNA and Inter-Simple Sequence Repeat Profiling

    Directory of Open Access Journals (Sweden)

    Sunirmal Sheet


    Full Text Available Apart from being fed to silkworms in sericulture, the ecologically important Mulberry plant has been used for traditional medicine in Asian countries as well as in manufacturing wine, food, and beverages. Germplasm analysis among Mulberry cultivars originating from South Korea is crucial in the plant breeding program for cultivar development. Hence, the genetic deviations and relations among 8 Morus alba plants, and one Morus lhou plant, of different cultivars collected from South Korea were investigated using 10 random amplified polymorphic DNA (RAPD) and 10 inter-simple sequence repeat (ISSR) markers in the present study. The ISSR markers exhibited a higher polymorphism (63.42%) among mulberry genotypes in comparison to RAPD markers. Furthermore, the similarity coefficient was estimated for both markers and found to vary between 0.183 and 0.814 for combined pooled data of ISSR and RAPD. The phenogram drawn using the UPGMA cluster method based on combined pooled data of RAPD and ISSR markers divided the nine mulberry genotypes into two divergent major groups and two individual independent accessions. The distant relationship between Dae-Saug (SM1) and SangchonJo Sang Saeng (SM5) offers a possibility of utilizing them in mulberry cultivar improvement of Morus species of South Korea.

  14. VLSI System Implementation of 200 MHz, 8-bit, 90nm CMOS Arithmetic and Logic Unit (ALU) Processor Controller

    Directory of Open Access Journals (Sweden)



    Full Text Available The present study covers the Very Large Scale Integration (VLSI) system implementation of a 200MHz, 8-bit, 90nm Complementary Metal Oxide Semiconductor (CMOS) Arithmetic and Logic Unit (ALU) processor control with a logic gate design style and 0.12µm six-metal 90nm CMOS fabrication technology. The system blocks and their behaviour are defined, and the logical design is implemented at gate level in the design phase. Then, the logic circuits are simulated and the subunits are converted into 90nm CMOS layout. Finally, to construct the VLSI system, these units are placed in the floor plan and simulated with analog and digital, logic and switch level simulators. The results of the simulations indicate that the VLSI system can control different instructions, which can be divided into subgroups: transfer instructions, arithmetic and logic instructions, rotate and shift instructions, branch instructions, input/output instructions, and control instructions. The data bus of the system is 16-bit. It runs at 200MHz, and operating power is 1.2V. In this paper, the parametric analysis of the system, the design steps and obtained results are explained.

  15. Evolutionary force of AT-rich repeats to trap genomic and episomal DNAs into the rice genome: lessons from endogenous pararetrovirus. (United States)

    Liu, Ruifang; Koyanagi, Kanako O; Chen, Sunlu; Kishima, Yuji


    In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus-like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also represented DNAs of unclear origins together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host-dependent manner. Conversely, other simple mono- and dinucleotide sequence repeats (SSR) were less frequently involved in insertion events relative to ATrs. Therefore, ATrs could be regarded as hot spots of double-strand breaks that induce non-homologous end joining. The insertions within ATrs occasionally generated new gene-related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore ATrs in plant genomes could be considered as genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.

  16. An infinitely expandable cloning strategy plus repeat-proof PCR for working with multiple shRNA.

    Directory of Open Access Journals (Sweden)

    Glen John McIntyre

    Vector construction with restriction enzymes (REs) typically involves the ligation of a digested donor fragment (insert) to a reciprocally digested recipient fragment (vector backbone). Creating a suitable cloning plan becomes increasingly difficult for complex strategies requiring repeated insertions, such as constructing multiple short hairpin RNA (shRNA) expression vectors for RNA interference (RNAi) studies. The problem lies in the reduced availability of suitable RE recognition sites with an increasing number of cloning events and/or vector size. This report details a technically simple, directional cloning solution using REs with compatible cohesive ends that are repeatedly destroyed and simultaneously re-introduced with each round of cloning. Donor fragments can be made by PCR or sub-cloned from pre-existing vectors and inserted ad infinitum in any combination. The design incorporates several cloning cores in order to be compatible with as many donor sequences as possible. We show that joining sub-combinations made in parallel is more time-efficient than sequential construction (of one cassette at a time) for any combination of 4 or more insertions. Screening for the successful construction of combinations using Taq polymerase based PCR became increasingly difficult with increasing number of repeated sequence elements. A Pfu polymerase based PCR was developed and successfully used to amplify combinations of up to eleven consecutive hairpin expression cassettes. The identified PCR conditions can be beneficial to others working with multiple shRNA or other repeated sequences, and the infinitely expandable cloning strategy serves as a general solution applicable to many cloning scenarios.

  17. Constructs for the expression of repeating triple-helical protein domains

    International Nuclear Information System (INIS)

    Peng, Yong Y; Werkmeister, Jerome A; Vaughan, Paul R; Ramshaw, John A M


    The development of novel scaffolds will be an important aspect in future success of tissue engineering. Scaffolds will preferably contain information that directs the cellular content of constructs so that the new tissue that is formed is closely aligned in structure, composition and function to the target natural tissue. One way of approaching this will be the development of novel protein-based constructs that contain one or more repeats of functional elements derived from various proteins. In the present case, we describe a strategy to make synthetic, recombinant triple-helical constructs that contain repeat segments of biologically relevant domains. Copies of a DNA fragment prepared by PCR from human type III collagen have been inserted in a co-linear contiguous fashion into the yeast expression vector YEpFlag-1, using sequential addition between selected restriction sites. Constructs containing 1, 2 and 3 repeats were designed to maintain the (Gly-X-Y) repeat, which is essential for the formation of an extended triple helix. All constructs gave expressed protein, with the best being the 3-repeat construct which was readily secreted. This material had the expected composition and N-terminal sequence. Incubation of the product at low temperature led to triple-helix formation, shown by reaction with a conformation dependent monoclonal antibody.

  18. Constructs for the expression of repeating triple-helical protein domains

    Energy Technology Data Exchange (ETDEWEB)

    Peng, Yong Y; Werkmeister, Jerome A; Vaughan, Paul R; Ramshaw, John A M, E-mail: jerome.werkmeister@csiro.a [CSIRO Molecular and Health Technologies, Bag 10, Clayton South, VIC 3169 (Australia)


    The development of novel scaffolds will be an important aspect in future success of tissue engineering. Scaffolds will preferably contain information that directs the cellular content of constructs so that the new tissue that is formed is closely aligned in structure, composition and function to the target natural tissue. One way of approaching this will be the development of novel protein-based constructs that contain one or more repeats of functional elements derived from various proteins. In the present case, we describe a strategy to make synthetic, recombinant triple-helical constructs that contain repeat segments of biologically relevant domains. Copies of a DNA fragment prepared by PCR from human type III collagen have been inserted in a co-linear contiguous fashion into the yeast expression vector YEpFlag-1, using sequential addition between selected restriction sites. Constructs containing 1, 2 and 3 repeats were designed to maintain the (Gly-X-Y) repeat, which is essential for the formation of an extended triple helix. All constructs gave expressed protein, with the best being the 3-repeat construct which was readily secreted. This material had the expected composition and N-terminal sequence. Incubation of the product at low temperature led to triple-helix formation, shown by reaction with a conformation dependent monoclonal antibody.

  19. Human tissue factor: cDNA sequence and chromosome localization of the gene

    International Nuclear Information System (INIS)

    Scarpati, E.M.; Wen, D.; Broze, G.J. Jr.; Miletich, J.P.; Flandermeyer, R.R.; Siegel, N.R.; Sadler, J.E.


    A human placenta cDNA library in λgt11 was screened for the expression of tissue factor antigens with rabbit polyclonal anti-human tissue factor immunoglobulin G. Among 4 million recombinant clones screened, one positive, λHTF8, expressed a protein that shared epitopes with authentic human brain tissue factor. The 1.1-kilobase cDNA insert of λHTF8 encoded a peptide that contained the amino-terminal protein sequence of human brain tissue factor. Northern blotting identified a major mRNA species of 2.2 kilobases and a minor species of ~3.2 kilobases in poly(A)+ RNA of placenta. Only 2.2-kilobase mRNA was detected in human brain and in the human monocytic U937 cell line. In U937 cells, the quantity of tissue factor mRNA was increased several fold by exposure of the cells to phorbol 12-myristate 13-acetate. Additional cDNA clones were selected by hybridization with the cDNA insert of λHTF8. These overlapping isolates span 2177 base pairs of the tissue factor cDNA sequence that includes a 5'-noncoding region of 75 base pairs, an open reading frame of 885 base pairs, a stop codon, a 3'-noncoding region of 1141 base pairs, and a poly(A) tail. The open reading frame encodes a 33-kilodalton protein of 295 amino acids. The predicted sequence includes a signal peptide of 32 or 34 amino acids, a probable extracellular factor VII binding domain of 217 or 219 amino acids, a transmembrane segment of 23 amino acids, and a cytoplasmic tail of 21 amino acids. There are three potential glycosylation sites with the sequence Asn-X-Thr/Ser. The 3'-noncoding region contains an inverted Alu family repetitive sequence. The tissue factor gene was localized to chromosome 1 by hybridization of the cDNA insert of λHTF8 to flow-sorted human chromosomes.
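
    The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below uses only numbers stated in the abstract; the pairing of the alternative domain lengths (32 with 219, 34 with 217) is an inference from the fact that only those pairings sum to 295 residues, and the function names are illustrative.

```python
# Figures quoted in the abstract (base pairs and amino-acid residues).
FIVE_PRIME_NONCODING = 75
OPEN_READING_FRAME = 885
STOP_CODON = 3
THREE_PRIME_NONCODING = 1141
REPORTED_SPAN = 2177
REPORTED_PROTEIN_LENGTH = 295

# (signal peptide, extracellular domain, transmembrane segment, cytoplasmic tail).
# The pairing of the alternatives is inferred, not stated in the abstract.
DOMAIN_LAYOUTS = [(32, 219, 23, 21), (34, 217, 23, 21)]


def codons_in(orf_bp):
    """Number of sense codons in an open reading frame given in base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("an open reading frame must be a multiple of 3 bp")
    return orf_bp // 3


def bases_not_itemised():
    """Bases in the reported span beyond the itemised regions.

    The abstract lists a poly(A) tail without giving its length, so this
    remainder is presumably (an assumption) the sequenced part of that tail.
    """
    itemised = (FIVE_PRIME_NONCODING + OPEN_READING_FRAME
                + STOP_CODON + THREE_PRIME_NONCODING)
    return REPORTED_SPAN - itemised


if __name__ == "__main__":
    print("codons:", codons_in(OPEN_READING_FRAME))
    print("bases not itemised:", bases_not_itemised())
    for layout in DOMAIN_LAYOUTS:
        print(layout, "->", sum(layout), "residues")
```

    The 885-bp reading frame corresponds to 295 codons, matching the stated protein length, and both domain layouts also account for all 295 residues.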

  20. detectIR: a novel program for detecting perfect and imperfect inverted repeats using complex numbers and vector calculation. (United States)

    Ye, Congting; Ji, Guoli; Li, Lei; Liang, Chun


    Inverted repeats are present in abundance in both prokaryotic and eukaryotic genomes and can form DNA secondary structures--hairpins and cruciforms that are involved in many important biological processes. Bioinformatics tools for efficient and accurate detection of inverted repeats are desirable, because existing tools are often less accurate and time consuming, sometimes incapable of dealing with genome-scale input data. Here, we present a MATLAB-based program called detectIR for the perfect and imperfect inverted repeat detection that utilizes complex numbers and vector calculation and allows genome-scale data inputs. A novel algorithm is adopted in detectIR to convert the conventional sequence string comparison in inverted repeat detection into vector calculation of complex numbers, allowing non-complementary pairs (mismatches) in the pairing stem and a non-palindromic spacer (loop or gaps) in the middle of inverted repeats. Compared with existing popular tools, our program performs with significantly higher accuracy and efficiency. Using genome sequence data from HIV-1, Arabidopsis thaliana, Homo sapiens and Zea mays for comparison, detectIR can find lots of inverted repeats missed by existing tools whose outputs often contain many invalid cases. detectIR is open source and its source code is freely available at:

  1. Structural analysis of a repetitive protein sequence motif in strepsirrhine primate amelogenin.

    Directory of Open Access Journals (Sweden)

    Rodrigo S Lacruz


    Strepsirrhines are members of a primate suborder that has a distinctive set of features associated with the development of the dentition. Amelogenin (AMEL), the better known of the enamel matrix proteins, forms 90% of the secreted organic matrix during amelogenesis. Although AMEL has been sequenced in numerous mammalian lineages, the only reported strepsirrhine AMEL sequences are those of the ring-tailed lemur and galago, which contain a set of additional proline-rich tandem repeats absent in all other primate species analyzed to date, but present in some non-primate mammals. Here, we first determined that these repeats are present in AMEL from three additional lemur species and thus are likely to be widespread throughout this group. To evaluate the functional relevance of these repeats in strepsirrhines, we engineered a mutated murine amelogenin sequence containing a similar proline-rich sequence to that of Lemur catta. In the monomeric form, the MQP insertions had no influence on the secondary structure or refolding properties, whereas in the assembled form, the insertions increased the hydrodynamic radii. We speculate that increased AMEL nanosphere size may influence enamel formation in strepsirrhine primates.

  2. Development and validation of a 36-gene sequencing assay for hereditary cancer risk assessment

    Directory of Open Access Journals (Sweden)

    Valentina S. Vysotskaia


    The past two decades have brought many important advances in our understanding of the hereditary susceptibility to cancer. Numerous studies have provided convincing evidence that identification of germline mutations associated with hereditary cancer syndromes can lead to reductions in morbidity and mortality through targeted risk management options. Additionally, advances in gene sequencing technology now permit the development of multigene hereditary cancer testing panels. Here, we describe the 2016 revision of the Counsyl Inherited Cancer Screen for detecting single-nucleotide variants (SNVs), short insertions and deletions (indels), and copy number variants (CNVs) in 36 genes associated with an elevated risk for breast, ovarian, colorectal, gastric, endometrial, pancreatic, thyroid, prostate, melanoma, and neuroendocrine cancers. To determine test accuracy and reproducibility, we performed a rigorous analytical validation across 341 samples, including 118 cell lines and 223 patient samples. The screen achieved 100% test sensitivity across different mutation types, with high specificity and 100% concordance with conventional Sanger sequencing and multiplex ligation-dependent probe amplification (MLPA). We also demonstrated the screen's high intra-run and inter-run reproducibility and robust performance on blood and saliva specimens. Furthermore, we showed that pathogenic Alu element insertions can be accurately detected by our test. Overall, the validation in our clinical laboratory demonstrated the analytical performance required for collecting and reporting genetic information related to risk of developing hereditary cancers.

  3. Identification and characterization of tandem repeats in exon III of dopamine receptor D4 (DRD4) genes from different mammalian species

    DEFF Research Database (Denmark)

    Larsen, Svend Arild; Mogensen, Line; Dietz, Rune


    repeat being found. In the domestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from mink and ferret. In the European otter we detected an 18-bp...

  4. Direct and inverted repeats elicit genetic instability by both exploiting and eluding DNA double-strand break repair systems in mycobacteria.

    Directory of Open Access Journals (Sweden)

    Ewelina A Wojcik

    Repetitive DNA sequences with the potential to form alternative DNA conformations, such as slipped structures and cruciforms, can induce genetic instability by promoting replication errors and by serving as a substrate for DNA repair proteins, which may lead to DNA double-strand breaks (DSBs). However, the contribution of each of the DSB repair pathways, homologous recombination (HR), non-homologous end-joining (NHEJ) and single-strand annealing (SSA), to this sort of genetic instability is not fully understood. Herein, we assessed the genome-wide distribution of repetitive DNA sequences in the Mycobacterium smegmatis, Mycobacterium tuberculosis and Escherichia coli genomes, and determined the types and frequencies of genetic instability induced by direct and inverted repeats, both in the presence and in the absence of HR, NHEJ, and SSA. All three genomes are strongly enriched in direct repeats and modestly enriched in inverted repeats. When using chromosomally integrated constructs in M. smegmatis, direct repeats induced the perfect deletion of their intervening sequences ~1,000-fold above background. Absence of HR further enhanced these perfect deletions, whereas absence of NHEJ or SSA had no influence, suggesting compromised replication fidelity. In contrast, inverted repeats induced perfect deletions only in the absence of SSA. Both direct and inverted repeats stimulated excision of the constructs from the attB integration sites independently of HR, NHEJ, or SSA. With episomal constructs, direct and inverted repeats triggered DNA instability by activating nucleolytic activity, and absence of the DSB repair pathways (in the order NHEJ>HR>SSA) exacerbated this instability. Thus, direct and inverted repeats may elicit genetic instability in mycobacteria by 1) directly interfering with replication fidelity, 2) stimulating the three main DSB repair pathways, and 3) enticing L5 site-specific recombination.

  5. Direct and inverted repeats elicit genetic instability by both exploiting and eluding DNA double-strand break repair systems in mycobacteria. (United States)

    Wojcik, Ewelina A; Brzostek, Anna; Bacolla, Albino; Mackiewicz, Pawel; Vasquez, Karen M; Korycka-Machala, Malgorzata; Jaworski, Adam; Dziadek, Jaroslaw


    Repetitive DNA sequences with the potential to form alternative DNA conformations, such as slipped structures and cruciforms, can induce genetic instability by promoting replication errors and by serving as a substrate for DNA repair proteins, which may lead to DNA double-strand breaks (DSBs). However, the contribution of each of the DSB repair pathways, homologous recombination (HR), non-homologous end-joining (NHEJ) and single-strand annealing (SSA), to this sort of genetic instability is not fully understood. Herein, we assessed the genome-wide distribution of repetitive DNA sequences in the Mycobacterium smegmatis, Mycobacterium tuberculosis and Escherichia coli genomes, and determined the types and frequencies of genetic instability induced by direct and inverted repeats, both in the presence and in the absence of HR, NHEJ, and SSA. All three genomes are strongly enriched in direct repeats and modestly enriched in inverted repeats. When using chromosomally integrated constructs in M. smegmatis, direct repeats induced the perfect deletion of their intervening sequences ~1,000-fold above background. Absence of HR further enhanced these perfect deletions, whereas absence of NHEJ or SSA had no influence, suggesting compromised replication fidelity. In contrast, inverted repeats induced perfect deletions only in the absence of SSA. Both direct and inverted repeats stimulated excision of the constructs from the attB integration sites independently of HR, NHEJ, or SSA. With episomal constructs, direct and inverted repeats triggered DNA instability by activating nucleolytic activity, and absence of the DSB repair pathways (in the order NHEJ>HR>SSA) exacerbated this instability. Thus, direct and inverted repeats may elicit genetic instability in mycobacteria by 1) directly interfering with replication fidelity, 2) stimulating the three main DSB repair pathways, and 3) enticing L5 site-specific recombination.

  6. CRISPRstrand: predicting repeat orientations to determine the crRNA-encoding strand at CRISPR loci

    DEFF Research Database (Denmark)

    Alkhnbashi, Omer S.; Costa, Fabrizio; Shah, Shiraz Ali


    Motivation: The discovery of CRISPR-Cas systems almost 20 years ago rapidly changed our perception of the bacterial and archaeal immune systems. CRISPR loci consist of several repetitive DNA sequences called repeats, inter-spaced by stretches of variable length sequences called spacers. This CRISPR...... array is transcribed and processed into multiple mature RNA species (crRNAs). A single crRNA is integrated into an interference complex, together with CRISPR-associated (Cas) proteins, to bind and degrade invading nucleic acids. Although existing bioinformatics tools can recognize CRISPR loci...... by their characteristic repeat-spacer architecture, they generally output CRISPR arrays of ambiguous orientation and thus do not determine the strand from which crRNAs are processed. Knowledge of the correct orientation is crucial for many tasks, including the classification of CRISPR conservation, the detection...

  7. High quality maize centromere 10 sequence reveals evidence of frequent recombination events

    Directory of Open Access Journals (Sweden)

    Thomas Kai Wolfgruber


    The ancestral centromeres of maize contain long stretches of the tandemly arranged CentC repeat. The abundance of tandem DNA repeats and centromeric retrotransposons (CR) has presented a significant challenge to completely assembling centromeres using traditional sequencing methods. Here we report a nearly complete assembly of the 1.85 Mb maize centromere 10 from inbred B73 using PacBio technology and BACs from the reference genome project. The error rates estimated from overlapping BAC sequences are 7 × 10^-6 and 5 × 10^-5 for mismatches and indels, respectively. The number of gaps in the region covered by the reassembly was reduced from 140 in the reference genome to three. Three expressed genes are located between 92 and 477 kb of the inferred ancestral CentC cluster, which lies within the region of highest centromeric repeat density. The improved assembly increased the count of full-length centromeric retrotransposons from 5 to 55 and revealed a 22.7 kb segmental duplication that occurred approximately 121,000 years ago. Our analysis provides evidence of frequent recombination events in the form of partial retrotransposons, deletions within retrotransposons, chimeric retrotransposons, segmental duplications including higher order CentC repeats, a deleted CentC monomer, centromere-proximal inversions, and insertion of mitochondrial sequences. Double-strand DNA break (DSB) repair is the most plausible mechanism for these events and may be the major driver of centromere repeat evolution and diversity. This repair appears to be mediated by microhomology, suggesting that tandem repeats may have evolved to facilitate the repair of frequent DSBs in centromeres.

  8. The complete chloroplast genome sequence of Abies nephrolepis (Pinaceae: Abietoideae)

    Directory of Open Access Journals (Sweden)

    Dong-Keun Yi


    The plant chloroplast (cp) genome has maintained a relatively conserved structure and gene content throughout evolution. Cp genome sequences have been used widely for resolving evolutionary and phylogenetic issues at various taxonomic levels of plants. Here, we report the complete cp genome of Abies nephrolepis. The A. nephrolepis cp genome is 121,336 base pairs (bp) in length, including a pair of short inverted repeat regions (IRa and IRb) of 139 bp each, separated by a small single copy (SSC) region of 54,323 bp and a large single copy (LSC) region of 66,735 bp. It contains 114 genes: 68 protein coding genes, 35 tRNA genes, four rRNA genes, six open reading frames, and one pseudogene. Seventeen repeat units and 64 simple sequence repeats (SSRs) have been detected in the A. nephrolepis cp genome. Large IR sequences are located at the 42-kb inversion points (1186 bp). The A. nephrolepis cp genome is identical to that of Abies koreana, a closely related taxon. Pairwise comparison between the two cp genomes revealed 140 polymorphic sites in each. The complete cp genome sequence of A. nephrolepis has significant potential to provide information on the evolutionary pattern of Abietoideae and valuable data for the development of DNA markers for easy identification and classification.
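
    The region sizes and gene counts in this abstract are internally consistent, which a short check makes explicit. The sketch below uses only figures quoted in the abstract; the variable and function names are illustrative.

```python
# Region sizes in base pairs, as quoted in the abstract.
INVERTED_REPEAT = 139        # each of IRa and IRb
SMALL_SINGLE_COPY = 54_323
LARGE_SINGLE_COPY = 66_735
REPORTED_GENOME_LENGTH = 121_336

# Gene categories as quoted in the abstract.
GENE_COUNTS = {
    "protein coding": 68,
    "tRNA": 35,
    "rRNA": 4,
    "open reading frames": 6,
    "pseudogene": 1,
}
REPORTED_GENE_TOTAL = 114


def quadripartite_length(ir, ssc, lsc):
    """Total length of a circular genome made of two IR copies, SSC and LSC."""
    return 2 * ir + ssc + lsc


if __name__ == "__main__":
    total = quadripartite_length(INVERTED_REPEAT, SMALL_SINGLE_COPY, LARGE_SINGLE_COPY)
    print("summed regions:", total, "reported:", REPORTED_GENOME_LENGTH)
    print("summed genes:", sum(GENE_COUNTS.values()), "reported:", REPORTED_GENE_TOTAL)
```

    Both sums agree with the reported totals: the four regions add up to 121,336 bp and the five gene categories add up to 114.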

  9. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences (United States)


    Background: The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results: We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tools. The edited annotations can then be uploaded to CPGAVAS for update and re-analyses repeatedly. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions: CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from PMID:23256920

  10. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences

    Directory of Open Access Journals (Sweden)

    Liu Chang


    Background: The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results: We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tools. The edited annotations can then be uploaded to CPGAVAS for update and re-analyses repeatedly. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions: CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from

  11. Mapping sequences by parts

    Directory of Open Access Journals (Sweden)

    Guziolowski Carito


    Background: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. Results: We introduce an algorithm computing an optimal N-map with time complexity O(|s| × |t| × N) using O(|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. Practical Application: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events.
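
    The building block of an N-map, placing one part of s over t in either orientation and scoring it without gaps, can be sketched as below. This is only an illustration of that single step, not the authors' dynamic-programming algorithm: the +1/-1 scoring scheme, the exhaustive offset scan, and all function names are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def gapless_score(part, target, offset, match=1, mismatch=-1):
    """Score part against target[offset:offset+len(part)] with no gaps."""
    window = target[offset:offset + len(part)]
    return sum(match if a == b else mismatch for a, b in zip(part, window))


def best_placement(part, target):
    """Return (score, offset, reversed) for the best gapless placement.

    Both the part and its reverse complement are tried at every offset.
    Ties keep the first placement found; None is returned when the part
    is longer than the target.
    """
    best = None
    for is_reversed, candidate in ((False, part), (True, reverse_complement(part))):
        for offset in range(len(target) - len(part) + 1):
            score = gapless_score(candidate, target, offset)
            if best is None or score > best[0]:
                best = (score, offset, is_reversed)
    return best


if __name__ == "__main__":
    target = "TTTTACGGATTTT"
    print(best_placement("ACGGA", target))  # forward placement
    print(best_placement("TCCGT", target))  # placed as its reverse complement
```

    A full N-map would additionally choose the N-1 cut points in s and maximise the sum of such scores over all parts, which is where the O(|s| × |t| × N) dynamic program comes in.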

  12. DNA dynamics is likely to be a factor in the genomic nucleotide repeats expansions related to diseases.

    Directory of Open Access Journals (Sweden)

    Boian S Alexandrov

    Trinucleotide repeat sequences (TRS) represent a common type of genomic DNA motif whose expansion is associated with a large number of human diseases. The driving molecular mechanisms of the TRS ongoing dynamic expansion across generations and within tissues and its influence on genomic DNA functions are not well understood. Here we report results for a novel and notable collective breathing behavior of genomic DNA of tandem TRS, leading to propensity for large local DNA transient openings at physiological temperature. Our Langevin molecular dynamics (LMD) and Markov Chain Monte Carlo (MCMC) simulations demonstrate that the patterns of openings of various TRSs depend specifically on their length. The collective propensity for DNA strand separation of repeated sequences serves as a precursor for outsized intermediate bubble states independently of the G/C-content. We report that repeats have the potential to interfere with the binding of transcription factors to their consensus sequence by altered DNA breathing dynamics in proximity of the binding sites. These observations might influence ongoing attempts to use LMD and MCMC simulations for TRS-related modeling of genomic DNA functionality in elucidating the common denominators of the dynamic TRS expansion mutation with potential therapeutic applications.

  13. Origin-Dependent Inverted-Repeat Amplification: Tests of a Model for Inverted DNA Amplification.

    Directory of Open Access Journals (Sweden)

    Bonita J Brewer


    DNA replication errors are a major driver of evolution--from single nucleotide polymorphisms to large-scale copy number variations (CNVs). Here we test a specific replication-based model to explain the generation of interstitial, inverted triplications. While no genetic information is lost, the novel inversion junctions and increased copy number of the included sequences create the potential for adaptive phenotypes. The model--Origin-Dependent Inverted-Repeat Amplification (ODIRA)--proposes that a replication error at pre-existing short, interrupted, inverted repeats in genomic sequences generates an extrachromosomal, inverted dimeric, autonomously replicating intermediate; subsequent genomic integration of the dimer yields this class of CNV without loss of distal chromosomal sequences. We used a combination of in vitro and in vivo approaches to test the feasibility of the proposed replication error and its downstream consequences on chromosome structure in the yeast Saccharomyces cerevisiae. We show that the proposed replication error--the ligation of leading and lagging nascent strands to create "closed" forks--can occur in vitro at short, interrupted inverted repeats. The removal of molecules with two closed forks results in a hairpin-capped linear duplex that we show replicates in vivo to create an inverted, dimeric plasmid that subsequently integrates into the genome by homologous recombination, creating an inverted triplication. While other models have been proposed to explain inverted triplications and their derivatives, our model can also explain the generation of human, de novo, inverted amplicons that have a 2:1 mixture of sequences from both homologues of a single parent--a feature readily explained by a plasmid intermediate that arises from one homologue and integrates into the other homologue prior to meiosis. Our tests of key features of ODIRA lend support to this mechanism and suggest further avenues of enquiry to unravel the origins

  14. Origin-Dependent Inverted-Repeat Amplification: Tests of a Model for Inverted DNA Amplification. (United States)

    Brewer, Bonita J; Payen, Celia; Di Rienzi, Sara C; Higgins, Megan M; Ong, Giang; Dunham, Maitreya J; Raghuraman, M K


    DNA replication errors are a major driver of evolution--from single nucleotide polymorphisms to large-scale copy number variations (CNVs). Here we test a specific replication-based model to explain the generation of interstitial, inverted triplications. While no genetic information is lost, the novel inversion junctions and increased copy number of the included sequences create the potential for adaptive phenotypes. The model--Origin-Dependent Inverted-Repeat Amplification (ODIRA)--proposes that a replication error at pre-existing short, interrupted, inverted repeats in genomic sequences generates an extrachromosomal, inverted dimeric, autonomously replicating intermediate; subsequent genomic integration of the dimer yields this class of CNV without loss of distal chromosomal sequences. We used a combination of in vitro and in vivo approaches to test the feasibility of the proposed replication error and its downstream consequences on chromosome structure in the yeast Saccharomyces cerevisiae. We show that the proposed replication error--the ligation of leading and lagging nascent strands to create "closed" forks--can occur in vitro at short, interrupted inverted repeats. The removal of molecules with two closed forks results in a hairpin-capped linear duplex that we show replicates in vivo to create an inverted, dimeric plasmid that subsequently integrates into the genome by homologous recombination, creating an inverted triplication. While other models have been proposed to explain inverted triplications and their derivatives, our model can also explain the generation of human, de novo, inverted amplicons that have a 2:1 mixture of sequences from both homologues of a single parent--a feature readily explained by a plasmid intermediate that arises from one homologue and integrates into the other homologue prior to meiosis. Our tests of key features of ODIRA lend support to this mechanism and suggest further avenues of enquiry to unravel the origins of interstitial

  15. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    Directory of Open Access Journals (Sweden)

    Isabel A S Bonatelli

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.

  16. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae). (United States)

    Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M


    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.

  17. Replication error deficient and proficient colorectal cancer gene expression differences caused by 3'UTR polyT sequence deletions

    DEFF Research Database (Denmark)

    Wilding, Jennifer L; McGowan, Simon; Liu, Ying


    , and have distinct pathologies. Regulatory sequences controlling all aspects of mRNA processing, especially including message stability, are found in the 3'UTR sequence of most genes. The relevant sequences are typically A/U-rich elements or U repeats. Microarray analysis of 14 RER+ (deficient) and 16 RER- (proficient) colorectal cancer cell lines confirms a striking difference in expression profiles. Analysis of the incidence of mononucleotide repeat sequences in the 3'UTRs, 5'UTRs, and coding sequences of those genes most differentially expressed in RER+ versus RER- cell lines has shown that much of this differential expression can be explained by the occurrence of a massive enrichment of genes with 3'UTR T repeats longer than 11 base pairs in the most differentially expressed genes. This enrichment was confirmed by analysis of two published consensus sets of RER differentially expressed probesets for a large...

  18. Repeated fault rupture recorded by paleoenvironmental changes in a wetland sedimentary sequence ponded against the Alpine Fault, New Zealand (United States)

    Clark, K.; Berryman, K. R.; Cochran, U. A.; Bartholomew, T.; Turner, G. M.


    At Hokuri Creek, in south Westland, New Zealand, an 18 m thickness of Holocene sediments has accumulated against the upthrown side of the Alpine Fault. Recent fluvial incision has created numerous exposures of this sedimentary sequence. At a decimetre to metre scale there are two dominant types of sedimentary units: clastic-dominated, grey silt packages, and organic-dominated, light brown peaty-silt units. These units represent repeated alternations of the paleoenvironment due to fault rupture over the past 7000 years. We have located the event horizons within the sedimentary sequence, and identified evidence to support earthquake-driven paleoenvironmental change (rather than climatic variability), and developed a model of paleoenvironmental changes over a typical seismic cycle. To quantitatively characterise the sediments we use high resolution photography, x-ray imaging, magnetic-susceptibility and total carbon analysis. To understand the depositional environment we used diatom and pollen studies. The organic-rich units have very low magnetic susceptibility and density values, with high greyscale and high total carbon values. Diatoms indicate these units represent stable wetland environments with standing water and predominantly in-situ organic material deposition. The clastic-rich units are characterised by higher magnetic susceptibility and density values, with low greyscale and total carbon. The clastic-rich units represent environments of flowing water and deep pond settings that received predominantly catchment-derived silt and sand. The event horizon is located at the upper contact of the organic-rich horizons. The event horizon contact marks a drastic change in hydrologic regime as fault rupture changed the stream base level and there was a synchronous influx of clastic sediment as the catchment responded to earthquake shaking. During the interseismic period the flowing-water environment gradually stabilised and returned to an organic-rich wetland. Such

  19. Sequence finishing and mapping of Drosophila melanogaster heterochromatin

    Energy Technology Data Exchange (ETDEWEB)

    Hoskins, Roger A.; Carlson, Joseph W.; Kennedy, Cameron; Acevedo, David; Evans-Holm, Martha; Frise, Erwin; Wan, Kenneth H.; Park, Soo; Mendez-Lago, Maria; Rossi, Fabrizio; Villasante, Alfredo; Dimitri, Patrizio; Karpen, Gary H.; Celniker, Susan E.


    Genome sequences for most metazoans are incomplete due to the presence of repeated DNA in the pericentromeric heterochromatin. The heterochromatic regions of D. melanogaster contain 20 Mb of sequence amenable to mapping, sequence assembly and finishing. Here we describe the generation of 15 Mb of finished or improved heterochromatic sequence using available clone resources and assembly and mapping methods. We also constructed a BAC-based physical map that spans approximately 13 Mb of the pericentromeric heterochromatin, and a cytogenetic map that positions approximately 11 Mb of BAC contigs and sequence scaffolds in specific chromosomal locations. The integrated sequence assembly and maps greatly improve our understanding of the structure and composition of this poorly understood fraction of a metazoan genome and provide a framework for functional analyses.

  20. Whole-genome in-silico subtractive hybridization (WISH) - using massive sequencing for the identification of unique and repetitive sex-specific sequences: the example of Schistosoma mansoni

    Directory of Open Access Journals (Sweden)

    Parrinello Hugues


    Background: Emerging methods of massive sequencing that allow for rapid re-sequencing of entire genomes at comparably low cost are changing the way biological questions are addressed in many domains. Here we propose a novel method to compare two genomes (genome-to-genome comparison). We used this method to identify sex-specific sequences of the human blood fluke Schistosoma mansoni. Results: Genomic DNA was extracted from male and female (heterogametic) S. mansoni adults and sequenced with a Genome Analyzer (Illumina). Sequences are available at the NCBI sequence read archive under study accession number SRA012151.6. Sequencing reads were aligned to the genome and to a pseudogenome composed of known repeats. Straightforward comparative bioinformatics analysis was performed to compare male and female schistosome genomes and identify female-specific sequences. We found that the S. mansoni female W chromosome contains only a few specific unique sequences (950 kb, i.e. about 0.2% of the genome). The majority of W-specific sequences are repeats (10.5 Mb, i.e. about 2.5% of the genome). Arbitrarily selected W-specific sequences were confirmed by PCR. Primers designed for unique and repetitive sequences allowed reliable identification of the sex of both larval and adult stages of the parasite. Conclusion: Our genome-to-genome comparison method, which we call "whole-genome in-silico subtractive hybridization" (WISH), allows for rapid identification of sequences that are specific for a certain genotype (e.g. the heterogametic sex). It can in principle be used for the detection of any sequence differences between isolates (e.g. strains, pathovars) or even closely related species.
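
    As a rough illustration of the subtraction idea in the WISH record above, the following Python sketch flags genomic windows that are covered by female reads but nearly absent from male reads. It is not the authors' pipeline: the window names, the read counts, the function name, and both thresholds are hypothetical, and a real analysis would start from aligned reads rather than a ready-made count table.

```python
from typing import Dict, List


def female_specific_windows(
    female_counts: Dict[str, int],
    male_counts: Dict[str, int],
    min_female_rpm: float = 1.0,      # hypothetical minimum female coverage
    max_male_fraction: float = 0.05,  # hypothetical tolerated male/female ratio
) -> List[str]:
    """Return windows covered in the female library but (almost) not in the male one."""
    f_total = sum(female_counts.values())
    m_total = sum(male_counts.values())
    if f_total == 0 or m_total == 0:
        raise ValueError("both libraries must contain at least one read")

    hits = []
    for window, f_reads in female_counts.items():
        # Normalise to reads per million so libraries of different depth compare fairly.
        f_rpm = 1e6 * f_reads / f_total
        m_rpm = 1e6 * male_counts.get(window, 0) / m_total
        if f_rpm >= min_female_rpm and m_rpm <= max_male_fraction * f_rpm:
            hits.append(window)
    return sorted(hits)


# Toy counts: w1 and w2 behave like shared sequence, w3 like W-specific sequence.
female = {"w1": 500, "w2": 480, "w3": 20}
male = {"w1": 510, "w2": 490, "w3": 0}
candidates = female_specific_windows(female, male)
print(candidates)
```

    Normalising by library size before comparing is the main design choice here; candidates found this way would still need confirmation, as the abstract reports doing by PCR.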

  1. Genetic variation in Rhodomyrtus tomentosa (Kemunting) populations from Malaysia as revealed by inter-simple sequence repeat markers. (United States)

    Hue, T S; Abdullah, T L; Abdullah, N A P; Sinniah, U R


    Kemunting (Rhodomyrtus tomentosa), from the Myrtaceae family, is native to Malaysia. It is widely used in traditional medicine to treat various illnesses and possesses significant antibacterial properties. In addition, it has great potential as an ornamental in landscape design. Genetic variability studies are important for the rational management and conservation of genetic material. In the present study, inter-simple sequence repeat markers were used to assess the genetic diversity of 18 R. tomentosa populations collected from ten states of Peninsular Malaysia. The 11 primers selected generated 173 bands that ranged in size from 1.6 kb to 130 bp, which corresponded to an average of 15.73 bands per primer. Of these bands, 97.69% (169 in total) were polymorphic. High genetic diversity was documented at the species level (H(T) = 0.2705; I = 0.3973; PPB = 97.69%) but there was low diversity at the population level (H(S) = 0.0073; I = 0.1085; PPB = 20.14%). The high level of genetic differentiation revealed by G(ST) (73%) and analysis of molecular variance (63%), together with the limited gene flow among populations (N(m) = 0.1851), suggests that the populations examined are isolated. Results from an unweighted pair group method with arithmetic mean dendrogram and principal coordinate analysis clearly grouped the populations into two geographic groups. This clear grouping is also supported by the significant Mantel test (r = 0.581, P = 0.001). We recommend that all the R. tomentosa populations be preserved in conservation programs.
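
    The summary statistics in the record above can be checked with a few lines of arithmetic. The sketch below recomputes bands per primer and the percentage of polymorphic bands from the reported counts, and estimates gene flow from G(ST) with the relation Nm = 0.5 (1 - G(ST)) / G(ST). That relation is an assumption: it is a commonly used estimator for dominant markers, but the abstract does not say which formula the authors applied, and the function names are illustrative only.

```python
def percent_polymorphic(polymorphic_bands: int, total_bands: int) -> float:
    """Percentage of polymorphic bands (PPB)."""
    return 100.0 * polymorphic_bands / total_bands


def gene_flow_from_gst(gst: float) -> float:
    """Estimate Nm from G_ST, assuming Nm = 0.5 * (1 - G_ST) / G_ST."""
    if not 0.0 < gst <= 1.0:
        raise ValueError("G_ST must lie in (0, 1]")
    return 0.5 * (1.0 - gst) / gst


# Values reported in the abstract.
bands_per_primer = 173 / 11
ppb = percent_polymorphic(169, 173)
nm = gene_flow_from_gst(0.73)  # G_ST is reported only to two digits

print(round(bands_per_primer, 2), round(ppb, 2), round(nm, 3))
```

    With G(ST) rounded to 0.73 the estimate is about 0.185, close to the reported N(m) = 0.1851; the small gap is what one would expect if the authors used an unrounded G(ST).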

  2. Sequencing, Characterization, and Comparative Analyses of the Plastome of Caragana rosea var. rosea

    Directory of Open Access Journals (Sweden)

    Mei Jiang


    To exploit the drought-resistant Caragana species, we performed a comparative study of the plastomes from four species: Caragana rosea, C. microphylla, C. kozlowii, and C. korshinskii. The complete plastome sequence of C. rosea was obtained using next generation DNA sequencing technology. The genome is a circular structure of 133,122 bases and lacks an inverted repeat. It contains 111 unique genes, including 76 protein-coding, 30 tRNA, and four rRNA genes. Repeat analyses identified 239, 244, 258, and 246 simple sequence repeats in C. rosea, C. microphylla, C. kozlowii, and C. korshinskii, respectively. Analyses of sequence divergence found two intergenic regions, trnI-CAU-ycf2 and trnN-GUU-ycf1, exhibiting a high degree of variation. Phylogenetic analyses showed that the four Caragana species belong to a monophyletic clade. Analyses of Ka/Ks ratios revealed that five genes (rpl16, rpl20, rps11, rps7, and ycf1) and several sites have undergone strong positive selection in the Caragana branch. The results lay the foundation for the development of molecular markers and the understanding of the evolutionary process for drought-resistant characteristics.

  3. Genome-wide analysis of tandem repeats in plants and green algae (United States)

    Zhixin Zhao; Cheng Guo; Sreeskandarajan Sutharzan; Pei Li; Craig Echt; Jie Zhang; Chun Liang


    Tandem repeats (TRs) exist extensively in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0, we examined TRs on a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among...

  4. Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Navrátilová Alice


    Background: Extraordinary size variation of higher plant nuclear genomes is in large part caused by differences in accumulation of repetitive DNA. This makes repetitive DNA of great interest for studying the molecular mechanisms shaping architecture and function of complex plant genomes. However, due to methodological constraints of conventional cloning and sequencing, a global description of repeat composition is available for only a very limited number of higher plants. In order to provide further data required for investigating evolutionary patterns of repeated DNA within and between species, we used a novel approach based on massive parallel sequencing which allowed a comprehensive repeat characterization in our model species, garden pea (Pisum sativum). Results: Analysis of 33.3 Mb sequence data resulted in quantification and partial sequence reconstruction of major repeat families occurring in the pea genome with at least thousands of copies. Our results showed that the pea genome is dominated by LTR-retrotransposons, estimated at 140,000 copies/1C. Ty3/gypsy elements are less diverse and accumulated to higher copy numbers than Ty1/copia. This is in part due to a large population of Ogre-like retrotransposons which alone make up over 20% of the genome. In addition to numerous types of mobile elements, we have discovered a set of novel satellite repeats and two additional variants of telomeric sequences. Comparative genome analysis revealed that there are only a few repeat sequences conserved between pea and soybean genomes. On the other hand, all major families of pea mobile elements are well represented in M. truncatula. Conclusion: We have demonstrated that even in a species with a relatively large genome like pea, where a single 454-sequencing run provided only 0.77% coverage, the generated sequences were sufficient to reconstruct and analyze major repeat families corresponding to a total of 35–48% of the genome. These data

  5. [Clustered regularly interspaced short palindromic repeats (CRISPR) site in Bacillus anthracis]. (United States)

    Gao, Zhiqi; Wang, Dongshu; Feng, Erling; Wang, Bingxiang; Hui, Yiming; Han, Shaobo; Jiao, Lei; Liu, Xiankai; Wang, Hengliang


    To investigate the polymorphism of clustered regularly interspaced short palindromic repeats (CRISPR) in Bacillus anthracis and its application to molecular typing of B. anthracis. We downloaded the whole genome sequences of 6 B. anthracis strains and extracted the CRISPR sites. We designed primers for the CRISPR sites, amplified the CRISPR fragments in 193 B. anthracis strains by PCR, and sequenced these fragments. In order to reveal the polymorphism of CRISPR in B. anthracis, we aligned all the extracted sequences and sequencing results by local BLAST. At the same time, we also analyzed the CRISPR sites in B. cereus and B. thuringiensis. We did not find any polymorphism of CRISPR in B. anthracis. The molecular typing approach based on CRISPR polymorphism is not suitable for B. anthracis, but it is possible to distinguish B. anthracis from B. cereus and B. thuringiensis.

  6. The diversity and evolution of Wolbachia ankyrin repeat domain genes.

    Directory of Open Access Journals (Sweden)

    Stefanos Siozios

    Ankyrin repeat domain-encoding genes are common in the eukaryotic and viral domains of life, but they are rare in bacteria, the exception being a few obligate or facultative intracellular Proteobacteria species. Despite having a reduced genome, the arthropod strains of the alphaproteobacterium Wolbachia contain an unusually high number of ankyrin repeat domain-encoding genes, ranging from 23 in wMel to 60 in the wPip strain. This group of genes has attracted considerable attention for their astonishingly large number as well as for the fact that ankyrin proteins are known to participate in protein-protein interactions, suggesting that they play a critical role in the molecular mechanism that determines host-Wolbachia symbiotic interactions. We present a comparative evolutionary analysis of the wMel-related ankyrin repeat domain-encoding genes present in different Drosophila-Wolbachia associations. Our results show that the ankyrin repeat domain-encoding genes change in size by expansion and contraction mediated by short directly repeated sequences. We provide examples of intra-genic recombination events and show that these genes are likely to be horizontally transferred between strains with the aid of bacteriophages. These results confirm previous findings that the Wolbachia genomes are evolutionary mosaics and illustrate the potential that these bacteria have to generate diversity in proteins potentially involved in the symbiotic interactions.

  7. Characteristics of palindromic sequences in DNA of the sea urchin Strongylocentrotus intermedius

    International Nuclear Information System (INIS)

    Brykov, V.A.; Kukhlevskii, A.D.


    The fraction of palindromic sequences in the nuclear DNA of the sea urchin S. intermedius was characterized. Using chromatography on hydroxyapatite and treatment with S1 nuclease, it was shown that the fraction of palindromic sequences more than doubles when the sodium concentration in solution is increased or the temperature of reassociation is lowered. The increase is due to the involvement of inverted repeats in reassociation, which are characterized by a substantial nonhomologous character and/or the presence of an extended intervening DNA sequence. It was found by the method of reassociation of a nicked palindrome fraction with an excess of total homologous DNA that most of the inverted repeats in the sea urchin genome are unique sequences. The complexity of the palindrome fraction was estimated at 8.2 × 10^7 nucleotide pairs, and the number of palindromes per haploid genome at ∼500,000.

  8. Molecular characterization of three common olive (Olea europaea L.) cultivars in Palestine, using simple sequence repeat (SSR) markers. (United States)

    Obaid, Ramiz; Abu-Qaoud, Hassan; Arafeh, Rami


    Eight accessions of olive trees from three common varieties in Palestine, Nabali Baladi, Nabali Mohassan and Surri, were genetically evaluated using five simple sequence repeat (SSR) markers. A total of 17 alleles from 5 loci were observed, of which 15 (88.2%) were polymorphic and 2 (11.8%) were monomorphic. An average of 3.4 alleles per locus was found, ranging from 2.0 alleles with the primers GAPU-103 and DCA-9 to 5.0 alleles with U9932 and DCA-16. The smallest amplicon size observed was 50 bp with the primer DCA-16, whereas the largest (450 bp) was observed with the primer U9932. Cluster analysis with the unweighted pair group method with arithmetic average (UPGMA) showed three clusters: a cluster with four accessions from the 'Nabali Baladi' cultivar, another cluster with three accessions that represents the 'Nabali Mohassan' cultivar and finally the 'Surri' cultivar. The similarity coefficient for the eight olive tree samples ranged from a maximum of 100% between two accessions from Nabali Baladi, and also between two other samples from Nabali Mohassan, to a minimum similarity coefficient (0.315) between the Surri and two Nabali Baladi accessions. The results of this investigation clearly highlight the genetic dissimilarity between the three main olive cultivars, which have been misidentified and mixed up in the past on the basis of conventional morphological characters.

  9. Diversity and genetic stability in banana genotypes in a breeding program using inter simple sequence repeats (ISSR) markers. (United States)

    Silva, A V C; Nascimento, A L S; Vitória, M F; Rabbani, A R C; Soares, A N R; Lédo, A S


    Banana (Musa spp.) is a fruit species frequently cultivated and consumed worldwide. Molecular markers are important for estimating genetic diversity in germplasm and between genotypes in breeding programs. The objective of this study was to analyze the genetic diversity of 21 banana genotypes (FHIA 23, PA42-44, Maçã, Pacovan Ken, Bucaneiro, YB42-47, Grand Naine, Tropical, FHIA 18, PA94-01, YB42-17, Enxerto, Japira, Pacovã, Prata-Anã, Maravilha, PV79-34, Caipira, Princesa, Garantida, and Thap Maeo) by using inter-simple sequence repeat (ISSR) markers. Material was generated from the banana breeding program of Embrapa Cassava & Fruits and evaluated at Embrapa Coastal Tablelands. The 12 primers used in this study generated 97.5% polymorphism. Four clusters were identified among the different genotypes studied, and the sum of the first two principal components was 48.91%. From the Unweighted Pair Group Method using Arithmetic averages (UPGMA) dendrogram, it was possible to identify two main clusters and subclusters. Two genotypes (Garantida and Thap Maeo) remained isolated from the others, both in the UPGMA clustering and in the principal coordinate analysis (PCoA). Using ISSR markers, we could analyze the genetic diversity of the studied material and state that these markers were efficient at detecting sufficient polymorphism to estimate the genetic variability in banana genotypes.

  10. Comparing Young and Elderly Serial Reaction Time Task Performance on Repeated and Random Conditions

    Directory of Open Access Journals (Sweden)

    Fatemeh Ehsani


    Objectives: Motor skill acquisition training in the elderly is of great importance. The main purpose of this study was to compare the performance of young and elderly subjects in a serial reaction time task under repeated and random conditions. Methods & Materials: A software-based serial reaction time task was used to study motor learning in 30 young and 30 elderly subjects. Each group was randomly divided into implicit and explicit subgroups. In the task, 4 squares with different colors appeared on the monitor, and subjects were asked to press the key assigned to each square immediately after observing it. Subjects practiced 8 motor blocks (4 repeated blocks, then 2 random blocks and 2 repeated blocks). Block time, the dependent variable, was measured, and independent-samples t-tests and repeated-measures ANOVA were used for the analysis. Results: The young groups performed both repeated and random sequences significantly faster than the elderly (P0.05. The explicit older subgroup performed blocks 7 and 8 more slowly than block 6, with a significant difference (P<0.05). Conclusion: Young adults showed a higher level of performance than the elderly in both repeated and random practice. The elderly performed random practice better than repeated practice.

  11. Memory for sequences of events impaired in typical aging (United States)

    Allen, Timothy A.; Morris, Andrea M.; Stark, Shauna M.; Fortin, Norbert J.


    Typical aging is associated with diminished episodic memory performance. To improve our understanding of the fundamental mechanisms underlying this age-related memory deficit, we previously developed an integrated, cross-species approach to link converging evidence from human and animal research. This novel approach focuses on the ability to remember sequences of events, an important feature of episodic memory. Unlike existing paradigms, this task is nonspatial, nonverbal, and can be used to isolate different cognitive processes that may be differentially affected in aging. Here, we used this task to make a comprehensive comparison of sequence memory performance between younger (18–22 yr) and older adults (62–86 yr). Specifically, participants viewed repeated sequences of six colored, fractal images and indicated whether each item was presented “in sequence” or “out of sequence.” Several out of sequence probe trials were used to provide a detailed assessment of sequence memory, including: (i) repeating an item from earlier in the sequence (“Repeats”; e.g., ABADEF), (ii) skipping ahead in the sequence (“Skips”; e.g., ABDDEF), and (iii) inserting an item from a different sequence into the same ordinal position (“Ordinal Transfers”; e.g., AB3DEF). We found that older adults performed as well as younger controls when tested on well-known and predictable sequences, but were severely impaired when tested using novel sequences. Importantly, overall sequence memory performance in older adults steadily declined with age, a decline not detected with other measures (RAVLT or BPS-O). We further characterized this deficit by showing that performance of older adults was severely impaired on specific probe trials that required detailed knowledge of the sequence (Skips and Ordinal Transfers), and was associated with a shift in their underlying mnemonic representation of the sequences. Collectively, these findings provide unambiguous evidence that the

  12. The peculiar landscape of repetitive sequences in the olive (Olea europaea L.) genome. (United States)

    Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea


    Analyzing genome structure in different species provides insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with mean length higher than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome.

  13. The Complete Chloroplast Genome Sequences of the Medicinal Plant Forsythia suspensa (Oleaceae)

    Directory of Open Access Journals (Sweden)

    Wenbin Wang


    Forsythia suspensa is an important medicinal plant and traditionally applied for the treatment of inflammation, pyrexia, gonorrhea, diabetes, and so on. However, there is limited sequence and genomic information available for F. suspensa. Here, we produced the complete chloroplast genome of F. suspensa using Illumina sequencing technology. F. suspensa is the first sequenced member within the genus Forsythia (Oleaceae). The gene order and organization of the chloroplast genome of F. suspensa are similar to other Oleaceae chloroplast genomes. The F. suspensa chloroplast genome is 156,404 bp in length and exhibits a conserved quadripartite structure with a large single-copy (LSC; 87,159 bp) region and a small single-copy (SSC; 17,811 bp) region interspersed between inverted repeat (IRa/b; 25,717 bp) regions. A total of 114 unique genes were annotated, including 80 protein-coding genes, 30 tRNA, and four rRNA. The low GC content (37.8%) and codon usage bias for A- or T-ending codons may largely affect gene codon usage. Sequence analysis identified a total of 26 forward repeats, 23 palindrome repeats with lengths >30 bp (identity > 90%), and 54 simple sequence repeats (SSRs) with an average rate of 0.35 SSRs/kb. We predicted 52 RNA editing sites in the chloroplast of F. suspensa, all for C-to-U transitions. IR expansion or contraction and the divergent regions were analyzed among several species including the reported F. suspensa in this study. Phylogenetic analysis based on whole-plastome revealed that F. suspensa, as a member of the Oleaceae family, diverged relatively early from Lamiales. This study will contribute to strengthening medicinal resource conservation, molecular phylogenetic, and genetic engineering research investigations of this species.
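
    The quadripartite structure reported in the record above can be sanity-checked with simple arithmetic: one LSC region, one SSC region, and two copies of the inverted repeat should add up to the stated genome length, and the SSR count divided by the length in kilobases should reproduce the stated density. The short Python sketch below does this with the figures from the abstract; the variable names are illustrative.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 87_159   # large single-copy region
SSC = 17_811   # small single-copy region
IR = 25_717    # one inverted repeat copy (IRa or IRb)

# A quadripartite plastome is LSC + SSC + two IR copies.
genome_length = LSC + SSC + 2 * IR

# SSR density in repeats per kilobase, using the 54 SSRs reported.
ssr_per_kb = 54 / (genome_length / 1000)

print(genome_length, round(ssr_per_kb, 2))
```

    Both values agree with the abstract (156,404 bp and 0.35 SSRs/kb), which suggests that the 25,717 bp figure refers to a single IR copy rather than to both combined.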

  14. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics (United States)

    Zhang, Xiaorong


    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…

  15. Pms2 suppresses large expansions of the (GAA·TTC)n sequence in neuronal tissues.

    Directory of Open Access Journals (Sweden)

    Rebecka L Bourn

    Expanded trinucleotide repeat sequences are the cause of several inherited neurodegenerative diseases. Disease pathogenesis is correlated with several features of somatic instability of these sequences, including further large expansions in postmitotic tissues. The presence of somatic expansions in postmitotic tissues is consistent with DNA repair being a major determinant of somatic instability. Indeed, proteins in the mismatch repair (MMR) pathway are required for instability of the expanded (CAG·CTG)n sequence, likely via recognition of intrastrand hairpins by MutSβ. It is not clear if or how MMR would affect instability of disease-causing expanded trinucleotide repeat sequences that adopt secondary structures other than hairpins, such as the triplex/R-loop forming (GAA·TTC)n sequence that causes Friedreich ataxia. We analyzed somatic instability in transgenic mice that carry an expanded (GAA·TTC)n sequence in the context of the human FXN locus and lack the individual MMR proteins Msh2, Msh6 or Pms2. The absence of Msh2 or Msh6 resulted in a dramatic reduction in somatic mutations, indicating that mammalian MMR promotes instability of the (GAA·TTC)n sequence via MutSα. The absence of Pms2 resulted in increased accumulation of large expansions in the nervous system (cerebellum, cerebrum, and dorsal root ganglia) but not in non-neuronal tissues (heart and kidney), without affecting the prevalence of contractions. Pms2 suppressed large expansions specifically in tissues showing MutSα-dependent somatic instability, suggesting that they may act on the same lesion or structure associated with the expanded (GAA·TTC)n sequence. We conclude that Pms2 specifically suppresses large expansions of a pathogenic trinucleotide repeat sequence in neuronal tissues, possibly acting independently of the canonical MMR pathway.

  16. Analysis of oligonucleotide array experiments with repeated measures using mixed models

    Directory of Open Access Journals (Sweden)

    Getchell Thomas V


    Background: Mixed factorial experiments with two or more factors are becoming increasingly common in microarray data analysis. In this case study, the two factors are presence (patients with Alzheimer's disease) or absence (control) of the disease, and brain region, either olfactory bulb (OB) or cerebellum (CER). In the design considered in this manuscript, OB and CER are repeated measurements from the same subject and, hence, are correlated. It is critical to identify sources of variability in the analysis of oligonucleotide array experiments with repeated measures, and correlations among data points have to be considered. In addition, multiple testing problems are more complicated in experiments with multi-level treatments or treatment combinations. Results: In this study we adopted a linear mixed model to analyze oligonucleotide array experiments with repeated measures. We first constructed a generalized F test to select differentially expressed genes. The Benjamini and Hochberg (BH) procedure for controlling the false discovery rate (FDR) at 5% was applied to the P values of the generalized F test. We then categorized the genes with a significant generalized F test according to whether the interaction terms were significant at the α-level (αnew = 0.0033) determined by the FDR procedure. Since simple effects may be examined for the genes with a significant interaction effect, we adopted the protected Fisher's least significant difference (LSD) test procedure at the level of αnew to control the family-wise error rate (FWER) for each gene examined. Conclusions: A linear mixed model is appropriate for analysis of oligonucleotide array experiments with repeated measures. We constructed a generalized F test to select differentially expressed genes, and then applied a specific sequence of tests to identify factorial effects. This sequence of tests was designed to control the gene-based FWER.
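
    The Benjamini and Hochberg step mentioned in this record can be illustrated with a minimal sketch. It is a generic step-up implementation and not the authors' code; the function name, the example P values, and the choice to report the largest rejected P value as a per-test level are assumptions made only for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Generic BH step-up procedure (illustrative sketch, not the authors' code).

    Returns a list of booleans (True = rejected) in the input order, plus the
    largest rejected P value (0.0 if nothing is rejected).
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank

    # Reject every hypothesis whose rank is at or below the cutoff rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True

    threshold = p_values[order[cutoff_rank - 1]] if cutoff_rank else 0.0
    return rejected, threshold


if __name__ == "__main__":
    # Hypothetical P values, one per gene.
    example = [0.039, 0.001, 0.205, 0.008, 0.041]
    print(benjamini_hochberg(example, fdr=0.05))
```

    With thousands of genes, the same function would be applied to the full vector of P values from the per-gene test; how the reported αnew was actually derived is not stated in the abstract.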

  17. Race: A scalable and elastic parallel system for discovering repeats in very long sequences

    KAUST Repository

    Mansour, Essam


    A wide range of applications, including bioinformatics, time series, and log analysis, depend on the identification of repetitions in very long sequences. The problem of finding maximal pairs subsumes most important types of repetition-finding tasks. Existing solutions require both the input sequence and its index (typically an order of magnitude larger than the input) to fit in memory. Moreover, they are serial algorithms with long execution time. Therefore, they are limited to small datasets, despite the fact that modern applications demand orders of magnitude longer sequences. In this paper we present RACE, a parallel system for finding maximal pairs in very long sequences. RACE supports parallel execution on stand-alone multicore systems, in addition to scaling to thousands of nodes on clusters or supercomputers. RACE does not require the input or the index to fit in memory; therefore, it supports very long sequences with limited memory. Moreover, it uses a novel array representation that allows for cache-efficient implementation. RACE is particularly suitable for the cloud (e.g., Amazon EC2) because, based on availability, it can scale elastically to more or fewer machines during its execution. Since scaling out introduces overheads, mainly due to load imbalance, we propose a cost model to estimate the expected speedup, based on statistics gathered through sampling. The model allows the user to select the appropriate combination of cloud resources based on the provider's prices and the required deadline. We conducted extensive experimental evaluation with large real datasets and large computing infrastructures. In contrast to existing methods, RACE can handle the entire human genome on a typical desktop computer with 16GB RAM. Moreover, for a problem that takes 10 hours of serial execution, RACE finishes in 28 seconds using 2,048 nodes on an IBM BlueGene/P supercomputer.
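
    To make the term "maximal pair" concrete, the sketch below enumerates maximal pairs by brute force. It is a quadratic-time illustration of the definition only and is not the RACE algorithm, which relies on an index and parallel execution; the function name and the min_len parameter are hypothetical.

```python
def maximal_pairs(s, min_len=2):
    """Brute-force enumeration of maximal pairs (definition sketch, not RACE).

    A maximal pair is a triple (i, j, length) with i < j such that
    s[i:i+length] == s[j:j+length] and the match cannot be extended to the
    left or to the right.
    """
    n = len(s)
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            # Left-maximality: the preceding characters must differ
            # (or the first occurrence starts at the beginning of s).
            if i > 0 and s[i - 1] == s[j - 1]:
                continue
            # Extend to the right as far as the two occurrences agree,
            # which guarantees right-maximality.
            length = 0
            while j + length < n and s[i + length] == s[j + length]:
                length += 1
            if length >= min_len:
                pairs.append((i, j, length))
    return pairs


if __name__ == "__main__":
    print(maximal_pairs("abcXabcY"))
```

    The nested loops make the cost at least quadratic in the sequence length, which is why index-based and parallel methods are needed for genome-scale inputs.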

  18. Stored word sequences in language learning: the effect of familiarity on children's repetition of four-word combinations. (United States)

    Bannard, Colin; Matthews, Danielle


    Recent accounts of the development of grammar propose that children remember utterances they hear and draw generalizations over these stored exemplars. This study tested these accounts' assumption that children store utterances as wholes by testing memory for familiar sequences of words. Using a newly available, dense corpus of child-directed speech, we identified frequently occurring chunks in the input (e.g., sit in your chair) and matched them to infrequent sequences (e.g., sit in your truck). We tested young children's ability to produce these sequences in a sentence-repetition test. Three-year-olds (n = 21) and 2-year-olds (n = 17) were significantly more likely to repeat frequent sequences correctly than to repeat infrequent sequences correctly. Moreover, the 3-year-olds were significantly faster to repeat the first three words of an item if they formed part of a chunk (e.g., they were quicker to say sit in your when the following word was chair than when it was truck). We discuss the implications of these results for theories of language development and processing.

  19. Size matters: Associations between the androgen receptor CAG repeat length and the intrafollicular hormone milieu

    DEFF Research Database (Denmark)

    Borgbo, T; Macek, M; Chrudimska, J


    Granulosa cell (GC) expressed androgen receptors (AR) and intrafollicular androgens are central to fertility. The transactivating domain of the AR contains a polymorphic CAG repeat sequence, which is linked to the transcriptional activity of AR and may influence the GC function. This study aims...... to evaluate the effects of the AR CAG repeat length on the intrafollicular hormone profiles, and the gene expression profiles of GC from human small antral follicles. In total, 190 small antral follicles (3-11 mm in diameter) were collected from 58 women undergoing ovarian cryopreservation for fertility...... expression compared to medium CAG repeat lengths (P = 0.03). In conclusion, long CAG repeat lengths in the AR were associated with significantly attenuated levels of androgens and an increased conversion of testosterone into oestradiol in human small antral follicles....

  20. Importance of the temporal structure of movement sequences on the ability of monkeys to use serial order information. (United States)

    Deffains, Marc; Legallet, Eric; Apicella, Paul


    The capacity to acquire motor skills through repeated practice of a sequence of movements underlies many everyday activities. Extensive research in humans has dealt with the importance of spatial and temporal factors on motor sequence learning, standing in contrast to the few studies available in animals, particularly in nonhuman primates. In the present experiments, we studied the effect of the serial order of stimuli and associated movements in macaque monkeys overtrained to make arm-reaching movements in response to spatially distinct visual targets. Under different conditions, the temporal structure of the motor sequence was varied by changing the duration of the interval between successive target stimuli or by adding a cue that reliably signaled the onset time of the forthcoming target stimulus. In each condition, the extent to which the monkeys are sensitive to the spatial regularities was assessed by comparing performance when stimulus locations follow a repeating sequence, as opposed to a random sequence. We observed no improvement in task performance on repeated sequence blocks, compared to random sequence blocks, when target stimuli are relatively distant from each other in time. On the other hand, the shortening of the time interval between successive target stimuli or, more efficiently, the addition of a temporal cue before the target stimulus yielded a performance advantage under repeated sequence, reflected in a decrease in the latency of arm and saccadic eye movements accompanied by an increased tendency for eye movements to occur in an anticipatory manner. Contrary to the effects on movement initiation, the serial order of stimuli and movements did not markedly affect the execution of movement. Moreover, the location of a given target in the random sequence influenced task performance based on the location of the preceding target, monkeys being faster in responding as a result of familiarity caused by extensive practice with some target transitions.

  1. Characterizing leader sequences of CRISPR loci

    DEFF Research Database (Denmark)

    Alkhnbashi, Omer; Shah, Shiraz Ali; Garrett, Roger Antony


    The CRISPR-Cas system is an adaptive immune system in many archaea and bacteria, which provides resistance against invading genetic elements. The first phase of CRISPR-Cas immunity is called adaptation, in which small DNA fragments are excised from genetic elements and are inserted into a CRISPR...... array generally adjacent to its so called leader sequence at one end of the array. It has been shown that transcription initiation and adaptation signals of the CRISPR array are located within the leader. However, apart from promoters, there is very little knowledge of sequence or structural motifs...... sequences by focusing on the consensus repeat of the adjacent CRISPR array and weak upstream conservation signals. We applied our tool to the analysis of a comprehensive genomic database and identified several characteristic properties of leader sequences specific to archaea and bacteria, ranging from...

  2. Isolation and characterization of repeat elements of the oak genome and their application in population analysis

    International Nuclear Information System (INIS)

    Fluch, S.; Burg, K.


    Four minisatellite sequence elements have been identified and isolated from the genome of the oak species Quercus petraea and Quercus robur. Minisatellites 1 and 2 are putative members of repeat families, while minisatellites 3 and 4 show repeat length variation among individuals of test populations. A 590 base pair (bp) long element has also been identified which reveals individual-specific autoradiographic patterns when used as probe in Southern hybridisations of genomic oak DNA. (author)

  3. Complete DNA sequence of the linear mitochondrial genome of the pathogenic yeast Candida parapsilosis

    DEFF Research Database (Denmark)

    Nosek, J.; Novotna, M.; Hlavatovicova, Z.


    The complete sequence of the mitochondrial DNA of the opportunistic yeast pathogen Candida parapsilosis was determined. The mitochondrial genome is represented by linear DNA molecules terminating with tandem repeats of a 738-bp unit. The number of repeats varies, thus generating a population...

  4. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum).

    Directory of Open Access Journals (Sweden)

    Kwang-Soo Cho

    We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. The copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Nucleotide and amino acid sequences of the coding regions are highly conserved, with about 98% homology, and four genes (rpoC2, ycf3, accD, and clpP) have high synonymous (Ks) values. PCR-based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum.

  5. Primary structure of and immunoglobulin E response to the repeat subunit of gp15/400 from human lymphatic filarial parasites

    NARCIS (Netherlands)

    Paxton, W. A.; Yazdanbakhsh, M.; Kurniawan, A.; Partono, F.; Maizels, R. M.; Selkirk, M. E.


    We have isolated and sequenced clones encoding the repeated subunit of the surface-associated glycoprotein gp15/400 from the two nematode species predominantly responsible for lymphatic filariasis in humans: Brugia malayi and Wuchereria bancrofti. The amino acid sequence of the 15-kDa subunit,

  6. Fixed recurrence and slip models better predict earthquake behavior than the time- and slip-predictable models 1: repeating earthquakes (United States)

    Rubinstein, Justin L.; Ellsworth, William L.; Chen, Kate Huihsuan; Uchida, Naoki


    The behavior of individual events in repeating earthquake sequences in California, Taiwan and Japan is better predicted by a model with fixed inter-event time or fixed slip than it is by the time- and slip-predictable models for earthquake occurrence. Given that repeating earthquakes are highly regular in both inter-event time and seismic moment, the time- and slip-predictable models seem ideally suited to explain their behavior. Taken together with evidence from the companion manuscript that shows similar results for laboratory experiments we conclude that the short-term predictions of the time- and slip-predictable models should be rejected in favor of earthquake models that assume either fixed slip or fixed recurrence interval. This implies that the elastic rebound model underlying the time- and slip-predictable models offers no additional value in describing earthquake behavior in an event-to-event sense, but its value in a long-term sense cannot be determined. These models likely fail because they rely on assumptions that oversimplify the earthquake cycle. We note that the time and slip of these events is predicted quite well by fixed slip and fixed recurrence models, so in some sense they are time- and slip-predictable. While fixed recurrence and slip models better predict repeating earthquake behavior than the time- and slip-predictable models, we observe a correlation between slip and the preceding recurrence time for many repeating earthquake sequences in Parkfield, California. This correlation is not found in other regions, and the sequences with the correlative slip-predictable behavior are not distinguishable from nearby earthquake sequences that do not exhibit this behavior.

  7. Inter Simple Sequence Repeat DNA (ISSR) Polymorphism Utility in Haploid Nicotiana Alata Irradiated Plants for Finding Markers Associated with Gamma Irradiation and Salinity

    International Nuclear Information System (INIS)

    El-Fiki, A.; Adly, M.; El-Metabteb, G.


    Nicotiana alata is an ornamental plant. It is a member of the family Solanaceae. Tobacco (Nicotiana spp.) is one of the most important commercial crops in the world. Wild Nicotiana species are a storehouse of genes related to several diseases and pests, as well as genes for several important phytochemicals and quality traits that are not present in cultivated varieties. Inter simple sequence repeat DNA (ISSR) analysis was used to determine the degree of genetic variation in treated haploid Nicotiana alata plants. Total genomic DNAs from different treated haploid plantlets were amplified using five specific primers. All primers were polymorphic. A total of 209 bands were amplified, of which 135 (59.47%) were polymorphic across the radiation treatments. The level of polymorphism among the salinity treatments was 181 (85.6%), whereas the polymorphism among the combined gamma radiation doses and salinity concentrations was 283 (73.95%). Treatment relationships were estimated through cluster analysis (UPGMA) based on ISSR data.
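
    How a percentage of polymorphic bands and a pairwise similarity might be computed from ISSR presence/absence scores is sketched below. The band matrix is hypothetical, the function names are illustrative, and Jaccard similarity is an assumption chosen for the example; the abstract does not state which coefficient fed the UPGMA clustering.

```python
def percent_polymorphic(band_matrix):
    """band_matrix: one row per band, each row holding 0/1 presence per sample.

    A band counts as polymorphic when it is present in some samples but not all.
    Returns (polymorphic_count, total_bands, percentage).
    """
    total = len(band_matrix)
    polymorphic = sum(1 for row in band_matrix if 0 < sum(row) < len(row))
    return polymorphic, total, 100.0 * polymorphic / total


def jaccard_similarity(a, b):
    """Jaccard similarity between two samples' 0/1 band profiles."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # By convention here, two profiles with no bands at all score 0.0.
    return shared / either if either else 0.0


if __name__ == "__main__":
    # Hypothetical scores: 4 bands (rows) across 3 samples (columns).
    bands = [
        [1, 1, 1],  # monomorphic band
        [1, 0, 1],
        [0, 0, 1],
        [1, 1, 0],
    ]
    print(percent_polymorphic(bands))
    # Transpose to get one band profile per sample.
    samples = [list(col) for col in zip(*bands)]
    print(jaccard_similarity(samples[0], samples[1]))
```

    A matrix of such pairwise similarities (or the corresponding distances) is the usual input to a UPGMA clustering routine.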

  8. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis

    DEFF Research Database (Denmark)

    Carlton, Jane M.; Hirt, Robert P.; Silva, Joana C.


    We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the approximately 160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion...... environment. The genome sequence predicts previously unknown functions for the hydrogenosome, which support a common evolutionary origin of this unusual organelle with mitochondria....

  9. A 135-kilodalton surface antigen of Mycoplasma hominis PG21 contains multiple directly repeated sequences

    DEFF Research Database (Denmark)

    Ladefoged, Søren; Birkelund, Svend; Hauge, S


    gene was sequenced, and its gene product was characterized with the goal of elucidating the structure and function of Lmp1. A total of 7,196 bp in the lmp1 region was sequenced. An open reading frame of 4,032 bp, encoding a protein of 1,344 amino acids with a calculated molecular weight of 147...

  10. Detection of Sequence Polymorphism in Rubus Occidentalis L. Monomorphic Microsatellite Markers by High Resolution Melting (United States)

    Microsatellite, or simple sequence repeat (SSR) markers, are valuable as co-dominant genetic markers with a variety of applications such as DNA fingerprinting, linkage mapping, and population structure analysis. Development of microsatellite primers through the identification of appropriate repeate...

  11. High-resolution comparative mapping among man, cattle and mouse suggests a role for repeat sequences in mammalian genome evolution

    Directory of Open Access Journals (Sweden)

    Rodolphe François


    Background: Comparative mapping provides new insights into the evolutionary history of genomes. In particular, recent studies in mammals have suggested a role for segmental duplication in genome evolution. In some species such as Drosophila or maize, transposable elements (TEs) have been shown to be involved in chromosomal rearrangements. In this work, we have explored the presence of interspersed repeats in regions of chromosomal rearrangements, using an updated high-resolution integrated comparative map among cattle, man and mouse. Results: The bovine, human and mouse comparative autosomal map has been constructed using data from bovine genetic and physical maps and from FISH-mapping studies. We confirm most previous results but also reveal some discrepancies. A total of 211 conserved segments have been identified between cattle and man, of which 33 are new segments and 72 correspond to extended, previously known segments. The resulting map covers 91% and 90% of the human and bovine genomes, respectively. Analysis of breakpoint regions revealed a high density of species-specific interspersed repeats in the human and mouse genomes. Conclusion: Analysis of the breakpoint regions has revealed specific repeat density patterns, suggesting that TEs may have played a significant role in chromosome evolution and genome plasticity. However, we cannot rule out that repeats and breakpoints accumulate independently in the few same regions where modifications are better tolerated. Likewise, we cannot ascertain whether increased TE density is the cause or the consequence of chromosome rearrangements. Nevertheless, the identification of high density repeat clusters combined with a well-documented repeat phylogeny should highlight probable breakpoints, and permit their precise dating. Combining new statistical models taking the present information into account should help reconstruct ancestral karyotypes.

  12. Repeated bonding of fixed retainer increases the risk of enamel fracture. (United States)

    Chinvipas, Netrporn; Hasegawa, Yuh; Terada, Kazuto


    The aim of this study was to investigate the influences of repeated bonding, using 2 different orthodontic adhesive systems, on the shear bond strength (SBS) and the enamel surface morphology. Sixty premolars were divided into 2 groups (n = 30), and either Transbond XT (T group) or Fuji Ortho LC (F group) adhesives were used. SBS was measured 24 h after bonding, using a universal testing machine. Then, the enamel surfaces were investigated and the mode of failure was described using adhesive remnant index (ARI) scores. After each debonding, 10 teeth from each group were examined by scanning electron microscopy to determine the penetration of adhesives, the length of resin tags, and the state of the enamel surface. The other teeth were subjected to two more bonding/debonding procedures. In T group, the second debonding sequences had significantly higher bond strengths than the other sequences. The length of resin tags was greatest in the second debonding sequence, although there was no significant difference. In F group, the SBS increased with further rebonding and the failure mode tended towards cohesive failure. In both groups, the ARI scores increased with rebonding. Enamel loss could have occurred with both adhesives, although the surfaces appeared unchanged to the naked eye. From this study, we suggest that enamel damage caused by repeated bonding is of concern. To prevent bond failure, we should pay attention to the adhesion method used for bondable retainers.

  13. The complete chloroplast genome sequences of Lychnis wilfordii and Silene capitata and comparative analyses with other Caryophyllaceae genomes. (United States)

    Kang, Jong-Soo; Lee, Byoung Yoon; Kwak, Myounghai


    The complete chloroplast genomes of Lychnis wilfordii and Silene capitata were determined and compared with ten previously reported Caryophyllaceae chloroplast genomes. The chloroplast genome sequences of L. wilfordii and S. capitata contain 152,320 bp and 150,224 bp, respectively. The gene contents and orders among 12 Caryophyllaceae species are consistent, but several microstructural changes have occurred. Expansion of the inverted repeat (IR) regions at the large single copy (LSC)/IRb and small single copy (SSC)/IR boundaries led to partial or entire gene duplications. Additionally, rearrangements of the LSC region were caused by gene inversions and/or transpositions. The 18 kb inversions, which occurred three times in different lineages of tribe Sileneae, were thought to be facilitated by the intermolecular duplicated sequences. Sequence analyses of the L. wilfordii and S. capitata genomes revealed 39 and 43 repeats, respectively, including forward, palindromic, and reverse repeats. In addition, a total of 67 and 56 simple sequence repeats were discovered in the L. wilfordii and S. capitata chloroplast genomes, respectively. Finally, we constructed phylogenetic trees of the 12 Caryophyllaceae species and two Amaranthaceae species based on 73 protein-coding genes using both maximum parsimony and likelihood methods.
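
    Simple sequence repeats of the kind counted in this record are commonly located by scanning for short motifs repeated in tandem. The following is a minimal regex-based sketch; the function name, the minimum repeat thresholds, and the test sequence are assumptions for illustration and do not reproduce the tool or settings used in the study.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Locate perfect tandem repeats of short motifs (illustrative sketch).

    min_repeats maps motif length -> minimum number of tandem copies.
    The default thresholds are hypothetical, not those of the study.
    Returns a list of (start, motif, copies) sorted by start position.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 5, 3: 4}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA"), which are already reported at the shorter length.
            if (motif + motif).find(motif, 1) < len(motif):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence with an (AT)6 repeat and an (A)12 run.
    example = "GGC" + "AT" * 6 + "CCG" + "A" * 12 + "T"
    print(find_ssrs(example))
```

    Only perfect repeats are reported; compound or interrupted microsatellites would need additional handling.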

  14. Genomic sequencing of Pleistocene cave bears

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.


    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  15. CORE-SINEs: eukaryotic short interspersed retroposing elements with common sequence motifs. (United States)

    Gilbert, N; Labuda, D


    A 65-bp "core" sequence is dispersed in hundreds of thousands of copies in the human genome. This sequence was found to constitute the central segment of a group of short interspersed elements (SINEs), referred to as mammalian-wide interspersed repeats, that proliferated before the radiation of placental mammals. Here, we propose that the core identifies an ancient tRNA-like SINE element, which survived in different lineages such as mammals, reptiles, birds, and fish, as well as mollusks, presumably for >550 million years. This element gave rise to a number of sequence families (CORE-SINEs), including mammalian-wide interspersed repeats, whose distinct 3' ends are shared with different families of long interspersed elements (LINEs). The evolutionary success of the generic CORE-SINE element can be related to the recruitment of the internal promoter from highly transcribed host RNA as well as to its capacity to adapt to changing retropositional opportunities by sequence exchange with actively amplifying LINEs. It reinforces the notion that the very existence of SINEs depends on the cohabitation with both LINEs and the host genome.

  16. Genetic Contributors to Intergenerational CAG Repeat Instability in Huntington's Disease Knock-In Mice. (United States)

    Neto, João Luís; Lee, Jong-Min; Afridi, Ali; Gillis, Tammy; Guide, Jolene R; Dempsey, Stephani; Lager, Brenda; Alonso, Isabel; Wheeler, Vanessa C; Pinto, Ricardo Mouro


    Huntington's disease (HD) is a neurodegenerative disorder caused by the expansion of a CAG trinucleotide repeat in exon 1 of the HTT gene. Longer repeat sizes are associated with increased disease penetrance and earlier ages of onset. Intergenerationally unstable transmissions are common in HD families, partly underlying the genetic anticipation seen in this disorder. HD CAG knock-in mouse models also exhibit a propensity for intergenerational repeat size changes. In this work, we examine intergenerational instability of the CAG repeat in over 20,000 transmissions in the largest HD knock-in mouse model breeding datasets reported to date. We confirmed previous observations that parental sex drives the relative ratio of expansions and contractions. The large datasets further allowed us to distinguish effects of paternal CAG repeat length on the magnitude and frequency of expansions and contractions, as well as the identification of large repeat size jumps in the knock-in models. Distinct degrees of intergenerational instability were observed between knock-in mice of six background strains, indicating the occurrence of trans-acting genetic modifiers. We also found that lines harboring a neomycin resistance cassette upstream of Htt showed reduced expansion frequency, indicative of a contributing role for sequences in cis, with the expanded repeat as modifiers of intergenerational instability. These results provide a basis for further understanding of the mechanisms underlying intergenerational repeat instability. Copyright © 2017 by the Genetics Society of America.

  17. The sequence and de novo assembly of the giant panda genome (United States)

    Li, Ruiqiang; Fan, Wei; Tian, Geng; Zhu, Hongmei; He, Lin; Cai, Jing; Huang, Quanfei; Cai, Qingle; Li, Bo; Bai, Yinqi; Zhang, Zhihe; Zhang, Yaping; Wang, Wen; Li, Jun; Wei, Fuwen; Li, Heng; Jian, Min; Li, Jianwen; Zhang, Zhaolei; Nielsen, Rasmus; Li, Dawei; Gu, Wanjun; Yang, Zhentao; Xuan, Zhaoling; Ryder, Oliver A.; Leung, Frederick Chi-Ching; Zhou, Yan; Cao, Jianjun; Sun, Xiao; Fu, Yonggui; Fang, Xiaodong; Guo, Xiaosen; Wang, Bo; Hou, Rong; Shen, Fujun; Mu, Bo; Ni, Peixiang; Lin, Runmao; Qian, Wubin; Wang, Guodong; Yu, Chang; Nie, Wenhui; Wang, Jinhuan; Wu, Zhigang; Liang, Huiqing; Min, Jiumeng; Wu, Qi; Cheng, Shifeng; Ruan, Jue; Wang, Mingwei; Shi, Zhongbin; Wen, Ming; Liu, Binghang; Ren, Xiaoli; Zheng, Huisong; Dong, Dong; Cook, Kathleen; Shan, Gao; Zhang, Hao; Kosiol, Carolin; Xie, Xueying; Lu, Zuhong; Zheng, Hancheng; Li, Yingrui; Steiner, Cynthia C.; Lam, Tommy Tsan-Yuk; Lin, Siyuan; Zhang, Qinghui; Li, Guoqing; Tian, Jing; Gong, Timing; Liu, Hongde; Zhang, Dejin; Fang, Lin; Ye, Chen; Zhang, Juanbin; Hu, Wenbo; Xu, Anlong; Ren, Yuanyuan; Zhang, Guojie; Bruford, Michael W.; Li, Qibin; Ma, Lijia; Guo, Yiran; An, Na; Hu, Yujie; Zheng, Yang; Shi, Yongyong; Li, Zhiqiang; Liu, Qing; Chen, Yanling; Zhao, Jing; Qu, Ning; Zhao, Shancen; Tian, Feng; Wang, Xiaoling; Wang, Haiyin; Xu, Lizhi; Liu, Xiao; Vinar, Tomas; Wang, Yajun; Lam, Tak-Wah; Yiu, Siu-Ming; Liu, Shiping; Zhang, Hemin; Li, Desheng; Huang, Yan; Wang, Xia; Yang, Guohua; Jiang, Zhi; Wang, Junyi; Qin, Nan; Li, Li; Li, Jingxiang; Bolund, Lars; Kristiansen, Karsten; Wong, Gane Ka-Shu; Olson, Maynard; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian; Wang, Jun


    Using next-generation sequencing technology alone, we have successfully generated and assembled a draft sequence of the giant panda genome. The assembled contigs (2.25 gigabases (Gb)) cover approximately 94% of the whole genome, and the remaining gaps (0.05 Gb) seem to contain carnivore-specific repeats and tandem repeats. Comparisons with the dog and human showed that the panda genome has a lower divergence rate. The assessment of panda genes potentially underlying some of its unique traits indicated that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition. We also identified more than 2.7 million heterozygous single nucleotide polymorphisms in the diploid genome. Our data and analyses provide a foundation for promoting mammalian genetic research, and demonstrate the feasibility for using next-generation sequencing technologies for accurate, cost-effective and rapid de novo assembly of large eukaryotic genomes. PMID:20010809

  18. Pms2 suppresses large expansions of the (GAA·TTC)n sequence in neuronal tissues. (United States)

    Bourn, Rebecka L; De Biase, Irene; Pinto, Ricardo Mouro; Sandi, Chiranjeevi; Al-Mahdawi, Sahar; Pook, Mark A; Bidichandani, Sanjay I


    Expanded trinucleotide repeat sequences are the cause of several inherited neurodegenerative diseases. Disease pathogenesis is correlated with several features of somatic instability of these sequences, including further large expansions in postmitotic tissues. The presence of somatic expansions in postmitotic tissues is consistent with DNA repair being a major determinant of somatic instability. Indeed, proteins in the mismatch repair (MMR) pathway are required for instability of the expanded (CAG·CTG)(n) sequence, likely via recognition of intrastrand hairpins by MutSβ. It is not clear if or how MMR would affect instability of disease-causing expanded trinucleotide repeat sequences that adopt secondary structures other than hairpins, such as the triplex/R-loop forming (GAA·TTC)(n) sequence that causes Friedreich ataxia. We analyzed somatic instability in transgenic mice that carry an expanded (GAA·TTC)(n) sequence in the context of the human FXN locus and lack the individual MMR proteins Msh2, Msh6 or Pms2. The absence of Msh2 or Msh6 resulted in a dramatic reduction in somatic mutations, indicating that mammalian MMR promotes instability of the (GAA·TTC)(n) sequence via MutSα. The absence of Pms2 resulted in increased accumulation of large expansions in the nervous system (cerebellum, cerebrum, and dorsal root ganglia) but not in non-neuronal tissues (heart and kidney), without affecting the prevalence of contractions. Pms2 suppressed large expansions specifically in tissues showing MutSα-dependent somatic instability, suggesting that they may act on the same lesion or structure associated with the expanded (GAA·TTC)(n) sequence. We conclude that Pms2 specifically suppresses large expansions of a pathogenic trinucleotide repeat sequence in neuronal tissues, possibly acting independently of the canonical MMR pathway.

  19. Genome-wide SNP identification by high-throughput sequencing and selective mapping allows sequence assembly positioning using a framework genetic linkage map

    Directory of Open Access Journals (Sweden)

    Xu Xiangming


    Background: Determining the position and order of contigs and scaffolds from a genome assembly within an organism's genome remains a technical challenge in a majority of sequencing projects. In order to exploit contemporary technologies for DNA sequencing, we developed a strategy for whole genome single nucleotide polymorphism sequencing allowing the positioning of sequence contigs onto a linkage map using the bin mapping method. Results: The strategy was tested on a draft genome of the fungal pathogen Venturia inaequalis, the causal agent of apple scab, and further validated using sequence contigs derived from the diploid plant genome Fragaria vesca. Using our novel method we were able to anchor 70% and 92% of sequence assemblies for V. inaequalis and F. vesca, respectively, to genetic linkage maps. Conclusions: We demonstrated the utility of this approach by accurately determining the bin map positions of the majority of the large sequence contigs from each genome sequence and validated our method by mapping single sequence repeat markers derived from sequence contigs on a full mapping population.

  20. Inter-Simple Sequence Repeat (ISSR) Markers to Study Genetic Diversity Among Cotton Cultivars in Associated with Salt Tolerance

    Directory of Open Access Journals (Sweden)

    Ali Akbar ABDI


    Developing salt-tolerant crops is very important as a significant proportion of cultivated land is salt-affected. Screening and selection of salt-tolerant genotypes of cotton using DNA molecular markers not only introduce tolerant cultivars useful for hybridization and breeding programs but also detect DNA regions involved in the mechanism of salinity tolerance. To study this, 28 cotton cultivars, including 8 Iranian cotton varieties, were grown in pots under greenhouse conditions and three salt treatments were imposed with salt solutions (0, 70 and 140 mM NaCl). Eight agronomic traits including root length, root fresh weight, root dry weight, chlorophyll and fluorescence index, K+ and Na+ contents in shoot (above ground biomass), and K+/Na+ ratio were measured. Cluster analysis of cultivars based on the measured agronomic traits showed 'Cindose' and 'Ciacra' as the most tolerant cultivars, and 'B-557' and '43347' as the cultivars most sensitive to salt damage. A total of 65 polymorphic DNA fragments were generated at 14 inter-simple sequence repeat (ISSR) loci. Plants of the 28 cotton cultivars grouped into three clusters based on ISSR markers. Regression analysis of markers in relation to trait data showed that 23, 33 and 30 markers were associated with the measured traits in the three salt treatments, respectively. These markers might help breeders in any marker-assisted selection program aimed at improving the tolerance of cotton cultivars to salt stress.

  1. Genetic diversity of the Andean tuber-bearing species, oca (Oxalis tuberosa Mol.), investigated by inter-simple sequence repeats. (United States)

    Pissard, A; Ghislain, M; Bertin, P


    The Andean tuber-bearing species, Oxalis tuberosa Mol., is a vegetatively propagated crop cultivated in the uplands of the Andes. Its genetic diversity was investigated in the present study using the inter-simple sequence repeat (ISSR) technique. Thirty-two accessions originating from South America (Argentina, Bolivia, Chile, and Peru) and maintained in vitro were