WorldWideScience

Sample records for repeated nucleotide sequences

  1. Clusters of nucleotide substitutions and insertion/deletion mutations are associated with repeat sequences.

    Directory of Open Access Journals (Sweden)

    Michael J McDonald

    2011-06-01

    The genome-sequencing gold rush has facilitated the use of comparative genomics to uncover patterns of genome evolution, although their causal mechanisms remain elusive. One such trend, ubiquitous to prokarya and eukarya, is the association of insertion/deletion mutations (indels) with increases in the nucleotide substitution rate extending over hundreds of base pairs. The prevailing hypothesis is that indels are themselves mutagenic agents. Here, we employ population genomics data from Escherichia coli, Saccharomyces paradoxus, and Drosophila to provide evidence suggesting that it is not the indels per se but the sequence in which indels occur that causes the accumulation of nucleotide substitutions. We found that about two-thirds of indels are closely associated with repeat sequences and that repeat sequence abundance could be used to identify regions of elevated sequence diversity, independently of indels. Moreover, the mutational signature of indel-proximal nucleotide substitutions matches that of error-prone DNA polymerases. We propose that repeat sequences promote an increased probability of replication fork arrest, causing the persistent recruitment of error-prone DNA polymerases to specific sequence regions over evolutionary time scales. Experimental measures of the mutation rates of engineered DNA sequences and analyses of experimentally obtained collections of spontaneous mutations provide molecular evidence supporting our hypothesis. This study uncovers a new role for repeat sequences in genome evolution and provides an explanation of how fine-scale sequence contextual effects influence mutation rates and thereby evolution.

  2. Clusters of nucleotide substitutions and insertion/deletion mutations are associated with repeat sequences.

    Science.gov (United States)

    McDonald, Michael J; Wang, Wei-Chi; Huang, Hsien-Da; Leu, Jun-Yi

    2011-06-01

    The genome-sequencing gold rush has facilitated the use of comparative genomics to uncover patterns of genome evolution, although their causal mechanisms remain elusive. One such trend, ubiquitous to prokarya and eukarya, is the association of insertion/deletion mutations (indels) with increases in the nucleotide substitution rate extending over hundreds of base pairs. The prevailing hypothesis is that indels are themselves mutagenic agents. Here, we employ population genomics data from Escherichia coli, Saccharomyces paradoxus, and Drosophila to provide evidence suggesting that it is not the indels per se but the sequence in which indels occur that causes the accumulation of nucleotide substitutions. We found that about two-thirds of indels are closely associated with repeat sequences and that repeat sequence abundance could be used to identify regions of elevated sequence diversity, independently of indels. Moreover, the mutational signature of indel-proximal nucleotide substitutions matches that of error-prone DNA polymerases. We propose that repeat sequences promote an increased probability of replication fork arrest, causing the persistent recruitment of error-prone DNA polymerases to specific sequence regions over evolutionary time scales. Experimental measures of the mutation rates of engineered DNA sequences and analyses of experimentally obtained collections of spontaneous mutations provide molecular evidence supporting our hypothesis. This study uncovers a new role for repeat sequences in genome evolution and provides an explanation of how fine-scale sequence contextual effects influence mutation rates and thereby evolution.

  3. Automated discovery of single nucleotide polymorphism and simple sequence repeat molecular genetic markers.

    Science.gov (United States)

    Batley, Jacqueline; Jewell, Erica; Edwards, David

    2007-01-01

    Molecular genetic markers represent one of the most powerful tools for the analysis of genomes. Molecular marker technology has developed rapidly over the last decade, and two forms of sequence-based markers, simple sequence repeats (SSRs), also known as microsatellites, and single nucleotide polymorphisms (SNPs), now predominate in applications of modern genetic analysis. The availability of large sequence data sets permits mining for SSRs and SNPs, which may then be applied to genetic trait mapping and marker-assisted selection. Here, we describe Web-based automated methods for the discovery of these SSRs and SNPs from sequence data. SSRPrimer enables the real-time discovery of SSRs within submitted DNA sequences, with the concomitant design of PCR primers for SSR amplification. Alternatively, users may browse the SSR Taxonomy Tree to identify predetermined SSR amplification primers for any species represented within the GenBank database. SNPServer uses a redundancy-based approach to identify SNPs within DNA sequence data. Following submission of a sequence of interest, SNPServer uses BLAST to identify similar sequences, CAP3 to cluster and assemble these sequences, and then the SNP discovery software autoSNP to detect SNPs and insertion/deletion (indel) polymorphisms.

  4. Nucleotide sequence, DNA damage location and protein stoichiometry influence base excision repair outcome at CAG/CTG repeats

    Science.gov (United States)

    Goula, Agathi-Vasiliki; Pearson, Christopher E.; Della Maria, Julie; Trottier, Yvon; Tomkinson, Alan E.; Wilson, David M.; Merienne, Karine

    2012-01-01

    Expansion of CAG/CTG repeats is the underlying cause of more than fourteen genetic disorders, including Huntington’s disease (HD) and myotonic dystrophy. The mutational process is ongoing, with increases in repeat size enhancing the toxicity of the expansion in specific tissues. In many repeat diseases the repeats exhibit high instability in the striatum, whereas instability is minimal in the cerebellum. We provide molecular insights as to how base excision repair (BER) protein stoichiometry may contribute to the tissue-selective instability of CAG/CTG repeats by using specific repair assays. Oligonucleotide substrates with an abasic site were mixed with either reconstituted BER protein stoichiometries mimicking the levels present in HD mouse striatum or cerebellum, or with protein extracts prepared from HD mouse striatum or cerebellum. In both cases, repair efficiency at CAG/CTG repeats and at control DNA sequences was markedly reduced under the striatal conditions, likely due to the lower levels of APE1, FEN1 and LIG1. Damage located towards the 5’ end of the repeat tract was poorly repaired, accumulating incompletely processed intermediates, as compared to an AP lesion in the centre or at the 3’ end of the repeats or within a control sequence. Moreover, repair of lesions at the 5’ end of CAG or CTG repeats involved multinucleotide synthesis, particularly under the cerebellar stoichiometry, suggesting that long-patch BER (LP-BER) processes lesions at sequences susceptible to hairpin formation. Our results show that BER stoichiometry, nucleotide sequence and DNA damage position modulate repair outcome, and suggest that a suboptimal LP-BER activity promotes CAG/CTG repeat instability. PMID:22497302

  5. Empirical Comparison of Simple Sequence Repeats and Single Nucleotide Polymorphisms in Assessment of Maize Diversity and Relatedness

    Science.gov (United States)

    Hamblin, Martha T.; Warburton, Marilyn L.; Buckler, Edward S.

    2007-01-01

    While Simple Sequence Repeats (SSRs) are extremely useful genetic markers, recent advances in technology have produced a shift toward use of single nucleotide polymorphisms (SNPs). The different mutational properties of these two classes of markers result in differences in heterozygosities and allele frequencies that may have implications for their use in assessing relatedness and evaluation of genetic diversity. We compared analyses based on 89 SSRs (primarily dinucleotide repeats) to analyses based on 847 SNPs in individuals from the same 259 inbred maize lines, which had been chosen to represent the diversity available among current and historic lines used in breeding. The SSRs performed better at clustering germplasm into populations than did a set of 847 SNPs or 554 SNP haplotypes, and SSRs provided more resolution in measuring genetic distance based on allele-sharing. Except for closely related pairs of individuals, measures of distance based on SSRs were only weakly correlated with measures of distance based on SNPs. Our results suggest that 1) large numbers of SNP loci will be required to replace highly polymorphic SSRs in studies of diversity and relatedness and 2) relatedness among highly-diverged maize lines is difficult to measure accurately regardless of the marker system. PMID:18159250

  6. An integrated genetic linkage map of watermelon and genetic diversity based on single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers

    Science.gov (United States)

    Watermelon (Citrullus lanatus var. lanatus) is an important vegetable fruit throughout the world. A high number of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers should provide large coverage of the watermelon genome and high phylogenetic resolution of germplasm acces...

  7. Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats.

    Science.gov (United States)

    Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang

    2016-09-01

    Genomic information about Muscovy duck parvovirus (MDPV) is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITR) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strains FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based on the protein-coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion in the ITR may contribute to the genome evolution of MDPV under immunization pressure.

  8. Recombination-Independent Recognition of DNA Homology for Repeat-Induced Point Mutation (RIP) Is Modulated by the Underlying Nucleotide Sequence.

    Directory of Open Access Journals (Sweden)

    Eugene Gladyshev

    2016-05-01

    Haploid germline nuclei of many filamentous fungi have the capacity to detect homologous nucleotide sequences present on the same or different chromosomes. Once recognized, such sequences can undergo cytosine methylation or cytosine-to-thymine mutation specifically over the extent of shared homology. In Neurospora crassa this process is known as Repeat-Induced Point mutation (RIP). Previously, we showed that RIP did not require MEI-3, the only RecA homolog in Neurospora, and that it could detect homologous trinucleotides interspersed with a matching periodicity of 11 or 12 base-pairs along participating chromosomal segments. This pattern was consistent with a mechanism of homology recognition that involved direct interactions between co-aligned double-stranded (ds) DNA molecules, where sequence-specific dsDNA/dsDNA contacts could be established using no more than one triplet per turn. In the present study we have further explored the DNA sequence requirements for RIP. In our previous work, interspersed homologies were always examined in the context of a relatively long adjoining region of perfect homology. Using a new repeat system lacking this strong interaction, we now show that interspersed homologies with overall sequence identity of only 36% can be efficiently detected by RIP in the absence of any perfect homology. Furthermore, in this new system, where the total amount of homology is near the critical threshold required for RIP, the nucleotide composition of participating DNA molecules is identified as an important factor. Our results specifically pinpoint the triplet 5'-GAC-3' as a particularly efficient unit of homology recognition. Finally, we present experimental evidence that the process of homology sensing can be uncoupled from the downstream mutation. Taken together, our results advance the notion that sequence information can be compared directly between double-stranded DNA molecules during RIP and, potentially, in other processes

  9. Recombination-Independent Recognition of DNA Homology for Repeat-Induced Point Mutation (RIP) Is Modulated by the Underlying Nucleotide Sequence.

    Science.gov (United States)

    Gladyshev, Eugene; Kleckner, Nancy

    2016-05-01

    Haploid germline nuclei of many filamentous fungi have the capacity to detect homologous nucleotide sequences present on the same or different chromosomes. Once recognized, such sequences can undergo cytosine methylation or cytosine-to-thymine mutation specifically over the extent of shared homology. In Neurospora crassa this process is known as Repeat-Induced Point mutation (RIP). Previously, we showed that RIP did not require MEI-3, the only RecA homolog in Neurospora, and that it could detect homologous trinucleotides interspersed with a matching periodicity of 11 or 12 base-pairs along participating chromosomal segments. This pattern was consistent with a mechanism of homology recognition that involved direct interactions between co-aligned double-stranded (ds) DNA molecules, where sequence-specific dsDNA/dsDNA contacts could be established using no more than one triplet per turn. In the present study we have further explored the DNA sequence requirements for RIP. In our previous work, interspersed homologies were always examined in the context of a relatively long adjoining region of perfect homology. Using a new repeat system lacking this strong interaction, we now show that interspersed homologies with overall sequence identity of only 36% can be efficiently detected by RIP in the absence of any perfect homology. Furthermore, in this new system, where the total amount of homology is near the critical threshold required for RIP, the nucleotide composition of participating DNA molecules is identified as an important factor. Our results specifically pinpoint the triplet 5'-GAC-3' as a particularly efficient unit of homology recognition. Finally, we present experimental evidence that the process of homology sensing can be uncoupled from the downstream mutation. Taken together, our results advance the notion that sequence information can be compared directly between double-stranded DNA molecules during RIP and, potentially, in other processes where homologous

  10. Genetic diversity in domesticated soybean (Glycine max) and its wild progenitor (Glycine soja) for simple sequence repeat and single-nucleotide polymorphism loci.

    Science.gov (United States)

    Li, Ying-Hui; Li, Wei; Zhang, Chen; Yang, Liang; Chang, Ru-Zhen; Gaut, Brandon S; Qiu, Li-Juan

    2010-10-01

    • The study of genetic diversity between a crop and its wild relatives may yield fundamental insights into evolutionary history and the process of domestication. • In this study, we genotyped a sample of 303 accessions of domesticated soybean (Glycine max) and its wild progenitor Glycine soja with 99 microsatellite markers and 554 single-nucleotide polymorphism (SNP) markers. • The simple sequence repeat (SSR) loci averaged 21.5 alleles per locus and an overall Nei's gene diversity of 0.77. The SNPs had substantially lower genetic diversity (0.35) than the SSRs. SSR analyses indicated that G. soja exhibited higher diversity than G. max, but SNPs provided a slightly different snapshot of diversity between the two taxa. For both marker types, the primary division of genetic diversity was between the wild and domesticated accessions. Within taxa, G. max consisted of four geographic regions in China. G. soja formed six subgroups. Genealogical analyses indicated that cultivated soybean tended to form a monophyletic clade with respect to G. soja. • G. soja and G. max represent distinct germplasm pools. Limited evidence of admixture was discovered between these two species. Overall, our analyses are consistent with the origin of G. max from regions along the Yellow River of China.
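
    The gene diversity figures quoted in this record (0.77 for SSRs, 0.35 for SNPs) are per-locus summaries of allele frequencies. As a rough illustration only, the sketch below computes Nei's gene diversity, H = 1 - sum(p_i^2), for a single locus. The function name and the toy allele lists are assumptions made for this example and are not taken from the study.

```python
from collections import Counter


def nei_gene_diversity(alleles, unbiased=False):
    """Nei's gene diversity H = 1 - sum(p_i^2) for one locus.

    `alleles` is a flat list of observed allele labels (hypothetical input format).
    With unbiased=True the small-sample correction n / (n - 1) is applied.
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("need at least one allele observation")
    counts = Counter(alleles)
    h = 1.0 - sum((c / n) ** 2 for c in counts.values())
    if unbiased and n > 1:
        h *= n / (n - 1)
    return h


# Toy data (invented): a multi-allelic SSR-like locus and a biallelic SNP-like locus.
ssr_like = ["a1", "a2", "a3", "a4", "a5", "a1", "a2", "a3"]
snp_like = ["A", "A", "A", "G", "A", "A", "G", "A"]

h_ssr = nei_gene_diversity(ssr_like)
h_snp = nei_gene_diversity(snp_like)
```

    A biallelic locus can never exceed H = 0.5 under this formula, whereas a locus with many alleles can approach 1, which is one reason SNP diversity values tend to sit below SSR values.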

  11. Mining of simple sequence repeats in the Genome of Gentianaceae

    Directory of Open Access Journals (Sweden)

    R Sathishkumar

    2011-01-01

    Simple sequence repeats (SSRs), or short tandem repeats, are short repeat motifs that show a high level of length polymorphism due to insertion or deletion mutations of one or more repeat types. Here, we present the detection and abundance of microsatellites, or SSRs, in nucleotide sequences of the Gentianaceae family. A total of 545 SSRs were mined in 4698 nucleotide sequences downloaded from the National Center for Biotechnology Information (NCBI). Among the SSR sequences, the repeat-type counts were 429 mononucleotide, 99 dinucleotide, 15 trinucleotide, and 2 hexanucleotide repeats. Mononucleotide repeats were the most abundant repeat type (about 78%), followed by dinucleotide repeats (18.16%). Primer design was attempted for all 545 identified SSRs, but primer pairs were obtained for only 169 sequences.
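
    The record does not say which tool or thresholds were used for mining. Purely as a sketch of how SSR mining can work, the example below scans a sequence with regular-expression backreferences for motifs of 1 to 6 bp. The minimum repeat counts in MIN_REPEATS, the function names, and the demo sequence are all assumptions chosen for illustration.

```python
import re

# Assumed minimum number of tandem copies per motif length (not from the record).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA', 'AGAG')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len in sorted(min_repeats):
        copies = min_repeats[unit_len]
        # Group 1 captures one motif; \1{n-1,} demands at least n copies in total.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # e.g. 'AA' runs are already reported as mononucleotide 'A'
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


# Invented demo sequence with one mono-, one di- and one trinucleotide tract.
demo = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "CAG" * 5 + "T"
ssrs = find_ssrs(demo)
```

    Counting the hits by motif length would give the kind of mono/di/tri breakdown reported in the abstract. Only perfect repeats are detected here; tools that tolerate interruptions need a different approach.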

  12. The Role of the Y-Chromosome in the Establishment of Murine Hybrid Dysgenesis and in the Analysis of the Nucleotide Sequence Organization, Genetic Transmission and Evolution of Repeated Sequences.

    Science.gov (United States)

    Nallaseth, Ferez Soli

    The Y-chromosome presents a unique cytogenetic framework for the evolution of nucleotide sequences. Alignment of nine Y-chromosomal fragments in their increasing Y-specific/non Y-specific (male/female) sequence divergence ratios was directly and inversely related to their interspersion on these two respective genomic fractions. Sequence analysis confirmed a direct relationship between divergence ratios and the Alu, LINE-1, Satellite and their derivative oligonucleotide contents. Thus their relocation on the Y-chromosome is followed by sequence divergence rather than the well documented concerted evolution of these non-coding progenitor repeated sequences. Five of the nine Y-chromosomal fragments are non-pseudoautosomal and transcribed into heterogeneous poly(A)+ RNA and thus can be retrotransposed. Evolutionary and computer analysis identified homologous oligonucleotide tracts in several human loci suggesting common and random mechanistic origins. Dysgenic genomes represent the accelerated evolution driving sequence divergence (McClintock, 1984). Sex reversal and sterility characterizing dysgenesis occurs in C57BL/6JY^Pos but not in 129/SvY^Pos derivative strains. High frequency, random, multi-locus deletion products of the feral Y^Pos-chromosome are generated in the germlines of F1(C57BL/6J X 129/SvY^Pos)(male) and C57BL/6JY^Pos(male) but not in 129/SvY^Pos(male). Equal, 10^-1, 10^-2, and 0 copies (relative to males) of Y^Pos-specific deletion products respectively characterize C57BL/6JY^Pos (HC), (LC), (T) and (F) females. The testes determining loci of inactive Y^Pos-chromosomes in C57BL/6JY^Pos HC females are the preferentially deleted/rearranged Y^Pos-sequences. Disruption of regulation of plasma testosterone and hepatic MUP-A mRNA levels, TRD of a 4.7 Kbp EcoR1 fragment suggest disruption of autosomal/X-chromosomal sequences. These data and the highly repeated progenitor (Alu, GATA, LINE-1

  13. Complete nucleotide sequences of the domestic cat (Felis catus) mitochondrial genome and a transposed mtDNA tandem repeat (Numt) in the nuclear genome

    Energy Technology Data Exchange (ETDEWEB)

    Lopez, J.V.; Cevario, S.; O'Brien, S.J. [National Cancer Institute, Frederick, MD (United States)]

    1996-04-15

    The complete 17,009-bp mitochondrial genome of the domestic cat, Felis catus, has been sequenced and conforms largely to the typical organization of previously characterized mammalian mtDNAs. Codon usage and base composition also followed canonical vertebrate patterns, except for an unusual ATC (non-AUG) codon initiating the NADH dehydrogenase subunit 2 (ND2) gene. Two distinct repetitive motifs at opposite ends of the control region contribute to the relatively large size (1559 bp) of this carnivore mtDNA. Alignment of the feline mtDNA genome to a homologous 7946-bp nuclear mtDNA tandem repeat DNA sequence in the cat, Numt, indicates simple repeat motifs associated with insertion/deletion mutations. Overall DNA sequence divergence between Numt and cytoplasmic mtDNA sequence was only 5.1%. Substitutions predominate at the third codon position of homologous feline protein genes. Phylogenetic analysis of mitochondrial gene sequences confirms the recent transfer of the cytoplasmic mtDNA sequences to the domestic cat nucleus and recapitulates evolutionary relationships between mammal species. 86 refs., 4 figs., 3 tabs.

  14. Discovery and mapping of a new expressed sequence tag-single nucleotide polymorphism and simple sequence repeat panel for large-scale genetic studies and breeding of Theobroma cacao L.

    Science.gov (United States)

    Allegre, Mathilde; Argout, Xavier; Boccara, Michel; Fouet, Olivier; Roguet, Yolande; Bérard, Aurélie; Thévenin, Jean Marc; Chauveau, Aurélie; Rivallan, Ronan; Clement, Didier; Courtois, Brigitte; Gramacho, Karina; Boland-Augé, Anne; Tahi, Mathias; Umaharan, Pathmanathan; Brunel, Dominique; Lanaud, Claire

    2012-01-01

    Theobroma cacao is an economically important tree of several tropical countries. Its genetic improvement is essential to provide protection against major diseases and improve chocolate quality. We discovered and mapped new expressed sequence tag-single nucleotide polymorphism (EST-SNP) and simple sequence repeat (SSR) markers and constructed a high-density genetic map. By screening 149 650 ESTs, 5246 SNPs were detected in silico, of which 1536 corresponded to genes with a putative function, while 851 had a clear polymorphic pattern across a collection of genetic resources. In addition, 409 new SSR markers were detected on the Criollo genome. Lastly, 681 new EST-SNPs and 163 new SSRs were added to the pre-existing 418 co-dominant markers to construct a large consensus genetic map. This high-density map and the set of new genetic markers identified in this study are a milestone in cocoa genomics and for marker-assisted breeding. The data are available at http://tropgenedb.cirad.fr.

  15. The soybean-Phytophthora resistance locus Rps1-k encompasses coiled coil-nucleotide binding-leucine rich repeat-like genes and repetitive sequences

    Directory of Open Access Journals (Sweden)

    Bhattacharyya Madan K

    2008-03-01

    Background: A series of Rps (resistance to Phytophthora sojae) genes have been protecting soybean from the root and stem rot disease caused by the oomycete pathogen Phytophthora sojae. Five Rps genes were mapped to the Rps1 locus located near the 28 cM map position on molecular linkage group N of the composite genetic soybean map. Among these five genes, Rps1-k was introgressed from the cultivar Kingwa. Rps1-k has been providing stable and broad-spectrum Phytophthora resistance in the major soybean-producing regions of the United States. Rps1-k has been mapped and isolated. More than one functional Rps1-k gene was identified from the Rps1-k locus. The clustering feature at the Rps1-k locus might have facilitated the expansion of Rps1-k gene numbers and the generation of new recognition specificities. The Rps1-k region was sequenced to understand the possible evolutionary steps that shaped the generation of Phytophthora resistance genes in soybean. Results: Here the analyses of sequences of three overlapping BAC clones containing the 184,111 bp Rps1-k region are reported. A shotgun sequencing strategy was applied in sequencing the BAC contig. Sequence analysis predicted a few full-length genes including two Rps1-k genes, Rps1-k-1 and Rps1-k-2. The previously reported Rps1-k-3 from this genomic region evolved through intramolecular recombination between Rps1-k-1 and Rps1-k-2 in Escherichia coli. The majority of the predicted genes are truncated and therefore most likely nonfunctional. A member of a highly abundant retroelement, SIRE1, was identified from the Rps1-k region. The Rps1-k region is primarily composed of repetitive sequences. Sixteen simple repeat and 63 tandem repeat sequences were identified from the locus. Conclusion: These data indicate that the Rps1 locus is located in a gene-poor region. The abundance of repetitive sequences in the Rps1-k region suggested that the location of this locus is in or near a

  16. Genetic diversity in domesticated soybean (Glycine max) and its wild progenitor (Glycine soja) for simple sequence repeat and single-nucleotide polymorphism loci

    National Research Council Canada - National Science Library

    Ying-Hui Li; Wei Li; Chen Zhang; Liang Yang; Ru-Zhen Chang; Brandon S. Gaut; Li-Juan Qiu

    2010-01-01

    .... In this study, we genotyped a sample of 303 accessions of domesticated soybean (Glycine max) and its wild progenitor Glycine soja with 99 microsatellite markers and 554 single-nucleotide polymorphism (SNP) markers...

  17. Systematic exchanges between nucleotides: Genomic swinger repeats and swinger transcription in human mitochondria.

    Science.gov (United States)

    Seligmann, Hervé

    2015-11-07

    Chargaff's second parity rule, quasi-equal single strand frequencies for complementary nucleotides, presumably results from insertion of repeats and inverted repeats during sequence genesis. Vertebrate mitogenomes escape this rule because repeats are counterselected: their hybridization produces loop bulges whose deletion is deleterious. Some DNA/RNA sequences match mitogenomes only after assuming one among 23 systematic nucleotide exchanges (swinger DNA/RNA: nine symmetric, e.g. A ↔ C; and 14 asymmetric, e.g. A → C → G → A). Swinger-transformed repeats do not hybridize, escaping selection against deletions due to bulge formation. Blast analyses of the human mitogenome detect swinger repeats for all 23 swinger types, more than in randomized sequences with identical length and nucleotide contents. Mean genomic swinger repeat lengths increase with observed human swinger RNA frequencies: swinger repeat and swinger RNA productions appear linked, perhaps by swinger RNA retrotranscription. Mean swinger repeat lengths are proportional to reading frame retrievability, post-swinger transformation, by the natural circular code. Genomic swinger repeats confirm at genomic level, independently of swinger RNA detection, occurrence of swinger polymerizations. They suggest that repeats, and swinger repeats in particular, contribute to genome genesis.
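
    To make the opening claim of this record concrete, the sketch below measures how far a single strand departs from Chargaff's second parity rule (A ≈ T and G ≈ C on one strand) and shows that appending an inverted repeat, i.e. the reverse complement, removes the departure. The skew definitions are the commonly used (A-T)/(A+T) and (G-C)/(G+C) ratios; the function names and the toy sequence are assumptions for illustration, not part of the cited analysis.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an A/C/G/T string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def strand_skews(seq):
    """Return (AT skew, GC skew) for one strand; both are 0 under perfect parity."""
    counts = Counter(seq.upper())
    a, c, g, t = (counts[b] for b in "ACGT")
    at_skew = (a - t) / (a + t) if a + t else 0.0
    gc_skew = (g - c) / (g + c) if g + c else 0.0
    return at_skew, gc_skew


unit = "AAAGGGAC"                                # invented, strongly skewed fragment
with_inverted = unit + reverse_complement(unit)  # fragment plus its inverted repeat

skew_before = strand_skews(unit)
skew_after = strand_skews(with_inverted)
```

    Every A in the fragment is matched by a T in its inverted copy (and every G by a C), so the combined strand satisfies the rule exactly.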

  18. Moss Phylogeny Reconstruction Using Nucleotide Pangenome of Complete Mitogenome Sequences.

    Science.gov (United States)

    Goryunov, D V; Nagaev, B E; Nikolaev, M Yu; Alexeevski, A V; Troitsky, A V

    2015-11-01

    Stability of composition and sequence of genes was shown earlier in 13 mitochondrial genomes of mosses (Rensing, S. A., et al. (2008) Science, 319, 64-69). It is of interest to study the evolution of mitochondrial genomes not only at the gene level, but also on the level of nucleotide sequences. To do this, we have constructed a "nucleotide pangenome" for mitochondrial genomes of 24 moss species. The nucleotide pangenome is a set of aligned nucleotide sequences of orthologous genome fragments covering the totality of all genomes. The nucleotide pangenome was constructed using specially developed new software, NPG-explorer (NPGe). The stable part of the mitochondrial genome (232 stable blocks) is shown to be, on average, 45% of its length. In the joint alignment of stable blocks, 82% of positions are conserved. The phylogenetic tree constructed with the NPGe program is in good correlation with other phylogenetic reconstructions. With the NPGe program, 30 blocks have been identified with repeats no shorter than 50 bp. The maximal length of a block with repeats is 140 bp. Duplications in the mitochondrial genomes of mosses are rare. On average, the genome contains about 500 bp in large duplications. The total length of insertions and deletions was determined in each genome. The losses and gains of DNA regions are rather active in mitochondrial genomes of mosses, and such rearrangements presumably can be used as additional markers in the reconstruction of phylogeny.

  19. CHARACTERIZATION AND NUCLEOTIDE SEQUENCE DETERMINATION OF A REPEAT ELEMENT ISOLATED FROM A 2,4,5,-T DEGRADING STRAIN OF PSEUDOMONAS CEPACIA

    Science.gov (United States)

    Pseudomonas cepacia strain AC1100, capable of growth on 2,4,5-trichlorophenoxyacetic acid (2,4,5-T), was mutated to the 2,4,5-T− strain PT88 by a ColE1 :: Tn5 chromosomal insertion. Using cloned DNA from the region flanking the insertion, a 1477-bp sequence (designated RS1100) wa...

  20. The International Nucleotide Sequence Database Collaboration

    Science.gov (United States)

    Cochrane, Guy; Karsch-Mizrachi, Ilene; Takagi, Toshihisa; International Nucleotide Sequence Database Collaboration

    2016-01-01

    The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) comprises three global partners committed to capturing, preserving and providing comprehensive public-domain nucleotide sequence information. The INSDC establishes standards, formats and protocols for data and metadata to make it easier for individuals and organisations to submit their nucleotide data reliably to public archives. This work enables the continuous, global exchange of information about living things. Here we present an update of the INSDC in 2015, including data growth and diversification, new standards and requirements by publishers for authors to submit their data to the public archives. The INSDC serves as a model for data sharing in the life sciences. PMID:26657633

  1. Estimation of evolutionary distances between nucleotide sequences.

    Science.gov (United States)

    Zharkikh, A

    1994-09-01

    A formal mathematical analysis of the substitution process in nucleotide sequence evolution was done in terms of the Markov process. By using matrix algebra theory, the theoretical foundation of the methods of Barry and Hartigan (Stat. Sci. 2:191-210, 1987) and Lanave et al. (J. Mol. Evol. 20:86-93, 1984) was provided. Extensive computer simulation was used to compare the accuracy and effectiveness of various methods for estimating the evolutionary distance between two nucleotide sequences. It was shown that the multiparameter methods of Lanave et al. (J. Mol. Evol. 20:86-93, 1984), Gojobori et al. (J. Mol. Evol. 18:414-422, 1982), and Barry and Hartigan (Stat. Sci. 2:191-210, 1987) are preferable to others for the purpose of phylogenetic analysis when the sequences are long. However, when sequences are short and the evolutionary distance is large, the method of Tajima and Nei (Mol. Biol. Evol. 1:269-285, 1984) is superior to others.
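
    For readers unfamiliar with distance estimation, the sketch below shows the simplest Markov-model correction, the one-parameter Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. It is deliberately not one of the multiparameter methods named in this record and is swapped in here only because it is short; the function names and toy sequences are assumptions for illustration.

```python
import math


def p_distance(seq1, seq2):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor distance; undefined (returned as inf) once p reaches 0.75."""
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Invented 10-site alignment with two differences.
p = p_distance("ACGTACGTAC", "ACGTTCGTAA")
d = jukes_cantor(p)
```

    The corrected distance always exceeds the raw proportion because it accounts for multiple substitutions at the same site. With short, highly diverged sequences p can reach 0.75 by chance, at which point this estimator breaks down, which illustrates why method choice matters in exactly the regime the abstract singles out.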

  2. Simple sequence repeats in mycobacterial genomes

    Indian Academy of Sciences (India)

    Vattipally B Sreenu; Pankaj Kumar; Javaregowda Nagaraju; Hampapathalu A Nagarajaram

    2007-01-01

    Simple sequence repeats (SSRs) or microsatellites are the repetitive nucleotide sequences of motifs of length 1–6 bp. They are scattered throughout the genomes of all the known organisms ranging from viruses to eukaryotes. Microsatellites undergo mutations in the form of insertions and deletions (INDELS) of their repeat units with some bias towards insertions that lead to microsatellite tract expansion. Although prokaryotic genomes derive some plasticity due to microsatellite mutations they have in-built mechanisms to arrest undue expansions of microsatellites and one such mechanism is constituted by post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these enzymes and as a null hypothesis one could expect these genomes to harbour many long tracts. It is therefore interesting to analyse the mycobacterial genomes for distribution and abundance of microsatellites tracts and to look for potentially polymorphic microsatellites. Available mycobacterial genomes, Mycobacterium avium, M. leprae, M. bovis and the two strains of M. tuberculosis (CDC1551 and H37Rv) were analysed for frequencies and abundance of SSRs. Our analysis revealed that the SSRs are distributed throughout the mycobacterial genomes at an average of 220–230 SSR tracts per kb. All the mycobacterial genomes contain few regions that are conspicuously denser or poorer in microsatellites compared to their expected genome averages. The genomes distinctly show scarcity of long microsatellites despite the absence of a post-replicative DNA repair system. Such severe scarcity of long microsatellites could arise as a result of strong selection pressures operating against long and unstable sequences although influence of GC-content and role of point mutations in arresting microsatellite expansions can not be ruled out. Nonetheless, the long tracts occasionally found in coding as well as non-coding regions may account for limited genome plasticity in these genomes.
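
    The definition above (tandem repeats of a 1–6 bp motif) can be illustrated with a short Python sketch built on a regular expression with a backreference. This is not the pipeline used in the study; the function name, the default threshold of three repeats and the example sequence are assumptions made for illustration only.

```python
import re

def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest motif first,
    so a run such as AAAAAA is reported as motif A, not AA or AAA.
    """
    pattern = re.compile(
        r"(([ACGT]{1,%d}?)\2{%d,})" % (max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(2)
        hits.append((m.start(), motif, len(m.group(1)) // len(motif)))
    return hits

if __name__ == "__main__":
    example = "TTG" + "AC" * 4 + "GGT" + "CAG" * 4 + "TT"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

    Real SSR surveys usually apply a different minimum repeat count for each motif length and may tolerate imperfect repeats; this sketch reports perfect repeats only.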

  3. PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes

    OpenAIRE

    Kumar, Pankaj; Chaitanya, Pasumarthy S.; Nagarajaram, Hampapathalu A

    2010-01-01

    PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are the tandem repeats of nucleotide motifs of the sizes 1–6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in s...

  4. The nucleotide sequences of two leghemoglobin genes from soybean

    DEFF Research Database (Denmark)

    Wiborg, O; Hyldig-Nielsen, J J; Jensen, E O

    1982-01-01

    We present the complete nucleotide sequences of two leghemoglobin genes isolated from soybean DNA. Both genes contain three intervening sequences in identical positions. Comparison of the coding sequences with known amino-acid sequences of soybean leghemoglobins suggests that the two genes...

  5. Multineuronal Spike Sequences Repeat with Millisecond Precision

    Directory of Open Access Journals (Sweden)

    Koki Matsumoto

    2013-06-01

    Full Text Available Cortical microcircuits are nonrandomly wired by neurons. As a natural consequence, spikes emitted by microcircuits are also nonrandomly patterned in time and space. One of the prominent spike organizations is a repetition of fixed patterns of spike series across multiple neurons. However, several questions remain unsolved, including how precisely spike sequences repeat, how the sequences are spatially organized, how many neurons participate in sequences, and how different sequences are functionally linked. To address these questions, we monitored spontaneous spikes of hippocampal CA3 neurons ex vivo using a high-speed functional multineuron calcium imaging technique that allowed us to monitor spikes with millisecond resolution and to record the location of spiking and nonspiking neurons. Multineuronal spike sequences were overrepresented in spontaneous activity compared to the statistical chance level. Approximately 75% of neurons participated in at least one sequence during our observation period. The participants were sparsely dispersed and did not show specific spatial organization. The number of sequences relative to the chance level decreased when larger time frames were used to detect sequences. Thus, sequences were precise at the millisecond level. Sequences often shared common spikes with other sequences; parts of sequences were subsequently relayed by following sequences, generating complex chains of multiple sequences.

  6. Nucleotide sequence of papaya mosaic virus RNA.

    Science.gov (United States)

    Sit, T L; Abouhaidar, M G; Holy, S

    1989-09-01

    The RNA genome of papaya mosaic virus is 6656 nucleotides long [excluding the poly(A) tail] with six open reading frames (ORFs) more than 200 nucleotides long. The four nearest the 5' end each overlap with adjacent ORFs and could code for proteins with Mr 176307, 26248, 11949 and 7224 (ORFs 1 to 4). The fifth ORF produces the capsid protein of Mr 23043 and the sixth ORF, located completely within ORF1, could code for a protein with Mr 14113. The translation products of ORFs 1 to 3 show strong similarity with those of other potexviruses but the ORF 4 protein has only limited similarity with the other potexvirus ORF 4 proteins of 7K to 11K.
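
    As an illustration of how open reading frames longer than a length threshold can be located in a genome sequence, the following Python sketch scans the three forward reading frames for ATG-to-stop stretches. It is a simplified assumption-laden example, not the method used by the authors: the function name is hypothetical, only the forward strand is scanned, and each ORF is taken to start at the first ATG after the previous stop in its frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_len=200):
    """Return (start, end, frame) for forward-strand ORFs longer than min_len nt.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > min_len:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 70 + "TAA" + "G"
    print(find_orfs(toy))
```

    Because each frame is scanned independently, ORFs that overlap in different frames, as described for this genome, are reported separately.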

  7. Characterization of simple sequence repeats (SSRs) from Phlebotomus papatasi (Diptera: Psychodidae) expressed sequence tags (ESTs)

    Directory of Open Access Journals (Sweden)

    Hamarsheh Omar

    2011-09-01

    Full Text Available Abstract Background Phlebotomus papatasi is a natural vector of Leishmania major, which causes cutaneous leishmaniasis in many countries. Simple sequence repeats (SSRs), or microsatellites, are common in eukaryotic genomes and are short, repeated nucleotide sequence elements arrayed in tandem and flanked by non-repetitive regions. The enrichment methods used previously for finding new microsatellite loci in sand flies remain laborious and time consuming; in silico mining, which includes retrieval and screening of microsatellites from large amounts of sequence data in sequence databases using microsatellite search tools, can yield many new candidate markers. Results Simple sequence repeats (SSRs) were characterized in P. papatasi expressed sequence tags (ESTs) derived from a public database, the National Center for Biotechnology Information (NCBI). A total of 42,784 sequences were mined, and 1,499 SSRs were identified with a frequency of 3.5% and an average density of 15.55 kb per SSR. Dinucleotide motifs were the most common SSRs, accounting for 67%, followed by tri-, tetra-, and penta-nucleotide repeats, accounting for 31.1%, 1.5%, and 0.1%, respectively. The length of microsatellites varied from 5 to 16 repeats. The dinucleotide types AG and CT have the highest frequency. Dinucleotide SSR-ESTs are relatively biased toward an excess of (AX)n repeats and a low GC base content. Forty primer pairs were designed based on motif lengths for further experimental validation. Conclusion The first large-scale survey of SSRs derived from P. papatasi is presented; dinucleotide SSRs identified are more frequent than other types. EST data mining is an effective strategy to identify functional microsatellites in P. papatasi.
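
    The summary figures in this abstract can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the numbers reported above; it assumes that the 3.5% frequency means SSRs per mined sequence, and the implied total sequence length is a derived quantity, not a value stated in the abstract.

```python
n_sequences = 42_784   # EST sequences mined (from the abstract)
n_ssrs = 1_499         # SSRs identified (from the abstract)
kb_per_ssr = 15.55     # reported average density, kb per SSR

# Frequency of SSRs relative to the number of sequences mined
frequency_pct = 100 * n_ssrs / n_sequences

# Total sequence length implied by the reported density (derived, not reported)
implied_total_kb = n_ssrs * kb_per_ssr

# Reported motif-class shares, in percent
motif_shares = {"di": 67.0, "tri": 31.1, "tetra": 1.5, "penta": 0.1}
unassigned_pct = 100 - sum(motif_shares.values())

if __name__ == "__main__":
    print(f"frequency: {frequency_pct:.2f}%")
    print(f"implied total length: {implied_total_kb:.2f} kb")
    print(f"share not covered by the four classes: {unassigned_pct:.1f}%")
```

    The reported frequency is consistent with 1,499 SSRs in 42,784 sequences, and the four motif classes together account for slightly less than 100%, which may reflect rounding or other motif classes not listed.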

  8. Reading biological processes from nucleotide sequences

    Science.gov (United States)

    Murugan, Anand

    Cellular processes have traditionally been investigated by techniques of imaging and biochemical analysis of the molecules involved. The recent rapid progress in our ability to manipulate and read nucleic acid sequences gives us direct access to the genetic information that directs and constrains biological processes. While sequence data is being used widely to investigate genotype-phenotype relationships and population structure, here we use sequencing to understand biophysical mechanisms. We present work on two different systems. First, in chapter 2, we characterize the stochastic genetic editing mechanism that produces diverse T-cell receptors in the human immune system. We do this by inferring statistical distributions of the underlying biochemical events that generate T-cell receptor coding sequences from the statistics of the observed sequences. This inferred model quantitatively describes the potential repertoire of T-cell receptors that can be produced by an individual, providing insight into its potential diversity and the probability of generation of any specific T-cell receptor. Then in chapter 3, we present work on understanding the functioning of regulatory DNA sequences in both prokaryotes and eukaryotes. Here we use experiments that measure the transcriptional activity of large libraries of mutagenized promoters and enhancers and infer models of the sequence-function relationship from this data. For the bacterial promoter, we infer a physically motivated 'thermodynamic' model of the interaction of DNA-binding proteins and RNA polymerase determining the transcription rate of the downstream gene. For the eukaryotic enhancers, we infer heuristic models of the sequence-function relationship and use these models to find synthetic enhancer sequences that optimize inducibility of expression. 
    Both projects demonstrate the utility of sequence information in conjunction with sophisticated statistical inference techniques for dissecting underlying biophysical

  9. [Tabular excel editor for analysis of aligned nucleotide sequences].

    Science.gov (United States)

    Demkin, V V

    2010-01-01

    The Excel platform was used to convert multiple aligned nucleotide sequences obtained from the BLAST network service into a form suitable for visual analysis and editing. Two macros for MS Excel 2007 were written. Once the array of aligned sequences has been transformed into an Excel table and processed with the macros, it is easier to analyse than the initial HTML data.

  10. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Science.gov (United States)

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid...

  11. Applications of High Throughput Nucleotide Sequencing

    DEFF Research Database (Denmark)

    Waage, Johannes Eichler

    The recent advent of high throughput sequencing of nucleic acids (RNA and DNA) has vastly expanded research into the functional and structural biology of the genome of all living organisms (and even a few dead ones). With this enormous and exponential growth in biological data generation come......-sequencing, a study of the effects on alternative RNA splicing of KO of the nonsense mediated RNA decay system in Mus, using digital gene expression and a custom-built exon-exon junction mapping pipeline is presented (article I). Evolved from this work, a Bioconductor package, spliceR, for classifying alternative...... splicing events and coding potential of isoforms from full isoform deconvolution software, such as Cufflinks (article II), is presented. Finally, a study using 5’-end RNA-seq for alternative promoter detection between healthy patients and patients with acute promyelocytic leukemia is presented (article III...

  12. Nucleotide sequence composition and method for detection of Neisseria gonorrhoeae

    Energy Technology Data Exchange (ETDEWEB)

    Lo, A.; Yang, H.L.

    1990-02-13

    This patent describes a composition of matter that is specific for Neisseria gonorrhoeae. It comprises at least one nucleotide sequence for which the ratio of the amount of the sequence that hybridizes to chromosomal DNA of Neisseria gonorrhoeae to the amount that hybridizes to chromosomal DNA of Neisseria meningitidis is greater than about five, the ratio being obtained by a method described in the patent.

  13. Nucleotide Sequence of the Protective Antigen Gene of Bacillus Anthracis

    Science.gov (United States)

    1988-02-02

    Montie, S. Kadis, and S. I. Ajl (ed.), Microbial toxins, vol. 3. Academic Press, Inc., New York. 23. Little, S. F., and G. B. Knudson. 1986 ... Takkinen, and L. Kaariainen. 1981. Nucleotide sequence of the promoter and NH2-terminal signal peptide region of the α-amylase gene from Bacillus

  14. Nucleotide Sequence - KOME | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Full Text Available ..._db.zip File URL: ftp://ftp.biosciencedbc.jp/archive/kome/LATEST/kome_ine_full_se...quence_db.zip File size: 19 MB File name: FASTA: kome_ine_full_sequence_db.fasta.zip File URL: ftp://ftp.biosciencedbc.jp/archiv...

  15. Novel Y-chromosome Short Tandem Repeat Variants Detected Through the Use of Massively Parallel Sequencing

    Directory of Open Access Journals (Sweden)

    David H. Warshauer

    2015-08-01

    Full Text Available Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically-relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently-observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles.

  16. Novel Y-chromosome Short Tandem Repeat Variants Detected Through the Use of Massively Parallel Sequencing

    Institute of Scientific and Technical Information of China (English)

    David H Warshauer; Jennifer D Churchill; Nicole Novroski; Jonathan L King; Bruce Budowle

    2015-01-01

    Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically-relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently-observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles.
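
    The point that sequencing can separate STR alleles of identical length is easy to show with a toy example. The Python sketch below is purely illustrative: the two 40-nt alleles are made-up sequences (a uniform GATA repeat and the same tract with one intra-repeat substitution), not data from the study, and the function name is hypothetical.

```python
from collections import defaultdict

def alleles_by_length(sequences):
    """Group distinct allele sequences under their length in nucleotides."""
    groups = defaultdict(set)
    for seq in sequences:
        groups[len(seq)].add(seq)
    return dict(groups)

# Hypothetical alleles: same length, different internal sequence
reference = "GATA" * 10
variant = "GATA" * 4 + "GACA" + "GATA" * 5

groups = alleles_by_length([reference, variant, reference])
length_based_alleles = len(groups)                               # what sizing alone distinguishes
sequence_based_alleles = sum(len(s) for s in groups.values())    # what sequencing distinguishes

if __name__ == "__main__":
    print(length_based_alleles, sequence_based_alleles)
```

    Sizing alone sees a single 40-nt allele here, whereas comparing the sequences resolves two.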

  17. Large cryptic internal sequence repeats in protein structures from Homo sapiens

    Indian Academy of Sciences (India)

    R Sarani; N A Udayaprakash; R Subashini; P Mridula; T Yamane; K Sekar

    2009-03-01

    Amino acid sequences are known to constantly mutate and diverge unless there is a limiting condition that makes such a change deleterious. However, closer examination of the sequence and structure reveals that a few large, cryptic repeats are nevertheless sequentially conserved. This leads to the question of why only certain repeats are conserved at the sequence level. It would be interesting to find out if these sequences maintain their conservation at the three-dimensional structure level. They can play an active role in protein and nucleotide stability, thus not only ensuring proper functioning but also potentiating malfunction and disease. Therefore, insights into any aspect of the repeats – be it structure, function or evolution – would prove to be of some importance. This study aims to address the relationship between protein sequence and its three-dimensional structure, by examining if large cryptic sequence repeats have the same structure.

  18. Exact Tandem Repeats Analyzer (E-TRA): A new program for DNA sequence mining

    Indian Academy of Sciences (India)

    Mehmet Karaca; Mehmet Bilgen; A. Naci Onus; Ayse Gul Ince; Safinaz Y. Elmasulu

    2005-04-01

    Exact Tandem Repeats Analyzer 1.0 (E-TRA) combines sequence motif searches with keywords such as ‘organs’, ‘tissues’, ‘cell lines’ and ‘development stages’ for finding simple exact tandem repeats as well as non-simple repeats. E-TRA has several advanced repeat search parameters/options compared to other repeat finder programs as it not only accepts GenBank, FASTA and expressed sequence tags (EST) sequence files, but also does analysis of multiple files with multiple sequences. The minimum and maximum tandem repeat motif lengths that E-TRA finds vary from one to one thousand. Advanced user defined parameters/options let the researchers use different minimum motif repeats search criteria for varying motif lengths simultaneously. One of the most interesting features of genomes is the presence of relatively short tandem repeats (TRs). These repeated DNA sequences are found in both prokaryotes and eukaryotes, distributed almost at random throughout the genome. Some of the tandem repeats play important roles in the regulation of gene expression whereas others do not have any known biological function as yet. Nevertheless, they have proven to be very beneficial in DNA profiling and genetic linkage analysis studies. To demonstrate the use of E-TRA, we used 5,465,605 human EST sequences derived from 18,814,550 GenBank EST sequences. Our results indicated that 12.44% (679,800) of the human EST sequences contained simple and non-simple repeat string patterns varying from one to 126 nucleotides in length. The results also revealed that human organs, tissues, cell lines and different developmental stages differed in number of repeats as well as repeat composition, indicating that the distribution of expressed tandem repeats among tissues or organs are not random, thus differing from the un-transcribed repeats found in genomes.

  19. The nucleotide sequence and genome organization of Plasmopara halstedii virus

    Directory of Open Access Journals (Sweden)

    Göpfert Jens C

    2011-03-01

    Full Text Available Abstract Background Only very few viruses of Oomycetes have been studied in detail. Isometric virions were found in different isolates of the oomycete Plasmopara halstedii, the downy mildew pathogen of sunflower. However, complete nucleotide sequences and data on the genome organization were lacking. Methods Viral RNA of different P. halstedii isolates was subjected to nucleotide sequencing and analysis of the viral genome. The N-terminal sequence of the viral coat protein was determined using Top-Down MALDI-TOF analysis. Results The complete nucleotide sequences of both single-stranded RNA segments (RNA1 and RNA2) were established. RNA1 consisted of 2793 nucleotides (nt) exclusive of its 3' poly(A) tract and a single open-reading frame (ORF1) of 2745 nt. ORF1 was framed by a 5' untranslated region (5' UTR) of 18 nt and a 3' untranslated region (3' UTR) of 30 nt. ORF1 contained motifs of RNA-dependent RNA polymerases (RdRp) and showed similarities to RdRp of Sclerophthora macrospora virus A (SmV A) and viruses within the Nodaviridae family. RNA2 consisted of 1526 nt exclusive of its 3' poly(A) tract and a second ORF (ORF2) of 1128 nt. ORF2 coded for the single viral coat protein (CP) and was framed by a 5' UTR of 164 nt and a 3' UTR of 234 nt. The deduced amino acid sequence of ORF2 was verified by nano-LC-ESI-MS/MS experiments. Top-Down MALDI-TOF analysis revealed the N-terminal sequence of the CP. The N-terminal sequence represented a region within ORF2 suggesting a proteolytic processing of the CP in vivo. The CP showed similarities to CP of SmV A and viruses within the Tombusviridae family. Fragments of RNA1 (ca. 1.9 kb) and RNA2 (ca. 1.4 kb) were used to analyze the nucleotide sequence variation of virions in different P. halstedii isolates. Viral sequence variation was 0.3% or less regardless of their host's pathotypes, the geographical origin and the sensitivity towards the fungicide metalaxyl. Conclusions The results showed the presence of a single and new

  20. The nucleotide sequence and genome organization of Plasmopara halstedii virus

    Science.gov (United States)

    2011-01-01

    Background Only very few viruses of Oomycetes have been studied in detail. Isometric virions were found in different isolates of the oomycete Plasmopara halstedii, the downy mildew pathogen of sunflower. However, complete nucleotide sequences and data on the genome organization were lacking. Methods Viral RNA of different P. halstedii isolates was subjected to nucleotide sequencing and analysis of the viral genome. The N-terminal sequence of the viral coat protein was determined using Top-Down MALDI-TOF analysis. Results The complete nucleotide sequences of both single-stranded RNA segments (RNA1 and RNA2) were established. RNA1 consisted of 2793 nucleotides (nt) exclusive of its 3' poly(A) tract and a single open-reading frame (ORF1) of 2745 nt. ORF1 was framed by a 5' untranslated region (5' UTR) of 18 nt and a 3' untranslated region (3' UTR) of 30 nt. ORF1 contained motifs of RNA-dependent RNA polymerases (RdRp) and showed similarities to RdRp of Sclerophthora macrospora virus A (SmV A) and viruses within the Nodaviridae family. RNA2 consisted of 1526 nt exclusive of its 3' poly(A) tract and a second ORF (ORF2) of 1128 nt. ORF2 coded for the single viral coat protein (CP) and was framed by a 5' UTR of 164 nt and a 3' UTR of 234 nt. The deduced amino acid sequence of ORF2 was verified by nano-LC-ESI-MS/MS experiments. Top-Down MALDI-TOF analysis revealed the N-terminal sequence of the CP. The N-terminal sequence represented a region within ORF2 suggesting a proteolytic processing of the CP in vivo. The CP showed similarities to CP of SmV A and viruses within the Tombusviridae family. Fragments of RNA1 (ca. 1.9 kb) and RNA2 (ca. 1.4 kb) were used to analyze the nucleotide sequence variation of virions in different P. halstedii isolates. Viral sequence variation was 0.3% or less regardless of their host's pathotypes, the geographical origin and the sensitivity towards the fungicide metalaxyl. Conclusions The results showed the presence of a single and new virus type in
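
    The segment lengths reported in this abstract are internally consistent, which a short Python check makes explicit. The sketch uses only the numbers given above; the dictionary layout and key names are illustrative, and the codon counts simply divide the reported ORF lengths by three without assuming whether a stop codon is included.

```python
# Lengths in nucleotides, as reported in the abstract
segments = {
    "RNA1": {"total": 2793, "utr5": 18, "orf": 2745, "utr3": 30},
    "RNA2": {"total": 1526, "utr5": 164, "orf": 1128, "utr3": 234},
}

def parts_sum_to_total(seg):
    """True when 5' UTR + ORF + 3' UTR equals the reported segment length."""
    return seg["utr5"] + seg["orf"] + seg["utr3"] == seg["total"]

def codon_count(seg):
    """Number of codons in the ORF; raises if the length is not a multiple of 3."""
    codons, remainder = divmod(seg["orf"], 3)
    if remainder:
        raise ValueError("ORF length is not a multiple of three")
    return codons

if __name__ == "__main__":
    for name, seg in segments.items():
        print(name, parts_sum_to_total(seg), codon_count(seg))
```

    For both segments the untranslated regions and the ORF add up to the stated total, and both ORF lengths are whole multiples of three.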

  1. Nucleotide Sequencing and Identification of Some Wild Mushrooms

    Directory of Open Access Journals (Sweden)

    Sudip Kumar Das

    2013-01-01

    Full Text Available The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of mushrooms as mentioned. The sequences were aligned using ClustalW software program. The aligned sequences revealed identity (homology percentage from GenBank data base) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although out of 8 mushrooms 4 could be identified up to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree is constructed using Neighbor-Joining method showing interrelationship between/among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching GenBank database aiding to molecular taxonomy and facilitating its domestication and characterization for human benefits.

  2. Nucleotide sequencing and identification of some wild mushrooms.

    Science.gov (United States)

    Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari

    2013-01-01

    The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of mushrooms as mentioned. The sequences were aligned using ClustalW software program. The aligned sequences revealed identity (homology percentage from GenBank data base) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although out of 8 mushrooms 4 could be identified up to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree is constructed using Neighbor-Joining method showing interrelationship between/among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching GenBank database aiding to molecular taxonomy and facilitating its domestication and characterization for human benefits.

  3. The Cipher Code of Simple Sequence Repeats in "Vampire Pathogens".

    Science.gov (United States)

    Zou, Geng; Bello-Orti, Bernardo; Aragon, Virginia; Tucker, Alexander W; Luo, Rui; Ren, Pinxing; Bi, Dingren; Zhou, Rui; Jin, Hui

    2015-07-28

    Blood inside mammals is a forbidden area for the majority of prokaryotic microbes; however, red blood cells tropism microbes, like "vampire pathogens" (VP), succeed in matching scarce nutrients and surviving strong immunity reactions. Here, we found VP of Mycoplasma, Rhizobiales, and Rickettsiales showed significantly higher counts of (AG)n dimeric simple sequence repeats (Di-SSRs) in the genomes, coding and non-coding regions than non Vampire Pathogens (N_VP). Regression analysis indicated a significant correlation between GC content and the span of (AG)n-Di-SSR variation. Gene Ontology (GO) terms with abundance of (AG)3-Di-SSRs shared by the VP strains were associated with purine nucleotide metabolism (FDR < 0.01), indicating an adaptation to the limited availability of purine and nucleotide precursors in blood. Di-amino acids coded by (AG)n-Di-SSRs included all three six-fold code amino acids (Arg, Leu and Ser) and significantly higher counts of Di-amino acids coded by (AG)3, (GA)3, and (TC)3 in VP than N_VP. Furthermore, significant differences (P < 0.001) on the numbers of triplexes formed from (AG)n-Di-SSRs between VP and N_VP in Mycoplasma suggested the potential role of (AG)n-Di-SSRs in gene regulation.

  4. The complete nucleotide sequence of pelargonium leaf curl virus.

    Science.gov (United States)

    McGavin, Wendy J; MacFarlane, Stuart A

    2016-05-01

    Investigation of a tombusvirus isolated from tulip plants in Scotland revealed that it was pelargonium leaf curl virus (PLCV) rather than the originally suggested tomato bushy stunt virus. The complete sequence of the PLCV genome was determined for the first time, revealing it to be 4789 nucleotides in size and to have an organization similar to that of the other, previously described tombusviruses. Primers derived from the sequence were used to construct a full-length infectious clone of PLCV that recapitulates the disease symptoms of leaf curling in systemically infected pelargonium plants.

  5. Complete nucleotide sequence of primitive vertebrate immunoglobulin light chain genes.

    Science.gov (United States)

    Shamblott, M J; Litman, G W

    1989-06-01

    Antibody to Heterodontus francisci (horned shark) immunoglobulin light chain was used to screen a spleen cDNA expression library, and recombinant clones encoding light chain genes were isolated. The complete sequences of the mature coding regions of two light chain genes in this phylogenetically distant vertebrate have been determined and are reported here. Comparisons of the sequences are consistent with the presence of mammalian-like framework and complementarity-determining regions. The predicted amino acid sequences of the genes are more related to mammalian lambda than to kappa light chains. The nucleotide sequences of the genes are most related to mammalian T-cell antigen receptor beta chain. Heterodontus light chain genes may reflect characteristics of the common ancestor of immunoglobulin and T-cell antigen receptors before its evolutionary diversification.

  6. Always look on both sides: phylogenetic information conveyed by simple sequence repeat allele sequences.

    Directory of Open Access Journals (Sweden)

    Stéphanie Barthe

    Full Text Available Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily, mutations in the target sequences follow the stepwise mutation model (SMM). Generally speaking, PCR amplicon sizes are used as direct indicators of the number of SSR repeats composing an allele with the data analysis either ignoring the extent of allele size differences or assuming that there is a direct correlation between differences in amplicon size and evolutionary distance. However, without precisely knowing the kind and distribution of polymorphism within an allele (SSR and the associated flanking region (FR) sequences), it is hard to say what kind of evolutionary message is conveyed by such a synthetic descriptor of polymorphism as DNA amplicon size. In this study, we sequenced several SSR alleles in multiple populations of three divergent tree genera and disentangled the types of polymorphisms contained in each portion of the DNA amplicon containing an SSR. The patterns of diversity provided by amplicon size variation, SSR variation itself, insertions/deletions (indels), and single nucleotide polymorphisms (SNPs) observed in the FRs were compared. Amplicon size variation largely reflected SSR repeat number. The amount of variation was as large in FRs as in the SSR itself. The former contributed significantly to the phylogenetic information and sometimes was the main source of differentiation among individuals and populations contained by FR and SSR regions of SSR markers. The presence of mutations occurring at different rates within a marker's sequence offers the opportunity to analyse evolutionary events occurring on various timescales, but at the same time calls for caution in the interpretation of SSR marker data when the distribution of within

  7. REPdenovo: Inferring De Novo Repeat Motifs from Short Sequence Reads.

    Directory of Open Access Journals (Sweden)

    Chong Chu

    Full Text Available Repeat elements are important components of eukaryotic genomes. One limitation in our understanding of repeat elements is that most analyses rely on reference genomes that are incomplete and often contain missing data in highly repetitive regions that are difficult to assemble. To overcome this problem we develop a new method, REPdenovo, which assembles repeat sequences directly from raw shotgun sequencing data. REPdenovo can construct various types of repeats that are highly repetitive and have low sequence divergence within copies. We show that REPdenovo is substantially better than existing methods both in terms of the number and the completeness of the repeat sequences that it recovers. The key advantage of REPdenovo is that it can reconstruct long repeats from sequence reads. We apply the method to human data and discover a number of potentially new repeat sequences that have been missed by previous repeat annotations. Many of these sequences are incorporated into various parasite genomes, possibly because the filtering process for host DNA involved in the sequencing of the parasite genomes failed to exclude the host-derived repeat sequences. REPdenovo is a powerful new computational tool for annotating genomes and for addressing questions regarding the evolution of repeat families. The software tool, REPdenovo, is available for download at https://github.com/Reedwarbler/REPdenovo.

  8. Sequencing genes in silico using single nucleotide polymorphisms

    Directory of Open Access Journals (Sweden)

    Zhang Xinyi

    2012-01-01

    Full Text Available Abstract Background The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs) discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive. Results To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles) at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (ranges 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes. Conclusions Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate

  9. Bioinformatics comparison of sulfate-reducing metabolism nucleotide sequences

    Science.gov (United States)

    Tremberger, G.; Dehipawala, Sunil; Nguyen, A.; Cheung, E.; Sullivan, R.; Holden, T.; Lieberman, D.; Cheung, T.

    2015-09-01

    The sulfate-reducing bacteria can be traced back to 3.5 billion years ago. The thermodynamic details of the sulfur cycle have been well documented. A recent sulfate-reducing bacteria report (Robator, Jungbluth, et al., 2015 Jan, Front. Microbiol) with Genbank nucleotide data has been analyzed in terms of the sulfite reductase (dsrAB) via fractal dimension and entropy values. Comparison to oil field sulfate-reducing sequences was included. The AUCG translational mass fractal dimension versus ATCG transcriptional mass fractal dimension for the low temperature dsrB and dsrA sequences reported in Reference Thirteen shows correlation R-sq ~ 0.79, with a probability of about 3% in simulation. A recent report of using Cystathionine gamma-lyase sequence to produce CdS quantum dot in a biological method, where the sulfur is reduced just like in the H2S production process, was included for comparison. The AUCG mass fractal dimension versus ATCG mass fractal dimension for the Cystathionine gamma-lyase sequences was found to have R-sq of 0.72, similar to the low temperature dissimilatory sulfite reductase dsr group with 3% probability, in contrast to the oil field group having R-sq ~ 0.94, a highly probable outcome in the simulation. The other two simulation histograms, namely, fractal dimension versus entropy R-sq outcome values, and di-nucleotide entropy versus mono-nucleotide entropy R-sq outcome values are also discussed in the data analysis focusing on low probability outcomes.

  10. Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

    Science.gov (United States)

    Rehm, Charlotte; Wurmthaler, Lena A; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S

    2015-01-01

    In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1-5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6-9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4 forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However, a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria.
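
The repeat described in this record (tandem copies of the heptamer GGGNATC, with a preference for four units) can be located with a simple pattern scan. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `find_g_rich_repeats` and `has_four_g_tracts` are hypothetical, only the forward strand is scanned, and only perfect tandem copies are counted.

```python
import re

# Heptameric unit from the abstract: GGG, any base (N), then ATC.
UNIT = r"GGG[ACGT]ATC"
UNIT_LEN = 7


def find_g_rich_repeats(seq, min_units=2):
    """Return (start, unit_count) for each run of >= min_units tandem units.

    Hypothetical helper: forward strand only, perfect copies only.
    """
    pattern = re.compile(r"(?:%s){%d,}" % (UNIT, min_units))
    hits = []
    for match in pattern.finditer(seq.upper()):
        units = (match.end() - match.start()) // UNIT_LEN
        hits.append((match.start(), units))
    return hits


def has_four_g_tracts(unit_count):
    """Four consecutive G-tracts are the minimum the abstract cites for G4 formation."""
    return unit_count >= 4


if __name__ == "__main__":
    demo = "TT" + "GGGAATC" * 4 + "CC" + "GGGTATC" + "AA" + "GGGCATCGGGGATC"
    for start, units in find_g_rich_repeats(demo):
        print(start, units, has_four_g_tracts(units))
```

A genome-scale survey would also need the reverse complement (GATNCCC) and annotation of coding versus non-coding context, which this sketch leaves out.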

  11. Cytochrome b nucleotide sequence variation among the Atlantic Alcidae.

    Science.gov (United States)

    Friesen, V L; Montevecchi, W A; Davidson, W S

    1993-01-01

    Analysis of cytochrome b nucleotide sequences of the six extant species of Atlantic alcids and a gull revealed an excess of adenines and cytosines and a deficit of guanines at silent sites on the coding strand. Phylogenetic analyses grouped the sequences of the common (Uria aalge) and Brünnich's (U. lomvia) guillemots, followed by the razorbill (Alca torda) and little auk (Alle alle). The black guillemot (Cepphus grylle) sequence formed a sister taxon, and the puffin (Fratercula arctica) fell outside the other alcids. Phylogenetic comparisons of substitutions indicated that mutabilities of bases did not differ, but that C was much more likely to be incorporated than was G. Imbalances in base composition appear to result from a strand bias in replication errors, which may result from selection on secondary RNA structure and/or the energetics of codon-anticodon interactions.

  12. Human sapovirus classification based on complete capsid nucleotide sequences.

    Science.gov (United States)

    Oka, Tomoichiro; Mori, Kohji; Iritani, Nobuhiro; Harada, Seiya; Ueki, You; Iizuka, Setsuko; Mise, Keiji; Murakami, Kosuke; Wakita, Takaji; Katayama, Kazuhiko

    2012-02-01

    The genetically diverse sapoviruses (SaVs) are a significant cause of acute human gastroenteritis. Human SaV surveillance is becoming more critical, and a better understanding of the diversity and distribution of the viral genotypes is needed. In this study, we analyzed 106 complete human SaV capsid nucleotide sequences to provide a better understanding of their diversity. Based on those results, we propose a novel standardized classification scheme that meets the requirements of the International Calicivirus Scientific Committee. We believe the classification scheme and strains described here will be of value for the molecular characterization and classification of newly detected SaV genotypes and for comparing data worldwide.

  13. Complete nucleotide sequence and genomic organization of Periplaneta fuliginosa densonucleosis virus

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    We have cloned the replicative form of the Periplaneta fuliginosa densonucleosis virus (Pf DNV) genome and determined its complete sequence. The sequence has 5454 nucleotides (nt); the genome consists of an internal unique sequence flanked by inverted terminal repeats (201 nt). The first 122 nt at the 5' end and the terminal 122 nt at the 3' end of both plus and minus strands can fold into a typical hairpin structure. The genome contains seven major open reading frames (ORFs). The plus strand has 4 ORFs occupying the 5' half of the plus strand, whereas the others span the 5' half of the minus strand. Two potential promoters were found at map units (m.u.) 3 and 97. Computer analysis of sequence homologies with other parvoviruses suggests that the plus strand of Pf DNV very likely encodes the nonstructural proteins and the minus strand probably encodes the structural proteins.

  14. A Novel Signal Processing Measure to Identify Exact and Inexact Tandem Repeat Patterns in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Ravi Gupta

    2007-03-01

    Full Text Available The identification and analysis of repetitive patterns are active areas of biological and computational research. Tandem repeats in telomeres play a role in cancer and hypervariable trinucleotide tandem repeats are linked to over a dozen major neurodegenerative genetic disorders. In this paper, we present an algorithm to identify the exact and inexact repeat patterns in DNA sequences based on an orthogonal exactly periodic subspace decomposition technique. Using the new measure, our algorithm resolves problems such as whether the repeat pattern is of period P or its multiple (i.e., 2P, 3P, etc.), and several other problems that were present in previous signal-processing-based algorithms. We present an efficient algorithm of O(NLw logLw), where N is the length of the DNA sequence and Lw is the window length, for identifying repeats. The algorithm operates in two stages. In the first stage, each nucleotide is analyzed separately for periodicity, and in the second stage, the periodic information of each nucleotide is combined to identify the tandem repeats. Datasets having exact and inexact repeats were used for the experiments. The experimental results show the effectiveness of the approach.
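
The two-stage idea in this record (analyze each nucleotide's indicator sequence for periodicity, then combine the four results) can be illustrated with a much simpler measure than the one the paper proposes. The sketch below is an assumption-laden stand-in: it uses a plain lag-match score in place of the exactly periodic subspace decomposition, runs in O(N x max_period) rather than the stated complexity, and all function names and the 0.9 threshold are hypothetical.

```python
def indicator(seq, base):
    """Binary indicator sequence: 1 where seq has `base`, else 0."""
    return [1 if c == base else 0 for c in seq]


def period_scores(seq, max_period):
    """Score each candidate period p in 1..max_period (requires max_period < len(seq)).

    Stage 1: per-nucleotide count of positions i with u[i] == u[i+p] == 1.
    Stage 2: sum the four counts and normalize by the number of compared pairs.
    A score of 1.0 means every base equals the base p positions later.
    """
    n = len(seq)
    tracks = {base: indicator(seq, base) for base in "ACGT"}
    scores = {}
    for p in range(1, max_period + 1):
        total = 0
        for u in tracks.values():
            total += sum(u[i] * u[i + p] for i in range(n - p))
        scores[p] = total / (n - p)
    return scores


def smallest_period(seq, max_period, threshold=0.9):
    """Return the smallest period scoring >= threshold, so P is preferred over 2P, 3P."""
    scores = period_scores(seq, max_period)
    for p in sorted(scores):
        if scores[p] >= threshold:
            return p
    return None


if __name__ == "__main__":
    exact = "ACG" * 10
    inexact = "ACGACGACT" + "ACG" * 7  # one substitution in the third unit
    print(smallest_period(exact, 6), smallest_period(inexact, 6))
```

Choosing the smallest period above the threshold is one crude way to avoid reporting a multiple of the true period; the paper's measure addresses that ambiguity differently.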

  15. Nucleotide sequences specific to Brucella and methods for the detection of Brucella

    Energy Technology Data Exchange (ETDEWEB)

    McCready, Paula M [Tracy, CA; Radnedge, Lyndsay [San Mateo, CA; Andersen, Gary L [Berkeley, CA; Ott, Linda L [Livermore, CA; Slezak, Thomas R [Livermore, CA; Kuczmarski, Thomas A [Livermore, CA

    2009-02-24

    Nucleotide sequences specific to Brucella that serve as a marker or signature for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  16. Nucleotide sequences specific to Brucella and methods for the detection of Brucella

    Science.gov (United States)

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.

    2009-02-24

    Nucleotide sequences specific to Brucella that serve as a marker or signature for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  17. Nucleotide sequence of the BamHI repetitive sequence, including the HindIII fundamental unit, as a possible mobile element from the Japanese monkey Macaca fuscata.

    Science.gov (United States)

    Prassolov, V S; Kuchino, Y; Nemoto, K; Nishimura, S

    1986-01-01

    Clustered repeat units produced by BamHI digestion of genomic DNA from the Japanese monkey Macaca fuscata [JMr(BamHI)] were sequenced by dideoxy DNA sequencing. The nucleotide sequences of several individual repeats showed that the BamHI repeat contains the 170-bp HindIII element as an integral part, and that it has more than 90% homology with the HindIII repeat element [AGMr(HindIII)] found in the genomic DNA of the African green monkey. In the JMr(BamHI) repeat unit, the 170-bp HindIII element is flanked by a 6-bp inverted repeat, which is part of a 22-bp direct repeat. This latter repeat of 22-bp asymmetrically overlaps the border between the internal AGMr(HindIII)-like region and adjacent regions of the JMr(BamHI) repeat. A similar structural feature of the BamHI repeat unit has been found in the genomic DNA of the baboon, but not in that of the African green monkey. These results show clearly that the BamHI repeat of the modern Japanese monkey originated as a result of insertion of an AGMr(HindIII) element into a certain site(s) of the genomic DNA of an ancestor of the modern Japanese monkey before Macaca-Cercocebus divergence.

  18. A single nucleotide variant in the FMR1 CGG repeat results in a "Pseudodeletion" and is not associated with the fragile X syndrome phenotype.

    Science.gov (United States)

    Cecconi, Massimiliano; Forzano, Francesca; Rinaldi, Rosanna; Cappellacci, Sandra; Grammatico, Paola; Faravelli, Francesca; Dagna Bricarelli, Franca; Di Maria, Emilio; Grasso, Marina

    2008-05-01

    The molecular diagnosis of fragile X syndrome relies on the detection of the pathogenic CGG repeat expansion in the FMR1 gene. Deletions and point mutations have occasionally been reported. Rare polymorphisms might mimic a deletion by Southern blot analysis, leading to false-positive results. We describe a novel rare nucleotide substitution within the CGG repeat. The proband was a woman with a positive family history of mental retardation. Southern blot analysis showed an additional band consistent with a deletion in the region detected by the StB12.3 probe. Sequencing of this region revealed a G>C transversion that interrupts the CGG repeat and introduces an EagI site. The same variant was observed in both the healthy son and father of the proband, supporting the hypothesis that the nucleotide substitution is a silent polymorphism, the frequency of which we estimated to be less than 1% in the general population. These findings argue for a pathogenic role of nucleotide variants within the CGG repeat and suggest possible consequences of unexpected findings in the molecular diagnostics of fragile X syndrome. Thus, although the sequence context of a single nucleotide substitution may not predict possible effects on mRNA or protein function, a specific change in the higher order structures of DNA or mRNA may be functionally relevant in the pathological phenotype.

  19. A Single Nucleotide Variant in the FMR1 CGG Repeat Results in a “Pseudodeletion” and Is Not Associated with the Fragile X Syndrome Phenotype

    Science.gov (United States)

    Cecconi, Massimiliano; Forzano, Francesca; Rinaldi, Rosanna; Cappellacci, Sandra; Grammatico, Paola; Faravelli, Francesca; Dagna Bricarelli, Franca; Di Maria, Emilio; Grasso, Marina

    2008-01-01

    The molecular diagnosis of fragile X syndrome relies on the detection of the pathogenic CGG repeat expansion in the FMR1 gene. Deletions and point mutations have occasionally been reported. Rare polymorphisms might mimic a deletion by Southern blot analysis, leading to false-positive results. We describe a novel rare nucleotide substitution within the CGG repeat. The proband was a woman with a positive family history of mental retardation. Southern blot analysis showed an additional band consistent with a deletion in the region detected by the StB12.3 probe. Sequencing of this region revealed a G>C transversion that interrupts the CGG repeat and introduces an EagI site. The same variant was observed in both the healthy son and father of the proband, supporting the hypothesis that the nucleotide substitution is a silent polymorphism, the frequency of which we estimated to be less than 1% in the general population. These findings argue for a pathogenic role of nucleotide variants within the CGG repeat and suggest possible consequences of unexpected findings in the molecular diagnostics of fragile X syndrome. Thus, although the sequence context of a single nucleotide substitution may not predict possible effects on mRNA or protein function, a specific change in the higher order structures of DNA or mRNA may be functionally relevant in the pathological phenotype. PMID:18403614

  20. Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species

    Science.gov (United States)

    Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha

    2011-01-01

    Computational genomics is one of the important tools to understand the distribution of closely related genomes including simple sequence repeats (SSRs) in an organism, which gives valuable information regarding genetic variations. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species which are involved in a range of pathological disorders. Computational analysis of the SSRs in Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetranucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that over expressed tri-nucleotide SSRs in genomic and coding regions might be responsible for the generation of functional variation of proteins expressed, which in turn may lead to different pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309

  1. Base Sequence Context Effects on Nucleotide Excision Repair

    Directory of Open Access Journals (Sweden)

    Yuqin Cai

    2010-01-01

    Full Text Available Nucleotide excision repair (NER) plays a critical role in maintaining the integrity of the genome when damaged by bulky DNA lesions, since inefficient repair can cause mutations and human diseases, notably cancer. The structural properties of DNA lesions that determine their relative susceptibilities to NER are therefore of great interest. As a model system, we have investigated the major mutagenic lesion derived from the environmental carcinogen benzo[a]pyrene (B[a]P), 10S (+)-trans-anti-B[a]P-N2-dG, in six different sequence contexts that differ in how the lesion is positioned in relation to nearby guanine amino groups. We have obtained molecular structural data by NMR and MD simulations, bending properties from gel electrophoresis studies, and NER data obtained from human HeLa cell extracts for our six investigated sequence contexts. This model system suggests that disturbed Watson-Crick base pairing is a better recognition signal than a flexible bend, and that these can act in concert to provide an enhanced signal. Steric hindrance between the minor groove-aligned lesion and nearby guanine amino groups determines the exact nature of the disturbances. Both nearest neighbor and more distant neighbor sequence contexts have an impact. Regardless of the exact distortions, we hypothesize that they provide a local thermodynamic destabilization signal for repair.

  2. ACG: rapid inference of population history from recombining nucleotide sequences

    Directory of Open Access Journals (Sweden)

    O'Fallon Brendan D

    2013-02-01

    Full Text Available Abstract Background Reconstruction of population history from genetic data often requires Monte Carlo integration over the genealogy of the samples. Among tools that perform such computations, few are able to consider genetic histories including recombination events, precluding their use on most alignments of nuclear DNA. Explicit consideration of recombinations requires modeling the history of the sequences with an Ancestral Recombination Graph (ARG) in place of a simple tree, which presents significant computational challenges. Results ACG is an extensible desktop application that uses a Bayesian Markov chain Monte Carlo procedure to estimate the posterior likelihood of an evolutionary model conditional on an alignment of genetic data. The ancestry of the sequences is represented by an ARG, which is estimated from the data with other model parameters. Importantly, ACG computes the full, Felsenstein likelihood of the ARG, not a pairwise or composite likelihood. Several strategies are used to speed computations, and ACG is roughly 100x faster than a similar, recombination-aware program. Conclusions Modeling the ancestry of the sequences with an ARG allows ACG to estimate the evolutionary history of recombining nucleotide sequences. ACG can accurately estimate the posterior distribution of population parameters such as the (scaled) population size and recombination rate, as well as many aspects of the recombinant history, including the positions of recombination breakpoints, the distribution of time to most recent common ancestor along the sequence, and the non-recombining trees at individual sites. Multiple substitution models and population size models are provided. ACG also provides a richly informative graphical interface that allows users to view the evolution of model parameters and likelihoods in real time.

  3. A novel tandem repeat sequence located on human chromosome 4p: isolation and characterization.

    Science.gov (United States)

    Kogi, M; Fukushige, S; Lefevre, C; Hadano, S; Ikeda, J E

    1997-06-01

    In an effort to analyze the genomic region of the distal half of human chromosome 4p, to where Huntington disease and other diseases have been mapped, we have isolated the cosmid clone (CRS447) that was likely to contain a region with specific repeat sequences. Clone CRS447 was subjected to detailed analysis, including chromosome mapping, restriction mapping, and DNA sequencing. Chromosome mapping by both a human-CHO hybrid cell panel and FISH revealed that CRS447 was predominantly located in the 4p15.1-15.3 region. CRS447 was shown to consist of tandem repeats of 4.7-kb units present on chromosome 4p. A single EcoRI unit was subcloned (pRS447), and the complete sequence was determined as 4752 nucleotides. When pRS447 was used as a probe, the number of copies of this repeat per haploid genome was estimated to be 50-70. Sequence analysis revealed that it contained two internal CA repeats and one putative ORF. Database search established that this sequence was unreported. However, two homologous STS markers were found in the database. We concluded that CRS447/pRS447 is a novel tandem repeat sequence that is mainly specific to human chromosome 4p.

  4. Highly Informative Simple Sequence Repeat (SSR) Markers for Fingerprinting Hazelnut

    Science.gov (United States)

    Simple sequence repeat (SSR) or microsatellite markers have many applications in breeding and genetic studies of plants, including fingerprinting of cultivars and investigations of genetic diversity, and therefore provide information for better management of germplasm collections. They are repeatab...

  5. simple sequence repeat (SSR) markers in genetic analysis of

    African Journals Online (AJOL)

    Yomi

    2012-08-28

    Aug 28, 2012 ... In the present study, 78 mapped simple sequence repeat (SSR) markers representing 11 ... mean (UPGMA) with each cluster representing a particular Vigna species. ..... were reported to be more frequent than the compound.

  6. Study of simple sequence repeat (SSR) polymorphism for biotic ...

    African Journals Online (AJOL)

    home

    2013-10-02

    Oct 2, 2013 ... back cross breeding; SSRs, simple sequence repeats; PIC, polymorphism ..... PIC values were reported in barley wheat and rice (Gu et ... doubled-haploid rice population. Theor. ... Grover A, Aishwarya V, Sharma PC (2007).

  7. An ancient repeat sequence in the ATP synthase beta-subunit gene of forcipulate sea stars.

    Science.gov (United States)

    Foltz, David W

    2007-11-01

    A novel repeat sequence with a conserved secondary structure is described from two nonadjacent introns of the ATP synthase beta-subunit gene in sea stars of the order Forcipulatida (Echinodermata: Asteroidea). The repeat is present in both introns of all forcipulate sea stars examined, which suggests that it is an ancient feature of this gene (with an approximate age of 200 Mya). Both stem and loop regions show high levels of sequence constraint when compared to flanking nonrepetitive intronic regions. The repeat was also detected in (1) the family Pterasteridae, order Velatida and (2) the family Korethrasteridae, order Velatida. The repeat was not detected in (1) the family Echinasteridae, order Spinulosida, (2) the family Astropectinidae, order Paxillosida, (3) the family Solasteridae, order Velatida, or (4) the family Goniasteridae, order Valvatida. The repeat lacks similarity to published sequences in unrestricted GenBank searches, and there are no significant open reading frames in the repeat or in the flanking intron sequences. Comparison via parametric bootstrapping to a published phylogeny based on 4.2 kb of nuclear and mitochondrial sequence for a subset of these species allowed the null hypothesis of a congruent phylogeny to be rejected for each repeat, when compared separately to the published phylogeny. In contrast, the flanking nonrepetitive sequences in each intron yielded separate phylogenies that were each congruent with the published phylogeny. In four species, the repeat in one or both introns has apparently experienced gene conversion. The two introns also show a correlated pattern of nucleotide substitutions, even after excluding the putative cases of gene conversion.

  8. Unstable microsatellite repeats facilitate rapid evolution of coding and regulatory sequences.

    Science.gov (United States)

    Jansen, A; Gemayel, R; Verstrepen, K J

    2012-01-01

    Tandem repeats are intrinsically highly variable sequences since repeat units are often lost or gained during replication or following unequal recombination events. Because of their low complexity and their instability, these repeats, which are also called satellite repeats, are often considered to be useless 'junk' DNA. However, recent findings show that tandem repeats are frequently found within promoters of stress-induced genes and within the coding regions of genes encoding cell-surface and regulatory proteins. Interestingly, frequent changes in these repeats often confer phenotypic variability. Examples include variation in the microbial cell surface, rapid tuning of internal molecular clocks in flies, and enhanced morphological plasticity in mammals. This suggests that instead of being useless junk DNA, some variable tandem repeats are useful functional elements that confer 'evolvability', facilitating swift evolution and rapid adaptation to changing environments. Since changes in repeats are frequent and reversible, repeats provide a unique type of mutation that bridges the gap between rare genetic mutations, such as single nucleotide polymorphisms, and highly unstable but reversible epigenetic inheritance.

  9. Repeat Sequences and Base Correlations in Human Y Chromosome Palindromes

    Institute of Scientific and Technical Information of China (English)

    Neng-zhi Jin; Zi-xian Liu; Yan-jiao Qi; Wen-yuan Qiu

    2009-01-01

    On the basis of information theory and statistical methods, we use mutual information, n-tuple entropy and conditional entropy, combined with biological characteristics, to analyze the long range correlation and short range correlation in human Y chromosome palindromes. The magnitude distribution of the long range correlation, which can be reflected by the mutual information, is P5>P5a>P5b (P5a and P5b are the sequences that replace solely Alu repeats and all interspersed repeats with random uncorrelated sequences in human Y chromosome palindrome 5, respectively); and the magnitude distribution of the short range correlation, which can be reflected by the n-tuple entropy and the conditional entropy, is P5>P5a>P5b>random uncorrelated sequence. In other words, when the Alu repeats and all interspersed repeats are replaced with random uncorrelated sequence, the long range and short range correlations decrease gradually. However, the random uncorrelated sequence has no correlation. This research indicates that more repeat sequences result in stronger correlation between bases in the human Y chromosome. The analyses may be helpful for understanding the special structures of human Y chromosome palindromes more deeply.
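
The mutual information used in this record as a measure of base-base correlation can be written, for bases separated by a distance k, as I(k) = sum over (a, b) of p_ab(k) * log2(p_ab(k) / (p_a * p_b)). The sketch below is a minimal illustration under assumptions: the function name is hypothetical, probabilities are plain frequency estimates from one sequence, and no finite-sample bias correction is applied (the paper's exact estimator is not given in the abstract).

```python
import math
from collections import Counter


def mutual_information(seq, k):
    """Estimate mutual information (in bits) between bases k positions apart.

    Hypothetical helper; uses raw frequency estimates with no bias correction.
    """
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)    # marginal counts of the first base
    right = Counter(b for _, b in pairs)   # marginal counts of the second base
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with counts to avoid extra divisions
        ratio = p_ab * n * n / (left[a] * right[b])
        mi += p_ab * math.log2(ratio)
    return mi


if __name__ == "__main__":
    periodic = "ACGT" * 25
    print(mutual_information(periodic, 4))  # bases at lag 4 are fully determined
    print(mutual_information("A" * 50, 1))  # a constant sequence carries no information
```

Plotting this quantity against k for a sequence and for a shuffled copy of it is one common way to visualize the kind of long range correlation the abstract refers to.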

  10. Survey of simple sequence repeats in woodland strawberry (Fragaria vesca).

    Science.gov (United States)

    Guan, L; Huang, J F; Feng, G Q; Wang, X W; Wang, Y; Chen, B Y; Qiao, Y S

    2013-07-30

    The use of simple sequence repeats (SSRs), or microsatellites, as genetic markers has become popular due to their abundance and variation in length among individuals. In this study, we investigated linkage groups (LGs) in the woodland strawberry (Fragaria vesca) and demonstrated variation in the abundances, densities, and relative densities of mononucleotide, dinucleotide, and trinucleotide repeats. Mononucleotide, dinucleotide, and trinucleotide repeats were more common than longer repeats in all LGs examined. Perfect SSRs were the predominant SSR type found and their abundance was extremely stable among LGs and chloroplasts. Abundances of mononucleotide, dinucleotide, and trinucleotide repeats were positively correlated with LG size, whereas those of tetranucleotide and hexanucleotide SSRs were not. Generally, in each LG, the abundance, relative abundance, relative density, and the proportion of each unique SSR all declined rapidly as the repeated unit increased. Furthermore, the lengths and frequencies of SSRs varied among different LGs.

  11. Tidying up international nucleotide sequence databases: ecological, geographical and sequence quality annotation of ITS sequences of mycorrhizal fungi.

    Directory of Open Access Journals (Sweden)

    Leho Tedersoo

    Sequence analysis of the ribosomal RNA operon, particularly the internal transcribed spacer (ITS) region, provides a powerful tool for identification of mycorrhizal fungi. The sequence data deposited in the International Nucleotide Sequence Databases (INSD) are, however, unfiltered for quality and are often poorly annotated with metadata. To detect chimeric and low-quality sequences and assign the ectomycorrhizal fungi to phylogenetic lineages, fungal ITS sequences were downloaded from INSD, aligned within family-level groups, and examined through phylogenetic analyses and BLAST searches. By combining the fungal sequence database UNITE and the annotation and search tool PlutoF, we also added metadata from the literature to these accessions. Altogether 35,632 sequences belonged to mycorrhizal fungi or originated from ericoid and orchid mycorrhizal roots. Of these sequences, 677 were considered chimeric and 2,174 of low read quality. Information detailing country of collection, geographical coordinates, interacting taxon and isolation source were supplemented to cover 78.0%, 33.0%, 41.7% and 96.4% of the sequences, respectively. These annotated sequences are publicly available via UNITE (http://unite.ut.ee/) for downstream biogeographic, ecological and taxonomic analyses. In the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/), the annotated sequences have a special link-out to UNITE. We intend to expand the data annotation to additional genes and all taxonomic groups and functional guilds of fungi.

  12. Nucleotide sequence and transcription of a trypomastigote surface antigen gene of Trypanosoma cruzi.

    Science.gov (United States)

    Fouts, D L; Ruef, B J; Ridley, P T; Wrightsman, R A; Peterson, D S; Manning, J E

    1991-06-01

    In previous studies we identified a 500-bp segment of the gene, TSA-1, which encodes an 85-kDa trypomastigote-specific surface antigen of the Peru strain of Trypanosoma cruzi. TSA-1 was shown to be located at a telomeric site and to contain a 27-bp tandem repeat unit within the coding region. This repeat unit defines a discrete subset of a multigene family and places the TSA-1 gene within this subset. In this study, we present the complete nucleotide sequence of the TSA-1 gene from the Peru strain. By homology matrix analysis, fragments of two other trypomastigote specific surface antigen genes, pTt34 and SA85-1.1, are shown to have extensive sequence homology with TSA-1 indicating that these genes are members of the same gene family as TSA-1. The TSA-1 subfamily was also found to be active in two other strains of T. cruzi, one of which contains multiple telomeric members and one of which contains a single non-telomeric member, suggesting that transcription is not necessarily dependent on the gene being located at a telomeric site. Also, while some of the sequences found in this gene family are present in 2 size classes of poly(A)+ RNA, others appear to be restricted to only 1 of the 2 RNA classes.

  13. Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

    Directory of Open Access Journals (Sweden)

    Charlotte Rehm

    In prokaryotes, simple sequence repeats (SSRs) with unit sizes of 1-5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats has been noticed in bacteria, reports about SSRs of 6-9 nt are rare. In particular, G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4-forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms, repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation in the number of repetitive units was observed, with repeat numbers ranging from two up to 26 units. However, a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria.
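
The unit-count distribution described here can be tallied with a short pattern search. The sketch below is an illustration only: it scans one strand, treats N as any base, requires at least two tandem units (the lower bound quoted in the abstract), and uses hypothetical function names.

```python
import re
from collections import Counter

# N in the consensus GGGNATC stands for any base.
UNIT = r"GGG[ACGT]ATC"
TANDEM = re.compile(r"(?:%s){2,}" % UNIT)


def repeat_unit_counts(genome: str) -> Counter:
    """Histogram of tandem GGGNATC array sizes, in units, on the given strand."""
    return Counter(len(m.group(0)) // 7 for m in TANDEM.finditer(genome))


def has_four_g_tracts(array: str) -> bool:
    """True if the array carries at least four runs of three or more Gs."""
    return len(re.findall(r"G{3,}", array)) >= 4
```

An array of four units is the shortest one for which `has_four_g_tracts` returns true, which mirrors the abstract's link between the four-unit preference and the four G-tracts needed for G4 formation.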

  14. Complete nucleotide sequence of a monopartite Begomovirus and associated satellites infecting Carica papaya in Nepal.

    Science.gov (United States)

    Shahid, M S; Yoshida, S; Khatri-Chhetri, G B; Briddon, R W; Natsuaki, K T

    2013-06-01

    Carica papaya (papaya) is a fruit crop that is cultivated mostly in kitchen gardens throughout Nepal. Leaf samples of C. papaya plants with leaf curling, vein darkening, vein thickening, and a reduction in leaf size were collected from a garden in Darai village, Rampur, Nepal in 2010. Full-length clones of a monopartite Begomovirus, a betasatellite and an alphasatellite were isolated. The complete nucleotide sequence of the Begomovirus showed the arrangement of genes typical of Old World begomoviruses with the highest nucleotide sequence identity (>99 %) to an isolate of Ageratum yellow vein virus (AYVV), confirming it as an isolate of AYVV. The complete nucleotide sequence of the betasatellite showed greater than 89 % nucleotide sequence identity to an isolate of Tomato leaf curl Java betasatellite originating from Indonesia. The sequence of the alphasatellite displayed 92 % nucleotide sequence identity to Sida yellow vein China alphasatellite. This is the first identification of these components in Nepal and the first time they have been identified in papaya.

  15. Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

    Energy Technology Data Exchange (ETDEWEB)

    Torella, JP; Lienert, F; Boehm, CR; Chen, JH; Way, JC; Silver, PA

    2014-08-07

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.

  16. The nucleotide sequence of 5S rRNA from a red alga, Porphyra yezoensis.

    OpenAIRE

    Takaiwa, F; Kusuda, M; Saga, N; Sugiura, M

    1982-01-01

    The nucleotide sequence of 5S rRNA from Porphyra yezoensis has been determined to be: pACGUACGGCCAUAUCCGAGACACGCGUACCGGAACCCAUUCCGAAUUCCGAAGUCAAGCGUCCGCGAGUUGGGUUAGU - AAUCUGGUGAAAGAUCACAGGCGAACCCCCAAUGCUGUACGUC. This 5S rRNA sequence is most similar to that of Euglena gracilis (63% homology).

  17. Analysis of Simple Sequence Repeats in Genomes of Rhizobia

    Institute of Scientific and Technical Information of China (English)

    GAO Ya-mei; HAN Yi-qiang; TANG Hui; SUN Dong-mei; WANG Yan-jie; WANG Wei-dong

    2008-01-01

    Simple sequence repeats (SSRs), or microsatellites, are ubiquitous in the genomes of various organisms and serve as genetic markers. The analysis of SSRs in rhizobial genomes provides useful information for a variety of applications in the population genetics of rhizobia. We analyzed the occurrences, relative abundance, and relative density of the most common SSRs in the sequenced Bradyrhizobium japonicum, Mesorhizobium loti, and Sinorhizobium meliloti genomes in the microorganisms tandem repeats database, and compared the SSRs of the three genomes with each other. The results showed that there were 1 410, 859, and 638 SSRs in the B. japonicum, M. loti, and S. meliloti genomes, respectively. In the genomes of B. japonicum, M. loti, and S. meliloti, tetranucleotide, pentanucleotide, and hexanucleotide repeats were more abundant and indicated higher mutation rates in these species. Mononucleotide repeats were the least abundant. The SSR types and distribution were similar among these species.

  18. Coevolution between simple sequence repeats (SSRs) and virus genome size

    Directory of Open Access Journals (Sweden)

    Zhao Xiangyan

    2012-08-01

    Background: The relationship between the level of repetitiveness in genomic sequence and genome size has been investigated using complete prokaryotic and eukaryotic genomes, but relevant studies have rarely been made in virus genomes. Results: In this study, a total of 257 viruses were examined, which cover 90% of genera. The results showed that the occurrence of simple sequence repeats (SSRs) is strongly, positively and significantly correlated with genome size. Each repeat class is distributed within a certain range of genome sequence lengths. Mono-, di- and trinucleotide repeats are widely distributed in all virus genomes; tetranucleotide SSRs are a common component of genomes more than 100 kb in size; in the range of genome ... Conclusions: We conducted this research at the level of viruses as a whole. We concluded that genome size is an important factor affecting the occurrence of SSRs; hosts are also responsible, to a certain degree, for the variance in SSR content.

  19. Cloning, characterization, and properties of seven triplet repeat DNA sequences.

    Science.gov (United States)

    Ohshima, K; Kang, S; Larson, J E; Wells, R D

    1996-07-12

    Several neuromuscular and neurodegenerative diseases are caused by genetically unstable triplet repeat sequences (CTG.CAG, CGG.CCG, or AAG.CTT) in or near the responsible genes. We implemented novel cloning strategies with chemically synthesized oligonucleotides to clone seven of the triplet repeat sequences (GTA.TAC, GAT.ATC, GTT.AAC, CAC.GTG, AGG.CCT, TCG.CGA, and AAG.CTT), and the adjoining paper (Ohshima, K., Kang, S., Larson, J. E., and Wells, R. D.(1996) J. Biol. Chem. 271, 16784-16791) describes studies on TTA.TAA. This approach in conjunction with in vivo expansion studies in Escherichia coli enabled the preparation of at least 81 plasmids containing the repeat sequences with lengths of approximately 16 up to 158 triplets in both orientations with varying extents of polymorphisms. The inserts were characterized by DNA sequencing as well as DNA polymerase pausings, two-dimensional agarose gel electrophoresis, and chemical probe analyses to evaluate the capacity to adopt negative supercoil induced non-B DNA conformations. AAG.CTT and AGG.CCT form intramolecular triplexes, and the other five repeat sequences do not form any previously characterized non-B structures. However, long tracts of TCG.CGA showed strong inhibition of DNA synthesis at specific loci in the repeats as seen in the cases of CTG.CAG and CGG.CCG (Kang, S., Ohshima, K., Shimizu, M., Amirhaeri, S., and Wells, R. D.(1995) J. Biol. Chem. 270, 27014-27021). This work along with other studies (Wells, R. D.(1996) J. Biol. Chem. 271, 2875-2878) on CTG.CAG, CGG.CCG, and TTA.TAA makes available long inserts of all 10 triplet repeat sequences for a variety of physical, molecular biological, genetic, and medical investigations. A model to explain the reduction in mRNA abundance in Friedreich's ataxia based on intermolecular triplex formation is proposed.
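
The figure of "all 10 triplet repeat sequences" follows from counting triplets up to frame shift and strand: a duplex name such as CTG.CAG lists a motif and its reverse complement. The following sketch (function names are illustrative) enumerates the 64 triplets, drops the four mononucleotide runs, and groups the rest.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_triplet(motif: str) -> str:
    """Smallest representative over both strands and all three frame shifts."""
    candidates = []
    for strand in (motif, reverse_complement(motif)):
        candidates += [strand[i:] + strand[:i] for i in range(3)]
    return min(candidates)


def triplet_repeat_classes() -> dict:
    """Map each canonical triplet to the set of triplets equivalent to it."""
    classes = {}
    for bases in product("ACGT", repeat=3):
        motif = "".join(bases)
        if len(set(motif)) == 1:  # AAA, CCC, GGG, TTT are mononucleotide runs
            continue
        classes.setdefault(canonical_triplet(motif), set()).add(motif)
    return classes
```

The 60 remaining triplets fall into classes of six (three frame shifts on each strand), which gives the ten duplex repeat sequences named in the abstract.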

  20. PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

    Science.gov (United States)

    Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

    2011-01-01

    PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs of 1-6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variation and antigenic variation seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.

  1. Cytogenetic diversity of simple sequences repeats in morphotypes of Brassica rapa ssp. chinensis

    Directory of Open Access Journals (Sweden)

    Jinshuang Zheng

    2016-07-01

    A significant fraction of the nuclear DNA of all eukaryotes is occupied by simple sequence repeats (SSRs). Although these sequences have sparked great interest as a means of studying genetic variation, linkage mapping and evolution, little attention has been paid to their chromosomal distribution and cytogenetic diversity. This paper reports the long-range organization of all possible classes of mono-, di- and tri-nucleotide SSRs in Brassica rapa. Fluorescence in situ hybridization (FISH) was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa, with trinucleotide SSRs more prevalent in the genome of B. rapa ssp. chinensis. The chromosomal characterizations of mono-, di- and tri-nucleotide repeats have been acquired. The data revealed the non-random and motif-dependent chromosome distribution of SSRs in different morphotypes, and showed relative variability in SSR amount alongside a similar chromosomal distribution in centromeric/peri-centromeric heterochromatin. The differences in the abundance and distribution of SSRs indicated the driving force of SSRs in relation to the evolution of B. rapa species. The results provide a comprehensive view of SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis.

  2. Ras p21 sequences opposite the nucleotide binding pocket are required for GRF-mediated nucleotide release

    DEFF Research Database (Denmark)

    Leonardsen, L; DeClue, J E; Lybaek, H;

    1996-01-01

    , the sensitivity of H-Ras to GRF was abolished when residues 130-139 were replaced by proline-aspartic acid-glutamine, whereas substitution of the entire loop 8 (residues 123-130 replaced by leucine-isoleucine-arginine) had no effect on the stimulation of guanine nucleotide release by GRF. Substrate activity...

  3. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Science.gov (United States)

    2010-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  4. Sequence analysis of trinucleotide repeat microsatellites from an enrichment library of the equine genome.

    Science.gov (United States)

    Tozaki, T; Inoue, S; Mashima, S; Ohta, M; Miura, N; Tomita, M

    2000-04-01

    Microsatellites are useful tools for the construction of a linkage map and parentage testing of equines, but only a limited number of equine microsatellites have been elucidated. Thus, we constructed an equine genomic library enriched for DNA fragments containing (CAG)n repeats. The enrichment method includes hybridization-capture of repeat regions using biotin-conjugated oligonucleotides, a nucleotide substrate-biased polymerase reaction with the oligonucleotides, and subsequent PCR amplification, because these procedures are useful for the cloning of less abundant trinucleotide microsatellites. Microsatellites containing (CAG)n repeats were obtained at a ratio of one per 3-4 clones, indicating an enrichment of about 10^4-fold, resulting in less time and lower cost for cloning. In this study, 66 different (CAG)n repeat microsatellites were identified. The number of complete simple CAG repeats in our clones ranged from 4 to 33, with an average repeat length of 8.8 units. The microsatellites were useful as sequence-tagged site (STS) markers. In addition, some clones containing (CAG)n repeats showed homology to human (CAG)n-containing genes, which have been previously mapped. These results indicate that the clones might be a useful tool for chromosome comparison between equines and humans.

  5. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Science.gov (United States)

    2010-07-01

    ... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences... sequences are specifically excluded from this definition. Sequences with fewer than four specifically... acids are not intended to be embraced by this definition. Any amino acid sequence that contains...

  6. Dynamics of Charge Transfer in Ordered and Chaotic Nucleotide Sequences

    CERN Document Server

    Fialko, N S

    2013-01-01

    Charge transfer is considered in systems composed of a donor, an acceptor and bridge sites of (AT) nucleotide pairs. For a bridge consisting of 180 (AT) pairs, three cases are dealt with: a uniform case, when all the nucleotides in each strand are identical; an ordered case, when nucleotides in each DNA strand are arranged in an orderly fashion; a chaotic case, when (AT) and (TA) pairs are arranged randomly. It is shown that in all the cases a charge transfer from a donor to an acceptor can take place. All other factors being equal, the transfer is the most efficient in the uniform case, the ordered and chaotic cases are less and the least efficient, accordingly. The results obtained are in agreement with experimental data on long-range charge transfer in DNA.

  7. Single nucleotide polymorphisms associated with rat expressed sequences

    NARCIS (Netherlands)

    Guryev, Victor; Berezikov, Eugene; Malik, Rainer; Plasterk, Ronald H A; Cuppen, Edwin

    2004-01-01

    Single nucleotide polymorphisms (SNPs) are the most common source of genetic variation in populations and are thus most likely to account for the majority of phenotypic and behavioral differences between individuals or strains. Although the rat is extensively studied for the latter, data on naturall

  8. Mining for Single Nucleotide Polymorphisms in Pig genome sequence data

    NARCIS (Netherlands)

    Kerstens, H.H.D.; Kollers, S.; Kommandath, A.; Rosario, del M.; Dibbits, B.W.; Kinders, S.M.; Crooijmans, R.P.M.A.; Groenen, M.A.M.

    2009-01-01

    Background - Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited. Results - A total of 4.8 million whole g

  9. Complete nucleotide sequence and organization of the mitogenome ...

    African Journals Online (AJOL)

    2010-02-01

    Feb 1, 2010 ... sequences of the PCGs were translated on the basis of the inverte- ... their proposed cloverleaf secondary structure and anticodon sequences with the aid ...... transcription termination peptide binding site, in that the intergenic ...

  10. Simple sequence repeats in watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai).

    Science.gov (United States)

    Jarret, R L; Merrick, L C; Holms, T; Evans, J; Aradhya, M K

    1997-08-01

    Simple sequence repeat length polymorphisms were utilized to examine genetic relatedness among accessions of watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai). A size-fractionated TaqI genomic library was screened for the occurrence of dimer and trimer simple sequence repeats (SSRs). A total of 96 (0.53%) SSR-bearing clones were identified and the inserts from 50 of these were sequenced. The dinucleotide repeats (CT)n and (GA)n accounted for 82% of the SSRs sequenced. PCR primer pairs flanking seven SSR loci were used to amplify SSRs from 32 morphologically variable watermelon genotypes from Africa, Europe, Asia, and Mexico and a single accession of Citrullus colocynthis from Chad. Cluster analysis of SSR length polymorphisms delineated 4 groups at the 25% level of genetic similarity. The largest group contained C. lanatus var. lanatus accessions. The second largest group contained only wild and cultivated "citron"-type or C. lanatus var. citroides accessions. The third group contained an accession tentatively identified as C. lanatus var. lanatus but which perhaps is a hybrid between C. lanatus var. lanatus and C. lanatus var. citroides. The fourth group consisted of a single accession identified as C. colocynthis. "Egusi"-type watermelons from Nigeria grouped with C. lanatus var. lanatus. The use of SSRs for watermelon germplasm characterization and genetic diversity studies is discussed.

  11. Variation in the nucleotide sequence of a prolamin gene family in wild rice.

    Science.gov (United States)

    Barbier, P; Ishihama, A

    1990-07-01

    Variation in the DNA sequence of the 10 kDa prolamin gene family within the wild rice species Oryza rufipogon was probed using the direct sequencing of PCR-amplified genes. A comparison of the nucleotide and deduced amino-acid sequences of eight Asian strains of O. rufipogon and one strain of the related African species O. longistaminata is presented.

  12. Complete genome sequence of a recombinant Marek's disease virus field strain with one reticuloendotheliosis virus long terminal repeat insert.

    Science.gov (United States)

    Su, Shuai; Cui, Ning; Cui, Zhizhong; Zhao, Peng; Li, Yanpeng; Ding, Jiabo; Dong, Xuan

    2012-12-01

    Marek's disease virus (MDV) Chinese strain GX0101, isolated in 2001 from a vaccinated flock of layer chickens with severe tumors, was the first reported recombinant MDV field strain with one reticuloendotheliosis virus (REV) long terminal repeat (LTR) insert. GX0101 belongs to very virulent MDV (vvMDV) but has higher horizontal transmission ability than the vvMDV strain Md5. The complete genome sequence of GX0101 is 178,101 nucleotides (nt) and contains only one REV-LTR insert at a site 267 nt upstream of the sorf2 gene. Moreover, GX0101 has 5 repeats of a 217-nt fragment in its terminal repeat short (TRS) region and 3 repeats in internal repeat short (IRS) region, compared to the other 10 strains with only 1 or 2 repeats in both TRS and IRS.

  13. NUCLEOTIDE SEQUENCE VARIATION IN LEPTIN GENE OF MURRAH BUFFALO (BUBALUS BUBALIS)

    Directory of Open Access Journals (Sweden)

    Sanjoy Datta

    2012-12-01

    Leptin is a 16 kD protein, synthesized by adipose tissue, that is involved in the regulation of feed intake, energy balance, fertility and immune functions. The present study was undertaken with the objectives of sequence characterization and studying the nucleotide variation in the leptin gene in Murrah buffalo. The leptin gene consists of three exons and two introns and spans about 18.9 kb; the first exon is not translated into protein. In buffaloes, the leptin gene is located on chromosome eight and maps to BBU 8q32. The leptin gene was amplified by PCR using oligonucleotide primers to obtain a 289 bp fragment comprising exon 2 and a 405 bp fragment containing exon 3 of the leptin gene. The amplicons were sequenced to identify variation at the nucleotide level. Sequence comparison of buffalo with cattle reveals variation at five nucleotide positions (983, 1083, 1147, 1152 and 1221); all the SNPs are synonymous, resulting in no change in amino acids. Three of these eight nucleotide variations have been reported for the first time in buffalo. The results indicate conservation of DNA sequence between cattle and buffalo. Nucleotide sequence variations observed at the leptin gene between Bubalus bubalis and Bos taurus revealed 97% nucleotide identity.
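
Whether a SNP is synonymous can be checked by translating the reference and variant codons with the standard genetic code. The sketch below is a generic illustration: the function name is hypothetical, positions are 0-based offsets into an in-frame coding sequence (not the gene coordinates quoted in the abstract), and the example sequence in any test of it is invented.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def is_synonymous(cds: str, pos: int, alt_base: str) -> bool:
    """True if replacing cds[pos] with alt_base leaves the amino acid unchanged."""
    start = pos - pos % 3                 # first base of the affected codon
    ref_codon = cds[start:start + 3]
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    return CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
```

Most synonymous changes fall at the third codon position, so a set of exclusively synonymous SNPs, as reported here, is consistent with strong conservation of the protein sequence.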

  14. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    Directory of Open Access Journals (Sweden)

    Kato Mikio

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective for characterizing the mode of evolution of satellite DNA.
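
The two calculations named here, per-position nucleotide frequency and pairwise sequence comparison, can be sketched for a set of aligned repeat units. This is a simplified stand-in rather than the authors' procedure: it uses the uncorrected proportion of differing sites (p-distance) in place of whatever phylogenetic distance the study applied, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def column_frequencies(aligned):
    """Per-position base frequencies across equal-length aligned sequences."""
    n = len(aligned)
    return [
        {base: count / n for base, count in Counter(col).items()}
        for col in zip(*aligned)
    ]


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, ignoring positions where either has a gap."""
    sites = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in sites) / len(sites)


def mean_pairwise_distance(aligned) -> float:
    """Average p-distance over all pairs of member sequences."""
    pairs = list(combinations(aligned, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

A low mean pairwise distance within a species combined with fixed differences between species is the usual signature of concerted evolution in a satellite family.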

  15. The complete nucleotide sequence of chrysanthemum stem necrosis virus

    NARCIS (Netherlands)

    Dullemans, A.M.; Verhoeven, J.Th.J.; Kormelink, R.J.M.; Vlugt, van der R.A.A.

    2015-01-01

    The complete genome sequence of chrysanthemum stem necrosis virus (CSNV) was determined using Roche 454 next-generation sequencing. CSNV is a tentative member of the genus Tospovirus within the family Bunyaviridae, whose members are arthropod-borne. This is the first report of the entire RNA genome

  16. Development and characterization of simple sequence repeats for Bipolaris sorokiniana and cross transferability to related species.

    Science.gov (United States)

    Fajolu, Oluseyi L; Wadl, Phillip A; Vu, Andrea L; Gwinn, Kimberly D; Scheffler, Brian E; Trigiano, Robert N; Ownley, Bonnie H

    2013-01-01

    Simple sequence repeat (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n = 384) harbored SSR motifs. After eliminating redundant sequences, 196 SSR loci were identified, of which 84.7% were dinucleotide repeats and 9.7% and 5.6% were tri- and tetra-nucleotide repeats, respectively. Primer pairs were designed for 105 loci, 85 of which amplified successfully. Sixteen polymorphic loci were characterized with 15 B. sorokiniana isolates obtained from infected switchgrass plant materials collected from five states in the USA. These loci successfully cross-amplified isolates from at least one related species, including Bipolaris oryzae, Bipolaris spicifera and Bipolaris victoriae, that cause leaf spot on switchgrass. Haploid gene diversity per locus across all isolates studied varied from 0.633 to 0.861. Principal component analysis of SSR data clustered isolates according to their respective species. These SSR markers will be a valuable tool for genetic variability and population studies of B. sorokiniana and related species that are pathogenic on switchgrass and other host plants. In addition, these markers are potential diagnostic tools for species in the genus Bipolaris.
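
Per-locus haploid gene diversity is commonly computed from allele frequencies. The sketch below assumes Nei's unbiased estimator, H = n/(n-1) * (1 - sum of squared allele frequencies); the abstract does not say which estimator was used, and the function name is illustrative.

```python
from collections import Counter


def haploid_gene_diversity(alleles) -> float:
    """Nei's unbiased gene diversity for one locus scored in haploid isolates."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two isolates")
    sum_sq = sum((count / n) ** 2 for count in Counter(alleles).values())
    return n / (n - 1) * (1 - sum_sq)
```

The input is one allele call (for example, an amplicon size in bp) per isolate; a locus at which every isolate shares the same allele scores 0.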

  17. Nucleotide sequences of two Korean isolates of Cucumber green mottle mosaic virus.

    Science.gov (United States)

    Kim, Sang-Min; Lee, Jung-Myung; Yim, Kyu-Ock; Oh, Man-Ho; Park, Jin-Woo; Kim, Kook-Hyung

    2003-12-31

    The nucleotide sequences of the genomic RNAs of Cucumber green mottle mosaic virus Korean watermelon isolate (CGMMV-KW) and Korean oriental melon isolate (CGMMV-KOM) were determined and compared to the sequences of other tobamoviruses including CGMMV strains W and SH. Each CGMMV isolate had a genome of 6,424 nucleotides. Each also had 60 and 176 nucleotides of 5' and 3' untranslated regions (UTRs), respectively, and four open reading frames (ORF1-4). ORFs 1 to 4 encode proteins of 129, 186, 29, and 17.4 kDa, respectively. The nucleotide and deduced amino acid sequences of CGMMV-KOM and CGMMV-KW were more than 98.3% identical. When compared to other CGMMV strains in a phylogenetic analysis they were found to form a distinct virus clade, and were more distantly related to other tobamoviruses (23.5-56.7% identity).

  18. Repeat-based Sequence Typing of Carnobacterium maltaromaticum.

    Science.gov (United States)

    Rahman, Abdur; El Kheir, Sara M; Back, Alexandre; Mangavel, Cécile; Revol-Junelles, Anne-Marie; Borges, Frédéric

    2016-06-01

    Carnobacterium maltaromaticum is a Lactic Acid Bacterium (LAB) of technological interest for the food industry, especially the dairy industry, as bioprotection and ripening flora. The industrial use of this LAB requires accurate and resolutive typing tools. A new typing method for C. maltaromaticum, inspired by MLVA analysis and called Repeat-based Sequence Typing (RST), is described. Rather than electrophoresis analysis, our RST method is based on sequence analysis of multiple loci containing Variable-Number Tandem-Repeats (VNTRs). The method described here for C. maltaromaticum relies on the analysis of three VNTR loci, and was applied to a collection of 24 strains. For each strain, a PCR product corresponding to the amplification of each VNTR locus was sequenced. Sequence analysis allowed delineating 11, 11, and 12 alleles for loci VNTR-A, VNTR-B, and VNTR-C, respectively. Considering the allele combination exhibited by each strain allowed defining 15 genotypes, yielding a discriminatory index of 0.94. Comparison with MLST revealed that both methods were complementary for strain typing in C. maltaromaticum.
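
A discriminatory index for a typing scheme is usually the probability that two strains drawn at random have different genotypes. The sketch below assumes the Hunter-Gaston form, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)), which the abstract does not name explicitly; the function names and the representation of a genotype as a tuple of three allele labels are illustrative.

```python
from collections import Counter


def rst_genotype(allele_a, allele_b, allele_c):
    """Combine the alleles at the three VNTR loci into one genotype key."""
    return (allele_a, allele_b, allele_c)


def discriminatory_index(genotypes) -> float:
    """Hunter-Gaston index: chance that two random strains differ in genotype."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two strains")
    same_pairs = sum(c * (c - 1) for c in Counter(genotypes).values())
    return 1 - same_pairs / (n * (n - 1))
```

The index depends on how strains are spread across genotypes, not just on the number of genotypes, so the same 15 genotypes could give different values for different strain collections.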

  19. Complete nucleotide sequences of two adjacent early vaccinia virus genes located within the inverted terminal repetition.

    Science.gov (United States)

    Venkatesan, S; Gershowitz, A; Moss, B

    1982-11-01

    The proximal part of the 10,000-base pair (bp) inverted terminal repetition of vaccinia virus DNA encodes at least three early mRNAs. A 2,236-bp segment of the repetition was sequenced to characterize two of the genes. This task was facilitated by constructing a series of recombinants containing overlapping deletions; oligonucleotide linkers with synthetic restriction sites provided points for radioactive labeling before sequencing by the chemical degradation method of Maxam and Gilbert (Methods Enzymol. 65:499-560, 1980). The ends of the transcripts were mapped by hybridizing labeled DNA fragments to early viral RNA and resolving nuclease S1-protected fragments in sequencing gels, by sequencing cDNA clones, and from the lengths of the RNAs. The nucleotide sequences for at least 60 bp upstream of both transcriptional initiation sites are more than 80% adenine . thymine rich and contain long runs of adenines and thymines with some homology to procaryotic and eucaryotic consensus sequences. The gene transcribed in the rightward direction encodes an RNA of approximately 530 nucleotides with a single open reading frame of 420 nucleotides. Preceding the first AUG, there is a heptanucleotide that can hybridize to the 3' end of 18S rRNA with only one mismatch. The derived amino acid sequence of the protein indicated a molecular weight of 15,500. The gene transcribed in the leftward direction encodes an RNA 1,000 to 1,100 nucleotides long with an open reading frame of 996 nucleotides and a leader sequence of only 5 to 6 nucleotides. The derived amino acid sequence of this protein indicated a molecular weight of 38,500. The 3' ends of the two transcripts were located within 100 bp of each other. Although there are adenine . thymine-rich clusters near the putative transcriptional termination sites, specific AATAAA polyadenylic acid signal sequences are absent.

  20. A blackberry (Rubus L.) expressed sequence tag library for the development of simple sequence repeat markers

    Science.gov (United States)

    A blackberry (Rubus L.) expressed sequence tag (EST) library was produced for developing simple sequence repeat (SSR) markers from the tetraploid blackberry cultivar, Merton Thornless, the source of the thornless trait in commercial cultivars. RNA was extracted from young expanding leaves and used f...

  1. Nucleotide sequence of the structural gene for tryptophanase of Escherichia coli K-12.

    OpenAIRE

    Deeley, M C; Yanofsky, C

    1981-01-01

    The tryptophanase structural gene, tnaA, of Escherichia coli K-12 was cloned and sequenced. The size, amino acid composition, and sequence of the protein predicted from the nucleotide sequence agree with protein structure data previously acquired by others for the tryptophanase of E. coli B. Physiological data indicated that the region controlling expression of tnaA was present in the cloned segment. Sequence data suggested that a second structural gene of unknown function was located distal ...

  2. Steganalytic method based on short and repeated sequence distance statistics

    Institute of Scientific and Technical Information of China (English)

    WANG GuoXin; PING XiJian; XU ManKun; ZHANG Tao; BAO XiRui

    2008-01-01

    According to the distribution characteristics of short and repeated sequence (SRS), a steganalytic method based on the correlation of image bit planes is proposed. First, we introduce the concept of SRS distance statistics and deduce its statistical distribution. Because the SRS distance statistics can effectively reflect the correlation of the sequence, SRS has statistical features when the image bit plane sequence equals the image width. Using this characteristic, the steganalytic method is fulfilled by the distinct test of Poisson distribution. Experimental results show a good performance for detecting LSB matching steganographic method in still images. Moreover, the proposed method is not designed for specific steganographic algorithms and has good generality.

  3. Expressed Sequence Tag-Simple Sequence Repeat (EST-SSR) Marker Resources for Diversity Analysis of Mango (Mangifera indica L.)

    Directory of Open Access Journals (Sweden)

    Natalie L. Dillon

    2014-01-01

    Full Text Available In this study, a collection of 24,840 expressed sequence tags (ESTs) generated from five mango (Mangifera indica L.) cDNA libraries was mined for EST-based simple sequence repeat (SSR) markers. Over 1,000 ESTs with SSR motifs were detected from more than 24,000 EST sequences with di- and tri-nucleotide repeat motifs the most abundant. Of these, 25 EST-SSRs in genes involved in plant development, stress response, and fruit color and flavor development pathways were selected, developed into PCR markers and characterized in a population of 32 mango selections including M. indica varieties, and related Mangifera species. Twenty-four of the 25 EST-SSR markers exhibited polymorphisms, identifying a total of 86 alleles with an average of 5.38 alleles per locus, and distinguished between all Mangifera selections. Private alleles were identified for Mangifera species. These newly developed EST-SSR markers enhance the current 11 SSR mango genetic identity panel utilized by the Australian Mango Breeding Program. The current panel has been used to identify progeny and parents for selection and the application of this extended panel will further improve and help to design mango hybridization strategies for increased breeding efficiency.

  4. Genome-wide analysis of simple sequence repeats in the model medicinal mushroom Ganoderma lucidum.

    Science.gov (United States)

    Qian, Jun; Xu, Haibin; Song, Jingyuan; Xu, Jiang; Zhu, Yingjie; Chen, Shilin

    2013-01-10

    Simple sequence repeats (SSRs) or microsatellites are one of the most popular sources of genetic markers and play a significant role in gene function and genome organization. We identified SSRs in the genome of Ganoderma lucidum and analyzed their frequency and distribution in different genomic regions. We also compared the SSRs in G. lucidum with six other Agaricomycetes genomes: Coprinopsis cinerea, Laccaria bicolor, Phanerochaete chrysosporium, Postia placenta, Schizophyllum commune and Serpula lacrymans. Based on our search criteria, the total number of SSRs found ranged from 1206 to 6104 and covered from 0.04% to 0.15% of the fungal genomes. The SSR abundance was not correlated with the genome size, and mono- to tri-nucleotide repeats outnumbered other SSR categories in all of the species examined. In G. lucidum, a repertoire of 2674 SSRs was detected, with mono-nucleotides being the most abundant. SSRs were found in all genomic regions and were more abundant in non-coding regions than coding regions. The highest SSR relative abundance was found in introns (108 SSRs/Mb), followed by intergenic regions (84 SSRs/Mb). A total of 684 SSRs were found in the protein-coding sequences (CDSs) of 588 gene models, with 81.4% of them being tri- or hexa-nucleotides. After scanning for InterPro domains, 280 of these genes were successfully annotated, and 215 of them could be assigned to Gene Ontology (GO) terms. SSRs were also identified in 28 bioactive compound synthesis-related gene models, including one 3-hydroxy-3-methylglutaryl-CoA reductase (HMGR), three polysaccharide biosynthesis genes and 24 cytochrome P450 monooxygenases (CYPs). Primers were designed for the identified SSR loci, providing the basis for the future development of SSR markers of this medicinal fungus.
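
    The abstract refers to "our search criteria" without stating them. As a rough illustration of how perfect SSRs are commonly located, the sketch below scans a sequence with regular expressions. The minimum repeat counts in MIN_REPEATS and the function name find_ssrs are assumptions chosen for this example, not the thresholds or tooling used in the study.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, copies) for each perfect SSR in seq.

    Coordinates are 0-based and end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AA" or "AGAG"), which belong to a smaller class.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            copies = (match.end() - match.start()) // unit_len
            hits.append((match.start(), match.end(), motif, copies))
    return sorted(hits)


example = "GGC" + "A" * 12 + "CT" + "AG" * 7 + "TTC" + "CAG" * 5 + "T"
for start, end, motif, copies in find_ssrs(example):
    print(f"{motif} x {copies} at {start}-{end}")
```

    Relative abundance figures such as SSRs/Mb then follow by dividing the number of hits in a region by the region length in megabases.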

  5. Should nucleotide sequence analyzing computer algorithms always extend homologies by extending homologies?

    Science.gov (United States)

    Burnett, L; Basten, A; Hensley, W J

    1986-01-10

    Most computer algorithms used for comparing or aligning nucleotide sequences rely on the premise that the best way to extend a homology between the two sequences is to select a match rather than a mismatch. We have tested this assumption and found that it is not always valid.

  6. Organization and Nucleotide Sequences of Two Lactococcal Bacteriocin Operons

    NARCIS (Netherlands)

    Belkum, Marco J. van; Hayema, Bert Jan; Jeeninga, Rienk E.; Kok, Jan; Venema, Gerard

    1991-01-01

    Two distinct regions of the Lactococcus lactis subsp. cremoris 9B4 plasmid p9B4-6, each of which specified bacteriocin production as well as immunity, have been sequenced and analyzed by deletion and frameshift mutation analyses. On a 1.8-kb ScaI-ClaI fragment specifying low antagonistic activity, t

  7. Methods for making nucleotide probes for sequencing and synthesis

    Energy Technology Data Exchange (ETDEWEB)

    Church, George M; Zhang, Kun; Chou, Joseph

    2014-07-08

    Compositions and methods for making a plurality of probes for analyzing a plurality of nucleic acid samples are provided. Compositions and methods for analyzing a plurality of nucleic acid samples to obtain sequence information in each nucleic acid sample are also provided.

  8. Mayaro virus: complete nucleotide sequence and phylogenetic relationships with other alphaviruses.

    Science.gov (United States)

    Lavergne, Anne; de Thoisy, Benoît; Lacoste, Vincent; Pascalis, Hervé; Pouliquen, Jean-François; Mercier, Véronique; Tolou, Hugues; Dussart, Philippe; Morvan, Jacques; Talarmin, Antoine; Kazanji, Mirdad

    2006-05-01

    Mayaro (MAY) virus is a member of the genus Alphavirus in the family Togaviridae. Alphaviruses are distributed throughout the world and cause a wide range of diseases in humans and animals. Here, we determined the complete nucleotide sequence of MAY from a viral strain isolated from a French Guianese patient. The deduced MAY genome was 11,429 nucleotides in length, excluding the 5' cap nucleotide and 3' poly(A) tail. Nucleotide and amino acid homologies, as well as phylogenetic analyses of the obtained sequence confirmed that MAY is not a recombinant virus and belongs to the Semliki Forest complex according to the antigenic complex classification. Furthermore, analyses based on the E1 region revealed that MAY is closely related to Una virus, the only other South American virus clustering with the Old World viruses. On the basis of our results and of the alphaviruses diversity and pathogenicity, we suggest that alphaviruses may have an Old World origin.

  9. Transcription-induced CAG repeat contraction in human cells is mediated in part by transcription-coupled nucleotide excision repair.

    Science.gov (United States)

    Lin, Yunfu; Wilson, John H

    2007-09-01

    Expansions of CAG repeat tracts in the germ line underlie several neurological diseases. In human patients and mouse models, CAG repeat tracts display an ongoing instability in neurons, which may exacerbate disease symptoms. It is unclear how repeats are destabilized in nondividing cells, but it cannot involve DNA replication. We showed previously that transcription through CAG repeats induces their instability (Y. Lin, V. Dion, and J. H. Wilson, Nat. Struct. Mol. Biol. 13:179-180). Here, we present a genetic analysis of the link between transcription-induced repeat instability and nucleotide excision repair (NER) in human cells. We show that short interfering RNA-mediated knockdown of CSB, a component specifically required for transcription-coupled NER (TC-NER), and knockdowns of ERCC1 and XPG, which incise DNA adjacent to damage, stabilize CAG repeat tracts. These results suggest that TC-NER is involved in the pathway for transcription-induced CAG repeat instability. In contrast, knockdowns of OGG1 and APEX1, key components involved in base excision repair, did not affect repeat instability. In addition, repeats are stabilized by knockdown of transcription factor IIS, consistent with a requirement for RNA polymerase II (RNAPII) to backtrack from a transcription block. Repeats also are stabilized by knockdown of either BRCA1 or BARD1, which together function as an E3 ligase that can ubiquitinate arrested RNAPII. Treatment with the proteasome inhibitor MG132, which stabilizes repeats, confirms proteasome involvement. We integrate these observations into a tentative pathway for transcription-induced CAG repeat instability that can account for the contractions observed here and potentially for the contractions and expansions seen with human diseases.

  10. Applications of High-Throughput Nucleotide Sequencing (PhD)

    DEFF Research Database (Denmark)

    Waage, Johannes

    The recent advent of high throughput sequencing of nucleic acids (RNA and DNA) has vastly expanded research into the functional and structural biology of the genome of all living organisms (and even a few dead ones). With this enormous and exponential growth in biological data generation come......-sequencing, a study of the effects on alternative RNA splicing of KO of the nonsense mediated RNA decay system in Mus, using digital gene expression and a custom-built exon-exon junction mapping pipeline is presented (article I). Evolved from this work, a Bioconductor package, spliceR, for classifying alternative...... splicing events and coding potential of isoforms from full isoform deconvolution software, such as Cufflinks (article II), is presented. Finally, a study using 5’-end RNA-seq for alternative promoter detection between healthy patients and patients with acute promyelocytic leukemia is presented (article III...

  11. Quantifying single nucleotide variant detection sensitivity in exome sequencing

    OpenAIRE

    Meynert, Alison; Bicknell, Louise; Jackson, Andrew; Taylor, Martin S.

    2013-01-01

    BACKGROUND: The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage and this is expected to systematically impact our sensitivity to detect genuine polymorphisms. To fully interpret the polymorphisms identified in a genetic study it is often essential to both detect polymorphisms and to understand where and with what probability real polymorphisms may have been missed...

  12. The structural basis of actinomycin D-binding induces nucleotide flipping out, a sharp bend and a left-handed twist in CGG triplet repeats.

    Science.gov (United States)

    Lo, Yu-Sheng; Tseng, Wen-Hsuan; Chuang, Chien-Ying; Hou, Ming-Hon

    2013-04-01

    The potent anticancer drug actinomycin D (ActD) functions by intercalating into DNA at GpC sites, thereby interrupting essential biological processes including replication and transcription. Certain neurological diseases are correlated with the expansion of (CGG)n trinucleotide sequences, which contain many contiguous GpC sites separated by a single G:G mispair. To characterize the binding of ActD to CGG triplet repeat sequences, the structural basis for the strong binding of ActD to neighbouring GpC sites flanking a G:G mismatch has been determined based on the crystal structure of ActD bound to ATGCGGCAT, which contains a CGG triplet sequence. The binding of ActD molecules to GCGGC causes many unexpected conformational changes including nucleotide flipping out, a sharp bend and a left-handed twist in the DNA helix via a two site-binding model. Heat denaturation, circular dichroism and surface plasmon resonance analyses showed that adjacent GpC sequences flanking a G:G mismatch are preferred ActD-binding sites. In addition, ActD was shown to bind the hairpin conformation of (CGG)16 in a pairwise combination and with greater stability than that of other DNA intercalators. Our results provide evidence of a possible biological consequence of ActD binding to CGG triplet repeat sequences.

  13. Complete nucleotide sequence of Alfalfa mosaic virus isolated from alfalfa (Medicago sativa L.) in Argentina.

    Science.gov (United States)

    Trucco, Verónica; de Breuil, Soledad; Bejerman, Nicolás; Lenardon, Sergio; Giolitti, Fabián

    2014-06-01

    The complete nucleotide sequence of an Alfalfa mosaic virus (AMV) isolate infecting alfalfa (Medicago sativa L.) in Argentina, AMV-Arg, was determined. The virus genome has the typical organization described for AMV, and comprises 3,643, 2,593, and 2,038 nucleotides for RNA1, 2 and 3, respectively. The whole genome sequence and each encoding region were compared with those of four other isolates that have been completely sequenced from China, Italy, Spain and USA. The nucleotide identity percentages ranged from 95.9 to 99.1 % for the three RNAs and from 93.7 to 99 % for the protein 1 (P1), protein 2 (P2), movement protein and coat protein (CP) encoding regions, whereas the amino acid identity percentages of these proteins ranged from 93.4 to 99.5 %, the lowest value corresponding to P2. CP sequences of AMV-Arg were compared with those of 25 other available isolates, and a phylogenetic analysis based on the CP gene was carried out. The highest percentage of nucleotide sequence identity of the CP gene was 98.3 % with a Chinese isolate and 98.6 % at the amino acid level with four isolates, two from Italy, one from Brazil and the remaining one from China. The phylogenetic analysis showed that AMV-Arg is closely related to subgroup I of AMV isolates. To our knowledge, this is the first report of a complete nucleotide sequence of AMV from South America and the first worldwide report of a complete nucleotide sequence of AMV isolated from alfalfa as natural host.

  14. Analysis of Short Tandem Repeat and Single Nucleotide Polymorphism Loci From Single-Source Samples Using a Custom HaloPlex Target Enrichment System Panel.

    Science.gov (United States)

    Wendt, Frank R; Zeng, Xiangpei; Churchill, Jennifer D; King, Jonathan L; Budowle, Bruce

    2016-06-01

    Short tandem repeats and single nucleotide polymorphisms (SNPs) are used to individualize biological evidence samples. Short tandem repeat alleles are characterized by size separation during capillary electrophoresis (CE). Massively parallel sequencing (MPS) offers an alternative that can overcome limitations of the CE. With MPS, libraries are prepared for each sample, entailing target enrichment and bar coding, purification, and normalization. The HaloPlex Target Enrichment System (Agilent Technologies) uses a capture-based enrichment system with restriction enzyme digestion to generate fragments containing custom-selected markers. It offers another possible workflow for typing reference samples. Its efficacy was assessed using a panel of 275 human identity SNPs, 88 short tandem repeats, and amelogenin. The data analyzed included locus typing success, depth of sequence coverage, heterozygote balance, and concordance. The results indicate that the HaloPlex Target Enrichment System provides genetic data similar to that obtained by conventional polymerase chain reaction-CE methods with the advantage of analyzing substantially more markers in 1 sequencing run. The genetic typing performance of HaloPlex is comparable to other MPS-based sample preparation systems that utilize primer-based target enrichment.

  15. Complete nucleotide sequence and transcriptional analysis of snakehead fish retrovirus.

    Science.gov (United States)

    Hart, D; Frerichs, G N; Rambaut, A; Onions, D E

    1996-06-01

    The complete genome of the snakehead fish retrovirus has been cloned and sequenced, and its transcriptional profile in cell culture has been determined. The 11.2-kb provirus displays a complex expression pattern capable of encoding accessory proteins and is unique in the predicted location of the env initiation codon and signal peptide upstream of gag and the common splice donor site. The virus is distinguishable from all known retrovirus groups by the presence of an arginine tRNA primer binding site. The coding regions are highly divergent and show a number of unusual characteristics, including a large Gag coiled-coil region, a Pol domain of unknown function, and a long, lentiviral-like, Env cytoplasmic domain. Phylogenetic analysis of the Pol sequence emphasizes the divergent nature of the virus from the avian and mammalian retroviruses. The snakehead virus is also distinct from a previously characterized complex fish retrovirus, suggesting that discrete groups of these viruses have yet to be identified in the lower vertebrates.

  16. Statistical analysis of nucleotide runs in coding and noncoding DNA sequences.

    Science.gov (United States)

    Sprizhitsky, Yu A; Nechipurenko, Yu D; Alexandrov, A A; Volkenstein, M V

    1988-10-01

    A statistical analysis of the occurrence of particular nucleotide runs in DNA sequences of different species has been carried out. There are considerable differences of run distributions in DNA sequences of procaryotes, invertebrates and vertebrates. There is an abundance of short runs (1-2 nucleotides long) in the coding sequences and there is a deficiency of such runs in the noncoding regions. However, some interesting exceptions from this rule exist for the run distribution of adenine in procaryotes and for the arrangement of purine-pyrimidine runs in eucaryotes. The similarity in the distributions of such runs in the coding and noncoding regions may be due to some structural features of the DNA molecule as a whole. Runs of guanine (or cytosine) of three to six nucleotides occur predominantly in noncoding DNA regions in eucaryotes, especially in vertebrates.
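
    A run here is a maximal stretch of identical consecutive nucleotides (or, for purine-pyrimidine runs, of the same base class). The sketch below shows one way such run-length distributions could be tabulated. The function names are hypothetical and the abstract does not describe the authors' actual procedure or statistics.

```python
from collections import Counter
from itertools import groupby

# Map each base to its class: purines (A, G) -> R, pyrimidines (C, T) -> Y.
PURINE_PYRIMIDINE = str.maketrans("AGCT", "RRYY")


def run_length_distribution(seq):
    """Return {symbol: Counter({run_length: number_of_runs})} for seq."""
    dist = {}
    for symbol, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        dist.setdefault(symbol, Counter())[length] += 1
    return dist


def purine_pyrimidine_runs(seq):
    """Run-length distribution after collapsing bases to R/Y classes."""
    return run_length_distribution(seq.upper().translate(PURINE_PYRIMIDINE))


example = "AAGGGCTTTTA"
print(run_length_distribution(example))
print(purine_pyrimidine_runs(example))
```

    Comparing such distributions between coding and noncoding sequences would additionally need a null model, for example the run lengths expected for a random sequence with the same base composition.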

  17. Development of simple sequence repeat markers and diversity analysis in alfalfa (Medicago sativa L.).

    Science.gov (United States)

    Wang, Zan; Yan, Hongwei; Fu, Xinnian; Li, Xuehui; Gao, Hongwen

    2013-04-01

    Efficient and robust molecular markers are essential for molecular breeding in plants. Compared to dominant and bi-allelic markers, multiple alleles of simple sequence repeat (SSR) markers are particularly informative and superior in genetic linkage map and QTL mapping in autotetraploid species like alfalfa. The objective of this study was to enrich SSR markers directly from alfalfa expressed sequence tags (ESTs). A total of 12,371 alfalfa ESTs were retrieved from the National Center for Biotechnology Information. Total 774 SSR-containing ESTs were identified from 716 ESTs. On average, one SSR was found per 7.7 kb of EST sequences. Tri-nucleotide repeats (48.8 %) were the most abundant motif type, followed by di- (26.1 %), tetra- (11.5 %), penta- (9.7 %), and hexanucleotide (3.9 %) repeats. One hundred EST-SSR primer pairs were successfully designed and 29 exhibited polymorphism among 28 alfalfa accessions. The allele number per marker ranged from two to 21 with an average of 6.8. The PIC values ranged from 0.195 to 0.896 with an average of 0.608, indicating a high level of polymorphism of the EST-SSR markers. Based on the 29 EST-SSR markers, genetic diversity was assessed, and Medicago sativa ssp. sativa was found to be clearly different from the other subspecies. High transferability of these EST-SSR markers to related species was also found.
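
    The PIC (polymorphism information content) values quoted above are usually computed from allele frequencies with the formula of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The sketch below assumes that definition. The function name and the example frequencies are hypothetical, and estimating allele frequencies in an autotetraploid such as alfalfa involves further steps that are not shown.

```python
def polymorphism_information_content(freqs):
    """PIC of one marker from its allele frequencies (which must sum to 1)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    squares = [p * p for p in freqs]
    homozygosity = sum(squares)
    # Correction for matings that are uninformative despite heterozygosity.
    cross_term = sum(
        2 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return 1.0 - homozygosity - cross_term


# Hypothetical allele frequencies, for illustration only.
print(polymorphism_information_content([0.5, 0.5]))
print(polymorphism_information_content([0.25, 0.25, 0.25, 0.25]))
```

    With two equally frequent alleles the value is 0.375, the maximum for a bi-allelic marker, which is why multi-allelic SSRs can reach the higher values reported in the abstract.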

  18. Exploiting BAC-end sequences for the mining, characterization and utility of new short sequences repeat (SSR) markers in Citrus.

    Science.gov (United States)

    Biswas, Manosh Kumar; Chai, Lijun; Mayer, Christoph; Xu, Qiang; Guo, Wenwu; Deng, Xiuxin

    2012-05-01

    The aim of this study was to develop a large set of microsatellite markers based on publicly available BAC-end sequences (BESs), and to evaluate their transferability, genotype-discriminating capacity and mapping ability in Citrus. A set of 1,281 simple sequence repeat (SSR) markers was developed from the 46,339 Citrus clementina BESs, of which 20.67% contained SSRs longer than 20 bp, corresponding to roughly one perfect SSR per 2.04 kb. The most abundant motifs were di-nucleotide (16.82%) repeats. Among all repeat motifs, (TA/AT)n was the most abundant (8.38%), followed by (AG/CT)n (4.51%). Most of the BES-SSRs are located in non-coding regions, but 1.3% of BES-SSRs were found to be associated with transposable elements (TEs). A total of 400 novel SSR primer pairs were synthesized and their transferability and polymorphism tested on a set of 16 Citrus and Citrus-relative species. Among these, 333 (83.25%) were successfully amplified and 260 (65.00%) showed cross-species transferability with Poncirus trifoliata and Fortunella sp. These cross-species transferable markers could be useful for cultivar identification and for genomic study of Citrus, Poncirus and Fortunella sp. The utility of the developed SSR markers was demonstrated by identifying a set of 118 markers each for construction of linkage maps of Citrus reticulata and Poncirus trifoliata. Genetic diversity and phylogenetic relationships among 40 Citrus and related species were assessed with the aid of 25 randomly selected SSR primer pairs, and the results revealed that citrus genomic SSRs are superior to genic SSRs for genetic diversity and germplasm characterization of Citrus spp.

  19. Cloning and nucleotide sequence of wild type and a mutant histidine decarboxylase from Lactobacillus 30a.

    Science.gov (United States)

    Vanderslice, P; Copeland, W C; Robertus, J D

    1986-11-15

    Prohistidine decarboxylase from Lactobacillus 30a is a protein that autoactivates to histidine decarboxylase by cleaving its peptide chain between serines 81 and 82 and converting Ser-82 to a pyruvoyl moiety. The pyruvoyl group serves as the prosthetic group for the decarboxylation reaction. We have cloned and determined the nucleotide sequence of the gene for this enzyme from a wild type strain and from a mutant with altered autoactivation properties. The nucleotide sequence modifies the previously determined amino acid sequence of the protein. A tripeptide missed in the chemical sequence is inserted, and three other amino acids show conservative changes. The activation mutant shows a single change of Gly-58 to an Asp. Sequence analysis up- and downstream from the gene suggests that histidine decarboxylase is part of a polycistronic message, and that the transcriptional promoter region is strongly homologous to those of other Gram-positive organisms.

  20. Nucleotide composition of CO1 sequences in Chelicerata (Arthropoda): detecting new mitogenomic rearrangements.

    Science.gov (United States)

    Arabi, Juliette; Judson, Mark L I; Deharveng, Louis; Lourenço, Wilson R; Cruaud, Corinne; Hassanin, Alexandre

    2012-02-01

    Here we study the evolution of nucleotide composition in third codon-positions of CO1 sequences of Chelicerata, using a phylogenetic framework, based on 180 taxa and three markers (CO1, 18S, and 28S rRNA; 5,218 nt). The analyses of nucleotide composition were also extended to all CO1 sequences of Chelicerata found in GenBank (1,701 taxa). The results show that most species of Chelicerata have a positive strand bias in CO1, i.e., in favor of C nucleotides, including all Amblypygi, Palpigradi, Ricinulei, Solifugae, Uropygi, and Xiphosura. However, several taxa show a negative strand bias, i.e., in favor of G nucleotides: all Scorpiones, Opisthothelae spiders and several taxa within Acari, Opiliones, Pseudoscorpiones, and Pycnogonida. Several reversals of strand-specific bias can be attributed to either a rearrangement of the control region or an inversion of a fragment containing the CO1 gene. Key taxa for which sequencing of complete mitochondrial genomes will be necessary to determine the origin and nature of mtDNA rearrangements involved in the reversals are identified. Acari, Opiliones, Pseudoscorpiones, and Pycnogonida were found to show a strong variability in nucleotide composition. In addition, both mitochondrial and nuclear genomes have been affected by higher substitution rates in Acari and Pseudoscorpiones. The results therefore indicate that these two orders are more liable to fix mutations of all types, including base substitutions, indels, and genomic rearrangements.
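
    The strand bias described above can be summarized as a skew between C and G at third codon positions. The sketch below assumes the skew is defined as (C - G) / (C + G), so that a positive value means a bias in favor of C, matching the sign convention in the abstract. The function name is hypothetical, the input is assumed to start at the first position of a codon, and the authors' exact statistic is not given in this record.

```python
def third_position_cg_skew(cds):
    """CG skew, (C - G) / (C + G), over third codon positions of an
    in-frame coding sequence. Positive values indicate a bias toward C."""
    third_positions = cds.upper()[2::3]  # every third base, starting at index 2
    c = third_positions.count("C")
    g = third_positions.count("G")
    if c + g == 0:
        raise ValueError("no C or G at third codon positions")
    return (c - g) / (c + g)


# Toy in-frame fragment (codons ATC GGC TTC AAG), not real CO1 data.
print(third_position_cg_skew("ATCGGCTTCAAG"))
```

    Mapping the sign of this value onto a phylogeny is one way to visualize where reversals of strand bias occur.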

  1. Single nucleotide polymorphism mining and nucleotide sequence analysis of Mx1 gene in exonic regions of Japanese quail

    Directory of Open Access Journals (Sweden)

    Diwesh Kumar Niraj

    2015-12-01

    Full Text Available Aim: An attempt has been made to study the Myxovirus resistant (Mx1) gene polymorphism in Japanese quail. Materials and Methods: In the present investigation, four fragments, viz. Fragment I of 185 bp (Exon 3 region), Fragment II of 148 bp (Exon 5 region), Fragment III of 161 bp (Exon 7 region), and Fragment IV of 176 bp (Exon 13 region) of the Mx1 gene, were amplified and screened for polymorphism by polymerase chain reaction-single-strand conformation polymorphism technique in 170 Japanese quail birds. Results: Out of the four fragments, one fragment (Fragment II) was found to be polymorphic. The remaining three fragments (Fragments I, III, and IV) were found to be monomorphic, which was confirmed by custom sequencing. Overall nucleotide sequence analysis of the Mx1 gene of Japanese quail showed 100% homology with common quail and more than 80% homology with reported sequences of chicken breeds. Conclusion: The Mx1 gene is mostly conserved in Japanese quail. There is an urgent need for comprehensive analysis of other regions of the Mx1 gene along with its possible association with traits of economic importance in Japanese quail.

  2. Complete Nucleotide Sequence of a Newly Avirulent Newcastle Disease Virus Hubei 92(HB92) Strain

    Institute of Scientific and Technical Information of China (English)

    Pan Zi-shu; Chen Yu-dong; Shao Hua-bin; Yang Jun; Xiong Zhong-liang; Wen Guo-yuan; Zhang Chu-yu

    2004-01-01

    A new avirulent, heat-resistant HB92 strain of Newcastle disease virus (NDV) was acquired from the Australia V4 strain. Its complete nucleotide sequence was determined for the first time. The entire genome of NDV HB92 consists of 15,186 nucleotides (GenBank accession no. AY225110). Sequence analysis demonstrated that the nucleotide homology of the HB92 strain with the virulent strain ZJ1 was 83.9%, and the homology with the avirulent vaccine strains La Sota and BI was 94.0% and 93.5%, respectively. These results may contribute to the study of the molecular mechanism of evolution of the NDV strain HB92 and to the development of an engineered recombinant vaccine.

  3. Complete nucleotide sequence of the new potexvirus "Alstroemeria virus X". Brief report.

    Science.gov (United States)

    Fuji, S; Shinoda, K; Ikeda, M; Furuya, H; Naito, H; Fukumoto, F

    2005-11-01

    A flexuous virus was isolated in Japan from an alstroemeria plant showing mosaic symptoms. The virus had a broad host range but had systemically latent infectivity in alstroemeria. The virus was assigned to the genus Potexvirus based on morphology and physical properties and on an analysis of the complete nucleotide sequence. The genomic RNA of the virus was 7,009 nucleotides in length, excluding the 3'-terminal poly (A) tail. It contained five open reading frames (ORFs), which was consistent with other members of the genus Potexvirus. Although nucleotide sequences of the ORFs differ from previously reported potexviruses, a phylogenetic analysis placed it phylogenetically close to Narcissus mosaic virus and Scallion virus X. Therefore, we propose that this virus should be designated as Alstroemeria virus X (AlsVX).

  4. Global expression analysis of nucleotide binding site-leucine rich repeat-encoding and related genes in Arabidopsis

    Directory of Open Access Journals (Sweden)

    St Clair Dina A

    2007-10-01

    Full Text Available Abstract Background Nucleotide binding site-leucine rich repeat (NBS-LRR-encoding genes comprise the largest class of plant disease resistance genes. The 149 NBS-LRR-encoding genes and the 58 related genes that do not encode LRRs represent approximately 0.8% of all ORFs so far annotated in Arabidopsis ecotype Col-0. Despite their prevalence in the genome and functional importance, there was little information regarding expression of these genes. Results We analyzed the expression patterns of ~170 NBS-LRR-encoding and related genes in Arabidopsis Col-0 using multiple analytical approaches: expressed sequenced tag (EST representation, massively parallel signature sequencing (MPSS, microarray analysis, rapid amplification of cDNA ends (RACE PCR, and gene trap lines. Most of these genes were expressed at low levels with a variety of tissue specificities. Expression was detected by at least one approach for all but 10 of these genes. The expression of some but not the majority of NBS-LRR-encoding and related genes was affected by salicylic acid (SA treatment; the response to SA varied among different accessions. An analysis of previously published microarray data indicated that ten NBS-LRR-encoding and related genes exhibited increased expression in wild-type Landsberg erecta (Ler after flagellin treatment. Several of these ten genes also showed altered expression after SA treatment, consistent with the regulation of R gene expression during defense responses and overlap between the basal defense response and salicylic acid signaling pathways. Enhancer trap analysis indicated that neither jasmonic acid nor benzothiadiazole (BTH, a salicylic acid analog, induced detectable expression of the five NBS-LRR-encoding genes and one TIR-NBS-encoding gene tested; however, BTH did induce detectable expression of the other TIR-NBS-encoding gene analyzed. Evidence for alternative mRNA polyadenylation sites was observed for many of the tested genes. 

  5. Complete nucleotide sequence of a begomovirus and associated betasatellite infecting croton (Croton bonplandianus) in Pakistan.

    Science.gov (United States)

    Hussain, Khadim; Hussain, Mazhar; Mansoor, Shahid; Briddon, Rob W

    2011-06-01

    The complete sequences of a begomovirus and an associated betasatellite isolated from Croton bonplandianus originating from Pakistan were determined. The sequence of the begomovirus showed the highest level of nucleotide sequence identity (88.9%) to an isolate of papaya leaf curl virus and thus represents a new species, for which we propose the name Croton yellow vein virus (CYVV). The sequence of the betasatellite showed the highest levels of sequence identity (82 to 98.4%) to six sequences in the databases that have yet to be reported, followed by isolates of tomato leaf curl Joydebpur betasatellite (48.7 to 52.5%). This indicates that the betasatellite identified here (and the six sequences in the databases) is an isolate of a newly identified species for which the name Croton yellow vein mosaic betasatellite (CroYVMB) is proposed. For the begomovirus, an analysis of the sequence indicates that it has a recombinant origin.

  6. Simple sequence repeat map of the sunflower genome.

    Science.gov (United States)

    Tang, S.; Yu, J.-K.; Slabaugh, B.; Shintani, K.; Knapp, J.

    2002-12-01

    Several independent molecular genetic linkage maps of varying density and completeness have been constructed for cultivated sunflower (Helianthus annuus L.). Because of the dearth of sequence and probe-specific DNA markers in the public domain, the various genetic maps of sunflower have not been integrated and a single reference map has not emerged. Moreover, comparisons between maps have been confounded by multiple linkage group nomenclatures and the lack of common DNA markers. The goal of the present research was to construct a dense molecular genetic linkage map for sunflower using simple sequence repeat (SSR) markers. First, 879 SSR markers were developed by identifying 1,093 unique SSR sequences in the DNA sequences of 2,033 clones isolated from genomic DNA libraries enriched for (AC)(n) or (AG)(n) and screening 1,000 SSR primer pairs; 579 of the newly developed SSR markers (65.9% of the total) were polymorphic among four elite inbred lines (RHA280, RHA801, PHA and PHB). The genetic map was constructed using 94 RHA280 x RHA801 F(7) recombinant inbred lines (RILs) and 408 polymorphic SSR markers (462 SSR marker loci segregated in the mapping population). Of the latter, 459 coalesced into 17 linkage groups presumably corresponding to the 17 chromosomes in the haploid sunflower genome (x = 17). The map was 1,368.3-cM long and had a mean density of 3.1 cM per locus. The SSR markers described herein supply a critical mass of DNA markers for constructing genetic maps of sunflower and create the basis for unifying and cross-referencing the multitude of genetic maps developed for wild and cultivated sunflowers.

  7. Dependence of the E. coli promoter strength and physical parameters upon the nucleotide sequence

    Institute of Scientific and Technical Information of China (English)

    BEREZHNOY Andrey Y.; SHCKORBATOV Yuriy G.

    2005-01-01

    The energy of interaction between complementary nucleotides in promoter sequences of E. coli was calculated and visualized. A graphic method for presenting the energy properties of promoter sequences was developed. The data indicated that the energy distribution along the length of the promoter sequence shows minima at the -35, -8 and +7 regions, corresponding to areas with elevated AT (adenine-thymine) content. The most important difference from random sequences is in the -8 region. Four promoter groups and their energy properties were revealed. The promoters with minimal and maximal energy of interaction between complementary nucleotides have low strengths; the strongest promoters correspond to promoter clusters characterized by intermediate energy values.

  8. The complete nucleotide sequence and genome organization of a novel betaflexivirus infecting Citrullus lanatus.

    Science.gov (United States)

    Xin, Min; Zhang, Peipei; Liu, Wenwen; Ren, Yingdang; Cao, Mengji; Wang, Xifeng

    2017-07-05

    The complete nucleotide sequence of a novel positive single-stranded (+ss) RNA virus, tentatively named watermelon virus A (WVA), was determined using a combination of three methods: RNA sequencing, small RNA sequencing, and Sanger sequencing. The full genome of WVA is comprised of 8,372 nucleotides (nt), excluding the poly (A) tail, and contains four open reading frames (ORFs). The largest ORF, ORF1 encodes a putative replication-associated polyprotein (RP) with three conserved domains. ORF2 and ORF4 encode a movement protein (MP) and coat protein (CP), respectively. The putative product encoded by ORF3, of an estimated molecular mass of 25 kDa, has no significant similarity with other proteins. Identity and phylogenetic analysis indicate that WVA is a new virus, closely related to members of the family Betaflexiviridae. However, the final taxonomic allocation of WVA within the family is yet to be determined.

  9. Evidence for integration of retroviral vectors in a novel human repeat sequence

    Energy Technology Data Exchange (ETDEWEB)

    Kurdi-Haidar, B.; Friedmann, T. [USCD School of Medicine, La Jolla, CA (United States)]

    1994-09-01

    Retroviruses have become attractive vehicles for the introduction of foreign genes into mammalian cells not only for gene therapy but also to serve as anchor points for long-range mapping purposes. The information relating to retroviral integration in mammalian cells is derived mostly from studies of rodent genomes. The absence of information regarding integration sites of murine-based retroviral vectors in human cells has prompted us to investigate the characteristics of integration sites in the human genome. We have constructed a Moloney murine leukemia virus-based retroviral vector that carries the pUC8 origin of replication and the chloramphenicol resistance gene to allow the rescue of the flanking genomic sequences in plasmid form. We have infected human primary fibroblasts and myoblasts with this retroviral vector and isolated independently transduced clones. Genomic DNA was obtained from independent clones and the genomic fragment carrying the provirus-host sequence boundary was isolated after digestion of the genomic DNA, circularization, and transformation by electroporation of E. coli C cells to chloramphenicol resistance. Restriction map and nucleotide sequence analysis of the rescued plasmids showed that a number of the clones shared the same integration site within the human genome. We have used the nucleotide sequence information about the human DNA adjacent to the 3′ LTR to design a PCR-based assay diagnostic for this common integration site. Analysis revealed the presence of the same integration site in four out of twelve human primary fibroblast clones infected with this specific retroviral vector, and in one out of twelve human primary myoblast clones infected with a second retroviral vector. Further analysis revealed the common integration site to be a previously unreported primate repeat present in monkey and human genomes and absent from rodent, bovine and avian genomes.

  10. Nucleotide sequence and genomic organization of an ophiovirus associated with lettuce big-vein disease

    NARCIS (Netherlands)

    Wilk, van der F.; Dullemans, A.M.; Verbeek, M.; Heuvel, van den J.F.J.M.

    2002-01-01

    The complete nucleotide sequence of an ophiovirus associated with lettuce big-vein disease has been elucidated. The genome consisted of four RNA molecules of approximately 7.8, 1.7, 1.5 and 1.4 kb. Virus particles were shown to contain nearly equimolar amounts of RNA molecules of both polarities.

  11. Nucleotide sequence and genomic organization of an ophiovirus associated with lettuce big-vein disease

    NARCIS (Netherlands)

    Wilk, van der F.; Dullemans, A.M.; Verbeek, M.; Heuvel, van den J.F.J.M.

    2002-01-01

    The complete nucleotide sequence of an ophiovirus associated with lettuce big-vein disease has been elucidated. The genome consisted of four RNA molecules of approximately 7.8, 1.7, 1.5 and 1.4 kb. Virus particles were shown to contain nearly equimolar amounts of RNA molecules of both polarities.

  12. Complete Nucleotide Sequence of a Citrobacter freundii Plasmid Carrying KPC-2 in a Unique Genetic Environment

    Science.gov (United States)

    Yao, Yancheng; Imirzalioglu, Can; Hain, Torsten; Kaase, Martin; Gatermann, Soeren; Exner, Martin; Mielke, Martin; Hauri, Anja; Dragneva, Yolanta; Bill, Rita; Wendt, Constanze; Wirtz, Angela; Chakraborty, Trinad

    2014-01-01

    The complete and annotated nucleotide sequence of a 54,036-bp plasmid harboring a blaKPC-2 gene that is clonally present in Citrobacter isolates from different species is presented. The plasmid belongs to incompatibility group N (IncN) and harbors the class A carbapenemase KPC-2 in a unique genetic environment. PMID:25395635

  13. Nucleotide sequence analysis of the lactococcal EPS plasmid pNZ4000

    NARCIS (Netherlands)

    Kranenburg, van R.; Kleerebezem, M.; Vos, de W.M.

    2000-01-01

    The complete 42180-bp nucleotide sequence of the mobilization plasmid pNZ4000, coding for exopolysaccharide (EPS) production in Lactococcus lactis, was determined. This plasmid contains a region involved in EPS biosynthesis, four functional replicons, a region containing mobilization genes, and thre

  14. Simple sequence repeats in Neurospora crassa: distribution, polymorphism and evolutionary inference

    Directory of Open Access Journals (Sweden)

    Park Jongsun

    2008-01-01

    Full Text Available Abstract Background Simple sequence repeats (SSRs) have been successfully used for various genetic and evolutionary studies in eukaryotic systems. The eukaryotic model organism Neurospora crassa is an excellent system to study evolution and biological function of SSRs. Results We identified and characterized 2749 SSRs of 963 SSR types in the genome of N. crassa. The distribution of tri-nucleotide (nt) SSRs, the most common SSRs in N. crassa, was significantly biased in exons. We further characterized the distribution of 19 abundant SSR types (AST), which account for 71% of total SSRs in the N. crassa genome, using a Poisson log-linear model. We also characterized the size variation of SSRs among natural accessions using Polymorphic Index Content (PIC) and ANOVA analyses and found that there are genome-wide, chromosome-dependent and local-specific variations. Using polymorphic SSRs, we have built linkage maps from three line-cross populations. Conclusion Taking our computational, statistical and experimental data together, we conclude that (1) the distributions of the SSRs in the sequenced N. crassa genome differ systematically between chromosomes as well as between SSR types, (2) the size variation of tri-nt SSRs in exons might be an important mechanism in generating functional variation of proteins in N. crassa, (3) there are different levels of evolutionary forces in variation of amino acid repeats, and (4) SSRs are stable molecular markers for genetic studies in N. crassa.
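The PIC statistic mentioned in this abstract can be illustrated with a short calculation. The sketch below assumes the widely used Botstein-style definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i are allele frequencies at one SSR locus; the abstract does not state which formula the authors applied, and the function names and example allele sizes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(allele_sizes):
    """Turn a list of observed allele sizes (one per accession) into frequencies."""
    counts = Counter(allele_sizes)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def pic(freqs):
    """Polymorphism information content for one locus (assumed Botstein-style formula)."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p ** 2) * (q ** 2) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical SSR allele sizes (bp) scored in eight accessions.
sizes = [150, 150, 153, 153, 156, 156, 159, 159]
print(round(pic(allele_frequencies(sizes)), 4))
```

A monomorphic locus gives a PIC of 0, and the value rises toward 1 as more alleles occur at similar frequencies.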

  15. Mining of haplotype-based expressed sequence tag single nucleotide polymorphisms in citrus

    OpenAIRE

    Chen, Chunxian; Gmitter Jr, Fred G

    2013-01-01

    Background Single nucleotide polymorphisms (SNPs), the most abundant variations in a genome, have been widely used in various studies. Detection and characterization of citrus haplotype-based expressed sequence tag (EST) SNPs will greatly facilitate further utilization of these gene-based resources. Results In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for...

  16. Analysis of simple sequence repeats markers derived from Phytophthora sojae expressed sequence tags

    Institute of Scientific and Technical Information of China (English)

    ZHU Zhendong; HUO Yunlong; WANG Xiaoming; HUANG Junbin; WU Xiaofei

    2004-01-01

    Five thousand and eight hundred publicly available expressed sequence tags (ESTs) of Phytophthora sojae were electronically searched and 415 simple sequence repeats (SSRs) were identified in 369 ESTs. The average density of SSRs was one SSR per 8.9 kb of EST sequence screened. The most frequent repeats were trinucleotide repeats (50.1%) and the least frequent were tetranucleotide repeats (8.2%). Forty primer pairs were designed and tested on 5 strains of P. sojae. Thirty-three primer pairs had successful PCR amplifications. Of the 33 functional primer pairs, 28 primer pairs produced characteristic SSR bands of the expected size, and 15 primer pairs (45.5%) detected polymorphism among 5 tested strains of P. sojae. Based on the polymorphisms detected with 20 EST-SSR markers, the 5 tested strains of P. sojae were clustered into 3 groups. In this study, the SSR markers of P. sojae were developed for the first time. These markers could be useful for identification, genetic variation study, and molecular mapping of P. sojae and its relative species.
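An electronic SSR search of the kind described in this abstract can be sketched with regular expressions. This is a minimal illustration, not the tool the authors used: the minimum repeat counts in `MIN_REPEATS`, the function names and the test sequence are all assumptions, and only perfect (uninterrupted) repeats are reported.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative only;
# the abstract does not state the thresholds that were used).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            if is_primitive(motif):  # skip e.g. AGAG, which is really (AG)n
                hits.append((m.start(), motif, len(m.group(1)) // k))
    return sorted(hits)


def kb_per_ssr(total_bp, n_ssrs):
    """Average kilobases of screened sequence per SSR found."""
    return total_bp / 1000 / n_ssrs


est = "TTT" + "AG" * 7 + "CCGT" + "CAG" * 5 + "TT"
print(find_ssrs(est))
```

With the figures quoted above, 415 SSRs at one per 8.9 kb would imply roughly 3.7 Mb of EST sequence screened (415 x 8.9 kb); that total is inferred here, not stated in the abstract.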

  17. The nucleotide sequence and genome structure of mung bean yellow mosaic geminivirus.

    Science.gov (United States)

    Morinaga, T; Ikegami, M; Miura, K

    1993-01-01

    Complete nucleotide sequences of the infectious cloned DNA components (DNA 1 and DNA 2) of mung bean yellow mosaic virus (MYMV) were determined. MYMV DNA 1 and DNA 2 consist of 2,723 and 2,675 nucleotides, respectively. DNA 1 and DNA 2 have little sequence similarity except for a region of approximately 200 bases which is almost identical in the two molecules. Analysis of open reading frames revealed nine potential coding regions for proteins of mol. wt. > 10,000, six in DNA 1 and three in DNA 2. The nucleotide sequence of MYMV DNA was compared with that of bean golden mosaic virus (BGMV), tomato golden mosaic virus (TGMV) and African cassava mosaic virus (ACMV). The 200-base region common to the two DNAs of each virus had little sequence similarity, except for a highly conserved 33-36 base sequence potentially capable of forming a stable hairpin structure. The potential coding regions in the MYMV DNAs had counterparts in the BGMV, TGMV and ACMV, suggesting an overall similarity in genome organization, except for absence of 1L3 in MYMV DNA 1. The most highly conserved ORFs, MYMV 1R1, BGMV 1R1, TGMV 1R1 and ACMV 1R1, are the putative genes for the coat proteins of MYMV, BGMV, TGMV and ACMV, respectively. MYMV 1L1 also has a high degree of sequence similarity with BGMV 1L1, TGMV 1L1 and ACMV 1L1.

  18. Nucleotide sequence of maize dwarf mosaic virus capsid protein gene and its expression in Escherichia coli

    Institute of Scientific and Technical Information of China (English)

    赛吉庆; 康良仪; 黄忠; 史春霖; 田波; 谢友菊

    1995-01-01

    The 3’-terminal 1,279-nucleotide sequence of the maize dwarf mosaic virus (MDMV) genome has been determined. This sequence contains an open reading frame of 1,023 nucleotides and a 3’-non-coding region of 256 nucleotides. The open reading frame includes all of the coding regions for the viral capsid protein (CP) and part of the viral nuclear inclusion protein (NIb). The predicted viral CP consists of 313 amino acid residues with a calculated molecular weight of 35,400. The amino acid sequence of the viral CP derived from MDMV cDNA shows about 47%-54% homology to that of 4 other potyviruses. The viral CP gene was constructed in frame with the lacZ gene in pUC19 plasmid and expressed in E. coli cells. The fusion polypeptide positively reacted in Western blot with an antiserum prepared against the native viral CP.

  19. Nucleotide sequencing and serologic analysis of Cache Valley virus isolates from the Yucatan Peninsula of Mexico.

    Science.gov (United States)

    Blitvich, Bradley J; Loroño-Pino, Maria A; Garcia-Rejon, Julian E; Farfan-Ale, Jose A; Dorman, Karin S

    2012-08-01

    Nucleotide sequencing was performed on part of the medium and large genome segments of 17 Cache Valley virus (CVV) isolates from the Yucatan Peninsula of Mexico. Alignment of these sequences to all other sequences in the GenBank database revealed that they have greatest nucleotide identity (97-98%) with the equivalent regions of Tlacotalpan virus (TLAV), which is considered to be a variety of CVV. Next, cross-plaque reduction neutralization tests (PRNTs) were performed using sera from mice that had been inoculated with a representative isolate from the Yucatan Peninsula (CVV-478) or the prototype TLAV isolate (61-D-240). The PRNT titers exhibited a twofold difference in one direction and no difference in the other direction, suggesting that CVV-478 and 61-D-240 belong to the same CVV subtype. In conclusion, we demonstrate that the CVV isolates from the Yucatan Peninsula of Mexico are genetically and antigenically similar to the prototype TLAV isolate.

  20. The Cipher Code of Simple Sequence Repeats in “Vampire Pathogens”

    Science.gov (United States)

    Zou, Geng; Bello-Orti, Bernardo; Aragon, Virginia; Tucker, Alexander W.; Luo, Rui; Ren, Pinxing; Bi, Dingren; Zhou, Rui; Jin, Hui

    2015-01-01

    Blood inside mammals is a forbidden area for the majority of prokaryotic microbes; however, red blood cells tropism microbes, like “vampire pathogens” (VP), succeed in matching scarce nutrients and surviving strong immunity reactions. Here, we found VP of Mycoplasma, Rhizobiales, and Rickettsiales showed significantly higher counts of (AG)n dimeric simple sequence repeats (Di-SSRs) in the genomes, coding and non-coding regions than non Vampire Pathogens (N_VP). Regression analysis indicated a significant correlation between GC content and the span of (AG)n-Di-SSR variation. Gene Ontology (GO) terms with abundance of (AG)3-Di-SSRs shared by the VP strains were associated with purine nucleotide metabolism (FDR < 0.01), indicating an adaptation to the limited availability of purine and nucleotide precursors in blood. Di-amino acids coded by (AG)n-Di-SSRs included all three six-fold code amino acids (Arg, Leu and Ser) and significantly higher counts of Di-amino acids coded by (AG)3, (GA)3, and (TC)3 in VP than N_VP. Furthermore, significant differences (P < 0.001) on the numbers of triplexes formed from (AG)n-Di-SSRs between VP and N_VP in Mycoplasma suggested the potential role of (AG)n-Di-SSRs in gene regulation. PMID:26215592

  1. Nucleotide sequences of chloroplast 5S ribosomal RNA from cell suspension cultures of the liverworts Marchantia polymorpha and Jungermannia subulata.

    OpenAIRE

    Yamano, Y; Ohyama, K; Komano, T

    1984-01-01

    The nucleotide sequences of chloroplast 5S rRNAs from cell suspension cultures of the liverworts Marchantia polymorpha and Jungermannia subulata were determined. Their nucleotide sequences, 119 nucleotides long, were highly homologous to each other (96% identity) and had high homology with those from chloroplast 5S rRNAs of two higher plants, tobacco (92% identity) and spinach (92-91% identity), but less homology (87-85% identity) with that from a lower plant, the fern Dryopteris acuminata.
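Percent-identity figures such as those quoted in this abstract are computed over aligned positions. The sketch below assumes two already-aligned sequences of equal length and skips any column containing a gap, which is only one of several conventions in use; the function name and the toy sequences are hypothetical and are not the published 5S rRNA sequences.

```python
def percent_identity(a, b, gap="-"):
    """Percent of identical residues over ungapped columns of a pairwise alignment."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # assumed convention: gapped columns are not counted
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example: two 119-nt sequences that differ at five positions.
seq_a = "A" * 119
seq_b = "A" * 114 + "C" * 5
print(round(percent_identity(seq_a, seq_b)))
```

For a 119-nucleotide molecule, about five mismatches therefore corresponds to the 96% identity level reported between the two liverwort sequences.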

  2. Complete nucleotide sequence analysis of a Dengue-1 virus isolated on Easter Island, Chile.

    Science.gov (United States)

    Cáceres, C; Yung, V; Araya, P; Tognarelli, J; Villagra, E; Vera, L; Fernández, J

    2008-01-01

    Dengue-1 viruses responsible for the dengue fever outbreak in Easter Island in 2002 were isolated from acute-phase sera of dengue fever patients. In order to analyze the complete genome sequence, we designed primers to amplify contiguous segments across the entire sequence of the viral genome. RT-PCR products obtained were cloned, and complete nucleotide and deduced amino acid sequences were determined. This report constitutes the first complete genetic characterization of a DENV-1 isolate from Chile. Phylogenetic analysis shows that an Easter Island isolate is most closely related to Pacific DENV-1 genotype IV viruses.

  3. Sequence selective naked-eye detection of DNA harnessing extension of oligonucleotide-modified nucleotides.

    Science.gov (United States)

    Verga, Daniela; Welter, Moritz; Marx, Andreas

    2016-02-01

    DNA polymerases can efficiently and sequence selectively incorporate oligonucleotide (ODN)-modified nucleotides and the incorporated oligonucleotide strand can be employed as primer in rolling circle amplification (RCA). The effective amplification of the DNA primer by Φ29 DNA polymerase allows the sequence-selective hybridisation of the amplified strand with a G-quadruplex DNA sequence that has horse radish peroxidase-like activity. Based on these findings we develop a system that allows DNA detection with single-base resolution by naked eye.

  4. Molecular evolution of a family of resistance gene analogs of nucleotide-binding site sequences in Solanum lycopersicum.

    Science.gov (United States)

    Liao, Pei-Chun; Lin, Kuan-Hung; Ko, Chin-Ling; Hwang, Shih-Ying

    2011-10-01

    Nucleotide-binding site-leucine-rich repeats (NBS-LRR) gene families are one of the major plant resistance genes. Genomic NBS evolution was studied in many plant species for diverse arrays of NBS gene families. In this study, we focused on one family of NBS sequences in an attempt to understand how closely related NBS sequences evolved in the light of selection in domesticated plant species. A phylogenetic analysis revealed five major clades (A-E) and five subclades (A1-A5) within clade A of cloned NBS sequences. Positive selection was only detected in newly evolved NBS lineages in subclades of clade A. Positively selected codon sites were found among NBS sequences of clade A. A sliding-window analysis revealed that regions with Ka/Ks ratios of >1 were in the inter-motifs when paired clades were compared, but regions with Ka/Ks ratios of >1 were found across NBS sequences when subclades of clade A were compared. Our results based on a family of closely related NBS sequences showed that positive selection was first exerted on specific lineages across all NBS sequences after selective constraints. Subsequently, sequences with mutations in commonly conserved motifs were scrutinized by purifying selection. In the long term, conserved high frequency alleles in commonly conserved motifs and changes in inter-motifs were maintained in the investigated family of NBS sequences. Moreover, codons identified to be under positive selection in the inter-motifs were mainly located in regions involved in functions of ATP binding or hydrolysis.

  5. Complete nucleotide sequence of wound tumor virus genomic segments encoding nonstructural polypeptides.

    Science.gov (United States)

    Anzola, J V; Dall, D J; Xu, Z K; Nuss, D L

    1989-07-01

    Sequence analysis of the genomic segments which encode the five wound tumor virus nonstructural polypeptides has been completed. The complete nucleotide sequence of segments S4 (2565 bp), S6 (1700 bp), S9 (1182 bp), and S10 (1172 bp) are presented in this report while the sequence of segment S12 (851 bp) has been described previously (T. Asamizu, D. Summers, M. B. Motika, J. V. Anzola, and D. L. Nuss, 1985, Virology 144, 398-409). Comparison of the only published sequence for another member of the genus Phytoreovirus, that of rice dwarf virus segment S10, with the combined available wound tumor virus sequence data revealed similarity with WTV segment S10: 54.9 and 30.6% at the nucleotide and amino acid level, respectively. Although wound tumor virus and rice dwarf virus differ in plant host range, tissue specificity, vector range, and disease symptom expression, the level of sequence similarity shared by the two segments suggests a common origin for these viruses. The potential use of a phytoreovirus sequence database for predicting functions of viral encoded gene products is considered.

  6. PCV: An Alignment Free Method for Finding Homologous Nucleotide Sequences and its Application in Phylogenetic Study.

    Science.gov (United States)

    Kumar, Rajnish; Mishra, Bharat Kumar; Lahiri, Tapobrata; Kumar, Gautam; Kumar, Nilesh; Gupta, Rahul; Pal, Manoj Kumar

    2017-06-01

    Online retrieval of homologous nucleotide sequences from a given database of sequences through existing alignment techniques is common practice. The salient point of these techniques is their dependence on local alignment techniques and scoring matrices, the reliability of which is limited by computational complexity and accuracy. Toward this direction, this work offers a novel way for numerical representation of genes which can further help in dividing the data space into smaller partitions helping formation of a search tree. In this context, this paper introduces a 36-dimensional Periodicity Count Value (PCV) which is representative of a particular nucleotide sequence and created through adaptation from the concept of stochastic model of Kolekar et al. (American Institute of Physics 1298:307-312, 2010. doi: 10.1063/1.3516320). The PCV construct uses information on physicochemical properties of nucleotides and their positional distribution pattern within a gene. It is observed that PCV representation of a gene reduces computational cost in the calculation of distances between a pair of genes while being consistent with the existing methods. The validity of the PCV-based method was further tested through its use in molecular phylogeny constructs in comparison with that using existing sequence alignment methods.

  7. Molecular analysis of a large subtelomeric nucleotide-binding-site-leucine-rich-repeat family in two representative genotypes of the major gene pools of Phaseolus vulgaris.

    Science.gov (United States)

    Geffroy, Valérie; Macadré, Catherine; David, Perrine; Pedrosa-Harand, Andrea; Sévignac, Mireille; Dauga, Catherine; Langin, Thierry

    2009-02-01

    In common bean, the B4 disease resistance (R) gene cluster is a complex cluster localized at the end of linkage group (LG) B4, containing at least three R specificities to the fungus Colletotrichum lindemuthianum. To investigate the evolution of this R cluster since the divergence of Andean and Mesoamerican gene pools, DNA sequences were characterized from two representative genotypes of the two major gene pools of common bean (BAT93: Mesoamerican; JaloEEP558: Andean). Sequences encoding 29 B4-CC nucleotide-binding-site-leucine-rich-repeat (B4-CNL) genes were determined: 12 from JaloEEP558 and 17 from BAT93. Although sequence exchange events were identified, phylogenetic analyses revealed that they were not frequent enough to lead to homogenization of B4-CNL sequences within a haplotype. Genetic mapping based on pulsed-field gel electrophoresis separation confirmed that the B4-CNL family is a large family specific to one end of LG B4 and is present at two distinct blocks separated by 26 cM. Fluorescent in situ hybridization on meiotic pachytene chromosomes revealed that two B4-CNL blocks are located in the subtelomeric region of the short arm of chromosome 4 on both sides of a heterochromatic block (knob), suggesting that this peculiar genomic environment may favor the proliferation of a large R gene cluster.

  8. Molecular Analysis of a Large Subtelomeric Nucleotide-Binding-Site–Leucine-Rich-Repeat Family in Two Representative Genotypes of the Major Gene Pools of Phaseolus vulgaris

    Science.gov (United States)

    Geffroy, Valérie; Macadré, Catherine; David, Perrine; Pedrosa-Harand, Andrea; Sévignac, Mireille; Dauga, Catherine; Langin, Thierry

    2009-01-01

    In common bean, the B4 disease resistance (R) gene cluster is a complex cluster localized at the end of linkage group (LG) B4, containing at least three R specificities to the fungus Colletotrichum lindemuthianum. To investigate the evolution of this R cluster since the divergence of Andean and Mesoamerican gene pools, DNA sequences were characterized from two representative genotypes of the two major gene pools of common bean (BAT93: Mesoamerican; JaloEEP558: Andean). Sequences encoding 29 B4-CC nucleotide-binding-site–leucine-rich-repeat (B4-CNL) genes were determined—12 from JaloEEP558 and 17 from BAT93. Although sequence exchange events were identified, phylogenetic analyses revealed that they were not frequent enough to lead to homogenization of B4-CNL sequences within a haplotype. Genetic mapping based on pulsed-field gel electrophoresis separation confirmed that the B4-CNL family is a large family specific to one end of LG B4 and is present at two distinct blocks separated by 26 cM. Fluorescent in situ hybridization on meiotic pachytene chromosomes revealed that two B4-CNL blocks are located in the subtelomeric region of the short arm of chromosome 4 on both sides of a heterochromatic block (knob), suggesting that this peculiar genomic environment may favor the proliferation of a large R gene cluster. PMID:19087965

  9. DNA Amplification and Nucleotide Sequence Determination of a Region of Mitochondrial DNA in the Sea Snake, Laticauda Semifasciata

    OpenAIRE

    Eguchi, Tomoko; Eguchi, Yukinori; Oshiro, Minoru; Asato, Tsuyoshi; Takei, Hiroshi; Nakashima, Yasutsugu

    1993-01-01

    We determined the nucleotide sequence of a region of the 12S ribosomal RNA (rRNA) gene in the mitochondrial DNA (mtDNA) of the sea snake, Laticauda semifasciata, using the polymerase chain reaction (PCR). We synthesized oligonucleotide primers according to the nucleotide sequence of the human mtDNA 12S rRNA gene and found that the target sequence (386 bp) of the sea snake mtDNA could be amplified with these primers. The nucleotide sequence of the amplified region of the sea snake mtDNA was deter...

  10. An algorithm and program for finding sequence specific oligo-nucleotide probes for species identification

    Directory of Open Access Journals (Sweden)

    Tautz Diethard

    2002-03-01

    Full Text Available Abstract Background The identification of species or species groups with specific oligo-nucleotides as molecular signatures is becoming increasingly popular for bacterial samples. However, it shows also great promise for other small organisms that are taxonomically difficult to tract. Results We have devised here an algorithm that aims to find the optimal probes for any given set of sequences. The program requires only a crude alignment of these sequences as input and is optimized for performance to deal also with very large datasets. The algorithm is designed such that the position of mismatches in the probes influences the selection and makes provision of single nucleotide outloops. Program implementations are available for Linux and Windows.

  11. Sequence characterization of hypervariable regions in the soybean genome: leucine-rich repeats and simple sequence repeats

    Directory of Open Access Journals (Sweden)

    Everaldo G. de Barros

    2000-06-01

    Full Text Available The genetic basis of cultivated soybean is rather narrow. This observation has been confirmed by analysis of agronomic traits among different genotypes, and more recently by the use of molecular markers. During the construction of an RFLP soybean map (Glycine soja x Glycine max), the two progenitors were analyzed with over 2,000 probes, of which 25% were polymorphic. Among the probes that revealed polymorphisms, a small proportion, about 0.5%, hybridized to regions that were highly polymorphic. Here we report the sequencing and analysis of five of these probes. Three of the five contain segments that encode leucine-rich repeat (LRR) sequences homologous to known disease resistance genes in plants. Two other probes are relatively AT-rich and contain segments of (A)n/(T)n. DNA segments corresponding to one of the probes (A45-10) were amplified from nine soybean genotypes. Partial sequencing of these amplicons suggests that deletions and/or insertions are responsible for the extensive polymorphism observed. We propose that genes encoding LRR proteins and simple sequence repeat regions prone to slippage are some of the most hypervariable regions of the soybean genome.

  12. Complete nucleotide sequence and genome organization of tobacco mosaic virus isolated from Vicia faba

    Institute of Scientific and Technical Information of China (English)

    周雪平; 薛朝阳; 陈青; 戚益军; 李德葆

    2000-01-01

    Based on the reported TMV-U1 sequence, primers were designed and fragments covering the entire genome of the TMV broad bean strain (TMV-B) were obtained with RT-PCR. These fragments were cloned and sequenced, and the 5’ and 3’ end sequences of the genome were confirmed with RACE. The complete sequence of TMV-B comprises 6 395 nucleotides (nt) and four open reading frames, which correspond to 126 ku (1 116 amino acids), 183 ku (1 616 amino acids), 30 ku (268 amino acids) and 17.5 ku proteins (159 amino acids). The complete nucleotide sequence of TMV-B is 99.4% identical to that of TMV-U1. The two virus isolates share the same sequence in the 5’ and 3’ non-coding regions and the 17.5 K ORF, and 6, 1 and 3 amino acid changes are found in the 126 K protein, 54 K protein and 30 K protein, respectively. The possible mechanism of TMV-B infection in Vicia faba is discussed.

  13. Assembly of Repeat Content Using Next Generation Sequencing Data

    Energy Technology Data Exchange (ETDEWEB)

    LaButti, Kurt; Kuo, Alan; Grigoriev, Igor; Copeland, Alex

    2014-03-17

    Repetitive organisms pose a challenge for short read assembly, and typically only unique regions and repeat regions shorter than the read length can be accurately assembled. Recently, we have been investigating the use of Pacific Biosciences reads for de novo fungal assembly. We will present an assessment of the quality and degree of repeat reconstruction possible in a fungal genome using long read technology. We will also compare differences in the assembly of repeat content using short read and long read technology.

  14. A 22-nucleotide spliced leader sequence in the human parasitic nematode Brugia malayi is identical to the trans-spliced leader exon in Caenorhabditis elegans.

    OpenAIRE

    Takacs, A M; Denker, J A; Perrine, K G; Maroney, P A; Nilsen, T W

    1988-01-01

    The mRNAs encoding a 63-kDa antigen in the human parasitic nematode Brugia malayi contain a spliced leader sequence of 22 nucleotides (nt) that is identical to the trans-spliced leader found on certain actin mRNAs in the distantly related nematode Caenorhabditis elegans. The 22-nt sequence does not appear to be encoded near the 63-kDa genes but is present in multiple copies in several locations within the parasite genome, including the 5S rRNA gene repeat. The 5S-linked copies of the 22-nt se...

  15. Mouse Mammary Tumor Virus-Like Nucleotide Sequences in Canine and Feline Mammary Tumors

    OpenAIRE

    Hsu, Wei-Li; Lin, Hsing-Yi; Chiou, Shyan-Song; Chang, Chao-Chin; Wang, Szu-Pong; Lin, Kuan-Hsun; Chulakasian, Songkhla; Wong, Min-Liang; Chang, Shih-Chieh

    2010-01-01

    Mouse mammary tumor virus (MMTV) has been speculated to be involved in human breast cancer. Companion animals, dogs, and cats with intimate human contacts may contribute to the transmission of MMTV between mouse and human. The aim of this study was to detect MMTV-like nucleotide sequences in canine and feline mammary tumors by nested PCR. Results showed that the presence of MMTV-like env and LTR sequences in canine malignant mammary tumors was 3.49% (3/86) and 18.60% (16/86), respectively. Fo...

  16. Cloning and nucleotide sequence of the Enterobacter aerogenes signal peptidase II (lsp) gene.

    OpenAIRE

    Isaki, L; Kawakami, M; Beers, R; Hom, R; Wu, H.C.

    1990-01-01

    In Escherichia coli, prolipoprotein signal peptidase is encoded by the lsp gene, which is organized into an operon consisting of ileS, lsp, and three open reading frames, designated genes x, orf-149, and orf-316. The Enterobacter aerogenes lsp gene was cloned and expressed in E. coli. The nucleotide sequence of the Enterobacter aerogenes lsp gene and a part of its flanking sequences were determined. A high degree of homology was found between the E. coli ileS-lsp operon and the corresponding ...

  17. A 2-D graphical representation of protein sequences based on nucleotide triplet codons

    Science.gov (United States)

    Bai, Fenglan; Wang, Tianming

    2005-09-01

    Graphical representation of DNA provides a simple way of viewing, sorting and comparing various gene structures. A 2-D graphical representation of protein sequences based on nucleotide triplet codons has been derived for similarity analysis of protein sequences. This approach is based on a graphical representation of triplets of DNA in which the interior of the left half plane of the complex plane is used to accommodate 64 sites for the 64 codons. We associate a directed curve, numerical value, or matrix with a protein as a descriptor. The approach is illustrated on the Homo sapiens X-linked nuclear protein (ATRX) gene.

  18. SinicView: A visualization environment for comparisons of multiple nucleotide sequence alignment tools

    Directory of Open Access Journals (Sweden)

    Wong Chun-Yi

    2006-03-01

    Background: Deluged by the rate and complexity of completed genomic sequences, the need to align longer sequences becomes more urgent, and many more tools have thus been developed. In the initial stage of genomic sequence analysis, a biologist is usually faced with the questions of how to choose the best tool to align sequences of interest and how to analyze and visualize the alignment results, and then with the question of whether poorly aligned regions produced by the tool are indeed not homologous or are just results due to inappropriate alignment tools or scoring systems used. Although several systematic evaluations of multiple sequence alignment (MSA) programs have been proposed, they may not provide a standard-bearer for most biologists because those poorly aligned regions in these evaluations are never discussed. Thus, a tool that allows cross comparison of the alignment results obtained by different tools simultaneously could help a biologist evaluate their correctness and accuracy. Results: In this paper, we present a versatile alignment visualization system called SinicView (for Sequence-aligning INnovative and Interactive Comparison VIEWer), which allows the user to efficiently compare and evaluate assorted nucleotide alignment results obtained by different tools. SinicView calculates similarity of the alignment outputs under a fixed window using the sum-of-pairs method and provides scoring profiles of each set of aligned sequences. The user can visually compare alignment results either in graphic scoring profiles or in plain text format of the aligned nucleotides along with the annotation information. We illustrate the capabilities of our visualization system by comparing alignment results obtained by MLAGAN, MAVID, and MULTIZ, respectively. Conclusion: With SinicView, users can use their own data sequences to compare various alignment tools or scoring systems and select the most suitable one to perform alignment in the

  19. The analysis of cytochrome b nucleotidic sequence for Carassius gibelio (Bloch, 1782)

    Directory of Open Access Journals (Sweden)

    Lucian D. Gorgan

    2009-01-01

    The paper is part of a larger-scale study to identify, by sequencing, the nucleotide structure of several genes (Cytb, ND4L and D-loop) and to determine their structural differences and exact length in base pairs. Research was carried out on individuals of Carassius gibelio (Bloch, 1782) (Actinopterygii, Cypriniformes) from two different populations, Iezăreni and Movileni (Iaşi), from which dorsal muscular tissue was sampled. Mitochondrial DNA (mtDNA) isolation and purification were carried out automatically using Promega’s Maxwell 16 (SEV module). Cytochrome b (cytb) was amplified by a two-stage polymerase chain reaction (PCR), using two sets of complementary primers (one set for each fragment). Direct sequencing of PCR products revealed that cytochrome b has one sequence of 1140 bp. The obtained sequences were subsequently compared with sequences of the same gene from other individuals within this species, to identify possible differences in nucleotide structure. Key Words: Carassius, cytochrome b, mtDNA.

  20. Nucleotide sequence of an exceptionally long 5.8S ribosomal RNA from Crithidia fasciculata.

    Science.gov (United States)

    Schnare, M N; Gray, M W

    1982-01-01

    In Crithidia fasciculata, a trypanosomatid protozoan, the large ribosomal subunit contains five small RNA species (e, f, g, i, j) in addition to 5S rRNA [Gray, M.W. (1981) Mol. Cell. Biol. 1, 347-357]. The complete primary sequence of species i is shown here to be pAACGUGUmCGCGAUGGAUGACUUGGCUUCCUAUCUCGUUGA ... AGAmACGCAGUAAAGUGCGAUAAGUGGUApsiCAAUUGmCAGAAUCAUUCAAUUACCGAAUCUUUGAACGAAACGG ... CGCAUGGGAGAAGCUCUUUUGAGUCAUCCCCGUGCAUGCCAUAUUCUCCAmGUGUCGAA(C)OH. This sequence establishes that species i is a 5.8S rRNA, despite its exceptional length (171-172 nucleotides). The extra nucleotides in C. fasciculata 5.8S rRNA are located in a region whose primary sequence and length are highly variable among 5.8S rRNAs, but which is capable of forming a stable hairpin loop structure (the "G+C-rich hairpin"). The sequence of C. fasciculata 5.8S rRNA is no more closely related to that of another protozoan, Acanthamoeba castellanii, than it is to representative 5.8S rRNA sequences from the other eukaryotic kingdoms, emphasizing the deep phylogenetic divisions that seem to exist within the Kingdom Protista. PMID:7079176

  1. Remote access to ACNUC nucleotide and protein sequence databases at PBIL.

    Science.gov (United States)

    Gouy, Manolo; Delmotte, Stéphane

    2008-04-01

    The ACNUC biological sequence database system provides powerful and fast query and extraction capabilities to a variety of nucleotide and protein sequence databases. The collection of ACNUC databases served by the Pôle Bio-Informatique Lyonnais includes the EMBL, GenBank, RefSeq and UniProt nucleotide and protein sequence databases and a series of other sequence databases that support comparative genomics analyses: HOVERGEN and HOGENOM containing families of homologous protein-coding genes from vertebrate and prokaryotic genomes, respectively; Ensembl and Genome Reviews for analyses of prokaryotic and of selected eukaryotic genomes. This report describes the main features of the ACNUC system and the access to ACNUC databases from any internet-connected computer. Such access was made possible by the definition of a remote ACNUC access protocol and the implementation of Application Programming Interfaces between the C, Python and R languages and this communication protocol. Two retrieval programs for ACNUC databases, Query_win, with a graphical user interface and raa_query, with a command line interface, are also described. Altogether, these bioinformatics tools provide users with either ready-to-use means of querying remote sequence databases through a variety of selection criteria, or a simple way to endow application programs with an extensive access to these databases. Remote access to ACNUC databases is open to all and fully documented (http://pbil.univ-lyon1.fr/databases/acnuc/acnuc.html).

  2. Nucleotide sequences of immunoglobulin eta genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

    Energy Technology Data Exchange (ETDEWEB)

    Sakoyama, Y.; Hong, K.J.; Byun, S.M.; Hisajima, H.; Ueda, S.; Yaoita, Y.; Hayashida, H.; Miyata, T.; Honjo, T.

    1987-02-01

    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (C-eta-1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of α1-antitrypsin and β- and δ-globin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 x 10^-9 substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 +/- 2.6 million years and 17.3 +/- 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
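    The clock calculation described above can be sketched in a few lines. This is only an illustration: it assumes the common two-lineage relation K = 2rT (K: silent-site divergence between two species, r: rate per lineage, T: time since the split), which the abstract does not spell out. The function names are hypothetical, and the K values are back-calculated from the reported dates rather than taken from the paper.

```python
# Assumption: a two-lineage clock, K = 2 * r * T. The abstract reports r and
# the resulting dates; the relation itself is assumed for illustration.

SILENT_RATE = 1.56e-9  # substitutions per site per year (from the abstract)


def divergence_time_years(k_silent: float, rate: float = SILENT_RATE) -> float:
    """Return the split time implied by a silent-site divergence k_silent."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return k_silent / (2.0 * rate)


def implied_divergence(time_years: float, rate: float = SILENT_RATE) -> float:
    """Return the silent-site divergence expected after time_years."""
    return 2.0 * rate * time_years


if __name__ == "__main__":
    # Dates are the point estimates quoted in the abstract.
    for label, t in (("chimpanzee", 6.4e6), ("orangutan", 17.3e6)):
        k = implied_divergence(t)
        myr = divergence_time_years(k) / 1e6
        print(f"{label}: K_s ~ {k:.4f} per site -> {myr:.1f} Myr")
```

    Under this assumed relation, a human-chimpanzee split of 6.4 million years corresponds to roughly 2% divergence at silent sites.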

  3. Comparison of sequencing platforms for single nucleotide variant calls in a human sample.

    Science.gov (United States)

    Ratan, Aakrosh; Miller, Webb; Guillory, Joseph; Stinson, Jeremy; Seshagiri, Somasekar; Schuster, Stephan C

    2013-01-01

    Next-generation sequencing platforms coupled with advanced bioinformatic tools enable re-sequencing of the human genome at high speed and with large cost savings. We compare sequencing platforms from Roche/454 (GS FLX), Illumina/HiSeq (HiSeq 2000), and Life Technologies/SOLiD (SOLiD 3 ECC) for their ability to identify single nucleotide substitutions in whole genome sequences from the same human sample. We report on significant GC-related bias observed in the data sequenced on Illumina and SOLiD platforms. The differences in the variant calls were investigated with regard to coverage and sequencing error. Some of the variants called by only one or two of the platforms were experimentally tested using mass spectrometry, a method that is independent of DNA sequencing. We establish several platform-specific causes for variants remaining unreported. We report the indels called using the three sequencing technologies, and from the obtained results we conclude that sequencing human genomes with more than a single platform and multiple libraries is beneficial when a high level of accuracy is required.

  4. Nucleotide sequence of the SrRNA gene and phylogenetic analysis of Trichomonas tenax.

    Science.gov (United States)

    Fukura, K; Yamamoto, A; Hashimoto, T; Goto, N

    1996-01-01

    The small subunit ribosomal RNA (SrRNA) gene of Trichomonas tenax ATCC30207 was amplified by PCR and the 1.55-kb product was cloned into plasmid vector pUC18. Four clones were isolated and sequenced. The insert DNAs were 1,552 bp long and their G+C contents were 48.1%; three of them had exactly the same DNA sequences and one had only one nucleotide change. A representative SrRNA sequence was analyzed and a phylogenetic tree was estimated by the neighbor-joining (NJ) method. Among the protists examined, T. tenax was placed as the closest relative of Tritrichomonas foetus, as expected from the traditional taxonomy. The total homology between the two SrRNA sequences was 89.2%.
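    A figure such as the 48.1% G+C content quoted above is a simple base count over the sequence. The following minimal sketch shows one way to compute it; the function name and the toy input are hypothetical and are not the T. tenax sequence.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA or RNA sequence.

    Symbols other than A, C, G, T and U (gaps, N, whitespace) are ignored.
    """
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = sum(seq.count(base) for base in "ACGTU")
    if total == 0:
        raise ValueError("sequence contains no nucleotides")
    return 100.0 * gc / total


if __name__ == "__main__":
    toy = "ATGCGCTA"  # made-up example sequence
    print(f"G+C content: {gc_content(toy):.1f}%")
```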

  5. Sequence Length Limits for Controlling False Positives in Discovering Nucleotide Sequence Motifs

    Institute of Scientific and Technical Information of China (English)

    CHEN Lei; QIAN Zi-liang

    2008-01-01

    In motif discovery, especially the discovery of transcription factor DNA binding sites, an overly long input sequence tends to return non-informative motifs rather than biologically functional ones. This paper gives theoretical analyses and computational experiments that suggest length limits for the input sequence. When the sequence length exceeds a certain critical point, the probability of discovering the motif decreases sharply. The work not only offers an explanation for the unsatisfying results of existing motif discovery studies, namely that the input sequence may have been too long and exceeded this critical point, but also provides an estimate of the input sequence length that should be accepted to obtain more meaningful and reliable results in motif discovery.
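    One way to see why longer inputs invite false positives is to count how often a fixed short word turns up by chance. The sketch below is not the paper's analysis: it assumes a single strand, independent positions with uniform base frequencies, and a Poisson approximation, and the function names are hypothetical.

```python
import math


def expected_chance_hits(seq_len: int, motif_len: int) -> float:
    """Expected chance occurrences of one fixed k-mer in a random sequence.

    Assumes uniform, independent bases (probability 1/4 each), one strand.
    """
    if motif_len <= 0 or seq_len < motif_len:
        return 0.0
    positions = seq_len - motif_len + 1
    return positions / 4 ** motif_len


def prob_at_least_one_hit(seq_len: int, motif_len: int) -> float:
    """Poisson approximation to the chance of at least one random match."""
    return 1.0 - math.exp(-expected_chance_hits(seq_len, motif_len))


if __name__ == "__main__":
    for length in (100, 1_000, 10_000, 100_000):
        p = prob_at_least_one_hit(length, motif_len=8)
        print(f"L = {length:>7}: P(chance 8-mer match) ~ {p:.3f}")
```

    Under these assumptions the chance of a spurious match grows steadily with input length, which is consistent with the qualitative point made in the abstract.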

  6. The complete nucleotide sequence of the egg drop syndrome virus: an intermediate between mastadenoviruses and aviadenoviruses.

    Science.gov (United States)

    Hess, M; Blöcker, H; Brandt, P

    1997-11-10

    The complete nucleotide sequence of an avian adenovirus, the egg drop syndrome (EDS) virus, was determined. The total genome length is 33,213 nucleotides, resulting in a molecular weight of 21.9 x 10^6. The GC content is only 42.5%. Between map units 3.5 and 76.9, the distribution of open reading frames with homology to known genes is similar to that reported for other mammalian and avian adenoviruses. However, no homologies to adenovirus genes such as E1A, pIX, pV, and E3 could be found. Outside this region, several open reading frames were identified without any obvious homology to known adenovirus proteins. In the region organized similarly to other adenoviral genomes, most homologies were found to an ovine adenovirus (OAV strain 287). The highest level of amino acid identity was found for the hexon proteins of EDS and OAV. The virus-associated RNA (VA RNA) was identified thanks to the homology with the VA RNA of fowl adenovirus serotype 1 (FAV1). Similarities with FAV1 were also found in the fiber protein. Our results demonstrate that the avian EDS virus represents an intermediate between mammalian and avian adenoviruses. The nucleotide sequence and genomic organization of the EDS virus reflect the heterogeneity of the aviadenovirus genus and the Adenoviridae family.
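    The molecular weight quoted above follows from the genome length if one assumes an average mass per base pair of double-stranded DNA. The sketch below uses 660 Da per base pair, a common rule of thumb that the abstract does not state; the constant and function name are assumptions for illustration.

```python
# Assumption: about 660 Da per base pair of double-stranded DNA. This is a
# widely used approximation, not a value given in the abstract.
AVG_DA_PER_BP = 660.0


def dsdna_molecular_weight(length_bp: int, da_per_bp: float = AVG_DA_PER_BP) -> float:
    """Approximate molecular weight (Da) of a double-stranded DNA genome."""
    if length_bp < 0:
        raise ValueError("length_bp must be non-negative")
    return length_bp * da_per_bp


if __name__ == "__main__":
    mw = dsdna_molecular_weight(33_213)  # genome length from the abstract
    print(f"Approximate molecular weight: {mw / 1e6:.1f} x 10^6 Da")
```

    With this assumed constant, 33,213 base pairs give about 21.9 x 10^6 Da, in line with the reported figure.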

  7. A blackberry (Rubus L.) expressed sequence tag library for the development of simple sequence repeat markers

    Directory of Open Access Journals (Sweden)

    Main Dorrie S

    2008-06-01

    Background: The recent development of novel repeat-fruiting types of blackberry (Rubus L.) cultivars, combined with a long history of morphological marker-assisted selection for thornlessness by blackberry breeders, has given rise to increased interest in using molecular markers to facilitate blackberry breeding. Yet no genetic maps, molecular markers, or even sequences exist specifically for cultivated blackberry. The purpose of this study is to begin development of these tools by generating and annotating the first blackberry expressed sequence tag (EST) library, designing primers from the ESTs to amplify regions containing simple sequence repeats (SSRs), and testing the usefulness of a subset of the EST-SSRs with two blackberry cultivars. Results: A cDNA library of 18,432 clones was generated from expanding leaf tissue of the cultivar Merton Thornless, a progenitor of many thornless commercial cultivars. Among the most abundantly expressed of the 3,000 genes annotated were those involved with energy, cell structure, and defense. From individual sequences containing SSRs, 673 primer pairs were designed. Of a randomly chosen set of 33 primer pairs tested with two blackberry cultivars, 10 detected an average of 1.9 polymorphic PCR products. Conclusion: This rate predicts that this library may yield as many as 940 SSR primer pairs detecting 1,786 polymorphisms. This may be sufficient to generate a genetic map that can be used to associate molecular markers with phenotypic traits, making possible molecular marker-assisted breeding to complement existing morphological marker-assisted breeding in blackberry.

  8. DNA dynamics is likely to be a factor in the genomic nucleotide repeats expansions related to diseases.

    Directory of Open Access Journals (Sweden)

    Boian S Alexandrov

    Trinucleotide repeat sequences (TRS) represent a common type of genomic DNA motif whose expansion is associated with a large number of human diseases. The driving molecular mechanisms of the TRS ongoing dynamic expansion across generations and within tissues and its influence on genomic DNA functions are not well understood. Here we report results for a novel and notable collective breathing behavior of genomic DNA of tandem TRS, leading to propensity for large local DNA transient openings at physiological temperature. Our Langevin molecular dynamics (LMD) and Markov Chain Monte Carlo (MCMC) simulations demonstrate that the patterns of openings of various TRSs depend specifically on their length. The collective propensity for DNA strand separation of repeated sequences serves as a precursor for outsized intermediate bubble states independently of the G/C-content. We report that repeats have the potential to interfere with the binding of transcription factors to their consensus sequence by altered DNA breathing dynamics in proximity of the binding sites. These observations might influence ongoing attempts to use LMD and MCMC simulations for TRS-related modeling of genomic DNA functionality in elucidating the common denominators of the dynamic TRS expansion mutation with potential therapeutic applications.

  9. Mouse mammary tumor virus-like nucleotide sequences in canine and feline mammary tumors.

    Science.gov (United States)

    Hsu, Wei-Li; Lin, Hsing-Yi; Chiou, Shyan-Song; Chang, Chao-Chin; Wang, Szu-Pong; Lin, Kuan-Hsun; Chulakasian, Songkhla; Wong, Min-Liang; Chang, Shih-Chieh

    2010-12-01

    Mouse mammary tumor virus (MMTV) has been speculated to be involved in human breast cancer. Companion animals, dogs, and cats with intimate human contacts may contribute to the transmission of MMTV between mouse and human. The aim of this study was to detect MMTV-like nucleotide sequences in canine and feline mammary tumors by nested PCR. Results showed that the presence of MMTV-like env and LTR sequences in canine malignant mammary tumors was 3.49% (3/86) and 18.60% (16/86), respectively. For feline malignant mammary tumors, the presence of both env and LTR sequences was found to be 22.22% (2/9). Nevertheless, the MMTV-like LTR and env sequences also were detected in normal mammary glands of dogs and cats. In comparisons of the MMTV-like DNA sequences of our findings to those of NIH 3T3 (MMTV-positive murine cell line) and human breast cancer cells, the sequence similarities ranged from 94 to 98%. Phylogenetic analysis revealed intermixing among sequences identified from tissues of different hosts (mouse, dog, cat, and human), indicating that MMTV-like DNA exists in these hosts. Moreover, the env transcript was detected in 1 of the 19 MMTV-positive samples by reverse transcription-PCR. Taken together, our study provides evidence for the existence and expression of MMTV-like sequences in neoplastic and normal mammary glands of dogs and cats.

  10. ARHGEF7 (Beta-PIX) acts as guanine nucleotide exchange factor for leucine-rich repeat kinase 2.

    Directory of Open Access Journals (Sweden)

    Karina Haebig

    BACKGROUND: Mutations within the leucine-rich repeat kinase 2 (LRRK2) gene are a common cause of familial and sporadic Parkinson's disease. The multidomain protein LRRK2 exhibits overall low GTPase and kinase activity in vitro. METHODOLOGY/PRINCIPAL FINDINGS: Here, we show that the rho guanine nucleotide exchange factor ARHGEF7 and the small GTPase CDC42 are interacting with LRRK2 in vitro and in vivo. GTPase activity of full-length LRRK2 increases in the presence of recombinant ARHGEF7. Interestingly, LRRK2 phosphorylates ARHGEF7 in vitro at previously unknown phosphorylation sites. We provide evidence that ARHGEF7 might act as a guanine nucleotide exchange factor for LRRK2 and that R1441C mutant LRRK2 with reduced GTP hydrolysis activity also shows reduced binding to ARHGEF7. CONCLUSIONS/SIGNIFICANCE: Downstream effects of phosphorylation of ARHGEF7 through LRRK2 could be (i) a feedback control mechanism for LRRK2 activity as well as (ii) an impact of LRRK2 on actin cytoskeleton regulation. A newly identified familial mutation N1437S, localized within the GTPase domain of LRRK2, further underlines the importance of the GTPase domain of LRRK2 in Parkinson's disease pathogenesis.

  11. Avian Retroviruses That Cause Carcinoma and Leukemia: Identification of Nucleotide Sequences Associated with Pathogenicity

    Science.gov (United States)

    Sheiness, Diana; Bister, Klaus; Moscovici, Carlo; Fanshier, Lois; Gonda, Thomas; Bishop, J. Michael

    1980-01-01

    Avian myelocytomatosis virus (MC29V) is a retrovirus that transforms both fibroblasts and macrophages in culture and induces myelocytomatosis, carcinomas, and sarcomas in birds. Previous work identified a sequence of about 1,500 nucleotides (here denoted oncMCV) that apparently derived from a normal cellular sequence and that may encode the oncogenic capacity of MC29V. In an effort to further implicate oncMCV in tumorigenesis, we used molecular hybridization to examine the distribution of nucleotide sequences related to oncMCV among the genomes of various avian retroviruses. In addition, we characterized further the genetic composition of the remainder of the MC29V genome. Our work exploited the availability of radioactive DNAs (cDNA's) complementary to oncMCV (cDNAMCV) or to specific portions of the genome of avian sarcoma virus (ASV). We showed that genomic RNAs of avian erythroblastosis virus (AEV) and avian myeloblastosis virus (AMV) could not hybridize appreciably with cDNAMCV. By contrast, cDNAMCV hybridized extensively (about 75%) and with essentially complete fidelity to the genome of Mill Hill 2 virus (MH2V), whose pathogenicity is very similar to that of MC29V, but different from that of AEV or AMV. Hybridization with the ASV cDNA's demonstrated that the MC29V genome includes about half of the ASV envelope protein gene and that the remainder of the MC29V genome is closely related to nucleotide sequences that are shared among the genomes of many avian leukosis and sarcoma viruses. We conclude that oncMCV probably specifies the unique set of pathogenicities displayed by MC29V and MH2V, whereas the oncogenic potentials of AEV and AMV are presumably encoded by a distinct nucleotide sequence unrelated to oncMCV. The genomes of ASV, MC29V, and other avian oncoviruses thus share a set of common sequences, but apparently owe their various oncogenic potentials to unrelated transforming genes. PMID:6245277

  12. A novel method to discover fluoroquinolone antibiotic resistance (qnr) genes in fragmented nucleotide sequences

    Directory of Open Access Journals (Sweden)

    Boulund Fredrik

    2012-12-01

    Background: Broad-spectrum fluoroquinolone antibiotics are central in modern health care and are used to treat and prevent a wide range of bacterial infections. The recently discovered qnr genes provide a mechanism of resistance with the potential to rapidly spread between bacteria using horizontal gene transfer. As for many antibiotic resistance genes present in pathogens today, qnr genes are hypothesized to originate from environmental bacteria. The vast amount of data generated by shotgun metagenomics can therefore be used to explore the diversity of qnr genes in more detail. Results: In this paper we describe a new method to identify qnr genes in nucleotide sequence data. We show, using cross-validation, that the method has a high statistical power of correctly classifying sequences from novel classes of qnr genes, even for fragments as short as 100 nucleotides. Based on sequences from public repositories, the method was able to identify all previously reported plasmid-mediated qnr genes. In addition, several fragments from novel putative qnr genes were identified in metagenomes. The method was also able to annotate 39 chromosomal variants, of which 11 have not previously been reported in the literature. Conclusions: The method described in this paper significantly improves the sensitivity and specificity of identification and annotation of qnr genes in nucleotide sequence data. The predicted novel putative qnr genes in the metagenomic data support the hypothesis of a large and uncharacterized diversity within this family of resistance genes in environmental bacterial communities. An implementation of the method is freely available at http://bioinformatics.math.chalmers.se/qnr/.

  13. The nucleotide sequence of 4.5S ribosomal RNA from tobacco chloroplasts.

    OpenAIRE

    Takaiwa, F; Sugiura, M

    1980-01-01

    The nucleotide sequence of tobacco chloroplast 4.5S ribosomal RNA has been determined to be: OHG-A-A-G-G-U-C-A-C-G-G-C-G-A-G-A-C-G-A-G-C-C-G-U-U-U-A-U-C-A-U-U-A-C-G-A-U-A-G-G-U-G-U-C-A-A-G-U-G-G-A-A-G-U-G-C-A-G-U-G-A-U-G-U-A-U-G-C-(G-A)-C-U-G-A-G-G-C-A-U-C-C-U-A-A-C-A-G-A-C-C-G-G-U-A-G-A-C-U-U-G-A-A-COH. The 4.5S RNA is 103 nucleotides long and its 5'-terminus is not phosphorylated.

  14. The nucleotide sequence of chloroplast 4.5S rRNA from a fern, Dryopteris acuminata.

    OpenAIRE

    Takaiwa, F.; Kusuda, M.; Sugiura, M.

    1982-01-01

    The 4.5S rRNA was isolated from the chloroplast ribosomes from Dryopteris acuminata. The complete nucleotide sequence was determined to be: OHUAAGGUCACGGCAAGACGAGCCGUUUAUCACCACGAUAGGUGCUAAGUGGAGGUGCAGUAAUGUAUGCAGCUGAGGC AUCCUAAUAGACCGAGAGGUUUGAACOH. The 4.5S rRNA is composed of 103 nucleotides and shows strong homology with those from flowering plants.

  15. Methods for sequencing GC-rich and CCT repeat DNA templates

    Science.gov (United States)

    Robinson, Donna L.

    2007-02-20

    The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.

  16. Analysis of nucleotide sequence of wheat yellow mosaic virus genomic RNAs

    Institute of Scientific and Technical Information of China (English)

    于嘉林; 晏立英; 苏宁; 侯占军; 李大伟; 韩成贵; 杨莉莉; 蔡祝南; 刘仪

    1999-01-01

    Wheat yellow mosaic virus (WYMV) isolate HC was used for viral cDNA synthesis and sequencing. The results show that the viral RNA1 is 7629 nucleotides long and encodes a polyprotein of 2407 amino acids, from which seven putative proteins may be produced by autolytic cleavage processing in addition to the viral coat protein. The RNA2 is 3639 nucleotides long and codes for a polyprotein of 903 amino acids, which may contain two putative non-structural proteins. Although WYMV is similar in genetic organization to wheat spindle streak mosaic virus (WSSMV), the identities of their nucleotide sequences and deduced amino acid sequences are as low as 70% and 75%, respectively. Based on this result, it is confirmed that WYMV and WSSMV are different species within Bymovirus.

  17. Nucleotide sequence of yeast GDH1 encoding nicotinamide adenine dinucleotide phosphate-dependent glutamate dehydrogenase.

    Science.gov (United States)

    Moye, W S; Amuro, N; Rao, J K; Zalkin, H

    1985-07-15

    The yeast GDH1 gene encodes NADP-dependent glutamate dehydrogenase. This gene was isolated by complementation of an Escherichia coli glutamate auxotroph. NADP-dependent glutamate dehydrogenase was overproduced 6-10-fold in Saccharomyces cerevisiae bearing GDH1 on a multicopy plasmid. The nucleotide sequence of the 1362-base pair coding region and 5' and 3' flanking sequences were determined. Transcription start sites were located by S1 nuclease mapping. Regulation of GDH1 was not maintained when the gene was present on a multicopy plasmid. Protein secondary structure predictions identified a region with potential to form the dinucleotide-binding domain. The amino acid sequences of the yeast and Neurospora crassa enzymes are 63% conserved. Unlike the N. crassa gene, yeast GDH1 has no introns.

  18. Conservation of nucleotide sequences for molecular diagnosis of Middle East respiratory syndrome coronavirus, 2015.

    Science.gov (United States)

    Furuse, Yuki; Okamoto, Michiko; Oshitani, Hitoshi

    2015-11-01

    Infection due to the Middle East respiratory syndrome coronavirus (MERS-CoV) is widespread. The present study was performed to assess the protocols used for the molecular diagnosis of MERS-CoV by analyzing the nucleotide sequences of viruses detected between 2012 and 2015, including sequences from the large outbreak in eastern Asia in 2015. Although the diagnostic protocols were established only 2 years ago, mismatches between the sequences of primers/probes and viruses were found for several of the assays. Such mismatches could lead to a lower sensitivity of the assay, thereby leading to false-negative diagnosis. A slight modification in the primer design is suggested. Protocols for the molecular diagnosis of viral infections should be reviewed regularly after they are established, particularly for viruses that pose a great threat to public health such as MERS-CoV.

  19. Molecular characterisation and nucleotide sequence analysis of canine parvovirus strains in vaccines in India

    Directory of Open Access Journals (Sweden)

    Sukdeb Nandi

    2010-03-01

    Canine parvovirus 2 (CPV-2) is one of the most important viruses that cause haemorrhagic gastroenteritis and myocarditis in dogs worldwide. The picture has been complicated further by the emergence of new mutants of CPV, namely CPV-2a, CPV-2b and CPV-2c. In this study, the molecular characterisation of strains present in the CPV vaccines available on the Indian market was performed using polymerase chain reaction and DNA sequencing. The VP1/VP2 genes of two vaccine strains and a field strain (Bhopal) were sequenced and the nucleotide and the deduced amino acid sequences were compared. The results indicated that the isolate belonged to CPV type 2b and the strains in the vaccines belonged to type CPV-2. From the study, it is inferred that the CPV strain used in commercially available vaccine preparations differed from the strains present in CPV infection in dogs in India.

  20. Conservation of nucleotide sequences for molecular diagnosis of Middle East respiratory syndrome coronavirus, 2015

    Directory of Open Access Journals (Sweden)

    Yuki Furuse

    2015-11-01

    Infection due to the Middle East respiratory syndrome coronavirus (MERS-CoV) is widespread. The present study was performed to assess the protocols used for the molecular diagnosis of MERS-CoV by analyzing the nucleotide sequences of viruses detected between 2012 and 2015, including sequences from the large outbreak in eastern Asia in 2015. Although the diagnostic protocols were established only 2 years ago, mismatches between the sequences of primers/probes and viruses were found for several of the assays. Such mismatches could lead to a lower sensitivity of the assay, thereby leading to false-negative diagnosis. A slight modification in the primer design is suggested. Protocols for the molecular diagnosis of viral infections should be reviewed regularly after they are established, particularly for viruses that pose a great threat to public health such as MERS-CoV.

  1. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    Science.gov (United States)

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  2. Genomic DNA enrichment using sequence capture microarrays: a novel approach to discover sequence nucleotide polymorphisms (SNP) in Brassica napus L.

    Directory of Open Access Journals (Sweden)

    Wayne E Clarke

    Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci, QTL, analysis) with traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and, to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty-eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome, generating a novel data resource for research and breeding of this crop species.

  3. Infectivity and complete nucleotide sequence of cucumber fruit mottle mosaic virus isolate Cm cDNA.

    Science.gov (United States)

    Rhee, Sun-Ju; Hong, Jin-Sung; Lee, Gung Pyo

    2014-07-01

    Three isolates of cucumber fruit mottle mosaic virus (CFMMV) were collected from melon, cucumber, and pumpkin plants in Korea. A full-length cDNA clone of CFMMV-Cm (melon isolate) was produced and evaluated for infectivity after T7 transcription in vitro (pT7CF-Cmflc). The complete CFMMV genome sequence of the infectious clone pT7CF-Cmflc was determined. The genome of CFMMV-Cm consisted of 6,571 nucleotides and shared high nucleotide sequence identity (98.8 %) with the Israel isolate of CFMMV. Based on the infectious clone pT7CF-Cmflc, a CaMV 35S-promoter driven cDNA clone (p35SCF-Cmflc) was subsequently constructed and sequenced. Mechanical inoculation with RNA transcripts of pT7CF-Cmflc and agro-inoculation with p35SCF-Cmflc resulted in systemic infection of cucumber and melon, producing symptoms similar to those produced by CFMMV-Cm. Progeny virus in infected plants was detected by RT-PCR, western blot assay, and transmission electron microscopy.

  4. Complete nucleotide sequence and genome organization of a Cactus virus X strain from Hylocereus undatus (Cactaceae).

    Science.gov (United States)

    Liou, M R; Chen, Y R; Liou, R F

    2004-05-01

    The complete nucleotide sequence of a strain of Cactus virus X (CVX-Hu) isolated from Hylocereus undatus (Cactaceae) has been determined. Excluding the poly(A) tail, the sequence is 6614 nucleotides in length and contains seven open reading frames (ORFs). The genome organization of CVX is similar to that of other potexviruses. ORF1 encodes the putative viral replicase with conserved methyltransferase, helicase, and polymerase motifs. Within ORF1, two other ORFs, designated ORF6 and ORF7, were located separately in the +2 reading frame. ORF2, 3, and 4, which form the "triple gene block" characteristic of the potexviruses, encode proteins with molecular masses of 25, 12, and 7 kDa, respectively. ORF5 encodes the coat protein with an estimated molecular mass of 24 kDa. Sequence analysis indicated that proteins encoded by ORF1-5 display a certain degree of homology to the corresponding proteins of other potexviruses. The putative product of ORF6, however, shows no significant similarity to those of other potexviruses. Phylogenetic analyses based on the replicase (the methyltransferase, helicase, and polymerase domains) and coat protein demonstrated a closer relationship of CVX with Bamboo mosaic virus, Cassava common mosaic virus, Foxtail mosaic virus, Papaya mosaic virus, and Plantago asiatica mosaic virus.

  5. Nucleotides upstream of the Kozak sequence strongly influence gene expression in the yeast S. cerevisiae.

    Science.gov (United States)

    Li, Jing; Liang, Qiang; Song, Wenjiang; Marchisio, Mario Andrea

    2017-01-01

    In the yeast Saccharomyces cerevisiae, as in every eukaryotic organism, the mRNA 5'-untranslated region (UTR) is important for translation initiation. However, the patterns and mechanisms that determine the efficiency with which ribosomes bind mRNA, the elongation of ribosomes through the 5'-UTR, and the formation of a stable translation initiation complex are not clear. Genes that are highly expressed in S. cerevisiae seem to prefer a 5'-UTR rich in adenine and poor in guanine, particularly in the Kozak sequence, which occupies roughly the first six nucleotides upstream of the START codon. We measured the fluorescence produced by 58 synthetic versions of the S. cerevisiae minimal CYC1 promoter (pCYC1min), each containing a different 5'-UTR. First, we replaced the last 15 nucleotides of the original pCYC1min 5'-UTR with adenine, a theoretically optimal configuration for high gene expression. Next, we carried out single and multiple point mutations on it. Protein synthesis was highly affected by both single and multiple point mutations upstream of the Kozak sequence. RNAfold simulations revealed that significant changes in the mRNA secondary structures occur by mutating more than three adenines into guanines between positions -15 and -9. Furthermore, the effect of point mutations turned out to be strongly context-dependent, indicating that adenines placed just upstream of the START codon do not per se guarantee an increase in gene expression, as previously suggested. New synthetic eukaryotic promoters, which differ in their translation initiation rate, can be built by acting on the nucleotides upstream of the Kozak sequence. Translation efficiency could, potentially, be influenced by another portion of the 5'-UTR further upstream of the START codon. A deeper understanding of the role of the 5'-UTR in gene expression would improve criteria for choosing and using promoters inside yeast synthetic gene circuits.
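    The construct design described in this abstract (an all-adenine tail of 15 nucleotides, followed by point mutations) can be sketched as simple string operations. This is only an illustration under stated assumptions: the UTR string is assumed to end immediately before the START codon, position -1 is assumed to be the nucleotide directly upstream of it, and the function names and example sequence are hypothetical rather than the authors' actual sequences.

```python
def replace_tail_with_adenine(utr: str, n: int = 15) -> str:
    """Replace the last n nucleotides of a 5'-UTR with adenines."""
    if len(utr) < n:
        raise ValueError("UTR is shorter than the requested tail length")
    return utr[:-n] + "A" * n


def point_mutation(utr: str, position: int, base: str) -> str:
    """Substitute one base at a negative position (-1 = just upstream of START)."""
    if not -len(utr) <= position <= -1:
        raise ValueError("position must lie between -len(utr) and -1")
    idx = len(utr) + position  # convert to a 0-based index
    return utr[:idx] + base + utr[idx + 1:]


# Hypothetical 20-nt UTR, for illustration only.
example_utr = "GCGCGCGCGCGCGCGCGCGC"
poly_a_utr = replace_tail_with_adenine(example_utr)   # last 15 nt become A
mutant_utr = point_mutation(poly_a_utr, -15, "G")     # A -> G at position -15
```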

  6. A Cluster of Nucleotide-Binding Site-Leucine-Rich Repeat Genes Resides in a Barley Powdery Mildew Resistance Quantitative Trait Loci on 7HL.

    Science.gov (United States)

    Cantalapiedra, Carlos P; Contreras-Moreira, Bruno; Silvar, Cristina; Perovic, Dragan; Ordon, Frank; Gracia, María Pilar; Igartua, Ernesto; Casas, Ana M

    2016-07-01

    Powdery mildew causes severe yield losses in barley production worldwide. Although many resistance genes have been described, only a few have already been cloned. A strong QTL (quantitative trait locus) conferring resistance to a wide array of powdery mildew isolates was identified in a Spanish barley landrace on the long arm of chromosome 7H. Previous studies narrowed down the QTL position, but were unable to identify candidate genes or physically locate the resistance. In this study, the exome of three recombinant lines from a high-resolution mapping population was sequenced and analyzed, narrowing the position of the resistance down to a single physical contig. Closer inspection of the region revealed a cluster of closely related NBS-LRR (nucleotide-binding site-leucine-rich repeat containing protein) genes. Large differences were found between the resistant lines and the reference genome of cultivar Morex, in the form of PAV (presence-absence variation) in the composition of the NBS-LRR cluster. Finally, a template-guided assembly was performed and subsequent expression analysis revealed that one of the new assembled candidate genes is transcribed. In summary, the results suggest that NBS-LRR genes, absent from the reference and the susceptible genotypes, could be functional and responsible for the powdery mildew resistance. The procedure followed is an example of the use of NGS (next-generation sequencing) tools to tackle the challenges of gene cloning when the target gene is absent from the reference genome.

  7. A Cluster of Nucleotide-Binding Site–Leucine-Rich Repeat Genes Resides in a Barley Powdery Mildew Resistance Quantitative Trait Loci on 7HL

    Directory of Open Access Journals (Sweden)

    Carlos P. Cantalapiedra

    2016-07-01

    Powdery mildew causes severe yield losses in barley production worldwide. Although many resistance genes have been described, only a few have already been cloned. A strong QTL (quantitative trait locus) conferring resistance to a wide array of powdery mildew isolates was identified in a Spanish barley landrace on the long arm of chromosome 7H. Previous studies narrowed down the QTL position, but were unable to identify candidate genes or physically locate the resistance. In this study, the exome of three recombinant lines from a high-resolution mapping population was sequenced and analyzed, narrowing the position of the resistance down to a single physical contig. Closer inspection of the region revealed a cluster of closely related NBS-LRR (nucleotide-binding site–leucine-rich repeat containing protein) genes. Large differences were found between the resistant lines and the reference genome of cultivar Morex, in the form of PAV (presence-absence variation) in the composition of the NBS-LRR cluster. Finally, a template-guided assembly was performed and subsequent expression analysis revealed that one of the new assembled candidate genes is transcribed. In summary, the results suggest that NBS-LRR genes, absent from the reference and the susceptible genotypes, could be functional and responsible for the powdery mildew resistance. The procedure followed is an example of the use of NGS (next-generation sequencing) tools to tackle the challenges of gene cloning when the target gene is absent from the reference genome.

  8. A maximum principle for the mutation--selection equilibrium of nucleotide sequences

    CERN Document Server

    Garske, Tini; Grimm, Uwe

    2004-01-01

    We study the equilibrium behaviour of a deterministic four-state mutation--selection model as a model for the evolution of a population of nucleotide sequences. The mutation model is the Kimura 3ST mutation scheme, and selection is assumed to be permutation invariant. Considering the evolution process both forward and backward in time, we use the ancestral distribution as the stationary state of the backward process to derive an expression for the mutational loss (as the difference between ancestral and population mean fitness), and we prove a maximum principle that determines the population mean fitness in mutation--selection balance.
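    The Kimura 3ST mutation scheme named in this abstract distinguishes one transition rate and two transversion rates. The sketch below builds the corresponding 4x4 rate matrix; it covers only the mutation part of the model, not selection. The assignment of beta and gamma to the two transversion classes follows one common convention and is an assumption here, as are the function name and the numeric rates.

```python
BASES = ("A", "G", "C", "T")


def k3st_rate_matrix(alpha: float, beta: float, gamma: float) -> dict:
    """Build a Kimura 3ST rate matrix as nested dicts q[from_base][to_base].

    alpha: transitions (A<->G, C<->T)
    beta:  transversions A<->C and G<->T (assumed convention)
    gamma: transversions A<->T and G<->C (assumed convention)
    """
    pair_rate = {
        frozenset("AG"): alpha, frozenset("CT"): alpha,
        frozenset("AC"): beta, frozenset("GT"): beta,
        frozenset("AT"): gamma, frozenset("GC"): gamma,
    }
    q = {}
    for x in BASES:
        row = {y: pair_rate[frozenset((x, y))] for y in BASES if y != x}
        row[x] = -sum(row.values())  # diagonal makes each row sum to zero
        q[x] = row
    return q


# Illustrative rates only; they are not taken from the paper.
example_q = k3st_rate_matrix(alpha=0.3, beta=0.1, gamma=0.05)
```

    Each base has exactly one partner in each of the three classes, so every row has the same total leaving rate, alpha + beta + gamma.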

  9. Complete nucleotide sequence of a virus associated with rusty mottle disease of sweet cherry (Prunus avium).

    Science.gov (United States)

    Villamor, D V; Druffel, K L; Eastwell, K C

    2013-08-01

    Cherry rusty mottle is a disease of sweet cherries first described in 1940 in western North America. Because the disease is graft-transmissible, a viral cause was assumed. Here, the complete genomic nucleotide sequences of virus isolates from two trees expressing cherry rusty mottle disease symptoms are characterized; the virus is designated cherry rusty mottle associated virus (CRMaV). The biological and molecular characteristics of this virus in comparison to those of cherry necrotic rusty mottle virus (CNRMV) and cherry green ring mottle virus (CGRMV) are described. CRMaV was subsequently detected in additional sweet cherry trees expressing symptoms of cherry rusty mottle disease.

  10. Complete nucleotide sequences of two begomoviruses infecting Madagascar periwinkle (Catharanthus roseus) from Pakistan.

    Science.gov (United States)

    Ilyas, Muhammad; Nawaz, Kiran; Shafiq, Muhammad; Haider, Muhammad Saleem; Shahid, Ahmad Ali

    2013-02-01

    Though Catharanthus roseus (Madagascar periwinkle) is an ornamental plant, it is famous for its medicinal value. Its alkaloids are known for anti-cancerous properties, and this plant is studied mainly for its alkaloids. Here, this plant has been studied for its viral diseases. Complete DNA sequences of two begomoviruses infecting C. roseus originating from Pakistan were determined. The sequence of one begomovirus (clone KN4) shows the highest level of nucleotide sequence identity (86.5 %) to an unpublished virus, chili leaf curl India virus (ChiLCIV), and then (84.4 % identity) to papaya leaf curl virus (PaLCV), and thus represents a new species, for which the name "Catharanthus yellow mosaic virus" (CYMV) is proposed. The sequence of another begomovirus (clone KN6) shows the highest level of sequence identity (95.9 % to 99 %) to a newly reported virus from India, papaya leaf crumple virus (PaLCrV). Sequence analysis shows that KN4 and KN6 are recombinants of Pedilanthus leaf curl virus (PedLCV) and croton yellow vein mosaic virus (CrYVMV).

  11. Nucleotide sequence characterization and phylogenetic analysis of hantaviruses isolated in Shandong Province, China

    Institute of Scientific and Technical Information of China (English)

    LI Jian; ZHAO Zhong-tang; WANG Zhi-qiang; LIU Yun-xi; HU Mao-hong

    2007-01-01

    Background: China is the most severe endemic area of hemorrhagic fever with renal syndrome (HFRS) in the world, with 30 000-50 000 cases reported annually, which accounts for more than 90% of the total number of cases worldwide. The incidence rate of the syndrome in Shandong Province is one of the highest in China and has reached 50 per 100 000 persons per year. However, the molecular characteristics of hantaviruses (HV) epidemic in Shandong Province remain unclear. Therefore it is useful to clarify the nucleotide sequence and phylogenetic characteristics of HV isolated in Shandong Province in order to provide better advice for controlling and preventing HFRS. Methods: RNAs were extracted from sera of clinically diagnosed patients and from positive rodent lungs that were detected by indirect immunofluorescent assay (IFA). Partial M segments of HV were amplified from the RNAs with reverse transcription nested polymerase chain reactions (nested PCR) using hantavirus genotype specific primers. The nested PCR products were sequenced and compared with those from previously epidemic isolates in Shandong and with other representative HV sequences from GenBank. Phylogenetic tree analyses were performed based on the sequences of the M genes. Results: Thirty-four HV isolates in Shandong showed 67.1%-100% nucleotide identities. The nucleotide homologies among 6 Hantaan virus (HTNV) isolates in Shandong were 78.1%-98.7%, while the homologies among 28 Seoul virus (SEOV) isolates in Shandong were 93.7%-100%. There were at least 3 subtypes of HTNV (H2, H5, H9) and 2 subtypes of SEOV (S2, S3) in Shandong Province. Conclusions: In Shandong Province, the homologies of HTNV were lower and there were no predominant subtypes, while the homologies of SEOV were higher and S3 was the predominant subtype. The homologies of SEOV from rodents were higher than those from patients. The distribution of subtypes in Shandong was similar to that of the adjoining provinces. Phylogenetic analyses of the sequences showed

  12. Mining of haplotype-based expressed sequence tag single nucleotide polymorphisms in citrus.

    Science.gov (United States)

    Chen, Chunxian; Gmitter, Fred G

    2013-11-01

    Single nucleotide polymorphisms (SNPs), the most abundant variations in a genome, have been widely used in various studies. Detection and characterization of citrus haplotype-based expressed sequence tag (EST) SNPs will greatly facilitate further utilization of these gene-based resources. In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers and consequentially yielding different numbers of haplotype-based quality SNPs. Sweet orange (SO) had the most (213,830) ESTs, generating 11,182 quality SNPs in 3,327 out of 4,228 usable contigs. Summed from all the individually mining results, a total of 25,417 quality SNPs were discovered - 15,010 (59.1%) were transitions (AG and CT), 9,114 (35.9%) were transversions (AC, GT, CG, and AT), and 1,293 (5.0%) were insertion/deletions (indels). A vast majority of SNP-containing contigs consisted of only 2 haplotypes, as expected, but the percentages of 2 haplotype contigs varied widely in these citrus cultivars. BLAST of the 25,417 25-mer SNP oligos to the Clementine reference genome scaffolds revealed 2,947 SNPs had "no hits found", 19,943 had 1 unique hit / alignment, 1,571 had one hit and 2+ alignments per hit, and 956 had 2+ hits and 1+ alignment per hit. Of the total 24,293 scaffold hits, 23,955 (98.6%) were on the main scaffolds 1 to 9, and only 338 were on 87 minor scaffolds. Most alignments had 100% (25/25) or 96% (24/25) nucleotide identities, accounting for 93% of all the alignments. Considering almost all the nucleotide discrepancies in the 24/25 alignments were at the SNP sites, it served well as in silico validation of these SNPs, in addition to and consistent with the rate (81%) validated by sequencing and SNaPshot assay. 
High-quality EST-SNPs from different citrus genotypes were detected, and
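    The transition/transversion split reported in this abstract (AG and CT versus AC, GT, CG, and AT) follows from whether the two alleles belong to the same chemical class. A minimal sketch of that classification is given below; the function name is hypothetical and indels are deliberately left out.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(allele_1: str, allele_2: str) -> str:
    """Classify a two-allele SNP as a 'transition' or a 'transversion'."""
    pair = {allele_1.upper(), allele_2.upper()}
    if len(pair) != 2 or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


# The category counts reported in the abstract add up to the reported total.
reported_counts = {"transition": 15010, "transversion": 9114, "indel": 1293}
reported_total = sum(reported_counts.values())
```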

  13. Using mitochondrial nucleotide sequences to investigate diversity and genealogical relationships within common carp (Cyprinus carpio L.).

    Science.gov (United States)

    Thai, B T; Burridge, C P; Pham, T A; Austin, C M

    2005-02-01

    Direct sequencing of mitochondrial DNA (mtDNA) D-loop (745 bp) and MTATPase6/MTATPase8 (857 bp) regions was used to investigate genetic variation within common carp and develop a global genealogy of common carp strains. The D-loop region was more variable than the MTATPase6/MTATPase8 region, but given the wide distribution of carp the overall levels of sequence divergence were low. Levels of haplotype diversity varied widely among countries with Chinese, Indonesian and Vietnamese carp showing the greatest diversity whereas Japanese Koi and European carp had undetectable nucleotide variation. A genealogical analysis supports a close relationship between Vietnamese, Koi and Chinese Color carp strains and to a lesser extent, European carp. Chinese and Indonesian carp strains were the most divergent, and their relationships do not support the evolution of independent Asian and European lineages and current taxonomic treatments.

  14. The complete nucleotide sequence and genome organization of pea streak virus (genus Carlavirus).

    Science.gov (United States)

    Su, Li; Li, Zhengnan; Bernardy, Mike; Wiersma, Paul A; Cheng, Zhihui; Xiang, Yu

    2015-10-01

    Pea streak virus (PeSV) is a member of the genus Carlavirus in the family Betaflexiviridae. Here, the first complete genome sequence of PeSV was determined by deep sequencing of a cDNA library constructed from dsRNA extracted from a PeSV-infected sample and Rapid Amplification of cDNA Ends (RACE) PCR. The PeSV genome consists of 8041 nucleotides excluding the poly(A) tail and contains six open reading frames (ORFs). The putative peptide encoded by the PeSV ORF6 has an estimated molecular mass of 6.6 kDa and shows no similarity to any known proteins. This differs from typical carlaviruses, whose ORF6 encodes a 12- to 18-kDa cysteine-rich nucleic-acid-binding protein.

  15. [Nucleotide sequence of HLA-DQA1 promoter region (QAP) in a lung cancer patient].

    Science.gov (United States)

    Qiu, C; Zhou, W; Song, C

    1996-06-01

    The HLA-DQA1 allele and the nucleotide sequence of the HLA-DQA1 promoter region (QAP) in a patient with IDDM complicated by lung cancer have been identified by PCR/SSCP, PCR/SSCP and PCR/sequencing. The results showed that: (1) The lung cancer patient and all of his family members carried HLA-DQA1* 0301/0501 alleles. (2) A single base substitution G-->A at position -155 and a deletion of CAA at positions -161 to -163 occurred in the patient. These results suggest that mutation of the HLA-DQA1 promoter region may modulate HLA-DQA1 gene expression through trans-acting factors binding to variant cis-acting elements and may be responsible for the pathogenesis of lung cancer.

  16. Nucleotide sequences of three tRNA(Ser) from Drosophila melanogaster reading the six serine codons.

    Science.gov (United States)

    Cribbs, D L; Gillam, I C; Tener, G M

    1987-10-05

    The nucleotide sequences of three serine tRNAs from Drosophila melanogaster, together capable of decoding the six serine codons, were determined. tRNA(Ser)2b has the anticodon GCU, tRNA(Ser)4 has CGA and tRNA(Ser)7 has IGA. tRNA(Ser)2b differs from the last two by about 25%. However, tRNA(Ser)4 and tRNA(Ser)7 are 96% homologous, differing only at the first position of the anticodon and two other sites. This unusual sequence relationship suggests, together with similar pairs in the yeasts Schizosaccharomyces pombe and Saccharomyces cerevisiae, that eukaryotic tRNA(Ser)UCN may be undergoing concerted evolution.
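    How the three anticodons in this abstract (GCU, CGA and IGA) can together read all six serine codons follows from Crick's wobble rules, where the first anticodon base pairs with the third codon base. The sketch below applies the standard wobble pairings; the function name is hypothetical and modified bases other than inosine are not modelled.

```python
# Standard Watson-Crick pairs for codon positions 1 and 2.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Crick wobble rules: first anticodon base -> codon third bases it can read.
WOBBLE = {"G": "CU", "C": "G", "A": "U", "U": "AG", "I": "UCA"}


def codons_read(anticodon: str) -> list[str]:
    """Return the codons (5'->3') readable by an anticodon written 5'->3'."""
    first, second, third = anticodon.upper()
    # Pairing is antiparallel: anticodon base 3 pairs with codon base 1.
    stem = WATSON_CRICK[third] + WATSON_CRICK[second]
    return sorted(stem + wobble_base for wobble_base in WOBBLE[first])


serine_codons = sorted(
    codon for anticodon in ("GCU", "CGA", "IGA") for codon in codons_read(anticodon)
)
```

    Under these rules GCU covers the two AGY codons, CGA covers UCG, and IGA covers the remaining three UCN codons.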

  17. Identification and Validation of Single Nucleotide Polymorphisms in Poplar Using Publicly Expressed Sequence Tags

    Institute of Scientific and Technical Information of China (English)

    Bo ZHANG; Yan ZHOU; Liang ZHANG; Qiang ZHUGE; Ming-Xiu WANG; Min-Ren HUANG

    2005-01-01

    By using assembled expressed sequence tags (ESTs) from 14 different cDNA libraries that contain 84 132 sequence reads, 556 Populus candidate single nucleotide polymorphisms (SNPs) were identified. Because traces were not available from dbEST (http://www.ncbi.nlm.nih.gov/dbEST/index.html), stringent filters were used to identify reliable candidate SNPs. Sequence analysis indicated that the main types of substitutions among candidate SNPs were A/G and T/C transitions, which accounted for 22.0% and 30.8%, respectively. One hundred and ten candidate SNPs were tested. As a result, 38 candidate SNPs were confirmed by directed sequencing of PCR products amplified from six different individuals. Thirteen new SNPs in intron regions were found and multiple SNPs were found to be located in both intron and exon regions of four contigs. Heterozygosis was found in all 47 candidate sites and five SNP sites were heterozygous in all six samples. This is the first report of SNP identification in a tree species, and it reveals that assembled ESTs from multiple libraries of the public database may provide a rich source of comparative sequences for an SNP search in the poplar genome.

  18. Complete nucleotide sequence of a novel porcine circovirus-like agent and its infectivity in vitro

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    A novel agent (hereafter termed P2) was isolated from pig sera in China, which contained covalently bound circular genomic DNAs of 993 nucleotides. Sequence analyses indicated that the agent was closely related to the porcine circovirus (PCV). The molecular clone of P2 was constructed subsequently and used for the following studies. Intracytoplasmic inclusions and intranuclear inclusions were only found in PK-15 cells transfected with the tandem dimer of the P2 molecular DNA clone. Intracytoplasmic inclusions were round or irregular in shape and 0.1-0.4 μm in diameter, and intranuclear inclusions were more electron-dense than intracytoplasmic inclusions and had two general shapes: round/small (0.1 μm in diameter) and hexagonal/large (0.5-1.4 μm in diameter). The inclusions were not membrane-bound. The cells transfected with the tandem dimer of the P2 molecular DNA clone tested positive for P2 DNA at passage 5. The P2 antigen could be detected in both transfected and passaged PK-15 cells. This is the first report regarding the complete nucleotide sequence of a small DNA genome in a circovirus-like infectious agent in vitro.

  19. Mapping DNA methylation by transverse current sequencing: Reduction of noise from neighboring nucleotides

    Science.gov (United States)

    Alvarez, Jose; Massey, Steven; Kalitsov, Alan; Velev, Julian

    Nanopore sequencing via transverse current has emerged as a competitive candidate for mapping DNA methylation without the need for bisulfite treatment, fluorescent tags, or PCR amplification. By eliminating the error-producing amplification step, long read lengths become feasible, which greatly simplifies the assembly process and reduces the time and the cost inherent in current technologies. However, due to the large error rates of nanopore sequencing, single base resolution has not been reached. A very important source of noise is the intrinsic structural noise in the electric signature of the nucleotide arising from the influence of neighboring nucleotides. In this work we perform calculations of the tunneling current through DNA molecules in nanopores using the non-equilibrium electron transport method within an effective multi-orbital tight-binding model derived from first-principles calculations. We develop a base-calling algorithm accounting for the correlations of the current through neighboring bases, which in principle can reduce the error rate below any desired precision. Using this method we show that we can clearly distinguish DNA methylation and other base modifications based on the reading of the tunneling current.

  20. Complete nucleotide sequence and gene rearrangement of the mitochondrial genome of Occidozyga martensii

    Indian Academy of Sciences (India)

    En Li; Xiaoqiang Li; Xiaobing Wu; Ge Feng; Man Zhang; Haitao Shi; Lijun Wang; Jianping Jiang

    2014-12-01

    In this study, the complete nucleotide sequence (18,321 bp) of the mitochondrial (mt) genome of the round-tongued floating frog, Occidozyga martensii, was determined. Although the base composition and codon usage of O. martensii conformed to the typical vertebrate patterns, this mt genome contained 23 tRNAs (a tandem duplication of the tRNA-Met gene). The LTPF tRNA-gene cluster, and the derived position of the ND5 gene downstream of the control region, were present in this mitogenome. Moreover, we found that in the WANCY tRNA-gene cluster, the tRNA-Asn gene was located between the tRNA-Tyr and COI genes instead of between the tRNA-Ala and tRNA-Cys genes, which is a novel mtDNA gene rearrangement in vertebrates. Based on the concatenated nucleotide sequences of the 13 protein-coding genes, phylogenetic analysis (BI, ML, MP) was performed to further clarify the phylogenetic relations of this species within anurans.

  1. Complete nucleotide sequence of a novel porcine circovirus-like agent and its infectivity in vitro

    Institute of Scientific and Technical Information of China (English)

    WEN LiBin; HE KongWang; YANG HanChun; NI YanXiu; ZHANG XueHan; GUO RongLi; PAN QunXin

    2008-01-01

    A novel agent (hereafter termed P2) was isolated from pig sera in China, which contained covalently bound circular genomic DNAs of 993 nucleotides. Sequence analyses indicated that the agent was closely related to the porcine circovirus (PCV). The molecular clone of P2 was constructed subsequently and used for the following studies. Intracytoplasmic inclusions and intranuclear inclusions were only found in PK-15 cells transfected with the tandem dimer of the P2 molecular DNA clone. Intracytoplasmic inclusions were round or irregular in shape and 0.1-0.4 μm in diameter, and intranuclear inclusions were more electron-dense than intracytoplasmic inclusions and had two general shapes: round/small (0.1 μm in diameter) and hexagonal/large (0.5-1.4 μm in diameter). The inclusions were not membrane-bound. The cells transfected with the tandem dimer of the P2 molecular DNA clone tested positive for P2 DNA at passage 5. The P2 antigen could be detected in both transfected and passaged PK-15 cells. This is the first report regarding the complete nucleotide sequence of a small DNA genome in a circovirus-like infectious agent in vitro.

  2. The nucleotide sequence and genome structure of the geminivirus miscanthus streak virus.

    Science.gov (United States)

    Chatani, M; Matsumoto, Y; Mizuta, H; Ikegami, M; Boulton, M I; Davies, J W

    1991-10-01

    A tandem dimer of miscanthus streak virus (MiSV) DNA was inserted into the T-DNA of the binary plasmid vector pBIN19 and agroinoculated into several monocotyledonous plants (monocots) using Agrobacterium tumefaciens or A. rhizogenes. Disease symptoms and geminate particles were produced in maize and Panicum milaceum plants, and MiSV-specific double-stranded and single-stranded DNAs were found in these plants. The nucleotide sequence of the infectious MiSV clone, consisting of 2672 nucleotides, was determined. Four open reading frames (ORFs) for proteins of Mr greater than 10K were identified, two (V0 and V2) in the virus (+) sense and two (C1 and C2) in the complementary (-) sense, although C2 did not have an ATG start codon. Unlike other geminiviruses infecting monocots, complementary-sense ORFs did not overlap. Potential splicing donor and acceptor sites were identified in the sequence of the border region between the C terminus of ORF C1 and the N terminus of ORF C2. Amino acid sequences predicted from three (V2, C1 and C2) of these ORFs showed significant homology with the corresponding ORFs of other geminiviruses infecting monocots. A fifth ORF (V1), which showed some homology with ORF V1 of other monocot-infecting geminiviruses despite having a coding capacity for a product of Mr 8.8K, was found just upstream of ORF V2 as observed in those geminiviruses. ORF V0 showed no significant homology with ORFs present in any other geminiviruses. A mutation of V0 indicated that the C-terminal 30% of this ORF was not necessary for infection in maize, but that sequences around the mutated LspI site might have some regulatory role.

  3. Optimization of Bartonella henselae multilocus sequence typing scheme using single-nucleotide polymorphism analysis of SOLiD sequence data

    Institute of Scientific and Technical Information of China (English)

    ZHAO Fan; Gemma Chaloner; Alistair Darby; SONG Xiu-ping; LI Dong-mei; Richard Birtles; LIU Qi-yong

    2012-01-01

    Background Multi-locus sequence typing (MLST) is widely used to explore the population structure of numerous bacterial pathogens. However, for genotypically-restricted pathogens, the sensitivity of MLST is limited by a paucity of variation within selected loci. For Bartonella henselae (B. henselae), although the MLST scheme currently used has been proven useful in defining the overall population structure of the species, its reliability for the accurate delineation of closely-related sequence types, between which allelic variation is usually limited to, at most, one or two nucleotide polymorphisms. Exploitation of high-throughput sequencing data allows a more informed selection of MLST loci and thus, potentially, a means of enhancing the sensitivity of the schemes they comprise. Methods We carried out SOLiD resequencing on 12 representative B. henselae isolates and explored these data using single nucleotide polymorphism (SNP) analysis. We determined the number and distribution of SNPs in the genes targeted by the established MLST scheme and modified the position of loci within these genes to capture as much genetic variation as possible. Results Using genome-wide SNP data, we found the distribution of SNPs within each open reading frame (ORF) of MLST loci, which were not represented by the established B. henselae MLST scheme. We then modified the position of loci in the MLST scheme to better reflect the polymorphism in the ORF as a whole. The use of amended loci in this scheme allowed previously indistinguishable ST1 strains to be differentiated. However, diversity among B. henselae strains in China remained low. Conclusions Our study demonstrates the use of SNP analysis to facilitate the selection of MLST loci to augment the currently-described scheme for B. henselae. The diversity among B. henselae strains in China is markedly less than that observed in B. henselae populations elsewhere in the world.

  4. Spectroscopic investigation on the telomeric DNA base sequence repeat

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Telomeres are protein-DNA complexes at the termini of linear chromosomes, which protect chromosomal integrity and maintain cellular replicative capacity. From single-cell organisms to advanced animals and plants, the structures and functions of telomeres are highly conserved. In the cells of humans and other vertebrates, the telomeric DNA base sequence is always (TTAGGG)n. In the present work, we have obtained absorption and fluorescence spectra measured from seven synthesized oligonucleotides to simulate the telomeric DNA system and calculated their relative fluorescence quantum yields on which not only telomeric DNA characteristics are predicted but also possibly the shortened telomeric sequences during cell division are imrelative fluorescence quantum yield and remarkable excitation energy innerconversion, which tallies with the telomeric sequence of (TTAGGG)n. This result shows that telomeric DNA has a strong non-radiative or innerconvertible capability.

  5. Plasmid P1 replication: negative control by repeated DNA sequences.

    OpenAIRE

    Chattoraj, D; Cordes, K.; Abeles, A

    1984-01-01

    The incompatibility locus, incA, of the unit-copy plasmid P1 is contained within a fragment that is essentially a set of nine 19-base-pair repeats. One or more copies of the fragment destabilizes the plasmid when present in trans. Here we show that extra copies of incA interfere with plasmid DNA replication and that a deletion of most of incA increases plasmid copy number. Thus, incA is not essential for replication but is required for its control. When cloned in a high-copy-number vector, pi...

  6. The bioinformatics of nucleotide sequence coding for proteins requiring metal coenzymes and proteins embedded with metals

    Science.gov (United States)

    Tremberger, G.; Dehipawala, Sunil; Cheung, E.; Holden, T.; Sullivan, R.; Nguyen, A.; Lieberman, D.; Cheung, T.

    2015-09-01

    All metallo-proteins need post-translation metal incorporation. In fact, the isotope ratio of Fe, Cu, and Zn in physiology and oncology have emerged as an important tool. The nickel containing F430 is the prosthetic group of the enzyme methyl coenzyme M reductase which catalyzes the release of methane in the final step of methano-genesis, a prime energy metabolism candidate for life exploration space mission in the solar system. The 3.5 Gyr early life sulfite reductase as a life switch energy metabolism had Fe-Mo clusters. The nitrogenase for nitrogen fixation 3 billion years ago had Mo. The early life arsenite oxidase needed for anoxygenic photosynthesis energy metabolism 2.8 billion years ago had Mo and Fe. The selection pressure in metal incorporation inside a protein would be quantifiable in terms of the related nucleotide sequence complexity with fractal dimension and entropy values. Simulation model showed that the studied metal-required energy metabolism sequences had at least ten times more selection pressure relatively in comparison to the horizontal transferred sequences in Mealybug, guided by the outcome histogram of the correlation R-sq values. The metal energy metabolism sequence group was compared to the circadian clock KaiC sequence group using magnesium atomic level bond shifting mechanism in the protein, and the simulation model would suggest a much higher selection pressure for the energy life switch sequence group. The possibility of using Kepler 444 as an example of ancient life in Galaxy with the associated exoplanets has been proposed and is further discussed in this report. Examples of arsenic metal bonding shift probed by Synchrotron-based X-ray spectroscopy data and Zn controlled FOXP2 regulated pathways in human and chimp brain studied tissue samples are studied in relationship to the sequence bioinformatics. The analysis results suggest that relatively large metal bonding shift amount is associated with low probability correlation R

  7. Single-nucleotide polymorphism discovery by high-throughput sequencing in sorghum

    Directory of Open Access Journals (Sweden)

    White Frank F

    2011-07-01

    Full Text Available Abstract Background Eight diverse sorghum (Sorghum bicolor L. Moench) accessions were subjected to short-read genome sequencing to characterize the distribution of single-nucleotide polymorphisms (SNPs). Two strategies were used for DNA library preparation. Missing SNP genotype data were imputed by local haplotype comparison. The effect of library type and genomic diversity on SNP discovery and imputation are evaluated. Results Alignment of eight genome equivalents (6 Gb) to the public reference genome revealed 283,000 SNPs at ≥82% confirmation probability. Sequencing from libraries constructed to limit sequencing to start at defined restriction sites led to genotyping 10-fold more SNPs in all 8 accessions, and correctly imputing 11% more missing data, than from semirandom libraries. The SNP yield advantage of the reduced-representation method was less than expected, since up to one fifth of reads started at noncanonical restriction sites and up to one third of restriction sites predicted in silico to yield unique alignments were not sampled at near-saturation. For imputation accuracy, the availability of a genomically similar accession in the germplasm panel was more important than panel size or sequencing coverage. Conclusions A sequence quantity of 3 million 50-base reads per accession using a BsrFI library would conservatively provide satisfactory genotyping of 96,000 sorghum SNPs. For most reliable SNP-genotype imputation in shallowly sequenced genomes, germplasm panels should consist of pairs or groups of genomically similar entries. These results may help in designing strategies for economical genotyping-by-sequencing of large numbers of plant accessions.
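
    The read budget quoted in the abstract above can be turned into an approximate genome-wide fold coverage. The sketch below is only a back-of-the-envelope illustration: it assumes the genome size can be taken as 6 Gb divided by the eight genome equivalents reported, and the function name `fold_coverage` is hypothetical.

```python
def fold_coverage(n_reads: int, read_length: int, genome_size: float) -> float:
    """Average fold coverage = total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


# Assumption: 6 Gb is described as eight genome equivalents,
# so one genome equivalent is taken to be 6e9 / 8 = 7.5e8 bases.
genome_size = 6e9 / 8

# 3 million 50-base reads per accession, as stated in the conclusions.
coverage = fold_coverage(3_000_000, 50, genome_size)
print(f"{coverage:.2f}x average coverage")
```

    Because a restriction-site library concentrates reads at a subset of sites, the depth at sampled sites would be expected to exceed this genome-wide average; the figure is a rough lower bound rather than a per-site depth.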

  8. Remarkable similarity in genome nucleotide sequences between the Schwarz FF-8 and AIK-C measles virus vaccine strains and apparent nucleotide differences in the phosphoprotein gene.

    Science.gov (United States)

    Ito, Chie; Ohgimoto, Shinji; Kato, Seiichi; Sharma, Luna Bhatta; Ayata, Minoru; Komase, Katsuhiro; Takeuchi, Kaoru; Ihara, Toshiaki; Ogura, Hisashi

    2011-07-01

    The Schwarz FF-8 (FF-8) and AIK-C measles virus vaccine strains are currently used for vaccination in Japan. Here, the complete genome nucleotide sequence of the FF-8 strain has been determined and its genome sequence found to be remarkably similar to that of the AIK-C strain. These two strains are differentiated only by two nucleotide differences in the phosphoprotein gene. Since the FF-8 strain does not possess the amino acid substitutions in the phospho- and fusion proteins which are responsible for the temperature-sensitivity and small syncytium formation phenotypes of the AIK-C strain, respectively, other unidentified common mechanisms likely attenuate both the FF-8 and AIK-C strains.

  9. Recombination frequency in plasmid DNA containing direct repeats--predictive correlation with repeat and intervening sequence length.

    Science.gov (United States)

    Oliveira, Pedro H; Lemos, Francisco; Monteiro, Gabriel A; Prazeres, Duarte M F

    2008-09-01

    In this study, a simple non-linear mathematical function is proposed to accurately predict recombination frequencies in bacterial plasmid DNA harbouring directly repeated sequences. The mathematical function, which was developed on the basis of published data on deletion-formation in multicopy plasmids containing direct-repeats (14-856 bp) and intervening sequences (0-3872 bp), also accounts for the strain genotype in terms of its recA function. A bootstrap resampling technique was used to estimate confidence intervals for the correlation parameters. More than 92% of the predicted values were found to be within a pre-established +/-5-fold interval of deviation from experimental data. The correlation does not only provide a way to predict, with good accuracy, the recombination frequency, but also opens the way to improve insight into these processes.

  10. The complete nucleotide sequence and its organization of the genome of Barley yellow dwarf virus-GAV

    Institute of Scientific and Technical Information of China (English)

    JIN; Zhibo; WANG; Xifeng; CHANG; Shengjun; ZHOU; Guanghe

    2004-01-01

    The complete nucleotide sequence of genomic RNA of BYDV-GAV was determined. It comprised 5685 nucleotides and contained six open reading frames and four un-translated regions. The size and organization of BYDV-GAV genome were similar to those of BYDV PAV-aus. The nucleotide and deduced amino acid sequences of the six ORFs were aligned and compared with those of other luteoviruses. The results showed that there was a high degree of identity between BYDV-GAV and MAV-PS1 in all ORFs except ORF5 and ORF6, which had only 87.4% and 70.2% identities respectively. The reported genomic nucleotide sequence of MAV was shorter than that of BYDV-GAV, but the comparison of the genomic nucleotide sequences for MAV-PS1 and GAV showed 90.4% sequence identity for the same region of the genome. According to the level of sequence similarities, BYDV-GAV should be closely related to BYDV-MAV.
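
    Identity figures such as those quoted above are computed over aligned sequences. The following is a minimal sketch of one common definition (matches divided by aligned, non-gap columns); the abstract does not say which definition or alignment tool was used, so the function `percent_identity` and its gap handling are assumptions for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gap columns are excluded from the denominator
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment: five gap-free columns, four of which match.
identity = percent_identity("ACGT-A", "ACCTGA")
print(f"{identity:.1f}% identity")
```

    Counting gap columns as mismatches instead would lower the reported value, which is one reason identity percentages from different studies are not always directly comparable.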

  11. Sequencing analysis of the spinal bulbar muscular atrophy CAG expansion reveals absence of repeat interruptions.

    Science.gov (United States)

    Fratta, Pietro; Collins, Toby; Pemble, Sally; Nethisinghe, Suran; Devoy, Anny; Giunti, Paola; Sweeney, Mary G; Hanna, Michael G; Fisher, Elizabeth M C

    2014-02-01

    Trinucleotide repeat disorders are a heterogeneous group of diseases caused by the expansion, beyond a pathogenic threshold, of unstable DNA tracts in different genes. Sequence interruptions in the repeats have been described in the majority of these disorders and may influence disease phenotype and heritability. Spinal bulbar muscular atrophy (SBMA) is a motor neuron disease caused by a CAG trinucleotide expansion in the androgen receptor (AR) gene. Diagnostic testing and previous research have relied on fragment analysis polymerase chain reaction to determine the AR CAG repeat size, and have therefore not been able to assess the presence of interruptions. We here report a sequencing study of the AR CAG repeat in a cohort of SBMA patients and control subjects in the United Kingdom. We found no repeat interruptions to be present, and we describe differences between sequencing and traditional sizing methods.
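
    The distinction drawn above between sizing a repeat and sequencing it can be illustrated on a sequenced read: once the bases are known, each repeat unit can be checked for interruptions. The sketch below is not the authors' pipeline; the function `analyse_cag_repeat` and its rule for accepting a single non-CAG codon inside the tract are hypothetical choices.

```python
import re


def analyse_cag_repeat(seq: str):
    """Locate the first CAG tract and report its length and any interruptions.

    Assumption: a non-CAG codon counts as an interruption (and the tract
    continues) only if the codon immediately after it is CAG again.
    """
    seq = seq.upper()
    first = re.search(r"(?:CAG){2,}", seq)
    if first is None:
        return None
    start = first.start()
    # Read codons in the frame set by the first CAG unit.
    codons = [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]
    tract = []
    for idx, codon in enumerate(codons):
        if codon == "CAG":
            tract.append(codon)
        elif idx + 1 < len(codons) and codons[idx + 1] == "CAG":
            tract.append(codon)  # single-codon interruption inside the tract
        else:
            break
    interruptions = [(i, c) for i, c in enumerate(tract) if c != "CAG"]
    return {"start": start, "n_units": len(tract), "interruptions": interruptions}


pure = analyse_cag_repeat("TTG" + "CAG" * 5 + "GAA")
interrupted = analyse_cag_repeat("CAG" * 3 + "CAA" + "CAG" * 2 + "TGA")
print(pure)
print(interrupted)
```

    A fragment-length assay would report both toy tracts only by size, whereas the sequence-based check also reports the CAA unit in the second one.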

  12. Nucleotide sequence analyses of the MRP1 gene in four populations suggest negative selection on its coding region

    Directory of Open Access Journals (Sweden)

    Ryan Stephen

    2006-05-01

    Full Text Available Abstract Background The MRP1 gene encodes the 190 kDa multidrug resistance-associated protein 1 (MRP1/ABCC1) and effluxes diverse drugs and xenobiotics. Sequence variations within this gene might account for differences in drug response in different individuals. To facilitate association studies of this gene with diseases and/or drug response, exons and flanking introns of MRP1 were screened for polymorphisms in 142 DNA samples from four different populations. Results Seventy-one polymorphisms, including 60 biallelic single nucleotide polymorphisms (SNPs), ten insertions/deletions (indel) and one short tandem repeat (STR) were identified. Thirty-four of these polymorphisms have not been previously reported. Interestingly, the STR polymorphism at the 5' untranslated region (5'UTR) occurs at high but different frequencies in the different populations. Frequencies of common polymorphisms in our populations were comparable to those of similar populations in HAPMAP or Perlegen. Nucleotide diversity indices indicated that the coding region of MRP1 may have undergone negative selection or recent population expansion. SNPs E10/1299 G>T (R433S) and E16/2012 G>T (G671V), which occur at low frequency in only one or two of four populations examined, were predicted to be functionally deleterious and hence are likely to be under negative selection. Conclusion Through in silico approaches, we identified two rare SNPs that are potentially negatively selected. These SNPs may be useful for studies associating this gene with rare events including adverse drug reactions.

  13. Construction of libraries enriched for sequence repeats and jumping clones, and hybridization selection for region-specific markers

    Energy Technology Data Exchange (ETDEWEB)

    Kandpal, R.P.; Kandpal, G.; Weissman, S.M. (Yale Univ. School of Medicine, New Haven, CT (United States))

    1994-01-04

    The authors describe a simple and rapid method for constructing small-insert genomic libraries highly enriched for dimeric, trimeric, and tetrameric nucleotide repeat motifs. The approach involves use of DNA inserts recovered by PCR amplification of a small-insert sonicated genomic phage library or by a single-primer PCR amplification of Mbo I-digested and adaptor-ligated genomic DNA. The genomic DNA inserts are heat denatured and hybridized to a biotinylated oligonucleotide. The biotinylated hybrids are retained on a Vectrex-avidin matrix and eluted specifically. The eluate is PCR amplified and cloned. More than 90% of the clones in a library enriched for (CA)n microsatellites with this approach had inserts containing CA repeats. They have also used this protocol for enrichment of (CAG)n and (AGAT)n sequence repeats and for Not I jumping clones. They have used the enriched libraries with an adaptation of the cDNA selection method to enrich for repeat motifs encoded in yeast artificial chromosomes.

  14. Complete nucleotide sequence of a new satellite RNA associated with cucumber mosaic virus inducing tomato necrosis

    Institute of Scientific and Technical Information of China (English)

    程宁辉; 方荣祥; 濮祖芹; 方中达

    1997-01-01

    A new strain (TN strain) of cucumber mosaic virus (CMV) was isolated from tomato plants with necrotic symptoms and proved to carry a necrogenic satellite RNA (TN-Sat RNA). Double-strand cDNA of the TN-Sat RNA was synthesized by reverse transcription and polymerase chain reaction using primers designed according to the conserved terminal sequences of known CMV satellite RNAs. Sequence analysis indicated that the TN-Sat RNA consisted of 390 nucleotides (nt). Comparison of the sequence of the TN-Sat RNA with those of other CMV satellite RNAs revealed four homologous regions (I. 1-81 nt; II. 216-261 nt; III. 278-338 nt; IV. 349-390 nt) and one hypervariable domain in the region of 82-215 nt. Moreover, the TN-Sat RNA contained a characteristic necrogenic consensus sequence at the 3' end (339-367 nt) as reported in the known necrosis-inducing CMV satellite RNAs.

  15. Analysing grouping of nucleotides in DNA sequences using lumped processes constructed from Markov chains.

    Science.gov (United States)

    Guédon, Yann; d'Aubenton-Carafa, Yves; Thermes, Claude

    2006-03-01

    The most commonly used models for analysing local dependencies in DNA sequences are (high-order) Markov chains. Incorporating knowledge about the possible grouping of the nucleotides makes it possible to define dedicated sub-classes of Markov chains. The problem of formulating lumpability hypotheses for a Markov chain is therefore addressed. In the classical approach to lumpability, this problem can be formulated as the determination of an appropriate state space (smaller than the original state space) such that the lumped chain defined on this state space retains the Markov property. We propose a different perspective on lumpability where the state space is fixed and the partitioning of this state space is represented by a one-to-many probabilistic function within a two-level stochastic process. Three nested classes of lumped processes can be defined in this way as sub-classes of first-order Markov chains. These lumped processes enable parsimonious reparameterizations of Markov chains that help to reveal relevant partitions of the state space. Characterizations of the lumped processes on the original transition probability matrix are derived. Different model selection methods relying either on hypothesis testing or on penalized log-likelihood criteria are presented as well as extensions to lumped processes constructed from high-order Markov chains. The relevance of the proposed approach to lumpability is illustrated by the analysis of DNA sequences. In particular, the use of lumped processes makes it possible to highlight differences between intronic sequences and gene untranslated region sequences.
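
    The classical notion of lumpability mentioned above can be checked directly on a transition matrix: a partition is (strongly) lumpable if, for every pair of blocks, all states in the source block have the same total probability of moving into the target block. The sketch below illustrates only that classical test, not the two-level lumped processes the authors propose; the function `lumped_matrix` and the toy matrix are made up for illustration.

```python
def lumped_matrix(P, states, partition, tol=1e-9):
    """Return the lumped transition matrix, or None if the partition is not lumpable.

    P         : row-stochastic matrix as a list of lists, indexed like `states`
    partition : sequence of blocks, each a sequence of state labels
    """
    index = {s: i for i, s in enumerate(states)}
    lumped = []
    for block in partition:
        row = []
        for target in partition:
            # Probability of jumping into `target` from each state of `block`.
            sums = [sum(P[index[s]][index[t]] for t in target) for s in block]
            if max(sums) - min(sums) > tol:
                return None  # states in the block disagree: not lumpable
            row.append(sums[0])
        lumped.append(row)
    return lumped


states = "ACGT"
# Hypothetical first-order transition matrix (rows and columns in A, C, G, T order).
P = [
    [0.4, 0.1, 0.3, 0.2],  # from A
    [0.1, 0.3, 0.3, 0.3],  # from C
    [0.2, 0.2, 0.5, 0.1],  # from G
    [0.3, 0.5, 0.1, 0.1],  # from T
]

purine_pyrimidine = lumped_matrix(P, states, [("A", "G"), ("C", "T")])
other_grouping = lumped_matrix(P, states, [("A", "C"), ("G", "T")])
print(purine_pyrimidine)
print(other_grouping)
```

    In this toy matrix the purine/pyrimidine grouping passes the test and collapses to a two-state chain, while the {A, C} / {G, T} grouping does not.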

  16. The Complete Nucleotide Sequence and Biotype Variability of Papaya leaf distortion mosaic virus.

    Science.gov (United States)

    Maoka, Tetsuo; Hataya, Tatsuji

    2005-02-01

    ABSTRACT The complete nucleotide sequence of the genome of Papaya leaf distortion mosaic virus (PLDMV) was determined. The viral RNA genome of strain LDM (leaf distortion mosaic) comprised 10,153 nucleotides, excluding the poly(A) tail, and contained one long open reading frame encoding a polyprotein of 3,269 amino acids (molecular weight 373,347). The polyprotein contained nine putative proteolytic cleavage sites and some motifs conserved in other potyviral polyproteins with 44 to 50% identities, indicating that PLDMV is a distinct species in the genus Potyvirus. Like the W biotype of Papaya ringspot virus (PRSV), the non-papaya-infecting biotype of PLDMV (PLDMV-C) was found in plants of the family Cucurbitaceae. The coat protein (CP) sequence of PLDMV-C in naturally infected Trichosanthes bracteata was compared with those of three strains of the P biotype (PLDMV-P): LDM and two additional strains, M (mosaic) and YM (yellow mosaic), which are biologically different from each other. The CP sequences of the three strains of PLDMV-P share high identities of 95 to 97%, while they share lower identities of 88 to 89% with that of PLDMV-C. Significant changes in hydrophobicity and a deletion of two amino acids at the N-terminal region of the CP of PLDMV-C were observed. The finding of two biotypes of PLDMV implies the possibility that the papaya-infecting biotype evolved from the Cucurbitaceae-infecting potyvirus, as has been previously suggested for PRSV. In addition, a similar evolutionary event acquiring infectivity to papaya may arise frequently in viruses in the family Cucurbitaceae.

  17. Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey

    NARCIS (Netherlands)

    Kerstens, H.H.D.; Crooijmans, R.P.M.A.; Veenendaal, A.; Dibbits, B.W.; Chin-A-Woeng, T.F.C.; Dunnen, den J.T.; Groenen, M.A.M.

    2009-01-01

    Background - The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set-up an analysis pipeline based on a

  18. Applications of inter simple sequence repeat (ISSR) rDNA in ...

    African Journals Online (AJOL)

    Applications of inter simple sequence repeat (ISSR) rDNA in detecting ... and phylogenetic relationships between Lymnaea natalensis collected from Giza, ... in water samples of all tested governorates with different significant differences.

  19. Cloning, sequencing and identification of single nucleotide polymorphisms of partial sequence on the porcine CACNA1S gene

    Institute of Scientific and Technical Information of China (English)

    FANG XiaoMin; XU NingYing; REN ShouWen

    2008-01-01

    The CACNA1S gene encodes the α1 subunit of the calcium channel. Mutation of the CACNA1S gene can cause hypokalemic periodic paralysis (HypoKPP) and malignant hyperthermia syndrome (MHS) in human beings. Research on CACNA1S to date has focused mainly on humans and model animals, and rarely on livestock and poultry. In this study, Yorkshire pigs (23), Pietrain pigs (30), Jinhua pigs (115) and second-generation crossbreds of Jinhua and Pietrain (126) were used. Primers were designed according to the sequence of the human CACNA1S gene and PCR was carried out using pig genomic DNA. PCR products were sequenced and compared with the human sequence, and then single nucleotide polymorphisms (SNPs) were investigated by PCR-SSCP, while PCR-RFLP tests were performed to validate the mutations. Results indicated: (1) the 5211 bp DNA fragments of porcine CACNA1S gene were acquired (GenBank accession number: DQ767693) and the identity of the exon region was 82.6% between human and pig; (2) fifty-seven mutations were found within the cloned sequences, among which 24 were in exon region; (3) the results of PCR-RFLP were in accordance with that of PCR-SSCP. According to the EST of porcine CACNA1S gene published in GenBank (Bx914582, Bx666997), 8 of the 11 SNPs identified in the present study were consistent with the base difference between two EST fragments.

  20. RePS: a sequence assembler that masks exact repeats identified from the shotgun data

    DEFF Research Database (Denmark)

    Wang, Jun; Wong, Gane Ka-Shu; Ni, Peixiang;

    2002-01-01

    We describe a sequence assembler, RePS (repeat-masked Phrap with scaffolding), that explicitly identifies exact 20mer repeats from the shotgun data and removes them prior to the assembly. The established software is used to compute meaningful error probabilities for each base. Clone-end-pairing i...
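
    The idea of identifying exact k-mer repeats from the shotgun reads themselves and masking them before assembly can be sketched as follows. This is not RePS code: the functions `count_kmers` and `mask_repeats`, the occurrence threshold, and the toy value k = 4 (the abstract uses 20-mers) are assumptions for illustration, and reverse-complement strands are ignored for brevity.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every exact k-mer across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def mask_repeats(read, counts, k, threshold):
    """Replace with 'N' every base covered by a k-mer seen more than `threshold` times."""
    masked = list(read)
    for i in range(len(read) - k + 1):
        # Look up k-mers in the original read, not in the partially masked copy.
        if counts[read[i:i + k]] > threshold:
            masked[i:i + k] = "N" * k
    return "".join(masked)


reads = ["ACGTACGTTT", "GGACGTACGA", "TTTTGGGCCA"]
k = 4  # toy value; the abstract describes exact 20mer repeats
counts = count_kmers(reads, k)
masked_reads = [mask_repeats(r, counts, k, threshold=2) for r in reads]
for original, masked in zip(reads, masked_reads):
    print(original, "->", masked)
```

    In practice the threshold would need to be set relative to sequencing depth, since at high coverage even unique k-mers appear many times in the read set.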

  1. Kearns-Sayre syndrome case presenting a mitochondrial DNA deletion with unusual direct repeats and a rudimentary RNAse mitochondria ribonucleotide processing target sequence

    Energy Technology Data Exchange (ETDEWEB)

    Remes, A.M.; Hassinen, I.E. (Univ. of Oulu (Finland)); Peuhkurinen, K.J.; Herva, R.; Majamaa, K. (Oulu Univ. Central Hospital (Finland))

    1993-04-01

    A mitochondrial DNA deletion in a case of Kearns-Sayre syndrome is described. The deletion is bracketed by direct repeats that were unusual in that one of them was located 11-13 nucleotides from the deletion seam and both were conserved, which should not occur in slip replication or illegitimate elongation. The deleted region was demarcated on the deletion side by sequences that could be predicted to form hairpin structures. The 5'-side of the deletion was flanked by a sequence homologous to a 9-nucleotide piece of the conserved sequence block II of the D-loop. This arrangement around the deletion in Kearns-Sayre syndrome bears some resemblance to the arrangement in the Pearson marrow-pancreas syndrome described by A. Rotig et al. (1991, Genomics 10: 502-504). 10 refs., 1 fig.

  2. Nucleotide sequences related to the transforming gene of avian sarcoma virus are present in DNA of uninfected vertebrates.

    Science.gov (United States)

    Spector, D H; Varmus, H E; Bishop, J M

    1978-09-01

    We have detected nucleotide sequences related to the transforming gene of avian sarcoma virus (ASV) in the DNA of uninfected vertebrates. Purified radioactive DNA (cDNAsarc) complementary to most or all of the gene (src) required for transformation of fibroblasts by ASV was annealed with DNA from a variety of normal species. Under conditions that facilitate pairing of partially matched nucleotide sequences (1.5 M NaCl, 59 degrees), cDNAsarc formed duplexes with chicken, human, calf, mouse, and salmon DNA but not with DNA from sea urchin, Drosophila, or Escherichia coli. The kinetics of duplex formation indicated that cDNAsarc was reacting with nucleotide sequences present in a single copy or at most a few copies per cell. In contrast to the preceding findings, nucleotide sequences complementary to the remainder of the ASV genome were observed only in chicken DNA. Thermal denaturation studies of the duplexes formed with cDNAsarc indicated a high degree of conservation of the nucleotide sequences related to src in vertebrate DNAs; the reductions in melting temperature suggested about 3-4% mismatching of cDNAsarc with chicken DNA and 8-10% mismatching of cDNAsarc with the other vertebrate DNAs.

  3. Characterization of a highly repeated DNA sequence family in five species of the genus Eulemur.

    Science.gov (United States)

    Ventura, M; Boniotto, M; Cardone, M F; Fulizio, L; Archidiacono, N; Rocchi, M; Crovella, S

    2001-09-19

    The karyotypes of Eulemur species exhibit a high degree of variation, as a consequence of the Robertsonian fusion and/or centromere fission. Centromeric and pericentromeric heterochromatin of eulemurs is constituted by highly repeated DNA sequences (including some telomeric TTAGGG repeats) which have so far been investigated and used for the study of the systematic relationships of the different species of the genus Eulemur. In our study, we have cloned a set of repetitive pericentromeric sequences of five Eulemur species: E. fulvus fulvus (EFU), E. mongoz (EMO), E. macaco (EMA), E. rubriventer (ERU), and E. coronatus (ECO). We have characterized these clones by sequence comparison and by comparative fluorescence in situ hybridization analysis in EMA and EFU. Our results showed a high degree of sequence similarity among Eulemur species, indicating a strong conservation, within the five species, of these pericentromeric highly repeated DNA sequences.

  4. Repeat Associated Non-AUG Translation (RAN Translation) Dependent on Sequence Downstream of the ATXN2 CAG Repeat.

    Directory of Open Access Journals (Sweden)

    Daniel R Scoles

    Full Text Available Spinocerebellar ataxia type 2 (SCA2) is a progressive autosomal dominant disorder caused by the expansion of a CAG tract in the ATXN2 gene. The SCA2 disease phenotype is characterized by cerebellar atrophy, gait ataxia, and slow saccades. ATXN2 mutation causes gains of toxic and normal functions of the ATXN2 gene product, ataxin-2, and abnormally slow Purkinje cell firing frequency. Previously we investigated features of ATXN2 controlling expression and noted expression differences for ATXN2 constructs with varying CAG lengths, suggestive of repeat associated non-AUG translation (RAN translation). To determine whether RAN translation occurs for ATXN2 we assembled various ATXN2 constructs with ATXN2 tagged by luciferase, HA or FLAG tags, driven by the CMV promoter or the ATXN2 promoter. Luciferase expression from ATXN2-luciferase constructs lacking the ATXN2 start codon was weak vs AUG translation, regardless of promoter type, and did not increase with longer CAG repeat lengths. RAN translation was detected on western blots by the anti-polyglutamine antibody 1C2 for constructs driven by the CMV promoter but not the ATXN2 promoter, and was weaker than AUG translation. Strong RAN translation was also observed when driving the ATXN2 sequence with the CMV promoter with ATXN2 sequence downstream of the CAG repeat truncated to 18 bp in the polyglutamine frame but not in the polyserine or polyalanine frames. Our data demonstrate that ATXN2 RAN translation is weak compared to AUG translation and is dependent on ATXN2 sequences flanking the CAG repeat.

  5. The complete nucleotide sequence of a new bipartite begomovirus from Brazil infecting Abutilon.

    Science.gov (United States)

    Paprotka, T; Metzler, V; Jeske, H

    2010-05-01

    The complete nucleotide sequence of Abutilon mosaic Brazil virus (AbMBV), a new bipartite begomovirus from Bahia, Brazil, is described and analyzed phylogenetically. Its DNA A is most closely related to those of Sida-infecting begomoviruses from Brazil and forms a phylogenetic cluster with pepper- and Euphorbia-infecting begomoviruses from Central America. The DNA B component forms a cluster with different Sida- and okra-infecting begomoviruses from Brazil. Both components are distinct from those of the classical Abutilon mosaic virus originating from the West Indies. AbMBV is transmissible to Nicotiana benthamiana and Malva parviflora by biolistics of rolling-circle amplification products and induces characteristic mosaic and vein-clearing symptoms in M. parviflora.

  6. The nucleotide sequence of histidine tRNA gamma of Drosophila melanogaster.

    OpenAIRE

    Altwegg, M; Kubli, E

    1980-01-01

    The nucleotide sequence of D. melanogaster histidine tRNA gamma was determined to be: pG-G-C-C-G-U-G-A-U-C-G-U-C-psi-A-G-D-G-G-D-D-A-G-G-A-C-C-C-C-A-C-G-psi-U-G-U-G-m1G-C-C-G-U-G-G-U-A-A-C-C-m5C-A-G-G-U-psi-C-G-m1A-A-U-C-C-U-G-G-U-C-A-C-G-G-m5C-A-C-C-AOH. An additional unpaired G is found at the 5' end, and the T in the TpsiC loop is replaced by a U.

  7. High-throughput nucleotide sequence analysis of diverse bacterial communities in leachates of decomposing pig carcasses

    Directory of Open Access Journals (Sweden)

    Seung Hak Yang

    2015-09-01

    The leachate generated by the decomposition of animal carcasses has been implicated as an environmental contaminant surrounding the burial site. High-throughput nucleotide sequencing was conducted to investigate the bacterial communities in leachates from the decomposition of pig carcasses. We acquired 51,230 reads from six different samples (1, 2, 3, 4, 6 and 14 week-old carcasses) and found that sequences representing the phylum Firmicutes predominated. The diversity of bacterial 16S rRNA gene sequences in the leachate was the highest at 6 weeks, in contrast to those at 2 and 14 weeks. The relative abundance of Firmicutes was reduced, while the proportion of Bacteroidetes and Proteobacteria increased from 3–6 weeks. The representation of phyla was restored after 14 weeks. However, the community structures between the samples taken at 1–2 and 14 weeks differed at the bacterial classification level. The trend in pH was similar to the changes seen in bacterial communities, indicating that the pH of the leachate could be related to the shift in the microbial community. The results indicate that the composition of bacterial communities in leachates of decomposing pig carcasses shifted continuously during the study period and might be influenced by the burial site.

  8. Complete nucleotide sequence of watermelon chlorotic stunt virus originating from Oman.

    Science.gov (United States)

    Khan, Akhtar J; Akhtar, Sohail; Briddon, Rob W; Ammara, Um; Al-Matrooshi, Abdulrahman M; Mansoor, Shahid

    2012-07-01

    Watermelon chlorotic stunt virus (WmCSV) is a bipartite begomovirus (genus Begomovirus, family Geminiviridae) that causes economic losses to cucurbits, particularly watermelon, across the Middle East and North Africa. Recently squash (Cucurbita moschata) grown in an experimental field in Oman was found to display symptoms such as leaf curling, yellowing and stunting, typical of a begomovirus infection. Sequence analysis of the virus isolated from squash showed 97.6-99.9% nucleotide sequence identity to previously described WmCSV isolates for the DNA A component and 93-98% identity for the DNA B component. Agrobacterium-mediated inoculation to Nicotiana benthamiana resulted in the development of symptoms fifteen days post inoculation. This is the first bipartite begomovirus identified in Oman. Overall the Oman isolate showed the highest levels of sequence identity to a WmCSV isolate originating from Iran, which was confirmed by phylogenetic analysis. This suggests that WmCSV present in Oman has been introduced from Iran. The significance of this finding is discussed.

  9. Complete nucleotide sequence and analysis of two conjugative broad host range plasmids from a marine microbial biofilm.

    Directory of Open Access Journals (Sweden)

    Peter Norberg

    The complete nucleotide sequence of plasmids pMCBF1 and pMCBF6 was determined and analyzed. pMCBF1 and pMCBF6 form a novel clade within the IncP-1 plasmid family designated IncP-1 ς. The plasmids were exogenously isolated earlier from a marine biofilm. pMCBF1 (62 689 base pairs; bp) and pMCBF6 (66 729 bp) have identical backbones, but differ in their mercury resistance transposons. pMCBF1 carries Tn5053 and pMCBF6 carries Tn5058. Both are flanked by 5 bp direct repeats, typical of replicative transposition. Both insertions are in the vicinity of a resolvase gene in the backbone, supporting the idea that both transposons are "res-site hunters" that preferably insert close to and use external resolvase functions. The similarity of the backbones indicates recent insertion of the two transposons and the ongoing dynamics of plasmid evolution in marine biofilms. Both plasmids also carry the insertion sequence ISPst1, albeit without flanking repeats. ISPst1 is located in an unusual site within the control region of the plasmid. In contrast to most known IncP-1 plasmids the pMCBF1/pMCBF6 backbone has no insert between the replication initiation gene (trfA) and the vegetative replication origin (oriV). One pMCBF1/pMCBF6 block of about 2.5 kilobases (kb) has no similarity with known sequences in the databases. Furthermore, insertion of three genes with similarity to the multidrug efflux pump operon mexEF and a gene from the NodT family of the tripartite multi-drug resistance-nodulation-division (RND) system in Pseudomonas aeruginosa was found. They do not seem to confer antibiotic resistance to the hosts of pMCBF1/pMCBF6, but the presence of RND on promiscuous plasmids may have serious implications for the spread of antibiotic multi-resistance.

  10. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    Science.gov (United States)

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.

  11. Nucleotide sequence of the capsid protein gene and 3' non-coding region of papaya mosaic virus RNA.

    Science.gov (United States)

    Abouhaidar, M G

    1988-01-01

    The nucleotide sequences of cDNA clones corresponding to the 3' OH end of papaya mosaic virus RNA have been determined. The 3'-terminal sequence obtained was 900 nucleotides in length, excluding the poly(A) tail, and contained an open reading frame capable of giving rise to a protein of 214 amino acid residues with an Mr of 22930. This protein was identified as the viral capsid protein. The 3' non-coding region of PMV genome RNA was about 121 nucleotides long [excluding the poly(A) tail] and homologous to the complementary sequence of the non-coding region at the 5' end of PMV RNA. A long open reading frame was also found in the predicted 5' end region of the negative strand.

  12. Development and characterization of simple sequence repeats for Bipolaris sorokiniana and cross transferability to related species

    Science.gov (United States)

    Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n=384) harbored various SSR motifs. After eliminating the redundant seq...

  13. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array.

    Science.gov (United States)

    Fuller, Carl W; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J; Kasianowicz, John J; Davis, Randy; Roever, Stefan; Church, George M; Ju, Jingyue

    2016-05-10

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5'-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods.

  14. Human secreted carbonic anhydrase: cDNA cloning, nucleotide sequence, and hybridization histochemistry

    Energy Technology Data Exchange (ETDEWEB)

    Aldred, P.; Fu, Ping; Barrett, G.; Penschow, J.D.; Wright, R.D.; Coghlan, J.P.; Fernley, R.T. (The Howard Florey Institute of Experimental Physiology and Medicine, Parkville, Victoria (Australia))

    1991-01-01

    Complementary DNA clones coding for the human secreted carbonic anhydrase isozyme (CA VI) have been isolated and their nucleotide sequences determined. These clones identify a 1.45-kb mRNA that is present in high levels in parotid submandibular salivary glands but absent in other tissues such as the sublingual gland, kidney, liver, and prostate gland. Hybridization histochemistry of human salivary glands shows mRNA for CA VI located in the acinar cells of these glands. The cDNA clones encode a protein of 308 amino acids that includes a 17 amino acid leader sequence typical of secreted proteins. The mature protein has 291 amino acids compared to 259 or 260 for the cytoplasmic isozymes, with most of the extra amino acids present as a carboxyl terminal extension. In comparison, sheep CA VI has a 45 amino acid extension. Overall the human CA VI protein has a sequence identity of 35% with human CA II, while residues involved in the active site of the enzymes have been conserved. The human and sheep secreted carbonic anhydrases have a sequence identity of 72%. This includes the two cysteine residues that are known to be involved in an intramolecular disulfide bond in the sheep CA VI. The enzyme is known to be glycosylated and three potential N-glycosylation sites (Asn-X-Thr/Ser) have been identified. Two of these are known to be glycosylated in sheep CA VI. Southern analysis of human DNA indicates that there is only one gene coding for CA VI.

  15. Analysis of Complete Nucleotide Sequences of 12 Gossypium Chloroplast Genomes: Origin and Evolution of Allotetraploids

    Science.gov (United States)

    Xu, Qin; Xiong, Guanjun; Li, Pengbo; He, Fei; Huang, Yi; Wang, Kunbo; Li, Zhaohu; Hua, Jinping

    2012-01-01

    Gossypium is the maternal source of extant allotetraploid species and allotetraploids have a monophyletic origin. G. hirsutum AD1 lineages have experienced more sequence variations than other allotetraploids in intergenic regions. The available complete nucleotide sequences of 12 Gossypium chloroplast genomes should facilitate studies to uncover the molecular mechanisms of compartmental co-evolution and speciation of Gossypium allotetraploids. PMID:22876273

  16. Nucleotide sequence of an immediate-early frog virus 3 gene.

    Science.gov (United States)

    Willis, D; Foglesong, D; Granoff, A

    1984-12-01

    We have used "gene walking" with synthetic oligonucleotides and M13 dideoxynucleotide sequencing techniques to obtain the complete coding and flanking sequences of the gene encoding a major immediate-early RNA (molecular weight, 169,000) of frog virus 3. R-loop mapping of the cloned XbaI K fragment of frog virus 3 DNA with immediate-early RNA from infected cells showed that an RNA of approximately 500 to 600 nucleotides (the right size to code for the immediate-early viral 18-kilodalton protein of unknown function) hybridized to a region within 100 base pairs of one end of the XbaI K fragment; no evidence for splicing was observed in the electron microscope or by single-strand nuclease analysis. Further restriction mapping narrowed the location of the gene to the XbaI end of a 2-kilobase-pair XbaI-BglII fragment, which was bidirectionally subcloned into the bacteriophage pair mp10 and mp11 for sequencing. Mung bean nuclease mapping was used to identify both the 5' and the 3' ends of the mRNA. The 5' end mapped within an AT-rich region 19 base pairs upstream from two in-phase AUG start codons that were immediately followed by an open reading frame of 157 amino acids. Another AT-rich sequence was found at -29 base pairs from the 5' end of the mRNA start site; this sequence may function as a TATA box. The 3' end of the message displayed considerable microheterogeneity, but clearly terminated within a third AT-rich region 50 to 60 base pairs from the translation stop codon. The eucaryotic polyadenylic acid addition signal (AATAAA) was not present, a finding to be expected since frog virus 3 mRNA is not polyadenylated. Both the single-stranded mp10 clone of the XbaI-BglII fragment and a 15-base oligonucleotide complementary to the region flanking the two AUG translation start codons inhibited translation of the immediate-early 18-kilodalton protein in vitro, confirming the identity of the sequenced gene. As the regulatory sequences of this gene did not resemble those of

  17. Novel multiplex format of an extended multilocus variable-number-tandem-repeat analysis of Clostridium difficile correlates with tandem repeat sequence typing.

    Science.gov (United States)

    Jensen, Mie Birgitte Frid; Engberg, Jørgen; Larsson, Jonas T; Olsen, Katharina E P; Torpdahl, Mia

    2015-03-01

    Subtyping of Clostridium difficile is crucial for outbreak investigations. An extended multilocus variable-number tandem-repeat analysis (eMLVA) of 14 variable number tandem repeat (VNTR) loci was validated in multiplex format compatible with a routine typing laboratory and showed excellent concordance with tandem repeat sequence typing (TRST) and high discriminatory power.

  18. Cloning, sequencing and identification of single nucleotide polymorphisms of partial sequence on the porcine CACNA1S gene

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The CACNA1S gene encodes the α1 subunit of the calcium channel. Mutation of the CACNA1S gene can cause hypokalemic periodic paralysis (HypoKPP) and malignant hyperthermia syndrome (MHS) in human beings. Current research on CACNA1S has mainly been in humans and model animals, but rarely in livestock and poultry. In this study, Yorkshire pigs (23), Pietrain pigs (30), Jinhua pigs (115) and the second generation (126) of a Jinhua and Pietrain cross were used. Primers were designed according to the sequence of the human CACNA1S gene and PCR was carried out using pig genomic DNA. PCR products were sequenced and compared with the human sequence, and then single nucleotide polymorphisms (SNPs) were investigated by PCR-SSCP, while PCR-RFLP tests were performed to validate the mutations. Results indicated: (1) 5211 bp of DNA fragments of the porcine CACNA1S gene were acquired (GenBank accession number: DQ767693) and the identity of the exon region was 82.6% between human and pig; (2) fifty-seven mutations were found within the cloned sequences, among which 24 were in exon regions; (3) the results of PCR-RFLP were in accordance with those of PCR-SSCP. According to the ESTs of the porcine CACNA1S gene published in GenBank (Bx914582, Bx666997), 8 of the 11 SNPs identified in the present study were consistent with the base differences between the two EST fragments.

  19. Development of expressed sequence tag and expressed sequence tag–simple sequence repeat marker resources for Musa acuminata

    Science.gov (United States)

    Passos, Marco A. N.; de Oliveira Cruz, Viviane; Emediato, Flavia L.; de Camargo Teixeira, Cristiane; Souza, Manoel T.; Matsumoto, Takashi; Rennó Azevedo, Vânia C.; Ferreira, Claudia F.; Amorim, Edson P.; de Alencar Figueiredo, Lucio Flavio; Martins, Natalia F.; de Jesus Barbosa Cavalcante, Maria; Baurens, Franc-Christophe; da Silva, Orzenil Bonfim; Pappas, Georgios J.; Pignolet, Luc; Abadie, Catherine; Ciampi, Ana Y.; Piffanelli, Pietro; Miller, Robert N. G.

    2012-01-01

    Background and aims Banana (Musa acuminata) is a crop contributing to global food security. Many varieties lack resistance to biotic stresses, due to sterility and narrow genetic background. The objective of this study was to develop an expressed sequence tag (EST) database of transcripts expressed during compatible and incompatible banana–Mycosphaerella fijiensis (Mf) interactions. Black leaf streak disease (BLSD), caused by Mf, is a destructive disease of banana. Microsatellite markers were developed as a resource for crop improvement. Methodology cDNA libraries were constructed from in vitro-infected leaves from BLSD-resistant M. acuminata ssp. burmaniccoides Calcutta 4 (MAC4) and susceptible M. acuminata cv. Cavendish Grande Naine (MACV). Clones were 5′-end Sanger sequenced, ESTs assembled with TGICL and unigenes annotated using BLAST, Blast2GO and InterProScan. Mreps was used to screen for simple sequence repeats (SSRs), with markers evaluated for polymorphism using 20 diploid (AA) M. acuminata accessions contrasting in resistance to Mycosphaerella leaf spot diseases. Principal results A total of 9333 high-quality ESTs were obtained for MAC4 and 3964 for MACV, which assembled into 3995 unigenes. Of these, 2592 displayed homology to genes encoding proteins with known or putative function, and 266 to genes encoding proteins with unknown function. Gene ontology (GO) classification identified 543 GO terms, 2300 unigenes were assigned to EuKaryotic orthologous group categories and 312 mapped to Kyoto Encyclopedia of Genes and Genomes pathways. A total of 624 SSR loci were identified, with trinucleotide repeat motifs the most abundant in MAC4 (54.1 %) and MACV (57.6 %). Polymorphism across M. acuminata accessions was observed with 75 markers. Alleles per polymorphic locus ranged from 2 to 8, totalling 289. The polymorphism information content ranged from 0.08 to 0.81. Conclusions This EST collection offers a resource for studying functional genes, including

  20. Viral to metazoan marine plankton nucleotide sequences from the Tara Oceans expedition.

    Science.gov (United States)

    Alberti, Adriana; Poulain, Julie; Engelen, Stefan; Labadie, Karine; Romac, Sarah; Ferrera, Isabel; Albini, Guillaume; Aury, Jean-Marc; Belser, Caroline; Bertrand, Alexis; Cruaud, Corinne; Da Silva, Corinne; Dossat, Carole; Gavory, Frédérick; Gas, Shahinaz; Guy, Julie; Haquelle, Maud; Jacoby, E'krame; Jaillon, Olivier; Lemainque, Arnaud; Pelletier, Eric; Samson, Gaëlle; Wessner, Mark; Acinas, Silvia G; Royo-Llonch, Marta; Cornejo-Castillo, Francisco M; Logares, Ramiro; Fernández-Gómez, Beatriz; Bowler, Chris; Cochrane, Guy; Amid, Clara; Hoopen, Petra Ten; De Vargas, Colomban; Grimsley, Nigel; Desgranges, Elodie; Kandels-Lewis, Stefanie; Ogata, Hiroyuki; Poulton, Nicole; Sieracki, Michael E; Stepanauskas, Ramunas; Sullivan, Matthew B; Brum, Jennifer R; Duhaime, Melissa B; Poulos, Bonnie T; Hurwitz, Bonnie L; Pesant, Stéphane; Karsenti, Eric; Wincker, Patrick

    2017-08-01

    A unique collection of oceanic samples was gathered by the Tara Oceans expeditions (2009-2013), targeting plankton organisms ranging from viruses to metazoans, and providing rich environmental context measurements. Thanks to recent advances in the field of genomics, extensive sequencing has been performed for a deep genomic analysis of this huge collection of samples. A strategy based on different approaches, such as metabarcoding, metagenomics, single-cell genomics and metatranscriptomics, has been chosen for analysis of size-fractionated plankton communities. Here, we provide detailed procedures applied for genomic data generation, from nucleic acids extraction to sequence production, and we describe registries of genomics datasets available at the European Nucleotide Archive (ENA, www.ebi.ac.uk/ena). The association of these metadata to the experimental procedures applied for their generation will help the scientific community to access these data and facilitate their analysis. This paper complements other efforts to provide a full description of experiments and open science resources generated from the Tara Oceans project, further extending their value for the study of the world's planktonic ecosystems.

  1. Nucleotide sequence and phylogenetic analysis of a new potexvirus: Malva mosaic virus.

    Science.gov (United States)

    Côté, Fabien; Paré, Christine; Majeau, Nathalie; Bolduc, Marilène; Leblanc, Eric; Bergeron, Michel G; Bernardy, Michael G; Leclerc, Denis

    2008-01-01

    A filamentous virus isolated from Malva neglecta Wallr. (common mallow) and propagated in Chenopodium quinoa was grown, cloned and the complete nucleotide sequence was determined (GenBank accession # DQ660333). The genomic RNA is 6858 nt in length and contains five major open reading frames (ORFs). The genomic organization is similar to members and the viral encoded proteins shared homology with the group of the Potexvirus genus in the Flexiviridae family. Phylogenetic analysis revealed a close relationship with narcissus mosaic virus (NMV), scallion virus X (ScaVX) and, to a lesser extent, to Alstroemeria virus X (AlsVX) and pepino mosaic virus (PepMV). A novel putative pseudoknot structure is predicted in the 3'-UTR of a subgroup of potexviruses, including this newly described virus. The consensus GAAAA sequence is detected at the 5'-end of the genomic RNA and experimental data strongly suggest that this motif could be a distinctive hallmark of this genus. The name Malva mosaic virus is proposed.

  2. Complete Nucleotide Sequence Analysis of the Norovirus GII.4 Sydney Variant in South Korea

    Directory of Open Access Journals (Sweden)

    Ji-Sun Park

    2015-01-01

    Norovirus is the primary cause of acute gastroenteritis in individuals of all ages. In Australia, a new strain of norovirus (GII.4) was identified in March 2012, and this strain has spread rapidly around the world. In August 2012, this new GII.4 strain was identified in patients in South Korea. Therefore, to examine the characteristics of the epidemic norovirus GII.4 2012 variant in South Korea, we conducted KM272334 full-length genomic analysis. The genome of the gg-12-08-04 strain consisted of 7,558 bp and contained three open reading frame (ORF) composites throughout the whole genome: ORF1 (5,100 bp), ORF2 (1,623 bp), and ORF3 (807 bp). Phylogenetic analyses showed that gg-12-08-04 belonged to the GII.4 Sydney 2012 variant, sharing 98.92% nucleotide similarity with this variant strain. According to SimPlot analysis, the gg-12-08-04 strain was a recombinant strain with breakpoint at the ORF1/2 junction between Osaka 2007 and Apeldoorn 2008 strains. This study is the first report of the complete sequence of the GII.4 Sydney 2012 strain in South Korea. Therefore, this may represent the standard sequence of the norovirus GII.4 2012 variant in South Korea and could therefore be useful for the development of norovirus vaccines.

  3. Complete nucleotide sequence analysis of the norovirus GII.4 Sydney variant in South Korea.

    Science.gov (United States)

    Park, Ji-Sun; Lee, Sung-Geun; Jin, Ji-Young; Cho, Han-Gil; Jheong, Weon-Hwa; Paik, Soon-Young

    2015-01-01

    Norovirus is the primary cause of acute gastroenteritis in individuals of all ages. In Australia, a new strain of norovirus (GII.4) was identified in March 2012, and this strain has spread rapidly around the world. In August 2012, this new GII.4 strain was identified in patients in South Korea. Therefore, to examine the characteristics of the epidemic norovirus GII.4 2012 variant in South Korea, we conducted KM272334 full-length genomic analysis. The genome of the gg-12-08-04 strain consisted of 7,558 bp and contained three open reading frame (ORF) composites throughout the whole genome: ORF1 (5,100 bp), ORF2 (1,623 bp), and ORF3 (807 bp). Phylogenetic analyses showed that gg-12-08-04 belonged to the GII.4 Sydney 2012 variant, sharing 98.92% nucleotide similarity with this variant strain. According to SimPlot analysis, the gg-12-08-04 strain was a recombinant strain with breakpoint at the ORF1/2 junction between Osaka 2007 and Apeldoorn 2008 strains. This study is the first report of the complete sequence of the GII.4 Sydney 2012 strain in South Korea. Therefore, this may represent the standard sequence of the norovirus GII.4 2012 variant in South Korea and could therefore be useful for the development of norovirus vaccines.

  4. Complete nucleotide sequence of a Spanish isolate of Parietaria mottle virus infecting tomato.

    Science.gov (United States)

    Galipienso, Luis; Rubio, Luis; López, Luis; Soler, Salvador; Aramburu, José

    2009-10-01

    The genome of a Spanish isolate of Parietaria mottle virus (PMoV) obtained from tomato (strain PMoV-T) was completely sequenced. Protein motifs conserved for RNA viruses were identified: the p1 protein contained a methyltransferase domain in its N-terminal half and a triphosphatase/helicase domain in its C-terminal half; the p2 protein contained an RNA polymerase domain; the 3a protein contained an RNA-binding domain with α-helix and β-sheet secondary structures. In addition, stem-loop structures with potential capacity for protein interactions were predicted in the untranslated terminal regions. Comparison with the other sequenced PMoV isolate showed nucleotide identities of 93, 90, and 93% for genomic RNAs 1, 2 and 3, respectively, and amino acid identities ranging from 88 to 97% for the different proteins. A cytosine deletion was detected at position 1,366 of RNA 3, involving a start codon for the coat protein (CP) gene different from the other PMoV isolate, resulting in a CP 16 amino acids shorter. Comparison of synonymous and nonsynonymous mutations revealed different selective constraints along the genome.

  5. BIND – An algorithm for loss-less compression of nucleotide sequence data

    Indian Academy of Sciences (India)

    Tungadri Bose; Monzoorul Haque Mohammed; Anirban Dutta; Sharmila S Mande

    2012-09-01

    Recent advances in DNA sequencing technologies have enabled the current generation of life science researchers to probe deeper into the genomic blueprint. The amount of data generated by these technologies has been increasing exponentially since the last decade. Storage, archival and dissemination of such huge data sets require efficient solutions, both from the hardware as well as software perspective. The present paper describes BIND – an algorithm specialized for compressing nucleotide sequence data. By adopting a unique ‘block-length’ encoding for representing binary data (as a key step), BIND achieves significant compression gains as compared to the widely used general purpose compression algorithms (gzip, bzip2 and lzma). Moreover, in contrast to implementations of existing specialized genomic compression approaches, the implementation of BIND is enabled to handle non-ATGC and lowercase characters. This makes BIND a loss-less compression approach that is suitable for practical use. More importantly, validation results of BIND (with real-world data sets) indicate reasonable speeds of compression and decompression that can be achieved with minimal processor/memory usage. BIND is available for download at http://metagenomics.atc.tcs.com/compression/BIND. No license is required for academic or non-profit use.
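    The abstract above names a 'block-length' encoding of binary data as BIND's key step but gives no details. As a rough illustration only, the Python sketch below shows one way such a scheme could work: each base is mapped to two bits, the bits are split into two streams, and each stream is stored as the lengths of its alternating runs. The 2-bit code table, the function names, and the output layout are assumptions made for this example, not the published BIND format, and the handling of non-ATGC and lowercase characters is not sketched.

```python
from itertools import groupby

# Hypothetical 2-bit code per base (an assumption for this sketch,
# not necessarily the mapping used by the BIND implementation).
CODE = {"A": (0, 0), "T": (0, 1), "G": (1, 0), "C": (1, 1)}
DECODE = {bits: base for base, bits in CODE.items()}


def split_streams(seq):
    """Split a sequence into two parallel bit streams (first and second bit)."""
    pairs = [CODE[base] for base in seq]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def block_lengths(stream):
    """Encode a bit stream as (first_bit, lengths of its alternating runs)."""
    if not stream:
        return 0, []
    return stream[0], [len(list(run)) for _, run in groupby(stream)]


def expand(first_bit, lengths):
    """Invert block_lengths: rebuild the bit stream from its run lengths."""
    bits, bit = [], first_bit
    for n in lengths:
        bits.extend([bit] * n)
        bit ^= 1  # consecutive runs always alternate between 0 and 1
    return bits


def compress(seq):
    """Return the block-length form of both bit streams."""
    first, second = split_streams(seq)
    return block_lengths(first), block_lengths(second)


def decompress(packed):
    """Rebuild the original sequence from the two block-length streams."""
    (f1, l1), (f2, l2) = packed
    return "".join(DECODE[pair] for pair in zip(expand(f1, l1), expand(f2, l2)))
```

    In this sketch, storing only run lengths pays off when a bit stream contains long runs; a real compressor would also need a compact binary serialization of the lengths, which is omitted here.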

  6. The human TTAGGG repeat factors 1 and 2 bind to a subset of interstitial telomeric sequences and satellite repeats

    Institute of Scientific and Technical Information of China (English)

    Thomas Simonet; Elena Giulotto; Frederique Magdinier; Béatrice Horard; Pascal Barbry; Rainer Waldmann; Eric Gison; Laure-Emmanuelle Zaragosi; Claude Philippe; Kevin Lebrigand; Clémentine Schouteden; Adeline Augereau; Serge Bauwens; Jing Ye; Marco Santagostino

    2011-01-01

    The study of the proteins that bind to telomeric DNA in mammals has provided a deep understanding of the mechanisms involved in chromosome-end protection. However, very little is known about the binding of these proteins to nontelomeric DNA sequences. The TTAGGG DNA repeat proteins 1 and 2 (TRF1 and TRF2) bind to mammalian telomeres as part of the shelterin complex and are essential for maintaining chromosome end stability. In this study, we combined chromatin immunoprecipitation with high-throughput sequencing to map at high sensitivity and resolution the human chromosomal sites to which TRF1 and TRF2 bind. While most of the identified sequences correspond to telomeric regions, we showed that these two proteins also bind to extratelomeric sites. The vast majority of these extratelomeric sites contain interstitial telomeric sequences (or ITSs). However, we also identified non-ITS sites, which correspond to centromeric and pericentromeric satellite DNA. Interestingly, the TRF-binding sites are often located in the proximity of genes or within introns. We propose that TRF1 and TRF2 couple the functional state of telomeres to the long-range organization of chromosomes and gene regulation networks by binding to extratelomeric sequences.

  7. Spatio-temporal Variations of Characteristic Repeating Earthquake Sequences along the Middle America Trench in Mexico

    Science.gov (United States)

    Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.

    2015-12-01

    Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus produce nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms satisfy cc≥95% and coh≥95%. Spatial clustering along the trench shows large variations in repeating earthquake activity. In particular, the rupture zone of the M8.1 1985 earthquake shows a near absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major earthquakes show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is a key factor in pinpointing areas where large megathrust earthquakes may nucleate and consequently in improving seismic hazard assessment.
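    The correlation-coefficient criterion mentioned in the abstract above can be illustrated with a short Python sketch. It computes the zero-lag Pearson correlation of two equal-length waveform windows and applies a 0.95 threshold. The function names are hypothetical, the windows are assumed to be already aligned and filtered, and the spectral coherency criterion that the authors also apply is not covered; this is not the authors' processing chain.

```python
import math


def correlation_coefficient(x, y):
    """Zero-lag Pearson correlation of two equal-length waveform windows."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("waveforms must be non-empty and of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )
    if den == 0.0:
        raise ValueError("correlation is undefined for a constant waveform")
    return num / den


def is_repeating_pair(x, y, threshold=0.95):
    """Flag a pair of events as candidate repeaters (cc criterion only)."""
    return correlation_coefficient(x, y) >= threshold
```

    Because the coefficient is normalized, two records of the same source with different amplitudes still score close to 1, which is why waveform similarity rather than amplitude is used to group events into sequences.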

  8. Prevalence of single nucleotide polymorphism among 27 diverse alfalfa genotypes as assessed by transcriptome sequencing

    Directory of Open Access Journals (Sweden)

    Li Xuehui

    2012-10-01

    Full Text Available Abstract Background Alfalfa, a perennial, outcrossing species, is a widely planted forage legume producing highly nutritious biomass. Currently, improvement of cultivated alfalfa mainly relies on recurrent phenotypic selection. Marker assisted breeding strategies can enhance alfalfa improvement efforts, particularly if many genome-wide markers are available. Transcriptome sequencing enables efficient high-throughput discovery of single nucleotide polymorphism (SNP) markers for a complex polyploid species. Result The transcriptomes of 27 alfalfa genotypes, including elite breeding genotypes, parents of mapping populations, and unimproved wild genotypes, were sequenced using an Illumina Genome Analyzer IIx. De novo assembly of quality-filtered 72-bp reads generated 25,183 contigs with a total length of 26.8 Mbp and an average length of 1,065 bp, with an average read depth of 55.9-fold for each genotype. Overall, 21,954 (87.2%) of the 25,183 contigs represented 14,878 unique protein accessions. Gene ontology (GO) analysis suggested that a broad diversity of genes was represented in the resulting sequences. The realignment of individual reads to the contigs enabled the detection of 872,384 SNPs and 31,760 InDels. High resolution melting (HRM) analysis was used to validate 91% of 192 putative SNPs identified by sequencing. Both allelic variants at about 95% of SNP sites identified among five wild, unimproved genotypes are still present in cultivated alfalfa, and all four US breeding programs also contain a high proportion of these SNPs. Thus, little evidence exists among this dataset for loss of significant DNA sequence diversity from either domestication or breeding of alfalfa. Structure analysis indicated that individuals from the subspecies falcata, the diploid subspecies caerulea, and the tetraploid subspecies sativa (cultivated tetraploid alfalfa) were clearly separated. Conclusion We used transcriptome sequencing to discover large numbers of SNPs

  9. Nucleotide sequences of genome segments S6, S7 and S10 of Dendrolimus punctatus cypovirus 1.

    Science.gov (United States)

    Hong, J J; Duan, J L; Zhao, S L; Xu, H G; Peng, H Y

    2004-01-01

    The nucleotide sequences of genome segments S6, S7 and S10 of Dendrolimus punctatus cypovirus 1 Hunan I (DpCPV-HN(I)) and DpCPV-HN(I)-Se(3) (DpCPV-HN(I) passed three times in Spodoptera exigua) were determined. Segment S10 was 944 nucleotides in length and encoded a polyhedrin of 248 amino acids (28,439 Da). Only two nucleotide mutations were found between DpCPV-HN(I) S10 and DpCPV-HN(I)-Se3 S10, and the deduced amino acid sequences of the polyhedrin proteins were identical. Segment S7, 1,501 nucleotides, encoded a protein of 448 amino acids (approximately 50 kDa; p50). Thirty-one nucleotide mutations were found between DpCPV-HN(I) S7 and DpCPV-HN(I)-Se3 S7, but these resulted in only four amino acid changes. DpCPV-HN(I) S6 encoded a protein of 561 amino acids (63,688 Da; p64). The amino acid sequence of p64 had a high leucine content (10%) and contained a leucine zipper motif and one ATP/GTP-binding site motif.

  10. The complete nucleotide sequence and genomic characterization of grapevine asteroid mosaic associated virus.

    Science.gov (United States)

    Vargas-Asencio, José; Wojciechowska, Klaudia; Baskerville, Maia; Gomez, Annika L; Perry, Keith L; Thompson, Jeremy R

    2017-01-02

    In analyzing grapevine clones infected with grapevine red blotch associated virus, we identified a small number of isometric particles of approximately 30 nm in diameter from an enriched fraction of leaf extract. A dominant protein of 25 kDa was isolated from this fraction using SDS-PAGE and was identified by mass spectrometry as belonging to grapevine asteroid mosaic associated virus (GAMaV). Using a combination of three methods (RNA-Seq, sRNA-Seq, and Sanger sequencing of RT- and RACE-PCR products), we obtained a full-length genome sequence consisting of 6719 nucleotides without the poly(A) tail. The virus possesses all of the typical conserved functional domains concordant with the genus Marafivirus and lies evolutionarily between citrus sudden death associated virus and oat blue dwarf virus. A large shift in RNA-Seq coverage coincided with the predicted location of the subgenomic RNA involved in coat protein (CP) expression. Genus-wide sequence alignments confirmed the cleavage motif LxG(G/A) to be dominant between the helicase and RNA dependent RNA polymerase (RdRp), and the RdRp and CP domains. A putative overlapping protein (OP) ORF lacking a canonical translational start codon was identified with a reading frame context more consistent with the putative OPs of tymoviruses and fig fleck associated virus than with those of marafiviruses. BLAST analysis of the predicted GAMaV OP showed a unique relatedness to the OPs of members of the genus Tymovirus. Copyright © 2016 Elsevier B.V. All rights reserved.

  11. Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library

    Directory of Open Access Journals (Sweden)

    Salem Mohamed

    2009-11-01

    Full Text Available Abstract Background To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time-consuming methodology of SNP identification in rainbow trout, therefore not suitable for high-throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. Results The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6-fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses, were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In

  12. Significance of satellite DNA revealed by conservation of a widespread repeat DNA sequence among angiosperms.

    Science.gov (United States)

    Mehrotra, Shweta; Goel, Shailendra; Raina, Soom Nath; Rajpal, Vijay Rani

    2014-08-01

    The analysis of plant genome structure and evolution requires comprehensive characterization of repetitive sequences that make up the majority of plant nuclear DNA. In the present study, we analyzed the nature of pCtKpnI-I and pCtKpnI-II tandem repeated sequences, reported earlier in Carthamus tinctorius. Interestingly, a homolog of the pCtKpnI-I repeat sequence was also found to be present in widely divergent families of angiosperms. pCtKpnI-I showed high sequence similarity but low copy number among the various taxa of different angiosperm families analyzed. In comparison, pCtKpnI-II was specific to the genus Carthamus and was not present in any other taxa analyzed. The molecular structure of pCtKpnI-I was analyzed in various unrelated taxa of angiosperms to decipher the evolutionarily conserved nature of the sequence and its possible functional role.

  13. Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.)

    Science.gov (United States)

    Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...

  14. Finding the right coverage : The impact of coverage and sequence quality on single nucleotide polymorphism genotyping error rates

    NARCIS (Netherlands)

    Fountain, Emily D.; Pauli, Jonathan N.; Reid, Brendan N.; Palsboll, Per J.; Peery, M. Zachariah

    2016-01-01

    Restriction-enzyme-based sequencing methods enable the genotyping of thousands of single nucleotide polymorphism (SNP) loci in nonmodel organisms. However, in contrast to traditional genetic markers, genotyping error rates in SNPs derived from restriction-enzyme-based methods remain largely unknown.

  15. Molecular cloning and nucleotide sequence of a transforming gene detected by transfection of chicken B-cell lymphoma DNA

    Science.gov (United States)

    Goubin, Gerard; Goldman, Debra S.; Luce, Judith; Neiman, Paul E.; Cooper, Geoffrey M.

    1983-03-01

    A transforming gene detected by transfection of chicken B-cell lymphoma DNA has been isolated by molecular cloning. It is homologous to a conserved family of sequences present in normal chicken and human DNAs but is not related to transforming genes of acutely transforming retroviruses. The nucleotide sequence of the cloned transforming gene suggests that it encodes a protein that is partially homologous to the amino terminus of transferrin and related proteins although only about one tenth the size of transferrin.

  16. Nucleotide Sequence and Evolution of the Five-Plasmid Complement of the Phytopathogen Pseudomonas syringae pv. maculicola ES4326

    OpenAIRE

    2004-01-01

    Plasmids are transmissible, extrachromosomal genetic elements that are often responsible for environmental or host-specific adaptations. In order to identify the forces driving the evolution of these important molecules, we determined the complete nucleotide sequence of the five-plasmid complement of the radish and Arabidopsis pathogen Pseudomonas syringae pv. maculicola ES4326 and conducted an intraspecific comparative genomic analysis. To date, this is the most complex fully sequenced plasm...

  17. Homology between nucleotide sequences of promoter regions of nah and sal operons of NAH7 plasmid of Pseudomonas putida.

    OpenAIRE

    1986-01-01

    The in vivo transcription start sites of the nah and sal operons of the NAH7 plasmid were determined by S1 nuclease mapping and the nucleotide sequence surrounding these transcription start sites was determined. Since expression of both of these operons is coordinately controlled by the product of the transcriptional activator gene nahR, the sequences were compared to locate potential sites involved in common regulation. In the 100-base-pair region preceding transcription start sites of both ...

  18. Evolutionary conservation of sequence and secondary structures in CRISPR repeats

    Energy Technology Data Exchange (ETDEWEB)

    Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

    2006-09-01

    Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in ~40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form a characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.

  19. Development and Characterization of Simple Sequence Repeat Markers Providing Genome-Wide Coverage and High Resolution in Maize

    Science.gov (United States)

    Xu, Jie; Liu, Ling; Xu, Yunbi; Chen, Churun; Rong, Tingzhao; Ali, Farhan; Zhou, Shufeng; Wu, Fengkai; Liu, Yaxi; Wang, Jing; Cao, Moju; Lu, Yanli

    2013-01-01

    Simple sequence repeats (SSRs) have been widely used in maize genetics and breeding, because they are co-dominant, easy to score, and highly abundant. In this study, we used whole-genome sequences from 16 maize inbreds and 1 wild relative to determine SSR abundance and to develop a set of high-density polymorphic SSR markers. A total of 264 658 SSRs were identified across the 17 genomes, with an average of 135 693 SSRs per genome. Marker density was one SSR every 15.48 kb. (C/G)n, (AT)n, (CAG/CTG)n, and (AAAT/ATTT)n were the most frequent motifs for mono-, di-, tri-, and tetra-nucleotide SSRs, respectively. SSRs were most abundant in intergenic regions and least frequent in untranslated regions, as revealed by comparing SSR distributions of three representative resequenced genomes. Comparing SSR sequences and e-polymerase chain reaction analysis among the 17 tested genomes created a new database, including 111 887 SSRs, that could be developed as polymorphic markers in silico. Among these markers, 58.00, 26.09, 7.20, 3.00, 3.93, and 1.78% had mono-, di-, tri-, tetra-, penta-, and hexa-nucleotide motifs, respectively. Polymorphic information content for 35 573 polymorphic SSRs out of 111 887 loci varied from 0.05 to 0.83, with an average of 0.31 in the 17 tested genomes. Experimental validation of polymorphic SSR markers showed that over 70% of the primer pairs could generate the target bands with length polymorphism, and these markers would be very powerful when used for genetic populations derived from the various types of maize germplasms that were sampled for this study. PMID:23804557
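
    As a rough illustration of how perfect SSRs of the mono- to hexa-nucleotide classes mentioned above can be located in a sequence, the sketch below uses regular-expression backreferences. The minimum repeat counts, the function name, and the toy sequence are hypothetical choices for this example; the abstract does not state the detection criteria the authors used.

```python
import re

# Hypothetical minimum number of repeat units per motif length (an assumption).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "ATAT" is reported as the dinucleotide "AT" instead.
            if any(unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len)):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)

# Toy sequence: an (AT)7 tract and a (CAG)5 tract separated by short flanks.
toy = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "T"
ssrs = find_ssrs(toy)
```

    On the toy sequence this reports the (AT)7 tract at offset 3 and the (CAG)5 tract at offset 20. Dedicated SSR tools additionally handle imperfect and compound repeats, which this sketch ignores.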

  20. Sequences spanning the leader-repeat junction mediate CRISPR adaptation to phage in Streptococcus thermophilus.

    Science.gov (United States)

    Wei, Yunzhou; Chesne, Megan T; Terns, Rebecca M; Terns, Michael P

    2015-02-18

    CRISPR-Cas systems are RNA-based immune systems that protect prokaryotes from invaders such as phages and plasmids. In adaptation, the initial phase of the immune response, short foreign DNA fragments are captured and integrated into host CRISPR loci to provide heritable defense against encountered foreign nucleic acids. Each CRISPR contains a ∼100-500 bp leader element that typically includes a transcription promoter, followed by an array of captured ∼35 bp sequences (spacers) sandwiched between copies of an identical ∼35 bp direct repeat sequence. New spacers are added immediately downstream of the leader. Here, we have analyzed adaptation to phage infection in Streptococcus thermophilus at the CRISPR1 locus to identify cis-acting elements essential for the process. We show that the leader and a single repeat of the CRISPR locus are sufficient for adaptation in this system. Moreover, we identified a leader sequence element capable of stimulating adaptation at a dormant repeat. We found that sequences within 10 bp of the site of integration, in both the leader and repeat of the CRISPR, are required for the process. Our results indicate that information at the CRISPR leader-repeat junction is critical for adaptation in this Type II-A system and likely other CRISPR-Cas systems.
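
    The array layout described in this abstract (a leader, then spacers sandwiched between copies of an identical direct repeat, with new spacers added immediately downstream of the leader) can be modelled with a short sketch. The sequences below are short hypothetical placeholders rather than real S. thermophilus CRISPR1 sequences, and the function names are assumptions for illustration only.

```python
def extract_spacers(locus, repeat):
    """Return spacers lying between consecutive repeat copies, leader-proximal first."""
    parts = locus.split(repeat)
    if len(parts) < 2:
        raise ValueError("repeat not found in locus")
    # parts[0] is the leader side and parts[-1] the trailing flank.
    return parts[1:-1]

def integrate_spacer(leader, repeat, spacers, new_spacer):
    """Rebuild the locus with the new spacer placed next to the leader."""
    updated = [new_spacer] + list(spacers)
    return leader + "".join(repeat + s for s in updated) + repeat

# Hypothetical toy sequences (real repeats and spacers are ~35 bp).
LEADER = "TTGACA"
REPEAT = "GTTTTAGAGC"
locus = LEADER + REPEAT + "AAACCC" + REPEAT + "CCCGGG" + REPEAT

spacers = extract_spacers(locus, REPEAT)
new_locus = integrate_spacer(LEADER, REPEAT, spacers, "TGTGTG")
```

    After integration the newest spacer occupies the first position in the array, so the spacer order reads from most recent to oldest when moving away from the leader.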

  1. Chromatin structure of repeating CTG/CAG and CGG/CCG sequences in human disease.

    Science.gov (United States)

    Wang, Yuh-Hwa

    2007-05-01

    In eukaryotic cells, chromatin structure organizes genomic DNA in a dynamic fashion, and results in regulation of many DNA metabolic processes. The CTG/CAG and CGG/CCG repeating sequences involved in several neuromuscular degenerative diseases display differential abilities for the binding of histone octamers. The effect of the repeating DNA on nucleosome assembly could be amplified as the number of repeats increases. Also, CpG methylation, and sequence interruptions within the triplet repeats exert an impact on the formation of nucleosomes along these repeating DNAs. The two most common triplet expansion human diseases, myotonic dystrophy 1 and fragile X syndrome, are caused by the expanded CTG/CAG and CGG/CCG repeats, respectively. In addition to the expanded repeats and CpG methylation, histone modifications, chromatin remodeling factors, and noncoding RNA have been shown to coordinate the chromatin structure at both myotonic dystrophy 1 and fragile X loci. Alterations in chromatin structure at these two loci can affect transcription of these disease-causing genes, leading to disease symptoms. These observations have brought a new appreciation that a full understanding of disease gene expression requires a knowledge of the structure of the chromatin domain within which the gene resides.

  2. ANCAC: amino acid, nucleotide, and codon analysis of COGs – a tool for sequence bias analysis in microbial orthologs

    Directory of Open Access Journals (Sweden)

    Meiler Arno

    2012-09-01

    Full Text Available Abstract Background The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. Per definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid and its encoding nucleotide sequence, resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Results Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC’s NUCOCOG dataset as the largest one available for that purpose thus far. Conclusions Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context without requirement of substantial programming skills.
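
    The kinds of quantities this abstract refers to (codon frequencies and GC content of a coding sequence) can be computed with a few lines of code. The sketch below is a generic illustration with hypothetical function names and a toy coding sequence; it does not reproduce ANCAC's implementation or its database schema.

```python
from collections import Counter

def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)

def codon_frequencies(cds):
    """Relative frequency of each codon, reading in frame from position 0."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence shorter than one codon")
    return {codon: n / total for codon, n in counts.items()}

# Hypothetical toy coding sequence: Met-Ala-Ala-stop.
toy_cds = "ATGGCCGCCTAA"
freqs = codon_frequencies(toy_cds)
gc = gc_content(toy_cds)
```

    For the toy sequence, GCC accounts for half of the four codons and 7 of the 12 bases are G or C.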

  3. Structural organization, nucleotide sequence, and regulation of the Haemophilus influenzae rec-1+ gene.

    Science.gov (United States)

    Zulty, J J; Barcak, G J

    1993-11-01

    The Haemophilus influenzae rec-1+ protein plays a central role in DNA metabolism, participating in general homologous recombination, recombinational (postreplication) DNA repair, and prophage induction. Although many H. influenzae rec-1 mutants have been phenotypically characterized, little is known about the rec-1+ gene at the molecular level. In this study, we present the genetic organization of the rec-1+ locus, the DNA sequence of rec-1+, and studies of the transcriptional regulation of rec-1+ during cellular assault by DNA-damaging agents and during the induction of competence for genetic transformation. Although little is known about promoter structure in H. influenzae, we identified a potential rec-1+ promoter that is identical in 11 of 12 positions to the bacterial sigma 70-dependent promoter consensus sequence. Results from a primer extension analysis revealed that the start site of rec-1+ transcription is centered 6 nucleotides downstream of this promoter. We identified potential DNA binding sites in the rec-1+ gene for LexA, integration host factor, and cyclic AMP receptor protein. We obtained evidence that at least one of the proposed cyclic AMP receptor protein binding sites is active in modulating rec-1+ transcription. This finding makes rec-1+ control circuitry novel among recA+ homologs. Two H. influenzae DNA uptake sequences that may function as a transcription termination signal were identified in inverted orientations at the end of the rec-1+ coding sequence. In addition, we report the first use of the Escherichia coli lacZ operon fusion technique in H. influenzae to study the transcriptional control of rec-1+. Our results indicate that rec-1+ is transcriptionally induced about threefold during DNA-damaging events. Furthermore, we show that rec-1+ can substitute for recA+ in E. coli to modulate SOS induction of dinB1 expression. Surprisingly, although 5% of the H. influenzae genome is in the form of single-stranded DNA during competence for

  4. The Coding of Biological Information: From Nucleotide Sequence to Protein Recognition

    Science.gov (United States)

    Štambuk, Nikola

    The paper reviews the classic results of Swanson, Dayhoff, Grantham, Blalock and Root-Bernstein, which link genetic code nucleotide patterns to the protein structure, evolution and molecular recognition. Symbolic representation of the binary addresses defining particular nucleotide and amino acid properties is discussed, with consideration of: structure and metric of the code, direct correspondence between amino acid and nucleotide information, and molecular recognition of the interacting protein motifs coded by the complementary DNA and RNA strands.

  5. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    Science.gov (United States)

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  6. Characterisation data of simple sequence repeats of phages closely related to T7M

    Directory of Open Access Journals (Sweden)

    Tiao-Yin Lin

    2016-09-01

    Full Text Available Coliphages T7M and T3, Yersinia phage ϕYeO3-12, and Salmonella phage ϕSG-JL2 share high homology in genomic sequences. Simple sequence repeats (SSRs) are found in their genomes and variations of SSRs among these phages are observed. Analyses on regions of sequences in T7M and T3 genomes that are likely derived from phage recombination, as well as the counterparts in ϕYeO3-12 and ϕSG-JL2, have been discussed by Lin in “Simple sequence repeat variations expedite phage divergence: mechanisms of indels and gene mutations” [1]. These regions are referred to as recombinant regions. The focus here is on SSRs in the whole genome and regions of sequences outside the recombinant regions, referred to as non-recombinant regions. This article provides SSR counts, relative abundance, relative density, and GC contents in the complete genome and non-recombinant regions of these phages. SSR period sizes and motifs in the non-recombinant regions of phage genomes are plotted. Genomic sequence changes between T7M and T3 due to insertions, deletions, and substitutions are also illustrated. SSRs and nearby sequences of T7M in the non-recombinant regions are compared to the sequences of ϕYeO3-12 and ϕSG-JL2 in the corresponding positions. The sequence variations of SSRs due to vertical evolution are classified into four categories and tabulated: (1) insertion/deletion of SSR units, (2) expansion/contraction of SSRs without alteration of genome length, (3) changes of repeat motifs, and (4) generation/loss of repeats.

  7. Nucleotide sequence of XhoI O fragment of ectromelia virus DNA reveals significant differences from vaccinia virus.

    Science.gov (United States)

    Senkevich, T G; Muravnik, G L; Pozdnyakov, S G; Chizhikov, V E; Ryazankina, O I; Shchelkunov, S N; Koonin, E V; Chernos, V I

    1993-10-01

    The nucleotide sequence of the 3913 base pair XhoI O fragment located in an evolutionary variable region adjacent to the right end of the genome of ectromelia virus (EMV) was determined. The sequence contains two long open reading frames coding for putative proteins of 559 amino acid residues (p65) and 344 amino acid residues (p39). Amino acid database searches showed that p39 is closely related to vaccinia virus (VV), strain WR, B22R gene product (C12L gene product of strain Copenhagen), which belongs to the family of serine protease inhibitors (serpins). Despite the overall high conservation, differences were observed in the sequences of p39, B22R, and C12L in the site known to interact with proteases in other serpins, suggesting that the serpins of EMV and two strains of VV may all inhibit proteases with different specificities. The gene coding for the ortholog of p65 is lacking in the Copenhagen strain of vaccinia virus; the WR strain contains a truncated variant of this gene (B21R) potentially coding for a small protein (p16) corresponding to the C-terminal region of p65. p65 is a new member of the family of poxvirus proteins including vaccinia virus proteins A55R, C2L and F3L, and a group of related proteins of leporipoxviruses, Shope fibroma and myxoma viruses (T6, T8, T9, M9). These proteins are homologous to the Drosophila protein Kelch involved in egg development. Both Kelch protein and the related poxvirus proteins contain two distinct domains. The N-terminal domain is related to the similarly located domains of transcription factors Ttk, Br-C (Drosophila), and KUP (human), and GCL protein involved in early development in Drosophila. The C-terminal domain consists of an array of four to five imperfect repeats and is related to human placental protein MIPP. Phylogenetic analysis of the family of poxvirus proteins showed that their genes have undergone a complex succession of duplications, and complete or partial deletions.

  8. Genome-wide characterization of simple sequence repeats in cucumber (Cucumis sativus L.)

    Directory of Open Access Journals (Sweden)

    Simon Philipp W

    2010-10-01

    Full Text Available Abstract Background Cucumber, Cucumis sativus L., is an important vegetable crop worldwide. Until very recently, cucumber genetic and genomic resources, especially molecular markers, have been very limited, impeding progress of cucumber breeding efforts. Microsatellites are short tandemly repeated DNA sequences, which are frequently favored as genetic markers due to their high level of polymorphism and codominant inheritance. Data from previously characterized genomes have shown that these repeats vary in frequency, motif sequence, and genomic location across taxa. During the last year, the genomes of two cucumber genotypes were sequenced, including the Chinese fresh market type inbred line '9930' and the North American pickling type inbred line 'Gy14'. These sequences provide a powerful tool for developing markers on a large scale. In this study, we surveyed and characterized the distribution and frequency of perfect microsatellites in 203 Mbp assembled Gy14 DNA sequences, representing 55% of its nuclear genome, and in cucumber EST sequences. Similar analyses were performed in genomic and EST data from seven other plant species, and the results were compared with those of cucumber. Results A total of 112,073 perfect repeats were detected in the Gy14 cucumber genome sequence, accounting for 0.9% of the assembled Gy14 genome, with an overall density of 551.9 SSRs/Mbp. While tetranucleotides were the most frequent microsatellites in genomic DNA sequence, dinucleotide repeats, which had more repeat units than any other SSR type, had the highest cumulative sequence length. Coding regions (ESTs) of the cucumber genome had fewer microsatellites compared to its genomic sequence, with trinucleotides predominating in EST sequences. AAG was the most frequent repeat in cucumber ESTs. Overall, AT-rich motifs prevailed in both genomic and EST data. Compared to the other species examined, cucumber genomic sequence had the highest density of SSRs (although

  9. A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

    Directory of Open Access Journals (Sweden)

    Glass John I

    2010-07-01

    Full Text Available Abstract Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the

  10. alpha-Amylase gene of Streptomyces limosus: nucleotide sequence, expression motifs, and amino acid sequence homology to mammalian and invertebrate alpha-amylases.

    OpenAIRE

    1987-01-01

    The nucleotide sequence of the coding and regulatory regions of the alpha-amylase gene (aml) of Streptomyces limosus was determined. High-resolution S1 mapping was used to locate the 5' end of the transcript and demonstrated that the gene is transcribed from a unique promoter. The predicted amino acid sequence has considerable identity to mammalian and invertebrate alpha-amylases, but not to those of plant, fungal, or eubacterial origin. Consistent with this is the susceptibility of the enzym...

  11. Distinctive nucleotide sequences of promoters recognized by RNA polymerase containing a phage-coded "sigma-like" protein.

    Science.gov (United States)

    Talkington, C; Pero, J

    1979-11-01

    We report the nucleotide sequences of two promoters for bacteriophage SP01 "middle" genes. These promoters are recognized by a modified form of Bacillus subtilis RNA polymerase that contains a phage-coded "sigma-like" regulatory protein (gp28) in place of the bacterial sigma factor. Both promoters shared the identical hexanucleotide 5'-A-G-G-A-G-A at about 35 base pairs preceding the start point of transcription and the identical heptanucleotide 5'-T-T-T-A-T-T-T (T is the thymine analog 5-hydroxymethyluracil in SP01 DNA) located about 10 base pairs preceding the transcriptional start point. The significance of these sequences in comparison with nucleotide sequences of promoters recognized by sigma-containing RNA polymerases is discussed.

  12. Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey

    Directory of Open Access Journals (Sweden)

    den Dunnen Johan T

    2009-10-01

    Full Text Available Abstract Background The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set up an analysis pipeline based on a short read de novo sequence assembler and a program designed to identify variation within short reads. To illustrate the potential of this technique, we present the results obtained with a randomly sheared, enzymatically generated, 2-3 kbp genome fraction of six pooled Meleagris gallopavo (turkey) individuals. Results A total of 100 million 36 bp reads were generated, representing approximately 5-6% (~62 Mbp) of the turkey genome, with an estimated sequence depth of 58. Reads consisting of bases called with less than 1% error probability were selected and assembled into contigs. Subsequently, high throughput discovery of nucleotide variation was performed using sequences with more than 90% reliability by using the assembled contigs that were 50 bp or longer as the reference sequence. We identified more than 7,500 SNPs with a high probability of representing true nucleotide variation in turkeys. Increasing the reference genome by adding publicly available turkey BAC-end sequences increased the number of SNPs to over 11,000. A comparison with the sequenced chicken genome indicated that the assembled turkey contigs were distributed uniformly across the turkey genome. Genotyping of a representative sample of 340 SNPs resulted in a SNP conversion rate of 95%. The correlation of the minor allele count (MAC) and observed minor allele frequency (MAF) for the validated SNPs was 0.69. Conclusion We provide an efficient and cost-effective approach for the identification of thousands of high quality SNPs in species currently lacking a sequenced genome and applied this to turkey. The methodology addresses a random fraction of the genome, resulting in an even

  13. Mining and validation of pyrosequenced simple sequence repeats (SSRs) from American cranberry (Vaccinium macrocarpon Ait.).

    Science.gov (United States)

    Zhu, H; Senalik, D; McCown, B H; Zeldin, E L; Speers, J; Hyman, J; Bassil, N; Hummer, K; Simon, P W; Zalapa, J E

    2012-01-01

    The American cranberry (Vaccinium macrocarpon Ait.) is a major commercial fruit crop in North America, but limited genetic resources have been developed for the species. Furthermore, the paucity of codominant DNA markers has hampered the advance of genetic research in cranberry and the Ericaceae family in general. Therefore, we used Roche 454 sequencing technology to perform low-coverage whole genome shotgun sequencing of the cranberry cultivar 'HyRed'. After de novo assembly, the obtained sequence covered 266.3 Mb of the estimated 540-590 Mb in the cranberry genome. A total of 107,244 SSR loci were detected with an overall density across the genome of 403 SSR/Mb. The AG repeat was the most frequent motif in cranberry accounting for 35% of all SSRs and together with AAG and AAAT accounted for 46% of all loci discovered. To validate the SSR loci, we designed 96 primer-pairs using contig sequence data containing perfect SSR repeats, and studied the genetic diversity of 25 cranberry genotypes. We identified 48 polymorphic SSR loci with 2-15 alleles per locus for a total of 323 alleles in the 25 cranberry genotypes. Genetic clustering by principal coordinates and genetic structure analyses confirmed the heterogeneous nature of cranberries. The parentage composition of several hybrid cultivars was evident from the structure analyses. Whole genome shotgun 454 sequencing was a cost-effective and efficient way to identify numerous SSR repeats in the cranberry sequence for marker development.
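
    The SSR survey described in this record can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names `find_ssrs` and `ssr_density_per_mb`, the motif-length range and the minimum copy number are all assumptions chosen for illustration. The sketch finds perfect tandem repeats with a regular expression and recomputes a density from a locus count and an assembly size.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=4, min_repeats=4):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    Hypothetical helper: the thresholds are illustrative defaults, not the
    criteria used in the cranberry study.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One captured unit followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves built from a shorter motif
            # (e.g. "AGAG" is just "AG" twice, "AAA" is a homopolymer).
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


def ssr_density_per_mb(n_loci, assembled_bp):
    """SSR loci per megabase of assembled sequence."""
    return n_loci / (assembled_bp / 1e6)


# Toy sequence with one (AG)5 tract starting at index 2.
print(find_ssrs("TTAGAGAGAGAGCC"))
# Figures quoted in the abstract: 107,244 loci over 266.3 Mb.
print(round(ssr_density_per_mb(107244, 266.3e6)))
```

    With the figures quoted in the abstract, 107,244 / 266.3 comes to roughly 403 SSR/Mb, which agrees with the reported density.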

  14. Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies

    KAUST Repository

    Cahill, Matt J.

    2010-07-12

    Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length. 2010 Cahill et al.
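
    The intuition that a read cannot resolve a repeat at least as long as itself can be sketched in a few lines of Python. This is a deliberately crude stand-in and not the authors' repeat-assembly model: the function name `repeated_kmers` is hypothetical, only exact forward-strand repeats are counted, and reverse complements and sequencing errors are ignored.

```python
from collections import Counter


def repeated_kmers(genome, read_len):
    """Count distinct substrings of length read_len that occur more than once.

    Hypothetical proxy: each such substring marks a place where a read of
    that length could be placed at two or more positions in the genome.
    """
    counts = Counter(
        genome[i:i + read_len] for i in range(len(genome) - read_len + 1)
    )
    return sum(1 for c in counts.values() if c > 1)


# Toy genome containing the exact repeat "ACGT" twice.
toy = "AAACGTCCCACGTGGG"
for k in (3, 4, 5):
    print(k, repeated_kmers(toy, k))
```

    On the toy sequence the count falls from 2 (k = 3) to 1 (k = 4) to 0 (k = 5): once the read is longer than the repeat, every read is unique, which is the qualitative behaviour the abstract describes for increasing read length.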

  15. Read length and repeat resolution: exploring prokaryote genomes using next-generation sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Matt J Cahill

    Full Text Available BACKGROUND: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. METHODOLOGY/PRINCIPAL FINDINGS: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. CONCLUSIONS: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length.

  16. Tandem repeats and G-rich sequences are enriched at human CNV breakpoints.

    Directory of Open Access Journals (Sweden)

    Promita Bose

    Full Text Available Chromosome breakage in germline and somatic genomes gives rise to copy number variation (CNV) responsible for genomic disorders and tumorigenesis. DNA sequence is known to play an important role in breakage at chromosome fragile sites; however, the sequences susceptible to double-strand breaks (DSBs) underlying CNV formation are largely unknown. Here we analyze 140 germline CNV breakpoints from 116 individuals to identify DNA sequences enriched at breakpoint loci compared to 2800 simulated control regions. We find that, overall, CNV breakpoints are enriched in tandem repeats and sequences predicted to form G-quadruplexes. G-rich repeats are overrepresented at terminal deletion breakpoints, which may be important for the addition of a new telomere. Interstitial deletions and duplication breakpoints are enriched in Alu repeats that in some cases mediate non-allelic homologous recombination (NAHR) between the two sides of the rearrangement. CNV breakpoints are enriched in certain classes of repeats that may play a role in DNA secondary structure, DSB susceptibility and/or DNA replication errors.

  17. Efficient multiplex simple sequence repeat genotyping of the oomycete plant pathogen Phytophthora infestans

    NARCIS (Netherlands)

    Li, Y.; Cooke, D.E.L.; Jacobsen, E.; Lee, van der T.A.J.

    2013-01-01

    Genotyping is fundamental to population analysis. To accommodate fast, accurate and cost-effective genotyping, a one-step multiplex PCR method employing twelve simple sequence repeat (SSR) markers was developed for high-throughput screening of Phytophthora infestans populations worldwide. The SSR

  18. Evaluation of the flanking nucleotide sequences of sarcomeric hypertrophic cardiomyopathy substitution mutations.

    Science.gov (United States)

    Meurs, Kathryn M; Mealey, Katrina L

    2008-07-03

    Hypertrophic cardiomyopathy (HCM) is a familial myocardial disease with a prevalence of 1 in 500. More than 400 causative mutations have been identified in 13 sarcomeric and myofilament-related genes; 350 of these are substitution mutations within eight sarcomeric genes. Within a population, examples of recurring identical disease-causing mutations that appear to have arisen independently have been noted as well as those that appear to have been inherited from a common ancestor. The large number of novel HCM mutations could suggest a mechanism of increased mutability within the sarcomeric genes. The objective of this study was to evaluate the most commonly reported HCM genes, beta myosin heavy chain (MYH7), myosin binding protein C, troponin I, troponin T, cardiac regulatory myosin light chain, cardiac essential myosin light chain, alpha tropomyosin and cardiac alpha-actin for sequence patterns surrounding the substitution mutations that may suggest a mechanism of increased mutability. The mutations as well as the 10 flanking nucleotides were evaluated for frequency of di-, tri- and tetranucleotides containing the mutation as well as for the presence of certain tri- and tetranucleotide motifs. The most common substitutions were guanine (G) to adenine (A) and cytosine (C) to thymine (T). The CG dinucleotide had a significantly higher relative mutability than any other dinucleotide (pmutation was calculated; none were at a statistically higher frequency than the others. The large number of G to A and C to T mutations as well as the relative mutability of CG may suggest that deamination of methylated CpG is an important mechanism for mutation development in at least some of these cardiac genes.

  19. Expressed sequence tags (ESTs) and simple sequence repeat (SSR) markers from octoploid strawberry (Fragaria × ananassa)

    Directory of Open Access Journals (Sweden)

    Bies Dawn H

    2005-06-01

    Full Text Available Abstract Background Cultivated strawberry (Fragaria × ananassa) represents one of the most valued fruit crops in the United States. Despite its economic importance, the octoploid genome presents a formidable barrier to efficient study of genome structure and molecular mechanisms that underlie agriculturally-relevant traits. Many potentially fruitful research avenues, especially large-scale gene expression surveys and development of molecular genetic markers, have been limited by a lack of sequence information in public databases. As a first step to remedy this discrepancy, a cDNA library has been developed from salicylate-treated, whole-plant tissues and over 1800 expressed sequence tags (ESTs) have been sequenced and analyzed. Results A putative unigene set of 1304 sequences – 133 contigs and 1171 singlets – has been developed, and the transcripts have been functionally annotated. Homology searches indicate that 89.5% of sequences share significant similarity to known/putative proteins or Rosaceae ESTs. The ESTs have been functionally characterized and genes relevant to specific physiological processes of economic importance have been identified. A set of tools useful for SSR development and mapping is presented. Conclusion Sequences derived from this effort may be used to speed gene discovery efforts in Fragaria and the Rosaceae in general and also open avenues of comparative mapping. This report represents a first step in expanding molecular-genetic analyses in strawberry and demonstrates how computational tools can be used to optimally mine a large body of useful information from a relatively small data set.

  20. The Bryopsis hypnoides plastid genome: multimeric forms and complete nucleotide sequence.

    Directory of Open Access Journals (Sweden)

    Fang Lü

    Full Text Available BACKGROUND: Bryopsis hypnoides Lamouroux is a siphonous green alga, and its extruded protoplasm can aggregate spontaneously in seawater and develop into mature individuals. The chloroplast of B. hypnoides is the biggest organelle in the cell and shows strong autonomy. To better understand this organelle, we sequenced and analyzed the chloroplast genome of this green alga. PRINCIPAL FINDINGS: A total of 111 functional genes, including 69 potential protein-coding genes, 5 ribosomal RNA genes, and 37 tRNA genes were identified. The genome size (153,429 bp), arrangement, and inverted-repeat (IR)-lacking structure of the B. hypnoides chloroplast DNA (cpDNA) closely resemble those of Chlorella vulgaris. Furthermore, our cytogenomic investigations using pulsed-field gel electrophoresis (PFGE) and Southern blotting methods showed that the B. hypnoides cpDNA had multimeric forms, including monomer, dimer, trimer, tetramer, and even higher multimers, which is similar to the higher order organization observed previously for higher plant cpDNA. The relative amounts of the four multimeric cpDNA forms were estimated to be about 1, 1/2, 1/4, and 1/8 based on molecular hybridization analysis. Phylogenetic analyses based on a concatenated alignment of chloroplast protein sequences suggested that B. hypnoides is sister to all Chlorophyceae and this placement received moderate support. CONCLUSION: All of the results suggest that the autonomy of the chloroplasts of B. hypnoides has little to do with the size and gene content of the cpDNA, and the IR-lacking structure of the chloroplasts indirectly demonstrated that the multimeric molecules might result from the random cleavage and fusion of replication intermediates instead of recombinational events.

  1. Structure, organization, and sequence of alpha satellite DNA from human chromosome 17: evidence for evolution by unequal crossing-over and an ancestral pentamer repeat shared with the human X chromosome.

    Science.gov (United States)

    Waye, J S; Willard, H F

    1986-09-01

    The centromeric regions of all human chromosomes are characterized by distinct subsets of a diverse tandemly repeated DNA family, alpha satellite. On human chromosome 17, the predominant form of alpha satellite is a 2.7-kilobase-pair higher-order repeat unit consisting of 16 alphoid monomers. We present the complete nucleotide sequence of the 16-monomer repeat, which is present in 500 to 1,000 copies per chromosome 17, as well as that of a less abundant 15-monomer repeat, also from chromosome 17. These repeat units were approximately 98% identical in sequence, differing by the exclusion of precisely 1 monomer from the 15-monomer repeat. Homologous unequal crossing-over is suggested as a probable mechanism by which the different repeat lengths on chromosome 17 were generated, and the putative site of such a recombination event is identified. The monomer organization of the chromosome 17 higher-order repeat unit is based, in part, on tandemly repeated pentamers. A similar pentameric suborganization has been previously demonstrated for alpha satellite of the human X chromosome. Despite the organizational similarities, substantial sequence divergence distinguishes these subsets. Hybridization experiments indicate that the chromosome 17 and X subsets are more similar to each other than to the subsets found on several other human chromosomes. We suggest that the chromosome 17 and X alpha satellite subsets may be related components of a larger alphoid subfamily which have evolved from a common ancestral repeat into the contemporary chromosome-specific subsets.

  2. A weighted sampling algorithm for the design of RNA sequences with targeted secondary structure and nucleotide distribution.

    Science.gov (United States)

    Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme

    2013-07-01

    The design of RNA sequences folding into predefined secondary structures is a milestone for many synthetic biology and gene therapy studies. Most of the current software uses similar local search strategies (i.e. a random seed is progressively adapted to acquire the desired folding properties) and more importantly do not allow the user to control explicitly the nucleotide distribution such as the GC-content in their sequences. However, the latter is an important criterion for large-scale applications as it could presumably be used to design sequences with better transcription rates and/or structural plasticity. In this article, we introduce IncaRNAtion, a novel algorithm to design RNA sequences folding into target secondary structures with a predefined nucleotide distribution. IncaRNAtion uses a global sampling approach and weighted sampling techniques. We show that our approach is fast (i.e. running time comparable or better than local search methods), seedless (we remove the bias of the seed in local search heuristics) and successfully generates high-quality sequences (i.e. thermodynamically stable) for any GC-content. To complete this study, we develop a hybrid method combining our global sampling approach with local search strategies. Remarkably, our glocal methodology overcomes both local and global approaches for sampling sequences with a specific GC-content and target structure. IncaRNAtion is available at csb.cs.mcgill.ca/incarnation/. Supplementary data are available at Bioinformatics online.

  3. Chromosomal localization of a tandemly repeated DNA sequence in Trifolium repens L.

    Institute of Scientific and Technical Information of China (English)

    ZHU J M; N W ELLISON; et al.

    1996-01-01

    A karyotype of Trifolium repens constructed from mitotic cells revealed 13 pairs of metacentric and 3 pairs of submetacentric chromosomes including a pair of satellites located at the end of the short arm of chromosome 16. C-bands were identified around the centromeric regions of 8 pairs of chromosomes. A 350 bp tandemly repeated DNA sequence from T. repens labelled with digoxigenin hybridized to the proximal centromeric regions of 12 chromosome pairs. Some correlation between the distribution of the repeat sequence and the distribution of C-banding was demonstrated.

  4. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  5. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Science.gov (United States)

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  6. Molecular Identification of Necrophagous Muscidae and Sarcophagidae Fly Species Collected in Korea by Mitochondrial Cytochrome c Oxidase Subunit I Nucleotide Sequences

    Directory of Open Access Journals (Sweden)

    Yu-Hoon Kim

    2014-01-01

    Full Text Available Identification of insect species is an important task in forensic entomology. For more convenient species identification, the nucleotide sequences of cytochrome c oxidase subunit I (COI) gene have been widely utilized. We analyzed full-length COI nucleotide sequences of 10 Muscidae and 6 Sarcophagidae fly species collected in Korea. After DNA extraction from collected flies, PCR amplification and automatic sequencing of the whole COI sequence were performed. Obtained sequences were analyzed for a phylogenetic tree and a distance matrix. Our data showed very low intraspecific sequence distances and species-level monophylies. However, sequence comparison with previously reported sequences revealed a few inconsistencies or paraphylies requiring further investigation. To the best of our knowledge, this study is the first report of COI nucleotide sequences from Hydrotaea occulta, Muscina angustifrons, Muscina pascuorum, Ophyra leucostoma, Sarcophaga haemorrhoidalis, Sarcophaga harpax, and Phaonia aureola.

  7. Application of inter simple sequence repeat (ISSR) markers to plant genetics.

    Science.gov (United States)

    Godwin, I D; Aitken, E A; Smith, L W

    1997-08-01

    Microsatellites or simple sequence repeats (SSRs) are ubiquitous in eukaryotic genomes. Single-locus SSR markers have been developed for a number of species, although there is a major bottleneck in developing SSR markers whereby flanking sequences must be known to design 5'-anchors for polymerase chain reaction (PCR) primers. Inter SSR (ISSR) fingerprinting was developed such that no sequence knowledge was required. Primers based on a repeat sequence, such as (CA)n, can be made with a degenerate 3'-anchor, such as (CA)8RG or (AGC)6TY. The resultant PCR reaction amplifies the sequence between two SSRs, yielding a multilocus marker system useful for fingerprinting, diversity analysis and genome mapping. PCR products are radiolabelled with 32P or 33P via end-labelling or PCR incorporation, and separated on a polyacrylamide sequencing gel prior to autoradiographic visualisation. A typical reaction yields 20-100 bands per lane depending on the species and primer. We have used ISSR fingerprinting in a number of plant species, and report here some results on two important tropical species, sorghum and banana. Previous investigators have demonstrated that ISSR analysis usually detects a higher level of polymorphism than that detected with restriction fragment length polymorphism (RFLP) or random amplified polymorphic DNA (RAPD) analyses. Our data indicate that this is not a result of greater polymorphism genetically, but rather technical reasons related to the detection methodology used for ISSR analysis.
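
    The anchored primers named in this record, (CA)8RG and (AGC)6TY, use IUPAC ambiguity codes, where R stands for A or G and Y for C or T. A minimal Python sketch, assuming a hypothetical helper `expand_issr_primer` that is not taken from the paper, shows which concrete oligonucleotides such a degenerate primer represents.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes needed for the primers in the abstract.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}


def expand_issr_primer(motif, copies, anchor):
    """List every concrete sequence encoded by (motif)copies + degenerate anchor.

    Hypothetical helper for illustration; sequences are written 5' to 3',
    so the anchor sits at the 3' end.
    """
    core = motif * copies
    tails = product(*(IUPAC[base] for base in anchor))
    return [core + "".join(tail) for tail in tails]


print(expand_issr_primer("CA", 8, "RG"))
# ['CACACACACACACACAAG', 'CACACACACACACACAGG']
print(expand_issr_primer("AGC", 6, "TY"))
# ['AGCAGCAGCAGCAGCAGCTC', 'AGCAGCAGCAGCAGCAGCTT']
```

    Each degenerate primer therefore corresponds to a small pool of oligonucleotides that share the repeat core and differ only in the 3' anchor bases.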

  8. The Nucleotide Capture Region of Alpha Hemolysin: Insights into Nanopore Design for DNA Sequencing from Molecular Dynamics Simulations.

    Science.gov (United States)

    Manara, Richard M A; Tomasio, Susana; Khalid, Syma

    2015-01-27

    Nanopore technology for DNA sequencing is constantly being refined and improved. In strand sequencing a single strand of DNA is fed through a nanopore and subsequent fluctuations in the current are measured. A major hurdle is that the DNA is translocated through the pore at a rate that is too fast for the current measurement systems. An alternative approach is "exonuclease sequencing", in which an exonuclease is attached to the nanopore that is able to process the strand, cleaving off one base at a time. The bases then flow through the nanopore and the current is measured. This method has the advantage of potentially solving the translocation rate problem, as the speed is controlled by the exonuclease. Here we consider the practical details of exonuclease attachment to the protein alpha hemolysin. We employ molecular dynamics simulations to determine the ideal (a) distance from alpha-hemolysin, and (b) the orientation of the monophosphate nucleotides upon release from the exonuclease such that they will enter the protein. Our results indicate an almost linear decrease in the probability of entry into the protein with increasing distance of nucleotide release. The nucleotide orientation is less significant for entry into the protein.

  9. Nucleotide Sequence of the blaRTG-2 (CARB-5) Gene and Phylogeny of a New Group of Carbenicillinases

    Science.gov (United States)

    Choury, Daniele; Szajnert, Marie-France; Joly-Guillou, Marie-Laure; Azibi, Kemal; Delpech, Marc; Paul, Gérard

    2000-01-01

    We determined the nucleotide sequence of the bla gene for the Acinetobacter calcoaceticus β-lactamase previously described as CARB-5. Alignment of the deduced amino acid sequence with those of known β-lactamases revealed that CARB-5 possesses an RTG triad in box VII, as described for the Proteus mirabilis GN79 enzyme, instead of the RSG consensus characteristic of the other carbenicillinases. Phylogenetic studies showed that these RTG enzymes constitute a new, separate group, possibly ancestors of the carbenicillinase family. PMID:10722515

  10. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

    Science.gov (United States)

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-11-16

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats.

  11. Nucleotide Sequence of the Coat Protein Gene of the Malaysian Passiflora Virus and its 3' Non-Coding Region

    Directory of Open Access Journals (Sweden)

    Norzihan Abdullah

    2009-01-01

    Problem statement: In this study, we identified the full-length Coat Protein (CP) gene of the Malaysian Passiflora Virus (MPV) and its 3' non-coding region. The CP gene of the MPV encoded 285 amino acid residues. Approach: Pairwise comparison of the MPV CP region with four other potyviruses, namely East Asian Passiflora Virus (EAPV), Passionfruit Woodiness Virus (PWV), Bean Common Mosaic Virus (BCMV) and Soyabean Mosaic Virus (SMV), revealed amino acid sequence similarities ranging from 72 to 95%. Results: The 3' non-coding region of the MPV, which consists of 255 nucleotides, showed 69-95% nucleotide sequence identity when compared with the four potyviruses. The highest (95%) sequence similarities were detected with PWV and EAPV. An analysis of the deduced amino acid sequences revealed the presence of consensus motifs (DAG tripeptides) characteristic of potyviruses. DAG tripeptides have been reported to be essential for aphid transmission. Conclusion: From the amino acid sequence alignment and the identity levels observed among the four other potyviruses, we concluded that MPV is a member of the genus Potyvirus and is closely related to both PWV and EAPV.
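
    The pairwise comparison described in this record boils down to a percent-identity figure computed over aligned sequences. The following is only a minimal sketch of one common definition (matches divided by ungapped alignment columns); the function name and the toy fragments are hypothetical and are not MPV data, and the authors' actual software and identity definition are not stated in the abstract.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment and
    therefore have equal length. Other identity definitions (e.g. dividing
    by the full alignment length) give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned fragments (hypothetical, for illustration only).
toy_a = "MDAGKTLN-AG"
toy_b = "MDAGRTLNQAG"
print(round(percent_identity(toy_a, toy_b), 1))
```

    Excluding gapped columns is a design choice; reporting which denominator was used matters when comparing identity values across studies.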

  12. IRE1α nucleotide sequence cleavage specificity in the unfolded protein response.

    Science.gov (United States)

    Poothong, Juthakorn; Sopha, Pattarawut; Kaufman, Randal J; Tirasophon, Witoon

    2017-01-01

    Inositol-requiring enzyme 1 (IRE1) is a conserved sensor of the unfolded protein response that has protein kinase and endoribonuclease (RNase) enzymatic activities and thereby initiates HAC1/XBP1 splicing. Previous studies demonstrated that human IRE1α (hIRE1α) does not cleave Saccharomyces cerevisiae HAC1 mRNA. Using an in vitro cleavage assay, we show that adenine to cytosine nucleotide substitution at the +1 position in the 3' splice site of HAC1 RNA is required for specific cleavage by hIRE1α. A similar restricted nucleotide specificity in the RNA substrate was observed for XBP1 splicing in vivo. Together these findings underscore the essential role of cytosine nucleotide at +1 in the 3' splice site for determining cleavage specificity of hIRE1α.

  13. Genotyping of simple sequence repeats--factors implicated in shadow band generation revisited.

    Science.gov (United States)

    Olejniczak, Marta; Krzyzosiak, Wlodzimierz J

    2006-10-01

    PCR amplification of microsatellite sequences generates, besides the main product corresponding to allele size, also additional, undesired products usually shorter by multiples of the repeated unit. These extra products known as shadow bands or stutter products may complicate genotyping. The mechanism by which these artifacts are formed is not well understood and so no effective remedy has been found to cope with these spurious products. In this study, using the DNA templates containing the CAG/CTG repeats flanked by gene-specific sequences and universal priming sites, we analyzed the effects of many PCR variables on the shadow band generation. The most important result was that at the decreased temperature of the denaturation step during PCR cycling the shadow bands were either not formed or were strongly suppressed. Several possible sources of this effect are discussed.

  14. In silico analysis of Simple Sequence Repeats from chloroplast genomes of Solanaceae species

    Directory of Open Access Journals (Sweden)

    Evandro Vagner Tambarussi

    2009-01-01

    The availability of chloroplast genome (cpDNA) sequences of Atropa belladonna, Nicotiana sylvestris, N. tabacum, N. tomentosiformis, Solanum bulbocastanum, S. lycopersicum and S. tuberosum, which are Solanaceae species, allowed us to analyze the organization of cpSSRs in their genic and intergenic regions. In general, the number of cpSSRs in cpDNA ranged from 161 in S. tuberosum to 226 in N. tabacum, and the number of intergenic cpSSRs was higher than genic cpSSRs. The mononucleotide repeats were the most frequent in studied species, but we also identified di-, tri-, tetra-, penta- and hexanucleotide repeats. Multiple alignments of all cpSSRs sequences from Solanaceae species made the identification of nucleotide variability possible and the phylogeny was estimated by maximum parsimony. Our study showed that the plastome database can be exploited for phylogenetic analysis and biotechnological approaches.
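
    An in silico scan for mono- to hexanucleotide repeats of the kind described in this record can be sketched with regular expressions. This is an illustrative sketch only: the minimum repeat counts in MIN_REPEATS, the function name and the toy sequence are assumptions, and the abstract does not say which tool or thresholds the authors used.

```python
import re

# Hypothetical minimum number of repeat units per motif length (1-6 nt).
# These thresholds are assumptions for illustration, not values from the study.
MIN_REPEATS = {1: 8, 2: 4, 3: 3, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, n_units) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "ATAT"),
            # so a poly-A run is reported once as a mononucleotide repeat.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence (made up): a poly-A run, an (AT)5 run and a (CAG)4 run.
toy = "GGC" + "A" * 10 + "CGG" + "AT" * 5 + "GCC" + "CAG" * 4 + "T"
for start, motif, n_units in find_ssrs(toy):
    print(start, motif, n_units)
```

    The scan reports the leftmost match, so the reported motif phase (for example AT versus TA) depends on the flanking bases. Classifying hits as genic or intergenic would additionally require annotation coordinates, which are outside this sketch.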

  15. Comparison of highly repeated DNA sequences in some Lemuridae and taxonomic implications.

    Science.gov (United States)

    Montagnon, D; Crovella, S; Rumpler, Y

    1993-01-01

    Highly repeated DNA sequences of Eulemur fulvus mayottensis, E. coronatus, Lemur catta, and Hapalemur griseus griseus have been identified and compared. Sequence analysis of highly repeated DNA fragments isolated from L. catta and Hapalemur showed a high percentage of similarity (nearly 95%), as did fragments isolated from the two very close Eulemur species, whereas comparison of the DNA fragments isolated from the two Eulemur species and the L. catta/Hapalemur group showed a very low percentage (approximately 40%) of identity, as might be expected for distant species. These results confirm our previous data, obtained by Southern blot hybridization techniques on the same species, and strongly support the existence of a common trunk between L. catta and Hapalemur, but different from the one leading to the Eulemur species.

  16. Cytogenetic analysis of Populus trichocarpa--ribosomal DNA, telomere repeat sequence, and marker-selected BACs.

    Science.gov (United States)

    Islam-Faridi, M N; Nelson, C D; DiFazio, S P; Gunter, L E; Tuskan, G A

    2009-01-01

    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequence assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.

  17. Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

    Energy Technology Data Exchange (ETDEWEB)

    Tuskan, Gerald A [ORNL; Gunter, Lee E [ORNL; DiFazio, Stephen P [West Virginia University

    2009-01-01

    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequence assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.

  18. Cloning and nucleotide sequence of the gene coding for enzymatically active fragments of the Bacillus polymyxa beta-amylase.

    Science.gov (United States)

    Kawazu, T; Nakanishi, Y; Uozumi, N; Sasaki, T; Yamagata, H; Tsukagoshi, N; Udaka, S

    1987-01-01

    The gene encoding beta-amylase was cloned from Bacillus polymyxa 72 into Escherichia coli HB101 by inserting HindIII-generated DNA fragments into the HindIII site of pBR322. The 4.8-kilobase insert was shown to direct the synthesis of beta-amylase. A 1.8-kilobase AccI-AccI fragment of the donor strain DNA was sufficient for the beta-amylase synthesis. Homologous DNA was found by Southern blot analysis to be present only in B. polymyxa 72 and not in other bacteria such as E. coli or B. subtilis. B. polymyxa, as well as E. coli harboring the cloned DNA, was found to produce enzymatically active fragments of beta-amylases (70,000, 56,000, or 58,000, and 42,000 daltons), which were detected in situ by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Nucleotide sequence analysis of the cloned 3.1-kilobase DNA revealed that it contains one open reading frame of 2,808 nucleotides without a translational stop codon. The deduced amino acid sequence for these 2,808 nucleotides encoding a secretory precursor of the beta-amylase protein is 936 amino acids including a signal peptide of 33 or 35 residues at its amino-terminal end. The existence of a beta-amylase of larger than 100,000 daltons, which was predicted on the basis of the results of nucleotide sequence analysis of the gene, was confirmed by examining culture supernatants after various cultivation periods. It existed only transiently during cultivation, but the multiform beta-amylases described above existed for a long time. The large beta-amylase (approximately 160,000 daltons) existed for longer in the presence of a protease inhibitor such as chymostatin, suggesting that proteolytic cleavage is the cause of the formation of multiform beta-amylases. PMID:2435707

  19. HIV-1 and HIV-2 LTR nucleotide sequences: assessment of the alignment by N-block presentation, "retroviral signatures" of overrepeated oligonucleotides, and a probable important role of scrambled stepwise duplications/deletions in molecular evolution.

    Science.gov (United States)

    Laprevotte, I; Pupin, M; Coward, E; Didier, G; Terzian, C; Devauchelle, C; Hénaut, A

    2001-07-01

    Previous analyses of retroviral nucleotide sequences, suggest a so-called "scrambled duplicative stepwise molecular evolution" (many sectors with successive duplications/deletions of short and longer motifs) that could have stemmed from one or several starter tandemly repeated short sequence(s). In the present report, we tested this hypothesis by focusing on the long terminal repeats (LTRs) (and flanking sequences) of 24 human and 3 simian immunodeficiency viruses. By using a calculation strategy applicable to short sequences, we found consensus overrepresented motifs (often containing CTG or CAG) that were congruent with the previously defined "retroviral signature." We also show many local repetition patterns that are significant when compared with simply shuffled sequences. First- and second-order Markov chain analyses demonstrate that a major portion of the overrepresented oligonucleotides can be predicted from the dinucleotide compositions of the sequences, but by no means can biological mechanisms be deduced from these results: some of the listed local repetitions remain significant against dinucleotide-conserving shuffled sequences; together with previous results, this suggests that interspersed and/or local mononucleotide and oligonucleotide repetitions could have biased the dinucleotide compositions of the sequences. We searched for suggestive evolutionary patterns by scrutinizing a reliable multiple alignment of the 27 sequences. A manually constructed alignment based on homology blocks was in good agreement with the polypeptide alignment in the coding sectors and has been exhaustively assessed by using a multiplied alphabet obtained by the promising mathematical strategy called the N-block presentation (taking into account the environment of each nucleotide in a sequence). Sector by sector, we hypothesize many successive duplication/deletion scenarios that fit our previous evolutionary hypotheses. 
This suggests an important duplication/deletion role for

  20. A novel regucalcin gene promoter region-related protein: comparison of nucleotide and amino acid sequences in vertebrate species.

    Science.gov (United States)

    Sawada, Natsumi; Yamaguchi, Masayoshi

    2005-01-01

    The molecular cloning and sequencing of the cDNA coding for a novel regucalcin gene promoter region-related protein (RGPR-p117) from bovine, rabbit and chicken livers was investigated using the rapid amplification of cDNA ends (RACE) method. Their nucleotide and amino acid sequences were compared with human, rat and mouse sequences published previously. RGPR-p117 of bovine, rabbit and chicken livers consisted of 1052, 1045, and 929 amino acid residues with calculated molecular mass of 117, 114, and 103 kDa, and estimated pI of 5.64, 5.84, and 5.59, respectively. Comparison analysis revealed that the nucleotide sequences of RGPR-p117 from mammalian species were highly conserved in their coding region, and the homologies were at least 72.9%. The RGPR-p117 proteins in mammalian species consisted of 1045-1060 amino acids, and had 63.1-90.2% identity. Meanwhile, the nucleotide and amino acid sequences of chicken RGPR-p117 had at least 36.4 and 43.7% identities, respectively. Phylogenetic analysis showed that RGPR-p117 in six vertebrates appears to form a single cluster. Mammalian RGPR-p117 conserved a leucine zipper motif. Moreover, the analysis for subcellular localization of RGPR-p117 from six vertebrates showed the probability of nuclear localization >52.2%; the nuclear localization in rat and mouse was 78.3%. This study demonstrates a great conservation of RGPR-p117 genes throughout evolution.

  1. Genetic Diversity Assessment and Identification of New Sour Cherry Genotypes Using Intersimple Sequence Repeat Markers

    OpenAIRE

    Roghayeh Najafzadeh; Kazem Arzani; Naser Bouzari; Ali Saei

    2014-01-01

    Iran is one of the chief origins of subgenus Cerasus germplasm. In this study, the genetic variation of new Iranian sour cherries (which had such superior growth characteristics and fruit quality as to be considered for the introduction of new cultivars) was investigated and identified using 23 intersimple sequence repeat (ISSR) markers. Results indicated a high level of polymorphism of the genotypes based on these markers. According to these results, primers tested in this study specially IS...

  2. Inter simple sequence repeat fingerprints for assess genetic diversity of tunisian garlic populations

    OpenAIRE

    Jabbes, Naouel; Geoffriau, Emmanuel; Le Clerc, Valérie; Dridi, Boutheina; Hannechi, Chérif

    2011-01-01

    Garlic (Allium sativum L.) that is cultivated in Tunisia is heterogeneous and unclassified with no registered local cultivars. At present, the level of genetic diversity in Tunisian garlic is almost unknown. Inter Simple Sequence Repeats (ISSR) genetic markers were therefore used to assess the genetic diversity and its distribution in 31 Tunisian garlic accessions with 4 French classified clones used as control. It was the first time that ISSR markers were used to detect diversity in garlic. ...

  3. Nucleotide sequence of the Syrian hamster intracisternal A-particle gene: close evolutionary relationship of type A particle gene to types B and D oncovirus genes.

    Science.gov (United States)

    Ono, M; Toh, H; Miyata, T; Awaya, T

    1985-08-01

    We determined the complete nucleotide sequence of the intracisternal A-particle gene, IAP-H18, cloned from the normal Syrian hamster liver DNA. IAP-H18 was 7,951 base pairs in length with two identical long terminal repeats of 376 base pairs at both ends. On the coding strand, imperfect open reading frames corresponding to gag and pol of the retrovirus genome were observed, whereas many stop codons were present in the region corresponding to env. The putative H18 gag gene (809 amino acids) had a sequence homologous to the N-terminal half of the mouse mammary tumor virus gag gene and locally to the Rous sarcoma virus gag gene. The putative H18 pol gene (900 residues) was homologous to the Rous sarcoma virus pol gene almost throughout the entire region. Two conserved regions among the retrovirus pol genes have been reported. One presumably corresponds to the DNA polymerase and the RNase H domain, and the other corresponds to the DNA endonuclease domain of the multifunctional protein pol. By the comparison of the deduced amino acid sequences of the putative endonuclease domain of six representative oncovirus genomes, a phylogenetic tree of the oncovirus genomes was constructed, and the intracisternal A-particle (type A) genome was found to be more closely related to the mouse mammary tumor virus (type B) and squirrel monkey retrovirus (type D) genomes.

  4. Development of single-nucleotide polymorphism markers for Bromus tectorum (Poaceae) from a partially sequenced transcriptome

    Science.gov (United States)

    Keith R. Merrill; Craig E. Coleman; Susan E. Meyer; Elizabeth A. Leger; Katherine A. Collins

    2016-01-01

    Premise of the study: Bromus tectorum (Poaceae) is an annual grass species that is invasive in many areas of the world but most especially in the U.S. Intermountain West. Single-nucleotide polymorphism (SNP) markers were developed for use in investigating the geospatial and ecological diversity of B. tectorum in the Intermountain West to better understand the...

  5. The Nucleotide Capture Region of Alpha Hemolysin: Insights into Nanopore Design for DNA Sequencing from Molecular Dynamics Simulations

    Science.gov (United States)

    Manara, Richard M. A.; Tomasio, Susana; Khalid, Syma

    2015-01-01

    Nanopore technology for DNA sequencing is constantly being refined and improved. In strand sequencing a single strand of DNA is fed through a nanopore and subsequent fluctuations in the current are measured. A major hurdle is that the DNA is translocated through the pore at a rate that is too fast for the current measurement systems. An alternative approach is “exonuclease sequencing”, in which an exonuclease is attached to the nanopore that is able to process the strand, cleaving off one base at a time. The bases then flow through the nanopore and the current is measured. This method has the advantage of potentially solving the translocation rate problem, as the speed is controlled by the exonuclease. Here we consider the practical details of exonuclease attachment to the protein alpha hemolysin. We employ molecular dynamics simulations to determine the ideal (a) distance from alpha-hemolysin, and (b) the orientation of the monophosphate nucleotides upon release from the exonuclease such that they will enter the protein. Our results indicate an almost linear decrease in the probability of entry into the protein with increasing distance of nucleotide release. The nucleotide orientation is less significant for entry into the protein.

  6. Repeated-Sprint Sequences During Female Soccer Matches Using Fixed and Individual Speed Thresholds.

    Science.gov (United States)

    Nakamura, Fábio Y; Pereira, Lucas A; Loturco, Irineu; Rosseti, Marcelo; Moura, Felipe A; Bradley, Paul S

    2017-07-01

    Nakamura, FY, Pereira, LA, Loturco, I, Rosseti, M, Moura, FA, and Bradley, PS. Repeated-sprint sequences during female soccer matches using fixed and individual speed thresholds. J Strength Cond Res 31(7): 1802-1810, 2017. The main objective of this study was to characterize the occurrence of single sprint and repeated-sprint sequences (RSS) during elite female soccer matches, using fixed (20 km·h⁻¹) and individually based speed thresholds (>90% of the mean speed from a 20-m sprint test). Eleven elite female soccer players from the same team participated in the study. All players performed a 20-m linear sprint test, and were assessed in up to 10 official matches using Global Positioning System technology. Magnitude-based inferences were used to test for meaningful differences. Results revealed that irrespective of adopting fixed or individual speed thresholds, female players produced only a few RSS during matches (2.3 ± 2.4 sequences using the fixed threshold and 3.3 ± 3.0 sequences using the individually based threshold), with most sequences consisting of just 2 sprints. Additionally, central defenders performed fewer sprints (10.2 ± 4.1) than other positions (fullbacks: 28.1 ± 5.5; midfielders: 21.9 ± 10.5; forwards: 31.9 ± 11.1; with the differences being likely to almost certainly associated with effect sizes ranging from 1.65 to 2.72), and sprinting ability declined in the second half. The data do not support the notion that RSS occurs frequently during soccer matches in female players, irrespective of using fixed or individual speed thresholds to define sprint occurrence. However, repeated-sprint ability development cannot be ruled out from soccer training programs because of its association with match-related performance.

  7. Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains : a web-based resource

    Directory of Open Access Journals (Sweden)

    Vergnaud Gilles

    2004-01-01

    Background: Polymorphic tandem repeat typing is a new generic technology which has been proved to be very efficient for bacterial pathogens such as B. anthracis, M. tuberculosis, P. aeruginosa, L. pneumophila and Y. pestis. The previously developed tandem repeats database takes advantage of the release of genome sequence data for a growing number of bacteria to facilitate the identification of tandem repeats. The development of an assay then requires the evaluation of tandem repeat polymorphism on well-selected sets of isolates. In the case of major human pathogens, such as S. aureus, more than one strain is being sequenced, so that tandem repeats most likely to be polymorphic can now be selected in silico based on genome sequence comparison. Results: In addition to the previously described general Tandem Repeats Database, we have developed a tool to automatically identify tandem repeats of a different length in the genome sequence of two (or more) closely related bacterial strains. Genome comparisons are pre-computed. The results of the comparisons are parsed in a database, which can be conveniently queried over the internet according to criteria of practical value, including repeat unit length, predicted size difference, etc. Comparisons are available for 16 bacterial species, and the orthopox viruses, including the variola virus and three of its close neighbors. Conclusions: We are presenting an internet-based resource to help develop and perform tandem repeats based bacterial strain typing. The tools accessible at http://minisatellites.u-psud.fr now comprise four parts. The Tandem Repeats Database enables the identification of tandem repeats across entire genomes. The Strain Comparison Page identifies tandem repeats differing between different genome sequences from the same species. The "Blast in the Tandem Repeats Database" facilitates the search for a known tandem repeat and the prediction of amplification product sizes.
The "Bacterial

  8. Long CAG repeat sequence and protein expression of androgen receptor considered as prognostic indicators in male breast carcinoma.

    Directory of Open Access Journals (Sweden)

    Yan-Ni Song

    BACKGROUND: The androgen receptor (AR) expression and the CAG repeat length within the AR gene appear to be involved in the carcinogenesis of male breast carcinoma (MBC). Although phenotypic differences have been observed between MBC and normal control group in AR gene, there is a lack of correlation analysis between AR expression and CAG repeat length in MBC. The purpose of the study was to investigate the prognostic value of CAG repeat lengths and AR protein expression. METHODS: 81 tumor tissues were used for immunostaining for AR expression and CAG repeat length determination and 80 normal controls were analyzed with CAG repeat length in AR gene. The CAG repeat length and AR expression were analyzed in relation to clinicopathological factors and prognostic indicators. RESULTS: AR gene in many MBCs has long CAG repeat sequence compared with that in control group (P = 0.001), and controls are more likely to exhibit short CAG repeat sequence than MBCs. There was a statistically significant difference in long CAG repeat sequence by AR status among MBC patients (P = 0.004). The presence of long CAG repeat sequence and AR-positive expression were associated with shorter survival of MBC patients (CAG repeat: P = 0.050 for 5y-OS, P = 0.035 for 5y-DFS; AR status: P = 0.048 for 5y-OS, P = 0.029 for 5y-DFS). CONCLUSION: The CAG repeat length within the AR gene might be one useful molecular biomarker to identify males at increased risk of breast cancer development. The presence of long CAG repeat sequence and AR protein expression were related to survival of MBC patients. The CAG repeat length and AR expression were two independent prognostic indicators in MBC patients.

  9. The nucleotide sequence of the right-hand terminus of adenovirus type 5 DNA: Implications for the mechanism of DNA replication

    NARCIS (Netherlands)

    Steenbergh, P.H.; Sussenbach, J.S.

    The nucleotide sequence of the right-hand terminal 3% of adenovirus type 5 (Ad5) DNA has been determined, using the chemical degradation technique developed by Maxam and Gilbert (1977). This region of the genome comprises the 1003 basepair long HindIII-I fragment and the first 75 nucleotides of the

  10. Fusion protein gene nucleotide sequence similarities, shared antigenic sites and phylogenetic analysis suggest that phocid distemper virus 2 and canine distemper virus belong to the same virus entity.

    NARCIS (Netherlands)

    I.K.G. Visser (Ilona); R.W.J. van der Heijden (Roger); M.W.G. van de Bildt (Marco); M.J.H. Kenter (Marcel); C. Örvell; A.D.M.E. Osterhaus (Albert)

    1993-01-01

    Nucleotide sequencing of the fusion protein (F) gene of phocid distemper virus-2 (PDV-2), recently isolated from Baikal seals (Phoca sibirica), revealed an open reading frame (nucleotides 84 to 2075) with two potential in-frame ATG translation initiation codons. We suggest that the secon

  11. Update on Pneumocystis carinii f. sp. hominis typing based on nucleotide sequence variations in internal transcribed spacer regions of rRNA genes

    DEFF Research Database (Denmark)

    Lee, C H; Helweg-Larsen, J; Tang, X

    1998-01-01

    Pneumocystis carinii f. sp. hominis isolates from 207 clinical specimens from nine countries were typed based on nucleotide sequence variations in the internal transcribed spacer regions I and II (ITS1 and ITS2, respectively) of rRNA genes. The number of ITS1 nucleotides has been revised from the...

  12. The nucleotide sequence of the right-hand terminus of adenovirus type 5 DNA: Implications for the mechanism of DNA replication

    NARCIS (Netherlands)

    Steenbergh, P.H.; Sussenbach, J.S.

    1979-01-01

    The nucleotide sequence of the right-hand terminal 3% of adenovirus type 5 (Ad5) DNA has been determined, using the chemical degradation technique developed by Maxam and Gilbert (1977). This region of the genome comprises the 1003 basepair long HindIII-I fragment and the first 75 nucleotides of the

  13. Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

    OpenAIRE

    Chunsheng Gao; Pengfei Xin; Chaohua Cheng; Qing Tang; Ping Chen; Changbiao Wang; Gonggu Zang; Lining Zhao

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SS...

  14. Effective DNA fragmentation technique for simple sequence repeat detection with a microsatellite-enriched library and high-throughput sequencing.

    Science.gov (United States)

    Tanaka, Keisuke; Ohtake, Rumi; Yoshida, Saki; Shinohara, Takashi

    2017-04-01

    Two different techniques for genomic DNA fragmentation before microsatellite-enriched library construction, restriction enzyme (NlaIII and MseI) digestion and sonication, were compared to examine their effects on simple sequence repeat (SSR) detection using high-throughput sequencing. Tens of thousands of SSR regions from 5 species of the plant family Myrtaceae were detected when the output of individual samples was >1 million paired-end reads. Comparison of the two DNA fragmentation techniques showed that restriction enzyme digestion was superior to sonication for identification of heterozygous genotypes, whereas sonication was superior for detection of various SSR flanking regions with both species-specific and common characteristics. Therefore, choosing the most suitable DNA fragmentation method depends on the type of analysis that is planned.

  15. Complete nucleotide sequence of the Cryptomeria japonica D. Don. chloroplast genome and comparative chloroplast genomics: diversified genomic structure of coniferous species

    Science.gov (United States)

    Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko

    2008-01-01

    Background The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. Results The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Gingko, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. Conclusion The observed differences in genomic structure between C. 
japonica and other land plants, including

  16. Simple Sequence Repeat Polymorphisms (SSRPs) for Evaluation of Molecular Diversity and Germplasm Classification of Minor Crops

    Directory of Open Access Journals (Sweden)

    Nam-Soo Kim

    2009-11-01

    Full Text Available Evaluation of the genetic diversity among populations is an essential prerequisite for the preservation of endangered species. Thousands of new accessions are introduced into germplasm institutes each year, thereby necessitating assessment of their molecular diversity before elimination of the redundant genotypes. Of the protocols that facilitate the assessment of molecular diversity, SSRPs (simple sequence repeat polymorphisms), or microsatellite variation, is the preferred system since it detects a large number of DNA polymorphisms with relatively little technical complexity. The paucity of information on DNA sequences has limited their widespread utilization in the assessment of genetic diversity of minor or neglected crop species. However, recent advancements in DNA sequencing and PCR technologies in conjunction with sophisticated computer software have facilitated the development of SSRP markers in minor crops. This review examines the development and molecular nature of SSR markers, and their utilization in many aspects of plant genetics and ecology.

  17. A revised ITS nucleotide sequence gives a specifity for Smallanthus sonchifolius (Poepp. and Endl.) and its products identification

    Directory of Open Access Journals (Sweden)

    Žiarovská Jana

    2013-01-01

    Full Text Available Yacon (Smallanthus sonchifolius) is an Andean crop that is highly regarded for its benefits for people suffering from diabetes or various digestive or renal disorders. Because no DNA markers specific to Smallanthus sonchifolius are yet known, the paper demonstrates that ITS regions can be used to detect and differentiate among yacon species, and also reports their potential for food authentication purposes. Analysis of the newly sequenced ITS of yacon accessions originating in Peru, Ecuador and Bolivia revealed a unique sequence site that differs from all of the other yacon species and is recognized by the DraIII restriction endonuclease. Restriction cleavage of the PCR-amplified ITSs of the twenty-eight yacon accessions was performed, and in all cases the recognition site was confirmed as typical for Smallanthus sonchifolius. Based on the nucleotide specificity of the Smallanthus sonchifolius ITS sequence, a PCR method combined with a restriction cleavage protocol was developed for yacon identification.

  18. SinicView: A visualization environment for comparisons of multiple nucleotide sequence alignment tools

    OpenAIRE

    Wong Chun-Yi; Wu Yu-Wei; Chen Shiang-Heng; Peng Chin-Lin; Lin Laurent; Lee DT; Shih Arthur; Chou Meng-Yuan; Shiao Tze-Chang; Hsieh Mu-Fen

    2006-01-01

    Abstract Background Deluged by the rate and complexity of completed genomic sequences, the need to align longer sequences becomes more urgent, and many more tools have thus been developed. In the initial stage of genomic sequence analysis, a biologist is usually faced with the questions of how to choose the best tool to align sequences of interest and how to analyze and visualize the alignment results, and then with the question of whether poorly aligned regions produced by the tool are indee...

  19. CASCAD : a database of annotated candidate single nucleotide polymorphisms associated with expressed sequences

    NARCIS (Netherlands)

    Guryev, Victor; Berezikov, Eugene; Cuppen, Edwin

    2005-01-01

    BACKGROUND: With the recent progress made in large-scale genome sequencing projects a vast amount of novel data is becoming available. A comparative sequence analysis, exploiting sequence information from various resources, can be used to uncover hidden information, such as genetic variation. Althou

  20. Nucleotide sequence of Zygosaccharomyces bailii virus Z: Evidence for +1 programmed ribosomal frameshifting and for assignment to family Amalgaviridae.

    Science.gov (United States)

    Depierreux, Delphine; Vong, Minh; Nibert, Max L

    2016-06-02

    Zygosaccharomyces bailii virus Z (ZbV-Z) is a monosegmented dsRNA virus that infects the yeast Zygosaccharomyces bailii and remains unclassified to date despite its discovery >20 years ago. The previously reported nucleotide sequence of ZbV-Z (GenBank AF224490) encompasses two nonoverlapping long ORFs: upstream ORF1 encoding the putative coat protein and downstream ORF2 encoding the RNA-dependent RNA polymerase (RdRp). The lack of overlap between these ORFs raises the question of how the downstream ORF is translated. After examining the previous sequence of ZbV-Z, we predicted that it contains at least one sequencing error to explain the nonoverlapping ORFs, and hence we redetermined the nucleotide sequence of ZbV-Z, derived from the same isolate of Z. bailii as previously studied, to address this prediction. The key finding from our new sequence, which includes several insertions, deletions, and substitutions relative to the previous one, is that ORF2 in fact overlaps ORF1 in the +1 frame. Moreover, a proposed sequence motif for +1 programmed ribosomal frameshifting, previously noted in influenza A viruses, plant amalgaviruses, and others, is also present in the newly identified ORF1-ORF2 overlap region of ZbV-Z. Phylogenetic analyses provided evidence that ZbV-Z represents a distinct taxon most closely related to plant amalgaviruses (genus Amalgavirus, family Amalgaviridae). We conclude that ZbV-Z is the prototype of a new species, which we propose to assign as type species of a new genus of monosegmented dsRNA mycoviruses in family Amalgaviridae. Comparisons involving other unclassified mycoviruses with RdRps apparently related to those of plant amalgaviruses, and having either mono- or bisegmented dsRNA genomes, are also discussed.

  1. Genomic and polyploid evolution in genus Avena as revealed by RFLPs of repeated DNA sequences.

    Science.gov (United States)

    Morikawa, Toshinobu; Nishihara, Miho

    2009-06-01

    Phylogenetic relationships and genome affinities were investigated by utilizing all the biological Avena species consisting of 11 diploid species (15 accessions), 8 tetraploid species (9 accessions) and 4 hexaploid species (5 accessions). Genomic DNA regions of As120a, avenin, and globulin were amplified by PCR. A total of 130 polymorphic fragments were detected out of 156 fragments generated by digesting the PCR-amplified fragments with 11 restriction enzymes. The number of fragments generated by PCR-amplification followed by digestion with restriction enzymes was almost the same as those among the three repeated DNA sequences. A high level of genetic distance was detected between A. damascena (Ad) and A. canariensis (Ac) genomes, which reflected their different morphology and reproductive isolation. The A. longiglumis (Al) and A. prostrata (Ap) genomes were closely related to the As genome group. The AB genome species formed a cluster with the AsAs genome artificial autotetraploid and the As genome diploids indicating near-autotetraploid origin. The A. macrostachya is an outbreeding autotetraploid closely related with the C genome diploid and the AC genome tetraploid species. The differences of genetic distances estimated from the repeated DNA sequence divergence among the Avena species were consistent with genome divergences and it was possible to compare the genetic intra- and inter-ploidy relationships produced by RFLPs. These results suggested that the PCR-mediated analysis of repeated DNA polymorphism can be used as a tool to examine genomic relationships of polyploidy species.

  2. Isolation, characterization and amplification of simple sequence repeat loci in coffee

    Directory of Open Access Journals (Sweden)

    Marco-Aurelio Cristancho

    2008-01-01

    Full Text Available Simple sequence repeat (microsatellite) loci in coffee were identified in clones isolated from enriched and random genomic libraries. It was shown that coffee is a plant species with low microsatellite frequency. However, the average distance between two loci, estimated at 127 kb for poly(AG), is one of the shortest of all plant genomes. In contrast, the distance between two poly(AC) loci, estimated at 769 kb, is one of the largest in plant genomes. Coffee (AC)n microsatellites are frequently associated with other microsatellites, mainly (AT)n motifs, while (AG)n microsatellites are not normally associated with other microsatellites and have a higher number of perfect motifs. Dinucleotide repeats (AG) and (AC) were found in AT-rich regions in coffee. Sequence analysis of (AC)n microsatellites identified in coffee revealed the possible association of these repeated elements with miniature inverted-repeat transposable elements (MITEs). In addition, some of the evaluated SSR markers produced transposon-like amplification patterns in tetraploid genotypes. Of 12 SSR markers developed, nine were polymorphic in diploid genotypes while five were polymorphic in tetraploid genotypes, confirming a greater genetic diversity in diploid species.

  3. Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale

    DEFF Research Database (Denmark)

    Huang, Shujia; Rao, Junhua; Ye, Weijian

    2015-01-01

    Comprehensive recognition of genomic variation in one individual is important for understanding disease and developing personalized medication and treatment. Many tools based on DNA re-sequencing exist for identification of single nucleotide polymorphisms, small insertions and deletions (indels) ...

  6. Nucleotide sequences from the genomes of diverse cowpea accessions for discovery of genetic variation as part of the Feed the Future Innovation Lab for Climate Resilient Cowpea

    Data.gov (United States)

    US Agency for International Development — Nucleotide sequences were generated from 37 cowpea (Vigna unguiculata L. Walp.) accessions relevant to Africa, China and the USA to discover at type of genetic...

  7. Nucleotide sequence and infectious cDNA clone of the L1 isolate of Pea seed-borne mosaic potyvirus.

    Science.gov (United States)

    Olsen, B S; Johansen, I E

    2001-01-01

    The complete nucleotide sequence of Pea seed-borne mosaic potyvirus isolate L1 has been determined from cloned virus cDNA. The PSbMV L1 genome is 9895 nucleotides in length excluding the poly(A) tail. Computer analysis of the sequence revealed a single long open reading frame (ORF) of 9594 nucleotides. The ORF potentially encodes a polyprotein of 3198 amino acids with a deduced Mr of 363537. Nine putative proteolytic cleavage sites were identified by analogy to consensus sequences and genome arrangement in other potyviruses. Two full-length cDNA clones, p35S-L1-4 and p35S-L1-5, were assembled under control of an enhanced 35S promoter and nopaline synthase terminator. Clone p35S-L1-4 was constructed with four introns and p35S-L1-5 with five introns inserted in the cDNA. Clone p35S-L1-4 was unstable in Escherichia coli often resulting in amplification of plasmids with deletions. Clone p35S-L1-5 was stable and apparently less toxic to Escherichia coli resulting in larger bacterial colonies and higher plasmid yield. Both clones were infectious upon mechanical inoculation of plasmid DNA on susceptible pea cultivars Fjord, Scout, and Brutus. Eight pea genotypes resistant to L1 virus were also resistant to the cDNA derived L1 virus. Both native PSbMV L1 and the cDNA derived virus infected Chenopodium quinoa systemically giving rise to characteristic necrotic lesions on uninoculated leaves.
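
    The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes the reported ORF length of 9594 nucleotides maps one-to-one onto the 3198 codons of the polyprotein (the abstract does not say how the stop codon is counted).

```python
def codons_in_orf(orf_length_nt: int) -> int:
    """Return the number of complete codons in an ORF of the given length."""
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_nt // 3


# Values taken from the abstract.
GENOME_LENGTH_NT = 9895   # excluding the poly(A) tail
ORF_LENGTH_NT = 9594
POLYPROTEIN_MR = 363537   # deduced relative molecular mass

codons = codons_in_orf(ORF_LENGTH_NT)
# Nucleotides outside the ORF (5' and 3' untranslated regions combined).
untranslated_nt = GENOME_LENGTH_NT - ORF_LENGTH_NT
# Average mass per residue, a quick plausibility check on the deduced Mr.
mean_residue_mass = POLYPROTEIN_MR / codons

print(codons, untranslated_nt, round(mean_residue_mass, 1))
```

    The codon count agrees with the 3198 amino acids stated for the polyprotein, and the mean residue mass falls in the range typically seen for proteins.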

  8. Nuclear Receptor HNF4α Binding Sequences are Widespread in Alu Repeats

    Directory of Open Access Journals (Sweden)

    Bolotin Eugene

    2011-11-01

    Full Text Available Abstract Background Alu repeats, which account for ~10% of the human genome, were originally considered to be junk DNA. Recent studies, however, suggest that they may contain transcription factor binding sites and hence possibly play a role in regulating gene expression. Results Here, we show that binding sites for a highly conserved member of the nuclear receptor superfamily of ligand-dependent transcription factors, hepatocyte nuclear factor 4alpha (HNF4α, NR2A1), are highly prevalent in Alu repeats. We employ high throughput protein binding microarrays (PBMs) to show that HNF4α binds > 66 unique sequences in Alu repeats that are present in ~1.2 million locations in the human genome. We use chromatin immunoprecipitation (ChIP) to demonstrate that HNF4α binds Alu elements in the promoters of target genes (ABCC3, APOA4, APOM, ATPIF1, CANX, FEMT1A, GSTM4, IL32, IP6K2, PRLR, PRODH2, SOCS2, TTR) and luciferase assays to show that at least some of those Alu elements can modulate HNF4α-mediated transactivation in vivo (APOM, PRODH2, TTR, APOA4). HNF4α-Alu elements are enriched in promoters of genes involved in RNA processing and a sizeable fraction are in regions of accessible chromatin. Comparative genomics analysis suggests that there may have been a gain in HNF4α binding sites in Alu elements during evolution and that non-Alu repeats, such as Tiggers, also contain HNF4α sites. Conclusions Our findings suggest that HNF4α, in addition to regulating gene expression via high affinity binding sites, may also modulate transcription via low affinity sites in Alu repeats.

  9. BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations.

    Science.gov (United States)

    Bahr, A; Thompson, J D; Thierry, J C; Poch, O

    2001-01-01

    BAliBASE is specifically designed to serve as an evaluation resource to address all the problems encountered when aligning complete sequences. The database contains high quality, manually constructed multiple sequence alignments together with detailed annotations. The alignments are all based on three-dimensional structural superpositions, with the exception of the transmembrane sequences. The first release provided sets of reference alignments dealing with the problems of high variability, unequal repartition and large N/C-terminal extensions and internal insertions. Here we describe version 2.0 of the database, which incorporates three new reference sets of alignments containing structural repeats, transmembrane sequences and circular permutations to evaluate the accuracy of detection/prediction and alignment of these complex sequences. BAliBASE can be viewed at the web site http://www-igbmc.u-strasbg.fr/BioInfo/BAliBASE2/index.html or can be downloaded from ftp://ftp-igbmc.u-strasbg.fr/pub/BAliBASE2/.

  10. Nucleotide substitutions in rolC and nptII gene sequences during long-term cultivation of Panax ginseng cell cultures.

    Science.gov (United States)

    Kiselev, Konstantin V; Turlenko, Anna V; Tchernoded, Galina K; Zhuravlev, Yuri N

    2009-08-01

    It has been shown previously that the rolC gene from Agrobacterium tumefaciens was stably and highly expressed in 15-year-old Panax ginseng transgenic cell cultures. In the present report, we analyze in detail the nucleotide composition of the rolC gene and of the nptII (neomycin phosphotransferase) gene, which is the selective marker used for transgenic cell cultures of P. ginseng. It has been established that the nucleotide sequences of the rolC and nptII genes underwent mutagenesis during cultivation. In particular, 1-4 nucleotide substitutions were found per sequence in the 540 and 798 bp segments of the complete rolC and nptII genes, respectively. Approximately half of these nucleotide substitutions caused changes in the structure of the predicted gene product. In addition, we attempted to determine the rate of accumulation of these changes by comparison of DNA extracted from P. ginseng cell cultures from 1995 to 2007. It was observed that the frequency of nucleotide substitutions for the rolC and nptII genes in 1995 was 1.21 +/- 0.02 per 1,000 nucleotides analyzed, while in 2007, the frequency of nucleotide substitutions had significantly increased (1.37 +/- 0.07 per 1,000 nucleotides analyzed). Analyzing the nucleotide substitutions, we found that substitutions to G or to C nucleotides were significantly more frequent (1.9-fold) in the rolC and nptII genes than in the P. ginseng actin gene. Finally, the level of nucleotide substitutions in the rolC gene was 1.1-fold higher than in the nptII gene. Thus, for the first time, we have experimentally demonstrated the level of nucleotide substitutions in transferred genes in transgenic plant cell cultures.

  11. Analysis of expressed sequence tags from Prunus mume flower and fruit and development of simple sequence repeat markers

    Directory of Open Access Journals (Sweden)

    Gao Zhihong

    2010-07-01

    Full Text Available Abstract Background Expressed Sequence Tag (EST) has been a cost-effective tool in molecular biology and represents an abundant valuable resource for genome annotation, gene expression, and comparative genomics in plants. Results In this study, we constructed a cDNA library of Prunus mume flower and fruit, sequenced 10,123 clones of the library, and obtained 8,656 expressed sequence tag (EST) sequences with high quality. The ESTs were assembled into 4,473 unigenes composed of 1,492 contigs and 2,981 singletons, which have been deposited in NCBI (accession IDs: GW868575 - GW873047), among which 1,294 unique ESTs had known or putative functions. Furthermore, we found 1,233 putative simple sequence repeats (SSRs) in the P. mume unigene dataset. We randomly tested 42 pairs of PCR primers flanking potential SSRs, and 14 pairs were identified as true-to-type SSR loci and could amplify polymorphic bands from 20 individual plants of P. mume. We further used the 14 EST-SSR primer pairs to test the transferability on peach and plum. The result showed that nearly 89% of the primer pairs produced target PCR bands in the two species. A high level of marker polymorphism was observed in the plum species (65%) and a low level in the peach (46%), and the clustering analysis of the three species indicated that these SSR markers were useful in the evaluation of genetic relationships and diversity between and within the Prunus species. Conclusions We have constructed the first cDNA library of P. mume flower and fruit, and our data provide sets of molecular biology resources for P. mume and other Prunus species. These resources will be useful for further study such as genome annotation, new gene discovery, gene functional analysis, molecular breeding, evolution and comparative genomics between Prunus species.
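
    The counts reported in this abstract are internally consistent, which a short check can confirm. This is a minimal sketch, not part of the study: the helper name is hypothetical, and it assumes the quoted accession IDs form one gap-free consecutive range with a two-letter prefix.

```python
def accession_span(first: str, last: str, prefix_len: int = 2) -> int:
    """Count accessions in an inclusive range such as GW868575 - GW873047.

    Assumes both IDs share the same prefix and that the numeric parts
    are consecutive with no gaps (an assumption, not stated in the abstract).
    """
    if first[:prefix_len] != last[:prefix_len]:
        raise ValueError("accession prefixes differ")
    return int(last[prefix_len:]) - int(first[prefix_len:]) + 1


# Values taken from the abstract.
CLONES_SEQUENCED = 10123
HIGH_QUALITY_ESTS = 8656
CONTIGS = 1492
SINGLETONS = 2981

unigenes = CONTIGS + SINGLETONS
deposited = accession_span("GW868575", "GW873047")
high_quality_fraction = HIGH_QUALITY_ESTS / CLONES_SEQUENCED

print(unigenes, deposited, f"{high_quality_fraction:.1%}")
```

    Both the contig-plus-singleton total and the accession range give 4,473, matching the number of unigenes stated above.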

  12. Inter-simple sequence repeat (ISSR) loci mapping in the genome of perennial ryegrass

    DEFF Research Database (Denmark)

    Pivorienė, O; Pašakinskienė, I; Brazauskas, G;

    2008-01-01

    The aim of this study was to identify and characterize new ISSR markers and their loci in the genome of perennial ryegrass. A subsample of the VrnA F2 mapping family of perennial ryegrass comprising 92 individuals was used to develop a linkage map including inter-simple sequence repeat markers...... demonstrated a 70% similarity to the Hordeum vulgare germin gene GerA. Inter-SSR mapping will provide useful information for gene targeting, quantitative trait loci mapping and marker-assisted selection in perennial ryegrass....

  13. The complete nucleotide sequence and genomic characterization of tropical soda apple mosaic virus

    Science.gov (United States)

    Tropical soda apple mosaic virus (TSAMV) was first identified in tropical soda apple (Solanum viarum), a noxious weed, in Florida in 2002. This report provides the first full genome sequence of TSAMV. The full genome sequence of this virus will enable research scientists to develop additional spec...

  14. Localization of a new highly repeated DNA sequence of Lemur catta (Lemuridae, Strepsirhini).

    Science.gov (United States)

    Boniotto, Michele; Ventura, Mario; Cardone, Maria Francesca; Boaretto, Francesca; Archidiacono, Nicoletta; Rocchi, Mariano; Crovella, Sergio

    2002-10-01

    We have isolated and cloned an 800-bp highly repeated DNA (HRDNA) sequence from Lemur catta (LCA) and described its localization on LCA chromosomes. Lemur catta HRDNA sequences were localized by performing FISH experiments on standard and elongated metaphasic chromosomes using an LCA HRDNA probe (LCASAT). A complex hybridization pattern was detected. A strong pericentromeric hybridization signal was observed on most LCA chromosomes. Chromosomes 7 and 13 were lit in pericentromeric regions, as well as in the interspersed heterochromatin. Chromosomes 1, 3, 4, 17, 19, X, and microchromosomes (20, 25, 26, and 27) showed no signals in the pericentromeric region, but chromosomes 3 and 4 showed a positive hybridization in heterochromatic regions. The 800-bp L. catta HRDNA was species specific. We performed FISH experiments with the LCASAT probe on Eulemur macaco macaco (EMA) and Eulemur fulvus fulvus (EFU) metaphases and no positive signal of hybridization was detected. These findings were also confirmed by Southern blot analysis and PCR.

  15. Efficient detection of chromosome imbalances and exome single nucleotide variants using targeted sequencing in the clinical setting.

    Science.gov (United States)

    Villela, Darine; da Costa, Silvia Souza; Vianna-Morgante, Angela M; Krepischi, Ana C V; Rosenberg, Carla

    2017-09-04

    We evaluated an approach to detect copy number variants (CNVs) and single nucleotide changes (SNVs), using a clinically focused exome panel complemented with a backbone and SNP probes that allows for genome-wide copy number changes and copy-neutral absence of heterozygosity (AOH) calls; this approach potentially substitutes the use of chromosomal microarray testing and sequencing into a single test. A panel of 16 DNA samples with known alterations ranging from megabase-scale CNVs to single base modifications were used as positive controls for sequencing data analysis. The DNA panel included CNVs (n = 13) of variable sizes (23 Kb to 27 Mb), uniparental disomy (UPD; n = 1), and single point mutations (n = 2). All DNA sequence changes were identified by the current platform, showing that CNVs of at least 23 Kb can be properly detected. The estimated size of genomic imbalances detected by microarrays and next generation sequencing are virtually the same, indicating that the resolution and sensitivity of this approach are at least similar to those provided by DNA microarrays. Accordingly, our data show that the combination of a sequencing platform comprising focused exome and whole genome backbone, with appropriate algorithms, enables a cost-effective and efficient solution for the simultaneous detection of CNVs and SNVs. Copyright © 2017. Published by Elsevier Masson SAS.

  16. Behavior of Repeating Earthquake Sequences in Central California and the Implications for Subsurface Fault Creep

    Energy Technology Data Exchange (ETDEWEB)

    Templeton, D C; Nadeau, R; Burgmann, R

    2007-07-09

    Repeating earthquakes (REs) are sequences of events that have nearly identical waveforms and are interpreted to represent fault asperities driven to failure by loading from aseismic creep on the surrounding fault surface at depth. We investigate the occurrence of these REs along faults in central California to determine which faults exhibit creep and the spatio-temporal distribution of this creep. At the juncture of the San Andreas and southern Calaveras-Paicines faults, both faults as well as a smaller secondary fault, the Quien Sabe fault, are observed to produce REs over the observation period of March 1984-May 2005. REs in this area reflect a heterogeneous creep distribution along the fault plane with significant variations in time. Cumulative slip over the observation period at individual sequence locations is determined to range from 5.5-58.2 cm on the San Andreas fault, 4.8-14.1 cm on the southern Calaveras-Paicines fault, and 4.9-24.8 cm on the Quien Sabe fault. Creep at depth appears to mimic the behaviors seen of creep on the surface in that evidence of steady slip, triggered slip, and episodic slip phenomena are also observed in the RE sequences. For comparison, we investigate the occurrence of REs west of the San Andreas fault within the southern Coast Range. Events within these RE sequences only occurred minutes to weeks apart from each other and then did not repeat again over the observation period, suggesting that REs in this area are not produced by steady aseismic creep of the surrounding fault surface.

  17. Image Encryption Algorithm Based on Hyperchaotic Maps and Nucleotide Sequences Database.

    Science.gov (United States)

    Niu, Ying; Zhang, Xuncai; Han, Feng

    2017-01-01

    Image encryption technology is one of the main means to ensure the safety of image information. Using the characteristics of chaos, such as randomness, regularity, ergodicity, and initial value sensitiveness, combined with the unique space conformation of DNA molecules and their unique information storage and processing ability, an efficient method for image encryption based on the chaos theory and a DNA sequence database is proposed. In this paper, digital image encryption employs a process of transforming the image pixel gray value by using chaotic sequence scrambling image pixel location and establishing superchaotic mapping, which maps quaternary sequences and DNA sequences, and by combining with the logic of the transformation between DNA sequences. The bases are replaced under the displaced rules by using DNA coding in a certain number of iterations that are based on the enhanced quaternary hyperchaotic sequence; the sequence is generated by Chen chaos. The cipher feedback mode and chaos iteration are employed in the encryption process to enhance the confusion and diffusion properties of the algorithm. Theoretical analysis and experimental results show that the proposed scheme not only demonstrates excellent encryption but also effectively resists chosen-plaintext attack, statistical attack, and differential attack.
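
    The DNA coding step mentioned in this abstract, in which pixel gray values are mapped to nucleotide bases, can be sketched as follows. This is an illustration under stated assumptions, not the authors' scheme: the fixed rule 00->A, 01->C, 10->G, 11->T and both function names are hypothetical, and the chaotic key stream, scrambling and cipher feedback described above are not reproduced.

```python
BASES = "ACGT"  # assumed coding rule: 00->A, 01->C, 10->G, 11->T


def byte_to_dna(value: int) -> str:
    """Encode one 8-bit gray value as four bases, most significant bits first."""
    if not 0 <= value <= 255:
        raise ValueError("gray value must fit in 8 bits")
    return "".join(BASES[(value >> shift) & 0b11] for shift in (6, 4, 2, 0))


def dna_to_byte(seq: str) -> int:
    """Decode four bases back into the 8-bit gray value."""
    if len(seq) != 4:
        raise ValueError("expected exactly four bases")
    value = 0
    for base in seq:
        value = (value << 2) | BASES.index(base)
    return value


pixel = 0b00011011           # gray value 27
encoded = byte_to_dna(pixel)
print(encoded, dna_to_byte(encoded))
```

    Because each base carries two bits, an 8-bit pixel always becomes a four-base word, and the mapping is lossless in both directions.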

  18. Target genes of microsatellite sequences in head and neck squamous cell carcinoma: mononucleotide repeats are not detected.

    Science.gov (United States)

    Wang, Yimin; Liu, Xuejuan; Li, Yulin

    2012-09-10

    Microsatellite instability (MSI) is detected in a wide variety of tumors. It is thought that mismatch repair gene mutation or inactivation is the major cause of MSI. Microsatellite sequences are predominantly distributed in intergenic or intronic DNA. However, MSI is found in the exonic sequences of some genes, causing their inactivation. In this report, we searched GenBank for candidate genes containing potential MSI sequences in exonic regions. Twenty seven target genes were selected for MSI analysis. Instability was found in 70% of these genes (14/20) with head and neck squamous cell carcinoma (HNSCC). Interestingly, no instability was detected in mononucleotide repeats in genes or in intergenic sequences. We conclude that instability of mononucleotide repeats is a rare event in HNSCC. High MSI phenotype in young HNSCC patients is limited to noncoding regions only. MSI percentage in HNSCC tumor is closely related to the repeat type, repeat location and patient's age.

  19. The impact of CRISPR repeat sequence on structures of a Cas6 protein-RNA complex

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Ruiying; Zheng, Han; Preamplume, Gan; Shao, Yaming; Li, Hong [FSU]

    2012-03-15

    The repeat-associated mysterious proteins (RAMPs) comprise the most abundant family of proteins involved in prokaryotic immunity against invading genetic elements conferred by the clustered regularly interspaced short palindromic repeat (CRISPR) system. Cas6 is one of the first characterized RAMP proteins and is a key enzyme required for CRISPR RNA maturation. Despite a strong structural homology with other RAMP proteins that bind hairpin RNA, Cas6 distinctly recognizes single-stranded RNA. Previous structural and biochemical studies show that Cas6 captures the 5' end while cleaving the 3' end of the CRISPR RNA. Here, we describe three structures and complementary biochemical analysis of a noncatalytic Cas6 homolog from Pyrococcus horikoshii bound to CRISPR repeat RNA of different sequences. Our study confirms the specificity of the Cas6 protein for single-stranded RNA and further reveals the importance of the bases at Positions 5-7 in Cas6-RNA interactions. Substitutions of these bases result in structural changes in the protein-RNA complex including its oligomerization state.

  20. Development of simple sequence repeat (SSR) markers of sesame (Sesamum indicum) from a genome survey.

    Science.gov (United States)

    Wei, Xin; Wang, Linhai; Zhang, Yanxin; Qi, Xiaoqiong; Wang, Xiaoling; Ding, Xia; Zhang, Jing; Zhang, Xiurong

    2014-04-22

    Sesame (Sesamum indicum), an important oil crop, is widely grown in tropical and subtropical regions. It provides part of the daily edible oil allowance for almost half of the world's population. A limited number of co-dominant markers has been developed and applied in sesame genetic diversity and germplasm identity studies. Here we report for the first time a whole genome survey used to develop simple sequence repeat (SSR) markers and to detect the genetic diversity of sesame germplasm. From the initial assembled sesame genome, 23,438 SSRs (≥5 repeats) were identified. The most common repeat motif was dinucleotide with a frequency of 84.24%, followed by 13.53% trinucleotide, 1.65% tetranucleotide, 0.3% pentanucleotide and 0.28% hexanucleotide motifs. From 1500 designed and synthesised primer pairs, 218 polymorphic SSRs were developed and used to screen 31 sesame accessions from 12 countries. STRUCTURE and phylogenetic analyses indicated that all sesame accessions could be divided into two groups: one mainly from China and another from other countries. Cluster analysis classified Chinese major sesame varieties into three groups. These novel SSR markers are a useful tool for genetic linkage map construction, genetic diversity detection, and marker-assisted selective sesame breeding.
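
    The kind of SSR scan described in this abstract can be approximated with a regular expression. The sketch below is a simplified illustration, not the tool used in the study: the function name is hypothetical, and it assumes that the five-repeat minimum applies to every motif length from di- to hexanucleotide and that mononucleotide runs are ignored.

```python
import re
from collections import Counter


def find_ssrs(sequence: str, min_repeats: int = 5):
    """Return (start, motif, repeat_count) for perfect SSRs with 2-6 nt motifs.

    The lazy quantifier makes the regex try the shortest motif first, so a
    run such as ATATATATAT is reported as (AT)5 rather than as a longer motif.
    """
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical test sequence: (AT)6, (CAA)5 and a poly(A) run.
demo = "GG" + "AT" * 6 + "CCG" + "CAA" * 5 + "T" + "A" * 12
ssrs = find_ssrs(demo)
motif_lengths = Counter(len(motif) for _, motif, _ in ssrs)
print(ssrs, dict(motif_lengths))
```

    Tallying hits by motif length, as in `motif_lengths`, gives the kind of di-, tri- and tetranucleotide breakdown reported above.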

  1. Simple sequence repeat marker development and genetic mapping in quinoa (Chenopodium quinoa Willd.)

    Indian Academy of Sciences (India)

    D. E. Jarvis; O. R. Kopp; E. N. Jellen; M. A. Mallory; J. Pattee; A. Bonifacio; C. E. Coleman; M. R. Stevens; D. J. Fairbanks; P. J. Maughan

    2008-04-01

    Quinoa is a regionally important grain crop in the Andean region of South America. Recently quinoa has gained international attention for its high nutritional value and tolerances of extreme abiotic stresses. DNA markers and linkage maps are important tools for germplasm conservation and crop improvement programmes. Here we report the development of 216 new polymorphic SSR (simple sequence repeats) markers from libraries enriched for GA, CAA and AAT repeats, as well as 6 SSR markers developed from bacterial artificial chromosome-end sequences (BES-SSRs). Heterozygosity (H) values of the SSR markers range from 0.12 to 0.90, with an average value of 0.57. A linkage map was constructed for a newly developed recombinant inbred lines (RIL) population using these SSR markers. Additional markers, including amplified fragment length polymorphisms (AFLPs), two 11S seed storage protein loci, and the nucleolar organizing region (NOR), were also placed on the linkage map. The linkage map presented here is the first SSR-based map in quinoa and contains 275 markers, including 200 SSRs. The map consists of 38 linkage groups (LGs) covering 913 cM. Segregation distortion was observed in the mapping population for several marker loci, indicating possible chromosomal regions associated with selection or gametophytic lethality. As this map is based primarily on simple and easily-transferable SSR markers, it will be particularly valuable for research in laboratories in Andean regions of South America.
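
    The abstract does not state how the heterozygosity (H) values were calculated. A common choice for SSR markers is expected heterozygosity, H = 1 - sum(p_i^2) over allele frequencies p_i; the sketch below assumes that definition, and the function name and allele sizes are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(alleles) -> float:
    """Expected heterozygosity H = 1 - sum(p_i ** 2) for one marker.

    `alleles` is an iterable of observed allele labels (for SSRs, typically
    fragment sizes in bp). This formula is an assumption; the abstract does
    not say which estimator was used.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles supplied")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical fragment sizes (bp) scored at one SSR locus.
observed = [180, 180, 184, 188]
print(expected_heterozygosity(observed))
```

    Under this definition a marker with a single allele scores 0, and values approach 1 as more alleles occur at similar frequencies.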

  2. Simple sequence repeats provide a substrate for phenotypic variation in the Neurospora crassa circadian clock.

    Directory of Open Access Journals (Sweden)

    Todd P Michael

    Full Text Available BACKGROUND: WHITE COLLAR-1 (WC-1) mediates interactions between the circadian clock and the environment by acting as both a core clock component and as a blue light photoreceptor in Neurospora crassa. Loss of the amino-terminal polyglutamine (NpolyQ) domain in WC-1 results in an arrhythmic circadian clock; this data is consistent with this simple sequence repeat (SSR) being essential for clock function. METHODOLOGY/PRINCIPAL FINDINGS: Since SSRs are often polymorphic in length across natural populations, we reasoned that investigating natural variation of the WC-1 NpolyQ may provide insight into its role in the circadian clock. We observed significant phenotypic variation in the period, phase and temperature compensation of circadian regulated asexual conidiation across 143 N. crassa accessions. In addition to the NpolyQ, we identified two other simple sequence repeats in WC-1. The sizes of all three WC-1 SSRs correlated with polymorphisms in other clock genes, latitude and circadian period length. Furthermore, in a cross between two N. crassa accessions, the WC-1 NpolyQ co-segregated with period length. CONCLUSIONS/SIGNIFICANCE: Natural variation of the WC-1 NpolyQ suggests a mechanism by which period length can be varied and selected for by the local environment that does not deleteriously affect WC-1 activity. Understanding natural variation in the N. crassa circadian clock will facilitate an understanding of how fungi exploit their environments.

  3. Simple sequence repeat marker development and genetic mapping in quinoa (Chenopodium quinoa Willd.).

    Science.gov (United States)

    Jarvis, D E; Kopp, O R; Jellen, E N; Mallory, M A; Pattee, J; Bonifacio, A; Coleman, C E; Stevens, M R; Fairbanks, D J; Maughan, P J

    2008-04-01

    Quinoa is a regionally important grain crop in the Andean region of South America. Recently quinoa has gained international attention for its high nutritional value and tolerances of extreme abiotic stresses. DNA markers and linkage maps are important tools for germplasm conservation and crop improvement programmes. Here we report the development of 216 new polymorphic SSR (simple sequence repeats) markers from libraries enriched for GA, CAA and AAT repeats, as well as 6 SSR markers developed from bacterial artificial chromosome-end sequences (BES-SSRs). Heterozygosity (H) values of the SSR markers range from 0.12 to 0.90, with an average value of 0.57. A linkage map was constructed for a newly developed recombinant inbred lines (RIL) population using these SSR markers. Additional markers, including amplified fragment length polymorphisms (AFLPs), two 11S seed storage protein loci, and the nucleolar organizing region (NOR), were also placed on the linkage map. The linkage map presented here is the first SSR-based map in quinoa and contains 275 markers, including 200 SSR. The map consists of 38 linkage groups (LGs) covering 913 cM. Segregation distortion was observed in the mapping population for several marker loci, indicating possible chromosomal regions associated with selection or gametophytic lethality. As this map is based primarily on simple and easily-transferable SSR markers, it will be particularly valuable for research in laboratories in Andean regions of South America.

  4. The nucleotide sequence of Beneckea harveyi 5S rRNA. [bioluminescent marine bacterium]

    Science.gov (United States)

    Luehrsen, K. R.; Fox, G. E.

    1981-01-01

    The primary sequence of the 5S ribosomal RNA isolated from the free-living bioluminescent marine bacterium Beneckea harveyi is reported and discussed in regard to indications of phylogenetic relationships with the bacteria Escherichia coli and Photobacterium phosphoreum. Sequences were determined for oligonucleotide products generated by digestion with ribonuclease T1, pancreatic ribonuclease and ribonuclease T2. The presence of heterogeneity is indicated for two sites. The B. harveyi sequence can be arranged into the same four helix secondary structures as E. coli and other prokaryotic 5S rRNAs. Examination of the 5S RNA sequences of the three bacteria indicates that B. harveyi and P. phosphoreum are specifically related and share a common ancestor which diverged from an ancestor of E. coli at a somewhat earlier time, consistent with previous studies.

  5. Molecular characterization of long terminal repeat sequences from Brazilian human immunodeficiency virus type 1 isolates.

    Science.gov (United States)

    Ferraro, Geraldo A; Monteiro-Cunha, Joana P; Fernandes, Flora M C; Mota-Miranda, Aline C A; Brites, Carlos; Alcantara, Luiz C J; Galvão-Castro, Bernardo; Morgado, Mariza G

    2013-05-01

    HIV-1 provirus activation is under control of the long terminal repeat (LTR)-5' viral promoter region, which presents remarkable genetic variation among HIV-1 subtypes. It is possible that molecular features of the LTR contribute to the unusual profile of the subtype C epidemic in the Brazilian Southern region. To characterize the LTR of Brazilian HIV isolates, we analyzed sequences from 21 infected individuals from Porto Alegre and Salvador cities. Sequences were compared with subtype B and C reference strains from different countries. Phylogenetic analysis showed that 17 (81%) samples were subtype B and four (19%) were subtype C. Common patterns of transcription factor binding sites (TFBS) in subtypes B and C sequences were confirmed and other potential TFBS specific for subtype C were found. Brazilian subtype C sequences contained an additional NF-κB binding site, as previously described for the majority of subtype C isolates. The high level of LTR polymorphisms identified in this study might be important for viral fitness.

  6. Detection of sequence variability of the collagen type IIalpha 1 3' variable number of tandem repeat.

    Science.gov (United States)

    van Meurs, J B; Arp, P P; Fang, Y; Slagboom, P E; Meulenbelt, I; van Leeuwen, J P; Pols, H A; Uitterlinden, A G

    2000-11-01

    The variable number of tandem repeat (VNTR) 3' of the collagen type II (COL2A1) gene has been shown to be highly variable with a complex molecular structure. In a previous pilot experiment we observed discordance between methods to genotype this informative marker. To further investigate the extent and molecular nature of this discordance, we genotyped a random sample of 207 Caucasian individuals with two genotyping methods and sequenced new alleles. We compared single-strand (SS) analysis, which is based on detection of size differences between the different alleles, and heteroduplex analysis (HA), which is sensitive to both size and sequence differences. Overall, 26% of discordance between the two methods was detected. Approximately two thirds of this discordance was caused by subdivision of SS-alleles 13R1 and 14R2 into HA-alleles 4A + 4B and 3B + 3C, respectively. Sequence analysis of the COL2A1 VNTR alleles 4B and 3C showed that these alleles differed in sequence, but not in size, from already described SS-alleles, which explains why they escape detection by SS. The 4B allele is a frequent allele in the population (14%) and is, therefore, important to distinguish in association studies. We conclude that HA is a reliable method when the described optimized electrophoretic conditions are used. HA is a sensitive genotyping method to document allelic diversity at this locus, which can distinguish more alleles compared to the SS method.

  7. Targeted capture enrichment and sequencing identifies extensive nucleotide variation in the turkey MHC-B.

    Science.gov (United States)

    Reed, Kent M; Mendoza, Kristelle M; Settlage, Robert E

    2016-03-01

    Variation in the major histocompatibility complex (MHC) is increasingly associated with disease susceptibility and resistance in avian species of agricultural importance. This variation includes sequence polymorphisms but also structural differences (gene rearrangement) and copy number variation (CNV). The MHC has now been described for multiple galliform species including the best defined assemblies of the chicken (Gallus gallus) and domestic turkey (Meleagris gallopavo). Using this sequence resource, this study applied high-throughput sequencing to investigate MHC variation in turkeys of North America (NA turkeys). An MHC-specific SureSelect (Agilent) capture array was developed, and libraries were created for 14 turkeys representing domestic (commercial bred), heritage breed, and wild turkeys. In addition, a representative of the Ocellated turkey (M. ocellata) and chicken (G. gallus) was included to test cross-species applicability of the capture array allowing for identification of new species-specific polymorphisms. Libraries were hybridized to ∼12 K cRNA baits and the resulting pools were sequenced. On average, 98% of processed reads mapped to the turkey whole genome sequence and 53% to the MHC target. In addition to the MHC, capture hybridization recovered sequences corresponding to other MHC regions. Sequence alignment and de novo assembly indicated the presence of several additional BG genes in the turkey with evidence for CNV. Variant detection identified an average of 2245 polymorphisms per individual for the NA turkeys, 3012 for the Ocellated turkey, and 462 variants in the chicken (RJF-256). This study provides an extensive sequence resource for examining MHC variation and its relation to health of this agriculturally important group of birds.

  8. Selection, Recombination and History in a Parasitic Flatworm (Echinococcus) Inferred from Nucleotide Sequences

    OpenAIRE

    KL Haag; Araújo AM; Gottstein B; Zaha A

    1998-01-01

    Three species of flatworms from the genus Echinococcus (E. granulosus, E. multilocularis and E. vogeli) and four strains of E. granulosus (cattle, horse, pig and sheep strains) were analysed by the PCR-SSCP method followed by sequencing, using as targets two non-coding and two coding (one nuclear and one mitochondrial) genomic regions. The sequencing data was used to evaluate hypotheses about the parasite breeding system and the causes of genetic diversification. The calculated recombination ...

  9. Large-scale analysis of structural, sequence and thermodynamic characteristics of A-to-I RNA editing sites in human Alu repeats

    Directory of Open Access Journals (Sweden)

    Eisenberg Eli

    2010-07-01

    Full Text Available Background: Alu repeats in the human transcriptome undergo massive adenosine to inosine RNA editing. This process is selective, as editing efficiency varies greatly among different adenosines. Several studies have identified weak sequence motifs characterizing the editing sites, but these alone do not account for the large diversity observed. Results: Here we build a dataset of 29,971 editing sites and use it to characterize editing preferences. We focus on structural aspects, studying the double-stranded RNA structure of the Alu repeats, and show the editing frequency of a given site to depend strongly on the micro-structure it resides in. Surprisingly, we find that interior loops, and especially the nucleotides at their edges, are more likely to be edited than helices. In addition, the sequence motifs characterizing editing sites vary with the micro-structure. Finally, we show that thermodynamic stability of the site is important for its editing. Conclusions: Analysis of a large dataset of editing events reveals more information on sequence and structural motifs characterizing the A-to-I editing process.

  10. Nucleotide sequence of cloned cDNA for human pancreatic kallikrein.

    Science.gov (United States)

    Fukushima, D; Kitamura, N; Nakanishi, S

    1985-12-31

    Cloned cDNA sequences for human pancreatic kallikrein have been isolated and determined by molecular cloning and sequence analysis. The identity between human pancreatic and urinary kallikreins is indicated by the complete coincidence between the amino acid sequence deduced from the cloned cDNA sequence and that reported partially for urinary kallikrein. The active enzyme form of the human pancreatic kallikrein consists of 238 amino acids and is preceded by a signal peptide and a profragment of 24 amino acids. A sequence comparison of this with other mammalian kallikreins indicates that key amino acid residues required for both serine protease activity and kallikrein-like cleavage specificity are retained in the human sequence, and residues corresponding to some external loops of the kallikrein diverge from other kallikreins. Analyses by RNA blot hybridization, primer extension, and S1 nuclease mapping indicate that the pancreatic kallikrein mRNA is also expressed in the kidney and sublingual gland, suggesting the active synthesis of urinary kallikrein in these tissues. Furthermore, the tissue-specific regulation of the expression of the members of the human kallikrein gene family has been discussed.

  11. Comparative sequence analysis of leucine-rich repeats (LRRs) within vertebrate toll-like receptors

    Directory of Open Access Journals (Sweden)

    Taga Masae

    2007-05-01

    Full Text Available Background: Toll-like receptors (TLRs) play a central role in innate immunity. TLRs are membrane glycoproteins and contain a leucine-rich repeat (LRR) motif in the ectodomain. TLRs recognize and respond to molecules such as lipopolysaccharide, peptidoglycan, flagellin, and RNA from bacteria or viruses. The LRR domains in TLRs have been inferred to be responsible for molecular recognition. All LRRs include the highly conserved segment, LxxLxLxxNxL, in which "L" is Leu, Ile, Val, or Phe and "N" is Asn, Thr, Ser, or Cys and "x" is any amino acid. There are seven classes of LRRs including "typical" ("T") and "bacterial" ("S"). All known domain structures adopt an arc or horseshoe shape. Vertebrate TLRs form six major families. The repeat numbers of LRRs and their "phasing" in TLRs differ with isoforms and species; they are aligned differently in various databases. We identified and aligned LRRs in TLRs by a new method described here. Results: The new method utilizes known LRR structures to recognize and align new LRR motifs in TLRs and incorporates multiple sequence alignments and secondary structure predictions. TLRs from thirty-four vertebrates were analyzed. The repeat numbers of the LRRs range from 16 to 28. The LRRs found in TLRs frequently consist of LxxLxLxxNxLxxLxxxxF/LxxLxx ("T") and sometimes short motifs including LxxLxLxxNxLxxLPx(x)LPxx ("S"). The TLR7 family (TLR7, TLR8, and TLR9) contain 27 LRRs. The LRRs at the N-terminal part have a super-motif of STT with about 80 residues. The super-repeat is represented by STTSTTSTT or _TTSTTSTT. The LRRs in TLRs form one or two horseshoe domains and are mostly flanked by two cysteine clusters including two or four cysteine residues. Conclusion: Each of the six major TLR families is characterized by their constituent LRR motifs, their repeat numbers, and their patterns of cysteine clusters. The central parts of the TLR1 and TLR7 families and of TLR4 have more irregular or longer LRR motifs. These
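
The conserved segment LxxLxLxxNxL quoted in this record translates directly into a pattern match. The sketch below is only that, a literal regular-expression reading of the consensus; it is not the structure-based alignment method the authors describe, and the function name and test peptides are made up.

```python
import re

# Literal reading of the consensus from the abstract:
#   "L" = Leu, Ile, Val or Phe; "N" = Asn, Thr, Ser or Cys; "x" = any residue.
# The lookahead lets overlapping candidate segments be reported.
LRR_CORE = re.compile(r"(?=([LIVF]..[LIVF].[LIVF]..[NTSC].[LIVF]))")


def find_lrr_cores(protein):
    """Return (start, segment) for every 11-residue match of LxxLxLxxNxL."""
    return [(m.start(), m.group(1)) for m in LRR_CORE.finditer(protein.upper())]


# Made-up peptides: the first contains one consensus segment, the second
# has Ala at the "N" position and so should not match.
hits = find_lrr_cores("MA" + "LKSLDLSNNKL" + "GG")
misses = find_lrr_cores("MA" + "LKSLDLSNAKL" + "GG")
```

A bare consensus scan like this will both over- and under-call real repeats, which is presumably why the record's method also draws on known structures and secondary structure predictions.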

  12. Integrating multiple genomic data to predict disease-causing nonsynonymous single nucleotide variants in exome sequencing studies.

    Directory of Open Access Journals (Sweden)

    Jiaxin Wu

    2014-03-01

    Full Text Available Exome sequencing has been widely used in detecting pathogenic nonsynonymous single nucleotide variants (SNVs) for human inherited diseases. However, traditional statistical genetics methods are ineffective in analyzing exome sequencing data, due to such facts as the large number of sequenced variants, the presence of non-negligible fraction of pathogenic rare variants or de novo mutations, and the limited size of affected and normal populations. Indeed, prevalent applications of exome sequencing have been appealing for an effective computational method for identifying causative nonsynonymous SNVs from a large number of sequenced variants. Here, we propose a bioinformatics approach called SPRING (Snv PRioritization via the INtegration of Genomic data) for identifying pathogenic nonsynonymous SNVs for a given query disease. Based on six functional effect scores calculated by existing methods (SIFT, PolyPhen2, LRT, MutationTaster, GERP and PhyloP) and five association scores derived from a variety of genomic data sources (gene ontology, protein-protein interactions, protein sequences, protein domain annotations and gene pathway annotations), SPRING calculates the statistical significance that an SNV is causative for a query disease and hence provides a means of prioritizing candidate SNVs. With a series of comprehensive validation experiments, we demonstrate that SPRING is valid for diseases whose genetic bases are either partly known or completely unknown and effective for diseases with a variety of inheritance styles. In applications of our method to real exome sequencing data sets, we show the capability of SPRING in detecting causative de novo mutations for autism, epileptic encephalopathies and intellectual disability. We further provide an online service, the standalone software and genome-wide predictions of causative SNVs for 5,080 diseases at http://bioinfo.au.tsinghua.edu.cn/spring.

  13. Novel technologies applied to the nucleotide sequencing and comparative sequence analysis of the genomes of infectious agents in veterinary medicine.

    Science.gov (United States)

    Granberg, F; Bálint, Á; Belák, S

    2016-04-01

    Next-generation sequencing (NGS), also referred to as deep, high-throughput or massively parallel sequencing, is a powerful new tool that can be used for the complex diagnosis and intensive monitoring of infectious disease in veterinary medicine. NGS technologies are also being increasingly used to study the aetiology, genomics, evolution and epidemiology of infectious disease, as well as host-pathogen interactions and other aspects of infection biology. This review briefly summarises recent progress and achievements in this field by first introducing a range of novel techniques and then presenting examples of NGS applications in veterinary infection biology. Various work steps and processes for sampling and sample preparation, sequence analysis and comparative genomics, and improving the accuracy of genomic prediction are discussed, as are bioinformatics requirements. Examples of sequencing-based applications and comparative genomics in veterinary medicine are then provided. This review is based on novel references selected from the literature and on experiences of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine, Uppsala, Sweden.

  14. Sequence-structure-function relations of the mosquito leucine-rich repeat immune proteins

    Directory of Open Access Journals (Sweden)

    Povelones Michael

    2010-09-01

    Full Text Available Background: The discovery and characterisation of factors governing innate immune responses in insects has driven the elucidation of many immune system components in mammals and other organisms. Focusing on the immune system responses of the malaria mosquito, Anopheles gambiae, has uncovered an array of components and mechanisms involved in defence against pathogen infections. Two of these immune factors are LRIM1 and APL1C, which are leucine-rich repeat (LRR)-containing proteins that activate complement-like defence responses against malaria parasites. In addition to their LRR domains, these leucine-rich repeat immune (LRIM) proteins share several structural features including signal peptides, patterns of cysteine residues, and coiled-coil domains. Results: The identification and characterisation of genes related to LRIM1 and APL1C revealed putatively novel innate immune factors and furthered the understanding of their likely molecular functions. Genomic scans using the shared features of LRIM1 and APL1C identified more than 20 LRIM-like genes exhibiting all or most of their sequence features in each of three disease-vector mosquitoes with sequenced genomes: An. gambiae, Aedes aegypti, and Culex quinquefasciatus. Comparative sequence analyses revealed that this family of mosquito LRIM-like genes is characterised by a variable number of 6 to 14 LRRs of different lengths. The "Long" LRIM subfamily, with 10 or more LRRs, and the "Short" LRIMs, with 6 or 7 LRRs, also share the signal peptide, cysteine residue patterning, and coiled-coil sequence features of LRIM1 and APL1C. The "TM" LRIMs have a predicted C-terminal transmembrane region, and the "Coil-less" LRIMs exhibit the characteristic LRIM sequence signatures but lack the C-terminal coiled-coil domains. Conclusions: The evolutionary plasticity of the LRIM LRR domains may provide templates for diverse recognition properties, while their coiled-coil domains could be involved in the formation

  15. Identification of mitochondrial DNA sequence variation and development of single nucleotide polymorphic markers for CMS-D8 in cotton.

    Science.gov (United States)

    Suzuki, Hideaki; Yu, Jiwen; Wang, Fei; Zhang, Jinfa

    2013-06-01

    Cytoplasmic male sterility (CMS), which is a maternally inherited trait and controlled by novel chimeric genes in the mitochondrial genome, plays a pivotal role in the production of hybrid seed. In cotton, no PCR-based marker has been developed to discriminate CMS-D8 (from Gossypium trilobum) from its normal Upland cotton (AD1, Gossypium hirsutum) cytoplasm. The objective of the current study was to develop PCR-based single nucleotide polymorphic (SNP) markers from mitochondrial genes for the CMS-D8 cytoplasm. DNA sequence variation in mitochondrial genes involved in the oxidative phosphorylation chain including ATP synthase subunit 1, 4, 6, 8 and 9, and cytochrome c oxidase 1, 2 and 3 subunits were identified by comparing CMS-D8, its isogenic maintainer and restorer lines on the same nuclear genetic background. An allelic specific PCR (AS-PCR) was utilized for SNP typing by incorporating artificial mismatched nucleotides into the third or fourth base from the 3' terminus in both the specific and nonspecific primers. The result indicated that the method modifying allele-specific primers was successful in obtaining eight SNP markers out of eight SNPs using eight primer pairs to discriminate two alleles between AD1 and CMS-D8 cytoplasms. Two of the SNPs for atp1 and cox1 could also be used in combination to discriminate between CMS-D8 and CMS-D2 cytoplasms. Additionally, a PCR-based marker from a nine nucleotide insertion-deletion (InDel) sequence (AATTGTTTT) at the 59-67 bp positions from the start codon of atp6, which is present in the CMS and restorer lines with the D8 cytoplasm but absent in the maintainer line with the AD1 cytoplasm, was also developed. A SNP marker for two nucleotide substitutions (AA in AD1 cytoplasm to CT in CMS-D8 cytoplasm) in the intron (1,506 bp) of cox2 gene was also developed. These PCR-based SNP markers should be useful in discriminating CMS-D8 and AD1 cytoplasms, or those with CMS-D2 cytoplasm as a rapid, simple, inexpensive, and

  16. Biological characterization and complete nucleotide sequence of a Tunisian isolate of Moroccan watermelon mosaic virus.

    Science.gov (United States)

    Yakoubi, S; Desbiez, C; Fakhfakh, H; Wipf-Scheibel, C; Marrakchi, M; Lecoq, H

    2008-01-01

    During a survey conducted in October 2005, cucurbit leaf samples showing virus-like symptoms were collected from the major cucurbit-growing areas in Tunisia. DAS-ELISA showed the presence of Moroccan watermelon mosaic virus (MWMV, Potyvirus), detected for the first time in Tunisia, in samples from the region of Cap Bon (Northern Tunisia). MWMV isolate TN05-76 (MWMV-Tn) was characterized biologically and its full-length genome sequence was established. MWMV-Tn was found to have biological properties similar to those reported for the MWMV type strain from Morocco. Phylogenetic analysis including the comparison of complete amino-acid sequences of 42 potyviruses confirmed that MWMV-Tn is related (65% amino-acid sequence identity) to Papaya ringspot virus (PRSV) isolates but is a member of a distinct virus species. Sequence analysis on parts of the CP gene of MWMV isolates from different geographical origins revealed some geographic structure of MWMV variability, with three different clusters: one cluster including isolates from the Mediterranean region, a second including isolates from western and central Africa, and a third one including isolates from the southern part of Africa. A significant correlation was observed between geographic and genetic distances between isolates. Isolates from countries in the Mediterranean region where MWMV has recently emerged (France, Spain, Portugal) have highly conserved sequences, suggesting that they may have a common and recent origin. MWMV from Sudan, a highly divergent variant, may be considered an evolutionary intermediate between MWMV and PRSV.

  17. Characterisation of single nucleotide polymorphisms identified in the bovine lactoferrin gene sequences across a range of dairy cow breeds.

    Science.gov (United States)

    O'Halloran, F; Bahar, B; Buckley, F; O'Sullivan, O; Sweeney, T; Giblin, L

    2009-01-01

    The lactoferrin gene sequences of 70 unrelated dairy cows representing six different dairy breeds were investigated for single nucleotide polymorphisms to establish a baseline of polymorphisms that exist within the Irish bovine population. Twenty-nine polymorphisms were identified within a 2.2kb regulatory region. Nineteen novel polymorphisms were identified and some of these were found within transcription factor binding sites, including GATA-1 and SPI transcription factor sites. Forty-seven polymorphisms were identified within exon sequences with unique polymorphisms that were associated with amino acid substitutions. These included a T/A SNP, identified in a Holstein Friesian animal, which resulted in a valine to aspartic acid substitution (Val89Asp) in the mature lactoferrin protein. Other SNPs of interest were associated with amino acid substitutions in the lactoferricin B peptide sequence and an A/G SNP, identified in a Jersey animal, was associated with a tyrosine to cysteine change (Tyr181Cys). The polymorphisms identified in the promoter region may have implications relating to lactoferrin expression levels in cows and those identified in the coding sequence indicate the existence of protein variants in the Irish bovine population. The data presented in this study emphasises the potential for lactoferrin to serve as a candidate gene to select for mastitis resistance with the aim of improving animal health.

  18. Citrus psorosis virus: nucleotide sequencing of the coat protein gene and detection by hybridization and RT-PCR.

    Science.gov (United States)

    Barthe, G A; Ceccardi, T L; Manjunath, K L; Derrick, K S

    1998-06-01

    Citrus psorosis virus (CPV) is a multicomponent ssRNA virus with a coat protein of approximately 48 kDa. The viral genome is encapsidated in short and long particles that are readily separated by sucrose density-gradient centrifugation. CPV particles are spiral filaments that are referred to as spiroviruses (SV). A cDNA library of purified short particles from isolate CPV-4 was prepared in a Lambda vector and screened for expression of the coat protein gene (CPG) with a monoclonal antibody to the coat protein. Sequencing of immunopositive clones indicated a single ORF encoding a 49 kDa protein. This ORF, when expressed in E. coli, gave a protein identical in size and immunoreactivity to the CPV coat protein. A full-length clone of the CPG was transcribed and used in Northern hybridization assays to establish that short particle RNA of CPV is negative sense and contains the CPG. Moreover, the CPG was not found on RNA extracted from long particles or on the sedimentable dsRNA from CPV infected tissue. RT-PCR assays were developed for the amplification of a 600 bp fragment of CPG and for the complete CPG (1317 bp). The 600 bp fragment from a biologically and serologically different isolate, CPV-6, was cloned, sequenced and found to share 86% (nucleotide) and 96% (amino acid) identity with CPV-4. BLAST analysis of sequences from CPV-4 and CPV-6 detected no significant nucleic acid or protein similarity with any known viral sequences.

  19. Survey and analysis of simple sequence repeats (SSRs) in three genomes of Candida species.

    Science.gov (United States)

    Jia, Dongmei

    2016-06-15

    Simple sequence repeats (SSRs), or microsatellites, which are composed of tandemly repeated short units of 1-6 bp, have attracted continuous attention. Here, the distribution, composition and polymorphism of microsatellites and compound microsatellites were analyzed in three available genomes of Candida species (Candida dubliniensis, Candida glabrata and Candida orthopsilosis). The results show that there were 118,047, 66,259 and 61,119 microsatellites in the genomes of C. dubliniensis, C. glabrata and C. orthopsilosis, respectively. The SSRs covered more than 1/3 of the genome length in the three species. Microsatellites consisting only of bases A and/or T, such as (A)n, (T)n, (AT)n, (TA)n, (AAT)n, (TAA)n, (TTA)n, (ATA)n, (ATT)n and (TAT)n, were predominant in the three genomes. Microsatellite lengths were concentrated at 6 bp and 9 bp, both in the three genomes and in their coding sequences. Moreover, the relative abundance (19.89/kbp) and relative density (167.87 bp/kbp) of SSRs in the mitochondrial sequence of C. glabrata were significantly greater than those in any of the genomes or chromosomes of the three species. In addition, the distance between any two adjacent microsatellites was an important factor influencing the formation of compound microsatellites. The analysis may be helpful for further studying the roles of microsatellites in the origination, organization and evolution of the genomes of Candida species.
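
The two summary statistics quoted in this record carry their units in the abstract: relative abundance is given per kbp and relative density in bp per kbp. Assuming the usual reading (number of SSRs per kbp of sequence, and total SSR length in bp per kbp of sequence), they can be sketched as below. The function names and the input numbers are made up and are not taken from the study.

```python
def relative_abundance(n_ssrs, seq_len_bp):
    """Number of SSRs per kbp of analysed sequence (assumed definition)."""
    return n_ssrs / (seq_len_bp / 1000.0)


def relative_density(total_ssr_bp, seq_len_bp):
    """Total SSR length in bp per kbp of analysed sequence (assumed definition)."""
    return total_ssr_bp / (seq_len_bp / 1000.0)


# Made-up example: a 20,000 bp sequence with 100 SSRs spanning 800 bp in total.
abundance = relative_abundance(100, 20_000)   # SSRs per kbp
density = relative_density(800, 20_000)       # bp of SSR per kbp
```

Dividing density by abundance gives the mean SSR length, which is a quick consistency check on reported values.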

  20. Selection, Recombination and History in a Parasitic Flatworm (Echinococcus) Inferred from Nucleotide Sequences

    Directory of Open Access Journals (Sweden)

    Haag KL

    1998-01-01

    Full Text Available Three species of flatworms from the genus Echinococcus (E. granulosus, E. multilocularis and E. vogeli) and four strains of E. granulosus (cattle, horse, pig and sheep strains) were analysed by the PCR-SSCP method followed by sequencing, using as targets two non-coding and two coding (one nuclear and one mitochondrial) genomic regions. The sequencing data was used to evaluate hypotheses about the parasite breeding system and the causes of genetic diversification. The calculated recombination parameters suggested that cross-fertilisation was rare in the history of the group. However, the relative rates of substitution in the coding sequences showed that positive selection (instead of purifying selection) drove the evolution of an elastase and neutrophil chemotaxis inhibitor gene (AgB/1). The phylogenetic analyses revealed several ambiguities, indicating that the taxonomic status of the E. granulosus horse strain should be revised.

  1. Characterization of Porcine Endogenous Retrovirus γ pro-pol Nucleotide Sequences

    OpenAIRE

    Klymiuk, Nikolai; Müller, Mathias; Brem, Gottfried; Aigner, Bernhard

    2002-01-01

    Endogenous retroviral sequences in the pig genome (PERV) represent a potential infectious risk in xenotransplantation. All known infectious PERV have been assigned to the PERV γ1 family, consisting of the subfamilies A, B, and C. The aim of the study was the concise examination of PERV γ by the analysis of the retroviral pro-pol sequences. The analysis of 52 pro-pol clones amplified in this study revealed eight PERV γ families. In addition to four already-described families (γ1, γ4, γ5, γ6),...

  2. Isolation, expression, and nucleotide sequencing of the pilin structural gene of the Brazilian purpuric fever clone of Haemophilus influenzae biogroup aegyptius.

    Science.gov (United States)

    St Geme, J W; Falkow, S

    1993-05-01

    In this study we isolated the pilin gene from the Brazilian purpuric fever (BPF) clone of Haemophilus influenzae biogroup aegyptius, expressed the gene in Escherichia coli, and determined its nucleotide sequence. Comparison of the nucleotide sequence of the BPF pilin gene with the sequences of pilin genes from strains of H. influenzae sensu stricto demonstrated a high degree of identity. Consistent with this observation, hemagglutination inhibition studies performed with a series of glycoconjugates indicated that BPF pili and H. influenzae type b pili possess the same erythrocyte receptor specificity.

  3. The nucleotide sequence of the dnaA gene and the first part of the dnaN gene of Escherichia coli K-12.

    Science.gov (United States)

    Hansen, E B; Hansen, F G; von Meyenburg, K

    1982-11-25

    The nucleotide sequence of the dnaA gene and the first 10% of the dnaN gene was determined. From the nucleotide sequence the amino acid sequence of the dnaA gene product was derived. It is a basic protein of 467 amino acid residues with a molecular weight of 52.5 kD. The dnaA gene is expressed in the counterclockwise direction, like the dnaN gene, for which potential start sites were found.

  4. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Science.gov (United States)

    2010-07-01

    ... “Sequence Listing” file. (6) All computer readable forms must have a label permanently affixed thereto on...) Computer readable form files submitted may be in any of the following media: (1) Diskette: 3.50 inch, 1.44... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824...

  5. Performance and physiological responses to repeated-sprint and jump sequences.

    Science.gov (United States)

    Buchheit, Martin

    2010-11-01

    In this study, the performance and selected physiological responses to team-sport specific repeated-sprint and jump sequences were investigated. On four occasions, 13 team-sport players (22 ± 3 years) performed alternatively six repeated maximal straight-line or shuttle-sprints interspersed with a jump ([RS(+j), 6 × 25 m] or [RSS(+j), 6 × (2 × 12.5 m)]) or not ([RS, 6 × 25 m] or [RSS, 6 × (2 × 12.5 m)]) within each recovery period. Mean running time, rate of perceived exertion (RPE), pulmonary oxygen uptake (V(O)₂), blood lactate ([La](b)), and vastus lateralis deoxygenation ([HHb]) were obtained for each condition. Mean sprint times were greater for RS(+j) versus RS (4.14 ± 0.17 vs. 4.09 ± 0.16 s, with the qualitative analysis revealing an 82% chance of RS(+j) times to be greater than RS) and for RSS(+j) versus RSS (5.43 ± 0.18 vs. 5.29 ± 0.17 s; 99% chance of RSS(+j) to be >RSS). The correlations between sprint and jump abilities were large-to-very-large, but below 0.71 for RSSs. Jumps increased RPE (Cohen's d ± 90% CL: +0.7 ± 0.5; 95% chance for RS(+j) > RS and +0.7 ± 0.5; 96% for RSS(+j) > RSS), V(O)₂ (+0.4 ± 0.5; 80% for RS(+j) > RS and +0.5 ± 0.5; 86% for RSS(+j) > RSS), [La](b) (+0.5 ± 0.5; 59% for RS(+j) > RS and +0.2 ± 0.5; unclear for RSS(+j) > RSS), and [HHb] (+0.5 ± 0.5; 86% for RS(+j) > RS and +0.5 ± 0.5; 85% for RSS(+j) > RSS). To conclude, repeated-sprint and jump abilities could be considered as specific qualities. The addition of a jump within the recovery periods during repeated-sprint running sequences impairs sprinting performance and might be an effective training practice for eliciting both greater systemic and vastus lateralis physiological loads.

  6. Molecular cloning and nucleotide sequence of cDNA for human liver arginase

    Energy Technology Data Exchange (ETDEWEB)

    Haraguchi, Y.; Takiguchi, M.; Amaya, Y.; Kawamoto, S.; Matsuda, I.; Mori, M.

    1987-01-01

    Arginase (EC 3.5.3.1) catalyzes the last step of the urea cycle in the liver of ureotelic animals. Inherited deficiency of the enzyme results in argininemia, an autosomal recessive disorder characterized by hyperammonemia. To facilitate investigation of the enzyme and gene structures and to elucidate the nature of the mutation in argininemia, the authors isolated cDNA clones for human liver arginase. Oligo(dT)-primed and random primer human liver cDNA libraries in lambda gt11 were screened using isolated rat arginase cDNA as a probe. Two of the positive clones, designated lambda hARG6 and lambda hARG109, contained an overlapping cDNA sequence with an open reading frame encoding a polypeptide of 322 amino acid residues (predicted M_r, 34,732), a 5'-untranslated sequence of 56 base pairs, a 3'-untranslated sequence of 423 base pairs, and a poly(A) segment. Arginase activity was detected in Escherichia coli cells transformed with the plasmid carrying lambda hARG6 cDNA insert. RNA gel blot analysis of human liver RNA showed a single mRNA of 1.6 kilobases. The predicted amino acid sequence of human liver arginase is 87% and 41% identical with those of the rat liver and yeast enzymes, respectively. There are several highly conserved segments among the human, rat, and yeast enzymes.

  7. Nucleotide sequence analysis of hypervariable junctions of Haemophilus influenzae pilus gene clusters.

    Science.gov (United States)

    Read, T D; Satola, S W; Farley, M M

    2000-12-01

    Haemophilus influenzae pili are surface structures that promote attachment to human epithelial cells. The five genes that encode pili, hifABCDE, are found inserted in genomes either between pmbA and hpt (hif-1) or between purE and pepN (hif-2). We determined the sequence between the ends of the pilus clusters and bordering genes in a number of H. influenzae strains. The junctions of the hif-1 cluster (limited to biogroup aegyptius isolates) are structurally simple. In contrast, hif-2 junctions are highly diverse, complex assemblies of conserved intergenic sequences (including genes hicA and hicB) with evidence of frequent recombination. Variation at hif-2 junctions seems to be tied to multiple copies of a 23-bp Haemophilus intergenic dyad sequence. The hif-1 cluster appears to have originated in biogroup aegyptius strains from invasion of the hpt-pmbA region by a DNA template containing the hif-2 genes with termini in the hairpin loop of flanking intergenic dyad sequences. The pilus gene clusters are an interesting model of a mobile "pathogenicity island" not associated with a phage, transposon, or insertion element.

  8. Symbolic complexity for nucleotide sequences: a sign of the genome structure

    Science.gov (United States)

    Salgado-García, R.; Ugalde, E.

    2016-11-01

    We introduce a method for estimating the complexity function (which counts the number of observable words of a given length) of a finite symbolic sequence, which we use to estimate the complexity function of coding DNA sequences for several species of the Hominidae family. In all cases, the obtained symbolic complexities show the same characteristic behavior: exponential growth for small word lengths, followed by linear growth for larger word lengths. The symbolic complexities of the species we consider exhibit a systematic trend in correspondence with the phylogenetic tree. Using our method, we estimate the complexity function of sequences obtained by some known evolution models, and in some cases we observe the characteristic exponential-linear growth of the Hominidae coding DNA complexity. Analysis of the symbolic complexity of sequences obtained from a specific evolution model points to the following conclusion: linear growth arises from the random duplication of large segments during the evolution of the genome, while the decrease in the overall complexity from one species to another is due to a difference in the speed of accumulation of point mutations.
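
    As a point of reference for the abstract above, the complexity function it describes (the number of distinct words of each length observed in a sequence) can be counted directly. The sketch below is the naive count only; it is not the authors' estimator for finite sequences, and the function name is illustrative.

```python
def complexity_function(seq, max_len):
    """Map each word length n (1..max_len) to the number of distinct
    length-n words (substrings) observed in seq."""
    counts = {}
    for n in range(1, max_len + 1):
        words = {seq[i:i + n] for i in range(len(seq) - n + 1)}
        counts[n] = len(words)
    return counts


periodic = "ACGTACGTAC"                 # toy sequence with period 4
profile = complexity_function(periodic, 3)
```

    For a strictly periodic toy sequence like this one the count stops growing once the word length reaches the period, whereas a sequence without repeats keeps gaining new words; the exponential-then-linear shape reported in the abstract lies between those two extremes.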

  9. Identification of Nucleotide Variation in Genomes Using Next-Generation Sequencing

    NARCIS (Netherlands)

    Megens, H.J.W.C.; Groenen, M.A.M.

    2012-01-01

    Discovery of genome-wide variation has taken a huge leap forward with the introduction of next-generation sequencing (NGS) technology. Variant discovery requires sampling of a number of haplotypes. This can be either the two haplotypes of a diploid organism or multiple haplotypes in a population. Va

  10. Phylogenetic analysis of Rutaceous plants based on single nucleotide polymorphism in chloroplast and nuclear gene sequences

    Science.gov (United States)

    The family Rutaceae encompasses several genera including the economically important genus Citrus. In this study, we selected 22 citrus relatives belonging to the various sub groups of Rutaceae and compared the sequences of three gene fragments. The accessions selected belong to the subfamily Rutoide...

  11. Nucleotide sequence evidence for the occurrence of three distinct whitefly-transmitted, Sida-infecting bipartite geminiviruses in Central America.

    Science.gov (United States)

    Frischmuth, T; Engel, M; Lauster, S; Jeske, H

    1997-10-01

    The nucleotide sequences of two Sida-infecting geminiviruses from Honduras were determined. The symptoms of both viruses are identical in Sida rhombifolia but different in Nicotiana benthamiana. An additional symptom of one virus was yellow vein clearing on infected N. benthamiana leaves. Both Sida golden mosaic viruses (SiGMV-Ho and SiGMV-Ho(yv)) have bipartite genomes (DNAs A and B). From the SiGMV-Ho(yv)-infected S. rhombifolia plant two different DNA B molecules were isolated and cloned. They differ in length by 24 nucleotides [SiGMV-Ho(yv) B1 (2593 nt) and B2 (2569 nt)] and at eight nucleotide positions. Both proteins encoded by DNA B (BV1 and BC1) are affected by these substitutions. Computer analysis shows that the bipartite genomes resemble those of other whitefly-transmitted geminiviruses. From homology analyses we conclude that both viruses are closely related but distinct. Comparison with a Sida-infecting virus from Costa Rica (SiGMV-Co) showed that the two viruses from Honduras are more similar to each other than either of them is to SiGMV-Co. Exchange of SiGMV-Ho and SiGMV-Ho(yv) genomic components resulted in viable pseudorecombinant viruses. SiGMV-Ho DNA A was able to produce a viable pseudorecombinant with SiGMV-Co DNA B while the reciprocal exchange was not infectious in N. benthamiana. SiGMV-Ho(yv) DNA A and SiGMV-Co DNA B produced a viable pseudorecombinant virus whereas only pseudorecombination of SiGMV-Co DNA A with SiGMV-Ho(yv) DNA B2, and not with DNA B1, was infectious in N. benthamiana.

  12. Insertion sequence inversions mediated by ectopic recombination between terminal inverted repeats.

    Science.gov (United States)

    Ling, Alison; Cordaux, Richard

    2010-12-20

    Transposable elements are widely distributed and diverse in both eukaryotes and prokaryotes, as exemplified by DNA transposons. As a result, they represent a considerable source of genomic variation, for example through ectopic (i.e. non-allelic homologous) recombination events between transposable element copies, resulting in genomic rearrangements. Ectopic recombination may also take place between homologous sequences located within transposable element sequences. DNA transposons are typically bounded by terminal inverted repeats (TIRs). Ectopic recombination between TIRs is expected to result in DNA transposon inversions. However, such inversions have barely been documented. In this study, we report natural inversions of the most common prokaryotic DNA transposons: insertion sequences (IS). We identified natural TIR-TIR recombination-mediated inversions in 9% of IS insertion loci investigated in Wolbachia bacteria, which suggests that recombination between IS TIRs may be a quite common, albeit largely overlooked, source of genomic diversity in bacteria. We suggest that inversions may impede IS survival and proliferation in the host genome by altering transpositional activity. They may also alter genomic instability by modulating the outcome of ectopic recombination events between IS copies in various orientations. This study represents the first report of TIR-TIR recombination within bacterial IS elements and it thereby uncovers a novel mechanism of structural variation for this class of prokaryotic transposable elements.
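
    The geometry behind TIR-TIR inversions can be sketched on a toy element. The example assumes perfect terminal inverted repeats and models the inversion as the reverse complement of the segment between them; real IS elements often carry imperfect TIRs, and the sequence and function names here are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def tir_length(element):
    """Length of the perfect terminal inverted repeat: the longest prefix
    whose reverse complement equals the suffix of the same length."""
    n = 0
    half = len(element) // 2
    while n < half and element[n] == element[-1 - n].translate(_COMPLEMENT):
        n += 1
    return n


def invert_between_tirs(element):
    """Model recombination between the two TIRs: the internal segment is
    replaced by its reverse complement while the TIRs stay in place."""
    n = tir_length(element)
    if n == 0:
        return element
    end = len(element) - n
    return element[:n] + reverse_complement(element[n:end]) + element[end:]


toy_is = "GGCTTT" + "ACGATCG" + "AAAGCC"   # hypothetical 19 bp element
inverted = invert_between_tirs(toy_is)
```

    Because the repeats themselves are unchanged, the inverted copy keeps the same boundaries and length, which may be one reason such events are easy to overlook unless the internal orientation is checked.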

  13. Genome-Wide Analysis of Simple Sequence Repeats in Bitter Gourd (Momordica charantia)

    Directory of Open Access Journals (Sweden)

    Junjie Cui

    2017-06-01

    Full Text Available Bitter gourd (Momordica charantia) is widely cultivated as a vegetable and medicinal herb in many Asian and African countries. After the sequencing of the cucumber (Cucumis sativus), watermelon (Citrullus lanatus), and melon (Cucumis melo) genomes, bitter gourd became the fourth cucurbit species whose whole genome was sequenced. However, a comprehensive analysis of simple sequence repeats (SSRs) in bitter gourd, including a comparison with the three aforementioned cucurbit species, has not yet been published. Here, we identified a total of 188,091 and 167,160 SSR motifs in the genomes of the bitter gourd lines ‘Dali-11’ and ‘OHB3-1,’ respectively. Subsequently, the SSR content, motif lengths, and classified motif types were characterized for the bitter gourd genomes and compared among all the cucurbit genomes. Lastly, a large set of 138,727 unique in silico SSR primer pairs were designed for bitter gourd. Among these, 71 primers were selected, all of which successfully amplified SSRs from the two bitter gourd lines ‘Dali-11’ and ‘K44’. To further examine the utilization of unique SSR primers, 21 SSR markers were used to genotype a collection of 211 bitter gourd lines from all over the world. A model-based clustering method and phylogenetic analysis indicated a clear separation among the geographic groups. The genomic SSR markers developed in this study have considerable potential value in advancing bitter gourd research.

  14. Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana)

    Directory of Open Access Journals (Sweden)

    Miller Robert NG

    2010-07-01

    Full Text Available Abstract Background Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species, Musa acuminata (A genome) and Musa balbisiana (B genome), contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease, and little is known about the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Results Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions and in duplicated non-coding intergenic sequences, including low complexity regions, and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes, pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year

  15. Phylogenetic analyses of nucleotide sequences confirm a unique plant intercontinental disjunction between tropical Africa, the Caribbean, and the Hawaiian Islands.

    Science.gov (United States)

    Namoff, Sandra; Luke, Quentin; Jiménez, Francisco; Veloz, Alberto; Lewis, Carl E; Sosa, Victoria; Maunder, Mike; Francisco-Ortega, Javier

    2010-01-01

    Phylogenetic analyses of nucleotide sequences of the internal transcribed spacers and 5.8S regions of the nuclear ribosomal DNA and of the trnH-psbA spacer of the chloroplast genome confirm that the three taxa of the Jacquemontia ovalifolia (Choisy) Hallier f. complex (Convolvulaceae) form a monophyletic group. Levels of nucleotide divergence and morphological differentiation among these taxa support the view that each should be recognized as a distinct species. These three species display a unique intercontinental disjunction, with one species endemic to Hawaii (Jacquemontia sandwicensis A. Gray.), another restricted to eastern Mexico and the Antilles [Jacquemontia obcordata (Millspaugh) House], and the third confined to East and West Africa (J. ovalifolia). The Caribbean and Hawaiian species are sister taxa and are another example of a biogeographical link between the Caribbean Basin and Polynesia. We provide a brief conservation review of the three taxa based on our collective field work and investigations; it is apparent that J. obcordata is highly threatened and declining in the Caribbean.

  16. SISP: a Fast Species Identification System for Prokaryotes Based on Total Nucleotide Identity of Whole Genome Sequences

    Directory of Open Access Journals (Sweden)

    Jiapeng Chen

    2015-06-01

    Full Text Available In the genomic era, new techniques and criteria are proposed to improve the traditionally phenotypic and biochemical test–based approaches for prokaryotic species definition. Among them, average nucleotide identity (ANI) mirrors DNA-DNA hybridization and is widely used by the microbial research community. However, our test shows that ANI may define distinct taxa as the same species when they share highly homologous sequences in a very short genomic region. In this study, we propose an improved algorithm named total nucleotide identity (TNI) for use in bacterial taxonomy; this algorithm provided higher accuracy for species classification than ANI. Furthermore, we developed a species identification system for prokaryotes (SISP) based on pairwise TNI of 3,073 genomes acquired from GenBank. For a submitted query genome, SISP can quickly find its most closely related genome from the established database based on the TNI calculation and infer the possible species of the query genome. Given a criterion of TNI > 70%, SISP had an accuracy above 90% for 3,596 prokaryotic genomes. SISP is open source and is available at https://github.com/chjp/SISProkaryotes.
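
    The weakness of ANI described above can be shown with a toy calculation. The sketch contrasts an identity averaged over aligned fragments only with an identity normalized by the whole genome length. The second quantity is an assumption made for illustration and is not the published TNI formula, which the abstract does not give; the function names and numbers are hypothetical.

```python
def aligned_only_identity(fragments):
    """Length-weighted mean identity over aligned fragments only.

    fragments: list of (aligned_length_bp, identity_fraction) pairs.
    """
    aligned = sum(length for length, _ in fragments)
    matches = sum(length * identity for length, identity in fragments)
    return matches / aligned


def genome_wide_identity(fragments, genome_length):
    """Matched bases divided by the full genome length, so that unaligned
    sequence counts against the score (an illustrative assumption)."""
    matches = sum(length * identity for length, identity in fragments)
    return matches / genome_length


# Hypothetical pair: one 50 kbp region aligns at 99% in a 5 Mbp genome.
frags = [(50_000, 0.99)]
ani_like = aligned_only_identity(frags)
coverage_aware = genome_wide_identity(frags, 5_000_000)
```

    In this toy case the aligned-only score is 0.99 although only 1% of the genome aligns, while the genome-wide score is about 0.0099; a coverage-aware measure of this general kind would avoid grouping the two genomes on the strength of one short shared region.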

  17. Thermochemical and kinetic evidence for nucleotide-sequence-dependent RecA-DNA interactions.

    Science.gov (United States)

    Wittung, P; Ellouze, C; Maraboeuf, F; Takahashi, M; Nordèn, B

    1997-05-01

    RecA catalyses homologous recombination in Escherichia coli by promoting pairing of homologous DNA molecules after formation of a helical nucleoprotein filament with single-stranded DNA. The primary reaction of RecA with DNA is generally assumed to be unspecific. We show here, by direct measurement of the interaction enthalpy by means of isothermal titration calorimetry, that the polymerisation of RecA on single-stranded DNA depends on the DNA sequence, with a high exothermic preference for thymine bases. This enthalpic sequence preference of RecA for thymines correlates with faster binding kinetics of RecA to thymine DNA. Furthermore, the enthalpy of interaction between the RecA-DNA filament and a second DNA strand is large only when the added DNA is complementary to the bound DNA in RecA. This result suggests a possibility for a rapid search mechanism by RecA-DNA filaments for homologous DNA molecules.

  18. Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

    Directory of Open Access Journals (Sweden)

    Purves Joanne

    2012-09-01

    Full Text Available Abstract Background Staphylococcus aureus Repeat (STAR elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species, however higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis.

  19. Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution.

    Science.gov (United States)

    Purves, Joanne; Blades, Matthew; Arafat, Yasrab; Malik, Salman A; Bayliss, Christopher D; Morrissey, Julie A

    2012-09-28

    Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species, however higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis.

  20. Nucleotide sequence analysis of a candidate gene for ataxia-telangiectasia group D (ATDC)

    Energy Technology Data Exchange (ETDEWEB)

    Leonhardt, E.A.; Kapp, L.N.; Young, B.R.; Murnane, J.P. (Univ. of California, San Francisco, CA (United States))

    1994-01-01

    A radioresistant cell clone (1B3) was previously isolated after transfection of an ataxia-telangiectasia (AT) group D cell line with a human cosmid library. A cosmid rescued from the integration site in 1B3 contained human DNA from chromosome position 11q23, the same region shown by both genetic linkage and chromosome transfer to contain the genes for AT complementation groups A/B, C, and D. A gene within the cosmid (ATDC) was found to produce mRNAs of different sizes. A cDNA for one of the most abundant mRNAs (3.0 kb) was isolated from a HeLa cell library. In the present study, the authors sequenced the 3.0-kb cDNA and the surrounding intron DNA in the cosmids. They used polymerase chain reaction, with primers in the introns, to confirm the number of exons and to analyze DNA from AT group D cells for mutations within this gene. Although no mutations were found, they do not rule out the possibility that mutations may be present within the regulatory sequences or coding sequences found in other mRNAs specific for this gene. From the sequence analysis, they found that the ATDC gene product is one of a group of proteins that share multiple zinc finger motifs and an adjacent leucine zipper motif. These proteins have been proposed to form homo- or hetero-dimers involved in nucleic acid binding, consistent with the fact that many of these proteins appear to be transcriptional regulatory factors involved in carcinogenesis and/or differentiation. The likelihood that the ATDC gene product is involved in transcriptional regulation could explain the pleiomorphic characteristics of AT, including abnormal cell cycle regulation. 36 refs., 5 figs., 2 tabs.

  1. Analysis of simple sequence repeats in rice bean (Vigna umbellata) using an SSR-enriched library

    Institute of Scientific and Technical Information of China (English)

    Lixia Wang; Kyung Do Kim; Dongying Gao; Honglin Chen; Suhua Wang; SukHa Lee; Scott A. Jackson; Xuzhen Cheng

    2016-01-01

    Rice bean (Vigna umbellata Thunb.), a warm-season annual legume, is grown in Asia mainly for dried grain or fodder and plays an important role in human and animal nutrition because the grains are rich in protein and some essential fatty acids and minerals. With the aim of expediting the genetic improvement of rice bean, we initiated a project to develop genomic resources and tools for molecular breeding in this little-known but important crop. Here we report the construction of an SSR-enriched genomic library from DNA extracted from pooled young leaf tissues of 22 rice bean genotypes and the development of SSR markers. In 433,562 reads generated by a Roche 454 GS-FLX sequencer, we identified 261,458 SSRs, of which 48.8% were of compound form. Dinucleotide repeats were predominant with an absolute proportion of 81.6%, followed by trinucleotides (17.8%). Other types together accounted for 0.6%. The motif AC/GT accounted for 77.7% of the total, followed by AAG/CTT (14.3%), and all others accounted for 12.0%. Among the flanking sequences, 2928 matched putative genes or gene models in the protein database of Arabidopsis thaliana, corresponding with 608 non-redundant Gene Ontology terms. Of these sequences, 11.2% were involved in cellular components, 24.2% were involved in molecular functions, and 64.6% were associated with biological processes. Based on homolog analysis, 1595 flanking sequences were similar to mung bean and 500 to common bean genomic sequences. Comparative mapping was conducted using 350 sequences homologous to both mung bean and common bean sequences. Finally, a set of primer pairs were designed, and a validation test showed that 58 of 220 new primers can be used in rice bean and 53 can be transferred to mung bean. However, only 11 were polymorphic when tested on 32 rice bean varieties. We propose that this study lays the groundwork for developing novel SSR markers and will enhance the mapping of qualitative and quantitative traits and marker-assisted selection in
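
    Motif classes such as AC/GT and AAG/CTT pool every repeat unit that describes the same double-stranded repeat. A minimal sketch of that grouping follows, assuming the usual convention that a unit, its cyclic rotations, and the rotations of its reverse complement belong to one class; the abstract does not say how the authors classified motifs, and the function names are illustrative.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Return the alphabetically smallest string among all rotations of the
    motif and all rotations of its reverse complement."""
    motif = motif.upper()
    rev_comp = motif.translate(_COMPLEMENT)[::-1]
    candidates = []
    for unit in (motif, rev_comp):
        candidates.extend(unit[i:] + unit[:i] for i in range(len(unit)))
    return min(candidates)


def motif_class_counts(motifs):
    """Count SSR units per canonical motif class."""
    return Counter(canonical_motif(m) for m in motifs)


observed = ["AC", "GT", "CA", "TG", "AAG", "CTT", "TTC"]   # toy input
classes = motif_class_counts(observed)
```

    With this grouping, AC, CA, GT and TG all fall into the class written AC/GT in the abstract, and AAG, CTT and TTC fall into AAG/CTT.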

  2. Analysis of simple sequence repeats in rice bean (Vigna umbellata) using an SSR-enriched library

    Directory of Open Access Journals (Sweden)

    Lixia Wang

    2016-02-01

    Full Text Available Rice bean (Vigna umbellata Thunb.), a warm-season annual legume, is grown in Asia mainly for dried grain or fodder and plays an important role in human and animal nutrition because the grains are rich in protein and some essential fatty acids and minerals. With the aim of expediting the genetic improvement of rice bean, we initiated a project to develop genomic resources and tools for molecular breeding in this little-known but important crop. Here we report the construction of an SSR-enriched genomic library from DNA extracted from pooled young leaf tissues of 22 rice bean genotypes and the development of SSR markers. In 433,562 reads generated by a Roche 454 GS-FLX sequencer, we identified 261,458 SSRs, of which 48.8% were of compound form. Dinucleotide repeats were predominant with an absolute proportion of 81.6%, followed by trinucleotides (17.8%). Other types together accounted for 0.6%. The motif AC/GT accounted for 77.7% of the total, followed by AAG/CTT (14.3%), and all others accounted for 12.0%. Among the flanking sequences, 2928 matched putative genes or gene models in the protein database of Arabidopsis thaliana, corresponding with 608 non-redundant Gene Ontology terms. Of these sequences, 11.2% were involved in cellular components, 24.2% were involved in molecular functions, and 64.6% were associated with biological processes. Based on homolog analysis, 1595 flanking sequences were similar to mung bean and 500 to common bean genomic sequences. Comparative mapping was conducted using 350 sequences homologous to both mung bean and common bean sequences. Finally, a set of primer pairs were designed, and a validation test showed that 58 of 220 new primers can be used in rice bean and 53 can be transferred to mung bean. However, only 11 were polymorphic when tested on 32 rice bean varieties. We propose that this study lays the groundwork for developing novel SSR markers and will enhance the mapping of qualitative and quantitative traits and marker

  3. Molecular cloning, nucleotide sequence, and expression of the gene encoding human eosinophil differentiation factor (interleukin 5)

    Energy Technology Data Exchange (ETDEWEB)

    Campbell, H.D.; Tucker, W.Q.J.; Hort, Y.; Martinson, M.E.; Mayo, G.; Clutterbuck, E.J.; Sanderson, C.J.; Young, I.G.

    1987-10-01

    The human eosinophil differentiation factor (EDF) gene was cloned from a genomic library in lambda phage EMBL3A by using a murine EDF cDNA clone as a probe. The DNA sequence of a 3.2-kilobase BamHI fragment spanning the gene was determined. The gene contains three introns. The predicted amino acid sequence of 134 amino acids is identical with that recently reported for human interleukin 5 but shows no significant homology with other known hemopoietic growth regulators. The amino acid sequence shows strong homology (approx. 70% identity) with that of murine EDF. Recombinant human EDF, expressed from the human EDF gene after transfection into monkey COS cells, stimulated the production of eosinophils and eosinophil colonies from normal human bone marrow but had no effect on the production of neutrophils or mononuclear cells (monocytes and lymphoid cells). The apparent specificity of human EDF for the eosinophil lineage in myeloid hemopoiesis contrasts with the properties of human interleukin 3 and granulocyte/macrophage and granulocyte colony-stimulating factors but is directly analogous to the biological properties of murine EDF. Human EDF therefore represents a distinct hemopoietic growth factor that could play a central role in the regulation of eosinophilia.

  4. Mapping vaccinia virus DNA replication origins at nucleotide level by deep sequencing.

    Science.gov (United States)

    Senkevich, Tatiana G; Bruno, Daniel; Martens, Craig; Porcella, Stephen F; Wolf, Yuri I; Moss, Bernard

    2015-09-01

    Poxviruses reproduce in the host cytoplasm and encode most or all of the enzymes and factors needed for expression and synthesis of their double-stranded DNA genomes. Nevertheless, the mode of poxvirus DNA replication and the nature and location of the replication origins remain unknown. A current but unsubstantiated model posits only leading strand synthesis starting at a nick near one covalently closed end of the genome and continuing around the other end to generate a concatemer that is subsequently resolved into unit genomes. The existence of specific origins has been questioned because any plasmid can replicate in cells infected by vaccinia virus (VACV), the prototype poxvirus. We applied directional deep sequencing of short single-stranded DNA fragments enriched for RNA-primed nascent strands isolated from the cytoplasm of VACV-infected cells to pinpoint replication origins. The origins were identified as the switching points of the fragment directions, which correspond to the transition from continuous to discontinuous DNA synthesis. Origins containing a prominent initiation point mapped to a sequence within the hairpin loop at one end of the VACV genome and to the same sequence within the concatemeric junction of replication intermediates. These findings support a model for poxvirus genome replication that involves leading and lagging strand synthesis and is consistent with the requirements for primase and ligase activities as well as earlier electron microscopic and biochemical studies implicating a replication origin at the end of the VACV genome.

  5. Individual and population variation in invertebrates revealed by Inter-simple Sequence Repeats (ISSRs)

    Directory of Open Access Journals (Sweden)

    Patrick Abbot

    2001-08-01

    PCR-based molecular markers are well suited for questions requiring large scale surveys of plant and animal populations. Inter-simple Sequence Repeats or ISSRs are analyzed by a recently developed technique based on the amplification of the regions between inverse-oriented microsatellite loci with oligonucleotides anchored in microsatellites themselves. ISSRs have shown much promise for the study of the population biology of plants, but have not yet been explored for similar studies of animals. The value of ISSRs is demonstrated for the study of animal species with low levels of within-population variation. Sets of primers are identified which reveal variation in two aphid species, Acyrthosiphon pisum and Pemphigus obesinymphae, in the yellow fever mosquito Aedes aegypti, and in a rotifer in the genus Philodina.

  6. Nucleotide and amino acid sequences of a coat protein of an Ukrainian isolate of Potato virus Y: comparison with homologous sequences of other isolates and phylogenetic analysis

    Directory of Open Access Journals (Sweden)

    Budzanivska I. G.

    2014-03-01

    Aim. Identification of the widespread Ukrainian isolate(s) of PVY (Potato virus Y) in different potato cultivars and subsequent phylogenetic analysis of detected PVY isolates based on NA and AA sequences of coat protein. Methods. ELISA, RT-PCR, DNA sequencing and phylogenetic analysis. Results. PVY has been identified serologically in potato cultivars of Ukrainian selection. In this work we have optimized a method for total RNA extraction from potato samples and offered a sensitive and specific PCR-based test system of our own design for diagnostics of the Ukrainian PVY isolates. Part of the CP gene of the Ukrainian PVY isolate has been sequenced and analyzed phylogenetically. It is demonstrated that the Ukrainian isolate of Potato virus Y (CP gene) has a higher percentage of homology with the recombinant isolates (strains) of this pathogen (approx. 98.8–99.8% homology for both nucleotide and translated amino acid sequences of the CP gene). The Ukrainian isolate of PVY is positioned in a separate cluster together with the isolates found in Syria, Japan and Iran; these isolates possibly have a common origin. The Ukrainian PVY isolate is confirmed to be recombinant. Conclusions. This work underlines the need and provides the means for accurate monitoring of Potato virus Y in the agroecosystems of Ukraine. Most importantly, the phylogenetic analysis demonstrated the recombinant nature of this PVY isolate, which has been attributed to the strain group O, subclade N:O.

  7. Complete nucleotide sequence analysis of Cymbidium mosaic virus Indian isolate: further evidence for natural recombination among potexviruses

    Indian Academy of Sciences (India)

    Ang Rinzing Sherpa; Vipin Hallan; Promila Pathak; Aijaz Asghar Zaidi

    2007-06-01

    The complete nucleotide sequence of an Indian strain of Cymbidium mosaic virus (CymMV) was determined and compared with other potexviruses. Phylogenetic analyses on the basis of RNA-dependent RNA polymerase (RdRp), triple gene block protein and coat protein (CP) amino acid sequences revealed that CymMV is closely related to the Narcissus mosaic virus (NMV), Scallion virus X (SVX), Pepino mosaic virus (PepMV) and Potato aucuba mosaic virus (PAMV). Different sets of primers were used for the amplification of different regions of the genome through RT-PCR and the amplified genes were cloned in a suitable vector. The full genome of the Indian isolate of CymMV from Phaius tankervilliae shares 96–97% similarity with isolates reported from other countries. The CP genes of CymMV isolates were found to share high similarity with each other and with those of other potexviruses. One of the Indian isolates seems to be a recombinant formed by the intermolecular recombination of two other CymMV isolates. The phylogenetic analyses, Recombination Detection Program (RDP2) analyses and sequence alignment survey provided evidence for the occurrence of a recombination between an Indian isolate (AM055720) as the major parent, and a Korean type-2 isolate (AF016914) as the minor parent. Recombination was also observed between a Singapore isolate (U62963) as the major parent, and a Taiwan CymMV (AY571289) as the minor parent.

  8. Molecular Properties of Poliovirus Isolates: Nucleotide Sequence Analysis, Typing by PCR and Real-Time RT-PCR.

    Science.gov (United States)

    Burns, Cara C; Kilpatrick, David R; Iber, Jane C; Chen, Qi; Kew, Olen M

    2016-01-01

    Virologic surveillance is essential to the success of the World Health Organization initiative to eradicate poliomyelitis. Molecular methods have been used to detect polioviruses in tissue culture isolates derived from stool samples obtained through surveillance for acute flaccid paralysis. This chapter describes the use of real-time PCR assays to identify and serotype polioviruses. In particular, a degenerate, inosine-containing, panpoliovirus (panPV) PCR primer set is used to distinguish polioviruses from non-polio enteroviruses (NPEVs). The high degree of nucleotide sequence diversity among polioviruses presents a challenge to the systematic design of nucleic acid-based reagents. To accommodate the wide variability and rapid evolution of poliovirus genomes, degenerate codon positions on the template were matched to mixed-base or deoxyinosine residues on both the primers and the TaqMan™ probes. Additional assays distinguish between Sabin vaccine strains and non-Sabin strains. This chapter also describes the use of generic poliovirus-specific primers, along with degenerate and inosine-containing primers, for routine VP1 sequencing of poliovirus isolates. These primers, along with nondegenerate serotype-specific Sabin primers, can also be used to sequence individual polioviruses in mixtures.

  9. Gene cloning and nucleotide sequencing and properties of a cocaine esterase from Rhodococcus sp. strain MB1.

    Science.gov (United States)

    Bresler, M M; Rosser, S J; Basran, A; Bruce, N C

    2000-03-01

    A strain of Rhodococcus designated MB1, which was capable of utilizing cocaine as a sole source of carbon and nitrogen for growth, was isolated from rhizosphere soil of the tropane alkaloid-producing plant Erythroxylum coca. A cocaine esterase was found to initiate degradation of cocaine, which was hydrolyzed to ecgonine methyl ester and benzoate; both of these esterolytic products were further metabolized by Rhodococcus sp. strain MB1. The structural gene encoding a cocaine esterase, designated cocE, was cloned from Rhodococcus sp. strain MB1 genomic libraries by screening recombinant strains of Rhodococcus erythropolis CW25 for growth on cocaine. The nucleotide sequence of cocE corresponded to an open reading frame of 1,724 bp that codes for a protein of 574 amino acids. The amino acid sequence of cocaine esterase has a region of similarity with the active serine consensus of X-prolyl dipeptidyl aminopeptidases, suggesting that the cocaine esterase is a serine esterase. The cocE coding sequence was subcloned into the pCFX1 expression plasmid and expressed in Escherichia coli. The recombinant cocaine esterase was purified to apparent homogeneity and was found to be monomeric, with an M(r) of approximately 65,000. The apparent K(m) of the enzyme (mean +/- standard deviation) for cocaine was measured as 1.33 +/- 0.085 mM. These findings are of potential use in the development of a linked assay for the detection of illicit cocaine.

  10. Comparative analysis of ITS1 nucleotide sequence reveals distinct genetic difference between Brugia malayi from Northeast Borneo and Thailand.

    Science.gov (United States)

    Fong, Mun-Yik; Noordin, Rahmah; Lau, Yee-Ling; Cheong, Fei-Wen; Yunus, Muhammad Hafiznur; Idris, Zulkarnain Md

    2013-01-01

    Brugia malayi is one of the parasitic worms which causes lymphatic filariasis in humans. Its geographical distribution includes a large part of Asia. Despite its wide distribution, very little is known about the genetic variation and molecular epidemiology of this species. In this study, the internal transcribed spacer 1 (ITS1) nucleotide sequences of B. malayi from microfilaria-positive human blood samples in Northeast Borneo Island were determined, and compared with published ITS1 sequences of B. malayi isolated from cats and humans in Thailand. Multiple alignment analysis revealed that B. malayi ITS1 sequences from Northeast Borneo were more similar to each other than to those from Thailand. Phylogenetic trees inferred using Neighbour-Joining and Maximum Parsimony methods showed similar topology, with 2 distinct B. malayi clusters. The first cluster consisted of Northeast Borneo B. malayi isolates, whereas the second consisted of the Thailand isolates. The findings of this study suggest that B. malayi in Borneo Island has diverged significantly from those of mainland Asia, and this has implications for the diagnosis of B. malayi infection across the region using ITS1-based molecular techniques.

  11. Nucleotide sequence of the DNA polymerase gene of herpes simplex virus type 2 and comparison with the type 1 counterpart.

    Science.gov (United States)

    Tsurumi, T; Maeno, K; Nishiyama, Y

    1987-01-01

    The complete nucleotide sequence of the DNA polymerase gene of herpes simplex virus (HSV) type 2 strain 186 has been determined. The gene included a 3720-bp major open reading frame capable of encoding 1240 amino acids. The predicted primary translation product had an Mr of 137,354, which was slightly larger than its HSV-1 counterpart. A comparison of the predicted functional amino acid sequences of the HSV-1 and HSV-2 DNA polymerases revealed 95.5% overall amino acid homology, the highest value among the known polypeptides encoded by HSV-1 and HSV-2. The functional amino acid changes were spread over the N-terminal one-third of the protein, whereas the C-terminal two-thirds was almost identical between the two types except for a particular hydrophilic region. A highly conserved sequence of 6 aa, YGDTDS, which has been observed in DNA polymerases of HSV-1, Epstein-Barr virus, adenovirus, and vaccinia virus, was also present at positions 889 to 894 in the C-terminal region of HSV-2 DNA polymerase.

  12. Genome-wide identification and validation of simple sequence repeats (SSRs) from Asparagus officinalis.

    Science.gov (United States)

    Li, Shufen; Zhang, Guojun; Li, Xu; Wang, Lianjun; Yuan, Jinhong; Deng, Chuanliang; Gao, Wujun

    2016-06-01

    Garden asparagus (Asparagus officinalis), an important vegetable cultivated worldwide, can also serve as a model dioecious plant species in the study of sex determination and sex chromosome evolution. However, limited DNA marker resources have been developed and used for this species. To expand these resources, we examined the DNA sequences for simple sequence repeats (SSRs) in 163,406 scaffolds representing approximately 400 Mbp of the A. officinalis genome. A total of 87,576 SSRs were identified in 59,565 scaffolds. The most abundant SSR repeats were trinucleotide and tetranucleotide, accounting for 29.2 and 29.1% of the total SSRs, respectively, followed by di-, penta-, hexa-, hepta-, and octanucleotides. The AG motif was most common among dinucleotides and was also the most frequent motif in the entire A. officinalis genome, representing 14.7% of all SSRs. A total of 41,917 SSR primer pairs were designed to amplify SSRs. Twenty-two genomic SSR markers were tested in 39 asparagus accessions belonging to ten cultivars and one accession of Asparagus setaceus for determination of genetic diversity. The intra-species polymorphism information content (PIC) values of the 22 genomic SSR markers were intermediate, with an average of 0.41. The genetic diversity between the ten A. officinalis cultivars was low, and the UPGMA dendrogram was largely unrelated to cultivars. It is suggested here that the sex of individuals is an important factor influencing the clustering results. The information reported here provides new insight into the organization of the microsatellites in the A. officinalis genome and lays a foundation for further genetic studies and breeding applications of A. officinalis and related species.
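
    The PIC values quoted in this abstract can be illustrated with a short calculation. The sketch below assumes the commonly used Botstein et al. (1980) definition of polymorphism information content; the abstract does not say which formula the authors applied, and the function name `pic` and the example allele frequencies are hypothetical.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    Assumes the Botstein et al. (1980) definition:
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    `allele_freqs` is a list of allele frequencies that sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings that are uninformative despite heterozygosity.
    correction = sum(
        2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - correction


if __name__ == "__main__":
    # Hypothetical marker with two equally frequent alleles.
    print(round(pic([0.5, 0.5]), 3))
```

    Under this definition a marker with two equally frequent alleles scores 0.375, so an average of 0.41 corresponds to markers that are only slightly more informative than a balanced biallelic locus.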

  13. Characterization and Nucleotide Sequence of CARB-6, a New Carbenicillin-Hydrolyzing β-Lactamase from Vibrio cholerae

    Science.gov (United States)

    Choury, Danièle; Aubert, Gérald; Szajnert, Marie-France; Azibi, Kemal; Delpech, Marc; Paul, Gérard

    1999-01-01

    A clinical strain of Vibrio cholerae non-O1 non-O139 isolated in France produced a new β-lactamase with a pI of 5.35. The purified enzyme, with a molecular mass of 33,000 Da, was characterized. Its kinetic constants show it to be a carbenicillin-hydrolyzing enzyme comparable to the five previously reported CARB β-lactamases and to SAR-1, another carbenicillin-hydrolyzing β-lactamase that has a pI of 4.9 and that is produced by a V. cholerae strain from Tanzania. This β-lactamase is designated CARB-6, and the gene for CARB-6 could not be transferred to Escherichia coli K-12 by conjugation. The nucleotide sequence of the structural gene was determined by direct sequencing of PCR-generated fragments from plasmid DNA with four pairs of primers covering the whole sequence of the reference CARB-3 gene. The gene encodes a 288-amino-acid protein that shares 94% homology with the CARB-1, CARB-2, and CARB-3 enzymes, 93% homology with the Proteus mirabilis N29 enzyme, and 86.5% homology with the CARB-4 enzyme. The sequence of CARB-6 differs from those of CARB-3, CARB-2, CARB-1, N29, and CARB-4 at 15, 16, 17, 19, and 37 amino acid positions, respectively. All these mutations are located in the C-terminal region of the sequence and at the surface of the molecule, according to the crystal structure of the Staphylococcus aureus PC-1 β-lactamase. PMID:9925522

  14. Finding the right coverage: the impact of coverage and sequence quality on single nucleotide polymorphism genotyping error rates.

    Science.gov (United States)

    Fountain, Emily D; Pauli, Jonathan N; Reid, Brendan N; Palsbøll, Per J; Peery, M Zachariah

    2016-07-01

    Restriction-enzyme-based sequencing methods enable the genotyping of thousands of single nucleotide polymorphism (SNP) loci in nonmodel organisms. However, in contrast to traditional genetic markers, genotyping error rates in SNPs derived from restriction-enzyme-based methods remain largely unknown. Here, we estimated genotyping error rates in SNPs genotyped with double digest RAD sequencing from Mendelian incompatibilities in known mother-offspring dyads of Hoffman's two-toed sloth (Choloepus hoffmanni) across a range of coverage and sequence quality criteria, for both reference-aligned and de novo-assembled data sets. Genotyping error rates were more sensitive to coverage than sequence quality and low coverage yielded high error rates, particularly in de novo-assembled data sets. For example, coverage ≥5 yielded median genotyping error rates of ≥0.03 and ≥0.11 in reference-aligned and de novo-assembled data sets, respectively. Genotyping error rates declined to ≤0.01 in reference-aligned data sets with a coverage ≥30, but remained ≥0.04 in the de novo-assembled data sets. We observed approximately 10- and 13-fold declines in the number of loci sampled in the reference-aligned and de novo-assembled data sets when coverage was increased from ≥5 to ≥30 at quality score ≥30, respectively. Finally, we assessed the effects of genotyping coverage on a common population genetic application, parentage assignments, and showed that the proportion of incorrectly assigned maternities was relatively high at low coverage. Overall, our results suggest that the trade-off between sample size and genotyping error rates be considered prior to building sequencing libraries, reporting genotyping error rates become standard practice, and that effects of genotyping errors on inference be evaluated in restriction-enzyme-based SNP studies.
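
    The idea of scoring Mendelian incompatibilities in mother-offspring dyads can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes diploid genotypes stored as two-allele tuples, and the function names are hypothetical. The proportion it returns counts only detectable incompatibilities; how the study converts such counts into a genotyping error rate is not stated in the abstract.

```python
def mendelian_incompatible(mother, offspring):
    """Return True when the offspring shares no allele with its mother.

    Genotypes are two-allele tuples such as ("A", "G"). A true offspring
    must carry at least one maternal allele, so an empty intersection
    signals a genotyping error in one of the two individuals.
    """
    return not (set(mother) & set(offspring))


def incompatibility_rate(dyads):
    """Proportion of scored loci that are incompatible across dyads.

    `dyads` is an iterable of (mother_genotype, offspring_genotype)
    pairs, one per locus per dyad; None marks a missing genotype and
    such pairs are skipped.
    """
    scored = [(m, o) for m, o in dyads if m is not None and o is not None]
    if not scored:
        raise ValueError("no loci with both genotypes called")
    bad = sum(mendelian_incompatible(m, o) for m, o in scored)
    return bad / len(scored)


if __name__ == "__main__":
    example = [
        (("A", "A"), ("A", "G")),  # compatible
        (("A", "A"), ("G", "G")),  # incompatible: opposite homozygotes
        (("A", "G"), ("G", "G")),  # compatible
        (None, ("A", "A")),        # missing maternal call, skipped
    ]
    print(incompatibility_rate(example))
```

    Filtering the input by per-genotype coverage or quality before calling `incompatibility_rate` would mimic, in spirit, the comparison across coverage and quality thresholds described in the abstract.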

  15. Inferring multiple refugia and phylogeographical patterns in Pinus massoniana based on nucleotide sequence variation and DNA fingerprinting.

    Directory of Open Access Journals (Sweden)

    Xue-Jun Ge

    BACKGROUND: Pinus massoniana, an ecologically and economically important conifer, is widespread across central and southern mainland China and Taiwan. In this study, we tested the central-marginal paradigm that predicts that the marginal populations tend to be less polymorphic than the central ones in their genetic composition, and examined a founders' effect in the island population. METHODOLOGY/PRINCIPAL FINDINGS: We examined the phylogeography and population structuring of P. massoniana based on nucleotide sequences of cpDNA atpB-rbcL intergenic spacer, intron regions of the AdhC2 locus, and microsatellite fingerprints. SAMOVA analysis of nucleotide sequences indicated that most genetic variants resided among geographical regions. High levels of genetic diversity in the marginal populations in the south region, a pattern seemingly contradicting the central-marginal paradigm, and the fixation of private haplotypes in most populations indicate that multiple refugia may have existed over the glacial maxima. STRUCTURE analyses on microsatellites revealed that the genetic structure of mainland populations was mediated by recent genetic exchanges mostly via pollen flow, and that the genetic composition in the east region was intermixed between the south and west regions, a pattern likely shaped by gene introgression and maintenance of ancestral polymorphisms. As expected, the small island population in Taiwan was genetically differentiated from mainland populations. CONCLUSIONS/SIGNIFICANCE: The marginal populations in the south region possessed divergent gene pools, suggesting that the past glaciations might have had low impacts on these populations at low latitudes. Estimates of ancestral population sizes interestingly reflect a recent expansion in the mainland from a rather smaller population, a pattern that seemingly agrees with the pollen record.

  16. A single origin and moderate bottleneck during domestication of soybean (Glycine max): implications from microsatellites and nucleotide sequences.

    Science.gov (United States)

    Guo, Juan; Wang, Yunsheng; Song, Chi; Zhou, Jianfeng; Qiu, Lijuan; Huang, Hongwen; Wang, Ying

    2010-09-01

    Background and Aims It is essential to illuminate the evolutionary history of crop domestication in order to understand further the origin and development of modern cultivation and agronomy; however, despite being one of the most important crops, the domestication origin and bottleneck of soybean (Glycine max) are poorly understood. In the present study, microsatellites and nucleotide sequences were employed to elucidate the domestication genetics of soybean. Methods The genomes of 79 landrace soybeans (endemic cultivated soybeans) and 231 wild soybeans (G. soja) that represented the species-wide distribution of wild soybean in East Asia were scanned with 56 microsatellites to identify the genetic structure and domestication origin of soybean. To understand better the domestication bottleneck, four nucleotide sequences were selected to simulate the domestication bottleneck. Key Results Model-based analysis revealed that most of the landrace genotypes were assigned to the inferred wild soybean cluster of south China, South Korea and Japan. Phylogeny for wild and landrace soybeans showed that all landrace soybeans formed a single cluster supporting a monophyletic origin of all the cultivars. The populations of the nearest branches which were basal to the cultivar lineage were wild soybeans from south China. The coalescent simulation detected a bottleneck severity of K' = 2 during soybean domestication, which could be explained by a foundation population of 6000 individuals if domestication duration lasted 3000 years. Conclusions As a result of integrating geographic distribution with microsatellite genotype assignment and phylogeny between landrace and wild soybeans, a single origin of soybean in south China is proposed. The coalescent simulation revealed a moderate genetic bottleneck with an effective wild soybean population used for domestication estimated to be approximately 2 % of the total number of ancestral wild soybeans. 
Wild soybeans in Asia, especially in

  17. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Directory of Open Access Journals (Sweden)

    Huaiyong Luo

    The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers was established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.
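
    Genome-wide SSR identification of the kind summarized here is usually done with dedicated tools, which the abstract does not name. The sketch below shows only the underlying idea with a regular-expression scan for perfect di- and trinucleotide repeats; the function name `find_ssrs` and the minimum repeat counts (6 for dinucleotides, 5 for trinucleotides) are assumptions chosen for illustration, not the study's parameters.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Find perfect di- and trinucleotide SSRs in a DNA sequence.

    Returns a list of (start, motif, repeat_count) tuples with 0-based
    start positions. `min_repeats` maps motif length to the minimum
    number of tandem copies required (assumed defaults: {2: 6, 3: 5}).
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence containing one (AC)6 repeat.
    print(find_ssrs("TT" + "AC" * 6 + "GG"))
```

    Dividing the total assembly length by the number of hits gives a density figure comparable in form to the "one SSR per 22.95 kb" reported in the abstract, although the value depends heavily on the thresholds chosen.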

  18. The First Molecular Identification of an Olive Collection Applying Standard Simple Sequence Repeats and Novel Expressed Sequence Tag Markers

    Directory of Open Access Journals (Sweden)

    Soraya Mousavi

    2017-07-01

    Germplasm collections of tree crop species represent fundamental tools for conservation of diversity and key steps for its characterization and evaluation. For the olive tree, several collections were created all over the world, but only a few of them have been fully characterized and molecularly identified. The olive collection of Perugia University (UNIPG), established in the 1960s, represents one of the first attempts to gather and safeguard olive diversity, keeping together cultivars from different countries. In the present study, a set of 370 previously uncharacterized olive trees was screened with 10 standard simple sequence repeats (SSRs) and nine new EST-SSR markers, to correctly and thoroughly identify all genotypes, verify their representativeness of the entire cultivated olive variation, and validate the effectiveness of new markers in comparison to standard genotyping tools. The SSR analysis revealed the presence of 59 genotypes, corresponding to 72 well known cultivars, 13 of which are present exclusively in this collection. The new EST-SSRs have shown values of diversity parameters quite similar to those of the best standard SSRs. When compared to hundreds of Mediterranean cultivars, the UNIPG olive accessions were split into the three main populations (East, Center and West Mediterranean), confirming that the collection has a good representativeness of the entire olive variability. Furthermore, Bayesian analysis, performed on the 59 genotypes of the collection by the use of both sets of markers, has demonstrated their splitting into four clusters, with a well balanced membership obtained by EST with respect to standard SSRs. The new OLEST (Olea expressed sequence tags) SSR markers proved as effective as the best standard markers. The information obtained from this study represents a highly valuable tool for ex situ conservation and management of olive genetic resources, useful to build a common database from worldwide olive

  19. Rhoptry-associated protein (rap-1) genes in the sheep pathogen Babesia sp. Xinjiang: Multiple transcribed copies differing by 3' end repeated sequences.

    Science.gov (United States)

    Niu, Qingli; Marchand, Jordan; Yang, Congshan; Bonsergent, Claire; Guan, Guiquan; Yin, Hong; Malandrin, Laurence

    2015-07-30

    Sheep babesiosis occurs mainly in tropical and subtropical areas. The sheep parasite Babesia sp. Xinjiang is widespread in China, and our goal is to characterize rap-1 (rhoptry-associated protein 1) gene diversity and expression as a first step of a long term goal aiming at developing a recombinant subunit vaccine. Seven different rap-1a genes were amplified in Babesia sp. Xinjiang, using degenerate primers designed from conserved motifs. Rap-1b and rap-1c gene types could not be identified. In all seven rap-1a genes, the 5' regions exhibited identical sequences over 936 nt, and the 3' regions differed at 28 positions over 147 nt, defining two types of genes designated α and β. The remaining 3' part varied from 72 to 360 nt in length, depending on the gene. This region consists of a succession of two to ten 36 nt repeats, which explains the size differences. Even if the nucleotide sequences varied, 6 repeats encoded the same stretch of amino acids. Transcription of at least four α and two β genes was demonstrated by standard RT-PCR.

  20. Complete nucleotide sequence of Dendrocalamus latiflorus and Bambusa oldhamii chloroplast genomes

    Science.gov (United States)

    WU, F.-H.; KAN, D.-P.; LEE, S.-B.; DANIELL, H.; LEE, Y.-W.; LIN, C.-C.; LIN, N.-S.; LIN, C.-S.

    2009-01-01

    Summary Although bamboo is one of the most important woody crops in Asia, information on its genome is still very limited. To investigate the relationship among Poaceae members and to understand the mechanism of albino mutant generation in vitro, the complete chloroplast genome of two economically important bamboo species, Dendrocalamus latiflorus Munro and Bambusa oldhamii Munro, was determined employing a strategy that involved polymerase chain reaction (PCR) amplification using 443 novel primers designed to amplify the chloroplast genome of these two species. The lengths of the B. oldhamii and D. latiflorus chloroplast genomes are 139,350 and 139,365 bp, respectively. The organization structure and the gene order of these two bamboos are identical to other members of Poaceae. Highly conserved chloroplast genomes of Poaceae facilitated sequencing by the PCR method. Phylogenetic analysis using both chloroplast genomes confirmed the results obtained from studies on chromosome number and reproductive organ morphology. There are 23 gaps, insertions/deletions > 100 bp, in the chloroplast genomes of 10 genera of Poaceae compared in this study. The phylogenetic distribution of these gaps corresponds to their taxonomic placement. The sequences of these two chloroplast genomes provide useful information for studying bamboo evolution, ecology and biotechnology. PMID:19324693

  1. Molecular cloning and nucleotide sequence of chicken avidin-related genes 1-5.

    Science.gov (United States)

    Keinänen, R A; Wallén, M J; Kristo, P A; Laukkanen, M O; Toimela, T A; Helenius, M A; Kulomaa, M S

    1994-03-01

    Using avidin cDNA as a hybridisation probe, we detected a gene family whose putative products are related to the chicken egg-white avidin. Two overlapping genomic clones were found to contain five genes (avidin-related genes 1-5, avr1-avr5), which have been cloned, characterized and sequenced. All of the genes have a four-exon structure with an overall identity with the avidin cDNA of 88-92%. The genes appear to have no pseudogenic features and, in fact, two of these genes have been shown to be transcribed. The putative proteins share a sequence identity of 68-78% with avidin. The amino acid residues responsible for the biotin-binding activity of avidin and the bacterial biotin-binding protein, streptavidin, are highly conserved. Since avidin is induced in both a progesterone-specific manner and in connection with inflammation, these genes offer a valuable tool to study complex gene regulation in vivo.

  2. Nucleotide sequence of the promoter region of the gene encoding chicken Calbindin D28K

    Energy Technology Data Exchange (ETDEWEB)

    Ferrari, S.; Drusiani, E.; Battini, R.; Fregni, M.

    1988-01-11

    Calbindin D28K (formerly Vitamin D-Dependent Calcium Binding Protein) is a protein induced by 1,25-dihydroxycholecalciferol in several chicken tissues. A chicken genomic DNA library was screened with a synthetic oligonucleotide representing the sequence of Calbindin D28K cDNA from nt 146 to nt 176. The positive clone CBAl extends the 5'-end of the first exon by 451 bp. The sequence of a BamHI-SacII restriction fragment with coordinates -451 to +50 is shown. The BamHI-SacII fragment was subcloned 5' to the CAT gene of pUCCAT. The result is shown of a CAT assay on mouse fibroblasts 3T6 transiently transfected with pUCCAT, pUCCAT containing the BamHI-SacII fragment in the correct or opposite orientation or the SV40 promoter. ¹⁴C-chloramphenicol and its acetyl derivatives generated by purified CAT are also shown. The expression of CAT appears to be constitutive since the enzyme activity is not influenced by the presence (+) or absence (-) of 1,25-dihydroxycholecalciferol in the culture medium.

  3. Simple sequence repeat markers useful for sorghum downy mildew (Peronosclerospora sorghi) and related species

    Directory of Open Access Journals (Sweden)

    Odvody Gary N

    2008-11-01

    Background: A recent outbreak of sorghum downy mildew in Texas has led to the discovery of both metalaxyl resistance and a new pathotype in the causal organism, Peronosclerospora sorghi. These observations and the difficulty in resolving among phylogenetically related downy mildew pathogens dramatically point out the need for simply scored markers in order to differentiate among isolates and species, and to study the population structure within these obligate oomycetes. Here we present the initial results from the use of a biotin capture method to discover, clone and develop PCR primers that permit the use of simple sequence repeats (microsatellites) to detect differences at the DNA level. Results: Among the 55 primer pairs designed from clones from pathotype 3 of P. sorghi, 36 flanked microsatellite loci containing simple repeats, including 28 (55%) with dinucleotide repeats and 6 (11%) with trinucleotide repeats. A total of 22 microsatellites with CA/AC or GT/TG repeats were the most abundant (40%), and GA/AG or CT/TC types contribute 15% in our collection. When used to amplify DNA from 19 isolates from P. sorghi, as well as from 5 related species that cause downy mildew on other hosts, the number of different bands detected for each SSR primer pair using a LI-COR DNA Analyzer ranged from two to eight. Successful cross-amplification for 12 primer pairs studied in detail using DNA from downy mildews that attack maize (P. maydis & P. philippinensis), sugar cane (P. sacchari), pearl millet (Sclerospora graminicola) and rose (Peronospora sparsa) indicates that the flanking regions are conserved in all these species. A total of 15 SSR amplicons unique to P. philippinensis (one of the potential threats to US maize production) were detected, and these have potential for development of diagnostic tests. A total of 260 alleles were obtained using 54 microsatellite primer combinations, with an average of 4.8 polymorphic markers per SSR across 34

  4. Differential distribution and association of repeat DNA sequences in the lateral element of the synaptonemal complex in rat spermatocytes.

    Science.gov (United States)

    Hernández-Hernández, Abrahan; Rincón-Arano, Héctor; Recillas-Targa, Félix; Ortiz, Rosario; Valdes-Quezada, Christian; Echeverría, Olga M; Benavente, Ricardo; Vázquez-Nin, Gerardo H

    2008-02-01

    The synaptonemal complex (SC) is an evolutionarily conserved structure that mediates synapsis of homologous chromosomes during meiotic prophase I. Previous studies have established that the chromatin of homologous chromosomes is organized in loops that are attached to the lateral elements (LEs) of the SC. The characterization of the genomic sequences associated with LEs of the SC represents an important step toward understanding meiotic chromosome organization and function. To isolate these genomic sequences, we performed chromatin immunoprecipitation assays in rat spermatocytes using an antibody against SYCP3, a major structural component of the LEs of the SC. Our results demonstrated the reproducible and exclusive isolation of repeat deoxyribonucleic acid (DNA) sequences, in particular long interspersed elements, short interspersed elements, long terminal direct repeats, satellite, and simple repeats. The association of these repeat sequences to the LEs of the SC was confirmed by in situ hybridization of meiotic nuclei shown by both light and electron microscopy. Signals were also detected over the chromatin surrounding SCs and in small loops protruding from the lateral elements into the SC central region. We propose that genomic repeat DNA sequences play a key role in anchoring the chromosome to the protein scaffold of the SC.

  5. Long terminal repeat sequences from virulent and attenuated equine infectious anemia virus demonstrate distinct promoter activities.

    Science.gov (United States)

    Zhou, Tao; Yuan, Xiu-Fang; Hou, Shao-Hua; Tu, Ya-Bin; Peng, Jin-Mei; Wen, Jian-Xin; Qiu, Hua-Ji; Wu, Dong-Lai; Chen, Huan-Chun; Wang, Xiao-Jun; Tong, Guang-Zhi

    2007-09-01

    In the early 1970s, the Chinese Equine Infectious Anemia Virus (EIAV) vaccine, EIAV(DLA), was developed through successive passages of a wild-type virulent virus (EIAV(L)) in donkeys in vivo and then in donkey macrophages in vitro. EIAV attenuation and cell tropism adaptation are associated with changes in both the envelope and the long terminal repeat (LTR). However, specific LTR changes during Chinese EIAV attenuation have not been demonstrated. In this study, we compared LTR sequences from both virulent and attenuated EIAV strains and documented the diversity of LTR sequences from in vivo and in vitro infections. We found that the LTRs of virulent strains were homologous, whereas those of the EIAV vaccine strain were variable. Interestingly, experimental inoculation of EIAV(DLA) into a horse resulted in a restriction of the LTR variation. Furthermore, LTRs from EIAV(DLA) showed higher Tat-transactivated activity than LTRs from virulent strains. By using chimeric clones of wild-type LTR and vaccine LTR, the main difference in activity was mapped to changes in the R region rather than the U3 region.

  6. Transcriptome characterisation and simple sequence repeat marker discovery in the seagrass Posidonia oceanica

    Science.gov (United States)

    D’Esposito, D.; Orrù, L.; Dattolo, E.; Bernardo, L.; Lamontara, A.; Orsini, L.; Serra, I.A; Mazzuca, S.; Procaccini, G.

    2016-01-01

    Posidonia oceanica is an endemic seagrass in the Mediterranean Sea, where it provides important ecosystem services and sustains a rich and diverse ecosystem. P. oceanica meadows extend from the surface to 40 meters depth. With the aim of boosting research in this iconic species, we generated a comprehensive RNA-Seq data set for P. oceanica by sequencing specimens collected at two depths and two times during the day. With this approach we attempted to capture the transcriptional diversity associated with change in light and other depth-related environmental factors. Using this extensive data set we generated gene predictions and identified an extensive catalogue of potential Simple Sequence Repeats (SSR) markers. The data generated here will open new avenues for the analysis of population genetic features and functional variation in P. oceanica. In total, 79,235 contigs were obtained by the assembly of 70,453,120 paired end reads. 43,711 contigs were successfully annotated. A total of 17,436 SSR were identified within 13,912 contigs. PMID:27996971

  7. Short tandem repeat sequences in the Mycoplasma genitalium genome and their use in a multilocus genotyping system

    Directory of Open Access Journals (Sweden)

    Lillis Rebecca

    2008-07-01

    Background: Several methods have been reported for strain typing of Mycoplasma genitalium. The value of these methods has never been comparatively assessed. The aims of this study were: 1) to identify new potential genetic markers based on an analysis of short tandem repeat (STR) sequences in the published M. genitalium genome sequence; 2) to apply previously and newly identified markers to a panel of clinical strains in order to determine the optimal combination for an efficient multi-locus genotyping system; 3) to further confirm sexual transmission of M. genitalium using the newly developed system. Results: We performed a comprehensive analysis of STRs in the genome of the M. genitalium type strain G37 and identified 18 loci containing STRs. In addition to one previously studied locus, MG309, we chose two others, MG307 and MG338, for further study. Based on an analysis of 74 unrelated patient specimens from New Orleans and Scandinavia, the discriminatory indices (DIs) for these three markers were 0.9153, 0.7381 and 0.8730, respectively. Two other previously described markers, single nucleotide polymorphisms (SNPs) in the rRNA genes (rRNA-SNPs) and SNPs in the MG191 gene (MG191-SNPs), were found to have DIs of 0.5820 and 0.9392, respectively. A combination of MG309-STRs and MG191-SNPs yielded almost perfect discrimination (DI = 0.9894). An additional finding was that the rRNA-SNPs distribution pattern differed significantly between Scandinavia and New Orleans. Finally, we applied multi-locus typing to further confirm sexual transmission using specimens from 74 unrelated patients and 31 concurrently infected couples. Analysis of multi-locus genotype profiles using the five variable loci described above revealed that 27 of the couples had concordant genotype profiles, compared to only four examples of concordance among the 74 unrelated randomly selected patients. Conclusion: We propose that a combination of the MG309-STRs and MG191-SNPs is
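
    The abstract reports discriminatory indices without giving the formula. A minimal sketch follows, assuming the widely used Hunter-Gaston index (Simpson's index of diversity), DI = 1 - [1/(N(N-1))] Σ n_j(n_j - 1), where N is the number of unrelated isolates and n_j the number of isolates sharing type j. The abstract does not confirm that this exact formula was used, and the function name and sample genotypes below are hypothetical.

```python
from collections import Counter


def discriminatory_index(types):
    """Hunter-Gaston discriminatory index for a list of type labels.

    Returns the probability that two isolates drawn at random
    (without replacement) belong to different types.
    """
    n = len(types)
    if n < 2:
        raise ValueError("need at least two isolates")
    # Number of ordered pairs of isolates that share a type.
    same_type_pairs = sum(c * (c - 1) for c in Counter(types).values())
    return 1.0 - same_type_pairs / (n * (n - 1))


if __name__ == "__main__":
    # Hypothetical STR copy numbers at one locus for six unrelated specimens.
    copies = [8, 8, 9, 10, 11, 12]
    print(f"DI = {discriminatory_index(copies):.4f}")
```

    A DI close to 1 means that two randomly chosen unrelated specimens will almost always have different types, which is how the combined MG309-STR and MG191-SNP value of 0.9894 should be read.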

  8. The wheat homolog of putative nucleotide-binding site-leucine-rich repeat resistance gene TaRGA contributes to resistance against powdery mildew.

    Science.gov (United States)

    Wang, Defu; Wang, Xiaobing; Mei, Yu; Dong, Hansong

    2016-03-01

    Powdery mildew, one of the most destructive wheat diseases worldwide, is caused by Blumeria graminis f. sp. tritici (Bgt), a fungal species with a consistently high mutation rate that makes individual resistance (R) genes ineffective. Therefore, effective resistance-related gene cloning is vital for breeding and studying the resistance mechanisms of the disease. In this study, a putative nucleotide-binding site-leucine-rich repeat (NBS-LRR) R gene (TaRGA) was cloned using a homology-based cloning strategy and analyzed for its effect on powdery mildew disease and wheat defense responses. Real-time reverse transcription-PCR (RT-PCR) analyses revealed that a Bgt isolate 15 and salicylic acid stimulation significantly induced TaRGA in the resistant variety. Furthermore, the silencing of TaRGA in powdery mildew-resistant plants increased susceptibility to Bgt15 and prompted conidia propagation at the infection site. However, the expression of TaRGA in leaf segments after single-cell transient expression assay highly increased the defense responses to Bgt15 by enhancing callose deposition and phenolic autofluorogen accumulation at the pathogen invading sites. Meanwhile, the expression of pathogenesis-related genes decreased in the TaRGA-silenced plants and increased in the TaRGA-transient-overexpressing leaf segments. These results implied that the TaRGA gene positively regulates the defense response to powdery mildew disease in wheat.

  9. Creation and structure determination of an artificial protein with three complete sequence repeats

    Energy Technology Data Exchange (ETDEWEB)

    Adachi, Motoyasu, E-mail: adachi.motoyasu@jaea.go.jp; Shimizu, Rumi; Kuroki, Ryota [Japan Atomic Energy Agency, Shirakatashirane 2-4, Nakagun Tokaimura, Ibaraki 319-1195 (Japan); Blaber, Michael [Japan Atomic Energy Agency, Shirakatashirane 2-4, Nakagun Tokaimura, Ibaraki 319-1195 (Japan); Florida State University, Tallahassee, FL 32306-4300 (United States)

    2013-11-01

    An artificial protein with three complete sequence repeats was created and the structure was determined by X-ray crystallography. The structure showed threefold symmetry even though there are amino and carboxy termini. The artificial protein with threefold symmetry may be useful as a scaffold to capture small materials with C3 symmetry. Symfoil-4P is a de novo protein exhibiting the threefold symmetrical β-trefoil fold designed based on the human acidic fibroblast growth factor. The first three asparagine–glycine sequences of Symfoil-4P are replaced with glutamine–glycine (Symfoil-QG) or serine–glycine (Symfoil-SG) sequences to protect against deamidation, and His-Symfoil-II was prepared by introducing a protease digestion site into Symfoil-QG so that Symfoil-II has three complete repeats after removal of the N-terminal histidine tag. The Symfoil-QG and SG and His-Symfoil-II proteins were expressed in Escherichia coli as soluble protein, and purified by nickel affinity chromatography. Symfoil-II was further purified by anion-exchange chromatography after removing the HisTag by proteolysis. Both Symfoil-QG and Symfoil-II were crystallized in 0.1 M Tris-HCl buffer (pH 7.0) containing 1.8 M ammonium sulfate as precipitant at 293 K; several crystal forms were observed for Symfoil-QG and II. The maximum diffraction of Symfoil-QG and II crystals was 1.5 and 1.1 Å resolution, respectively. The Symfoil-II without histidine tag diffracted better than Symfoil-QG with N-terminal histidine tag. Although the crystal packing of Symfoil-II is slightly different from Symfoil-QG and other crystals of Symfoil derivatives having the N-terminal histidine tag, the refined crystal structure of Symfoil-II showed pseudo-threefold symmetry as expected from other Symfoils. Since the removal of the unstructured N-terminal histidine tag did not affect the threefold structure of Symfoil, the improvement of diffraction quality of Symfoil-II may be caused by molecular characteristics of

  10. Assessment of the labelling accuracy of Spanish semipreserved anchovies products by FINS (forensically informative nucleotide sequencing).

    Science.gov (United States)

    Velasco, Amaya; Aldrey, Anxela; Pérez-Martín, Ricardo I; Sotelo, Carmen G

    2016-06-01

    Anchovies have been traditionally captured and processed for human consumption for millennia. In the case of Spain, ripened and salted anchovies are a delicacy, which, in some cases, can reach high commercial values. Although there have been a number of studies presenting DNA methodologies for the identification of anchovies, this is one of the first studies investigating the level of mislabelling in this kind of product in Europe. Sixty-three commercial semipreserved anchovy products were collected in different types of food markets in four Spanish cities to check labelling accuracy. Species determination in these commercial products was performed by sequencing two different cyt-b mitochondrial DNA fragments. Results revealed mislabelling levels higher than 15%, which the authors consider relatively high given the importance of the product. The most frequent substitute species was the Argentine anchovy, Engraulis anchoita, which can be interpreted as an economic fraud.

  11. Characterization of comparative genome-derived simple sequence repeats for acanthopterygian fishes.

    Science.gov (United States)

    Gotoh, Ryo O; Tamate, Satoshi; Yokoyama, Jun; Tamate, Hidetoshi B; Hanzawa, Naoto

    2013-05-01

    Simple sequence repeats (SSRs) have become one of the most popular molecular markers for population genetic studies. The application of SSR markers has often been limited to source species because SSR loci are too labile to be maintained in even closely related species. However, a few extremely conserved SSR loci have been reported. Here, we tested for the presence of conserved SSR loci in acanthopterygian fishes, which include over 14 000 species, by comparing the genome sequences of four acanthopterygian fishes. We also examined the comparative genome-derived SSRs (CG-SSRs) for their transferability across acanthopterygian fishes and their applicability to population genetic analysis. Forty-six SSR loci with conserved flanking regions were detected and examined for their transferability among seven nonacanthopterygian and 27 acanthopterygian fishes. The PCR amplification success rate in nonacanthopterygian fishes was low, ranging from 2.2% to 21.7%, except for Lophius litulon (Lophiiformes; 80.4%). Conversely, the rate in most acanthopterygian fishes exceeded 70.0%. Sequencing of these 46 loci revealed the presence of SSRs suitable for scoring while fragment analysis of 20 loci revealed polymorphisms in most of the acanthopterygian fishes. Population genetic analysis of Cottus pollux (Scorpaeniformes) and Sphaeramia orbicularis (Perciformes) using CG-SSRs showed that these populations did not deviate from linkage equilibrium or Hardy-Weinberg equilibrium. Furthermore, almost no loci showed evidence of null alleles, suggesting that CG-SSRs have strong resolving power for population genetic analysis. Our findings will facilitate the use of these markers in species in which markers remain to be identified.

  12. Next generation sequencing (NGS) database for tandem repeats with multiple pattern 2°-shaft multicore string matching

    Directory of Open Access Journals (Sweden)

    Chinta Someswara Rao

    2016-03-01

    Next generation sequencing (NGS) technologies have been rapidly applied in biomedical and biological research in recent years. To provide a comprehensive NGS resource for research, in this paper, we have considered 10 loci/codi/repeats: TAGA, TCAT, GAAT, AGAT, AGAA, GATA, TATC, CTTT, TCTG and TCTA. Then we developed the NGS Tandem Repeat Database (TandemRepeatDB) for all the chromosomes of the Homo sapiens, Callithrix jacchus, Chlorocebus sabaeus, Gorilla gorilla, Macaca fascicularis, Macaca mulatta, Nomascus leucogenys, Pan troglodytes, Papio anubis and Pongo abelii genome data sets for all those loci. We find the successive occurrence frequency for all of the above 10 SSRs (simple sequence repeats) in the above genome data sets on a chromosome-by-chromosome basis with multiple pattern 2° shaft multicore string matching.

  13. Next generation sequencing (NGS) database for tandem repeats with multiple pattern 2°-shaft multicore string matching

    Science.gov (United States)

    Someswara Rao, Chinta; Raju, S. Viswanadha

    2016-01-01

    Next generation sequencing (NGS) technologies have been rapidly applied in biomedical and biological research in recent years. To provide a comprehensive NGS resource for research, in this paper, we have considered 10 loci/codi/repeats: TAGA, TCAT, GAAT, AGAT, AGAA, GATA, TATC, CTTT, TCTG and TCTA. Then we developed the NGS Tandem Repeat Database (TandemRepeatDB) for all the chromosomes of the Homo sapiens, Callithrix jacchus, Chlorocebus sabaeus, Gorilla gorilla, Macaca fascicularis, Macaca mulatta, Nomascus leucogenys, Pan troglodytes, Papio anubis and Pongo abelii genome data sets for all those loci. We find the successive occurrence frequency for all of the above 10 SSRs (simple sequence repeats) in the above genome data sets on a chromosome-by-chromosome basis with multiple pattern 2° shaft multicore string matching. PMID:26981434

  14. Next generation sequencing (NGS) database for tandem repeats with multiple pattern 2°-shaft multicore string matching.

    Science.gov (United States)

    Someswara Rao, Chinta; Raju, S Viswanadha

    2016-03-01

    Next generation sequencing (NGS) technologies have been rapidly applied in biomedical and biological research in recent years. To provide a comprehensive NGS resource for research, in this paper, we have considered 10 loci/codi/repeats: TAGA, TCAT, GAAT, AGAT, AGAA, GATA, TATC, CTTT, TCTG and TCTA. Then we developed the NGS Tandem Repeat Database (TandemRepeatDB) for all the chromosomes of the Homo sapiens, Callithrix jacchus, Chlorocebus sabaeus, Gorilla gorilla, Macaca fascicularis, Macaca mulatta, Nomascus leucogenys, Pan troglodytes, Papio anubis and Pongo abelii genome data sets for all those loci. We find the successive occurrence frequency for all of the above 10 SSRs (simple sequence repeats) in the above genome data sets on a chromosome-by-chromosome basis with multiple pattern 2° shaft multicore string matching.
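
    To make "successive occurrence frequency" concrete, the sketch below finds runs of back-to-back copies of each of the ten tetranucleotide motifs in a sequence and tallies how often each run length occurs. It uses an ordinary regular-expression scan from the Python standard library, not the authors' multiple pattern 2°-shaft multicore matcher, whose details are not given in the abstract. The function names, the `min_copies` threshold and the toy sequence are illustrative assumptions.

```python
import re
from collections import Counter

# The ten motifs listed in the abstract.
MOTIFS = ["TAGA", "TCAT", "GAAT", "AGAT", "AGAA",
          "GATA", "TATC", "CTTT", "TCTG", "TCTA"]


def tandem_runs(sequence, motif, min_copies=2):
    """Return (start, copies) for each run of >= min_copies adjacent motifs."""
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_copies},}}")
    return [(m.start(), len(m.group()) // len(motif))
            for m in pattern.finditer(sequence.upper())]


def run_length_frequencies(sequence, motifs=MOTIFS, min_copies=2):
    """Map each motif to a Counter of {copies_in_run: number_of_runs}."""
    return {motif: Counter(copies for _, copies
                           in tandem_runs(sequence, motif, min_copies))
            for motif in motifs}


if __name__ == "__main__":
    # Toy sequence standing in for a chromosome (hypothetical data).
    toy = "CCTAGATAGATAGACCGGTCTATCTAGG"
    for motif, freq in run_length_frequencies(toy).items():
        if freq:
            print(motif, dict(freq))
```

    A regex scan handles one motif per pass and runs on a single core; a real chromosome-scale survey of this kind would more plausibly use a multi-pattern, parallel matcher such as the one the record describes.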

  15. A resource of genome-wide single-nucleotide polymorphisms generated by RAD tag sequencing in the critically endangered European eel

    DEFF Research Database (Denmark)

    Pujolar, J.M.; Jacobsen, M.W.; Frydenberg, J.

    2013-01-01

    Reduced representation genome sequencing such as restriction-site-associated DNA (RAD) sequencing is finding increased use to identify and genotype large numbers of single-nucleotide polymorphisms (SNPs) in model and nonmodel species. We generated a unique resource of novel SNP markers for the European eel using the RAD sequencing approach that was simultaneously identified and scored in a genome-wide scan of 30 individuals. Whereas genomic resources are increasingly becoming available for this species, including the recent release of a draft genome, no genome-wide set of SNP markers...

  16. The nucleotide sequence of the chicken thymidine kinase gene and the relationship of its predicted polypeptide to that of the vaccinia virus thymidine kinase.

    Science.gov (United States)

    Kwoh, T J; Engler, J A

    1984-05-11

    The entire DNA nucleotide sequence of a 3.0 kilobase pair Hind III fragment containing the chicken cytoplasmic thymidine kinase gene was determined. Oligonucleotide linker insertion mutations distributed throughout this gene and having known effects upon gene activity (Kwoh, T.J., Zipser, D., and Wigler, M. 1983. J. Mol. Appl. Genet. 2, 191-200) were used to access regions of the Hind III fragment for sequencing reactions. The complete nucleotide sequence, together with the positions of the linker insertion mutations within the sequence, allows us to propose a structure for the chicken thymidine kinase gene. The protein coding sequence of the gene is divided into seven small segments (each less than 160 base pairs) by six small introns (each less than 230 base pairs). The proposed 244 amino acid polypeptide encoded by this gene bears strong homology to the vaccinia virus thymidine kinase. No homology with the thymidine kinases of the herpes simplex viruses was found.

  17. DNA and RNA from Uninfected Vertebrate Cells Contain Nucleotide Sequences Related to the Putative Transforming Gene of Avian Myelocytomatosis Virus

    Science.gov (United States)

    Sheiness, Diana; Bishop, J. Michael

    1979-01-01

    The avian carcinoma virus MC29 (MC29V) contains a sequence of approximately 1,500 nucleotides which may represent a gene responsible for tumorigenesis by MC29V. We present evidence that MC29V has acquired this nucleotide sequence from the DNA of its host. The host sequence which has been incorporated by MC29V is transcribed into RNA in uninfected chicken cells and thus probably encodes a cellular gene. We have prepared radioactive DNA complementary to the putative MC29V transforming gene (cDNAmc29) and have found that sequences homologous to cDNAmc29 are present in the genomes of several uninfected vertebrate species. The DNA of chicken, the natural host for MC29V, contains at least 90% of the sequences represented by cDNAmc29. DNAs from other animals show significant but decreasing amounts of complementarity to cDNAmc29 in accordance with their evolutionary divergence from chickens; the thermal stabilities of duplexes formed between cDNAmc29 and avian DNAs also reflect phylogenetic divergence. Sequences complementary to cDNAmc29 are transcribed into approximately 10 copies per cell of polyadenylated RNA in uninfected chicken fibroblasts. Thus, the vertebrate homolog of cDNAmc29 may be a gene which has been conserved throughout vertebrate evolution and which served as a progenitor for the putative transforming gene of MC29V. Recent experiments suggest that the putative transforming gene of avian erythroblastosis virus, like that of MC29V, may have arisen by incorporation of a host gene (Stehelin et al., personal communication). These findings for avian erythroblastosis virus and MC29V closely parallel previous results, suggesting a host origin for src (D. H. Spector, B. Baker, H. E. Varmus, and J. M. Bishop, Cell 13:381-386, 1978; D. H. Spector, K. Smith, T. Padgett, P. McCombe, D. Roulland-Dussoix, C. Moscovici, H. E. Varmus, and J. M. Bishop, Cell 13:371-379, 1978; D. H. Spector, H. E. Varmus, and J. M. Bishop, Proc. Natl. Acad. Sci. U.S.A. 75:4102-4106, 1978; D

  18. Nucleotide and protein sequences for dog masticatory tropomyosin identify a novel Tpm4 gene product.

    Science.gov (United States)

    Brundage, Elizabeth A; Biesiadecki, Brandon J; Reiser, Peter J

    2015-10-01

    Jaw-closing muscles of several vertebrate species, including members of Carnivora, express a unique, "masticatory", isoform of myosin heavy chain, along with isoforms of other myofibrillar proteins that are not expressed in most other muscles. It is generally believed that the complement of myofibrillar isoforms in these muscles serves high force generation for capturing live prey, breaking down tough plant material and defensive biting. A unique isoform of tropomyosin (Tpm) was reported to be expressed in cat jaw-closing muscle, based upon two-dimensional gel mobility, peptide mapping, and immunohistochemistry. The objective of this study was to obtain protein and gene sequence information for this unique Tpm isoform. Samples of masseter (a jaw-closing muscle), tibialis (predominantly fast-twitch fibers), and the deep lateral gastrocnemius (predominantly slow-twitch fibers) were obtained from adult dogs. Expressed Tpm isoforms were cloned and sequencing yielded cDNAs that were identical to genomic predicted striated muscle Tpm1.1St(a,b,b,a) (historically referred to as αTpm), Tpm2.2St(a,b,b,a) (βTpm) and Tpm3.12St(a,b,b,a) (γTpm) isoforms (nomenclature reflects predominant tissue expression ("St"-striated muscle) and exon splicing pattern), as well as a novel 284 amino acid isoform observed in jaw-closing muscle that is identical to a genomic predicted product of the Tpm4 gene (δTpm) family. The novel isoform is designated as Tpm4.3St(a,b,b,a). The myofibrillar Tpm isoform expressed in dog masseter exhibits a unique electrophoretic mobility on gels containing 6 M urea, compared to other skeletal Tpm isoforms. To validate that the cloned Tpm4.3 isoform is the Tpm expressed in dog masseter, E. coli-expressed Tpm4.3 was electrophoresed in the presence of urea. Results demonstrate that Tpm4.3 has identical electrophoretic mobility to the unique dog masseter Tpm isoform and is of different mobility from that of muscle Tpm1.1, Tpm2.2 and Tpm3.12 isoforms. We

  19. Complete nucleotide sequence and host range of South African cassava mosaic virus: further evidence for recombination amongst begomoviruses.

    Science.gov (United States)

    Berrie, L C; Rybicki, E P; Rey, M E

    2001-01-01

    Complete nucleotide sequences of the DNA-A (2800 nt) and DNA-B (2760 nt) components of a novel cassava-infecting begomovirus, South African cassava mosaic virus (SACMV), were determined and compared with various New World and Old World begomoviruses. SACMV is most closely related to East African cassava mosaic virus (EACMV) in both its DNA-A (85% with EACMV-MH and -MK) and -B (90% with EACMV-UG2-Mld and EACMV-UG3-Svr) components; however, percentage sequence similarities of less than 90% in the DNA-A component allowed SACMV to be considered a distinct virus. One significant recombination event spanning the entire AC4 open reading frame was identified; however, there was no evidence of recombination in the DNA-B component. Infectivity of the cloned SACMV genome was demonstrated by successful agroinoculation of cassava and three other plant species (Phaseolus vulgaris, Malva parviflora and Nicotiana benthamiana). This is the first description of successful infection of cassava with a geminivirus using Agrobacterium tumefaciens.

  20. Nucleotide sequence and spatial expression pattern of a drought- and abscisic acid-induced gene of tomato.

    Science.gov (United States)

    Plant, A L; Cohen, A; Moses, M S; Bray, E A

    1991-11-01

    The nucleotide sequence of le16, a tomato (Lycopersicon esculentum Mill.) gene induced by drought stress and regulated by abscisic acid specifically in aerial vegetative tissue, is presented. The single open reading frame contained within the gene has the capacity to encode a polypeptide of 12.7 kilodaltons and is interrupted by a small intron. The predicted polypeptide is rich in leucine, glycine, and alanine and has an isoelectric point of 8.7. The amino terminus is hydrophobic and characteristic of signal sequences that target polypeptides for export from the cytoplasm. There is homology (47.2% identity) between the amino terminus of the LE 16 polypeptide and the corresponding amino terminal domain of the maize phospholipid transfer protein. le16 was expressed in drought-stressed leaf, petiole, and stem tissue and to a much lower extent in the pericarp of mature green tomato fruit and developing seeds. No expression was detected in the pericarp of red fruit or in drought-stressed roots. Expression of le16 was also induced in leaf tissue by a variety of other abiotic stresses including polyethylene glycol-mediated water deficit, salinity, cold stress, and heat stress. None of these stresses or direct applications of abscisic acid induced the expression of le16 in the roots of the same plants. The unique expression characteristics of this gene indicate that novel regulatory mechanisms, in addition to endogenous abscisic acid, are involved in controlling gene expression.

  1. Complete nucleotide sequences of mitochondrial genomes of two solitary entoprocts, Loxocorone allax and Loxosomella aloxiata: implications for lophotrochozoan phylogeny.

    Science.gov (United States)

    Yokobori, Shin-ichi; Iseto, Tohru; Asakawa, Shuichi; Sasaki, Takashi; Shimizu, Nobuyoshi; Yamagishi, Akihiko; Oshima, Tairo; Hirose, Euichi

    2008-05-01

    The complete nucleotide sequences of the mitochondrial (mt) genomes of the entoprocts Loxocorone allax and Loxosomella aloxiata were determined. Both species carry the typical gene set of metazoan mt genomes and have similar organizations of their mt genes. However, they show differences in the positions of two tRNA(Leu) genes. Additionally, the tRNA(Val) gene, and half of the long non-coding region, is duplicated and inverted in the Loxos. aloxiata mt genome. The initiation codon of the Loxos. aloxiata cytochrome oxidase subunit I gene is expected to be ACG rather than AUG. The mt gene organizations in these two entoproct species most closely resemble those of mollusks such as Katharina tunicata and Octopus vulgaris, which have the most evolutionarily conserved mt gene organization reported to date in mollusks. Analyses of the mt gene organization in the lophotrochozoan phyla (Annelida, Brachiopoda, Echiura, Entoprocta, Mollusca, Nemertea, and Phoronida) suggested a close phylogenetic relationship between Brachiopoda, Annelida, and Echiura. However, Phoronida was excluded from this grouping. Molecular phylogenetic analyses based on the sequences of mt protein-coding genes suggested a possible close relationship between Entoprocta and Phoronida, and a close relationship among Brachiopoda, Annelida, and Echiura.

  2. Nucleotide sequence analysis of NIPBL gene in Indian Cornelia de Lange syndrome cases.

    Science.gov (United States)

    Bajaj, Shailesh; Ranade, Suvidya; Gambhir, Prakash

    2013-01-01

    Cornelia de Lange syndrome (CdLS) is a multisystem developmental disorder in children. The disorder is caused mainly by mutations in Nipped-B-like protein. Molecular data for CdLS are available from developed countries, but not from developing countries like India. In the present study, the hotspot region of the NIPBL gene, comprising exons 2, 22 and 42 and the largest exon, exon 10, was screened by polymerase chain reaction (PCR) in six CdLS patients and ten controls. The target exons were amplified by PCR, the amplicons were confirmed qualitatively by agarose gel electrophoresis, and the amplicons were then subjected to conformation-sensitive gel electrophoresis to detect heteroduplex formation, followed by sequencing. We report two polymorphisms in the studied region of the NIPBL gene: a C/A polymorphism in intron 1 and a T/G polymorphism in exon 10. The intronic polymorphism may have a role in intron splicing, whereas the polymorphism in exon 10 results in an amino acid change (Val to Gly). These polymorphisms are disease associated, as they were found in CdLS patients only and not in controls.

  3. Chromosomal organizations of major repeat families on potato (Solanum tuberosum) and further exploring in its sequenced genome.

    Science.gov (United States)

    Tang, Xiaomin; Datema, Erwin; Guzman, Myriam Olortegui; de Boer, Jan M; van Eck, Herman J; Bachem, Christian W B; Visser, Richard G F; de Jong, Hans

    2014-12-01

    One of the most powerful technologies for unraveling the organization of a eukaryotic plant genome is high-resolution fluorescent in situ hybridization of repeats and single-copy DNA sequences on pachytene chromosomes. This technology allows the integration of physical mapping information with chromosomal positions, including centromeres, telomeres, the nucleolar-organizing region, and euchromatin and heterochromatin. In this report, we established the chromosomal positions of different repeat fractions of potato genomic DNA (Cot100, Cot500 and Cot1000). We also analysed various repeat elements that are unique to potato, including the moderately repetitive P5 and REP2 elements, where REP2 is part of a larger Gypsy-type LTR retrotransposon and covers most chromosome regions, with some brighter fluorescing spots in the heterochromatin. The most abundant tandem repeat is the potato genomic repeat 1, which covers subtelomeric regions of most chromosome arms. Extensive multiple alignments of these repetitive sequences in the assembled RH89-039-16 potato BACs and the draft assembly of the DM1-3 516 R44 genome shed light on the conservation of these repeats within the potato genome. The consensus sequences thus obtained revealed the native complete transposable elements from which they were derived.

  4. Longitudinal study of a heteroplasmic 3460 Leber hereditary optic neuropathy family by multiplexed primer-extension analysis and nucleotide sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Ghosh, S.S.; Fahy, E. [Applied Genetics, San Diego, CA (United States); Bodis-Wollner, I. [State Univ. of New York College of Optometry, New York, NY (United States)] [and others

    1996-02-01

    Nucleotide-sequencing and multiplexed primer-extension assays have been used to quantitate the mutant-allele frequency in 14 maternal relatives, spanning three generations, from a family that is heteroplasmic for the primary Leber hereditary optic neuropathy (LHON) mutation at nucleotide 3460 of the mitochondrial genome. There was excellent agreement between the values that were obtained with the two different methods. The longitudinal study shows that the mutant-allele frequency was constant within individual family members over a sampling period of 3.5 years. Second, although there was an overall increase in the mutant-allele frequency in successive generations, segregation in the direction of the mutant allele was not invariant, and there was one instance in which there was a significant decrease in the frequency from parent to offspring. From these two sets of results, and from previous studies of heteroplasmic LHON families, we conclude that there is no evidence for a marked selective pressure that determines the replication, segregation, or transmission of primary LHON mutations to white blood cells and platelets. Instead, the mtDNA molecules are most likely to replicate and segregate under conditions of random drift at the cellular level. Finally, the pattern of transmission in this maternal lineage is compatible with a developmental bottleneck model in which the number of mitochondrial units of segregation in the female germ line is relatively small in relation to the number of mtDNA molecules within a cell. However, this is not an invariant pattern for humans, and simple models of mitochondrial gene transmission are inappropriate at the present time. 37 refs., 4 figs., 1 tab.

  5. Emergence of gynodioecy in wild beet (Beta vulgaris ssp. maritima L.): a genealogical approach using chloroplastic nucleotide sequences

    Science.gov (United States)

    Fénart, Stéphane; Touzet, Pascal; Arnaud, Jean-François; Cuguen, Joël

    2006-01-01

    Gynodioecy is a breeding system where both hermaphroditic and female individuals coexist within plant populations. This dimorphism is the result of a genomic interaction between maternally inherited cytoplasmic male sterility (CMS) genes and bi-parentally inherited nuclear male fertility restorers. As opposed to other gynodioecious species, where every cytoplasm seems to be associated with male sterility, wild beet Beta vulgaris ssp. maritima exhibits a minority of sterilizing cytoplasms among numerous non-sterilizing ones. Many studies on population genetics have explored the molecular diversity of different CMS cytoplasms, but questions remain concerning their evolutionary dynamics. In this paper we report one of the first investigations on phylogenetic relationships between CMS and non-CMS lineages. We investigated the phylogenetic relationships between 35 individuals exhibiting different mitochondrial haplotypes. Relying on the high linkage disequilibrium between chloroplastic and mitochondrial genomes, we chose to analyse the nucleotide sequence diversity of three chloroplastic fragments (trnK intron, trnD–trnT and trnL–trnF intergenic spacers). Nucleotide diversity appeared to be low, suggesting a recent bottleneck during the evolutionary history of B. vulgaris ssp. maritima. Statistical parsimony analyses revealed a star-like genealogy and showed that sterilizing haplotypes all belong to different lineages derived from an ancestral non-sterilizing cytoplasm. These results suggest a rapid evolution of male sterility in this taxon. The emergence of gynodioecy in wild beet is confronted with theoretical expectations, describing either gynodioecy dynamics as the maintenance of CMS factors through balancing selection or as a constant turnover of new CMSs. PMID:16777728

  6. Polyadenylation of RNA transcribed from mammalian SINEs by RNA polymerase III: Complex requirements for nucleotide sequences.

    Science.gov (United States)

    Borodulina, Olga R; Golubchikova, Julia S; Ustyantsev, Ilia G; Kramerov, Dmitri A

    2016-02-01

    It is generally accepted that only transcripts synthesized by RNA polymerase II (e.g., mRNA) were subject to AAUAAA-dependent polyadenylation. However, we previously showed that RNA transcribed by RNA polymerase III (pol III) from mouse B2 SINE could be polyadenylated in an AAUAAA-dependent manner. Many species of mammalian SINEs end with the pol III transcriptional terminator (TTTTT) and contain hexamers AATAAA in their A-rich tail. Such SINEs were united into Class T(+), whereas SINEs lacking the terminator and AATAAA sequences were classified as T(-). Here we studied the structural features of SINE pol III transcripts that are necessary for their polyadenylation. Eight and six SINE families from classes T(+) and T(-), respectively, were analyzed. The replacement of AATAAA with AACAAA in T(+) SINEs abolished the RNA polyadenylation. Interestingly, insertion of the polyadenylation signal (AATAAA) and pol III transcription terminator in T(-) SINEs did not result in polyadenylation. The detailed analysis of three T(+) SINEs (B2, DIP, and VES) revealed areas important for the polyadenylation of their pol III transcripts: the polyadenylation signal and terminator in A-rich tail, β region positioned immediately downstream of the box B of pol III promoter, and τ region located upstream of the tail. In DIP and VES (but not in B2), the τ region is a polypyrimidine motif which is also characteristic of many other T(+) SINEs. Most likely, SINEs of different mammals acquired these structural features independently as a result of parallel evolution. Copyright © 2015 Elsevier B.V. All rights reserved.

  7. Neuropeptidergic Signaling in the American Lobster Homarus americanus: New Insights from High-Throughput Nucleotide Sequencing.

    Science.gov (United States)

    Christie, Andrew E; Chi, Megan; Lameyer, Tess J; Pascual, Micah G; Shea, Devlin N; Stanhope, Meredith E; Schulz, David J; Dickinson, Patsy S

    2015-01-01

    Peptides are the largest and most diverse class of molecules used for neurochemical communication, playing key roles in the control of essentially all aspects of physiology and behavior. The American lobster, Homarus americanus, is a crustacean of commercial and biomedical importance; lobster growth and reproduction are under neuropeptidergic control, and portions of the lobster nervous system serve as models for understanding the general principles underlying rhythmic motor behavior (including peptidergic neuromodulation). While a number of neuropeptides have been identified from H. americanus, and the effects of some have been investigated at the cellular/systems levels, little is currently known about the molecular components of neuropeptidergic signaling in the lobster. Here, a H. americanus neural transcriptome was generated and mined for sequences encoding putative peptide precursors and receptors; 35 precursor- and 41 receptor-encoding transcripts were identified. We predicted 194 distinct neuropeptides from the deduced precursor proteins, including members of the adipokinetic hormone-corazonin-like peptide, allatostatin A, allatostatin C, bursicon, CCHamide, corazonin, crustacean cardioactive peptide, crustacean hyperglycemic hormone (CHH), CHH precursor-related peptide, diuretic hormone 31, diuretic hormone 44, eclosion hormone, FLRFamide, GSEFLamide, insulin-like peptide, intocin, leucokinin, myosuppressin, neuroparsin, neuropeptide F, orcokinin, pigment dispersing hormone, proctolin, pyrokinin, SIFamide, sulfakinin and tachykinin-related peptide families. While some of the predicted peptides are known H. americanus isoforms, most are novel identifications, more than doubling the extant lobster neuropeptidome. The deduced receptor proteins are the first descriptions of H. americanus neuropeptide receptors, and include ones for most of the peptide groups mentioned earlier, as well as those for ecdysis-triggering hormone, red pigment concentrating hormone

  8. Neuropeptidergic Signaling in the American Lobster Homarus americanus: New Insights from High-Throughput Nucleotide Sequencing.

    Directory of Open Access Journals (Sweden)

    Andrew E Christie

    Full Text Available Peptides are the largest and most diverse class of molecules used for neurochemical communication, playing key roles in the control of essentially all aspects of physiology and behavior. The American lobster, Homarus americanus, is a crustacean of commercial and biomedical importance; lobster growth and reproduction are under neuropeptidergic control, and portions of the lobster nervous system serve as models for understanding the general principles underlying rhythmic motor behavior (including peptidergic neuromodulation). While a number of neuropeptides have been identified from H. americanus, and the effects of some have been investigated at the cellular/systems levels, little is currently known about the molecular components of neuropeptidergic signaling in the lobster. Here, a H. americanus neural transcriptome was generated and mined for sequences encoding putative peptide precursors and receptors; 35 precursor- and 41 receptor-encoding transcripts were identified. We predicted 194 distinct neuropeptides from the deduced precursor proteins, including members of the adipokinetic hormone-corazonin-like peptide, allatostatin A, allatostatin C, bursicon, CCHamide, corazonin, crustacean cardioactive peptide, crustacean hyperglycemic hormone (CHH), CHH precursor-related peptide, diuretic hormone 31, diuretic hormone 44, eclosion hormone, FLRFamide, GSEFLamide, insulin-like peptide, intocin, leucokinin, myosuppressin, neuroparsin, neuropeptide F, orcokinin, pigment dispersing hormone, proctolin, pyrokinin, SIFamide, sulfakinin and tachykinin-related peptide families. While some of the predicted peptides are known H. americanus isoforms, most are novel identifications, more than doubling the extant lobster neuropeptidome. The deduced receptor proteins are the first descriptions of H. americanus neuropeptide receptors, and include ones for most of the peptide groups mentioned earlier, as well as those for ecdysis-triggering hormone, red pigment

  9. Characterization of the genome of molluscum contagiosum virus type 1 between the genome coordinates 0.045 and 0.075 by DNA nucleotide sequence analysis of a 5.6-kb HindIII/MluI DNA fragment.

    Science.gov (United States)

    Hadasch, R P; Bugert, J J; Janssen, W; Darai, G

    1993-01-01

    The complete DNA nucleotide sequence of a HindIII/MluI genomic DNA fragment (0.045-0.075 viral map units) from molluscum contagiosum virus type 1 (MCV-1) was determined. The HindIII/MluI DNA fragment comprises 5,646 bp with a base composition of 64.4% G + C and 35.6% A + T. The DNA sequence contains many perfect direct repeats. A cluster of three repetitive DNA elements R1, R2 and R3, with a complex structural arrangement was detected between nucleotide positions 1802 and 2107. The unit length (box) of the repetitive DNA sequences was found to be 6 bp (15 boxes) and 9 bp (24 boxes) for R1 and R2, respectively. The repetitive DNA element R3 is organized in fifteen boxes (15 bp) in which a unit length of R1 is combined with a unit length of R2. The arrangement of the repetition R3 within the DNA sequences of this particular region of the MCV-1 genome was found to be (5 x R3) + (2 x R2) + (1 x R3) + (6 x R2) + (1 x R3) + (1 x R2) + (8 x R3). Twenty-three open reading frames (ORFs) of 60-1,175 amino acid (AA) residues were detected. The largest ORF (number 17) comprises 1,175 AA with a predicted molecular weight of 126 kD. This ORF harbors a promoter signal which is located 21 nucleotides upstream from the start codon and is very similar to the early promoter signals known for vaccinia virus. This putative protein contains glutamine-enriched regions between AA residues 427 and 682 which show homologies to the corresponding glutamine-enriched regions of a variety of cellular genes like human transcriptional initiation factor (TFIID: TATA box factor).
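
    The repeat arithmetic in this abstract can be checked with a short Python sketch. The unit lengths and the arrangement are taken from the abstract; the function names are illustrative, and treating positions 1802 and 2107 as inclusive endpoints is an assumption made here, not something the abstract states.

```python
# Unit lengths in bp, as given in the abstract.
# R3 is described as one R1 unit combined with one R2 unit (6 + 9 = 15 bp).
UNIT_BP = {"R1": 6, "R2": 9, "R3": 15}

# Arrangement reported in the abstract, as (copies, element) in order.
ARRANGEMENT = [
    (5, "R3"), (2, "R2"), (1, "R3"), (6, "R2"),
    (1, "R3"), (1, "R2"), (8, "R3"),
]


def cluster_length(arrangement, unit_bp):
    """Total length in bp of the repeat cluster."""
    return sum(copies * unit_bp[name] for copies, name in arrangement)


def box_counts(arrangement):
    """Count boxes per element; each R3 box also contains one R1 and one R2 box."""
    counts = {"R1": 0, "R2": 0, "R3": 0}
    for copies, name in arrangement:
        counts[name] += copies
        if name == "R3":
            counts["R1"] += copies
            counts["R2"] += copies
    return counts


total_bp = cluster_length(ARRANGEMENT, UNIT_BP)
span_bp = 2107 - 1802 + 1  # assumption: both endpoints are inclusive
counts = box_counts(ARRANGEMENT)
```

    Under the inclusive-endpoint assumption, the summed arrangement and the reported coordinate span agree, and the box counts match the 15 (R1), 24 (R2) and 15 (R3) boxes quoted in the abstract.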

  10. Complete Nucleotide Sequence of a South African Isolate of Grapevine Fanleaf Virus and Its Associated Satellite RNA

    Directory of Open Access Journals (Sweden)

    Johan T. Burger

    2013-07-01

    Full Text Available The complete sequences of RNA1, RNA2 and satellite RNA have been determined for a South African isolate of Grapevine fanleaf virus (GFLV-SACH44). The two RNAs of GFLV-SACH44 are 7,341 nucleotides (nt) and 3,816 nt in length, respectively, and its satellite RNA (satRNA) is 1,104 nt in length, all excluding the poly(A) tail. Multiple sequence alignment of these sequences showed that GFLV-SACH44 RNA1 and RNA2 were the closest to the South African isolate, GFLV-SAPCS3 (98.2% and 98.6% nt identity, respectively), followed by the French isolate, GFLV-F13 (87.3% and 90.1% nt identity, respectively). Interestingly, the GFLV-SACH44 satRNA is more similar to three Arabis mosaic virus satRNAs (85%–87.4% nt identity) than to the satRNA of GFLV-F13 (81.8% nt identity) and was most distantly related to the satRNA of GFLV-R2 (71.0% nt identity). Full-length infectious clones of GFLV-SACH44 satRNA were constructed. The infectivity of the clones was tested with three nepovirus isolates, GFLV-NW, Arabis mosaic virus (ArMV)-NW and GFLV-SAPCS3. The clones were mechanically inoculated in Chenopodium quinoa and were infectious when co-inoculated with the two GFLV helper viruses, but not when co-inoculated with ArMV-NW.

  11. The nucleotide sequence of the high-leukemogenic murine retrovirus SL3-3 reveals a patch of mink cell focus forming-like sequences upstream of the ecotropic envelope gene. Brief report

    DEFF Research Database (Denmark)

    Lund, Anders Henrik; Pedersen, F S

    1999-01-01

    We report the complete nucleotide sequence of the potent T-lymphomagenic murine retrovirus SL3-3. The non-LTR regions of the virus show 98% sequence identity to the endogenous ecotropic Akv murine leukemia virus. While the region encoding the surface envelope protein is completely identical to th...

  12. The complete nucleotide sequence and multipartite organization of the tobacco mitochondrial genome: comparative analysis of mitochondrial genomes in higher plants.

    Science.gov (United States)

    Sugiyama, Y; Watase, Y; Nagase, M; Makita, N; Yagura, S; Hirai, A; Sugiura, M

    2005-02-01

    Tobacco is a valuable model system for investigating the origin of mitochondrial DNA (mtDNA) in amphidiploid plants and studying the genetic interaction between mitochondria and chloroplasts in the various functions of the plant cell. As a first step, we have determined the complete mtDNA sequence of Nicotiana tabacum. The mtDNA of N. tabacum can be assumed to be a master circle (MC) of 430,597 bp. Sequence comparison of a large number of clones revealed that there are four classes of boundaries derived from homologous recombination, which leads to a multipartite organization with two MCs and six subgenomic circles. The mtDNA of N. tabacum contains 36 protein-coding genes, three ribosomal RNA genes and 21 tRNA genes. Among the first class, we identified the genes rps1 and psirps14, which had previously been thought to be absent in tobacco mtDNA on the basis of Southern analysis. Tobacco mtDNA was compared with those of Arabidopsis thaliana, Beta vulgaris, Oryza sativa and Brassica napus. Since repeated sequences show no homology to each other among the five angiosperms, it can be supposed that these were independently acquired by each species during the evolution of angiosperms. The gene order and the sequences of intergenic spacers in mtDNA also differ widely among the five angiosperms, indicating multiple reorganizations of genome structure during the evolution of higher plants. Among the conserved genes, the same potential conserved nonanucleotide-motif-type promoter could only be postulated for rrn18-rrn5 in four of the dicotyledonous plants, suggesting that a coding sequence does not necessarily move with the promoter upon reorganization of the mitochondrial genome.

  13. Evaluation of biological activities of highly diluted nucleotide sequences by using cellular models

    Directory of Open Access Journals (Sweden)

    Pierre Dorfman

    2012-09-01

    Full Text Available Background: highly diluted specific nucleic acids (SNA®), designed to modulate viral and cytokine gene expression, are currently used in Micro-Immunotherapy to treat viral infections and immune disorders. Although some preliminary studies have shown clinical benefit of these homeopathic preparations [1], no experimental data are available to explain their mechanism of action. Aims: to investigate the in vitro effect of two sets of highly diluted (HD) SNA targeting (i) latent/lytic Epstein-Barr virus (SNA EBV) and (ii) TNF-α and its receptor p55 involved in rheumatoid arthritis (SNA RA) on cellular models. Methodology: serial homeopathic dilutions of SNA EBV and SNA RA (15cH-18cH) were tested on an EBV-positive B-lymphoblastoid (B95-8) and on an LPS-stimulated macrophage (THP1) cell line, respectively, in comparison with agitated/diluted water and scramble DNA sequences prepared in the same conditions (negative controls). For the B95-8 proliferative model, high mobility group box 1 protein (HMGB1) was used as reference. The biological parameters analyzed on B95-8 were (i) cell proliferation measured after 24 and 48 h of incubation with HD SNA and (ii) expression of the EBV ZEBRA protein in response to TGF-β by Western blotting (T+24h). For the THP1 model, TNF-α synthesis and release were determined by RT-qPCR and ELISA (protein), after stimulation by LPS (1 µg/ml) and HD SNA co-administration. Results: we demonstrated that HD SNA RA significantly down-regulated TNF-α synthesis and release. This biological activity was shown to be specific (no effect of HD scramble SNA) and related to the level of dilution (maximal effect with higher dilutions). Unexpectedly, a biological effect of agitated/diluted water was also detected in both cellular models. For the B95-8 model, this effect resulted in a significant decrease of B95-8 proliferation (comparable to the HMGB1 reference) and an inhibition of ZEBRA expression. Similarly, a reproducible

  14. Identification and characterization of simple sequence repeats (SSRs) for population studies of Puccinia novopanici.

    Science.gov (United States)

    Orquera-Tornakian, Gabriela K; Garrido, Patricia; Kronmiller, Brent; Hunger, Robert; Tyler, Brett M; Garzon, Carla D; Marek, Stephen M

    2017-08-01

    Switchgrass (Panicum virgatum L.) can be severely affected by rust disease. Recently switchgrass rust caused by P. emaculata (now confirmed to be Puccinia novopanici) has received most of the attention by the research community because this pathogen is responsible for reducing the biomass production and biofuel feedstock quality of switchgrass. Microsatellite markers found in the literature were either not informative (no allele frequency) or showed few polymorphisms in the target populations, therefore additional markers are needed for future studies of the genetic variation and population structure of P. novopanici. This study reports the development and characterization of novel simple sequence repeat (SSR) markers from a Puccinia emaculata s.l. microsatellite-enriched library and expressed sequence tags (ESTs). Microsatellites were evaluated for polymorphisms on P. emaculata s.l. urediniospores collected in Iowa (IA), Mississippi (MS), Oklahoma (OK), South Dakota (SD) and Virginia (VA). Puccinia novopanici single spore whole genome amplifications were used as templates to validate the SSR reactions protocol and to assess a preliminary population genetics statistics of the pathogen. Eighteen microsatellite markers were polymorphic (average PIC=0.72) on individual urediniospores, with an average of 8.3 alleles per locus (range 3 to 17). Of the 49 SSRs loci initially identified in P. emaculata s.l., 18 were transferable to P. striiformis f. sp. tritici, 23 to P. triticina, 20 to P. sorghi and 31 to P. andropogonis. Thus, these markers could be useful for DNA fingerprinting and population structure analysis for population genetics, epidemiology and ecological studies of P. novopanici and potentially other related Puccinia species. Copyright © 2017 Elsevier B.V. All rights reserved.
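
    The polymorphism information content (PIC) quoted above is commonly computed from allele frequencies with the formula of Botstein et al. (1980). The abstract does not say which formula or software the authors used, so the Python sketch below is an illustration under that assumption; the function name and the example frequencies are hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one locus.

    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    where p_i are the allele frequencies at the locus (summing to 1).
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical locus with two equally frequent alleles.
two_allele_pic = pic([0.5, 0.5])
```

    For a locus with two equally frequent alleles this gives 0.375, and the value approaches 1 as the number of evenly distributed alleles grows, which is why loci with many alleles tend to report high PIC values.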

  15. TPRpred: a tool for prediction of TPR-, PPR- and SEL1-like repeats from protein sequences

    Directory of Open Access Journals (Sweden)

    Söding Johannes

    2007-01-01

    Full Text Available Background: Solenoid repeat proteins of the Tetratrico Peptide Repeat (TPR) family are involved as scaffolds in a broad range of protein-protein interactions. Several resources are available for the prediction of TPRs; however, they often fail to detect divergent repeat units. Results: We have developed TPRpred, a profile-based method which uses a P-value-dependent score offset to include divergent repeat units and which exploits the tendency of repeats to occur in tandem. TPRpred detects not only TPR-like repeats, but also the related Pentatrico Peptide Repeats (PPRs) and SEL1-like repeats. The corresponding profiles were generated through iterative searches, by varying the threshold parameters for inclusion of repeat units into the profiles, and the best profiles were selected based on their performance on proteins of known structure. We benchmarked the performance of TPRpred in detecting TPR-containing proteins and in delineating the individual repeats therein, against currently available resources. Conclusion: TPRpred performs significantly better in detecting divergent repeats in TPR-containing proteins, and finds more individual repeats than the existing methods. The web server is available at http://tprpred.tuebingen.mpg.de, and the C++ and Perl sources of TPRpred along with the profiles can be downloaded from ftp://ftp.tuebingen.mpg.de/ebio/protevo/TPRpred/.

  16. Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

    Directory of Open Access Journals (Sweden)

    D. Satyanarayana Rao

    2007-02-01

    Full Text Available We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising less than 55 amino acid residues that occurs more than once in the protein sequence and is sometimes present in tandem. A “domain” corresponds to a conserved region with greater than 55 amino acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure.

  17. Nucleotide sequence and phylogeny of the tet (L) tetracycline resistance determinant encoded by the plasmid pSTE1 from Staphylococcus hyicus

    DEFF Research Database (Denmark)

    Schwarz, S.; Cardoso, M.; Wegener, Henrik Caspar

    1992-01-01

    The nucleotide sequence of the tetracycline resistance (tet) gene and its regulatory region, encoded by the plasmid pSTE1 from Staphylococcus hyicus, was determined. The tet gene was inducible by tetracycline and encoded a hydrophobic protein of 458 amino acids. Comparisons between the predicted...... amino acid sequences of the pSTE1-encoded Tet from S. hyicus and the previously sequenced Tet K variants from Staphylococcus aureus, Tet L variants from Bacillus cereus, Bacillus stearothermophilus, and Bacillus subtilis, Tet M variants from Streptococcus faecalis and Staphylococcus aureus as well as Tet...... variants on one hand and the Tet K and Tet L variants on the other hand. The pSTE1-encoded Tet proved to be closely related to the Tet L proteins originally found on small Bacillus plasmids. The observed extensive similarities in the nucleotide sequences of the tet genes and in the deduced Tet amino acid...

  18. Nucleotide sequence of the coat protein genes of alstroemeria mosaic virus and amazon lily mosaic virus, a tentative species of genus potyvirus.

    Science.gov (United States)

    Fuji, S; Terami, F; Furuya, H; Naito, H; Fukumoto, F

    2004-09-01

    The nucleotide sequences of the 3' terminal region of the genomes of Alstroemeria mosaic virus (AlsMV) and the Amazon lily mosaic virus (ALiMV) have been determined. These sequences contain the complete coding region of the viral coat protein (CP) gene followed by a 3'-untranslated region (3'-UTR). AlsMV and ALiMV share 74.9% identity in the amino acid sequence of the CP, and 55.6% identity in the nucleotide sequence of the 3'-UTR. Phylogenetic analysis of these CP genes and 3'-UTRs in relation to those of 79 potyvirus species revealed that AlsMV and ALiMV should be assigned to the Potato virus Y (PVY) subgroup. AlsMV and ALiMV were concluded to have arisen independently within the PVY subgroup.

  19. Identities among actin-encoding cDNAs of the Nile tilapia (Oreochromis niloticus) and other eukaryote species revealed by nucleotide and amino acid sequence analyses

    Directory of Open Access Journals (Sweden)

    Andréia B. Poletto

    2008-01-01

    Full Text Available Actin-encoding cDNAs of Nile tilapia (Oreochromis niloticus) were isolated by RT-PCR using total RNA samples of different tissues and further characterized by nucleotide sequencing and in silico amino acid (aa) sequence analysis. Comparisons among the actin gene sequences of O. niloticus and those of other species showed that the isolated genes are highly similar to the actin genes of other fish and other vertebrates. The highest nucleotide resemblance was observed between O. niloticus and O. mossambicus a-actin and b-actin genes. Analysis of the predicted aa sequences revealed two distinct types of cytoplasmic actins, one cardiac muscle actin type and one skeletal muscle actin type that were expressed in different tissues of Nile tilapia. The evolutionary relationships between the Nile tilapia actin genes and those of diverse other organisms are discussed.

  20. Genetic Diversity Assessment and Identification of New Sour Cherry Genotypes Using Intersimple Sequence Repeat Markers

    Directory of Open Access Journals (Sweden)

    Roghayeh Najafzadeh

    2014-01-01

    Full Text Available Iran is one of the chief origins of subgenus Cerasus germplasm. In this study, the genetic variation of new Iranian sour cherries (which had such superior growth characteristics and fruit quality as to be considered for the introduction of new cultivars) was investigated and identified using 23 intersimple sequence repeat (ISSR) markers. Results indicated a high level of polymorphism of the genotypes based on these markers. According to these results, the primers tested in this study, especially ISSR-4, ISSR-6, ISSR-13, ISSR-14, ISSR-16, and ISSR-19, produced good and varied levels of amplification and can be effectively used in genetic studies of the sour cherry. The genetic similarity analysis showed a high diversity among the genotypes. Cluster analysis separated improved cultivars from promising Iranian genotypes, and the PCoA supported the cluster analysis results. Since the Iranian genotypes were superior to the improved cultivars and were separated from them in most groups, these genotypes can be considered as distinct genotypes for further evaluations in the framework of breeding programs and new cultivar identification in cherries. Results also confirmed that ISSR is a reliable DNA marker that can be used for exact genetic studies and in sour cherry breeding programs.

  1. Molecular identification of Aquilaria spp. by using inter-simple sequence repeat (ISSR)

    Science.gov (United States)

    Azhari, Hanif; Mohamad, Azhar; Othman, Roohaida

    2015-09-01

    Aquilaria species are economically important plants for the production of a resin known locally in Malaysia as gaharu. Five species can be found in Malaysia, and the most important Aquilaria species for gaharu production is A. malaccensis. Molecular markers for Aquilaria species are still insufficient, and more efficient, robust and reproducible markers are required. Inter-simple sequence repeat (ISSR) markers are highly polymorphic and highly reproducible, which makes them useful in studies of genetic diversity, phylogenetics, gene tagging, genome mapping and evolutionary biology in a wide range of crop species. Five selected ISSR primers were used to identify four Aquilaria species commonly found in Malaysia, namely A. malaccensis, A. sub-integra, A. crassna and A. hirta. All the primers showed sufficient polymorphism to distinguish between the four species. Hence, the markers derived from ISSR can be used for molecular identification of Aquilaria spp., ensuring homogeneous species for plantation, which may improve the quality of resin derived from known and certified materials.

  2. Agarose gel electrophoresis and polyacrylamide gel electrophoresis for visualization of simple sequence repeats.

    Science.gov (United States)

    Anderson, James; Wright, Drew; Meksem, Khalid

    2013-01-01

    In the modern age of genetic research there is a constant search for ways to improve the efficiency of plant selection. The most recent technology that can result in a highly efficient means of selection and still be done at a low cost is through plant selection directed by simple sequence repeats (SSRs or microsatellites). The molecular markers are used to select for certain desirable plant traits without relying on ambiguous phenotypic data. The best way to detect these is the use of gel electrophoresis. Gel electrophoresis is a common technique in laboratory settings which is used to separate deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) by size. Loading DNA and RNA onto gels allows for visualization of the size of fragments through the separation of DNA and RNA fragments. This is achieved through the use of the charge in the particles. As the fragments separate, they form into distinct bands at set sizes. We describe the ability to visualize SSRs on slab gels of agarose and polyacrylamide gel electrophoresis.

  3. Genetic characterization of the gypsy moth from China (Lepidoptera, Lymantriidae) using inter simple sequence repeats markers.

    Directory of Open Access Journals (Sweden)

    Fang Chen

    Full Text Available This study provides the first genetic characterization of the gypsy moth from China (Lymantria dispar), one of the most recognized pests of forests and ornamental trees in the world. We assessed genetic diversity and structure in eight geographic populations of gypsy moths from China using five polymorphic inter simple sequence repeat markers, which produced reproducible banding patterns. We observed 102 polymorphic loci across the 176 individuals sampled. Overall genetic diversity (Nei's H) was 0.2357, while the mean genetic diversity within geographic populations was 0.1845 ± 0.0150. The observed genetic distance among the eight populations ranged from 0.0432 to 0.1034. Clustering analysis (using an unweighted pair-group method with arithmetic mean and multidimensional scaling) revealed strong concordance between the strength of genetic relationships among populations and their geographic proximity. Analysis of molecular variance demonstrated that 25.43% of the total variability (FST = 0.2543, P < 0.001) was attributable to variation among geographic populations. The results of our analyses investigating the degree of polymorphism, genetic diversity (Nei's and Shannon) and genetic structure suggest that individuals from Hebei may be better able to adapt to different environments and to disperse to new habitats. This study provides crucial genetic information needed to assess the distribution and population dynamics of this important pest species of global concern.

  4. Genetic diversity analysis of Lepidium sativum (Chandrasur) using inter simple sequence repeat (ISSR) markers

    Institute of Scientific and Technical Information of China (English)

    Amandeep Kaur; Rakesh Kumar; Suman Rani; Anita Grewal

    2015-01-01

    Lepidium sativum (commonly known as garden cress) belongs to the family Brassicaceae. It is a fast-growing, erect, annual herbaceous plant. Its seeds possess significant fracture healing, anti-asthmatic, anti-diabetic, hypoglycemic, nephrocurative and nephroprotective activities. In the present study, we assessed the genetic diversity of various genotypes of L. sativum using inter-simple sequence repeat (ISSR) markers. Out of 41 ISSR primers screened, 32 primers showed significant, clear and reproducible bands. A total of 510 amplified bands were obtained using 32 ISSR primers, out of which 422 bands were polymorphic and 88 bands were monomorphic. The percentage of polymorphism was found to be 82%. A total of 35 unique alleles ranging in size from 200 to 2,900 bp were observed. Cluster analysis based on the unweighted pair-group method with arithmetic mean divided the 18 genotypes into two main clusters, with the first having only the HCS-08 genotype of L. sativum and the other having all of the other 17 genotypes. The Jaccard similarity coefficient revealed a broad range of 32–72% genetic relatedness among the 18 genotypes.
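
    The two summary statistics in this abstract can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names and the two example band profiles are hypothetical, and only the band counts (422 polymorphic out of 510) come from the abstract.

```python
def percent_polymorphism(polymorphic: int, total: int) -> float:
    """Share of scored bands that are polymorphic, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * polymorphic / total


def jaccard(profile_a, profile_b) -> float:
    """Jaccard similarity between two band presence/absence (1/0) profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same bands")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return shared / either if either else 0.0


# Band counts reported in the abstract.
pct = percent_polymorphism(422, 510)

# Hypothetical band profiles for two genotypes (not data from the study).
similarity = jaccard([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])
```

    With the reported counts the percentage works out to roughly 82.7%, consistent with the rounded figure in the abstract. Bands absent from both genotypes are ignored by the Jaccard coefficient, which is why it is a common choice for dominant markers such as ISSRs.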

  5. Simple Sequence Repeat Genetic Linkage Maps of A-genome Diploid Cotton (Gossypium arboreum)

    Institute of Scientific and Technical Information of China (English)

    Xue-Xia Ma; Bao-Liang Zhou; Yan-Hui Lü; Wang-Zhen Guo; Tian-Zhen Zhang

    2008-01-01

    This study introduces the construction of the first intraspecific genetic linkage map of the A-genome diploid cotton with newly developed simple sequence repeat (SSR) markers, using 189 F2 plants derived from the cross of two Asiatic parents. Polymorphisms between the two parents were detected using 6 092 pairs of SSR primers. Two hundred and sixty-eight pairs of SSR primers with better polymorphisms were picked out to analyze the F2 population. In total, 320 polymorphic bands were generated and used to construct a linkage map with JoinMap3.0. Two hundred and sixty-seven loci, including three phenotypic traits, were mapped at a logarithm of odds ratio (LOD) ≥ 3.0 on 13 linkage groups. The total length of the map was 2 508.71 cM, and the average distance between adjacent markers was 9.40 cM. Chromosome assignments were made according to the association of linkages with our backbone tetraploid specific map using the 89 similar SSR loci. Comparisons among the 13 suites of orthologous linkage groups revealed that the A-genome chromosomes are largely collinear with the At and Dt sub-genome chromosomes. Chromosomes associated with inversions suggested that allopolyploidization was accompanied by homologous chromosomal rearrangement. The inter-chromosomal duplicated loci supply molecular evidence that the A-genome diploid Asiatic cotton is paleopolyploid.
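
    As a quick check on the reported map density, the sketch below recomputes the average marker spacing from the figures in the abstract (2 508.71 cM, 267 loci, 13 linkage groups). The variable names are illustrative. Dividing by the number of loci reproduces the reported 9.40 cM; dividing by the number of intervals (loci minus linkage groups) is an alternative convention shown only for comparison, and is an assumption rather than something the abstract states.

```python
MAP_LENGTH_CM = 2508.71   # total map length reported in the abstract
N_LOCI = 267              # mapped loci reported in the abstract
N_GROUPS = 13             # linkage groups reported in the abstract

# Convention that matches the reported 9.40 cM: length divided by loci.
spacing_per_locus = MAP_LENGTH_CM / N_LOCI

# Alternative convention (assumption): each group of k loci has k - 1 intervals.
n_intervals = N_LOCI - N_GROUPS
spacing_per_interval = MAP_LENGTH_CM / n_intervals

print(f"per locus:    {spacing_per_locus:.2f} cM")
print(f"per interval: {spacing_per_interval:.2f} cM")
```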

  6. Genetic characterization of autochthonous grapevine cultivars from Eastern Turkey by simple sequence repeats (SSRs)

    Directory of Open Access Journals (Sweden)

    Sadiye Peral Eyduran

    2016-01-01

    Full Text Available In this research, two well-recognized standard grape cultivars, Cabernet Sauvignon and Merlot, together with eight historical autochthonous grapevine cultivars from Eastern Anatolia in Turkey, were genetically characterized by using 12 pairs of simple sequence repeat (SSR) primers in order to evaluate their genetic diversity and relatedness. All of the SSR primers used produced successful amplifications and revealed DNA polymorphisms, which were subsequently utilized to evaluate the genetic relatedness of the grapevine cultivars. Allele richness was implied by the identification of 69 alleles in 8 autochthonous cultivars with a mean value of 5.75 alleles per locus. The average expected heterozygosity and observed heterozygosity were found to be 0.749 and 0.739, respectively. Taking into account the generated alleles, the highest number was recorded in the VVC2C3 and VVS2 loci (nine and eight alleles per locus, respectively), whereas the lowest number was recorded in VrZAG83 (three alleles per locus). Two main clusters were produced by using the unweighted pair-group method with arithmetic mean dendrogram constructed on the basis of the SSR data. Only the Cabernet Sauvignon and Merlot cultivars were included in the first cluster. The second cluster contained the rest of the autochthonous cultivars. The results obtained during the study clearly illustrate that SSR markers are an effective tool for fingerprinting grapevine cultivars and carrying out grapevine biodiversity studies. The obtained data are also meaningful references for grapevine domestication.
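
    The per-locus summary statistics quoted here follow from simple formulas. The sketch below recomputes the mean number of alleles per locus from the reported totals (69 alleles over 12 SSR loci) and shows one standard way to compute expected heterozygosity from allele frequencies, He = 1 - sum(p_i^2). The allele frequencies in the example are hypothetical, and the abstract does not say which estimator the authors used.

```python
def mean_alleles_per_locus(total_alleles: int, n_loci: int) -> float:
    """Average number of alleles detected per locus."""
    return total_alleles / n_loci


def expected_heterozygosity(allele_freqs) -> float:
    """Expected heterozygosity at one locus: 1 minus the sum of squared allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Totals reported in the abstract.
mean_alleles = mean_alleles_per_locus(69, 12)

# Hypothetical locus with three alleles (not data from the study).
he_example = expected_heterozygosity([0.5, 0.3, 0.2])
```

    The recomputed mean is 5.75 alleles per locus, matching the abstract.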

  7. Simple sequence repeats and compositional bias in the bipartite Ralstonia solanacearum GMI1000 genome

    Directory of Open Access Journals (Sweden)

    Vandamme Peter

    2003-03-01

    Full Text Available Background: Ralstonia solanacearum is an important plant pathogen. The genome of R. solanacearum GMI1000 is organised into two replicons (a 3.7-Mb chromosome and a 2.1-Mb megaplasmid), and this bipartite genome structure is characteristic of most R. solanacearum strains. To determine whether the megaplasmid was acquired via recent horizontal gene transfer or is part of an ancestral single chromosome, we compared the abundance, distribution and composition of simple sequence repeats (SSRs) between both replicons and also compared the respective compositional biases. Results: Our data show that both replicons are very similar with respect to the distribution and composition of SSRs and the presence of compositional biases. Minor variations in SSRs and compositional biases may be attributable to minor differences in gene expression and its regulation, or to the small sample numbers observed. Conclusions: The observed similarities indicate that both replicons have shared a similar evolutionary history and thus suggest that the megaplasmid was not recently acquired from other organisms by lateral gene transfer but is a part of an ancestral R. solanacearum chromosome.
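
    Comparing SSR abundance between replicons starts with locating the repeats. The following is a minimal sketch of a perfect-repeat finder based on a regular-expression back-reference. The function name, the unit-size limit and the minimum copy number are assumptions chosen for illustration; the abstract does not state the detection criteria used in the study.

```python
import re


def find_ssrs(seq: str, max_unit: int = 6, min_repeats: int = 3):
    """Return (start, unit, copies) for perfect tandem repeats in a DNA sequence.

    The shortest repeating unit (1..max_unit bases) is tried first, and a hit
    requires at least min_repeats consecutive copies of that unit.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Toy sequence (hypothetical): a (AC)4 dinucleotide run followed by a G3 run.
hits = find_ssrs("TTACACACACGGG")
print(hits)
```

    Running the finder separately on each replicon and normalising the hit counts by replicon length would give abundance figures that can be compared directly; imperfect repeats are not detected by this sketch.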

  8. Simple Sequence Repeat Analysis of Selected NSIC-registered Coffee Varieties in the Philippines

    Directory of Open Access Journals (Sweden)

    Daisy May C. Santos

    2016-06-01

    Full Text Available Coffee (Coffea sp.) is an important commercial crop worldwide. Three species of coffee are used for beverages, namely Coffea arabica, C. canephora, and C. liberica. Coffea arabica L. is the most cultivated among the three coffee species due to its taste quality, rich aroma, and low caffeine content. Despite its inferior taste and aroma, C. canephora Pierre ex A. Froehner, which has the highest caffeine content, is the second most widely cultivated because of its resistance to coffee diseases. On the other hand, C. liberica W.Bull ex Hiern is characterized by its very strong taste and flavor. The Philippines used to be a leading exporter of coffee until coffee rust destroyed the farms in Batangas, home of the famous Kapeng Barako. The country has been attempting to revive the coffee industry by focusing on the production of specialty coffee with varieties registered with the National Seed Industry Council (NSIC). Correct identification and isolation of pure coffee beans are the main factors that determine coffee’s market value. Local farms usually misidentify and mix coffee beans of different varieties, leading to the depreciation of their value. This study used simple sequence repeat (SSR) markers to evaluate and distinguish Philippine NSIC-registered coffee species and varieties. The neighbor-joining tree generated using PAUP showed high bootstrap support, separating C. arabica, C. canephora, and C. liberica from each other. Among the twenty primer pairs used, seven were able to distinguish C. arabica, nine C. liberica, and one C. canephora.

  9. Nucleotide sequence analyses of genomic RNAs of peanut stunt virus Mi, the type strain representative of a novel PSV subgroup from China

    NARCIS (Netherlands)

    Yan, L.; Xu, Z.; Goldbach, R.W.; Chen, Y.K.; Prins, M.W.

    2005-01-01

    The complete nucleotide sequence of Peanut stunt virus strain Mi (PSV-Mi) from China was determined and compared to other viruses of the genus Cucumovirus. The tripartite genome of PSV-Mi encoded five open reading frames (ORFs) typical of cucumoviruses. Distance analyses of four ORFs indicated that

  10. Detection of short repeated genomic sequences on metaphase chromosomes using padlock probes and target primed rolling circle DNA synthesis

    Directory of Open Access Journals (Sweden)

    Stougaard Magnus

    2007-11-01

    Full Text Available Background: In situ detection of short sequence elements in genomic DNA requires short probes with high molecular resolution and powerful specific signal amplification. Padlock probes can differentiate single base variations. Ligated padlock probes can be amplified in situ by rolling circle DNA synthesis and detected by fluorescence microscopy, thus enhancing PRINS type reactions, where localized DNA synthesis reports on the position of hybridization targets, to potentially reveal the binding of single oligonucleotide-size probe molecules. Such a system has been presented for the detection of mitochondrial DNA in fixed cells, whereas attempts to apply rolling circle detection to metaphase chromosomes have previously failed, according to the literature. Methods: Synchronized cultured cells were fixed with methanol/acetic acid to prepare chromosome spreads in teflon-coated diagnostic well-slides. Apart from the slide format and the chromosome spreading, everything was done essentially according to standard protocols. Hybridization targets were detected in situ with padlock probes, which were ligated and amplified using target primed rolling circle DNA synthesis, and detected by fluorescence labeling. Results: An optimized protocol for the spreading of condensed metaphase chromosomes in teflon-coated diagnostic well-slides was developed. Applying this protocol we generated specimens for target primed rolling circle DNA synthesis of padlock probes recognizing a 40 nucleotide sequence in the male specific repetitive satellite I sequence (DYZ1) on the Y-chromosome and a 32 nucleotide sequence in the repetitive kringle IV domain in the apolipoprotein(a) gene positioned on the long arm of chromosome 6. These targets were detected with good efficiency, but the efficiency on other target sites was unsatisfactory. Conclusion: Our aim was to test the applicability of the method used on mitochondrial DNA to the analysis of nuclear genomes, in particular as

  11. Nucleotide sequence of coat protein gene for GPV isolate of barley yellow dwarf virus and construction of expression plasmid for plant

    Institute of Scientific and Technical Information of China (English)

    成卓敏; 何小源; 吴茂森; 周广和; Paul Keese; P.M.Waterhouse

    1996-01-01

    GPV is a Chinese serotype isolate of barley yellow dwarf virus (BYDV) that has no reaction with antiserum of MAV, PAV, SGV, RPV and RMV. The sequence of the coat protein (CP) of the GPV isolate of BYDV was identified and its amino acid sequence was deduced. The coding region for the putative GPV CP is 603 nucleotides long and encodes a Mr 22218 (22 ku) protein. Like MAV, PAV and RPV, GPV contained a second ORF within the coat protein coding region. This protein of 17024 Mr (17 ku) is thought to correspond to the Virion protein genome linked (Vpg). Sequence comparisons of the CP coding region between the GPV isolate of BYDV and other isolates of BYDV have been done. The nucleotide and amino acid sequences of GPV have greater identity to those of RPV than to those of PAV and MAV. The GPV CP sequence shared 83.7% of nucleotide similarity and 77.5% of deduced amino acid similarity, whereas that of the PAV and MAV shared 56.9%, 53.2% and 44.1%, 43.8% respectively. According to BYDV-GPV CP seque

  12. Development of chloroplast simple sequence repeats (cpSSRs) for the intraspecific study of Gracilaria tenuistipitata (Gracilariales, Rhodophyta) from different populations.

    Science.gov (United States)

    Song, Sze-Looi; Lim, Phaik-Eem; Phang, Siew-Moi; Lee, Weng-Wah; Hong, Dang Diem; Prathep, Anchana

    2014-02-04

    Gracilaria tenuistipitata is an agarophyte with substantial economic potential because of its high growth rate and tolerance to a wide range of environment factors. This red seaweed is intensively cultured in China for the production of agar and fodder for abalone. Microsatellite markers were developed from the chloroplast genome of G. tenuistipitata var. liui to differentiate G. tenuistipitata obtained from six different localities: four from Peninsular Malaysia, one from Thailand and one from Vietnam. Eighty G. tenuistipitata specimens were analyzed using eight simple sequence repeat (SSR) primer-pairs that we developed for polymerase chain reaction (PCR) amplification. Five mononucleotide primer-pairs and one trinucleotide primer-pair exhibited monomorphic alleles, whereas the other two primer-pairs separated the G. tenuistipitata specimens into two main clades. G. tenuistipitata from Thailand and Vietnam were grouped into one clade, and the populations from Batu Laut, Middle Banks and Kuah (Malaysia) were grouped into another clade. The combined dataset of these two primer-pairs separated G. tenuistipitata obtained from Kelantan, Malaysia from that obtained from other localities. Based on the variations in repeated nucleotides of microsatellite markers, our results suggested that the populations of G. tenuistipitata were distributed into two main geographical regions: (i) populations in the west coast of Peninsular Malaysia and (ii) populations facing the South China Sea. The correct identification of G. tenuistipitata strains with traits of high economic potential will be advantageous for the mass cultivation of seaweeds.

  13. A distinct sequence in the adenine nucleotide translocase from Artemia franciscana embryos is associated with insensitivity to bongkrekate and atypical effects of adenine nucleotides on Ca2+ uptake and sequestration.

    Science.gov (United States)

    Konràd, Csaba; Kiss, Gergely; Töröcsik, Beata; Lábár, János L; Gerencser, Akos A; Mándi, Miklós; Adam-Vizi, Vera; Chinopoulos, Christos

    2011-03-01

    Mitochondria isolated from embryos of the crustacean Artemia franciscana lack the Ca(2+)-induced permeability transition pore. Although the composition of the pore described in mammalian mitochondria is unknown, the impacts of several effectors of the adenine nucleotide translocase (ANT) on pore opening are firmly established. Notably, ADP, ATP and bongkrekate delay, whereas carboxyatractyloside hastens, Ca(2+)-induced pore opening. Here, we report that adenine nucleotides decreased, whereas carboxyatractyloside increased, Ca(2+) uptake capacity in mitochondria isolated from Artemia embryos. Bongkrekate had no effect on either Ca(2+) uptake or ADP-ATP exchange rate. Transmission electron microscopy imaging of Ca(2+)-loaded Artemia mitochondria showed needle-like formations of electron-dense material in the absence of adenine nucleotides, and dot-like formations in the presence of adenine nucleotides or Mg(2+). Energy-filtered transmission electron microscopy showed the material to be rich in calcium and phosphorus. Sequencing of the Artemia mRNA coding for ANT revealed that it transcribes a protein with a stretch of amino acids in the 198-225 region with 48-56% similarity to those from other species, including the deletion of three amino acids in positions 211, 212 and 219. Mitochondria isolated from the liver of Xenopus laevis, in which the ANT shows similarity to that in Artemia except for the 198-225 amino acid region, demonstrated a Ca(2+)-induced bongkrekate-sensitive permeability transition pore, allowing the suggestion that this region of ANT may contain the binding site for bongkrekate.

  14. The nucleotide sequence and a first generation gene transfer vector of species B human adenovirus serotype 3.

    Science.gov (United States)

    Sirena, Dominique; Ruzsics, Zsolt; Schaffner, Walter; Greber, Urs F; Hemmi, Silvio

    2005-12-20

    Human adenovirus (Ad) serotype 3 causes respiratory infections. It is considered highly virulent, accounting for about 13% of all Ad isolates. We report here the complete Ad3 DNA sequence of 35,343 base pairs (GenBank accession DQ086466). Ad3 shares 96.43% nucleotide identity with Ad7, another virulent subspecies B1 serotype, and 82.56 and 62.75% identity with the less virulent species B2 Ad11 and species C Ad5, respectively. The genomic organization of Ad3 is similar to the other human Ads comprising five early transcription units, E1A, E1B, E2, E3, and E4, two delayed early units IX and IVa2, and the major late unit, in total 39 putative and 7 hypothetical open reading frames. A recombinant E1-deleted Ad3 was generated on a bacterial artificial chromosome. This prototypic virus efficiently transduced CD46-positive rodent and human cells. Our results will help in clarifying the biology and pathology of adenoviruses and enhance therapeutic applications of viral vectors in clinical settings.

  15. Genome sequence of Perigonia lusca single nucleopolyhedrovirus: insights into the evolution of a nucleotide metabolism enzyme in the family Baculoviridae

    Science.gov (United States)

    Ardisson-Araújo, Daniel M. P.; Lima, Rayane Nunes; Melo, Fernando L.; Clem, Rollie J.; Huang, Ning; Báo, Sônia Nair; Sosa-Gómez, Daniel R.; Ribeiro, Bergmann M.

    2016-01-01

    The genome of a novel group II alphabaculovirus, Perigonia lusca single nucleopolyhedrovirus (PeluSNPV), was sequenced and shown to contain 132,831 bp with 145 putative ORFs (open reading frames) of at least 50 amino acids. An interesting feature of this novel genome was the presence of a putative nucleotide metabolism enzyme-encoding gene (pelu112). The pelu112 gene was predicted to encode a fusion of thymidylate kinase (tmk) and dUTP diphosphatase (dut). Phylogenetic analysis indicated that baculoviruses have independently acquired tmk and dut several times during their evolution. Two homologs of the tmk-dut fusion gene were separately introduced into the Autographa californica multiple nucleopolyhedrovirus (AcMNPV) genome, which lacks tmk and dut. The recombinant baculoviruses produced viral DNA, virus progeny, and some viral proteins earlier during in vitro infection and the yields of viral occlusion bodies were increased 2.5-fold when compared to the parental virus. Interestingly, both enzymes appear to retain their active sites, based on separate modeling using previously solved crystal structures. We suggest that the retention of these tmk-dut fusion genes by certain baculoviruses could be related to accelerating virus replication and to protecting the virus genome from deleterious mutation. PMID:27273152

  16. [Variability of nucleotide sequences of the mitochondrial DNA cytochrome b gene in dolly varden and taranetz char].

    Science.gov (United States)

    Radchenko, O A; Derenko, M V; Maliarchuk, B A

    2000-07-01

    Nucleotide sequence of the 307-bp fragment of the mitochondrial DNA cytochrome b gene was determined in representatives of the three species of the Salvelinus genus, specifically, dolly varden char (S. malma), taranetz char (S. taranetzi), and white-spotted char (S. leucomaenis). These results pointed to a high level of mitochondrial DNA (mtDNA) divergence between white-spotted char and dolly varden char, on the one hand, and taranetz char, on the other (the mean d value was 5.45%). However, the divergence between the dolly varden char and taranetz char was only 0.81%, which is comparable with the level of intraspecific divergence in the dolly varden char (d = 0.87%). It was shown that the dolly varden char mitochondrial gene pool contained DNA lineages differing from the main mtDNA pool at least in the taranetz char-specific mitochondrial lineages. One of these dolly varden char mtDNA lineages was characterized by the presence of the restriction endonuclease MspI-D variant of the cytochrome b gene. This lineage was widely distributed in the Chukotka populations but it was not detected in the Yana River (Okhotsk sea) populations. These findings suggest that dolly varden char has a more ancient evolutionary lineage, diverging from the common ancestor earlier than did taranetz char.

  17. Specific-Locus Amplified Fragment Sequencing Reveals Spontaneous Single-Nucleotide Mutations in Rice OsMsh6 Mutants

    Directory of Open Access Journals (Sweden)

    Hairui Cui

    2017-01-01

    Full Text Available Genomic stability depends in part on efficient DNA lesion recognition and correction by the DNA mismatch repair (MMR) system. We investigated mutations arising spontaneously in rice OsMsh6 mutants by specific-locus amplified fragment sequencing. In total, 994 single-nucleotide mutations were identified in three mutants, and on average the mutation density is about 1/136.72 Kb per mutant line. These mutations were relatively randomly distributed in the genome and might be accumulated in a generation-dependent manner. All possible base transitions and base transversions could be seen, and the ratio of transitions to transversions was about 3.12. We also observed a nearest-neighbor bias around the mutated base. Our data suggest that OsMsh6 (LOC_Os09g24220) is important in ensuring genome stability by recognizing mismatches that arise spontaneously, and they provide useful information for investigating the function of the OsMsh6 gene in DNA repair and exploiting MMR mutants in rice induced mutation breeding.
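
    The transition-to-transversion ratio mentioned in this abstract is computed by classifying each substitution by base chemistry: purine-to-purine (A/G) and pyrimidine-to-pyrimidine (C/T) changes are transitions, and everything else is a transversion. The sketch below illustrates the calculation; the function names and the example substitution list are hypothetical and are not data from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for A<->G or C<->T substitutions, False for transversions."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(substitutions) -> float:
    """Ratio of transitions to transversions over (ref, alt) pairs."""
    subs = list(substitutions)
    transitions = sum(1 for ref, alt in subs if is_transition(ref, alt))
    transversions = len(subs) - transitions
    if transversions == 0:
        raise ValueError("no transversions; ratio is undefined")
    return transitions / transversions


# Hypothetical calls: three transitions and one transversion.
example_ratio = ti_tv_ratio([("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")])
```

    Because there are twice as many possible transversions as transitions, a ratio well above 0.5, such as the 3.12 reported here, indicates a strong transition bias.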

  18. Simultaneous Detection of Both Single Nucleotide Variations and Copy Number Alterations by Next-Generation Sequencing in Gorlin Syndrome.

    Science.gov (United States)

    Morita, Kei-ichi; Naruto, Takuya; Tanimoto, Kousuke; Yasukawa, Chisato; Oikawa, Yu; Masuda, Kiyoshi; Imoto, Issei; Inazawa, Johji; Omura, Ken; Harada, Hiroyuki

    2015-01-01

    Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are often unidentifiable in some patients; the failure to detect mutations is presumably because of mutations occurring in other causative genes or outside of analyzed regions of PTCH1, or copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions.

  19. Simultaneous Detection of Both Single Nucleotide Variations and Copy Number Alterations by Next-Generation Sequencing in Gorlin Syndrome.

    Directory of Open Access Journals (Sweden)

    Kei-ichi Morita

    Full Text Available Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are often unidentifiable in some patients; the failure to detect mutations is presumably because of mutations occurring in other causative genes or outside of analyzed regions of PTCH1, or copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions.

  20. Complete nucleotide sequence and genome organization of an endornavirus from bottle gourd (Lagenaria siceraria) in California, U.S.A.

    Science.gov (United States)

    Kwon, Sun-Jung; Tan, Shih-Hua; Vidalakis, Georgios

    2014-08-01

    The full-length nucleotide sequence and genome organization of an Endornavirus isolated from ornamental hard shell bottle gourd plants (Lagenaria siceraria (Molina) Standl.) in California (CA), USA tentatively named L. siceraria endornavirus-California (LsEV-CA) was determined. The LsEV-CA genome was 15088 bp in length, with a G + C content of 36.55 %. The lengths of the 5' and 3' untranslated regions were 111 and 52 bp, respectively. The genome of LsEV-CA contained one large ORF encoding a 576 kDa polyprotein. The predicted protein contains two glycosyltransferase motifs, as well as RNA-dependent RNA polymerase and helicase domains. LsEV-CA was detected in healthy-looking field-grown gourd plants, as well as plants expressing yellows symptoms. It was also detected in non-symptomatic greenhouse-grown gourd seedlings grown from seed obtained from the same field sites. These preliminary data indicate that LsEV-CA is likely not associated with the gourd-yellows syndrome observed in the field.
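
    The genome figures in this abstract can be cross-checked with simple arithmetic. The sketch below derives the length of the single ORF from the genome and untranslated-region lengths, checks that it is a whole number of codons, and estimates the mean residue mass of the 576 kDa polyprotein. It assumes that the ORF spans everything between the two UTRs and that its last codon is a stop codon; neither is stated explicitly in the abstract. A small helper for G + C content is included, with a toy sequence as input.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


GENOME_NT = 15088         # genome length from the abstract
UTR5_NT, UTR3_NT = 111, 52  # untranslated region lengths from the abstract
POLYPROTEIN_DA = 576_000  # 576 kDa polyprotein from the abstract

# Assumption: the single ORF occupies the whole region between the UTRs.
orf_nt = GENOME_NT - UTR5_NT - UTR3_NT
codons, leftover = divmod(orf_nt, 3)

# Assumption: the final codon is a stop codon and encodes no residue.
residues = codons - 1
mean_residue_da = POLYPROTEIN_DA / residues

print(orf_nt, codons, leftover, round(mean_residue_da, 1))
print(gc_content("GGCCAATT"))  # toy sequence, not viral data
```

    Under these assumptions the ORF length is divisible by three and the mean residue mass comes out near 116 Da, which is in the usual range for proteins, so the reported lengths and mass are mutually consistent.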

  1. Analysis of mitochondrial control region nucleotide sequences from Baffin Bay beluga (Delphinapterus leucas): detecting pods or sub-populations?

    Directory of Open Access Journals (Sweden)

    Per Jakob Palsbøll

    2002-07-01

    Full Text Available We report the results of an analysis of the variation in the nucleotide sequence of the mitochondrial control region obtained in 218 samples collected from belugas, Delphinapterus leucas, around the Baffin Bay. We detected multiple instances of significant heterogeneity in the distribution of genetic variation among the analyzed mitochondrial control region sequences on a spatial as well as temporal scale, indicating a high degree of maternal population structure. The detection of significant levels of heterogeneity between samples collected in different years but within the same area and season was unexpected. Re-examination of earlier results presented by Brown Gladden and coworkers also revealed temporal genetic heterogeneity within the one area where sufficient (n>15) samples were collected in multiple years. These findings suggest that non-random breeding and maternally directed site-fidelity are not the sole causes of genetic heterogeneity among belugas but that a matrilineal pod structure might cause significant levels of genetic heterogeneity as well, even within the same area. We propose that a maternal pod structure, which has been shown to be the cause of significant genetic heterogeneity in other odontocetes, may add to the overall level of heterogeneity in the maternally inherited DNA and hence that much of the spatial heterogeneity observed in this and previous studies might be attributed to pod rather than population structure. Our findings suggest that it is important to estimate the contribution of pod structure to overall heterogeneity before defining populations or management units in order to avoid interpreting heterogeneity due to sampling of different pods as different populations/management units.

  2. Triplet repeat sequences in human DNA can be detected by hybridization to a synthetic (5'-CGG-3')17 oligodeoxyribonucleotide

    DEFF Research Database (Denmark)

    Behn-Krappa, A; Mollenhauer, J; Doerfler, W

    1993-01-01

    The seemingly autonomous amplification of naturally occurring triplet repeat sequences in the human genome has been implicated in the causation of human genetic disease, such as the fragile X (Martin-Bell) syndrome, myotonic dystrophy (Curshmann-Steinert), spinal and bulbar muscular atrophy...

  3. Variability of United States isolates of Macrophomina phaseolina based on simple sequence repeats and cross genus transferability to related Botryosphaeraceae

    Science.gov (United States)

    Twelve simple sequence repeat (SSRs) loci were used to evaluate genetic diversity of 109 isolates of Macrophomina phaseolina collected from different geographical regions and host species throughout the United States (U.S.). Genetic diversity was assessed using Nei’s minimum genetic distance and th...

  4. Distribution and evolution of repeated sequences in genomes of Triatominae (Hemiptera-Reduviidae) inferred from genomic in situ hybridization.

    Directory of Open Access Journals (Sweden)

    Sebastian Pita

    Full Text Available The subfamily Triatominae, vectors of Chagas disease, comprises 140 species characterized by a highly homogeneous chromosome number. We analyzed the chromosomal distribution and evolution of repeated sequences in Triatominae genomes by genomic in situ hybridization (GISH) using Triatoma delpontei and Triatoma infestans genomic DNAs as probes. Hybridizations were performed on their own chromosomes and on nine species included in six genera from the two main tribes: Triatomini and Rhodniini. Genomic probes clearly generate two different hybridization patterns, dispersed or accumulated in specific regions or chromosomes. The three probes used generate the same hybridization pattern in each species. However, these patterns are species-specific. In closely related species, the probes strongly hybridized in the autosomal heterochromatic regions, resembling C-banding and DAPI patterns. However, in more distant species these co-localizations are not observed. The heterochromatic Y chromosome is constituted by highly repeated sequences, which are conserved among 10 species of the Triatomini tribe, suggesting that this is an ancestral character for this group. However, the Y chromosome in the Rhodniini tribe is markedly different, supporting the early evolutionary dichotomy between both tribes. In some species, sex chromosomes and autosomes shared repeated sequences, suggesting meiotic chromatin exchanges among these heterologous chromosomes. Our GISH analyses enabled us to acquire not only reliable information about the distribution of autosomal repeated sequences but also an insight into sex chromosome evolution in Triatominae. Furthermore, the differentiation obtained by GISH might be a valuable marker to establish phylogenetic relationships and to test the controversial origin of the Triatominae subfamily.

  5. Effects of GABA[subscript A] Modulators on the Repeated Acquisition of Response Sequences in Squirrel Monkeys

    Science.gov (United States)

    Campbell, Una C.; Winsauer, Peter J.; Stevenson, Michael W.; Moerschbaecher, Joseph M.

    2004-01-01

    The present study investigated the effects of positive and negative GABA[subscript A] modulators under three different baselines of repeated acquisition in squirrel monkeys in which the monkeys acquired a three-response sequence on three keys under a second-order fixed-ratio (FR) schedule of food reinforcement. In two of these baselines, the…

  6. Diversity, population structure, and evolution of local peach cultivars in China identified by simple sequence repeats.

    Science.gov (United States)

    Shen, Z J; Ma, R J; Cai, Z X; Yu, M L; Zhang, Z

    2015-01-15

    The fruit peach originated in China and has a history of domestication of more than 4000 years. Numerous local cultivars were selected during the long course of cultivation, and a great morphological diversity exists. To study the diversity and genetic background of local peach cultivars in China, a set of 158 accessions from different ecological regions, together with 27 modern varieties and 10 wild accessions, were evaluated using 49 simple sequence repeats (SSRs) covering the peach genome. Broad diversity was also observed in local cultivars at the SSR level. A total of 648 alleles were amplified with an average of 13.22 observed alleles per locus. The number of genotypes detected ranged from 9 (UDP96015) to 58 (BPPCT008) with an average of 27.00 genotypes per marker. Eight subpopulations divided by STRUCTURE basically coincided with the dendrogram of genetic relationships and could be explained by the traditional groups. The 8 subpopulations were juicy honey peach, southwestern peach I, wild peach, Buddha peach + southwestern peach II, northern peach, southern crisp peach, ornamental peach, and Prunus davidiana + P. kansuensis. Most modern varieties carried the genetic backgrounds of juicy honey peach and southwestern peach I, while others carried diverse genetic backgrounds, indicating that local cultivars were partly used in modern breeding programs. Based on the traditional evolution pathway, a modified pathway for the development of local peach cultivars in China was proposed using the genetic background of subpopulations that were identified by SSRs. Current status and prospects of utilization of Chinese local peach cultivars were also discussed according to the SSR information.

  7. Genetic Diversity and Structure of Lolium Species Surveyed on Nuclear Simple Sequence Repeat and Cytoplasmic Markers

    Directory of Open Access Journals (Sweden)

    Hongwei Cai

    2017-04-01

    To assess the genetic diversity and population structure of Lolium species, we used 32 nuclear simple sequence repeat (SSR) markers and 7 cytoplasmic gene markers to analyze a total of 357 individuals from 162 accessions of 9 Lolium species. This survey revealed a high level of polymorphism, with an average number of alleles per locus of 23.59 and 5.29 and an average PIC-value of 0.83 and 0.54 for nuclear SSR markers and cytoplasmic gene markers, respectively. Analysis of molecular variance (AMOVA) revealed that 16.27 and 16.53% of the total variation was due to differences among species, with the remaining 56.35 and 83.47% due to differences within species and 27.39 and 0% due to differences within individuals in 32 nuclear SSR markers set and 6 chloroplast gene markers set, respectively. The 32 nuclear SSR markers detected three subpopulations among 357 individuals, whereas the 6 chloroplast gene markers revealed three subpopulations among 160 accessions in the STRUCTURE analysis. In the clustering analysis, the three inbred species clustered into a single group, whereas the outbreeding species were clearly divided, especially according to nuclear SSR markers. In addition, almost all Lolium multiflorum populations were clustered into group C4, which could be further divided into three subgroups, whereas Lolium perenne populations primarily clustered into two groups (C2 and C3), with a few lines that instead grouped with L. multiflorum (C4) or Lolium rigidum (C6). Together, these results will be useful for the use of Lolium germplasm for improvement and will increase the effectiveness of ryegrass breeding.
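
    The abstract reports average PIC (polymorphism information content) values but does not say how they were computed. As a minimal sketch, assuming the commonly used Botstein-style definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, the calculation for one locus could look like this. The function name and the allele frequencies are hypothetical and are not taken from the study.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one locus.

    Assumes the Botstein-style formula; allele_freqs must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two random alleles are identical.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term for uninformative matings between like heterozygotes.
    cross = sum(2 * p * p * q * q for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical locus with four equally frequent alleles.
    print(round(pic([0.25, 0.25, 0.25, 0.25]), 4))
```

    A locus with many alleles at similar frequencies scores close to 1, while a monomorphic locus scores 0, which is why multi-allelic SSR loci tend to have higher PIC than cytoplasmic markers.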

  8. Genetic Diversity of Landraces in Gossypium arboreum L. Race sinense Assessed with Simple Sequence Repeat Markers

    Institute of Scientific and Technical Information of China (English)

    Wang-Zhen Guo; Bao-Liang Zhou; Lu-Ming Yang; Wei Wang; Tian-Zhen Zhang

    2006-01-01

    Asiatic cotton (Gossypium arboreum L.) is an "Old World" cultivated cotton species, the sinense race of which is planted extensively in China. This species is still used in the current tetraploid cotton breeding program as an elite germplasm line, and is also used as a model for genomic research in Gossypium. In the present study, 60 cotton microsatellite markers, averaging 4.6 markers for each A-genome chromosome, were chosen to assess the genetic diversity of 109 accessions. These included 106 G. arboreum landraces, collected from 18 provinces throughout four Asiatic cotton-growing regions in China. A total of 128 alleles were detected, with an average of 2.13 alleles per locus. The largest number of alleles, as well as the maximum number of polymorphic loci, was detected in the A03 linkage group. No polymorphic alleles were detected on chromosome 10. The polymorphism information content for the 22 polymorphic microsatellite loci varied from 0.52 to 0.98, with an average of 0.89. Genetic diversity analysis revealed that the landraces in the Southern region had more genetic variability than those from the other two regions, and no significant difference was detected between landraces in the Yangtze and the Yellow River Valley regions. These findings are consistent with the history of sinense introduction, with the Southern region being the presumed center of origin for Chinese Asiatic cotton, and with subsequent northeastward extension to the Yangtze and Yellow River Valleys. Cluster analysis, based on simple sequence repeat data for 60 microsatellite loci, clearly differentiated Vietnamese and G. herbaceum landraces from the sinense landrace. No relationship between inter-variety similarity and geographical ecological region was observed. The present findings indicate that the Southern region landraces may have been directly introduced into the provinces in the middle and lower Yangtze River Valley, where Asiatic cotton was most extensively grown, and further race

  9. Simple sequence repeat marker associated with a natural leaf defoliation trait in tetraploid cotton.

    Science.gov (United States)

    Abdurakhmonov, I Y; Abdullaev, A A; Saha, S; Buriev, Z T; Arslanov, D; Kuryazov, Z; Mavlonov, G T; Rizaeva, S M; Reddy, U K; Jenkins, J N; Abdullaev, A; Abdukarimov, A

    2005-01-01

    Cotton (Gossypium hirsutum L.) leaf defoliation has a significant ecological and economical impact on cotton production. Thus the utilization of a natural leaf defoliation trait, which exists in wild diploid cotton species, in the development of tetraploid cultivated cotton will not only be cost effective, but will also facilitate production of very high-grade fiber. The primary goal of our research was to tag loci associated with natural leaf defoliation using microsatellite markers in Upland cotton. The F2 populations developed from reciprocal crosses between the two parental cotton lines--AN-Boyovut-2 (2n = 52), a late leaf defoliating type, and Listopad Beliy (2n = 52), a naturally early leaf defoliating type--demonstrated that the naturally early leaf defoliation trait has heritability values of 0.74 and 0.84 in the reciprocal F2 populations. The observed phenotypic segregation difference in reciprocal crosses suggested a minor cytoplasmic effect in the phenotypic expression of the naturally early leaf defoliation trait. Results from the Kruskal-Wallis (KW) nonparametric test revealed that the simple sequence repeat (SSR) markers JESPR-13 (KW = 6.17), JESPR-153 (KW = 9.97), and JESPR-178 (KW = 13.45) are significantly associated with natural leaf defoliation in the mapping population, with stable estimates at empirically obtained critical thresholds (P < .05-.0001). JESPR-178 revealed the highest estimates (P < .0001) for association with the natural leaf defoliation trait, exceeding maximum empirical threshold values. JESPR-178 was assigned to the short arm of chromosome 18, suggesting indirectly that genes associated with natural leaf defoliation might be located on this chromosome. This microsatellite marker may have the potential for use to introgress the naturally early leaf defoliation quantitative trait loci (QTL) from the donor line Listopad Beliy to commercial varieties of cotton through marker-assisted selection programs.

  10. Phylogenetic relationships within Taenia taeniaeformis variants and other taeniid cestodes inferred from the nucleotide sequence of the cytochrome c oxidase subunit I gene.

    Science.gov (United States)

    Okamoto, M; Bessho, Y; Kamiya, M; Kurosawa, T; Horii, T

    1995-01-01

    Nucleotide sequence variations in a region of the mitochondrial cytochrome c oxidase subunit I (COI) gene (391 bp) were examined within seven species of the genus Taenia and two species of the genus Echinococcus, including ten isolates of T. taeniaeformis and six isolates of E. multilocularis. More than a 12% rate of nucleotide differences between taeniid species was found, allowing the species to be distinguished. In E. multilocularis, no sequence variation was observed among isolates, regardless of the host (gray red-backed vole, tundra vole, pig, Norway rat) or area (Japan, Alaska) from which each metacestode had been isolated. In contrast, six distinct sequences were detected among the ten T. taeniaeformis isolates examined. The level of nucleotide variation in the COI gene within T. taeniaeformis isolates except for one isolate from the gray red-backed vole (TtACR), which has been proposed as a distinct strain or a different species, was about 0.3%-4.1%, whereas the COI gene sequence for TtACR differed from those of the other isolates, with levels being 9.0%-9.5%. Phylogenetic trees were then inferred from these sequence data using two different algorithms.
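
    The percentage differences quoted above are pairwise proportions of differing sites between aligned sequences. The following is a minimal sketch of that calculation, assuming pre-aligned sequences of equal length and skipping gap or ambiguous positions. The function name and the two short example fragments are hypothetical and are not COI data from the study.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') or an 'N' are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # not a comparable site
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    frag_1 = "ACGTACGTAC"
    frag_2 = "ACGTACGTAA"
    print(f"{100 * p_distance(frag_1, frag_2):.1f}% nucleotide difference")
```

    This uncorrected distance does not account for multiple substitutions at the same site, so tree-building programs usually apply a substitution model on top of it.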

  11. Enzyme-Linked Electrochemical Detection of PCR-Amplified Nucleotide Sequences Using Disposable Screen-Printed Sensors. Applications in Gene Expression Monitoring

    Directory of Open Access Journals (Sweden)

    Miroslav Fojta

    2008-01-01

    Electrochemical enzyme-linked techniques for sequence-specific DNA sensing are presented. These techniques are based on attachment of streptavidin-alkaline phosphatase conjugate to biotin tags tethered to DNA immobilized at the surface of disposable screen-printed carbon electrodes (SPCE), followed by production and electrochemical determination of an electroactive indicator, 1-naphthol. Via hybridization of SPCE surface-confined target DNAs with end-biotinylated probes, highly specific discrimination between complementary and non-complementary nucleotide sequences was achieved. The enzyme-linked DNA hybridization assay has been successfully applied in analysis of PCR-amplified real genomic DNA sequences, as well as in monitoring of plant tissue-specific gene expression. In addition, we present an alternative approach involving sequence-specific incorporation of biotin-labeled nucleotides into DNA by primer extension. Introduction of multiple biotin tags per probe primer resulted in considerable enhancement of the signal intensity and improvement of the specificity of detection.

  12. Herpes simplex virus type 1 (HSV-1) strain HSZP host shutoff gene: nucleotide sequence and comparison with HSV-1 strains differing in early shutoff of host protein synthesis.

    Science.gov (United States)

    Vojvodová, A; Matis, J; Kúdelová, M; Rajcáni, J

    1997-01-01

    The UL41 gene of the HSZP strain of herpes simplex virus type 1 (HSV-1) defective with respect to the early shutoff of host protein synthesis was sequenced and compared with the corresponding HSV-1 strain KOS and 17 gene sequences. In comparison with strain 17, nine mutations (base changes) were HSZP specific, five KOS specific and four were common for both strains. Nine mutations caused codon changes. Three of these mapped to the nonconserved regions and the others to the conserved regions of the functional map of UL41 gene. One KOS specific mutation mapped to the region responsible for the binding of the virion host shutoff (vhs) protein to the alpha-transinducing factor (VP16). The possible relationship between mutations and host shutoff function is discussed. The nucleotide sequence data of the UL41 gene of HSZP and KOS have been submitted to the Genbank nucleotide database and have been assigned the accession numbers Z72337 and Z72338.

  13. Identification of the porcine homologous of human disease causing trinucleotide repeat sequences

    DEFF Research Database (Denmark)

    Madsen, Lone Bruhn; Thomsen, Bo; Sølvsten, Christina Ane Elisabeth

    2007-01-01

    Expansion in the repeat number of intragenic trinucleotide repeats (TNRs) is associated with a variety of inherited human neurodegenerative diseases. To study the composition of TNRs in a mammalian species representing an evolutionary intermediate between humans and rodents, we describe in this p...

  14. Tawny owl (Strix aluco) and Hume's Tawny owl (Strix butleri) are distinct species: evidence from nucleotide sequences of the cytochrome b gene.

    Science.gov (United States)

    Heidrich, P; Wink, M

    1994-01-01

    The cytochrome b gene of the Tawny Owl (Strix aluco), Hume's Tawny Owl (Strix butleri) and the African wood owl (Strix woodfordii) was amplified by polymerase chain reaction (PCR) and partially sequenced (300 base pairs). Sequences differ substantially (9 to 12% nucleotide substitutions) between these taxa, indicating that they represent distinct species, which is also implied by morphological and biogeographic differences. Using cytochrome b sequences of S. aluco, S. butleri, S. woodfordii, Athene noctua and Tyto alba, phylogenetic relationships were reconstructed using the "maximum parsimony" principle (PAUP 3.1.1) and the neighbour-joining method (MEGA).

  15. Genome-wide characterization and linkage mapping of simple sequence repeats in mei (Prunus mume Sieb. et Zucc.).

    Directory of Open Access Journals (Sweden)

    Lidan Sun

    Because of its popularity as an ornamental plant in East Asia, mei (Prunus mume Sieb. et Zucc.) has received increasing attention in genetic and genomic research with the recent shotgun sequencing of its genome. Here, we performed the genome-wide characterization of simple sequence repeats (SSRs) in the mei genome and detected a total of 188,149 SSRs occurring at a frequency of 794 SSR/Mb. Mononucleotide repeats were the most common type of SSR in genomic regions, followed by di- and tetranucleotide repeats. Most of the SSRs in coding sequences (CDS) were composed of tri- or hexanucleotide repeat motifs, but mononucleotide repeats were always the most common in intergenic regions. Genome-wide comparison of SSR patterns among the mei, strawberry (Fragaria vesca), and apple (Malus×domestica) genomes showed mei to have the highest density of SSRs, slightly higher than that of strawberry (608 SSR/Mb) and almost twice as high as that of apple (398 SSR/Mb). Mononucleotide repeats were the dominant SSR motifs in the three Rosaceae species. Using 144 SSR markers, we constructed a 670 cM-long linkage map of mei delimited into eight linkage groups (LGs), with an average marker distance of 5 cM. Seventy-one scaffolds covering about 27.9% of the assembled mei genome were anchored to the genetic map, depending on which the macro-colinearity between the mei genome and Prunus T×E reference map was identified. The framework map of mei constructed provides a first step into subsequent high-resolution genetic mapping and marker-assisted selection for this ornamental species.
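
    Genome-wide SSR surveys of this kind are usually done with dedicated tools, and the abstract does not name the software or thresholds used. Purely as an illustrative sketch, a regular-expression scan for perfect repeats with motif lengths of 1 to 6 bp might look like the following. The minimum repeat counts, the function names, and the test sequence are all assumptions chosen for the example.

```python
import re

# Hypothetical minimum repeat counts per motif length (not from the study).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_compound(motif):
    """True if the motif is itself a repeat of a shorter unit, e.g. 'AA' or 'ATAT'."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_compound(motif):
                continue  # already counted under the shorter motif
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


def ssr_density_per_mb(n_ssrs, genome_bp):
    """SSR count normalised to one megabase of sequence."""
    return n_ssrs / (genome_bp / 1_000_000)


if __name__ == "__main__":
    toy = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

    With the figures quoted in the abstract, 188,149 SSRs at 794 SSR/Mb corresponds to roughly 237 Mb of scanned sequence.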

  16. Complete nucleotide sequence of Bacillus subtilis (natto) bacteriophage PM1, a phage associated with disruption of food production.

    Science.gov (United States)

    Umene, Kenichi; Shiraishi, Atsushi

    2013-06-01

    "Natto", considered a traditional food, is made by fermenting boiled soybeans with Bacillus subtilis (natto), which is a natto-producing strain related to B. subtilis. The production of natto is disrupted by phage infections of B. subtilis (natto); hence, it is necessary to control phage infections. PM1, a phage of B. subtilis (natto), was isolated during interrupted natto production in a factory. In a previous study, PM1 was classified morphologically into the family Siphoviridae, and its genome, comprising approximately 50 kbp of linear double-stranded DNA, was assumed to be circularly permuted. In the present study, the complete nucleotide sequence of the PM1 genomic DNA of 50,861 bp (41.3% G+C) was determined, and 86 open reading frames (ORFs) were deduced. Forty-one ORFs of PM1 shared similarities with proteins deduced from the genomes of phages reported so far. Twenty-three ORFs of PM1 were associated with functions related to the phage multiplication process of gene control, DNA replication/modification, DNA packaging, morphogenesis, and cell lysis. Bacillus subtilis (natto) produces a capsular polypeptide of glutamate with a γ-linkage (called poly-γ-glutamate), which appears to serve as a physical barrier to phage adsorption. One ORF of PM1 had similarity with a poly-γ-glutamate hydrolase, which is assumed to degrade the capsular barrier to allow phage progenies to infect encapsulated host cells. The genome analysis of PM1 revealed characteristics of the phage that are consistent with a Bacillus subtilis (natto)-infecting phage.

  17. Transcriptome sequencing of Eucalyptus camaldulensis seedlings subjected to water stress reveals functional single nucleotide polymorphisms and genes under selection

    Directory of Open Access Journals (Sweden)

    Thumma Bala R

    2012-08-01

    Background: Water stress limits plant survival and production in many parts of the world. Identification of genes and alleles responding to water stress conditions is important in breeding plants better adapted to drought. Currently there are no studies examining the transcriptome-wide gene and allelic expression patterns under water stress conditions. We used RNA sequencing (RNA-seq) to identify the candidate genes and alleles and to explore the evolutionary signatures of selection. Results: We studied the effect of water stress on gene expression in Eucalyptus camaldulensis seedlings derived from three natural populations. We used reference-guided transcriptome mapping to study gene expression. Several genes showed differential expression between control and stress conditions. Gene ontology (GO) enrichment tests revealed up-regulation of 140 stress-related gene categories and down-regulation of 35 metabolic and cell wall organisation gene categories. More than 190,000 single nucleotide polymorphisms (SNPs) were detected and 2737 of these showed differential allelic expression. Allelic expression of 52% of these variants was correlated with differential gene expression. Signatures of selection patterns were studied by estimating the ratio of nonsynonymous to synonymous substitution rates (Ka/Ks). The average Ka/Ks ratio among the 13,719 genes was 0.39, indicating that most of the genes are under purifying selection. Among the positively selected genes (Ka/Ks > 1.5), apoptosis and cell death categories were enriched. Of the 287 positively selected genes, ninety genes showed differential expression and 27 SNPs from 17 positively selected genes showed differential allelic expression between treatments. Conclusions: Correlation of allelic expression of several SNPs with total gene expression indicates that these variants may be the cis-acting variants or in linkage disequilibrium with such variants. Enrichment of apoptosis and cell death gene

  18. Insertion sequence element single nucleotide polymorphism typing provides insights into the population structure and evolution of Mycobacterium ulcerans across Africa.

    Science.gov (United States)

    Vandelannoote, Koen; Jordaens, Kurt; Bomans, Pieter; Leirs, Herwig; Durnez, Lies; Affolabi, Dissou; Sopoh, Ghislain; Aguiar, Julia; Phanzu, Delphin Mavinga; Kibadi, Kapay; Eyangoh, Sara; Manou, Louis Bayonne; Phillips, Richard Odame; Adjei, Ohene; Ablordey, Anthony; Rigouts, Leen; Portaels, Françoise; Eddyani, Miriam; de Jong, Bouke C

    2014-02-01

    Buruli ulcer is an indolent, slowly progressing necrotizing disease of the skin caused by infection with Mycobacterium ulcerans. In the present study, we applied a redesigned technique to a vast panel of M. ulcerans disease isolates and clinical samples originating from multiple African disease foci in order to (i) gain fundamental insights into the population structure and evolutionary history of the pathogen and (ii) disentangle the phylogeographic relationships within the genetically conserved cluster of African M. ulcerans. Our analyses identified 23 different African insertion sequence element single nucleotide polymorphism (ISE-SNP) types that dominate in different areas where Buruli ulcer is endemic. These ISE-SNP types appear to be the initial stages of clonal diversification from a common, possibly ancestral ISE-SNP type. ISE-SNP types were found unevenly distributed over the greater West African hydrological drainage basins. Our findings suggest that geographical barriers bordering the basins to some extent prevented bacterial gene flow between basins and that this resulted in independent focal transmission clusters associated with the hydrological drainage areas. Different phylogenetic methods yielded two well-supported sister clades within the African ISE-SNP types. The ISE-SNP types from the "pan-African clade" were found to be widespread throughout Africa, while the ISE-SNP types of the "Gabonese/Cameroonian clade" were much rarer and found in a more restricted area, which suggested that the latter clade evolved more recently. Additionally, the Gabonese/Cameroonian clade was found to form a strongly supported monophyletic group with Papua New Guinean ISE-SNP type 8, which is unrelated to other Southeast Asian ISE-SNP types.

  19. In the Staphylococcus aureus two-component system sae, the response regulator SaeR binds to a direct repeat sequence and DNA binding requires phosphorylation by the sensor kinase SaeS.

    Science.gov (United States)

    Sun, Fei; Li, Chunling; Jeong, Dowon; Sohn, Changmo; He, Chuan; Bae, Taeok

    2010-04-01

    Staphylococcus aureus uses the SaeRS two-component system to control the expression of many virulence factors such as alpha-hemolysin and coagulase; however, the molecular mechanism of this signaling has not yet been elucidated. Here, using the P1 promoter of the sae operon as a model target DNA, we demonstrated that the unphosphorylated response regulator SaeR does not bind to the P1 promoter DNA, while its C-terminal DNA binding domain alone does. The DNA binding activity of full-length SaeR could be restored by sensor kinase SaeS-induced phosphorylation. Phosphorylated SaeR is more resistant to digestion by trypsin, suggesting conformational changes. DNase I footprinting assays revealed that the SaeR protection region in the P1 promoter contains a direct repeat sequence (GTTAAN(6)GTTAA [where N is any nucleotide]). This sequence is critical to the binding of phosphorylated SaeR. Mutational changes in the repeat sequence greatly reduced both the in vitro binding of SaeR and the in vivo function of the P1 promoter. From these results, we concluded that SaeR recognizes the direct repeat sequence as a binding site and that binding requires phosphorylation by SaeS.
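
    The binding motif given above, GTTAAN(6)GTTAA, translates directly into a pattern search. The sketch below is a minimal illustration that scans one strand of a DNA string for that direct repeat; the function name is hypothetical and the example sequence is made up for demonstration, not the actual P1 promoter.

```python
import re

# GTTAA, any six nucleotides, then GTTAA again. The lookahead lets
# overlapping candidate sites be reported as well.
SAER_SITE = re.compile(r"(?=(GTTAA[ACGT]{6}GTTAA))")


def find_direct_repeats(dna):
    """Return (start, matched_site) pairs on the given strand only."""
    return [(m.start(), m.group(1)) for m in SAER_SITE.finditer(dna.upper())]


if __name__ == "__main__":
    # Made-up sequence containing one copy of the motif.
    example = "CCGTTAAACGTACGTTAATT"
    print(find_direct_repeats(example))
    # A single base change in the second half-site removes the match.
    print(find_direct_repeats("CCGTTAAACGTACGTCAATT"))
```

    A search like this only finds exact matches; scanning the reverse complement or allowing mismatches would need additional steps.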

  20. Species composition of the genus Saprolegnia in fin fish aquaculture environments, as determined by nucleotide sequence analysis of the nuclear rDNA ITS regions.

    Science.gov (United States)

    de la Bastide, Paul Y; Leung, Wai Lam; Hintz, William E

    2015-01-01

    The ITS region of the rDNA gene was compared for Saprolegnia spp. in order to improve our understanding of nucleotide sequence variability within and between species of this genus, determine species composition in Canadian fin fish aquaculture facilities, and to assess the utility of ITS sequence variability in genetic marker development. From a collection of more than 400 field isolates, ITS region nucleotide sequences were studied and it was determined that there was sufficient consistent inter-specific variation to support the designation of species identity based on ITS sequence data. This non-subjective approach to species identification does not rely upon transient morphological features. Phylogenetic analyses comparing our ITS sequences and species designations with data from previous studies generally supported the clade scheme of Diéguez-Uribeondo et al. (2007) and found agreement with the molecular taxonomic cluster system of Sandoval-Sierra et al. (2014). Our Canadian ITS sequence collection will thus contribute to the public database and assist the clarification of Saprolegnia spp. taxonomy. The analysis of ITS region sequence variability facilitated genus- and species-level identification of unknown samples from aquaculture facilities and provided useful information on species composition. A unique ITS-RFLP for the identification of S. parasitica was also described.

  1. Structure and organization of the mitochondrial DNA control region with tandemly repeated sequence in the Amazon ornamental fish.

    Science.gov (United States)

    Terencio, Maria Leandra; Schneider, Carlos Henrique; Gross, Maria Claudia; Feldberg, Eliana; Porto, Jorge Ivan Rebelo

    2013-02-01

    Tandemly repeated sequences are a common feature of vertebrate mitochondrial DNA control regions. However, questions still remain about their mode of evolution and function. To better understand patterns of variation in length and to explore the existence of previously described domains, we have characterized the control region structure of the Amazonian ornamental fish Nannostomus eques and Nannostomus unifasciatus. The control region ranged from 1121 to 1142 bp in length and could be separated into three domains: the domain associated with the extended terminal associated sequences, the central conserved domain, and the conserved sequence blocks domain. In the first domain, we encountered a sequence repeated 10 times in tandem (variable number tandem repeat (VNTR)) that could adopt an "inverted repetitions" type structural conformation. The results suggest that the VNTR pattern encountered in both N. eques and N. unifasciatus is consistent with the prerequisites of the illegitimate elongation model in which the unequal pairing of the chains near the 5'-end of the control region favors the formation of repetitions.

  2. Cloning, nucleotide sequence, and overexpression of the gene coding for delta 5-3-ketosteroid isomerase from Pseudomonas putida biotype B.

    Science.gov (United States)

    Kim, S W; Kim, C Y; Benisek, W F; Choi, K Y

    1994-11-01

    The structural gene coding for the delta 5-3-ketosteroid isomerase (KSI) of Pseudomonas putida biotype B has been cloned, and its entire nucleotide sequence has been determined by a dideoxynucleotide chain termination method. A 2.1-kb DNA fragment containing the ksi gene was cloned from a P. putida biotype B genomic library in lambda gt11. The open reading frame of ksi encodes 393 nucleotides, and the amino acid sequence deduced from the nucleotide sequence agrees with the directly determined amino acid sequence (K. Linden and W. F. Benisek, J. Biol. Chem. 261:6454-6460, 1986). A putative purine-rich ribosome binding site was found 8 bp upstream of the ATG start codon. Escherichia coli BL21(DE3) transformed with the pKK-KSI plasmid containing the ksi gene expressed a high level of isomerase activity when induced by isopropyl-beta-D-thiogalactopyranoside. KSI was purified to homogeneity by a simple and rapid procedure utilizing fractional precipitation and an affinity column of deoxycholate-ethylenediamine-agarose as a major chromatographic step. The molecular weight of KSI was 14,535 (calculated, 14,536) as determined by electrospray mass spectrometry. The purified KSI showed a specific activity (39,807 μmol min⁻¹ mg⁻¹) and a Km (60 μM) which are close to those of KSI originally obtained from P. putida biotype B.

  3. Nucleotide sequences of a Korean isolate of apple stem grooving virus associated with black necrotic leaf spot disease on pear (Pyrus pyrifolia).

    Science.gov (United States)

    Shim, Hyekyung; Min, Yeonju; Hong, Sungyoul; Kwon, Moonsik; Kim, Daehyun; Kim, Hyunran; Choi, Yongmoon; Lee, Sukchan; Yang, Jaemyung

    2004-10-31

    Pear black necrotic leaf spot (PBNLS) is a disease of pears caused by capillovirus-like particles, which can be observed under the electron microscope. The disease was analyzed by Western blot analysis with antisera raised against apple stem grooving virus (ASGV) coat protein. cDNAs covering the entire genome were synthesized by RT-PCR and RACE using RNA isolated from Chenopodium quinoa infected with sap extracted from pear leaves carrying black necrotic spot disease. The complete genome sequence of the putative pear virus, 6497 nucleotides in length excluding the poly(A) tail, was determined and analyzed. It contains two overlapping open reading frames (ORFs). ORF1 spans from nucleotide position 37 to 6354, producing a putative protein of 241 kDa. ORF2, which is in a different reading frame within ORF1, begins at nucleotide 4788 and terminates at 5750, and produces a putative protein of 36 kDa. The 241-kDa protein contains sequences related to the NTP-binding motifs of helicases and RNA-dependent RNA polymerases. The 36-kDa protein contains the consensus sequence GDSG found in the active sites of several cellular and viral serine proteases. Morphological and serological analysis, and sequence comparison between the putative pear virus, ASGV, citrus tatter leaf virus and cherry virus A of the capillovirus suggest that PBNLS may be caused by a Korean isolate of ASGV.
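
    The ORF coordinates in this abstract can be checked with simple arithmetic. The sketch below derives each ORF's length, reading frame, and a rough protein mass. It assumes the quoted coordinates are 1-based, inclusive, and include the stop codon, and it uses an average residue mass of about 110 Da, which is a rule of thumb rather than a value from the study. The function name is hypothetical.

```python
AVERAGE_RESIDUE_DA = 110  # rough rule-of-thumb average, an assumption


def orf_summary(start, end):
    """Summarise an ORF given 1-based inclusive nucleotide coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    residues = length // 3 - 1  # assumes the final codon is a stop codon
    return {
        "length_nt": length,
        "frame": (start - 1) % 3,  # 0, 1 or 2 relative to genome position 1
        "residues": residues,
        "approx_kda": residues * AVERAGE_RESIDUE_DA / 1000,
    }


if __name__ == "__main__":
    for name, (start, end) in {"ORF1": (37, 6354), "ORF2": (4788, 5750)}.items():
        print(name, orf_summary(start, end))
```

    Under these assumptions ORF1 is 6318 nt in frame 0 and ORF2 is 963 nt in frame 2, which agrees with the statement that the two ORFs lie in different reading frames. The rough mass estimates land in the same range as the reported 241 kDa and 36 kDa but are not expected to match them exactly.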

  4. Avocado cellulase: nucleotide sequence of a putative full-length cDNA clone and evidence for a small gene family.

    Science.gov (United States)

    Tucker, M L; Durbin, M L; Clegg, M T; Lewis, L N

    1987-05-01

    A cDNA library was prepared from ripe avocado fruit (Persea americana Mill. cv. Hass) and screened for clones hybridizing to a 600 bp cDNA clone (pAV5) coding for avocado fruit cellulase. This screening led to the isolation of a clone (pAV363) containing a 2021 nucleotide transcribed sequence and an approximately 150 nucleotide poly(A) tail. Hybridization of pAV363 to a northern blot shows that the length of the homologous message is approximately 2.2 kb. The nucleotide sequence of this putative full-length mRNA clone contains an open reading frame of 1482 nucleotides which codes for a polypeptide of 54.1 kD. The deduced amino acid composition compares favorably with the amino acid composition of native avocado cellulase determined by amino acid analysis. Southern blot analysis of Hind III and Eco RI endonuclease digested genomic DNA indicates a small family of cellulase genes.

  5. Nucleotide sequence-homology-independent breakdown of transgenic resistance by more virulent virus strains and a potential solution.

    Science.gov (United States)

    Kung, Yi-Jung; You, Bang-Jau; Raja, Joseph A J; Chen, Kuan-Chun; Huang, Chiung-Huei; Bau, Huey-Jiunn; Yang, Ching-Fu; Huang, Chung-Hao; Chang, Chung-Ping; Yeh, Shyi-Dong

    2015-04-27

    Controlling plant viruses by genetic engineering, including the globally important Papaya ringspot virus (PRSV), mainly involves coat protein (CP) gene mediated resistance via post-transcriptional gene silencing (PTGS). However, the breakdown of single- or double-virus resistance in CP-gene-transgenic papaya by more virulent PRSV strains has been noted in repeated field trials. Recombination analysis revealed that the gene silencing suppressor HC-Pro or CP of the virulent PRSV strain 5-19 is responsible for overcoming CP-transgenic resistance in a sequence-homology-independent manner. Transient expression assays using agro-infiltration in Nicotiana benthamiana plants indicated that 5-19 HC-Pro exhibits stronger PTGS suppression than the transgene donor strain. To disarm the suppressor from the virulent strain, transgenic papaya lines were generated carrying untranslatable 5-19 HC-Pro, which conferred complete resistance to 5-19 and other geographic PRSV strains. Our study suggested the potential risk of the emergence of more virulent virus strains, spurred by the deployment of CP-gene-transgenic crops, and provides a strategy to combat such strains.

  6. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

    Science.gov (United States)

    Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.
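
    As an illustration of what identifying SSRs in EST sequences involves, the sketch below scans a DNA string for perfect tandem repeats of di- to hexanucleotide motifs. It is not the pipeline used in the study, whose software and thresholds the abstract does not state. The minimum repeat counts in `MIN_REPEATS` and the function names are hypothetical choices made only for this example.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True when the motif is not itself a repeat of a shorter unit.

    A string is a repetition of a shorter unit exactly when it occurs
    inside its own doubling with the first and last characters removed.
    """
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq.

    Start positions are 0-based. Only the bases A, C, G and T are considered.
    """
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AAG" * 6 + "TTC" + "CT" * 7 + "GA"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

    Grouping the hits by motif length would give class proportions of the kind reported in the abstract (trinucleotide, hexanucleotide, and so on). Pooling a motif with its reverse complement, as in AAG/CTT, would need an extra canonicalization step that is omitted here.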

  7. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

    Directory of Open Access Journals (Sweden)

    Chunsheng Gao

    Full Text Available Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.